Cool Thing 3: Distributed Proofreaders

After reading a few Gutenberg ebooks, you may wonder who creates them. It’s actually a voluntary, co-operative system: you can help by adding anything from a page to whole chapters. There’s no commitment to go back and do any more if you don’t enjoy it.

Ebooks are digitised by scanning the print text and running the result through a program which tries to guess what each letter is (Optical Character Recognition, or OCR). The results are mostly ok-ish – except at line ends, or with old fonts, or damaged books, where words end up garbled. Distributed Proofreaders lets you ungarble the words by checking them against the scanned page.

To sign up, go to the Distributed Proofreaders site and register as a volunteer. Click the link in the email that gets sent back (check your junk mail), and then log in. Select one of the books marked ‘beginners’ and try a page.

It can get addictive – but if it’s not enough of a challenge you could try working from old handwriting instead of print. UCL are running a project to transcribe the works of Jeremy Bentham.


  1. Amy

    Some other nice examples of users carrying out distributed small tasks and as a result creating or improving data are:
    – Galaxy Zoo – try your hand at classifying galaxies!
    – Trove – this started out as a site which allowed users to transcribe historical Australian newspapers, but has now expanded to all things Australian
    – New York Public Library Map Warper – a lovely tool that allows users to overlay historic maps on top of modern maps and then ‘rectify’, or digitally align, the old and the new map.

