DLTJ Now Uses reCAPTCHA



DLTJ now uses reCAPTCHA on comment forms. reCAPTCHA is an enhanced version of CAPTCHA (an acronym for "completely automated public Turing test to tell computers and humans apart"). Like the original, it is a challenge-response test used to determine whether a human or a software agent (such as a spam robot) is at the other end of the browser. And like the original, it asks the user to type in the words shown in an image or a set of digits heard in an audio clip.

[Images: reCAPTCHA example with text; reCAPTCHA audio example]

Help with reCAPTCHA

The reCAPTCHA box contains three buttons to help use the service:

Refresh button: Refresh the word images. If you are unsure what the two words are, select this button to receive a new pair of words. (Alternatively, just try to guess what the two words are; if you are wrong, you'll get a new pair of words automatically.)
Audio button / Text button: Alternate between the audio- and text-based challenges. If you cannot see the word images, select the audio button to hear a set of digits among random noise; you can enter those digits instead of the visual challenge.
Help button: Get help from the reCAPTCHA site about this human detection scheme. The help page also includes introductory information about the reCAPTCHA service itself.

What's Special About reCAPTCHA

[Image: Example words from a reCAPTCHA challenge]

The human mind is still a more powerful computer than any silicon circuitry in place now or in the foreseeable future. With just a glance our brains can recognize patterns among the noise, something that is computationally very expensive or impossible for a machine to do. reCAPTCHA researchers at Carnegie Mellon University, also the home of the original CAPTCHA concept, estimate that humans around the world solve 60 million CAPTCHAs every day, spending roughly ten seconds on each one. That is not a lot of time per person, but in aggregate it adds up to more than 150,000 hours of work each day.
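
As a quick check of that aggregate figure, here is the arithmetic sketched in Python. The two inputs are the researchers' estimates quoted above; the constant names are only illustrative.

```python
# Inputs are the estimates quoted in the post; names are illustrative.
CAPTCHAS_PER_DAY = 60_000_000   # CAPTCHAs solved worldwide each day
SECONDS_PER_CAPTCHA = 10        # rough human time spent on each one
SECONDS_PER_HOUR = 3600

total_seconds = CAPTCHAS_PER_DAY * SECONDS_PER_CAPTCHA
total_hours = total_seconds / SECONDS_PER_HOUR

print(f"{total_hours:,.0f} hours per day")  # prints "166,667 hours per day"
```

So the "more than 150,000 hours" figure holds: the estimate works out to roughly 166,667 hours of human effort each day.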

In the original CAPTCHA scheme, that work is wasted on deciphering random strings of letters and numbers. The researchers at Carnegie Mellon realized that they could harness that work to resolve ambiguities in deciphering scanned text from books. As with the original CAPTCHA system, there are some blocks of scanned text that computers cannot decipher yet are easily readable by humans. reCAPTCHA pairs a known word with one of these unknown blocks of text. If the human types the known word correctly, the reCAPTCHA system tells the DLTJ system that the comment is coming from a human. And if enough humans type the same response for the unknown block of text, the reCAPTCHA system can be pretty sure the word has been deciphered.
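
The pairing idea can be sketched in a few lines of Python. This is only an illustration of the scheme as described above, not the actual reCAPTCHA implementation: the class name `WordPairChecker`, the `threshold` parameter, the case-insensitive comparison, and the choice to count a guess for the unknown word only when the known word was typed correctly are all assumptions of mine.

```python
from collections import Counter


def normalize(text):
    """Compare answers case-insensitively, ignoring stray spaces (an assumption)."""
    return text.strip().lower()


class WordPairChecker:
    """Hypothetical sketch: one known word paired with one unknown scanned word."""

    def __init__(self, threshold=3):
        # How many matching answers count as "enough humans" (assumed value).
        self.threshold = threshold
        self.votes = {}  # unknown word id -> Counter of typed answers

    def check(self, known_word, unknown_id, typed_known, typed_unknown):
        """Return True if the known word was typed correctly (likely a human)."""
        if normalize(typed_known) != normalize(known_word):
            return False  # failed the control word; ignore the other answer
        # Record this person's reading of the unknown block of text.
        self.votes.setdefault(unknown_id, Counter())[normalize(typed_unknown)] += 1
        return True

    def deciphered(self, unknown_id):
        """Return the agreed reading of the unknown word, or None if undecided."""
        votes = self.votes.get(unknown_id)
        if not votes:
            return None
        word, count = votes.most_common(1)[0]
        return word if count >= self.threshold else None


checker = WordPairChecker(threshold=2)
checker.check("morning", "scan-17", "morning", "upon")      # passes
checker.check("morning", "scan-17", "mornin", "open")       # fails, not counted
checker.check("overlooks", "scan-17", "Overlooks", "Upon")  # passes
print(checker.deciphered("scan-17"))
```

In this sketch the comment form would only need the result of `check`; the growing tally behind `deciphered` is the part that benefits the book-scanning effort.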

So by commenting here on DLTJ you are helping make the world a better place by aiding in the digital conversion of texts from the Internet Archive. This is a bit of an experiment, so if it is not working out, please let me know.