Using CAPTCHAs to digitize books. The solution is simple and elegant: show two words, one that is already known and one that still needs to be digitized. Users do not know which is which. If a user types the control (known) word correctly, we can trust their answer for the second word with high probability.
Many sites do this, and together they digitize about 2.5 million books per year this way. See the TED talk above by Luis von Ahn for details.
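The two-word check can be sketched in a few lines of Python. This is only an illustration, not how the real service is implemented. All names here (`record_vote`, `accepted_transcription`, `min_votes`) are made up for the example, and the rule that several users must agree before a word is accepted is an assumption added on top of the description above.

```python
from collections import Counter


def normalize(word):
    """Compare answers case-insensitively and ignore surrounding spaces."""
    return word.strip().lower()


def record_vote(votes, unknown_id, control_answer, control_truth, unknown_answer):
    """Count the answer for the unknown word only if the control word is right.

    `votes` maps an unknown word's id to a Counter of proposed transcriptions.
    Returns True if the user passed the control check, False otherwise.
    """
    if normalize(control_answer) != normalize(control_truth):
        return False  # control failed: discard the answer for the unknown word
    votes.setdefault(unknown_id, Counter())[normalize(unknown_answer)] += 1
    return True


def accepted_transcription(votes, unknown_id, min_votes=3):
    """Return the most common answer once enough users agree, else None.

    The `min_votes` threshold is an assumption made for this sketch.
    """
    counter = votes.get(unknown_id)
    if not counter:
        return None
    word, count = counter.most_common(1)[0]
    return word if count >= min_votes else None
```

A user who gets the control word wrong contributes nothing, so random or careless typing does not pollute the transcription of the unknown word.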
They are also trying to use language learners to translate the web: learners are shown sample text to translate, and their translations are combined to create the final version.