Language Log Wed 2009-09-16 18:57 EDT
Language Log >> Google Books: A Metadata Train Wreck
This is almost certainly the Last Library, after all. There's no Moore's Law for capture, and nobody is ever going to scan most of these books again. So whoever is in charge of the collection a hundred years from now -- Google? UNESCO? Wal-Mart? -- these are the files that scholars are going to be using then. All of which lends a particular urgency to the concerns about whether Google is doing this right... you need good metadata. And Google's are a train wreck: a mish-mash wrapped in a muddle wrapped in a mess.