In response to the legal uncertainty, Google put its scanning project on hold for several months in the summer, but it has now resumed the project. Last week, the company put its first stash of scanned library books online. Diving into this trove is a trip. You could easily lose days in Google's digital labyrinth, not unlike the way you might walk into the stacks at Stanford or Harvard on a Friday and emerge punch-drunk on a Sunday, amazed by the breadth of the work you've seen. The difference is that Google's library is searchable; you can find what you want not just by looking up an author or a Dewey Decimal subject, but also by typing a particular phrase or quote -- "the play's the thing," say, or "Let my people go!" -- that you're looking for in a book. Google will look for your search term in every page of every volume in the library, and instantly show you images of the pages in each book where the phrase appears.

At the moment, Google's library mostly contains work published before 1923; copyrights on these books have expired, and the books' contents, therefore, are in the public domain, free for anyone to use in any way. You'll find all manner of books in Google's stash: Among many other things, there's an illustrated first edition of Henry James' "Daisy Miller"; a 1702 history of France with an exceedingly long title; "Debates of the House of Commons, 1667 to 1694," which records a certain Mr. Finch arguing against the naturalization of aliens; and a 1785 gardening book, which advises farmers to plant hedges of holly around their corn, since "Holly does not fuck the land," and therefore does not rob the corn of nutrients. (It's possible, though, that this last one is actually "suck the land," and that Google's text-recognition program made a mistake with the old script.)

Google's collection also includes a vast number of books published after 1923 that publishers have already given Google permission to include -- but because these books are under copyright, Google limits their functionality in order to reduce the chance that the service will negatively affect book sales. For instance, searching for the name Calliope in Jeffrey Eugenides' "Middlesex" will yield several page numbers but not the content of all those pages. That way you can't read the entire book through the search interface.

The main fight between Google and publishers involves a third category of books, those that are still under copyright but that publishers have not given Google permission to include in its library. When Google, in the course of scanning books at a library, comes upon a book published after 1923, publishers insist that the company should set it aside and get permission first; Google says that it has the right to scan these books and make them available online. The company insists that it will soon include such books in its library.

At the moment, though, what this means for you is a truncated library. Right now, no text search in Google will return any phrases contained in many popular titles. For instance, you can't find such titles as "Lolita," "The Great Gatsby," "The Best and the Brightest," "The Da Vinci Code," or much of anything by John Updike, Philip Roth, Richard Feynman, John McPhee, Shelby Foote, Terry McMillan, Sharon Olds, Julia Child or Woody Allen.

This is most problematic for obscure books, books you don't know you're looking for. Take this hypothetical scenario: Let's say that somewhere in the stacks at the University of Michigan there is an essay by a writer you've never heard of, on a subject you didn't know about, in a volume no longer in print, by a publishing house no longer in business; let's say, moreover, that even though you don't really know it, this essay is exactly what you're looking for, the answer to all your searching needs, in much the same way you find Web pages every day by people you don't know that turn out to be just the thing. Ideally, as Google envisions it, you could one day go to its search engine, type in a certain bon mot, and find this book, your book. Because it's still under copyright, Google would only show you a few sentences around your search term as it appeared in the text, not the whole volume; but you'd know it was there in the library, and if you wanted it, you'd be free to check it out, or find some way to buy it. Without Google's system, you'll never hear of this book.

In such a scenario, proponents of Google's plan see nothing but good -- good for the company, for Internet users, and especially for authors. In most copyright disputes between content companies and tech firms, there is a legitimate question over which party might benefit more from a new technology, notes Fred von Lohmann, an attorney at the Electronic Frontier Foundation, which sides with Google in this battle. "Take the Napster case," von Lohmann says. In that situation, Napster claimed that its file-swapping tool could increase CD sales by letting people preview music before they purchased it; the CD industry, meanwhile, said the system had caused a significant drop in sales. Both sides cited numbers to support their arguments, and each theory sounded at least plausible.

"But with the Google Print situation, it's a completely one-sided debate," von Lohmann says. "Google is right, and the publishers have no argument. What's their argument that this harms the value of their books? They don't have one. Google helps you find books, and if you want to read it, you have to buy the book. How can that hurt them?"

Obscure books -- books that are out of print or otherwise hard to get ahold of -- would stand to gain the most from such a system, and it turns out that there are plenty of such books in the libraries Google plans to scan. Not long ago, the Online Computer Library Center, a nonprofit library research group, set out to count and catalog the books Google would capture in its project. The OCLC determined that at the five research libraries with which Google had formed deals, about 80 percent of the books in the stacks were published after 1923 and still under copyright. But only a small number of these books are currently in print.

Tim O'Reilly, a computer book publisher and sponsor of influential tech conferences, points out that in 2004 only 1.2 million different book titles were sold in the United States, according to Nielsen BookScan. This means that while a significant number of library books are protected by copyright, they are also out of print -- 70 percent or more, O'Reilly estimates. These books, he says, represent the "twilight zone" of the publishing world; someone owns them, but since they're perceived to have no commercial value (because they're no longer sold in stores), publishers don't have any incentive to promote and market them, let alone to go through the expense of scanning them and making them searchable online.

Indeed, in many cases the publishers and rights-holders of these books are unknown. There is no national registry of copyright holders in the United States, as there is a national registry of patents. Any book published is automatically granted a copyright, and if a book publisher goes out of business, or an author dies, the copyright to the work may well be buried in contracts that long ago turned to dust. "We precluded any possibility of creating a copyright database," says Vaidhyanathan, and "it's impossible for a company like Google, or a historian, or a documentary filmmaker, or anyone to find out who owns what. Even publishers don't know what they own. It's just impossible."

O'Reilly is one of the few publishers who support Google's plan, and he likes it precisely because he thinks it will shed light on these little-known titles whose rights-holders are hard to track down. "One of the biggest arguments for Google's approach is that it is the only solution that solves a hard problem," O'Reilly says. He points out that only 2 percent of books sold in 2004 had more than 5,000 copies purchased; the rest languished in obscurity. And that, he wrote in a recent New York Times Op-Ed, "is a far greater threat to authors than copyright infringement, or even outright piracy." Google, O'Reilly went on to write, "promises an alternative to the obscurity imposed on most books. It makes that great corpus of less-than-bestsellers accessible to all. By pointing to a huge body of print works online, Google will offer a way to promote books that publishers have thrown away, creating an opportunity for readers to track them down and buy them ... In one bold stroke, Google will give new value to millions of orphaned works."
