From: Goldman, Ava (Ava_Goldman@CalPERS.CA.GOV)
Date: Tue Nov 22 2005 - 11:04:15 PST
Subject: FW: A MAN'S VISION: WORLD LIBRARY ONLINE : Brewster Kahle hopes to realize his 25-year dream of an international book archive
Cross-posted.
Ava Goldman
Senior Librarian, CalPERS Human Resources, All Staff Training &
Development, 916-795-1533
http://www.sfgate.com/cgi-bin/article.cgi?f=/c/a/2005/11/22/MNGQ0FSCCT1.DTL
Heidi Benson, Chronicle Staff Writer
Tuesday, November 22, 2005
A fat October moon shone through the Presidio treetops the night
Brewster Kahle launched the latest shot in the space-race for a digital
library.
"Let's get the people's books back to the people!" said Kahle, standing
at the podium inside the Golden Gate Club.
Founder of the Internet Archive, Kahle is an ebullient technology
visionary of the type Northern California cultivates. He has been widely
recognized as a digital guru and a catalyst for change.
Now, his vision is helping shape the debate over how a book library
should reside on the Internet. His idealistic yet pragmatic approach --
providing free digital access to works in the public domain -- could be
a bridge to detente in the war between publishers and Google Inc.
While Google has alienated authors and publishers with its plan to
digitize books still in copyright, Kahle has moved gingerly, forging
collaborations with Google's fiercest rivals -- Microsoft and Yahoo
-- to create a kinder, gentler digital library effort called the Open
Content Alliance.
The alliance, focused on books no longer under copyright -- that is,
books published before 1923 -- echoes the computer industry's open
source movement, which has sought to spur innovation by enabling
software engineers to freely share their code.
Google's library initiative, the Google Print Library Project, plans to
digitize books from the collections of its partner libraries (the New
York Public Library plus the libraries of Oxford University, Harvard,
Stanford and the University of Michigan), including many books still in
copyright. That plan has earned the ire of authors and publishers.
"Google is building a database of value that was created by authors and
publishers and using it to advance the interests of its
revenue-generating, for-profit search-engine operation," Allan Adler,
vice president for legal and government affairs of the Association of
American Publishers, which has filed a copyright infringement suit, told
The Chronicle by phone from New York.
------------------------------------------------------------------------
The launch of the Open Content Alliance was like a step back in time for
one attendee.
"I had a feeling of being back in the early days of open source software
-- where everybody was there because they hated Microsoft," said Paul
Duguid, a visiting scholar at UC Berkeley's School of Information
Management and Systems. "This was the un-Google meeting."
That night, Kahle unveiled his new book scanner, Scribe. A kind of
portable darkroom, it looks like a black-draped office cubicle. Inside,
two digital cameras peer down on a book held in a V-shaped glass cradle.
A human technician turns the pages and works the cameras via foot
pedals. (Automated systems can damage precious paper.) The results are
super-high resolution photographs at a cost of 10 cents per page.
A handful of Scribe machines already have been sent to the library of
the University of Toronto, an alliance partner.
But more revolutionary than the book scanners -- variations of which
Stanford, Google and others have developed -- is the technology Kahle
has tested with his Internet Bookmobile, which enables scanned books to
be printed out and bound in volumes that faithfully resemble the
original.
It is here that Kahle and the Open Content Alliance have topped Google
-- by already getting real books into readers' hands, even in
book-starved eastern Africa.
"It doesn't look like a printout with a staple," Kahle said. "It doesn't
look like a report. It looks like a book.
"Maybe I'm old-fashioned, but I still love books."
------------------------------------------------------------------------
By light of day, the parking lot of the Internet Archive's headquarters
at the Presidio hosts a motley array of vehicles, including Kahle's
favorite invention, the Internet Bookmobile, a green Ford van with a
satellite dish on the roof and a printer and bookbinding contraption on
the tailgate. The slightly creaky clapboard building, built in 1857 as a
military residence and store, has little in common with the nearby sleek
campus of George Lucas' Industrial Light & Magic.
"Where are the machines?" a visitor might ask. The Internet Archive's
souped-up servers -- storing petabytes of information (1 petabyte equals
100 million pages) -- are all South of Market, filling three warehouses
to the rafters.
Kahle's journey started at the Massachusetts Institute of Technology,
where he studied artificial intelligence. After graduating in 1982, he
helped start a company called Thinking Machines. By 1989, he had
invented the first electronic publishing system, WAIS (Wide Area
Information Server), with a client list that included the White House,
the Government Printing Office, both houses of Congress, the Wall Street
Journal and the New York Times.
The company was acquired by AOL in 1995, and Kahle decamped for San
Francisco, where he started the nonprofit Internet Archive in 1996 to
serve as a permanent archive of digital work -- Web pages, music, books,
software programs -- available free to scholars and researchers. That
year he also started a for-profit arm, Alexa Internet, a tool for
crawling the Web, which he sold to Amazon.com in 1999.
"My interest is to build the great library," said Kahle, perching
briefly in the conference room of the Internet Archive shortly before
the alliance event. "That was the goal I set for myself 25 years ago. It
is now technically possible to live up to the dream of the Library of
Alexandria."
That storied institution on the Nile delta housed all the world's
knowledge until its mysterious destruction 1,600 years ago.
"Folks are using the Internet as a library, and they're using it many
times every day," Kahle continued. "We're seeing much more traffic on
the Internet than we ever did in our public library system, but what's
available on the Internet isn't the best we have to offer. Almost
everything on the Internet has been written since 1996 -- and most of it
has been written for the Internet." Kahle's dream is to collect online
the great books on which modern civilization is based.
"Do you know what's carved above the Carnegie Library in Pittsburgh? --
'FREE TO THE PEOPLE' -- what a goal!" Kahle said. "I can believe in
this! At the Internet Archive, we think of our mission as 'universal
access to all knowledge.'
"That should be carved over our door."
------------------------------------------------------------------------
Early this year, Kahle was in talks with Yahoo's vice president for
search technology, David Mandelbrot, and Sumir Meghani, business
development manager of the Sunnyvale Internet company.
"We wanted to figure out how the nonprofit sector could work with the
commercial sector," Kahle said. The subject of "a digital library of
Alexandria" just naturally came up.
Yahoo proposed creating a freely accessible digital library that would
include only books in the public domain.
"After that, it was easy to know how to proceed," Kahle said.
It was agreed that Yahoo would supply the search engine for the Web site
and index the books scanned by the Internet Archive's Scribe machines.
The Open Content Alliance was born. By October, an impressive group of
libraries and publishers had promised to participate, including the
Smithsonian, Johns Hopkins University, University of Toronto, British
National Archives, European Archives, O'Reilly Media and Prelinger
Archives plus multimedia companies LibriVox, Octavo and others.
The University of California already has started its contribution: a
collection of 18,000 works of American fiction, which librarians are
selecting from the 10-library statewide system. Microsoft's MSN Search
has promised $5 million toward the scanning of 150,000 books, and both
Adobe and Hewlett-Packard will contribute advanced digital imaging.
Kahle hopes to have "a couple of great collections up on the Web by the
end of 2006."
------------------------------------------------------------------------
The Google Print Library Project differs from Kahle's in an important
way: Google is creating not a library but a vast electronic card
catalog.
"We have been very clear that we want to build a book-finding tool, not
a book-reading tool," said Jim Gerber of Google Print.
"Even before we started Google, we dreamed of making the incredible
breadth of information that librarians so lovingly organize searchable
online," co-founder Larry Page told author David A. Vise in his new
book, "The Google Story," out this month from Delacorte.
Back in October 2004, they took the first step, announcing the Google
Print program at the annual Frankfurt Book Fair. It would allow viewers
to search books online -- but not scan or print them out -- based on
agreements with publishers. A similar project, Amazon's free "Search
Inside the Book," had already been shown to boost book sales.
Since Google's main source of revenue is its signature all-text ads,
which are linked by topic but are separate from searched content, that
model would be repeated. Google and the publishers would split the
proceeds.
But this summer, at the 2005 Frankfurt Book Fair, publishers and authors
were bristling over Google's most recent announcement.
A new project -- Google Print Library -- was set to begin digitizing
library books, including many still under copyright. Only snippets of
text would be viewable, so Google claimed this was fair use. Google also
considered itself under no obligation to ask copyright holders'
permission before scanning books.
Publishers saw it differently: Because entire books would be digitized
to provide such snippets, they feared piracy -- and the damage that free
file-sharing has done to the music industry. Also, publishers were wary
of Google having the biggest online library in the world at its disposal
when, in the future, copyright law changes to adapt to the Internet age.
In August, the 3,000-member Authors Guild sued Google to cease and
desist; the Association of American Publishers, with 300 members, sued
for copyright infringement; and PEN USA and the International Publishers
Association issued a joint declaration calling the Google Library
Project "in breach of existing copyright law."
In response, Google suspended book scanning for three months to give
authors and publishers time to opt out if they feared piracy.
"Early on in the discussions about Google Print, that was one of the
fears noted most regularly," Gerber said. "Frankly, that's part of the
reason we changed our policy. That's the purpose of our exclusion
option."
On Nov. 1, Google resumed scanning books. Just two days later, Google
Print announced the availability of its first large collection of books
-- all in the public domain. And this week, a strategic name change was
announced. Google Print is now Google Book Search. A posting on Thursday
by Jen Grant, product marketing manager, said: "Why the change? Well,
one factor was all the comments we got about how excited people were
that Google Print would help them print out their documents, or Web
pages they visit -- which of course it won't."
Meanwhile, in anticipation of the new digital marketplace, two other
companies scrambled to accommodate fee-based online viewing. Amazon
announced Amazon Pages (unlike "Search Inside the Book," a fee will be
charged for page viewing of certain books). And, separately, one of the
nation's largest publishers, Random House, set a price for future
transactions -- 4 cents per page for viewing more than 5 percent of a
book.
"Brewster Kahle is an activist, not an empire-builder," said Paul Saffo
of the Palo Alto-based Institute for the Future.
"What I've always admired about Brewster Kahle is his attitude -- 'let's
get the job done and find out what the wrinkles are,' " said UC's
Duguid.
"If they would team up -- with Google's strength and Kahle's
philosophy," Duguid mused, "that would be great."
When asked whether an association with Open Content Alliance was in the
works, Google spokesman Nate Tyler said, "We are talking to them, but
there's nothing to announce yet." More than once, Kahle has expressed
his desire to see Google join forces with the project in some capacity.
Before taking the podium at the Presidio, Kahle told a visitor, "I
applaud Google's efforts. They've got a bold vision. But their approach
seems to have caused lawsuits.
"C'mon, guys! Let's get the businesspeople back at the table, and send
the lawyers back to their cubicles!"
Circulating in the crowd that night was Kahle's wife, Mary Austin,
founder of the San Francisco Center for the Book, and their two sons. It
was the end of a long day. The next morning, the family was set to fly
to China, where Kahle would address an international conference on
digital libraries.
"If we do this right, it will be remembered as one of the great things
humans have done, up there with the Library of Alexandria, Gutenberg's
press and putting a man on the moon," Kahle said in closing. "We're
going step by step -- first, let's see if we can get the technology
right so that you'll actually want to see a book on a screen."