Monday, October 15, 2012

Brewster Kahle's Internet Archive




Updated 1:25 p.m., Monday, October 15, 2012
  • Brewster Kahle operates the Internet Archive in a former Christian Science church in San Francisco's Richmond District. Photo: Michael Short, Special To The Chronicle / SF

Brewster Kahle was a 19-year-old computer science student at the Massachusetts Institute of Technology when a friend posed a simple, yet life-changing question: "What can you do with your life that is worthwhile?"

Kahle came up with two answers. The first, developing a microchip to ensure the privacy of telephone conversations, didn't pan out. But 32 years later, Kahle is still happily pursuing his second big idea - to create the digital-age version of the Great Library of Alexandria.
His Internet Archive - fittingly based in an old Richmond District church that architecturally harks back to the ancient Egyptian library - is building a rich repository of modern digital culture. It's best known for the online Wayback Machine, which provides a searchable online museum of the Internet, archiving more than 150 billion Web pages that have appeared since 1996.
The nonprofit archive stretches beyond the Internet. It has recorded 350,000 television news broadcasts, including reports from around the world during the week of the 2001 terrorist attacks, and stores 200,000 digitized books.
The nearly 10 petabytes of material in the archive - equivalent to about 10 billion books - also include 900,000 audio files, among them 9,000 fan-made recordings of Grateful Dead concerts. Volunteers are even converting old home movies and stock footage of post-World War II San Francisco into digital form.
It's a mind-boggling, and constantly growing, amount of digital data, and it's all available for free, as the site's welcome says, to "researchers, historians, scholars, and the general public." With 50 times as much data expected to be produced over the next decade, it will be an ever-increasing challenge to capture, catalog and store it.
But at a time when what's brand new can almost instantly become passe, Kahle believes it's more important than ever to remember our yesterdays.
"Let's not throw out the old, even though we're going headlong into trying to invent some new future," he said. "And, in fact, the older things inform what we do."
The archive's mission of creating "universal access to all knowledge" would appear to be a Sisyphean task at best, as well as a venture that's not going to bring the 51-year-old entrepreneur and Internet pioneer the kind of money that would make a Mark Zuckerberg envious.
But Kahle isn't motivated by the pursuit of money - he says he already has "plenty of that" from previous ventures, including Alexa Internet, a Web information company that Amazon.com bought for a reported $250 million in 1999.
He's also earned plenty of accolades - in April, he was inducted into the Internet Hall of Fame, an online-only hall established by the Internet Society of Reston, Va. He was part of an inaugural class that included tech luminaries Vint Cerf, Robert Kahn and Charles Herzfeld.
His real reward, he says, is creating a place for researchers - and anyone else with a curious mind and a thirst for knowledge - to have unfettered access to the fleeting cultural artifacts of the Internet age. The "optimist and utopian" in him believes his "Library of Alexandria, Version 2" ultimately will make the world a better place.
"It's really meant to be a resource where you can come up with your own ideas," he said. "We want people to think deeper and then create new things that are worthy of putting in the library."
Kahle, a tall, balding, slightly rumpled-looking fellow, has been married for 20 years. His wife, Mary Austin, is co-founder of the San Francisco Center for the Book, a strictly analog venture that teaches classic bookbinding and letterpress techniques. They live near the archive in the Presidio and have two sons, Caslon, 18 (named for the Caslon typeface), and Logan, 15.

As he talks about his work and his staff, Kahle comes across as a proud paternal figure. But as he begins a tour of the archive - the former Fourth Church of Christ, Scientist, a neoclassic building with Greek columns on Funston Avenue - he's more like a kid showing off a new toy. With wide eyes and a slight giggle, Kahle describes how, since locating there in 2009, archive workers outfitted old storage areas and other rooms with racks of ultramodern, custom-designed computer servers.
There are even servers installed in the back of the old church's main hall, their constant whir replacing the sounds of worship. Somehow, the servers still fit naturally with the old church pews, which these days are filled with rows of Kahle's own Terracotta Army - half-size clay statues that are a sort of archive themselves, representations of employees who have worked there for three years or more.
Kahle - who on most weekends can be found on the bay in his sailboat - said the archive's unique home, built in 1923, anchors his personal and professional life.
"All of this comes from a perspective, which is why we have a Greek place with pillars," he said. "It's all about trying to help people understand what it is we're trying to do, and have ourselves be reminded. ... It's because we're supposed to be doing the public good."
Kahle is a computer geek who can go "nerd-to-nerd with anybody," but he is still able to articulate his vision to a non-tech-savvy crowd, said Cindy Cohn, legal director for the Electronic Frontier Foundation. Kahle is on the San Francisco digital rights advocacy organization's board of directors.
"That's what makes him so inspiring," Cohn said. "Brewster is one of the first people I've met in this area who had that whole package together. He's very serious about what he's doing, but he's kind of childlike in his enthusiasm, which is infectious."
Indeed, Kahle may have the perfect virtual pulpit for his efforts. "He has almost evangelistic zeal for promoting better access to information to take advantage of the opportunities that are out there," said Pamela Samuelson, a professor at UC Berkeley School of Law.
Samuelson, a renowned pioneer in digital copyright law, met Kahle about 20 years ago.
"If anything, he's become more of a visionary and more of an evangelist," she said. "He hasn't slowed down at all. I can imagine Brewster Kahle when he's 85, still out there saying, 'Oh, we can do this, we can do this.' "
He's also not been afraid to take on big companies or the federal government. Kahle, the Electronic Frontier Foundation and the American Civil Liberties Union fought a 2007 attempt by the FBI to obtain personal information about an Internet Archive user, arguing that it was unconstitutional. The FBI later withdrew the request.
He also has opposed Google's controversial project to create its own collection of digital books. Although Google and the Association of American Publishers this month settled their 7-year-old copyright dispute over the project, which already includes 20 million scanned books, Kahle still objects to the restrictions Google places on access to many of the books it has digitized.
"I come from the Internet generation, and the things we've seen work have not been these closed, walled gardens," Kahle said. "And what we're really about is having no centralized points of control. We want lots of winners. We want lots of publishers to win. We want lots of libraries to win."
Kahle was born in New Jersey, grew up in New York and studied artificial intelligence at the Massachusetts Institute of Technology, where he was challenged to think about what he really wanted to do. While still in school, he tried to develop a chip that could encrypt telephone conversations, but "couldn't figure out how to do it cheaply enough to help the everyday person," he said. After graduating in 1982, he helped start a company called Thinking Machines, designing chips for supercomputers that could "search everything," he said. His work led to the development in 1989 of the Wide Area Information Server, or WAIS, the first system that enabled connecting to and searching databases through the Internet.
In 1992, he co-founded WAIS Inc., which helped traditional print publishers get on the Internet. Kahle helped set up an early version of the Gate, which is now SFGate.com, The Chronicle's website.
America Online bought WAIS in 1995 for $13 million. In 1996, Kahle co-founded Alexa Internet, a Web research and information company still based in San Francisco's Presidio. The name is derived from the Library of Alexandria.
At the same time, Kahle co-founded the Internet Archive, and used Alexa Internet's Web-crawling technology to feed the catalog of sites in the Wayback Machine, a play on the name of a time machine used by the old TV cartoon character Mr. Peabody.
When Amazon bought Alexa Internet, Chief Executive Officer Jeff Bezos agreed to continue donating data to the Wayback Machine. Kahle stayed with Alexa for three years after the sale before moving full time to the Internet Archive.
Kahle calls TV news just as ephemeral as websites, yet it is just as "pervasive and persuasive" in its influence on modern life, from culture to politics. The archive's latest project, TV News Search & Borrow, attempts to preserve those shows for future generations.
The project began Sept. 17 with a collection of 350,000 news programs digitally recorded during the last three years from domestic TV networks and stations in San Francisco and Washington. It also includes a section devoted to news broadcasts about 9/11 from around the world. The free service can be searched online by keywords, so someone researching political debates, for example, can search for clips on "Obamacare" or "Big Bird." (If an entire program is needed, it's loaned out on DVD-ROM to observe copyright restrictions.)
The service is about fostering better "media literacy," Kahle said. "It's meant to make television news researchable, basically like newspapers have always been."
The big idea Kahle conceived more than 30 years ago has recently inspired another.
The Internet Archive Federal Credit Union is scheduled to start operating this year and fully open next year. Based in New Brunswick, N.J., the home of Rutgers University, it will serve about 135,000 mainly low-income residents. The credit union, which was granted a charter in August, is partially funded by the Kahle/Austin Foundation, a nonprofit organization formed by Kahle and his wife.
Kahle says he was moved to do something after talking to Internet Archive employees who scan books into digital form about how they struggle to meet high Bay Area rents every month. One of the credit union's aims will be to help develop housing that employees of nonprofits like his can afford.
"So how do we get around this? I don't know the answer completely, but we're going to try some things," he said. "The first thing we needed was a bank that would be up for trying to help people more than the banks are doing these days, so we thought, 'OK, let's start a credit union.' "
Kahle is confident that it, like the archive, will be a worthwhile venture. Maybe even life-changing for some. His ambition for it is simple, and familiar:
"We'll do as much good as we can with it."
2 million - Unique visitors per day.
250 - Rank among most popular websites.
10 - Petabytes of material archived (one petabyte equals 1 million gigabytes).
350,000 - TV news broadcasts archived.
2 million to 2.5 million - Digital copies of books.
100,000 - Music concerts archived.
150 - Employees.
Source: Internet Archive
Benny Evangelista is a San Francisco Chronicle staff writer. E-mail: bevangelista@sfchronicle.com
