Beyond the unspoken collective effect on Google's results, the blog world has already been mined for global patterns in a number of interesting experiments, like Blogdex, which creates a kind of alternative headline news by tracking popular URLs in recent posts. Then there's Weblog Bookwatch, which scans for Amazon URLs in new blog entries, and constructs a regularly updated list of books that are "top of mind" with bloggers. (An interesting corrective to ordinary bestseller lists, in that it measures which books get talked about, rather than which ones get bought.)
But both Blogdex and Bookwatch share a conceptual limitation with most individual blogs, a limitation that is hard-wired into the software used by the great majority of webloggers: They are organized around time.
Time is central to the philosophical DNA the blogs share with journalism: Both compulsively feature today's link, today's controversy, today's top books. This might seem like an obvious organizational principle, but it comes with great restrictions. Google, for instance, is largely oblivious to time: When you use Google, you're usually not looking for up-to-the-minute info, you're looking for authority and depth. (Try getting a useful stock quote directly from Google and you'll understand immediately.) Many of the bloggers that I follow comment on links that are time-sensitive on the scale of a year or two: Someone's rant on the latest XML spec revisions is just as relevant next week, though probably not nearly so relevant a decade from now. But because those links fall off the front door every few days, they effectively enter a de facto oblivion, where I have to hunt them down actively three weeks later when I'm looking around for useful assessments of XML. The beautiful thing about most information captured by the bloggers is that it has an extensive shelf life. The problem is that it's being featured on a rotating shelf.
If there's a time element that I do care about, it's not the just-off-the-wires time of today's news. It's my time. It's what I'm doing right now. I don't always want to know what über-blogger Jason Kottke happens to be thinking about this morning -- I want to know what he thinks about the page I'm currently reading, or the paragraph I just wrote. If I stumble across a page 10 weeks after Jason wrote up a description of it on Kottke.org, his description is just as valuable to me as it was 10 weeks before -- in fact, it's probably more valuable, because I've come across the page on my own personal journey. But as it stands now, to figure out if Jason's referenced the page, I have to copy the URL and paste it into the search engine on Kottke.org. If I've got 20 or 30 bloggers that I'm following, I've got to paste that URL into 20 or 30 separate input fields.
But the bloggers needn't be anchored to the headline-news mentality. Think of them as less like a newspaper substitute and more a kind of guardian angel, hovering over your shoulder as you surf. (The Alexa software created by Brewster Kahle relied on a similar approach: He called it a "surf engine.") Punch up a URL and if Jason, or Andrew Sullivan, or Sopsy has an opinion about that page, you see their comments in a floating window alongside your main browser window. It's a simple enough trick: Sites like Blogdex are already tracking blog-borne references to different URLs. All your browser would have to do is send an additional request to a database of blogged URLs anytime you pulled up a page: If there's a match -- if one of the bloggers you're following has referenced the URL -- their comments get sent back to your machine and appear in the floating palette.
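To make the trick concrete, here is a minimal sketch of that lookup in Python. It assumes an in-memory index rather than a real networked database, and the names (BloggedUrlIndex, normalize, lookup) are invented for illustration; they are not part of Blogdex or any existing tool. The normalization rule (ignore scheme, host case, fragment and trailing slash) is likewise an assumption about what should count as "the same page."

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to a comparison key (assumed rule: drop the scheme and
    fragment, lowercase the host, strip a trailing slash from the path)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{host}{path}{query}"


class BloggedUrlIndex:
    """Hypothetical stand-in for the database of blogged URLs."""

    def __init__(self):
        self._by_url = defaultdict(list)

    def add(self, blogger: str, url: str, comment: str) -> None:
        """Record that a blogger commented on a URL."""
        self._by_url[normalize(url)].append((blogger, comment))

    def lookup(self, url: str, following: set) -> list:
        """Return (blogger, comment) pairs for this URL, limited to the
        bloggers the reader follows."""
        found = self._by_url.get(normalize(url), [])
        return [(b, c) for b, c in found if b in following]


index = BloggedUrlIndex()
index.add("kottke", "http://Example.com/spec/", "Worth reading.")
index.add("stranger", "http://example.com/spec", "Not for me.")

# The browser would run this each time a page loads.
for blogger, comment in index.lookup("http://example.com/spec#intro", {"kottke"}):
    print(f"{blogger}: {comment}")
```

The filter on followed bloggers happens at lookup time, so the same index could serve every reader; whether that work belongs on the server or in the browser is a design choice the scenario leaves open.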
The critical standardized part in this machine is the URL: Because pages -- and Amazon products -- have distinct identifying text strings, you can assemble references to them into new higher-level forms of information: bookblogs and blogdexes and guardian blogs. But the URL is only one potential component part among many. If we had standardized tags for just five or six additional elements, you could start mining the blog space for on-the-fly information resources that would truly rival Google's. You'd need fixed categories describing who is doing the linking and who his or her "friends" are; you'd need a summary of the response to the link, alongside the full text of the response; you'd need keywords, as well as the number of comments generated in an active thread responding to the link.
Perhaps most important, you'd also need a way to distinguish between positive and negative links. Right now, systems like Google's PageRank presume that the decision to link to a page is by definition an endorsement of the page linked to. You need only think of how many times Andrew Sullivan has linked to the Op-Ed columns of his arch-nemesis Paul Krugman to recognize the flaw in this logic. Positive linking should certainly be the default, but if bloggers are going to be organizing the Web for us, they need to be able to point to pages that suck without giving those pages an even higher standing on Google.
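As a rough illustration, the standardized elements described above (who is linking, their friends, a summary, the full response, keywords, comment count, and the sign of the link) could be carried in a record like the one below. This is a sketch in Python, not a proposed standard: the class name BlogLink, its field names, and the one-point-per-link scoring in net_endorsements are all assumptions made for the example, and the scoring is far simpler than anything Google actually does.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class BlogLink:
    """One blogged reference to a URL (hypothetical schema)."""
    blogger: str
    url: str
    summary: str
    full_text: str
    friends: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    comment_count: int = 0
    endorses: bool = True  # positive linking is the default


def net_endorsements(links) -> dict:
    """Count +1 for each positive link and -1 for each negative one."""
    score = Counter()
    for link in links:
        score[link.url] += 1 if link.endorses else -1
    return dict(score)


links = [
    BlogLink("sullivan", "http://example.com/krugman-column",
             "Wrong again.", "Full post text here.", endorses=False),
    BlogLink("reader-a", "http://example.com/krugman-column",
             "A sharp column.", "Full post text here."),
    BlogLink("reader-b", "http://example.com/xml-rant",
             "Useful take on the spec.", "Full post text here."),
]
print(net_endorsements(links))
```

Under this toy scoring, a critical link cancels out an approving one instead of adding to the page's standing, which is the behavior the negative-link tag is meant to make possible.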
If the blog space were to standardize around these categories, what kind of information-management tools might we be able to create? Here's one scenario. You define a few "guardian" bloggers, perhaps by checking a box when you visit their sites. You also instruct your software to watch the activity on sites maintained by "friends" of those key bloggers. You tell the software that you want a medium level of intrusiveness: In other words, you want the system to point out useful information to you, but you don't want it constantly bombarding you with data at every turn. And then you start using your computer as you normally do: surfing, writing e-mail, drafting Word documents.
Behind the scenes as you write or read, the software on your machine scans the last few paragraphs for high-information text, the six or seven words that make that passage distinct from the average text sitting on your machine. If there's a URL included in the text, it grabs that too. The software then sends a query to the blogs maintained by your guardian bloggers, as well as those maintained by their friends -- say 20 blogs in all -- and searches for posts that include those keywords. Since you've defined a medium level of intrusiveness, it might only grab the URL and summary text for posts that match half of your keywords, and that appear on 25 percent of the blogs you're tracking (five of the 20). Let's say Jason Kottke has linked to a related article; if four other bloggers you're following have also linked to that URL, Jason's description of the article pops up beside the paragraph you've just written.
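The scenario above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: "high-information" words are approximated by how often a word appears in the passage relative to a background sample of your own text, posts arrive as plain (blogger, url, text) tuples, and the two defaults stand in for the "medium" setting of half the keywords and 25 percent of the tracked blogs.

```python
import re
from collections import Counter, defaultdict

WORD = re.compile(r"[a-z0-9']+")


def tokens(text: str) -> list:
    """Lowercase the text and split it into simple word tokens."""
    return WORD.findall(text.lower())


def distinctive_words(passage: str, background: str, k: int = 6) -> list:
    """Pick the k words most over-represented in the passage compared
    with the background text (ties broken alphabetically)."""
    bg = Counter(tokens(background))
    fg = Counter(tokens(passage))
    return sorted(fg, key=lambda w: (-fg[w] / (1 + bg[w]), w))[:k]


def suggestions(keywords, posts, tracked_blogs,
                keyword_share: float = 0.5, blog_share: float = 0.25) -> list:
    """Return URLs linked from matching posts on enough tracked blogs.

    A post matches when it contains at least keyword_share of the keywords;
    a URL is suggested when matching posts that link to it come from at
    least blog_share of the tracked blogs.
    """
    wanted = set(keywords)
    if not wanted:
        return []
    bloggers_by_url = defaultdict(set)
    for blogger, url, text in posts:
        if blogger not in tracked_blogs:
            continue
        hits = wanted & set(tokens(text))
        if len(hits) >= keyword_share * len(wanted):
            bloggers_by_url[url].add(blogger)
    needed = blog_share * len(tracked_blogs)
    return sorted(u for u, who in bloggers_by_url.items() if len(who) >= needed)
```

With 20 tracked blogs and the default settings, a URL needs matching posts from five of them before it is suggested, which corresponds to Jason plus four others in the scenario. Raising or lowering the two shares is one way to express a higher or lower level of intrusiveness.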
This wouldn't be a recommendation engine so much as a connection machine, tracking the flow of words across your screen and linking them fluidly to other text residing on the Web. You can make those connections as loud or as soft as you want: Perhaps the software only suggests other URLs and blog posts when you request them. (Running your blog analyzer might be akin to running a spell checker when you're done with a draft.) Other users might set their thresholds around timeliness or "heat" -- only pop up a window when there's a related link that's been posted in the past 24 hours, or when there's a link that's generated a 20-post discussion thread.
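Those volume settings amount to a small filter applied to each candidate match. The Python sketch below is one possible reading, with invented names (Match, should_pop_up) and one assumption worth flagging: when a user sets both a freshness limit and a "heat" limit, the filter requires both, although the two are described above as preferences of different users.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Match:
    """A related blog post found for the text on screen (hypothetical)."""
    url: str
    posted_at: datetime
    thread_length: int  # number of posts in the discussion thread


def should_pop_up(match: Match, now: datetime,
                  max_age: Optional[timedelta] = None,
                  min_thread: Optional[int] = None,
                  on_request_only: bool = False,
                  requested: bool = False) -> bool:
    """Decide whether a match is shown, given the user's thresholds."""
    if on_request_only and not requested:
        return False  # quiet mode: behave like a spell checker run on demand
    if max_age is not None and now - match.posted_at > max_age:
        return False  # too old for a timeliness-minded user
    if min_thread is not None and match.thread_length < min_thread:
        return False  # not enough discussion for a heat-minded user
    return True
```

A timeliness-minded user would call it with max_age=timedelta(hours=24), a heat-minded user with min_thread=20, and a user who wants silence with on_request_only=True.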
There are almost as many potential ways to manage that new flow of information as there are bloggers providing it. But to open up these new avenues, the bloggers are going to have to shed their dependence on the traditional journalistic models: Instead of going to today's blog the way you pick up today's paper, the bloggers should follow us around, providing context and commentary, supplementing our libraries and our memory. Many blogs out there possess the standards and intelligence of conventional journalism, but there are already too many of them to keep track of the way we subscribe to old-style magazines or habitually tune in to favorite TV networks. If the blogging population expands at the current rate, soon enough you'll be able to spend an entire day just reading the front doors of all your bookmarked blogs. Better to do away with the dependence on front doors, and let your favorite bloggers come to you.
In an essay published in last month's Business 2.0, James Wolcott describes the blog experience as "a one-on-one unmediated relationship between writer and reader paradoxically made possible by the most mass of media, the Internet. Each blog is like a blinking neuron in the circuitry of an emerging, chatterbox superbrain." It's a typically well-crafted phrase, and there's something undeniably compelling about the description, but the fact that Wolcott tosses out both ideas -- one-on-one relationships and superbrains -- as though they were synonymous suggests that it's the poetry of the words that attracts him, rather than the underlying substance. There is a world of difference between the one-on-one encounter and the emerging superbrain. Blogs already excel at the former -- they're long on one-on-one encounters. But their emerging superbrains could use a little work.