Google's Caffeine, the New Search Index

June 8th, 2010: Google announced the completion of its new web indexing system, called Caffeine. It provides 50% fresher results for web searches than the previous index, and it is the largest collection of web content Google has offered. Whether it is a news story, a blog, or a forum post, users can now find links to relevant content much sooner after it is published than was possible before.

Why did Google build Caffeine? Content on the web is blossoming, and not just in size or number of pages. With the advent of video, images, news, and real-time updates, the average webpage has become richer and more complex. Users' expectations of search are also higher than they used to be. Searchers want the latest relevant content, and publishers expect to be found the instant they publish.

To keep up with the evolution of the web and meet these rising user expectations, Google developed Caffeine. The illustration shows how the old indexing system worked compared to Caffeine.

The old index had several layers, some of which were refreshed at a faster rate than others. The main layer was updated every couple of weeks. To refresh a layer of the old index, the entire web had to be analyzed, which meant a significant delay between when a page was found and when it became available in search results.

With Caffeine, Google analyzes the web in small portions and updates the search index on a continuous basis. When new pages, or new information on existing pages, are found, they are added straight to the index. This means fresher information can be found than ever before. Caffeine also lets Google index web pages on an enormous scale: every second, it processes hundreds of thousands of pages in parallel. It takes up roughly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. Storing that much information would require 625,000 of the largest iPods.
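
The difference between refreshing a whole layer and updating the index continuously can be illustrated with a toy example. The sketch below is not Google's implementation; it is a minimal in-memory inverted index, and the function names (`rebuild_index`, `update_index`), the naive tokenizer, and the example URLs are all assumptions made for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Split page text into lowercase words (a deliberately naive tokenizer)."""
    return text.lower().split()


def rebuild_index(pages):
    """Batch style: re-analyze every page to produce a fresh index."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def update_index(index, url, old_text, new_text):
    """Incremental style: touch only the entries affected by one page."""
    # Remove the page from the words it used to contain.
    for word in set(tokenize(old_text)):
        urls = index.get(word)
        if urls is not None:
            urls.discard(url)
            if not urls:
                del index[word]
    # Add the page under the words it contains now.
    for word in tokenize(new_text):
        index[word].add(url)


# Hypothetical pages, used only to exercise the two functions.
pages = {
    "a.example/news": "google launches caffeine",
    "b.example/blog": "search index basics",
}
index = rebuild_index(pages)

# One page changes: only that page is re-analyzed, not the whole collection.
new_text = "caffeine search index"
update_index(index, "b.example/blog", pages["b.example/blog"], new_text)
pages["b.example/blog"] = new_text

print(sorted(index["caffeine"]))
```

In this toy version, both approaches end with the same index. The incremental update reaches it by processing a single page, which is the property that lets new content become searchable without waiting for a full refresh.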

Caffeine was built not only to deliver fresher results but also as a robust foundation that makes it possible to build an even faster and more comprehensive search engine, one that scales with the growth of information online and delivers more relevant search results.

