The basic fabric of the Web today is built on pages (and the metadata that defines them) and hyperlinks. To find “things” we go to one of several search sites, most notably Google, Bing and Yahoo. These search engines crawl the web automatically, producing an index of the words, pages and links that make it up. Google’s rise to dominance came from the strength of its algorithms combined with the concept of “PageRank”, which rank-orders web pages using an iterative algorithm that looks at pages in relation to other pages (voting up pages based on the number of links, the type of links, trust, etc.). PageRank has become a very important part of the marketing optimization of web pages and helped create Google’s dominance in contextual search advertising. Here’s a graph from Statcounter.com that shows Google’s current position among the top 5 search engines:
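As an aside on the iterative ranking idea described above, here’s a minimal sketch in Python of the textbook, link-only version of PageRank. It is not Google’s production algorithm, and it ignores the link type and trust signals mentioned earlier. The function name `pagerank`, the `damping` value of 0.85, the iteration count and the four-page `toy_web` are all assumptions made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    # Start with every page equally ranked.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share on each pass.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web, invented for this example.
toy_web = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}

ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy web, “home” comes out on top because every other page links to it, and “orphan” lands at the bottom because nothing links to it. That is the “pages voting for other pages” idea in miniature.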
The web has evolved, though, and new ways of organizing it are developing. Facebook’s recent move to shift from a hyperlinked organization scheme to a people-centric approach by creating the Open Graph is one such attempt. I wrote about it here. The paradigm of a socially connected web could lead to new ways of finding “things”.
The web continues to evolve, of course, and for several years there has been discussion, research and development toward Web 3.0, or the Semantic Web. Here’s a simplified diagram of the stages of the web:
Finding things on the web, however, is anything but easy, whether we’re using words, phrases or social connections. Using search engines and hyperlinks is often something of an exercise in attention deficit disorder thinking (and I should know). Now maybe I’m part of the problem here, since my ADD brain does love chasing shiny objects, but I often start looking for something on the web only to end up hours later with a pile of unrelated information and absolutely nothing about the original subject. The problem is the underlying structure: it’s just too easy to not find what you’re looking for, because words are not precise and unique. Think about it. What happens when you search on something simple like “train”? You get Amtrak, multiple city transit web sites, train history, train videos, train pics and finally the band Train (which was what you were looking for in the first place). People’s names are even worse (unless you’re lucky and have a unique name). That’s one of the basic things the semantic web is supposed to solve. In its simplest form, the semantic web is supposed to give things a unique identity so that they can be found and used much more efficiently and effectively. The idea is to create methods and technology that enable computers to “understand” the meaning of words. That’s what Metaweb and its Freebase entity database are designed to do: provide order and uniqueness.
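To make the uniqueness idea concrete, here’s a minimal sketch in Python of the difference between searching by word and looking something up by a unique entity identifier. Everything in it is hypothetical: the `Entity` class, the `ex:` identifiers and the tiny `ENTITIES` list are invented for illustration and are not Freebase’s actual data model, identifiers or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    entity_id: str  # hypothetical unique identifier, not a real Freebase ID
    name: str       # the human-readable word, which may be ambiguous
    kind: str       # what type of thing this entity is


# A tiny made-up entity store.
ENTITIES = [
    Entity("ex:train-vehicle", "Train", "mode of transport"),
    Entity("ex:train-band", "Train", "musical group"),
    Entity("ex:amtrak", "Amtrak", "railway company"),
]


def search_by_word(word):
    """Word search: returns every entity sharing the name, so it can be ambiguous."""
    return [e for e in ENTITIES if e.name.lower() == word.lower()]


def lookup_by_id(entity_id):
    """Identifier lookup: returns the one matching entity, or None if there is none."""
    for entity in ENTITIES:
        if entity.entity_id == entity_id:
            return entity
    return None


matches = search_by_word("train")
band = lookup_by_id("ex:train-band")
print(len(matches), "entities are named 'Train'")
print(band.name, "is a", band.kind)
```

The word “train” matches two different things, while the identifier points at exactly one of them. A shared database of unique entities gives web sites and software that second kind of lookup.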
Freebase provides an open database of information about all sorts of things, ranging from music to movies. The idea is to open source the collection of relevant data, tying it to unique entities stored in an open format that can be accessed and used by web sites, people, etc. The impact of this massive database grows as its size rapidly increases. Google already uses Freebase’s information in Google News searches, so the acquisition is a natural fit. Here’s a video from Metaweb that gives an excellent overview of Freebase:
This acquisition is a strong move for Google and for the semantic web movement in general. It will be interesting to see how Google integrates Freebase into its offerings.