It has been a wild few weeks in search engines, or search-engine-like services. We’ve seen the introduction of no fewer than three high-profile tools (Wolfram|Alpha, Microsoft Bing, and Google Squared), each with its own strengths and each needing its own techniques, or at least its own distinct frame of reference, to maximize its usefulness. This post describes these three services, what they’re generally good for, and how to use them. We’ll also do a couple of sample searches to show how each is useful in its own way.
The first new service, chronologically, is Wolfram|Alpha. The answer to the first question of its FAQ, “Is Wolfram|Alpha a search engine?”, tells you that it is something different:
No. It’s a computational knowledge engine: it generates output by doing computations from its own internal knowledge base, instead of searching the web and returning links.
A “computational knowledge engine”? What’s that? It is an attempt to gather facts, formulas, and natural language processing algorithms, encode them all in the language of Mathematica (don’t worry — the Wolfram|Alpha folks do that for us), and answer any factual question posed to it. An ambitious goal? It sure is, and the state of the system now is an impressive start. According to the service’s about page, “Wolfram|Alpha contains 10+ trillion of pieces of data, 50,000+ types of algorithms and models, and linguistic capabilities for 1000+ domains.”
So what is Wolfram|Alpha good for? Facts and computations about facts. The service already knows a lot about the world: geography, economics and socioeconomic data, physics, chemistry, engineering, sports, units of measurement, weather, and even music. The results come back as graphs, maps, simple facts, and tables. There is usually a “Source Information” link that lets you know how it got the answer.
- Growth chart for a 12 year old girl
- How deep is the Bering Sea?
- Swine flu (It would appear the data is regularly curated and up-to-date.)
- second cousin twice removed (shows a graph)
- Will it rain tomorrow? (I’m assuming it is guessing where I am based on my IP address, but it doesn’t give any indication of this)
- How far is Las Vegas? (Again, it seems to know where I am.)
- Where is the international space station?
How is Wolfram|Alpha different from the typical web search engine that crawls the web and processes keyword searches? The first difference is in the input: all of the facts, formulas and relationships between data points are curated by real humans. A search engine crawls the web, indexes whatever text it can find, and attempts to compute the relevance of a page based on the number and quality of links leading to that page. The researchers behind Wolfram|Alpha select the data sets to put into the system, and go to great lengths to link data sets together. The second difference is in how it processes the user’s query: Wolfram|Alpha tries to determine the real meaning of the question being asked. Natural language parsers attempt to determine the domain(s) of the question, and that becomes part of the computation for the answer. (For instance, Wolfram|Alpha doesn’t understand the meaning behind the question “How many goobles are in a pickus?”, but Google will give you 64 web pages when it is asked the same question. Wolfram|Alpha also prompts you to disambiguate queries that apply to multiple domains.)
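To make the “number and quality of links” idea concrete, here is a minimal sketch of a PageRank-style calculation in Python. It is only an illustration of the general technique, not the actual ranking used by Google or any other engine; the function name `link_scores`, the damping value, and the four-page toy web are all assumptions made up for this example.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Score pages by incoming links (simplified PageRank-style iteration).

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score among the pages it links to, so a
                # link from a high-scoring page is worth more.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical toy web: pages "a", "b" and "d" all link to "c"; "c" links to "a".
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = link_scores(toy_web)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[:2])
```

Page “c” ranks first because it has the most incoming links, and “a” ranks second even though it has only one incoming link, because that link comes from the highest-scoring page. That is the “quality” half of the idea.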
Looking for more information? David Weinberger has a good analysis of the underpinnings of Wolfram|Alpha and what it might mean for those of us in the business of answering questions. Jon Udell talks about being able to “compute with facts in a more frictionless way.” There is also an article in the Chronicle of Higher Education that talks about how Wolfram|Alpha can be used to answer mathematical questions and show how to get to the answer; this is generating some consternation among math instructors.
And what is a geeky service without some easter eggs[1]? For instance, see Wolfram|Alpha’s response when you say “hello” to it. If you are a fan of 1960s culture, you can ask it “How many roads must a man walk down before you can call him a man?”, with a nod to Bob Dylan. Lost? Wolfram|Alpha knows where you are, or at least where your computer is. It even has an answer for “How much wood could a woodchuck chuck if a woodchuck could chuck wood?” (Easter eggs courtesy of two posts at Mashable.com.)
Bing is Microsoft’s new search engine, replacing Live.com. Of the three systems described in this post, it is the most similar to your experience with existing search engines. Bing makes an extra effort to help searchers with some targeted topics; from the Microsoft press release: “Microsoft’s research identified shopping, travel, local business and information, and health-related research as areas in which people wanted more assistance in making key decisions. The current state of Internet search isn’t optimized for these tasks, but the Bing Decision Engine is optimized for these key customer scenarios.” For readers who are academic librarians, those may not be topics geared towards academics, but they might be useful in your own research. In fact, I couldn’t find anything in Bing’s feature set that is particularly attractive to an academic community. That said, there are some tweaks over the competition that are interesting to look at. One of the first things to notice is the list of search refinements along the left side. A search on Bing for Columbus, OH, for instance, includes links to search Bing for topics like attractions, hotels, and tourism. The refinements differ based on the context of the search term; for instance, you see different refinements in a search for digital cameras. As Ars Technica points out, the search refinements will even differ within the same type of search; the search for Toronto, for instance, has refinement links to weather, airport and real estate. The left-side bar also shows suggested “related searches” (although it is not clear how “related searches” are different from the search refinement links above them) and your search history.
The interface also has other usability improvements: thumbnails of videos start playing when you mouse over them; the image search results employ “infinite scrolling” (where the browser loads additional hits in the background as you scroll down) and offer a variety of ways to limit results (in facets down the left side); and “best matches” get unique handling that allows for further navigation within a site (for instance, see the results pages for “Google” and “Ars Technica”).
The last service is one that is in Google Labs and is called “Google Squared”. “Squared” refers to its ability to construct a table of facts from two search terms, similar in result to a spreadsheet. On one axis you can put a general search term (say, “roller coasters”, as in the example from the service announcement), and across the other axis you add headings that describe the facts you want to know about the search term, such as height and speed. The result is a table of roller coasters with their heights and speeds. You can add facts to your table by putting the term at the top of an empty column (say, for this example, “location”). Click inside a cell and you can see the source of the answer, alternative answers, and the ability to change which answer is listed in the cell.
As mentioned above, Wolfram|Alpha is also a good tool for finding facts. In contrast with Wolfram|Alpha, though, where all of the information is specifically curated so that the data sets link up with each other, the facts in Google Squared are collected from the web. As such, you’ll see variability of information, as this example screen shot of the speed of the Superman roller coaster shows. But unlike Wolfram|Alpha, which may give you only the barest citation of data sources, with Squared you can go right to the page where the fact came from and use that page to determine the validity of the fact. As with many of Google’s services, it starts out okay, but if it continues to get resources within the company, we can expect it to get a lot better over time.
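One way to picture what a Squared cell holds is as a short list of candidate answers, each tied to the web page it came from, with one candidate currently displayed. The Python sketch below is purely illustrative: the class names `Candidate` and `Cell`, the placeholder values, and the example.com URLs are all hypothetical, and nothing here reflects how Google actually stores the data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Candidate:
    """One possible answer for a cell, with the page it was found on."""
    value: str
    source_url: str


@dataclass
class Cell:
    """A table cell holding every candidate answer; one of them is displayed."""
    candidates: List[Candidate] = field(default_factory=list)
    chosen: int = 0  # index of the candidate currently shown in the table

    def value(self) -> Optional[str]:
        return self.candidates[self.chosen].value if self.candidates else None

    def source(self) -> Optional[str]:
        return self.candidates[self.chosen].source_url if self.candidates else None

    def choose(self, index: int) -> None:
        """Switch the displayed answer to one of the alternatives."""
        if not 0 <= index < len(self.candidates):
            raise IndexError("no such candidate answer")
        self.chosen = index


# Placeholder data only: two pages that disagree about the same fact.
speed = Cell([
    Candidate("value reported by page A", "https://example.com/page-a"),
    Candidate("value reported by page B", "https://example.com/page-b"),
])
print(speed.value(), speed.source())
speed.choose(1)  # the searcher decides page B is more trustworthy
print(speed.value(), speed.source())
```

Keeping the source next to every candidate is what lets a searcher judge the validity of a fact instead of taking the displayed answer on faith.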
This screencast shows a comparison of the three services with three distinct searches that highlight the unique capabilities of each service.
The three searches, with the three corresponding links to the three services, are listed below. The first, bolded service is the featured service for that search.
- Will it rain tomorrow?
- Wolfram|Alpha, Google Squared, Microsoft Bing
- Microsoft Bing, Google Squared, Wolfram|Alpha
- Republican Governors
- Microsoft Bing, Wolfram|Alpha, Google Squared
Speaking of comparisons, a new service called “Blind Search” lets you run the same search across Bing, Google, and Yahoo and compare the results. The hits come back in three columns, but the search engine used to generate each column of hits isn’t revealed until you select the column that gave you the best results. So, in a blind test, you can see if one search engine is better than another in terms of the raw relevance-ranked results, without all of the additional bells and whistles of each service.
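The blind-test idea is simple enough to sketch in a few lines of Python. This is a rough illustration of the general approach, not Blind Search’s actual code; the function names `blind_columns` and `reveal`, the engine labels, and the sample hits are all invented placeholders.

```python
import random


def blind_columns(results_by_engine, rng):
    """Shuffle the engines and return (unlabeled columns, hidden answer key)."""
    hidden_order = list(results_by_engine)
    rng.shuffle(hidden_order)
    columns = [results_by_engine[engine] for engine in hidden_order]
    return columns, hidden_order


def reveal(hidden_order, picked_column):
    """After the searcher votes, show which engine produced that column."""
    return hidden_order[picked_column]


# Placeholder hits; a real test would hold each engine's results for one query.
sample = {
    "engine_a": ["engine_a hit 1", "engine_a hit 2"],
    "engine_b": ["engine_b hit 1", "engine_b hit 2"],
    "engine_c": ["engine_c hit 1", "engine_c hit 2"],
}
columns, key = blind_columns(sample, random.Random())
for number, hits in enumerate(columns, start=1):
    print("Column", number, hits)  # no engine names are shown here
print("Column 1 came from:", reveal(key, 0))
```

The important design point is that the answer key is kept apart from the columns shown to the searcher, so the vote cannot be influenced by brand loyalty.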
The text was modified to update a link from http://chronicle.com/free/2009/06/19910n.htm to http://chronicle.com/article/A-Calculating-Web-Site-Could/47316/ on January 20th, 2011.
1. Easter eggs are messages, videos, graphics, sound effects, or an unusual change in program behavior that sometimes occur in a software program in response to some undocumented set of commands, mouse clicks, keystrokes or other stimuli intended as a joke or to display program credits. (Definition courtesy of Wikipedia.)