Exercise 2 - Web Search

By BrianHeM10 (10 Sep 2008)

Web Search Exercise

Introduction

When someone wants to search for something online, what do they do? They don't search for it, they "Google" it. Google has increasingly become the default search engine for people across the world, with a presence in desktop applications, mobile phones, instant messaging programs, and more. There was a time when people would consider Yahoo, Ask.com, AltaVista, and a few other major search engines when trying to find information. Now, most searches begin and end with Google, whether that takes one query or several.

This exercise introduces us to the crazy idea that Google is not the best at everything and that, by leveraging the core strengths of multiple search tools, we can search more efficiently by thinking outside the gBox. Also, even within Google, the vast majority of searches are "simple queries" - a few words without any advanced search syntax. This exercise opened our eyes to more advanced ways of finding information within Google and Yahoo.

[Logos: Yahoo, Google, Windows Live, Technorati, Google Blog Search, Bloglines]

Learning Advanced Search Syntax

Although most people do not know this, when one types 'university of michigan' (without the quotes) into Google, Google does not search for that exact phrase. It matches pages that contain the words university and michigan anywhere on the page. Although most results will relate to our University, some (especially those past page 1) will relate to Michigan State University, Central Michigan University, Eastern Michigan, etc. Hence, Google returns over 25 million results for that search! There must be a way to control these results…and there is. Below are some of the new ways I can search for information:

  • Search for X but Not Y (X -Y)
    • If I wanted to search for cheese but I hate cheddar cheese, my query could be "cheese -cheddar"
    • Returns results related to cheese that do not have cheddar in the result
  • Search X + Words Similar to X (~X)
    • If I wanted to search for different kinds of wood doors, my query could be "~wood doors"
    • Returns results with doors + any word synonymous with wood, such as oak, birch, etc
  • Search X in Page Title (intitle:X)
    • If I wanted to search for websites related to granite that contained 'rocks' in the page title, my query would be "granite intitle:rocks"
    • Returns websites related to granite that have "rocks" in the page title
  • Search X in Page URL (inurl:X)
    • If I wanted to search for websites related to speakers that contained 'Logitech' in the page URL, my query would be "speakers inurl:logitech"
    • Returns results about speakers that have Logitech in the URL (e.g. http://www.cnet.com/reviews/logitechz5500.html)
  • Search X within site Y (X site:Y)
    • If I wanted to search for pages about the Michigan Union but only wanted pages on umich.edu, my query would be "union site:umich.edu"
    • Returns results about the Michigan Union that come only from the umich.edu domain
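
The operators above all follow the same pattern: plain search terms plus an optional prefix. As a rough sketch, the small Python helper below assembles such query strings. The function name build_query and its parameters are hypothetical names made up for illustration; they are not part of any search engine's API. The helper only builds the text you would type into the search box.

```python
def build_query(terms, exclude=(), intitle=None, inurl=None, site=None):
    """Assemble a query string from plain terms and optional operators.

    Hypothetical helper for illustration: it builds text only and does
    not contact any search engine.
    """
    parts = [terms]
    # X -Y: prefix each unwanted word with a minus sign
    parts += [f"-{word}" for word in exclude]
    if intitle:
        parts.append(f"intitle:{intitle}")  # word must appear in the page title
    if inurl:
        parts.append(f"inurl:{inurl}")      # word must appear in the page URL
    if site:
        parts.append(f"site:{site}")        # restrict results to one domain
    return " ".join(parts)


print(build_query("cheese", exclude=["cheddar"]))  # cheese -cheddar
print(build_query("granite", intitle="rocks"))     # granite intitle:rocks
print(build_query("speakers", inurl="logitech"))   # speakers inurl:logitech
print(build_query("union", site="umich.edu"))      # union site:umich.edu
```

Because every operator is just more text in the query, they can be combined freely, for example build_query("union", exclude=["labor"], site="umich.edu").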

The intitle:, inurl:, and site: operators have been helping me a lot in the last few days. About 50% of my searches now take advantage of at least one of the three. They really help when you have a more specific idea of what type of page you are looking for. For example, I rely on CNet and Tomshardware for hardware reviews, but neither site lets me search both at once. Instead, I can type 'Intel Q9550 site:cnet.com OR site:tomshardware.com' to get information about that processor from only those two websites.
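
The two-site search can be generalized to any list of sites. The sketch below joins several site: filters with OR and then URL-encodes the query using Python's standard urllib.parse.urlencode. The helper name multi_site_query is hypothetical, and the https://www.google.com/search?q=... address is an assumption based on how Google search links commonly look, not something taken from the exercise.

```python
from urllib.parse import urlencode


def multi_site_query(terms, sites):
    """Restrict a query to several sites by OR-ing site: filters (hypothetical helper)."""
    site_filter = " OR ".join(f"site:{s}" for s in sites)
    return f"{terms} {site_filter}"


query = multi_site_query("Intel Q9550", ["cnet.com", "tomshardware.com"])
print(query)
# Intel Q9550 site:cnet.com OR site:tomshardware.com

# Assumed URL shape: the query travels in the "q" parameter.
url = "https://www.google.com/search?" + urlencode({"q": query})
print(url)
# https://www.google.com/search?q=Intel+Q9550+site%3Acnet.com+OR+site%3Atomshardware.com
```

Note that OR has to be in capital letters, and that the encoding step turns spaces into + and colons into %3A so the query survives inside a link.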

Exploring Web Search Engines

The next part of the exercise was to compare the results of the three largest search engines - Google, Yahoo, and Microsoft Live Search. For the exercise, we compared what each of the three returned for the query "Climate Change". My observations:

  • First result for both Google and Yahoo was EPA.gov; first result for Live.com was the Wikipedia entry for climate change
  • Second result for Yahoo was its own News service section about climate change
  • Live's and Yahoo's presentations are much busier than Google's

Exploring Blog Search Engines

Next, we tested and compared the three largest blog search tools: Technorati, Google Blog Search, and Bloglines. I had heard of Technorati and Bloglines before but never used them. Our search query was once again Climate Change. My observations:

  • Technorati's first result was about climate change as it relates to polar bears and the ice caps - I like this topic. It seems to rank results by most recently posted, as opposed to most linked / most read (as Google does)
  • Google's first result is strictly about climate change - it ranks results by relevancy and apparently by popularity. Note: it includes "related blogs" that appear to be directly about the search topic
  • Bloglines' first result is pretty random; it just happens to have climate change in the body. Results can be ranked by date, relevancy, or popularity.

Technorati's presentation was very nice and has more of a portal/news site look compared to Google Blog Search, which is strictly a search-and-find site. I can now go to Technorati to find blog posts, similar to how I can go to the New York Times website to find articles about sports. Bloglines has yet to really satisfy me with its results.

Exploring Other Search Engines

To further break the idea that Google's web search solves all problems, we explored some other types of search engines, both inside and outside of Google, such as:

  • Google News returns results from newspapers, magazines, and news sites
  • Clipoid (powered by Google) is useful for searching video clips, but it returned primarily YouTube results. Why use Clipoid over Google Video?
  • Google Images returns images with climate change in the file name
  • Yahoo Directory returns lists of sites or companies that relate to a very specific set of information. The benefit is that you can narrow your search down to a niche topic and get a set of results you know are 100% relevant

I've known Yahoo Directory existed for a while, but never understood its usefulness until now. It's helpful to have a tool that lets you go in without having a strong idea about what you are looking for.

Gerald Ford Exercise

For the final part of the exercise, we used a few different search engines to compare the results of searching for Gerald Ford:

  • Yahoo Directory - Related categories are US Presidents and 20th Century History
  • Yahoo returns Gerald Ford's Wikipedia page and mostly information about his presidency (16m results)
  • Google returns 2.3m results
  • Adding -automotive and -cars to the query brings Yahoo's results down to 12.9m and Google's to 1.8m
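
To put those drops in perspective, here is a quick calculation of how much the two exclusions trimmed each result set. It is a small sketch using the counts listed above (in millions); the function name percent_reduction is just an illustrative name.

```python
def percent_reduction(before, after):
    """Percentage of results removed when a count drops from before to after."""
    return (before - after) / before * 100


# Result counts in millions, taken from the list above: (before, after)
counts = {"Yahoo": (16.0, 12.9), "Google": (2.3, 1.8)}

for engine, (before, after) in counts.items():
    print(f"{engine}: {percent_reduction(before, after):.1f}% fewer results")
# Yahoo: 19.4% fewer results
# Google: 21.7% fewer results
```

So both engines lost roughly a fifth of their results once the car-related pages were excluded.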

Conclusion

This exercise was surprisingly interesting and very useful. As Professor Moore told us when the class first started, there is more to searching than typing "xxxxx" into Google and rinse/repeat if necessary. There are some really helpful search syntax techniques that are overlooked. Also, I had heard of Technorati but never used it, and Clipoid was new to me, so it was nice to try those out. Considering how much I learned in the class's first exercise (especially for someone of my technical background), I am excited to see what is in store for future classes.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License