buygoogle  
 Google's prospects from a Google user and independent investor   
 
    


How to tell who has the bigger index - 8/14/2005 10:26:00 PM

Yahoo claimed last week that their index now covers about 20 billion items. Google has publicly acknowledged about half that number. Although most observers say that size doesn't matter much, it would certainly be an insult to Google if Yahoo really does swing a bigger stick.

I did my own non-scientific sample, and I can't find queries where Yahoo has more results than Google.

Yahoo says that their index size can't be independently verified. But the Wall Street Journal quotes Sergey Brin (sub req'd) as saying that it's not that hard for ordinary users to see who has the most comprehensive index:
Verifying the approximate size of an index is easy for anyone to do -- just find a few searches that return a small number of results (about 50). Look through them, check that they are unique, and count them.
It's important to note that searches with many results won't work here: if you get a million hits on Britney Spears, there's no way you can verify that there really are a million valid links. That figure of a million is itself only an estimate by the search engine, not a verifiable number.
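The "check that they are unique, and count them" step can be sketched in a few lines of Python. This is only an illustration: the URL list is assumed to have been copied by hand from the results pages, and the `normalize` rule (ignore the scheme, a leading "www.", a trailing slash, and any fragment) is my own guess at what counts as a duplicate, not anything Google or Yahoo documents.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to a rough canonical form (an assumed, simplistic rule)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    # Keep the query string; drop the scheme and the fragment.
    return (host, path, parts.query)


def count_unique_results(urls):
    """Count results after collapsing URLs that normalize to the same key."""
    return len({normalize(u) for u in urls})


if __name__ == "__main__":
    # Hypothetical URLs standing in for one engine's results for one query.
    collected = [
        "http://www.example.com/a/",
        "http://example.com/a",
        "http://example.com/b#top",
        "http://example.org/a",
    ]
    print(count_unique_results(collected))
```

A stricter or looser rule would change the counts, so the same rule should be applied to both engines.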

So here are 10 relatively obscure queries and the results for Google and Yahoo. My numbers include the duplicate items that each engine returns.

1. ["relatively obscure queries"] Google - 8, Yahoo - 5
2. ["blondes have no fun"] Google - 39, Yahoo - 28
3. [cowabunga schlumberger] Google - 10, Yahoo - 3
4. [studious carabinieri] Google - 51, Yahoo - 34
5. [séptimo álbum de la banda de Athens] Google - 97, Yahoo - 67
6. ["burning sepia"] Google - 56, Yahoo - 7
7. [Shickler waltz] Google - 13, Yahoo - 2
8. [hakuna matata dissemble] Google - 6, Yahoo - 4
9. [counterrevolution netflix] Google - 35, Yahoo - 27
10. [Fremdsprache bietet Unterrichtsmaterialien Didaktisierung] Google - 95, Yahoo - 3

I didn't find a single search string that yielded more results in Yahoo than Google. How can Yahoo have a bigger index if every query shows Google with more results?
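For anyone who wants to check the tally, here is a small Python sketch that totals the ten results listed above. The counts are copied straight from the list; the names `RESULTS` and `summarize` are just labels I made up for the illustration.

```python
# (query, Google count, Yahoo count), copied from the ten queries above.
RESULTS = [
    ('"relatively obscure queries"', 8, 5),
    ('"blondes have no fun"', 39, 28),
    ("cowabunga schlumberger", 10, 3),
    ("studious carabinieri", 51, 34),
    ("séptimo álbum de la banda de Athens", 97, 67),
    ('"burning sepia"', 56, 7),
    ("Shickler waltz", 13, 2),
    ("hakuna matata dissemble", 6, 4),
    ("counterrevolution netflix", 35, 27),
    ("Fremdsprache bietet Unterrichtsmaterialien Didaktisierung", 95, 3),
]


def summarize(results):
    """Return (queries where Google had more, Google total, Yahoo total)."""
    google_wins = sum(1 for _, g, y in results if g > y)
    google_total = sum(g for _, g, _ in results)
    yahoo_total = sum(y for _, _, y in results)
    return google_wins, google_total, yahoo_total


if __name__ == "__main__":
    wins, g_total, y_total = summarize(RESULTS)
    print(f"Google had more results on {wins} of {len(RESULTS)} queries")
    print(f"Totals: Google {g_total}, Yahoo {y_total}")
```

Ten queries is still a tiny sample, so the totals describe these searches only and say nothing definitive about the size of either index.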

Now I know that this site is called "buygoogle," and you should infer that I have a Google bias. But I just recorded the results as they came, didn't exclude any trials, and didn't alter the results. In half the cases I started with Yahoo, and in half I started with Google. I tried to be as fair as possible, but it's still a non-scientific sample. Even so, I sure expected the results to be a little more balanced.

Try your own search terms and feel free to post your results in the comments. (It's harder than it looks to find search terms that only yield 50 or so results!)

Update August 14, 2005 11:18 PM PDT: Maybe the difference isn't in the queries with small result sets, but in those with very large result sets. Google claims 4,770,000 hits for [britney spears], while Yahoo says they have 14 times as many at 68,200,000. Verifying that these are genuine links is left as an exercise for the reader.

8/15/2005 6:31 AM

Do a search for 'google' and 'yahoo' on each site. Note the number of search results returned, and which query returns the largest resultset on each engine.    

8/15/2005 6:55 AM

If you search for [google] and [yahoo] on each site, Yahoo shows about twice as many hits as Google does. Just as when searching [britney spears], the number is too large to verify that the estimated number of hits is real. Even with searches that have small result sets, it's pretty common to find that if you actually click through all the results pages, you wind up with a much different (and smaller) number than the initial estimate.

So the only way I know to test for this is to compare searches with small result sets, click through each page, and count the results. And in my 10-search sample, Google always came out on top.

Is there some reason that Yahoo would consistently return fewer results than Google on small searches, but return a multiple of Google's results on large searches?

9/14/2005 3:41 PM

Yes. Google filters out crap.    
