[Web4lib] Another Google question

Patricia F Anderson pfa at umich.edu
Tue Jul 5 18:46:02 EDT 2005


Bernie, this happens a lot. I consider this one of Google's weak points, 
and for this purpose prefer AltaVista. There was a discussion about this 
very topic on Web4Lib a while back in which Rich Wiggins contributed some 
valuable insights. As I recall, it has something to do with Google's 
ranking algorithm and filtering: they don't include certain types of 
pages in results for certain types of searches. I tried reporting this, 
but it made no difference.

Regarding your other question about variable search results, this also is 
not unusual, although you found some LARGE differences in the real versus 
reported results! I haven't tried the "repeat the search with the omitted 
results included" button on large sets, only on small. It doesn't surprise 
me at all, though.

I just tried a search for the word "the". Reported results were 
3,190,000,000. Maximum displayed results were 946. The "Repeat the search" 
button yielded the same reported number. I tried a few others, with equally 
unpredictable results.

<pre>
query        | reported      | displayed | repeated
-------------+---------------+-----------+--------------
the          | 3,190,000,000 |    946    | 3,190,000,000
Library      | 571,000,000   |    907    | 533,000,000
diagnostic   | 28,300,000    |    912    | 28,300,000
tweak        | 4,380,000     |    820    | 3,750,000
booster      | 5,550,000     |    896    | 5,540,000
definition   | 96,800,000    |    889    | 86,200,000
sed awk grep | 243,000       |    708    | 243,000

</pre>
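
To make the gap between the reported and the repeated counts easier to compare, here is a minimal Python sketch that works out the absolute and percentage change for each query. The counts are copied from the table above; the names ROWS, percent_change and summarize are hypothetical and chosen only for this illustration.

```python
# Counts copied from the table above: (query, reported, displayed, repeated).
ROWS = [
    ("the", 3_190_000_000, 946, 3_190_000_000),
    ("Library", 571_000_000, 907, 533_000_000),
    ("diagnostic", 28_300_000, 912, 28_300_000),
    ("tweak", 4_380_000, 820, 3_750_000),
    ("booster", 5_550_000, 896, 5_540_000),
    ("definition", 96_800_000, 889, 86_200_000),
    ("sed awk grep", 243_000, 708, 243_000),
]


def percent_change(reported, repeated):
    """Change from the reported count to the repeated count, in percent."""
    return (repeated - reported) / reported * 100.0


def summarize(rows):
    """Return (query, absolute change, percent change) for each row."""
    return [
        (query, repeated - reported, percent_change(reported, repeated))
        for query, reported, _displayed, repeated in rows
    ]


if __name__ == "__main__":
    for query, diff, pct in summarize(ROWS):
        print(f"{query:<14}{diff:>14,}{pct:>9.2f}%")
```

This only restates the seven searches listed above; it says nothing about why the counts shift.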

I find the unpredictable numbers in large sets less problematic than in 
small sets, where I really *do* look at all the results. And what gives 
with the ever-shrinking max for the displayed results? I hadn't noticed 
that before.

Patricia Anderson, pfa at umich.edu

On Tue, 5 Jul 2005, Sloan, Bernie wrote:

> Here's another thing with Google that I don't get...
>
> Over the weekend I used the Page-Specific Search on the Advanced Search
> page to find pages that link to the page:
> http://www.lis.uiuc.edu/~b-sloan/e-ref.html. I got five results.
>
> Then I did a Google search for the name of the document associated with
> that URL. I got quite a few results ("about 271"), but only went through
> the first 50. Twenty percent of those results (10 of 50) had live links
> imbedded in the corresponding documents matching the URL above.
>
> To the best of my knowledge, none of these ten documents were included
> in the five results from the "find pages that link to the page" search
> off of the advanced search page.
>
> I'm probably missing something obvious, but why would a "link to" search
> come up with five documents, while the second search came up with ten
> documents with the same URL as a live link, with little or no overlap
> between the two result sets?
>
> Bernie Sloan
> Senior Information Systems Consultant
> Consortium of Academic & Research Libraries in Illinois
> 616 E. Green Street, Suite 213
> Champaign, IL  61820-5752
>
> Phone: (217) 333-4895
> Fax:   (217) 265-0454
> E-mail: bernies at uillinois.edu
>
>
> _______________________________________________
> Web4lib mailing list
> Web4lib at webjunction.org
> http://lists.webjunction.org/web4lib/
>
>
>

