[Web4lib] Which databases can Google Scholar crawl?

Kathryn Silberger Kathryn.Silberger at marist.edu
Thu Feb 21 08:53:43 EST 2008


Roy has asked an interesting question about how completely and frequently
Google crawls its targets.  Here are a few factoids of interest:

Today a GS search for Jstor returns about 1,810,000 items; limited to
"since 2004" - 2,950 items; "since 2003" - 10,600 items; "since 2002" -
18,700 items.  (I began with 2004 because of The Wall.)  Also today, at
http://www.jstor.org.online.library.marist.edu/about/facts.html , Jstor says
it has 1,850,206 articles online.

I don't know how many duplicate entries there are in GS for Jstor, but I
bet there are some.  Nonetheless, it looks like GS indexes something in the
neighborhood of 90 - 95% of Jstor.
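
For anyone who wants to redo the arithmetic with their own counts, here is a
minimal Python sketch.  The two counts are the ones quoted above; the 5%
duplicate rate is purely a hypothetical placeholder of mine, not something I
have measured.

```python
def coverage_percent(indexed: int, total: int) -> float:
    """Return indexed/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * indexed / total


gs_hits = 1_810_000      # approximate GS result count for Jstor (quoted above)
jstor_total = 1_850_206  # article count from Jstor's facts page (quoted above)

# Raw ratio: an upper bound, since GS hits may include duplicate entries.
raw = coverage_percent(gs_hits, jstor_total)

# HYPOTHETICAL: assume 5% of the GS hits are duplicates (not a measured figure).
assumed_duplicate_rate = 0.05
deduplicated_hits = round(gs_hits * (1 - assumed_duplicate_rate))
adjusted = coverage_percent(deduplicated_hits, jstor_total)

print(f"Raw coverage (upper bound): {raw:.1f}%")
print(f"Coverage assuming {assumed_duplicate_rate:.0%} duplicates: {adjusted:.1f}%")
```

The raw ratio comes out a little under 98%, so any modest share of duplicates
pulls the estimate down into the 90 - 95% range.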

Today a GS search retrieves 912,000 items for Blackwell Synergy.  On
their website they say they have "over a million articles online".  Limiting
that GS search to "since 2008" retrieves 1,950 articles; "since 2007" -
27,600; and "since 2006" - 42,700.

When I do a "DM Silberger" search (my husband), the Google count is 326.
My husband has only written about 3 or 4 dozen scholarly articles.  The
rest of what is coming up seems to be items citing his papers.  Does
anyone know if the "inurl:" search in GS is strictly limited to the URL
of the main article, or will it pull up any URL listed in the full text of
the entry?


Katy

Kathryn K. Silberger
Automation Resources Librarian
James A. Cannavino Library
Marist College
3399 North Road
Poughkeepsie, NY  12601
Kathryn.Silberger at marist.edu
(845) 575-3000 x.2419


