[Web4lib] RE: Another Google question

Ryan Eby ryaneby at gmail.com
Fri Jul 15 13:24:20 EDT 2005

I might as well chime in to add more noise to the flame since I have a
few seconds; grammar and ideas may be half-assed. I've followed the
various threads on Google on here, and though I disagree with most of
what was said, I kept quiet as I'm not really concerned how it affects
the library or what people think of Google et al. I'll chime in now so
at least there is one counterpoint. I've used Google since around its
inception and have used it nearly exclusively (for web search) ever
since. In that time I don't think I've ever had a completely failed
search for any question or information I was looking for.

As others have pointed out in previous threads, the metadata out there
is terrible and will affect some of the features that are requested.
I'm personally surprised that they offer some of the options they do.
In the ideal semantic web, all you want and more would be possible,
but I just don't see it happening anytime soon. I've recently been
messing around with catalog searching and have found enough
limitations in the record metadata to make me pull my hair out. I can
see this being fixed, though, since we can at least do some quality
control. There is no such control for the open web. I do see Google
and others making strides toward fixing some of this with such things
as "Google Sitemaps", etc., though it will be a long road.

I can see the argument for data sorting and the like, but I just don't
see it as being a good business decision for Google unless they can
find a way to do it while keeping the relevance and result time.
Depending on how they have things set up (which we just don't know),
this could be a nearly impossible task.

I think the same applies to number of results. A good business
decision would be to spend the time and money on pushing better
results to the top, not giving more results. I personally never go
past two or three pages and I can see this being true for the majority
of searchers. If it's not on those, I refine my keywords, which usually
brings what I want to the top. I'm actually surprised that they return
as many as they do. This is not to say your use is not valid. It
definitely is, and you're not the only one that would like it. I'm sure
SEOs and spammers would love to see it too. Datamining is something
that search engines have to deal with, unfortunately, and until they
figure out a great way to do it, I can see it preventing many
features. I'm surprised they still have the "link:" option. I think
you can see how bad this has gotten when you look at the new Google
Firefox extensions, which include hashes that even say things like
"don't even think about datamining, spammer". And let's not forget that
almost all major search engines right now do the same thing with
result sets. Ironically, Gmail is showing ads for SEO and spammers for
this conversation.

Other services are now facing similar hardships and I can see
del.icio.us and Technorati and the like cutting back on some things in
order to fight tag spam and the like. I personally hate having to
register to comment but that is the choice some have made and I'm sure
it has impacted discussion.

All of this is not to say it may not happen one day. I think the
current competition between search engines may help induce people to
come up with creative solutions to these problems and the features you
want may become reality. Right now the problems are real.

I personally, and everyone I know, know that Google is not the
one-stop shop for all research (nor would I want it to be), though it does
a damn good job at some things. Google Scholar may never allow
chemists to draw chemical structures like some of the chem databases
do, even though so many chemists might like it to. The limitations are
real, but I think more people realize it than is thought. I've seen
quite a few search engines come out that fill the niches that
Google/Yahoo/etc. leave open with features like subject aggregation,
thumbnails, etc.
I've actually seen some of these in Google Labs over the years but as
they are now gone I presume a business decision was made that said
they weren't worth it for "most searches".

I'd really like to see this much energy and debate applied to the
OPAC. I think great things (and decisions) would happen.
