[Web4lib] Re: Rollyo
Heather Christenson
Heather.Christenson at ucop.edu
Tue Jul 25 15:00:35 EDT 2006
I think Rollyo is a great, simple web-based alternative to running your
own crawler. However, compared to what you might be able to do with
your own crawler, the results aren't great. Rollyo would be much more
powerful if you could limit or filter to specific directories within a
given URL. Also, a few idiosyncrasies arise because the
underlying crawler (Yahoo) is broad, aiming to reach as many sites
as possible rather than to get as deep into specific sites as
possible. The crawl may or may not have gotten all the way to every
page underneath your chosen URLs, so the corpus that is searched within
your roll may be a subset of what you intended. Also, I believe the
results ranking shows how your results rank in the broader context, not
just against the smaller group within your roll. These problems come
up with the Google API as well.
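
As a rough illustration of the directory filtering mentioned above, here
is a minimal sketch in Python that post-filters a list of result URLs so
only those under a chosen directory remain. It is not a Rollyo or Yahoo
feature: the function names, the host example.edu and the sample URLs are
all hypothetical, and it assumes you already have the result URLs in hand.

```python
from urllib.parse import urlsplit


def in_directory(url, host, path_prefix):
    """Return True if url is on host and its path lies under path_prefix."""
    parts = urlsplit(url)
    # Normalize both sides to end in "/" so "/guides" does not match "/guidesextra".
    prefix = path_prefix.rstrip("/") + "/"
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    return parts.hostname == host.lower() and path.startswith(prefix)


def filter_results(urls, host, path_prefix):
    """Keep only the result URLs that fall under the chosen directory."""
    return [u for u in urls if in_directory(u, host, path_prefix)]


# Hypothetical result URLs, for illustration only.
results = [
    "http://example.edu/guides/searching.html",
    "http://example.edu/guidesextra/other.html",
    "http://example.edu/news/2006/07/item.html",
    "http://other.example.org/guides/index.html",
]

kept = filter_results(results, "example.edu", "/guides")
print(kept)
```

Filtering after the fact like this only trims what comes back. It does
not change which pages were crawled or how the results were ranked.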
All that being said, Rollyo is so easy to use and share (given the
challenges of running your own crawl -- which I haven't even begun to
enumerate!) that living with the results isn't too hard.
--Heather
__________________________
Heather Christenson
California Digital Library
University of California
415 20th St, 4th Floor
Oakland, CA 94612-3550
phone: (510) 987-0525
fax: (510) 893-5212