link checking and Google hunting
Jim Jacobs
ss3 at weber.ucsd.edu
Wed Jun 12 12:27:37 EDT 2002
I have several thousand links to check and am, naturally, finding a
high percentage of broken links. This leads me to two questions, one
old, one (I think) new:
1. Has anyone found the definitive, works-every-time, always-correct,
wouldn't-use-anything-else link-checking software? :-)
Features I'm interested in would include:
- inclusion of <title> of page found in report.
- accurate way of dealing with <meta> refresh tags.
- accurate way of dealing with load-balancers that redirect to
different, but correct, machine.
- easy, accurate way to re-check bad links to verify they are
really bad and not just unreachable at the moment of last
check.
- follows links on an existing web site and extracts URLs to check
from existing HTML files.
- runs on Unix, preferably.
I've used, at various times, MOMspider, webxref, and linklint. (A
rough sketch of the per-URL check I have in mind is below.)
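The sketch, in Python purely as an illustration (the check_url()
helper and the sample URLs are made up here, not taken from any of
the tools above): fetch the page, follow HTTP redirects to the final
address, pull out the <title> for the report, and classify network
failures as "retry" rather than dead so they can be re-checked later.

#!/usr/bin/env python
# Illustration only: not MOMspider/webxref/linklint, just the shape of the check.

import re
import urllib.error
import urllib.request

def check_url(url, timeout=15):
    """Fetch one URL, follow redirects, and report status, final URL,
    and page <title>; network failures are marked 'retry' so they can
    be re-checked later instead of being reported as dead."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            final_url = resp.geturl()          # where any redirects ended up
            body = resp.read(65536).decode("utf-8", errors="replace")
            m = re.search(r"<title[^>]*>(.*?)</title>", body,
                          re.IGNORECASE | re.DOTALL)
            title = m.group(1).strip() if m else "(no title)"
            return ("ok", resp.getcode(), final_url, title)
    except urllib.error.HTTPError as e:
        return ("bad", e.code, url, "")        # 404, 410, 500, ...
    except (urllib.error.URLError, OSError) as e:
        return ("retry", None, url, str(e))    # timeout, DNS failure, refused

if __name__ == "__main__":
    # Placeholder URLs; the real list would come from the site's HTML files.
    for url in ("http://www.google.com/", "http://example.org/no-such-page"):
        print(check_url(url))

The "retry" bucket would get re-run an hour or a day later before
anything is reported as a genuinely bad link.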
2. Has anyone experimented with using Google or the new Google API
to track down new URLs for bad links? Specifically, has anyone
integrated a Google search and result report into a link
checker? (A rough sketch of what I'm picturing is below.)
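The sketch, again in Python just to show the idea: google_search() is
a hypothetical stand-in for the real API call (doGoogleSearch in the
new Google Web APIs, as I understand it), and the query-building
heuristic is only a guess at what might work.

#!/usr/bin/env python
# Sketch of the Google fallback I'm picturing for the link-check report.

from urllib.parse import urlsplit

def google_search(query, max_results=5):
    # Hypothetical placeholder: call the real Google API here and
    # return a list of (url, title) pairs for the top hits.
    raise NotImplementedError("plug in the real Google API client")

def suggest_replacements(bad_url, old_title=None):
    """Build a query from the old page's <title> if the link checker
    recorded one, otherwise from the last segment of the dead URL's
    path, and return Google's top hits as candidate new locations."""
    if old_title:
        query = old_title
    else:
        last = urlsplit(bad_url).path.rsplit("/", 1)[-1]
        query = last.replace("-", " ").replace("_", " ").replace(".html", "")
    return google_search(query)

The link checker's report would then show, next to each dead link,
the handful of candidate URLs and titles that came back.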
I'll happily accept any advice, condolences, pointers to sources of
comparison studies, recommendations, etc.
---
Jim Jacobs, Data Services Librarian voice: (858) 534-1262
University of California, San Diego FAX: (858) 534-7548
9500 Gilman Drive Library 0175-R jajacobs at ucsd.edu
La Jolla, CA 92093-0175