link checking and google hunting

Jim Jacobs ss3 at weber.ucsd.edu
Wed Jun 12 12:27:37 EDT 2002


I have several thousand links to check and am, naturally, finding a
high percentage of broken links.  This leads me to two questions, one
old, one (I think) new:

    1. Has anyone found the definitive, works-every-time, always-
      correct, wouldn't-use-anything-else link checking software? :-)
      Features I'm interested in include:
        - inclusion of the <title> of the page found in the report.
        - accurate handling of <meta> refresh redirects.
        - accurate handling of load balancers that redirect to a
          different, but correct, machine.
        - an easy, accurate way to re-check bad links and verify that
          they are really bad and not just unreachable at the moment
          of the last check (a minimal sketch of what I mean follows
          this list).
        - follows links on an existing web site and extracts URLs to
          check from the existing HTML files.
        - runs on Unix, preferably.

       I've used, at various times, MOMspider, webxref, and linklint.

    2. Has anyone experimented with using Google or the new Google API
      to track down replacement URLs for bad links?  Specifically, has
      anyone integrated a Google search and result report into a link
      checker?  (A rough sketch of what I have in mind appears after
      my wrap-up below.)
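
To make the re-checking idea in item 1 concrete, here is a minimal
sketch in Python of the kind of second-pass verification I mean: read
a list of suspect URLs, retry them a few times with a pause between
passes, and only report the ones that fail on every pass.  The file
name (bad_links.txt), the retry count, and the delay are placeholder
choices of mine, not features of any existing checker.

#!/usr/bin/env python3
# Re-check suspected-bad URLs to separate permanent failures from
# transient ones.  "bad_links.txt", RETRIES, and DELAY are placeholder
# values, not part of any particular tool.

import time
import urllib.error
import urllib.request

RETRIES = 3    # passes before declaring a link truly dead
DELAY = 60     # seconds between passes, to ride out brief outages


def check(url):
    """Return (status, final_url); status is None if unreachable."""
    # Note: a few servers mishandle HEAD; a fallback GET is omitted
    # here for brevity.
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            # geturl() reflects any redirects followed, so a load
            # balancer that bounces us to a different but valid
            # machine still counts as a success.
            return resp.status, resp.geturl()
    except urllib.error.HTTPError as e:
        return e.code, url
    except (urllib.error.URLError, OSError):
        return None, url   # unreachable, not necessarily dead


def main():
    with open("bad_links.txt") as fh:
        suspects = [line.strip() for line in fh if line.strip()]

    for attempt in range(RETRIES):
        still_bad = []
        for url in suspects:
            status, final = check(url)
            if status is not None and status < 400:
                print("OK   %s -> %s" % (url, final))
            else:
                still_bad.append(url)
        suspects = still_bad
        if not suspects:
            break
        if attempt < RETRIES - 1:
            time.sleep(DELAY)

    for url in suspects:
        print("BAD  %s" % url)   # failed on every pass


if __name__ == "__main__":
    main()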

I'll happily accept any advice, condolences, pointers to sources of
comparison studies, recommendations, etc.
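
On question 2, this is roughly the Google-assisted repair I am
picturing: for each dead link, build a query from the page's
last-known <title> (or, failing that, from words mined out of the URL
path) and report the top hits as candidate replacement URLs.  The
search() function below is only a placeholder for whatever search
backend is actually available (the Google API or otherwise); its
signature is my own invention, not a real interface, and the example
URL and title at the bottom are made up.

#!/usr/bin/env python3
# Sketch of Google-assisted repair of dead links.  search() is a stub
# to be wired to a real search backend; everything else is illustrative.

import re
from urllib.parse import urlsplit


def search(query):
    """Placeholder: return a list of (url, title) pairs for the query."""
    raise NotImplementedError("plug in your search backend here")


def query_for(url, last_known_title=None):
    """Prefer the page's last-known <title>; otherwise mine words from
    the URL path, which often contain the document's name."""
    if last_known_title:
        return last_known_title
    path = re.sub(r"\.\w+$", "", urlsplit(url).path)  # drop extension
    words = [w for w in re.split(r"[\W_]+", path) if len(w) > 2]
    return " ".join(words)


def suggest_replacements(dead_links):
    """dead_links: iterable of (url, last_known_title_or_None) pairs."""
    for url, title in dead_links:
        query = query_for(url, title)
        print("\n%s\n  query: %s" % (url, query))
        try:
            for hit_url, hit_title in search(query):
                print("  candidate: %s  (%s)" % (hit_url, hit_title))
        except NotImplementedError:
            print("  (no search backend configured)")


if __name__ == "__main__":
    # Made-up example input: a dead URL and its last-known title.
    suggest_replacements([
        ("http://example.org/reports/census-2000-summary.html",
         "Census 2000 Summary File"),
    ])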

---
Jim Jacobs, Data Services Librarian             voice: (858) 534-1262
University of California, San Diego               FAX: (858) 534-7548
9500 Gilman Drive   Library 0175-R                  jajacobs at ucsd.edu
La Jolla, CA 92093-0175                     



