[WEB4LIB] Re: link checking and google hunting

Michael McCulley drweb at earthlink.net
Wed Jun 12 23:57:49 EDT 2002

I'll second the vote for Xenu: shareware, and it does what it says. Pretty powerful for link checks.


P. Michael McCulley
mailto:drweb at earthlink.net
San Diego, CA

>-----Original Message-----
>From: web4lib at webjunction.org
>[mailto:web4lib at webjunction.org]On Behalf Of Raymond Wood
>Sent: Wednesday, June 12, 2002 11:33 AM
>To: Multiple recipients of list
>Subject: [WEB4LIB] Re: link checking and google hunting
>On Wed, Jun 12, 2002 at 10:23:18AM -0700, Jim Jacobs remarked:
>> I have several thousand links to check and am, naturally, finding a
>> high percentage of broken links.  This leads me to two questions, one
>> old, one (I think) new:
>>     1. Has anyone found the definitive, works-every-time,
>>       always-correct, wouldn't-use-anything-else link checking
>>       software? :-)
>>       Features I'm interested in would include:
>>         - inclusion of <title> of page found in report.
>>         - accurate way of dealing with <refresh> tags.
>>         - accurate way of dealing with load-balancers that redirect to
>>           different, but correct, machine.
>>         - easy, accurate way to re-check bad links to verify they are
>>           really bad and not just unreachable at the moment of last
>>           check.
>>         - follows links on existing web site and extracts urls to check 
>>           from existing html files.
>>         - runs on Unix, preferably.
>>        I've used, at various times, MOMspider, webxref, and linklint.
>>     2. Has anyone experimented with using google or the new google API
>>       to track down new URLs for bad links?  Specifically, has anyone
>>       integrated a google-search and result-report into a link
>>       checker?
>> I'll happily accept any advice, condolences, pointers to sources of
>> comparison studies, recommendations, etc.
>re: #1:
>These are worth a look - YMMV:
>For *nix:
>  linbot
>  linkchecker
>For *doze:
>  xenu link sleuth  <-- this one puts a goofy banner ad in the
>    HTML report, but otherwise works OK.
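
For anyone who wants to see what a few items on the quoted wish list involve (the page <title> in the report, meta refresh handling, pulling URLs out of existing HTML files, and re-checking bad links before declaring them dead), here is a rough Python sketch. It is not how Xenu, linbot, or linkchecker work internally. The names PageScanner, scan, recheck, and probe are made up for illustration, and the sketch does no network I/O itself: probe is assumed to be a caller-supplied function that returns True when a URL answers.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import time


class PageScanner(HTMLParser):
    """Hypothetical helper: collects the <title>, <a href> targets,
    and the target of a <meta http-equiv="refresh"> tag, if any."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.links = []
        self.refresh_target = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links against the page's own URL.
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # content usually looks like "5; url=http://example.org/new"
            content = attrs.get("content") or ""
            _, _, target = content.partition(";")
            key, _, value = target.strip().partition("=")
            if key.strip().lower() == "url" and value:
                self.refresh_target = urljoin(self.base_url, value.strip(" '\""))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def scan(html, base_url):
    """Parse one HTML document and return the populated scanner."""
    scanner = PageScanner(base_url)
    scanner.feed(html)
    scanner.close()
    return scanner


def recheck(urls, probe, attempts=3, delay=0.0):
    """Return only the URLs that fail on every attempt.

    probe(url) -> bool is an assumption: the caller supplies it,
    for example as a wrapper around urllib.request.urlopen.
    """
    still_bad = []
    for url in urls:
        for attempt in range(attempts):
            if probe(url):
                break
            if attempt < attempts - 1:
                time.sleep(delay)
        else:
            # No attempt succeeded, so treat the link as really bad.
            still_bad.append(url)
    return still_bad
```

In practice you would run scan over each local HTML file, feed the collected links to a probe of your choice, and pass the failures through recheck with a delay of minutes or hours so that hosts that were merely unreachable get another chance. Load-balancer redirects are not covered here.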
