[WEB4LIB] Re: link checking and google hunting
Michael McCulley
drweb at earthlink.net
Wed Jun 12 23:57:49 EDT 2002
I'll second the vote for Xenu: it's shareware, and it does what it says. Pretty powerful for link checks.
Best,
Michael
P. Michael McCulley
mailto:drweb at earthlink.net
San Diego, CA
>-----Original Message-----
>From: web4lib at webjunction.org
>[mailto:web4lib at webjunction.org]On Behalf Of Raymond Wood
>Sent: Wednesday, June 12, 2002 11:33 AM
>To: Multiple recipients of list
>Subject: [WEB4LIB] Re: link checking and google hunting
>
>
>On Wed, Jun 12, 2002 at 10:23:18AM -0700, Jim Jacobs remarked:
>> I have several thousand links to check and am, naturally, finding a
>> high percentage of broken links. This leads me to two questions, one
>> old, one (I think) new:
>>
>> 1. Has anyone found the definitive, works-every-time, always-
>> correct, wouldn't-use-anything-else link checking software? :-)
>> Features I'm interested in would include:
>> - inclusion of <title> of page found in report.
>> - accurate way of dealing with <refresh> tags.
>> - accurate way of dealing with load-balancers that redirect to
>> different, but correct, machine.
>> - easy, accurate way to re-check bad links to verify they are
>> really bad and not just unreachable at the moment of last
>> check.
>> - follows links on existing web site and extracts urls to check
>> from existing html files.
>> - runs on Unix, preferably.
>>
>> I've used, at various times, MOMspider, webxref, and linklint.
>>
>> 2. Has anyone experimented with using google or the new google API
>> to track down new URLs for bad links? Specifically, has anyone
>> integrated a google-search and result-report into a link
>> checker?
>>
>> I'll happily accept any advice, condolences, pointers to sources of
>> comparison studies, recommendations, etc.
>
>re: #1:
>
>These are worth a look - YMMV:
>
>For *nix:
> linbot
> linkchecker
>
>For *doze:
> xenu link sleuth <-- this one puts a goofy banner ad in the
> HTML report, but otherwise works OK.
>
>HTH,
>Raymond
>
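For what it's worth, a few of the features on Jim's list (page <title> in the report, spotting meta refresh tags, and re-checking a link before calling it bad) can be roughed out in Python with only the standard library. This is a minimal sketch under my own assumptions, not how Xenu, linbot, or linkchecker actually work; the names PageInfo, fetch, and check_link are made up for illustration, and the retry count is arbitrary.

```python
import re
import time
import urllib.request
from html.parser import HTMLParser


class PageInfo(HTMLParser):
    """Collects the <title> text and any <meta http-equiv="refresh"> target."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.refresh_url = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # content typically looks like "5; url=http://example.org/new"
            content = attrs.get("content") or ""
            match = re.search(r"url\s*=\s*['\"]?([^'\";]+)", content, re.I)
            if match:
                self.refresh_url = match.group(1).strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def fetch(url, timeout=10):
    """Default fetcher (hypothetical helper): returns (final_url, html).

    urllib follows HTTP redirects, so final_url may differ from url.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.geturl(), resp.read().decode("utf-8", errors="replace")


def check_link(url, fetcher=fetch, retries=2, delay=0.0):
    """Try a URL up to retries + 1 times before reporting it as bad."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            final_url, html = fetcher(url)
        except OSError as exc:  # URLError and socket timeouts subclass OSError
            last_error = str(exc)
            time.sleep(delay)
            continue
        info = PageInfo()
        info.feed(html)
        return {
            "url": url,
            "ok": True,
            "final_url": final_url,
            "title": info.title.strip(),
            "refresh": info.refresh_url,
            "attempts": attempt + 1,
        }
    return {"url": url, "ok": False, "error": last_error, "attempts": retries + 1}
```

Calling check_link("http://example.org/") returns a small dict per URL that can be written straight into a report. Passing the fetcher in as an argument keeps the retry logic testable without touching the network; a real run would want a non-zero delay between attempts so a briefly unreachable host gets a fair second chance.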