Link Checkers - Thanks

Gary Shultz gshultz at mail.smu.edu
Fri Sep 27 19:02:31 EDT 1996


All,

A while back, I asked about link checkers and received an overwhelming
response.  Many people recommended Dr. HTML, which is free. Rather than
string all the responses together, I thought I would just give you the
highlights, so here they are -- and there are quite a few. I did not list
all the Dr. HTML notes, but thanks to everyone for responding.

=======

Go to
http://www.yahoo.com/Computers_and_Internet/Software/Data_Formats/HTML/Validation_Checkers/
for a list of validation tools.

One I have used before is Dr. HTML, which is free on the web. You just
type in the URL of your index page and it provides you with a table
indicating which links may have changed.

a. gagliardi
McMaster University
Hamilton, Ontario, Canada

==========================================

Mosaic 2.0 [Yes Mosaic!] included AutoSurf, which we use to check our
links.  It creates a nice report too.
--
Marvin E. Pollard, Jr.
College Center for Library Automation
Tallahassee, FL 32304  USA

==========================================

Doctor HTML works very well.  It is a website that checks URLs you enter.
No software required.  It is available at:
        http://www2.imagiware.com/RxHTML/

Dan Kissane
McNeese State University
Lake Charles, LA 70605

==========================================

A simple program that's free and can be used as an editor as well as
a link checker is GNNpress, at:
http://www.tools.gnn.com

There are also some Perl scripts out there that will do the trick.
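
The idea behind those scripts is simple enough -- roughly something like
the sketch below (an illustration only, not any particular one of those
scripts, and in Python rather than Perl; the file name is a placeholder):

    # Rough sketch of a link checker: pull the URLs out of a page and see
    # whether each one still answers.  Real scripts are more careful.
    import re
    import urllib.request

    with open("index.html") as f:
        urls = re.findall(r'href="(http[^"]+)"', f.read())

    for url in urls:
        try:
            req = urllib.request.Request(url, method="HEAD")
            status = urllib.request.urlopen(req, timeout=10).status
            print(status, url)
        except Exception as err:
            print("BAD", url, err)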

Or if you have the $$, there's also Adobe's SiteMill & Microsoft's
FrontPage.

GNNpress is quick and easy, and probably the best solution for just
checking links.

Mark Wilcox

==========================================

We use MOMspider.  See
http://www.ics.uci.edu/pub/websoft/MOMspider/

Steve Harding

=============================================
Try taking a look at http://www.imagiware.com/RxHTML/.

Dr. HTML and Site Doctor check to make sure links are active and also
check various aspects of the HTML code.

One problem, though: just because a link is tested and reported active
doesn't mean it is a "good" link. It could be pointing to a "This site
has moved. Please update your links now." kind of message.

Stan Furmanak
Lebanon Valley College
Annville, Pa 17003

================================================

The tools suggested so far have been one-page-at-a-time checkers, as far as
I know. There's a growing number of tools that have whole-site maintenance
in mind. The ones I've seen are Site Sweeper
(http://www.sitetech.com/menu/products.htm), Web Analyzer
(http://www.incontext.ca/products/analyze.html), and Web Publisher
(http://www.skisoft.com/). They are all commercial products with trial
versions or trial periods. The advantage is checking all your documents at
once (which takes a while if you have hundreds of files, as we do). They
are all in their infancy, so they're not as polished as I want things to
be, but they are worth keeping in mind and encouraging if you can.

Michael Haseltine -- haseltin at ag.arizona.edu
Office of Arid Lands Studies, Arid Lands Information Center

==================================================

Check out:
http://ukoln.bath.ac.uk/ariadne/issue3/autotrack/

Paul Hollands
Loughborough University UK 01509 222373

======================================================


We use a script called lvrfy (http://www.cs.dartmouth.edu/~crow/lvrfy.html)
- I don't think I've seen this one mentioned yet. We have embedded lvrfy in
another script which parses the lvrfy output (list of bad links) and sends
a mail message to the author of each offending page. So far it has worked
very well: at any rate the list of bad links gets shorter each week :-)
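
The wrapper amounts to something like the following (a loose sketch only,
not our actual script; the report format, the author lookup, and the mail
command are assumptions):

    # Loose sketch: read bad-link output ("page bad-url" per line, assumed
    # format), group the bad links by page owner, and mail each owner a list.
    import subprocess
    from collections import defaultdict

    def owner_of(page):
        # Placeholder: map a page path to its author's address however
        # suits your site.
        return "webmaster@example.ac.uk"

    bad_by_author = defaultdict(list)
    with open("badlinks.txt") as report:
        for line in report:
            page, bad_url = line.split()[:2]
            bad_by_author[owner_of(page)].append(page + " -> " + bad_url)

    for author, links in bad_by_author.items():
        body = "These links on your pages appear to be broken:\n" + "\n".join(links)
        subprocess.run(["mail", "-s", "Broken link report", author],
                       input=body, text=True)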

Frances Blomeley      Computing Centre
King's College London  UK

==============================================

We've used Webwatch (http://www.specter.com/ww.1.x/products.html) to help
maintain the Pier, the local University of Sussex gateway (3,100 links at
present; found at http://www.susx.ac.uk/library/pier/). As noted in an
earlier message, Webwatch is well reviewed in the Ariadne newsletter
(http://ukoln.bath.ac.uk/ariadne/issue3/autotrack/). It does seem to
identify 'document moved' messages, although I think the information has
to be in the header of the page concerned for Webwatch to pick it up.

The main problem with it (apart from it being limited to Windows) is that,
as far as I can work out, it requires a single HTML file to check. Hence we
have to concatenate the entire gateway into one file (using a DOS batch
file) before setting Webwatch off on its overnight run. This also means
that the output is one huge file, which can be rather unfriendly to check
through.
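
For what it's worth, the concatenation step amounts to no more than
something like this (our real version is a DOS batch file; the directory
and file names in this sketch are just placeholders):

    # Gather every HTML page under the gateway directory (assumed name)
    # into one file for Webwatch to read on its overnight run.
    from pathlib import Path

    combined = Path("gateway-all.html")
    with combined.open("w") as out:
        for page in sorted(Path("gateway").rglob("*.html")):
            out.write(page.read_text())
            out.write("\n")
    print("Wrote", combined)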

Neil Jacobs
University of Sussex Library

===============================================
There is a nifty program called InContext WEBanalyzer - it is available
for Windows 95, and it checks your links and provides you with many
different views of your Web site.  It is pretty easy to use.

Check it out at http://www.incontext.com

Elisa Miller
Institute for Scientific Information
Philadelphia, Pa 19104

=====================================================


Dear Friends,

I maintain several sites and have found WebWatch to be an excellent product.

And yes, it is intelligent enough to return the new URL if the old one has moved!

I have included the entire readme below:

regards,

Matthew

(No, I have no financial stake, blah, blah, standard disclaimer, blah, blah)

--
WebWatch is a tool for keeping track of changes in selected Web
documents. Given an HTML document referencing URLs on the Web,
WebWatch produces a filtered list, containing only those URLs
that have been modified since a given time.

The input file can be e.g.
- the bookmark file of Netscape; or
- the exported (to HTML format) hotlist of Mosaic; or
- any other, standard HTML document, edited locally or
retrieved from the Web.

The time used for filtering
- can be given as a global setting that applies to all URLs, or
- WebWatch can derive it automatically, using the time of your
last visit to the document, as recorded by your Web browser in
your local HTML (bookmark) file.

WebWatch generates a local HTML document that contains links to
only those documents which were updated after the given date.
You can load this document into any Web browser and use it to
navigate to the updated documents.

WebWatch stores its arguments in a parameter file. Once you
have customized the program to your needs, using its graphical
front-end, you can have it run in unattended mode, periodically.

WebWatch supports the use of proxy servers.

WebWatch is currently available for all (16- and 32-bit)
Windows platforms.

WebWatch is shareware. Single user registration after 30 days
evaluation is US $18. We offer free registration for every new
bug found.

For further information please visit
http://www.specter.com/users/janos/specter/
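
The core idea -- keeping only the URLs that changed after a given time --
can be sketched roughly as follows (an illustrative outline only, not
WebWatch's actual code; it assumes the servers return a Last-Modified
header, and the cut-off date and file names are placeholders):

    # Keep only the URLs whose Last-Modified time is later than a cut-off,
    # and write a small HTML page linking just those changed documents.
    import re
    import urllib.request
    from email.utils import parsedate_to_datetime
    from datetime import datetime, timezone

    since = datetime(1996, 9, 1, tzinfo=timezone.utc)   # placeholder cut-off

    with open("bookmarks.html") as f:                    # placeholder input
        urls = re.findall(r'href="(http[^"]+)"', f.read())

    changed = []
    for url in urls:
        try:
            req = urllib.request.Request(url, method="HEAD")
            modified = urllib.request.urlopen(req, timeout=10).headers.get("Last-Modified")
            if modified and parsedate_to_datetime(modified) > since:
                changed.append(url)
        except Exception:
            pass                                         # skip unreachable URLs

    with open("changed.html", "w") as out:
        out.write("<html><body>\n")
        for url in changed:
            out.write('<a href="%s">%s</a><br>\n' % (url, url))
        out.write("</body></html>\n")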

========================================================


Manually re-checking every link that has gone up or down in bytes
makes the task larger, not smaller.

I think Underpaid Clerical Worker is the answer & fits so nicely with
underpaid webwriter, underpaid web researcher, underpaid web master.
Look--we have a team!

JobSmart is a pretty big site (100 files), and our links are annotated to
note special features. We have what we call a "web technician" checking a
big hunk every week for ten hours. (In our case, a new library school
graduate working from home.) We're able to cover the whole site every
month or six weeks. Not only does she spot bad links, she can search for
the new link if necessary and check that the annotation is still correct.

Mary-Ellen Mort,
JobSmart Project Director

========================================================


One I've been using on the Macintosh is SiteCheck, from Pacific Coast
Software <http://www.pacific-coast.com>.

- Kevin Broun
  Apple Library

========================================================

I wouldn't call it "automatic", but I really like the
InContext Web Analyzer product -- it maps your site and shows
bad links, and can be configured to give you varying views of
your pages -- http://www.incontext.ca/demo/analyzer.html

Some other suggestions are
 Surfbot (formerly WebWatch) --
http://www.specter.com/products.html  (limited shareware
version)

NetCarta's Web Mapper http://www.netcarta.com

Smart BookMarks (First Floor Software) http://software.net

Janice Painter
Ocean City Free Public Library
Ocean City, NJ  08226

=====================================================
Regarding link checking and URL verification: we use MOMspider at our
library, and it works great. We tried standalone client products, but this
server solution was better because it automatically generated reports and
mailed them to the appropriate authors. Although we are running a UNIX
machine, it looks like the Perl code could be ported to Mac/Windows
machines quite easily.

MOMspider is also not that resource-intensive on the server when it runs.
I have written shell scripts attached to cron jobs that generate reports
for departments monthly. So far the initial results are very positive in
terms of having people update their pages as a result of the generated
reports.
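
For anyone curious, the wrapper is no more than something like the
following sketch (ours is really a shell script; the momspider flags,
paths, and addresses here are assumptions rather than our exact setup):

    # Run MOMspider against each department's instruction file, then mail
    # the resulting HTML report to that department's web contact.
    import subprocess

    departments = {                        # hypothetical contacts
        "reference": "ref-web@example.edu",
        "multimedia": "mmc-web@example.edu",
    }

    for dept, contact in departments.items():
        subprocess.run(["momspider", "-i",
                        "/usr/local/momspider/%s.instruct" % dept], check=True)
        with open("/www/libstaff/reports/%s.html" % dept) as report:
            subprocess.run(["mail", "-s", "MOMspider report: " + dept, contact],
                           stdin=report, check=True)

    # Scheduled monthly from cron, e.g.:  0 2 1 * *  /usr/local/bin/momspider-monthly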

You can view our MOMspider report pages at:
        <URL:http://www.library.nwu.edu/libstaff/reports>


Stu Baker
Marjorie I. Mitchell Multimedia Center
Northwestern University Library
Evanston, IL 60208-2323

====================================================
Netscape 2.0 and above has the ability to check links in the bookmark
file.  If you choose Go to Bookmarks under the Bookmarks menu, then
select What's New?, Netscape will run through the bookmarked links
(either all of them, or just the ones you have selected) and mark those
it cannot verify with a ? and those that have changed with a
glowing-looking bookmark.  Bookmarks that remain unchanged since your
last visit just have the standard bookmark icon.

I monitor the activity of about 50 sites this way, but imagine that using
this technique with a large number of sites could be somewhat cumbersome.

Melissa L. Just, MLIS
Norris Medical Library
University of Southern California


====================================================
This was just announced on FaulknerWeb (http://www.faulkner.com/):

--begin quote---

Take These Broken Links...

SiteSweeper Beta 2 from Site Technology (http://www.sitetech.com) is
the latest tool for harried webmasters trying to find ways to automate
some of their responsibilities and make life a little easier. SiteSweeper
hunts down and identifies broken links using agent technology and
provides a detailed report in HTML on such specifics as download time and
page size; it also supplies the webmaster with the cause of the
problem and possible remedies for fixing it. SiteSweeper runs on
Microsoft's Windows 95 or Windows NT 4.0 and is available for
download (http://www.sitetech.com/sitesweeper/sweepdl.htm) from the
company's Web site.

--end of quote---
--
Alan Withoff
Mississippi Library Commission
Jackson, MS 39289-0700



====================================================

Thanks again for all your suggestions,
=Gary



      ~(~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~)~
    ~~~) Gary Shultz - SMU News and Information (~~~
  ~~~~~( Tele. 214-768-7665   Fax: 214-768-7663 )~~~~~
~~~~~~~) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~(~~~~~~~



