Broken Links

by Roedy Green ©1996-2009 Canadian Mind Products
Introduction
Why use Xenu?
How to Use Xenu
Configuring Brokenlinks
Running Brokenlinks
Presumed Good File
Sample Text Report
Sample HTML Export
Repairing Broken Links
Futures
Links

Introduction

Brokenlinks is a tool to help you find and track broken links on your website, namely URLs that no longer point to anything useful. It is a back end to the Xenu broken link detector that compensates for Xenu’s weakness of overwhelming you with reports of links that are not really broken. Brokenlinks whittles Xenu’s giant list of broken links down to the ones you should look at first. This saves you immense amounts of time researching links that are not really broken.

Why use Xenu?

Finding the broken links is only 10% of the work. Fixing them is the labour-intensive part. If you let your website deteriorate with broken links, visitors become frustrated and stop visiting. Having clean links encourages Google to take your site more seriously.

How to Use Xenu

Download and install a free copy of Xenu Link Sleuth.

First you spider your local copy of your website with Xenu. Read the Xenu documentation on how to do that. You first have to be sure Xenu is working properly before Brokenlinks will work. Use Xenu directly to find orphans.

Once you are pretty sure you have Xenu configured correctly, run it on your local website, with external link checking turned on.

Be careful to verify that the check external links option is on at the very last moment before you start the spidering. Xenu mischievously likes to change the flag on you unexpectedly.

When it has finished spidering your website and checking all the links, click Export Page Map to TAB-separated File. (Don’t confuse this with Export to TAB-separated File.) You may optionally get Xenu to also produce an HTML report.

Configuring Brokenlinks

Download and install a free copy of Brokenlinks.

The first time you use Brokenlinks you must configure it by creating a text file with a text editor. It will look something like this:

Configure it according to the embedded comments. Then save the file, giving it a name of the form xxxx.properties.

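The properties are all pretty straightforward except for brokenForgivenessDays, which controls how many days a link must stay broken before Brokenlinks reports it. The default is brokenForgivenessDays=7.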

  1. If you have only a handful of broken links, and you religiously run Xenu/Brokenlinks every day, you might set brokenForgivenessDays=2, though I still set it to 6. One advantage of running every day is you stay on top of researching and repairing broken links. You are never faced with large numbers of them to fix all at once.
  2. If you have only a handful of broken links, and you religiously run Xenu/Brokenlinks twice a week, use brokenForgivenessDays=5
  3. If you don’t want to think about brokenForgivenessDays, leave this property out, and accept the default: brokenForgivenessDays=7
  4. If you have only a handful of broken links, and you religiously run Xenu/Brokenlinks every week, use brokenForgivenessDays=8
  5. If you have hundreds of broken links, and you run Xenu/Brokenlinks only every once in a while, use brokenForgivenessDays=14
  6. You can experiment with setting it to various values. The smaller the brokenForgivenessDays number, the sooner broken links will be revealed to you and the more of them you will see. However, you will be pestered with more temporarily broken links. If you are feeling overwhelmed by broken links, increase the value to show you only the deadest links. The minimum value that makes much sense is 1. Xenu itself effectively uses 0.
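
To make the effect of brokenForgivenessDays concrete, here is a minimal Java sketch. It is not the Brokenlinks source. The class name ForgivenessSketch, the method names, and the exact rule (report a link once it has been broken for at least that many days) are assumptions for illustration. Only the property name and its default of 7 come from the text above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Properties;

/** Hypothetical illustration only; not part of Brokenlinks. */
class ForgivenessSketch {
    /** Default described above, used when the property is left out. */
    static final int DEFAULT_FORGIVENESS_DAYS = 7;

    /** Reads brokenForgivenessDays from properties-format text, falling back to the default. */
    static int forgivenessDays(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = props.getProperty("brokenForgivenessDays",
                String.valueOf(DEFAULT_FORGIVENESS_DAYS));
        return Integer.parseInt(raw.trim());
    }

    /** Assumed rule: report a link once it has been broken for at least forgivenessDays days. */
    static boolean shouldReport(LocalDate firstSeenBroken, LocalDate today, int forgivenessDays) {
        return ChronoUnit.DAYS.between(firstSeenBroken, today) >= forgivenessDays;
    }

    public static void main(String[] args) {
        int days = forgivenessDays("brokenForgivenessDays=6\n");
        LocalDate today = LocalDate.of(2008, 11, 4);
        // Broken for 6 days: old enough to report.
        System.out.println(shouldReport(LocalDate.of(2008, 10, 29), today, days));
        // Broken for 2 days: still forgiven.
        System.out.println(shouldReport(LocalDate.of(2008, 11, 2), today, days));
    }
}
```

Under this assumed rule, a value of 0 reports every broken link immediately, which matches the remark that Xenu itself effectively uses 0.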

Running Brokenlinks

Now run Brokenlinks like this:
java.exe -jar brokenlinks.jar xxxx.properties
If you have Jet, you can simplify that to:
brokenlinks.exe xxxx.properties

You will get a report of the critical broken links to research, in both text and HTML form. Embed the HTML in a web page somewhere. Here is my list of broken links for mindprod.com. The layout is designed to make it easy to research the problems. You can click to get the page where the broken link is, or click to where it was trying to go.

Then research the broken links and fix them. Then run Xenu again, click Export Page Map to TAB-separated File and run Brokenlinks. Run this cycle at different times of the day, since some websites shut down for part of the day for maintenance. You want to catch them when they are up. Run the cycle after repairing a batch of links to see how you did. After you get the list whittled down to none, run the cycle weekly, twice weekly or daily to stay on top of the broken links. I find running it daily works best since you never get overwhelmed with work, and thus are not tempted to postpone it.

If you erase the history.bin file, it will automatically start over from scratch collecting history.

Presumed Good File

If you find a link that Xenu/Brokenlinks thinks is broken, but which is actually OK, or where the breakage doesn’t matter for some reason, add it to your list of presumed good links. The presumedgood csv file will look something like this:
Thereafter that presumed good link will be excluded from the broken links list.
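
The exclusion step can be pictured with a short Java sketch. This is an illustration under assumptions, not the Brokenlinks code: the class PresumedGoodSketch and its methods are hypothetical, and it assumes the URL is the first, unquoted field on each line of the csv file. Check the layout of your real presumedgood file before relying on anything like it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not part of Brokenlinks. */
class PresumedGoodSketch {
    /** Assumed layout: the URL is the first comma-separated field of each non-blank line. */
    static Set<String> parsePresumedGood(List<String> csvLines) {
        Set<String> good = new HashSet<>();
        for (String line : csvLines) {
            String url = line.split(",", 2)[0].trim();
            if (!url.isEmpty()) {
                good.add(url);
            }
        }
        return good;
    }

    /** Returns the broken links that are not on the presumed good list, in their original order. */
    static List<String> excludePresumedGood(List<String> brokenUrls, Set<String> presumedGood) {
        List<String> stillBroken = new ArrayList<>();
        for (String url : brokenUrls) {
            if (!presumedGood.contains(url)) {
                stillBroken.add(url);
            }
        }
        return stillBroken;
    }

    public static void main(String[] args) {
        Set<String> good = parsePresumedGood(List.of("http://localhost/"));
        List<String> broken = List.of(
                "http://localhost/",
                "http://www.sepomex.gob.mx/Paginas/default.aspx");
        // Only the link that is not presumed good remains.
        System.out.println(excludePresumedGood(broken, good));
    }
}
```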

Sample Text Report

Here is roughly what the text report that Brokenlinks produces will look like:

Sample HTML Export

Here is roughly what the combined broken links and presumed good HTML report that Brokenlinks produces will look like:

Broken Links Sorted by Error Code

There are 2 links that have been broken for at least 6 days yet to be fixed. Last revised: 2008-11-04
Broken Links by Status Code
Status Code | Links To | Linked From
Internal Server Error | http://www.fanoos.com/photos.asp?album=photo_lebanon_israel_2006_war | http://www.lebanonlinks.com/photos.asp?album=photo_lebanon_israel_2006_war
no connect | http://www.sepomex.gob.mx/Paginas/default.aspx | /jgloss/postoffice.html

Links Presumed Good

Xenu claims the following links are broken, but they have been manually found to be good. They should be manually rechecked from time to time. The problem may be an unknown SSL certificate authority which needs to be OKed manually, (a missing/unknown/uninstalled certificate root authority) or it may be the website sends the data, but with not-found status.

There are 9 links marked as presumed good despite what Xenu says. Last revised: 2008-11-04

Links Presumed Good
Link To
http://localhost/
http://www.asis.gov.au/Trace_in_progress/return_fake_error
http://www.microsoft.com/library/errorpages/smarterror.aspx?aspxerrorpath=/windows/windowsmedia/download/AllDownloads.aspx
http://www.rocketdownload.com/
http://www.thefreedictionary.com/403.htm
http://www.theserverside.com/tt/books/wiley/masteringEJB/
https://virtualschool.edu/jwaa/
https://www.eecs.harvard.edu/mailman/listinfo/jopt-users
https://www2.digicert.com.my/carootcert.html


Repairing Broken Links

Here are some tips to help you find a replacement link for a broken one.

Futures

Here are various ways I hope eventually to improve Brokenlinks:
  1. Vastly improve the speed of rechecking links by checking 30 of them simultaneously the way Xenu does.
  2. Convert to Java Web Start. This will make the program easier to use by novices since it will not require configuration. The Configuration properties file will be replaced by a GUI. The user will not have to manually allocate a directory for the history file.
  3. Remove the dependence on Xenu. Handle everything it does in Brokenlinks.
  4. Avoid checking links that recently checked OK to vastly speed up link checking. You could then afford to do it daily or even before every upload. Xenu rechecks everything from scratch every time you run it.
  5. Check Applet links. Xenu thinks all Applet links are broken.
  6. Check style sheet links. Xenu ignores them.
  7. Tools to insert warning styles on broken links so they will have an icon next to them warning your visitors of the problem and letting them know you are aware of it.
  8. Tools to help automate repair of broken links.
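
As a rough idea of what the first item might involve, here is a Java sketch of checking links 30 at a time with a fixed thread pool. It is a guess at one possible approach, not a description of how Brokenlinks or Xenu work. The class ParallelCheckSketch, its methods, the 10 second timeouts and the "status below 400 means alive" rule are all assumptions.

```java
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

/** Hypothetical illustration only; not part of Brokenlinks. */
class ParallelCheckSketch {
    /** Number of links checked at once, matching the figure in the list above. */
    static final int THREADS = 30;

    /** Runs the given check on every URL, up to THREADS at a time, keeping input order. */
    static Map<String, Boolean> checkAll(List<String> urls, Predicate<String> isAlive) {
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        try {
            Map<String, Future<Boolean>> pending = new LinkedHashMap<>();
            for (String url : urls) {
                Callable<Boolean> task = () -> isAlive.test(url);
                pending.put(url, pool.submit(task));
            }
            Map<String, Boolean> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Boolean>> entry : pending.entrySet()) {
                try {
                    results.put(entry.getKey(), entry.getValue().get());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    results.put(entry.getKey(), false);
                } catch (ExecutionException e) {
                    // The check itself threw; treat the link as broken.
                    results.put(entry.getKey(), false);
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    /** Assumed liveness rule: a HEAD request that answers with a status below 400. */
    static boolean headCheck(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(10_000);
            conn.setReadTimeout(10_000);
            int code = conn.getResponseCode();
            conn.disconnect();
            return code < 400;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A stub check keeps this demo off the network; pass ParallelCheckSketch::headCheck for real use.
        System.out.println(checkAll(
                List.of("http://a.example/", "http://b.example/dead"),
                url -> !url.endsWith("dead")));
    }
}
```

Passing the check in as a Predicate keeps the pooling logic separate from the network code, so the same loop could later skip links that recently checked OK.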

Links

Google sitemap
HTML Broken link fixer student project
Xenu

You can get a fresh copy of this page from: http://mindprod.com/application/brokenlinksmanual.html