Follow Links in the Underground
by altosax
published at fravia's searchlores in
September 2002
Slightly edited by fravia+
"Real
searchers must be able to find what they are looking for
in the most effective way and when the site is realized to fool the users,
they should never have to click onto
a link just to discover it is not what they thinked it was."
Let's not forget, also, that seekers travelling
with a good hosts filter and all
their holy shields up
-- junkbuster + proxomitron chained together --
see Iefaf's, Bone Digger's and NME's chaining instructions
elsewhere --
and moreover without any active Java or JavaScript inside their browser won't need
this kind of lore that badly.
Still, what is explained here could come in quite handy for every
seeker... Thanks Altosax! Let's hope for more feedback on this from
whoever researches further into it.
Follow Links in the Underground
by altosax
Many Searchlores users have probably read Rumsteack's essay about the use
of Getright as a bot to explore a site's structure. This is useful mainly on
underground sites, to avoid popups, redirections and/or tracking links,
links pointing to a false location, and so on.
A different way to do the same thing, but with much more information
about the site structure, is to use the freeware Xenu's Link Sleuth at
http://home.snafu.de/tilman/xenulink.html.
I know there are a lot of
commercial programs doing this, and as I said, also Getright, but I've
learned to respect the programmers' work, so when a freeware program exists
that does what I need, I never consider using anything else.
Real searchers must be able to find what they are looking for
in the most effective way, and when a site is designed to fool its users,
they should never have to click on
a link just to discover it is not what they thought it was.
This means that a searcher must know, or must find, the tools to do
his job the way he wants and not the way the webcoders want.
Xenu's Link Sleuth was written by Tilman Hausherr to help webmasters
maintain their sites. It checks a site for dead links, external URLs,
redirected URLs and more. But the most useful feature for searchers is
the check for valid URLs, which can be used to find the right path to what we
are searching for among all the other tiresome links.
You simply need to give Xenu a working web address and it will scan the
whole site, checking the link types you have set in the configuration
options window: start Xenu, type the URL to scan, then click on "More
options". Here you can select whichever of the options Xenu provides you prefer.
First you can set the number of threads Xenu will run, up to 100
parallel threads. This depends on the speed of your connection and the
bandwidth you can use. I've found that 30 simultaneous threads are optimal,
but this is not true for every site. Some sites, when receiving so many
connections from a single machine, may take it for a DoS attempt and block
you. If this happens, reduce that number and retry.
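For the sake of illustration, here is a minimal Python sketch of the same
general idea (it is not Xenu's own code, and Xenu certainly does more): fetch a
start page, collect its links and check each one with a configurable number of
parallel threads. The start URL, the timeout and the thread count below are
placeholders you would adapt yourself.

# Hedged sketch: fetch one page, extract its links, check them in parallel.
import concurrent.futures
import urllib.parse
import urllib.request
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        # Collect href targets from every <a> tag on the page.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def check(url):
    # Try to open the URL and report its HTTP status, or the error text.
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return url, resp.status
    except Exception as exc:
        return url, str(exc)

def scan(start_url, threads=30):
    with urllib.request.urlopen(start_url, timeout=10) as resp:
        page = resp.read().decode("utf-8", errors="replace")
    parser = LinkParser()
    parser.feed(page)
    # Turn relative links into absolute ones before checking them.
    absolute = [urllib.parse.urljoin(start_url, link) for link in parser.links]
    with concurrent.futures.ThreadPoolExecutor(max_workers=threads) as pool:
        for url, status in pool.map(check, absolute):
            print(status, url)

scan("http://www.example.com/", threads=30)  # placeholder URL

As with Xenu itself, the thread count is the knob to turn down first if the
target site starts refusing connections.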
Then you can choose which results should appear in the report it can create at
the end of the check. The available options are:
Broken links, ordered by links
Broken links, ordered by page
Broken local links
Redirected URLs
FTP and Gopher URLs
Valid text URLs
Site Map
Statistics
Local orphan files
Because the report is not a text file but an HTML page, it contains clickable
links to the pages and the files hosted on that site, so you can click the
links in your local report without following them on the site. This way you
can also avoid the popups, the redirections and the tracking. You can store
the report on your disk too, to use it again later.
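To make the point concrete, here is a small sketch, again in Python, of a
local clickable report of the kind described above. The file name and the
example URLs are placeholders; the output is a plain HTML page you open from
your own disk, so inspecting the link list never touches the live site.

# Hedged sketch: write collected URLs to a local, clickable HTML report.
import html

def write_report(urls, path="report.html"):
    with open(path, "w", encoding="utf-8") as out:
        out.write("<html><body><h1>Valid text URLs</h1><ul>\n")
        for url in urls:
            safe = html.escape(url, quote=True)
            out.write('<li><a href="%s">%s</a></li>\n' % (safe, safe))
        out.write("</ul></body></html>\n")

write_report(["http://www.example.com/page1.htm",
              "http://www.example.com/files/archive.zip"])  # placeholder URLs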
I suggest starting with Xenu on small sites first, just to get familiar
with its options, because if you enable all of them the time required to scan
a site can grow considerably.
Or you can start by checking just for valid text URLs, because if you use Xenu
to peruse a site you surely don't want the broken links :) Later, if
needed, increase the number of results to return.
If you prefer, in the main window you can also expand or restrict the check
by setting the program to scan additional external URLs beginning with a
given string, or to exclude URLs beginning with another.
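The same include/exclude idea can be sketched in a few lines of Python. The
prefixes and URLs below are illustrative only; keep URLs that start with any
"include" prefix and drop those that start with any "exclude" prefix.

# Hedged sketch: prefix-based include/exclude filtering of a URL list.
def filter_urls(urls, include=(), exclude=()):
    kept = []
    for url in urls:
        if exclude and url.startswith(tuple(exclude)):
            continue
        if include and not url.startswith(tuple(include)):
            continue
        kept.append(url)
    return kept

urls = ["http://www.example.com/docs/a.htm",
        "http://ads.example.com/banner.gif"]  # placeholder URLs
print(filter_urls(urls, include=("http://www.example.com/",),
                  exclude=("http://ads.",)))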
As I wrote, exploring the underground is not the use the author of Xenu had in
mind; this is just a different way to use a web tool for searching purposes.
altosax,
August 2002.
(c) 1952-2032: [fravia+], all rights reserved