by Various Authors, May 2002
edited by fravia+
There is so much to learn in this classroom that
you'll have to invest some time in it, gentle leecher, but
that time will be well spent. Your searching techniques will considerably improve. Learn from the
good ~seekers~ that have published this
on one of our [messageboards].
As erom wrote, the results of many searches are "either honeypots or gruyere", yet in both cases
quite instructive, as you will see ;-)
Dark riders. Leaving the paths everybody else has trotted upon to find
knowledge gems shining in the dark dark web. Using ad hoc formulae, stratagems, tricks, our
brains, to
defeat any obstacle we may encounter en route.
And we do find what we want to find, oh boy,
we do!
erom's famous formula: +"Index of /" +"/book" +"zip" +"last modified"
loki's raging magic:
big query AND tolkien
Jeff's stone for breaking mirrors:
site:www.twinkle.ws tolkien
erom's famous formula
(with some photos of the 11
september aftermaths and a mine for netbusses)
+"Index of /" +"/book" +"zip" +"last modified"
i usually add a +"ps.gz": there is still good stuff in ps.
for computer stuff i add +"cisco", which filters quite efficiently.
Adding a +".edu" or
+".mil" also filters nicely (or use inurl:)
http://www.google.com/search?hl=en&lr=&ie=utf-8&oe=utf-8&q=%2B%22Index+of+%2F%22++%2B%22last+modified%22+%2Binurl%3A%22edu%22++%2B%22100%22+%2B%22110%22+%2B%22121%22+%2B%22jpg%22+%2B%222002%22&btnG=Google+Search
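To read what a long search URL like the one above actually asks for, its query string can be decoded. This is only a minimal Python sketch using the standard library; the helper name `decode_query` is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

def decode_query(url):
    """Return the decoded value of the q= parameter of a search URL."""
    params = parse_qs(urlsplit(url).query)
    return params["q"][0]

# The URL from the post, split over several lines for readability.
url = ("http://www.google.com/search?hl=en&lr=&ie=utf-8&oe=utf-8"
       "&q=%2B%22Index+of+%2F%22++%2B%22last+modified%22+%2Binurl%3A%22edu%22"
       "++%2B%22100%22+%2B%22110%22+%2B%22121%22+%2B%22jpg%22+%2B%222002%22"
       "&btnG=Google+Search")

print(decode_query(url))
```

The decoded string shows the +"Index of /" and +"last modified" terms together with the inurl:"edu" filter and the extra number and jpg terms.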
add +"1M" to have a chance of finding galleries with hi-res pictures.
Lots of pictures of the 11 september attacks, never seen before
i don't really understand what they do with this 'folder', but the pictures change often.
some get deleted (as some post-crashboom ones were) and are put back later (for example
just after the pentagon crash controversy). nice hi-res propaganda ;)
+"Index of /" +"/book/" @ www.gigablast.com
gigablast returns fun stuff ... just nmap some of the first machines, it's either honeypots or gruyere
12345/tcp open NetBus
12346/tcp open NetBus
31337/tcp open Elite
54320/tcp open bo2k
but as the machine is made of pingwins i doubt netbus and bo2k are efficient ;)
nasty google: '(and any subsequent words) was ignored because we limit queries to 10 words'
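The limit in that message can be checked before sending a query. A rough Python sketch follows; it assumes that a "word" is simply a whitespace-separated token, which may not be exactly how google counted them, and `split_at_limit` is a hypothetical helper name.

```python
MAX_WORDS = 10  # the limit quoted in the message above

def split_at_limit(query, limit=MAX_WORDS):
    """Split a query into the part that is kept and the words that get ignored.

    Assumption: words are whitespace-separated tokens.
    """
    words = query.split()
    return " ".join(words[:limit]), words[limit:]

kept, ignored = split_at_limit("a b c d e f g h i j k l")
print("kept:   ", kept)
print("ignored:", ignored)
```

If `ignored` is not empty, the query is too long and some filter terms have to go.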
erom's military spelunking
(tiffs galore)
some 20 megs tiffs ..
movies from 1897-1920
"Managing Production Metadata Without a Management System"
i just have to hunt down where to dl the realmedia files ;)
from there
found here
hasco's addition
(more military maps)
just passing by.. saw military stuff.. wanna play too :)
some huge data files, DTED, Vmap
when i say huge, it IS huge. See this DTED
(Digital Terrain Elevation Data) file : /ftpdir/dted0/ddworldbsq.bsq
geeesh, i don't really understand what i'm fishing.
is this really public ?
Maybe these specs would help :
Geospatial Standards and Specifications
What is DTED? Some explanations:
What is DTED?
Digital Terrain Elevation Data (DTED) is an elevation database distributed by NIMA.
The OpenMap DTEDLayer (http://openmap.bbn.com/doc/api/com/bbn/openmap/layer/dted/DTEDLayer.html) can create terrain renderings for DTED levels 0 (elevations are 1 km apart), 1 (100 meters apart), and 2 (30 meters apart).
Can OpenMap display DCW, VMAP, VPF?
Yes, we have an OpenMap layer that can display this data. The VPFLayer javadoc (http://openmap.bbn.com/doc/api/com/bbn/openmap/layer/vpf/VPFLayer.html) includes a description of the various openmap.properties you need to set to select data for display.
oh.. i almost forgot. i've fished a (again huge) map
gem. this time i'm sure, it is free and -open- stuff :
Major Military Installations (14.313 MB JPEG)
SaF's question
(dark google's cache)
This search:
http://www.google.com/search?hl=en&q=http%3A%2F%2Fwww.twinkle.ws%2Freading%2Funsorted%2Fmirror4%2F&btnG=Google+Search
Gives you a link which, when clicked on, takes you to a mirror of an online library.
Looking at google's cache gives something more interesting...lots of goodies.
Now using web.archive.org for different dates gives interesting things at:
different dates give different content but *not* what the cache shows...plus you can't get at them :-(
Any ideas how to get at these?
A lateral step: Jeff's funny 404:
http://www.twinkle.ws/reading/Applied%20Cryptography,%20Second%20Edition:%20Protocols,%20Algorithms,%20and%20Source%20Code%20in%20C/
loki's magical big raging query
(books and books)
a big ragin' query to find ebooks fast
title:"index of" AND "parent directory"
AND (title:"books" OR title:"book" OR title:"ebook" OR title:"ebooks")
AND (zip OR pdf OR tiff OR txt)
AND NOT (htm OR html)
simple, but quite powerful: more than 9K sites to refine from :)
e.g.: big query AND tolkien
see the first 2 results (i didn't look further):
damn big list.. hey, there are even castaneda's books ! and hoffman's !
oh well... :)
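The "big query AND something" refinement can be written down as a small Python sketch. The query text is loki's, copied as posted; the function name `refine` and the choice to quote multi-word terms as a phrase are assumptions made for illustration.

```python
# loki's big query, joined into one line.
BIG_QUERY = (
    'title:"index of" AND "parent directory" '
    'AND (title:"books" OR title:"book" OR title:"ebook" OR title:"ebooks") '
    'AND (zip OR pdf OR tiff OR txt) '
    'AND NOT (htm OR html)'
)

def refine(term, base=BIG_QUERY):
    """Append one more AND clause to the base query.

    Assumption: terms with a space in them are meant as a phrase,
    so they get wrapped in double quotes.
    """
    term = term.strip()
    if " " in term:
        term = '"%s"' % term
    return base + " AND " + term

print(refine("tolkien"))
print(refine("anne rice"))
```

Since every top-level clause is joined with AND, the extra term can simply be appended at the end.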
loki's strange altavista behaviour
(books and books and treasures)
some stats for the big query, and a strange altavista behaviour
aristotle : 6
douglas adams: 4
rousseau : 3
bible : 67
blake : 0
hawking : 1 (a descr..)
lovecraft : 0 (it is there, i saw him. but wasn't crawled yet)
caroll : 0 (??)
lewiscaroll : 1 (..)
anne rice : 3
zen : 7
"zen and the art of" : 3 results :)
i've noticed a strange behaviour of altavista (sounds familiar).
this url will be our example, and more specifically this zip file located there:
-> ReneDescartes-DiscourseOnTheMethodOfRightlyConductingTheReason,AndSeekingTheTruthInScience.zip <-
altavista matches uppercase for you if you keep your whole query in lowercase (but not the other way round), and it doesn't take the '-' into account
so we can search for ReneDescartes or DiscourseOnTheMethodOfRightlyConductingTheReason,AndSeekingTheTruthInScience (and 'zip', of course)
'descartes' : muz.ca not in the results NOP
no way to put an asterisk BEFORE the word.. that causes some trouble for ebook searching in particular, because the name frequently appears at the END of the filename (ex: lewiscaroll)
'rene*' : muz.ca not in the results NOP
logic :
The asterisk is a wildcard; any letters can take the place of the asterisk. Bass* would find documents with bass, basset and bassinet.
You must type at least three letters before the *.
You can also place the * in the middle of a word. This is useful when you're unsure about spelling.
Colo*r would find documents that contain color and colour.
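The wildcard rules quoted above can be modelled with a regular expression. This is only a sketch of the documented rules, not of what altavista really did internally: the function name `wildcard_to_regex` and the optional `max_chars` cap on how many letters one asterisk may stand for are assumptions.

```python
import re

def wildcard_to_regex(pattern, max_chars=None):
    """Translate a pattern such as 'bass*' or 'colo*r' into a compiled regex.

    Rule from the quoted doc: at least three letters before the first *.
    max_chars is a hypothetical cap on the letters one * can replace.
    """
    star = pattern.find("*")
    if star != -1 and star < 3:
        raise ValueError("need at least three letters before the *")
    quantifier = "*" if max_chars is None else "{0,%d}" % max_chars
    parts = [re.escape(part) for part in pattern.split("*")]
    body = ("[a-z]" + quantifier).join(parts)
    return re.compile("^" + body + "$", re.IGNORECASE)

for word in ("bass", "basset", "bassinet"):
    print(word, bool(wildcard_to_regex("bass*").match(word)))
for word in ("color", "colour"):
    print(word, bool(wildcard_to_regex("colo*r").match(word)))
```

Passing `max_chars` makes it possible to try out a guess about the wildcard's reach against a known filename.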
i've read in a translated version of this doc that the asterisk stood for 5 chars.
renedescartes -> 13 chars. So we miss 9 chars with 'rene'
'rened*' : muz.ca is the result YEP
right. but that still leaves 8 chars to reach the full keyword.
why ? maybe the translated version of the doc was wrong. maybe it's more than 5 chars, but fewer than 9.
Now, let's use the second part of the filename :
'discourseonthe*' : muz.ca not in the results NOP
hehe :) we miss 62 chars on this one.
'discourseonthe**' : muz.ca is the result YEP
wtf ? a hidden feature ? (i'm sure i've read this somewhere in searchlores, but where ?)
and then why, o why, this query doesn't work :
'rene**' : muz.ca not in the results NOP !!
has someone already noticed this strange behaviour of altavista ? or was it google.. RH ? jeff ?
or am i missing a point ?
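The character counts used in the reasoning above are easy to double-check. A small Python sketch follows; `missing_chars` is a hypothetical helper, and it only verifies the arithmetic, it does not explain why altavista behaves this way.

```python
def missing_chars(prefix, full):
    """Number of characters a trailing wildcard has to cover to reach `full`."""
    if not full.lower().startswith(prefix.lower()):
        raise ValueError("prefix does not match the start of the keyword")
    return len(full) - len(prefix)

NAME = "ReneDescartes"
TITLE = ("DiscourseOnTheMethodOfRightlyConductingTheReason,"
         "AndSeekingTheTruthInScience")

for prefix, full in [("rene", NAME), ("rened", NAME), ("discourseonthe", TITLE)]:
    print(prefix + "*", "misses", missing_chars(prefix, full), "chars")
```

This gives 9, 8 and 62, the same numbers as in the post (the 62 counts the comma in the filename).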
Jeff breaks the cache
(books and books and treasures)
it takes a little bit more work... but ya can get in
search for the files that are in the index folder .. e.g. Big.U.txt
or take a look inside one of the indexes:
looking inside, we see .. 'The.Great.Simoleon.Caper.txt'
so we do a site search of twinkle for the great caper
and then use the cache link, which, as u all have noted, works :)
Jeff breaks the mirrors
(just need a little more finesse to go to the proper places :)
I am terrible at explaining things so if you'll forgive me and give me another go I'd like to explain better (if possible--hic hic)
If you find an index such as the above example-- we can see that google cache HAS indexed and cached it... however clicking on the links may very well take you to a redirected or dead hyperlinked ozone
the fact that it has cached this index (though not tested) just may mean that these links are also CACHED -- however --- because they are hyperlinks when u click on them they perform in a different manner than we may want --- goin to redirects etc..........
so what we can do is this:
we know the site
we know the directory listings
(IF there are mirrors or a new URL site --- and this one is simply closed down etc... typing in 3-4 of the directory NAMES may find them listed elsewhere under a new domain)
we know google has cached THIS Index
we can therefore ask google to do a SITE search for any directory name that we see listed: one we see is tolkien
site:www.twinkle.ws tolkien
once we peek inside this directory we can grab file names we may be interested in ... then we can either hit the back button and ADD this filename to our search: site:domain directoryname filename
then check cache of
or if we see a link that we want to goto that is being driven elsewhere we can just copy the address instead of backclicking into google and adding the filename
site:www.twinkle.ws tolkien KING ARTHUR
this narrows the search down and because google has cached---it is there
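The narrowing steps above (site, then directory name, then file name) can be sketched in Python. The function names `site_query` and `google_url` are made up for illustration, and the search URL shape is taken from the google URLs quoted elsewhere in this thread, so treat it as an assumption rather than a guaranteed interface.

```python
from urllib.parse import urlencode

def site_query(domain, *terms):
    """Build a site-restricted query: site:domain directoryname filename ..."""
    return " ".join(["site:" + domain, *terms])

def google_url(query):
    """Turn a query into a search URL (URL shape assumed from the posts above)."""
    return "http://www.google.com/search?" + urlencode({"q": query})

first = site_query("www.twinkle.ws", "tolkien")
narrowed = site_query("www.twinkle.ws", "tolkien", "KING", "ARTHUR")

print(first)
print(narrowed)
print(google_url(first))
```

Each extra term narrows the result list, and the cache link of the matching result is then the way in.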
another example address:
just take everything off the browser url after cache
and copy and paste the link u are looking at in google:too
so anyway the short story of the long winded story is--- if google is finding you an index or page showing its been cached--chances are the hyperlinks u are clicking on that are taking you elsewhere just need a little more finesse to get them to goto the proper places :)
site:www.twinkle.ws tolkien
of course suggestions and comments are welcome...

