~ Dublin talk (20 April 2005) ~
(Date: Wednesday April 20th, 2005; Time: 7:00pm; Location: Joly Theatre (HAM4), Trinity College, Hamilton Building)



"Dublin can be heaven
With coffee at eleven
And a wardriving stroll around Stephen's Green"
  

Web searching talk at the Trinity
by Fravia+, April 2005, version .13


Approximate "Scaletta" of the Talk
(this is just a potpourri of many possible paths, the talk itself may -and probably will- differ from these tracks)

Introduction and caveats
email    How to discover whois    Some magic tricks    Guinness
books    music    strings to play with    languages    eventual sidepaths
Conclusions   

Searching for disappeared sites    Netcraft    Structure of the web
Synecdochical searching    What a search looks like   
All the main s.e.: Bk:flange of myth       [rose]      webbits' cosmic power




INTRODUCTION
Structure, Opera and Proxomitron

Excuse my English, please, which is my third language, and please note that I'm not sure I'll always be able to be politically correct. (That's one among the many reasons why I prefer to use a pseudonym instead of my "real" name)
With this talk I hope to show you how to search the web effectively, and hence give you cosmic power, no more and no less.
Once you know how to search the web, the whole of human knowledge will be available, at your disposal.
To give you a silly example, using google, we will see how a simple "moronical" query like
"index of" warez
has your target signal submerged under such heavy commercial noise as to be next to useless.
So what should you use instead? (In the rather unlikely event you would look for software on the web :-)
Well, a query like the following should cut more mustard:
("wares" OR "warez" OR "appz" OR "gamez" OR "abandoned" OR "pirate" OR "war3z") ("download" OR "ftp" OR "index of" OR "cracked" OR "release" OR "full") ("nfo" OR "rar" OR "zip" OR "ace")
This is just an example, and using just one of the main search engines. Always remember that the main search engines (at the moment the most important ones are google, yahoo, msn and teoma) cover -at best- just a third of the whole web...
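The pattern behind such a query (several OR groups placed side by side, so that the engine must match one term from each group) is easy to generate by script. Here is a minimal sketch in Python; the helper names and the sample terms are my own illustrations, and the URL layout is simply assumed from the google.com/search links used elsewhere in this talk.

```python
from urllib.parse import quote_plus


def or_group(terms):
    """Quote each term and join them into one parenthesised OR group."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(*groups):
    """Place several OR groups side by side (engines AND them implicitly)."""
    return " ".join(or_group(g) for g in groups)


def google_url(query, num=100):
    """Turn a query into a search URL (layout assumed, not guaranteed)."""
    return f"http://www.google.com/search?num={num}&q={quote_plus(query)}"


# Hypothetical groups: synonyms for the target, the place, and the venue.
target = ["guinness", "stout", "pint"]
place = ["dublin", "baile atha cliath"]
venue = ["pub", "bar", "snug"]

query = build_query(target, place, venue)
print(query)
print(google_url(query))
```

Adding a synonym then means adding one word to one list, instead of retyping the whole string.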

So the problem is to wade through the slimy commercial morasses of the web, which were made specifically "ad captandum vulgus", and to quickly "cut" this useless ballast in order to find our targets.

We'll use google a lot as an example, but we'll see today various different ways to fetch "by hand" your targets. I say "by hand" because real seekers try to automate the process as much as possible and usually employ ad hoc bots to do the gritty digging. But that's for later, first the basics.

Let's first have a short look at what the web looks like from a searcher's point of view.
Outside linkers are fetched through klebing (and stalking and social engineering), the bulk and the outside linked through combing and short and long term seeking, the hidden and commercial databases through password breaking or guessing, social engineering or, more simply, just seeking databases' hardcoded passwords (à la Borland Interbase's "politically correct") on the web.
Here for instance one of these lists: defpasslist1.htm

In fact the web was made for SHARING information, not for "hoarding" nor for "selling" it. And it was made quite solid: its structure was made in order to resist a possible nuclear attack. It will resist even the commercial beasts that have tried to bury real useful information under tons of commercial crap, aggressive commercial porn sites and an avalanche of silly and useless advertisements.
Learning how to search, you'll be able to "cut" through the commercial pudding and morasses and fetch quickly (or relatively quickly) your target jewels.

But to "cut" the Web you'll need first of all a SWORD with a sharp blade: a capable and quick browser. That's the first and foremost step. MSIE, Microsoft Internet Explorer, is a no-no-no: too buggy, bloated and prone to all sorts of nasty attacks. The two current "philosophical schools" are either Firefox or Opera... which is the one I am using now.

Whichever of the two "real" browsers you use, no sword will be enough without a SHIELD. And your shield, and a mighty one, is proxomitron.
Proxomitron is a very powerful tool. Its power lies in its ability to rewrite webpages on the fly, filter communications between your computer and the web servers of the sites you visit, and to allow easy management of external proxy use.
Here is a link to an old, but very good essay about proxomitron basic installation: anony_8.htm, and a link to another essay, Oncle Faf goes inside proxomitron, about further finetuning... Let's sum it up: "Only morons 'just do it' without Proxomitron."

A word of warning: You'll most probably forget most of the things you'll learn today rather quickly, since most young people (and many elder ones as well) after having been heavily bombarded by advertisements from their birth onwards, have nowadays an attention span of just a few minutes and a memory as weak as an autumn leaf. But -I hope- you'll know more or less the basics of searching the web correctly.
So, should you forget, for instance, how to quickly find mp3s, you'll be able to use your combing knowledge to quickly find on the web those that will teach you how to find mp3 quickly. Or maybe even their teachers :-)

MUST KNOW
sine qua non

Just a short tour around the house

Main, regional and local
ftp, blogs and targets
usenet irc and then, of course, trolls
Again: anonymity and stalking



e-mail
Anonymity for beginners

You'll find more about "free" email in the ad hoc section of my site; the most important thing is to NEVER give out real data on the web unless you are really compelled to do so (and even in that case there are many ways to avoid it).

Always choose the first option, whatever it is, when you (have to) "choose" some options from a menu ("Your income", "Your profession", "Your State" and so on): State=Afghanistan, Income=less than 15 euro per year and so on... If you want to play, there are some funny logistical options like "American Samoa" "Fortune and Wallys Islands" and so on.
The option "other" that you often find on these menus is also great, because you will get the wannabe sniffers thinking hard about updating their long palette of options, adding even more crap to their possible choices.

Do not feel bad while feeding only lies to anyone asking for your data on line: such people are just scum that will use EVERYTHING you will tell them for profit the very moment you do, and they don't even have the decency to admit it. Screw them black and blue, such clowns deserve far worse than that: never believe for a minute that their 'privacy - pleads' about how they will "never use your data" could be anything else than cheap sarcasm.
The very reason they did set up such "free" email addresses sites (and such "free" search engines and "free" file repositories) is -of course- to READ everything you write and to have a copy of everything you upload or create.
Of course, klaro, no human being will ever read what you write, but their bots and grepping algos will do it for the owners of the "free" email services (or of the "free" search engines), presenting them nice tables built on your private data as a result.
This brings us to a very interesting contradiction: on one side "echelon" and total big-brotherish control, on the other "wardriving" and pretty good anonymity...

Examples of "one shot" email addresses...
Mailinator http://www.mailinator.com/mailinator/Welcome.do
Another example:
http://www.pookmail.com/



How to discover whois

For instance using the previous example: http://www.whois.sc/pookmail.com (scroll down for contact names and info)
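If you want to run the same lookup for any address you stumble upon, a few lines of Python will do. This is only a sketch: the whois.sc address pattern is taken from the example above, the helper names are mine, and stripping a leading "www." is a naive shortcut that does not work out the registrable domain of deeper hostnames.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercase host of a URL, without port or leading 'www.'."""
    if "://" not in url:
        url = "http://" + url  # urlsplit needs a scheme to find the host
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def whois_url(url):
    """Build the lookup address following the whois.sc pattern shown above."""
    return "http://www.whois.sc/" + host_of(url)


print(whois_url("http://www.pookmail.com/"))  # prints http://www.whois.sc/pookmail.com
```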

Some Magic 


Here, as promised, some simple "web-magic"...



1) Sourceror2 (by Mordred & rai.jack)
try it right away
Right click and, in opera, select "add link to bookmarks"

javascript: z0x=document.createElement('form'); f0z=document.documentElement; z0x.innerHTML = '<textarea rows=10 cols=80>' + f0z.innerHTML + '</textarea><br>'; f0z.insertBefore(z0x, f0z.firstChild); void(0);
javascript:document.write(document.documentElement.outerHTML.replace(new RegExp("<","g"), "&lt;"));


2) Another google approach
http://www.google.com/complete/search?hl=en&js=true&qu=fravia

3) Another google approach (by Mordred)
Here is a way to gather relevant info about your target
"index+of/" "rain.wav******"
Useful to see date and size that follow your target name...
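Once the snippets show the date and size after the file name, you can pull them out automatically. The sketch below assumes the snippet follows the usual layout of a server-generated directory listing (name, DD-Mon-YYYY date, HH:MM time, size); the sample line is invented for illustration, and real snippets may well be laid out differently.

```python
import re

# One listing row: name, date like 12-Mar-2004, time like 10:22, size like 1.2M
LISTING_ROW = re.compile(
    r"(?P<name>\S+)\s+"
    r"(?P<date>\d{2}-[A-Za-z]{3}-\d{4})\s+(?P<time>\d{2}:\d{2})\s+"
    r"(?P<size>\d+(?:\.\d+)?[KMG]?|-)"
)


def parse_rows(text):
    """Return one dict (name, date, time, size) per listing row found."""
    return [m.groupdict() for m in LISTING_ROW.finditer(text)]


# Hypothetical snippet text, as it might appear under a search result.
sample = "rain.wav 12-Mar-2004 10:22 1.2M thunder.wav 03-Jan-2005 18:05 640K"

for row in parse_rows(sample):
    print(row["name"], row["date"], row["size"])
```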

4) ElKilla bookmarklet (by ritz)
try it right away (no more clicking, press DEL to delete and ESC to cancel)
Right click and, in opera, select "add link to bookmarks"



More about bookmarklets in the javascript bookmark tricks essay.





"the best Guinness in Dublin"
"Pionta Guinness, Led Thoil"

Well, I'll try my hand at it, at risk of being ridiculous in your eyes... In fact it is often very dangerous -for any seeker- to claim that he can really find things on the fly :-)
Let's see together, in front of a competent professional crowd, the results of a seeker's attempt to find "the best Guinness in Dublin" BEFORE coming here...
The didactical aspect will be evident once we examine, at the end, the simple querystrings I have used.

Some insist that Guinness at its best can only be had in Ireland, and more specifically in Dublin. "It doesn't travel well", they say.
Furthermore, as every beer lover knows, a pint drawn in a pub is always far superior to the bottled version. And it takes an expert bartender to "pull a pint" just right, so that it finishes off with a head of creamy foam so thick you can carve your initials in it.

Well, this is what I have found...
  1. The Guinness brewery (guinness, but they won't give away the source code).    The former Guinness Hop Store on Crane Street houses an exhibition centre and the gravity bar where you can taste (according to them) "the best Guinness in Dublin".
  2. Mulligans off Tara Street & not far from the Screen Cinema, in Poolbeg Street. Another institution and home to "the best Guinness in Dublin".
  3. Porterhouse on Parliament street (street opposite Dublin Castle).
    The beer on offer - from the session beer Porterhouse Red to the 7% BrainBlasta, and the hundreds of bottled beers from around the world, is second to none.
    If you bring them a beer they don't have, they give you a free beer (in the old days - free beers all evening!)
  4. Nesbits on Baggot Street (beyond Merrion Square - go to the top of Merrion Street & turn left) a beautiful Victorian exterior and wood-panelled interior. Alas -they say- frequented by boring, money-oriented, zombies.
    Or, rather, in the same Baggot street, with a nicer and less pretentious fauna:
    Toners on Baggot Street, a very small bar, fairly packed most nights... in the afternoons this is the best bar in Dublin, particularly if you manage to take possession of the wonderful window snug, which can be easily defended against invaders.
    Has arguably "the best Guinness in Dublin".
  5. The Brazen Head, 20 Bridge Street,
    The oldest pub in Dublin, and a must for every pub-lover. The only pub in Dublin with a courtyard. "The Guinness is among the best you will find in Dublin"
  6. O'Neills on Pearse Street (near one of the back gates to Trinity).
  7. ...or get out of town, take the Dart to Booterstown & try the Punch Bowl (near the station). Otherwise, there are lots of pubs in Blackrock (the following Dart station).
So, as you can see, there seems to be no lack of places where you can taste "the best Guinness in Dublin". How can you (quickly) evaluate which one(s) you should really visit?
There seem to be some simple rules:
1) Generally, the farther away from Dublin you are, the worse the Guinness gets.
2) Generally, the farther away from Ireland you are, the worse the Guinness gets.
(It doesn't travel well...)

So I would go for Mulligans and for Toners and I would forget the badly frequented Nesbits and the "Gravity" bar at the Guinness brewery, despite its probably good stout. Yes, despite! While substance is always king, form cannot be ignored, especially when -reversing it- you note that it gives you negative vibes.
Judging from the images the Gravity place seems shrill and "made like a McDonald's", a sterile and "anti-cosy" place with the obvious express purpose of pushing people quickly away and increasing its turnover... get lost. Seekers usually love "slow food" places, where they can concentrate on their next queries :-)

Now it is up to you to judge if these Guinness conclusions of mine, pulled from air-searchstrings, are indeed correct or rather off the mark...
...this said, all these Guinness stouts seem to me to be served a tad too cold everywhere nowadays :-)

And here are the strings I have used:
"best guinness in dublin" language:en {frsh=13} {popl=38} {mtch=29}
("best guinness in dublin" OR "best pint in dublin")
best-guinness-experience OR best-tasting-guinness OR best-guinness-in-dublin

As you can see the three strings may look quite straightforward to all of you now...

Microsoft sliders, the anti-google weapon?
Please take your time to investigate the first (microsoft) searchstring used in the above Guinness example.
Use the three sliders by yourself!
(Note how the morons at microsoft hide them instead of being proud of them: click onto http://search.msn.com/advanced.asp?qb=1 and then click "result ranking")
If Microsoft implemented MORE sliders, they would probably beat google (provided Microsoft enlarges its poor database and learns to be less americanocentric). These SLIDERS give THE USER the possibility to finetune (correctly) the searching algo. Basically, instead of trusting some idiot in Arizona to know what's good for your search, you decide that -say- "less popular" is a better choice BECAUSE "popular" searches are what peasants in Idaho are searching for :-)
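The slider settings travel inside the querystring itself, as the {frsh=..} {popl=..} {mtch=..} tags of the first Guinness string show, so you can also set them by script. A minimal sketch follows; that the three tags stand for freshness, popularity and match precision, and that each runs from 0 to 100 with 50 as the neutral middle, are my assumptions, not something documented here.

```python
def slider_query(phrase, language="en", frsh=50, popl=50, mtch=50):
    """Build a phrase query carrying the three slider tags.

    The 0-100 range and the default of 50 are assumptions for illustration.
    """
    for name, value in (("frsh", frsh), ("popl", popl), ("mtch", mtch)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    return (
        f'"{phrase}" language:{language} '
        f"{{frsh={frsh}}} {{popl={popl}}} {{mtch={mtch}}}"
    )


# Reproduces the first string used for the Guinness search.
print(slider_query("best guinness in dublin", frsh=13, popl=38, mtch=29))
```

Looping over a few values of popl with the same phrase is a quick way to see how much the "popularity" weight alone reshuffles your results.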

Finally note that I haven't used any regional irish search engine, because -alas- there are no really useful local search engines in Ireland: the few existing ones,
www.niceone.com, www.irishsearch.net, www.browseireland.com, www.indexireland.com, www.whoisireland.com and www.irishsites.com,
all have very little content and very little traffic. Yep, sometimes you just cannot go regional enough :-(



Books

There's a whole section regarding books at searchlores, so I won't get into these simple targets much. Suffice it to say that most of the books mankind has written are already on the web somewhere, and that while we are sitting here, hic et nunc, hundreds of fully scanned libraries are going on line... if you're attentive enough, and if your searching scripts are good, you can even hear the clinking "thud" of those huge databases going on line...

In order to fetch books you just need some correct strings.
A simple trick is to use the powerful A9 engine, for instance, for conan doyle, http://a9.com/conan%20doyle?a=obooks and then fetch the study in scarlet.
Of course once we have some arrows, it is relatively easy to fetch whole copies of a book all over the web...

Of course this is also true for all kind of copyrighted books... let's see: "I have no fitting gifts to give you at our parting,"...
and we land here, for instance: 'I have no fitting gifts to give you at our parting,' said Faramir; `but take these staves...

A typical example of the joys of good web-fishing: ftp://ftp.cdut.edu.cn/ Let's see if you can work out "a ritroso" (backwards) the querystrings you might use to fetch such a site (this kind of approach is a useful seekers' exercise as well).



Music

A completely new wave of music searching has opened through the relatively recent mp3 blogs phenomenon, but usually it is MUCH simpler to just fetch the music you need from the web any time you need it.
Most simple trick:
m4a "index of" dylan
For instance:
http://www.stud.ntnu.no/~nikgol/21-11-2004/
Or even this, found with the previous query, so big that it may crash our browsers...
http://24.91.184.80/jserver/files/music/

A more complex webbit:
"icons/sound2.gif" "index of" mp3

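Both music strings follow the same recipe, so a tiny script can produce them for any format and any name. This is a sketch with helper names of my own; the only outside fact it relies on is that Apache's stock directory listings mark audio files with the icon icons/sound2.gif, which is why that path works as a webbit.

```python
def index_of_query(extension, keyword):
    """The 'most simple trick': file extension, "index of", and a keyword."""
    return f'{extension} "index of" {keyword}'


def icon_query(extension, icon="icons/sound2.gif"):
    """The webbit: match open listings through the audio icon they display.

    Apache's default configuration uses icons/sound2.gif for audio files.
    """
    return f'"{icon}" "index of" {extension}'


for ext in ("m4a", "mp3", "ogg"):
    print(index_of_query(ext, "dylan"))
print(icon_query("mp3"))
```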


strings to play with

Consider this your "assignment" from this talk.
Play with the following strings THIS EVENING OR TOMORROW and you may (may) even remember some of this stuff despite your probably poor memory and attention span :-)
http://www.google.com/search?num=100&q=intitle%3Abig.brother+attention%20trouble%20unavailable%20offline
For instance:
http://www.google.com/url?sa=U&start=6&q=http://barnes.bloomu.edu/bb/&e=10094
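Both addresses above are worth taking apart: the first hides the querystring in its q= parameter, the second hides the real target behind a google redirect. A short sketch using Python's standard urllib.parse shows how to read them; the helper name is mine.

```python
from urllib.parse import urlsplit, parse_qs


def url_param(url, name):
    """Return the first decoded value of one query parameter, or None."""
    values = parse_qs(urlsplit(url).query).get(name)
    return values[0] if values else None


search = ("http://www.google.com/search?num=100"
          "&q=intitle%3Abig.brother+attention%20trouble%20unavailable%20offline")
redirect = "http://www.google.com/url?sa=U&start=6&q=http://barnes.bloomu.edu/bb/&e=10094"

print(url_param(search, "q"))    # the human-readable querystring
print(url_param(redirect, "q"))  # the target behind the redirect
```

Reading the decoded q= value is also the quickest way to learn which operators somebody else's clever search link is really using.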

Here, for those of you that want to flex their seeking muscles, another assignment for the next days
Try these (old and already overused) arrows on various search engines. Each one of them should give you plenty of interesting searching paths to follow :-)
#mysql dump filetype:sql
AIM buddy lists
allinurl:/examples/jsp/snp/snoop.jsp
allinurl:servlet/SnoopServlet
cgiirc.conf
filetype:conf inurl:firewall -intitle:cvs
filetype:eml eml +intext:"Subject" +intext:"From" +intext:"To" 
filetype:lic lic intext:key 
filetype:mbx mbx intext:Subject 
filetype:wab wab 
(Financial spreadsheets: finance.xls OR Financial spreadsheets: finances.xls)
Ganglia Cluster Reports
generated by wwwstat
haccess.ctl 
Host Vulnerability Summary Report
HTTP_FROM=googlebot googlebot.com 'Server_Software='
ICQ chat logs, please...
Index of / "chat/logs"
intitle:"index of" (mysql.conf OR mysql_config) 
intitle:"statistics of" "advanced web statistics"
intitle:"Usage Statistics for" "Generated by Webalizer"
intitle:"wbem" compaq login
intitle:admin intitle:login
intitle:index.of "Apache" "server at"
intitle:index.of cleanup.log
intitle:index.of dead.letter
intitle:index.of inbox
intitle:index.of inbox dbx
intitle:index.of ws_ftp.ini
inurl:"newsletter/admin/"
inurl:"newsletter/admin/" intitle:"newsletter admin"
inurl:"smb.conf" intext:"workgroup" filetype:conf conf
inurl:admin filetype:xls
inurl:admin intitle:login
inurl:cgi-bin/printenv
inurl:changepassword.asp
inurl:fcgi-bin/echo
inurl:main.php phpMyAdmin
inurl:main.php Welcome to phpMyAdmin
inurl:perl/printenv
inurl:server-info "Apache Server Information"
inurl:server-status "apache"
inurl:tdbin
inurl:vbstats.php "page generated"
ipsec.conf
ipsec.secrets
Most Submitted Forms and Scripts "this section"
mt-db-pass.cgi files
mystuff.xml - Trillian data files
Network Vulnerability Assessment Report






SEARCHING FOR DISAPPEARED SITES

http://webdev.archive.org/ ~ The 'Wayback' machine, explore the Net as it was!


Visit The 'Wayback' machine at Alexa.


Alternatively, learn how to navigate through [Google's cache]!
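Both routes to a disappeared page boil down to building an address. Here is a minimal sketch; the web.archive.org/web/*/ listing pattern and the cache: operator are the forms I know from those services, not something taken from this page, and either may change or disappear. The helper names and the example address are hypothetical.

```python
from urllib.parse import quote_plus


def wayback_listing(url):
    """Address of the archive's list of stored captures for a page.

    The "*" in place of a timestamp asks for every capture on record.
    """
    return "http://web.archive.org/web/*/" + url


def google_cache_query(url):
    """Search address using the cache: operator (assumed still supported)."""
    return "http://www.google.com/search?q=" + quote_plus("cache:" + url)


gone = "http://www.example.com/"  # placeholder for your vanished target
print(wayback_listing(gone))
print(google_cache_query("www.example.com"))
```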

NETCRAFT SITE SEARCH

(http://www.netcraft.com/ ~ Explore 15,049,382 web sites)

VERY useful: you find a lot of sites based on their own name, which is another possible way to get to your target...


Example: site contains [searching] (a thousand sites eh!)













Structure of the web


Short and long term seeking

 

What a search looks like


To do...

 

Synecdochical searching


A Synecdoche ("sin-EK-doh-kee") is the rhetorical or metaphorical substitution of a part for the whole, or vice versa. This approach is widely used in searching, because it allows you to get at your signal 'from the bottom', eliminating part of the noise.

For some specific examples see synecdoc.htm.
Here let's just have "a visual look" at a search:

The red cylinder below represents the TOTALITY of accessible web sites that could be of interest to you -in the context of your current search. The small rings depict four different specific clusters of interesting sites.
Please remember that inside the cylinder the 'void' is only APPARENT! That's the part of the internet you cannot reach through the main search engines. There are interesting sites there as well (as a matter of fact MUCH more than on the 'accessible' outside), but to grab them you'll have to use more advanced techniques than commercial engines :-)

[image: latilongi, the red cylinder with its four numbered clusters]
1 You land for the first time on an interesting cluster of sites through your 'clean cut'
2 You have 'synecdochically' moved horizontally, modifying your original clean-cut
3 These sites will be relatively easy to find: they are both on a horizontal and on a vertical synecdoche. Note that the signal width of the vertical synecdoches (e.g. the yellow one on the right side of the image) may vary quite a lot, while the width of horizontal synecdoches seems more constant.
4 You'll never find this cluster with your current synecdochical approaches, you'll have to devise a COMPLETELY DIFFERENT cut.



LANGUAGES


Languages (that "english mothertongues" mostly underestimate): e.g. japanese bookmarklets...

As an example of how powerful some on-line services can be, have a look at the following tool, which you may use to understand a Japanese site:

RIKAI
An incredible jappo-english translator!
http://www.rikai.com/perl/Home.pl
Try it, for instance, on http://www.shirofan.com/ See? It "massages" WWW pages and places "popup translations" from the EDICT database behind the Japanese text!

for instance
http://www.rikai.com/perl/LangMediator.En.pl?mediate_uri=http%3A%2F%2Fwww.shirofan.com%2F
See?
You can use this tool to "guess" the meaning of many a japanese page or -and especially- japanese search engine options, even if you do not know Japanese :-)
You can easily understand how, in this way, you can -with the proper tools- explore the wealth of results that the japanese, chinese, korean... you name them... search engines may (and probably will) give you.
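The mediator address shown above is nothing more than the target address, percent-encoded and hung onto a mediate_uri parameter, so you can wrap any page (or any foreign search engine) the same way. A sketch, assuming the service still accepts exactly that pattern; the function name is mine.

```python
from urllib.parse import quote

MEDIATOR = "http://www.rikai.com/perl/LangMediator.En.pl"


def mediated(url):
    """Wrap a page address in the mediator pattern from the example above."""
    # safe="" forces ":" and "/" to be encoded too, as in the original link
    return f"{MEDIATOR}?mediate_uri={quote(url, safe='')}"


print(mediated("http://www.shirofan.com/"))
```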

Let's search for "spanish search engines"... see?
Let's now search for "buscadores hispanos"... see?





EVENTUALLY


I would also like to draw your attention towards the paramount importance of names on the web...
The ethical aspect...
An unjust society...
websearch importance nowadays recognized and obvious, you'll see tomorrow :-)...
libraries and documents: frills and substance...
the guardian of the light tower, the young kid in central africa and the yuppie in new york...

CONCLUSIONS

Ode to the seekers

Like a skilled native, the able seeker has become part of the web. He knows the smell of his forest: the foul-smelling mud of the popups, the slime of a rotting commercial javascript. He knows the sounds of the web: the gentle rustling of the jpgs, the cries of the brightly colored mp3s that chase one another among the trees, singing as they go; the dark snuffling of the m4as, the mechanical, monotone clinking of the huge, blind databases, the pathetic cry of the common user: a plaintive cooing that slides from one useless page down to the next until it dies away in a sad, little moan. In fact, to all those who do not understand it, today's Internet looks more and more like a closed, hostile and terribly boring commercial world.
Yet if you stop and listen attentively, you may be able to hear the seekers, deep in the shadows, singing a lusty chorus of praise to this wonderful world of theirs -- a world that gives them everything they want.
The web is the habitat of the seeker, and in return for his knowledge and skill it satisfies all his needs.

The seeker no longer even needs to hoard on his hard disks whatever he has found: all the various images, music, films, books and whatnot that he fetches from the web... he can just taste and leave there what he finds, without even copying it, because he knows that nothing can disappear any more: once anything lands on the web, it will always be there, available for eternity to all those that possess its secret name...

The web-quicksand moves all the time, yet nothing can sink.

In order to fetch all kinds of delicious fruits, the seeker just needs to raise his sharp searchstrings.

In perfect harmony with the surrounding internet forest, he can fetch again and again, at will, any target he fancies, wherever it may have been "hidden". The seeker moves unseen among sites and backbones, using his anonymity skills, his powerful proxomitron shield and his mighty HOSTS file.
If need be, he can quickly hide among the zombies, mimicking their behaviour and thus disappearing into the mass.

Moving silently along the cornucopial forest of his web, picking his fruits and digging his jewels, the seeker easily avoids the many vicious traps that have been set to catch all the furry, sad little animals that happily use MSIE (and outlook), that use only one-word google "searches", and that browse and chat around all the time without proxies, bouncing against trackers and web-bugs and smearing all their personal data around.

Moreover the seeker is armed: his sharp browser will quickly cut to pieces any slimy javascript or rotting advertisement that the commercial beasts may have put on his way. His bots' jaws will tear apart any database defense, his powerful scripts will send perfectly balanced searchstrings far into the forest.



So, that was it. Any questions?











