~ Google malbehaves ~
(Courtesy of fravia's advanced searching lores)
First published @ www.searchlores.org in August 2005
(This is version 0.05: August 2005)
Google is tracking your clicks
by ritz
with additions by Netzen (and vvf)
with a further addition (by fravia+)
Google is tracking your clicks!
While playing around with the google results page in order to write some
customizing Opera user javascript, I noticed this snippet of code:
function rwt(el,ct,cd,sg){
el.href="/url?sa=t&ct="+escape(ct)+"&cd="+escape(cd)+"&url="+escape(el.href).replace(/\+/g,"%2B")+"&ei=gyoLQ96BI8fYwgGNseDZDQ"+sg;
el.onmousedown="";
return true;
}
It is activated by the mousedown event, which fires just before the click on the link goes through:
<a href="http://..." onmousedown="return rwt(this,'res','4','')">
As you can see, it carries four parameters:
1. this : called "el" (for element?), the DOM object containing the hyperlink. The expression 'this.href' contains the URL of the hyperlink and can be modified to whatever you want, just before the link is clicked.
2. 'res' : called "ct", probably telling google that the link came from the search results.
3. '4' : called "cd", the number of the search result. In this case it was the fourth 'hit' on google.
4. '' : called "sg", appended at the end of the url, probably for transferring any other information that doesn't fall into "ct" or "cd".
and what does the function do? "rwt" obviously stands for "rewrite".
this is the line that does it all:
el.href="/url?sa=t&ct="+escape(ct)+"&cd="+escape(cd)+"&url="+escape(el.href).replace(/\+/g,"%2B")+"&ei=gyoLQ96BI8fYwgGNseDZDQ"+sg;
Let's say your URL is "http://somewhere.com" and it's the 5th result on the SERP you received.
Then this is what happens:
you see the link http://somewhere.com, you hover over it, see
"http://somewhere.com" in your statusbar, so you think you will go to http://somewhere.com and you click on it, BUT THEN! the mousedown event fires and quickly ReWriTes this url to:
/url?sa=t&ct=res&cd=5&url=http%3A//somewhere.com&ei=3427ABCDbla
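To try the rewrite without loading a results page, the one-liner can be pulled out into a plain function. This is only a sketch: buildTrackingUrl and its ei argument are names invented here (on the real page the ei value is baked into the script), and it leans on the old global escape(), which turns the colon into %3A but leaves the slashes alone.

```javascript
// Sketch of the rewrite performed by rwt(), as a pure function.
// buildTrackingUrl and the "ei" parameter are hypothetical names for illustration;
// in the original page the ei value is hardcoded into the script.
function buildTrackingUrl(href, ct, cd, ei, sg) {
  return "/url?sa=t&ct=" + escape(ct) +
    "&cd=" + escape(cd) +
    "&url=" + escape(href).replace(/\+/g, "%2B") +
    "&ei=" + ei + sg;
}

const rewritten = buildTrackingUrl("http://somewhere.com", "res", "5", "3427ABCDbla", "");
console.log(rewritten);
// /url?sa=t&ct=res&cd=5&url=http%3A//somewhere.com&ei=3427ABCDbla
```

The point of keeping it pure is that nothing in the rewrite depends on the page: everything google learns is in the four arguments plus the ei code.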
("ei" is probably some kind of tracking identification code uniquely connected
to your cookie or search-query)
so your browser fires a GET request to www.google.com for this url, google
kindly saves all the (extra! and unnecessary) information your browser is
spewing, and the google.com server gives you back a 302 FOUND header:
HTTP/1.1 302 Found
Cache-Control: private
Location: http://somewhere.com
Content-Type: text/html
Server: GWS/2.1
Content-Length: 155
Date: Tue, 23 Aug 2005 16:19:24 GMT
which redirects you to the actual site you thought you were clicking
on in the first place. Of course you will never notice this little "detour"
your browser is taking, because google's servers are so darn fast, and the
information amount sent is tiny.
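What the detour has to do on the server side can be guessed at with a few lines. The sketch below is purely hypothetical: handleTrackedClick and the log array are invented names, and nothing here is known about google's actual server code. It only shows the minimum such an endpoint needs: read the parameters, keep them, answer with a 302.

```javascript
// Hypothetical redirect endpoint: parse the tracking URL, "log" the fields,
// and answer with a 302 pointing at the real target.
// This is NOT Google's server code, only a guess at the minimum it has to do.
function handleTrackedClick(requestPath, log) {
  // A base is needed because requestPath is relative; the host is irrelevant here.
  const params = new URL(requestPath, "http://www.google.com").searchParams;
  const target = params.get("url"); // percent-decoding is done by searchParams
  log.push({ ct: params.get("ct"), cd: params.get("cd"), ei: params.get("ei"), url: target });
  return { status: 302, headers: { Location: target } };
}

const clickLog = [];
const response = handleTrackedClick(
  "/url?sa=t&ct=res&cd=5&url=http%3A//somewhere.com&ei=3427ABCDbla", clickLog);
console.log(response.status, response.headers.Location);
// 302 http://somewhere.com
console.log(clickLog[0].cd);
// 5
```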
Now this is -in itself- nothing new, alltheweb has been doing it for ages, but
at least it doesn't try to do it secretly! you clearly see that the link you
click on doesn't go immediately to your webpage, but to some kind of ads.yahoo
server instead, where you get redirected in much the same way as google does.
I think this is sneaky and not very nice of mr google. I also wonder since
*when* they have been doing this. I'm quite sure I looked at google's SERPs
some years ago and it wasn't there.
Well i know one thing for sure: THIS is going in my opera user script:
// ==UserScript==
// @namespace ritz.orange
// @name google distracktion
// @description removes one tracking script from google
// @include http://www.google.com/search?*
// ==/UserScript==
window.opera.defineMagicFunction('rwt',
function ( oRealFunc, oThis, el,ct,cd,sg) {
// do nothing!
return true;
}
);
Copy/paste, save to googledistrackt.js, in your Opera Userscript directory
(set that via prefs>advanced>content>javascript options>my javascript files)
.. just tested it, it works exactly as advertised in my opera 8.02 and you
know what, I think I can even notice a slight speed increase when loading the
page I want after clicking the link in my google SERP..
(copyright 2005 - ritz)
Addendum by - ritz:
I should of course be writing as crossbrowser code as possible, so
here is one that i think should work in firefox greasemonkey as well
as opera:
// ==UserScript==
// @namespace ritz.orange
// @name google distracktion
// @description removes one tracking script from google
// @include http://www.google.com/search?*
// ==/UserScript==
window.addEventListener('load',
function(){
window.rwt = function(){return true};
},
false);
I just tried it out, and it seems to work. except when google tried to redirect
me to my local google, which didn't match www.google.com/search?* .. so maybe
you have to adapt the @include line a bit. (or tame your ffox browser - why does
it redirect me in the first place?)
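Why the local google falls through the @include line is easy to check by hand. The snippet below is a simplified model of "*" wildcard matching, assuming the wildcard simply means "any run of characters"; includeToRegExp is a hypothetical helper and not the matcher Greasemonkey or Opera actually ships.

```javascript
// Simplified model of "*" wildcard matching in an @include line.
// includeToRegExp is a hypothetical helper, not Greasemonkey's or Opera's real matcher.
function includeToRegExp(pattern) {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&"); // neutralise regex metacharacters
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");   // "*" matches any run of characters
}

const strict = includeToRegExp("http://www.google.com/search?*");
const loose = includeToRegExp("http://www.google.*/search?*");
const localUrl = "http://www.google.de/search?q=fravia";

console.log(strict.test(localUrl)); // false
console.log(loose.test(localUrl));  // true
```

Mind that a pattern as loose as www.google.* would also match a host like www.google.example.org, so the wider include trades precision for coverage.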
You can use Proxomitron for the same purpose
[Patterns]
Name = "Kill Google tracking clicks by Netzen"
Active = TRUE
URL = "www.google.com"
Bounds = "[script*[/script]"
Limit = 256
Match = "*(function clk|function ga)*"
Replace = "[!-- Killed Google tracking clicks JavaScript --]"
Netzen
If you try this:
[Patterns]
Name = "Kill Google tracking clicks by Netzen"
Active = TRUE
URL = "www.google.com"
Bounds = "[script*[/script]"
Limit = 256
Match = "*(function clk|function ga)*"
Replace = "[!-- Killed Google tracking clicks JavaScript --]"
and it doesn't work, try increasing "Limit" to 1024, it should do the trick.
Also be careful, in "Bounds", to replace square brackets by angle brackets.
If you're entering this info via the Proxo/Web Pages option, you must also remove all double quote marks.
vvf
The "sneakiness" of this google approach is due to the fact that all links in the SERP have the
normal HREF attribute, so when you hover your mouse over a link you see the clean target URL in the status bar.
Only when you actually click does the mousedown event handler invoke the Javascript function described by ritz.
You can check everything at ease with -for instance- ipticker.
This might -also- be a system for tracking user browsers: firefox & opera versus M$IE.
Netzen, with his "ga" addition is referring to the fourth function below:
<script>
<!--
function ss(w){window.status=w;return true;}
function cs(){window.status='';}
function rwt(el,ct,cd,sg){
el.href="/url?sa=t&ct="+escape(ct)+"&cd="+escape(cd)+"&url="+escape(el.href).replace(/\+/g,"%2B")+"&ei=wAsMQ8mKI6bWwgGuuaXZBw"+sg;
el.onmousedown="";
return true;
}
function ga(o,e){
if (document.getElementById){
a=o.id.substring(1); p = ""; r = ""; g = e.target;
if (g) { t = g.id; f = g.parentNode; if (f) {p = f.id; h = f.parentNode; if (h) r = h.id; }}
else {h = e.srcElement; f = h.parentNode; if (f) p = f.id; t = h.id; }
if (t==a || p==a || r==a) return true;
location.href=document.getElementById(a).href
}
}
//-->
</script>
in my opinion this has mainly to do with their sponsored crap:
http://www.google.com/url?sa=t&ct=res&cd=1&url=http%3A//www.searchlores.org/&ei=wAsMQ8mKI6bWwgGuuaXZBw
sa=t --> "t" for normal links, "l" for sponsored links
ct=res --> "res" for normal links, "pro" for those sponsored results above normal links
cd=1 --> link position for normal links, special code for sponsored links
url=http%3A//www.searchlores.org/ --> URL-encoded version of target
ei=wAsMQ8mKI6bWwgGuuaXZBw --> a base-64 encoded request number imo corresponding to the target being searched
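The reading of the parameters given in this list can be turned into a small decoder. Take it as a sketch resting on guesses: the meanings of sa, ct, cd and ei are the opinions stated here, not anything google documents, and describeTrackingUrl is a hypothetical helper name.

```javascript
// Field meanings below are fravia's guesses from the text, not documented by Google.
// describeTrackingUrl is a hypothetical helper name.
function describeTrackingUrl(trackingUrl) {
  const params = new URL(trackingUrl).searchParams;
  const sponsored = params.get("sa") === "l";
  return {
    kind: sponsored ? "sponsored link" : "normal link",
    source: params.get("ct"),
    position: params.get("cd"),
    target: params.get("url"),
    requestId: params.get("ei"),
  };
}

const info = describeTrackingUrl(
  "http://www.google.com/url?sa=t&ct=res&cd=1&url=http%3A//www.searchlores.org/&ei=wAsMQ8mKI6bWwgGuuaXZBw");
console.log(info.kind, info.position, info.target);
// normal link 1 http://www.searchlores.org/
```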
Also let's not be too harsh with google. Do not forget that
other search engines are even worse.
Yahoo has a HARDCODED redirect in its URLs, for instance the apparent
http://search.yahoo.com/search?p=fravia&sm=Yahoo%21+Search&fr=FP-tab-web-t&toggle=1&cop=&ei=UTF-8
is in reality (today)
http://rds.yahoo.com/S=2766679/K=fravia/v=2/SID=e/TID=F548_118/l=WS1/R=1/IPC=be/SHE=0/H=2/;_ylt=AeUuhC1bosym47SMr4uRX5xXNyoC/SIG=11fclpehd/EXP=1124327757/*-http%3A//www.searchlores.org/
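Getting the real target back out of such a Yahoo link takes only a few lines. The sketch assumes the target always follows a "/*-" marker and is URL-encoded, which is all that can be read off the single example above; yahooTarget is a hypothetical helper name.

```javascript
// Assumes the real target always follows the "/*-" marker, as in the single example above.
// yahooTarget is a hypothetical helper name.
function yahooTarget(redirectUrl) {
  const marker = "/*-";
  const at = redirectUrl.indexOf(marker);
  if (at === -1) return null; // no marker: not a redirect of this shape
  return decodeURIComponent(redirectUrl.slice(at + marker.length));
}

const rds = "http://rds.yahoo.com/S=2766679/K=fravia/v=2/SID=e/TID=F548_118/l=WS1/R=1/IPC=be/SHE=0/H=2/;_ylt=AeUuhC1bosym47SMr4uRX5xXNyoC/SIG=11fclpehd/EXP=1124327757/*-http%3A//www.searchlores.org/";
console.log(yahooTarget(rds));
// http://www.searchlores.org/
```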
f+
(c) III Millennium: [fravia+], all rights reserved