Curiosity is bliss    Archive    Feed    About    Search

Julien Couvreur's programming blog and more

Sneaky web user tracking hack


Slashdot had a story today on how to track users even when cookies get cleared, using the browser cache. This is not a new technique, but it certainly is sneaky.
I believe that they are many other covert channels to store information.
Here's one that may not very pretty, but that would allow persisting the user's identity even after clearing the cookies and the browser cache. Instead, this technique would rely on the browser's history.

The first step is the ability to tell, from javascript, whether a link was visited or not. This is made possible by using CSS to render visited links differently than unvisited links and then checking with javascript how the links were rendered. See more details about this here and here.

The second step is to store the user's identifier in his browser's history by loading a number of urls, such that you can read it later using the :visited hack above. Let's take user number 0101 (in binary) as an example. When that user first visists the site, two invisible iframes are rendered in the page, opening the urls http://site/0100 and http://site/0001 respectively.

The final step is to recover this identifier on subsequent page loads. To do this, you would check, on the client-side, whic of the urls http://site/1000, http://site/0100, http://site/0010 and http://site/0001 have been previously visited by the user. Both 0100 and 0001 have been, but 1000 and 0010 do not, which let's you piece back the user's identifier: 0101.
If none of the urls have been visited, which may happen when the browser's history is cleared or some of its data expired, then the identifier needs to be restored from another source (cookies, browser cache, etc.).
Similarly, when the value stored in the cookies or the browser cache is missing, it could be recovered from the browser's history.

Anyways, this is just a thought experiment, as I have not tried implementing this hack. It's not very efficient, but the point is that even small covert channels can be exploited, with some work, to store more information.

comments powered by Disqus