Opened 7 years ago

Last modified 7 years ago

#5530 new Bug

Reusing the DHT id allows users to be tracked

Reported by: thiemo.nagel Owned by:
Priority: Normal Milestone: None Set
Component: Transmission Version: 2.52
Severity: Normal Keywords:
Cc:

Description

Hello developers,

it seems that Transmission creates a DHT id upon first start and then re-uses this id perpetually. As far as I understand Kademlia, this allows third parties to track the user's current IP address by performing lookups for their DHT id. I believe that this violates some (many?) users' expectation of anonymity when using Transmission.

I understand that DHT id reuse has some practical advantages because DHT routing information can be kept across Transmission restarts, but I don't think that this advantage outweighs the risk to users' anonymity.

To avoid this risk while still allowing limited caching of routing information, I suggest re-generating the DHT id whenever the (public) IP address changes, and keeping it otherwise.

What do you think?

Cheers, Thiemo

Change History (4)

comment:1 Changed 7 years ago by pathetic_loser

In principle, it can allow some tracking (matching a DHT ID to some IPs). If this is a concern for you, you may want to delete (or patch?) the dht.dat file from time to time. On the other hand, restarting the DHT with the same ID can improve connectivity, especially if you shut the client down for a short time and then start it again.

As for privacy: you see, you connect to other peers directly, exposing your IP to them. So a determined attacker can request your identity from your ISP anyway, etc.

comment:2 Changed 7 years ago by thiemo.nagel

Imagine the following attack scenario:

  1. At some point in time, an attacker eavesdrops on the communication from a particular machine and records the DHT id. (That could happen through a legal or illegal wiretap at the ISP, or e.g. by recording wifi traffic in a hotel lobby.)
  2. From then on, the person's IP (and thus approximate location) can be tracked through DHT lookups, regardless of whether they are downloading the same torrent or a different torrent (or simply have Transmission open and are downloading no torrent at all).

Or maybe with a slightly different twist:

  1. A dissident living in a totalitarian state wants to publish her accounts of the government's atrocities through Transmission. While uploading the incriminating data, she takes great caution to protect her identity and tunes into the local wifi in a dark corner of a crowded cafe.
  2. The next day, she's back at home and uses Transmission to get the latest Debian ISO.
  3. The secret police come knocking.
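
Both scenarios boil down to the same mechanism, which can be sketched with a toy model. This is not the real Kademlia protocol: the "network" is reduced to a table from DHT id to the address the node currently uses, and the function name `tracked_ips` and the example addresses are made up for illustration.

```python
import os

def tracked_ips(known_id, sessions):
    """Return the IPs an attacker who recorded known_id can recover.

    sessions is a chronological list of (dht_id, ip) pairs, one per client
    start. Toy assumption: a lookup for an id returns the address of the
    node currently using that id, if there is one.
    """
    seen = []
    for dht_id, ip in sessions:
        network = {dht_id: ip}          # the client joins the DHT
        hit = network.get(known_id)     # the attacker looks up the recorded id
        if hit is not None:
            seen.append(hit)
    return seen

cafe, home = "198.51.100.7", "203.0.113.42"

# Same id at the cafe and at home: both addresses are recovered.
reused = os.urandom(20)
print(tracked_ips(reused, [(reused, cafe), (reused, home)]))

# Fresh id at home: the id recorded at the cafe no longer leads anywhere new.
first, second = os.urandom(20), os.urandom(20)
print(tracked_ips(first, [(first, cafe), (second, home)]))
```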

Bottom line: I think this issue is relevant for more people than just me and that an attempt should be made to find a solution that improves anonymity without significantly compromising performance.

To alleviate the problem, I suggest re-generating the DHT id a) at regular intervals, say every 24 hours, and b) (optionally) whenever the client detects that the public IP address has changed, since in that case the DHT routing is already half-broken anyway.

comment:3 Changed 7 years ago by jordan

The DHT code used in Transmission is written by Juliusz Chroboczek and its upstream location is http://www.pps.univ-paris-diderot.fr/~jch/software/bittorrent/ ... you may wish to request this feature upstream to Juliusz.

comment:4 Changed 7 years ago by thiemo.nagel

I've thought about this some more, and I believe that a) is not a valid option: the fact that a client changes its DHT id is easily observable by an attacker who dht_pings it at regular intervals. The client will readily reveal its new DHT id in its response to ping requests.

For any observer, there is a bijective mapping between IP address and DHT id: knowledge of either one is enough to determine the other. Therefore, to foil long-term tracking of clients, both IP address and DHT id must be changed at the same time. Only changing the DHT id while the client is running, as I've proposed above, doesn't cut it, since the odds are minuscule that the IP address happens to change at the same time.
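
To make the argument concrete, here is a minimal sketch of what an observer can link. It assumes a simplified model in which the observer can step from one client state to the next whenever either the IP address or the DHT id stays the same (a ping to the known address reveals a new id, a lookup of the known id reveals a new address). The function name `linked_prefix` and the sample values are hypothetical.

```python
def linked_prefix(states):
    """Count how many consecutive client states an observer can link.

    states is a chronological list of (ip, dht_id) pairs for one client.
    The observer starts out knowing the first state and can follow the
    client to the next state only if the IP or the id is unchanged.
    """
    if not states:
        return 0
    linked = 1
    for (prev_ip, prev_id), (ip, dht_id) in zip(states, states[1:]):
        if ip == prev_ip or dht_id == prev_id:
            linked += 1
        else:
            break               # both changed at once: the trail ends here
    return linked

# Id rotated while running (same IP), then the IP changes with the id kept:
# the observer follows every step.
print(linked_prefix([("198.51.100.7", "id-1"),
                     ("198.51.100.7", "id-2"),
                     ("203.0.113.42", "id-2")]))

# IP and id change together: only the first state is known.
print(linked_prefix([("198.51.100.7", "id-1"),
                     ("203.0.113.42", "id-2")]))
```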

The only conclusion that I can draw from that is to re-generate the DHT id upon start-up (possibly only if it is older than a certain, preferably very short, amount of time), just in case the user's public IP address has changed. To alleviate the DHT startup performance degradation, the old DHT id may be kept in specific situations (UPnP?) that allow the client to confirm that the public IP address has not changed.
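
A rough sketch of that start-up decision, just to illustrate the idea. This is not Transmission's or the DHT library's code: `id_for_startup`, `MAX_AGE_SECONDS` and the one-hour default are assumptions picked for the example, and how the client would confirm an unchanged IP address is left open.

```python
import os
import time

ID_LEN = 20              # BitTorrent DHT node ids are 160 bits
MAX_AGE_SECONDS = 3600   # hypothetical "very short" lifetime, an assumption

def id_for_startup(stored_id, created_at, ip_confirmed_unchanged,
                   now=None, max_age=MAX_AGE_SECONDS):
    """Decide which DHT id to use at start-up.

    Returns (dht_id, created_at). The stored id is kept only if the client
    could confirm that its public IP has not changed, or if the id is still
    younger than max_age; otherwise a fresh random id is generated.
    """
    if now is None:
        now = time.time()
    if stored_id is not None:
        young_enough = (now - created_at) <= max_age
        if ip_confirmed_unchanged or young_enough:
            return stored_id, created_at
    return os.urandom(ID_LEN), now

# Example: an id saved two days ago, no way to confirm the IP -> new id.
old_id = os.urandom(ID_LEN)
new_id, stamp = id_for_startup(old_id, time.time() - 2 * 86400, False)
print(new_id != old_id)
```

A user-facing setting could then simply map to `max_age`: zero for a new id on every start, a very large value for keeping the id for a long time.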

It might make sense to expose the issue to the user (e.g. in the network settings) and let her choose whether she'd like more privacy, with the DHT id re-generated upon every start of the client, or whether DHT startup performance is more important and the same id is to be kept for a long time, or even indefinitely.

(I haven't contacted upstream yet, since it (now) seems to me that the issue can only be solved in the client code. It would probably be sensible, though, to ask upstream to update his README to reflect the anonymity concerns over the DHT id.)
