Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#4035 closed Bug (fixed)

In seed state, Transmission disconnects from leechers

Reported by: athy
Owned by: jordan
Priority: Normal
Milestone: 2.22
Component: libtransmission
Version: 2.04
Severity: Normal
Keywords:
Cc:

Description

When seeding, Transmission disconnects from leechers because it thinks they are seeders.

The reason is that we currently use the size of the bitfield as the number of pieces when computing a peer's completion percentage.

This assumption is wrong: the bitfield only grows as far as it needs to.

Currently, a peer that has only the first and second pieces has a bitfield of "11", so number of true bits / number of bits = 2/2 = 100% ---> the peer looks like a seeder and we close the connection.

We either have to replace "number of bits" with "number of pieces", or always keep a full-size bitfield from the start.

This patch takes the first approach. It has not been compiled or tested on trunk, since I don't have libevent2, but the idea should be simple enough to follow :)
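
To make the arithmetic concrete, here is a small standalone illustration of the bug (a toy model, not Transmission's actual tr_bitfield API):

#include <stdio.h>

/* toy stand-in for a bitfield that has only been grown far enough
 * to hold the bits we've actually seen */
struct toy_bitfield
{
    const unsigned char * bits; /* one byte per bit, for simplicity */
    int bitCount;
};

static int countTrueBits( const struct toy_bitfield * b )
{
    int i, n = 0;
    for( i = 0; i < b->bitCount; ++i )
        if( b->bits[i] )
            ++n;
    return n;
}

int main( void )
{
    /* a peer that has pieces 0 and 1 of a 100-piece torrent,
     * but whose bitfield was only grown to hold those two bits */
    const unsigned char haveBits[2] = { 1, 1 };
    const struct toy_bitfield have = { haveBits, 2 };
    const int pieceCount = 100;

    /* buggy: the denominator is the grown-on-demand bitfield size */
    printf( "wrong: %.2f\n", (float) countTrueBits( &have ) / (float) have.bitCount ); /* 1.00 */

    /* fixed: the denominator is the torrent's real piece count */
    printf( "right: %.2f\n", (float) countTrueBits( &have ) / (float) pieceCount ); /* 0.02 */

    return 0;
}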

Attachments (4)

disco-leechers.patch (745 bytes) - added by athy 11 years ago.
disco-leechers-2.patch (1.6 KB) - added by jordan 11 years ago.
revised patch to reflect comment:4 and comment:5
disco-leechers-3.patch (4.7 KB) - added by jordan 11 years ago.
revised patch to reflect comment:8
disco-leechers-4.patch (5.9 KB) - added by jordan 11 years ago.
Revised patch based on comment:11 and comment:13


Change History (19)

Changed 11 years ago by athy

comment:1 Changed 11 years ago by jordan

  • Resolution set to invalid
  • Status changed from new to closed

Hi athy,

I appreciate the patch, but the premise is incorrect. When a peer sends a bitfield after handshaking, it's required to have pieceCount bits in it. Most clients explicitly disconnect from a peer whose bitfield is any other size.

comment:2 Changed 11 years ago by jordan

  • Resolution invalid deleted
  • Status changed from closed to reopened

comment:3 Changed 11 years ago by jordan

Ah, it is possible for the bitfield to be shorter than pieceCount. If we're a magnet link, and the peer sent us a `have none' on startup and then sent us a `have' message, then when we process the `have' we still have no idea how big a full-size bitfield should be, so we grow the array just far enough to hold the bit for that piece.

comment:4 Changed 11 years ago by jordan

Now, to the patch at hand:

if( peer->have.haveAll ) 
    msgs->peer->progress = 1.0; 
else if( peer->have.haveNone ) 
    msgs->peer->progress = 0.0; 
else 
    msgs->peer->progress =  (float) tr_bitfieldCountTrueBits( &peer->have.bitfield ) / (float) tor->info.pieceCount; 

First, if we're a magnet link, tor->info.pieceCount will be zero and this will crash. So that section needs to read:

else if( tr_torrentHasMetadata( peer->msgs->tor ) )
    msgs->peer->progress = (float) tr_bitfieldCountTrueBits( &peer->have.bitfield ) / (float) tor->info.pieceCount;

Second, what do we do if we don't have metadata? In the magnet -> have none -> have scenario, we've got no way of knowing... :)

Third, this is the only place that tr_bitsetPercent() is called, so if we don't need to call it anymore we should delete tr_bitsetPercent() too.

I've got no idea what to do about the second point except maybe making up a number like 50% and adding a comment explaining why... there really doesn't seem to be a Right Choice here.
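
Putting those three points together, the block under discussion would look something like the sketch below (a sketch only; the 0.5 fallback is just the "made up number" idea from the previous paragraph, not committed code):

if( peer->have.haveAll )
    msgs->peer->progress = 1.0;
else if( peer->have.haveNone )
    msgs->peer->progress = 0.0;
else if( tr_torrentHasMetadata( peer->msgs->tor ) )
    msgs->peer->progress = (float) tr_bitfieldCountTrueBits( &peer->have.bitfield )
                         / (float) tor->info.pieceCount;
else
    msgs->peer->progress = 0.5; /* magnet link, no metadata yet: we can't know */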

comment:5 follow-up: Changed 11 years ago by athy

I think the "best" solution when you don't have any metadata would be to divide by (bitfield.bitCount + 1).

Because if you're here, it means you haven't received a haveall/havenone, so you're likely to have received a bitfield. In that case, trueBits/bitCount is the exact progress.

If you haven't received a bitfield, it means you've just received one or more have messages. Then you have absolutely no clue what the progress might be, but it will get closer and closer to bitsetPercent. Anyway, this state won't last long, and peer->progress has been wrong for ages without anybody noticing it ...

The +1 is essential because we have to be sure not to mark a peer as a seed while he is still a leecher. Otherwise we might request the wrong pieces from him, and when we become a seed we will also close the connection to him. It also avoids dividing by 0 (even though that should not happen, since this function is only called after receiving some progress indicator, i.e. bitCount > 0).


Also note that this problem of variable-size bitfields is not limited to magnet links. A peer->have.bitfield only gets initialized to its final size when you get a BITFIELD message. When receiving HAVE messages, the bitfield is, each time, extended just enough to accept the new index.

In fact, the progress indicator has been wrong for roughly a third of leechers. And (when seeding) it got us disconnected from any peer that sent (a "HAVE NONE" followed by) a "HAVE 0" first, which is a lot of peers, because most clients put a higher priority on the first piece.
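
For a quick numeric check of the "+1" guard, take the HAVE 0 scenario from the description: the peer has announced exactly one piece and we have no metadata yet (standalone toy, not Transmission code):

#include <assert.h>

int main( void )
{
    const int trueBits = 1; /* one HAVE received */
    const int bitCount = 1; /* bitfield grown just far enough to hold it */

    const float progress = (float) trueBits / (float) ( bitCount + 1 ); /* 0.5 */

    assert( progress < 1.0f ); /* the peer is never mistaken for a seed */
    return 0;
}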

comment:6 in reply to: ↑ 5 Changed 11 years ago by jordan

Replying to athy:

I think the "best" solution when you don't have any metadata would be to divide by (bitfield.bitCount + 1).

Because if you're here, it means you haven't received a haveall/havenone, so you're likely to have received a bitfield. In that case, trueBits/bitCount is the exact progress.

This doesn't change the solution, but FWIW there are a couple of pitfalls there:

  • you can reach that point in the code without having received a bitfield, by route of an initial "have none" message + a "have" message.
  • trueBits/bitCount may not be the exact progress -- bitCount is sometimes an upper bound rather than a true count. If we don't have metainfo when we receive the bitfield, bitCount is set to msglen*8.

If you haven't received a bitfield, it means you've just received one or more have messages. Then you have absolutely no clue what the progress might be, but it will get closer and closer to bitsetPercent. Anyway, this state won't last long, and peer->progress has been wrong for ages without anybody noticing it ...

The +1 is essential because we have to be sure not to mark a peer as a seed while he is still a leecher. Otherwise we might request the wrong pieces from him, and when we become a seed we will also close the connection to him. It also avoids dividing by 0 (even though that should not happen, since this function is only called after receiving some progress indicator, i.e. bitCount > 0).

This all sounds right to me.

Also note that this problem of variable-size bitfields is not limited to magnet links. A peer->have.bitfield only gets initialized to its final size when you get a BITFIELD message. When receiving HAVE messages, the bitfield is, each time, extended just enough to accept the new index.

Yes. My first thought was that it would be cleaner to resize the bitfield to the correct width when we get that first "HAVE" message. My point about magnets was that they make that approach impossible.
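
To put numbers on the second pitfall above: until the metadata arrives, the only thing a BITFIELD message tells us is its byte length, so the recorded bitCount can overshoot the real piece count by up to 7 padding bits (standalone toy, not Transmission code):

#include <stdio.h>

int main( void )
{
    const int realPieceCount = 1001;                  /* unknown to us before the metadata arrives */
    const int msglen = ( realPieceCount + 7 ) / 8;    /* 126 bytes on the wire */
    const int assumedBitCount = msglen * 8;           /* 1008 -- an upper bound, not a true count */

    printf( "real piece count: %d\n", realPieceCount );
    printf( "bitCount we record: %d (overshoot: %d bits)\n",
            assumedBitCount, assumedBitCount - realPieceCount );
    return 0;
}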

Changed 11 years ago by jordan

revised patch to reflect comment:4 and comment:5

comment:7 Changed 11 years ago by jordan

athy, if this second patch looks right to you, I'll commit it.

comment:8 Changed 11 years ago by athy

The patch looks all good to me. It does fix the bug.

But now I'm thinking that, when we get the metadata, we should recompute the progress of our existing leechers to detect the seeders among them. A seeder will never send any new HAVE message, so its progress would otherwise never be recomputed (and we might have the opposite problem of not closing the connection).

I really don't know how magnet links are handled in Transmission, so that may not be the case ...

Changed 11 years ago by jordan

revised patch to reflect comment:8

comment:9 Changed 11 years ago by jordan

athy, that's a good point. Here's disco-leechers-3.patch, which adds these new changes:

  • updatePeerProgress() is now a public function, tr_peerMsgsUpdateProgress()
  • added a new peer-mgr function, tr_peerMsgsOnTorrentGotMetainfo(), that gets called when a magnet link finishes downloading the info dict. This function loops through all the peers' msgs objects and calls tr_peerMsgsUpdateProgress() on each of them (see the sketch after this list).
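
Roughly, the new hook might be shaped like this (a sketch under assumptions: the peer-list accessor and the peer->msgs field access are guesses at peer-mgr internals, not the actual patch):

void
tr_peerMsgsOnTorrentGotMetainfo( tr_torrent * tor )
{
    int i;
    int peerCount = 0;
    tr_peer ** peers = getConnectedPeers( tor, &peerCount ); /* hypothetical accessor */

    /* now that tor->info.pieceCount is finally known, every peer's
     * cached progress value may be stale, so recompute them all */
    for( i = 0; i < peerCount; ++i )
        tr_peerMsgsUpdateProgress( peers[i]->msgs );
}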

Thoughts?

Last edited 11 years ago by jordan (previous) (diff)

comment:10 Changed 11 years ago by jordan

  • Version changed from 2.04+ to 2.04

comment:11 Changed 11 years ago by athy

Sounds all good except for one detail:

updateProgress() only works on tr_peer structures. The tr_peermsgs object is only used to access its "peer" field, and what the function does has nothing to do with messages. Now that it's a public function, there is no reason to keep it in peer-msgs.[hc]; I suggest moving it to peer-mgr.[hc]. That's the place where I would look if I needed such a function.

Also rename the two functions: tr_peerMsgs* -> tr_peerMgr*

Sorry for only submitting comments but I can't run trunk at the moment :/

comment:12 Changed 11 years ago by jordan

  • Milestone changed from None Set to 2.22

comment:13 Changed 11 years ago by jordan

Attached is a new revision based on your suggestion about the division between peer-mgr and peer-msgs :)

Interestingly, the new tr_peerUpdateProgress() function eliminates the need for the PROGRESS event that was fired from peer-msgs, so that's been reflected in this patch too.
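
For reference, after the move the public declarations would plausibly live in peer-mgr.h along these lines (only tr_peerUpdateProgress() is named in this ticket; the exact signatures and the renamed hook's name are guesses based on the tr_peerMgr* naming suggested in comment:11):

/* recompute and cache a single peer's completion percentage */
void tr_peerUpdateProgress( tr_torrent * tor, tr_peer * peer );

/* called when a magnet link finishes downloading its info dict;
 * walks the torrent's peers and refreshes each one's progress */
void tr_peerMgrOnTorrentGotMetainfo( tr_torrent * tor );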

Changed 11 years ago by jordan

Revised patch based on comment:11 and comment:13

comment:14 Changed 11 years ago by jordan

  • Resolution set to fixed
  • Status changed from reopened to closed

disco-leechers-4.patch applied in r12022.

comment:15 Changed 11 years ago by jordan

Backported to 2.2x in r12030.
