Opened 9 years ago
Last modified 7 years ago
#5505 new Bug
Assertion failed: (!tr_peerIsSeed (&msgs->peer)), function tr_peerMsgsCalculateActive
Reported by: | x190 | Owned by: | jordan |
---|---|---|---|
Priority: | Normal | Milestone: | 3.00 |
Component: | libtransmission | Version: | 2.82 |
Severity: | Normal | Keywords: | |
Cc: | peer-msgs, crash |
Description
Assertion failed: (!tr_peerIsSeed (&msgs->peer)), function tr_peerMsgsCalculateActive, file /Users/transmission/jenkins/workspace/trunk-mac/libtransmission/peer-msgs.c, line 706.
https://trac.transmissionbt.com/attachment/ticket/5493/Transmission%20r14203%20Crash.log
https://forum.transmissionbt.com/viewtopic.php?f=4&t=15004&p=66100#p66098
Thread 2 Crashed: 0 libsystem_kernel.dylib 0x00007fff8ee76212 pthread_kill + 10 1 libsystem_c.dylib 0x00007fff8ba35b24 pthread_kill + 90 2 libsystem_c.dylib 0x00007fff8ba79f61 abort + 143 3 libsystem_c.dylib 0x00007fff8ba7acb9 assert_rtn + 146 4 org.m0k.transmission 0x00000001000879c0 tr_peerMsgsCalculateActive + 121 5 org.m0k.transmission 0x000000010008a051 readBtMessage + 708 6 org.m0k.transmission 0x0000000100089a47 canRead + 1432 7 org.m0k.transmission 0x0000000100080325 canReadWrapper + 245 8 org.m0k.transmission 0x000000010007eea6 event_read_cb + 327 9 org.m0k.transmission 0x00000001000b12c7 event_base_loop + 1149 10 org.m0k.transmission 0x000000010007ac77 libeventThreadFunc + 145 11 org.m0k.transmission 0x000000010007311a ThreadFunc? + 15 12 libsystem_c.dylib 0x00007fff8ba34772 _pthread_start + 327 13 libsystem_c.dylib 0x00007fff8ba211a1 thread_start + 13
Attachments (2)
Change History (47)
comment:1 Changed 9 years ago by x190
comment:2 Changed 9 years ago by x190
2.82 + (14206)
comment:3 Changed 9 years ago by x190
comment:4 Changed 9 years ago by x190
Also occurring on 10.9.
comment:5 follow-up: ↓ 11 Changed 9 years ago by jordan
comment:6 Changed 9 years ago by Robby
comment:7 Changed 9 years ago by x190
canRead->readBtMessage->tr_peerMsgsUpdateActive->tr_peerMsgsCalculateActive
->CRASH
Maybe tr_peerMsgsUpdateActive() needs a lock so that peer-mgr.c doesn't change the peers status in tr_peerUpdateProgress() until we're done calculating?
comment:8 Changed 9 years ago by CrazyFrog
I think I got the same situation with the latest stable build, Am I right? The process is dying 2/3 times a day.
transmission-daemon: peer-msgs.c:703: tr_peerMsgsCalculateActive: Assertion `!tr_peerIsSeed (&msgs->peer)' failed.
comment:9 follow-up: ↓ 15 Changed 9 years ago by CrazyFrog
Well, Just did a kick & dirty patch on the svn source to add the x190's suggestion. Just 5 minutes active but seens to be working fine (I was affraid of dead locks). I'll keep it running for a few days to see if it's more stable.
comment:10 Changed 9 years ago by CrazyFrog
The lock didn't solve the problem but, when building I noticed a warning about structure alignment...
peer-msgs.c:2473: warning: cast increases required alignment of target type
comment:11 in reply to: ↑ 5 Changed 9 years ago by x190
comment:12 Changed 9 years ago by iSh0w
The issue /log I posted was wrt #5644, just want to post an update! I was trying to understand why Transmission was crashing and tried to retrace my steps!
Ok, so I had a Seagate GofLex? NET NAS ans used transmission-daemon on that which sat on a pogoplug firmware modded with optware. It used a JSON file for settings of transmission. I used a lil script I wrote to parse a .txt file and fetch all the 200+ magnets in it. All was well.
I then moved to Windows, where I used transmission for windows and copied all the preferences files from my Seagate GoFlex? NET NAS to the appropriate Transmission preferences folder in windows, basically migrated from the NAS to the windows version of Transmission.
Ultimately, migrated from the windows version of Transmission to the Mac OS X version and migrated the preference files!
Today, I had a look at my preferences files and they looked to be in bad shape, the JSON file was still there, I don't even know if transmission for Mac needed that, coz it stores preferences in another .plist file.
I decided to delete and start from scratch, i.e. copy all the magnet links to clipboard, save it in a .txt document, delete all my preferences and old transmission files from /Application Support/Transmission?!
I then added each magnet link manually via the remote browser gui!
The Transmission app does not crash anymore!
Furthermore, I will wait and watch as there might be a possibility that an app called "TV Shows 2" may be the cause of the problem as well, as, that app adds magnet links of my latest subscribed TV Shows to Transmission, at which point it may have been crashing.
I am writing this huge post only to help others who are facing similar problems and who might have been doing/done the same thing I did!
/iSh0w
comment:13 follow-up: ↓ 14 Changed 8 years ago by CrazyFrog
Still hapenning here after a clean install from svn.
comment:14 in reply to: ↑ 13 Changed 8 years ago by cfpp2p
Replying to CrazyFrog:
Still hapenning here after a clean install from svn.
confirmed here too: ticket:5782#comment:4
comment:15 in reply to: ↑ 9 Changed 8 years ago by x190
x190:
canRead->readBtMessage->tr_peerMsgsUpdateActive->tr_peerMsgsCalculateActive
->CRASH Maybe tr_peerMsgsUpdateActive() needs a lock so that peer-mgr.c doesn't change the peers status in tr_peerUpdateProgress() until we're done calculating?
Replying to CrazyFrog:
Well, Just did a kick & dirty patch on the svn source to add the x190's suggestion. Just 5 minutes active but seens to be working fine (I was affraid of dead locks). I'll keep it running for a few days to see if it's more stable.
@CrazyFrog?, how about providing your patch so we can debug it? ;-)
comment:17 follow-up: ↓ 18 Changed 8 years ago by spaam
with that patch i get:
Program received signal SIGPIPE, Broken pipe. [Switching to Thread 0x7ffff2ac5700 (LWP 10588)] 0x00007ffff64b38f0 in __libc_writev (fd=56, vector=0x7ffff2ac4060, count=1) at ../sysdeps/unix/sysv/linux/writev.c:54 54 ../sysdeps/unix/sysv/linux/writev.c: No such file or directory. (gdb) run The program being debugged has been started already. Start it from the beginning? (y or n) n Program not restarted. (gdb) bt #0 0x00007ffff64b38f0 in __libc_writev (fd=56, vector=0x7ffff2ac4060, count=1) at ../sysdeps/unix/sysv/linux/writev.c:54 #1 0x00007ffff77945b3 in ?? () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5 #2 0x00007ffff7797df3 in evbuffer_write_atmost () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5 #3 0x0000000000444936 in tr_evbuffer_write (io=0x7fffea7c3140, fd=56, howmuch=314) at peer-io.c:317 #4 0x0000000000447348 in tr_peerIoTryWrite (io=0x7fffea7c3140, howmuch=314) at peer-io.c:1286 #5 0x000000000044750e in tr_peerIoFlush (io=0x7fffea7c3140, dir=TR_CLIENT_TO_PEER, limit=3000) at peer-io.c:1320 #6 0x000000000043812b in phaseOne (peerArray=0x7ffff2ac4d20, dir=TR_CLIENT_TO_PEER) at bandwidth.c:220 #7 0x0000000000438359 in tr_bandwidthAllocate (b=0x6cae20, dir=TR_CLIENT_TO_PEER, period_msec=500) at bandwidth.c:274 #8 0x000000000044ff7a in bandwidthPulse (foo=-1, bar=1, vmgr=0x7fffec004f90) at peer-mgr.c:3618 #9 0x00007ffff77903cc in event_base_loop () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5 #10 0x0000000000421310 in libeventThreadFunc (veh=0x6cb4f0) at trevent.c:246 #11 0x0000000000408a91 in ThreadFunc (_t=0x6aee80) at platform.c:109 #12 0x00007ffff67850a4 in start_thread (arg=0x7ffff2ac5700) at pthread_create.c:309 #13 0x00007ffff64bac2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 (gdb) bt full #0 0x00007ffff64b38f0 in __libc_writev (fd=56, vector=0x7ffff2ac4060, count=1) at ../sysdeps/unix/sysv/linux/writev.c:54 resultvar = 18446744073709551584 oldtype = 0 result = <optimized out> #1 0x00007ffff77945b3 in ?? () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5 No symbol table info available. #2 0x00007ffff7797df3 in evbuffer_write_atmost () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5 No symbol table info available. #3 0x0000000000444936 in tr_evbuffer_write (io=0x7fffea7c3140, fd=56, howmuch=314) at peer-io.c:317 e = 4425595 n = 0 errstr = "\000\000\000\000\320\a\000\000\260\257l\000\000\000\000\000\360\b\000\354\377\177\000\000\001\000\000\000\000\000\000\000\060I\254\362\377\177\000\000 \256l\000\001\000\000\000PI\254\362\377\177\000\000\020\210C\000\000\000\000\000PI\254\362\377\177\000\000U%B\000\001\000\000\000\240[r\037\001\000\000\000 \256l\000\000\000\000\000\240I\254\362\377\177\000\000\003\206C\000\000\000\000\000\240[r\037\001\000\000\000:\001\000\000\000\000\000\000\240[r\037I\001\000\000 \256l\000\000\000\000\000\060J\254\362\377\177", '\000' <repeats 23 times>, "H\205@\360I\254\362\377\177\000\000J\207C", '\000' <repeats 13 times>... #4 0x0000000000447348 in tr_peerIoTryWrite (io=0x7fffea7c3140, howmuch=314) at peer-io.c:1286 e = 0 n = 0 old_len = 314 #5 0x000000000044750e in tr_peerIoFlush (io=0x7fffea7c3140, dir=TR_CLIENT_TO_PEER, limit=3000) at peer-io.c:1320 bytesUsed = 0 __PRETTY_FUNCTION__ = "tr_peerIoFlush" #6 0x000000000043812b in phaseOne (peerArray=0x7ffff2ac4d20, dir=TR_CLIENT_TO_PEER) at bandwidth.c:220 i = 2 increment = 3000 bytesUsed = 32767 n = 4 peerCount = 4 peers = 0x7fffec1bf530 #7 0x0000000000438359 in tr_bandwidthAllocate (b=0x6cae20, dir=TR_CLIENT_TO_PEER, period_msec=500) at bandwidth.c:274 i = 4 peerCount = 4 tmp = {items = 0x7fffec1bf020, n_items = 4, n_alloc = 32} low = {items = 0x7fffec0c9dc0, n_items = 4, n_alloc = 32} high = {items = 0x0, n_items = 0, n_alloc = 0} normal = {items = 0x7fffec1bf530, n_items = 4, n_alloc = 32} peers = 0x7fffec1bf020 #8 0x000000000044ff7a in bandwidthPulse (foo=-1, bar=1, vmgr=0x7fffec004f90) at peer-mgr.c:3618 tor = 0x7fffec005060 mgr = 0x7fffec004f90 session = 0x6cac10 #9 0x00007ffff77903cc in event_base_loop () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5 No symbol table info available. #10 0x0000000000421310 in libeventThreadFunc (veh=0x6cb4f0) at trevent.c:246 base = 0x7fffec0008f0 eh = 0x6cb4f0 #11 0x0000000000408a91 in ThreadFunc (_t=0x6aee80) at platform.c:109 t = 0x6aee80 #12 0x00007ffff67850a4 in start_thread (arg=0x7ffff2ac5700) at pthread_create.c:309 __res = <optimized out> pd = 0x7ffff2ac5700 now = <optimized out> unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140737264768768, 5837302229518105092, 1, 140737354125408, 0, 140737264768768, -5837326834843928060, -5837316560835855868}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}} not_first_call = <optimized out> pagesize_m1 = <optimized out> sp = <optimized out> freesize = <optimized out> __PRETTY_FUNCTION__ = "start_thread" #13 0x00007ffff64bac2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
comment:18 in reply to: ↑ 17 Changed 8 years ago by mike.dld
Replying to spaam:
with that patch i get: ...
That's not really an issue. When debugging you need to ignore SIGPIPE and pass it to the program. Before you start, execute handle SIGPIPE pass nostop (for GDB) or process handle SIGPIPE -p yes -s no (for LLDB).
comment:19 Changed 8 years ago by spaam
everything looked good with that patch and that ignore thing this morning i saw this:
transmission-daemon: peer-msgs.c:708: tr_peerMsgsCalculateActive: Assertion `!tr_peerIsSeed (&msgs->peer)' failed. Program received signal SIGABRT, Aborted. [Switching to Thread 0x7ffff2ac5700 (LWP 25111)] 0x00007ffff640a077 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb) (gdb) bt #0 0x00007ffff640a077 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007ffff640b458 in __GI_abort () at abort.c:89 #2 0x00007ffff6403196 in __assert_fail_base (fmt=0x7ffff65398c8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x48aced "!tr_peerIsSeed (&msgs->peer)", file=file@entry=0x48abc6 "peer-msgs.c", line=line@entry=708, function=function@entry=0x48b9b0 <__PRETTY_FUNCTION__.10118> "tr_peerMsgsCalculateActive") at assert.c:92 #3 0x00007ffff6403242 in __GI___assert_fail (assertion=0x48aced "!tr_peerIsSeed (&msgs->peer)", file=0x48abc6 "peer-msgs.c", line=708, function=0x48b9b0 <__PRETTY_FUNCTION__.10118> "tr_peerMsgsCalculateActive") at assert.c:101 #4 0x00000000004525ef in tr_peerMsgsCalculateActive (msgs=0x7fffe8e93100, direction=TR_CLIENT_TO_PEER) at peer-msgs.c:708 #5 0x00000000004527f1 in tr_peerMsgsUpdateActive (msgs=0x7fffe8e93100, direction=TR_CLIENT_TO_PEER) at peer-msgs.c:762 #6 0x000000000045485c in readBtMessage (msgs=0x7fffe8e93100, inbuf=0x7fffe828f620, inlen=0) at peer-msgs.c:1523 #7 0x0000000000453f3f in readBtId (msgs=0x7fffe8e93100, inbuf=0x7fffe828f620, inlen=1) at peer-msgs.c:1308 #8 0x0000000000455590 in canRead (io=0x7fffeabde780, vmsgs=0x7fffe8e93100, piece=0x7ffff2ac3a80) at peer-msgs.c:1774 #9 0x00000000004444e5 in canReadWrapper (io=0x7fffeabde780) at peer-io.c:203 #10 0x0000000000444df7 in utp_on_read (closure=0x7fffeabde780, buf=0x7ffff2ac3dd4 "\030\305\377\222\205", <incomplete sequence \367>, buflen=464) at peer-io.c:424 #11 0x00000000004793ae in UTP_ProcessIncoming (conn=0x7fffec0080d0, packet=0x7ffff2ac3dc0 "\001", len=484, syn=false) at utp.cpp:2158 #12 0x000000000047a5da in UTP_IsIncomingUTP (incoming_proc=0x42031b <incoming>, send_to_proc=0x420437 <tr_utpSendTo>, send_to_userdata=0x6cac10, buffer=0x7ffff2ac3dc0 "\001", len=484, to=0x7ffff2ac3d40, tolen=16) at utp.cpp:2587 #13 0x0000000000420637 in tr_utpPacket (buf=0x7ffff2ac3dc0 "\001", buflen=484, from=0x7ffff2ac3d40, fromlen=16, ss=0x6cac10) at tr-utp.c:180 #14 0x000000000041fe37 in event_callback (s=16, type=2, sv=0x6cac10) at tr-udp.c:226 #15 0x00007ffff77903cc in event_base_loop () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5 #16 0x0000000000421310 in libeventThreadFunc (veh=0x6cb4f0) at trevent.c:246 #17 0x0000000000408a91 in ThreadFunc (_t=0x6aee80) at platform.c:109 #18 0x00007ffff67850a4 in start_thread (arg=0x7ffff2ac5700) at pthread_create.c:309 #19 0x00007ffff64bac2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:10118
comment:20 Changed 8 years ago by x190
Look carefully at my patch.
https://trac.transmissionbt.com/attachment/ticket/5505/tr_peerMsgsCalculateActive.patch
There is no such assert at Line 708 (peer-msgs.c --- red means delete). Did you forget to remove it or comment it out?
comment:21 follow-up: ↓ 27 Changed 8 years ago by Robby
x190, are you still getting this crash? I got it when I tried the nightly a while back.
comment:22 Changed 8 years ago by cfpp2p
x190 has a fix that preliminary, but valid testing indicates a solution to the problem :) https://forum.transmissionbt.com/viewtopic.php?f=4&t=16639#p69969
comment:23 follow-up: ↓ 25 Changed 8 years ago by mike.dld
While trying to reproduce the issue (and succeeding at it), I also had a couple of failed assertions on similar condition in the same function
if (is_active) assert (!tr_torrentIsSeed (msgs->torrent));
(for TR_PEER_TO_CLIENT case), and also in tr_peerMsgsIsActive on
assert (is_active == tr_peerMsgsCalculateActive (msgs, direction));
Those assertions are there for good, i.e. how can a peer be a seed if we've just received BT_INTERESTED from it (from what I've observed)? Or how could we be interested if we're seeding already? IMO those are conditions which shouldn't be met. Seeing the failures suggests that either Transmission (and/or another client) misbehaves and should be fixed or that our expectations are wrong to begin with and assertions should be transformed into simple conditions and handled normally. Proposed patch does the latter, but I'm wondering if it could still be the former. Any thoughts on this?
comment:24 Changed 8 years ago by mike.dld
Oh, worth noting that assertion in tr_peerMsgsIsActive failed after I changed tr_peerMsgsCalculateActive to return false instead of asserting on tr_peerIsSeed and tr_torrentIsSeed.
comment:25 in reply to: ↑ 23 Changed 8 years ago by cfpp2p
Replying to mike.dld:
...i.e. how can a peer be a seed if we've just received BT_INTERESTED from it (from what I've observed)? Or how could we be interested if we're seeding already? IMO those are conditions which shouldn't be met. Seeing the failures suggests that either Transmission (and/or another client) misbehaves...
My thoughts on this are conflicted. Transmission is misbehaving, and at the same time some or all of our expectations are wrong. The simple conditionals may just push the misunderstanding further into the code but x190's patch and test subject hasn't indicated a problem. However your report in comment:24 indicates otherwise.
I've looked at this section of code many times and have never liked it or been able to focus and really get my head around it. I always trace back to ticket:5372 Wrong peer states displayed by Transmission thinking that's where the problem manifested and was an indicator of a deeper problem somewhere else in the code r14108.
I'm glad that you've taken initiative and reproduced the problem. Looks like the real problem could possibly be exposed.
These are my thoughts.
comment:26 Changed 8 years ago by dirty
I've been seeing this recently. I am using the nightly build, revision 14388.
comment:27 in reply to: ↑ 21 Changed 8 years ago by x190.303cal
Replying to Robby:
x190, are you still getting this crash? I got it when I tried the nightly a while back.
You have to apply the patch from comment:15 yourself.
comment:28 Changed 8 years ago by cfpp2p
The crash I think is reported somewhat frequently on the forums too.
comment:29 follow-up: ↓ 30 Changed 8 years ago by mike.dld
I've tried debugging this and I see that sometimes (quite rarely actually) Vuze 5.5.0.0 sends "bitfield" message indicating that it has all the data and right after that sends "interested" message. This is what causes assertion to fail. Yet to talk to Vuze developers (to determine if it is a bug in Vuze), could as well be already fixed in 5.5.0.1 beta. If someone has an account on their forums I'd be grateful if you take the lead.
comment:30 in reply to: ↑ 29 Changed 8 years ago by cfpp2p
Replying to mike.dld:
I've tried debugging this and I see that sometimes (quite rarely actually) Vuze 5.5.0.0 sends "bitfield" message indicating that it has all the data and right after that sends "interested" message. This is what causes assertion to fail...
It looks promising that you might be finally getting to the bottom of this issue.
But even if it is a Vuze bug I still think it necessary to eliminate in transmission where a remote client could crash transmission in this fashion. There have been issues with Vuze before i.e. #5049 r13527
comment:31 Changed 8 years ago by mike.dld
HastaJun (on IRC) suggested that this could be connected to another long-standing issue where Transmission is unchoking uninterested peers. Quick patch seems to fix assertion failure for me, but still points out that Vuze doesn't behave well.
-
libtransmission/peer-mgr.c
3116 3116 unchokedInterested = 0; 3117 3117 for (i=0; i<size && unchokedInterested<session->uploadSlotsPerTorrent; ++i) 3118 3118 { 3119 if (!choke[i].isInterested) 3120 continue; 3119 3121 choke[i].isChoked = isMaxedOut ? choke[i].wasChoked : false; 3120 if (choke[i].isInterested) 3121 ++unchokedInterested; 3122 ++unchokedInterested; 3122 3123 } 3123 3124 3124 3125 /* optimistic unchoke */
comment:32 follow-up: ↓ 34 Changed 8 years ago by mike.dld
The comment just above the code I patched indicates that unchoking uninterested clients is the intended behavior:
Peers which have a better upload rate (as compared to the downloaders) but aren't interested get unchoked. If they become interested, the downloader with the worst upload rate gets choked. If a client has a complete file, it uses its upload rate rather than its download rate to decide which peers to unchoke.
This leaves us with two consequences:
- Transmission deliberately unchokes uninterested peers and then asserts if they become interested while already having all the data. This could be fixed by eliminating the assertion and handling the case normally (after all, the bug is not in our code so assertion doesn't make much sense), which is done in x190's patch attached.
- Vuze 5.5.0.0 becomes interested when unchoked even if it already has all the data. This could be fixed by Vuze developers, but only in future versions. I've also observed rare cases when other clients (e.g. BitTorrent 7.1.0 and µTorrent 3.4.2) send 'interested' message at the time tr_peerIsSeed returns true, so it seems to be not at all uncommon.
Sorry for wasting everyone's time with my investigation, although it was interesting to get to know what's happening. I'm leaving actual commit to jordan if no one would mind. Note that release versions of Transmission don't crash (read: abort) as asserts are disabled, so workaround until fix would be to compile with -DNDEBUG.
comment:33 Changed 8 years ago by mike.dld
From ##bittorrent (on IRC):
<HastaJun> The_8472: Would you mind taking a quick look at this: https://trac.transmissionbt.com/ticket/5505#comment:29 IRC logg: http://pastebin.com/R68StLL8 <mikedld> "quite rarely actually" should be removed there, as it happens to each one of Vuze 5.5.0.0 clients I think. sometimes assertion isn't being hit though, didn't dig into that; jordan_ will be more productive in that anyway <mikedld> what I meant to say is that Vuze always becomes interested when we unchoke it (and it has all the data), and only "quite rarely actually" the assertion is being hit <The_8472> HastaJun, transmission unchoking uninterested peers sounds like a bug to me. <The_8472> the BT spec states "Connections start out choked and not interested." <The_8472> and it has no reason to unchoke according to the spec <The_8472> of course vuze's behavior seems odd too <HastaJun> Yea, I agree that Transmission have a bug there. Do you think Vuzes behavior is because it trying to get to a correct state in the state machine?! <HastaJun> I don't know how much the code have changed in Vuze since you where involved, that part ought to be quite stable and untuched since several years <The_8472> HastaJun, looking at the source i can't find a place where it would send an interested message unless it's really interested <HastaJun> mikedld: ping <The_8472> the whole thing seems conditional on there being unfinished pieces <HastaJun> Yeah, it is strange and messy with bug in two ends... <HastaJun> bugs <mikedld> The_8472: where's that source code you're looking at? I'd like to take a look myself as well <The_8472> http://svn.vuze.com/public/client/ <The_8472> org.gudy.azureus2.core3.peer.impl.transport.PEPeerTransportProtocol. checkInterested() <mikedld> and azureus3 directory is something else, right? <The_8472> the source is split over multiple subdirs, this particular file is in the azureus2/src folder <HastaJun> Well, now i'm out because Java is not my thing =) Thanks for you tock the time The_8472 ... <The_8472> mikedld, that's basically the method making the decision: http://pastebin.com/a7Nfz4Py <mikedld> so interested_in_other_peer could only become false if is_download_disabled is true <The_8472> hum... <mikedld> which in turn only happens in setDownloadRateLimitBytesPerSecond, right? <The_8472> yeah, isdownloaddisabled = true can be considered uncommon <The_8472> looks like the boolean logic was incorrectly modified to add the is_download_disabled logic <mikedld> and it happens so that particular code has changed from 5.4.0.0 :) <The_8472> interested_in_other_peer used to be initialized to false <mikedld> would it be too much to ask you to convey this issue to Vuze developers? <The_8472> can do
comment:34 in reply to: ↑ 32 Changed 8 years ago by cfpp2p
Replying to mike.dld:
...(after all, the bug is not in our code so assertion doesn't make much sense)...
Agreed.
Sorry for wasting everyone's time with my investigation...
didn't waste my time
<The_8472> HastaJun?, transmission unchoking uninterested peers sounds like a bug to me. <The_8472> the BT spec states "Connections start out choked and not interested." <The_8472> and it has no reason to unchoke according to the spec <The_8472> of course vuze's behavior seems odd too <HastaJun?> Yea, I agree that Transmission have a bug there. Do you think Vuzes behavior is because it trying to get to a correct state in the state machine?!
The above, stating transmission has a bug in the unchoking I don't agree with.
As mike.dld already stated:
The comment just above the code I patched indicates that unchoking uninterested clients is the intended behavior:
Peers which have a better upload rate (as compared to the downloaders) but aren't interested get unchoked. If they become interested, the downloader with the worst upload rate gets choked. If a client has a complete file, it uses its upload rate rather than its download rate to decide which peers to unchoke.
related references for libtransmission/peer-mgr.c :
but aren't interested get unchoked r4579
If our bandwidth is maxed out, don't unchoke any more peers r11229
support "reject request" and "have all/none" from BEP 6 -- support fast exensions #1549 r7234
comment:35 Changed 8 years ago by cfpp2p
A patch from the forums https://forum.transmissionbt.com/viewtopic.php?f=4&t=16639&p=70150#p70088 dictates an excellent method to fix without altering the original intended behavior. We'll patch peer-msgs.c
-
libtransmission/peer-msgs.c
1518 1518 1519 1519 case BT_INTERESTED: 1520 1520 dbgmsg (msgs, "got Interested"); 1521 msgs->peer_is_interested = true; 1522 tr_peerMsgsUpdateActive (msgs, TR_CLIENT_TO_PEER); 1521 if (!tr_peerIsSeed (&msgs->peer)) /* ignore spurious BT_INTERESTED messages */ 1522 { 1523 msgs->peer_is_interested = true; 1524 tr_peerMsgsUpdateActive (msgs, TR_CLIENT_TO_PEER); 1525 } 1523 1526 break; 1524 1527 1525 1528 case BT_NOT_INTERESTED:
comment:36 follow-up: ↓ 38 Changed 8 years ago by mike.dld
I tried making this kind of change myself before and I could tell you that
- (see below)
- IMHO it's not correct to ignore the message (and hence lose the information of client being interested) which could then lead to other illogical stuff like seeing that peer wasn't interested (because we ignored the 'interested' message before) upon receiving 'not interested' message.
EDIT: I stand corrected, assertion isn't being hit with this patch. I must have thought of something else. Second point is still there though.
comment:37 Changed 8 years ago by cfpp2p
One thing we do know, the assertion brought to light a problem and made it obvious something needs to be done.
comment:38 in reply to: ↑ 36 Changed 8 years ago by cfpp2p
Replying to mike.dld:
...which could then lead to other illogical stuff like seeing that peer wasn't interested (because we ignored the 'interested' message before) upon receiving 'not interested' message.
If we get peers that send transmission illogical sequences of messages it's ultimately the responsibility of transmission to properly deal with this. If the release versions of "Transmission don't crash (read: abort) as asserts are disabled" then transmission must at times already be dealing with illogical sequences as the true/false status of is_active is not correctly decided as it stands now.
When jordan comes onto the scene his expertise and valued experience should solve these dilemmas. We've got three different patches now each with their own merits so I don't see that we won't have a solid solution soon.
comment:39 Changed 8 years ago by x190.303cal
Why don't we just deal with it, as peers will be using old client versions basically forever?
comment:40 follow-up: ↓ 41 Changed 8 years ago by x190.303cal
Attached updated patch to handle spurious BT_INTERESTED messages.
comment:41 in reply to: ↑ 40 Changed 8 years ago by x190
Replying to x190.303cal:
Attached updated patch to handle spurious BT_INTERESTED messages.
comment:42 Changed 8 years ago by mike.dld
- Milestone changed from None Set to 2.90
Sorry that it took so much time and so many broken nightly builds, but I've commented the assert out for now (r14473) and am changing milestone to 2.90 so that @jordan won't forget to look into it before release. I think we all agree that number of reports for this one is getting ridiculously high.
comment:43 Changed 8 years ago by cfpp2p
OK, good. Now lets see if we can coax jordan back on the scene. :)
comment:44 Changed 7 years ago by mike.dld
- Milestone changed from 2.90 to 2.91
comment:45 Changed 7 years ago by mike.dld
- Milestone changed from 2.91 to 3.00
There's a good chance this crash from #5493 involves webseeds.