Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#4173 closed Bug (fixed)

Crashing with r12315 in Debian

Reported by: gunzip
Owned by:
Priority: Normal
Milestone:
Component: Daemon
Version: 2.22+
Severity: Normal
Keywords:
Cc:

Description

Linux Debian, transmission-daemon 2.30b1 (12315)

With the above build, the daemon crashed on 2 of the first 3 torrents, about 5 to 10 minutes after each torrent was started. backtrace.log:

Thread 3 (Thread 0xb70f0b70 (LWP 22866)):
#0  0xb7cbebc7 in select () from /lib/libc.so.6
No symbol table info available.
#1  0x08076731 in tr_select (nfds=0, r_fd_set=0xb70f0260, w_fd_set=0xb70f01e0, c_fd_set=0xb70f0160, t=0xb70f02f0)
    at web.c:299
No locals.
#2  0x080769f2 in tr_webThreadFunc (vsession=0x80cf030) at web.c:373
        usec = 1000000
        r_fd_set = {__fds_bits = {0 <repeats 32 times>}}
        max_fd = -1
        t = {tv_sec = 0, tv_usec = 244839}
        w_fd_set = {__fds_bits = {0 <repeats 32 times>}}
        c_fd_set = {__fds_bits = {0 <repeats 32 times>}}
        msec = 1000
        unused = 0
        msg = 0x0
        mcode = CURLM_OK
        multi = 0x80dd0d8
        web = 0x80dd0c8
        taskCount = 0
        task = 0x0
        session = 0x80cf030
#3  0x0805ace8 in ThreadFunc (_t=0x80d1128) at platform.c:118
        t = 0x80d1128
#4  0xb7d447b0 in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#5  0xb7cc58fe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 2 (Thread 0xb78f0b70 (LWP 22865)):
#0  0xb7c24537 in raise () from /lib/libc.so.6
No symbol table info available.
#1  0xb7c27922 in abort () from /lib/libc.so.6
No symbol table info available.
#2  0xb7fac700 in ?? () from /usr/local/lib/libevent-2.0.so.5
No symbol table info available.
#3  0xb7fac747 in event_errx () from /usr/local/lib/libevent-2.0.so.5
No symbol table info available.
#4  0xb7fa2c24 in ?? () from /usr/local/lib/libevent-2.0.so.5
No symbol table info available.
#5  0xb7fa35b2 in evbuffer_add () from /usr/local/lib/libevent-2.0.so.5
No symbol table info available.
#6  0xb7fa3889 in evbuffer_remove_buffer () from /usr/local/lib/libevent-2.0.so.5
No symbol table info available.
#7  0x0805a429 in tr_peerIoReadBytesToBuf (io=0x819e4b0, inbuf=0x819e0f0, outbuf=0x8194f80, byteCount=1072)
    at peer-io.c:1060
        old_length = 15312
        __PRETTY_FUNCTION__ = "tr_peerIoReadBytesToBuf"
#8  0x080921da in readBtPiece (msgs=0x8239588, inbuf=0x819e0f0, inlen=1406, setme_piece_bytes_read=0xb78effec)
    at peer-msgs.c:1314
        err = 22865
        nLeft = 1072
        n = 1072
        req = 0x823aed0
        __PRETTY_FUNCTION__ = "readBtPiece"
#9  0x0809339e in canRead (io=0x819e4b0, vmsgs=0x8239588, piece=0xb78effec) at peer-msgs.c:1613
        ret = 1302049771
        msgs = 0x8239588
        in = 0x819e0f0
        inlen = 1406
        __PRETTY_FUNCTION__ = "canRead"
#10 0x08057a18 in canReadWrapper (io=0x819e4b0) at peer-io.c:148
        oldLen = 1406
        ret = 1281
        overhead = 3086783316
        piece = 0
        used = 141293624
        now = 1302049771828
        err = false
        done = false
        session = 0x80cf030
        __PRETTY_FUNCTION__ = "canReadWrapper"
#11 0x08057d7f in event_read_cb (fd=75, event=2, vio=0x819e4b0) at peer-io.c:229
        res = 1406
        e = 0
        io = 0x819e4b0
        howmuch = 37178
        curlen = 0
        dir = TR_PEER_TO_CLIENT
        max = 262144
        __PRETTY_FUNCTION__ = "event_read_cb"
#12 0xb7f9e79f in event_base_loop () from /usr/local/lib/libevent-2.0.so.5
No symbol table info available.
#13 0xb7f9f455 in event_base_dispatch () from /usr/local/lib/libevent-2.0.so.5
No symbol table info available.
#14 0x08071c10 in libeventThreadFunc (veh=0x80d0b98) at trevent.c:248
        base = 0x80cf528
        eh = 0x80d0b98
#15 0x0805ace8 in ThreadFunc (_t=0x80d0be8) at platform.c:118
        t = 0x80d0be8
#16 0xb7d447b0 in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#17 0xb7cc58fe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 1 (Thread 0xb78f16e0 (LWP 22862)):
#0  0xb7d4c2dc in nanosleep () from /lib/libpthread.so.0
No symbol table info available.
#1  0x0807343b in tr_wait_msec (msec=1000) at utils.c:826
        ts = {tv_sec = 1, tv_nsec = 0}
#2  0x0804daf8 in main (argc=2, argv=0xbffff7c4) at daemon.c:537
        c = 0
        optarg = 0x0
        settings = {val = {b = 208 '\320', d = 2.5470622426780038e-312, i = 515531137232, s = {len = 135061712, str = {
                buf = "x\000\000\000r\000\000\000\000\000\000\000\000\000\000", 
                ptr = 0x78 <Address 0x78 out of bounds>}}, l = {vals = 0x80ce0d0, alloc = 120, count = 114}}, 
          type = 8 '\b'}
        boolVal = false
        loaded = true
        foreground = true
        dumpSettings = false
        configDir = 0x80ce0a0 "/home/john/.config/transmission-daemon"
        pid_filename = 0x0
        watchdir = 0x0
        logfile = 0xb7d3b560
        pidfile_created = false

Change History (13)

comment:1 Changed 11 years ago by jordan

Hmm, that's an odd crash.

If it's still repeatable, could you install the debug package for libevent2 and generate a new backtrace?

The reason for the request is that the libtransmission frames in that backtrace carry all kinds of information, but the libevent2 frames are blank...

comment:2 Changed 11 years ago by gunzip

yes, it's repeatable in the sense that 4 of 6 runs have now crashed, whereas in the past TR never crashed for me. this is the latest backtrace with libevent2 debug info:

Thread 3 (Thread 0xb70f0b70 (LWP 4980)):
#0  0xb7cbebc7 in select () from /lib/libc.so.6
No symbol table info available.
#1  0x08076731 in tr_select (nfds=0, r_fd_set=0xb70f0260, w_fd_set=0xb70f01e0, c_fd_set=0xb70f0160, t=0xb70f02f0)
    at web.c:299
No locals.
#2  0x080769f2 in tr_webThreadFunc (vsession=0x80cf030) at web.c:373
        usec = 1000000
        r_fd_set = {__fds_bits = {0 <repeats 32 times>}}
        max_fd = -1
        t = {tv_sec = 0, tv_usec = 389418}
        w_fd_set = {__fds_bits = {0 <repeats 32 times>}}
        c_fd_set = {__fds_bits = {0 <repeats 32 times>}}
        msec = 1000
        unused = 0
        msg = 0x0
        mcode = CURLM_OK
        multi = 0x80dd0d8
        web = 0x80dd0c8
        taskCount = 0
        task = 0x0
        session = 0x80cf030
#3  0x0805ace8 in ThreadFunc (_t=0x80d1128) at platform.c:118
        t = 0x80d1128
#4  0xb7d447b0 in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#5  0xb7cc58fe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 2 (Thread 0xb78f0b70 (LWP 4979)):
#0  0xb7c24537 in raise () from /lib/libc.so.6
No symbol table info available.
#1  0xb7c27922 in abort () from /lib/libc.so.6
No symbol table info available.
#2  0xb7fac700 in event_exit (errcode=-559030611) at log.c:79
No locals.
#3  0xb7fac747 in event_errx (eval=-559030611, fmt=0xb7fc4360 "%s:%d: Assertion %s failed in %s") at log.c:136
No locals.
#4  0xb7fa2c24 in evbuffer_chain_insert (buf=0x8168728, chain=0x86e6650) at buffer.c:264
        __func__ = "evbuffer_chain_insert"
#5  0xb7fa35b2 in evbuffer_add (buf=0x8168728, data_in=0x827ece8, datlen=35) at buffer.c:1554
        chain = <value optimized out>
        tmp = 0x86e6650
        remain = 0
        to_alloc = <value optimized out>
        result = <value optimized out>
#6  0xb7fa3889 in evbuffer_remove_buffer (src=0x8154bc0, dst=0x8168728, datlen=35) at buffer.c:1100
        chain = 0x827ecd0
        nread = 976
        result = <value optimized out>
        __func__ = "evbuffer_remove_buffer"
#7  0x0805a429 in tr_peerIoReadBytesToBuf (io=0x8159168, inbuf=0x8154bc0, outbuf=0x8168728, byteCount=1011)
    at peer-io.c:1060
        old_length = 15373
        __PRETTY_FUNCTION__ = "tr_peerIoReadBytesToBuf"
#8  0x080921da in readBtPiece (msgs=0x8172f08, inbuf=0x8154bc0, inlen=1448, setme_piece_bytes_read=0xb78effec)
    at peer-msgs.c:1314
        err = 4979
        nLeft = 1011
        n = 1011
        req = 0x8174850
        __PRETTY_FUNCTION__ = "readBtPiece"
#9  0x0809339e in canRead (io=0x8159168, vmsgs=0x8172f08, piece=0xb78effec) at peer-msgs.c:1613
        ret = 1302118059
        msgs = 0x8172f08
        in = 0x8154bc0
        inlen = 1448
        __PRETTY_FUNCTION__ = "canRead"
#10 0x08057a18 in canReadWrapper (io=0x8159168) at peer-io.c:148
        oldLen = 1448
        ret = 1281
        overhead = 3086783316
        piece = 0
        used = 147325536
        now = 1302118059129
        err = false
        done = false
        session = 0x80cf030
        __PRETTY_FUNCTION__ = "canReadWrapper"
#11 0x08057d7f in event_read_cb (fd=42, event=2, vio=0x8159168) at peer-io.c:229
        res = 1448
        e = 0
        io = 0x8159168
        howmuch = 120391
        curlen = 0
        dir = TR_PEER_TO_CLIENT
        max = 262144
        __PRETTY_FUNCTION__ = "event_read_cb"
#12 0xb7f9e79f in event_process_active_single_queue (base=0x80cf528, flags=<value optimized out>) at event.c:1287
        ev = 0x8158e68
#13 event_process_active (base=0x80cf528, flags=<value optimized out>) at event.c:1354
        i = 0
#14 event_base_loop (base=0x80cf528, flags=<value optimized out>) at event.c:1551
        n = 1
        evsel = 0xb7fc9100
        tv = {tv_sec = 0, tv_usec = 6422}
        tv_p = <value optimized out>
        res = 0
        retval = <value optimized out>
        __func__ = "event_base_loop"
#15 0xb7f9f455 in event_base_dispatch (event_base=0x80cf528) at event.c:1382
No locals.
#16 0x08071c10 in libeventThreadFunc (veh=0x80d0b98) at trevent.c:248
        base = 0x80cf528
        eh = 0x80d0b98
#17 0x0805ace8 in ThreadFunc (_t=0x80d0be8) at platform.c:118
        t = 0x80d0be8
#18 0xb7d447b0 in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#19 0xb7cc58fe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 1 (Thread 0xb78f16e0 (LWP 4976)):
#0  0xb7d4c2dc in nanosleep () from /lib/libpthread.so.0
No symbol table info available.
#1  0x0807343b in tr_wait_msec (msec=1000) at utils.c:826
        ts = {tv_sec = 1, tv_nsec = 0}
#2  0x0804daf8 in main (argc=2, argv=0xbffff7c4) at daemon.c:537
        c = 0
        optarg = 0x0
        settings = {val = {b = 208 '\320', d = 2.5470622426780038e-312, i = 515531137232, s = {len = 135061712, str = {
                buf = "x\000\000\000r\000\000\000\000\000\000\000\000\000\000", 
                ptr = 0x78 <Address 0x78 out of bounds>}}, l = {vals = 0x80ce0d0, alloc = 120, count = 114}}, 
          type = 8 '\b'}
        boolVal = false
        loaded = true
        foreground = true
        dumpSettings = false
        configDir = 0x80ce0a0 "/home/john/.config/transmission-daemon"
        pid_filename = 0x0
        watchdir = 0x0
        logfile = 0xb7d3b560
        pidfile_created = false

comment:3 Changed 11 years ago by jordan

Does r12332 fix this?

comment:4 Changed 11 years ago by gunzip

no, r12332 still crashes. i've since installed libc6-dbg, which provides debugging symbols for the C library, hoping it might give more info about what is causing the crashes:

http://pastebin.com/pJfuxFQq

comment:5 Changed 11 years ago by gunzip

i went back a little to r12307 and it had the same crash on the first run. then i went to the beta release on the TR home page, 2.30b1 (12287), and saw no crashes in three successive runs.

so somewhere between r12288 and r12307 transmission breaks on my end.

comment:6 Changed 11 years ago by jordan

That's helpful -- most of the commits in that range are unrelated to this code.

Does r12303 work?

comment:7 Changed 11 years ago by gunzip

r12303 crashed on the 2nd test run with same backtrace as before.

going down one build to r12302, there have been no crashes so far in four consecutive runs. given how often it was crashing before, this may have isolated it: r12303 appears to have introduced the bug.

if the above is true, is there any correlation between the backtrace.log and what was done in r12303? does it make sense?

just for info, i found the best way to trigger the bug was to find a public torrent with multiple trackers and a huge swarm, i.e. a "busy" torrent, giving transmission a lot to do.
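
The narrowing-down in comments 5 through 7 amounts to a bisection over revisions. A minimal sketch of that search follows, assuming a hypothetical `is_bad` predicate that reports whether a given revision crashes; neither function below comes from Transmission. Because this crash is intermittent, a real predicate would have to build the revision and run it several times before calling it good. `simulated_is_bad` just encodes the outcome reported in this ticket.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate: true if the given revision is judged "bad".
 * For an intermittent crash, a real implementation would build the
 * revision and run it several times before answering. */
typedef bool (*is_bad_fn)(int revision);

/* Returns the first bad revision in (known_good, known_bad], assuming
 * every revision from the first bad one onward is also bad. */
int
first_bad_revision(int known_good, int known_bad, is_bad_fn is_bad)
{
    assert(known_good < known_bad);

    while (known_bad - known_good > 1) {
        const int mid = known_good + (known_bad - known_good) / 2;
        if (is_bad(mid))
            known_bad = mid;   /* the regression is at mid or earlier */
        else
            known_good = mid;  /* the regression is after mid */
    }
    return known_bad;
}

/* Stand-in for real testing: encodes the result reported in this ticket
 * (r12302 clean, r12303 and later crashing). */
bool
simulated_is_bad(int revision)
{
    return revision >= 12303;
}
```

Calling `first_bad_revision(12287, 12307, simulated_is_bad)` probes r12297, r12302, r12304 and r12303 in that order, so four builds are enough to cover the twenty-revision range.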

comment:8 Changed 11 years ago by jordan

I wonder if there's a bug in evbuffer_remove_buffer().

Just for testing, could you try making this change to libtransmission/peer-io.c's function tr_peerIoReadBytesToBuf():

 void
 tr_peerIoReadBytesToBuf( tr_peerIo * io, struct evbuffer * inbuf, struct evbuffer * outbuf, size_t byteCount )
 {
+    struct evbuffer * tmp;

...

     /* append it to outbuf */
-    evbuffer_remove_buffer( inbuf, outbuf, byteCount );
+    tmp = evbuffer_new( );
+    evbuffer_remove_buffer( inbuf, tmp, byteCount );
+    evbuffer_add_buffer( outbuf, tmp );
+    evbuffer_free( tmp );

Sorry I don't have a proper diff with line numbers handy... if you need a proper diff let me know


comment:9 Changed 11 years ago by gunzip

ok, i'm now re-compiling r12303 with the above change (i think). you can check whether the diff is right:

--- peer-io.c	2011-04-07 21:18:16.000000000 -0400
+++ peer-io.c.new	2011-04-07 21:18:24.000000000 -0400
@@ -1063,13 +1063,17 @@
 void
 tr_peerIoReadBytesToBuf( tr_peerIo * io, struct evbuffer * inbuf, struct evbuffer * outbuf, size_t byteCount )
 {
+    struct evbuffer * tmp;
     const size_t old_length = evbuffer_get_length( outbuf );
 
     assert( tr_isPeerIo( io ) );
     assert( evbuffer_get_length( inbuf )  >= byteCount );
 
     /* append it to outbuf */
-    evbuffer_remove_buffer( inbuf, outbuf, byteCount );
+    tmp = evbuffer_new( );
+    evbuffer_remove_buffer( inbuf, tmp, byteCount );
+    evbuffer_add_buffer( outbuf, tmp );
+    evbuffer_free( tmp );
 
     /* decrypt if needed */
     if( io->encryption_type == PEER_ENCRYPTION_RC4 ) {

will need some time to retest this with some torrent runs

comment:10 Changed 11 years ago by gunzip

Update

r12303 + your evbuffer_remove_buffer-patch looks promising .. so far 3 successive runs with no crashes.

i'll do some more stress testing and see if it holds up.

Update 2, the next day

now up to 7 good runs in a row. i tried hard but i can't get transmission to crash anymore using the patch.
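
A rough way to judge how convincing a streak of clean runs is: a sketch that assumes runs are independent and takes the per-run crash rate from comment:2 (4 of 6 runs) as an estimate. Both are assumptions. That rate was measured on a different build, and six runs is a small sample. `prob_all_clean` is a hypothetical helper, not part of Transmission.

```c
#include <assert.h>

/* Probability that `runs` independent runs all finish without a crash,
 * given an assumed per-run crash probability `p_crash`. */
double
prob_all_clean(double p_crash, int runs)
{
    assert(p_crash >= 0.0 && p_crash <= 1.0);
    assert(runs >= 0);

    double p = 1.0;
    for (int i = 0; i < runs; ++i)
        p *= (1.0 - p_crash);
    return p;
}
```

With `p_crash = 4.0 / 6.0`, seven clean runs in a row would happen by chance with probability (1/3)^7 = 1/2187, or under 0.05%, if the bug were still present at the old rate.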


comment:11 Changed 11 years ago by jordan

r12337 libtransmission/peer-io.c: (trunk libT) #4173 "crashing with r12315 in Debian" -- apply patch from comment:8 for testing in 2.30b2

comment:12 follow-up: Changed 11 years ago by jordan

  • Milestone None Set deleted
  • Resolution set to fixed
  • Status changed from new to closed

I guess I should mark this one as fixed.

gunzip, please reopen if this problem comes back. Thanks again!

comment:13 in reply to: ↑ 12 Changed 11 years ago by gunzip

Replying to jordan:

I guess I should mark this one as fixed.

definitely .. after nearly a week of using the daemon with the patch i've had 0 crashes. so whatever magic you did above worked wonders on my end.
