Opened 14 years ago

Closed 14 years ago

Last modified 10 years ago

#161 closed Defect (fixed)

ratecontrol.c causes segment violation

Reported by: oleo Owned by: titer
Priority: Normal Milestone: Sometime
Component: libtransmission Version: Other
Severity: Critical Keywords: 0.7-svn (1446)
Cc:

Description

I have collected several core dumps from my embedded transmission daemon, which runs with 32MB RAM (512MB swap) on an Asus WL500G router (uclibc-mipsel-linux) and uses 0.7-svn (973) libtransmission. Whenever I use ratecontrol I get segmentation violations. When I disable ratecontrol with -1 for download and upload, transmission can run for days without a problem. With rate control enabled, a segmentation violation occurs after anywhere from 45 minutes to 4 hours.

I have studied ratecontrol.c and could not find any mistake, except for some unsafe malloc calls and maybe calling free(p) without setting p=NULL afterwards.
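To illustrate the two habits mentioned above (checking the result of malloc, and clearing a pointer after free), here is a minimal sketch. The type `transfer_t` and the helpers `transferNew` and `transferFree` are hypothetical names made up for this example; they are not the actual code in ratecontrol.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record type; stands in for whatever ratecontrol.c allocates. */
typedef struct transfer_s
{
    unsigned long date; /* timestamp of the transfer */
    int           size; /* bytes transferred */
} transfer_t;

/* Allocate one record. Returns NULL (after logging) when malloc fails,
 * so the caller never dereferences a failed allocation. */
static transfer_t * transferNew( unsigned long date, int size )
{
    transfer_t * t = malloc( sizeof( *t ) );
    if( t == NULL )
    {
        fprintf( stderr, "transferNew: out of memory\n" );
        return NULL;
    }
    t->date = date;
    t->size = size;
    return t;
}

/* Free through a pointer-to-pointer and reset the caller's copy to NULL,
 * so a later accidental second free becomes a harmless free(NULL). */
static void transferFree( transfer_t ** t )
{
    assert( t != NULL );
    free( *t ); /* free(NULL) is defined to do nothing */
    *t = NULL;
}
```

Clearing the pointer only protects the one copy that is passed in; other aliases to the same block would still dangle, so this is a mitigation rather than a cure.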

This problem may be very specific to my setup. I suspect it is memory related: ratecontrol stresses the allocator by allocating many small chunks of memory that expire often, and this causes heavy memory fragmentation, which leads to "memory unavailable" and makes malloc fail.

Here are some post-mortem backtraces:

Core was generated by `transmissiond -p 65534 -w 300 -u 41 -d 180 -i /opt/var/run/transmission.pid /tm'
(gdb) where full
#0  0x2adb789c in free () from /opt/lib/libc.so.0
No symbol table info available.
#1  0x00412ad4 in tr_rcTransferred (r=0x10266278, size=270) at ratecontrol.c:76
        t = (tr_transfer_t *) 0x18b16d57
#2  0x004095a8 in tr_peerRead (tor=0x10263258, peer=0x10348c70) at peer.c:277
        ret = 1024
#3  0x0040aad0 in tr_peerPulse (tor=0x10263258) at peer.c:383
        i = 2
        ret = 272208088
        size = 17
        peer = (tr_peer_t *) 0x10348c70
#4  0x00406218 in downloadLoop (_tor=0x10263258) at transmission.c:747
No locals.
#5  0x2ab0a5c8 in pthread_start_thread () from /opt/lib/libpthread.so.0
No symbol table info available.
#6  0x2ab0a63c in pthread_start_thread_event () from /opt/lib/libpthread.so.0
No symbol table info available.
#7  0x2ab0a63c in pthread_start_thread_event () from /opt/lib/libpthread.so.0
No symbol table info available.
Previous frame identical to this frame (corrupt stack?)
This GDB was configured as "mipsel-linux-uclibc"...Using host libthread_db library "/opt/lib/libthread_db.so.1".

Core was generated by `transmissiond -p 65534 -w 300 -u 41 -d 180 -i /opt/var/run/transmission.pid /tm'.
(gdb) where full
#0  0x2adb65bc in malloc () from /opt/lib/libc.so.0
No symbol table info available.
#1  0x00412998 in tr_rcTransferred (r=0x100161c8, size=1024) at ratecontrol.c:121
        t = (tr_transfer_t *) 0x1020cc90
#2  0x004095c0 in tr_peerRead (tor=0x10017288, peer=0x1020cc90) at peer.c:278
        ret = 1024
#3  0x0040aad0 in tr_peerPulse (tor=0x10017288) at peer.c:383
        i = 0
        ret = 274877907
        size = 716225076
        peer = (tr_peer_t *) 0x1020cc90
#4  0x00406218 in downloadLoop (_tor=0x10017288) at transmission.c:747
No locals.
#5  0x2ab0a5c8 in pthread_start_thread () from /opt/lib/libpthread.so.0
No symbol table info available.
#6  0x2adc0f7c in __thread_start () from /opt/lib/libc.so.0
No symbol table info available.
Previous frame inner to this frame (corrupt stack?)
(gdb) where full
#0  0x00412814 in tr_rcCanTransfer (r=0x10003608) at ratecontrol.c:56
No locals.
#1  0x00409350 in tr_peerRead (tor=0x100088a0, peer=0x10268b20) at peer.c:245
        ret = 270762304
#2  0x0040aad0 in tr_peerPulse (tor=0x100088a0) at peer.c:383
        i = 2
        ret = 271089748
        size = 716225076
        peer = (tr_peer_t *) 0x10268b20
#3  0x00406218 in downloadLoop (_tor=0x100088a0) at transmission.c:747
No locals.
#4  0x2ab0a5c8 in pthread_start_thread () from /opt/lib/libpthread.so.0
No symbol table info available.
#5  0x2adc0f7c in __thread_start () from /opt/lib/libc.so.0
No symbol table info available.
Previous frame inner to this frame (corrupt stack?)

The next core does not show problems in ratecontrol:

(gdb) where full
#0  0x2adb8378 in memmove () from /opt/lib/libc.so.0
No symbol table info available.
#1  0x0040ac94 in tr_peerPulse (tor=0x1001cce0) at peermessages.h:94
        i = 3
        ret = 272309748
        size = 69
        peer = (tr_peer_t *) 0x103b1d78
#2  0x00406218 in downloadLoop (_tor=0x1001cce0) at transmission.c:747
No locals.
#3  0x2ab0a5c8 in pthread_start_thread () from /opt/lib/libpthread.so.0
No symbol table info available.
#4  0x2adc0f7c in __thread_start () from /opt/lib/libc.so.0
No symbol table info available.
Previous frame inner to this frame (corrupt stack?)

When I limit only the upload rate and set the download limit to -1, the segmentation fault still occurs.

This GDB was configured as "mipsel-linux-uclibc"...Using host libthread_db library "/opt/lib/libthread_db.so.1".

Core was generated by `transmissiond -p 65534 -w 300 -u 39 -d -1 -i /opt/var/run/transmission.pid /tmp'.
Program terminated with signal 11, Segmentation fault.
(gdb) where full
#0  0x00407a08 in tr_trackerPulse (tc=0x101350c8) at tracker.c:148
        tor = (tr_torrent_t *) 0x12c
        data = 0x2ab0ba34 "\020"
        len = 270
#1  0x00406230 in downloadLoop (_tor=0x1022cb30) at transmission.c:750
No locals.
#2  0x2ab0a5c8 in pthread_start_thread () from /opt/lib/libpthread.so.0
No symbol table info available.
#3  0x2adc0f7c in __thread_start () from /opt/lib/libc.so.0
No symbol table info available.
Previous frame inner to this frame (corrupt stack?)
(gdb) where full
#0  0x2adb7644 in __malloc_consolidate () from /opt/lib/libc.so.0
No symbol table info available.
#1  0x2adb65f4 in malloc () from /opt/lib/libc.so.0
No symbol table info available.
#2  0x2adab17c in open_memstream () from /opt/lib/libc.so.0
No symbol table info available.
#3  0x2ada86c4 in vasprintf () from /opt/lib/libc.so.0
No symbol table info available.
#4  0x2ada861c in asprintf () from /opt/lib/libc.so.0
No symbol table info available.
#5  0x0040d290 in readOrWriteBytes (io=0x0, offset=The value of variable 'offset' is distributed across several
locations, and GDB cannot access its value.

) at inout.c:437
        tor = (tr_torrent_t *) 0x100088a0
        piece = 8256
        begin = 16384
        i = 754
        cur = 16384
        path = 0x0
        file = 4194304
        readOrWrite = 0x2ab116e0 <write>
#6  0x0040dc50 in tr_ioWrite (io=0x1001b6f8, index=2097, begin=1425408, length=16384, buf=0x1034d2d2 "?\177 ") at inout.c:151
        offset = The value of variable 'offset' is distributed across several
locations, and GDB cannot access its value.

Change History (5)

comment:1 Changed 14 years ago by oleo

10:08AM <joshe> I would try running your daemon under valgrind for a while and see if it catches any bogus malloc/free stuff

<oleo> But I must say that if I disable ratecontrol with -1, segfaults never occur! And I mean never.

10:08AM <joshe> or some other malloc debugging library

<oleo> what I've thought is to rewrite ratecontrol with some kinda circular buffer or round robin buffer

<oleo> These segfaults occur randomly from 45 minutes to 4 hours on a busy (rate limiting) bunch of torrents.

<oleo> limiting upload or download only does not help.
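
The circular buffer idea from the log above could look roughly like the sketch below. It keeps a fixed array of time slots and never calls malloc or free while transferring. Everything here is an assumption for illustration: the names `rc_ring_t`, `rcInit`, `rcTransferred`, `rcBytesInWindow` and `rcCanTransfer`, the slot count and the slot width are all invented, the limit is assumed to be in KiB/s with -1 meaning unlimited, and the caller is assumed to pass a millisecond clock. It is not the code in ratecontrol.c and not what [1447] changed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RC_SLOTS   20  /* hypothetical: number of time slots kept */
#define RC_SLOT_MS 100 /* hypothetical: width of one slot in ms */

typedef struct
{
    uint64_t slot;  /* absolute slot number (time in ms / RC_SLOT_MS) */
    uint32_t bytes; /* bytes transferred during that slot */
} rc_slot_t;

typedef struct
{
    int       limit;           /* assumed KiB/s, -1 means unlimited */
    rc_slot_t slots[RC_SLOTS]; /* fixed storage, reused in a circle */
} rc_ring_t;

static void rcInit( rc_ring_t * r, int limit )
{
    assert( r != NULL );
    memset( r, 0, sizeof( *r ) );
    r->limit = limit;
}

/* Record a transfer: reuse the slot for the current time, resetting it
 * first if it still holds data from an earlier lap around the ring. */
static void rcTransferred( rc_ring_t * r, uint64_t nowMs, uint32_t bytes )
{
    uint64_t    s = nowMs / RC_SLOT_MS;
    rc_slot_t * p = &r->slots[s % RC_SLOTS];
    if( p->slot != s )
    {
        p->slot  = s;
        p->bytes = 0;
    }
    p->bytes += bytes;
}

/* Sum the bytes of all slots that fall inside the current window. */
static uint64_t rcBytesInWindow( const rc_ring_t * r, uint64_t nowMs )
{
    uint64_t s = nowMs / RC_SLOT_MS;
    uint64_t total = 0;
    int      i;
    for( i = 0; i < RC_SLOTS; i++ )
    {
        const rc_slot_t * p = &r->slots[i];
        if( p->slot <= s && s - p->slot < RC_SLOTS )
            total += p->bytes;
    }
    return total;
}

/* Nonzero while the window total is still below the budget. */
static int rcCanTransfer( const rc_ring_t * r, uint64_t nowMs )
{
    uint64_t budget;
    if( r->limit < 0 )
        return 1;
    budget = (uint64_t) r->limit * 1024 * RC_SLOTS * RC_SLOT_MS / 1000;
    return rcBytesInWindow( r, nowMs ) < budget;
}
```

Memory use is constant per rate controller, so this would take allocator fragmentation out of the picture. It would not help if the crash comes from memory corruption elsewhere.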

comment:2 Changed 14 years ago by oleo

Discussion at http://transmission.m0k.org/forum/viewtopic.php?p=5348#5348 suggests that disabling ratecontrol also helps on Mac systems.

comment:3 Changed 14 years ago by oleo

  • Keywords (1446) added; (973) removed

With a natively compiled r1446 built with -O0 and libefence.a, I get a segmentation violation when ratecontrol is enabled. Here is some gdb output that shows problems with moving the memory of pending blocks.

[admin@oleo .transmission]$ gdb /opt/bin/transmissiond core.19900 
GNU gdb 6.5
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "mipsel-linux-uclibc"...Using host libthread_db library "/opt/lib/libthread_db.so.1".

Reading symbols from /opt/lib/libpthread.so.0...done.
Loaded symbols for /opt/lib/libpthread.so.0
Reading symbols from /opt/lib/libcrypto.so.0.9.7...done.
Loaded symbols for /opt/lib/libcrypto.so.0.9.7
Reading symbols from /opt/lib/libm.so.0...done.
Loaded symbols for /opt/lib/libm.so.0
Reading symbols from /opt/lib/libc.so.0...done.
Loaded symbols for /opt/lib/libc.so.0
Reading symbols from /opt/lib/libdl.so.0...done.
Loaded symbols for /opt/lib/libdl.so.0
Reading symbols from /opt/lib/libgcc_s.so.1...done.
Loaded symbols for /opt/lib/libgcc_s.so.1
Reading symbols from /opt/lib/ld-uClibc.so.0...done.
Loaded symbols for /opt/lib/ld-uClibc.so.0
Core was generated by `transmissiond -p 65534 -w 300 -u 25 -d 120 -i /opt/var/run/transmission.pid /tm'.
Program terminated with signal 11, Segmentation fault.
#0  0x2acbe448 in memmove () from /opt/lib/libc.so.0
(gdb) set heuristic-fence-post 30000
(gdb) where full
#0  0x2acbe448 in memmove () from /opt/lib/libc.so.0
No symbol table info available.
#1  0x00418a2c in blockPending (tor=0x2b1d9b58, peer=0x2b840b68, size=0x7efffc7c) at peermessages.h:105
        p = (uint8_t *) 0x2b840bec ""
        r = (tr_request_t *) 0x2b844ce4
#2  0x0041d83c in tr_peerPulse (peer=0x2b840b68) at peer.c:399
        tor = (tr_torrent_t *) 0x2b1d9b58
        ret = 13
        size = 13
        p = (uint8_t *) 0x0
#3  0x00413af8 in downloadLoop (_tor=0x2b1d9b58) at torrent.c:682
        tor = (tr_torrent_t *) 0x2b1d9b58
        i = 3
        ret = 0
        peerCount = 0
        peerCompact = (uint8_t *) 0x0
        peer = (tr_peer_t *) 0x2b840b68
#4  0x0040d404 in ThreadFunc (_t=0x2b1daccc) at platform.c:208
        t = (tr_thread_t *) 0x2b1daccc
#5  0x2aadc274 in __pthread_manager_sighandler () from /opt/lib/libpthread.so.0
No symbol table info available.
#6  0x7efffe20 in ?? ()
No symbol table info available.
Cannot access memory at address 0x7effbffc
(gdb) quit

comment:4 Changed 14 years ago by oleo

  • Resolution set to fixed
  • Status changed from new to closed

Fixed with [1447].

Titer said it was not ratecontrol related? Anyway, after two days without a crash I can see that it helped.

comment:5 Changed 10 years ago by jordan

  • Component changed from Transmission to libtransmission