Opened 10 years ago

Closed 10 years ago

#4477 closed Bug (duplicate)

Ineffective lock usage in bandwidthPulse

Reported by: Zuljin
Owned by: jordan
Priority: Normal
Milestone: None Set
Component: libtransmission
Version: 2.13
Severity: Normal
Keywords: lock bandwidth
Cc:

Description

Hi,

I'm writing a simple Android Transmission client and I have noticed a non-trivial issue in the bandwidthPulse function of libtransmission. I haven't analyzed the bandwidthPulse code deeply, but I think I understand more or less what it does. It first takes the session tr_lock, then goes through all peers and tries to read all of the bytes from their receive buffers. Depending on how many bytes it can read from each receive buffer, it prioritizes peers so that the fastest ones are not starved. While reading bytes from the receive buffers it is possible that the file buffer fills up, in which case libtransmission writes the buffer out to the file on disk.
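Roughly, I imagine the flow looking like this (a hypothetical sketch with placeholder names such as peer_read_bytes and cache_flush_to_disk, not the actual libtransmission code):

    /* Hypothetical sketch of the control flow described above --
     * placeholder names, not real libtransmission symbols. */
    #include <pthread.h>
    #include <stddef.h>

    typedef struct { pthread_mutex_t lock; } session_t;
    typedef struct { int socket_fd; } peer_t;

    /* stubs standing in for the real peer / cache code */
    extern size_t peer_read_bytes(peer_t *peer, void *buf, size_t len);
    extern int    cache_is_full(void);
    extern void   cache_flush_to_disk(void); /* synchronous disk write */

    static void bandwidth_pulse(session_t *session, peer_t **peers, size_t n_peers)
    {
        pthread_mutex_lock(&session->lock);      /* session stays locked...         */

        for (size_t i = 0; i < n_peers; ++i)
        {
            char buf[4096];
            while (peer_read_bytes(peers[i], buf, sizeof(buf)) > 0)
            {
                if (cache_is_full())
                    cache_flush_to_disk();       /* ...even across slow disk writes */
            }
        }

        pthread_mutex_unlock(&session->lock);    /* only released at the very end   */
    }

If that picture is accurate, every disk flush happens while the session lock is held.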

Everything seems OK, and this works correctly on most devices, but let's imagine a device with slow storage (an Android phone with an SD card, a Linux router with a USB stick, etc.) and a fast connection (>15 Mbps). On such devices the write speed is only barely higher than the download speed, or even lower.

Writing a 2 MB file buffer can take around 1 second, and during that write libtransmission can download enough data to fill another 2 MB buffer. So when one write finishes, data for another 1-second write is already waiting in the peer receive buffers. Such a "looped" sequence of writes can keep the peer prioritization in bandwidthPulse busy for a really long time. For me, on a Motorola Milestone 2 phone, a single execution of bandwidthPulse could easily take 5 minutes.
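The arithmetic behind this, with illustrative numbers only (the ~2 MB/s write speed is a guess for slow storage, not a measurement):

    /* Back-of-envelope arithmetic with illustrative numbers (not measured). */
    #include <stdio.h>

    int main(void)
    {
        const double link_MBps  = 15.0 / 8.0; /* >15 Mbps is roughly 1.9 MB/s      */
        const double write_MBps = 2.0;        /* assumed SD card / USB stick speed */
        const double buffer_MB  = 2.0;        /* file buffer size mentioned above  */

        double flush_s = buffer_MB / write_MBps; /* ~1.0 s per flush                */
        double arrived = flush_s * link_MBps;    /* ~1.9 MB downloaded in that time */

        printf("flush ~%.1f s, ~%.1f MB arrive during it\n", flush_s, arrived);
        return 0; /* the flush loop never gets ahead of the download */
    }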

And the worst thing is that the session is locked at the beginning of bandwidthPulse, so other parts of the library and the GUI cannot access session data for those 5 minutes. The GUI, for example, is frozen during that time and appears to be dead. In my opinion the session lock usage in this function should be optimized, but that will be very hard to do without completely rewriting the prioritization algorithm, or even rethinking session lock usage in the whole library.

What do you think about this issue? Have you noticed it before on embedded devices? Do you have any ideas on how to solve it?

P.S. Most of my analysis of this issue was done on Transmission 2.13, but I think it also appears in the latest version.

P.S.2 Sorry for my English. It is not my native language. :)

Change History (5)

comment:1 Changed 10 years ago by livings124

  • Version changed from 2.33 to 2.13

Would it be possible to do an analysis of the current code? Thanks for looking this stuff over!

comment:2 Changed 10 years ago by Zuljin

I have now checked this issue with the 2.33 library as well, and I can confirm that it still exists. Execution of

tr_bandwidthAllocate( &mgr->session->bandwidth, TR_DOWN, BANDWIDTH_PERIOD_MSEC );

inside bandwidthPulse took 305 seconds. This long lock started just a few seconds after I began downloading one of the Arch Linux *.iso images from http://www.archlinux.org/download/.
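(One simple way to get such a measurement is to wrap the quoted call with POSIX clock_gettime(); this is illustrative throwaway instrumentation, not code that ships with Transmission.)

    /* illustrative timing instrumentation; the includes go at the top of the file */
    #include <stdio.h>
    #include <time.h>

    /* ...around the quoted call inside bandwidthPulse(): */
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    tr_bandwidthAllocate( &mgr->session->bandwidth, TR_DOWN, BANDWIDTH_PERIOD_MSEC );

    clock_gettime(CLOCK_MONOTONIC, &t1);
    fprintf(stderr, "tr_bandwidthAllocate took %.3f s\n",
            (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);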

I thought of another idea. When does allocation of the file on disk start? When the torrent is started, or after the first piece is received? Maybe this takes so long because libtransmission started allocating the file during bandwidthPulse? Is that possible?

comment:3 Changed 10 years ago by jordan

Yes, file preallocation will block the libtransmission thread, and it can be triggered by flushing the cache to disk, which can happen in bandwidthPulse().

Preallocation on a modern filesystem such as ext4, btrfs, or zfs should be nearly instantaneous.
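For comparison, full preallocation on POSIX systems ultimately comes down to something like the call below (a minimal sketch, not necessarily the exact path libtransmission takes):

    /* Minimal preallocation sketch; not necessarily libtransmission's exact code path. */
    #include <fcntl.h>
    #include <sys/types.h>

    static int preallocate_full(int fd, off_t length)
    {
        /* Near-instant on extent-based filesystems (ext4, btrfs, xfs).
         * On filesystems without native support (e.g. FAT32), the C library
         * emulates it by touching every block, which is the slow,
         * thread-blocking case discussed in this ticket. */
        return posix_fallocate(fd, 0, length);
    }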

At any rate, delegating file IO to another thread so that the libtransmission thread doesn't block is already discussed in ticket #1753. This ticket seems like a duplicate of #1753, but I'll hold off on closing it until Zuljin has a chance to weigh in.

comment:4 Changed 10 years ago by Zuljin

I think you can mark this as a duplicate of #1753. My Android phone uses FAT32 on the sdcard partition, and this filesystem has issues with preallocation. The other issues I mentioned could be resolved by using asynchronous write operations, so this is exactly what is proposed in #1753.
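For illustration, the general shape of that fix (a dedicated IO thread with a write queue, so the libtransmission thread only enqueues blocks and never waits for the disk) could look roughly like the pthread sketch below; these are placeholder names, not code from #1753:

    /* Illustrative write-queue worker thread; placeholder names, error handling omitted. */
    #include <pthread.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    struct write_job { int fd; off_t offset; size_t len; char *data; struct write_job *next; };

    static struct write_job *queue_head = NULL, *queue_tail = NULL;
    static pthread_mutex_t   queue_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t    queue_cond = PTHREAD_COND_INITIALIZER;

    /* Called from the libtransmission thread: cheap, never touches the disk. */
    static void enqueue_write(int fd, off_t offset, const void *data, size_t len)
    {
        struct write_job *job = malloc(sizeof *job);
        job->fd = fd; job->offset = offset; job->len = len; job->next = NULL;
        job->data = malloc(len);
        memcpy(job->data, data, len);

        pthread_mutex_lock(&queue_lock);
        if (queue_tail != NULL) queue_tail->next = job; else queue_head = job;
        queue_tail = job;
        pthread_cond_signal(&queue_cond);
        pthread_mutex_unlock(&queue_lock);
    }

    /* Runs on a separate IO thread: the slow pwrite() happens here. */
    static void *io_worker(void *arg)
    {
        (void)arg;
        for (;;)
        {
            pthread_mutex_lock(&queue_lock);
            while (queue_head == NULL)
                pthread_cond_wait(&queue_cond, &queue_lock);
            struct write_job *job = queue_head;
            queue_head = job->next;
            if (queue_head == NULL) queue_tail = NULL;
            pthread_mutex_unlock(&queue_lock);

            pwrite(job->fd, job->data, job->len, job->offset);
            free(job->data);
            free(job);
        }
        return NULL;
    }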

comment:5 Changed 10 years ago by jordan

  • Resolution set to duplicate
  • Status changed from new to closed