Opened 13 years ago
Closed 13 years ago
#3427 closed Enhancement (fixed)
PATCH Use Shortest-Job-First scheduling for verifying local data
Reported by: | wateenellende | Owned by: | charles |
---|---|---|---|
Priority: | Normal | Milestone: | 2.10 |
Component: | libtransmission | Version: | 1.93 |
Severity: | Normal | Keywords: | |
Cc: |
Description
Hi,
The problem: I just started transmission on a slow machine. It will first verify the local data on the torrents - there are about 10-16 open torrents. 1) Each torrent takes a fairly long time to verify as they're all large files. 2) A torrent will only continue downloading once it has been verified. 3) To complete files as soon a possible, they should start downloading as soon as possible.
Currently, transmission seems to select files at random to verify and start. This is not optimal.
The solution: Use Shortest-Job-First scheduling. This means that the first torrent to be verified out of all that await verification, should be the one that takes least time to verify. This is probably the torrent that has the least amount of data downloaded.
The attached patch implements this, and should allow transmission to start downloading much faster.
Details: This could be implemented in (at least) 2 ways, 1) sort the list by nr-of-bytes-completed in verifyThreadFunc( ... ) in verify.c; 2) Keep the list sorted by inserting in the right position tr_verifyAdd( ... )in verify.c
The second option is more elegant, it requires less calculation. This would need modifications to list.h/list.c, to allow the ordered lists. A function
void tr_list_insert_ordered( tr_list ** list, void * data, TrListCompareFunc compare_func )
is added. Then, a compare function 'compareVerifyBySizeNow' for compares by downloaded data size. Last, the call 'tr_list_append( &verifyList, node );' in verify.c (315) is replaced by 'tr_list_insert_ordered( &verifyList, node, compareVerifyBySizeNow );
This patch compiles but it is NOT TESTED, I don't know whether I should use 'sizeNow' or 'tr_cpHaveValid()' to get the current number of bytes, and neither am i sure that these fields are valid /before/ the torrent is verified.
The description for tr_cpHaveValid() and tr_cpSizeWhenDone() is identical, maybe that should be fixed.
Note that the behaviour was observed with 1.93, but the patch is against 2.01.
Best,
Fokko Beekhof
Attachments (5)
Change History (15)
Changed 13 years ago by wateenellende
comment:1 Changed 13 years ago by charles
- Type changed from Bug to Enhancement
Changed 13 years ago by wateenellende
Previous patches had a bug in tr_list_insert_ordered(...) in list.c, handling of empty list. Fixed in this patch.
comment:2 Changed 13 years ago by sadface
tr_list_insert_ordered() is still bugged. You should have recovered a similar function removed some time ago: https://trac.transmissionbt.com/browser/trunk/libtransmission/list.c?rev=7404#L80.
comment:3 Changed 13 years ago by wateenellende
Here is a new patch, based on sadface's recommendation. I edited the patch by hand, hope it applies. If you spot another bug in it, please point it out so that I don't have to go hunting for it myself again, or provide a new patch yourself...
Additionally, could someone confirm that
1) 'sizeNow' is the correct field to use;
2) its value is meaningful before the data is verified ?
Thanks in advance,
wateenellende
comment:4 Changed 13 years ago by charles
I don't think sizeNow is the right thing to use until after verification's done.
tr_cpSizeWhenDone() isn't perfect either, but it's probably the right thing to use here, since it doesn't rely on the existing file size.
Alternately we could stat() the torrent's files on disk and take the total current file sizes, but that seems like overkill.
comment:5 Changed 13 years ago by wateenellende
I just tried the latest patch, and it's functioning correctly. I applied the patch by hand to Ubuntu's latest sources, and it's doing exactly what it is supposed to: it will verify a torrent with only a bit of data completed before a torrent with more data completed, even if the total size of the second torrent is smaller or if the other was added earlier to the list.
So I guess the patch is ok as it is now ...
comment:6 Changed 13 years ago by sadface
wateenellende, this time you forgot the following line: node->prev = l->prev;
comment:7 Changed 13 years ago by charles
I'm not sure that I understand why "assert(l->prev);" is valid in this diff. What if l points to the first node in the list?
Edit: ah, nvm, I see it's handled in the clause above. :)
comment:8 Changed 13 years ago by charles
- Milestone changed from None Set to 2.10
- Status changed from new to assigned
- Summary changed from PATCH Consider Shortest-Job-First scheduling for verifying local data on startup to PATCH Use Shortest-Job-First scheduling for verifying local data
It occurs to me that we're already walking through the torrent's files, calling stat() on each one, in torrentHasAnyLocalData(). In fact, that function is just a special case version of the ideal for here. So, instead of being overkill as I suggested before, it's really no extra cost.
comment:9 Changed 13 years ago by charles
r11023 libtransmission/ (list.c list.h verify.c): (trunk libT) #3427 "use shortest-job-first scheduling for verifying local data" -- patch from wateenellende and sadface
This is committed to the nightlies to get testing from other users too. If I've messed anything up in the commit, please let me know and I'll fix it :)
comment:10 Changed 13 years ago by charles
- Resolution set to fixed
- Status changed from assigned to closed
patch implementing shortest-job-first scheduling for verification on startup