Opened 10 years ago

Closed 10 years ago

#4445 closed Bug (fixed)

transmission only downloads partially from webseeds with redirections

Reported by: T_X Owned by: jordan
Priority: Normal Milestone: 2.40
Component: Transmission Version: 2.33
Severity: Normal Keywords: webseed http redirection 307
Cc:

Description

When using http 307 redirections on a webseed, transmission (trunk @ r12751) is not able to complete downloading all parts/files.

It seems rather random which parts finish, removing and readding such a torrent to transmission seems to result in different parts being successfully downloaded.

To reproduce the issue I'm attaching two torrent files, both responsible for the same content:

*-webseed-redirected.torrent uses http://x-realis.dyndns.org/jamendo2/45452 as a webseed, this webseed redirects any file request to the according files on the fast jamendo.com server.

*-webseed-not-redirected.torrent uses http://x-realis.dyndns.org/jamendo3/45452 as a webseed (jamendo3/ instead of jamendo2/), it serves a mirrored copy (warning, slow!).

The apache2 server's redirections for jamendo2/ seem to work, I verified it with the attached wget-dl.sh and the following steps:

  • Added *-webseed-redirected.torrent to transmission, started the dowload
  • Waited a couple of seconds until it stays (or loops around) 5-10% and does not seem to make any further progress.
  • Paused the torrent.
  • copied wget-dl.sh into the torrent's direction.
  • ./wget-dl.sh
  • Hit "Verify Local Data" in transmission
  • Waited until transmission verified that all content is now there.

Torrent files were created with:

 mktorrent -a http://tracker.jamendo.com/announce.php \
-c "http://www.jamendo.com/ : Free music" \
-w "http://x-realis.dyndns.org/jamendo2/45452" \
/tmp/jamendo2/Lorenzo\'s\ Music\ -\ BAZAR\ --\ Jamendo\ -\ OGG\ Vorbis\ q3\ -\ 2011.08.26\ \[www.jamendo.com\]

and:

mktorrent -a http://tracker.jamendo.com/announce.php \
-c "http://www.jamendo.com/ : Free music" \
-w "http://x-realis.dyndns.org/jamendo3/45452" \
/tmp/jamendo2/Lorenzo\'s\ Music\ -\ BAZAR\ --\ Jamendo\ -\ OGG\ Vorbis\ q3\ -\ 2011.08.26\ \[www.jamendo.com\]

Attachments (8)

Lorenzo's Music - BAZAR -- Jamendo - OGG Vorbis q3 - 2011.08.26 [www.jamendo.com]-webseed-not-redirected.torrent (3.6 KB) - added by T_X 10 years ago.
torrent file using a websed (no redirection, local mirror)
Lorenzo's Music - BAZAR -- Jamendo - OGG Vorbis q3 - 2011.08.26 [www.jamendo.com]-webseed-redirected.torrent (3.6 KB) - added by T_X 10 years ago.
torrent file using a websed _with_ http redirection 307
wget-dl.sh (3.2 KB) - added by T_X 10 years ago.
script to verify that the http redirection is working
0001-debugging-stuff.patch (4.4 KB) - added by T_X 10 years ago.
0002-experimental-changes.patch (5.6 KB) - added by T_X 10 years ago.
0003-debugging-stuff-2.patch (4.1 KB) - added by T_X 10 years ago.
logfile (63.2 KB) - added by T_X 10 years ago.
webseed_wrong_file_index_calculation.patch (1.5 KB) - added by alexat 10 years ago.

Download all attachments as: .zip

Change History (22)

Changed 10 years ago by T_X

torrent file using a websed (no redirection, local mirror)

Changed 10 years ago by T_X

torrent file using a websed _with_ http redirection 307

Changed 10 years ago by T_X

script to verify that the http redirection is working

comment:1 Changed 10 years ago by T_X

I've made a capture of the communication with the webseed. I added the torrent file in transmission-gtk and before hitting the "Ok" button, so before any dowload could have started, I only selected the first file, "01-...ogg".

Interestingly the capture shows that packet 1139, created by transmission without any redirections involved yet, does an HTTP GET for the second file although it has not been selected for download. Later the HTTP GETs loop around 3 distinct byte ranges of this second file. The first file has never been finished though, and during this looping the percentage of the first file goes up and down again in the GUI.

When pausing, the first file download percentage is at 1.75 of 2.5MiB.

comment:2 Changed 10 years ago by T_X

And here's a capture of the working download of the same, first file (01-...ogg) but without the redirection.

An (undesired?) HTTP GET for the second file seems to be present here again, though it doesn't seem to cause any issues. The first file still finishes with 100% and any other file remains idling at 0% as intended.

comment:4 Changed 10 years ago by jordan

I'm wondering if this is a regression caused by https://trac.transmissionbt.com/ticket/4338 ...

Changed 10 years ago by T_X

Changed 10 years ago by T_X

Changed 10 years ago by T_X

Changed 10 years ago by T_X

comment:5 Changed 10 years ago by T_X

With patch 001 just added some uggly extra debugging stuff. With 002 I actually tried to fix the issue of the wrong file_index in connection_succeded(), by just using the same way as used by task_request_next_chunk().

Together with r12848 this worked way better already! Instead of 66% I was now able to get all but the last piece.

The debugging stuff from 001 showed, that in connection_succeded() with 002, urls[file_index] was NULL for file 11 / piece 107 - as the calculated file_index should have been 10 instead, according to the debug output in the previous on_content_changed().

So with patch 003 I added even more debug output and tried to determine how many Bytes were too many in connection_succeeded() -> webseed_find_file_location() -> step_piece_offset and tried to hackishly work around it. And it turns out it's just off by one single Byte! (see attached 'logfile')

So with 002 and 003 + r12848 I was able to download the complete torrent via http redirections for the first time!!

But of course, 003 (and maybe even 002) are not the way to go and I actually have no clue why the offset is off by one... Anyone having an idea?

EDIT:

Replying to jordan:

I'm wondering if this is a regression caused by https://trac.transmissionbt.com/ticket/4338 ...

Yes, looks like a regression. E.g. just removing https://trac.transmissionbt.com/browser/trunk/libtransmission/webseed.c#L225 to https://trac.transmissionbt.com/browser/trunk/libtransmission/webseed.c#L235 fixes the issue (but also removes the 2nd feature of #4338, of course).

Last edited 10 years ago by T_X (previous) (diff)

comment:6 follow-up: Changed 10 years ago by alexat

The problem probably is that in connection_succeeded() neither blocks_done nor the contents of the evbuffer are taken account of. The reason why I introduced this tr_http_info thingie in the first place was that the task structure actually belongs to the other thread and thus shouldn't be accessed from connection_succeeded(). I'll take a look at the bug tomorrow, don't have a system for it currently.

Btw back then when I wrote the patch I didn't relly understand what step_piece is good for. Can you think of any situation in which step_piece is different from piece_index?

Changed 10 years ago by alexat

comment:7 in reply to: ↑ 6 Changed 10 years ago by T_X

Hi alexat!

Nice hearing from you and thanks for the great patch back then. Despite the redirection part it works really well for me so far.

Replying to alexat:

The problem probably is that in connection_succeeded() neither blocks_done nor the contents of the evbuffer are taken account of. The reason why I introduced this tr_http_info thingie in the first place was that the task structure actually belongs to the other thread and thus shouldn't be accessed from connection_succeeded(). I'll take a look at the bug tomorrow, don't have a system for it currently.

Hmm, I guess you're right, the experimental patch I had made to move these offset calculations from task_request_next_chunk() to its own function, webseed_find_file_location(), is probably not threadsafe... evbuffer_get_length( t->content ) might change due to the other parallel thread moving the evbuffered stuff to disk...

After reading the part in task_request_next_chunk() over and over again, I think I'm at least getting why it is off by a single byte. When using that code in connection_succeeded(), it is not calculating the file_index of the current webseed_task, but due to the addition of all the just received data, we end up at the offset of the next chunk and therefore possibly the next file(_index). That's why it's just off by 1 Byte...

Btw back then when I wrote the patch I didn't relly understand what step_piece is good for. Can you think of any situation in which step_piece is different from piece_index?

Yeah, that part also took me some time to get. I've been reading those lines again and again... I think pretty much due to the thing described above, that we are actually getting file_index of the next and not the current chunk. For instance for a torrent file with two file and two pieces like this, step_piece and piece_index could differ (at least in task_request_next_chunk(), not from connection_succeeded() though):

|-----piece 0 ---|-----piece 1 ---|
| file 1 |     file 2             |

I think when we want to request the first chunk of piece 1, then the piece_index could still be 0, but for the next chunk (= the next step?) the index / step_piece would be 1?

comment:8 Changed 10 years ago by alexat

Thank you for the clue that there was only one byte too much using your second patch. That's because task_request_next_chunk() calculates the offset of the next byte that will be requested, but here we need at most the last byte that was received to make sure it's within the current file.

Could you try this patch? On my system the download goes straight from 0 too 100% with it.

comment:9 Changed 10 years ago by T_X

Ups, didn't see your patch at first. I tried it and yes, it works for me too!

At first I was wondering whether it might be better to use '- info->n_added' instead of '- 1' in case we might not have received any data. But with the check in the beginning of on_content_changed() ("if( info->n_added <= 0 ) return;") and the comment you've added it should be quite clear and safe, so nevermind :).

I created a torrent with 261 files of various sizes, 232MB in total, and with various piece sizes (2^15, 2^18, 2^28). They all finished successfully with redirections with your patch! And no checksum failures in the logs.

Thanks!

comment:10 Changed 10 years ago by alexat

Perfect :)

comment:11 Changed 10 years ago by jordan

alexat: I'm not positive, but I *think* this function could calculate the wrong value in task->blocks_done * task->block_size if the torrent's final block is one of those blocks_done. The final block's size is usually smaller than block_size.

As a second point, I don't see why we're delegating connection_succeeded() back to the libevent thread. connection_succeeded() looks fairly inexpensive -- it's probably cheaper than the overhead of shunting the work between threads. :)

comment:12 Changed 10 years ago by jordan

After re-reading the code, my complaints in comment:11 don't hold up. At this point in the code, the last block won't ever be part of the blocks counted in blocks_done -- so its different size shouldn't matter. Also, connection_succeeded() does need to be in the libevent thread to safeguard against a pathological case of the "tor" pointer being freed by tr_torrentClose() in the libevent thread.

comment:13 Changed 10 years ago by jordan

  • Milestone changed from None Set to 2.40
  • Owner set to jordan
  • Status changed from new to assigned
  • Version changed from 2.33+ to 2.33

comment:14 Changed 10 years ago by jordan

  • Resolution set to fixed
  • Status changed from assigned to closed

Fixed in r12855 from alexat's patch.

Note: See TracTickets for help on using tickets.