Opened 12 years ago

Closed 12 years ago

#2383 closed Enhancement (fixed)

Faster verification for local data with small piece size

Reported by: mortennorby Owned by: charles
Priority: Normal Milestone: 1.75
Component: libtransmission Version: 1.74
Severity: Normal Keywords:
Cc:

Description

Currently, verification of local data is done piece by piece, and at the end of each piece, the thread sleeps for 50ms in order to allow other process access to the disk.

In practice there is nearly no time penalty from the read operation (and the verification of the hashes), meaning that verification of local data happens at a constant rate of 20 pieces per second, regardless of the size of the pieces. This can lead to a significant difference in execution time for torrents with 256kB pieces compared to, say, 4MB pieces.

One way to avoid this problem is to calculate the sleep (or yield) period based on a constant data rate. If read operations on the disk are somewhat more expensive (to move the head), the algorithm can lower the target data rate for small pieces sizes and raise it for large ones.

Attached is a patch that does exactly that if the project is interested.

It aims at a data rate of 100MB/sec for 2MB pieces, down to approx. half that for 128/256kB pieces. By working on the overall data rate and/or the acceleration curve, these values can of course be changed.

Attachments (2)

yield.diff (1.8 KB) - added by mortennorby 12 years ago.
yield2.diff (2.3 KB) - added by charles 12 years ago.

Download all attachments as: .zip

Change History (9)

Changed 12 years ago by mortennorby

comment:1 Changed 12 years ago by charles

This is definitely an improvement over the way I implemented it. Thanks for this patch, and for bringing the problem to my attention.

Thinking this over, even this second approach won't discriminate between fast drives and slower ones. Maybe it would be better to just sleep every second?

comment:2 Changed 12 years ago by mortennorby

Sounds like a great solution, and definitely simpler than mine.

Changed 12 years ago by charles

comment:3 Changed 12 years ago by charles

What do you think of the implementation in yield2.diff?

comment:4 Changed 12 years ago by charles

  • Component changed from Transmission to libtransmission
  • Owner set to charles
  • Status changed from new to assigned

comment:5 Changed 12 years ago by mortennorby

Looks great to me. Simple and clean.

comment:6 Changed 12 years ago by mortennorby

Just a quick comment from an outside reader:

In the interest of making the code more self-documenting, have you considered something like this for line 136:

if( now - lastSleptAt >= 1 ) {

This way of putting it would maybe express the calculation of time elapsed more explicitly to the casual reader, and should the code ever transition to a millisecs-based library, the logic will still hold.

However, this is all just semantics, and it will cost a couple of CPU cycles. My take is, the CPU got billions of them and won't even notice in a piece of code that is otherwise I/O-bound, so stealing a few for the sake of good is not a crime.

As a last resort, one could also simply add a comment, but that would be chickening out.

comment:7 Changed 12 years ago by charles

  • Milestone changed from None Set to 1.75
  • Resolution set to fixed
  • Status changed from assigned to closed

I see your point, but IMO the "MSEC_TO_SLEEP_PER_SECOND_DURING_VERIFY" makes it pretty explicit that this is called on a per-second basis.

yield2.diff applied to trunk in r9044 for 1.75

Note: See TracTickets for help on using tickets.