Opened 12 years ago

Closed 12 years ago

#2705 closed Enhancement (fixed)

speeding up request queue management

Reported by: sadface Owned by: charles
Priority: Normal Milestone: 1.80
Component: libtransmission Version: 1.76+
Severity: Normal Keywords:
Cc:

Description

The management of the pieces list shows scalability issues due to frequent sorting. For example, with the -et option enabled, this sorting wastes around 40% of the daemon's overall CPU time while downloading a 3000-piece file (1.1 GB) on my AMD K8 Linux system.

The following patch keeps the list always sorted by weight and avoids qsort(). More performance could still be squeezed out, but this code seems simple and fast enough to me: maybe below 1% of CPU time for the file above.

Attachments (2)

pieces-always-sorted-by-weight.diff (8.7 KB) - added by sadface 12 years ago.
pieces-always-sorted-by-weight-v1.1.diff (9.0 KB) - added by sadface 12 years ago.
up-to-date, small improvements


Change History (7)

Changed 12 years ago by sadface

comment:1 Changed 12 years ago by charles

  • Milestone changed from None Set to 1.80
  • Status changed from new to assigned

Looks good, thanks!

comment:2 follow-up: Changed 12 years ago by charles

Okay, after reviewing this diff a little more, it still looks like a good idea, both in terms of speed and code simplification. I do have two questions, though:

  • What was the stupid compiler error that made the reverse search necessary?
  • Why did you replace the merge code with adaptive insertion? The merge algorithm should have ( pieceCount - 1 ) + ( i log i ) comparisons where i is the number of newly-requested pieces, while the adaptive insert is closer to ( i log pieceCount ) plus i memmoves. In nearly all cases i will be much smaller than pieceCount.

comment:3 in reply to: ↑ 2 Changed 12 years ago by sadface

Happy new year!

Replying to charles:

What was the stupid compiler error that made the reverse search necessary?

There were no errors at all. I meant that the other implementations I tested generated code with extra branches when they shouldn't have. Feel free to delete that comment.
The reverse search was chosen based on empirical analysis. Better results can be obtained by checking positions 0 to ~5 before doing the reverse search, but that seems like too specific a tweak.

Why did you replace the merge code with adaptive insertion? The merge algorithm should have ( pieceCount - 1 ) + ( i log i ) comparisons where i is the number of newly-requested pieces, while the adaptive insert is closer to ( i log pieceCount ) plus i memmoves. In nearly all cases i will be much smaller than pieceCount.

To merge's cost I would also add qsort()'s permutations and the need to allocate, fill, and deallocate a new array on every call.

comment:4 Changed 12 years ago by KyleK

For what it's worth: I applied the patch and also see a great reduction in CPU usage on my NAS device.

Great work!

Changed 12 years ago by sadface

up-to-date, small improvements

comment:5 Changed 12 years ago by charles

  • Resolution set to fixed
  • Status changed from assigned to closed

Committed -v1.1.diff to trunk for 1.80 in r9920

Thanks for another good patch! Keep 'em coming...
