Opened 8 years ago

Closed 6 years ago

#1521 closed Enhancement (fixed)

memory cache to reduce disk IO

Reported by: mmazur
Owned by: charles
Priority: Normal
Milestone: 2.10
Component: Transmission
Version: 1.70
Severity: Normal
Keywords:
Cc: ismail@…, colrol@…

Description

On a freshly installed Ubuntu 8.10, the only visible sign of activity on an otherwise idle laptop is the disk LED flashing every two seconds. iotop tells me that the only offender is Transmission.

Transmission should be more intelligent about when it does reads and writes. I don't know how reads are handled (I'm hoping a reasonable amount of data gets cached on every read), but the writes are obviously suboptimal. With an average download speed of a few tens of kB/s, I shouldn't be seeing disk writes more often than, say, once per minute. It just doesn't make sense to write so little data at a time.

Imho there should be some kind of reasonably-sized buffer (for current disks, probably at least a few megs) that gets synced to disk only when it is full or when a fixed amount of time has passed since the last sync (a minute sounds just about right to me).
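
For illustration, a minimal sketch of the flush policy being proposed -- buffer incoming blocks in memory and write them out only when the buffer fills or a timeout expires. All names here are hypothetical, and a real implementation would also have to track file offsets per block:

    /* sketch: flush when the buffer is full or FLUSH_INTERVAL has elapsed */
    #include <string.h>
    #include <time.h>

    #define BUF_SIZE       (4 * 1024 * 1024)  /* "a few megs" */
    #define FLUSH_INTERVAL 60                 /* seconds */

    typedef struct {
        char   data[BUF_SIZE];
        size_t used;
        time_t last_flush;
    } write_buffer;

    static void flush_to_disk(write_buffer *b) {
        /* ...pwrite() the buffered data to the right offsets here... */
        b->used = 0;
        b->last_flush = time(NULL);
    }

    /* assumes len <= BUF_SIZE (bittorrent blocks are only 16 KiB) */
    static void buffered_write(write_buffer *b, const void *p, size_t len) {
        if (b->used + len > BUF_SIZE)
            flush_to_disk(b);                 /* full: sync now */
        memcpy(b->data + b->used, p, len);
        b->used += len;
        if (time(NULL) - b->last_flush >= FLUSH_INTERVAL)
            flush_to_disk(b);                 /* stale: sync on timeout */
    }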

I'm quite certain that this would (a) reduce system load when the disk is otherwise busy, since sporadic but bigger writes are easier to handle, and, more importantly, (b) reduce power consumption, since I'm quite certain that an idly spinning disk uses less power than a busy one (that last point is why I've marked this bug as major severity; laptops are a lot more sensitive to this issue).

Personal side note: my gf asked me a week ago to switch her Vista machine to a Linux system since, among other things (like general sluggishness), Vista had constant disk activity without an obvious cause. I hope the above is just a quick patch away, because it was kind of a downer to find Ubuntu behaving more or less the same way :) (albeit with a known cause, thanks to the author of iotop, and a way to file a bug :)

Attachments (12)

disk-cache.patch (21.8 KB) - added by jch 7 years ago.
Experimental write cache -- do not commit yet
writecache.patch (16.4 KB) - added by jch 7 years ago.
Write cache, doesn't include #2551
writecache.2.patch (16.6 KB) - added by jch 7 years ago.
Refreshed patch, now that Charles has committed #2551.
writecache.3.patch (17.6 KB) - added by jch 6 years ago.
writecache.4.patch (16.5 KB) - added by charles 6 years ago.
minor changes from writecache.3: silence compiler warnings, make a couple of fields private
block-cache.diff (17.2 KB) - added by charles 6 years ago.
experimental block cache
block-cache-2.diff (7.3 KB) - added by charles 6 years ago.
revision of previous patch, which fixes the previous diff's "FIXME" for prefetching. The new version checks the block cache before prefetching from disk.
block-cache-3.diff (24.8 KB) - added by charles 6 years ago.
Add C and RPC API for changing cache size. Add settings key to initialize from settings.json.
block-cache-rc1.diff (32.4 KB) - added by charles 6 years ago.
Add support to transmission-remote; code cleanup; replace printf messages with tr_dbg calls
libt_fixQueue.patch (6.4 KB) - added by Longinus00 6 years ago.
experimental
libt_fixQueue.r2.patch (6.0 KB) - added by charles 6 years ago.
sync libt_fixQueue.patch to trunk
libt_fixCache.patch (8.9 KB) - added by Longinus00 6 years ago.
version 3 - now with some crash protection ( still need to rename cache->maxBlocks to cache->max_blocks )

Change History (75)

comment:1 Changed 8 years ago by charles

  • Severity changed from Major to Normal

comment:2 Changed 8 years ago by kysucix

imho it's a good idea, but we should also take care of low-memory systems (e.g. openwrt), where a few MB can be a big part of the available RAM. Maybe just a compile-time define to enable it.
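
For example (purely illustrative -- not actual Transmission code), the default cache size could be capped at build time for low-memory targets:

    /* hypothetical compile-time switch for low-memory builds */
    #ifdef LOW_MEMORY_BUILD
    #define CACHE_SIZE_DEFAULT (512 * 1024)       /* 512 KiB on e.g. openwrt */
    #else
    #define CACHE_SIZE_DEFAULT (4 * 1024 * 1024)  /* 4 MiB elsewhere */
    #endif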

comment:3 Changed 8 years ago by coolphoenix

something like this would also help when using transmission on flash-based disks, since they cannot handle lots of small writes as well as a few big writes from time to time. on a system with a slow cpu (for example my fritzbox 7270), the cpu hangs at >70% IO when downloading at about 100kB/s to a flash disk (the other ~30% is occupied by other processes, including transmission), resulting in a sort of hardware speed limit imposed by the cpu. (on normal hard disks IO stays <10%, but reducing disk writes would help there too!)

comment:4 Changed 8 years ago by charles

  • Type changed from Bug to Enhancement

comment:5 Changed 8 years ago by turbo

See #1753.

comment:6 follow-up: Changed 8 years ago by damien

I concur. The biggest problem for me is constant read activity, for example when uploading at 80KB/s and downloading at 800KB/s across 5 active torrents. Apart from the annoying noise, it does seem to have an impact on the performance of other applications.

It should be possible to reduce these reads, since Azureus didn't suffer from the same problem in similar situations.

Sample iotop output:

2009 Feb  2 22:42:58,  load: 0.42,  disk_r:  10760 KB,  disk_w:   3320 KB

  UID    PID   PPID CMD              DEVICE  MAJ MIN D            BYTES
   89   8385      1 mdworker         ??       14   2 R           106496
  501   8254     85 Transmission     ??       14   2 W           212992
    0      0      0 kernel_task      ??       14   2 W          3186688
  501   8254     85 Transmission     ??       14   2 R         10919936

comment:7 in reply to: ↑ 6 ; follow-up: Changed 8 years ago by charles

Replying to damien:

I concur. The biggest problem for me is constant read activity, for example when uploading at 80KB/s and downloading at 800KB/s across 5 active torrents. Apart from the annoying noise, it does seem to have an impact on the performance of other applications.

Damien: I'd be very interested to hear whether 1.50 beta 4 ameliorates this or not. It doesn't have the Big Fix that this ultimately needs, but it does send the OS hints on how read and readahead caching should be handled on local data, which may make things a little better in the short term.

comment:8 in reply to: ↑ 7 Changed 8 years ago by damien

Replying to charles:

Damien: I'd be very interested to hear whether 1.50 beta 4 ameliorates this or not. It doesn't have the Big Fix that this ultimately needs, but it does send the OS hints on how read and readahead caching should be handled on local data, which may make things a little better in the short term.

Charles: Thanks. I installed 1.50 beta 4, and there seems to be a two- or three-fold decrease in the volume of reads, with writes remaining unchanged. Subjectively it feels pretty much the same, though, with constant disk activity.

I'm not sure this makes sense (a measurement artifact?), and I'm quite puzzled at how Transmission reads so much more data than it's actually sending.

comment:9 Changed 8 years ago by cartman

  • Cc ismail@… added

Still reproducible on Leopard. DTrace disk I/O output shows:

   30333  Transmission\0

           value  ------------- Distribution ------------- count    
            2048 |                                         0        
            4096 |@@@@@@@@@@@@@@@@@@@@@                    551      
            8192 |@@@@@@@@@@                               267      
           16384 |@@@@@                                    131      
           32768 |@@                                       54       
           65536 |@                                        24       
          131072 |                                         1        
          262144 |                                         0       

comment:10 Changed 8 years ago by charles

  • Owner set to charles
  • Status changed from new to assigned

@cartman: is there a way to have DTrace generate one graph of just the reads, and another of just the writes? My guess is that the latter is the problem, but it would be nice to get verification of that before tearing up code...

comment:11 Changed 8 years ago by cartman

The iotop output, also posted by damien above, shows reads and writes separately:

2009 Mar  2 21:59:53,  load: 0.25,  disk_r:   1396 KB,  disk_w:   1440 KB

  UID    PID   PPID CMD              DEVICE  MAJ MIN D            BYTES
  501  40718     98 Transmission     ??       14   2 R          1429504
  501  40718     98 Transmission     ??       14   2 W          1495040

comment:12 Changed 8 years ago by shiretu

Hi,

I'm shiretu from #1883. Could you compile the sources for Mac OS X Leopard with -g and -O0 for me? I would love to help track down the bug, but I'd have to install all kinds of libraries and dependencies first. Also, if you have any defines that enable verbose debugging, please activate them too. I could do it myself, but I think I'd spend at least an hour or two on it, while you can probably do it in a matter of minutes. The Shark utility isn't able to take any snapshots of the process while the app is in that state of intensive read/write operations, so I think my best chance is to attach gdb and just look at the call stack. If you don't have a Mac available, I'll gladly put mine on the front line via ssh. If you do, and just want to see it in action, we can meet via iChat.

Thank you

comment:13 Changed 8 years ago by charles

@shiretu: The nightly builds have debugging symbols turned on... try giving that a spin.

comment:14 Changed 8 years ago by damien

I've had a quick look at the code, and it seems checksum verification of just-completed pieces is done by reading them back from disk. That would explain why reads are proportional to the download speed...

I don't have a complete understanding of the corruption or torrent-restart issues that might be affected by such a change, but maybe a quick fix could be to verify checksums incrementally as blocks are written to disk? That wouldn't amount to proper scheduling, but at least the volume and frequency of reads should decrease quite dramatically.
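
To sketch the idea (this is not Transmission's code, and it glosses over the fact that blocks can arrive out of order, so hashing would have to be deferred until the next in-order block is available), incremental verification with OpenSSL's SHA1 API might look like:

    #include <stdbool.h>
    #include <string.h>
    #include <openssl/sha.h>

    /* keep one SHA_CTX per downloading piece */
    static void piece_begin(SHA_CTX *ctx) {
        SHA1_Init(ctx);
    }

    /* feed each block into the hash as it's written to disk, in piece order */
    static void piece_add_block(SHA_CTX *ctx, const void *block, size_t len) {
        SHA1_Update(ctx, block, len);
    }

    /* when the last block arrives, compare against the hash from the .torrent */
    static bool piece_verify(SHA_CTX *ctx, const unsigned char *expected) {
        unsigned char digest[SHA_DIGEST_LENGTH];
        SHA1_Final(digest, ctx);
        return memcmp(digest, expected, SHA_DIGEST_LENGTH) == 0;
    }

That way the completed piece never has to be reread from disk just to be checksummed.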

comment:15 Changed 7 years ago by charles

Ticket #2124 has been marked as a duplicate of this ticket.

comment:16 Changed 7 years ago by sopeters

  • Version changed from 1.40 to 1.61+

I experience the same issue here. With several transmission instances and approx. 50-100 torrents per instance, the disk I/O wait is >80%, because of read (not write) requests.

I guess most reads are done on small chunks (how big, does anyone know?). If I run tests on my drives, I get very poor performance when reading 4-16k blocks: random reads give 1-2 MB/sec. So I would be interested in having transmission read larger chunks for uploads and buffer them, to increase performance and reduce disk I/O (wait).

What determines the size / amount of data which gets read from the disk for uploads? What buffers are used at the moment?

Happy to help with tests on patches.

comment:17 Changed 7 years ago by charles

  • Version changed from 1.61+ to 1.40

sopeters: I'm wondering if you could test damien's theory that the cost of reading pieces back in -- to verify them after they've finished downloading -- accounts for most of the read traffic. One way to do this would be to disable the call to tr_ioTestPiece in peer-mgr.c's peerCallbackFunc(). Obviously this wouldn't be desirable for a production system, but for testing it could be helpful: if the theory is true, it's a lot easier to design IO caching around blocks requested by us than around blocks requested by peers...

So basically the thing to test is: by what percentage does reading from disk drop when that call to tr_ioTestPiece() is disabled? Is it possible to run benchmarks on systems with similar loads with, and without, that test?

comment:18 Changed 7 years ago by sopeters

Charles, I could certainly test this. Do you just want me to comment out the call? If more is needed, let me know and I'll compile and test both settings.

I would still like to understand the IO and caching concept that is in place at the moment.

I understand that peers request parts of files. These parts have a certain size (let's say 5 MB, for example) and need to be read from the disk. In what size chunks do you read those parts from the disk, and do you buffer them in memory? Can these read requests / buffers be adjusted? If yes, where and how?


Another (not really finalized) idea which would help high-volume transmission instances:

Introduce a switch to keep in a memory buffer only the parts of a file that were just downloaded, and offer only those to other peers. The buffer could be managed by whether file parts have been requested within a certain time: if not requested, they get dropped out of the memory buffer and are declined when requested. This would limit disk access to writes only and make only the just-downloaded parts of a file available to other peers, straight from memory.

Or do you know of, e.g., a layer which can be placed between the application (transmission) and the disk which buffers I/O and serves only recently written data (like the cache I just described above)? I'm not sure how to initialize transmission though, because it needs to know which parts are available on start, so it needs access to all available data on disk.

Open for more discussions around speeding things up and happy to test.

comment:19 Changed 7 years ago by charles

sopeters: Yes. It should be as easy as replacing "ok = tr_ioTestPiece( tor, p, NULL, 0 );" with "ok = TRUE;"

To answer your question, the IO caching in libtransmission is very crude. We keep a short list of the most active files and leave them open, leaving the caching to the OS.
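
For illustration only (this shows the idea, not the actual fdlimit.c code), such a scheme amounts to a tiny LRU table of open file descriptors:

    /* sketch: keep the most recently used files open and reuse their fds */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>

    #define MAX_OPEN 32

    struct cached_fd { char path[4096]; int fd; time_t last_used; };
    static struct cached_fd table[MAX_OPEN];

    int cached_open(const char *path) {
        struct cached_fd *victim = &table[0];
        for (int i = 0; i < MAX_OPEN; ++i) {
            if (table[i].fd > 0 && strcmp(table[i].path, path) == 0) {
                table[i].last_used = time(NULL);
                return table[i].fd;            /* hit: reuse the open fd */
            }
            if (table[i].fd <= 0)
                victim = &table[i];            /* free slot: use it */
            else if (victim->fd > 0 && table[i].last_used < victim->last_used)
                victim = &table[i];            /* otherwise track the LRU entry */
        }
        if (victim->fd > 0)
            close(victim->fd);                 /* evict the least recently used */
        victim->fd = open(path, O_RDWR | O_CREAT, 0666);
        snprintf(victim->path, sizeof victim->path, "%s", path);
        victim->last_used = time(NULL);
        return victim->fd;
    }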

comment:20 Changed 7 years ago by Lukian

For systems with lots of memory, it would be very useful to let the user specify a memory buffer size (or to detect the amount of system memory and automatically allocate a suitable value).

comment:21 Changed 7 years ago by sopeters

charles: I have patched peer-mgr.c. It does not seem to improve much, unfortunately. Any other hints / ideas for how to reduce the read I/O?

The majority of our disk requests are still reads, producing >50% iowait.

comment:22 Changed 7 years ago by charles

sopeters: well, the purpose of the test wasn't really to fix the problem -- you wouldn't want to go without piece verification in the long term anyway -- but just to try & narrow down the worst IO offenders in T. damien's suggestion was that it was the checksum test when a piece finishes downloading.

...what kind of profiling tools do you have at your disposal? Do you have any way of profiling which source code lines in Transmission are the worst offenders?

comment:23 Changed 7 years ago by sopeters

Charles: I'll revert the change and put the checksum test back in. I'm not really profiling; I see the stats via iostat and can see that most of the disk I/O is reads, and I assume these are small blocks, since the wait time is fairly long (small reads from various parts of the disk). The iostat output looks like this all the time:

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           2.88   0.00    10.12    61.88    0.00  29.12

Device:  rrqm/s  wrqm/s     r/s   w/s   rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sde        0.00    0.50  133.50  2.50  1068.00   35.00      8.11      0.42   3.12   3.12  42.50
sdf        0.00    0.00  197.00  6.50  1576.00  104.00      8.26      0.63   3.10   3.10  63.00
sdg        0.00    0.00  216.50  5.00  1732.00   54.00      8.06      0.31   1.40   1.40  31.00
sdh        0.00    0.00  156.00  0.00  1248.00    0.00      8.00      0.47   2.98   3.01  47.00
sdi        0.00    0.00  188.50  0.00  1508.00    0.00      8.00      0.53   2.76   2.81  53.00

All disks are used purely by transmission. Do you have an idea how to track down which part of T produces the constant read I/O?

Thanks, Sven

comment:24 Changed 7 years ago by charles

sopeters:

I put in some simple debug messages so that I could watch the flow of file-read traffic in my Transmission session. My session is nearly all seeding, so there were a lot of reads :)

The results were a little surprising: lots of block requests did come in order, at least within a piece boundary. So when seeding, a lookahead actually would do some good, and I wonder whether the call to posix_fadvise( RANDOM ) is actually doing more harm than good.

So here's what I'd like you to test on your heavy-load system: please measure a few minutes (say, 5 minutes) of a baseline, normal Transmission run, and then do the same with a similarly-loaded version of Transmission that removes the "posix_fadvise" line from libtransmission/fdlimit.c's TrOpenFile().

I'm just getting up to speed with iostat, but it looks like you could take an average over five minutes this way: "iostat 300 2", using the latter of the two samples. Best to start iostat's sampling after the two tests have reached steady state, of course...

Does this make sense? If so, could you please post the results of these two tests?

comment:25 Changed 7 years ago by charles

I've tested passing NORMAL, SEQUENTIAL, and RANDOM into posix_fadvise(). According to this link, at least on Linux these three arguments make the readahead 1x, 2x, and 0x the normal readahead size, respectively.

What I found is that iowait is highest with RANDOM (0x readahead), lower with NORMAL (1x readahead), and lower still with SEQUENTIAL (2x readahead). SEQUENTIAL increases the volume of data read from disk, and actually makes the `await' time higher for each request... but it cuts the number of reads to about 1/5th of those made by RANDOM, so the overall iowait is lower.

On average, POSIX_FADV_SEQUENTIAL is an improvement over POSIX_FADV_RANDOM, because sequential block requests are being made even for randomly-chosen pieces.
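
For reference, the change under discussion boils down to which advice value is passed in a single call made right after the file is opened (a sketch, not the exact fdlimit.c code):

    #include <fcntl.h>

    static int open_with_readahead_hint(const char *path) {
        int fd = open(path, O_RDONLY);
        if (fd >= 0)
            posix_fadvise(fd, 0, 0,               /* offset 0, len 0 = whole file */
                          POSIX_FADV_SEQUENTIAL); /* on Linux: 2x readahead;
                                                     NORMAL = 1x, RANDOM = none */
        return fd;
    }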

comment:26 follow-up: Changed 7 years ago by sopeters

Thanks charles. Looks like some progress. What would you like me to test now? Shall I patch with POSIX_FADV_SEQUENTIAL? If you let me know which files and lines to change, I can provide you with test results from our heavily loaded box.

comment:27 in reply to: ↑ 26 Changed 7 years ago by charles

Replying to sopeters:

Thanks charles. Looks like some progress. What would you like me to test now? Shall I patch with POSIX_FADV_SEQUENTIAL? If you let me know which files and lines to change, I can provide you with test results from our heavily loaded box.

It's not a silver bullet, but it does appear to be an improvement most of the time, and no worse the rest of the time. The code is already updated in the nightly builds, and the diff is here.

Note, I don't think you've mentioned yet what OS you're using. Does it support posix_fadvise()?

comment:28 Changed 7 years ago by sopeters

charles, we're running the latest Ubuntu version, so that shouldn't be a problem. I'll run these during the week and see if they improve our I/O wait a little.

comment:29 Changed 7 years ago by sopeters

  • Version changed from 1.40 to 1.70

Charles, we updated to the latest version, 1.71, which I assume has the changes included. There is still approx. 50% iowait on our heavily loaded system. I reduced the upload speed to 2000 (-u 2000) but still see iostat output like this:

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
          14.89   0.00    20.91    48.81    0.00  22.66

Device:  rrqm/s  wrqm/s     r/s    w/s    rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdc        0.00   20.90   62.50  65.40   5351.60  2738.30     63.25      3.24  25.36   7.49  95.85
sdd        0.35   62.15  103.75  42.85  19172.80  4051.85    158.42      7.66  51.52   3.53  51.70
sde        0.10   47.50   78.35  23.60  15622.80  2626.10    179.00      6.38  61.80   4.78  48.70
sdf        0.05   36.80   87.15  40.55  18380.40  3860.35    174.16      7.52  59.10   3.52  44.95
sdg        0.00   22.00   47.75  23.20   9869.20  2141.25    169.28      4.24  57.63   4.50  31.95
sdh        0.10   40.25   79.40  28.20  15713.20  2934.10    173.30      5.38  49.93   4.95  53.30
sdi        0.05    7.40  125.95  22.35  27587.60  1667.00    197.27      8.53  57.51   3.71  55.05
sdj        0.00    1.50   71.40   8.15  15641.20   613.20    204.33      3.69  46.45   3.39  27.00
sdk        0.10    0.95   47.05  16.05   8884.00  1061.80    157.62      2.74  43.46   3.48  21.95
sdl        0.00    3.70   58.35  11.95  12099.60   618.80    180.92      4.32  61.22   5.02  35.30
sdm        0.00   12.00   61.00  24.70  13781.20  2608.80    191.25      4.94  55.08   3.89  33.35

Any idea how to track down what causes almost 5 times more reads than writes? How can I help improve this?

Do you have any recommended settings for the disk I/O scheduler I could try? I remember the caching is mainly left to the OS, so is there any 'optimal' OS setting for Transmission you are aware of?

comment:30 Changed 7 years ago by jch

Following the discussion on http://forum.transmissionbt.com/viewtopic.php?f=3&t=172 , here's a simple disk cache implementation.

Charles, please *do* *not* commit this until we receive confirmation that it does indeed improve things. It complicates the code, it's fragile (access the same file through distinct file descriptors, or read from the file without flushing first, and you get data corruption), and it may potentially slow things down (by making an extra memory-to-memory copy).

Summary: a very simple data structure, just an unordered table of disjoint segments cached in memory. Aging is done using a classical "single-hand clock" algorithm; this part may require some tuning. (Right now: we only age when trying to allocate a new segment, there is no aging at all when the cache is not even half full, and the aging speed doubles when the cache is completely full.)
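
For readers unfamiliar with it, a single-hand clock sweep looks roughly like the sketch below (illustrative only; the attached patch is the real implementation):

    /* single-hand clock: sweep the table clearing "referenced" bits;
       evict the first segment whose bit is already clear.
       assumes n > 0, so a full sweep always finds a victim eventually */
    #include <stdbool.h>
    #include <stddef.h>

    struct segment { bool referenced; /* ...cached data... */ };

    static size_t hand = 0;

    static size_t clock_pick_victim(struct segment *segs, size_t n) {
        for (;;) {
            size_t i = hand;
            hand = (hand + 1) % n;
            if (segs[i].referenced)
                segs[i].referenced = false;  /* recently used: second chance */
            else
                return i;                    /* cold: evict this one */
        }
    }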

Things to do when we decide to commit this into transmission proper:

  • call cache_set_size on startup,
  • provide compatibility wrappers for pread and pwrite.

--Juliusz

comment:31 Changed 7 years ago by jch

Oh, this includes the patch in #2551.

Changed 7 years ago by jch

Experimental write cache -- do not commit yet

comment:32 Changed 7 years ago by charles

jch: In the interests of clarity, could you address my comments on the #2551 patch before folding that patch into another ticket too? ;)

That aside, thanks for working on this aspect of Transmission. I've meant to do research on this for a very long time...

comment:33 follow-up: Changed 7 years ago by jch

before folding that patch into another ticket too?

That's not deliberate; it's just that I'm finding it increasingly difficult to generate clean patches as my tree diverges from yours. Now if you were using a real version control system instead of SVN...

--Juliusz

comment:34 in reply to: ↑ 33 Changed 7 years ago by charles

Replying to jch:

That's not deliberate; it's just that I'm finding it increasingly difficult to generate clean patches as my tree diverges from yours. Now if you were using a real version control system instead of SVN...

I don't entirely disagree, but if git is as good as you say, surely it can handle this? ;)

Changed 7 years ago by jch

Write cache, doesn't include #2551

comment:35 Changed 7 years ago by jch

Here's a new version. You'll need to first apply prefetch-base from #2551, then apply writecache.patch.

Once again, we're looking for data, so please benchmark this and let us know.

--Juliusz

Changed 7 years ago by jch

Refreshed patch, now that Charles has committed #2551.

comment:36 Changed 6 years ago by charles

jch, I saw you mention this patch in the forums recently. Do you have an up-to-date version of this diff that applies cleanly? I'd like to review it for 2.00 -- it's sat in trac for far too long.

comment:37 Changed 6 years ago by jch

As mentioned in this forum -- I'm strongly opposed to this patch being included unless there's actual evidence that it helps decrease disk traffic (as opposed to merely reducing the amount of disk cache traffic).

Note that I no longer believe it will help. Recent discussions on IRC seem to imply that it's spurious *reads* that people are seeing, not spurious writes.

Nonetheless, here goes a refreshed version, as requested.

--jch

Changed 6 years ago by jch

comment:38 Changed 6 years ago by charles

Well, I consider myself properly warned, then. I must've missed that IRC discussion... I'll ask the #transmission librarian to dig it up.

jch, do you have any suggestions on how best to go about profiling transmission with and without this patch? I'm happy enough to do the legwork on this if you can point me in the right direction.

comment:39 Changed 6 years ago by jch

do you have any suggestions on how best to go about profiling transmission with and without this patch?

Here are some ideas. Sorry for the rambling, but it's not clear to me either.

One thing is clear -- avoid iotop, which measures cache activity, not physical I/O.

The one thing that comes to mind is to use an otherwise idle machine and run vmstat 5, for some suitable value of 5. This should give you a fairly accurate estimate of the physical I/O throughput. However, the results are not easy to interpret -- higher throughput might indicate either that we're doing useless I/O or that the scheduling is better.

However, somebody on IRC (who?) indicated that on a pure leecher there is a significant amount of disk read activity, most probably from verifications. Running vmstat indicates that this is physical I/O, not mere reads from cache. I do not understand how this can happen, and if that is the case, then my patch won't do any good.

--jch

comment:40 Changed 6 years ago by Astara

Something I don't understand: if I have just written something out, then on a non-busy system, when I turn around and read that block back for a verify, I shouldn't see any disk activity -- it should all still be in the OS's buffers.

To the one person who talked about only wanting to see output every minute: the lower the disk-IO rate, the greater the chance of losing data in the event of a crash or power interruption. Ideally, there'd be a checkbox/setting for whether to write out finished pieces immediately or only write them out upon filling some 'buffer'.

But with all this talk of buffers -- I assume everyone realizes that if the memory is not needed, then on both Windows and Linux the OS's buffers would prevent any disk activity from happening on reads of recently written data (depending on how much free memory you have and how busy the system is). I'll see gigabytes of free memory sit around filled with file-system cache for hours to days -- reads only need to happen when something isn't in the system cache. Writes are another story -- there it's a tradeoff between reliability and activity -- but even there, on Linux you can tune how long written-out buffers remain in memory before they are forced to disk.

If you run the utility "powertop", it will note how often you write buffers to disk and suggest raising the automatic flushing of buffers to disk from 5 to 15 seconds on laptops to save power. But when I really wanted to save money, I raised that to 60 seconds or longer so my disks would stay off for long periods. That's the best place to adjust your "buffer-write-to-disk" settings -- not in each program. BUT that's not always possible (Windows, maybe MacOS?), and maybe someone wants to 'reserve' memory for Transmission instead of other programs. I don't see that as wrong -- it's just that people should be aware (especially Linux users) that there may also be ways of tuning the OS to do what you want (assuming your machine has the extra memory -- if it doesn't, do you really want to reserve more memory in transmission?)...

comment:41 follow-up: Changed 6 years ago by Lukian

I, personally, am quite happy to allow my applications to store data in memory and write them once to disk (when they are complete, or at the very least, worthwhile writing to disk at that point in time). I have no preference whether this is implemented through the system buffers, or transmission's own memory storage. Regarding the OS's own measures for buffering writes, how does one go about this (on Linux)? I would love to investigate how Transmission handles various settings.

comment:42 in reply to: ↑ 41 Changed 6 years ago by Astara

Replying to Lukian:

I, personally, am quite happy to allow my applications to store data in memory and write them once to disk (when they are complete, or at the very least, worthwhile writing to disk at that point in time). I have no preference whether this is implemented through the system buffers, or transmission's own memory storage. Regarding the OS's own measures for buffering writes, how does one go about this (on Linux)? I would love to investigate how Transmission handles various settings.

These are SYSTEM settings, not application (transmission) settings. Transmission would not normally handle them, and you really wouldn't want it to.

To answer your question about what Linux can do to suppress disk activity, especially as it relates to people's laptops -- uh, lemme google that for you:

<Google>: configuring linux laptop for low disk activity

(then...)

2nd result: Laptop Mode / Laptop Mode Tools FAQ | Laptop Mode Tools (http://samwel.tk/laptop_mode/faq)

3rd result: How to reduce power consumption (http://www.thinkwiki.org/wiki/How_to_reduce_power_consumption) -- has various ideas for power reduction, and points back to #2

Those can tell you general techniques to reduce disk I/O -- assuming transmission doesn't FORCE out disk requests (e.g. by doing synchronous I/O, which I do not believe is the case). That could definitely create extra (and usually unnecessary) disk activity.

In general -- you might want to:

1) make sure the "noatime" flag is set on your mounts -- if not, then every time you read a file off of disk, it also has to do a write (to record the new "last access time").

2) use a file system with good characteristics for stuttered writing, like XFS.

XFS tries to accumulate multiple writes to lower the absolute number of disk writes. This isn't nearly as true of other file system types. XFS's delay parameters can also be extended by writing to its parameters in /proc (the default is about 4-5 seconds, but in many instances you can raise that to 15-60 seconds), WITH the explicit proviso that if power fails, any unwritten data is lost (shouldn't need saying, but...). Google or the documentation in the Linux kernel can tell you which parameters you might want to play with.

2a) mount params on XFS (these should be the defaults, but the defaults provide for minimum functionality that works on low-memory systems as well as those with more memory): "noatime, logbufs=8". If you have battery backup (most laptops with a charged battery would qualify), add "nobarrier".

If you want to give the log buffers more memory, use "logbsize=256k" (the default is 32k).

Last edited 6 years ago by Astara

Changed 6 years ago by charles

minor changes from writecache.3: silence compiler warnings, make a couple of fields private

comment:43 Changed 6 years ago by charles

  • Milestone changed from None Set to 2.10

This is a very tentative milestone.

Changed 6 years ago by charles

experimental block cache

comment:44 Changed 6 years ago by charles

Experimental block cache. This implementation caches blocks as they're written, and also allows them to be read back, by replacing calls to tr_ioRead() with tr_cacheReadBlock().
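
Conceptually, the read path becomes "check the cache first, fall back to disk on a miss". A simplified sketch (the types and the cache_find() helper here are hypothetical; only the tr_ioRead() and tr_cacheReadBlock() names come from the patch, and the real signatures differ):

    #include <stdint.h>
    #include <string.h>

    /* hypothetical types, for illustration only */
    typedef struct { uint8_t data[16384]; } cache_block;
    typedef struct tr_cache   tr_cache;
    typedef struct tr_torrent tr_torrent;

    extern const cache_block *cache_find(tr_cache *, tr_torrent *,
                                         uint32_t piece, uint32_t offset);
    extern int tr_ioRead(tr_torrent *, uint32_t piece, uint32_t offset,
                         uint32_t len, uint8_t *setme);

    /* read through the cache: hit = memcpy, miss = disk read */
    int cache_read_block(tr_cache *c, tr_torrent *tor,
                         uint32_t piece, uint32_t offset,
                         uint32_t len, uint8_t *setme) {
        const cache_block *b = cache_find(c, tor, piece, offset);
        if (b != NULL) {
            memcpy(setme, b->data, len);  /* served from memory, no disk read */
            return 0;
        }
        return tr_ioRead(tor, piece, offset, len, setme);
    }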

To profile this code, I ran "vmstat -p /dev/sda3" (the disk I was downloading the torrent to) before and after downloading ubuntu-9.10-desktop-i386.iso.torrent under 1.93 and under this patch, both on an otherwise-idle system.

In the control (1.93):

  • "read sectors" grew by 400
  • "writes" grew by 6,106.

With the patch:

  • "read sectors" grew by 64
  • "writes" grew by 4,255.

Notes:

  • ext4 filesystem
  • caching eliminates something like 97% of the calls to pwrite(), but only eliminates about 1/3 of the actual writes, so as discussed above the OS is already doing a good job of batching writes.
  • the larger relative improvement in reads comes from cases where a piece becomes complete after downloading all its blocks and Transmission rereads the piece to checksum it. With the cache in place, many or all of the piece's blocks are still in the cache.
Last edited 6 years ago by charles

Changed 6 years ago by charles

revision of previous patch, which fixes the previous diff's "FIXME" for prefetching. The new version checks the block cache before prefetching from disk.

Changed 6 years ago by charles

Add C and RPC API for changing cache size. Add settings key to initialize from settings.json.

comment:45 Changed 6 years ago by charles

  • Summary changed from Inefficient disk usage (lack of read/write scheduling) to memory cache to reduce disk IO

Changed 6 years ago by charles

Add support to transmission-remote; code cleanup; replace printf messages with tr_dbg calls

comment:46 Changed 6 years ago by charles

block-cache-rc1.diff checked into trunk for 2.10 by r10798

comment:47 Changed 6 years ago by charles

I'm going to leave this ticket open while the patch gets some testing in the nightly builds. I've been using it for about a month with no trouble -- but if there is a bug here, it could result in data loss, so it's worth keeping an eye on this.

comment:48 Changed 6 years ago by lucke

dstat is a great tool for measuring system behaviour in real time -- running "dstat -cdn --top-bio --top-io --top-cpu" should give a pretty definitive overview of the situation. Quick testing reveals that with an upload rate of 40kB/s, rtorrent reads data every few seconds in 128kB/256kB bursts, while transmission-svn reads from the disk every second. Whatever rtorrent does, I like it more ;-)

Last edited 6 years ago by lucke

comment:49 Changed 6 years ago by mackint0sh

I get tons and tons of messages in my log when downloading! I'm using nightly build r10801 on Snow Leopard 10.6.4. Here is a screenshot of my log app: http://i48.tinypic.com/2u8dsu1.png

Last edited 6 years ago by mackint0sh

comment:50 Changed 6 years ago by Rolcol

  • Cc colrol@… added

This has been working fine for me so far. Does Transmission do any resizing of the cache when all torrents are paused?

comment:51 Changed 6 years ago by livings124

mackint0sh: If I didn't know better, I'd say you were pirating a movie. We refuse to work with those who engage in such activities.

comment:52 Changed 6 years ago by sadface

The block replacement algorithm performs poorly when all blocks are non-contiguous: it always discards the first block! Flushing the oldest one instead gives much better results (about half the writes).

The simple implementation I tested: http://pastebin.com/g4PyA4T2
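
The gist of the change (a sketch under the assumption that each cached block records when it was added; see the pastebin above for the tested code):

    #include <stdbool.h>
    #include <time.h>

    typedef struct { bool occupied; time_t time_added; /* ...data... */ } block_slot;

    /* pick the block that has been sitting in the cache the longest,
       instead of always flushing slot 0 */
    static int find_oldest(const block_slot *slots, int n) {
        int oldest = -1;
        for (int i = 0; i < n; ++i)
            if (slots[i].occupied &&
                (oldest < 0 || slots[i].time_added < slots[oldest].time_added))
                oldest = i;
        return oldest;  /* -1 if the cache is empty */
    }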

Last edited 6 years ago by sadface

comment:53 follow-up: Changed 6 years ago by charles

sadface: do you have a patch for this? I'd be happy to use it.

comment:54 in reply to: ↑ 53 Changed 6 years ago by sadface

Replying to charles:

sadface: do you have a patch for this? I'd be happy to use it.

I've worked on a smarter approach that helps avoid fragmentation. It's easy to understand and gives very good results: on average, about 4 contiguous blocks were written per flush while downloading ubuntu-9.10-desktop-i386.iso.torrent.

The patch also includes some bugfixes: http://pastebin.com/zWKKBs0Y
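
Conceptually (a sketch, not the committed patch): with the cached block indices kept sorted, each flush can pick the longest run of consecutive blocks and write it out in one go, which keeps the resulting disk writes sequential:

    #include <stdint.h>

    /* blocks[] holds cached block indices, sorted ascending;
       report the start position and length of the longest consecutive run */
    static void longest_run(const uint32_t *blocks, int n, int *start, int *len) {
        if (n <= 0) { *start = 0; *len = 0; return; }
        int best_start = 0, best_len = 1, run_start = 0;
        for (int i = 1; i < n; ++i) {
            if (blocks[i] != blocks[i - 1] + 1)
                run_start = i;                   /* run broken: start a new one */
            if (i - run_start + 1 > best_len) {
                best_len = i - run_start + 1;
                best_start = run_start;
            }
        }
        *start = best_start;
        *len = best_len;
    }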

comment:55 Changed 6 years ago by charles

looks good. patch committed to r10865

comment:56 Changed 6 years ago by charles

r10923 libtransmission/ (cache.c peer-mgr.c): (trunk libT) #1521 "memory cache to reduce disk IO" -- improve one of the debugging messages

comment:57 Changed 6 years ago by sadface

Replying to charles:

looks good. patch committed to r10865

Not so good. The following patch fixes a couple of bugs: http://pastebin.com/WtipzQhC

Sorry.

comment:58 Changed 6 years ago by charles

r10924 libtransmission/cache.c: (trunk libT) #1521 "memory cache to reduce disk IO" -- improved revision from sadface

Changed 6 years ago by Longinus00

experimental

comment:59 Changed 6 years ago by Longinus00

I have uploaded the cache changes that I used when analyzing the cache behavior on IRC. If you want, I can copy everything here, but I think this should suffice.

Old cache: http://i45.tinypic.com/6tcnja.jpg http://i47.tinypic.com/e02v6f.jpg http://i50.tinypic.com/29ftl02.jpg

Test cache: http://i46.tinypic.com/dh7146.jpg http://i45.tinypic.com/9uszfp.jpg http://i47.tinypic.com/2db9ekh.jpg

Both torrents used had 4MB piece sizes.

Version 2, edited 6 years ago by Longinus00

Changed 6 years ago by charles

sync libt_fixQueue.patch to trunk

comment:60 Changed 6 years ago by Longinus00

I used my new patch and got this run:

http://i45.tinypic.com/2i0aflx.jpg http://i50.tinypic.com/20f25gj.jpg http://i46.tinypic.com/2d7ixoj.jpg

Same 20MB cache size, but this time with a torrent with a 2MB piece size.

Further improvement of the algorithm could potentially come from an analysis pass before adjusting rank for run age and setting the flush size, but I'm not sure how helpful that would be in general.

comment:61 Changed 6 years ago by Longinus00

I implemented my idea of flushing completed pieces before saving the resume files, to narrow the window for crashes leading to bad data. This won't help with crashes causing future pieces to fail hash checks and screwing with the bad-peer algorithm.

Changed 6 years ago by Longinus00

version 3 - now with some crash protection ( still need to rename cache->maxBlocks to cache->max_blocks )

comment:62 Changed 6 years ago by charles

r10978 libtransmission/cache.c: (trunk libT) #1521 "memory cache to reduce disk IO" -- apply Longinus' libt_fixCache.patch version 3

comment:63 Changed 6 years ago by charles

  • Resolution set to fixed
  • Status changed from assigned to closed