Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#2585 closed Bug (fixed)

Enabling DHT causes a high CPU load

Reported by: TylerD004
Owned by: charles
Priority: High
Milestone: 1.80
Component: Transmission
Version: 1.75
Severity: Major
Keywords: cpu, load, dns, 323, 1.75
Cc:

Description

Running Transmission 1.75 or 1.76 on the DNS-323 NAS device causes unusually high CPU load when DHT is enabled. Something in how DHT works must have changed in 1.75 to overload the admittedly slow CPU of the DNS-323; 1.74 worked perfectly with DHT enabled. The high CPU load greatly reduces the effectiveness of torrent downloading, and the problem appears once more than 5-6 torrents are active on this device. The problem is not limited to this device; it is only evident here because of the device's limited CPU.

Change History (33)

comment:1 Changed 11 years ago by TylerD004

  • Owner set to TylerD004
  • Status changed from new to assigned

comment:2 Changed 11 years ago by TylerD004

  • Owner TylerD004 deleted
  • Status changed from assigned to new

comment:3 follow-up: Changed 11 years ago by charles

  • Version changed from 1.75+ to 1.75

comment:4 in reply to: ↑ 3 Changed 11 years ago by TylerD004

Replying to charles: The problem exists in 1.76 as well.

comment:5 Changed 11 years ago by TylerD004

  • Version changed from 1.75 to 1.75+

comment:6 follow-up: Changed 11 years ago by KyleK

Might ticket #2583 be responsible for this?

comment:7 in reply to: ↑ 6 Changed 11 years ago by TylerD004

Replying to KyleK:

Might ticket #2583 be responsible for this?

It is highly possible. The only way to find out is to integrate that patch into the current release of Transmission. A task like that is way over my head, but if someone is willing to do it and code it for the DNS-323, I would be happy to test it extensively.

comment:8 follow-up: Changed 11 years ago by KyleK

I'll see what I can do. Check the DNS forum tonight.

comment:9 Changed 11 years ago by KyleK

Actually, 1.76 and earlier apparently use an older version of the DHT code where the 'if(n->token_len == 0)' doesn't exist yet. I don't know anything about the DHT code to just blindly add it in there. jch should comment on this.

comment:10 follow-up: Changed 11 years ago by jch

Might ticket #2583 be responsible for this?

No, that's not likely. First, #2583 is extremely unlikely to trigger. Second, if #2583 does trigger, we won't actually be looping; we'll just keep going through dht_periodic without ever marking the announce as done. We're talking about one extra pass through dht_periodic every few dozen seconds or so. (If you prefer, the loop in question does yield to the libevent event loop properly, it just never marks the search as done.)

I'd be willing to bet one testicle (but not two) that what you're seeing is #2430. In .74, there was a bug which prevented us from seeing PORT messages (#2510, fixed in r9366). Because of that, the DHT converged fairly slowly, and it's not impossible that your client never got overwhelmed with DHT atoms.

In .75, this has been fixed, so your client is going to receive a metric shitload of DHT atoms after it runs for a few minutes. This means that we need to:

  • fix #2430 urgently (Charles, any chance you could work out a working version of patch 0003 in that report? it's beyond me);
  • be smarter about which peers to connect to (Charles apparently has some ideas, but he hasn't shared them with us yet).

Kyle, can you use a profiler? I'd really like to know whether my guess is right. (After all, there's a testicle of mine at stake.)

--Juliusz (who apologises for the off-colour comments, he's just been lecturing first years)
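
To illustrate the distinction jch draws above between a busy loop and a periodic pass that merely never finishes, here is a minimal toy model. It is not the real dht.c code: the struct names, the `acked` flag, and `toy_periodic` are all hypothetical, chosen only to show that each pass does a bounded amount of work and then returns to the caller (standing in for the libevent loop), even when the search can never be marked as done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified search state; NOT the real dht.c structures. */
struct toy_node {
    bool acked;              /* has this node acknowledged our announce? */
};

struct toy_search {
    struct toy_node nodes[4];
    size_t numnodes;
    bool done;               /* set once every node has acknowledged */
    unsigned long passes;    /* how many periodic passes did real work */
};

/*
 * One periodic pass, as a timer callback might run it.
 * It inspects each node once and returns; it never spins waiting.
 * If some node never acknowledges, `done` is never set, so the cost is
 * one more bounded pass per timer tick rather than 100% CPU.
 */
void toy_periodic(struct toy_search *sr)
{
    assert(sr != NULL);

    if (sr->done)
        return;              /* finished searches cost nothing */

    sr->passes++;

    bool all_acked = true;
    for (size_t i = 0; i < sr->numnodes; ++i) {
        if (!sr->nodes[i].acked)
            all_acked = false;
    }

    if (all_acked)
        sr->done = true;
}
```

In this model, a search with one permanently silent node costs exactly one short pass per tick, which matches the "one extra pass every few dozen seconds" estimate and would not by itself explain a pegged CPU.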

comment:11 Changed 11 years ago by KyleK

I wish I could profile directly on the NAS device, but none of the known profilers work.

comment:12 follow-up: Changed 11 years ago by charles

  • Version changed from 1.75+ to 1.75

I don't understand why the version keeps getting bumped to 1.75+. The OP explicitly stated that this first appeared in 1.75.

comment:13 in reply to: ↑ 8 Changed 11 years ago by TylerD004

Replying to KyleK:

I'll see what I can do. Check the DNS forum tonight.

Thank You!!

comment:14 in reply to: ↑ 10 Changed 11 years ago by TylerD004

Replying to jch:

Might ticket #2583 be responsible for this?

No, that's not likely. First, #2583 is extremely unlikely to trigger. Second, if #2583 does trigger, we won't actually be looping; we'll just keep going through dht_periodic without ever marking the announce as done. We're talking about one extra pass through dht_periodic every few dozen seconds or so. (If you prefer, the loop in question does yield to the libevent event loop properly, it just never marks the search as done.)

I'd be willing to bet one testicle (but not two) that what you're seeing is #2430. In .74, there was a bug which prevented us from seeing PORT messages (#2510, fixed in r9366). Because of that, the DHT converged fairly slowly, and it's not impossible that your client never got overwhelmed with DHT atoms.

In .75, this has been fixed, so your client is going to receive a metric shitload of DHT atoms after it runs for a few minutes. This means that we need to:

  • fix #2430 urgently (Charles, any chance you could work out a working version of patch 0003 in that report? it's beyond me);
  • be smarter about which peers to connect to (Charles apparently has some ideas, but he hasn't shared them with us yet).

Kyle, can you use a profiler? I'd really like to know whether my guess is right. (After all, there's a testicle of mine at stake.)

--Juliusz (who apologises for the off-colour comments, he's just been lecturing first years)

Very Interesting.. for the sake of your nut and my sanity, I hope you're wrong and Kyle's shot will just work!

comment:15 in reply to: ↑ 12 Changed 11 years ago by TylerD004

Replying to charles:

I don't understand why the version keeps getting bumped to 1.75+. The OP explicitly stated that this first appeared in 1.75.

I'm under the impression that the "+" in 1.75+ means that the problem persists in versions beyond 1.75. If that is the case then there should be a "+" because the problem is evident in every version after and including 1.75 (which happens to only be 1.76).

comment:16 follow-up: Changed 11 years ago by livings124

1.75+ means post 1.75, pre 1.76. A confusing naming scheme to be fair.

comment:17 in reply to: ↑ 16 Changed 11 years ago by TylerD004

Replying to livings124:

1.75+ means post 1.75, pre 1.76. A confusing naming scheme to be fair.

I see.. ok 1.75 it is.

comment:18 Changed 11 years ago by TylerD004

  • Summary changed from High CPU Load Since 1.75 with DHT enabled to High CPU Load with DHT enabled

comment:19 Changed 11 years ago by TylerD004

  • Summary changed from High CPU Load with DHT enabled to High CPU Load with DHT Enabled

comment:20 follow-up: Changed 11 years ago by jch

Note that #2430 has been fixed. Do you still see this issue?

--Juliusz

comment:21 in reply to: ↑ 20 Changed 11 years ago by TylerD004

Replying to jch:

Note that #2430 has been fixed. Do you still see this issue?

--Juliusz

No. I need someone to compile a version for the DNS-323 that has the fix included. If someone were willing to do that, I would be happy to test this out.

comment:22 Changed 11 years ago by charles

TylerD004: Where do you usually get your DNS-323 copies of Transmission from?

comment:23 follow-up: Changed 11 years ago by KyleK

From me :)

That said, I still see high CPU usage, although it isn't maxed out at 100% anymore. It currently runs at 80-90% with 4 torrents loaded. One of these is complete and idle; the other 3 are downloading at a combined speed of ~600kB/s.

@TylerD004: I'll prepare an experimental package for you to play with tonight.

comment:24 Changed 11 years ago by charles

It's a bit like debugging a black box, since the DNS-323 appears to lack any profiling tools at all... :/

comment:25 in reply to: ↑ 23 ; follow-up: Changed 11 years ago by TylerD004

Replying to KyleK:

From me :)

That said, I still see high CPU usage, although it isn't maxed out at 100% anymore. It currently runs at 80-90% with 4 torrents loaded. One of these is complete and idle; the other 3 are downloading at a combined speed of ~600kB/s.

@TylerD004: I'll prepare an experimental package for you to play with tonight.

Thank You!

comment:26 Changed 11 years ago by KyleK

@TylerD004: It's up.

@charles: I know :/ Would enabling debug output help in any way? Could it be used to make out functions that are called often? Then again, functions like the ones in ptrArray.c don't have any debug messages.

comment:27 in reply to: ↑ 25 Changed 11 years ago by TylerD004

Replying to TylerD004:

Replying to KyleK:

From me :)

That said, I still see high CPU usage, although it isn't maxed out at 100% anymore. It currently runs at 80-90% with 4 torrents loaded. One of these is complete and idle; the other 3 are downloading at a combined speed of ~600kB/s.

@TylerD004: I'll prepare an experimental package for you to play with tonight.

Thank You!

That patch seems to have worked! I'm running with 8 torrents and my CPU stays around 50%, which is what it used to be. That patched version definitely corrects the problem on my DNS-323. How can we get this fix to be a part of the next Transmission release?

comment:28 Changed 11 years ago by charles

  • Milestone changed from None Set to 1.80
  • Owner set to charles
  • Status changed from new to assigned

KyleK: unfortunately the debug messages won't do much good. I'm not sure what the solution is, short of going in and making our own profiler with macros that call tr_time() at the beginning and end of every function... uggh.

TylerD004: this fix is already committed to trunk for the 1.80 release, so yes it will be a part of the next Transmission release. :)
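
The home-made profiler idea charles mentions above (macros that take a timestamp at the beginning and end of every function) could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard `clock()` as a stand-in for `tr_time()`, and `PROF_BEGIN`, `PROF_END`, `prof_record`, `prof_report`, and `busy_work` are hypothetical names, not anything in the Transmission tree.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define PROF_MAX_SLOTS 64

/* One accumulator per instrumented function (hypothetical design). */
struct prof_slot {
    const char *name;     /* function name, taken from __func__ */
    unsigned long calls;  /* number of completed calls */
    clock_t total;        /* accumulated CPU time in clock ticks */
};

struct prof_slot prof_slots[PROF_MAX_SLOTS];
int prof_count = 0;

/* Find the slot for `name`, creating it on first use; NULL if full. */
struct prof_slot *prof_find(const char *name)
{
    for (int i = 0; i < prof_count; ++i)
        if (strcmp(prof_slots[i].name, name) == 0)
            return &prof_slots[i];
    if (prof_count == PROF_MAX_SLOTS)
        return NULL;
    prof_slots[prof_count].name = name;
    prof_slots[prof_count].calls = 0;
    prof_slots[prof_count].total = 0;
    return &prof_slots[prof_count++];
}

/* Add one finished call to the function's accumulator. */
void prof_record(const char *name, clock_t elapsed)
{
    assert(name != NULL);
    struct prof_slot *s = prof_find(name);
    if (s != NULL) {
        s->calls++;
        s->total += elapsed;
    }
}

/* Print one line per instrumented function. */
void prof_report(FILE *out)
{
    for (int i = 0; i < prof_count; ++i)
        fprintf(out, "%-24s calls=%lu cpu=%.3fs\n",
                prof_slots[i].name, prof_slots[i].calls,
                (double)prof_slots[i].total / CLOCKS_PER_SEC);
}

/* Place at the top of a function body and before each return. */
#define PROF_BEGIN() clock_t prof_start_ = clock()
#define PROF_END()   prof_record(__func__, clock() - prof_start_)

/* Example of an instrumented function (hypothetical workload). */
unsigned long busy_work(unsigned long n)
{
    PROF_BEGIN();
    unsigned long sum = 0;
    for (unsigned long i = 0; i < n; ++i)
        sum += i;
    PROF_END();
    return sum;
}
```

Calling `prof_report(stderr)` at shutdown would then list call counts and accumulated CPU time per function. The linear lookup by name adds overhead of its own, so on a CPU as slow as the DNS-323's this would skew results for very short, frequently called functions; it is meant for spotting gross hot spots only.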

comment:29 Changed 11 years ago by charles

  • Resolution set to fixed
  • Status changed from assigned to closed

comment:30 follow-up: Changed 11 years ago by KyleK

I need to refine my statement from comment:23. High CPU usage is only seen when adding a new torrent, but then it stays high for 1+ minutes (I wasn't patient enough to look at top much longer :). At some later time, it had quieted down to around 30%, and those spikes were only sporadic. It normally would sit at ~18%.

So yes, I would say this patch worked after all :)

comment:31 in reply to: ↑ 30 Changed 11 years ago by TylerD004

Replying to KyleK:

I need to refine my statement from comment:23. High CPU usage is only seen when adding a new torrent, but then it stays high for 1+ minutes (I wasn't patient enough to look at top much longer :). At some later time, it had quieted down to around 30%, and those spikes were only sporadic. It normally would sit at ~18%.

So yes, I would say this patch worked after all :)

Great Teamwork Man..

Keep me in mind if you or anyone else need to test out another patch.

comment:32 Changed 11 years ago by charles

http://forum.dsmg600.info/viewtopic.php?pid=33357#p33357

THAT WORKED!! I'm running with 8 torrents and my CPU stays around 50%, which is what it used to be. That patched version definitely corrects the problem on my DNS-323. Thanks KyleK for making that!

comment:33 Changed 11 years ago by charles

  • Summary changed from High CPU Load with DHT Enabled to Enabling DHT causes a high CPU load