Opened 10 years ago

Closed 9 years ago

Last modified 9 years ago

#3504 closed Bug (fixed)

Error: Unable to save resume file. Too many open files.

Reported by: fingon Owned by: jch
Priority: Normal Milestone: 2.20
Component: Transmission Version: 2.04
Severity: Normal Keywords: backport-2.0x backport-2.1x
Cc: jch@…, atommixz@…

Description

I've been getting this error message ever since 2.0.0; I have now tried 2.0.0 through 2.0.4. The bug didn't occur on 1.9.3. It usually happens after N hours (where N is roughly 2 to 10) of serious seeding/downloading.

Attachments (2)

0001-Avoid-a-descriptor-leak-when-binding-the-IPv6-DHT-so.patch (781 bytes) - added by jch 10 years ago.
netstat_-apn.txt (3.5 KB) - added by egolost 10 years ago.
netstat -apn attachment after applying the 0001-Avoid-a-descriptor-leak-when-binding-the-IPv6-DHT-so.patch

Change History (50)

comment:1 Changed 10 years ago by fingon

All the torrents I have open go into that state at once, too. I guess something's leaking file handles. This is on a Mac, in case it matters.

comment:2 Changed 10 years ago by Longinus00

Have you tried using lsof or something similar to determine what's actually eating the files?

comment:3 Changed 10 years ago by fingon

It's full of IPv6 UDP listeners, ~3300 of them:

  • Transmiss 36443 mstenber 3312u IPv6 0x0a2f9dc0 0t0 UDP *:*
  • Transmiss 36443 mstenber 3313u IPv6 0x0a2cdd38 0t0 UDP *:*
  • Transmiss 36443 mstenber 3314u IPv6 0x0a2fa054 0t0 UDP *:*

.. and so forth.
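
With thousands of lines, it can be easier to tally the lsof output by descriptor type than to read it by eye. A minimal sketch, assuming the default lsof column layout in which TYPE is the fifth whitespace-separated field (as in the lines above; builds that add a TID column would shift it). `tally_by_type` is a hypothetical helper name, not part of lsof or Transmission:

```python
from collections import Counter

def tally_by_type(lsof_lines):
    """Count lsof output lines per TYPE column (5th field, e.g. IPv6, sock, REG)."""
    counts = Counter()
    for line in lsof_lines:
        fields = line.split()
        # Skip blank or short lines and the header row (its PID column is not numeric).
        if len(fields) < 5 or not fields[1].isdigit():
            continue
        counts[fields[4]] += 1
    return counts

sample = [
    "Transmiss 36443 mstenber 3312u IPv6 0x0a2f9dc0 0t0 UDP *:*",
    "Transmiss 36443 mstenber 3313u IPv6 0x0a2cdd38 0t0 UDP *:*",
    "Transmiss 36443 mstenber 3314u IPv6 0x0a2fa054 0t0 UDP *:*",
]
print(tally_by_type(sample).most_common())  # [('IPv6', 3)]
```

In practice the lines would come from the output of `lsof -p <pid>`; a single type dominating the tally points at the kind of descriptor that is leaking.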

comment:4 Changed 10 years ago by Bombaclat

Having the same problem on CentOS 5.5 with Transmission 2.04. I have 50 torrents running, with 48 seeding. A restart of the daemon fixes the issue.

comment:5 in reply to: ↑ description Changed 10 years ago by Harry

Same problem for me. Kernel 2.6.35-ARCH.

lsof shows (snip): http://pastebin.com/dUEjyPEd A ton of "can't identify protocol".

I tried adding "* - nofile 2048" in my limits.conf, with no success.

If I try to open the settings menu I get a ton of popups about "Too many open files". I have 66 torrents, all private-flagged. Running Transmission 2.04 (11151) with a GUI.

My settings.json is: http://pastebin.com/wLHixR70

The box running Transmission is dual-stacked with a working IPv6 tunnel given to me via radvd.

I hope this helps fix whatever bug is causing this. Thanks.

comment:6 Changed 10 years ago by fingon

I think there are two different 'features' here, though. With a ton of seeded torrents, you probably just run out of descriptors because of the number of connections per seeded torrent (if you haven't limited how many seed at once or otherwise capped connections). My case happens even with few (5) torrents being seeded and a fixed maximum number of connections, after a while, so an IPv6-only 'feature' seems to be at work.

comment:7 Changed 10 years ago by starmoon

I have the same problem. I have 60-some torrents seeding.

comment:8 Changed 10 years ago by starmoon

It seems I have solved this problem by running "ulimit -n 2048" before starting transmission-daemon.
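
For reference, `ulimit -n` sets the soft limit on open files (RLIMIT_NOFILE) that the daemon then inherits; in bash, a bare `ulimit` without `-n` reports the file-size limit instead. The sketch below does the equivalent from inside a process using Python's standard `resource` module (Unix only). `raise_soft_nofile` is a hypothetical helper name, and 2048 is simply the value from the comment above:

```python
import resource

def raise_soft_nofile(target):
    """Raise this process's soft RLIMIT_NOFILE toward target; return the resulting soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process may raise its soft limit only up to the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough; never lower it
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        return soft  # the OS refused (e.g. a lower system-wide cap); keep the old limit
    return target

print(resource.getrlimit(resource.RLIMIT_NOFILE))  # (soft, hard) before the change
print(raise_soft_nofile(2048))                     # soft limit afterwards
```

If descriptors are actually leaking, a higher limit only postpones the error; it does not remove the cause.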

comment:9 Changed 10 years ago by AVI

Having the same problem on Ubuntu 10.04 server with Transmission (daemon) 2.11.

400-some torrents seeding. Kernel 2.6.32-25-generic-pae

# ulimit
unlimited
"open-file-limit": 64,

comment:10 Changed 10 years ago by AzaToth

This happens to me as well now :( I've currently got 70 seeding torrents, and after a day or so the process starts to run at 100% and it becomes impossible to connect to it. lsof shows output similar to that in the previous posts.

What's strange is that I previously had 200 seeding torrents without any problem; it's mostly lately that it has started to hang... I haven't been able to identify which torrent might be the issue.

comment:11 Changed 10 years ago by apeiron

I just ran into this with two (2) torrents on OS X, Transmission version 2.12. My max open files sysctl is three times what the current open files amount is. Restarting Transmission seems to have fixed the problem for now.

comment:12 follow-ups: ↓ 13 ↓ 15 Changed 10 years ago by charles

Does the problem go away if you turn off DHT?

comment:13 in reply to: ↑ 12 Changed 10 years ago by AzaToth

Replying to charles:

Does the problem go away if you turn off DHT?

The tracker is private, and the majority of the torrents have the private flag on, though I can't vouch that every one of them has it, so I'll test by disabling it; it might take some days, though. Is it the same for peer exchange, or should I explicitly disable only DHT?

comment:14 Changed 10 years ago by charles

Yes, I'm asking what happens if you explicitly disable DHT.

The process of deciding which torrents to share on the DHT network is different from deciding whether or not your session sets itself up in the DHT network at all...

comment:15 in reply to: ↑ 12 ; follow-up: ↓ 16 Changed 10 years ago by Harry

Replying to charles:

Does the problem go away if you turn off DHT?

I have had DHT off for about two days and the problem has not happened yet. Either I am lucky, or DHT is the culprit.

comment:16 in reply to: ↑ 15 Changed 10 years ago by AzaToth

Replying to Harry:

Replying to charles:

Does the problem go away if you turn off DHT?

I have had DHT off for about two days and the problem has not happened yet. Either I am lucky, or DHT is the culprit.

I've also tested this and haven't seen any issues yet; there don't seem to be too many open fds now:

$ sudo lsof -p $(pidof transmission-daemon) | wc -l
116
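
Note that `lsof -p ... | wc -l` also counts the header line and non-descriptor entries such as the working directory and mapped libraries, so it overstates the number of descriptors. A minimal sketch that counts only descriptors, assuming a Linux-style `/proc/<pid>/fd` directory (with `/dev/fd` as a fallback for the current process on macOS and the BSDs); `count_open_fds` is a hypothetical helper name:

```python
import os

def count_open_fds(pid="self"):
    """Return the number of open descriptors for a process ("self" = this one)."""
    candidates = [f"/proc/{pid}/fd"]
    if pid == "self":
        candidates.append("/dev/fd")  # closest equivalent where /proc is absent
    for fd_dir in candidates:
        if os.path.isdir(fd_dir):
            # For the current process, the count includes the descriptor
            # that os.listdir() itself opens to read the directory.
            return len(os.listdir(fd_dir))
    raise OSError("no per-process descriptor directory found")

before = count_open_fds()
with open(os.devnull) as f:
    during = count_open_fds()  # one more descriptor while the file is open
print(before, during)
```

Sampling this for the daemon's pid every few minutes (reading another user's process requires matching privileges) would show whether the count grows steadily, which is the signature of a leak rather than of plain heavy use.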

comment:17 follow-up: ↓ 18 Changed 10 years ago by charles

AzaToth, for you is that also after turning off DHT?

comment:18 in reply to: ↑ 17 Changed 10 years ago by AzaToth

Replying to charles:

AzaToth, for you is that also after turning off DHT?

yep

comment:19 Changed 10 years ago by charles

  • Cc jch@… added

comment:20 Changed 10 years ago by charles

Judging from the UDP sockets, and the behavior going away when DHT is disabled, this sounds like it might be a DHT bug.

CC'ing the wise and powerful DHT author jch :)

comment:21 Changed 10 years ago by atommixz

  • Cc atommixz@… added

comment:22 Changed 10 years ago by Sheriff Hobbes

I experience this problem even with DHT turned off! Using Transmission 2.13. System is RHEL 5.5. With Transmission 1.11 this did not happen!

comment:23 Changed 10 years ago by charles

Sheriff Hobbes: Have you tried using lsof or something similar to determine what's actually eating the files?

comment:24 Changed 10 years ago by Sheriff Hobbes

Transmission is eating the files. I've done "lsof -p <transmission process id>" and I get a very long list. I changed max. open files from the default 1k to 32k and now I've been running for 2 1/2 days w/o a problem. I have only 12 torrents in my list!

comment:25 Changed 10 years ago by egolost

I'm on CentOS 5.5 and have been experiencing this problem for quite a few releases now. I thought the solution was to increase the max open files in the OS, but that only delayed the problem. Tonight I took the time to investigate a bit further with lsof -u tmission (tmission is the user running transmission-daemon).

lsof returns a lot of lines similar to:

transmiss 19183 tmission 454u sock 0,5 2252802264 can't identify protocol

This was previously mentioned, and a suggested solution was to turn off DHT. So I did, and it's still filling up with those kinds of lines. After 36 min of uptime, lsof -u tmission returns 227 lines, of which 159 are similar to the one above. I'm currently seeding 18 torrents and downloading 11.

Could it be something related to IPv6, maybe? I've got a friend without IPv6 who doesn't have this problem, but he is on Ubuntu. I have multiple IPv6 addresses on my host. I tried binding to a single one of them, but that did not change anything either.

comment:26 Changed 10 years ago by jch

Does the following help?

--jch
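
The attached patch itself is not reproduced in this ticket. As a hedged illustration of the kind of bug its file name describes (a descriptor left open when binding a socket fails), here is a minimal sketch. `bind_udp` is a hypothetical helper, and this is not the code from Transmission or the DHT library. In C, a descriptor leaked this way stays open until the process exits; Python would eventually collect it, but the explicit close keeps the parallel clear:

```python
import socket

def bind_udp(family, addr, port):
    """Return a bound UDP socket, or None on failure, without leaking a descriptor."""
    try:
        sock = socket.socket(family, socket.SOCK_DGRAM)
    except OSError:
        return None  # nothing was opened, so there is nothing to close
    try:
        if family == socket.AF_INET6:
            # Keep the IPv6 socket from also claiming the IPv4 port.
            sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
        sock.bind((addr, port))
    except OSError:
        # The error path is where the descriptor must be released;
        # returning here without close() is the leak pattern.
        sock.close()
        return None
    return sock

first = bind_udp(socket.AF_INET, "127.0.0.1", 0)      # port 0: let the OS pick one
port = first.getsockname()[1]
second = bind_udp(socket.AF_INET, "127.0.0.1", port)  # same port again: bind fails
print(first is not None, second is None)
first.close()
```

If such a function is retried periodically and the bind keeps failing, each attempt without the close would leave one more unbound socket behind, which would be consistent with the long runs of `UDP *:*` entries reported earlier.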

comment:27 Changed 10 years ago by charles

  • Keywords backport-2.0x backport-2.1x added

comment:28 Changed 10 years ago by egolost

jch: I applied and tested your patch on 2.13, both with DHT enabled and disabled. The "can't identify protocol" open files are still increasing every minute. After 17 min I had 201 of them out of 292 lines of open files in total (this with DHT enabled).

Changed 10 years ago by egolost

netstat -apn attachment after applying the 0001-Avoid-a-descriptor-leak-when-binding-the-IPv6-DHT-so.patch

comment:29 Changed 10 years ago by jch

Transmission has opened just 32 sockets, two of which are for UDP. Working just as designed, sorry for that.

--jch

comment:30 Changed 10 years ago by livings124

I'm a bit confused. Are you saying that the posted patch fixes this issue?

comment:31 Changed 10 years ago by jch

I'm only saying that in the particular attachment above, there's nothing wrong I can see -- everything is working as designed.

Egolost, I'm closing this bug for now -- please reopen it if you manage to reproduce this issue with r11644 or later.

--jch

comment:32 Changed 10 years ago by jch

  • Resolution set to fixed
  • Status changed from new to closed

comment:33 Changed 10 years ago by jordan

  • Milestone changed from None Set to 2.20

comment:34 Changed 10 years ago by jordan

  • Resolution fixed deleted
  • Status changed from closed to reopened

reopening for attribution

comment:35 Changed 10 years ago by jordan

  • Owner set to jch
  • Status changed from reopened to new

comment:36 Changed 10 years ago by jordan

  • Resolution set to fixed
  • Status changed from new to closed

comment:37 Changed 10 years ago by jordan

fixed by r11644

comment:38 Changed 9 years ago by masc

  • Resolution fixed deleted
  • Status changed from closed to reopened

I'm seeing a regression with version 2.50 and higher (I didn't check previous releases) running transmission-daemon on Gentoo.

Having DHT disabled, unidentifiable sockets seem to remain at an acceptable level.

/ # lsof -p 5201 | grep .*identify.*protocol.* | wc -l
333

Having DHT enabled, transmission will reach the maximum open file limit pretty quickly.

The open file limit is now capped to FD_SETSIZE (https://trac.transmissionbt.com/ticket/4164), which also makes the problem harder to work around.

comment:39 follow-up: ↓ 42 Changed 9 years ago by masc

Update: disabling DHT doesn't help. It just takes longer until the limit is reached and Transmission stalls.

/ # lsof -p 5201 | wc -l
1045 -> FD_SETSIZE limit exceeded
/ # lsof -p 5201 | grep .*identify.* | wc -l
755
grsec: more alerts, logging disabled for 10 seconds
grsec: From 200.35.62.46: denied resource overstep by requesting 1024 for RLIMIT_NOFILE against limit 1024 for /usr/bin/transmission-daemon[transmission-da:5203] uid/euid:110/110 gid/egid:105/105, parent /sbin/init[init:1] uid/euid:0/0 gid/egid:0/0
grsec: From 200.35.62.46: denied resource overstep by requesting 1024 for RLIMIT_NOFILE against limit 1024 for /usr/bin/transmission-daemon[transmission-da:5203] uid/euid:110/110 gid/egid:105/105, parent /sbin/init[init:1] uid/euid:0/0 gid/egid:0/0
grsec: From 200.35.62.46: denied resource overstep by requesting 1024 for RLIMIT_NOFILE against limit 1024 for /usr/bin/transmission-daemon[transmission-da:5203] uid/euid:110/110 gid/egid:105/105, parent /sbin/init[init:1] uid/euid:0/0 gid/egid:0/0
grsec: From 200.35.62.46: denied resource overstep by requesting 1024 for RLIMIT_NOFILE against limit 1024 for /usr/bin/transmission-daemon[transmission-da:5203] uid/euid:110/110 gid/egid:105/105, parent /sbin/init[init:1] uid/euid:0/0 gid/egid:0/0
grsec: From 200.35.62.46: denied resource overstep by requesting 1024 for RLIMIT_NOFILE against limit 1024 for /usr/bin/transmission-daemon[transmission-da:5203] uid/euid:110/110 gid/egid:105/105, parent /sbin/init[init:1] uid/euid:0/0 gid/egid:0/0
grsec: more alerts, logging disabled for 10 seconds

comment:40 Changed 9 years ago by x190

What is your setting for "peer-limit-global:" in settings.json?

comment:41 Changed 9 years ago by masc

240 (default)

comment:42 in reply to: ↑ 39 Changed 9 years ago by rb07

Replying to masc:

Update: disabling DHT doesn't help. It just takes longer until the limit is reached and Transmission stalls.

Assuming you have something like this:

$ emerge -pv net-misc/curl

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild   R    ] net-misc/curl-7.25.0  USE="idn ipv6 ssl -ares -gnutls -kerberos -ldap -nss -ssh -static-libs -test -threads" 0 kB 

Can you rebuild curl, this time with c-ares enabled? i.e. 'USE="ares" emerge -av net-misc/curl'

comment:43 follow-up: ↓ 44 Changed 9 years ago by rb07

Actually, it is the combination of threads enabled and c-ares disabled (when building libcurl) that causes a problem like the one you describe.

comment:44 in reply to: ↑ 43 ; follow-up: ↓ 46 Changed 9 years ago by masc

Replying to rb07:

Actually, it is the combination of threads enabled and c-ares disabled (when building libcurl) that causes a problem like the one you describe.

Disabling threads and enabling ares for curl resolves the issue.. great, thanks!

comment:45 Changed 9 years ago by livings124

  • Resolution set to fixed
  • Status changed from reopened to closed

comment:46 in reply to: ↑ 44 Changed 9 years ago by rb07

Replying to masc:

Disabling threads and enabling ares for curl resolves the issue.. great, thanks!

It's only the one combination that doesn't work: threads enabled, c-ares disabled. The other 3 combinations work fine. I'm using threads enabled and c-ares enabled, which is the one that should give the best performance.
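
To check which resolver a given curl build uses, the output of `curl --version` can be inspected. A minimal sketch, assuming that c-ares appears as `c-ares/<version>` on the first line and that asynchronous resolving is listed as `AsynchDNS` on the `Features:` line (so `AsynchDNS` without c-ares is taken to mean the threaded resolver). `resolver_backend` is a hypothetical helper name, and the `curl` binary on the PATH may not be linked against the same libcurl that Transmission uses:

```python
import subprocess

def resolver_backend(version_text):
    """Classify libcurl's name resolver from the text printed by `curl --version`."""
    lines = version_text.splitlines()
    libs = lines[0] if lines else ""
    features = next((l for l in lines if l.startswith("Features:")), "")
    if "c-ares/" in libs:
        return "c-ares"
    if "AsynchDNS" in features.split():
        return "threaded"
    return "synchronous"

try:
    out = subprocess.run(["curl", "--version"], capture_output=True, text=True).stdout
    print(resolver_backend(out))
except FileNotFoundError:
    print("curl not found")
```

A result of "threaded" would correspond to the threads-enabled, c-ares-disabled combination described above.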

comment:47 Changed 9 years ago by samuli

Sorry, but how is the bug fixed if the problem is still manifesting with certain allowed build combinations of curl?

comment:48 Changed 9 years ago by masc

Interesting, I can't combine those two. I had to remove threads support to get ares.

!!! The ebuild selected to satisfy "curl" has unmet requirements.
- net-misc/curl-7.24.0::gentoo USE="ares ipv6 kerberos ldap (multilib) ssl threads -gnutls -idn -nss -ssh -static-libs -test"

  The following REQUIRED_USE flag constraints are unsatisfied:
    threads? ( !ares )

  The above constraints are a subset of the following complete expression:
    threads? ( !ares ) nss? ( !gnutls )