Opened 5 years ago

Closed 5 years ago

#3818 closed Bug (invalid)

transmission-daemon hangs with no error messages

Reported by: ntompson Owned by:
Priority: Normal Milestone: None Set
Component: Daemon Version: 2.12
Severity: Normal Keywords:
Cc:

Description

I have been having this issue with transmission for quite a few versions now. I have the white light MyBook. I installed transmission via Optware, and have upgraded as new versions came to hand. Currently at 2.12.

I find that, after some completely random length of time - sometimes 20 minutes, sometimes many hours - the transmission web GUI stops responding. After some investigation, it has become clear to me that this is not a GUI problem, but a problem with transmission-daemon itself. "transmission-remote -si" also returns nothing at the same point. I have also installed MRTG to monitor both my router and the MyBook; it also shows that network activity drops to zero when this happens. So I think it is safe to conclude that UL / DL stops when this happens.

I have to kill and restart the transmission-daemon process to get things going again.

The annoying thing is that there are absolutely no error messages to help. Transmission-daemon reports into the system message log regularly, but there are no unusual messages prior to this happening. There are no dump files, nothing useful that I can find at all. I have tried running transmission-daemon in the foreground, but this also yields nothing interesting other than what I already see in the system log. I've also tried verbose logging, but it yields so much stuff that it is almost impossible to see what is going on.

There also seems to be no correlation with anything I do in terms of adding / removing torrents, other than a vague impression that it is associated with heavier torrent activity. Note that I have limited the max number of torrents to 20; while this seems to have increased the time it takes to crash, it still crashes. Note also that it never crashes if there are no open torrents.

I have googled plenty on this; I have found nothing. Which I guess is not surprising, given that I have no error messages to search on. I'd be very interested if others are seeing similar symptoms, or if anyone has suggestions on how to deal with this.

Thanks Nick

Change History (6)

comment:1 Changed 5 years ago by ntompson

Sorry - I have limited max peers to 20; not max torrents...

comment:2 Changed 5 years ago by x190

comment:3 follow-up: Changed 5 years ago by ntompson

OK, so the workaround is indeed effective (EVENT_NOEPOLL=1; export EVENT_NOEPOLL). The problem has gone away, and I've been able to significantly increase the torrent load without any issue. I'm very happy about this, because I have put up with instability for quite some time.
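For anyone else applying this, the workaround can be made permanent with a small start-up wrapper. This is only a sketch: the /opt/bin/transmission-daemon path is an assumption based on Optware's usual /opt prefix, and start_daemon is a name made up for the example. The only part taken from this ticket is the EVENT_NOEPOLL variable itself.

```shell
#!/bin/sh
# Sketch: start transmission-daemon with libevent's epoll backend disabled.

# Assumed install path (Optware normally uses /opt); override DAEMON if yours differs.
DAEMON="${DAEMON:-/opt/bin/transmission-daemon}"

# Hypothetical helper: export the workaround variable, then hand over to the daemon.
start_daemon() {
    # libevent looks for this variable when it picks an event backend.
    EVENT_NOEPOLL=1
    export EVENT_NOEPOLL

    if [ ! -x "$DAEMON" ]; then
        echo "cannot execute $DAEMON" >&2
        return 1
    fi

    # Replace this shell with the daemon so it inherits the environment.
    exec "$DAEMON" "$@"
}

# Only act when called as: wrapper.sh start [daemon options...]
if [ "${1:-}" = "start" ]; then
    shift
    start_daemon "$@"
fi
```

Calling the wrapper from the init script instead of calling the daemon directly means the variable is set on every start, including after a reboot.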

However, in terms of this ticket, I suggest there are two things that could be done.

(1) Make EVENT_NOEPOLL the default behaviour for transmission-daemon. From what I can tell, any ARM user of transmission-daemon (i.e. most NAS owners) can expect instability until this workaround is put into effect. A NAS is a perfect use case for transmission-daemon, and it is by far the best implementation of a bittorrent client I have seen for a NAS, so I would suggest there is a strong argument for doing this.

I haven't found out much about what effect the workaround has, other than it obviously invokes a different eventing model, which quite obviously works just fine. Unless anyone can point out an advantage of the current model, why not just make the EVENT_NOEPOLL behaviour the default? If this is considered too severe, perhaps EVENT_NOEPOLL could be implemented on ARM builds only.

I acknowledge that the fault lies not with the transmission team but with libevent, but let's be pragmatic...

(2) Improve error logging. The other feature of this issue was that there were absolutely no errors logged by transmission-daemon when things went pear-shaped. This made rectification difficult, as there were few defining features to Google. "Transmission hangs" is not terribly specific... I would suggest that there should be some kind of error logging implemented to handle situations like this.

Once again, I am very happy that my problem has been resolved, but I see an opportunity to make Transmission better. Interested in your thoughts.

Thanks,

Nick

comment:4 in reply to: ↑ 3 Changed 5 years ago by charles

Replying to ntompson:

Thanks for the suggestions. I'm glad things are working now for you!

(1) Make EVENT_NOEPOLL the default behaviour for transmission-daemon

This is not something that we can control. This is the job of the packager. We have no control over how Transmission is invoked or what environment variables are set.

(2) Improve error logging. The other feature of this issue was that there were absolutely no errors logged by transmission-daemon when things went pear-shaped. This made rectification difficult, as there were few defining features to Google. "Transmission hangs" is not terribly specific... I would suggest that there should be some kind of error logging implemented to handle situations like this.

If the process is hung, we can't log anything anyway. :(

I appreciate the suggestions but I don't see how either of these things can be implemented inside of Transmission's code.

comment:5 Changed 5 years ago by ntompson

Thanks for the quick reply.

My comments:

(1) Ah. Not a subtlety I previously understood. That's a pity. Perhaps I should be directing that comment at optware (I think this is how most people deploy transmission-daemon to their NAS).

(2) OK, point taken, but I wonder if there should be something more substantial done here. For example, perhaps you could have a thread that watches other key threads and generates logs / takes action if a thread goes bad. This measure would be quite unjustified for transmission running on a PC, but where transmission-daemon is running on a NAS, where we are talking up-times of weeks or months, the chances that something goes awry are relatively high. This would cover not just the issue in question, but any future issue that resulted in dead / blocked / bad threads.
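Until something like that exists inside Transmission, a similar effect can be had from outside the process. The following is a rough sketch only, not anything Transmission ships: it reuses the "transmission-remote -si" probe mentioned in the description, while LIMIT, RESTART, probe_ok and watch_once are names invented for the example, and the default RESTART is a placeholder that just prints a message.

```shell
#!/bin/sh
# Sketch of an external watchdog: probe the daemon, act if the probe hangs or fails.

PROBE="${PROBE:-transmission-remote -si}"   # probe command used in this ticket
LIMIT="${LIMIT:-30}"                        # assumed: seconds to wait for a reply
# Placeholder action; substitute whatever restarts the daemon on your system.
RESTART="${RESTART:-echo would restart transmission-daemon}"

# Succeed only if the probe exits 0 within LIMIT seconds.
probe_ok() {
    # PROBE is deliberately unquoted so that it splits into command and arguments.
    $PROBE >/dev/null 2>&1 &
    probe_pid=$!
    waited=0
    while kill -0 "$probe_pid" 2>/dev/null; do
        if [ "$waited" -ge "$LIMIT" ]; then
            # The probe itself is stuck, which is the symptom described above.
            kill -9 "$probe_pid" 2>/dev/null
            wait "$probe_pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # The probe finished by itself; pass its exit status on.
    wait "$probe_pid"
}

# One watchdog pass: log and run the restart action when the probe fails.
watch_once() {
    if probe_ok; then
        return 0
    fi
    echo "transmission watchdog: no reply within ${LIMIT}s" >&2
    $RESTART
}

# Only act when called as: watchdog.sh run (for example from cron).
if [ "${1:-}" = "run" ]; then
    watch_once
fi
```

Run from cron every few minutes, this at least leaves a timestamped line in the log each time the daemon stops answering, which is more than the hang itself provides.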

Nick

comment:6 Changed 5 years ago by jordan

  • Resolution set to invalid
  • Status changed from new to closed