Opened 7 years ago

Closed 5 years ago

Last modified 5 years ago

#5407 closed Bug (fixed)

2.80 (14102 and 14103) crashes in tr_peerMgrGetDesiredAvailable () on OS X v10.8.4

Reported by: bvrbt Owned by: mike.dld
Priority: Normal Milestone: 2.90
Component: Transmission Version: 2.80
Severity: Normal Keywords:
Cc:

Description

Transmission 2.80 (14102 and 14103) sometimes crashes on OS X v10.8.4 with EXC_BAD_ACCESS when pausing a torrent by clicking the per torrent pause button. I’m afraid I have not been able to determine the precise conditions needed to trigger this crash.

I’m attaching the 14102 crash report to this ticket.

Corresponding forum post: https://forum.transmissionbt.com/viewtopic.php?f=4&t=14905.

Attachments (2)

14102.txt (48.5 KB) - added by bvrbt 7 years ago.
14160.txt (52.1 KB) - added by bvrbt 7 years ago.
2.82 14160 crash

Download all attachments as: .zip

Change History (27)

Changed 7 years ago by bvrbt

comment:1 Changed 7 years ago by jordan

Thanks for the report.

It's hard to be sure from this crash report, but I suspect what's happening is the `swarm' pointer is being set to null by stopTorrents right before tr_peerMgrGetDesiredAvailable() is called. The simple fix is that function needs to test to see if we have a swarm object before using it.

comment:2 follow-up: Changed 7 years ago by jordan

  • Milestone changed from None Set to 2.81
  • Owner set to jordan
  • Status changed from new to assigned

I've fixed the swarm pointer issue in r14111. Please reopen this ticket if the problem persists. Thanks!

comment:3 Changed 7 years ago by jordan

  • Resolution set to fixed
  • Status changed from assigned to closed

Changed 7 years ago by bvrbt

2.82 14160 crash

comment:4 Changed 7 years ago by bvrbt

  • Resolution fixed deleted
  • Status changed from closed to reopened

I’ve just had another crash, this time with 2.82 (14160) on OS X v10.8.4 (12E55), also whilst trying to pause an active torrent.

comment:5 Changed 7 years ago by jordan

  • Resolution set to fixed
  • Status changed from reopened to closed

Hey, this isn't the same crash, so I've opened a new ticket for it (ticket #5471) and am re-closing this ticket as fixed.

Thanks for reporting this!

comment:6 Changed 6 years ago by mike.dld

  • Resolution fixed deleted
  • Status changed from closed to reopened

Please see new crash report in #5829 (marked as duplicate) with latest SVN build.

comment:7 Changed 6 years ago by x190

Possible fix --- please test: Add code @ Line 2615 in libtransmission/peer-mgr.c:

/* count how many bytes we want that connected peers have */
uint64_t
tr_peerMgrGetDesiredAvailable (const tr_torrent * tor)
{
  size_t i;
  size_t n;
  uint64_t desiredAvailable;
  const tr_swarm * s;

  assert (tr_isTorrent (tor));

  /* common shortcuts... */

  +if (!tor->isRunning) /* Line 2615 */
  +  return 0;

  if (tr_torrentIsSeed (tor))
    return 0;

The Mac Client uses tr_torrentStop(), so "tor->isRunning = false", will be set before "tr_peerMgrGetDesiredAvailable()" is called.

Last edited 6 years ago by x190 (previous) (diff)

comment:8 Changed 6 years ago by mike.dld

Wasn't able to reproduce on my own yet (probably need a bit more than 10 torrents in a list for this), but the proposed fix looks sane to me. Not only to prevent crash but also as a really good shortcut indeed.

comment:9 Changed 6 years ago by mike.dld

On a second thought, there's still room for race condition. It could happen that tor->isRunning is checked before set to false in tr_torrentStop, s->pieceReplication and s->pieceReplicationSize are checked before set to NULL/0 in replicationFree which then happens before or in the middle of the cycle. The probability is decreased but the issue is still there.

comment:10 Changed 6 years ago by x190

[Controller stopTorrents:]->stopTransfer->tr_torrentStop(fHandle)(tor->isRunning is now false)->[self update]->etc.->tr_runInEventThread (tor->session, stopTorrent, tor)->etc.

tr_peerMgrGetDesiredAvailable () shouldn't even get called, in this scenario, unless tor->isRunning is set to false, unless you are saying [self update] could proceed before tr_torrentStop(fHandle) returns?

MARCH 31, 2015 --- THIS CRASH IS NOT FIXED --- SEE FORUM POST BELOW:

https://forum.transmissionbt.com/viewtopic.php?f=4&t=16867#p70794

Last edited 6 years ago by x190 (previous) (diff)

comment:11 Changed 6 years ago by mike.dld

I wasn't talking about current implementation. That was a theoretical situation where tr_torrentStat is called from one thread and tr_torrentStop from another thread at the same time.

Don't get me wrong, your patch fixes the issue with our code (until someone unaware of such fragile call ordering architecture changes the code). I'm just thinking of other people using libtransmission and writing code which isn't under our control. Note though that I don't have any information on whether someone else does use libtransmission at present time.

comment:12 Changed 6 years ago by cfpp2p

tr_torrentStat no longer uses tr_torrentLock. How might the Mac client behave or create a race condition in this case?

I might suggest testing with a sufficient number of concurrently active torrents remove~torrent(s) and remove~torrent(s)~with~data in addition to pausing~torrent(s).

x190's fix looks good to me.

comment:13 Changed 6 years ago by mike.dld

  • Owner changed from jordan to mike.dld
  • Status changed from reopened to new

comment:14 Changed 6 years ago by mike.dld

  • Status changed from new to assigned

comment:15 Changed 6 years ago by mike.dld

Committed as r14372. Thanks ;) and please forgive my scepticism.

comment:16 Changed 6 years ago by mike.dld

  • Milestone changed from 2.81 to 2.90
  • Resolution set to fixed
  • Status changed from assigned to closed

comment:17 Changed 6 years ago by cfpp2p

Thanks for the patch x190. This should make a lot of users happy ;-)

comment:18 Changed 6 years ago by mike.dld

  • Resolution fixed deleted
  • Status changed from closed to reopened

Another crash in the same place reported on forum.

comment:19 Changed 6 years ago by mike.dld

  • Resolution set to fixed
  • Status changed from reopened to closed

Added another condition for tor->isStopping per x190's suggestion (see forum link above) in r14483.

comment:20 in reply to: ↑ 2 Changed 6 years ago by x190

Replying to jordan:

I've fixed the swarm pointer issue in r14111. Please reopen this ticket if the problem persists. Thanks!

Re-opened.

comment:21 Changed 6 years ago by x190

  • Resolution fixed deleted
  • Status changed from closed to reopened

comment:22 Changed 6 years ago by x190

This issue persists in recent builds.

https://gist.github.com/victorhooi/548c83006cf1e64b3cf2 https://forum.transmissionbt.com/viewtopic.php?f=4&t=16867&p=70794#p70782

See also: https://trac.transmissionbt.com/ticket/5407#comment:9

/* count how many bytes we want that connected peers have */
uint64_t
tr_peerMgrGetDesiredAvailable (const tr_torrent * tor)
{
...
  /* common shortcuts... */

  if (!tor->isRunning
      || tor->isStopping
      || tr_torrentIsSeed (tor)
      || !tr_torrentHasMetadata (tor))
    return 0;
   
  const tr_swarm * s = tor->swarm;
  if (!s->isRunning)
    return 0;
   
  n = tr_ptrArraySize (&s->peers);
  if (n == 0)
    return 0;
  else
...

In the latest crash, when verifying torrents, the "!tor->isRunning" check doesn't prevent the crash.

Re: r14111. How could "the `swarm' pointer is being set to null by stopTorrents" happen? "if (!s->isRunning)

return 0;" might be a better check here.

Last edited 5 years ago by x190 (previous) (diff)

comment:23 Changed 5 years ago by mike.dld

  • Resolution set to fixed
  • Status changed from reopened to closed

Added check for !s->isRunning in r14650. Thanks x190.

comment:24 Changed 5 years ago by x190

  • Summary changed from 2.80 (14102 and 14103) crashes on OS X v10.8.4 with EXC_BAD_ACCESS to 2.80 (14102 and 14103) crashes in tr_peerMgrGetDesiredAvailable () on OS X v10.8.4

comment:25 Changed 5 years ago by x190

Let's hope comment:9 doesn't come into play.

P.S. Show us how a pro would ensure it doesn't. :)

Last edited 5 years ago by x190 (previous) (diff)
Note: See TracTickets for help on using tickets.