Opened 10 years ago

Closed 10 years ago

#3852 closed Bug (invalid)

Removing data from an active torrent doesn't stop activity

Reported by: x190 Owned by: livings124
Priority: Normal Milestone: None Set
Component: Transmission Version: 2.13
Severity: Normal Keywords: data removal unrecognized
Cc:

Description

Removing all data while a torrent is active does not stop activity up or down even though Inspector>Info shows data N/A.

Trying to empty the data from the trash shows that it is still "in use". Apparently Transmission "follows" the data around, even to the trash. Not good!

Change History (22)

comment:1 Changed 10 years ago by x190

Same behavior when only part of a multi-file/folder active torrent was removed.

comment:2 Changed 10 years ago by x190

Tested with the torrent paused. Moved all data to the Trash. This time I could empty the Trash, but Transmission didn't seem to care: it recreated the file and carried on, showing no data loss.

comment:3 Changed 10 years ago by livings124

  • Component changed from Mac Client to Transmission
  • Priority changed from Highest to Normal
  • Severity changed from Critical to Normal

comment:4 Changed 10 years ago by charles

x190, this sounds like just a variation of the issue you described at https://trac.transmissionbt.com/ticket/2955#comment:42 ... am I misunderstanding something? I don't understand how this is a separate issue.

comment:5 Changed 10 years ago by charles

Actually it looks like the discussion in https://trac.transmissionbt.com/ticket/2955#comment:42 exists in 2.13 as well as trunk, so it's not related to #2955 after all. Moving that discussion to this ticket.

That comment describes a variation of this ticket, where the user removes a portion of the downloaded data instead of the whole thing.

To quote x190's suggestions in that ticket:

Transmission should stop and return the "uh, oh" message rather than re-creating a missing folder/file for which it had previously downloaded data. I'm referring to a sub-folder of a multi-folder torrent. As of 11565, Transmission recreates the sub-directory and continues on as if it still had all the data.

Message should say something like "A file or sub-folder [include specific name if possible] is missing! Return the data to the original location or deselect the item in the Inspector."

What needs to happen in the case that the user wishes to re-download the data??? Perhaps, hitting the resume button after the error has been thrown should clear previous bitmapping for that data?? OTY :).

Two scenarios that should be tested with 11565.

#1 Remove all data while torrent is active. #2 Unmount external drive while torrent is active.

As with above, one would hope to see the torrent stop and give the "Uh, oh" message.

comment:6 Changed 10 years ago by charles

  • Summary changed from Removing data of an active torrent does not stop activity to Removing data from an active torrent doesn't stop activity

comment:7 Changed 10 years ago by charles

Okay, now my analysis:

If an application "opens" a file in OS X and then someone deletes the file, OS X doesn't finish deleting the file until the application(s) finish with it too and "close" it. What you're seeing is that Transmission's file cache still has the file "open" so it hasn't finished being deleted yet.
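The behavior described here can be reproduced outside Transmission. The following is a minimal sketch, assuming a POSIX system (Linux or OS X); the file name `piece.dat` is made up for illustration and none of this is Transmission's own code. It opens a file, removes its name, and shows that writes and reads through the open descriptor keep succeeding.

```python
import os
import tempfile


def write_after_unlink(directory):
    """Open a file, unlink it, then keep using the descriptor (POSIX behavior)."""
    path = os.path.join(directory, "piece.dat")  # hypothetical file name
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.write(fd, b"before")
        os.unlink(path)                     # the name is removed from the directory
        name_exists = os.path.exists(path)  # False: nothing is visible at the path
        os.write(fd, b"+after")             # the open descriptor still accepts writes
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 64)              # and the data can still be read back
    finally:
        os.close(fd)                        # only after this can the OS reclaim the space
    return name_exists, data


with tempfile.TemporaryDirectory() as tmp:
    result = write_after_unlink(tmp)
```

This is also why the Trash reports the data as "in use": the storage is not released until the last open descriptor is closed.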

Now here's the question... it's hard to tell if the user deleted the file after we opened it. We could look on the disk for the file every single time we try to read or write it, but it would cause a noticeable speed hit, and also undo most of the wins that we gained from the memory cache.
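To make the trade-off concrete, here is a rough sketch of what "looking on the disk" before each read or write could mean. It assumes a POSIX system, and the helper name `path_matches_descriptor` is hypothetical, not a Transmission function. The check costs one extra `stat` call per I/O operation, which is the overhead being weighed above.

```python
import os
import tempfile


def path_matches_descriptor(fd, path):
    """Return True if `path` still names the same file that `fd` has open."""
    try:
        on_disk = os.stat(path)   # the extra system call paid on every check
    except FileNotFoundError:
        return False              # nothing exists at the original path any more
    opened = os.fstat(fd)
    # Same device and same inode means the path and the descriptor agree.
    return (on_disk.st_dev, on_disk.st_ino) == (opened.st_dev, opened.st_ino)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "piece.dat")  # hypothetical file name
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    before = path_matches_descriptor(fd, path)
    os.unlink(path)
    after = path_matches_descriptor(fd, path)
    os.close(fd)
```

A check like this would also flag a file that was moved rather than deleted, since the original path no longer resolves to the open file.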

So I'm trying to figure out how often users are going to be deleting a torrent's files while the torrent is active, and what the use case for doing that is. Is it worth adding these extra tests, when the problem could also be solved by telling the user to pause the torrent first? This seems like a "Doctor, it hurts when I do this" bug... but maybe there's some aspect to this that I'm missing? You listed this as a critical issue so I'd like to hear your thoughts on it.

comment:8 Changed 10 years ago by x190

Does it not matter that the torrent can complete to 100% and go to seed mode when in fact large chunks of data are missing?

Yes, it is user error, but it happens frequently, in my experience. Forgetting to mount an external drive is a large part of what [LAZY] is all about, is it not? What about inadvertent external drive unmounts? Are you going to revert to hiding the data in /Volumes, baffling many users?

Regarding the cost, could you ease cache size up a notch to 6 MB to compensate?

Even if the torrent could be stopped at the point in the code where the file/folder gets re-created you could then force a re-verify of local data.

comment:9 Changed 10 years ago by x190

FWIW, I repeated many of my previous tests with the torrent paused and obtained similar results. The application is not aware that previously verified data is missing (in current test that represents ~30% of the data).

Interestingly, the CM (Rt-click on removed folder in Inspector->Files) is aware the folder is missing as "Show in Finder" is grayed out. Also, in previous tests in which all data was removed while torrent was active the GUI (Inspector->Info) immediately was aware the data was missing yet not the application.

comment:10 Changed 10 years ago by x190

I think I've nailed this down as regards 2.13. Pause torrent. Remove all data. Resume the torrent. Download will continue with no indicated data loss and the file will be re-created.

The only time Transmission recognizes data is missing is when the data is removed before the application is opened. Then one gets the caution triangle and Inspector->Activity shows "No data found! Reconnect any disconnected drives, use "Move Data File To…", or restart the torrent to re-download." error.

If only some data of a multi-file torrent is removed before the application is opened then T will force a "Verify Local Data".

In all other cases T will complete the torrent and claim it has 100% of data.
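For context on what "Verify Local Data" checks: BitTorrent metadata carries a SHA-1 hash for each piece, so missing or altered data shows up as a hash mismatch once the pieces are re-read. The sketch below illustrates that idea only; the function name and the in-memory `data` argument are assumptions for illustration and do not reflect how Transmission reads files from disk.

```python
import hashlib


def find_bad_pieces(data, piece_length, expected_hashes):
    """Return the indexes of pieces whose SHA-1 digest does not match."""
    bad = []
    for index, expected in enumerate(expected_hashes):
        start = index * piece_length
        piece = data[start:start + piece_length]  # empty or short if data is missing
        if hashlib.sha1(piece).digest() != expected:
            bad.append(index)
    return bad


original = b"a" * 8 + b"b" * 8 + b"c" * 4
piece_length = 8
hashes = [
    hashlib.sha1(original[i:i + piece_length]).digest()
    for i in range(0, len(original), piece_length)
]

# Simulate losing the middle piece and, separately, losing everything after piece 0.
damaged = original[:8] + b"\0" * 8 + original[16:]
truncated = original[:8]

bad_when_damaged = find_bad_pieces(damaged, piece_length, hashes)
bad_when_truncated = find_bad_pieces(truncated, piece_length, hashes)
```

A torrent that reports 100% without such a pass is relying on its earlier record of which pieces were verified, which is the situation described in this comment.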

Last edited 10 years ago by x190

comment:11 Changed 10 years ago by charles

Could you test with r11600?

comment:12 Changed 10 years ago by x190

Test #1: Open T with torrents paused. Remove all data of test torrent. Resume torrent. Torrent stopped and error received. PASSED

Test #2: Remove all data belonging to an actively downloading torrent. No error received. File re-created with no indicated loss of data. FAILED

Back to the drawing board and no milk and cookies! :)

comment:13 Changed 10 years ago by charles

Are you sure about test 2? It's passing for me. I get an error message saying

Please Verify Local Data! A file disappeared: /path/to/some/file

comment:14 follow-up: Changed 10 years ago by x190

Charles,

I'm testing now with 1 file removed from a 68 GB (21% selected) torrent. I locked the removed file in a folder so T should have no access, cache or otherwise. T has since d/led well over 30 MB belonging to the removed file, but hasn't recreated the file. Where in the name of Pete is it putting that data? Cache is only 4 MB, right? No errors. Worried! I'll have to stop it now. Do you have an explanation?

Sure about #2, but will retest.

Last edited 10 years ago by x190

comment:15 in reply to: ↑ 14 ; follow-up: Changed 10 years ago by charles

Replying to x190:

Charles,

I'm testing now with 1 file removed from a 68 GB (21% selected) torrent. I locked the removed file in a folder so T should have no access, cache or otherwise. T has since d/led well over 30 MB but hasn't recreated the file. Where in the name of Pete is it putting that data? Cache is only 4 MB, right? No errors. Worried! I'll have to stop it now. Do you have an explanation?

Transmission still has a handle to the open file, so it doesn't matter where you move it to; the handle is still good inside the code. You would need to delete the file to get Transmission's attention.
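A small sketch of why the handle stays good after a move, assuming a POSIX system and a move within the same filesystem. The `Trash` directory and file names are stand-ins invented for this example. Data written after the rename lands in the file at its new location, which is where the "missing" 30 MB would have gone.

```python
import os
import tempfile


def write_across_move(directory):
    """Write to a file, move it while it is open, and keep writing."""
    src = os.path.join(directory, "download.bin")  # hypothetical file name
    trash = os.path.join(directory, "Trash")       # stand-in for the real Trash
    os.mkdir(trash)
    dst = os.path.join(trash, "download.bin")

    fd = os.open(src, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(fd, b"part1")
        os.rename(src, dst)      # the user drags the file to the Trash
        os.write(fd, b"part2")   # the descriptor follows the file, not the path
    finally:
        os.close(fd)

    with open(dst, "rb") as moved:
        contents = moved.read()
    return os.path.exists(src), contents


with tempfile.TemporaryDirectory() as tmp:
    moved_result = write_across_move(tmp)
```

Nothing is recreated at the original path because the program never reopens the file by name while the descriptor remains open.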

Last edited 10 years ago by charles

comment:16 in reply to: ↑ 15 ; follow-up: Changed 10 years ago by x190

Replying to charles:

Replying to x190:

Charles,

I'm testing now with 1 file removed from a 68 GB (21% selected) torrent. I locked the removed file in a folder so T should have no access, cache or otherwise. T has since d/led well over 30 MB but hasn't recreated the file. Where in the name of Pete is it putting that data? Cache is only 4 MB, right? No errors. Worried! I'll have to stop it now. Do you have an explanation?

Transmission still has a handle to the open file, so it doesn't matter where you move it to; the handle is still good inside the code. You would need to delete the file to get Transmission's attention.

So T was writing away to a file in a locked folder in the Trash? If I could write machine code my OS wouldn't allow that B.S!

RE: Test #2 Using a single file torrent for this test. Failed outright one time (re-created the file with no error as stated). Passed one time. Third run, well it's d/ling away as per above.

So exactly when is your new code supposed to kick in? Anyhow it failed outright first time.

Last edited 10 years ago by x190

comment:17 in reply to: ↑ 16 Changed 10 years ago by charles

Replying to x190:

So T was writing away to a file in a locked folder in the Trash? If I could write machine code my OS wouldn't allow that B.S!

T opened the file before you trashed it and the file handle is still good so it kept writing.

Also, the use case we talked about above was someone removing an external drive while the torrent is active... that's very different from what's being tested here, which is Trashing the folder while the torrent is active. Is the latter really a use case worth handling? Is that something that really happens, much less happens frequently, outside of these torture tests? :)

comment:18 follow-up: Changed 10 years ago by charles

Well, great. When I try testing under the case of removing an external drive while the torrent is active, the OS's filesystem's cache keeps letting read & write succeed. Inspecting the file descriptor shows that the file is still alive, has a device number, and a node number, and hasn't been deleted.

I am starting to question if it is worth handling cases like this. It seems like the only way we could handle this every time would be to disable the OS's filesystem's cache, which would be a catastrophic performance hit.

comment:19 in reply to: ↑ 18 ; follow-up: Changed 10 years ago by x190

Replying to charles:

Well, great. When I try testing under the case of removing an external drive while the torrent is active, the OS's filesystem's cache keeps letting read & write succeed. Inspecting the file descriptor shows that the file is still alive, has a device number, and a node number, and hasn't been deleted.

I am starting to question if it is worth handling cases like this. It seems like the only way we could handle this every time would be to disable the OS's filesystem's cache, which would be a catastrophic performance hit.

This OS cache you refer to: does it have a size limit before the app (T in this case) gets cut off? Does this in fact get written to the data file, or is this something that would eventually fill up the disk until restart (reminds me of some forum posts)? Again, can you explain briefly when your code should trigger?

Anyhow, I think we're both seeing the same phenomenon, i.e. T can write to missing files (or OS caches) for extended periods. How long???

You seem to keep questioning the value of this exercise. The way I see it, it is very easy for a Transmission user to unknowingly enter a swarm as a seeder with missing data and therefore never be able to complete a d/l to a peer. This could potentially involve a significant slice of T users. Feel free to disagree or form your own opinions. I have, however, shown conclusively that it can and does happen.

Last Test: 1 of 3 test torrents failed to stop on pause all/resume all. That was the multi-folder one.

That one just re-created its file, so your new code fails the test. :(

Last edited 10 years ago by x190

comment:20 in reply to: ↑ 19 Changed 10 years ago by charles

Replying to x190:

Replying to charles:

Well, great. When I try testing under the case of removing an external drive while the torrent is active, the OS's filesystem's cache keeps letting read & write succeed. Inspecting the file descriptor shows that the file is still alive, has a device number, and a node number, and hasn't been deleted.

I am starting to question if it is worth handling cases like this. It seems like the only way we could handle this every time would be to disable the OS's filesystem's cache, which would be a catastrophic performance hit.

This OS cache you refer to: does it have a size limit before the app (T in this case) gets cut off?

It does have a limit, but it's not controlled by Transmission. The OS and its filesystem control the filesystem's buffering/caching.

Does this in fact get written to the data file, or is this something that would eventually fill up the disk until restart (reminds me of some forum posts)?

That's a filesystem-dependent detail that will vary from system to system.

Again can you explain briefly when your code should trigger?

My code will trigger when it notices that a file has been deleted from the drive while it's still in the cache. I don't mean moved to the trash; I mean hard-deleted. That case is handled now with the "file disappeared!" error message.
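The distinction between "moved to the Trash" and "hard-deleted" can be seen from the open descriptor itself. The sketch below is one possible way to detect it, assuming a POSIX system; it is not the code from r11600, and `file_was_deleted` is a hypothetical helper name. It relies on the link count reported by `fstat`, which drops to zero once no directory entry refers to the file.

```python
import os
import tempfile


def file_was_deleted(fd):
    """True when no directory entry refers to the open file any more."""
    return os.fstat(fd).st_nlink == 0


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "piece.dat")       # hypothetical file names
    in_trash = os.path.join(tmp, "piece.trashed")
    fd = os.open(original, os.O_RDWR | os.O_CREAT, 0o600)

    os.rename(original, in_trash)            # "moved to the Trash": still has a name
    deleted_after_move = file_was_deleted(fd)

    os.unlink(in_trash)                      # hard delete: the last name is gone
    deleted_after_unlink = file_was_deleted(fd)

    os.close(fd)
```

This matches the test results in the thread: a trashed file still has a name somewhere, so a check of this kind stays quiet until the Trash is actually emptied. Filesystems that handle open-but-deleted files differently (some network filesystems, for instance) may not report a zero link count.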

Anyhow, I think we're both seeing the same phenomenon, i.e. T can write to missing files for extended periods. How long???

That, too, is controlled by the OS and the filesystem.

You seem to keep questioning the value of this exercise. The way I see it, it is very easy for a Transmission user to unknowingly enter a swarm as a seeder with missing data and therefore never be able to complete a d/l to a peer.

If Transmission tries to read a block from disk in order to seed it, and the read fails, Transmission will throw up an error message and stop the torrent immediately. (When the read fails is dependent on the OS... but when it does, Transmission *will* stop.)

Transmission will not feed bad data to peers, and it will not continue on if a read fails. That's why I question how far we need to chase these fringe cases.
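The policy described here, stop the torrent as soon as a read fails, can be sketched as follows. This is an illustration under assumptions, not Transmission's implementation: the `Torrent` class, its fields, and the error strings are all hypothetical, and `os.pread` is only available on Unix-like systems.

```python
import os
import tempfile


class Torrent:
    """Hypothetical stand-in for a torrent that is seeding from one open file."""

    def __init__(self, fd):
        self.fd = fd
        self.running = True
        self.error = None

    def _stop(self, message):
        self.running = False
        self.error = message

    def read_block(self, offset, length):
        """Return the block, or stop the torrent and return None on any failure."""
        try:
            data = os.pread(self.fd, length, offset)
        except OSError as exc:
            self._stop(f"read failed: {exc}")
            return None
        if len(data) != length:
            # A short read means part of the block is not on disk.
            self._stop(f"short read: wanted {length} bytes, got {len(data)}")
            return None
        return data


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "piece.dat")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"12345678")
    fd = os.open(path, os.O_RDONLY)
    torrent = Torrent(fd)
    good = torrent.read_block(0, 4)
    running_after_good = torrent.running
    bad = torrent.read_block(4, 8)         # only 4 bytes remain past offset 4
    os.close(fd)
```

The point of returning `None` instead of partial data is that a peer is never sent a block the client could not fully read.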

This could potentially involve a significant slice of T users.

How?

comment:21 Changed 10 years ago by x190

Well, I've chased it about as far as I care to! :) Glad to hear that checks are in place. I'll let you get back to GTK and libevent2. :)

comment:22 Changed 10 years ago by livings124

  • Resolution set to invalid
  • Status changed from new to closed

It appears this ticket might be a black hole of generic issues, with no single pinpoint-able problem to solve, so I'm going to close it. Please open more specific tickets if there are any that can be derived from this.
