Opened 9 years ago

Closed 7 years ago

Last modified 7 years ago

#5369 closed Enhancement (fixed)

Error checking for file allocations

Reported by: g.proskurin Owned by: mike.dld
Priority: Normal Milestone: 2.90
Component: libtransmission Version: 2.77
Severity: Normal Keywords: patch
Cc:

Description

Functions preallocate_file_sparse() and preallocate_file_full() check for errors, but these errors are not checked by callers. Hence the patch, added debug for these functions and error checking in callers. As a result, if file cannot be preallocated, it will be stopped, which slightly increase robustness in low disk space situations.

Attachments (2)

tr_falloc.patch (7.1 KB) - added by g.proskurin 9 years ago.
Error checking for preallocate_file_sparse() and preallocate_file_full()
5369-preallocate-error-checks.patch (10.6 KB) - added by mike.dld 7 years ago.

Download all attachments as: .zip

Change History (23)

Changed 9 years ago by g.proskurin

Error checking for preallocate_file_sparse() and preallocate_file_full()

comment:1 follow-up: Changed 9 years ago by g.proskurin

Some notes about patch:

  • immediately exit if ENOSPC error occurs
  • WIN32 code is not tested, I just believe it should work (return with zero on success and non-zero (-1) on error)
  • SYS_DARWIN code is not tested, and I changed F_ALLOCATECONTIG to F_ALLOCATEALL (there is no reason to require contiguous allocation, but there is reason for "all" allocation)

comment:2 in reply to: ↑ 1 ; follow-up: Changed 9 years ago by x190

Replying to g.proskurin:

Some notes about patch:

  • SYS_DARWIN code is not tested, and I changed F_ALLOCATECONTIG to F_ALLOCATEALL (there is no reason to require contiguous allocation, but there is reason for "all" allocation)

Seems to work well as is, but, if you're going to change it, I think F_ALLOCATECONTIG should be tried first and then revert to F_ALLOCATEALL if it fails.

Here's how Firefox does it: (See this stackoverflow post.)

#elif defined(XP_MACOSX)
  int fd = PR_FileDesc2NativeHandle(aFD);
  fstore_t store = {F_ALLOCATECONTIG, F_PEOFPOSMODE, 0, aLength};
  // Try to get a continous chunk of disk space
  int ret = fcntl(fd, F_PREALLOCATE, &store);
    if(-1 == ret){
    // OK, perhaps we are too fragmented, allocate non-continuous
    store.fst_flags = F_ALLOCATEALL;
    ret = fcntl(fd, F_PREALLOCATE, &store);
    if (-1 == ret)
      return false;
Last edited 9 years ago by x190 (previous) (diff)

comment:3 follow-up: Changed 9 years ago by pathetic_loser

immediately exit if ENOSPC error occurs

Are you sure it's good idea? Then GUI clients would just "crash" without giving user any hint on what's wrong. Sure, user will figure out problem at some point but usually desktop programs are supposed to tell to user about problems like this. And daemons are supposed to be running all the time so it would likely just toggle daemon restart.

comment:4 in reply to: ↑ 2 ; follow-up: Changed 9 years ago by g.proskurin

Replying to x190:

Replying to g.proskurin:

Some notes about patch:

  • SYS_DARWIN code is not tested, and I changed F_ALLOCATECONTIG to F_ALLOCATEALL (there is no reason to require contiguous allocation, but there is reason for "all" allocation)

Seems to work well as is, but, if you're going to change it, I think F_ALLOCATECONTIG should be tried first and then revert to F_ALLOCATEALL if it fails.

Here's how Firefox does it: (See this stackoverflow post.)

#elif defined(XP_MACOSX)
  int fd = PR_FileDesc2NativeHandle(aFD);
  fstore_t store = {F_ALLOCATECONTIG, F_PEOFPOSMODE, 0, aLength};
  // Try to get a continous chunk of disk space
  int ret = fcntl(fd, F_PREALLOCATE, &store);
    if(-1 == ret){
    // OK, perhaps we are too fragmented, allocate non-continuous
    store.fst_flags = F_ALLOCATEALL;
    ret = fcntl(fd, F_PREALLOCATE, &store);
    if (-1 == ret)
      return false;

I see no reason, why files downloaded by transmission should be required to be allocated contiguously. They are just plain files, let OS decide how to allocate them. If size is specified in fcntl/F_ALLOCATEALL, OS should be smart enough to allocate file as contiguously as it can.

comment:5 in reply to: ↑ 4 ; follow-up: Changed 9 years ago by x190

Replying to g.proskurin:

Replying to x190:

Replying to g.proskurin:

If size is specified in fcntl/F_ALLOCATEALL, OS should be smart enough to allocate file as contiguously as it can.

I think it is better to guaranty a desired result (video playback optimization), if possible. Also F_ALLOCATECONTIG may be more efficient at least for 10.8.

Last edited 9 years ago by x190 (previous) (diff)

comment:6 in reply to: ↑ 3 Changed 9 years ago by g.proskurin

Replying to pathetic_loser:

immediately exit if ENOSPC error occurs

Are you sure it's good idea? Then GUI clients would just "crash" without giving user any hint on what's wrong. Sure, user will figure out problem at some point but usually desktop programs are supposed to tell to user about problems like this. And daemons are supposed to be running all the time so it would likely just toggle daemon restart.

By "immediately exit" I meant "return from function with error", not terminate daemon. Function caller will see the error and report it to log or something. Daemon continues to work.

comment:7 in reply to: ↑ 5 Changed 9 years ago by g.proskurin

If size is specified in fcntl/F_ALLOCATEALL, OS should be smart enough to allocate file as contiguously as it can.

I think it is better to guaranty a desired result (video playback optimization), if possible. Also F_ALLOCATECONTIG may be more efficient at least for 10.8.

I disagree. I don't like idea of trying to be smarter than OS and overdesigning simple code. Using contiguous allocation without a reason may increase fragmentation of file system for other files. Torrent client should not bother about possible real-time requirements for some downloaded files. If we really need contiguous allocation in some cases, it should be implemented as user-configurable parameter.

comment:8 follow-up: Changed 8 years ago by mike.dld

One small issue with the patch: according to posix_fallocate(3) man page,

posix_fallocate() returns zero on success, or an error number on failure. Note that errno is not set.

The patch, however, gets error number from errno instead of function return value.

comment:9 Changed 8 years ago by jordan

g.proskurin, could you update the error-detecting-and-reporting patch wrt comment:8?

comment:10 in reply to: ↑ 8 Changed 8 years ago by g.proskurin

Replying to mike.dld:

One small issue with the patch: according to posix_fallocate(3) man page,

posix_fallocate() returns zero on success, or an error number on failure. Note that errno is not set.

The patch, however, gets error number from errno instead of function return value.

You are right. Posix docs also say so: http://pubs.opengroup.org/onlinepubs/009695399/functions/posix_fallocate.html I used freebsd man to prepare patch, it seems freebsd is incorrect here: http://www.freebsd.org/cgi/man.cgi?query=posix_fallocate&apropos=0&sektion=0&manpath=FreeBSD+11-current&arch=default&format=html, section RETURN VALUES. I filled bug report for freebsd: http://www.freebsd.org/cgi/query-pr.cgi?pr=186028 I'll update patch in a few days when I have more time. Thank you for your comment.

Changed 7 years ago by mike.dld

comment:11 Changed 7 years ago by mike.dld

Attached is a reworked patch version (with some personal touch) against current trunk. I have omitted debug output from different allocation methods except for errors. Also, two more small defects were fixed along the way:

  • file descriptor is now always closed on error inside cached_file_open () and is only stored to tr_cached_file on success;
  • ftruncate () call added after xfsctl (), since testing revealed that XFS control call acts same way as Mac OS-specific fcntl () call: space is being reserved, but file size reported is still zero until you truncate it.

Please review.

comment:12 Changed 7 years ago by Robby

  • Keywords patch added

comment:13 follow-up: Changed 7 years ago by x190

IMO, the following are problematic.

/* we can't resize the file w/o write permissions */ 
167	179	  resize_needed = already_existed && (file_size < info.size); 
168	180	  writable |= resize_needed;

Why are we messing with the mode at this point? A likely scenario is that two torrents have a file with the same name and we should NOT change the mode.

  /* If the file already exists and it's too large, truncate it. 
184	222	   * This is a fringe case that happens if a torrent's been updated 
185	223	   * and one of the updated torrent's files is smaller. 
… 	… 	 
186	224	   * http://trac.transmissionbt.com/ticket/2228 
187	225	   * https://bugs.launchpad.net/ubuntu/+source/transmission/+bug/318249 
188	226	   */ 

Well, the file could be too small too, or it might belong to another torrent, so we should throw an error here and pause the torrent and let the user sort out the problem. In my custom version, I have removed the (possible) mode changing code and returned an error, if the file is the wrong size.

alreadyExists = !stat( filename, &sb ) && S_ISREG( sb.st_mode );
  /* file may already exist, but file_size is incorrect */
  if( alreadyExists && file_size != (uint64_t)sb.st_size )
   {
     errno = EEXIST;
     const int err = errno;
     tr_err( "WARNING:%1$s already exists, but is the wrong size. "
	    "To continue, remove this file and verify data.", filename );
    
     return err;
   }

comment:14 in reply to: ↑ 13 ; follow-up: Changed 7 years ago by mike.dld

Replying to x190:

Why are we messing with the mode at this point? A likely scenario is that two torrents have a file with the same name and we should NOT change the mode.

The scenario you are describing is exactly the one caused this code to appear in the first place (read the link you quoted, #2228). Since it didn't bother anyone since 1.73 (or are there any user reports?), I'm not sure whether we need to change it now, or at least that we need to change it in this ticket, or that the check should be performed in this place and not somewhere else (didn't think of this yet).

comment:15 Changed 7 years ago by x190

#1 'writable' is passed to tr_fdFileCheckout(). cached_file_open() should not change it.

#2 If you are going to truncate an existing file, how do you plan on guaranteeing that it belongs to the current torrent?

Last edited 7 years ago by x190 (previous) (diff)

comment:16 in reply to: ↑ 14 Changed 7 years ago by cfpp2p

Replying to mike.dld:

... #2228). Since it didn't bother anyone since 1.73 (or are there any user reports?)...

Not true. ticket:4147#comment:19 , ticket:4147#comment:22 and r14147

I don't agree with changing the writable flag. When more than one torrent accesses the same file(s) it's problematic what to do. There are ways of separating these kind of situations. One way is tagging hash codes. The problem in general occurs very infrequently so the changes necessary to the code base for this never has appealed to me.

comment:17 follow-up: Changed 7 years ago by mike.dld

#4147 is surely an interesting one. What I meant by "didn't bother anyone" though was about introduced truncation behavior, not the error(s) in its implementation.

'writable' is passed to tr_fdFileCheckout(). cached_file_open() should not change it.

I agree, and what led to this should probably be thought through once more to try and find better solution. But this ticket is not the place.

If you are going to truncate an existing file, how do you plan on guaranteeing that it belongs to the current torrent?

I'm not, at least not in this ticket.

This ticket is about improved error checking, not everything anyone dislikes in cached_file_open () implementation. I almost regret now that I have made those two additional fixes which may have given you an impression that I'm going to fix it all. What I merely did was rewrite OP's patch to use latest file handling changes. And those additional fixes didn't change current behavior as much (and didn't require as much testing) as change in writable flag would.

comment:18 Changed 7 years ago by mike.dld

  • Owner set to mike.dld
  • Status changed from new to assigned

comment:19 Changed 7 years ago by mike.dld

  • Component changed from Transmission to libtransmission
  • Milestone changed from None Set to 2.90
  • Resolution set to fixed
  • Status changed from assigned to closed

Committed as r14367 using the patch I attached.

comment:20 in reply to: ↑ 17 Changed 7 years ago by cfpp2p

Replying to mike.dld:

#4147 is surely an interesting one. What I meant by "didn't bother anyone" though was about introduced truncation behavior, not the error(s) in its implementation.

'writable' is passed to tr_fdFileCheckout(). cached_file_open() should not change it.

I agree, and what led to this should probably be thought through once more to try and find better solution. But this ticket is not the place.

If I'm reading r14367 correctly the "ftruncate () call added after xfsctl ()" is issue to correct file size, same for Mac; and seemingly unrelated to #4147. But how is it that we actually are sure that the torrent asking for the preallocation and eventual possible truncation isn't corrupting another torrent's existing data... ? I guess we really need another ticket for this.

Maybe you could explain r14367 in a more concise way as to what's being accomplished. The ticket shows as an enhancement but r14367 shows additionally three bugs fixed, which makes its confusing to me what the intent is. It's late and I'm having a little hard time understanding the patch in the detail I would like to and how it might relate to a future ticket concerning truncation and the writable flag. :-)

Last edited 7 years ago by cfpp2p (previous) (diff)

comment:21 Changed 7 years ago by mike.dld

Going through the open tickets today, I saw there already exists #4129.

Truncation added after xfsctl(XFS_IOC_RESVSP64) is just to align results of different preallocation methods: fcntl(F_PREALLOCATE) on Mac needs ftruncate call as well (it doesn't update file size too), and fallocate64 and posix_fallocate do not need ftruncate call since they do update file size AFAIR. Also note that preallocation (and thus truncation) is only performed if file didn't exist before (there is a check for already_existed variable in cached_file_open).

OTOH there is another truncate (tr_sys_file_truncate) call in cached_file_open following preallocation (which I did nothing about here) which is being made if file already existed and its size is less than required. For it to succeed, writable flag is being forcibly set to true if we need to resize the file. I propose to discuss this in #4129 if you think this behavior should be changed. I personally don't think it's good to return writable descriptor when we're asked for readable one; truncation itself is of less importance but if there would be some good solution to this I'd be happy to collaborate on a patch for it. Note though that there is a comment right before this truncation which mentions #2228 where this truncation was added in the first place.

Note: See TracTickets for help on using tickets.