Opened 14 years ago

Closed 14 years ago

#930 closed Bug (fixed)

Patch to reorder format string args for translation

Reported by: bgeer
Owned by: charles
Priority: Normal
Milestone: 1.30
Component: GTK+ Client
Version: 1.20
Severity: Normal
Keywords:


There are currently two messages in Transmission that can't be correctly translated into Arabic, because of a limitation of gettext. The strings in question have two arguments, and the Arabic translation needs to be able to omit the first one, but this is not possible in gettext. For a full explanation, please see gettext bug 23183. (The next release of gettext will contain this explanation in its documentation.)

According to the gettext maintainer, the solution is to reverse the order of the arguments in the affected format strings in the Transmission source code. Here is a patch for the current Transmission SVN to make this change.

Attachments (1)

transmission-reorder-args.patch (1.4 KB) - added by bgeer 14 years ago.
Patch to reorder arguments in two format strings

Change History (10)

Changed 14 years ago by bgeer

Patch to reorder arguments in two format strings

comment:1 Changed 14 years ago by Festor

  • Component changed from Transmission to GTK+ Client
  • Milestone changed from None Set to 1.20
  • Owner set to charles
  • Version changed from 1.11 to 1.20

comment:2 Changed 14 years ago by charles

I don't know Arabic, so bear with me if this is a stupid question, but I don't understand why you want to omit the first argument. Doesn't that mean the user doesn't get to see how many pieces exist in the torrent?

comment:3 Changed 14 years ago by bgeer

No problem, I'll explain. All Arabic nouns have a special form (called the "dual form") which indicates that there are two of whatever the noun refers to. So the English expression "2 pieces" is translated into Arabic as a single word, which is the dual form of the word "piece"; the numeral is not included in the translation (nor can it be included), because the dual form carries all the information.

Similarly, for the values 0 and 1, there are set expressions that have to be used; they're analogous to the expressions "no pieces" and "one piece" in English. While in English it's possible to use the digits 0 and 1 instead of the words "no" and "one", this can't be done in Arabic.

In .po files, the gettext plural expression is used to test the value of numeric arguments, so that the right wording can be selected according to the value of the argument. So consider the following message to be translated:

English singular:
%1$'d Piece @ %2$s
English plural:
%1$'d Pieces @ %2$s

If the value of the first argument is 0, 1 or 2, the translation has to reflect this value by using the appropriate wording and the right form of the Arabic word for "piece", without actually including the argument itself. Here's an example of how this looks; it's in Arabic, but you can see that there are actually six different singular/plural forms, and that the first three (for values of 0, 1 and 2) don't include the argument. All the information is there for all values; it's just included in the wording of the translations.
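To make the six forms concrete, here is a minimal sketch of the plural-index selection, assuming the formula usually given for Arabic in .po headers (nplurals=6). The function name is made up for illustration, and the "few"/"many" labels in the comments are informal; gettext evaluates the equivalent expression itself from the Plural-Forms header.

```c
/* Hypothetical helper: returns which of the six Arabic msgstr[] entries
 * would be selected for n, assuming the header
 *   nplurals=6; plural=(n==0 ? 0 : n==1 ? 1 : n==2 ? 2 :
 *                       n%100>=3 && n%100<=10 ? 3 : n%100>=11 ? 4 : 5);
 */
unsigned arabic_plural_index(unsigned long n)
{
    if (n == 0) return 0;                         /* "no pieces": number omitted */
    if (n == 1) return 1;                         /* "one piece": number omitted */
    if (n == 2) return 2;                         /* dual form: number omitted   */
    if (n % 100 >= 3 && n % 100 <= 10) return 3;  /* "few" form                  */
    if (n % 100 >= 11) return 4;                  /* "many" form                 */
    return 5;                                     /* the rest: 100, 101, 102...  */
}
```

Only indexes 3, 4 and 5 need the numeric argument in the translated string; for 0, 1 and 2 the wording alone carries the value.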

I think some other languages deal with 0 and 1 in a similar manner, and therefore omit gettext arguments when translating values of 0 and 1. Usually this is no problem, because gettext allows you to omit the last argument in a translation. The problem arises when the argument you need to omit isn't the last argument. I asked gettext maintainer Bruno Haible what to do about this, and he said the best solution is to switch the order of the arguments in the source code, to enable the desired argument to be omitted as needed. He said he would add a note to the gettext documentation to point this out.
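As a rough illustration of the reordering (not the actual patch), the sketch below passes the size first and the count last. The helper name and the sample strings are assumptions; `fmt` stands in for whatever string ngettext() would return. Numbered arguments such as `%1$s` are a POSIX printf feature rather than ISO C.

```c
#include <stdio.h>

/* Hypothetical helper: the size is argument 1 and the count is argument 2,
 * so a translated format may drop the count, which is now the LAST
 * argument, and still be a valid format string. */
int format_piece_line(char *buf, size_t buflen, const char *fmt,
                      const char *size_str, int piece_count)
{
    return snprintf(buf, buflen, fmt, size_str, piece_count);
}
```

With this order, a plural form can use both arguments ("%2$d Pieces @ %1$s"), while the forms for 0, 1 and 2 can use only the first ("One piece @ %1$s"); the unused trailing count is simply ignored.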

Sorry for the lengthy explanation; does that make sense?

comment:4 Changed 14 years ago by livings124

Sounds to me that gettext should be able to handle this (perhaps with ignoring arguments if they're omitted in the translated string). It really shouldn't be the software developer's responsibility to code around hundreds of possible language grammar rules when a library like gettext should be handling it.


comment:5 Changed 14 years ago by charles

  • Milestone changed from 1.20 to 1.30
  • Status changed from new to assigned

I agree it would be best if gettext could handle this, but until that point I don't see the harm in this change.

Marking for 1.30, which will be the next post-string-freeze release.

comment:6 Changed 14 years ago by bgeer

According to the gettext maintainer, this limitation exists because gettext relies on ordinary C format strings that can be passed to printf; he said:

A string like "une piece %2$s"
simply is not a valid C format string. When you pass it to
printf, printf produces an error code.
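The rule behind this, as I understand it from POSIX, is that a format using numbered arguments must reference every argument from the first up to the highest one it uses. Here is a simplified sketch of such a check; the function name is made up, it ignores `*N$` width/precision references, and it only tracks positions 1 to 32.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical checker: true if every position from 1 up to the highest
 * "%N$" found in fmt is referenced at least once. */
bool positions_are_contiguous(const char *fmt)
{
    bool seen[33] = { false };
    int max_pos = 0;

    for (const char *p = fmt; *p != '\0'; ++p) {
        if (*p != '%')
            continue;
        ++p;                                /* look at the char after '%' */
        if (*p == '\0')
            break;
        if (*p == '%')                      /* literal "%%" */
            continue;
        if (!isdigit((unsigned char)*p))
            continue;                       /* unnumbered conversion: skipped */

        char *end;
        long n = strtol(p, &end, 10);
        if (*end == '$' && n >= 1 && n <= 32) {
            seen[n] = true;
            if (n > max_pos)
                max_pos = (int)n;
        }
        p = end - 1;                        /* loop increment resumes at end */
    }

    for (int i = 1; i <= max_pos; ++i)
        if (!seen[i])
            return false;                   /* a leading position was skipped */
    return true;
}
```

A string that uses only the second argument fails this check because position 1 is never referenced, whereas a string that uses only the first argument passes.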

comment:7 Changed 14 years ago by livings124

Regardless of how they implement it, they should support cases where this happens if they want to be a "complete", full localization solution.

comment:8 Changed 14 years ago by charles

bgeer: would it be better to split these

%1$'d Piece @ %2$s
%1$'d Pieces @ %2$s

into smaller strings like this:

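/* Translate the piece count on its own; it is the only argument here. */
snprintf( countStr, sizeof( countStr ),
          ngettext( "%d Piece", "%d Pieces", info->pieceCount ),
          info->pieceCount );

/* Format the piece size into sizeStr. */
tr_strlsize( sizeStr, info->pieceSize, sizeof(sizeStr) );

/* Join the two already-formatted strings. */
snprintf( resultStr, sizeof( resultStr ),
          _( "%1$s @ %2$s" ), countStr, sizeStr );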
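For anyone who wants to try the idea outside the Transmission tree, here is a self-contained sketch of the same two-step formatting. `pick_plural` and `format_size` are hypothetical stand-ins for ngettext() and tr_strlsize(), and the function and buffer names are assumptions, not what was committed.

```c
#include <stdio.h>

/* Hypothetical stand-in for ngettext() with no catalog loaded:
 * singular for n == 1, plural otherwise. */
static const char *pick_plural(const char *singular, const char *plural,
                               unsigned long n)
{
    return n == 1 ? singular : plural;
}

/* Hypothetical stand-in for tr_strlsize(); whole KiB only, for brevity. */
static void format_size(char *buf, unsigned long bytes, size_t buflen)
{
    snprintf(buf, buflen, "%lu KiB", bytes / 1024);
}

/* Builds "<count> Piece(s) @ <size>" in two steps: the count string is
 * translated by itself, then joined with the size string. */
int format_piece_summary(char *out, size_t outlen,
                         unsigned long piece_count, unsigned long piece_size)
{
    char count_str[64];
    char size_str[64];

    /* The count is the only argument, so a translation may omit it. */
    snprintf(count_str, sizeof count_str,
             pick_plural("%lu Piece", "%lu Pieces", piece_count),
             piece_count);
    format_size(size_str, piece_size, sizeof size_str);

    /* The outer format only joins two strings (POSIX numbered arguments). */
    return snprintf(out, outlen, "%1$s @ %2$s", count_str, size_str);
}
```

Because the count is the sole argument of the inner format, it is also the last one, which is the case gettext already allows a translation to leave out.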

comment:9 Changed 14 years ago by charles

  • Resolution set to fixed
  • Status changed from assigned to closed

I committed the suggestion in the previous comment in r6020. Please reopen this ticket if that doesn't fix the problem for you.