KDE i18n howto
version 2.0

Lukáš Tinkl

Matthias Kiefer

Abstract:

This document gives a brief introduction to preparing KDE applications so that they support other languages and can be used in other countries. It also lists some common problems and how to avoid them.


Introduction

To support internationalization (i18n) and localization (l10n), KDE offers the class KLocale, which is part of kdecore. KLocale has a lot of functionality to make it as easy as possible for developers to make their code i18n-aware. Nevertheless, there are some things to take care of so that applications are usable in other languages and countries, too.

To use KLocale, you don't have to create an instance of it yourself; you have access to the global KLocale object through KGlobal::locale(). The object you receive takes care of all user settings and will be deleted automatically on application exit. The next few paragraphs show in more detail how to use the functionality of KLocale and what you have to take care of.


Internationalization (aka i18n)

i18n is short for internationalization, and means that the program is displayed in the language you choose. In KDE, all strings should be translated before they are displayed on the user's screen. What you should not translate are debug messages and the like.


How to prepare the code

To make translations possible, the programmer has to use i18n() on all strings that should be displayed. i18n() takes a const char* as an argument and returns a translated QString, or the original string if no translation was found. As the QString return type suggests, you should use QString for all strings that can be translated: it stores Unicode characters internally and can therefore handle strings in all languages. With classical char* strings, you have to take care of the different possible charsets yourself [1].

Since identical strings are merged and therefore share a single translation, there is an extended form of i18n() that takes two const char* arguments. The first argument is an additional description of the second string (which is shown to the user), and both strings together are used to find the corresponding translation. In addition, the first argument is shown to the translators as a hint about the meaning or context of the string.


Context information

Think of a file manager where you can open a context menu on a file and select View to view the file. In this context View is a verb. Additionally, the window of the file manager has a menu View in the menubar. In this context View is a noun. In the English version of the application everything looks fine, but in most other languages one of the strings will be wrong and confuse the user. Even if identical strings were not given the same translation, it would be difficult for the translators to find out which of these strings is used as a verb and which as a noun. The solution is to use the extended i18n() function: for the context menu something like i18n("to view something", "View") and for the menu in the menubar i18n("the view", "View"). This way both strings are translated as separate strings and the translator has a hint how to translate them.

In general it is a very good idea to use this extended function if the string to translate is short and its meaning is hard to work out when the context is not exactly known.
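
Why the context argument matters can be sketched with a toy catalog in plain C++. This is not KDE code: Catalog, germanCatalog() and translate() are hypothetical stand-ins for the real lookup, and the German entries are only illustrative. The point is that the context hint and the visible string together form the lookup key, so the two "View" strings no longer collide.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a message catalog: the key is the pair
// (context hint, visible string), mirroring the two-argument i18n().
typedef std::map<std::pair<std::string, std::string>, std::string> Catalog;

// Illustrative German entries; a real catalog comes from the PO files.
inline Catalog germanCatalog()
{
    Catalog c;
    c[std::make_pair(std::string("to view something"), std::string("View"))] = "Anzeigen";
    c[std::make_pair(std::string("the view"), std::string("View"))] = "Ansicht";
    return c;
}

// Return the translation for (context, text), or the original text if
// the catalog has no entry, as i18n() does for missing translations.
inline std::string translate(const Catalog &catalog,
                             const std::string &context,
                             const std::string &text)
{
    Catalog::const_iterator it = catalog.find(std::make_pair(context, text));
    return it == catalog.end() ? text : it->second;
}
```

With this catalog, the verb and the noun resolve to different translations, while a lookup with an unknown context falls back to the English "View".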

You may have seen another translation function in application templates or other existing code: I18N_NOOP(). It does not actually translate a string but only marks it for translation, so that the string gets extracted and included in the PO files. If you want to translate the string, you still have to call i18n() with exactly the same string afterwards. I18N_NOOP() is typically used for strings given to KAboutData, because KAboutData is constructed before the KApplication and i18n() can only be used after the KApplication has been constructed. So it is safe to always use i18n() and never I18N_NOOP(), provided you are sure that the code is executed after the construction of KApplication.
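
The "mark now, translate later" idea can be sketched in plain C++. Nothing here is KDE API: MARK_FOR_TRANSLATION, kDescription and translateLater() are hypothetical names, and the catalog is an ordinary map. The marker expands to its argument unchanged, so the string is stored untranslated and looked up only at display time.

```cpp
#include <map>
#include <string>

// Hypothetical marker macro: it expands to its argument unchanged, so it
// only tags the literal for the extraction tools (the role of I18N_NOOP()).
#define MARK_FOR_TRANSLATION(text) text

// Built before any catalog is loaded, so the string stays untranslated.
static const char *const kDescription =
    MARK_FOR_TRANSLATION("A simple text editor");

// Hypothetical stand-in for i18n(): look the string up at display time,
// falling back to the original when no translation exists.
inline std::string translateLater(
    const std::map<std::string, std::string> &catalog, const char *text)
{
    std::map<std::string, std::string>::const_iterator it = catalog.find(text);
    return it == catalog.end() ? std::string(text) : it->second;
}
```

The lookup only succeeds if the string passed later is byte-for-byte the one that was marked, which is why the text above insists on exactly the same string.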


Plural handling

To make things even more complicated, recent versions of KDE introduced handling of plural forms in different languages. A short example first:

msgStr = i18n("Creating index file: 1 message done", 
              "Creating index file: %n messages done", num);

This form of i18n() gets expanded to as many cases as the user's language requires. In English there are just two forms; other languages may need more, and the form is chosen according to the value of num [2].
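
How the number of forms differs between languages can be sketched as follows. This is plain C++, not KLocale code: the function names are hypothetical, and the Czech rule is the one documented by gettext for that language. How KLocale implements the selection internally is not shown here.

```cpp
#include <string>
#include <vector>

// English has two forms: singular for exactly 1, plural otherwise.
inline int pluralIndexEnglish(unsigned long n)
{
    return n == 1 ? 0 : 1;
}

// Czech has three forms: one for 1, one for 2 to 4, one for the rest.
inline int pluralIndexCzech(unsigned long n)
{
    if (n == 1) return 0;
    if (n >= 2 && n <= 4) return 1;
    return 2;
}

// Pick the form for a given index from a list ordered by plural index.
// Hypothetical helper; the list of forms would come from the translation.
inline std::string choosePlural(const std::vector<std::string> &forms, int index)
{
    return forms.at(index);
}
```

For the Czech word for "message" the three forms are "zpráva", "zprávy" and "zpráv", so 1, 3 and 5 messages each need a different string, which a simple singular/plural switch in the source code could not express.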


Things to take care of

Although using i18n() for all visible messages is the main work in getting your application translated, there are some traps that can prevent your application from being usable in some languages. These are mainly layout problems and problems with string concatenation.


Use layout management

English text is often very compact, while the translated text in other languages may be a lot longer. If you designed the GUI of your application carefully, this will not be a problem, because you will have used a layout manager to take care of the widget geometry anyway. You can of course implement geometry management on your own, but you are very likely to run into trouble with the widget geometry when you use functions like QWidget::setGeometry() or similar. To avoid this problem and let your application look good in other languages too, it is highly recommended to use the layout managers that Qt provides with QLayout and its derived classes. For more information, see the documentation of QLayout.


No word puzzles

Another thing to take care of is not to concatenate strings like this:

QString msg=i18n("Do you want to replace ")+oldFile+i18n(" with ")+newFile+"?";

The result of constructs like this is very hard or impossible to translate, because the structure of the sentence may be completely different in another language, and the translator sees only fragments of the sentence and has to guess what belongs together. The solution is to use QString::arg(), which lets translators see the whole sentence and also lets them change the order of the arguments freely. Because of this last advantage, QString::arg() is also recommended over sprintf() and similar functions. The above example would then look like this:

QString msg=i18n("Do you want to replace %1 with %2?").arg(oldFile).arg(newFile);

Note: Don't insert anything other than numbers, names or similar with this method, since in some languages the translation depends on the inserted words. Use more complete strings instead where possible.
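
The freedom to reorder arguments can be illustrated with a small plain C++ stand-in. fillPlaceholders() is a hypothetical helper, not QString::arg() itself, and it is simplified: it substitutes %1 to %9 in one pass instead of being chained. It shows that a translated pattern may mention %2 before %1 without any change to the calling code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for chained QString::arg() calls:
// replaces %1..%9 in the pattern with the matching entry of args.
// Markers without a matching argument are copied through unchanged.
inline std::string fillPlaceholders(const std::string &pattern,
                                    const std::vector<std::string> &args)
{
    std::string out;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const bool marker = pattern[i] == '%' && i + 1 < pattern.size()
                            && pattern[i + 1] >= '1' && pattern[i + 1] <= '9';
        const std::size_t index = marker ? pattern[i + 1] - '1' : 0;
        if (marker && index < args.size()) {
            out += args[index];
            ++i; // skip the digit that belonged to the marker
        } else {
            out += pattern[i];
        }
    }
    return out;
}
```

A translator who needs the new file name first simply writes the markers in the other order in the translated string; the program still passes oldFile first and newFile second.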

Similarly, messages that contain a version string or other frequently changing items should use QString::arg() to insert this item into the message. Otherwise every change forces the translators to change the translated messages, too. Of course only names, numbers and such things should be inserted with QString::arg(). (Since KDE, for example, is translated into more than 50 languages, a single change causes at least 50 people to open the file, find the changed message, check carefully whether this is the only thing that has changed, change the translation, save the file again and commit the changed file into CVS. All in all, such a small change might cause more than one hour of work. This is just to give you an idea, so please think carefully about the messages you use.)


Be Unicode friendly

Don't ever call QString::latin1() or QString::ascii() on translated strings unless you really know what you are doing! This also applies to information submitted by the user, like passwords, URLs, filenames etc.

KDE applications are translated into many languages that use charsets other than Latin1. If you really need a char* representation of a string, use QString::local8Bit() or QString::utf8() instead, depending on the context. For more information on charsets and on Unicode in particular, see the KDE Unicode Howto (http://developer.kde.org/documentation/library/kdeqt/kde3arch/KDE-Unicode-Howto.html).

Since KDE 3.0, the KIO local filesystem can use UTF-8 encoded filenames [3]. You may turn this on globally by exporting KDE_UTF8_FILENAMES in your shell's startup file (e.g. .bashrc). As a programmer, you have to make sure to pass a properly encoded filename to any method in question. The correct way is not to guess the user's filesystem encoding but to use QFile::encodeName() and QFile::decodeName().
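
Why a Latin1 conversion is lossy while UTF-8 is not can be shown with a short plain C++ sketch. Both functions are hypothetical helpers that work on a single code point; they are not the Qt implementation. Showing the lost character as '?' is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point up to U+FFFF as UTF-8 (surrogates are
// not treated specially). Every such character is representable.
inline std::string toUtf8(unsigned int cp)
{
    assert(cp <= 0xFFFF); // larger code points are outside this sketch
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Latin-1 only covers U+0000..U+00FF; anything else cannot be stored.
// In this sketch the lost character becomes '?'.
inline char toLatin1(unsigned int cp)
{
    return cp <= 0xFF ? static_cast<char>(cp) : '?';
}
```

The Czech letter "č" (U+010D), for example, survives as two bytes in UTF-8 but has no Latin-1 representation at all, so a translated string containing it would be damaged.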


Message catalogs

If your application is in KDE CVS, you don't have to do anything to distribute the message catalogs. The messages are extracted automatically and are then translated by the KDE translation teams.

If your application is not in KDE CVS, you have to extract the messages yourself and provide the translated PO files within your distribution. To extract the messages, you need to have gettext installed, and extractrc from kdesdk has to be in your path [4]. Then call, in the base directory of your distribution, make -f admin/Makefile.common package-messages. After that you will find a file <appname>.pot, which contains all messages of your application, in the subdirectory po. If this directory does not already contain a Makefile.am, create one with the following single line as content:

POFILES = AUTO

and add the po directory to SUBDIRS in Makefile.am of the base directory. Now you just have to find translators to translate the extracted messages into the various languages. ;-) Translated PO files then have to be stored in the po directory with the naming scheme <languagecode>.po.



Example messages target (taken from kdebase/konqueror, shortened):

messages: rc.cpp
        $(EXTRACTRC) *.rc */*.rc >> rc.cpp
        $(EXTRACTRC) sidebar/trees/history_module/history_dlg.ui >> rc.cpp
        $(XGETTEXT) rc.cpp *.h *.cc *view/*h *view/*cc kedit*/*.cpp \
        -o $(podir)/konqueror.pot

On line 2, we extract messages from rc and .desktop files, on line 3 from Qt Designer UI files and finally on line 4, we put them all (together with C++ sources and headers) into konqueror.pot.

But what happens if your application is a small plugin or library that uses translations from a different application? Let's take the above example of Konqueror and KEditBookmark (the KDE bookmark editor) to explain this topic. The bookmark editor sources reside in Konqueror's subdirectory keditbookmark. Notice that its messages are extracted into the konqueror.pot file on line 4 as well. What is left now is to tell the bookmark editor to load konqueror.po and use it as its own catalog (usually done in the main entry point of your binary):

#include <klocale.h>
#include <kapplication.h>
int main(int argc, char ** argv)
{
  KLocale::setMainCatalogue("konqueror");

  // ... set up KAboutData and KCmdLineArgs here ...

  KApplication app;
  // ... your code here ...

  return app.exec();
}

The method KLocale::setMainCatalogue("konqueror") on line 5 does the job for us. A similar issue is pre-loading a different catalog at application startup (usually done e.g. for Kicker applets, multimedia plugins or KOffice filters):

extern "C"
{
  KPanelApplet* init(QWidget *parent, const QString& configFile)
  {
    KGlobal::locale()->insertCatalogue("clockapplet");
    KGlobal::locale()->insertCatalogue("timezones"); // For time zone translations
    return new ClockApplet(configFile, KPanelApplet::Normal,
                           KPanelApplet::Preferences, parent, "clockapplet");
  }
}


Technical details

The actual translation is done with the gettext package. It contains tools for extracting messages from source files and for handling changed messages, so that translators do not have to start over again and again. The extracted messages and the translations are stored in so-called "PO files", which may use different encodings. These files are then compiled into a binary format (MO files), which is what gets installed.

All translations of a language have to be stored in the same encoding which is defined in the charset file. KLocale reads this file when constructed and uses this information to decode the translations.

Nowadays, UTF-8 is required as the PO files encoding in KDE CVS.


Localization (aka l10n)

l10n is short for localization, and means that the program is aware of your location (i.e. where you live) and uses that information when it asks for input or prints something to the screen, for example to format numbers or money strings. Below is a list of the most important l10n functions supported by KLocale. All input and output of your application should use these functions. Their usage should be straightforward; see the documentation of KLocale for more details.


All about numbers

When a program needs to present numbers to the user, it should take care of the decimal separator, the thousands separator and the currency symbol being used. These symbols may differ between regions: in English-speaking countries, a dot (.) separates the fractional part of a number, while in some European countries a comma (,) is used instead. Below is a short summary of functions that help you format numbers correctly, taking the local conventions into account.



Function          Arguments
----------------  ------------------------------------------------------------
formatMoney()     double num, const QString & currency, int digits=-1
formatMoney()     const QString & numStr
formatNumber()    double num, int precision=-1
formatNumber()    const QString & numStr
formatDate()      const QDate & pDate, bool shortFormat=false
formatTime()      const QTime & pTime, bool includeSecs=false
formatDateTime()  const QDateTime &pDateTime, bool shortFormat=true,
                  bool includeSecs=false



Similar functions exist to read information from the user, e.g. readNumber() or readMoney().
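
The effect of locale-dependent separators can be sketched in plain C++. NumberStyle and formatHundredths() are hypothetical names and not part of KLocale; KLocale reads the real symbols from the user's settings. The sketch formats an amount given in hundredths so that no floating-point rounding is involved.

```cpp
#include <cstddef>
#include <string>

// Hypothetical description of a locale's number conventions.
struct NumberStyle {
    char decimalSymbol;
    char thousandsSeparator;
};

// Format a non-negative amount given in hundredths (e.g. 123456789 for
// 1234567.89) with two decimals, grouping the integer part in threes.
inline std::string formatHundredths(unsigned long hundredths,
                                    const NumberStyle &style)
{
    const std::string digits = std::to_string(hundredths / 100);
    std::string out;
    for (std::size_t i = 0; i < digits.size(); ++i) {
        // Insert a separator whenever a multiple of three digits remains.
        if (i > 0 && (digits.size() - i) % 3 == 0)
            out += style.thousandsSeparator;
        out += digits[i];
    }
    const unsigned long frac = hundredths % 100;
    out += style.decimalSymbol;
    out += static_cast<char>('0' + frac / 10);
    out += static_cast<char>('0' + frac % 10);
    return out;
}
```

The same value comes out as 1,234,567.89 with a dot as decimal symbol and a comma as thousands separator, and as 1.234.567,89 with the two symbols swapped. Hard-coding either form in an application would be wrong for part of its users.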


Calendaring

Developing applications that deal with date and time, or with calendars in general, is a very complex area. Not only may the resulting string containing a date or time look different in different countries, but other aspects have to be taken care of as well.

KLocale provides, among others, these methods:



Function          Arguments
----------------  ------------------------------------------------------------
formatDate()      const QDate & pDate, bool shortFormat=false
formatTime()      const QTime & pTime, bool includeSecs=false
formatDateTime()  const QDateTime &pDateTime, bool shortFormat=true,
                  bool includeSecs=false



TODO -- provide more info on the different calendar systems


Technical details

The user's configuration is stored in kdeglobals and the different country profiles are stored in share/locale/l10n/<country_code>. <country_code> is replaced with the short version of ISO 3166 country code for the country in lower case.


Useful tools, tips & tricks

There are some very well hidden tools that can help you debug your application and stay compliant with i18n standards in KDE.


Dr. Klash

This little utility, once activated, presents a report about conflicting shortcuts in menus. This is helpful not only for translators but also for developers. A little hand editing of ~/.kde/share/config/kdeglobals is needed:

  [Development]
  CheckAccelerators=F12

From this point on, every time you press F12, this report is displayed on screen.


XX language

This helper language serves as a debugging aid for finding untranslated strings in applications. If you start your application with the "xx" locale, all strings will appear as x's. First you have to check out these "translations" from kde-i18n/xx and install them.

In general, if you want to start a KDE application (e.g. KSpread) in a different locale than yours, type:

  > KDE_LANG=en_US kspread

This will start KSpread in English, no matter which language preference is currently set in the Control Center.


Links and additional information




We encourage you to read these in-depth materials by Markus Kuhn:


Revision history

Version 2.0 (lukas):




TODO:



Footnotes

[1] If you are wondering where to find documentation of the i18n() function: it is defined as a global function in klocale.h and uses KLocale::translate() internally.
[2] Notice the use of %n here; it must be part of the string.
[3] This feature is still regarded as experimental.
[4] KDE uses a patched version of gettext in order to use the above-mentioned additional context information. You can find this patch in kdesdk/scripts.
[5] (attached to "Qt") Only some information is relevant, e.g. encodings.