Sunday, November 25, 2012

Translating a PEBL test

Image from ancientegypt.co.uk
PEBL tests are used by many non-English speakers, yet most standardized psychological tests (and most of the base tests in the test battery) are created in English.  Because PEBL is open source, if there is a test in English that you want translated or localized, you can do it yourself without too much hassle.  Typically, you only need a text editor (I recommend Notepad++). These instructions are not only for translators; they also work if you want to improve or alter instructions, feedback, labels, headers, and so on.  This tutorial is valid for PEBL 0.13 and earlier; we anticipate moving to a slightly easier and more consistent method for PEBL 0.14.
 

The first thing to check is if a translation already exists.  The more popular tests have been translated into a number of languages, and are designed to be easier to translate.



Typically, translations involve modifying instructions, labels, headers, and other feedback text, and possibly data/report file output.  Note that we currently don't directly support right-to-left (RTL) scripts such as Hebrew or Arabic.  To use RTL text, you will need to create screenshots of the text you are interested in, which is a bit involved.

To check whether the test you are interested in is already translated, enter your two-letter language code in the language box (it is en by default) in the PEBL launcher and see what happens when you run the script.



If you get an English version, then you will need to translate.  Now, it gets slightly more complicated.

Separate translation files

First, check whether there is a translations\ subdirectory in the task's folder (these probably only exist for the bcst and iowa tasks).  If there is, it will contain files specifying the text for the entire test.  For example, the Bechara gambling task (in iowa\translations) has files like "labels-en-mouse.txt" and "labels-en-keyboard.txt".  Open one of these in a text editor (like Notepad++) to see the text the test uses.

This data format is pretty simple.  Each line (separated by a hard return) is a different piece of text data that gets used by the program.  Translate each line in place, and don't add hard returns except where one already existed (i.e., at the end of each piece of text).  Then save and test it out.  If the labels are misaligned, you may have added too many (or not enough) hard returns. When you complete a translation, send the translated files to the pebl-list at pebl-list@lists.sf.net.  Most likely, if you want a test translated, someone has already done it but has not sent in their changes, so they are not available to you.  So contribute back to the project!
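The misalignment problem described above usually comes down to the translated file having a different number of lines than the English one. As a rough check outside PEBL, a short Python sketch along these lines can compare the two. The function names are made up for this example, and it assumes both files use UTF-8 or another ASCII-compatible encoding.

```python
def count_lines(path):
    """Count the lines in a text file (any common line-ending style)."""
    # errors="replace" keeps the count working even if a byte is not valid UTF-8.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


def compare_label_files(original, translated):
    """Return (original_count, translated_count, counts_match)."""
    n_original = count_lines(original)
    n_translated = count_lines(translated)
    return n_original, n_translated, n_original == n_translated
```

For example, compare_label_files("labels-en-mouse.txt", "labels-fr-mouse.txt") would report both counts and whether they agree (the French file name here is hypothetical). Matching counts do not prove the labels line up, but a mismatch is a quick sign that a hard return was added or lost.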

In-script translations


A number of other scripts have translations done within the actual script itself.  When this is done, the script file usually needs to be saved in UTF-8 (not simply "Unicode", and not UTF-16).   Open the .pbl file in a text editor (again, try Notepad++), and look for a function near the end called GetStrings().  The Tower of London task, the SATest, and others have translations like this.

To edit this, you need to know just a little bit about PEBL syntax.  In PEBL, text strings are defined by putting text within quotation marks (").  Furthermore, text can be joined together using the plus sign (+).  So if you have the text "one" and "two", then "one"+"two" becomes "onetwo".  This is used in making instructions, because sometimes the instructions or feedback will involve adding specific values, like "You have " + trialsleft + " trials left".  Note that in the GetStrings() function, the text values are assigned to global variables, which in turn get used by the main script when creating instructions and the like.  The GetStrings() function will always include the English text, and also gives you the chance to define your own.  Basically, the logic will be something like the following, where the variable language indicates the two-letter language code.

lang <- Uppercase(language)
if(lang == "EN")
   {
     ## Do English translations
     ##
   } elseif(lang == "ES")
   {
     ## Do Spanish translations
   } else {
     GetStrings("EN")
   }

Usually, there is some fallback to English if the selected language is not defined, as seen in the final else block here.  To add your own language, make a copy of a complete language section (e.g., the English one), and add it to the end of the if..elseif  block prior to the fallback else statement.  Suppose you want to add French (FR):

lang <- Uppercase(language)
if(lang == "EN")
   {
     ## Do English translations
     ##
   } elseif(lang == "ES")
   {
     ## Do Spanish translations
   } elseif(lang == "FR")
   {
     ## Copy English here
   } else {
     GetStrings("EN")
   }

Now, you just need to go through line by line and translate each English statement in the FR section to French.   If this is a bit scary, or you end up getting a syntax error somewhere, just translate the English section in place without adding a section for your own language.  Then, send us your translated script and we'll add it as a separate section so that others can use it in the future.
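Since in-script translations depend on the .pbl file being saved as UTF-8, it can be worth checking the encoding before testing or sending the script. The following Python sketch is one way to do that outside PEBL. The function name and return labels are made up for this example, and the check is only a heuristic: a byte-order mark at the start suggests UTF-16, and anything that fails strict decoding is reported as not UTF-8.

```python
import codecs


def guess_encoding(path):
    """Return 'utf-16', 'utf-8', or 'not utf-8' for the file at path."""
    with open(path, "rb") as handle:
        data = handle.read()
    # A UTF-16 byte-order mark at the start means the editor saved UTF-16.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        data.decode("utf-8")  # strict decoding raises on invalid bytes
    except UnicodeDecodeError:
        return "not utf-8"
    return "utf-8"
```

If the result is anything other than 'utf-8', re-save the script from your editor with UTF-8 selected as the encoding and check again.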

Scripts without current translations

Most scripts will not have translations yet; I only set up a script so that it is easy to translate into multiple languages once someone submits an alternate language.  Such scripts will not have a GetStrings() function.  Instead, English text for instructions, labels, and so on will be interspersed throughout the script. For example, the match-to-sample task has no current translations, but it only requires translating a few text strings.

  To translate, simply replace the text assigned to the instruc, inst1, and inst2 variables with the text of your choice. For these scripts, there is no need to specify the language at first: the script does not check the language when it runs, and that part will have to be added later.  ONCE YOU DO THE TRANSLATION, SEND US YOUR TRANSLATED SCRIPTS SO WE CAN ADD THEM TO THE BATTERY AND OTHERS CAN USE YOUR TRANSLATIONS.
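Because the English text in these scripts is scattered through the code, a first step is simply finding it. As a rough aid outside PEBL, the Python sketch below lists every double-quoted string in a script along with its line number. It assumes each string starts and ends on one line and contains no escaped quotation marks, and it skips only lines that begin with the # comment marker, so the output will also include strings that participants never see (file names, for example). The names used here are made up for this example.

```python
import re

# Matches text between a pair of double quotes on a single line.
STRING_LITERAL = re.compile(r'"([^"]*)"')


def list_strings(source):
    """Return (line_number, text) pairs for quoted strings in script source."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # whole-line comment, nothing to translate
        for match in STRING_LITERAL.finditer(line):
            found.append((number, match.group(1)))
    return found
```

To use it, read the script text with open("yourscript.pbl", encoding="utf-8").read() (a hypothetical file name) and pass the result to list_strings(); the returned list works as a checklist of candidate strings to translate.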

Font issues

 Sometimes, the characters your language needs are not supported by the default PEBL font.  For the most part, PEBL tests use one of several predefined fonts, defined using the global variables gPEBLBaseFont, gPEBLBaseFontMono, and gPEBLBaseFontSerif.  For certain languages, PEBL will select a different base font.  For CJK languages, this currently gets set to the wqy-zenhei.ttc font family, whereas for other languages it gets set to DejaVuSans.ttf/DejaVuSansMono.ttf/DejaVuSerif.ttf, which have good support for extended character sets outside CJK.  If the font is not acceptable, you can override this by using your own font. Just add the font to the directory the script is in, and set those variables to your font file names:

gPEBLBaseFont <- "myfont.ttf"
gPEBLBaseFontMono <- "mymonofont.ttf"
gPEBLBaseFontSerif <- "myseriffont.ttf"

Summary

This post gives basic instructions for translating PEBL scripts.  And if you take the time to do the translation, please give back to the community and make your translations available for others. 
