How to type Greek, Greek Polytonic in Linux
Update 2010: Please see the docs.google.com edition of the guide as it has the latest material. See link below.
There is a new guide on how to write Greek and Greek Polytonic in Linux, and in particular using the latest versions of Linux distributions.
https://docs.google.com/View?docID=dccdrjqk_4cqjn9zcj (LATEST VERSION)
The guide shows in detail how to add the Greek keyboard layout to your Linux desktop, and how to write Greek, Greek Polytonic and other Ancient Greek characters.
The guide is also available in both ODT and PDF format. (Both files are somewhat obsolete; use the Google Docs URL above instead.)
For a Greek version of the guide, please see http://docs.google.com/Doc?id=dccdrjqk_3gx3bq5f9 (does not update as often as the English version)
We attach the HTML version of the guide in this post. The docs.google.com version is the latest; please read that instead.
http://docs.google.com/Doc?id=dccdrjqk_4cqjn9zcj
How to easily modify a program in Ubuntu (updated)?
Some time ago we talked about how to easily modify a program in Ubuntu. As an example, we modified gucharmap: we got the deb source package, made the change, compiled, created new .deb files and installed them.
We go the same (well, similar) route here, by modifying the gtk+ library (!!!). The purpose of the modification is to allow us to type, by default, all sorts of interesting Unicode characters, including ⓣⓗⓘⓢ , ᾅᾷ, ṩ, and many more.
The result of this exercise is a set of replacement .deb packages for the gtk+ library that we are going to install in place of the system libraries. Because these new libraries will not be original Ubuntu packages, the update manager will be pestering us to roll back to the official gtk+ packages. This is actually good in case you want to switch back; you will have the enhanced functionality for as long as you postpone that update.
There is a chance we might screw up our system, so please make backups, or have a few drinks first and come back. I take no responsibility if something bad happens on your system. If you are having any second thoughts, do not follow the next steps; use the safer alternative procedure. You may, however, try this guide just for kicks; up to the dpkg command below, no changes are being made to your system.
We use Ubuntu 7.10 here. This should work in other versions, though your mileage may vary.
The compilation procedure takes time (about 30 minutes) and space. Make sure you use a partition with >2GB of free space. We are not going to use up 2GB (a bit less than 1GB), but it's nice not to fill up partitions.
We are going to use the generic instructions by ducea on how to recompile a Debian package.
First of all, install the development packages,
sudo apt-get install devscripts build-essential
Next, we use the apt-get source command to get the source code of the GTK+ 2 library,
cd /home/ubuntu/bigpartition_over2GB/
apt-get source libgtk2.0-0
We then pull in any dependencies that GTK+ may require. They are normally about a dozen packages, but we do not have to worry about the details.
sudo apt-get build-dep libgtk2.0-0
At this stage we need to touch up the source code of GTK+ before we go into the compilation phase. Visit the bug report #321896 – Synch gdkkeysyms.h/gtkimcontextsimple.c with X.org 6.9/7.0 and download the patch (look under the Attachment section). You should get a file named gtk-compose-update.patch. If you have a look at the patch, you will notice that it expects to find the source of gtk+ in a directory called gtk+. Making a link solves the problem,
ln -s libgtk2.0-0 gtk+
We then attempt to apply the patch (perform a dry run), just in case.
patch -p0 --dry-run < /tmp/gtk-compose-update.patch
If this does not show an error message, you can run the command again without the --dry-run.
patch -p0 < /tmp/gtk-compose-update.patch
Finally, we are ready to build our fresh GTK+ library.
cd libgtk2.0-0
debuild -us -uc
This will take time to complete, so go and do some healthy cooking.
At the end of the compilation, if all went OK, you should have about a dozen .deb files created. These are one directory higher (do a "cd .."). To install, use dpkg,
sudo dpkg -i *.deb
If you have any other deb files in this directory, it's good to move them away before running the command. If all went ok, the .deb files should install without a hitch.
The final step is to restart your system. To test the new support, see the last section at this post. Use Firefox and OpenOffice.org to type those Unicode characters.
If you managed to wade through all these steps, I would appreciate it if you could post a comment.
Good luck!
task update (el)
Following Dimitris's example of giving a brief update on what has been done,
- I added Giannis Katsampiris's translations to the GNOME SVN. I reviewed the translations and sent comments to the gnome.gr list. Giannis translated or updated the translations for vinagre, gnome-mag, mousetweaks, mousetweaks-help.
- I posted comments at http://laptop.grinia.net/ for the consultation on the student laptop. I see that not many comments have been made by others. It would be good if you did.
- I added Alexandros's blog to the planet. We exchanged a few emails about some technical issues (one post had a character that is not UTF-8, so the whole feed showed up as ?????; how we can combine WordPress categories to create a more suitable feed; use of FeedBurner). Remaining: use of FeedBurner's special enhancements, such as showing the number of comments. Hmm, full posts in the feed?
- I sent emails to the localisation list of the student laptop about 1) a reminder of the glossary that ellak.gr provides for computing terms (anyone can comment or make changes), 2) a report on the translation of eToys (we are now at 6%).
- A few weeks ago the patch for the Greek keyboard layout, for the use of the dead_psili and dead_dasia symbols, was accepted. This means that in the new distributions of March/April polytonic will work, but then again it might not...
- A patch was sent for polytonic support in GTK+; because the window for adding new features to GTK+ is almost closed, it will probably go in later, in the next release.
- When you unmount a USB volume/device, the system does not put the device into a powered-off or low-power state. This is in Nautilus, but I believe elsewhere too. Not all devices have this capability, and it seems that I do not have a single such device (one that supports off-standby). On the other hand, VirtualBox managed to put such a device into the off state (how did it do that!?). Adding support is on the TODO for now.
- This weekend I will be in Brussels for FOSDEM!
Update: link to the comments from the consultation.
Create flash videos of your desktop with recordmydesktop
John Varouhakis is the author of recordmydesktop and gtk-recordmydesktop (front-end) which is a tool to help you record a session on your Linux desktop and save it to a Flash video (.flv).
To install, click on System/Administration/Synaptic Package Manager, and search for gtk-recordmydesktop. Install it. Then, the application is available from Applications/Sound&Video/gtkRecordMyDesktop.
Before you are ready to capture your Flash video, you need to select the video area. There are several ways to do this; the most common is to click on Select Window, then click on the window you want to record. A common mistake is to try to select the window from the preview above. If you do that, you will have selected the recorder itself as the window to record, which is not really useful. You need to click on the real window in order to select it; then, in the desktop preview you can see the selected window. In the above case, I selected the OpenOffice Writer window.
Assuming that you do not need to do any further customisation, you can simply press Record to start recording. Generally, it is good to check the recording settings using the GNOME Sound Recorder beforehand. While recording, you will notice a special icon on the top panel. This is gtk-recordmydesktop. Once you press it, recording stops and the program does the post-processing of the recording. The resulting file goes into your home folder, and has the extension .ogv.
Some common pitfalls include
- I did not manage to get audio recording to work well for my system; I had to disable libasound so that the audio recording would not skip. With ALSA, sound skips while with OSS emulation it does not. Weird. Does it work for you?
- The post-processing of the recording takes some time. If you have a long recording, it may take a while before it shows any progress, so you might think it crashed. Have patience.
I had made one such recording, which can be found at the Greek OLPC mailing list. John told me that the audio part of the video was not loud enough, and one can use extra post-processing to make it sound better. For example, one could extract the audio stream of the video, remove the noise, beautify (how?) and then add back to the video.
It's good to try out gtk-recordmydesktop, even for a small recording. Do you have some cool tips from your Linux desktop that you want to share? Record your desktop!
Typing squiggles and dots in GNOME and GTK+ applications
Garrett asks how to type squiggles and dots in GNOME; that is, how to type characters such as á à ä ã â ą ȩ ę ő ǰ ǩ ǒ ġ ṅ ȯ ṁ ė.
There are several ways, and one can choose depending on how frequently they need to type these characters or how much time they are willing to invest in learning.
① One option is to start the Character Map (Applications/Accessories/Character Map), pick the character, copy and paste it. This is good for rare characters and weird situations such as
┏━━━━━━━━━━━━━━━━━━━━━━━┓
⟁⟁⟁⟁♥♀★★▶◀☆♀░░░▒▒▒▓▓▓▙▚▛▙▙▙▞
The Unicode standard, apart from defining characters for languages, also defines symbols, dingbats and all sorts of things. If your distribution is based on the DejaVu fonts (such as Ubuntu), then you are probably covered for many of these symbols. If you do not have a suitable font, or you use Windows, you will be wondering what the hell I am talking about.
② Another option is to use the Character Palette applet, which sits on the panel and shows a small configurable repertoire of characters such as áàéíñó½©ث€. You select one of the characters with the mouse, and wherever you middle-click, this character is typed. This is an improvement over ①, and good when you often need to type rare characters. It is not convenient for typing characters normally found on a keyboard layout.
③ To type characters normally found in a specific language(s), it is good to set up a suitable keyboard layout. For this, it is good to add the Keyboard Indicator applet; right click on the panel, click Add to panel... and choose the Keyboard Indicator from the Utilities section. The US English keyboard layout (Default variant) does not provide any interesting characters apart from those shown printed on the keys of a US Keyboard.

The US English International (with dead keys) variant might be a better option,
Or the United Kingdom layout.
You can get a similar image for your layout when you right-click on the Keyboard Indicator applet, then click Show Current Layout.
Each key in the images contains up to four characters. Starting from bottom-left and going clockwise, these are the characters produced when
ⓐ you press the key
ⓑ you press the key with Shift (or Caps Lock)
ⓒ you press the key with AltGr and Shift (or Caps Lock)
ⓓ you press the key with AltGr
For example, with the UK keyboard layout, the key G produces g, G, Ŋ, ŋ.
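The four levels can be pictured as a small lookup table. The following is only a sketch: the names `LEVEL_INDEX`, `UK_LAYOUT_SAMPLE` and `character_for` are made up for illustration, and the single entry is the G key of the UK layout as described above.

```python
# Position of each modifier combination in the per-key tuple, in the same
# order as the list above: plain, Shift, AltGr+Shift, AltGr.
# The dictionary key is (shift_pressed, altgr_pressed).
LEVEL_INDEX = {
    (False, False): 0,  # the key on its own
    (True, False): 1,   # Shift
    (True, True): 2,    # AltGr + Shift
    (False, True): 3,   # AltGr
}

# Hypothetical one-key excerpt of the UK layout (the G key from the text).
UK_LAYOUT_SAMPLE = {"g": ("g", "G", "Ŋ", "ŋ")}


def character_for(layout, key, shift=False, altgr=False):
    """Return the character a key produces for the given modifier state."""
    return layout[key][LEVEL_INDEX[(shift, altgr)]]


if __name__ == "__main__":
    print(character_for(UK_LAYOUT_SAMPLE, "g", altgr=True))
```

A real layout is defined in the xkeyboard-config data files, not in Python; the table only mirrors how the images are read.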
If AltGr + Shift + letter does not work for you, see the FDO Bug #2871 Different results for shift-altgr and altgr-shift.
Using the appropriate keyboard layout is the way to go when writing text that requires squiggles. You can either choose a layout with dead keys (meaning that some keys lose their normal functionality), or you can pick a layout where the dead keys are only available when you press AltGr + key. For example, in the UK Keyboard layout - Default variant, AltGr + ; + a produces á, and AltGr+Shift+]+e produces ē.
Photo by titanas. The OLPC uses those four levels for its keyboard layout. You can see all the variations printed on the keyboard.
④ Another option to produce more characters on the keyboard is to enable the compose key, and use compose sequences. A compose sequence looks similar to what we described above (i.e. AltGr+Shift+]+e to produce ē), but the idea is that we use it for characters we want to have available across all the keyboard layouts we may have enabled.

The compose key is very powerful functionality, yet it is not enabled by default, and lies hidden in the Layout Options tab. I prefer to set it to Menu, but every person has their own preference.
For example,
- Compose key + - + a produces ã,
- Compose key + < + c produces č
- Compose key + 1 + s produces ¹ (Superscript on 1. Try to replace 1 with 2.)
- Compose key + + + - produces ±
Currently, GTK+ provides 640 such compose sequences involving the Compose key, and hopefully soon it will increase to over 3000.
The Compose key is known as Multi_key in the source code (Xorg, GTK+, etc).
The Compose key compose sequences offer the ability to define smart mnemonics on how to produce characters. It is much easier to type ComposeKey + 1 + s than to remember the codepoint value of ¹ (superscript 1). As with many things open-source, there are too many options, and with the Compose key there is the issue of which one to pick as a sensible default, and how to make it prominent for those who might want to use it.
It appears to me that there should be more effort to promote the functionality that is provided with the standard keyboard layouts (choose a better keyboard layout, produce characters provided in the third and fourth levels, etc). In this respect, Compose key sequences should be a complement, to be considered after the main discussion on keyboard layouts has taken place.
⑤ There is a last issue on switching keyboard layouts to cover in a separate post.
Improving input method support in GTK+-based apps
When a bug report gets long with many comments, it gets more difficult for someone to get the full picture of what is going on. I'll attempt to summarise here what's being said in Bug 321896, Synch gdkkeysyms.h / gtkimcontextsimple.c with X.org 6.9/7.0.
GTK+-based applications use by default the GTK+ Input Method in order to let users type in different languages. Some scripts are very complex (such as SE Asian scripts) and in this case SCIM is used, replacing the GTK+ Input Method. One can even disable GTK+ IM altogether and use the basic X Input Method (XIM) which is provided by the Xorg server, by setting GTK_IM_MODULE to xim. However, the majority of the users have GTK+ IM enabled.
For both GTK+ IM and XIM, the keyboard layouts are managed by the xkeyboard-config project and Sergey Udaltsov. A keyboard layout is simply a mapping of keyboard keys to Unicode characters, but you can also have compose sequences for some characters using what we call dead keys. When you press a dead key, nothing appears on screen, but when you press a letter immediately afterwards, you can get a character such as á. This functionality is commonly used to add accents, and there is a big table (1.3MB) of these compose sequences and the Unicode characters they produce.
If you change your keyboard layout (System/Preferences/Keyboard/Layout) to something like U.S. English International (with dead keys), then the ' key on your keyboard becomes dead_acute, and the compose sequence
<dead_acute> <a> : "á" U00E1 # LATIN SMALL LETTER A WITH ACUTE
works when you press ' and then a.
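To make the format of such an entry concrete, here is a minimal Python sketch that splits one line of the compose table into its parts. The function name `parse_compose_line` and the regular expression are my own and only cover the simple form shown above (keys in angle brackets, a quoted result, an optional keysym name and an optional comment); real Compose files contain other constructs that this does not attempt to handle.

```python
import re

# Matches lines such as:
#   <dead_acute> <a> : "á" U00E1 # LATIN SMALL LETTER A WITH ACUTE
COMPOSE_LINE = re.compile(
    r'^\s*((?:<[^>]+>\s*)+):\s*'   # one or more <key> names, then a colon
    r'"((?:[^"\\]|\\.)*)"\s*'      # the quoted result string
    r'([^\s#]\S*)?\s*'             # optional keysym name, e.g. U00E1
    r'(?:#\s*(.*))?$'              # optional trailing comment
)


def parse_compose_line(line):
    """Return (keys, result, keysym, comment), or None if not a sequence."""
    match = COMPOSE_LINE.match(line)
    if match is None:
        return None
    keys = re.findall(r'<([^>]+)>', match.group(1))
    return keys, match.group(2), match.group(3), match.group(4)


if __name__ == "__main__":
    line = '<dead_acute> <a> : "á" U00E1 # LATIN SMALL LETTER A WITH ACUTE'
    print(parse_compose_line(line))
```

Lines that do not start with a key in angle brackets (comments, blank lines) simply return None.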
There is an issue with compose sequences and input methods; XIM maintains the official upstream version of the compose sequences, and projects such as GTK+ and SCIM carry their own copies of that table.
The issue with GTK+ regarding the compose sequences is that it has a very old version compared to what is available upstream. This is what Bug 321896 is about.
The bug would have been resolved much, much earlier if it wasn't for the insistence of the GTK+ maintainers on cutting the fat and reducing the size of the table (~6000 entries) with clever optimisations.
Tor suggested a clever optimisation; a good number of compose sequences (which look like <dead_acute> <a> : "á") resemble the decomposed form (à la Unicode) of those characters. Thus, we can let the user type what she wants, and we can try Unicode normalisation to see if the sequence composes to a single Unicode character. Let's demonstrate in Python,
$ python3
>>> import unicodedata
>>> sequence = [65, 0x301]  # That's 'A' and the combining acute accent
>>> result = unicodedata.normalize('NFC', ''.join(map(chr, sequence)))
>>> result
'Á'
>>> print(len(result))
1
>>> print(result)
Á
That long line above takes the list, applies the chr() function to each member so that they become Unicode characters, and then joins them into a single string. Finally, it normalises the (decomposed) string to a single character. The fact that the resulting string has length 1 (single character) is key to this optimisation. Over 1000 compose sequences can be removed from the compose table through this optimisation. This includes a big chunk of the Latin Unicode blocks, a few dozen Cyrillic characters, all of modern Greek and Greek polytonic, some Indic languages (are they actually used?) and other miscellaneous sequences.
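The same check can be wrapped in a small function that decides whether a given dead-key sequence could be dropped from the table. This is only a sketch of the idea, not the code in the patch: the mapping `DEAD_KEY_TO_COMBINING` is a hand-picked, hypothetical excerpt, and `derivable_by_normalisation` is a name made up here.

```python
import unicodedata

# Hypothetical excerpt: a few dead-key names and the Unicode combining
# characters they correspond to.
DEAD_KEY_TO_COMBINING = {
    "dead_grave": "\u0300",
    "dead_acute": "\u0301",
    "dead_circumflex": "\u0302",
    "dead_tilde": "\u0303",
    "dead_diaeresis": "\u0308",
}


def derivable_by_normalisation(dead_keys, base, expected):
    """True if base + combining marks normalises (NFC) to exactly `expected`.

    Such a sequence does not need its own row in the compose table.
    """
    marks = "".join(DEAD_KEY_TO_COMBINING[name] for name in dead_keys)
    composed = unicodedata.normalize("NFC", base + marks)
    return len(composed) == 1 and composed == expected


if __name__ == "__main__":
    # <dead_acute> <a> : "á" can be derived, so the row is redundant.
    print(derivable_by_normalisation(["dead_acute"], "a", "\u00e1"))
```

Sequences for which the function returns False (there is no precomposed character, or the table maps the keys to something else) would have to stay in the table.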
Matthias laid out the requirements for the optimisation of the remaining compose sequences; ① it has to be static const so a single copy is shared all over the place, ② the first column (out of six) is repeated too often, thus use subtables, and ③ each row ends with a varying number of zeroes, so cut on those zeroes as well. This also required the automatic generation of the optimised table using a script.
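Requirements ② and ③ can be illustrated with a short Python sketch. It is not the generator script from the bug report; the row layout (a zero-padded tuple of key codes plus a result) and the name `pack_compose_table` are assumptions made for the example.

```python
from collections import defaultdict


def pack_compose_table(rows):
    """Group rows by their first key and drop the trailing zero padding.

    `rows` is an iterable of (keys, result) pairs, where `keys` is a
    fixed-width tuple of key codes padded with zeros.
    Returns {first_key: {remaining_length: [(remaining_keys, result), ...]}}.
    """
    packed = defaultdict(lambda: defaultdict(list))
    for keys, result in rows:
        trimmed = list(keys)
        while trimmed and trimmed[-1] == 0:
            trimmed.pop()  # requirement 3: cut the trailing zeroes
        if not trimmed:
            continue  # an all-zero row carries no sequence
        first, rest = trimmed[0], tuple(trimmed[1:])
        packed[first][len(rest)].append((rest, result))  # requirement 2: subtables
    return {first: dict(by_length) for first, by_length in packed.items()}


if __name__ == "__main__":
    # Illustrative rows: 0xFE51 is dead_acute, 0xFF20 is Multi_key.
    rows = [
        ((0xFE51, 0x61, 0, 0, 0), 0xE1),    # dead_acute a   -> á
        ((0xFE51, 0x65, 0, 0, 0), 0xE9),    # dead_acute e   -> é
        ((0xFF20, 0x2B, 0x2D, 0, 0), 0xB1), # Multi_key + -  -> ±
    ]
    print(pack_compose_table(rows))
```

The first key is stored once per subtable, and rows of the same length sit together, so no padding is needed. Emitting the result as a static const C array is left out of the sketch.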
The work has not finished yet, and requires testing of the patch. The high priority testing is that keyboard layouts do not get any regressions (that is, compose sequences with dead keys must continue to work along with any new sequences).
With an updated compose table in GTK+, one can write things like ⒼⓃⓄⓂⒺ and all variations of accents on characters, in an easier way.
I'd like to thank Matthias and Tor for their support in this work. And Jeff for adding this blog to Planet GNOME!
Localisation issues in home directory folders (xdg-user-dirs)
In new distributions such as Ubuntu 7.10 there is now support for folder names of personal data in your local language. What this means is that ~/Desktop can now be called ~/Επιφάνεια εργασίας. You also get a few more default folders, including ~/Music, ~/Documents, ~/Pictures and so on.
This functionality of localised home folders has become available thanks to a new FreeDesktop standard, XDG-USER-DIRS. xdg-user-dirs can be localised, and the current localisations are available at xdg-user-dirs/po.

A potential issue arises when a user logs in with different locales; how does the system switch between the localised versions of the folder names? For GNOME there is a migration tool; as soon as you log in to your account with a different locale, the system will ask whether you wish to switch the names from one language to another. This is available through the xdg-user-dirs-gtk application.
Another issue is with users who use the command line quite often; switching between two languages (for those languages that use a script other than Latin) tends to become cumbersome, especially if you have not set up your shell for intelligent completion. In addition, when you connect remotely using SSH, you may not be able to type in the local language on the computer you are connecting from, which would make work very annoying.
Furthermore, there have been reports of KDE applications not working; if someone can file a bug report and post the link, it would be great. The impression I got was that some installations of KDE did not read off the filesystem in UTF-8 but in a legacy 8-bit encoding. This requires further investigation.
Moreover, OpenOffice.org requires some integration work to follow the xdg-user-dirs standard; apparently it has its own option for the folder into which newly created files are saved. I believe this will be resolved in the near future.
Now, if we just installed Ubuntu 7.10 or Fedora 8, and we got, by default, localised subfolders in our home directory (which we may not prefer), what can we do to revert to non-localised folders?
The lazy way is to log out, choose an English locale as the default locale for the system and log in. You will be presented with the xdg-user-dirs-gtk migration tool (shown above) that will give you the option to switch to English folder names for those personal folders.
Clarification: This workaround (the log out and log in thing) implies that you then log out again, set the language back to the localised one (e.g. Greek) and log in. This time, when the system asks to rename the personal folders, you simply answer no, and you end up with a localised desktop but personal folders in English. Mission really accomplished.
If you are of the tinkering type, the files to change manually are
$ cat ~/.config/user-dirs.locale
el_GR
$
and
$ cat ~/.config/user-dirs.dirs
# This file is written by xdg-user-dirs-update
# If you want to change or add directories, just edit the line you're
# interested in. All local changes will be retained on the next run
# Format is XDG_xxx_DIR="$HOME/yyy", where yyy is a shell-escaped
# homedir-relative path, or XDG_xxx_DIR="/yyy", where /yyy is an
# absolute path. No other format is supported.
#
XDG_DESKTOP_DIR="$HOME/Επιφάνεια εργασίας"
XDG_DOWNLOAD_DIR="$HOME/Επιφάνεια εργασίας"
XDG_TEMPLATES_DIR="$HOME/Πρότυπα"
XDG_PUBLICSHARE_DIR="$HOME/δημόσιο"
XDG_DOCUMENTS_DIR="$HOME/Έγγραφα"
XDG_MUSIC_DIR="$HOME/Μουσική"
XDG_PICTURES_DIR="$HOME/Εικόνες"
XDG_VIDEOS_DIR="$HOME/Βίντεο"
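If you want to read this file from a script, the format described in its header comment is simple enough to parse by hand. The following Python sketch is an illustration only: `parse_user_dirs` is a made-up name, and it handles just the two documented forms ("$HOME/yyy" and "/yyy") without undoing any shell escaping.

```python
import re

# One entry per line, e.g. XDG_MUSIC_DIR="$HOME/Music"
ENTRY = re.compile(r'^XDG_([A-Z]+)_DIR="(.*)"$')


def parse_user_dirs(text, home):
    """Map names such as 'DESKTOP' to absolute paths.

    `text` is the content of user-dirs.dirs and `home` the home directory
    to substitute for $HOME. Shell escapes inside the path are left as-is.
    """
    dirs = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match is None:
            continue  # comments and blank lines
        name, path = match.groups()
        if path.startswith("$HOME/"):
            path = home.rstrip("/") + path[len("$HOME"):]
        dirs[name] = path
    return dirs


if __name__ == "__main__":
    sample = 'XDG_DESKTOP_DIR="$HOME/Επιφάνεια εργασίας"\nXDG_MUSIC_DIR="$HOME/Μουσική"\n'
    print(parse_user_dirs(sample, "/home/user"))
```

In practice the xdg-user-dir command-line tool shipped with xdg-user-dirs answers the same question; the sketch only shows how little there is to the file format.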
Personally I believe that having localised names appear under the home folder is good for the majority of users, as they will be able to match what is shown in Locations with the actual names on the filesystem.
There will be cases where software has to be updated and bugs fixed (such as in backup tools). As we proceed with more advanced internationalisation/localisation support in Linux, it is desirable to move forward and fix problematic software.
However, if enough popular support arises with clear arguments (I am referring to Greek-speaking users and a current discussion) for default folder names in English, we could follow the popular demand.
Also see the relevant blog post New Dirs in Gutsy: Documents, Music, Pictures, Blah, Blah by Moving to Freedom.
Cannot write Greek Polytonic in Linux
For up to date instructions for Greek and Greek Polytonic see How to type Greek, Greek Polytonic in Linux.
The following text is kept for historical purposes. Greek and Greek Polytonic now work in Linux, using the default Greek layout.
General Update: If you have Ubuntu 8.10, Fedora 10 or a similarly new distribution, then Greek Polytonic works out-of-the-box. Simply select the Greek Polytonic layout. For more information, see the recent Greek Polytonic post.
Update 3rd May 2008: If you have Ubuntu 8.04 (this probably applies to other recent Linux distributions as well), you simply need to add GTK_IM_MODULE=xim to /etc/environment. Start a Terminal (Applications/Accessories/Terminal) and type the commands (the first command makes a backup copy of the configuration file, and the second opens the configuration file with administrative privileges, so that you can edit and save):
$ gksudo cp /etc/environment /etc/environment.ORIGINAL
$ gksudo gedit /etc/environment
then append
GTK_IM_MODULE=xim
save, and restart your computer. It should work now. Try to test with the standard Text editor, found in Accessories.
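If you prefer to script the edit rather than use the text editor, the change amounts to making sure exactly one GTK_IM_MODULE line is present. Below is a minimal Python sketch of that logic, working on the file content as a string; `ensure_setting` is a hypothetical helper, and writing the result back to /etc/environment (which needs administrative privileges) is deliberately left out.

```python
def ensure_setting(text, name, value):
    """Return `text` with a NAME=value line, replacing an existing NAME= line.

    Running it a second time leaves the text unchanged.
    """
    wanted = f"{name}={value}"
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.strip().startswith(name + "="):
            lines[index] = wanted  # update the existing entry in place
            break
    else:
        lines.append(wanted)  # no entry yet: append one at the end
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    original = 'PATH="/usr/local/bin:/usr/bin:/bin"\n'
    print(ensure_setting(original, "GTK_IM_MODULE", "xim"), end="")
```

Keep the backup copy from the first command regardless of how you make the change.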
In Ubuntu 8.10 (autumn 2008), it should work out of the box, just by enabling the Greek Polytonic layout.
Update 20th June 2008: If some accents/breathings/aspirations still do not work, this is probably related to your system locale (whether it is Greek or not). It works better when it is Greek. If you are affected and you do not use the Greek locale, there is one more thing to do.
$ gksudo cp /usr/share/X11/locale/en_US.UTF-8/Compose /usr/share/X11/locale/en_US.UTF-8/Compose.ORIGINAL
$ gksudo cp /usr/share/X11/locale/el_GR.UTF-8/Compose /usr/share/X11/locale/en_US.UTF-8/Compose
The first command makes a backup copy of your original en_US Compose file (assuming you run an English locale; if in doubt, read /usr/share/X11/locale/locale.dir). The second command copies the Greek compose file over the English one. You then log out and log in again.
End of updates
To write Greek Polytonic in Linux, a special file is used, which is called the compose file. There is a bit of complication here in the sense that the compose file depends on the current system locale.
To find out which compose file is active on your system, have a look at
/usr/share/X11/locale/compose.dir
Let's assume your system locale is en_US.UTF-8 (Start Applications/Accessories/Terminal and type locale).
In the compose.dir file it says
en_US.UTF-8/Compose: en_US.UTF-8
Note that the locale is the second field. If you have a different system locale, match on the second field. Many people make a mistake here. Actually, I think it would be faster for the system to locate the entry if the compose.dir file was sorted by locale.
Therefore, the compose file is
/usr/share/X11/locale/en_US.UTF-8/Compose
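The lookup can be sketched in a few lines of Python. This is an illustration, not how Xlib implements it: `find_compose_file` is a made-up name, and the sketch assumes two whitespace-separated fields per line (the first optionally ending in a colon, as in the entry above) and that anything after a # is a comment.

```python
def find_compose_file(compose_dir_text, locale_name, base="/usr/share/X11/locale"):
    """Return the compose file path for `locale_name`, or None if not listed.

    The locale is matched against the SECOND field of each entry.
    """
    for line in compose_dir_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments (assumed syntax)
        if not line:
            continue
        fields = line.split()
        if len(fields) != 2:
            continue  # not a "file locale" pair
        path, entry_locale = fields[0].rstrip(":"), fields[1]
        if entry_locale == locale_name:
            return f"{base}/{path}"
    return None


if __name__ == "__main__":
    sample = (
        "en_US.UTF-8/Compose: en_US.UTF-8\n"
        "el_GR.UTF-8/Compose: el_GR.UTF-8\n"
    )
    print(find_compose_file(sample, "en_US.UTF-8"))
```

To try it on a real system, pass in the content of /usr/share/X11/locale/compose.dir and the locale name reported by the locale command.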
So, what's the problem then?
Well, for the Greek locale (el_GR.UTF-8) we have a different compose file, a compose file in which Greek Polytonic actually works.
Therefore, there are numerous workarounds here to get Greek Polytonic working.
For example,
- If you speak modern Greek, you can install the Greek locale.
- You can edit /usr/share/X11/locale/compose.dir so that for your locale, the compose file is the Greek one, /usr/share/X11/locale/el_GR.UTF-8/Compose.
- You can edit the Greek compose file, take the Greek Polytonic section and update the Greek Polytonic section of en_US.UTF-8/Compose.
- You can copy the Greek compose file into your home directory under the name .XCompose. I did not try this one, and you may also be affected by this bug.
Of course the proper solution is to update en_US.UTF-8/Compose with the updated Greek Polytonic compose sequences. There is a tendency to add the compose sequences of all languages to en_US.UTF-8/Compose, and this actually is happening now. In this respect, it would make sense to rename en_US.UTF-8/Compose into something like general/Compose.




