Inputting Japanese text in Linux and some BSDs

First, the disclaimers--Japanese support in Linux and the BSDs improves constantly, so parts of this page are often out of date. If you find any errors here, feel free to send an email to scottro11[at]gmail.com.

This page is now listed in the scim-anthy README.

As I get busier, it becomes harder to keep up with various distributions. When this page was first begun, around 2002, it was one of the few of its kind and there were far fewer distributions to follow. In many cases, you are now better off going to the main forum for your distribution, and looking for recent tutorials there.

If you read this page, you'll see that almost all of the information has been gathered from others--in other words, if you have a problem and write me, I will help if I can, but I don't know that much about it.

A few quick introductory links: For a far more detailed treatment of this subject, see Dr. Mike Fabian's page on the Suse website. Charles Muller's site has a page on Japanese in Mandrake and David Thiel's page on Japanese in FreeBSD is brief, but very easy to understand and useful. David Oftedal has a page about Internationalization in Gentoo Linux which covers several languages.

JWS has an excellent page on multi-lingual text in Linux (which is again referenced in the printing section of this page.)

Note that this article only covers using Japanese in X.

I most frequently use Japanese in an xterminal with things like vi and mutt as well as OpenOffice. Therefore, these are often the only things that I checked. In most cases, if it works in these applications, it will also work with firefox, thunderbird and the like.

The kinput2 cannaserver combination used to be the input method of choice. These days, scim and anthy have become far more popular. With distributions I have tested, I will give information on using scim and anthy. There are some distros that don't have scim-anthy or uim-anthy or uim-scim packages. Although I give information below about compiling scim, anthy and scim-anthy from source, some of the more newcomer friendly distributions have trouble with compiling source code.

I suspect that ibus will wind up becoming the more popular input manager as time goes on. I cover it briefly in the Fedora section, but it's also available in Gentoo, Arch, and Ubuntu, as well as some others. Its main page with instructions is here, but, at least for Japanese, the instructions are a bit lacking. As they say, run ibus-setup to add anthy. After that, I've found that, at least in Arch, Ubuntu and Fedora, the best bet is to kill the daemon with pkill ibus, then restart it with
ibus-daemon --xim &

(In Fedora at least, you can also start it by running im-chooser, but that didn't seem to be included in the Arch or Ubuntu packages.)

The link above mentions the distributions that already have packages for it.

Most of this page, however, is about using scim-anthy.

RedHat and its offshoots

CentOS 5.x and Fedora both have scim-anthy available from their standard repos. One can choose to install Japanese support during installation, or just install needed packages afterwards. As RedHat based distributions tend to throw in everything you might possibly need under any circumstances, you might prefer to not select Japanese support during installation and just add scim-anthy afterwards.

In the later versions of Fedora (9 and 10 at time of writing) one can probably (as in, so I've been told by someone I trust, but haven't tested personally) just do
yum install scim-lang-japanese

This should pull in everything that you need. For earlier versions, follow instructions below.

Install anthy, scim and scim-anthy

yum install scim-anthy
This will pull in scim and anthy as dependencies.

If you always want Japanese immediately available, add these lines to your .bash_profile. (For the absolute newcomer, this file is located in your home directory. That is, if your user name is john, it will be found in /home/john, often referred to as $HOME or even ~. Note that it is a dotfile, that is, its name begins with a period, so if using one of the graphical text editors, you might have to tell it to show hidden files or show all files. I haven't used such editors in years, so I'm not quite sure of the latest way to do that.)

export XMODIFIERS='@im=SCIM'
export GTK_IM_MODULE="scim"
export QT_IM_MODULE="scim"
export LC_CTYPE=en_US.UTF-8
scim -d
The above assumes that United States style English is your default language.

To put them into effect immediately

source .bash_profile

You should see a message that scim is running. (However, this doesn't always work--if it doesn't, just log off and log on again.)

With UTF-8 as your default encoding, theoretically, you shouldn't even have to set LC_CTYPE to ja_JP.UTF-8. You should be able to use your own language, e.g. en_US.UTF-8. My experience has been that sometimes this works, and sometimes it doesn't. I would actually try setting LC_CTYPE to your own language (in my case, en_US.UTF-8) and seeing if things work properly. If they don't, then change LC_CTYPE from your native language to ja_JP.UTF-8.

Now when you start most applications, hitting ctrl+space will open up a little scim panel in the lower right of the screen. If you type in English letters, you will see hiragana appear. If you hit the space bar, it will select kanji. Note that the panel should have the word Anthy on it. If it doesn't, click the words RAW CODE or English (or whatever is shown there) and you should have an option for Japanese=>Anthy.

Using scim-anthy you should be able to use Japanese in most applications. If you have trouble inputting Japanese in an xterm, if, for example, you're using vi or mutt, use uxterm. (This can be called by simply typing uxterm from any command line.)

At some point, I got into a habit of only calling these variables when I needed them, and made a little lang.sh script.

#!/bin/sh 
XMODIFIERS='@im=SCIM' LC_CTYPE=en_US.UTF-8 ${1+"$@"} &

Then, I might call mutt, for example, with

lang.sh mutt

Whether or not this saves on resources, it became a habit.

If you choose to do it this way, it's not necessary to have the XMODIFIERS and LC_CTYPE lines in .bash_profile.

I've found that I can usually get away with leaving out the GTK_IM_MODULE and QT_IM_MODULE variables, although it's probably better to include them.

This can be a bit confusing as it varies from distro to distro and application to application. Sometimes, one doesn't even need the LC_CTYPE line, if your distro's default is en_US.UTF-8. One has to play with the variables and see what works for their distribution or O/S of choice, as well as their favorite applications.

If you wish your menus and the like to be in Japanese as well, you can add, either to the lang.sh script or your .bash_profile

LANG=ja_JP.UTF-8

Now, most applications will also work in Japanese--some things may show up as mojibake (gibberish) but you will be able to use Sylpheed, xchat (an irc client) etc in Japanese without problem. You'll also be able to input kanji as text in GIMP.

On rare occasions, I've found that the ctrl+space hotkey combination wouldn't open up the scim widget, although clicking on the icon that appears when scim starts would work. Scim creates a $HOME/.scim/config file which should have the line
/Hotkeys/FrontEnd/Trigger = Control+space

If that line is missing from the config file, add it. In some recent Fedora versions, the $HOME/.scim/ directory was never created. In that case, one creates the directory and config file--the config file only needs that one line.
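The check-and-add step above can be scripted. What follows is only a minimal sketch for a POSIX shell: the function name ensure_scim_trigger is made up for this example, and it assumes the config file uses the "key = value" layout shown above. If a Trigger line is already there with some other value, the sketch leaves it alone.

```shell
# ensure_scim_trigger [CONFIG_FILE]
# Adds the Control+space trigger line to a scim config file when the file
# has no Trigger line at all. The function name is hypothetical; the
# default path is the one mentioned in the text.
ensure_scim_trigger() {
    config=${1:-"$HOME/.scim/config"}
    line='/Hotkeys/FrontEnd/Trigger = Control+space'

    # Create the directory and an empty file if scim never made them.
    mkdir -p "$(dirname "$config")" || return 1
    [ -f "$config" ] || : > "$config" || return 1

    # Only append when no Trigger key is present; an existing line with a
    # different hotkey is treated as a deliberate choice and is kept.
    if ! grep -q '^/Hotkeys/FrontEnd/Trigger[[:space:]]*=' "$config"; then
        printf '%s\n' "$line" >> "$config"
    fi
}
```

Run with no argument, it works on $HOME/.scim/config. Running it a second time changes nothing.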

If using Gnome, and gnome-terminal, one can right click in the terminal, opening a menu that includes input method. One can choose scim from that menu. If you didn't choose Japanese support at installation, you will also need fonts. In Fedora, CentOS and Blag, I've found that doing
yum search fonts-ja

will give me some choices. Once fonts are installed, there should be no problem. In CentOS, for example, one will find ttfonts-ja. In Fedora, it might be ttfonts-japanese, or something similar.

A quick note for CentOS users. As of late November, 2008, one can install OpenOffice-3.x from the tarball on OpenOffice's site. However, for some reason, scim-anthy doesn't work with it. There were some posts about a similar problem with Ubuntu that had some suggestions, generally consisting of linking some library files from /usr/lib to library files in OpenOffice. However, neither I nor another Japanese speaker on the CentOS forums has had success with that method.

The problem is solely with OO-3 and scim. Otherwise, although installing 3rd party software in CentOS does sometimes risk its noted stability, OpenOffice-3 works without a problem. Hopefully, it will be fixed shortly.

There is an excellent blog on using Fedora 11 with Japanese at voom.net. Its author has generously given me permission to link to it here.

Gentoo Linux and Japanese

(Last update, June 2006)

I used to have an entire section on Gentoo. However, I don't use it these days, and when updating this page, a bit of research indicated that my method was entirely deprecated. Aside from the link to David's page at the top, the reader can also check this thread on Gentoo Forums.

Kevin W. (AKA sandcrawler on Gentoo Forums) was kind enough to send me his mini Gentoo howto.

He added the following USE variables

immqt-bc nls cjk unicode

Then

emerge --newuse world

Emerge the necessary programs

emerge scim anthy scim-anthy scim-qtimm

He added the following to his .bash_profile

export XMODIFIERS='@im=SCIM'
export GTK_IM_MODULE="scim"
export QT_IM_MODULE="scim"
export LC_CTYPE=ja_JP.UTF-8
scim -f socket -c socket -d

(If not booting into X, you might leave off the scim line and put it in .xinitrc or whatever file you use to start X.)

This enables him to input Japanese in most applications.

Ubuntu and Japanese

(Last tested with Ubuntu Intrepid Alpha, August 2008).
sudo apt-get install scim-anthy im-switch

I add the following to my .bashrc file.
export XMODIFIERS='@im=SCIM'
export LC_CTYPE=en_US.UTF-8

You might have to add Japanese language support--in my case, it was already installed. Go to System, Administration, Language Support and make sure that Japanese support is installed. If not, then check it off and install it.

You can run scim as a daemon when you log in by adding scim -d to your .bash_profile or .bashrc, or simply run it when you need it by typing scim -d at any command prompt. After that, in most applications, hitting ctrl+space should open up the scim widget in the lower right hand corner. It should work in the default gnome-terminal; however, you will probably have to go to the terminal's menu. Choose Terminal, Set Character Encoding and choose Unicode. Otherwise, although it will input the characters, when you hit enter, you will see question marks or other mojibake.

One thing that can be confusing is that Japanese may show as a supported language on a default install. However, if you open scim, Japanese won't be shown as an available language for input. To enable it, you still have to install additional Japanese support.
sudo apt-get install language-support-ja

There are almost always updated tutorials on using Japanese in Ubuntu on the Ubuntu Forums and the reader is advised to use their search function. Like Fedora, its default desktop is Gnome, and if you right click in gnome-terminal, if you have Japanese support enabled, you should see the menu for input method and be able to choose scim-anthy.

ArchLinux

(Last tested February 2008)
ArchLinux has packages for scim, scim-anthy and anthy. Add scim-anthy with pacman
pacman -S scim-anthy

(The scim-anthy package has both scim and anthy as dependencies and will install all three packages for you.) Once installed, set the XMODIFIERS and LC_CTYPE and call scim in your .xinitrc, before the line calling your window manager. For example, if your window manager is fluxbox

export XMODIFIERS="@im=SCIM"
export LC_CTYPE=en_US.UTF-8
scim -d
exec startfluxbox

There is a package for rxvt_ja which supports euc and also a package for rxvt-unicode. If you install rxvt-unicode, it's called with the command urxvt.

However, these days, I much prefer mlterm or xterm's builtin uxterm. ArchLinux also has some truetype fonts for Japanese.
pacman -S ttf-arphic-uming ttf-arphic-ukai
Desired locales should be created using locale-gen. Open /etc/locale.gen and you will see a list of locales, commented out with a # sign. Uncomment the ones that you want, for example, en_US.UTF8 UTF-8, and ja_JP.UTF8 UTF-8. Then run
locale-gen

You should see a message that the desired locales were created.
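If you would rather not edit the file by hand, the uncommenting can be scripted. This is a rough sketch, not an official Arch tool: enable_locale is a made-up name, and it assumes each entry sits alone on a line with a single # directly in front of it. Copy the entry exactly as it is spelled in your own /etc/locale.gen, since the spelling of the UTF-8 part can differ.

```shell
# enable_locale FILE ENTRY
# Uncomments ENTRY in a locale.gen-style file. ENTRY must match the
# commented line exactly, minus the leading "#". Hypothetical helper.
enable_locale() {
    file=$1
    entry=$2
    tmp=$(mktemp) || return 1
    awk -v want="$entry" '
        # A line that is exactly "#" followed by the entry gets uncommented.
        $0 == ("#" want) { print want; next }
        # Every other line passes through untouched.
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

As root, something like enable_locale /etc/locale.gen 'ja_JP.UTF-8 UTF-8' followed by locale-gen should then do the job, provided the entry really is spelled that way in your file.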

There was a thread on the ArchLinux forums started by someone who had better luck using uim instead of scim for Japanese input. For those who would prefer to use uim, the thread can be found here.

Installing from source.

If your distribution doesn't have a package for scim, anthy and scim-anthy, they can easily be installed from source.

Scim can be downloaded here, and anthy here. Note that the anthy link sends you to a download selection page. You want the latest version of anthy, not anthy-ss. At time of writing, it's 7900.

The scim-anthy source can be found here.

Once downloaded, untar and install the three programs. Install anthy first, then scim, and scim-anthy last. In each case, the commands are the same. The versions given in these examples are current at time of writing; change the command to fit the version you download.

tar -zxvf anthy-7900.tar.gz
cd anthy-7900
./configure --prefix=/usr && make && make install

Do the same for scim and scim-anthy in that order. Restart X and you should be able to call up scim input in any program by hitting ctrl+space.

You will also want Japanese fonts, especially if you are using Japanese in something like OpenOffice. Substitute kochi truetype fonts can be found from download.sourceforge.jp. You want the package kochi-substitute-20030809.tar.bz2.

Download it and untar it.

tar -jxvf kochi-substitute-20030809.tar.bz2

This will create a kochi-substitute-20030809 directory. You will see the kochi-mincho and kochi-gothic substitute fonts. They have a .ttf ending.

Move the fonts to /usr/X11R6/lib/X11/fonts/TrueType or /usr/X11R6/lib/X11/fonts/TTF if there is no TrueType directory.

(These are the typical directories called by the FontPath section in /etc/X11/xorg.conf. Doublecheck your system's xorg.conf and if the FontPath is different than the above, use that path.)
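Picking between the two directories can be done with a small helper. This is just a sketch: first_existing_dir is a made-up name, and the example in the comments assumes the .ttf files sit at the top of the unpacked kochi directory.

```shell
# first_existing_dir DIR...
# Prints the first argument that is an existing directory; fails if none is.
# Hypothetical helper, not part of any font package.
first_existing_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Example use (as root, from inside the unpacked kochi directory):
#   dest=$(first_existing_dir /usr/X11R6/lib/X11/fonts/TrueType \
#                             /usr/X11R6/lib/X11/fonts/TTF) &&
#       cp ./*.ttf "$dest"/
```

If your xorg.conf lists a different FontPath, pass that directory as the first argument instead.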

Slackware and some Slackware based distributions

(Last updated November, 2006)
Slackware worked without problem when I installed anthy, scim, and scim-anthy from source. I also installed the kochi fonts.

However, with one of its offshoots, Vector, I had trouble at first. I would open, for example, an mlterm session, hit ctrl+space, and the scim panel appeared. I then entered romaji, but rather than seeing hiragana, I saw dotted squares. If I typed correctly and hit space (for example, typing nihongo and hitting space once), the word nihongo, in kanji, would appear; however, I didn't see this until I hit enter.

The scim faq indicates that this is because scim isn't finding the fonts it needs. I am not sure what packages were missing--however, choosing to install gimp during the initial installation fixed the problem. Afterwards, even if I uninstalled gimp, it would still work properly.

Vector's default editor, like Slackware's, is elvis, which didn't work properly. I had to grab the Slackware package for vim and install it. I used a Slackware CD that I had, but if you don't have one, go to Slackware's package search site. I used the version from 10.1, which may have changed by the time you read this.

As of November, 2006, Vector has a scim-anthy package, which pulls in scim and anthy. I didn't see a font package, and used those kochi fonts I've mentioned above, that I manually retrieved from sourceforge.

kinput2 with canna.

In some cases, the scim-anthy combination might not work or not be available for your distribution.

Two programming friends, Godwin Stewart and Stuart Bouyer (who has done a great deal of work on Japanese input packages for Gentoo Linux) made me a tarball of a modified kinput2 and canna installation. It is not perfect--when one starts cannaserver, you see the message Terminated. However, doing pgrep cannaserver shows that it is running and it works perfectly for me.

The tarball is available from qnd-guides.net.

Thanks to the generosity of the Tokyo Linux User Group it is also available on their site. To use it, first download and untar it. You will see two gzipped files there, one for Canna and one for kinput. Install canna first as the kinput file will be looking for it.

tar -jxvf vanillajpn.tar.bz2
tar -zxvf Canna36p1.tar.gz
cd Canna36p1
xmkmf
make Makefile
make canna
make install
make install.man

When done, you'll have a file /usr/sbin/cannaserver.

Now kinput2

tar -zxvf kinput2-v3.1-beta3.tar.gz
cd kinput2-v3.1-beta3
xmkmf
make Makefiles
make depend
make 
make install

As every distribution has its own way to make a program run at startup, that is an exercise I will leave to the reader. For example, in Slackware, you can add a few lines to /etc/rc.d/rc.M. As I said, you will see, after starting /usr/sbin/cannaserver the word Terminated. However, it can be ignored.

You will need a terminal that can display Japanese. As mentioned above, one can use the builtin uxterm. The mlterm and rxvt-unicode programs also work.

Add these lines to your .xinitrc above the line that calls your window manager.

export XMODIFIERS='@im=kinput2'
export LC_CTYPE=ja_JP.UTF-8
kinput2 -canna &

This should enable you to input Japanese in most programs.

If you get an error similar to "Unable to set locale", the reason is often a mismatch in how the locale name is spelled: you have it as, for example, utf8, and the system is looking for UTF-8.
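One way to check for that kind of spelling mismatch is to compare locale names with the codeset part normalized. The sketch below is only an illustration: canon_locale and has_locale are made-up names, and the normalization (lowercase the part after the dot, drop hyphens) is my own simplification, not how the C library itself matches locale names.

```shell
# canon_locale NAME
# Reduces a locale name to a comparable form: the codeset part is lowercased
# and stripped of hyphens, so ja_JP.UTF-8 and ja_JP.utf8 compare equal.
canon_locale() {
    name=$1
    case $name in
        *.*)
            lang=${name%%.*}
            codeset=${name#*.}
            codeset=$(printf '%s' "$codeset" | tr 'A-Z' 'a-z' | sed 's/-//g')
            printf '%s.%s\n' "$lang" "$codeset"
            ;;
        *)
            # No codeset part (for example "C"); leave the name alone.
            printf '%s\n' "$name"
            ;;
    esac
}

# has_locale NAME
# Reads a locale list (such as the output of "locale -a") on standard input
# and succeeds if NAME is in it, ignoring codeset spelling differences.
has_locale() {
    want=$(canon_locale "$1")
    while IFS= read -r entry; do
        [ "$(canon_locale "$entry")" = "$want" ] && return 0
    done
    return 1
}
```

For example, locale -a | has_locale ja_JP.UTF-8 && echo available will report the locale whether the system lists it as ja_JP.UTF-8 or ja_JP.utf8. If it is listed under one spelling and your LC_CTYPE line still fails, try the spelling that locale -a shows.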

To sum up, most people consider the scim-anthy combination better than kinput2 and canna. If your distribution doesn't have packages for scim and anthy, you can download and install them, following the instructions given above. If they don't work for you, then use the kinput2 canna combination, using the vanillajpn tarball, for I have found that to work in almost every distribution that I have tried.

Despite there being over 400 Linux distributions, most of them seem to be based on RedHat, Debian or Slackware so the instructions above should work for almost every distribution.

FreeBSD

FreeBSD has a scim-anthy combination. If one installs the scim-anthy port it installs both scim and anthy.
cd /usr/ports/japanese/scim-anthy
make install clean

There is a package message, suggesting setting the LANG variable to ja_JP.eucJP. However, I haven't found this necessary.

In your .xinitrc file

export XMODIFIERS='@im=SCIM'
export GTK_IM_MODULE="scim"
export QT_IM_MODULE="scim"
export LC_CTYPE=en_US.UTF-8
scim -d

One will need a terminal capable of displaying unicode. There is the builtin uxterm, mlterm and rxvt-unicode. One oddity I have found is that if I try to type Japanese directly into one of these terminals, it may not display correctly. However, if one tries to cat a text file written in Japanese, it will display the file correctly.

FreeBSD's vi is nvi. I haven't gotten this working properly with Japanese, so I install /usr/ports/editors/vim-lite. One can create an alias by editing their shell's rc file. For example, I use zsh, so in my $HOME/.zshrc file I have

alias vi=vim

For OpenOffice and the like, I need Japanese fonts. I use the substitute kochi fonts in /usr/ports/japanese/kochi-ttfonts.

NetBSD

(Last updated November 2006)
NetBSD doesn't yet have scim or scim-anthy in pkgsrc. However, scim and scim-anthy are available in their Work In Progress collection. On my machine, the scim-anthy package failed to build, due to an "unable to allocate memory" error in gcc. However, googling indicated that adding
UNLIMIT_RESOURCES=	datasize

to the scim-anthy Makefile would fix the problem. I tried that solution and it worked.

If you want to stick with pkgsrc they do have anthy and uim. To use that combination
cd /usr/pkgsrc/inputmethod/anthy
make install clean; make clean-depends
cd /usr/pkgsrc/inputmethod/uim
make PKG_OPTIONS.uim="-canna" install clean; make clean-depends

(This will install uim with anthy and gtk).

Add the following to your .xinitrc above the line calling your window manager.
export XMODIFIERS=@im=uim
uim-xim --engine=anthy &
After starting X, you can then enter Japanese text by hitting shift+space. You turn off Japanese input in the same manner, hitting shift+space again.

Although NetBSD 3.x has en_US.UTF-8 as a locale, they don't have ja_JP.UTF-8. I've had mixed results using en_US.UTF-8 as my LC_CTYPE. Sometimes, if you set LC_CTYPE to en_US.UTF-8 though you can input Japanese, after hitting enter, all that appears are blank squares. You can create a ja_JP.UTF-8 locale by downloading en_US.UTF-8.src from
ftp://ftp.netbsd.org/pub/NetBSD/NetBSD-current/src/share/locale/ctype/en_US.UTF-8.src

and then using mklocale. If you downloaded it into your home directory, as root or with root privilege (assuming your user name is joe)
cd /usr/share/locale
mklocale < /home/joe/en_US.UTF-8.src > ja_JP.UTF-8

Hopefully, then a locale -a | grep ja_JP will show ja_JP.UTF-8.

I've found this was necessary to get UTF-8 working with, for example, thunderbird, though it wasn't necessary to input Japanese in a terminal.

In NetBSD, I've never gotten mlterm working properly, so I use rxvt-unicode. I haven't researched this deeply, but the unicode fonts in urxvt aren't as clean as the fonts used by, say, rxvt with eucJP encoding. The choice is up to the reader.

For eucJP encoding, I use either rxvt or mrxvt. If you use rxvt, after it's installed, there is a message telling you that double-byte encoding is disabled by default. You then have to edit /usr/pkg/lib/X11/app-defaults/Rxvt. You will see several lines marked !Rxvt.multichar_encoding
One of them has eucj at the end of it. Take out the ! at the beginning of that line. (Also, put a ! at the beginning of the top line, which ends with noenc.)
If you use mrxvt then edit /usr/pkgsrc/x11/mrxvt/Makefile. You will see a section of CONFIGURE_ARGS+= enabling xft, text-shadow and the like. Add
CONFIGURE_ARGS+=	--enable-xim
CONFIGURE_ARGS+=	--enable-cjk
CONFIGURE_ARGS+=	--with-encoding=eucj

You may, of course, choose to use kinput2 and canna with NetBSD. If so
cd /usr/pkgsrc/inputmethod
cd canna; make install clean
cd ../kinput2; make PKG_OPTIONS.kinput2="-wnn4 -sj3" install clean

When done, you'll have a /usr/pkg/sbin/cannaserver as well as kinput2. Cannaserver should be started as a daemon upon the next reboot. (You'll see that it also provides a script in /usr/pkg/local/rc.d.)
Once again, set your variables in your .xinitrc.

export XMODIFIERS='@im=kinput2'
export LC_CTYPE=ja_JP.eucJP
kinput2 -canna &
Like FreeBSD, I've found that I have to use vim rather than vi.

You may get an error when trying to start kinput2. It will say it can't load the app-defaults file and that XFILESEARCHPATH might be set incorrectly. This can also be added to .xinitrc; however, be sure to add it ABOVE the kinput2 -canna & line.

export XFILESEARCHPATH=/usr/pkg/lib/X11/app-defaults/Kinput2

DragonFlyBSD

(Last updated June 2006)
DragonFlyBSD now uses NetBSD's pkgsrc collection for third party software. The instructions above, for NetBSD, also work with DragonFly. Just use bmake instead of make when installing the various packages. DragonFly does have ja_JP.UTF-8 installed by default, so the reader can ignore the part about using mklocale.

A Digression about Terminals and UTF-8

Not all people need or want Japanese in an xterm. However many do. For example, I use mutt, so to use Japanese in an email, I need a terminal capable of handling Japanese. Others only use it in things like OpenOffice and Firefox.

Although most browsers can read Japanese encoding, you might have to manually select it. In opera, firefox and mozilla, it's in View => encodings. Although there is an autoselect for Japanese, it doesn't always work. If you get a page in Japanese that seems to be mojibake, then try different encodings, including Unicode (which isn't in the Japanese section) and one should work.

Dark Prince from bsdnexus.com forums was kind enough to send the following. If you are creating a web page with Japanese in UTF-8, this code should make the viewer's browser use UTF-8 on the page. He tested this on apache, but it should work with any server that can use php. At the top of the page put

<?php header('Content-Type: text/html; charset=UTF-8'); ?>

Again, this will only work if your server has php enabled. Many ISP provided web pages don't support php.

Then, code that tells the browser to read UTF-8. (Dark Prince says this may not be necessary, but it probably can't hurt.) This code should be between the <head> </head> tags

<meta HTTP-EQUIV="content-type" CONTENT="text/html;charset=UTF-8">

Having the meta tags will not be sufficient to make the viewer's browser use UTF-8; the php code is necessary. However, if the page is in straight html, the meta HTTP-EQUIV tags should be enough. (Martin Swift was kind enough to point out that I'd neglected them myself, causing some of the special characters on the page to display incorrectly. I've fixed it since. Thank you Martin.) :)
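To see what charset a server is actually announcing, one can look at the response headers from the command line. Here is a small sketch: charset_from_headers is a made-up name, and it only looks at the Content-Type header, not at any meta tags inside the page.

```shell
# charset_from_headers
# Reads HTTP response headers on standard input and prints the charset named
# in the first Content-Type header that has one, lowercased. Prints nothing
# if no charset is announced. Hypothetical helper.
charset_from_headers() {
    tr -d '\r' |
    awk '
        !found && tolower($0) ~ /^content-type:/ {
            line = tolower($0)
            if (match(line, /charset=[^; ]+/)) {
                # Skip the 8 characters of "charset=" and drop any quotes.
                value = substr(line, RSTART + 8, RLENGTH - 8)
                gsub(/"/, "", value)
                print value
                found = 1
            }
        }
    '
}
```

With curl installed, curl -sI http://www.example.com/ | charset_from_headers prints the announced charset (the URL is only a placeholder). If it prints nothing, the server isn't sending a charset in the header and the browser falls back on the meta tag or its own guess.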

Lately, I've been playing with mrxvt. Again, there is no unicode support. If building from source, one needs to configure it as follows (in addition to any options you choose)

./configure --enable-xim --enable-cjk --with-encoding=eucj

If you are using FreeBSD, a patch I submitted has been accepted to add EUC input. When installing the port simply type

make -DWITH_JAPANESE install clean
Lately, it seems as if mlterm has become the most popular of the multilanguage terminals. We have a quick and dirty guide to it here.

Note that the guide suggests setting mlterm's font size to 14. I've found that if I use the default size of 16, if I set LC_CTYPE to ja_JP.UTF-8, the terminal becomes overly large. (I found this happened even if I set LC_CTYPE to en_US.UTF-8.) This doesn't seem to be distro or window manager specific. The QND guide mentioned above discusses setting mlterm with a transparent background. This is a matter of preference. Josh, who wrote the guide, has younger eyes than I do, but I prefer a gray background with black type. One can set the background at the command line, in .Xdefaults, or do as Josh suggests, creating a directory in your home directory called .mlterm and a file in that directory called main.

One note for others with aging eyes. Recently, it seems to me that urxvt's fonts have gotten smaller. I've found that setting the font size in .Xdefaults helps. I have this entry
urxvt*font: a14

Another thing that I've found with rxvt-unicode, specifically on CentOS is that installing the package doesn't create a termcap entry. This can cause problems with several programs, such as w3m, and even man. The solution is to run the "tic" command on the terminfo file. In /usr/share/doc/rxvt-unicode-<version-number>/etc you will find an rxvt-unicode.terminfo file. Change into that directory and run
tic rxvt-unicode.terminfo

and it usually fixes the problem, even without moving the file anywhere. If you are still getting an error such as can't find termcap entry, then, after running tic, copy the file over to /usr/share/terminfo/r. As a side note, if you are running urxvt on one machine and ssh into another machine that doesn't have it installed, you may get errors, either something like no mention of urxvt in termcap or even WARNING: terminal is not fully functional. You can always temporarily fix it by typing
TERM=xterm
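
The same fallback can be made automatic on the remote machine. This is a sketch only: ensure_known_term is a made-up name, and it relies on infocmp, which comes with ncurses. If infocmp isn't installed at all, the sketch also falls back to xterm, which is a deliberate simplification.

```shell
# ensure_known_term
# If the current TERM has no terminfo entry on this machine, fall back to
# xterm. Hypothetical helper, meant for a shell rc file on the remote side.
ensure_known_term() {
    if ! infocmp "${TERM:-dumb}" > /dev/null 2>&1; then
        TERM=xterm
        export TERM
    fi
}
```

Calling ensure_known_term near the top of the remote machine's shell rc file should save typing TERM=xterm after every login.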

To get a list of available fonts one can type xlsfonts at a command prompt.

Using Putty from Windows

Much of this was taken from this page from umiacs.umd.edu.

If you are using Putty to open an ssh session, you can still view Japanese encodings. On the Windows machine, you will have to install Asian Language Support. (Control Panel, Regional Settings or Regional and Language Settings.) You will probably need the Windows installation CD for this. You should also choose the option to Install files for complex script and right to left languages. (The link given above has several screenshots.) Windows will suggest you reboot after installation. Do so.

Open your putty session, right click on the title bar, and choose Change Settings from the menu.
Go to Window, Appearance. Click the Change button in the Font settings section.
Choose MS Gothic or MS Mincho and Japanese as the script.
Go to the Translation section. In the dropdown box at top, choose UTF-8. Also check the box that says Treat ambiguous CJK characters as wide.

Once this is done, you should be able to view Japanese text in a putty ssh session from a Windows machine.

Using ibus

As of Fedora 11, Fedora's new default input manager is ibus. Fedora tends to be one of the first to try things, and sometimes, they're not completely ready.

Some of the advantages, according to a posting on the Fedora testing list:

* Ibus has been rewritten in C. Scim written in C++ using STL has problems with weak symbol conflicts without the added complexity and lower stability of the scim-bridge layer to workaround that.

* It is possible to write client and engines for ibus in any language that supports dbus bindings.

* ibus loads engines on demand rather than all installed engines as scim does, which improves the startup time and memory footprint.

* scim loads engines as dl-modules so a problem in any engine can take down scim, whereas in ibus because the processes are separated only a faulty process will die leaving rest of the system working normally.

* The architecture of ibus is bus-centric and so much closer to the CJK OSS Forum Workgroup 3 draft "Specification of IM engine Service Provider Interface".

It works quite well in most GTK applications that I've tried, such as firefox and gnome-terminal. It also works beautifully with openoffice.

As I use fluxbox or openbox rather than gnome, rather than use im-chooser, I will start it with
ibus-daemon &

This used to work perfectly, however, recent updates (as of June, 2009) seem to have changed it slightly. I'm not sure if this is a permanent change or not.

It would still work this way with any GTK application, such as gnome-terminal or firefox, and openoffice. However, it wouldn't work with my two most used xterminals, uxterm and urxvt.

To get it to work in those as well, I had to change this to
ibus-daemon --xim

I prefer to start it from command line. However, experimentation indicated that in Fedora, if I start it by using im-chooser (by typing im-chooser at the command line) this will also get it to work in everything. Although, after selecting it in im-chooser, it will give a message that you'll have to log out and log back in, and until then it will only work in GTK apps, it seems to work in everything as long as I enable it.

The variables to be set are
XMODIFIERS='@im=ibus'
GTK_IM_MODULE=ibus

LC_CTYPE can be set for your native language, ja_JP.UTF-8, or any other UTF-8 locale of your choice. Before using it the first time, run the ibus-setup program which will allow you to add anthy.
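
Since the variable names stay the same across the input methods covered on this page and only the values differ, they can be produced by one small helper. This is a sketch under that assumption: im_env is a made-up name, and the values are simply the ones used in the examples on this page.

```shell
# im_env NAME
# Prints export lines for one of the input methods discussed on this page.
# Hypothetical helper; the values follow the examples in the text.
im_env() {
    case $1 in
        scim)
            printf '%s\n' "export XMODIFIERS='@im=SCIM'" \
                          "export GTK_IM_MODULE=scim" \
                          "export QT_IM_MODULE=scim"
            ;;
        ibus)
            printf '%s\n' "export XMODIFIERS='@im=ibus'" \
                          "export GTK_IM_MODULE=ibus"
            ;;
        uim|kinput2)
            printf '%s\n' "export XMODIFIERS='@im=$1'"
            ;;
        *)
            echo "im_env: unknown input method: $1" >&2
            return 1
            ;;
    esac
}
```

In .xinitrc, one could then write eval "$(im_env ibus)" above the ibus-daemon line. LC_CTYPE is left out on purpose, as the right value depends on your system.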

Ubuntu also has it available in their Koala alpha at least, and it has a similar issue--though with Ubuntu, I couldn't get it working with firefox, even when using ibus-daemon --xim. It did however, work with urxvt as long as I put in the --xim option.

It does seem, otherwise, to be an improvement over scim, and I suspect that it will, sooner or later, become the input manager of choice.

Printing

(Last updated August, 2008)
Several applications will translate a file to postscript level 2 (or possibly higher). Acroread, xpdf, OpenOffice, mozilla, firefox and seamonkey will all do this. With such applications, assuming you can already print from them, no further work is necessary and Japanese will print out as written.

Printing in *nix can be non-trivial in itself. CUPS is making it easier when it works. When it doesn't work, one spends a lot of time searching Google, only to find many people with the same error messages and few solutions. I have a few simple CUPS solutions on another page.

Depending upon the distribution, installing OpenOffice can be a major undertaking. FreeBSD, for example, has the development version as a port that requires over 9 gigs of free space to compile. Building the port can take 6-8 hours on a reasonably fast machine.

In OpenOffice, be sure to enable Japanese support under Tools, Options, Language Support, Languages.

To print Japanese one needs the fonts (I use the kochi fonts mentioned above). Once this is done, you can use spadmin to add the fonts. In FreeBSD, they'll be in /usr/X11R6/lib/X11/fonts/TrueType. In some distributions, the path is the same save that it's called truetype. You may have to be root or have root privilege to run spadmin.

In FreeBSD, at least, rather than using spadmin, I just either copy or symlink the fonts to /usr/local/openoffice(version)/share/fonts/truetype. Without the fonts, you will be able to input Japanese in OpenOffice, but it won't print correctly.

One can use openoffice, firefox or seamonkey (as well as any other browser that does the postscript conversion for you) to print Japanese text files. Open the textfile in firefox, for example, and then print it.

Recently, looking through a page about UTF-8 I came across another solution for printing textfiles. The author mentions using the openoffice command with the -p option. For example, in FreeBSD, OpenOffice is called with openoffice.org. Suppose I have a text file in Japanese, called nihongo.txt
openoffice.org -p nihongo.txt

will print the text file. (Note that the openoffice command will vary between Linux distributions as well as the BSDs. Many distros use the command ooffice, others use soffice and no doubt other distros use something else.) This can be used with both UTF-8 and EUC encoded textfiles. This is simpler than opening the file in OpenOffice first, as it will correctly print the file without having to open the OpenOffice application.
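
Since the command name varies, a small shell function can pick whichever one is installed. This is only a sketch: find_office_cmd is a hypothetical helper name of my own, and the default candidates are simply the three names mentioned above. Your distribution may use something else.

```shell
# Hypothetical helper: print the first candidate command that exists.
# With no arguments it tries the three names mentioned in the text.
find_office_cmd() {
    if [ "$#" -eq 0 ]; then
        set -- openoffice.org ooffice soffice
    fi
    for c in "$@"; do
        if command -v "$c" >/dev/null 2>&1; then
            printf '%s\n' "$c"
            return 0
        fi
    done
    # Nothing found: report failure to the caller.
    return 1
}

# Usage (commented out so that nothing is sent to the printer by accident):
# office=$(find_office_cmd) && "$office" -p nihongo.txt
```

If none of the names is found, the function returns a non-zero status and the print command is never run.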

The author also mentions the paps program. It converts UTF-8 files to postscript. The mentions of FreeBSD below also apply to NetBSD.

It's available as a package in most distributions, and as a port in FreeBSD (/usr/ports/print/paps).

It didn't work for me with the very basic cups laserjet or deskjet ppds. I also needed the hpijs program, which provides more specific drivers for various HP printers.

Most distributions (and FreeBSD) have hpijs available as a package. If not, you can go to the HPLIP home page and install from source. The HPLIP package includes the hpijs package.

Once the package is installed, you can modify your printer, using the cups web interface or lpadmin command to change your printer's ppd from the generic deskjet or laserjet to the hpijs driver for your printer.
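
For the lpadmin route, something along these lines may help. It is a rough sketch rather than a recipe. The function names are hypothetical, and the queue name and model string passed to set_printer_model are placeholders that have to come from your own system. On many systems lpinfo and lpadmin need root privileges, and lpinfo may live in /usr/sbin.

```shell
# Hypothetical helper: list the driver models CUPS knows about
# whose description or file name mentions hpijs.
list_hpijs_models() {
    lpinfo -m | grep -i hpijs
}

# Hypothetical helper: point an existing print queue at a model.
# $1 = queue name (see lpstat -p)
# $2 = model string, i.e. the first field of a line from list_hpijs_models
set_printer_model() {
    lpadmin -p "$1" -m "$2"
}

# Usage, with placeholder values:
# list_hpijs_models
# set_printer_model myprinter "model-string-from-the-list"
```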

Make sure you have some Japanese fonts. Sometimes, even though scim will input Japanese perfectly, if you don't have some specific Japanese fonts paps won't print correctly. If your distribution doesn't have fonts available, use the kochi substitute fonts from sourceforge.

To use paps to print a textfile called nihongo.txt
paps nihongo.txt | lp 

I use mutt, and paps works well with it. With mutt, hitting the pipe key, |, will, obviously enough, pipe the email to another command. If I want to print a Japanese email I would use
|paps|lp

which sends the mail to paps and from there to the lp command.

Evolution and Thunderbird do the postscript conversion on their own--in other words, they will print Japanese emails as easily as firefox prints a Japanese web page.

Sylpheed needs the print command set to feed the email to paps. Changing the standard lpr %s that it uses for printing to
paps %s | lp

will enable Japanese emails to print. (This can be found in Configuration, Details, External Commands).

As JWS mentions in his page, if the file uses a different encoding, such as EUC, one can use iconv to first convert the file to UTF-8, then use paps. Running iconv with the -l option (a lower case L, as in list) shows the program's naming conventions. For instance, EUC encoding can be specified as EUC-JP or EUCJP (as well as by typing the complete EXTENDED_UNIX_CODE_PACKED_FORMAT_FOR_JAPANESE, but I doubt anyone would want to type that.)

The syntax is quite simple: one uses -f (as in from), -t (as in to) and the file name. So, let's say we wanted to print a textfile called nihongo_euc.txt. (This is in FreeBSD, where iconv -l shows supported encodings in upper case. I'm not sure about Linux; it may be lower case.)
iconv -f EUCJP -t UTF-8 nihongo_euc.txt | paps | lp
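
The conversion and printing steps can also be wrapped in a pair of shell functions. This is a sketch, not a tested print setup: to_utf8 and print_ja are hypothetical names, and the default of EUC-JP is just an assumption for the example. Check iconv -l for the spelling that your system accepts.

```shell
# Hypothetical helper: write a file to stdout as UTF-8.
# $1 = file name, $2 = source encoding (assumed to be EUC-JP if omitted).
to_utf8() {
    iconv -f "${2:-EUC-JP}" -t UTF-8 "$1"
}

# Hypothetical helper: convert to UTF-8, let paps turn the result into
# postscript, then hand it to lp.
print_ja() {
    to_utf8 "$1" "$2" | paps | lp
}

# Usage (commented out so that nothing is sent to the printer by accident):
# print_ja nihongo_euc.txt EUC-JP
```

Keeping the conversion in its own function makes it easy to check the output on screen with to_utf8 before sending anything to the printer.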

For me at least, paps was one of the final pieces that make Japanese almost as easy to use in Linux and some of the BSDs as it is in Windows. (I think Mac still has the edge as far as ease of Japanese use.)

Speaking of Mac, now that Apple bought CUPS, all of the above may become unnecessary. On Fedora starting from Fedora 8, and CentOS 5.2, running cups-1.3.4, if I run lp nihon.txt it correctly prints the Japanese text. However, this isn't the case on Ubuntu. So, whether this was something in the RH versions of CUPS, or somewhere else, I don't know.

One other minor issue that I had was with #!Crunchbang, a distribution based on Ubuntu that uses Openbox as its default window manager. Using paps nihon.txt|lp wasn't working. In that case, I had to specify paper size. The default is A4, and my printer uses US letter size paper. If I just ran paps nihon.txt|lp, the job would go through the printer and the cups logs showed everything working, but nothing would print. However, as long as I remember to specify paper size, using
paps --paper=letter nihon.txt|lp

it worked as it should.

Romaji

While only indirectly related, there are times when one wishes to use special characters when typing romaji. When writing for people studying Japanese, my own tendency is to imitate hiragana, and write, for example, juudou for the martial art. However, for people with no knowledge of the language, this can be somewhat confusing. Most systems that have locales will also have a Compose file for that locale. For example, in FreeBSD, there is a file /usr/local/lib/X11/locale/en_US.UTF-8/Compose. It will probably be elsewhere on a Linux system. That file will have several entries for special characters. Mine has things like
<Multi_key> <underscore> <a>  : "ā"   U0101 # LATIN SMALL LETTER A WITH MACRON

(If that didn't look like a small letter a with a line over it, then your browser is probably using a different encoding. Go to View=>Encodings and choose UTF-8 and you should see (among other things) a lower case "a" with a line over it.)

This means that if I use the Multi or Compose key, then hit underscore, then a, I will get an a with a line over it. The problem is setting the compose key.

This can be done globally (for all users) by making an entry in /etc/X11/xorg.conf. For example, to use the right Windows key one could add
   Option "XkbOptions"  "compose:rwin"

(Finding the right name can be tricky as it varies on different systems. In FreeBSD, one can find the name in /usr/local/share/X11/xkb/keycodes/xfree86. Usually, you're looking for xkb/keycodes/xfree86.) Another way is to use the xev program, find the numeric code and add it to xmodmap. For example, if I want to use the right hand Windows key or the menu key (usually to the right of it), I run the xev program from an xterm, which opens up a little box. (The xev program is usually installed by default. If not, it's readily available for almost all systems. In FreeBSD it's in /usr/ports/x11/xev.)

In the box, hit a key. In the terminal you used to call xev you'll see various things including (if we hit the right Windows key) keycode 116. Now, we create (if it doesn't exist) a $HOME/.xmodmaprc file. In it we add a line with the keycode that xev reported
keycode 116 = Multi_key

Next type
xmodmap ~/.xmodmaprc

If you now hit the right hand Windows key, then type _a, you'll see ā. So, I can use this to write, for example, jūdō.
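
The file edit can be scripted so that running it twice doesn't add the same line twice. This is a small sketch. add_multi_key is a hypothetical helper name, and the keycode has to be whatever xev reported on your own machine. Settings made with xmodmap only last for the current X session, so you would probably also want to run the xmodmap command from your session startup file.

```shell
# Hypothetical helper: append "keycode N = Multi_key" to an xmodmap file
# unless that exact line is already there.
# $1 = keycode reported by xev, $2 = file (defaults to ~/.xmodmaprc).
add_multi_key() {
    line="keycode $1 = Multi_key"
    rc="${2:-$HOME/.xmodmaprc}"
    grep -qxF "$line" "$rc" 2>/dev/null || printf '%s\n' "$line" >> "$rc"
}

# Usage, assuming xev showed 116 for the key you want:
# add_multi_key 116 && xmodmap ~/.xmodmaprc
```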

Special thanks to Dr. Mike Fabian for all his help, as well as several other members of the Tokyo Linux Users Group (tlug).