-------------------------------------------- Building a Japanese Environment from Scratch £Ì£Æ£Ó¤ÎÆüËܸì´Ä¶­¤Î¹½ÃÛ -------------------------------------------- AUTHOR: Dan Bernard DATE: 2005-04-14 LICENSE: GNU Free Documentation License v1.2 Other versions of this document may be released under the BSD Documentation License (FreeBSD sample license below) http://www.freebsd.org/doc/en/books/handbook/ln16.html SYNOPSIS: Setting up Japanese input and output for terminals and applications -------------------------------------------------------------------------- DESCRIPTION: -------------------------------------------------------------------------- A number of applications and libraries are required for practical support of the Japanese language in a UNIX-based environment. There is no single standard for any of them. This hint will cover the basics for setting up a working Japanese environment that supports input and output of standard Japanese characters. Localization and specialized applications should be regarded separately. -------------------------------------------------------------------------- PREREQUISITES: -------------------------------------------------------------------------- Depending on what you may have already installed with regard to general libraries and applications from Beyond Linux from Scratch (BLFS), and which optional programs you choose to install, you could require between 60 MB and 200 MB of free hard disk space for retrieving, unpacking, building, and installing this software. In any event, The build should require less than five static Binutils units (SBU) of compile time. This hint is good for any build of LFS with either framebuffer support or X windowing system. -------------------------------------------------------------------------- HINT: -------------------------------------------------------------------------- 1. Background -------------------------------------------- The de-facto standard for setting up Japanese language support in Unix-based environments today is package management. Either scripted builds with auto- mated dependency resolution or strictly managed binary package distributions are the only documented ways of making your Linux machine capable of reading and writing Japanese text in a way that you can use. However, such packages can not reasonably be managed in a scratch-built Linux environment. Because building from scratch is the only feasible and desirable way to achieve this interaction in such an environment, it helps to have proper documentation on the subject. Unfortunately, due to the prevalence of package management and fully localized distributions, no complete documentation exists publicly for those who may want or need to make a Japanese Linux environment from scratch without any sort of package management. It requires many different programs and libraries, a good deal of configuration, and a fundamental understanding of client-server models for input and conversion, which is largely different from simply installing certain applications which may have their own methods for input and conversion. The purpose of this document is to provide anyone the basic knowledge and guidance necessary to install complete support for a Japanese environment. This document will not demonstrate how to install any other applications that may use Japanese input as supported by the Japanese environment, but only the underlying utilities that allow such applications to function properly with Japanese input and output. 2. Requirements -------------------------------------------- The following software may have X11 dependencies (X11R4 or later), so it is best to install X11 first before attempting to install programs that have X11 dependencies. Libraries: Imlib2 libpng libast libjpeg freetype Programs: FreeWnn Canna (optional) kinput2 (requires any version of X) Eterm (optional in XFree86 4.4 or later) FreeWnn and Canna are some of many kana-kanji conversion servers. Only one is absolutely needed, although some applications prefer FreeWnn or Canna. For most applications, kinput2 is a general input and conversion client which connects to conversion servers and provides input to applications whose interfaces do not support Japanese input by default. Even without a conversion server, kinput2 can still input kana and convert from romaji. Eterm is a versatile terminal emulator for X11 with many graphical features. Eterm supports many languages in addition to having full CJK support. 3. Download -------------------------------------------- The following links point to places on the Internet where the latest versions of the required software can be found. freetype: http://freetype.sf.net/ libjpeg: http://ijg.org/files/ libpng: http://sourceforge.net/projects/libpng/ Imlib2: http://freshmeat.net/projects/imlib2/ libast: http://site.n.ml.org/info/libast/ Canna: http://download.sourceforge.jp/canna/ FreeWnn: ftp://ftp.freewnn.org/pub/FreeWnn/alpha/ kinput2: ftp://ftp.sra.co.jp/pub/x11/kinput2/ Eterm: http://www.eterm.org/download/ jfbterm: http://jfbterm.sourceforge.jp/ If any of the above links happen to be inaccessible when you attempt to connect, try doing a Google search for the name of the given project. I found each of these links from Google, and there are many more sites that distribute these popular sources. 4. Installation -------------------------------------------- Now that you have all the source tarballs unpacked in their appropriate directories, it is time to commence the build. Make sure that you know your installation paths before installation. I am assuming PREFIX=/usr as that is for compatibility with the official BLFS documentation, but of course any location should work fine. I use /usr/local personally. If you install any of the libraries in a non-default directory, be sure to run ldconfig after each build and installation. freetype: FreeType is a portable ANSI C-based TrueType font rendering library. freetype2: Last checked against version 2.1.5 This library is a part of the BLFS CVS. For more details, look at the latest BLFS. ./configure --prefix=/usr && make && make install freetype: This library only needs to be installed if your system cannot use the FreeType2 library. If you already have the above library installed, then skip this part. Last checked against version 1.3.1 Download size: 1.4 MB Estimated disk space required: 9.3 MB Estimated build time: 0.19 SBU ./configure --prefix=/usr && make && make install libjpeg: libjpeg is a package of libraries that handle image compression based on the standard of the Joint Photographic Experts Group (JPEG standard). Last checked against version 6b (IJG) This library is a part of the BLFS CVS. For more details, look at the latest BLFS. ./configure --enable-static --enable-shared --prefix=/usr && make && make install libpng: libpng is the official library that support all features of the Portable Network Graphics format. Last checked against version 1.2.5 This library is a part of the BLFS CVS. For more details, look at the latest BLFS. make prefix=/usr ZLIBINC=/usr/include \ ZLIBLIB=/usr/lib -f scripts/makefile.linux && make prefix=/usr install -f scripts/makefile.linux Imlib2: Imlib2 is an image library for X11 that transparently handles multiple image formats in a number of applications. Last checked against version 1.1.0 This library is a part of the BLFS CVS. For more details, look at the latest BLFS. ./configure --prefix=/usr && make && make install libast: libast stands for "Library of Assorted Spiffy Things." It handles string manipulation, text parsing, and memory tracking, among other things. libast was originally libmej. Last checked against version 0.5 Download size: 270 kB Estimated disk space required: 9.7 MB Estimated build time: 0.24 SBU ./configure --prefix=/usr && make && make install Canna: Canna is a popular kana-kanji conversion server made by NEC. Canna is formerly known as Iroha, but the protocol is still the same. Canna requires that your system have a user and group named 'bin' for the ownership of certain files. On a default LFS install, the group bin should already exist with gid=1. Therefore, I suggest that user bin have uid=1. This, however is arbitrary. Popular uid's for user bin also include 2 and 3 on other UNIX systems where 1 may already be used. Last checked against version 3.6p3 Download size: 1.4 MB Estimated disk space required: 28.8 MB Estimated build time: 0.69 SBU useradd -u 1 -g 1 -d / -s /bin/false bin && xmkmf && make Makefile && make canna && make install && make install.man && cp -a /usr/local/canna/include/canna/ /usr/include/ FreeWnn: FreeWnn is a popular kana-kanji conversion server based on Wnn. Wnn was developed jointly by Kyoto University, ASTEC Incorporated, and Tateishi Electronics Company (now OMRON Corporation). Wnn was the first full-featured kana-kanji conversion server available to the public. The latest version (Wnn6) is proprietary. FreeWnn also comes with uum, although it has to be built separately. Uum is a Japanese input method for any terminal, which means X is not necessary. This is thus especially handy on consoles, including jfbterm on Linux framebuffer consoles. Take note now that this package is optional, especially if you are running low on available hard disk space. I recommend installing FreeWnn only if you know that you need a third-party application that prefers interfacing with a Wnn server, or if you need uum. FreeWnn requires that your system have a user and group named 'wnn' in order to install and run. Therefore, I suggest that user wnn have uid=22273 and gid=22273. This is because the standard TCP/UDP port number for Wnn's Japanese kanji conversion server is 22273. This is arbitrary, however. Popular uid's for user wnn also include 49 and 69 on other UNIX systems where other versions of Wnn may be used. Last checked against version 1.1.1-a020 Download size: 2.9 MB Estimated disk space required: 61.8 MB Estimated build time: 0.93 SBU Estimated build time with uum: 0.99 SBU groupadd -g 22273 wnn && useradd -u 22273 -g 22273 -d / -s /bin/false wnn && ./configure --prefix=/usr --sysconfdir=/etc && make && make install && make install.man To compile and install uum, it will be necessary to run the following: cd Wnn/uum && sed -e 's/termcap/curses/g' ../../makerule.mk > makerule.mk.bak && mv makerule.mk.bak ../../makerule.mk && make && make install kinput2: kinput2 is a Japanese input method for X11. It can interface with a variety of kana-kanji conversion servers and input Japanese text into any kinput2 client. Supported input styles include on-the-spot, off-the-spot, over-the-spot, and root-window styles. kinput2 supports the new XIM Protocol. By default, kinput2 is configured to connect to Canna (Iroha) and Sj3 for kana-kanji conversion, and does not, by default, interface with any Wnn servers. This hint disables Sj3 support, leaving only support for a Canna server. If you wish for kinput2 to interface with Wnn or Sj3 or not to interface with Canna, simply edit the Kinput2.conf file in the root of the kinput2 source tree. Last checked against version 3.1 Download size: 500 kB Estimated disk space required: 19.7 MB Estimated build time: 0.17 SBU cp Kinput2.conf Kinput2.conf.bak && sed 's%#define\ UseSj3%\/\*\ #define\ UseSj3\ \*\/%' \ Kinput2.conf.bak > Kinput2.conf && xmkmf && make Makefiles && make depend && make && make install Eterm: Eterm is a full-featured VT102 terminal emulator. Last checked against version 0.9.2 Download size: 647 kB Estimated disk space required: 3.4 MB Estimated build time: 0.58 SBU ./configure --prefix=/usr --sysconfdir=/etc --without-terminfo \ --with-backspace=bs --enable-utmp --enable-auto-encoding \ --enable-multi-charset=kanji --enable-xim --enable-trans && make && make install jfbterm: Jfbterm is a terminal emulator capable of displaying Japanese character sets (and other multibyte character sets) on a Linux framebuffer console. First off, while this can work (working version released right after I tested the buggy one that didn't make it into my original hint!), it can still be very touchy depending on what else is on your system. While it is not absolutely required to have X installed, it does depend on a lot of fonts that would usually be bundled with an X11 install. It may not look clean, but the easiest way I got this to build and run on a machine without X was to copy /usr/X11R6 from another machine and keep the fonts. You may also need to create a fonts directory in your $PREFIX/share for whatever reason if you do not already have it. There may also be a very annoying dependency on automake-1.4, which may require you to have another local install of automake if you already use another version. I tried many things to rectify this, but always came up short, and did not want to learn more about automake/autoconf given that they are both more trouble than they are worth. The least painful measure I can recommend is installing automake-1.4 locally, compiling jfbterm, and then deleting automake-1.4 if no longer necessary. Another important thing that may not be present on a default LFS system is a framebuffer device node. If `ls /dev/fb*` comes up blank, you'll need to make one. If you have a script in your /dev directory, you can modify that and run it appropriately, or just use the appropriate mknod command depending on your system, e.g. `mknod /dev/fb0 c 29 0`. This would also be a good time to check the permissions on your device nodes and set them accordingly. Last checked against version 0.4.6 Download size: 121 kB Estimated disk space required: 1.78 MB Estimated build time: 0.05 SBU ./configure --prefix=/usr --enable-reverse-video --enable-color-gamma \ --enable-dimmer && sed -e 's/utmp/root/' Makefile > Makefile.bak && mv Makefile.bak Makefile && chmod 755 configure && make && make install Now, make sure that the jfbterm terminfo exists. tic terminfo.jfbterm 5. Configuration -------------------------------------------- After all the necessary software libraries and programs have been installed, there still remains some configuration that must be done in order for the Japanese environment to work. Conversion Servers and Scripts: Canna: It is necessary first to start your kana-kanji conversion server. It is easy to start Canna, but an adjustment needs to be made first. Because all Canna programs are located in /usr/local/canna/bin by default, they need to be added to the default PATH. In a csh-compatible shell, run setenv PATH `echo $PATH`:/usr/local/canna/bin If you use a Bourne-compatible shell, then you can run export PATH=`echo $PATH`:/usr/local/canna/bin Since you must run cannaserver as root, then you will be setting environment variables quite often. To modify your default PATH so that it contains the Canna programs each time you log in, then you can change the PATH settings in your login scripts, preferably the global /etc/login.defs file, and be sure to include that path in the superuser settings. To start the Canna conversion server, become root, verify that your PATH environment variable is correct, and then run cannaserver start To make sure that your Canna server is running, run cannastat (as any user), and you should see something resembling Connected to unix Canna Server (Ver. 3.6) followed by the number of clients connected to the server, of which there should be none at this point. FreeWnn: If you wish to invoke FreeWnn at this point, it is best that you use a simple init script. However, if you do not need to start it right away, then it would probably be better to wait until you need it. Because the script handles most of the crucial Wnn operations, it is not necessary to modify the default path. FreeWnn, by default, has its programs in the /usr/bin/Wnn4 directory. If this is not the case on your system, then modify your script accordingly. To make the script, create a file in your default PATH and give it a descriptive name such as FreeWnn. Set permissions accordingly for an init script on your system. Keep in mind that the Wnn server does not need to be started as root. Once the file is created, then copy the following twenty lines into it: #!/bin/sh case "$1" in start) if [ -x /usr/bin/Wnn4/jserver ]; then /usr/bin/Wnn4/wnnstat localhost > /dev/null 2>&1 if [ $? = 255 ]; then rm -f /tmp/jd_sockV4 /usr/bin/Wnn4/jserver > /dev/null fi fi ;; stop) /usr/bin/Wnn4/wnnkill localhost ;; *) echo "Usage: $0 {start|stop}" >&2 exit 1 ;; esac exit 0 Also note that if this script is run more than once by a non-privileged user to start the Wnn server, it may not work if /tmp/jd_sockV4 can not be deleted. This is why it is still usually a good idea to run this as root and to retain executable privileges for the superuser. When Wnn is running, /usr/bin/Wnn4/wnnstat can be used to retrieve its information. Input Method Environment: uumkey: While the main configuration file for uum is located in a /usr/lib/wnn subdirectory, it can be regarded as independent of wnn for configuring the input method itself. As uum is essentially a shell environment, no configuration is necessary just to get it to run. However, the default settings can be annoying if you are used to other UNIX input methods. The file to modify for changing Japanese input settings is ja_JP/uumkey ignoring other files with similar filenames. The following parameters represent escape sequences that can be changed to resemble, say, kinput2, as I have done on my setup. henkan nobi_henkan_dai select_jikouho_dai kakutei hindo_set There are other configuration options specifically regarding conversion that are good to get familiar with, given that such options may be more intuitive within other conversion environments, although some shortcuts therein can be quite useful. Xresources: Append the following six lines to your Xresources file if you intend to take advantage of Japanese Eterm sessions: ##### Eterm ##### Eterm*InputMethod: kinput2 Eterm*allowSendEvents: true Eterm*VT100*translations: Shiftspace: begin-conversion(_JAPANESE_CONVERSION) Eterm*kanjiMode: euc Your global Xresources file could be in any of the following locations: /etc/X11/xdm/Xresources /etc/X11/xinit/Xresources /usr/X11R6/lib/X11/xdm/Xresources /usr/X11R6/lib/X11/xinit/Xresources or if you would rather add it on a per-user basis, use ~/.Xresources This may or may not be necessary depending on your X11 settings. However, it is not likely to cause problems if it is not necessary. Xsession/xinit: For the sake of convenience, it helps to invoke your input server upon starting your X server. While this is not essential, bear in mind that kinput2 will need to be started at some point during your X11 session, and it is highly advisable that it run in the background. Depending on the configuration of your X server, startup commands will be invoked from a file either named Xsession or xinitrc. The global settings for xinit are located either in /usr/X11R6/lib/X11/xinit or /etc/X11/xinit by default. The global Xsession files are also likely to be found in one of the X11 configuration directories. These settings may also be specified locally in ~/.Xsession or ~/.xinitrc for a given user. Add the following line to any of these files: kinput2 & When kinput2 is running, an entry should appear in the server statistics report from cannastat or wnnstat. Environment Variables: When you have X running, as well as all your other servers, you must now establish an interface with the input method. That must usually be defined as an environment variable upon the invocation of the input method client program. For example, if you are running X and open an Eterm from an xterm, that Eterm will not be able to handle Japanese input unless the environment variable for the input method was set properly in the xterm. To accomplish this, run setenv XMODIFIERS "@im=kinput2" or export XMODIFIERS="@im=kinput2" according to your shell's specifications, and then start an Eterm. Locale: If your libc locale is not set to a Japanese setting at the time an application is invoked, it can possibly be set locally afterward. However, in the case of Eterm, for example, this also needs to be set in the parent application beforehand, as it determines the resulting execution in the same manner as the XMODIFIERS variable concerning the input method. Different systems use different ways of specifying locales. The first thing you should do is run setenv LANG ja_JP.EUC or export LANG=ja_JP.EUC according to your shell's specifications. If for some reason, you have an application that utilizes JIS as its primary encoding, then simply substitute JIS for EUC in the above environment variables. Most UNIX applications these days, however, use EUC. If your system uses other language and locale settings as environment variables, then it is a good idea to set those too. If you run echo $LANGUAGE ; echo $LC_ALL and see two undefined variables, then you are okay and should not worry about them. If LANGUAGE is defined and is not Japanese, then run setenv LANGUAGE jp or export LANGUAGE=jp If you LC_ALL is defined and is not Japanese, then you should run setenv LANGUAGE ja_JP or export LC_ALL=ja_JP After all environment variables are properly set, then start Eterm with Eterm & This Eterm should have everything you need for proper Japanese input and output. The environment variables will be the same as in the parent application, so you can start more Eterms from this one and not have to readjust environment variables every time. Nevertheless, there is one last trick that can remove all the worry about environment variables from the simple invocation of a Japanese-enabled terminal emulator. Additional Options: If you use a window manager, as most X11 users do, you may be able to simplify the process of invoking an Eterm with all the correct settings. If you happen to use TWM like I do, then the process is straightforward, because it is just another menu entry in the system.twmrc file. Simply add the following line in the appropriate part of your TWM main menu: "Eterm" f.exec "XMODIFIERS='@im=kinput2' LANG=ja_JP Eterm &" Add more environment variable settings as necessary. The system.twmrc file is located in either /etc/X11/twm or /usr/X11R6/lib/X11/twm by default. Once this is set and TWM is restarted, you'll see an extra "Eterm" entry in the main menu, and if you select it, it will start an Eterm with all the environment variables already set. Of course, Eterm can also be invoked with arguments just as if it were run from another shell, but those are spared here for concision. I have also gotten a similar setup in WindowMaker/GNUstep by adding a line in the appropriate place in the ~/GNUstep/Defaults/WMRootMenu file such as the following: ("ETerm [JP]", SHEXEC, "XMODIFIERS='@im=kinput2' LANG=ja_JP.EUC Eterm"), This also allows for additional arguments to Eterm. Other window managers may be more complicated, but there are probably a number of analogous options therein to provide for quick execution of your favorite localized terminal emulator. -------------------------------------------------------------------------- NOTES: -------------------------------------------------------------------------- * New in This Version -------------------------------------------- Followers of the original hint may have noticed some things that did not quite match up with the current situation soon after that hint was released. Unfortunately, my testing occured right before a new working release of jfbterm, after which I became far too busy to update this hint accordingly. All should be well now regarding console support with jfbterm+uum. Additionally, I did get some feedback via e-mail with suggestions to improving utilization options. While these suggestions were well taken, the cumulative effect of all of them was that my suggestions were too specific, and that the real implementation of the Japanese environment should be left to the users, since they should know what is best for their preferences. As a side note, the settings I suggested did not even relate to what I personally use, but were what I thought would be easy for an absolute beginner. From now on, I will defer to the individual user's preferences and omit explicit suggestions for how to use the installed software mentioned in this hint. I may also see if it would be good to have an addendum to this hint specifically regarding Japanese keyboards with relation to Japanese LFS setups. I have literally hundreds of Japanese keyboards at my disposal, although lately no desire to mess around with them. If, however, at any point I should try to find relevant applications in a JLFS environment, it may merit some documentation here. * Future Proposals -------------------------------------------- Not for this hint specifically, but for other projects as well, I am stating here what I would like to do or have done in the future. First with regard to documentation, I believe that this hint itself fulfills its purpose in simply setting up a localized environment. There are many other things that pertain to Japanese computing, however, and those could be addressed in separate hints. Another particular thing I was interested in was updating uum. This would be a larger-scale project, but there are a few things that would really be nice if they were supported by uum, namely Unix98 and UTF-8. Maybe I can devote some time to hacking uum in the near future. * Dated Material -------------------------------------------- Given my the complexity of this project, the comprehensiveness of this document, and my busy schedule, it is worth noting that a number of the libraries and programs that I compiled when I first tested my methods for this hint (18 January 2005) may be outdated. I may update this hint if there are any significant changes in the sources for this Japanese environment. However, if you notice a new version of some software mentioned above, go ahead and try to substitute it in for the older versions that I may have used. Unless a major version change has occurred, there should not be any problems with a minor upgrade. If, however, any problems should arise out of any newer versions, then I would recommend reverting back to the versions I last compiled against, unless, of course, any security vulnerabilities were to have been found in said software after I release this hint. Install software at your own risk, and please use common sense. The author of this hint is not liable for any problems you may incur! * Acknowledgements -------------------------------------------- Special thanks go to Kazunari Manabe for sparing extra hardware for testing, as well as assisting in translations of this text. * Document Formatting -------------------------------------------- This document exists in plain text, pLaTeX source, DVI, and maybe other document formats as markup or portable compiled documents. While the organizational layout may change with formatting, the content will remain the same. The reason for this is not only to make this document easy to maintain, but also so that it may be read on systems with or without Japanese language support. This issue of compatibility will be even more important if this document is ever translated to Japanese, as that translation will be useless if it cannot be read due to incompatibility of formats. See the notes below for the status of this document's translation. Alternately formatted documents should be temporarily hosted at the following locations: http://vorlon.cwru.edu/~djb29/ Should any new servers be added, or the above removed, this hint will be updated and changes noted in the changelog at the very end of this document. -------------------------------------------------------------------------- CHANGELOG: -------------------------------------------------------------------------- [2003-11-11] * Initial hint. [2005-04-14] * Upgraded SBU's from Static Bash Units to Static Binutils Units. * Added new proposals for future related projects. * Added (overdue) sections for jfbterm and uum. * Changed mirror sites for other formats. * Removed unnecessary utilization hints. * Removed alternative suggestions. * Removed request for assistance. * Updated notes on dated material. * Reduced prerequisites. * Improved introduction.