Japanese CGI Script Guide

This guide assumes you already know how to get a web server up and running, and don't need basic terminology and steps explained to you. It also assumes you're aware of the many risks and dangers that running ancient, unsafe CGI scripts poses to you, your server, and the users of your site.

Contents

  1. Server environment
  2. Common issues
    1. Enable CGI module
    2. Perl path
    3. End of Line Sequence
    4. File/directory ownership and permissions
    5. Enable CGI execution
    6. jcode.pl
  3. Converting scripts from Shift-JIS to UTF-8
  4. Improving mobile compatibility

Server environment

As of 2023/12/28, here's what I'm using to successfully run Japanese CGI scripts:

Common issues

Here are some of the most common stumbling blocks you'll run into when trying to run old Japanese CGI scripts, and how to get past them.

Enable CGI module

On Apache servers, you need to enable the Apache CGI module before you can use CGI scripts. To do this, run something like:

sudo a2enmod cgid

...followed by:

sudo systemctl restart apache2

Perl path

At the top of a typical Perl script file, there will be a "shebang" line that tells the operating system where to find the Perl interpreter, such as #!/usr/bin/perl or #!/usr/local/bin/perl. This must be set correctly in every one of your script files, or the script will not execute.

The most likely one you'll want to use is #!/usr/bin/perl, but check your server to make sure.
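
If you're not sure which path applies, a quick check along these lines can help. This is only a sketch: list_shebangs is a hypothetical helper name, and it assumes your script files end in .cgi or .pl.

```shell
# Print the Perl interpreter found on PATH (often /usr/bin/perl).
command -v perl || echo "perl not found on PATH"

# list_shebangs DIR: print "file: first line" for every .cgi/.pl file under DIR,
# so any file pointing at the wrong interpreter path is easy to spot.
list_shebangs() {
    find "$1" -type f \( -name '*.cgi' -o -name '*.pl' \) -exec sh -c '
        for f in "$@"; do
            printf "%s: %s\n" "$f" "$(head -n 1 "$f")"
        done' sh {} +
}
```

Run it as list_shebangs /var/www/html/myscriptdirectory and compare each first line against the path printed by command -v perl.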

End of Line Sequence

On a typical Linux server, all CGI script files must use LF (line feed) as the End of Line Sequence, not the Windows-style CRLF (carriage return + line feed). If your script files use CRLF, they will not work, because the carriage return at the end of the first line gets treated as part of the Perl path.

To quickly convert your files, you can use the dos2unix tool. Example (change "/path" to the directory containing your script files):

find /path -type f -print0 | xargs -0 dos2unix
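
Before converting, it can be useful to see which files actually contain carriage returns. A minimal sketch, where list_crlf is a hypothetical helper name:

```shell
# list_crlf DIR: print every regular file under DIR that contains
# at least one carriage return character.
list_crlf() {
    cr=$(printf '\r')
    # grep -l prints only the names of files with at least one match.
    find "$1" -type f -exec grep -l "$cr" {} +
}
```

Note that this will also list binary files (images and so on) that happen to contain a carriage return byte, so treat the output as a list of candidates rather than a list of files to convert blindly.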

File/directory ownership and permissions

File/directory ownership and permissions must be set correctly in order for scripts to be able to run and interact with the files they need to interact with. Most scripts will come with example permissions for the script's directory structure; however, they may not be suitable for your particular environment.

Most files/directories should be owned by you (your non-root user account on the server) and be assigned to the www-data group. You can do this by using something along the lines of sudo chown myname:www-data myscript.cgi.

Typically, you want the main script files to be readable/non-writable/executable by www-data, log/data files to be readable/writable/non-executable by www-data, and static files that never need to be modified by the script to be readable/non-writable/non-executable by www-data.
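
As a rough illustration, those three cases could map to modes like the ones below. This is a sketch under assumptions: set_example_modes and the file names myscript.cgi, data.log and style.css are hypothetical, and it assumes the files are already owned by you with www-data as the group, so the middle digit is what applies to www-data.

```shell
# set_example_modes DIR: apply example modes to three hypothetical files in DIR.
set_example_modes() {
    chmod 750 "$1/myscript.cgi"   # script: group can read and execute, not write
    chmod 660 "$1/data.log"       # log/data: group can read and write, not execute
    chmod 640 "$1/style.css"      # static: group can read only
}
```

The last digit is 0 in each case, so accounts outside the owner and the www-data group get no access at all. Your script's documentation may call for something different.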

In some rare instances, old CGI scripts will not work correctly until they've been given extremely permissive directory permissions. This isn't safe, but it's up to you whether you want to grant those permissions or not.

Enable CGI execution

Each directory containing your executable CGI script files must have CGI execution enabled in your server config file. Example (inside /etc/apache2/apache2.conf):

    <Directory /var/www/html/myscriptdirectory>
        Options ExecCGI
        AddHandler cgi-script .cgi
        AllowOverride All
        Require all granted
        DirectoryIndex script.cgi
        SetEnv PERL5LIB /var/www/html/myscriptdirectory
    </Directory>

Then restart your server process using something like sudo systemctl restart apache2.
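
To confirm that CGI execution is really enabled, it can help to drop in a throwaway test script before debugging a real one. A minimal sketch, assuming Perl lives at /usr/bin/perl; write_test_cgi and test.cgi are hypothetical names:

```shell
# write_test_cgi DIR: create a tiny Perl CGI script at DIR/test.cgi.
write_test_cgi() {
    cat > "$1/test.cgi" <<'EOF'
#!/usr/bin/perl
# Minimal CGI response: a header, a blank line, then the body.
print "Content-Type: text/plain; charset=UTF-8\n\n";
print "CGI is working\n";
EOF
    # 755 is deliberately loose so the test works regardless of group ownership.
    chmod 755 "$1/test.cgi"
}
```

After running write_test_cgi /var/www/html/myscriptdirectory, request test.cgi in a browser. Seeing "CGI is working" suggests execution is enabled; seeing the Perl source or a 403 error suggests the Directory block isn't being applied. Delete the test file once you're done.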

It can be dangerous to broadly enable CGI execution in locations containing files that are writable by www-data, so it can be a good idea to separate those files from the script files. Most scripts have a config file or section where you can change the paths of files to be read/written to, which is useful for this purpose.

jcode.pl

Most old Japanese scripts are encoded in Shift-JIS or EUC-JP, and rely on the jcode.pl library to convert back and forth between Japanese character encodings (JIS, Shift-JIS and EUC-JP).

The best thing to do is convert the script to UTF-8 and remove reliance on jcode.pl entirely (especially if you're using Cloudflare), but for initial testing I recommend sticking with the default character encoding and jcode.pl, just to avoid potentially introducing more issues.

However, the versions of jcode.pl provided with these older scripts no longer work with Perl 5.14 and beyond, and give an error:

defined(%hash) is deprecated at ../perllib/jcode.pl line 684.
      (Maybe you should just omit the defined()?)
defined(%hash) is deprecated at ../perllib/jcode.pl line 693.
      (Maybe you should just omit the defined()?)

To fix this issue, replace the following section in jcode.pl:

sub z2h_euc {
    local(*s, $n) = @_;
    &init_z2h_euc unless defined %z2h_euc;
    $s =~ s/($re_euc_c|$re_euc_kana)/
    $z2h_euc{$1} ? ($n++, $z2h_euc{$1}) : $1
    /geo;
    $n;
}
  
sub z2h_sjis {
    local(*s, $n) = @_;
    &init_z2h_sjis unless defined %z2h_sjis;
    $s =~ s/($re_sjis_c)/$z2h_sjis{$1} ? ($n++, $z2h_sjis{$1}) : $1/geo;
    $n;
}

...with this:

sub z2h_euc {
    local(*s, $n) = @_;
    &init_z2h_euc unless %z2h_euc;
    $s =~ s/($re_euc_c|$re_euc_kana)/
    $z2h_euc{$1} ? ($n++, $z2h_euc{$1}) : $1
    /geo;
    $n;
}

sub z2h_sjis {
    local(*s, $n) = @_;
    &init_z2h_sjis unless %z2h_sjis;
    $s =~ s/($re_sjis_c)/$z2h_sjis{$1} ? ($n++, $z2h_sjis{$1}) : $1/geo;
    $n;
}

Credit for this fix goes to: http://icepotato.cocolog-nifty.com/blog/2014/04/jcodepldefinedh.html
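
If you'd rather not edit the file by hand, the same two-line change can be applied with sed. This is a sketch only: patch_jcode is a hypothetical helper name, and it assumes your copy of jcode.pl contains exactly the two "unless defined %z2h_..." lines shown above.

```shell
# patch_jcode FILE: remove the deprecated defined() around the two hash checks.
# The original is kept as FILE.bak.
patch_jcode() {
    cp "$1" "$1.bak"
    sed -e 's/unless defined %z2h_euc;/unless %z2h_euc;/' \
        -e 's/unless defined %z2h_sjis;/unless %z2h_sjis;/' \
        "$1.bak" > "$1"
}
```

Afterwards, grep -n 'defined %' jcode.pl should print nothing. If it still prints lines, your copy differs from the one above and needs a manual look.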

Converting Japanese CGI scripts from Shift-JIS to UTF-8

Most pre-2010s Japanese scripts are encoded in Shift-JIS or other Japanese-specific character encodings such as EUC-JP. Shift-JIS has many quirks and limitations, and can cause problems with services like Cloudflare, so it's best to convert everything to UTF-8. Thankfully, the process is not too difficult.
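
For the re-encoding itself, something like the following iconv wrapper may be a reasonable starting point. It's a sketch under assumptions: sjis_to_utf8 is a hypothetical helper name, and it treats the input as CP932 (the Windows flavor of Shift-JIS), on the assumption that this keeps backslashes in Perl source intact where a strict Shift-JIS conversion may turn them into yen signs.

```shell
# sjis_to_utf8 FILE: re-encode FILE from CP932 to UTF-8 in place.
# The original is kept as FILE.sjis.bak.
sjis_to_utf8() {
    # Convert into a temporary file first, so a failed conversion
    # leaves the original untouched.
    iconv -f CP932 -t UTF-8 "$1" > "$1.utf8" || { rm -f "$1.utf8"; return 1; }
    cp "$1" "$1.sjis.bak"
    # Overwrite in place (rather than mv) so the file keeps its permissions.
    cat "$1.utf8" > "$1"
    rm -f "$1.utf8"
}
```

This only changes the bytes of the file. Any charset declarations the script prints, and any calls into jcode.pl, still have to be updated separately.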

Here are some steps you should take:

Improving mobile compatibility

Some old Japanese CGI scripts have simplified modes for old Gara-kei phones, but most are not optimized for mobile at all.

The following measures can be taken to improve mobile compatibility: