It is possible to create usable web-based multimedia dictionaries using pure HTML. If the dictionary is small enough, this can be done entirely by hand. If it is larger, it becomes tedious to generate by hand, but the programming necessary to generate it from a database is quite simple.
I demonstrate these claims by providing the code for a simple set of programs that generate a web-based lexicon from a database, together with a sample of the result. The programs take as input a dictionary database in a simple format and generate from it a web-based dictionary. They are not intended to compete with more sophisticated approaches, such as Kirrkirr or HyperLex, but rather to demonstrate how one can generate a usable web-based dictionary using only the most trivial computer programming. They do not provide sophisticated search tools, and they make no attempt to handle exotic writing systems. However, this is more than a theoretical demonstration. For languages written in ASCII characters, with lexica of only a few thousand words, where fancy searches are not required, the dictionaries they generate are perfectly usable.
It is generally easiest to download all of the files at once as a compressed tar archive, pmwd.tgz. If you have GNU tar, you can decompress and unpack it with a single command:
tar xzf pmwd.tgz
If you have a version of tar that does not know how to decompress such archives, you will have to decompress first using gunzip. Then use tar without the z flag to unpack. On Windows systems I am told that WinZip can unpack compressed tar archives.
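For a tar that cannot decompress, the two steps can also be combined in a pipeline. The following is only a minimal sketch; the function name unpack_tgz is made up for illustration and is not part of the distribution.

```shell
# unpack_tgz is a hypothetical helper name, not part of the distribution.
# It unpacks a gzip-compressed tar archive without relying on tar's z flag.
unpack_tgz() {
    # gunzip -c writes the decompressed archive to standard output;
    # "tar xf -" reads the archive from standard input.
    gunzip -c "$1" | tar xf -
}

# Example: unpack_tgz pmwd.tgz
```

The files are unpacked into the current directory, just as with the single tar command.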
If for some reason you cannot deal with the compressed tar archive, you can also download the files individually. See the descriptions of the files below. The entry for each file contains a link allowing you to download it.
The files provided include a sample dictionary. To look at it, open the file intro.htm in your browser. After reading the introductory text, click on the words Enter Dictionary. The browser window will be divided into two parts known as frames. The upper frame, which will occupy most of the browser window, will contain the index to the dictionary, that is, a list of the words in alphabetical order. If the list is too long to fit into the frame at once, you can use the scrollbar to show other parts of it. Each of these words is a link. Clicking on a word will cause the definition to be displayed in the smaller frame at the bottom of the screen. Try clicking on tsachun. Notice that the definition is followed by the words "show picture". Click on them to see the picture. Now try clicking on hoonliz. Notice that the definition is followed by the words "play sound". Click on them to hear the word.
The lexical database is assumed to be in the format used by the Summer Institute of Linguistics Shoebox program, since this is very widely used. Records are separated by blank lines. Each field begins with a backslash followed by the tag that identifies the field. The tag is followed by one or more spaces and then the contents of the field. The headword should be in a field with the tag head. The definition should be in a field with the tag def. Four optional fields are also used. The tag cat specifies the category of the word. The tag sci gives the scientific name of biological organisms. The tag snd gives the name of a sound file containing the headword. The tag pic gives the name of an image file illustrating the headword. Your records may contain additional fields. They will simply be ignored. Here is a sample of what such a file might look like:
\head duchun
\def tree, stick, wood in general

\head tsachun
\cat N
\def cache for storing food in the form of a little cabin on posts
\pic tsachun.jpg

\head hoonliz
\cat N
\def skunk
\sci Mephitis mephitis
\snd hoonliz.wav
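To show how little is needed to read this format, here is a minimal sketch in awk, wrapped in a shell function. It is not one of the distributed programs; the name list_entries is made up for illustration. It relies on awk's paragraph mode, in which an empty RS makes blank lines the record separator, and it prints the headword and definition of each record, separated by a tab.

```shell
# list_entries is a hypothetical helper, not one of the distributed programs.
# It prints "headword<TAB>definition" for each record of the file named by $1.
list_entries() {
    awk '
    BEGIN { RS = ""; FS = "\n" }   # blank lines separate records; each line is a field
    {
        head = ""; def = ""
        for (i = 1; i <= NF; i++) {
            line = $i
            # A field line is a backslash, a tag, spaces, then the contents.
            if (match(line, /^\\[^ \t]+/)) {
                tag = substr(line, 2, RLENGTH - 1)
                val = substr(line, RLENGTH + 1)
                sub(/^[ \t]+/, "", val)
                if (tag == "head") head = val
                else if (tag == "def") def = val
                # every other tag is simply ignored here
            }
        }
        if (head != "") printf "%s\t%s\n", head, def
    }' "$1"
}

# Example: list_entries lexicon.ldb
```

Fields such as cat, sci, snd and pic could be picked up in the same way by adding further branches on the tag.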
The software is most easily run on a GNU/Linux system. If you have such a system with msort installed, all you need to do is make a copy of your database file named lexicon.ldb, edit the file language so that it contains the name of your language, and type make. The make program will then follow the instructions in the file makefile and generate the HTML files.
The HTML files generated are:
To use these files, just open dtop.htm in your browser.
If you do not have msort but can get your lexicon into the desired order in some other way, after renaming your lexicon database lexicon.ldb, make a copy of it called lexicon.srt. Then give the command:
touch lexicon.srt
This will make it look as though lexicon.srt was created more recently than lexicon.ldb, so the make program will just use lexicon.srt instead of trying to generate it from lexicon.ldb.
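If you want to confirm the timestamps before running make, the steps can be sketched as follows. The function name mark_presorted is made up for illustration; only the file names come from the instructions above.

```shell
# mark_presorted is a hypothetical helper name, not part of the distribution.
# It assumes lexicon.ldb is already in the desired order.
mark_presorted() {
    cp lexicon.ldb lexicon.srt
    touch lexicon.srt
    # find prints the name only if lexicon.srt is strictly newer than
    # lexicon.ldb. If both files were written within the same clock tick,
    # nothing is printed even though the touch succeeded.
    find lexicon.srt -newer lexicon.ldb
}
```

If the command prints lexicon.srt, the sorted file is newer than the database.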
If you do not have access to make, you can just give the necessary commands by hand, assuming that you have awk:
Depending on the kind of system you are using, you may have to go about executing awk differently. In the above, a filename following a less than sign is input to awk; a filename following a greater than sign is output from awk. Also recall that on some systems the newer version of awk is called nawk or gawk.
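As an illustration of the general shape of such an invocation, here is a small sketch. It does not reproduce the commands for this package: program.awk, input.txt and output.htm are placeholder names, and find_awk and count_records are helper names made up here.

```shell
# find_awk is a hypothetical helper: it prints the name of the first awk
# variant found, preferring the newer ones.
find_awk() {
    for candidate in gawk nawk awk; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

AWK=$(find_awk) || { echo "no awk found" >&2; exit 1; }

# General shape of an invocation (all three file names are placeholders):
#   "$AWK" -f program.awk < input.txt > output.htm

# A tiny inline program used the same way: input arrives on standard input
# and the result goes to standard output. It counts blank-line-separated records.
count_records() {
    "$AWK" 'BEGIN { RS = "" } END { print NR }'
}

# Example: count_records < lexicon.ldb > count.txt
```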
The bulk of the work is done by small programs written in AWK. More information about AWK may be found here.
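To give a flavour of how small such a program can be, here is a sketch that turns a lexicon into an HTML list of linked headwords. It is not one of the distributed programs. The function name make_index, the frame name entry, the page name defs.htm and the anchor names e1, e2 and so on are all assumptions made for illustration.

```shell
# make_index is a hypothetical helper, not one of the distributed programs.
# It reads a Shoebox-style lexicon on standard input and writes an HTML list
# of headwords. The frame name "entry", the page defs.htm and the anchors
# e1, e2, ... are assumptions made for this sketch.
make_index() {
    awk '
    function esc(s) {              # escape the characters special to HTML
        gsub(/&/, "\\&amp;", s)
        gsub(/</, "\\&lt;", s)
        gsub(/>/, "\\&gt;", s)
        return s
    }
    BEGIN { RS = ""; FS = "\n"; print "<html><body>" }
    {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^\\head[ \t]/) {
                w = $i
                sub(/^\\head[ \t]+/, "", w)
                n++
                printf "<a href=\"defs.htm#e%d\" target=\"entry\">%s</a><br>\n", n, esc(w)
            }
    }
    END { print "</body></html>" }'
}

# Example: make_index < lexicon.srt > index.htm
```

The target attribute is what makes a click in one frame display its result in another.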
The other piece of software that you need is a sorting program that is capable of sorting the lexicon database file. Many sorting programs cannot do this because they assume that a record consists of a single line, cannot identify records by tag rather than position, and do not have sufficiently sophisticated facilities for specifying the desired ordering.
The program used here, msort, is my own sophisticated sorting program. The program and the manual can be downloaded from my web page. However, msort is only available for UNIX systems. If you are on a non-UNIX system, you will have to find some other way to get your lexicon file into the order desired. One way is to use Shoebox. Even if you don't use it to create and maintain your database, if you use a compatible format you can use its sorting function.
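If neither msort nor Shoebox is available, a crude ordering can be obtained with standard tools by flattening each record onto one line, sorting, and unflattening. The following is a minimal sketch of that technique, not a substitute for msort: the name sort_lexicon is made up for illustration, the ordering is plain byte order on the headword, and the control character \001 is assumed not to occur in the data.

```shell
# sort_lexicon is a hypothetical helper, not part of the distribution.
# It orders the records of the file named by $1 by the raw bytes of the headword.
sort_lexicon() {
    awk '
    BEGIN { RS = ""; FS = "\n" }   # blank lines separate records
    {
        key = ""
        for (i = 1; i <= NF; i++)
            if ($i ~ /^\\head[ \t]/) { key = $i; sub(/^\\head[ \t]+/, "", key) }
        # One output line per record: the sort key, then every field,
        # joined by \001 (assumed absent from the data).
        line = key
        for (i = 1; i <= NF; i++) line = line "\001" $i
        print line
    }' "$1" |
    LC_ALL=C sort |
    awk '
    BEGIN { FS = "\001" }
    {
        if (NR > 1) print ""                  # blank line between records
        for (i = 2; i <= NF; i++) print $i    # field 1 is the sort key; drop it
    }'
}

# Example: sort_lexicon lexicon.ldb > lexicon.srt
```

Byte order is adequate only when the desired alphabetical order coincides with ASCII order. Anything more demanding, such as multi-character letters or a custom alphabet, is what a program like msort is for.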