The Unicode Character Set

Because TADS 3 uses Unicode internally, TADS programs can work on practically any computer, in any national language, using any character set.

Unicode is a "universal" character set that encodes almost every glyph from almost every written language in the world. Unlike the language-specific character sets that preceded it, Unicode includes all of the world's glyphs in a single set of characters, giving each character a unique code point.
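As a small illustration of what "a unique code point" means, the Python sketch below prints the code points of a few characters from different scripts. The function name code_points is invented for this example and has nothing to do with TADS itself.

```python
def code_points(text):
    """Return the Unicode code point of each character in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

# Latin capital A, e with acute accent, euro sign, Cyrillic capital Zhe
print(code_points("A\u00e9\u20ac\u0416"))
```

Each character gets exactly one number, no matter which language or script it comes from.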

Character Mapping Files

Sadly, most computers today do not use Unicode as their native character set. For TADS to work on non-Unicode computers, it must translate characters between Unicode and the native encoding. To accomplish this without tying itself to any one type of computer, TADS uses character mapping files. A character mapping file describes the association between a computer's native character set and Unicode.
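The idea of a character mapping can be sketched in a few lines of Python. This is only an illustration of table-driven translation, not TADS's actual implementation or file format. The table holds three entries taken from Windows code page 1252, and the names NATIVE_TO_UNICODE, to_unicode and to_native are hypothetical.

```python
# Hypothetical partial table: native byte value -> Unicode code point.
# The three entries are taken from Windows code page 1252.
NATIVE_TO_UNICODE = {0x80: 0x20AC, 0xE9: 0x00E9, 0xFC: 0x00FC}

# The reverse direction is derived from the same table.
UNICODE_TO_NATIVE = {u: n for n, u in NATIVE_TO_UNICODE.items()}

def to_unicode(native):
    """Translate native bytes to a Unicode string."""
    out = []
    for b in native:
        if b < 0x80:
            out.append(chr(b))  # ASCII range maps to itself
        else:
            # Unmapped bytes become U+FFFD, the replacement character
            out.append(chr(NATIVE_TO_UNICODE.get(b, 0xFFFD)))
    return "".join(out)

def to_native(text, default=ord("?")):
    """Translate a Unicode string to native bytes."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(cp)
        else:
            # Characters the native set lacks fall back to a default byte
            out.append(UNICODE_TO_NATIVE.get(cp, default))
    return bytes(out)

print(to_unicode(b"caf\xe9 \x80"))
print(to_native("caf\u00e9 \u20ac"))
```

Because the translation is driven entirely by the table, supporting a different native character set means supplying a different table rather than changing the code.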

The Unicode Consortium, the organization that defines the Unicode standard, publishes files that describe the correspondence between Unicode and most native character sets in use today. TADS uses these files as a starting point.
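The published tables are plain text. In the common layout, each line gives the native code in hexadecimal, then the Unicode code point in hexadecimal, then a comment introduced by "#". The Python sketch below reads that layout into a dictionary. The name parse_mapping and the sample text are made up for illustration, and it assumes one native code per line, so entries in any other form are simply skipped.

```python
def parse_mapping(text):
    """Parse mapping-table text into {native code: Unicode code point}."""
    table = {}
    for line in text.splitlines():
        # Drop the trailing comment, then surrounding whitespace
        line = line.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or comment-only line
        fields = line.split()
        if len(fields) < 2:
            continue  # native code with no Unicode equivalent listed
        try:
            # int(..., 16) accepts the "0x" prefix
            table[int(fields[0], 16)] = int(fields[1], 16)
        except ValueError:
            continue  # not a simple pair of hex values; skipped in this sketch
    return table

# Made-up sample in the style of a published mapping table
SAMPLE = """\
# Sample mapping lines
0x41\t0x0041\t# LATIN CAPITAL LETTER A
0x80\t0x20AC\t# EURO SIGN
0x81\t      \t# UNDEFINED
"""

print(parse_mapping(SAMPLE))
```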

Creating a TADS Character Mapping File

To create a character mapping for use with TADS, follow these steps.

Download Locations for Common Mappings

ISO 8859 (ISO Latin-n) code pages: ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/
MS-DOS code pages: ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/
Microsoft Windows code pages: ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/
Apple Macintosh character sets: ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/
NeXT character sets: ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/NEXT/