diff options
Diffstat (limited to 'usr/src/man/man5/charmap.5')
-rw-r--r-- | usr/src/man/man5/charmap.5 | 359 |
1 files changed, 0 insertions, 359 deletions
diff --git a/usr/src/man/man5/charmap.5 b/usr/src/man/man5/charmap.5 deleted file mode 100644 index a9c90abb97..0000000000 --- a/usr/src/man/man5/charmap.5 +++ /dev/null @@ -1,359 +0,0 @@ -.\" -.\" Sun Microsystems, Inc. gratefully acknowledges The Open Group for -.\" permission to reproduce portions of its copyrighted documentation. -.\" Original documentation from The Open Group can be obtained online at -.\" http://www.opengroup.org/bookstore/. -.\" -.\" The Institute of Electrical and Electronics Engineers and The Open -.\" Group, have given us permission to reprint portions of their -.\" documentation. -.\" -.\" In the following statement, the phrase ``this text'' refers to portions -.\" of the system documentation. -.\" -.\" Portions of this text are reprinted and reproduced in electronic form -.\" in the SunOS Reference Manual, from IEEE Std 1003.1, 2004 Edition, -.\" Standard for Information Technology -- Portable Operating System -.\" Interface (POSIX), The Open Group Base Specifications Issue 6, -.\" Copyright (C) 2001-2004 by the Institute of Electrical and Electronics -.\" Engineers, Inc and The Open Group. In the event of any discrepancy -.\" between these versions and the original IEEE and The Open Group -.\" Standard, the original IEEE and The Open Group Standard is the referee -.\" document. The original Standard can be obtained online at -.\" http://www.opengroup.org/unix/online.html. -.\" -.\" This notice shall appear on any product containing this material. -.\" -.\" The contents of this file are subject to the terms of the -.\" Common Development and Distribution License (the "License"). -.\" You may not use this file except in compliance with the License. -.\" -.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE -.\" or http://www.opensolaris.org/os/licensing. -.\" See the License for the specific language governing permissions -.\" and limitations under the License. -.\" -.\" When distributing Covered Code, include this CDDL HEADER in each -.\" file and include the License file at usr/src/OPENSOLARIS.LICENSE. -.\" If applicable, add the following below this CDDL HEADER, with the -.\" fields enclosed by brackets "[]" replaced with your own identifying -.\" information: Portions Copyright [yyyy] [name of copyright owner] -.\" -.\" -.\" Copyright (c) 1992, X/Open Company Limited. All Rights Reserved. -.\" Portions Copyright (c) 2003, Sun Microsystems, Inc. All Rights Reserved -.\" -.TH CHARMAP 5 "Dec 1, 2003" -.SH NAME -charmap \- character set description file -.SH DESCRIPTION -.sp -.LP -A character set description file or \fIcharmap\fR defines characteristics for a -coded character set. Other information about the coded character set may also -be in the file. Coded character set character values are defined using symbolic -character names followed by character encoding values. -.sp -.LP -The character set description file provides: -.RS +4 -.TP -.ie t \(bu -.el o -The capability to describe character set attributes (such as collation order or -character classes) independent of character set encoding, and using only the -characters in the portable character set. This makes it possible to create -generic \fBlocaledef\fR(1) source files for all codesets that share the -portable character set. -.RE -.RS +4 -.TP -.ie t \(bu -.el o -Standardized symbolic names for all characters in the portable character set, -making it possible to refer to any such character regardless of encoding. -.RE -.SS "Symbolic Names" -.sp -.LP -Each symbolic name is included in the file and is mapped to a unique encoding -value (except for those symbolic names that are shown with identical glyphs). -If the control characters commonly associated with the symbolic names in the -following table are supported by the implementation, the symbolic names and -their corresponding encoding values are included in the file. Some of the -encodings associated with the symbolic names in this table may be the same as -characters in the portable character set table. -.sp - -.sp -.TS -box; -l l l l l l -l l l l l l . -<ACK> <DC2> <ENQ> <FS> <IS4> <SOH> -<BEL> <DC3> <EOT> <GS> <LF> <STX> -<BS> <DC4> <ESC> <HT> <NAK> <SUB> -<CAN> <DEL> <ETB> <IS1> <RS> <SYN> -<CR> <DLE> <ETX> <IS2> <SI> <US> -<DC1> <EM> <FF> <IS3> <SO> <VT> -.TE - -.SS "Declarations" -.sp -.LP -The following declarations can precede the character definitions. Each must -consist of the symbol shown in the following list, starting in column 1, -including the surrounding brackets, followed by one or more blank characters, -followed by the value to be assigned to the symbol. -.sp -.ne 2 -.na -\fB<\fIcode_set_name\fR>\fR -.ad -.RS 19n -The name of the coded character set for which the character set description -file is defined. -.RE - -.sp -.ne 2 -.na -\fB<\fImb_cur_max\fR>\fR -.ad -.RS 19n -The maximum number of bytes in a multi-byte character. This defaults to -\fB1\fR. -.RE - -.sp -.ne 2 -.na -\fB<\fImb_cur_min\fR>\fR -.ad -.RS 19n -An unsigned positive integer value that defines the minimum number of bytes in -a character for the encoded character set. -.RE - -.sp -.ne 2 -.na -\fB<\fIescape_char\fR>\fR -.ad -.RS 19n -The escape character used to indicate that the characters following will be -interpreted in a special way, as defined later in this section. This defaults -to backslash ('\fB\e\fR\&'), which is the character glyph used in all the -following text and examples, unless otherwise noted. -.RE - -.sp -.ne 2 -.na -\fB<\fIcomment_char\fR>\fR -.ad -.RS 19n -The character that when placed in column 1 of a charmap line, is used to -indicate that the line is to be ignored. The default character is the number -sign (#). -.RE - -.SS "Format" -.sp -.LP -The character set mapping definitions will be all the lines immediately -following an identifier line containing the string \fBCHARMAP\fR starting in -column 1, and preceding a trailer line containing the string \fBEND\fR -\fBCHARMAP\fR starting in column 1. Empty lines and lines containing a -\fI<comment_char>\fR in the first column will be ignored. Each non-comment line -of the character set mapping definition, that is, between the \fBCHARMAP\fR and -\fBEND CHARMAP\fR lines of the file), must be in either of two forms: -.sp -.in +2 -.nf -\fB"%s %s %s\en",\fR<\fIsymbolic-name\fR>,<\fIencoding\fR>,<\fIcomments\fR> -.fi -.in -2 - -.sp -.LP -or -.sp -.in +2 -.nf -\fB"%s...%s %s %s\en",\fR<\fIsymbolic-name\fR>,<\fIsymbolic-name\fR>, <\fIencoding\fR>,\e - <\fIcomments\fR> -.fi -.in -2 - -.sp -.LP -In the first format, the line in the character set mapping definition defines a -single symbolic name and a corresponding encoding. A character following an -escape character is interpreted as itself; for example, the sequence -"<\fB\e\e\e\fR>>" represents the symbolic name "\fI\e>\fR" enclosed between -angle brackets. -.sp -.LP -In the second format, the line in the character set mapping definition defines -a range of one or more symbolic names. In this form, the symbolic names must -consist of zero or more non-numeric characters, followed by an integer formed -by one or more decimal digits. The characters preceding the integer must be -identical in the two symbolic names, and the integer formed by the digits in -the second symbolic name must be equal to or greater than the integer formed by -the digits in the first name. This is interpreted as a series of symbolic names -formed from the common part and each of the integers between the first and the -second integer, inclusive. As an example, \fB<j0101>...<j0104>\fR is -interpreted as the symbolic names \fB<j0101>\fR, \fB<j0102>\fR, \fB<j0103>\fR, -and \fB<j0104>\fR, in that order. -.sp -.LP -A character set mapping definition line must exist for all symbolic names and -must define the coded character value that corresponds to the character glyph -indicated in the table, or the coded character value that corresponds with the -control character symbolic name. If the control characters commonly associated -with the symbolic names are supported by the implementation, the symbolic name -and the corresponding encoding value must be included in the file. Additional -unique symbolic names may be included. A coded character value can be -represented by more than one symbolic name. -.sp -.LP -The encoding part is expressed as one (for single-byte character values) or -more concatenated decimal, octal or hexadecimal constants in the following -formats: -.sp -.in +2 -.nf -\fB"%cd%d",\fR<\fIescape_char\fR>,<\fIdecimal byte value\fR> - -\fB"%cx%x",\fR<\fIescape_char\fR>,<\fIhexadecimal byte value\fR> - -\fB"%c%o",\fR<\fIescape_char\fR>,<\fIoctal byte value\fR> -.fi -.in -2 - -.SS "Decimal Constants" -.sp -.LP -Decimal constants must be represented by two or three decimal digits, preceded -by the escape character and the lower-case letter \fBd\fR; for example, -\fB\ed05\fR, \fB\ed97\fR, or \fB\ed143\fR\&. Hexadecimal constants must be -represented by two hexadecimal digits, preceded by the escape character and the -lower-case letter \fBx\fR; for example, \fB\ex05\fR, \fB\ex61\fR, or -\fB\ex8f\fR\&. Octal constants must be represented by two or three octal -digits, preceded by the escape character; for example, \fB\e05\fR, \fB\e141\fR, -or \fB\e217\fR\&. In a portable charmap file, each constant must represent an -8-bit byte. Implementations supporting other byte sizes may allow constants to -represent values larger than those that can be represented in 8-bit bytes, and -to allow additional digits in constants. When constants are concatenated for -multi-byte character values, they must be of the same type, and interpreted in -byte order from first to last with the least significant byte of the multi-byte -character specified by the last constant. -.SS "Ranges of Symbolic Names" -.sp -.LP -In lines defining ranges of symbolic names, the encoded value is the value for -the first symbolic name in the range (the symbolic name preceding the -ellipsis). Subsequent symbolic names defined by the range will have encoding -values in increasing order. Bytes are treated as unsigned octets and carry is -propagated between the bytes as necessary to represent the range. However, -because this causes a null byte in the second or subsequent bytes of a -character, such a declaration should not be specified. For example, the line -.sp -.in +2 -.nf -<j0101>...<j0104> \ed129\ed254 -.fi -.in -2 - -.sp -.LP -is interpreted as: -.sp -.in +2 -.nf -<j0101> \ed129\ed254 -<j0102> \ed129\ed255 -<j0103> \ed130\ed00 -<j0104> \ed130\ed01 -.fi -.in -2 - -.sp -.LP -The expanded declaration of the symbol \fB<j0103>\fR in the above example is an -invalid specification, because it contains a null byte in the second byte of a -character. -.sp -.LP -The comment is optional. -.SS "Width Specification" -.sp -.LP -The following declarations can follow the character set mapping definitions -(after the "\fBEND CHARMAP\fR" statement). Each consists of the keyword shown -in the following list, starting in column 1, followed by the value(s) to be -associated to the keyword, as defined below. -.sp -.ne 2 -.na -\fB\fBWIDTH\fR\fR -.ad -.RS 17n -A non-negative integer value defining the column width for the printable -character in the coded character set mapping definitions. Coded character set -character values are defined using symbolic character names followed by column -width values. Defining a character with more than one \fBWIDTH\fR produces -undefined results. The \fBEND WIDTH\fR keyword is used to terminate the -\fBWIDTH\fR definitions. Specifying the width of a non-printable character in a -\fBWIDTH\fR declaration produces undefined results. -.RE - -.sp -.ne 2 -.na -\fB\fBWIDTH_DEFAULT\fR\fR -.ad -.RS 17n -A non-negative integer value defining the default column width for any -printable character not listed by one of the \fBWIDTH\fR keywords. If no -\fBWIDTH_DEFAULT\fR keyword is included in the charmap, the default character -width is \fB1\fR. -.RE - -.sp -.LP -Example: -.sp -.LP -After the "\fBEND CHARMAP\fR" statement, a syntax for a width definition would -be: -.sp -.in +2 -.nf -WIDTH -<A> 1 -<B> 1 -<C>...<Z> 1 -\&... -<fool>...<foon> 2 -\&... -END WIDTH -.fi -.in -2 -.sp - -.sp -.LP -In this example, the numerical code point values represented by the symbols -\fB<A>\fR and \fB<B>\fR are assigned a width of \fB1\fR. The code point values -\fB< C>\fR to \fB<Z>\fR inclusive, that is, \fB<C>\fR, \fB<D>\fR, \fB<E>\fR, -and so on, are also assigned a width of \fB1\fR. Using \fB<A>...<Z>\fR would -have required fewer lines, but the alternative was shown to demonstrate -flexibility. The keyword \fBWIDTH_DEFAULT\fR could have been added as -appropriate. -.SH SEE ALSO -.sp -.LP -\fBlocale\fR(1), \fBlocaledef\fR(1), \fBnl_langinfo\fR(3C), -\fBextensions\fR(5), \fBlocale\fR(5) |