Diffstat (limited to 'usr/src/man/man7/byteorder.7')
-rw-r--r-- usr/src/man/man7/byteorder.7 270
1 files changed, 270 insertions, 0 deletions
diff --git a/usr/src/man/man7/byteorder.7 b/usr/src/man/man7/byteorder.7
new file mode 100644
index 0000000000..33ab61636c
--- /dev/null
+++ b/usr/src/man/man7/byteorder.7
@@ -0,0 +1,270 @@
+.\"
+.\" This file and its contents are supplied under the terms of the
+.\" Common Development and Distribution License ("CDDL"), version 1.0.
+.\" You may only use this file in accordance with the terms of version
+.\" 1.0 of the CDDL.
+.\"
+.\" A full copy of the text of the CDDL should have accompanied this
+.\" source. A copy of the CDDL is also available via the Internet at
+.\" http://www.illumos.org/license/CDDL.
+.\"
+.\"
+.\" Copyright 2016 Joyent, Inc.
+.\"
+.Dd August 2, 2018
+.Dt BYTEORDER 7
+.Os
+.Sh NAME
+.Nm byteorder ,
+.Nm endian
+.Nd byte order and endianness
+.Sh DESCRIPTION
+Integer values that occupy more than one byte in memory can be laid out
+in different ways on different platforms.
+In particular, there is a major split between those which place the least
+significant byte of an integer at the lowest address, and those which place the
+most significant byte there instead.
+As this difference relates to which end of the integer is found in memory first,
+the term
+.Em endian
+is used to refer to a particular byte order.
+.Pp
+A platform is referred to as using a
+.Em big-endian
+byte order when it places the most significant byte at the lowest
+address, and
+.Em little-endian
+when it places the least significant byte first.
+Some platforms may also switch between big- and little-endian mode and run code
+compiled for either.
+.Pp
+Historically, there have also been some systems that utilized
+.Em middle-endian
+byte orders for integers larger than 2 bytes.
+Such orderings are not in common use today.
+.Pp
+Endianness is also of particular importance when dealing with values
+that are being read into memory from an external source.
+For example, network protocols such as IP conventionally define the fields in a
+packet as being always stored in big-endian byte order.
+This means that a little-endian machine will have to perform transformations on
+these fields in order to process them.
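+.Pp
+As a minimal sketch of such a transformation, the following C helpers read and
+write a 32-bit big-endian field one byte at a time.
+The names read_be32 and write_be32 are hypothetical and are not part of any
+system header.

```c
#include <stdint.h>

/*
 * Read a 32-bit big-endian (network order) field from a raw buffer.
 * Shifting individual bytes gives the same result on any host,
 * regardless of the host's own byte order.
 */
uint32_t
read_be32(const unsigned char *buf)
{
    return (((uint32_t)buf[0] << 24) |
        ((uint32_t)buf[1] << 16) |
        ((uint32_t)buf[2] << 8) |
        (uint32_t)buf[3]);
}

/* Write a 32-bit value into a buffer in big-endian order. */
void
write_be32(unsigned char *buf, uint32_t value)
{
    buf[0] = (unsigned char)(value >> 24);
    buf[1] = (unsigned char)(value >> 16);
    buf[2] = (unsigned char)(value >> 8);
    buf[3] = (unsigned char)value;
}
```

+Because these helpers never reinterpret memory as an integer, they need no
+knowledge of the host's byte order.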
+.Ss Examples
+To illustrate endianness in memory, let us consider the decimal integer
+2864434397, which is 0xAABBCCDD in hexadecimal.
+This number fits in 32 bits of storage (4 bytes).
+.Pp
+On a big-endian system, this integer would be written into memory as
+the bytes 0xAA, 0xBB, 0xCC, 0xDD, in order from lowest memory address to
+highest.
+.Pp
+On a little-endian system, it would be written instead as the bytes
+0xDD, 0xCC, 0xBB, 0xAA, in that order.
+.Pp
+If both the big- and little-endian systems were asked to store this
+integer at address 0x100, we would see the following in each system's
+memory:
+.Bd -literal
+
+                 Big-Endian
+
+        +------+------+------+------+
+        | 0xAA | 0xBB | 0xCC | 0xDD |
+        +------+------+------+------+
+           ^^     ^^     ^^     ^^
+          0x100  0x101  0x102  0x103
+           vv     vv     vv     vv
+        +------+------+------+------+
+        | 0xDD | 0xCC | 0xBB | 0xAA |
+        +------+------+------+------+
+
+                Little-Endian
+.Ed
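+.Pp
+The layout shown above can be observed on a running system.
+The following C sketch copies the in-memory bytes of a 32-bit integer into an
+array, lowest address first.
+The function names host_bytes32 and host_is_little_endian are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Copy the in-memory representation of a 32-bit integer into out[],
 * lowest address first.
 */
void
host_bytes32(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, sizeof (value));
}

/*
 * Return 1 if the host stores the least significant byte at the
 * lowest address (little-endian), 0 otherwise.
 */
int
host_is_little_endian(void)
{
    unsigned char b[4];

    host_bytes32(0xAABBCCDDu, b);
    return (b[0] == 0xDD);
}
```

+On a little-endian host, host_bytes32 fills the array with 0xDD, 0xCC, 0xBB,
+0xAA; on a big-endian host it yields 0xAA, 0xBB, 0xCC, 0xDD.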
+.Pp
+It is particularly important to note that even though the byte order is
+different between these two machines, the bit ordering within each byte,
+by convention, is still the same.
+.Pp
+For example, take the decimal integer 4660 (0x1234), which occupies 16 bits (2
+bytes).
+.Pp
+On a big-endian system, this would be written into memory as 0x12, then
+0x34.
+.Pp
+On a little-endian system, it would be written as 0x34, then 0x12.
+Note that this is not at all the same as seeing 0x43 then 0x21 in memory --
+only the bytes are re-ordered, not any bits (or nybbles) within them.
+.Pp
+As before, storing this at address 0x100:
+.Bd -literal
+          Big-Endian
+
+        +------+------+
+        | 0x12 | 0x34 |
+        +------+------+
+           ^^     ^^
+          0x100  0x101
+           vv     vv
+        +------+------+
+        | 0x34 | 0x12 |
+        +------+------+
+
+         Little-Endian
+.Ed
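+.Pp
+The distinction between re-ordering bytes and re-ordering bits can be sketched
+in C.
+The helper swap16 below is a hypothetical name; it exchanges the two bytes of a
+16-bit value and leaves the bits within each byte untouched.

```c
#include <stdint.h>

/*
 * Exchange the two bytes of a 16-bit value.
 * The bits (and nybbles) inside each byte keep their positions.
 */
uint16_t
swap16(uint16_t v)
{
    return ((uint16_t)((v << 8) | (v >> 8)));
}
```

+Applied to 0x1234, this produces 0x3412, not 0x4321.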
+.Pp
+This example shows how an eight-byte number, 0xBADCAFFEDEADBEEF, is stored
+in both big- and little-endian:
+.Bd -literal
+                               Big-Endian
+
+        +------+------+------+------+------+------+------+------+
+        | 0xBA | 0xDC | 0xAF | 0xFE | 0xDE | 0xAD | 0xBE | 0xEF |
+        +------+------+------+------+------+------+------+------+
+           ^^     ^^     ^^     ^^     ^^     ^^     ^^     ^^
+          0x100  0x101  0x102  0x103  0x104  0x105  0x106  0x107
+           vv     vv     vv     vv     vv     vv     vv     vv
+        +------+------+------+------+------+------+------+------+
+        | 0xEF | 0xBE | 0xAD | 0xDE | 0xFE | 0xAF | 0xDC | 0xBA |
+        +------+------+------+------+------+------+------+------+
+
+                              Little-Endian
+
+.Ed
+.Pp
+The treatment of different endian values would not be complete without
+discussing
+.Em PDP-endian ,
+which is also known as
+.Em middle-endian .
+While the PDP-11 was a 16-bit little-endian system, it laid out 32-bit
+values in a different way from current little-endian systems.
+First, it would divide a 32-bit number into two 16-bit numbers.
+Each 16-bit number would be stored in little-endian; however, the two 16-bit
+words would be stored with the more significant 16-bit word appearing first in
+memory, followed by the less significant one.
+.Pp
+The following diagram illustrates PDP-endian and compares it against
+little-endian values.
+Here, we'll start with the value 0xAABBCCDD and show how the four bytes for it
+will be laid out, starting at 0x100.
+.Bd -literal
+                 PDP-Endian
+
+        +------+------+------+------+
+        | 0xBB | 0xAA | 0xDD | 0xCC |
+        +------+------+------+------+
+           ^^     ^^     ^^     ^^
+          0x100  0x101  0x102  0x103
+           vv     vv     vv     vv
+        +------+------+------+------+
+        | 0xDD | 0xCC | 0xBB | 0xAA |
+        +------+------+------+------+
+
+                Little-Endian
+
+.Ed
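+.Pp
+The three layouts discussed in this section can be written out explicitly.
+The following C sketch computes, for a 32-bit value, the byte sequence each
+ordering would place in memory starting at the lowest address.
+The function names are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Big-endian: most significant byte at the lowest address. */
void
layout_be32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Little-endian: least significant byte at the lowest address. */
void
layout_le32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)v;
    out[1] = (unsigned char)(v >> 8);
    out[2] = (unsigned char)(v >> 16);
    out[3] = (unsigned char)(v >> 24);
}

/*
 * PDP-endian: the more significant 16-bit word comes first, and each
 * 16-bit word is itself stored little-endian.
 */
void
layout_pdp32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v >> 16);  /* low byte of high word */
    out[1] = (unsigned char)(v >> 24);  /* high byte of high word */
    out[2] = (unsigned char)v;          /* low byte of low word */
    out[3] = (unsigned char)(v >> 8);   /* high byte of low word */
}
```

+For 0xAABBCCDD, layout_pdp32 yields 0xBB, 0xAA, 0xDD, 0xCC, matching the
+PDP-endian row of the diagram.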
+.Ss Network Byte Order
+The term 'network byte order' refers to big-endian ordering, and
+originates from the IEEE.
+Early disagreements over which byte ordering to use for network traffic prompted
+RFC 1700 to define that all IETF-specified network protocols use big-endian
+ordering unless noted explicitly otherwise.
+The Internet protocol family (IP, and thus TCP, UDP, etc.) in particular adheres
+to this convention.
+.Ss Determining the System's Byte Order
+The operating system supports both big-endian and little-endian CPUs.
+To make it easier for programs to determine the endianness of the platform they
+are being compiled for, functions and macro constants are provided in the system
+header files.
+.Pp
+The endianness of the system can be obtained by including the header
+.In sys/types.h
+and using the pre-processor macros
+.Sy _LITTLE_ENDIAN
+and
+.Sy _BIG_ENDIAN .
+See
+.Xr types.h 3HEAD
+for more information.
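+.Pp
+A compile-time check using these macros might look like the following C sketch.
+It assumes the convention described here, in which exactly one of the two
+macros is defined.
+Some other systems define both names as numeric constants, or neither, so the
+sketch falls back to a run-time probe in those cases.
+The names EXAMPLE_HOST_ORDER and example_host_order are hypothetical.

```c
#include <sys/types.h>

/*
 * Assumption: exactly one of _LITTLE_ENDIAN or _BIG_ENDIAN is defined.
 * If both or neither are defined, no compile-time answer is recorded
 * and the function below probes memory at run time instead.
 */
#if defined(_LITTLE_ENDIAN) && !defined(_BIG_ENDIAN)
#define EXAMPLE_HOST_ORDER "little-endian (compile time)"
#elif defined(_BIG_ENDIAN) && !defined(_LITTLE_ENDIAN)
#define EXAMPLE_HOST_ORDER "big-endian (compile time)"
#endif

/* Return a short description of the host's byte order. */
const char *
example_host_order(void)
{
#ifdef EXAMPLE_HOST_ORDER
    return (EXAMPLE_HOST_ORDER);
#else
    /* Run-time probe: inspect the byte stored at the lowest address. */
    const unsigned int one = 1;

    return (*(const unsigned char *)&one == 1 ?
        "little-endian (run time)" : "big-endian (run time)");
#endif
}
```

+The preprocessor form lets code for the other byte order be omitted from the
+compiled program entirely.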
+.Pp
+Additionally, the header
+.In endian.h
+defines an alternative means for determining the endianness of the
+current system.
+See
+.Xr endian.h 3HEAD
+for more information.
+.Pp
+illumos runs on both big- and little-endian systems.
+When writing software for which the endianness is important, one must always
+check the byte order and convert data appropriately.
+.Ss Converting Between Byte Orders
+The system provides two different sets of functions to convert values
+between big-endian and little-endian.
+They are defined in
+.Xr byteorder 3C
+and
+.Xr endian 3C .
+.Pp
+The
+.Xr byteorder 3C
+family of functions converts data between the host's native byte order
+and network byte order, which is big-endian.
+The functions operate on 16-bit, 32-bit, or 64-bit values.
+Functions that convert from network byte order to the host's byte order
+start with the string
+.Sy ntoh ,
+while functions that convert from the host's byte order to network byte
+order begin with
+.Sy hton .
+For example, to convert a 32-bit value, a long, from network byte order
+to the host's, one would use the function
+.Xr ntohl 3C .
+.Pp
+These functions have been standardized by POSIX.
+However, the 64-bit variants,
+.Xr ntohll 3C
+and
+.Xr htonll 3C ,
+are not standardized and may not be found on other systems.
+For more information on these functions, see
+.Xr byteorder 3C .
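+.Pp
+As a brief sketch, the following C helpers use htonl and ntohl to store a
+32-bit value into a packet buffer in network byte order and to read it back.
+The helper names put_net32 and get_net32 are hypothetical; the header
+arpa/inet.h is where POSIX declares these conversion functions.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Store a host-order value into a buffer in network byte order. */
void
put_net32(unsigned char *buf, uint32_t host_value)
{
    uint32_t net = htonl(host_value);

    memcpy(buf, &net, sizeof (net));
}

/* Read a network-order value from a buffer and return it in host order. */
uint32_t
get_net32(const unsigned char *buf)
{
    uint32_t net;

    memcpy(&net, buf, sizeof (net));
    return (ntohl(net));
}
```

+After put_net32 is called with 0xAABBCCDD, the buffer holds 0xAA, 0xBB, 0xCC,
+0xDD on any host.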
+.Pp
+The second family of functions,
+.Xr endian 3C ,
+provides a means to convert between the host's byte order
+and either big-endian or little-endian specifically.
+While these functions are similar to those in
+.Xr byteorder 3C ,
+they name the byte order being converted to or from explicitly.
+Like them, these functions operate on 16-bit, 32-bit, or 64-bit values.
+When converting from big-endian to the host's endianness, the functions
+begin with
+.Sy betoh .
+When converting from the host's native endianness to big-endian, the
+functions begin with
+.Sy htobe .
+When working with little-endian data, the prefixes
+.Sy letoh
+and
+.Sy htole
+convert little-endian data to the host's endianness and from the host's
+to little-endian respectively.
+.Pp
+These functions are not standardized and the header they appear in varies
+between the BSDs and GNU/Linux.
+Applications that wish to be portable should instead use the
+.Xr byteorder 3C
+functions.
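+.Pp
+For comparison, a sketch using the explicit-endian functions follows.
+It assumes a system where endian.h declares htole32 and be32toh; on the BSDs
+the header is typically sys/endian.h, and the feature-test macro shown is an
+assumption that applies to glibc.
+The spelling be32toh is used rather than betoh32 because it is believed to be
+the more widely available of the two.
+The helper names put_le32 and get_be32 are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* assumption: exposes these names on glibc */
#endif
#include <endian.h>     /* assumption: <sys/endian.h> on the BSDs */
#include <stdint.h>
#include <string.h>

/* Store a host-order value into a buffer in little-endian order. */
void
put_le32(unsigned char *buf, uint32_t host_value)
{
    uint32_t le = htole32(host_value);

    memcpy(buf, &le, sizeof (le));
}

/* Read a big-endian value from a buffer and return it in host order. */
uint32_t
get_be32(const unsigned char *buf)
{
    uint32_t be;

    memcpy(&be, buf, sizeof (be));
    return (be32toh(be));
}
```

+After put_le32 is called with 0xAABBCCDD, the buffer holds 0xDD, 0xCC, 0xBB,
+0xAA on any host.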
+.Pp
+All of these functions in both families simply return their input when
+the host's native byte order is the same as the desired order.
+For example, when calling
+.Xr htonl 3C
+on a big-endian system, the original data is returned with no conversion
+or modification.
+.Sh SEE ALSO
+.Xr byteorder 3C ,
+.Xr endian 3C ,
+.Xr endian.h 3HEAD ,
+.Xr inet 3HEAD