Diffstat (limited to 'doc/asm.html')
| -rw-r--r-- | doc/asm.html | 157 | 
1 files changed, 124 insertions, 33 deletions
diff --git a/doc/asm.html b/doc/asm.html
index d44cb799d..771c493cc 100644
--- a/doc/asm.html
+++ b/doc/asm.html
@@ -117,6 +117,9 @@ All user-defined symbols other than jump labels are written as offsets to these
 <p>
 The <code>SB</code> pseudo-register can be thought of as the origin of memory, so the symbol <code>foo(SB)</code>
 is the name <code>foo</code> as an address in memory.
+This form is used to name global functions and data.
+Adding <code><></code> to the name, as in <code>foo<>(SB)</code>, makes the name
+visible only in the current source file, like a top-level <code>static</code> declaration in a C file.
 </p>
 
 <p>
@@ -128,8 +131,11 @@ Thus <code>0(FP)</code> is the first argument to the function,
 When referring to a function argument this way, it is conventional to place the name
 at the beginning, as in <code>first_arg+0(FP)</code> and <code>second_arg+8(FP)</code>.
 Some of the assemblers enforce this convention, rejecting plain <code>0(FP)</code> and <code>8(FP)</code>.
-For assembly functions with Go prototypes, <code>go vet</code> will check that the argument names
+For assembly functions with Go prototypes, <code>go</code> <code>vet</code> will check that the argument names
 and offsets match.
+On 32-bit systems, the low and high 32 bits of a 64-bit value are distinguished by adding
+a <code>_lo</code> or <code>_hi</code> suffix to the name, as in <code>arg_lo+0(FP)</code> or <code>arg_hi+4(FP)</code>.
+If a Go prototype does not name its result, the expected assembly name is <code>ret</code>.
 </p>
 
 <p>
@@ -149,7 +155,7 @@ hardware's <code>SP</code> register.
 <p>
 Instructions, registers, and assembler directives are always in UPPER CASE to remind you
 that assembly programming is a fraught endeavor.
-(Exceptions: the <code>m</code> and <code>g</code> register renamings on ARM.)
+(Exception: the <code>g</code> register renaming on ARM.)
 </p>
 
 <p>
@@ -206,6 +212,8 @@ The frame size <code>$24-8</code> states that the function has a 24-byte frame
 and is called with 8 bytes of argument, which live on the caller's frame.
 If <code>NOSPLIT</code> is not specified for the <code>TEXT</code>,
 the argument size must be provided.
+For assembly functions with Go prototypes, <code>go</code> <code>vet</code> will check that the
+argument size is correct.
 </p>
 
 <p>
@@ -216,19 +224,20 @@ simple name <code>profileloop</code>.
 </p>
 
 <p>
-For <code>DATA</code> directives, the symbol is followed by a slash and the number
-of bytes the memory associated with the symbol occupies.
-The arguments are optional flags and the data itself.
-For instance,
-</p>
+Global data symbols are defined by a sequence of initializing
+<code>DATA</code> directives followed by a <code>GLOBL</code> directive.
+Each <code>DATA</code> directive initializes a section of the
+corresponding memory.
+The memory not explicitly initialized is zeroed.
+The general form of the <code>DATA</code> directive is
 
 <pre>
-DATA  runtime·isplan9(SB)/4, $1
+DATA	symbol+offset(SB)/width, value
 </pre>
 
 <p>
-declares the local symbol <code>runtime·isplan9</code> of size 4 and value 1.
-Again the symbol has the middle dot and is offset from <code>SB</code>.
+which initializes the symbol memory at the given offset and width with the given value.
+The <code>DATA</code> directives for a given symbol must be written with increasing offsets.
 </p>
 
 <p>
@@ -237,15 +246,26 @@ The arguments are optional flags and the size of the data being declared as a gl
 which will have initial value all zeros unless a <code>DATA</code> directive
 has initialized it.
 The <code>GLOBL</code> directive must follow any corresponding <code>DATA</code> directives.
-This example
+</p>
+
+<p>
+For example,
 </p>
 
 <pre>
-GLOBL runtime·tlsoffset(SB),$4
+DATA divtab<>+0x00(SB)/4, $0xf4f8fcff
+DATA divtab<>+0x04(SB)/4, $0xe6eaedf0
+...
+DATA divtab<>+0x3c(SB)/4, $0x81828384
+GLOBL divtab<>(SB), RODATA, $64
+
+GLOBL runtime·tlsoffset(SB), NOPTR, $4
 </pre>
 
 <p>
-declares <code>runtime·tlsoffset</code> to have size 4.
+declares and initializes <code>divtab<></code>, a read-only 64-byte table of 4-byte integer values,
+and declares <code>runtime·tlsoffset</code>, a 4-byte, implicitly zeroed variable that
+contains no pointers.
 </p>
 
 <p>
@@ -253,7 +273,7 @@ There may be one or two arguments to the directives.
 If there are two, the first is a bit mask of flags,
 which can be written as numeric expressions, added or or-ed together,
 or can be set symbolically for easier absorption by a human.
-Their values, defined in the file <code>src/cmd/ld/textflag.h</code>, are:
+Their values, defined in the standard <code>#include</code>  file <code>textflag.h</code>, are:
 </p>
 
 <ul>
@@ -299,6 +319,80 @@ This is a wrapper function and should not count as disabling <code>recover</code
 </li>
 </ul>
 
+<h3 id="runtime">Runtime Coordination</h3>
+
+<p>
+For garbage collection to run correctly, the runtime must know the
+location of pointers in all global data and in most stack frames.
+The Go compiler emits this information when compiling Go source files,
+but assembly programs must define it explicitly.
+</p>
+
+<p>
+A data symbol marked with the <code>NOPTR</code> flag (see above)
+is treated as containing no pointers to runtime-allocated data.
+A data symbol with the <code>RODATA</code> flag
+is allocated in read-only memory and is therefore treated
+as implicitly marked <code>NOPTR</code>.
+A data symbol with a total size smaller than a pointer
+is also treated as implicitly marked <code>NOPTR</code>.
+It is not possible to define a symbol containing pointers in an assembly source file;
+such a symbol must be defined in a Go source file instead.
+Assembly source can still refer to the symbol by name
+even without <code>DATA</code> and <code>GLOBL</code> directives.
+A good general rule of thumb is to define all non-<code>RODATA</code>
+symbols in Go instead of in assembly.
+</p>
+
+<p>
+Each function also needs annotations giving the location of
+live pointers in its arguments, results, and local stack frame.
+For an assembly function with no pointer results and
+either no local stack frame or no function calls,
+the only requirement is to define a Go prototype for the function
+in a Go source file in the same package.
+For more complex situations, explicit annotation is needed.
+These annotations use pseudo-instructions defined in the standard
+<code>#include</code> file <code>funcdata.h</code>.
+</p>
+
+<p>
+If a function has no arguments and no results,
+the pointer information can be omitted.
+This is indicated by an argument size annotation of <code>$<i>n</i>-0</code>
+on the <code>TEXT</code> instruction.
+Otherwise, pointer information must be provided by
+a Go prototype for the function in a Go source file,
+even for assembly functions not called directly from Go.
+(The prototype will also let <code>go</code> <code>vet</code> check the argument references.)
+At the start of the function, the arguments are assumed
+to be initialized but the results are assumed uninitialized.
+If the results will hold live pointers during a call instruction,
+the function should start by zeroing the results and then 
+executing the pseudo-instruction <code>GO_RESULTS_INITIALIZED</code>.
+This instruction records that the results are now initialized
+and should be scanned during stack movement and garbage collection.
+It is typically easier to arrange that assembly functions do not
+return pointers or do not contain call instructions;
+no assembly functions in the standard library use
+<code>GO_RESULTS_INITIALIZED</code>.
+</p>
+
+<p>
+If a function has no local stack frame,
+the pointer information can be omitted.
+This is indicated by a local frame size annotation of <code>$0-<i>n</i></code>
+on the <code>TEXT</code> instruction.
+The pointer information can also be omitted if the
+function contains no call instructions.
+Otherwise, the local stack frame must not contain pointers,
+and the assembly must confirm this fact by executing the 
+pseudo-instruction <code>NO_LOCAL_POINTERS</code>.
+Because stack resizing is implemented by moving the stack,
+the stack pointer may change during any function call:
+even pointers to stack data must not be kept in local variables.
+</p>
+
 <h2 id="architectures">Architecture-specific details</h2>
 
 <p>
@@ -344,7 +438,7 @@ Here follows some descriptions of key Go-specific details for the supported arch
 <h3 id="x86">32-bit Intel 386</h3>
 
 <p>
-The runtime pointers to the <code>m</code> and <code>g</code> structures are maintained
+The runtime pointer to the <code>g</code> structure is maintained
 through the value of an otherwise unused (as far as Go is concerned) register in the MMU.
 A OS-dependent macro <code>get_tls</code> is defined for the assembler if the source includes
 an architecture-dependent header file, like this:
@@ -356,14 +450,15 @@ an architecture-dependent header file, like this:
 
 <p>
 Within the runtime, the <code>get_tls</code> macro loads its argument register
-with a pointer to a pair of words representing the <code>g</code> and <code>m</code> pointers.
+with a pointer to the <code>g</code> pointer, and the <code>g</code> struct
+contains the <code>m</code> pointer.
 The sequence to load <code>g</code> and <code>m</code> using <code>CX</code> looks like this:
 </p>
 
 <pre>
 get_tls(CX)
-MOVL	g(CX), AX	// Move g into AX.
-MOVL	m(CX), BX	// Move m into BX.
+MOVL	g(CX), AX     // Move g into AX.
+MOVL	g_m(AX), BX   // Move g->m into BX.
 </pre>
 
 <h3 id="amd64">64-bit Intel 386 (a.k.a. amd64)</h3>
@@ -376,22 +471,21 @@ pointers is the same as on the 386, except it uses <code>MOVQ</code> rather than
 
 <pre>
 get_tls(CX)
-MOVQ	g(CX), AX	// Move g into AX.
-MOVQ	m(CX), BX	// Move m into BX.
+MOVQ	g(CX), AX     // Move g into AX.
+MOVQ	g_m(AX), BX   // Move g->m into BX.
 </pre>
 
 <h3 id="arm">ARM</h3>
 
 <p>
-The registers <code>R9</code>, <code>R10</code>, and <code>R11</code>
+The registers <code>R10</code> and <code>R11</code>
 are reserved by the compiler and linker.
 </p>
 
 <p>
-<code>R9</code> and <code>R10</code> point to the <code>m</code> (machine) and <code>g</code>
-(goroutine) structures, respectively.
-Within assembler source code, these pointers must be referred to as <code>m</code> and <code>g</code>;
-the names <code>R9</code> and <code>R10</code> are not recognized.
+<code>R10</code> points to the <code>g</code> (goroutine) structure.
+Within assembler source code, this pointer must be referred to as <code>g</code>;
+the name <code>R10</code> is not recognized.
 </p>
 
 <p>
@@ -434,13 +528,10 @@ Here's how the 386 runtime defines the 64-bit atomic load function.
 // so actually
 // void atomicload64(uint64 *res, uint64 volatile *addr);
 TEXT runtime·atomicload64(SB), NOSPLIT, $0-8
-	MOVL	4(SP), BX
-	MOVL	8(SP), AX
-	// MOVQ (%EAX), %MM0
-	BYTE $0x0f; BYTE $0x6f; BYTE $0x00
-	// MOVQ %MM0, 0(%EBX)
-	BYTE $0x0f; BYTE $0x7f; BYTE $0x03
-	// EMMS
-	BYTE $0x0F; BYTE $0x77
+	MOVL	ptr+0(FP), AX
+	LEAL	ret_lo+4(FP), BX
+	BYTE $0x0f; BYTE $0x6f; BYTE $0x00	// MOVQ (%EAX), %MM0
+	BYTE $0x0f; BYTE $0x7f; BYTE $0x03	// MOVQ %MM0, 0(%EBX)
+	BYTE $0x0F; BYTE $0x77			// EMMS
 	RET
 </pre>
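
Reading note on the DATA/GLOBL text added by this patch: the following is a rough Go model of the documented semantics, not the assembler's or linker's actual implementation. It treats a GLOBL as zeroed memory of the declared size and applies each DATA directive at its offset and width. The names dataDirective and layout are hypothetical. Little-endian byte order (as on 386 and amd64) and the reading of "increasing offsets" as "no directive starts before the previous one ends" are assumptions of this sketch.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// dataDirective models one "DATA symbol+offset(SB)/width, value" line.
// The type and its field names are hypothetical, chosen for this sketch.
type dataDirective struct {
	offset int
	width  int
	value  uint64
}

// layout models "GLOBL symbol(SB), $size" preceded by DATA directives:
// memory starts zeroed and each directive overwrites its own slice of it.
// Assumption: little-endian storage, and offsets must not move backwards.
func layout(size int, dirs []dataDirective) ([]byte, error) {
	mem := make([]byte, size) // memory not explicitly initialized stays zero
	next := 0                 // first offset the next directive may use
	for _, d := range dirs {
		if d.offset < next {
			return nil, fmt.Errorf("DATA offset %#x is not increasing", d.offset)
		}
		if d.offset+d.width > size {
			return nil, fmt.Errorf("DATA at %#x overflows %d-byte symbol", d.offset, size)
		}
		switch d.width {
		case 1:
			mem[d.offset] = byte(d.value)
		case 2:
			binary.LittleEndian.PutUint16(mem[d.offset:], uint16(d.value))
		case 4:
			binary.LittleEndian.PutUint32(mem[d.offset:], uint32(d.value))
		case 8:
			binary.LittleEndian.PutUint64(mem[d.offset:], d.value)
		default:
			return nil, fmt.Errorf("unsupported width %d", d.width)
		}
		next = d.offset + d.width
	}
	return mem, nil
}

func main() {
	// The three divtab<> directives that appear explicitly in the patch;
	// the entries elided by "..." are left out and therefore stay zero here.
	mem, err := layout(64, []dataDirective{
		{0x00, 4, 0xf4f8fcff},
		{0x04, 4, 0xe6eaedf0},
		{0x3c, 4, 0x81828384},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("% x\n", mem)
}
```

In this model a directive whose offset precedes the end of the previous one is rejected, which mirrors the patch's rule that DATA directives for a symbol be written with increasing offsets.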
