summaryrefslogtreecommitdiff
path: root/libdwarf/mips_extensions.mm
diff options
context:
space:
mode:
Diffstat (limited to 'libdwarf/mips_extensions.mm')
-rw-r--r--libdwarf/mips_extensions.mm1266
1 files changed, 1266 insertions, 0 deletions
diff --git a/libdwarf/mips_extensions.mm b/libdwarf/mips_extensions.mm
new file mode 100644
index 0000000..7a312f0
--- /dev/null
+++ b/libdwarf/mips_extensions.mm
@@ -0,0 +1,1266 @@
+\."
+\." the following line may be removed if the ff ligature works on your machine
+.lg 0
+\." set up heading formats
+.ds HF 3 3 3 3 3 2 2
+.ds HP +2 +2 +1 +0 +0
+.nr Hs 5
+.nr Hb 5
+\." ==============================================
+\." Put current date in the following at each rev
+.ds vE rev 1.18, 31 March 2005
+\." ==============================================
+\." ==============================================
+.ds | |
+.ds ~ ~
+.ds ' '
+.if t .ds Cw \&\f(CW
+.if n .ds Cw \fB
+.de Cf \" Place every other arg in Cw font, beginning with first
+.if \\n(.$=1 \&\*(Cw\\$1\fP
+.if \\n(.$=2 \&\*(Cw\\$1\fP\\$2
+.if \\n(.$=3 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP
+.if \\n(.$=4 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP\\$4
+.if \\n(.$=5 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP\\$4\*(Cw\\$5\fP
+.if \\n(.$=6 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP\\$4\*(Cw\\$5\fP\\$6
+.if \\n(.$=7 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP\\$4\*(Cw\\$5\fP\\$6\*(Cw\\$7\fP
+.if \\n(.$=8 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP\\$4\*(Cw\\$5\fP\\$6\*(Cw\\$7\fP\\$8
+.if \\n(.$=9 \&\*(Cw\\$1\fP\\$2\*(Cw\\$3\fP\\$4\*(Cw\\$5\fP\\$6\*(Cw\\$7\fP\\$8\
+*(Cw
+..
+.nr Cl 4
+.SA 1
+.TL
+MIPS Extensions to DWARF Version 2.0
+.AF ""
+.AU "Silicon Graphics Computer Systems"
+.PF "'\*(vE'- \\\\nP -''"
+.AS 1
+This document describes the MIPS/Silicon Graphics extensions
+to the "DWARF Information Format" (version 2.0.0 dated July 27, 1993).
+DWARF3 draft 8 (or draft 9) is out as of 2005, and
+is mentioned below where applicable.
+MIPS/IRIX compilers emit DWARF2 (with extensions).
+.P
+Rather than alter the base documents to describe the extensions
+we provide this separate document.
+.P
+The extensions documented here are subject to change.
+.P
+It also describes known bugs resulting in incorrect dwarf usage.
+.P
+\*(vE
+
+.AE
+.MT 4
+
+.H 1 "INTRODUCTION"
+.P
+This
+document describes MIPS extensions
+to the DWARF debugging information format.
+The extensions documented here are subject to change at
+any time.
+.H 1 "64 BIT DWARF"
+.P
+The DWARF2 spec has no provision for 64 bit offsets.
+SGI-IRIX/MIPS Elf64 objects contain DWARF 2 with all offsets
+(and addresses) as 64bit values.
+This non-standard extension was adopted in 1992.
+Nothing in the dwarf itself identifies the dwarf as 64bit.
+This extension 64bit-offset dwarf cannot be mixed with 32bit-offset dwarf
+in a single object or executable, and SGI-IRIX/MIPS compilers
+and tools do not mix the sizes.
+.P
+In 2001 DWARF3 adopted a very different 64bit-offset
+format which can be mixed usefully with 32bit-offset DWARF2 or DWARF3.
+It is not very likely SGI-IRIX/MIPS compilers will switch to the
+now-standard
+DWARF3 64bit-offset scheme, but such a switch is theoretically
+possible and would be a good idea.
+.P
+SGI-IRIX/MIPS Elf32 objects
+contain DWARF2 with all offsets (and addresses) 32 bits.
+.H 1 "How much symbol information is emitted"
+The following standard DWARF V2 sections may be emitted:
+.AL
+.LI
+Section .debug_abbrev
+contains
+abbreviations supporting the .debug_info section.
+.LI
+Section .debug_info
+contains
+Debug Information Entries (DIEs).
+.LI
+Section .debug_frame
+contains
+stack frame descriptions.
+.LI
+Section .debug_line
+contains
+line number information.
+.LI
+Section .debug_aranges
+contains
+address range descriptions.
+.LI
+Section .debug_pubnames
+contains
+names of global functions and data.
+.P
+The following
+are MIPS extensions.
+Theses were created to allow debuggers to
+know names without having to look at
+the .debug_info section.
+.LI
+Section .debug_weaknames
+is a MIPS extension
+containing .debug_pubnames-like entries describing weak
+symbols.
+.LI
+Section .debug_funcnames
+is a MIPS extension
+containing .debug_pubnames-like entries describing file-static
+functions (C static functions).
+The gcc extension of nested subprograms (like Pascal)
+adds non-global non-static functions. These should be treated like
+static functions and gcc should add such to this section
+so that IRIX libexc(3C) will work correctly.
+Similarly, Ada functions which are non-global should be here too
+so that libexc(3C) can work.
+Putting it another way, every function (other than inline code)
+belongs either in .debug_pubnames or in .debug_funcnames
+or else libexc(3C) cannot find the function name.
+.LI
+Section .debug_varnames
+is a MIPS extension
+containing .debug_pubnames-like entries describing file-static
+data symbols (C static variables).
+.LI
+Section .debug_typenames
+is a MIPS extension
+containing .debug_pubnames-like entries describing file-level
+types.
+.P
+The following are not currently emitted.
+.LI
+Section .debug_macinfo
+Macro information is not currently emitted.
+.LI
+Section .debug_loc
+Location lists are not currently emitted.
+.LI
+Section .debug_str
+The string section is not currently emitted.
+.LE
+.H 2 "Overview of information emitted"
+We emit debug information in 3 flavors.
+We mention C here.
+The situation is essentially identical for f77, f90, and C++.
+.AL
+.LI
+"default C"
+We emit line information and DIEs for each subprogram.
+But no local symbols and no type information.
+Frame information is output.
+The DW_AT_producer string has the optimization level: for example
+"-O2".
+We put so much in the DW_AT_producer that the string
+is a significant user of space in .debug_info --
+this is perhaps a poor use of space.
+When optimizing the IRIX CC/cc option -DEBUG:optimize_space
+eliminates such wasted space.
+Debuggers only currently use the lack of -g
+of DW_AT_producer
+as a hint as to how a 'step' command should be interpreted, and
+the rest of the string is not used for anything (unless
+a human looks at it for some reason), so if space-on-disk
+is an issue, it is quite appropriate to use -DEBUG:optimize_space
+and save disk space.
+Every function definition (not inline instances though) is mentioned
+in either .debug_pubnames or .debug_funcnames.
+This is crucial to allow libexc(3C) stack-traceback to work and
+show function names (for all languages).
+.LI
+"C with full symbols"
+All possible info is emitted.
+DW_AT_producer string has all options that might be of interest,
+which includes -D's, -U's, and the -g option.
+These options look like they came from the command line.
+We put so much in the DW_AT_producer that the string
+is a significant user of space in .debug_info.
+this is perhaps a poor use of space.
+Debuggers only currently use the -g
+of DW_AT_producer
+as a hint as to how a 'step' command should be interpreted, and
+the rest of the string is not used for anything (unless
+a human looks at it for some reason).
+Every function definition (not inline instances though) is mentioned
+in either .debug_pubnames or .debug_funcnames.
+This is crucial to allow libexc(3C) stack-traceback to work and
+show function names (for all languages).
+.LI
+"Assembler (-g, non -g are the same)"
+Frame information is output.
+No type information is emitted, but DIEs are prepared
+for globals.
+.LE
+.H 2 "Detecting 'full symbols' (-g)"
+The debugger depends on the existence of
+the DW_AT_producer string to determine if the
+compilation unit has full symbols or not.
+It looks for -g or -g[123] and accepts these as
+full symbols but an absent -g or a present -g0
+is taken to mean that only basic symbols are defined and there
+are no local symbols and no type information.
+.P
+In various contexts the debugger will think the program is
+stripped or 'was not compiled with -g' unless the -g
+is in the DW_AT_producer string.
+.H 2 "DWARF and strip(1)"
+The DWARF section ".debug_frame" is marked SHF_MIPS_NOSTRIP
+and is not stripped by the strip(1) program.
+This is because the section is needed for doing
+stack back traces (essential for C++
+and Ada exception handling).
+.P
+All .debug_* sections are marked with elf type
+SHT_MIPS_DWARF.
+Applications needing to access the various DWARF sections
+must use the section name to discriminate between them.
+
+.H 2 "Evaluating location expressions"
+When the debugger evaluates location expressions, it does so
+in 2 stages. In stage one it simply looks for the trivial
+location expressions and treats those as special cases.
+.P
+If the location expression is not trivial, it enters stage two.
+In this case it uses a stack to evaluate the expression.
+.P
+If the application is a 32-bit application, it does the operations
+on 32-bit values (address size values). Even though registers
+can be 64 bits in a 32-bit program all evaluations are done in
+32-bit quantities, so an attempt to calculate a 32-bit quantity
+by taking the difference of 2 64-bit register values will not
+work. The notion is that the stack machine is, by the dwarf
+definition, working in address size units.
+.P
+These values are then expanded to 64-bit values (addresses or
+offsets). This extension does not involve sign-extension.
+.P
+If the application is a 64-bit application, then the stack
+values are all 64 bits and all operations are done on 64 bits.
+.H 3 "The fbreg location op"
+Compilers shipped with IRIX 6.0 and 6.1
+do not emit the fbreg location expression
+and never emit the DW_AT_frame_base attribute that it
+depends on.
+However, this changes
+with release 6.2 and these are now emitted routinely.
+
+.H 1 "Frame Information"
+.H 2 "Initial Instructions"
+The DWARF V2 spec
+provides for "initial instructions" in each CIE (page 61,
+section 6.4.1).
+However, it does not say whether there are default
+values for each column (register).
+.P
+Rather than force every CIE to have a long list
+of bytes to initialize all 32 integer registers,
+we define that the default values of all registers
+(as returned by libdwarf in the frame interface)
+are 'same value'.
+This is a good choice for many non-register-windows
+implementations.
+.H 2 "Augmentation string in debug_frame"
+The augmentation string we use in shipped compilers (up thru
+irix6.2) is the empty string.
+IRIX6.2 and later has an augmentation string
+the empty string ("")
+or "z" or "mti v1"
+where the "v1" is a version number (version 1).
+.P
+We do not believe that "mti v1" was emitted as the
+augmentation string in any shipped compiler.
+.P
+.H 3 "CIE processing based on augmentation string:"
+If the augmentation string begins with 'z', then it is followed
+immediately by a unsigned_leb_128 number giving the code alignment factor.
+Next is a signed_leb_128 number giving the data alignment factor.
+Next is a unsigned byte giving the number of the return address register.
+Next is an unsigned_leb_128 number giving the length of the 'augmentation'
+fields (the length of augmentation bytes, not
+including the unsigned_leb_128 length itself).
+As of release 6.2, the length of the CIE augmentation fields is 0.
+What this means is that it is possible to add new
+augmentations, z1, z2, etc and yet an old consumer to
+understand the entire CIE as it can bypass the
+augmentation it does not understand because the
+length of the augmentation fields is present.
+Presuming of course that all augmentation fields are
+simply additional information,
+not some 'changing of the meaning of
+an existing field'.
+Currently there is no CIE data in the augmentation for things
+beginning with 'z'.
+.P
+If the augmentation string is "mti v1" or "" then it is followed
+immediately by a unsigned_leb_128 number giving the code alignment factor.
+Next is a signed_leb_128 number giving the data alignment factor.
+Next is a unsigned byte giving the number of the return address register.
+.P
+If the augmentation string is something else, then the
+code alignment factor is assumed to be 4 and the data alignment
+factor is assumed to be -1 and the return
+address register is assumed to be 31. Arbitrarily.
+The library (libdwarf) assumes it does not understand the rest of the CIE.
+.P
+.H 3 "FDE processing based on augmentation"
+If the CIE augmentation string
+for an fde begins with 'z'
+then the next FDE field after the address_range field
+is an
+unsigned_leb_128 number giving the length of the 'augmentation'
+fields, and those fields follow immediately.
+
+.H 4 "FDE augmentation fields"
+.P
+If the CIE augmentation string is "mti v1" or ""
+then the FDE is exactly as described in the Dwarf Document section 6.4.1.
+.P
+Else, if the CIE augmentation string begins with "z"
+then the next field after the FDE augmentation length field
+is a Dwarf_Sword size offset into
+exception tables.
+If the CIE augmentation string does not begin with "z"
+(and is neither "mti v1" nor "")
+the FDE augmentation fields are skipped (not understood).
+Note that libdwarf actually (as of MIPSpro7.3 and earlier)
+only tests that the initial character of the augmentation
+string is 'z', and ignores the rest of the string, if any.
+So in reality the test is for a _prefix_ of 'z'.
+.P
+If the CIE augmentation string neither starts with 'z' nor is ""
+nor is "mti v1" then libdwarf (incorrectly) assumes that the
+table defining instructions start next.
+Processing (in libdwarf) will be incorrect.
+.H 2 "Stack Pointer recovery from debug_frame"
+There is no identifiable means in
+DWARF2 to say that the stack register is
+recovered by any particular operation.
+A 'register rule' works if the caller's
+stack pointer was copied to another
+register.
+An 'offset(N)' rule works if the caller's
+stack pointer was stored on the stack.
+However if the stack pointer is
+some register value plus/minus some offset,
+there is no means to say this in an FDE.
+For MIPS/IRIX, the recovered stack pointer
+of the next frame up the stack (towards main())
+is simply the CFA value of the current
+frame, and the CFA value is
+precisely a register (value of a register)
+or a register plus offset (value of a register
+plus offset). This is a software convention.
+.H 1 "egcs dwarf extensions (egcs-1.1.2 extensions)"
+This and following egcs sections describe
+the extensions currently shown in egcs dwarf2.
+Note that egcs has chosen to adopt tag and
+attribute naming as if their choices were
+standard dwarf, not as if they were extensions.
+However, they are properly numbered as extensions.
+
+.H 2 "DW_TAG_format_label 0x4101"
+For FORTRAN 77, Fortran 90.
+Details of use not defined in egcs source, so
+unclear if used.
+.H 2 "DW_TAG_function_template 0x4102"
+For C++.
+Details of use not defined in egcs source, so
+unclear if used.
+.H 2 "DW_TAG_class_template 0x4103"
+For C++.
+Details of use not defined in egcs source, so
+unclear if used.
+.H 2 "DW_AT_sf_names 0x2101"
+Apparently only output in DWARF1, not DWARF2.
+.H 2 "DW_AT_src_info 0x2102"
+Apparently only output in DWARF1, not DWARF2.
+.H 2 "DW_AT_mac_info 0x2103"
+Apparently only output in DWARF1, not DWARF2.
+.H 2 "DW_AT_src_coords 0x2104"
+Apparently only output in DWARF1, not DWARF2.
+.H 2 "DW_AT_body_begin 0x2105"
+Apparently only output in DWARF1, not DWARF2.
+.H 2 "DW_AT_body_end 0x2106"
+Apparently only output in DWARF1, not DWARF2.
+
+.H 1 "egcs .eh_frame (non-sgi) (egcs-1.1.2 extensions)"
+egcs-1.1.2 (and earlier egcs)
+emits by default a section named .eh_frame
+for ia32 (and possibly other platforms) which
+is nearly identical to .debug_frame in format and content.
+This section is used for helping handle C++ exceptions.
+.P
+Because after linking there are sometimes zero-ed out bytes
+at the end of the eh_frame section, the reader code in
+dwarf_frame.c considers a zero cie/fde length as an indication
+that it is the end of the section.
+.P
+.H 2 "CIE_id 0"
+The section is an ALLOCATED section in an executable, and
+is therefore mapped into memory at run time.
+The CIE_pointer (aka CIE_id, section 6.4.1
+of the DWARF2 document) is the field that
+distinguishes a CIE from an FDE.
+The designers of the egcs .eh_frame section
+decided to make the CIE_id
+be 0 as the CIE_pointer definition is
+.in +2
+the number of bytes from the CIE-pointer in the FDE back to the
+applicable CIE.
+.in -2
+In a dwarf .debug_frame section, the CIE_pointer is the
+offset in .debug_frame of the CIE for this fde, and
+since an offset can be zero of some CIE, the CIE_id
+cannot be 0, but must be all 1 bits .
+Note that the dwarf2.0 spec does specify the value of CIE_id
+as 0xffffffff
+(see section 7.23 of v2.0.0),
+though earlier versions of this extensions document
+incorrectly said it was not specified in the dwarf
+document.
+.H 2 "augmentation eh"
+The augmentation string in each CIE is "eh"
+which, with its following NUL character, aligns
+the following word to a 32bit boundary.
+Following the augmentation string is a 32bit
+word with the address of the __EXCEPTION_TABLE__,
+part of the exception handling data for egcs.
+.H 2 "DW_CFA_GNU_window_save 0x2d"
+This is effectively a flag for architectures with
+register windows, and tells the unwinder code that
+it must look to a previous frame for the
+correct register window set.
+As of this writing, egcs gcc/frame.c
+indicates this is for SPARC register windows.
+.H 2 "DW_CFA_GNU_args_size 0x2e"
+DW_CFA_GNU_args_size has a single uleb128 argument
+which is the size, in bytes, of the function's stack
+at that point in the function.
+.H 2 "__EXCEPTION_TABLE__"
+A series of 3 32bit word entries by default:
+0 word: low pc address
+1 word: high pc address
+2 word: pointer to exception handler code
+The end of the table is
+signaled by 2 words of -1 (not 3 words!).
+.H 1 "Interpretations of the DWARF V2 spec"
+.H 2 "template TAG spellings"
+The DWARF V2 spec spells two attributes in two ways.
+DW_TAG_template_type_param
+(listed in Figure 1, page 7)
+is spelled DW_TAG_template_type_parameter
+in the body of the document (section 3.3.7, page 28).
+We have adopted the spelling
+DW_TAG_template_type_param.
+.P
+DW_TAG_template_value_param
+(listed in Figure 1, page 7)
+is spelled DW_TAG_template_value_parameter
+in the body of the document (section 3.3.7, page 28).
+We have adopted the spelling
+DW_TAG_template_value_parameter.
+.P
+We recognize that the choices adopted are neither consistently
+the longer nor the shorter name.
+This inconsistency was an accident.
+.H 2 DW_FORM_ref_addr confusing
+Section 7.5.4, Attribute Encodings, describes
+DW_FORM_ref_addr.
+The description says the reference is the size of an address
+on the target architecture.
+This is surely a mistake, because on a 16bit-pointer-architecture
+it would mean that the reference could not exceed
+16 bits, which makes only
+a limited amount of sense as the reference is from one
+part of the dwarf to another, and could (in theory)
+be *on the disk* and not limited to what fits in memory.
+Since MIPS is 32 bit pointers (at the smallest)
+the restriction is not a problem for MIPS/SGI.
+The 32bit pointer ABIs are limited to 32 bit section sizes
+anyway (as a result of implementation details).
+And the 64bit pointer ABIs currently have the same limit
+as a result of how the compilers and tools are built
+(this has not proven to be a limit in practice, so far).
+.P
+This has been clarified in the DWARF3 spec and the IRIX use
+of DW_FORM_ref_addr being an offset is correct.
+.H 2 "Section .debug_macinfo in a debugger"
+It seems quite difficult, in general, to
+tie specific text(code) addresses to points in the
+stream of macro information for a particular compilation unit.
+So it's been difficult to see how to design a consumer
+interface to libdwarf for macro information.
+.P
+The best (simple to implement, easy for a debugger user to
+understand) candidate seems to be that
+the debugger asks for macros of a given name in a compilation
+unit, and the debugger responds with *all* the macros of that name.
+.H 3 "only a single choice exists"
+If there is exactly one, that is usable in expressions, if the
+debugger is able to evaluate such.
+.H 3 "multiple macros with same name".
+If there are multiple macros with the same name
+in a compilation unit,
+the debugger (and the debugger user and the application
+programmer) have
+a problem: confusion is quite possible.
+If the macros are simple the
+debugger user can simply substitute by hand in an expression.
+If the macros are complicated hand substitution will be
+impractical, and the debugger will have to identify the
+choices and let the debugger user choose an interpretation.
+.H 2 "Section 6.1.2 Lookup by address problem"
+Each entry is a beginning-address followed by a length.
+And the distinguished entry 0,0 is used to denote
+the end of a range of entries.
+.P
+This means that one must be careful not to emit a zero length,
+as in a .o (object file) the beginning address of
+a normal entry might be 0 (it is a section offset after all),
+and the resulting 0,0 would be taken as end-of-range, not
+as a valid entry.
+A dwarf dumper would have trouble with such data
+in an object file.
+.P
+In an a.out or shared object (dynamic shared object, DSO)
+no text will be at address zero so in such this problem does
+not arise.
+.H 2 "Section 5.10 Subrange Type Entries problem"
+It is specified that DW_AT_upper_bound (and lower bound)
+must be signed entries if there is no object type
+info to specify the bound type (Sec 5.10, end of section).
+One cannot tell (with some
+dwarf constant types) what the signedness is from the
+form itself (like DW_FORM_data1), so it is necessary
+to determine the object and type according to the rules
+in 5.10 and then if all that fails, the type is signed.
+It's a bit complicated and earlier versions of mips_extensions
+incorrectly said signedness was not defined.
+.H 2 "Section 5.5.6 Class Template Instantiations problem"
+Lots of room for implementor to canonicalize
+template declarations. Ie various folks won't agree.
+This is not serious since a given compiler
+will be consistent with itself and debuggers
+will have to cope!
+.H 2 "Section 2.4.3.4 # 11. operator spelling"
+DW_OP_add should be DW_OP_plus (page 14)
+(this mistake just one place on the page).
+.H 2 "No clear specification of C++ static funcs"
+There is no clear way to tell if a C++ member function
+is a static member or a non-static member function.
+(dwarf2read.c in gdb 4.18, for example, has this observation)
+.H 2 "Misspelling of DW_AT_const_value"
+Twice in appendix 1, DW_AT_const_value is misspelled
+as DW_AT_constant_value.
+.H 2 "Mistake in Atribute Encodings"
+Section 7.5.4, "Attribute Encodings"
+has a brief discussion of "constant"
+which says there are 6 forms of constants.
+It is incorrect in that it fails to mention (or count)
+the block forms, which are clearly allowed by
+section 4.1 "Data Object Entries" (see entry number 9 in
+the numbered list, on constants).
+.H 2 "DW_OP_bregx"
+The description of DW_OP_bregx in 2.4.3.2 (Register Based
+Addressing) is slightly misleading, in that it
+lists the offset first.
+As section 7.7.1 (Location Expression)
+makes clear, in the encoding the register number
+comes first.
+.H 1 "MIPS attributes"
+.H 2 "DW_AT_MIPS_fde"
+This extension to Dwarf appears only on subprogram TAGs and has as
+its value the offset, in the .debug_frame section, of the fde which
+describes the frame of this function. It is an optimization of
+sorts to have this present.
+
+.H 2 "DW_CFA_MIPS_advance_loc8 0x1d"
+This obvious extension to dwarf line tables enables encoding of 8 byte
+advance_loc values (for cases when such must be relocatable,
+and thus must be full length). Applicable only to 64-bit objects.
+
+.H 2 "DW_TAG_MIPS_loop 0x4081"
+For future use. Not currently emitted.
+Places to be emitted and attributes that this might own
+not finalized.
+
+.H 2 "DW_AT_MIPS_loop_begin 0x2002"
+For future use. Not currently emitted.
+Attribute form and content not finalized.
+
+.H 2 "DW_AT_MIPS_tail_loop_begin 0x2003"
+For future use. Not currently emitted.
+Attribute form and content not finalized.
+
+.H 2 "DW_AT_MIPS_epilog_begin 0x2004"
+For future use. Not currently emitted.
+Attribute form and content not finalized.
+
+.H 2 "DW_AT_MIPS_loop_unroll_factor 0x2005"
+For future use. Not currently emitted.
+Attribute form and content not finalized.
+
+.H 2 "DW_AT_MIPS_software_pipeline_depth 0x2006"
+For future use. Not currently emitted.
+Attribute form and content not finalized.
+.H 2 "DW_AT_MIPS_linkage_name 0x2007"
+The rules for mangling C++ names are not part of the
+C++ standard and are different for different versions
+of C++. With this extension, the compiler emits
+both the DW_AT_name for things with mangled names
+(recall that DW_AT_name is NOT the mangled form)
+and also emits DW_AT_MIPS_linkage_name whose value
+is the mangled name.
+.P
+This makes looking for the mangled name in other linker
+information straightforward.
+It also is passed (by the debugger) to the
+libmangle routines to generate names to present to the
+debugger user.
+.H 2 "DW_AT_MIPS_stride 0x2008"
+F90 allows assumed shape arguments and pointers to describe
+non-contiguous memory. A (runtime) descriptor contains address,
+bounds and stride information - rank and element size is known
+during compilation. The extent in each dimension is given by the
+bounds in a DW_TAG_subrange_type, but the stride cannot be
+represented in conventional dwarf. DW_AT_MIPS_stride was added as
+an attribute of a DW_TAG_subrange_type to describe the
+location of the stride.
+Used in the MIPSpro 7.2 (7.2.1 etc) compilers.
+.P
+If the stride is constant (ie: can be inferred from the type in the
+usual manner) DW_AT_MIPS_stride is absent.
+.P
+If DW_AT_MIPS_stride is present, the attribute contains a reference
+to a DIE which describes the location holding the stride, and the
+DW_AT_stride_size field of DW_TAG_array_type is ignored if
+present. The value of the stride is the number of
+4 byte words between
+elements along that axis.
+.P
+This applies to
+.nf
+a) Intrinsic types whose size is greater
+ or equal to 4bytes ie: real*4,integer*8
+ complex etc, but not character types.
+
+b) Derived types (ie: structs) of any size,
+ unless all components are of type character.
+.fi
+
+.H 2 "DW_AT_MIPS_abstract_name 0x2009"
+This attribute only appears in a DA_TAG_inlined_subroutine DIE.
+The value of this attribute is a string.
+When IPA inlines a routine and the abstract origin is
+in another compilation unit, there is a problem with putting
+in a reference, since the ordering and timing of the
+creation of references is unpredicatable with reference to
+the DIE and compilation unit the reference refers to.
+.P
+Since there may be NO ordering of the compilation units that
+allows a correct reference to be done without some kind of patching,
+and since even getting the information from one place to another
+is a problem, the compiler simply passes the problem on to the debugger.
+.P
+The debugger must match the DW_AT_MIPS_abstract_name
+in the concrete
+inlined instance DIE
+with the DW_AT_MIPS_abstract_name
+in the abstract inlined subroutine DIE.
+.P
+A dwarf-consumer-centric view of this and other inline
+issues could be expressed as follows:
+.nf
+If DW_TAG_subprogram
+ If has DW_AT_inline is abstract instance root
+ If has DW_AT_abstract_origin, is out-of-line instance
+ of function (need abstract origin for some data)
+ (abstract root in same CU (conceptually anywhere
+ a ref can reach, but reaching outside of CU is
+ a problem for ipa: see DW_AT_MIPS_abstract_name))
+ If has DW_AT_MIPS_abstract_name is abstract instance
+ root( must have DW_AT_inline) and this name is used to
+ match with the abstract root
+
+If DW_TAG_inline_subroutine
+ Is concrete inlined subprogram instance.
+ If has DW_AT_abstract_origin, it is a CU-local inline.
+ If it has DW_AT_MIPS_abstract_name it is an
+ inline whose abstract root is in another file (CU).
+.fi
+
+.H 2 "DW_AT_MIPS_clone_origin 0x200a"
+This attribute appears only in a cloned subroutine.
+The procedure is cloned from the same compilation unit.
+The value of this attribute is a reference to
+the original routine in this compilation unit.
+.P
+The 'original' routine means the routine which has all the
+original code. The cloned routines will always have
+been 'specialized' by IPA.
+A routine with DW_AT_MIPS_clone_origin
+will also have the DW_CC_nocall value of the DW_AT_calling_convention
+attribute.
+
+.H 2 "DW_AT_MIPS_has_inlines 0x200b"
+This attribute may appear in a DW_TAG_subprogram DIE.
+If present and it has the value True, then the subprogram
+has inlined functions somewhere in the body.
+.P
+By default, at startup, the debugger may not look for
+inlined functions in scopes inside the outer function.
+.P
+This is a hint to the debugger to look for the inlined functions
+so the debugger can set breakpoints on these in case the user
+requests 'stop in foo' and foo is inlined.
+.H 2 "DW_AT_MIPS_stride_byte 0x200c"
+Created for f90 pointer and assumed shape
+arrays.
+Used in the MIPSpro 7.2 (7.2.1 etc) compilers.
+A variant of DW_AT_MIPS_stride.
+This stride is interpreted as a byte count.
+Used for integer*1 and character arrays
+and arrays of derived type
+whose components are all character.
+.H 2 "DW_AT_MIPS_stride_elem 0x200d"
+Created for f90 pointer and assumed shape
+arrays.
+Used in the MIPSpro 7.2 (7.2.1 etc) compilers.
+A variant of DW_AT_MIPS_stride.
+This stride is interpreted as a byte-pair (2 byte) count.
+Used for integer*2 arrays.
+.H 2 "DW_AT_MIPS_ptr_dopetype 0x200e"
+See following.
+.H 2 "DW_AT_MIPS_allocatable_dopetype 0x200f"
+See following.
+.H 2 "DW_AT_MIPS_assumed_shape_dopetype 0x2010"
+DW_AT_MIPS_assumed_shape_dopetype, DW_AT_MIPS_allocatable_dopetype,
+and DW_AT_MIPS_ptr_dopetype have an attribute value
+which is a reference to a Fortran 90 Dope Vector.
+These attributes are introduced in MIPSpro7.3.
+They only apply to f90 arrays (where they are
+needed to describe arrays never properly described
+before in debug information).
+C, C++, f77, and most f90 arrays continue to be described
+in standard dwarf.
+.P
+The distinction between these three attributes is the f90 syntax
+distinction: keywords 'pointer' and 'allocatable' with the absence
+of these keywords on an assumed shape array being the third case.
+.P
+A "Dope Vector" is a struct (C struct) which describes
+a dynamically-allocatable array.
+In objects with full debugging the C struct will be
+in the dwarf information (of the f90 object, represented like C).
+A debugger will use the link to find the main struct DopeVector
+and will use that information to decode the dope vector.
+At the outer allocatable/assumed-shape/pointer
+the DW_AT_location points at the dope vector (so debugger
+calculations use that as a base).
+.H 2 "Overview of debugger use of dope vectors"
+Fundamentally, we build two distinct
+representations of the arrays and pointers.
+One, in dwarf, represents the statically-representable
+information (the types and
+variable/type-names, without type size information).
+The other, using dope vectors in memory, represents
+the run-time data of sizes.
+A debugger must process the two representations
+in parallel (and merge them) to deal with user expressions in
+a debugger.
+.H 2 "Example f90 code for use in explanation"
+[Note
+We want dwarf output with *exactly*
+this little (arbitrary) example.
+Not yet available.
+end Note]
+Consider the following code.
+.nf
+ type array_ptr
+ real :: myvar
+ real, dimension (:), pointer :: ap
+ end type array_ptr
+
+ type (array_ptr), allocatable, dimension (:) :: arrays
+
+ allocate (arrays(20))
+ do i = 1,20
+ allocate (arrays(i)%ap(i))
+ end do
+.fi
+arrays is an allocatable array (1 dimension) whose size is
+not known at compile time (it has
+a Dope Vector). At run time, the
+allocate statement creats 20 array_ptr dope vectors
+and marks the base arrays dopevector as allocated.
+The myvar variable is just there to add complexity to
+the example :-)
+.nf
+In the loop, arrays(1)%ap(1)
+ is allocated as a single element array of reals.
+In the loop, arrays(2)%ap(2)
+ is allocated as an array of two reals.
+...
+In the loop, arrays(20)%ap(20)
+ is allocated as an array of twenty reals.
+.fi
+.H 2 "the problem with standard dwarf and this example"
+.sp
+In dwarf, there is no way to find the array bounds of arrays(3)%ap,
+for example, (which are 1:3 in f90 syntax)
+since any location expression in an ap array lower bound
+attribute cannot involve the 3 (the 3 is known at debug time and
+does not appear in the running binary, so no way for the
+location expression to get to it).
+And of course the 3 must actually index across the array of
+dope vectors in 'arrays' in our implementation, but that is less of
+a problem than the problem with the '3'.
+.sp
+Plus dwarf has no way to find the 'allocated' flag in the
+dope vector (so the debugger can know when the allocate is done
+for a particular arrays(j)%ap).
+.sp
+Consequently, the calculation of array bounds and indices
+for these dynamically created f90 arrays
+is now pushed of into the debugger, which must know the
+field names and usages of the dope vector C structure and
+use the field offsets etc to find data arrays.
+C, C++, f77, and most f90 arrays continue to be described
+in standard dwarf.
+At the outer allocatable/assumed-shape/pointer
+the DW_AT_location points at the dope vector (so debugger
+calculations use that as a base).
+.P
+It would have been nice to design a dwarf extension
+to handle the above problems, but
+the methods considered to date were not
+any more consistent with standard dwarf than
+this dope vector centric approach: essentially just
+as much work in the debugger appeared necessary either way.
+A better (more dwarf-ish)
+design would be welcome information.
+
+.H 2 "A simplified sketch of the dwarf information"
+[Note:
+Needs to be written.
+end Note]
+
+.H 2 "A simplified sketch of the dope vector information"
+[Note:
+This one is simplified.
+Details left out that should be here. Amplify.
+end Note]
+This is an overly simplified version of a dope vector,
+presented as an initial hint.
+Full details presented later.
+.nf
+struct simplified{
+ void *base; // pointer to the data this describes
+ long el_len;
+ int assoc:1
+ int ptr_alloc:1
+ int num_dims:3;
+ struct dims_s {
+ long lb;
+ long ext;
+ long str_m;
+ } dims[7];
+};
+.fi
+Only 'num_dims' elements of dims[] are actually used.
+
+.H 2 "The dwarf information"
+
+Here is dwarf information from the compiler for
+the example above, as printed by dwarfdump(1)
+.nf
+[Note:
+The following may not be the test.
+Having field names with '.' in the name is
+not such a good idea, as it conflicts with the
+use of '.' in dbx extended naming.
+Something else, like _$, would be much easier
+to work with in dbx (customers won't care about this,
+for the most part,
+but folks working on dbx will, and in those
+rare circumstances when a customer cares,
+the '.' will be a real problem in dbx.).
+Note that to print something about .base., in dbx one
+would have to do
+ whatis `.base.`
+where that is the grave accent, or back-quote I am using.
+With extended naming one do
+ whatis `.dope.`.`.base.`
+which is hard to type and hard to read.
+end Note]
+
+<2>< 388> DW_TAG_array_type
+ DW_AT_name .base.
+ DW_AT_type <815>
+ DW_AT_declaration yes(1)
+<3>< 401> DW_TAG_subrange_type
+ DW_AT_lower_bound 0
+ DW_AT_upper_bound 0
+<2>< 405> DW_TAG_pointer_type
+ DW_AT_type <388>
+ DW_AT_byte_size 4
+ DW_AT_address_class 0
+<2>< 412> DW_TAG_structure_type
+ DW_AT_name .flds.
+ DW_AT_byte_size 28
+<3>< 421> DW_TAG_member
+ DW_AT_name el_len
+ DW_AT_type <815>
+ DW_AT_data_member_location DW_OP_consts 0
+<3>< 436> DW_TAG_member
+ DW_AT_name assoc
+ DW_AT_type <841>
+ DW_AT_byte_size 0
+ DW_AT_bit_offset 0
+ DW_AT_bit_size 1
+ DW_AT_data_member_location DW_OP_consts 4
+<3>< 453> DW_TAG_member
+ DW_AT_name ptr_alloc
+ DW_AT_type <841>
+ DW_AT_byte_size 0
+ DW_AT_bit_offset 1
+ DW_AT_bit_size 1
+ DW_AT_data_member_location DW_OP_consts 4
+<3>< 474> DW_TAG_member
+ DW_AT_name p_or_a
+ DW_AT_type <841>
+ DW_AT_byte_size 0
+ DW_AT_bit_offset 2
+ DW_AT_bit_size 2
+ DW_AT_data_member_location DW_OP_consts 4
+<3>< 492> DW_TAG_member
+ DW_AT_name a_contig
+ DW_AT_type <841>
+ DW_AT_byte_size 0
+ DW_AT_bit_offset 4
+ DW_AT_bit_size 1
+ DW_AT_data_member_location DW_OP_consts 4
+<3>< 532> DW_TAG_member
+ DW_AT_name num_dims
+ DW_AT_type <841>
+ DW_AT_byte_size 0
+ DW_AT_bit_offset 29
+ DW_AT_bit_size 3
+ DW_AT_data_member_location DW_OP_consts 8
+<3>< 572> DW_TAG_member
+ DW_AT_name type_code
+ DW_AT_type <841>
+ DW_AT_byte_size 0
+ DW_AT_bit_offset 0
+ DW_AT_bit_size 32
+ DW_AT_data_member_location DW_OP_consts 16
+<3>< 593> DW_TAG_member
+ DW_AT_name orig_base
+ DW_AT_type <841>
+ DW_AT_data_member_location DW_OP_consts 20
+<3>< 611> DW_TAG_member
+ DW_AT_name orig_size
+ DW_AT_type <815>
+ DW_AT_data_member_location DW_OP_consts 24
+<2>< 630> DW_TAG_structure_type
+ DW_AT_name .dope_bnd.
+ DW_AT_byte_size 12
+<3>< 643> DW_TAG_member
+ DW_AT_name lb
+ DW_AT_type <815>
+ DW_AT_data_member_location DW_OP_consts 0
+<3>< 654> DW_TAG_member
+ DW_AT_name ext
+ DW_AT_type <815>
+ DW_AT_data_member_location DW_OP_consts 4
+<3>< 666> DW_TAG_member
+ DW_AT_name str_m
+ DW_AT_type <815>
+ DW_AT_data_member_location DW_OP_consts 8
+<2>< 681> DW_TAG_array_type
+ DW_AT_name .dims.
+ DW_AT_type <630>
+ DW_AT_declaration yes(1)
+<3>< 694> DW_TAG_subrange_type
+ DW_AT_lower_bound 0
+ DW_AT_upper_bound 0
+<2>< 698> DW_TAG_structure_type
+ DW_AT_name .dope.
+ DW_AT_byte_size 44
+<3>< 707> DW_TAG_member
+ DW_AT_name base
+ DW_AT_type <405>
+ DW_AT_data_member_location DW_OP_consts 0
+<3>< 720> DW_TAG_member
+ DW_AT_name .flds
+ DW_AT_type <412>
+ DW_AT_data_member_location DW_OP_consts 4
+<3>< 734> DW_TAG_member
+ DW_AT_name .dims.
+ DW_AT_type <681>
+ DW_AT_data_member_location DW_OP_consts 32
+<2>< 750> DW_TAG_variable
+ DW_AT_type <815>
+ DW_AT_location DW_OP_fbreg -32
+ DW_AT_artificial yes(1)
+<2>< 759> DW_TAG_variable
+ DW_AT_type <815>
+ DW_AT_location DW_OP_fbreg -28
+ DW_AT_artificial yes(1)
+<2>< 768> DW_TAG_variable
+ DW_AT_type <815>
+ DW_AT_location DW_OP_fbreg -24
+ DW_AT_artificial yes(1)
+<2>< 777> DW_TAG_array_type
+ DW_AT_type <815>
+ DW_AT_declaration yes(1)
+<3>< 783> DW_TAG_subrange_type
+ DW_AT_lower_bound <750>
+ DW_AT_count <759>
+ DW_AT_MIPS_stride <768>
+<2>< 797> DW_TAG_variable
+ DW_AT_decl_file 1
+ DW_AT_decl_line 1
+ DW_AT_name ARRAY
+ DW_AT_type <698>
+ DW_AT_location DW_OP_fbreg -64 DW_OP_deref
+<1>< 815> DW_TAG_base_type
+ DW_AT_name INTEGER_4
+ DW_AT_encoding DW_ATE_signed
+ DW_AT_byte_size 4
+<1>< 828> DW_TAG_base_type
+ DW_AT_name INTEGER_8
+ DW_AT_encoding DW_ATE_signed
+ DW_AT_byte_size 8
+<1>< 841> DW_TAG_base_type
+ DW_AT_name INTEGER*4
+ DW_AT_encoding DW_ATE_unsigned
+ DW_AT_byte_size 4
+<1>< 854> DW_TAG_base_type
+ DW_AT_name INTEGER*8
+ DW_AT_encoding DW_ATE_unsigned
+ DW_AT_byte_size 8
+
+.fi
+.H 2 "The dope vector structure details"
+A dope vector is the following C struct, "dopevec.h".
+Not all the fields are of use to a debugger.
+It may be that not all fields will show up
+in the f90 dwarf (since not all are of interest to debuggers).
+.nf
+[Note:
+Need details on the use of each field.
+And need to know which are really 32 bits and which
+are 32 or 64.
+end Note]
+The following
+struct
+is a representation of all the dope vector fields.
+It suppresses irrelevant detail and may not
+exactly match the layout in memory (a debugger must
+examine the dwarf to find the fields, not
+compile this structure into the debugger!).
+.nf
+struct .dope. {
+ void *base; // pointer to data
+ struct .flds. {
+ long el_len; // length of element in bytes?
+ unsigned int assoc:1; //means?
+ unsigned int ptr_alloc:1; //means?
+ unsigned int p_or_a:2; //means?
+ unsigned int a_contig:1; // means?
+ unsigned int num_dims: 3; // 0 thru 7
+ unsigned int type_code:32; //values?
+ unsigned int orig_base; //void *? means?
+ long orig_size; // means?
+ } .flds;
+
+ struct .dope_bnd. {
+ long lb ; // lower bound
+ long ext ; // means?
+ long str_m; // means?
+ } .dims[7];
+}
+.fi
+
+.H 2 "DW_AT_MIPS_assumed_size 0x2011"
+This flag was invented to deal with f90 arrays.
+For example:
+
+.nf
+ pointer (rptr, axx(1))
+ pointer (iptr, ita(*))
+ rptr = malloc (100*8)
+ iptr = malloc (100*4)
+.fi
+
+This flag attribute has the value 'yes' (true, on) if and only if
+the size is unbounded, as iptr is.
+Both may show an explicit upper bound of 1 in the dwarf,
+but this flag notifies the debugger that there is explicitly
+no user-provided size.
+
+So if a user asks for a printout of the rptr allocated
+array, the default will be of a single entry (as
+there is a user slice bound in the source).
+In contrast, there is no explicit upper bound on the iptr
+(ita) array so the default slice will use the current bound
+(a value calculated from the malloc size, see the dope vector).
+
+Given explicit requests, more of rptr(axx) can me shown
+than the default.
+
+.H 1 "Line information and Source Position"
+DWARF does not define the meaning of the term 'source statement'.
+Nor does it define any way to find the first user-written
+executable code in a function.
+.P
+It does define that a source statement has a file name,
+a line number, and a column position (see Sec 6.2, Line Number
+Information of the Dwarf Version 2 document).
+We will call those 3 source coordinates a 'source position'
+in this document. We'll try not to accidentally call the
+source position a 'line number' since that is ambiguous
+as to what it means.
+
+.H 2 "Definition of Statement"
+.P
+A function prolog is a statement.
+.P
+A C, C++, Pascal, or Fortran statement is a statement.
+.P
+Each initialized local variable in C,C++ is a statement
+in that its initialization generates a source position.
+This means that
+ x =3, y=4;
+is two statements.
+.P
+For C, C++:
+The 3 parts a,b,c in for(a;b;c) {d;} are individual statements.
+The condition portion of a while() and do {} while() is
+a statement. (of course d; can be any number of statements)
+.P
+For Fortran, the controlling expression of a DO loop is a statement.
+Is a 'continue' statement in Fortran a DWARF statement?
+.P
+Each function return, whether user coded or generated by the
+compiler, is a statement. This is so one can step over (in
+a debugger) the final user-coded statement
+(exclusive of the return statement if any) in a function
+wile not leaving the function scope.
+.P
+
+.H 2 "Finding The First User Code in a Function"
+
+.nf
+Consider:
+int func(int a)
+{ /* source position 1 */
+ float b = a; /* source position 2 */
+ int x;
+ x = b + 2; /* source position 3 */
+} /* source position 4 */
+.fi
+.P
+The DIE for a function gives the address range of the function,
+including function prolog(s) and epilog(s)
+.P
+Since there is no scope block for the outer user scope of a
+function (and thus no beginning address range for the outer
+user scope: the DWARF committee explicitly rejected the idea
+of having a user scope block)
+it is necessary to use the source position information to find
+the first user-executable statement.
+.P
+This means that the user code for a function must be presumed
+to begin at the code location of the second source position in
+the function address range.
+.P
+If a function has exactly one source position, the function
+presumably consists solely of a return.
+.P
+If a function has exactly two source positions, the function
+may consist of a function prolog and a return or a single user
+statement and a return (there may be no prolog code needed in a
+leaf function). In this case, there is no way to be sure which
+is the first source position of user code, so the rule is to
+presume that the first address is user code.
+.P
+If a function consists of 3 or more source positions, one
+should assume that the first source position is function prolog and
+the second is the first user executable code.
+
+.H 2 "Using debug_frame Information to find first user statement"
+In addition to the line information, the debug_frame information
+can be
+useful in determining the first user source line.
+.P
+Given that a function has more than 1 source position,
+Find the code location of the second source position, then
+examine the debug_frame information to determine if the Canonical
+Frame Address (cfa) is updated before the second source position
+code location.
+If the cfa is updated, then one can be pretty sure that the
+code for the first source position is function prolog code.
+.P
+Similarly, if the cfa is restored in the code for
+a source position, the source position is likely to
+represent a function exit block.
+
+.H 2 "Debugger Use Of Source Position"
+Command line debuggers, such as dbx and gdb, will ordinarily
+want to consider multiple statements on one line to be a single
+statement: doing otherwise is distressing to users since it
+causes a 'step' command to appear to have no effect.
+.P
+An exception for command line debuggers is in determining the
+first user statement: as detailed above, there one wants to
+consider the full source position and will want to consider
+the function return a separate statement. It is difficult to
+make the function return a separate statement 'step' reliably
+however if a function is coded all on one line or if the last
+line of user code before the return is on the same line as the
+return.
+.P
+A graphical debugger has none of these problems if it simply
+highlights the portion of the line being executed. In that
+case, stepping will appear natural even stepping within a
+line.
+.H 1 "Known Bugs"
+Up through at least MIPSpro7.2.1
+the compiler has been emitting form DW_FORM_DATA1,2, or 4
+for DW_AT_const_value in DW_TAG_enumerator.
+And dwarfdump and debuggers have read this with dwarf_formudata()
+or form_sdata() and gotten some values incorrect.
+For example, a value of 128 was printed by debuggers as a negative value.
+Since dwarfdump and the compilers were not written to use the
+value the same way, their output differed.
+For negative enumerator values the compiler has been emitting 32bit values
+in a DW_FORM_DATA4.
+The compiler should probably be emitting a DW_FORM_sdata for
+enumerator values.
+And consumers of enumerator values should then call form_sdata().
+However, right now, debuggers should call form_udata() and only if
+it fails, call form_sdata().
+Anything else will break backward compatibility with
+the objects produced earlier.
+.SK
+.S
+.TC 1 1 4
+.CS