GCC 4.8 Release Series -- Changes, New Features, and Fixes ========================================================== Caveats ======= GCC now uses C++ as its implementation language. This means that to build GCC from sources, you will need a C++ compiler that understands C++ 2003. For more details on the rationale and specific changes, please refer to the C++ conversion page. To enable the Graphite framework for loop optimizations you now need CLooG version 0.18.0 and ISL version 0.11.1. Both can be obtained from the GCC infrastructure directory. The installation manual contains more information about requirements to build GCC. GCC now uses a more aggressive analysis to derive an upper bound for the number of iterations of loops using constraints imposed by language standards. This may cause non-conforming programs to no longer work as expected, such as SPEC CPU 2006 464.h264ref and 416.gamess. A new option, -fno-aggressive-loop- optimizations, was added to disable this aggressive analysis. In some loops that have known constant number of iterations, but undefined behavior is known to occur in the loop before reaching or during the last iteration, GCC will warn about the undefined behavior in the loop instead of deriving lower upper bound of the number of iterations for the loop. The warning can be disabled with -Wno-aggressive-loop-optimizations. On ARM, a bug has been fixed in GCC's implementation of the AAPCS rules for the layout of vectors that could lead to wrong code being generated. Vectors larger than 8 bytes in size are now by default aligned to an 8-byte boundary. This is an ABI change: code that makes explicit use of vector types may be incompatible with binary objects built with older versions of GCC. Auto- vectorized code is not affected by this change. On AVR, support has been removed for the command-line option -mshort-calls deprecated in GCC 4.7. On AVR, the configure option --with-avrlibc supported since GCC 4.7.2 is turned on per default for all non-RTEMS configurations. This option arranges for a better integration of AVR_Libc with avr-gcc. For technical details, see PR54461. To turn off the option in non-RTEMS configurations, use --with- avrlibc=no. If the compiler is configured for RTEMS, the option is always turned off. More information on porting to GCC 4.8 from previous versions of GCC can be found in the porting guide (http://gcc.gnu.org/gcc-4.8/porting_to.html) for this release. General Optimizer Improvements (and Changes) ============================================ - DWARF4 is now the default when generating DWARF debug information. When -g is used on a platform that uses DWARF debugging information, GCC will now default to -gdwarf-4 -fno-debug-types-section. GDB 7.5, Valgrind 3.8.0 and elfutils 0.154 debug information consumers support DWARF4 by default. Before GCC 4.8 the default version used was DWARF2. To make GCC 4.8 generate an older DWARF version use -g together with -gdwarf-2 or -gdwarf-3. The default for Darwin and VxWorks is still -gdwarf-2 -gstrict-dwarf. - A new general optimization level, -Og, has been introduced. It addresses the need for fast compilation and a superior debugging experience while providing a reasonable level of runtime performance. Overall experience for development should be better than the default optimization level -O0. - A new option -ftree-partial-pre was added to control the partial redundancy elimination (PRE) optimization. This option is enabled by default at the -O3 optimization level, and it makes PRE more aggressive. - The option -fconserve-space has been removed; it was no longer useful on most targets since GCC supports putting variables into BSS without making them common. - The struct reorg and matrix reorg optimizations (command-line-options -fipa-struct-reorg and -fipa-matrix-reorg) have been -removed. They did not always work correctly, nor did they work -with link-time optimization (LTO), hence were only applicable to -programs consisting of a single translation unit. - Several scalability bottle-necks have been removed from GCC's optimization passes. Compilation of extremely large functions, e.g. due to the use of the flatten attribute in the "Eigen" C++ linear algebra templates library, is significantly faster than previous releases of GCC. - Link-time optimization (LTO) improvements: - LTO partitioning has been rewritten for better reliability and maintanibility. Several important bugs leading to link failures have been fixed. - Interprocedural optimization improvements: - A new symbol table has been implemented. It builds on existing callgraph and varpool modules and provide a new API. Unusual symbol visibilities and aliases are handled more consistently leading to, for example, more aggressive unreachable code removal with LTO. - The inline heuristic can now bypass limits on the size of of inlined functions when the inlining is particularly profitable. This happens, for example, when loop bounds or array strides get propagated. - Values passed through aggregates (either by value or reference) are now propagated at the inter-procedural level leading to better inlining decisions (for example in the case of Fortran array descriptors) and devirtualization. - AddressSanitizer, a fast memory error detector, has been added and can be enabled via -fsanitize=address. Memory access instructions will be instrumented to detect heap-, stack-, and global-buffer overflow as well as use-after-free bugs. To get nicer stacktraces, use -fno-omit-frame-pointer. The AddressSanitizer is available on IA-32/x86-64/x32/PowerPC/PowerPC64 GNU/ Linux and on x86-64 Darwin. - ThreadSanitizer has been added and can be enabled via -fsanitize=thread. Instructions will be instrumented to detect data races. The ThreadSanitizer is available on x86-64 GNU/Linux. New Languages and Language specific improvements ================================================ C family -------- - Each diagnostic emitted now includes the original source line and a caret '^' indicating the column. The option -fno-diagnostics-show-caret suppresses this information. - The option -ftrack-macro-expansion=2 is now enabled by default. This allows the compiler to display the macro expansion stack in diagnostics. Combined with the caret information, an example diagnostic showing these two features is: t.c:1:94: error: invalid operands to binary < (have ‘struct mystruct’ and ‘float’) #define MYMAX(A,B) __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a < __b ? __b : __a; }) ^ t.c:7:7: note: in expansion of macro 'MYMAX' X = MYMAX(P, F); ^ - A new -Wsizeof-pointer-memaccess warning has been added (also enabled by -Wall) to warn about suspicious length parameters to certain string and memory built-in functions if the argument uses sizeof. This warning warns e.g. about memset (ptr, 0, sizeof (ptr)); if ptr is not an array, but a pointer, and suggests a possible fix, or about memcpy (&foo, ptr, sizeof (&foo));. - The new option -Wpedantic is an alias for -pedantic, which is now deprecated. The forms -Wno-pedantic, -Werror=pedantic, and -Wno- error=pedantic work in the same way as for any other -W option. One caveat is that -Werror=pedantic is not equivalent to -pedantic-errors, since the latter makes into errors some warnings that are not controlled by -Wpedantic, and the former only affects diagnostics that are disabled when using -Wno-pedantic. - The option -Wshadow no longer warns if a declaration shadows a function declaration, unless the former declares a function or pointer to function, because this is a_common_and_valid_case_in_real-world_code. C++ --- - G++ now implements the C++11 thread_local keyword; this differs from the GNU __thread keyword primarily in that it allows dynamic initialization and destruction semantics. Unfortunately, this support requires a run-time penalty for references to non-function-local thread_local variables defined in a different translation unit even if they don't need dynamic initialization, so users may want to continue to use __thread for TLS variables with static initialization semantics. If the programmer can be sure that no use of the variable in a non-defining TU needs to trigger dynamic initialization (either because the variable is statically initialized, or a use of the variable in the defining TU will be executed before any uses in another TU), they can avoid this overhead with the -fno-extern-tls-init option. OpenMP threadprivate variables now also support dynamic initialization and destruction by the same mechanism. - G++ now implements the C++11 attribute syntax, e.g. [[noreturn]] void f(); and also the alignment specifier, e.g. alignas(double) int i; - G++ now implements C++11 inheriting constructors, e.g. struct A { A(int); }; struct B: A { using A::A; }; // defines B::B(int) B b(42); // OK - As of GCC 4.8.1, G++ implements the change to decltype semantics from N3276. struct A f(); decltype(f()) g(); // OK, return type of f() is not required to be complete. - G++ now supports a -std=c++1y option for experimentation with features proposed for the next revision of the standard, expected around 2017. Currently the only difference from -std=c++11 is support for return type deduction in normal functions, as proposed in N3386. - The G++ namespace association extension, __attribute ((strong)), has been deprecated. Inline namespaces should be used instead. - G++ now supports a -fext-numeric-literal option to control whether GNU numeric literal suffixes are accepted as extensions or processed as C++11 user-defined numeric literal suffixes. The flag is on (use suffixes for GNU literals) by default for -std=gnu++*, and -std=c++98. The flag is off (use suffixes for user-defined literals) by default for -std=c++11 and later. Runtime Library (libstdc++) --------------------------- - Improved_experimental_support_for_the_new_ISO_C++_standard,_C++11, including: - forward_list meets the allocator-aware container requirements; - this_thread::sleep_for(), this_thread::sleep_until() and this_thread:: yield() are defined without requiring the configure option --enable-libstdcxx-time; - Improvements to : - SSE optimized normal_distribution. - Use of hardware RNG instruction for random_device on new x86 processors (requires the assembler to support the instruction.) and : - New random number engine simd_fast_mersenne_twister_engine with an optimized SSE implementation. - New random number distributions beta_distribution, normal_mv_distribution, rice_distribution, nakagami_distribution, pareto_distribution, k_distribution, arcsine_distribution, hoyt_distribution. - Added --disable-libstdcxx-verbose configure option to disable diagnostic messages issued when a process terminates abnormally. This may be useful for embedded systems to reduce the size of executables that link statically to the library. Fortran ------- - Compatibility notice: - Module files: The version of module files (.mod) has been incremented. Fortran MODULEs compiled by earlier GCC versions have to be recompiled, when they are USEd by files compiled with GCC 4.8. GCC 4.8 is not able to read .mod files created by earlier versions; attempting to do so gives an error message. Note: The ABI of the produced assembler data itself has not changed; object files and libraries are fully compatible with older versions except as noted below. - ABI: Some internal names (used in the assembler/object file) have changed for symbols declared in the specification part of a module. If an affected module - or a file using it via use association - is recompiled, the module and all files which directly use such symbols have to be recompiled as well. This change only affects the following kind of module symbols: - Procedure pointers. Note: C-interoperable function pointers (type (c_funptr)) are not affected nor are procedure-pointer components. - Deferred-length character strings. - The BACKTRACE intrinsic subroutine has been added. It shows a backtrace at an arbitrary place in user code; program execution continues normally afterwards. - The -Wc-binding-type warning option has been added (disabled by default). It warns if the a variable might not be C interoperable; in particular, if the variable has been declared using an intrinsic type with default kind instead of using a kind parameter defined for C interoperability in the intrinsic ISO_C_Binding module. Before, this warning was always printed. The -Wc-binding-type option is enabled by -Wall. - The -Wrealloc-lhs and -Wrealloc-lhs-all warning command-line options have been added, which diagnose when code to is inserted for automatic (re)allocation of a variable during assignment. This option can be used to decide whether it is safe to use -fno-realloc-lhs. Additionally, it can be used to find automatic (re)allocation in hot loops. (For arrays, replacing var= by var(:)= disables the automatic reallocation.) - The -Wcompare-reals command-line option has been added. When this is set, warnings are issued when comparing REAL or COMPLEX types for equality and inequality; consider replacing a == b by abs(a-b) < eps with a suitable eps. -Wcompare-reals is enabled by -Wextra. - The -Wtarget-lifetime command-line option has been added (enabled with -Wall), which warns if the pointer in a pointer assignment might outlive its target. - Reading floating point numbers which use q for the exponential (such as 4.0q0) is now supported as vendor extension for better compatibility with old data files. It is strongly recommended to use for I/O the equivalent but standard conforming e (such as 4.0e0). (For Fortran source code, consider replacing the q in floating-point literals by a kind parameter (e.g. 4.0e0_qp with a suitable qp). Note that - in Fortran source code - replacing q by a simple e is not equivalent.) - The GFORTRAN_TMPDIR environment variable for specifying a non-default directory for files opened with STATUS="SCRATCH", is not used anymore. Instead gfortran checks the POSIX/GNU standard TMPDIR environment variable. If TMPDIR is not defined, gfortran falls back to other methods to determine the directory for temporary files as documented in the user_manual. - Fortran_2003: - Support for unlimited polymorphic variables (CLASS(*)) has been added. Nonconstant character lengths are not yet supported. - TS_29113: - Assumed types (TYPE(*)) are now supported. - Experimental support for assumed-rank arrays (dimension(..)) has been added. Note that currently gfortran's own array descriptor is used, which is different from the one defined in TS29113, see gfortran's_header_file or use the Chasm_Language_Interoperability_Tools. Go -- - GCC 4.8.0 implements a preliminary version of the upcoming Go 1.1 release. The library support is not quite complete, due to release timing. - Go has been tested on GNU/Linux and Solaris platforms for various processors including x86, x86_64, PowerPC, SPARC, and Alpha. It may work on other platforms as well. New Targets and Target Specific Improvements ============================================ AArch64 ------- - A new port has been added to support AArch64, the new 64-bit architecture from ARM. Note that this is a separate port from the existing 32-bit ARM port. - The port provides initial support for the Cortex-A53 and the Cortex-A57 processors with the command line options -mcpu=cortex-a53 and -mcpu=cortex-a57. ARM --- - Initial support has been added for the AArch32 extensions defined in the ARMv8 architecture. - Code generation improvements for the Cortex-A7 and Cortex-A15 CPUs. - A new option, -mcpu=marvell-pj4, has been added to generate code for the Marvell PJ4 processor. - The compiler can now automatically generate the VFMA, VFMS, REVSH and REV16 instructions. - A new vectorizer cost model for Advanced SIMD configurations to improve the auto-vectorization strategies used. - The scheduler now takes into account the number of live registers to reduce the amount of spilling that can occur. This should improve code performance in large functions. The limit can be removed by using the option -fno-sched- pressure. - Improvements have been made to the Marvell iWMMX code generation and support for the iWMMX2 SIMD unit has been added. The option -mcpu=iwmmxt2 can be used to enable code generation for the latter. - A number of code generation improvements for Thumb2 to reduce code size when compiling for the M-profile processors. - The RTEMS (arm-rtems) port has been updated to use the EABI. - Code generation support for the old FPA and Maverick floating-point architectures has been removed. Ports that previously relied on these features have also been removed. This includes the targets: - arm*-*-linux-gnu (use arm*-*-linux-gnueabi) - arm*-*-elf (use arm*-*-eabi) - arm*-*-uclinux* (use arm*-*-uclinux*eabi) - arm*-*-ecos-elf (no alternative) - arm*-*-freebsd (no alternative) - arm*-wince-pe* (no alternative). AVR --- - Support for the "Embedded C" fixed-point has been added. For details, see the GCC_wiki and the user_manual. The support is not complete. - A new print modifier %r for register operands in inline assembler is supported. It will print the raw register number without the register prefix 'r': /* Return the most significant byte of 'val', a 64-bit value. */ unsigned char msb (long long val) { unsigned char c; __asm__ ("mov %0, %r1+7" : "=r" (c) : "r" (val)); return c; } The inline assembler in this example will generate code like mov r24, 8+7 provided c is allocated to R24 and val is allocated to R8...R15. This works because the GNU assembler accepts plain register numbers without register prefix. - Static initializers with 3-byte symbols are supported now: extern const __memx char foo; const __memx void *pfoo = &foo; This requires at least Binutils 2.23. IA-32/x86-64 ------------ - Allow -mpreferred-stack-boundary=3 for the x86-64 architecture with SSE extensions disabled. Since the x86-64 ABI requires 16 byte stack alignment, this is ABI incompatible and intended to be used in controlled environments where stack space is an important limitation. This option will lead to wrong code when functions compiled with 16 byte stack alignment (such as functions from a standard library) are called with misaligned stack. In this case, SSE instructions may lead to misaligned memory access traps. In addition, variable arguments will be handled incorrectly for 16 byte aligned objects (including x87 long double and __int128), leading to wrong results. You must build all modules with -mpreferred-stack-boundary=3, including any libraries. This includes the system libraries and startup modules. - Support for the new Intel processor codename Broadwell with RDSEED, ADCX, ADOX, PREFETCHW is available through -madx, -mprfchw, -mrdseed command-line options. - Support for the Intel RTM and HLE intrinsics, built-in functions and code generation is available via -mrtm and -mhle. - Support for the Intel FXSR, XSAVE and XSAVEOPT instruction sets. Intrinsics and built-in functions are available via -mfxsr, -mxsave and -mxsaveopt respectively. - New -maddress-mode=[short|long] options for x32. -maddress-mode=short overrides default 64-bit addresses to 32-bit by emitting the 0x67 address- size override prefix. This is the default address mode for x32. - New built-in functions to detect run-time CPU type and ISA: - A built-in function __builtin_cpu_is has been added to detect if the run- time CPU is of a particular type. It returns a positive integer on a match and zero otherwise. It accepts one string literal argument, the CPU name. For example, __builtin_cpu_is("westmere") returns a positive integer if the run-time CPU is an Intel Core i7 Westmere processor. Please refer to the user_manual for the list of valid CPU names recognized. - A built-in function __builtin_cpu_supports has been added to detect if the run-time CPU supports a particular ISA feature. It returns a positive integer on a match and zero otherwise. It accepts one string literal argument, the ISA feature. For example, __builtin_cpu_supports("ssse3") returns a positive integer if the run-time CPU supports SSSE3 instructions. Please refer to the user_manual for the list of valid ISA names recognized. Caveat: If these built-in functions are called before any static constructors are invoked, like during IFUNC initialization, then the CPU detection initialization must be explicitly run using this newly provided built-in function, __builtin_cpu_init. The initialization needs to be done only once. For example, this is how the invocation would look like inside an IFUNC initializer: static void (*some_ifunc_resolver(void))(void) { __builtin_cpu_init(); if (__builtin_cpu_is("amdfam10h") ... if (__builtin_cpu_supports("popcnt") ... } - Function Multiversioning Support with G++: It is now possible to create multiple function versions each targeting a specific processor and/or ISA. Function versions have the same signature but different target attributes. For example, here is a program with function versions: __attribute__ ((target ("default"))) int foo(void) { return 1; } __attribute__ ((target ("sse4.2"))) int foo(void) { return 2; } int main (void) { int (*p) = &foo; assert ((*p)() == foo()); return 0; } Please refer to this wiki for more information. - The x86 backend has been improved to allow option -fschedule-insns to work reliably. This option can be used to schedule instructions better and leads to improved performace in certain cases. - Windows MinGW-w64 targets (*-w64-mingw*) require at least r5437 from the Mingw-w64 trunk. - Support for new AMD family 15h processors (Steamroller core) is now available through the -march=bdver3 and -mtune=bdver3 options. - Support for new AMD family 16h processors (Jaguar core) is now available through the -march=btver2 and -mtune=btver2 options. FRV --- - This target now supports the -fstack-usage command-line option. MIPS ---- - GCC can now generate code specifically for the R4700, Broadcom XLP and MIPS 34kn processors. The associated -march options are -march=r4700, -march=xlp and -march=34kn respectively. - GCC now generates better DSP code for MIPS 74k cores thanks to further scheduling optimizations. - The MIPS port now supports the -fstack-check option. - GCC now passes the -mmcu and -mno-mcu options to the assembler. - Previous versions of GCC would silently accept -fpic and -fPIC for -mno-abicalls targets like mips*-elf. This combination was not intended or supported, and did not generate position-independent code. GCC 4.8 now reports an error when this combination is used. PowerPC / PowerPC64 / RS6000 ---------------------------- - SVR4 configurations (GNU/Linux, FreeBSD, NetBSD) no longer save, restore or update the VRSAVE register by default. The respective operating systems manage the VRSAVE register directly. - Large TOC support has been added for AIX through the command line option -mcmodel=large. - Native Thread-Local Storage support has been added for AIX. - VMX (Altivec) and VSX instruction sets now are enabled implicitly when targetting processors that support those hardware features on AIX 6.1 and above. RX -- - This target will now issue a warning message whenever multiple fast interrupt handlers are found in the same cpmpilation unit. This feature can be turned off by the new -mno-warn-multiple-fast-interrupts command-line option. S/390, System z --------------- - Support for the IBM zEnterprise zEC12 processor has been added. When using the -march=zEC12 option, the compiler will generate code making use of the following new instructions: - load and trap instructions - 2 new compare and trap instructions - rotate and insert selected bits - without CC clobber The -mtune=zEC12 option enables zEC12 specific instruction scheduling without making use of new instructions. - Register pressure sensitive instruction scheduling is enabled by default. - The ifunc function attribute is enabled by default. - memcpy and memcmp invokations on big memory chunks or with run time lengths are not generated inline anymore when tuning for z10 or higher. The purpose is to make use of the IFUNC optimized versions in Glibc. SH -- - The default alignment settings have been reduced to be less aggressive. This results in more compact code for optimization levels other than -Os. - Improved support for the __atomic built-in functions: - A new option -matomic-model=model selects the model for the generated atomic sequences. The following models are supported: soft-gusa Software gUSA sequences (SH3* and SH4* only). On SH4A targets this will now also partially utilize the movco.l and movli.l instructions. This is the default when the target is sh3*-*-linux* or sh4*-*-linux*. hard-llcs Hardware movco.l / movli.l sequences (SH4A only). soft-tcb Software thread control block sequences. soft-imask Software interrupt flipping sequences (privileged mode only). This is the default when the target is sh1*-*-linux* or sh2*-*-linux*. none Generates function calls to the respective __atomic built-in functions. This is the default for SH64 targets or when the target is not sh*-*-linux*. - The option -msoft-atomic has been deprecated. It is now an alias for -matomic-model=soft-gusa. - A new option -mtas makes the compiler generate the tas.b instruction for the __atomic_test_and_set built-in function regardless of the selected atomic model. - The __sync functions in libgcc now reflect the selected atomic model when building the toolchain. - Added support for the mov.b and mov.w instructions with displacement addressing. - Added support for the SH2A instructions movu.b and movu.w. - Various improvements to code generated for integer arithmetic. - Improvements to conditional branches and code that involves the T bit. A new option -mzdcbranch tells the compiler to favor zero-displacement branches. This is enabled by default for SH4* targets. - The pref instruction will now be emitted by the __builtin_prefetch built-in function for SH3* targets. - The fmac instruction will now be emitted by the fmaf standard function and the __builtin_fmaf built-in function. - The -mfused-madd option has been deprecated in favor of the machine- independent -ffp-contract option. Notice that the fmac instruction will now be generated by default for expressions like a * b + c. This is due to the compiler default setting -ffp-contract=fast. - Added new options -mfsrra and -mfsca to allow the compiler using the fsrra and fsca instructions on targets other than SH4A (where they are already enabled by default). - Added support for the __builtin_bswap32 built-in function. It is now expanded as a sequence of swap.b and swap.w instructions instead of a library function call. - The behavior of the -mieee option has been fixed and the negative form -mno-ieee has been added to control the IEEE conformance of floating point comparisons. By default -mieee is now enabled and the option -ffinite-math- only implicitly sets -mno-ieee. - Added support for the built-in functions __builtin_thread_pointer and __builtin_set_thread_pointer. This assumes that GBR is used to hold the thread pointer of the current thread. Memory loads and stores relative to the address returned by __builtin_thread_pointer will now also utilize GBR based displacement address modes. SPARC ----- - Added optimized instruction scheduling for Niagara4. TILE-Gx ------- - Added support for the -mcmodel=MODEL command-line option. The models supported are small and large. V850 ---- - This target now supports the E3V5 architecture via the use of the new -mv850e3v5 command-line option. It also has experimental support for the e3v5 LOOP instruction which can be enabled via the new -mloop command-line option. XStormy16 --------- - This target now supports the -fstack-usage command-line option. Operating Systems ================= Windows (Cygwin) ---------------- - Executables are now linked against shared libgcc by default. The previous default was to link statically, which can still be done by explicitly specifying -static or -static-libgcc on the command line. However it is strongly advised against, as it will cause problems for any application that makes use of DLLs compiled by GCC. It should be alright for a monolithic stand-alone application that only links against the Windows OS DLLs, but offers little or no benefit. For questions related to the use of GCC, please consult these web pages and the GCC manuals. If that fails, the gcc-help@gcc.gnu.org mailing list might help. Please send comments on these web pages and the development of GCC to our developer list at gcc@gcc.gnu.org. All of our lists have public archives. Copyright (C) Free Software Foundation, Inc. Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved. These pages are maintained by the GCC team. Last modified 2013-03-23.