Facebook BOLT Speeds Up Large x86 & ARM64 Binaries by up to 15%

Facebook BOLT

Compilers like GCC or LLVM normally do a good job of optimizing your code when turning source code into assembly and then into binary format, but there is still room for improvement, at least for larger binaries, and Facebook has just released BOLT (Binary Optimization and Layout Tool), which has been found to reduce CPU execution time by 2 percent to 15 percent. The tool is mostly useful for binaries built from a large code base, with a binary size over 10MB, which are often too large to fit in the instruction cache. The hardware normally spends a lot of processing time getting the instruction stream from memory to the CPU, sometimes up to 30% of execution time, and BOLT optimizes the placement of instructions in memory – as illustrated below – in order to address this issue, also known as “instruction starvation”. BOLT works with applications built by any compiler, including the popular GCC and Clang compilers. The tool relies on Linux …

Support CNX Software – Donate via PayPal or become a Patron on Patreon

ARM TechCon 2013 Schedule – ARM Servers, Internet of Things, Multicore, Hardware and Software Optimization and More

ARM Technology Conference (TechCon) 2013 will take place on October 29 – 31, 2013, in Santa Clara, and the detailed schedule for the event has just been made available. In previous years, the conference was divided into a Chip Design day (1 day), with the other 2 days reserved for Software & System Design, but this does not appear to be the case this year. Whether or not you are able to attend, it is worth looking at what will be discussed there to get a better understanding of the key ARM hardware and software developments expected in the near future. There will be around 90 sessions categorized into 15 tracks: Accelerating Hardware Development – This track explores the resources, tools, and techniques that designers can employ to quickly bring hardware to market. Topics include multicore design, ARM IP, chip buses, analog integration, simulation, FPGA prototyping, design synthesis, debugging …


ACE CoSy Compiler Framework Outperforms LLVM by up to 25% for ARM9 processors

ACE (Associated Compiler Experts) announced that their 2012 CoSy compiler development system delivers better performance than the latest LLVM 3.0 compiler on ARM9 processors. Using an ARM9 processor as reference, the CoSy compiler framework (2012) shows more than 15% performance improvement on the Livermore benchmark loops and 25% on MiBench against LLVM 3.0. The CoSy compiler framework is also more than 7% ahead of LLVM on the EEMBC CoreMark benchmark. The company attributes this to the use of “CoSy’s unique flexible phase ordering of cutting-edge code optimizations and the addition of new CoSy features” such as compile-time code generator feedback injected into optimization algorithms in order to augment realistic and accurate decision making in architecture-independent optimizations. ACE did not publish the benchmark results, and I could not find any reference to “cosy” or “ace” in the EEMBC CoreMark database, so I could not check those results myself. CoSy Compiler Framework is not a compiler, but rather a tool to build compilers …


LLVM (Low Level Virtual Machine) Compiler Infrastructure

The Low Level Virtual Machine (LLVM) is a compiler and toolchain infrastructure, written in C++, designed for compile-time, link-time, run-time, and “idle-time” optimization of programs written in arbitrary programming languages. Originally implemented for C/C++, LLVM is now used with a variety of programming languages such as Python, Ruby and many others. Code in the LLVM project is licensed under the “UIUC” BSD-Style license. LLVM can be used to replace and/or supplement GNU tools such as gcc, g++, gdb, etc… LLVM now consists of a number of different sub-projects including: The LLVM Core libraries provide a source- and target-independent optimizer, along with code generation support for many popular CPUs. These libraries are built around a well-specified code representation known as the LLVM intermediate representation (“LLVM IR”). The LLVM Core libraries are well documented, and it is particularly easy to invent your own language (or port an existing compiler) to use LLVM as an optimizer and code generator. Clang is a …
