Arm Announces Cortex-A78 CPU, Mali-G78 GPU, Ethos-N78 NPU and Custom Cortex-X1 Core

Arm has just announced its 2020 Arm Mobile IP portfolio with no fewer than five IP blocks: the Arm Cortex-A78 CPU, Arm Mali-G78 and G68 GPUs, the Arm Ethos-N78 neural processing unit, and the custom Cortex-X program starting with Cortex-X1, the most powerful Arm core to date.

Arm Cortex-A78 CPU

Cortex-A78 highlights:

- Architecture – Armv8-A (Harvard)
- Extensions – Armv8.1, Armv8.2, Cryptography, and RAS; Armv8.3 (LDAPR instructions only)
- ISA support – A64, A32, and T32 (at EL0 only)
- Microarchitecture
  - Pipeline – Out of order
  - Superscalar
  - Neon / Floating Point Unit
  - Optional Cryptography Unit
  - Max number of CPUs in cluster – 4
  - Physical Addressing (PA) – 40-bit
- Memory system and external interfaces
  - 32KB to 64KB L1 I-Cache / D-Cache
  - 256KB to 512KB L2 Cache
  - Optional 512KB to 4MB L3 Cache
  - ECC and LPAE support
- TrustZone security

Cortex-A78 delivers 20% extra performance compared to Cortex-A77 at the same power budget (one Watt), but peak […]
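As a quick check on the 40-bit physical addressing figure in the Cortex-A78 highlights, the short Python sketch below computes how much memory such an address width can reach. The function name `addressable_bytes` is a hypothetical helper introduced only for this illustration; it is not taken from any Arm document.

```python
# Size of a physical address space for a given address width.
# PA_BITS comes from the Cortex-A78 highlights; the helper name is
# an assumption made for this illustration.
PA_BITS = 40


def addressable_bytes(bits: int) -> int:
    """Return the number of distinct byte addresses for `bits` address bits."""
    return 1 << bits


size = addressable_bytes(PA_BITS)
# 2**40 bytes is exactly 1 TiB.
print(f"{PA_BITS}-bit PA -> {size} bytes = {size // 2**40} TiB")
```

A 40-bit physical address therefore covers 2^40 bytes, or 1 TiB.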

ICube MVP SoCs Combine CPU and GPU into a Single Unified Processing Unit (UPU)

ICube is a fabless semiconductor company developing SoCs featuring a Unified Processing Unit (UPU) that takes care of the tasks usually handled by a separate CPU and GPU on typical SoCs. The UPUs are based on the MVP (Multi-thread Virtual Pipeline) instruction set architecture, and are themselves called MVP cores. The company now has two SoCs based on UPU MVP cores: IC3128 and IC3228. IC3128 is a single-core / 4-thread SoC, and IC3228 is a dual MVP core / 8-thread SoC.

Let’s have a look at IC3228 technical specifications:

- CPU function
  - 4-way simultaneous multi-threading (SMT) in each core
  - Symmetric multi-processing (SMP), dual MVP cores
  - 64KB I-cache, 64KB D-cache and 64KB local memory in each core, 256KB shared L2 cache
  - Homogeneous parallel programs
  - Support Pthread, OpenMP
- GPU function
  - Data parallel, Task parallel, and/or Function parallel computing
  - Multi-standard media processor
  - Programmable unified shader
  - Support OpenGL ES 2.0
  - 70 million triangles / sec, […]
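To illustrate what a "homogeneous parallel program" across the IC3228's 2 cores × 4 threads might look like, here is a minimal Python sketch that splits one job into identical chunks, one per hardware thread. It is not ICube code and does not use the MVP toolchain, Pthread or OpenMP; the names `HW_THREADS`, `partial_sum` and `parallel_sum_of_squares` are hypothetical, and Python's standard `concurrent.futures` thread pool stands in for a native threading API.

```python
from concurrent.futures import ThreadPoolExecutor

# Figures from the IC3228 specifications: dual MVP cores, 4-way SMT each.
CORES = 2
THREADS_PER_CORE = 4
HW_THREADS = CORES * THREADS_PER_CORE  # 8 hardware threads in total


def partial_sum(bounds):
    """Every worker runs the same code on its own slice (homogeneous work)."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))


def parallel_sum_of_squares(n, workers=HW_THREADS):
    """Sum i*i for i in [0, n) by splitting the range across `workers` threads."""
    if n <= 0:
        return 0
    step = -(-n // workers)  # ceiling division so every index is covered
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))


if __name__ == "__main__":
    print(parallel_sum_of_squares(1000))
```

Note that in CPython the global interpreter lock keeps CPU-bound threads from running truly in parallel, so this only shows the structure of the decomposition; a C program using Pthread or OpenMP would be the natural fit on real hardware.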

Embedded Linux Optimization Techniques – ELCE 2011

Benjamin Zores of Alcatel-Lucent describes different optimization techniques (focusing on hardware choice and software architecture) that can be used to improve the performance of embedded Linux, in a talk given at Embedded Linux Conference Europe 2011.

Abstract: This presentation provides a series of techniques that can be used for Linux embedded systems fine-grain tuning and performance optimization. Embedded systems are, by definition, always limited in terms of resources while people keep on trying to use desktop-oriented software on top of them. This talk presents a series of tips that can be used to actually measure, find and isolate bottlenecks in your system, whether it is by complete system profiling or software architecture optimization. Focus is also placed on the traditional caveats that need to be avoided for your system not to be slow by design.

You can also download the presentation slides.

Jean-Luc Aufranc (CNXSoft): Jean-Luc started CNX Software in 2010 as a part-time endeavor, before […]