Yesterday, Linaro announced the release of the IKS (In-Kernel Switcher) implementation for big.LITTLE processors, which allows the SoC to switch between individual Cortex A7 and Cortex A15 cores to optimize power consumption. Currently, the only consumer device supporting big.LITTLE is the Samsung Galaxy S4, thanks to the Samsung Exynos 5 Octa featuring 4 ARM Cortex A7 and 4 ARM Cortex A15 cores. The IKS implementation can only make use of 4 cores at a time in this processor, since it must choose between A7 and A15 depending on the load. An HMP (Heterogeneous Multi-Processing) implementation is currently being worked on in order to use all 8 cores and distribute tasks to the right core for the job. You can read my previous post for differences between IKS and HMP.
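To make the difference in usable cores concrete, here is a minimal Python sketch of the two models. It is only an illustration and not Linaro's code: the CorePair class, the function names and the 60% load threshold are all hypothetical, chosen to show that IKS exposes one logical CPU per A7/A15 pair while HMP exposes every physical core.

```python
from dataclasses import dataclass

# Hypothetical load threshold (percent) above which a pair runs on its big core.
BIG_THRESHOLD = 60


@dataclass
class CorePair:
    """One logical CPU backed by a LITTLE (A7) core and a big (A15) core."""
    index: int
    active: str = "A7"

    def update(self, load_percent: float) -> str:
        # Only one core of the pair is running at any time.
        self.active = "A15" if load_percent > BIG_THRESHOLD else "A7"
        return self.active


def iks_active_cores(loads):
    """Return the core type each logical CPU runs on under the IKS model."""
    pairs = [CorePair(i) for i in range(len(loads))]
    return [pair.update(load) for pair, load in zip(pairs, loads)]


def hmp_usable_cores(n_pairs):
    """Under the HMP model every physical core is visible to the scheduler."""
    little = [f"A7-{i}" for i in range(n_pairs)]
    big = [f"A15-{i}" for i in range(n_pairs)]
    return little + big


if __name__ == "__main__":
    # Four logical CPUs with different loads: each picks A7 or A15 on its own.
    print(iks_active_cores([10, 85, 40, 95]))
    # Number of cores the scheduler can use with HMP on a 4+4 SoC.
    print(len(hmp_usable_cores(4)))
```

With four pairs, the IKS model never has more than 4 cores running, whereas the HMP model lists all 8.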
Linaro explains that the current source still needs a few more modifications before being upstreamed to mainline. The code was developed for and tested on the VExpress TC2 development platform, so if you want to try it on another big.LITTLE processor, an MCPM backend (Multi-cluster Power Management) and possibly a special cpufreq clock driver are required.
The switcher is comprised of 4 parts:
Low-level power management – Responsible for powering up and down individual CPUs, enabling and disabling caches, power planes, etc.
Switcher core – Handles the switching process itself, including the saving of the execution context for the outbound CPU, migration of interrupts from the outbound CPU to the inbound CPU, and restoration of the execution state on the inbound CPU.
The core cpufreq layer – Standard Linux cpufreq subsystem augmented with a special driver that provides an adaptation layer between that subsystem and the switcher core code described above.
The cpufreq policy governors – These are kernel modules or user space daemons that monitor system activity and request CPU frequency changes from the cpufreq core.
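The sequence handled by the first two parts can be sketched as a small Python simulation. This is a toy model under stated assumptions, not the kernel implementation: the Cpu class, the switch() function and the exact ordering of the power-up and power-down steps are hypothetical, and only the save / migrate interrupts / restore steps come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    """Minimal stand-in for one physical core (hypothetical model, not kernel code)."""
    name: str
    powered: bool = False
    context: dict = field(default_factory=dict)
    irqs: set = field(default_factory=set)


def switch(outbound: Cpu, inbound: Cpu) -> list:
    """Move execution from outbound to inbound and return the ordered steps."""
    steps = []

    # Low-level power management: bring the inbound core up (assumed to happen first).
    inbound.powered = True
    steps.append(f"power up {inbound.name}")

    # Switcher core: save the execution context of the outbound core.
    saved = dict(outbound.context)
    steps.append(f"save context on {outbound.name}")

    # Switcher core: migrate interrupts from the outbound to the inbound core.
    inbound.irqs |= outbound.irqs
    outbound.irqs.clear()
    steps.append(f"migrate IRQs to {inbound.name}")

    # Switcher core: restore the execution state on the inbound core.
    inbound.context = saved
    steps.append(f"restore context on {inbound.name}")

    # Low-level power management: the outbound core can now be powered down.
    outbound.context = {}
    outbound.powered = False
    steps.append(f"power down {outbound.name}")
    return steps


if __name__ == "__main__":
    a7 = Cpu("A7-0", powered=True, context={"pc": 0x8000}, irqs={32, 33})
    a15 = Cpu("A15-0")
    for step in switch(a7, a15):
        print(step)
```

In this model the cpufreq layer would simply be the caller of switch(), triggering it when a governor requests a frequency that belongs to the other core of the pair.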
It seems the most important part of the source code can be found in drivers/cpufreq/arm_big_little.c, arch/arm/common/bL_switcher.c, arch/arm/common/bL_entry.c, and arch/arm/common/bL_head.S.
big.LITTLE code specific to TC2, more precisely for the Dual Cluster System Control Block (DCSCB), can be found in arch/arm/mach-vexpress/dcscb.c and arch/arm/mach-vexpress/dcscb_setup.S. [Update: Based on the comment below, TC2 code is actually located in arch/arm/mach-vexpress/tc2_pm.c, and arch/arm/mach-vexpress/tc2_pm_setup.S.]
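From user space, the cpufreq layer is normally observed through the standard Linux sysfs files rather than through the kernel sources. The short Python sketch below reads the current governor and frequency of one logical CPU. The read_cpufreq() helper and its root parameter are my own naming for illustration, and I am assuming the board exposes the usual scaling_governor and scaling_cur_freq files, which I have not checked on TC2.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs location; overridable so the sketch can be
# tried against a copy of the tree on a machine without cpufreq.
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def read_cpufreq(cpu: int, root=SYSFS_CPU_ROOT) -> dict:
    """Return the governor and current frequency (in kHz) of one logical CPU."""
    base = Path(root) / f"cpu{cpu}" / "cpufreq"
    governor = (base / "scaling_governor").read_text().strip()
    cur_khz = int((base / "scaling_cur_freq").read_text().strip())
    return {"governor": governor, "cur_khz": cur_khz}


if __name__ == "__main__":
    try:
        print(read_cpufreq(0))
    except (OSError, ValueError) as exc:
        # Not every system (or container) exposes cpufreq.
        print(f"cpufreq not available: {exc}")
```

Polling this while loading the system is a simple way to watch a governor move a logical CPU across its frequency range.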
You may want to refer to the following resources for further details about the implementation:
Linaro 13.04 has just been released. It features Linux Kernel 3.9-rc7 and Android 4.2.2.
A lot of work has been done on ARMv8 (Cortex A53) with further work on OpenEmbedded, more testing, and updates to the GCC toolchain. Calxeda EnergyCore server has been added to LAVA, and Origen Quad now gets hardware video acceleration in Android Jelly Bean. More cleanup has also been done on the kernel side with regard to Samsung and ST SoCs, and a big.LITTLE porting guide is now available (Linaro login required).
Here are the highlights of this release:
LAVA
Prototype of a new publishing system is used to overcome performance problems with android-build.linaro.org.
Calxeda EnergyCore support is merged in LAVA, and an isolated system has been set up for web benchmarking.
Fedora support is merged in LAVA. A user can submit LAVA jobs using a Fedora pre-built image.
Boot commands are untangled from LAVA dispatcher. They’re now read from images.
Nexus implementation in LAVA is generalized and reusable for other devices that support fastboot/adb.
Linaro Image Tools 2013.04 released
More xml-rpc APIs available in lava dashboard to make life easy.
Test Definitions now available in lava dashboard. Accumulation of meta-data via test definitions started.
Builds and Baselines
Dalvik VM unit test has been automated.
Accelerated video playback support on Android was added for Origen Quad.
Skia and 0xbench tests were added into lava-test-shell.
Test suite builder has been set up to provide test suite binaries as an overlay for Android.
Toolchain integration
Native toolchain on Android updated to 4.8.
Binaries for GCC 4.7 and 4.8 based toolchains released.
Android tree updated to compile with 4.8 based toolchains, all related changes upstreamed.
ARMv7 KVM enabled kernels and images are daily built and tested in Linaro’s CI loop.
OpenEmbedded ARMv8 engineering build provides 64bit HipHop VM requirements for porting and optimization purposes.
a patch to enable perf in Android by Bernhard Rosenkraenzer
vexpress64 support (both RTSM and Foundation model)
panda-fix-usb topic to make USB and on-board Ethernet work on Panda with Device Tree enabled (cherry-pick / forward port of some of the dev.omapzoom.org commits)
a few fixes for MMC on Snowball from STE Landing Team
patch from ARM LT to fix lockups/crashes seen when enabling function tracer on TC2 with the not yet mainlined cpuidle driver
New or updated packages available from Linaro’s Overlay PPA: edk2-uefi, gator-daemon, gator, openssl and powerdebug.
Graphics
dma-buf – (upstream) debugfs support released, accepted for 3.10
AV playback bringup on Jelly Bean on Origen Quad complete and released to android team
kernel – (upstream) patches to add a common FIMD device node for all Exynos4 SoCs.
piglit – gles2-all and android support released via git. opencl-arm established with fixes specifically for ARM
audio – Channel swap for panda has been fixed in 3.9 and 3.8 as well.
opencl – Proof of Concept SNU CPU only OpenCL lib investigated and ported to armhf, available via git
Kernel
Refactor EHCI controller code
Depopulate the Exynos <mach-exynos/include-mach> directory
Expanded Binder Unit Test – Implement base ioctl unit tests
Depopulate the ux500 and plat-nomadik <mach/*> and <plat/*>
Improve eMMC Power Management Support – Merge patches to add a skeleton for doing background operations at idle time, based on runtime PM
Port some of the simpler platforms to multi-platform support
integrator: get to a state where DT is working fully as a prerequisite
pci: fix PCI device tree problems when resolving IRQs
thermal: Powertop Integration – Add basic RFC patch and send for review
Linaro PM QA 0.4.1 2013.04 released. Fixed in this release:
Linaro Powerdebug 0.6.3 2013-04 released. Fixed in this release:
QA
Tests to cover big.LITTLE cluster init and shutdown have been added to the big.LITTLE core test suite
big.LITTLE extended test case scenarios have been implemented.
Functional and regression tests for scheduler from ARM have been integrated, covering HMP patches.
Toolchain
Linaro GCC 4.8 2013.04 released, based off the latest GCC 4.8.0+svn197294 release.
Initial optimized support for Cortex-A53 for arm*-*-* targets.
Improved support for new ARMv8-A instructions for arm*-*-* and aarch64*-*-* targets.
Backport of optimizations concerning whether to use Neon for 64-bit bitops for arm*-*-* targets.
Linaro GCC 4.7 2013.04 released, based off the latest GCC 4.7.2+svn197188 release.
Includes arm/aarch64-4.7-branch up to svn revision 196381.
Backport vectorizer cost model.
Turn off 64-bit Bitops in Neon.
Linaro GCC 4.6 2013.04 released, based off the latest GCC 4.6.3+svn197511 release. It’s the last monthly release of 4.6 series.
Linaro Toolchain Binaries 2013.04 released, updated to Linaro GCC 4.7 2013.04 and Linaro GCC 4.8 2013.04
LEG
Linaro UEFI 2013.04 released – bugfix
OpenSSL optimisations
SCT (Self-Certification Test) is now running without any crashes.
ACPI topic branch is being prepared for inclusion into linux-linaro tree.
Visit https://wiki.linaro.org/Cycles/1304/Release for a list of known issues and further release details about the LEB, Android, Kernel, Graphics, Landing Team, Platform, Power management and Toolchain (GCC / Qemu) components.
Linaro 13.03 is now available, and features Linux Kernel 3.9-rc3 and Android 4.2.2.
This month, Linaro has released their first Origen Quad Android image, as well as a Tiny Android build for Arndale. The ALIP image listed in the download page is still based on Ubuntu 12.11, but as doc Bormental noticed earlier this month, the latest ALIP Quantal 13.03 image is available for download at https://releases.linaro.org/latest/ubuntu/quantal-images/alip. Some development tools (gcc, g++, vi, make...) are now included in Android, so you can develop and build natively on your Android device. Linaro has kept on cleaning the Linux kernel ARM tree for Exynos and ST Ericsson SoCs. More work has been done on big.LITTLE for both IKS and HMP, as well as ARMv8 OpenEmbedded, and an initial GRUB port on ARM UEFI is now available.
Here are the highlights of this release:
Automation and Validation
A simple CLI tool for communicating with the CI dashboard has been developed
LAVA supports Arndale booting with UEFI. The bootloader configuration is done on the fly
Snowballs coming back online
Builds and Baselines
linux-linaro-arndale bring-up, with a Tiny Android build for Arndale set up and Android with GUI planned for the next cycle.
Origen-Quad Member build
Native Toolchain on Android
The toolchain is now available natively inside Linaro Android builds.
The builds now include gcc, g++, vim, make, a terminal emulator and a vi-friendly keyboard.
Restructure release toolchain – Released toolchains are checked into a prebuilts/ git repository and pulled in by the manifest as opposed to being downloaded as separate tarballs. This is the approach taken by AOSP to distribute the toolchain.
CTS Support in LAVA
CTS support for 4.2 reworked for better stability.
CTS has been enabled for the Engineering builds.
Investigations to be done next cycle on tests that are not getting executed.
Linux Linaro 3.9 2013.03 released
based off linux-linaro-core-tracking tree, llct-20130321.0 tag:
updated Versatile Express patches from ARM LT
updated arndale/exynos patches from Samsung LT
a patch to enable perf in Android by Bernhard Rosenkraenzer
vexpress64 support (both RTSM and Foundation model)
panda-fix-usb topic to make USB and on-board Ethernet work on Panda with Device Tree enabled (cherry-pick / forward port of some of the dev.omapzoom.org commits)
a few fixes for MMC on Snowball from STE Landing Team
Enable 64bit HipHop VM development in OpenEmbedded
Improve Ubuntu engineering build CI loop
ARMv7 KVM CI Bringup
Merge ARMv8 support into OpenEmbedded
CI bring up: Calxeda EnergyCore ECX-1000 (highbank)
Added hwpack configurations for ECX-1000 (highbank)
Set up CI job for ECX-1000 (highbank) hwpack daily builds
Adapt core LAVA tests from Ubuntu/Android to OpenEmbedded engineering build
Graphics
upstream: Version 10 of CMA-ION patches released by Benjamin Gaignard. ION is a new memory allocator for Android. CMA stands for Contiguous Memory Allocator.
upstream: Android piglit enablement patches for OpenGL ES 2 updated and released by Tom Gall
upstream: Version 1 of variable-index-* shader-tests extended for Android and Linux released by Tom Gall
upstream: Version 1 of debugfs support for dma-buf released by Sumit Semwal
upstream: Version 9 of DRM FIMD DT support for Exynos4 DT machine released by Vikas Sajjan
Kernel
Depopulate the Exynos <mach-exynos/include-mach> directory
Convert UX500 to common clk
Refactor EHCI controller code – Separated ehci_tegra host controller driver from ehci-hcd into its own driver
Depopulate the ux500 and plat-nomadik <mach/*> and <plat/*>
Android alarm-dev compat_ioctl support
Android keyreset driver upstreaming
Improve eMMC Power Management Support
Android Sync infrastructure Upstreaming
Power Management
Dynamic timer irq affinity: set up the timer irq affinity to the cpu concerned by the first timer expiration – This patch was upstreamed.
cpufreq driver for IKS has been optimized
Analysis of HMP scheduler optimizations using bbench and their applicability to A15 SMP systems is completed: No performance regressions were seen.
sched: modified timer and workqueue framework to allow migration to non-idle cpus
Powerdebug is ported to Android platform and available in builds
Thermal manager: Powertop Integration.
Toolchain
Linaro GCC 4.7 2013.03 released, based off the latest GCC 4.7.2+svn195745 release
Linaro GCC 4.6 2013.03 released, based off the latest GCC 4.6.3+svn196247 release
Linaro QEMU 2013.03 released, based off upstream (trunk) QEMU. This release has been updated to be based on upstream’s recent 1.4.0 release. It also includes ARM KVM support patches which are in sync with the ABI as committed to the upstream Linux kernel for 3.9. This feature is still under development but will no longer be subject to kernel-vs-userspace ABI breaks.
Visit https://wiki.linaro.org/Cycles/1303/Release for a list of known issues and further release details about the LEB, Android, Kernel, Graphics, Landing Team, Platform, Power management and Toolchain (GCC / Qemu) components.
Renesas announced a new automotive SoC called the R-Car H2 that features 4 Cortex-A15 cores together with 4 (optional) Cortex A7 cores in big.LITTLE configuration, as well as an Imagination PowerVR Series6 G6400 GPU. This SoC can optionally come with Renesas SH-4A, a real-time processing CPU core acting as a multimedia engine (MME), and Renesas’ IMP-X4 core, a real-time image processing unit that enables developers to implement augmented reality applications such as 360-degree camera views and image recognition.
This Renesas processor is a multimedia powerhouse, as it can handle 4x 1080p video en/decoding, including Blu-Ray support at 60 frames per second, as well as image/voice recognition and high-resolution 3D graphics with virtually no CPU usage.
Renesas R-Car H2 Block Diagram
Here are R-Car H2’s specifications provided on the Renesas website:
Quad serial peripheral interface (QSPI) × 1 ch (for boot)
Clock-synchronized serial interface (MSIOF) × 4 ch (SPI/IIS)
Ethernet controller (IEEE802.3u, RMII, without PHY)
Interrupt controller (INTC)
Clock generator (CPG) with built-in PLL
On chip debugger interface
Low power mode
Dynamic Power Shutdown (CPU core, 3D, IMP)
AVS and DVFS function
DDR-SDRAM power supply backup mode
Package
831 pin Flip Chip BGA (27 mm × 27 mm)
For development, Renesas provides ICE for ARM CPU, as well as an evaluation board including car information system-oriented peripheral circuits. The platform supports QNX Neutrino RTOS, Windows Embedded Automotive, and Linux.
Renesas R-Car H2 samples are available now, and mass production is scheduled for mid-2014. More information is available on the Renesas R-Car H2 page.
Linaro 13.02 is now available, and features Linux Kernel 3.8 and Android 4.2.2.
The biggest news this month is probably the first release of a preliminary ARM64 Debian/Ubuntu Raring image. Other notable items include work on ARMv7 KVM, more improvements to the OpenEmbedded ARMv8 implementation, as well as the big.LITTLE MP implementation, and some modifications to the toolchain for Cortex A7 support. Origen images are not available for download this month, and there are still no ALIP images, as they have disappeared since Linaro upgraded to Ubuntu Quantal.
Here are the highlights of this release:
Android
AOSP master build for Galaxy Nexus has been set up
All the platforms have been updated to 4.2.2
Support for lava-test-shell has been added to linaro-android-build-tools.
Developer Platform
CI bring up: ARMv7 KVM – Add Arndale hypervisor patch to u-boot-linaro.
CI bring up: Arndale – Add Arndale image reports to LAVA, Enable and verify UEFI support in the hwpack.
Linux Linaro 3.8 2013.02 released
OpenEmbedded based SDK is able to build HipHopVM
OpenEmbedded ARMv8 build has been updated
ARM64 Debian/Ubuntu (Raring) port image available.
Several new packages available from Linaro’s Overlay PPA – acpica-unix, acpi-abat, fwts, libhugetlbfs and numactl
Support LEG engineering to ramp-up on LAVA
Infrastructure
OpenEmbedded CI builds now use persistent builders, which drastically reduces the build time.
Kernel
AB8500 driver has been updated with pinctrl patches
Ux500 now uses sparse IRQs
Depopulate the Exynos <mach-exynos/include-mach> directory – Convert all users of gpio to pinctrl and remove gpio.h for Exynos4
Depopulate the ux500 and plat-nomadik <mach/*> and <plat/*>
Research impact on kernel size for multi-platform configs
Android keyreset driver upstreaming
android upstreaming: Lowmem
Improve eMMC Power Management Support
Refactor EHCI controller code
Android alarm-dev compat_ioctl support
Power Management
Small task packing by scheduler (Power-aware scheduler) – Implement or update patches based on HMP and upstream workshop
Integration tree to bring together big.LITTLE MP related work (V15 branch of big LITTLE MP tree)
DVFS for the Common Clock Framework
Cpufreq cleanups with a view to more consolidation and simpler drivers
Port Adaptive NOHZ patchset to ARM
Update devfreq core
cpuidle: Tracks all miscellaneous changes to upstream cpuidle
Refactor the acpi cpuidle driver
Linaro Powerdebug 0.6.2-2013.02 released
Toolchain
Linaro GCC 4.7-2013.02-01
Linaro GCC 4.6 2013.02
Linaro Toolchain Binaries 2013.02 released
Backport Cortex-A7 support to -mcpu=native
Backport improvements for Cortex-A7
Backport AArch64 patches from Cavium
LAVA
Linaro CI jobs are converted to lava-test-shell
Galaxy Nexus device is deployed into LAVA
Calxeda and TC2-Hackbox servers are deployed in the lab
Support ARM engineering to deploy LAVA in-house
ARM Energy Probe deployed in LAVA lab
SSD Added to Calxeda server for hadoop testing
LEG
Linaro UEFI 2013.02 released with bug fixes for Arndale board.
ARM support merged in libhugetlbfs next branch (package available from Linaro’s Overlay PPA)
Visit https://wiki.linaro.org/Cycles/1302/Release for a list of known issues and further release details about the LEB, Android, Kernel, Graphics, Landing Team, Platform, Power management and Toolchain (GCC / Qemu) components.
The Samsung Exynos 5 Octa processor is getting some competition with the announcement of the Renesas APE6 processor at MWC 2013. This SoC comes with the same big.LITTLE configuration (4x Cortex A15, 4x Cortex A7), but with a PowerVR Series 6 ‘Rogue’ GPU, which, I assume, should outperform the PowerVR SGX544MP3 GPU used in the Samsung SoC.
R-Mobile APE6 could be the fastest mobile processor announced to date, and is currently showcased at Imagination Technologies booth at MWC 2013 running several OpenGL ES 3.0 applications, as well as a demonstration of Rightware’s Kanzi Studio, “a PC-based real-time WYSIWYG editor for designers and embedded engineers to create and customize embedded 3D user interface”. There’s very little information about the processor, so that’s basically all I have for now.
But before I conclude, I’ll just drop a performance comparison chart between different PowerVR SGX series (source), since it’s the first mobile processor that I know of that has been announced with PowerVR Series 6 GPU. (LG H13 is actually the first, but it targets “home entertainment”, not the mobile space)
With over 30 GFlops @ 300 MHz, G6100 appears to outperform any other GPU used in ARM SoCs at this frequency. G6100 is the low-end version of the Series 6 family, and we don’t know exactly which one will be used in the APE6 SoC.
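As a quick sanity check on that figure, the per-clock throughput implied by 30 GFlops at 300 MHz can be worked out in a few lines of Python. The function names are mine, and the linear scaling with clock speed is a simplifying assumption: real GPUs are also limited by memory bandwidth, and the source only says "over 30 GFlops", so treat the results as lower bounds.

```python
def flops_per_cycle(gflops: float, mhz: float) -> float:
    """Floating-point operations per clock cycle implied by a GFlops rating."""
    return (gflops * 1e9) / (mhz * 1e6)


def scaled_gflops(gflops: float, mhz: float, new_mhz: float) -> float:
    """Estimate GFlops at another clock, assuming purely linear scaling."""
    return gflops * new_mhz / mhz


if __name__ == "__main__":
    # 30 GFlops at 300 MHz works out to 100 operations per cycle.
    print(flops_per_cycle(30, 300))
    # Hypothetical 600 MHz clock, used only to illustrate the scaling.
    print(scaled_gflops(30, 300, 600))
```

So the quoted number corresponds to at least 100 floating-point operations per clock, which is a convenient figure for comparing GPUs running at different frequencies.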
ST Ericsson already showcased the NovaThor L8580 Cortex A9 processor @ 2.8 GHz at CES 2013. The processor features a technology called eQuad, and as I previously noted, there’s no mention of the number of cores at all on the website, and many websites reported that the processor featured 4 cores. The processor actually features 2 Cortex A9 cores, but thanks to FD-SOI technology they are able to do the equivalent of a quad core big.LITTLE processor (i.e. 2x Cortex A15 + 2x Cortex A7) electrically.
Power Consumption of L8580 @ 1 GHz / 0.65 V using FD-SOI vs Similar Platform with Bulk CMOS technology.
Since this is just done electrically, you can use the same software as before while consuming much less power, whereas big.LITTLE requires a lot of kernel work. A Cortex A9 will obviously be less powerful than a Cortex A15, but since they are able to boost the frequency up to 3 GHz (probably limited to 2.5 GHz in actual products), this can compensate for the lower performance, and the die size will be much smaller than on a quad core big.LITTLE processor, since you only need 2 cores for similar power consumption and performance. Charbax uploaded a video with a nice explanation of the system at MWC2013. The first part of the video is very interesting as it explains the technology and its advantages. One particularly interesting part is when Charbax asks about big.LITTLE, and the ST Ericsson representative explains that big.LITTLE processing is not so interesting now that FD-SOI is available, and it will also end up in Cortex A15 processors. He did not say FD-SOI eQuad was a big.LITTLE killer, but it was close. He may be right about the IKS big.LITTLE implementation, but with HMP big.LITTLE, where all cores can be used, big.LITTLE will certainly retain its advantage. What do you think?
Update: If you want to know more about eQuad, you can read the FD-SOI eQuad white paper (15-page PDF), which explains the technology and also notes that heterogeneous multiprocessing (HMP) is a promising technology, but that it is highly complex both in terms of hardware and software, and may take a few years to reach its full potential.