Big.LITTLE Processing Implementations and Current Status

There was a big.LITTLE mini-summit during Linaro Connect Europe 2012, where an update was given on current big.LITTLE implementations, along with the results of power versus performance measurements.

Big.LITTLE Processing Implementations Overview

As briefly mentioned in “Versatile Express TC2 (2xA15, 3xA7) Development Board at ARM Techcon 2012”, there are 2 big.LITTLE implementations:

  • In-kernel switcher (IKS)
    This implementation is already available through Linaro and only required minimal changes to the kernel, as it is mainly an augmentation of DVFS (Dynamic Voltage and Frequency Scaling): instead of only adjusting voltage and frequency depending on the load, it also moves the load to different cores. The main drawback is that this implementation only uses half the cores. For example, on a 2x Cortex A15 / 2x Cortex A7 system, it can only use 2 cores at the same time (either A15 or A7 cores), as work is moved from one type of core to the other depending on the load.
  • Heterogeneous MultiProcessing (HMP)
    This implementation uses all available cores, but it requires major changes to the kernel, and a basic implementation will only be available in Q1 2013, with optimization and upstreaming taking several more months. Instead of using system load, HMP tracks the load of individual tasks and distributes the tasks to the best cores for the job.
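To make the difference concrete, here is a minimal sketch of the IKS idea, assuming each Cortex A15 is paired with one Cortex A7 and the pair is presented to the scheduler as a single virtual CPU. The `VirtualCpu` class, the `SWITCH_LOAD` value and the helper names are hypothetical and only serve the illustration; they are not taken from the Linaro code.

```python
from dataclasses import dataclass

# Hypothetical threshold: load (%) at or above which a pair runs on its A15.
SWITCH_LOAD = 80


@dataclass
class VirtualCpu:
    """One A15/A7 pair, seen by the scheduler as a single CPU (assumption)."""
    active: str = "A7"  # only one core of the pair runs at any time

    def update(self, load_percent: float) -> str:
        # Works like a DVFS governor, except the upper operating points
        # are served by the A15 instead of a higher A7 frequency.
        self.active = "A15" if load_percent >= SWITCH_LOAD else "A7"
        return self.active


def iks_usable_cores(pairs: int) -> int:
    """IKS: one core per A15/A7 pair can run at a time."""
    return pairs


def hmp_usable_cores(pairs: int) -> int:
    """HMP: every core is visible to the scheduler."""
    return 2 * pairs


vcpu = VirtualCpu()
print(vcpu.update(10), vcpu.update(95))           # A7 A15
print(iks_usable_cores(2), hmp_usable_cores(2))   # 2 4
```

On the 2x A15 / 2x A7 example above, this gives 2 usable cores for IKS and 4 for HMP.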

IKS Implementation Power/Performance Evaluation.

For simple tasks such as audio decoding, IKS works perfectly and the task runs entirely on a Cortex A7 core, providing nearly the same power consumption as a single A7 processor, and 70% power savings compared to a Cortex A15 core.

For more complex tasks, such as simultaneously browsing webpages (BBench) and listening to music, IKS provides around 90% of the performance provided by Cortex A15, but consumes 30 to 40% less power.
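Power and performance can be combined into a rough energy figure. The sketch below assumes a fixed workload whose run time scales as 1 / performance, which is a simplification for an interactive benchmark like BBench; the function name `relative_energy` is made up for this example.

```python
def relative_energy(perf_ratio: float, power_ratio: float) -> float:
    """Energy to finish a fixed workload, relative to the A15 baseline.

    Energy = power x time, and time is assumed to scale as 1 / performance.
    """
    return power_ratio / perf_ratio


# Figures from the text: ~90% of A15 performance at 30 to 40% less power.
for saving in (0.30, 0.40):
    energy = relative_energy(perf_ratio=0.90, power_ratio=1.0 - saving)
    print(f"{saving:.0%} less power -> {energy:.0%} of the A15 energy")
# 30% less power -> 78% of the A15 energy
# 40% less power -> 67% of the A15 energy
```

Under that assumption, the 10% performance loss eats into the savings, but IKS would still need roughly a quarter to a third less energy than the A15 for the same work.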

There are two IKS implementations in this chart:

  • The original IKS with one “cut-off” frequency (go_hispeed_load) to switch between Cortex A7 and Cortex A15
  • The IKS_HS2 implementation with one extra “cut-off” frequency (go_hispeed_load2) to limit the frequency on the Cortex A15 core, since power consumption really shoots up over 1 GHz (overdrive).

    In-Kernel Switcher High Speed 2
    The vertical scale represents the power consumption, and the horizontal scale, the CPU frequency for Cortex A7 and A15 cores.
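The two cut-offs can be sketched as a small decision function. This assumes, as the parameter names suggest, that go_hispeed_load and go_hispeed_load2 are load thresholds in percent; every numeric value below is an illustrative guess and not an actual tunable from the Linaro implementation.

```python
from typing import Tuple

# All values are illustrative assumptions, not Linaro's settings.
GO_HISPEED_LOAD = 85         # load (%) at which the pair moves from A7 to A15
GO_HISPEED_LOAD2 = 95        # load (%) above which the A15 may use overdrive
A7_MAX_KHZ = 1_000_000
A15_CAPPED_KHZ = 1_000_000   # stay at or below 1 GHz, out of overdrive
A15_MAX_KHZ = 1_200_000


def pick_operating_point(load_percent: float, hs2: bool = True) -> Tuple[str, int]:
    """Return (core, frequency ceiling in kHz) for the given load."""
    if load_percent < GO_HISPEED_LOAD:
        return ("A7", A7_MAX_KHZ)
    if hs2 and load_percent < GO_HISPEED_LOAD2:
        # IKS_HS2 only: use the A15, but keep it out of the overdrive range.
        return ("A15", A15_CAPPED_KHZ)
    return ("A15", A15_MAX_KHZ)


print(pick_operating_point(50))              # ('A7', 1000000)
print(pick_operating_point(90))              # ('A15', 1000000)
print(pick_operating_point(90, hs2=False))   # ('A15', 1200000)
```

With `hs2=False` the function behaves like the original IKS, where crossing the single cut-off gives access to the full A15 frequency range.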

HMP Implementation Status.

Linaro has also started work on the second implementation. Its experimental HMP implementation treats big and LITTLE CPUs as separate scheduling domains, uses Paul Turner’s (PJT) load-tracking patches to track individual task load, and migrates tasks between the big and LITTLE domains based on that load.
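A very simplified sketch of per-task load tracking and migration might look like this. It uses a decayed running average in which a contribution halves every 32 periods, in the spirit of the load-tracking patches, but the `Task` class and both thresholds are hypothetical and much cruder than what the kernel does.

```python
from dataclasses import dataclass

DECAY = 0.5 ** (1 / 32)   # a past contribution halves after 32 periods
UP_THRESHOLD = 0.70       # hypothetical: move to big above this load
DOWN_THRESHOLD = 0.30     # hypothetical: move back to LITTLE below this load


@dataclass
class Task:
    name: str
    load: float = 0.0     # tracked load in [0, 1]
    domain: str = "LITTLE"

    def tick(self, runnable: bool) -> None:
        """Fold one scheduling period into the decayed average."""
        sample = 1.0 if runnable else 0.0
        self.load = self.load * DECAY + (1.0 - DECAY) * sample

    def rebalance(self) -> str:
        """Migrate with hysteresis so a task does not ping-pong."""
        if self.domain == "LITTLE" and self.load > UP_THRESHOLD:
            self.domain = "big"
        elif self.domain == "big" and self.load < DOWN_THRESHOLD:
            self.domain = "LITTLE"
        return self.domain


browser = Task("browser")
for _ in range(100):       # busy for 100 periods in a row
    browser.tick(True)
print(browser.rebalance())  # big
```

A task that is only runnable for a small fraction of the time, such as audio playback, never builds up enough tracked load to cross the upper threshold and stays in the LITTLE domain.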

MP3 playback power benchmarking found that HMP uses 39.86% of the power this task requires on Cortex A15, compared to only 30.79% for a processor with only Cortex A7 cores.
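To put the two MP3 playback figures side by side, here is a quick calculation. It assumes both percentages are expressed against the same Cortex A15 baseline, which is how I read the benchmark results.

```python
A15_ONLY = 100.0   # baseline: MP3 playback on Cortex A15 (percent)
HMP = 39.86        # HMP, relative to the A15 baseline (from the text)
A7_ONLY = 30.79    # Cortex A7 only, assumed relative to the same baseline

saving_vs_a15 = A15_ONLY - HMP         # percentage points saved against A15
overhead_vs_a7 = HMP / A7_ONLY - 1.0   # fractional extra power against A7

print(f"HMP saves {saving_vs_a15:.2f}% versus A15 only")
print(f"HMP draws {overhead_vs_a7:.1%} more than A7 only")
# HMP saves 60.14% versus A15 only
# HMP draws 29.5% more than A7 only
```

So the experimental HMP implementation already saves about 60% against the A15, but still draws close to 30% more power than an A7-only system for this task.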

Cortex A15 cores consume the extra power. Although there is no user task running on the A15s, unwarranted wake-ups (tick_sched_timer, timers, workqueue…) occur on those cores. This will be resolved by implementing CPU wakeup prioritization in order to pick the “cheapest” CPU. Other improvements to HMP will include global balancing (spreading load to the A7s when the A15s are overloaded) and the implementation of cluster-aware cpufreq governors.

Sources: big.LITTLE_mini-summit_.pdf and big.LITTLE_TC2_update_.pdf

If you want to know more about big.LITTLE implementations, you can read two other technical presentations at LCE12, “Bluesky: What would the ideal power-aware kernel do?” and “Handling big.LITTLE Core and Cluster Shutdowns on ARM”, and follow the work done on the Linaro Big Little Switcher page.
