Status of Embedded GPU Ecosystem – Linux/Mesa Upstream Support (ELC 2018 Video)

The Embedded Linux Conference is ongoing, and the Linux Foundation has been uploading videos of the talks to YouTube in a timely manner. I checked out the RISC-V keynote yesterday, but today I’ve watched a talk by Robert Foss (his real name, not related to FOSS) from Collabora entitled “Progress in the Embedded GPU Ecosystem”, where he discusses open source software support in Linux/Mesa, both from companies and through reverse-engineering. The first part deals with the history of embedded GPU support, especially when it comes to company support. Intel was the first and offers very good support for their drivers, followed by AMD, which is also a good citizen. NVIDIA has the Nouveau driver but did not really back it up, and Tegra support is apparently sponsored by an aircraft supplier. Other companies have been slower to help, but Qualcomm has made progress since 2015 and now supports all their hardware, […]

Vulkan 1.1 and SPIR-V 1.3 Specifications Released

The Khronos Group released Vulkan 1.0 specifications in 2015 as a successor of OpenGL ES, compatible with GPUs capable of OpenGL ES 3.1 or greater, and using fewer CPU resources thanks to, for instance, better use of multi-core processors with support for multiple command buffers that can be created in parallel. A year later, we saw Vulkan efficiency in a demo, and since then most vendors have implemented a Vulkan driver for their compatible hardware across multiple operating systems, including Imagination Technologies, which recently released Vulkan drivers for Linux. The Khronos Group has now released Vulkan 1.1 and the associated SPIR-V 1.3 language specifications. New functionalities in Vulkan 1.1: Protected Content – restrict access to or copying from resources used for rendering and display, for secure playback and display of protected multimedia content; Subgroup Operations – efficient mechanisms that enable parallel shader invocations to communicate, with a wide variety of parallel computation models supported […]

Arm Introduces Mali-G52 & Mali-G31 GPUs, Mali-D51 Display Processor, and Mali-V52 Video Processor for Mainstream Devices

Arm has just announced four multimedia Mali IP blocks to be found in SoCs for mainstream devices: the Mali-G52 GPU, with 30% faster performance than Mali-G51 and 3.6x better machine learning performance; the Mali-G31 GPU, which is 20% smaller and 20% more efficient than Mali-G51, and supports OpenGL ES 3.2 and Vulkan APIs; the Mali-D51 display processor, with 30% power saving and 50% lower latency compared to Mali-DP650; and the Mali-V52 video processor, supporting 4K60/4K120 content. Mali-G52 GPU: Arm may have introduced Project Trillium for object detection and machine learning a few weeks ago, but the solution is better suited to premium devices, so the company’s Mali-G52 Bifrost mainstream GPU aims to fill the void for mid-range devices with up to 3.6 times faster machine learning capability over Mali-G51. Based on the first illustration, Mali-G52 will probably be coupled with DynamIQ Cortex A75/A55 processors. Other benefits of the new GPU include 30% more performance density, and 15% better energy […]

Arm Kigen Puts SIM Card Functionality Right into IoT SoCs

Most cellular devices rely on SIM cards, and while the card has become smaller with micro and nano SIM, the electronics and connection remain the same. More recently, we’ve started to see boards featuring eSIM (embedded SIM), a chip soldered directly to the board that’s remotely provisioned by a local mobile network operator, so no card is needed anymore, and the chip should not be tied to a single operator, so you can change plan whenever you want. But soon we won’t be able to find out if a board supports cellular connectivity by looking for a SIM card slot or eSIM chip (although the antenna(s) will still be there, and give a pretty obvious clue), as Arm has now unveiled Kigen, which integrates SIM identity directly into SoCs for the Internet of Things. This will enable what the company (the industry?) calls integrated SIMs (iSIM) combining an […]

Windows 10 on Arm Limitations

Windows 10 on Arm was first demonstrated at Computex 2017 on a reference platform based on the Qualcomm Snapdragon 835 processor. At the time, I thought the expected 2-in-1 hybrid laptops based on the solution might bring cheaper Windows 10 devices that work just like their x86 competitors, and offer longer battery life. Several models including HP Envy x2 (2017) and ASUS NovaGo TP370 were then announced at the end of 2017, and prices were quite a bit higher than most expected, with pricing for the ASUS model starting at $600 with 4GB RAM and 64GB storage. But to be fair, Snapdragon 835 is used in premium LTE smartphones like Xiaomi Mi 6 that sell for around $400 and up. At least, we are still promised good battery life of over 20 hours of continuous use (e.g. playing a Full HD video). When the laptops were announced, I read several blogs and news […]

Human Readable Decoding of /proc/cpuinfo for Arm Processors

One of the most common ways to get CPU information is to check the content of /proc/cpuinfo. For example, this is the output I get from running the command on NanoPi NEO (Allwinner H3) board:

Many fields are self-explanatory, but what about CPU implementer and CPU part numbers? Those are values stored in Arm’s CPUID Base Register: 0x41 identifies Arm as the implementer, while 0xc07 refers to Cortex A7. But I had to look it up to find out. One solution would be to decode those values in the kernel, but the developers won’t do that, probably because it may break user-space programs that rely on the hexadecimal values. So instead, Riku Voipio decided to write and submit a patch for the lscpu program found in the util-linux package. The patch has been merged, so the new ID mapping feature should be supported in util-linux 2.32 and greater. In the meantime, […]
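
The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the lscpu patch itself: the function name decode_cpuinfo, the two deliberately tiny lookup tables, and the sample text are all assumptions made for this example (the sample is a hand-written two-line stand-in, not the actual NanoPi NEO output).

```python
# Minimal sketch: map "CPU implementer" / "CPU part" hex values to names.
# The tables are intentionally tiny and non-exhaustive (assumption: only a
# few well-known Arm IDs are listed); unknown values stay in hexadecimal.
IMPLEMENTERS = {0x41: "ARM"}
ARM_PARTS = {
    0xC07: "Cortex-A7",
    0xC09: "Cortex-A9",
    0xD03: "Cortex-A53",
}

# Hypothetical sample in the "key : value" layout used by /proc/cpuinfo on Arm.
SAMPLE = "CPU implementer\t: 0x41\nCPU part\t: 0xc07\n"


def decode_cpuinfo(text):
    """Return a sorted list of unique (implementer, part) name pairs."""
    found = set()
    implementer = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without a "key : value" pair
        key, value = key.strip(), value.strip()
        if key == "CPU implementer":
            implementer = int(value, 16)
        elif key == "CPU part" and implementer is not None:
            part = int(value, 16)
            impl_name = IMPLEMENTERS.get(implementer, hex(implementer))
            # Part numbers are only meaningful per implementer, so the Arm
            # table is consulted only when the implementer is 0x41.
            if implementer == 0x41:
                part_name = ARM_PARTS.get(part, hex(part))
            else:
                part_name = hex(part)
            found.add((impl_name, part_name))
    return sorted(found)


if __name__ == "__main__":
    for impl_name, part_name in decode_cpuinfo(SAMPLE):
        print(impl_name, part_name)
```

On a real Arm board, the same function could be fed the actual file content, e.g. decode_cpuinfo(open("/proc/cpuinfo").read()). Values missing from the tables are returned unchanged in hexadecimal rather than guessed.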

Arm’s Project Trillium Combines Machine Learning and Object Detection Processors with Neural Network Software

We’ve already seen Neural Processing Units (NPU) added to Arm processors such as Huawei Kirin 970 or Rockchip RK3399Pro in order to handle the tasks required by machine learning & artificial intelligence in a faster or more power efficient way. Arm has now announced their Project Trillium offering two A.I. processors, with one ML (Machine Learning) processor and one OD (Object Detection) processor, as well as open source Arm NN (Neural Network) software to leverage the ML processor, as well as Arm CPUs and GPUs. Arm ML processor key features and performance: fixed function engine for the best performance & efficiency for current solutions; programmable layer engine for futureproofing the design; tuned for advanced geometry implementations; on-board memory to reduce external memory traffic; performance / efficiency – 4.6 TOPs with an efficiency of 3 TOPs/W for mobile devices and smart IP cameras; scalable design usable for lower requirements IoT (20 […]

Linux 4.15 Release – Main Changes, Arm and MIPS Architectures

Linus Torvalds released Linux 4.15 last Sunday: After a release cycle that was unusual in so many (bad) ways, this last week was really pleasant. Quiet and small, and no last-minute panics, just small fixes for various issues. I never got a feeling that I’d need to extend things by yet another week, and 4.15 looks fine to me. Half the changes in the last week were misc driver stuff (gpu, input, networking) with the other half being a mix of networking, core kernel and arch updates (mainly x86). But all of it is tiny. So at least we had one good week. This obviously was not a pleasant release cycle, with the whole meltdown/spectre thing coming in in the middle of the cycle and not really gelling with our normal release cycle. The extra two weeks were obviously mainly due to that whole timing issue. Also, it is […]