Archive

Posts Tagged ‘armv8’

Nvidia Provides More Details About Parker Automotive SoC with ARMv8 Cores, Pascal GPU

August 23rd, 2016 9 comments

Nvidia demonstrated its DRIVE PX 2 platform for self-driving cars at CES 2016, but did not give many details about the SoC used in the board. Today, the company has finally provided more information about Parker, a hexa-core SoC combining two Denver 2 cores and four Cortex A57 cores with a 256-core Pascal GPU.

Nvidia Parker SoC specifications:

  • CPU – 2x Denver 2 ARMv8 cores, and 4x ARM Cortex A57 cores with 2MB + 2MB L2 cache, coherent HMP architecture (meaning all 6 cores can work at the same time)
  • GPU – Nvidia Pascal GeForce GPU with 256 CUDA cores supporting DirectX 12, OpenGL 4.5, Nvidia CUDA 8.0, OpenGL ES 3.1, AEP, and Vulkan + 2D graphics engine
  • Memory – 128-bit LPDDR4 with ECC
  • Display – Triple display pipeline, each at up to 4K 60fps.
  • VPU – 4K60 H.265 and VP9 hardware video decoder and encoder
  • Others:
    • Gigabit Ethernet MAC
    • Dual-CAN (controller area network)
    • Audio engine
    • Security & safety engines including a dual-lockstep processor for reliable fault detection and processing
    • Image processor
  • Compliance with the ISO 26262 functional safety standard for electrical and electronic (E/E) systems
  • Process – 16nm FinFET

PX Drive 2 Board with two Parker SoCs

Parker is said to deliver up to 1.5 teraflops (native FP16 processing) of performance for “deep learning-based self-driving AI cockpit systems”.
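
As a rough sanity check on the 1.5 teraflops figure, the sketch below works backwards to the GPU clock it would imply. The per-core factors (a fused multiply-add counted as two operations, and two packed FP16 values per core per cycle) are assumptions about how such peak numbers are commonly quoted, not figures given in the announcement.

```python
# Back-of-the-envelope check of the quoted 1.5 TFLOPS (FP16) figure.
CUDA_CORES = 256           # from the specifications above
PEAK_FP16_FLOPS = 1.5e12   # "up to 1.5 teraflops" from the announcement

# Assumptions (not stated in the announcement):
FLOPS_PER_FMA = 2          # one fused multiply-add counted as 2 operations
FP16_PER_CORE = 2          # two packed FP16 values per core per cycle


def implied_clock_hz(peak_flops: float, cores: int) -> float:
    """Return the GPU clock that would be needed to reach peak_flops."""
    flops_per_cycle = cores * FLOPS_PER_FMA * FP16_PER_CORE
    return peak_flops / flops_per_cycle


if __name__ == "__main__":
    clock = implied_clock_hz(PEAK_FP16_FLOPS, CUDA_CORES)
    print(f"Implied GPU clock: {clock / 1e9:.2f} GHz")
```

If either assumption is off, the implied clock changes proportionally, so treat the result as an order-of-magnitude check only.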

This type of board and processor is normally only available to car and parts manufacturers, and the company claims that more than 80 carmakers, tier 1 suppliers and university research centers are now using DRIVE PX 2 systems to develop autonomous vehicles. That means the platform should find its way into cars, trucks and buses soon, including some 100 Volvo XC90 SUVs that are part of an autonomous-car pilot program in Sweden slated to start next year.

Allwinner A64 based Pine A64 and Banana Pi M64 Boards Can Now Run Windows 10 IoT Core

August 4th, 2016 8 comments

Windows IoT is a version of Windows 10 that’s optimized for smaller devices with or without a display, and was first released for Raspberry Pi 2 and MinnowBoard MAX. Since then a few more boards have become officially supported, including DragonBoard 410c and Raspberry Pi 3. But there have been some recent developments, as two Allwinner A64 64-bit ARM boards are now supported according to two wiki entries (here and there) explaining how to run a simple C# sample on Windows 10 IoT Core on either Banana Pi M64 or Pine A64 boards.

The guide shows how to configure Azure IoT Hub, register the IoT device, and build and deploy Azure IoT SDK on the board.

But basically, if all you want is to run Windows 10 IoT Core on either board, you’ll need to download either:

  • Windows 10 IoT Core for Banana Pi M64: Windows10IoT_BPI-M64.ffu (Link removed as Microsoft does not allow redistribution of ffu for now, despite the link being available directly on github without SLA)
  • Windows 10 IoT Core for Pine A64/A64+: Windows10IoT_Pine64.ffu (Link removed as Microsoft does not allow redistribution of ffu for now, despite the link being available directly on github without SLA)

Then install and run IoT Dashboard on a Windows computer, select the Setup new device tab, then Customize, and load the FFU firmware file to flash it to an 8GB micro SD card. Once it’s done, insert the micro SD card into the board, and it should run Windows 10 IoT Core at next boot.

Windows 10 IoT Core has also been ported to a few other Intel based embedded computers, as well as the Toradex Colibri T30 Tegra 3 system-on-module.

[Update: Allwinner has uploaded a video showing Pine A64 with Windows 10 IoT Core (Video removed, as Microsoft does not like that video being published together with the press release. Maybe because it shows they’ve yet to implement Ethernet….)]

Via Bird on SMEoT Facebook Group

Embedded Linux Conference & IoT Summit Europe 2016 Schedule

August 2nd, 2016 4 comments

Embedded Linux Conference & IoT Summit 2016 first took place in the US in April, but the events are now also scheduled in Europe on October 11 – 13 in Berlin, Germany, and the schedule has now been published. Even if you are not going to attend, it’s always interesting to find out more about the topics covered at this type of event, so I had a look, and created my own virtual schedule with some of the sessions.

Tuesday, October 11

  • 10:40 – 11:30 – JerryScript: An Ultra-lightweight JavaScript Engine for the Internet of Things – Tilmann Scheller, Samsung Electronics

JerryScript is a lightweight JavaScript engine designed to bring the success of JavaScript to small IoT devices like lamps, thermometers, switches and sensors. This class of devices tends to use resource-constrained microcontrollers which are too small to fit a large JavaScript engine like V8 or JavaScriptCore.

JerryScript is heavily optimized for low memory consumption and runs on platforms with less than 64KB of RAM and less than 200KB of flash memory. Despite the low footprint, JerryScript is a full-featured JavaScript engine implementing the entire ECMAScript 5.1 standard. It is actively used in production and runs already on hundreds of thousands of smartwatches!

JerryScript is an open source project and has been released under the Apache License 2.0. The talk will include a demo showing JavaScript code executing on top of JerryScript on a resource-constrained microcontroller.

  • 11:40 – 12:30 – Read-only rootfs: Theory and Practice – Chris Simmonds, 2net

Configuring the rootfs to be read-only makes embedded systems more robust and reduces the wear on flash storage. In addition, by removing all state from the rootfs it becomes easier to implement system image updates and factory reset.

In this presentation, Chris shows how to identify components that need to store some state, and to split it into volatile state that is needed only until the device shuts down and non-volatile state that is required permanently. He gives examples and shows various techniques of mapping writes onto volatile or non-volatile storage. To show how this works in practice, he uses a standard Yocto Project build and shows what changes you have to make to achieve a real-world embedded system with read-only rootfs. In the last section, Chris considers the implications for software image update. Expect a live demonstration.

  • 14:00 – 14:50 – Comparison of Linux Software Update Technologies – Matt Porter, Konsulko

The update of software in an embedded Linux system has always been an important part of any product. In the past, however, planning and design for software update was often an afterthought in system design. Further, software update mechanisms for embedded Linux products were typically implemented as ad hoc one-off projects within each product company. As the requirements for products have matured to include security updates at frequent intervals, software update strategy has become a focal point of product development. This session will explore a number of different Linux software update technologies that are FOSS projects, comparing each for their strengths and weaknesses. In order to better understand the applicability of these technologies, we will also deep dive into both common and uncommon use cases that drive requirements for these software update mechanisms.

  • 15:00 – 15:50 – Building a Micro HTTP Server for Embedded System – Jian-Hong Pan

Apache HTTP Server and NGINX are among the best-known web servers in the world, and more and more web server frameworks keep following, such as Node.js or Bottle for Python. All of them give us the ability to reach the resources behind the web server. However, because of their size and portability constraints, they may not be directly portable to embedded systems with restricted resources. Therefore, we need to re-implement an HTTP server to fulfill that requirement.

Jian-Hong will introduce how he used the convenience of Python to implement a Micro HTTP Server prototype according to RFC 2616/HTTP 1.1. He then rewrote the code in C to build the Micro HTTP Server and ran automated tests with the Python unit testing framework. Finally, he’ll explain how he combined the Micro HTTP Server with an RTOS, and lit the LEDs on an STM32F4-Discovery board.
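
To give an idea of what such a Python prototype might look like, here is a minimal sketch that parses an HTTP/1.1 request and builds a response. It is not the code from the talk: the function names and the `/led/on` path are hypothetical, and only a small part of RFC 2616 is covered (no chunked encoding, no header folding).

```python
# Minimal sketch of an HTTP/1.1 request parser and response builder.
# All names are hypothetical; this is not the code presented in the talk.

def parse_request(raw: bytes) -> dict:
    """Split a raw HTTP request into method, path, version, headers, body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "path": path, "version": version,
            "headers": headers, "body": body}


def build_response(status: int, reason: str, body: bytes) -> bytes:
    """Build a complete HTTP/1.1 response with a Content-Length header."""
    head = (f"HTTP/1.1 {status} {reason}\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n\r\n")
    return head.encode("ascii") + body


if __name__ == "__main__":
    request = parse_request(b"GET /led/on HTTP/1.1\r\nHost: board\r\n\r\n")
    print(request["method"], request["path"])
    print(build_response(200, "OK", b"LED on\n"))
```

Keeping parsing and response building as pure functions on byte strings is one way to make the logic easy to unit test before porting it to C.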

  • 16:10 – 17:00 – Stuck in 2009 – How I Survived – Will Sheppard, Embedded Bits Limited

When developing Linux based products it’s desirable to use the latest version of the Linux kernel – however this is not always possible. In this presentation Will Sheppard will enlighten you with his experiences in developing a product based on a 2.6.28 kernel. Throughout the presentation he will share with you the reasons why you can be stuck with an old kernel, the issues this causes and the surprising and unexpected benefits that also arise. The presentation will also give you an indication as to how far the kernel has developed since 2009 and perhaps some hope if you too are also stuck working in the past.

  • 17:10 – 18:00 – Power Management Challenges in IoT and How Zephyr RTOS Meets Them – Ramesh Thomas, Intel

An OS that runs on tiny IoT devices is already meeting several challenges. These challenges are due to the limited resources in these devices and the diverse nature of the applications and the ecosystem. These same reasons make adding an effective power management infrastructure extremely complex. These devices that run on tiny batteries for extensive periods, mostly unattended, have a very critical need to conserve power.

Zephyr is an RTOS from Intel, designed for IoT and wearable devices. It is open source and supports x86, ARM and ARC SoC platforms. It has a small footprint and can run with very little memory. Power management is built into the core of its scheduling and idling design. It exports infrastructure for PM services to implement custom power policies.

This presentation will give an insight into the Zephyr power management design and the philosophies behind it.

  • 18:10 – 19:00 – BoF: Linux Device Performance Framework – Michael Turquette, BayLibre

Complex system-on-chip processors provide performance levels for their devices and peripherals. The same chips also provide interconnects with performance knobs connecting these devices. For years, Linux has not provided a way to express the relationship between a device and its performance states, nor a uniform method for drivers to change these states. There are many solutions to this in downstream vendor trees. Let’s fix that.

The purpose of this BoF is to start a discussion around the topic with a wide audience, solicit feedback on the currently proposed approach and move forward with consensus. This BoF will discuss the types of performance states that need to be modeled, existing Linux driver frameworks that can be re-used, new code that needs to be written and how Device Tree plays a role. Will we write a new DVFS or Interconnect Framework? Attend and find out!

Wednesday, October 12

  • 09:00 – 09:50 – Supporting the Camera Interface on the C.H.I.P – Maxime Ripard, Free Electrons

Every modern multimedia-oriented ARM SoC usually has some kind of camera interface to be able to capture a video (or photo) stream from an external camera. The framework of choice to support these controllers in Linux is the Video4Linux subsystem, also called v4l2.

This talk will walk through the v4l2 stack, the architecture of a v4l2 driver and the interaction between the SoC driver and its cameras. The presentation is based on the work Free Electrons has done to develop such a driver for the Allwinner SoCs, as part of enabling the C.H.I.P platform with the upstream Linux kernel.

  • 10:00 – 10:50 – How to Develop the ARM 64bit Board, Samsung TM2 with Exynos5433 – Chanwoo Choi, Samsung Electronics

Over the last twenty years, ARM has been the undisputed leader in processor architecture for the embedded and mobile industry. With its 64-bit platform, ARM widens its field of applicability. ARMv8 introduces a new register set, is compatible with its 32-bit predecessor ARMv7, and best suits systems that aim to be among the high-end performance devices. Tizen OS is an open multi-profile platform that can run on TVs, mobile devices, cars and wearables. The Samsung TM2 board based on Exynos5433, whose patches have recently been posted to mainline, is a 64-bit ARM board supported by Tizen 64-bit. However, during the bring-up, the kernel developers faced many challenges that will be presented in this session. The presentation will go through a number of issues and the way they have been solved in order to make Tizen run on a 64-bit platform.

  • 10:45 – 11:35 – Devicetree Hardware Autoconfiguration – Hans de Goede, Red Hat

One can buy 7″ Android tablets for around $35 now. Assuming one gets the standard Q8 Allwinner based model, these are actually supported by the mainline Linux kernel now. These tablets use a standard case + SoC + display, which get paired with a different touchscreen controller, accelerometer and WiFi chip for every other batch.

This talk will outline my experience in making a single devicetree file covering all variants using an in-kernel hardware auto-detection module which creates and applies devicetree changesets depending on the detected hardware. This talk will give the audience an idea of what is and is not possible with regard to dynamic devicetree usage, as well as give dos and don’ts for people who want to use dynamic devicetree themselves.

  • 11:45 – 12:35 – Wyliodrin STUDIO: An Open Source Tool for IoT Development – Alexandru Radovici, Wyliodrin

Have you been using your development board (like the Raspberry Pi for example) as a glorified computer? Are you tired of needing to hook up your boards to a display and keyboard any time you want to program them?

Wyliodrin STUDIO is a software development tool especially created for the design of IoT projects. It comes as an open source Chrome extension so that programmers can use it independently of their specific OS platform and with little setup overhead.

Wyliodrin STUDIO abstracts away many of the issues regarding setting up your development boards and allows programmers to directly focus on their projects. It offers a friendly programming environment with many of the features of advanced IDEs, like Eclipse. For beginners, Wyliodrin STUDIO offers a large range of tutorials to help people take their first steps in IoT development. MagPi gave Wyliodrin STUDIO a 5/5 rating.

  • 14:00 – 14:50 – ASoC: Supporting Audio on an Embedded Board – Alexandre Belloni, Free Electrons

ASoC, which stands for ALSA System on Chip, is a Linux kernel subsystem created to provide better ALSA support for system-on-chip and portable audio codecs. It allows codec drivers to be reused across multiple architectures and provides an API to integrate them with the SoC audio interface.

This talk will present the typical hardware architecture of audio devices on embedded platforms, present the ASoC API and how to use it for machine drivers, which are used to glue audio codecs with the processor audio interface. Examples, common issues and debugging tips will also be discussed.

  • 15:00 – 15:50 – Cameras in Embedded Systems: Device Tree and ACPI View – Sakari Ailus, Intel

Cameras in embedded systems are often collections of different components rather than monolithic devices such as USB webcams. They consist of sensors, lenses, LED or xenon flashes and ISPs, each of which are individual devices with their specific drivers.

Platform data was once the prevalent solution for supporting hardware variation between different ARM based systems. Since around 2011, new platform data files have had a hard time getting into mainline, the preferred solution being the Device tree. However, Device tree support in the V4L2 framework was not around until over a year after that, and help from the V4L2 async framework is also required in order to achieve the same functionality as with platform data.

This talk shows how the frameworks are used in drivers and Device tree source, reviews the status of ACPI and discusses potential future developments.

  • 16:30 – 17:20 – Swapping and Embedded: Compression is the Key – Vitaly Wool

Ever since Linux started running on embedded devices, having swap on such devices has been considered a misconfiguration rather than a method for overcoming RAM shortage or a performance booster. This attitude started to change with the spread of Android devices, which usually don’t have a problem utilizing virtually any amount of memory. And with the introduction of ZRAM, the usage of a compressed swap in RAM became more useful and more popular. This talk will give a comprehensive description of ZRAM and its counterpart, zswap, and a summary of the pros and cons of both. This talk will also cover the brand new z3fold compressed memory allocator, which can be used for both zswap and ZRAM, and will present measurement results for these, obtained on various devices ranging from set-top boxes to laptops, not to forget Android phones.
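
The idea behind a compressed swap in RAM can be illustrated with a small user-space sketch that compresses page-sized buffers and reports the resulting ratio. This is only an illustration: zlib stands in for the kernel's compressors, the 4 KiB page size is an assumption, and storing incompressible pages as-is is a simplification of what ZRAM and zswap actually do.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size in bytes


def compressed_size(page: bytes) -> int:
    """Bytes needed to store one page, compressed only when that helps."""
    assert len(page) == PAGE_SIZE
    packed = zlib.compress(page)
    # Keep incompressible pages uncompressed rather than let them grow.
    return min(len(packed), PAGE_SIZE)


def compression_ratio(pages: list) -> float:
    """Original size divided by stored size for a list of pages."""
    stored = sum(compressed_size(p) for p in pages)
    return len(pages) * PAGE_SIZE / stored


if __name__ == "__main__":
    zero_page = bytes(PAGE_SIZE)
    text_page = (b"embedded linux " * 300)[:PAGE_SIZE]
    ratio = compression_ratio([zero_page, text_page])
    print(f"compression ratio: {ratio:.1f}x")
```

The higher the ratio for a given workload, the more pages fit in the same amount of RAM set aside for swap, at the cost of CPU time spent compressing and decompressing.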

Thursday, October 13

  • 09:00 – 09:50 – Time is Ready for the Civil Infrastructure Platform – Yoshitake Kobayashi, Corporate Software Engineering Center & Urs Gleim, Siemens

The Civil Infrastructure Platform (CIP), launched in April, has defined and started to realize a super long-term supported open source “base layer” for industrial grade software. This base layer aims to be used for current and future industrial systems which support machine-to-machine connectivity for the digital future. These kinds of systems, being in the field for decades, should have long-term support for security and robustness reasons. In this talk, we will show the first steps of CIP development. This includes the initial set of components for the base layer and its maintainers. Are you ready? It’s time to start your development with and for the CIP.

  • 10:00 – 10:50 – Introduction to Memory Management in Linux – Alan Ott, Signal 11 Software

All modern non-microcontroller CPUs contain a memory management unit and utilize the concept of virtual memory. This presentation will describe the different types of virtual memory spaces and mappings used in the Linux kernel, the cases in which they are useful, how they are implemented in the kernel, and how they differ from user space memory. Concepts such as the hardware memory-management unit (MMU) and translation lookaside buffer (TLB) will be discussed, as well as software concepts like kernel page tables. User space concepts such as growable stacks, memory paging, memory mapping, page faults, exceptions, and other memory-related conditions will be covered as well.
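
As a small illustration of the virtual memory concepts listed in this abstract, the sketch below splits a virtual address into a page number and an offset, then translates it through a toy single-level page table. The 4 KiB page size and the dictionary-based page table are assumptions made for the example; real MMUs and the Linux kernel use multi-level tables.

```python
PAGE_SHIFT = 12                 # assumed 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT
PAGE_MASK = PAGE_SIZE - 1


def split_virtual_address(vaddr: int) -> tuple:
    """Return (virtual page number, offset within the page)."""
    return vaddr >> PAGE_SHIFT, vaddr & PAGE_MASK


def translate(vaddr: int, page_table: dict) -> int:
    """Translate vaddr using a toy single-level page table.

    page_table maps virtual page numbers to physical frame numbers.
    A missing entry raises KeyError, standing in for a page fault.
    """
    vpn, offset = split_virtual_address(vaddr)
    frame = page_table[vpn]
    return (frame << PAGE_SHIFT) | offset


if __name__ == "__main__":
    table = {0x12: 0x80}        # virtual page 0x12 -> physical frame 0x80
    print(hex(translate(0x12345, table)))
```

The offset bits pass through unchanged; only the page number is looked up, which is also the part a TLB caches.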

  • 11:15 – 12:15 – MinnowBoard Delta: Fishing for Easy IoT Hardware – David Anders, Intel

With the introduction of the Zephyr Project, a small scalable real-time operating system for use on resource-constrained systems, the need for easy-to-use platforms to enable Internet of Things development has grown. With the idea of enabling both hardware and software developers to quickly prototype and develop proofs-of-concept, as well as transition directly to product, the MinnowBoard Delta was designed as an open source hardware platform to highlight the Zephyr Project. This presentation will cover design considerations as well as implementation methods for creating open source hardware specifically for open source software.

  • 12:15 – 13:05 – Cloud Platforms for the Internet of Things: How Do They Stack Up? – Koustabh Dolui, Politecnico di Milano

With the advent of the Internet of Things (IoT), there has been a recent surge in the number of cloud platforms offering their services for data collection and processing from IoT devices. These platforms, open-source and closed, are diverse in terms of ease of use, architecture, data storage, privacy, security and communication protocols. However, how these cloud platforms measure up against each other, given the set of tradeoffs that they present, remains quite unexplored in existing literature. In this presentation, Koustabh will present a detailed study on the architecture that these platforms are based on and how the open source platforms compare against closed platforms. Koustabh will compare the platforms based on a real data-set generated from a sensor network deployed at the heritage site of Circo Massimo, Rome, as a part of an ongoing project at Politecnico di Milano, Italy.

  • 14:30 – 15:20 – GPIO for Engineers and Makers – Linus Walleij

We will go over the changes to the GPIO subsystem in the recent years, including GPIO descriptor refactoring, new support for things like open drain, some words on device tree and ACPI hardware descriptions, and we will discuss the new userspace character device ABI for GPIO chips and how use cases such as those presented by the maker community or industrial control clients can benefit from it. We will also talk a bit about the future direction of the subsystem.

  • 15:30 – 16:20 – FDO: Magic ‘Make My Program Faster’ Compilation Option? – Pawel Moll, ARM

Feedback Driven Optimisation (FDO), also known as Profile Guided Optimisation (PGO), is a well known code optimisation technique, employed by compilers since the mid-20th century, yet not widely used in the wild these days. It relies on providing runtime-captured information about code execution (e.g. “branch taken or not?”) during the next code compilation, improving the quality of decisions made by compiler heuristics.

To be fair, there were good reasons for its demise which I hope to discuss, mainly time and complexity overhead and deployment difficulties, but there is some hope on the horizon, coming with a new approach called AutoFDO, originating at Google and based on statistical profiling (namely Linux perf + extra tools) and source code level attribution. I’ll discuss existing support for it available in mainline GCC and LLVM and give examples of real-life, successful deployments.

If you’d like to attend the event, you can do so by registering online, and paying the entry fee:

  • Early Registration Fee: US$550 (through August 1, 2016)
  • Standard Registration Fee: US$650 (August 2, 2016 – September 3, 2016)
  • Late Registration Fee: US$850 (September 4, 2016 – Event)
  • Student Registration Fee: US$175 (valid student ID required)
  • Hobbyist Registration Fee: US$175

sModule SBC-x6818 Development Kit based on Samsung S5P6818 Processor Includes a 7″ Touchscreen

July 13th, 2016 4 comments

For some reason, Samsung S5P4418 and S5P6818 quad- and octa-core Cortex A53 processors – likely made by Nexell – have been quite popular with embedded systems companies based in China. So after Graperain, Boardcon, and FriendlyARM, there’s at least one other company offering solutions with either processor, as sModule, a subsidiary of CoreWind, has now launched systems-on-module, single board computers, and development kits with the 64-bit ARM SoCs. In this post, I’ll cover one of their development kits, including their CORE6818 CPU module, a baseboard, and an optional 7″ capacitive touch display.

sModule SBC-x6818 development kit specifications:

  • CORE6818 CPU module
    • SoC – Samsung S5P6818 octa-core ARM Cortex A53 processor @ 1.4 to 1.6 GHz with Mali-400MP 3D GPU
    • System Memory – 1GB DDR3 (2GB optional)
    • Storage – 8GB eMMC Flash (4 & 16GB optional)
    • Ethernet – Realtek RTL8211E Gigabit Ethernet transceiver
    • 180-pin “interface” to baseboard
    • Power Supply – 3.7 to 5.5V DC input; 3.3V / 4.2V DC output; AXP228 PMIC
    • Dimensions – 68 x 48 x 3 mm (8-layer PCB)
    • Temperature range – -10 to 70 deg. C
  • SBC-x6818 Baseboard
    • Storage – 2x micro SD card slots
    • Video Output / Display I/F – 1x HDMI up to 1080p30, LCD, 20-pin LVDS, and 20-pin MIPI DSI interfaces; optional 7″ capacitive touch screen (1024×768 resolution)
    • Audio – HDMI, and 3.5mm headphone jack, speaker header, built-in microphone
    • Connectivity – Gigabit Ethernet
    • USB – 4x USB 2.0 host ports, 1x mini (micro?) USB OTG port
    • Camera – 1x 20-pin camera interface
    • Expansion
      • “GPIO” header with ADC, UART, SPI, SPDIF, and GPIOs
      • ADC terminal block
      • Serial – 2x DB9 UART interfaces, 2x UART headers
    • Misc – IR receiver; power, menu, volume, and return buttons; RTC with battery (not populated?); PWM buzzer; boot selector: eMMC, SD card, or USB (with fastboot?)
    • Power
      • 5V/2A DC via power barrel
      • Power out header with 12V, 3.3V, and GND
      • 2-pin battery header for 4.2V lithium battery
    • Dimensions – 185 x 110 mm

The company provides Android 4.4, Ubuntu 12.04, and Linux 3.5 + Qt 5.0 for the board. As with other boards based on Samsung/Nexell S5P processors, don’t expect software updates for the firmware, so if you need security patchsets or the latest kernel features, this won’t work for you. You can find a few details about the hardware on the Wiki.

While other companies kept their prices secret, sModule has published prices for all their modules and boards, and even allows you to purchase them by PayPal or bank transfer. Their CORE4418 module starts at $49, while the development kit above goes for $119 with the touch screen, and $109 without. The more compact iBOX6818 single board computer – they call it a card computer – with 2GB RAM goes for $75. More details can be found on the sModule products page.

$599 Softiron Overdrive 1000 Server is Powered by AMD Opteron A1100 64-bit ARM Processor

June 26th, 2016 15 comments

ARMv8 servers have been around for a year or so, but are normally only available to companies, mostly due to their very high price. The LeMaker Cello board based on the AMD Opteron A1120 quad core SoC has changed that since it’s priced at $299, but I’m not sure it’s shipping right now, and it’s not a complete solution fitted with memory and storage, and lacks an enclosure. The good news is that Softiron has just launched the Overdrive 1000 server powered by an AMD Opteron A1100 series processor, with 8GB DDR4 RAM, a 1TB drive, and a case.

Softiron_Overdrive_1000Softiron Overdrive 1000 server specifications:

  • SoC – AMD Opteron A1100 series quad core ARM Cortex A57 processor
  • System Memory – 2x RDIMM slots fitted with 8GB DDR4 DRAM and expandable to 64GB
  • Storage – 2x SATA 3.0 connectors, one of them fitted with a 1TB HDD
  • Connectivity – 1x GBase-T Ethernet
  • USB – 2x USB 3.0 ports
  • Power Supply – ATX power supply; 100~240V @ 50-60Hz
  • Dimensions – 315 x 222 x 76 mm or 463 x 385 x 145 mm (Product page vs product brief info)
  • Weight – 3.65 kg or 5.2 kg

A standard UEFI boot environment is used, and while you could install your distribution of choice, the server is pre-loaded with openSUSE Leap including a standard Linux GNU tool chain, platform device drivers, the Apache web server, MySQL, PHP, Xen, KVM Hypervisor, Docker, and OpenJDK 64-bit ARM.

I could not find much in the way of demo, but you can listen to ARM and Softiron representatives explaining why it’s a good choice…

If you’d like to go ahead, and get one, you can purchase Softiron Overdrive 1000 directly on the company’s website for $599 + shipping. In my case (Asia based), it would cost $87.06 via UPS, which looks not too bad considering the weight…

Via Andrew Wafaa

ARMv8 64-bit Processors To Replace Intel Xeon and SPARC64 Processors in Some Supercomputers

June 21st, 2016 5 comments

There has been some news recently about the Sunway TaihuLight supercomputer, which now tops the list of the 500 fastest supercomputers with 93 PFLOPS achieved with Linpack, and comprises 40,960 Sunway SW26010 260-core “ShenWei” processors designed in China. Another interesting development is that ARMv8 processors are also slowly coming to supercomputers, starting with the TianHe-2 supercomputer, which currently uses Intel Xeon & Xeon Phi processors and is second in the list. According to a report on Vrworld, the US government decided to block US companies’ sales (i.e. Intel and AMD) to China as they were not at the top anymore, and also blocked Chinese investments into Intel and AMD, so the Chinese government decided to do it on their own, and is currently adding Phytium Mars 64-core 64-bit ARM processors to expand TianHe-2 processing power. Once the upgrade is complete, TianHe-2 should have 32,000 Xeons (as currently), 32,000 ShenWei processors, and 96,000 Phytium accelerator cards delivering up to 300 PFLOPS.
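
To put the TaihuLight numbers in perspective, the short sketch below derives the total core count and the average Linpack throughput per core from the figures quoted above. It is plain arithmetic on those figures; no other data is assumed.

```python
PROCESSORS = 40_960          # Sunway SW26010 chips, as quoted above
CORES_PER_PROCESSOR = 260
LINPACK_FLOPS = 93e15        # 93 PFLOPS achieved with Linpack


def total_cores() -> int:
    """Total number of cores in the machine."""
    return PROCESSORS * CORES_PER_PROCESSOR


def linpack_flops_per_core() -> float:
    """Average Linpack throughput per core in FLOPS."""
    return LINPACK_FLOPS / total_cores()


if __name__ == "__main__":
    print(f"Total cores: {total_cores():,}")
    print(f"Linpack per core: {linpack_flops_per_core() / 1e9:.2f} GFLOPS")
```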

Japan K-Computer with SPARC64 Processors

One other report on The Register explains that the next generation of K-Computer, currently using Fujitsu SPARC64 processors, will instead feature Fujitsu ARMv8 processors in the Post-K supercomputer in 2020, delivering up to 1000 PFLOPS (or 1 exaFLOPS). Details are sparse right now, but we do know Fujitsu “has optimized the processor’s design to accelerate math, and squeeze the most of the die caches, hardware prefetcher and its Tofu interconnect”.

More details will likely be offered during “Towards Extreme-Scale Weather/Climate Simulation: The Post K Supercomputer & Our Challenges” presentation at ISC 2016 in Frankfurt, Germany later today.

Thanks to Sanders and Nicolas.

Cavium introduces 54 cores 64-bit ARMv8 ThunderX2 SoC for Servers with 100GbE, SATA 3, PCIe Gen3 Interfaces

June 1st, 2016 5 comments

Cavium announced their first 64-bit ARM Server SoCs with the 48-core ThunderX at Computex 2014. Two years later, the company has now introduced the second generation, aptly named ThunderX2, with 54 64-bit ARM cores @ up to 3.0 GHz and promising two to three times more performance than the previous generation.

Key features of the new server processor include:

  • 2nd generation of full custom Cavium ARM core; Multi-Issue, Fully OOO; 2.4 to 2.8 GHz in normal mode, Up to 3 GHz in Turbo mode.
  • Up to 54 cores per socket delivering > 2-3X socket level performance compared to ThunderX
  • Cache – 64K I-Cache and 40K D-Cache, highly associative; 32MB shared Last Level Cache (LLC).
  • Single and dual socket configuration support using 2nd generation of Cavium Coherent Interconnect with > 2.5X coherent bandwidth compared to ThunderX
  • System Memory
    • 6x DDR4 memory controllers per socket supporting up to 3 TB RAM in dual socket configuration
    • Dual DIMM per memory controller, for a total of 12 DIMMs per socket.
    • Up to 3200MHz in 1 DPC and 2966MHz in 2 DPC configuration.
  • Full system virtualization for low latency from virtual machine to IO enabled through Cavium virtSOC technology
  • Next Generation IO
    • Integrated 10/25/40/50/100GbE network connectivity.
    • Multiple integrated SATAv3 interfaces.
    • Integrated PCIe Gen3 interfaces, x1, x4, x8 and x16 support.
  • Integrated Hardware Accelerators
    • OCTEON style packet parsing, shaping, lookup, QoS and forwarding.
    • Virtual Switch (vSwitch) offload.
    • Virtualization, storage and NITROX V security.
  • Manufacturing Process – 14 nm FinFET
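
The System Memory entries in the list above can be cross-checked with a little arithmetic. The sketch below derives the DIMM count per socket, the DIMM size needed to reach 3 TB, and a theoretical peak bandwidth. The bandwidth part rests on two assumptions not stated in the announcement: that "3200MHz" means 3200 MT/s, and that each controller drives one 64-bit DDR4 channel.

```python
# Figures taken from the "System Memory" entries in the list above.
CONTROLLERS_PER_SOCKET = 6
DIMMS_PER_CONTROLLER = 2
SOCKETS = 2
MAX_RAM_GB = 3 * 1024          # "up to 3 TB" in dual socket configuration

# Assumptions (not stated above): 3200 MT/s, 8 bytes per transfer.
TRANSFERS_PER_SECOND = 3200 * 10**6
BYTES_PER_TRANSFER = 8


def dimms_per_socket() -> int:
    """Number of DIMM slots per socket."""
    return CONTROLLERS_PER_SOCKET * DIMMS_PER_CONTROLLER


def gb_per_dimm_at_max() -> float:
    """DIMM size needed to reach the maximum RAM with every slot filled."""
    return MAX_RAM_GB / (dimms_per_socket() * SOCKETS)


def peak_bandwidth_gb_s() -> float:
    """Theoretical per-socket peak bandwidth at 1 DIMM per channel."""
    total = TRANSFERS_PER_SECOND * BYTES_PER_TRANSFER * CONTROLLERS_PER_SOCKET
    return total / 1e9


if __name__ == "__main__":
    print(f"DIMMs per socket: {dimms_per_socket()}")
    print(f"GB per DIMM for 3 TB: {gb_per_dimm_at_max():.0f}")
    print(f"Peak bandwidth per socket: {peak_bandwidth_gb_s():.1f} GB/s")
```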

Just like for Cavium ThunderX, four revisions (SKUs) will be provided to match specific requirements, all of which support 10/25/40/50/100GbE connectivity:

  • ThunderX2_CP for cloud compute workloads. Used for private and public clouds, web serving, web caching, web search, commercial HPC workloads such as computational fluid dynamics (CFD) and reservoir modeling. This family also includes PCIe Gen3 interfaces, and accelerators for virtualization and vSwitch offload.
  • ThunderX2_ST optimized for big data, cloud storage, massively parallel processing (MPP) databases and data warehousing. This family supports multiple PCIe Gen3 interfaces, SATAv3 interfaces, and hardware accelerators for data protection, integrity, security, and efficient data movement.
  • ThunderX2_SC optimized for secure web front-end, security appliances and cloud RAN type workloads. This family supports multiple PCIe Gen3 interfaces, as well as Cavium’s NITROX security technology with acceleration for IPSec, RSA and SSL.
  • ThunderX2_NT optimized for media servers, scale-out embedded applications and NFV type workloads. This family includes OCTEON style hardware accelerators for packet parsing, shaping, lookup, QoS and forwarding.

The processor complies with Server Base Boot Requirements (SBBR, i.e. UEFI and ACPI support) and SBSA Level 2, and will support Ubuntu 16.04 LTS and later, Red Hat Early Access for ARMv8, SUSE SLES SP2 and later, CentOS 7.2 and later, and FreeBSD 11.0 and later.

Charbax interviewed the company at Computex 2016 in the 20-minute video below, where you can also see a ThunderX-based Gigabyte G220-T60 server with an Nvidia Tesla GPU (at the 7:20 mark) for “high performance compute applications”, and other servers based on the first generation ThunderX SoC.

I could not find when the SoC will be available. More details can be found on the Cavium ThunderX2 product page.

$35 NanoPi M3 Octa Core 64-bit ARM Development Board is Powered by Samsung S5P6818 Processor

May 20th, 2016 24 comments

A few weeks after introducing NanoPC-T3 single board computer based on Samsung S5P6818 octa-core Cortex A53 processor, FriendlyARM is now launching a cost-down version called NanoPi M3 for just $35 with 1GB RAM, and booting from a micro SD card.

NanoPi M3 board specifications:

  • SoC – Samsung S5P6818 octa core Cortex A53 processor @ up to 1.4GHz with Mali-400MP GPU
  • System Memory – 1 GB 32-bit DDR3
  • Storage – 1x micro SD card slot
  • Connectivity – Gigabit Ethernet (RTL8211E), 802.11 b/g/n WiFi and Bluetooth LE 4.0 (Ampak AP6212) with on-board chip antenna and IPX antenna connector
  • Video Output / Display I/F – HDMI 1.4a up to 1080p60, LVDS, parallel RGB LCD
  • Audio I/O – HDMI, 3.5mm audio jack, 7-pin I2S header
  • Camera – 1x DVP interface
  • USB – 2x USB 2.0 type A host ports; 1x micro USB 2.0 client port; 2x USB 2.0 host ports via 8-pin header
  • Expansions Headers – 40-pin header
  • Debugging – 4-pin header for serial console
  • Misc – Power & reset buttons; power status LEDs.
  • Power Supply – 5V/2A via micro USB port; AXP228 PMIC
  • Dimension – 64 x 60 mm (6-layer PCB)

The board supports Android and Debian running on top of Linux 3.4. More technical details can be found in the Wiki. Samsung S5P processors are actually made by Nexell, and not supported at all in mainline Linux, so don’t expect support for a more recent kernel. Arnd Bergmann, one of the Linux ARM SoC maintainers, even referred to the code as “awful”:

Source code is available but awful.

Specifically, this is a Linux-3.4 kernel that looks more like a Linux-2.6.28 platform port that was forward-ported.

Nevertheless, at $35 plus shipping ($10 in my case), NanoPi-M3 must be the cheapest octa-core board available on the market so far. Visit the product page for more details and/or purchase the board.