Archive

Posts Tagged ‘gpu’

Vivante Unveils Details About GC7000 Series GPU IP Family

April 19th, 2014 8 comments

Earlier this month, Vivante Corporation announced several silicon partner integrations (no names given) of its GC7000 Series GPU IP into SoCs targeting wearables, mobile, automotive, and 4K TV products, and provided more details about the GC7000 family, which supports features such as the OpenGL ES 3.1 API and hardware TS/GS/CS (tessellation / geometry / compute shader) extensions for Android.

Vivante_GC7000_Architecture

According to the company, the key benefits of its GC7000 GPU IP can be summarized as follows:

  • True GPU Scalability – GC7000 Series products support limited silicon area to match form factor and market requirements. Products can snap to grid starting at 3.0 mm² (28 nm) for the smallest single GPU GC7000 instance and grow in simple modular fashion for high end implementations to achieve what the company claims to be the industry’s best PPA (power/performance/area).
  • Smallest Licensable OpenGL ES 3.1 Cores with Geometry, Tessellation, and Compute Shaders – Die area of the GC7000 is reduced by 20% over previous generation mass market cores and includes the new evolution of OpenGL ES 3.1 and DirectX 11 shader/GPU technologies and upcoming mobile platform requirements, including support for hardware TS/GS shading extensions for Android OS.
  • Faster Graphics Performance – Better real time utilization of shaders speeds up rendering performance, quality and effects to effectively scale up for 4K gaming content at 60 FPS.
  • Cooler Cores – GPU thermals and system power are reduced 30% and bandwidth is reduced by 50% through bandwidth modulation using Vivante frame buffer (vFB) and pixel compression, Khronos ASTC, geometry/tessellation shader rendering, and Android optimized intelligent composition (Regionizer).
  • Configurable Shader Core Implementations – Cores range from highly silicon optimized eight shader solutions to performance optimized multi-GPU/multi-shader solutions, all with hardware support for security (secure GPU) and OS virtualization.
  • Hardware and Software Integration Simplified – The single unified software stack supports all Vivante GPU cores and existing software platforms to create a seamless transition to the latest technologies. GC7000 hardware is even more modular to allow faster integration with easier place-and-route design and reduced wire congestion.
  • System Friendly Architecture – GC7000 is designed for hybrid and heterogeneous computing systems supporting OpenCL and HSA using AMBA ACE-Lite (CPU – GPU cache coherency) and the vStream interface. Other additions include a pixel compression fabric that allows GC7000 to create a streamlined pixel processing pipeline across the ISP, CPU, DSP, memory, and display processor.

GC7000 Series GPU cores come packaged with a single driver software stack that supports board support packages (BSP) running Android KitKat, Chrome OS, Linux, QNX, Tizen and Windows operating systems. They will also support the Unreal Engine 4, Unity 4 and upcoming Unity 5 SDKs.

Vivante_GC7000_Family

There are currently six GPUs available in the GC7000 series: GC7000 UltraLite, GC7000 Lite, GC7000, GC7200, GC7400, and GC7600, with 8 to 256 Vega shader cores clocked at up to 1 GHz, all supporting OpenGL ES 3.1, OpenGL 2.x desktop, and OpenCL 1.2. Performance will range from 32 to 1024 GFLOPS for medium precision operations and 16 to 512 GFLOPS for higher precision operations, and GC7000 GPUs will be able to deliver up to 25.6 GTexel/s and up to 16 GVertex/s.
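The quoted GFLOPS range lines up with simple peak arithmetic: shader cores × clock × operations per core per cycle. Here is a rough sketch. The per-cycle values (4 for medium precision, 2 for higher precision) are assumptions inferred from the figures above, not numbers published by Vivante, and `peak_gflops` is just a helper name made up for illustration.

```shell
#!/bin/sh
# Peak throughput sketch: cores x clock (GHz) x ops per core per cycle.
# The ops-per-cycle values (4 and 2) are ASSUMPTIONS inferred from the
# quoted 32-1024 and 16-512 GFLOPS ranges, not Vivante-published numbers.
peak_gflops() {
    cores=$1
    clock_ghz=$2
    ops_per_cycle=$3
    echo $(( cores * clock_ghz * ops_per_cycle ))
}

peak_gflops 8 1 4     # smallest configuration, medium precision
peak_gflops 256 1 4   # largest configuration, medium precision
peak_gflops 8 1 2     # smallest configuration, higher precision
peak_gflops 256 1 2   # largest configuration, higher precision
```

With a 1 GHz clock the four calls reproduce both ends of each quoted range, which suggests the headline numbers are theoretical peaks rather than measured results.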

AndroidPC.es also reports that GC7000 GPU performance (I’d assume that of the GC7600) should be 40% higher than the Nvidia Tegra K1 “Kepler” GPU and 122% higher than the Imagination Technologies PowerVR GX6650 “Rogue 2.0” GPU.

$192 Nvidia Jetson TK1 Development Board with Tegra K1 Quad Core Cortex A15 SoC

March 26th, 2014 10 comments

Nvidia has just unveiled the Jetson TK1 development kit powered by its 32-bit Tegra K1 quad core Cortex A15 processor with a 192-core Kepler GPU. This board targets computer-vision applications for the robotics, medical, avionics, and automotive industries that can leverage the compute capabilities of the Kepler GPU.

Jetson TK1 Development Board

Jetson TK1 devkit specifications:

  • SoC – Nvidia Tegra K1 SoC with 4-Plus-1 quad-core ARM Cortex A15 CPU, and Kepler GPU with 192 CUDA cores (Model T124)
  • System Memory – 2 GB x16 memory with 64 bit width
  • Storage – 16 GB eMMC 4.51 memory, SATA data + power ports, full size SD/MMC slot, and 4MB SPI boot flash.
  • Video Output – HDMI port
  • Audio – ALC5639 Realtek Audio codec with Mic in and Line out
  • Connectivity – RTL8111GS Realtek GigE LAN
  • USB – 1x USB 2.0 OTG port (micro-AB), 1x USB 3.0 port (type A)
  • Debugging – RS232 serial port, JTAG
  • Expansion
    • 1x Half mini-PCIE slot
    • Expansion port with access to DP/LVDS, Touch SPI, 1×4 + 1×1 CSI-2, GPIOs, UART, HSIC, I2C
  • Sensor – TMP451 temperature monitor
  • Misc – Power, reset and recovery buttons, power and network LEDs, fan header
  • Power – AMS AS3722 Power Management IC for power and sequencing.
  • Dimensions – 12.7×12.7 cm

The complete development kit includes Jetson TK1 development board (Model PM375), an AC adapter with power cord, a USB Micro-B to USB A adapter, and a Quick Start Guide.

Nvidia Jetson TK1 Development Board Block Diagram

The company provides Linux for Tegra K1, CUDA Toolkit and Accelerated Libraries (CUDA 6.0 / OpenCV4Tegra), CUDA sample code, as well as the board specifications, schematics (PDF) and mechanical design files (STP), all of which can be accessed via the board support page. The Linux kernel version is 3.10.24, and it comes with support for OpenGL ES 2.0, OpenGL ES 1.1, OpenGL ES path extensions, EGL 1.4 with EGLImage media APIs, and X11. Nvidia also provides support for OpenGL 4.4.

The development kit is available for pre-order for $192 via Nvidia’s developer Jetson TK1 page. Shipping is scheduled for April. The downside is that it will only ship to the US, Canada, Puerto Rico, and the Virgin Islands. If you live in Europe, you can pre-order from Avionic Design, SECO and Zotac, and in Japan, you can go through Ryoyo Electro Corporation.

Via Google+ mini PC community and Sanders.

OpenCL Accelerated SQL Database with ARM Mali GPU Compute Capabilities

March 20th, 2014 5 comments

We’ve previously seen that GPU compute on ARM can improve performance for mobile, automotive and consumer electronics applications. GPU compute offloads CPU tasks that can be parallelized to the GPU using APIs such as OpenCL or RenderScript. Most applications that can leverage GPU compute are related to media processing (video decoding, picture processing, audio decoding, image recognition, etc…), but one thing I did not suspect could be improved is database access. That’s what Tom Gall, Linaro, has achieved in a side project by using OpenCL to accelerate SQLite database operations by around 4 times for a given benchmark.

SQLite Architecture and "Attack Point" for OpenCL Implementation

The hardware used was a Samsung Chromebook with an Exynos 5250 SoC featuring a dual core Cortex A15 processor and an ARM Mali T604 GPU. GPU compute is only possible on ARM Mali T6xx and greater, and won’t work on Mali 400 / 450 GPUs. Other GPU vendors such as Vivante and Imagination Technologies also support GPU compute in their latest processors.

As a first implementation, he added an API to SQLite, but eventually the code may be merged inside SQLite, as it would then also accelerate existing applications using SQLite. This type of acceleration will work best with large tables and parallel tasks. For benchmarking purposes, Tom used a 100,000 row database with 7 columns and ran the same query (select * from testdb) using the SQLite C API and his OpenCL accelerated API. Here are the results:

  • SQLite C API – 420.274 milliseconds
  • OpenCL accelerated SQLite API – 110.289 milliseconds

The first test ran fully on the Cortex A15 cores @ 1.7 GHz, whereas the OpenCL test mostly ran on the Mali-T604 GPU clocked at 533 MHz (TBC). The time includes both the running of the OpenCL kernel and the data transfer from the result buffer.
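The “around 4 times” figure can be checked directly from the two timings above. This is a minimal sketch; `speedup` is a helper name introduced here for illustration, and awk is used only because POSIX shell arithmetic is integer-only.

```shell
#!/bin/sh
# Speedup = baseline time / accelerated time.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
speedup() {
    LC_ALL=C awk -v base="$1" -v accel="$2" \
        'BEGIN { printf "%.2f\n", base / accel }'
}

# Timings in milliseconds, taken from the results listed above.
speedup 420.274 110.289
```

This prints 3.81, so the OpenCL path was a little under four times faster for this particular query.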

More work is needed, but this seems like an interesting application for GPU compute in some use cases. I would expect to see no gain for queries performed on small tables, for example. The modified OpenCL code does not appear to be available right now, but you may want to read the GPGPU on ARM presentation at Linaro Connect Asia 2014 for a few more details about the implementation, and if you want to play around with OpenCL 1.1 (or OpenGL ES) in Linux on a Chromebook, you can follow those instructions.

Mali-400 GPU Is Now Working in Linux for Rockchip RK3188 Devices

March 14th, 2014 23 comments

Accelerated 3D graphics in Linux with Mali-400 via OpenGL ES has been possible for nearly a year on RK3066 devices, but there was no such support for RK3188. This week, however, both Naoki FUKAUMI and omegamoon have reported OpenGL ES working on their respective RK3188 devices. I don’t know which device omegamoon used, but Naoki did so on Radxa Rock, and even posted instructions to build it yourself.

es2gears OpenGL ES demo on Rockchip RK3188

They’ve mostly followed the work done by olegk0 for Rockchip, and the Mali driver build instructions provided by the linux-sunxi community. The process can be summarized in 3 main steps:

  1. Cross-compile drm.ko, mali_drm.ko, ump.ko, mali.ko on a Linux machine
  2. Copy the four modules to your RK3188 based board or device and load them.
  3. Install dependencies and binary Mali drivers from linux-sunxi on your Rockchip device

Once this is done you can try some OpenGL ES demos such as es2gears or glmark2-es2 to test it with the framebuffer. es2gears can be installed with “sudo apt-get install mesa-utils-extra” and glmark2-es2 with “sudo apt-get install glmark2-es2”.

I had a quick try this morning: the build worked and the four modules could load, but es2gears was still rendered in software:

libEGL warning: DRI2: failed to authenticate
XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":0.0"
      after 184 requests (171 known processed) with 0 events remaining.
EGL_VERSION = 1.4 (DRI2)

I used a different toolchain, and kernel source, so this may be the reason. TBC. A successful es2gears output should look like:

EGL_VERSION = 1.4 Linux-r3p2-01rel2
 vertex shader info:
 fragment shader info:
 info:
 2064 frames in 5.0 seconds = 412.635 FPS
 2129 frames in 5.0 seconds = 425.630 FPS

and glmark2-es2:

=======================================================
    glmark2 2012.08
=======================================================
    OpenGL Information
    GL_VENDOR:     ARM
    GL_RENDERER:   Mali-400 MP
    GL_VERSION:    OpenGL ES 2.0
=======================================================
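A quick way to tell the two cases apart is to look at the EGL_VERSION line. The sketch below classifies that line using only the two strings quoted in this post. `egl_backend` is a hypothetical helper, and other driver releases may well print different strings, so anything unrecognized is reported as unknown.

```shell
#!/bin/sh
# Classify an EGL_VERSION line from es2gears output.
# Patterns are taken from the two outputs quoted in this post:
#   "(DRI2)"        -> Mesa's EGL was used (software rendered in my test)
#   "Linux-r..."    -> Mali driver release string (r3p2-01rel2 above)
# Matching on these strings is an assumption, not a documented interface.
egl_backend() {
    case "$1" in
        *"EGL_VERSION"*"DRI2"*)    echo "mesa-dri2" ;;
        *"EGL_VERSION"*"Linux-r"*) echo "mali-blob" ;;
        *)                         echo "unknown" ;;
    esac
}

egl_backend "EGL_VERSION = 1.4 (DRI2)"
egl_backend "EGL_VERSION = 1.4 Linux-r3p2-01rel2"
```

The GL_RENDERER line reported by glmark2-es2 (Mali-400 MP above) is another indication that the binary driver is in use.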

Further steps would be to enable X11 to use Mali, but I’m not sure this has been tried just yet.

Nevertheless, that should mean you can soon expect Linux images with support for accelerated 3D graphics for your Rockchip RK3188. This does not mean, however, that hardware video decoding will be possible, as the Mali-400 GPU is not a VPU and does not support decoding/encoding. There is, however, a separate effort to bring hardware video decoding support to RK3188, but this should take much more time.

In other news, the linux-rockchip community has just started a mailing list, so you may want to join if you are interested in software development on Rockchip devices for Linux and Android.

Amlogic GPL Source Code Release – Kernel 3.10, U-Boot, and Drivers (Wi-Fi, NAND, TVIN, Mali GPU)

March 10th, 2014 13 comments

Last month, I noticed Amlogic provided links to the Android SDK for S802 / M802 on their open source website, but the only way to get the source was to share your SSH public key with Amlogic, so that they could give you access. That did not happen, but the company has now released the source for Linux 3.10.10, U-boot 2011.03, Realtek and Broadcom Wi-Fi drivers, NAND drivers, “TVIN” drivers, and kernel space GPU drivers for Mali-400 / 450 GPU. There are also some customer board files for Meson 6 only (AML8726-MX / M6) but they do not seem to match the kernel…

amlogic_kernel_m802_s802

If you want to build the kernel, including the drivers, you’ll need to download a bunch of files:

wget http://openlinux.amlogic.com:8000/download/ARM/kernel/arm-src-kernel-2014-03-06-d5d0557b2b.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/wifi/rtk8192du-2014-03-06-7f70d95d29.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/wifi/rtk8192eu-2014-03-06-9766866350.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/wifi/rtk8192cu-2014-03-06-54bde7d73d.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/wifi/rtk8188eu-2014-03-06-2462231f02.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/wifi/brcmap6xxx-2014-03-06-302aca1a31.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/wifi/wifi-fw-2014-03-06-d3b2263640.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/modules/aml_tvin-2014-03-06-fb3ba6b1c8.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/modules/aml_nand-2014-03-06-39095c4296.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/customer/aml_customer-2014-03-06-76ce689191.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/gpu/gpu-2014-03-06-0425a1f681.tar.gz

You’ll need to extract these tarballs in specific directories:

# Run from the directory holding the downloaded tarballs
tar xvf arm-src-kernel-2014-03-06-d5d0557b2b.tar.gz
mkdir -p hardware/amlogic/
mkdir -p hardware/wifi/realtek/drivers
mkdir -p hardware/wifi/broadcom/drivers
mkdir -p hardware/arm/
# NAND driver
cd hardware/amlogic
tar xvf ../../aml_nand-2014-03-06-39095c4296.tar.gz
mv aml_nand-amlogic-nand nand
# Realtek Wi-Fi drivers
cd ../wifi/realtek/drivers
tar xvf ../../../../rtk8192du-2014-03-06-7f70d95d29.tar.gz
tar xvf ../../../../rtk8192eu-2014-03-06-9766866350.tar.gz
tar xvf ../../../../rtk8192cu-2014-03-06-54bde7d73d.tar.gz
tar xvf ../../../../rtk8188eu-2014-03-06-2462231f02.tar.gz
mv rtk8188eu-8188eu 8188eu
mv rtk8192du-8192du 8192du
mv rtk8192cu-8192cu 8192cu
mv rtk8192eu-8192eu 8192eu
# Broadcom Wi-Fi driver (extracted in its own directory)
cd ../../broadcom/drivers
tar xvf ../../../../brcmap6xxx-2014-03-06-302aca1a31.tar.gz
mv brcmap6xxx-ap6xxx ap6xxx
# Mali GPU kernel driver
cd ../../../arm
tar xvf ../../gpu-2014-03-06-0425a1f681.tar.gz
mv gpu-r3p2-01rel3 gpu
# TVIN driver, extracted in hardware/
cd ..
tar xvf ../aml_tvin-2014-03-06-fb3ba6b1c8.tar.gz
mv aml_tvin-amlogic-3.10-bringup tvin

You can also extract the customer file into the kernel directory to add some drivers. As I said above, I’m not sure the source code inside matches the Linux kernel 3.10.10, because there’s no device tree file for the boards. In arch/arm/plat-meson/Kconfig, there are (commented out) references to customer/meson/dt/Kconfig and customer/drivers/Kconfig. The device tree is not available, but the drivers are, so you could give it a try in order to build the touchscreen and sensor drivers available in the customer tarball:

cd ../linux-amlogic-3.10-bringup
tar xvf ../aml_customer-2014-03-06-76ce689191.tar.gz 
mv aml_customer-master customer

Finally, the development tree is ready to build the kernel. There must surely be a script somewhere to do all that… I haven’t used the file wifi-fw-2014-03-06-d3b2263640.tar.gz, as the kernel did not complain about it, and it looks like it’s just for Android KitKat. There are four scripts to build the kernel: mk_m6.sh, mk_m6tv, mk_m6_tvd.sh, and mk_m8.sh. The first three are for meson6 (dual core processor), and the last one is for meson8 (quad core S802/M802).

Let’s go with M8 build:

make ARCH=arm meson8_defconfig
./mk_m8.sh

Please note that I had to change mk_m8.sh, as it would make my computer hang, requiring a hard reset. The culprit was the line:

make uImage -j

The manpage indicates “If the -j option is given without an argument, make will not limit the number of jobs that can run simultaneously”. That does not seem like a good idea…, so I changed it to:

make uImage -j8
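Rather than hard-coding 8, the job count can be derived from the number of CPUs on the build machine. This is a small sketch of that idea, not what mk_m8.sh does: `job_count` is a helper invented here, and the final line only prints the resulting command instead of running it.

```shell
#!/bin/sh
# Pick a bounded job count for make instead of an unlimited "-j".
# nproc (GNU coreutils) is tried first, getconf is the fallback, and 1 is
# used if neither reports a positive integer.
job_count() {
    n=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    case "$n" in
        ''|*[!0-9]*|0) n=1 ;;
    esac
    echo "$n"
}

# Print the command that would be used in place of "make uImage -j".
echo "make uImage -j$(job_count)"
```

One job per CPU is a common starting point. It keeps the number of parallel compiler processes bounded, which is what the bare -j failed to do.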

Upon a successful build, the end of the log should look like:

UIMAGE arch/arm/boot/uImage
Image Name: Linux-3.10.10
Created: Mon Mar 10 11:48:52 2014
Image Type: ARM Linux Kernel Image (lzo compressed)
Data Size: 7099978 Bytes = 6933.57 kB = 6.77 MB
Load Address: 00008000
Entry Point: 00008000
Image arch/arm/boot/uImage is ready
/home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/scripts/amlogic/aml2dtb.sh /home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/arch/arm/boot/dts/amlogic/meson8_skt.dtd
DTD_FILE: /home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/arch/arm/boot/dts/amlogic/meson8_skt.dtd
the middle dts file: /home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/arch/arm/boot/dts/amlogic/meson8_skt.dts
process file /home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/arch/arm/boot/dts/amlogic/meson8_skt.dts start
processing... please wait...
process file /home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/arch/arm/boot/dts/amlogic/meson8_skt.dts end

CC scripts/mod/devicetable-offsets.s
GEN scripts/mod/devicetable-offsets.h
HOSTCC scripts/mod/file2alias.o
HOSTLD scripts/mod/modpost
DTC arch/arm/boot/dts/amlogic/meson8_skt.dtb
rm /home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/arch/arm/boot/dts/amlogic/meson8_skt.dts
-rw-r--r-- 1 jaufranc jaufranc 11244948 Mar 10 11:48 ./m8boot.img
m8boot.img done

If you want to get the U-boot code, it’s not quite as messy. You just need to download and extract two files:

wget http://openlinux.amlogic.com:8000/download/ARM/u-boot/uboot-2014-03-06-323515c056.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/u-boot/aml_uboot_customer-2014-03-06-09887e87b4.tar.gz
tar xvf uboot-2014-03-06-323515c056.tar.gz
cd uboot-next
tar xvf ../aml_uboot_customer-2014-03-06-09887e87b4.tar.gz
mv aml_uboot_customer-next/ customer

Then just select a board in customer/board/ to build U-boot for your hardware. For example:

make m8_k03_M102_v1_config CROSS_COMPILE=arm-linux-gnueabihf-
make CROSS_COMPILE=arm-linux-gnueabihf- -j8

The build failed for me, but I may need to use another compiler, e.g. arm-none-eabi-gcc.

[Update: arm-none-eabi-gcc does seem to go further, but you'll also need an ARC compiler as shown in my previous Amlogic U-boot build instructions].

Thanks to M][sko for the tip.

Raspberry Pi Gets Open Source 3D Graphics Drivers and Documentation

March 1st, 2014 2 comments

The Raspberry Pi was launched 2 years ago, and for its birthday, Broadcom decided to release documentation and an open source OpenGL ES 1.1 and 2.0 driver for the VideoCore IV GPU. You may remember the Raspberry Pi Foundation already released an open source GPU driver in 2012, but this was only for the part running on the ARM11 core of the Broadcom BCM2835 SoC, which is just a few hundred lines of code long, and communicates with a binary blob which does all the work in the GPU itself. This new release, however, goes much further with a 111 page document entitled “VideoCore IV 3D Architecture Reference Guide”, and an open source driver for the 3D system of the GPU.

VideoCore IV 3D System Block Diagram

Strangely, the release is not for BCM2835, but for BCM21553 instead. Broadcom clearly has the source for BCM2835 too, so this must have been done for legal reasons. VideoCore IV packs a lot of features: 2D and 3D graphics, a Video Processing Unit (with video codecs), an ISP (Image Signal Processor) used by the camera, and probably a few other bits, but only the 3D part has been released, which is already a great achievement. The VPU code will never be released because the MPEG LA would not allow this, as they would like to keep on receiving their codec royalties.

That means the drivers, released under a BSD licence, will need to be ported to BCM2835, something that “should be reasonably straightforward”, but is still hard enough for the Raspberry Pi Foundation to offer a $10,000 bounty to the first person who can port Broadcom’s VideoCore drivers to run on the Pi, and demonstrate Quake III running smoothly with the open source drivers. My take is you may even land a job if you manage that. I’ll give you a head start by mentioning you’ll need to change the registers’ base address :p.

VideoCore_IV_GPU_Base_Address

Eben Upton also mentions it should be possible to “write general-purpose code leveraging the GPU compute capability on VideoCore devices”.

Intel Bay Trail Graphics Overview – FOSDEM 2014

February 17th, 2014 No comments

Bay Trail SoCs are new low power Intel ICs for tablets (Bay Trail-T, Z3000 series), mobiles (Bay Trail-M, N2800, N2900 and N3500 series), desktops (Bay Trail-D, J1800, J1900 and J2900 series) and embedded / industrial platforms (Bay Trail-I, E3800 series). Many Atom processors used to feature PowerVR GPUs, but these have now been replaced by Intel HD graphics in Bay Trail SoCs.

Z3700_Series_Block_Diagram

Jesse Barnes, who works at Intel on software and drivers for Intel graphics devices, gives a presentation about Bay Trail SoCs with a focus on graphics. After an overview, and some ARM bashing regarding performance (Nvidia Tegra 4 and Qualcomm Snapdragon 800) and even power consumption (Tegra 4 only), he describes further details about the Intel HD graphics found in the new Intel processors. Everything is basically in mainline, and you’ll need Linux 3.10 or greater, Mesa 9.2 or greater, and libva 1.2.1 or greater for proper support. Some initial GPU benchmarks showed somewhat disappointing results, but in this talk, Jesse explains that some parts of the drivers still need performance improvements, as they only run at half the expected speed. VP8 support (hardware decode?) is a work in progress.
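The minimum versions mentioned in the talk are easy to check with a small version comparison helper. The sketch below relies on `sort -V`, a GNU coreutils extension. `version_ge` and `check` are names introduced here, and the “installed” versions passed in are made-up sample values rather than readings from a real system.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B.
# Relies on "sort -V" (version sort), a GNU coreutils extension.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check NAME INSTALLED REQUIRED: report whether INSTALLED is new enough.
check() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: ok"
    else
        echo "$1 $2: too old (need $3)"
    fi
}

# Required versions come from the talk; installed versions are SAMPLE values.
check linux 3.10.24 3.10
check mesa 9.1 9.2
check libva 1.2.1 1.2.1
```

On a real machine the installed versions would come from commands such as `uname -r`, with the output trimmed down to the numeric part before comparison.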

He also mentioned available hardware platforms, with Windows 8.x tablets already on the market and Android tablets becoming available later this year, except in China where you can already buy Bay Trail Android tablets…

Presentation slides are not available.
