Archive

Posts Tagged ‘gpu’

$192 Nvidia Jetson TK1 Development Board with Tegra K1 Quad Core Cortex A15 SoC

March 26th, 2014 10 comments

Nvidia has just unveiled the Jetson TK1 development kit powered by their 32-bit Tegra K1 quad core Cortex A15 processor with a 192-core Kepler GPU. This board targets computer-vision applications for the robotics, medical, avionics, and automotive industries that can leverage the compute capabilities of the Kepler GPU.

Jetson TK1 Development Board

Jetson TK1 devkit specifications:

  • SoC – Nvidia Tegra K1 SoC with 4-Plus-1 quad-core ARM Cortex A15 CPU, and Kepler GPU with 192 CUDA cores (Model T124)
  • System Memory – 2 GB x16 memory with 64 bit width
  • Storage – 16 GB eMMC 4.51 flash, SATA data + power ports, full size SD/MMC slot, and 4 MB SPI boot flash
  • Video Output – HDMI port
  • Audio – ALC5639 Realtek Audio codec with Mic in and Line out
  • Connectivity – RTL8111GS Realtek GigE LAN
  • USB – 1x USB 2.0 OTG port (micro-AB), 1x USB 3.0 port (type A)
  • Debugging – RS232 serial port, JTAG
  • Expansion
    • 1x Half mini-PCIE slot
    • Expansion port with access to DP/LVDS, touch SPI, 1×4 + 1×1 CSI-2, GPIOs, UART, HSIC, I2C
  • Sensor – TMP451 temperature monitor
  • Misc – Power, reset and recovery buttons, power and network LEDs, fan header
  • Power – AMS AS3722 Power Management IC for power and sequencing.
  • Dimensions – 12.7×12.7 cm

The complete development kit includes Jetson TK1 development board (Model PM375), an AC adapter with power cord, a USB Micro-B to USB A adapter, and a Quick Start Guide.

Nvidia Jetson TK1 Development Board Block Diagram

The company provides Linux for Tegra K1, CUDA Toolkit and Accelerated Libraries (CUDA 6.0 / OpenCV4Tegra), CUDA sample code, as well as the board specifications, schematics (PDF) and mechanical design files (STP), all of which can be accessed via the board support page. The Linux kernel version is 3.10.24, and it comes with support for OpenGL ES 2.0, OpenGL ES 1.1, OpenGL ES path extensions, EGL 1.4 with EGLImage media APIs, and X11. Nvidia also provides support for OpenGL 4.4.

The development kit is available for pre-order for $192 via Nvidia’s developer Jetson TK1 page. Shipping is scheduled for April. The downside is that it will only ship to the US, Canada, Puerto Rico, and the Virgin Islands. If you live in Europe, you can pre-order from Avionic Design, SECO and Zotac, and in Japan, you can go through Ryoyo Electro Corporation.

Via Google+ mini PC community and Sanders.

OpenCL Accelerated SQL Database with ARM Mali GPU Compute Capabilities

March 20th, 2014 5 comments

We’ve previously seen that GPU compute on ARM can improve performance for mobile, automotive and consumer electronics applications. GPU compute offloads CPU tasks that can be parallelized to the GPU using APIs such as OpenCL or RenderScript. Most applications that can leverage GPU compute are related to media processing (video decoding, picture processing, audio decoding, image recognition, etc…), but one thing I did not suspect could be improved is database access. That’s what Tom Gall of Linaro has achieved in a side project, using OpenCL to accelerate SQLite database operations by around 4 times for a given benchmark.

SQLite Architecture and “Attack Point” for OpenCL Implementation

The hardware used was a Samsung Chromebook with an Exynos 5250 SoC featuring a dual core Cortex A15 processor and an ARM Mali T604 GPU. GPU compute is only possible on ARM Mali T6xx and greater, and won’t work on Mali 400 / 450 GPUs. Other GPU vendors such as Vivante and Imagination Technologies also support GPU compute in their latest processors.

As a first implementation, he added an API to SQLite, but eventually the code may be merged inside SQLite, as it would then also accelerate existing applications using SQLite. This type of acceleration will work best with large tables and parallel tasks. For benchmarking purposes, Tom used a 100,000 row database with 7 columns and ran the same query (select * from testdb) using the SQLite C API and his OpenCL accelerated API. Here are the results:

  • SQLite C API – 420.274 milliseconds
  • OpenCL accelerated SQLite API – 110.289 milliseconds

The first test ran fully on the Cortex A15 cores @ 1.7 GHz, whereas the OpenCL test mostly ran on the Mali-T604 GPU clocked at 533 MHz (TBC). The time includes both the running of the OpenCL kernel and the data transfer from the result buffer.
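To make the “around 4 times” figure concrete, the ratio can be computed from the two timings listed above. This is only a small arithmetic sketch; the speedup helper is a name made up for illustration and is not part of Tom's code.

```shell
# Timings reported in the benchmark above, in milliseconds.
sqlite_ms=420.274
opencl_ms=110.289

# speedup BASELINE ACCELERATED: print BASELINE / ACCELERATED with two decimals.
# (Hypothetical helper, only used for this illustration.)
speedup() {
    awk -v base="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", base / fast }'
}

speedup "$sqlite_ms" "$opencl_ms"   # prints 3.81
```

So the OpenCL path was roughly 3.8 times faster on this particular query.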

More work is needed, but that seems like an interesting application for GPU compute in some use cases. I would expect to see no gain for queries performed on small tables, for example. The modified OpenCL code does not appear to be available right now, but you may want to read the GPGPU on ARM presentation at Linaro Connect Asia 2014 for a few more details about the implementation, and if you want to play around with OpenCL 1.1 (or OpenGL ES) in Linux on a Chromebook, you can follow those instructions.

Mali-400 GPU Is Now Working in Linux for Rockchip RK3188 Devices

March 14th, 2014 19 comments

Accelerated 3D graphics in Linux with Mali-400 via OpenGL ES has been possible for nearly a year on RK3066 devices, but there was no such support for RK3188. This week however, both Naoki FUKAUMI and omegamoon have reported OpenGL ES working on their respective RK3188 devices. I don’t know which device omegamoon used, but Naoki did so on the Radxa Rock, and even posted instructions to build it yourself.

es2gears OpenGL ES demo on Rockchip RK3188

They’ve mostly followed the work done by olegk0 for Rockchip, and the Mali driver build instructions provided by the linux-sunxi community, and it can be summarized in 3 main steps:

  1. Cross-compile drm.ko, mali_drm.ko, ump.ko, and mali.ko on a Linux machine.
  2. Copy the four modules to your RK3188 based board or device, and load them.
  3. Install dependencies and binary Mali drivers from linux-sunxi on your Rockchip device.
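Step 2 could be scripted along the following lines. This is only a sketch: the MODULE_DIR path and the load_mali_modules helper are hypothetical, and the load order (ump before mali, drm before mali_drm) is my assumption based on the usual dependencies between these modules, not something taken from the instructions above. By default the script only prints the insmod commands.

```shell
# Hypothetical directory where the four .ko files were copied on the board.
MODULE_DIR="${MODULE_DIR:-/tmp/modules}"

# Default is a dry run that only prints the commands.
# Set RUN=insmod (and run as root) to actually load the modules.
RUN="${RUN:-echo insmod}"

# Load the modules one by one in the assumed dependency order,
# stopping at the first failure.
load_mali_modules() {
    for mod in ump mali drm mali_drm; do
        $RUN "$MODULE_DIR/$mod.ko" || return 1
    done
}

load_mali_modules
```

After loading for real, lsmod should list the four modules.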

Once this is done you can try some OpenGL ES demos such as es2gears or glmark2-es2 to test it with the framebuffer. es2gears can be installed with “sudo apt-get install mesa-utils-extra” and glmark2-es2 with “sudo apt-get install glmark2-es2”.

I had a quick try this morning: the build worked and the four modules could be loaded, but es2gears was still rendered in software:

libEGL warning: DRI2: failed to authenticate
XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":0.0"
      after 184 requests (171 known processed) with 0 events remaining.
EGL_VERSION = 1.4 (DRI2)

I used a different toolchain, and kernel source, so this may be the reason. TBC. A successful es2gears output should look like:

EGL_VERSION = 1.4 Linux-r3p2-01rel2
 vertex shader info:
 fragment shader info:
 info:
 2064 frames in 5.0 seconds = 412.635 FPS
 2129 frames in 5.0 seconds = 425.630 FPS

and glmark2-es2:

=======================================================
    glmark2 2012.08
=======================================================
    OpenGL Information
    GL_VENDOR:     ARM
    GL_RENDERER:   Mali-400 MP
    GL_VERSION:    OpenGL ES 2.0
=======================================================

Further steps would be to enable X11 to use Mali, but I’m not sure this has been tried just yet.

Nevertheless, that should mean you can soon expect Linux images with support for accelerated 3D graphics for your Rockchip RK3188 device. This does not mean however that hardware video decoding will be possible, as the Mali-400 GPU is not a VPU and does not support decoding/encoding. There is, however, a separate effort to bring hardware video decoding support to RK3188, but this should take much more time.

In other news, the linux-rockchip community has just started a mailing list, so you may want to join if you are interested in software development on Rockchip devices for Linux and Android.

Amlogic GPL Source Code Release – Kernel 3.10, U-Boot, and Drivers (Wi-Fi, NAND, TVIN, Mali GPU)

March 10th, 2014 13 comments

Last month, I noticed Amlogic provided links to the Android SDK for S802 / M802 on their open source website, but the only way to get the source was to share your SSH public key with Amlogic, so that they could give you access. It did not happen, but the company has released the source for Linux 3.10.10, U-boot 2011.03, Realtek and Broadcom Wi-Fi drivers, NAND drivers, “TVIN” drivers, and kernel space GPU drivers for Mali-400 / 450 GPU. There are also some customer board files for Meson 6 only (AML8726-MX / M6) but they do not seem to match the kernel…

If you want to build the kernel, including the drivers, you’ll need to download a bunch of files:

wget http://openlinux.amlogic.com:8000/download/ARM/kernel/arm-src-kernel-2014-03-06-d5d0557b2b.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/wifi/rtk8192du-2014-03-06-7f70d95d29.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/wifi/rtk8192eu-2014-03-06-9766866350.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/wifi/rtk8192cu-2014-03-06-54bde7d73d.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/wifi/rtk8188eu-2014-03-06-2462231f02.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/wifi/brcmap6xxx-2014-03-06-302aca1a31.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/wifi/wifi-fw-2014-03-06-d3b2263640.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/modules/aml_tvin-2014-03-06-fb3ba6b1c8.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/modules/aml_nand-2014-03-06-39095c4296.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/customer/aml_customer-2014-03-06-76ce689191.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/gpu/gpu-2014-03-06-0425a1f681.tar.gz

You’ll need to extract these tarballs in specific directories:

tar xvf arm-src-kernel-2014-03-06-d5d0557b2b.tar.gz
mkdir -p hardware/amlogic/
mkdir -p hardware/wifi/realtek/drivers
mkdir -p hardware/wifi/broadcom/drivers
mkdir -p hardware/arm/
cd hardware/amlogic
tar xvf ../../aml_nand-2014-03-06-39095c4296.tar.gz
mv aml_nand-amlogic-nand nand
cd ../wifi/realtek/drivers
tar xvf ../../../../rtk8192du-2014-03-06-7f70d95d29.tar.gz
tar xvf ../../../../rtk8192eu-2014-03-06-9766866350.tar.gz
tar xvf ../../../../rtk8192cu-2014-03-06-54bde7d73d.tar.gz 
tar xvf ../../../../rtk8188eu-2014-03-06-2462231f02.tar.gz
mv rtk8188eu-8188eu 8188eu
mv rtk8192du-8192du 8192du
mv rtk8192cu-8192cu 8192cu
mv rtk8192eu-8192eu 8192eu
cd ../../broadcom/drivers
tar xvf ../../../../brcmap6xxx-2014-03-06-302aca1a31.tar.gz
mv brcmap6xxx-ap6xxx ap6xxx
cd ../../../arm
tar xvf ../../gpu-2014-03-06-0425a1f681.tar.gz
mv gpu-r3p2-01rel3 gpu
cd ..
tar xvf ../aml_tvin-2014-03-06-fb3ba6b1c8.tar.gz
mv aml_tvin-amlogic-3.10-bringup tvin

You can also extract the customer file into the kernel directory to add some drivers. As I said above, I’m not sure the source code inside matches the Linux kernel 3.10.10, because there’s no device tree file for the boards. In arch/arm/plat-meson/Kconfig, there are (commented out) references to customer/meson/dt/Kconfig and customer/drivers/Kconfig. The device tree is not available, but the drivers are, so you could give it a try in order to build the touchscreen and sensor drivers available in the customer tarball:

cd ../linux-amlogic-3.10-bringup
tar xvf ../aml_customer-2014-03-06-76ce689191.tar.gz 
mv aml_customer-master customer

Finally, the development tree is ready to build the kernel. There must surely be a script somewhere to do that… I haven’t used the file wifi-fw-2014-03-06-d3b2263640.tar.gz, as the kernel did not complain about it, and it looks like it’s just for Android KitKat. There are four scripts to build the kernel: mk_m6.sh, mk_m6tv, mk_m6_tvd.sh, and mk_m8.sh. The first three are for meson6 (dual core processor), and the last one is for meson8 (quad core S802/M802).

Let’s go with M8 build:

make ARCH=arm meson8_defconfig
./mk_m8.sh

Please note that I had to change mk_m8.sh, as it would just make my computer hang, requiring a hard reset. The culprit was the line:

make uImage -j

The manpage indicates “If the -j option is given without an argument, make will not limit the number of jobs that can run simultaneously”. That does not seem like a good idea, so I changed it to:

make uImage -j8
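Rather than hard-coding 8, the job count could be derived from the number of CPU cores on the build machine. The sketch below assumes nproc from GNU coreutils is installed; jobs_for_make is a hypothetical helper name, and the echo only prints the resulting command so nothing is built by accident.

```shell
# jobs_for_make: print the number of parallel jobs to pass to make.
# Uses nproc (GNU coreutils) when available, otherwise falls back to 1.
jobs_for_make() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        echo 1
    fi
}

# Print the command; drop the echo to run the build for real.
echo "make uImage -j$(jobs_for_make)"
```

In mk_m8.sh, this would amount to replacing the bare -j with -j"$(nproc)".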

Upon a successful build, the end of the log should look like:

UIMAGE arch/arm/boot/uImage
Image Name: Linux-3.10.10
Created: Mon Mar 10 11:48:52 2014
Image Type: ARM Linux Kernel Image (lzo compressed)
Data Size: 7099978 Bytes = 6933.57 kB = 6.77 MB
Load Address: 00008000
Entry Point: 00008000
Image arch/arm/boot/uImage is ready
/home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/scripts/amlogic/aml2dtb.sh /home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/arch/arm/boot/dts/amlogic/meson8_skt.dtd
DTD_FILE: /home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/arch/arm/boot/dts/amlogic/meson8_skt.dtd
the middle dts file: /home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/arch/arm/boot/dts/amlogic/meson8_skt.dts
process file /home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/arch/arm/boot/dts/amlogic/meson8_skt.dts start
processing... please wait...
process file /home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/arch/arm/boot/dts/amlogic/meson8_skt.dts end

CC scripts/mod/devicetable-offsets.s
GEN scripts/mod/devicetable-offsets.h
HOSTCC scripts/mod/file2alias.o
HOSTLD scripts/mod/modpost
DTC arch/arm/boot/dts/amlogic/meson8_skt.dtb
rm /home/jaufranc/edev/AMLogic/s802/linux-amlogic-3.10-bringup/arch/arm/boot/dts/amlogic/meson8_skt.dts
-rw-r--r-- 1 jaufranc jaufranc 11244948 Mar 10 11:48 ./m8boot.img
m8boot.img done

If you want to get the U-boot code, it’s not quite as messy; you just need to download and extract two files:

wget http://openlinux.amlogic.com:8000/download/ARM/u-boot/uboot-2014-03-06-323515c056.tar.gz
wget http://openlinux.amlogic.com:8000/download/ARM/u-boot/aml_uboot_customer-2014-03-06-09887e87b4.tar.gz
tar xvf uboot-2014-03-06-323515c056.tar.gz
cd uboot-next
tar xvf ../aml_uboot_customer-2014-03-06-09887e87b4.tar.gz
mv aml_uboot_customer-next/ customer

Then just select a board in customer/board/ to build U-boot for your hardware. For example:

make m8_k03_M102_v1_config CROSS_COMPILE=arm-linux-gnueabihf-
make CROSS_COMPILE=arm-linux-gnueabihf- -j8

The build failed for me, but I may need to use another compiler, e.g. arm-none-eabi-gcc.

[Update: arm-none-eabi-gcc does seem to go further, but you'll also need an arc compiler as shown in my previous Amlogic U-boot build instructions].

Thanks to M][sko for the tip.

Raspberry Pi Gets Open Source 3D Graphics Drivers and Documentation

March 1st, 2014 2 comments

The Raspberry Pi was launched 2 years ago, and for its birthday, Broadcom decided to release documentation and an open source OpenGL ES 1.1 and 2.0 driver for the VideoCore IV GPU. You may remember the Raspberry Pi Foundation already released an open source GPU driver in 2012, but this was only for the part running on the ARM11 core of the Broadcom BCM2835 SoC, which is just a few hundred lines of code long, and communicates with a binary blob which does all the work in the GPU itself. This new release however goes much further with a 111 page document entitled “VideoCore IV 3D Architecture Reference Guide”, and an open source driver for the 3D system of the GPU.

VideoCore IV 3D System Block Diagram

Strangely, the release is not for BCM2835, but for BCM21553. Broadcom clearly has the source for BCM2835 too, so this must have been done for legal reasons. VideoCore IV packs a lot of features: 2D and 3D graphics, a Video Processing Unit (with video codecs), an ISP (Image Signal Processor) used by the camera, and probably a few other bits, but only the 3D part has been released, which is already a great achievement. The VPU code will never be released because the MPEG LA would not allow this, as they would like to keep on receiving their codec royalties.

That means the drivers, released under a BSD licence, will need to be ported to BCM2835, something that “should be reasonably straightforward”, but is still hard enough for the Raspberry Pi Foundation to offer a $10,000 bounty to the first person who can port Broadcom’s VideoCore drivers to run on the Pi, and demonstrate Quake III running smoothly with the open source drivers. My take is you may even land a job if you manage that. I’ll give you a head start by mentioning you’ll need to change the registers’ base address :p.

VideoCore IV GPU Base Address

Eben Upton also mentions it should be possible to “write general-purpose code leveraging the GPU compute capability on VideoCore devices”.

Intel Bay Trail Graphics Overview – FOSDEM 2014

February 17th, 2014 No comments

Bay Trail SoCs are new low power Intel ICs for tablets (Bay Trail-T, Z3000 series), mobiles (Bay Trail-M, N2800, N2900 and N3500 series), desktops (Bay Trail-D, J1800, J1900 and J2900 series) and embedded / industrial platforms (Bay Trail-I, E3800 series). Many Atom processors used to feature a PowerVR GPU, but it has now been replaced by Intel HD graphics in Bay Trail SoCs.

Z3700 Series Block Diagram

Jesse Barnes, working at Intel on software and drivers for Intel graphics devices, gives a presentation about Bay Trail SoCs with a focus on graphics. After an overview, and some ARM bashing regarding performance (Nvidia Tegra 4 and Qualcomm Snapdragon 800), and even power consumption (Tegra 4 only), he describes further details about the Intel HD graphics found in the new Intel processors. Everything is basically in mainline, and you’ll need Linux 3.10 or greater, Mesa 9.2 or greater, and libva 1.2.1 or greater for proper support. Some initial GPU benchmarks showed somewhat disappointing results, but in this talk, Jesse explains that some parts of the drivers still need performance improvements, as they only run at half the expected speed. VP8 support (hardware decode?) is a work in progress.
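The minimum versions mentioned in the talk can be checked with a small version comparison. The sketch below is an illustration only: version_ge is a hypothetical helper, it relies on sort -V from GNU coreutils, and only the kernel version is read from the running system; Mesa and libva versions would have to be obtained from your distribution's package manager and passed in the same way.

```shell
# version_ge HAVE WANT: succeed when version HAVE is greater than or equal to WANT.
# Relies on sort -V (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Kernel version without any local suffix, e.g. "3.10.24-foo" becomes "3.10.24".
kernel="$(uname -r | cut -d- -f1)"

if version_ge "$kernel" 3.10; then
    echo "kernel $kernel: recent enough"
else
    echo "kernel $kernel: older than 3.10"
fi
```

For example, version_ge 9.2 9.2 succeeds, while version_ge 1.2.0 1.2.1 fails.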

He also mentioned the available hardware platforms, with Windows 8.x tablets already on the market, and Android tablets becoming available later this year, except in China where you can already buy Bay Trail Android tablets…

Presentation slides are not available.

ARM Unveils Cortex A17 Processor, First Used in Mediatek MT6595 and Rockchip RK3288 SoCs

February 11th, 2014 14 comments

So all those ARM Cortex A17 marketing materials for Rockchip RK3288 were not typos after all. Rockchip’s marketing team may simply not have received the memo reading “Confidential”, as ARM has now officially announced the Cortex A17 processor, based on the ARMv7-A architecture, with support for big.LITTLE with Cortex A7, and which can be coupled with the Mali-T720 mid-range GPU and Mali-V500 VPU.

After Cortex A15 and Cortex A12, you may wonder “Why? But why did ARM have to launch yet another new core?”. Here’s the company’s answer to that question:

The Cortex-A17 processor offers 60% performance uplift over the Cortex-A9 processor, the current leader in mid-range mobile market, and betters the best efficiency enabling optimized solutions to address existing and new products. The Cortex-A17 processor is based on the popular ARMv7-A architecture, today’s most successful architecture in the mobile market. With over 1M apps supporting the ARMv7-A architecture, the Cortex-A17 processor is primed to bring the high-end performance levels of 2014 to next generation mid-range devices in 2015, with further increased efficiency for enabling a better user experience.

The Cortex-A17 processor is scalable up to 4 cores, each offering a full out-of-order pipeline delivering peak performance of today’s premium performance levels. A fully integrated, low-latency L2 cache controller, accelerator interfaces to target specific use cases, and high-throughput AMBA 4 ACE Coherent Bus Interface enable the Cortex-A17 processor to be tailored for the right task. Its modern design is best complemented with the latest advanced IP like ARM Mali-T720 GPU, ARM Mali-DP500 DPU, and ARM Mali-V500 VPU and CoreLink CCI-400, but is also fully backwards compatible to existing AMBA3 and AMBA4 AXI based systems based on ARM Mali-450 GPU and CoreLink NIC-400 to ease integration and time-to-market.

The Cortex-A17 processor, in combination with its high-efficiency counterpart Cortex-A7 processor, provides an ideal solution for mobile devices in 2015 and beyond, bringing the heterogeneous processing benefits of big.LITTLE Global Task Switching (GTS) to the mid-range market. Coupled with CoreLink System IP components like the CoreLink CCI-400 interconnect, Cortex-A17 and Cortex-A7 processors are the foundation for upcoming devices that are more efficient and higher performance than any solution in this class before.

I’m not quite satisfied by that answer… There is no comparison with Cortex A12 or A15. I’m guessing they’ve probably made some power consumption improvements compared to Cortex A15, which dissipates a lot of heat. Looking at the different cores’ specifications on ARM’s website, I do not find meaningful differences. Nevertheless, Cortex A17 will be used in SoCs for mobile devices (smartphones and tablets), smart TVs, over-the-top devices, automotive infotainment, and other consumer oriented markets, and seems to be positioned as a Cortex A9 replacement.

The first SoC officially announced with Cortex A17 is Mediatek MT6595 (which somehow will become MTK6595 on many sites), an octa-core mobile SoC with 4x Cortex A17 and 4x Cortex A7 in a big.LITTLE configuration fully supporting Heterogeneous Multi-Processing (HMP), with an Imagination PowerVR Series6 GPU, H.265 UHD decoding and encoding support, and an LTE modem. Mediatek MT6595 will be commercially available in H1 2014, with devices expected in H2 2014. The other SoC with Cortex A17, and probably a Mali-T720 GPU, should be Rockchip RK3288, following the unintended leak at CES 2014…

You can find a few more details on ARM’s Cortex A17 page.
