Archive

Posts Tagged ‘gpu’

ARM announces “premium IP” for VR and AR with Cortex-A73 Processor and Mali-G71 GPU

May 30th, 2016 3 comments

Today ARM has revealed the first details of its latest mobile processor and GPU, both said to be optimized for VR (Virtual Reality) and AR (Augmented Reality) applications.

Starting with the ARM Cortex-A73, we’re looking at an evolution of the current Cortex-A72, with ARM claiming 30 percent higher “sustained” performance than the Cortex-A72 and more than twice the performance of the Cortex-A57. ARM is already talking about clock speeds of up to 2.8GHz in mobile devices. Other improvements include an increase to 64KB for both the L1 instruction and data caches, up from 48KB and 32KB respectively on the Cortex-A72, as well as up to 8MB of L2 cache.

The Cortex-A73 continues to support ARM’s big.LITTLE CPU design in combination with the Cortex-A53 or the Cortex-A35. It’s also the first ARM core designed to be built on 10nm FinFET technology, and it should be an extremely small CPU at around 0.65 square millimeters per core, a 46 percent shrink from the Cortex-A72. By moving to 10nm FinFET, ARM is also promising power efficiency gains of up to 20 percent over the Cortex-A72.

Cortex A53 vs A72 vs A73

The Mali-G71 GPU takes things even further, as ARM is promising a 50 percent increase in graphics performance, a 20 percent improvement in power efficiency, and 40 percent more performance per square millimeter over its previous generation of GPUs. To accomplish this, ARM has designed the Mali-G71 to support up to 32 shader cores, twice as many as the Mali-T880, and claims this will enable the Mali-G71 to beat “many discrete GPUs found in today’s mid-range laptops”. We’d take this statement with a grain of salt, as it takes more than raw computing performance to make a good GPU, which is why so few companies are still designing their own GPUs. As with the Cortex-A73, the Mali-G71 is optimized for 10nm FinFET manufacturing technology.

As always with ARM Mali GPUs, performance depends on the partner implementation, and the Mali-G71 supports designs with as few as one shader core. Looking at most current mobile GPU implementations, we’d expect most of ARM’s partners to go with a 4 to 8 shader core implementation to keep their silicon cost at a manageable level. That said, we might get to see one or two higher-end implementations, as ARM has already gotten the likes of Samsung, MediaTek, Marvell and HiSilicon interested in its latest GPU.


With a big move towards VR and AR, it’s also likely that ARM partners are going to have to move to a more powerful GPU to be able to deliver the kind of content that will be expected in these market segments. According to the press release, ARM has already gotten Epic Games and Unity Technologies interested in supporting its latest GPU.

Devices using the new ARM Cortex-A73 and Mali-G71 are expected sometime in 2017, so there’s quite a gap between the announcement and the availability of actual silicon, but with HiSilicon, Marvell, MediaTek, Samsung Electronics and others having already licensed the Cortex-A73 IP, at least it means we have something to look forward to next year. You can find more details on the ARM Cortex-A73 and Mali-G71 product pages, as well as on the ARM community’s blog.

PowerVR GT7200 Plus and GT7400 Plus GPUs Support OpenCL 2.0, Better Computer Vision Features

January 7th, 2016 2 comments

Imagination Technologies introduced the PowerVR Series7XT GPU family with up to 512 cores at the end of 2014, and at CES 2016 the company announced the Series7XT Plus family with GT7200 Plus and GT7400 Plus GPUs. They keep many of the features of the Series7XT family, and add OpenCL 2.0 API support as well as computer vision improvements through a new Image Processing Data Master and support for 8-bit and 16-bit integer data paths, instead of just 32-bit in the previous generation, leading for example to up to 4 times more performance for applications, such as deep learning, that leverage the OpenVX computer vision API.

Block Diagram

The GT7200 Plus GPU features 64 ALU cores in two clusters, while the GT7400 Plus features 128 ALU cores in a quad-cluster configuration. Besides OpenCL 2.0 and the computer vision improvements, they still support OpenGL ES 3.2, Vulkan, hardware virtualization, advanced security, and more. The company has also made some microarchitectural enhancements to improve performance and reduce power consumption:

  • Support for the latest bus interface features including requestor priority support
  • Doubled memory burst sizes, matching the latest system fabrics, memory controllers and memory components
  • Tuned the size of caches and improved their efficiency, leading to a ~10% reduction in bandwidth

The new features and improvements of the PowerVR Series7XT Plus GPUs should help designers build better systems for image classification, face/body/gesture tracking, smart video surveillance, HDR rendering, advanced driver assistance systems (ADAS), object and scene reconstruction, augmented reality, visual inspection, robotics, and more.

You can find more details on Imagination Tech Blog.

Maxsun MS-GTX960 Nvidia GTX960 Graphics Card Unboxing and Installation

December 24th, 2015 11 comments

When I wrote an article about H.265 and VP9 video encoding, I noticed that only second generation Maxwell Nvidia graphics cards support H.265 decoding (up to 500 fps) and HDMI 2.0 output, a few weeks after purchasing a first generation Nvidia GTX750 GPU… So when GearBest contacted me about graphics card reviews, I said I would be interested in an HDMI 2.0 and H.265 capable graphics card, which meant I had to get a card with an Nvidia GM20x chip, the cheapest option being the GTX960. The company agreed to send me a Maxsun MS-GTX960 graphics card matching my requirements for $240.04. I won’t use it for gaming at all; instead I plan to use the card to evaluate Kodi 16.x 4K H.265 and VP9 support and compare video performance against the cheap and low power Amlogic S905 TV boxes on the market, as well as try out H.265 video encoding, which should speed up the process by up to 50 times compared to software-only encoding. But first, I’ll show a few pictures of the card, and an installation process that is a little different from lower-end cards.
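
For the encoding comparison, here’s a rough sketch of what I have in mind, assuming an ffmpeg build with NVENC support (the H.265 encoder may be named nvenc_hevc instead of hevc_nvenc in older ffmpeg releases), and with purely hypothetical file names:

# Hardware H.265/HEVC encode using the GTX960's NVENC block
ffmpeg -i input_1080p.mp4 -c:v hevc_nvenc -b:v 4M -c:a copy output_nvenc.mkv

# Software-only x265 encode of the same file, as the baseline for the speed comparison
ffmpeg -i input_1080p.mp4 -c:v libx265 -b:v 4M -c:a copy output_x265.mkv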

Maxsun MS-GTX960 Unboxing

I received the box via DHL, and was surprised by the rather large size of the package, and that I did not have to pay any customs duties for this type of item…

The card comes with 2GB GDDR5 RAM.

The graphics card does look quite large and includes two cooling fans.


The card has four video outputs: HDMI 2.0, DisplayPort, and two DVI ports.

There’s also a DVD or CD-ROM included with the graphics card, but I did not check it out, as the latest drivers are usually available online.

Maxsun MS-GTX960 Graphics Card Installation

This is what my previous Zotac GTX750 graphics card looks like when installed in my PC.

I took it out, and comparing it to the GTX960, I was worried the new card would not fit due to its much greater length.

While there are more ports, there’s no VGA output, so I’ll have to find a DVI cable for my secondary display. Not a big deal.

I was relieved when I realized the card would indeed fit into my computer, albeit it’s now a pretty tight fit next to my hard drive.

I also noticed a 6-pin connector on the top of the card, and after a Google search, I found it provides the extra power required by this type of card; luckily, my power supply had this type of connector.


All good. I secured the card with a screw, put everything back together, and since I was upgrading from another Nvidia graphics card, the new card was automatically recognized in Ubuntu 14.04 and worked out of the box.

I like it when everything goes smoothly :).
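
If you want to double-check which driver is actually in use after such a swap, a couple of generic commands (nothing specific to this particular card assumed) will confirm it:

# Show the kernel driver currently bound to the graphics card
lspci -k | grep -A 3 -i vga

# With the proprietary Nvidia driver installed, this reports the GPU model and driver version
nvidia-smi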

Merry Christmas to all!!!

Categories: Graphics, Hardware, Linux, Ubuntu Tags: gpu, h.265, hdmi, nvidia

Amlogic S912 Processor Could Feature an ARM Mali-T830 GPU

November 20th, 2015 4 comments

Amlogic S912 launch has been delayed by a few months, and if we are to believe the data from GFXBench (test 1; test 2), the reason could be that they switched the design from a Mali-T7xx GPU to a more powerful Mali-T830 GPU.


The S912 processor is still based on four Cortex-A53 cores @ up to 2.0 GHz like the S905, but the GPU will be much more powerful.

One person noticed these results and wrote an analysis and comparison (in Korean) against the Mali-T860MP2 GPU found in the Mediatek MT6755 (Helio P10) SoC.

Offscreen   Manhattan 3.1   Manhattan   T-Rex   ALU     ALU2   Fillrate   Texturing
MT6755      4.8             7.2         17.2    –       6.0    –          1012
S912        4.4             7.0         16.3    34.7    5.3    1283       1000

So while the Mali-T860MP2 is faster in all listed benchmarks, the advantage is not that great. His analysis, based on compared benchmarks (read the post for details), concludes that the GPU in the S912 could be clocked at around 650 MHz.

Thanks to Adriano for the tip.

Amlogic S905 Source Code Published – Linux, U-Boot, Mali-450 GPU and Other Drivers

November 19th, 2015 38 comments

Amlogic has an Open Linux website where they regularly release GPL source code, and with Amlogic S905 devices coming to market, they’ve released a few tarballs at the beginning of the month, including Linux 3.14 source code, U-Boot source code, and Mali-450MP GPU kernel source code (obviously not userspace), as well as some other drivers for WiFi, NAND flash, PMU, TVIN, etc…
Let’s get to the download links:

I quickly tried to build the Linux source. If you’ve never built a 64-bit ARM kernel or app before, you’ll first need to install the toolchain. I installed the one provided with Ubuntu 14.04:
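
On Ubuntu 14.04 that boils down to something like the command below (a sketch; the package name is the one from the standard Ubuntu repositories):

# Install the AArch64 (64-bit ARM) cross-compiler from the Ubuntu repositories
sudo apt-get install gcc-aarch64-linux-gnu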

Now extract the tarball and enter the source directory:
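
With a placeholder tarball name (adjust both the file and directory names to whatever was actually downloaded from Amlogic’s website):

# Extract the kernel source tarball and enter the source tree
# (file and directory names below are placeholders)
tar xvf amlogic_linux-3.14.tar.gz
cd linux-amlogic-3.14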

At first I had a build failure due to a missing directory, so I created it, then used the default config for Amlogic S905/S912 (found in arch/arm64/configs), before building the Linux kernel.
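
In practice that’s roughly the following, assuming the cross-compiler installed earlier; the defconfig name below is an assumption on my part, so check arch/arm64/configs for the exact file, and create whatever directory the first error complains about:

# Cross-compile for 64-bit ARM
export ARCH=arm64
export CROSS_COMPILE=aarch64-linux-gnu-

# Apply the Amlogic default configuration (name assumed, see arch/arm64/configs/)
make meson64_defconfig

# Build the kernel image with 4 parallel jobs
make -j4 Image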

and it ended well.

So that’s a good starting point for anybody wanting to work on the Android or Linux kernel…

Unrelated to Amlogic S905/Meson64, but I’ve also noticed some OpenWRT packages and a rootfs on Amlogic’s website that were released a little earlier this year. So either some people are using Amlogic Sxxx processors with OpenWRT, or Amlogic is working on a router chip that I missed. Probably the former.

Thanks to Olin.

Raspberry Pi’s VideoCore 4 GPU Driver Added to Linux Mainline in Kernel 4.4

November 17th, 2015 3 comments

While your x86 or AMD64 computer will usually boot with a mainline Linux kernel without issues, most ARM boards and devices won’t, and many of the ones that do boot only support headless mode and limited functionality. The Raspberry Pi has supported HDMI output with a simple framebuffer for a while, but a developer working on the VideoCore 4 (VC4) GPU found inside Broadcom BCM2835 and BCM2836 processors has recently submitted a patchset adding a VC4 GPU driver to Linux mainline, which should make it into Linux 4.4.


The commit message does mention some features are still missing, but it’s a start:

This pull request introduces the vc4 driver, for kernel modesetting on the Raspberry Pi (bcm2835/bcm2836 architectures). It currently supports a display plane and cursor on the HDMI output. The driver doesn’t do 3D, power management, or overlay planes yet.
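
For the adventurous, a minimal sketch of enabling the new driver once a Linux 4.4 tree is available might look like this; the configuration option name comes from the patchset, so double-check it in menuconfig before relying on it:

# Start from the upstream BCM2835 default configuration
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- bcm2835_defconfig

# Enable DRM and the new VC4 KMS driver
./scripts/config --enable CONFIG_DRM --enable CONFIG_DRM_VC4

# Build the kernel, modules and device trees
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- zImage modules dtbs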

Via Golem and Sanders.

Nvidia Tegra X1 Development Board is Finally Available… for $599

November 11th, 2015 17 comments

When Nvidia introduced the Tegra X1 octa-core processor with a 256-core Maxwell GPU at the very beginning of the year, I was expecting a Jetson TX1 board to follow suit in the next few months, but instead the company launched the Nvidia Shield Android TV box based on the processor. The company has now launched the Jetson TX1 module and development board.

Let’s check out the module first and its main specifications and features:

  • SoC – Nvidia Tegra X1 octa core processor with 4x ARM Cortex A57 cores, 4x ARM Cortex A53 cores, and a 256-core Maxwell GPU
  • System Memory – 4GB LPDDR4 (25.6 GB/s)
  • Storage – 16GB eMMC
  • Connectivity – 802.11ac 2×2 WiFi, Bluetooth ready, Gigabit Ethernet
  • Video –  4K video encode and decode
  • Camera – Support for 1400 megapixels/second
  • Dimensions – 50mm x 87mm

The module supports the Linux4Tegra operating system based on Ubuntu. Libraries and drivers to leverage the Maxwell GPU include the cuDNN CUDA-accelerated library for machine learning, the VisionWorks CUDA-accelerated OpenVX 1.1 library and framework for computer vision, graphics drivers with support for OpenGL 4.5, OpenGL ES 3.1 and Vulkan, as well as support for CUDA 7.0.

The company did not release that much information about the development board in the press release, but sent a few samples to various blogs and developers, including Kangalow of Jetsonhacks.com.

Jetson TX1 Board

The development board relies on the TX1 module for the processor, storage, memory, and wireless connectivity, and on a carrier board for I/O connectivity:

  • Video Output – HDMI
  • Storage – SATA data+power, M.2 Key E connector, SD card slot
  • Connectivity – Gigabit Ethernet (RJ45)
  • USB – USB 3.0 Type A, USB 2.0 Micro AB (supports recovery and host mode)
  • Display expansion header
  • Camera expansion header with a 5MP camera
  • Expansion – PCIe x4 slot, 40-pin somewhat Raspberry Pi compatible header, 30-pin header for extra GPIOs
  • Dimensions – Fits in mini-ITX case

Kangalow reports the fan is not active very often, with the heatsink providing enough cooling most of the time, and that performance in Ubuntu feels like that of a typical laptop.

The guys at Phoronix also got a board, and while they have not run their own benchmarks yet, they shared some provided by Nvidia themselves pitting the Tegra X1 (Linux4Tegra) against an Intel Core i7-6700K (Windows 8.1…), showing for example that graphics performance (GFXBench 3.1) is similar, but the Jetson TX1 consumes 5 times less power.

Jetson TX1 Board vs Core i7 (Skylake) Computer

The Jetson TK1 board with a 192-core GPU was $192, so you may have dreamed that the Jetson TX1 with a 256-core GPU would be $256, but it did not exactly turn out that way. The Nvidia Jetson TX1 development kit will show up for pre-order at $599 (retail) / $299 (education) on November 12 in the US, with a launch in other regions in the next few weeks. The kit will include the module and carrier board, a camera board, a heatsink and fan, and the required cables. Jetson TX1 modules will be available in Q1 2016 for about $299 per unit in 1k quantities.

ARM Introduces Mali-470 GPU for Wearables, IoT and Embedded Applications

October 21st, 2015 7 comments

The Mali-400 was announced in 2008, and has since been used in various smartphone SoCs, but it has now mostly been replaced by the Mali-450 GPU in low cost mobile and STB SoCs, although the Mali-400 is still being implemented in new SoCs such as the Rockchip RK3128 processor. ARM has been working on a lower power version of the GPU, and just unveiled the Mali-470 GPU targeting wearables, as well as embedded and IoT applications.

The Mali-470 GPU is said to use the same memory and AMBA interfaces as the Mali-400, while keeping some of the improvements brought to the Mali-450 GPU, and further lowering power consumption to just half that of the Mali-400 in terms of mW per frame per second.


Just like its predecessors, the Mali-470 supports OpenGL ES 2.0, and like the Mali-400 it will scale from 1 to 4 fragment processors, always combined with a single vertex processor. The Mali-470MP1 is likely to be used in wearables or other applications with tiny displays and low power requirements, while the Mali-470MP2 and Mali-470MP4 might also find their way into more demanding applications.

ARM expects SoCs based on Mali-470 GPU to sample by Q2 2016, meaning we’ll probably start seeing Mali-470 GPU in actual devices in 2017.

Via AnandTech

Categories: Hardware Tags: arm, embedded, gpu, IoT, wearables