Raspberry Pi 4 Benchmarks & Mini Review

Raspberry Pi 4 has just been released with many improvements over Raspberry Pi 3 Model B+ including a faster processor, a proper Gigabit Ethernet port, USB 3.0 interfaces, and 4K video support. That’s the theory, but how does it work in practice?

I can now let you know, as I’ve received a Raspberry Pi 4 sample courtesy of Cytron and ran some tests and benchmarks on the very latest board from the Raspberry Pi Foundation.

Raspberry Pi 4 Review

System Info

Before starting with the benchmarks, let’s go through some basic system info:



For reference, you’ll find Raspberry Pi 4 Linux boot log here.

Phoronix benchmarks

Let’s go ahead and install the latest version of Phoronix benchmarks:


Now let’s run the test to compare the performance of Raspberry Pi 4 model B to some other Arm Linux boards including Raspberry Pi 3 Model B.


For reference, my office has an ambient temperature of around 28 to 30°C, and I’ve monitored the CPU temperature with an IR thermometer during some of the phases:

  • Idle – 62°C
  • Phoronix benchmarks download/installation – 64°C
  • John The Ripper – 73°C

I also typed a few commands to get the system temperature and CPU clock, in this case during John the Ripper benchmark:


So the CPU is clocked down to around 1.0 GHz since proper cooling is not implemented for this type of workload, and the system automatically lowers the CPU frequency.
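
The same two readings can usually be taken from the standard Linux sysfs interface as well. The snippet below is a minimal Python sketch and not the commands used above. It assumes the kernel exposes `/sys/class/thermal/thermal_zone0/temp` (in millidegrees Celsius) and `/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq` (in kHz), which is common but not guaranteed on every image. The function names are illustrative only.

```python
from pathlib import Path

# Assumed sysfs locations: typical on Linux, but not guaranteed on every image.
TEMP_PATH = Path("/sys/class/thermal/thermal_zone0/temp")
FREQ_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq")


def parse_temp_c(raw: str) -> float:
    """Convert a sysfs temperature string (millidegrees C) to degrees C."""
    return int(raw.strip()) / 1000.0


def parse_freq_mhz(raw: str) -> float:
    """Convert a sysfs frequency string (kHz) to MHz."""
    return int(raw.strip()) / 1000.0


def read_status(temp_path: Path = TEMP_PATH, freq_path: Path = FREQ_PATH):
    """Return (temperature in C, CPU clock in MHz) as read from sysfs."""
    return (parse_temp_c(temp_path.read_text()),
            parse_freq_mhz(freq_path.read_text()))


if __name__ == "__main__":
    try:
        temp_c, freq_mhz = read_status()
        print(f"CPU temperature: {temp_c:.1f} C, clock: {freq_mhz:.0f} MHz")
    except OSError as err:
        # The files may be missing on systems without these sysfs entries.
        print(f"sysfs files not readable on this system: {err}")
```

Running it in a loop during a benchmark should show the clock dropping as the temperature climbs, if throttling kicks in.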

Another way to confirm throttling does occur is to check out the output of Phoronix John The Ripper:


The test is repeated several times, with the score starting at 696, and eventually dropping under 500, as the board fails to cool.

This behavior is actually explained in the board’s datasheet:

To reduce thermal output when idling or under light load, the Pi4B reduces the CPU clock speed and voltage. During heavier load the speed and voltage (and hence thermal output) are increased. The internal governor will throttle back both the CPU speed and voltage to make sure the CPU temperature never exceeds 85 degrees C.

The Pi4B will operate perfectly well without any extra cooling and is designed for sprint performance – expecting a light use case on average and ramping up the CPU speed when needed (e.g. when loading a webpage). If a user wishes to load the system continually or operate it at a high temperature at full performance, further cooling may be needed.

I do not have a cooling solution at hand here, so running the benchmarks with neither a heatsink nor a fan seriously impacts performance under load, meaning Raspberry Pi 4 is slower than Raspberry Pi 3 Model B in some of the benchmarks.

Raspberry Pi 4 Phoronix Benchmarks John The Ripper

John the Ripper is a shocker, since my Raspberry Pi 4 is actually slower than a Raspberry Pi 3 Model B tested by others, due to the lack of proper cooling for this multi-threaded benchmark.

Raspberry Pi 4 C-Ray Benchmark
C-Ray looks better since RPi 4 completes the render in about 25% less time than RPi 3 Model B (187.03s vs 250.79s, or roughly 1.34 times as fast), but it still takes about twice as long as the Rockchip RK3399-powered VS-RK3399 board.
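
For benchmarks where a lower run time is better, the gap can be expressed either as a reduction in run time or as a speed-up ratio, and the two give different percentages. A minimal Python sketch using the two C-Ray times quoted above (the function names are illustrative):

```python
def time_reduction_pct(old_s: float, new_s: float) -> float:
    """Percentage of run time saved when going from old_s to new_s."""
    return (old_s - new_s) / old_s * 100.0


def speedup(old_s: float, new_s: float) -> float:
    """How many times faster the new run is (ratio of run times)."""
    return old_s / new_s


if __name__ == "__main__":
    rpi3_s, rpi4_s = 250.79, 187.03  # C-Ray run times quoted in the article
    print(f"Run time reduction: {time_reduction_pct(rpi3_s, rpi4_s):.1f}%")
    print(f"Speed-up: {speedup(rpi3_s, rpi4_s):.2f}x")
```

With these inputs the run time reduction works out to about 25.4% and the speed-up to about 1.34x.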

Raspberry Pi 4 Benchmarks Smallpt

Raspberry Pi 4 performance also looks better with the Smallpt illumination renderer benchmark.

Raspberry Pi 4 Benchmarks Himeno

The board is, however, much faster in the Himeno Poisson solver benchmark, almost 5 times faster than RPi 3 Model B. There may have been some software/compilation changes in this Phoronix benchmark, or possibly the Cortex-A72 cores bring some extra instructions, since the Rockchip RK3399 hexa-core processor with 2x A72 + 4x A53 is also considerably faster than other A53/A7 Arm platforms.

Raspberry Pi 4 FLAC Audio Encoding

AFAIK, FLAC Audio encoding is a single-core benchmark so it’s not quite as susceptible to overheating as other multi-threaded benchmarks, and Raspberry Pi 4 did perform well.

We can see from the Raspberry Pi 4 benchmarks that the board is considerably faster than Raspberry Pi 3 Model B in most cases, but it’s also clear that a proper cooling solution is needed to leverage the full power of the board, especially for multi-threaded tasks. You can check out the full results here.

SBC Bench

SBC Bench is a simple benchmark for single board computers developed by Thomas Kaiser that checks the performance of an SBC much more quickly than running the Phoronix benchmarks.


That’s the output which confirms throttling does occur:


Checking the details in the link, we can see for example that the CPU dropped to 1.0 GHz for most of the 7-zip multi-threaded tests, once even going as low as 600 MHz. 7-zip crashed, so we only got two results instead of three. Here’s the Raspberry Pi 3 Model B result for reference:


So RPi 4 is barely faster at just under 3,600, but proper cooling should improve things.
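
As for the throttling report earlier in this section: on Raspberry Pi boards, such flags are usually derived from the firmware status bitmask returned by `vcgencmd get_throttled`. Assuming that is the source here, the Python sketch below decodes such a value using the bit layout documented by Raspberry Pi. The sample value `0x60004` is hypothetical and is not the output from this test, and the function names are illustrative.

```python
# Bit positions in the get_throttled bitmask, per Raspberry Pi documentation.
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "Arm frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred since boot",
    17: "Arm frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
    19: "soft temperature limit has occurred since boot",
}


def parse_throttled(text: str) -> int:
    """Parse a line of the form 'throttled=0x60004' into an integer."""
    return int(text.strip().split("=", 1)[1], 16)


def decode_throttled(value: int) -> list:
    """Return the descriptions of all flags set in the bitmask."""
    return [name for bit, name in sorted(THROTTLE_FLAGS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    sample = "throttled=0x60004"  # hypothetical value for illustration
    for flag in decode_throttled(parse_throttled(sample)):
        print(flag)
```

The low bits describe the current state, while bits 16 to 19 record whether each condition has occurred at any point since boot.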

The OpenSSL AES results suggest that the Armv8 crypto extensions are not enabled, as an Allwinner H5-based Orange Pi Zero Plus board underclocked to just 816 MHz is considerably (up to around 8 times) faster:


Checking the /proc/cpuinfo output above, the AES feature is missing, so it may be that Broadcom did not include the Armv8 crypto extensions in the processor (TBC).
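
This check is easy to script. The Python sketch below scans the `Features` lines of `/proc/cpuinfo` (the layout used by Arm Linux kernels) for the flags that normally advertise the Armv8 crypto extensions: `aes`, `pmull`, `sha1` and `sha2`. The function names are illustrative, and the result only reflects what the running kernel reports.

```python
CRYPTO_FLAGS = ("aes", "pmull", "sha1", "sha2")


def cpu_features(cpuinfo_text: str) -> set:
    """Collect the flags listed on 'Features' lines of /proc/cpuinfo."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            features.update(value.split())
    return features


def missing_crypto_flags(cpuinfo_text: str) -> list:
    """Return the crypto extension flags that are NOT reported."""
    present = cpu_features(cpuinfo_text)
    return [flag for flag in CRYPTO_FLAGS if flag not in present]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            missing = missing_crypto_flags(handle.read())
    except OSError as err:
        print(f"cannot read /proc/cpuinfo: {err}")
    else:
        if missing:
            print("Missing crypto flags:", ", ".join(missing))
        else:
            print("All Armv8 crypto flags are reported")
```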

USB 3.0 & microSD Storage Benchmarks

Since Raspberry Pi 4 now comes with two USB 3.0 ports, I’ve connected a USB 3.0 mechanical hard drive and installed iozone to verify that the board can now achieve the ~100MB/s read/write speed expected from such a drive.

Usually, iozone can be installed as follows in most Ubuntu/Debian systems:


But it’s not available in Raspbian Buster, so I’ve built it from sources instead:


We can now run the test to check sequential read and write speeds in the EXT-4 partition of my drive:


So around 94MB/s read and 92MB/s write. That’s about what we should expect from USB 3.0 with this drive, and much better than the 30+MB/s one would get with Raspberry Pi 3.

While I’m at it, I’ve also tested the performance of the NOOBS Class A1 microSD card that came with the kit I received:


The most important numbers here are the random read and write values, and results are good resulting in a smooth experience while using the board (in most cases).

Gigabit Ethernet Benchmark

True Gigabit Ethernet is another key feature of the new Raspberry Pi board, so I’ve tested full duplex Ethernet performance using iperf:


It ended badly on the client side:


But I got some numbers on the server (my laptop) side:


I repeated the test twice with similar results. It does not look so good, so let’s repeat the test in one direction only:

  • Upload

  • Download


So it looks much better here, as the full bandwidth of Gigabit Ethernet is basically saturated in both tests. The driver may have trouble handling heavy download and upload traffic at the same time, as in the first test.
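
For context on what "saturated" means in numbers, a TCP test can never report the full 1000 Mbit/s, because every frame carries Ethernet, IP and TCP overhead. The Python sketch below estimates the theoretical TCP goodput ceiling, assuming a standard 1500-byte MTU, IPv4, and no VLAN tag. The function name and defaults are illustrative.

```python
def tcp_goodput_mbps(link_mbps: float = 1000.0, mtu: int = 1500,
                     ip_header: int = 20, tcp_header: int = 20,
                     tcp_options: int = 0) -> float:
    """Upper bound on TCP payload throughput over Ethernet, in Mbit/s."""
    # Per-frame bytes on the wire beyond the MTU:
    # preamble + SFD (8), Ethernet header (14), FCS (4), inter-frame gap (12).
    wire_overhead = 8 + 14 + 4 + 12
    payload = mtu - ip_header - tcp_header - tcp_options
    return link_mbps * payload / (mtu + wire_overhead)


if __name__ == "__main__":
    print(f"No TCP options:      {tcp_goodput_mbps():.1f} Mbit/s")
    # 12 bytes of options is the usual cost of TCP timestamps on Linux.
    print(f"With TCP timestamps: {tcp_goodput_mbps(tcp_options=12):.1f} Mbit/s")
```

Under these assumptions the ceiling is roughly 941 to 949 Mbit/s, so results in that range indicate a link running at full speed.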

4K Video Playback and Output

All previous Raspberry Pi boards were limited to 1080p30/60 video playback with H.264 and other codecs, but Raspberry Pi 4 is the first to support 4K video playback with the H.265 codec.

So I went back to the desktop to play some 4K H.265 videos from my USB drive. Clicking on the files opened them in VLC media player:

  • Beauty_3840x2160_120fps_420_8bit_HEVC_MP4.mp4 (H.265) – First frame only
  • MHD_2013_2160p_ShowReel_R_9000f_24fps_RMN_QP23_10b.mkv (10-bit HEVC, 24 fps) – First frame only for several seconds then video becomes gray, and system hangs
  • Fifa_WorldCup2014_Uruguay-Colombia_4K-x265.mp4 (4K, H.265, 60 fps) – First frame only for several seconds then video updates every 5-10 seconds with heavily gray frames, and frequent audio cuts. Eventually, lost mouse pointer and control of the system -> hard power cycle required

OK, so I guess I can stop right here, as 4K video playback is clearly not working, at least with VLC. So I tried omxplayer from the command line instead:


Let’s switch to 4K video output. There’s an option in the settings and if you planned on using a dual display setup, you can also configure the screen layout as needed.

Raspberry Pi 4 4K HDMI

So I selected 4K HDMI, clicked on OK, and was told I needed to reboot to apply the new settings. Fair enough, and after the 35 seconds it typically takes to boot the board I was back in the desktop environment, but still at 1080p60 video output and resolution. For reference, my TV is an LG 42UB820T, which supports 4K video input without problems. I had, however, connected the board through an Onkyo TX-NR636 A/V receiver. That should not be a problem, but to rule out compatibility issues, I connected the RPi 4 directly to the HDMI 3 port of my TV and rebooted the board. Sadly, the results were the same.

So for now, Raspberry Pi 4 only has the potential for 4K video decode and output, as neither is working yet, at least in the Raspbian image provided on the NOOBS microSD card.

Final words

The Raspberry Pi 4 does provide much better I/O performance thanks to Gigabit Ethernet and USB 3.0, both of which mostly perform as expected. The processor is also considerably faster, but you may have to come up with a cooling solution such as the Pimoroni Fan SHIM to leverage its full potential, especially if you live in a hot climate. 4K does not work at all right now: 4K video playback clearly relies on software decode with both VLC and omxplayer, and 4K video output is not working, at least with my 4K TV.

Raspberry Pi 4 Computer Accessories

The Raspberry Pi 4 hardware changes mean you’ll need some extra accessories, and Cytron sent me the Raspberry Pi 4 together with the official 5V/3A USB-C power supply, as well as a micro HDMI cable, and a 16GB NOOBS Class A1 microSD card. So make sure you get at least the first two accessories in your order or you may not be able to use the Raspberry Pi 4 for a while.

I’d like to thank Cytron for the opportunity of getting a Raspberry Pi 4 so early, and you could consider purchasing the board from their store. They ship worldwide, but if you’d rather get the 2GB or 4GB RAM models you’ll have to wait a little longer, as those are not quite available yet.


61 Comments
blu
Guest
blu

Them thermals..

TLS
Guest
TLS

Not quite hot enough for frying an egg on though…

Member

An RK-3399, Atom, the Tegra, etc. are beasts in that regard as well. Unless you’re talking a RISC-V core set that’s comparable, you’re going to have these levels for this class of performance.

blu
Guest
blu

I was referring to the dissipation provisioning, or lack thereof. I’m perfectly aware what temps CA72 does at 28nm, ergo my first post in the announcement thread.

TLS
Guest
TLS

Just a suggestion, try iperf3 instead of iperf, it’s the most up to date variant and seems to work much better.

Also, is it possible that 4K video playback require more than 1GB of RAM?

tkaiser
Guest
tkaiser

Few remarks:

1) sbc-bench reports you’re running an outdated primary OS:

You should upgrade to at least version 407b1da8fa3d1a7108cb1d250f5064a3420d2b7d from last Thursday.

2) sbc-bench also reported frequency capping so you’re affected by undervoltage (or the ThreadX release has problems):

3) iozone3 is still in Debian repos but you need to enable ‘non-free’ repos since Debian folks consider iozone license to be a problem 🙂

4) Now that everything is not behind one single USB2 port any more it starts to make sense to think about IRQ distribution to avoid cpu0 becoming an artificial bottleneck (especially true for tasks where network and storage is processed at the same time). This should be taken into account with NAS benchmarks since otherwise it’s very likely seeing cpu0 being at 100% and throughput numbers lower than needed.

Thank you for the details, especially the boot log. I would also be interested in lspci and lsusb output but since my RPi 4 is on its way I better wait until tomorrow 🙂

GreenReaper
Guest

It’s not undervoltage; it’s throttling due to insufficient cooling for extensive activity, like benchmarks. The bitmask shows this clearly by having an 0 for under-voltage and a 1 for currently throttled.

tkaiser
Guest
tkaiser

Simply look again. I was talking about the bits on the left that report behavior since last boot:

arm frequency capped has occurred since last reboot is set to 1 and this was the ‘600 MHz’ occurrence Jean-Luc reported. Yes, throttling happened but also undervoltage related frequency capping occurred at least once.

everest3333
Guest
everest3333

oops, 9,8,7 110 not 111, easy to mistake it as ascii drawing is still to 8bit, we need a real consistant cli gfx in the shell scripting in 2020… whats available on git we might use.

whats wrong with that
graphics:
Device-1: bcm2835-vc4 driver: vc4_drm v: N/A
Device-2: bcm2835-hdmi driver: N/A

didnt they include them or get a working copy in time for benches,did it effect the video non playing i wonder as a side effect…

everest3333
Guest
everest3333

https://wiki.gentoo.org/wiki/Raspberry_Pi_VC4
Using the “VC4” driver on the raspberry pi to enable hardware acceleration (in X, Wayland, opengl applications) presents many challenges. There’s plenty of instructions floating around for Raspbian, but for Gentoo, not so much. This page contains a couple of nuggets of wisdom that may help you get there.

…Kernel
To have proper GPU acceleration with VideoCore, you need its kernel module loaded. That module has been integrated in v4.5, but it’s also present in rpi’s kernel v4.4.

To get VC4 core working you need to use latest firmware from sys-boot/raspberrypi-firmware 9999 ebild. To have installed only firmware files and not kernel – follow this guide modifying ebuild file.

Mesa and friends…

Steve
Guest
Steve

Have you enabled 4K video in the config.txt file? I believe that is necessary at the moment to enable 4K output. However you don’t need 4K output to check 4K decode do you?

hdmi_enable_4kp60=1 (This is for 4K60 at least)

https://forum.libreelec.tv/thread/17698-libreelec-leia-9-2-alpha1-with-raspberry-pi-4b-support/?postID=121496#post121496

tkaiser
Guest
tkaiser

I added both RPi 4 and VIM3 numbers to sbc-bench results list: https://github.com/ThomasKaiser/sbc-bench/blob/master/Results.md

But I took the RPi 4 results generated by German c’t magazine since on their install overheating wasn’t that massive and they run with a newer ThreadX release. The good news: if you want a fast performing ARM device, choose VIM3 instead! 🙂

blu
Guest
blu

Something’s telling me there might be faster arm devices out there, depending on workloads ; )

Member

Only drawbacks on some of these benchmarks…some of them highlight a single core’s performance, not the device’s workload over all four cores.

When you have the RK-3399, with a dual core A72 “big” coreplex set, beating a QUADCORE one…the benchmark is valid but should be taken with a block of salt for making real-world comparisons there. At clock, single and dual threaded tasks will be faster on the RK-3399. Spread out over multiple cores, a quad threaded task will see the quadcore being faster.

Eversor
Guest
Eversor

I’ve tested one RK3399 device that wasnt throttling and both these things happened:

– Heterogeneous kernel scheduler was working great.
– Compiling code w/ the six cores yielded a 50% reduction in compile time.

So yes, because the RK3399 was clocked faster, it can beat a quad A72 if you also make use of the A53 cores.

David Willmore
Guest
David Willmore

It also helps if the quad core has huge thermal issue when you load down the cores with actual work.

halherta
Guest
halherta

I wonder how much of a performance / benchmark improvement an RPI4 with 4GB of RAM will have over an RPI4 with 1GB of RAM

tkaiser
Guest
tkaiser

Why?

Unless the benchmark or ‘application’ in question needs more than 1 GB RAM and as such the system needs to swap why should be there a reason for more RAM increasing performance?

You might see a drop in performance if the system has not enough RAM (and as such swap involves heavy IO activity or zram/zswap starts to eat up CPU cycles) but there is no reason why more RAM should be ‘better’ otherwise.

Jerry
Guest
Jerry

More RAM modules could provide more parallel memory channels? Upton must have considered that since it’s standard stuff in low power (“Allwinner”) devices as well.

tkaiser
Guest
tkaiser

> More RAM modules could provide more parallel memory channels?

Is this true with LPDDR? I’ve never encountered an LPDDR equipped SBC so far with more than one DRAM chip.

With DDR3 I believe it can make a difference since boards with 2 or more DDR3 modules operated in dual-channel mode while those with just one DDR3 module were forced to slower single-channel mode. But I’m a total hardware noob.

hwti
Guest
hwti

Rockpro 64 is using two LPDDR4 chips (two 32-bit channels).

blu
Guest
blu

There are also at least two arm chromebooks with dual-channel LPDDR3 on the market.

tkaiser
Guest
tkaiser

Right, but where’s the difference between 2 x 32-bit LPDDR and 1 x 64-bit LPDDR as on Renegade Elite?

https://www.cnx-software.com/2018/06/24/odroid-n1-canceled-due-to-ram-supply-issues-odroid-n2-coming-later-this-year/#comment-554505

blu
Guest
blu

I don’t expect RPi4 to show any RAM performance difference with higher RAM configs — the number of channels seems set, as there are no RAM isles on the back of the board.

willy
Guest
willy

It really depends on the memory controller’s capabilities. In practice Some can pipeline multiple accesses over the same bus using the chip select lines so that while a DRAM chip is waiting for data to be ready, another read can be engaged with another chip. In practice I’ve never seen a controller do that in any ARM CPU, even higher end ones such as Armada 8040. On x86 I’ve observed up to 4 parallel prefetches on a dual-channel, 4-DIMM machine.

blu
Guest
blu

willy, are you sure those were actually carried out in parallel and not just buffered? While a chip-select should not be able to increase the theoretical BW of the channel, it might be able to improve the channel saturation. At the end of the day it’s about sustained percentage of channel BW.

Member
Stuart Naylor

Rockpi4

hwti
Guest
hwti

Perhaps the HDMI output (or the software) doesn’t support 4:2:0 mode, as your TV doesn’t do 4K in 4:4:4.

David Willmore
Guest
David Willmore

Jean-Luc, once you manage to get a 4K output, could you compare memory demanding benchmarks (like 7zip and tinymembench) at different resolutions? I’m curious to see what influence the framebuffer refresh has on the CPU.

miguel angel
Guest
miguel angel

this GPU will support 4K openGL applications? (not video, desktop X11 GLX acceleration for apps in 4k )
for example a Qt/QML 4K app.

what about a 64bit raspbian? any news?

Benjamin Hojnik
Member

32 bit raspbian is here to stay, because legacy boards dont support 64 bit.

tkaiser
Guest
tkaiser

Raspbian will remain 32-bit (and built for ARMv6) but at least they plan on shipping with a 64-bit kernel in the future: https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=243372&start=50#p1484079

tkaiser
Guest
tkaiser

My RPi 4 arrived an hour ago. Using the Raspbian Buster Lite image I quickly checked for storage performance.

With an USB3 connected Samsung EVO840 in a JMS567 enclosure (UAS capable) transfer speeds are up to 360MB/s read and 320 MB/s write:

Random IO performance also not that bad and the RPi 4 is somewhere in between RK3328 and RK3399:

I use the following script for optimized settings and benchmarking at startup:

Debug output: http://ix.io/1MKZ

Member
Stuart Naylor

Has anyone tested what the ampage is under full load with ethernet, usb hardrive and pheripherals running?

tkaiser
Guest
tkaiser

My idle consumption with their official USB-C PSU and an USB3 connected Samsung EVO840 in a RPi powered JMS567 enclosure with Gigabit Ethernet active is 5.2W. Switching to Fast Ethernet I’m at 4.9W (the usual difference caused by GbE PHY active or not).

When running an iozone3 storage test (+300 MB/s), consumption increases by 3W to 4W, when running LanTest achieving +85MB/s NAS performance the consumption increase compared to idle is around 3W.

The above numbers always include consumption of a connected SSD so only make partially sense. For me the high idle consumption is the problem. With our little Olimex Lime2 servers (native SATA) connected disks can be fully sent to sleep (on the RPi4 with connected storage there always will be an USB-to-SATA bridge with one highspeed PHY active) and this results in idle consumption numbers at around 2W.

With Allwinner A20/R40/A40i featuring native SATA there are two SATA PHYs active between host and disk. On the RPi 4 there need to be PCIe PHYs, USB3 PHYs and also SATA PHYs active for the data traveling between host and disk in a way less efficient fashion (protocol overheads). How should this be more power efficient?

Member
Stuart Naylor

Can I ask what command and parameters you used for iops?

William Barath
Guest

The Himeno Poisson Pressure Solver is branch-heavy, so it reveals differences in branch prediction accuracy, reorder buffer efficacy, and pipeline flush cost between microarch implementations. A72 slaughters the in-order cores, and A76 will in turn slaughter A72.

tkaiser
Guest
tkaiser

Back from beergarden a quick Samba/SMB benchmark: https://forum.openmediavault.org/index.php/Thread/27710-Raspberry-pi-4-announced-better-than-3/?postID=207230#post207230

+85MB/s in both directions with beta software isn’t too bad.

tkaiser
Guest
tkaiser

With the usual tweaks applied raw network performance is nothing one could complain about:

tkaiser
Guest
tkaiser

Another test, this time not focused on performance but reliability. More than one USB3 drive connected to an SBC can be a real sh*t show and it seems this doesn’t change here even if the situation on the RPi 4 should be better (no USB3 hub involved, hopefully proper XHCI controller).

I attached two 120 GB SSDs to the USB3 ports and created a btrfs raid1 out of them: mkfs.btrfs -f -d raid1 -m raid1 /dev/sda /dev/sdb. This ensures that both drives will be pretty busy at the same time and due to btrfs’ checksum mechanism data corruption can be spotted. Then starting a benchmark with iozone -e -I -a -s 100M -r 4k -r 16k -r 512k -r 1024k -r 16384k -i 0 -i 1 -i 2 ; iozone -e -I -a -s 1000M -r 16384k -i 0 -i 1 -i 2. First run:

Inferior performance at the end and tons of ‘UAS errors’ in dmesg so I finally rebooted since the system didn’t recover: http://ix.io/1MQd

So it’s an UAS incompatibly? I don’t think so. With UAS disabled most probably the symptoms just look different. I added usb-storage.quirks=174c:55aa:u,152d:0578:u to kernel cmdline, rebooted again and tested another time (this time usb-storage instead of uas driver handling the disks):

Again countless USB resets, this time of course with different error messages: http://ix.io/1MQq

TL;DR: USB3 storage sucks 🙂

tkaiser
Guest
tkaiser

And the good news: it was just the usual ‘USB peripherals underpowered’ drama the SBC world is full of. Now I replaced one of the two SSD with another one in an externally powered enclosure and repeated the tests with both usb-storage and uas (the latter as usual showing better performance):

Debug output with UAS enabled again: http://ix.io/1MQI

What looked like ‘UAS problems’ or ‘broken USB3 controller’ was simply underpowering. In the first try both SSDs were powered by the RPi and as such the current limitation on the USB ports kicked in (slightly above 1.1A available for all ports together). Now with one SSD powered externally everything is fine.

hwti
Guest
hwti

So the xhci doesn’t signal the over-current condition, else there would be traces (but perhaps it just cuts power, or it doesn’t but the voltage drops).