Raspberry Pi 4 Benchmarks & Mini Review

Raspberry Pi 4 has just been released with many improvements over Raspberry Pi 3 Model B+ including a faster processor, a proper Gigabit Ethernet port, USB 3.0 interfaces, and 4K video support. That’s the theory, but how does it work in practice?

I can now let you know, as I’ve received a Raspberry Pi 4 sample courtesy of Cytron, and ran some tests and benchmarks on the very latest board from the Raspberry Pi Foundation.

Raspberry Pi 4 Review

System Info

Before starting with the benchmarks, let’s go through some basic system info:



For reference, you’ll find Raspberry Pi 4 Linux boot log here.

Phoronix benchmarks

Let’s go ahead and install the latest version of Phoronix benchmarks:


Now let’s run the test to compare the performance of Raspberry Pi 4 model B to some other Arm Linux boards including Raspberry Pi 3 Model B.


For reference, my office has an ambient temperature of around 28 to 30°C, and I’ve monitored the CPU temperature with an IR thermometer during some of the phases:

  • Idle – 62°C
  • Phoronix benchmarks download/installation – 64°C
  • John The Ripper – 73°C

I also typed a few commands to get the system temperature and CPU clock, in this case during John the Ripper benchmark:


So the CPU is throttled to around 1.0 GHz, down from its 1.5 GHz maximum: without proper cooling for this type of workload, the system automatically lowers the CPU frequency.
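For anyone who wants to log temperature and clock speed over a whole benchmark run, here is a minimal Python sketch. It is not the set of commands used in this review: it assumes the usual Linux sysfs files (`/sys/class/thermal/thermal_zone0/temp` in millidegrees Celsius and `/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq` in kHz), and the helper names are made up for illustration.

```python
from pathlib import Path

# Assumed sysfs locations; they may differ on other kernels or boards.
TEMP_PATH = Path("/sys/class/thermal/thermal_zone0/temp")
FREQ_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq")


def parse_millidegrees(raw):
    """Convert a sysfs temperature string (millidegrees C) to degrees C."""
    return int(raw.strip()) / 1000.0


def parse_khz(raw):
    """Convert a sysfs frequency string (kHz) to GHz."""
    return int(raw.strip()) / 1_000_000.0


def read_status():
    """Return (temperature in C, CPU clock in GHz) for the running system."""
    temp_c = parse_millidegrees(TEMP_PATH.read_text())
    clock_ghz = parse_khz(FREQ_PATH.read_text())
    return temp_c, clock_ghz
```

Calling `read_status()` once a second while a benchmark runs should show the clock stepping down as the temperature climbs, provided those two files exist on the system.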

Another way to confirm throttling does occur is to check out the output of Phoronix John The Ripper:


The test is repeated several times, with the score starting at 696, and eventually dropping under 500, as the board fails to cool.

This behavior is actually explained in the board’s datasheet:

To reduce thermal output when idling or under light load, the Pi4B reduces the CPU clock speed and voltage. During heavier load the speed and voltage (and hence thermal output) are increased. The internal governor will throttle back both the CPU speed and voltage to make sure the CPU temperature never exceeds 85 degrees C.

The Pi4B will operate perfectly well without any extra cooling and is designed for sprint performance – expecting a light use case on average and ramping up the CPU speed when needed (e.g. when loading a webpage). If a user wishes to load the system continually or operate it at a high temperature at full performance, further cooling may be needed.

We do not have a cooling solution at hand here, and running the benchmarks with neither heatsink nor fan seriously impacts performance under load, meaning Raspberry Pi 4 is slower than Raspberry Pi 3 model B in some of the benchmarks.

Raspberry Pi 4 Phoronix Benchmarks John The Ripper
John the Ripper is a shocker since my Raspberry Pi 4 is actually slower than a Raspberry Pi 3 model B tested by others due to the lack of proper cooling for the multi-threaded benchmark.

Raspberry Pi 4 C-Ray Benchmark
C-Ray looks better since RPi 4 finishes in about 25% less time than RPi 3 model B (187.03s vs 250.79s), but it is still twice as slow as the Rockchip RK3399 powered VS-RK3399 board.
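Percentages for "lower is better" timings can be quoted two ways, either as time saved or as extra throughput, and the two give different numbers. A small sketch using the two C-Ray times above (the function names are just for illustration):

```python
def time_reduction_pct(baseline_s, new_s):
    """Percentage of the baseline run time that the new run saves."""
    return (baseline_s - new_s) / baseline_s * 100.0


def speedup_pct(baseline_s, new_s):
    """Percentage increase in throughput (work done per second)."""
    return (baseline_s / new_s - 1.0) * 100.0


rpi3_s = 250.79  # C-Ray time on Raspberry Pi 3 model B
rpi4_s = 187.03  # C-Ray time on Raspberry Pi 4

print(round(time_reduction_pct(rpi3_s, rpi4_s), 1))  # 25.4
print(round(speedup_pct(rpi3_s, rpi4_s), 1))         # 34.1
```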

Raspberry Pi 4 Benchmarks Smallpt
Raspberry Pi 4 performance also looks better with the Smallpt illumination renderer benchmark.

Raspberry Pi 4 Benchmarks Himeno
The board is however much faster with the Himeno Poisson solver benchmark, almost 5 times faster than RPi 3 model B. There may have been some software/compilation changes in this Phoronix benchmark, or possibly there are some extra instructions that come with Cortex-A72 cores, since the Rockchip RK3399 hexa-core processor with 2x A72 + 4x A53 is also considerably faster than other A53/A7 Arm platforms.

Raspberry Pi 4 FLAC Audio Encoding

AFAIK, FLAC Audio encoding is a single-core benchmark so it’s not quite as susceptible to overheating as other multi-threaded benchmarks, and Raspberry Pi 4 did perform well.

We can see from the Raspberry Pi 4 benchmarks that the board is noticeably faster than Raspberry Pi 3 model B in most cases, but it’s also clear that to leverage the full power of the board, especially for multi-threaded tasks, a proper cooling solution is needed. You can check out the full results here.

SBC Bench

SBC Bench is a simple benchmark for single board computers developed by Thomas Kaiser that lets you check the performance of an SBC much more quickly than running Phoronix benchmarks.


That’s the output which confirms throttling does occur:


Checking into the details in the link, we can see for example that the CPU dropped to 1.0 GHz for most of the 7-zip multi-threaded tests, dropping even as low as 600 MHz once. 7-zip crashed so we only got two results instead of three. Here’s the Raspberry Pi 3 model B result for reference:


So RPi 4 is barely faster at just under 3,600, but proper cooling should improve things.

OpenSSL AES results suggest the Armv8 crypto extensions are not enabled, as an Allwinner H5 based Orange Pi Zero Plus board underclocked to just 816 MHz is considerably (up to around 8 times) faster:


Checking the /proc/cpuinfo output above, the AES feature is missing, so it may be that Broadcom did not include the Armv8 crypto extensions in the processor (TBC).
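The same check can be scripted. On Arm Linux, the crypto extensions normally show up as the `aes`, `pmull`, `sha1` and `sha2` flags on the `Features` line of /proc/cpuinfo. The sketch below looks for them; the function names are hypothetical, and the two sample strings are illustrative rather than copied from any board.

```python
def cpu_features(cpuinfo_text):
    """Collect flags from every 'Features' (Arm) or 'flags' (x86) line."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("features", "flags"):
            features.update(value.split())
    return features


def has_crypto_extensions(cpuinfo_text):
    """True if all Armv8 crypto extension flags are reported."""
    wanted = {"aes", "pmull", "sha1", "sha2"}
    return wanted <= cpu_features(cpuinfo_text)


# Illustrative samples only, not real board output.
without_crypto = "Features\t: half thumb fastmult vfp edsp neon vfpv4 crc32\n"
with_crypto = "Features\t: fp asimd evtstrm aes pmull sha1 sha2 crc32\n"

print(has_crypto_extensions(without_crypto))  # False
print(has_crypto_extensions(with_crypto))     # True
```

On a live system, something like `has_crypto_extensions(open("/proc/cpuinfo").read())` would run the check against the real file.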

USB 3.0 & microSD Storage Benchmarks

Since Raspberry Pi 4 now comes with two USB 3.0 ports, I’ve connected a USB 3.0 mechanical hard drive, and installed iozone to verify that Raspberry Pi can now achieve the ~100 MB/s read/write speed expected from such a drive.

Usually, iozone can be installed as follows in most Ubuntu/Debian systems:


But it’s not available in Raspbian Buster, so I’ve built it from sources instead:


We can now run the test to check sequential read and write speeds in the EXT-4 partition of my drive:


So we get around 94 MB/s read and 92 MB/s write, which is about what we should expect from USB 3.0 with this drive, and much better than the 30+ MB/s one would get with Raspberry Pi 3.
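iozone prints its results as rows of kB/s figures, which are easier to compare once converted to MB/s. Here is a rough sketch of that conversion. It assumes the column order kB, reclen, write, rewrite, read, reread, random read, random write, and it treats 1 MB as 1000 kB, which is an assumption about the units. The sample row is taken from a reader's SSD results in the comments below, not from the hard drive test above.

```python
COLUMNS = ("write", "rewrite", "read", "reread", "random_read", "random_write")


def parse_iozone_row(row):
    """Turn one iozone result row (kB/s figures) into a dict of MB/s values."""
    fields = [int(x) for x in row.split()]
    if len(fields) != 2 + len(COLUMNS):
        raise ValueError("expected 8 columns, got %d" % len(fields))
    file_kb, reclen_kb, *rates = fields
    result = {"file_kb": file_kb, "reclen_kb": reclen_kb}
    for name, rate_kb_s in zip(COLUMNS, rates):
        result[name] = rate_kb_s / 1000.0  # assumed: 1 MB = 1000 kB
    return result


row = "102400 16384 312888 307813 359110 362559 362575 312788"
result = parse_iozone_row(row)
print(result["read"], result["write"])
```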

While I’m at it, I’ve also tested the performance of the NOOBS Class A1 microSD card that came with the kit I received:


The most important numbers here are the random read and write values, and the results are good, giving a smooth experience while using the board (in most cases).

Gigabit Ethernet Benchmark

True Gigabit Ethernet is another key feature of the new Raspberry Pi board, so I’ve tested full duplex Ethernet performance using iperf:


It ended badly on the client side:


But I got some numbers on the server (my laptop) side:


I repeated the test twice with similar results. It does not look so good, so let’s repeat the test in one direction only:

  • Upload

  • Download


So it looks much better here, as the full bandwidth enabled by Gigabit Ethernet is basically saturated by both tests. The driver may have trouble handling high download/upload traffic at the same time in the first test.
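For context on what "saturated" means, a TCP test can never report the full 1000 Mbits/sec because every frame carries protocol overhead. The sketch below estimates the ceiling under assumptions that are not from this review: a standard 1500-byte MTU, IPv4, no VLAN tag, and optionally the 12-byte TCP timestamp option that Linux enables by default.

```python
LINE_RATE_MBPS = 1000.0
MTU = 1500                       # bytes of IP packet per Ethernet frame
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, header, FCS, inter-frame gap
IP_HEADER = 20                   # IPv4 without options
TCP_HEADER = 20                  # TCP without options
TCP_TIMESTAMPS = 12              # timestamp option plus padding


def tcp_goodput_mbps(timestamps=True):
    """Upper bound on TCP payload throughput over Gigabit Ethernet."""
    options = TCP_TIMESTAMPS if timestamps else 0
    payload = MTU - IP_HEADER - TCP_HEADER - options
    wire_bytes = MTU + ETH_OVERHEAD
    return LINE_RATE_MBPS * payload / wire_bytes


print(round(tcp_goodput_mbps(timestamps=True), 1))   # 941.5
print(round(tcp_goodput_mbps(timestamps=False), 1))  # 949.3
```

Under those assumptions, anything in the neighbourhood of 940 Mbits/sec in a single direction is effectively line rate.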

4K Video Playback and Output

All previous Raspberry Pi boards were limited to 1080p30/60 video playback with H.264 and other codecs, and Raspberry Pi 4 is the first to support 4K video playback using the H.265 codec.

So I went back to the desktop to play some 4K H.265 videos from my USB drive. Clicking on the files opened them in VLC media player:

  • Beauty_3840x2160_120fps_420_8bit_HEVC_MP4.mp4 (H.265) – First frame only
  • MHD_2013_2160p_ShowReel_R_9000f_24fps_RMN_QP23_10b.mkv (10-bit HEVC, 24 fps) – First frame only for several seconds then video becomes gray, and system hangs
  • Fifa_WorldCup2014_Uruguay-Colombia_4K-x265.mp4 (4K, H.265, 60 fps) – First frame only for several seconds then video updates every 5-10 seconds with heavily gray frames, and frequent audio cuts. Eventually, lost mouse pointer and control of the system -> hard power cycle required

OK, so I guess I can stop right here, as 4K video playback is clearly not working, at least with VLC. So I tried omxplayer from the command line instead:


Let’s switch to 4K video output. There’s an option in the settings and if you planned on using a dual display setup, you can also configure the screen layout as needed.

Raspberry Pi 4 4K HDMI

So I selected 4K HDMI, clicked on OK, and was told I needed to reboot to apply the new settings. Fair enough, and after the 35 seconds it typically takes to boot the board I was back in the desktop environment, but still at 1080p60 video output & resolution. For reference, my TV is an LG 42UB820T, which fully supports 4K. I had, however, connected it through an Onkyo TX-NR636 A/V receiver, which should not be a problem, but to rule out compatibility issues, I connected the RPi 4 directly to the HDMI 3 port of my TV and rebooted the board. Sadly, the result was the same.

So for now, Raspberry Pi 4 only has the potential for 4K video decode and output, as neither is working yet, at least in the Raspbian image provided on the NOOBS microSD card.

Final words

The Raspberry Pi 4 does provide much better I/O performance thanks to Gigabit Ethernet and USB 3.0, both of which are mostly performing as expected. The processor is also noticeably faster, but you may have to come up with a cooling solution such as the Pimoroni Fan SHIM to leverage its full potential, especially if you live in a hot climate. 4K does not work at all right now, with 4K video playback clearly relying on software decode with both VLC and omxplayer, and 4K video output not working, at least with my 4K TV.

Raspberry Pi 4 Computer Accessories

The Raspberry Pi 4 hardware changes mean you’ll need some extra accessories, and Cytron sent me the Raspberry Pi 4 together with the official 5V/3A USB-C power supply, as well as a micro HDMI cable, and a 16GB NOOBS Class A1 microSD card. So make sure you get at least the first two accessories in your order or you may not be able to use the Raspberry Pi 4 for a while.

I’d like to thank Cytron for the opportunity of getting a Raspberry Pi 4 so early, and you could consider purchasing the board from their store. They ship worldwide, but if you’d rather get the 2GB or 4GB RAM models you’ll have to wait a little longer, as those are not quite available yet.

63 Comments
blu
1 year ago

Them thermals..

TLS
1 year ago

Not quite hot enough for frying an egg on though…

Nobody Of Import
1 year ago

An RK-3399, Atom, the Tegra, etc. are beasts in that regard as well. Unless you’re talking a RISC-V core set that’s comparable, you’re going to have these levels for this class of performance.

blu
1 year ago

I was referring to the dissipation provisioning, or lack thereof. I’m perfectly aware what temps CA72 does at 28nm, ergo my first post in the announcement thread.

TLS
1 year ago

Just a suggestion, try iperf3 instead of iperf, it’s the most up to date variant and seems to work much better.

Also, is it possible that 4K video playback require more than 1GB of RAM?

TLS
1 year ago

You’re correct, but iperf2 does that at least.

tkaiser
1 year ago

Few remarks:

1) sbc-bench reports you’re running an outdated primary OS:

Jun 7 2019 18:01:29
Copyright (c) 2012 Broadcom
version b5cfeb0a81046c23b73eb41dab3745fcf0601247 (clean) (release) (start)

You should upgrade to at least version 407b1da8fa3d1a7108cb1d250f5064a3420d2b7d from last Thursday.

2) sbc-bench also reported frequency capping so you’re affected by undervoltage (or the ThreadX release has problems):

Querying ThreadX on RPi for thermal or undervoltage issues:

1100000000000000010
|||             |||_ under-voltage
|||             ||_ currently throttled
|||             |_ arm frequency capped
|||_ under-voltage has occurred since last reboot
||_ throttling has occurred since…

GreenReaper
1 year ago

It’s not undervoltage; it’s throttling due to insufficient cooling for extensive activity, like benchmarks. The bitmask shows this clearly by having an 0 for under-voltage and a 1 for currently throttled.

tkaiser
1 year ago

Simply look again. I was talking about the bits on the left that report behavior since last boot:

Querying ThreadX on RPi for thermal or undervoltage issues:

1100000000000000010
|||             |||_ under-voltage
|||             ||_ currently throttled
|||             |_ arm frequency capped
|||_ under-voltage has occurred since last reboot
||_ throttling has occurred since last reboot
|_ arm frequency capped has occurred since last reboot

…

everest3333
1 year ago

oops, 9,8,7 110 not 111, easy to mistake it as ascii drawing is still to 8bit, we need a real consistant cli gfx in the shell scripting in 2020… whats available on git we might use.

whats wrong with that
graphics:
Device-1: bcm2835-vc4 driver: vc4_drm v: N/A
Device-2: bcm2835-hdmi driver: N/A

didnt they include them or get a working copy in time for benches,did it effect the video non playing i wonder as a side effect…

everest3333
1 year ago

https://wiki.gentoo.org/wiki/Raspberry_Pi_VC4 Using the “VC4” driver on the raspberry pi to enable hardware acceleration (in X, Wayland, opengl applications) presents many challenges. There’s plenty of instructions floating around for Raspbian, but for Gentoo, not so much. This page contains a couple of nuggets of wisdom that may help you get there. …Kernel To have proper GPU acceleration with VideoCore, you need its kernel module loaded. That module has been integrated in v4.5, but it’s also present in rpi’s kernel v4.4. To get VC4 core working you need to use latest firmware from sys-boot/raspberrypi-firmware 9999 ebild. To have installed only firmware files…

Steve
1 year ago

Have you enabled 4K video in the config.txt file? I believe that is necessary at the moment to enable 4K output. However you don’t need 4K output to check 4K decode do you?

hdmi_enable_4k=1 (This is for 4K60 at least)

https://forum.libreelec.tv/thread/17698-libreelec-leia-9-2-alpha1-with-raspberry-pi-4b-support/?postID=121496#post121496

Steve
1 year ago

Ah – OK – that makes sense. Also I wonder whether all flavours of HDMI 4K are handled – I’d hope 4:2:0 was as that’s a pretty much guaranteed baseline that all HDMI 2.0 4K displays should support (and is HDMI 1.4b bandwidth compatible) However 4:2:2 and 4:4:4 would be better for desktop use (4:4:4 ideally)

theguyuk
1 year ago

Just about working on LibreElec alpha test video, but still buggy.

tkaiser
1 year ago

I added both RPi 4 and VIM3 numbers to sbc-bench results list: https://github.com/ThomasKaiser/sbc-bench/blob/master/Results.md

But I took the RPi 4 results generated by German c’t magazine since on their install overheating wasn’t that massive and they run with a newer ThreadX release. The good news: if you want a fast performing ARM device, choose VIM3 instead! 🙂

blu
1 year ago

Something’s telling me there might be faster arm devices out there, depending on workloads ; )

Nobody Of Import
1 year ago

Only drawbacks on some of these benchmarks…some of them highlight a single core’s performance, not the device’s workload over all four cores.

When you have the RK-3399, with a dual core A72 “big” coreplex set, beating a QUADCORE one…the benchmark is valid but should be taken with a block of salt for making real-world comparisons there. At clock, single and dual threaded tasks will be faster on the RK-3399. Spread out over multiple cores, a quad threaded task will see the quadcore being faster.

Eversor
1 year ago

I’ve tested one RK3399 device that wasnt throttling and both these things happened:

– Heterogeneous kernel scheduler was working great.
– Compiling code w/ the six cores yielded a 50% reduction in compile time.

So yes, because the RK3399 was clocked faster, it can beat a quad A72 if you also make use of the A53 cores.

David Willmore
1 year ago

It also helps if the quad core has huge thermal issue when you load down the cores with actual work.

halherta
1 year ago

I wonder how much of a performance / benchmark improvement an RPI4 with 4GB of RAM will have over an RPI4 with 1GB of RAM

tkaiser
1 year ago

Why?

Unless the benchmark or ‘application’ in question needs more than 1 GB RAM and as such the system needs to swap why should be there a reason for more RAM increasing performance?

You might see a drop in performance if the system has not enough RAM (and as such swap involves heavy IO activity or zram/zswap starts to eat up CPU cycles) but there is no reason why more RAM should be ‘better’ otherwise.

Jerry
1 year ago

More RAM modules could provide more parallel memory channels? Upton must have considered that since it’s standard stuff in low power (“Allwinner”) devices as well.

tkaiser
1 year ago

> More RAM modules could provide more parallel memory channels?

Is this true with LPDDR? I’ve never encountered an LPDDR equipped SBC so far with more than one DRAM chip.

With DDR3 I believe it can make a difference since boards with 2 or more DDR3 modules operated in dual-channel mode while those with just one DDR3 module were forced to slower single-channel mode. But I’m a total hardware noob.

hwti
1 year ago

Rockpro 64 is using two LPDDR4 chips (two 32-bit channels).

blu
1 year ago

There are also at least two arm chromebooks with dual-channel LPDDR3 on the market.

tkaiser
1 year ago

Right, but where’s the difference between 2 x 32-bit LPDDR and 1 x 64-bit LPDDR as on Renegade Elite?

https://www.cnx-software.com/2018/06/24/odroid-n1-canceled-due-to-ram-supply-issues-odroid-n2-coming-later-this-year/#comment-554505

blu
1 year ago

I don’t expect RPi4 to show any RAM performance difference with higher RAM configs — the number of channels seems set, as there are no RAM isles on the back of the board.

willy
1 year ago

It really depends on the memory controller’s capabilities. In practice Some can pipeline multiple accesses over the same bus using the chip select lines so that while a DRAM chip is waiting for data to be ready, another read can be engaged with another chip. In practice I’ve never seen a controller do that in any ARM CPU, even higher end ones such as Armada 8040. On x86 I’ve observed up to 4 parallel prefetches on a dual-channel, 4-DIMM machine.

blu
1 year ago

willy, are you sure those were actually carried out in parallel and not just buffered? While a chip-select should not be able to increase the theoretical BW of the channel, it might be able to improve the channel saturation. At the end of the day it’s about sustained percentage of channel BW.

stuartiannaylor
1 year ago

Rockpi4

hwti
1 year ago

Perhaps the HDMI output (or the software) doesn’t support 4:2:0 mode, as your TV doesn’t do 4K in 4:4:4.

David Willmore
1 year ago

Jean-Luc, once you manage to get a 4K output, could you compare memory demanding benchmarks (like 7zip and tinymembench) at different resolutions? I’m curious to see what influence the framebuffer refresh has on the CPU.

David Willmore
1 year ago

Thank you. I don’t know if there is a setting in the Rpi boot config to turn the display off entirely–like there is for most ODROID boards. Hmm, there’s “hdmi_blanking” which when set to 1 will disable HDMI instead of blanking it. Not sure if that shuts down the scanout part of the display pathway. I don’t see any other settings to disable video out.

tkaiser
1 year ago

tkaiser
1 year ago

Before:

After:

tkaiser
1 year ago

Nope, now tried again after using /usr/bin/tvservice -p --> Powering on HDMI with preferred settings

I did a reboot in between and updated packages before so maybe new ‘firmware’ kicked in.

David Willmore
1 year ago

I’m now even more confused, Thomas! 🙂

tkaiser
1 year ago

Switching HDMI on/off via tvservice did NOT change anything about tinymembench results. I tested it twice later with reboots in between. So the result variation above might be due to different ThreadX versions or whatever. Don’t know.

David Willmore
1 year ago

I’ve been reading and it looks like “vcgencmd display_power 0” is what you want to turn the display off. I’ll keep reading.

Nope, nevermind, that was wrong your “-o” was the right way to do it.

blu
1 year ago

Thanks, Jean-Luc. For comparison, result from 2.1GHz CA72 in MT8173A (g++-8.1.0, aarch64):

miguel angel
1 year ago

this GPU will support 4K openGL applications? (not video, desktop X11 GLX acceleration for apps in 4k )
for example a Qt/QML 4K app.

what about a 64bit raspbian? any news?

Benjamin Hojnik
1 year ago

32 bit raspbian is here to stay, because legacy boards dont support 64 bit.

tkaiser
1 year ago

Raspbian will remain 32-bit (and built for ARMv6) but at least they plan on shipping with a 64-bit kernel in the future: https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=243372&start=50#p1484079

tkaiser
1 year ago

My RPi 4 arrived an hour ago. Using the Raspbian Buster Lite image I quickly checked for storage performance. With an USB3 connected Samsung EVO840 in a JMS567 enclosure (UAS capable) transfer speeds are up to 360MB/s read and 320 MB/s write:

                                                       random   random
      kB  reclen    write  rewrite     read   reread     read    write
  102400       4    22302    27249    20241    20395    20237    27021
  102400      16    75679    87955    83502    83747    83282    86495
  102400     512   275383   284198   304086   306688   306716   282608
  102400    1024   290456   297695   325241   327736   327800   296176
  102400   16384   312888   307813   359110   362559   362575   312788
 1024000   16384   320656   321783   363482   364006   363931…

stuartiannaylor
1 year ago

Has anyone tested what the ampage is under full load with ethernet, usb hardrive and pheripherals running?

tkaiser
1 year ago

My idle consumption with their official USB-C PSU and an USB3 connected Samsung EVO840 in a RPi powered JMS567 enclosure with Gigabit Ethernet active is 5.2W. Switching to Fast Ethernet I’m at 4.9W (the usual difference caused by GbE PHY active or not). When running an iozone3 storage test (+300 MB/s), consumption increases by 3W to 4W, when running LanTest achieving +85MB/s NAS performance the consumption increase compared to idle is around 3W. The above numbers always include consumption of a connected SSD so only make partially sense. For me the high idle consumption is the problem. With our little…

Stuart Naylor
1 year ago

Can I ask what command and parameters you used for iops?

William Barath
1 year ago

The Himeno Poisson Pressure Solver is branch-heavy, so it reveals differences in branch prediction accuracy, reorder buffer efficacy, and pipeline flush cost between microarch implementations. A72 slaughters the in-order cores, and A76 will in turn slaughter A72.

tkaiser
1 year ago

Back from beergarden a quick Samba/SMB benchmark: https://forum.openmediavault.org/index.php/Thread/27710-Raspberry-pi-4-announced-better-than-3/?postID=207230#post207230

+85MB/s in both directions with beta software isn’t too bad.

tkaiser
1 year ago

With the usual tweaks applied raw network performance is nothing one could complain about:

mac-tk-2018:~ tk$ iperf3 -c rpi4.local
Connecting to host 192.168.83.136, port 5201
[  5] local 192.168.83.64 port 59551 connected to 192.168.83.136 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   113 MBytes   946 Mbits/sec
[  5]   1.00-2.00   sec   112 MBytes   940 Mbits/sec
[  5]   2.00-3.00   sec   112 MBytes   940 Mbits/sec
[  5]   3.00-4.00   sec   112 MBytes   940 Mbits/sec
[  5]   4.00-5.00   sec   112 MBytes   940 Mbits/sec
[  5]   5.00-6.00   sec   112 MBytes   940 Mbits/sec
[  5]   6.00-7.00   sec   112 MBytes   940 Mbits/sec
[  5]…

tkaiser
1 year ago

Another test, this time not focused on performance but reliability. More than one USB3 drive connected to an SBC can be a real sh*t show and it seems this doesn’t change here even if the situation on the RPi 4 should be better (no USB3 hub involved, hopefully proper XHCI controller). I attached two 120 GB SSDs to the USB3 ports and created a btrfs raid1 out of them: mkfs.btrfs -f -d raid1 -m raid1 /dev/sda /dev/sdb. This ensures that both drives will be pretty busy at the same time and due to btrfs’ checksum mechanism data corruption can be…

tkaiser
1 year ago

And the good news: it was just the usual ‘USB peripherals underpowered’ drama the SBC world is full of. Now I replaced one of the two SSD with another one in an externally powered enclosure and repeated the tests with both usb-storage and uas (the latter as usual showing better performance):

UAS
                                                       random   random
      kB  reclen    write  rewrite     read   reread     read    write
  102400       4    16339    17795    28667    30303    17338    17173
  102400      16    53296    48270    82398    46101    57816    40547
  102400     512   132077   130991   187930   190746   185211   131467
  102400    1024   130464   130977   220379   246195   221023   131378
  102400   16384   130178   128957   348131…

hwti
1 year ago

So the xhci doesn’t signal the over-current condition, else there would be traces (but perhaps it just cuts power, or it doesn’t but the voltage drops).

ProphetZarquon
2 months ago

Has there been any progress on H.265 decoding since this was published?
I’ve got a brand new Pi4b here & it won’t play 720p x265 videos in VLC. I’m looking for anything I can do to configure it better.
