But to preserve software compatibility with the original Raspberry Pi and Raspberry Pi 2, the Raspberry Pi Foundation decided to keep providing a 32-bit OS image, so nearly everybody is now running a 32-bit OS on 64-bit hardware, and Eben Upton famously claimed it did not matter.
We already wrote several years ago that 64-bit Arm (Aarch64) boosted performance by 15 to 30% compared to 32-bit Arm (Aarch32), but Matteo Croce decided to try it out himself on a Raspberry Pi 4 board, first running benchmarks on 32-bit Raspbian before switching to a lightweight version of Debian compiled for aarch64.
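If you are not sure which combination your own board is running, a quick check is to compare the kernel architecture with the word size of a userland program. The snippet below is only a minimal sketch: the helper names `userland_bits` and `describe_system` are made up for illustration, and it assumes that the Python interpreter was built for the same word size as the rest of the userland.

```python
import platform
import struct


def userland_bits() -> int:
    """Return the pointer width, in bits, of the running Python interpreter."""
    # "P" is the struct format for a native pointer (void *).
    return struct.calcsize("P") * 8


def describe_system(machine: str, bits: int) -> str:
    """Summarise the kernel architecture versus the userland word size."""
    # Assumption: an "aarch64" kernel with 32-bit pointers means a 64-bit
    # kernel is running a 32-bit userland.
    if machine == "aarch64" and bits == 32:
        return "64-bit kernel with 32-bit userland"
    return f"{machine} kernel, {bits}-bit userland"


if __name__ == "__main__":
    # platform.machine() reports the architecture seen by the kernel,
    # e.g. "armv7l" or "aarch64".
    print(describe_system(platform.machine(), userland_bits()))
```

The two values are passed in as arguments so the logic can be checked on any machine, not just on the Raspberry Pi itself.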
Dhrystone is much faster with the 64-bit OS, around 50% faster, but as a synthetic benchmark, its usefulness is limited. Benchmarks closer to real use cases, such as SHA1 or audio encoding, confirm the improved performance, to a lesser but still significant extent.
However, in some cases there is no benefit to switching to a 64-bit OS: VPN performance with either OpenVPN or WireGuard is virtually the same as with the default 32-bit Raspbian OS.
But the firewall works much better with Aarch64 (557k packets/s) than when the software is compiled for armv7 (268k packets/s), which is roughly twice the throughput.
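To put the firewall numbers in the same terms as the "50% faster" Dhrystone figure, the packet rates quoted above can be converted into a ratio and a percentage gain. This is simple arithmetic on the two published values, and the function names `speedup` and `percent_gain` are just illustrative.

```python
def speedup(new_rate: float, old_rate: float) -> float:
    """Return how many times faster new_rate is compared to old_rate."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return new_rate / old_rate


def percent_gain(new_rate: float, old_rate: float) -> float:
    """Return the improvement of new_rate over old_rate as a percentage."""
    return (speedup(new_rate, old_rate) - 1.0) * 100.0


if __name__ == "__main__":
    # Firewall throughput in thousands of packets per second, from the text.
    aarch64_kpps = 557
    armv7_kpps = 268
    print(f"{speedup(aarch64_kpps, armv7_kpps):.2f}x")
    print(f"{percent_gain(aarch64_kpps, armv7_kpps):.0f}% faster")
```

With the figures above, this works out to about 2.08x, or roughly 108% faster, so the firewall gain is well above the Dhrystone one.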
Benchmark results can differ greatly depending on the compiler flags selected, but sadly Matteo did not provide the full command lines used to build the OS and samples.
I wanted to get some more data points, so I had a look at sbc-bench results available for both 32-bit Raspbian and 64-bit Debian Buster, with the processor overclocked to 1850 MHz and running Linux 4.19 in both cases. But the results we have here are completely different, at least when it comes to AES numbers, which are twice as slow on the 64-bit version, and one of the reasons is the lack of ARMv8 Crypto Extensions in the Broadcom BCM2711 processor.
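On Linux, the presence of the ARMv8 Crypto Extensions normally shows up as the `aes`, `pmull`, `sha1` and `sha2` flags in the "Features" line of `/proc/cpuinfo`, so their absence is easy to confirm on a given board. The following is a rough sketch under that assumption. The helper names are hypothetical, and the Raspberry Pi 4 feature string used in the usage note is an illustrative example, not output copied from the sbc-bench results.

```python
# Flags that Linux typically reports when ARMv8 Crypto Extensions are present.
CRYPTO_FLAGS = {"aes", "pmull", "sha1", "sha2"}


def cpu_features(cpuinfo_text: str) -> set:
    """Return the flags from the first 'Features' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            return set(value.split())
    return set()


def missing_crypto_flags(cpuinfo_text: str) -> set:
    """Return the crypto-related flags that the CPU does not advertise."""
    return CRYPTO_FLAGS - cpu_features(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            missing = missing_crypto_flags(f.read())
    except OSError:
        print("/proc/cpuinfo is not available on this system")
    else:
        if missing:
            print("Missing crypto flags:", ", ".join(sorted(missing)))
        else:
            print("ARMv8 Crypto Extensions flags are all present")
```

Fed a line such as `Features : fp asimd evtstrm crc32 cpuid`, the check reports all four flags as missing, which is what you would expect on a processor without Crypto Extensions.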
The lack of hardware crypto may explain why it’s not faster, but it does not explain why it is that much slower with 64-bit instructions. Thomas Kaiser also noted that 64-bit code has a larger memory footprint, which causes the 7-zip test to run out of memory (oom-killer) on a Raspberry Pi 4 with 1GB RAM, while it runs fine with a 32-bit OS on the same hardware.