How One Line of Code Tripled Allwinner A20 SATA Write Performance

If you’ve been following this blog long enough, you may remember that all linux-sunxi community work aiming at improving u-boot and Linux software support on Allwinner processors started with Allwinner A10 processor found in MeLE A1000 TV box back in 2012, which at the time provided an interesting alternative to Raspberry Pi board that was in short supply at launch time and several months after.

One of the most interesting feature found in Allwinner A10 single core Arm Cortex-A8 processor was its SATA interface, and Allwinner A20 was announced a few months later with a dual core Cortex-A7 processor and virtually the same peripherals as Allwinner A10, including SATA. However when I  tested CubieTruck board connected to a mechanical drive, I noticed sequential SATA performance was fine for reads (~180MB/s), but writes were fairly slow at around 36 MB/s.

Other people complained about it, and some looked into it, and at one point it appeared the maximum SATA write performance for Allwinner A10/A20 was 45MB/s either due to buggy silicon and driver problems.

Allwinner A20 SATA Performance PatchIt turns out it may just have been a driver problem as a recent patch changing one line of code enables write speeds up about three times faster (200% improvement).

Most of us are not familiar with Allwinner SATA DMA registers, but luckily the patch explains what’s going on here:

Increasing the SATA/AHCI DMA TX/RX FIFOs (P0DMACR.TXTS and .RXTS) from default 0x0 each to 0x3 each gives a write performance boost of 120MB/s from lame 36MB/s to 45MB/s previously. Read performance is about 200MB/s [tested on SSD using dd bs=4K count=512K].

Tested on the Banana Pi R1 (aka Lamobo R1) and Banana Pi M1 SBCs
with Allwinner A20 32bit-SoCs (ARMv7-a / arm-linux-gnueabihf).

I tried to look into Allwinner A20 public documentation, but I could not find anything about P0DMACR or much details about SATA registers, as only the SATA clock appears to be documented. Maybe that explains why it took 7 years to fix this performance issue…

Igor of Armbian tested the patch on Cubietruck with the more reliable iozone benchmark, and the results look great:

A sequential write of 38875 KB/s with Linux 4.19 was vastly improved to 127084 KB/s by applying this one line patch. It’s great, and there does not seem to be side-effects so far. The patch looks fairly new, so more testing may be needed. If you are running Armbian and can wait a bit, you won’t need to apply the patch yourself since it is part of Debian & Ubuntu releases. Uenal Mutlu also submitted his patch to the Linux Kernel mailing list, so it should be part of Linux 5.2.

Support CNX Software - Donate via PayPal or cryptocurrencies, become a Patron on Patreon, or buy review samples
Notify of
newest most voted