NanoPi R5S router review – Part 1: Unboxing, OpenWrt, and iperf3 benchmarking

FriendlyElec has just launched the NanoPi R5S mini router powered by a Rockchip RK3568 processor, and the company kindly sent me two samples for review. In the first part of the review, I’ll check out the device itself, the internal design, the preinstalled OpenWrt, and run some networking benchmarks with iperf3.

NanoPi R5S unboxing

 

The router comes fully assembled together with a 3M sheet with 6 rubber feet, which, as we’ll see below, are not really necessary.


A microSD card socket can be found on one of the sides, while the rear panel comes with a USB-C port for power, a WiFi antenna hole (which can also be used to run cables for GPIO, UART console, etc.), two 2.5GbE RJ45 LAN ports, a Gigabit Ethernet WAN port, and HDMI video output.

We’ll find a Mask button for firmware flashing on the other side, and the front panel features four LEDs for “System” and Ethernet ports, as well as two USB 3.0 ports.

NanoPi R5S teardown

There are several reasons you might want to open the device: curiosity, M.2 NVMe SSD installation, soldering an SPI flash, connecting some GPIOs, an RTC battery, or a UART to TTL debug board. So it’s been made relatively easy to open, with just four screws that need to be loosened.


This will reveal the bottom side of the board with the M.2 Key M socket (I’ve purchased an NVMe SSD and am waiting for delivery), SPI flash footprint (right side), and a Samsung KLM8G1GETF-B041 eMMC 5.1 flash with 8GB capacity.

We’ll need to loosen four more screws before taking out the board from the enclosure.

Rayson RS512M32LM4 D2BDS is a 2GB LPDDR4X memory chip, and we’ve got the advertised RTL8211F (GbE) and 2x RTL8125BG (2.5GbE) Ethernet chips, plus an RK809 PMIC. We’ll also find the 16-pin SDIO/I2C connector and 2-pin RTC battery connectors on the left, 4-pin SWD and 3-pin UART headers (both unpopulated) on the top right, as well as a GPIO connector and fan header on the bottom right.


The Rockchip RK3568 processor is covered with a thermal pad that’s in direct contact with the metal enclosure for optimal cooling.

Test setup with 2.5GbE USB dongle and UP Xtreme i11 mini PC

Regular readers may remember that I had some performance issues with an RTL8156B USB dongle a while ago, but this is now fixed as Realtek sent me another one with RTL8156BG which I tested at 2.34Gbps/2.29Gbps with an iperf3 full duplex test when connected to my laptop and transferring data to/from UP Xtreme i11 mini PC through a TP-Link 2.5GbE switch.

I’ve now reused the same setup but with NanoPi R5S in the middle.

The TP-Link switch is only used as a Gigabit Ethernet switch here connecting the WAN port of the NanoPi R5S router to the internet through Xiaomi AX6000 router in case I have to install some packages on the devices. While it makes for a nicer photo to place the R5S on top of the TP-Link switch, both devices are quite hot, and it’s not recommended to do so. I only have short Ethernet cables, but I still managed to move the router on the table for testing.

OpenWrt and iperf3 benchmarking

FriendlyWrt is preinstalled on the router so it works out of the box. It’s also possible to access the LuCI interface or SSH immediately using “root” user and “password” as the password. That’s convenient, but insecure and possibly breaks the law in some countries. In any case, it’s a good idea to at least change the password the first time.

FriendlyWrt is based on OpenWrt 22.03.0-rc1 and Linux 5.10.66 kernel. Less than 250MB of RAM is used at idle with default settings, so that leaves plenty of RAM to play with since the system comes with 2GB RAM.

I haven’t connected an SSD yet, so only the root partition is mounted with 920KB used out of the 6.7 GB available. All interfaces acquired IP addresses properly at boot time with DHCP, and the devices on the LAN can also be accessed with <hostname>.lan, and got some IPv6 addresses too.

Let’s start with some iperf3 benchmarking, running “iperf3 -s” on NanoPi R5S and the following commands on the laptop:

  • Download (Rx from R5S point-of-view):

  • Upload (Tx from R5S point-of-view):
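For readers who want to reproduce this kind of test, a minimal sketch of the two client invocations might look like the following. The server address and the 60-second duration are assumptions, not values from the review; only the 10-second reporting interval matches the reports mentioned in the text. The helper just prints each command so it can be checked before being run.

```shell
#!/bin/sh
# Hypothetical address of the R5S interface running "iperf3 -s";
# replace it with the address used on your own network.
SERVER="192.168.2.1"
DURATION=60   # assumed test length in seconds
INTERVAL=10   # print a report every 10 seconds

# Print the iperf3 client command for one direction.
#   download: the client sends, so the R5S receives (Rx)
#   upload:   -R reverses the flow, so the R5S transmits (Tx)
iperf3_cmd() {
    case "$1" in
        download) echo "iperf3 -c $SERVER -t $DURATION -i $INTERVAL" ;;
        upload)   echo "iperf3 -c $SERVER -t $DURATION -i $INTERVAL -R" ;;
        *) echo "unknown direction: $1" >&2; return 1 ;;
    esac
}

iperf3_cmd download
iperf3_cmd upload
```

Each printed line can be pasted into a terminal on the client once “iperf3 -s” is running on the other end.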


That’s not quite the 2.35 Gbps and 1.85 Gbps advertised by FriendlyElec, as I only got 1.84 Gbps and 1.12 Gbps in this configuration. There are also some variations on the Rx side when looking at the 10-second reports. The good news is that there are no retransmissions at all.

Let’s now try the same with the other 2.5GbE port on NanoPi R5S, and run the commands from UP Xtreme i11:

  • Download (Rx):

  • Upload (Tx):


Rx looks better here at 2.17 Gbps, but Tx is still fairly slow at 1.12 Gbps.

Let’s see what numbers we get when using both 2.5GbE ports at the same time, with UP Xtreme i11 running “iperf3 -s” and the laptop running the following commands:


That’s a number that could have been expected due to the low Tx numbers we got before. At 1.21 Gbps, it’s still higher than the 1.12 Gbps we got with Tx only. A bit odd.

Let’s try that again but in reverse:


1.75 Gbps. I have no idea what’s going on, and how it’s possible.

I’ve also run the test again using full-duplex for 600 seconds to check out CPU, memory, and temperatures as reported in LuCI.

CPU load was between 2.0 and 2.5.

CPU #0 had relatively low utilization with most of the resources on CPU #1 to #3.

The temperature never exceeded 60°C in a room with a 28°C ambient temperature.

And somehow memory usage did not change at all with almost 1.8GB free.

In case you wonder about the performance during a full-duplex test: 1.66 Gbps and 736 Mbps.

The results are a bit disappointing. I’m now considering switching to FriendlyCore (Ubuntu Core) to check if we get similar results, and perform further testing. I plan to start on Saturday using the second router. If you want me to check some other parts in OpenWrt let me know in the comments section.

Continue reading “NanoPi R5S preview – Part 2: Ubuntu 20.04 (FriendlyCore)“.


62 Replies to “NanoPi R5S router review – Part 1: Unboxing, OpenWrt, and iperf3 benchmarking”

  1. Did you try to move the ethernet interrupt to another CPU ?

    I have this on my rk3399 to help achieve Gbps performance:

    echo 3 >/proc/irq/$(awk -F":" "/eth0/ {print \$1}" </proc/interrupts | sed 's/\ //g')/smp_affinity_list
    echo 7 >/sys/class/net/eth0/queues/rx-0/rps_cpus
    echo 32768 >/proc/sys/net/core/rps_sock_flow_entries
    echo 32768 >/sys/class/net/eth0/queues/rx-0/rps_flow_cnt
    /sbin/ethtool -K eth0 rx off tx off
    echo 0 > /proc/sys/net/ipv4/tcp_sack
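The two affinity files used above take different formats: smp_affinity_list accepts a list of core numbers (so 3 means core 3), while rps_cpus expects a hexadecimal bitmask (so 7 means cores 0, 1 and 2). A small helper, sketched here under the hypothetical name cores_to_mask, converts a core list into such a mask:

```shell
#!/bin/sh
# Convert a list of CPU core numbers into the hexadecimal bitmask
# expected by files such as rps_cpus (bit N set = core N allowed).
cores_to_mask() {
    mask=0
    for core in "$@"; do
        mask=$(( mask | (1 << core) ))
    done
    printf '%x\n' "$mask"
}

cores_to_mask 0 1 2   # prints 7, the value written to rps_cpus above
cores_to_mask 3       # prints 8, the mask form of "core 3"
```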

      1. It’s more about stuff like ‘all NIC IRQs on cpu0’ and other unfortunate affinity settings (see RPS). I would look into the following first to identify areas of needed optimization:

        • /proc/interrupts
        • find /sys \( -iname "*policy*" -o -iname "*governor*" -o -iname "*aspm*" \)

        (later on setting every policy/governor to the max including cpufreq since benchmarking a system dynamically adjusting speeds here and there does not generate any insights)
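The find command above only lists paths. A minimal sketch that also prints the current value of each matching file could look like this; the function name show_tunables is hypothetical, and the root directory is a parameter so the helper can be tried on a scratch directory first:

```shell
#!/bin/sh
# Print "path: value" for every readable regular file whose name contains
# "policy", "governor" or "aspm" below the given root (default: /sys).
show_tunables() {
    root="${1:-/sys}"
    find "$root" -type f \
        \( -iname '*policy*' -o -iname '*governor*' -o -iname '*aspm*' \) \
        2>/dev/null |
    while IFS= read -r f; do
        # Some sysfs files are write-only or restricted; skip those.
        value=$(cat "$f" 2>/dev/null) || continue
        printf '%s: %s\n' "$f" "$value"
    done
}
```

Running show_tunables with no argument on the router would walk /sys, which typically surfaces entries such as scaling_governor alongside any devfreq or ASPM settings the kernel exposes.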

        1. This tinkering might lead to fixing the networking issue with Ubuntu on MeLE Quieter2. I don’t really know anything about this, but I just saw the Quieter2 review and found that behavior very odd.

  2. Not at all surprised by the results, considering even normal gigabit speeds can require quite some tuning to achieve on Arm-based hardware. This should be able to perform better, but it’s going to take some work on the software side.

    1. Since FriendlyElec reported results, I would have thought their FriendlyWrt image was already optimized.

      1. I highly doubt it, as at least the projects I’ve been involved in, which were limited to Gigabit speeds, went from 800-850-ish Mbps to 950+ Mbps once some work was put into optimising the Ethernet performance. I honestly don’t know here though, but if this is the performance they can offer, then this is not worth using as a network appliance.

        1. It mostly depends on the drivers and NICs, but when you have a mostly decent controller, gigabit is trivial. I used to fill the gig pipe in both directions on ARMv5 already (marvell’s Feroceon at 1.2 GHz which was a single core), and with more recent hardware (macchiatobin with A72 at 2 GHz), I’m doing 19 Gbps on two 10G ports. It’s not a matter of ISA but only a matter of crappy ethernet controllers found on cheap low-end devices, and/or attached to a low-bandwidth bus (iMX6 anyone?). Usually if you see “dwmac” in dmesg, you know you won’t benefit from much efficiency. Here it’s realtek on PCIe so it can come from plenty of places, including the PCIe controller (which could be ruled out after a test on an SSD).

          1. Trivial, no, not at all. A lot depends on the SoC and these chips out of the PRC are nothing like what you’ve worked with. A lot of the time, the network performance is only 75-80% of what it should be; it’s been the same for a couple of projects I’ve been involved in, and it took some work to get it to where it should be. Both used standard Realtek controllers and performed as expected after the tuning. The iMX6 was hardware limited, so that’s a different case, TI has had even worse issues where they’ve been hardware limited to 200Mbps.
            There’s nothing inherently bad with Realtek these days, they just have a bad reputation.

          2. > There’s nothing inherently bad with Realtek these days, they just have a bad reputation

            They’re sufficient for a desktop machine but generally speaking when you need to get performance for routing or for TCP processing, you quickly measure the difference with more serious (but more expensive) chips.

          3. What do you recommend as an alternative to the RTL8125BG here? Intel i225? I know some people not that happy with those Intel chips 🙂

            Just searched for i225 M.2 2280 cards to conduct own tests. But those I could find seem to be crappy and 2242 only.

          4. I have not tried the i225 yet, only various members of the i2xx and i3xx family (e1000e and igb drivers). Even for a small server or NAS these make quite a difference. I have not searched either for such chips on M2 boards, but I will eventually do.

          5. > only a matter of crappy ethernet controllers found on cheap low-end devices

            Well, here we could deal with another symptom of using HW designed for the ‘Android e-waste world’. All those RK3568 sbc-bench results that flew in with missing cpufreq support clock the cores at ~815 MHz (no idea which u-boot version these tests relied on but I’ve a few results also with horribly low memory performance).

            So far I’ve not seen anyone trying to determine real CPU clockspeeds with FriendlyWRT…

          6. What if those advertised throughput numbers from FriendlyELEC were generated with the Ubuntu image and not FriendlyWRT? And due to different bootloader it’s impossible to reach same numbers with the latter?

            Switching from vendor to mainline u-boot often caused performance regressions (CPU cores and/or RAM clocked way lower). At least checking real clockspeeds is just one simple openssl call.

  3. > I have no idea what’s going on, and how it’s possible.

    Easy: new ARM SoC, new BSP kernel (5.10.66) and no focus on anything that’s relevant for the router use case.

    There’s unknown cpufreq behaviour which makes looking at average load (or CPU utilization which is something different on Linux) completely pointless, it’s unclear whether there’s a memory governor, whether/how PCIe ASPM is at play and finally there’s (missing) network tunables (see Gaël’s comment).

  4. Have you tried setting the MTU to 9000 on both sides? Setting a higher MTU should help quite a lot with the speeds.

    1. Sorry, but that’s largely a myth these days and really makes very little difference. Also, if the entire network isn’t configured to use an MTU of 9000, it does nothing.

      1. It only used to be true 20 years ago when IRQs were the limiting factor. 15 years ago it already changed, and having to do an order-2 allocation for each incoming packet was so expensive that it often eliminated all gains from the lower header processing. I used to shrink MTU on such networks to 7340 which had the good property of going back to order-1 (less than 8kB) and of preserving integral multiples of the 1460 TCP MSS, usually cutting CPU usage in half compared to 9k! Nowadays there is no point doing all of this at all anymore. Even 100G NICs work pretty fine at 1.5kB, and smaller MTUs reduce the impact of drops in switch port buffers. So please, let’s not spread that jumbo frame myth, it would be nice if everyone definitely forgot about that crap.
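The arithmetic behind the 7340 figure can be checked with a short sketch. It assumes 40 bytes of IPv4 and TCP headers without options, which is one reading of the comment rather than something it spells out, and it ignores any per-buffer overhead the kernel adds:

```shell
#!/bin/sh
# For a given MTU, report how many full 1460-byte TCP segments fit in the
# payload (MTU minus 40 bytes of assumed IPv4 + TCP headers) and whether
# the frame stays below 8 kB (8192 bytes).
check_mtu() {
    mtu=$1
    payload=$(( mtu - 40 ))
    segments=$(( payload / 1460 ))
    remainder=$(( payload % 1460 ))
    if [ "$mtu" -lt 8192 ]; then fits="yes"; else fits="no"; fi
    echo "MTU $mtu: $segments x 1460 + $remainder bytes, below 8 kB: $fits"
}

check_mtu 7340   # MTU 7340: 5 x 1460 + 0 bytes, below 8 kB: yes
check_mtu 9000   # MTU 9000: 6 x 1460 + 200 bytes, below 8 kB: no
```

Under those assumptions, 7340 carries exactly five standard segments and stays under 8 kB, while 9000 does neither.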

  5. This device does interest me, they can send me one also. I’m more looking for an always-on Kodi/media server/router device. I’m missing a few network ports next to my TV, so it would be cool if I could use this combination instead of a network switch.
    Can you do some power consumption tests also?

      1. Yes, I wonder about power too. Especially if 5V or 12V will make a difference in efficiency.

        There supposedly are solderpads to circumvent the entire usb-pd logic. Which would be my preferred choice, given i have a 5V PSU sitting on rail in a fuse-box where the router would be placed.

        1. According to another review relying on one of those unreliable USB powermeters R5S in idle generates a 3.6W number. The board being stressed with 3 tasks in parallel (iperf3 and writing to NVMe SSD included) resulted in only 1.8W more (5.4W).

          Hardkernel reports for their RK3568 board 1.25W idle and 4.44W for ‘CPU Stress’, the former number being questioned by multiple forum threads.

  6. > I’m now considering switching to FriendlyCore (Ubuntu Core) to check if we get similar results

    Without addressing all the unknown pieces first IMO this is just a waste of time. You would need to set performance cpufreq governor, then search for stuff like dmc/devfreq (dynamic memory performance adjustments), ASPM (affecting the PCIe attached NICs) and so on.

    That means checking /sys/kernel/debug/clk/clk_summary and searching below /sys/

    IMO only afterwards adjusting network tunables starts to make sense… 🙂

  7. Measuring iperf from/to the device is not really a good benchmark for a router, as tools like iperf also put a load on the system. Furthermore, running iperf on the R5S will probably test the speed of the NIC more than how good it is at passing traffic, let alone routing traffic.

    For router measurement I would suggest configuring it as a router and actually check the routing functionality.

    1. iPerf is a good way to find out the throughput a device is capable of though, which in this case, is proven to be nowhere near 2.5 Gbps in most test scenarios.

      1. Little anecdote from 6 years ago when community started to play around with Allwinner A64. With Allwinner’s BSP kernel the defaults (both cpufreq governor and parametrization) led to both iperf task and interrupt processing ending up on cpu0 while staying on the lowest cpufreq OPP 408 MHz for a long time or even during whole iperf execution.

        As such iperf numbers were crappily low but all that was needed to ‘fix the numbers’ was changing one single sysfs value 🙂

        1. Or in other words: the iperf task itself can become a bottleneck with inappropriate settings. And without extensive background monitoring the ‘tester’ simply doesn’t know.

          Without taking cpufreq scaling into account staring at CPU utilization is useless (since 50% utilization at 2GHz is more activity than 90% at 1GHz) and wrt RK3568 now we know there’s an MCU + firmware + boot BLOB that does additional funny things with real clockspeeds.
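The utilization comparison above works out as follows when utilization is multiplied by clock speed. This is only a rough proxy that assumes the same amount of work per clock cycle at both frequencies; the function name busy_mhz is made up for the illustration:

```shell
#!/bin/sh
# Rough activity proxy: utilization (%) multiplied by clock (MHz), divided by 100.
busy_mhz() {
    echo $(( $1 * $2 / 100 ))
}

busy_mhz 50 2000   # prints 1000
busy_mhz 90 1000   # prints 900
```

So the core showing 50% at 2 GHz is doing more work than the one showing 90% at 1 GHz, which is why utilization alone says little without the actual clock speed.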

          1. Fair enough and you’re right, we have no idea how well this platform is working. I just meant that iPerf as a test, isn’t a bad test for testing network throughput. Obviously if there’s something odd going on with the hardware it’s running on, you’re going to get poor results.

        2. > iperf task and interrupt processing ending up on cpu0

          That’s a pretty common problem when doing network testing, I’m always assigning processes using taskset. Also it’s still common to find the horrible irqbalance running on machines while it’s been known for being a devastating performance killer for more than a decade now. It used to make sense at an era where application servers would assign heavy request processing to the core that accepted the connection. With multi-queue, accept queues, and thread pools it makes no sense, just like it makes no sense with low-latency processing like routing or HTTP forwarding.

      2. Sorry but it isn’t. I’ve worked in router development, and those chips have a dedicated networking engine that does the routing (as well as some packet inspection etc). The CPU is not involved in the actual routing process unless you want to do very specific things.
        I could do 10GB routing although the CPU on the SoC was rather limited.
        Traffic to/from the CPU is totally different; processing can become the bottleneck.

    2. I would argue that the 2.5GbE ports are the key selling points of the device. You could have a mini 2.5GbE network without having to purchase a 2.5GbE switch.
      For example one NAS plus one client, or two clients with the NanoPi R5S acting as a NAS with one or two USB 3.0 drives.

      I’ll test the router function as well, but a little later.

  8. > CPU #0 had relatively low utilization with most of the resources on CPU #1 to #3

    Judging by the screenshot almost everything CPU was Soft-Irq which makes me wonder whether you have any sort of QoS (for example traffic shaping like SQM) on the interface you’re testing with?

  9. BTW: NanoPi R5S is the slowest RK3568 thingy so far. Of course this is not related to hardware but software. All the RK3568 devices in the wild up until now either used the old Rockchip BSP from last year (based on kernel 4.19) or latest and greatest mainline kernel (5.15 upwards).

    FriendlyELEC now uses the newer Rockchip BSP based on kernel 5.10.66 and obviously different boot BLOBs and now the RK3568 is only clocked at 1872 MHz while pretending to run at 1992 MHz: http://ix.io/3ZaV

        1. It involves RPS:

          set_interface_core 4 "eth1-0"
          set_interface_core 4 "eth1-16"
          set_interface_core 4 "eth1-18"
          echo b > /sys/class/net/eth1/queues/rx-0/rps_cpus

          I.e. they receive this IRQ on core 2 and redistribute the incoming traffic to cores 0,1,3. That’s the right way to use RPS. However you have to manually assign iperf and watch the first core that saturates. If it’s saturating core 2 with ksoftirqd first, make sure that iperf runs on any of the other 3. If core2 is slightly idle, try to put iperf on it. If putting it on it makes ksoftirqd pop up, then they’re hindering each other, and you’d rather change the RPS setting to free another core and use it for iperf.
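To see why those values map to core 2 and cores 0, 1, 3, the hexadecimal masks can be decoded bit by bit. The sketch below assumes the first argument of set_interface_core is a CPU bitmask, as the explanation above implies; mask_to_cores is a hypothetical helper name:

```shell
#!/bin/sh
# Decode a hexadecimal CPU bitmask (as used by rps_cpus) into a
# comma-separated list of core numbers.
mask_to_cores() {
    mask=$(( 0x$1 ))
    core=0
    cores=""
    while [ "$mask" -gt 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            cores="${cores:+$cores,}$core"
        fi
        mask=$(( mask >> 1 ))
        core=$(( core + 1 ))
    done
    echo "$cores"
}

mask_to_cores 4   # prints 2      (where the IRQ is received)
mask_to_cores b   # prints 0,1,3  (where RPS spreads the incoming traffic)
```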

      1. > There’s a new image coming soon with optimized settings.

        There’s 3 areas of optimization and two are related to ‘dealing with new Android e-waste platform’:

        • MCU / firmware / boot BLOBs (cpufreq anomalies)
        • lesser known kernel stuff (e.g. memory performance adjustable via governor/policy, PCIe powermanagement)
        • network tunables and well known kernel stuff like SMP/IRQ affinity
    1. That thing probably consumes about 4W at idle. I need to purchase another power meter… Those tend to break within one year…

      Are people going to run those on batteries? I suppose it would make sense as a portable router.

      1. You are stuck in the situation of trying to repeat experiment results, but do you have the exact hardware, cables, software, BIOS, etc. that FriendlyElec had?
        It should not matter, the mind says. However, exact science is science: never assume a result, but check and measure.
        I’ve known people to spend hours on problems, then just change a cable and it worked as it should.

  10. i would say try using diff os, try deb arm and opnsense, try doing bridging and bonding and see how much actual link you can get between nodes through what is effectively 5gbe
    the network stack may have to be optimized for this arm chip, please test netfs share from the nvme – with a few devices it may scale #sysctl.conf #openwrt

    1. Don’t forget this is an Arm platform. You can’t just install any random OS and expect it to boot or work properly. The only options right now are OpenWrt or Ubuntu Core.

    2. > bonding … what is effectively 5gbe

      Can you please share a universal recipe that involves ‘bonding’ to create an ‘effectively 5gbe’ link out of 2 x 2.5GbE?

      1. You just have to twist both cables tightly together, should that not be enough try adding sikaflex to improve the bonding 😉

        1. Just like the ‘set MTU to 9000’ myth this misunderstanding of ‘bonding’ sucks since almost all available modes do not turn 2 x 2.5GbE into a 5GbE link. Only distribution of more than one 2.5GbE links might improve (and fault tolerance).

  11. > If you want me to check some other parts in OpenWrt

    Well, since you’re asking 🙂

    • echo performance >/sys/devices/system/cpu/cpufreq/policy0/scaling_governor
    • openssl speed -elapsed -evp aes-256-cbc 2>/dev/null | grep ^aes
    • openssl speed -elapsed -evp aes-256-gcm 2>/dev/null | grep ^aes

    In general I’m also curious whether R5S can be powered by a dumb 5V power source or a USB PD capable charger is needed.

    1. Results:

      I’m using Khadas VIM4’s USB PD power supply right now. I have not tried with a dumb 5V power source. I’d assume it should be fine until we connect a few USB 3.0 drives.

      1. > aes-256-cbc …857937.24k

        Ok, based on A55 with ARMv8 Crypto Extensions formula now we know that the CPU cores clock at ~1840 MHz with the FriendlyWRT image. 🙂
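The formula itself is not given here, but the two figures quoted (the aes-256-cbc score and the resulting clock estimate) imply a score-per-MHz ratio that can be worked out directly. The sketch below only reverse-engineers that ratio from those two numbers; it is an illustration, not the commenter's actual method:

```shell
#!/bin/sh
# Both figures come from the comment above: the aes-256-cbc score
# (in thousands of bytes per second) and the clock it was said to imply.
SCORE=857937.24
CLOCK_MHZ=1840

# Ratio implied by those two figures (score per MHz).
implied_ratio() {
    awk -v s="$1" -v m="$2" 'BEGIN { printf "%.1f\n", s / m }'
}

# Apply a ratio to a score to get a rough clock estimate in MHz.
estimate_mhz() {
    awk -v s="$1" -v r="$2" 'BEGIN { printf "%.0f\n", s / r }'
}

RATIO=$(implied_ratio "$SCORE" "$CLOCK_MHZ")
echo "implied ratio: $RATIO"                                    # 466.3
echo "clock estimate: $(estimate_mhz "$SCORE" "$RATIO") MHz"    # 1840 MHz
```

A score from another Cortex-A55 system with the same crypto extensions could be passed to estimate_mhz with the same ratio to get a ballpark clock, with the caveat that the ratio is derived from a single data point.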

        In case you’ve an RPi 15W USB-C power brick lying around I would appreciate a test (since this thing being less expensive than any USB PD charger and most probably the best piece of hardware ever manufactured by RPi Trading Ltd.)

    2. I’ve got access to a 5V/3A power supply I used with Raspberry Pi before.
      It works just fine.

  12. I’m not familiar with FriendlyWRT but does it allow you to designate the wan port as one of the 2.5Gb ports? Also, what’s the routing throughput on this?

    1. FriendlyWRT is OpenWRT running with a BSP kernel (SoC vendor’s kernel and not upstream). As such everything relevant works the same.

  13. Looks like FriendlyWrt is a custom early build from OpenWrt 22.03-rc branch, it’s definitely not rc1 (let alone rc3 which is out now) since the kernel here is 5.10.66. Rc1 came with 5.10.111. Probably going to buy this once software support matures a little and perhaps when 22.03 is released.

    1. Have you tried running an m.2 SSD in this? I’ve tried both a SATA and NVMe drive and can’t get either to show up under “Mount Points”. They show up if connected via USB so what am I doing wrong?

  14. Hey does anyone know if this supports mwan3?

    My idea is to use this device as a router to manage two WAN connections (using one as failover) + my LAN…
