SD cards used to store media data only, for example photos and videos in your camera or smartphone, but with the introduction of “Adoptable Storage” in Android 6.0 you can now run apps directly from a micro SD card, and many development boards rely on a (micro) SD card to run the full operating system. The difference matters: with media storage, raw sequential read and write speeds are what count, since large files are created and accessed, but apps and operating systems perform many small read and write operations on the card, such as database accesses, so random IO performance becomes much more important. So far, the SD card specifications have only reported sequential performance through different speed classes. For example, “Class 10” SD cards are often recommended for the Raspberry Pi, but that rating does not clearly indicate the random IO performance.
SD Specifications 5.1 address this issue by introducing a new Application Performance Class: micro SD cards with the A1 performance class will deliver at least 1500 random read input/output operations per second (IOPS), 500 random write IOPS, and 10 MB/s sustained sequential performance.
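As a rough illustration of these minimums, the short Python sketch below checks a set of measured numbers against the three A1 figures quoted above. The function name `meets_a1` and the sample measurements are hypothetical; only the threshold values come from the text, and the sketch says nothing about how the SD Association actually runs its tests.

```python
# Minimum figures for App Performance Class A1 as quoted in the article.
A1_MIN_READ_IOPS = 1500   # random read IOPS
A1_MIN_WRITE_IOPS = 500   # random write IOPS
A1_MIN_SEQ_MB_S = 10      # sustained sequential MB/s


def meets_a1(read_iops, write_iops, seq_mb_s):
    """Return True if all three measurements reach the A1 minimums.

    Hypothetical helper for illustration only.
    """
    return (
        read_iops >= A1_MIN_READ_IOPS
        and write_iops >= A1_MIN_WRITE_IOPS
        and seq_mb_s >= A1_MIN_SEQ_MB_S
    )


if __name__ == "__main__":
    # Made-up measurements: good reads, but writes below the A1 minimum.
    print(meets_a1(read_iops=1600, write_iops=450, seq_mb_s=20))
```

A card has to clear all three thresholds at once, so a card with fast sequential speeds but weak random writes would still fail the check.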
It will be easy to purchase such cards, as they will come with either an “A1 App Performance” or an “A1” logo.
There’s no difference in performance between the two logos. Eventually, the SD Card Association will release higher App Performance levels, likely A2, A3… as the market evolves. The new App Performance Class is briefly explained in the video below.
SD Specifications are only available to paid members, but you may find out more by reading the white paper entitled “Application Performance Class: The new class of performance for applications on SD memory cards”, or by visiting the SD Card Association Application page. One of the first micro SD cards compliant with Class A1 is the 256GB Sandisk Ultra microSD UHS-I card, although Sandisk does not seem to have done anything specific to meet the Class A1 requirements: the card was already fast enough, and Sandisk simply added the A1 logo.
Thanks to tkaiser for the tip
Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager, and starting to write daily news, and reviews full time later in 2011.
17 Replies to “SD Specifications 5.1 to Introduce App Performance Class (for Random I/O) & Logo”
Any idea how long before the boots on devices will support direct booting from these? Or do they already? I’ve run across a few situations where I’ve wanted to use 128GB devices and they were not bootable. And in one case the thing didn’t work at all; it needed < 64GB.
(throughput, latency and IOPS) that is what matters.
The higher capacity and price, the higher the chance you buy counterfeit SD cards: http://www.happybison.com/reviews/how-to-check-and-spot-fake-micro-sd-card-8/
Easy solution: always check your card directly after purchase and don’t use stupid burning tools but Etcher instead.
You have to consider the maximum capacity your device supports, on both the hardware and the operating system side.
This is good to see. Let’s hope that there is some kind of compliance testing so that this doesn’t end up like the “C” performance levels which seemed to have been self reported by card makers–and often highly inaccurate.
Well, testing is pretty easy. Just be prepared that some filesystem-related overhead is involved and, more importantly, that settings matter when testing on devices that implement cpufreq scaling (switching to the ‘performance’ cpufreq governor prior to testing is strongly recommended since otherwise you get numbers without meaning).
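For readers who want to script the governor switch, here is a minimal Python sketch. It assumes the standard Linux cpufreq sysfs layout (`/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor`) and root privileges; the function name `set_governor` and its `sysfs_root` parameter are hypothetical and only there so the logic can be tried against a scratch directory.

```python
from pathlib import Path


def set_governor(governor="performance", sysfs_root="/sys/devices/system/cpu"):
    """Write `governor` to every scaling_governor file under sysfs_root.

    Returns the list of files that were written. On a real system this
    needs root, and the governor must be one the kernel offers.
    """
    changed = []
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for path in sorted(Path(sysfs_root).glob(pattern)):
        path.write_text(governor)
        changed.append(path)
    return changed


if __name__ == "__main__":
    for path in set_governor("performance"):
        print(f"{path}: {path.read_text().strip()}")
```

Remember to switch back to the previous governor after benchmarking if power consumption or heat matters on the device.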
With iozone for example you would use
The first line reports IOPS, the 2nd KB/s (so you would always have to divide by the test block size to get IOPS). When we (Armbian) started with SD card performance tests last year, the rather cheap 32GB and 64GB Samsung EVO/EVO+ cards I personally tested were all compliant with the ‘A1’ profile, slightly exceeding 500/1500 4k IOPS. But it seems Samsung has in the meantime chosen slower dies and/or controllers, and EVO/EVO+ cards bought recently are a lot slower when it comes to random IO performance. Also, one of the 64GB cards I just retested now shows only 421/1165 4k IOPS, but back then I only tested with ext4 and F2FS, and since this is a production system with some load on it and I use totally different FS settings (see /etc/mtab below), results cannot be compared:
So this new spec is great anyway, since vendors applying for the logo have to take care of random IO performance and we can be assured that random performance won’t suck as much as with ‘the average SD card today’ (many show horribly low 4k and 16k random IOPS numbers), even if A1 numbers aren’t reached (or decrease over time).
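The KB/s to IOPS conversion mentioned in this comment is simple arithmetic, sketched below in Python. The helper names are hypothetical, and the 421/1165 figures are read here as write/read, matching the 500/1500 order the comment uses for A1; that reading is an assumption.

```python
def kb_per_s_to_iops(kb_per_s, block_size_kb=4):
    """Convert a throughput in KB/s into IOPS for the given block size."""
    return kb_per_s / block_size_kb


def iops_to_kb_per_s(iops, block_size_kb=4):
    """Convert IOPS into the equivalent throughput in KB/s."""
    return iops * block_size_kb


if __name__ == "__main__":
    # The retested 64GB card from the comment, at a 4k block size.
    for label, iops in (("write", 421), ("read", 1165)):
        print(f"{label}: {iops} IOPS = {iops_to_kb_per_s(iops)} KB/s")
    # The A1 write minimum of 500 IOPS expressed as 4k throughput.
    print(f"A1 write minimum: {iops_to_kb_per_s(500)} KB/s")
```

Seen this way, 500 random 4k write IOPS amounts to only about 2000 KB/s, which shows how far random performance sits below the sequential figures printed on the packaging.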
This depends on your device and the SD card, obviously. You won’t get a boot off of the stock configuration of the 128GiB cards because pretty much NOTHING supports ExFAT right at the moment. Me, I’ve gotten good results with “premium” microSD’s that would likely meet the A1 spec currently on my RaspberryPi3. Its only drawback is that the media controller there is still slower than sharing the USB 2.0 bus and using USB media on the four ports on the Pi3. But it DOES boot up with it directly and works “fine” for the values present with a Pi type device and SD (which would be similar for any other media device)
A1’s not all that stellar (when you’re talking NVMe with Gigabits and 1000+ IOPS… it’s not all that hot except in the world we play in with this stuff), but lets you get the 10-40 Mbit/s read/write out of the device most of the time. I’d love to see a clear, definitive A1, A2, A3 designation for us to be able to just walk up and not guess at what we’re going to see. Claims of 100 Mbps write or read notwithstanding, it doesn’t do much good if the device doesn’t do any more than 500 IOPS.
This is only a problem when using those Raspberry Pis, since there Linux is under control of the ‘firmware’ running on VideoCore IV, which is so outdated that it can’t cope with anything other than FAT16/FAT32. That is why there always has to be a small FAT partition and why you need ‘SD formatter’ when you want to use a large SD card with Raspberry (since you need to reformat the card to be FAT32 so the primitive VideoCore IV can find the blobs needed to boot and later start the linux kernel)
For every SBC out there that is not as limited as those Raspberries this is no problem at all, since anything present on the card when you buy it (including partition table and filesystem) gets overwritten when you burn your OS image (so also no need to waste time with ‘SD formatter’, it’s really just an insane waste of time but unfortunately a lot of ‘tutorials’ on the net suggest this BS; it’s really only needed on the RPi platform due to VideoCore limitations)
Regarding A1 specs please be aware that those SD card brands often recommended (Kingston or PHY for example) show horribly low numbers here (some even below 50 random 4k write IOPS which means an Android or Desktop Linux running off such a card is close to unusable). Some scary numbers can be found in this thread: http://forum.pine64.org/showthread.php?tid=191&page=5
Charbax has a video up with general chat. http://armdevices.net/2017/03/04/sd-card-uhs-iii-with-624mbs-and-microsd-a2-application-performance-class-2/
Some information about A2 (App Performance Class 2) available here: https://www.sdcard.org/consumers/choices/application/ (most important piece of information: A2 preconditions and test environment will be specified later in the not yet released SD 6.1 specification).
Speed/performance class compliance testing will happen with test equipment. So I’m already curious how that will relate to real-world situations since stuff like the cpufreq governor used in Linux/Android can have a huge impact on real storage performance in embedded/mobile devices (and SBC as well)
BTW: No SD card I tested so far is close to A2 compliant, best results with eMMC were Orange Pis (PC Plus, Plus 2E) (3500/1625 read/write IOPS) and Hardkernel’s eMMC modules for ODROID-C2 (write IOPS exceeding 4000 but read numbers a lot lower, this has to be re-tested properly)
Most eMMC flash chips can handle 4K/2K R/W IOPS, if we use the company’s numbers, for example Samsung: http://www.cnx-software.com/2016/10/31/samsung-emmc-and-ufs-2-0-embedded-flash-chips-and-performance-in-2016/
I guess your results may be with iozone3.
Nope, at least not now if we’re talking about stuff that is used on real devices today 😉
The eMMC used on most SBC for example is really ‘low end’, SinoVoip and FriendlyELEC for example use Samsung eMMC that does not even meet the ’10 MB/s sequential write performance’. Writes get bottlenecked by eMMC and reads by SDIO implementation (maxing out at 80 MB/s with all the cheap implementations). If we’re talking about SBC it seems Hardkernel are the only ones really caring about maximum performance (and surprisingly Xunlong even on the cheap Oranges also soldering performant eMMC).
Anyway: I played around with an ARM device using Apple’s A9X a few weeks ago (the larger iPad Pro) and it’s amazing how fast storage is there. Memory bandwidth of this SoC is amazing and also ‘storage performance’. Apple’s acquisition of Anobit a few years ago led to PCIe attached NAND storage making use of NVMe which is orders of magnitude faster than any eMMC around (and most consumer SSDs used in PCs), especially if we’re talking about random IO 🙂
The recent TV boxes I’ve tried all use mainstream or the top eMMC 5.1 chip of the “low end” segment, all of which deliver above 4K/2K IOPS. Now, once you put them in a device, they don’t always extract the maximum performance out of those.
Well, the problem is that IOPS is nothing clearly defined when used in vendor data sheets (same with SSDs for example, every vendor can use whatever numbers he wants as long as he’s able to document ‘some procedure’ how numbers were created in case somebody should complain). Even the most basic information (which block size? Are we talking about 4K?) is missing when looking at vendor IOPS numbers. Same for the procedure used to test, for example queue depth / number of parallel requests plays an important role.
Serious publications testing IOPS document this (eg. ‘Iometer settings xy’ or AS SSD ‘4K-64Thrd’ test) and use always the same setup when testing different devices (so with the specific workload in mind different products become comparable regarding random IO — BTW: it’s not all about IOPS, latency is the other and for some workloads way more important factor). But vendor IOPS are always questionable, at least up to now or let’s say broader adoption of SD Association’s App performance class ratings.
Anyway: Without knowing how SD Association defines their IOPS test setup it’s also just a broad indicator since specific workloads might make a huge difference. I’m currently testing some USB flash media and it’s amazing how the sticks behave depending on some environmental parameters, eg. huge performance drops after writing large amounts of data (obviously throttling occurring) or a huge variation between isolated IOPS (read or write) and those for realistic scenarios (a certain mix of reads and writes).
Aren’t latency and IOPS related? I’d expect less IO per second with higher latency.
Of course latency and IOPS are somewhat related (the lower the queue depth the more IOPS depend solely on latency) but it depends a lot on storage/controller internals how both relate and which bottlenecks are here and there (and also how external parameters like parallel requests impact test results and benchmark setups — ‘queue depth’ being the magic word here)
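The basic relation can be sketched with a little arithmetic. The Python snippet below uses the idealized rule that throughput equals the number of requests in flight divided by the time each one takes (Little's law). It assumes constant latency and a device that really services `queue_depth` requests in parallel, which real cards and SSDs only approximate; the function name is hypothetical.

```python
def ideal_iops(avg_latency_ms, queue_depth=1):
    """Idealized IOPS for a given average latency and queue depth.

    Assumes every request takes exactly avg_latency_ms and that
    queue_depth requests are always in flight (an upper bound).
    """
    return queue_depth * 1000.0 / avg_latency_ms


if __name__ == "__main__":
    # At queue depth 1, IOPS depend solely on latency.
    print(ideal_iops(2.0))        # 2 ms per request
    # The same latency hidden behind 32 parallel requests.
    print(ideal_iops(2.0, 32))
```

At queue depth 1 a device needing 2 ms per request cannot exceed 500 IOPS, while the same per-request latency spread over 32 parallel requests could in theory reach 16000 IOPS. That is why the queue depth has to be stated for an IOPS figure to mean anything.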
For some workloads average and max latency become more important than IOPS. Imagine caching in a storage cluster to prevent data losses in case one cluster node dies. With a ZFS based implementation this is done by the so called ZIL cache and in a cluster setup you use ‘sync=always’ and accept a transaction to be finished only when all ZIL devices in the cluster confirm writing the blocks to their respective SSDs. Now you get SSDs with same IOPS ratings that differ in price a lot. And the difference is ‘IO consistency’ or latency. Even SSDs with high latency can show high IOPS rates when we’re talking about many IO requests in parallel — high queue depth by using internal parallelism but this doesn’t help here since finishing single transactions in the shortest time possible is the only metric that counts.
Well, somewhat off-topic (again 😉 ). I would recommend searching for ‘intel p3700 consistency anandtech’ and reading at least the test introduction to get the picture. And everything written there regarding AHCI vs. NVMe can be used to understand differences in low-end storage too (SDIO vs. UFS for example)
Some performance data generated with two new 32 GB SanDisks: An Ultra A1 and an Extreme A1: https://forum.armbian.com/topic/954-sd-card-performance/?do=findComment&comment=49811
TL;DR: Buy genuine A1 cards and forget about everything else around. At least if it’s about random IO performance (and the ‘rootfs on SD card’ use case)