When reviewing the performance of a Storage solution, there
are three basic metrics to look at:
IOPS – the number of Input/Output Operations per Second
Bandwidth – the number of bytes transferred per second (usually measured
in Megabytes per second, ‘MB/s’)
Latency – the amount of time each IO request will take to complete
(usually, in the context of Solid State Storage solutions, measured in
Microseconds, which are millionths of a second).
IOPS and Bandwidth had both been growing rapidly before the
advent of Solid State Storage, but Latency can only be significantly decreased
by eliminating mechanical devices. Latency is therefore the single most
important improvement that Solid State solutions deliver.
Latency in a technical environment is synonymous with delay.
In the context of a Solid State solution it is the amount of time between an IO
request being made, and when the request is serviced.
Bandwidth, also commonly referred to as ‘Throughput’, is the
amount of data that can be transferred from a storage device to a host, in a
given amount of time. In the context of Solid State solutions it is typically
measured in Megabytes per second (MB/s).
A great Solid State solution offers an effective balance of
all three metrics. High IOPS and Bandwidth are simply not enough if Latency
(the delay in an IO operation) is too high.
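The link between IOPS and Bandwidth can be sketched with a little arithmetic: at a fixed IO size, Bandwidth is simply IOPS multiplied by the size of each IO. The short Python sketch below illustrates this. The function name and the 90,000 IOPS figure are hypothetical, chosen only for illustration, and 1 MB is taken to be 1,000,000 bytes.

```python
def bandwidth_mb_per_s(iops: float, io_size_bytes: int) -> float:
    """Bandwidth implied by an IOPS figure at a fixed IO size.

    Assumes 1 MB = 1,000,000 bytes.
    """
    return iops * io_size_bytes / 1_000_000


# Hypothetical drive: 90,000 IOPS at a 4 KiB (4096 byte) IO size.
random_4k = bandwidth_mb_per_s(90_000, 4096)
print(f"{random_4k:.1f} MB/s")  # prints "368.6 MB/s"
```

This is why a drive can post a very high IOPS figure for small IOs while moving far fewer Megabytes per second than it does in a large-block Sequential test.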
Queue Depth is the average number of IO requests
outstanding. If you are running an application and the Average Queue Depth is
one or higher and CPU utilisation is low, then the application’s performance is
most probably suffering from a ‘Storage Bottleneck’.
A typical PC user will very rarely cause a modern SSD to
see a Queue Depth greater than 1 or 2. So for Client SSDs
we need to primarily focus on performance at low Queue Depths.
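One way to see why low Queue Depth performance is dominated by Latency is Little's Law, which relates the average number of outstanding requests to throughput and average Latency. The sketch below applies it to storage. The function names and the 100 microsecond figure are hypothetical and used only for illustration.

```python
def average_queue_depth(iops: float, latency_us: float) -> float:
    """Little's Law: outstanding IOs = IOs per second x seconds per IO."""
    return iops * latency_us / 1_000_000


def iops_at_queue_depth(queue_depth: float, latency_us: float) -> float:
    """The same relationship rearranged to give IOPS."""
    return queue_depth * 1_000_000 / latency_us


# Hypothetical drive with an average Latency of 100 microseconds.
print(iops_at_queue_depth(1, 100))  # IOPS achievable with one IO outstanding
print(iops_at_queue_depth(2, 100))  # IOPS with two outstanding, if Latency holds
```

With only one or two IOs outstanding, the only way to complete more IOs per second is for each IO to finish sooner, which is why Latency matters so much for Client workloads.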
Another important aspect to consider with an SSD is the
state of its NAND when an IO task begins. When an SSD is new, or immediately
after a purge (a Secure Erase for a SATA device) has been performed, it is in
a Fresh out of Box (‘FOB’) state and in this state all of its Blocks of NAND
are clean and able to immediately accommodate the writing of new data.
Typically SSDs are supplied with a greater capacity of ‘Total NAND’ than their stated
‘User Capacity’ and the difference between them (Total NAND – User Capacity) is
known as an Over Provision (‘OP’) at the firmware level.
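The Over Provision calculation can be written out as a short sketch. The capacities below are hypothetical, and expressing OP as a percentage of User Capacity is an assumption about convention rather than something defined in this article.

```python
def over_provision_gb(total_nand_gb: float, user_capacity_gb: float) -> float:
    """OP as defined above: Total NAND minus User Capacity."""
    return total_nand_gb - user_capacity_gb


def over_provision_percent(total_nand_gb: float, user_capacity_gb: float) -> float:
    """OP as a percentage of User Capacity (assumed convention)."""
    return 100 * (total_nand_gb - user_capacity_gb) / user_capacity_gb


# Hypothetical drive: 512 GB of Total NAND sold with 480 GB of User Capacity.
print(over_provision_gb(512, 480))                 # spare NAND in GB
print(round(over_provision_percent(512, 480), 2))  # spare NAND as a percentage
```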
If an IO Task that involves writing new data can complete
without the supply of clean blocks being exhausted, it will complete more
quickly than if blocks must first be cleaned on the fly before writes can be
accommodated. The number of free blocks available may also affect performance
(think of it this way – the more free blocks there are, the easier it is to find
one to write to). An SSD will continue to write to clean blocks until there
are no more available, after which it must free up blocks by completing an
Erase/Write cycle on the fly before it can write new data. Blocks that have
been written to are flagged as able to be cleaned in one of two ways: either the
logical address they are associated with in an Operating System is written to
again, or a Trim instruction is sent by the OS to indicate that a range of logical
addresses (which map to physical blocks) no longer contains live data. For
example, in Windows, Trim instructions are sent to an SSD when a file is
logically deleted, to indicate that all of the physical blocks which contained
the file no longer hold live data.
An SSD’s controller performs a process known as ‘Garbage
Collection’ which gathers together spaces that no longer hold live data so that
it can create clean blocks in preparation for accommodating the writing of new
live data. Pages are contained within Blocks and only complete Blocks can be
erased in preparation to accommodate new writes, so one of the responsibilities
of Garbage Collection is to move live pages out of blocks that hold a mix of
live and stale data so that whole blocks can then be cleaned. Garbage Collection
can be performed both as a regular background task and on the fly. The
effectiveness of an SSD’s Garbage Collection has a significant impact on its
long term performance. It is important to note that a Trim command does not
itself clean blocks; it will always take some time for Garbage Collection to
follow up and actually complete the cleaning process.
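The interplay between Trim and Garbage Collection can be illustrated with a deliberately simplified toy model. It assumes the usual NAND arrangement in which a block is made up of pages, data is written a page at a time, and erasure happens a whole block at a time. Every name here, the four-page block size, and the "pick the block with the most stale pages" policy are hypothetical; real controllers are far more sophisticated.

```python
from dataclasses import dataclass, field

PAGES_PER_BLOCK = 4  # hypothetical; real NAND blocks hold far more pages
FREE, LIVE, STALE = "free", "live", "stale"


@dataclass
class Block:
    pages: list = field(default_factory=lambda: [FREE] * PAGES_PER_BLOCK)

    def count(self, state: str) -> int:
        return self.pages.count(state)


def trim(block: Block, page_index: int) -> None:
    """Flag a live page as stale. Nothing is erased at this point."""
    if block.pages[page_index] == LIVE:
        block.pages[page_index] = STALE


def garbage_collect(blocks: list) -> int:
    """Clean the block holding the most stale pages.

    Live pages are first copied into free pages of other blocks, then the
    whole victim block is erased. Returns the number of live pages copied.
    """
    victim = max(blocks, key=lambda b: b.count(STALE))
    if victim.count(STALE) == 0:
        return 0  # nothing to reclaim
    to_move = victim.count(LIVE)
    others = [b for b in blocks if b is not victim]
    if sum(b.count(FREE) for b in others) < to_move:
        raise RuntimeError("not enough free pages to relocate live data")
    remaining = to_move
    for block in others:
        for i, state in enumerate(block.pages):
            if remaining and state == FREE:
                block.pages[i] = LIVE
                remaining -= 1
    victim.pages = [FREE] * PAGES_PER_BLOCK  # erase the whole block at once
    return to_move


blocks = [Block([LIVE] * 4), Block([LIVE, LIVE, FREE, FREE])]
for page in (1, 2, 3):
    trim(blocks[0], page)         # pages are flagged, but the block is not yet clean
copied = garbage_collect(blocks)  # the remaining live page is relocated, then block 0 is erased
print(copied, blocks[0].pages, blocks[1].pages)
```

Note that after the three Trim calls the first block still has no free pages. It only becomes writable again once Garbage Collection has copied out the remaining live page and erased the block, and that extra copy is work the drive must do on top of the host's own writes.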
An SSD maintains a table which holds the mapping of its
physical blocks to the logical addresses used by an OS. Effectively,
the OP is increased above that set at the firmware level whilst the drive’s
user capacity is not full of live data. In Windows a user can effectively
commit to a higher level of OP by not fully
allocating the drive’s user capacity to partitions.
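The idea of an effective OP can be sketched as follows. The capacities are hypothetical, and the sketch assumes that any space not holding live data has either never been written or has been trimmed, so that the drive knows it is free.

```python
def effective_op_gb(total_nand_gb: float, live_data_gb: float) -> float:
    """NAND not currently holding live data, available to the controller as spare."""
    return total_nand_gb - live_data_gb


TOTAL_NAND_GB = 512  # hypothetical drive with 480 GB of User Capacity
PARTITIONED_GB = 400  # the user leaves 80 GB of User Capacity unallocated

# With 200 GB of live data, far more than the firmware OP is spare.
print(effective_op_gb(TOTAL_NAND_GB, 200))

# Even with the partition completely full, the unallocated space stays spare.
print(effective_op_gb(TOTAL_NAND_GB, PARTITIONED_GB))
```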
When a drive is compelled to clean blocks on the fly to
accommodate new data it moves from an FOB state towards what is known as a ‘Steady
State’. A Steady State is achieved when performance is steady and no longer
changes significantly over time. Testing of Enterprise SSDs is always
performed when a drive is in a Steady State. It is fair to say that typically a
Client SSD will spend most of its time in an FOB state (or near to FOB state)
and it’s in this state that our testing is performed using the Desktop PC.
Remember though that one can expect to see a performance drop when the drive
holds increasing amounts of live data, as the pool of free blocks (the
effective OP) becomes smaller.
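A simple way to picture "performance is steady and no longer changes significantly over time" is a range check over the most recent measurements. The sketch below is only an illustration. The function name, the five-round window, the 20% tolerance, and the sample figures are all assumptions, not the criteria used for the testing in this article.

```python
def is_steady(samples: list, window: int = 5, tolerance: float = 0.20) -> bool:
    """True when the last `window` samples vary by no more than
    `tolerance` (as a fraction) of their own average."""
    if len(samples) < window:
        return False
    recent = samples[-window:]
    average = sum(recent) / window
    return (max(recent) - min(recent)) <= tolerance * average


# Hypothetical Random Write IOPS measured in successive test rounds,
# starting from an FOB state and settling as clean blocks run out.
iops_per_round = [90_000, 70_000, 52_000, 41_000, 40_500, 39_800, 40_200, 40_100]

print(is_steady(iops_per_round[:5]))  # still falling away from FOB performance
print(is_steady(iops_per_round))      # the last five rounds have levelled out
```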
Whilst most Client SSD users need not be overly concerned
about Steady State performance, we do push an SSD to its limits as part of our
testing on the OakGate Test Platform.
So what performance characteristics make for an excellent
Client SSD?
Put simply, we look for a solution that provides both
excellent Sequential IO performance and excellent Random IO performance.
Excellent Sequential performance supports the rapid transfer of large amounts
of data from one place to another, such as when copying a movie, loading a game,
or running a backup. Excellent Random IO performance means that a drive will
support the rapid reading, writing, and updating of relatively small files that
are randomly placed on a drive, such as is required by the Windows Operating
System, by the launching of applications, or by a database-based application.
Sequential performance is most often measured in terms of MB/s (Megabytes per
Second) and Random IO is most often measured in terms of IOPS (IO Operations per
Second). Modern SSDs deliver low Latency and support tens of thousands of Random
IOPS. Whilst very few PC users really need support for such a high level of
IOPS, it does mean that every IO will be fast.
Manufacturers most frequently quote the headline maximum
Sequential Read and Write Bandwidth for a drive. They also regularly cite the
maximum IOPS level for 4K Random Reads and Writes. Operating Systems are known
to make extensive use of the 4K IO size and this is why strong 4K Random Read
and Write performance is considered important.
I use two test platforms for testing Client Storage:
Firstly, a Desktop PC, with the following specification: CPU
– Intel Core i7-6700K, Motherboard – Asus Maximus VIII Extreme (Z170), System
Drive – Intel 750 400GB, GPU – EVGA GeForce GTX 970 FTW, RAM – 32GB Corsair
Dominator Platinum, Cooler – Corsair H110i GTX, Windows 10 using Intel RST
188.8.131.525 and with C States disabled in the BIOS, as this ensures reasonable
consistency from storage benchmarks.
Secondly, an OakGate Storage Test Platform, which is
introduced in a separate article.
The OakGate Test Platform can be thought of as a professional laboratory
instrument where the test environment is managed strictly and consistently so
that test results from multiple solutions can be compared with great confidence.
The Desktop PC is used to run a cross section of the most
respected and commonly used storage performance benchmark software, including
AS SSD, Anvil, Crystal Disk Mark, ATTO, and PCMark 8 Storage, together with a
number of real world file copies. Most of these benchmark programs are freely
and easily available for you to run on your own PC. There is a good case for reviewers
to test an SSD as a System Drive, as arguably this is the way in which most
people will use an SSD. However, I choose to test each drive as a spare
(secondary) drive, as I believe this makes it far easier to provide a consistent
basis for product comparisons, which I feel is most important.
The OakGate Test Platform is used to provide an accurate
baseline for a drive’s performance in all of the key aspects of performance,
including Sequential Reads and Writes, Random Reads and Writes, and Random
Mixed Reads and Writes. The OakGate Test Platform is also used to investigate
how a drive behaves when it is pushed to its limits and to measure a drive’s power
consumption characteristics. (All testing on the OakGate Test Platform is
conducted with fully random data and is aligned to 4K boundaries.)
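To make "fully random data" and "aligned to 4K boundaries" concrete, the sketch below shows one way a test tool might choose IO offsets and payloads. It is not how the OakGate Test Platform is implemented. The function names and the 1 GiB device size are hypothetical.

```python
import os
import random

ALIGNMENT = 4096  # 4K boundary, in bytes


def random_aligned_offset(device_bytes: int, io_size: int = ALIGNMENT, rng=random) -> int:
    """Pick a random byte offset that is a multiple of 4096 and leaves
    room for a complete IO of `io_size` bytes before the end of the device."""
    slots = (device_bytes - io_size) // ALIGNMENT + 1
    return rng.randrange(slots) * ALIGNMENT


def random_payload(io_size: int = ALIGNMENT) -> bytes:
    """Random bytes, so the data cannot be compressed or deduplicated."""
    return os.urandom(io_size)


device_bytes = 1024 ** 3    # hypothetical 1 GiB test region
rng = random.Random(2024)   # seeded so that a run can be repeated
offsets = [random_aligned_offset(device_bytes, rng=rng) for _ in range(5)]
print(offsets, len(random_payload()))
```

Random payloads matter because some controllers compress or deduplicate data, which would otherwise flatter the results. Aligned offsets ensure that each 4K IO starts on a 4K boundary rather than straddling two of them.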
In the presentation of test results I include comparisons
with other products I have tested in the same way on the same platforms.
Now let’s head to the next page, to look at the results
for the Desktop PC Synthetic Benchmarks...