Most experienced PC users are well aware
that hard disks slow down with excessive file fragmentation, so it seems
logical that running a disk defragmentation tool will fix the problem. In fact,
Windows Vista, 7 & 8 automatically schedule a weekly defragmentation, which
runs if the PC is left on overnight on the scheduled day of the week. Some
users may use a third party defragmentation tool, which performs additional
defragmentation such as defragmenting directory records and performing boot-time
defragmentation to defrag the Pagefile and locked files. However, even the most
advanced defragmentation tools tend to leave one problem behind: how the files
are physically laid out on the hard disk. The exception is a very tedious
full-disk file-sorting optimisation.
To illustrate the problem with how files
are laid out on a hard disk, let’s start by having a look at a mock-up of a
hard disk with moderate file fragmentation, where the digits ‘0’ to ‘9’ illustrate
an example directory we’re interested in that contains 10 files. To keep
this mock-up simple, we’ll only show 4 types of data – continuous, fragmented,
frequently accessed, and directory records:
As we can see here, our 10 files are
scattered all over the place. Despite this mess, only 3 of the 10 files are
actually fragmented – files 1, 4 and 8. A situation like this happens when
small files are written to a hard disk containing fragmented free space, as the
individual files end up being stored in these empty spaces left over from
previously deleted files.
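As a toy illustration of this behaviour (entirely hypothetical – real filesystem allocators are more elaborate than simple first-fit), a new file written into fragmented free space ends up split across the leftover gaps:

```python
# Toy first-fit allocator: '.' marks free blocks left by deleted files.
# A new file fills the earliest free gaps, so it ends up fragmented.
def first_fit(disk, file_id, size):
    """Write `size` blocks of `file_id` into the first free slots found."""
    placed = []
    for i, block in enumerate(disk):
        if block == "." and size > 0:
            disk[i] = file_id
            placed.append(i)
            size -= 1
    return placed

disk = list("AA.BB..C.D")           # existing files A-D with scattered gaps
print(first_fit(disk, "E", 3))      # → [2, 5, 6]: E lands in two separate gaps
print("".join(disk))                # → AAEBBEEC.D
```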
The Windows Defragmentation utility as well
as some “Fast” third party defragmentation tools will defrag files by joining
up their fragments so that each file is physically stored as a continuous block
of data. Some also consolidate the free space so that all the free space
becomes one solid block at the end of the drive. A free-space consolidation is
also required before a partition can be shrunk, as otherwise the partition
cannot be made smaller than where the last block of data is stored. In fact,
free-space consolidation is the only type of defragmentation that should ever
be carried out on an SSD and this operation should only be performed on an SSD
when the user cannot reduce the size of a partition beyond a certain point.
Basically, this is what the Windows
Defragmentation utility and most rapid third party disk defragmentation tools that
also perform free space consolidation will achieve:
While there are no red dots, one may ask: why are the files still randomly
placed? Well, if we look more closely, we can
see that files #1, #4 and #8 have been defragged. In fact, all files are stored
individually as continuous pieces of data, so technically this volume is
considered unfragmented by most defragmentation utilities, even though the
actual set of files is still randomly placed. Unfortunately, if we try reading
this folder we’re interested in, the hard disk must still make at least 10
seeks to read all the files. While this may not take long for 10 files, just
imagine reading several thousand files scattered about the drive, such as a
large photo or music collection built up over time.
Now let’s look at what happens with an
advanced defragmentation utility, which performs additional defragmentation
such as defragging the directory entries, page file and even grouping
frequently accessed files together:
The most obvious improvement is that the
directory entries and frequently accessed files have been grouped together.
However, despite this outcome, the directory we’re interested in still has its
files randomly scattered. Unless we frequently read the files in this folder,
they will remain stored like this. Even when frequently read files are
moved to the start of the hard disk, they are not necessarily physically stored
in the same order as listed in the folder.
So why doesn’t a defragmentation tool reorganise
the files such that each folder’s content is physically stored sequentially?
The problem is that the entire partition’s data will need to be read and
re-written to achieve this. So doing this for a hard disk with let’s say 800GB
of data with an average read/write speed of 80MB/s may involve 3 hours of
reading plus 3 hours of writing, not to mention all the seeking time involved
as the data is moved around. This sorting process is even more tedious to
achieve when the partition is filled to near capacity, as data would need to be
moved around in several stages.
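The back-of-the-envelope arithmetic can be checked as follows (the 800GB and 80MB/s figures come from the example above; the result lands just under the rounded 3 hours per pass):

```python
# Estimate a full-disk sort: every byte read once and written once,
# ignoring seek overhead (figures taken from the example in the text).
data_mb = 800 * 1000        # 800GB of data, in MB
speed = 80                  # average read/write speed, MB/s

hours_per_pass = data_mb / speed / 3600
print(f"~{hours_per_pass:.1f} h reading + ~{hours_per_pass:.1f} h writing "
      f"= ~{2 * hours_per_pass:.1f} h before seek overhead")
```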
There are some third party defragmentation tools that offer this very
time-consuming process, which involves leaving the PC unattended overnight or
longer, depending on the amount of data on the drive.
The following gives an idea of what this will achieve with our mock-up:
Even with this very tedious full-disk
optimisation, the problem will gradually return as new files are added. For
example, if we add more files to this directory, they will be stored where the
free space starts. If existing files are deleted, new files will again be
stored in the free spaces left over. The only way to get the new files in our
folders stored continuously again is yet another very tedious full-disk
optimisation to re-align all the files, folder by folder. A couple
of real-world examples include importing new pictures from a camera, installing
Windows updates and updating software, where the new files are stored where the
free space starts or worse still, in free-space fragments left over from files
deleted since the last defragmentation process.
A quick method we came up with to simulate fragmentation is to copy files to
the hard disk in a random order. While
this does not result in fragments in any individual file, it does result in the
problem we discussed above where the files in each folder are not sequentially
stored on the physical hard disk platters.
To allow us to run this test on our
comparison drives as well as any future hard disk reviews, we created a single
128GB partition on each hard disk. This partition size would also be adequate
for a typical Windows 7 or 8 installation. In fact, many system builders and
advanced users use a small partition for the OS to “short-stroke” their hard
disk. By using a small partition for the OS, all system files are physically
contained within the partitioned area, which means that the hard disk never
needs to seek past the end of the partition, keeping the seek times low.
We created our copy script by first adding
a series of individual file copy commands to copy every file of the 1GB JPEG
set one file at a time. We then added in a series of copy commands to also copy
every file of the 187 MP3 set one file at a time. The copy command set for the
MP3s was repeated 100 times, with each set configured to store the copies in a
folder called MP3_#_music, where # is the repetition number. Then we used a
utility to completely randomise the list of commands in the script. Finally,
we added a series of “Make Directory” commands at the start of the new script
to create the directory structure prior to the randomised list of file copy
commands and then executed the script.
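A minimal sketch of such a script generator might look like this (the folder names, paths and file lists here are placeholders; the actual script consisted of individual copy commands as described above):

```python
import random

# Placeholder file lists standing in for the real 1GB JPEG set (8,247 files)
# and the 187-file MP3 set described in the text.
jpeg_files = [f"src/jpeg/img_{i:05d}.jpg" for i in range(8247)]
mp3_files = [f"src/mp3/track_{i:03d}.mp3" for i in range(187)]

commands = [f'copy "{f}" "D:\\JPEG_set\\"' for f in jpeg_files]
for rep in range(1, 101):                       # MP3 set repeated 100 times
    commands += [f'copy "{f}" "D:\\MP3_{rep}_music\\"' for f in mp3_files]

random.shuffle(commands)                        # randomise the copy order

# "Make Directory" commands go first so every target folder already exists
mkdirs = ['mkdir "D:\\JPEG_set"'] + [
    f'mkdir "D:\\MP3_{rep}_music"' for rep in range(1, 101)]
script = "\n".join(mkdirs + commands)
```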
The following shows how much data was
copied by this script to the 128GB partition on each hard disk to be tested:
When we analyse the disk with the third
party Defragmentation tool Piriform Defraggler, we get the following:
At first glance, it looks like we have a
perfectly defragmented drive, with not a single file fragmented and no
fragmented free space.
However, it is not until we examine a block
that we get to see how the files are actually stored:
So while not a single file is fragmented,
the files are not in any order.
Now that we have our file sets prepared,
let’s repeat the read tests…
Real world small files read test
Let’s start by reading the small files set.
This is the exact same file set as on the last page, with the only difference being
in the method by which the file set was originally copied to the hard disk.
Unlike the earlier test, we only have a single partition now, so for comparison
we included the earlier 0% partition timings for reference. By reading this
file set, we’re effectively recreating the scenario of reading a
single 1GB file broken up into 8,247 fragments.
This clearly shows the problem that all hard
disks with spinning platters eventually face, particularly with the “Documents” and
“Pictures” folders, which are built up over time, leaving their files scattered
throughout the volume in between other files.
Despite the simulated fragmentation, the
hard disk still came out ahead of the Samsung M7, so while the Toshiba is slow at writing
small files, it performs well when reading them back.
Now let’s see how long it takes to make a
copy of this file set. Like earlier, we create a copy of the file set folder
using FastCopy in “Diff HDD Mode”:
As shown here by the earlier results in red,
the Toshiba HDD performs poorly when writing and copying small files, so it’s no
surprise to see that it is also slow when copying this file set in the face of
fragmentation.
Real world MP3 files copy test
Just as one builds up their photo
collection over time, the same applies when building up a music collection. So
if someone has a single folder of 1000+ assorted MP3s that was built up over a
year or two, there is a pretty good chance that very few songs are physically
stored one right after the other, even with regular disk defragmentation.
As the MP3 files are considerably larger
than the files in our JPEG file set, there is far less seeking involved in this
test – effectively the same as reading a single 987MB file broken up into 187
fragments.
With these larger files, their random
placement has nowhere near the impact it has with our smaller JPEG
files. The two 2.5” hard disks took about 25% longer with the files randomly
stored, while the desktop hard disk fared a little better, taking just over 20%
longer.
Now let’s see what effect this random file
placement has when making a copy of the file set:
This time the Toshiba hard disk performed a
fair bit quicker than the Samsung M7, taking just over 3 seconds longer than
when the files were sequentially stored. The Samsung 2.5” and 3.5” hard disks
took nearly 6 seconds longer than earlier.
Most users use the same hard disk for their
Operating System as well as storing their data, especially on a laptop or a
prebuilt PC. Even users that have multiple internal hard disks often use the
same hard disk for multiple tasks, such as editing video footage in one folder
while burning a completed project stored in another folder to disc.
Unfortunately, hard disks generally don’t
perform well when multitasking, as the read/write heads can only read or write
a single block of data in any given location at a time. So if, say, we’re
burning files to disc while watching a video, the hard disk will
need to seek between the files being burnt and the video file. If it
reads too little data each time it seeks back and forth, the burn process will
suffer from regular buffer underruns, potentially affecting the quality and
readability of the written disc. On the other hand, if the hard disk spends too
much time reading data at one spot, the video will stutter badly as the video
player will not receive a steady stream of data. So a hard disk faced with
multiple read and write tasks needs to carefully decide how much time it spends
at each task.
Like other modern hard disks, this Toshiba
hard disk features Native Command Queuing, which means that when faced with
multiple read and write commands, it can reorder those commands so that, in
theory, the hard disk spends the least amount of time performing each command.
Our earlier article shows how hard disks use Native Command Queuing to improve
performance.
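As a rough mental model only (drive firmware is far more sophisticated, and real NCQ also accounts for rotational position, not just distance), the reordering can be pictured as a nearest-first scheduler over queued block addresses:

```python
# Nearest-first toy scheduler: service whichever queued address is
# closest to the current head position, then repeat from there.
def reorder(queue, head):
    order, pending = [], list(queue)
    while pending:
        nxt = min(pending, key=lambda lba: abs(lba - head))
        pending.remove(nxt)
        order.append(nxt)
        head = nxt
    return order

# Four queued requests; naive FIFO order would zig-zag across the platter.
print(reorder([900, 50, 500, 60], head=100))   # → [60, 50, 500, 900]
```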
In these tests, we will run a script which continuously
reads our MP3 folders one after another. As the files stored in each folder are
physically scattered, the hard disk will also need to seek each individual MP3
file. In the real world, this process is similar to a scheduled full virus scan
or disk backup taking place in the background.
Real world small files read test with multitasking
Let’s start by reading the fragmented and copied
file sets while another script runs in the background, reading MP3 files
non-stop. Since the first free space starts where all the existing data
ends (no files having been deleted), the copied set from the above test would
have been stored sequentially.
In this test, we actually uncover a
weakness in the Samsung M7, where reading our sequentially stored
file set takes over 4 times longer while files are being
continuously read in the background. Even the Samsung F3 desktop hard disk
took 50% longer. In theory, this means that the Toshiba would perform
considerably better than the other two drives while a disk-intensive process is
running in the background, such as a virus scan, backup or even a file set being
written to disc. Even when reading our fragmented file set, the Toshiba hard
disk performs considerably better than the Samsung M7.
Real world MP3 files read test with multitasking
Now let’s read our fragmented and copied MP3
file sets while another script is reading MP3 files non-stop in the background.
At first glance, something does not seem
right, as how could sequentially stored files take longer to read than the
randomly stored files? The reason is most likely due to the effect of Native
Command Queuing. As we mentioned earlier, the copied file sets are stored after
the end of the existing data, which means that when the hard disk reads this
sequentially stored data, it must keep seeking out to this section. When
reading the set of randomly stored MP3s, these are mixed in among all the
other MP3 sets that our background script is endlessly reading, which means
that in some cases the drive can reach these MP3s more quickly than it can reach
the sequentially stored MP3s placed out at the end.
Like the set of small JPEG files, the
Toshiba performed surprisingly well in this test, taking about half the time of
the Samsung F3 desktop hard disk and just over a third the time of the Samsung
M7. Based on this timing, the Toshiba’s average read speed is 15MB/s, quick
enough for burning a DVD at up to 10x containing files averaging around
5MB, while a scheduled virus scan or backup operation is also running. For
comparison, the Samsung M7 would struggle to even manage a 4x DVD burn speed
with this scenario.
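The burn-speed claim checks out against the standard DVD base rate (1x is defined as 1,385,000 bytes/s, roughly 1.385MB/s):

```python
# DVD 1x is defined as 1,385,000 bytes/s, i.e. roughly 1.385MB/s.
DVD_1X_MB_S = 1.385
sustained = 15.0            # the Toshiba's measured average read speed, MB/s

print(f"{sustained}MB/s sustains roughly a "
      f"{int(sustained / DVD_1X_MB_S)}x DVD burn")
```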
Hard disks share one problem in
common: the files in a folder are not physically stored in sequential
order, and as we’ve shown above, most disk defragmentation tools do not
reorganise the files due to the serious amount of time this would take. So on this
page we tested how well the Toshiba and our comparison hard disks performed
when reading files that are randomly placed on the hard disk.
When reading our 1GB set of small JPEG
files (8,247 in total), the Toshiba took at least 10 times longer to read than
the sequentially stored set on the last page. However, it still outperformed the
Samsung M7 by a reasonable margin. With the set of MP3 files, the Toshiba took
just 25% longer to read than the sequential set, also outperforming the Samsung
M7 by a few seconds. However, like the earlier tests, when it came to making a
copy of the set of small files, the Toshiba hard disk took considerably longer
than the comparison drives, mainly due to how slowly it performed in our write
tests involving small files.
The Toshiba hard disk performs surprisingly
well in our multitasking tests giving very impressive results in comparison to
the two other hard disks we tested. For our multitasking tests, we had a script
endlessly reading our randomly stored MP3 file sets in the background. When
reading our 1GB set of small JPEG files, it took less than a quarter of the
time of the Samsung M7 and even significantly outperformed the Samsung F3
desktop hard disk. Even when reading our set of MP3 files, it took about a third
of the time of the Samsung M7 and about half the time of the Samsung F3 desktop
hard disk.
Let’s head on to the next page to look
at how this hard disk performs with a Windows 7 installation in VirtualBox…