Slow read speeds with SSD ?!?!

It was only that simple because it took decades to make it so. You were obviously not around when motherboard BIOSes required the number of tracks and heads to be entered for each drive, even though later drives were only pretending to have those geometries.

SSDs are a radically different technology, and it has taken nowhere near the time it took with HDDs to optimise BIOSes and OSs for them. From Win 7 on, it is as simple as you say. XP predated SSDs and was not something MS wanted to keep up to date, when doing so would just give people fewer reasons to upgrade.

Thanks again Pat. I take it Win XP is not the optimal OS for SSDs. I don’t have any plans to stop using Win XP Pro… not at this time, so I guess that means no SSDs for me, for now.

I don’t recall exactly, but there was a time when installing an HDD was more difficult… I either started with Win 95 or Win 98, around 1999? :confused:

You don’t have to forgo SSDs, but:

a) don’t defragment them

b) buy ones whose manufacturer has a utility to TRIM (reset ‘empty’ blocks so that they are instantly ready for writing) - see the quick check after this list

c) don’t enable ReadyBoost on them.
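
For what it’s worth: on Windows 7 and later you can quickly check whether the OS itself is issuing TRIM (XP never does, which is why the manufacturer utility matters there). Here is a minimal C# sketch that just shells out to the built-in fsutil command; “DisableDeleteNotify = 0” in the output means TRIM notifications are being sent. It may need an elevated prompt.

```csharp
// Minimal sketch: ask Windows whether TRIM (delete notifications) is enabled.
// Assumes Windows 7+ where fsutil supports this query; may need admin rights.
using System;
using System.Diagnostics;

class TrimCheck
{
    static void Main()
    {
        var psi = new ProcessStartInfo("fsutil", "behavior query DisableDeleteNotify")
        {
            RedirectStandardOutput = true,
            UseShellExecute = false
        };
        using (var p = Process.Start(psi))
        {
            // "DisableDeleteNotify = 0" means TRIM notifications ARE being sent to the SSD.
            Console.WriteLine(p.StandardOutput.ReadToEnd());
            p.WaitForExit();
        }
    }
}
```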

Pat, are those three things just for Win XP, or generally speaking?

Don’t defragment? Whaaa? :confused: I’m a habitual defragmenter - why is this so with SSDs?

Yes. Later versions of Windows automatically cater for SSDs. (Well, I don’t know about Vista, but who with a DAW uses that now?)


On Win8+, doing a defragment will actually only run the TRIM command.

HDD defrag not required for SSDs
Normal HDD defragmenting is unnecessary for SSDs because they have no heads to cause delays when sectors are not close to each other. That is, SSDs are truly random-access, with no position-dependent penalties, so files don’t need sector-proximity optimising.

HDD defrag bad for SSDs
However, because of all the writing, such defragmenting can also result in a write-wear hit to blocks, especially if using small sectors. For example, a 256kB SSD block can hold 64 4kB sectors, meaning that that block could be rewritten up to 64 times (for a write of any ONE sector, the whole SSD block has to be read and then re-written). Of course, the same physical block is not literally rewritten, as wear-levelling will use another block from the ‘available’ pool, but it amounts to the same thing if the whole drive requires defragmenting. That is one of the reasons why I recommend using 64kB sectors = fewer writes needed in the worst-case scenario.
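
To make that arithmetic concrete, here is a tiny C# sketch using the nominal figures above (a 256kB erase block with 4kB or 64kB sectors - real drives and file systems vary):

```csharp
// Worst-case rewrites of one SSD erase block when sectors are written one at a time.
// Block and cluster sizes are the nominal figures from the post; real drives vary.
using System;

class WriteWearSketch
{
    static void Main()
    {
        const int blockKB = 256;                    // nominal SSD erase block
        foreach (int sectorKB in new[] { 4, 64 })   // NTFS cluster ('sector') sizes
        {
            int sectorsPerBlock = blockKB / sectorKB;
            // Each single-sector write can force a full read-modify-write of the block,
            // so defragmenting sector-by-sector could rewrite it this many times:
            Console.WriteLine($"{sectorKB}kB sectors: up to {sectorsPerBlock} block rewrites");
        }
    }
}
// Output: 4kB sectors: up to 64 block rewrites
//         64kB sectors: up to 4 block rewrites
```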

There is SSD-specific defrag
SSDs can use another type of defragmenting, provided by some third-party utilities, but not by OSs. With SSD blocks being 256kB, it is obviously better that as many as possible of the sectors in a block belong to the one file, while ignoring whether the blocks themselves are contiguous. This enforces exclusivity, as opposed to the HDD defrag focus solely on contiguity (which ignores which files co-reside in the blocks). Another reason to make sectors larger = fewer possibilities for cohabiting files.

SSDs will slow down as they get full - over 50% capacity; Google it. Cubase should be loading samples and presets super fast with an SSD. Try removing/backing up some data. Get your SSD below 50%-60% full and see if that makes a difference.

50%? That is over-the-top advice. If you want to give such a figure, give better references! This one recommends 75%:
How-To Geek: Why Solid-State Drives Slow Down As You Fill Them Up

As with ANY drive that is constantly being written to, nearly filling an SSD up WILL slow things down, but the reasons are different for SSDs and HDDs, some of which I have indicated above.

…writing becomes slow. Not reading… reading is always the same speed, regardless of fill state.

Reading only becomes slower under a condition called “steady state”, when the SSD has to perform a lot of background operations, but this is mitigated by good controller chips.

Usually, under sample playback conditions, write speed is almost irrelevant. What’s important is random 4K read performance @ QD1 (sadly, software companies don’t optimize for higher QD yet; it wouldn’t be hard, but there is a special brand of software engineer who stopped learning at some point and thinks that the hammer they used in 1992 is still good for the nails of 2014) and, for larger samples, sequential read performance.

Write performance? That’s something I consider negligible, except for extreme multitrack recording (recording a lot of tracks at the same time - not the usual “this guitar here, this keyboard there” at once). And even plain old hard disks were fast enough for recording many, many tracks at the same time. Even the slowest SSD beats the fastest HDD at any time, so…

Apart from large samples, for usual use in a DAW (or a gaming machine, etc.), “sequential read performance” is mostly for benchmark kiddies; it doesn’t really reflect real-life performance.

Because of the laziness of many software engineers, the most important value is still “random 4K reads @ QD1” - which could be “QD32” in many cases (ESPECIALLY sample playback, where you usually get an order of magnitude or better performance with a filled queue, but software companies more often than not simply don’t care about this simple, almost trivial, optimization). Software engineers would need to do their homework, and, more often than not, they simply don’t, because “oh, it works like this and I learned it like this, back in 1992, when Win32 and dinosaurs ruled the Earth, and what is a thread anyway”.

Sad, but true.
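
For illustration, here is a minimal C# sketch (assuming .NET 6+ and a hypothetical sample file) of what “filling the queue” looks like from application code: instead of one 4K read at a time, many positioned reads are kept in flight at once, which is roughly what the QD32 benchmark numbers measure.

```csharp
// Minimal sketch (assumes .NET 6+): issue many positioned reads concurrently so the
// OS/SSD sees a deeper queue than one-read-at-a-time code would produce.
// The file path and the read pattern are hypothetical placeholders.
using System;
using System.IO;
using System.Threading.Tasks;

class QueueDepthSketch
{
    static async Task Main()
    {
        const string path = @"D:\Samples\big_sample.wav"; // hypothetical sample file
        const int chunk = 4096;                           // 4K reads, as in the benchmarks
        const int depth = 32;                             // roughly "QD32" worth of requests

        using var handle = File.OpenHandle(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, FileOptions.Asynchronous);
        var tasks = new Task<int>[depth];
        for (int i = 0; i < depth; i++)
        {
            var buffer = new byte[chunk];
            // RandomAccess.ReadAsync takes an explicit offset, so no shared Seek/position
            // is needed and all reads can genuinely be in flight at once.
            tasks[i] = RandomAccess.ReadAsync(handle, buffer, (long)i * chunk).AsTask();
        }
        await Task.WhenAll(tasks);
        Console.WriteLine($"Completed {depth} concurrent 4K reads.");
    }
}
```

Whether the requests come from async I/O in one thread or from multiple threads matters less than the drive always having more than one outstanding request to work on.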

I’m not sure what Steinberg do; actually, I’d say that they have really clever guys in software engineering, so I would think that we get nice QDs from Halion and similar software. They are not to blame, I think, because Halion also loads samples super fast (from my 2x Samsung 840s in RAID 0, spread over two different controllers - almost instantaneously in many cases).

Also, the CTO of fxpansion told me that they load large chunks (which is also good for SSDs) in their software, especially BFD 3.0.

However, there are too many “Morts” (Mort, Elvis, Einstein, and You) out there in many, many software engineering teams. Usually they are the guys who solve the problem of the moment, but whose solutions are neither future-proof nor maintenance-friendly - and not even remotely modern.

While deeper queues usually enable higher average transfer rates, they also increase the average access time, which directly affects what latency settings one can get away with.

As an example, if you go into a bank and there is one queue with only one person in it, and another with 32, in which one are you likely to get served first, given that the service time per person is the same?

Basically, the only reason that higher queue depths give higher average transfer/service rates is that they reduce the possibility of gaps between requests, gaps being something that:
a) banks want to avoid, as they result in staff standing around waiting
b) drive manufacturers ignore, because it doesn’t give any bragging rights for throughput
c) most users ignore, because they are not concerned with sub-second response times,
BUT which are exactly what enable a DAW/sampler to be at its most responsive (rough numbers in the sketch below).
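
To put rough numbers on the bank analogy (the service time here is an assumed figure, purely for illustration, not a measurement):

```csharp
// Purely illustrative: with the drive kept busy, throughput is about the same at any
// queue depth, but the time until the LAST request in a batch is finished grows with
// its position in the queue. In practice QD1 also leaves gaps, which is where the
// extra measured throughput at QD32 comes from - but the latency cost remains.
using System;

class QueueLatencySketch
{
    static void Main()
    {
        const double serviceMs = 0.1;               // assumed time to serve one request
        foreach (int queueDepth in new[] { 1, 32 })
        {
            double reqPerMs = 1.0 / serviceMs;                // service rate, same either way
            double worstCaseWaitMs = queueDepth * serviceMs;  // last request in the batch
            Console.WriteLine($"QD{queueDepth}: ~{reqPerMs:F0} req/ms, " +
                              $"worst-case wait ~{worstCaseWaitMs:F1} ms");
        }
    }
}
// QD1:  ~10 req/ms, worst-case wait ~0.1 ms
// QD32: ~10 req/ms, worst-case wait ~3.2 ms
```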

Another reason to use 64kB sectors: they also substantially reduce OS overhead - only 1/16 of the requests compared to 4kB for the same amount of data - and reduce queue-depth dependency.

Patanjali, you’re missing one thing here:

Higher QDs enable parallelization, because an SSD controller can then access multiple flash chips at once.

Here are my Samsung 840 Pro benchmark values (NOT my RAID - this is my boot & software disk):

core (Samsung 840 Pro, 256 GB @ SATA III)

-----------------------------------------------------------------------
CrystalDiskMark 3.0.3 x64 (C) 2007-2013 hiyohiyo
                           Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 byte/s [SATA/300 = 300,000,000 byte/s]

           Sequential Read :   516.964 MB/s
          Sequential Write :   487.559 MB/s
         Random Read 512KB :   443.380 MB/s
        Random Write 512KB :   464.603 MB/s
    Random Read 4KB (QD=1) :    33.631 MB/s [  8210.6 IOPS]
   Random Write 4KB (QD=1) :    95.699 MB/s [ 23363.9 IOPS]
   Random Read 4KB (QD=32) :   259.869 MB/s [ 63444.6 IOPS]
  Random Write 4KB (QD=32) :   273.092 MB/s [ 66672.9 IOPS]

  Test : 1000 MB [C: 49.2% (105.2/214.0 GB)] (x5)
  Date : 2014/07/26 16:52:24
    OS : Windows 8.1 Pro [6.3 Build 9600] (x64)

It escapes me why there is still software around that fetches multiple files in only a single thread, given that the benefit of parallel loading is well known, but I blame it on the Morts.

I find the biggest problem is that prototypes are not adequately re-architected for production, forcing all future development to use Band-Aid fixes to tack new functionality onto inadequate structures.

The time when a program undergoes the most change is during initial and second-version development, which is when it would most benefit from proper architecture.

The second problem is the general aversion to refactoring, which seems to be mostly driven by short-term monetary constraints, though I think it results from a failure to fully appreciate the opportunity costs involved, and thus a failure to fully expound the reasoning that would force refactoring more often.

Absolutely right. A prototype is a prototype - and production code is production code. They are not the same - but so many people don’t understand why… :confused:

In my experience, though, persistent preaching helps a lot… to open the minds. And surging ahead, being a pioneer and role model.

Two things:
a) It all depends upon how many accesses the SSD can actually run in parallel. If that is too small compared to the queue depth, average latency will increase with larger queue depths.
b) The benefits of queue depth are going to be heavily dependent upon the load.

This is why I have lamented the lack of real-world DAW/sampler disk usage scenarios. Many times I have implored EastWest to release their telemetry data, of which they must have lots, so that we would have proper performance criteria upon which to evaluate drives (and everything else in our computers!).

Hm, I wonder if a file system filter driver would be able to create the trace data we would need for analysis?

A first step, though, might be a simple C# program reading all sample files in a given directory, first using the Mort method (one thread) and then using multiple threads.

A 5 liner. :smiley:
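
Something like this minimal sketch, perhaps (the directory path is a placeholder, and the OS file cache will flatter whichever run goes second, so a fair test would reboot or read different files between runs):

```csharp
// Minimal sketch of the comparison proposed above: read every file in a directory
// once sequentially ("Mort method") and once in parallel. The directory path is a
// hypothetical placeholder.
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

class LoadComparison
{
    static void Main()
    {
        string dir = @"D:\Samples\SomeLibrary";          // hypothetical sample directory
        string[] files = Directory.GetFiles(dir, "*", SearchOption.AllDirectories);

        var sw = Stopwatch.StartNew();
        foreach (string f in files)                      // one thread, one file at a time
            File.ReadAllBytes(f);
        Console.WriteLine($"Sequential: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        Parallel.ForEach(files, f => File.ReadAllBytes(f)); // many reads in flight at once
        Console.WriteLine($"Parallel:   {sw.ElapsedMilliseconds} ms");
    }
}
```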

Read ‘sales and marketing’. Of course, it doesn’t help when those on the technical side cannot argue their way out of a paper bag!

To me, it takes a combination of:
a) correct information
b) rigorous reasoning
c) assertiveness
d) enthusiasm
e) bypassing the fools as much as possible!

When I do sports or when I’m in the shower, I spend a lot of time “talking to myself”, practising arguing with management, sales and marketing.

Proper realtime processing of multiple streams requires that only the NEXT block of EACH stream be read before attempting ANY of the following blocks. This makes multi-streaming much more like random access, but it can force consistently worst-case head-travel times on HDDs, such as where the streams are on separate sections of a drive, as with different mic positions for orchestral samples.

Also, in situations like the above, NCQ might actually work against low latency, because it will tend to reorder the requests to get the sequential blocks one after another, THEN get another close bunch.
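
For what it’s worth, here is a rough C# sketch of the “only the NEXT block of EACH stream” pattern described above (the file paths and block size are hypothetical; a real sampler would also double-buffer and prefetch):

```csharp
// Rough sketch of round-robin streaming: read only the NEXT block of EACH open
// stream before touching any following blocks. Paths and block size are
// hypothetical placeholders.
using System;
using System.Collections.Generic;
using System.IO;

class RoundRobinStreaming
{
    static void Main()
    {
        string[] paths =                                  // hypothetical mic-position files
        {
            @"D:\Samples\violins_close.wav",
            @"D:\Samples\violins_tree.wav",
            @"D:\Samples\violins_surround.wav",
        };
        const int block = 64 * 1024;                      // one 64kB block per stream per pass
        var streams = new List<FileStream>();
        foreach (string p in paths)
            streams.Add(new FileStream(p, FileMode.Open, FileAccess.Read));

        var buffer = new byte[block];
        bool anyData = true;
        while (anyData)                                   // one block from each stream per pass
        {
            anyData = false;
            foreach (var fs in streams)
            {
                if (fs.Read(buffer, 0, block) > 0)
                    anyData = true;                       // feed this block to its voice here
            }
        }
        streams.ForEach(fs => fs.Dispose());
    }
}
```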

Colloquial term for “consensus via emphasis on gaining outcomes that are mutually beneficial for all”. :smiley: :smiley: :smiley: :wink:

The latter doesn’t sort of work when the adrenaline is pumping! :wink:

Aloha guys,

Great (and informative) thread.

Lot’s o’ good stuff to read/learn and then apply

Tanx,
{‘-’}

Good.

So, if you make software and load lots of files: load them in parallel. Your users with SSDs will love you for it. :slight_smile: