This topic has been discussed quite a bit lately. You may find some additional food for thought here:
I share my thoughts on it, as well.
The short answer to the question is that there is an inherent tension between an architecture built to favor low latency and one built to "schedule audio tasks" across multiple processors (cores). Each DAW handles it differently. Cubase's roots at the low-latency end of this spectrum are why those of us who favor higher plugin counts over low latency are feeling unsatisfied.
Other DAWs have chosen a different design and suffer less in this area but more in others, and vice versa. In my multi-core scaling test of the DAWs I own, Cubase came in second.
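To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python. It is a toy model, not how Cubase or any other DAW actually works: the per-plugin cost and the fixed cross-core sync overhead are made-up parameters, and `max_plugins` is a hypothetical helper. The only hard fact in it is the deadline itself: one buffer must be rendered in `buffer_size / sample_rate` seconds.

```python
def buffer_deadline_ms(buffer_size: int, sample_rate: int) -> float:
    """Time available to render one audio buffer, in milliseconds."""
    return 1000.0 * buffer_size / sample_rate


def max_plugins(buffer_size: int, sample_rate: int, cores: int,
                cost_us_per_sample: float, sync_overhead_ms: float) -> int:
    """Rough plugin ceiling under a toy model (all costs are assumptions).

    cost_us_per_sample: assumed CPU cost of one plugin, microseconds per sample.
    sync_overhead_ms:   assumed fixed cost per buffer for handing work to
                        other cores and collecting the results.
    """
    deadline = buffer_deadline_ms(buffer_size, sample_rate)
    # The fixed overhead does not shrink with the buffer, so it eats a
    # larger share of the deadline at small buffer sizes.
    usable = max(0.0, deadline - sync_overhead_ms)
    per_plugin_ms = cost_us_per_sample * buffer_size / 1000.0
    return int(cores * usable / per_plugin_ms)


if __name__ == "__main__":
    for size in (64, 256, 1024):
        print(size,
              round(buffer_deadline_ms(size, 48000), 2),
              max_plugins(size, 48000, cores=8,
                          cost_us_per_sample=0.5, sync_overhead_ms=0.5))
```

With these made-up numbers, the same 8 cores carry noticeably fewer plugins at a 64-sample buffer than at 1024, purely because the fixed hand-off cost takes a bigger bite out of a shorter deadline. Real engines are far more complicated, but that is the basic squeeze.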
VEP on the same machine circumvents the issues because it gets its own cores to work with and has a different audio engine design than Cubase.
Cubase's answer to all this is ASIO Guard. Clearly, it's not as aggressive as many of us would like, but it's version 1.0.
Hopefully, awareness raised by posts like these will bump it up on their priority list.
I think the future definitely requires ASIO Guard to be in the "Reaper" ballpark in how it schedules audio slices across cores, so that it makes fuller use of modern multi-core CPUs. Cubase does use them, but is hampered by its ASIO, ultra-low-latency roots.
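For anyone wondering what "scheduling audio slices across cores" means in the simplest possible terms, here is a small Python sketch. To be clear, this is not ASIO Guard's or Reaper's actual algorithm, which I don't have any inside knowledge of. It is a generic greedy "heaviest job first" assignment, and `assign_to_cores` and the track costs are hypothetical names and numbers for illustration only.

```python
import heapq


def assign_to_cores(track_costs_ms, cores):
    """Greedy longest-first assignment of independent tracks to cores.

    track_costs_ms: assumed render cost of each track for one buffer (ms).
    Returns the total load per core; the buffer is only done when the
    busiest core finishes, so max(loads) is the number that matters.
    """
    loads = [0.0] * cores
    heap = [(0.0, i) for i in range(cores)]  # (current load, core index)
    heapq.heapify(heap)
    for cost in sorted(track_costs_ms, reverse=True):
        load, i = heapq.heappop(heap)        # least-loaded core so far
        loads[i] = load + cost
        heapq.heappush(heap, (loads[i], i))
    return loads


if __name__ == "__main__":
    tracks = [4, 3, 3, 2, 2, 2]              # made-up per-buffer costs in ms
    for n in (1, 2, 4):
        loads = assign_to_cores(tracks, n)
        print(n, "cores -> busiest core:", max(loads), "ms")
```

In this model the busiest core can never finish sooner than the single heaviest track, since a track is not split across cores. So adding cores helps only up to a point, and a larger buffer (a longer deadline) gives the scheduler more room to absorb an uneven split.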