Earlier this year we compared Intel’s Core i9-9900K and Ryzen 9 3900X using two 8GB modules versus four 4GB modules, with the exact same memory timings in both cases. We found that when using rather slow DDR4-3000 CL16 memory, we were looking at a 3-5% performance uplift with four modules for the average frame rate at 1080p using an RTX 2080 Ti, but also gains as high as 10% when looking at the 1% low results. This was seen with both AMD and Intel processors. Certainly not massive gains, but that was relatively slow memory and we expected that the margins would grow a little with faster memory.
Now before we jump into our updated testing, here’s a very brief explanation of why 4 sticks are faster than 2. It boils down to how the memory is configured, or rather the memory “rank.” For those of you unaware, a rank is a set of memory chips on a module that are accessed together over the 64-bit data bus, so the rank count tells you how many of these 64-bit sets a module carries. Most consumer-grade memory features a single rank, though higher capacity modules are usually dual rank, while server-grade memory is often quad-rank. Identifying if your memory is dual or single rank can be difficult as software doesn’t always read modules correctly and not all memory manufacturers note the rank in the modules’ ID. Typically, single rank modules feature all the memory chips on one side of the PCB, while dual rank memory places chips on both sides of the PCB. However, that’s not always the case. A module with chips on both sides of the PCB is really just dual-sided, and can still be a single ranked module, so it’s a bit confusing.
DIMM Module Rank Configurations
Where things can get even more confusing is when you introduce more memory sticks or modules. A dual-channel system populated with more than two single ranked modules will actually act as if dual ranked modules are installed. In fact, there’s very little difference between one dual ranked module and two single-rank modules when connected to the same channel of the memory controller, even though the memory chips reside on different PCBs. So when using two single ranked modules for dual-channel operation, each channel is configured as a single rank. However, when using four single rank modules for dual-channel operation, each channel is now configured as dual rank. This can give the four-DIMM configuration an advantage as it allows several open DRAM pages in each rank. Although the ranks can’t be accessed simultaneously, they can be accessed independently. This means the controller can send write data to one rank while it waits for read data previously selected from another rank, and as you’ll see soon, this greatly increases memory bandwidth.
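To make the counting concrete, here is a minimal Python sketch of how many ranks each channel ends up with. The function name and the assumption that modules are spread evenly across two channels are ours for illustration; real boards and memory controllers can behave differently.

```python
def ranks_per_channel(modules: int, ranks_per_module: int, channels: int = 2) -> int:
    """Ranks seen by each memory channel.

    Assumption for this sketch: every channel holds the same number of
    identical modules, as in the dual-channel desktop setups discussed here.
    """
    if modules % channels != 0:
        raise ValueError("this sketch assumes the same number of modules on every channel")
    return (modules // channels) * ranks_per_module


# (number of modules, ranks per module) for the setups discussed in the text
configs = {
    "2 x single rank": (2, 1),
    "4 x single rank": (4, 1),
    "2 x dual rank": (2, 2),
}

for label, (modules, ranks) in configs.items():
    print(f"{label}: {ranks_per_channel(modules, ranks)} rank(s) per channel")
```

Four single rank modules and two dual rank modules both come out at two ranks per channel, which is why they behave so similarly.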
How much of an impact this has on performance depends on the application and the memory controller’s ability to take advantage of open pages. But what all this means is, yes, it’s possible for four modules to improve performance over two modules in a dual-channel system. Our test system has been equipped with the GeForce RTX 3090 and here comes our comparison of the Ryzen 9 5900X, Ryzen 9 3900X and Core i9-10900K when using two DDR4-3200 modules vs. four DDR4-3200 modules.
Benchmarks
We’ve only tested using two games as that’s all the data we need to clear up any misconceptions you might have regarding 4 sticks with Zen 3 versus other CPUs. Testing with Shadow of the Tomb Raider, which is a very CPU demanding game, we find that the 3900X saw a 14% increase in average frame rate when going from 2 modules to 4. That’s a very significant improvement, though note we’re using an extreme GPU at a low resolution.
Moving to the Core i9-10900K, we again see a big performance uplift when using 4 sticks, this time a 15% performance boost. With the Ryzen 9 5900X, we’re looking at a similar performance increase, this time 12%, and it’s possible we’re starting to run into a GPU limitation as Ampere scales poorly at 1080p. The point is that all three CPUs see a similar double digit performance uplift with 4 sticks, so this isn’t some kind of special feature unique to Zen 3.
Hitman 2 is another CPU/memory sensitive game, and we look to be GPU bound with the 5900X. Let’s focus on the 1% low data for our comparisons. With the 3900X we’re looking at an 8% performance increase with 4 sticks. However, using the 10900K shows a far more significant 29% performance increase, though it’s only a 13% increase for the average frame rate. Hitman 2 can be a bit odd and we suspect with the 3900X we’re looking at a performance bottleneck that’s more related to core-to-core latency than DRAM performance. Anyway, with the 5900X we’re also witnessing a massive increase in 1% low performance with 4 sticks, this time a 21% performance boost. But again we suspect we’re running into a GPU limitation for the average frame rate. The point is, the 10900K and 5900X see a similar increase in performance with four memory modules, so this is not something that is unique to Zen 3.
DRAM Benchmarks
Now let’s check out how the Zen 3 architecture behaves using different memory modules, frequencies, and timings. We’re using the RTX 3090 again, but we also have data with a more mainstream GPU later on. At this point in time, none of our Ryzen 5000 processors work using a 2000 MHz FCLK, which limits us to DDR4-3800 memory. Apparently new BIOS releases will make a 2000 MHz FCLK more likely, though that didn’t help in our case. The good news is that all of them worked perfectly at a 1900 MHz FCLK, and that wasn’t the case with Zen 2. For much of this testing we used G.Skill’s TridentZ 3600 CL14 memory which we manually tuned, raising the CL to 16, but aggressively tightening the secondary and tertiary timings, which massively improves performance for all Ryzen processors. This configuration will be tested at DDR4-4000, 3800, 3600, and an underclocked 3000 config.
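The link between FCLK and memory speed is simple arithmetic: DDR memory transfers data twice per clock, so a DDR4-3800 kit runs a 1900 MHz memory clock, and a 1:1 ratio needs the FCLK to match that. The short Python sketch below applies this to the four tuned speeds. The function names are ours, and treating 1900 MHz as the ceiling simply reflects what worked on our samples, not a hard platform limit.

```python
MAX_STABLE_FCLK_MHZ = 1900  # highest FCLK that worked on the CPUs tested here


def memory_clock_mhz(data_rate_mts: int) -> float:
    """DDR moves data twice per clock, so the clock is half the data rate."""
    return data_rate_mts / 2


def runs_one_to_one(data_rate_mts: int, max_fclk_mhz: int = MAX_STABLE_FCLK_MHZ) -> bool:
    """True if the FCLK can match the memory clock without exceeding the limit."""
    return memory_clock_mhz(data_rate_mts) <= max_fclk_mhz


for rate in (3000, 3600, 3800, 4000):
    mclk = memory_clock_mhz(rate)
    mode = "1:1 with FCLK" if runs_one_to_one(rate) else "decoupled from FCLK"
    print(f"DDR4-{rate}: memory clock {mclk:.0f} MHz, {mode}")
```

Only the DDR4-4000 configuration falls outside the 1:1 range, since it would need the 2000 MHz FCLK that we couldn’t get working.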
We’ve also included some stock XMP loaded memory configurations, one using Corsair’s Dominator Platinum DDR4-3600 CL18-19-19 memory, another using ADATA XPG Spectrix D50 DDR4-3600 CL18-20-20, but with two different configurations: one using two dual rank modules and the other two single rank modules. We’re also including our G.Skill TridentZ F4-3200 C14 4x8GB test configuration used in our reviews and we’ll add in a 2x8GB config for single rank testing as well. We’ve done our best to label this data as simply as possible, but we understand that for some of you it will be a bit confusing. Finally, we’re sticking with the Ryzen 9 5900X, though note these results apply to all Zen 3 processors, even the Ryzen 5 5600X.
Death Stranding is a game where Zen 3 processors went like a bat out of hell, and despite that, this isn’t a particularly memory-sensitive title. For example, compared to our review configuration using four DDR4-3200 CL14 memory modules, cranking the memory up to 3800 with tightly tuned timings only boosted performance by 3 FPS, a mere 1.3% increase. Moreover, we’re looking at a 7 to 10% difference between the fastest and slowest memory configurations tested, which isn’t that significant. It also looks like dual ranked DDR4-3600 CL18 memory is comparable to our dual rank DDR4-3600 CL14 memory, at least in this title.
Moving on to F1 2020, we’re seeing very little uplift over our test system configuration when using manually tuned DDR4-3800 memory. This time we’re seeing a ~5% performance delta between the fastest and slowest memory configurations tested. Therefore F1 2020 is another game that isn’t particularly sensitive to memory performance.
We know Far Cry New Dawn is a memory sensitive game, particularly latency sensitive, and here we’re seeing a more notable 12% performance gain with the tuned DDR4-3800 memory over our 3200 test configuration. If we take the same 3800 spec and just increase the frequency to DDR4-4000, which right now in the absence of 2000 MHz FCLK support decouples the memory clock from its 1:1 ratio with the Infinity Fabric, we actually break performance a little, dropping down to the level of the tuned 3600 spec.
Horizon Zero Dawn, like F1 2020 and Death Stranding, isn’t particularly sensitive to memory and we see just a 5% difference again between the fastest and slowest memory. There’s also little to no difference between single and dual rank memory configurations.
Another game that isn’t heavily influenced by memory performance is Rainbow Six Siege. Here we’re looking at less than a 2% change between the top and bottom configurations tested.
Memory performance makes a reasonable difference in Watch Dogs Legion. We’re looking at a 6% performance increase by just adding two more DDR4-3200 modules. Then by overclocking and tuning up, performance was increased by a further 3%. Certainly not something that will materialize at higher resolutions where GPU limitations kick in, but it is a measurable difference.
As seen earlier, Hitman 2 is very memory and CPU sensitive. Tuning up your memory can make a big difference in this game, though that will only be the case when performance is CPU limited. We see that performance using stock DDR4-3600 CL18 kits like the Corsair Dominator Platinum RGB or ADATA XPG Spectrix D50, in a single rank configuration is pretty horrible relative to what we see with the dual rank or manually tuned configurations. Shockingly we’re looking at a ~23% reduction in 1% low performance. Also, if we look at the DDR4-3200 configurations using 2 and 4 sticks, we see that our 4 stick test configuration would be up to 17% slower if we removed two of the modules.
We’re also seeing Spectrix D50 16GB DDR4-3600 modules offering a nice performance boost as they were 6% faster than our 3200 CL14 test configuration. The power of dual-rank operation is strong, and unfortunately we were not able to test the manually tuned DDR4-3800 configuration in the dual-rank mode as we don’t have enough of those modules. In Hitman, with memory at or above the 3200 spec we’re seeing up to a 21% difference in performance and up to a 12% increase over our test configuration.
That said, if we increase the resolution to 1440p that 21% margin is reduced to 14%, which is still substantial, but it is heavily reduced by increasing the GPU load. Moreover, whereas the manually tuned DDR4-3800 memory was 12% faster than our DDR4-3200 test configuration at 1080p, at 1440p it’s just 4% faster. Still, dual rank memory makes a big difference in this title.
Like Hitman 2, Shadow of the Tomb Raider is a CPU demanding game that’s also sensitive to memory performance. Looking at our test configuration we see that using 4 TridentZ DDR4-3200 CL14 modules improved performance by a whopping 12% when compared to just two modules. It’s also faster than the single ranked DDR4-3600 CL18 configurations. We’re looking at similar performance from the ADATA XPG 32GB DDR4-3600 kit, and of course, that uplift isn’t explained by the extra capacity, but the dual rank configuration. Beyond that though, we’re not gaining much with the manually tuned DDR4-3600 and 3800 memory.
We also ran some 1440p tests with Shadow of the Tomb Raider and this is what we believe you can expect to see in most games, even with something as extreme as an RTX 3090. Whereas we saw a 19% difference between the fastest and slowest tested configurations at 1080p, at 1440p that margin is reduced to just 4% and with memory running at or above the AMD base spec, we’re talking about a 0.7% difference.
8 Game Average
If we average the 1080p data across the 8 games tested, this paints a clear picture of the kind of performance difference you can expect to see with an extreme GPU at a low-ish resolution.
In more CPU limited gaming scenarios, a manually tuned DDR4-3800 configuration will net you some 7% more performance when compared to a stock memory kit, like the Corsair Dominator Platinum RGB, for example. When compared to our review setup, we’re looking at just a 3% boost on average.
AIDA64 Memory Bandwidth
For those of you wondering, here’s a look at the memory bandwidth performance of these various configurations. Although the gaming performance of DDR4-4000 was average given we couldn’t run at a 2000 MHz FCLK, the bandwidth is still very impressive, sustaining 55 GB/s. The manually-tuned DDR4-3800 memory managed 53 GB/s, which wasn’t much faster than the ADATA Spectrix 32GB kit, which achieved almost 52 GB/s. Our test configuration was good for almost 47 GB/s, which is about the most you can hope for from DDR4-3200 memory.
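To put those numbers in context, the theoretical peak for dual-channel DDR4 is the data rate multiplied by 8 bytes per transfer (a 64-bit channel) and by two channels. The Python sketch below compares that ceiling with the rounded figures quoted above. The function name is ours, and the measured values are approximations taken from this text rather than exact benchmark output.

```python
def peak_bandwidth_gbs(data_rate_mts: int, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak in GB/s: transfers per second x bytes per transfer x channels."""
    return data_rate_mts * bus_bytes * channels / 1000


# Rounded measured figures quoted in the text (GB/s), keyed by DDR4 data rate.
# "Almost 47" and "almost 52" are treated as 47 and 52 for this rough comparison.
measured_gbs = {
    3200: 47,  # 4x8GB DDR4-3200 CL14 test configuration
    3600: 52,  # ADATA Spectrix 32GB DDR4-3600 kit
    3800: 53,  # manually tuned DDR4-3800
    4000: 55,  # DDR4-4000, FCLK not at 1:1
}

for rate, measured in measured_gbs.items():
    peak = peak_bandwidth_gbs(rate)
    print(f"DDR4-{rate}: peak {peak:.1f} GB/s, measured ~{measured} GB/s "
          f"({measured / peak:.0%} of peak)")
```

DDR4-3200 tops out at 51.2 GB/s on paper, so a result of almost 47 GB/s is already above 90% of what the interface can deliver.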
Here’s why the 55 GB/s DDR4-4000 configuration didn’t dominate the gaming benchmarks: the latency is rather unimpressive at 60 ns and that’s only marginally better than our DDR4-3200 test setup. The tuned DDR4-3800 memory reduced DRAM latency by 9%, hence why it did so well in memory sensitive titles such as Hitman 2, Far Cry New Dawn, and Shadow of the Tomb Raider.
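One small piece of that latency figure can be worked out from the kit’s spec sheet: CAS latency in nanoseconds is the CL value multiplied by the clock period, or CL x 2000 / data rate. The Python sketch below runs that for the kits mentioned in this article. Keep in mind this is only the CAS component, not the full round-trip latency that AIDA64 reports, and it ignores the secondary and tertiary timings we tuned. The function name is ours.

```python
def cas_latency_ns(data_rate_mts: int, cl: int) -> float:
    """CAS latency in nanoseconds.

    The memory clock is half the data rate, so one cycle lasts
    2000 / data_rate nanoseconds, and CL counts those cycles.
    """
    return cl * 2000 / data_rate_mts


# (label, data rate in MT/s, CL) for kits mentioned in this article
kits = [
    ("DDR4-3200 CL14", 3200, 14),
    ("DDR4-3600 CL16", 3600, 16),
    ("DDR4-3600 CL18", 3600, 18),
    ("DDR4-3800 CL16", 3800, 16),
]

for label, rate, cl in kits:
    print(f"{label}: {cas_latency_ns(rate, cl):.2f} ns")
```

By this measure DDR4-3200 CL14 and DDR4-3600 CL16 land within a fraction of a nanosecond of each other, while DDR4-3600 CL18 is noticeably slower at 10 ns.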
DDR4 Frequency Scaling
Speaking of Hitman 2, here’s a look at memory frequency scaling, so we’re using the same memory and timings, with the only changes made to the memory frequency and the FCLK which has been kept at a 1:1 ratio for optimal performance, with the exception of the DDR4-4000 configuration. In Hitman 2, we see fairly consistent scaling as the memory bandwidth and/or latency is improved, right up to DDR4-3800. If we could get a 2000 MHz FCLK working, we expect you’d see a further 3% performance boost for the DDR4-4000 configuration.
Shadow of the Tomb Raider is more in line with other demanding games and here we find that DDR4-3600 is the sweet spot, as was the case with Zen 2. For those of you wanting as much performance as possible without going overboard on memory prices, DDR4-3600 CL16 looks like the way to go.
RTX 2070 Super Benchmarks
But wait, there’s more. Here’s the same memory configuration we just saw in the scaling benchmarks, but this time using the RTX 2070 Super…
Even in Hitman at 1080p, with an RTX 2070 Super you’re going to be almost entirely GPU bound and if you happen to be gaming at 1440p, well, you’ll be entirely GPU bound, and no matter how much you spend on your memory, or how many modules you have, performance is going to be the same.
We see exactly the same thing in Shadow of the Tomb Raider. Even at 1080p we’re looking at equalized performance across the board due to the GPU bottleneck, and the 2070 Super is no slouch: we’re talking about Radeon RX 5700 XT-like performance here, so very solid mid-range GPU performance. Of course, increasing the resolution only neutralizes the results further and now it doesn’t matter if you’re running DDR4-2800 or 3800, performance will be the same in a very CPU demanding title.
What We Learned
Bottom line, if you’re a gamer wanting to maximize performance if and when you run into CPU limited situations, and you want the best bang for your buck, then we recommend getting DDR4-3600 CL16 memory. For most we suspect 16GB will be fine, but if you can afford more, 32GB is nice and it means if you purchase two 16GB kits you’ll also have the advantage of dual ranked operation. Right now something like Crucial’s Ballistix 16GB DDR4-3600 CL16 kit looks great and costs just $75. Should you want to tinker, these modules also offer a high degree of tunability. G.Skill also offers an affordable DDR4-3600 CL16 kit for around $80.
Even assuming we are able to get a 2000 MHz FCLK working with future BIOS revisions, we don’t think it’s worth spending over $100 on faster kits to take advantage of it. Chances are, you’ll never spot the difference. That stuff is better reserved for overclockers wanting to get bigger 3DMark scores or whatever leaderboard it is that gets them excited these days. As for the debate regarding 2 sticks vs 4 sticks of memory, there’s nothing new to report since our own test almost a year back. The performance uplift for Zen 3 is no different to that of Zen 2 or competing Intel processors. The margins will also depend on the quality settings used, and of course, the hardware. If you lower the quality settings in games, you’re going to exaggerate the margins further and that does get you further away from the reality for most gamers.
We realize that most of you will just install your memory and get gaming, and frankly unless you enjoy playing with this stuff, spending hours tuning and tinkering with memory timings for what will likely amount to very little real-world gains, is just a waste of time. This can be more important stuff for us reviewers trying to conduct scientific testing, and sometimes we do get a bit carried away with isolating a specific component in order to see what performance differences there might be, but it really is important to remind readers that for the most part they’re unlikely to see these gains under realistic gaming conditions, so keep that in mind.