Since the lineup was announced earlier this year, everyone’s attention has been fixed firmly on the 32-core/64-thread 2nd Gen Threadripper part, now known as the 2990WX, coming in at $1800. There will be two models in the WX series and, for those wondering, the ‘W’ signifies that this is a workstation series and the ‘X’ is the usual xtreme nonsense, we suppose. Along with the 2990WX there will also be a 24-core/48-thread model known as the 2970WX, though that model won’t be available until October. Although the TR 2990WX has been receiving all the attention, we expect the 2950X to be the real hero of this new lineup, as it is basically a refined 1950X at a $100 lower launch price.
As was the case with the 2nd Gen Ryzen 5 and Ryzen 7 models, these new Threadripper parts feature reduced cache and DRAM latency with support for slightly faster memory. They are based on the Zen+ architecture, which uses the 12LP process from GlobalFoundries. The TR 2950X features the same layout as the 1950X, which means it comprises two active Zeppelin dies, each packing 8 cores, two memory channels and 32 PCIe gen 3 lanes. When using DDR4-3200 memory the Infinity Fabric throughput between these dies is roughly 50 GBps.
As was the case with the 1950X, the 2950X can be configured in one of two ways. Using UMA (Uniform Memory Access), which AMD refers to as ‘distributed’ mode in their Ryzen Master software, the processor acts as a single unit. This means threads and DRAM transactions are distributed evenly across the entire chip to maximise bandwidth, but this in turn increases latency, which isn’t ideal for tasks such as gaming.
Therefore it is possible to enable NUMA (Non-Uniform Memory Access), which AMD refers to as ‘local’ mode in the Ryzen Master software. They call this a local operating mode as the processor is separated into two domains and attempts to pair active cores with local DRAM, rather than accessing memory via a controller in a separate die, which comes with a rather hefty latency penalty.
The 2990WX on the other hand is a very different beast. It consists of not two Zeppelin dies but rather four, enabling up to 32 cores. However on the X399 platform AMD has imposed some limitations to avoid cannibalizing their single socket EPYC server CPUs. The biggest of these limitations is that there are still just four memory controllers. Although there are two more Zeppelin dies, the additional two dies are compute dies, in AMD’s words. This means they have no local PCIe or DRAM access; for that they must travel via the Infinity Fabric to the IO dies. As there are twice as many dies, the Infinity Fabric bandwidth is also halved, so now the throughput between dies is just 25 GBps, assuming you’re using DDR4-3200 memory.
MSRP prices reflect pricing at launch. TR 1950X is down to $770 as of writing. TR 1920X is $530 and TR 1900X is $333.
Because of this design, which sees two of the dies without direct access to the DRAM, the 2990WX, unlike the 2950X, uses NUMA exclusively. AMD says that this quad-NUMA configuration has allowed them to create the world’s first 32-core consumer processor and, just as important, it has allowed them to do it while maintaining backwards compatibility with existing TR4 products.
There is an obvious drawback that’s had me a bit concerned since we first heard about this 32-core model. We always knew that 1st Gen Threadripper CPUs had the potential to offer up to 32 cores, so this isn’t some kind of radical breakthrough for AMD with the 2nd Gen series. The original Threadripper chips didn’t have ‘dummy’ dies as claimed by AMD. We always knew they were defective or disabled Zeppelin dies; these are after all just EPYC CPUs for the desktop. Though we don’t mean to sound like we’re downplaying anything here, EPYC CPUs on the desktop is very much epic.
Anyway, for this 2nd Gen Threadripper series AMD has enabled those extra dies to create the 24 core and 32 core models. The problem is memory bandwidth, as there simply isn’t going to be enough of it. We still only have quad-channel memory access, so memory bandwidth remains the same, but now we have twice as many cores to feed.
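To put a rough number on that, here is a minimal back-of-the-envelope sketch. It assumes the theoretical peak of a standard 64-bit DDR4 channel (8 bytes per transfer) and ignores real-world efficiency, caches and Infinity Fabric hops, so treat the output as an upper bound rather than a measurement. The function names are ours, purely for illustration.

```python
def peak_dram_bandwidth_gbs(channels: int, mega_transfers: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s.

    Assumes a 64-bit (8-byte) channel, so DDR4-3200 gives 25.6 GB/s per channel.
    """
    return channels * bus_bytes * mega_transfers / 1000


def bandwidth_per_core_gbs(total_gbs: float, cores: int) -> float:
    """Evenly divide the total peak bandwidth across all cores."""
    return total_gbs / cores


if __name__ == "__main__":
    # Quad-channel DDR4-3200, as used for the test systems in this review.
    total = peak_dram_bandwidth_gbs(channels=4, mega_transfers=3200)
    print(f"Total peak: {total:.1f} GB/s")
    for name, cores in (("2950X", 16), ("2970WX", 24), ("2990WX", 32)):
        share = bandwidth_per_core_gbs(total, cores)
        print(f"{name}: {share:.1f} GB/s per core")
```

On paper that works out to 102.4 GB/s in total, or 6.4 GB/s per core on the 16-core part versus 3.2 GB/s per core on the 32-core part.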
Likely this is going to make an already niche product even more focused, so keep that in mind. For testing we’ve got a truckload of data and, while we tried to include as many CPUs as possible, we did run out of time to update the results with the Core i7-8700K. The good news is we have results for a number of Intel’s high-end desktop processors, such as the Core i9-7980XE and 7960X. Basically all systems were configured with 32GB of DDR4-3200 memory using XMP timings. So let’s get into the results.
Benchmarks
Okay, well we might as well get this one out of the way first: Cinebench R15. As many of you are probably aware by now, given AMD leaked the results, the 2990WX achieves a score of just over 5000 pts in its stock out-of-the-box configuration. That makes it a whopping 52% faster than the Core i9-7980XE, and at this point you’re probably wondering what the hell I was on about when I said there were some drawbacks to the design, but hold that thought for now. Anyway, in this particular synthetic rendering benchmark the 2990WX has no trouble blowing socks off. The 2950X is no slouch either, though it does only improve upon the 1950X by a mere 5% margin.
Next up we have another rendering benchmark, though this one is based on real-world software. The Corona Renderer has been used to test workstations with over 64 cores, so it scales very well. Here we again see breathtaking rendering performance from the 2990WX as it took just 41 seconds, allowing it to complete the test 40% faster than the 2950X. That's not perfect core scaling but still an impressive result. This also meant it was 28% faster than Intel’s current flagship Core i9 part. Also, this time the 2950X was just 4% faster than the 1950X, so another small gain, but a gain all the same.
Moving on, I fired up the Ryzen Graphic workload in Blender. This is a relatively quick test for high-end CPUs and we see that with the 2990WX, which took just 8.3 seconds. This meant it completed the workload 36% faster than the 2950X and 31% faster than the Core i9-7980XE.
Again an impressive completion time for the 2990WX, but it has to be said that for a doubling of cores we are only seeing a 55% boost in overall performance when compared to the 2950X, and there is only a very minor clock speed difference between the two. So what about a workload that takes significantly longer than a few seconds for the 2990WX to complete?
Disappointingly, the much more complex and therefore longer to complete Gooseberry workload was less favorable to the 2990WX. Okay, so it still uprooted the 7980XE and kicked in its pins, but it was also only able to reduce the completion time by a 28% margin when compared to the 2950X. It was also able to reduce the render time by 20% when compared to the more expensive 7980XE, so that’s obviously a great result for AMD. Still, it’s a troubling sign that in what should be an optimal workload for the 2990WX, we’re only seeing a 38% increase in performance for a 100% increase in cores.
Okay, so POV-Ray is the last rendering benchmark we’re going to look at and this one bodes well for the 2990WX. Here it was able to reduce the rendering time by 40% when compared to the 2950X and this meant it was 65% faster, so again not amazing scaling, but 65% is much better than what we saw in Corona and Blender. It was also 57% faster than the Core i9-7980XE, so a massive win there.
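Since these rendering results mix "reduced the time by X%" with "was Y% faster", here is a small sketch of how the two relate. It assumes performance is simply the inverse of completion time. The function name is ours, and the inputs are the rounded percentages quoted in the text, so the outputs can land a point or two away from figures worked out from the raw render times.

```python
def speedup_from_time_reduction(reduction: float) -> float:
    """Convert a fractional time reduction into a fractional performance gain.

    If the new time is (1 - reduction) of the old time, performance
    (work per second) rises by 1 / (1 - reduction) - 1.
    """
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1 / (1 - reduction) - 1


if __name__ == "__main__":
    # 2990WX vs 2950X, using the rounded time reductions quoted in the text.
    for test, reduction in (
        ("Blender Ryzen Graphic", 0.36),
        ("Blender Gooseberry", 0.28),
        ("POV-Ray", 0.40),
    ):
        gain = speedup_from_time_reduction(reduction)
        print(f"{test}: {reduction:.0%} less time = {gain:.0%} more performance")
```

Perfect scaling from 16 to 32 cores would halve the render time, which is a 100% gain by this measure, so that is the yardstick these results fall short of.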
I’ve included the heavy multitasking results from RealBench, which runs image editing, video compression and rendering tasks simultaneously. The 2990WX saw a peak load of 70%, but for at least half of the test, load was down around 20%, so that’s worth noting. Here we see the 2990WX providing a surprisingly poor result, taking 43 seconds to complete the workload. This made the 32-core processor slower than even the 1950X. Here it was the 2950X that impressed, matching the Core i9-7960X and 7980XE. This meant the 2950X was able to complete the heavy multitasking test 6% faster than the 1950X, so again a great result.