AMD this morning is formally announcing the launch of the latest version of its GPU compute software stack, ROCm 5.6. Along with making several important updates to the software stack itself – particularly around improving support for large language models (LLMs) and other machine learning toolkits – the company has also published a blog post outlining the future hardware development plans for the stack. In short, the company will be bringing official support to a limited set of RDNA 3 architecture video cards starting this fall.
ROCm is AMD’s counterpart to NVIDIA’s CUDA and Intel’s oneAPI software stacks, though AMD has historically pursued a narrower hardware focus with its GPU compute software. ROCm exists first and foremost to support AMD’s Instinct line of accelerators (used in projects such as the Frontier supercomputer), and as a result, support for non-Instinct products has been limited. Officially, AMD only supports the software stack on a pair of workstation-class RDNA 2 architecture cards (Radeon Pro W6800 & V620), while unofficial support is available for some other RDNA 2 cards and architectures – though in practice this has proven to be a mixed bag as to how reliably it works. Consequently, any announcement of new Radeon video card support for ROCm is notable, especially when it involves a consumer Radeon card.
Closing out their ROCm 5.6 announcement blog post, AMD is announcing that support for the first RDNA 3 products will be arriving in the fall. Kicking things off, the company will be adding official support for the Radeon Pro W7900 – AMD’s top workstation card – and, for the first time, the consumer-grade Radeon RX 7900 XTX. Both of these parts are based on the same RDNA 3 GPU (Navi 31), so architecturally they are identical, and it’s a welcome sign to see AMD finally embracing that and bringing a consumer Radeon card into the fold.
Broadly speaking, RDNA 3’s compute core differs significantly from RDNA 2 (and CDNA 2) thanks to the introduction of dual issue SIMD execution, and the resulting need to extract ILP from an instruction stream. So the addition of proper RDNA 3 support to the ROCm stack is not a small undertaking for AMD’s software team, especially when they are also working to support the launch of the MI300 (CDNA 3) accelerator family later this year.
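To give a rough feel for why dual issue puts pressure on the compiler, here is a toy Python sketch. It is not AMD’s actual pairing logic, and the names (`Instr`, `can_pair`, `issue_cycles`) are hypothetical. It assumes a simplified in-order model in which two adjacent instructions can share a cycle only if the second neither reads nor overwrites the first one’s destination register.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    """A toy instruction: one destination register and its source registers."""
    dest: str
    srcs: Tuple[str, ...]


def can_pair(first: Instr, second: Instr) -> bool:
    """Assumed pairing rule: no read-after-write and no write-after-write hazard."""
    return first.dest not in second.srcs and first.dest != second.dest


def issue_cycles(stream: List[Instr], dual_issue: bool = True) -> int:
    """Count issue cycles for an in-order stream under the toy model."""
    cycles = 0
    i = 0
    while i < len(stream):
        next_exists = i + 1 < len(stream)
        if dual_issue and next_exists and can_pair(stream[i], stream[i + 1]):
            i += 2  # two independent instructions share one cycle
        else:
            i += 1  # a dependency (or end of stream) forces single issue
        cycles += 1
    return cycles


# Each instruction consumes the previous result, so nothing can be paired.
chain = [
    Instr("a", ("x", "y")),
    Instr("b", ("a", "z")),
    Instr("c", ("b", "w")),
    Instr("d", ("c", "v")),
]

# Four unrelated operations: every adjacent pair is independent.
independent = [
    Instr("a", ("x", "y")),
    Instr("b", ("z", "w")),
    Instr("c", ("u", "v")),
    Instr("d", ("s", "t")),
]

print(issue_cycles(chain))                          # dependent chain
print(issue_cycles(independent))                    # independent work, dual issue
print(issue_cycles(independent, dual_issue=False))  # same work, single issue
```

In this model the dependent chain gains nothing from the second issue slot, while the independent stream needs half as many cycles. Real hardware has additional pairing restrictions, but the basic point is the same: the compiler has to find and arrange independent work before the extra slot pays off.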
Along with the first two Navi 31 cards, AMD is also committing to bringing support for “additional cards and expanded capabilities to be released over time.” To date, AMD’s official video card support has never extended beyond a single GPU within a given generation (e.g. Navi 21), so it will be interesting to see whether this means AMD is finally expanding their breadth to include more Navi 3x GPUs, or if this just means officially supporting more Navi 31 cards (e.g. W7800). AMD’s statement also seems to imply that support for the full ROCm feature set may not be available in the first iteration of RDNA 3 support, but I may be reading too much into that.
Meanwhile, though it’s not by any means official, AMD’s blog post also notes that the company is improving its unofficial support for Radeon products as well. Numerous issues with ROCm on unsupported GPUs have been fixed in the ROCm 5.6 release, which should make the software stack more usable on a day-to-day basis on a wider range of hardware.
Overall, it is welcome to see that AMD is finally lining up support for their latest desktop GPU architecture within their compute stack, as Navi 3x’s potential as a compute product has remained less than fully tapped since it launched over half a year ago. AMD has taken some not-undeserved flak over the years for ROCm’s limited support for their bread-and-butter GPU products, so this announcement, along with CEO Dr. Lisa Su’s comments earlier this month that AMD is working to improve their ROCm support, indicates that AMD is finally making some much needed (and greatly awaited) progress with improving the ROCm product stack.
Though as AMD prepares to add further hardware support for ROCm, they are also preparing to take some away. Support for products based on AMD’s Vega 20 GPU, such as the Instinct MI50 and Radeon Pro VII, is set to begin sunsetting later this year. ROCm support for those products will be entering maintenance mode in Q3, with the release of ROCm 5.7, at which time no further features or performance optimizations will be added for that branch of hardware. Bug fixes and security updates will still be released for roughly another year. Ultimately, AMD is giving a formal heads up that they’re looking to drop support for that hardware entirely after Q2 of 2024.
Finally, for anyone who was hoping to see full Windows support for ROCm, despite some premature rumors, that has not happened with ROCm 5.6. Currently, AMD has a very limited degree of Windows support in the ROCm toolchain (ROCm is used for the AMD backend in both Linux and Windows editions of Blender) and ROCm development logs indicate that they’re continuing to work on the matter; but full Windows support remains elusive for the software stack. AMD has remained quite mum on the matter overall, with the company avoiding doing anything that would set any expectations for a ROCm-on-Windows release. That said, I do still expect to see proper Windows support at some point in the future, but there's nothing to indicate it's happening any time soon. Especially with MI300 on the horizon, AMD would seem to have bigger fish to fry.
from AnandTech https://ift.tt/FN71EP4