Running inference in PyTorch is relatively simple: only 35 lines of code are needed to download a model, load it into PyTorch, and run it. Having a framework like this to test new models is useful, especially one that’s this easy to get running.
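To give a feel for what that workflow involves, here’s a minimal sketch of a CPU-only inference script. It’s not the exact code used in the test; it assumes the torch and transformers packages are installed, and the Hugging Face model ID is an arbitrary small example standing in for whichever model you want to try.

```python
# Minimal sketch: download a model, load it into PyTorch, run inference.
# Assumes: pip install torch transformers
# The model ID is illustrative; any small causal LM on Hugging Face works.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Qwen/Qwen2.5-0.5B-Instruct"  # hypothetical choice, not the tested model

# Download on the first run, then load the tokenizer and model weights
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float32)
model.eval()  # inference only; no gradients needed

prompt = "Explain what an NPU is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")

# Run on the CPU; the Arm build of PyTorch has no NPU backend yet
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

That’s roughly the shape of the script: a handful of lines each to download, load, and generate.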
Although it would be nice to have NPU support, that will require more work in the upstream PyTorch project, which has concentrated on using CUDA on Nvidia GPUs and has so far paid relatively little attention to AI accelerators. However, with the increasing popularity of silicon like Qualcomm’s Hexagon and the NPUs in the latest generation of Intel and AMD chipsets, it would be good to see Microsoft add full support for all the capabilities of its own and its partners’ Copilot+ PC hardware.
It’s a good sign when we want more, and an Arm version of PyTorch is an important part of the endpoint AI development toolchain needed to build useful AI applications. By working with the same tools used by services like Hugging Face, we can try any of a large number of open source AI models, testing and tuning them on our own data and on our own PCs, delivering something that’s much more than another chatbot.
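As a sketch of how little code that model-swapping takes, the transformers pipeline API reduces trying a different open source model to changing one string. The model ID here is again an illustrative placeholder, not a recommendation:

```python
# Trying another Hugging Face model is a one-line change of the model ID.
from transformers import pipeline

# device=-1 selects the CPU; the model ID is an arbitrary example
generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",
    device=-1,
)

result = generator("Why does on-device AI matter?", max_new_tokens=60)
print(result[0]["generated_text"])
```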