OpenAI’s reasoning model will run on devices with Snapdragon

Qualcomm Technologies is excited to share a major milestone in its journey of making AI assistants ubiquitous: OpenAI has open-sourced its first reasoning model, gpt-oss-20b, a chain-of-thought model that runs directly on devices with flagship Snapdragon processors. OpenAI’s sophisticated models have previously been confined to the cloud; this marks the first time the company has made a model available for on-device inference.

Through early access to the model and integration testing with the Qualcomm AI Engine and the Qualcomm AI Stack, the company has found that this 20B-parameter model delivers chain-of-thought reasoning entirely on-device.

This is a turning point: a glimpse into a future of AI where even rich assistant-style reasoning will be local. It also reflects the maturity of the AI ecosystem, where open-source innovation from leaders like OpenAI can be harnessed in real time by partners and developers utilising Snapdragon processors. OpenAI’s gpt-oss-20b will allow devices to leverage on-device inference, offering privacy and latency benefits while complementing cloud solutions via AI agents.

Developers will be able to access this model and leverage its capabilities on devices with Snapdragon through popular platforms like Hugging Face and Ollama, with more details on deployment available soon on the Qualcomm AI Hub.
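For a sense of what this could look like in practice, here is a minimal sketch of loading the model through Hugging Face’s transformers library. The repo id "openai/gpt-oss-20b" and the generation settings are assumptions for illustration; device-ready builds and exact deployment steps should be confirmed on the Qualcomm AI Hub once details are published.

```python
# Minimal sketch: generating text with gpt-oss-20b via Hugging Face transformers.
# The repo id "openai/gpt-oss-20b" is an assumption; check the official model card.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",  # assumed Hugging Face repo id
)

result = generator(
    "Explain why on-device inference reduces latency.",
    max_new_tokens=128,  # illustrative setting, tune for your device
)
print(result[0]["generated_text"])
```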

By integrating Ollama’s lightweight, open-source LLM serving framework with powerful Snapdragon platforms, developers and enterprises can run gpt-oss-20b directly on Snapdragon compute devices, with web search and several other features working by default out of the box; a brief serving sketch follows below. Users can also try Ollama’s turbo mode to unlock further functionality of the model.
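As a rough illustration, the snippet below calls a locally served model through Ollama’s REST API, which listens on port 11434 by default. The model tag "gpt-oss:20b" is an assumption; confirm the published tag in the Ollama library before use.

```python
# Minimal sketch: querying a locally served gpt-oss-20b through Ollama's REST API.
# Assumes the Ollama server is running on-device and the model tag is "gpt-oss:20b".
import requests

response = requests.post(
    "http://localhost:11434/api/chat",  # default Ollama endpoint
    json={
        "model": "gpt-oss:20b",  # assumed model tag
        "messages": [
            {"role": "user", "content": "Summarise the benefits of local inference."}
        ],
        "stream": False,  # return a single JSON response instead of a stream
    },
    timeout=300,
)
response.raise_for_status()
print(response.json()["message"]["content"])
```

Because inference happens entirely on the device, no prompt or response leaves the handset, which is the privacy and latency advantage the announcement highlights.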

Over the next few years, as mobile memory footprints continue to grow and software stacks become even more efficient, we believe on-device AI capability will increase rapidly, opening the door to private, low-latency, personalised agentic experiences.
