With the launch of its new Qualcomm Cloud AI 100 Ultra, Qualcomm extends the presence in AI inference processing that it established with its Cloud AI 100 series accelerators.

While Qualcomm’s Cloud AI 100 accelerator family has long been available from several tier-one technology providers such as Lenovo, Hewlett Packard Enterprise (HPE), Inventec, Foxconn, Gigabyte, and Asus, it’s starting to see deployment in the public cloud.

Amazon Web Services (AWS) recently introduced its first Qualcomm-based accelerated instance type, the DL2q, featuring the Qualcomm Cloud AI 100. While the new instance type can be used for general inference applications, the companies highlight the accelerator’s specific applicability in developing automotive ADAS and related applications – an area in which Qualcomm is rapidly expanding its presence.

Qualcomm’s Cloud AI 100

Qualcomm first launched its Cloud AI 100 accelerator in 2020, delivering a device specifically engineered to boost the capabilities of cloud computing environments through efficient, high-speed AI inference processing.

The Cloud AI 100 is tailored for inference, which is the application phase of AI where a trained model is used to interpret new data. This is a critical function in AI deployments that require immediate results, such as recognizing speech, translating languages, analyzing images, or processing real-time data from IoT devices.

The accelerator balances performance and efficiency. Qualcomm built a device with a strong total cost of ownership (TCO) story that still delivers the performance demanding AI inference workloads require.

MLPerf 3.1 Results

In September 2023, MLCommons released its MLPerf Inference 3.1 benchmark results, in which Qualcomm demonstrated significant advancements with its Cloud AI 100 inference accelerators.

The results show notable gains in performance and power efficiency, along with lower latencies, for the Qualcomm Cloud AI 100, particularly on Natural Language Processing (NLP) and computer vision networks.

Qualcomm’s MLPerf Inference v3.1 benchmarks surpassed its previous records. In several categories, the Cloud AI 100 showed advancements in peak offline performance, power efficiency, and latency reduction.

For instance, a 2U datacenter server platform equipped with 16 Qualcomm Cloud AI 100 PCIe Pro (75W TDP) accelerators displayed a 15-20% improvement in power efficiency across NLP and computer vision networks.

At the same time, Qualcomm’s performance on the RetinaNet network on platforms using the Cloud AI 100 improved by around 12%. This gain reflects Qualcomm’s continued efforts to improve the efficiency and speed with which its accelerators process AI models.

The MLPerf Inference v3.1 results demonstrate the effectiveness of the Qualcomm Cloud AI 100 across a broad range of applications in both the edge and data center categories, with strong showings in key metrics such as inferences per second and inferences per second per watt (I/S/W).
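To make the power-efficiency metric concrete, here is a minimal Python sketch of how inferences per second per watt (I/S/W) and a relative efficiency gain can be computed. The 16-card, 75W-per-card configuration comes from the results described above; the throughput figures are hypothetical placeholders, not MLPerf data, and using TDP as the power figure is a simplifying assumption (published power results are based on measured power draw).

```python
def inferences_per_second_per_watt(inferences_per_second: float, watts: float) -> float:
    """Power efficiency (I/S/W): throughput divided by power draw."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return inferences_per_second / watts


def aggregate_tdp_watts(cards: int, tdp_per_card_watts: float) -> float:
    """Upper-bound accelerator power budget for a multi-card server."""
    return cards * tdp_per_card_watts


def relative_improvement(old: float, new: float) -> float:
    """Fractional change from old to new (0.175 means 17.5%)."""
    return (new - old) / old


# Configuration cited in the article: 16 PCIe Pro cards at 75 W TDP each.
budget_watts = aggregate_tdp_watts(16, 75.0)  # 1200.0 W

# HYPOTHETICAL throughput numbers, chosen only to illustrate the arithmetic.
baseline = inferences_per_second_per_watt(120_000.0, budget_watts)
improved = inferences_per_second_per_watt(141_000.0, budget_watts)

gain = relative_improvement(baseline, improved)
print(f"baseline: {baseline:.1f} I/S/W, improved: {improved:.1f} I/S/W, gain: {gain:.1%}")
```

With these placeholder inputs the gain works out to 17.5%, which sits inside the 15-20% range reported for the 2U platform. Because the power budget is held constant here, the efficiency gain equals the throughput gain; in a measured result, both throughput and power can move.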

Introducing the Cloud AI 100 Ultra

In November 2023, Qualcomm added to its Cloud AI 100 lineup with the introduction of its new Qualcomm Cloud AI 100 Ultra. The new accelerator is tailored explicitly to serve the needs of generative AI and large language models (LLMs).

The new accelerator offers four times the performance of earlier Cloud AI 100 variants. The AI 100 Ultra can support extremely large AI models, handling models with up to 100 billion parameters on a single 150-watt card.

The Ultra can scale up to support 175 billion parameter models with two cards. Multiple AI 100 Ultra cards can be combined to handle even larger models.
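A rough way to reason about these model-size claims is to estimate how much memory the model weights alone occupy at a given numeric precision. The sketch below is an illustration under stated assumptions, not Qualcomm's sizing method: the article does not state the card's memory capacity or the precision used, so the 128 GB capacity and the INT8/FP16 choices are assumed values, and the estimate ignores activations, KV cache, and runtime overhead.

```python
import math

# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_footprint_gb(params_billions: float, precision: str) -> float:
    """Memory for weights only, in GB (1 GB = 1e9 bytes).

    One billion parameters at one byte each is 1 GB, so the result is
    simply billions of parameters times bytes per parameter.
    """
    return params_billions * BYTES_PER_PARAM[precision]


def cards_needed(params_billions: float, precision: str, card_memory_gb: float) -> int:
    """Minimum number of cards whose combined memory holds the weights."""
    return math.ceil(weight_footprint_gb(params_billions, precision) / card_memory_gb)


# ASSUMPTION: per-card memory capacity; not stated in the article.
ASSUMED_CARD_MEMORY_GB = 128.0

for params, precision in [(100, "int8"), (175, "int8"), (175, "fp16")]:
    size = weight_footprint_gb(params, precision)
    cards = cards_needed(params, precision, ASSUMED_CARD_MEMORY_GB)
    print(f"{params}B params @ {precision}: {size:.0f} GB of weights -> {cards} card(s)")
```

Under these assumptions, a 100-billion-parameter model at one byte per parameter needs one card and a 175-billion-parameter model needs two, which lines up with the figures above. At FP16 the same 175-billion-parameter model would need three, which shows how strongly the card count depends on precision.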

Despite its high performance, the Cloud AI 100 Ultra maintains the energy efficiency of the rest of the family, which is crucial for reducing operational costs in data centers and for supporting sustainability goals in AI operations.


AI inference is becoming a critical functionality, especially with large language models. Bringing AI to the edge, especially the mobile edge, is the next frontier of accelerated computing. Qualcomm puts a significant stake in the ground with its Cloud AI 100 accelerators, nicely complementing its existing edge-targeted compute and communication technology.

While Qualcomm entered this market with its Cloud AI 100, the new Ultra offering takes these capabilities further, explicitly targeting the demands of generative AI and large language models. This advanced version stands out for its ability to support extremely large AI models.

Its enhanced performance and energy efficiency make the Qualcomm Cloud AI 100 Ultra a compelling solution for complex AI tasks while keeping operational costs in check.

Beyond its technical capabilities, the new accelerator provides another waypoint as Qualcomm continues its expansion into the AI-enabled edge market. Qualcomm leverages the technology within the Cloud AI 100 family to service the needs of various markets, now including the public cloud.

Qualcomm isn’t alone in this market. Beyond the offerings of industry stalwart NVIDIA, we’ve seen AWS, Google, and Microsoft all introduce inference-specific accelerators. AMD’s MI300 series of accelerators plays in this space, as does Intel’s Gaudi.

Qualcomm differentiates itself by combining the high-performance, energy-efficient inference typified by its Cloud AI 100 offerings with an IP portfolio that can service the broader needs of the edge market. That’s rare among current technology providers.

Qualcomm’s Cloud AI 100 product line underscores the company’s strategic move into high-end AI inference markets, showcasing its potential to reshape AI processing in various industries, from healthcare to automotive and beyond. It’s a compelling story that Qualcomm’s competitors struggle to beat.

Disclosure: Steve McDowell is an industry analyst, and NAND Research is an industry analyst firm that engages in, or has engaged in, research, analysis, and advisory services with many technology companies, which may include those mentioned in this article. Mr. McDowell does not hold any equity positions with any company mentioned in this article.
