Qualcomm is joining Intel, Nvidia, and others in the artificial intelligence (AI) chip race. The chipmaker announced at an event in San Francisco that it will release samples of its Cloud AI 100 family of accelerators later this year, with mass production expected to begin in 2020.
This new 7-nanometer (nm) chip is built to meet the demand for AI inference workload processing in the cloud, the company says.
AI workloads are split into two areas: training and inference. Training is the process of learning a capability from existing data, whereas inference applies the trained model to new data. Reuters reported that analysts predict speeding up inference will account for the largest part of the AI chip market.
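To make that split concrete, here is a minimal sketch in PyTorch (one of the frameworks mentioned later in this piece): training repeatedly adjusts model weights against known data, while inference is a single, gradient-free forward pass on new data. The toy model and data are purely illustrative and are not tied to any particular hardware.

```python
import torch
import torch.nn as nn

# --- Training: learn a capability from existing data ---
# Illustrative synthetic data: y is roughly 2x + 1.
x_train = torch.randn(100, 1)
y_train = 2 * x_train + 1 + 0.05 * torch.randn(100, 1)

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()      # gradient computation happens only during training
    optimizer.step()     # weights are updated only during training

# --- Inference: apply the trained model to new data ---
model.eval()
with torch.no_grad():    # no gradients needed, which is what inference hardware exploits
    prediction = model(torch.tensor([[3.0]]))
print(prediction.item())  # roughly 2 * 3 + 1 = 7
```

Inference skips the backward pass and weight updates entirely, which is the comparatively lighter, high-volume workload that chips like the Cloud AI 100 are built for.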
According to Qualcomm, the new chip builds on the signal processing and power efficiency of its Snapdragon mobile processors while raising the bar they set. In a prepared statement, Keith Kressin, SVP of product management at Qualcomm, said the Cloud AI 100 “will significantly raise the bar for the AI inference processing relative to any combination of central processing units, graphics processing units, and/or field-programmable gate arrays used in today’s data centers.”
The chipmaker claims that the Cloud AI 100 will deliver 10 times the performance of other AI inference offerings in the market today.
Leveraging the power efficiency of its mobile chips, which consume little electricity and generate very little heat, Qualcomm is aiming to serve smaller “edge” data sites with its AI chips.
The chip will offer support for a number of software stacks including PyTorch, Glow, TensorFlow, Keras, and the Open Neural Network Exchange (ONNX).
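ONNX support matters because it gives models trained in any of those frameworks a portable format that an inference runtime can load. As a rough, hypothetical illustration of that hand-off (the model and file name here are made up, and Qualcomm has not detailed its own toolchain), a PyTorch model could be exported like this:

```python
import torch
import torch.nn as nn

# Hypothetical trained model; in practice this would come out of a real training run.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

dummy_input = torch.randn(1, 4)  # example input used to trace the model's graph
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",                # portable file a downstream inference runtime can load
    input_names=["features"],
    output_names=["scores"],
)
```

The exported file is framework-neutral, which is why accelerator vendors tend to list ONNX alongside native framework support.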
Qualcomm said it will begin testing the chip in the second half of 2019 with cloud providers such as Microsoft Azure, and it hopes to begin shipping in 2020.
The AI Chip Race
The AI chip market is becoming increasingly crowded.
Intel said in January that it is working with Facebook to develop AI chips, also built for AI inference workloads. The company is planning to start production of these new processors in the second half of the year.
Nvidia has been developing a new generation of its T4 data center chips that support video processing, increased graphics performance, and, you guessed it, machine learning inference. This is all part of its broader push to bring AI into the data center.
Huawei announced in January a new data center switch that uses its AI chip to boost network performance. The company first unveiled its work on the Ascend 910 and low-power Ascend 310 AI processors, 7nm chips based on its own AI architecture, at Huawei Connect in October.
Even cloud providers are getting into the AI mix. Chinese internet, search and cloud provider Baidu debuted its Kunlun AI chip last July, to be used for both training and inference.
Google, as part of its end-to-end AI stack, first announced its AI-focused tensor processing units (TPUs) in 2016 and has since launched two updated versions. It has also partnered with several chip vendors, including NXP Semiconductors, Arm, Harting, Hitachi Vantara, Nexcom, and Nokia, to bring these TPU kits to market.
Amazon Web Services (AWS) also debuted its Inferentia machine learning chip for inference workloads last November.