Processing AI on a device provides essential advantages for privacy, performance, customisation, cost, and energy.
In the mid-1990s, the World Wide Web ushered in the era of large, remote data centre computing, now known as the cloud. This shift paved the way for improvements in scientific modelling, design and simulation, research, and other fields, and it made possible the world’s current infatuation with generative artificial intelligence (AI).
As discussed in our previous OnQ piece, “Hybrid AI trends,” these developments are accompanied by rising data centre capital and operating costs, which are prohibitive and are increasingly creating both a need and an opportunity to offload some workloads to edge devices such as tablets, smartphones, personal computers (PCs), cars, and extended reality (XR) headsets. However, the advantages of moving workloads to these devices go far beyond cost savings for data centres.
For us, on-device AI is nothing new. For more than ten years, Qualcomm Technologies has been conducting research and partnering with clients, including original equipment manufacturers and application developers, to improve the user experience using AI. Today, AI is commonly used in a wide range of on-device applications, including radio frequency signal processing, battery management, audio processing, computational photography, video enhancement, and many more.
Extending on-device AI support to generative AI, using optimised and/or specialised neural network models, can further improve the user experience by increasing privacy and security, performance, and personalisation while lowering costs and energy consumption.
1. AI security and privacy
Data tracking, data manipulation, and data theft are all made more likely by the transport, storage, and use of data across numerous platforms and cloud services.
Because user queries and personal data are kept entirely on the device, on-device AI inherently helps preserve users’ privacy. This is crucial for protecting consumer data and adds an extra layer of security for sensitive applications used in government, business, healthcare, and other fields.
For instance, the device could run a programming assistant app that generates code without disclosing private information to the cloud.
2. AI performance
Processing speed and application latency are two ways to gauge AI performance. With each new technology generation, the on-device processing speed of mobile devices has improved by double digits, and this trend is expected to continue. This will eventually enable the adoption of larger generative AI models, especially as they become more optimised.
Application latency is also crucial for generative AI. Although consumers may be willing to wait while a report is generated, a commercial chatbot must respond in near real time to provide a good user experience. Processing generative AI models locally, rather than relying on cloud servers or congested networks, can improve reliability and make query execution more flexible.
3. Personalisation and AI
In addition to greater privacy, on-device generative AI will offer consumers improved personalisation. It will make it possible to tailor models and responses to the user’s particular speech patterns, expressions, reactions, usage patterns, environment, and even external data, such as data from a fitness tracker or medical device, for complete contextual awareness. With this capability, generative AI may gradually develop a distinctive digital persona (or personas) for each user. A group, organisation, or business can use the same technique to develop coordinated responses.
4. AI cost
Cloud providers are starting to charge customers for formerly free services as they struggle with the equipment and operating costs of running generative AI models. These fees will probably keep rising, either to keep up with inflation or until more cost-effective business models can be established. In addition to lowering costs for consumers, running generative AI on a device can also reduce costs for cloud service providers and networking service providers, freeing up valuable resources for other high-value and high-priority operations.
5. Energy and AI
Closely tied to cost is the amount of electricity needed to run generative AI models locally versus in the cloud. Inference processing for large generative AI models may require many AI accelerators, such as graphics processing units (GPUs) or tensor processing units (TPUs), and possibly multiple servers. Jim McGregor, principal analyst at TIRIAS Research, estimates that the idle power consumption of a single fully populated AI-accelerated server can be close to one kilowatt, while peak power consumption can reach several kilowatts. This figure is multiplied by the number of servers needed to run generative AI models and by the rate at which models are being run, which, as noted earlier, is growing exponentially. Added to this is the electricity needed to move data through complex networks to and from the cloud. As a result, electricity consumption is likewise increasing exponentially.
Compared with the cloud, edge devices with efficient AI processing offer higher performance per watt. When both processing and data transport are factored in, edge devices may run generative AI models at a tenth of the energy cost. This difference has a substantial impact on energy costs and helps cloud providers offload data centre energy consumption to meet their sustainability and environmental goals.
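The energy arithmetic in this section can be sketched in a few lines of Python. This is only a rough illustration: the function names, the server count, the average power draw, the running time, and the network energy figure are all hypothetical values chosen for the example, not measurements. The one-tenth edge fraction is the approximate figure mentioned above and is treated here as an adjustable assumption.

```python
def cloud_energy_kwh(servers, avg_power_kw, hours, network_kwh=0.0):
    """Estimate cloud energy: per-server draw x server count x time, plus data transport."""
    if min(servers, avg_power_kw, hours, network_kwh) < 0:
        raise ValueError("all inputs must be non-negative")
    return servers * avg_power_kw * hours + network_kwh


def edge_energy_kwh(cloud_kwh, edge_fraction=0.1):
    """Estimate edge energy as a fraction of the cloud figure (assumed ratio, default one tenth)."""
    if not 0 <= edge_fraction <= 1:
        raise ValueError("edge_fraction must be between 0 and 1")
    return cloud_kwh * edge_fraction


# Hypothetical deployment: 100 servers averaging 2 kW each for 24 hours,
# plus an assumed 200 kWh for moving data to and from the cloud.
cloud = cloud_energy_kwh(servers=100, avg_power_kw=2.0, hours=24, network_kwh=200.0)
edge = edge_energy_kwh(cloud)
print(f"cloud: {cloud:.0f} kWh, edge: {edge:.0f} kWh")
```

Changing the number of servers or the hours of operation shows how quickly the cloud figure scales, which is the point the paragraph above makes about growing model usage.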
Advancing the limits of technology
The development of mobile technology pushed the limits of application, image, video, and sensor processing efficiency and made it possible to use a variety of user interfaces. Generative AI will push the capabilities of on-device processing further and improve the personal computing experience. In addition to working with partners to bring generative AI on-device through an open ecosystem, Qualcomm Technologies is striving to improve the performance of next-generation platforms for smartphones, PCs, cars, and the internet of things. Watch for more information in upcoming AI on the Edge OnQ postings.