Apple researchers working on on-device AI model that can understand contextual cues


Apple researchers have published a new paper on an artificial intelligence (AI) model that they claim can understand contextual language. The paper, which has not yet been peer-reviewed, also notes that the large language model (LLM) can run entirely on-device without consuming too much computational power. As described, the AI model appears well suited to the role of a smartphone assistant, and it could be used to upgrade Siri, the tech giant's native voice assistant. Last month, Apple published another paper about a multimodal AI model called MM1.

The research paper is currently in the pre-print stage and has been published on arXiv, an open-access online repository of scholarly papers. The AI model is named ReALM, short for Reference Resolution As Language Modeling. The paper highlights that the model's primary focus is to perform and complete tasks prompted using contextual language, which is closer to the way humans naturally speak. For example, the paper claims, it would understand when a user says, “Take me to the second position from the bottom”.

ReALM is designed to carry out tasks on smart devices. These tasks are divided into three categories: on-screen entities, conversational entities, and background entities. Based on the examples shared in the paper, on-screen entities refer to items visible on the device's screen, conversational entities are those relevant to the conversation with the user, and background entities refer to processes running in the background, such as a song playing in an app.
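
To make the three categories and the "second from the bottom" request more concrete, here is a minimal toy sketch in Python. It is not the paper's method: ReALM is described as a language model, whereas this sketch simply sorts items by a made-up vertical position. The names `Category`, `Item`, `top` and `nth_from_bottom`, as well as the sample data, are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    # The three categories named in the article.
    ON_SCREEN = "on_screen"
    CONVERSATIONAL = "conversational"
    BACKGROUND = "background"


@dataclass
class Item:
    category: Category
    label: str
    # Hypothetical distance from the top of the screen;
    # only meaningful for on-screen items.
    top: float = 0.0


def nth_from_bottom(items, n):
    """Return the n-th on-screen item counting up from the bottom (n=1 is the lowest)."""
    on_screen = [i for i in items if i.category is Category.ON_SCREEN]
    if not 1 <= n <= len(on_screen):
        raise ValueError(f"no on-screen item at position {n} from the bottom")
    # A larger `top` value means the item sits lower on the screen.
    ordered = sorted(on_screen, key=lambda i: i.top, reverse=True)
    return ordered[n - 1]


# Made-up sample data: three listings on screen, plus one background
# and one conversational item that the ordinal request should ignore.
candidates = [
    Item(Category.ON_SCREEN, "Pharmacy A", top=100),
    Item(Category.ON_SCREEN, "Pharmacy B", top=200),
    Item(Category.ON_SCREEN, "Pharmacy C", top=300),
    Item(Category.BACKGROUND, "Song playing in a music app"),
    Item(Category.CONVERSATIONAL, "Address mentioned earlier in the chat"),
]

print(nth_from_bottom(candidates, 2).label)  # prints "Pharmacy B"
```

A rule like this only handles one fixed phrasing. The appeal of a language-model approach, as the paper frames it, is handling the many loose ways people actually refer to things.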

What is interesting about this AI model is that, according to the paper, despite the complex task of understanding, processing and executing actions suggested through contextual cues, it does not require a large amount of computing power, “making ReALM an ideal choice for a practical reference resolution system that can exist on a device without compromising performance.” It achieves this by using significantly fewer parameters than leading LLMs such as GPT-3.5 and GPT-4.

The paper also claims that despite working in such a restricted environment, the AI model performed “substantially” better than OpenAI's GPT-3.5 and GPT-4. The paper further elaborates that while the model scored better than GPT-3.5 on text-only benchmarks, it outperformed GPT-4 on domain-specific user utterances.

Although the paper is promising, it has not yet been peer-reviewed, and thus its validity remains uncertain. But if the paper receives positive reviews, it could inspire Apple to develop the model commercially and even use it to make Siri smarter.
