No one should be surprised by Apple’s efforts to add generative AI to its iPhone.

No one should be surprised by Apple's efforts to add generative AI to its iDevices. Still, Cupertino already uses the technology, and mobile hardware limitations indicate that it won't be a significant iOS feature anytime soon.

Unlike many businesses, Apple has not joined the recent wave of generative AI boosterism and has largely avoided the terms “AI” and “Artificial Intelligence” in its recent keynote speeches. However, machine learning has long been important to Apple, and it still is, mostly working in the background to subtly enhance the user experience.

One illustration of this background technology is Apple's use of AI to handle images. When iThings capture photos, machine learning algorithms identify and tag subjects, run optical character recognition, and add links.

That kind of invisible AI won't cut it in 2024. Apple's competitors are marketing generative AI as a necessary feature for all devices and applications. Recent Financial Times reports claim that Apple has been quietly purchasing AI companies and developing its own large language models to ensure it can deliver.

Apple's superior hardware

Apple's homebrew silicon uses neural processing units (NPUs) to manage its current AI implementations. Since the release of the A11 system-on-chip in 2017, Apple has used the accelerators, which it refers to as “Neural Engines,” to manage smaller machine learning workloads and free up CPU and GPU resources for other tasks.

Apple's NPUs are particularly powerful. The A17 Pro in the iPhone 15 Pro can push 35 TOPS, double that of its predecessor and roughly twice that of some of the NPUs Intel and AMD offer for PCs.

In terms of NPU performance, Qualcomm's most recent Snapdragon chips are on par with Apple's. Like Apple, Qualcomm has years of NPU experience in mobile devices. AMD and Intel are relatively new to the field.

Apple hasn't shared floating point or integer performance figures for the chip's GPU, but it has touted the GPU's prowess in running games like Resident Evil 4 Remake and Assassin's Creed Mirage. This suggests that computational power is not the limiting factor for running larger AI models on the platform.

Further supporting this is the fact that Apple's M-series silicon, used in its Mac and iPad lines, has proven particularly potent for running AI inference workloads. In our testing, given adequate memory (we ran into trouble with less than 16GB), a now three-year-old M1 MacBook Air was more than capable of running Llama 2 7B at 8-bit precision and was even snappier with a 4-bit quantized version of the model. By the way, if you want to try this on your M1 Mac, Ollama.ai makes running Llama 2 a breeze.

Where Apple may be forced to make hardware concessions is with memory.

Generally speaking, AI models need about a gigabyte of memory for every billion parameters when running at 8-bit precision. This can be halved by dropping to lower precision, something like Int-4, and reduced further by developing smaller, quantized models.

Llama 2 7B has become a standard reference point for AI PCs and smartphones due to its relatively small footprint and modest computation requirements when running small batch sizes. Using 4-bit quantization, the model's memory requirements can be cut to 3.5GB.
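The rule of thumb above is simple enough to check with a few lines of Python. This is a rough sketch, not a measurement: the function name `weight_memory_gb` is made up for illustration, a gigabyte is taken as one billion bytes, and only the model weights are counted. Real-world usage will be higher once the runtime, activations, and context cache are included.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory needed for model weights alone, in GB.

    Assumes 1 GB = 10**9 bytes and ignores runtime overhead,
    activations, and the context (KV) cache.
    """
    bytes_per_weight = bits_per_weight / 8
    # billions of parameters * bytes per parameter = billions of bytes = GB
    return params_billions * bytes_per_weight


if __name__ == "__main__":
    # Llama 2 7B at the two precisions discussed in the text
    for bits in (8, 4):
        print(f"Llama 2 7B at {bits}-bit: ~{weight_memory_gb(7, bits):.1f} GB")
```

At 8-bit precision the estimate works out to about 7GB for a 7-billion-parameter model, and at 4-bit to about 3.5GB, matching the figures quoted above.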

But even with 8GB of RAM on the iPhone 15 Pro, we suspect Apple's next-generation phones may need more memory, or the models will need to be smaller and more targeted. This is likely one of the reasons that Apple is opting to develop its own models rather than co-opting models like Stable Diffusion or Llama 2 to run at Int-4, as we've seen from Qualcomm.

There's also some evidence to suggest that Apple may have found a way around the memory problem. As spotted by the Financial Times, back in December, Apple researchers published [PDF] a paper demonstrating the ability to run LLMs on-device using flash memory.

Expect a more conservative approach to AI

When Apple introduces AI functionality on its desktop and mobile platforms, we expect it to take a relatively conservative approach.

Turning Siri into something folks don't feel they have to speak to like a preschool child seems an obvious place to start. Doing that could mean giving an LLM the job of parsing input into a form that Siri can more easily understand so the bot can deliver better answers.

Siri could become less easily confused when a query is phrased in a roundabout way, resulting in more effective responses. This should have a couple of benefits. The first is that Apple should be able to get away with using a much smaller model than something.