We usually think of AI running on web-connected servers in the cloud somewhere, but what about AI running on a stand-alone device with no internet connection? There are more examples of this than you might realise. To start with, security cameras detect motion or objects locally without sending data to the cloud. Self-driving cars use specialised hardware to run several kinds of AI for driving decisions, and the same is true of some drones and industrial robots, many of which can operate without any network coverage if needed. Your smartphone may well be able to connect to the internet, but it handles tasks like photo enhancement, predictive text and even offline translation locally, using on-device AI.
This kind of disconnected use of AI is known as “edge AI”. It has become feasible because hardware has grown dramatically more capable in recent years, providing enough processing power to run an AI workload locally. In parallel, more efficient AI models have been developed that are designed specifically to run within limited memory and processing budgets.
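One technique commonly used to fit models into those budgets (not named in the text, but illustrative) is quantisation: storing weights as 8-bit integers instead of 32-bit floats, cutting memory use roughly fourfold at a small cost in precision. A minimal sketch of symmetric int8 quantisation:

```python
def quantise_int8(weights):
    """Map float weights onto int8 values using a single symmetric scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantise(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.82, -1.27, 0.05, 0.4]
q, scale = quantise_int8(weights)
restored = dequantise(q, scale)
# Every restored weight lies within one quantisation step of the original.
assert all(abs(a - b) <= scale for a, b in zip(weights, restored))
```

Real edge runtimes apply the same idea per layer or per channel, but the trade-off is identical: a quarter of the storage for a bounded loss of accuracy.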
Edge AI has an obvious latency advantage: data does not have to be sent to a server and an answer returned, because everything happens on the local device. It also allows devices to operate where network communication is limited or unreliable, such as in rural areas. Wearable devices like smartwatches and fitness trackers use AI to monitor heart rhythms and sleep patterns, and even to detect a person falling. Autonomous vehicles use edge AI to combine data from cameras, LIDAR and radar in real time to avoid collisions and detect traffic lanes. Retailers use AI for real-time queue monitoring, and consumer devices like smart appliances and home assistants can run voice recognition and noise filtering locally.
Devices where the data stays local are also useful where there are privacy concerns, such as with CCTV. MobileNet, a family of lightweight classifiers, is widely used in smart cameras and home security systems for object detection and facial recognition. YOLO (You Only Look Once) is a family of compact neural networks for live object detection on embedded hardware like NVIDIA Jetson, Google Coral and Hailo-8. Some conversational language models, such as Qwen2.5-VL-Instruct and Llama-3.1-8B-Instruct, can also run on edge devices, though this may require optimisation. Dedicated micro-models are used for keyword spotting, such as “Hey Siri” on Apple iPhones. In the case of military drones, local AI can be used for target identification even when their communication signals are being jammed by the enemy. Incidentally, the EU AI Act covers edge AI devices, and holding personal data locally may actually help AI vendors comply with these new privacy rules.
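To make the keyword-spotting idea concrete, here is a deliberately simplified sketch. A real micro-model is a tiny neural network run over a sliding window of audio features; in this hypothetical stand-in, the “model” is just a stored feature template matched by cosine similarity, with all frame values invented for illustration:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def spot_keyword(frames, template, threshold=0.95):
    # Slide the template across the incoming feature frames and record
    # each position where the window matches closely enough.
    n = len(template)
    flat_template = [v for frame in template for v in frame]
    hits = []
    for i in range(len(frames) - n + 1):
        window = [v for frame in frames[i:i + n] for v in frame]
        if cosine(window, flat_template) >= threshold:
            hits.append(i)
    return hits

# A two-frame "keyword" template matched against a four-frame stream.
template = [[1.0, 0.0], [0.0, 1.0]]
stream = [[0.1, 0.2], [1.0, 0.0], [0.0, 1.0], [0.3, 0.1]]
print(spot_keyword(stream, template))  # → [1]
```

The point of the sketch is the always-on, low-cost loop: the device runs a cheap check continuously and only wakes heavier processing when the keyword fires.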
There are still limitations. Hardware on local devices has become more powerful, but it is nowhere near the capacity of a data centre, which constrains the size of AI model that can run. Edge devices also use specialist hardware, and there is no universal standard that lets an AI model run on any device, making things trickier for developers. Crucially, these devices are, by definition, often disconnected entirely from the internet, so rolling out a new AI model, a security patch or a software update becomes a major logistical challenge.

Consider the complexity of updating something like a smart car. These vehicles may be driving along a road, so you cannot simply push out a change and wait for the software to install, as you might with an overnight phone update. Instead, car software is tested in virtual environments and then run in shadow mode on real cars, without being switched on, so that its decisions can be compared against the production software. Once testing is complete, a new release goes first to company cars, with engineers monitoring what happens, and then to a small subset of the fleet (perhaps 1% of the cars in operation). Engineers check carefully for unexpected issues, from changes in latency to something more serious, such as a rise in accident rates on the new version. If a problem occurs, the change is rolled back; otherwise it is rolled out to the main fleet, usually via a wireless update over cellular networks or Wi-Fi at charging stations, applied while the car is parked. Autonomous vehicles also run multiple models for separate tasks: planning models for route planning, perception models for spotting pedestrians and traffic lights, and further models for driver monitoring and vehicle motion.
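The staged rollout described above can be sketched as a simple sequence of gates. This is an illustrative sketch only: the function names, the 1% canary fraction and the health check are invented, standing in for whatever fleet-level metrics (latency, accident rates) a real operator would monitor:

```python
def staged_rollout(fleet_size, canary_fraction, metric_ok):
    """Walk a release through shadow, canary and fleet stages.

    metric_ok is a callable run after the canary stage, returning True
    if the monitored metrics (e.g. accident rate) have not regressed.
    """
    stages = ["shadow"]                # run alongside production, outputs compared
    canary = max(1, int(fleet_size * canary_fraction))
    stages.append(f"canary:{canary}")  # e.g. 1% of the cars in operation
    if not metric_ok():
        stages.append("rollback")      # a problem occurred: revert the release
        return stages
    stages.append("fleet")             # full wireless rollout while cars are parked
    return stages

# A healthy canary reaches the whole fleet...
print(staged_rollout(10_000, 0.01, lambda: True))   # → ['shadow', 'canary:100', 'fleet']
# ...while a regression triggers a rollback.
print(staged_rollout(10_000, 0.01, lambda: False))  # → ['shadow', 'canary:100', 'rollback']
```

The design choice worth noting is that each stage is a gate, not a schedule: promotion only happens when the previous stage's checks pass, and rollback is always an available exit.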
As hardware continues to get cheaper and more capable, more and more devices will be able to run software, including AI, from wearable fitness monitors to household appliances like fridges and smart home assistants. In the industrial world, there will be more robots, drones and driverless vehicles using AI to sense and interpret the world around them and to deal with unexpected situations. In many cases a hybrid approach will apply, with powerful models on centralised cloud servers interacting from time to time with edge devices. These edge devices are likely to become ever more ubiquitous, representing the cutting edge of AI.