Using AI to train at the edge is critical for IoT implementations, especially as they scale. But the sheer horsepower of the cloud means the two should go hand in hand.

It’s no surprise that the Internet of Things is growing rapidly. Each time an enterprise undertakes an IoT implementation, hundreds or perhaps thousands of new devices are added to the tally. And each time an implementation is successful, exponential growth is often seen as companies grasp the value that real-time data brings to their organizations. When research firms like IDC estimate there will be 41.6 billion connected IoT devices generating 79.4 zettabytes (ZB) of data in 2025, both the volume of devices and volume of data should be music to the industry’s ears.

“As the market continues to mature, IoT increasingly becomes the fabric enabling the exchange of information from “things,” people, and processes. Data becomes the common denominator — as it is captured, processed, and used from the nearest and farthest edges of the network to create value for industries, governments, and individuals’ lives,” says Carrie MacGillivray, group vice president, IoT, 5G, and Mobility at IDC. “Understanding the amount of data created from the myriad of connected devices allows organizations and vendors to build solutions that can scale in this accelerating data-driven IoT market.”

But the volume of devices and data is also something individual enterprises need to weigh carefully, because as devices and data increase, so does cost. To keep costs in line with results and achieve the ROI needed, companies must either sacrifice devices or data or vastly increase budgets. None of those options is satisfactory.

The path IoT data travels is usually quite simple. It’s captured by the sensors on a device and sent through a gateway to the cloud, where it is processed and the appropriate action is taken. That action may be as simple as filing the data in the appropriate place, or it may mean sending an alert if something is wrong. Most IoT data is simply stored.

Most organizations use the cloud for this process end to end simply because there aren’t better options. But there are issues with using the cloud. For example, all data is sent to the cloud even when not all of it needs to be. And as noted earlier, as data volume rises, so do costs and security risks:

  • The cellular network used by most companies for their IoT implementations charges for every byte of data that is used. Those bytes add up, leading to some unexpected costs as usage grows.
  • The cost of using the cloud to store the data increases as data usage grows as well.
  • Security becomes a bigger issue because each piece of data now takes multiple hops. More hops mean more vulnerability.
  • Because of the multiple hops between cloud and edge devices, the workflow is not optimized, making it more difficult to process data in real time.

Device Training

Training of devices (ensuring they perform the task the company needs) is also usually done in the cloud through AI and machine learning processes. However, there are downsides to this as well: each task is different and each environment is different, so a single model trained centrally may not fit every device.

Training at the edge using an AI model is preferred, but oftentimes the tasks are simply too large for the resource-constrained IoT devices to handle. A new way of thinking about using edge AI is needed. Let’s look at some of the benefits of training at the edge using AI and machine learning.

Cost savings. IoT devices and sensors can gather data points quickly. Roughly 30 data points per second can be collected from each sensor to get the granularity that some applications require. Now factor in that a single machine can have 10 sensors, meaning every machine or device is collecting 300 data points per second. If all that data (plus the tens, hundreds, or thousands of data points collected from other assets within the ecosystem) is sent to the cloud for real-time processing, the cost of sending it over the cellular network adds up quickly. By processing data locally at the edge and sending to the cloud only the data that needs further processing there, cost savings can be realized.
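To make the arithmetic above concrete, here is a rough back-of-the-envelope sketch. The 30 data points per second and 10 sensors per machine come from the text; the 16-byte payload per data point is purely an assumption for illustration, since real payload sizes depend on the protocol and encoding in use.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical payload size; the article does not state one.
ASSUMED_BYTES_PER_POINT = 16


def points_per_second(sensors_per_machine: int, points_per_sensor: int) -> int:
    """Data points one machine produces each second."""
    return sensors_per_machine * points_per_sensor


def bytes_per_day(rate: int, bytes_per_point: int) -> int:
    """Raw bytes one machine would upload per day if nothing is filtered."""
    return rate * SECONDS_PER_DAY * bytes_per_point


rate = points_per_second(sensors_per_machine=10, points_per_sensor=30)
daily_points = rate * SECONDS_PER_DAY
daily_bytes = bytes_per_day(rate, ASSUMED_BYTES_PER_POINT)

print(f"{rate} points/s, {daily_points:,} points/day, "
      f"{daily_bytes / 1e6:.2f} MB/day per machine")
```

Under that assumed payload size, a single machine sending everything would upload roughly 415 MB per day before any protocol overhead, which is the volume that local processing is meant to cut down.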

One example of this in action might be monitoring vibrations in an electric motor and setting minimum and maximum values that determine when devices need to collect data. Vibration readings can be taken every second, every minute, every hour, or daily. If vibrations stay consistent, readings can be taken once every minute instead of once every second. If the vibrations become volatile, the interval can be shortened so readings are taken more frequently, and alerts can be sent to the appropriate parties. By sending data less frequently, and by selecting what data is sent, cost savings are realized.
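The vibration example above could be sketched along the following lines. This is a minimal illustration, not a production scheduler: the one-second and one-minute intervals come from the text, while the volatility threshold and the use of standard deviation as the volatility measure are assumptions chosen for the example.

```python
from statistics import pstdev

FAST_INTERVAL_S = 1    # read once every second (from the text)
SLOW_INTERVAL_S = 60   # read once every minute (from the text)

# Hypothetical threshold; a real value depends on the motor and sensor units.
VOLATILITY_LIMIT = 0.5


def next_interval(recent_readings, limit=VOLATILITY_LIMIT):
    """Return (seconds until next reading, whether to send an alert)."""
    if len(recent_readings) < 2:
        # Not enough history to judge stability, so keep sampling quickly.
        return FAST_INTERVAL_S, False
    volatile = pstdev(recent_readings) > limit
    if volatile:
        return FAST_INTERVAL_S, True
    return SLOW_INTERVAL_S, False


steady = [2.0, 2.1, 1.9, 2.0, 2.05]
erratic = [2.0, 3.5, 0.8, 4.1, 1.2]
print(next_interval(steady))   # slow interval, no alert
print(next_interval(erratic))  # fast interval, alert
```

The design choice here is that the device itself decides how often to sample, so the cellular link only carries frequent readings while the motor is behaving unusually.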

Another example of cost savings involves training AI models. Typically, this is done in a cloud environment by importing a large amount of historical data. That data is not always available, so organizations tend to send raw data to a cloud environment to create the AI/ML model. The accuracy of the model is directly tied to the quality and amount of data it receives, so organizations will want to send as much as possible, which quickly becomes expensive. Some organizations are now creating AI models directly at the edge instead, which allows each asset to have a unique model. The process is simple. Sensors send data to a computing hub that is deployed at each asset. This computing hub can be a low-cost (< $5) microprocessor. Data is consumed, aggregated, and piped into the AI training module. The length of training typically depends on how frequently the asset performs a normal cycle, but, on average, it is a few days. Once training is complete, the model switches into production, where it begins looking for signs of failure and abnormal behavior. When such behavior is detected, alerts and API calls can trigger action.
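The train-then-produce lifecycle described above can be illustrated with a deliberately simple stand-in. The sketch below is not the AI training module the text refers to; it assumes a basic statistical baseline (a running mean and standard deviation, updated with Welford's method so no raw history is stored) and a hypothetical z-score threshold. The class name, the sample-count training window, and the threshold are all assumptions made for illustration.

```python
import math


class EdgeAnomalyModel:
    """Learns a per-asset baseline, then flags readings far from it."""

    def __init__(self, training_samples: int, z_threshold: float = 3.0):
        if training_samples < 2:
            raise ValueError("need at least two samples to estimate spread")
        self.training_samples = training_samples  # stands in for "a few days"
        self.z_threshold = z_threshold            # hypothetical cutoff
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    @property
    def in_production(self) -> bool:
        return self.count >= self.training_samples

    def observe(self, value: float) -> bool:
        """Return True if the reading is anomalous (never during training)."""
        if not self.in_production:
            # Training phase: update running statistics in constant memory.
            self.count += 1
            delta = value - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (value - self.mean)
            return False
        # Production phase: compare against the learned baseline.
        std = math.sqrt(self.m2 / (self.count - 1))
        if std == 0.0:
            return value != self.mean
        return abs(value - self.mean) / std > self.z_threshold


model = EdgeAnomalyModel(training_samples=5)
for reading in [10.0, 10.2, 9.8, 10.1, 9.9]:
    model.observe(reading)

print(model.in_production)   # training window is complete
print(model.observe(10.05))  # close to the baseline
print(model.observe(14.0))   # far from the baseline
```

Because the running statistics need only three numbers per sensor, a baseline of this kind fits comfortably on a very small device, and each asset ends up with its own model. A True result is where an alert or API call would be triggered.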

Security. As discussed above, the more that data stays local to the device, and the fewer hops it takes, the more secure the data is because there are fewer points of penetration. Most organizations want to keep data as local as possible. When training AI models, this becomes an issue unless you are able to train directly on the edge. Many vendors today provide endpoint security, firewalls, and other ways to build a protective wall around the device. There are two priorities for security: keep out people who shouldn’t have access and, if a breach occurs, ensure that anomalies are detected and the breach is stopped quickly. The ability to spot anomalies quickly (oftentimes in real time) by immediately detecting a change in device behavior, without having to go back to the cloud for processing, is inherent to training at the edge.

IT and OT cybersecurity are converging. Vendors need to start viewing OT/IT convergence as the ability to monitor a physical asset’s performance alongside abnormal behavior within the asset’s IT components.

Performance. Performance is critical with IoT applications, especially industrial ones. Latency can become a factor as data is moved to the cloud and alerts come back, and that is unacceptable in critical environments such as oil & gas, energy, and manufacturing. For a key piece of equipment, processing data at the edge lets the device detect a sign of failure and send a real-time alert, which could be a simple message to take action or an API call to shut down the machine.

Training at the edge instead of in the cloud removes much of that latency. An additional performance benefit of training at the edge is that different devices can be trained for different environments across an organization’s entire implementation. For example, a company may have machines both indoors and outdoors in a warm, humid climate. The devices outdoors are going to experience different conditions than those inside an air-conditioned warehouse. Training an AI model in the actual environment, rather than training everything identically in the cloud and passing down directives from there, naturally improves performance.

But part of performance does involve the cloud as well. If an organization were to train a model in a cloud environment and push it down, all models would be identical, as discussed above. A snapshot of all the models of the devices in use can, however, be taken at the edge and sent to the cloud for further analysis. This allows the organization to see the outliers, for example, which models are older or in poor condition. This horizontal analysis allows the organization to see what additional actions may need to be taken, both over the short and long term.
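One way to picture this horizontal analysis is sketched below. It assumes, for illustration only, that each device's snapshot boils down to a single learned baseline number (for example, a typical vibration level) and that the cloud flags devices whose baseline sits far from the rest of the fleet. The outlier rule used here, a modified z-score built on the median and median absolute deviation with the conventional 0.6745 scale factor and 3.5 cutoff, is a common statistical choice rather than anything the text prescribes, and the device names are hypothetical.

```python
from statistics import median


def find_outliers(snapshot, cutoff=3.5):
    """Return sorted device ids whose baseline is far from the fleet median.

    snapshot maps a device id to the baseline value its edge model learned.
    """
    if not snapshot:
        return []
    values = list(snapshot.values())
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # Most of the fleet agrees exactly; anything different stands out.
        return sorted(d for d, v in snapshot.items() if v != center)
    return sorted(
        d for d, v in snapshot.items()
        if 0.6745 * abs(v - center) / mad > cutoff
    )


# Hypothetical fleet snapshot: one learned baseline per device.
fleet = {
    "arm-01": 2.0,
    "arm-02": 2.1,
    "arm-03": 1.9,
    "arm-04": 2.05,
    "arm-05": 4.8,
}
print(find_outliers(fleet))
```

Only one small number per device crosses the network here, so the cloud can compare the whole fleet without ever receiving the raw sensor streams.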

Training in Action

Two examples best show the power of training on the edge:

  • In an automotive manufacturing environment, there might be 100 robotic arms as part of a single line, with a dozen or more sensors on each. Lots of data points are being gathered from each arm. With even 10 sensors on each of 100 arms, all sending 10 to 30 data points per second, that is 10,000 to 30,000 data points per second for a single line. Across one facility, there may be 1,000 or more arms doing the same thing, in tandem with fixtures holding parts. Getting that much data into a cloud environment for training is almost impossible. It must be done locally.
  • In a simple smart city deployment, 10,000 smart lightbulbs may be deployed in an area. The IoT application monitors energy usage, with each light sending data every second, and looks for anomalies such as energy levels rising by a standard deviation. Being able to train and run on the edge is critical for deployments of this size.

Using AI to train at the edge is a critical part of IoT implementations, especially as they scale. But the sheer horsepower of the cloud will always be an integral part of the IoT, and training at the edge and in the cloud should go hand in hand. However, processing all incoming data and storing the information gathered can quickly get expensive. By taking only snapshots of the minimum and maximum values and processing and storing those in the cloud, significant savings can be realized. Security and performance also make training at the edge a clear option for organizations as they continue to grow their IoT implementations.
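As a closing illustration of the minimum and maximum snapshots mentioned above, the sketch below collapses each window of raw readings into a single (min, max) pair before anything is sent upstream. The function name and the window size are assumptions for the example; the right window depends on how much detail the cloud analysis needs.

```python
def minmax_snapshots(readings, window_size):
    """Collapse each window of raw readings into one (min, max) pair."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    snapshots = []
    for start in range(0, len(readings), window_size):
        window = readings[start:start + window_size]
        snapshots.append((min(window), max(window)))
    return snapshots


raw = [3, 5, 4, 9, 2, 7, 6]
print(minmax_snapshots(raw, window_size=3))
```

With per-second readings and a 60-reading window, for instance, 60 raw values shrink to 2 per minute, which is the kind of reduction that keeps transmission and storage costs in check.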