The tech industry is aflutter at the thought of applying machine learning (ML) at the edge to streams of city and industrial data and to new IoT devices. Edge-based smarts enable quick, local decision making, cut the cost of data transport and storage, and ease concerns over data custody, confidentiality, and privacy.
As pioneers of machine learning, major cloud vendors have developed powerful services that enable developers to deliver ML-derived insights more easily. They also see an opportunity at the enterprise edge, but enterprise data is generally not sent to the cloud; it is often stored at the edge and then forgotten.
In the race to make their cloud-hosted big data and ML services useful to customers with lots of edge data, major cloud vendors have developed a data science analog of DevOps — “data science ops” — in which models are trained and updated using historical data in the cloud, then pushed to edge devices for local inference.
The main problem with this approach is that it is entirely hypothetical. Whereas moving existing enterprise workloads to the cloud makes sense, and DevOps is a no-brainer for cloud-native apps, there is no corresponding workflow for the enterprise data scientist. Cloud vendors don’t have much of a role outside IT, and operations teams just want to get richer insights, sooner.
In my view, we are on the verge of an exciting set of innovations that will help to address the challenge of bringing learning to the edge, avoiding the need for “data science ops.” Breakthroughs in unsupervised learning will enable digital twins of edge devices to self-train from fast data streams at the edge. Rather than saving vast amounts of data for later processing, developers will be able to link to systems they care about to gain key analytical insights and to predict future performance.
To achieve this goal of digital twins that learn at the edge, we need to recognize the limitations of a cloud-first approach to application architecture and focus research on a different paradigm for learning and prediction. These challenges can be addressed as two parallel efforts focused on:
- The use of a stateful, reactive paradigm for edge computing applications, and
- Breakthroughs in the efficiency of in-stream training to permit accurate analysis and predictions.
To address the first, it’s important to recognize the limitations of the stateless REST architecture behind most of today’s web-centric applications. REST-based applications gain a great deal by scaling out stateless services that rely on a database to manage state and synchronization, but the penalties are high: processing an event requires a database read, computation, and a database write. The latency of each event is therefore dominated by database round-trip times, which may run to hundreds of milliseconds, wasting billions of cycles on the CPU that processes the event.
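As a rough back-of-envelope check: a core running at about 3 GHz executes roughly 300 million cycles in 100 ms, so a few hundred milliseconds of database latency per event can indeed idle on the order of a billion cycles on the core waiting for the round trip to complete.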
In reactive, stateful architectures such as Akka, Erlang, Orleans, and Swim, each event is processed by a stateful actor: an active object whose code and state persist between events. Those billions of CPU cycles can instead be put to use performing analysis, training, and prediction on the fly, so tasks that demand substantial resources in the cloud can be accomplished trivially on modest devices at the edge. Rather than a database, the state of the system is the state of the actors, each of which statefully represents a row, and some number of columns, of the data and offers a real-time API for updates.
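To make the idea concrete, here is a minimal, framework-free sketch in Python. It is not the API of Akka, Erlang, Orleans, or Swim; the class and field names are illustrative. It simply shows an actor-like object that keeps its state in memory between events, so each event is processed without any database round trip.

```python
# Hypothetical sketch of a stateful actor ("digital twin") at the edge.
# State lives in the object itself and persists between events.

from collections import deque

class DeviceTwin:
    """Represents one edge device (a 'row') and a few of its attributes ('columns')."""

    def __init__(self, device_id):
        self.device_id = device_id       # identity of the entity this actor represents
        self.count = 0                   # number of events seen so far
        self.mean_temp = 0.0             # running mean of a sensor reading
        self.recent = deque(maxlen=10)   # small in-memory window of recent readings

    def on_event(self, temp):
        """Process one event entirely in memory; state carries over to the next event."""
        self.count += 1
        self.mean_temp += (temp - self.mean_temp) / self.count
        self.recent.append(temp)

    def snapshot(self):
        """A real-time view of the actor's state, computed on the fly."""
        return {"device": self.device_id, "events": self.count,
                "mean_temp": round(self.mean_temp, 2)}

# A real runtime would route each device's events to its own actor; here we
# just keep a dictionary of twins keyed by device id.
twins = {}
for device_id, temp in [("pump-1", 71.0), ("pump-2", 64.5), ("pump-1", 73.5)]:
    twin = twins.setdefault(device_id, DeviceTwin(device_id))
    twin.on_event(temp)

print([t.snapshot() for t in twins.values()])
```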
Addressing the second requires learning actively at the edge, from the data itself, rather than through a convoluted cloud-based training workflow. This relies on developing algorithms that learn the way each of us learns: make a hypothesis and compare the result with the real world. If the prediction is wrong, the error is used to adjust the model for next time; if it is right (or close enough), the model is trained.
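As a sketch of that predict-compare-correct loop, here is a tiny online learner in Python. It assumes nothing beyond a simple linear model updated with stochastic gradient descent on each incoming event; the names and the data stream are illustrative, not taken from any particular framework.

```python
# Hypothetical in-stream learner: make a hypothesis, compare it with reality,
# and use the error to adjust the model before the next event arrives.

def make_online_learner(lr=0.05):
    state = {"w": 0.0, "b": 0.0}

    def step(x, y_actual):
        # 1. Hypothesis: predict from the current model.
        y_pred = state["w"] * x + state["b"]
        # 2. Compare the prediction with what the real world actually did.
        error = y_pred - y_actual
        # 3. Correct: nudge the model so it does better next time.
        state["w"] -= lr * error * x
        state["b"] -= lr * error
        return y_pred, error

    return step

# Example: learn y ≈ 2x + 1 from a stream of observations, one event at a time.
step = make_online_learner(lr=0.05)
for x in [1, 2, 3, 4, 5] * 40:
    step(x, 2 * x + 1)

prediction, _ = step(6, 13)   # after enough events the prediction approaches 13
print(round(prediction, 2))
```

The only knob here is the learning rate, which hints at the kind of small, intuition-level hyper-parameter mentioned below.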
This type of black-box learning, coupled with the actor model above, enables us to deliver digital twins that learn. Of course I’ve oversimplified the problem, but a tangible goal is to create generic learning frameworks that require non-experts to set only a few hyper-parameters, ones that match their real-world intuition about the behavior of the environment.