Is REST the Answer for Industrial IoT?

The IIoT is often viewed as a client-server architecture in which the “things” are smart devices with embedded microcontrollers.  The devices generate data from their sensors and send that data to a server that is usually elsewhere on the Internet.  Similarly, a device can be controlled by retrieving data from the server and acting upon it, say to turn on an air conditioner.

The communication mechanism typically used for devices to communicate with servers over the Internet is REST (Representational State Transfer) over HTTP.  Every communication between the device and the server occurs as a distinct HTTP request.  When the device wants to send data to the server, it makes an HTTP POST call.  When it wants to get data (like a new thermostat setting), it makes an HTTP GET call.  Each HTTP call opens a distinct socket, performs the transaction, and then closes the socket.  The protocol is said to be “connectionless”.  Every transaction pays the full cost of socket set-up time and communication overhead.  Since there is no persistent connection, all transactions must take the form of “request/response”, where the device sends a request to the server and collects the response.  The server generally does not initiate a transaction with the device, as that would expose the device to attack from the Internet.
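
To make the pattern concrete, here is a minimal sketch of that request/response cycle in Python, using the widely used requests library.  The server URL, paths, and field names are hypothetical, chosen only to illustrate the shape of the exchange.

```python
import time
import requests

SERVER = "https://iot.example.com/api"   # hypothetical REST endpoint

def report_temperature(value):
    # Each call opens its own TCP connection, sends the HTTP headers and
    # payload, waits for the response, then closes the socket.
    requests.post(SERVER + "/telemetry", json={"temperature": value}, timeout=5)

def fetch_setpoint():
    # Getting data is a separate request/response cycle, always initiated
    # by the device; the server never pushes data on its own.
    resp = requests.get(SERVER + "/thermostat/setpoint", timeout=5)
    return resp.json()["setpoint"]

while True:
    report_temperature(21.5)     # device-to-server (placeholder reading)
    setpoint = fetch_setpoint()  # server-to-device, obtained only by polling
    time.sleep(10)               # the polling interval governs how stale data can be
```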

HTTP does define a keep-alive connection, where several HTTP transactions are sent on a single socket.  This definitely reduces the amount of time spent creating and destroying TCP connections, but does not change the basic request/response behaviour of the HTTP protocol.  The scalability problems and latency/bandwidth trade-offs described below still overwhelm any benefit gained from a keep-alive connection.
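
A keep-alive connection is easy to picture with the same hypothetical endpoint: a requests.Session reuses one TCP connection across several transactions, but every exchange is still a device-initiated request followed by a wait for the response.

```python
import requests

with requests.Session() as session:        # one TCP connection, reused (keep-alive)
    for value in (21.5, 21.6, 21.4):
        session.post("https://iot.example.com/api/telemetry",
                     json={"temperature": value}, timeout=5)
        # The device must still ask for new data; nothing arrives unsolicited.
        setpoint = session.get("https://iot.example.com/api/thermostat/setpoint",
                               timeout=5).json()["setpoint"]
```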

One of the identifying features of the IIoT is the data volume.  Even a simple industrial system contains thousands of data points.  REST APIs might be fine for a toaster, but at industrial scale they run into problems:

Bandwidth

REST messages typically pay the cost of socket set-up on every message or group of messages.  Then they send HTTP headers before transmitting the data payload.  Finally, they demand a response, which contains at least a few required headers of its own.  Writing a single number requires hundreds of bytes to be transmitted across multiple IP packets.
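
As a rough illustration, the sketch below compares the size of the useful payload to the bytes actually exchanged.  It assumes an unrealistically minimal header set; real traffic typically adds cookies, authentication tokens, and user-agent strings, plus TCP/IP and TLS overhead on top.

```python
payload = '{"temperature": 21.5}'
request = (
    "POST /api/telemetry HTTP/1.1\r\n"
    "Host: iot.example.com\r\n"
    "Content-Type: application/json\r\n"
    f"Content-Length: {len(payload)}\r\n"
    "Connection: close\r\n"
    "\r\n" + payload
)
response = (
    "HTTP/1.1 204 No Content\r\n"
    "Date: Mon, 01 Jan 2024 00:00:00 GMT\r\n"
    "Server: nginx\r\n"
    "Connection: close\r\n"
    "\r\n"
)
print(len(payload))                   # 21 bytes of useful data
print(len(request) + len(response))   # roughly 250 bytes on the wire, before
                                      # TCP/IP, TLS, and socket set-up costs
```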

Latency

Latency measures the amount of time that passes between an event occurring and the user receiving notification.  In a REST system, the latency is the sum of:

  • The client’s polling rate
  • Socket set-up time
  • Network transmission latency to send the request
  • Transmission overhead for HTTP headers
  • Transmission time for the request body
  • Network transmission latency to send the response
  • Socket take-down time

By comparison an efficient persistent connection measures latency as:

  • Network transmission latency to send the request
  • Transmission time for the request body
  • Network transmission time for an optional response body

The largest sources of latency in a REST system (polling rate, socket set-up, response delivery) are all eliminated in the persistent-connection model.  This allows it to achieve transmission latencies that are mere microseconds above the underlying network latency.
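
A back-of-the-envelope comparison makes the difference concrete.  The figures below are illustrative assumptions, not measurements; the point is simply that the polling wait and connection handling dominate everything else.

```python
# Illustrative figures (seconds), not measurements.
polling_wait = 0.500   # average wait with a 1-second polling rate
tcp_setup    = 0.030   # TCP (and possibly TLS) socket set-up
request_leg  = 0.020   # network latency, device to server
header_cost  = 0.005   # time to transmit the HTTP headers
body_cost    = 0.001   # time to transmit the payload itself
response_leg = 0.020   # network latency, server back to device
teardown     = 0.005   # closing the socket

rest_latency = (polling_wait + tcp_setup + request_leg + header_cost
                + body_cost + response_leg + teardown)
push_latency = request_leg + body_cost   # persistent connection, data pushed once

print(f"REST (polled): {rest_latency * 1000:.0f} ms")   # 581 ms
print(f"Persistent:    {push_latency * 1000:.0f} ms")   # 21 ms
```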

REST’s latency problems become even clearer in systems where two devices are communicating with one another through an IoT server.  Low-latency, event-driven systems can achieve practical data rates hundreds or thousands of times faster than REST.  REST was never designed for the kind of data transmission IIoT requires.

Scalability

One of the factors in scalability is the rate at which a server responds to transactions from the device.  In a REST system a device must constantly poll the server to retrieve new data.  If the device polls the server quickly then it causes many transactions to occur, most of which produce no new information.  If the device polls the server slowly, it may miss important data or will experience a significant time lag (latency) due to the polling rate.  As the number of devices increases, the server quickly becomes overloaded and the system must make a choice between the number of devices and the latency of transmission.
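
The two access patterns can be sketched side by side.  This assumes a hypothetical server that exposes both a REST endpoint and a WebSocket feed, using the requests and websockets libraries.

```python
import asyncio
import json
import time
import requests
import websockets

def handle(update):
    print("new data:", update)   # placeholder for acting on the data

def poll_for_changes(interval=1.0):
    # REST: the device asks over and over.  Most requests return nothing new,
    # yet each one still costs the server a full transaction; a shorter
    # interval lowers latency but multiplies the load.
    while True:
        resp = requests.get("https://iot.example.com/api/thermostat/setpoint",
                            timeout=5)
        handle(resp.json())
        time.sleep(interval)

async def subscribe_to_changes():
    # Event-driven: one persistent connection.  The server sends data only
    # when something actually changes, so an idle device costs almost nothing.
    async with websockets.connect("wss://iot.example.com/feed") as ws:
        async for message in ws:
            handle(json.loads(message))
```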

All systems have a maximum load.  The question is, how quickly does the system approach this maximum and what happens when the maximum is reached?  We have all seen suddenly-popular web sites become inaccessible due to overloading.  While those systems experienced unexpectedly high transaction volumes, a REST system in an IIoT setting will be exposed to that situation in the course of normal operation.  Web systems suffer in exactly the scenarios where IIoT is likely to be most useful.  Event-driven systems scale much more gradually, as adding clients does not necessarily add significant resource cost.  For example, we have been able to push REST systems to about 3,000 transactions per minute.  We have pushed event-driven systems to over 5,000,000 transactions per minute on the same hardware.

Symmetry

REST APIs generally assume that the data flow will be asymmetrical.  That is, the device will send a lot of data to the server, but retrieve data from the server infrequently.  In order to maintain reasonable efficiency, the device will typically transmit frequently, but poll the server infrequently.  This causes additional latency, as discussed earlier.  In some systems this might be a reasonable sacrifice, but in IIoT systems it usually is not.

For example, a good IIoT server should be capable of accepting, say, 10,000 data points per second from an industrial process and retransmitting that entire data set to another industrial process, simulator, or analytics system without introducing serious alterations to the data timing.  To do that, the server must be capable of transmitting data just as quickly as it receives it.  A good way to achieve this is through establishing persistent, bidirectional connections to the server.  This way, if the device or another client needs to receive 10,000 data changes per second the communication mechanism will support it.
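
A persistent, bidirectional connection treats the two directions identically.  The sketch below (again using the hypothetical WebSocket feed) sends readings and receives updates concurrently over one socket, so the inbound stream is never throttled by an outbound polling schedule.

```python
import asyncio
import json
import websockets

def read_sensor():
    return 21.5                       # placeholder for a real sensor read

def apply_update(update):
    print("control update:", update)  # placeholder for acting on the data

async def send_readings(ws):
    while True:
        await ws.send(json.dumps({"temperature": read_sensor()}))
        await asyncio.sleep(0.0001)   # nominally ~10,000 points per second

async def receive_updates(ws):
    async for message in ws:          # arrives whenever the server has data
        apply_update(json.loads(message))

async def run_device():
    async with websockets.connect("wss://iot.example.com/feed") as ws:
        # Both directions share the same connection and run concurrently.
        await asyncio.gather(send_readings(ws), receive_updates(ws))

# asyncio.run(run_device())
```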

Robustness

Industrial applications are often mission-critical to their owners.  This is one of the big issues holding back the IIoT.  What happens if the Internet connection goes down?

In the typical IoT scenario, the device is making REST calls to a server running in the cloud.  In some ways this is a by-product of the cloud vendor’s business model, and in some ways it is due to the REST implementation.  A REST server is typically a web server with custom URL handlers, tightly coupled to a proprietary server-side application and database.  If the Internet connection is lost, the device is cut off, even if the device is inside an industrial plant providing data to a local control system via the cloud.  If the cloud server is being used to issue controls to the device, then control becomes impossible, even locally.  This could be characterized as “catastrophic degradation” when the Internet connection is lost.

Ideally, the device should be able to make a connection to a local computer inside the plant, integrate directly with a control system using standard protocols like OPC and DDE, and also transmit its data to the cloud.  If the Internet connection is lost, the local network connection to the control system is still available.  The device is not completely cut off, and control can continue.  This is a “graceful degradation” when the Internet connection is lost.
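
One way to picture this is a device that publishes every reading to a local in-plant gateway first and treats the cloud as best-effort.  The endpoints below are hypothetical, and in a real plant the local leg would typically use an industrial protocol such as OPC rather than HTTP; plain HTTP keeps the sketch short.

```python
import requests

LOCAL_URL = "http://gateway.plant.local/api/telemetry"   # in-plant gateway
CLOUD_URL = "https://iot.example.com/api/telemetry"      # cloud service

def publish(reading):
    # The local control system is updated first and unconditionally, so
    # local monitoring and control survive a WAN outage.
    requests.post(LOCAL_URL, json=reading, timeout=1)

    # The cloud path is best-effort: losing the Internet connection degrades
    # the system gracefully instead of cutting the device off entirely.
    try:
        requests.post(CLOUD_URL, json=reading, timeout=2)
    except requests.RequestException:
        pass   # the reading could also be queued here and forwarded later

publish({"temperature": 21.5})
```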

In conclusion, REST systems work reasonably well in low-speed transactional systems.  However, they have a number of disadvantages when applied to high-speed, low-latency systems, and to systems where data transfer from the server to the device is frequent.  Industrial IoT systems are characterized by exactly these requirements, making REST an inappropriate communication model for IIoT.

– Andrew Thomas, Skkynet