Stratus Technologies ftServer® secures production process

When the IT system for Alunorf, the world’s largest aluminium rolling and casting plant, was discontinued a few years ago, the organization set out to find a replacement. The management and control of both its pit and pusher furnaces depended on that system. After considering a cluster system, Alunorf eventually chose a system built on fault-tolerant servers from Stratus Technologies.

Alunorf, situated in Neuss in Germany, is the world’s largest aluminium rolling and casting plant. Manufacturing 1.5 million tonnes of aluminium rolls every year, this expensive, busy and complex plant has to meet many serious health and safety requirements, so high availability topped the purchasing criteria for a replacement system.

Alunorf produces extremely heavy aluminium bars that are hot, around 500 degrees Celsius, when they come out of annealing in the pit and pusher furnaces and onto a hot strip mill. These bars are then rolled out into enormous strips almost 200 metres long. The aluminium is cooled to room temperature, processed further in the cold-rolling mills and then rolled to a thickness of only 0.2 mm, before being sold on for further processing, for example to the car or packaging industries.

All of Alunorf’s technical plant is situated in the hot-rolling mill area: cranes, milling machines, furnaces, rollers and cutters, all lined up precisely. Should one link in the processing chain fail, the whole production line would come to a standstill. Given the high investment in the plant and its full workload, a stoppage would be very costly, so the demands on the availability of the system are correspondingly high.

The pit and pusher furnaces have systems that control the plant, and these are overseen by an area control system supervising the entire process. The area control system follows the aluminium bars from delivery through the production process to the finished aluminium rolls. Feedback from the control systems in the plant means that area control knows at all times exactly where each aluminium product is in the production process. The data from the control systems in the plant is visualized through a separate system in the control room. This software presents the pit and pusher furnaces of the hot-rolling mill graphically and allows the operators to monitor details such as temperature or weight. The software also allows manual intervention in the production process, for example to stop the production line if there is a breakdown.

Because both the area control system and the visualization system manage and control the central operation, the servers have to be failsafe. Alunorf needed a high-availability server to host them.

 

“It isn’t a big problem if a system stalls for a few moments,” explains Markus Haastert, line manager of the pit and pusher furnaces in the hot-rolling mill at Alunorf in Neuss. “The plant continues to work. If there are short bursts of server downtime, the operators can manually input any missing information. But generally the servers, just like production, have to work daily around the clock. Any interruptions at night or at the weekend, when there are no IT personnel on site, must be fixed quickly and to a large extent automatically.”

 

Alunorf originally considered a cluster system, but it was too complex for the operations in the hot-rolling mill. Stratus’ fault-tolerant servers (ftServers) were not. The Stratus system allows continuous operation. All components (the CPU, RAM, IO unit and disks) are fully redundant: there is a duplicate of everything, so if one component fails, the system continues operating without interruption. Because of this, Stratus systems deliver industry-leading availability, guaranteeing 99.9999% uptime.
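To put the 99.9999% figure in perspective, the short sketch below converts an availability percentage into the downtime it permits over one year. The function name and the use of a 365-day year are illustrative assumptions, not part of any Stratus tooling.

```python
# Illustrative arithmetic only: the name and the 365-day year are assumptions.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds in a non-leap year


def max_downtime_seconds(availability_percent: float,
                         period_seconds: float = SECONDS_PER_YEAR) -> float:
    """Return the downtime (in seconds) permitted by a given availability."""
    return period_seconds * (1 - availability_percent / 100)


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.9999):
        print(f"{pct}% uptime allows {max_downtime_seconds(pct):.1f} s of downtime per year")
```

Under these assumptions, 99.9999% uptime corresponds to roughly 31.5 seconds of downtime per year, compared with about 8.8 hours at 99.9%.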

 

The Stratus server operates as a single machine, so user software doesn’t need to be customized. The administration of the server is very simple too. Furthermore, the fault-tolerant servers come with a comprehensive service and maintenance concept. The servers report any faulty components directly to the manufacturer through the Call Home function so that replacement parts are sent out without delay. The parts often reach the system user before the user has even noticed that a component has failed.

ftServer delivered on both its promises: high availability, and ease of implementation and use. It also proved to have no hidden costs:

“The software that we have installed in the systems controlling the plant and in the visualization system is extremely complex,” explains Haastert.

“Adapting this to work on a different high-availability system would have been very difficult and expensive. As well as this, it would have isolated us from any current developments to the programs.”

ftServers don’t just guarantee continuing operation in the event of a fault, though. They also allow the failed component to be replaced while the system is still running. The servers do not need to be shut down for maintenance work to be carried out, as Alunorf experienced:

“One system once reported a fault in an IO unit,” reports Haastert.

 

“We changed the part in question while the server was still in operation, with no more to do than loosening screws, removing the old component and putting the new one in. The bottom line was that the Stratus server didn’t go out of operation for a single minute, neither because of the preceding failure, nor during its replacement. We also didn’t need to restart the system