~ How synthetic data can help manufacturers train AI ~
Creating a fully trained artificial intelligence (AI) model from scratch can take days or weeks, depending on the complexity of the task. The large amount of data and time needed for AI to learn can be off-putting to some manufacturers. In this article, Neil Ballinger, head of EMEA at global automation parts supplier EU Automation, discusses how synthetic data can benefit AI training for manufacturers.
According to the 2019 Dotscience survey report, 64.4 per cent of respondents reported that AI and machine learning (ML) model creation takes between seven and 18 months to go from an idea to production. However, by using synthetic data, manufacturers can save themselves the time and effort of acquiring training data from real-world applications for their AI systems.
A study conducted at MIT in 2017 split data scientists into two groups: one using synthetic data and the other using real data. The study found that 70 per cent of the time, the synthetic data group produced results on par with the group using real data. This suggests synthetic data can be a practical alternative to other privacy-enhancing techniques, such as data masking and anonymisation.
Benefiting from the synthetic
With synthetic data offering such a useful alternative to collecting real-world data, it is perhaps not surprising that a study by Gartner estimates that by 2024, 60 per cent of all data used in AI development will be synthetic. Synthetic data offers manufacturers a cheaper and quicker way to train AI, which can help in applications such as training edge devices for predictive maintenance or remote monitoring.
Another benefit of synthetic data is that it allows manufacturers to simulate conditions they have yet to encounter. This is particularly useful in machine vision training, where the AI can be taught to recognise defects for which the manufacturer has no real-life examples. Training machine vision systems in this way can reduce training time significantly, as examples of perfect products and variations of defects need not be produced in real life.
Imagine how much easier it would be to artificially generate thousands of images of electrical circuit boards on an assembly line rather than having to capture those images in the real world, one at a time, to document every variation. Because synthetic images are generated with their labels already known, there is no need to label image data by hand, a process prone to human error that could end up costing manufacturers in the long term.
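To make the idea concrete, here is a minimal sketch in Python of how labelled synthetic images might be generated. It is an illustration under stated assumptions, not a production pipeline: the "board" is a simple grid of bright pads, the only defect is a missing pad, and the function names (make_clean_board, remove_random_pad, generate_dataset) are hypothetical. Real systems typically rely on far more realistic rendering, but the principle is the same: because the generator decides where the defect goes, the label and bounding box come for free.

```python
import numpy as np

SIZE = 64        # image width and height in pixels (assumed for illustration)
PAD = 6          # side length of each square pad
STEP = 14        # spacing between pads on the 4x4 grid
OFFSET = 8       # margin before the first pad
BACKGROUND = 0.2
PAD_VALUE = 0.9


def make_clean_board():
    """Draw a toy 'circuit board': dark background with a 4x4 grid of bright pads."""
    img = np.full((SIZE, SIZE), BACKGROUND, dtype=np.float32)
    for row in range(4):
        for col in range(4):
            y, x = OFFSET + row * STEP, OFFSET + col * STEP
            img[y:y + PAD, x:x + PAD] = PAD_VALUE
    return img


def remove_random_pad(img, rng):
    """Simulate a 'missing component' defect and return its bounding box (x, y, w, h)."""
    row, col = rng.integers(0, 4), rng.integers(0, 4)
    y, x = int(OFFSET + row * STEP), int(OFFSET + col * STEP)
    img[y:y + PAD, x:x + PAD] = BACKGROUND
    return (x, y, PAD, PAD)


def generate_dataset(n, defect_rate=0.5, seed=0):
    """Generate n images with labels (1 = defective, 0 = good) and defect boxes."""
    rng = np.random.default_rng(seed)
    images, labels, boxes = [], [], []
    for _ in range(n):
        img = make_clean_board()
        if rng.random() < defect_rate:
            box = remove_random_pad(img, rng)
            label = 1
        else:
            box, label = None, 0
        # Add mild noise so that no two images are identical
        img = img + rng.normal(0.0, 0.02, img.shape).astype(np.float32)
        images.append(np.clip(img, 0.0, 1.0))
        labels.append(label)
        boxes.append(box)
    return np.stack(images), np.array(labels), boxes


if __name__ == "__main__":
    images, labels, boxes = generate_dataset(1000)
    print(images.shape, int(labels.sum()), "defective examples")
```

Every image in this toy dataset is labelled automatically at the moment it is created, so no one has to inspect and annotate it by hand.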
For example, gaming software company Unity lost around 100 million US dollars after an ML model was corrupted by bad data, impacting the company’s ad business in May 2022. This is just one example of the cost of bad training data, which carefully generated synthetic data could help to avoid.
Lastly, synthetic data can also provide environmental benefits. By removing the need to create real-life examples for machine vision training, synthetic data can limit waste, because manufacturers do not have to produce defective products simply to teach a system what a defect looks like.
With synthetic data, human labelling errors can be reduced and the long process of gathering real-life data avoided, which makes it a strong candidate for the future of machine vision training in manufacturing. Manufacturers looking to invest in AI solutions should evaluate the benefits of synthetic data for saving valuable time and limiting waste.
To stay up to date on technology benefiting manufacturing, visit EU Automation’s Knowledge Hub.