In our recent whitepaper, Synthetic Data and AI: An in-depth dive into model training, we explored the effectiveness of using synthetic data to train a machine learning vision system to accurately read the indicator values on a pressure gauge. This approach proved extraordinarily successful, demonstrating that synthetic data significantly enhances the performance and accuracy of machine learning models, particularly when real-world labelled data is scarce or expensive to obtain.
Building on this solid foundation, we set out to explore the power of synthetic data beyond a single gauge model, extend our solution to generalise across gauge types, and prove that it works in the real world. We planned to generate highly varied data that reflects real-world diversity, enabling the training of more resilient and adaptable computer vision models.
To summarise the solution described in the whitepaper, our system consists of three main parts:
- A segmentation model that identifies the bounding boxes and segmentation masks for the gauge parts (case, face, dial, and scale labels) in an image.
- A key-points (pose) detection model that predicts the X, Y pixel locations of the dial tip, the gauge centre and the scale ticks for the minimum and maximum values on a cropped gauge image.
- An algorithm implemented in Python that computes the angles between min-centre-max and min-centre-dial. From the ratio of these two angles and the gauge's value interval, the gauge reading can be computed accurately (a minimal sketch of this calculation follows the list).
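As a minimal sketch of this angle heuristic, assuming the four key points are already available as (x, y) pixel coordinates in image space, the reading can be derived as follows. The function name, signature and angle conventions are our own illustration rather than the exact code from the whitepaper.

```python
import math

def read_gauge(center, tick_min, tick_max, needle_tip, value_min, value_max):
    """Estimate the gauge reading from four key points.

    All points are (x, y) pixel coordinates in image space (y grows downwards);
    value_min / value_max are the numbers printed at the min and max scale ticks.
    """
    def angle(p):
        # Angle of the vector centre -> p in degrees (image coordinates).
        return math.degrees(math.atan2(p[1] - center[1], p[0] - center[0]))

    def sweep(a_from, a_to):
        # Clockwise sweep from a_from to a_to, normalised to [0, 360).
        return (a_to - a_from) % 360.0

    # Angle covered by the whole scale (min tick -> max tick).
    scale_sweep = sweep(angle(tick_min), angle(tick_max))
    # Angle covered by the needle (min tick -> needle tip).
    needle_sweep = sweep(angle(tick_min), angle(needle_tip))

    # The reading is proportional to the fraction of the scale swept by the needle.
    fraction = needle_sweep / scale_sweep
    return value_min + fraction * (value_max - value_min)

# Example: a 0-10 gauge with the needle pointing straight up, halfway along the scale.
print(read_gauge(center=(100, 100), tick_min=(30, 170), tick_max=(170, 170),
                 needle_tip=(100, 30), value_min=0, value_max=10))  # -> 5.0
```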
These parts are assembled into an end-to-end solution as follows: First, the segmentation model identifies the probable gauge locations from the casing and face bounding boxes (shown in green). Then, the key-points model (shown in red) processes these locations to find the four key points (min, max, centre and tip – highlighted in blue). Finally, we choose the detection with the highest score and apply the angle-based heuristic to compute the indicated value (in green, top-right corner).
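To make this flow concrete, here is a hedged sketch of the inference pass. The `segmentation_model` and `pose_model` callables and their output fields are placeholders for whichever detector and pose estimator is used, and `read_gauge` is the angle heuristic sketched above.

```python
def read_gauges_in_frame(frame, segmentation_model, pose_model, value_min, value_max):
    """Hypothetical end-to-end pass: detect gauges, find key points, compute readings."""
    readings = []
    # 1. The segmentation model proposes gauge locations (case/face bounding boxes).
    for box in segmentation_model(frame):            # e.g. [x1, y1, x2, y2, score]
        x1, y1, x2, y2, _ = box
        crop = frame[int(y1):int(y2), int(x1):int(x2)]

        # 2. The pose model predicts the four key points on the cropped gauge.
        candidates = pose_model(crop)                # list of {"points": ..., "score": ...}
        best = max(candidates, key=lambda c: c["score"])
        pts = best["points"]                         # keys: "min", "max", "centre", "tip"

        # 3. The angle heuristic turns key points into a reading.
        value = read_gauge(pts["centre"], pts["min"], pts["max"], pts["tip"],
                           value_min, value_max)
        readings.append(value)
    return readings
```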
Sample of the results
Iterative process for a resilient gauge solution
In the real world, objects like pressure gauges rarely look the same. Gauges can vary widely based on what they are monitoring, the equipment they are part of, and their manufacturer. These variations can cause failures or inaccurate readings if not properly accounted for. Using real-world data to capture this variability presents challenges, such as obtaining the correct distributions to cover all possible conditions and edge cases.
To validate our solution in a realistic scenario, we decided to continue to use the Kaggle Pressure Gauge Reader Dataset from the whitepaper, but now evaluating our solution on the raw test videos. These seven videos have a variety of pressure gauges in different conditions: simple head-on perspective, complex backgrounds, quickly moving needle indicators, or extreme situations where the needle points outside the scale range.
To achieve the desired gauge reading performance, we followed an iterative, hypothesis-driven approach. Each iteration involved identifying potential issues, generating new data, training the models, and analysing the results. In each iteration we made the necessary changes to improve the performance of both the data generation and machine learning training pipelines. We repeated this cycle multiple times, allowing us to incrementally refine and enhance the system until it met the required performance standards.
Here is how we achieved our goal. Not all hypotheses proved correct and adjustments were necessary along the way; for clarity, the following list summarises the key steps and iterations in a simplified form:
1. Scale labels and min-max key points
During the initial evaluations, we noticed that the segmentation model struggled to correctly identify certain scale labels on some gauges, while the pose model had difficulties with the min and max ticks. One clue we discovered was that in several of the problematic cases there were negative scale labels. We hypothesised that we could address these issues by increasing diversity in the numerical interval and allowing more flexibility in the start and end positions by adjusting the arc length.
Value scale variation:
We broadened the range of numerical values shown on the gauges by randomising the minimum value between -2 and 0 and the maximum value between 2 and 10. We also included fractional values rather than only whole numbers. By varying these minimum and maximum values, our synthetic data encompasses a wider range of potential readings.
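As an illustration, the scale interval can be sampled along these lines; the numeric ranges come from the text above, but the helper itself is a simplified sketch rather than our production generator code.

```python
import random

def sample_scale_values():
    """Sample the numerical interval shown on a synthetic gauge face."""
    # Minimum between -2 and 0, maximum between 2 and 10 (per the ranges above).
    value_min = round(random.uniform(-2.0, 0.0), 1)
    value_max = round(random.uniform(2.0, 10.0), 1)
    # Keep some gauges with whole-number scales so both styles appear in the data.
    if random.random() < 0.5:
        value_min, value_max = float(round(value_min)), float(round(value_max))
    return value_min, value_max
```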
Arc length variation:
To address the arc length, we introduced variations within our synthetic data ranging from 180° (min) to 320° (max). This enabled our model to recognise and accurately interpret indicators that span different angular ranges.
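A sketch of how a sampled arc length can be turned into tick and needle angles when rendering a synthetic gauge; only the 180°-320° range comes from the text, while the angle convention and field names are assumptions for illustration.

```python
import random

def sample_arc_layout(value_min, value_max, value):
    """Place the min tick, max tick and needle for a randomly sampled arc length.

    Angles are in degrees, measured clockwise from the 12 o'clock direction,
    with the scale arc centred on it (a rendering convention assumed here).
    """
    arc = random.uniform(180.0, 320.0)   # total angular span of the scale
    fraction = (value - value_min) / (value_max - value_min)
    return {
        "min_tick_angle": -arc / 2.0,
        "max_tick_angle": +arc / 2.0,
        "needle_angle": -arc / 2.0 + fraction * arc,
    }
```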
When re-evaluating the models on the real data, we noticed considerable improvement, but there were still problematic situations, such as this example where the predicted key point for the max value was missing.
2. Scale markers and scale major subdivisions
Special scale markers
Upon further inspection, we observed that some problematic gauges had additional markings on the scale ticks. To address this, we added special markers: a triangle for the max value, a horizontal line for zero and a horizontal line for the min value.
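In the generator this can be expressed as a small set of optional marker styles attached to specific scale positions; the structure and probability below are an illustrative sketch, not our exact configuration.

```python
import random

# Marker styles described above; each is applied with some probability so that
# gauges with and without special markers both appear in the synthetic data.
SPECIAL_MARKERS = [
    {"position": "max",  "shape": "triangle"},
    {"position": "zero", "shape": "horizontal_line"},
    {"position": "min",  "shape": "horizontal_line"},
]

def sample_special_markers(p=0.5):
    """Pick the subset of special markers to draw on a synthetic gauge face."""
    return [m for m in SPECIAL_MARKERS if random.random() < p]
```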
With these changes, new datasets were generated to supplement the previous data during training. The results for the pose model were better than expected.
3. Needle shape and beyond min/max
Still, one of the previously problematic videos was performing poorly. We hypothesised that this was due to the unique shape of the gauge's dial, the dial going beyond the limits of the min/max scale range, and the granularity of the numerical subdivisions.
Needle shape variability:
To address the unique dial shape, we added variation to our synthetic data, ranging from simple straight pointers to more complex, stylised designs. This ensured our vision system could accurately identify the needle, regardless of its form.
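A hedged sketch of how needle geometry can be varied per sample; the style names and parameter ranges are illustrative assumptions.

```python
import random

NEEDLE_STYLES = ["straight", "tapered", "arrow", "stylised"]

def sample_needle():
    """Randomise the needle's style and proportions for one synthetic gauge."""
    return {
        "style": random.choice(NEEDLE_STYLES),
        "length_ratio": random.uniform(0.6, 0.95),  # relative to the dial radius
        "width_ratio": random.uniform(0.01, 0.06),
        "has_tail": random.random() < 0.3,          # short counterweight behind the centre
    }
```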
Numerical subdivision variations:
To address the variability we observed in the granularity of numerical subdivisions, we introduced a procedural system to vary the scale increment. By incorporating a wide variety of subdivision schemes, we made the model more resilient to these cases.
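The procedural idea is to pick a "nice" increment that divides the sampled interval into a plausible number of major ticks; a simplified sketch, with the candidate increments being our own assumption:

```python
def pick_scale_increment(value_min, value_max, target_ticks=(5, 12)):
    """Choose a major-tick increment that yields a plausible number of divisions."""
    candidates = [0.1, 0.2, 0.25, 0.5, 1.0, 2.0, 2.5, 5.0]
    span = value_max - value_min
    for step in candidates:
        if target_ticks[0] <= span / step <= target_ticks[1]:
            return step
    return span / 10.0  # fallback: ten equal divisions
```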
Readings beyond min/max dial values:
To catch these edge cases in the real data, we included observations where the needle value falls below the minimum or exceeds the maximum dial value. This helped the model accurately interpret readings even when they fell outside the typical range.
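Out-of-range readings can be injected simply by occasionally sampling the displayed value slightly outside the scale interval; in the sketch below, the probability and overshoot margin are assumptions.

```python
import random

def sample_needle_value(value_min, value_max, p_out_of_range=0.1, overshoot=0.1):
    """Sample the displayed value, occasionally pushing it beyond the scale limits."""
    span = value_max - value_min
    if random.random() < p_out_of_range:
        # Place the needle just below the min tick or just above the max tick.
        if random.random() < 0.5:
            return value_min - random.uniform(0.0, overshoot * span)
        return value_max + random.uniform(0.0, overshoot * span)
    return random.uniform(value_min, value_max)
```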
4. Augmentations during training
Throughout these experiments, in addition to improving the data generation pipeline, we made extensive use of augmentations during training. This helped eliminate the problems that the segmentation model had on three of the raw test videos.
The augmentations we applied specifically to improve the segmentation model were (a minimal sketch follows the list):
- Using Albumentations, we apply various transformations with random probabilities: brightness-contrast, gamma, several types of blurring, and simulated JPEG compression.
- YOLOXHSVRandomAug, which applies HSV (hue, saturation, value) augmentation to images sequentially. This enhances the model's robustness by allowing it to generalise better across different lighting conditions and colour variations in images.
- Padding and cropping to help increase the relative size range of the gauge to the overall image. We initially padded the image to a much larger size and then cropped to a random width and height.
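A minimal Albumentations pipeline in this spirit might look as follows. The specific probabilities, limits and canvas/crop sizes are placeholders rather than our exact training configuration, and the HSV augmentation mentioned above (YOLOXHSVRandomAug, from the MMDetection toolbox) is configured in the detection framework's own pipeline rather than here.

```python
import albumentations as A
import cv2

# Illustrative photometric + geometric augmentations for the segmentation model.
train_augs = A.Compose([
    A.RandomBrightnessContrast(p=0.5),
    A.RandomGamma(p=0.3),
    A.OneOf([                       # diverse types of blurring
        A.Blur(blur_limit=5),
        A.MotionBlur(blur_limit=5),
        A.GaussianBlur(blur_limit=(3, 7)),
    ], p=0.3),
    A.ImageCompression(p=0.3),      # simulate JPEG compression artefacts
    # Pad to a much larger canvas, then crop a window so the gauge's relative
    # size in the image varies between samples (fixed crop shown for simplicity;
    # the crop width/height can themselves be randomised).
    A.PadIfNeeded(min_height=1280, min_width=1280,
                  border_mode=cv2.BORDER_CONSTANT, p=1.0),
    A.RandomCrop(height=960, width=960, p=1.0),
])

# Usage: augmented = train_augs(image=image)["image"]
```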
Final results
The quantitative evaluation of our gauge reading system focused on the precision and accuracy of key point predictions, assessed both relative to gauge size and in absolute pixel terms. Additionally, we evaluated the accuracy of the gauge value readings. These metrics were calculated using manually labelled sample frames from the test videos, cropped to highlight the gauges.
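For concreteness, the key-point error metrics can be computed along these lines; this is a sketch in which the gauge diameter is used for normalisation and the names are illustrative.

```python
import math

def keypoint_errors(pred, gt, gauge_diameter_px):
    """Per-key-point error in absolute pixels and relative to gauge size."""
    errors = {}
    for name in ("min", "max", "centre", "tip"):
        dx = pred[name][0] - gt[name][0]
        dy = pred[name][1] - gt[name][1]
        abs_err = math.hypot(dx, dy)                  # absolute error in pixels
        errors[name] = {
            "abs_px": abs_err,
            "relative": abs_err / gauge_diameter_px,  # error relative to gauge size
        }
    return errors
```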
Our system showed high precision and accuracy in key point prediction, with relative measurements closely aligning to the actual gauge sizes and absolute errors remaining minimal in pixel terms. The gauge value accuracy was also robust, demonstrating the system’s ability to reliably interpret gauge readings from the processed images.
This visualisation illustrates the quantitative results for each of the videos, for which we manually annotated a small subset of frames. Each plot corresponds to a distinct gauge video, with the X-axis representing the frame numbers and the Y-axis representing the gauge values. The solid lines indicate the gauge values derived from ground truth annotations, while the cross markers denote the values computed from the predicted outputs. It is genuinely exciting to see how closely the cross markers mirror the actual values, highlighting the robustness and effectiveness of our model in real-world applications.
For the qualitative analysis, we assessed the system's performance on raw test videos from the same Kaggle dataset. The results were promising, showing the system’s ability to accurately identify and read gauges under varying lighting conditions, camera angles and background noise. These qualitative assessments underscored the model’s robustness and versatility, proving its capability to handle multiple gauge types and diverse conditions, and confirming its reliability in practical applications through visual inspection.
Future enhancements
Currently, we use hardcoded values for the min-max gauge interval. The next steps to make the solution production-ready include running OCR on the scale labels and handling cases where the perspective is not directly facing the gauge. These final steps will enhance the system's flexibility and accuracy, ensuring it performs well across a wider range of real-world scenarios.
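As a sketch of the planned OCR step, the min/max scale labels located by the segmentation model could be cropped and passed to an off-the-shelf OCR engine such as Tesseract. This is an assumed approach for the future enhancement described above, not an implemented feature of the current system.

```python
import pytesseract  # assumes the Tesseract binary is installed

def read_scale_label(label_crop):
    """Read a numeric scale label from a cropped image region (hypothetical helper)."""
    # Restrict Tesseract to digits, sign and decimal point for numeric labels.
    text = pytesseract.image_to_string(
        label_crop, config="--psm 7 -c tessedit_char_whitelist=0123456789.-")
    try:
        return float(text.strip())
    except ValueError:
        return None  # OCR failed; fall back to the hardcoded interval
```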
Announcing our open-source synthetic dataset
To further support development and research in this field, we are excited to announce the release of an open-source synthetic dataset on Kaggle. This dataset encompasses a wide range of gauge variations and includes the enhancements described above. Researchers and developers can use it to train and validate their own machine learning models, facilitating broader innovation and collaboration.
Visit Endava’s Kaggle page to access the dataset: Synthetic Data for Precision Gauge Reading, where you can see a subset of the data used and inference results on the raw test videos used for testing.