The use of the Isolation Forest model for anomaly detection in measurement data

Authors

Valeriy Aschepkov

DOI:

https://doi.org/10.30837/ITSSI.2024.27.236

Keywords:

uncertainty; anomaly detection; measurement; metrology; data processing; machine learning algorithms; statistical methods.

Abstract

The subject of the research is the Isolation Forest model, an efficient tool for detecting anomalies and outliers in measurement data that is applicable in fields where high accuracy and reliability of measurements are important. The goal of the study is to apply the Isolation Forest model to identify anomalous patterns that differ from the typical patterns in the output data. This is achieved by isolating anomalous patterns from normal ones through the construction of multiple different decision trees. The task of the research is to detect outliers in data obtained with a Coriolis flowmeter during preparation for international comparisons on the state primary standard of mass and volume flow rate of fluid and of mass and volume of fluid flowing through a pipeline. Data collected during metrological studies are processed by the model, which identifies anomalous or outlier values that may indicate systematic or random measurement errors. It enables quick and efficient detection of even small deviations in the data, helping to maintain high accuracy and reliability of measurement results. The main distribution-independent methods for detecting outliers in statistical analysis are Grubbs' criterion, the interquartile range method, and the standard deviation method. They are sensitive to sample size but are simple and easy to interpret. The Isolation Forest model also has limitations: in particular, it can be resource-demanding for large datasets, and it requires proper parameter tuning to achieve optimal results. The results of the research include an assessment of the Isolation Forest model's effectiveness by comparison with traditional outlier detection methods.
Comparative analysis of the results of different approaches to the same task is an effective method for evaluating the model's performance. Conclusion. The article concludes with prospects for further research in this direction. Future work will focus on developing methods for detecting anomalies in measurement data and on improving the accuracy and reliability of measurement results, with potential applications across science and industry.
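
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation or data: the readings are hypothetical flow values, the function names and the parameters (100 trees, fences at 1.5 IQR, a 3-sigma limit) are assumptions chosen for illustration, and Grubbs' criterion is omitted for brevity. The isolation forest here is a simplified one-dimensional version of the algorithm of Liu, Ting and Zhou (2008), in which a point that is separated by few random splits receives a score close to 1.

```python
import math
import random
import statistics


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of the harmonic number H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(values, depth, max_depth, rng):
    """Recursively split the values at a random point until they are isolated."""
    if depth >= max_depth or len(values) <= 1 or min(values) == max(values):
        return {"size": len(values)}  # leaf node
    split = rng.uniform(min(values), max(values))
    return {
        "split": split,
        "left": build_tree([v for v in values if v < split], depth + 1, max_depth, rng),
        "right": build_tree([v for v in values if v >= split], depth + 1, max_depth, rng),
    }


def path_length(x, node, depth=0):
    """Number of splits needed to reach the leaf containing x, adjusted for leaf size."""
    if "split" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if x < node["split"] else node["right"]
    return path_length(x, child, depth + 1)


def isolation_scores(values, n_trees=100, sample_size=256, seed=0):
    """Anomaly score in (0, 1) for each value; higher means easier to isolate."""
    rng = random.Random(seed)
    psi = min(sample_size, len(values))  # subsample size, assumed to be at least 2
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(values, psi), 0, max_depth, rng) for _ in range(n_trees)]
    scores = []
    for x in values:
        mean_path = sum(path_length(x, tree) for tree in trees) / n_trees
        scores.append(2.0 ** (-mean_path / c_factor(psi)))
    return scores


def iqr_outliers(values, k=1.5):
    """Indices of values outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [i for i, v in enumerate(values) if v < low or v > high]


def sigma_outliers(values, k=3.0):
    """Indices of values further than k sample standard deviations from the mean."""
    mean, std = statistics.mean(values), statistics.stdev(values)
    return [i for i, v in enumerate(values) if abs(v - mean) > k * std]


# Hypothetical flow readings (arbitrary units) with one injected anomaly at index 10.
readings = [100.02, 99.98, 100.01, 99.97, 100.03, 100.00, 99.99, 100.04,
            99.96, 100.01, 103.50, 100.02, 99.98, 100.00, 99.99, 100.03]

scores = isolation_scores(readings)
most_anomalous = max(range(len(readings)), key=lambda i: scores[i])
print("Isolation Forest, most anomalous index:", most_anomalous)
print("IQR rule flags indices:", iqr_outliers(readings))
print("3-sigma rule flags indices:", sigma_outliers(readings))
```

With only 16 readings, the 3-sigma rule sits close to its theoretical limit, since the largest possible deviation in a sample of size n is (n - 1)/sqrt(n) sample standard deviations. This is one way to see the sample-size sensitivity of the traditional methods mentioned in the abstract.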

Author Biography

Valeriy Aschepkov, Kharkiv National University of Radio Electronics

Junior Research Fellow at the National Scientific Center "Institute of Metrology", Postgraduate Student at the Department of Information Measurement Technology

Published

2024-07-02

How to Cite

Aschepkov, V. (2024). The use of the Isolation Forest model for anomaly detection in measurement data. INNOVATIVE TECHNOLOGIES AND SCIENTIFIC SOLUTIONS FOR INDUSTRIES, (1 (27)), 236–245. https://doi.org/10.30837/ITSSI.2024.27.236

Issue

No. 1 (27) (2024)

Section

ELECTRONICS, TELECOMMUNICATION SYSTEMS & COMPUTER NETWORKS