Big Data analytics ontology

Authors

Vasyl Lytvyn, Victoria Vysotska, Oleh Veres, Oksana Brodyak, Oksana Oryshchyn

DOI:

https://doi.org/10.15587/2312-8372.2018.123612

Keywords:

Big Data analysis ontology, data visualization, Data Mining, Text Mining, MapReduce

Abstract

The object of this research is the process of Big Data (BD) analysis. One of its main problems is the lack of a clear classification of BD analysis methods. Such a classification would greatly facilitate the selection of an optimal and efficient algorithm for analyzing the data depending on their structure.

In the course of the study, Data Mining methods, Text Mining technologies, MapReduce technology, data visualization, and other analysis technologies and techniques were used. This made it possible to determine their main characteristics and features for constructing a formal analysis model for Big Data. Rules for analyzing Big Data were developed in the form of an ontological knowledge base, with the aim of using it to process and analyze any data.

A classifier for forming a set of Big Data analysis rules has been obtained. Each BD set has parameters and criteria that determine the applicable methods and technologies of analysis. The purpose of the BD, its structure, and its content determine the techniques and technologies for further analysis. The ontology of the BD analysis knowledge base, developed in Protégé 3.4.7, and the set of RABD rules built into it shorten the process of selecting methodologies and technologies for further analysis and automate the analysis of the selected BD. This is because the proposed approach to Big Data analysis has a number of distinguishing features, in particular an ontological knowledge base built on modern artificial intelligence methods.

As a result, it is possible to obtain a complete set of Big Data analysis rules, but only if the parameters and criteria of the specific Big Data set are clearly analyzed.
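The idea that the parameters of a data set determine which analysis techniques apply can be illustrated with a minimal sketch. The profile fields, the rule conditions, and the fallback below are hypothetical assumptions chosen for illustration; they are not the RABD rules or the Protégé ontology described in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetProfile:
    """Hypothetical parameters describing a Big Data set (illustrative only)."""
    structured: bool   # tabular or relational records
    textual: bool      # mostly natural-language text
    distributed: bool  # too large to process on a single machine


# Each rule pairs a condition on the profile with a suggested technique.
# The conditions are assumptions for this sketch, not the paper's rule base.
RULES = [
    (lambda p: p.textual, "Text Mining"),
    (lambda p: p.structured, "Data Mining"),
    (lambda p: p.distributed, "MapReduce"),
]


def select_methods(profile):
    """Return the techniques whose rule conditions hold for the profile."""
    methods = [name for condition, name in RULES if condition(profile)]
    # Falling back to visualization when no rule fires is an assumption.
    return methods or ["data visualization"]
```

For example, calling `select_methods(DatasetProfile(structured=False, textual=True, distributed=True))` returns `["Text Mining", "MapReduce"]`. A real knowledge base would hold far richer criteria and would be queried through an ontology reasoner rather than a flat list of conditions.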

Author Biographies

Vasyl Lytvyn, Lviv Polytechnic National University, 12, S. Bandery str., Lviv, Ukraine, 79013

Doctor of Technical Sciences, Professor

Department of Information Systems and Networks

Victoria Vysotska, Lviv Polytechnic National University, 12, S. Bandery str., Lviv, Ukraine, 79013

PhD, Associate Professor

Department of Information Systems and Networks

Oleh Veres, Lviv Polytechnic National University, 12, S. Bandery str., Lviv, Ukraine, 79013

PhD, Associate Professor

Department of Information Systems and Networks 

Oksana Brodyak, Lviv Polytechnic National University, 12, S. Bandery str., Lviv, Ukraine, 79013

PhD, Associate Professor

Department of Mathematics

Oksana Oryshchyn, Lviv Polytechnic National University, 12, S. Bandery str., Lviv, Ukraine, 79013

PhD, Associate Professor

Department of Mathematics

Published

2017-12-28

How to Cite

Lytvyn, V., Vysotska, V., Veres, O., Brodyak, O., & Oryshchyn, O. (2017). Big Data analytics ontology. Technology Audit and Production Reserves, 1(2(39)), 16–27. https://doi.org/10.15587/2312-8372.2018.123612

Section

Information Technologies: Original Research