Development of multibase data storages on the basis of data and queries structuredness

Authors

DOI:

https://doi.org/10.15587/1729-4061.2015.36646

Keywords:

multibase data storages, building, data structuredness, queries, genetic algorithms, gene-based adaptation of search

Abstract

The study focuses on building multibase data storages that take into account the correlation between data properties and the queries performed on the data. This type of data storage has not previously been treated as a distinct approach or studied. In particular, little attention has been paid to representing data in different models in order to optimize query response. We propose a method for designing multibase data storages on the basis of data structuredness, which allows the reference data to be placed in storage media whose data models facilitate the queries run against them. The efficiency of the designed data storage is optimized on the basis of query-processing statistics, by adjusting how the data are stored in the storage media through indexing, materialized views, fragmentation, and merging. We have studied the impact of the design phases and of optimization on storage performance, as well as the parameters of the modified genetic algorithm, including the gene adaptation threshold.

The research has shown that applying the suggested approach increases the integral index of query processing by 10 %. The storage building time can be reduced to 50 %, which is significant when building storages for very large volumes of data. An important advantage of the approach is its flexibility: any storage media and optimization mechanisms can be used with the suggested models.

Author Biography

Андрій Юрійович Яцишин, Golovfintech, Dehtiarivska 38-44, Kyiv, Ukraine, 04119

Lead application programmer

Published

2015-02-27

How to Cite

Яцишин, А. Ю. (2015). Development of multibase data storages on the basis of data and queries structuredness. Eastern-European Journal of Enterprise Technologies, 1(2(73)), 11–17. https://doi.org/10.15587/1729-4061.2015.36646