Methods of parallel text data clustering algorithm implementation

Authors

DOI:

https://doi.org/10.15587/2312-8372.2015.37422

Keywords:

parallel calculations, clustering, Maximin, algorithmization, productivity

Abstract

A general scheme for organizing parallel computations is considered. The features of organizing a parallel computation process are described, and criteria that indicate whether an algorithm can be expressed in parallel form are defined. The Maximin algorithm is described, and software tools for parallelizing algorithms are reviewed. A version of the Maximin algorithm based on parallel computations is developed. The clustering problem is solved by parallel computations using the Maximin algorithm; this is possible because the algorithm contains at least two operations whose results do not depend on each other. The parallel implementation reduces the execution time of the algorithm even with two processors. It is proved that the performance gain depends linearly on the number of processing units. The results confirm that a parallel implementation of the Maximin algorithm is worthwhile and that it increases the efficiency of the data clustering process.
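The abstract does not list the steps of the algorithm, so the following is only a minimal sketch of where the independent operations arise. It assumes the common textbook formulation of Maximin: the first object is the first center, the farthest object from it is the second, and a new center is accepted while its distance to the nearest existing center exceeds a fraction of the mean distance between centers. The names `maximin`, `nearest_center`, `workers` and the default `threshold=0.5` are hypothetical and are not taken from the paper, and objects are assumed to be numeric feature vectors (for example, term-frequency vectors of documents) compared by Euclidean distance.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
import math


def distance(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_center(point, centers):
    """Return (distance, index) of the center closest to `point`."""
    return min((distance(point, c), i) for i, c in enumerate(centers))


def maximin(points, workers=2, threshold=0.5):
    """Maximin clustering; per-point distance work is mapped over a pool.

    `threshold` is an assumed parameter: a candidate becomes a new center
    only if its nearest-center distance exceeds threshold * mean distance
    between the current centers.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Step 1: the first point is the first center; the second center
        # is the point farthest from it.
        centers = [points[0]]
        first = list(pool.map(lambda p: distance(p, points[0]), points))
        centers.append(points[first.index(max(first))])

        while True:
            # Step 2: each point's nearest-center distance does not depend
            # on any other point's result, so the map can run in parallel.
            nearest = list(pool.map(lambda p: nearest_center(p, centers), points))
            gaps = [d for d, _ in nearest]
            candidate = gaps.index(max(gaps))

            # Step 3: accept the candidate only if it is far enough away
            # compared with the mean distance between existing centers.
            pairs = list(combinations(centers, 2))
            mean_gap = sum(distance(a, b) for a, b in pairs) / len(pairs)
            if gaps[candidate] <= threshold * mean_gap:
                break
            centers.append(points[candidate])

    # Step 4: the last pass already holds each point's nearest center.
    labels = [i for _, i in nearest]
    return centers, labels
```

A call such as `maximin(points, workers=2)` returns the chosen centers and one cluster index per point. In CPython a thread pool mainly illustrates the structure of the computation; for CPU-bound distance calculations a process pool would be the usual substitute, and the result should not change with the number of workers.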

Author Biography

Юрій Вікторович Волосюк, Mykolaiv branch of European University, 5/1 Buznika St., Mykolaiv, Ukraine, 54097

Candidate of Technical Sciences, Associate Professor

Department of Informatics and Social and Humanitarian Disciplines

Published

2015-01-29

How to Cite

Волосюк, Ю. В. (2015). Methods of parallel text data clustering algorithm implementation. Technology Audit and Production Reserves, 1(2(21)), 34–37. https://doi.org/10.15587/2312-8372.2015.37422

Issue

Section

Information Technologies: Original Research