Methods of parallel text data clustering algorithm implementation
Keywords: parallel calculations, clustering, Maximin, algorithmization, productivity
The paper considers a general scheme for organizing parallel computations. It describes the features of a parallel computing process and defines criteria that indicate whether an algorithm can be expressed in parallel form. The Maximin algorithm is described, and software tools for parallelizing algorithms are reviewed. A version of Maximin based on parallel computations is developed and used to solve a clustering problem. Parallelization is possible because the algorithm contains at least two operations whose results do not depend on each other. The parallel implementation reduces execution time even with two processors. The paper shows that the performance gain grows linearly with the number of processors. These results confirm that a parallel implementation of the Maximin algorithm is worthwhile and that it increases the efficiency of data clustering.
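The abstract does not list the algorithm's steps, so the following is only a minimal sketch. It assumes the commonly described Maximin procedure: for every point, compute the distance to its nearest center, then promote the point with the largest such distance to a new center while that distance exceeds a fraction `alpha` of the mean distance between existing centers. The per-point distance computations do not depend on each other, which is the kind of independence the abstract refers to, so they can be split across workers. The function names, the `alpha` threshold, and the chunking scheme are all illustrative assumptions, not the author's implementation. A thread pool is used for simplicity; in CPython a process pool would be needed to see a real speed-up on CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
import math


def min_dist_to_centers(chunk, centers):
    """For each point in chunk, return its distance to the nearest center."""
    return [min(math.dist(p, c) for c in centers) for p in chunk]


def split(items, n):
    """Split items into n contiguous chunks of nearly equal size."""
    k, r = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + k + (1 if i < r else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def maximin_centers(points, workers=2, alpha=0.5):
    """Select cluster centers with the Maximin rule (hypothetical helper).

    The distance computations for different chunks are independent,
    so each chunk is handed to a separate worker.
    """
    centers = [points[0]]  # assumption: the first point seeds the first center
    chunks = split(points, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            parts = pool.map(min_dist_to_centers, chunks, [centers] * workers)
            dists = [d for part in parts for d in part]
            idx = max(range(len(points)), key=dists.__getitem__)
            if len(centers) >= 2:
                pairs = list(combinations(centers, 2))
                mean_gap = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
                if dists[idx] <= alpha * mean_gap:
                    break  # farthest point is close enough to a center
            elif dists[idx] == 0:
                break  # all points coincide with the single center
            centers.append(points[idx])
    return centers


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
        for p in points
    ]
```

Because the chunks are contiguous and their results are concatenated in order, the selected centers do not depend on the number of workers; only the wall-clock time can change.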
Copyright (c) 2016 Юрій Вікторович Волосюк
This work is licensed under a Creative Commons Attribution 4.0 International License.