Development of a stopping rule of clustering performance by using the connected acyclic graph
DOI:
https://doi.org/10.15587/1729-4061.2015.51090
Keywords:
initial clustering, preclustering algorithm, stopping rule, connected acyclic graph, cluster
Abstract
This article introduces a technique for analyzing a stopping rule for a data preclustering algorithm that uses a connected acyclic graph and requires no prior information about the number of clusters. The connected acyclic graph (tree) represents the interconnections between the objects in the input data. The stopping rule halts the algorithm at a step beyond which further clustering is assumed not to reveal new clusters. The core of the analysis was the application of the preclustering algorithm and the stopping rule to a series of sample input data sets. The samples were normally distributed data that belonged either to a single group or to several groups. The analysis has shown the advantages of the stopping rule for the data preclustering algorithm.
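The abstract does not give the details of the algorithm or of the stopping rule, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the tree is a minimum spanning tree over Euclidean distances, and it uses a hypothetical stopping rule: keep cutting the longest remaining tree edge, and stop as soon as that edge is no longer than `ratio` times the median remaining edge. The names `minimum_spanning_tree`, `precluster`, and `ratio` are illustrative assumptions.

```python
import math
import random
from statistics import median


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns a list of (i, j, length) edges forming a connected acyclic graph.
    Assumption: the abstract's tree is taken to be a minimum spanning tree.
    """
    n = len(points)
    if n < 2:
        return []
    # best[j] = (distance to the nearest node already in the tree, that node)
    best = {j: (math.dist(points[0], points[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])
        length, i = best.pop(j)
        edges.append((i, j, length))
        for k in best:
            d = math.dist(points[j], points[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges


def precluster(points, ratio=4.0):
    """Cut long tree edges until a hypothetical stopping rule fires.

    Returns one cluster label per point (labels are arbitrary integers).
    """
    edges = sorted(minimum_spanning_tree(points), key=lambda e: e[2])
    while edges:
        typical = median(e[2] for e in edges)
        # Hypothetical stopping rule: the longest remaining edge is not
        # unusually long, so further cuts are assumed to find no new clusters.
        if edges[-1][2] <= ratio * typical:
            break
        edges.pop()  # cut the longest edge, splitting one cluster in two

    # Connected components of the remaining forest via union-find.
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, _ in edges:
        parent[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


# Usage: normally distributed samples from one group and from two groups.
rng = random.Random(0)
one_group = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
two_groups = one_group + [(rng.gauss(50, 1), rng.gauss(0, 1)) for _ in range(40)]
print("clusters found (one group):", len(set(precluster(one_group))))
print("clusters found (two groups):", len(set(precluster(two_groups))))
```

The number of clusters is never passed in; it falls out of where the stopping rule halts the cutting. The threshold `ratio` is a tuning choice in this sketch, and isolated outliers in a normal sample can be split off as their own clusters if it is set too low.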
License
Copyright (c) 2015 Volodymyr Mosorov, Sebastian Biedron, Taras Panskyi
This work is licensed under a Creative Commons Attribution 4.0 International License.