New data clustering heuristic algorithm

Volodymyr Mosorov; Taras Panskyi

doi:10.15587/1729-4061.2015.39785

Authors

Volodymyr Mosorov, Lodz University of Technology, Stefanowskiego str. 18/22, Lodz, 90-924, Poland
Taras Panskyi, Lodz University of Technology, Stefanowskiego str. 18/22, Lodz, 90-924, Poland

DOI:

https://doi.org/10.15587/1729-4061.2015.39785

Keywords:

clustering method, cluster, heuristic algorithm, density distribution, density based

Abstract

Clustering is a data mining technique that places objects into groups in such a way that objects in the same group are more similar or related to one another than to objects in other groups. The objects within such a group, called a cluster, resemble each other and differ from the objects in other clusters. This article focuses on clustering in data mining and proposes a data clustering method, demonstrated on random data with a uniform distribution. Data mining here involves solving problems by clustering large data sets with different data types and properties. The main task of the research was to investigate data clustering and to determine how many clusters a data set contains; in particular, whether the data set contains more than one cluster. The new method includes a decision rule that uses the following parameters: the area of the regions found from the density distribution of the input data, the number and magnitude of the local maxima (peaks) found in each region, and the number of elements (out of the total number of primary elements) that fall into each region. The proposed clustering method differs from existing ones in that its only input parameter is the data set itself, and the criterion for evaluating the correctness of the method is an objective assessment by a person or group of people based on visual logical analysis. All data manipulations described in this article were carried out in Matlab.
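The three quantities named in the abstract (region area, number and magnitude of peaks, and the share of elements per region) can be illustrated with a minimal Python sketch. This is not the authors' Matlab implementation, and the decision rule itself is not reproduced here. The grid-based density estimate, the threshold at the mean cell count, the 4-connected region search, and all function names are assumptions made for illustration only.

```python
import random
from collections import deque


def density_grid(points, bins):
    """Count points per cell of a bins x bins grid over the unit square."""
    grid = [[0] * bins for _ in range(bins)]
    for x, y in points:
        i = min(int(x * bins), bins - 1)
        j = min(int(y * bins), bins - 1)
        grid[i][j] += 1
    return grid


def neighbours(i, j, bins):
    """Yield the 4-connected neighbours of cell (i, j) that lie inside the grid."""
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < bins and 0 <= nj < bins:
            yield ni, nj


def find_regions(grid, threshold):
    """Group cells whose count exceeds threshold into 4-connected regions."""
    bins = len(grid)
    seen = set()
    regions = []
    for i in range(bins):
        for j in range(bins):
            if grid[i][j] <= threshold or (i, j) in seen:
                continue
            region, queue = [], deque([(i, j)])
            seen.add((i, j))
            while queue:  # breadth-first flood fill
                ci, cj = queue.popleft()
                region.append((ci, cj))
                for ni, nj in neighbours(ci, cj, bins):
                    if (ni, nj) not in seen and grid[ni][nj] > threshold:
                        seen.add((ni, nj))
                        queue.append((ni, nj))
            regions.append(region)
    return regions


def region_features(grid, region, total):
    """Return area (in cells), peak heights, and share of all points."""
    bins = len(grid)
    # A peak is a cell strictly higher than every 4-connected neighbour.
    peaks = [grid[i][j] for i, j in region
             if all(grid[i][j] > grid[a][b] for a, b in neighbours(i, j, bins))]
    count = sum(grid[i][j] for i, j in region)
    return {"area": len(region), "peaks": peaks, "share": count / total}


if __name__ == "__main__":
    rng = random.Random(0)
    # Uniformly distributed random data, as in the abstract's example.
    points = [(rng.random(), rng.random()) for _ in range(500)]
    grid = density_grid(points, bins=8)
    threshold = len(points) / (8 * 8)  # assumed cut-off: mean cell count
    for region in find_regions(grid, threshold):
        print(region_features(grid, region, len(points)))
```

A decision rule in the spirit of the abstract would then compare these per-region features to decide whether the data set holds more than one cluster. The actual thresholds and logic are defined in the article, not in this sketch.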

Author Biographies

Volodymyr Mosorov, Lodz University of Technology, Stefanowskiego str. 18/22, Lodz, 90-924, Poland

Doctor of Technical Sciences

Institute of Applied Computer Science

Taras Panskyi, Lodz University of Technology, Stefanowskiego str. 18/22, Lodz, 90-924, Poland

graduate student

Institute of Applied Computer Science
