The sample censoring method development for neural network model synthesis

Authors

  • Sergey Alexandrovich Subbotin, Zaporizhzhya National Technical University, Zhukovsky str., 64, Zaporizhzhya, 69063, Ukraine, https://orcid.org/0000-0001-5814-8268

DOI:

https://doi.org/10.15587/1729-4061.2014.28027

Keywords:

sample, instance selection, data reduction, neural network, dimensionality reduction

Abstract

A method of training sample formation is proposed. It characterizes the informativeness of individual instances relative to the centers and boundaries of feature intervals, which makes it possible to automate the analysis of the sample and its partition into sub-samples and, as a result, to reduce the dimensionality of the training data. A computer program implementing the proposed method has been developed and used in the experiments. The software was evaluated on the problem of diagnosing chronic obstructive bronchitis from experimentally obtained data of clinical laboratory tests of patients. The experiments showed that even a modest reduction of the original sample volume by 25 % (to 75 % of the original volume) yielded acceptable accuracy and reduced training time by more than 1.32 times. Halving the original sample volume (to 50 % of the original volume) afforded a speed gain of 1.99 times. This confirms the usefulness of the proposed mathematical support for constructing neural network models from precedents.
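
As an illustration of the censoring idea outlined above, the sketch below scores each instance by its proximity to its class center and to the boundaries of per-class feature intervals, then keeps the most informative fraction of the sample. This is a minimal sketch under assumed conventions, not the paper's exact formulas: the scoring rule, the function name censor_sample, and the parameter keep_fraction are all introduced here for illustration only.

```python
# Minimal illustrative sketch (assumptions, not the author's exact method):
# score instances by normalized distance to their class center and to the
# nearest boundary of the per-class feature intervals, then keep the most
# informative fraction of the sample before neural network training.

import numpy as np

def censor_sample(X, y, keep_fraction=0.75):
    """Return indices of instances kept after censoring (illustrative)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    scores = np.empty(len(X))
    for c in np.unique(y):
        mask = y == c
        Xc = X[mask]
        center = Xc.mean(axis=0)                       # class center per feature
        lo, hi = Xc.min(axis=0), Xc.max(axis=0)        # per-class feature intervals
        span = np.where(hi > lo, hi - lo, 1.0)         # guard against zero-width intervals
        d_center = np.abs(Xc - center) / span          # distance to the class center
        d_bound = np.minimum(Xc - lo, hi - Xc) / span  # distance to the nearest boundary
        # An instance is treated as informative when it lies near the class
        # center or near an interval boundary, i.e. when this score is small.
        scores[mask] = np.minimum(d_center, d_bound).mean(axis=1)
    n_keep = max(1, int(round(keep_fraction * len(X))))
    return np.argsort(scores)[:n_keep]

# For example, keep_fraction=0.5 reduces the sample to 50 % of its original
# volume, the reduction level reported in the experiments above.
```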

Author Biography

Sergey Alexandrovich Subbotin, Zaporizhzhya National Technical University, Zhukovsky str., 64, Zaporizhzhya, 69063, Ukraine

Habilitated Doctor of Science, Professor

Department of Software Tools

Published

2014-10-21

How to Cite

Subbotin, S. A. (2014). The sample censoring method development for neural network model synthesis. Eastern-European Journal of Enterprise Technologies, 5(4(71)), 22–27. https://doi.org/10.15587/1729-4061.2014.28027

Issue

Vol. 5 No. 4(71) (2014)

Section

Mathematics and Cybernetics - applied aspects