Method of automated construction of explanatory dictionary of subject area

Authors

DOI:

https://doi.org/10.15587/2312-8372.2015.40895

Keywords:

dictionary, term, subject area, synonym, group name

Abstract

The article presents a method for the automated construction of an explanatory dictionary, based on processing a large set of texts from a specific subject area.

A technology for selecting and grouping source texts is proposed. It is based on inter-document and intra-document clustering, which allows significant terms to be retained in the dictionary.

A procedure has been developed for selecting terms (individual words and phrases) from documents, based on calculating the frequency of their occurrence in the text.
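As an illustration only, frequency-based selection of candidate terms might be sketched as follows. This is a minimal sketch, not the authors' procedure: the function name `candidate_terms`, the tokenization rule, the `min_count` threshold, and the treatment of phrases as adjacent word pairs are all assumptions made for the example. It applies no stop-word filtering and no comparison with a general frequency dictionary.

```python
import re
from collections import Counter


def candidate_terms(text, min_count=2, max_words=2):
    """Return {candidate: count} for words and word n-grams
    that occur at least min_count times in the text.

    Hypothetical helper: tokenization and thresholds are assumptions.
    """
    # Lowercase the text and keep runs of letters (any alphabet) as tokens.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter()
    # Count single words (n = 1) and adjacent word sequences up to max_words.
    for n in range(1, max_words + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    # Keep only candidates that reach the frequency threshold.
    return {term: c for term, c in counts.items() if c >= min_count}


sample = (
    "The subject area dictionary lists terms. "
    "Each subject area has its own dictionary."
)
terms = candidate_terms(sample)
```

On this sample, `terms` contains the words "subject", "area", and "dictionary" and the phrase "subject area", each with a count of 2. A real pipeline would also need to filter out frequent function words, which this sketch does not attempt.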

A technique is proposed for finding synonyms and definitions and for making use of other dictionaries.

A formula is given for estimating the time spent on the various stages of compiling the dictionary.

Experimental results are given that confirm the effectiveness of the proposed method for constructing a subject-area dictionary.

The proposed method for automatically compiling a subject-area dictionary can be used at the requirements definition stage for software products in information systems and artificial intelligence systems.

Author Biographies

Алексей Борисович Кунгурцев, Odessa National Polytechnic University, Shevchenko Avenue, 1, Odessa, Ukraine, 65044

Candidate of Technical Science, Professor

Department of System Software 

Яна Владимировна Поточняк, Odessa National Polytechnic University, Shevchenko Avenue, 1, Odessa, Ukraine, 65044

Postgraduate Student

Department of System Software 

Дмитрий Александрович Силяев, Odessa National Polytechnic University, Shevchenko Avenue, 1, Odessa, Ukraine, 65044

Department of System Software

Published

2015-04-02

How to Cite

Кунгурцев, А. Б., Поточняк, Я. В., & Силяев, Д. А. (2015). Method of automated construction of explanatory dictionary of subject area. Technology Audit and Production Reserves, 2(2(22)), 58–63. https://doi.org/10.15587/2312-8372.2015.40895

Issue

Section

Information Technologies: Original Research