Approach for extracting events from news stream


  • Игорь Александрович Черенков National Technical University "Kharkiv Polytechnic Institute" Frunze str. 21, Kharkov 61002, Ukraine
  • Сергей Валериевич Орехов National Technical University "Kharkiv Polytechnic Institute" Frunze str. 21, Kharkov 61002, Ukraine



Keywords: news flow, syntactic model, tokens, proximity criterion


Market price forecasting makes it possible to manage pricing policy effectively and to gain a competitive advantage. Most existing price forecasting approaches are based either on expert opinion or on models built from raw price data. Neither approach achieves high forecasting accuracy, because prices reflect events in the real world. One possible solution is to use news-based forecasting models, which require processing of news streams.
Processing news streams is a complex task because news reports reflect events imprecisely, so dedicated methods for processing news data are needed. The main problem in news data processing is filtering out duplicate news items and plots. One possible approach is to extract lexemes from the headline and first sentence of each news item and then process them further.
By forming three vectors from the lexemes extracted for each news item, it is possible to develop an efficient criterion for duplicate detection. The criterion itself does not rely on any expert opinion for similarity detection beyond the basic processing logic.
The developed approach makes it possible to extract event data from a news stream that is sufficient for price forecasting.
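The idea of comparing news items through lexemes taken from the headline and first sentence can be illustrated with a minimal sketch. The abstract does not state what the three vectors contain or how the proximity criterion is computed, so everything below is an assumption made for illustration: the vectors are taken to be the headline lexemes, the first-sentence lexemes, and their union; lexemes are approximated by lower-cased word tokens with a small hypothetical stop-word list; and Jaccard overlap with a fixed threshold stands in for the proximity criterion.

```python
import re

# Hypothetical stop-word list; the source does not specify one.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "for",
              "and", "by", "at", "as", "is", "are"}


def lexemes(text):
    """Lower-cased word tokens without stop words (no real lemmatisation)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def first_sentence(body):
    """Return the text up to the first sentence-ending punctuation mark."""
    return re.split(r"(?<=[.!?])\s+", body.strip(), maxsplit=1)[0]


def news_vectors(header, body):
    """Three lexeme sets per news item (an assumed choice of vectors)."""
    head = lexemes(header)
    lead = lexemes(first_sentence(body))
    return head, lead, head | lead


def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def is_duplicate(item_a, item_b, threshold=0.5):
    """Assumed criterion: all three vector pairs must overlap enough."""
    vectors_a = news_vectors(*item_a)
    vectors_b = news_vectors(*item_b)
    return all(jaccard(x, y) >= threshold
               for x, y in zip(vectors_a, vectors_b))
```

Each news item is passed as a `(header, body)` pair. The threshold of 0.5 is arbitrary and would need tuning on real data; no expert-defined similarity rules are involved beyond this basic processing logic.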

Author Biographies

Игорь Александрович Черенков, National Technical University "Kharkiv Polytechnic Institute" Frunze str. 21, Kharkov 61002


Сергей Валериевич Орехов, National Technical University "Kharkiv Polytechnic Institute" Frunze str. 21, Kharkov 61002






How to Cite

Черенков, И. А., & Орехов, С. В. (2013). Approach for extracting events from news stream. Eastern-European Journal of Enterprise Technologies, 1(4(61)), 62–64.



Mathematics and Cybernetics - applied aspects