Approach for extracting events from news stream
DOI:
https://doi.org/10.15587/1729-4061.2013.9178
Keywords:
news flow, the syntactic model, tokens, proximity criterion
Abstract
Market price forecasting makes it possible to manage pricing policy effectively and gain a competitive advantage. Most existing price forecasting approaches are based either on expert opinion or on models built from raw price data. Neither of these approaches achieves high forecasting accuracy, because price behavior reflects events in the real world. A possible solution is to use news-based forecasting models, which require processing news streams.
Processing news streams is a complex task because news reflects events imprecisely, so proper methods for processing news data need to be developed. The main problem in news data processing is filtering out duplicate news items and plots. One possible approach is to extract lexemes from the news header and first sentence and process them further.
By forming three vectors for each news item from the extracted lexemes, it is possible to develop an efficient criterion for duplicate detection. The criterion itself does not involve any expert opinion on similarity; it relies only on the basic processing logic.
The developed approach makes it possible to extract event data from a news stream that is sufficient for price forecasting.
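The duplicate-detection idea in the abstract can be illustrated with a minimal sketch. The abstract does not say what the three vectors contain or how proximity is measured, so everything below is an assumption made for illustration: the three vectors are taken to be the header lexemes, the first-sentence lexemes, and their union; proximity is measured with Jaccard similarity averaged over the three vectors; and the names `lexemes`, `news_vectors`, `is_duplicate` and the `threshold` value are hypothetical. Simple word tokenization stands in for real lexeme extraction.

```python
import re


def lexemes(text):
    """Lowercased word tokens; a simple stand-in for real lexeme extraction."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def first_sentence(body):
    """Return the text up to the first sentence-ending punctuation mark."""
    return re.split(r"(?<=[.!?])\s+", body.strip(), maxsplit=1)[0]


def news_vectors(header, body):
    """Build three lexeme sets for one news item (assumed composition):
    header lexemes, first-sentence lexemes, and their union."""
    h = lexemes(header)
    s = lexemes(first_sentence(body))
    return h, s, h | s


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_duplicate(news_a, news_b, threshold=0.5):
    """Assumed proximity criterion: mean Jaccard similarity over the
    three vector pairs must reach the (hypothetical) threshold."""
    va = news_vectors(*news_a)
    vb = news_vectors(*news_b)
    scores = [jaccard(x, y) for x, y in zip(va, vb)]
    return sum(scores) / len(scores) >= threshold
```

Each news item is passed as a `(header, body)` pair, for example `is_duplicate(("Polymer prices rise", "Polymer prices rise in Europe. More follows."), other_item)`. The threshold would have to be tuned on real data; the value here is only a placeholder.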
License
Copyright (c) 2014 Игорь Александрович Черенков, Сергей Валериевич Орехов
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (identification of authorship) and the conditions for its transfer are established in the License Agreement. In particular, the authors retain authorship of their manuscript and grant the journal the right of first publication of the work under the terms of the Creative Commons CC BY license. They may also enter into separate agreements for the non-exclusive distribution of the work in the form in which it was published by this journal, provided that a link to the first publication of the article in this journal is preserved.
A license agreement is a document in which the author warrants that he/she owns all copyright for the work (manuscript, article, etc.).
The authors, signing the License Agreement with TECHNOLOGY CENTER PC, have all rights to the further use of their work, provided that they link to our edition in which the work was published.
Under the terms of the License Agreement, the publisher TECHNOLOGY CENTER PC does not take away your copyright; it receives permission from the authors to use and disseminate the publication through the world's scientific resources (own electronic resources, scientometric databases, repositories, libraries, etc.).
If there is no signed License Agreement, or if the agreement lacks identifiers that make it possible to identify the author, the editors have no right to work with the manuscript.
It is important to remember that there is another type of agreement between authors and publishers – when copyright is transferred from the authors to the publisher. In this case, the authors lose ownership of their work and may not use it in any way.