Development of a fake news detection tool for Vietnamese based on deep learning techniques

Authors

Trung Hung Vo, Thi Le Thuyen Phan, Khanh Chi Ninh

DOI:

https://doi.org/10.15587/1729-4061.2022.265317

Keywords:

fake news detection, natural language processing, deep learning, CNN, RNN

Abstract

With the development of the Internet, social networks, and other communication channels, people can obtain information quickly and easily. However, alongside real and useful news, they also receive false information. Fake news has become a difficult and still unresolved problem. For languages with comparatively few users, such as Vietnamese, research on fake news detection remains limited and has received little attention.

In this paper, we present the results of building a tool to support fake news detection for Vietnamese. Our idea is to apply text classification techniques to fake news detection. We built a dataset of four groups covering two topics: politics (fake news and real news) and Covid-19 (fake news and real news). We then used two deep learning techniques, CNN (Convolutional Neural Network) and RNN (Recurrent Neural Network), to train the corresponding models. When a new news item needs to be verified, we classify it into one of the four groups, and the assigned group determines whether it is fake news. The tool detected fake news quickly, with an accuracy of about 85%. We expect this result to improve with a larger training data set and further tuning of the model parameters. These results contribute to research on fake news detection for Vietnamese, and the approach can be applied to other languages. In the future, besides classification techniques (based on content analysis), other methods can be combined, such as checking the source, verifying the author's information, and checking the distribution process, to improve the quality of fake news detection.
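The decision step described above (classify a news item into one of four groups, then read off whether it is fake) can be illustrated with a minimal sketch. It assumes a trained model has already produced one score per group; the group ordering, the `decide` function name, and the returned fields are hypothetical and are not taken from the paper.

```python
# Hypothetical label order: the paper does not state how the four groups are indexed.
GROUPS = [
    ("politics", "fake"),
    ("politics", "real"),
    ("covid-19", "fake"),
    ("covid-19", "real"),
]


def decide(probabilities):
    """Map a 4-way score vector to a topic and a fake/real verdict.

    `probabilities` is assumed to hold one score per entry in GROUPS,
    for example the softmax output of a CNN or RNN classifier.
    """
    if len(probabilities) != len(GROUPS):
        raise ValueError("expected one score per group")
    # Pick the group with the highest score.
    best = max(range(len(GROUPS)), key=lambda i: probabilities[i])
    topic, veracity = GROUPS[best]
    return {
        "topic": topic,
        "is_fake": veracity == "fake",
        "confidence": probabilities[best],
    }


if __name__ == "__main__":
    # Illustrative scores only, not results from the paper.
    print(decide([0.1, 0.2, 0.6, 0.1]))
```

With this framing, a single four-class classifier yields both the topic and the fake/real verdict, so no separate binary model is needed per topic.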

Supporting Agency

  • This research was funded by the Ministry of Education and Training (Vietnam) through the project code B2022-DNA-17.

Author Biographies

Trung Hung Vo, University of Technology and Education - The University of Danang

Doctor of Computer Science, Professor, Vice-Rector

Department of Digital Technology

Thi Le Thuyen Phan, FPT University

Doctor of Computer Science, IT Lecturer

Department of Information Technology

Khanh Chi Ninh, The University of Danang

PhD Student, Lecturer of VKU

Department of Information Technology

Vietnam-Korea University of Information and Communication Technology (VKU)

Published

2022-10-30

How to Cite

Vo, T. H., Phan, T. L. T., & Ninh, K. C. (2022). Development of a fake news detection tool for Vietnamese based on deep learning techniques. Eastern-European Journal of Enterprise Technologies, 5(2 (119)), 14–20. https://doi.org/10.15587/1729-4061.2022.265317