Implementation of term frequency-inverse document frequency (TF-IDF) and Word2Vec in traditional medicine recommendation system based on content-based filtering
DOI:
https://doi.org/10.15587/1729-4061.2025.338128
Keywords:
content-based filtering, dimension, feature extraction, recommendation system, semantic relationship, term frequency-inverse document frequency (TF-IDF), traditional medicine, window size, Word2Vec, word weight
Abstract
According to the World Health Organization (WHO), traditional medicine is the sum total of the knowledge, abilities, and practices derived from the theories, beliefs, and experiences that are unique to various cultures and that are used to maintain health as well as to prevent, diagnose, treat, or improve physical and mental illness. Traditional herbal therapy has recently been classified as comprising medicinal techniques that existed, often for hundreds of years, before the establishment of modern medicine. The lack of easily accessible information on the description and efficacy of traditional medicines makes it difficult for users to understand the benefits of each type. A recommendation system is therefore needed to help users find traditional medicines that suit their preferences. This research proposes a traditional medicine recommendation system based on content-based filtering, using a combination of term frequency-inverse document frequency (TF-IDF) and Word2Vec feature extraction. The method analyzes the text of traditional medicine descriptions and makes recommendations based on word weights and semantic relationships between words. Results show optimal performance at dimensions of 50–200 and window sizes of 9–15 for the combination of TF-IDF and Word2Vec, while TF-IDF alone reaches 80% accuracy and Word2Vec alone performs worse (4–14%) across a wide range of parameter settings. Based on these results, the recommendation system can be applied to provide information on traditional medicines that suit people's needs by selecting the best-performing dimension and window size.
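To make the idea of combining word weights with semantic vectors concrete, the following is a minimal sketch, not the authors' implementation. It assumes one common way of combining the two feature types: a TF-IDF-weighted average of word vectors, compared by cosine similarity. The abstract does not state how the combination is actually done. The item names, descriptions, the smoothed IDF formula, and the hand-made two-dimensional `embeddings` table (standing in for a trained Word2Vec model) are all hypothetical.

```python
import math
from collections import Counter

def tfidf(tokenized_docs):
    """Return one {term: weight} dict per document (relative tf * smoothed idf)."""
    n = len(tokenized_docs)
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    weights = []
    for doc in tokenized_docs:
        tf = Counter(doc)
        weights.append({
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return weights

def weighted_doc_vector(weights, embeddings, dim):
    """TF-IDF-weighted average of word vectors; terms without a vector are skipped."""
    vec = [0.0] * dim
    total = 0.0
    for term, w in weights.items():
        if term in embeddings:
            for i, x in enumerate(embeddings[term]):
                vec[i] += w * x
            total += w
    return [x / total for x in vec] if total else vec

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def recommend(query_name, names, vectors, top_k=2):
    """Rank all other items by cosine similarity to the query item."""
    q = vectors[names.index(query_name)]
    scored = [(cosine(q, v), n) for n, v in zip(names, vectors) if n != query_name]
    return [n for _, n in sorted(scored, reverse=True)[:top_k]]

# Hypothetical toy data (not from the paper's dataset).
names = ["ginger_tonic", "turmeric_tonic", "tamarind_drink"]
descriptions = [
    "ginger root relieves cough",
    "turmeric root relieves pain",
    "tamarind fruit cools body",
]

# Hand-made 2-d vectors standing in for a trained Word2Vec model (assumption).
embeddings = {
    "ginger": [0.9, 0.1], "turmeric": [0.8, 0.2], "root": [0.7, 0.3],
    "relieves": [0.6, 0.4], "cough": [0.9, 0.2], "pain": [0.8, 0.3],
    "tamarind": [0.1, 0.9], "fruit": [0.2, 0.8], "cools": [0.1, 0.8],
    "body": [0.3, 0.7],
}

tokenized = [d.split() for d in descriptions]
vectors = [weighted_doc_vector(w, embeddings, dim=2) for w in tfidf(tokenized)]
print(recommend("ginger_tonic", names, vectors, top_k=1))
```

With this toy data, the item most similar to `ginger_tonic` is `turmeric_tonic`, because their descriptions share vocabulary and their word vectors point in similar directions. In a real setting the `embeddings` table would come from a Word2Vec model trained on the description corpus, where the vector dimension and context window size are the parameters the abstract reports tuning.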
License
Copyright (c) 2025 Rika Yunitarini, Dwi Aqilah Pradita, Ernaning Widiaswanti

This work is licensed under a Creative Commons Attribution 4.0 International License.