Evaluating the impact of weight initialization on recurrent and transformer-based models in financial asset price prediction

Authors

DOI:

https://doi.org/10.15587/1729-4061.2025.326913

Keywords:

deep learning, asset financial prediction, weight initialization, financial time series

Abstract

The object of this research is a deep learning model based on recurrent neural network (RNN), long short-term memory (LSTM), and transformer, applied to predict financial asset prices using historical time series data. The main problem addressed is the absence of a systematic study evaluating the combined effect of weight initialization methods and activation functions in time-series prediction models, particularly regarding convergence speed, prediction accuracy, and the model’s ability to capture price variability. The results show that RNN and LSTM have better training stability, are able to converge in one epoch, and provide high prediction performance (RMSE < 3.7, MAPE < 0.015, R2 close to 0.9999). In contrast, Transformer showed lower prediction performance (RMSE around 37, MAPE around 0.58, R2 between 0.9884–0.9885) and tended to overfitting on various strategy combinations. In RNN and LSTM models, the All-Zeros (AZ) and ReLU combination specifically degrades stability and leads to overfitting. The superiority of RNN and LSTM is attributed to their sequential architectures, which are more effective at learning short-term temporal patterns and more robust to suboptimal weight initialization. Therefore, selecting an appropriate combination of weight initialization and activation function plays a key role in enhancing model performance. These findings contribute empirical evidence to the importance of configuration choices in deep learning for time-series forecasting. The results can be applied in the development of deep learning-based financial asset prediction and recommendation systems, particularly for assets with long historical records and volatile market conditions

Author Biographies

Andri Andri, Universitas Sumatera Utara; Universitas Mikroskil

Doctoral Student of Computer Science; Lecturer of Computer Science

Department of Computer Science

Tengku Henny Febriana Harumy, Universitas Sumatera Utara

Doctor of Computer Science

Department of Computer Science

Syahril Efendi, Universitas Sumatera Utara

Doctor of Mathematics, Professor

Department of Computer Science

References

  1. Achury-Calderón, F., Arredondo, J. A., Sánchez Ascanio, L. C. (2025). A novel predictive analytics model for forecasting short-term trends in equity assets prices. Decision Analytics Journal, 14, 100534. https://doi.org/10.1016/j.dajour.2024.100534
  2. Tan, J., Deveci, M., Li, J., Zhong, K. (2024). Asset pricing via fused deep learning with visual clues. Information Fusion, 102, 102049. https://doi.org/10.1016/j.inffus.2023.102049
  3. Chen, Y., Zhang, L., Xie, Z., Zhang, W., Li, Q. (2025). Unraveling asset pricing with AI: A systematic literature review. Applied Soft Computing, 175, 112978. https://doi.org/10.1016/j.asoc.2025.112978
  4. Harumy, H. F., Hardi, S. M., Al Banna, M. F. (2024). EarlyStage Diabetes Risk Detection Using Comparison of Xgboost, Lightgbm, and Catboost Algorithms. Advanced Information Networking and Applications, 12–24. https://doi.org/10.1007/978-3-031-57931-8_2
  5. Al-Selwi, S. M., Hassan, M. F., Abdulkadir, S. J., Muneer, A., Sumiea, E. H., Alqushaibi, A., Ragab, M. G. (2024). RNN-LSTM: From applications to modeling techniques and beyond – Systematic review. Journal of King Saud University - Computer and Information Sciences, 36 (5), 102068. https://doi.org/10.1016/j.jksuci.2024.102068
  6. Kim, D.-K., Kim, K. (2022). A Convolutional Transformer Model for Multivariate Time Series Prediction. IEEE Access, 10, 101319–101329. https://doi.org/10.1109/access.2022.3203416
  7. Ahmed, S., Nielsen, I. E., Tripathi, A., Siddiqui, S., Ramachandran, R. P., Rasool, G. (2023). Transformers in Time-Series Analysis: A Tutorial. Circuits, Systems, and Signal Processing, 42 (12), 7433–7466. https://doi.org/10.1007/s00034-023-02454-8
  8. Ahn, J. Y., Kim, Y., Park, H., Park, S. H., Suh, H. K. (2024). Evaluating Time-Series Prediction of Temperature, Relative Humidity, and CO2 in the Greenhouse with Transformer-Based and RNN-Based Models. Agronomy, 14 (3), 417. https://doi.org/10.3390/agronomy14030417
  9. de Pater, I., Mitici, M. (2023). A mathematical framework for improved weight initialization of neural networks using Lagrange multipliers. Neural Networks, 166, 579–594. https://doi.org/10.1016/j.neunet.2023.07.035
  10. Harumy, T., Ginting, D. S. Br. (2021). Neural Network Enhancement Forecast of Dengue Fever Outbreaks in Coastal Region. Journal of Physics: Conference Series, 1898 (1), 012027. https://doi.org/10.1088/1742-6596/1898/1/012027
  11. Harumy, T. H. F., Zarlis, M., Lydia, M. S., Efendi, S. (2023). A novel approach to the development of neural network architecture based on metaheuristic protis approach. Eastern-European Journal of Enterprise Technologies, 4 (4 (124)), 46–59. https://doi.org/10.15587/1729-4061.2023.281986
  12. Ghallabi, F., Souissi, B., Du, A. M., Ali, S. (2025). ESG stock markets and clean energy prices prediction: Insights from advanced machine learning. International Review of Financial Analysis, 97, 103889. https://doi.org/10.1016/j.irfa.2024.103889
  13. Xu, F., Tan, S. (2021). Deep learning with multiple scale attention and direction regularization for asset price prediction. Expert Systems with Applications, 186, 115796. https://doi.org/10.1016/j.eswa.2021.115796
  14. Abolmakarem, S., Abdi, F., Khalili-Damghani, K., Didehkhani, H. (2022). A Multi-Stage Machine Learning Approach for Stock Price Prediction: Engineered and Derivative Indices. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4074883
  15. Pan, S., Long, S., Wang, Y., Xie, Y. (2023). Nonlinear asset pricing in Chinese stock market: A deep learning approach. International Review of Financial Analysis, 87, 102627. https://doi.org/10.1016/j.irfa.2023.102627
  16. Gülmez, B. (2025). GA-Attention-Fuzzy-Stock-Net: An optimized neuro-fuzzy system for stock market price prediction with genetic algorithm and attention mechanism. Heliyon, 11 (3), e42393. https://doi.org/10.1016/j.heliyon.2025.e42393
  17. Islam, B. ul, Ahmed, S. F. (2022). Short-Term Electrical Load Demand Forecasting Based on LSTM and RNN Deep Neural Networks. Mathematical Problems in Engineering, 2022, 1–10. https://doi.org/10.1155/2022/2316474
  18. Rahman, N. H. A., Yin, C. H., Zulkafli, H. S. (2024). Activation functions performance in multilayer perceptron for time series forecasting. Proceedings of the 38th International Conference of the Polymer Processing Society (PPS-38), 3158, 070001. https://doi.org/10.1063/5.0223864
  19. Sinanc Terzi, D. (2024). Effect of different weight initialization strategies on transfer learning for plant disease detection. Plant Pathology, 73 (9), 2325–2343. https://doi.org/10.1111/ppa.13997
  20. Srivastava, G., Vashisth, S., Dhall, I., Saraswat, S. (2020). Behavior Analysis of a Deep Feedforward Neural Network by Varying the Weight Initialization Methods. Smart Innovations in Communication and Computational Sciences, 167–175. https://doi.org/10.1007/978-981-15-5345-5_15
  21. Alhlffee, M. H. B., Ahmad Abuirbaiha, R. A. (2024). The Effects of Dropout and Weight Initialization on Human Face Classification Accuracy Using Multiple-agent Generative Adversarial Network. 2024 7th International Conference on Information and Computer Technologies (ICICT), 271–276. https://doi.org/10.1109/icict62343.2024.00050
  22. Rajaraman, S., Zamzmi, G., Yang, F., Liang, Z., Xue, Z., Antani, S. (2024). Uncovering the effects of model initialization on deep model generalization: A study with adult and pediatric chest X-ray images. PLOS Digital Health, 3 (1), e0000286. https://doi.org/10.1371/journal.pdig.0000286
  23. Berghout, T., Bentrcia, T., Lim, W. H., Benbouzid, M. (2023). A Neural Network Weights Initialization Approach for Diagnosing Real Aircraft Engine Inter-Shaft Bearing Faults. Machines, 11 (12), 1089. https://doi.org/10.3390/machines11121089
  24. Boulila, W., Alshanqiti, E., Alzahem, A., Koubaa, A., Mlaiki, N. (2024). An effective weight initialization method for deep learning: Application to satellite image classification. Expert Systems with Applications, 254, 124344. https://doi.org/10.1016/j.eswa.2024.124344
  25. Wong, K., Dornberger, R., Hanne, T. (2022). An analysis of weight initialization methods in connection with different activation functions for feedforward neural networks. Evolutionary Intelligence, 17 (3), 2081–2089. https://doi.org/10.1007/s12065-022-00795-y
  26. Mojtahedi, F. F., Yousefpour, N., Chow, S. H., Cassidy, M. (2025). Deep Learning for Time Series Forecasting: Review and Applications in Geotechnics and Geosciences. Archives of Computational Methods in Engineering. https://doi.org/10.1007/s11831-025-10244-5
  27. Mienye, I. D., Swart, T. G., Obaido, G. (2024). Recurrent Neural Networks: A Comprehensive Review of Architectures, Variants, and Applications. Information, 15 (9), 517. https://doi.org/10.3390/info15090517
  28. Huang, H., Wang, Z., Liao, Y., Gao, W., Lai, C., Wu, X., Zeng, Z. (2024). Improving the explainability of CNN-LSTM-based flood prediction with integrating SHAP technique. Ecological Informatics, 84, 102904. https://doi.org/10.1016/j.ecoinf.2024.102904
  29. Sanz-Cruzado, J., Droukas, N., McCreadie, R. (2024). FAR-Trans: An Investment Dataset for Financial Asset Recommendation. IJCAI-2024 Workshop on Recommender Systems in Finance (Fin-RecSys). https://doi.org/10.48550/arXiv.2407.08692
Evaluating the impact of weight initialization on recurrent and transformer-based models in financial asset price prediction

Downloads

Published

2025-08-30

How to Cite

Andri, A., Harumy, T. H. F., & Efendi, S. (2025). Evaluating the impact of weight initialization on recurrent and transformer-based models in financial asset price prediction. Eastern-European Journal of Enterprise Technologies, 4(4 (136), 19–31. https://doi.org/10.15587/1729-4061.2025.326913

Issue

Section

Mathematics and Cybernetics - applied aspects