Interpretable Sequence Modeling of Educational Behavior using Temporal Attention Networks

Rakan Saad Alotaibi; Fahad Mazyed Alotaibi; Sameer Abdullah Nooh; Abdulaziz A. Alsulami

doi:10.26713/cma.v17i1.3339

Authors

Rakan Saad Alotaibi, Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
Fahad Mazyed Alotaibi, Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
Sameer Abdullah Nooh, Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
Abdulaziz A. Alsulami, Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia

DOI:

https://doi.org/10.26713/cma.v17i1.3339

Keywords:

Dynamic Temporal Attention Network (DTAN), Educational data mining, Learning analytics, Temporal Convolutional Networks (TCN), Attention mechanisms, Sequence modeling, Student performance prediction

Abstract

The increasing popularity of online learning platforms has produced vast amounts of sequential data on learner behavior. Existing predictive models tend to rely on static features and cannot capture dynamic temporal changes in learning activity, which reduces their predictive effectiveness and makes them harder to interpret. This work proposes the Dynamic Temporal Attention Network (DTAN), a deep learning architecture that models e-learning behavior using time-aware attention and temporal convolution to improve both predictive accuracy and interpretability. DTAN combines a Temporal Convolutional Network (TCN) with two attention modules: Attention-over-Time Windows (ATW) and the Dynamic Contextual Attention Mechanism (DCAM). With these components, the model can learn short- and long-term dependencies in the learner’s behavior and adaptively prioritize critical time slots. The model is trained and evaluated on two large-scale datasets: EdNet, containing over 130 million question-and-answer interactions in the K-12 context, and OULAD, an exam-taking dataset in the university context. DTAN outperforms state-of-the-art models, including long short-term memory (LSTM) networks, gated recurrent units (GRU), and standard TCNs, achieving significant improvements across diverse classification tasks. It also shows strong early-prediction performance and provides an interpretable visualization of attention weights that highlights critical events in a learner’s evolving journey. Such observations are essential for providing individualized instruction and necessary intervention. DTAN thus delivers an interpretable solution for sequential data modeling in education, significantly boosting efficiency, particularly in adaptive learning systems.
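
As a rough illustration of the two ingredients named in the abstract, the following Python sketch applies a causal dilated convolution (the basic building block of a TCN) to a toy activity sequence and then pools the result with softmax attention over time steps. It is not the DTAN implementation: the function names, the scalar `query_weight`, the kernel, and the toy data are all assumptions made for this example, and the ATW and DCAM modules themselves are not reproduced here.

```python
import math


def causal_dilated_conv(seq, kernel, dilation):
    """1-D causal convolution: out[t] uses only seq[t], seq[t - d], seq[t - 2d], ..."""
    out = []
    for t in range(len(seq)):
        total = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # positions before the sequence start count as zero
                total += w * seq[idx]
        out.append(total)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def attention_pool(features, query_weight):
    """Score each time step, normalise the scores, and return (weights, weighted sum).

    query_weight is a hypothetical stand-in for a learned scoring parameter.
    """
    scores = [query_weight * f for f in features]
    weights = softmax(scores)
    pooled = sum(w * f for w, f in zip(weights, features))
    return weights, pooled


# Toy per-step activity counts for one learner (made-up values).
activity = [0.0, 1.0, 0.0, 0.0, 3.0, 0.0]

features = causal_dilated_conv(activity, kernel=[0.5, 0.5], dilation=2)
weights, pooled = attention_pool(features, query_weight=1.0)

# The attention weights form a distribution over time steps; the largest
# weight marks the step the pooled summary depends on most.
most_attended_step = weights.index(max(weights))
```

In this toy setting the weights can be read directly as a per-step importance profile, which is the kind of inspection the abstract refers to when it mentions interpretable visualization; a trained model would learn the kernel and scoring parameters rather than fixing them by hand.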
