Interpretable Sequence Modeling of Educational Behavior using Temporal Attention Networks
DOI:
https://doi.org/10.26713/cma.v17i1.3339
Keywords:
Dynamic Temporal Attention Network (DTAN), Educational data mining, Learning analytics, Temporal Convolutional Networks (TCN), Attention mechanisms, Sequence modeling, Student performance prediction
Abstract
The growing popularity of online learning platforms has produced vast amounts of sequential data on learner behavior. Existing predictive models tend to rely on static features and cannot capture dynamic temporal changes in learning activity, which limits both their predictive effectiveness and their interpretability. This work proposes the Dynamic Temporal Attention Network (DTAN), a deep learning architecture that models e-learning behavior using time-aware attention and temporal convolution to improve predictive accuracy and interpretability. DTAN combines a Temporal Convolutional Network (TCN) with two attention modules: Attention-over-Time Windows (ATW) and the Dynamic Contextual Attention Mechanism (DCAM). With these components, the model can learn short- and long-term dependencies in learner behavior and adaptively prioritize critical time slots. The model is trained and evaluated on two large-scale datasets: EdNet, containing over 130 million question-and-answer interactions in the K-12 context, and OULAD, an exam-taking dataset in the university context. DTAN outperforms state-of-the-art models, including long short-term memory (LSTM) networks, gated recurrent units (GRU), and standard TCNs, achieving significant improvements across diverse classification tasks. It also shows strong early-prediction performance and provides interpretable attention-weight visualizations that highlight critical events in a learner's evolving journey. Such insights are essential for individualized instruction and timely intervention. DTAN thus offers an interpretable solution for sequential data modeling in education, significantly improving efficiency, particularly in adaptive learning systems.
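The abstract does not specify how ATW or DCAM are defined, so the following is only a minimal sketch of the general idea it describes: a causal dilated convolution (the building block of a TCN) followed by softmax attention over time steps, whose weights can be inspected to see which time slots dominate. It is not the authors' implementation. The function names, the kernel values, and the toy `activity` sequence are all hypothetical, and scoring each time step by its own feature value is a simplifying assumption.

```python
import math


def causal_dilated_conv(x, kernel, dilation):
    """Causal 1-D convolution: out[t] uses only x[t], x[t-d], x[t-2d], ..."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # positions before the start act as zero padding
                acc += w * x[idx]
        out.append(acc)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features):
    """Assumed scoring rule: each time step is scored by its own feature value.

    Returns the attention weights (one per time step) and the weighted summary.
    """
    weights = softmax(features)
    pooled = sum(w * f for w, f in zip(weights, features))
    return weights, pooled


# Hypothetical per-step activity counts for one learner.
activity = [0, 1, 0, 0, 5, 4, 0, 1]

features = causal_dilated_conv(activity, kernel=[0.6, 0.4], dilation=2)
weights, summary = attention_pool(features)

# Index of the time step that receives the largest attention weight.
peak_step = max(range(len(weights)), key=weights.__getitem__)
print("peak time step:", peak_step)
```

Because the weights sum to one, they can be plotted directly over the time axis, which is the kind of attention-weight visualization the abstract refers to. A real model would learn both the convolution kernels and the attention scoring function from data.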
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a CCAL that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.



