Publication: Quantitative and Qualitative Evaluation of Sequence Patterns Found by Application of Different Educational Data Preprocessing Techniques
Article, open access, peer-reviewed, published
Files
Date
Authors
Munk, Michal
Drlik, Martin
Benko, Ľubomír
Reichel, Jaroslav
Journal title
Journal ISSN
Volume title
Publisher
IEEE (Institute of Electrical and Electronics Engineers)
Abstract
Preprocessing educational data from log files is a time-consuming phase of the knowledge discovery process. It consists of data cleaning, user identification, session identification, and path completion. This paper attempts to identify which of these phases are necessary when preprocessing educational data for the subsequent application of learning analytics methods. Since sequential pattern analysis is considered suitable for evaluating discovered knowledge, the paper asks which preprocessing phases have a significant impact on the discovered knowledge in general, and on the quality and quantity of the sequence patterns found in particular. To this end, several data preprocessing techniques for session identification and path completion were applied to prepare log files with different levels of data preprocessing. The results showed that the session identification technique using the reference length, calculated from the sitemap, had a significant impact on the quality of extracted sequence rules. The path completion technique had a significant impact only on the quantity of extracted sequence rules. Together with the results of previous systematic research on educational data preprocessing, these findings can improve the automation of the educational data preprocessing phase and contribute to the development of learning analytics tools suitable for the different groups of stakeholders engaged in educational data mining research.
Description
Keywords
Computational and artificial intelligence, data preprocessing, educational technology, learning, learning systems, sequential analysis, web mining