Understanding public engagement through visual education
An eye-tracking study of behavioural decision-making in sustainable energy transitions
Summary
This research explored whether machine learning models could predict shifts in participants’ preferences and attitudes towards energy solutions based on behavioural and self-reported engagement data collected during visual educational interventions. Through four sub-questions, the study evaluated how modelling, data structure, processing, and interpretation contributed to answering the main research question and achieving the project’s broader goals.
The first sub-question examined which machine learning techniques were best suited for identifying patterns in participants’ behaviour and decision-making. An iterative evaluation process showed that a combination of simple and moderately complex models offered the best balance between interpretability and analytical depth, given the small sample size and high feature dimensionality. Logistic regression and shallow random forests were selected for supervised modelling, as they combined strong performance with transparency and enabled meaningful interpretation of the factors contributing to learning outcomes.
In parallel, unsupervised methods such as K-means and hierarchical clustering were applied to reveal behavioural groupings without relying on self-reported labels. This dual approach was justified by the complementary insights it produced. Supervised models supported predictive analysis, while unsupervised models uncovered natural engagement profiles. Explainability and transparency were prioritised throughout to align with the behavioural goals of the study and support the identification of key engagement patterns.
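As an illustration of how unsupervised grouping of this kind works, the following is a minimal K-means sketch in plain Python. The two features, their values, and the choice of k = 2 are illustrative assumptions, not the study's data or configuration, and the deterministic initialisation is a simplification of what a library implementation would do.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

# Hypothetical standardised features per participant:
# (mean fixation duration, mean pupil dilation)
participants = [
    (-1.0, -0.8), (1.1, 0.9), (-0.9, -1.1),
    (0.8, 1.2), (-1.2, -0.7), (1.0, 1.0),
]
labels, centroids = kmeans(participants, k=2)
print(labels)
```

Because the grouping uses only behavioural features, the resulting labels can later be compared with self-reported measures without having been shaped by them.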
The second sub-question addressed the structure and quality of the collected datasets and their impact on modelling. Two complementary sources were used: behavioural eye-tracking data and self-reported questionnaire responses. While both datasets were complete and internally consistent, they posed certain challenges. The eye-tracking data required aggregation to ensure comparability across participants and slides, while some questionnaire items showed limited variation, reducing their value as predictors. Nonetheless, meaningful associations emerged. Reported learning scores were linked to behavioural engagement signals such as pupil dilation, fixation duration, and time spent on specific content. Demographics and attitudinal traits including environmental interest, ethical concern, and study programme also influenced engagement and learning.
These findings supported key project assumptions and informed model design by identifying features with predictive value. Overall, the integration of behavioural and participant-level variables contributed to a more comprehensive understanding of engagement.
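The aggregation of eye-tracking recordings mentioned above can be sketched as follows. This is a minimal example under stated assumptions: the record layout, the field names (`dwell_seconds`, `mean_pupil_mm`, `fixation_share`), and the sample values are hypothetical and are not taken from the study's dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-second samples: (participant, slide, pupil_mm, is_fixating)
samples = [
    ("p01", "solar", 3.1, True),
    ("p01", "solar", 3.3, True),
    ("p01", "solar", 3.2, False),
    ("p01", "nuclear", 3.6, True),
    ("p02", "solar", 2.9, False),
    ("p02", "solar", 3.0, True),
]

def aggregate(samples):
    """Collapse per-second samples into one row per (participant, slide)."""
    groups = defaultdict(list)
    for participant, slide, pupil, fixating in samples:
        groups[(participant, slide)].append((pupil, fixating))
    rows = {}
    for key, values in groups.items():
        rows[key] = {
            "dwell_seconds": len(values),  # assumes one sample per second
            "mean_pupil_mm": round(mean(p for p, _ in values), 3),
            "fixation_share": round(sum(f for _, f in values) / len(values), 3),
        }
    return rows

features = aggregate(samples)
print(features[("p01", "solar")])
```

Summarising to one row per participant and slide makes recordings of different lengths comparable, which is the purpose the aggregation serves here.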
The third sub-question explored the preprocessing techniques required to transform raw data into modelling-ready formats. This stage proved foundational to the project’s success. A step-by-step process was used to extract behavioural patterns and align them with participant data. The first iteration created a detailed but complex dataset by summarising second-by-second recordings and integrating cleaned questionnaire responses. The second iteration improved on this with two strategies: selective feature refinement through forward selection, and rule-based engagement score construction. Both approaches reduced dimensionality while retaining behavioural depth and theoretical relevance. Ultimately, this phase enabled the translation of complex signals into structured inputs, supporting model accuracy and interpretability.
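A rule-based engagement score of the kind described above could look like the sketch below. The thresholds, feature names, and the simple rule count are assumptions made for illustration; the summary does not state the rules the study actually used.

```python
# Hypothetical thresholds; the study's actual rules are not given in this summary.
RULES = {
    "mean_pupil_mm": 3.2,    # at or above: treated as heightened arousal
    "fixation_share": 0.6,   # at or above: treated as sustained attention
    "dwell_seconds": 20,     # at or above: treated as extended viewing
}

def engagement_score(slide_features, rules=RULES):
    """Count how many rules a slide-level feature row satisfies (0..len(rules))."""
    return sum(
        1 for name, threshold in rules.items()
        if slide_features.get(name, 0) >= threshold
    )

row = {"mean_pupil_mm": 3.4, "fixation_share": 0.55, "dwell_seconds": 25}
print(engagement_score(row))  # two of the three rules are met
```

Replacing several raw signals with one score per slide is one way such a construction reduces dimensionality while keeping each rule readable.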
The second modelling iteration achieved the project’s data mining goals. Using a refined dataset created through feature selection (Approach 1), a multinomial logistic regression model predicted reported learning levels with high accuracy and macro F1-score. This confirmed that a compact behavioural and participant feature set could model variation in outcomes.
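For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so each learning level counts equally regardless of how many participants reported it. The sketch below computes it in plain Python; the labels are hypothetical and do not reproduce the study's results.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical reported learning levels and model predictions
y_true = ["low", "medium", "high", "high", "medium", "low"]
y_pred = ["low", "medium", "high", "medium", "medium", "low"]
print(round(macro_f1(y_true, y_pred), 3))
```

With small samples and unevenly populated classes, reporting macro F1 alongside accuracy guards against a model that scores well only on the most common level.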
Meanwhile, hierarchical clustering based on rule-based engagement scores (Approach 2) identified two behavioural clusters. Although these were not significantly associated with learning scores, they reflected meaningful differences in engagement styles, especially across content types.
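One way to check whether cluster membership is associated with learning scores is a contingency-table test. The summary does not name the test that was used, so the Pearson chi-square statistic and Cramér's V below are assumptions chosen for illustration, and the labels are hypothetical.

```python
from collections import Counter
import math

def chi_square(clusters, outcomes):
    """Pearson chi-square statistic, degrees of freedom, and Cramér's V."""
    n = len(clusters)
    observed = Counter(zip(clusters, outcomes))
    row_totals = Counter(clusters)
    col_totals = Counter(outcomes)
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    smaller_dim = min(len(row_totals), len(col_totals)) - 1
    cramers_v = math.sqrt(stat / (n * smaller_dim)) if smaller_dim else 0.0
    return stat, dof, cramers_v

# Hypothetical cluster labels and reported learning levels
clusters = ["A", "A", "A", "A", "B", "B", "B", "B"]
learning = ["low", "high", "low", "high", "high", "low", "high", "low"]
stat, dof, v = chi_square(clusters, learning)
print(stat, dof, v)
```

In this toy example the learning levels are spread evenly across both clusters, so the statistic is zero. With small samples the expected cell counts are often below five, in which case an exact test is usually preferred over the chi-square approximation.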
The modelling results supported most core assumptions. Behavioural indicators of engagement, especially pupil dilation, fixation duration, and gaze metrics, were among the strongest predictors in the supervised model, particularly on slides covering nuclear, ethical, and solar content. Participant characteristics including age group and study year had moderate predictive value, supporting the assumption that user background shapes learning. Attitudinal traits such as post-session preferences contributed less. Slide order had minimal impact and did not meaningfully affect predictions. Participants with similar self-reported energy preferences were often assigned to different behavioural clusters, suggesting that eye-tracking captured engagement patterns not fully reflected in survey responses.
Overall, the project successfully demonstrated that machine learning models can predict learning outcomes and uncover engagement patterns by combining behavioural and participant-level data. The findings confirmed that content-specific engagement signals, alongside selected background features, were key to modelling success. These results offer valuable insight into how users process educational material and highlight the potential of data-driven approaches for understanding and supporting sustainable learning interventions.
| Organisation | |
| Programme | |
| Department | |
| Partner | HZ University of Applied Sciences |
| Date | 2025-08-31 |
| Type | |
| Language | English |