June 2024 - March 2025, Master’s Thesis
MSc in Human-Computer Interaction & Research Assistant @ German Aerospace Center (DLR)

Introduction

This project investigates how mental workload affects remote operators managing autonomous vehicle fleets. In these systems, operators must continuously monitor and intervene, making workload a critical factor for safety and performance.

The goal was to explore whether eye-tracking data can be used to assess and predict workload in real time, enabling more adaptive and intelligent interfaces.

For a deeper dive, see the related publication or my thesis.

User study

The analysis is based on data from a controlled user study adopting a 2×2 within-subject design, where workload was manipulated through task difficulty (easy vs. hard) and task presentation frequency (slow vs. fast). Participants performed remote fleet management tasks across all experimental conditions.

For each condition, subjective workload was measured using NASA-TLX, while performance metrics and eye-tracking data were continuously recorded.
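As a rough illustration of the subjective measure, the snippet below computes a Raw TLX score, i.e. the unweighted mean of the six NASA-TLX subscales on a 0-100 scale. Whether the study used the raw or the weighted variant is not stated here, so treat this as an assumption; the ratings are made-up example values, not study data.

```python
from statistics import mean

# The six standard NASA-TLX subscales.
SUBSCALES = ("mental", "physical", "temporal",
             "performance", "effort", "frustration")


def raw_tlx(ratings):
    """Return the Raw TLX score: the unweighted mean of the six subscales."""
    missing = [s for s in SUBSCALES if s not in ratings]
    if missing:
        raise ValueError(f"missing subscales: {missing}")
    return mean(ratings[s] for s in SUBSCALES)


# Hypothetical ratings for one participant in two conditions (illustrative only).
easy_slow = {"mental": 30, "physical": 10, "temporal": 25,
             "performance": 20, "effort": 35, "frustration": 15}
hard_fast = {"mental": 80, "physical": 20, "temporal": 85,
             "performance": 55, "effort": 75, "frustration": 60}

print(raw_tlx(easy_slow), raw_tlx(hard_fast))
```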

The interaction took place within a single interface structured into five functional regions—Ticket, Description, Map, Diagnostics, and Actions—which were later used as Areas of Interest for the spatial analysis.

Remote assistance interface with map, diagnostics, incident description, and actions
Same interface with Areas of Interest regions highlighted
Eye-tracking experiment setup with monitor and fiducial markers
Participant wearing eye-tracking glasses at the workstation

Analysis

The analysis follows a structured pipeline combining statistical and machine learning approaches.

After data cleaning and synchronization, AoI-based features were extracted to capture different aspects of visual behavior, including fixation dynamics, visit patterns, and spatial entropy measures. These metrics enabled a spatially grounded analysis of how attention is distributed across interface regions.
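To make the entropy measures more concrete, here is a minimal sketch of two common gaze entropy metrics computed from a sequence of fixated AoIs. It assumes the usual first-order definitions: stationary entropy over the share of fixations per AoI, and transition entropy over AoI-to-AoI transition probabilities. The exact formulation used in the thesis may differ, and the function names are illustrative.

```python
from collections import Counter
from math import log2


def stationary_entropy(aoi_sequence):
    """Shannon entropy (bits) of how fixations are distributed over AoIs.

    The stationary distribution is approximated by the observed
    proportion of fixations falling in each AoI.
    """
    total = len(aoi_sequence)
    if total == 0:
        return 0.0
    counts = Counter(aoi_sequence)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def transition_entropy(aoi_sequence):
    """Entropy (bits) of first-order transitions between consecutive fixations.

    Each source AoI contributes the entropy of its outgoing transitions,
    weighted by how often it occurs as a transition source.
    """
    n_transitions = len(aoi_sequence) - 1
    if n_transitions < 1:
        return 0.0
    pair_counts = Counter(zip(aoi_sequence, aoi_sequence[1:]))
    source_counts = Counter(aoi_sequence[:-1])
    entropy = 0.0
    for (src, _dst), count in pair_counts.items():
        p_src = source_counts[src] / n_transitions
        p_cond = count / source_counts[src]
        entropy -= p_src * p_cond * log2(p_cond)
    return entropy


# Hypothetical fixation sequence using the interface's AoI names.
sequence = ["Ticket", "Description", "Map", "Diagnostics", "Map", "Actions"]
print(stationary_entropy(sequence), transition_entropy(sequence))
```

Higher stationary entropy means attention is spread more evenly across regions; higher transition entropy means the order of visits is less predictable.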

A validation phase was conducted to ensure the robustness of the experimental setup. This included verifying workload manipulation through subjective measures, assessing performance decay under increasing workload, and validating the AoI segmentation using density-based clustering (HDBSCAN).

Inferential analysis was then performed to address the research questions. Differences between low and high workload conditions were assessed using paired t-tests, while the effects of difficulty and frequency were analyzed through factorial repeated-measures ANOVA. In addition, L1-regularized logistic regression was used to identify the most informative interface regions for each metric.
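The paired comparison between low and high workload can be sketched as follows. This is a minimal, dependency-free illustration of the paired t statistic computed on per-participant differences; the values are invented for the example and are not results from the study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(low, high):
    """Return (t, degrees_of_freedom) for paired samples.

    `low` and `high` hold one value per participant, in the same order,
    e.g. mean fixation duration under low and high workload.
    """
    if len(low) != len(high) or len(low) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [h - l for l, h in zip(low, high)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t, len(diffs) - 1


# Made-up metric values for four participants (illustrative only).
low_workload = [1.0, 2.0, 3.0, 4.0]
high_workload = [2.0, 4.0, 3.0, 7.0]

t_stat, dof = paired_t(low_workload, high_workload)
print(f"t({dof}) = {t_stat:.3f}")
```

In practice, `scipy.stats.ttest_rel` returns the same statistic together with a p-value.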

For the predictive analysis, a machine learning pipeline was developed to evaluate the extent to which workload states can be inferred from AoI metrics. Feature selection was performed using correlation filtering and the Boruta algorithm, followed by model evaluation through nested cross-validation with participant-level splits. Multiple models were tested, including Support Vector Classifier, Random Forest, Gradient Boosting, XGBoost, and Multilayer Perceptron, combined with different resampling strategies to address class imbalance.
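The idea behind participant-level splits is that all samples from one participant stay on the same side of every split, so a model is never tested on someone it has already seen. The sketch below shows one way to generate such nested folds in plain Python. The function names, fold counts, and round-robin assignment are assumptions for illustration, not the thesis implementation.

```python
def participant_folds(participant_ids, n_folds):
    """Yield (train_idx, test_idx) so that no participant spans both sides."""
    unique = sorted(set(participant_ids))
    if n_folds < 2 or n_folds > len(unique):
        raise ValueError("n_folds must be between 2 and the participant count")
    # Assign whole participants to folds round-robin.
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique)}
    for fold in range(n_folds):
        train = [i for i, p in enumerate(participant_ids) if fold_of[p] != fold]
        test = [i for i, p in enumerate(participant_ids) if fold_of[p] == fold]
        yield train, test


def nested_participant_folds(participant_ids, outer_folds, inner_folds):
    """Yield (outer_train, outer_test, inner_splits).

    The outer loop estimates generalization; the inner splits, built only
    from outer-training participants, are meant for model and
    hyperparameter selection.
    """
    for outer_train, outer_test in participant_folds(participant_ids, outer_folds):
        train_ids = [participant_ids[i] for i in outer_train]
        inner_splits = [
            # Map positions within the outer-training subset back to row indices.
            ([outer_train[i] for i in tr], [outer_train[i] for i in va])
            for tr, va in participant_folds(train_ids, inner_folds)
        ]
        yield outer_train, outer_test, inner_splits


# Hypothetical dataset: 6 participants with 4 rows each (one per condition).
ids = [f"P{p}" for p in range(1, 7) for _ in range(4)]
for outer_train, outer_test, inner in nested_participant_folds(ids, 3, 2):
    held_out = sorted({ids[i] for i in outer_test})
    print(held_out, len(outer_train), len(inner))
```

With scikit-learn, `GroupKFold` provides the same grouping guarantee when participant IDs are passed as `groups`.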

HDBSCAN gaze clusters overlaid on interface Areas of Interest
Stationary entropy and fixation duration by Easy Slow vs Hard Fast
Time-to-first fixation on Actions by difficulty and by frequency
Visit frequency for Ticket AoI by difficulty and by frequency
Confusion matrices for workload classification models

Results

The results show that mental workload significantly affects ocular behavior. Visual attention patterns change systematically under higher workload conditions, with operators allocating less attention to non-critical interface regions and adapting their visual strategies to manage increasing task complexity. Task difficulty was found to exert a broader and more consistent influence across metrics, while task frequency primarily affected the temporal dynamics of attention.

From a predictive perspective, AoI-based metrics proved effective in estimating workload states. The best-performing models scored approximately 83% in binary workload classification and around 80% for frequency-related states, demonstrating the potential of spatial eye-tracking features as reliable indicators of cognitive load.

Overall, the findings support the use of AoI-based analysis for both explaining and predicting workload, contributing to the development of intelligent interfaces capable of adapting to users’ mental states in real time.