Playing for cognition: investigating the feasibility and user experience of a virtual reality serious game for cognitive assessment in children with congenital heart disease

Background In order to facilitate the development and implementation of innovative technology in clinical practice, it is important to understand the user experience of end-users. Virtual Reality (VR) offers the possibility to assess cognitive functioning in a dynamic environment that simulates real-world situations. The purpose of this cross-sectional study was to investigate the feasibility of a VR Serious Game for cognitive assessment in school-aged children with congenital heart disease (CHD). The sub-aims were two-fold: (1) to objectively evaluate the feasibility of the VR Serious Game in children with CHD in comparison to typically developing (TD) children and (2) to explore the user experience of both groups following their interaction with the VR Serious Game.

Results A total of 101 children participated in this study (CHD: n = 54; TD: n = 47); 98 children were included in the final analysis (CHD: n = 51; TD: n = 47). The VR Serious Game appeared feasible for both children with CHD and TD children, with 88% of children completing the innovative VR assessment without encountering any issues. There were no discernible differences in completion rates between groups. Children with CHD reported significantly lower scores than TD children on three user experience scales: Engagement, Flow and Presence. Nonetheless, the scores for Engagement and Flow were still considered "moderate to good". Both groups reported minimal adverse physiological reactions.

Conclusions The findings suggested that the VR Serious Game was feasible for children with CHD and that the user experience was positive. Future research should investigate the effectiveness of the VR Serious Game compared with a conventional or digital neuropsychological assessment, prioritising the development of novel outcome measures that can better estimate and explain the impact of cognitive impairment on daily functioning.


Background
Children with congenital heart disease (CHD) who undergo early-life cardiac surgery are at risk of cognitive impairment, particularly in executive functions (EFs) [1][2][3][4][5][6][7]. This can contribute to a developmental cascade, hindering academic achievement and psychosocial outcomes, whilst exacerbating long-term restrictions in daily life [5,6,8]. Cognitive impairment may be latent in early development, emerging as children face increased academic demands, a phenomenon referred to as "growing into deficits" [9]. This underscores the importance of assessing cognitive functioning in school-age children to better predict their capabilities in daily life.
Cognitive functioning is typically assessed with a conventional neuropsychological assessment (NPA; i.e., paper-and-pencil tests), which aims to estimate a child's maximum cognitive capacity. The NPA is administered in a quiet room with minimal external distractions, devoid of the complexities of real-life situations [10]. As a result, clinicians face challenges in estimating and explaining the impact of cognitive impairment on daily functioning and in providing meaningful recommendations for daily life [11,12].
To address these limitations, Virtual Reality (VR) has emerged as a promising tool for more sensitive and ecologically valid cognitive assessment than conventional NPA [13,14]. VR encompasses computer-generated interactive virtual environments that simulate the real world, ranging from low-immersive VR with a limited field of view (e.g., 'window on the world' 2D displays) to high-immersive VR with a 360° field of view (e.g., head-mounted 3D displays [HMDs]; [15]).
Serious Gaming applies game elements, such as goal setting, narrative, feedback and rewards, to tasks in non-entertainment settings [16]. By combining VR game performance with NPA test results, clinicians can obtain isolated scores per cognitive function as well as clinically relevant and objective scores that reflect cognitive functioning in real-life situations [17]. Ultimately, this approach may offer valuable insights into a child's cognitive capabilities, bridging the gap between a child's cognition at the level of functions and at the level of activities [18].
VR simulations have demonstrated feasibility in various paediatric clinical populations, including acquired brain injury (see [19][20][21]) and attention deficit hyperactivity disorder (see [20]). However, there is a paucity of research evaluating the feasibility of Serious Gaming in children with CHD, despite a clear clinical need. Therefore, the overarching aim of the present study was to evaluate the feasibility and user experience of employing a VR Serious Game for cognitive assessment in school-aged children with CHD, compared to typically developing (TD) children. Establishing feasibility (i.e., the ability to complete the assessment) is a necessary step for implementing technology in clinical practice. Prioritising user-friendliness enhances the feasibility of technology, increasing the likelihood of successful implementation in clinical practice [22].

Participants
Children with severe CHD, demanding surgical intervention within their first year of life [23], were consecutively enrolled from the cardiology departments of both the Wilhelmina Children's Hospital in Utrecht and the University Medical Centre in Groningen, the Netherlands, under the supervision of paediatric cardiologists. In addition, recruitment of children with CHD was facilitated through announcements on the Hartekind Foundation (Stichting Hartekind) website. Eligible children met the following criteria: (1) aged between 10 and 13 years; (2) voluntary participation; and (3) informed consent provided by guardians, as well as by children aged 12 years and older. This age range was chosen because clinical follow-up extended only up to eight years of age and there was a need for follow-up at age 10 and beyond. Following the 'growing into deficits' principle, combined with the need to ensure a homogeneous group, the upper age limit was set at 13 years, since environmental demands change especially during this age range.
TD children were recruited by contacting schools and (sport) associations in the Brabant, Groningen, and Utrecht regions of the Netherlands. Inclusion criteria were: (1) aged between 10 and 13 years; (2) no history of neurological or psychiatric disease; (3) voluntary participation; and (4) informed consent provided by guardians, as well as by children aged 12 years and older.
For both groups, children diagnosed with epilepsy were excluded due to the risk of photosensitivity and reflex seizures [24]. The inclusion period spanned from September 2021 to December 2022.
The present study received ethical approval from the Faculty Ethics Review Board of the Faculty of Social and Behavioural Sciences at Utrecht University (UU; protocol numbers 20-648, 21-0136, 21-0369, and 22-0058) and the Medical Research Ethics Committee (protocol number 20-693/C). The study protocol adhered to the Declaration of Helsinki, as well as the requirements of UU and the Faculty Ethics Review Board.

Procedure, tests, and outcome measures
Children with CHD were identified through examination of the electronic patient database at participating hospitals by paediatric cardiologists. Eligible children and their guardians were provided with a brief study overview during their hospital appointments by their cardiologist. If guardians expressed interest, a researcher from UU initiated contact, sharing the information letter and consent form through either email or post. In situations where no hospital appointment was scheduled, researchers from UU contacted the guardians by telephone to provide them with study information. For children who registered on the Hartekind Foundation (Stichting Hartekind) website, their eligibility was determined by a UU researcher. Subsequently, an information letter and consent form were sent to the guardians. Guardians were required to respond to schedule an appointment, and UU researchers followed up after one week.
TD children were recruited by contacting schools and (sport) associations in the Brabant, Groningen, and Utrecht regions of the Netherlands. Researchers from UU initiated contact by sending an email containing an information letter and an informed consent form. Schools and (sport) associations were required to respond to schedule an appointment, with UU researchers following up after one week.
The informed consent, signed by guardians and children aged 12 years and older, could either be emailed or brought in person to the appointment.
For children with CHD, the test protocol comprised a single 90-min session conducted at participating hospitals. The whole session included a conventional NPA and a VR Serious Game. The session was divided into two blocks, with a short break in between the two blocks. Additional breaks were accommodated as needed. The assessment took place in a quiet room with minimal external distractions and was administered by a trained researcher. The sequence of cognitive assessments (conventional NPA [A; approximately 60 min] and innovative VR Serious Game [B; approximately 30 min]) was counterbalanced, alternating between A and B blocks of tasks. The data from both the NPA and the VR Serious Game were used for the current aims. The innovative VR Serious Game was counterbalanced and divided into two 'quests' (i.e., 1 and 2). In the first quest, the following minigames were administered in this order: Alien Outpost, Nautilus Underwater Memory, Mystical Lake, and Galactic Diamond Belt. The second quest comprised the following minigames in this order: Mystical Lake, Galactic Diamond Belt, Alien Outpost, and Nautilus Underwater Memory. All children interacted with the VR Serious Game whilst seated.
The researcher helped the children with fitting the VR goggles to their heads and gave consecutive instructions per game. This was repeated after the block of NPA for the children with CHD. The virtual environment was streamed to a second device, so the researcher could observe what the children were presented with. For TD children, the test protocol (innovative VR Serious Game) involved a single 30-min session conducted at participating hospitals, schools and (sport) associations. TD children were assessed in groups of two to four at the same time. Again, the researcher helped the TD children with fitting the VR goggles to their heads and gave consecutive instructions per game.
Relevant clinical information for the children with CHD was gathered from the electronic patient database. This information included diagnosis (e.g., Atrioventricular Septal Defect, Hypoplastic Left Heart Syndrome, Tetralogy of Fallot, Transposition of the Great Arteries, and Tricuspid Atresia), the age at the time of their first surgery (in days), as well as the total number of surgeries they had undergone before the age of six months.
Prior to assessment, children reported their daily gaming hours. Age and sex of each child were recorded. Following the assessment, children completed a questionnaire to evaluate their VR user experience (see [12]).

VR serious game
Apparatus The Serious Game 'Koji's Quest', developed by NeuroReality [25], was employed. The Serious Game was run on Oculus Meta Quest 2 head-mounted display VR glasses (immersive 360-degree field of view and a fast-switch LCD display with 1832 × 1920 resolution per eye). Children interacted with the VR Serious Game using touch controllers.
Divided attention Alien Outpost aimed to assess children's divided attention using two four-minute paradigms. In the first paradigm, children had to determine whether an alien appeared in a central spaceship within a space-themed setting (i.e., central attention). They responded by pressing either a green button (alien present) or a red button (no alien). In the second paradigm, an alien always appeared in the central spaceship, but children had to discern whether the alien looked "happy" or "sad" (i.e., central attention). They responded by pressing either a green button ("happy-looking" alien) or a red button ("sad-looking" alien). In both paradigms, children were also required to identify an alien in the surrounding spaceships by selecting the spaceship where the alien appeared (i.e., peripheral attention). The central flash duration (i.e., alien display duration in the central spaceship) was adjusted based on difficulty level, whilst the peripheral flash duration (i.e., alien display duration in the surrounding spaceships) remained fixed. In addition, the number of surrounding spaceships and distractor stimuli were adjusted based on difficulty level. Accuracy and difficulty levels for both central and peripheral attention were recorded separately.
Executive functions Mystical Lake aimed to assess children's EFs using three paradigms, lasting three, three, and four minutes, respectively. In the first paradigm, children were presented with individual fish jumping out of the water, and their task was to quickly identify the correct food associated with each fish species. Children had to learn and remember three different specific food-fish associations. The second paradigm challenged children with multiple fish jumping out of the water simultaneously, each at a different pace. In addition to recalling the correct food-fish associations, children had to prioritise the order in which the fish should be fed. In the third paradigm, multi-tasking was introduced. Children had to monitor water levels in fishbowls, which gradually decreased over time. To prevent the fish from going hungry, children had to continuously refill the fishbowls whilst also ensuring they fed both the fast- and slow-moving fish. The number of fish jumping out of the water simultaneously, their jumping speed, as well as the rate at which fishbowls ran out of water, were dependent on difficulty level.
Selective attention Galactic Diamond Belt aimed to measure children's selective attention using a single four-minute visual search paradigm. In this task, children were required to select a single target (i.e., a gem of a specific colour and shape) within a space-themed setting whilst ignoring irrelevant distractors. Distractors, which included gems of other shapes and colours, were randomly scattered across space. The number of distractors, display duration, and speed at which the gems moved were dependent on difficulty level.
Memory Nautilus Underwater Memory aimed to assess children's short-term memory using two three-minute paradigms. In the first paradigm, children were instructed to memorise a sequence of seashells opening and then reproduce the sequence. The sequence was the same across children. In the second paradigm, children were asked to memorise the spatial locations of various objects for a self-determined period (up to 30 s) and then reproduce the object placements. The length of the sequence and the number of objects that children needed to memorise were dependent on difficulty level.
For all mini-games, the starting difficulty level was set at 500. Difficulty levels varied from 100 to 1000 based on a 3-up 1-down staircase.
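The staircase rule can be illustrated with a minimal sketch. It assumes the common reading of "3-up 1-down" in which difficulty rises after three consecutive correct responses and falls after a single error. The step size of 50 and the names `Staircase` and `run` are hypothetical, since the text reports only the starting level (500) and the bounds (100 to 1000); this is not the game's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Staircase:
    """3-up 1-down staircase: harder after 3 correct in a row, easier after 1 error."""
    level: int = 500      # starting difficulty reported in the text
    minimum: int = 100    # lower bound reported in the text
    maximum: int = 1000   # upper bound reported in the text
    step: int = 50        # ASSUMPTION: the step size is not reported
    streak: int = 0       # consecutive correct responses so far

    def update(self, correct: bool) -> int:
        """Register one response and return the difficulty for the next trial."""
        if correct:
            self.streak += 1
            if self.streak == 3:
                self.level = min(self.maximum, self.level + self.step)
                self.streak = 0
        else:
            self.level = max(self.minimum, self.level - self.step)
            self.streak = 0
        return self.level


def run(responses):
    """Return the difficulty level after each response in a sequence."""
    staircase = Staircase()
    return [staircase.update(r) for r in responses]
```

Under these assumptions, `run([True, True, True, False])` raises the level once after the third correct response and lowers it again after the error, and the level never leaves the 100 to 1000 range.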

User experience
The user experience questionnaire [12] consisted of 15 items, comprising five categories: Engagement (i.e., the feeling of active involvement and enjoyment of the content; [12]), Flow (i.e., the mental state characterised by full immersion in an activity, intense focus, and a distorted perception of time; [12]), Presence (i.e., the feeling of full immersion within a virtual environment), Side effects (i.e., adverse physiological reactions, such as nausea), and Transportation (i.e., the feeling of being transported to an alternate world; [12]). Each item was rated on a six-point Likert scale, ranging from negative (0) to positive (5).

Participant characteristics
Demographic characteristics (i.e., age and sex) were compared between groups (CHD versus TD children) using non-parametric tests. For the continuous variable, age, a Mann-Whitney U test was performed, whilst for the categorical variable, sex, a Chi-square test was performed. Furthermore, mean and standard deviation (SD) were determined for the number of surgeries. Median and interquartile range (IQR) were determined for age at first surgery.
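As an illustration of the rank-based comparison used for age, the sketch below computes the Mann-Whitney U statistic in plain Python, with tied values receiving the mean of their ranks. The function names and the example ages are made up for illustration and are not study data; the sketch computes the statistic only and does not derive a p-value, for which a statistics package would normally be used.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b); the test statistic is usually the smaller of the two."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Made-up ages in years, for illustration only (not study data).
ages_chd = [10.2, 11.5, 12.1, 12.8]
ages_td = [10.9, 11.0, 12.5, 13.0]
u_chd, u_td = mann_whitney_u(ages_chd, ages_td)
```

Here `u_chd` counts the pairs in which a child from the first group is older than a child from the second group (ties count as one half), so the two values always sum to the product of the group sizes.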

Feasibility
Feasibility was determined based on the number of children who completed the innovative VR assessment. Logistic regression was conducted to investigate whether groups differed in their rates of completing the innovative VR assessment, with sex as a covariate.

User experience
To evaluate user experience, scale scores were calculated for five categories: Engagement, Flow, Presence, Side effects, and Transportation. Each scale score was derived by summing the responses from three related items within the respective category, resulting in a scale score ranging from 0 to 15. A score of 0 indicated a very negative experience, and a score of 15 indicated a very positive experience. For the Side effects scale, however, a higher score indicated more side effects. One item (item 7: "I felt present in the virtual environment") was missing from the questionnaire; therefore, the scale score for Presence was calculated from two related items. To compare the user experience between CHD and TD children, an ANCOVA was conducted, with sex as a covariate. Post hoc analysis was performed with a Bonferroni adjustment.
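The scoring rule can be sketched as follows. The item-to-scale mapping in `SCALES` is hypothetical: the text states only that each scale has three items and that item 7 belongs to Presence. The sketch also assumes that the two remaining Presence items were simply summed without rescaling, in which case that scale would range from 0 to 10 rather than 0 to 15.

```python
# ASSUMPTION: hypothetical item-to-scale mapping; the real item numbering is
# not given in the text, apart from item 7 belonging to Presence.
SCALES = {
    "Engagement": [1, 2, 3],
    "Flow": [4, 5, 6],
    "Presence": [7, 8, 9],
    "Side effects": [10, 11, 12],
    "Transportation": [13, 14, 15],
}


def scale_scores(responses):
    """Sum the available 0-5 item ratings per scale.

    `responses` maps item number -> rating; items absent from the mapping
    (such as the missing item 7) are simply skipped.
    """
    scores = {}
    for scale, items in SCALES.items():
        ratings = [responses[i] for i in items if i in responses]
        if any(not 0 <= r <= 5 for r in ratings):
            raise ValueError(f"{scale}: ratings must lie between 0 and 5")
        scores[scale] = sum(ratings)
    return scores


# Made-up answers for one child (not study data); item 7 is absent.
example = {i: 4 for i in range(1, 16) if i != 7}
example_scores = scale_scores(example)
```

With every available item rated 4, the three-item scales score 12 and Presence scores 8, which shows why Presence scores are not directly comparable with the other scales under plain summing.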

Results
A total of 101 children participated in this study, comprising 54 children with CHD and 47 TD children. However, three children with CHD were subsequently excluded from the analysis for the following reasons: (1) a diagnosis of CHARGE syndrome, a rare genetic disorder associated with a wide range of health and physical problems (n = 1), and (2) age-related discrepancies (n = 2). The first exclusion was made after extracting relevant clinical information from the electronic patient database, because children with CHARGE syndrome often experience cognitive impairment [26]. The other two children were excluded because their age at the time of testing was 14 years, which exceeded the age originally scheduled for the appointment. Consequently, a total of 98 children were included in the final analysis, consisting of 51 children with CHD and 47 TD children.

Demographic and clinical characteristics
Table 1 presents an overview of the participant characteristics of both groups (i.e., children with CHD and TD children).
Children in both groups spent a similar amount of time playing video games for leisure (CHD: 1.40 h; TD: 1.74 h; U = 254.000, z = -0.107, p = 0.915; see Table 1).

Feasibility
Out of the 98 children, 88% completed the innovative VR assessment without encountering any issues. CHD and TD children exhibited comparable completion rates for the innovative VR assessment (boys: n = 47 [55%]), suggesting that the VR Serious Game was feasible for both CHD and TD children. Group (i.e., children with CHD or TD children; p = 0.389) and sex (i.e., boy or girl; p = 0.127) had no significant effect on the likelihood of participants completing the innovative VR assessment (see Table 2).
Among the children who did not complete the assessment, five children with CHD (42% of non-completions) experienced software malfunctions during the innovative VR assessment, and four TD children (33%) encountered similar software malfunctions. In addition, two children with CHD (17%) displayed signs of disinterest or a short attention span, resulting in incomplete gameplay. In the TD group, one child (8%) reported discomfort due to the size of the VR glasses used.
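The reported percentages can be checked with simple arithmetic. The sketch below assumes that the percentages for individual reasons are expressed relative to all children who did not complete the assessment (12 in total, obtained by adding the reported counts); this denominator is an inference from the numbers rather than something the text states explicitly.

```python
TOTAL_ANALYSED = 98  # children in the final analysis

# Non-completion counts reported in the text, by (group, reason).
non_completions = {
    ("CHD", "software malfunction"): 5,
    ("TD", "software malfunction"): 4,
    ("CHD", "disinterest or short attention span"): 2,
    ("TD", "discomfort from VR glasses"): 1,
}

# ASSUMPTION: these four categories cover every incomplete assessment.
n_incomplete = sum(non_completions.values())
n_complete = TOTAL_ANALYSED - n_incomplete
completion_pct = round(100 * n_complete / TOTAL_ANALYSED)

# Share of each reason among the non-completions, in whole percent.
shares = {key: round(100 * n / n_incomplete) for key, n in non_completions.items()}

# Combined share of software malfunctions across both groups.
software_pct = round(
    100
    * sum(n for (_, reason), n in non_completions.items() if reason == "software malfunction")
    / n_incomplete
)
```

Under this reading, 86 of 98 children completed the assessment, which rounds to the reported 88%, and software malfunctions account for 9 of the 12 incomplete assessments.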

Discussion
In order to facilitate the development and implementation of innovative technology (i.e., a new approach that has emerged in the published literature over the last two decades) in clinical practice, it is necessary to understand the user experience of end-users (i.e., children with CHD). The overarching aim of the present study was to evaluate the feasibility of a VR Serious Game for cognitive assessment in school-aged children with CHD. The sub-aims were two-fold: (1) to objectively evaluate the feasibility of the VR Serious Game in children with CHD compared to TD children, and (2) to explore the user experience of both groups after their interaction with the VR Serious Game. The findings indicated that the VR Serious Game was feasible for both children with CHD and TD children, with 88% of children completing the innovative VR assessment without encountering any issues. Notably, there were no discernible differences in completion rates between the CHD and TD groups or between male and female participants. This finding is of clinical relevance as it suggests that the VR Serious Game may also be feasible for cognitive training. The primary factor contributing to unsuccessful completion was software malfunction (75% of incomplete assessments). It is important to note that children with CHD also performed the NPA tests in between blocks of VR assessment. This indicates that the VR Serious Game may be applicable as part of a larger clinical (cognitive) assessment. Therefore, the application of innovative VR appears to hold promise for various paediatric clinical populations [19][20][21].
Furthermore, children with CHD reported significantly lower scores for Engagement, Flow and Presence compared to TD children. However, the scores for Engagement and Flow were still considered "moderate to good" (greater than eight; [21]). It is important to note that the evaluation of Presence was conducted using only two items, namely "I was part of the environment" and "I felt integrated in the environment", instead of three items, which may have contributed to the lower reported scores. Presence is an important factor for enhancing the VR user experience, particularly when utilising VR Serious Games in clinical practice to better predict children's capabilities in daily life [21]. Therefore, to explore this further, the use of a user experience questionnaire with greater emphasis on presence should be considered.
Enhanced levels of flow and immersion (i.e., presence) have been associated with increased enjoyment and motivation during play [22]. Therefore, innovative VR assessment has the potential to enhance patient compliance and enable more effective cognitive monitoring [29].
Moreover, children with CHD and TD children reported minimal adverse physiological reactions. TD children reported a higher frequency of headaches compared to children with CHD. TD children were assessed in groups of two to four simultaneously; it is therefore possible that they experienced increased external distractions, potentially contributing to the occurrence of headaches.

Strengths and limitations
The present study's strength lies in its extensive inclusion of children (CHD: n = 54; TD: n = 47). The CHD cohort represents a diverse range of CHD diagnoses, encompassing the most commonly reported CHD diagnoses in the Netherlands [30]. In addition, children with CHD were recruited from two Centres of Expertise for CHD, enhancing the sample's representativeness and generalisability to the larger, target population. It is important to highlight that none of the children with CHD attended specialist schools; they were all integrated into mainstream education and did not encounter participation restrictions (e.g., at school and in the community). The majority of children with CHD seen in clinical practice attend mainstream education. It is plausible that children with severe CHD, demanding surgical intervention within their first year of life [23], might encounter fewer enduring negative outcomes compared to children with milder CHD who undergo surgery at a later stage. Consequently, the feasibility and user experience of children with milder CHD operated at a later stage remain uncertain.

Future research
To further investigate the added value of the VR Serious Game, future research should explore the associations between performance in the VR Serious Game and test results from a conventional or digital NPA. Although a digital NPA offers similar advantages to a VR Serious Game, such as precise and standardised stimulus presentation and automated scoring [31], the VR Serious Game has unique advantages. It allows the development of ecologically valid environments (e.g., a dynamic setting with multisensory distractions), which can elicit more natural behaviour [31], thereby providing valuable insights into a child's cognitive capabilities. In addition, task difficulty can be tailored to each individual child (e.g., by adjusting stimuli complexity and flash duration), resulting in a more accurate cognitive assessment.
Furthermore, researchers should explore differences in cognitive performance between children with CHD and TD children. To better estimate and explain the impact of cognitive impairment on daily functioning, researchers should prioritise the development of novel outcome measures. One example is performance stability measures (i.e., the number of fluctuations in test performance; [32]). To enhance the interpretability of test results, it is important to understand how a task was executed. It would be informative to not only identify the number of errors a child makes, but also to pinpoint precisely when these errors occurred. This can provide valuable insights into whether the errors are due to comprehension problems, fatigue, or fluctuations in attention.
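A minimal sketch of what such outcome measures could look like is given below. Both functions are hypothetical illustrations rather than measures used in this study: the trial format (time in seconds plus a correct/incorrect flag), the 60-second window, and the definition of a fluctuation as a change of direction in the difficulty trace are all assumptions.

```python
def errors_per_window(trials, window_s=60.0):
    """Count errors in consecutive time windows.

    `trials` is a list of (time_in_seconds, correct) tuples; the field layout
    and the 60-second default window are illustrative assumptions.
    """
    if not trials:
        return []
    last_time = max(t for t, _ in trials)
    counts = [0] * (int(last_time // window_s) + 1)
    for t, correct in trials:
        if not correct:
            counts[int(t // window_s)] += 1
    return counts


def count_fluctuations(levels):
    """Count direction changes (rise to fall or fall to rise) in a difficulty trace."""
    changes = 0
    previous_direction = 0
    for before, after in zip(levels, levels[1:]):
        direction = (after > before) - (after < before)  # +1, -1, or 0 for no change
        if direction and previous_direction and direction != previous_direction:
            changes += 1
        if direction:
            previous_direction = direction
    return changes
```

An error count that rises across windows would be more suggestive of fatigue, whereas errors concentrated in the first window might point to comprehension problems; such interpretations would need validation before clinical use.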

Conclusions
The VR Serious Game appeared feasible for children with CHD and TD children (88% completion rate). Children with CHD reported significantly lower scores than TD children on three user experience scales: Engagement, Flow and Presence. Nonetheless, the scores for Engagement and Flow were still positive. Presence is an essential factor for enhancing the VR user experience and therefore demands further investigation. In addition, future research should investigate the effectiveness of the VR Serious Game compared with a conventional or digital NPA, prioritising the development of novel outcome measures that can better estimate and explain the impact of cognitive impairment on daily functioning.

Table 1
Demographic and clinical characteristics, and user experience comparison between CHD and TD children

Table 2
Logistic regression predicting likelihood of innovative VR assessment completion based on group (i.e., CHD Children or TD Children) and sex (i.e., Boy or Girl)