Beyond Missing Data: Questionnaire Uncertainty Responses as Early Digital Biomarkers of Cognitive Decline and Neurodegenerative Diseases
Identifying preclinical biomarkers of neurodegenerative diseases remains a major challenge in aging research. In this study, we demonstrate that frequent “Don’t know/can’t remember” (DK) responses, often treated as missing data in touchscreen questionnaires, serve as a novel digital behavioral biomarker of early cognitive vulnerability and neurodegenerative disease risk. Using data from 502,234 UK Biobank participants, we stratified individuals based on DK response frequency (0-1, 2-4, 5-7, >7) and observed a robust, dose-dependent association with an increased risk of Alzheimer’s disease (HR = 1.64, 95% CI: 1.26-2.14) and vascular dementia (HR = 1.93, 95% CI: 1.37-2.72), independent of established risk factors. As DK response frequency increased, participants exhibited higher BMI, reduced physical activity, higher smoking rates, and a higher prevalence of chronic diseases, particularly hypertension, diabetes, and depression. High DK responders also showed early neurodegenerative changes, marked by elevated levels of Abeta40, Abeta42, NFL, and pTau-181. Metabolomic analysis further revealed lipid metabolism abnormalities, which may mediate this relationship. Together, these findings reframe DK response patterns as clinically meaningful signals of multidimensional neurobiological alterations, offering a scalable, low-cost, non-invasive tool for early risk identification and prevention at the population level.
💡 Research Summary
This study investigates whether the frequency of “Don’t know / can’t remember” (DK) responses in large‑scale touchscreen questionnaires can serve as a digital behavioral biomarker for early cognitive vulnerability and subsequent risk of neurodegenerative diseases (NDDs). Using data from 502,234 participants in the UK Biobank, the authors categorized individuals into four groups based on DK response counts: 0‑1 (minimal), 2‑4 (low), 5‑7 (moderate), and >7 (high).
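The four-group stratification can be sketched in a few lines of Python. This is only an illustration of the cut-points quoted above, not the authors' code: the function name `dk_group` and the idea of passing a per-participant DK count as a plain integer are assumptions made for the example.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first three groups (0-1, 2-4, 5-7).
# Any count above the last bound falls into the ">7" group.
UPPER_BOUNDS = [1, 4, 7]
LABELS = ["minimal (0-1)", "low (2-4)", "moderate (5-7)", "high (>7)"]


def dk_group(dk_count: int) -> str:
    """Map a participant's DK response count to one of the four groups.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if dk_count < 0:
        raise ValueError("DK count cannot be negative")
    # bisect_left returns the index of the first bound >= dk_count,
    # which is exactly the index of the group the count belongs to.
    return LABELS[bisect_left(UPPER_BOUNDS, dk_count)]


if __name__ == "__main__":
    for count in (0, 1, 2, 4, 5, 7, 8, 20):
        print(count, "->", dk_group(count))
```

Keeping the bounds in a single list makes it easy to rerun a sensitivity analysis with different cut-points, should one want to check how much the grouping drives the results.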
Baseline analyses revealed systematic differences across groups. Higher DK frequency was associated with older age, a greater proportion of females, lower educational attainment, higher socioeconomic deprivation, and increased ethnic diversity. Income distribution shifted dramatically, with high‑income participants decreasing from 7.7 % in the minimal group to 1.5 % in the high‑DK group, while low‑income participants rose from 16.3 % to 40.6 %. Lifestyle factors also diverged: body‑mass index rose from 26.5 to 27.35 kg/m², high physical activity declined from 42.4 % to 37.8 %, smoking prevalence increased from 43.8 % to 48.6 %, and short sleep duration became more common. Chronic disease prevalence (hypertension, diabetes, dyslipidemia, depression, cardiovascular disease) showed a dose‑response increase with DK frequency.
To assess disease risk, Cox proportional hazards models were fitted with three progressive adjustment levels (basic demographics, added socioeconomic and lifestyle covariates, and a fully adjusted model). Compared with the reference (0‑1 DK responses), participants in the >7 DK group exhibited significantly elevated hazards for all neurodegenerative diseases combined (HR 1.69, 95 % CI 1.38‑2.07), Alzheimer’s disease (HR 1.64, 95 % CI 1.26‑2.14), and vascular dementia (HR 1.93, 95 % CI 1.37‑2.72). Intermediate DK groups showed weaker but still positive associations, consistent with a graded dose‑response relationship. In contrast, negative control outcomes (musculoskeletal degenerative disorders, Parkinson’s disease, ALS) showed no consistent pattern, suggesting specificity for cognitive‑related pathologies. Subgroup analyses pointed to stronger associations among older adults, males, lower‑income individuals, and those without a college degree, but interaction tests were largely non‑significant, so these subgroup differences should be read as suggestive rather than established.
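As a worked check on the reported estimates, the standard error of each log hazard ratio can be back-calculated from its confidence interval. The sketch below assumes the intervals are Wald-type, that is, symmetric around log(HR) on the log scale, which is the usual convention for Cox models but is not stated in the summary. The function names are hypothetical, and the derived z-scores and p-values are approximations limited by the two-decimal rounding of the published figures.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

# Hazard ratios and 95% CIs quoted above (>7 DK group vs. 0-1 reference).
REPORTED = {
    "all neurodegenerative diseases": (1.69, 1.38, 2.07),
    "Alzheimer's disease": (1.64, 1.26, 2.14),
    "vascular dementia": (1.93, 1.37, 2.72),
}


def se_log_hr(lower: float, upper: float) -> float:
    """Standard error of log(HR), assuming a Wald interval on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def wald_z(hr: float, lower: float, upper: float) -> float:
    """Approximate Wald z-statistic for the null hypothesis HR = 1."""
    return math.log(hr) / se_log_hr(lower, upper)


def two_sided_p(z: float) -> float:
    """Two-sided p-value from a standard normal z-statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def ci_midpoint(lower: float, upper: float) -> float:
    """Geometric mean of the CI bounds; equals the HR if the CI is log-symmetric."""
    return math.sqrt(lower * upper)


if __name__ == "__main__":
    for outcome, (hr, lo, hi) in REPORTED.items():
        z = wald_z(hr, lo, hi)
        print(
            f"{outcome}: SE={se_log_hr(lo, hi):.3f}, z={z:.2f}, "
            f"p={two_sided_p(z):.1e}, CI midpoint={ci_midpoint(lo, hi):.2f}"
        )
```

If the geometric midpoint of an interval matches the reported HR to two decimals, the log-symmetry assumption is at least consistent with the published numbers.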
Beyond clinical outcomes, the authors examined neurobiological correlates. Blood biomarkers of neurodegeneration, including amyloid‑β40, amyloid‑β42, neurofilament light chain (NFL), and phosphorylated tau‑181, displayed positive linear trends with DK frequency, suggesting that more frequent uncertainty responses may reflect underlying brain pathology. Metabolomic profiling identified lipid metabolism disturbances (e.g., altered triglyceride and cholesterol pathways) that may mediate the link between DK responses and neurodegenerative biomarkers.
The study reframes DK responses, traditionally treated as missing data and handled via exclusion or imputation, as informative signals of metacognitive decline. The authors argue that DK frequency captures early metacognitive deficits—individuals’ reduced ability to monitor and report their own memory performance—preceding overt cognitive impairment. Because DK data are automatically recorded in digital surveys, they represent a low‑cost, non‑invasive, scalable tool for population‑level screening.
Limitations include the observational design, which precludes causal inference, and the lack of direct neuropsychological testing to validate DK responses against objective cognitive scores. The analysis also treats DK counts as a composite metric without dissecting which questionnaire domains drive the effect, and cultural or language differences in response styles were not explored.
In summary, frequent “Don’t know / can’t remember” answers in touchscreen questionnaires are robustly associated with higher risk of Alzheimer’s disease and vascular dementia, correlate with established blood biomarkers of neurodegeneration, and reflect broader adverse health and socioeconomic profiles. These findings suggest that DK response frequency can serve as an inexpensive, widely applicable digital biomarker for early identification of individuals at elevated risk of neurodegenerative disease, offering a promising avenue for public‑health surveillance and preventive interventions.