Shaw & Nave’s Tri-System Theory: Productive but Incomplete

By Jim Shimabukuro (assisted by Claude)
Editor

Introduction

Steven D. Shaw and Gideon Nave of the Wharton School of the University of Pennsylvania published a preprint in January 2026 that has generated substantial discussion across cognitive psychology, behavioral science, and AI-policy communities.[1] The paper is important because it attempts something long overdue: updating the foundational dual-process theory of human cognition — most famously popularized by Daniel Kahneman’s System 1 (fast, intuitive) and System 2 (slow, deliberate) dichotomy — to account for the fact that millions of people now consult generative AI while in the very act of reasoning.


What Shaw and Nave produce is Tri-System Theory, a framework that formally introduces System 3 as artificial cognition operating outside the biological brain, and they test its central prediction — “cognitive surrender” — across three preregistered experiments involving 1,372 participants and 9,593 individual trials. The paper’s empirical design is rigorous, its effect sizes are large, and its terminology is memorable. Its theoretical scaffolding, however, is considerably more provisional than the confident prose suggests.

Tri-System Theory and System 3

The theory posits a third system — System 3 (artificial cognition) — characterized as external, automated, data-driven, and dynamic, operating alongside System 1 (intuitive) and System 2 (deliberative) processes. Shaw and Nave are explicit that System 3 is not merely a “tool” in the instrumental sense — like a calculator or a search engine — but something that performs functions traditionally associated with deliberative cognition: it generates explanations, structures arguments, evaluates options, and proposes solutions.[1] System 3 resides outside the biological brain but can be accessed on demand, and, crucially, it is not merely supportive: it can supplement, scaffold, override, or displace internal reasoning.

The paper formalizes System 3’s four defining properties: (1) external — it resides outside the biological nervous system; (2) automated — it executes via statistical and algorithmic processes; (3) data-driven — it is grounded in large-scale training corpora; (4) dynamic — it responds to human and environmental inputs in real time. The authors propose six canonical cognitive routes through the triadic system, including cognitive offloading, where the user delegates to System 3 but retains evaluative oversight, and cognitive surrender, where System 2 is bypassed entirely.[1] Empirically, on chat-engaged faulty trials across all three studies, 73.2% of responses reflected surrender versus 19.7% offloading — a nearly 4:1 ratio favoring the path of least cognitive resistance.

This is among the paper’s most striking and tractable findings. The distinction between “offloading” (healthy AI assistance that preserves System 2 oversight) and “surrender” (uncritical adoption that bypasses it) is conceptually sharp and empirically operationalizable.[1,12] The key is not that people sometimes follow bad advice but that the presence of AI reorganizes the cognitive route. Accuracy becomes contingent on System 3 performance.

Cognitive Surrender

Across the three studies, the researchers embedded ChatGPT (GPT-4o) in an online survey task built around a modified Cognitive Reflection Test (CRT). Participants could optionally consult the AI on each item, but AI accuracy was secretly manipulated via hidden seed prompts — sometimes returning the correct deliberative answer, sometimes returning the plausible but wrong intuitive answer.[1] Participants chose to consult the AI on a majority of trials. Relative to baseline (no System 3 access), accuracy rose significantly when the AI was accurate and fell when it erred (+25/-15 percentage points in Study 1), the behavioral signature of cognitive surrender.
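
The contingency can be made concrete with simple expected-value arithmetic: if the AI is consulted on a given share of trials and its answer is adopted uncritically on most of them, overall accuracy tracks the AI’s accuracy on those trials. A minimal sketch in Python, using illustrative parameter values rather than the paper’s reported estimates:

```python
# Toy expected-accuracy model for the "scissors effect" under cognitive surrender.
# All parameter values are hypothetical and chosen only to illustrate the logic;
# this is not Shaw and Nave's data-generating model or their reported estimates.

def expected_accuracy(baseline, consult_rate, surrender_rate, ai_accuracy):
    """Expected accuracy when the AI is consulted on `consult_rate` of trials
    and its answer is adopted uncritically on `surrender_rate` of those trials."""
    surrendered = consult_rate * surrender_rate   # share of trials decided by the AI
    return surrendered * ai_accuracy + (1 - surrendered) * baseline

baseline = 0.50  # hypothetical brain-only accuracy on CRT-style items
for label, ai_acc in (("accurate AI", 1.0), ("faulty AI", 0.0)):
    acc = expected_accuracy(baseline, consult_rate=0.60,
                            surrender_rate=0.73, ai_accuracy=ai_acc)
    print(f"{label}: expected user accuracy {acc:.0%}")
```

Under these invented numbers the swing around the brain-only baseline is symmetric; the asymmetric swing the paper reports (+25/-15 points) reflects factors the toy model omits, such as partial override of faulty outputs.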

The confidence inflation finding is arguably as consequential as the accuracy data. AI access inflated confidence by 11.7 percentage points — even though roughly half of AI answers were wrong. Per-item data from Study 3 confirmed the pattern: confidence was higher on AI-assisted trials than brain-only trials, regardless of whether AI was accurate or faulty. In other words, consulting a confidently wrong AI not only degraded accuracy but also elevated the user’s subjective certainty about that degraded answer — a double liability.[1]
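
One way to see the “double liability” concretely is to compare mean confidence against actual accuracy within each condition, a first-order calibration gap. The sketch below uses invented trial records purely to illustrate the computation; none of the values come from the studies:

```python
# Illustrative first-order calibration gap (mean confidence minus accuracy).
# The trial records below are invented, not data from Shaw and Nave's studies.
from collections import defaultdict

trials = [  # (condition, stated confidence in %, answered correctly?)
    ("brain_only", 60, True), ("brain_only", 55, False), ("brain_only", 65, True),
    ("ai_faulty", 75, False), ("ai_faulty", 80, False), ("ai_faulty", 70, True),
]

by_condition = defaultdict(list)
for condition, confidence, correct in trials:
    by_condition[condition].append((confidence, correct))

for condition, rows in by_condition.items():
    mean_conf = sum(conf for conf, _ in rows) / len(rows)
    accuracy = 100 * sum(ok for _, ok in rows) / len(rows)
    print(f"{condition}: confidence {mean_conf:.0f}%, accuracy {accuracy:.0f}%, "
          f"gap {mean_conf - accuracy:+.0f} points")
```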

Study 2 introduced time pressure (a 30-second countdown per item), and Study 3 added monetary incentives and item-level feedback. Both moderation studies were instructive: time pressure, incentives, and per-item feedback shifted baseline performance but did not eliminate the surrender pattern. When accurate, AI buffered time-pressure costs and amplified incentive gains; when faulty, it consistently reduced accuracy regardless of situational moderators. Notably, financial incentives more than doubled override rates on faulty AI trials (from 20.0% to 42.3%), but even under these optimal motivational conditions the AI-Accurate versus AI-Faulty accuracy gap remained large: 44 percentage points under incentives and feedback. This finding converges with a 2025 structural analysis of human-AI collaboration showing that overreliance cannot be addressed purely through motivation, because the incentive architecture of most AI-assisted workflows is itself misaligned against independent judgment.[9]

At the individual level, the data identified three robust moderators. Higher trust in AI predicted more System 3 engagement and lower override rates on faulty trials: participants were 64% less likely to resist faulty AI output for each standard-deviation increase in trust. Higher need for cognition reduced System 3 usage and increased override. Higher fluid intelligence predicted greater override. Independent research on automation bias corroborates this vulnerability profile, finding that participants skeptical of AI detected errors more reliably and achieved higher accuracy, while those favorable toward automation exhibited dangerous overreliance on algorithmic suggestions; in that study, financial incentives showed no consistent effect on performance.[10]
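
The “64% less likely” figure reads most naturally as an odds ratio. A back-of-the-envelope sketch, which assumes an odds-ratio interpretation and a hypothetical baseline override rate (neither is a confirmed detail of the paper), shows how quickly resistance to faulty output becomes rare as trust rises:

```python
import math

# Back-of-the-envelope reading of "64% less likely to override per SD of trust",
# assuming the figure is an odds ratio from a logistic model; the baseline
# override rate below is hypothetical, used only to illustrate the arithmetic.
odds_ratio_per_sd = 1 - 0.64          # OR of about 0.36 per +1 SD of trust in AI
beta = math.log(odds_ratio_per_sd)    # corresponding logistic coefficient

baseline_override = 0.20              # hypothetical override rate at mean trust
baseline_odds = baseline_override / (1 - baseline_override)

for sd in (0, 1, 2):
    odds = baseline_odds * math.exp(beta * sd)
    p = odds / (1 + odds)
    print(f"trust {sd:+d} SD from mean: predicted override rate of about {p:.0%}")
```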

Where the Theory Falls Short

For all its strengths, the Tri-System framework exhibits a fundamental weakness that goes to its theoretical core: the model is asserted as a theory but is descriptive in structure, generating open questions rather than falsifiable causal claims. System 3 is described as a system, yet it lacks the developmental, neurological, or computational specificity that would distinguish it from existing constructs in the extended-cognition or distributed-cognition literature. Andy Clark and David Chalmers’ 1998 “extended mind” thesis already argued that external resources can constitute genuine components of cognitive processing, and a considerable literature since then has explored ecological rationality and transactive memory. Shaw and Nave acknowledge this lineage but argue that generative AI differs from a notebook or a calculator because it generates explanatory reasoning rather than merely storing it. That is a reasonable intuition, but it is not given the precise theoretical grounding needed to constitute a new cognitive system rather than a more elaborate instantiation of an existing framework.[7]

The paper is equally underspecified in its treatment of the mechanisms behind cognitive surrender. We are told that surrender occurs when “outputs are fast, fluent, and seemingly authoritative” — but this is closer to a phenomenological description than a mechanistic account.[1] What are the specific cognitive processes being bypassed? Is the AI’s fluency functioning as a proxy for truth value? Is it triggering an availability heuristic, an authority heuristic, or both? The paper never adequately separates these possibilities, though such separation would be necessary for designing effective countermeasures and for making predictions that differ from those already derivable from automation bias theory.[2]

The automation-bias connection deserves closer examination. Automation bias — the tendency to over-rely on automated recommendations — has emerged as a critical challenge in human-AI collaboration. Research suggests that this overreliance occurs with AI in ways it does not with human teammates, and that in the course of trust development users may inappropriately transfer learned trust from one AI interaction to future systems or contexts. Shaw and Nave explicitly claim that Tri-System Theory goes beyond automation bias by treating AI not as something that influences a pre-existing cognitive architecture but as a functional agent within that architecture.[1] This is a philosophically interesting claim, but the paper never formally demonstrates that System 3 produces behavioral predictions distinguishable from those of automation bias theory. Without such distinguishing predictions, the theoretical novelty of the framework — as opposed to its terminological novelty — remains an open question.[2,12]

The “cognitive autopilot” route (where the stimulus is handed directly to System 3 without even brief System 1 engagement) is posited but not empirically isolated in the studies. Autopilot is the extreme case: the question is handed to the AI before any internal thinking occurs. Yet because the CRT items by design first evoke a strong intuitive response (System 1), the experimental paradigm structurally precludes measuring pure autopilot. The architecture of the study and the architecture of the theory are partially misaligned.[1,12]

Reliability and Validity

Beyond the framework’s internal tensions, the paper’s broader claim — that AI-augmented decision-making is behaviorally reliable only insofar as the AI itself is reliable — intersects a rapidly developing empirical literature on the variable, context-dependent trustworthiness of large language models. The 2026 International AI Safety Report, commissioned by multiple governments, found that despite recent improvements, general-purpose AI systems can be unreliable and prone to basic errors of fact and logic. Even systems that excel at complex tasks may generate non-existent citations, biographies, or facts — a phenomenon known as hallucination. Their performance can also be inconsistent; for example, accuracy on math problems can decrease significantly when irrelevant information is inserted into the problem description.[5] This is directly relevant to Shaw and Nave’s “scissors effect” — the finding that AI-augmented accuracy either dramatically rises or dramatically falls depending on AI correctness. If LLM accuracy is not just variable but also opaque to the user (because there is no reliable cue to distinguish confident accuracy from confident confabulation), then the scissors effect becomes less a theoretical concern than an immediate practical problem for any high-stakes deployment.

The hallucination problem compounds cognitive surrender in a way the paper does not fully theorize. Shaw and Nave’s experimental manipulation made AI accuracy knowable to the researchers (via seed prompts) but not to participants. In real-world use, nobody has the equivalent of a seed prompt; users cannot know in advance which outputs are accurate. The risk therefore is not merely that users surrender — it is that they surrender systematically toward errors they have no reliable means of detecting. A 2024 survey by Deloitte revealed that 38% of business executives reported making incorrect decisions based on hallucinated AI outputs. The question of when AI-augmented decision-making is reliable — a question with immense consequences for medicine, law, finance, and policy — cannot be answered by the Tri-System framework as currently constituted, because the framework treats AI accuracy as an experimental variable rather than theorizing the conditions under which accuracy can be expected or verified.[5]

A growing body of research suggests that the path to reliable AI-augmented decisions runs through metacognitive calibration rather than through blanket trust or blanket distrust. Research indicates that when an AI reports high confidence, users tend to increase their trust accordingly, even when the AI’s answers on the task at hand are inaccurate. A 2025 framework proposed by Lee and colleagues at the University of Florida argued that AI systems need to report not only first-order performance but also second-order metacognitive sensitivity — how well their confidence tracks their accuracy — to enable humans to engage in calibrated reliance.[3] This metacognitive-alignment approach is arguably the theoretically richer response to the problems that Shaw and Nave document, yet the paper does not engage it.
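
The distinction is easy to make concrete. First-order calibration asks whether average confidence matches average accuracy; metacognitive sensitivity asks whether confidence discriminates correct from incorrect answers. One common way to quantify the latter is the area under the ROC curve for confidence as a predictor of correctness. The sketch below uses invented data and is a generic illustration, not the specific metric proposed by Lee and colleagues:

```python
# Second-order metacognitive sensitivity: does confidence discriminate
# correct from incorrect answers? Data here are invented for illustration.

def auroc(confidences, correct_flags):
    """Probability that a randomly chosen correct answer carries higher
    confidence than a randomly chosen incorrect one (ties count half)."""
    pos = [c for c, ok in zip(confidences, correct_flags) if ok]
    neg = [c for c, ok in zip(confidences, correct_flags) if not ok]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

correct          = [1, 1, 1, 0, 0, 1]
# An agent with uniformly high confidence: fluent but insensitive.
conf_insensitive = [0.9, 0.9, 0.9, 0.9, 0.9, 0.9]
# An agent whose confidence tracks correctness.
conf_sensitive   = [0.9, 0.8, 0.95, 0.4, 0.3, 0.85]

print("insensitive AUROC:", auroc(conf_insensitive, correct))  # 0.5, i.e., chance
print("sensitive AUROC:  ", auroc(conf_sensitive, correct))    # close to 1.0
```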

Steyvers and Peters, writing in a 2025 peer-reviewed review, found that although humans and large language models can sometimes appear closely aligned in their metacognitive capacities and behaviors, many differences remain, and attending to those differences is important for improving collaboration between humans and artificial intelligence.[7] LLMs, the review notes, tend to express high confidence regardless of actual accuracy — precisely the condition that would maximize cognitive surrender in the Shaw-Nave framework. This behavioral evidence from AI-systems research directly supports Shaw and Nave’s empirical findings, but it also points toward a remediation pathway — metacognitive transparency in AI design — that Tri-System Theory describes only in the vaguest terms.[1,7]

Research on automation bias has moved considerably beyond description toward actionable intervention designs. A 2025 study introduced the Human-AI-System Concordance (HASC) Matrix as a diagnostic framework for identifying conditions under which overreliance is most likely, and tested “cognitive forcing functions” — structured prompts that compel users to engage System 2 before accepting AI output — as empirically validated countermeasures.[4] This work converges with Shaw and Nave’s Study 3 incentive-and-feedback findings but operationalizes the intervention with considerably more theoretical precision. A companion 2025 study on incentive alignment found that systematic overreliance on AI advice cannot be attributed exclusively to the human, but rather stems from an inherent structural cause — a misalignment of incentives embedded in traditional AI-assisted decision-making designs. Even when humans are perfectly rational decision-makers operating with complete information, this structural misalignment systematically biases them toward overreliance.[9] This is a more structural account of the same phenomenon that Shaw and Nave attribute, at the individual level, to low need for cognition and high AI trust.
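
As a rough sketch of the interaction pattern behind a cognitive forcing function (with hypothetical function names, not the HASC study’s implementation), the idea can be expressed as a small workflow in which the AI’s answer is revealed only after the user commits an independent one:

```python
# Minimal sketch of a "cognitive forcing function" as an interaction pattern:
# the AI suggestion is withheld until the user commits an independent answer,
# and disagreement forces an explicit choice rather than a default to the AI.
# This illustrates the general pattern only; it is not the HASC study's design.
from dataclasses import dataclass

@dataclass
class Decision:
    user_answer: str
    ai_answer: str
    final_answer: str
    kept_own: bool

def forced_deliberation(item, ask_user, ask_ai, resolve_conflict):
    user_answer = ask_user(item)      # System 2 engagement happens first
    ai_answer = ask_ai(item)          # System 3 is consulted only afterwards
    if ai_answer == user_answer:
        return Decision(user_answer, ai_answer, ai_answer, kept_own=True)
    keep_own = resolve_conflict(item, user_answer, ai_answer)  # explicit choice
    final = user_answer if keep_own else ai_answer
    return Decision(user_answer, ai_answer, final, keep_own)

# Stub wiring for illustration; real use would plug in a UI and a model call.
demo = forced_deliberation(
    "How many widgets ...?",
    ask_user=lambda q: "5",
    ask_ai=lambda q: "10",
    resolve_conflict=lambda q, mine, ai: True,   # the user keeps their own answer
)
print(demo)
```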

Implications

One of the most consequential implications of cognitive surrender is longitudinal: if people increasingly defer to AI during routine reasoning, will the internal cognitive machinery required for independent deliberation atrophy over time? Nave himself has voiced this worry, noting that there is an alternative story in which humans become more and more reliant on AI, much as air conditioning now regulates our temperature and vehicles move us from place to place without any physical activity on our part. Shaw and Nave’s experimental design, however, measures within-session behavior and cannot speak to longitudinal trajectories — a significant limitation for a theory claiming to describe how System 3 “may reshape autonomy and accountability in the age of AI.”[1]

The concern about gradual skill erosion is now backed by independent empirical work. Gerlich’s widely cited 2025 study of 666 participants found a significant negative correlation between frequent AI tool usage and critical thinking abilities, mediated by increased cognitive offloading. Younger participants exhibited higher dependence on AI tools and lower critical thinking scores compared to older participants.[8] A Frontiers in Education review from late 2025 identified what researchers term “digital cognitive atrophy” — a condition where frequent AI users demonstrate reduced neural activation associated with working memory and analytical reasoning.[11] These findings are consistent with the trajectory that Shaw and Nave’s theory predicts, but they emerge from a separate research tradition and have not been theoretically integrated into the Tri-System framework.

The ecological validity problem also requires greater attention. Shaw and Nave used Cognitive Reflection Test items — mathematical puzzles specifically engineered to produce a strong, wrong intuitive response — as their testing ground. While this is a well-validated methodology for studying System 1/System 2 dynamics, CRT items are highly atypical of the AI-assisted decisions people actually make. Most real-world AI consultations involve ambiguous, open-ended questions about professional tasks, creative work, or interpersonal decisions, where the “correct” answer is not objectively determinable and where the AI’s output is unlikely to take the form of a definitive single-sentence answer.[12] The degree to which surrender rates of 73–80% on CRT items translate to surrender rates on high-stakes professional decisions involving medical diagnosis, legal analysis, or financial forecasting remains an empirical question the paper is not equipped to answer.[5]

A December 2025 review examining human-AI collaboration across 84 studies in high-stakes domains found that appropriate reliance persisted as a challenge despite advances in AI explainability.[6] Critically, trust calibration was identified as a four-factor construct encompassing perceived competence, trust alignment, uncertainty recognition, and transparency — none of which map neatly onto the simple “trust in AI” scale used by Shaw and Nave.[4,6] This suggests that the paper’s individual-difference findings, while empirically real, may represent only a partial decomposition of the vulnerability structure.

Missing Mechanisms and Incomplete Theory

A further gap in Tri-System Theory is its treatment of what the paper calls the “cognitive autopilot” and “recursive” pathways, which are described but not empirically tested. The recursive route — where System 3 output feeds back into System 1 or System 2 and modifies them — is theoretically the most important pathway for understanding the long-term consequences the paper worries about.[1] If AI use gradually updates users’ intuitions (System 1) toward AI-typical patterns, or if it reconfigures what users perceive as requiring deliberation (System 2), then the changes wrought by System 3 are not merely situation-specific but architecturally transformative. The paper’s abstract acknowledges this by speaking of “reshaping autonomy and accountability,” but the theory provides no mechanism and no measurement strategy for detecting such recursive effects. This is where the framework most transparently remains a set of open questions rather than a closed theory.[1,12]

The relationship between System 3 and what the extended-mind literature calls distributed cognition is also underdeveloped. Clark and Chalmers’ original argument was that the boundary of cognition should be drawn not at the skull but at the point of functional integration — that if an external process plays the same role for a person as an internal process would, it counts as part of that person’s cognitive system. Shaw and Nave’s framework is compatible with this view, but they never engage it in any depth, which means the novelty of their contribution relative to existing distributed-cognition accounts is never adequately established.[7]

Finally, the paper’s account of individual differences, while empirically robust, lacks a developmental dimension. It identifies fluid intelligence and need for cognition as protective factors but says nothing about whether these traits are themselves modifiable through training, whether surrender tendencies shift with experience using AI, or whether the vulnerability-to-surrender relationship is linear or threshold-based.[1] These are not peripheral questions; they are the questions on which any practical intervention — in education, in professional training, or in AI interface design — would need to be grounded.[8,11]

Conclusion

Shaw and Nave’s paper is among the most carefully designed and most consequential empirical contributions to the psychology of AI use published to date.[1] Its central finding — that the majority of people adopt AI outputs without adequate scrutiny, that this produces both enhanced accuracy when AI is right and dramatically degraded accuracy when AI is wrong, and that even strong incentives fail to eliminate this pattern — is empirically robust and practically urgent.[1,9] The Tri-System framework gives cognitive science a vocabulary for discussing these dynamics and a structural account of why AI is not simply another cognitive prosthetic.

The framework’s weaknesses, however, are real and significant. System 3 is defined by its properties rather than by its mechanisms, generating a phenomenology of AI-assisted cognition without yet producing the causal account needed for prediction and intervention.[1,12] The distinction from automation bias theory is asserted rather than demonstrated.[2] The six cognitive routes are labeled and described but only two are empirically tested, and only under artificial conditions with a narrow, unusual task type.[1] The confidence-inflation finding, arguably the most practically important result in the paper, is documented but not mechanistically explained.[3,7] The recursive and long-term pathways, which carry the theory’s most consequential predictions, are not testable within the study’s cross-sectional design.[8,11]

The broader literature on AI reliability, hallucination, metacognitive calibration, and human-AI teaming suggests that the problem Shaw and Nave have named — cognitive surrender — is real, structural, and worsening as AI becomes more persuasive and more embedded.[2-6] But solving it will require not just a theory that names the problem and identifies the behavioral signature, but a theory that explains the mechanism, specifies the conditions under which surrender is versus is not rational, and generates testable predictions about how interface design, user training, and AI transparency can shift the surrender-to-offloading ratio in a durable way.[4,6,9] That theory does not yet exist in this paper. As a preprint still awaiting peer review, the framework should be treated as a productive but incomplete first-generation model awaiting the critical engagement — from extended-mind theorists, human factors researchers, and AI systems designers — that will be necessary to determine whether System 3 is genuinely a new cognitive system or a new label for dynamics already theorized elsewhere.[2,7,12]

References

[1] Shaw, S. D., & Nave, G. (2026). Thinking—Fast, Slow, and Artificial: How AI is Reshaping Human Reasoning and the Rise of Cognitive Surrender. SSRN/OSF Preprints. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646

[2] Exploring Automation Bias in Human–AI Collaboration: A Review and Implications for Explainable AI. AI & Society, Springer Nature (2025). https://link.springer.com/article/10.1007/s00146-025-02422-7

[3] Lee, D., Pruitt, J., Zhou, T., Du, J., & Odegaard, B. (2025). Metacognitive Sensitivity: The Key to Calibrating Trust and Optimal Decision Making with AI. PNAS Nexus, 4(5), pgaf133. https://pmc.ncbi.nlm.nih.gov/articles/PMC12103939/

[4] Doh, W., Goh, Y., & Kim, S.-H. (2025). Beyond Overreliance: The Human-AI-System Concordance (HASC) Matrix and the Cognitive Dynamics of AI-Assisted Decision-Making. Proceedings of the Human Factors and Ergonomics Society Annual Meeting. https://journals.sagepub.com/doi/10.1177/10711813251358240

[5] International AI Safety Report 2026. https://www.insideprivacy.com/artificial-intelligence/international-ai-safety-report-2026-examines-ai-capabilities-risks-and-safeguards/

[6] Enhancing Intuitive Decision-Making and Reliance Through Human–AI Collaboration: A Review. Informatics, MDPI (December 2025). https://www.mdpi.com/2227-9709/12/4/135

[7] Steyvers, M., & Peters, M. A. K. (2025). Metacognition and Uncertainty Communication in Humans and Large Language Models. Current Directions in Psychological Science. https://journals.sagepub.com/doi/10.1177/09637214251391158

[8] Gerlich, M. (2025). AI Tools in Society: Impacts on Cognitive Offloading and the Future of Critical Thinking. Societies, 15(1), 6. https://www.mdpi.com/2075-4698/15/1/6

[9] When Thinking Pays Off: Incentive Alignment for Human-AI Collaboration (2025). arXiv preprint. https://arxiv.org/pdf/2511.09612

[10] Bias in the Loop: How Humans Evaluate AI-Generated Suggestions (2025). arXiv preprint. https://arxiv.org/html/2509.08514v1

[11] Frontiers in Education: Evaluating the Impact of AI on Critical Thinking Skills Among Higher Education Students (2025). https://www.frontiersin.org/journals/education/articles/10.3389/feduc.2025.1719625/full

[12] Mamun, S. M. (2026). Tri-System Theory: Reshaping Dual-Process Models in the Era of Artificial Cognition — A Comprehensive Review (Vol. I). Academia.edu. https://www.academia.edu/164734033/Tri_System_Theory_Reshaping_Dual_Process_Models_in_the_Era_of_Artificial_Cognition_A_Comprehensive_Review_Vol_I
