New Understandings in Cognition
New studies show how AI is driving breakthroughs in our understanding of thinking and knowledge
It seems that AI may (eventually) be able to help us understand intelligence, one of the long-term goals of Demis Hassabis at DeepMind and also of my former AGISI research lab. This was my lab's research mission:
Help to better understand intelligence
Understanding intelligence is one of the major scientific challenges of our time; however, the science of intelligence is very much in its infancy. We worked closely with scientists and leading thinkers from different disciplines in order to better understand intelligence.
A better understanding of intelligence will not only help us build artificially intelligent machines; it will also improve individuals' situational awareness, decision-making, and values, deepen people's knowledge of each other and of our world, and thereby improve the quality of life for society as a whole.
Recent work at the intersection of artificial intelligence and human brain neuroscience offers compelling new ways to understand complex human (and AI) thought processes.
One line of research, from Anthropic, provides methods to map the internal computations of advanced language models like Claude, visualizing the step-by-step reasoning and planning that occurs “behind the scenes.” In effect, it shows how models like Claude “think” by exposing the internal steps they use to process information. If we can understand them, we may have a better chance of controlling them.
Complementing this, a study by Google DeepMind researchers, published in Nature Human Behaviour, successfully aligns representations from an AI model called Whisper (which processes acoustic, speech, and language information) with real-time human brain activity recorded during natural conversations, revealing a detailed hierarchy and temporal flow of language processing in the brain.
Further illustrating AI's potential to augment human understanding and drive new discoveries, research on Google DeepMind's chess AI AlphaZero demonstrates how novel, effective strategic concepts, initially unintuitive to humans, were extracted from the AI's internal representations and successfully taught to chess grandmasters, significantly expanding the grandmasters' (and other experts') knowledge of the game.
Decoding Human & Artificial Intelligence
These recent advancements in AI and neuroscience have significantly deepened our understanding of cognition, offering fresh insights into the fundamental operations underpinning thought and communication.
Furthermore, these studies collectively showcase powerful new approaches for investigating the mechanisms of both artificial and biological cognition.
Neuron Interactions
Anthropic employs a novel methodology involving “circuit tracing” and “attribution graphs” to reveal Claude's inner workings. Central to their approach are “features”: interpretable replacement neurons derived from a “replacement model,” which serve as proxies for understanding complex neuron interactions across layers.
Attribution graphs then track causal interactions among these features. Specific insights include Claude's multi-step reasoning capabilities, exemplified by tracing a logical chain from “Dallas” to “Texas” to “Austin” when the model is asked for the capital of the state containing Dallas.
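As a rough mental model, an attribution graph can be pictured as a weighted directed graph over features, with each edge recording how strongly one feature's activation drives another's. The Python toy below sketches that data structure and traces the Dallas-to-Austin chain. The class, feature names, and weights are hypothetical stand-ins for illustration only, not Anthropic's actual method or code.

```python
# Toy sketch of an attribution graph: features as nodes, causal influence
# as weighted edges. Purely illustrative; not Anthropic's implementation.
from collections import defaultdict

class AttributionGraph:
    def __init__(self):
        # edges[source] -> list of (target, attribution_weight) pairs
        self.edges = defaultdict(list)

    def add_edge(self, source, target, weight):
        self.edges[source].append((target, weight))

    def trace(self, start, min_weight=0.1, path=None):
        """Yield causal chains from `start`, following edges above min_weight."""
        path = (path or []) + [start]
        children = [(t, w) for t, w in self.edges[start] if w >= min_weight]
        if not children:
            yield path
            return
        for target, _ in children:
            yield from self.trace(target, min_weight, path)

graph = AttributionGraph()
graph.add_edge("token: Dallas", "feature: Texas (state)", 0.8)
graph.add_edge("feature: Texas (state)", "feature: state capital", 0.6)
graph.add_edge("feature: state capital", "output token: Austin", 0.9)

for chain in graph.trace("token: Dallas"):
    print(" -> ".join(chain))
# token: Dallas -> feature: Texas (state) -> feature: state capital -> output token: Austin
```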
Another intriguing example is the internal activation of “preeclampsia” features when the model reasons about a medical case, even though the condition is never named in the text, indicating sophisticated internal reasoning.
Additionally, “known answer” features have been shown to suppress the model's default refusal circuits, illustrating one potential mechanism behind hallucinations: when such a feature misfires, the model answers confidently instead of declining.
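To picture that interaction, here is a deliberately crude Python sketch of the logic described above: a default refusal that is suppressed by a “known answer” signal, which leads to confabulation when the signal misfires. The function, threshold, and activation values are invented for illustration; in the real model this behavior emerges from learned features, not an if-statement.

```python
def respond(known_answer_activation, retrieved_answer):
    """Toy gate: a default refusal fires unless a 'known answer' signal suppresses it."""
    REFUSAL_THRESHOLD = 0.5  # hypothetical value, for illustration only
    if known_answer_activation < REFUSAL_THRESHOLD:
        return "I'm not sure."  # refusal circuit stays active
    # Refusal suppressed. If the 'known answer' signal misfired and there is
    # nothing real to retrieve, the output is a confabulation (hallucination).
    return retrieved_answer if retrieved_answer is not None else "<confabulated answer>"

print(respond(0.9, "Austin"))  # genuine knowledge: refusal suppressed, correct answer
print(respond(0.8, None))      # misfiring signal: refusal suppressed, hallucination
```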
Whisper
In parallel, the ECoG study uses the “Whisper” model as a computational framework for modeling real-time neural activity during spontaneous human conversation. Whisper's embeddings, which represent acoustic, speech, and language information, map clearly onto specific neural regions. Auditory and motor areas like the superior temporal gyrus (STG) and precentral gyrus (preCG) show strong alignment with speech embeddings, while language embeddings align more closely with higher-order cognitive regions such as the inferior frontal gyrus (IFG) and angular gyrus (AG).
Remarkably, the study also demonstrates a distinct temporal pattern: language-to-speech encoding peaks before word production, whereas speech-to-language encoding peaks after word onset during comprehension, underscoring predictive neural encoding strategies.
What this essentially means is: the brain prepares the words we want to say before we actually say them, showing a forward-looking or predictive way of processing language. In contrast, when we listen, our brains make sense of the words slightly after we hear them, indicating a backward-looking way of understanding language. Essentially, our brains anticipate speech production but reactively process speech when listening.
Critically, Whisper embeddings significantly outperformed traditional symbolic linguistic models (e.g., phonemes, part-of-speech tags) in predicting neural activity, yet still implicitly encoded these traditional linguistic structures without explicit training. This illustrates a powerful integration of multimodal information, especially when auditory inputs augment textual data, enhancing predictive neural modeling.
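For readers who want a concrete picture of what a linear encoding model looks like in this setting, the Python sketch below fits a ridge regression from word-level embeddings to electrode activity and scores it with cross-validation. The arrays here are random stand-ins, and the shapes, regularization strength, and scoring choices are assumptions; the published analysis works with real ECoG recordings, per-electrode and per-lag modeling, and much more careful statistics.

```python
# Minimal sketch of a linear encoding analysis: predict electrode activity
# from model embeddings with ridge regression. Placeholder data throughout.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

n_words, embed_dim, n_electrodes = 1000, 384, 64
embeddings = rng.normal(size=(n_words, embed_dim))   # Whisper-style word embeddings (stand-in)
neural = rng.normal(size=(n_words, n_electrodes))    # ECoG response per word (stand-in)

def encoding_score(X, y):
    """Cross-validated R^2 of a linear (ridge) encoding model for one electrode."""
    return cross_val_score(Ridge(alpha=10.0), X, y, cv=5, scoring="r2").mean()

# One encoding model per electrode; higher scores mean the embeddings predict that site well.
scores = np.array([encoding_score(embeddings, neural[:, e]) for e in range(n_electrodes)])
print("best electrode score:", scores.max())

# The temporal analyses described above repeat this at different lags relative to
# word onset (shifting the neural time series) and compare where speech vs.
# language embeddings peak.
```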
New Discoveries
The AlphaZero research further complements these findings by demonstrating AI's capability to discover advanced, previously unknown chess strategies. This research focused on uncovering chess moves and concepts considered unintuitive and novel to even the most skilled human players.
These newly discovered moves were then taught to a select group of grandmasters, each a former or current world chess champion. Remarkably, all grandmasters improved their ability to solve chess puzzles incorporating these innovative AI-derived strategies after the learning phase.
This outcome highlights the potential for AI discoveries to push the boundaries of human expertise, indicating these concepts sit at the cutting edge of human strategic understanding.
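The underlying technique is often described in terms of “concept vectors”: directions in the network's latent space that separate positions exhibiting a concept from positions that do not. The sketch below illustrates that idea with a simple linear probe on synthetic activations. It is an assumption-laden stand-in, not the AlphaZero team's actual method, which works on the network's own activations with a more careful, sparse, layer-wise formulation.

```python
# Hedged sketch of the concept-vector idea: find a latent-space direction that
# separates concept-positive from concept-negative positions. Synthetic data only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
latent_dim = 256
planted_direction = rng.normal(size=latent_dim)  # stands in for an unknown chess concept

def sample_activations(n, has_concept):
    """Synthetic 'network activations' for positions with/without the concept."""
    base = rng.normal(size=(n, latent_dim))
    return base + (1.5 * planted_direction if has_concept else 0.0)

X = np.vstack([sample_activations(500, True), sample_activations(500, False)])
y = np.array([1] * 500 + [0] * 500)

# A linear probe separating the two sets; its weight vector is the concept direction.
probe = LogisticRegression(max_iter=1000).fit(X, y)
concept_vector = probe.coef_[0]

cosine = concept_vector @ planted_direction / (
    np.linalg.norm(concept_vector) * np.linalg.norm(planted_direction))
print(f"cosine similarity with the planted direction: {cosine:.2f}")
```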
Limitations
Despite their profound insights, each study openly acknowledges methodological limitations. Anthropic's attribution graphs suffer from reconstruction errors caused by discrepancies between the replacement model and the original model, complicating complete circuit tracing, particularly for attention-based circuits.
Similarly, the Whisper encoding study simplifies the brain's inherently nonlinear dynamics by using linear encoding models, capturing only part of the neural information.
AlphaZero's concept evaluation was limited to a small, highly specialized group of human validators, suggesting that further research is needed before these findings can be generalized.
Understanding Intelligence
Collectively, these studies highlight a compelling methodological convergence: each utilizes deep internal representations (features, embeddings, concept vectors) to explore cognitive processes beyond traditional input-output analyses.
Future research in these areas will likely focus on enhancing the precision of these internal mappings, developing nonlinear encoding frameworks, and expanding validation studies to strengthen the applicability and comprehensiveness of insights across artificial and biological intelligence.
Combining insights from AI and neuroscience offers exciting possibilities, helping us better understand and improve how both humans and machines “think,” and bringing us closer than ever to truly understanding the nature of intelligence.
Stay curious
Colin