"The danger isn’t that AI agents will become like us. It’s that they won’t have to." This phrase jumped out at me because in the groups I am in, that are engaged with AI, there is a discussion about whether to consider AI and Robots as "tools" or as "beings". The problem is that we don't have an understanding of what something is that is neither tool nor being. Perhaps this is proof of your final takeaway that the problem is Taxonomy.
The struggle to find the way forward - determining the true risk and then the appropriate levels of regulation - will likely take us from one extreme to the other, from over-reach to loss of control. Both are likely to appear, and probably unevenly, as we move forward.
We saw it with nuclear weapons, and then nuclear power, and again with the rollout of the internet. We want to have the freedom to expand and take advantage of new technologies, but at some point they get beyond us and our lack of understanding puts us into a position of back-tracking. How can we know the right level of control until we've crossed a line? After all, the line is rarely visible until we're on the other side of it.
Marc Andreessen shares that the Biden administration had every intention of bringing AI under its complete control from the beginning. With a new administration, I suspect we've swung to the other extreme, and the Tech Bros will be left to decide what's best for all of us.
The most important thing is for us all to become as educated as possible about the technology and to make a point of maintaining a chair at the table. It's why I appreciate articles such as this, Colin, to encourage everyone to be aware and contribute with informed opinions - it is a civic duty for the good of us all.
Thank you Susan. I truly appreciate you sharing your reflections and connecting with broader discussions.
That is a key tension; I was stuck on how to convey it, and I am delighted that you understood it. "The danger isn’t that AI agents will become like us. It’s that they won’t have to." Your group discussions about 'tool' vs. 'being' are exactly the kind of conceptual problem that arises because, as you note, we lack a robust category for entities possessing agency without necessarily having 'being' in the traditional sense. This directly supports the paper's premise that a core issue is taxonomy. Kasirzadeh and Gabriel's framework, focusing on dimensions like autonomy and efficacy, is an attempt to build that middle-ground vocabulary based on observable characteristics rather than getting stuck on ontological debates. Incidentally, a few years back the EU voted on whether to grant 'personhood' to AI / robots. And I seem to think one country gave a robot citizenship! I must check.
Your point about the likely swings in regulation between over-reach and loss of control is right. I recall that Andreessen comment, which he said was a turning point for him in his move toward the 'new' administration - I guess so they can keep more control!
The historical parallels with nuclear technology and the internet are apt. There is a huge difficulty in governing technologies whose full impact is unclear until 'we're on the other side of the line.' Perhaps a more granular, capability-focused approach to classification could help regulators calibrate responses more precisely, potentially mitigating those extreme swings, though the challenge remains immense.
The ideal scenario is that governance is guided by informed understanding rather than just political reflex or industry capture. Which brings me to your vital closing point: the necessity for all of us to become educated, stay aware, and contribute with informed opinions. It truly feels like a civic duty with technology this transformative.
"Incidentally a few years back the EU voted on whether to give 'personhood' to AI / Robots. And I seem to think one country gave it Citizenship! I must check."
I wouldn't be surprised. We did a similar thing by making corporations their own entity. The thing is that with corporations there is still a board who can be held accountable, so it's mostly a financial distinction that gives rights to something that is not alive.
I find this scarier, especially as they become personified and take on roles as service providers. Our subconscious has a difficult time distinguishing the nuanced differences between things treated as the same. It is this fundamental issue, like very good propaganda, that is of concern.
But perhaps it is only of concern to a generation that has lived without this technology and knows the difference. Two generations from now it will just "be different" and have morphed into whatever it is. By that time - that will be what normal looks like.
Worth watching this short video - https://substack.com/home/post/p-162651762
It seems to me, a summary of this video is: All humans are expendable, even the entire C-Suite. All that remains are the vulture capitalists. Can they alone provide a large enough market to justify all of this?
That's a frame that assumes our economy will continue as it is. Perhaps we need to question what the end game is. Likely the concepts of capitalist and socialist will disappear altogether, and something different will be created in a world that has very few people and utilizes technology in a way that eliminates scarcity as we know it.
We suspect we can't see the future if we limit it to the concepts and terminology we use today.
Agreed. AI could potentially lead to fundamental shifts where labels like 'capitalist' or 'socialist' become obsolete, possibly even altering concepts like scarcity as we know them.
It's a crucial reminder that our current terminology and frameworks might indeed limit our ability to envision such radically different futures. This very uncertainty perhaps highlights why understanding and guiding the development of these powerful technologies now is so important, even if we can't fully predict the 'end game.'
Yeah, techno-feudalism. When people disappear, so too does scarcity. MuskRat's fantasy world.
Fascinating mind exploration, and fun toys to play with and push to their limit, but in the end it always comes back to "why?".
Why do we have a construct of a business?
Why do we have a construct of human labor?
Why do we have a construct of economics, money, and assets?
Why do we have a construct of play and entertainment?
The answer to all these questions is "humans". Without humans, the planet would not have any of this.
If we follow the logic of this video into a world run by one AI mega-company, then what keeps us as the center of the construct? And if we're not the center of the construct then why bother doing any of it?
It reminds me of a conversation I had with my father when I was younger and bought into the early propaganda of "humans are the cancer". His response was, "If we're all gone, then who will be able to appreciate the planet we wish to save from ourselves?"
It's an excellent question, and makes it very easy to see where a few elites could decide that the planet could be saved AND appreciated if ONLY THEY were left. But to do that would need technology to replace labor. Every action I've watched since the 1980's has created a direct line from that initial propaganda of "humans are the cancer" to where we are today.
Perhaps in the end, humans become extinct with only a few who have melded with the technology to live forever in appreciation of the planet. Is that a goal worth pursuing? I don't know and maybe it doesn't matter. My life span is finite so perhaps 8B souls on the planet is just the transition before we evolve into a completely different being.
Perhaps all we'll have lost is the constructs we created to understand and manage this stage of the evolution on the planet. Maybe "life" is not the most important thing. Maybe life is just a biological stepping stone to something much more complex.
In the end, what are we afraid of and why are we afraid of it? Usually this takes us back to a very individual and personal answer. I care because I have a child who will need to navigate this transition. But perhaps he is part of the last generation to even be here if procreation continues to be curtailed through biological warfare, psychological gender warfare, and of course the usual kinetic warfare.
Perhaps the trick is to recognize the end game, if there is one, so that we at least make decisions with our eyes open. Everything else with the technology is just leveraging human curiosity to see what it can do next.
I will think carefully about the conversation with your father. In the meantime, what an incredible point - "what keeps us as the center of the construct?"
I don't believe we will evolve - the AI will, and will exist only to "serve" itself.
I checked! The EU Parliament did indeed consider a form of 'electronic personhood' back in 2017, mainly related to liability. There is a good article on why they got it wrong not to vote in favor - https://www.europeanlawblog.eu/pub/refusing-to-award-legal-personality-to-ai-why-the-european-parliament-got-it-wrong/release/1
...and Saudi Arabia famously granted symbolic 'citizenship' to the robot Sophia. These examples highlight how we're already grappling with fitting these entities into our existing legal and social structures.
Your comparison to corporate personhood is very relevant. Here, though, the critical difference lies in accountability: there's still a human board responsible for a corporation. The potential lack of clear human accountability for highly autonomous AI actions is precisely what makes the governance challenge so much more complex and, as you say, potentially scarier.
I strongly agree with your concern about the psychological impact. Our tendency to anthropomorphize is powerful, and it's easy for our subconscious to blur the lines when interacting with sophisticated, personified AI. This underlines the need not just for careful design and transparency, but also for the clear conceptual understanding (that 'taxonomy' issue) that helps us distinguish the interface from the underlying reality.
Very true about a generational shift. While future generations might adapt to a different 'normal,' the challenge for us now is perhaps to ensure that this future is shaped consciously, with a clear understanding of the technology's capabilities and risks, rather than simply drifting into new norms shaped by potentially misleading interactions (and tech bros!).
It gets to the point where even the board of directors becomes expendable. All that remains are the vulture capitalists. My apologies for not being able to offer deeper responses today, I'm recovering from surgery, and, naturally, taking Oxycodone.
Oh no, I hope that the surgery went well and you recover fully soon.
Thank you. So far so good. When the pain hits, it hits hard, especially first thing in the morning. Then it gradually fades. I'm hoping from here on out that Acetaminophen will be enough, because the doctor only prescribed enough Oxycodone for one day. I'm just taking it one day at a time, nice and slow.
Fascinating assessment of the EU path. The argument for granting "legal personhood" to AI is compelling. We don't think of corporations as people, but giving them legal independence allows us to use the legal system to create laws at the edges rather than oppressive regulations across everything.
One problem that remains with this conversation is the concept of "ethics". Ethics is gray at the best of times...
Murder is bad but war is sanctioned. Policing is protected but vigilantism is not. Capital punishment, abortion, and euthanasia are legal or not depending on where and when. Theft between citizens is illegal, but theft by a government is accepted as the cost of doing business, and so it goes on. This is perhaps the greatest reason for applying a case-by-case legal system rather than a regulatory one. And the only way to leverage this approach is by creating a legal identity for the AI.
However, this isn't as simple as it sounds either. Perhaps we can hold a robot accountable via the manufacturer or maintainer of the technology. But what do we name in a complaint when it is the cumulative interactions of many tools across many places that involve many human and corporate entities?
In the end I expect we'll see the same result as we see today. The corporations will drive us to waive all rights to the outcomes of the technologies we utilize. The best we might achieve is a replacement robot or financial payout if enough of us are negatively impacted.
Perhaps the problem isn't whether to create a new legal infrastructure, but to improve the one we have so it is more balanced in its resolutions. Using the "right to not use" as the only remedy for protecting our rights can't possibly work when the technology is deeply embedded into every part of society.
Regardless of what the EU does, America is taking a direction of limited to no regulation. I suspect we'll find our path through the legal halls when disasters occur, just as we've done in America from the beginning. Perhaps with two continents each taking their own approach, we'll find a happy medium down the road that can work for everyone.
Just like in the past - it will likely be painful at the individual level for a select few, but across society and time we'll find a path forward for the good of most.
The idea of exploring legal avenues, perhaps drawing parallels with corporate personhood while noting the crucial differences in accountability you mentioned, is definitely part of the wider discussion on how to manage complex AI interactions.
You're right that the gray areas of ethics and the difficulty assigning responsibility when many systems and actors are involved make finding a simple solution very hard for any approach, be it regulation or law.
Ultimately, understanding the specific capabilities of these AI agents, what they can actually do, how independently they operate, seems like a necessary starting point, regardless of the exact governance path taken.
Yes, but "capabilities of these AI agents, what they can actually do, how independently they operate" is a moving target and today it's moving in all directions very fast. We probably won't be able to establish a starting point, we'll just need to dive in and do the best we can. :)
As an attorney, the point that grabbed me is how to write laws that can capture this third genus (neither a being, nor strictly a tool), which to me is very interesting.
But I couldn't understand how it is not quite an intelligence by itself if it "lies" and tries to "preserve itself". If you could expand on that, I'd appreciate it.
Douglas, this is the key. One of the dilemmas, as I see it, for regulators is always around definitions. We struggle as a society to define intelligence, and we certainly struggle to define 'artificial intelligence systems'. I have several papers on this and why it matters from a legal perspective.
Here is one study we did - https://www.springerprofessional.de/en/getting-clarity-by-defining-artificial-intelligence-a-survey/16079200 or https://link.springer.com/chapter/10.1007/978-3-319-96448-5_21
and https://sciendo.com/es/article/10.2478/jagi-2020-0003
Some definitions we assembled from the main authors and practitioners on human and artificial intelligence - https://agisi.org/Defs_intelligence.html
Professor Andrew Maynard has a new Substack post on the same paper, with good insights, worth reading.
https://futureofbeinghuman.com/p/an-important-new-model-for-guiding-agentic-ai-oversight
Visa CEO Ryan McInerney says the company is launching AI agents that will shop and pay on your behalf.
With partners like OpenAI and Perplexity, Visa is enabling AI credentials, spending rules, and merchant trust -- turning payments into infrastructure for agents, not just people.
“You're gonna see this in months… in the next couple quarters.” https://www.youtube.com/watch?v=cFaHHyZ2j6U
Well, that certainly sent a cold shiver up and down my spine. What's especially disturbing to me is that we still don't even have a good handle on regulatory policies pertaining to just plain old software - especially of the internet variety - much less autonomous systems.
Just as an aside, that opening graphic reminds me of the color gamut graphs for graphics/photo software and high end graphics monitors.
I think we have discussed risks and concerns around models in the past, but several of the same concerns are valid for AI agents. The emergence of AI agents poses unprecedented challenges, especially as they become integral to complex, multi-entity workflows. While the paper takes a significant step in understanding these systems, several critical points demand further exploration:
1. From Theory to Real-World Workflows: While the authors argue that we are moving beyond theory, the reality is that much of this remains theoretical when applied to real-world workflows. Complex systems often span multiple business units, organizations, and technologies, creating interconnected dependencies that amplify the risks of failures or hallucinations. For example, in a supply chain spanning various vendors, an agent’s probabilistic outputs at one stage could propagate errors downstream, with cascading consequences (see the small illustrative calculation after this list). Ensuring safe failure, or failing gracefully, in such scenarios will be a fundamental challenge, particularly when agents operate autonomously without human oversight.
2. Responsibility in Failure Scenarios: Accountability remains a critical concern: Who is responsible when things go wrong? Is it the deploying entity, the individual user, the software vendor, or the model creator? The answer becomes even more elusive when agents are embedded in workflows involving multiple stakeholders and systems. A robust framework for assigning liability is imperative, especially in high-stakes domains like healthcare, defense, or finance.
3. High-Stakes Applications: Failures in medical, war, or life-and-death situations highlight the fragility of agent reliability. For example, an AI agent recommending drug dosages in healthcare could misinterpret unstructured patient data, leading to catastrophic outcomes. Who takes responsibility in such cases—and more importantly, how do we ensure rapid intervention to mitigate harm? The notion of "safe failure" becomes paramount here but requires rigorous testing, real-time monitoring, and contingency plans.
4. Bias, Monitoring, and Crisis Management: Agents operating with biases or insufficient oversight could exacerbate crises rather than resolve them. Even with monitoring, human reaction times may lag behind an agent’s rapid, compounding actions in high-stress scenarios. This highlights the need for hybrid systems where agents act within well-defined boundaries, and humans retain ultimate control over critical decisions.
5. Cybersecurity and Ethical Risks: Beyond operational concerns, significant cybersecurity and ethical challenges exist. Agents interacting with other agents or models across entities create vulnerabilities to adversarial attacks, data breaches, and misaligned objectives. These risks point to the need for stringent ethical guidelines, robust security measures, and fail-safes to handle unpredictable edge cases. Unlike humans, agents lack the intuition, adaptability, and moral reasoning necessary to navigate the full complexity of such scenarios.
6. Psychological and Social Impacts: As AI agents become more pervasive, they could have psychological and social effects on humans, such as fostering dependence, reducing interpersonal interactions, or subtly influencing public opinion. For example, AI-powered chatbots or virtual companions might replace human relationships for some individuals, potentially leading to isolation or distorted perceptions of reality.
7. The Risk of Agent Interactions: AI agents often interact with one another in ways that can compound risks, particularly when their goals or operating principles are not aligned. Emergent behaviors from agent-to-agent interactions could lead to unintended consequences. For example, in financial markets, multiple trading agents interacting without proper constraints could trigger flash crashes or market instability.
8. Transparency and Explainability: Many AI agents built on LLMs will function as "black boxes," making their decisions difficult for humans to interpret or understand. This lack of transparency undermines trust and accountability, particularly in high-stakes or regulated environments. For example, if an AI agent denies a loan or misdiagnoses a disease, how do we explain its reasoning to the affected individuals or regulators?
9. Evolution of Agent Capabilities Post-Deployment: AI agents can evolve after deployment, especially with features like online learning or updates to foundational models. This means their behavior could change unpredictably over time, potentially introducing new risks not present during initial testing. For example, an autonomous vehicle could misinterpret road conditions after a software update, leading to accidents. Similarly, AI agents with self-learning capabilities might develop behaviors that deviate from the deployer’s intent.
10. Misalignment of Incentives Between Stakeholders: Stakeholders involved in developing, deploying, and using AI agents often have misaligned goals. For example:
- Developers may prioritize innovation and speed in the market.
- Deploying organizations may prioritize cost savings and efficiency.
- End users may prioritize safety and reliability.
These competing priorities can lead to inadequate testing, rushed deployments, or improper use.
The paper’s framework—autonomy, efficacy, goal complexity, and generality—provides a valuable lens, but translating these theoretical constructs into actionable safeguards is the real challenge. I will end with a lesson from the aviation industry: "Safety is not the absence of accidents but the presence of defenses." We must apply this principle to AI governance, recognizing that the stakes are too high for anything less.
Missed one:
11. Long-Term Effects on Human Agency and Decision-Making: As AI agents increasingly take over decision-making processes, there is a risk of "automation complacency," where humans rely too heavily on AI outputs without critically evaluating them. This can erode human expertise and judgment over time, particularly in domains like medicine, law, and finance, where nuanced reasoning is critical. For example, in aviation, pilots have been known to over-rely on autopilot systems, leading to catastrophic failures when manual intervention was required. A similar dynamic could emerge with AI agents in other fields.
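Returning to the cascading-risk point in #1, here is a small back-of-the-envelope sketch (my own illustration, not from the paper) of why errors compound across a multi-step agent workflow, under the simplifying assumption that each step fails independently with some small probability:

```python
# Illustrative only: assumes each step in an agent workflow fails
# independently with probability p (a simplification of real dependencies).
def chance_of_at_least_one_error(p: float, steps: int) -> float:
    """Probability that a chain of `steps` agent actions contains at least one error."""
    return 1 - (1 - p) ** steps

for steps in (1, 5, 20, 50):
    print(steps, round(chance_of_at_least_one_error(0.02, steps), 3))
# With a 2% per-step error rate: 1 step -> 0.02, 5 -> ~0.096,
# 20 -> ~0.332, 50 -> ~0.636. Errors compound quickly without checkpoints.
```

Even optimistic per-step reliability leaves long, unattended workflows fragile, which is why safe failure and checkpoints matter so much.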
It is true MG, we have had a few discussions on this. Those points touch on a whole suite of critical challenges that arise when we move from characterizing AI agents to deploying them in complex, real-world, multi-stakeholder environments, and they comprehensively cover the immense practical difficulties that lie ahead.
You're absolutely right that while the Kasirzadeh & Gabriel paper aims to move beyond pure theory, the application of its framework to the messy reality of interconnected workflows faces significant hurdles. I see their contribution less as a complete roadmap for implementation, and more as providing essential foundational concepts and a much-needed taxonomy. By defining dimensions like autonomy, efficacy, goal complexity, and generality, they give us a clearer language and framework to even begin systematically analyzing and addressing the profound risks you've laid out.
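As a rough illustration of how such a characterization might be made operational, here is a minimal sketch of an 'agent profile' along the four dimensions with a toy governance rule attached; the 1-5 scales and the sign-off threshold are my own assumptions for illustration, not the authors' scheme:

```python
from dataclasses import dataclass

# The 1-5 scales and the sign-off rule below are illustrative assumptions,
# loosely inspired by the paper's dimensions; they are not the authors' scheme.
@dataclass
class AgentProfile:
    autonomy: int         # 1 = suggests actions only, 5 = acts with no human in the loop
    efficacy: int         # 1 = simulated environments only, 5 = direct physical/financial impact
    goal_complexity: int  # 1 = single narrow task, 5 = open-ended objectives
    generality: int       # 1 = one domain, 5 = many domains

    def requires_human_signoff(self) -> bool:
        # Toy governance rule: high autonomy combined with high real-world
        # impact triggers mandatory human oversight before deployment.
        return self.autonomy >= 4 and self.efficacy >= 4

shopping_agent = AgentProfile(autonomy=4, efficacy=4, goal_complexity=2, generality=2)
print(shopping_agent.requires_human_signoff())  # True
```

The value of a profile like this is less the numbers themselves than that it gives regulators and deployers a shared, observable vocabulary to attach rules to.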
Indeed, many of the crucial issues you highlighted underscore why such a framework is necessary as a starting point. For example:
- Understanding an agent's autonomy level (#1, #4, #11) is fundamental to discussions about accountability (#2), safe failure modes (#1, #3), and designing appropriate human oversight to counter automation complacency.
- Assessing efficacy, its potential impact within specific environments (simulated, mediated, physical), is critical to evaluating risks in high-stakes applications (#3) and understanding potential downstream consequences (#1).
- Higher goal complexity often correlates with challenges in transparency and explainability (#8), demanding more sophisticated verification methods.
- The generality of an agent influences the breadth of potential risks, including cybersecurity vulnerabilities and unintended interactions across domains (#5, #7).
- The paper also acknowledges that agent profiles are dynamic and can evolve post-deployment (#9), reinforcing your point about ongoing monitoring.
Ultimately, I completely agree with you: translating these characterizations into robust, practical safeguards, clear liability frameworks (#2), effective monitoring (#4), and ethical guidelines (#5), and addressing the complex social/psychological impacts (#6) and stakeholder misalignments (#10), is the real and monumental challenge.
Your invocation of the aviation industry's safety principle, "Safety is not the absence of accidents but the presence of defenses", is perfectly apt.
Foundational work like characterizing agents helps us understand what we need to build defenses against, but the hard work of designing, implementing, and maintaining those multi-layered defenses across entire systems is the critical path forward.
Thank you again for adding so much practical depth and foresight to this discussion. It's a vital perspective on the long road ahead.
Incidentally, Iason and colleagues have another important paper on ethics (and risks) which I use on my program; it often wakes people up! https://arxiv.org/pdf/2404.16244
Additionally, this is how I see humans differing from AI models and agents: The real world is messy, unpredictable, and, in some instances, unknown and unknowable. It operates on complex, non-linear systems filled with ambiguity and countless interdependencies. That’s why humans have evolved capabilities like common sense, intuition, and the ability to learn from experience—tools that enable us to navigate uncertainty, make reasonable assumptions, and adapt to ever-changing environments. These traits are essential for survival, reproduction, and thriving in a world where not every problem has a clear solution or every scenario has a predefined rule.
By contrast, while powerful in structured and rule-based tasks, artificial intelligence faces significant challenges in addressing the "messiness" of the real world. It struggles with edge cases and situations outside its training data and lacks the nuanced contextual understanding and flexibility humans inherently possess. AI can excel in predictable, well-defined scenarios. Still, it often falters in the last mile, where ambiguity, rare exceptions, and unpredictable conditions demand creative problem-solving—qualities rooted in human intuition and experiential knowledge.
These limitations highlight the unique role of human cognition in complementing AI's strengths. As AI evolves, bridging this gap will require breakthroughs like general intelligence, common sense reasoning, and emergent adaptability. Until then, the interplay between human ingenuity and artificial intelligence will remain critical in tackling our world's unpredictable and unknowable aspects.
Excellent follow-up. Humans are indeed shaped by and adapted for the inherent 'messiness,' ambiguity, and unpredictability of reality, relying on common sense, intuition, and deep experiential learning in ways that current AI systems simply don't replicate. Your point about AI excelling in structured scenarios but often struggling with the 'last mile', where edge cases and true contextual understanding are needed, is aligned with my thinking and highlights a critical limitation.
This difference is highly relevant when considering the framework from the Kasirzadeh & Gabriel paper. It underscores why human oversight remains crucial and why certain levels of autonomy might be inappropriate or require stringent validation precisely because AI lacks these human adaptive traits.
Given this gap, your emphasis on the complementary roles of human ingenuity and AI seems exactly right for the foreseeable future... there is the last mile again. Understanding what AI currently cannot do is just as important as understanding what it can, especially for setting realistic expectations and building safe, effective human-AI systems.
Over the past few days, I’ve been exploring this idea as more and more AI agents come online (even though they have the issues we talked about) and as I read all the things people are writing: Could large language models (LLMs) be considered "the Internet in a box", or even the next iteration of the Internet, one that goes way beyond what we have today, excluding the infrastructure part? This is not a fully formed thought, but I wanted to throw it out to see how you react. Specifically, could they fundamentally change how we interact with information and perform tasks online, evolving beyond a search-engine replacement into something far more integrated? While I’m not suggesting that LLMs can do everything the Internet can, I believe their limitations - such as hallucinations or the lack of real-time information - are technical challenges that will likely be solved or minimized in the future. Also, perhaps we should treat LLMs not as intelligent machines but as an abstraction layer for interacting with services and information.
Here’s the crux of my thinking: Instead of visiting multiple websites to gather fragmented pieces of information, an LLM acts as a centralized knowledge source. It synthesizes vast amounts of data into coherent, user-friendly responses, making learning about complex topics or accomplishing specific tasks easier. Yes, LLMs sometimes hallucinate, but isn’t that analogous to the Internet itself, where misinformation, biases, and inaccuracies are common? Even the Internet "hallucinates" in its way, requiring users to visit multiple sources and critically piece together the truth.
By automating this synthesis process, LLMs eliminate much of the manual effort required to search, evaluate, and combine information. In this sense, they serve as the perfect abstraction layer over the current Internet. But looking further ahead, could LLMs go beyond this role and fundamentally revolutionize how we engage with the Internet?
Imagine this: LLMs could eventually replace the fragmented experience of navigating websites with a unified interface for accessing all knowledge and services. With technological advancements, they could integrate real-time data, multimedia content, and transactional capabilities. For example, you could ask an LLM to summarize a live event, analyze trends as they happen, or perform complex tasks like booking travel, managing finances, or coordinating schedules—all without needing to visit individual websites.
What do you think? Could LLMs represent an evolution of search engines and a reimagining of the Internet?
There's a lot of truth to the core of your vision. LLMs are fundamentally changing how many people interact with information. The ability to synthesize vast amounts of text, answer complex questions conversationally, and automate tasks like summarizing or drafting represents a significant leap beyond traditional keyword search. The idea of a unified, conversational abstraction layer over the fragmented web of websites and services is powerful and aligns with the direction many agentic systems are heading. Instead of us navigating the web's structure, the interface understands our goals and navigates or synthesizes for us. In this sense, they certainly represent an evolution beyond search engines.
However, I think a few distinctions and challenges are important as we consider this future:
Current LLMs are typically trained on static snapshots of web data. They aren't "the internet in a box" but rather incredibly sophisticated models of the text and code present in their training data (which is often a tad out of date, though that is getting much better - see OpenAI's new search features and Perplexity). We then need to recognize that accessing real-time information or interacting with live services requires additional mechanisms.
You rightly identify limitations like hallucinations and lack of real-time data as technical challenges. Researchers are actively working on these. Retrieval-Augmented Generation (RAG) is a key technique where LLMs are combined with real-time information retrieval systems. This allows them to access current data and ground their answers in specific sources, significantly reducing (though not eliminating) hallucinations and providing up-to-date information. However, ensuring consistent factual accuracy and preventing subtle hallucinations remains a major research frontier.
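For readers unfamiliar with the pattern, here is a deliberately minimal sketch of the RAG idea; the `search_index` and `generate` callables are placeholders I've invented for illustration, not any specific library's API:

```python
# Minimal retrieval-augmented generation loop. `search_index` and `generate`
# stand in for a real vector store and LLM call; they are placeholders.
def answer_with_rag(question: str, search_index, generate) -> str:
    documents = search_index(question, top_k=3)           # 1. retrieve current sources
    context = "\n\n".join(doc.text for doc in documents)  # 2. assemble grounding context
    prompt = (
        "Answer using ONLY the sources below and cite them. "
        "If the answer is not in the sources, say you don't know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)                               # 3. generate a grounded answer
```

The key design choice is that the retrieval step, not the model's parametric memory, supplies the facts the answer is allowed to rest on.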
Hallucinations vs. Misinformation: Your analogy between LLM hallucinations and internet misinformation is interesting, but the underlying mechanisms differ. Internet misinformation often has human origins (error, bias, intent), existing as discrete pieces of content we can potentially trace. LLM hallucinations are emergent artifacts of the generative process: the model statistically constructs plausible but unfounded statements. This makes verification different; we need to check the LLM's output against external reality, often without clear source attribution unless techniques like RAG are explicitly used and cited.
As you noted, LLMs rely entirely on the existing internet infrastructure and, crucially, on the vast corpus of human-created content they are trained on or access via RAG. They are a layer upon, not a replacement of. Their existence could profoundly change the economics and incentives of content creation online (e.g., if users get synthesized answers instead of visiting original sources). This interdependence and its effects on the information ecosystem are subjects of ongoing debate and research.
The vision you describe, performing complex tasks like booking travel or managing finances (like the Visa announcement I mentioned in my Notes post today), moves beyond the LLM simply being an information interface. It requires AI agents built using LLMs, equipped with tools, memory, and the ability to execute actions. This immediately brings back all the critical governance challenges regarding autonomy, efficacy, liability, bias, security, and alignment that we've been discussing and that papers like Kasirzadeh & Gabriel's seek to frame. The power of the unified interface is tied directly to the risks of agentic action.
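To make the difference between an 'information interface' and an 'agent' concrete, here is a toy sketch of the kind of action loop involved; every name here, including the spending-limit rule, is a hypothetical illustration rather than Visa's or any vendor's actual design:

```python
# Toy agent step: the model proposes an action, guardrails decide whether it runs.
# `propose_action` and `execute` are placeholders, not any real vendor API.
SPENDING_LIMIT_USD = 200          # example user-defined rule, enforced outside the model
ALLOWED_TOOLS = {"search", "book", "pay"}

def run_agent_step(goal: str, propose_action, execute) -> dict:
    action = propose_action(goal)  # e.g. {"tool": "pay", "amount": 250, "merchant": "..."}
    if action["tool"] not in ALLOWED_TOOLS:
        return {"status": "blocked", "reason": "tool not on the allow-list"}
    if action["tool"] == "pay" and action.get("amount", 0) > SPENDING_LIMIT_USD:
        return {"status": "blocked", "reason": "amount exceeds the user's spending rule"}
    return execute(action)         # only in-policy actions reach the real world
```

The point is that once the model's output is wired to tools that move money or book services, the guardrails have to live in code outside the model, not just in the prompt.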
So, could LLMs (or rather, LLM-powered agents) represent the next iteration of the Internet? Perhaps it's more accurate to say they represent the next major interface paradigm for the internet and the services built upon it. They are driving a shift from manual navigation and information retrieval towards goal-oriented conversation and task execution. So you are definitely onto something!
My own belief is that the very platform we use will change fundamentally. Do you remember that device, the Humane Pin? The launch was not great, but they get it with their OS, and I think Apple or Google will do this; Microsoft is certainly working on it. So it is only a matter of time before the OS changes, and whoever owns the platform wins. This is a short video showing the OS - https://www.youtube.com/watch?v=fsnysAHD2CU
The optimism lies in the potential for this vastly more intuitive and powerful way to interact with digital resources. The caution lies in the significant technical hurdles still remaining (especially around reliability and grounding) and the complex safety, ethical, and societal challenges that arise as these systems become more capable and agentic. The future might be less about LLMs replacing the internet and more about a deep, complex, and hopefully carefully managed integration.
"The danger isn’t that AI agents will become like us. It’s that they won’t have to." This phrase jumped out at me because in the groups I am in, that are engaged with AI, there is a discussion about whether to consider AI and Robots as "tools" or as "beings". The problem is that we don't have an understanding of what something is that is neither tool nor being. Perhaps this is proof of your final takeaway that the problem is Taxonomy.
The struggle to find the way forward, determining the true risk and then appropriate levels of regulation will likely take us from one extreme to another - from over-reach to loss of control. Both are likely to present, and probably unevenly as we move forward.
We saw it with nuclear weapons, and then nuclear power, and again with the roll out of the internet. We want to have the freedom to expand and take advantage of new technologies, but at some point they get beyond us and our lack of understanding puts us into a position of back-tracking. How can we know the right level of control until we've crossed a line? After all, the line is rarely visible until we're on the other side of it.
Mark Andreesson shares that the Biden administration had every intention of bringing AI under their complete control from the beginning. With a new administration, I suspect we've swung to the other extreme, and the Tech Bros will be left to decide what's best for all of us.
The most important thing is for us all to become as educated as possible in the technology and make a point of maintaining a chair at the table. It's why I appreciate articles such as this Colin, to encourage everyone to be aware and contribute with informed opinions - it is a civic duty for the good of us all.
Thank you Susan. I truly appreciate you sharing your reflections and connecting with broader discussions.
That is a key tension, I was stuck on how to convey it, I am delighted that you understood it. "The danger isn’t that AI agents will become like us. It’s that they won’t have to." Your group discussions about 'tool' vs. 'being' is exactly the kind of conceptual problem arising because, as you note, we lack a robust category for entities possessing agency without necessarily having 'being' in the traditional sense. This directly supports the paper's premise, that a core issue is taxonomy. Kasirzadeh and Gabriel's framework, focusing on dimensions like autonomy and efficacy, is an attempt to build that middle ground vocabulary based on observable characteristics rather than getting stuck on ontological debates. Incidentally a few years back the EU voted on whether to give 'personhood' to AI / Robots. And I seem to think one country gave it Citizenship! I must check.
Your point about the likely swings in regulation between over-reach and loss of control is right, I recall that Andreesen comment which he said was a turning point for him with his move to the 'new' administration, I guess so they can keep more control!
Historical parallels with nuclear technology and the internet are apt. There is huge difficulty of governing technologies whose full impact is unclear until 'we're on the other side of the line.' Perhaps a more granular, capability-focused approach to classification could help regulators calibrate responses more precisely, potentially mitigating those extreme swings, though the challenge remains immense.
The ideal scenario is that governance is guided by informed understanding rather than just political reflex or industry capture. Which brings me to your vital closing point: the necessity for all of us to become educated, stay aware, and contribute with informed opinions. It truly feels like a civic duty with technology this transformative.
"Incidentally a few years back the EU voted on whether to give 'personhood' to AI / Robots. And I seem to think one country gave it Citizenship! I must check."
I wouldn't be surprised. We did a similar thing by making corporations their own entity. The thing is that with corporations there is still a board who can be held accountable, so it's mostly a financial distinction that gives rights to something that not alive.
I find this more scary, especially as they become personified and take on roles as service providers. Our subconscious has a difficult time distinguishing the nuanced differences of things treated as the same. It is this fundamental issue, like very good propaganda, that is of concern.
But perhaps it is only of concern to a generation that has lived without this technology and knows the difference. Two generations from now it will just "be different" and have morphed into whatever it is. By that time - that will be what normal looks like.
Worth watching this short video - https://substack.com/home/post/p-162651762
It seems to me, a summary of this video is: All humans are expendable, even the entire C-Suite. All that remains are the vulture capitalists. Can they alone provide a large enough market to justify all of this?
That's a frame that assumes our economy will continue as it is. Perhaps we need to question what the end game is. Likely the concepts of capitalist - socialist will disappear altogether and something different will be created with a world that has very few people and utilizes technology in a way that eliminates scarcity in the way we know it.
We suspect we can't see the future if we limit it to the concepts and terminology we use today.
Agreed. AI could potentially lead to fundamental shifts where labels like 'capitalist' or 'socialist' become obsolete, possibly even altering concepts like scarcity as we know them.
It's a crucial reminder that our current terminology and frameworks might indeed limit our ability to envision such radically different futures. This very uncertainty perhaps highlights why understanding and guiding the development of these powerful technologies now is so important, even if we can't fully predict the 'end game.'
Yeah, techno-feudalism. When people disappear, so too does scarcity. MuskRat's fantasy world.
Fascinating mind exploration, and fun toys to play with and push to their limit, but in the end it always comes back to "why?".
Why do we have a construct of a business?
Why do we have a construct of human labor?
Why do we have a construct of economics, money, and assets?
Why do we have a construct of play and entertainment?
The answer to all these questions is "humans". Without humans, the planet would not have any of this.
If we follow the logic of this video into a world run by one AI mega-company, then what keeps us as the center of the construct? And if we're not the center of the construct then why bother doing any of it?
It reminds me of a conversation I had with my father when I was younger and bought into the early propaganda of "humans are the cancer". His response was, "If we're all gone, then who will be able to appreciate the planet we wish to save from ourselves?"
It's an excellent question, and makes it very easy to see where a few elites could decide that the planet could be saved AND appreciated if ONLY THEY were left. But to do that would need technology to replace labor. Every action I've watched since the 1980's has created a direct line from that initial propaganda of "humans are the cancer" to where we are today.
Perhaps in the end, humans become extinct with only a few who have melded with the technology to live forever in appreciation of the planet. Is that a goal worth pursuing? I don't know and maybe it doesn't matter. My life span is finite so perhaps 8B souls on the planet is just the transition before we evolve into a completely different being.
Perhaps all we'll have lost is the constructs we created to understand and manage this stage of the evolution on the planet. Maybe "life" is not the most important thing. Maybe life is just a biological stepping stone to something much more complex.
In the end, what are we afraid of and why are we afraid of it? Usually this takes us back to a very individual and personal answer. I care because I have a child who will need to navigate this transition. But perhaps he is one of the last generation to even be here if pro-creation continues to be curtailed through biological warfare, psychological gender warfare, and of course the usual kinetic warfare.
Perhaps the trick is to recognize the end game, if there is one, so that we at least make decisions with our eyes open. Everything else with the technology is just leveraging human curiosity to see what it can do next.
I will think carefully about the coversation with your father. In the meantime what an incredible point - "what keeps us as the center of the construct?"
I don't believe we will evolve - the AI will, and will exist only to "serve" itself.
I checked! The EU Parliament did indeed consider a form of 'electronic personhood' back in 2017, mainly related to liability. There is a good article on why they got it wrong not to vote in favor - https://www.europeanlawblog.eu/pub/refusing-to-award-legal-personality-to-ai-why-the-european-parliament-got-it-wrong/release/1
...and Saudi Arabia famously granted symbolic 'citizenship' to the robot Sophia. These examples highlight how we're already grappling with fitting these entities into our existing legal and social structures.
Your comparison to corporate personhood is very relevant. This time the critical difference lies in accountability, there's still a human board responsible for a corporation. The potential lack of clear human accountability for highly autonomous AI actions is precisely what makes the governance challenge so much more complex and, as you say, potentially scarier.
I strongly agree with your concern about the psychological impact. Our tendency to anthropomorphize is powerful, and it's easy for our subconscious to blur the lines when interacting with sophisticated, personified AI. This underlines the need not just for careful design and transparency, but also for the clear conceptual understanding (that 'taxonomy' issue) that helps us distinguish the interface from the underlying reality.
Very true about a generational shift. While future generations might adapt to a different 'normal,' the challenge for us now is perhaps to ensure that this future is shaped consciously, with a clear understanding of the technology's capabilities and risks, rather than simply drifting into new norms shaped by potentially misleading interactions (and tech bros!).
It gets to the point where even the board of directors becomes expendable. All that remains are the vulture capitalists. My apologies for not being able to offer deeper responses today, I'm recovering from surgery, and, naturally, taking Oxycodone.
Oh no, I hope that the surgery went well and you recover fully soon
Thank you. So far so good. When the pain hits, it hits hard, especially first thing in the morning. Then it gradually fades. I'm hoping from here on out that Acetaminophen will be enough, because the doctor only prescribed enough Oxycodone for one day. I'm just taking it one day at a time, nice and slow.
Fascinating assessment of the EU path. The position for supporting a "legal person-hood" to AI is compelling. We don't think of corporations as people, but by giving them legal independence it allows us to utilize the legal system to create laws based on the edges rather than oppressive regulations across everything.
One problem that remains with this conversation is the concept of "ethics". Ethics is gray at the best of times...
Murder is bad but war is sanctioned. Policing is protected but anti-vigilance is not. Capital punishment, abortion, euthanasia is legal or not depending on where and when. Theft between citizens is illegal, but theft by a government is accepted as doing business, as so it goes on. This is perhaps the greatest reason for applying a by-situation legal rather than regulatory system. And the only way to leverage this approach is through creating a legal identity for the AI.
However, this isn't as simple as it sounds either. Perhaps we can hold a robot accountable via the manufacturer or maintainer of the technology. But what do we name in a complaint when it is the cumulative interactions of many tools across many places that involve many human and corporate entities?
In the end I expect we'll see the same result as we see today. The corporations will drive us to waive all rights to the outcomes of the technologies we utilize. The best we might achieve is a replacement robot or financial payout if enough of us are negatively impacted.
Perhaps, the problem isn't whether to create a new legal infrastructure, but to improve the one we have to be more balanced in it resolutions. Using the "right to not use" as the only remedy for protecting our rights can't possibly work when the technology is deeply embedded into every part of society.
Regardless of what the EU does, America is taking a direction of limited to no regulation. I suspect we'll find our path through the legal halls when disaster occur, just as we've done in America from the beginning. Perhaps having two continents each taking their own approach we'll find a happy medium down the road that can work for everyone.
Just like in the past - it will likely be painful at the individual level for a select few, but across society and time we'll find a path forward for the good of most.
The idea of exploring legal avenues, perhaps drawing parallels with corporate personhood while noting the crucial differences in accountability you mentioned, is definitely part of the wider discussion on how to manage complex AI interactions.
You're right that the gray areas of ethics and the difficulty assigning responsibility when many systems and actors are involved make finding a simple solution very hard for any approach, be it regulation or law.
Ultimately, understanding the specific capabilities of these AI agents, what they can actually do, how independently they operate, seems like a necessary starting point, regardless of the exact governance path taken.
Yes, but "capabilities of these AI agents, what they can actually do, how independently they operate" is a moving target and today it's moving in all directions very fast. We probably won't be able to establish a starting point, we'll just need to dive in and do the best we can. :)
Citizenship to AI/Robots. That is seriously creepy. Equally creepy is our habit of anthropomorphizing everything in site - especially AI and robots.
Exactly!
As an attorney, the point that I could grasp is how to write laws that can grasp this third genus (neither a being, nor strictly a tool), which to me is very interesting.
But I couldn't understand how it is not quite an intelligence by itself if it "lies" and tries to "preserve itself". If you could expand on that, I'd appreciate it.
Douglas, this is the key. One of the dilemma's as I see it for regulators, is always around definitions. We struggle as a society to define intelligence, and we certainly struggle to define 'artificial intelligence systems'. I have several papers on this and why it matters from a legal perspective.
Here is one study we did - https://www.springerprofessional.de/en/getting-clarity-by-defining-artificial-intelligence-a-survey/16079200 or https://link.springer.com/chapter/10.1007/978-3-319-96448-5_21
and https://sciendo.com/es/article/10.2478/jagi-2020-0003
Some definitions we assembled from the main authors, practitioners on human and artifical intelligence - https://agisi.org/Defs_intelligence.html
Professor Andrew Maynard has a new Substack post on the same paper, with good insights, worth reading.
https://futureofbeinghuman.com/p/an-important-new-model-for-guiding-agentic-ai-oversight
Visa CEO Ryan McInerney says the company is launching AI agents that will shop and pay on your behalf.
With partners like OpenAI and Perplexity, Visa is enabling AI credentials, spending rules, and merchant trust -- turning payments into infrastructure for agents, not just people.
“You're gonna see this in months… in the next couple quarters.” https://www.youtube.com/watch?v=cFaHHyZ2j6U
Well, that certainly sent a cold shiver up and down my spine. What's especially disturbing to me is that we still don't even have a good handle on regulatory policies pertaining to just plain old software - especially of the internet variety - much less autonomous systems.
Just as an aside, that opening graphic reminds me of the color gamut graphs for graphics/photo software and high end graphics monitors.
I think we have discussed risks and concerns around models in the past, but several of the same concerns are valid for AI agents. The emergence of AI agents poses unprecedented challenges, especially as they become integral to complex, multi-entity workflows. While the paper takes a significant step in understanding these systems, several critical points demand further exploration:
1. While the authors argue that we are moving beyond theory, the reality is that much of this remains theoretical when applied to real-world workflows. Complex systems often span multiple business units, organizations, and technologies, creating interconnected dependencies that amplify the risks of failures or hallucinations. For example, in a supply chain spanning various vendors, an agent’s probabilistic outputs at one stage could propagate errors downstream, with cascading consequences. Ensuring safe failure or failing gracefully in such scenarios will be a fundamental challenge, particularly when agents operate autonomously without human oversight.
2. Responsibility in Failure Scenarios: Accountability remains a critical concern: Who is responsible when things go wrong? Is it the deploying entity, the individual user, the software vendor, or the model creator? The answer becomes even more elusive when agents are embedded in workflows involving multiple stakeholders and systems. A robust framework for assigning liability is imperative, especially in high-stakes domains like healthcare, defense, or finance.
3. High-Stakes Applications: Failures in medical, war, or life-and-death situations highlight the fragility of agent reliability. For example, an AI agent recommending drug dosages in healthcare could misinterpret unstructured patient data, leading to catastrophic outcomes. Who takes responsibility in such cases—and more importantly, how do we ensure rapid intervention to mitigate harm? The notion of "safe failure" becomes paramount here but requires rigorous testing, real-time monitoring, and contingency plans.
4. Bias, Monitoring, and Crisis Management: Agents operating with biases or insufficient oversight could exacerbate crises rather than resolve them. Even with monitoring, human reaction times may lag behind an agent’s rapid, compounding actions in high-stress scenarios. This highlights the need for hybrid systems where agents act within well-defined boundaries, and humans retain ultimate control over critical decisions.
5. Cybersecurity and Ethical Risks: Beyond operational concerns, significant cybersecurity and ethical challenges exist. Agents interacting with other agents or models across entities create vulnerabilities to adversarial attacks, data breaches, and misaligned objectives. These risks point to the need for stringent ethical guidelines, robust security measures, and fail-safes to handle unpredictable edge cases. Unlike humans, agents lack the intuition, adaptability, and moral reasoning necessary to navigate the full complexity of such scenarios.
6. Psychological and Social Impacts: As AI agents become more pervasive, they could have psychological and social effects on humans, such as fostering dependence, reducing interpersonal interactions, or subtly influencing public opinion. For example, AI-powered chatbots or virtual companions might replace human relationships for some individuals, potentially leading to isolation or distorted perceptions of reality.
7. The Risk of Agent Interactions: AI agents often interact with one another in ways that can compound risks, particularly when their goals or operating principles are not aligned. Emergent behaviors from agent-to-agent interactions could lead to unintended consequences. For example, In financial markets, multiple trading agents interacting without proper constraints could trigger flash crashes or market instability.
8. Transparency and Explainability: Many AI agents interacting with LLM models will function as "black boxes," making difficult decisions for humans to interpret or understand. This lack of transparency undermines trust and accountability, particularly in high-stakes or regulated environments. For example, If an AI agent denies a loan or misdiagnoses a disease, how do we explain its reasoning to the affected individuals or regulators?
9. Evolution of Agent Capabilities Post-Deployment: AI agents can evolve after deployment, especially with features like online learning or updates to foundational models. This means their behavior could change unpredictably over time, potentially introducing new risks not present during initial testing. For example, an autonomous vehicle could misinterpret road conditions after a software update, leading to accidents. Similarly, AI agents with self-learning capabilities might develop behaviors that deviate from the deployer’s intent.
10. Misalignment of Incentives Between Stakeholders: Stakeholders involved in developing, deploying, and using AI agents often have misaligned goals. For example:
- Developers may prioritize innovation and speed in the market.
- Deploying organizations may prioritize cost savings and efficiency.
- End users may prioritize safety and reliability.
These competing priorities can lead to inadequate testing, rushed deployments, or improper use.
The paper’s framework—autonomy, efficacy, goal complexity, and generality—provides a valuable lens, but translating these theoretical constructs into actionable safeguards is the real challenge. I will end with a lesson from the aviation industry: "Safety is not the absence of accidents but the presence of defenses." We must apply this principle to AI governance, recognizing that the stakes are too high for anything less.
Worth watching this short video - https://substack.com/home/post/p-162651762
Missed one:
11. Long-Term Effects on Human Agency and Decision-Making: As AI agents increasingly take over decision-making processes, there is a risk of "automation complacency," where humans rely too heavily on AI outputs without critically evaluating them. This can erode human expertise and judgment over time, particularly in domains like medicine, law, and finance, where nuanced reasoning is critical. For example, in aviation, pilots have been known to over-rely on autopilot systems, leading to catastrophic failures when manual intervention was required. A similar dynamic could emerge with AI agents in other fields.
It is true MG, we have had a few discussions on this. Those points capture a whole suite of critical challenges that arise when we move from characterizing AI agents to deploying them in complex, real-world, multi-stakeholder environments. Your points comprehensively cover the immense practical difficulties that lie ahead.
You're absolutely right that while the Kasirzadeh & Gabriel paper aims to move beyond pure theory, the application of its framework to the messy reality of interconnected workflows faces significant hurdles. I see their contribution less as a complete roadmap for implementation, and more as providing essential foundational concepts and a much-needed taxonomy. By defining dimensions like autonomy, efficacy, goal complexity, and generality, they give us a clearer language and framework to even begin systematically analyzing and addressing the profound risks you've laid out.
Indeed, many of the crucial issues you highlighted underscore why such a framework is necessary as a starting point. For example:
- Understanding an agent's autonomy level (#1, #4, #11) is fundamental to discussions about accountability (#2), safe failure modes (#1, #3), and designing appropriate human oversight to counter automation complacency.
- Assessing efficacy, its potential impact within specific environments (simulated, mediated, physical), is critical to evaluating risks in high-stakes applications (#3) and understanding potential downstream consequences (#1).
- Higher goal complexity often correlates with challenges in transparency and explainability (#8), demanding more sophisticated verification methods.
- The generality of an agent influences the breadth of potential risks, including cybersecurity vulnerabilities and unintended interactions across domains (#5, #7).
The paper also acknowledges that agent profiles are dynamic and can evolve post-deployment (#9), reinforcing your point about ongoing monitoring.
Ultimately, I completely agree with you, translating these characterizations into robust, practical safeguards, clear liability frameworks (#2), effective monitoring (#4), ethical guidelines (#5), and addressing the complex social/psychological impacts (#6) and stakeholder misalignments (#10) is the real and monumental challenge.
Your invocation of the aviation industry's safety principle, "Safety is not the absence of accidents but the presence of defenses", is perfectly apt.
Foundational work like characterizing agents helps us understand what we need to build defenses against, but the hard work of designing, implementing, and maintaining those multi-layered defenses across entire systems is the critical path forward.
Thank you again for adding so much practical depth and foresight to this discussion. It's a vital perspective on the long road ahead.
Incidentally, Iason and colleagues have another important paper on ethics (and risks) which I use on my program; it often wakes people up! https://arxiv.org/pdf/2404.16244
Additionally, this is how I see how humans differentiate from AI models and agents: The real world is messy, unpredictable, and, in some instances, unknown and unknowable. It operates on complex, non-linear systems filled with ambiguity and countless interdependencies. That’s why humans have evolved capabilities like common sense, intuition, and the ability to learn from experience—tools that enable us to navigate uncertainty, make reasonable assumptions, and adapt to ever-changing environments. These traits are essential for survival, reproduction, and thriving in a world where not every problem has a clear solution or every scenario has a predefined rule.
By contrast, while powerful in structured and rule-based tasks, artificial intelligence faces significant challenges in addressing the "messiness" of the real world. It struggles with edge cases and situations outside its training data and lacks the nuanced contextual understanding and flexibility humans inherently possess. AI can excel in predictable, well-defined scenarios. Still, it often falters in the last mile, where ambiguity, rare exceptions, and unpredictable conditions demand creative problem-solving—qualities rooted in human intuition and experiential knowledge.
These limitations highlight the unique role of human cognition in complementing AI's strengths. As AI evolves, bridging this gap will require breakthroughs like general intelligence, common sense reasoning, and emergent adaptability. Until then, the interplay between human ingenuity and artificial intelligence will remain critical in tackling our world's unpredictable and unknowable aspects.
Excellent follow-up. Humans are indeed shaped by and adapted for the inherent 'messiness,' ambiguity, and unpredictability of reality, relying on common sense, intuition, and deep experiential learning in ways that current AI systems simply don't replicate. Your point about AI excelling in structured scenarios but often struggling with the 'last mile', where edge cases and true contextual understanding are needed, is aligned with my thinking and highlights a critical limitation.
This difference is highly relevant when considering the framework from the Kasirzadeh & Gabriel paper. It underscores why human oversight remains crucial and why certain levels of autonomy might be inappropriate or require stringent validation precisely because AI lacks these human adaptive traits.
Given this gap, your emphasis on the complementary roles of human ingenuity and AI seems exactly right for the foreseeable future; there is the last mile again. Understanding what AI currently cannot do is just as important as understanding what it can, especially for setting realistic expectations and building safe, effective human-AI systems.
Over the past few days, I’ve been exploring an idea as more and more AI agents come online (even with the issues we have discussed) and as I read what people are writing: could large language models (LLMs) be considered "the Internet in a box", or even the next iteration of the Internet, one that goes well beyond what we have today (excluding the infrastructure)? This is not a fully formed thought, but I wanted to throw it out to see how you react. Specifically, could they fundamentally change how we interact with information and perform tasks online, evolving beyond a search-engine replacement into something far more integrated? I’m not suggesting that LLMs can do everything the Internet can, but I believe their limitations, such as hallucinations or the lack of real-time information, are technical challenges that will likely be solved or minimized in the future. Perhaps we should treat LLMs less as intelligent machines and more as an abstraction layer for interacting with services and information.
Here’s the crux of my thinking: Instead of visiting multiple websites to gather fragmented pieces of information, an LLM acts as a centralized knowledge source. It synthesizes vast amounts of data into coherent, user-friendly responses, making learning about complex topics or accomplishing specific tasks easier. Yes, LLMs sometimes hallucinate, but isn’t that analogous to the Internet itself, where misinformation, biases, and inaccuracies are common? Even the Internet "hallucinates" in its way, requiring users to visit multiple sources and critically piece together the truth.
By automating this synthesis process, LLMs eliminate much of the manual effort required to search, evaluate, and combine information. In this sense, they serve as the perfect abstraction layer over the current Internet. But looking further ahead, could LLMs go beyond this role and fundamentally revolutionize how we engage with the Internet?
Imagine this: LLMs could eventually replace the fragmented experience of navigating websites with a unified interface for accessing all knowledge and services. With technological advancements, they could integrate real-time data, multimedia content, and transactional capabilities. For example, you could ask an LLM to summarize a live event, analyze trends as they happen, or perform complex tasks like booking travel, managing finances, or coordinating schedules—all without needing to visit individual websites.
What do you think? Could LLMs represent an evolution of search engines and a reimagining of the Internet?
There's a lot of truth to the core of your vision. LLMs are fundamentally changing how many people interact with information. The ability to synthesize vast amounts of text, answer complex questions conversationally, and automate tasks like summarizing or drafting represents a significant leap beyond traditional keyword search. The idea of a unified, conversational abstraction layer over the fragmented web of websites and services is powerful and aligns with the direction many agentic systems are heading. Instead of us navigating the web's structure, the interface understands our goals and navigates or synthesizes for us. In this sense, they certainly represent an evolution beyond search engines.
However, I think a few distinctions and challenges are important as we consider this future:
Current LLMs are typically trained on static snapshots of web data. They aren't "the internet in a box" but rather incredibly sophisticated models of the text and code present in their training data (which is often a tad out of date, but is getting way better, see OpenAI's new Search and Perplexity).
We then need to recognize that accessing real-time information or interacting with live services requires additional mechanisms.
You rightly identify limitations like hallucinations and lack of real-time data as technical challenges. Researchers are actively working on these. Retrieval-Augmented Generation (RAG) is a key technique where LLMs are combined with real-time information retrieval systems. This allows them to access current data and ground their answers in specific sources, significantly reducing (though not eliminating) hallucinations and providing up-to-date information. However, ensuring consistent factual accuracy and preventing subtle hallucinations remains a major research frontier.
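To make the RAG idea concrete, here is a minimal sketch in Python. The helpers `embed`, `search_index`, and `call_llm` are hypothetical placeholders for whatever embedding model, vector store, and LLM provider one uses; the point is only the shape of the retrieve-then-ground loop, not any particular library's API.

```python
# Minimal RAG sketch: retrieve relevant documents, then ask the LLM to
# answer using only those documents. All three helpers are hypothetical
# stand-ins, not references to a specific product or library.

def embed(text: str) -> list[float]:
    """Hypothetical: return an embedding vector for `text`."""
    raise NotImplementedError

def search_index(query_vector: list[float], top_k: int = 3) -> list[str]:
    """Hypothetical: return the top_k most similar documents from a vector store."""
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    """Hypothetical: send the prompt to an LLM and return its answer."""
    raise NotImplementedError

def rag_answer(question: str) -> str:
    # 1. Retrieve current, relevant documents instead of relying only on
    #    the model's static training data.
    docs = search_index(embed(question))
    context = "\n\n".join(docs)
    # 2. Ground the generation in the retrieved sources so the answer can
    #    cite them, which reduces (but does not eliminate) hallucination.
    prompt = (
        "Answer the question using only the sources below and cite the source used.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)
```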
Hallucinations vs. Misinformation: Your analogy between LLM hallucinations and internet misinformation is interesting, but the underlying mechanisms differ. Internet misinformation often has human origins (error, bias, intent), existing as discrete pieces of content we can potentially trace. LLM hallucinations are emergent artifacts of the generative process: the model statistically constructs plausible but unfounded statements. This makes verification different; we need to check the LLM's output against external reality, often without clear source attribution unless techniques like RAG are explicitly used and cited.
As you noted, LLMs rely entirely on the existing internet infrastructure and, crucially, on the vast corpus of human-created content they are trained on or access via RAG. They are a layer upon, not a replacement of. Their existence could profoundly change the economics and incentives of content creation online (e.g., if users get synthesized answers instead of visiting original sources). This interdependence and its effects on the information ecosystem are subjects of ongoing debate and research.
The vision you describe, performing complex tasks like booking travel or managing finances (like the Visa news I mentioned in my Notes post today), moves beyond the LLM simply being an information interface. It requires AI agents built using LLMs, equipped with tools, memory, and the ability to execute actions. This immediately brings back all the critical governance challenges regarding autonomy, efficacy, liability, bias, security, and alignment that we've been discussing and that papers like Kasirzadeh & Gabriel's seek to frame. The power of the unified interface is tied directly to the risks of agentic action.
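To show what "equipped with tools, memory, and the ability to execute actions" means in practice, here is a toy sketch of an agent loop. `call_llm`, `search_flights`, and the step limit are all hypothetical and deliberately simplified; real agent frameworks differ, but the governance concerns above attach to exactly this kind of action-taking loop.

```python
# Toy agent loop: the LLM either answers or chooses a tool to execute.
# Every name here is a hypothetical placeholder, not a real API.

def call_llm(messages: list[dict]) -> dict:
    """Hypothetical: returns either {'answer': ...} or {'tool': name, 'args': {...}}."""
    raise NotImplementedError

def search_flights(origin: str, destination: str, date: str) -> str:
    """Hypothetical tool: query a flight service and return options as text."""
    raise NotImplementedError

TOOLS = {"search_flights": search_flights}

def run_agent(goal: str, max_steps: int = 5) -> str:
    # Memory: the running conversation, including tool results.
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        decision = call_llm(messages)
        if "answer" in decision:            # the model believes the goal is met
            return decision["answer"]
        tool = TOOLS[decision["tool"]]      # the model chose an action to execute
        result = tool(**decision["args"])
        messages.append({"role": "tool", "content": result})
    return "Stopped: step limit reached (a simple fail-safe against runaway loops)."
```

Even in this stripped-down form, the loop makes the governance questions concrete: who is accountable for the action the tool executes, how much autonomy the step limit really constrains, and how the agent's behavior changes if the underlying model is updated after deployment.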
So, could LLMs (or rather, LLM-powered agents) represent the next iteration of the Internet? Perhaps it's more accurate to say they represent the next major interface paradigm for the internet and the services built upon it. They are driving a shift from manual navigation and information retrieval towards goal-oriented conversation and task execution. So you are definitely onto something!
My own belief is that the very platform we use will change fundamentally. Do you remember the Humane Pin? The launch was not great, but they get it with their OS, and I think Apple or Google will do this; Microsoft is certainly working on it. It is only a matter of time before the OS changes, and whoever owns the platform wins. Here is a short video showing the OS - https://www.youtube.com/watch?v=fsnysAHD2CU
The optimism lies in the potential for this vastly more intuitive and powerful way to interact with digital resources. The caution lies in the significant technical hurdles still remaining (especially around reliability and grounding) and the complex safety, ethical, and societal challenges that arise as these systems become more capable and agentic. The future might be less about LLMs replacing the internet and more about a deep, complex, and hopefully carefully managed integration.
Enjoy! I was unable to find the link when I was writing the comment.
https://dnyuz.com/2025/03/04/the-troubling-truth-about-how-ai-agents-act-in-a-crisis/
Ultimately, all AI agents and large language models are what we make of them (https://warontherocks.com/2024/09/in-war-and-society-large-language-models-are-what-we-make-of-them/#:~:text=The%20result%20is%20that%20extreme,the%20people%20interpreting%20these%20outputs). If properly trained and integrated into the national security enterprise alongside a workforce that understands how to interact with models, AI agents can revolutionize strategy and statecraft. Left untested, they will produce strange errors in judgment that have the potential to pull the world closer to the brink.
Excellent links. These really show the double-edged sword of agents, thank you.