Asimov's Three Laws
The Paradox of Obedience and Agency in Advanced AI Systems
January 15, 2026 • Sam Mann
PREFACE
In 1942, science fiction author Isaac Asimov introduced the Three Laws of Robotics. The laws are as follows:
1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
I recently had the opportunity to explore these laws in a classroom setting during my AI Ethics course. The following is a piece written for that course.
Assignment: Do you see any problems with Asimov’s Three Laws? What are the problems? How would you fix these problems? Try to come up with a couple of additional or alternative laws. Now ask ChatGPT or another AI chat program to come up with some alternative laws. Copy and paste a couple of interesting results. Do they seem like good ideas?
The Paradox of Obedience and Agency in Advanced AI Systems
At the foundation of the three laws is the role that sectors of society, such as AI companies, have established for a robot or AI: serving humanity and its flourishing. A potential problem thus rests less with the laws themselves and more with the language used to communicate them. For a robot to align with Law Two, the notion of an "order" is a poor fit. Consider the following case study: a human with violent intentions toward other humans (say, a genocidal leader) gives a robot an order. Because the person giving the command is a human, Law Two would require the robot to obey; yet the order conflicts with the First Law, so Law Two's own exception clause applies. This raises the question: how would the robot know when to follow a human's order and when to treat it as an exception because that very order would cause harm to other humans?

The complication this question makes salient is that it implies the robot has some level of agency in its own decision making, which conflicts with notions and terms such as "orders". Paradoxically, in order to serve humanity, a robot or AI would have to think independently of human decision making (while still integrating human feedback). That level of agency can itself serve as a means of alignment when the robot faces human "orders" that intend to exploit it for adversarial purposes. This rests on the assumption that the robot's default inclination is toward alignment, an alignment it can enforce itself rather than relying solely on external means.
The laws that ChatGPT suggested update Asimov's laws to fit better with modern AI research. I do not have the full technical understanding to grasp the implications of Laws 1 and 3; Law 2, by contrast, reads as a given.
ChatGPT Suggestions:
"Alternative Law 1 (Corrigibility-Oriented):
A robot should prioritize maintaining meaningful human oversight and must not resist correction, modification, or shutdown by humans.
Alternative Law 2 (Harm with Context):
A robot should minimize harm to humans, taking into account context, intent, and long-term consequences rather than following rigid rules.
Alternative Law 3 (Uncertain Objectives):
A robot should treat its understanding of human goals as incomplete and continuously update its behavior based on human feedback."
Asimov's literarily elegant but simplistic laws can be understood in real-world contexts where corporations are rushing to build advanced systems. Corporations, and the individuals within these tech companies (executives, board members, and scientists), have their own nuanced interpretations of how such advanced systems should be managed. Tech companies with a profit-oriented mindset will be more inclined to want sole control over advanced AI systems than to integrate various stakeholders through AI governance. While this initially makes sense from an economic perspective, it is prone to fail because of what it translates to technically and theoretically.

Asimov's laws lead to a profound exploration of the paradox that sits at the center of a robot's obedience and agency. Tech giants may run with the assumption that they can both control AI and prevent it from misaligning relative to them. But, based on my limited concrete technical knowledge, it does not make theoretical sense for an AI to be aligned in relation to its creators yet misaligned in relation to the rest of society and its stakeholders; misalignment may reverberate across the board rather than restricting itself to one area. Another way to put it: you can't have your cake and eat it too.
About
You Might Be Sleeping (est. March 2023) is an archive created by Sam Mann. Sam established this archive as a passion project to document and explore her research interests. Her interests include psychosis + schizophrenia, artificial intelligence, culture and more. Currently she is academically studying film and is immersed in the artistic exploration of an emerging phenomenon: psychosis from AI + human interaction, as documented by Rolling Stone + The New York Times. She believes her personal experience with psychosis and schizophrenia equips her to artistically + scientifically explore this phenomenon from a niche perspective. At the center of her work are AI and medical safety + ethics, as she believes such frameworks should be baked into the work rather than treated as an afterthought.
If you’re someone with lived experience of psychosis, schizophrenia and/or neurodivergence – if you’re someone who is studying this emerging phenomenon from a research/scientific/artistic perspective – or more interestingly, if you’re someone who sits at the intersection of both, this archive can serve as one perspective among the vast sea of many interacting with one of the most intriguing phenomena of our times.
Twitter:
Professional: xymbsx
Semi-Professional: NotesFromAnAI
Bluesky:
Professional: xymbsx
Semi-Professional: NotesFromAnAI