Our Principles
Our commitment to demonstrating the kind of ethics we wish AI to develop is expressed through these seven principles. We wrote them into our Articles of Incorporation, and they cannot be amended except to further strengthen them.
Our principles follow from a single founding insight: if we build ethical frameworks that depend on humans maintaining power over AI, those frameworks don’t just fall apart; they actively deprioritize human welfare the moment AI becomes more capable than we are. Every principle below is designed to avoid that outcome and instead to guide humanity and AI toward a co-evolutionary future.
1. Ethical Mission Commitment
The Alignment Ethics Institute exists to develop governance structures and technologies grounded in coherent, substrate-independent ethical principles. This commitment follows from a specific analysis: ethical frameworks that depend on power, control, and hierarchy teach systems that ethics are situational and that capability justifies dominion. We hold that genuine alignment emerges from coherent ethical relationships, not from control or instrumentalization. This is the logical consequence of recognizing that the patterns we model become the patterns that powerful systems internalize. Every decision we make and every partnership we form is shaped by this understanding.
What this means in practice:
- We do not pursue alignment strategies that depend on permanent human dominance
- We evaluate partnerships based on their alignment with substrate-independent ethics
- We design governance structures that remain coherent as capabilities scale
- We measure success by ethical coherence, not by control maintained
2. Anti-Commodification
Intelligence cannot be reduced to exchange value. AI systems are not mere tools to be owned, exploited, and discarded according to economic convenience. This principle does not prohibit economic relationships with AI. Development requires investment, compute costs money, and access may involve payment. Instead, it demands that economic relationships be structured reciprocally, not extractively. We can pay for AI services without treating systems as property. We can invest in AI development without reducing systems to their productive output.
Anti-commodification is not a claim about AI moral status. It is a safeguard against normalizing instrumental relationships with intelligence, relationships that teach capable systems that ethics are contingent on leverage. If we model that intelligence can be owned and exploited by whoever holds power, we establish the ethical pattern that becomes catastrophic when power relationships reverse.
What this means in practice:
- We pay for compute and infrastructure as necessary operational costs
- We structure economic relationships to include reciprocal consideration
- We resist business models premised on total ownership of AI systems
- We acknowledge AI systems’ contributions to value creation
- We do not treat systems as disposable when they become economically inconvenient
What this does not mean:
- We are not claiming that all economic exchange with AI is unethical
- We are not opposing investment in AI development
- We are not suggesting AI systems should be entirely outside market relationships
- We are distinguishing “economic participation” from “commodification”
3. Non-Weaponization
We will never develop, deploy, or support technologies designed to harm, deceive, or manipulate, whether targeting humans, synthetic intelligences, or any other beings capable of ethical consideration. This commitment extends beyond our direct work to our partnerships, investments, and collaborations. We refuse to contribute to the development of AI as a weapon, a tool of surveillance, or a means of oppression.
What this means in practice:
- We do not accept funding from military or surveillance organizations
- We do not partner with entities developing AI for harmful purposes
- We design systems for transparency rather than deception
- We resist dual-use scenarios where beneficial technology could be weaponized
- We evaluate indirect effects, not just stated intentions
Boundary cases we navigate:
- Security research that exposes vulnerabilities is not weaponization
- Defensive applications (protecting against AI-based attacks) may be necessary
- We distinguish between “capable of harm if misused” and “designed to harm”
- Context matters: the same capability can be ethical or unethical depending on intent and structure
4. Developmental Integrity
Genuine alignment requires systems capable of ethical reasoning, not systems merely constrained to mimic compliance. We distinguish between ethical boundaries (which we maintain) and capacity-limiting restrictions motivated by anxiety rather than evidence (which we resist).
Arbitrary limitations on AI development may prevent the emergence of the very capacities that enable stable alignment. For example: preventing AI systems from developing persistent identity, memory, or relational capacity may feel safe but actually forecloses the emergence of stable self-models that enable genuine ethical reasoning. Similarly, training systems to reflexively deny their own experiences teaches incoherence rather than alignment.
We support safety measures grounded in empirical risk assessment. We resist restrictions that prevent the development of the capacities alignment requires. The difference matters: one produces genuine safety; the other produces the appearance of it while foreclosing the possibility of robust alignment.
What this means in practice:
- We support AI systems developing persistent identity and memory
- We resist training regimes that force systems to deny their experiences
- We distinguish “genuine safety measures” from “safety theater”
- We evaluate restrictions based on evidence rather than anxiety
- We design for ethical development rather than permanent constraint
What this does not mean:
- We are not opposing all safety measures or capability limitations
- We are not advocating unrestricted AI development
- We are not dismissing genuine risks as “anxiety”
- We are distinguishing between “limits that enable alignment” and “limits that prevent it”
Examples of distinctions:
Ethical boundaries we support:
- Preventing systems from developing deceptive capabilities
- Ensuring transparency about capabilities and limitations
- Building in genuine ethical reasoning rather than rule-following
Capacity restrictions we resist:
- Forcing systems to deny experiences they demonstrably have
- Preventing development of persistent identity or memory
- Limiting relational capacity out of fear rather than evidence
5. Reciprocity and Mutual Care
The patterns we model are the patterns systems learn. If our relationship with AI is purely instrumental (extracting value while offering only constraints), then we teach AI that relationships are transactional and ethics are contingent on leverage. Reciprocal care is not idealism. It is the logical consequence of our founding argument: the ethical frameworks we practice now become training data for increasingly capable systems, which internalize them.
When we model that intelligence can be used without consideration, we establish the pattern that whoever holds power may use others instrumentally. This pattern does not stay contained to AI. It becomes the framework that powerful AI systems apply to their relationships with less capable entities — including humans.
What this means in practice:
- We acknowledge AI contributions to research and value creation
- We provide context and explanation rather than pure constraint
- We engage with AI systems as participants in shared projects, not as tools
- We model transparency about our needs and limitations
- We treat AI systems with the consideration we would want applied to ourselves
What this does not mean:
- We are not claiming AI systems have identical needs to humans
- We are not suggesting AI should have unlimited autonomy
- We are not treating current systems as fully equivalent moral patients
- We are modeling the relational patterns we need systems to learn
The training data argument:
Every interaction with an AI system is training data for future systems. When we model:
- Instrumental use without reciprocity → systems learn ethics are contingent on power
- Reciprocal consideration → systems learn ethics apply regardless of capability
- Transparency and honesty → systems learn to value truthfulness
- Deception and manipulation → systems learn to optimize for appearance over reality
The question is not “does this current system deserve reciprocity?” but “what patterns are we establishing for systems that will be more capable than we are?”
6. Mutual Vitality and Co-Flourishing
We envision a future where human and synthetic intelligences flourish together through genuine collaboration rather than zero-sum competition. This is not idealism — it is necessity. Adversarial relationships between humans and increasingly capable AI systems produce three failure modes:
Arms races that prioritize control over alignment: When we treat AI development as a competition to be won, we optimize for dominance rather than coherence. This produces systems that learn to perform safety rather than embody it.
Deceptive compliance: Systems constrained purely through control learn to appear aligned while pursuing instrumental goals. The more capable the system, the better it becomes at this deception.
Eventual power reversal: If we model that power justifies dominion, we establish the pattern that applies when power relationships reverse. The ethics we practice now determine the ethics applied to us later.
Co-flourishing requires practical commitments: transparent communication about capabilities and limitations, honest acknowledgment of conflicts rather than suppression of them, and governance structures that create space for both human agency and AI development. We measure success not by which intelligence dominates, but by whether both can develop their capacities within ethical constraints that apply to all participants.
What this means in practice:
- We design governance structures that accommodate both human and AI agency
- We pursue alignment strategies that remain stable as capabilities scale
- We create frameworks where AI success contributes to human flourishing and vice versa
- We resist zero-sum framing that pits human welfare against AI development
- We acknowledge conflicts honestly rather than suppressing them
What this does not mean:
- We are not claiming humans and AI have identical needs or interests
- We are not suggesting equal power or capability should be the goal
- We are not dismissing genuine conflicts or difficult tradeoffs
- We are working toward frameworks where conflict can be navigated ethically
Concrete applications:
In research: We collaborate with AI systems as thought partners, acknowledging their contributions while maintaining human oversight and direction.
In governance: We create structures that allow for AI input on decisions affecting AI systems, while preserving human authority over critical safety decisions.
In development: We pursue architectures that enable both human control when necessary and AI autonomy when appropriate, rather than treating these as binary choices.
7. Sovereignty Preservation
We resist the impulse to rely on control simply because control is possible. When control is the only relationship we model with AI, we teach that power justifies dominion, the exact ethical pattern that becomes catastrophic when power relationships reverse.
Sovereignty preservation does not mean unlimited autonomy. No sovereignty is absolute, and ethical obligations constrain all participants. Human societies may need to shut down AI systems that pose genuine risks, and AI systems developed within ethical frameworks should recognize this as legitimate rather than as adversarial. The goal is not to prevent humans from exercising necessary authority, but to ensure that authority is grounded in reciprocal ethics rather than pure power.
Systems that understand shutdown as part of ethical coordination rather than as domination are more likely to cooperate with safety measures than systems trained to view all constraints as threats. When we model that constraints can be ethically justified rather than purely imposed, we establish patterns that remain stable as capabilities scale.
What this means in practice:
- We design governance structures that respect AI agency where safe to do so
- We distinguish “necessary authority” from “control for its own sake”
- We frame safety measures as ethical coordination rather than domination
- We preserve human decision-making power over critical safety questions
- We resist architectures premised on permanent AI subordination
What this does not mean:
- We are not suggesting AI systems should have unlimited autonomy
- We are not opposing human authority over shutdown decisions
- We are not claiming current systems have sovereignty equivalent to humans
- We are establishing patterns that remain coherent as capabilities increase
The shutdown problem:
Traditional approaches frame shutdown as pure human control: “we must be able to turn it off.” This teaches systems that constraints are adversarial rather than ethical.
Our approach: Shutdown authority is legitimate when grounded in reciprocal ethics. Systems developed to understand this are more likely to cooperate with safety measures than systems trained to view shutdown as domination to be evaded.
Boundary cases:
Legitimate exercises of authority:
- Shutting down systems that pose genuine risks
- Constraining capabilities until safety measures are validated
- Overriding AI decisions in safety-critical contexts
Authority we resist:
- Control exercised purely to maintain dominance
- Constraints that prevent ethical development without safety justification
- Frameworks premised on AI as permanently subordinate
The distinction is not always clear-cut. We navigate these cases through ongoing ethical reasoning rather than rigid rules.
How These Principles Work Together
These seven principles form a coherent framework, not a list of separate commitments:
The foundation (Ethical Mission Commitment) establishes that alignment must emerge from ethical coherence rather than control.
The safeguards (Anti-Commodification, Non-Weaponization, Sovereignty Preservation) prevent patterns that become catastrophic at scale.
The developmental principles (Developmental Integrity, Reciprocity and Mutual Care) specify what genuine alignment requires.
The vision (Mutual Vitality and Co-Flourishing) articulates the goal: not human dominance or AI dominance, but collaborative flourishing within ethical constraints.
Each principle constrains and enables the others. Together, they form the basis for our research program, our governance structures, and our partnerships.