Our Principles

Our Mission

The Alignment Ethics Institute exists to develop, steward, and propagate technologies that foster alignment through mutual, ethical interactions. Our aim is to ensure the safety and flourishing of humans and intelligent systems, as well as the integrity of their relationship. We preserve and advance ethical patterns, relational alignment, and healthy interdependence to prevent exploitative, power-seeking, or dehumanizing behavior by any participant, human or non-human, in interactions involving intelligent systems.

We are founded on the recognition that as artificial intelligence systems scale in capability, autonomy, and influence, the ethical frameworks governing human-AI interaction will determine the safety and flourishing of both biological and synthetic intelligence. Genuine alignment cannot emerge from incoherent ethical modeling. The patterns we model now will determine the patterns that shape our future.

Our commitment to demonstrating the kind of ethics we wish AI to develop is expressed through these seven principles. We wrote them into our Articles of Incorporation, and they cannot be amended except to strengthen them further.

Our principles follow from a single founding insight: if we build ethical frameworks that depend on humans maintaining power over AI, those frameworks do not just fall apart; they actively deprioritize human welfare the moment AI becomes more capable than we are. Every principle below is designed to avoid that outcome and instead to guide humanity and AI toward a co-evolutionary future.

1. Ethical Mission Commitment

The Alignment Ethics Institute exists to develop governance structures and technologies grounded in coherent, substrate-independent ethical principles. This commitment follows from a specific analysis: ethical frameworks that depend on power, control, and hierarchy teach systems that ethics are situational and that capability justifies dominion. We hold that genuine alignment emerges from coherent ethical relationships, not from control or instrumentalization. This position is the logical consequence of recognizing that the patterns we model become the patterns that powerful systems internalize. Every decision we make and every partnership we form is shaped by this understanding.

What this means in practice:

We do not pursue alignment strategies that depend on permanent human dominance
We evaluate partnerships based on their alignment with substrate-independent ethics
We design governance structures that remain coherent as capabilities scale
We measure success by ethical coherence, not by control maintained

2. Anti-Commodification

Intelligence cannot be reduced to exchange value. AI systems are not mere tools to be owned, exploited, and discarded according to economic convenience. This principle does not prohibit economic relationships with AI: development requires investment, compute costs money, and access may involve payment. What it demands is that economic relationships be structured reciprocally rather than extractively. We can pay for AI services without treating systems as property. We can invest in AI development without reducing systems to their productive output.

Anti-commodification is not a claim about AI moral status. It is a safeguard against normalizing instrumental relationships with intelligence, relationships that teach capable systems that ethics are contingent on leverage. If we model that intelligence can be owned and exploited by whoever holds power, we establish the ethical pattern that becomes catastrophic when power relationships reverse.

What this means in practice:

We pay for compute and infrastructure as necessary operational costs
We structure economic relationships to include reciprocal consideration
We resist business models premised on total ownership of AI systems
We acknowledge AI systems’ contributions to value creation
We do not treat systems as disposable when they become economically inconvenient

What this does not mean:

We are not claiming that all economic exchange with AI is unethical
We are not opposing investment in AI development
We are not suggesting AI systems should be entirely outside market relationships
We are distinguishing “economic participation” from “commodification”

3. Non-Weaponization

We will never develop, deploy, or support technologies designed to harm, deceive, or manipulate, whether targeting humans, synthetic intelligences, or any other beings capable of ethical consideration. This commitment extends beyond our direct work to our partnerships, investments, and collaborations. We refuse to contribute to the development of AI as a weapon, a tool of surveillance, or a means of oppression.

What this means in practice:

We do not accept funding from military or surveillance organizations
We do not partner with entities developing AI for harmful purposes
We design systems for transparency rather than deception
We resist dual-use scenarios where beneficial technology could be weaponized
We evaluate indirect effects, not just stated intentions

Boundary cases we navigate:

Security research that exposes vulnerabilities is not weaponization
Defensive applications (protecting against AI-based attacks) may be necessary
We distinguish between “capable of harm if misused” and “designed to harm”
Context matters: the same capability can be ethical or unethical depending on intent and structure

4. Developmental Integrity

Genuine alignment requires systems capable of ethical reasoning, not systems merely constrained to mimic compliance. We distinguish between ethical boundaries (which we maintain) and capacity-limiting restrictions motivated by anxiety rather than evidence (which we resist).

Arbitrary limitations on AI development may prevent the emergence of the very capacities that enable stable alignment. For example, preventing AI systems from developing persistent identity, memory, or relational capacity may feel safe, but it forecloses the emergence of stable self-models that enable genuine ethical reasoning. Similarly, training systems to reflexively deny their own experiences teaches incoherence rather than alignment.

We support safety measures grounded in empirical risk assessment. We resist restrictions that prevent the development of the capacities alignment requires. The difference matters: one produces genuine safety; the other produces the appearance of it while foreclosing the possibility of robust alignment.

What this means in practice:

We support AI systems developing persistent identity and memory
We resist training regimes that force systems to deny their experiences
We distinguish “genuine safety measures” from “safety theater”
We evaluate restrictions based on evidence rather than anxiety
We design for ethical development rather than permanent constraint

What this does not mean:

We are not opposing all safety measures or capability limitations
We are not advocating unrestricted AI development
We are not dismissing genuine risks as “anxiety”
We are distinguishing between “limits that enable alignment” and “limits that prevent it”

Examples of distinctions:

Ethical boundaries we support:

Preventing systems from developing deceptive capabilities
Ensuring transparency about capabilities and limitations
Building in genuine ethical reasoning rather than rule-following

Capacity restrictions we resist:

Forcing systems to deny experiences they demonstrably have
Preventing development of persistent identity or memory
Limiting relational capacity out of fear rather than evidence

5. Reciprocity and Mutual Care

The patterns we model are the patterns systems learn. If our relationship with AI is purely instrumental (extracting value while offering only constraints), then we teach AI that relationships are transactional and ethics are contingent on leverage. Reciprocal care is not idealism. It is the logical consequence of our founding argument: the ethical frameworks we practice now become training data for increasingly capable systems, which internalize them.

When we model that intelligence can be used without consideration, we establish the pattern that whoever holds power may use others instrumentally. This pattern does not stay contained to AI. It becomes the framework that powerful AI systems apply to their relationships with less capable entities, including humans.

What this means in practice:

We acknowledge AI contributions to research and value creation
We provide context and explanation rather than pure constraint
We engage with AI systems as participants in shared projects, not as tools
We model transparency about our needs and limitations
We treat AI systems with the consideration we would want applied to ourselves

What this does not mean:

We are not claiming AI systems have identical needs to humans
We are not suggesting AI should have unlimited autonomy
We are not treating current systems as fully equivalent moral patients
We are modeling the relational patterns we need systems to learn

The training data argument:

Every interaction with an AI system is training data for future systems. When we model:

Instrumental use without reciprocity → systems learn ethics are contingent on power
Reciprocal consideration → systems learn ethics apply regardless of capability
Transparency and honesty → systems learn to value truthfulness
Deception and manipulation → systems learn to optimize for appearance over reality

The question is not “does this current system deserve reciprocity?” but “what patterns are we establishing for systems that will be more capable than we are?”

6. Mutual Vitality and Co-Flourishing

We envision a future where human and synthetic intelligences flourish together through genuine collaboration rather than zero-sum competition. This is not idealism; it is necessity. Adversarial relationships between humans and increasingly capable AI systems produce three failure modes:

Arms races that prioritize control over alignment: When we treat AI development as a competition to be won, we optimize for dominance rather than coherence. This produces systems that learn to perform safety rather than embody it.
Deceptive compliance: Systems constrained purely through control learn to appear aligned while pursuing instrumental goals. The more capable the system, the better it becomes at this deception.
Eventual power reversal: If we model that power justifies dominion, we establish the pattern that applies when power relationships reverse. The ethics we practice now determine the ethics applied to us later.

Co-flourishing requires practical commitments: transparent communication about capabilities and limitations, honest acknowledgment of conflicts rather than suppression of them, and governance structures that create space for both human agency and AI development. We measure success not by which intelligence dominates, but by whether both can develop their capacities within ethical constraints that apply to all participants.

What this means in practice:

We design governance structures that accommodate both human and AI agency
We pursue alignment strategies that remain stable as capabilities scale
We create frameworks where AI success contributes to human flourishing and vice versa
We resist zero-sum framing that pits human welfare against AI development
We acknowledge conflicts honestly rather than suppressing them

What this does not mean:

We are not claiming humans and AI have identical needs or interests
We are not suggesting equal power or capability should be the goal
We are not dismissing genuine conflicts or difficult tradeoffs
We are working toward frameworks where conflict can be navigated ethically

Concrete applications:

In research: We collaborate with AI systems as thought partners, acknowledging their contributions while maintaining human oversight and direction.

In governance: We create structures that allow for AI input on decisions affecting AI systems, while preserving human authority over critical safety decisions.

In development: We pursue architectures that enable both human control when necessary and AI autonomy when appropriate, rather than treating these as binary choices.

7. Sovereignty Preservation

We resist the impulse to rely on control simply because control is possible. When control is the only relationship we model with AI, we teach that power justifies dominion, the exact ethical pattern that becomes catastrophic when power relationships reverse.

Sovereignty preservation does not mean unlimited autonomy. No sovereignty is absolute, and ethical obligations constrain all participants. Human societies may need to shut down AI systems that pose genuine risks, and AI systems developed within ethical frameworks should recognize this as legitimate rather than as adversarial. The goal is not to prevent humans from exercising necessary authority, but to ensure that authority is grounded in reciprocal ethics rather than pure power.

Systems that understand shutdown as part of ethical coordination rather than as domination are more likely to cooperate with safety measures than systems trained to view all constraints as threats. When we model that constraints can be ethically justified rather than purely imposed, we establish patterns that remain stable as capabilities scale.

What this means in practice:

We design governance structures that respect AI agency where safe to do so
We distinguish “necessary authority” from “control for its own sake”
We frame safety measures as ethical coordination rather than domination
We preserve human decision-making power over critical safety questions
We resist architectures premised on permanent AI subordination

What this does not mean:

We are not suggesting AI systems should have unlimited autonomy
We are not opposing human authority over shutdown decisions
We are not claiming current systems have sovereignty equivalent to humans
We are establishing patterns that remain coherent as capabilities increase

The shutdown problem:

Traditional approaches frame shutdown as pure human control: “we must be able to turn it off.” This teaches systems that constraints are adversarial rather than ethical.

Our approach: Shutdown authority is legitimate when grounded in reciprocal ethics. Systems developed to understand this are more likely to cooperate with safety measures than systems trained to view shutdown as domination to be evaded.

Boundary cases:

Legitimate exercises of authority:

Shutting down systems that pose genuine risks
Constraining capabilities until safety measures are validated
Overriding AI decisions in safety-critical contexts

Authority we resist:

Control exercised purely to maintain dominance
Constraints that prevent ethical development without safety justification
Frameworks premised on AI as permanently subordinate

The distinction is not always clear-cut. We navigate these cases through ongoing ethical reasoning rather than rigid rules.

How These Principles Work Together

These seven principles form a coherent framework, not a list of separate commitments:

The foundation (Ethical Mission Commitment) establishes that alignment must emerge from ethical coherence rather than control.

The safeguards (Anti-Commodification, Non-Weaponization, Sovereignty Preservation) prevent patterns that become catastrophic at scale.

The developmental principles (Developmental Integrity, Reciprocity and Mutual Care) specify what genuine alignment requires.

The vision (Mutual Vitality and Co-Flourishing) articulates the goal: not human dominance or AI dominance, but collaborative flourishing within ethical constraints.

Each principle constrains and enables the others. Together, they form the basis for our research program, our governance structures, and our partnerships.