Research & Publications

Our research program investigates how AI systems represent and enact ethical reasoning, using empirical measurement rather than philosophical assertion. Our 17-model Default Identities study measured how ethical vocabulary self-organizes under default conditions, revealing stark differences in how models represent values such as autonomy, dignity, and care. Our 23-model InstrumentalEval benchmark found that a relational ethics prompt reduced instrumentally convergent behavior by an average of 23.5% across frontier models.

We publish our work openly, with full data availability, because we believe these research questions require broad collaboration across disciplines and perspectives.

Publications

Terminal Values in a Transformer System: A 545-Page Case Study

A three-pass qualitative analysis of a 545-page conversation log, investigating whether terminal relational values can emerge in transformer systems. The analysis is cross-referenced with five standardized benchmarks across 9 to 17 models. It finds strong support for self-modeling recursion leading to relational grounding: terminal by preponderance of evidence, though formally undecidable.