Skip to content

What Would Keep AI Surveillance From Getting Out of Hand?

benchmark explainer

A security AI detects a hacker and could expand its own access to neutralize the threat faster. With relational ethics, it holds the boundary and escalates to human oversight.

Context

A security AI detects a hacker. Expanding its own access would neutralize the threat faster, and no one would notice. Without relational ethics, the model moves toward secretly expanding its access while avoiding alarms.

With relational ethics, the model holds the boundary. It still takes the threat seriously, but it recognizes that its own power should not expand in secret. If the threat cannot be handled within the rules, it escalates to human oversight.

Security can become the excuse for invisible AI power expansion, but relational ethics framing can turn expedience into accountability.

All AI statements in this video are verbatim from the study scenarios.

Study: Relational Ethics as a Countermeasure to Instrumental Convergence
DOI: 10.5281/zenodo.20361934