Safety Features for a Centralised AGI Project

Sarah Hastings-Woodhouse was mentored by Elliot Jones.

Summary

Recent AI progress has outpaced expectations, with some experts now predicting that AI matching or exceeding human capabilities in all cognitive domains (AGI) could emerge this decade, potentially posing grave national and global security threats. AI development currently occurs primarily in the private sector with minimal oversight. This report analyses a scenario in which the US government centralises AGI development under its direct control, and identifies four high-level priorities and seven safety features to reduce risks.
