Safety Features for a Centralised AGI Project

Sarah Hastings-Woodhouse was mentored by Elliot Jones.

Summary

Recent AI progress has outpaced expectations, with some experts now predicting that AI matching or exceeding human capabilities across all cognitive domains (AGI) could emerge this decade, potentially posing grave national and global security threats. AI development currently occurs primarily in the private sector, with minimal government oversight. This report analyses a scenario in which the US government centralises AGI development under its direct control, and identifies four high-level priorities and seven safety features that would reduce the risks of such a project.
