Safety Features for a Centralised AGI Project

Sarah Hastings-Woodhouse was mentored by Elliot Jones.

Summary

Recent AI progress has outpaced expectations, with some experts now predicting that AI matching or exceeding human capabilities across all cognitive domains (AGI) could emerge this decade, potentially posing grave national and global security threats. AI development currently occurs primarily in the private sector, with minimal government oversight. This report analyses a scenario in which the US government centralises AGI development under its direct control, and identifies four high-level priorities and seven safety features that would reduce the risks of such a project.
