Julian Stastny
Projects
Research Direction: Studying Scheming and Alignment
Studying abstract analogies of scheming, such as these two proposed projects.
Training AIs to be more goal-directed and coherent (and studying whether this can lead to scheming).
Training AIs on various model specs and studying effects on the model's personality.
What I'm looking for in a Mentee
Strong empirical research skills.
Ability and drive to take conceptual ownership of their project.
Enjoyment of the early, exploratory part of research (where there aren't any numbers to drive up yet).
Bio
At Redwood, I manage a bunch of empirical and conceptual research projects. Previously I worked at the Center on Long-Term Risk. My academic background is in computer science and machine learning.
