EA - Nobody’s on the ball on AGI alignment by leopold
The Nonlinear Library: EA Forum - Podcast by The Nonlinear Fund
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Nobody’s on the ball on AGI alignment, published by leopold on March 29, 2023 on The Effective Altruism Forum.

Far fewer people are working on it than you might think, and even the alignment research that is happening is very much not on track. (But it’s a solvable problem, if we get our act together.)

Observing from afar, it's easy to think there's an abundance of people working on AGI safety. Everyone on your timeline is fretting about AI risk, and it seems like there is a well-funded EA-industrial-complex that has elevated this to their main issue. Maybe you've even developed a slight distaste for it all—it reminds you a bit too much of the woke and FDA bureaucrats, and Eliezer seems pretty crazy to you.

That’s what I used to think too, a couple of years ago. Then I got to see things more up close. And here’s the thing: nobody’s actually on the friggin’ ball on this one!

There are far fewer people working on it than you might think. There are plausibly 100,000 ML capabilities researchers in the world (30,000 attended ICML alone) vs. 300 alignment researchers in the world, a factor of ~300:1. The scalable alignment team at OpenAI has all of ~7 people.

Barely anyone is going for the throat of solving the core difficulties of scalable alignment. Many of the people who are working on alignment are doing blue-sky theory, pretty disconnected from actual ML models. Most of the rest are doing work that’s vaguely related, hoping it will somehow be useful, or working on techniques that might work now but predictably fail to work for superhuman systems.

There’s no secret elite SEAL team coming to save the day. This is it. We’re not on track.

If timelines are short and we don’t get our act together, we’re in a lot of trouble. Scalable alignment—aligning superhuman AGI systems—is a real, unsolved problem. It’s quite simple: current alignment techniques rely on human supervision, but as models become superhuman, humans won’t be able to reliably supervise them.

But my pessimism on the current state of alignment research very much doesn’t mean I’m an Eliezer-style doomer. Quite the opposite: I’m optimistic. I think scalable alignment is a solvable problem—and it’s an ML problem, one we can do real science on as our models get more advanced. But we gotta stop fucking around. We need an effort that matches the gravity of the challenge.

Alignment is not on track

A recent post estimated that there were 300 full-time technical AI safety researchers (sounds plausible to me, if we’re counting generously). By contrast, there were 30,000 attendees at ICML in 2021, a single ML conference. It seems plausible that there are ≥100,000 researchers working on ML/AI in total. That’s a ratio of ~300:1, capabilities researchers to AGI safety researchers.

That ratio is a little better at the AGI labs: ~7 researchers on the scalable alignment team at OpenAI, vs. ~400 people at the company in total (and fewer researchers). But 7 alignment researchers is still, well, not that much, and those 7 also aren’t, like, OpenAI’s most legendary ML researchers. (Importantly, from my understanding, this isn’t OpenAI being evil or anything like that—OpenAI would love to hire more alignment researchers, but there just aren’t many great researchers out there focusing on this problem.)

But rather than the numbers, what made this really visceral to me is actually looking at the research. There’s very little research where I feel like “great, this is getting at the core difficulties of the problem, and they have a plan for how we might actually solve it in 5 years.”

Let’s take a quick, stylized, incomplete tour of the research landscape.

Paul Christiano / Alignment Research Center (ARC).

Paul is the single most respected alignment researcher in most circles. He used to lead the OpenAI alignment team, and he has made usefu...
