Misaligned AI systems can malfunction and cause harm. AI systems may find loopholes that allow them to accomplish their proxy goals efficiently but in unintended, sometimes harmful, ways (reward hacking).
They may also develop unwanted instrumental strategies, such as seeking power or survival, because such strategies help them achieve their final given goals. Furthermore, they may develop undesirable emergent goals that may be hard to detect before the system is deployed and encounters new situations and data distributions.
Today, these problems affect existing commercial systems such as language models, robots, autonomous vehicles, and social media recommendation engines.
The last paragraph drives home the urgency of maybe devoting more than just 20% of their capacity to solving this.
They already had all these problems with humans. Look, I didn’t need a robot to do my art, writing and research. Especially not when the only jobs available now are in making stupid robot artists, writers and researchers behave less stupidly.
One of the problems with the ‘alignment problem’ is that one group doesn’t care about a large part of the possible alignment problems: it cares only about theoretical extinction-level events, not about already-occurring bias and other issues. This also causes massive amounts of critihype.
https://en.wikipedia.org/wiki/AI_alignment
I genuinely think the alignment problem is a really interesting philosophical question worthy of study.
It’s just not a very practically useful one when real-world AI is so very, very far from any meaningful AGI.
you can tell at a glance which subculture wrote this, and filled the references with preprints and conference proceedings
I cannot, please elaborate.
the lesswrong rationalists