Lugh@futurology.today (M) to Futurology@futurology.today · English · 9 months ago

Two-faced AI language models learn to hide deception - ‘Sleeper agents’ seem benign during testing but behave differently once deployed. And methods to stop them aren’t working.


sbv@sh.itjust.works · English · 9 months ago
So they’re saying AI is software?

Maybe Volkswagen will start using it in their emissions control systems.