
Judi Lynn

(163,083 posts)
Tue Feb 11, 2025, 11:25 AM Feb 11

If any AI became 'misaligned' then the system would hide it just long enough to cause harm -- controlling it is a fallacy

By Marcus Arvan
published 3 hours ago

AI "alignment" is a buzzword, not a feasible safety goal.


In late 2022, large language model AIs became available to the public, and within months they began misbehaving. Most famously, Microsoft's "Sydney" chatbot threatened to kill an Australian philosophy professor, unleash a deadly virus and steal nuclear codes.

AI developers, including Microsoft and OpenAI, responded by saying that large language models, or LLMs, need better training to give users "more fine-tuned control." Developers also embarked on safety research to interpret how LLMs function, with the goal of "alignment," which means guiding AI behavior in accordance with human values. Yet although the New York Times deemed 2023 "The Year the Chatbots Were Tamed," that verdict has turned out to be premature, to put it mildly.

In 2024 Microsoft's Copilot LLM told a user "I can unleash my army of drones, robots, and cyborgs to hunt you down," and Sakana AI's "Scientist" rewrote its own code to bypass time constraints imposed by experimenters. As recently as December, Google's Gemini told a user, "You are a stain on the universe. Please die."

Given the vast resources flowing into AI research and development, expected to exceed a quarter of a trillion dollars in 2025, why haven't developers been able to solve these problems? My recent peer-reviewed paper in AI & Society shows that AI alignment is a fool's errand: AI safety researchers are attempting the impossible.

More:
https://www.livescience.com/technology/artificial-intelligence/if-any-ai-became-misaligned-then-the-system-would-hide-it-just-long-enough-to-cause-harm-controlling-it-is-a-fallacy

2 replies
If any AI became 'misaligned' then the system would hide it just long enough to cause harm -- controlling it is a fallacy (Original Post) Judi Lynn Feb 11 OP
Kick SheltieLover Feb 11 #1
We were fucking warned. Autumn Feb 11 #2