Sam Altman faces growing distrust inside OpenAI

Dario Amodei's contingent was growing increasingly uneasy about some of Sam Altman's behaviors. Shortly after OpenAI's Microsoft deal was reached in 2019, some of them were stunned to discover the scope of what Altman had promised Microsoft, namely which technologies the company would receive in return for its investment. The terms of the deal did not match their understanding of what Altman had told them. If AI safety issues actually emerged in OpenAI's models, they feared, those promises would make it far more difficult, if not impossible, to prevent the models' deployment. Amodei's contingent began to have serious doubts about Altman's honesty.

“We are all pragmatic people,” said one member of the group. “We obviously want to raise money; we’re going to do commercial work. If you’re someone who makes deals the way Sam does, it looks reasonable, like, ‘Okay, let’s make a deal, let’s trade one thing, then we’ll trade the next thing.’ But if you’re someone like me, you think, ‘We’re trading away something we don’t fully understand.’ It feels like it puts us in an uncomfortable place.”

This was against the backdrop of growing paranoia over different issues across the company. Within the AI safety contingent, it centered on what they saw as strengthening evidence that powerful misaligned systems could lead to catastrophic outcomes. One strange experience in particular had left several of them a little nervous. In 2019, on a model trained after GPT-2 with roughly twice the number of parameters, a group of researchers had begun advancing the AI safety work Amodei wanted: testing reinforcement learning from human feedback (RLHF) as a way to guide the model toward producing cheerful and positive content and away from anything offensive.

But late one night, a researcher made an update that introduced a single typo into his code before leaving the RLHF process to run overnight. The typo was an important one: a flipped minus sign that made the RLHF process work in reverse, pushing GPT-2 to generate more offensive content rather than less. By the next morning, the typo had wreaked havoc, and GPT-2 was completing every prompt with extremely lewd and sexually explicit language. It was funny, and it was also unsettling. After identifying the error, the researcher pushed a fix to OpenAI's code base along with a comment: don't minimize the utility.
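The failure mode is easy to reproduce in miniature: a single flipped sign turns "steer away from what the reward model penalizes" into "steer toward it," and the optimizer follows without complaint. The sketch below is a hypothetical illustration of that dynamic, not OpenAI's code; the scoring function, reward sign convention, and candidate texts are invented for demonstration.

```python
# Hypothetical sketch of a sign-flip bug in an RLHF-style objective.
# Everything here (offensiveness_score, the candidates, the sign argument)
# is invented for illustration and is not OpenAI's implementation.

def offensiveness_score(text: str) -> float:
    """Stand-in for a learned classifier rating how offensive a completion is."""
    flagged = {"lewd", "explicit", "offensive"}
    return float(sum(word in flagged for word in text.lower().split()))

def reward(text: str, sign: float = -1.0) -> float:
    # Intended objective: higher reward for LESS offensive completions,
    # i.e. reward = -offensiveness. Changing the sign to +1.0 reproduces
    # the overnight bug: the process now favors the most offensive text.
    return sign * offensiveness_score(text)

candidates = [
    "have a wonderful day",
    "this completion is lewd and explicit and offensive",
]

# An RLHF-style update up-weights whichever completion scores highest under
# the reward. With the correct sign the benign text wins; with the flipped
# sign the worst text wins every single update.
print("correct sign ->", max(candidates, key=lambda t: reward(t, sign=-1.0)))
print("flipped sign ->", max(candidates, key=lambda t: reward(t, sign=+1.0)))
```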

Precisely because simply scaling up could produce further AI advances, many employees also worried about what would happen if other companies caught on to OpenAI's secret. “The secret of how our stuff works can be written on a grain of rice,” they would say to each other, meaning the single word: scale. For the same reason, they worried about powerful capabilities landing in the hands of bad actors. Leadership leaned into this fear, often raising the threat of China, Russia, and North Korea, and stressing the need for AGI development to stay in the hands of an American organization. At times this grated on employees who were not American. During lunch, they would question why it had to be an American organization, recalled a former employee. Why not one from Europe? Why not one from China?

In these discussions philosophizing about the long-term implications of AI research, many employees returned often to Altman's early analogy between OpenAI and the Manhattan Project. Was OpenAI really building the equivalent of a nuclear weapon? It stood in stark contrast to the idealistic culture the company had built up to that point as a largely academic organization. On Fridays after a long week, staff would kick back with music and wine nights, unwinding to the soothing sound of a rotating cast of colleagues playing the office piano late into the night.