Prediction #5 from:
My Takeaways From AI 2027 - by Scott Alexander
Must be a state-of-the-art model, used by millions of people, not just a research project.
I'm confused by this. Doesn't the presence of similar features regardless of the "language" the model is using essentially imply that (1) if we have good interpretability, neuralese won't be hard to decipher, and (2) models won't need to be specifically trained for this, or it won't confer much benefit?

Good question. Scott suggests that neuralese could be "good for capabilities (neuralese is faster and richer than English) but dooms alignment." But I'm not sure whether that's true, or how confident he is about it.
@TimothyJohnson5c16 The development or prevention of neuralese is a pivotal act determining whether alignment succeeds in the story. The story implies there's a lot of pressure pushing neuralese to happen.