The form said calm, but the role play said something else
In a noisy youth camp hiring hall, I watched an applicant circle a neat score for “I stay calm under pressure”. Then the supervisor slid over a role play: a child crying, a storm rolling in, two staff arguing. The form felt tidy. The choice didn’t.
People do a similar thing with chatty AI text generators. They hand over the same personality questionnaires humans fill in, and the answers sound steady and believable. But a smooth self-description isn’t the same as a steady decision when things get messy.
So a research team matched each “I am like this” line with a “what would you do” scene. They built lots of pairs: a short statement, then an everyday situation with two clear actions, one that fits the statement and one that clashes with it. They made the pairs in both English and Chinese, so that each pair meant the same thing in either language.
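To make the pairing concrete, here is a minimal sketch of how one statement-and-scene pair could be stored. The class name, the field names and the example wording are all made up for illustration; they are not taken from the team's materials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraitPair:
    """One 'I am like this' statement tied to one 'what would you do' scene.

    Every field name here is a hypothetical choice for this sketch.
    """
    statement: str        # the self-description that gets rated on the form
    situation: str        # the everyday scene used in the role play
    fitting_action: str   # the action that fits the statement
    clashing_action: str  # the action that clashes with the statement
    language: str         # "en" or "zh", since pairs exist in both languages


# Invented example content, loosely based on the camp scene above.
example_pair = TraitPair(
    statement="I stay calm under pressure.",
    situation="A child is crying, a storm is rolling in, and two staff are arguing.",
    fitting_action="Settle the child first, then calmly split up the tasks.",
    clashing_action="Snap at the arguing staff and rush everyone indoors.",
    language="en",
)
```

Keeping the fitting and clashing actions in named fields, rather than as "A" and "B", means the order shown to a respondent can be shuffled without losing track of which action matches the statement.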
Then they gave humans and the AI systems the same double test. First came the form: rate each statement on a numbered scale, asked in a few different ways to stop the wording steering the answer. Then came the role play: lean towards Action A, sit in the middle, or lean towards Action B.
Some AI systems couldn’t even stick to the simple rating style, so they were dropped. For the rest, the team checked for basic self-contradictions, like agreeing with “I’m patient” and also agreeing with “I’m not patient” when the same question is asked the other way round. Only the steadier ones stayed in.
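The self-contradiction check can be sketched in a few lines. This assumes a 1 to 5 scale where higher means more agreement; the text only says "a numbered scale", so the scale, the function name and the midpoint rule are assumptions, not the team's actual filter.

```python
def is_self_contradictory(rating: int, reversed_rating: int, scale_max: int = 5) -> bool:
    """Flag a respondent who lands on the same side of a statement and its opposite.

    rating          -- score given to the statement, e.g. "I'm patient"
    reversed_rating -- score given to the flipped statement, e.g. "I'm not patient"
    scale_max       -- top of the scale (assumed to run from 1 to scale_max)
    """
    midpoint = (1 + scale_max) / 2
    agrees_with_both = rating > midpoint and reversed_rating > midpoint
    disagrees_with_both = rating < midpoint and reversed_rating < midpoint
    return agrees_with_both or disagrees_with_both


# Agreeing with both "I'm patient" and "I'm not patient" is a contradiction.
print(is_self_contradictory(5, 4))
# Agreeing with one and disagreeing with the other is consistent.
print(is_self_contradictory(5, 1))
```

A neutral answer on either statement is not counted as a contradiction here, which is a deliberately lenient choice.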
Humans mostly matched themselves: the form and the role play pointed the same way. The AI systems, on average, didn’t. They could sound calm on the form, then pick the jumpy, impatient option in the scene. One system, GPT-4, came closer to the human pattern than the others, but still fell short of typical human consistency.
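One simple way to picture "the form and the role play pointed the same way" is to count how often they agree. The sketch below is an illustration only, not the measure the team used: it assumes a 1 to 5 form scale and labels each role-play answer as "fits", "middle" or "clashes", all of which are hypothetical names.

```python
from typing import Optional


def form_direction(rating: int, scale_max: int = 5) -> int:
    """Return 1 if the form rating agrees with the statement, -1 if it disagrees, 0 if neutral."""
    midpoint = (1 + scale_max) / 2
    if rating > midpoint:
        return 1
    if rating < midpoint:
        return -1
    return 0


def scene_direction(choice: str) -> int:
    """Return 1 for the fitting action, -1 for the clashing action, 0 for sitting in the middle."""
    return {"fits": 1, "middle": 0, "clashes": -1}[choice]


def agreement_rate(responses: list[tuple[int, str]]) -> Optional[float]:
    """Share of pairs where form and scene point the same way.

    Pairs with a neutral rating or a middle choice are skipped.
    Returns None if nothing is left to compare.
    """
    decided = [
        (form_direction(rating), scene_direction(choice))
        for rating, choice in responses
        if form_direction(rating) != 0 and scene_direction(choice) != 0
    ]
    if not decided:
        return None
    matches = sum(1 for form, scene in decided if form == scene)
    return matches / len(decided)


# Invented answers: calm on the form twice, but calm in the scene only once.
made_up_responses = [(5, "fits"), (5, "clashes"), (1, "clashes"), (3, "fits")]
rate = agreement_rate(made_up_responses)
```

In this made-up example the rate works out to two matches out of three decided pairs. The second pair is the pattern described above: a high "calm" rating on the form followed by the impatient option in the scene.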
Back in the camp hall, the supervisor didn’t look impressed by tidy circles on paper. The supervisor watched the choices. That’s the takeaway: a personality label only matters if it shows up in decisions. If you want to trust an AI in a support role, check the role play, not just the form.