The airport rehearsal that predicted a much bigger day

The airport was still half-dark when I walked past the shut cafés to the information desk. That day we were switching on a new helper. We’d practised in a spare room with fake gates and volunteers, like rehearsing a busy holiday rush with a tiny crowd and a notebook of guesses.

One good answer wasn’t the worry. The worry was the whole building at once: crowds, odd questions, and mistakes said out loud. We’d seen shiny new systems behave nicely in a corner, then go strange when everyone relied on them, and the fixes came too late.

So we treated the rehearsal like a measuring tape. We ran the helper in smaller versions and watched how often each one stumbled as we gave it more room to work, more practice, and messier questions. If that stumble rate changed in a steady way from one size to the next, we could extend the pattern to the full-size helper and plan the rollout with fewer shocks.

When the doors finally filled, the helper didn’t just handle one counter. Staff could paste a paragraph or attach a photo of a sign or form, and it would reply with what it thought it saw and what to do next. On tough written tests, it often did better than older helpers, but not on every task.

The part I watched was the confident wrongness. Sometimes the helper invented details, skipped a step, or nodded along with a bad suggestion. It also had a last-updated point, like a guidebook printed before a rule change. We brought in outsiders to try to trick it, then added guard layers so it learnt to refuse risky requests, but cracks still showed.

By nightfall I shut the notebook and noticed the real change. Before, we crossed our fingers and hoped the big day matched the small test. Now we could forecast behaviour ahead of time, then track the trade-offs as we tightened the guards. For most people, that should mean steadier helpers, but still ones you double-check when it matters.