The Builder Who Cannot See Bricks
Imagine a high-speed building site where a master builder snaps skyscrapers together in seconds. They don't lay single bricks. Instead, they grab massive pre-made wall panels labelled 'Kitchen' or 'Library' and slot them into place. This is how a modern AI language model reads. It doesn't look at letters one by one but grabs pre-made chunks of text, called tokens, each a whole word or a piece of one, to build sentences fast.
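To make the panel idea concrete, here is a minimal sketch of chunking in Python. The vocabulary and the greedy longest-match rule are assumptions chosen for illustration. Real tokenizers learn far larger vocabularies from data and may split text differently.

```python
# Hypothetical toy vocabulary of pre-made "panels" (an assumption for this sketch).
VOCAB = {"straw", "berry", "build", "ing"}

def tokenize(text, vocab=VOCAB):
    """Split text into chunks by greedy longest match against the vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate chunk first, then shrink it.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No known chunk starts here: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens

print(tokenize("strawberry"))  # ['straw', 'berry']
print(tokenize("building"))    # ['build', 'ing']
```

A ten-letter word arrives as just two panels, which is why the builder is fast.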
Trouble starts when the architect asks for a tiny change. "Remove the third brick from the left in that wall panel," they say. The builder freezes. Because they only work with heavy sealed panels, they cannot see the bricks inside. To them, the panel is one solid block. This is the AI's blind spot. It knows what a word means, but it cannot actually see the letters that make it up.
The site manager decides to test this. When asked to count exactly how many bricks are in a panel, the builder just guesses based on size and gets it wrong. But if asked to swap entire panels to make a new shape, they do it perfectly because they have memorised the floor plans. It is the same for AI. It struggles to count characters in a word, yet it can rearrange whole phrases with ease.
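The site manager's test can be sketched in a few lines of Python. The token IDs below are made up for illustration, and the split of the word into two panels is an assumption. The point is only that the builder receives opaque labels, so counting bricks requires opening the panels while swapping panels does not.

```python
# Hypothetical panels and ID numbers (assumptions for this sketch).
TOKEN_IDS = {"straw": 101, "berry": 102}
tokens = ["straw", "berry"]

def model_view(tokens):
    """What the builder actually receives: panel labels, not spellings."""
    return [TOKEN_IDS[t] for t in tokens]

def count_letter(tokens, letter):
    """Counting bricks means opening every panel and looking inside."""
    return sum(t.count(letter) for t in tokens)

def swap_panels(tokens):
    """Rearranging whole panels never needs to look inside them."""
    return tokens[::-1]

print(model_view(tokens))         # [101, 102]
print(count_letter(tokens, "r"))  # 3
print(swap_panels(tokens))        # ['berry', 'straw']
```

Nothing in the list of IDs says how many times a letter appears. That information lives only in the spellings, which the builder never handles.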
On international projects, the materials change. On a site using Chinese-style architecture, each panel is a single distinct unit, much as each Chinese character is, so the builder makes fewer mistakes. But on a Korean site, the panels are shells hiding complex internal frames, just as each Korean syllable block is assembled from smaller letters. The builder's reliance on solid blocks goes badly wrong here. This highlights how the 'chunk' method works better for some writing systems than others.
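The difference between the two sites can be seen with Python's standard unicodedata module. This is a sketch of how the characters themselves are built in Unicode, not of how any particular AI model handles them. The sample words are chosen only for illustration.

```python
import unicodedata

def open_panels(text):
    """Canonical decomposition (NFD): splits precomposed Hangul syllables into their letters."""
    return list(unicodedata.normalize("NFD", text))

chinese = "汉字"                                  # two Chinese characters
korean = unicodedata.normalize("NFC", "한글")     # two Korean syllable blocks

# Each Chinese character stays a single unit when decomposed.
print(len(chinese), len(open_panels(chinese)))   # 2 2

# Each Korean syllable block opens up into three smaller letters.
print(len(korean), len(open_panels(korean)))     # 2 6
```

Two panels on the Chinese site hold two units. Two panels on the Korean site hide six.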
The lesson is that all this speed came at the cost of precision. To fix it, the crew knows they cannot just build bigger cranes. They need new tools that can X-ray the panels or handle individual bricks again. It is a shift from simply moving heavy meaning around to truly seeing the fine details that hold the structure together.