The Torn Notice That Taught a Machine to Read Both Ways
At the crosswalk, I squinted at a torn notice taped to a pole. The middle was gone, but the top and bottom lines were still there, so I guessed the missing words by using both sides at once. That same trick is what a tool called BERT tries to copy when it reads.
Older tools acted like they could only read the notice from the top line down, one step at a time. If the clue was on the bottom line, the guess came too early. Some tools added a second pass from the bottom up, but joining the two passes afterward felt like taping two half-answers together.
BERT practiced by making the damage on purpose. It would cover bits of a sentence, like tape over words, then try to name what was hidden using the nearby words. Most of the time the covered spot looked like a blank; sometimes it got a wrong word stuck on it, and sometimes it looked normal but still got tested.
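Here is a rough sketch of that taping game in Python, in case seeing it spelled out helps. The proportions (pick about 15% of the words; of those, 80% become a blank, 10% get a random word, 10% stay as they are) are the ones reported for the original BERT, not something this story measured. The function name mask_tokens, the tiny vocab list, and working on whole words instead of word pieces are my own simplifications.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, rng, select_prob=0.15):
    """Return (corrupted, targets); targets maps position -> original word."""
    corrupted = list(tokens)
    targets = {}
    for i, word in enumerate(tokens):
        if rng.random() >= select_prob:
            continue  # not chosen: left alone and not tested
        targets[i] = word  # chosen: the reader must recover this word
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK  # looks like a blank
        elif roll < 0.9:
            # a random word stuck on it (may occasionally match the original)
            corrupted[i] = rng.choice(vocab)
        # otherwise: looks normal but is still tested
    return corrupted, targets

# Hypothetical example sentence and vocabulary, just for illustration.
notice = "the road is closed until friday because of the parade".split()
corrupted, targets = mask_tokens(
    notice, vocab=["crowd", "closure", "signal"], rng=random.Random(0)
)
```

Every position listed in targets is one the reader gets quizzed on, whether or not it looks damaged.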
The new part was that BERT looked left and right at the same time while it practiced, all the way through its inner steps. It was like me using the bottom line of the notice to decide whether the missing word was about a closure or a crowd. Takeaway: using both sides cuts down on wild guesses.
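The difference between one-way and two-sided reading can be shown with a few lines of Python. This is only a toy: context_for_blank is a made-up helper that lists which words a reader is allowed to use for a blank, not how any real model is written.

```python
def context_for_blank(tokens, blank_index, both_sides):
    """Words a reader may use when guessing the word at blank_index."""
    left = tokens[:blank_index]
    right = tokens[blank_index + 1:]
    # A top-down reader only has what came before the blank;
    # a two-sided reader also gets what comes after it.
    return left + right if both_sides else left

notice = ["road", "[MASK]", "for", "parade"]
one_way = context_for_blank(notice, 1, both_sides=False)
two_sided = context_for_blank(notice, 1, both_sides=True)
```

Here one_way is ["road"], while two_sided is ["road", "for", "parade"], which is exactly the bottom-line clue from the torn notice.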
BERT also learned when two sentences belong together. It was like finding a second page under the torn notice, then checking if it truly continued the message or if someone taped on a random page. That helps when meaning depends on how two bits of writing connect.
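The second-page check can be sketched the same way. Half the time the pair is a real continuation, half the time the second sentence is pulled from somewhere else. The helper make_sentence_pair and the sample sentences are hypothetical, and picking the random page from the same short list is a simplification I am assuming for brevity.

```python
import random

def make_sentence_pair(sentences, i, rng):
    """Return (first, second, is_next) for sentence i; needs len(sentences) >= 3."""
    first = sentences[i]
    if rng.random() < 0.5:
        return first, sentences[i + 1], True  # the real next page
    # A random page: anything except the first sentence and its true continuation.
    candidates = [s for j, s in enumerate(sentences) if j not in (i, i + 1)]
    return first, rng.choice(candidates), False

pages = [
    "Main Street is closed on Friday.",
    "Use the side road instead.",
    "Lost cat, answers to Pepper.",
    "Piano lessons, first one free.",
]
first, second, is_next = make_sentence_pair(pages, 0, random.Random(0))
```

The label is_next is what the reader has to guess from the two sentences alone.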
After all that practice, the same reading skill could be reused for different jobs with only a small extra piece attached at the end. The notice-reading stays, but the request changes, like filling a blank, spotting a name, or pointing to the line that answers a question.
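To show the shape of that idea, here is a deliberately tiny Python sketch: one shared reading step, then two small pieces that each read its output for a different job. Everything here is a stand-in. toy_encoder is not a trained model, and the two heads are hand-written rules where a real setup would learn a small layer for each job.

```python
def toy_encoder(tokens):
    """Stand-in for the pretrained reader: one small vector per token.
    Hypothetical; a real encoder produces learned, context-aware vectors."""
    return [[float(len(t)), float(t[0].isupper())] for t in tokens]

def name_spotting_head(vectors):
    """Per-word job: flag words that look like names (here: capitalised)."""
    return [v[1] > 0.5 for v in vectors]

def answer_pointing_head(vectors):
    """Pointing job: pick one position (here: the longest word)."""
    lengths = [v[0] for v in vectors]
    return lengths.index(max(lengths))

tokens = ["Closure", "on", "Main", "Street"]
vectors = toy_encoder(tokens)          # the shared reading step, done once
names = name_spotting_head(vectors)    # [True, False, True, True]
answer = answer_pointing_head(vectors) # 0
```

The point is only that toy_encoder runs once and stays the same, while the small piece at the end is swapped to suit the request.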
When the walk signal changed, I didn’t need a new habit for every kind of notice. I just used both sides and checked if the next line really fit. That’s why BERT mattered compared to before: it built a general, two-sided reading habit first, then carried it into lots of everyday text tools people now run into.