Why one picture reader needed two sets of eyes
Before sunrise, the parcel hall is already humming. At one table, a worker squints at a smeared flat number. Up on the balcony, a dispatcher spots that two bags from opposite corners need the same long run. One pair of eyes can't do both without slowing the whole line.
Many newer picture readers worked too much like the dispatcher on the balcony. They were good at linking far-apart bits, but they had no built-in knack for tiny nearby details, or for spotting the same thing when it turns up small in one picture and large in another. One tool was being asked to do two jobs.
ViTAE changed the sorting stations. Before parcels were bundled into bigger groups, the hall gathered clues from nearby, mid-range, and wider areas. At the same time, one track kept reading close-up marks while another kept linking far-apart bundles. That's the new move: two tracks running side by side.
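For readers who like to see the idea in code, here is a toy sketch of "gather clues at several ranges, then bundle". It is not ViTAE's actual code: the picture is shrunk to a single row of numbers, the names `dilated_view` and `gather_then_bundle` are made up for illustration, and the clue-gathering uses plain averages where a real model would use learned weights.

```python
def dilated_view(row, centre, rate):
    """Average three samples spaced `rate` apart around `centre`.

    Positions that fall outside the row count as zero (zero padding).
    A small `rate` looks nearby; a large `rate` looks further out.
    """
    total = 0.0
    for offset in (-rate, 0, rate):
        i = centre + offset
        if 0 <= i < len(row):
            total += row[i]
    return total / 3.0


def gather_then_bundle(row, rates=(1, 2, 4), stride=2):
    """Keep every `stride`-th position, with one clue per range in `rates`.

    The rates and stride are illustrative assumptions, not values from ViTAE.
    """
    return [
        [dilated_view(row, centre, rate) for rate in rates]
        for centre in range(0, len(row), stride)
    ]


row = [0, 0, 3, 0, 0, 0, 6, 0]
bundles = gather_then_bundle(row)
for clues in bundles:
    print(clues)
```

Each bundle ends up holding a near, middle, and wide clue about the same spot, so the row can be halved in length without throwing away what the wider views noticed.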
Later stations kept that rhythm. One track watched textures and edges close by. The other checked which distant parts belonged together. Then the two were mixed before moving on. A later version stacked the hall into levels, so fine detail and broad layout stayed alive together for harder jobs.
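The two-track station can be sketched the same way. This is a minimal toy under stated assumptions, not the model's real layers: each "token" is a single number, the close-up track is a plain neighbour average, the far-view track is a bare-bones softmax attention, and the mixing is simple addition. The function names are hypothetical.

```python
import math


def local_track(tokens):
    """Close-up track: average each token with its immediate neighbours."""
    out = []
    for i in range(len(tokens)):
        window = tokens[max(0, i - 1): i + 2]  # edges use whatever exists
        out.append(sum(window) / len(window))
    return out


def global_track(tokens):
    """Far-view track: every token looks at every token.

    Similarity is the product of two values; softmax turns the
    similarities into weights that sum to one.
    """
    out = []
    for query in tokens:
        scores = [query * key for key in tokens]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        norm = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, tokens)) / norm)
    return out


def two_track_station(tokens):
    """Run both tracks on the same input, then mix by adding."""
    return [a + b for a, b in zip(local_track(tokens), global_track(tokens))]


tokens = [1.0, 0.0, 0.0, 1.0]
near = local_track(tokens)
far = global_track(tokens)
mixed = two_track_station(tokens)
print(near, far, mixed)
```

In this toy the two end tokens are far apart, so the close-up track never connects them, while the far-view track pulls each towards the other because their values match. The mixed output carries both kinds of evidence.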
In the busiest early levels, the far-view track didn't scan the whole hall at once. It only watched a small section, because the close-up track was already carrying enough place clues across neighbouring sections. When part of a picture was covered, the close-up track briefly narrowed to one visible square, then widened again later.
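The "small section" idea can also be shown in a few lines. Again this is only a sketch with made-up names and a window size chosen for illustration: the far-view step runs separately inside each section instead of across the whole row, which is cheaper because each token compares itself with far fewer others.

```python
import math


def attend(tokens):
    """Bare-bones softmax attention over a list of single-number tokens."""
    out = []
    for query in tokens:
        scores = [query * key for key in tokens]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        norm = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, tokens)) / norm)
    return out


def windowed_far_view(tokens, window=2):
    """Apply `attend` inside each section of `window` tokens, never across."""
    out = []
    for start in range(0, len(tokens), window):
        out.extend(attend(tokens[start:start + window]))
    return out


tokens = [1.0, 1.0, 0.0, 0.0]
whole_hall = attend(tokens)
by_section = windowed_far_view(tokens, window=2)
print(whole_hall)
print(by_section)
```

With sections, the first two tokens only see each other and the last two only see each other. On its own that would cut the sections off from one another, which is why it matters that a close-up track is also passing information between neighbours.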
That split of labour held up in small builds and giant ones. It reached strong results with less practice and less data than similar picture readers, and it also did better at jobs like finding objects, drawing cleaner regions, and following body points. The gain came when it stopped forcing one crew to learn every habit from scratch.