Here's another analogy to help understand why test-centered accountability doesn't work well.
All the heat in my house is run by a single thermostat. My house has three stories and a basement. The thermostat is on the first floor. The furnace runs into two out of four rooms on the second floor. There are no furnace runs to the third floor (a converted attic space).
The thermostat is supposed to turn the furnace off and on based on the temperature in the house. But it only measures the temperature in one room. In a second-floor bedroom, the temperature may be uncomfortably cold, but the thermostat doesn't measure that. In the attic room, a space heater may have the room super-warm, but the thermostat doesn't know that. The thermostat is by the front door-- if that door opens and cold air comes pouring in, the thermostat thinks the whole house is cold.
In short, the thermostat is an inaccurate measure of the temperature in my home because it only measures the temp in one place.
It's true that the thermostat is crudely accurate-ish. If the thermostat thinks the house is at 30 degrees, it's probably safe to conclude that the whole house is cold-- although it could also mean that the front door has blown open. If the thermostat says the house temperature is 90, it's probably safe to assume that the furnace doesn't need to kick on (and the odds are that the house isn't on fire).
The temperature at that single location isn't a completely useless proxy for the temperature at other locations in the house, if we do a lot of correcting and seat-of-pants compensation for the ways in which the system is built to fail. If we start insisting that the temperature reported by the thermostat is the exact same temperature in every other part of the house, we're in trouble.
We're in even more trouble if we start using the thermostat read-out as a proxy for other things entirely, like how comfortable a room is, or how bright the light is in a room, or how nice the room smells, or how loud the room is. It takes a Grand Canyon-sized leap to figure that the reported temperature in one location can be used to determine other factors in other parts of the house.
Likewise, we will fail if we try to use the thermostat read-out to evaluate the efficiency of the power generating and delivery capabilities of our electric company, or evaluate the contractor who built the house (in my case, almost a hundred years ago), or evaluate the health and well-being of the people who live in the house-- or to jump from there to judging the effectiveness of the doctor who treats the people who live in the house, or the medical school that trained that doctor.
At the end of the day, the thermostat really only measures one thing-- the temperature right there, in the place where the thermostat is mounted. To use it to measure any other part of the house, or any other aspect of any other part of the house, or any aspect of the people who live in the other parts of the house-- well, that just means we're moving further and further out on a shaky limb of the Huge Inaccuracy Tree.
In this way, the thermostat is much like the Big Standardized Test-- really only good at measuring one small thing, and not a reliable proxy for anything else. Try this analogy the next time someone asks you why it doesn't make sense to use a single BS Test to measure students, schools, teachers, and the full range of educational activity.
It's not a perfect analogy. It doesn't, for instance, address the example of a thermostat that sets off a bomb instead of starting the furnace. Nor does it address the superstitious belief that the more often you look at a thermostat, the warmer your house becomes. But it's hard to come up with an analogy that captures all the ways in which test-centered accountability is a mess. This will do for a starter.