Personally, I could go along with any number of boxes and any way of drawing them. But the idea of filling them alternately vertically and horizontally makes no graphical sense. If the horizontal axis connotes light in stops/substops, then subdivisions of light should absolutely be oriented horizontally. There's no way that a user could intuitively infer that the vertical axis *also* represents stops.
Think about histobar like this: A traditional exposure histogram represents the distribution of light with maximum horizontal resolution (value) and vertical resolution (quantity). As such it provides a detailed picture of the distribution. But it has drawbacks: 1) distributions often have skinny tails and peaks that, although important, are hard to see; 2) the relationship between light distribution and camera stops is not always apparent; and 3) scene content that the photographer considers outside the dynamic range of the camera is nonetheless squeezed into the visualization of what's "in" the picture.
Histobar essentially quantizes both dimensions of the histogram so that 1) skinny tails have as much visual weight as fat bellies, 2) level distributions are binned in stops, 1/2 stops, etc., and 3) out-of-range values are accounted for outside the normal values (in the latest concept).
The extreme vertical quantization is like saying to the user, "There are enough pixels at this level that you are going to try to expose them properly." With a1ex's request for demoting the lower 5th percentile, it says, "There are less worthy pixels here (due to likely noise), so try to expose them, but sacrifice these first."
So in the latest concept, the vertical axis represents the worthiness of the pixels. There are two worthiness values: high and low.
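To make the double quantization concrete, here is a rough Python sketch of how I picture it. It is not actual Magic Lantern code. The function names, the 11-stop range in 1/2-stop boxes (my guess at how 22 steps come about), and the rule that "lower 5th percentile" means the boxes holding the darkest 5% of in-range pixels are all assumptions made for illustration.

```python
def histobar(ev_values, lo_ev=-5.5, hi_ev=5.5, steps_per_stop=2, low_pct=5.0):
    """Quantize per-pixel exposure values (in stops) into worthiness boxes.

    Returns (under, levels, over): the counts of out-of-range pixels on
    each side, plus one level per box (0 = empty, 1 = low, 2 = high).
    All defaults are illustrative assumptions, not camera values.
    """
    n_boxes = round((hi_ev - lo_ev) * steps_per_stop)
    counts = [0] * n_boxes
    under = over = 0

    # Horizontal quantization: bin each pixel by stop/substop, and keep
    # out-of-range pixels out of the normal boxes.
    for ev in ev_values:
        if ev < lo_ev:
            under += 1
        elif ev >= hi_ev:
            over += 1
        else:
            box = int((ev - lo_ev) * steps_per_stop)
            counts[min(box, n_boxes - 1)] += 1

    # Vertical quantization: a box is empty, low, or high. Walking up from
    # the darkest box, any box that still lies entirely within the darkest
    # low_pct percent of in-range pixels is demoted to "low".
    threshold = sum(counts) * low_pct / 100.0
    levels = []
    seen = 0
    for count in counts:
        if count == 0:
            levels.append(0)
        elif seen + count <= threshold:
            levels.append(1)
        else:
            levels.append(2)
        seen += count
    return under, levels, over


def draw(levels):
    """Render the boxes as text: '.' empty, 'o' low, '#' high."""
    return "".join(".o#"[level] for level in levels)
```

Note that a skinny tail of only a few pixels still gets a full box, which is the whole point, and pixels outside the range are only counted at the two ends instead of being squeezed into the boxes.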
I agree that horizontal steps should be designed so the user can intuitively and quickly dial in exposure. Whereas the latest concept has 22 steps, I can try to draw an 11x2 and/or 11x3 approach too. But I'd like to hear from more people first.