Here's how the script works: first it computes the value of a given percentile of the image. For example, 0.5 would be the median, the brightness value where half the pixels in the image are brighter than that value and half are darker. For 0.7 that would be the brightness at which 70% of the pixels are darker than that value and 30% are brighter. Then it computes the same percentile of some target image, and applies an exposure correction to make the percentiles in the two images match. For example, if the median of our image is 120 and the median of the target image is 140, we would apply an exposure compensation of +20 (converted to EVs, of course) to try to reach that target value. The reason the script does multiple iterations is that ACR's tone curve is not linear and is not precisely known, so the EV is only a best guess. We then check how close that guess got us; say our correction actually landed us at 138. We would then apply just a little bit more to get to 140.
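Here's a minimal sketch of that loop, assuming we work on a luminance array with numpy. The `apply_exposure` function is a hypothetical stand-in for the real renderer (the actual script drives ACR, whose tone curve isn't known in closed form, which is exactly why the loop iterates instead of trusting the first guess):

```python
import numpy as np

def match_percentile(image, target, percentile=0.5, iterations=5, tolerance=0.5):
    """Iteratively adjust exposure (in EV) until `image`'s chosen
    percentile matches the same percentile of `target`."""
    target_value = np.percentile(target, percentile * 100)
    ev = 0.0
    for _ in range(iterations):
        current = apply_exposure(image, ev)  # re-render at the current EV guess
        current_value = np.percentile(current, percentile * 100)
        if abs(current_value - target_value) < tolerance:
            break
        # First-order correction: in linear light, +1 EV doubles brightness,
        # so the EV step is the log2 ratio of the two percentile values.
        ev += np.log2(target_value / max(current_value, 1e-6))
    return ev

def apply_exposure(image, ev):
    # Placeholder for the real renderer. In linear light, an exposure
    # change of `ev` stops scales every pixel value by 2**ev.
    return np.clip(image * (2.0 ** ev), 0, 255)
```

For the worked example above, the first guess in linear light would be log2(140/120) ≈ +0.22 EV; if the re-rendered image then measures 138 instead of 140, the next pass adds the small remaining log2(140/138) nudge.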
This method works because of the way brightness tends to be distributed in 'real' images. A histogram of an image is simply a type of probability distribution, and image histograms are rarely symmetric. If you change the spread (the standard deviation) of a skewed distribution, the average shifts while the median stays put; outliers like blown highlights also pull the average around but barely move the median. We would say that the median (or a percentile, which is the same idea generalized) is more 'statistically robust' than something like an average.
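A quick numpy demonstration of that robustness, using a synthetic skewed 'histogram' (a gamma distribution, standing in for a typical photo with a long highlight tail; the distribution parameters here are just illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
# Skewed brightness distribution, like a photo with a long highlight tail.
pixels = np.clip(rng.gamma(shape=2.0, scale=30.0, size=100_000), 0, 255)

median = np.median(pixels)
# Stretch the distribution around its median: more spread, same center.
stretched = np.clip(median + (pixels - median) * 1.5, 0, 255)

for name, data in [("original", pixels), ("stretched", stretched)]:
    print(f"{name:9s} mean={data.mean():6.1f} median={np.median(data):6.1f}")
```

The mean drifts when the spread changes, while the median barely moves, which is why matching percentiles across images is more stable than matching averages.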