If you open the two DNGs (8564 and 65) and apply the exposure compensations from the XMPs (2.14 and 1.81), you will get two identical exposures. The histograms of the developed JPEGs confirm this (GIMP, for example, prints the median value, which is where metering is done at default settings).
So, the deflicker algorithm worked well.
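To make the median check concrete, here is a rough sketch of the arithmetic. It is not the actual deflicker code: it assumes linear, black-subtracted pixel values and treats an exposure compensation of N EV as a plain gain of 2^N, which a real raw converter only approximates. The function names and sample values are made up for illustration.

```python
import math
from statistics import median


def ev_to_gain(ev):
    """Linear multiplier for an exposure compensation given in stops (EV)."""
    return 2.0 ** ev


def ev_to_reach(current_median, target_median):
    """EV correction that moves current_median onto target_median (linear data)."""
    return math.log2(target_median / current_median)


# Hypothetical linear pixel values; frame_b is exactly one stop brighter.
frame_a = [100, 120, 150, 200, 400]
frame_b = [v * 2 for v in frame_a]
target = 600  # arbitrary target median, chosen for the example

ev_a = ev_to_reach(median(frame_a), target)
ev_b = ev_to_reach(median(frame_b), target)

# After applying each frame's own correction, both medians land on the target.
median_a = median(v * ev_to_gain(ev_a) for v in frame_a)
median_b = median(v * ev_to_gain(ev_b) for v in frame_b)
print(ev_a, ev_b, median_a, median_b)
```

The point is only that two frames with different corrections should end up with the same median once each correction is applied, which is what the JPEG histograms show for the DNGs.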
However, if you apply the same exposure compensations to the CR2s, you will not get identical exposures.
BUT
If you extract the raw data (dcraw -4 -E *.CR2 *.dng) and compare each CR2 with its DNG pixel by pixel, they are 100% identical. There is a vertical offset of 14 pixels, if you want to try it. Of course, between 8564 and 65 there is a visible difference in the clouds (surprisingly enough, the tree leaves did not move at all).
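If you want to repeat the comparison, something along these lines should do. It is a sketch, assuming the extracted files are already loaded as 2-D NumPy arrays of the same width (loading is not shown). I do not state which file is shifted relative to the other, so the helper searches both directions; `rows_match` and `find_vertical_offset` are names made up for this example.

```python
import numpy as np


def rows_match(a, b, shift):
    """True if a[y + shift] == b[y] for every row the two images share."""
    if shift >= 0:
        a_part, b_part = a[shift:], b
    else:
        a_part, b_part = a, b[-shift:]
    n = min(len(a_part), len(b_part))
    # np.array_equal also returns False when the widths differ.
    return n > 0 and np.array_equal(a_part[:n], b_part[:n])


def find_vertical_offset(a, b, max_shift=32):
    """Smallest shift in [-max_shift, max_shift] with an exact match, else None."""
    for shift in range(-max_shift, max_shift + 1):
        if rows_match(a, b, shift):
            return shift
    return None


# Synthetic stand-in for the two extracted images: b is a with 14 rows cut off the top.
a = np.arange(80, dtype=np.uint16).reshape(20, 4)
b = a[14:]
print(find_vertical_offset(a, b))
```

An exact match on the overlapping rows is a much stronger test than comparing histograms, since a single differing pixel makes it fail.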
So we have two sets of images, with 100% identical raw data, which get rendered with different exposures. What could be the difference?
I'll let you guess the answer. Hint: bug confirmed, will fix. Great catch.