I think it's memory fragmentation - the burst algorithm is not able to handle the fragments, so it just looks for large contiguous blocks.
One problem is that EDMAC (the thing that outputs image data) can only write the full image in a contiguous block (I don't know yet how to program it to skip lines or to switch chunks in the middle - g3gg0 has some ideas though). So, we have to reserve memory for full image blocks, even if we crop the image - otherwise the EDMAC will overwrite who knows what.
And the second problem is that, if you can allocate, say, 180 MB, you only get it in small chunks (between 1 and 20 MB each).
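To see how these two problems interact, here's a minimal C sketch (sizes and names made up for illustration; this is not the real Magic Lantern allocator or EDMAC code). If every frame slot reserves a full frame, so an EDMAC write can never spill past the end of a chunk, each chunk holds only size / frame_size frames and the rest is dead space:

#include <stdio.h>

#define FRAME_SIZE 20   /* EDMAC always writes a full frame (hypothetical size) */

int main(void)
{
    /* the allocator hands back the memory as small scattered pieces */
    int chunk_sizes[] = { 41, 58 };
    int num_chunks = sizeof(chunk_sizes) / sizeof(chunk_sizes[0]);

    int frames = 0, total = 0;
    for (int i = 0; i < num_chunks; i++)
    {
        /* reserve a full FRAME_SIZE block per frame, so the EDMAC
         * write can never overrun the end of the chunk */
        int fit = chunk_sizes[i] / FRAME_SIZE;
        printf("chunk %d: %d units -> %d full frames (%d left over)\n",
               i + 1, chunk_sizes[i], fit, chunk_sizes[i] - fit * FRAME_SIZE);
        frames += fit;
        total  += chunk_sizes[i];
    }
    printf("total: %d frames in %d units\n", frames, total);
    return 0;
}

With the chunk sizes used in the diagram below (41 and 58 units), that's only 4 frames in 99 units; overlapping the slots, as described next, raises this to 7.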
If you also want to crop the image for faster write speeds, you can either allocate memory for full image blocks (the safe route) or overlap them a bit (I'm doing that in the burst implementation). The simplest case is when you only crop the bottom part of the image.
Example: let's say a frame takes 20 memory units, and we crop 10 units from the bottom:
 _____________________________________________________________________________________________________
|memory chunk 1                           |memory chunk 2                                            |
|*****************************************|**********************************************************|
|[frame 1 ][frame 2 ][frame 3 ]--unused---|[frame 4 ][frame 5 ][frame 6 ][frame 7 ]------unused------|
|[frame 1 from EDMAC]                     |[frame 4 from EDMAC]                                      |
|          [frame 2 from EDMAC]           |          [frame 5 from EDMAC]                            |
|                    [frame 3 from EDMAC] |                    [frame 6 from EDMAC]                  |
|                              [frame 4 doesnt fit]                      [frame 7 from EDMAC]        |
|                                         |                                        [frame 8 doesnt fit]
|_________________________________________|__________________________________________________________|
So, in this simple example, 29 memory units out of 100 were wasted. If the chunks are small - say 1 or 2 frames per chunk - there will be a lot of memory wasted because of fragmentation.
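And a short simulation of the overlap scheme (same made-up sizes as above, not actual ML code): cropped frames are placed every 10 units, and a slot is usable only if a full 20-unit EDMAC write starting there still fits inside the chunk.

#include <stdio.h>

#define FRAME_SIZE   20   /* what EDMAC actually writes */
#define CROPPED_SIZE 10   /* what we keep after cropping the bottom */

int main(void)
{
    int chunk_sizes[] = { 41, 58 };   /* the two chunks from the diagram */
    int num_chunks = sizeof(chunk_sizes) / sizeof(chunk_sizes[0]);

    int total = 0, used = 0;
    for (int i = 0; i < num_chunks; i++)
    {
        /* frame slots start every CROPPED_SIZE units; slot k is usable
         * only if the full write fits: k*CROPPED_SIZE + FRAME_SIZE <= size */
        int fit = 0;
        while (fit * CROPPED_SIZE + FRAME_SIZE <= chunk_sizes[i])
            fit++;
        printf("chunk %d: %d units -> %d frames\n", i + 1, chunk_sizes[i], fit);
        total += chunk_sizes[i];
        used  += fit * CROPPED_SIZE;
    }
    printf("wasted: %d of %d units\n", total - used, total);
    return 0;
}

This prints 3 frames for the first chunk, 4 for the second, and 29 wasted units (the diagram holds 41 + 58 = 99 units, of which the 7 stored frames use only 70).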
Edit: just committed this explanation:
https://bitbucket.org/hudson/magic-lantern/commits/4af113da963c