Author Topic: MLV App 1.14 - All in one MLV Video Post Processing App [Windows, Mac and Linux]  (Read 1096919 times)

70MM13

  • Hero Member
  • *****
  • Posts: 551
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1750 on: July 11, 2018, 04:08:59 PM »
That's cool.

How long would it take to grab the existing library from rawtherapee?

It is open source.

masc

  • Contributor
  • Hero Member
  • *****
  • Posts: 2011
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1751 on: July 11, 2018, 04:56:09 PM »
How long would it take to grab the existing library from rawtherapee?

It is open source.
It is not very compatible (C++, we are using C, so has to be adapted) and each algorithm has another undocumented interface. But I don't think that we will get faster with another algorithm. Just for testing I commented debayering out (looks very weird ;) ) but had nearly the identical speed as bilinear. On the other side: switching off processing brings factor 2.5 in terms of speed on my machine. In numbers: 8fps (normal) vs. 8 to 9fps (no debayer) vs 21fps (no processing) on a old notebook @FullHD.
5D3.113 | EOSM.202

ilia3101

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 990
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1752 on: July 11, 2018, 06:12:59 PM »
wow processing is slow.

Is that 21fps with no processing with no caching too?

bouncyball

  • Contributor
  • Hero Member
  • *****
  • Posts: 849
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1753 on: July 11, 2018, 06:40:52 PM »
brings factor 2.5 in terms of speed on my machine.
Yeah I almost forgot about it ;)
Processing, being multi threaded, still is the main performance hog.

masc

  • Contributor
  • Hero Member
  • *****
  • Posts: 2011
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1754 on: July 11, 2018, 07:06:10 PM »
wow processing is slow.
Is that 21fps with no processing with no caching too?
Yes. But bilinear has almost the same speed like AMaZE cached for me.
Maybe I should analyse where exactly the time goes by on processing...

Edit: all the matrix operations cost time (white balance & exposure the most)... but no idea how to accelerate this. The concept is fine in my eyes.
5D3.113 | EOSM.202

ilia3101

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 990
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1755 on: July 11, 2018, 09:26:52 PM »
Edit: all the matrix operations cost time (white balance & exposure the most)... but no idea how to accelerate this. The concept is fine in my eyes.

I will try out integer multiplication + right shift method, maybe will be faster than look up tables.

masc

  • Contributor
  • Hero Member
  • *****
  • Posts: 2011
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1756 on: July 12, 2018, 07:47:21 PM »
I will try out integer multiplication + right shift method, maybe will be faster than look up tables.
Sounds good Ilia!

I added two easy debayers to MLVApp! One is monochrome (just copies RAW data to image) and another very simple & stupid colored debayer which looks worse than bilinear. Both are multithreaded and a little faster than bilinear (but not much). So just for viewing motion in a clip it might be okay...
5D3.113 | EOSM.202

50mm1200s

  • Member
  • ***
  • Posts: 177
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1757 on: July 13, 2018, 04:49:35 AM »
Since you guys are talking about demosaicing again, thought I'd share this paper (Multi-Frame Demosaicing and Super-Resolution of Color Images):
http://people.duke.edu/~sf59/TIP_Demos_Final_Color.pdf

To be fair, I don't understand the math, but the method proposed uses super-resolution as a demosaicing algorithm and reconstitures not just luma, but also chroma. From what I understood, he uses low-res versions of the image as "training" of the SR method. Here's the abstract:

Quote
In the last two decades, two related categories of problems have been studied independently in the image restoration literature: super-resolution and demosaicing. A closer look at these problems reveals the relation between them, and as conventional color digital cameras suffer from both low-spatial resolution and color-filtering, it is reasonable to address them in a unified context. In this paper, we propose a fast and robust hybrid method of super-resolution and demosaicing, based on a MAP estimation technique by minimizing a multi-term cost function. The L1 norm is used for measuring the difference between the projected estimate of the high-resolution image and each low-resolution image, removing outliers in the data and errors due to possibly inaccurate motion estimation. Bilateral regularization is used for spatially regularizing the luminance component, resulting in sharp edges and forcing interpolation along the edges and not across them. Simultaneously, Tikhonov regularization is used to smooth the
chrominance components. Finally, an additional regularization term is used to force similar edge location and orientation in different color channels. We show that the minimization of the total cost function is relatively easy and fast. Experimental results on synthetic and real data sets confirm the effectiveness of our method.

And here's the practical example (click to enlarge):




The other paper shared some time ago combined sharpen inside demosaicing, this one combines SR. Maybe both ideas could be used together... humm.

50mm1200s

  • Member
  • ***
  • Posts: 177
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1758 on: July 13, 2018, 05:35:10 AM »
Another one. The idea of doing pre-demosaicing filtering seems to have spread these last years. This other paper presents denoising before demosaicing using machine-learning (Joint Demosaicing and Denoising via Learned Non-parametric Random Fields):
http://www.nowozin.net/sebastian/papers/khashabi2014demosaicing.pdf

The noise from CMOS or CCD generates interpolation errors (specially "read noise" and "shot noise", page 3), from what I've read on the paper. So, prior to demosaicing, they apply denoising. Below is a demonstration from the paper.  The RTF (Regression Tree Field) is their work, compared to CS (Contour Stencils), NLM (Non-Local Means), MBI (Matlab demosaic function) and the ground truth. Click to enlarge:




And this other paper presents a sharpening/debluring (using deconvolution) before demosaicing. This work is based on the same approach on the superresolution paper shared above (using MAP framework, if I understood right):
https://hal.archives-ouvertes.fr/hal-00408220/document

Demonstration (click to enlarge):




I wonder why no one bother to unite all these research for practical use.

50mm1200s

  • Member
  • ***
  • Posts: 177
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1759 on: July 13, 2018, 05:44:41 AM »
Another one, but using ANN (artificial neural network):
https://www.dcl.hpi.uni-potsdam.de/papers/papers/heinze-joint-multiframe.pdf

Demonstration compared to ACR (Adobe Camera Raw). Check the shadows and midtone noise on the car:

70MM13

  • Hero Member
  • *****
  • Posts: 551
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1760 on: July 14, 2018, 01:35:22 PM »
I'm between projects, so I finally got the chance to try your new colour balance functionality, and it's great!

Also, focus dots removal is finally working for me.  I use digital dolly and film exclusively at 3072*1308, so it's never worked before.

I'm so happy with it that I'm trying to reprocess one of my prior videos entirely in mlv app.  So far, it's going extremely well!

One of the shots required a tokina 11-16mm lens, famous for its purple fringing.

Any chance you could add a defringe function?  Rawtherapee does this really well. You just specify the colour, a colour range for the removal, and pixel radius.

I think a lot of users will be thankful.  Most of us have at least one lens that needs this kind of love...

Keep up the excellent work!

+1 for the incredible demosaicing suggestion

togg

  • Senior
  • ****
  • Posts: 439
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1761 on: July 15, 2018, 07:14:47 AM »
I feel like the hot pixel removal function is still not completelly up to the task. I've tested some old file with plenty of hot pixels and a few would pass even aggressive. I did a specific black frame for it on Switch but it would be nice to have a reliable correction in MLV App.

bouncyball

  • Contributor
  • Hero Member
  • *****
  • Posts: 849
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1762 on: July 15, 2018, 10:05:46 AM »
@togg

Upload the sample please.

bakersdozen

  • Freshman
  • **
  • Posts: 89
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1763 on: July 15, 2018, 02:55:49 PM »
Anyone else having issues with Full-res LV files from 5D3 113 (latest experimental crop build) crashing MLVApp when loading an .mlv file? I've tried both FRSP with intervalometer (output as .mlv) and RAW 14bit lossless with FPS override set at different rates from 7 down to 0.150. As soon as I select the the .mlv it crashes MLVApp.  Tried on latest (OSX) version and with the previous version giving same result. The files work as desired on MLRAWViewer, so they don't seem to be corrupt. I can't seem to find any other mentions of this in the thread, please forgive me if this is a known issue. Not sure if it has anything to do with being 5796x3870 resolution.

Here are the files, there is a FRSP (300MB) version, the RAW Version (1GB +) and the OSX crash log.

https://www.dropbox.com/s/zzd8x93zhs089ck/FRSP.MLV?dl=0
https://www.dropbox.com/s/38ylyldopa0qx9y/M15-1511.MLV?dl=0
https://www.dropbox.com/s/g8s6krnyzhbs15s/crash%20log.rtf?dl=0

Some shots have the sky blown out as I forgot my ND filter.
EOS M + 5D3

IDA_ML

  • Hero Member
  • *****
  • Posts: 1014
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1764 on: July 15, 2018, 05:07:11 PM »
I have just tested MLVApp v.017 with FRSP-MLV files from the 100D shot in the photo mode using the Silent module.  It opens and exports them fine.  There are two black lines - one on the top and another one on the left of the frame, meaning that the 100D needs some centering of the FRSP shot within the frame but this is not a problem of MLVApp.  Still, if I may post a wish for future features, that would be:

1) A crop and rotation/straightening tool - similar to the one in ACR for example;

2) Extention of the chroma separation tool with cleaning also some of the monochromatic noise.  Right now, it does a hell of a job by perfectly removing the color noise.  If one more slider could be added for the monochromatic noise, that would be fantastic! 

masc

  • Contributor
  • Hero Member
  • *****
  • Posts: 2011
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1765 on: July 15, 2018, 10:27:53 PM »
Anyone else having issues with Full-res LV files from 5D3 113 (latest experimental crop build) crashing MLVApp when loading an .mlv file? I've tried both FRSP with intervalometer (output as .mlv) and RAW 14bit lossless with FPS override set at different rates from 7 down to 0.150. As soon as I select the the .mlv it crashes MLVApp.  Tried on latest (OSX) version and with the previous version giving same result. The files work as desired on MLRAWViewer, so they don't seem to be corrupt. I can't seem to find any other mentions of this in the thread, please forgive me if this is a known issue. Not sure if it has anything to do with being 5796x3870 resolution.

Here are the files, there is a FRSP (300MB) version, the RAW Version (1GB +) and the OSX crash log.

https://www.dropbox.com/s/zzd8x93zhs089ck/FRSP.MLV?dl=0
https://www.dropbox.com/s/38ylyldopa0qx9y/M15-1511.MLV?dl=0
https://www.dropbox.com/s/g8s6krnyzhbs15s/crash%20log.rtf?dl=0

Some shots have the sky blown out as I forgot my ND filter.
Thx for the clips. I found the problem very fast: the problem is that these files are <1.0fps. MLVApp was not made for clips with so low fps ... at least I forgot that it might be possible ;)
I made a quickfix - that means: if the clip is <1fps, I set it to 1 fps. Otherwise it is very hard to show a correct timecode... (timecode was the crashing module). So you can compile the latest revision using Dannes compiler tool, or you wait for next release. ;)

1) A crop and rotation/straightening tool - similar to the one in ACR for example;
2) Extention of the chroma separation tool with cleaning also some of the monochromatic noise.  Right now, it does a hell of a job by perfectly removing the color noise.  If one more slider could be added for the monochromatic noise, that would be fantastic! 
1) You really would do this in MLVApp? I always do it in the NLE - each pixel might be good for stabilizing for example... after this I crop it if necessary.
2) That should be possible to add... let's see... but you'll get a very blurred image.

I found an implementation of LMMSE debayer with correct interface in C, so I added it to MLVApp now. Has someone used it in past in any other program? I have the problem that hard contrast lines are getting very green and cyan. Is that normal? Only highiso clips are okay... but AMaZE looks nearly always way better.
5D3.113 | EOSM.202

bakersdozen

  • Freshman
  • **
  • Posts: 89
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1766 on: July 15, 2018, 11:07:13 PM »
Thx for the clips. I found the problem very fast: the problem is that these files are <1.0fps. MLVApp was not made for clips with so low fps ... at least I forgot that it might be possible ;)
I made a quickfix - that means: if the clip is <1fps, I set it to 1 fps. Otherwise it is very hard to show a correct timecode... (timecode was the crashing module). So you can compile the latest revision using Dannes compiler tool, or you wait for next release. ;)

Thanks MASC - legend!
EOS M + 5D3

70MM13

  • Hero Member
  • *****
  • Posts: 551
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1767 on: July 15, 2018, 11:20:38 PM »
My understanding from rawtherapee is that lmmse is primarily for very high ISO shots.  I've never really found it too useful, but who knows, it might work for certain conditions...

Thx for the clips. I found the problem very fast: the problem is that these files are <1.0fps. MLVApp was not made for clips with so low fps ... at least I forgot that it might be possible ;)
I made a quickfix - that means: if the clip is <1fps, I set it to 1 fps. Otherwise it is very hard to show a correct timecode... (timecode was the crashing module). So you can compile the latest revision using Dannes compiler tool, or you wait for next release. ;)
1) You really would do this in MLVApp? I always do it in the NLE - each pixel might be good for stabilizing for example... after this I crop it if necessary.
2) That should be possible to add... let's see... but you'll get a very blurred image.

I found an implementation of LMMSE debayer with correct interface in C, so I added it to MLVApp now. Has someone used it in past in any other program? I have the problem that hard contrast lines are getting very green and cyan. Is that normal? Only highiso clips are okay... but AMaZE looks nearly always way better.

50mm1200s

  • Member
  • ***
  • Posts: 177
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1768 on: July 15, 2018, 11:33:31 PM »
I found an implementation of LMMSE debayer with correct interface in C, so I added it to MLVApp now. Has someone used it in past in any other program?

I have on RawTherapee...

Quote
I have the problem that hard contrast lines are getting very green and cyan. Is that normal? Only highiso clips are okay... but AMaZE looks nearly always way better.

Yep. AMaZE is always better than LMMSE. At least on RT.
The only demosaicing algorithm on RT that is better than AMaZE is IGV (Integrated Gaussian Vector), but only for highly noisy images. This is the code on RT:

Code: [Select]
/***
*
*   Bayer CFA Demosaicing using Integrated Gaussian Vector on Color Differences
*   Revision 1.0 - 2013/02/28
*
*   Copyright (c) 2007-2013 Luis Sanz Rodriguez
*   Using High Order Interpolation technique by Jim S, Jimmy Li, and Sharmil Randhawa
*
*   Contact info: luis.sanz.rodriguez@gmail.com
*
*   This code is distributed under a GNU General Public License, version 3.
*   Visit <http://www.gnu.org/licenses/> for more information.
*
***/
// Adapted to RawTherapee by Jacques Desmis 3/2013
// SSE version by Ingo Weyrich 5/2013
#ifdef __SSE2__
#define CLIPV(a) LIMV(a,zerov,c65535v)
void RawImageSource::igv_interpolate(int winw, int winh)
{
    static const float eps = 1e-5f, epssq = 1e-5f; //mod epssq -10f =>-5f Jacques 3/2013 to prevent artifact (divide by zero)

    static const int h1 = 1, h2 = 2, h3 = 3, h5 = 5;
    const int width = winw, height = winh;
    const int v1 = 1 * width, v2 = 2 * width, v3 = 3 * width, v5 = 5 * width;
    float* rgb[2];
    float* chr[4];
    float *rgbarray, *vdif, *hdif, *chrarray;
    rgbarray    = (float (*)) malloc((width * height) * sizeof( float ) );
    rgb[0] = rgbarray;
    rgb[1] = rgbarray + (width * height) / 2;

    vdif  = (float (*)) calloc( width * height / 2, sizeof * vdif );
    hdif  = (float (*)) calloc( width * height / 2, sizeof * hdif );

    chrarray    = (float (*)) calloc( width * height, sizeof( float ) );
    chr[0] = chrarray;
    chr[1] = chrarray + (width * height) / 2;

    // mapped chr[2] and chr[3] to hdif and hdif, because these are out of use, when chr[2] and chr[3] are used
    chr[2] = hdif;
    chr[3] = vdif;

    border_interpolate2(winw, winh, 7, rawData, red, green, blue);

    if (plistener) {
        plistener->setProgressStr (Glib::ustring::compose(M("TP_RAW_DMETHOD_PROGRESSBAR"), RAWParams::BayerSensor::getMethodString(RAWParams::BayerSensor::Method::IGV)));
        plistener->setProgress (0.0);
    }

#ifdef _OPENMP
    #pragma omp parallel default(none) shared(rgb,vdif,hdif,chr)
#endif
    {
        __m128 ngv, egv, wgv, sgv, nvv, evv, wvv, svv, nwgv, negv, swgv, segv, nwvv, nevv, swvv, sevv, tempv, temp1v, temp2v, temp3v, temp4v, temp5v, temp6v, temp7v, temp8v;
        __m128 epsv = _mm_set1_ps( eps );
        __m128 epssqv = _mm_set1_ps( epssq );
        __m128 c65535v = _mm_set1_ps( 65535.f );
        __m128 c23v = _mm_set1_ps( 23.f );
        __m128 c40v = _mm_set1_ps( 40.f );
        __m128 c51v = _mm_set1_ps( 51.f );
        __m128 c32v = _mm_set1_ps( 32.f );
        __m128 c8v = _mm_set1_ps( 8.f );
        __m128 c7v = _mm_set1_ps( 7.f );
        __m128 c6v = _mm_set1_ps( 6.f );
        __m128 c10v = _mm_set1_ps( 10.f );
        __m128 c21v = _mm_set1_ps( 21.f );
        __m128 c78v = _mm_set1_ps( 78.f );
        __m128 c69v = _mm_set1_ps( 69.f );
        __m128 c3145680v = _mm_set1_ps( 3145680.f );
        __m128 onev = _mm_set1_ps ( 1.f );
        __m128 zerov = _mm_set1_ps ( 0.f );
        __m128 d725v = _mm_set1_ps ( 0.725f );
        __m128 d1375v = _mm_set1_ps ( 0.1375f );

        float *dest1, *dest2;
        float ng, eg, wg, sg, nv, ev, wv, sv, nwg, neg, swg, seg, nwv, nev, swv, sev;
#ifdef _OPENMP
        #pragma omp for
#endif

        for (int row = 0; row < height - 0; row++) {
            dest1 = rgb[FC(row, 0) & 1];
            dest2 = rgb[FC(row, 1) & 1];
            int col, indx;

            for (col = 0, indx = row * width + col; col < width - 7; col += 8, indx += 8) {
                temp1v = LVFU( rawData[row][col] );
                temp1v = CLIPV( temp1v );
                temp2v = LVFU( rawData[row][col + 4] );
                temp2v = CLIPV( temp2v );
                tempv = _mm_shuffle_ps( temp1v, temp2v, _MM_SHUFFLE( 2, 0, 2, 0 ) );
                _mm_storeu_ps( &dest1[indx >> 1], tempv );
                tempv = _mm_shuffle_ps( temp1v, temp2v, _MM_SHUFFLE( 3, 1, 3, 1 ) );
                _mm_storeu_ps( &dest2[indx >> 1], tempv );
            }

            for (; col < width; col++, indx += 2) {
                dest1[indx >> 1] = CLIP(rawData[row][col]); //rawData = RT datas
                col++;
                if(col < width)
                    dest2[indx >> 1] = CLIP(rawData[row][col]); //rawData = RT datas
            }
        }

#ifdef _OPENMP
        #pragma omp single
#endif
        {
            if (plistener) {
                plistener->setProgress (0.13);
            }
        }

#ifdef _OPENMP
        #pragma omp for
#endif

        for (int row = 5; row < height - 5; row++) {
            int col, indx, indx1;

            for (col = 5 + (FC(row, 1) & 1), indx = row * width + col, indx1 = indx >> 1; col < width - 12; col += 8, indx += 8, indx1 += 4) {
                //N,E,W,S Gradients
                ngv = (epsv + (vabsf(LVFU(rgb[1][(indx - v1) >> 1]) - LVFU(rgb[1][(indx - v3) >> 1])) + vabsf(LVFU(rgb[0][indx1]) - LVFU(rgb[0][(indx1 - v1)]))) / c65535v);
                egv = (epsv + (vabsf(LVFU(rgb[1][(indx + h1) >> 1]) - LVFU(rgb[1][(indx + h3) >> 1])) + vabsf(LVFU(rgb[0][indx1]) - LVFU(rgb[0][(indx1 + h1)]))) / c65535v);
                wgv = (epsv + (vabsf(LVFU(rgb[1][(indx - h1) >> 1]) - LVFU(rgb[1][(indx - h3) >> 1])) + vabsf(LVFU(rgb[0][indx1]) - LVFU(rgb[0][(indx1 - h1)]))) / c65535v);
                sgv = (epsv + (vabsf(LVFU(rgb[1][(indx + v1) >> 1]) - LVFU(rgb[1][(indx + v3) >> 1])) + vabsf(LVFU(rgb[0][indx1]) - LVFU(rgb[0][(indx1 + v1)]))) / c65535v);
                //N,E,W,S High Order Interpolation (Li & Randhawa)
                //N,E,W,S Hamilton Adams Interpolation
                // (48.f * 65535.f) = 3145680.f
                tempv = c40v * LVFU(rgb[0][indx1]);
                nvv = LIMV(((c23v * LVFU(rgb[1][(indx - v1) >> 1]) + c23v * LVFU(rgb[1][(indx - v3) >> 1]) + LVFU(rgb[1][(indx - v5) >> 1]) + LVFU(rgb[1][(indx + v1) >> 1]) + tempv - c32v * LVFU(rgb[0][(indx1 - v1)]) - c8v * LVFU(rgb[0][(indx1 - v2)]))) / c3145680v, zerov, onev);
                evv = LIMV(((c23v * LVFU(rgb[1][(indx + h1) >> 1]) + c23v * LVFU(rgb[1][(indx + h3) >> 1]) + LVFU(rgb[1][(indx + h5) >> 1]) + LVFU(rgb[1][(indx - h1) >> 1]) + tempv - c32v * LVFU(rgb[0][(indx1 + h1)]) - c8v * LVFU(rgb[0][(indx1 + h2)]))) / c3145680v, zerov, onev);
                wvv = LIMV(((c23v * LVFU(rgb[1][(indx - h1) >> 1]) + c23v * LVFU(rgb[1][(indx - h3) >> 1]) + LVFU(rgb[1][(indx - h5) >> 1]) + LVFU(rgb[1][(indx + h1) >> 1]) + tempv - c32v * LVFU(rgb[0][(indx1 - h1)]) - c8v * LVFU(rgb[0][(indx1 - h2)]))) / c3145680v, zerov, onev);
                svv = LIMV(((c23v * LVFU(rgb[1][(indx + v1) >> 1]) + c23v * LVFU(rgb[1][(indx + v3) >> 1]) + LVFU(rgb[1][(indx + v5) >> 1]) + LVFU(rgb[1][(indx - v1) >> 1]) + tempv - c32v * LVFU(rgb[0][(indx1 + v1)]) - c8v * LVFU(rgb[0][(indx1 + v2)]))) / c3145680v, zerov, onev);
                //Horizontal and vertical color differences
                tempv = LVFU( rgb[0][indx1] ) / c65535v;
                _mm_storeu_ps( &vdif[indx1], (sgv * nvv + ngv * svv) / (ngv + sgv) - tempv );
                _mm_storeu_ps( &hdif[indx1], (wgv * evv + egv * wvv) / (egv + wgv) - tempv );
            }

            // borders without SSE
            for (; col < width - 5; col += 2, indx += 2, indx1++) {
                //N,E,W,S Gradients
                ng = (eps + (fabsf(rgb[1][(indx - v1) >> 1] - rgb[1][(indx - v3) >> 1]) + fabsf(rgb[0][indx1] - rgb[0][(indx1 - v1)])) / 65535.f);;
                eg = (eps + (fabsf(rgb[1][(indx + h1) >> 1] - rgb[1][(indx + h3) >> 1]) + fabsf(rgb[0][indx1] - rgb[0][(indx1 + h1)])) / 65535.f);
                wg = (eps + (fabsf(rgb[1][(indx - h1) >> 1] - rgb[1][(indx - h3) >> 1]) + fabsf(rgb[0][indx1] - rgb[0][(indx1 - h1)])) / 65535.f);
                sg = (eps + (fabsf(rgb[1][(indx + v1) >> 1] - rgb[1][(indx + v3) >> 1]) + fabsf(rgb[0][indx1] - rgb[0][(indx1 + v1)])) / 65535.f);
                //N,E,W,S High Order Interpolation (Li & Randhawa)
                //N,E,W,S Hamilton Adams Interpolation
                // (48.f * 65535.f) = 3145680.f
                nv = LIM(((23.0f * rgb[1][(indx - v1) >> 1] + 23.0f * rgb[1][(indx - v3) >> 1] + rgb[1][(indx - v5) >> 1] + rgb[1][(indx + v1) >> 1] + 40.0f * rgb[0][indx1] - 32.0f * rgb[0][(indx1 - v1)] - 8.0f * rgb[0][(indx1 - v2)])) / 3145680.f, 0.0f, 1.0f);
                ev = LIM(((23.0f * rgb[1][(indx + h1) >> 1] + 23.0f * rgb[1][(indx + h3) >> 1] + rgb[1][(indx + h5) >> 1] + rgb[1][(indx - h1) >> 1] + 40.0f * rgb[0][indx1] - 32.0f * rgb[0][(indx1 + h1)] - 8.0f * rgb[0][(indx1 + h2)])) / 3145680.f, 0.0f, 1.0f);
                wv = LIM(((23.0f * rgb[1][(indx - h1) >> 1] + 23.0f * rgb[1][(indx - h3) >> 1] + rgb[1][(indx - h5) >> 1] + rgb[1][(indx + h1) >> 1] + 40.0f * rgb[0][indx1] - 32.0f * rgb[0][(indx1 - h1)] - 8.0f * rgb[0][(indx1 - h2)])) / 3145680.f, 0.0f, 1.0f);
                sv = LIM(((23.0f * rgb[1][(indx + v1) >> 1] + 23.0f * rgb[1][(indx + v3) >> 1] + rgb[1][(indx + v5) >> 1] + rgb[1][(indx - v1) >> 1] + 40.0f * rgb[0][indx1] - 32.0f * rgb[0][(indx1 + v1)] - 8.0f * rgb[0][(indx1 + v2)])) / 3145680.f, 0.0f, 1.0f);
                //Horizontal and vertical color differences
                vdif[indx1] = (sg * nv + ng * sv) / (ng + sg) - (rgb[0][indx1]) / 65535.f;
                hdif[indx1] = (wg * ev + eg * wv) / (eg + wg) - (rgb[0][indx1]) / 65535.f;
            }
        }

#ifdef _OPENMP
        #pragma omp single
#endif
        {
            if (plistener) {
                plistener->setProgress (0.26);
            }
        }
#ifdef _OPENMP
        #pragma omp for
#endif

        for (int row = 7; row < height - 7; row++) {
            int col, d, indx1;

            for (col = 7 + (FC(row, 1) & 1), indx1 = (row * width + col) >> 1, d = FC(row, col) / 2; col < width - 14; col += 8, indx1 += 4) {
                //H&V integrated gaussian vector over variance on color differences
                //Mod Jacques 3/2013
                ngv = LIMV(epssqv + c78v * SQRV(LVFU(vdif[indx1])) + c69v * (SQRV(LVFU(vdif[indx1 - v1])) + SQRV(LVFU(vdif[indx1 + v1]))) + c51v * (SQRV(LVFU(vdif[indx1 - v2])) + SQRV(LVFU(vdif[indx1 + v2]))) + c21v * (SQRV(LVFU(vdif[indx1 - v3])) + SQRV(LVFU(vdif[indx1 + v3]))) - c6v * SQRV(LVFU(vdif[indx1 - v1]) + LVFU(vdif[indx1]) + LVFU(vdif[indx1 + v1]))
                           - c10v * (SQRV(LVFU(vdif[indx1 - v2]) + LVFU(vdif[indx1 - v1]) + LVFU(vdif[indx1])) + SQRV(LVFU(vdif[indx1]) + LVFU(vdif[indx1 + v1]) + LVFU(vdif[indx1 + v2]))) - c7v * (SQRV(LVFU(vdif[indx1 - v3]) + LVFU(vdif[indx1 - v2]) + LVFU(vdif[indx1 - v1])) + SQRV(LVFU(vdif[indx1 + v1]) + LVFU(vdif[indx1 + v2]) + LVFU(vdif[indx1 + v3]))), zerov, onev);
                egv = LIMV(epssqv + c78v * SQRV(LVFU(hdif[indx1])) + c69v * (SQRV(LVFU(hdif[indx1 - h1])) + SQRV(LVFU(hdif[indx1 + h1]))) + c51v * (SQRV(LVFU(hdif[indx1 - h2])) + SQRV(LVFU(hdif[indx1 + h2]))) + c21v * (SQRV(LVFU(hdif[indx1 - h3])) + SQRV(LVFU(hdif[indx1 + h3]))) - c6v * SQRV(LVFU(hdif[indx1 - h1]) + LVFU(hdif[indx1]) + LVFU(hdif[indx1 + h1]))
                           - c10v * (SQRV(LVFU(hdif[indx1 - h2]) + LVFU(hdif[indx1 - h1]) + LVFU(hdif[indx1])) + SQRV(LVFU(hdif[indx1]) + LVFU(hdif[indx1 + h1]) + LVFU(hdif[indx1 + h2]))) - c7v * (SQRV(LVFU(hdif[indx1 - h3]) + LVFU(hdif[indx1 - h2]) + LVFU(hdif[indx1 - h1])) + SQRV(LVFU(hdif[indx1 + h1]) + LVFU(hdif[indx1 + h2]) + LVFU(hdif[indx1 + h3]))), zerov, onev);
                //Limit chrominance using H/V neighbourhood
                nvv = median(d725v * LVFU(vdif[indx1]) + d1375v * LVFU(vdif[indx1 - v1]) + d1375v * LVFU(vdif[indx1 + v1]), LVFU(vdif[indx1 - v1]), LVFU(vdif[indx1 + v1]));
                evv = median(d725v * LVFU(hdif[indx1]) + d1375v * LVFU(hdif[indx1 - h1]) + d1375v * LVFU(hdif[indx1 + h1]), LVFU(hdif[indx1 - h1]), LVFU(hdif[indx1 + h1]));
                //Chrominance estimation
                tempv = (egv * nvv + ngv * evv) / (ngv + egv);
                _mm_storeu_ps(&(chr[d][indx1]), tempv);
                //Green channel population
                temp1v = c65535v * tempv + LVFU(rgb[0][indx1]);
                _mm_storeu_ps( &(rgb[0][indx1]), temp1v );
            }

            for (; col < width - 7; col += 2, indx1++) {
                //H&V integrated gaussian vector over variance on color differences
                //Mod Jacques 3/2013
                ng = LIM(epssq + 78.0f * SQR(vdif[indx1]) + 69.0f * (SQR(vdif[indx1 - v1]) + SQR(vdif[indx1 + v1])) + 51.0f * (SQR(vdif[indx1 - v2]) + SQR(vdif[indx1 + v2])) + 21.0f * (SQR(vdif[indx1 - v3]) + SQR(vdif[indx1 + v3])) - 6.0f * SQR(vdif[indx1 - v1] + vdif[indx1] + vdif[indx1 + v1])
                         - 10.0f * (SQR(vdif[indx1 - v2] + vdif[indx1 - v1] + vdif[indx1]) + SQR(vdif[indx1] + vdif[indx1 + v1] + vdif[indx1 + v2])) - 7.0f * (SQR(vdif[indx1 - v3] + vdif[indx1 - v2] + vdif[indx1 - v1]) + SQR(vdif[indx1 + v1] + vdif[indx1 + v2] + vdif[indx1 + v3])), 0.f, 1.f);
                eg = LIM(epssq + 78.0f * SQR(hdif[indx1]) + 69.0f * (SQR(hdif[indx1 - h1]) + SQR(hdif[indx1 + h1])) + 51.0f * (SQR(hdif[indx1 - h2]) + SQR(hdif[indx1 + h2])) + 21.0f * (SQR(hdif[indx1 - h3]) + SQR(hdif[indx1 + h3])) - 6.0f * SQR(hdif[indx1 - h1] + hdif[indx1] + hdif[indx1 + h1])
                         - 10.0f * (SQR(hdif[indx1 - h2] + hdif[indx1 - h1] + hdif[indx1]) + SQR(hdif[indx1] + hdif[indx1 + h1] + hdif[indx1 + h2])) - 7.0f * (SQR(hdif[indx1 - h3] + hdif[indx1 - h2] + hdif[indx1 - h1]) + SQR(hdif[indx1 + h1] + hdif[indx1 + h2] + hdif[indx1 + h3])), 0.f, 1.f);
                //Limit chrominance using H/V neighbourhood
                nv = median(0.725f * vdif[indx1] + 0.1375f * vdif[indx1 - v1] + 0.1375f * vdif[indx1 + v1], vdif[indx1 - v1], vdif[indx1 + v1]);
                ev = median(0.725f * hdif[indx1] + 0.1375f * hdif[indx1 - h1] + 0.1375f * hdif[indx1 + h1], hdif[indx1 - h1], hdif[indx1 + h1]);
                //Chrominance estimation
                chr[d][indx1] = (eg * nv + ng * ev) / (ng + eg);
                //Green channel population
                rgb[0][indx1] = rgb[0][indx1] + 65535.f * chr[d][indx1];
            }
        }

#ifdef _OPENMP
        #pragma omp single
#endif
        {
            if (plistener) {
                plistener->setProgress (0.39);
            }
        }
#ifdef _OPENMP
        #pragma omp for
#endif

        for (int row = 7; row < height - 7; row++) {
            int col, indx, c;

            for (col = 7 + (FC(row, 1) & 1), indx = row * width + col, c = 1 - FC(row, col) / 2; col < width - 14; col += 8, indx += 8) {
                //NW,NE,SW,SE Gradients
                nwgv = onev / (epsv + vabsf(LVFU(chr[c][(indx - v1 - h1) >> 1]) - LVFU(chr[c][(indx - v3 - h3) >> 1])) + vabsf(LVFU(chr[c][(indx + v1 + h1) >> 1]) - LVFU(chr[c][(indx - v3 - h3) >> 1])));
                negv = onev / (epsv + vabsf(LVFU(chr[c][(indx - v1 + h1) >> 1]) - LVFU(chr[c][(indx - v3 + h3) >> 1])) + vabsf(LVFU(chr[c][(indx + v1 - h1) >> 1]) - LVFU(chr[c][(indx - v3 + h3) >> 1])));
                swgv = onev / (epsv + vabsf(LVFU(chr[c][(indx + v1 - h1) >> 1]) - LVFU(chr[c][(indx + v3 + h3) >> 1])) + vabsf(LVFU(chr[c][(indx - v1 + h1) >> 1]) - LVFU(chr[c][(indx + v3 - h3) >> 1])));
                segv = onev / (epsv + vabsf(LVFU(chr[c][(indx + v1 + h1) >> 1]) - LVFU(chr[c][(indx + v3 - h3) >> 1])) + vabsf(LVFU(chr[c][(indx - v1 - h1) >> 1]) - LVFU(chr[c][(indx + v3 + h3) >> 1])));
                //Limit NW,NE,SW,SE Color differences
                nwvv = median(LVFU(chr[c][(indx - v1 - h1) >> 1]), LVFU(chr[c][(indx - v3 - h1) >> 1]), LVFU(chr[c][(indx - v1 - h3) >> 1]));
                nevv = median(LVFU(chr[c][(indx - v1 + h1) >> 1]), LVFU(chr[c][(indx - v3 + h1) >> 1]), LVFU(chr[c][(indx - v1 + h3) >> 1]));
                swvv = median(LVFU(chr[c][(indx + v1 - h1) >> 1]), LVFU(chr[c][(indx + v3 - h1) >> 1]), LVFU(chr[c][(indx + v1 - h3) >> 1]));
                sevv = median(LVFU(chr[c][(indx + v1 + h1) >> 1]), LVFU(chr[c][(indx + v3 + h1) >> 1]), LVFU(chr[c][(indx + v1 + h3) >> 1]));
                //Interpolate chrominance: R@B and B@R
                tempv = (nwgv * nwvv + negv * nevv + swgv * swvv + segv * sevv) / (nwgv + negv + swgv + segv);
                _mm_storeu_ps( &(chr[c][indx >> 1]), tempv);
            }

            for (; col < width - 7; col += 2, indx += 2) {
                //NW,NE,SW,SE Gradients
                nwg = 1.0f / (eps + fabsf(chr[c][(indx - v1 - h1) >> 1] - chr[c][(indx - v3 - h3) >> 1]) + fabsf(chr[c][(indx + v1 + h1) >> 1] - chr[c][(indx - v3 - h3) >> 1]));
                neg = 1.0f / (eps + fabsf(chr[c][(indx - v1 + h1) >> 1] - chr[c][(indx - v3 + h3) >> 1]) + fabsf(chr[c][(indx + v1 - h1) >> 1] - chr[c][(indx - v3 + h3) >> 1]));
                swg = 1.0f / (eps + fabsf(chr[c][(indx + v1 - h1) >> 1] - chr[c][(indx + v3 + h3) >> 1]) + fabsf(chr[c][(indx - v1 + h1) >> 1] - chr[c][(indx + v3 - h3) >> 1]));
                seg = 1.0f / (eps + fabsf(chr[c][(indx + v1 + h1) >> 1] - chr[c][(indx + v3 - h3) >> 1]) + fabsf(chr[c][(indx - v1 - h1) >> 1] - chr[c][(indx + v3 + h3) >> 1]));
                //Limit NW,NE,SW,SE Color differences
                nwv = median(chr[c][(indx - v1 - h1) >> 1], chr[c][(indx - v3 - h1) >> 1], chr[c][(indx - v1 - h3) >> 1]);
                nev = median(chr[c][(indx - v1 + h1) >> 1], chr[c][(indx - v3 + h1) >> 1], chr[c][(indx - v1 + h3) >> 1]);
                swv = median(chr[c][(indx + v1 - h1) >> 1], chr[c][(indx + v3 - h1) >> 1], chr[c][(indx + v1 - h3) >> 1]);
                sev = median(chr[c][(indx + v1 + h1) >> 1], chr[c][(indx + v3 + h1) >> 1], chr[c][(indx + v1 + h3) >> 1]);
                //Interpolate chrominance: R@B and B@R
                chr[c][indx >> 1] = (nwg * nwv + neg * nev + swg * swv + seg * sev) / (nwg + neg + swg + seg);
            }
        }

#ifdef _OPENMP
        #pragma omp single
#endif
        {
            if (plistener) {
                plistener->setProgress (0.65);
            }
        }
#ifdef _OPENMP
        #pragma omp for
#endif

        for (int row = 7; row < height - 7; row++) {
            int col, indx;

            for (col = 7 + (FC(row, 0) & 1), indx = row * width + col; col < width - 14; col += 8, indx += 8) {
                //N,E,W,S Gradients
                ngv = onev / (epsv + vabsf(LVFU(chr[0][(indx - v1) >> 1]) - LVFU(chr[0][(indx - v3) >> 1])) + vabsf(LVFU(chr[0][(indx + v1) >> 1]) - LVFU(chr[0][(indx - v3) >> 1])));
                egv = onev / (epsv + vabsf(LVFU(chr[0][(indx + h1) >> 1]) - LVFU(chr[0][(indx + h3) >> 1])) + vabsf(LVFU(chr[0][(indx - h1) >> 1]) - LVFU(chr[0][(indx + h3) >> 1])));
                wgv = onev / (epsv + vabsf(LVFU(chr[0][(indx - h1) >> 1]) - LVFU(chr[0][(indx - h3) >> 1])) + vabsf(LVFU(chr[0][(indx + h1) >> 1]) - LVFU(chr[0][(indx - h3) >> 1])));
                sgv = onev / (epsv + vabsf(LVFU(chr[0][(indx + v1) >> 1]) - LVFU(chr[0][(indx + v3) >> 1])) + vabsf(LVFU(chr[0][(indx - v1) >> 1]) - LVFU(chr[0][(indx + v3) >> 1])));
                //Interpolate chrominance: R@G and B@G
                tempv = ((ngv * LVFU(chr[0][(indx - v1) >> 1]) + egv * LVFU(chr[0][(indx + h1) >> 1]) + wgv * LVFU(chr[0][(indx - h1) >> 1]) + sgv * LVFU(chr[0][(indx + v1) >> 1])) / (ngv + egv + wgv + sgv));
                _mm_storeu_ps( &chr[0 + 2][indx >> 1], tempv);
            }

            for (; col < width - 7; col += 2, indx += 2) {
                //N,E,W,S Gradients
                ng = 1.0f / (eps + fabsf(chr[0][(indx - v1) >> 1] - chr[0][(indx - v3) >> 1]) + fabsf(chr[0][(indx + v1) >> 1] - chr[0][(indx - v3) >> 1]));
                eg = 1.0f / (eps + fabsf(chr[0][(indx + h1) >> 1] - chr[0][(indx + h3) >> 1]) + fabsf(chr[0][(indx - h1) >> 1] - chr[0][(indx + h3) >> 1]));
                wg = 1.0f / (eps + fabsf(chr[0][(indx - h1) >> 1] - chr[0][(indx - h3) >> 1]) + fabsf(chr[0][(indx + h1) >> 1] - chr[0][(indx - h3) >> 1]));
                sg = 1.0f / (eps + fabsf(chr[0][(indx + v1) >> 1] - chr[0][(indx + v3) >> 1]) + fabsf(chr[0][(indx - v1) >> 1] - chr[0][(indx + v3) >> 1]));
                //Interpolate chrominance: R@G and B@G
                chr[0 + 2][indx >> 1] = ((ng * chr[0][(indx - v1) >> 1] + eg * chr[0][(indx + h1) >> 1] + wg * chr[0][(indx - h1) >> 1] + sg * chr[0][(indx + v1) >> 1]) / (ng + eg + wg + sg));
            }
        }

#ifdef _OPENMP
        #pragma omp single
#endif
        {
            if (plistener) {
                plistener->setProgress (0.78);
            }
        }
#ifdef _OPENMP
        #pragma omp for
#endif

        for (int row = 7; row < height - 7; row++) {
            int col, indx;

            for (col = 7 + (FC(row, 0) & 1), indx = row * width + col; col < width - 14; col += 8, indx += 8) {
                //N,E,W,S Gradients
                ngv = onev / (epsv + vabsf(LVFU(chr[1][(indx - v1) >> 1]) - LVFU(chr[1][(indx - v3) >> 1])) + vabsf(LVFU(chr[1][(indx + v1) >> 1]) - LVFU(chr[1][(indx - v3) >> 1])));
                egv = onev / (epsv + vabsf(LVFU(chr[1][(indx + h1) >> 1]) - LVFU(chr[1][(indx + h3) >> 1])) + vabsf(LVFU(chr[1][(indx - h1) >> 1]) - LVFU(chr[1][(indx + h3) >> 1])));
                wgv = onev / (epsv + vabsf(LVFU(chr[1][(indx - h1) >> 1]) - LVFU(chr[1][(indx - h3) >> 1])) + vabsf(LVFU(chr[1][(indx + h1) >> 1]) - LVFU(chr[1][(indx - h3) >> 1])));
                sgv = onev / (epsv + vabsf(LVFU(chr[1][(indx + v1) >> 1]) - LVFU(chr[1][(indx + v3) >> 1])) + vabsf(LVFU(chr[1][(indx - v1) >> 1]) - LVFU(chr[1][(indx + v3) >> 1])));
                //Interpolate chrominance: R@G and B@G
                tempv = ((ngv * LVFU(chr[1][(indx - v1) >> 1]) + egv * LVFU(chr[1][(indx + h1) >> 1]) + wgv * LVFU(chr[1][(indx - h1) >> 1]) + sgv * LVFU(chr[1][(indx + v1) >> 1])) / (ngv + egv + wgv + sgv));
                _mm_storeu_ps( &chr[1 + 2][indx >> 1], tempv);
            }

            for (; col < width - 7; col += 2, indx += 2) {
                //N,E,W,S Gradients
                ng = 1.0f / (eps + fabsf(chr[1][(indx - v1) >> 1] - chr[1][(indx - v3) >> 1]) + fabsf(chr[1][(indx + v1) >> 1] - chr[1][(indx - v3) >> 1]));
                eg = 1.0f / (eps + fabsf(chr[1][(indx + h1) >> 1] - chr[1][(indx + h3) >> 1]) + fabsf(chr[1][(indx - h1) >> 1] - chr[1][(indx + h3) >> 1]));
                wg = 1.0f / (eps + fabsf(chr[1][(indx - h1) >> 1] - chr[1][(indx - h3) >> 1]) + fabsf(chr[1][(indx + h1) >> 1] - chr[1][(indx - h3) >> 1]));
                sg = 1.0f / (eps + fabsf(chr[1][(indx + v1) >> 1] - chr[1][(indx + v3) >> 1]) + fabsf(chr[1][(indx - v1) >> 1] - chr[1][(indx + v3) >> 1]));
                //Interpolate chrominance: R@G and B@G
                chr[1 + 2][indx >> 1] = ((ng * chr[1][(indx - v1) >> 1] + eg * chr[1][(indx + h1) >> 1] + wg * chr[1][(indx - h1) >> 1] + sg * chr[1][(indx + v1) >> 1]) / (ng + eg + wg + sg));
            }
        }

#ifdef _OPENMP
        #pragma omp single
#endif
        {
            if (plistener) {
                plistener->setProgress (0.91);
            }
        }
        float *src1, *src2, *redsrc0, *redsrc1, *bluesrc0, *bluesrc1;
#ifdef _OPENMP
        #pragma omp for
#endif

        for(int row = 7; row < height - 7; row++) {
            int col, indx, fc;
            fc = FC(row, 7) & 1;
            src1 = rgb[fc];
            src2 = rgb[fc ^ 1];
            redsrc0 = chr[fc << 1];
            redsrc1 = chr[(fc ^ 1) << 1];
            bluesrc0 = chr[(fc << 1) + 1];
            bluesrc1 = chr[((fc ^ 1) << 1) + 1];

            for(col = 7, indx = row * width + col; col < width - 14; col += 8, indx += 8) {
                temp1v = LVFU( src1[indx >> 1] );
                temp2v = LVFU( src2[(indx + 1) >> 1] );
                tempv = _mm_shuffle_ps( temp1v, temp2v, _MM_SHUFFLE( 1, 0, 1, 0 ) );
                tempv = _mm_shuffle_ps( tempv, tempv, _MM_SHUFFLE( 3, 1, 2, 0 ) );
                _mm_storeu_ps( &green[row][col], CLIPV( tempv ));
                temp5v = LVFU(redsrc0[indx >> 1]);
                temp6v = LVFU(redsrc1[(indx + 1) >> 1]);
                temp3v = _mm_shuffle_ps( temp5v, temp6v, _MM_SHUFFLE( 1, 0, 1, 0 ) );
                temp3v = _mm_shuffle_ps( temp3v, temp3v, _MM_SHUFFLE( 3, 1, 2, 0 ) );
                temp3v = CLIPV( tempv - c65535v * temp3v );
                _mm_storeu_ps( &red[row][col], temp3v);
                temp7v = LVFU(bluesrc0[indx >> 1]);
                temp8v = LVFU(bluesrc1[(indx + 1) >> 1]);
                temp4v = _mm_shuffle_ps( temp7v, temp8v, _MM_SHUFFLE( 1, 0, 1, 0 ) );
                temp4v = _mm_shuffle_ps( temp4v, temp4v, _MM_SHUFFLE( 3, 1, 2, 0 ) );
                temp4v = CLIPV( tempv - c65535v * temp4v );
                _mm_storeu_ps( &blue[row][col], temp4v);

                tempv = _mm_shuffle_ps( temp1v, temp2v, _MM_SHUFFLE( 3, 2, 3, 2 ) );
                tempv = _mm_shuffle_ps( tempv, tempv, _MM_SHUFFLE( 3, 1, 2, 0 ) );
                _mm_storeu_ps( &green[row][col + 4], CLIPV( tempv ));

                temp3v = _mm_shuffle_ps( temp5v, temp6v, _MM_SHUFFLE( 3, 2, 3, 2 ) );
                temp3v = _mm_shuffle_ps( temp3v, temp3v, _MM_SHUFFLE( 3, 1, 2, 0 ) );
                temp3v = CLIPV( tempv - c65535v * temp3v );
                _mm_storeu_ps( &red[row][col + 4], temp3v);
                temp4v = _mm_shuffle_ps( temp7v, temp8v, _MM_SHUFFLE( 3, 2, 3, 2 ) );
                temp4v = _mm_shuffle_ps( temp4v, temp4v, _MM_SHUFFLE( 3, 1, 2, 0 ) );
                temp4v = CLIPV( tempv - c65535v * temp4v );
                _mm_storeu_ps( &blue[row][col + 4], temp4v);
            }

            for(; col < width - 7; col++, indx += 2) {
                red  [row][col] = CLIP(src1[indx >> 1] - 65535.f * redsrc0[indx >> 1]);
                green[row][col] = CLIP(src1[indx >> 1]);
                blue [row][col] = CLIP(src1[indx >> 1] - 65535.f * bluesrc0[indx >> 1]);
                col++;
                red  [row][col] = CLIP(src2[(indx + 1) >> 1] - 65535.f * redsrc1[(indx + 1) >> 1]);
                green[row][col] = CLIP(src2[(indx + 1) >> 1]);
                blue [row][col] = CLIP(src2[(indx + 1) >> 1] - 65535.f * bluesrc1[(indx + 1) >> 1]);
            }
        }
    }// End of parallelization

    if (plistener) {
        plistener->setProgress (1.0);
    }

    free(chrarray);
    free(rgbarray);
    free(vdif);
    free(hdif);
}
#undef CLIPV
#else
void RawImageSource::igv_interpolate(int winw, int winh)
{
    static const float eps = 1e-5f, epssq = 1e-5f; //mod epssq -10f =>-5f Jacques 3/2013 to prevent artifact (divide by zero)
    static const int h1 = 1, h2 = 2, h3 = 3, h4 = 4, h5 = 5, h6 = 6;
    const int width = winw, height = winh;
    const int v1 = 1 * width, v2 = 2 * width, v3 = 3 * width, v4 = 4 * width, v5 = 5 * width, v6 = 6 * width;
    float* rgb[3];
    float* chr[2];
    float (*rgbarray), *vdif, *hdif, (*chrarray);

    rgbarray    = (float (*)) calloc(width * height * 3, sizeof( float));
    rgb[0] = rgbarray;
    rgb[1] = rgbarray + (width * height);
    rgb[2] = rgbarray + 2 * (width * height);

    chrarray    = (float (*)) calloc(width * height * 2, sizeof( float));
    chr[0] = chrarray;
    chr[1] = chrarray + (width * height);

    vdif  = (float (*))    calloc(width * height / 2, sizeof * vdif);
    hdif  = (float (*))    calloc(width * height / 2, sizeof * hdif);

    border_interpolate2(winw, winh, 7, rawData, red, green, blue);

    if (plistener) {
        plistener->setProgressStr (Glib::ustring::compose(M("TP_RAW_DMETHOD_PROGRESSBAR"), RAWParams::BayerSensor::getMethodString(RAWParams::BayerSensor::Method::IGV)));
        plistener->setProgress (0.0);
    }

#ifdef _OPENMP
    #pragma omp parallel default(none) shared(rgb,vdif,hdif,chr)
#endif
    {

        float ng, eg, wg, sg, nv, ev, wv, sv, nwg, neg, swg, seg, nwv, nev, swv, sev;

#ifdef _OPENMP
        #pragma omp for
#endif

        for (int row = 0; row < height - 0; row++)
            for (int col = 0, indx = row * width + col; col < width - 0; col++, indx++) {
                int c = FC(row, col);
                rgb[c][indx] = CLIP(rawData[row][col]); //rawData = RT datas
            }

//  border_interpolate2(7, rgb);

#ifdef _OPENMP
        #pragma omp single
#endif
        {
            if (plistener) {
                plistener->setProgress (0.13);
            }
        }

#ifdef _OPENMP
        #pragma omp for
#endif

        for (int row = 5; row < height - 5; row++)
            for (int col = 5 + (FC(row, 1) & 1), indx = row * width + col, c = FC(row, col); col < width - 5; col += 2, indx += 2) {
                //N,E,W,S Gradients
                ng = (eps + (fabsf(rgb[1][indx - v1] - rgb[1][indx - v3]) + fabsf(rgb[c][indx] - rgb[c][indx - v2])) / 65535.f);;
                eg = (eps + (fabsf(rgb[1][indx + h1] - rgb[1][indx + h3]) + fabsf(rgb[c][indx] - rgb[c][indx + h2])) / 65535.f);
                wg = (eps + (fabsf(rgb[1][indx - h1] - rgb[1][indx - h3]) + fabsf(rgb[c][indx] - rgb[c][indx - h2])) / 65535.f);
                sg = (eps + (fabsf(rgb[1][indx + v1] - rgb[1][indx + v3]) + fabsf(rgb[c][indx] - rgb[c][indx + v2])) / 65535.f);
                //N,E,W,S High Order Interpolation (Li & Randhawa)
                //N,E,W,S Hamilton Adams Interpolation
                // (48.f * 65535.f) = 3145680.f
                nv = LIM(((23.0f * rgb[1][indx - v1] + 23.0f * rgb[1][indx - v3] + rgb[1][indx - v5] + rgb[1][indx + v1] + 40.0f * rgb[c][indx] - 32.0f * rgb[c][indx - v2] - 8.0f * rgb[c][indx - v4])) / 3145680.f, 0.0f, 1.0f);
                ev = LIM(((23.0f * rgb[1][indx + h1] + 23.0f * rgb[1][indx + h3] + rgb[1][indx + h5] + rgb[1][indx - h1] + 40.0f * rgb[c][indx] - 32.0f * rgb[c][indx + h2] - 8.0f * rgb[c][indx + h4])) / 3145680.f, 0.0f, 1.0f);
                wv = LIM(((23.0f * rgb[1][indx - h1] + 23.0f * rgb[1][indx - h3] + rgb[1][indx - h5] + rgb[1][indx + h1] + 40.0f * rgb[c][indx] - 32.0f * rgb[c][indx - h2] - 8.0f * rgb[c][indx - h4])) / 3145680.f, 0.0f, 1.0f);
                sv = LIM(((23.0f * rgb[1][indx + v1] + 23.0f * rgb[1][indx + v3] + rgb[1][indx + v5] + rgb[1][indx - v1] + 40.0f * rgb[c][indx] - 32.0f * rgb[c][indx + v2] - 8.0f * rgb[c][indx + v4])) / 3145680.f, 0.0f, 1.0f);
                //Horizontal and vertical color differences
                vdif[indx >> 1] = (sg * nv + ng * sv) / (ng + sg) - (rgb[c][indx]) / 65535.f;
                hdif[indx >> 1] = (wg * ev + eg * wv) / (eg + wg) - (rgb[c][indx]) / 65535.f;
            }

#ifdef _OPENMP
        #pragma omp single
#endif
        {
            if (plistener) {
                plistener->setProgress (0.26);
            }
        }

#ifdef _OPENMP
        #pragma omp for
#endif

        for (int row = 7; row < height - 7; row++)
            for (int col = 7 + (FC(row, 1) & 1), indx = row * width + col, c = FC(row, col), d = c / 2; col < width - 7; col += 2, indx += 2) {
                //H&V integrated gaussian vector over variance on color differences
                //Mod Jacques 3/2013
                ng = LIM(epssq + 78.0f * SQR(vdif[indx >> 1]) + 69.0f * (SQR(vdif[(indx - v2) >> 1]) + SQR(vdif[(indx + v2) >> 1])) + 51.0f * (SQR(vdif[(indx - v4) >> 1]) + SQR(vdif[(indx + v4) >> 1])) + 21.0f * (SQR(vdif[(indx - v6) >> 1]) + SQR(vdif[(indx + v6) >> 1])) - 6.0f * SQR(vdif[(indx - v2) >> 1] + vdif[indx >> 1] + vdif[(indx + v2) >> 1])
                         - 10.0f * (SQR(vdif[(indx - v4) >> 1] + vdif[(indx - v2) >> 1] + vdif[indx >> 1]) + SQR(vdif[indx >> 1] + vdif[(indx + v2) >> 1] + vdif[(indx + v4) >> 1])) - 7.0f * (SQR(vdif[(indx - v6) >> 1] + vdif[(indx - v4) >> 1] + vdif[(indx - v2) >> 1]) + SQR(vdif[(indx + v2) >> 1] + vdif[(indx + v4) >> 1] + vdif[(indx + v6) >> 1])), 0.f, 1.f);
                eg = LIM(epssq + 78.0f * SQR(hdif[indx >> 1]) + 69.0f * (SQR(hdif[(indx - h2) >> 1]) + SQR(hdif[(indx + h2) >> 1])) + 51.0f * (SQR(hdif[(indx - h4) >> 1]) + SQR(hdif[(indx + h4) >> 1])) + 21.0f * (SQR(hdif[(indx - h6) >> 1]) + SQR(hdif[(indx + h6) >> 1])) - 6.0f * SQR(hdif[(indx - h2) >> 1] + hdif[indx >> 1] + hdif[(indx + h2) >> 1])
                         - 10.0f * (SQR(hdif[(indx - h4) >> 1] + hdif[(indx - h2) >> 1] + hdif[indx >> 1]) + SQR(hdif[indx >> 1] + hdif[(indx + h2) >> 1] + hdif[(indx + h4) >> 1])) - 7.0f * (SQR(hdif[(indx - h6) >> 1] + hdif[(indx - h4) >> 1] + hdif[(indx - h2) >> 1]) + SQR(hdif[(indx + h2) >> 1] + hdif[(indx + h4) >> 1] + hdif[(indx + h6) >> 1])), 0.f, 1.f);
                //Limit chrominance using H/V neighbourhood
                nv = median(0.725f * vdif[indx >> 1] + 0.1375f * vdif[(indx - v2) >> 1] + 0.1375f * vdif[(indx + v2) >> 1], vdif[(indx - v2) >> 1], vdif[(indx + v2) >> 1]);
                ev = median(0.725f * hdif[indx >> 1] + 0.1375f * hdif[(indx - h2) >> 1] + 0.1375f * hdif[(indx + h2) >> 1], hdif[(indx - h2) >> 1], hdif[(indx + h2) >> 1]);
                //Chrominance estimation
                chr[d][indx] = (eg * nv + ng * ev) / (ng + eg);
                //Green channel population
                rgb[1][indx] = rgb[c][indx] + 65535.f * chr[d][indx];
            }

#ifdef _OPENMP
        #pragma omp single
#endif
        {
            if (plistener) {
                plistener->setProgress (0.39);
            }
        }

//  free(vdif); free(hdif);
#ifdef _OPENMP
        #pragma omp for
#endif

        for (int row = 7; row < height - 7; row += 2)
            for (int col = 7 + (FC(row, 1) & 1), indx = row * width + col, c = 1 - FC(row, col) / 2; col < width - 7; col += 2, indx += 2) {
                //NW,NE,SW,SE Gradients
                nwg = 1.0f / (eps + fabsf(chr[c][indx - v1 - h1] - chr[c][indx - v3 - h3]) + fabsf(chr[c][indx + v1 + h1] - chr[c][indx - v3 - h3]));
                neg = 1.0f / (eps + fabsf(chr[c][indx - v1 + h1] - chr[c][indx - v3 + h3]) + fabsf(chr[c][indx + v1 - h1] - chr[c][indx - v3 + h3]));
                swg = 1.0f / (eps + fabsf(chr[c][indx + v1 - h1] - chr[c][indx + v3 + h3]) + fabsf(chr[c][indx - v1 + h1] - chr[c][indx + v3 - h3]));
                seg = 1.0f / (eps + fabsf(chr[c][indx + v1 + h1] - chr[c][indx + v3 - h3]) + fabsf(chr[c][indx - v1 - h1] - chr[c][indx + v3 + h3]));
                //Limit NW,NE,SW,SE Color differences
                nwv = median(chr[c][indx - v1 - h1], chr[c][indx - v3 - h1], chr[c][indx - v1 - h3]);
                nev = median(chr[c][indx - v1 + h1], chr[c][indx - v3 + h1], chr[c][indx - v1 + h3]);
                swv = median(chr[c][indx + v1 - h1], chr[c][indx + v3 - h1], chr[c][indx + v1 - h3]);
                sev = median(chr[c][indx + v1 + h1], chr[c][indx + v3 + h1], chr[c][indx + v1 + h3]);
                //Interpolate chrominance: R@B and B@R
                chr[c][indx] = (nwg * nwv + neg * nev + swg * swv + seg * sev) / (nwg + neg + swg + seg);
            }

#ifdef _OPENMP
        #pragma omp single
#endif
        {
            if (plistener) {
                plistener->setProgress (0.52);
            }
        }
#ifdef _OPENMP
        #pragma omp for
#endif

        for (int row = 8; row < height - 7; row += 2)
            for (int col = 7 + (FC(row, 1) & 1), indx = row * width + col, c = 1 - FC(row, col) / 2; col < width - 7; col += 2, indx += 2) {
                //NW,NE,SW,SE Gradients
                nwg = 1.0f / (eps + fabsf(chr[c][indx - v1 - h1] - chr[c][indx - v3 - h3]) + fabsf(chr[c][indx + v1 + h1] - chr[c][indx - v3 - h3]));
                neg = 1.0f / (eps + fabsf(chr[c][indx - v1 + h1] - chr[c][indx - v3 + h3]) + fabsf(chr[c][indx + v1 - h1] - chr[c][indx - v3 + h3]));
                swg = 1.0f / (eps + fabsf(chr[c][indx + v1 - h1] - chr[c][indx + v3 + h3]) + fabsf(chr[c][indx - v1 + h1] - chr[c][indx + v3 - h3]));
                seg = 1.0f / (eps + fabsf(chr[c][indx + v1 + h1] - chr[c][indx + v3 - h3]) + fabsf(chr[c][indx - v1 - h1] - chr[c][indx + v3 + h3]));
                //Limit NW,NE,SW,SE Color differences
                nwv = median(chr[c][indx - v1 - h1], chr[c][indx - v3 - h1], chr[c][indx - v1 - h3]);
                nev = median(chr[c][indx - v1 + h1], chr[c][indx - v3 + h1], chr[c][indx - v1 + h3]);
                swv = median(chr[c][indx + v1 - h1], chr[c][indx + v3 - h1], chr[c][indx + v1 - h3]);
                sev = median(chr[c][indx + v1 + h1], chr[c][indx + v3 + h1], chr[c][indx + v1 + h3]);
                //Interpolate chrominance: R@B and B@R
                chr[c][indx] = (nwg * nwv + neg * nev + swg * swv + seg * sev) / (nwg + neg + swg + seg);
            }

#ifdef _OPENMP
        #pragma omp single
#endif
        {
            if (plistener) {
                plistener->setProgress (0.65);
            }
        }
#ifdef _OPENMP
        #pragma omp for
#endif

        for (int row = 7; row < height - 7; row++)
            for (int col = 7 + (FC(row, 0) & 1), indx = row * width + col; col < width - 7; col += 2, indx += 2) {
                //N,E,W,S Gradients
                ng = 1.0f / (eps + fabsf(chr[0][indx - v1] - chr[0][indx - v3]) + fabsf(chr[0][indx + v1] - chr[0][indx - v3]));
                eg = 1.0f / (eps + fabsf(chr[0][indx + h1] - chr[0][indx + h3]) + fabsf(chr[0][indx - h1] - chr[0][indx + h3]));
                wg = 1.0f / (eps + fabsf(chr[0][indx - h1] - chr[0][indx - h3]) + fabsf(chr[0][indx + h1] - chr[0][indx - h3]));
                sg = 1.0f / (eps + fabsf(chr[0][indx + v1] - chr[0][indx + v3]) + fabsf(chr[0][indx - v1] - chr[0][indx + v3]));
                //Interpolate chrominance: R@G and B@G
                chr[0][indx] = ((ng * chr[0][indx - v1] + eg * chr[0][indx + h1] + wg * chr[0][indx - h1] + sg * chr[0][indx + v1]) / (ng + eg + wg + sg));
            }

#ifdef _OPENMP
        #pragma omp single
#endif
        {
            if (plistener) {
                plistener->setProgress (0.78);
            }
        }
#ifdef _OPENMP
        #pragma omp for
#endif

        for (int row = 7; row < height - 7; row++)
            for (int col = 7 + (FC(row, 0) & 1), indx = row * width + col; col < width - 7; col += 2, indx += 2) {

                //N,E,W,S Gradients
                ng = 1.0f / (eps + fabsf(chr[1][indx - v1] - chr[1][indx - v3]) + fabsf(chr[1][indx + v1] - chr[1][indx - v3]));
                eg = 1.0f / (eps + fabsf(chr[1][indx + h1] - chr[1][indx + h3]) + fabsf(chr[1][indx - h1] - chr[1][indx + h3]));
                wg = 1.0f / (eps + fabsf(chr[1][indx - h1] - chr[1][indx - h3]) + fabsf(chr[1][indx + h1] - chr[1][indx - h3]));
                sg = 1.0f / (eps + fabsf(chr[1][indx + v1] - chr[1][indx + v3]) + fabsf(chr[1][indx - v1] - chr[1][indx + v3]));
                //Interpolate chrominance: R@G and B@G
                chr[1][indx] = ((ng * chr[1][indx - v1] + eg * chr[1][indx + h1] + wg * chr[1][indx - h1] + sg * chr[1][indx + v1]) / (ng + eg + wg + sg));
            }

#ifdef _OPENMP
        #pragma omp single
#endif
        {
            if (plistener) {
                plistener->setProgress (0.91);
            }

            //Interpolate borders
//  border_interpolate2(7, rgb);
        }
        /*
#ifdef _OPENMP
        #pragma omp for
#endif
            for (int row=0; row < height; row++)  //borders
                for (int col=0; col < width; col++) {
                    if (col==7 && row >= 7 && row < height-7)
                        col = width-7;
                    int indxc=row*width+col;
                    red  [row][col] = rgb[indxc][0];
                    green[row][col] = rgb[indxc][1];
                    blue [row][col] = rgb[indxc][2];
                }
        */

#ifdef _OPENMP
        #pragma omp for
#endif

        for(int row = 7; row < height - 7; row++)
            for(int col = 7, indx = row * width + col; col < width - 7; col++, indx++) {
                red  [row][col] = CLIP(rgb[1][indx] - 65535.f * chr[0][indx]);
                green[row][col] = CLIP(rgb[1][indx]);
                blue [row][col] = CLIP(rgb[1][indx] - 65535.f * chr[1][indx]);
            }
    }// End of parallelization

    if (plistener) {
        plistener->setProgress (1.0);
    }

    free(chrarray);
    free(rgbarray);
    free(vdif);
    free(hdif);
}
#endif

IDA_ML

  • Hero Member
  • *****
  • Posts: 1014
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1769 on: July 15, 2018, 11:36:03 PM »
2) That should be possible to add... let's see... but you'll get a very blurred image.

If it blurs the image too much, it wouldn't be of much use.  It should preserve fine detail as it does in DaVinci Resolve for example.  Also, it should allow a very fine adjustment as this is the case with the blur radius slider.  This will make it possible to clean the most intrusive monochrome noise while leaving the rest in the image for a more cinematic look.  I will be looking forward to it!  Thanks, Masc.

50mm1200s

  • Member
  • ***
  • Posts: 177
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1770 on: July 15, 2018, 11:56:15 PM »
If it blurs the image too much, it wouldn't be of much use.  It should preserve fine detail as it does in DaVinci Resolve for example.  Also, it should allow a very fine adjustment as this is the case with the blur radius slider.  This will make it possible to clean the most intrusive monochrome noise while leaving the rest in the image for a more cinematic look.  I will be looking forward to it!  Thanks, Masc.

A proper denoising algo would be better, no? This and this implementations are GPL licensed, maybe it could be adapted. Just an idea.

50mm1200s

  • Member
  • ***
  • Posts: 177
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1771 on: July 18, 2018, 02:32:41 AM »
Last release for Windows is crashing during batch export (ProRes 444), while using darkframe subtraction :(
Darkframe sample (averaged using MLVApp):
https://minfil.com/S8b8tdf3b6/50d_iso_100.MLV

ps: it renders ~3 files on the batch, then crashs.

masc

  • Contributor
  • Hero Member
  • *****
  • Posts: 2011
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1772 on: July 18, 2018, 11:19:57 AM »
Last release for Windows is crashing during batch export (ProRes 444), while using darkframe subtraction :(
Darkframe sample (averaged using MLVApp):
https://minfil.com/S8b8tdf3b6/50d_iso_100.MLV

ps: it renders ~3 files on the batch, then crashs.
Yeah... known problem. But I am not able to reproduce it on any of my computers. I can export as many files with darkframe as I like. Do you have Qt installed? We need someone who has this problem and who is able to debug.
5D3.113 | EOSM.202

bouncyball

  • Contributor
  • Hero Member
  • *****
  • Posts: 849
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1773 on: July 18, 2018, 03:12:59 PM »
ps: it renders ~3 files on the batch, then crashs.
I can not reproduce this either. Please do what masc suggests.

masc

  • Contributor
  • Hero Member
  • *****
  • Posts: 2011
Re: Making a New MLV Processing App! [Windows, Mac and Linux]
« Reply #1774 on: July 18, 2018, 09:08:27 PM »
A proper denoising algo would be better, no? This and this implementations are GPL licensed, maybe it could be adapted. Just an idea.
Yes, totally right. A denoising algo would be best. I could test this algo in MLVApp, because it was easy to adapt to what we need:
http://www.ipol.im/pub/art/2011/bcm_nlm/
It works but it is soooo slow - OMG - some minutes per frame, and only on Win & Linux multithreaded to work correctly. So no real solution for us ;)
5D3.113 | EOSM.202