Insane video super-resolution research

Started by Luther, May 10, 2020, 05:26:44 AM




cmh

Here's their github: https://github.com/open-mmlab/mmsr
I tried to quickly set up an environment yesterday, but I should have used Debian stable instead of Fedora 32 (it was all borked; I had to downgrade to GCC 8, and it was a pain).
I stumbled across this 5-month-old post on reddit:
https://www.reddit.com/r/GameUpscale/comments/dz8j8n/mmsr_esrgan_training_and_setup_guide/

Beware: "Take a week or two vacation. Maybe learn a new hobby or two, get reacquainted with old friends and so something meaningful. Training the model takes a LONG time with a single GPU."

Edit: CentOS is probably better suited.

Luther

They have pretrained models:
https://github.com/open-mmlab/mmsr/wiki/Training-and-Testing#training

I've tried their previous work, ESRGAN. It is very impressive, but it takes a while to process without CUDA (I couldn't figure out how to make CUDA work on Windows and didn't have any Linux distro installed at the time).
Quote
I should have used Debian stable instead of Fedora 32
Yes, definitely. apt is great for testing stuff. Most people in CV are using Ubuntu because of the nvidia drivers, though.

cmh

On Windows it is extremely slow; people are talking about it being 5 times slower (maybe this can be fixed, I don't know), and without CUDA I can't even fathom how slow it would be.
If I were to make a rough estimate based on their GitHub and various reddit posts, it would probably take something like 6 hours per second of video with my GTX 1060 6GB to get really good quality upscaling (10 iterations or so), maybe more.
It's unfortunate I don't have another box hanging around.

edit: I love those AI upscaling posts btw.

Luther

Quote from: cmh on May 16, 2020, 08:23:53 PM
it would probably take something like 6 hours per second of video with my GTX 1060 6GB to get really good quality upscaling (10 iterations or so), maybe more.
That's pretty slow. Still, this super-resolution research is pretty useful sometimes. As an anecdote, I once had a neighbour who was robbed, and he caught the guy on very low-resolution CCTV. I tried to upscale it with lanczos/spline64, but it wasn't enough to identify the individual. Maybe with MMSR it would be possible.
Quote
It's unfortunate I don't have another box hanging around.
TPU pricing seems to be decreasing fast (~$15 per hour). It would be nice to test how much time it would take to crunch a 1-minute video.
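
As a rough sanity check on those numbers (the TPU speedup factor below is a pure guess, only there to illustrate the arithmetic):

# Extrapolating cmh's ~6 hours-per-second GTX 1060 estimate to a 1-minute clip.
hours_per_second = 6              # estimated processing time per second of video
clip_seconds = 60                 # a 1-minute clip
gpu_hours = hours_per_second * clip_seconds
print(f"GTX 1060: {gpu_hours} h (~{gpu_hours / 24:.0f} days)")  # 360 h, ~15 days

tpu_speedup = 20                  # hypothetical speedup over the 1060
tpu_hours = gpu_hours / tpu_speedup
print(f"TPU: {tpu_hours:.0f} h, ~${tpu_hours * 15:.0f} at $15/hour")  # 18 h, ~$270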
Quote
edit: I love those AI upscaling posts btw.

Check this out:



Also:
https://github.com/cchen156/Seeing-Motion-in-the-Dark
https://github.com/elliottwu/DeepHDR
https://github.com/thangvubk/FEQE


yourboylloyd

This is way beyond amazing. I wonder if this could be used in astronomy.
Join the ML discord! https://discord.gg/H7h6rfq

Levas

The USRNet one has a "network_usrnet.py" file in the models directory.
Does this mean it can be run on our own images, or is it just a paper?

Luther

Quote from: Levas on June 02, 2020, 08:12:59 PM
The USRNet one has a "network_usrnet.py" file in the models directory.
Does this mean it can be run on our own images, or is it just a paper?

Yes @Levas, all of them can be used on your own images. The code for USRNet still seems to be under development, because the author submitted it to CVPR 2020 (the biggest and most respected computer vision conference, postponed because of COVID this year). But you'd have to train the network yourself, as they haven't provided a pre-trained model yet.
For ELD denoising, their model is similar to the paper "Learning to See in the Dark" and uses a raw image dataset. You'd need to train the network specifically for the camera you're using, because the noise characteristics differ from sensor to sensor (in the paper they used a Sony camera). The issue is that this dataset needs to be pre-processed, and the authors didn't provide any details on how exactly to do that.
Alternatives: use MMSR for super-resolution (the code and training guide from the OP are provided) and KAIR for denoising (FFDNet).
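
As a sketch of the KAIR route, this is roughly what single-image FFDNet denoising looks like, assuming the repo's current layout (the models/network_ffdnet.py module, the model_zoo/ffdnet_color.pth weights and the model(x, sigma) call are taken from KAIR's main_test_ffdnet.py and may change; check that script for the authoritative version):

# Rough sketch: color FFDNet denoising via KAIR; run from the KAIR root directory.
import cv2
import numpy as np
import torch
from models.network_ffdnet import FFDNet

noise_level = 25  # assumed noise standard deviation on the 0-255 scale

# Color FFDNet configuration: 3 channels in/out, 96 features, 12 layers.
model = FFDNet(in_nc=3, out_nc=3, nc=96, nb=12, act_mode='R')
model.load_state_dict(torch.load('model_zoo/ffdnet_color.pth'), strict=True)
model.eval()

# Load the noisy image as float RGB in [0, 1], shaped NCHW.
img = cv2.cvtColor(cv2.imread('noisy.png'), cv2.COLOR_BGR2RGB)
x = torch.from_numpy(img.astype(np.float32).transpose(2, 0, 1) / 255.0).unsqueeze(0)

# FFDNet takes the noise level as a second input.
sigma = torch.full((1, 1, 1, 1), noise_level / 255.0)
with torch.no_grad():
    out = model(x, sigma).squeeze(0).clamp_(0, 1).numpy()

out = (out.transpose(1, 2, 0) * 255.0).round().astype(np.uint8)
cv2.imwrite('denoised.png', cv2.cvtColor(out, cv2.COLOR_RGB2BGR))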


Levas

Impressive results.
Although most of it is based on trained models, the words "Network" and "Deep" are widely used when the model needs to be trained first, before you can use it, right?
I already stumbled on "Learning to See in the Dark". Isn't that the model that is trained on a dataset of 50GB of photos... and needs at least 64GB of RAM in a computer to even be able to train the model?
I'm very interested in denoising, super-resolution and such.
But I'd like to be able to run it on a late 2012 iMac  :P

I'm still very impressed with this script:
http://www.magiclantern.fm/forum/index.php?topic=20999.50

I've used that script to upscale and clean some old movie clips I made with my Canon PowerShot A30 (320x240 Motion JPEG AVI files).
The above script cleans up nicely, although the upscaled result is rather soft; it could use more sharpness.

So if someone knows a better script I could test, without the need for model training, I'm all ears  8)

Luther

Quote from: Levas on June 03, 2020, 04:58:06 PM
Although most of it is based on trained models, the words "Network" and "Deep" are widely used when the model needs to be trained first, before you can use it, right?
Yes, these are machine learning algorithms; they need to be trained on a dataset. Normally the author provides a pretrained model, so you only need to download and run it.

Quote
I already stumbled on "Learning to See in the Dark". Isn't that the model that is trained on a dataset of 50GB of photos... and needs at least 64GB of RAM in a computer to even be able to train the model?
Yes. And as I've said above, the author doesn't provide information about how to create your own dataset (for this specific network you need to create your own, because it uses raw noise information, which varies between different sensors).

Quote
I'm very interested in denoising, super-resolution and such.
But I'd like to be able to run it on a late 2012 iMac  :P
Won't be possible. Most of this research uses PyTorch/TensorFlow and requires CUDA...
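
Strictly speaking, PyTorch can fall back to the CPU; it is just painfully slow, and many research scripts hard-code the GPU. A minimal sketch of the device selection you would have to patch in:

import torch

# Use the GPU when CUDA is available; otherwise fall back to the much
# slower CPU. Many research repos hard-code 'cuda', so running them on
# a 2012 iMac usually means patching the device selection like this.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"running on: {device}")

# Toy example: a small convolution moved to whichever device was chosen.
net = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1).to(device)
x = torch.rand(1, 3, 64, 64, device=device)
with torch.no_grad():
    y = net(x)
print(y.shape)  # torch.Size([1, 3, 64, 64])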

Quote
I'm still very impressed with this script:
http://www.magiclantern.fm/forum/index.php?topic=20999.50
This seems to be burst-image denoising (multiple images), right? The networks above are made for single-image denoising/super-resolution...
Also, if you have the time to take multiple photos, why not make long exposures with low ISO and blend with HDRMerge? That is what I do whenever I can.

Quote
So if someone thinks there's a better script, which I could test, without the need of model training, I'm all ears  8)
My suggestions:
- For upscaling single images (not video), try ESRGAN. It works great, but you will need CUDA (you can run it directly on the CPU, but it takes hours to process).
- For denoising, try FFDNet or BM3D.

ps: btw, I've never tested FFDNet/BM3D. I suggested them because they demonstrate good results in their papers and are pretty fast. I've only tested ESRGAN as of the time of writing.
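
For reference, this is roughly what single-image ESRGAN inference looks like, condensed from the test script in the xinntao/ESRGAN repo (the models/RRDB_ESRGAN_x4.pth path assumes you downloaded the pretrained model their README links to; run from the repo root):

# Sketch of 4x single-image upscaling with ESRGAN, based on the repo's test script.
import cv2
import numpy as np
import torch
import RRDBNet_arch as arch  # architecture module shipped with the ESRGAN repo

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# RRDB network: 3-channel in/out, 64 features, 23 residual-in-residual blocks.
model = arch.RRDBNet(3, 3, 64, 23, gc=32)
model.load_state_dict(torch.load('models/RRDB_ESRGAN_x4.pth'), strict=True)
model.eval().to(device)

# Read the low-res image as BGR, convert to float RGB in [0, 1], shape NCHW.
img = cv2.imread('LR/input.png', cv2.IMREAD_COLOR).astype(np.float32) / 255.0
x = torch.from_numpy(img[:, :, [2, 1, 0]].transpose(2, 0, 1)).unsqueeze(0)

with torch.no_grad():
    out = model(x.to(device)).squeeze(0).clamp_(0, 1).cpu().numpy()

# Back to HWC BGR in [0, 255] for OpenCV.
out = (out[[2, 1, 0], :, :].transpose(1, 2, 0) * 255.0).round().astype(np.uint8)
cv2.imwrite('output_x4.png', out)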

Levas

Quote from: Luther on June 03, 2020, 05:27:13 PM
My suggestions:
- For upscaling single images (not video), try ESRGAN. It works great, but you will need CUDA (you can run it directly on the CPU, but it takes hours to process).
- For denoising, try FFDNet or BM3D.

ps: btw, I've never tested FFDNet/BM3D. I suggested them because they demonstrate good results in their papers and are pretty fast. I've only tested ESRGAN as of the time of writing.

Definitely interested in giving these a try. Not sure when, though; I need some time. The last 2 times I tried GitHub stuff I messed up my computer and couldn't compile Magic Lantern anymore  :P
I've found a spare USB drive and installed macOS on it, and I want to give it a try on that; if it fails, I still have a working computer  :D

DeafEyeJedi

What a thread, @Luther, and thanks for starting this remarkable read. Bookmarked and definitely following. So much potential, and yet it seems we're better off building a hackintosh for this. Ha.
5D3.113 | 5D3.123 | EOSM.203 | 7D.203 | 70D.112 | 100D.101 | EOSM2.* | 50D.109