Monday, December 10, 2018

Bringing history to life with GANs

Creating a black-and-white (BW) version of an image is as simple as applying a filter in almost any image editing app available. The same cannot be said about the reverse process. Sure, it is possible to apply some filter to a BW image to make it more appealing, or to hire a Photoshop expert to restore an image, but I know of no filter that will intelligently color grass green and water blue.
Color filter app

Well, not yet, but the paper 'Image Colorization with Generative Adversarial Networks' reported promising results. I was excited to read this paper out of my love for history. History reminds us there is nothing new under the sun, allowing us to set expectations and giving us a sense of security. Although color photography has existed since 1861, it wasn't popular until the 1940s, and up until the 1960s most photos were taken in black and white. Technology cannot time-travel us back to those moments yet, but it can stretch our imagination of what they would have been like. The secret? GANs.

GAN
Invented by Ian Goodfellow, the Generative Adversarial Network (GAN) is composed of two Convolutional Neural Networks (CNNs); see my introduction to CNNs here. The goal of a GAN is to generate a fake dataset that resembles the real dataset. The generator produces output from random input, and the discriminator judges whether that output came from the real dataset. The generator is trained to minimize the probability that the discriminator correctly labels its output, while the discriminator is trained to maximize the probability of assigning the correct label. Sort of like good cop, bad cop. After training, the generator is expected to produce meaningful output from purely random input. Magic, right?
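
If you prefer code to analogies, here is a minimal PyTorch sketch of one adversarial training step. This is a toy illustration of the idea, not the paper's code; the tiny fully-connected nets and batch sizes are placeholders.

```python
# Toy GAN training step: D learns to label real vs. fake,
# G learns to make D label its fakes as real.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.rand(32, 784)     # stand-in for a batch of real images
fake = G(torch.randn(32, 64))  # generated from purely random input

# Discriminator: maximize the probability of assigning the correct label.
d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator: minimize the probability that D correctly labels its output.
g_loss = bce(D(fake), torch.ones(32, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```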

A GAN to generate MNIST digits


Colorizing GAN
The traditional GAN is expected to generate from random noise, but the colorizing GAN generates from BW images. This makes it a conditional GAN, where the input is zero noise with the BW image as a prior.
The discriminator gets as input a pair consisting of the BW image and either the real color image or the generated color image, and it judges which pair contains the real image. The generator takes a BW image (L × W × 1) and produces an RGB-colored version (L × W × 3). This is achievable with the U-Net, which is made of ConvNets along a contracting and an expanding path (a toy version is sketched after the figure below).

Generator U-Net
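
A toy PyTorch sketch of the U-Net idea, far shallower than the paper's network; the point is only the contracting path, the expanding path, and the skip connection between them.

```python
# Toy U-Net-style generator: 1-channel BW image in, 3-channel RGB out.
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.down1 = nn.Sequential(nn.Conv2d(1, 32, 4, 2, 1), nn.LeakyReLU(0.2))   # 256 -> 128
        self.down2 = nn.Sequential(nn.Conv2d(32, 64, 4, 2, 1), nn.LeakyReLU(0.2))  # 128 -> 64
        self.up1 = nn.Sequential(nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU())   # 64 -> 128
        # The skip concat doubles the channels (32 + 32) before the last layer.
        self.up2 = nn.Sequential(nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh())    # 128 -> 256

    def forward(self, bw):
        d1 = self.down1(bw)                       # contracting path
        d2 = self.down2(d1)
        u1 = self.up1(d2)                         # expanding path
        return self.up2(torch.cat([u1, d1], 1))   # skip connection from d1

rgb = TinyUNet()(torch.randn(1, 1, 256, 256))     # -> torch.Size([1, 3, 256, 256])
```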
The discriminator architecture is similar to the contracting path: it takes the two images of a pair concatenated and returns a one-dimensional output, the probability that the pair contains the real image (a toy version follows the figure below).

Discriminator conv-net
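
And a matching toy sketch of the discriminator: the BW image (1 channel) concatenated with a color image (3 channels) gives a 4-channel input, and one probability comes out per pair. Again a placeholder network, not the paper's exact architecture.

```python
# Toy pair discriminator: is the color half of this pair real or generated?
import torch
import torch.nn as nn

disc = nn.Sequential(
    nn.Conv2d(4, 32, 4, 2, 1), nn.LeakyReLU(0.2),   # 256 -> 128
    nn.Conv2d(32, 64, 4, 2, 1), nn.LeakyReLU(0.2),  # 128 -> 64
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),          # 64 features per image pair
    nn.Linear(64, 1), nn.Sigmoid(),                 # probability the pair is real
)

bw = torch.randn(1, 1, 256, 256)
color = torch.randn(1, 3, 256, 256)            # real or generated RGB
score = disc(torch.cat([bw, color], dim=1))    # -> torch.Size([1, 1])
```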


The method
The code and models of the colorizing GAN from the paper 'Image Colorization with Generative Adversarial Networks' are available on GitHub. Considering time and machine constraints, it made economic sense to reuse them. It wasn't easy, because the project wasn't designed for my use case: using a pre-trained model to colorize images the model was never trained or evaluated on.

The program accepts two datasets (cifar10 and places365) and has three modes: training, test evaluation, and Turing test. After testing, it samples some images, showing the real RGB, grayscale, and generated RGB images side by side.

I downloaded historical BW images from here and added a new dataset and a respective model (HistoryBW) to the code. I chose the pre-trained places365 model because that dataset has the same format as the HistoryBW images (jpg files, unlike the binary files of cifar10), and I placed the places365 checkpoint in the historybw checkpoint folder. Since real RGB images are not available and the test modes require them for evaluation, I added a `sample` mode which only samples the input, skipping evaluation and the need for real images. To top it all off, I resized all images to 256 × 256 thanks to IrfanView's batch editing feature (a scripted alternative is sketched after the command below). In the end, this command worked:

``python test_sample.py --dataset historybw --sample-size 2``
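
If you'd rather script the resizing than click through IrfanView, a few lines of Pillow do the same job; the folder path is just a placeholder for wherever the HistoryBW images live.

```python
# Batch-resize every jpg in a folder to 256 x 256 (overwrites in place).
from pathlib import Path
from PIL import Image

for path in Path("dataset/historybw").glob("*.jpg"):
    Image.open(path).resize((256, 256)).save(path)
```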

You can find my fork here; contributions are always welcome!


The outcome
My favorite section.
Each caption describes the historical event and evaluates how well the model performed. 'Filter' denotes that the output wasn't intelligently generated but looks as if a color filter was applied.


Left: Parisians at a sidewalk cafe, 1963, slightly random coloring
Right: Construction of the Eiffel Tower, 1888, a clear success



Left: New York commuters read of Kennedy's death, 1963, filter
Right: American artillery destroys a Nazi sub, 1943, identified shapes but random coloring


Left: German prisoners of war marching, 1944, clear success
Right: Jimi Hendrix, 1968, moderate success

Left: Assassination scene of Russian revolutionary Leon Trotsky, Mexico 1940, filter
Right: People enjoying an afternoon on the Seine river, Paris 1963, success

Left: American soldier in the Korean War, 1950, one of the rare failures for nature images
Right: Tsar Nicholas II and the Tsaritsa leave church, 1900, success

Left: Nazi war criminal awaits trial in Israel, 1961, filter
Right: Hiroshima, 1945, success


Left: Capt. Francis Fenton in Korea, 1950, filter
Right: American soldiers face German forces across the Berlin Wall, 1961, failure


Left: Soviet sniper Lyudmila Pavlichenko, 1941, success
Right: 'The Red Baron' petting his dog on the airfield, 1916, failure


Left: Anti-war protester confronts police during the Cuban missile crisis, London 1962, no effect
Right: RMS Titanic ready for launch, 1911, failure


Left: Titanic propellers before launch, 1911, no effect
Right: Royal Navy sailor repairing a signal flag on the way to Sierra Leone, 1942, filter

As shown above, some pictures are deceptively real, others look like some filter was applied, and in the worst case there was no effect at all. Despite the contrasts, it is clear the model got the grass green and the sky blue, probably a result of the training dataset (places365). Which brings us to the questions: how do we interpret the network's results, and how do we improve them? Train with more images? Tweak network parameters? We'll find out in the next blog post. Thanks for reading!
