In my previous post I showed how to use progressive growing GANs (PGGANs) for image synthesis. In this post I show how to use StyleGAN on larger images to create customizable images of watches. Additionally, I show how to apply StyleGAN to custom data.
The StyleGAN paper was released just a few months ago (December 2018) and shows some major improvements over previous generative adversarial networks. Instead of repeating what others have already explained in a detailed and easy-to-understand way, I refer to this article.
In short, the StyleGAN architecture makes it possible to control the style of generated examples inside the image synthesis network. High-level styles of an image can be adjusted by applying different vectors w from the intermediate latent space W. Furthermore, it is possible to transfer a style from one generated image to another. These styles are mapped to the generator's LOD (level of detail) sub-networks, which means their effect varies from coarse to fine.
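The per-layer styles are what make mixing easy to reason about. Below is a minimal, self-contained sketch of the idea in plain Python (a toy illustration, not the actual StyleGAN code): each generator layer receives a style vector w, and swapping the source latent at a crossover layer takes coarse attributes from one image and fine attributes from another.

```python
import random

random.seed(0)

# StyleGAN at 1024x1024: 18 layer inputs, each fed a 512-dim style vector w.
W_DIM, NUM_LAYERS = 512, 18

# Pretend these came out of the mapping network f: Z -> W for two latents.
w_a = [random.gauss(0, 1) for _ in range(W_DIM)]
w_b = [random.gauss(0, 1) for _ in range(W_DIM)]

# Normally the same w is fed to every layer; style mixing swaps the source
# per layer. Layers 0-3 (resolutions 4x4-8x8) carry coarse styles such as
# pose and shape; later layers carry progressively finer styles.
crossover = 4
mixed_styles = [w_a if layer < crossover else w_b
                for layer in range(NUM_LAYERS)]

print(sum(s is w_a for s in mixed_styles),
      sum(s is w_b for s in mixed_styles))  # → 4 14
```

The resulting image would take its coarse structure from source A and its finer details from source B; moving the crossover point shifts which attributes are transferred.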
The StyleGAN paper used the Flickr-Faces-HQ dataset to produce artificial human faces, where the styles can be interpreted as pose, shape and colorization of the image. The results of the paper received some media attention through the website www.thispersondoesnotexist.com.
StyleGAN on watches
I used the StyleGAN architecture on 110,810 images of watches (1024×1024) from chrono24. The network has seen 15 million images over almost one month of training on an RTX 2080 Ti. The results are much more detailed than in my previous post (beyond the increased resolution itself), and the learned styles are comparable to those in the paper. These images are not curated, so they are simply what the GAN produces.
Now let's take a look at style transfer from one generated image to another:
Run StyleGAN on your own image dataset
- First of all, clone the git repository to your local machine
- Make sure to install all the requirements mentioned in the README.md file (at least 8 GB of GPU memory is required)
- Put all your images into one directory, e.g. “E:/image_data” (otherwise you have to change some lines of code in dataset_tool.py)
- Navigate to the repository and run “python dataset_tool.py create_from_images datasets/mydata_tfrecord E:/image_data”. mydata_tfrecord is the target folder; make sure you have enough disk space (in my case 50x the size of the source images)
- Configure your generated dataset in train.py by adding:

```python
desc += '-custom'; dataset = EasyDict(tfrecord_dir='mydata_tfrecord'); train.mirror_augment = False
```
- Comment or un-comment the configs; most of them are self-explanatory. The important ones are the number of GPUs and minibatch_dict (the batch size for each LOD)
- Run train.py with Python and check the results directory for samples, which will be generated from time to time
- In case the training crashes, you can resume it by changing the parameters in training_loop.py; it will restart the training and create a new run:

```python
resume_run_id = 24,        # ID of the crashed run to resume from
resume_snapshot = None,    # None picks the latest snapshot of that run
resume_kimg = 11160.0,     # thousands of images the network has already seen
```
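The ~50x disk blow-up in the dataset step comes from dataset_tool.py storing uncompressed RGB pixels at every power-of-two resolution down to 4×4 (a mip-map pyramid of tfrecords). A quick back-of-the-envelope check, where the 100 KB average JPEG size is just an illustrative assumption:

```python
def tfrecord_bytes_per_image(resolution=1024, channels=3):
    """Uncompressed bytes stored per image across all power-of-two LODs."""
    total, res = 0, resolution
    while res >= 4:  # tfrecords are written down to 4x4
        total += res * res * channels
        res //= 2
    return total

raw = tfrecord_bytes_per_image(1024)   # ~4 MiB per 1024x1024 image
jpeg_kb = 100                          # assumed average source JPEG size
blowup = raw / (jpeg_kb * 1024)
print(f"{raw / 2**20:.1f} MiB per image, ~{blowup:.0f}x the JPEG source")
```

With typical JPEG compression ratios this lands in the same ballpark as the ~50x I observed, so plan your disk budget accordingly before converting a large dataset.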