ML Two
Lecture 03 (Week 4)
🤗Train a GAN model with PyTorch 🔥
Welcome 👩‍🎤🧑‍🎤👨‍🎤
First of all, don't forget to confirm your attendance on Seats App!
as usual, an AI-related fun project to wake us up
Last two lectures:
- Trained an image classifier with CreateML, using a well-prepared fruit image dataset🍎🍏
- Trained a sound classifier with CreateML, using a not-so-well-prepared environmental sound dataset🔊 and some Python code for data pre-processing
Today:
Let's navigate outside CreateML and see how to train a GAN model using Python and PyTorch 🔥
After today's lecture:
-- train a cool GAN from scratch
-- gain some confidence in knowing what's going on behind the jumbo code
-- prepare to train your own GAN!
What can GAN models do?
Staff-pick AI stuff to wake us up - 😎Cool GAN Applications - part 1
-- thispersondoesnotexist
-- thisjellyfishdoesnotexist (A project exploring the limits of available data as a means of engaging with critically endangered species)
(both use StyleGAN; TensorFlow implementation and PyTorch implementation)
Staff-pick AI stuff to wake us up - 😎Cool GAN Applications - part 2
-- Pix2Pix with an interactive demo
-- GAN dissection for unconventional image editing
-- a recent DragGAN for unconventional image editing
What can GAN models do? just to name a few
Our GAN for today is here, everything is prepared :)
PyTorch🔥
It is a Python library/framework for training and using AI models with a lot of customization possibilities.
Here is a PyTorch tutorial FYI.
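To get a taste before the notebook: PyTorch's basic object is the tensor. A two-line sketch (nothing notebook-specific here):

```python
import torch

# PyTorch's basic object is the tensor: an n-dimensional array that
# supports automatic differentiation and GPU computation.
x = torch.randn(3, 4)     # a random 3x4 matrix
print(x.shape, x.device)  # torch.Size([3, 4]) cpu (or cuda, if on a GPU)
```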
Let's go to the notebook:
🥇First run:
1. Run the cells one by one just to see the nice results!
(no need to understand the code)
2. A note: run the TensorBoard cell before running the training cell
3. For fun: once training has started, go to the Images tab in TensorBoard and smash that refresh button
🫲Your turn! 🫱
-- 1. Save the notebook to your drive or open in playground mode
-- 2. Under "Runtime" -> "Change runtime type", make sure "GPU" is selected
-- 3. Run the cells one by one, EXCEPT:
-- 4. Run the TensorBoard cell before running the training cell
The code in this notebook would take me days to write,
- today we are not aiming to understand every line of it,
- instead we'll work on conquering the fear of jumbo Python code for training AI models 😎
A gentle introduction to Generative Adversarial Networks (GAN)
- A problem: say we have a bunch of cat images😼 and we want to train a neural network on them so that it can generate new cat images.
- How?
A smaller problem -
Question 1: of course we'd like this neural network to generate a different cat image every time. Where could this randomness come from?
- Solution 1: The network's weights and biases are deterministic once trained, but we can play with the network's input: use a random vector as the input to the network. Every time we run inference, the model starts by sampling a new random vector and therefore outputs a different 2D matrix (image); see the sketch below.
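Here is that idea as a tiny PyTorch sketch; the latent size of 100 and the `generator` name are illustrative guesses, not necessarily the notebook's:

```python
import torch

latent_dim = 100  # size of the random input vector (illustrative)

z1 = torch.randn(1, latent_dim)  # sample one random vector
z2 = torch.randn(1, latent_dim)  # a different sample

# With a trained generator network `generator` (hypothetical here),
# generator(z1) and generator(z2) would produce two different images,
# even though the network's weights stay fixed after training.
```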
Question 2: Then how can we make this generative network able to generate cat images😼 instead of noisy 2D matrices of random numbers?🤯
Question 3: Then how can we make this generative network able to generate NEW cat images😼 beyond the training data?🤯🤯
GAN tackles all these questions in a smart way: it sets one neural network to teach another neural network to generate.
- Introduction to GAN🤘
-- It is an ensemble of two neural networks - a Generator (G) and a Discriminator (D).
--- Generator: it takes a random vector as input and expands that into a 2D matrix (aka image).
--- Discriminator: a good old classification model that predicts if an image is real (from the training dataset) or fake (generated by G).
(no magic yet... we just set up two individual neural nets.)
- Introduction to GAN🤘
-- The magic happens when we train G and D alternately, in a particular way:
-- It is a Tom and Jerry game between the two networks, Generator 🐭 and Discriminator 😼
--- where D tries to catch G as a fake image generator 🕵️
--- and G tries to fool D into thinking that G produces real images. 🤡
- Introduction to GAN🤘
-- G and D each have their own loss term.
--- 🕵️Discriminator loss: just like any image classification model, D tries to classify whether the input image is real or generated (fake), and its loss is just a typical classification-task loss.
--- 🤡Generator loss: the ingenious design is that the generator loss measures "how well G is able to trick the discriminator into making mistakes", aka the inverse of the Discriminator loss.
--- If you think about it, both G's and D's losses come from D's output...
- Introduction to GAN🤘
-- Put together, they are sometimes called a "min-max loss" (check the Colab notebook for implementation details; it is actually quite simple, and a minimal sketch follows)
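To make that concrete, here is a minimal sketch of the two loss terms using binary cross-entropy, assuming D outputs the probability that its input is real; the function names are illustrative, not the notebook's exact code:

```python
import torch
import torch.nn.functional as F

def discriminator_loss(d_real, d_fake):
    # 🕵️ D should output 1 ("real") on real images and 0 ("fake") on
    # generated ones: a typical binary classification loss.
    real_loss = F.binary_cross_entropy(d_real, torch.ones_like(d_real))
    fake_loss = F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    return real_loss + fake_loss

def generator_loss(d_fake):
    # 🤡 G wins when D labels generated images as real (1):
    # the inverse objective, computed entirely from D's output.
    return F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
```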
- Introduction to GAN🤘
-- Despite the names, G and D are (usually) nothing more special than the layers and models we have already seen in MLPs and CNNs.
--- Generator gets its name because its layers are set to output a 2D matrix given a 1D vector.
--- Discriminator gets its name because its layers are set to output a single number, trained to predict the probability that the input image is real or fake. (A minimal sketch of both follows.)
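Here is a minimal sketch of what G and D can look like for MNIST-sized (28x28) images; the layer sizes are illustrative, not the notebook's exact architecture:

```python
import torch
from torch import nn

latent_dim, img_size = 100, 28 * 28  # illustrative sizes (MNIST-like)

# Generator: 1D random vector in, flattened 2D image out.
generator = nn.Sequential(
    nn.Linear(latent_dim, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, img_size),
    nn.Tanh(),  # squashes pixel values into [-1, 1]
)

# Discriminator: flattened image in, one single number out.
discriminator = nn.Sequential(
    nn.Linear(img_size, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
    nn.Sigmoid(),  # probability that the image is real
)

z = torch.randn(1, latent_dim)
fake_img = generator(z)           # shape (1, 784); reshape to 28x28 to view
p_real = discriminator(fake_img)  # shape (1, 1); a single probability
```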
A gentle YouTube tutorial on GANs here
Back to this...
we'll be working on conquering the fear of jumbo AI training code
part 1: what is the PyTorch-specific code, and where are the models defined?
Look at the import cell:
all the imported PyTorch stuff is NOT for hard memorising;
we get to know it only when necessary and only after we start using it (a sketch of a typical import cell follows).
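For reference, an import cell in this kind of notebook typically looks something like this (an illustrative guess, not necessarily the notebook's exact list):

```python
import torch                             # core tensor library
from torch import nn                     # layers, activations, losses
from torch.utils.data import DataLoader  # batching and shuffling utilities
import torchvision.transforms as T       # image pre-processing (an assumption)
```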
🥈Second run at the notebook:
there are four classes defined:
- Which class corresponds to Generator?
- Which class corresponds to Discriminator?
- Which class corresponds to assembling the Generator and Discriminator together?
- Which class corresponds to data handling?
Recap from MLOne: the training process big picture
1. Build the model: have an initially guessed model (random and imperfect)
->
2. Prepare the data: pre-process the data so it can be loaded into training.
->
3. Forward pass: input the data to the model, let the model do the computations and get the output.
->
4. Loss calculation: measure how wrong this output is compared to the correct answer paired with the input.
->
5. Backward pass: calculate gradients from the loss and use those to update the model's weights and biases (update rules specified by an optimizer).
->
back to step 2 and repeat (a minimal sketch of the whole loop follows)
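Here is that big picture as a minimal PyTorch sketch for a generic (non-GAN) model; all names and sizes are illustrative:

```python
import torch
from torch import nn

model = nn.Linear(10, 1)  # 1. build an initially random (imperfect) model
# 2. prepare the data: here, five ready-made batches of (input, answer) pairs
data = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(5)]
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(3):
    for x, y in data:
        pred = model(x)          # 3. forward pass
        loss = loss_fn(pred, y)  # 4. loss calculation
        optimizer.zero_grad()
        loss.backward()          # 5. backward pass: compute gradients...
        optimizer.step()         #    ...and let the optimizer update weights and biases
    # back to step 2 (the next pass over the batches) and repeat
```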
😎Conquering the fear of jumbo AI training code
- part 2: what does each class do, in terms of the training-process big picture?
🥉Third run at the notebook:
There are four classes defined:
- MNISTDataModule
- Generator
- Discriminator
- GAN
Without looking into each class's detailed code, try to assign each class a role in terms of the steps in the big picture
(add a comment above the class definition, like the demo below)
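For example, the demo comment could look like this; the role guess here is mine (and deliberately the easy one):

```python
# Big-picture role: step 2, prepare the data (my guess -- verify it yourself!)
class MNISTDataModule:
    ...  # class body elided; see the notebook
```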
😎Conquering the fear of jumbo AI training code
- part 3: diving deeper: where is the model architecture (aka the layers) defined?
There are lots of PyTorch-specific classes and functions; don't worry about having to know all of them!
😎Here are three important ones -
- nn.Linear : fully connected layer (MLP)
- nn.LeakyReLU : an activation function similar to ReLU (see the example below)
- nn.Sequential() : a container for connecting individual layers together sequentially
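A quick taste of the difference between nn.ReLU and nn.LeakyReLU; the 0.2 slope is just an example value:

```python
import torch
from torch import nn

x = torch.tensor([-2.0, -0.5, 0.0, 1.0])
print(nn.ReLU()(x))          # tensor([0., 0., 0., 1.]): negatives are zeroed
print(nn.LeakyReLU(0.2)(x))  # tensor([-0.4000, -0.1000, 0.0000, 1.0000]): negatives keep a small slope
```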
∜Fourth run at the notebook:
Just by looking at the Discriminator class:
- self.model = nn.Sequential(...) is where the model architecture is defined
- nn.Linear : fully connected layer (MLP)
- nn.LeakyReLU : activation function similar to ReLU
- nn.Sequential() : connects the layers together

💸 bonus questions:
-- 1. what is the number of epochs in this notebook?
--- (Epoch: during training, we divide the entire training dataset into batches and train the model batch by batch. One epoch means that the model has seen all batches from the training dataset once. For example, with 60,000 training images and a batch size of 100, one epoch is 600 batches.)
-- 2. what is the input size of the generator in this notebook?
-- 3. what is the input size of the discriminator?
Fun AI time: dadabots 🤘, for your inner metalhead
Summary today
- Classes in Python 🖲
- Introduction to GAN and the training process 👾
- GAN training notebook 📀
- How to define a neural network model and the training process using PyTorch (woohoo, a connection from MLOne) 🥰
- Next: we are going to WRITE a neural network from scratch
Play time!
play around with the notebook, improve or deteriorate the generated image quality 😈

here are some possibilities you could try (mainly in G and D)
- Change layer params in the discriminator
- Add one or more layers to the discriminator
- Change the block parameters in the generator
- Try more epochs
- Change (maybe lower) the latent dimension (the generator's input size)
Shout out when things break! That will be fantastic because I like fixing things.
💸💸 Bonus time 2: do you want to train a Pokemon GAN?
We'll see you next week same time same place! 🫡