ML Two
Lecture 03
🤗Train a GAN model with PyTorch 🔥
Welcome 👩‍🎤🧑‍🎤👨‍🎤
First of all, don't forget to confirm your attendance on the Seats App!
As usual, a fun AI-related project to wake us up
Last two lectures:
- Trained an image classifier with CreateML, using a well-prepared fruit image dataset🍎🍏
- Trained a sound classifier with CreateML, using a not-so-well-prepared environmental sound dataset🔊 and a Python script for data pre-processing
Today:
Let's navigate outside CreateML and see how to train a GAN model using Python and PyTorch 🔥
After today's lecture:
-- train a cool GAN from scratch
-- gain some confidence about what's going on behind the jumbo code
-- prepare to train your own GAN!
What can GAN models do?
Staff-pick AI stuff to wake us up - 😎Cool GAN Applications - part 1
-- thispersondoesnotexist
-- thisjellyfishdoesnotexist (A project exploring the limits of available data as a means of engaging with critically endangered species)
(both use StyleGAN; there are TensorFlow and PyTorch implementations)
Staff-pick AI stuff to wake us up - 😎Cool GAN Applications - part 2
-- Pix2Pix with an interactive demo
-- GAN dissection for image editing in a special way
-- a recent DragGAN for image editing in another special way
What can GAN models do? just to name a few
Our GAN for today is here, everything is prepared :)
PyTorch 🔥
This is a Python library/framework for training and using AI models with a lot of customization power.
Here is a PyTorch tutorial FYI.
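To get a first feel for PyTorch before the notebook, here is a tiny sketch (not from the notebook) of its core trick: tensors that track gradients automatically, which is what powers all model training.

```python
import torch

# tensors are PyTorch's basic data structure
x = torch.tensor(2.0, requires_grad=True)

# build a tiny computation: y = x^2 + 3x
y = x ** 2 + 3 * x

# autograd computes dy/dx automatically --
# the same machinery updates a model's weights during training
y.backward()
print(x.grad)  # dy/dx = 2x + 3 = 7 at x = 2
```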
Let's go to the notebook:
🥇First run:
1. Run the cells one by one, just to see the nice results!
(no need to understand the code)
2. A note: run the TensorBoard cell before running the training cell
3. A fun bit: once training has started, go to the Images tab under TensorBoard and smash that refresh button
🫲Your turn! 🫱
-- 1. Save the notebook to your drive or open in playground mode
-- 2. Under "Runtime" -> "Change runtime type", make sure "GPU" is selected
-- 3. Run the cells one by one EXCEPT:
-- 4. Run the tensorboard cell before running the training cell
The code in this notebook would take me days to write;
today we are not aiming to understand every line of it,
instead we'll be working on conquering the fear of jumbo AI training code 😎
A gentle introduction to Generative Adversarial Networks (GANs)
It is an ensemble of two neural networks being trained together - a Generator (G) and a Discriminator (D).
It is a Tom and Jerry game between these two networks: the Generator 🐭 and the Discriminator 😼.
Despite the names, G and D are (usually) nothing more special than the layers and models we have already seen in MLPs and CNNs.
The Generator gets its name because its layers are set to output a 2D matrix given a 1D vector.
The notion of "generation" partly comes from this dimension expansion process.
The Discriminator gets its name because its layers are set to output a single number (the probability of the input image being fake or real)
given an input 2D matrix.
The notion of "discrimination" comes from its training objective: making correct guesses about whether an image is fake or real.
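The G and D described above can be sketched as a pair of small stacks of layers. The layer sizes below are illustrative, not the notebook's exact ones — just enough to show the dimension expansion (G) and the single-number output (D).

```python
import torch
from torch import nn

LATENT_DIM = 100      # size of the random input vector (illustrative)
IMG_PIXELS = 28 * 28  # a flattened 28x28 image, e.g. MNIST

# Generator: 1D noise vector -> flattened image (dimension expansion)
generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, IMG_PIXELS),
    nn.Tanh(),  # pixel values in [-1, 1]
)

# Discriminator: flattened image -> single real/fake probability
discriminator = nn.Sequential(
    nn.Linear(IMG_PIXELS, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
    nn.Sigmoid(),  # probability of "real"
)

z = torch.randn(8, LATENT_DIM)       # a batch of 8 noise vectors
fake_images = generator(z)           # shape: (8, 784)
scores = discriminator(fake_images)  # shape: (8, 1), each in [0, 1]
```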
Some gentle introduction here
Back to this...
we'll be working on conquering the fear of jumbo AI training codes
part 1 : what are the PyTorch specifics and where are the models being defined?
Look at the import cell:
all the imported PyTorch stuff is NOT for memorising,
we get to know more only when necessary and only after we start using it.
🥈Second run at the notebook:
there are four classes defined:
- Which class corresponds to Generator?
- Which class corresponds to Discriminator?
- Which class assembles the Generator and Discriminator together?
- Which class corresponds to data handling?
Recap from MLOne: the training process big picture
1. Build the model: have an initially guessed model (random and imperfect)
->
2. Prepare the data: pre-process the data so it can be loaded into training.
->
3. Forward pass: input the data to the model, let the model do the computations and get the output.
->
4. Loss calculation: measure how wrong this output is compared to the correct answer.
->
5. Backward pass: calculate gradients from the loss and use those to update the model's weights and biases (update rules specified by optimizer).
->
back to step 2 and repeat
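The big-picture steps above can be sketched as a generic PyTorch training loop. This toy example (a single linear layer on made-up data, not the notebook's GAN) exists only to show where each step lives in code:

```python
import torch
from torch import nn

torch.manual_seed(0)  # reproducibility

# 1. Build the model: randomly initialised, so initially wrong
model = nn.Linear(4, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# 2. Prepare the data: a toy batch (in the notebook, the DataModule does this)
inputs = torch.randn(64, 4)
targets = inputs.sum(dim=1, keepdim=True)  # a learnable relationship

losses = []
for epoch in range(100):
    outputs = model(inputs)           # 3. Forward pass
    loss = loss_fn(outputs, targets)  # 4. Loss calculation
    optimizer.zero_grad()             # 5. Backward pass:
    loss.backward()                   #    gradients from the loss...
    optimizer.step()                  #    ...update weights and biases
    losses.append(loss.item())
# the loss shrinks as the loop repeats steps 2-5
```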
😎Conquering the fear of jumbo AI training codes
- part 2 : what does each class do corresponding to the training process big picture?
🥉Third run at the notebook:
There are four classes defined:
- MNISTDataModule
- Generator
- Discriminator
- GAN
Without looking into each class's detail codes, try to assign the role of each class in terms of steps in the big picture
(add a comment above the class definition, me demo)
😎Conquering the fear of jumbo AI training codes
- part 3 : diving deeper: where are the model architecture (aka layers) being defined?
There are lots of PyTorch-specific classes and functions, don't worry about having to know all of them!
😎Here are three important ones -
- nn.Linear : fully connected layer (MLP)
- nn.LeakyReLU : an activation func similar to ReLU
- nn.Sequential() : to define a neural network by stacking layers together sequentially
4️⃣Fourth run at the notebook:
Just by looking at the Discriminator class:
- self.model = nn.Sequential(...) is where the model architecture is defined
- nn.Linear : fully connected layer (MLP)
- nn.LeakyReLU : activation func similar to ReLU
- nn.Sequential() : to stack layers together
Can we draw out the NN diagram? (me demo)
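One quick trick for reading off the diagram: printing an nn.Sequential lists every layer in order. The layer sizes below are illustrative, not necessarily the notebook's exact ones.

```python
from torch import nn

# a Discriminator-like stack (sizes are illustrative)
model = nn.Sequential(
    nn.Linear(28 * 28, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)

# printing the model lists each layer in order --
# basically the NN diagram in text form
print(model)
```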

💸 bonus questions:
-- 1. what is the number of epochs in this notebook?
--- (Epoch: during training, we divide the entire training dataset into batches and train the model batch by batch. One epoch means that the model has seen all batches from the training dataset once.)
-- 2. what is the input size of the generator in this notebook?
-- 3. what is the input size of the discriminator?
Fun AI time: dadabots 🤘, for your inner metalhead
Summary today
- Class in python 🖲
- Introduction to GAN and the training process 👾
- GAN training notebook 📀
- How to define a neural network model and the training process using PyTorch (woohoo, connection from MLOne) 🥰
- Next: we are going to WRITE a neural network from scratch
Play time!
play around with the notebook, improve or deteriorate the generated image quality 😈

here are some possibilities you could try (mainly in G and D)
- change layers params in discriminator
- Add one or more layers to discriminator
- Change the block parameters in generator
- Try more epochs
- Change (maybe lower) the latent dimension (input of the generator)
Shout out when things are broken! That will be fantastic, because I like fixing things
💸💸 Bonus time 2: do you want to train a Pokemon GAN ?
We'll see you next week same time same place! 🫡