CS577 Deep Learning Final Examination
Illinois Institute of Technology
Name:
Illinois Tech A #:
Instructions:
1. You have 120 minutes to complete the examination.
2. Write your answers on a blank sheet of paper. After finishing, take a picture of your answers
(preferably with a PDF scanner app, such as Adobe Scan), then upload it to Blackboard.
3. This exam is open book. You may bring in your homework, class notes and textbooks to help you.
Internet browser searches are NOT allowed.
4. You may use a calculator. You may not share a calculator with anyone.
In recognition of and in the spirit of the Illinois Institute of Technology Honor Code, I certify that each
student will neither give nor receive unpermitted aid on this examination after starting the exam.
Question 1: Convolutional Neural Networks
[8 pts] Multiple Choice Questions. Each question has only ONE correct answer. There are a total of 4
questions, each worth 2 points. Your answer should be a), b), c), or d).
1. Which of the following is the use of the convolution layer in CNN?
(a) To pool the features from the previous layer.
(b) To flatten the image into a linear array.
(c) To apply a set of filters to the image and create an activation map.
(d) To connect every neuron to all neurons in the next layer.
Answer:
2. What is the primary function of the first convolutional layer in a CNN?
(a) To immediately reduce the size of the input image.
(b) To detect features like edges and corners.
(c) To classify the image into various categories.
(d) To connect every pixel with its neighboring pixels.
Answer:
3. Which of the following correctly explains the use of the pooling layer in CNN?
(a) It increases the spatial size of the image.
(b) It reduces the number of parameters and computations in the network.
(c) It converts color images to grayscale.
(d) It enhances the features of the image.
Answer:
4. Which group of the listed items contains valid hyperparameters of a pooling layer in a CNN?
(a) Learning rate, epochs, batch size.
(b) Filter size, Stride, Max or average pooling.
(c) Number of filters, filter size, padding type.
(d) Activation function, dropout rate, number of neurons.
Answer:
[12 pts] Calculation Questions. You should include reasonable calculation steps and a clear result.
7. [8 pts] Given a 4 × 4 input I, calculate the result of the convolution with a 2 × 2 filter K, bias = 1,
stride = 1, and no padding.
I :
1
1
1
1
0
0
1
0
0
0
1
1
1
1
1
0
K :
-1
1
1
1
bias: 1
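Note: the operation asked for above can be sketched in plain Python as follows. This is a minimal illustration, assuming the CNN convention in which "convolution" slides the filter without flipping it (cross-correlation). The function name `conv2d_valid` and the 3 × 3 demo input are hypothetical and are not the exam's data.

```python
def conv2d_valid(image, kernel, bias=0, stride=1):
    """Slide `kernel` over `image` with no padding (CNN-style cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Sum of element-wise products over the current window, plus the bias.
            total = bias
            for i in range(kh):
                for j in range(kw):
                    total += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 3 x 3 input and 2 x 2 filter (NOT the exam's I and K).
demo_image = [[1, 2, 0],
              [0, 1, 3],
              [2, 1, 1]]
demo_kernel = [[1, 0],
               [0, -1]]
demo_output = conv2d_valid(demo_image, demo_kernel, bias=1)
print(demo_output)  # [[1, 0], [0, 1]]
```

With stride 1 and no padding, an n × n input and an f × f filter give an (n - f + 1) × (n - f + 1) output, so a 4 × 4 input with a 2 × 2 filter yields a 3 × 3 result.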
8. [4 pts] Given an input I, calculate the result of the AVERAGE pooling with a 3 × 3 filter, stride = 3.
(The result should be rounded to two decimals using normal rounding rules.)
I :
11
3
5
2
5
7
0
6
7
3
4
8
10
8
9
1
3
2
3
8
10
11
3
15
9
8
24
13
7
6
10
10
16
8
6
0
8
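Note: average pooling with non-overlapping windows can be sketched as below. This is an illustrative sketch only. The function name `average_pool` and the 3 × 6 demo input are hypothetical, and "normal rounding rules" is assumed here to mean round-half-up, which is why the sketch uses `decimal` rather than Python's built-in `round` (which rounds halves to even).

```python
from decimal import Decimal, ROUND_HALF_UP


def average_pool(image, size, stride):
    """Average-pool `image` with a size x size window, rounding to two decimals."""
    out_h = (len(image) - size) // stride + 1
    out_w = (len(image[0]) - size) // stride + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            window_sum = sum(
                image[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            )
            # Decimal division, then round half up to two decimal places.
            mean = Decimal(window_sum) / Decimal(size * size)
            row.append(mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))
        output.append(row)
    return output


# Hypothetical 3 x 6 input (NOT the exam's data): two 3 x 3 windows side by side.
demo_input = [[1, 2, 3, 0, 0, 1],
              [4, 5, 6, 0, 1, 0],
              [7, 8, 10, 1, 0, 0]]
demo_pooled = average_pool(demo_input, size=3, stride=3)
print([[str(v) for v in row] for row in demo_pooled])  # [['5.11', '0.33']]
```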
Question 2: Generative Adversarial Network
[30 pts] Multiple Choice Questions. Each question has only ONE correct answer. There are a total of 15
questions, each worth 2 points. Your answer should be a), b), c), or d).
1. What is the primary objective of a Generative Adversarial Network (GAN)?
(a) Image classification
(b) Image generation
(c) Text classification
(d) Text translation
Answer:
2. What are the two main components of a GAN?
(a) Generator and encoder
(b) Discriminator and decoder
(c) Generator and discriminator
(d) Encoder and discriminator
Answer:
3. Which component of a GAN is responsible for generating synthetic samples?
(a) Generator
(b) Discriminator
(c) Encoder
(d) Decoder
Answer:
4. Which component of a GAN is responsible for distinguishing between real and generated samples?
(a) Generator
(b) Discriminator
(c) Encoder
(d) Decoder
Answer:
5. What is the training process in a GAN called?
(a) Supervised learning
(b) Reinforcement learning
(c) Unsupervised learning
(d) Adversarial learning
Answer:
6. Which loss function is commonly used in GANs?
(a) Cross-entropy loss
(b) Mean squared error loss
(c) Binary logistic loss
(d) Kullback-Leibler divergence
Answer:
7. What is the purpose of noise input in the generator component of a GAN?
(a) To randomize the generation process and introduce variations
(b) To control the learning rate during training
(c) To adjust the weights and biases of the generator
(d) None of the above
Answer:
8. What is the role of the discriminator in a GAN during the training process?
(a) To provide feedback to the generator and help it improve
(b) To generate synthetic samples
(c) To adjust the learning rate during training
(d) None of the above
Answer:
9. What is the purpose of the latent space in a GAN?
(a) To represent the high-dimensional space of real samples
(b) To control the diversity and characteristics of generated samples
(c) To adjust the learning rate during training
(d) None of the above
Answer:
10. Which type of GAN is designed to generate samples conditioned on specific input information?
(a) Unconditional GAN
(b) Wasserstein GAN
(c) Progressive GAN
(d) Conditional GAN
Answer:
11. What is the purpose of the reconstruction loss in a GAN with an encoder component?
(a) To encourage the encoder to produce meaningful latent representations
(b) To control the learning rate during training
(c) To adjust the weights and biases of the generator
(d) None of the above
Answer:
12. How does the training process of a GAN typically work?
(a) The generator and discriminator are trained alternately
(b) The generator and discriminator are trained simultaneously
(c) The generator is trained first, followed by the discriminator
(d) The discriminator is trained first, followed by the generator
Answer:
13. What is the purpose of the evaluation metrics in GANs?
(a) To measure the quality and diversity of generated samples
(b) To adjust the learning rate during training
(c) To adjust the weights and biases of the generator
(d) None of the above
Answer:
14. What is the purpose of the adversarial loss in GANs?
(a) To encourage the generator to produce samples that deceive the discriminator
(b) To control the learning rate during training
(c) To adjust the weights and biases of the generator
(d) None of the above
Answer:
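Note: the adversarial objective referred to in this section can be illustrated with scalar losses. This is a rough sketch, assuming the discriminator outputs a probability in (0, 1) that a sample is real, and using the commonly cited non-saturating generator loss as one possible choice. The function names are hypothetical.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy with real samples labeled 1 and generated samples labeled 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating generator loss: small when the discriminator rates fakes as real."""
    return -math.log(d_fake)


# d_real / d_fake are hypothetical discriminator outputs on a real and a generated sample.
print(discriminator_loss(d_real=0.9, d_fake=0.1))  # low: discriminator separates the two well
print(generator_loss(d_fake=0.1))                  # high: generator is not fooling it
print(generator_loss(d_fake=0.9))                  # low: generator is fooling it
```

The two losses pull in opposite directions on `d_fake`, which is what makes the training adversarial.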
15. What is the purpose of the discriminator regularization in GANs?
(a) To prevent overfitting of the discriminator
(b) To control the learning rate during training
(c) To adjust the weights and biases of the generator
(d) None of the above
Answer:
Question 3: Transfer Learning
[20 pts] Multiple Choice Questions. Each question has only ONE correct answer. There are a total of 10
questions, each worth 2 points. Your answer should be a), b), c), or d). If you encounter questions where
multiple options seem to be correct, please choose the option that you believe is the most appropriate answer.
In cases where some concepts may not be fully covered in the course materials, use the process of elimination
among the options to make your best choice. If a question is ultimately considered to be ambiguous, it will
be omitted from the grading process.
1. What is Transfer Learning primarily used for?
(a) Reducing training time for new models.
(b) Improving model accuracy on unrelated tasks.
(c) Utilizing knowledge from one domain to improve learning in another related domain.
(d) Simplifying complex models into simpler ones.
Answer:
2. In Transfer Learning, what is ‘Fine-tuning’ typically used for?
(a) Adjusting model parameters for a new task.
(b) Reducing the model size for faster inference.
(c) Evaluating model performance on the source task.
(d) Selecting the best source model for transfer.
Answer:
3. In Transfer Learning, which strategy is typically employed when adapting a pre-trained model to a
new, but related task in image recognition?
(a) Retraining all layers from scratch.
(b) Freezing the initial layers and fine-tuning the deeper layers.
(c) Fine-tuning only the input and output layers.
(d) Freezing all layers without any modifications.
Answer:
4. What is a key challenge in Transfer Learning, especially when the target data is limited?
(a) Selecting irrelevant source data.
(b) Overfitting the model to the target data.
(c) Underfitting the model to the target data.
(d) Ensuring equal distribution of classes in the target data.
Answer:
5. In the context of Transfer Learning, what does ‘Domain-adversarial training’ aim to achieve?
(a) Reducing the domain gap between source and target data.
(b) Increasing model complexity for better performance.
(c) Isolating domain-specific features from the model.
(d) Enhancing the feature extraction capabilities of the model.
Answer:
6. In Transfer Learning, ‘Layer Transfer’ refers to:
(a) Transferring only the output layer of the source model.
(b) Increasing model complexity for better performance.
(c) Isolating domain-specific features from the model.
(d) Enhancing the feature extraction capabilities of the model.
Answer:
7. Which statement best describes ‘Progressive Neural Networks’ in the context of Transfer Learning?
(a) Networks that improve their performance progressively with more data.
(b) Networks that add new layers for each new task while retaining learned features.
(c) Networks that progressively reduce their complexity for different tasks.
(d) Networks that use progressive regularization techniques.
Answer:
8. What does ‘Zero-shot Learning’ aim to achieve in Transfer Learning?
(a) Learning to perform tasks without any training data.
(b) Classifying objects or concepts not seen during training.
(c) Reducing the model size to zero for efficient storage.
(d) Achieving zero errors in model predictions.
Answer:
9. In Transfer Learning, what is the primary purpose of ‘Self-taught Clustering’ ?
(a) To group similar tasks together for efficient learning.
(b) To learn from unlabeled data in a different domain from the target task.
(c) To teach the model to automatically label the data.
(d) To cluster the target data for better feature extraction.
Answer:
10. ‘Domain-adversarial Training’ in Transfer Learning is most similar to which other machine learning
concept?
(a) Supervised learning.
(b) Reinforcement learning.
(c) Generative Adversarial Networks (GANs).
(d) Convolutional Neural Networks (CNNs).
Answer:
Question 4: Semi-supervised Learning
[20 pts] Multiple Choice Questions. Each question has only ONE correct answer. There are a total of 10
questions, each worth 2 points. Your answer should be a), b), c), or d). If you encounter questions where
multiple options seem to be correct, please choose the option that you believe is the most appropriate answer.
In cases where some concepts may not be fully covered in the course materials, use the process of elimination
among the options to make your best choice. If a question is ultimately considered to be ambiguous, it will
be omitted from the grading process.
1. What characterizes Semi-supervised Learning?
(a) Using only labeled data for training models.
(b) Using a large amount of unlabeled data with a small amount of labeled data.
(c) Relying exclusively on unlabeled data.
(d) Using equal amounts of labeled and unlabeled data.
Answer:
2. What is the ‘Low-density Separation Assumption’ in Semi-supervised Learning?
(a) Class boundaries are likely to lie in high-density regions.
(b) Class boundaries are likely to lie in low-density regions.
(c) Data points in different classes have high similarity.
(d) High-density regions have a uniform class distribution.
Answer:
3. In the context of Semi-supervised Learning, what does ‘Self-training’ involve?
(a) Training a model exclusively on unlabeled data.
(b) Using the model’s predictions to label unlabeled data and retrain the model.
(c) Relying on external annotations for labeling data.
(d) Using multiple models to label each other’s data.
Answer:
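Note: a self-training loop can be sketched on toy 1-D data as below. This is only an illustration under stated assumptions: the base model is a nearest-centroid classifier, "confidence" is the gap between the distances to the two closest centroids, and at least two classes are present. The names `centroids`, `self_train`, and `margin` are hypothetical.

```python
def centroids(labeled):
    """Mean feature value per class from (x, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def self_train(labeled, unlabeled, margin=1.0, max_rounds=10):
    """Repeatedly pseudo-label confident points and refit the centroids."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled)
        confident, remaining = [], []
        for x in pool:
            # Rank classes by distance from x to their centroid.
            ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
            gap = abs(x - cents[ranked[1]]) - abs(x - cents[ranked[0]])
            if gap >= margin:
                confident.append((x, ranked[0]))  # keep the pseudo-label
            else:
                remaining.append(x)               # too ambiguous for now
        if not confident:
            break
        labeled += confident
        pool = remaining
    return centroids(labeled), pool


# Hypothetical toy data: two labeled points and five unlabeled ones.
final_centroids, leftover = self_train(
    labeled=[(0.0, "a"), (10.0, "b")],
    unlabeled=[1.0, 2.0, 9.0, 8.0, 5.0],
)
print(final_centroids, leftover)  # {'a': 1.0, 'b': 9.0} [5.0]
```

The point at 5.0 stays unlabeled because it is equally far from both centroids, which shows why a confidence threshold is usually part of such a loop.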
4. What is the ‘Smoothness Assumption’ in the context of Semi-supervised Learning?
(a) If two points in a high-density region are close to each other, they are likely to share the same
label.
(b) The transition of labels in the feature space is always gradual and predictable.
(c) The classification boundary should be as smooth as possible for better generalization.
(d) Smoothing techniques are applied to the data to make it easier to classify.
Answer:
5. What is a common approach to applying Semi-supervised Learning to Generative Models?
(a) Using only labeled data to train the model.
(b) Ignoring the unlabeled data during the training process.
(c) Utilizing unlabeled data to re-estimate model parameters.
(d) Treating all unlabeled data as belonging to a single new class.
Answer:
6. What is the main advantage of using Semi-supervised Learning over purely Supervised Learning?
(a) It requires no labeled data at all.
(b) It can make use of a large amount of easily obtainable unlabeled data.
(c) It is always more accurate than supervised methods.
(d) It completely eliminates the need for data labeling.
Answer:
7. Which approach in Semi-supervised Learning involves training a model on labeled data and then
applying it to label unlabeled data?
(a) Graph-based approach.
(b) Generative model approach.
(c) Self-training approach.
(d) Smoothness assumption approach.
Answer:
8. In Semi-supervised Learning, what does the ‘Entropy-based Regularization’ aim to achieve?
(a) To ensure that the distribution of labels is as uniform as possible.
(b) To minimize the complexity of the model.
(c) To maximize the uncertainty in model predictions.
(d) To minimize the uncertainty in the classification of unlabeled data.
Answer:
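Note: the quantity that an entropy-based regularizer penalizes can be computed as below. This is a minimal sketch, assuming the model outputs a class-probability vector for each unlabeled example and that the regularizer is the mean Shannon entropy of those vectors. The function names are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_regularizer(predictions):
    """Mean prediction entropy over a batch of unlabeled examples."""
    return sum(entropy(p) for p in predictions) / len(predictions)


# Hypothetical predictions on unlabeled data.
confident_batch = [[0.99, 0.01], [0.02, 0.98]]
uncertain_batch = [[0.5, 0.5], [0.6, 0.4]]
print(entropy_regularizer(confident_batch))  # close to 0
print(entropy_regularizer(uncertain_batch))  # close to ln 2
```

Adding this term to the supervised loss rewards predictions on unlabeled data that commit to one class.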
9. The ‘Graph-based Approach’ in Semi-supervised Learning typically involves:
(a) Constructing a graph where each node is a data point and edges represent similarity.
(b) Using a graphical model to represent probabilistic relationships between data points.
(c) Drawing decision boundaries in a graphical form.
(d) Visualizing data distribution on a graph for manual labeling.
Answer:
10. In Semi-supervised Learning, the ‘Semi-supervised SVM’ is used for:
(a) Maximizing the margin between data points of different classes.
(b) Clustering data points before classification.
(c) Reducing the dimensionality of the feature space.
(d) Performing regression analysis on unlabeled data.
Answer:
Question 5: Reinforcement Learning
[10 pts] Multiple Choice Questions. Each question has only ONE correct answer. There are a total of 5
questions, each worth 2 points. Your answer should be a), b), c), or d).
1. Which of the following DOES NOT apply RL?
(a) ChatGPT.
(b) AlphaGo.
(c) Traditional object detection.
(d) Robotics.
Answer:
2. Which of the following statements about RL is NOT true?
(a) RL learns from experience.
(b) RL can be viewed as an optimization problem.
(c) RL needs to explore the world for better decision making.
(d) Decisions made in RL only impact the present.
Answer:
3. Given the following statements about Markov Decision Process (MDP), which one is NOT true?
(a) Given the Markov assumption, a state is a sufficient statistic of entire observed history.
(b) The transition and reward functions of an MDP can be represented as large tables.
(c) We can set a discount factor γ = 1 in finite horizon problems.
(d) In an MDP, a small discount factor γ means long-term rewards are more influential than short-term
rewards.
Answer:
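Note: the effect of the discount factor γ on a finite reward sequence can be checked numerically. This is a small sketch, assuming the usual definition of the return as the sum of γ^t · r_t. The function name and the reward sequence are hypothetical.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite reward sequence."""
    total = 0.0
    # Work backwards so each step applies one more factor of gamma to later rewards.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: small early rewards, one large late reward.
episode = [1, 1, 10]
print(discounted_return(episode, gamma=1.0))  # 12.0
print(discounted_return(episode, gamma=0.5))  # 4.0
print(discounted_return(episode, gamma=0.0))  # 1.0
```

The late reward of 10 contributes 10 when γ = 1, 2.5 when γ = 0.5, and nothing when γ = 0.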
4. Given the following statements about value and policy, which one is TRUE?
(a) A stochastic policy maps states to exact actions.
(b) In a finite horizon problem, the optimal policy is sensitive to the length of the horizon.
(c) The optimal policy for an MDP in an infinite horizon problem is not stationary.
(d) In an MDP, there might be multiple optimal value functions derived by value iteration.
Answer:
5. Given the following statements about Q Learning and Deep Q Network (DQN), which one is NOT true?
(a) Q Learning is a model-based RL algorithm.
(b) Q Learning is an off-policy RL algorithm.
(c) Replay buffer is used for breaking correlations between samples when training a DQN.
(d) The target network is always needed for stabilizing the learning process when training a DQN.
Answer:
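Note: a single tabular Q-learning update can be sketched as below. This is an illustration only, assuming a table keyed by (state, action) pairs, a fixed action set, and no special handling of terminal states. The names `q_learning_update`, `q_table`, and the states and actions are hypothetical.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Apply one Q-learning step to the table `q` in place and return the new value."""
    # The target bootstraps from the best next action, whichever action is actually taken next.
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical two-action problem; unseen entries default to 0.0.
q_table = defaultdict(float)
actions = ["left", "right"]
first = q_learning_update(q_table, "s0", "right", 1.0, "s1", actions)
second = q_learning_update(q_table, "s0", "right", 1.0, "s1", actions)
print(first, second)  # 0.5 0.75
```

The update uses only the sampled transition (state, action, reward, next state) and never queries a transition or reward model.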