Assignment Template

Reading Assignment
### This is a template for the CS449 reading assignments.
### All text following "###" is a comment, and you should delete it before turning in your response.
### Label the questions as you answer them, so we know where each answer ends and the next begins.
### Be sure to write the title of the paper you are responding to.
Paper:
UNDERSTANDING DEEP LEARNING REQUIRES RETHINKING GENERALIZATION
Academic integrity: You must write your answers by yourself, and you may not copy-paste from the paper
itself. Copying from the paper may result in you receiving a zero on the assignment. You may talk about the
papers with at most one other student, but you may not collaborate with them while writing your answers. If
you’ve discussed the papers with another student, please indicate that here.
I didn’t discuss this paper with other students
Question 1:
What are the main claims of this paper? Does it propose a new method that it claims is better than existing
methods? Does it provide a theoretical insight into how existing methods work? Is it making an argument
about the role of ML systems in society? Why do these claims matter?
Answer 1:
This paper focuses on over-parameterized models, where the number of parameters p exceeds the number of
data points n (p > n, and more importantly p >> n). It argues that we do not yet understand why such models
generalize, and that explicit regularization isn’t an exact science yet, although implicit regularization (built into
the model and its training procedure) can be helpful. It shows that neural networks in this regime have enough
capacity to memorize the training inputs rather than find patterns. This implies that using large models for
small datasets might be counterproductive, as they could overfit the training data.
Question 2:
What evidence does the paper provide to support its claims? For example, does it evaluate the empirical
performance of models on specific datasets? Does it explore case studies? Describe this evidence in detail
(e.g., which datasets were used? Which experiments?). What is a major strength of this evidence? What is a
major weakness of this evidence? Why?
Answer 2:
The paper feeds randomly labeled and corrupted data to the neural networks, and the evidence shows that the
networks fit the training labels perfectly even though the labels have no obvious relation to the inputs. This is
evident in Figure 1, where the Inception model with modified labels (true, partial, Gaussian and random) fits
the CIFAR10 training set perfectly. Similarly surprising overfitting results are seen on the ImageNet dataset
(Appendix, Table 2). Table 1 shows that regularization helps with generalization, but not very significantly.
However, this view is contradicted later on, when it is implied that different combinations of model
architectures and regularization techniques have varied results on varied datasets (i.e., the effect can be
significant).
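The randomization test described in Answer 2 can be illustrated on a much smaller scale. The sketch below is not the paper's setup: instead of Inception on CIFAR10 it uses a linear model with more parameters than data points (p > n) on synthetic Gaussian inputs, and the function name fit_random_labels and all sizes are assumptions made for illustration. It only shows that such a model can fit purely random labels exactly while doing no better than chance on fresh data.

```python
import numpy as np

def fit_random_labels(n=50, p=200, n_test=2000, seed=0):
    """Fit an over-parameterized linear model (p > n) to random +/-1 labels.

    Hypothetical toy setup, not the paper's experiment.
    Returns (training accuracy, test accuracy).
    """
    rng = np.random.default_rng(seed)

    # Inputs are Gaussian noise; labels are coin flips unrelated to the inputs.
    X_train = rng.standard_normal((n, p))
    y_train = rng.choice([-1.0, 1.0], size=n)

    # With p > n the system X w = y is underdetermined, so least squares
    # can match every training label exactly.
    w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    train_acc = np.mean(np.sign(X_train @ w) == y_train)

    # Fresh random data: there is nothing to learn, so accuracy stays near 50%.
    X_test = rng.standard_normal((n_test, p))
    y_test = rng.choice([-1.0, 1.0], size=n_test)
    test_acc = np.mean(np.sign(X_test @ w) == y_test)
    return train_acc, test_acc

train_acc, test_acc = fit_random_labels()
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The training accuracy is perfect even though the labels carry no information, which is the memorization effect in miniature; the test accuracy hovers around chance.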
Question 3:
Imagine you wanted to write a pop quiz to assess whether someone else had read this paper. Write a question
that could be answered by someone who closely read the paper, regardless of whether they had the paper in
front of them. See the Canvas “Reading Assignments” page for an example Question/Answer pair.
Answer 3:
Question: How do the authors explain the significance of implicit regularization in the paper?
Answer: The authors first compare the performance of the models with and without explicit
regularization, and the difference turns out to be insignificant (for most models). They then look at the
regularization inherent in the models and in how they are trained. They argue that components such as
SGD contribute to generalization by acting as implicit regularizers.
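One way to make the idea of implicit regularization concrete is the linear case. The sketch below is a simplification under stated assumptions: it uses full-batch gradient descent rather than SGD, a synthetic underdetermined least-squares problem, and a hypothetical helper named gd_min_norm_gap. Starting from zero weights, gradient descent ends up at the minimum-norm solution among all weight vectors that fit the data, even though no explicit penalty on the weights is applied.

```python
import numpy as np

def gd_min_norm_gap(n=20, p=100, steps=2000, seed=0):
    """Run gradient descent on 0.5 * ||X w - y||^2 with p > n, starting at w = 0.

    Hypothetical toy setup. Returns (distance to the minimum-norm solution,
    norm of the training residual).
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)

    # Step size 1 / sigma_max(X)^2 keeps the iteration stable.
    lr = 1.0 / np.linalg.norm(X, 2) ** 2

    w = np.zeros(p)
    for _ in range(steps):
        # Each update is a combination of rows of X, so w never leaves
        # the row space of X.
        w -= lr * X.T @ (X @ w - y)

    # The pseudoinverse gives the minimum-norm solution of X w = y.
    w_min_norm = np.linalg.pinv(X) @ y
    return np.linalg.norm(w - w_min_norm), np.linalg.norm(X @ w - y)

gap, residual = gd_min_norm_gap()
print(f"distance to min-norm solution: {gap:.2e}, residual: {residual:.2e}")
```

Both printed values should be close to zero: the optimizer fits the data exactly and, because of where it starts and how it moves, picks the smallest-norm fit without being told to.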
Question 4:
Ask a question that is neither asked nor answered by the paper. This should be a high-level question about
future work related to the paper. Provide a description of why this question is important and how it's
informed by what you learned from reading this paper.
### Your question should be one to two sentences. Describe why it matters in one to two sentences.