Deep Learning Generalization Reading Assignment

Paper: UNDERSTANDING DEEP LEARNING REQUIRES RETHINKING GENERALIZATION

Academic integrity: You must write your answers by yourself, and you may not copy-paste from the paper itself. Copying from the paper may result in you receiving a zero on the assignment. You may talk about the papers with at most one other student, but you may not collaborate with them while writing your answers. If you've discussed the papers with another student, please indicate that here.

I didn't discuss this paper with other students.

Question 1: What are the main claims of this paper? Does it propose a new method that it claims is better than existing methods? Does it provide a theoretical insight into how existing methods work? Is it making an argument about the role of ML systems in society? Why do these claims matter?

Answer 1: This paper mainly looks at models where the number of parameters p exceeds the number of data points n (p > n, or more importantly p >> n). It argues that we do not really understand how such models generalize, and that applying regularization is not yet an exact science, although implicit regularization (built into the model and its training) can be helpful. It states that neural networks in this regime can memorize their inputs rather than find patterns that generalize. This implies that using large models on small datasets might be counterproductive, since they could overfit the training data.

Question 2: What evidence does the paper provide to support its claims? For example, does it evaluate the empirical performance of models on specific datasets? Does it explore case studies? Describe this evidence in detail (e.g., which datasets were used? Which experiments?). What is a major strength of this evidence? What is a major weakness of this evidence? Why?

Answer 2: The paper feeds randomly labeled and corrupted data to the neural networks, and the evidence shows that the networks are able to fit the labels perfectly even though there is no real relationship between inputs and labels. This is evident in Figure 1, where the Inception model fits the CIFAR10 training set perfectly whether the labels and inputs are left intact or modified (true labels, partially corrupted labels, random labels, and Gaussian-noise inputs). Similar surprising overfitting results are seen on the ImageNet dataset (Appendix, Table 2). Table 1 shows that regularization helps generalization, but not very significantly. However, this view is contradicted later, when the paper implies that different combinations of model architecture and regularization technique give varied results on different datasets (i.e., the effect can be significant).

Question 3: Imagine you wanted to write a pop quiz to assess whether someone else had read this paper. Write a question that could be answered by someone who closely read the paper, regardless of whether they had the paper in front of them. See the Canvas "Reading Assignments" page for an example Question/Answer pair.

Answer 3:
Question: How do the authors explain the significance of implicit regularization in the paper?
Answer: The authors first compare the performance of the models with and without explicit regularization; the difference turns out to be insignificant for most models. They then look at the regularization that is inherent in the model architecture and its training. They argue that parts of the setup, such as SGD, act as implicit regularizers.

Question 4: Ask a question that is neither asked nor answered by the paper. This should be a high-level question about future work related to the paper. Provide a description of why this question is important and how it's informed by what you learned from reading this paper.
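
As a rough illustration of the random-label experiment described in Answer 2, the sketch below fits an over-parameterized linear model (p > n) to purely random labels. This is only a stand-in under stated assumptions: the paper's experiments use deep networks such as Inception on CIFAR10 and ImageNet, whereas here the data are synthetic Gaussian features, the model is linear, and the function name and sizes (fit_random_labels, n=50, p=200) are hypothetical choices made for this example.

```python
import numpy as np

def fit_random_labels(n=50, p=200, seed=0):
    """Fit a linear model with p > n parameters to random +/-1 labels.

    Hypothetical illustration only; not the paper's experimental setup.
    """
    rng = np.random.default_rng(seed)

    # Synthetic inputs and labels with no relationship between them.
    X_train = rng.standard_normal((n, p))
    y_train = rng.choice([-1.0, 1.0], size=n)

    # With p > n the system is underdetermined, so least squares returns
    # the minimum-norm weights that interpolate the training labels.
    w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    train_acc = np.mean(np.sign(X_train @ w) == y_train)

    # Fresh random data: the memorized weights carry no usable signal here.
    X_test = rng.standard_normal((n, p))
    y_test = rng.choice([-1.0, 1.0], size=n)
    test_acc = np.mean(np.sign(X_test @ w) == y_test)
    return train_acc, test_acc

train_acc, test_acc = fit_random_labels()
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The training accuracy comes out perfect because the model has more free parameters than training points, while the accuracy on fresh random data should hover around chance. That gap between fitting and generalizing is the kind of behavior the randomization tests in the paper are meant to expose.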
