Data Analysis Guide Questions with Python and Pandas

Guide Questions

Reading the data

o How would you read a dataset using the read_csv method? What are the necessary arguments when using read_csv?
  The pandas function read_csv() reads delimited values; by default the delimiter is a comma. It returns a new DataFrame with the data and labels from the file (for example, data.csv) that you specify with the first argument, which is the only required one.
o How would you read a dataset using the open method in Python? What are the necessary arguments when using open?
o How would you access data from a URL? What other arguments are necessary when reading data from a URL?
o How would you read an Excel dataset? What other arguments are necessary when reading data in Excel format?
  read_excel() returns a new DataFrame that contains the values from data.xlsx. You can also use read_excel() with OpenDocument spreadsheets, or .ods files.

Summary, dimensions and structure of data

o How would you get the summary, dimensions and structure of your data?
  DataFrame.info() prints a concise summary of the DataFrame and is a quick way to get an overview of the dataset.
  The .size, .shape and .ndim attributes return the size, shape and number of dimensions of a DataFrame or Series.
    Syntax: dataframe.size
    Returns: the total number of elements, that is, rows x columns for a DataFrame.
    Syntax: dataframe.shape
    Returns: a tuple (rows, columns) for a DataFrame, or a one-element tuple (rows,) for a Series.
    Syntax: dataframe.ndim
    Returns: the number of dimensions: 1 for a Series, 2 for a DataFrame.
o How would you get the type of data in each column?
  DataFrame.dtypes returns a Series with the data type of each column. The result's index is the original DataFrame's columns.

Data cleaning activities: Handling missing values

o What are the approaches in handling missing values?
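
As a rough illustration of the reading, summary, and missing-value questions above, the sketch below loads a small table and inspects it. The CSV text, its column names, and the variable names are hypothetical stand-ins (an in-memory string is used instead of a real data.csv so the example runs on its own). The missing-value steps show two common approaches, dropping and mean-filling, not an exhaustive list.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file such as data.csv.
csv_text = """name,age,score
Ana,23,88.5
Ben,,92.0
Cara,31,
"""

# The first argument (a path, URL, or file-like object) is the only required one.
# sep defaults to "," and is spelled out here only for clarity.
df = pd.read_csv(io.StringIO(csv_text), sep=",")

df.info()               # prints column names, non-null counts, and dtypes
print(df.size)          # total number of elements: rows x columns
print(df.shape)         # (rows, columns)
print(df.ndim)          # 2 for a DataFrame
print(df["age"].ndim)   # 1 for a Series
print(df.dtypes)        # Series of dtypes indexed by column name

# Two common (assumed, not exhaustive) approaches to missing values:
missing_per_column = df.isna().sum()   # count missing values per column
dropped = df.dropna()                  # drop rows containing any missing value
filled = df.fillna({"age": df["age"].mean(), "score": df["score"].mean()})
```

Passing a real file path or a URL string as the first argument to read_csv() follows the same pattern; whether mean-filling is appropriate depends on the dataset.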
Understanding your data through fundamentals of statistics

o How would you get the following statistical descriptions from your dataset?
    - Mean
    - Median
    - Mode
    - Variance
    - Standard deviation
    - Percentiles
    - Ranges
o Identify possible outliers in your dataset.
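
One way to compute the statistics listed above is sketched here on a small hypothetical Series. The sample values and variable names are assumptions made for illustration. The 1.5 x IQR rule used to flag outliers is one common convention, not the only method.

```python
import pandas as pd

# Hypothetical sample values; 40 is included as a deliberately extreme point.
scores = pd.Series([2, 4, 4, 5, 7, 9, 40], name="score")

mean = scores.mean()
median = scores.median()
mode = scores.mode()                     # a Series, since there may be several modes
variance = scores.var()                  # sample variance (ddof=1) by default
std_dev = scores.std()                   # sample standard deviation (ddof=1) by default
q1, q3 = scores.quantile([0.25, 0.75])   # 25th and 75th percentiles
value_range = scores.max() - scores.min()

# describe() reports count, mean, std, min, quartiles, and max in one call.
print(scores.describe())

# IQR rule: flag values more than 1.5 * IQR outside the quartiles.
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = scores[(scores < lower) | (scores > upper)]
print(outliers)
```

The same methods work column-wise on a DataFrame, for example df.mean(numeric_only=True).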
