R Programming

R is a programming language for statistical computing and graphics. Besides being one of the most widely used analytics tools, R is also one of the most popular tools for data visualization. It is primarily used in data science. R is free, open-source software, and it compiles and runs on a wide variety of operating systems. Its basic syntax is simple and easy to learn.

How to get started?

You first need to install R and RStudio to start working with the R programming language.

The steps to download R are:
1. Google “download R” and select the first link that appears.
Or
2. Open this link directly: https://cran.r-project.org/
   a) Click on Download R for Windows
   b) Click on base
   c) Click on Download R 4.0.2 (or the latest version) for Windows to download the setup.
Once the setup is downloaded, install R.

The steps to download RStudio are:
1. Google “download RStudio” and select the first link that appears.
Or
2. Download it directly from this link: https://rstudio.com/products/rstudio/download/#download
Once downloaded, install RStudio.

Variables

A variable is a name that refers to a value stored in memory.
Syntax:
variable_name <- value
OR
variable_name = value
To see the data type of a variable, use the class() function.
To see the list of variables, use the ls() function.
Example: after x <- 5, class(x) returns "numeric" and ls() includes "x" in its result.

Operators

Arithmetic Operators: + (addition), - (subtraction), * (multiplication), / (division), ^ (exponent), %% (modulus), %/% (integer division)
Relational Operators: <, >, <=, >=, == (equal to), != (not equal to)
Logical Operators: ! (NOT), & (element-wise AND), | (element-wise OR), && and || (AND and OR on single values)

Datatypes

When we create a variable, R stores it in memory according to the value assigned to it: the type of the value determines how memory is allocated. There are five data types in R:
1. Vectors: a combination of values

Basic functions for data frames
dim(): shows the dimensions of the data frame by row and column
str(): shows the structure of the data frame
summary(): provides summary statistics on the columns of the data frame
colnames(): shows the name of each column in the data frame
head(): shows the first 6 rows of the data frame
tail(): shows the last 6 rows of the data frame
View(): shows a spreadsheet-like display of the entire data frame

Now, let’s import a data set and see how each of these functions works. First, here’s the code:

### Import a data set on violent crime by state and assign it to the data frame "crime"
crime <- read.csv("http://vincentarelbundock.github.io/Rdatasets/csv/datasets/USArrests.csv", stringsAsFactors = FALSE)

### Call the functions on crime to examine the data frame
dim(crime)
str(crime)
summary(crime)
colnames(crime)

### The head() and tail() functions default to 6 rows, but we can adjust the number of rows using the "n = " argument
head(crime, n = 10)
tail(crime, n = 5)

### While the first 6 functions print to the console, the View() function opens a table in another window
View(crime)

Running the code shows what each function returns.

What is Summary Statistics/Descriptive Statistics?

Data gathered for an analysis is only useful when it is presented so that everyone can understand it easily and use it to make sound decisions. After carrying out the analysis, we summarize the results so that they are easier to understand. This is known as summarizing the data. We can summarize our data in R as follows:

Descriptive/Summary Statistics – Descriptive statistics represent the key information about our datasets. They also form the basis for more complex computations and analysis. Even though they are computed with simple methods, they play a crucial role in the process of analysis.
Tabulation – Representing the analyzed data in tabular form for easy understanding.
Graphical – Representing the data graphically.

https://data-flair.training/blogs/descriptive-statistics-in-r/amp/

How to replace values using replace() in R:
https://www.journaldev.com/39695/replace-in-r

Encoding in R
https://www.pluralsight.com/guides/encoding-data-with-r
https://datatricks.co.uk/one-hot-encoding-in-r-three-simple-methods
https://www.analytics-link.com/post/2017/08/25/how-to-r-one-hot-encoding

RESOURCES
http://www.sthda.com/english/wiki/easy-r-programming-basics
http://www.sthda.com/english/wiki/descriptive-statistics-and-graphics
https://data-flair.training/blogs/descriptive-statistics-in-r/
https://cran.r-project.org/manuals.html
https://www.statsandr.com/blog/descriptive-statistics-in-r/#histogram
https://uc-r.github.io/missing_values
https://www.edureka.co/community/2995/splitting-the-data-into-training-and-testing-sets-r
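
As a small illustration of the tabulation, replace() and encoding topics linked above, here is a minimal sketch in base R. The survey data frame, its column names and its values are made up for this example, and treating -1 as a "missing" placeholder is an assumption. model.matrix() is only one of several ways to one-hot encode a column; the linked guides may use other methods.

```r
# Hypothetical toy data: names and values are invented for illustration only.
survey <- data.frame(
  city  = c("Pune", "Delhi", "Pune", "Mumbai"),
  score = c(10, -1, 7, 3),
  stringsAsFactors = FALSE
)

# replace(x, list, values): swap the assumed placeholder -1 for NA in score.
survey$score <- replace(survey$score, survey$score == -1, NA)

# Tabulation: count how often each city occurs.
city_counts <- table(survey$city)

# One-hot encoding with base R: model.matrix() builds one 0/1 column per city.
# The "- 1" drops the intercept so that every city gets its own column.
city_onehot <- model.matrix(~ city - 1, data = survey)

print(survey)
print(city_counts)
print(city_onehot)
```

Each row of city_onehot contains a single 1, in the column for that row's city. For models that include an intercept, the "- 1" is usually left out so that one level acts as the baseline.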