Amar Hawkins
Homework 1

1.) ML is a subfield of AI that focuses on building algorithms and statistical models that improve a computer system's performance on a task by learning from data. It mainly learns patterns and relationships from data in order to provide specific information or predictions for users. It is different from traditional programming because it does not follow a set of rules written in advance to solve a problem. 5 tasks include intrusion detection, malware detection, phishing detection, anomaly detection, and user behavior analysis.

2.) Supervised learning trains a model on labeled data, where the goal is to predict the correct label for new inputs. Unsupervised learning works with unlabeled data and looks for structure such as patterns or groups in the data.
a. 5 supervised tasks include spam email detection, network intrusion detection, malware family classification, phishing website detection, and user authentication.
b. 5 unsupervised tasks include anomaly detection, botnet detection, network segmentation, feature extraction, and data visualization.

3.) Linear regression is used to model the relationship between a dependent variable and one or more independent variables by fitting a linear equation to the data.
a. To train a linear regression model, a dataset is used in which both the features and the target values are known. The goal is to find the coefficients that produce the smallest sum of squared differences between the predicted values and the actual values in the training data.
b. MSE (mean squared error) and MAE (mean absolute error) are two losses.
c. Regularization techniques include L1 (Lasso), L2 (Ridge), and Elastic Net.

4.) Bias-variance decomposition is a crucial concept in understanding how ML models work. It splits a model's expected error into bias, variance, and noise that cannot be reduced. Bias is the error that comes from simplifying assumptions in the learning algorithm. Variance is the error that comes from the model's sensitivity to small changes or noise in the training data. It is important because lowering one usually raises the other, so the goal is to find the best balance between the two.

5.) Model selection revolves around choosing the right algorithm or model for the task.
Overfitting usually happens when the model learns the training data too well, including its noise, so it does poorly on new data. Underfitting happens when a model is too simple to capture the patterns in the data.

6.) A small online store that I opened sells shirts that get bought daily. I keep track of my sales for a year, and the resulting data shows the number I sold every day, which forms a time series.

7.) The technique used is Bayesian classification, which is basically like a smart detective: it weighs the evidence in an email, such as the words it contains, to estimate how likely the email is to be spam. We used it to sniff out the spam email!
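
To make the spam example in 7.) more concrete, here is a minimal sketch of Bayesian spam classification. It assumes a naive Bayes model over word counts with add-one (Laplace) smoothing, which is one common way to do it and not necessarily the exact setup used in class. The tiny training set and the names `TRAIN`, `train`, and `classify` are made up for illustration.

```python
import math
from collections import Counter

# Hypothetical toy training data (made up for illustration).
TRAIN = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting schedule for monday", "ham"),
    ("project report due monday", "ham"),
]

def train(examples):
    """Count words per class and how many emails each class has."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab

def classify(text, word_counts, class_counts, vocab):
    """Return the class with the highest log posterior score."""
    total = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Log prior: how common this class is in the training data.
        score = math.log(class_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Log likelihood of the word given the class, with add-one smoothing.
            count = word_counts[label][word]
            score += math.log((count + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAIN)
print(classify("free money", *model))
print(classify("monday meeting", *model))
```

The "detective" part is the scoring loop: each word adds evidence for or against spam, and the class with the most total evidence wins. Logs are used so that multiplying many small probabilities does not underflow to zero.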