Objective-type Quiz

A. Statistics and Probability

1. A fair coin is tossed twice. What is the probability of getting exactly one tail?
a) 1/4  b) 1/2  c) 3/4  d) 1/3
Answer: b) 1/2
Solution: Of the four equally likely outcomes, exactly one tail occurs for HT and TH, each with probability 1/4. Adding those probabilities gives 1/2.

2. A random variable follows a normal distribution with mean μ = 5 and standard deviation σ = 2. What is the probability that this random variable is less than 3?
a) 0.9772  b) 0.0228  c) 0.3413  d) 0.1587
Answer: d) 0.1587
Solution: Standardize the random variable with the Z-score formula Z = (X - μ) / σ, where X is the value of the random variable, μ is the mean, and σ is the standard deviation. For X = 3, μ = 5, and σ = 2, Z = (3 - 5) / 2 = -1. A standard normal table (or calculator) then gives P(X < 3) = P(Z < -1) ≈ 0.1587. (A numerical check appears after question 10 below.)

3. In hypothesis testing, if the p-value is less than or equal to the significance level, we...
a) Reject the null hypothesis  b) Do not reject the null hypothesis  c) Accept the null hypothesis  d) Accept the alternative hypothesis
Answer: a) Reject the null hypothesis
Solution: A low p-value means the observed data would be unlikely if the null hypothesis were true, so we reject the null hypothesis.

4. Which of the following is NOT a characteristic of a binomial distribution?
a) Consists of n identical trials  b) Each trial results in one of two outcomes  c) The probability of success changes from trial to trial  d) Trials are independent
Answer: c) The probability of success changes from trial to trial
Solution: In a binomial distribution, the probability of success is constant across all trials.

5. What is the purpose of sampling in statistics?
a) To make conclusions about a population based on a subset of that population  b) To observe every individual in a population  c) To conduct a hypothesis test  d) To eliminate sources of bias
Answer: a) To make conclusions about a population based on a subset of that population
Solution: Sampling allows estimation of population parameters without examining every individual in the population.

6. What is the variance of a standard normal distribution?
a) 1  b) 0  c) Infinite  d) Can't be determined
Answer: a) 1
Solution: A standard normal distribution has a mean of 0 and a variance of 1.

7. The probability that event A occurs is 1/3, and the probability that event B occurs given that A has occurred is 1/4. What is the joint probability of A and B?
a) 1/7  b) 1/12  c) 7/12  d) 4/12
Answer: b) 1/12
Solution: The joint probability of A and B is P(A ∩ B) = P(A) * P(B | A) = 1/3 * 1/4 = 1/12.

8. Which of the following distributions is used to describe the behavior of a count variable?
a) Normal distribution  b) Chi-Square distribution  c) Poisson distribution  d) F distribution
Answer: c) Poisson distribution
Solution: The Poisson distribution is often used to model the number of times an event occurs in a specified interval of time or space.

9. The sampling distribution of a statistic becomes approximately a normal distribution when...
a) The population is normally distributed, or the sample size is large.  b) The population is not normally distributed and the sample size is small.  c) The population is uniformly distributed, or the sample size is unknown.  d) The population is normally distributed, and the sample size is small.
Answer: a) The population is normally distributed, or the sample size is large.
Solution: By the Central Limit Theorem, if the population from which the sample is taken is normally distributed, or the sample size is large, the sampling distribution of the mean is approximately normal.

10. If events A and B are mutually exclusive, P(A ∩ B) is...
a) equal to P(A) * P(B)  b) equal to P(A) or P(B)  c) equal to zero  d) can't be determined
Answer: c) equal to zero
Solution: If events A and B are mutually exclusive (they cannot happen at the same time), the probability that both occur is zero.
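For readers who want to verify the probability answers above numerically, the following is a minimal sketch (not part of the quiz), assuming SciPy is installed. It checks questions 1 and 2 with scipy.stats instead of a Z-table.

```python
from scipy.stats import binom, norm

# Question 1: exactly one tail in two fair coin tosses (binomial with n=2, p=0.5)
p_one_tail = binom.pmf(k=1, n=2, p=0.5)
print(p_one_tail)  # 0.5

# Question 2: P(X < 3) for X ~ Normal(mean=5, sd=2); cdf handles the standardization
p_less_than_3 = norm.cdf(3, loc=5, scale=2)
print(round(p_less_than_3, 4))  # ~0.1587
```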
B. Machine Learning

1. Which of the following is an example of a supervised learning task?
a) Clustering customers into different segments  b) Predicting house prices based on various features  c) Identifying fraud in a dataset  d) Organizing news articles into different topics
Answer: b) Predicting house prices based on various features
Solution: Supervised learning predicts a target variable from labeled examples. Here the house prices are the targets, predicted from the given features.

2. Which of the following algorithms can be used for both classification and regression tasks?
a) K-means clustering  b) Support Vector Machines (SVM)  c) Hierarchical clustering  d) Apriori
Answer: b) Support Vector Machines (SVM)
Solution: SVMs can be used for both classification (by separating classes with a hyperplane) and regression (by fitting within a margin of error).

3. Which of the following techniques can be used to avoid overfitting in a decision tree?
a) Dimensionality reduction  b) Pruning  c) Clustering  d) Feature scaling
Answer: b) Pruning
Solution: Pruning removes branches that contribute little predictive value, making the tree simpler and more general and therefore less prone to overfitting. (Reducing the number of input features can also limit overfitting, but pruning is the technique applied to the tree itself.)

4. What is the main advantage of ensemble methods like Random Forest over single decision trees?
a) They are faster to train  b) They are simpler to understand  c) They reduce variance and improve accuracy  d) They require less memory
Answer: c) They reduce variance and improve accuracy
Solution: Ensemble methods like Random Forests combine multiple weak learners (decision trees) into a strong learner that reduces variance and improves prediction accuracy.

5. In the k-nearest neighbors (k-NN) algorithm, what does 'k' represent?
a) The number of clusters  b) The number of features  c) The number of neighbors used to predict the class of a given instance  d) The number of classes in the target variable
Answer: c) The number of neighbors used to predict the class of a given instance
Solution: In k-NN, 'k' is the number of nearest neighbors considered when predicting the class or value of a given instance.

6. A type of unsupervised learning where the algorithm learns the inherent structure of the data is known as...
a) Classification  b) Regression  c) Clustering  d) Reinforcement learning
Answer: c) Clustering
Solution: Clustering is an unsupervised learning technique in which the algorithm groups similar instances together based on the inherent structure of the data.

7. The purpose of Principal Component Analysis (PCA) is to:
a) Increase dimensionality of data  b) Reduce dimensionality of data  c) Increase variance retained  d) Reduce variance retained
Answer: b) Reduce dimensionality of data
Solution: PCA is a dimensionality reduction technique that transforms a large set of variables into a smaller one that still contains most of the information in the original set.

8. In a Support Vector Machine (SVM), what is the 'kernel trick' used for?
a) To speed up the SVM training process  b) To convert non-linearly separable data into linearly separable data by adding more dimensions  c) To determine the number of support vectors  d) To choose the optimal hyperparameters for the SVM
Answer: b) To convert non-linearly separable data into linearly separable data by adding more dimensions
Solution: The kernel trick transforms data that is not linearly separable in a lower-dimensional space into data that is linearly separable in a higher-dimensional space. (A short sketch follows question 10 of this section.)

9. What output can you expect from an unsupervised learning algorithm analyzing customer data?
a) Predicted customer churn rates  b) Customer segmentation groups  c) Predicted customer lifetime value  d) Future customer behavior
Answer: b) Customer segmentation groups
Solution: Unsupervised learning algorithms can sift through large volumes of customer data and group customers into segments based on common patterns, without using labeled outcomes.

10. Which one of the following evaluation metrics is mainly used for regression problems?
a) Precision  b) Recall  c) Mean Squared Error (MSE)  d) F1 Score
Answer: c) Mean Squared Error (MSE)
Solution: MSE measures the average of the squared errors between the actual and predicted values in a regression problem.
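As an illustration of the kernel trick from question 8, here is a minimal sketch (not part of the quiz), assuming scikit-learn is available; the dataset (make_circles) and parameter values are chosen only for demonstration. A linear SVM cannot separate two concentric rings of points, while an RBF-kernel SVM can, because the kernel implicitly maps the data into a higher-dimensional space where a separating hyperplane exists.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric circles: not linearly separable in the original 2-D space
X, y = make_circles(n_samples=500, noise=0.05, factor=0.5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear_svm = SVC(kernel="linear").fit(X_train, y_train)
rbf_svm = SVC(kernel="rbf").fit(X_train, y_train)

print("linear kernel accuracy:", linear_svm.score(X_test, y_test))  # roughly chance level
print("RBF kernel accuracy:   ", rbf_svm.score(X_test, y_test))     # close to 1.0
```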
C. Data Processing

1. What is the purpose of data transformation in data processing?
a) To handle missing values in the data  b) To normalize the values of features  c) To create new features from existing ones  d) To remove outliers from the dataset
Answer: b) To normalize the values of features
Solution: Data transformation scales or normalizes feature values to a common range, which helps many algorithms during training.

2. Which software library is commonly used for data manipulation and analysis in Python?
a) Scikit-learn  b) TensorFlow  c) Pandas  d) Keras
Answer: c) Pandas
Solution: Pandas is a popular Python library used for data manipulation, analysis, and exploration tasks.

3. What is the purpose of data wrangling in the data processing pipeline?
a) To handle missing values and outliers in the data  b) To transform and reshape the data into a suitable format  c) To perform statistical modeling and analysis  d) To visualize and interpret the data
Answer: b) To transform and reshape the data into a suitable format
Solution: Data wrangling involves manipulating, cleaning, and organizing the data into a format suitable for further analysis and modeling.

4. Which technique is commonly used to handle missing values in a dataset?
a) Data imputation  b) Feature scaling  c) Dimensionality reduction  d) Outlier detection
Answer: a) Data imputation
Solution: Data imputation estimates or substitutes missing values in a dataset with appropriate values. (A short sketch follows question 5.)

5. What is the first step in the data analysis process?
a) Data cleaning and preprocessing  b) Data visualization and exploration  c) Building machine learning models  d) Evaluating model performance
Answer: a) Data cleaning and preprocessing
Solution: The first step in data analysis is to clean and preprocess the data to ensure its quality and prepare it for further analysis.
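The following is a minimal sketch of the cleaning steps from questions 1-5 (not part of the quiz), assuming pandas is installed; the column names and values are made up for illustration. It imputes a missing value with the column median and min-max normalizes a feature to a common range.

```python
import pandas as pd

# Toy dataset with a missing value (made-up values for illustration)
df = pd.DataFrame({"age": [23, 35, None, 41],
                   "income": [40_000, 85_000, 62_000, 120_000]})

# Data imputation: replace the missing age with the column median
df["age"] = df["age"].fillna(df["age"].median())

# Data transformation: min-max normalize income to the range [0, 1]
df["income_scaled"] = (df["income"] - df["income"].min()) / (df["income"].max() - df["income"].min())

print(df)
```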
6. Which visualization technique is best suited for exploring the relationship between two numerical variables?
a) Bar plot  b) Line plot  c) Scatter plot  d) Histogram
Answer: c) Scatter plot
Solution: A scatter plot shows the distribution of data points for two numerical variables and reveals any patterns or correlations between them.

7. What is the purpose of feature engineering in data processing?
a) To transform numerical features into categorical features  b) To reduce the dimensions of the dataset  c) To create new features from existing ones  d) To handle missing values in the dataset
Answer: c) To create new features from existing ones
Solution: Feature engineering creates new features from existing ones that may better represent the underlying patterns and relationships in the data.

8. Which technique is commonly used to reduce the dimensions of a high-dimensional dataset?
a) One-hot encoding  b) Principal Component Analysis (PCA)  c) Feature scaling  d) Data imputation
Answer: b) Principal Component Analysis (PCA)
Solution: PCA reduces dimensionality by transforming high-dimensional data into a lower-dimensional representation while retaining most of the information.

9. What is the purpose of outlier detection in data preprocessing?
a) To remove missing values from the dataset  b) To handle class imbalance in the target variable  c) To identify potential errors or abnormalities in the data  d) To minimize the impact of noisy data on model performance
Answer: c) To identify potential errors or abnormalities in the data
Solution: Outlier detection identifies data points that deviate significantly from expected patterns and may indicate errors, anomalies, or interesting phenomena in the dataset. (A short sketch follows question 10.)

10. What is the most commonly used measure of central tendency?
a) Mean  b) Median  c) Mode  d) Range
Answer: a) Mean
Solution: The mean is the most commonly used measure of central tendency, representing the average value of a dataset.
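To close, here is a minimal sketch tying together questions 9 and 10 (not part of the quiz), assuming pandas is available; the data values are made up. It flags outliers with the interquartile range (IQR) rule, one common approach among several, and reports the mean, median, and mode.

```python
import pandas as pd

values = pd.Series([12, 14, 15, 15, 16, 18, 19, 95])  # 95 is an injected outlier

# IQR rule: points more than 1.5 * IQR beyond the quartiles are flagged as outliers
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
outliers = values[(values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)]
print("outliers:", outliers.tolist())  # [95]

# Measures of central tendency
print("mean:  ", values.mean())
print("median:", values.median())
print("mode:  ", values.mode().tolist())
```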