
ML questions

1. What are the issues in Machine Learning?
Ans. "Machine Learning" is one of the most popular technologies among data scientists
and machine learning enthusiasts. It is an effective Artificial Intelligence technique that
helps create automated learning systems which take future decisions without being
explicitly programmed for each case. It can be viewed as a family of algorithms that
automatically construct computer software from past experience and training data. It is
found in every industry, such as healthcare, education, finance, automobile, marketing,
shipping, infrastructure, and automation, and almost all big companies, such as Amazon,
Facebook, Google, and Adobe, use machine learning techniques to grow their businesses.
But everything in this world has a bright side as well as a dark side: Machine Learning
offers great opportunities, yet some issues still need to be solved.
Although machine learning is used in every industry and helps organizations make more
informed, data-driven choices that are more effective than classical methodologies, it still
has many problems that cannot be ignored. Here are some common issues in Machine
Learning that professionals face while building ML skills and creating applications from
scratch.
1. Inadequate Training Data
The major issue that arises when using machine learning algorithms is the lack of quality
as well as quantity of data. Although data plays a vital role in training machine learning
algorithms, many data scientists report that inadequate, noisy, and unclean data severely
degrade them. For example, a simple task may require thousands of samples, while an
advanced task such as speech or image recognition may need millions of examples.
Data quality is equally important for the algorithms to work well, yet it is often lacking in
Machine Learning applications. Data quality can be affected by factors such as the following:
Noisy data - responsible for inaccurate predictions, affecting both decisions and accuracy
in classification tasks.
Incorrect data - responsible for faulty behavior in machine learning models; hence,
incorrect data may also affect the accuracy of the results.
Generalization of output data - sometimes generalizing output data becomes complex,
which results in comparatively poor future actions.
2. Poor quality of data
As discussed above, data plays a significant role in machine learning, and it must be of
good quality. Noisy, incomplete, inaccurate, and unclean data lead to lower classification
accuracy and low-quality results. Hence, poor data quality is another major, common
problem when training machine learning algorithms.
3. Non-representative training data
To make sure the trained model generalizes well, the training data must be representative
of the new cases it will need to handle: it must cover the cases that have already occurred
as well as those still occurring.
If we use non-representative training data, the model produces less accurate predictions.
A machine learning model is said to be ideal if it predicts well for general cases and
provides accurate decisions. If there is too little training data, the model suffers from
sampling noise and the training set becomes non-representative; its predictions will then
be inaccurate and biased toward one class or group. Hence, we should use representative
training data to protect against bias and to make accurate predictions without drift.
4. Overfitting and Underfitting
Overfitting:
Overfitting is one of the most common issues faced by Machine Learning engineers and
data scientists. When a machine learning model is trained on a huge amount of data, it
starts capturing the noise and inaccuracies in the training data set, which negatively
affects the model's performance. Consider a simple example in which the training set
contains 1,000 mangoes, 1,000 apples, 1,000 bananas, and 5,000 papayas. Because the
training data is heavily biased toward papayas, there is a considerable probability that an
apple will be identified as a papaya, and predictions suffer. A common cause of overfitting
is the use of highly flexible non-linear methods, which can build unrealistic data models;
switching to linear and parametric algorithms is one way to overcome it.
Methods to reduce overfitting (regularization is sketched in code after this list):
Increase the amount of training data in the dataset.
Reduce model complexity by selecting a simpler model with fewer parameters.
Apply Ridge regularization or Lasso regularization.
Stop early during the training phase (early stopping).
Reduce the noise in the data.
Reduce the number of attributes in the training data.
Constrain the model.
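As a quick illustration of the regularization item above, here is a minimal sketch using scikit-learn that compares a plain linear model against Ridge and Lasso; the synthetic data and the alpha values are illustrative assumptions, not tuned settings:

```python
# Minimal sketch: Ridge/Lasso regularization versus a plain linear model.
# The synthetic data and alpha values below are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))                 # 20 features, only 3 informative
y = 3 * X[:, 0] + 2 * X[:, 1] - X[:, 2] + rng.normal(scale=0.5, size=100)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (LinearRegression(), Ridge(alpha=1.0), Lasso(alpha=0.1)):
    model.fit(X_train, y_train)
    print(type(model).__name__,
          "train R^2:", round(model.score(X_train, y_train), 3),
          "test R^2:", round(model.score(X_test, y_test), 3))
```

The regularized models typically give up a little training accuracy in exchange for better test accuracy, which is exactly the trade overfitting remedies aim for.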
Underfitting:
Underfitting is just the opposite of overfitting. When a machine learning model is trained
on too little data, it captures the patterns incompletely and inaccurately, which destroys
the accuracy of the model.
Underfitting occurs when the model is too simple to capture the underlying structure of the
data, just like an undersized pair of pants. It generally happens when we have limited data
in the data set and try to build a linear model with non-linear data. In such scenarios, the
model's rules are too simple for the data set, and the model starts making wrong
predictions.
Methods to reduce underfitting:
Increase model complexity.
Remove noise from the data.
Train on more and better features.
Reduce the constraints on the model.
Increase the number of training epochs to get better results.
5. Monitoring and maintenance
Since generalized output data is mandatory for any machine learning model, regular
monitoring and maintenance are compulsory. Different results for different actions require
the data to change, and hence editing the code, as well as allocating resources to monitor
the model, also becomes necessary.
6. Getting bad recommendations
A machine learning model operates within a specific context, which can result in bad
recommendations and concept drift in the model. Consider an example in which a
customer is looking for certain gadgets at one point in time, but the customer's
requirements change over time while the machine learning model keeps showing the same
recommendations, even though the customer's expectations have changed. This
phenomenon is called data drift. It generally occurs when new data is introduced or the
interpretation of the data changes. We can overcome it by regularly monitoring the data
and updating the model in line with current expectations.
7. Lack of skilled resources
Although Machine Learning and Artificial Intelligence are continuously growing in the
market, still these industries are fresher in comparison to others. The absence of skilled
resources in the form of manpower is also an issue. Hence, we need manpower having indepth knowledge of mathematics, science, and technologies for developing and managing
scientific substances for machine learning.
8. Customer Segmentation
Customer segmentation is also an important issue when developing a machine learning
algorithm: we need to identify the customers who acted on the recommendations shown
by the model, as well as those who never even checked them. Hence, an algorithm is
necessary to recognize customer behaviour and trigger relevant recommendations for
each user based on past experience.
9. Process Complexity of Machine Learning
The machine learning process itself is very complex, which is another major issue faced by
machine learning engineers and data scientists. Machine Learning and Artificial
Intelligence are young technologies, still largely in an experimental phase and continuously
changing over time. Much of the work proceeds by trial and error, so the probability of
mistakes is higher than expected. Further, the process includes analysing the data,
removing data bias, training the model, applying complex mathematical calculations, and
more, making the procedure complicated and quite tedious.
10. Data Bias
Data Biasing is also found a big challenge in Machine Learning. These errors exist when
certain elements of the dataset are heavily weighted or need more importance than others.
Biased data leads to inaccurate results, skewed outcomes, and other analytical errors.
However, we can resolve this error by determining where data is actually biased in the
dataset. Further, take necessary steps to reduce it.
Methods to remove Data Bias:
Research more for customer segmentation.
Be aware of your general use cases and potential outliers.
Combine inputs from multiple sources to ensure data diversity.
Include bias testing in the development process.
Analyze data regularly and keep tracking errors to resolve them easily.
Review the collected and annotated data.
Use multi-pass annotation such as sentiment analysis, content moderation, and intent
recognition.
11. Lack of Explainability
This basically means that the outputs cannot be easily comprehended, because the model
is programmed in specific ways to deliver outputs for certain conditions. Hence, a lack of
explainability is also found in machine learning algorithms, which reduces their credibility.
12. Slow implementations and results
This issue is also very commonly seen in machine learning models. Although machine
learning models are highly efficient at producing accurate results, doing so is
time-consuming: slow programs, excessive requirements, and overloaded data mean
accurate results take longer to produce than expected. Delivering accurate results
therefore requires continuous maintenance and monitoring of the model.
13. Irrelevant features
Although machine learning models are intended to give the best possible outcome, if we
feed garbage data as input, the result will also be garbage. Hence, we should use relevant
features in our training sample. A machine learning model is said to be good if the training
data has a strong set of features, with few to no irrelevant ones.
2. Explain Regression Line, Scatter Plot, Error in Prediction and Best Fitting Line.
Ans. In machine learning, a regression line is a straight line that represents the relationship
between a dependent variable (also called the target or output variable) and one or more
independent variables (also called features or input variables). The goal of regression
analysis is to find the best-fit line that minimizes the difference between the actual observed
values and the values predicted by the line.
The equation of a simple linear regression line, which involves one independent variable, is
often written as:
\[ Y = mX + b \]
where:
- \( Y \) is the dependent variable (output),
- \( X \) is the independent variable (input),
- \( m \) is the slope of the line,
- \( b \) is the y-intercept.
The slope (\( m \)) represents the rate at which the dependent variable changes with
respect to changes in the independent variable, and the y-intercept (\( b \)) is the value of
the dependent variable when the independent variable is zero.
In the case of multiple linear regression, where there is more than one independent
variable, the equation becomes:
\[ Y = b_0 + b_1X_1 + b_2X_2 + \ldots + b_nX_n \]
Here:
- \( Y \) is the dependent variable,
- \( X_1, X_2, \ldots, X_n \) are the independent variables,
- \( b_0 \) is the y-intercept, and
- \( b_1, b_2, \ldots, b_n \) are the coefficients for each independent variable.
The regression line is determined during the training phase of a machine learning model,
where the model learns the optimal values for the coefficients (\( m \), \( b \) in simple linear
regression, or \( b_0, b_1, \ldots, b_n \) in multiple linear regression) based on the training
data. Once the model is trained, it can be used to make predictions on new data by
applying the learned regression equation.
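To make the equations concrete, here is a minimal sketch that learns \( m \) and \( b \) from a handful of made-up points using NumPy's least-squares polynomial fit; the data values are assumptions chosen for illustration:

```python
# Fit Y = m*X + b by least squares; the five sample points are made up.
import numpy as np

X = np.array([1, 2, 3, 4, 5], dtype=float)   # independent variable
Y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])      # dependent variable

m, b = np.polyfit(X, Y, 1)                   # degree-1 fit returns slope, intercept
print(f"learned line: Y = {m:.3f}X + {b:.3f}")
print("prediction at X = 6:", m * 6 + b)     # apply the learned equation
```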
Scatter plot
A scatter plot is a type of data visualization used in machine learning and statistics to
display the relationship between two continuous variables. It is a graphical representation in
which individual data points are plotted on a two-dimensional plane, with one variable on
the x-axis and the other on the y-axis. Each point on the scatter plot represents a pair of
values for the two variables.
In machine learning, scatter plots are often used to visually inspect and understand the
pattern, trend, or correlation between two variables. The shape and direction of the scatter
plot points can provide insights into the nature of the relationship between the variables.
Here are some common patterns observed in scatter plots:
1. **Positive Correlation:** Points on the scatter plot tend to form a rising pattern from left to
right, indicating a positive correlation between the two variables. This means that as one
variable increases, the other variable also tends to increase.
2. **Negative Correlation:** Points on the scatter plot tend to form a descending pattern
from left to right, indicating a negative correlation between the two variables. This means
that as one variable increases, the other variable tends to decrease.
3. **No Correlation:** Points on the scatter plot are scattered randomly without a clear
pattern, indicating no significant correlation between the two variables.
Machine learning practitioners often use scatter plots during the exploratory data analysis
(EDA) phase to gain insights into the data before building models. Scatter plots can also
help identify outliers, clusters, or other interesting patterns in the data.
In Python, libraries like Matplotlib or Seaborn are commonly used to create scatter plots.
Here's a simple example using Matplotlib:
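This is a minimal sketch; the two variables below are made up and constructed to show a positive correlation:

```python
# Scatter plot of two made-up, positively correlated continuous variables.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 50)                   # first continuous variable
y = 2 * x + rng.normal(scale=2.0, size=50)   # second variable, correlated with x

plt.scatter(x, y)
plt.xlabel("x")
plt.ylabel("y")
plt.title("Scatter plot showing positive correlation")
plt.show()
```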
Error in Prediction
In machine learning, an error is a measure of how accurately an algorithm can make
predictions for the previously unknown dataset. On the basis of these errors, the machine
learning model is selected that can perform best on the particular dataset. There are mainly
two types of errors in machine learning, which are:
Reducible errors: These errors can be reduced to improve the model's accuracy. They can
be further classified into bias and variance.
Irreducible errors: These errors will always be present in the model, regardless of which
algorithm is used. They are caused by unknown variables whose influence cannot be
removed.
What is Bias?
In general, a machine learning model analyses the data, finds patterns in it, and makes
predictions. During training, the model learns these patterns in the dataset and applies
them to test data for prediction. While making predictions, a difference occurs between
the values predicted by the model and the actual or expected values, and this difference
is known as bias error, or error due to bias. Bias can be defined as the inability of a
machine learning algorithm such as Linear Regression to capture the true relationship
between the data points. Every algorithm begins with some amount of bias, because bias
arises from the assumptions built into the model that make the target function simpler to learn.
What is a Variance Error?
The variance specifies how much the prediction would change if different training data
were used. In simple words, variance tells us how much a random variable differs from its
expected value. Ideally, a model should not vary too much from one training dataset to
another, which means the algorithm should be good at capturing the hidden mapping
between input and output variables. Variance errors are either low variance or high variance.
Low variance means there is only a small variation in the prediction of the target function
when the training data set changes, while high variance means a large variation in the
prediction of the target function with changes in the training dataset.
A model with high variance learns a lot and performs well on the training dataset, but does
not generalize well to unseen data. As a result, such a model gives good results on the
training dataset but shows high error rates on the test dataset.
Since a high-variance model learns too much from the dataset, it leads to overfitting. A
model with high variance has the following problems:
A high-variance model leads to overfitting.
It increases model complexity.
Usually, nonlinear algorithms, which have a lot of flexibility in fitting the model, exhibit
high variance.
Low-Bias, Low-Variance:
The combination of low bias and low variance characterizes an ideal machine learning
model. However, it is not practically achievable.
Low-Bias, High-Variance: With low bias and high variance, model predictions are
inconsistent but accurate on average. This case occurs when the model learns a large
number of parameters, and it leads to overfitting.
High-Bias, Low-Variance: With high bias and low variance, predictions are consistent but
inaccurate on average. This case occurs when a model does not learn well from the
training dataset or uses too few parameters. It leads to underfitting problems in the
model.
High-Bias, High-Variance:
With high bias and high variance, predictions are inconsistent and also inaccurate on
average.
Bias-Variance Trade-Off
While building the machine learning model, it is really important to take care of bias and
variance in order to avoid overfitting and underfitting in the model. If the model is very
simple with fewer parameters, it may have low variance and high bias. Whereas, if the
model has a large number of parameters, it will have high variance and low bias. So, it is
required to make a balance between bias and variance errors, and this balance between
the bias error and variance error is known as the Bias-Variance trade-off.
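For a model \( \hat{f} \) predicting a target \( y \), this trade-off is commonly summarized by the standard decomposition of the expected squared prediction error into bias, variance, and irreducible noise:

\[ \mathbb{E}\big[(y - \hat{f}(x))^2\big] = \text{Bias}\big[\hat{f}(x)\big]^2 + \text{Var}\big[\hat{f}(x)\big] + \sigma^2 \]

Here \( \sigma^2 \) is the irreducible error; simple models tend to have a large bias term, while highly flexible models tend to have a large variance term.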
3. Describe the essential steps of K-means algorithm for clustering analysis.
STEP 1: Pick the number of clusters K, e.g., K = 2, to separate the dataset into, and select
two random points to act as the cluster centroids.
STEP 2: Assign each data point to its nearest centroid. Visually, this can be done by
drawing the median line between the two centroids.
STEP 3: The points on the line's left side are closer to the blue centroid, while the points
on the right side are closer to the yellow centroid; they form the blue cluster and the yellow
cluster respectively.
STEP 4: Repeat the procedure with new centroids: each cluster's new centroid is its center
of gravity, i.e., the mean of its points.
STEP 5: Re-assign each data point to its new nearest centroid, repeating the median-line
procedure outlined before. A yellow data point that now falls on the blue side of the
median line joins the blue cluster.
STEP 6: Now that reassignment has occurred, repeat the previous step of locating new
centroids.
STEP 7: Repeat the procedure for determining the centers of gravity of the clusters.
STEP 8: As in the previous stages, draw the median line and reassign the data points after
locating the new centroids.
STEP 9: Continue until assignments stop changing, grouping points by their distance from
the median line so that two distinct groups are established and no dissimilar points are
included in a single group. The resulting clusters are final.
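The steps above can be condensed into a few lines of code. Below is a minimal sketch using scikit-learn's KMeans on made-up two-dimensional points; the data and the choice K = 2 are assumptions for illustration:

```python
# Minimal K-means sketch: two made-up groups of 2-D points, K = 2.
import numpy as np
from sklearn.cluster import KMeans

points = np.array([[1, 2], [1, 4], [2, 3],      # first loose group
                   [8, 8], [9, 10], [8, 9]])    # second loose group

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print("labels:   ", kmeans.labels_)             # cluster assigned to each point
print("centroids:", kmeans.cluster_centers_)    # final centers of gravity
```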
4. What is SVM? Explain the following terms: hyperplane, supporting hyperplane,
margin and support vectors with suitable examples.
Support Vector Machine (SVM) is a supervised machine learning algorithm used for both
classification and regression. Though it can be applied to regression problems as well, it is
best suited for classification. The main objective of the SVM algorithm is to find the
optimal hyperplane in an N-dimensional space that separates the data points of the
different classes in the feature space. SVM tries to make the margin between the closest
points of the different classes as large as possible. The dimension of the hyperplane
depends on the number of features: if the number of input features is two, the hyperplane
is just a line; if the number of input features is three, the hyperplane becomes a 2-D plane;
it becomes difficult to visualize when the number of features exceeds three.
A hyperplane is the decision surface used for linear classification; it separates the data
space and has one dimension less than the space itself.
A separating hyperplane can be defined by two terms: an intercept term b and a
decision-hyperplane normal vector w, commonly referred to as the weight vector in
machine learning. The normal vector w is perpendicular to the hyperplane, and b selects
which of the parallel hyperplanes is used, since every point x on the hyperplane must
satisfy the equation:
\[ w^T x + b = 0 \]
Support vectors are the data points closest to the decision boundary; they are the points
most difficult to classify, and they hold the key to SVM finding the optimal decision
surface. The optimal hyperplane comes from the function class with the lowest capacity,
i.e., the minimum number of independent features/parameters.
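As a concrete illustration, the sketch below fits a linear SVM with scikit-learn on made-up, linearly separable points (the data, the labels, and the large C value approximating a hard margin are all assumptions) and prints the learned w, b, and support vectors:

```python
# Linear SVM sketch on made-up separable points: recovers the hyperplane
# normal vector w, the intercept b, and the support vectors near the boundary.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 1], [2, 2], [1, 2], [2, 1],     # class 0
              [5, 5], [6, 6], [5, 6], [6, 5]],    # class 1
             dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)       # large C ~ hard margin
print("w =", clf.coef_[0])                        # normal vector of the hyperplane
print("b =", clf.intercept_[0])                   # intercept term
print("support vectors:\n", clf.support_vectors_)
```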
5. Explain in detail Temporal Difference Learning.
Temporal Difference Learning is a technique in reinforcement learning, very commonly
used to predict the total reward expected over the future. It can, however, be used to
predict other quantities as well. It is essentially a way to learn to predict a quantity that
depends on the future values of a given signal, and it computes the long-term utility of a
pattern of behaviour from a series of intermediate rewards.
Essentially, Temporal Difference Learning (TD Learning) focuses on predicting a variable's
future value in a sequence of states. Temporal difference learning was a major
breakthrough in solving the problem of reward prediction. You could say that it employs a
mathematical trick that allows it to replace complicated reasoning with a simple learning
procedure that can be used to generate the very same results.
The trick is that rather than attempting to calculate the total future reward, temporal
difference learning just attempts to predict the combination of immediate reward and its own
reward prediction at the next moment in time. Now when the next moment comes and
brings fresh information with it, the new prediction is compared with the expected
prediction. If these two predictions are different from each other, the Temporal Difference
Learning algorithm will calculate how different the predictions are from each other and
make use of this temporal difference to adjust the old prediction toward the new prediction.
The temporal difference algorithm always aims to bring the expected prediction and the
new prediction together, thus matching expectations with reality and gradually increasing
the accuracy of the entire chain of prediction.
Temporal Difference Learning aims to predict a combination of the immediate reward and
its own reward prediction at the next moment in time.
In TD Learning, the training signal for a prediction is a future prediction. This method is a
combination of the Monte Carlo (MC) method and the Dynamic Programming (DP) method.
Monte Carlo methods adjust their estimates only after the final outcome is known, whereas
temporal difference methods adjust predictions to match later, more accurate predictions
about the future well before the final outcome is known. This is essentially a form of
bootstrapping.
Temporal difference learning in machine learning got its name from the way it uses
changes, or differences, in predictions over successive time steps for the purpose of driving
the learning process.
The prediction at any particular time step gets updated to bring it nearer to the prediction of
the same quantity at the next time step.
6. Create a decision tree for the attribute “class” using the respective values:
7. What are the different Hidden Markov Models?
Hidden Markov Model (HMM) is a statistical model that is used to describe the
probabilistic relationship between a sequence of observations and a sequence of hidden
states. It is often used in situations where the underlying system or process that generates
the observations is unknown or hidden, hence it got the name “Hidden Markov Model.”
It is used to predict future observations or classify sequences, based on the underlying
hidden process that generates the data.
An HMM consists of two types of variables: hidden states and observations.
The hidden states are the underlying variables that generate the observed data, but they
are not directly observable.
The observations are the variables that are measured and observed.
The relationship between the hidden states and the observations is modeled using a
probability distribution. The Hidden Markov Model (HMM) captures this relationship
using two sets of probabilities: the transition probabilities and the emission
probabilities.
The transition probabilities describe the probability of transitioning from one hidden state
to another.
The emission probabilities describe the probability of observing an output given a hidden
state.
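To make the two probability sets concrete, here is a minimal sketch of a two-state HMM scored with the forward algorithm; all the matrices and the observation coding are made-up illustrative values:

```python
# Two hidden states, two observation symbols; all numbers are made up.
import numpy as np

start = np.array([0.6, 0.4])           # initial hidden-state distribution
trans = np.array([[0.7, 0.3],          # transition probabilities:
                  [0.4, 0.6]])         # P(next state j | current state i)
emit  = np.array([[0.1, 0.9],          # emission probabilities:
                  [0.8, 0.2]])         # P(observation o | hidden state i)

obs = [0, 1, 0]                        # an observed sequence (coded 0/1)

# Forward algorithm: likelihood of the observations, summed over all
# possible hidden-state paths.
alpha = start * emit[:, obs[0]]
for o in obs[1:]:
    alpha = (alpha @ trans) * emit[:, o]
print("P(observation sequence) =", alpha.sum())
```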
Modeling observations in these two layers, one visible and the other invisible, is very useful,
since many real world problems deal with classifying raw observations into a number of
categories, or class labels, that are more meaningful to us. For example, let us consider the
speech recognition problem, for which HMMs have been extensively used for several
decades [1]. In speech recognition, we are interested in predicting the uttered word from a
recorded speech signal. For this purpose, the speech recognizer tries to find the sequence
of phonemes (states) that gave rise to the actual uttered sound (observations). Since there
can be a large variation in the actual pronunciation, the original phonemes (and ultimately,
the uttered word) cannot be directly observed, and need to be predicted.
This approach is also useful in modeling biological sequences, such as proteins and DNA
sequences. Typically, a biological sequence consists of smaller substructures with different
functions, and different functional regions often display distinct statistical properties. For
example, it is well-known that proteins generally consist of multiple domains. Given a new
protein, it would be interesting to predict the constituting domains (corresponding to one or
more states in an HMM) and their locations in the amino acid sequence (observations).
Furthermore, we may also want to find the protein family to which this new protein
sequence belongs. In fact, HMMs have proven as effective at representing biological
sequences as they have been at modeling speech signals. As a result, HMMs have become
increasingly popular in computational molecular biology, and many state-of-the-art
sequence analysis algorithms have been built on HMMs.
8. What is Reinforcement Learning? Explain with the help of an example.
Reinforcement learning is an area of Machine Learning. It is about taking suitable action to
maximize reward in a particular situation. It is employed by various software and machines
to find the best possible behavior or path it should take in a specific situation.
Reinforcement learning differs from supervised learning in a way that in supervised learning
the training data has the answer key with it so the model is trained with the correct answer
itself whereas in reinforcement learning, there is no answer but the reinforcement agent
decides what to do to perform the given task. In the absence of a training dataset, it is
bound to learn from its experience.
Reinforcement Learning (RL) is the science of decision making. It is about learning the
optimal behavior in an environment to obtain maximum reward. In RL, the data is
accumulated from machine learning systems that use a trial-and-error method. Data is not
part of the input that we would find in supervised or unsupervised machine learning.
Reinforcement learning uses algorithms that learn from outcomes and decide which action
to take next. After each action, the algorithm receives feedback that helps it determine
whether the choice it made was correct, neutral or incorrect. It is a good technique to use
for automated systems that have to make a lot of small decisions without human guidance.
Reinforcement learning is an autonomous, self-teaching system that essentially learns by
trial and error. It performs actions with the aim of maximizing rewards, or in other words, it is
learning by doing in order to achieve the best outcomes.
Example:
The problem is as follows: We have an agent and a reward, with many hurdles in between.
The agent is supposed to find the best possible path to reach the reward. The following
problem explains the problem more easily.
Imagine an environment containing a robot, a diamond, and fire. The goal of the robot is
to obtain the reward, the diamond, while avoiding the hurdle, the fire. The robot learns by
trying all the possible paths and then choosing the path that reaches the reward with the
fewest hurdles. Each right step gives the robot a reward, and each wrong step subtracts
from its reward. The total reward is calculated when it reaches the final reward, the
diamond.
Main points in Reinforcement Learning:
Input: The input should be an initial state from which the model will start.
Output: There are many possible outputs, as there are a variety of solutions to a particular
problem.
Training: The training is based upon the input; the model returns a state, and the user
decides whether to reward or punish the model based on its output.
The model continues to learn.
The best solution is decided based on the maximum reward.
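The robot example can be mirrored in a few lines of tabular Q-learning; the corridor layout, rewards, and hyperparameters below are assumptions chosen for illustration:

```python
# Tabular Q-learning mirror of the robot example: a corridor of states 0..4,
# fire at state 0 (-10), diamond at state 4 (+10), start in the middle.
import random

actions = (-1, +1)                                    # step left or right
Q = {(s, a): 0.0 for s in range(5) for a in actions}
alpha, gamma, eps = 0.5, 0.9, 0.2

def reward(s):
    return 10.0 if s == 4 else (-10.0 if s == 0 else -0.1)

for _ in range(2000):
    s = 2
    while s not in (0, 4):                            # fire and diamond end an episode
        a = random.choice(actions) if random.random() < eps \
            else max(actions, key=lambda act: Q[(s, act)])
        s2 = s + a
        target = reward(s2) + gamma * max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += alpha * (target - Q[(s, a)])     # reward/punish the last step
        s = s2

# Greedy policy for the non-terminal states; expect [1, 1, 1] (head right).
print([max(actions, key=lambda act: Q[(s, act)]) for s in (1, 2, 3)])
```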
9. Apply K-means algorithm on given data for k=3. Use C1(2), C2(16) and C3(38)
as initial cluster centres.
Data: 2, 4, 6, 3, 31, 12, 15, 16, 38, 35, 14, 21, 23, 25, 30
10. Explain with suitable example the advantages of Bayesian approach over
classical approaches to probability.
11. Explain in detail Principal Component Analysis for Dimension Reduction.
12. Find optimal hyperplane for the data points:
{(1,1), (2,1), (1,-1), (2,-1), (4,0), (5,1), (5, -1), (6,0)}
13. Short note on Machine Learning Applications
14. Short note on Classification using Back Propagation Algorithm
15. Short note on Issues in Decision Tree
16. What are the key tasks of Machine Learning?
17. What are the key terminologies of Support Vector Machine?
18. Explain in brief Linear Regression Technique.
19. Explain in brief elements of Reinforcement Learning.
20. Explain the steps required for selecting the right machine learning algorithm.
21. For the given data, determine the entropy after classification using each attribute for
classification separately, and find which attribute is best as the decision attribute for the
root by finding information gain with respect to the entropy of Temperature as the
reference attribute.
22. Explain in detail Principal Component Analysis for Dimension Reduction
23.
24. Apply Agglomerative clustering algorithm on given data and draw a dendrogram. Show
three clusters with their allocated points. Use the single link method.
25. Explain classification using Back Propagation algorithm with a suitable example.
26. Detail notes on quadratic programming solution for finding maximum margin
separation in SVM
27. Detail notes on applications of machine learning algorithms
28. Detail notes on hidden Markov model
29. Explain classification using Bayesian Belief Network with an example.
30. Define SVM and further explain the maximum margin linear separators concept.
31. Explain in detail PCA for Dimension Reduction
32. Explain reinforcement learning in detail along with the various elements involved in
forming the concept. Also define what is meant by partially observable state.
33. Detail notes on Hierarchical Clustering algorithms
34. Detail notes on Model Based Learning
35. What is machine learning? Explain how supervised learning is different from
unsupervised learning.
36. Explain Bayes theorem.
37. What are the elements of reinforcement learning?
38. Describe two methods of reducing dimensionality
39. The following table shows the midterm and final exam grades obtained for students
in a database course. Use the method of least squares regression to predict the final
exam grade of a student who received 86 on the midterm exam.
40. Explain the steps in developing a machine learning application.
41. For a sunburn dataset given below, construct a decision tree.
42. What is SVM? How to compute the margin?
43. Explain Principal Component Analysis (PCA) to arrive at the transformed matrix for
the given matrix A.
1. Explain how Back Propagation algorithm helps in classification
2. For the given set of points identify clusters using complete link and average link
using agglomerative clustering.
3.
4. Short note on Temporal difference learning
5. Short note on Logistic regression
6. Short note on Machine learning Applications
7. Define a well-posed learning problem. Hence define the robot driving learning problem.
8. Explain in brief Bayesian Belief networks
9. Write short note on Temporal Difference Learning
10. Explain procedure to construct decision trees.
11. Explain how SVM can be used to find the optimal hyperplane to classify linearly
separable data. Give a suitable example.
12. Explain procedure to design machine learning system.
13. What is linear regression? Find the best-fitted line for the following example:
14. What is a decision tree? How will you choose the best attribute for a decision tree
classifier? Give a suitable example.
15. Explain K-means clustering algorithm giving a suitable example. Also, explain how K-means clustering differs from hierarchical clustering.
16. What is a kernel? How can a kernel be used with SVM to classify non-linearly separable
data? Also, list standard kernel functions.
17. What is Q-learning? Explain the algorithm for learning Q.
18. Explain following terms with respect to Reinforcement learning: delayed rewards,
exploration and partially observable states.
19. Short notes on soft margin SVM
20. Short notes on Radial Basis functions
21. Short notes on Independent Component Analysis
22. Short notes on Logistic Regression
23. Define machine learning. Briefly explain the types of learning.
24. What is independent component analysis?
25. What are the issues in decision tree induction?
26. What are the requirements of clustering algorithms?
27. The values of the independent variable x and the dependent variable y are given below.
Find the least-squares regression line y = ax + b. Estimate the value of y when x is 10.
28.
29. What are the steps in designing a machine learning problem? Explain with the
checkers problem.
30.
31. What is the goal of SVM? How to compute the margin?
32. For the given set of points identify clusters using complete link and average link
using agglomerative clustering.
33.
34. What is the role of radial basis functions in separating nonlinear patterns?
35. Use Principal Component Analysis (PCA) to arrive at the transformed matrix for the
given matrix A.
36.
37. What are the elements of reinforcement learning?
38. Short notes on Logistic regression
39. Short notes on Back propagation algorithm
40. Short notes on issues in machine learning
41. Define Machine Learning and Explain with example importance of Machine Learning
42. Explain Multilayer perceptron with a neat diagram
43. Why is SVM more accurate than logistic regression
44. Explain Radial Basis Function with example
45. What is dimensionality reduction? Describe how principal component analysis is
carried out to reduce dimensionality of data sets.
46. Find the singular value decomposition of
47.
48. For an unknown tuple t = <Outlook = Sunny, Temperature = Cool, Wind = Strong>, use
the naïve Bayes classifier to find whether the class for PlayTennis is yes or no.
49. The dataset is given below.
50.
51. List some advantages of derivative-based optimization techniques. Explain the Steepest
Descent method for optimization.
52. Given the following data for the car sales of an automobile company for six consecutive
years, predict the sales for the next two consecutive years.
53. Explain the various basic evaluation measures of supervised learning algorithms for classification.
54. Consider the following table for binary classification. Calculate the root of the decision tree
using the Gini index.
55.
56. Define SVM. Explain how margin is computed and optimal hyperplane is decided.
57. Short notes on Hidden Markov Model
58. Short notes on EM Algorithm
59. Short notes on Logistic Regression
60. Short notes on McCulloch-Pitts Neuron Model
61. Short notes on Downhill Simplex method.
62. Define Machine Learning (ML). Briefly explain the types of learning.
63. “Entropy is a thermodynamic function used to measure the disorder of a system in Chemistry.” How
do you suitably clarify the concept of entropy in ML?
64. State the principle of Occam's Razor. Which ML algorithm uses this principle?
65. Explain Bayesian Belief Network with an example. [5]
66. Q2 a) Use the k-means clustering algorithm and Euclidean distance to cluster the following eight
examples into three clusters: A1 = (2, 10), A2 = (2, 5), A3 = (8, 4), A4 = (5, 8), A5 = (7, 5), A6 = (6, 4),
A7 = (1, 2), A8 = (4, 9). Find the new centroid at every new point entry into the cluster group. Assume
initial cluster centers as A1, A4 and A7. [10]
67. b) Compare and contrast Linear and Logistic regressions with respect to their mechanisms of
prediction. [10]
68. Q3 a) Find the predicted value of Y for one epoch and the RMSE using Linear Regression.
69.
70. Find the new revised theta for the given problem using the Expectation-Maximization
algorithm for one epoch.
71.
72. For the given set of points identify clusters using single linkage and draw the dendrogram with cluster
separation line emerging at 1.3. Find how many clusters are formed below the line? [10]
73. b) Use Principal Component Analysis (PCA) to arrive at the transformed matrix for the given matrix A, where
\[ A^T = \begin{pmatrix} 2 & 1 & 0 & -1 \\ 4 & 3 & 1 & 0.5 \end{pmatrix} \]
[10]
74. Q5 a) Find optimal hyper plane for the following points: {(1, 1), (2, 1), (1, -1), (2,-1), (4, 0), (5, 1), (6,
0)} [10]
75. b) The following table consists of training data from an employee database. The data have been
generalized; for example, "31 . . . 35" for age represents the age range of 31 to 35. For a given row
entry, count represents the number of data tuples having the values for department, status, age, and
salary given in that row. Let status be the class-label attribute. (i) Design a multilayer feed-forward
neural network for the given data, labeling the nodes in the input and output layers. (ii) Using the
multilayer feed-forward neural network obtained in (i), show the weight values after one iteration of
the back propagation algorithm, given the training instance "(sales, senior, 31 . . . 35, 46K . . . 50K)".
Assume initial weight values and biases, and a learning rate of 0.9. Use binary inputs and draw a
neural network with one input layer, one hidden layer, and one output layer. Solve the problem for one epoch.
76.
77. Short notes on Machine learning applications
78. Short notes on temporal difference learning
79. Short notes on independent component analysis
80. a. What are the issues in Machine learning?
81. b. Explain Regression line, Scatter plot, Error in prediction and Best fitting line
82. c. Explain the concept of margin and support vector.
83. d. Explain the distance metrics used in clustering.
84. e. Explain Logistic Regression
85. Q2. a. Explain the steps of developing Machine Learning applications. [10]
86. b. Explain Linear regression along with an example. [10]
87. Q3. a. Create a decision tree using Gini Index to classify following dataset.
88. Describe Multiclass classification
89. Explain the Random Forest algorithm in detail.
90. Explain the different ways to combine the classifiers
91. Compute the Linear Discriminant projection for the following two-dimensional
dataset. X1 = (x1, x2) = {(4,1), (2,4), (2,3), (3,6), (4,4)} and X2 = (x1, x2) = {(9,10),
(6,8), (9,5), (8,7), (10,8)}
92. Explain EM algorithm
93. Detailed note on Performance Metrics for Classification
94. Principal Component Analysis for Dimension Reduction
95. DBSCAN
96. A How to choose the right ML algorithm?
97. B Explain Regression line, Scatter plot, Error in prediction and Best fitting line.
98. C Explain the concept of feature selection and extraction
99. D Explain K-means algorithm.
100. E Explain the concept of Logistic Regression
101. Q2 A Explain any five applications of Machine Learning. [10]
102. B Explain Multivariate Linear regression method. [10]
103. Q3 A Create a decision tree using Gini Index to classify the following dataset for profit.
104. B Find SVD for
105. Explain Random Forest algorithm in detail
106. Explain the concept of bagging and boosting
107. Describe Multiclass classification
108. Explain the concept of Expectation Maximization Algorithm
109. Detailed note on Linear Regression
110. Detailed note on Linear Discriminant Analysis for Dimension Reduction
111. Detailed note on DBSCAN