Uploaded by prabusubadd

Unit3

UNIT III BUSINESS FORECASTING
6
Introduction to Business Forecasting and Predictive analytics - Logic and Data Driven Models
–Data Mining and Predictive Analysis Modelling –Machine Learning for Predictive analytics.
Business forecasting
Business forecasting refers to the process of predicting future market conditions by using
business intelligence tools and forecasting methods to analyze historical data.
Business forecasting can be either qualitative or quantitative. Qualitative business forecasting
relies on subject matter experts and market research while quantitative business forecasting
focuses only on data analysis.
Quantitative Forecasting
Quantitative forecasting is applicable when there is accurate past data available to predict the
probability of future events. This method pulls patterns from the data that allow for more
probable outcomes. The data used in quantitative forecasting can include in-house data such as
sales numbers and professionally gathered data such as census statistics. Generally, quantitative
forecasting seeks to connect different variables in order to establish cause and effect
relationships that can be exploited to benefit the business.
Qualitative Forecasting
Qualitative forecasting is based on the opinion and judgment of consumers and experts. This
business forecasting method is useful if you have insufficient historical data to make any
statistically relevant conclusions. In such cases, an expert can help piece together the known bits
of data you do have to try to make a qualitative prediction from that known information.
Business Forecasting Process
Here are the steps that a business forecaster should typically follow:
1. Define the question or problem you need to solve with your business forecasting efforts.
For example, you might be interested in estimating whether your organization will be
able to meet product demand for the next quarter.
2. Identify the datasets and variables that need to be taken into consideration. In this case,
datasets such as the sales records from the previous year and variables related to capacity,
production and demand planning.
3. Choose a business forecasting method that adjusts to your dataset and forecasting goals.
That depends on whether your problem or question can be solved using a qualitative,
quantitative or mixed approach.
4. Based on the analysis of historical data, you can proceed to estimate future business
performance. Keep in mind that the accuracy of your business forecasting depends on the
quality of your data.
5. Determine the discrepancy between your business forecast and actual business
performance. Document your findings and improve your business forecasting process.
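Step 5 can be illustrated with a short sketch. The demand figures are invented for illustration, and
mean absolute error (MAE) and mean absolute percentage error (MAPE) are assumed here as the measures
of discrepancy; they are common choices, not the only ones.

```python
# Hypothetical quarterly demand figures (units); not taken from real data.
forecast = [120, 135, 150, 160]
actual = [110, 140, 155, 150]

def mean_absolute_error(forecast, actual):
    """Average size of the forecast miss, in the same units as the data."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)

def mean_absolute_percentage_error(forecast, actual):
    """Average miss as a percentage of the actual value (actuals must be non-zero)."""
    return 100 * sum(abs(f - a) / a for f, a in zip(forecast, actual)) / len(actual)

mae = mean_absolute_error(forecast, actual)
mape = mean_absolute_percentage_error(forecast, actual)
print(f"MAE: {mae:.1f} units, MAPE: {mape:.1f}%")
```

Tracking these numbers from one forecasting cycle to the next gives a simple way to document whether
the process is improving.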
Business Forecasting Methods
As stated above, there are two main types of business forecasting methods, qualitative and
quantitative. Some of the more common forecasting models from both sides are described below.
Delphi Method
This qualitative business forecasting method consists in gathering a panel of subject matter
experts and getting their opinions on the same topic in a manner in which they can’t know each
other’s thoughts. This is done to prevent bias, which makes it possible for a manager to
objectively compare their opinions and see if there are patterns, consensus or division.
Market Research
There are many market research techniques that evaluate the behavior of customers and their
response to a certain product or service. Some of those market research methods collect and
analyze quantitative data, such as digital marketing metrics and others qualitative data, such as
product testing, or customer interviews.
Time Series Analysis
Also referred to as “trend analysis method,” this business forecasting technique simply requires
the forecaster to analyze historical data to identify trends. This data analysis process requires
statistical analysis as outliers need to be removed. More recent data should be given more weight
to better reflect the current state of the business.
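One possible way to give recent data more weight is simple exponential smoothing. The sketch below is
an assumption about how that weighting could be done, not a method prescribed by these notes; the
sales figures and the value of alpha are hypothetical.

```python
def exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    alpha (0 < alpha <= 1) is the weight given to the newest observation;
    higher values make the forecast react faster to recent data.
    """
    level = series[0]          # start from the first observation
    levels = [level]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
        levels.append(level)
    return levels

monthly_sales = [100, 104, 101, 110, 115, 121]  # hypothetical figures
levels = exponential_smoothing(monthly_sales, alpha=0.5)
next_month_forecast = levels[-1]  # the last level serves as the one-step-ahead forecast
print(round(next_month_forecast, 2))
```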
The Average Approach
The average approach says that the predictions of all future values are equal to the mean of the
past data. Past data is required to use this method, so it can be considered a type of quantitative
forecasting. This approach is often used when you need to predict unknown values as it allows
you to make calculations based on past averages, where one assumes that the future will closely
resemble the past.
The Naïve Approach
The naïve approach is the most cost-effective and is often used as a benchmark to compare
against more sophisticated methods. It’s only used for time series data where forecasts are made
equal to the last observed value. This approach is useful in industries and sectors where past
patterns are unlikely to be reproduced in the future. In such cases, the most recent observed value
may prove to be the most informative.
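The average and naïve approaches are simple enough to write in a few lines each. This is a minimal
sketch with made-up demand figures; the function names are illustrative only.

```python
from statistics import mean

def average_forecast(history, horizon):
    """Average approach: every future value is forecast as the mean of the past data."""
    return [mean(history)] * horizon

def naive_forecast(history, horizon):
    """Naive approach: every future value is forecast as the last observed value."""
    return [history[-1]] * horizon

weekly_demand = [20, 22, 19, 24, 25]  # hypothetical figures
print(average_forecast(weekly_demand, 3))
print(naive_forecast(weekly_demand, 3))
```

Because both are so cheap to compute, either can serve as the benchmark that a more sophisticated
model has to beat.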
Elements of Business Forecasting
1. Develop the Basis: Before you can start forecasting, you must develop a system to
investigate the current economic situation around you. That includes your industry and its
present position as well as its popular products to better estimate sales and general
business operations.
2. Estimating Future Business Operations: Now comes the estimation of future
conditions, such as the course that future events are likely to take in your industry. Again,
this is based on collected data to help with quantitative estimates for the scale of
operations in the future.
3. Regulating Forecasts: Whatever your forecast is, it must be compared to actual results.
This is the only way to find deviations from the norm. Then the reasons for those
deviations must be figured out, so action can be taken to correct those deviations in the
future.
4. Reviewing Forecasting Process: By reviewing the deviations between forecasts and
actual performance data, improvements are made in the process, allowing you to refine
and review the information for accuracy.
Predictive analytics
Predictive analytics encompasses a variety of statistical techniques from predictive modeling,
machine learning, and data mining that analyze current and historical facts to make predictions
about future or otherwise unknown events.
Predictive analytics, a branch of advanced analytics, is used to predict future events. It
analyzes current and historical data in order to make predictions about the future by employing
techniques from statistics, data mining, machine learning, and artificial intelligence.
In business, predictive models exploit patterns found in historical and transactional data to
identify risks and opportunities. Models capture relationships among many factors to allow
assessment of risk or potential associated with a particular set of conditions, guiding decision
making for candidate transactions.
Consider the power of predictive analytics:
• A Canadian bank uses predictive analytics to increase campaign response rates by 600%, cut
customer acquisition costs in half, and boost campaign ROI by 100%.
• A large state university predicts whether a student will choose to enroll by applying predictive
models to applicant data and admissions history.
• A research group at a leading hospital combined predictive and text analytics to improve its
ability to classify and treat pediatric brain tumors.
How Predictive Analytics Works
Predictive analytics is driven by machine-learning algorithms, principally decision trees, log
linear regression, and neural networks. These algorithms perform pattern matching. They
determine how closely new data matches a reference pattern. The algorithms are trained on real
data and then compute a predictive score for each individual they analyze.
Figure 1: Predictive Analytics Process
Requirement Collection
To develop a predictive model, the aim of the prediction must first be made clear. The type of
knowledge to be gained through the prediction should be defined. For example, a
pharmaceutical company wants to forecast the sale of a medicine in a particular
area to avoid the expiry of those medicines.
Data Collection
After learning the requirements of the client organization, the analyst collects the datasets,
possibly from different sources, required for developing the predictive model.
Data Analysis and Massaging
Data analysts analyze the collected data and prepare it for use in the model.
The unstructured data is converted into a structured form in this step. Once the complete data is
available in structured form, its quality is tested. Erroneous data may be present in the main
dataset, or many attribute values may be missing; all such problems must be addressed. The
effectiveness of the predictive model depends heavily on the quality of the data. The analysis
phase is sometimes referred to as data munging or massaging the data, which means converting the
raw data into a format that can be used for analytics.
Statistics, Machine Learning
The predictive analytics process employs many statistical and machine learning techniques.
Probability theory and regression analysis are the most important techniques popularly
used in analytics. Similarly, artificial neural networks, decision trees and support vector
machines are machine learning tools that are widely used in many predictive analytics tasks.
Predictive Modeling
In this phase, a model is developed based on statistical and machine learning techniques and
the example dataset. After development, it is tested on the test dataset, which is a part of the
main collected dataset, to check the validity of the model; if the test is successful, the model
is said to be fit. Once fitted, the model can make accurate predictions on new data entered as
input to the system. In many applications, a multi-model solution is chosen for a problem.
Prediction and Monitoring
After successful tests of its predictions, the model is deployed at the client’s site for everyday
predictions and decision-making. The results and reports are generated by the model for the
managerial process. The model is continuously monitored to ensure that it is giving
correct results and making accurate predictions.
PREDICTIVE ANALYTICS TECHNIQUES
All predictive analytics models are grouped into classification models and regression
models. Classification models predict the membership of values in a certain class, while
regression models predict a number. The important techniques popularly used in developing
predictive models are listed below.
Decision Tree
A decision tree is a classification model, but it can be used in regression as well. It is a
tree-like model that relates decisions to their possible consequences [11]. The consequences may
be the outcome of events, cost of resources or utility. In its tree-like structure, each branch
represents a choice between a number of alternatives and each leaf represents a decision.
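The branch-and-leaf structure can be sketched as nested conditions. The tree below is written by hand
rather than learned from data, and the loan scenario, attribute names and thresholds are all
hypothetical; a real decision tree algorithm would derive the splits from a training dataset.

```python
def approve_loan(income, has_defaulted, debt_ratio):
    """A hand-written decision tree; thresholds are made up for illustration."""
    if has_defaulted:          # branch on credit history
        return "reject"        # leaf: a decision
    if income >= 50_000:       # branch on income level
        return "approve"       # leaf
    if debt_ratio < 0.3:       # branch on existing debt
        return "approve"       # leaf
    return "reject"            # leaf

print(approve_loan(income=30_000, has_defaulted=False, debt_ratio=0.2))
```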
Regression Model
Regression is one of the most popular statistical techniques for estimating the relationship
between variables. It models the relationship between a dependent variable and one or more
independent variables.
It analyzes how the value of the dependent variable changes when the values of the independent
variables change in the modeled relation.
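For one independent variable, the relationship can be estimated with ordinary least squares. The
sketch below assumes a straight-line relation and uses invented advertising and sales figures that
lie exactly on a line, so the fitted values are easy to check by hand.

```python
def fit_simple_linear_regression(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

ad_spend = [1, 2, 3, 4, 5]   # independent variable (hypothetical)
sales = [3, 5, 7, 9, 11]     # dependent variable (hypothetical, exactly 2x + 1)

slope, intercept = fit_simple_linear_regression(ad_spend, sales)
predicted_sales = intercept + slope * 6  # prediction for an unseen value of ad_spend
print(slope, intercept, predicted_sales)
```

The slope answers the question posed above: how much the dependent variable changes for a one-unit
change in the independent variable.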
Artificial Neural Network
Artificial neural network, a network of artificial neurons based on biological neurons,
simulates the human nervous system capabilities of processing the input signals and producing
the outputs. This is a sophisticated model that is capable of modeling the extremely complex
relations. The architecture of a general purpose artificial neural network is represented in
figure 5.
Bayesian Statistics
This branch of statistics treats parameters as random variables and uses the
term “degree of belief” to define the probability of occurrence of an event [14].
Bayesian statistics is based on Bayes’ theorem, which labels events as a priori and a posteriori.
Ordinary conditional probability gives the probability of the a posteriori event given
that the a priori event has occurred. Bayes’ theorem, on the other hand, gives the probability of
the a priori event given that the a posteriori event has already occurred. It is represented in
figure 6.
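A small numerical sketch may help. Here the a priori event is "the email is spam" and the a posteriori
event is "the email contains a certain word"; all three input probabilities are invented for
illustration. Exact fractions are used so that the arithmetic can be checked by hand.

```python
from fractions import Fraction

def bayes_posterior(prior, likelihood, likelihood_if_not):
    """P(A | B) from P(A), P(B | A) and P(B | not A), using Bayes' theorem."""
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)  # P(B)
    return likelihood * prior / evidence

p_spam = Fraction(1, 5)              # P(spam), hypothetical
p_word_given_spam = Fraction(3, 5)   # P(word | spam), hypothetical
p_word_given_ham = Fraction(1, 20)   # P(word | not spam), hypothetical

posterior = bayes_posterior(p_spam, p_word_given_spam, p_word_given_ham)
print(posterior)  # 3/4
```

Seeing the word raises the degree of belief that the email is spam from 1/5 to 3/4.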
Ensemble Learning
It belongs to the category of supervised learning algorithms in the branch of machine learning.
These models are developed by training several models of a similar type and finally combining
their predictions. In this way, the accuracy of the model is improved. Combining models in this
way can reduce both the bias and the variance of the resulting model. It helps in identifying
the best model to be used with new data.
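The combining step can be sketched with majority voting, which is one common way to merge
classifier outputs. To keep the example short, the three base models below are hand-written rules
with hypothetical thresholds rather than trained models.

```python
from collections import Counter

# Three deliberately simple, hypothetical base models for "will the customer buy?"
def model_income(customer):
    return "buy" if customer["income"] > 40_000 else "no_buy"

def model_visits(customer):
    return "buy" if customer["site_visits"] >= 5 else "no_buy"

def model_age(customer):
    return "buy" if customer["age"] < 35 else "no_buy"

def ensemble_predict(customer, models):
    """Return the class predicted by the majority of the base models."""
    votes = [model(customer) for model in models]
    return Counter(votes).most_common(1)[0][0]

customer = {"income": 55_000, "site_visits": 2, "age": 29}
prediction = ensemble_predict(customer, [model_income, model_visits, model_age])
print(prediction)
```

An odd number of base models is used so that a two-class vote cannot end in a tie.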
Support Vector Machine
It is a supervised machine learning technique popularly used in predictive analytics. With its
associated learning algorithms, it analyzes data for classification and regression. However, it
is mostly used in classification applications. It is a discriminative classifier defined by a
hyperplane that separates examples into categories. Examples are represented as points in space
such that the categories are separated by a clear gap. New examples are then
predicted to belong to a class based on which side of the gap they fall.
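The sketch below shows only the final classification step: deciding on which side of a hyperplane a
new example falls. It does not train a support vector machine; the weights and bias are hypothetical
values standing in for what a training algorithm would produce.

```python
import math

def decision_score(point, weights, bias):
    """Signed score w.x + b; its sign tells which side of the hyperplane the point is on."""
    return sum(w * x for w, x in zip(weights, point)) + bias

def classify(point, weights, bias):
    return "A" if decision_score(point, weights, bias) >= 0 else "B"

def distance_to_hyperplane(point, weights, bias):
    """Geometric distance from the point to the separating hyperplane."""
    return abs(decision_score(point, weights, bias)) / math.hypot(*weights)

weights, bias = (1.0, 1.0), -5.0   # hypothetical hyperplane: x1 + x2 - 5 = 0

print(classify((4, 4), weights, bias), classify((1, 2), weights, bias))
print(round(distance_to_hyperplane((4, 4), weights, bias), 3))
```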
Time Series Analysis
Time series analysis is a statistical technique that uses time series data, that is, data
collected over a period of time at a particular interval. It combines traditional data mining
techniques with forecasting. Time series analysis is divided into two categories, namely the
frequency domain and the time domain. It predicts the value of a variable at future time
intervals based on the analysis of its values at past time intervals. It is widely used in stock
market prediction and weather forecasting. An example of the variation in the price of
a product over a period of time, and its forecast trend in future years, is represented in the
figure.
APPLICATIONS OF PREDICTIVE ANALYTICS
Banking and Financial Services
Predictive analytics has many applications in the banking and financial industries. In both
industries, data and money are crucial, and finding insights from the data and the movement
of money is a must. Predictive analytics helps in detecting fraudulent customers and
suspicious transactions. It minimizes the credit risk these industries take on when they lend
money to their customers. It helps in identifying cross-sell and up-sell opportunities and in
retaining and attracting valuable customers.
Retail
Predictive analytics helps the retail industry in identifying customers and
understanding what they need and what they want. By applying this technique, retailers predict
the behavior of customers towards a product. Companies may fix prices and set special
offers on products after identifying the buying behavior of customers. It also helps the
retail industry in predicting how successful a particular product will be in a particular
season. Retailers may campaign for their products and approach customers with offers and prices
fixed for individual customers. Predictive analytics also helps the retail industry in
improving its supply chain: by identifying and predicting the demand for a product in a
specific area, retailers may improve their supply of products.
Health and Insurance
The pharmaceutical sector uses predictive analytics in drug design and in improving its
supply chain of drugs. By using this technique, these companies may predict the expiry of
drugs in a specific area due to lack of sales. The insurance sector uses predictive analytics
models in identifying and predicting fraudulent claims filed by customers. The health
insurance sector uses this technique to find the customers who are most at risk of a serious
disease and to approach them with the insurance plans that are best for their investment.
Oil, Gas and Utilities
The oil and gas industries are using the predictive analytics techniques in forecasting the failure
of equipment in order to minimize the risk. They predict the requirement of resources in future
using these models. The need for maintenance can be predicted by energy-based companies to
avoid any fatal accident in future.
Government and Public Sector
The government agencies are using big data-based predictive analytics techniques to identify
the possible criminal activities in a particular area. They analyze the social media data to
identify the background of suspicious persons and forecast their future behavior. The
governments are using the predictive analytics to forecast the future trend of the population at
country level and state level. In enhancing the cybersecurity, the predictive analytics techniques
are being used in full swing.
Data-Driven Model
Data-driven models are models in which data is collected from many sources to
quantitatively establish model relationships.
The main aim of data-driven model concept is to find links between the state system variables
(input and output) without clear knowledge of the physical attributes and behaviour of the
system. The data driven predictive modelling derives the modelling method based on the set of
existing data and entails a predictive methodology to forecast the future outcomes.
It is data-driven only when there is no clear knowledge of the relationships among the
variables or the system, though there is a lot of data. Here, you are simply predicting the
outcomes based on the data. The model is not based on hand-picked variables, but may contain
unobserved, hidden combinations of variables.
Data-driven modelling draws on several overlapping fields:
• artificial intelligence (AI), which is the overarching study of how human intelligence can be
incorporated into computers.
• computational intelligence (CI), which includes neural networks, fuzzy systems and
evolutionary computing as well as other areas within AI and machine learning.
• soft computing (SC), which is close to CI, but with special emphasis on fuzzy rule-based
systems induced from data.
• machine learning (ML), which was once a sub-area of AI that concentrates on the theoretical
foundations used by CI and SC.
• data mining (DM) and knowledge discovery in databases (KDD), which are often focused on very
large databases and are associated with applications in banking, financial services and customer
resources management. DM is seen as a part of a wider KDD. Methods used are mainly from
statistics and ML.
• intelligent data analysis (IDA), which tends to focus on data analysis in medicine and research
and incorporates methods from statistics and ML.
Logic driven models
Logic-driven models are based on experience, knowledge and logical relationships of
variables and constants connected to the desired business performance outcome situation.
Predictive modelling leverages statistics to predict outcomes. Most often the event one wants to
predict is in the future, but predictive modeling can be applied to any type of unknown event,
regardless of when it occurred. For example, predictive models are often used to detect crimes
and identify suspects after the crime has taken place.
In many cases the model is chosen on the basis of detection theory to try to guess the probability
of an outcome given a set amount of input data, for example, given an email, determining how
likely it is to be spam.
Models can use one or more classifiers in trying to determine the probability of a set of data
belonging to another set, say spam or ‘ham’.
Predictive models can either be used directly to estimate a response (output) given a defined set
of characteristics (input), or indirectly to drive the choice of decision rules.
Depending on the methodology employed for the prediction, it is often possible to derive a
formula that may be used in spreadsheet software.
Data mining and predictive analysis modelling
Data mining is a process based on algorithms to analyze and extract useful information and
automatically discover hidden patterns and relationships from data. Predictive analytics, by
contrast, is closely tied to machine learning, as it uses data patterns to make predictions:
machines take historical and current information and apply it to a model to predict future
trends. In essence, the difference between data mining and predictive analytics is that the
former explores the data and the latter answers “What is the next step?”
Predictive data mining models
A predictive data mining model predicts the values of data using known results gathered from
different data sets. Predictive modeling cannot be classified as a separate discipline; it occurs
in all organizations or industries across all disciplines. The main objective of predictive data
mining models is to predict the future based on past data, generally, but not always, using
statistical modeling.
Predictive modeling is used in healthcare industries to identify high-risk patients with congestive
heart failures, high blood pressure, diabetes, infection, cancer, etc. It is also used in the vehicle
insurance company to assign the risk of accidents to the policyholder.
The predictive tasks of data mining comprise classification, regression, prediction, and
time series analysis. The predictive model of data mining is also called statistical regression.
It refers to a supervised learning technique that explains how the values of a few attributes
depend on the values of other attributes of the same item, and develops a model
that can predict these attribute values in new cases.
Classification:
In data mining, classification refers to a form of data analysis where a machine learning model
assigns a specific category to a new observation. It is based on what the model has learned from
the data sets. In other words, classification is the act of assigning objects to one of several
predefined categories.
One example of classification in the banking and financial services industry is identifying
whether transactions are fraudulent or not. In the same way, machine learning can also be used to
predict whether a loan application would be approved or not.
Regression:
Regression refers to a method that estimates a numeric value for a data item by fitting a
function to the data. Generally, it is used to fit an equation to a data set.
A linear regression model, in the context of machine learning or statistics, is a linear
approach for modeling the relationship between the dependent variable, known as the result, and
the independent variables, known as features.
If your model has only one independent variable, it is called simple linear regression; otherwise
it is called multiple linear regression.
Types of regression
1. Linear Regression:
Linear regression is related to the search for the optimal line which fits the two attributes so that
with the help of one attribute, we can predict the other.
2. Multi-linear regression
Multi-linear regression includes two or more than two attributes, and the data are fit to multidimensional space.
Prediction:
In data mining, prediction is used to identify data value based on the description of another
corresponding data value. The prediction in data mining is known as Numeric Prediction.
Generally, regression analysis is used for prediction. For example, in credit card fraud detection,
data history for a particular person's credit card usage has to be analyzed. If any abnormal pattern
was detected, it should be reported as 'fraudulent action'.
Time series analysis:
Time series analysis refers to the analysis of data sets indexed by time. Time serves as the
independent variable used to predict the dependent variable.
Descriptive model
A descriptive model differentiates the patterns and relationships in data. A descriptive model
does not attempt to generalize to a statistical population or random process. A predictive model
attempts to generalize to a population or random process. Predictive models should give
prediction intervals and must be cross-validated; that is, they must prove that they can be used to
make predictions with data that was not used in constructing the model.
Descriptive analytics focuses on the summarization and conversion of the data into useful
information for reporting and monitoring.
Clustering:
Clustering is grouping a set of objects so that objects in the same group, called a cluster, are
more similar to each other than to those in other clusters.
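As a rough illustration, the sketch below groups one-dimensional values with k-means, which is one of
several clustering algorithms and is assumed here only as an example. The spending figures and the
starting centroids are hypothetical.

```python
def k_means_1d(values, initial_centroids, iterations=10):
    """Group numbers around centroids by repeating assignment and update steps."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

monthly_spend = [12, 15, 14, 80, 85, 90]   # hypothetical customer spending
centroids, clusters = k_means_1d(monthly_spend, initial_centroids=[12, 90])
print(clusters)
```

With these values the low spenders and the high spenders end up in separate clusters after the first
pass, and further iterations leave the grouping unchanged.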
Association rules:
Association rules identify relationships between items that occur together in large sets of data
objects; they indicate co-occurrence, not causation. Suppose you have a list of the items you
purchased at the grocery store over the past six months. The algorithm calculates the percentage
of purchases in which items are bought together.
For example, what are the chances of you buying milk with cereal?
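The milk-and-cereal question can be answered with two standard measures, support and confidence. The
shopping baskets below are invented for illustration.

```python
# Each set is one hypothetical shopping trip.
baskets = [
    {"milk", "cereal", "bread"},
    {"milk", "cereal"},
    {"milk", "eggs"},
    {"cereal", "bananas"},
    {"milk", "cereal", "eggs"},
]

def support(itemset, baskets):
    """Share of all baskets that contain every item in itemset."""
    return sum(1 for b in baskets if itemset <= b) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing antecedent, the share that also contain consequent."""
    with_antecedent = [b for b in baskets if antecedent <= b]
    if not with_antecedent:
        return 0.0
    with_both = [b for b in with_antecedent if consequent <= b]
    return len(with_both) / len(with_antecedent)

print(support({"milk", "cereal"}, baskets))        # how often the pair is bought together
print(confidence({"cereal"}, {"milk"}, baskets))   # chance of milk given cereal
```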
Sequence:
Sequence discovery refers to finding useful patterns in ordered data, where usefulness is judged
against some objective measure of how interesting a pattern is.
Summarization:
Summarization presents a data set in a more compact form that is easy to understand.
Steps for predictive analytics using machine learning
Applications of predictive analytics and machine learning
For organisations overflowing with data but struggling to turn it into useful insights, predictive
analytics and machine learning can provide the solution. No matter how much data an
organisation has, if it can’t use that data to enhance internal and external processes and meet
objectives, the data becomes a useless resource.
Predictive analytics is most commonly used for security, marketing, operations, risk and fraud
detection. Here are just a few examples of how predictive analytics and machine
learning are utilised in different industries:
1. Banking and Financial Services
In the banking and financial services industry, predictive analytics and machine learning are
used in conjunction to detect and reduce fraud, measure market risk, identify opportunities
and much, much more.
2. Security
With cybersecurity at the top of every business’ agenda in 2017, it should come as no
surprise that predictive analytics and machine learning play a key part in security. Security
institutions typically use predictive analytics to improve services and performance, but also
to detect anomalies, fraud, understand consumer behaviour and enhance data security.
3. Retail
Retailers are using predictive analytics and machine learning to better understand consumer
behaviour; who buys what and where? These questions can be readily answered with the
right predictive models and data sets, helping retailers to plan ahead and stock items based on
seasonality and consumer trends – improving ROI significantly.
There are eight steps to perform predictive analytics with ML.
Step 1: Define the problem statement
We begin by understanding and defining the problem statement, and deciding on the required
datasets on which to perform predictive analytics.
Example: There is a grocery store. Our objective is to predict the sales of groceries for the next
six months. Here, past sales data of how many groceries were sold and the resulting profits of the
last five years will be the dataset.
Step 2: Collect the data
Once we know what sort of dataset is needed to perform predictive analytics using machine
learning, we gather all the necessary details that constitute the dataset. We need to ensure that the
historical data is collected from an authorized source.
Using the grocery store example, we can ask the accountant for records of past sales logged in
worksheets or billing software. We collect data spanning the past five years.
Step 3: Clean the data
The raw dataset obtained will have some missing data, redundancies, and errors. Since we cannot
train the model for predictive analytics directly with such noisy data, we need to clean it. Known
as preprocessing, this step involves refining the dataset by eradicating unnecessary and duplicate
data.
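A minimal cleaning pass for the grocery store example might look like the sketch below. The record
layout (date, item, units) and the rule that negative unit counts are errors are assumptions made
for illustration.

```python
# Hypothetical raw sales records exported from billing software.
raw_records = [
    {"date": "2023-01-01", "item": "rice", "units": 10},
    {"date": "2023-01-01", "item": "rice", "units": 10},    # exact duplicate
    {"date": "2023-01-02", "item": "milk", "units": None},  # missing value
    {"date": "2023-01-03", "item": "soap", "units": -4},    # impossible value
    {"date": "2023-01-04", "item": "milk", "units": 7},
]

def clean(records):
    """Drop duplicate rows, rows with missing units and rows with negative units."""
    seen = set()
    cleaned = []
    for record in records:
        key = (record["date"], record["item"], record["units"])
        if key in seen:
            continue                      # duplicate of an earlier row
        seen.add(key)
        if record["units"] is None or record["units"] < 0:
            continue                      # missing or erroneous value
        cleaned.append(record)
    return cleaned

cleaned_records = clean(raw_records)
print(len(raw_records), "->", len(cleaned_records))
```

Dropping incomplete rows is the simplest policy; filling in missing values is an alternative when
data is scarce.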
Step 4: Perform Exploratory Data Analysis (EDA)
EDA involves exploring the dataset thoroughly in order to identify trends, discover anomalies,
and check assumptions. It summarizes a dataset’s main characteristics. It often uses data
visualization techniques.
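Before any plotting, a numerical summary already reveals a lot. The sketch below uses invented daily
sales figures and flags anomalies with a two-standard-deviation rule, which is an assumed rule of
thumb rather than a fixed standard.

```python
from statistics import mean, median, stdev

daily_sales = [52, 48, 50, 51, 49, 95]  # hypothetical figures

summary = {
    "mean": mean(daily_sales),
    "median": median(daily_sales),
    "stdev": stdev(daily_sales),   # sample standard deviation
    "min": min(daily_sales),
    "max": max(daily_sales),
}

# Flag values that lie more than two standard deviations from the mean.
outliers = [v for v in daily_sales
            if abs(v - summary["mean"]) > 2 * summary["stdev"]]

print(summary)
print(outliers)
```

The gap between the mean and the median is itself a hint that one unusually large value is pulling
the average upward.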
Step 5: Build a predictive model
Based on the patterns observed in step 4, we build a predictive statistical machine learning
model, trained with the cleaned dataset obtained after step 3. This machine learning algorithm
helps us perform predictive analytics to foresee the future of our grocery store business. The
model can be implemented using Python, R, or MATLAB.

Hypothesis testing
Hypothesis testing can be performed using a standard statistical model. It includes two
hypotheses, null and alternate. We either reject or fail to reject the null hypothesis.
Example: A new ‘buy one, get one free’ scheme is implemented where customers buy a packet
of soap and get a face wash for free. Consider the two cases below:
Case 1: Despite the scheme, sales of soap did not improve.
Case 2: After the scheme, sales of soap improved.
If the first case is true, we fail to reject the null hypothesis as there is no improvement. If the
second case is true, we reject the null hypothesis.
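The soap example can be sketched as an exact permutation test, which is one possible "standard
statistical model" and is chosen here only because it needs no extra libraries. The daily sales
figures and the 0.05 significance level are assumptions made for illustration.

```python
from itertools import combinations

before = [20, 22, 19, 21, 23, 20]   # hypothetical daily soap sales before the scheme
after = [26, 28, 25, 27, 29, 30]    # hypothetical daily soap sales after the scheme

def permutation_p_value(before, after):
    """One-sided exact permutation test of H0: the scheme did not raise sales.

    Every way of relabelling the pooled days as 'after' is tried. Because the
    group sizes are fixed, a larger 'after' total is equivalent to a larger
    difference in means, so totals are compared directly.
    """
    pooled = before + after
    observed_after_total = sum(after)
    extreme = 0
    splits = 0
    for chosen in combinations(range(len(pooled)), len(after)):
        splits += 1
        if sum(pooled[i] for i in chosen) >= observed_after_total:
            extreme += 1
    return extreme / splits

p_value = permutation_p_value(before, after)
alpha = 0.05  # assumed significance level
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(round(p_value, 5), decision)
```

A small p-value corresponds to Case 2 (reject the null hypothesis); a large one corresponds to
Case 1 (fail to reject it).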
Step 6: Validate the model
This is a crucial step wherein we check the efficiency of the model by testing it with unseen
input datasets. Depending on the extent to which it makes correct predictions, the model is
retrained and evaluated.
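Validation on unseen data can be sketched with a simple holdout split. The monthly figures are
invented, and the two "models" compared are the mean and last-value forecasts, used only as
stand-ins for whatever model was built in step 5.

```python
from statistics import mean

# Hypothetical monthly grocery sales; the last three months are held out as unseen data.
monthly_sales = [200, 210, 205, 220, 230, 228, 240, 250, 245, 260, 270, 268]
train, test = monthly_sales[:9], monthly_sales[9:]

def mean_absolute_error(predictions, actuals):
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(actuals)

# Both candidate models are fitted on the training months only.
mean_model_predictions = [mean(train)] * len(test)
naive_model_predictions = [train[-1]] * len(test)

mean_mae = mean_absolute_error(mean_model_predictions, test)
naive_mae = mean_absolute_error(naive_model_predictions, test)
print(round(mean_mae, 2), round(naive_mae, 2))
```

The model with the lower error on the held-out months is the better candidate; if neither is good
enough, the model is revised and evaluated again.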
Step 7: Deploy the model
The model is made available for use in a real-world environment by deploying it on a cloud
computing platform so that users can utilize it. Here, the model will make predictions on
real-time inputs from the users.
Step 8: Monitor the model
Now that the model is functioning in the real world, we need to verify its performance. Model
monitoring refers to examining how the model predicts actual datasets. If any improvement must
be made, the dataset is expanded and the model is rebuilt and redeployed.
How machine learning improves predictive analytics
Predictive analytics continues to be improved with machine learning algorithms. The eight use
cases discussed below illustrate how.
E-commerce/retail
Predictive analytics achieved through machine learning helps retailers understand customers’
preferences. It works by analyzing users’ browsing patterns and how frequently a product is
clicked on in a website. For example, when we purchase a t-shirt on an e-commerce site, similar
shirts are suggested the next time we log in. Sometimes, we may be recommended several
specific items that are often purchased together for x amount of money. Such personalized
recommendations help retailers retain customers. Predictive analytics also helps maintain
inventory by foreseeing and informing sellers about stock outs.
Customer service
Customer segmentation is performed based on insights by predictive analytics. Customers are
placed into different segments depending on their purchase patterns. For example, book buyers
will form one cluster while t-shirt buyers will constitute another. Tailored marketing strategies
are then developed for each of the segments depending on their characteristics.
Predictive analytics using machine learning can also detect dissatisfied customers and help
sellers design products aimed to retain existing customers and attract new ones.
Medical diagnosis
Machine learning models that are trained on large and varied datasets can study patient
symptoms comprehensively to provide faster and more accurate diagnoses. Performing
predictive analytics on the reasons behind past hospital readmissions can also improve care.
Further, hospitals can use predictive analytics to provide the best care by anticipating
the need for more hospital beds or a shortage of staff. For example, if the number of COVID
cases for the next month can be predicted and the rise in the number of severely infected can be
forecasted, hospitals can make arrangements to deal with such a scenario more efficiently.
Sales and marketing
Predictive analytics of historical data of customer behavior and market trends can help
businesses understand the demands of prospective customers. Companies can achieve higher
targets by streamlining their sales and marketing activities into a data-based undertaking.
Demand forecasting also helps businesses estimate the demand for certain products in the future.
Financial services
Predictive analytics using machine learning helps detect fraudulent activities in the financial
sector. Fraudulent transactions are identified by training machine learning algorithms with past
datasets. The models find risky patterns in these datasets and learn to predict and deter fraud.
Cybersecurity
Machine learning algorithms can analyze web traffic in real-time. When an unusual pattern is
observed, advanced statistical methods of predictive analytics foresee and prevent cyber-attacks.
They also automatically collect attack-related data and generate useful reports on a cyber-attack,
thereby reducing the need for manpower.
Manufacturing
Machine learning and predictive analytics help manufacturers monitor machines and notify them
when crucial components need to be repaired or replaced. They can also predict market
fluctuations, reduce the number of accidents, improve key performance indicators (KPIs), and
enhance overall production quality.
Human Resource Information Systems (HRIS)
Predictive analytics using machine learning identifies employee churn rate and keeps human
resources (HR) departments informed of the same. Models can be trained with datasets that have
details such as an employee's monthly income, allowances, increments, insurance, and so on.
The models learn from past records of ex-employees and find patterns to understand the reasons
for leaving. They then predict if new employees are likely to resign or not, empowering HR to
minimize the risk.