Uploaded by Bryce Lihaylihay

Data Mining & Statistical Inference Textbook Excerpt

Chapter 5
Observation or record – A set of observed values of variables associated
with a single entity, often displayed as a row in a spreadsheet or database
Unsupervised learning – Category of data mining techniques in which
an algorithm explains relations without an outcome variable to guide
the process
Market Segmentation – The partitioning of customers into groups that
share common characteristics so that a business may target customers
within a group with a tailored marketing strategy
K-means clustering – Process of organizing observations into one of k
groups based on a measure of similarity (typically Euclidean distance)
Hierarchical Clustering – Process of agglomerating observations into a
series of nested groups based on a measure of similarity
Euclidean Distance – Geometric measure of dissimilarity between
observations based on the Pythagorean theorem
Manhattan Distance – Measure of dissimilarity between two
observations based on the sum of the absolute differences in each
variable dimension
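As a minimal sketch of the two distance measures above, the following computes both for a pair of hypothetical observations measured on two variables (the function names and data are illustrative assumptions, not taken from the textbook):

```python
import math

def euclidean_distance(u, v):
    """Square root of the summed squared differences (Pythagorean theorem)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan_distance(u, v):
    """Sum of the absolute differences in each variable dimension."""
    return sum(abs(a - b) for a, b in zip(u, v))

# Two hypothetical observations.
u = (1.0, 2.0)
v = (4.0, 6.0)
print(euclidean_distance(u, v))  # 5.0
print(manhattan_distance(u, v))  # 7.0
```

In practice the variables are usually standardized first so that no single variable dominates the distance because of its units.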
Matching Coefficient – Measure of similarity between observations
based on the number of matching values of categorical variables
Matching distance – Measure of dissimilarity between observations
based on the matching coefficient
Jaccard’s Coefficient – Measure of similarity between observations
consisting solely of binary categorical variables that considers only
matches of nonzero entries
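The three categorical measures above can be sketched as follows for binary (0/1) variables. The sketch assumes the matching distance is taken as 1 minus the matching coefficient; the function names and data are hypothetical:

```python
def matching_coefficient(u, v):
    """Share of variables on which the two observations have the same value."""
    matches = sum(a == b for a, b in zip(u, v))
    return matches / len(u)

def jaccard_coefficient(u, v):
    """Like the matching coefficient, but matches of zero entries are ignored."""
    both_one = sum(a == 1 and b == 1 for a, b in zip(u, v))
    either_one = sum(a == 1 or b == 1 for a, b in zip(u, v))
    return both_one / either_one if either_one else 0.0

# Two hypothetical observations on five binary variables.
u = [1, 0, 0, 1, 0]
v = [1, 0, 0, 0, 0]
print(matching_coefficient(u, v))      # 0.8
print(1 - matching_coefficient(u, v))  # matching distance, roughly 0.2
print(jaccard_coefficient(u, v))       # 0.5
```

The gap between 0.8 and 0.5 shows why Jaccard's coefficient is often preferred when a shared 0 (neither customer bought the item, say) is not evidence of similarity.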
Single Linkage – Measure of calculating dissimilarity between clusters by
considering only the two most similar observations between the two
clusters
Complete Linkage – Measure of calculating dissimilarity between
clusters by considering only the two most dissimilar observations
between the two clusters
Group Average Linkage – Measure of calculating dissimilarity between
clusters by considering the distance between each pair of observations
between two clusters
Median Linkage- Method that computes the similarity between two
clusters as the median of the similarities between each pair of
observations in the two clusters
Centroid Linkage – Method of calculating dissimilarity between clusters
by considering the two centroids of the respective clusters
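A rough sketch of how several of the linkage measures above differ, assuming Euclidean distance between observations (median linkage is omitted; the clusters and function names are hypothetical):

```python
import math
from itertools import product

def pair_distances(A, B):
    """Distances between every observation in A and every observation in B."""
    return [math.dist(a, b) for a, b in product(A, B)]

def centroid(cluster):
    """Variable-by-variable mean of the observations in a cluster."""
    return tuple(sum(column) / len(cluster) for column in zip(*cluster))

def single_linkage(A, B):
    return min(pair_distances(A, B))      # two most similar observations

def complete_linkage(A, B):
    return max(pair_distances(A, B))      # two most dissimilar observations

def group_average_linkage(A, B):
    d = pair_distances(A, B)
    return sum(d) / len(d)                # average over all pairs

def centroid_linkage(A, B):
    return math.dist(centroid(A), centroid(B))

# Two hypothetical clusters of two observations each.
A = [(0.0, 0.0), (0.0, 2.0)]
B = [(3.0, 0.0), (3.0, 2.0)]
for linkage in (single_linkage, complete_linkage,
                group_average_linkage, centroid_linkage):
    print(linkage.__name__, round(linkage(A, B), 3))
```

Agglomerative hierarchical clustering repeatedly merges the two clusters with the smallest value of whichever linkage measure has been chosen.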
Ward’s Method – Procedure that partitions observations in a manner
to obtain clusters with the least amount of information loss due to the
aggregation
McQuitty’s method – Measure that computes the dissimilarity
introduced by merging clusters A and B by, for each other cluster C,
averaging the distance between A and C and the distance between B
and C and summing these average distances
Dendrogram – A tree diagram used to illustrate the sequence of nested
clusters produced by hierarchical clustering
Association rules – An if-then statement describing the relationship
between item sets
Market basket analysis – Analysis of items frequently co-occurring in
transactions (such as purchases)
Antecedent – The item set corresponding to the if portion of an if-then
association rule
Consequent – The item set corresponding to the then portion of an if-then
association rule
Support – The percentage of transactions in which a collection of items
occurs together in a transaction data set
Confidence- The conditional probability that the consequent of an
association rule occurs given the antecedent occurs
Lift ratio – The ratio of the performance of a data mining model
measured against the performance of a random choice. In the context
of association rules, the lift ratio is the ratio of the probability of the
consequent occurring in a transaction that satisfies the antecedent
versus the probability that the consequent occurs in a randomly
selected transaction
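Support, confidence, and the lift ratio can be illustrated with a small sketch. The transactions and function names below are hypothetical, and support is expressed as a fraction rather than a percentage:

```python
# Five hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "jelly", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "jelly"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the item set."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """P(consequent | antecedent) estimated from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(both) / support(antecedent)

def lift_ratio(antecedent, consequent):
    """Confidence relative to the consequent's support in a random transaction."""
    return confidence(antecedent, consequent) / support(consequent)

# Rule: if {bread} then {jelly}
print(round(support({"bread", "jelly"}), 3))
print(round(confidence({"bread"}, {"jelly"}), 3))
print(round(lift_ratio({"bread"}, {"jelly"}), 3))
```

A lift ratio above 1 suggests the antecedent makes the consequent more likely than it would be in a randomly selected transaction.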
Text Mining – The process of extracting useful information from text
Unstructured Data – data, such as text, audio, or video, that cannot be
stored in a traditional structured database
Document – A piece of text, which can range from a single sentence to
an entire book depending on the scope of the corresponding corpus
Terms – The most basic unit of text comprising a document, typically
corresponding to a word or word stem
Corpus – A collection of documents to be analyzed
Bag of words – An approach for processing text into a structured row-column
data format in which documents correspond to row
observations and words (or more specifically, terms) correspond to
column variables
Presence/absence – A matrix with the rows representing documents
and the columns representing words, and the entries in the columns
indicating either the presence or the absence of a particular word in a
particular document (1=present and 0 = not present)
Binary document-term matrix – A matrix with the rows representing
documents (units of text) and columns representing terms (words or
word roots), and the entries in the columns indicating either the
presence or absence of a particular term in a particular document
(1=present and 0 = not present)
Tokenization – The process of dividing text into separate terms,
referred to as tokens
Term Normalization – A set of natural language processing techniques
to map text into a standardized form
Stemming – The process of converting a word to its stem or root word
Stopwords – Common words in a language that are removed in the preprocessing of text
Frequency document-term matrix – A matrix whose rows represent
documents (units of text) and columns represent terms (words or word
roots), and the entries in the matrix are the number of times each term
occurs in each document
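The bag-of-words steps above (tokenization, stopword removal, and building frequency and binary document-term matrices) can be sketched as follows. The corpus, the stopword list, and the whitespace tokenizer are simplifying assumptions for illustration; stemming is not performed:

```python
from collections import Counter

# A hypothetical corpus of three short documents.
corpus = [
    "The service was great",
    "Great food great price",
    "The food was cold",
]
stopwords = {"the", "was"}  # assumed stopword list

def tokenize(document):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [t for t in document.lower().split() if t not in stopwords]

# Columns: the sorted set of all terms in the corpus.
vocabulary = sorted({term for doc in corpus for term in tokenize(doc)})

# Rows: one per document; entries are term counts.
counts = [Counter(tokenize(doc)) for doc in corpus]
frequency_dtm = [[row[term] for term in vocabulary] for row in counts]

# Binary version: 1 = term present, 0 = term not present.
binary_dtm = [[1 if c > 0 else 0 for c in row] for row in frequency_dtm]

print(vocabulary)     # ['cold', 'food', 'great', 'price', 'service']
print(frequency_dtm)  # [[0, 0, 1, 0, 1], [0, 1, 2, 1, 0], [1, 1, 0, 0, 0]]
print(binary_dtm)     # [[0, 0, 1, 0, 1], [0, 1, 1, 1, 0], [1, 1, 0, 0, 0]]
```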
Term frequency times inverse document frequency (TFIDF) – Text
mining measure which accounts for term frequency and the uniqueness
of a term in a document relative to other documents in a corpus
Cosine Distance – A measure of dissimilarity between two observations
often used on frequency data derived from text because it is unaffected
by the magnitude of the frequency and instead measures differences
in frequency patterns
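A minimal sketch of cosine distance, assuming it is computed as 1 minus the cosine of the angle between two frequency vectors and that neither vector is all zeros (the vectors and function name are hypothetical):

```python
import math

def cosine_distance(u, v):
    """1 minus the cosine similarity of two nonzero frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return 1 - dot / norm

# Rows of a hypothetical frequency document-term matrix.
short_doc = [1, 2, 0]
long_doc = [10, 20, 0]   # same pattern, ten times the frequency
other_doc = [0, 0, 3]    # no terms in common with the other two

print(cosine_distance(short_doc, long_doc))   # 0.0
print(cosine_distance(short_doc, other_doc))  # 1.0
```

The first result illustrates the point in the definition: scaling every frequency by the same factor leaves the distance unchanged.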
Word Cloud – A visualization of text data based on word frequencies in
a document or set of documents
Sentiment analysis – The process of clustering/categorizing comments
or reviews as positive, negative or neutral
Chapter 6
Census – Collection of data from every element in the population of interest
Statistical Inference- The process of making estimates and drawing conclusions
about one or more characteristics of a population (the value of one or more
parameters) through the analysis of sample data drawn from the population
Sampled Population – The population from which the sample is drawn
Frame - A listing of the elements from which the sample will be selected
Parameter – A measurable factor that defines a characteristic of a population,
process or system such as a population mean μ, a population standard deviation σ,
or a population proportion p
Simple random sample – A simple random sample of size n from a finite population of size
N is a sample selected such that each possible sample of size n has the same
probability of being selected
Random Sample – A random sample from an infinite population is a sample
selected such that the following conditions are satisfied: (1) Each element selected
comes from the same population and (2) each element is selected independently.
Sample Statistic – A characteristic of sample data, such as a sample mean x̄, a
sample standard deviation s, or a sample proportion p̄. The value of the sample
statistic is used to estimate the value of the corresponding population parameter.
Calculating the sample mean, sample standard deviation, and sample proportion is
called point estimation.
Point Estimator – The sample statistic, such as x̄, s, or p̄, that provides the point
estimate of the population parameter
Point Estimate – The value of a point estimator used in a particular instance as an
estimate of a population parameter
Target Population – The population for which statistical inferences such as point
estimates are made. It is important for the target population to correspond as
closely as possible to the sampled population
Random Variable – A quantity whose values are not known with certainty
Sampling Distribution – A probability distribution consisting of all possible values of
a sample statistic
Unbiased - A property of a point estimator that is present when the expected value
of the point estimator is equal to the population parameter it estimates
Standard error- The standard deviation of a point estimator
Finite Population Correction Factor – The term √((N − n)/(N − 1)) that is used in the
formulas for computing the estimated standard error for the sample mean and
sample proportion whenever a finite population, rather than an infinite
population, is being sampled. The generally accepted rule of thumb is to ignore the
finite population correction factor whenever n/N < 0.05
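As an illustrative sketch, the standard error of the sample mean with and without the finite population correction factor could be computed as below. The sketch assumes the factor enters as a square root, √((N − n)/(N − 1)), multiplying σ/√n; the function name and numbers are hypothetical:

```python
import math

def standard_error_of_mean(sigma, n, N=None):
    """Standard error of the sample mean; N is the population size if finite."""
    standard_error = sigma / math.sqrt(n)
    # Rule of thumb from the definition: ignore the correction when n/N < 0.05.
    if N is not None and n / N >= 0.05:
        standard_error *= math.sqrt((N - n) / (N - 1))
    return standard_error

# Hypothetical values: sigma = 4000, sample size n = 30.
print(round(standard_error_of_mean(4000, 30, N=2500), 2))  # n/N = 0.012, no correction
print(round(standard_error_of_mean(4000, 30, N=500), 2))   # n/N = 0.06, correction applied
```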
Sampling error – The difference between the value of a sample statistic (such as
the sample mean, sample standard deviation or sample proportion) and the value
of the corresponding population parameter (population mean, population
standard deviation, or population proportion) that occurs because a random
sample is used to estimate the population parameter
Interval Estimation – The process of using sample data to calculate a range of values
that is believed to include the unknown value of a population parameter
Interval Estimate – An estimate of a population parameter that provides an interval
believed to contain the value of the parameter. For the interval estimates in this
chapter, it has the form: point estimate ± margin of error
Margin of Error – The ± value added to and subtracted from a point estimate in
order to develop an interval estimate of a population parameter
t distribution – A family of probability distributions that can be used to develop an
interval estimate of a population mean whenever the population standard
deviation σ is unknown and is estimated by the sample standard deviation s
Degrees of Freedom – A parameter of the t distribution. When the t distribution is
used in the computation of an interval estimate of a population mean, the
appropriate t distribution has n − 1 degrees of freedom, where n is the size of the
sample
Standard normal distribution – A normal distribution with a mean of zero and
standard deviation of one
Confidence Level - The confidence associated with an interval estimate. For
example, if an interval estimation procedure provides intervals such that 95% of
the intervals formed using the procedure will include the population parameter,
the interval estimate is said to be constructed at the 95 % confidence level
Confidence Coefficient – The confidence level expressed as a decimal value. For
example, 0.95 is the confidence coefficient for a 95% confidence level
Confidence Interval – Another name for an interval estimate
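The form "point estimate ± margin of error" can be sketched for a population mean. This example assumes the population standard deviation σ is known, so the standard normal distribution supplies the multiplier; when σ is unknown, the t distribution with n − 1 degrees of freedom would be used instead. The function name and inputs are hypothetical:

```python
import math
from statistics import NormalDist

def mean_interval_estimate(sample_mean, sigma, n, confidence_coefficient=0.95):
    """Interval estimate of a population mean when sigma is assumed known."""
    alpha = 1 - confidence_coefficient            # level of significance
    z = NormalDist().inv_cdf(1 - alpha / 2)       # standard normal multiplier
    margin_of_error = z * sigma / math.sqrt(n)
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Hypothetical sample: mean 82, sigma 20, n = 100, 95% confidence level.
lower, upper = mean_interval_estimate(82, 20, 100)
print(round(lower, 2), round(upper, 2))  # 78.08 85.92
```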
Level of Significance – The probability that the interval estimation procedure will
generate an interval that does not contain the value of the parameter being estimated;
also, the probability of making a Type I error when the null hypothesis is true as an
equality
Null Hypothesis – The hypothesis tentatively assumed to be true in the hypothesis
testing procedure
Alternative hypothesis - The hypothesis concluded to be true if the null hypothesis
is rejected
Type II error – The error of accepting H0 when it is false
Type I error – The error of rejecting H0 when it is true
One-tailed test – A hypothesis test in which rejection of the null hypothesis occurs
for values of the test statistic in one tail of its sampling distribution
Test Statistic – A statistic whose value helps determine whether a null hypothesis
should be rejected
P value – The probability, assuming that H0 is true, of obtaining a random sample
of size n that results in a test statistic at least as extreme as the one observed in the
current sample. For a lower-tail test, the p value is the probability of obtaining a
value for the test statistic as small as or smaller than that provided by the sample.
For an upper-tail test, the p value is the probability of obtaining a value for the test
statistic as large as or larger than that provided by the sample. For a two-tailed test,
the p-value is the probability of obtaining a value for the test statistic at least as
unlikely as or more unlikely than that provided by the sample
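The three cases in the p value definition can be sketched as follows, under the assumption that the test statistic follows a standard normal distribution when H0 is true (a t-distributed statistic would need a t distribution instead). The function name and the example value are hypothetical:

```python
from statistics import NormalDist

def p_value(test_statistic, tail):
    """p value for a standard normal test statistic; tail is 'lower', 'upper', or 'two'."""
    cdf = NormalDist().cdf(test_statistic)
    if tail == "lower":
        return cdf                      # as small as or smaller than observed
    if tail == "upper":
        return 1 - cdf                  # as large as or larger than observed
    if tail == "two":
        return 2 * min(cdf, 1 - cdf)    # at least as extreme in either tail
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

z = -2.0  # hypothetical observed test statistic
print(round(p_value(z, "lower"), 4))  # 0.0228
print(round(p_value(z, "two"), 4))    # 0.0455
```

H0 is rejected when the p value is less than or equal to the chosen level of significance.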
Two-tailed test – A hypothesis test in which rejection of the null hypothesis occurs
for values of the test statistic in either tail of its sampling distribution
Nonsampling error – Any difference between the value of a sample statistic (such as the
sample mean, sample standard deviation, or sample proportion) and the value of
the corresponding population parameter (population mean, population standard
deviation, or population proportion) that is not the result of sampling error. These
include but are not limited to coverage error, nonresponse error, measurement
error, interviewer error and processing error
Coverage error – Nonsampling error that results when the research objective and
the population from which the sample is to be drawn are not aligned
Nonresponse error – Nonsampling error that results when some segments of the
population are more likely or less likely to respond to the survey mechanism
Measurement error – An incorrect measurement of the characteristic of interest
Big Data – Any set of data that is too large or too complex to be handled by standard
data processing techniques and typical desktop software
Volume- The amount of data generated
Variety – The diversity in types and structure of data generated
Veracity – The reliability of the data generated
Velocity – The speed at which the data are generated
Tall data – A data set that has so many observations that traditional statistical
inferences have little meaning
Wide data – A data set that has so many variables that simultaneous consideration
of all variables is infeasible
Practical Significance- The real-world impact the result of statistical inference will
have on business decisions
Central Limit Theorem – A theorem stating that when enough independent random
variables are added, the resulting sum is a normally distributed random variable.
This result allows one to use the normal probability distribution to approximate the
sampling distributions of the sample mean and sample proportion for sufficiently
large sample sizes
Hypothesis Testing – The process of making a conjecture about the value of a
population parameter, collecting sample data that can be used to assess this
conjecture, measuring the strength of the evidence against the conjecture that is
provided by the sample and using these results to draw a conclusion about the
conjecture
Chapter 7
Regression Analysis – A statistical procedure used to develop an equation showing
how the variables are related
Dependent Variable – The variable that is being predicted or explained. It is
denoted by y and is often referred to as the response
Independent Variables – The variable(s) used for predicting or explaining values of
the dependent variable. They are denoted by x and are often referred to as predictors
Multiple Linear Regression – Regression analysis involving one dependent variable
and more than one independent variable
Estimated Regression Equation – The estimate of the regression equation developed from
sample data by using the least squares method. The estimated multiple linear
regression equation is ŷ = b0 + b1x1 + b2x2 + ⋯ + bqxq. In the simple linear case, ŷ = b0 + b1x, where:
ŷ = Estimate for the mean value of y corresponding to a given value of x.
b0 = Estimated y-intercept.
b1 = Estimated slope.
Point Estimator – A single value used as an estimate of the corresponding
population parameter
Least Squares Method – A procedure for using sample data to find the estimated
regression equation; it determines the values of b0 and b1 that minimize the sum
of the squared differences between the observed and predicted values of y.
Interpretation of b0 and b1:
The slope b1 is the estimated change in the mean of the dependent
variable y that is associated with a one-unit increase in the
independent variable x.
The y-intercept b0 is the estimated value of the dependent variable y
when the independent variable x is equal to 0.
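For simple linear regression, the least squares values of b0 and b1 have a closed form: b1 is the sum of the products of the deviations of x and y from their means divided by the sum of the squared deviations of x, and b0 is ȳ − b1x̄. A minimal sketch with hypothetical data and function names:

```python
def least_squares(x, y):
    """Return (b0, b1) for the estimated equation y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx           # estimated slope
    b0 = y_bar - b1 * x_bar    # estimated y-intercept
    return b0, b1

# Hypothetical sample of five observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = least_squares(x, y)
print(round(b0, 2), round(b1, 2))  # 2.2 0.6
```

Here a one-unit increase in x is associated with an estimated increase of 0.6 in the mean of y.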
Residual – The difference between the observed value of the dependent variable
and the value predicted using the estimated regression equation; for the ith
observation, the ith residual is yi − ŷi
Experimental Region – The range of values for the independent variables x1, x2, …, xq
for the data that are used to estimate the regression model
Extrapolation – Prediction of the mean value of the dependent variable y for values
of the independent variables x1, x2, …, xq that are outside the experimental region
Coefficient of determination – A measure of the goodness of fit of the estimated
regression equation. It can be interpreted as the proportion of the variability in the
dependent variable y that is explained by the estimated regression equation
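A short sketch of residuals and the coefficient of determination, computed as 1 minus the ratio of the sum of squared residuals to the total sum of squares about the mean of y. The data are hypothetical, and the predictions are assumed to come from a fitted line ŷ = 2.2 + 0.6x:

```python
def residuals(y, y_hat):
    """Observed value minus predicted value for each observation."""
    return [yi - fi for yi, fi in zip(y, y_hat)]

def coefficient_of_determination(y, y_hat):
    """Proportion of the variability in y explained by the predictions."""
    y_bar = sum(y) / len(y)
    sse = sum(e ** 2 for e in residuals(y, y_hat))  # unexplained variability
    sst = sum((yi - y_bar) ** 2 for yi in y)        # total variability
    return 1 - sse / sst

y = [2, 4, 5, 4, 5]                # hypothetical observed values
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]  # assumed predictions for x = 1, ..., 5
print(round(coefficient_of_determination(y, y_hat), 3))  # 0.6
```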
Statistical Inference – The process of making estimates and drawing conclusions
about one or more characteristics of a population (the value of one or more
parameters) through analysis of sample data drawn from the population
Hypothesis testing – The process of making a conjecture about the value of a
population parameter, collecting sample data that can be used to assess this
conjecture, measuring the strength of the evidence against the conjecture that is
provided by the sample, and using these results to draw a conclusion about the
conjecture
Interval Estimation – The use of sample data to calculate a range of values that is
believed to include the unknown value of a population parameter
t test – Statistical test based on the Student’s t probability distribution that can be
used to test the hypothesis that a regression parameter βj is zero; if this hypothesis
is rejected, we conclude that there is a regression relationship between the jth
independent variable and the dependent variable
Confidence Interval – An estimate of a population parameter that provides an
interval believed to contain the value of the parameter at some level of confidence
Confidence Level – An indication of how frequently interval estimates based on
samples of the same size taken from the same population using identical sampling
techniques will contain the true value of the parameter we are estimating
Multicollinearity – The degree of correlation among independent variables in a
regression model
Dummy Variable – A variable used to model the effect of categorical independent
variables in a regression model; generally takes only the value zero or one
Quadratic Regression Model – Regression model in which a nonlinear relationship
between the independent and dependent variables is fit by including the
independent variable and the square of the independent variable in the model; also
referred to as a second-order polynomial model
Piecewise linear regression model – Regression model in which one linear
relationship between the independent and dependent variables is fit for values of
the independent variable below a prespecified value, a different linear relationship
is fit for values of the independent variable above the prespecified value, and the
two regressions have the same estimated value of the dependent variable (i.e., are
joined) at the prespecified value of the independent variable
Knot – A prespecified value of the independent variable at which its relationship
with the dependent variable changes in a piecewise linear regression model; also
called the breakpoint or the joint
Interaction – Regression modeling technique used when the relationship between
the dependent variable and one independent variable is different at different
values of a second independent variable
Backward Elimination – An iterative variable selection procedure that starts with a
model with all independent variables and considers removing an independent
variable at each step
Forward selection – An iterative variable selection procedure that starts with a
model with no variables and considers adding an independent variable at each step
Stepwise Selection – An iterative variable selection procedure that considers
adding an independent variable and removing an independent variable at each step
Best Subsets- A variable selection procedure that constructs and compares all
possible models up to a specified number of independent variables
Overfitting – Fitting a model too closely to sample data, resulting in a model that
does not accurately reflect the population
Cross-Validation – Assessment of the performance of a model on data other than
the data that were used to generate the model
Holdout method – Method of cross-validation in which sample data are randomly
divided into mutually exclusive and collectively exhaustive sets, then one set is used
to build the candidate models and the other set is used to compare model
performances and ultimately select a model
Training set – The data set used to build the candidate models
Validation set – The data set used to compare model forecasts and ultimately pick
a model for predicting values of the dependent variable
Prediction Interval – An interval estimate of the prediction of an individual y value
given values of the independent variables
Linear Regression – Regression analysis in which relationships between the
independent variables and the dependent variable are approximated by a straight
line
P-value – The probability that a random sample of the same size collected from
the same population using the same procedure will yield stronger evidence against
a hypothesis than the evidence in the sample data given that the hypothesis is
actually true
Parameter- A measurable factor that defines a characteristic of a population,
process or system
Random variable – A quantity whose values are not known with certainty
Regression Model – The equation that describes how the dependent variable y is
related to an independent variable x and an error term; the multiple linear
regression model is y = β0 + β1x1 + β2x2 + ⋯ + βqxq + ε
Simple Linear Regression – Regression analysis involving one dependent variable
and one independent variable
Chapter 8
Forecast – A prediction of future values of a time series
Time Series – A set of observations on a variable measured at successive points in
time or over successive periods of time
Stationary Time series – A time series whose statistical properties are independent
of time
Trend – The long-run shift or movement in the time series observable over several
periods of time
A trend is usually the result of long-term factors such as:
• Population increases or decreases.
• Shifting demographic characteristics of the population.
• Improving technology.
• Changes in the competitive landscape.
• Changes in consumer preferences.
Seasonal Patterns – The component of the time series that shows a periodic pattern
over one year or less
Cyclical Pattern – The component of the time series that results in periodic above-trend
and below-trend behavior of the time series lasting more than one year
Naïve Forecasting Method – A forecasting technique that uses the value of the time
series from the most recent period as the forecast for the current period
Forecast error – The amount by which the forecasted value ŷt differs from the
observed value yt, denoted by et = yt − ŷt
Mean absolute error (MAE) – A measure of forecasting accuracy; the average of the
absolute values of the forecast errors. Also referred to as mean absolute deviation (MAD)
Mean Squared Error (MSE) – A measure of the accuracy of a forecasting method;
the average of the squared differences between the forecast values and
the actual time series values
Mean Absolute Percentage Error (MAPE) – A measure of the accuracy of a forecasting
method; the average of the absolute values of the errors as a percentage of the
corresponding actual time series values
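The three accuracy measures can be sketched using the naïve forecasting method on a short hypothetical series. The sketch assumes the common convention that MAPE expresses each absolute error as a percentage of the observed value, and that averages are taken over the periods that have a forecast:

```python
# Hypothetical time series; the naive forecast for period t is the value at t - 1.
series = [17, 21, 19, 23]
actual = series[1:]       # periods 2..4
forecast = series[:-1]    # naive forecasts for periods 2..4

errors = [y - f for y, f in zip(actual, forecast)]  # e_t = y_t - y_hat_t

mae = sum(abs(e) for e in errors) / len(errors)
mse = sum(e ** 2 for e in errors) / len(errors)
mape = 100 * sum(abs(e) / y for e, y in zip(errors, actual)) / len(errors)

print(errors)          # [4, -2, 4]
print(round(mae, 3))   # 3.333
print(mse)             # 12.0
print(round(mape, 2))  # 15.66
```

MSE penalizes large errors more heavily than MAE, while MAPE is unit-free and so can be compared across series measured on different scales.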
Moving Average Method – A method of forecasting or smoothing a time series that
uses the average of the most recent n data values in the time series as the forecast for
the next period
Exponential Smoothing- A forecasting technique that uses a weighted average of
past time series values as the forecast
Smoothing Constant – A parameter of the exponential smoothing model that
provides the weight given to the most recent time series value in the calculation of
the forecast value
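A minimal sketch of the moving average and exponential smoothing forecasts. It assumes the usual exponential smoothing recursion, next forecast = α × (most recent value) + (1 − α) × (most recent forecast), started by setting the first forecast equal to the first observation; the function names, the parameter k for the number of values averaged, and the data are hypothetical:

```python
def moving_average_forecast(series, k):
    """Forecast for the next period: average of the most recent k values."""
    return sum(series[-k:]) / k

def exponential_smoothing_forecast(series, alpha):
    """Forecast for the next period using smoothing constant alpha."""
    forecast = series[0]  # assumed starting point: first observed value
    for value in series:
        forecast = alpha * value + (1 - alpha) * forecast
    return forecast

series = [17, 21, 19]  # hypothetical observations for periods 1..3
print(moving_average_forecast(series, k=3))                          # 19.0
print(round(exponential_smoothing_forecast(series, alpha=0.2), 2))   # 18.04
```

A larger smoothing constant makes the forecast react faster to recent changes; a smaller one smooths out more of the random variation.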
Autoregressive Model – A regression model in which a regression relationship
based on past time series values is used to predict the future time series values
Causal Models – Forecasting methods that relate a time series to the variables that
are believed to explain or cause its behavior
Chapter 9
Observation – A set of observed values of variables associated with a single entity,
often displayed as a row in a spreadsheet or database
Variable – A characteristic or quantity of interest that can take on different values
Features – A set of input variables used to predict an observation’s outcome class
or continuous outcome value
Supervised Learning – Category of data mining techniques in which an algorithm
learns how to classify or estimate an outcome variable of interest.
Estimation – A predictive data mining task requiring the prediction of an
observation’s continuous outcome value
Classification – A predictive data mining task requiring the prediction of an
observation’s outcome class or category
Overfitting – A situation in which a model explains random patterns in the data on
which it is trained rather than just the generalized relationships, resulting in a
model with training set performance that greatly exceeds its performance on new
data
Training Set – Data used to build candidate predictive models
Validation Set – Data used to evaluate candidate predictive models
Test Set – Data used to compute an unbiased estimate of the final predictive model’s
performance
k-fold cross-validation – A robust procedure to train and validate models in which the
observations to be used to train and validate the model are repeatedly randomly
divided into k subsets called folds. In each iteration, one fold is designated as the
validation set and the remaining k-1 folds are designated as the training set. The
results of the iterations are then combined and evaluated
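The fold mechanics can be sketched as a generator that yields one training/validation split per iteration. This is a simplified illustration (a single random shuffle, then k folds); the function name and the seed parameter are assumptions:

```python
import random

def k_fold_splits(n_observations, k, seed=0):
    """Yield (training_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_observations))
    random.Random(seed).shuffle(indices)          # random assignment to folds
    folds = [indices[i::k] for i in range(k)]     # k roughly equal subsets
    for i, validation in enumerate(folds):
        # The remaining k - 1 folds form the training set.
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation

for training, validation in k_fold_splits(n_observations=10, k=5):
    print(sorted(validation), len(training))
```

Setting k equal to the number of observations makes each validation set a single observation, which is the leave-one-out case.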
Leave-one-out cross-validation – A special case of k-fold cross validation for which
the number of folds equals the number of observations in the combined training
and validation data
Undersampling – A technique that balances the number of Class 1 and Class 0
observations in a training set by removing majority class observations from the
training set
Oversampling – A technique that balances the number of Class 1 and Class 0
observations in a training set by inserting copies of minority class observations into
the training set.
Confusion Matrix – A matrix showing the counts of actual versus predicted class values
Overall Error Rate – The percentage of observations misclassified by a model in a
data set
Accuracy – Measure of classification success defined as 1 minus the overall error rate
False Positive – The misclassification of a Class 0 observation as Class 1
False Negative – The misclassification of a Class 1 observation as Class 0
Cutoff Value – The smallest value that the predicted probability of an observation
can be for the observation to be classified as Class 1
Cumulative Lift Chart – A chart used to present how well a model performs in
identifying observations most likely to be in Class 1 as compared with random
classification
Decile-wise lift chart – A chart used to present how well a model performs at
identifying observations for each of the top k deciles most likely to be in Class 1
versus a random classification
Sensitivity – The percentage of actual Class 1 observations correctly identified
Specificity – The percentage of actual Class 0 observations correctly identified
Precision – The percentage of observations predicted to be Class 1 that actually are
Class 1
F1 Score – A measure combining precision and sensitivity into a single metric
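The classification measures above all derive from the four cells of a confusion matrix. In the sketch below the counts are hypothetical, results are expressed as fractions rather than percentages, and the F1 score is computed as the harmonic mean of precision and sensitivity:

```python
def classification_metrics(tp, fn, fp, tn):
    """Measures from confusion-matrix counts (Class 1 is the positive class)."""
    total = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)   # actual Class 1 correctly identified
    specificity = tn / (tn + fp)   # actual Class 0 correctly identified
    precision = tp / (tp + fp)     # predicted Class 1 that are actually Class 1
    return {
        "overall_error_rate": (fn + fp) / total,
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }

# Hypothetical counts: 40 true positives, 10 false negatives,
# 5 false positives, 45 true negatives.
for name, value in classification_metrics(tp=40, fn=10, fp=5, tn=45).items():
    print(name, round(value, 3))
```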
Receiver operating characteristic (ROC) curve – A chart used to illustrate the
tradeoff between a model’s ability to identify Class 1 observations and its Class 0
error rate
Area under the ROC curve (AUC) – A measure of a classification method’s performance;
an AUC of 0.5 implies that a method is no better than random classification while a
perfect classifier has an AUC of 1.0
Average Error – The average difference between the actual values and the
predicted values of observations in a data set; used to detect prediction bias
Root mean squared error – A performance measure of an estimation method
defined as the square root of the average of the squared deviations between the
actual values and predicted values of observations.
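A brief sketch of the two estimation measures, assuming errors are taken as actual minus predicted and that the root mean squared error averages the squared errors before taking the square root (data and function names are hypothetical):

```python
import math

def average_error(actual, predicted):
    """Mean of (actual - predicted); values far from 0 suggest prediction bias."""
    return sum(a - p for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared deviation."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

actual = [10, 12, 14]      # hypothetical observed values
predicted = [9, 12, 16]    # hypothetical model predictions
print(round(average_error(actual, predicted), 3))            # -0.333
print(round(root_mean_squared_error(actual, predicted), 3))  # 1.291
```

With this sign convention, a negative average error indicates that the model tends to overestimate.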
Bias- The tendency of a predictive model to overestimate or underestimate the
value of a continuous outcome
Logistic Regression – A generalization of linear regression that predicts a categorical
outcome variable by computing the log odds of the outcome as a linear function of
the input variables
Mallows’ Cp Statistic – A measure in which small values approximately equal to the
number of coefficients suggest promising logistic regression models
K-nearest neighbors (k-NN) – A data mining method that predicts (classifies or
estimates) an observation i’s outcome value based on the k observations most
similar to observation i with respect to the input variables
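A minimal sketch of k-NN classification, assuming similarity is measured by Euclidean distance and the predicted class is the majority class among the k nearest training observations (for estimation, the neighbors' outcome values would be averaged instead). The training data and function name are hypothetical:

```python
import math
from collections import Counter

def knn_classify(training, new_point, k):
    """training: list of (input_values, class_label) pairs."""
    # Sort training observations by distance to the new observation.
    nearest = sorted(training, key=lambda row: math.dist(row[0], new_point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]  # majority class among the k neighbors

training = [
    ((1.0, 1.0), 0),
    ((1.5, 2.0), 0),
    ((3.0, 4.0), 1),
    ((3.5, 5.0), 1),
    ((5.0, 7.0), 1),
]
print(knn_classify(training, (3.0, 3.5), k=3))  # 1
print(knn_classify(training, (1.0, 1.2), k=3))  # 0
```

Choosing an odd k for a two-class problem avoids tied votes; input variables are typically standardized before distances are computed.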
Impurity – Measure of the heterogeneity of observations in a classification or
regression tree
Classification Tree – A tree that classifies a categorical outcome variable by splitting
observations into groups via a sequence of hierarchical rules on the input variables
Regression Tree – A tree that predicts values of a continuous outcome variable by
splitting observations into groups via a sequence of hierarchical rules on the input
variables
Ensemble Method – A predictive data mining approach in which a committee of
individual classification or estimation models is generated and a prediction is
made by combining these individual predictions
Unstable – When small changes in the training set cause a model’s predictions to
fluctuate substantially
Bagging – An ensemble method that generates a committee of models based on
different random samples and makes predictions based on the average prediction
of the set of models
Out-of-bag estimation – A method of estimating the predictive performance of a
bagging ensemble of m models (without a separate validation set) by leveraging
the concept that the training of each model is only based on approximately 63.2%
of the original observations (due to sampling with replacement).
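The 63.2% figure follows from sampling n observations with replacement from n: any one observation is missed by all n draws with probability (1 − 1/n)^n, which approaches 1/e as n grows, so the expected share included is about 1 − 1/e. A quick check (the function name is hypothetical):

```python
import math

def expected_in_bag_share(n):
    """Expected fraction of n observations appearing in a bootstrap sample of size n."""
    return 1 - (1 - 1 / n) ** n

for n in (10, 100, 10_000):
    print(n, round(expected_in_bag_share(n), 4))

print(round(1 - math.exp(-1), 4))  # limiting value: 0.6321
```

The roughly 36.8% of observations left out of each model's sample serve as that model's out-of-bag validation data.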
Boosting – An ensemble method that iteratively samples from the original training
data to generate individual models that target observations that were mispredicted
in previously generated models, and then bases the ensemble predictions on the
weighted average of the predictions of the individual models, where the weights
are proportional to the individual models’ accuracy.
Random Forests – A variant of the bagging ensemble method that generates a
committee of classification or regression trees based on different random samples
but restricts each individual tree to a limited number of randomly selected features
Class 0 error rate – The percentage of Class 0 observations misclassified by a model
in a data set
Class 1 error rate – The percentage of actual Class 1 observations misclassified by a
model in a data set
Observation (record) – A set of observed values of variables associated with a single
entity, often displayed as a row in a spreadsheet or database
Sensitivity (recall) – The percentage of actual Class 1 observations correctly identified
Variable (feature) – A characteristic or quantity of interest that can take on
different values
Chapter 10
What-if model – A model designed to study the impact of changes in model inputs
on model outputs
Make-versus-buy decision – A decision often faced by companies that have to
decide whether they should manufacture a product or outsource its production to
another firm.
Influence Diagram – A visual representation that shows which entities influence
others in a model
Decision Variable - A model input the decision maker can control
Parameters – In a what-if model, the uncontrollable model inputs
Data Table - An Excel tool that quantifies the impact of changing the value of a
specific input on an output of interest
One-way data table – An Excel Data Table that summarizes a single input’s impact
on the output of interest
Two-way data table – An Excel Data Table that summarizes two inputs’ impact on
the output of interest
Goal Seek – An Excel tool that allows the user to determine the value for an input
cell that will cause the value of a related output cell to equal some specified value,
called the goal
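Outside Excel, the same ideas can be sketched in a few lines: a what-if model for a make-versus-buy decision, a one-way data table over production quantity, and a goal-seek search for the breakeven quantity. The cost figures and function names are hypothetical, and bisection is used here only as a simple stand-in; it is not a description of how Excel's Goal Seek works internally:

```python
def savings_from_making(quantity, fixed_cost=234_000.0,
                        variable_cost=2.0, purchase_price=3.5):
    """What-if model: cost to buy minus cost to make (parameters are hypothetical)."""
    make_cost = fixed_cost + variable_cost * quantity
    buy_cost = purchase_price * quantity
    return buy_cost - make_cost

def goal_seek(model, goal, low, high, tolerance=1e-6):
    """Find the input between low and high where model(input) equals goal.

    Assumes model(low) - goal and model(high) - goal have opposite signs.
    """
    gap_low = model(low) - goal
    while high - low > tolerance:
        mid = (low + high) / 2
        gap_mid = model(mid) - goal
        if (gap_mid > 0) == (gap_low > 0):
            low, gap_low = mid, gap_mid   # root lies in the upper half
        else:
            high = mid                    # root lies in the lower half
    return (low + high) / 2

# One-way data table: one input (quantity), one output (savings).
for quantity in (100_000, 150_000, 200_000):
    print(quantity, savings_from_making(quantity))

# Goal seek: quantity at which making and buying cost the same.
breakeven = goal_seek(savings_from_making, goal=0.0, low=0.0, high=500_000.0)
print(round(breakeven))  # 156000
```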
Scenario Manager – An Excel tool that quantifies the impact of changing multiple
inputs on one or more outputs of interest
Trace Precedents button – After selecting cells, this button creates arrows pointing
to the selected cell from cells that are part of the formula in that cell.
Trace Dependents button – Shows arrows pointing from the selected cell to cells
that depend on the selected cell.