Uploaded by Pham Phuong Uyen

10210215 SFM-A1.1

Table of Contents
Introduction
Major Findings
Part A: Business and economic data evaluation
1. Data collection methods
2. Source of Data
3. Method for analyzing data
Part B: Communicate findings using appropriate charts / tables
1. Cleaning the data
2. Summary statistics, tables, charts to explore each variable
3. The relationship between variables
4. Summary quantitative variable classifying by qualitative variables
5. Evaluation of various types of tables and charts
Part C: Analysing and evaluating “House Price Data Project” data
1. T-test
2. Regression analysis
3. Evaluate the use of summary statistics
4. Differences between regression analysis and correlation coefficients
Reference
Introduction
This research has three aims. The first is to evaluate business and economic data using the
case study titled "House Price Data Project." The second is to locate and analyze the data held
by the corporation using a number of different approaches. The third is to communicate the
most important findings using the charts and tables that are most applicable.
Major Findings
Part A: Business and economic data evaluation
1. Data collection methods:
Any company may benefit from data. However, data must be collected utilizing a number of
methods. Furthermore, data should be as consistent, dependable, and accurate as feasible.
Quantitative data and qualitative data are the two fundamental types of data gathered.

Quantitative data is information that is expressed in a specified amount or range, allowing it
to be tallied and quantified using exact measurements. In statistical analysis, quantitative data
are widely used. Questionnaires, surveys, document reviews, random sampling,
observations, and in-person or online interviews are just some of the ways that data may be
collected (Barrow, 2017):
o Surveys and questionnaires: This strategy is convenient because it can be carried out
either in person or over the internet. Platforms such as Typeform, Qualtrics, and
SurveyMonkey allow anybody to collect quantitative data. Furthermore, this technique
simplifies and quantifies the thoughts and behaviors of participants.
o Document review: Because this method gathers data that already exists inside or
outside an organization, it is a relatively efficient source of information. It should be
noted, however, that this is primarily a secondary data source that includes three
categories of documents: public, private, and physical evidence.
o Sampling: Instead of analyzing the entire set of data, researchers may select a
smaller, representative sample to investigate. The two main types of sampling are
random probability sampling and non-probability sampling. Furthermore, data analysts
may use programming languages and other procedures to uncover patterns in
massive data sets.
o Observations: This is a straightforward and effective approach in which the observer
simply watches activities and events taking place in their natural setting. Researchers
can therefore see participants make judgments and respond to challenges in a context
that is closer to their everyday lives, as opposed to a controlled environment such as
a laboratory or focus group.
o Interviews: Several types of interviews are available nowadays, as well as
technologies to help with the interview process. Researchers prepare standard
questions and may interview individuals over the phone or by video conference.

Qualitative data: Qualitative data are descriptive results that help the analyst dig deeper into
hypotheses and numerical findings. In statistics, qualitative data are non-numerical,
observable data that are classified by an attribute of an object or phenomenon. For instance,
researchers may gather the perspectives and experiences of individual patients and their
families. Researchers might collect such data through a variety of means, including in-depth
interviews, focus group discussions, secondary research, or observations (McClave, Benson
and Sincich, 2014):
o In-depth interview: This method obtains information from respondents in a flexible
manner through direct engagement and conversation with a research subject.
"Open-ended" questions are usually employed so that respondents give the fullest
possible information.
o Focus group: Multidimensional and objective results from multiple perspectives are
obtained through interaction and discussion with a group of research volunteers.
o Secondary research: Collect any information that is already available, whether it be
in the form of text, images, audio recordings, or video. In addition, there are two
approaches to research, known as case studies and longitudinal studies. A longitudinal
study collects data from the same source on a consistent basis over time to establish
a correlation between the subjects being researched. In contrast, a case study involves
the observation of individuals within a specific setting.
o Observations: Record in thorough field notes what researchers can see, hear, or
experience.
2. Source of Data:
Data and statistics that are obtained, kept, and collected are used for a variety of analysis and
evaluation purposes. Data is classified into two types: primary and secondary. Primary data is
collected directly by the researchers in line with their own aims and methodologies. Secondary
data, by contrast, is gathered from the documentation, research, and analysis of others
(McClave, Benson and Sincich, 2014):
Secondary data: This data has already been collected through someone else's research, review,
or testing, and it is obtained from reputable data stores. Because each company maintains its
own database, such data is preserved and can be reused. Examples include data on customers,
sales, profits, payroll, and bonuses, in addition to data on employees and their private
information. Information can also be obtained from a wide range of organizations and groups,
and the Internet is an extremely helpful resource. Readers, for instance, can quickly search the
Internet for basic information such as a company's sales, product pricing, and commercial plans.
Primary data: This is an excellent source of data, but in order to get the most out of it, the
observer will need to take notes while they are out in the field. Observers will collect one or more
variables for statistical analysis, and the results of this analysis will ultimately be incorporated in
the findings (Anderson et al., 2018). For instance, in order to determine which smartphone is the
most popular purchase at Mobile World, the company collects information on the purchasing
patterns of a representative sample of the store's customers. Using this data, managers may
study their preferences and purchase time. They may get insight into how customers' wishes
impact their spending in order to develop successful business strategies.
| | Secondary data | Primary data |
|---|---|---|
| Advantages | Cost-effective; fast; high-quality data gathered by specialists; data collection methods vary | Might be more accurate; you have more control over the data; privacy is maintained; you have a better understanding of the data |
| Disadvantages | Data could be outdated; not tailored to your requirements; data can be skewed in favor of the person gathering it | It takes a lot of time to build this data; it costs a lot of money; it needs more labor during the survey |
3. Method for analyzing data:
In statistics, there are two methods for analyzing data: descriptive analysis and inferential
analysis.
Descriptive analysis uses frequency tables, cross tables, and graphs to describe and evaluate
qualitative variables. Quantitative variables are summarized by means, medians, modes,
standard deviations, and variances, or shown in some form of graph. The aim of descriptive
analysis is to produce a summary of the data. Each variable is described on its own, without
modelling how the variables influence one another. Because analysts are primarily concerned
with summarizing and examining the data, rather than establishing relationships between the
variables in the data set, this method is straightforward to implement.
In addition, there is a type of statistical analysis known as inferential analysis. Because it draws
conclusions about a population from sample results, it is statistically more advanced than
descriptive analysis. The researcher works with estimates and a significance level (α = 0.05)
against which P values are compared. Estimation can be broken down into two categories: point
estimates, which give a single value, and interval estimates, which give a range bounded by two
values. In addition, hypotheses about both qualitative and quantitative variables can be tested in
order to forecast and draw conclusions about trends in large populations. With this method, the
analyst can assess the link between the variables. As a consequence, businesses can rely on
statistical data to develop hypotheses and practical solutions for their business problems.
Differences between descriptive and inferential statistics

| Descriptive statistics | Inferential statistics |
|---|---|
| When dealing with data of a modest size | When dealing with data of a huge size |
| Providing a meaningful presentation of all of the data | Analysis, comparison, and forecasting of data expressed as a percentage |
| Simple to carry out the procedure | Complex procedure that calls for the use of many different methods |
| Low levels of error in terms of frequency | There are many errors |
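The difference between a point estimate and an interval estimate, described above, can be illustrated with a short Python sketch. The sample values are hypothetical and are not taken from the project data, and the interval uses a normal approximation, which is an assumption that is only reasonable for large samples.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample of house prices (illustration only)
sample = [850_000, 1_200_000, 1_650_000, 1_837_000, 2_200_000, 2_500_000, 3_067_500]

def point_and_interval(values, confidence=0.95):
    """Return the sample mean and a normal-approximation confidence interval."""
    n = len(values)
    point = mean(values)                              # point estimate: one value
    z = NormalDist().inv_cdf(0.5 + confidence / 2)    # about 1.96 for 95%
    margin = z * stdev(values) / sqrt(n)
    return point, (point - margin, point + margin)    # interval estimate: two values

point, (low, high) = point_and_interval(sample)
print(f"point estimate: {point:.0f}")
print(f"95% interval:   {low:.0f} to {high:.0f}")
```

The point estimate is a single number, while the interval estimate reports a lower and an upper bound around it.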
Part B: Communicate findings using appropriate charts / tables.
1. Cleaning the data.
Using the frequencies method to identify missing values in the data set

Statistics

| N | Type | Price | Bedrooms | Bathrooms | Area | Furnished | Level |
|---|---|---|---|---|---|---|---|
| Valid | 501 | 501 | 501 | 501 | 500 | 501 | 501 |
| Missing | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
There are 501 observations and 1 missing value, in the Area variable. Because only one value
is missing, its effect is negligible and the record is retained.
The price variable clearly has a great many extreme values. Because price is the dependent
variable in the analysis in the next two parts, its outliers are deleted.
The bedrooms and bathrooms variables have low standard deviations, which means that outliers
do not significantly affect the data; hence, these variables are retained as they are.
The area variable, in contrast to price, is not used much in the following two parts of the
analysis. It is therefore kept as it is, even though it has many outliers, because those outliers
do not affect the majority of the analysis.
Statistics

| | The price of property | Number of bedrooms | Number of bathrooms | The Area of the property by m2 |
|---|---|---|---|---|
| N Valid | 501 | 501 | 501 | 500 |
| N Missing | 0 | 0 | 0 | 1 |
| Percentile 25 | 850000 | 2.00 | 2.00 | 120.000 |
| Percentile 50 | 1837000 | 3.00 | 2.00 | 160.000 |
| Percentile 75 | 3067500 | 3.00 | 3.00 | 199.750 |
| Q3 + 1.5IQR | 6393750 | | | 319.375 |
The IQR method is used to find the upper fence, Q3 + 1.5IQR, and observations larger than this value are deleted as outliers.
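The upper fence can be checked with a short Python sketch. The quartiles are the ones reported in the percentile table above; the list of prices used for the filtering step is hypothetical and only shows how the rule is applied.

```python
def upper_fence(q1, q3, k=1.5):
    """Upper outlier fence: Q3 + k * IQR, where IQR = Q3 - Q1."""
    return q3 + k * (q3 - q1)

# Quartiles (25th and 75th percentiles) from the table above
price_fence = upper_fence(850_000, 3_067_500)
area_fence = upper_fence(120.0, 199.75)
print(price_fence)  # 6393750.0
print(area_fence)   # 319.375

# Hypothetical prices, used only to illustrate the deletion step
prices = [35_000, 1_837_000, 6_162_000, 7_500_000]
kept = [p for p in prices if p <= price_fence]
print(kept)         # [35000, 1837000, 6162000]
```

Both computed fences agree with the Q3 + 1.5IQR row of the table.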
Statistics

| N | Type | Price | Bedrooms | Bathrooms | Area | Furnished | Level_group |
|---|---|---|---|---|---|---|---|
| Valid | 479 | 479 | 479 | 479 | 479 | 479 | 479 |
| Missing | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
To summarize, after removing the outliers and cleaning the data, 479 valid observations remain
for each variable, and there are no missing values.
2. Summary statistics, tables, charts to explore each variable
Qualitative variable
a. Level group
Level_group

| Valid | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| 1.00 | 284 | 59.3 | 59.3 | 59.3 |
| 2.00 | 195 | 40.7 | 40.7 | 100.0 |
| Total | 479 | 100.0 | 100.0 | |
Group 1 accounts for 59.3% of the houses, while group 2 accounts for the remaining 40.7%.
b. Type
Type

| Valid | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| Apartment | 436 | 91.0 | 91.0 | 91.0 |
| Duplex | 27 | 5.6 | 5.6 | 96.7 |
| Penthouse | 10 | 2.1 | 2.1 | 98.7 |
| Studio | 6 | 1.3 | 1.3 | 100.0 |
| Total | 479 | 100.0 | 100.0 | |
The pie chart shows the different types of homes. Apartments clearly account for the largest
share, at 91% with 436 units. There are 27 duplex homes, which make up 5.6% of the total.
Finally, there are 10 penthouses and 6 studios, which account for 2.1% and 1.3% respectively.
c. Furnished
Furnished

| Valid | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| No | 295 | 61.6 | 61.6 | 61.6 |
| Yes | 184 | 38.4 | 38.4 | 100.0 |
| Total | 479 | 100.0 | 100.0 | |
Unfurnished houses account for 61.6% of the total, while furnished houses account for 38.4%.
Quantitative variables
a. Price
Descriptive Statistics

| | N | Range | Minimum | Maximum | Mean | Std. Deviation | Variance |
|---|---|---|---|---|---|---|---|
| Price | 479 | 6127000 | 35000 | 6162000 | 1999718.33 | 1407692.078 | 1981596986002.352 |
| Valid N (listwise) | 479 | | | | | | |
The price of each home depends not only on its area and type, but also on whether or not the
home is furnished. As a result, the range of this variable is large, at 6,127,000: the highest
value is 6,162,000 and the lowest is 35,000. The mean price of a house is approximately
1,999,718.33 dollars. The standard deviation is large (about 1,407,692), and the histogram is
skewed to the right.
b. Bedrooms
Descriptive Statistics

| | N | Range | Minimum | Maximum | Mean | Std. Deviation | Variance |
|---|---|---|---|---|---|---|---|
| Bedrooms | 479 | 4 | 1 | 5 | 2.75 | .606 | .367 |
| Valid N (listwise) | 479 | | | | | | |
Because the range of this variable is small (4), the standard deviation is also small, at 0.606.
Across the 479 houses, the mean is 2.75, or around 3 bedrooms. In contrast to the price
histogram, this one is roughly symmetrical.
c. Bathrooms
Descriptive Statistics

| | N | Range | Minimum | Maximum | Mean | Std. Deviation | Variance |
|---|---|---|---|---|---|---|---|
| Bathrooms | 479 | 3 | 1 | 4 | 2.08 | .716 | .513 |
| Valid N (listwise) | 479 | | | | | | |
Because the range of this variable is small (3), the standard deviation is also small, at 0.716.
Across the 479 houses, the mean is 2.08, or around 2 bathrooms. This histogram is also roughly
symmetrical.
d. Area
Descriptive Statistics

| | N | Range | Minimum | Maximum | Mean | Std. Deviation | Variance |
|---|---|---|---|---|---|---|---|
| Area | 479 | 296.0 | 20.0 | 316.0 | 157.107 | 52.7270 | 2780.132 |
| Valid N (listwise) | 479 | | | | | | |
The range of this variable is fairly large, at 296: the highest value is 316 and the lowest is 20.
The mean area of a house is approximately 157.107 m2, which suggests that the homes in the
data are fairly spacious on average. The standard deviation (52.727) is low relative to the mean,
and the histogram is roughly symmetrical.
3. The relationship between variables
Correlations

| | | Price | Bedrooms | Bathrooms | Area |
|---|---|---|---|---|---|
| Price | Pearson Correlation | 1 | .313** | .567** | .449** |
| | Sig. (2-tailed) | | .000 | .000 | .000 |
| | N | 479 | 479 | 479 | 479 |
| Bedrooms | Pearson Correlation | .313** | 1 | .533** | .661** |
| | Sig. (2-tailed) | .000 | | .000 | .000 |
| | N | 479 | 479 | 479 | 479 |
| Bathrooms | Pearson Correlation | .567** | .533** | 1 | .689** |
| | Sig. (2-tailed) | .000 | .000 | | .000 |
| | N | 479 | 479 | 479 | 479 |
| Area | Pearson Correlation | .449** | .661** | .689** | 1 |
| | Sig. (2-tailed) | .000 | .000 | .000 | |
| | N | 479 | 479 | 479 | 479 |

**. Correlation is significant at the 0.01 level (2-tailed).

R = 0.313, which is in the range of 0.3 to 0.5, so there is a low positive correlation between
price and the number of bedrooms in a house.

R = 0.567, which is in the range of 0.5 to 0.7, so there is a moderate positive correlation
between the number of bathrooms and the price of the house.

R = 0.449, which is in the range of 0.3 to 0.5, so there is a low positive correlation between
sale price and area.

Sig. (2-tailed) = 0.000 < α = 0.05 for each pair, so price is correlated with area and with the
number of rooms, and each correlation is significant at the 1% level.
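The verbal labels used above can be expressed as a small Python helper. The 0.3 to 0.5 ("low") and 0.5 to 0.7 ("moderate") bands come from the text; the labels for values below 0.3 and above 0.7 are assumptions added only so that the function covers every case.

```python
def describe_correlation(r):
    """Return a verbal label for a Pearson correlation coefficient."""
    size = abs(r)
    if size < 0.3:
        strength = "negligible"   # assumed band, not stated in the text
    elif size < 0.5:
        strength = "low"          # 0.3 to 0.5, as in the text
    elif size < 0.7:
        strength = "moderate"     # 0.5 to 0.7, as in the text
    else:
        strength = "high"         # assumed band, not stated in the text
    direction = "positive" if r >= 0 else "negative"
    return f"{strength} {direction}"

# Correlations with Price, taken from the Correlations table
price_correlations = {"Bedrooms": 0.313, "Bathrooms": 0.567, "Area": 0.449}
for variable, r in price_correlations.items():
    print(f"Price vs {variable}: r = {r} ({describe_correlation(r)})")
```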
The scatter plot shown above depicts the relationship between selling price and total area.
Price tends to change as area changes: the dense band of points implies a positive link
between selling price and area, which is consistent with the significant correlation reported
above. However, there are a few regions where the points are sparse and price shows little
dependence on floor area. In general, residences with a bigger area cost more, while smaller
dwellings are more affordable.
The scatter plot of price against the number of bedrooms shows only a weak relationship,
because prices vary widely regardless of how many bedrooms there are. This matches the low
correlation (R = 0.313). Some homes with only a few bedrooms are still priced high because of
their type and location, while some houses with several bedrooms sell for less.
The scatter plot of price against the number of bathrooms likewise shows wide variation in price
at each bathroom count, so the upward trend is hard to see by eye even though the correlation
is moderate (R = 0.567). Some homes with few bathrooms are still expensive for their type and
location, while some houses with many bathrooms are less costly.
4. Summary quantitative variable classifying by qualitative variables
Price

| Type | Count | Mean | Median | Mode | Standard Deviation |
|---|---|---|---|---|---|
| Apartment | 436 | 1928151 | 1700000 | 2500000 | 1362388 |
| Duplex | 27 | 2876723 | 3340000 | 265000a | 1721394 |
| Penthouse | 10 | 3255766 | 3140473 | 1250000a | 1436892 |
| Studio | 6 | 1160333 | 1200000 | 1200000 | 483885 |

a. Multiple modes exist. The smallest value is shown
The table shows that the majority of the homes sold fall into the "Apartment" category, with 436
units. Their mean selling price is 1928151, the most common price (mode) is 2500000, and the
standard deviation is 1362388. "Duplex" homes are more luxurious than apartments and account
for only 27 units, with a mean price of 2876723, a smallest mode of 265000 (multiple modes
exist), and a standard deviation of 1721394. Penthouses are not very different from duplexes:
the 10 penthouses have a mean price of 3255766, a smallest mode of 1250000, and a standard
deviation of 1436892. Only six studios were sold, with a mean price of 1160333, a median and
mode of 1200000, and a standard deviation of 483885.
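A grouped summary of this kind, including the "smallest value is shown" rule for multiple modes, can be sketched with Python's standard library. The prices below are hypothetical and chosen only to show the behaviour; they are not the project data.

```python
from statistics import mean, median, multimode, stdev

def summarize(values):
    """Count, mean, median, mode and sample standard deviation for one group."""
    modes = multimode(values)          # every value tied for most frequent
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "mode": min(modes),            # report the smallest mode when several exist
        "multiple_modes": len(modes) > 1,
        "sd": stdev(values),
    }

# Hypothetical prices per property type (illustration only)
prices_by_type = {
    "Studio": [900_000, 1_200_000, 1_200_000, 1_500_000],
    "Duplex": [265_000, 265_000, 3_340_000, 3_340_000, 4_000_000],
}

summary = {name: summarize(values) for name, values in prices_by_type.items()}
for name, stats in summary.items():
    flag = "a" if stats["multiple_modes"] else ""
    print(f"{name}: n={stats['count']}, mean={stats['mean']:.0f}, "
          f"median={stats['median']:.0f}, mode={stats['mode']}{flag}")
```

The trailing "a" mirrors the footnote marker in the table: it flags a group in which more than one mode exists.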
Price

| | | Count | Mean | Median | Mode | Standard Deviation |
|---|---|---|---|---|---|---|
| Furnished | No | 295 | 2039651 | 1837000 | 2200000 | 1359741 |
| | Yes | 184 | 1935696 | 1650000 | 1500000 | 1482878 |
| Level_group | 1.00 | 284 | 2125213 | 2000000 | 2500000 | 1432771 |
| | 2.00 | 195 | 1816947 | 1550000 | 1600000a | 1353242 |
The table shows that the majority of the homes sold fall into the "not Furnished" category, with
295 units. Their mean selling price is 2039651, the most common price (mode) is 2200000, and
the standard deviation is 1359741. The "Furnished" category has 184 units, with a mean price of
1935696 dollars, a mode of 1500000, and a standard deviation of 1482878 dollars. "Level group
1" has 284 units with a mean price of 2125213, a mode of 2500000, and a standard deviation of
1432771. "Level group 2" has 195 units with a mean price of 1816947, a smallest mode of
1600000 (multiple modes exist), and a standard deviation of 1353242.
5. Evaluation of various types of tables and charts
Qualitative variables
For qualitative variables, the most useful tools are frequency tables, bar charts, and pie charts.
The frequency table presents the number of observations in each distinct category. In this
project it summarizes the property type, whether the home is furnished, and the level group.
Bar charts and pie charts help viewers to see the share of each category quickly. They also
make it easy to compare the categories with one another.
Quantitative variables
Quantitative variables are summarized by the minimum, maximum, mean, median, mode, and
standard deviation of the data. In this project these statistics are reported for price, number of
bedrooms, number of bathrooms, and area. Histograms are typically used to represent
quantitative variables because they let readers quickly see the shape of the distribution,
including its spread and skewness.
Bivariate qualitative variables
The relationship between two qualitative variables can be presented in a cross table. The reader
may notice, for example, how property type relates to whether the home is furnished. It is also
possible to use a clustered bar chart to call attention to the relationship between the two
variables.
Bivariate quantitative variables
A scatter plot is a graphical representation that can be used to examine the direction and
strength of the relationship between an independent and a dependent variable. This is done by
looking at the pattern formed by the points on the plot. The correlation coefficient then
summarizes the relationship between each pair of variables as a single numerical value.
Part C: Analysing and evaluating “House Price Data Project” data
1. T-test
Based on the table, two hypotheses about the variances are put forward:
H0: σ²(furnished) = σ²(not furnished)
H1: σ²(furnished) ≠ σ²(not furnished)
F = 2.325
Sig. (F) = 0.128
α = 0.05
According to the result, Sig. (F) = 0.128 > 0.05 => do not reject H0
=> σ²(furnished) = σ²(not furnished)
=> The variances of the two groups are not significantly different
The hypotheses about the means are as follows:
H0: µ(furnished) = µ(not furnished)
H1: µ(furnished) ≠ µ(not furnished)
Sig. (2-tailed) = 0.432
α = 0.05
According to the result, Sig. (2-tailed) = 0.432 > 0.05 => do not reject H0
=> µ(furnished) = µ(not furnished)
=> This shows that there is no significant difference between the mean price of furnished
and unfurnished houses.
Based on the table, two hypotheses about the variances are put forward:
H0: σ²(group 1) = σ²(group 2)
H1: σ²(group 1) ≠ σ²(group 2)
F = 1.052
Sig. (F) = 0.306
α = 0.05
According to the result, Sig. (F) = 0.306 > 0.05 => do not reject H0
=> σ²(group 1) = σ²(group 2)
=> The variances of the two groups are not significantly different
The hypotheses about the means are as follows:
H0: µ(group 1) = µ(group 2)
H1: µ(group 1) ≠ µ(group 2)
Sig. (2-tailed) = 0.018
α = 0.05
According to the result, Sig. (2-tailed) = 0.018 < 0.05 => reject H0
=> µ(group 1) ≠ µ(group 2)
=> This shows that there is a significant difference between the mean price of group 1 and
group 2 houses.
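As a rough cross-check, the two mean comparisons can be recomputed from the group counts, means, and standard deviations reported in Part B, section 4. The sketch below is not the SPSS procedure itself: it assumes equal variances (as the F results suggest) and approximates the p-value with the normal distribution, which is an assumption that is acceptable here only because the degrees of freedom (477) are large.

```python
from math import sqrt
from statistics import NormalDist

def pooled_t_test(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance two-sample t statistic with an approximate two-tailed p-value."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))   # normal approximation to the t distribution
    return t, p

# Summary statistics from the grouped price table (Part B, section 4)
t_furn, p_furn = pooled_t_test(295, 2039651, 1359741, 184, 1935696, 1482878)
t_level, p_level = pooled_t_test(284, 2125213, 1432771, 195, 1816947, 1353242)

alpha = 0.05
for name, t, p in [("Furnished", t_furn, p_furn), ("Level group", t_level, p_level)]:
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(f"{name}: t = {t:.3f}, p = {p:.3f} => {decision}")
```

Under these assumptions the p-values come out close to the reported Sig. (2-tailed) values of 0.432 and 0.018, and the decision rule compares each of them with α = 0.05.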
2. Regression analysis
Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .580a | .336 | .329 | 1153106.999 |

a. Predictors: (Constant), Level_dummy, Bedrooms, Furished_dummy, Bathrooms, Area
The change in the dependent variable (price) is explained using five independent variables: the
number of bedrooms, the number of bathrooms, the area, and the furnished and level dummy
variables. R = 0.580 is the multiple correlation between the observed and the predicted prices.
R Square equals 0.336, which indicates that these predictors together explain 33.6% of the
variation in selling price.
ANOVAa

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 318276188523905.000 | 5 | 63655237704781.000 | 47.873 | .000b |
| Residual | 628927170785219.600 | 473 | 1329655752188.625 | | |
| Total | 947203359309124.600 | 478 | | | |

a. Dependent Variable: Price
b. Predictors: (Constant), Level_dummy, Bedrooms, Furished_dummy, Bathrooms, Area
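The F statistic and R Square can be recomputed from the sums of squares and degrees of freedom in the ANOVA table. The short Python sketch below does this; the variable names are illustrative only, and the sums of squares are typed in as read from the table.

```python
# Values as read from the ANOVA table
ss_regression = 318276188523905.000
ss_residual = 628927170785219.600
df_regression = 5      # number of predictors
df_residual = 473      # n - predictors - 1 = 479 - 5 - 1

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (df_regression + df_residual) / df_residual

print(f"F = {f_stat:.3f}")
print(f"R Square = {r_squared:.3f}")
print(f"Adjusted R Square = {adj_r_squared:.3f}")
```

The results agree with the reported F = 47.873, R Square = .336, and Adjusted R Square = .329.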
H0: The model is overall insignificant
H1: The model is overall significant
P Value = 0.000 < α = 0.05 => reject H0
=> The model is overall significant
Coefficientsa

| Model 1 | Unstandardized B | Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | -201980.818 | 256987.751 | | -.786 | .432 |
| Bedrooms | -93063.600 | 117497.280 | -.040 | -.792 | .429 |
| Bathrooms | 957095.887 | 103242.323 | .487 | 9.270 | .000 |
| Area | 3856.473 | 1583.556 | .144 | 2.435 | .015 |
| Furished_dummy | -222031.397 | 109761.620 | -.077 | -2.023 | .044 |
| Level_dummy | -130040.013 | 108835.403 | -.045 | -1.195 | .233 |

a. Dependent Variable: Price
To test the significance of each β value, there are two hypotheses:
H0: β = 0
H1: β ≠ 0
If Sig. > 0.05 => do not reject H0
If Sig. < 0.05 => reject H0
Based on the data table, the P Value of each variable is:
P Value of bedrooms = 0.429 > 0.05 => do not reject H0
P Value of bathrooms = 0.000 < 0.05 => reject H0
P Value of Area = 0.015 < 0.05 => reject H0
P Value of Furnished_dummy = 0.044 < 0.05 => reject H0
P Value of Level_dummy = 0.233 > 0.05 => do not reject H0
The P Values of three variables, namely the number of bathrooms, the Area, and the Furnished
dummy variable, are all lower than the alpha value. These variables therefore have a
statistically significant influence on Price in the regression model. If a variable's P Value is
greater than 0.05, it does not play a significant role in the regression model. In other words,
there is not enough evidence that the number of bedrooms or the Level dummy variable has an
effect on Price.
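Each t value in the Coefficients table is the unstandardized coefficient B divided by its standard error. The following Python sketch recomputes the t values from the table and applies the decision rule to the reported Sig. values; the dictionary layout is only an illustration.

```python
# (B, Std. Error, reported Sig.) from the Coefficients table
coefficients = {
    "Bedrooms": (-93063.600, 117497.280, 0.429),
    "Bathrooms": (957095.887, 103242.323, 0.000),
    "Area": (3856.473, 1583.556, 0.015),
    "Furished_dummy": (-222031.397, 109761.620, 0.044),
    "Level_dummy": (-130040.013, 108835.403, 0.233),
}

alpha = 0.05
t_values = {}
significant = []
for name, (b, std_error, sig) in coefficients.items():
    t_values[name] = b / std_error            # t = B / Std. Error
    if sig < alpha:
        significant.append(name)
    decision = "reject H0" if sig < alpha else "do not reject H0"
    print(f"{name:15s} t = {t_values[name]:7.3f}  Sig. = {sig:.3f} => {decision}")

print("Significant predictors:", significant)
```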
We have the formula:
Y = β0 + β1X1 + β2X2 + β3X3, in which
β0: Regression constant
β1, β2, β3: Regression coefficients of the 3 significant variables (Number of bathrooms, Area,
Furnished_dummy)
X1, X2, X3: the 3 independent variables
Using the coefficients from the table above:
Predicted Price = −201980.818 + 957095.887 × Bathrooms + 3856.473 × Area
− 222031.397 × Furnished_dummy
β1 = 957095.887: when the number of bathrooms increases by 1, the predicted house price
increases by 957095.887, holding the other variables constant
β2 = 3856.473: when the area of the house increases by 1 m2, the predicted house price
increases by 3856.473, holding the other variables constant
β3 = -222031.397: when the house is furnished (Furnished_dummy = 1), the predicted house
price is lower by 222031.397 than for an otherwise identical unfurnished house
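The fitted equation can be written as a small Python function to show how each coefficient changes the predicted price. The function name and the example house (2 bathrooms, 160 m2) are hypothetical; the coefficients are the ones reported in the Coefficients table.

```python
def predict_price(bathrooms, area_m2, furnished):
    """Predicted price from the reported coefficients (significant predictors only)."""
    furnished_dummy = 1 if furnished else 0
    return (-201980.818
            + 957095.887 * bathrooms
            + 3856.473 * area_m2
            - 222031.397 * furnished_dummy)

# Hypothetical house: 2 bathrooms, 160 m2, unfurnished
base = predict_price(2, 160, furnished=False)
print(f"base prediction:          {base:.3f}")

# Changing one input at a time isolates each coefficient
extra_bathroom = predict_price(3, 160, furnished=False) - base
extra_square_metre = predict_price(2, 161, furnished=False) - base
furnished_effect = predict_price(2, 160, furnished=True) - base
print(f"one more bathroom:        {extra_bathroom:+.3f}")
print(f"one more square metre:    {extra_square_metre:+.3f}")
print(f"furnished vs unfurnished: {furnished_effect:+.3f}")
```

Each difference equals the corresponding coefficient, which is what "holding the other variables constant" means in practice.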
3. Evaluate the use of summary statistics
In descriptive statistics, each variable (price, number of bedrooms, number of bathrooms, area,
property type, furnished status, and level group) is analyzed and summarized on its own. This
strategy helps readers to comprehend the information because the data is presented succinctly
in the form of numerous tables and charts. Having said that, this method does not by itself
produce any hypotheses or predictions.
In addition, the data analyst can make use of inferential statistics to test hypotheses and reach
conclusions that are supported by the data. However, this strategy requires a higher level of
statistical expertise from the analyst.
4. Differences between regression analysis and correlation coefficients
Analysts use the correlation coefficient to investigate the connection that exists between two
variables, such as the selling price and the area of the home. It shows whether the connection
between the two variables is positive or negative, and how strong it is. Regression analysis
takes the same variables into account, but it goes further: it shows researchers how much the
dependent variable shifts in response to a change in the independent variable. For instance,
according to the regression model above, if the area is increased by one square meter, the
predicted price rises by about 3856.473, holding the other variables constant.
Reference
Anderson, D.R., Sweeney, D.J., Williams, T.A., Camm, J.D. and Cochran, J.J. (2018).
Statistics for business & economics. Boston, MA: Cengage Learning.
Barrow, M. (2017). Statistics for economics, accounting and business studies. Pearson Education
Limited.
Kalish, C. and Thevenow-Harrison, J. (2014). Descriptive and inferential problems of
induction. Psychology of Learning and Motivation, pp. 1-39.
Kaur, P., Stoltzfus, J. and Yellapu, V. (2018). Descriptive statistics. International Journal of
Academic Medicine, 4(1), p. 60.
McClave, J.T., Benson, P.G. and Sincich, T. (2014). Statistics for business and economics.
Boston: Pearson.