WHAT FEATURES DETERMINE THE PRICE OF A USED CAR?
STATISTICS PROJECT
ECON 306 (001)
FALL 2013
ELIZABETH SWAIN
JORDAN HOPF
NICHOL BHANDARI
CHAU TRAN
Descriptive Statistics
The team gathered data for this paper using used car websites such as www.carmax.com and
www.cars.com. All team members gathered data for used sedans. Each team member was
assigned a unique color to use in their search to eliminate the risk of duplicate data. Parameters
were set for the age and location of the cars. The maximum age for each car used in this model is
9 years old. To keep the cars within the state of Maryland, we set a 50-mile radius from zip code
21090 (Linthicum Heights, MD). Our model is as follows:
Price = b0 + b1(mileage) + b2(age) + b3(sunroof) + b4(seats) + b5(hybrid) + b6(trans)
+ b7(pwr) + b8(doors) + b9(domfrn)
The dependent variable in our model is the selling price of a used car. Our independent
variables are the age, the mileage, the number of doors (2 or 4), the type of seats (leather or
cloth), whether the car has an automatic or manual transmission, whether it has a sunroof,
whether it has power locks and windows, whether it is a hybrid, and whether it is domestic or
foreign.
Mileage
The independent variable mileage measures the total number of miles the vehicle had been driven
before being put back up for resale. The mean miles driven for a used car in our sample was
38,638, with a standard deviation of 25,760.76 miles. The median miles driven for each sedan was
35,000. The minimum and maximum miles driven were 1,993 and 119,000, respectively. The
interquartile range is 32,505 miles. The mileage at the 25th percentile and the 75th percentile is
17,496 and 50,000, respectively. With regard to outliers, the upper bound is 115,920 miles and the
lower bound is -38,644.77 miles. Our data set contained three outliers: 116,000, 118,000, and
119,000 miles. The Pearson’s Coefficient of Skewness (PCS) is 0.42361. Because this value is
positive, the distribution is right skewed, which is evident in the histogram above.
Age
Age measures the time elapsed, in years, between when the car was manufactured and today (the
date of sale). We used the formula Age = 2014 – year of sedan. For example, to determine the age
of a 2008 used sedan, we would take Age = 2014 – 2008. Therefore, a sedan manufactured in 2008
is six years old. The mean age of a used vehicle in our sample is 3.33 years. The standard
deviation is 1.825 years. The median age of a used sedan is 3 years. The minimum and maximum
ages for a used car are 1 and 9 years, respectively. The interquartile range is 1 year. The upper
bound is 8.805 years and the lower bound is -2.145 years. We have only one outlier, which is a
used 9-year-old sedan. The Pearson’s Coefficient of Skewness (PCS) is 0.54246. Because this value
is positive, the distribution is right skewed, which is evident in the histogram above.
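The skewness values and outlier bounds reported for mileage and age can be reproduced from the summary statistics alone. The sketch below assumes that PCS was computed as 3 × (mean − median) / standard deviation and that the outlier bounds sit three standard deviations on either side of the mean. Both are assumptions inferred from the reported numbers, not formulas stated in the report, and the function names are illustrative. Because the inputs are the rounded values from the text, the last digits can differ slightly from the reported figures.

```python
def pearson_skewness(mean, median, sd):
    """Pearson's coefficient of skewness, assumed here to be 3 * (mean - median) / sd."""
    return 3 * (mean - median) / sd


def three_sd_bounds(mean, sd):
    """Outlier cutoffs assumed to lie three standard deviations either side of the mean."""
    return mean - 3 * sd, mean + 3 * sd


# Summary statistics as reported in the text (rounded values).
mileage = {"mean": 38638, "median": 35000, "sd": 25760.76}
age = {"mean": 3.33, "median": 3, "sd": 1.825}

for name, stats in (("mileage", mileage), ("age", age)):
    pcs = pearson_skewness(stats["mean"], stats["median"], stats["sd"])
    lower, upper = three_sd_bounds(stats["mean"], stats["sd"])
    print(f"{name}: PCS = {pcs:.4f}, bounds = ({lower:.3f}, {upper:.3f})")
```

A positive PCS indicates that the mean lies above the median, which is the pattern expected for a right-skewed distribution.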
Sunroof
If a car in our sample had a sunroof or moonroof, we gave it a value of 1; cars without this
feature were given a 0. When coming up with variables that would affect the price of a car, we
assumed that a sunroof would increase the price. We believe the price would increase due to the
cost of the extra labor and extra parts, as well as the convenience and luxury of having one.
Sunroofs are typically found on higher-end models. The mean for SUNROOF is 0.42, which means
that 42% of the vehicles in our sample data had a sunroof.
Seats
For seats, we looked at whether a car in our sample data had cloth or leather seats. We gave a
value of 1 if the car had leather seats, and a value of 0 if it did not. We believe leather seats
would increase the price of a vehicle because they are generally not a standard feature, and the
cost of the material is higher than that of cloth seats. The mean for SEATS is 0.42, which means
that 42% of the cars in our data set have leather seats. It was not a surprise that both SUNROOF
and SEATS had a mean of 42%, because the two features are complements and are usually offered in
the same upgrade package.
Hybrid
If a selected car was a hybrid, we gave it a value of 1, and if the car was not a hybrid, we gave
it a value of 0. When coming up with variables that would affect the price, we assumed that a
hybrid car would have a higher sales price. This is due to the expensive parts and technology
inside the car. The mean for HYBRID was 0.14, which means that 14% of the vehicles in our sample
were hybrids.
Transmission
If a given car in our sample had an automatic transmission, then a value of 1 was assigned. If a
car had a manual transmission, then we assigned a value of 0. When coming up with this variable,
we assumed that having an automatic transmission would increase the price. The mean for
TRANS was 0.95, which means that 95% of the vehicles in our data set had an automatic
transmission.
Power
If a given car in our sample had power windows and locks, we assigned a value of 1. Since all
cars in our sample have both power locks and windows, PWR is considered a constant and thus
has been omitted.
Doors
If a car in our sample was a four-door sedan, we assigned a value of 1, and if the car was a
two-door coupe, then we assigned a value of 0. We believe that the price of a two-door car would
be higher than that of a four-door car because many two-door cars are considered sports cars. The
mean for DOORS is 0.94, which means that 94% of the cars in our data set have four doors.
Domestic or Foreign
If a vehicle chosen for the data set was an “American” made car (domestic), then it was given a
value of 1. If the car was foreign made, a value of 0 was given. We expected the price of a used
car to differ depending on whether the car is domestic or foreign, because foreign cars tend to
have a higher resale value than their domestic counterparts. The mean for DOMFRN is 0.71.
This means that 71% of the cars in our sample are foreign cars.
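The 0/1 coding used for the indicator variables above has two properties the text relies on: the mean of a 0/1 column equals the share of cars coded 1, and a column in which every car has the same value (as with PWR) carries no information and is dropped. The sketch below illustrates both. The five-car list is a hypothetical toy sample, not the project's data, and the function names are illustrative.

```python
def encode_indicator(values, one_label):
    """Return 1 where the value equals one_label, otherwise 0."""
    return [1 if value == one_label else 0 for value in values]


def share_of_ones(indicator):
    """Mean of a 0/1 column, which equals the share of observations coded 1."""
    return sum(indicator) / len(indicator)


def is_constant(indicator):
    """True when every observation has the same value, so the column has no variation."""
    return len(set(indicator)) == 1


# Hypothetical five-car toy sample, NOT the project's data.
seat_types = ["leather", "cloth", "cloth", "leather", "cloth"]
power = [1, 1, 1, 1, 1]  # every car has power locks and windows

seats = encode_indicator(seat_types, "leather")
print("SEATS column:", seats)
print("Share with leather seats:", share_of_ones(seats))
print("PWR is constant:", is_constant(power))
```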
IV. Two Sample Test Results
Hypothesis Testing
Step 1: Set up the null (H0) and alternative (H1) hypotheses.
H0: µ (Foreign used cars resale value) ≥ µ (Domestic used cars resale value)
H1: µ (Foreign used cars resale value) < µ (Domestic used cars resale value)
In this case, we wish to test whether foreign used cars have a lower resale value than domestic
used cars.
Step 2: Find the p-value and interpret it in the context.
The p-value is the probability of getting a sample result at least as extreme as ours if the null
hypothesis is true.
One-tailed Hypothesis:
Since the Levene’s test significance is greater than 0.05, we use the results with equal variances
assumed.
P-value = sig (2-tailed) / 2
= 0.542/2
= 0.271
[If H0 is true, there is a 27.1% chance of getting a difference in the resale value between foreign
used cars and domestic used cars of $486.526 or more.]
Step 3: Make a decision.
Level of Significance: 0.05
If P-value < level of significance, we reject H0, else we do not reject H0.
0.271 > 0.05, therefore, we do not reject the null (H0). We do not have enough evidence to
suggest that domestic used cars have a higher resale value than foreign used cars.
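The p-value arithmetic and decision rule in the steps above can be sketched as follows. This is a minimal illustration, not the software output used in the report: the function names are assumptions, and halving the two-tailed significance is valid only when the sample difference points in the direction stated by H1.

```python
def one_tailed_p(two_tailed_sig):
    """Halve a two-tailed significance value to get a one-tailed p-value.

    Valid only when the observed difference lies in the direction of H1.
    """
    return two_tailed_sig / 2


def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only if the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


p = one_tailed_p(0.542)  # two-tailed significance reported in the text
print(f"p-value = {p:.3f}: {decide(p)}")
```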
Part 5 – Linear Regression
1. State “a priori” expectations
· Positive – sunroof, seats, hybrid, power locks/windows, domestic/foreign
· Negative – mileage, age
· Unknown – transmission, number of doors
2. Run regression and get a regression equation
· We select Model 2 because it has the highest Adjusted R Squared (0.683) and the lowest
Standard Error of the Estimate (2223.809).
· In this model, the number of doors and power locks/windows variables were removed
from the regression.
Price = 22,803.961 – 0.05(Mileage) – 1,121.89(Age) + 994.55(Sunroof) + 980.85(Seats) +
2,208.58(Hybrid) + 1,280.69(Transmission) + 1,250.65(DomFrn)
3. Determine the statistical significance of each individual independent variable.
· 1% - mileage, age, hybrid, DomFrn
· 5% - none
· 10% - sunroof, seats
· Insignificant – transmission
Mileage: For each additional mile a car has, its selling price is predicted to decrease by $0.05,
holding all else constant.
Age: For each additional year of age, a car's selling price is predicted to decrease by $1,121.89,
holding all else constant.
Hybrid: If a vehicle is a hybrid, its selling price is predicted to increase by $2,208.58, holding
all else constant.
DomFrn: If a vehicle is foreign, its selling price is predicted to increase by $1,250.65, holding
all else constant.
Sunroof: If a vehicle has a sunroof, its selling price is predicted to increase by $994.55, holding
all else constant.
Seats: If a vehicle has leather seats, its selling price is predicted to increase by $980.85,
holding all else constant.
4. Adjusted R Squared
· Adjusted R Squared = 0.683
· In this regression, 68.3% of the variation in the selling price of a used vehicle can be
explained by the variation in the set of independent variables, adjusting for the number of
independent variables. The remaining 31.7% of the variation in selling price is left unexplained.
· This regression is reasonable, but it could be improved. We could add additional variables,
such as fuel economy, drive train, mechanical condition, paint color, exterior condition, interior
condition, and upkeep of the vehicle.
5. Standard Error of the Estimate
· Standard error of the estimate = $2,223.81
· The average difference between the actual and predicted values of the selling price of a used
car is $2,223.81.
6. Model Significance (F-Test)
· ANOVA Sig = 0.000
· H0: B1 = B2 = B3 = … = 0
· H1: at least one coefficient does not equal zero
· Our model is significant at the 1% level, thus we can conclude that at least one of the
coefficients does not equal zero.
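The Model 2 equation reported in this part can be written as a small function that turns a car's features into a predicted selling price. This is a sketch under stated assumptions: the dictionary keys and the example car are hypothetical, the coefficients are the rounded values from the text (the mileage coefficient in particular is rounded to $0.05 per mile), and the domfrn indicator is taken as already coded 0/1 in the same way as in the data set.

```python
# Coefficients of Model 2 as reported in the text (rounded).
INTERCEPT = 22803.961
COEFFICIENTS = {
    "mileage": -0.05,    # dollars per mile
    "age": -1121.89,     # dollars per year of age
    "sunroof": 994.55,   # 1 if the car has a sunroof
    "seats": 980.85,     # 1 if the car has leather seats
    "hybrid": 2208.58,   # 1 if the car is a hybrid
    "trans": 1280.69,    # 1 if the car has an automatic transmission
    "domfrn": 1250.65,   # origin indicator, coded 0/1 as in the data set
}


def predict_price(car):
    """Return the predicted selling price for a dict of feature values."""
    return INTERCEPT + sum(COEFFICIENTS[name] * car[name] for name in COEFFICIENTS)


# Hypothetical example car, not an observation from the sample.
example_car = {
    "mileage": 35000, "age": 3, "sunroof": 0, "seats": 0,
    "hybrid": 0, "trans": 1, "domfrn": 0,
}
print(f"Predicted price: ${predict_price(example_car):,.2f}")
```

Given the standard error of the estimate of $2,223.81, a prediction like this should be read as an approximate central value rather than an exact price.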
Literature Review:
The used car industry is expected to become more profitable. The demand for used cars keeps
increasing because of their affordable prices. The price of a used car is determined by many
factors. This study aims to identify the features that mainly determine the price of a used car
by using a multiple regression analysis.
Studies of used cars are limited, because consumers in this market mainly look for cheap cars with
acceptable quality. However, we found some similar studies that relate to our topic. The main
idea of our study is that when an individual goes to the used car market to buy a car, he or she
makes the decision based on characteristics of the car such as age, mileage, number of doors, type
of seats, automatic or manual transmission, sunroof, power locks and windows, whether the car is a
hybrid, and whether the car is domestic or foreign. Our analysis focuses on the determinants of
the value of a used car by interpreting the contributions of the factors that affect its price. We
use a quantitative method: we specify an equation and estimate it by running a regression.
The expected equation that explains the variation in the price of a used car looks like this:
Price = b0 + b1(mileage) + b2(age) + b3(sunroof) + b4(seats) + b5(hybrid) + b6(trans)
+ b7(pwr) + b8(doors) + b9(domfrn)
Because the power locks/windows variable was removed from the regression and transmission proved
statistically insignificant, our report arrives at six statistically significant variables that
determine the value of a used car and thus affect its price, as discussed below.
The older a car is and the more mileage it has, the more it has depreciated. Therefore, it is
normally cheaper and less valuable than a newer car. In the study “The Market for Used
Cars: A New Test of the Lemons Model”, Emons and Sheldon (2002), working with two separate
samples, conclude that “Less than 10 percent of the used cars purchased were re-sold within the
first year of ownership”. Another study, "The Determinants of Used Rental Car Prices" (Cho 2005),
reports, as expected, that both car age and mileage contribute negatively to resale value. That
report supports our results: for each additional mile and each additional year of age, a car's
selling price is predicted to decrease by $0.05 and $1,121.89, respectively, holding all else
constant.
Sunroofs and leather seats are among the characteristics of cars that attract people the most.
Young people tend to buy cars with sunroofs, leather seats, and power doors. Our analysis
indicates that interior features such as leather seats and a sunroof help a car retain a higher
resale value than cars without those features.
Carr-Ruffino and Acheson (2007) give a “green” reason for the trend toward hybrids: “Hybrids
have saved more than an estimated one million barrels of crude oil, three million pounds of
smog-forming gases, one million metric tons of carbon dioxide and an estimated 125 million
gallons of gasoline”. Moreover, rising gasoline prices keep steering consumers toward the
benefits of hybrid cars. Therefore, used hybrid cars are not only economical but also good for
the environment.
If a car is from a foreign manufacturer, its resale price will be higher than that of a car from
a domestic manufacturer. Japanese automakers in particular are preferred for their quality and
customer service. The article “Detroit's new quality gap” (Ganguli, Kumaresh, and Satpathy 2003)
addresses the Japanese reputation, noting that "Japanese automakers are particularly effective at
testing for the attributes that excite their target customers." Hence, a foreign used car is also
predicted to be more expensive. Our model predicts that buyers pay an extra $1,250.65 for a
foreign used car, holding all else constant.
References
Cho, Sung Jin. "The Determinants of Used Rental Car Prices." Journal of Economic Research 10, no. 2 (2005): 277-304.
Carr-Ruffino, Norma, and John Acheson. "The Hybrid Phenomenon." Futurist 41, no. 4 (2007): 16-22. Available
from Business Source Premier, EBSCOhost. Accessed 14 September 2008.
Ganguli, Niladri, T. V. Kumaresh, and Aurobind Satpathy. "Detroit's new quality gap." McKinsey Quarterly, no. 1
(2003): 148-151.
Emons, W., and G. Sheldon. "The Market for Used Cars: A New Test of the Lemons Model." CEPR
Discussion Paper DP 3360 (2002).