WHAT FEATURES DETERMINE THE PRICE OF A USED CAR?
STATISTICS PROJECT
ECON 306 (001), FALL 2013
ELIZABETH SWAIN, JORDAN HOPF, NICHOL BHANDARI, CHAU TRAN

Descriptive Statistics

The team gathered data for this paper from used car websites such as www.carmax.com and www.cars.com. All team members gathered data for used sedans. Each team member was assigned a unique color to search for, which eliminated the risk of duplicate data. Parameters were set for the age and location of the cars. The maximum age of a car used in this model is 9 years. To keep the cars within the state of Maryland, we set a 50-mile radius from zip code 21090 (Linthicum Heights, MD).

Our model is as follows:

Price = b0 + b1(mileage) + b2(age) + b3(sunroof) + b4(seats) + b5(hybrid) + b6(trans) + b7(pwr) + b8(doors) + b9(domfrn)

The dependent variable in our model is the selling price of a used car. Our independent variables are age, mileage, number of doors (2 or 4), type of seats (leather or cloth), transmission (automatic or manual), sunroof, power locks and windows, whether the car is a hybrid, and whether the car is domestic or foreign.

Mileage

The independent variable mileage is the total number of miles the vehicle had been driven before being put back up for resale. The mean (average) mileage for a used car in our sample was 38,638 miles, with a standard deviation of 25,760.76 miles. The median mileage was 35,000 miles. The minimum and maximum mileage were 1,993 and 119,000 miles, respectively. The interquartile range is 32,505 miles. The 25th and 75th percentiles are 17,496 and 50,000 miles, respectively. The outlier bounds, which correspond to three standard deviations on either side of the mean, are 115,920 miles (upper) and -38,644.77 miles (lower). Our data set contained three outliers: 116,000, 118,000, and 119,000 miles. The Pearson’s Coefficient of Skewness (PCS) is 0.42361. A positive value means the distribution is right-skewed, which is evident in the histogram above.

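The outlier bounds and skewness coefficient reported for mileage can be reproduced from the summary statistics alone. The sketch below assumes that the bounds are the mean plus or minus three standard deviations and that PCS is Pearson's second coefficient, 3 × (mean – median) / standard deviation. Neither formula is stated in the text, but both match the reported figures up to rounding of the mean. The function names are illustrative only.

```python
def three_sigma_bounds(mean, sd):
    """Return (lower, upper) bounds at three standard deviations from the mean."""
    return mean - 3 * sd, mean + 3 * sd


def pearson_skewness(mean, median, sd):
    """Pearson's second coefficient of skewness: 3 * (mean - median) / sd."""
    return 3 * (mean - median) / sd


# Summary statistics for mileage as reported in the text
# (the mean is rounded to the nearest mile there).
MILEAGE_MEAN = 38_638
MILEAGE_MEDIAN = 35_000
MILEAGE_SD = 25_760.76

lower, upper = three_sigma_bounds(MILEAGE_MEAN, MILEAGE_SD)
pcs = pearson_skewness(MILEAGE_MEAN, MILEAGE_MEDIAN, MILEAGE_SD)

print(f"lower bound: {lower:,.2f} miles")
print(f"upper bound: {upper:,.2f} miles")
print(f"PCS: {pcs:.4f}")

# Any observation outside the bounds is flagged as an outlier.
reported_outliers = [116_000, 118_000, 119_000]
print(all(value > upper for value in reported_outliers))
```

Because the bounds depend only on the mean and standard deviation, the same two functions apply to any of the continuous variables in the data set.
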
Age

Age is the time elapsed, in years, between the year the car was manufactured and today (the date of sale). We used the formula Age = 2014 – year of sedan. For example, for a 2008 used sedan, Age = 2014 – 2008 = 6. Therefore, a sedan manufactured in 2008 is six years old. The mean age of a used vehicle is 3.33 years. The standard deviation is 1.825 years. The median age of a used sedan is 3 years. The minimum and maximum ages are 1 and 9 years, respectively. The interquartile range is 1 year. The upper bound is 8.805 years and the lower bound is -2.145 years. We have only one outlier, a 9-year-old used sedan. The Pearson’s Coefficient of Skewness (PCS) is 0.54246. A positive value means the distribution is right-skewed, which is evident in the histogram above.

Sunroof

If a sample car had a sunroof or moonroof, we gave it a value of 1; cars without this feature were given a 0. When choosing variables that would affect the price of a car, we assumed that a sunroof would increase the price. We believe the price would increase because of the cost of the extra labor and parts, as well as the convenience and luxury of having one. Sunroofs are typically found on higher-end models. The mean for SUNROOF is 0.42, meaning that 42% of the vehicles in our sample had a sunroof.

Seats

For seats, we looked at whether a car in our sample had cloth or leather seats. We gave a value of 1 if the car had leather seats and a value of 0 if it did not. We believe leather seats would increase the price of a vehicle because leather is generally not a standard feature and the material costs more than cloth. The mean for SEATS is 0.42, meaning that 42% of the cars in our data set have leather seats. It was not a surprise that both SUNROOF and SEATS had a mean of 42%, because the two features are complements and are usually offered in the same upgrade package.

Hybrid

If a selected car was a hybrid, we gave it a value of 1; if it was not a hybrid, we gave it a value of 0. When choosing variables that would affect the price, we assumed that a hybrid would have a higher sale price because of the expensive parts and technology inside the car. The mean for HYBRID was 0.14, meaning that 14% of the vehicles in our sample were hybrids.

Transmission

If a given car in our sample had an automatic transmission, a value of 1 was assigned. If a car had a manual transmission, we assigned a value of 0. When choosing this variable, we assumed that an automatic transmission would increase the price. The mean for TRANS was 0.95, meaning that 95% of the vehicles in our data set had an automatic transmission.

Power

If a given car in our sample had power windows and locks, we assigned a value of 1. Since all cars in our sample have both power locks and windows, PWR is a constant and has therefore been omitted.

Doors

If a car in our sample was a four-door sedan, we assigned a value of 1; if the car was a two-door coupe, we assigned a value of 0. We believe that the price of a two-door car would be higher than that of a four-door car because many two-door cars are considered sports cars. The mean for DOORS is 0.94, which means that 94% of the cars in our data set have four doors.

Domestic or Foreign

If a vehicle chosen for the data set was foreign-made, it was given a value of 1. If the car was an “American”-made car (domestic), a value of 0 was given. We expect the price of a used car to change depending on whether the car is domestic or foreign. Foreign cars tend to have a higher resale value than their domestic counterparts. The mean for DOMFRN is 0.71. This means that 71% of the cars in the sample are foreign.

IV. Two Sample Test Results

Hypothesis Testing

Step 1: Set up the null (H0) and alternative (H1) hypotheses.
H0: µ (foreign used car resale value) ≥ µ (domestic used car resale value)
H1: µ (foreign used car resale value) < µ (domestic used car resale value)

In this case, we wish to show that foreign used cars do not have a higher resale value than domestic used cars.

Step 2: Find the p-value and interpret it in context.

The p-value is the probability of obtaining a sample result at least as extreme as ours if the null hypothesis is true.

One-tailed hypothesis: Since the Levene’s sig > 0.05, we use the equal-variances-assumed results.

P-value = sig (2-tailed) / 2 = 0.542 / 2 = 0.271

[If H0 is true, there is a 27.1% chance of getting a difference in resale value between foreign and domestic used cars of $486.526 or more.]

Step 3: Make a decision.

Level of significance: 0.05

If the p-value < the level of significance, we reject H0; otherwise, we do not reject H0. Since 0.271 > 0.05, we do not reject the null hypothesis (H0). We do not have enough evidence to suggest that domestic used cars have a higher resale value than foreign used cars.

Part 5 – Linear Regression

1. State “a priori” expectations
· Positive – sunroof, seats, hybrid, power doors/windows, domestic/foreign
· Negative – mileage, age
· Unknown – transmission, number of doors

2. Run regression and get a regression equation
· We select Model 2 because it has the highest Adjusted R Squared (0.683) and the lowest Standard Error of the Estimate (2223.809).
· In this model, the number of doors and power locks/windows variables were removed from the regression.

Price = 22,803.961 – 0.05(Mileage) – 1,121.89(Age) + 994.55(Sunroof) + 980.85(Seats) + 2,208.58(Hybrid) + 1,280.69(Transmission) + 1,250.65(DomFrn)

3. Determine the statistical significance of each individual independent variable.
· 1% – mileage, age, hybrid, DomFrn
· 5% – none
· 10% – sunroof, seats
· Insignificant – transmission

Mileage: For each additional mile a car has been driven, its selling price will decrease by $0.05, holding all else constant.
Age: For each additional year of age, a car's selling price will decrease by $1,121.89, holding all else constant.
Hybrid: If a vehicle is a hybrid, its selling price will increase by $2,208.58, holding all else constant.
DomFrn: If a vehicle is foreign, its selling price will increase by $1,250.65, holding all else constant.
Sunroof: If a vehicle has a sunroof, its selling price will increase by $994.55, holding all else constant.
Seats: If a vehicle has leather seats, its selling price will increase by $980.85, holding all else constant.

4. Adjusted R Squared
· Adjusted R Squared = 0.683
· In this regression, 68.3% of the variation in the selling price of a used vehicle can be explained by the variation in the set of independent variables, adjusting for the number of independent variables. The remaining 31.7% of the variation in selling price is left unexplained.
· This regression is not bad, but it could be improved. We could add additional variables, such as fuel economy, drive train, mechanical condition, paint color, exterior condition, interior condition, and upkeep of the vehicle.

5. Standard Error of the Estimate
· Standard error of the estimate = $2,223.81
· The average difference between the actual and predicted values of the selling price of a used car is $2,223.81.

6. Model Significance (F-Test)
· ANOVA Sig = 0.000
· H0: B1 = B2 = B3 = … = 0
· H1: at least one coefficient does not equal zero
· Our model is significant at the 1% level; thus we can conclude that at least one of the coefficients does not equal zero.

Literature Review:

The used car industry is expected to become more profitable. Demand for used cars keeps increasing because of their affordable prices. The price of a used car is determined by many factors. This study aims to identify the features that mainly determine the price of a used car by using a multiple regression analysis.

Few studies have been conducted on used cars, because consumers look mainly for cheap cars of acceptable quality. However, we found some similar studies that relate to our topic. The main idea of our study is that when an individual goes to the used car market to buy a car, he or she makes the decision based on characteristics of the car such as age, mileage, number of doors, type of seats, automatic or manual transmission, sunroof, power locks and windows, whether the car is a hybrid, and whether it is domestic or foreign. Our analysis focuses on the determinants of the value of a used car by interpreting the contributions of the factors that affect its price. We use a quantitative method: we run a regression to estimate an equation. The expected equation that explains the variation in prices of a used car looks like this:

Price = b0 + b1(mileage) + b2(age) + b3(sunroof) + b4(seats) + b5(hybrid) + b6(trans) + b7(pwr) + b8(doors) + b9(domfrn)

Because the power doors/windows factor was removed from the regression equation, and Transmission is statistically insignificant, our report arrives at six statistically significant variables that determine the value of a used car and thus affect its price, as discussed below.

The older a car is and the more miles it has been driven, the more it depreciates. Therefore, it is normally cheaper and less valuable than a newer car. In the study “The market for used cars: A new test of the lemons model”, Emons W. and Sheldon G. conclude, using two separate samples, that “Less than 10 percent of the used cars purchased were re-sold within the first year of ownership”. Another study, "The Determinants of Used Rental Car Prices", reports, predictably, that both car age and mileage contribute negatively to resale value. That report supports our results: for each additional mile and each additional year of age, a car's selling price will decrease by $0.05 and $1,121.89, respectively, holding all else constant.

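The effect of one more year or of additional miles can be checked by plugging values into the fitted Model 2 equation from the regression section. The sketch below is only an illustration: the `predict_price` helper and the example car are hypothetical, the coefficients are the rounded values reported in the paper, and DomFrn is treated as 1 for a foreign car, following the coefficient interpretation given there.

```python
# Coefficients of the fitted equation reported in the regression section (Model 2).
# The mileage coefficient is shown as rounded in the paper.
INTERCEPT = 22_803.961
COEFFICIENTS = {
    "mileage": -0.05,
    "age": -1_121.89,
    "sunroof": 994.55,
    "seats": 980.85,
    "hybrid": 2_208.58,
    "transmission": 1_280.69,
    "domfrn": 1_250.65,  # assumed coding: 1 = foreign, 0 = domestic
}


def predict_price(car):
    """Predicted selling price in dollars for a dict of feature values."""
    return INTERCEPT + sum(COEFFICIENTS[name] * car[name] for name in COEFFICIENTS)


# Hypothetical car, not taken from the sample: a 3-year-old foreign sedan
# with an automatic transmission and 35,000 miles, no other options.
base_car = {
    "mileage": 35_000, "age": 3, "sunroof": 0, "seats": 0,
    "hybrid": 0, "transmission": 1, "domfrn": 1,
}
older_car = {**base_car, "age": base_car["age"] + 1}
higher_mileage_car = {**base_car, "mileage": base_car["mileage"] + 10_000}

base_price = predict_price(base_car)
print(f"predicted price: {base_price:,.2f}")
print(f"one year older: {predict_price(older_car) - base_price:,.2f}")
print(f"10,000 more miles: {predict_price(higher_mileage_car) - base_price:,.2f}")
```

Changing one feature while copying the rest of the dictionary mirrors the "holding all else constant" condition in the coefficient interpretations.
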
Sunroofs and leather seats are the features that attract people the most. Young people tend to buy cars with sunroofs, leather, and power doors. Our analysis points to the significance of interior features: cars with leather seats and a sunroof retain higher resale values than cars without those features.

Carr-Ruffino and Acheson (2007) describe the “green” case for hybrids: “Hybrids have saved more than an estimated one million barrels of crude oil, three million pounds of smog-forming gases, one million metric tons of carbon dioxide and an estimated 125 million gallons of gasoline”. Moreover, rising gasoline prices keep steering consumers toward the benefits of hybrid cars. Therefore, used hybrid cars are not only economical but also good for the environment.

If a car is from a foreign manufacturer, its resale price will be higher than that of a car from a domestic manufacturer. Japanese automakers are preferred the most for their quality and customer service. The article “Detroit's new quality gap” (Ganguli, Kumaresh, and Satpathy 2003) addresses this reputation: "Japanese automakers are particularly effective at testing for the attributes that excite their target customers." Hence, a foreign used car is also predicted to be more expensive. Buyers need to pay an extra $1,250.65 if they want a foreign used car.

References

Cho, Sung Jin. "The Determinants of Used Rental Car Prices." Journal of Economic Research 10, no. 2 (2005): 277-304.
Carr-Ruffino, Norma, and John Acheson. "The Hybrid Phenomenon." Futurist 41, no. 4 (2007): 16-22. Available from Business Source Premier, EBSCOhost. Accessed 14 September 2008.
Ganguli, Niladri, T. V. Kumaresh, and Aurobind Satpathy. "Detroit's New Quality Gap." McKinsey Quarterly, no. 1 (2003): 148-151.
Emons, W., and G. Sheldon. "The Market for Used Cars: A New Test of the Lemons Model." CEPR Discussion Paper DP 3360, 2002.