Regression Analysis in Residential Real Estate Litigation

Speakers:
Jeffrey W. Spilker, JD, CPA/ABV
Helga A. Zauner, AB, CFE, CVA, MBA

Speakers

Jeffrey W. Spilker, JD, CPA/ABV
Jeff Spilker is an Owner of Hill Schwartz Spilker Keller LLC (“HSSK”), a business valuation and litigation consulting firm in Houston, Texas. Jeff leads HSSK’s real estate consulting and construction advisory practice. Previously, Jeff was with a national accounting and consulting firm. He has also served as the CFO of an engineering/construction company and Vice President/General Manager of a construction and real estate development firm.
Jeff has provided a wide range of financial and economic consulting and financial forensics services to attorneys in matters involving intellectual property and patent infringement claims, health care and professional practices, oil & gas issues, construction disputes, professional liability claims, partnership disputes, real estate and business valuation issues, environmental issues, personal injury and employment claims, lost profits analyses, fraud investigations and lender liability claims. He has provided expert testimony in over 100 of these matters.
Jeff is a Certified Public Accountant, a licensed attorney in the Commonwealth of Virginia and a Texas State Certified General Real Estate Appraiser.

Helga A. Zauner, AB, CFE, CVA, MBA
Helga Zauner is a Director in the Litigation Consulting group of Hill Schwartz Spilker Keller LLC (“HSSK”) in Houston. She is a testifying expert and has over 20 years of experience in Financial Analysis and Statistical Modeling. She specializes in Financial Forensics, Forecasting Techniques, Time Series Analysis and Econometrics and has extensive experience in Quantitative and Data Analysis.
Helga's background includes serving as a Financial Advisor in banking and corporate credit and as a partner and Financial Manager in the construction and automobile dealership businesses.
She was an Associate Professor for four years at Universidad de Guanajuato in Mexico, where she taught courses in Finance, Econometrics and Statistics, and did applied research in financial analysis, investments, project analysis, time series, forecasting and econometrics. Helga is a Certified Fraud Examiner and a Certified Valuation Analyst. Helga has a B.S. in Mathematics, an MBA and an ABD in Statistics.

Today’s Program
I. Litigation
II. Simple Regression
III. Multiple Regression
IV. Binary Variables
V. Examples of regression used in real estate litigation

Objective
To show the use of the statistical methodology of multivariate regression analysis to:
• Perform group appraisals
• Prove or disprove an allegation in appraisal-related litigation
• Measure damages

Group Appraisals

Prove/Disprove Allegations
How much did the slope change and how does that affect my value?

The expert’s worst nightmare
The Daubert Challenge

Daubert Criteria
1. Has the theory or technique in question been tested or can it be tested?
2. Has this methodology been subjected to peer review and publication?
3. Is there a known or potential error rate?
4. Do standards controlling its operation exist, and are they maintained?
5. Has it attracted widespread acceptance within a relevant scientific community?

Simple Linear Regression Model
[Figure: Building Size v. Sales Price (scatter plot). Source: Multiple Listing Services]

Simple Linear Regression Model (cont.)
[Figure: Lot Size v. Sales Price (scatter plot with linear trend line). Source: Multiple Listing Services]

Other Possible Variables
• Property type (condo / multifamily / single family…)
• Year property was built
• Number of bedrooms / bathrooms
• New or recent construction?
• Renovated – Has property been renovated?
• Date property sold

Bedrooms / Bathrooms
[Figures: Bedrooms v. Sales Price; Bathrooms v. Sales Price (scatter plots). Source: Multiple Listing Services]

Effect of time
[Figures: Year Built v. Sales Price; Closing Date v. Sales Price (scatter plots). Source: Multiple Listing Services]

Qualitative variables
• Architect/Design
• Nice view
• Better materials
• Cleaner
• Good real estate agent for the seller?

The Model
Price = 44,912.22 + 43.85 x sqft
A 3,200 sqft house in Lakes of Savannah costs an average of $185,232

Coefficients:
             Estimate    Std. Error  t value  Pr(>|t|)
(Intercept)  44912.223   9033.256    4.972    1.85e-06  ***
SqFtBldg     43.846      3.521       12.454   < 2e-16   ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 26600 on 145 degrees of freedom
Multiple R-squared: 0.5168, Adjusted R-squared: 0.5135
F-statistic: 155.1 on 1 and 145 DF, p-value: < 2.2e-16

Price = 44,912.22 + 43.85 x sqft
[Figure: Building Size v. Sales Price with fitted line; the slope and the intercept are marked on the chart]

Remember Daubert Criteria?
1. Has the theory or technique in question been tested or can it be tested?
2. Has this methodology been subjected to peer review and publication?
3. Is there a known or potential error rate?
4. Do standards controlling its operation exist, and are they maintained?
5. Has it attracted widespread acceptance within a relevant scientific community?

Multiple Regression Analysis
• One variable explains only a certain percentage of the variability in price
• Can we combine the effects of several variables?
• Can we separate the effect each variable has on the price, isolating it from the effects of other variables?
• Can we measure how much of the price variability our model has captured?
• Can we determine how “trustworthy” our model is?

The Answer
Statistical methods allow us to
• choose the variables that create the better model
• measure the effect each variable has on the prices
• determine how much of the price variability is captured by our model
• measure the “trustworthiness” of the model
  – Perfect compliance with Daubert!
  – Difficult to explain
• All this if our data and model comply with the assumptions of the Gauss-Markov theorem…

The Model
Price = -7,691,000 + 1.736 x (lot size) + 39.47 x sqft + 3,845 x (year built) + 57.60 x (days after 01/01/2012)

Coefficients:
             Estimate     Std. Error  t value  Pr(>|t|)
(Intercept)  -7,691,000   1.470e+06   -5.233   6.16e-07  ***
LotSize      1.736        1.046e+00   1.659    0.0994    .
SqFtBldg     39.47        3.712e+00   10.635   < 2e-16   ***
YearBuilt    3,845        7.342e+02   5.237    6.06e-07  ***
Days         57.60        7.098e+00   8.114    2.58e-13  ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 20440 on 136 degrees of freedom
(6 observations deleted due to missingness)
Multiple R-squared: 0.7278, Adjusted R-squared: 0.7198
F-statistic: 90.91 on 4 and 136 DF, p-value: < 2.2e-16

Forecast
Price = -7,691,000 + 1.736 x (lot size) + 39.47 x sqft + 3,845 x (year built) + 57.60 x (days after 01/01/2012)
Price = -7,691,000 + 1.736 x 5,000 sqft + 39.47 x 3,200 sqft + 3,845 x 2011 + 57.60 x 365 = $197,303
A house in Lakes of Savannah, on a 5,000 sqft lot, with 3,200 sqft of construction, built in 2011 and purchased 365 days after January 1, 2012, would have cost an average of $197,303

Quantitative v.
Appraiser
• The expertise of the appraiser can’t be replaced by a purely quantitative model
• A model will provide the average price for a property with certain measurable characteristics
• Regression models are particularly useful for group appraisals and for pricing properties that are equivalent except for one or two variables, hence their usefulness in litigation

Binary (Dummy) Variables
• Used for a yes/no characteristic:
  – Garage/no garage
  – Pool/no pool
  – Single glass windows/double glass windows
  – Renovated/not renovated
  – In “special” area/not in “special” area
  – Before/after certain event
• Variable set to “1” if Yes or “0” if No

Example
• Use of binary variables for litigation:
  – Beach Condo High Rise
  – Damage model based on the assumption that prices changed after January 1, 2010
  – Regression model used to determine whether prices were affected by this event of 2010.

Example (cont.) – Condo Prices
[Figure: Sale Price v. sale date, 01/03/05 to 01/03/11]

Example (cont.) – Condo Prices
[Figure: Average Price per Sq Ft per Floor, floors 1 to 31]

Example (cont.) – Condo Prices
[Figure: Floor v. sale date, 01/14/04 to 04/01/12]

Example (cont.) – Condo Prices
Relevant factors for condo prices were Size in Square Feet and Floor:
Price = -207,420.87 + 429.32 x sqft + 5,069.41 x floor

Coefficients:
             Estimate     Std. Error  t value  Pr(>|t|)
(Intercept)  -207420.87   27479.71    -7.548   2.84e-12  ***
Size         429.32       15.46       27.775   < 2e-16   ***
Floor        5069.41      675.56      7.504    3.65e-12  ***
---
Signif.
codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 75230 on 165 degrees of freedom
Multiple R-squared: 0.8405, Adjusted R-squared: 0.8386
F-statistic: 434.9 on 2 and 165 DF, p-value: < 2.2e-16

Example (cont.) – Was there a change in 2010?
Price = -184,106.69 + 422.84 x sqft + 5,315.44 x floor – 59,197.88 x af2010

Coefficients:
             Estimate     Std. Error  t value  Pr(>|t|)
(Intercept)  -184106.69   26319.65    -6.995   6.40e-11  ***
Size         422.84       14.61       28.948   < 2e-16   ***
Floor        5315.44      637.73      8.335    2.97e-14  ***
af2010       -59197.88    12508.94    -4.732   4.76e-06  ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 70780 on 164 degrees of freedom
Multiple R-squared: 0.8597, Adjusted R-squared: 0.8571
F-statistic: 335 on 3 and 164 DF, p-value: < 2.2e-16

Prices went down by an average of $59,198!

Notice that
• By adding the binary variable, the intercept and the coefficients for floor and square feet changed.
• Previous model was biased!
• Is the theory wrong?
• No… the model just didn’t fulfill the assumptions.
• The Gauss-Markov theorem states that, for regression models that fulfill the classical linear regression model assumptions, ordinary least squares provides the best linear unbiased estimators. The relevant assumption here is that the error term is uncorrelated with the regressors.
• Omitting a relevant variable that is correlated with the included regressors violates this assumption. The violation causes the OLS estimator to be biased and inconsistent.

Regression as an aid to a “traditional” appraiser
• The coefficients of the regression indicate the change in the price as a result of a unit change in the variable, with the other variables held constant.
• This can be used to give a magnitude to the adjustments used in traditional appraisal techniques.

Notes of Caution
• Regardless of the true relationship between variables, you can ALWAYS run a regression between them.
• Correlation does not imply causation.
• The validity of the results of a regression model depends on the assumptions being true:
  – Normality
  – No auto-correlation
  – No multicollinearity
Must perform tests to validate the significance of the model

Bunnies cause peace!

Examples
• Regression analysis used in litigation in:
  – Age discrimination
  – Antitrust
  – Appraisal of shares
  – Breach of contract
  – Copyright infringement
  – Gender/racial discrimination
  – Patent infringement
  – Securities Fraud
  – Real property tax assessment/valuation

Environmental/Property Damage
• Ponca Tribe of Indians of Oklahoma v. Continental Carbon Co.
  – Defendant’s expert reported results of neighborhood comparison analysis and multivariate regression analysis (test of property value diminution in test areas compared to control areas).
  – Plaintiffs complained that defendant’s expert selectively used data to reach conclusions that he would not have reached had he used all the data. Expert did not do an analysis of the percentage of outliers.
Source: Litigation Services Handbook, 5th edition

Property Tax Valuation
• Department of Revenue v. Grant Western Lumber Co.
  – The court believes that it is an error to make a conclusion about the slope coefficient for the price of the subject mill because only one variable was taken into account. Other important factors such as age, condition, location, etc. were not included in the analysis.
Source: Litigation Services Handbook, 5th edition

ASHBY HIGH-RISE

The Question
• Has the imminent threat of the construction of the high rise already affected market prices?
• What area has been affected by the threat of the high rise?
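One way to frame these questions is with a binary variable of the kind introduced earlier: flag each sale as inside or outside a candidate affected area and read the coefficient on that flag. The sketch below is only an illustration under stated assumptions. All data are synthetic, the `ols` helper, the constant `TRUE_EFFECT` and every number in it are hypothetical, and it is not the presenters' data or software. It also shows the single-variable pitfall noted in the Grant Western Lumber example: when lot size is left out, the inside/outside coefficient absorbs part of the lot-size effect.

```python
import random

def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
TRUE_EFFECT = -40.0     # assumed $/sq ft effect of being inside the area

sales = []
for _ in range(200):
    inside = 1 if rng.random() < 0.5 else 0
    # Assumption: lots inside the area are smaller on average
    # (lot size in thousands of square feet).
    lot_k = (5.0 if inside else 8.0) + rng.uniform(-3.0, 3.0)
    prsf = 250.0 + 10.0 * lot_k + TRUE_EFFECT * inside + rng.gauss(0.0, 10.0)
    sales.append((lot_k, inside, prsf))

y = [s[2] for s in sales]
# Full model: intercept, lot size, and the inside/outside indicator.
full = ols([[1.0, s[0], float(s[1])] for s in sales], y)
# Naive model: intercept and indicator only, lot size omitted.
naive = ols([[1.0, float(s[1])] for s in sales], y)

print(f"indicator coefficient, lot size included: {full[2]:.1f}")
print(f"indicator coefficient, lot size omitted:  {naive[1]:.1f}")
```

In this synthetic setup the full model should recover a coefficient near the assumed effect, while the naive model overstates the price drop because smaller lots and the affected area go together.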
Ashby area is “different”
• Traditional Houston neighborhood
• Renovated historical homes more valuable than new homes
• Size of home not as important as lot size
• Many “special” homes
• Typical variables are not the most important to determine prices
Use price per square foot instead of price

Ashby – Historical sales data
[Figure: Ashby Area Average Price per Square Foot by quarter, 1Q2006 to 2Q2013, for sales inside and outside the area]

Ashby – Perhaps “smoothing” the data
[Figure: Ashby Area Average Price per Square Foot by quarter, 1Q2006 to 2Q2013, inside and outside the area, with polynomial trend lines]

Ashby – “Larger lot size = Higher price”
[Figure: Lot Size v. PRSF with linear trend line]

Ashby – Newer houses have lower prices?
[Figure: Year Built v. PRSF with linear trend line]

Variables
• Dependent Variable:
  – PRSF – Price Per Square Foot
• Independent Variables:
  – Days – Number of Days after March 1, 2012
  – Townhome – Is property a Townhome (Yes = 1, No = 0)
  – Multifamily – Is Property a Multifamily Home (Yes = 1, No = 0)
  – Bldg – Square Feet of Construction
  – HR – Is Property In High Rise Affected Area? (Yes = 1, No = 0)
  – Year – Year Property was built
  – Lot – Lot Size in Square Feet
  – Historic – Is property located in the historic district? (Yes = 1, No = 0)
  – Renovated – Has Property been renovated?
(Yes = 1, No = 0)

The Model and Result

Coefficients:
             Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)  -1,695     5.351e+02   -3.168   0.001901  **
Days         .09405     3.146e-02   2.989    0.003326  **
Townhome     -47.35     1.725e+01   -2.744   0.006888  **
Multifamily  -80.87     2.347e+01   -3.445   0.000760  ***
Bldg         -.022170   6.211e-03   -3.494   0.000644  ***
Year         .9890      2.748e-01   3.599    0.000448  ***
Lot          .01327     1.982e-03   6.698    5.21e-10  ***
Historic     53.33      1.851e+01   2.881    0.004616  **
Renovated    31.42      1.349e+01   2.329    0.021343  *
HR           -46.30     1.439e+01   -3.219   0.001613  **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 60.8 on 135 degrees of freedom
Multiple R-squared: 0.5492, Adjusted R-squared: 0.5192
F-statistic: 18.28 on 9 and 135 DF, p-value: < 2.2e-16

Conclusions
• Regression analysis complies with Daubert criteria.
• However…
  – Easy to “lie” with statistics – careful with the interpretation of the model!
  – Correlation does not imply causation
  – Must perform reliability tests on model
  – Can’t necessarily extrapolate
  – Courts can be skeptical

Thank you! Questions?
Helga Zauner hzauner@hssk.com
Jeff Spilker jspilker@hssk.com
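
The forecasts quoted in the Lakes of Savannah and condo slides can be reproduced from the printed coefficients. The sketch below is a minimal illustration, not the presenters' code: the `predict` helper is a hypothetical name, and the 1,500 sq ft, 10th-floor condo is an assumed example unit that does not appear in the presentation. The coefficients themselves are copied from the slides.

```python
def predict(coefs, values):
    """Intercept plus the sum of coefficient x value for each variable."""
    return coefs["(Intercept)"] + sum(coefs[name] * v for name, v in values.items())

# Coefficients as printed on the slides.
simple = {"(Intercept)": 44_912.22, "SqFtBldg": 43.85}
multiple = {"(Intercept)": -7_691_000, "LotSize": 1.736, "SqFtBldg": 39.47,
            "YearBuilt": 3_845, "Days": 57.60}
condo = {"(Intercept)": -184_106.69, "Size": 422.84, "Floor": 5_315.44,
         "af2010": -59_197.88}

# Simple model: 3,200 sq ft house.
simple_price = predict(simple, {"SqFtBldg": 3_200})

# Multiple model: 5,000 sq ft lot, 3,200 sq ft house, built 2011, Days = 365.
multiple_price = predict(multiple, {"LotSize": 5_000, "SqFtBldg": 3_200,
                                    "YearBuilt": 2011, "Days": 365})

# Hypothetical condo (1,500 sq ft, 10th floor) priced before and after 2010.
before = predict(condo, {"Size": 1_500, "Floor": 10, "af2010": 0})
after = predict(condo, {"Size": 1_500, "Floor": 10, "af2010": 1})

print(round(simple_price))     # 185232
print(round(multiple_price))   # 197303
print(round(after - before))   # -59198
```

Because the model is linear, the before/after difference equals the af2010 coefficient whatever size and floor are chosen; the same reading applies to the HR coefficient in the Ashby model, which is stated in dollars per square foot.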