INTERNATIONAL JOURNAL OF GEOMATICS AND GEOSCIENCES
Volume 5, No 1, 2014
© Copyright by the authors - Licensee IPA - Under Creative Commons license 3.0
Research article, ISSN 0976 – 4380

Computing pit excavation volume using Multiple Regression Analysis
Ragab Khalil
Civil Engineering Department, Faculty of Engineering, Assiut University, Assiut, Egypt and Landscape Architecture Department, Faculty of Environmental Design, KAU, Saudi Arabia
khalilragab@yahoo.com

ABSTRACT
Volume estimation of borrow pits is a common application in civil engineering. Several methods for volume estimation have been presented in the literature. In general, they rely on a specific polynomial to fit the surface heights. In practice, each site's topography is unique and may not follow that polynomial. This paper presents the use of regression analysis to find the most suitable equation to fit each site. Using numerical examples, results from the proposed approach are presented and their accuracy is compared with that of existing methods.
Keywords: Estimate, Excavation, Regression, Pit, Volume.

1. Introduction
Volume estimation of borrow pits is a common application in civil engineering. Several methods have been developed for computing pit volume. Easa (1988) proposed a second-degree polynomial based on the Simpson method in each direction of an equal-interval grid. Chambers (1989) applied the equation of Easa (1988) to a grid of unequal intervals. Chen and Lin (1991) developed the cubic spline method, which provides smooth connections between the approximating third-degree polynomials. Easa (1998) developed a method based on the cubic Hermite polynomial. Yanalak (2005) studied the use of the natural neighbor gridding technique to transform scattered field data to a uniform grid and computed the volumes of rectangular prisms. Yilmaz (2010) applied photogrammetry to obtain measurements and used Surfer software to compute the volume.
Davis (1994) and Mukherji (2012) proposed using the finite element technique for computing pit excavation volume. Data points obtained from classical field survey are often irregular and scattered. Each site has its own unique topography, which may not coincide with any of the mentioned polynomials. This paper introduces a technique that treats each site as unique and finds the equation that best represents its surface using regression analysis. Regression analysis has been used to predict earthquake effects (Youd et al 2002), trip generation (Sekhar et al 1997), thermal conductivity (Sanjaya et al 2011), concrete compressive strength (Chou and Tsai 2012), overbreak in underground mining (Jang and Topal 2013) and housing demand (Ng et al 2008), to select retaining wall systems (Choi and Lee 2010), and in other engineering subjects. The aim of this paper is to compare the existing methods of volume estimation with the volume estimated using the regression technique on Example 2 of Chen and Lin (1991), which was also used by others. The results are compared with those in Mukherji (2012). The applied technique is introduced, then applications and conclusions are given.

Submitted on June 2014, published on August 2014

2. Regression analysis
Statistical models of the relationship between a dependent variable (response variable) and one or more independent variables (explanatory variables) can be developed using linear regression. The general formula for multiple regression models is:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n \qquad (1)$$

where y is the dependent variable, β0 is a constant, βi is a regression coefficient and xi is an independent variable (i = 1, 2, …, n). The regression is still linear in the coefficients even when the independent variables have been raised to a power of any order, as in equation 2.

(2)

In the case of volume estimation, the dependent variable is the excavation depth (z) and the independent variables are the point plane coordinates (x, y).
(3)

The volume can be estimated using a double integral of the excavation depth over the pit area A:

$$V = \iint_A z \, dx \, dy \qquad (4)$$

Regression is usually carried out by specialist programs such as Microcal Origin, Sigma Plot or Graphpad Prism; however, these programs tend to be expensive (Brown 2001). Popular spreadsheet programs, such as Quattro Pro, Microsoft Excel, and Lotus 1-2-3, provide comprehensive statistical packages, which include a regression tool among many others (Orlov 1996). Microsoft Excel is probably already installed on the computer as part of Microsoft Office, so no additional expense is required. Spreadsheet programs are among the most commonly used software, and most engineers have experience with them, even if only at an elementary level. Excel offers a friendly user interface, flexible data manipulation, built-in mathematical functions and instantaneous graphing of data (Brown 2001). Excel contains two functions for data analysis, REGRESSION for linear regression models and SOLVER for non-linear models. Orlov (1996) explains how to use the REGRESSION function for a linear model. Use of the SOLVER function was explained by Brown (2001). McCormick (2010) explained using the REGRESSION function for a non-linear model. El-Gebeily and Yushau (2007) explained using Excel to perform numerical integration.

3. Selecting the best equation
A regression equation can be used for several purposes. The set of variables that is best for one purpose may not be best for another. The purpose for which a regression equation is constructed should be kept in mind during the variable selection process. The purposes may be broadly summarized as prediction, description and control. Prediction means that the regression model provides the best prediction of the dependent variable.
When a regression equation is used for prediction, the variables are selected with an eye toward minimizing the mean square error of prediction (maximum R²). The goal of a description model is to quantify the relationship between one or more independent variables of interest and the dependent variable, controlling for the other variables. In situations where description is the prime goal, the smallest number of independent variables that explains the most substantial part of the variation in the dependent variable is chosen. When a regression model is used for control, the purpose of constructing the equation may be to determine the magnitude by which the value of an independent variable must be altered to obtain a specified value of the dependent variable (Karim 2004). For volume estimation, description is the suitable model. To select the effective variables, a technique called the Backward Elimination Procedure was used. The process of selecting the equation variables was as follows:
1. Determine the fitted regression equation containing all possible independent variables, considering .
2. Remove any variable whose coefficient is close to zero or whose P value is greater than 0.05 (Dallal, 2012).
3. Re-compute the regression equation for the remaining variables.
4. Repeat steps 2 and 3 to reach the smallest number of independent variables.
The intercept coefficient is included if it increases the value of R² of the equation.

4. Application
The data in Example 2 of Chen and Lin (1991), which was also used by Easa (1998), Yanalak (2005) and Mukherji (2012), were used for this application, because the purpose of this paper was to compare volumes determined by the existing methods with volumes calculated by integration of the regression equation.
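The workflow compared here, fitting a polynomial surface to surveyed depths by least squares and then integrating the fitted equation over the pit, can be sketched in a few lines of Python instead of the Excel tools used in the paper. This is only an illustration under stated assumptions: the depth function `assumed_depth`, the chosen polynomial terms, and the helper names `solve`, `fit_surface` and `volume` are hypothetical and are not the paper's surface function or its fitted equations. The sample grid uses 20 m spacing in x and unequal spacing in y (25, 10, 30, 15, 10 m).

```python
from itertools import accumulate


def solve(a, b):
    """Solve the square system a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_surface(points, terms):
    """Least-squares coefficients of z = sum(beta_k * x**i * y**j) for (i, j) in terms."""
    n = len(terms)
    ata = [[0.0] * n for _ in range(n)]  # normal-equation matrix A^T A
    atz = [0.0] * n                      # right-hand side A^T z
    for x, y, z in points:
        row = [x ** i * y ** j for i, j in terms]
        for r in range(n):
            atz[r] += row[r] * z
            for c in range(n):
                ata[r][c] += row[r] * row[c]
    return solve(ata, atz)


def volume(beta, terms, a, b):
    """Exact integral of the fitted polynomial over the rectangle [0, a] x [0, b]."""
    return sum(
        coef * a ** (i + 1) / (i + 1) * b ** (j + 1) / (j + 1)
        for coef, (i, j) in zip(beta, terms)
    )


def assumed_depth(x, y):
    """Hypothetical excavation depth in metres (an assumption, not the paper's function)."""
    return 5.0 + 0.1 * x + 0.05 * y + 0.001 * x * y


# Grid nodes: equal 20 m intervals in x, unequal intervals in y.
xs = list(accumulate([20] * 6, initial=0))
ys = list(accumulate([25, 10, 30, 15, 10], initial=0))
points = [(x, y, assumed_depth(x, y)) for x in xs for y in ys]

# Assumed model terms (powers of x and y): 1, x, y, x*y.
terms = [(0, 0), (1, 0), (0, 1), (1, 1)]
beta = fit_surface(points, terms)
vol = volume(beta, terms, a=xs[-1], b=ys[-1])
print(f"fitted volume: {vol:.1f} m^3")
```

Because the fitted model is a polynomial, the double integral has a closed form and no numerical quadrature is needed. The normal equations are used here only to keep the sketch dependency-free; for higher-degree terms, rescaling the coordinates or using a QR-based solver would be better conditioned.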
The example in Chen and Lin (1991) involved a pit whose ground surface is expressed with the function: , where and (values are in meters) with exact volume = 118800 m³. The following three cases, shown in Figure 1, were considered for constructing the data grid:
1. A 6 x 5 grid with equal intervals in the x direction (20 m) but with unequal intervals in the y direction (25, 10, 30, 15, 10).
2. A 6 x 5 grid with equal intervals in the y direction (18 m) but with unequal intervals in the x direction (15, 30, 10, 35, 10, 20).
3. A 6 x 5 grid with unequal intervals in both the x and y directions, with x intervals as in case 2 and y intervals as in case 1.
By applying the regression to cases 1, 2 and 3, the surface can be expressed as for case 1 for case 2 for case 3
Because the ground surface is expressed mathematically, the volume can be determined using integration as 120570, 134726 and 123704 m³ respectively. The rational true error of each volume was calculated as the ratio of the true error, which is the difference between the estimated and exact volumes, to the exact volume. The newly calculated volumes and rational true errors are shown in Table 1 together with those of the existing methods in Mukherji (2012) so that the results can be compared easily.

Figure 1: Grids for application example (cases 1, 2, and 3)

Considering Table 1, which summarizes the earlier investigations together with the proposed method, the following results can be outlined for the volume calculations handled in this study:
1. Volume calculation with regression analysis is better than that of the trapezoidal formula with the original data for the three cases. The regression method has rational true errors of 1.5, 13.4, and 4.1%, whereas the trapezoidal formula with the original data has values of 25.4, 19.2, and 19.2% for Cases 1–3, respectively.
2. For Case 1, volume calculation with regression analysis is the best, with a rational true error of 1.5%. The worst is Chambers (1989), with a value of 29.3%.
3. For Case 2, volume calculation with regression analysis, which has a rational true error of 13.4%, is better than the trapezoidal method with 19.2% and the method of Yanalak (2005) with 16.2%, but worse than the methods of Chambers (1989) with 3.8%, Chen and Lin (1991) with 2.7%, Easa (1998) with 3.7%, and Mukherji (2012) with 1.6%.
4. For Case 3, with a rational true error of 4.1%, volume calculation with regression analysis is better than the trapezoidal method, with 19.2%, and the methods of Chambers (1989), Yanalak (2005) and Mukherji (2012), which have values of 15.5%, 8.3% and 13.6% respectively, and worse than the methods of Chen and Lin (1991) and Easa (1998), which have values of 2.6 and 3.7%, respectively.

Table 1: Comparative Application Results (volumes in m³)

Method                        Case 1               Case 2               Case 3
                              Volume    Error %    Volume    Error %    Volume    Error %
Exact volume                  118,800   -          118,800   -          118,800   -
Trapezoidal                   149,009   25.4       141,615   19.2       141,614   19.2
Chambers (1989)               153,551   29.3       122,820   3.8        137,154   15.5
Chen and Lin (1991)           139,568   17.5       122,009   2.7        121,860   2.6
Easa (1998)                   138,280   16.4       123,207   3.7        123,199   3.7
Yanalak (2005) (best of 3)    136,117   14.6       138,085   16.2       128,632   8.3
Mukherji (2012)               137,096   15.4       120,649   1.6        134,977   13.6
Proposed regression           120,570   1.5        134,726   13.4       123,704   4.1

5. Conclusion
Volume calculation with the regression analysis method can be used for both regular and scattered data. It is a very simple and fast method for volume estimation.
It does not need specific software; a regression analysis tool is available in Microsoft Excel on almost all computers. It only requires arranging the data in an Excel spreadsheet and applying the regression analysis to get the coefficients of the equation that represents the surface. The volume can then be estimated by integrating the resulting equation. Applying the regression analysis method to the example used by the earlier investigators showed that it gave the best results for case 1, deviating from the true volume by only 1.5%. For case 3, the proposed regression analysis deviates from the true volume by only 4.1%, which is close to the best results obtained by earlier authors.

6. References
1. Brown A.M., (2001), A step-by-step guide to non-linear regression analysis of experimental data using a Microsoft Excel spreadsheet, Computer Methods and Programs in Biomedicine, 65, pp 191–200.
2. Chambers, D. W., (1989), Estimating pit excavation volume using unequal intervals, Journal of Surveying Engineering, 115(4), pp 390–401.
3. Chen, C. S. and Lin, H. C., (1991), Estimating pit excavation volume using cubic spline volume formula, Journal of Surveying Engineering, 117(2), pp 51–66.
4. Choi M. and Lee G., (2010), Decision tree for selecting retaining wall systems based on logistic regression analysis, Automation in Construction, 19, pp 917–928.
5. Chou J.-S. and Tsai C.-F., (2012), Concrete compressive strength analysis using a combined classification and regression technique, Automation in Construction, 24, pp 52–60.
6. Dallal, G. E., (2012), The Little Handbook of Statistical Practice, available at http://www.jerrydallal.com/LHSP/LHSP.HTM, accessed on 25 May 2014.
7. Easa, S. M., (1988), Estimating pit excavation volume using nonlinear ground profile, Journal of Surveying Engineering, 114(2), pp 71–83.
8. Easa, S. M., (1998), Smooth surface approximation for computing pit excavation volume, Journal of Surveying Engineering, 124(3), pp 125–133.
9. El-Gebeily M. and Yushau B., (2007), Numerical Methods with MS Excel, The Montana Mathematics Enthusiast, 4(1), pp 84–92.
10. Jang H. and Topal E., (2013), Optimizing overbreak prediction based on geological parameters comparing multiple regression analysis and artificial neural network, Tunnelling and Underground Space Technology, 38, pp 161–169.
11. Karim, M. E., (2004), Selection of the Best Regression Equation by sorting out Variables, available at http://www.angelfire.com/ab5/get5/selreg.pdf, accessed 12 May 2014.
12. McCormick J. M., (2010), Advanced Regression with Microsoft Excel, Truman State University, available at http://chemlab.truman.edu/DataAnalysis/Excel_Files /Advanced Regression.asp, accessed on May 2014.
13. Mukherji B., (2012), Estimating 3D Volume Using Finite Elements for Pit Excavation, Journal of Surveying Engineering, 138(2), pp 85–91.
14. Ng S.T., Skitmore M. and Wong K.F., (2008), Using genetic algorithms and linear regression analysis for private housing demand forecast, Building and Environment, 43, pp 1171–1184.
15. Orlov M. L., (1996), Multiple Linear Regression Analysis Using Microsoft Excel, Chemistry Department, Oregon State University, available at http://chemold.science.oregonstate.edu/courses/ch361-464/ch464/RegrssnFnl.pdf, accessed 12 May 2014.
16. Sanjaya C. S., Wee T. H. and Tamilselvan T., (2011), Regression analysis estimation of thermal conductivity using guarded-hot-plate apparatus, Applied Thermal Engineering, 31(10), pp 1566–1575.
17. Sekhar S. V. C., Anand S. and Karim M. R., (1997), Comparison Of Regression Model And Category Analysis (A Case Study), Journal of the Eastern Asia Society for Transportation Studies, 2(3), pp 917–929.
18. Yanalak, M., (2005), Computing pit excavation volume, Journal of Surveying Engineering, 131(1), pp 15–19.
19. Yilmaz, H. M., (2010), Close range photogrammetry in volume computing, Experimental Techniques, 34(1), pp 48–54.
20. Youd T. L., Hansen C. M. and Bartlett S. F., (2002), Multilinear Regression Equations for Prediction of Lateral Spread Displacement, Journal of Geotechnical and Geoenvironmental Engineering, 128(12), pp 1007–1017.