Estimation of main carcass components by using bootstrapping regression method

Bootstrap resampling methods have emerged as powerful tools for constructing inferential procedures in modern statistical data analysis. This article proposes a practical algorithm for building a regression model by bootstrap resampling and gives parameter estimates of the models used for estimating the main carcass components of Awassi lambs. Special attention is given to the estimation of regression parameters, their standard errors and confidence intervals by the bootstrap regression method, and to comparing the results with ordinary least squares estimates. The bootstrap regression method generally gave smaller standard errors and narrower confidence intervals than ordinary least squares regression. Consequently, the bootstrap models MC (muscle in carcass) = 214.198 + 3.808 MLL (muscle in long leg) + 4.866 MN (muscle in neck), BC (bone in carcass) = 605.904 + 3.641 BLL (bone in long leg) + 3.634 BN (bone in neck) and FC (fat in carcass) = -6283 + 716.8 CW (carcass weight) are more suitable than the corresponding ordinary least squares models for estimating the amounts of muscle, bone and fat in the carcass of fat-tail Awassi lambs.


INTRODUCTION
Partial dissection or sample joint dissection may be a good and simple tool to determine carcass composition, but it is practically inapplicable to the commercial classification of carcasses. It is, however, very useful for researchers in terms of precision and ease of application (Fisher, 1990). The precision of the method depends on the predictors, the animal species and the breed. It is, therefore, very important to determine the best predictive components of the carcass for different animals. In the literature there are many attempts to predict the carcass composition of western breeds (Kempster et al., 1982) by partial dissection methods, but not enough for fat-tail Awassi lambs.
Linear regression is one of the tools most often used by researchers to fit a model for estimation. They are interested in finding estimates of the bias and variance of the estimator $\hat{\beta}$ of $\beta$. They are also interested in constructing confidence intervals for $\beta$ and prediction intervals for a future observation with explanatory variables $x_j$. Some major modelling assumptions are very important for the regression model: (i) the error term has constant variance, (ii) the errors are uncorrelated, and (iii) the errors are normally distributed. Assumption (iii) in particular is required for hypothesis testing and interval estimation. The validity of these assumptions should always be considered doubtful, and analyses should be conducted to examine the adequacy of the tentatively entertained model. Gross violations of the assumptions may yield an unstable model in the sense that a different sample could lead to a totally different model with opposite conclusions (Montgomery and Peck, 1992). There are several methods useful for diagnosing and treating violations of the regression assumptions. Robust estimation strategies and residual diagnostics have improved the usefulness of these techniques. However, these methods may not be enough to satisfy the assumptions. In these cases, the bootstrap adds another dimension to the subject.
In this study, a bootstrap algorithm for regression analysis was set up and the parameters of the models to be used for estimating the main carcass components (muscle, bone and fat) were estimated. The results were compared with ordinary least squares regression.

Material
Sixty fat-tail Awassi male lamb carcasses obtained from different feeding experiments in the Animal Science Department, Faculty of Agriculture, University of Çukurova (Turkey) were used. The lambs were fed ad libitum with total mixed rations (90% concentrate and 10% lucerne straw with 1-2 chop length) containing 2.25-2.50 Mcal ME/kg and 140-180 g CP/kg. The fattening period was 56 days and the initial liveweights of the lambs varied from 22 to 26 kg in the studies. The carcass dissection was performed according to Colomer-Rocher et al. (1987). According to this method, the amounts of carcass tissues were calculated from six sections of the carcass (long leg, shoulder, neck, flank, the first five ribs and the remaining ribs) by summing the same carcass components from the left side of the carcass.
Daily gain, the weights of kidney and channel fat, omental fat, fat tail, and bone, muscle, intermuscular fat, subcutaneous fat and total fat in each joint, together with eye muscle measurements (width of muscle, depth of muscle, depth of subcutaneous fat over the muscle and depth of subcutaneous fat over the ventral edge of muscle serratus dorsalis), were used as variables to choose the best predictors of carcass bone, muscle and fat.

Methods
The usual linear regression model is
$$Y = X\beta + \varepsilon \qquad (1)$$
where $Y = (y_1, y_2, \ldots, y_n)'$ denotes the $n \times 1$ vector of the response, $X = (x_1, x_2, \ldots, x_n)'$ is the $n \times k$ matrix of regressors, the $k \times 1$ vector $x_i$ denotes the regressors for the $i$th observation, $k$ is the number of independent variables, and $\varepsilon$ is an $n \times 1$ vector of uncorrelated error terms having mean 0 and variance $\sigma^2$ (Cook, 1977; John, 1980, 1981). The $p \times 1$ vector $\beta$ holds the unknown parameters, for which the ordinary least squares (OLS) estimator is
$$\hat{\beta} = (X'X)^{-1}X'Y \qquad (2)$$
where $p$ is the number of parameters. It follows that $\mathrm{Var}(\hat{\beta}) = \sigma^2 (X'X)^{-1}$. Because $\sigma^2$ is not usually known, $\mathrm{Var}(\hat{\beta})$ is estimated by
$$\widehat{\mathrm{Var}}(\hat{\beta}) = s^2 (X'X)^{-1} \qquad (3)$$
where $s^2 = \sum_{i=1}^{n} e_i^2 / (n - p)$ is the unbiased variance estimator provided by the residuals $e_i = y_i - x_i'\hat{\beta}$, $i = 1, 2, \ldots, n$ (Atkinson, 1981; Catterjee and Hadi, 1986).
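Equations 1 to 3 can be traced numerically with a short sketch. The code below is only an illustration in plain Python, not the software used in the study. It assumes the design matrix carries a leading column of ones for the intercept (so it has p columns), and the helper names (`transpose`, `matmul`, `inverse`, `ols`) and the data are made up for the example.

```python
import math

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def inverse(m):
    """Gauss-Jordan inverse with partial pivoting (adequate for a small X'X)."""
    n = len(m)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def ols(X, y):
    """Return (beta_hat, s2, standard errors) following Equations 2 and 3."""
    n, p = len(X), len(X[0])
    Xt = transpose(X)
    xtx_inv = inverse(matmul(Xt, X))
    xty = matmul(Xt, [[v] for v in y])
    beta = [row[0] for row in matmul(xtx_inv, xty)]           # Equation 2
    resid = [yi - sum(b * x for b, x in zip(beta, xi)) for xi, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - p)                  # unbiased variance estimate
    se = [math.sqrt(s2 * xtx_inv[j][j]) for j in range(p)]    # diagonal of Equation 3
    return beta, s2, se

# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2 (no noise).
X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]]
y = [1, 3, 4, 6, 8]
beta, s2, se = ols(X, y)
print(beta, s2, se)
```

Because the example data contain no noise, the estimates reproduce the generating coefficients and the residual variance is essentially zero; with real data the standard errors come from the diagonal of $s^2 (X'X)^{-1}$.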
Bootstrapping is a broadly applicable, nonparametric approach to statistical inference that substitutes intensive computation for more traditional distributional assumptions and asymptotic results. The bootstrap draws a large number of subsamples from the sample to obtain the sampling distribution of an estimator, and uses that distribution to obtain better estimates of the population parameters (Mooney and Duval, 1993). The method rests on the similarity between the sample and the population. While ordinary sampling techniques make assumptions about the form of the estimator's distribution, the bootstrap resampling method does not need these assumptions because it treats the sample data as the population. The central analogy exploited by the bootstrap is that the population is to the sample as the sample is to the bootstrap samples. Consequently,
• the bootstrap observations are analogous to the original observations
• the bootstrap mean is analogous to the mean of the original sample
• the mean of the original sample is analogous to the unknown population mean
• the distribution of the bootstrap sample means is analogous to the unknown sampling distribution of means for samples of size n drawn from the original population.
The bootstrap can be used to derive accurate standard errors, confidence intervals and hypothesis tests for most statistics. Bootstrap resampling techniques can also be used to obtain regression parameter estimates, their standard errors and confidence intervals, and they usually give better estimates than classical methods without requiring the above assumptions.
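The analogy between the sample and the bootstrap samples can be illustrated with the simplest statistic, the mean. The following sketch uses made-up data and hypothetical names (`bootstrap_means`, `boot_se`); it compares the spread of the bootstrap means with the textbook standard error of the mean.

```python
import random
import statistics

def bootstrap_means(sample, B, rng):
    """Means of B bootstrap samples drawn with replacement from `sample`."""
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(B)]

rng = random.Random(0)
sample = [rng.gauss(50, 10) for _ in range(60)]   # made-up data, not the lamb data
reps = bootstrap_means(sample, B=1000, rng=rng)

boot_se = statistics.stdev(reps)                              # spread of the bootstrap means
classical_se = statistics.stdev(sample) / len(sample) ** 0.5  # s / sqrt(n)
print(boot_se, classical_se)
```

For the mean the two numbers should be close, which is the sense in which the distribution of the bootstrap means stands in for the unknown sampling distribution.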
A finite total of $n^n$ possible bootstrap samples exists. If the parameter estimates were computed for each of these $n^n$ samples, the true bootstrap estimates of the parameters would be obtained, but such extreme computation is wasteful and unnecessary (Stine, 1990). The number of bootstrap replications B depends on the application and the sample size. It has been suggested that B = 100 replications are sufficient for standard error estimates, B = 1000 for confidence interval estimates, and 50 ≤ B ≤ 100 for standard deviation estimates (Efron, 1990; Leger et al., 1992).
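To give a sense of scale, the count of possible bootstrap samples can be evaluated directly. The value n = 60 is taken here only for illustration (it is the number of carcasses listed in the Material section, before any outliers were omitted).

```python
n = 60                        # illustrative sample size (assumption)
total_samples = n ** n        # number of possible bootstrap samples, n^n
digits = len(str(total_samples))
print(f"{n}^{n} is a {digits}-digit number")
```

Against a count of this size, a choice such as B = 1000 evaluates only a vanishing fraction of the possible samples, which is why a Monte Carlo approximation is used.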
It has been pointed out in the literature that two different bootstrap resampling methods can be used in regression analysis. The choice between the two methods depends on whether the regressors are fixed or random. If the regressors are fixed, the bootstrap resamples the error term. If the regressors are random, the bootstrap resamples pairs of observations (Stine, 1990; Shao, 1996).
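For the fixed-regressor case, resampling of the error term can be sketched as follows. This is a minimal illustration for a single regressor with made-up data. The function names are hypothetical, and using the OLS residuals as the resampled errors is an assumption of the sketch, not a detail stated in the text.

```python
import random

def fit_line(x, y):
    """OLS intercept and slope for a single regressor."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def residual_bootstrap(x, y, B, rng):
    """Keep x fixed, resample the OLS residuals, rebuild y* and refit B times."""
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    reps = []
    for _ in range(B):
        e_star = rng.choices(resid, k=len(resid))            # errors drawn with replacement
        y_star = [fi + ei for fi, ei in zip(fitted, e_star)]
        reps.append(fit_line(x, y_star))
    return reps

rng = random.Random(3)
x = [float(v) for v in range(1, 21)]                 # made-up fixed regressor values
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]   # made-up responses
reps = residual_bootstrap(x, y, B=500, rng=rng)
mean_slope = sum(b1 for _, b1 in reps) / len(reps)
ols_slope = fit_line(x, y)[1]
print(mean_slope, ols_slope)
```

The regressor values never change between replications here, which is the feature that distinguishes this scheme from resampling pairs of observations.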
Here, an algorithm is given for bootstrapping regression models based on resampling observations. This approach is usually applied when the regression model is built from data whose regressors are as random as the response. Let the $(k+1) \times 1$ vector $w_i = (y_i, x_{1i}, x_{2i}, \ldots, x_{ki})'$ denote the values associated with the $i$th observation. In this case, the set of observations is the set of vectors $(w_1, w_2, \ldots, w_n)$. The steps of the bootstrap algorithm with random regressors are:
a. Draw a sample of size n randomly from the population.
b. Draw a bootstrap sample of size n, $(w_1^{*(b)}, w_2^{*(b)}, \ldots, w_n^{*(b)})$, with replacement from the observations, giving probability 1/n to each $w_i$ (Wu, 1986), and label the elements of each vector $w_i^{*(b)} = (y_i^{*(b)}, x_{ji}^{*(b)})'$, where j = 1, 2, ..., k and i = 1, 2, ..., n. From these form the vector $Y^{*(b)} = (y_1^{*(b)}, y_2^{*(b)}, \ldots, y_n^{*(b)})'$ and the matrix $X^{*(b)} = (x_{ji}^{*(b)})$.
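The resampling of observation vectors can be sketched in code. The sketch below uses one regressor and made-up data. Two things in it are assumptions that go beyond the steps listed above: each bootstrap sample is refitted by ordinary least squares, and the mean of the B refitted coefficients is taken as the bootstrap estimate. All function and variable names are hypothetical.

```python
import random

def fit_line(x, y):
    """OLS intercept and slope for a single regressor."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def pairs_bootstrap(w, B, rng):
    """w is a list of (y_i, x_i) observation vectors; returns B refitted (b0, b1) pairs."""
    n = len(w)
    reps = []
    for _ in range(B):
        w_star = rng.choices(w, k=n)            # step b: each w_i has probability 1/n
        y_star = [yi for yi, _ in w_star]       # form the response vector Y*(b)
        x_star = [xi for _, xi in w_star]       # form the regressor matrix X*(b)
        reps.append(fit_line(x_star, y_star))   # assumed step: OLS refit on the bootstrap sample
    return reps

rng = random.Random(1)
x = [float(v) for v in range(1, 21)]                 # made-up regressor values
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]   # made-up responses
w = list(zip(y, x))
reps = pairs_bootstrap(w, B=1000, rng=rng)
boot_slope = sum(b1 for _, b1 in reps) / len(reps)   # assumed bootstrap estimate: mean of replicates
ols_slope = fit_line(x, y)[1]
print(boot_slope, ols_slope)
```

Each observation vector is kept intact when it is drawn, so the relationship between the response and its regressors within an observation is preserved in every bootstrap sample.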
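An illustrative study of the bootstrap algorithm steps given above for the estimation of $\beta$ is shown in Table 1, using the muscle in carcass data. The variance-covariance matrix of $\beta^*$ from the probability distribution $F(\beta^*)$ is calculated by (Liu, 1988; Stine, 1990)
$$\mathrm{Var}(\beta^*) = \frac{1}{B-1} \sum_{b=1}^{B} (\beta^{*(b)} - \beta^*)(\beta^{*(b)} - \beta^*)' \qquad (8)$$
The bootstrap confidence interval of $\beta_j^*$ by the normal approach is obtained by
$$\beta_j^* - t_{n-p,\alpha/2}\, S_e(\beta_j^*) < \beta_j < \beta_j^* + t_{n-p,\alpha/2}\, S_e(\beta_j^*) \qquad (9)$$
where $\beta_j^*$ is the $j$th bootstrap estimator, $S_e(\beta_j^*)$ is the standard error of the $j$th bootstrap estimator, and $t_{n-p,\alpha/2}$ is the t value with n-p degrees of freedom at the α/2 significance level (Diciccio and Tibshirani, 1987).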
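The variance-covariance matrix and the normal-approach interval can be computed from a set of bootstrap replicates as sketched below. The replicates and the critical value `t_crit = 2.0` are placeholders chosen to keep the arithmetic easy to follow; in practice the t value for n-p degrees of freedom would be looked up. The divisor B-1 and the use of the replicate mean as $\beta^*$ are assumptions of this sketch.

```python
import math

def bootstrap_summary(reps, t_crit):
    """Mean, covariance matrix, standard errors and normal-approach intervals of replicates."""
    B, p = len(reps), len(reps[0])
    mean = [sum(r[j] for r in reps) / B for j in range(p)]
    cov = [[sum((r[j] - mean[j]) * (r[k] - mean[k]) for r in reps) / (B - 1)
            for k in range(p)] for j in range(p)]
    se = [math.sqrt(cov[j][j]) for j in range(p)]
    ci = [(mean[j] - t_crit * se[j], mean[j] + t_crit * se[j]) for j in range(p)]
    return mean, cov, se, ci

# Three made-up replicates of (b0, b1); a real run would use B in the hundreds or more.
reps = [(1.0, 2.0), (3.0, 4.0), (5.0, 9.0)]
mean, cov, se, ci = bootstrap_summary(reps, t_crit=2.0)   # t_crit is a placeholder value
print(mean, cov, se, ci)
```

The standard errors are the square roots of the diagonal of the covariance matrix, and each interval is the estimate plus or minus the critical value times its standard error.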
A nonparametric confidence interval for $\beta^*$, named the percentile interval, can be constructed from the quantiles of the bootstrap sampling distribution of $\beta^{*(b)}$. The 95% percentile interval is
$$\beta_j^{*(\mathrm{lower})} < \beta_j < \beta_j^{*(\mathrm{upper})} \qquad (10)$$
where $\beta^{*(b)}$ are the ordered bootstrap estimates of the regression coefficient from Equation 5, lower = 0.025B, and upper = 0.975B.
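A percentile interval only needs the ordered replicates. The sketch below follows the rule lower = 0.025B and upper = 0.975B; rounding those positions to whole numbers is an assumption for cases where B does not make them integers, and the function name and data are hypothetical.

```python
import random

def percentile_interval(reps, alpha=0.05):
    """Return the (lower, upper) ordered replicates at positions alpha/2*B and (1-alpha/2)*B."""
    ordered = sorted(reps)
    B = len(ordered)
    lower = max(1, round(alpha / 2 * B))          # e.g. 25 when B = 1000
    upper = min(B, round((1 - alpha / 2) * B))    # e.g. 975 when B = 1000
    return ordered[lower - 1], ordered[upper - 1]  # positions are 1-based

rng = random.Random(2)
reps = [float(v) for v in range(1, 1001)]   # made-up replicates: the values 1..1000
rng.shuffle(reps)
low, high = percentile_interval(reps)
print(low, high)
```

With the values 1 to 1000 as replicates, the 25th and 975th ordered values are returned, so the positions can be checked by eye.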
The skewness of the distribution $F(\beta^*)$ of the replicates $\beta_j^{*(b)}$ from step (e) can be determined by examining the shape of the distribution plots of the $\beta_j^{*(b)}$. These plots show a histogram of the replicates with an overlaid smooth density estimate. A solid vertical line is plotted at the observed parameter value, and a dashed vertical line at the mean of the replicates.
The statistical packages Excel, S-Plus for Windows and SPSS for Windows were used for the statistical analysis of the data.

RESULTS
The amounts of lean, bone and fat tissue in the lamb carcass were related to some other carcass components and measurements. Non-significant variables were identified by inspection of variable selection statistics and omitted (Draper and Smith, 1981; Catterjee and Teebagy, 1990). For this purpose, outliers, leverage points and influential points were identified at each stage and, after checking for data entry and measurement mistakes, the outliers were omitted (Belsley et al., 1980; Şahinler, 2000). At the same time, the models were checked with regard to the assumptions of the ordinary least squares method, autocorrelation and collinearity (Willan and Watts, 1978; Montgomery and Peck, 1992; Şahinler and Bek, 2002).
The parameter estimates, their standard errors and confidence intervals, and some related descriptive statistics from the ordinary least squares and bootstrap regression methods for the estimation of muscle in carcass, bone in carcass and fat in carcass are given in Table 2.
[Figure 1. Studentized deleted residual plots for checking heteroscedasticity of the error term.]
[...] This means that the distributions of the $\beta_j^{*(b)}$ for all regression models obtained from bootstrapping are normal.

Muscle in carcass
After selecting the independent variables by the stepwise method, the variables muscle in long leg (MLL) and muscle in neck (MN) entered the model, and the ordinary least squares fit to the data is: MC = 200.043 + 3.83 MLL + 4.81 MN (11)
According to the results in Table 2, the regression in Equation 11 is significant (P<0.01) and all of the regression coefficients $\beta_0$, $\beta_1$ and $\beta_2$ are significant (P<0.01). The standard errors of $\beta_0$, $\beta_1$ and $\beta_2$ are 437.877, 0.378 and [...], and the confidence intervals are (-676.79, 1076.88), (3.073, 4.588) and (3.051, 6.569), respectively. The Durbin-Watson test statistic (d), calculated as an autocorrelation diagnostic, is 1.787. Since d = 1.787 is greater than $d_U$ = 1.65, there is no autocorrelation problem in the error term. The VIF statistics, calculated as a collinearity diagnostic, are $VIF_{MLL}$ = 1.875 and $VIF_{MN}$ = 1.875. Since both $VIF_j$ (= 1.875) < 10, there is no collinearity problem between the MLL and MN variables (Table 2). The plot of studentized deleted residuals versus $X_{MN}$ was examined for heteroscedasticity of the error term for the model in Equation 11, and no heteroscedasticity problem was seen in the muscle in carcass data (Figure 1a). Thus, the model in Equation 11 [...]
The bootstrap regression model is: MC = 214.198 + 3.808 MLL + 4.866 MN (12)
[...] According to these results, the bootstrap regression method generally gives smaller standard errors and narrower confidence intervals than ordinary least squares regression. Therefore, the model in Equation 12 is more suitable than the model in Equation 11 for estimating the amount of muscle in the carcass of Awassi lambs.
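The two diagnostics used here have simple definitions that can be sketched directly. The Durbin-Watson statistic is $d = \sum_{t=2}^{n}(e_t - e_{t-1})^2 / \sum_{t=1}^{n} e_t^2$, and for a model with exactly two regressors both variance inflation factors equal $1/(1 - r^2)$, where $r$ is the correlation between the two regressors. The function names and data below are made up for illustration.

```python
def durbin_watson(resid):
    """d = sum of squared successive differences over sum of squared residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

def vif_two_regressors(x1, x2):
    """VIF shared by both regressors in a two-regressor model: 1 / (1 - r^2)."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    r2 = s12 * s12 / (s11 * s22)      # squared Pearson correlation
    return 1.0 / (1.0 - r2)

d = durbin_watson([1.0, -1.0, 1.0, -1.0])                   # made-up alternating residuals
vif = vif_two_regressors([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])  # made-up, strongly correlated regressors
print(d, vif)
```

The two-regressor formula explains why MLL and MN share the same VIF: a value of 1.875 corresponds to a squared correlation of about 0.467 between the two regressors.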

Bone in carcass
Among all of the independent variables, bone in long leg (BLL) and bone in neck (BN) entered the regression model, and the fitted ordinary least squares equation for bone in carcass (BC) is: BC = 606.418 + 3.647 BLL + 3.613 BN (13)
The regression in Equation 13 is significant (P<0.01), $R^2$ = 0.866, and all of the regression coefficients $\beta_0$, $\beta_1$ and $\beta_2$ are significant (P<0.01). The standard errors of $\beta_0$, $\beta_1$ and $\beta_2$ are 142.714, 0.298 and 0.458, and the 95% confidence intervals are (320.64, 892.20), (3.050, 4.243) and (2.696, 4.531), respectively. According to the Durbin-Watson test statistic (d = 2.204) and the VIF statistics ($VIF_{BLL}$ = 1.24 and $VIF_{BN}$ = 1.24), there is neither an autocorrelation problem in the error term nor a collinearity problem between the BLL and BN variables (Table 2). No heteroscedasticity of the error term was detected for the bone in carcass data in the plot of studentized deleted residuals versus $X_{BLL}$ (Figure 1b). Thus, the model in Equation 13 [...]
The bootstrap regression model is: BC = 605.904 + 3.641 BLL + 3.634 BN (14)
The bootstrap standard errors of $\beta_0^*$, $\beta_1^*$ and $\beta_2^*$ are 120.904, 0.270 and 0.441, respectively. The bootstrap confidence and percentile intervals of $\beta_0^*$, $\beta_1^*$ and $\beta_2^*$ are (368.93, 842.88), (3.11, 4.17), (2.77, 4.50) and (370.88, 851.52), (3.10, 4.17), (2.69, 4.41), respectively (Table 2). According to these results, the bootstrap regression method generally gives smaller standard errors and narrower confidence intervals than ordinary least squares regression. Therefore, the model in Equation 14 is more suitable than the model in Equation 13 for estimating the amount of bone in the carcass of Awassi lambs.

Fat in carcass
The fitted ordinary least squares equation for fat in carcass is: FC = -6297 + 716.751 CW (15)
where FC is fat in carcass (g) and CW is carcass weight (kg). For Equation 15, the regression and the coefficients are significant (P<0.01), and $R^2$ = 0.832. The estimates and confidence intervals of $\beta_0$ and $\beta_1$ are -6297, 716.751 and (-7895.5, -4699.1), (632.136, 801.365), respectively. According to the Durbin-Watson test statistic (d = 1.695) (Table 2) and the plot of studentized deleted residuals versus $X_{CW}$ (Figure 1c), neither an autocorrelation problem nor heteroscedasticity of the error term was detected for the fat in carcass data. Thus, the model in Equation 15 [...]
The bootstrap regression model is: FC = -6283 + 716.8 CW (16)
The bootstrap standard errors of $\beta_0^*$ and $\beta_1^*$ are 798.2 and 40.91, respectively. The bootstrap confidence and percentile intervals of $\beta_0^*$ and $\beta_1^*$ are (-7879.4, -4686.6), (634.98, 798.62) and (-7878.7, -4702.1), (634.08, 797.72), respectively (Table 2). According to these results, the bootstrap regression method generally gives smaller standard errors and narrower confidence intervals than ordinary least squares regression. Therefore, the model in Equation 16 is more suitable than the model in Equation 15 for estimating the amount of fat in the carcass of Awassi lambs.

DISCUSSION
The most important advantages of the bootstrap regression method are that it gives smaller standard errors and needs smaller samples than the ordinary least squares method. Its practical performance is frequently much better, but this is not guaranteed (Hawkins and Olive, 2002). For this reason, it is a mistake to expect the bootstrap regression method always to give reliable results. The reliability depends on the structure of the data and on the distribution function. Moreover, the application of resampling methods depends on the development of computer technologies.
When the results in Table 2 are examined, there is no difference between the regression coefficients obtained from the ordinary least squares and bootstrap regression methods (P>0.05), except for the regression coefficient ($\beta_2$ = 1.81 and $\beta_2^*$ = 4.866) for muscle in carcass. Nevertheless, the bootstrap regression method gives regression coefficients that generally have smaller standard errors and narrower confidence intervals than those from the ordinary least squares regression method. A similar result was reported by Efron (1979). However, the bootstrap regression method might not always give a smaller standard error than the ordinary least squares method, as for the regression coefficient (S.E.($\beta_1$) = 42.271 and S.E.($\beta_1^*$) = 43.91) in the fat in carcass model. Fox (1997) also reported similar results. Therefore, the models in Equations 12, 14 and 16 are more suitable than the models in Equations 11, 13 and 15 for estimating the amounts of muscle, bone and fat in the carcass of Awassi lambs, respectively.

CONCLUSIONS
As a result, the muscle and bone amounts in the long leg and neck may be considered the most diagnostic parts of the fat-tail Awassi lamb carcass for muscle and bone in carcass. Carcass weight is the most diagnostic measure of the fat-tail Awassi lamb carcass for fat in carcass.
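As a usage sketch, the three bootstrap models quoted in the abstract can be written as small functions. The coefficients are those reported in the article; the function names and the input values in the example are hypothetical and serve only to show the arithmetic. Fat in carcass is in g and carcass weight in kg, as stated with the fat model; the other inputs are assumed to be in the units of the original data.

```python
def muscle_in_carcass(mll, mn):
    """Bootstrap model: MC = 214.198 + 3.808 MLL + 4.866 MN."""
    return 214.198 + 3.808 * mll + 4.866 * mn

def bone_in_carcass(bll, bn):
    """Bootstrap model: BC = 605.904 + 3.641 BLL + 3.634 BN."""
    return 605.904 + 3.641 * bll + 3.634 * bn

def fat_in_carcass(cw):
    """Bootstrap model: FC = -6283 + 716.8 CW."""
    return -6283 + 716.8 * cw

# Hypothetical inputs chosen for easy arithmetic, not measurements from the study.
mc = muscle_in_carcass(mll=1000.0, mn=300.0)
bc = bone_in_carcass(bll=300.0, bn=100.0)
fc = fat_in_carcass(cw=20.0)
print(mc, bc, fc)
```

Predictions of this kind are point estimates only; the standard errors and intervals reported in Table 2 describe the uncertainty of the coefficients.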