Comparing univariate techniques for tender price index forecasting

The poor performance of projects is a recurring problem in the construction sector. The literature shows that uncertainty in project cost is one of the significant causes of this problem. Reliable forecasts of construction cost help to mitigate the adverse effects of cost fluctuation; however, the availability of data for developing multivariate models for construction cost forecasting remains a challenge. This study investigates the reliability of univariate models for tender price index forecasting. The modelling techniques applied are Box-Jenkins and neural network. The results show that the neural network model outperforms the Box-Jenkins model in terms of accuracy. In addition, the neural network model provides a reliable forecast of tender price index over a period of 12 quarters ahead. The limitations of using univariate models are elaborated. The developed neural network model can be used by stakeholders as a tool for predicting movements in tender price index. In addition, the univariate models developed in the present study are particularly useful in countries where the availability of data for developing multivariate models is a challenge.


Introduction
The construction sector plays a pivotal role in the economic development of any nation. However, the positive impact of the construction sector is limited by the poor performance of construction projects (see Lewis, 1984). For instance, fluctuations in construction cost create uncertainty, which has an adverse effect on project performance. Similarly, studies have reported that finance is one of the significant causes of poor performance of construction projects (Abd El-Razek, Bassioni and Mobarak, 2008; Frimpong, Oluwoye and Crawford, 2003). In fact, Flyvbjerg, Skamris Holm and Buhl (2003, 2004) found that 9 out of 10 transport infrastructure projects experience cost overrun. Taken together, these studies indicate that finance has a significant impact on project performance. Hence, there is a constant need to seek new ways of reducing the uncertainty associated with the cost of construction projects.
Tender price index is a measure that tracks movements in construction cost over time. Research into tender price index forecasting has a long history. Fellows (1991) asserts that the use of time series techniques provides considerable benefit for the construction sector when applied to forecasting problems. Modelling techniques found in the literature can be categorised into two classes, namely univariate (Goh and Teo, 2000) and multivariate (Wong and Ng, 2010). Although studies have shown that both approaches are reliable for predicting tender price index, the choice of a specific technique depends on the objective of the study and the availability of data. For instance, multivariate models are useful for evaluating existing theories or building new theories (see Shmueli, 2010). In contrast, univariate models are suitable for short-term forecasting problems (Goh and Teo, 2000). To date, very little is known about the predictive performance of the univariate variant of the neural network model when applied to tender price index forecasting.
The aim of the current study is to investigate the efficacy of using univariate techniques for tender price index forecasting. The study addresses two objectives: (1) to apply the Box-Jenkins and neural network models to tender price index forecasting and (2) to compare the predictive performance of the developed models. The Box-Jenkins model was selected as the benchmark model due to: (i) the well-structured process of its application and (ii) its acceptable performance in previous research (Goh and Teo, 2000; Yip, Fan and Chiang, 2014). The information provided by the developed models is helpful for strategic decisions relating to the management of project cost.
The remainder of this paper is divided into four sections. The first section provides justification for the approaches used in the present study. The second section, on the research method, describes the modelling techniques adopted in the study. The results, discussion and implications of the findings are presented in the third section. Finally, the fourth section provides concluding remarks and areas for further research.

Literature review

JUSTIFICATION FOR THE APPLICATION OF UNIVARIATE MODELS TO TENDER PRICE INDEX FORECASTING
A growing body of literature has shown that models can be used for predicting future changes in tender price index. Previous research has also found that several economic factors determine the future value of tender price index. Shahandashti and Ashuri (2015) found that crude oil prices and average hourly earnings can explain movements in highway cost in the United States. Wong and Ng (2010) showed that building cost index, gross domestic product (GDP) and the construction sector's contribution to GDP are significant predictors of tender price index in Hong Kong. Table 1 provides a summary of leading determinants used to explain changes in tender price index. However, readers should note that the list of determinants presented in Table 1 is not exhaustive.
Based on the information shown in Table 1, it is evident that a large number of social and economic indicators affect tender price index. Multivariate modelling techniques can be used to capture the relationship between tender price index and its determinants. The estimated multivariate models can also be used for forecasting purposes. However, Raftery (1999) explains that the availability of data is a crucial factor that determines the possibility of developing multivariate models. Similarly, numerous studies have acknowledged that non-availability of data is a major problem faced in the field of construction economics (see Gruneberg and Folwell, 2013; K'Akumu, 2007). The unavailability of data limits the potential of applying multivariate models to forecasting problems. Hence, there is a need to seek alternative reliable methods that require few or no explanatory variables.
In the literature, multivariate models have been used to generate forecasts of tender price index. These include regression, vector error correction and neural network models, amongst others. Previous research has shown that multivariate models may not produce reliable forecasts in selected cases (see Akintoye and Skitmore, 1994; Fan, Ng and Wong, 2010). The poor performance of multivariate models has been attributed to the wrong choice of explanatory variables (Akintoye and Skitmore, 1994), the presence of autocorrelation (Killingsworth, 1990) and volatility (Fan, Ng and Wong, 2010). Thus, the current study explores the potential of using univariate modelling techniques for tender price index forecasting.
Univariate modelling techniques generate forecasts of the predicted variable using only the information contained in its previous values. They have been applied to forecasting problems in several fields, such as construction investments (Goh and Teo, 2000), river flow (Wang et al., 2009), construction manpower (Wong, Chan and Chiang, 2011) and maintenance cost of construction equipment (Yip, Fan and Chiang, 2014), among others. The goal of a univariate technique is to identify a model that captures the underlying data generation process present in a time series. A univariate model can be expressed as:

$$y_t = f(y_{t-1}, \ldots, y_{t-p}) \qquad (1)$$

where $y_{t-1}, \ldots, y_{t-p}$ are previous (lagged) values of the predicted variable (i.e., $y_t$) and $f$ is the function for estimating the model.

The univariate modelling techniques found in construction-related literature include Box-Jenkins, neural network, support vector machine, exponential smoothing and the Holt-Winters smoothing method. These techniques can be broadly classified into two classes: econometric (Box-Jenkins, exponential smoothing and Holt-Winters smoothing method) and artificial intelligence (neural network and support vector machine). According to Marwala (2013), one of the main weaknesses of the econometric approach lies in the assumption that a linear relationship exists among the variables included in the model. Indeed, research has revealed that non-linear models tend to produce better predictions when compared to other approaches (Goh, 1998; Wang et al., 2009). This suggests that the econometric approach may not possess the capability to capture the non-linearity present in real-world data. Despite this, very few studies have investigated the performance of the univariate variant of artificial intelligence models when applied to tender price index forecasting. The present study thus addresses this gap in current knowledge by applying the univariate variant of the neural network model to tender price index forecasting.
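As a minimal sketch of the univariate set-up described above, the following Python function arranges a series into input/target pairs in which each target $y_t$ is paired with its $p$ preceding values. The function name `make_lagged_pairs` and the example numbers are hypothetical and are not taken from the study.

```python
def make_lagged_pairs(series, p):
    """Pair each y_t with its p lagged values (y_{t-1}, ..., y_{t-p})."""
    if p < 1 or p >= len(series):
        raise ValueError("p must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for t in range(p, len(series)):
        # Most recent lag first: y_{t-1}, y_{t-2}, ..., y_{t-p}
        inputs.append([series[t - k] for k in range(1, p + 1)])
        targets.append(series[t])
    return inputs, targets


# Hypothetical index values, for illustration only (not study data).
example = [100.0, 102.0, 105.0, 103.0, 108.0]
X, y = make_lagged_pairs(example, p=3)
```

Any function $f$, linear or non-linear, can then be fitted to map each row of `X` to the matching entry of `y`.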

Research Method

SUITABILITY OF RESEARCH APPROACH
The suitability or otherwise of a method has generated discussion in several academic fields. In construction-related literature, these discussions can be found in previously published research (Fellows, 2010; Love, Holt and Li, 2002). The appropriateness of an approach is largely dependent on the availability of data and the nature of the research problem (Song and Li, 2008; Wing, Raftery and Walker, 1998). In addition, Fellows and Liu (2015) assert that modelling techniques are an ideal fit for studies targeted at computing forecasts. Therefore, univariate modelling techniques were used to generate the forecast of tender price index in the present study.

[...] index. In contrast, Wong and Ng (2010) utilized the tender price index of Hong Kong. Two sources provide statistical data for tender price index in Hong Kong. The tender price index for the private sector is collected by Rider Levett Bucknall, while the Architectural Services Department of Hong Kong collects information for the tender price index of public sector projects. The time series data received from both organisations exhibit similar patterns and trends. Consistent with a previous study (Wong and Ng, 2010), the tender price index from Rider Levett Bucknall was adopted in the present study. The collected data cover the period between 1983Q1 and 2015Q4. The data are divided into two sets: the training set and the test set. The training data (1983Q1-2012Q4) were used to estimate the univariate models, while the test data (2013Q1-2015Q4) were used to evaluate the reliability of the developed models, i.e. their ability to predict previously unseen data. The time-ordered approach adopted in this study is consistent with those used in similar previous research (Jiang, Xu and Liu, 2013; Wong and Ng, 2010).
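The time-ordered split described above can be sketched as follows. This is an illustrative outline only: the helper names `quarter_labels` and `time_ordered_split` are hypothetical, and the values are placeholders rather than the actual tender price index data.

```python
def quarter_labels(start_year, end_year):
    """Build labels such as '1983Q1' for every quarter in the given years."""
    return [f"{year}Q{q}" for year in range(start_year, end_year + 1)
            for q in range(1, 5)]


def time_ordered_split(labels, values, last_training_label):
    """Split without shuffling: everything up to the cut-off is training data."""
    cut = labels.index(last_training_label) + 1
    return (labels[:cut], values[:cut]), (labels[cut:], values[cut:])


labels = quarter_labels(1983, 2015)
values = list(range(len(labels)))  # placeholder values, NOT real index data
(train_labels, train_values), (test_labels, test_values) = time_ordered_split(
    labels, values, "2012Q4"
)
```

With the periods stated in the text, this gives 120 training quarters and 12 test quarters, which matches the 12-quarter forecast horizon.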
Figure 1 shows the time series plot of the tender price index data. As can be seen from the plot, tender price index peaked in three periods, which roughly coincide with three economic events. These events are the Asian Financial Crisis, the global economic meltdown and the increase in the volume of construction works experienced in Hong Kong in recent years. Previous research has shown that the volume of construction investments and property prices significantly influence construction cost (Soo and Oo, 2014; Zheng, Chau and Hui, 2012). In addition, it is evident that tender price index fluctuates over time. Thus, it is important to develop models that can accurately predict future trends in tender price index.

Box-Jenkins Model
In this work, two univariate modelling techniques were used to predict tender price index: Box-Jenkins and neural network. The Box-Jenkins approach assumes that a linear relationship exists between the independent variables (i.e. lagged values of tender price index) and the predicted variable (tender price index). The process of applying the Box-Jenkins model to time series data was proposed in Box and Jenkins (1976). The Box-Jenkins technique is a combination of auto-regression (AR) and moving average (MA) with differencing. The most important criterion that must be satisfied prior to the application of the Box-Jenkins model to data is stationarity. The Box-Jenkins model has been found useful for several forecasting problems in different fields, such as price (Goh and Teo, 2000), construction demand (Fan, Ng and Wong, 2010) and property price index (Hepşen and Vatansever, 2011). Interested readers can refer to Fan, Ng and Wong (2010), Hyndman and Athanasopoulos (2014) and Asteriou and Hall (2016) for a detailed explanation of the process of applying the Box-Jenkins model to time series data.

Neural Network
The neural network model is a system of interconnected neurons whose functioning is inspired by the human brain. The neural network model used in the present study is a multilayer perceptron with one hidden layer. In recent years, there has been a shift towards the application of deep learning (i.e. neural networks with more than one hidden layer) to time series forecasting problems. However, a neural network with one hidden layer was selected for this study based on its performance in previous research (Wang et al., 2009). The input data are presented to the input layer and the numeric weights of the neurons are calibrated during the training process. The neural network and the Box-Jenkins model applied in the present study are similar in that they include the same number of lags of tender price index (TPI). For instance, information contained in the TPI for 2001, 2002 and 2003 was used in predicting the TPI for 2004. However, it is important to note that the neural network model is non-linear and there is no need to make the data stationary prior to its application. The architecture of the neural network model is presented in Figure 2. Additional details about the neural network model developed in the present study can be found in Hyndman and Athanasopoulos (2014).
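To make the structure of a one-hidden-layer network on lagged inputs concrete, the sketch below computes a single forecast from given weights. It is illustrative only: the logistic (sigmoid) hidden activation and linear output are assumptions, the function name `nnar_forward` is hypothetical, and no training step is shown.

```python
import math


def sigmoid(z):
    """Logistic activation (assumed here for the hidden layer)."""
    return 1.0 / (1.0 + math.exp(-z))


def nnar_forward(lags, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass: lagged values -> hidden layer -> single forecast.

    lags           : list of p lagged values of the series
    hidden_weights : one list of p weights per hidden node
    hidden_biases  : one bias per hidden node
    output_weights : one weight per hidden node
    output_bias    : bias of the (linear) output node
    """
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        z = bias + sum(w * x for w, x in zip(weights, lags))
        hidden.append(sigmoid(z))
    # Linear combination of hidden activations gives the forecast.
    return output_bias + sum(v * h for v, h in zip(output_weights, hidden))
```

The non-linearity enters only through the hidden activations; with the hidden layer removed, the model would reduce to a linear autoregression on the same lags.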
The final output of the neural network model depends on the initial weights of the neurons, which are selected via a random process. To address this problem, the developed model is trained 25 times and the average of the forecasts is computed. The number of nodes in the hidden layer is determined by a rule of thumb suggested in Frederick (1996). This approach has proven useful in previous studies (e.g. Leung and Lee, 2013). The rule of thumb (Equation 2) expresses $N_h$ as a function of $N_{in}$, $N_{out}$ and $N_s$, where $N_h$, $N_{in}$ and $N_{out}$ are the numbers of neurons in the hidden layer, input layer and output layer, respectively, and $N_s$ is the number of training samples.
The number of hidden neurons to be examined is chosen to be $N_h \pm 5$. To have a valid basis for comparison, the Box-Jenkins model and the neural network model are developed using the same inputs. The neural network model was implemented in R (R Core Team, 2015) using the forecast package (Hyndman et al., 2015), which facilitates implementation of the univariate version of the neural network model.
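The hidden-layer search and the averaging over repeated training runs can be sketched as below. The rule-of-thumb equation is not reproduced in this text, so the sketch assumes a widely quoted form, $N_h = (N_{in} + N_{out})/2 + \sqrt{N_s}$, which may differ from the authors' Equation 2. All function names, the rounding step and the example numbers are hypothetical.

```python
import math


def hidden_nodes_rule(n_in, n_out, n_samples):
    # ASSUMED form of the rule of thumb (not confirmed by the source text):
    #   N_h = (N_in + N_out) / 2 + sqrt(N_s)
    # Rounding to a whole number of nodes is also an assumption.
    return round((n_in + n_out) / 2 + math.sqrt(n_samples))


def candidate_hidden_sizes(n_h, spread=5):
    """Sizes N_h - spread ... N_h + spread, keeping at least one node."""
    return [n for n in range(n_h - spread, n_h + spread + 1) if n >= 1]


def average_forecasts(runs):
    """Element-wise mean of forecasts from repeated training runs."""
    horizon = len(runs[0])
    if any(len(run) != horizon for run in runs):
        raise ValueError("all runs must cover the same forecast horizon")
    return [sum(run[h] for run in runs) / len(runs) for h in range(horizon)]
```

In this outline, each candidate size would be trained from several random initialisations (25 in the study) and `average_forecasts` would combine the resulting forecast paths before accuracy is assessed.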

PRE-PROCESS DATA FOR FITTING TO UNIVARIATE MODELS
There is a need to pre-process the collected data (i.e., tender price index) before the application of the univariate models. Data pre-processing entails three steps: (i) transforming the collected data into a stationary series, (ii) identifying the appropriate number of lags and (iii) splitting the collected data into training and testing sets. The tender price index data were subjected to a unit root test to check for stationarity. The results of the test are presented in Table 2. The results show that tender price index (TPI) is non-stationary in levels. However, the first difference of tender price index (∆TPI) is stationary at the 5% level of significance. Hence, the Box-Jenkins model used in the present study is integrated of order one.
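First differencing, the transformation referred to above, can be illustrated with a short sketch. The function names are hypothetical and the unit root test itself is not implemented here.

```python
def difference(series, order=1):
    """Apply first differencing `order` times: d_t = y_t - y_{t-1}."""
    result = list(series)
    for _ in range(order):
        result = [later - earlier for earlier, later in zip(result, result[1:])]
    return result


def undifference(first_value, diffs):
    """Rebuild the level series from its first value and first differences."""
    values = [first_value]
    for d in diffs:
        values.append(values[-1] + d)
    return values
```

Forecasts produced on the differenced scale need to be cumulated back to levels, as `undifference` does, before they are compared with the actual index values.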
The autocorrelation function (ACF) and partial autocorrelation function (PACF) are established tools for identifying the number of lags to be included in the Box-Jenkins model. Previous research has shown that the ACF and PACF are also useful for selecting the number of lags (i.e. input variables) for other univariate models, including neural networks (Wang et al., 2009). To facilitate objective comparisons between the models, the lags of tender price index included in the Box-Jenkins and neural network models are the same. Finally, the dataset is split into training and testing datasets. The training dataset is used to calibrate the univariate models. The developed models are then used to generate forecasts of tender price index over the test data period. The accuracy of each model was evaluated based on its performance in out-of-sample predictions of tender price index.
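As an illustration of the lag-identification tool mentioned above, the sketch below computes the sample autocorrelation function in plain Python, together with the usual approximate 95% significance bound. The function names are hypothetical, and the PACF is not implemented here.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelations r_0 ... r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def significance_bound(n):
    """Approximate 95% bound for a white-noise series of length n."""
    return 1.96 / math.sqrt(n)
```

Lags whose autocorrelation lies outside plus or minus `significance_bound(n)` are the usual candidates for inclusion as model inputs.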

FORECAST EVALUATION
The developed univariate models are used to generate out-of-sample forecasts. The predictive performance was evaluated by comparing the predicted values with the actual values in the test dataset. Shmueli (2010) points out that there is a clear difference between "explanation" and "prediction". R-squared is a metric used for evaluating the strength of the relationship between variables, i.e. explanatory power. In contrast, mean absolute percentage error (MAPE) and Theil's inequality coefficient U are used to evaluate the predictive accuracy of a forecast model. These two metrics were adopted in this study and are commonly used in the construction economics literature (see Fan, Ng and Wong, 2010; Goh and Teo, 2000). It is expected that good forecasting models would produce consistent results across multiple metrics (Crone, Hibon and Nikolopoulos, 2011). For a forecast to be considered reliable and acceptable, the value of MAPE should be less than 10% and Theil's inequality coefficient U should be close to 0 (Fan, Ng and Wong, 2010; Goh and Teo, 2000). MAPE and Theil's inequality coefficient U are computed using Equations 3 and 4, where $y_i$ is the actual value for time $i$, $y_H$ is the highest actual value, $y_L$ is the lowest actual value, $\hat{y}_i$ is the predicted value and $N$ is the total number of data points in the testing set. Low values of MAPE and the U coefficient indicate high prediction accuracy of the developed model.
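The two accuracy metrics can be sketched in Python as follows. Equations 3 and 4 are not reproduced in this text, so the sketch assumes the standard textbook definitions, which may differ in detail from the authors' formulation (for instance, $y_H$ and $y_L$ do not appear in these forms):

$$\mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right|, \qquad U = \frac{\sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2}}{\sqrt{\frac{1}{N}\sum_{i=1}^{N} y_i^2} + \sqrt{\frac{1}{N}\sum_{i=1}^{N} \hat{y}_i^2}}$$

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumed standard form)."""
    n = len(actual)
    return 100.0 / n * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


def theil_u(actual, predicted):
    """Theil's inequality coefficient, bounded between 0 and 1 (assumed form)."""
    n = len(actual)
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    denom = (math.sqrt(sum(a * a for a in actual) / n)
             + math.sqrt(sum(p * p for p in predicted) / n))
    return rmse / denom
```

Under these definitions a perfect forecast gives a MAPE of 0% and a U of 0, which is consistent with the acceptance thresholds quoted in the text.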

BOX-JENKINS MODEL
The data must be stationary prior to fitting the Box-Jenkins model. The ACF and PACF plots of the tender price index are shown in Figure 3. The ACF plot drops off exponentially toward zero and the PACF plot drops off after the first lag. This suggests that the tender price index data are non-stationary in levels. The data were therefore transformed using differencing. The ACF and PACF plots of the first-differenced series are presented in Figure 4. The absence of a significant spike after the first lag in the PACF plot and the exponential drop-off towards zero suggest that the first-differenced series is stationary. The ADF unit root test (see Table 2) confirms that the first-differenced series is stationary.
Various candidate Box-Jenkins models were estimated. As suggested in Asteriou and Hall (2016) and Fan, Ng and Wong (2010), the goodness-of-fit of the Box-Jenkins model is assessed using the Akaike Information Criterion, MAPE, the U coefficient and the absence of serial correlation in the residuals. The correlogram of the residuals of the Box-Jenkins model is shown in Figure 5. The high p-values of the Ljung-Box Q-statistics and the random pattern observed in the correlogram (Figure 5) indicate that no serial correlation is present in the residuals. Therefore, the developed Box-Jenkins model is robust. The 'best' Box-Jenkins model is the ARIMA (1, 1, 1) model.
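For readers unfamiliar with the notation, an ARIMA (1, 1, 1) model can be written on the differenced scale as $\Delta y_t = c + \phi_1 \Delta y_{t-1} + \varepsilon_t + \theta_1 \varepsilon_{t-1}$. The sketch below turns this into a one-step-ahead forecast. It assumes the plus-sign convention for the moving average term (some texts use a minus sign), and the function name and coefficient values are hypothetical, not the estimates from the study.

```python
def arima_111_one_step(y_prev, y_prev2, eps_prev, phi, theta, c=0.0):
    """One-step-ahead forecast of y_t from an ARIMA(1, 1, 1) model.

    y_prev, y_prev2 : the two most recent observed levels (y_{t-1}, y_{t-2})
    eps_prev        : the most recent residual (epsilon_{t-1})
    phi, theta      : AR(1) and MA(1) coefficients (plus-sign MA convention)
    c               : optional constant on the differenced scale
    """
    # Forecast the change first; the future shock has expectation zero.
    diff_forecast = c + phi * (y_prev - y_prev2) + theta * eps_prev
    # Integrate back to the level of the index.
    return y_prev + diff_forecast
```

For multi-step forecasts the function would be applied recursively, with unknown future residuals set to zero.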

NEURAL NETWORK MODEL
Equation 2 was used to identify the best parameter (i.e. the number of nodes in the hidden layer) for the neural network model. Of the several candidate neural network models examined, the parameter of the best model is presented in Table 3.

ASSESSMENT OF THE FORECAST ACCURACY OF THE DEVELOPED UNIVARIATE MODELS
As stated earlier, the data from 1983Q1 to 2012Q4 were used to build the univariate models. Subsequently, the developed models were used to generate out-of-sample forecasts over the test period (2013Q1-2015Q4). This provides a basis for objective comparison of the forecast accuracy of the univariate models. The predictive accuracy of the two forecasting models was examined by comparing the actual and forecast values of tender price index for the test period (see Table 4). The values of MAPE for the two models were below the 10% acceptable limit. In addition, the values of the U coefficient were close to 0. This indicates that the Box-Jenkins and neural network models are acceptable for predicting tender price index. Furthermore, the results of the forecast evaluation metrics suggest that the neural network model gives a better prediction of tender price index when compared with the Box-Jenkins model. This is evidenced by the lower values of MAPE and the U coefficient for the developed neural network model, i.e. 0.82% and 0.0059, respectively.

Discussion
Previous studies have shown that the development of reliable models for prediction of tender price index is important. Due to the unavailability of data in most countries, there is a continuous need to explore the use of models that can be built using limited data. In the current study, two univariate techniques were used for modelling and forecasting of tender price index. The suitability of applying univariate modelling techniques to tender price index forecasting was investigated.
The most obvious finding to emerge from this study is that the neural network can produce accurate forecasts of future values of tender price index. The value of MAPE for the neural network model is within the satisfactory limit of 10%. In addition, the value of the U coefficient is close to 0. The outcome of this research agrees with the results reported in previous research (Goh, 1998), which showed that non-linear models (such as neural networks) have the capacity to adequately explain changes in construction economics data. The findings of this study highlight the efficacy of using the univariate variant of the neural network model for tender price index forecasting.

Conclusion
The present study set out to examine the effectiveness of using the univariate variant of the neural network model for tender price index forecasting. The developed neural network model was compared with the classic Box-Jenkins model, which was used as a baseline for comparison purposes. The ACF and PACF plots were used for identifying the input variables incorporated in the univariate models. Out-of-sample forecasts over the test period (2013Q1-2015Q4) served as a basis for evaluating the forecast accuracy of the developed models.
One of the more significant findings to emerge from this study is that the proposed neural network is more accurate than the traditional Box-Jenkins model for medium-term prediction of tender price index. The findings of this research have several practical implications. First, research has shown that there is a significant variance between initial project estimates and final project costs (AbouRizk, Babey and Karumanasseri, 2002; Bacon and Besant-Jones, 1998). As explained earlier, changes in construction cost are captured in the tender price index. Accurate prediction of tender price index would benefit project stakeholders (such as clients and contractors, among others) by reducing the variations between the initial and final cost of construction projects. Tender price index forecasts can be used to estimate movements in construction cost. This ensures that adequate provisions (e.g. contingency) are made available during the planning phase of the project. Second, tender price index is an indicator of the changes in the cost of resources used during the execution of construction projects. Research has shown that the gap between the adaptive capacity of the construction industry and the volume of construction investment is a major factor that influences movements in tender price index (Lewis, 1984; Soo and Oo, 2014). Information on changes in future values of tender price index can be used for strategic planning purposes at the macro level for the construction industry. For instance, there might be a need to increase investment in apprenticeship training programs to provide the workforce needed to address labour shortages. Thus, the proposed neural network model has the potential to be utilized for forecasting tender price index and other variables in the field of construction economics, such as productivity.
The findings of the study reported in this paper are subject to certain limitations. First, the research (i.e. the developed univariate models) does not capture the interaction between tender price index and its determinants (i.e. the factors influencing tender price index). Due to this, the models cannot be used to investigate, for example, the impact of changes in GDP on tender price index. Second, the proposed model needs to be updated from time to time as new tender price index data become available. Despite these limitations, the objective of the present study was achieved. The neural network model is particularly useful in countries where the availability of data for the development of multivariate models is a challenge. The study extends current knowledge on the usefulness of artificial intelligence models for tender price index forecasting. Taken together, the developed neural network model can serve as a decision support tool for estimating changes in tender price index. Reliable and accurate forecasts of tender price index are vital for improving the performance of construction projects, and this will have a long-term beneficial impact on business organisations in the construction sector and on the economy.

Figure 1 Time series plots of tender price index

Figure 3 Correlogram of tender price index

Figure 5 Correlogram of the residuals of the 'best' Box-Jenkins model

Table 1 Significant determinants of tender price index from previous studies

Table 2 Results of Augmented Dickey-Fuller (ADF) unit root test
Note: TPI represents Tender Price Index; ∆ = first difference operator. * denotes rejection of the null hypothesis at the 5% level of significance.

Table 3 Parameter of the neural network

Table 4 Out-of-sample forecast accuracy of Box-Jenkins and neural network