Solar and wind forecasting by NARX neural networks

The nonlinear autoregressive network with exogenous input (NARX) is used to perform hourly solar irradiation and wind speed forecasting, according to a multi-step-ahead approach. Temperature is considered as the exogenous variable. The NARX topology selection is supported by the combined use of two techniques: (1) a genetic algorithm (GA)-based optimization technique and (2) a method that determines the optimal network architecture by pruning (the optimal brain surgeon (OBS) strategy). The considered variables are observed at hourly scale in a seven-year dataset, and the forecasting is done for several time horizons in the range from 8 to 24 h ahead.


Introduction
An accurate prediction of solar energy production is crucial for the effective integration of photovoltaic (PV) and wind generators in smart grids [1,2]. For this reason, modeling solar irradiation by means of time series forecasting techniques is becoming widespread.
In general, artificial neural networks (ANNs) have proven to be more effective for this purpose than classical autoregressive predictors, such as ARX, ARMAX, and Box-Jenkins (BJ) models [3]. On the other hand, some issues are still under discussion. For example, once a given ANN is chosen, general and reliable criteria are needed for selecting the most appropriate structure of the ANN-based model. In particular, methods for determining both the optimal weight set and the best network topology are useful to avoid a time-consuming trial and error procedure for the network set-up.
In this paper, an ANN-based model is used to perform hourly solar irradiation and wind speed forecasting, according to a multi-step-ahead approach. In particular, the nonlinear autoregressive network with exogenous input (NARX) is chosen, where the exogenous variable is the temperature. Temperature is chosen both because it is available in the database and because it is suitable as an exogenous input in solar irradiation forecasting. With the available temperature dataset, wind speed forecasting has been performed as well, with good results. The NARX network is chosen for its good ability to handle problems involving the modeling of nonlinear dynamic systems, such as dependencies among meteorological time series [4,5].
To overcome the disadvantage of the trial and error based procedure [6,7], in this work the NARX structure selection is supported by the combined use of two techniques: (1) a genetic algorithm (GA)-based optimization technique that allows the best network weight set to be determined and (2) a pruning method based on the optimal brain surgeon (OBS) strategy that determines the optimal network architecture. In this way an optimized NARX is obtained.
The considered datasets refer to a seven-year observation period, and the forecasting is done for several time horizons in the range from 8 to 24 h ahead.

Geographical context and performance indices
Temperature, global solar irradiation and wind speed data used in this study come from the gauge station of Palermo, Sicily (Italy) (latitude 38°8′ N, longitude 13°20′ W, elevation 55 m). The dataset consists of the hourly global solar irradiation (MJ/m²), the hourly wind speed (m/s) measured at two meters above ground level, and the hourly maximum and minimum temperature, recorded over seven years, from 2002 to 2008. In this paper the mean hourly temperature has been used. All data have been provided by SIAS (Servizio Informativo Agrometeorologico Siciliano).

The performance indices used to assess the NARX model are the normalized root mean square error (NRMSE) and the coefficient of variation of the root mean square error, CV(RMSE). They are defined, respectively, as:

$$\mathrm{NRMSE} = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2}}{Y_{\max} - Y_{\min}} \qquad (1)$$

$$\mathrm{CV(RMSE)} = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2}}{\bar{Y}} \qquad (2)$$

where $Y$ is the observed time series, $\hat{Y}$ is the predicted time series, $n$ is the number of samples, $\bar{Y}$ is the mean of the observed values, and $Y_{\max}$ and $Y_{\min}$ are the maximum and minimum observed values. In this paper, the performance indices carry the subscripts e, v, t, r and f, which indicate, respectively, the estimation set and the validation set for the time series linear model, and the training phase, the recall phase (application of the model to the validation set) and the forecast phase for the neural approach.
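As a minimal sketch, the two indices can be computed as follows, assuming the usual definitions (RMSE divided by the observed range for NRMSE, and RMSE divided by the observed mean for CV(RMSE)). The function names and the sample values are illustrative and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse(observed, predicted):
    """RMSE normalized by the observed range (Y_max - Y_min)."""
    return rmse(observed, predicted) / (max(observed) - min(observed))


def cv_rmse(observed, predicted):
    """RMSE normalized by the mean of the observed values."""
    return rmse(observed, predicted) / (sum(observed) / len(observed))


# Hypothetical hourly values, for illustration only (not data from the paper).
y_obs = [0.0, 1.0, 2.0, 3.0]
y_hat = [0.0, 1.0, 2.0, 1.0]
print(nrmse(y_obs, y_hat), cv_rmse(y_obs, y_hat))
```

Multiplying either value by 100 gives the index as a percentage, which is how the indices are quoted in the results.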

Forecasting technique
The NARX neural network is derived from a class of discrete-time nonlinear systems, i.e., the nonlinear autoregressive with exogenous input (NARX) models. It has been chosen for the proposed analysis since it is well suited to modeling nonlinear dynamic systems. The mathematical formulation of the NARX model is the following:

$$y(t) = f\big(u(t - n_u), \ldots, u(t - 1), u(t), y(t - n_y), \ldots, y(t - 1)\big) \qquad (3)$$

where $u(t)$ and $y(t)$ are the independent (exogenous) input and the output of the model at the discrete time step $t$, $n_u \geq 1$ and $n_y \geq 1$, with $n_y \geq n_u$, are the input memory and output memory orders (delays), and $f$ is a nonlinear mapping function.
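The following sketch shows how the lagged regressors of a NARX model can be assembled from two aligned hourly series. It assumes the common convention in which the regressor at time t contains u(t - n_u), ..., u(t) followed by y(t - n_y), ..., y(t - 1); the function name and the toy series are hypothetical.

```python
def narx_regressors(y, u, n_y, n_u):
    """Build (regressor, target) pairs for a NARX model.

    Assumed convention: the regressor at time t is
    [u(t - n_u), ..., u(t - 1), u(t), y(t - n_y), ..., y(t - 1)]
    and the target is y(t).
    """
    if len(y) != len(u):
        raise ValueError("y and u must have the same length")
    start = max(n_y, n_u)  # first time step with a full window of lags
    rows, targets = [], []
    for t in range(start, len(y)):
        exogenous_window = list(u[t - n_u : t + 1])  # includes u(t)
        output_window = list(y[t - n_y : t])         # excludes y(t)
        rows.append(exogenous_window + output_window)
        targets.append(y[t])
    return rows, targets


# Toy series, for illustration only: u plays the role of temperature,
# y the role of the forecast variable.
y_series = [10, 11, 12, 13, 14]
u_series = [0, 1, 2, 3, 4]
rows, targets = narx_regressors(y_series, u_series, n_y=2, n_u=1)
print(rows[0], targets[0])
```

Each row is what the mapping f receives as input, so any regression model (here, an MLP) can be fitted on these pairs.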
When the function f is approximated by a multilayer perceptron (MLP), the resulting neural network is called a NARX network. In other words, a NARX network consists of an MLP that takes as input a window of past independent (exogenous) inputs and past outputs and calculates the current output. Unlike a conventional recurrent neural network, the NARX network has limited feedback, coming only from the output neuron rather than from the hidden states: only the output of the NARX is fed back to the input of the feedforward neural network. Nevertheless, it has been demonstrated that it is as computationally powerful as a fully connected recurrent neural network [8]. In order to avoid the trial and error approach for the determination of the parameters, the NARX has been trained by a GA in its open-loop form. The multi-step-ahead forecast has been obtained using the NARX in closed-loop form (Fig. 1). In particular, the procedure followed to find the best set of NARX parameters can be summarized as follows: (1) a starting network topology is chosen, (2) a GA-based optimization technique is used to determine the best weight set of the network (training phase) and (3) a pruning method based on the OBS strategy is applied to extract the optimal number of network parameters, reducing the number of connections.
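The closed-loop use of a trained one-step predictor can be sketched as below: each prediction is appended to the output history and fed back as an input for the next step. The sketch assumes that the exogenous series is available over the whole forecast horizon and that `model` is any callable mapping a regressor to a scalar; both are assumptions of this example, not statements about the authors' implementation.

```python
def closed_loop_forecast(model, y_past, u_series, n_y, n_u, horizon):
    """Iterate a one-step NARX predictor `horizon` steps ahead.

    y_past   : observed outputs for times 0 .. t0 - 1 (at least n_y values)
    u_series : exogenous values for times 0 .. t0 - 1 + horizon
               (assumed to be known or separately forecast)
    model    : callable taking [u(t - n_u), ..., u(t), y(t - n_y), ..., y(t - 1)]
    """
    y = list(y_past)
    t0 = len(y)
    if t0 < max(n_y, n_u) or len(u_series) < t0 + horizon:
        raise ValueError("not enough history or exogenous data")
    for t in range(t0, t0 + horizon):
        regressor = list(u_series[t - n_u : t + 1]) + y[t - n_y : t]
        y.append(model(regressor))  # the prediction is fed back as an input
    return y[t0:]


def toy_model(regressor):
    """Stand-in for a trained network: y(t) = 0.5 * y(t - 1) + u(t)."""
    u_now, y_prev = regressor
    return 0.5 * y_prev + u_now


forecast = closed_loop_forecast(
    toy_model, y_past=[4.0], u_series=[0.0, 1.0, 1.0, 1.0],
    n_y=1, n_u=0, horizon=3,
)
print(forecast)  # [3.0, 2.5, 2.25]
```

In open-loop form the same regressor would be filled with measured outputs instead of predictions, which is why that form is the natural choice during training.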
The number of network parameters is the number of connections (weights) contained in the neural network. For the NARX network under study, this number ($N_p$) is given by Eq. (4) as a function of $n_u$ and $n_y$, the input memory and output memory orders (delays), and of $N$, the number of neurons in the hidden layer, plus a term "1" that accounts for the bias of the output neuron. Once the optimal number of parameters is found, the optimal structure of the considered neural network is defined.
The number of parameters of the chosen NARX network, according to (4), is 161 for the temperature/solar irradiation system and 321 for the temperature/wind speed system.
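As a rough illustration of how such totals arise, the sketch below counts the parameters of an MLP with one hidden layer and a single output neuron, assuming every hidden neuron and the output neuron has a bias. This counting rule is an assumption of the example and may differ from the expression used in the paper; the helper names are hypothetical. The second function lists every input/hidden-size combination that matches a given total, since the text does not state which one was used.

```python
def mlp_parameter_count(n_inputs, n_hidden):
    """Weights and biases of a one-hidden-layer MLP with one output neuron."""
    hidden_layer = (n_inputs + 1) * n_hidden  # input weights + one bias per hidden neuron
    output_layer = n_hidden + 1               # hidden-to-output weights + output bias
    return hidden_layer + output_layer


def consistent_topologies(total):
    """All (n_inputs, n_hidden) pairs whose parameter count equals `total`."""
    pairs = []
    for n_hidden in range(1, total):
        # total = (n_inputs + 2) * n_hidden + 1, solved for n_inputs
        if (total - 1) % n_hidden == 0:
            n_inputs = (total - 1) // n_hidden - 2
            if n_inputs >= 1:
                pairs.append((n_inputs, n_hidden))
    return pairs


print(consistent_topologies(161))
print(consistent_topologies(321))
```

Under this assumed rule, for example, 10 hidden neurons with 14 lagged inputs give 161 parameters and 10 hidden neurons with 30 lagged inputs give 321; these are only candidates compatible with the totals, not values reported by the authors.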
As is well established, GAs are heuristic, stochastic, combinatorial optimization techniques based on the biological process of natural evolution [9,10]. The three heuristic processes of selection, crossover and mutation are applied probabilistically to discrete variables that are coded as binary strings or strings of real numbers. The algorithm starts by creating a random initial population; then, in each generation, it uses the individuals of the current population to create the next one. To perform this step, the algorithm evaluates the fitness value of every individual and selects members, called parents, based on their fitness. In this application the individuals of the population are the weights and the biases of the neural network; the training error of the NARX (i.e., the mean square error, MSE) is used to provide the fitness value.

Comparing the performance indices for the training and test sets (recall phase), it has been observed that the network overfits the data. This means that the selected model structure contains too many weights [11]. In order to find the best structure of the NARX, the OBS strategy for pruning the neural network model is then used [12]. In particular, a function that retrains the network after each weight elimination is employed. On this basis, the best number of network parameters corresponds to the minimum value of the test error. Figure 2 shows the test error and the Akaike index (final prediction error, FPE) [13] versus the number of network parameters for the solar irradiation variable. In the considered case, the optimal number of network parameters is 61 for solar irradiation and 21 for wind speed.

Finally, Figure 3 shows the pruned structures of the feedforward neural network with the optimum number of connections. In Figure 3 (right side), it can be noted that the starting structure of the neural network for wind speed is more complex than that for solar irradiation (left side). This is because, starting from a simpler structure for the wind speed variable, the pruning algorithm could not converge. In Figure 3, solid lines denote excitatory connections, whereas dashed lines denote inhibitory connections. Starting from this structure, the optimized NARX is obtained and used to perform the hourly solar irradiation and wind speed forecasting.
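A single pruning step of the optimal brain surgeon strategy can be sketched as follows, using the standard OBS expressions: the saliency of weight q is w_q^2 / (2 [H^-1]_qq), and removing it changes the remaining weights by -(w_q / [H^-1]_qq) times column q of H^-1. The inverse Hessian of the training error is assumed to be supplied by the caller, and the function name and the numbers are hypothetical.

```python
def obs_prune_step(weights, h_inv):
    """One optimal-brain-surgeon step: remove the least salient weight.

    weights : current weight vector (assumed to be at a minimum of the error)
    h_inv   : inverse Hessian of the training error, as a list of rows
    Returns (index of the pruned weight, its saliency, updated weights).
    """
    # Saliency: estimated increase in error caused by deleting each weight.
    saliencies = [w * w / (2.0 * h_inv[q][q]) for q, w in enumerate(weights)]
    q = min(range(len(weights)), key=saliencies.__getitem__)
    # Adjust every weight so that weight q goes to zero at minimal error cost.
    scale = weights[q] / h_inv[q][q]
    updated = [w - scale * h_inv[i][q] for i, w in enumerate(weights)]
    updated[q] = 0.0  # zero by construction; set explicitly to avoid round-off
    return q, saliencies[q], updated


# Illustrative values only: two weights and a 2x2 inverse Hessian.
index, saliency, new_weights = obs_prune_step(
    [1.0, 0.1], [[2.0, 1.0], [1.0, 2.0]]
)
print(index, saliency, new_weights)
```

In a full pruning loop, already removed weights would be excluded from later steps and the network would be retrained after each elimination, as described above, with the test error recorded for each resulting size.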

Results and discussion
As previously said, the hourly solar irradiation forecasting has been performed by using a NARX network whose parameters have been defined by a GA procedure, starting from a structure with a quite large number of parameters. The performance indices of the optimized NARX, obtained according to the method described in Section 3, have been computed. In particular, the NRMSE_r and CV(RMSE)_r in the recall phase are 6.1% and 32%, respectively, for solar irradiation prediction, and 7% and 47% for wind speed prediction. This result demonstrates that the method proposed in this paper for fixing the NARX structure overcomes the disadvantage of repeated tests, typical of the trial and error procedure, where much time is needed to obtain the optimal network configuration. Moreover, with the proposed method, the performance of the network is improved as well [14]. It is worth noting that, in any case, the network performance is measured by the statistical indices in (1) and (2).

The solar irradiation recalled by the optimized NARX is given in Figure 4 and the corresponding autocorrelation of the residuals is sketched in Figure 5. From Figure 5 it is possible to observe that the points corresponding to the autocorrelation of the residuals lie within the confidence interval, except for three. For the sake of brevity, only graphical results of the solar irradiation prediction are shown in this manuscript. From the histogram of the residuals obtained by the NARX network optimized by GA-OBS for the prediction of the solar irradiation, the mean value is equal to −0.0096 and the variance is equal to 0.20. The t-test performed on these residuals confirms that the data are a random sample from a normal distribution with mean equal to −0.0096 and unknown variance. This confirms the good result of the proposed method, since the residuals are normally distributed around zero with a low variance.
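The residual check described above can be sketched as follows: the sample autocorrelation of the residuals is compared with an approximate white-noise confidence band. The 95% level (bound of 1.96 divided by the square root of the number of residuals) is an assumption of this example, since the confidence level is not stated in the text, and the function names are hypothetical.

```python
import math


def autocorrelation(residuals, max_lag):
    """Sample autocorrelation of the residuals for lags 1 .. max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    c0 = sum(c * c for c in centered)  # lag-0 sum of squares
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]


def lags_outside_band(residuals, max_lag, z=1.96):
    """Lags whose autocorrelation exceeds the approximate white-noise band."""
    bound = z / math.sqrt(len(residuals))  # assumed 95% band for z = 1.96
    acf = autocorrelation(residuals, max_lag)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > bound]


# Strongly correlated toy residuals, for illustration only.
toy_residuals = [1.0, -1.0] * 50
print(lags_outside_band(toy_residuals, max_lag=2))  # [1, 2]
```

For residuals close to white noise, only a few lags are expected to fall outside the band, which is the pattern reported for Figure 5.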
The results of the multi-step-ahead forecasting are shown in Figures 6 and 7.
Very similar results are obtained for the analysis of the residuals of the wind speed prediction. Starting from the structures sketched in Figure 3, the optimized NARXs have been used to forecast hourly solar irradiation and hourly wind speed for five different time horizons ranging from 8 to 24 h. The performance indices are shown in Table 1. The best forecasting results are obtained for the 8 and 10 h horizons for solar irradiation and for the 18 and 24 h horizons for wind speed.

Conclusions
The nonlinear autoregressive network with exogenous input (NARX) is used to perform hourly solar irradiation and wind speed forecasting, according to a multi-step-ahead approach. Temperature has been considered as the exogenous variable in the analysis. The NARX optimized by a GA and an OBS strategy overcomes the drawback of setting up the network structure by repeated trials. The proposed method can improve the forecasting of PV and wind power generation and can be effectively implemented within a smart grid management system.

Fig. 3. Pruned structure of the feed forward neural network with the optimum number of connections: (a) solar irradiation and (b) wind speed.

Table 1. Performance indices for the NARX in forecasting (bold values represent the best results).