Forecasting wind power - Modeling periodic and non-linear effects under conditional heteroscedasticity

F orecasting Wind P o w er – Mo deling P erio dic and Non-linear Eﬀects Under Conditional Heteroscedasticit y Florian Ziel Eur op e an University Viadrina, Chair for Financ e and Capital Market The ory, Gr oße Scharrnstr aße 59, 15230 F r ankfurt (Oder), Germany, T el. +49 (0)335 5534 2986, E-Mail: ziel@eur op a-uni.de. Carsten Cro onen bro ec k University of R osto ck, F aculty for Envir onmental Scienc e, Justus-von-Liebig-We g 2, 18059 R osto ck, Germany, T el. +49 (0)381 498 3267, E-Mail: c arsten.cr o onenbr o e ck@uni-r osto ck.de. Daniel Am bach Eur op e an University Viadrina, Chair for Quantitative Metho ds, esp. Statistics, Gr oße Scharrnstr aße 59, 15230 F r ankfurt (Oder), Germany, T el. + 49 (0)335 5534 2983, E-Mail: amb ach@eur op a-uni.de. Abstract In this article w e presen t an approach that enables joint wind sp eed and wind p o wer forecasts for a wind park. W e combine a m ultiv ariate seasonal time v arying threshold au- toregressiv e mo ving a verage (TV ARMA) mo del with a pow er threshold generalized autore- gressiv e conditional heteroscedastic (p o w er-TGARCH) model. The modeling framework incorp orates diurnal and ann ual p eriodicity mo deling by p eriodic B-splines, conditional heteroscedasticit y and a complex autoregressiv e structure with non-linear impacts. In con- trast to usually time-consuming estimation approac hes as lik eliho od estimation, we apply a high-dimensional shrink age technique. W e utilize an iteratively re-weigh ted least absolute shrink age and selection op erator (lasso) te c hnique. It allows for conditional heterosce- dasticit y , provides fast computing times and guaran tees a parsimonious and regularized sp eciﬁcation, ev en though the parameter space ma y b e v ast. W e are able to show that our approac h pro vides accurate forecasts of wind p o wer at a turbine-sp eciﬁc level for forecast- ing horizons of up to 48 hours (short- to medium-term forecasts). Keywor ds: Renew able Energy, Wind Sp eed, Wind P ow er, Heteroscedasticit y, Sto c hastic Mo deling, Lasso JEL: C13, C32, C53, Q47 Pr eprint submitte d to Applie d Ener gy June 29, 2018 1. In tro duction Wind p ow er is on the verge of b ecoming the most imp ortan t source of electricit y in many coun tries worldwide. Berkhout et al. (2013) argue that wind p o w er is the most emergent renewable pow er source with a gro wth rate of 30% p er y ear. Ho w ev er, the tec hnology still has a few c hallenges to master. In contrast to con ven tional p o w er, wind p o wer pro duction is non- deterministic and highly volatile. T o mak e eﬃcient con tracts at the energy p ools, accurate forecasts of wind p o w er pro duction hav e to b e a v ailable. Lei et al. (2009) as well as Soman et al. (2010) provide a time-scale classiﬁcation of wind pow er and wind speed prediction mo dels. Longer-term forecasts at horizons of tw o da ys up to one week for, e.g., decisions with resp ect to the required energy reserv es and the maintenance scheduling, are based on me- teorological or recen tly developed hybrid structure models. Matc h-making at the energy markets, i.e. trading at the usual da y-ahead mark ets common to most energy p o ols, requires predictions at forecasting horizons of up to 48 hours, at most, dep endent on the designated contract mark et. F orecasts for this medium- to long-term scenario are usually based on stochastic mo deling, on artiﬁcial intelligence mo dels or sp eciﬁc neural net- w orks. F or instance, Cadenas and Rivera (2009), Cao et al. (2012) and Azad et al. (2014) use them. Amjady et al. (2011) use ridgelet neural netw orks whic h p ossess ridge functions as activ ators for their hidden no des to pro vide forecasts of the aggregated wind p o w er output of a wind farm. Bhask ar and Singh (2012) take a statistical approac h which do es not use n umerical w eather predictions. They use a w av elet decomp osition of their wind sp eed time se- ries and an adaptive wa v elet neural net w ork. After transformation, they transfer the wind sp eed predictions by using a feed-forw ard neural net work in to wind p o wer forecasts. Liu et al. (2014) prop ose a hybrid mo del whic h com bines inputs selected by deep quan titative analysis, wa v elet transform, genetic algorithm and supp ort vector machines. Another wa v elet supp ort v ector mac hine approac h is used b y Zeng and Qiao (2012) to p erform wind p o w er predictions. Zhou et al. (2013) apply a probabilistic kernel densit y forecasting mo del with a quan tile-copula estimator to p erform wind p o wer forecasts. They ev aluate the mo del b y using a p o w er system in Illinois and compare several sc heduling strategies. Haque et al. (2014) provide a new hy- brid in telligen t algorithm for wind pow er predictions that uses a com bination of w av elet transform and fuzzy net work methods. F or match-making, stochastic forecasting approaches like the one presented 2 in this paper b eneﬁt from mo deling the persistence of wind p o w er, its p erio dic structure and its direct dependence on wind sp eed. Th us, wind speed itself is usually predicted in an entirely sto c hastic setting, while also, n umerical w eather predictions (NWPs) can b e employ ed, if av ailable. P o werful statis- tical mo dels return reliable forecasts of wind p o wer for short- to medium- term scenarios and are widely established, lik e the Wind P o wer Prediction T ool (WPPT) b y Nielsen et al. (2007), its recent generalization, GWPPT, b y Cro onen bro ec k and Dahl (2014), or the spatial GWPPT b y Cro onen- bro ec k and Am bac h (2015). How ev er, the class of statistical approac hes also incorp orates autoregressive (AR), autoregressiv e mo ving a v erage (ARMA) and autoregressiv e fractionally integrated mo ving a verage (ARFIMA) mo d- els. Ka v asseri and Seetharaman (2009) discuss these mo dels in details. The literature on con tributed mo dels for wind p o w er and wind sp eed forecasts is v ast, Jung and Broadw ater (2014) as w ell as T ascik araoglu and Uzunoglu (2014) pro vide an up-to-date ov erview. Most models ha ve several drawbac ks: One problem is that sto c hastic wind p o wer prediction mo dels require wind sp eed forecasts in the ﬁrst step. Several mo dels for wind sp eed forecasting are a v ailable, as pro vided b y Zh u et al. (2014), Am bach and Sc hmid (2015) or Shukur and Lee (2015). In the second step, these predictions are trans- formed in to forecasts of wind p o wer, as shown by , e.g., Azad et al. (2014). Most of the mo dels do not provide conjoint wind pow er and wind sp eed pre- dictions. Many mo dels, e.g. the aforemen tioned WPPT class mo dels, utilize wind sp eed as a quadratic regressor for wind p o wer, although the theoretical non-linear relationship is usually describ ed b y a cubic function. The reason for this is to b e found in the physical limitation of the turbine, i.e. the upp er b ound of producible wind p o wer. The long memory structure of usual turbine sp eciﬁc wind sp eed and wind p o w er data suggests a diurnal and an ann ual perio dic b eha vior. Sev eral con- tributions illustrate this p eriodic or cyclic behavior, as, e.g., Carap ellucci and Giordano (2013), Silv a et al. (2016), Scholz et al. (2014) and Am bach and Sc hmid (2015). Ambac h (2015) fo cuses on ann ual p eriodic eﬀects. P erio dic eﬀects may change ov er time, whic h is usually not considered. Am- bac h (2015) and Ambac h and Cro onen bro ec k (2015) incorp orate seasonal in teractions in wind speed time series b y annual and diurnal basis functions. Thereb y , they capture the ann ual change of a daily perio d. This eﬀect is ba- sically driven b y the fact that the length of the nights changes ov er the y ear: On the northern hemisphere, there are longer nights during the win ter than during the summer. Indeed, it is observ able that the diurnal p erio dicit y is 3 v arying o v er the y ear. Moreov er, evidence suggests that such p eriodicities are also observ able within the wind p o wer data (see, e.g., Nielsen et al., 2007). Consequen tly , our new forecasting mo del includes all common sto c hastic mo deling features, but in addition, it o vercomes the aforemen tioned dra w- bac ks. The main adv an tage of our approach is related to the fact that we are able to produce wind sp eed and p o w er forecasts at the same time with one mo del. The p eriodic b eha vior of the day and the year is mo deled b y p erio dic B-splines. F urthermore, we capture interaction b et ween b oth seasonalities, as the diurnal impact ma y change ov er the y ear. Thus, we allow for perio dic c hanges in the parameters to capture the seasonal interaction eﬀects. Wind sp eed and wind p o w er sho w a huge amoun t of auto correlation, as sho wn by Am bach and Sc hmid (2015). Hence, w e consider a multiv ariate seasonal V ARMA class mo del to capture the p ersistence as w ell as the p eri- o dicit y . A V ARMA model is also used b y Erdem and Shi (2011) to predict a tuple of wind sp eed time series. In a more general setting, Jeon and T a ylor (2012) tak e a biv ariate V ARMA generalized autoregressive conditional het- eroscedastic (GARCH) approac h to mo del the wind sp eed and wind direction and con vert the predictions of both into wind pow er forecasts. Instead of using wind sp eed as a quadratic regressor as done in the WPPT and GWPPT approac h, we use thresholds and v ector autoregression to co ver the non-linearity . The threshold autoregressiv e approach is also applied in a context of probabilistic load forecasting b y Ziel and Liu (2016) as a suit- able to ol to explore the non-linearity in the data. Here, we use a V ARMA mo del to capture the correlation structure of several turbines and to pre- dict wind speed and wind pow er altogether. Finally , w e prop ose a threshold GAR CH (TGAR CH) mo del for th e wind sp eed series and a p o w er-TGAR CH pro cess for the volatilit y . With it, w e are able to capture the conditionally heteroscedastic b eha vior in the data, similarly to Ewing et al. (2006) and Am bach and Sc hmid (2015). The assumed statistical model structure for the wind sp eed and p o wer allo ws us to sim ulate sample paths for several scenarios. Using bo otstrap sim ulation tec hniques w e can easily deriv e probabilistic forecasts. As p oin ted out by , e.g., Pinson et al. (2013), Alessandrini et al. (2013), and Hong et al. (2016), the imp ortance of probabilistic wind p ow er forecasting is increasing, esp e- cially for longer forecasting horizons. Zugno et al. (2012) use probabilistic forecasts of wind p o w er as w ell, Gneiting and Raftery (2007) pro vide de- tails on the computation and ev aluation of probabilistic forecasts. Recently , Berner et al. (2015) discuss bias correction and accuracy improv emen ts in 4 general probabilistic forecasting. Using a mesoscale meteorology framew ork, they address the main reason for using probabilistic forecasts instead of p oin t forecasts, i.e. “[... to] accoun t for certain asp ects of structural mo del uncer- tain ty”. F or the estimation, we apply a high-dimensional shrink age tec hnique based on the p opular least absolute shrink age and selection op erator (lasso) method, as in tro duced by Tibshirani (1996). Similarly , Ev ans et al. (2014) use the lasso metho d to augment the forecasting accuracy of a wind farm. According to Ziel et al. (2015), we apply an iterativ ely re-w eigh ted lasso approach to es- timate the mo del parameters. Thus, w e can provide a h uge parameter space, still come up with a parsimonious and regularized sp eciﬁcation and ha ve very con venien t computing times in comparison to the usual maximum likelihoo d tec hnique (i.e. few seconds compared to sev eral min utes on a modern com- puter). F or the time v arying and p eriodic eﬀects, the algorithm will estimate parameters that may v ary ov er time at a certain signiﬁcance. Otherwise, the parameters remain constan t. Our time v arying p eriodic TV ARMA-p o wer- TGAR CH mo del returns more accurate forecasts than the usual WPPT and GWPPT mo dels as w ell as a set of b enc hmark mo dels, including the usual p ersistence forecaster. Results show that our mo del provides less sk ewed forecast errors than our b enc hmarks. This pap er mak es t w o ma jor contributions: First, we present a mo deling framew ork for wind p o wer that includes wind sp eed, ﬂexible mo deling of the perio dicit y and heteroscedasticity . Second, w e sho w ho w to estimate the mo del parameters by applying a re-weigh ted heteroscedastic lasso approach to a time series setting, as has b een done recently b y Ziel (2015). Empiri- cal results from out-of-sample forecasts are compared to a set of b enc hmark mo dels. The pap er is structured as follo ws: Section 2 discusses the data set used. In Section 3 w e sho w our new mo del idea. Section 4 presen ts the estima- tion tec hnique. Empirical results are discussed in Section 5 and Section 6 concludes. 2. Data and Their Characteristics The turbine data set used in this paper is a high-frequency series collected from a wind park in German y . The wind park consists of 8 turbines. The observ ed park is situated in a mostly plain and rural region. The area has a sligh t roughness with ﬁelds and some forestation. Due to a non-disclosure 5 agreemen t, the sp eciﬁc lo cations cannot b e revealed. Ho wev er, Figure 1 presen ts a st ylized map of the turbines’ arrangement. The turbines, lab eled T urbine A to H, exhibit a pow er range of [0; 1500] kW eac h and write sensor data to log ﬁles at a frequency of ten minutes. The observ ed time frame spans from No vem ber 1, 2010 to No vem b er 5, 2012, so there are 105984 observ ations p er turbine. Figure 1: St ylized map of the wind parks inv estigated. T able 1 shows descriptiv e statistics for tw o of the turbines in the data set. Note that wind pow er observ ations may very well b e sligh tly b elo w zero: If wind sp eeds are b elo w cut-in sp eed (i.e. there is no h ub rotation and thus, no p o wer production), the turbine consumes p o wer for system op eration and aviation ligh ts. Also, nacelle and rotor pitch adjustmen ts require appreciable amoun ts of electricity and are mostly p erformed during times at which the turbines do not pro duce p o w er themselves. Th us, some p o w er observ ations are b elo w zero. Considering the entire data range (from - 19 to 1542), the part of v alues b elo w zero is only about 1% of the range. Thus, w e do not consider turbine p o wer consumption to b e of m uch imp ortance. The histograms in Figure 2 supp ort that determination. This also holds true for the v ery few observ ations at whic h the theoretical p o wer maxim um of 1500 kW is exceeded. Hence, we stay with the theoretical range assumption of [0; 1500] kW. The data set p ossesses a minor num b er of missing v alues due to engine error, main tenance sh utdown or ice error. Ab out 3% of the data are missing, but 6 the gaps are small: The maximum run length of missing v alues is 586, which is ab out 0 . 007% of the en tire data set. Therefore, we easily ﬁll the gaps b y simple linear in terp olation. A scatter plot (empirical p o wer curve) and time series plots of wind sp eed and wind p o wer are given in Figure 2. It shows that wind sp eed and wind p o w er follow a similar structure and are closely in terdep enden t. Statistic Min Median Max Mean SD Sp eed A 0.4 5.2 18.0 5.1 2.4 P ow er A -19.0 150.0 1532.0 217.2 272.0 Sp eed B 0.4 5.5 18.6 5.3 2.5 P ow er B -19.0 155.0 1493.0 230.5 291.1 T able 1: Descriptive statistics of T urbines A and B. Wind sp eed denoted in m/s, wind p o w er in kW. Besides the high p ersistence of wind sp eed and wind p o w er which is di- rectly related to the high-frequency data set, it is necessary to discuss another imp ortan t c haracteristic of wind p o w er and wind speed. The wind speed data pro vides a strong p eriodic b eha vior, as Zhu et al. (2014) and Am bach and Sc hmid (2015) p oin t out. The wind p o w er data set also pro vides these charac- teristics. A diurnal p erio dicit y as considered for the WPPT and GWPPT is observ able for our data set, but the ann ual p eriod is not completely straigh t forw ard. Hence, w e calculate the sample p erio dogram, which is sho wn in Figure 3. Figure 3 sho ws the estimated sp ectral densit y for the wind pow er and wind sp eed of turbine A. The red lines in Figure 3 show annual and half-annual frequencies in the upp er panels and daily and half-daily p erio ds in the low er panels. The diurnal p eriodicity is not so prev alent within the wind p ow er series shown in the righ t-hand panels, but there are several imp ortan t fre- quencies nearb y the daily p eriod. After all, p eriodic B-spline functions will help to mo del all m ultiples of a diurnal and ann ual p erio dicit y . 3. Mo del Let d = 8 denote the n umber of turbines in the wind park. Th us, the set of turbines is D = { 1 , . . . , d } . The d -dimensional time series of wind sp eed 7 0 5 10 15 Wind speed in m/s 2011 2012 0 500 1000 Wind po wer in kW Wind speed in m/s Density 0 5 10 15 0.00 0.05 0.10 0.15 Wind po wer in kW Density 0 500 1000 1500 0.000 0.004 0.008 0.012 0 5 10 15 0 500 1000 1500 Wind speed in m/s Wind po wer in kW (a) T urbine A. 0 5 10 15 Wind speed in m/s 2011 2012 0 500 1000 Wind po wer in kW Wind speed in m/s Density 0 5 10 15 0.00 0.05 0.10 0.15 Wind po wer in kW Density 0 500 1000 1500 0.000 0.004 0.008 0.012 0 5 10 15 0 500 1000 1500 Wind speed in m/s Wind po wer in kW (b) T urbine B. Figure 2: Time series, histograms of wind sp eed and p o wer and corresp onding empirical p o w er curves of T urbines A and B. is ( W t ) t ∈ Z with W t = ( W 1 ,t , . . . , W d,t ) 0 and the wind p o w er is ( P t ) t ∈ Z with P t = ( P 1 ,t , . . . , P d,t ) 0 . W e split the mo del description in to t wo parts: First, w e presen t the m ul- tiv ariate time v arying threshold V ARMA mo del for the wind sp eed. Its heteroscedastic v ariance structure is mo deled b y a TGAR CH t yp e pro cess. Afterw ard, w e presen t the wind p o wer mo del whic h includes the wind sp eed dep endence and considers the errors of wind p o w er themselv es to follow a p o w er-TGARCH process. 3.1. The Wind Sp e e d Comp onent F or the wind sp eed W t w e consider the m ultiv ariate time v arying threshold- V ARMA mo del 8 0.000 0.005 0.010 0.015 0.020 0 1000 2000 3000 4000 5000 Frequency in h − 1 Spectrum Spectral density Selected frequenc y 0.05 0.10 0.15 0.20 0 100 200 300 400 Frequency in h − 1 Spectrum Spectral density Selected frequenc y (a) Smo othed Periodogram of wind sp eed for T urbine A. 0.000 0.005 0.010 0.015 0.020 0e+00 2e+07 4e+07 6e+07 8e+07 Frequency in h − 1 Spectrum Spectral density Selected frequenc y 0.05 0.10 0.15 0.20 0e+00 1e+06 2e+06 3e+06 4e+06 Frequency in h − 1 Spectrum Spectral density Selected frequenc y (b) Smoothed P erio dogram of wind p o w er for T urbine A. Figure 3: Estimated sp ectral densit y of wind sp eed (left panels) and wind p o w er (right panels). W i,t = φ i, 0 ( t ) + X j ∈D X k ∈ I φ i,j X c ∈ C φ i,j,k φ i,j,k,c ( t ) max { W j,t − k , c } + X j ∈D X k ∈ I θ i,j θ i,j,k ( t ) ε j,t − k + ε i,t , (1) where i ∈ D , φ i,j,k,c resp. θ i,j,k represen t the time v arying autoregressive 9 and mo ving a verage co eﬃcien ts and ε i,t is the error term. The index sets I φ i,j and I θ i,j con tain the corresp onding relev an t AR- and MA-lags and the threshold set C φ i,j,k con tains all considered thresholds in the autoregressive part. The simple c hoice C φ i,j,k = {−∞} would turn the mo del into a standard time v arying V ARMA pro cess. The thresholds describ e the AR-dep endence of wind speed b y a piecewise linear function with breaks at the corresp ond- ing thresholds. Just as eac h smo oth function, it can b e approximated well b y piecewise linear functions, which pro vides a ﬂexible and eﬃcient w ay to capture the non-linear dep endence in the data. W e assume the error pro cess ( ε i,t ) t ∈ Z to b e conditionally heteroscedastic. Therefore, w e consider ε i,t = σ i,t Z i,t , where ( Z i,t ) t ∈ Z is i.i.d. with E ( Z i,t ) = 0 and V ar( Z i,t ) = 1. In detail, w e assume that ε i,t follo ws a time v arying TGAR CH pro cess, suc h that σ i,t = α i, 0 ( t ) + X j ∈D X k ∈ I α i,j α + i,j,k ( t ) ε + j,t − k + α − i,j,k ( t ) ε − j,t − k + X j ∈D X k ∈ I β i,j β i,j,k ( t ) σ j,t − k (2) with index sets I α i,j and I β i,j , ε + j,t − k = max { ε j,t − k , 0 } , ε − j,t − k = max {− ε j,t − k , 0 } and time v arying co eﬃcien ts α i, 0 ( t ) > 0, α + i,j,k ( t ) ≥ 0, α − i,j,k ( t ) ≥ 0, β i,j,k ( t ) ≥ 0. The index sets for the considered lags are giv en in T able 2. F or most of the co eﬃcien ts, we allo w dep endence of up to one hour (6 lags) only to k eep the sp eciﬁcation manageable. Ho wev er, for the co eﬃcien ts that describ e the wind dep endence on its own past, we allo w for more parame- ters. Here, we also include the lags 140 , . . . , 150 to cov er the impact from the previous da y (which corresponds to 6 × 24 = 144 lags). F or the thresholds C φ i,j,k , we use a parsimonious lag sp eciﬁcation: W e allo w non-linear impacts for the ﬁrst tw o lags, only . The elemen ts of C φ i,j,k con tain the 10% p ercen tiles of the pro cess in the mean equation. Thus, C φ i,j,k con tains the 10% p ercen tiles in the cases k = 1 or k = 2. All elemen ts that do not satisfy this restriction are set to C φ i,j,k = {−∞} . The non-linear impact of the threshold mo del sp eciﬁcation acts via piecewise linear functions to cov er p ossibly present turbulent ﬂow and wak e eﬀects. The eﬀect of this mo del comp onen t is explained in detail in the follo wing subsection. T o k eep the parameter space reasonable, w e keep most of the co eﬃcien ts 10 constan t and allow only a few imp ortan t ones to v ary o ver time. F or I φ i,j as w ell as I θ i,i , we consider the co eﬃcien ts for lags 1 and 2 to b e time v arying. Ziel et al. (2015) pro ceed similarly for mo deling the wind and solar p o wer net feed-in. Index sets Con tained lags I φ i,i , I α i,i 1, . . . , 40 and 140, . . . , 150 I φ i,j , I θ i,i , I θ i,j , I α i,j , I β i,i , I β i,j 1, . . . , 6 T able 2: Considered lags of the index sets, where i, j ∈ D with j 6 = i . 3.2. The Wind Power Mo del F or the wind pow er pro cess w e assume a mo del that is giv en b y P i,t = ϕ i, 0 ( t ) + X j ∈D X k ∈ I ϕ i,j X c ∈ C ϕ i,j,k ϕ i,j,k,c ( t ) max { P j,t − k , c } + X j ∈D X k ∈ I ψ i,j X c ∈ C ψ i,j,k ψ i,j,k,c ( t ) max { W j,t − k , c } + X j ∈D X k ∈ I ϑ i,j ϑ i,j,k ( t )  j,t − k + X j ∈D X k ∈ I $ i,j $ i,j,k ( t ) ε j,t − k +  i,t . (3) All observed turbines are lo cated in close proximit y to eac h other, and are inﬂuenced b y the same air pressure and weather conditions. Dep enden t on the angle of mo vemen t of a particular pressure area (and thus, wind conditions) at an y one time, these conditions ma y hit one set of turbines so oner than others. The spatial dispersion of the turbines’ p o wer pro duction can therefore b e accoun ted for b y a time-lag structure. Thus, w e assume that the p o wer P i of turbine i can dep end on its o wn past as well as on the past of the pow er of the other turbines by the ϕ parameters. The pow er can also dep end on the curren t and past wind sp eed by means of the ψ parameters. F urthermore, wind p o wer dep ends on the past residuals. Note that the lag structure for the wind p o wer model is slightly diﬀeren t from that of the wind 11 sp eed mo del, as we assume a causal/temporal structure in the data. W e assume that the wind speed W t at time t can only dep end on the past wind sp eed W t − k for k ≥ 1. Similarly , the wind p o wer P t dep ends on the past wind p o wer P t − k for k ≥ 1, but also on curren t and past wind speed W t − k for k ≥ 0. Comparably to the wind sp eed mo del, w e allow for non-linear wind sp eed and wind p o w er eﬀects by several thresholds. Particularly , the theoretical non-linear eﬀect of the wind sp eed on the wind p o wer is well known to b e describ ed b y the third-degree p olynomial: P = 1 2 ρC P AW 3 , (4) where ρ describ es the air density , C P denotes the physical properties of the turbine (v alues of up to 16/27, the so-called Betz limit), and A represen ts the sw ept area. Hennessey (1977) go es into details. How ev er, esp ecially around the upp er bound of the maxim um pro duced wind p o w er, it is known that the true impact of wind sp eed on wind p o w er is diﬀerent from the usual cubic relationship and should not b e mo deled to b e cubic. Our wa y of mo deling the non-linear impact b y piecewise linear eﬀects allows for a ﬂexible wa y to mo del the underlying non-linear impact. T o illustrate the impact of the thresholds we brieﬂy present a simple threshold mo del for the wind p o wer dep enden t on the wind sp eed. It is given b y P i,t = a + 16 X c =0 b c max { W i,t , c } + e i,t (5) for turbine i . It con tains thresholds at 0 , 1 , 2 , . . . , 16 for modeling the non-linear relationship b y piecewise linear functions. In Figure 4, the ﬁtted v alues of mo del (5) are giv en for T urbines A and B of the in v estigated wind park. It can b e seen that the piecewise linear approac h is able to cov er the non-linear relationship quite w ell. There are distinct bents at the threshold p oin ts. In fact, the ﬁtted curv e is a linear spline. Of course, a higher num b er of thresholds will increase the model ﬁt. Ho wev er, if the n umber of thresh- olds is to o large, it migh t lead to o v erﬁtting. Still, this problem is someho w limited due to our shrink age estimation pro cedure, so that a large num b er of parameters in the problem space do es not necessarily imply a lot of esti- mations in the solution. Instead, the algorithm will automatically select the 12 most plausible piecewise linear function that approximates the non-linear impact w ell, as we c ho ose the thresholds in model (3) to b e data driv en. Similarly as for the wind sp eed pro cess, w e assume a GARCH-t yp e pro cess for the wind p o wer error, so  i,t = ς i,t U i,t with U i,t i.i.d., E ( U i,t ) = 0 and E ( U 2 i,t ) = 1. This slightly diﬀers from the TGAR CH pro cess for the wind sp eed: W e assume that the third-degree relationship in equation (4) hands do wn to the residual volatilit y . Thus, instead of considering a recursion on ς i,t , we consider a recursion on the cub ed ro ot of the volatilit y ς 1 3 i,t . Conse- quen tly , w e assume that  i,t follo ws a time v arying pow er-TGAR CH process: ς 1 3 i,t = η i, 0 ( t ) + X j ∈D X k ∈ I η i,j η + i,j,k ( t ) |  + j,t − k | 1 3 + η − i,j,k ( t ) |  − j,t − k | 1 3 + X j ∈D X k ∈ I ζ i,j ζ i,j,k ( t ) ς 1 3 j,t − k + X j ∈D X k ∈ I υ i,j υ + i,j,k ( t ) | ε + j,t − k | 1 3 + υ − i,j,k ( t ) | ε − j,t − k | 1 3 + X j ∈D X k ∈ I % i,j % i,j,k ( t ) σ 1 3 j,t − k , (6)  + j,t − k = max {  j,t − k , 0 } ,  − j,t − k = max {−  j,t − k , 0 } . Finally , the index sets for the considered lags on the mean part of the mo del in equation (3) and the v ariance part in equation (6) are given in T able 3. The corresp onding parameters for the index sets I α i,j , I β i,j , I ϕ i,j , I ϑ i,j , I η i,j , I υ i,j , I ζ i,j and I % i,j are considered to b e time v arying on lags 1 and 2, those for the sets I ψ i,j and I $ i,j are time v arying on lags 0 , 1 , 2 and the corresp onding regressors are mo deled b y p erio dic B-splines, as discussed subsequen tly . 3.3. Time V arying Co eﬃcients W e assume an iden tical structure of the time v arying co eﬃcien ts in the wind sp eed and in the wind p o w er mo del, as b oth exhibit similar seasonal eﬀects. In general, the time v arying co eﬃcien ts can b e mo deled by perio dic functions lik e F ourier approximations or other perio dic basis functions, such as p erio dic B-splines or p eriodic w av elets. W e opt for the ﬂexible cubic B- spline approac h. Let ξ b e a time v arying co eﬃcien t. Then ξ ( t ) = N ξ X l =1 ξ l B ξ l ( t ) , (7) 13 0 500 1000 1500 W ind speed in m/s W ind power in kW 0 2 4 6 8 10 12 14 16 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● (a) T urbine A. 0 500 1000 1500 W ind speed in m/s W ind power in kW 0 2 4 6 8 10 12 14 16 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● (b) T urbine B. Figure 4: Fitted results of the illustrativ e example of mo del (5). Index sets Con tained lags I ϕ i,i , I η i,i , I υ i,i 1, . . . , 40 and 140, . . . , 150 I ψ i,i 0, . . . , 40 and 140, . . . , 150 I ϑ i,i , I ϕ i,j , I ϑ i,j , I η i,j , I υ i,j , I ζ i,i , I ζ i,j , I % i,i , I % i,j 1, . . . , 6 I $ i,i , I ψ i,j , I $ i,j 0, . . . , 6 T able 3: Considered lags of the index sets, where i, j ∈ D with j 6 = i . so that ξ is given b y a sum of N ξ basis functions B ξ l ( t ), w eighted b y ξ l . 1 In the literature on wind p o wer mo deling, F ourier appro ximations are used frequen tly , see, e.g., Gieb el et al. (2011). Ho wev er, the F ourier technique is a global approach. F or our purp ose, a lo cal approac h is preferable, as it is more ﬂexible with resp ect to p ossible c hanges in the time-dep enden t structure itself. W e design our basis function so that it can cov er b oth the diurnal and the annual perio dic eﬀects. F urthermore, we allo w for p ossible interactions b et w een b oth seasonalities, so that the diurnal impact can change o v er the 1 Details on the construction of the B-splines basis functions are discussed in the ap- p endix. 14 y ear. This impact is visualized in Figure 5. It displays the daily mean wind sp eed and wind p o wer of all considered wind turbines for the four seasons in a year. F or b oth the wind sp eed and the wind p o wer, w e observe that during the morning hours around 7am to 10am, there is a distinct drop, whic h is less severe during the winter mon ths. Moreo ver, it can b e seen that in the summer, this drop happ ens earlier than in the other seasons (from around 6am to 8am). In contrast, during winter time, this drop seems to happ en quite late in the da y (from around 9am to 10am), best visible in Figure 5a. This indicates strong in teraction of the wind sp eed with the sunrise. Over all, the daily mean curves diﬀer signiﬁcantly from eac h other, showing that the diurnal pattern dep ends on the ann ual pattern, and vice v ersa. ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 4.5 5.0 5.5 6.0 Time of the day W ind speed in m/s 00:00 04:00 08:00 12:00 16:00 20:00 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● Dec−Feb Mar−May Jun−Aug Sep−Nov (a) Mean wind sp eed. ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 150 200 250 300 350 Time of the day W ind power in kW 00:00 04:00 08:00 12:00 16:00 20:00 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● Dec−Feb Mar−May Jun−Aug Sep−Nov (b) Mean wind p o wer. Figure 5: Daily mean wind sp eed and wind p o w er of the 8 turbines for the four seasons in a y ear. 4. Estimation As an estimation algorithm, we use a lasso based estimation technique. Lasso is a p enalized least square regression metho d. Th us, w e consider the least squares represen tation of our mo del. F or the conditional mean mo dels (1) and (3), the regression represen tations are given b y W i = W i b W i + E i (8) P i = P i b P i + E i . (9) 15 Here, W i = ( W i, 1 , . . . , W i,n ) 0 and P i = ( P i, 1 , . . . , P i,n ) 0 denote the observ ed wind sp eed and wind p o wer v ectors, W i and P i are the matrices of cov ariates that corresp ond to (1) and (3), b W i and b P i are the full parameter vectors, E i = ( ε i, 1 , . . . , ε i,n ) 0 and E i = (  i, 1 , . . . ,  i,n ) 0 denote the error v ectors and n is the n umber of observ ations. Similarly , w e form ulate a regression represen tation for the volatilit y mo d- els. W e exploit the fact that | ε i,t | = γ i σ i,t + σ i,t ( | Z i,t | − γ i ) and |  i,t | 1 3 = τ i ς 1 3 i,t + ς 1 3 i,t ( | U i,t | 1 3 − τ i ), where γ i = E | Z i,t | and τ i = E | Z i,t | 1 3 . Note that v i,t = σ i,t ( | Z i,t | − γ i ) and u i,t = ς i,t ( | U i,t | − τ i ) are w eak white noise pro cesses. With that w e express a recursion on | ε i,t | and |  i,t | 1 3 b y | ε i,t | = γ i α i, 0 ( t ) + X j ∈D X k ∈ I α i,j γ i α + i,j,k ( t ) ε + j,t − k + γ i α − i,j,k ( t ) ε − j,t − k + X j ∈D X k ∈ I β i,j γ i β i,j,k ( t ) σ j,t − k + v i,t (10) |  i,t | 1 3 = τ i η i, 0 ( t ) + X j ∈D X k ∈ I η i,j τ i η + i,j,k ( t ) |  + j,t − k | 1 3 + η − i,j,k ( t ) |  − j,t − k | 1 3 + X j ∈D X k ∈ I ζ i,j τ i ζ i,j,k ( t ) ς 1 3 j,t − k + X j ∈D X k ∈ I υ i,j τ i υ + i,j,k ( t ) | ε + j,t − k | 1 3 + υ − i,j,k ( t ) | ε − j,t − k | 1 3 + + X j ∈D X k ∈ I % i,j τ i % i,j,k ( t ) σ 1 3 j,t − k + u i,t . (11) The corresp onding m ultiv ariate regression representations are giv en b y |E i | = V i a E i + V i (12) | E i | = U i a E i + U i , (13) where V i and U i are the regressor matrices that corresp ond to (10) and (11), a E i and a E i are the parameter vectors and V i = ( v i, 1 , . . . , v i,n ) and U i = ( u i, 1 , . . . , u i,n ). F or the parameter estimation w e use an estimation tec hnique that is based 16 on a lasso regression for heteroscedastic data. The approac h is similar to the p opular F GLS (feasible generalized least squares) solution by New ey and W est (1987) at which a w eighting matrix brings the “meat” into the estima- tion, leading to p oin t-wise heteroscedasticit y consistency . This lasso metho d w as analyzed ﬁrst b y W agener and Dette (2012) in a standard regression setting and by Ziel (2015) in a time series setting. W e slightly mo dify the algorithm to plug-in the causal setting, i.e. the multiv ariate approach for the wind p o w er mean mo del. Therefore, we apply a w eighted lasso for the condi- tional mean regressions (8) and (9). F or the conditional v ariance regressions (12) and (13), w e just apply a standard lasso. Let Ω i = diag( ω i ) and Ξ i = diag( ξ i ) b e diagonal matrices of heteroscedasti- cit y weigh ts ω i = ( ω i, 1 , . . . , ω i,n ) and ξ i = ( ξ i, 1 , . . . , ξ i,n ). The w eigh ted lasso optimization problems concerning (8), (9), (12) and (13) are given b y b b W i = arg min b ( W i − W i b ) 0 Ω ( W i − W i b ) + λ W i | b | (14) b b P i = arg min b ( P i − P i b ) 0 Ξ ( P i − P i b ) + λ P i | b | (15) b a E i = arg min a ≥ 0 ( |E i | − U i a ) 0 ( |E i | − U i a ) + λ E i | a | (16) b a E i = arg min a ≥ 0 ( | E i | − V i a ) 0 ( | E i | − V i a ) + λ E i | a | , (17) with tuning parameters λ W i , λ P i , λ E i and λ E i . Here, w e only allo w estima- tors b a E i and b a E i with no negativ e en try to ensure that the recurrence equation (and th us, the volatilities σ i,t and ς i,t ) are w ell deﬁned. F or solving the lasso optimization problems we use the co ordinate descent algorithm as in tro duced b y F riedman et al. (2007). This algorithm solves the problem on a given tuning parameter grid. W e c ho ose the tuning parameters b y the minimalist but rather strict Bay esian information criterion (BIC), to a void ov erﬁtting. Other information criteria or a cross-v alidation based ap- proac h can b e also b e applied. In the ﬁrst step, w e estimate the conditional mean parameters b W and b P . Then, we consider the estimated residuals for the estimation of the volatilit y parameters a E and a E . Afterward, we use the ﬁtted volatilities to redeﬁne the heteroscedasticit y matrices Ω i and Ξ i to repeat the pro cedure with the new weigh t matrices for the conditional mean parameters b W and b P . In practice how ev er, there is an initialization problem, as the residuals ε i,t and 17  i,t as w ell as the conditional standard deviations σ i,t and ς i,t are unkno wn. Still, Chen and Chan (2011) discuss a metho d for estimating ARMA models using an iterative lasso approach. Their algorithm basically uses the fact that every ARMA( p , q ) can b e written as an AR( ∞ ). Thus, it can b e ap- pro ximated b y an AR( p ) for large p . This is also applicable for time v arying threshold V ARMA mo dels. Thus, a time v arying threshold V ARMA( p , q ) pro cess is a time v arying threshold AR( ∞ ) process. Similarly , for the volatil- it y model, it holds that every exp onen tiated ARMA mo del can b e expressed as a PGARCH with the corresp onding p o w er. F or a p o wer of t w o, a squared ARMA pro cess is a GAR CH pro cess. So, the same relationship holds: A time v arying p o w er-GARCH( p , q ) pro cess is a time v arying pow er-AR CH( ∞ ) pro- cess. Using these facts w e can handle the initialization problem as follo ws: In the ﬁrst iteration step, we replace all ε i,t ,  i,t , σ i,t and ς i,t b y 1. Thus, in the ﬁrst step, we actually estimate time v arying threshold AR pro cesses in the conditional mean equations and time v arying p o wer-AR CH pro cesses in the second step. Finally , we initialize the heteroscedasticity w eights Ω i and Ξ i . Here, we sim- ply assume homoscedasticit y in the ﬁrst step, so w e take Ω i = Ξ i = I . The algorithm can b e stated as follo ws: 1) Initialize W i , P i , Ω i , Ξ i for all i ∈ D and K = 1. 2) Estimate b b W i and b b P i with (14) and (15) using co ordinate de- scen t with weigh ts Ω i and Ξ i for all i ∈ D . 3) Estimate b a E i and b a E i with (16) and (17) by co ordinate descen t using the estimated residuals ( b ε i, 1 , . . . , b ε i,n ) and ( b  i, 1 , . . . , b  i,n ) from 2) to compute U i and V i for all i ∈ D . 4) Compute the estimated v olatilities ( b σ i, 1 , . . . , b σ i,n ) and ( b ς i, 1 , . . . , b ς i,n ) with the ﬁtted v alues from 3) and redeﬁne W i , P i , Ω i = diag( ω i ) and Ξ i = diag( ξ i ) b y ω i = ( b σ − 2 i, 1 , . . . , b σ − 2 i,n ) and ξ i = ( b ς − 2 i, 1 , . . . , b ς − 2 i,n ) for all i ∈ D . 5) If K < K max then K = K + 1 and back to 2) , otherwise stop the algorithm. W e stop the algorithm after a maximum of K max = 2 iterations, whic h already pro vides a go od ratio of accuracy and computing time. Ziel (2015) 18 sho ws that under some regularity conditions, tw o iterations are suﬃcien t to receiv e optimal asymptotic prop erties. As the considered estimation method- ology is based on the co ordinate descen t algorithm, it shares the same com- putational complexity . With n as n umber of observ ations, d as num b er of turbines and p as dimension of the underlying lasso problem (dimension of parameter v ector in (9)), the asymptotic computational complexity of the al- gorithm is O ( dnp ). Thus, if either d , n or p is doubled, the computation time gets doubled as w ell. Notably , the estimation pro cedure is easily applicable for large wind parks. 5. F orecasting and Results After the estimation, the obtained parameters are ﬁt to the current set of data in order to calculate a forecast. As it is the very nature of the lasso approac h to return a lot of zero v alued parameters, the high-dimensional parameter space is shrunk to a manageable amount of relev ant parameters, conditional on the unique settings of in-sample data at each p oin t forecast. W e ev aluate our model (“lasso”) and sev eral b enc hmark approac hes accord- ing to their forecasting accuracy . The common criterion is the mean absolute error (MAE). The out-of-sample (OOS) forecasts are p erformed for a time frame from Nov em b er 2011 to No v ember 2012. Most benchmark mo dels re- quire appreciable amoun ts of computing time (sev eral minutes p er forecast). T o k eep the time consumption reasonable, we select N = 1000 p oin ts in time ( χ ( l ) , l = 1 , . . . , N ) in the out-of-sample perio d at random. F or the respective in-sample p eriods, w e consider the corresp onding preceding year with 52830 observ ations each. F orecasts are calculated at horizons of up to a maxim um of t wo da ys (i.e. 48 hours = 288 steps). MAE is calculated b y MAE i,k = 1 N N X l =1    P i,χ ( l ) + k − b P i,τ ( l ) + k    , (18) where b P i,χ ( l ) + k is the k -step forecast of wind p o w er and P i,χ ( l ) + k is the cor- resp onding actual observ ation, each at station i . As the results lo ok similar for all of the eight turbines, we just rep ort the mean results ov er all turbines, so w e ev aluate 19 MAE k = 1 d d X i =1 MAE i,k . (19) Results for the distinct turbines are a v ailable upon request. Additionally to the MAE k w e compute the diﬀerence of MAE k to the persistent benchmark mo del, denoted b y DMAE k , i.e. DMAE k = MAE k − MAE pers. k , (20) where MAE pers. k is the MAE k of the p ersisten t forecaster. The p ersisten t forecaster (so-called na ¨ ıv e predictor, ˆ P χ ( l ) + k = P χ ( l ) , see, e.g., Costa et al., 2008) is suitable to illustrate the improv emen t of sophisticated forecasting mo dels in direct comparison to this common quasi-standard benchmark. W e compare our model’s results to further b enc hmarks. W e consider a simple univ ariate AR on the wind p o wer on eac h turbine i ∈ D (AR), a biv ariate V AR on wind p o wer and wind sp eed of each turbine i ∈ D (BV AR), a 2 × d = 16 − dimensional m ultiv ariate V AR, jointly on all wind p o w er and wind sp eed pro cesses (abbr.: V AR), the established WPPT mo del and its recen t generalization, GWPPT. F urthermore, w e ev aluate an ARMA mo del, an artiﬁcial neural net work based approac h (ANN), and a gradien t b oosting mac hine (GBM). The AR-type mo dels (AR, BV AR, V AR) are estimated by solving the sys- tem of Y ule-W alker equations, which guarantees a stationary solution. The corresp onding autoregressive order is chosen by minimizing the Ak aik e in- formation criterion (AIC). Next to AR-type models w e consider a univ ariate ARMA(1,1) pro cess benchmark mo del for eac h turbine. De Giorgi et al. (2011) states that out of all ARMA-t yp e mo dels, the ARMA(1,1) pro cess yield the b est forecasting results for a short forecasting horizon. W e esti- mate the ARMA mo del b y maximizing the Gaussian lik eliho o d. The WPPT is based on a turbine sp eciﬁc dynamic regression approac h. It tak es wind sp eed as a regressor and captures diurnal p eriodicit y by a F ourier series of time of da y observ ations to estimate the parameters of the mo del 20 ˆ P t + k = m + a 1 · P t + a 2 · P t − 1 + b 1 · W t + k | t + b 2 · W 2 t + k | t + d c 1 · cos  2 π d t + k 144  + d c 2 · cos  4 π d t + k 144  + d s 1 · sin  2 π d t + k 144  + d s 2 · sin  4 π d t + k 144  + ε t + k , (21) where W t + k | t is wind sp eed at time t + k giv en at time t , d t is time of da y for observ ation t and ε t + k is assumed white noise. The generalization of WPPT, GWPPT, is mo deled as b oth-sided censored: Eac h wind turbine is manufactured to op erate at a certain range, the so- called p o wer range. GWPPT mak es use of this a-priori kno wn information. The mo del imposes the following structure on wind pow er: P ∗ t = η ( z t ) + ε t , (22) where z t is the v ector of explanatory v ariables, η is a linear function of z t , and ε t is an assumed Gaussian error term. T o comply with WPPT, GWPPT assumes the structure shown in equation (21), but adds wind direction to the sp eciﬁcation. GWPPT imp oses a censored data structure, so that P t =      l , P ∗ t ≤ l P ∗ t , P ∗ t ∈ ( l , u ) u, P ∗ t ≥ u, (23) where l and u are the lo wer and upp er censoring p oin ts. P arameters are estimated using a generalized T obit mo del. In the end, due to assumed Gaussian errors, the forecast is calculated b y ˆ P t + k = (Φ( f 2 ) − Φ( f 1 )) · P ∗ t + k + ( φ ( f 1 ) − φ ( f 2 )) · b σ + u · (1 − Φ( f 2 )) , (24) where f 1 = ( l − P ∗ t + k ) / b σ , f 2 = ( u − P ∗ t + k ) / b σ , and φ ( · ) and Φ( · ) denote normal PDF (Probability Densit y F unction) and CDF (Cumulativ e Distri- bution F unction), respectively . F or the ANN b enc hmark, w e consider a single feed-forw ard neural net work and stick close to the setting as used by Li and Shi (2010). As its inputs, we consider the lagged v alues of the past 8 hours of observ ations. W e train the neural net work on 50 training v ectors and on 4 neurons in the hidden la yer. 21 The last b enc hmark in vestigated is based on gradien t b oosting machines (GBM). Landry et al. (2016) use GBM metho ds successfully for the wind p o w er forecasting track in the Global Energy F orecasting Comp etition 2014. W e train a GBM for each wind turbine with a memory of 5 hours on the full data set. Similarly to Landry et al. (2016), w e choose the shrink age tuning parameter to be 0.05, the in teraction depth to be 5 and the minim um n um b er of observ ations to b e 30. In total, we c ho ose 100 trees, which are suﬃcien t to reac h conv ergence. Figure 6 presents the out-of-sample aggregated forecasting error results. Lo oking at MAE and DMAE, p ersistence is outp erformed b y far b y most mo dels. ARMA and ANN p erform badly . The GBM and the AR t yp e mo d- els p erform b etter, but are still not v ery comp etitiv e. Most of the times, lasso comp etes with WPPT and GWPPT, but sometimes, lasso outruns (G)WPPT, e.g. at forecasting horizons of around 8 hours and ab o ve 32 hours. T able 4 sho ws the results for sev eral selected forecasting horizons (1 step, 6 steps (1 hour), 24 steps (4 hours), 48 steps (8 hours), 72 steps (12 hours), 144 steps (1 day) and 288 steps (2 days)). As can b e seen, lasso is either the b est model or not signiﬁcantly diﬀeren t from the b est model. Note that for longer forecasting horizons (e.g. 24 or 48 hours), the surplus of p oin t forecasting is limited due to the strong amoun t of uncertaint y . How- ev er, the prop osed mo del can b e used for probabilistic forecasting as w ell. Using residuals based b ootstrap as done by , e.g., Ziel and Liu (2016), we can easily sim ulate sample paths for the wind sp eed and p o w er of all turbines in a wind park. W e ev aluate the empirical quan tiles of the b ootstrap samples paths and obtain an estimate for the corresp onding quantile. Exemplarily , Figure 7 sho ws the probabilistic wind sp eed and p o wer forecast for the 99 p ercen tiles for T urbines A and B, starting at F ebruary 25th, 2012, 07:20. The ﬁgure rev eals b oth, the diurnal seasonal pattern as well as heterosce- dasticit y . F or instance, it can b e seen that at a forecasting horizon of 4 as w ell as for 24 + 4 = 28 hours, there are greater forecasting v alues for b oth the wind sp eed and the wind p o wer. Indeed, the observ ations around these p eaks are greater than those in the near pro ximit y . Additionally , these p eaks are rather v olatile, so that the prediction interv als at these p eaks are rela- tiv ely wide, sligh tly wider than in the neigh b oring hours. Overall, we see that the prediction interv als get wider with increasing forecasting horizon as exp ected. In general, they seem to be relativ ely wide. How ev er, we see that in eac h of the four ﬁgures some observ ations fall in to the reddish colored area 22 whic h represen ts large prediction in terv als. Most distinct, in Figure 7d and for large forecasting horizons of more than 40 hours it can b e seen that all observ ations of the wind p o wer of T urbine B fall into the prediction area of v ery small probabilities. Thus, the prediction in terv als do not seem to b e too wide or to o conserv ativ e. Finally , w e inv estigate the OOS errors’ asymmetry b y lo oking at the errors’ densities. F or more lucidity , w e restrain the plots to a few mo dels, lasso, AR and GWPPT. As Figure 8a shows, for the one step ahead forecast, all mo dels return symmetric and leptokurtic results. This symmetry declines for increasing forecasting horizons. F or the 24 steps (4 hours) ahead forecast, AR starts to tend to asymmetry , as Figure 8b sho ws. The a v erage error is negative, whic h represen ts a systematic ov er-estimation of wind p o w er. Cro onen bro ec k and Stadtmann (2015) sho w that this type of bias turns out to b e v ery costly , from a turbine op erator’s p oin t of view. GWPPT and the lasso, how ev er, are still symmetric, mostly . F or even longer forecasting horizons (Figures 8c and 8d show densities for one da y and tw o days ahead forecasting errors), the AR asymmetry b ecomes worse, while GWPPT starts to return asymmetric forecasts as well. Also, the lasso mo del b ecomes asym- metric, but not as strongly as the other mo dels. F rom that w e conclude that using the lasso mo del instead of any of the other mo dels may pro vide not only the most accurate forecasts, but also has the least sev ere impact of asymmetry , whic h is imp ortan t for an y turbine op erator with respect to the ﬁnancial impact of the forecast. 23 50 100 150 200 250 Forecast horizon in hours MAE in m/s 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 persistent lasso AR BV AR V AR ARMA WPPT GWPPT ANN GBM (a) MAE k for the selected p oin t forecasts. −60 −40 −20 0 20 Forecast horizon in hours DMAE in kW 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 persistent lasso AR BV AR V AR ARMA WPPT GWPPT ANN GBM (b) DMAE k (diﬀerence to pe rsistence) for the selected p oin t forecasts. Figure 6: MAE k and DMAE k for all forecasting horizons k , time frame from No vem b er 2011 to No v ember 2012. 24 5 10 15 Forecast horizon in hours W ind speed in m/s 2 6 10 14 18 22 26 30 34 38 42 46 1% 10% 20% 30% 40% 50% 60% 70% 80% 90% 99% (a) Probabilistic wind sp eed forecast of T urbine A. 0 500 1000 1500 Forecast horizon in hours W ind power in kW 2 6 10 14 18 22 26 30 34 38 42 46 1% 10% 20% 30% 40% 50% 60% 70% 80% 90% 99% (b) Probabilistic wind p o w er forecast of T urbine A. 5 10 15 Forecast horizon in hours W ind speed in m/s 2 6 10 14 18 22 26 30 34 38 42 46 1% 10% 20% 30% 40% 50% 60% 70% 80% 90% 99% (c) Probabilistic wind sp eed forecast of T urbine B. 0 500 1000 1500 Forecast horizon in hours W ind power in kW 2 6 10 14 18 22 26 30 34 38 42 46 1% 10% 20% 30% 40% 50% 60% 70% 80% 90% 99% (d) Probabilistic wind p o w er forecast of T urbine B. Figure 7: Probabilistic wind sp eed and p ow er forecast of T urbines A and B from 2012-02-25 07:20 to 2012-02-27 07:10. The blac k lines are the observed v alues, the dashed blue lines give the respective point estimates. 25 1 6 24 48 72 144 288 p ersisten t 47.12(0.61) 92.04(1.16) 141.58(1.59) 173.09(1.97) 186.30(2.19) 216.38(2.42) 242.99(2.59) lasso 43.10 (0.52) 88.59 (1.08) 133.87 (1.48) 157.58 (1.79) 167.20 (1.96) 190.42(2.19) 199.49 (2.12) AR 47.45(0.57) 90.93(1.03) 138.45(1.34) 163.73(1.58) 175.36(1.72) 194.72(1.88) 214.14(1.91) BV AR 47.55(0.58) 89.99(1.03) 138.74(1.36) 164.10(1.58) 177.08(1.72) 197.69(1.89) 217.29(1.94) V AR 44.72(0.50) 89.64(1.03) 138.19(1.35) 165.18(1.59) 176.02(1.69) 197.03(1.87) 217.40(1.93) ARMA 49.61(0.53) 94.41(0.94) 153.39(1.44) 189.23(1.78) 206.41(1.87) 230.39(2.10) 215.40(1.84) WPPT 67.30(0.78) 100.95(1.15) 142.09(1.52) 162.59(1.76) 171.26(1.90) 189.55(2.10) 202.28(2.17) GWPPT 59.72(0.75) 95.64(1.15) 139.26(1.52) 160.66(1.76) 169.49(1.92) 188.77 (2.12) 201.04(2.21) ANN 56.29(0.59) 104.78(1.22) 168.50(1.76) 189.13(2.05) 204.61(2.06) 240.57(2.19) 296.65(2.03) GBM 49.42(0.52) 92.96(0.93) 136.65(1.38) 161.55(1.74) 178.26(1.79) 206.27(2.10) 215.48(2.12) T able 4: MAEs with estimated standard deviations. Best = b old, all within the 2-sigma range are underlined (not signiﬁcan tly worse than the best). 26 −400 −200 0 200 400 600 0.000 0.005 0.010 0.015 0.020 Density lasso AR GWPPT (a) h = 1 −400 −200 0 200 400 600 0.000 0.001 0.002 0.003 0.004 0.005 Density lasso AR GWPPT (b) h = 24 −400 −200 0 200 400 600 0.0000 0.0010 0.0020 0.0030 Density lasso AR GWPPT (c) h = 144 −400 −200 0 200 400 600 0.0000 0.0010 0.0020 0.0030 Density lasso AR GWPPT (d) h = 288 Figure 8: OOS forecasting errors densit y , lasso, AR and GWPPT, h step ahead forecasts, time frame from No vem b er 2011 to No v ember 2012. 6. Conclusion In this pap er w e present a new wind p o wer forecasting approac h that in- corp orates conditional heteroscedasticit y , ﬂexible perio dicit y and non-linearity mo deling and pro vides the imp ortan t wind sp eed forecasts in only one step. As an estimation technique, w e present the re-w eigh ted iterative lasso, whic h consumes little computing time, provides automatic regularization and spar- sit y and does not require a distributional assumption, unlike the usual max- im um likelihoo d estimation. The mo del for wind speed and wind p o wer com bines a m ultiv ariate time v ary- ing TV ARMA pro cess with a p o w er-TGARCH mo del. The mo del returns accurate wind p o wer forecasting results that are comp etitiv e, esp ecially for the medium-term scenario. F urthermore, the mo del allows for probabilistic forecasting. Wind park op erators ma y b eneﬁt from the minor asymmetry of 27 our mo del. While other mo dels tend to o v er-estimate in increasing forecast- ing horizon settings, our mo del remains mostly stable, whic h helps k eeping the asymmetry-induced ﬁnancial loss of forecasts under con trol. F or energy mark ets match-making, ﬁnally , our mo del pro vides not only sup erior accu- racy for the p oin t forecast necessary for b oth sellers and buy ers, but also giv es insigh t into the forecasts’ distribution and th us, the forecasts’ reliabilit y . 28 App endix A. A B-spline basis function of degree H is constructed out of a B-spline basis function e B . e B is deﬁned by the degree H and a set of knots K . The set of knot K con tains H + 1 knots { k 0 , . . . , k H +1 } with k h < k h +1 . This can b e easily deﬁned b y the recurrence relation from (de Bo or, 2001, p. 90): e B ( t ; { k 0 , . . . , k H +1 } , H ) = t − k 0 k H − k 0 e B ( t ; { k 0 , . . . , k H } , H − 1) + t − k 1 k H +1 − k 1 e B ( t ; { k 1 , . . . , k H +1 } , H − 1) (A.1) with initialization e B ( t ; { k l , k l +1 } , 0) = ( 1 , t ∈ [ k l , k l +1 ) 0 , otherwise . W e consider the set of knots K ( T , H ) to b e equidistant with center T . Th us, we ﬁnd k 0 = T − h D +1 2 , k D +1 = T + h D +1 2 and since w e select an o dd degree D , we get k D +1 2 = T , where h is the distance b et ween the knots. Note that H and h deﬁne the knots K uniquely . Finally , w e consider a seasonality S to obtain a p erio dic basis function e B ( t ; K , H ). T o do so, it is suitable to c ho ose h such that S is an in teger m ultiple of h , whic h itself is at least H + 1 to guarantee a partition of the unit y . W e deﬁne e B ∗ 1 ( t ; K , H ) = X k ∈ Z e B ( t − kS ; K , H ) (A.2) as the initial p eriodic basis function. In our setting, the data has tw o seasons, a diurnal and an annual one. 2 As our data frequency is at 10 min utes, w e hav e six observ ations p er hour. Thus, our diurnal seasons are S diurnal = 24 × 6 = 144 and the yearly seasons are S annual = 365 . 24 × 24 × 6 = 2 Wind speed as well as wind p o wer can b e assumed to b e p eriodic for daily and y early patterns. Empirically , this behavior can be shown b y using p erio dograms, i.e. b y analyzing the empirical sp ectral density . 29 52594 . 56. 3 By using the initial perio dic basis function e B ∗ 1 , w e deﬁne the full p eriodic basis by e B ∗ j ( t ; K , H ) = e B ∗ j − 1 ( t − h ; K , H ). In conclusion, the basis B = { e B ∗ 1 , . . . , e B ∗ N B } has a total of N B = S/h basis functions. In our setting, w e choose h diurnal = 12 and h annual = 4. The basis functions e B ∗ l are suitable to capture seasonal c hanges of parame- ters. How ev er, due to the structure of the the basis functions, they mo del the absolute impact ov er time. In practice, it ma y b e b etter to consider the c hanges o ver time instead of the absolute impact, esp ecially if w e use auto- matic shrink age and selection algorithms for estimation, just as w e do. W e can easily mo del the c hanges in the parameters o ver time b y cum ulating the basis functions e B ∗ l within l . Hence, we deﬁne e B ∗ , cum. l = e B ∗ , cum. l − 1 + e B ∗ l (A.3) for l > 1 with e B ∗ , cum. 1 = e B ∗ 1 . W e use the cumulativ e basis functions for the conditional mean mo del (1), for b oth the diurnal and the ann ual basis functions. Also, w e use the non- cum ulative v ersion for the conditional v ariance mo del (2), due to the param- eter constraints in the v ariance mo del. W e discuss this in greater detail in the estimation section. As p oin ted out b y Ziel et al. (2015), there might b e interactions b et ween the seasonal comp onen ts. As the amount of sunshine is changing o ver the year, this might hav e impact on the wind sp eed. Thus, it is p ossible that daily cyclic eﬀects are c hanging ov er the y ear. The simplest approach to mo del these interactions is to consider m ultiplications on the corresp onding basis functions. W e will use this multiplication for the conditional mean mo del, where w e consider the cumulativ e basis functions. The m ultiplications are B cum. l 1 h annual + l 2 ( t ) = e B ∗ , cum. l 1 ( t ; K ( h annual , H ) , H ) × e B ∗ , cum. l 2 ( t ; K ( h diurnal , H ) , H ) (A.4) for l 1 ∈ { 1 , . . . , h diurnal } and l 2 ∈ { 1 , . . . , h annual } in equation (7) and for eac h p erio dic coeﬃcient ξ in (1). 3 Note that an av erage y ear lasts 365.242375 days, which is appro ximated by the leap y ear system every four years. A usual consensus is to approximate this by 365.24 days per y ear. 30 F or the conditional v ariance equation, we do not consider the cum ulative basis function, so here the m ultiplication is B l 1 h annual + l 2 ( t ) = e B ∗ l 1 ( t ; K ( h annual , H ) , H ) × e B ∗ l 2 ( t ; K ( h diurnal , H ) , H ) (A.5) for l 1 ∈ { 1 , . . . , h diurnal } and l 2 ∈ { 1 , . . . , h annual } in equation (7) and for eac h p erio dic coeﬃcient ξ of the conditional v ariance mo del (2). Ho wev er, b y construction of the perio dic basis, P N B l =1 e B ∗ l ( t ) is constan t. Thus, for the time v arying co eﬃcien t ξ of the conditional mean mo del (1), we consider the set of basis functions B cum. ξ = { B cum. l 1 h annual + l 2 | l 1 ∈ { 1 , . . . , h diurnal } , l 2 ∈ { 1 , . . . , h annual }} , (A.6) where the last elemen t is constan t. Note that B cum. ξ has h diurnal × h annual elemen ts, so that in our setting, 12 × 4 = 48 parameters for eac h time v arying co eﬃcien t. F or the conditional v ariance mo del (2), we deﬁne the used set of basis function for a p eriodic parameter ξ by B ξ = { 1 } ∪ { B l 1 h annual + l 2 | l 1 ∈ { 1 , . . . , h diurnal } , l 2 ∈ { 1 , . . . , h annual } , ( l 1 , l 2 ) 6 = (1 , 1) } . (A.7) Th us, w e replace the ﬁrst basis function B 1 b y the constant 1, to mo del the constan t impact directly . 31 References Alessandrini, S., Sperati, S., Pinson, P ., 2013. A comparison b etw een the ecm wf and cosmo ensemble prediction systems applied to short-term wind p o w er forecasting on real data. Applied Energy 107, 271–280. Am bach, D., 2015. Short-term wind sp eed forecasting in germany . Journal of Applied Statistics, 1–19. Am bach, D., Cro onen bro ec k, C., 2015. Space-time short-to medium-term wind sp eed forecasting. Statistical Methods & Applications, 1–16. Am bach, D., Schmid, W., 2015. Periodic and long range dep enden t mo dels for high frequency wind sp eed data. Energy 82, 277–293. Amjady , N., Keynia, F., Zareip our, H., 2011. Short-term wind p o wer fore- casting using ridgelet neural net work. Electric P ow er Systems Researc h 81 (12), 2099–2107. Azad, H. B., Mekhilef, S., Ganapath y , V. G., 2014. Long-term wind sp eed forecasting and general pattern recognition using neural net works. Sustain- able Energy , IEEE T ransactions on 5 (2), 546–553. Berkhout, V., F aulstic h, S., G¨ org, P ., Hahn, B., Linke, K., Neusch¨ afer, M., Pfaﬀel, S., Raﬁk, K., Rohrig, K., Rothk egel, R., Zieße, M., 2013. Wind energy rep ort german y 2013. F raunhofer-Institut f ¨ ur Windenergie und Energiesystem technik-IWES-Kassel. Berner, J., F ossell, K. R., Ha, S.-Y., Hack er, J. P ., Sn yder, C., 2015. In- creasing the skill of probabilistic forecasts: Understanding p erformance impro vemen ts from model-error representations. Monthly W eather Review 143 (4), 1295–1320. Bhask ar, K., Singh, S., 2012. Awnn-assisted wind p o w er forecasting using feed-forw ard neural net work. Sustainable Energy , IEEE T ransactions on 3 (2), 306–315. Cadenas, E., Riv era, W., 2009. Short term wind speed forecasting in la v en ta, oaxaca, m´ exico, using artiﬁcial neural netw orks. Renew able Energy 34 (1), 274–278. 32 Cao, Q., Ewing, B. T., Thompson, M. A., 2012. F orecasting wind speed with recurrent neural netw orks. Europ ean Journal of Op erational Researc h 221 (1), 148–154. Carap ellucci, R., Giordano, L., 2013. The eﬀect of diurnal proﬁle and sea- sonal wind regime on sizing grid-connected and oﬀ-grid wind p o wer plants. Applied Energy 107, 364–376. Chen, K., Chan, K.-S., 2011. Subset arma selection via the adaptive lasso. Statistics and its In terface 4 (2), 197–205. Costa, A., Cresp o, A., Nav arro, J., Lizcano, G., Madsen, H., F eitosa, E., 2008. A review on the young history of the wind pow er short term predic- tion. Renew able and Sustainable Energy Reviews 12 (6), 1725–1744. Cro onen bro ec k, C., Ambac h, D., 2015. Censored spatial wind p o w er predic- tion with random eﬀects. Renewable and Sustainable Energy Reviews 51, 613–622. Cro onen bro ec k, C., Dahl, C. M., 2014. Accurate medium-term wind p o w er forecasting in a censored classiﬁcation framew ork. Energy 73, 221 – 232. Cro onen bro ec k, C., Stadtmann, G., 2015. Minimizing asymmetric loss in medium-term wind p o w er forecasting. Renewable Energy 81, 197–208. de Bo or, C., 2001. A Practical Guide to Splines, revised Edition. Springer, New Y ork. De Giorgi, M. G., Ficarella, A., T aran tino, M., 2011. Error analysis of short term wind p o w er prediction mo dels. Applied Energy 88 (4), 1298–1311. Erdem, E., Shi, J., 2011. Arma based approaches for forecasting the tuple of wind sp eed and direction. Applied Energy 88 (4), 1405–1414. Ev ans, S. C., Zhang, Z., Iyengar, S., Chen, J., Hilton, J., Gregg, P ., Eldridge, D., Jonkhof, M., McCullo c h, C., Shokoohi-Y ekta, M., 2014. T o w ards wind farm p erformance optimization through empirical mo dels. In: Aerospace Conference, 2014 IEEE. IEEE, pp. 1–12. Ewing, B. T., Kruse, J. B., Schroeder, J. L., 2006. Time series analysis of wind sp eed with time-v arying turbulence. Environmetrics 17 (2), 119–127. 33 F riedman, J., Hastie, T., H¨ oﬂing, H., Tibshirani, R., et al., 2007. Path wise co ordinate optimization. The Annals of Applied Statistics 1 (2), 302–332. Gieb el, G., Brownsw ord, R., Kariniotakis, G., Denhard, M., Draxl, C., 2011. The state-of-the-art in short-term prediction of wind p o wer. T ec h. rep., ANEMOS.plus, RisF8 DTU, Wind Energy Division. Gneiting, T., Raftery , A. E., 2007. Strictly prop er scoring rules, prediction, and estimation. Journal of the American Statistical Association 102, 359– 378. Haque, A. U., Nehrir, M. H., Mandal, P ., 2014. A hybrid in telligent mo del for deterministic and quan tile regression approac h for probabilistic wind p o wer forecasting. P ow er Systems, IEEE T ransactions on 29 (4), 1663–1672. Hennessey , J. P ., 1977. Some aspects of wind p o wer statistics. Journal of Applied Meteorology 16 (2), 119–128. Hong, T., Pinson, P ., F an, S., Zareip our, H., T ro ccoli, A., Hyndman, R. J., 2016. Probabilistic energy forecasting: Global energy forecasting comp eti- tion 2014 and b ey ond. In ternational Journal of F orecasting, forthcoming. Jeon, J., T a ylor, J. W., 2012. Using conditional kernel density estimation for wind p o w er densit y forecasting. Journal of the American Statistical Asso ciation 107 (497), 66–79. Jung, J., Broadwater, R. P ., 2014. Current status and future adv ances for wind sp eed and p o w er forecasting. Renewable and Sustainable Energy Re- views 31, 762–777. Ka v asseri, R. G., Seetharaman, K., 2009. Da y-ahead wind sp eed forecasting using f-arima mo dels. Renew able Energy 34 (5), 1388–1393. Landry , M., Erlinger, T. P ., Patsc hk e, D., V arric hio, C., 2016. Probabilistic gradien t b o osting mac hines for gefcom2014 wind forecasting. International Journal of F orecasting, forthcoming. Lei, M., Shiyan, L., Chuan w en, J., Hongling, L., Zhang, Y., 2009. A review on the forecasting of wind sp eed and generated p o w er. Renew able and Sustainable Energy Reviews 13, 915–920. 34 Li, G., Shi, J., 2010. On comparing three artiﬁcial neural net works for wind sp eed forecasting. Applied Energy 87 (7), 2313–2320. Liu, D., Niu, D., W ang, H., F an, L., 2014. Short-term wind sp eed forecasting using w a velet transform and supp ort vector machines optimized by genetic algorithm. Renew able Energy 62, 592–597. New ey , W. K., W est, K. D., 1987. A simple, p ositiv e semi-deﬁnite, het- erosk edasticity and auto correlation consisten t cov ariance matrix. Econo- metrica 55 (3), 703–708. Nielsen, H. A., Pinson, P ., Christiansen, L. E., Nielsen, T. S., Madsen, H., Badger, J., Gieb el, G., Ravn, H. F., 2007. Improv emen t and automation of to ols for short term wind pow er forecasting. T ec h. rep., Scientiﬁc Pro- ceedings of the Europ ean Wind Energy Conference & Exhibition, Milan, Italy . Pinson, P ., et al., 2013. Wind energy: F orecasting challenges for its op era- tional managemen t. Statistical Science 28 (4), 564–585. Sc holz, T., Lop es, V. V., Estanqueiro, A., 2014. A cyclic time-dep enden t mark ov process to mo del daily patterns in wind turbine p o wer production. Energy 67, 557–568. Sh ukur, O. B., Lee, M. H., 2015. Daily wind sp eed forecasting through h ybrid kf-ann mo dels based on arima. Renew able Energy 76, 637–647. Silv a, A. R., Pimen ta, F. M., Assireu, A. T., Spyrides, M. H. C., 2016. Complemen tarity of brazils h ydro and oﬀshore wind p o w er. Renew able and Sustainable Energy Reviews 56, 413–427. Soman, S. S., Zareip our, H., Malik, O., Mandal, P ., 2010. A review of wind p o w er and wind sp eed forecasting metho ds with diﬀerent time horizons. In: North American P ow er Symp osium (NAPS), 2010. IEEE, pp. 1–8. T ascik araoglu, A., Uzunoglu, M., 2014. A review of com bined approaches for prediction of short-term wind sp eed and pow er. Renew able and Sustainable Energy Reviews 34, 243–254. Tibshirani, R., 1996. Regression shrink age and selection via the lasso. Journal of the Ro yal Statistical Society . Series B (Metho dological), 267–288. 35 W agener, J., Dette, H., 2012. Bridge estimators and the adaptive lasso under heteroscedasticit y . Mathematical Metho ds of Statistics 21 (2), 109–126. Zeng, J., Qiao, W., 2012. Short-term wind p o w er prediction using a wa v elet supp ort v ector mac hine. Sustainable Energy , IEEE T ransactions on 3 (2), 255–264. Zhou, Z., Botterud, A., W ang, J., Bessa, R., Kek o, H., Sumaili, J., Miranda, V., 2013. Application of probabilistic wind p o wer forecasting in electricity mark ets. Wind Energy 16 (3), 321–338. Zh u, X., Gen ton, M. G., Gu, Y., Xie, L., 2014. Space-time wind sp eed fore- casting for impro ved pow er system dispatc h. T est 23 (1), 1–25. Ziel, F., 2015. Iterativ ely reweigh ted adaptiv e lasso for conditional het- eroscedastic time series with applications to ar–arc h t yp e pro cesses. Com- putational Statistics & Data Analysis, forthcoming. Ziel, F., Liu, B., 2016. Lasso estimation for gefcom2014 probabilistic electric load forecasting. In ternational Journal of F orecasting, forthcoming. Ziel, F., Steinert, R., Husmann, S., 2015. Eﬃcient modeling and forecasting of electricit y sp ot prices. Energy Economics 47, 98–111. Zugno, M., Jonsson, T., Pinson, P ., 2012. T rading wind energy on the basis of probabilistic forecasts b oth of wind generation and of mark et quan tities. Wind Energy 16 (6), 909–926. 36

Forecasting wind power - Modeling periodic and non-linear effects under conditional heteroscedasticity

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment