Multivariate Dynamic Intensity Peaks-Over-Threshold Models

We propose a multivariate dynamic intensity peaks-over-threshold model to capture extreme events in a multivariate time series of returns. The random occurrence of extreme events exceeding a threshold is modeled by means of a multivariate dynamic intensity model allowing for feedback effects between the individual processes. We propose alternative specifications of the multivariate intensity process using autoregressive conditional intensity and Hawkes-type specifications. Likewise, temporal clustering of the size of exceedances is captured by an autoregressive multiplicative error model based on a generalized Pareto distribution. We allow for spillovers between both the intensity processes and the process of marks. The model is applied to jointly model extreme returns in the daily returns of three major stock indexes. We find strong empirical support for a temporal clustering of both the occurrence of extremes and the size of exceedances. Moreover, significant feedback effects between both types of processes are observed. Backtesting Value-at-Risk (VaR) and Expected Shortfall (ES) forecasts show that the proposed model does not only produce a good in-sample fit but also reliable out-of-sample predictions. We show that the inclusion of temporal clustering of the size of exceedances and feedback with the intensity thereof results in better forecasts of VaR and ES.


Introduction
Financial risk management has become a ubiquitous task for banks, companies, and financial institutions, especially during the last subprime mortgage crisis. The recent global crisis has demonstrated the importance of modeling and forecasting of extreme events and their dynamic behavior during crisis periods. Classical extreme value theory (EVT) constitutes the mathematical and statistical foundation for the description of the distribution of extreme events. Traditional methods to describe the tail of a loss distribution are the Value-at-Risk (VaR) and the Expected Shortfall (ES) (McNeil & Frey 2000, Cotter & Dowd 2006, Chavez-Demoulin et al. 2014). On the other hand, point process methods allow the dynamic behavior of (extreme) events to be captured and are typically applied in the context of portfolio credit risk, market microstructure analysis, international contagion analysis, or jump-diffusion models (Engle & Russell 1998, Bauwens & Hautsch 2006, Errais et al. 2010, Bacry & Muzy 2014, Aït-Sahalia et al. 2015. Moreover, point process theory provides an elegant formulation for the characterization of the limiting distribution of extreme value distributions, 1 and therefore, builds a natural complementary framework to extreme value analysis. In this paper, we aim to bring together both branches of the literature and propose a dynamic multivariate model capturing the occurrence and size of extremes in a multivariate time series. Important features of the proposed framework are to allow for (i) temporal clustering of both the occurrence of extremes and the size thereof, (ii) cross-sectional feedback between individual exceedance intensities, and (iii) feedback between the magnitude of exceedances and their intensity. On the one hand, we introduce an autoregressive conditional intensity peaks-over-threshold (ACI-POT) model, which, in its most basic form, corresponds to the combination of two known models: the ACI model introduced by Russell (1999) and the POT model by Davison & Smith (1990). Moreover, we propose a multivariate extension of a Hawkes-POT model, introduced for the univariate case by Chavez-Demoulin et al. (2005) and recently reviewed in different financial contexts (Chavez-Demoulin & McGill 2012, Herrera & Schipp 2014).
The major advantage of these new approaches is that they can capture the clustering of extreme events -both over time and within a cross-section. Such patterns typically occur in crisis periods and substantially challenge risk management. In addition, this class of processes gen- 1 The original development of this characterization is attributed to Pickands (1971) and Smith (1989). erates a flexible and computationally tractable multivariate dependence structure, properties that are empirically well-documented (Bowsher 2007, Hall & Hautsch 2007, Hautsch 2011, Bacry et al. 2012, Bacry et al. 2013.
A further contribution, from an empirical perspective, is to discuss some stylized facts related to the cluster behavior of extreme events within financial markets. To this end, we consider three well-investigated international stock market indexes: the DAX, the S&P 500, and the FTSE 100.
We show that, by means of the multivariate ACI-POT and Hawkes-POT approaches, we can capture these stylized facts and produce reliable forecasts of VaR and ES.
The remainder of the paper is organized as follows. In Section 2, we discuss some stylized facts that are associated with the cluster behavior of extreme events in financial time series.
Section 3 summarizes the concepts in EVT from the viewpoint of point process theory. Section 4 introduces the ACI-POT and Hawkes-POT models. In Section 5, we illustrate how to apply the proposed models to produce conditional risk measures such as the VaR and ES. Section 6 discusses estimation results and diagnostics which are based on applications of the proposed models to the daily returns of international stock indexes. Section 7 provides VaR and ES insample and out-of-sample backtesting results. Conclusions are rendered in Section 8.

Clustering of Extreme Events
The clustering of extreme events is recognized as prevalent feature in most financial time series.
The tendency for very large movements of prices (exceedances over a sufficiently high threshold) to be clustered through time is one of the major challenges toward obtaining reliable risk measures. A major difficulty is to reliably predict both the size and likelihood of extreme events (e.g., large losses; BCBS 2012). In this section, we highlight some stylized facts and demonstrate the need for approaches capturing both the dynamic and distributional features of extreme events.

Clustering of Extreme Gains vs. Extreme Losses
A well-known observation is that co-movements in international stock market returns are asymmetric. In particular, correlations are higher in market downturns than in upturns, and there is a higher level of clustering for losses than for gains. Numerous studies have examined these stylized facts. For instance, for the Dow Jones Stoxx 600 index, Baur et al. (2012) find that the lower quantiles exhibit positive dependence upon past returns, while the upper quantiles display negative dependence. Tseng & Li (2011) use different assets and show that larger extreme events tend to cluster more than smaller ones. Similarly, large losses tend to congregate together more severely than large gains. Hamidieh et al. (2009) analyze the returns of the S&P 500 index during the period from 1960 to 2007 and show that losses exhibit stronger clustering than gains. Olmo (2005) analyzes the DAX index for the period from 1994 to 2001 and finds a higher level of clustering for large losses than for gains. Jondeau & Rockinger (2003) report evidence of the clustering of extremes for a large number of countries, with differences in the cluster size for positive and negative returns.
In order to illustrate this "stylized" fact, we consider an equal-weighted portfolio based on the DAX, S&P 500, and FTSE 100 indexes from 1992 to 2012. A flexible non-parametric tool for capturing different types of extremal dependence is the extremogram introduced by Davis et al. (2009), which can be considered as an analog of the autocorrelation function for extreme events.
Let X t be a strictly stationary R d -valued time series, the extremogram at lag h is defined by for h = 0, 1, 2, . . ., provided that the limit exists for two sets A and B bounded away from 0. 2 Similarly, we can define the cross-extremogram as which can be straightforwardly extended to higher dimensions. In practice, the limits on x or y in the above equations are replaced by high quantiles of the processes. For all (cross-) extremograms displayed in this paper, we utilize a stationary bootstrap to construct confidence intervals with block sizes given by an independent geometric distribution with mean 250, which closely corresponds to the number of yearly trading days. The sampling distributions of the (cross)-extremogram and confidence intervals are obtained based on 10, 000 bootstrap replications. For a complete discussion and details on the estimation and construc- tion of confidence intervals for extremograms, we refer to Davis et al. (2012). Figure 1 displays extremograms and their corresponding cross-extremograms with x and y being the 91.5% empirical quantiles of the portfolio in both tails 3 . Observe that the (cross-)extremograms decay hyperbolically at the 95% confidence level of independence as lags increase, with losses conditional on gains decaying at the slowest rate. In addition, the extremogram of losses and the cross-extremograms for losses conditional on gains show the most significant dependence on many lags. These results are similar to the findings provided by Tseng & Li (2011), Hamidieh et al. (2009), and Olmo (2005.

Clustering of Extreme Events Across Time
Besides clustering within a time series, we also observe a tendency for clusters of extremes to simultaneously occur across different markets. An obvious reason for this observation is an 3 The justification of the tail threshold selection for all empirical approaches is given in subsection 6.1. increasing market integration, which is most distinctive in the U.S. and Europe.
In Figure 2, we display the time series of 9% of the most negative log returns of the three indexes 4 . We observe a considerable amount of clustering of extremes across the different markets.
In our sample, the first important cluster can be identified during the late 1990s and early 2000s, which is associated with the Asian financial crisis in 1997 and the end of the dot-com crash in October 2002, respectively. The most recent cluster is centered around the 2008 global financial crisis, starting in 2007 with the subprime crisis in the U.S. Moreover, we display trivariate crossextremograms for the analyzed returns. For instance, let X, Y , and Z be the negative log returns of the DAX, S&P 500, and FTSE 100 indexes, respectively. Then, the third panel displays the 4 See subsection 6.1. for threshold selection.

cross-extremogram
with x, y, and z being the 91% empirical quantiles of the negative log returns and A = (1, ∞).
Likewise, in the bottom panel of Figure 2, we depict the cross-extremogram  Poon et al. (2004) find that extreme dependence among these countries is much stronger in bear markets than in bull markets, and that some of this dependence is related to volatility co-clustering. Byström (2004) shows that the 50 most extreme losses for the Swedish index AFF (Affärvärlden's Generalindex) and the Dow Jones Industrial Average index during the period from 1980 to 1999 occur within the same month for half of the extremes, while two-thirds occur within the same quarter.
The dynamics of such co-clustering of extreme events is readily described by a multivariate intensity process. This is what we try to capture by means of the approaches proposed in this paper. For illustration, the second panel of Figure 2 displays the conditional intensities for the occurrence of extreme events based on a univariate ACI-POT model, corresponding to a special case of the model discussed in Section 4.1.

Autocorrelations in Inter-Exceedance Times
The inter-exceedance time is commonly defined as the time interval between consecutive extreme price events. In this vein, times between price events have been used as a proxy for volatility estimation on the basis of price intensities in high frequency data analysis (Engle 2000, Gerhard & Hautsch 2002. Classical EVT assumes independent and identically distributed (IID) observations. According to this assumption, the exceedances over a high threshold should behave as a Poisson point process, implying that inter-exceedance times should be exponentially distributed. Empirical evidence, however, clearly contradicts this assumption, making the direct use of this approach

A Point Process Approach to EVT
Consider the negative returns of a given stock {Z t } t≥1 and suppose, for the moment, that all of the observations are IID and have a common distribution function F . To characterize the behavior of the maxima M n = max {Z 1 , . . . , Z n }, classical EVT yields that, for given normalizing sequences a n > 0, b n ∈ R and n → ∞, the limiting distribution of M n , converges in distribution to the Generalized Extreme Value (GEV) distribution function where µ, ξ ∈ R, and σ > 0, corresponding to location, shape, and scale parameters, respectively. 5 The convergence of (1) holds if and only if The time t j and magnitude of an extreme event Y j = Z t j (the mark) exceeding a given Under the assumption that the losses are IID and stationary, Pickands (1971) shows that the two-dimensional point process defined in a sub-region A = (0, t] × (y, ∞) can be characterized by the counting process N (A) = j≥1 1 {t j ≤ t, Y j = y} corresponding to a non-homogeneous 5 We define a + = max (a, 0).

Poisson process with intensity function
where µ, ξ ∈ R, and σ > 0 are the same parameters as those determining the GEV distribution function. Consequently, for a set B = (t 1 , t 2 ) × (y, ∞), with B ⊆ Ω, the intensity measure, or alternatively, the mean of the point process in B is given by The intensity of the two-dimensional point process (2) can be rewritten in terms of a socalled marked point process (MPP). In the given framework, the arrival times corresponding to the time when a return exceeds a threshold u > 0 are driven by the so-called ground (intensity) process, while the marks correspond to the magnitude of the losses. Formally, an MPP is defined through the right-continuous counting function The internal history (natural filtration) of this process is denoted by Following Daley & Vere-Jones (2003), 6 the intensity of a MPP can be described as where λ g (t | H t ) corresponds to the intensity of the ground process N g (t) = j≥1 1 {t j ≤ t}, and g (y | H t , t) corresponds to the density function of the marks, conditional on the history of the process and the time t of the last event. Rewriting the intensity (2) in terms of an MPP, the ground process corresponds to which is the rate of a Poisson point process of exceedances above the threshold u. 7 On the other hand, according to the Pickands-Balkema-de Haan theorem, the density of IID marks is well-approximated by a generalized Pareto density function where β = σ + ξ (u − µ) is a reparametrized scale parameter. This representation is valid in the case of IID observations, which, however, is not empirically supported. In the next section, we introduce a dynamic generalization of this framework.

Multivariate Dynamic Intensity Peaks-Over-Threshold Models (MDI-POT)
We consider an of random variables in a set B defined on a probability space (Ω, H, P). In this framework, t denotes the (pooled) calendar time, and t m j corresponds to the inter-exceedance time of process m, with N g m (t) being its marginal ground process. Moreover, each m-th component of the MPP is linked to an exceedance Y m j constituting the mark. The MDI-POT is then generally specified via an M −variate vector of conditional : t m j < t denotes the complete internal history.

The Multivariate ACI-POT Model
In the case of the ACI-POT model, the conditional intensity of the ground process is driven by three main components depending on the history H t : (i) a left-continuous dynamic process Φ j that is updated instantaneously after the occurrence of t j−1 and does not change until t j ; (ii) the "standardized" excess, y m j−1 = y m j−1 /u m , capturing the influence of the size of extreme events on the conditional intensity by means of the parameter δ m ; and (iii) λ m 0 (t) = λ m 0 (x(t)), corresponding to a baseline intensity that changes continuously in terms of its own inter-exceedance Using the standardized excess y m j−1 ∈ (1, ∞] instead of the "raw" excess r m j−1 = y m j−1 − u m in the ground process is advantageous to avoid numerical instabilities of estimates in the case of very high exceedances. Such standardizations are commonly used in EVT (see, e.g., Resnick 2006). Accordingly, λ m g (t | H t ) is given by where δ m captures the effect of the size of the loss exceedance on the intensity. As proposed by Russell (1999), we specify the M × 1 vector where z m j denotes an indicator variable that takes on the value one if the j-th event of the pooled process is of type m, and zero otherwise. A m = {a m } is an M × 1 coefficient vector denoting the impact of the innovation ε j on the ground intensity of the M -variate processes when the previous extreme event was of type m, and B = {b mk } corresponds to an M × M coefficient matrix of persistence parameters. In addition, the innovation term ε j is an IID exponential random vector based on the integrated intensity, which is computed by piecewise integration where ds is the m-type integrated intensity. This allows the conditional intensity function to vary between extreme event arrivals. Finally, the baseline intensity function λ m 0 (t) is specified in the form of an appropriate hazard function. In this paper, we utilize the generalized gamma distribution with hazard function given by with q m = 0, and σ m , υ m > 0. If η m ∼ Gamma(q −2 m , 1), and ̟ m = ln (q 2 m η m ) /q m , then x m (t) = exp (υ m + σ m ̟ m ) follows a generalized gamma distribution (Prentice 1974). 8 This type of hazard function is commonly used in the empirical literature since it exhibits both monotonic and non-monotic behavior.
To capture the dynamics in the magnitudes of exceedances, we propose a conditional autoregressive specification for the mark density. Let ψ m N g , with ξ m ∈ R + being the shape parameter of the GPD, and ρ m , β m , γ m are parameters. The logarithmic specification ensures the non-negativity of the process without explicitly imposing corresponding parameter restrictions. The error terms ǫ m N g m (t) are IID generalized Pareto random variables with a probability density function given by The parameter γ m captures the effect of the most recently elapsed inter-exceedance waiting time on the size of the extreme event. The covariance stationarity of ln r m N g Under the proposed MEM specification, the conditional density of the raw exceedance r m is therefore given by which corresponds to a generalized Pareto density with time-varying scale parameter exp ϕ m N g m (t) . Finally, imposing specifications (5) and (8) into (3) yields the multivariate ACI-POT model with the m-th component given by The stationarity of the ACI-POT model is ensured if the eigenvalues of the persistence matrix B in (6) lie inside the unit circle, which is equivalent to the spectral radius of the persistence matrix being less than one (i.e., Spr (B) = max {|ϕ| : det (B − ϕI) = 0} < 1), where ϕ i are the eigenvalues of B (see Proposition 2 in Russell 1999). Observing the process over the time interval (0, T ], the resulting log-likelihood function is given by with θ 1 and θ 2 denoting the corresponding parameter vectors.

The Multivariate Hawkes-POT Model
A Hawkes process is a self-exciting point process as originally introduced by Hawkes (1971).
This class of point processes has wide applications in many different fields, primarily in seismology (Hawkes & Oakes 1974, Ogata 1988, and more recently in finance (Bowsher 2007, Dassios et al. 2011, Embrechts et al. 2011, Bacry et al. 2012, Bacry et al. 2013). In the context of EVT, a univariate Hawkes-POT process was introduced by Chavez-Demoulin et al. (2005), and more recently reviewed in Chavez-Demoulin & McGill (2012). In this paper we closely follow the representation of a Hawkes process given by Embrechts et al. (2011).
According to a Hawkes process, the ground process for the m-th component is given by where µ m > 0 corresponds to the immigrant rate, or baseline intensity, h : R → R + is a decay kernel describing the instantaneous influence of the k-th component, and how this deviates from the baseline µ m through time. Finally, the parameters b mk > 0 are coefficients defining the Similar to the ACI-POT model, the stationarity of the process is ensured if the spectral radius of B is strictly less than one. We assume the decay kernel function corresponding to the product of two exponential functions: one puts exponential weights on the time elapsed since the last event. The other scales the kernel by the size of the standardized with a k > 0 and δ k ∈ R. Observe that the impact of spillovers between the individual processes are captured by the parameters b mk .
As in the ACI-POT model, we specify the size of the exceedances for each m-th component based on an MEM model according to (8). This yields the multivariate Hawkes-POT model given The log-likelihood is then given by where θ 1 and θ 1 denote the corresponding parameter vectors. Note that the parameters associated with the Hawkes and ACI specifications for the intensities and marks are disjointed, which allows us to estimate them separately.

Improving Conditional Risk Measures
The Basel Committee on Banking Supervision has proposed using ES instead of VaR as an internal model-based approach for regulatory market risk capital mainly because of the inability of VaR to capture tail risk (BCBS 2012, BCBS 2013. Gneiting (2011), however, demonstrates that, although ES is a coherent risk measure (Artzner et al. 1999) and is able to capture tail risk, it does not satisfy the requirement of elicitability (i.e., it cannot be straightforwardly backtested).

The virtues and limitations of both risk measures have forced regulators and practitioners to
adopt only one of them, and therefore, the features either of coherence or elicitability. Here, we illustrate how to derive both risk measures based on the proposed MDI-POT specifications.
Consider all losses {Y t } t≥1 defined as the negative log returns of a particular asset with underlying cumulative distribution function F . For ease of exposition, we omit the superscript m.
ES is estimated by first obtaining the VaR at confidence level α, which is equivalent to estimating the predictive distribution (F Y t+1 |Ht (y t+1 α )) for the returns over the next period, By computing the conditional survival function F Y t+1 |Ht (y) = 1 − F Y t+1 |Ht (y) as where the last result is obtained by using the asymptotic identity ln (x) ≈ x − 1 as x → 1. The conditional probability of exceedances is then computed as where G ξ,exp(ϕ N g (t)) denotes the conditional generalized Pareto survival function. Finally, the VaR is defined in terms of the quantile y t+1 From this result, the associated conditional ES, corresponding to the conditional distribution of extreme events above the VaR, given H t , is computed as Note that with the limit not depending on time. Recently, the Basel Committee (BCBS 2013) proposed using the VaR at the 0.99 confidence level in internal model-based approaches with ES evaluated at the 0.975 confidence level. According to the Basel Committee, ES is less sensitive to extreme events than VaR, and therefore, should account for the tail risk in a more comprehensive form.
We analyze this proposition in the next section.  (15) for losses (gains). The gray rectangle displays the subset that seems to be the most stable for a given shape parameter (x-axis) and different tuning parameters β (y-axis).
and gains of an equally-weighted portfolio based on the three indexes. The second application considers a trivariate model to jointly model negative log returns of the three indexes.
In order to determine the tail threshold u, we follow the statistic proposed by Reiss & Thomas (2007) to determine the number of exceedances k by arg min whereξ i is the estimate of the shape parameter for the sample fraction of extremes above the upper order statistic i, and β ∈ [0, 0.5] is a tuning parameter. The idea is to find the sample proportion for which the distribution of the shape parameters is stable. According to Figure 4, a proportion between 395 and 445 observations for gains and losses seems to be a satisfactory size.
We thus choose to work with 420 observations, corresponding to 8.5% of the most extreme events for losses and gains. For the trivariate case, we determine a threshold of 9% to be a reasonable choice.  The overall best fit is achieved for an ACI-POT model. We find evidence for spillover effects between positive and negative extreme observations, as captured by the persistence matrix B. Note that the persistent coefficient associated with negative extreme events (b 22 = 0.652)

Modeling Extreme Gains and Losses
is larger than that for positive extreme events (b 11 = 0.521), indicating that the extent of clustering of extremes tends to be larger for extreme losses than for extreme gains. Moreover, the off-diagonal persistence coefficients reveal that negative extreme events more frequently cause positive extreme events than vice versa (b 12 > b 21 ). Finally, we find evidence for spillovers in the innovations, as reflected by the large a m coefficients.
Similarly, the estimates of the Hawkes-POT model reveal evidence for spill-over effects between negative and positive returns and clustering of extreme events. The estimated persistence matrix B indicates that negative extreme events are more likely to be followed by another negative event (b 12 = 0.274) than by a positive extreme event (b 12 = 0.205).
Overall, our estimates strongly support an obvious asymmetry between positive and negative extreme returns, as discussed in Section 2. This is particularly true for the coefficient δ m , which captures the influence of the marks on the intensities. Observe that this coefficient is negative for gains and positive for losses, indicating that negative extreme events tend to increase the probability of observing further extreme negative events, while positive extreme events tend to decrease the probability of seeing further positive shocks. As a result of this self-enforcing behavior, negative returns are much more clustered than positive ones. This is in line with the predictive asymmetry hypothesis by Campbell & Hentschel (1992), which suggests that volatility is higher after stock markets exhibit losses, making stock market returns negatively correlated with future volatility.
According to the parameter estimates, the baseline hazard functions of the inter-exceedance times of both return processes reveal an inverted U-shape with the underlying densities being positively skewed as q m < 0. As shown by Figure 6, we observe that for negative returns, the baseline function λ 0 decreases more slowly than for positive returns. Consequently, the temporal clustering of negative extremes is clearly higher than for positive ones. Recall that the dynamic process Φ j , and correspondingly, the size of the last extreme observation accelerate the conditional intensity function and its implied risk measures, but does not affect the shape of the  Engle & Russell's (1998) test for excess dispersion uses the statistic n ε /8σ 2 ε , where n ε corresponds to the number of residuals andσ ε is the empirical standard deviation of the residuals series. Under correct model specifications, the test statistic is asymptotically normally distributed.
We observe that the residuals are, on average, close to zero, with standard deviations not far from unity. Moreover, according to Ljung-Box statistics, the assumption of independence cannot be rejected, indicating that the model is able to capture the dynamics of the data fairly well.
The test of excess dispersion, however, reveals slight evidence for over-dispersion, indicating that both approaches are still not sufficiently flexible to capture the distributional properties of inter-exceedance times. Table 2 gives the estimation results based on trivariate MDI-POT models for extremes in DAX, S&P 500, and FTSE 100 returns. In order to benchmark this approach, Table 3 reports restricted versions of the proposed ACI-POT and Hawkes-POT models, in particular, MDI-POT models without incorporating cross-sectional feedback between individual exceedance intensities and without feedback between the magnitude of exceedances and their conditional ground intensities.

Modeling Cross-Sectional Spillovers in Extremes
Subsequently, we associate these specifications with "restricted" models.
The main result is that the unrestricted MDI-POT approaches exhibit a better fit in terms of the AIC and residual diagnostics. In particular, the mutual excitation mechanism of these models is what determines the rate of occurrence of future extreme events, matching the stylized fact that clusters of extreme events in financial markets are produced in part by contagion effects. Hence, it turns out that allowing for both clustering in extremes and for feedback between the size and the intensity of extremes is statistically supported.
The best-fitting specification is the unrestricted ACI-POT model. It turns out that the flexible parametrization of the ground process in the ACI-POT specification compared to the Hawkes-POT specification significantly improves its explanatory power, capturing auto-correlation structures among the inter-exceedance times in a more efficient form. Indeed, residuals of the ground process for the ACI-POT model have mean and standard deviation closer to zero and one, respectively, than the residuals of the Hawkes-POT model. The Ljung Box statistics report that the residuals resulting from both approaches do not exhibit remaining serial dependence.
In the unrestricted ACI-POT approach, the ground process is fairly persistent, with relatively low innovation coefficients and relatively high persistence parameters. All persistence parameters are significant and support the stationarity of the underlying process. Similarly to the bivariate case, the baseline functions in the trivariate case reveal a non-monotonic and inverted U-shaped pattern. As illustrated in Figure 7, the baseline hazard functions show an increasing hazard rate until three to four days after the last extreme event occurs, while it declines thereafter. This nonmonotonic pattern seems to be an important feature characterizing the time evolution of extreme events on financial markets and requires flexibility of the underlying parameterization. Observing peaks of the baseline functions around the three-to four-day mark, moreover, reflects underlying temporal clustering, making it more likely to observe a further extreme price movement just after a previous one than after a long period without exceedances. In addition, we find that the influence of the exceedance size on the conditional intensity of the ground process, as captured by the coefficient δ m , is significant in both specifications for the FTSE 100 and DAX log returns.
Hence, extreme events in one series increase the conditional intensity for the next extreme event in the same series, but also in the other series. This result is in line with the estimates of the trivariate sample extremograms in Figure 2 and with previous studies of extreme dependence in international stock markets, where extreme losses tend to affect several stock markets at the same time, creating co-movements and a stronger dependence among their conditional intensities (Poon et al. 2003, Baltzer et al. 2008).
In the dynamic specification for the marks, we find a strong persistence with coefficients β m < 1. Hence, exceedance sizes are clearly autocorrelated and, moreover, negatively depend on the length of past inter-exceedance waiting times, as reflected by the coefficient γ m . As in the bivariate model above, residual diagnostics reveal that both MDI-POT specifications capture the distributional and dynamic features in the underlying series sufficiently well.

MDI-POT-Based VaR and ES Forecasting
An important advantage of VaR-based risk assessments is the possibility of backtesting. Conversely, there is no consensus on how to backtest ES. Emmer et al. (2013) propose a framework to backtest ES based on a representation in terms of the integrated VaR, This allows making use of backtesting techniques developed for VaR. In particular, if each of these confidence levels are successfully backtested, then, to a certain degree, the same is true for ES t+1 α . In order to test the accuracy of the VaR estimates, we utilize a battery of tests proposed in the literature, which are described in detail in Appendix A. The first three tests are based on a binomial type test introduced by Christoffersen (1998): an unconditional coverage test (LR uc ), evaluating the expected fraction of exceptions (i.e., exceedances of the VaR), a test for the independence of exceptions (LR ind ), and a conditional coverage test (LR cc ), which is a combination of the latter two. Moreover, we implement the dynamic quantile tests proposed by Engle & Manganelli (2004) which rely on linear regressions. The first is the dynamic quantile hit test (DQ hit ), where de-meaned exceptions are regressed on their lags, while the second one, the dynamic quantile VaR (DQ V aR ) test, uses in addition the contemporaneous VaR estimates. Finally, we implement a loss measure V ES which evaluates the potential loss between the forecasted ES ( ES t α ) and the observed return (R t ) at time t, given that this return has exceeded the actual VaR An accurate estimate of ES should result in a low absolute value of this quantity. However, its weakness is that it depends on the accuracy of the preliminary VaR estimation, since only returns below the VaR are taken into account (Embrechts et al. 2005). For instance, if the VaR estimates of a MDI-POT model do not generate any exceedances, this measure cannot be evaluated.

Accuracy of MDI-POT-Based Risk Forecasts
In order to assess the accuracy of the proposed approaches for the estimation and prediction of VaR and ES at different confidence levels, we estimate all models using the sample from January 2, 1992 to December 31, 2012. The estimated parameters are then used to compute one-day-  The Basel Committee (BCBS 2013) recommends changing the risk-based capital framework building on 99%-VaR to 97.5%-ES. These confidence levels are used in our empirical analysis. in order to make use of the integral representation (16) enabling us to backtest ES at the 97.5% level. For comparison purposes, we also report the VaR at the 0.99 confidence level.
According to the VaR accuracy, as reflected by p-values greater than 0.05 in Table 4, the specifications providing the most accurate in-sample fit are the proposed models which explicitly include mutual interactions between the point processes and the processes of exceedances.
Indeed, the unrestricted MDI-POT models pass the tests 18% more often than the others. Comparing the predictive performance of the individual specifications in the estimation sample, the Hawkes-POT model reveals a slightly better performance than its ACI-POT counterpart in terms of the V ES statistic. On the other hand, the ACI-POT model outperforms the Hawkes-POT model in terms of the LR ind and DQ tests of independence.
Likewise, for the backtesting period, the unrestricted specifications yield correct estimates for most of the confidence levels, with 99% of p-values greater than 0.05 for the accuracy tests. In contrast, this percentage is reduced to 89% for the restricted approaches. It is remarkable that the unrestricted ACI-POT model is the only specification implying that all p-values are greater than 0.05 for the entire set of tests. The unrestricted Hawkes-POT approach slightly overestimates one of the VaR confidence levels for DAX returns. These results suggest that Hawkes-POT models might have tendency to overfit in-sample but slightly underperform out-of-sample. Therefore, working with ACI-POT models is slightly preferable in a backtesting context. In fact, the Hawkes-POT model tends to produce estimates for the VaR and ES slightly higher than the ACI-POT models, which is also reflected in the V ES measure. Hence, overall we can conclude that unrestricted MDI-POT models (i.e., allowing for cross-excitements among markets) results in significant improvements in VaR and ES forecasting. In terms of VaR accuracy tests, both the ACI-POT and Hawkes-POT approach perform similarly well at all confidence levels, with the ACI-POT model being slightly superior. Finally, as an additional evaluation metric, we analyze the difference between the predicted

Conclusions
We propose a multivariate dynamic intensity framework to jointly model the occurrence of extreme observations (exceeding a certain threshold) in a multivariate time series of log returns.
The event arrival is modeled as a MPP where the marks are associated with the magnitude of (loss) exceedances. The major feature of these models is to allow for the clustering of the arrival of extremes over both time and the cross-section and the clustering of the size of exceedances. This is achieved by combining a multivariate dynamic intensity process (ACI process or Hawkes process) with a multiplicative error model based on a generalized Pareto distribution for the magnitude of exceedances. Both components are linked to allow for feedback effects between the arrival intensity of extremes and the size of exceedances above the threshold.
Empirical evidence based on the return series of the DAX, S&P 500, and FTSE 100 indexes provides strong support for the models. We find significant evidence for (co-)cluster structures in extreme stock market losses, which are well-captured by the proposed approach period. Furthermore, we demonstrate that the new models yield a good out-of-sample backtesting performance when they are applied to the prediction of VaR and ES.
We see it as a major advantage of the proposed framework that it can be easily extended 11 Note that V aR0.975 V aR0.99 ≈ 0.4 ξ and from (14) we know that ES0.975 V aR0.975 ≈ 1 1−ξ .
in various directions and -depending on the chosen specification -is also tractable in higher dimensions. Consequently, it might be used as a valuable framework to analyze, for instance, systemic risk or tail dependencies.

A Backtesting Tests
The first three tests utilized in this paper are introduced by Christoffersen (1998). The first corresponds to a test of unconditional coverage (LR uc ), which evaluates the expected fraction of exceptions I t = I (r t < −V aR t α ), with I denoting the indicator function. We test the null hypothesis that I t | H t ∼ Bernoulli (α), against the alternative that I t | H t ∼ Bernoulli (α) (i.e., H 0 : α =α). This can be tested by means of a likelihood-ratio test LR uc = 2 [L (α; I 1 , . . . , I t ) − L (α; I 1 , . . . , where L is the binomial log likelihood. The maximum likelihood estimatorα is the ratio of the number of violations to the total number of observations. This test implicitly assumes that the exceptions are independent, an assumption which is tested in a second test. Here, we assume that the exceptions I t follow a binary first order Markov chain with transition probability matrix 1 − π 11 π 11   , π ij = P (I t = j | I t−1 = i) .

11
(1 − π) n 00 +n 10 π n 01 +n 11 ∼ χ 2 1 , with n ij being the number of observations of an event i on day t − 1 following an event j on day t. The maximum-likelihood estimators under the alternative hypothesis arê π 01 = n 01 n 00 + n 01 andπ 11 = n 11 n 10 + n 11 . Christoffersen (1998) suggests to simultaneously test for the correct unconditional coverage and independence yielding a test for correct conditional coverage In a more general framework, Engle & Manganelli (2004) introduce a dynamic quantile test (DQ) to evaluate different types of dependence. Define Hit t (α) = I t − α as the de-meaned exceptions. Jointly testing the hypothesis that E [Hit t (α)] = 0 and that Hit t (α) is uncorrelated with variables included in some information set X, can be done using the artificial regression, where u is an IID mean zeo random variable and X contains, for example, lags of Hit t , VaR estimates, etc. Under the null hypothesis H 0 : β = 0, the regressors should have no explanatory power,i.e., the regressors are not correlated with the dependent variables. The test statistic is given by where p is the number of explanatory variables X. In the empirical application, we use the dynamic quantile hit (DQ hit ) test, where the regressors contain a constant and the lagged Hit variable, and the dynamic quantile VaR (DQ V aR ) test, utilizing in addition the contemporaneous VaR estimates.
Finally, we employ the measure V ES which evaluates the difference between the predicted ES (ÊS t α ) and the observed return (R t ) at time t, given that this return has exceeded the actual VaR, i.e., This statistic is close to zero if the model is appropriate (Embrechts et al. 2005). However, its weakness is that it depends on the accuracy of the VaR estimates, since only returns below the VaR are taken into account.   Table 3: Estimates of restricted trivariate MDI-POT models for negative log returns of the FTSE 100, DAX and S&P 500 indexes from January 2, 1992 to December 31, 2012. Standard errors are in parentheses. LL 1 corresponds to the log-likelihood of the ACI or Hawkes part, while LL 2 is associated with the POT part. The Akaike Information Criterion for the AIC-POT model is −1371 and for the Hawkes-POT −1248. Spr : Spectral radius of the persistence matrix. Mean (ε m ): mean of residuals,σ ε : standard deviation of the residuals, LB ε (5): Ljung-Box statistic based on 5 lags, Exc. disp.: excess dispersion test according to Engle & Russell (1998)