Second-order estimation procedures for complete and incomplete heavy-tailed data
This thesis investigates the second-order refined peaks over threshold model called the Extended Pareto Distribution (EPD), introduced by Beirlant et al. (2009). Focus is placed on estimation of the Extreme Value Index (EVI). First, we investigate the effectiveness of the EPD in modelling heavy-tailed distributions and compare it to the Generalized Pareto Distribution (GPD) in terms of the bias, mean squared error and variance of the EVI estimates. This is done through a simulation study, and the Maximum Likelihood (ML) method of estimation is used to make the comparison. In practice, data can be altered by some arbitrary process or by the study design. We therefore investigate the performance of the EPD in estimating the EVI for heavy-tailed data under three assumptions in turn: that the data is completely observable and uncontaminated, randomly right-censored, and contaminated. We suggest an improved ML numerical procedure for the estimation of EPD parameters under the assumption that the data is completely observable and uncontaminated. We further propose a Bayesian EPD estimator of the EVI and show through a simulation study that this estimator leads to much improved results compared with the ML EPD estimator. A small case study is conducted to assess the performance of the Bayesian EPD estimator and the ML EPD estimator using a real dataset from a Belgian reinsurance firm. We investigate, by a simulation study, the performance of some well-known parametric and semi-parametric estimators of the EVI adapted for censoring, and further illustrate their performance by applying them to a real survival dataset. A censored Bayesian EPD estimator for right-censored data is then proposed through an altered expression of the posterior density. The censored Bayesian EPD estimator is compared with the censored ML EPD estimator through a simulation study.
The behaviour of the minimum density power divergence estimator (MDPDE) is assessed at uncontaminated and contaminated distributions through an exhaustive simulation study that also includes the other EPD estimators mentioned in this thesis. The comparison is made in terms of the bias and mean squared error. EVI estimates from the different estimators are then used to estimate quantiles, and the results are reported alongside the EVI estimates. We illustrate the performance of all mentioned estimators on a real dataset from geopedology, in which a few abnormal soil measurements strongly influence the estimates of the EVI and of high quantiles.