Kernel density estimation is a widely used statistical tool, and bandwidth selection is critically important. The Sheather and Jones (SJ) selector [A reliable data-based bandwidth selection method for kernel density estimation, J. R. Stat. Soc. Ser. B 53 (1991), pp. 683-690] remains the best available data-driven bandwidth selector. It can, however, perform poorly if the true density deviates too far in shape from the normal. This paper first develops an alternative selector by following ideas in Park and Marron [On the use of pilot estimators in bandwidth selection, Nonparametr. Stat. 1 (1992), pp. 231-240] to reduce the impact of the normal reference density. The alternative selector can bring drastic improvement for the less smooth densities that the SJ selector has difficulty with, but may perform slightly worse otherwise. We then propose combining the alternative selector and the SJ selector by using the smaller of the two bandwidths, which has the effect of automatically picking the better one for a particular density. In an extensive simulation study using the 15 benchmark densities in Marron and Wand [Exact mean integrated squared error, Ann. Statist. 20 (1992), pp. 712-736], the combined selector significantly improves on the SJ selector for 5 difficult densities and retains the superior performance of the SJ selector for the other 10. A ready-to-use R function is provided.
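The combination rule described above (use the smaller of the two bandwidths) can be illustrated with a minimal Python sketch. It is not the R function provided with the paper, and it implements neither the SJ selector nor the alternative selector. The names `combined_bandwidth` and `gaussian_kde` are hypothetical, and the values of `h_sj` and `h_alt` are placeholder numbers chosen only for illustration.

```python
import math
import random


def combined_bandwidth(h_sj, h_alt):
    """Return the smaller of two candidate bandwidths (the combination rule)."""
    if h_sj <= 0 or h_alt <= 0:
        raise ValueError("bandwidths must be positive")
    return min(h_sj, h_alt)


def gaussian_kde(x, data, h):
    """Evaluate a Gaussian kernel density estimate at point x with bandwidth h."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm


# Simulated sample; the seed is fixed only so the sketch is reproducible.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]

# ASSUMPTION: placeholder bandwidths, not computed by either selector.
h_sj, h_alt = 0.35, 0.28
h = combined_bandwidth(h_sj, h_alt)

# Evaluate the density estimate on an evenly spaced grid over [-8, 8].
step = 0.01
grid = [-8.0 + step * i for i in range(1601)]
density = [gaussian_kde(x, data, h) for x in grid]
```

In practice the two inputs would come from the respective selectors applied to the same sample; the sketch only shows that the combination step itself is a single comparison.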
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty