### Abstract

Statistical methodology, with deep roots in probability theory, provides quantitative procedures for extracting scientific knowledge from astronomical data and for testing astrophysical theory. In recent decades, statistics has enormously increased in scope and sophistication. After a historical perspective, this review outlines concepts of mathematical statistics, elements of probability theory, hypothesis tests, and point estimation. Least squares, maximum likelihood, and Bayesian approaches to statistical inference are outlined. Resampling methods, particularly the bootstrap, provide valuable procedures when distribution functions of statistics are not known. Several approaches to model selection and goodness of fit are considered.

Applied statistics relevant to astronomical research are briefly discussed. Nonparametric methods are valuable when little is known about the behavior of the astronomical populations or processes. Data smoothing can be achieved with kernel density estimation and nonparametric regression. Samples measured in many variables can be divided into distinct groups using unsupervised clustering or supervised classification procedures. Many classification and data mining techniques are available. Astronomical surveys subject to nondetections can be treated with survival analysis for censored data, with a few related procedures for truncated data. Astronomical light curves can be investigated using time-domain methods involving autoregressive models, frequency-domain methods involving Fourier transforms, and state-space modeling. Methods for interpreting the spatial distributions of points in some space have been independently developed in astronomy and other fields.

Two types of resources for astronomers needing statistical information and tools are presented. First, about 40 recommended texts and monographs are listed covering various fields of statistics. Second, the public domain R statistical software system has recently emerged as a highly capable environment for statistical analysis. Together with its ∼3,000 (and growing) add-on CRAN packages, R implements a vast range of statistical procedures in a coherent high-level language with advanced graphics. Two illustrations of R’s capabilities for astronomical data analysis are given: an adaptive kernel estimator with bootstrap errors applied to a quasar dataset, and the second-order J function (related to the two-point correlation function) with three edge corrections applied to a galaxy redshift survey.
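The abstract's remark that the bootstrap helps when the distribution function of a statistic is not known can be illustrated with a minimal sketch. This is not the chapter's own example (which uses R and a quasar dataset); it is written in Python with NumPy, the helper name `bootstrap_se` is hypothetical, and the data are synthetic draws made up purely for illustration.

```python
import numpy as np

def bootstrap_se(sample, statistic, n_boot=2000, seed=0):
    """Bootstrap standard error and 95% percentile interval of a statistic.

    Hypothetical helper for illustration; not taken from the chapter.
    """
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    # Draw n_boot resamples of size n, with replacement, from the observed data.
    idx = rng.integers(0, n, size=(n_boot, n))
    # Evaluate the statistic on every resample.
    replicates = np.array([statistic(sample[row]) for row in idx])
    se = replicates.std(ddof=1)
    lo, hi = np.percentile(replicates, [2.5, 97.5])
    return se, (lo, hi)

# Synthetic, skewed data standing in for a measured astronomical quantity.
data = np.random.default_rng(42).lognormal(mean=0.0, sigma=1.0, size=200)

# The sampling distribution of the median has no simple closed form here,
# so its uncertainty is estimated by resampling.
se, (lo, hi) = bootstrap_se(data, np.median)
print(f"median = {np.median(data):.3f}, bootstrap SE = {se:.3f}")
print(f"95% percentile interval = ({lo:.3f}, {hi:.3f})")
```

The percentile interval is only one of several bootstrap interval constructions; it is used here because it is the simplest to state.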

Original language | English (US) |
---|---|
Title of host publication | Planets, Stars and Stellar Systems Volume 2 |
Subtitle of host publication | Astronomical Techniques, Software, and Data |
Publisher | Springer Netherlands |
Pages | 445-480 |
Number of pages | 36 |
ISBN (Electronic) | 9789400756182 |
ISBN (Print) | 9789400756175 |
DOIs | https://doi.org/10.1007/978-94-007-5618-2_10 |
State | Published - Jan 1 2013 |

### All Science Journal Classification (ASJC) codes

- Physics and Astronomy (all)
- Earth and Planetary Sciences (all)

### Cite this

Feigelson, E. D., & Babu, G. J. (2013). Statistical methods for astronomy. In *Planets, Stars and Stellar Systems Volume 2: Astronomical Techniques, Software, and Data* (pp. 445-480). Springer Netherlands. https://doi.org/10.1007/978-94-007-5618-2_10

Research output: Chapter in Book/Report/Conference proceeding › Chapter

TY - CHAP

T1 - Statistical methods for astronomy

AU - Feigelson, Eric D.

AU - Babu, G. Jogesh

PY - 2013/1/1

Y1 - 2013/1/1

UR - http://www.scopus.com/inward/record.url?scp=84956984264&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84956984264&partnerID=8YFLogxK

U2 - 10.1007/978-94-007-5618-2_10

DO - 10.1007/978-94-007-5618-2_10

M3 - Chapter

AN - SCOPUS:84956984264

SN - 9789400756175

SP - 445

EP - 480

BT - Planets, Stars and Stellar Systems Volume 2

PB - Springer Netherlands

ER -