The statistical significance of a candidate gravitational-wave (GW) event is crucial to the prospects for a confirmed detection, or for its selection as a candidate for follow-up electromagnetic observation. To determine the significance of a GW candidate, a ranking statistic is evaluated and compared to an empirically estimated background distribution, yielding a false-alarm probability or p-value. The reliability of this background estimate is limited by the number of background samples and by the fact that GW detectors cannot be shielded from signals, making it impossible to identify a pure background data set. Different strategies have been proposed: in one method, all samples, including potential signals, are included in the background estimation, whereas in another method, coincidence removal is performed in order to exclude possible signals from the estimated background. Here we report on a mock data challenge, performed prior to the first detections of GW signals by Advanced LIGO, to compare these two methods. The all-samples method is found to be self-consistent in terms of the rate of false positive detection claims, but its p-value estimates are systematically conservative and subject to higher variance. Conversely, the coincidence-removal method yields a mean-unbiased estimate of the p-value but sacrifices self-consistency. We provide a simple formula for the uncertainty in estimated significance and compare it to mock data results. Finally, we discuss the use of different methods in claiming the detection of GW signals.
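As a rough illustration of the two background-estimation strategies compared above, the following toy sketch may help. It is not the procedure used in the mock data challenge. Every name, the quadrature-sum ranking statistic, the time-slot trigger model, and the trigger values are assumptions made for illustration only. Background samples are formed by pairing triggers from two detectors at unequal times, and the p-value is taken as the fraction of background samples at least as loud as the candidate.

```python
import itertools
import math


def combined_stat(rho1, rho2):
    # Assumed ranking statistic: quadrature sum of single-detector SNRs.
    return math.hypot(rho1, rho2)


def background_stats(trig1, trig2, remove_coincident):
    """Build background samples from unphysical (unequal-time) pairings.

    trig1, trig2: dicts mapping a time slot to a single-detector SNR
    (a hypothetical, highly simplified trigger model).
    remove_coincident: if True, triggers that take part in a zero-lag
    coincidence are excluded (coincidence removal); if False, every
    trigger is used (all samples).
    """
    coincident = set(trig1) & set(trig2)
    stats = []
    for (t1, r1), (t2, r2) in itertools.product(trig1.items(), trig2.items()):
        if t1 == t2:
            continue  # zero-lag pair: a candidate, not a background sample
        if remove_coincident and (t1 in coincident or t2 in coincident):
            continue
        stats.append(combined_stat(r1, r2))
    return stats


def p_value(candidate, background):
    # Fraction of background samples at least as loud as the candidate.
    if not background:
        raise ValueError("no background samples available")
    n_louder = sum(1 for s in background if s >= candidate)
    return n_louder / len(background)


# Illustrative triggers: slots 2 and 7 are zero-lag coincidences.
trig1 = {0: 5.0, 1: 6.0, 2: 9.0, 7: 6.2}
trig2 = {2: 8.0, 4: 5.0, 5: 6.5, 7: 5.8}

candidate = combined_stat(trig1[7], trig2[7])  # the quieter candidate
bg_all = background_stats(trig1, trig2, remove_coincident=False)
bg_removed = background_stats(trig1, trig2, remove_coincident=True)

print("all samples:         ", p_value(candidate, bg_all))
print("coincidence removal: ", p_value(candidate, bg_removed))
```

In this toy data set the loud coincident triggers at slot 2 contaminate the all-samples background, so the quieter candidate receives a larger (more conservative) p-value than under coincidence removal, at the cost of fewer background samples in the latter case.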
All Science Journal Classification (ASJC) codes
- Physics and Astronomy (miscellaneous)