### Abstract

Activities such as global sensitivity analysis, statistical effect screening, uncertainty propagation, or model calibration have become integral to the Verification and Validation (V&V) of numerical models and computer simulations. One of the goals of V&V is to assess prediction accuracy and uncertainty, which feeds directly into reliability analysis or the Quantification of Margin and Uncertainty (QMU) of engineered systems. Because these analyses involve multiple runs of a computer code, they can rapidly become computationally expensive. An alternative to Monte Carlo-like sampling is to combine a design of computer experiments to meta-modeling, and replace the potentially expensive computer simulation by a fast-running emulator. The surrogate can then be used to estimate sensitivities, propagate uncertainty, and calibrate model parameters at a fraction of the cost it would take to wrap a sampling algorithm or optimization solver around the physics-based code. Doing so, however, offers the risk to develop an incorrect emulator that erroneously approximates the true-but-unknown sensitivities of the physics-based code. We demonstrate the extent to which this occurs when Gaussian Process Modeling (GPM) emulators are trained in high-dimensional spaces using too-sparsely populated designs-of-experiments. Our illustration analyzes a variant of the Rosenbrock function in which several effects are made statistically insignificant while others are strongly coupled, therefore, mimicking a situation that is often encountered in practice. In this example, using a combination of GPM emulator and design-of-experiments leads to an incorrect approximation of the function. A mathematical proof of the origin of the problem is proposed. The adverse effects that too-sparsely populated designs may produce are discussed for the coverage of the design space, estimation of sensitivities, and calibration of parameters. This work attempts to raise awareness to the potential dangers of not allocating enough resources when exploring a design space to develop fast-running emulators.

Original language | English (US) |
---|---|

Pages (from-to) | 1220-1231 |

Number of pages | 12 |

Journal | Reliability Engineering and System Safety |

Volume | 96 |

Issue number | 9 |

DOIs | |

State | Published - Sep 1 2011 |

### Fingerprint

### All Science Journal Classification (ASJC) codes

- Safety, Risk, Reliability and Quality
- Industrial and Manufacturing Engineering

### Cite this

}

*Reliability Engineering and System Safety*, vol. 96, no. 9, pp. 1220-1231. https://doi.org/10.1016/j.ress.2011.02.015

**The dangers of sparse sampling for the quantification of margin and uncertainty.** / Hemez, François M.; Atamturktur, Sez.

Research output: Contribution to journal › Article

TY - JOUR

T1 - The dangers of sparse sampling for the quantification of margin and uncertainty

AU - Hemez, François M.

AU - Atamturktur, Sez

PY - 2011/9/1

Y1 - 2011/9/1

N2 - Activities such as global sensitivity analysis, statistical effect screening, uncertainty propagation, or model calibration have become integral to the Verification and Validation (V&V) of numerical models and computer simulations. One of the goals of V&V is to assess prediction accuracy and uncertainty, which feeds directly into reliability analysis or the Quantification of Margin and Uncertainty (QMU) of engineered systems. Because these analyses involve multiple runs of a computer code, they can rapidly become computationally expensive. An alternative to Monte Carlo-like sampling is to combine a design of computer experiments to meta-modeling, and replace the potentially expensive computer simulation by a fast-running emulator. The surrogate can then be used to estimate sensitivities, propagate uncertainty, and calibrate model parameters at a fraction of the cost it would take to wrap a sampling algorithm or optimization solver around the physics-based code. Doing so, however, offers the risk to develop an incorrect emulator that erroneously approximates the true-but-unknown sensitivities of the physics-based code. We demonstrate the extent to which this occurs when Gaussian Process Modeling (GPM) emulators are trained in high-dimensional spaces using too-sparsely populated designs-of-experiments. Our illustration analyzes a variant of the Rosenbrock function in which several effects are made statistically insignificant while others are strongly coupled, therefore, mimicking a situation that is often encountered in practice. In this example, using a combination of GPM emulator and design-of-experiments leads to an incorrect approximation of the function. A mathematical proof of the origin of the problem is proposed. The adverse effects that too-sparsely populated designs may produce are discussed for the coverage of the design space, estimation of sensitivities, and calibration of parameters. This work attempts to raise awareness to the potential dangers of not allocating enough resources when exploring a design space to develop fast-running emulators.

AB - Activities such as global sensitivity analysis, statistical effect screening, uncertainty propagation, or model calibration have become integral to the Verification and Validation (V&V) of numerical models and computer simulations. One of the goals of V&V is to assess prediction accuracy and uncertainty, which feeds directly into reliability analysis or the Quantification of Margin and Uncertainty (QMU) of engineered systems. Because these analyses involve multiple runs of a computer code, they can rapidly become computationally expensive. An alternative to Monte Carlo-like sampling is to combine a design of computer experiments to meta-modeling, and replace the potentially expensive computer simulation by a fast-running emulator. The surrogate can then be used to estimate sensitivities, propagate uncertainty, and calibrate model parameters at a fraction of the cost it would take to wrap a sampling algorithm or optimization solver around the physics-based code. Doing so, however, offers the risk to develop an incorrect emulator that erroneously approximates the true-but-unknown sensitivities of the physics-based code. We demonstrate the extent to which this occurs when Gaussian Process Modeling (GPM) emulators are trained in high-dimensional spaces using too-sparsely populated designs-of-experiments. Our illustration analyzes a variant of the Rosenbrock function in which several effects are made statistically insignificant while others are strongly coupled, therefore, mimicking a situation that is often encountered in practice. In this example, using a combination of GPM emulator and design-of-experiments leads to an incorrect approximation of the function. A mathematical proof of the origin of the problem is proposed. The adverse effects that too-sparsely populated designs may produce are discussed for the coverage of the design space, estimation of sensitivities, and calibration of parameters. This work attempts to raise awareness to the potential dangers of not allocating enough resources when exploring a design space to develop fast-running emulators.

UR - http://www.scopus.com/inward/record.url?scp=79959594169&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79959594169&partnerID=8YFLogxK

U2 - 10.1016/j.ress.2011.02.015

DO - 10.1016/j.ress.2011.02.015

M3 - Article

AN - SCOPUS:79959594169

VL - 96

SP - 1220

EP - 1231

JO - Reliability Engineering and System Safety

JF - Reliability Engineering and System Safety

SN - 0951-8320

IS - 9

ER -