Statistical approach for automated weighting of datasets: Application to heat capacity data

S. Zomorodpoosh, B. Bocklund, A. Obaied, R. Otis, Z. K. Liu, I. Roslyakova

Research output: Contribution to journal › Article › peer-review

Abstract

An essential step in CALPHAD is assigning relative weights to different datasets, but there is no consensus on the best approach. Currently, weights for experimental or first-principles data are assigned manually, based on the knowledge and experience of the modeler. Because this manual treatment is subjective and time-consuming, the handling of such data is rapidly advancing toward automated procedures based on statistical and data mining tools. In the present study, we propose an automated approach to determine the weight of datasets based on the K-Fold Cross-Validation method, modified so that each fold is selected non-randomly and contains an unequal number of observations. Researchers can use this approach as a support tool to evaluate the reliability of each dataset involved in CALPHAD modeling and to quantify the impact of weighting through statistical analysis of the corresponding model. We demonstrate the efficacy of this method through the evaluation of heat capacity data of fcc nickel, hcp magnesium, and bcc iron.
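
The idea of non-random folds of unequal size can be illustrated with a minimal sketch. It assumes that each fold is one complete dataset, which is one reading of the abstract and not something it states. Everything else is hypothetical and not taken from the paper: the synthetic heat capacity values, the straight-line model fitted with NumPy, and the rule that turns held-out errors into weights (normalized inverse mean squared error).

```python
import numpy as np

rng = np.random.default_rng(0)


def make_dataset(n_points, offset):
    """Synthetic Cp(T) data: a linear trend plus an offset and small noise (illustrative only)."""
    temperature = np.linspace(300.0, 1000.0, n_points)
    cp = 20.0 + 0.01 * temperature + offset + rng.normal(0.0, 0.1, n_points)
    return temperature, cp


def leave_one_dataset_out(datasets, degree=1):
    """Hold out each dataset in turn, fit on the rest, and score the held-out data.

    Returns the per-dataset mean squared error and weights derived from it.
    The weighting rule (normalized inverse error) is an assumption of this sketch.
    """
    cv_errors = []
    for k, (t_test, cp_test) in enumerate(datasets):
        # The fold is the k-th dataset itself, so folds are non-random and unequal in size.
        t_train = np.concatenate([t for i, (t, _) in enumerate(datasets) if i != k])
        cp_train = np.concatenate([c for i, (_, c) in enumerate(datasets) if i != k])
        coefficients = np.polyfit(t_train, cp_train, degree)
        residuals = cp_test - np.polyval(coefficients, t_test)
        cv_errors.append(np.mean(residuals ** 2))
    cv_errors = np.array(cv_errors)
    inverse = 1.0 / (cv_errors + 1e-12)  # small constant guards against division by zero
    return cv_errors, inverse / inverse.sum()


# Three datasets of unequal size; the third carries a systematic offset.
datasets = [make_dataset(30, 0.0), make_dataset(12, 0.0), make_dataset(20, 3.0)]
cv_errors, weights = leave_one_dataset_out(datasets)
print("held-out MSE per dataset:", cv_errors)
print("weights:", weights)
```

In this toy setup, the dataset that disagrees with the others is predicted worst when held out, so it receives the smallest weight. The actual model, error measure, and weighting formula used in the study may differ.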

Original language: English (US)
Article number: 101994
Journal: Calphad: Computer Coupling of Phase Diagrams and Thermochemistry
Volume: 71
State: Published - Dec 2020

All Science Journal Classification (ASJC) codes

  • Chemistry (all)
  • Chemical Engineering (all)
  • Computer Science Applications
