Model-Based Clustering of Nonparametric Weighted Networks With Application to Water Pollution Analysis

Amal Agarwal, Lingzhou Xue

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Water pollution is a major global environmental problem, and it poses a great environmental risk to public health and biological diversity. This work is motivated by assessing the potential environmental threat of coal mining through increased sulfate concentrations in river networks, which do not follow any simple parametric distribution. However, existing network models mainly focus on binary or discrete networks, or on weighted networks with known parametric weight distributions. We propose a principled nonparametric weighted network model based on exponential-family random graph models and local likelihood estimation, and study its model-based clustering with application to large-scale water pollution network analysis. We do not require any parametric distributional assumption on the network weights. The proposed method greatly extends the methodology and applicability of statistical network models. Furthermore, it is scalable to large and complex networks in large-scale environmental studies. The power of our proposed methods is demonstrated in simulation studies and in a real application to sulfate pollution network analysis in the Ohio watershed in Pennsylvania, United States.

Original language: English (US)
Journal: Technometrics
DOI: 10.1080/00401706.2019.1623076
State: Published - Jan 1 2019

Fingerprint

Water Pollution
Model-based Clustering
Weighted Networks
Network Model
Network Analysis
Electric network analysis
Local Likelihood
Weight Distribution
Exponential Family
Public Health
Graph Model
Pollution
Random Graphs
Complex Networks
Statistical Model
Biodiversity
Mining

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Modeling and Simulation
  • Applied Mathematics

Cite this

@article{6dfe68e4bd4448d5996bbd6f2fa3e1f6,
title = "Model-Based Clustering of Nonparametric Weighted Networks With Application to Water Pollution Analysis",
author = "Amal Agarwal and Lingzhou Xue",
year = "2019",
month = "1",
day = "1",
doi = "10.1080/00401706.2019.1623076",
language = "English (US)",
journal = "Technometrics",
issn = "0040-1706",
publisher = "American Statistical Association",

}

TY - JOUR

T1 - Model-Based Clustering of Nonparametric Weighted Networks With Application to Water Pollution Analysis

AU - Agarwal, Amal

AU - Xue, Lingzhou

PY - 2019/1/1

Y1 - 2019/1/1

UR - http://www.scopus.com/inward/record.url?scp=85068557563&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85068557563&partnerID=8YFLogxK

U2 - 10.1080/00401706.2019.1623076

DO - 10.1080/00401706.2019.1623076

M3 - Article

AN - SCOPUS:85068557563

JO - Technometrics

JF - Technometrics

SN - 0040-1706

ER -