### Abstract

With the growth of interest in network data across fields, the Exponential Random Graph Model (ERGM) has emerged as the leading approach to the statistical analysis of network data. ERGM parameter estimation requires approximating an intractable normalizing constant. Simulation methods are the state-of-the-art approach to this approximation, leading to estimation by Monte Carlo maximum likelihood (MCMLE). MCMLE is accurate when a large sample of networks is used to approximate the normalizing constant. However, MCMLE is computationally expensive, and may be prohibitively so if the size of the network is on the order of 1,000 nodes (i.e., roughly one million potential ties) or greater. When the network is large, one option is maximum pseudolikelihood estimation (MPLE). The standard MPLE is simple and fast, but generally underestimates standard errors. We show that a resampling method, the parametric bootstrap, results in accurate coverage probabilities for confidence intervals. We find that bootstrapped MPLE can be run in one-fifth the time of MCMLE. We study the relative performance of MCMLE and MPLE with simulation studies, and illustrate the two approaches by applying them to a network of bills introduced in the United States Senate.
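To make the bootstrapped MPLE idea concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes a deliberately simple directed model with only an edges term and a mutual (reciprocity) term, chosen because its dyads are independent and networks can be drawn exactly without MCMC; models with richer dependence would need MCMC simulation instead. The function names (`simulate`, `mple`, `bootstrap_ci`), the network size, and the parameter values are all hypothetical choices for illustration.

```python
import numpy as np


def simulate(theta, n, rng):
    """Draw one directed network from an edges + mutual ERGM, dyad by dyad."""
    edges, mutual = theta
    # Unnormalized weights of the four dyad states: empty, i->j, j->i, mutual.
    w = np.array([1.0, np.exp(edges), np.exp(edges), np.exp(2 * edges + mutual)])
    i, j = np.triu_indices(n, k=1)
    state = rng.choice(4, size=i.size, p=w / w.sum())
    y = np.zeros((n, n), dtype=int)
    y[i, j] = (state == 1) | (state == 3)
    y[j, i] = (state == 2) | (state == 3)
    return y


def mple(y, iters=25):
    """MPLE: logistic regression of every tie on its change statistics."""
    n = y.shape[0]
    off = ~np.eye(n, dtype=bool)  # all ordered pairs (i, j) with i != j
    # Change statistics for toggling y_ij: 1 for edges, y_ji for mutual.
    X = np.column_stack([np.ones(off.sum()), y.T[off]])
    t = y[off].astype(float)
    beta = np.zeros(2)
    for _ in range(iters):  # Newton-Raphson on the log pseudolikelihood
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (t - p)
        hess = (X * (p * (1 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    # Naive standard errors that treat all ties as independent observations.
    naive_se = np.sqrt(np.diag(np.linalg.inv(hess)))
    return beta, naive_se


def bootstrap_ci(theta_hat, n, reps, rng, level=0.95):
    """Parametric bootstrap: simulate from the fitted model, refit, take quantiles."""
    draws = np.array([mple(simulate(theta_hat, n, rng))[0] for _ in range(reps)])
    alpha = (1 - level) / 2
    return np.quantile(draws, [alpha, 1 - alpha], axis=0)


rng = np.random.default_rng(0)
observed = simulate((-2.0, 1.5), n=40, rng=rng)  # stand-in for real data
theta_hat, naive_se = mple(observed)
ci = bootstrap_ci(theta_hat, n=40, reps=200, rng=rng)
print("MPLE (edges, mutual):", theta_hat)
print("naive 95% half-widths:", 1.96 * naive_se)
print("bootstrap 95% intervals (rows: lower, upper):")
print(ci)
```

Each column of `ci` corresponds to one parameter. Comparing the bootstrap interval widths with the naive `1.96 * naive_se` half-widths gives a rough sense of how much the logistic-regression standard errors depart from the resampling-based ones in this toy setting.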

Original language | English (US) |
---|---|
Title of host publication | Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017 |
Editors | Zoran Obradovic, Ricardo Baeza-Yates, Jeremy Kepner, Raghunath Nambiar, Chonggang Wang, Masashi Toyoda, Toyotaro Suzumura, Xiaohua Hu, Alfredo Cuzzocrea, Ricardo Baeza-Yates, Jian Tang, Hui Zang, Jian-Yun Nie, Rumi Ghosh |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 116-121 |
Number of pages | 6 |
ISBN (Electronic) | 9781538627143 |
DOIs | https://doi.org/10.1109/BigData.2017.8257919 |
State | Published - Jan 12 2018 |
Event | 5th IEEE International Conference on Big Data, Big Data 2017 - Boston, United States; Duration: Dec 11 2017 → Dec 14 2017 |

### Publication series

Name | Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017 |
---|---|
Volume | 2018-January |

### Other

Other | 5th IEEE International Conference on Big Data, Big Data 2017 |
---|---|
Country | United States |
City | Boston |
Period | 12/11/17 → 12/14/17 |

### All Science Journal Classification (ASJC) codes

- Computer Networks and Communications
- Hardware and Architecture
- Information Systems
- Information Systems and Management
- Control and Optimization

### Cite this

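Schmid, C. S., & Desmarais, Jr., B. A. (2018). Exponential random graph models with big networks: Maximum pseudolikelihood estimation and the parametric bootstrap. In *Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017* (pp. 116-121). (Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017; Vol. 2018-January). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData.2017.8257919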

Schmid, CS & Desmarais, Jr., BA 2018, Exponential random graph models with big networks: Maximum pseudolikelihood estimation and the parametric bootstrap. in *Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017.* Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017, vol. 2018-January, Institute of Electrical and Electronics Engineers Inc., pp. 116-121, 5th IEEE International Conference on Big Data, Big Data 2017, Boston, United States, 12/11/17. https://doi.org/10.1109/BigData.2017.8257919

**Exponential random graph models with big networks: Maximum pseudolikelihood estimation and the parametric bootstrap.** / Schmid, Christian S.; Desmarais, Jr., Bruce A.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

TY - GEN

T1 - Exponential random graph models with big networks

T2 - Maximum pseudolikelihood estimation and the parametric bootstrap

AU - Schmid, Christian S.

AU - Desmarais, Jr., Bruce A.

PY - 2018/1/12

Y1 - 2018/1/12

AB - With the growth of interest in network data across fields, the Exponential Random Graph Model (ERGM) has emerged as the leading approach to the statistical analysis of network data. ERGM parameter estimation requires the approximation of an intractable normalizing constant. Simulation methods represent the state-of-the-art approach to approximating the normalizing constant, leading to estimation by Monte Carlo maximum likelihood (MCMLE). MCMLE is accurate when a large sample of networks is used to approximate the normalizing constant. However, MCMLE is computationally expensive, and may be prohibitively so if the size of the network is on the order of 1,000 nodes (i.e., one million potential ties) or greater. When the network is large, one option is maximum pseudolikelihood estimation (MPLE). The standard MPLE is simple and fast, but generally underestimates standard errors. We show that a resampling method - the parametric bootstrap - results in accurate coverage probabilities for confidence intervals. We find that bootstrapped MPLE can be run in 1/5th the time of MCMLE. We study the relative performance of MCMLE and MPLE with simulation studies, and illustrate the two different approaches by applying them to a network of bills introduced in the United States Senate.

UR - http://www.scopus.com/inward/record.url?scp=85047783425&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85047783425&partnerID=8YFLogxK

U2 - 10.1109/BigData.2017.8257919

DO - 10.1109/BigData.2017.8257919

M3 - Conference contribution

AN - SCOPUS:85047783425

T3 - Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017

SP - 116

EP - 121

BT - Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017

A2 - Obradovic, Zoran

A2 - Baeza-Yates, Ricardo

A2 - Kepner, Jeremy

A2 - Nambiar, Raghunath

A2 - Wang, Chonggang

A2 - Toyoda, Masashi

A2 - Suzumura, Toyotaro

A2 - Hu, Xiaohua

A2 - Cuzzocrea, Alfredo

A2 - Baeza-Yates, Ricardo

A2 - Tang, Jian

A2 - Zang, Hui

A2 - Nie, Jian-Yun

A2 - Ghosh, Rumi

PB - Institute of Electrical and Electronics Engineers Inc.

ER -