Research activity per year: 2007 to 2021

Personal profile

Research interests

[Note: This profile is incomplete, especially with regard to my publications. See http://shomir.net for many more.]

My research brings together natural language processing (NLP), privacy, and artificial intelligence.

I am interested in solving problems to enable computers to do meaningful work with large volumes of natural language text. My lab develops new methods for NLP and applies them to a variety of domains, including privacy, online social networks, web science, and digital libraries. I am particularly interested in breaking down technology's "walls of text", i.e., situations where a human user or decision-maker is expected to consume a large quantity of text to take action while lacking sufficient resources (time, expertise) to properly understand what they have been given. I have applied this paradigm to privacy policies, scholarly manuscripts, documents from the World Wide Web, and historical texts, and I am always interested in new domains to work with.

Personal profile

I am an Assistant Professor in the College of Information Sciences and Technology at Penn State, where I lead the Human Language Technologies Lab. I am also a Faculty Affiliate of Penn State's Institute for CyberScience and a member of the Social Data Analytics graduate faculty.

From 2016 until 2018 I was an Assistant Professor in the EECS Department at the University of Cincinnati. Prior to that I was a postdoc and a lecturer in Carnegie Mellon University's School of Computer Science and an NSF International Research Fellow in the University of Edinburgh's School of Informatics. I received my PhD in Computer Science from the University of Maryland in 2011.

Education/Academic qualification

Computer Science, PhD, University of Maryland

Award Date: May 1 2011

Computer Science, M.S., University of Maryland

Award Date: May 1 2008

Computer Science, B.S., Virginia Tech

Award Date: May 1 2005

Mathematics, B.S., Virginia Tech

Award Date: May 1 2005

Philosophy, B.A., Virginia Tech

Award Date: May 1 2005

Researcher Defined Keywords

  • natural language processing
  • computational linguistics
  • privacy
  • artificial intelligence

Fingerprint

Dive into the research topics where Shomir Wilson is active. These topic labels come from the works of this person. Together they form a unique fingerprint.

Research output
  • A large-scale exploration of terms of service documents on the web

    Sundareswara, S. N., Srinath, M., Wilson, S. & Giles, C. L., Aug 16 2021, DocEng 2021 - Proceedings of the 2021 ACM Symposium on Document Engineering. Association for Computing Machinery, Inc, 3474940.

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

  • Breaking down walls of text: How can NLP benefit consumer privacy?

    Ravichander, A., Black, A. W., Norton, T., Wilson, S. & Sadeh, N., 2021, ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference. Association for Computational Linguistics (ACL), p. 4125-4140 16 p.

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

  • Privacy at scale: Introducing the PrivaSeer corpus of web privacy policies

    Srinath, M., Wilson, S. & Giles, C. L., 2021, ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference. Association for Computational Linguistics (ACL), p. 6829-6839 11 p.

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    1 Scopus citation
  • PrivaSeer: A Privacy Policy Search Engine

    Srinath, M., Sundareswara, S. N., Giles, C. L. & Wilson, S., 2021, Web Engineering - 21st International Conference, ICWE 2021, Proceedings. Brambilla, M., Chbeir, R., Frasincar, F. & Manolescu, I. (eds.). Springer Science and Business Media Deutschland GmbH, p. 286-301 16 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 12706 LNCS).

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    2 Scopus citations
  • Finding a Choice in a Haystack: Automatic Extraction of Opt-Out Statements from Privacy Policy Text

    Bannihatti Kumar, V., Iyengar, R., Nisal, N., Feng, Y., Habib, H., Story, P., Cherivirala, S., Hagan, M., Cranor, L., Wilson, S., Schaub, F. & Sadeh, N., Apr 20 2020, The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020. Association for Computing Machinery, Inc, p. 1943-1954 12 p.

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    Open Access
    25 Scopus citations