About me

I’m a researcher at the Allen Institute for AI on the Semantic Scholar Research team. Before that, I did some statistics in Seattle and some applied probability at Academia Sinica in Taiwan. I graduated in 2015 with an MS in Statistics from the University of Washington.


I work on NLP over scientific literature, focusing on the challenges scientists face with information overload and keeping up to date. I’m currently interested in:

and I’ve also done some work in:

Language modeling

  • Don’t Stop Pretraining 🎶: Adapt language models to domains and tasks (ACL 2020) (GitHub) - 🎉 Runner-up for Best Paper
  • SciBERT: A pretrained language model for scientific text (EMNLP 2019) (GitHub)


Fact checking

Corpora and resources

Augmented Reading


Explainable AI

Information extraction

  • Document-level definition detection in scholarly documents (SDP at EMNLP 2020)
  • Combining distant and direct supervision for neural relation extraction (NAACL 2019)
  • Construction of the literature graph in Semantic Scholar (NAACL 2018)

Science of science

Shared tasks

  • SciVER at SDP 2021 (NAACL 2021) - Scientific claim verification (link)
  • EPIC-QA at TAC 2020 - Open-domain question answering challenge: Can systems handle a mixture of questions from experts as well as consumers? (link)
  • TREC-COVID at TREC 2020 - Information retrieval challenge over an evolving CORD-19 corpus (link) (JAMIA 2020 paper) (SIGIR Forum 2020 paper)


  • The 2nd SDP workshop will be at NAACL 2021! Stay tuned! (link)

  • 1st SciNLP workshop at AKBC 2020 (link) (recorded talks) - What a success! 166 of 422 AKBC attendees signed up for our workshop! Stay tuned for the next one ;)

My collaborators

All of my projects have been collaborations with other awesome researchers ❤️. Many thanks to:

Waleed Ammar (Google), Iz Beltagy (AI2), Isabel Cachola (JHU), Arman Cohan (AI2), Doug Downey (AI2/Northwestern), Sergey Feldman (AI2), Suchin Gururangan (UW/AI2), Andrew Head (UC Berkeley), Dongyeop Kang (UC Berkeley), Rodney Kinney (AI2), Ben Lee (UW), Ana Marasović (UW/AI2), Mark Neumann (AI2), Swabha Swayamdipta (UW/AI2), Dave Wadden (UW), Lucy Lu Wang (AI2).