About me

I’m a researcher at the Allen Institute for AI on the Semantic Scholar Research team, where I work on NLP and text mining over scientific literature. Before that, I spent a couple of years working as a data scientist in Seattle, and a year as an applied probability researcher at Academia Sinica in Taiwan. I graduated in 2015 with an MS in Statistics from the University of Washington.

Research stuff

There’s too much scientific literature being published for people to make sense of. It’d be great if NLP models could improve access to & understanding of the knowledge contained in those papers. Yet NLP models that work well on news or Wikipedia articles often perform poorly when applied to scientific text. I’m interested in understanding why that is & how we can get these systems to perform better.

Language modeling for science

One of the best ways to improve performance on many scientific NLP tasks is to adapt the underlying language models to the scientific domain:

Scientific NLP tasks & datasets

We need new challenging scientific tasks & datasets for evaluating these models:

Resources for scientific NLP

Scientific text is difficult to access (copyright restrictions 😤). We need large, machine-readable, open-access corpora to support scientific NLP research:

Tools that make research less painful

Science of science

I’m interested in (and concerned about) bias in scientific papers/publishing. Can we use NLP to study these biases?

Community stuff

It’d be great if more researchers in the NLP & text mining communities worked on scientific text. To promote this, I’ve co-organized workshops & shared tasks:

My collaborators

All of my projects have been collaborations with other awesome researchers. Many thanks to:

Waleed Ammar (Google), Iz Beltagy (AI2), Isabel Cachola (AI2), Arman Cohan (AI2), Doug Downey (AI2/Northwestern), Sergey Feldman (AI2), Suchin Gururangan (UW/AI2), Rodney Kinney (AI2), Ben Lee (UW), Ana Marasović (UW/AI2), Mark Neumann (AI2), Swabha Swayamdipta (UW/AI2), Dave Wadden (UW), Lucy Lu Wang (AI2).