I am a Senior Research Scientist at the Allen Institute for Artificial Intelligence (AI2), working on Natural Language Processing with a focus on scientific documents. I received my PhD in Computer Science from the University of Texas at Austin, where I worked with Ray Mooney and Katrin Erk.
LongformerEncoderDecoder (LED) - a pretrained transformer for long-document generation tasks. [Code and Pretrained model]
Longformer - a BERT-like model for long documents. [Code and Pretrained model]
SPECTER - a citation-informed embedding model for scientific documents. [Code, Data and Pretrained Model]
SciSpacy - a spaCy pipeline for scientific documents. [Code]
SciBERT - a BERT model for scientific documents. [Code, Data, and Pretrained model]
[1/2021] Promoted to Senior Research Scientist at AI2.
[12/2020] Selected as an Outstanding Reviewer for EMNLP 2020.
[12/2020] Our tutorial “Beyond Paragraphs: NLP for Long Sequences” has been accepted to appear at NAACL 2021.
[11/2020] Co-organizing The Second Workshop on Scholarly Document Processing (SDP 2021).
[10/2020] Serving as an area chair for the Sentence-level Semantics and Textual Inference track at ACL 2021.
[10/2020] Serving as an area chair for the Sentence-level Semantics and Textual Inference track at NAACL 2021.
[9/2020] Serving as a publication co-chair for NAACL 2021.
[6/2020] Invited to serve as a standing reviewer for the Computational Linguistics (CL) journal.
[6/2020] Gave a talk about Longformer to UW NLP students. [slides]
[6/2020] Longformer is now integrated into the Hugging Face Transformers repo.
[4/2020] Longformer is out.
[3/2020] Co-organizing the SciNLP workshop. Check scinlp.org