Google Scholar
Semantic Scholar
Twitter
GitHub
LinkedIn
Contact Me
I am a Research Scientist at the Allen Institute for Artificial Intelligence (AI2), working on Natural Language Processing with a focus on scientific documents. I received my PhD in Computer Science from the University of Texas at Austin, where I worked with Ray Mooney and Katrin Erk.
For a full list of publications, please see my Semantic Scholar or Google Scholar pages.
LongformerEncoderDecoder (LED) - a pretrained transformer for long-document generation tasks. [Code and Pretrained model]
Longformer - a BERT-like model for long documents. [Code and Pretrained model]
SPECTER - a citation-informed embedding model for scientific documents. [Code, Data, and Pretrained model]
SciSpacy - a spaCy pipeline for scientific documents. [Code]
SciBERT - a BERT model for scientific documents (see the usage sketch after this list). [Code, Data, and Pretrained model]
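As a quick illustration of how these released checkpoints are typically used, here is a minimal sketch that loads Longformer and SciBERT through the Hugging Face transformers library. The Hub identifiers ("allenai/longformer-base-4096", "allenai/scibert_scivocab_uncased") are the commonly used names for these checkpoints; adjust them if you use a different release.

```python
# Minimal sketch: loading the pretrained models via Hugging Face transformers.
# Assumes `pip install transformers`; the Hub identifiers below are the commonly
# used checkpoint names, not an official recommendation from this page.
from transformers import AutoModel, AutoTokenizer

# Longformer: a BERT-like encoder whose attention scales linearly with sequence
# length, so long documents (up to 4,096 tokens here) fit in a single pass.
longformer_tok = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")
longformer = AutoModel.from_pretrained("allenai/longformer-base-4096")

# SciBERT: a BERT model whose vocabulary and weights were trained on scientific text.
scibert_tok = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
scibert = AutoModel.from_pretrained("allenai/scibert_scivocab_uncased")

text = "We introduce a transformer model for long scientific documents."
inputs = longformer_tok(text, return_tensors="pt")
outputs = longformer(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
```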
[12/2020] Selected as an Outstanding Reviewer for EMNLP 2020.
[12/2020] LongformerEncoderDecoder (LED) is out. [Paper] [Code and Pretrained model]
[12/2020] Our tutorial “Beyond Paragraphs: NLP for Long Sequences” has been accepted to appear at NAACL 2021.
[11/2020] Co-organizing The Second Workshop on Scholarly Document Processing (SDP 2021).
[10/2020] Serving as an area chair for the Sentence-level Semantics and Textual Inference track at ACL 2021.
[10/2020] Serving as an area chair for the Sentence-level Semantics and Textual Inference track at NAACL 2021.
[9/2020] Serving as a publication co-chair for NAACL 2021.
[7/2020] Our paper, Don’t Stop Pretraining, won an honorable mention at ACL 2020.
[6/2020] Invited to serve as a standing reviewer for the Computational Linguistics (CL) journal.
[6/2020] Gave a talk about Longformer to UW NLP students. [slides]
[6/2020] Longformer is now integrated into the Hugging Face Transformers repo.
[5/2020] SciBERT has been downloaded more than 20,000 times in the last 30 days.
[4/2020] Longformer is out.
[3/2020] Co-organizing the SciNLP workshop. See scinlp.org for details.