Andrea Wen-Yi Wang
andreawwenyi [at] infosci.cornell.edu
CV Google Scholar GitHub Twitter
I'm a second-year PhD student at Cornell Information Science, advised by Allison Koenecke and David Mimno.
I am interested in the intersection of Natural Language Processing, Data Science, and Computational Social Science.
I study the characteristics of LLMs through two lenses.
First, through the lens of models, I study the interpretability of multilingual LLMs.
Second, through the lens of social science, I study how language models can meet the needs of social scientists working with textual data. So far, I have worked on projects involving criminal justice, gender studies, and misinformation.

Prior to joining Cornell, I was a data scientist at New York University's Public Safety Lab, where I worked on the Jail Data Initiative with Orion Taylor and Anna Harvey. I was also a contributor to g0v ("gov-zero"), a grassroots civic-tech community in Taiwan, where I worked on the 0archive project.

For prospective PhD applicants: I’m happy to chat and share my experience with applications and the PhD. Feel free to email me.
Publications
Hyperpolyglot LLMs: Cross-Lingual Interpretability in Token Embeddings
Andrea W Wen-Yi, David Mimno
EMNLP 2023
Paper  Code  Poster

The Evolution of Rumors on a Closed Social Networking Platform During COVID-19: Algorithm Development and Content Study
Andrea W Wang, Jo-Yu Lan, Ming-Hung Wang, Chihhao Yu
JMIR Med Inform 2021
doi:10.2196/30467
Paper
Working Papers
"Courtroom Tears": Identifying Gendered Discourse in US Capital Trial Transcripts Using Large Language Models
Andrea W Wen-Yi, Kathryn Adamson, Nathalie Greenfield, Rachel Goldberg, Sandra Babcock, David Mimno, Allison Koenecke
Under Review
Seasonality Visualizations of Online Text
Andrea W Wen-Yi, Allison Koenecke, David Mimno
IC2S2 2023
Poster
This website is developed by gyauney. Thanks Greg!