Evaluation of African American Language Bias in Natural Language Generation

Authors: Nicholas Deas, Jessi Grieser, Shana Kleiner, Desmond Patton, Elsbeth Turcan, Kathleen McKeown

Abstract: We evaluate how well LLMs understand African American Language (AAL) in comparison to their performance on White Mainstream English (WME), the encouraged “standard” form of English taught in American classrooms. We measure LLM performance using automatic metrics and human judgments for two tasks: a counterpart generation task, where a model generates AAL (or WME) given WME (or AAL), and a masked span prediction (MSP) task, where models predict a phrase that was removed from their input. Our contributions include: (1) evaluation of six pre-trained, large language models on the two language generation tasks; (2) a novel dataset of AAL text from multiple contexts (social media, hip-hop lyrics, focus groups, and linguistic interviews) with human-annotated counterparts in WME; and (3) documentation of model performance gaps that suggest bias and identification of trends in lack of understanding of AAL features.

Deas, N., Grieser, J., Kleiner, S., Patton, D., Turcan, E., & McKeown, K. (2023). Evaluation of African American Language Bias in Natural Language Generation.
