ABSTRACT
Language usage on social media varies widely, even within American English. Despite this, the majority of natural language processing systems are trained only on “Standard American English,” or SAE, the variety of English most prominent among white Americans. For hate speech classification, prior work has shown that African American English (AAE) is more likely to be misclassified as hate speech. This has harmful implications for Black social media users, as it reinforces and exacerbates existing notions of anti-Black racism. While past work has highlighted the relationship between AAE and hate speech classification, no work has explored the linguistic characteristics of AAE that lead to misclassification. Our work uses Twitter datasets for AAE dialect and hate speech classifiers to explore the fine-grained relationship between specific characteristics of AAE, such as word choice and grammatical features, and hate speech predictions. We further investigate these biases by removing profanity and examining the influence of four aspects of AAE grammar that are distinct from SAE. Results show that removing profanity accounts for a roughly 20 to 30% reduction in the percentage of samples classified as “hate,” “abusive,” or “offensive,” and that similar classification patterns are observed regardless of grammar categories.