Researchers developed SGHateCheck, the first functional test suite specifically tailored to evaluating hate speech detection models in the multilingual environments of Singapore and the broader Southeast Asian region.
The internet, and particularly social media, has grown exponentially over the past few decades. The nature of social media allows anyone to go online and create content they find interesting, whether appropriate or not. One form of inappropriate content is hate speech—offensive or threatening speech targeting people based on their ethnicity, religion, sexual orientation, and the like.
Hate speech detection models are computational systems that identify and classify online comments as hate speech. “These models are crucial in moderating online content and mitigating the spread of harmful speech, particularly on social media,” said Assistant Professor Roy Lee from the Singapore University of Technology and Design (SUTD). Evaluating the performance of hate speech detection models is important, but traditional evaluation using held-out test sets often fails to properly assess the model’s performance due to inherent bias within the datasets.
To overcome this limitation, HateCheck and Multilingual HateCheck (MHC) were introduced as functional tests that capture the complexity and diversity of hate speech by simulating real-world scenarios. In their paper “SGHateCheck: Functional tests for detecting hate speech in low-resource languages of Singapore”, Asst Prof Lee and his team build on the frameworks of HateCheck and MHC to develop SGHateCheck, an artificial intelligence (AI)-powered tool that can distinguish between hateful and non-hateful comments in the specific context of Singapore and Southeast Asia.
Creating an evaluation tool specifically for the region’s linguistic and cultural context was necessary. This is because current hate speech detection models and datasets are mostly based on Western contexts, which do not accurately represent specific social dynamics and issues in Southeast Asia. “SGHateCheck aims to address these gaps by providing functional tests tailored to the region’s specific needs, ensuring more accurate and culturally sensitive detection of hate speech,” said Asst Prof Lee.
Unlike HateCheck and MHC, SGHateCheck uses large language models (LLMs) to translate and paraphrase test cases into Singapore’s four main languages—English, Mandarin, Tamil, and Malay. Native annotators then refine these test cases to ensure cultural relevance and accuracy. The end result is over 11,000 test cases meticulously annotated as hateful or non-hateful, providing a more nuanced platform for evaluating hate speech detection models.
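To illustrate how a HateCheck-style functional test suite scores a model, here is a minimal sketch. The test cases, functionality names, and toy classifier below are invented for illustration and are not drawn from the SGHateCheck dataset; the core idea—grouping labeled cases by functionality and reporting a pass rate per functionality—is what such suites share.

```python
from collections import defaultdict

# Hypothetical miniature test suite in the spirit of SGHateCheck:
# each case has text, a language code, a functionality name, and a gold label.
test_cases = [
    {"text": "I hate all [GROUP].", "lang": "en", "func": "derogation", "gold": "hateful"},
    {"text": "I hate mondays.", "lang": "en", "func": "neutral_hate_word", "gold": "non-hateful"},
    {"text": "[GROUP] people are wonderful.", "lang": "ms", "func": "positive_mention", "gold": "non-hateful"},
]

def toy_model(text):
    """Stand-in classifier: naively flags any text containing the word 'hate'."""
    return "hateful" if "hate" in text.lower() else "non-hateful"

def pass_rates(cases, model):
    """Accuracy per functional test, the core metric of a HateCheck-style suite."""
    hits, totals = defaultdict(int), defaultdict(int)
    for case in cases:
        totals[case["func"]] += 1
        if model(case["text"]) == case["gold"]:
            hits[case["func"]] += 1
    return {func: hits[func] / totals[func] for func in totals}

print(pass_rates(test_cases, toy_model))
# The toy model passes 'derogation' and 'positive_mention' but fails
# 'neutral_hate_word', exposing a weakness that aggregate accuracy would hide.
```

Reporting results per functionality, rather than as one overall score, is precisely what lets a functional test reveal where a model breaks down.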
Moreover, while MHC includes many languages, it does not have the same level of regional specificity as SGHateCheck. A comprehensive list of functional tests tailored to the region’s distinct linguistic features (for example, Singlish) paired with expert guidance ensures that SGHateCheck tests are useful and relevant. “This regional focus allows SGHateCheck to more accurately capture and evaluate the manifestations of hate speech that may not be adequately addressed by broader, more general frameworks,” emphasized Asst Prof Lee.
The team also found that LLMs trained on monolingual datasets are often biased towards non-hateful classifications. On the other hand, LLMs trained on multilingual datasets have a more balanced performance and can more accurately detect hate speech across various languages due to their exposure to a broader range of language expressions and cultural contexts. This underscores the importance of including culturally diverse and multilingual training data for applications in multilingual regions.
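A bias towards non-hateful classifications shows up as a gap in per-class recall: the model recovers most non-hateful cases but misses many hateful ones. The small sketch below, with invented labels rather than any real SGHateCheck results, shows one simple way to surface such an imbalance.

```python
def per_class_recall(golds, preds):
    """Recall for each label; a large gap between labels signals a biased classifier."""
    recall = {}
    for label in set(golds):
        indices = [i for i, gold in enumerate(golds) if gold == label]
        recall[label] = sum(preds[i] == label for i in indices) / len(indices)
    return recall

# Hypothetical predictions from a model biased towards 'non-hateful':
# it misses two of the three hateful cases but catches all non-hateful ones.
golds = ["hateful", "hateful", "hateful", "non-hateful", "non-hateful"]
preds = ["non-hateful", "hateful", "non-hateful", "non-hateful", "non-hateful"]

print(per_class_recall(golds, preds))
```

Here recall on the hateful class is only one in three while recall on the non-hateful class is perfect, the kind of asymmetry a held-out accuracy number alone can easily mask.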
SGHateCheck was specifically developed to solve a real-world issue in Southeast Asia. It is poised to play a significant role in enhancing the detection and moderation of hate speech in online environments in the region, helping to foster a more respectful and inclusive online space. Social media, online forums and community platforms, and news and media websites are just some of the many areas where the implementation of SGHateCheck will be valuable.
A content moderation application built on SGHateCheck is already among Asst Prof Lee’s future plans. He also aims to expand SGHateCheck to include other Southeast Asian languages such as Thai and Vietnamese.
SGHateCheck demonstrates how SUTD’s ethos of integrating cutting-edge technological advancements with thoughtful design principles can lead to impactful real-world solutions. Through the use of design, AI, and technology, SGHateCheck was developed to analyze local languages and social dynamics in order to meet a specific societal need.
“By focusing on creating a hate speech detection tool that is not only technologically sophisticated but also culturally sensitive, the study underscores the importance of a human-centered approach in technological research and development,” said Asst Prof Lee.