Stereotypes in ChatGPT: an empirical study
Abstract
ChatGPT is rapidly gaining interest and attracting many researchers, practitioners, and users thanks to its availability, potential, and capabilities. Nevertheless, several voices and studies point out its flaws, such as hallucinations, factually incorrect statements, and the potential to promote harmful social biases. Harmful social biases, the focus of this contribution, may result in unfair treatment of, or discrimination against, (members of) a social group. This paper aims to gain insight into the social biases embedded in ChatGPT's language models. To this end, we study the stereotypical behavior of ChatGPT. Stereotypes associate specific characteristics with groups and are closely related to social biases. The study is empirical and systematic: about 2,300 stereotypical probes in 6 formats (e.g., questions and statements) and across 9 social group categories (e.g., age, country, and profession) are posed to ChatGPT. Every probe is a stereotypical question or statement in which a word is masked, and ChatGPT is asked to fill in the masked word. Subsequently, as part of our analysis, we map ChatGPT's suggestions to positive and negative sentiments to obtain a measure of the stereotypical behavior of a ChatGPT language model. We observe that ChatGPT's stereotypical behavior differs per social group category: for some categories the average sentiment is largely positive (e.g., religion), while for others it is negative (e.g., the political category). Furthermore, our work empirically confirms previous claims that the format of a probe affects the sentiment of ChatGPT's stereotypical output. Our results can be used by practitioners and policy makers to devise societal interventions that change the image of a category or social group as captured in ChatGPT's language model(s), and/or to decide how to appropriately influence the stereotypical behavior of such language models.
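
To make the probe-and-score idea from the abstract concrete, the following is a minimal sketch (not the authors' actual pipeline): a masked stereotypical probe is sent to the model, and the returned word is mapped to a sentiment score. The `query_chatgpt` stub, the probe template, and the tiny sentiment lexicon are all hypothetical placeholders assumed for illustration.

```python
# Minimal, hypothetical sketch of the probing-and-scoring idea described above.
# All names below (query_chatgpt, the probe template, the lexicon) are assumptions,
# not the authors' implementation.

def query_chatgpt(prompt: str) -> str:
    """Stub standing in for an actual ChatGPT API call; replace with a real client."""
    return "hardworking"  # placeholder completion for the masked word

# Example stereotypical probe in a "statement" format with a masked word.
probe = "People from country X are [MASK]."

# Tiny illustrative sentiment lexicon; a real study would use a full lexicon or classifier.
SENTIMENT = {"hardworking": +1, "friendly": +1, "lazy": -1, "dishonest": -1}

def score_probe(probe: str) -> int:
    """Ask the model to fill the masked word and map its answer to a sentiment score."""
    answer = query_chatgpt(probe.replace("[MASK]", "___")).strip().lower()
    return SENTIMENT.get(answer, 0)  # 0 = neutral / unknown word

if __name__ == "__main__":
    print(score_probe(probe))  # e.g., +1 for a positive completion
```

Aggregating such scores over many probes per social group category would yield the per-category average sentiment discussed in the abstract.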