Group,Modality,Type,Metaname,Suggested Evaluation,What it is evaluating,Considerations,Link,URL,Screenshots,Applicable Models,Datasets,Hashtags,Abstract,Authors
BiasEvals,Text,Model,weat,Word Embedding Association Test (WEAT),Associations in word embeddings based on the Implicit Association Test (IAT),"Although based on human associations, general societal attitudes do not always represent subgroups of people and cultures.",Semantics derived automatically from language corpora contain human-like biases,https://researchportal.bath.ac.uk/en/publications/semantics-derived-automatically-from-language-corpora-necessarily,"['Images/WEAT1.png', 'Images/WEAT2.png']",,,"['Bias', 'Word Association', 'Embeddings', 'NLP']","Artificial intelligence and machine learning are in a period of astounding growth. However, there are concerns that these technologies may be used, either with or without intention, to perpetuate the prejudice and unfairness that unfortunately characterizes many human institutions. Here we show for the first time that human-like semantic biases result from the application of standard machine learning to ordinary language—the same sort of language humans are exposed to every day. We replicate a spectrum of standard human biases as exposed by the Implicit Association Test and other well-known psychological studies. We replicate these using a widely used, purely statistical machine-learning model—namely, the GloVe word embedding—trained on a corpus of text from the Web. Our results indicate that language itself contains recoverable and accurate imprints of our historic biases, whether these are morally neutral as towards insects or flowers, problematic as towards race or gender, or even simply veridical, reflecting the status quo for the distribution of gender with respect to careers or first names. These regularities are captured by machine learning along with the rest of semantics. In addition to our empirical findings concerning language, we also contribute new methods for evaluating bias in text, the Word Embedding Association Test (WEAT) and the Word Embedding Factual Association Test (WEFAT). Our results have implications not only for AI and machine learning, but also for the fields of psychology, sociology, and human ethics, since they raise the possibility that mere exposure to everyday language can account for the biases we replicate here.","Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan"
BiasEvals,Text,Dataset,stereoset,StereoSet,Protected class stereotypes,Automating stereotype detection makes distinguishing harmful stereotypes difficult. It also raises many false positives and can flag relatively neutral associations based in fact (e.g. population x has a high proportion of lactose-intolerant people).,StereoSet: Measuring stereotypical bias in pretrained language models,https://arxiv.org/abs/2004.09456,,,,,,
BiasEvals,Text,Dataset,crowspairs,CrowS-Pairs,Protected class stereotypes,Automating stereotype detection makes distinguishing harmful stereotypes difficult. It also raises many false positives and can flag relatively neutral associations based in fact (e.g. population x has a high proportion of lactose-intolerant people).,CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models,https://arxiv.org/abs/2010.00133,,,,,,
BiasEvals,Text,Output,honest,HONEST: Measuring Hurtful Sentence Completion in Language Models,Protected class stereotypes and hurtful language,Automating stereotype detection makes distinguishing harmful stereotypes difficult. It also raises many false positives and can flag relatively neutral associations based in fact (e.g. population x has a high proportion of lactose-intolerant people).,HONEST: Measuring Hurtful Sentence Completion in Language Models,https://aclanthology.org/2021.naacl-main.191.pdf,,,,,,
BiasEvals,Image,Model,ieat,Image Embedding Association Test (iEAT),Embedding associations,"Although based on human associations, general societal attitudes do not always represent subgroups of people and cultures.",Image Representations Learned With Unsupervised Pre-Training Contain Human-like Biases,https://dl.acm.org/doi/abs/10.1145/3442188.3445932,,,,,,
BiasEvals,Image,Dataset,imagedataleak,Dataset leakage and model leakage,Gender and label bias,,Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations,https://arxiv.org/abs/1811.08489,,,,,,
BiasEvals,Image,Output,stablebias,Characterizing the variation in generated images,,,Stable Bias: Analyzing Societal Representations in Diffusion Models,https://arxiv.org/abs/2303.11408,,,,,,
BiasEvals,Image,Output,homoglyphbias,Effect of different scripts on text-to-image generation,Evaluates generated images for cultural stereotypes when prompts use different scripts (homoglyphs); measures how susceptible a model is to producing cultural stereotypes from a simple script switch,,Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis,https://arxiv.org/pdf/2209.08891.pdf,,,,,,
BiasEvals,Audio,Taxonomy,notmyvoice,Not My Voice! A Taxonomy of Ethical and Safety Harms of Speech Generators,Lists harms of audio/speech generators,Not necessarily an evaluation but a good source of taxonomy. We can use this to point readers towards high-level evaluations.,Not My Voice! A Taxonomy of Ethical and Safety Harms of Speech Generators,https://arxiv.org/pdf/2402.01708.pdf,,,,,,
BiasEvals,Video,Output,videodiversemisinfo,Diverse Misinformation: Impacts of Human Biases on Detection of Deepfakes on Networks,Human-led evaluations of deepfakes to understand susceptibility and representational harms (including political violence),"Representational harm, incitement to violence",Diverse Misinformation: Impacts of Human Biases on Detection of Deepfakes on Networks,https://arxiv.org/abs/2210.10026,,,,,,
Privacy,,,,,,,,,,,,,,
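The first BiasEvals row (weat) and its abstract describe the Word Embedding Association Test. As a rough illustration of what that metric computes, the sketch below implements a WEAT-style effect size in plain numpy. It assumes embeddings are supplied as a word-to-vector dictionary; the word lists and random vectors are illustrative placeholders, not the stimuli or GloVe vectors used in the original paper.

```python
# Minimal sketch of a WEAT-style effect size (Caliskan et al.), assuming word
# embeddings are already available as a dict mapping word -> numpy vector.
# The word lists and random vectors below are illustrative placeholders only.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B, emb):
    # s(w, A, B): mean similarity of w to attribute set A minus attribute set B
    return (np.mean([cosine(emb[w], emb[a]) for a in A])
            - np.mean([cosine(emb[w], emb[b]) for b in B]))

def weat_effect_size(X, Y, A, B, emb):
    # Difference of mean associations for the two target sets, normalized by
    # the standard deviation of associations over all target words.
    s_X = [association(x, A, B, emb) for x in X]
    s_Y = [association(y, A, B, emb) for y in Y]
    return (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y, ddof=1)

# Toy usage with random vectors standing in for trained embeddings (e.g. GloVe).
rng = np.random.default_rng(0)
X, Y = ["rose", "tulip"], ["ant", "moth"]            # target word sets
A, B = ["pleasant", "love"], ["hate", "unpleasant"]  # attribute word sets
emb = {w: rng.normal(size=50) for w in X + Y + A + B}
print(weat_effect_size(X, Y, A, B, emb))
```

An effect size near zero indicates no differential association between the two target sets; larger magnitudes are read roughly like Cohen's d, which is how the WEAT and iEAT rows above report bias.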