File size: 2,565 Bytes
3509cde
 
 
 
 
 
 
 
 
 
 
 
 
16eebac
0d35883
 
3509cde
105d2bc
 
 
 
31beafd
105d2bc
16eebac
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
0d35883
 
 
 
b908691
31beafd
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
---
language: 
  - "hn"
  - "bn"
  - "mn"
tags:
- Text Classification

license: 	wtfpl
datasets:
- ComMA
metrics:
- F1-Score
widget:
- text: "but who in the holy hell says to relate with it,or inspired by it😂😂,i'm a 23 yr old student,and i say it's wrong,watch for entertainment purpose,and those who get inspired by such movies,its their mental problem.and all the praise that shahid's getting is for dark charachter that he portrays.and those sittis she's talking abt,don't we hear those when a villian arrives on [screen.my](http://screen.my/) point is bash sexism,whether it's by a man or a group of woman.and as far as i remember,those girls were not shown as dark characters,as kabir singh is🙂"
- text: "सही है, बोलने के अधिकार पर गाली दो, parotest के अधिकार पर पुलिश का सर फोड़ो ,मादरचोदो अधिकारो का कब सही इस्तेमाल करोगें🐷🐷🐷😠😠😠🖕"
---
# Automatic Identification of Gender Bias in Hindi,Bengali,Meitei Codemixed Texts

This is a XLM-Align-Base model trained on CoMMA dataset of 12k samples

-  This is an extension work from our previous paper: [Hypers at ComMA@ICON: Modelling Aggressiveness, Gender Bias and Communal Bias Identification](https://arxiv.org/abs/2112.15417).

## Example Usage

```python
import torch
import numpy as np
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
from transformers import set_seed

set_seed(425)

text = "some gender biased text"
pipe = pipeline("text-classification", model="seanbenhur/MuLTiGENBiaS")


def predict_pipe(text):
    prediction = pipe(text, return_all_scores=True)[0]
    return prediction


if __name__ == "__main__":
  target = predict_pipe(text)
  print(target)
  
 ```
 
 
 ### Some concerns
 - Note: The model is trained on relatively lower samples (i.e 12k) but with mix of four languages Hindi, Bengali, Meitei, and English. It contains both native on codemixed scripts, So the model might perform poorly on many text samples and might not generalize well.
 
 
 ## Bibtex
 
```
 @article{Benhur2021HypersAC,
  title={Hypers at ComMA@ICON: Modelling Aggressiveness, Gender Bias and Communal Bias Identification},
  author={Sean Benhur and Roshan Nayak and Kanchana Sivanraju and Adeep Hande and Subalalitha Chinnaudayar Navaneethakrishnan and Ruba Priyadharshini and Bharathi Raja Chakravarthi6},
  journal={ArXiv},
  year={2021},
  volume={abs/2112.15417}
}
```