label_model

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

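from bertopic import BERTopic

# Load the trained topic model from the Hugging Face Hub
topic_model = BERTopic.load("davanstrien/label_model")

# One row per topic: topic ID, number of documents, and generated name
topic_model.get_topic_info()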
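
Beyond the overview table, a loaded model can be queried for a single topic or used to assign topics to new label strings. The following is a minimal sketch rather than the author's workflow: `describe_topic` and `explore` are hypothetical helper names, the example documents are made up, and `transform` assumes an embedding model is available after loading (otherwise one has to be supplied through the `embedding_model` argument of `BERTopic.load`).

```python
from typing import List, Tuple


def describe_topic(words_and_scores: List[Tuple[str, float]], top_k: int = 5) -> str:
    """Join the top_k keywords of one topic, given (word, score) pairs
    in the shape returned by BERTopic's get_topic()."""
    return " - ".join(word for word, _ in words_and_scores[:top_k])


def explore(repo_id: str = "davanstrien/label_model") -> None:
    """Load the model from the Hub and query it (needs network access)."""
    # Imported lazily so the helper above works without bertopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic.load(repo_id)

    # Keywords of a single topic, here topic 2 from the overview table.
    print(describe_topic(topic_model.get_topic(2)))

    # Assign topics to new documents (made-up label strings).
    new_docs = ["positive negative neutral", "cat dog bird"]
    topics, _ = topic_model.transform(new_docs)
    for doc, topic in zip(new_docs, topics):
        print(doc, "->", topic)
```

Calling `explore()` downloads the model, so it is left out of the block itself.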

Topic overview

  • Number of topics: 252
  • Number of training documents: 14986
Overview of all topics. Each row lists, in order: Topic ID, Topic Keywords (separated by " - "), Topic Frequency, and Label.
-1 date - city - pre - heavy - fur 5 -1_date_city_pre_heavy
0 label_1 label_2 - label_0 label_1 label_2 - label_0 label_1 - label_1 - label_2 1333 0_label_1 label_2_label_0 label_1 label_2_label_0 label_1_label_1
1 label_1 label_2 label_3 - label_3 label_4 label_5 - label_4 label_5 - label_2 label_3 label_4 - label_5 1043 1_label_1 label_2 label_3_label_3 label_4 label_5_label_4 label_5_label_2 label_3 label_4
2 negative positive - positive negative - negative - positive - target 803 2_negative positive_positive negative_negative_positive
3 loc misc org - loc misc - misc org - misc - org loc 651 3_loc misc org_loc misc_misc org_misc
4 neutral positive - neutral - positive negative - negative - positive 479 4_neutral positive_neutral_positive negative_negative
5 label_0 - - - - 357 5_label_0___
6 contradiction - entailment - neutral - ambiguous - 348 6_contradiction_entailment_neutral_ambiguous
7 label_0 - - - - 334 7_label_0___
8 99 - - - - 326 8_99___
9 label_1 label_2 label_3 - label_2 label_3 label_4 - label_3 label_4 - label_2 label_3 - label_4 300 9_label_1 label_2 label_3_label_2 label_3 label_4_label_3 label_4_label_2 label_3
10 entailment - true - child - related - non 257 10_entailment_true_child_related
11 snake - dog - bear - wolf - sea 245 11_snake_dog_bear_wolf
12 label_5 label_6 label_7 - label_6 label_7 - label_4 label_5 label_6 - label_5 label_6 - label_6 label_7 label_8 241 12_label_5 label_6 label_7_label_6 label_7_label_4 label_5 label_6_label_5 label_6
13 loc misc org - loc misc - misc org - misc - org loc 229 13_loc misc org_loc misc_misc org_misc
14 weather - transfer - alarm - text - time 228 14_weather_transfer_alarm_text
15 label_1 label_2 label_3 - label_2 label_3 - label_3 - label_1 label_2 - label_0 label_1 label_2 222 15_label_1 label_2 label_3_label_2 label_3_label_3_label_1 label_2
16 delete - different - bad - related - rel 207 16_delete_different_bad_related
17 label_12 label_13 label_14 - label_11 label_12 label_13 - label_13 label_14 - label_12 label_13 - label_10 label_11 label_12 172 17_label_12 label_13 label_14_label_11 label_12 label_13_label_13 label_14_label_12 label_13
18 - - - - 166 18____
19 loc org loc - loc org - org loc - org - loc 142 19_loc org loc_loc org_org loc_org
20 label_6 label_60 label_61 - label_60 label_61 - label_62 label_63 - label_61 label_62 label_63 - label_61 label_62 126 20_label_6 label_60 label_61_label_60 label_61_label_62 label_63_label_61 label_62 label_63
21 label_4 label_5 label_6 - label_5 label_6 - label_6 - label_1 label_2 label_3 - label_3 label_4 label_5 117 21_label_4 label_5 label_6_label_5 label_6_label_6_label_1 label_2 label_3
22 test - second - - - 106 22_test_second__
23 forest - industrial - transport - low - bamboo 104 23_forest_industrial_transport_low
24 answer - header - question - quantity - 104 24_answer_header_question_quantity
25 healthy - leaf - rust - plant - spot 103 25_healthy_leaf_rust_plant
26 left - right - stop - yes - unknown 100 26_left_right_stop_yes
27 en - na - alpha - fan - lifestyle 93 27_en_na_alpha_fan
28 label_13 label_14 label_15 - label_14 label_15 - label_15 - label_12 label_13 label_14 - label_11 label_12 label_13 92 28_label_13 label_14 label_15_label_14 label_15_label_15_label_12 label_13 label_14
29 disease - bio - disorder - healthy - 86 29_disease_bio_disorder_healthy
30 work - group - person product - product - location 86 30_work_group_person product_product
31 fear joy - sadness surprise - anger fear - joy love - surprise 82 31_fear joy_sadness surprise_anger fear_joy love
32 common - non - different - - 78 32_common_non_different_
33 dis - - - - 76 33_dis___
34 - - - - 73 34____
35 restaurant - pizza - place - salad - food 69 35_restaurant_pizza_place_salad
36 cconj det intj - adj adp adv - det intj noun - det intj - noun num pron 66 36_cconj det intj_adj adp adv_det intj noun_det intj
37 label_17 label_18 label_19 - label_18 label_19 label_2 - label_18 label_19 - label_19 label_2 - label_16 label_17 label_18 66 37_label_17 label_18 label_19_label_18 label_19 label_2_label_18 label_19_label_19 label_2
38 ll - year - related - cause - delete 65 38_ll_year_related_cause
39 anger fear - joy love - surprise - joy - love 64 39_anger fear_joy love_surprise_joy
40 true - news - partial - - 64 40_true_news_partial_
41 - - - - 63 41____
42 label_1 label_10 label_11 - label_10 label_11 - label_8 label_9 label_0 - label_7 label_8 label_9 - label_8 label_9 62 42_label_1 label_10 label_11_label_10 label_11_label_8 label_9 label_0_label_7 label_8 label_9
43 pos - neg - - - 62 43_pos_neg__
44 loc org - org - loc - date - sex 61 44_loc org_org_loc_date
45 label_19 label_2 label_20 - label_2 label_20 - label_20 - label_18 label_19 label_2 - label_18 label_19 60 45_label_19 label_2 label_20_label_2 label_20_label_20_label_18 label_19 label_2
46 event - group - person product - product - location 57 46_event_group_person product_product
47 bio - chemical - disease - effect - food 57 47_bio_chemical_disease_effect
48 234 - 19 20 21 - 20 21 22 - 22 23 24 - 23 24 57 48_234_19 20 21_20 21 22_22 23 24
49 fear happy neutral - happy neutral - fear happy - sad - happy 53 49_fear happy neutral_happy neutral_fear happy_sad
50 battery - volume - juice - chinese - korean 53 50_battery_volume_juice_chinese
51 menu - price - num - - 52 51_menu_price_num_
52 poor - ok - good - bad - great 52 52_poor_ok_good_bad
53 ll - cause - delete - unknown - 51 53_ll_cause_delete_unknown
54 hospital - unknown - en - material - digital 48 54_hospital_unknown_en_material
55 ll - cause - delete - unknown - 48 55_ll_cause_delete_unknown
56 self - question - neutral - yes - statement 48 56_self_question_neutral_yes
57 fat - loose - small - sugar - common 47 57_fat_loose_small_sugar
58 true - - - - 47 58_true___
59 cream - drinks - seafood - fruit - ice cream 46 59_cream_drinks_seafood_fruit
60 tr - ru - pers - pt - prod 46 60_tr_ru_pers_pt
61 - - - - 45 61____
62 clothing - care - kitchen - personal - health 44 62_clothing_care_kitchen_personal
63 business - news - tech - entertainment - sport 43 63_business_news_tech_entertainment
64 non - partial - neutral - yes - ok 43 64_non_partial_neutral_yes
65 organization person - location organization - organization - location - person 43 65_organization person_location organization_organization_location
66 daisy - tulip - rose - - 43 66_daisy_tulip_rose_
67 joy - sadness - anger - angry - happy 42 67_joy_sadness_anger_angry
68 samoyed - corgi - husky - pomeranian - golden 41 68_samoyed_corgi_husky_pomeranian
69 music - instrument - engine - wind - animals 41 69_music_instrument_engine_wind
70 hate - language - reporting - non - normal 41 70_hate_language_reporting_non
71 label_23 label_24 label_25 - label_24 label_25 - label_22 label_23 label_24 - label_23 label_24 - label_21 label_22 label_23 41 71_label_23 label_24 label_25_label_24 label_25_label_22 label_23 label_24_label_23 label_24
72 id - - - - 40 72_id___
73 animals - tech - dance - tiger - sport 40 73_animals_tech_dance_tiger
74 org org - loc loc - org - misc - loc 40 74_org org_loc loc_org_misc
75 star - positive - negative - negative positive - 38 75_star_positive_negative_negative positive
76 bird - ship - frog - horse - truck 37 76_bird_ship_frog_horse
77 cat - cats - dog - dogs - sleeping 37 77_cat_cats_dog_dogs
78 family - sports - music - related - health 37 78_family_sports_music_related
79 label_8 label_9 label_0 - label_9 label_0 label_1 - label_9 label_0 - label_7 label_8 label_9 - label_8 label_9 37 79_label_8 label_9 label_0_label_9 label_0 label_1_label_9 label_0_label_7 label_8 label_9
80 room - service - transport - care - kitchen 37 80_room_service_transport_care
81 positive - negative - neutral positive - neutral - positive negative 37 81_positive_negative_neutral positive_neutral
82 test - play - train - non - live 36 82_test_play_train_non
83 tim - evt - pro - gpe - org 36 83_tim_evt_pro_gpe
84 cold - disease - pressure - drug - blood 36 84_cold_disease_pressure_drug
85 non - early - late - - 35 85_non_early_late_
86 21 - office - 20 - 17 - 16 34 86_21_office_20_17
87 prep - nn - cc - pro - ex 34 87_prep_nn_cc_pro
88 evidence - position - statement - lead - request 33 88_evidence_position_statement_lead
89 adp - aux - sconj - cconj - det noun 33 89_adp_aux_sconj_cconj
90 job - start - help - address - quantity 33 90_job_start_help_address
91 gender - number - case - ind - person 33 91_gender_number_case_ind
92 threat - hate - adult - target - male 33 92_threat_hate_adult_target
93 institution - tools - organization - org - agent 32 93_institution_tools_organization_org
94 - - - - 32 94____
95 email - age - patient - state - zip 32 95_email_age_patient_state
96 mixed - positive - negative - neutral - neutral positive 32 96_mixed_positive_negative_neutral
97 test - help - joke - contact - report 32 97_test_help_joke_contact
98 address - balance - statement - request - second 31 98_address_balance_statement_request
99 - - - - 31 99____
100 hate - non - neutral - - 30 100_hate_non_neutral_
101 - - - - 30 101____
102 unk - zero - seven - 10 - blank 30 102_unk_zero_seven_10
103 male - female - young - adult - skin 30 103_male_female_young_adult
104 94 - 59 60 - 49 50 - 81 - 97 29 104_94_59 60_49 50_81
105 normal - cell - large - clean - lower 29 105_normal_cell_large_clean
106 lincoln - jaguar - audio - source - general 28 106_lincoln_jaguar_audio_source
107 title - section - header - list - item 28 107_title_section_header_list
108 - - - - 28 108____
109 yes - - - - 27 109_yes___
110 - - - - 26 110____
111 contradiction - entailment - neutral - non - 26 111_contradiction_entailment_neutral_non
112 instrument - org org - org org org - term - org 26 112_instrument_org org_org org org_term
113 ft - cardinal - act - loc - loc loc 25 113_ft_cardinal_act_loc
114 event - pro - pers - loc org - prod 25 114_event_pro_pers_loc org
115 ben - ext - exp - root - loc 25 115_ben_ext_exp_root
116 - - - - 25 116____
117 low - - - - 25 117_low___
118 ft - cardinal - act - loc - loc misc org 25 118_ft_cardinal_act_loc
119 statement - question - evidence - experience - answer 25 119_statement_question_evidence_experience
120 label_122 - label_121 - label_120 - label_123 - label_119 24 120_label_122_label_121_label_120_label_123
121 clean - - - - 24 121_clean___
122 ru - tr - el - en - hi 24 122_ru_tr_el_en
123 disgust - sadness surprise - joy love - surprise - joy 24 123_disgust_sadness surprise_joy love_surprise
124 statement - info - check - news - non 24 124_statement_info_check_news
125 motor - start - help - housing - yes 24 125_motor_start_help_housing
126 greek - chinese - italian - japanese - dutch 24 126_greek_chinese_italian_japanese
127 anger disgust - fear - disgust - sadness - anger 23 127_anger disgust_fear_disgust_sadness
128 date event - percent person - quantity - money - percent 23 128_date event_percent person_quantity_money
129 label_95 label_96 label_97 - label_97 label_98 label_99 - label_97 label_98 - label_94 label_95 label_96 - label_94 label_95 23 129_label_95 label_96 label_97_label_97 label_98 label_99_label_97 label_98_label_94 label_95 label_96
130 period - question - noun - number - 23 130_period_question_noun_number
131 neutral - - - - 22 131_neutral___
132 local - la - pad - data - personal 22 132_local_la_pad_data
133 partial - - - - 22 133_partial___
134 human - art - machine - - 22 134_human_art_machine_
135 fear joy - sadness surprise - surprise - disgust fear - joy 21 135_fear joy_sadness surprise_surprise_disgust fear
136 location organization - organization person - organization - price - disease 21 136_location organization_organization person_organization_price
137 14 15 16 - 12 13 14 - 13 14 15 - 11 12 13 - 10 11 12 21 137_14 15 16_12 13 14_13 14 15_11 12 13
138 sports - tech - business - sport - 21 138_sports_tech_business_sport
139 disorder - body - patient - age - disease 20 139_disorder_body_patient_age
140 sad - dis - sur - joy - 20 140_sad_dis_sur_joy
141 healthy - - - - 20 141_healthy___
142 drink - tea - wine - coffee - soft 20 142_drink_tea_wine_coffee
143 protein - chemical - cell - - 20 143_protein_chemical_cell_
144 rna - - - - 20 144_rna___
145 normal - covid - - - 20 145_normal_covid__
146 ex - pt - - - 20 146_ex_pt__
147 ok - ft - year - int - rel 20 147_ok_ft_year_int
148 header - currency - item - zip - state 20 148_header_currency_item_zip
149 label_122 label_123 - label_123 - label_122 - label_121 - label_120 19 149_label_122 label_123_label_123_label_122_label_121
150 anger disgust - anger disgust fear - disgust fear - disgust - sadness surprise 19 150_anger disgust_anger disgust fear_disgust fear_disgust
151 na - nn - ft - dis - bio 19 151_na_nn_ft_dis
152 angry - happy - sad - happy neutral - neutral 19 152_angry_happy_sad_happy neutral
153 organization percent person - organization percent - miscellaneous - percent person - percent 19 153_organization percent person_organization percent_miscellaneous_percent person
154 paper - metal - glass - tray - ticket 19 154_paper_metal_glass_tray
155 mask - normal - sharp - head - green 19 155_mask_normal_sharp_head
156 noun num pron - num pron propn - pron propn punct - num pron - adj adp adv 18 156_noun num pron_num pron propn_pron propn punct_num pron
157 answer - - - - 18 157_answer___
158 review - id - job - email - state 18 158_review_id_job_email
159 seven - queen - jack - king - war 18 159_seven_queen_jack_king
160 neg - nan - good - - 18 160_neg_nan_good_
161 ii - blank - vi - et - lower 18 161_ii_blank_vi_et
162 golden - husky - samoyed - pug - german 17 162_golden_husky_samoyed_pug
163 arg - delete - act - neg - lead 17 163_arg_delete_act_neg
164 exp - pp - intj - punc - prep 17 164_exp_pp_intj_punc
165 email - form - letter - report - news 17 165_email_form_letter_report
166 protein - rna - cell - line - type 17 166_protein_rna_cell_line
167 en - hi - fur - - 17 167_en_hi_fur_
168 - - - - 17 168____
169 - - - - 17 169____
170 loc loc - loc - pers - evt - 16 170_loc loc_loc_pers_evt
171 menu - - - - 16 171_menu___
172 normal - - - - 16 172_normal___
173 label_122 label_123 - label_97 label_98 label_99 - label_97 label_98 - label_96 label_97 label_98 - label_98 label_99 16 173_label_122 label_123_label_97 label_98 label_99_label_97 label_98_label_96 label_97 label_98
174 cell - organ - organism - tissue - disease 16 174_cell_organ_organism_tissue
175 target - instrument - opinion - price - product 16 175_target_instrument_opinion_price
176 org org - org org org - loc loc - org - prs 16 176_org org_org org org_loc loc_org
177 10 11 - 10 11 12 - 11 12 - 12 - 11 16 177_10 11_10 11 12_11 12_12
178 korean - russian - dutch - persian - french 16 178_korean_russian_dutch_persian
179 label_4 label_40 label_41 - label_39 label_4 label_40 - label_38 label_39 label_4 - label_37 label_38 label_39 - label_40 label_41 16 179_label_4 label_40 label_41_label_39 label_4 label_40_label_38 label_39 label_4_label_37 label_38 label_39
180 experience - location - loc misc org - loc misc - misc org 15 180_experience_location_loc misc org_loc misc
181 normal - pressure - high - water - 15 181_normal_pressure_high_water
182 company - institution - loc org - degree - org 15 182_company_institution_loc org_degree
183 short - sl - long - - 15 183_short_sl_long_
184 good - bad - non - - 15 184_good_bad_non_
185 149 - 151 - 191 - 199 - 231 15 185_149_151_191_199
186 unknown - vi - ii - - 15 186_unknown_vi_ii_
187 end - head - cross - - 15 187_end_head_cross_
188 forest - street - road - tree - mountain 15 188_forest_street_road_tree
189 label_7 label_8 label_9 - label_8 label_9 - label_0 label_1 label_10 - label_1 label_10 - label_10 14 189_label_7 label_8 label_9_label_8 label_9_label_0 label_1 label_10_label_1 label_10
190 prod - loc - evt - org org - loc loc 14 190_prod_loc_evt_org org
191 tech - business - sports - science - female 14 191_tech_business_sports_science
192 adult - child - young - - 14 192_adult_child_young_
193 human - organism - plants - - 14 193_human_organism_plants_
194 hot dog - chicken - hot - food - dog 14 194_hot dog_chicken_hot_food
195 rain - snow - - - 14 195_rain_snow__
196 objective - neutral - - - 14 196_objective_neutral__
197 pro - neutral - russian - attack - 14 197_pro_neutral_russian_attack
198 normal - disorder - good - - 14 198_normal_disorder_good_
199 road - good - bike - - 14 199_road_good_bike_
200 - - - - 14 200____
201 science - energy - arts - nuclear - systems 13 201_science_energy_arts_nuclear
202 - - - - 13 202____
203 event - ticket - ok - loose - non 13 203_event_ticket_ok_loose
204 neutral - left - right - unknown - 13 204_neutral_left_right_unknown
205 - - - - 13 205____
206 crime - pers - time - book - org 13 206_crime_pers_time_book
207 seven - start - record - zero - open 13 207_seven_start_record_zero
208 label_5 label_50 label_51 - label_50 label_51 label_52 - label_51 label_52 label_53 - label_51 label_52 - label_50 label_51 13 208_label_5 label_50 label_51_label_50 label_51 label_52_label_51 label_52 label_53_label_51 label_52
209 label_29 label_3 label_30 - label_26 label_27 label_28 - label_27 label_28 label_29 - label_27 label_28 - label_28 label_29 label_3 13 209_label_29 label_3 label_30_label_26 label_27 label_28_label_27 label_28 label_29_label_27 label_28
210 human - machine - - - 13 210_human_machine__
211 control - la - sin - social - ambient 13 211_control_la_sin_social
212 anger fear - sadness - anger - fear - fear joy 13 212_anger fear_sadness_anger_fear
213 panda - ticket - air - bamboo - el 13 213_panda_ticket_air_bamboo
214 target - - - - 13 214_target___
215 id - container - type - person - number 12 215_id_container_type_person
216 neutral - positive - negative - neutral positive - positive negative 12 216_neutral_positive_negative_neutral positive
217 change - bad - movement - work - science 12 217_change_bad_movement_work
218 rust - - - - 12 218_rust___
219 quantity - container - package - id - weight 12 219_quantity_container_package_id
220 text - - - - 12 220_text___
221 background - objective - - - 12 221_background_objective__
222 middle - subject - yes - request - answer 12 222_middle_subject_yes_request
223 - - - - 12 223____
224 public - ambiguous - non - person - 12 224_public_ambiguous_non_person
225 healthy - plant - pepper - spot - leaf 12 225_healthy_plant_pepper_spot
226 punc - prep - digit - latin - conj 12 226_punc_prep_digit_latin
227 location money - language - percent person - actor - money 12 227_location money_language_percent person_actor
228 - - - - 11 228____
229 punc - zero - pers - neg - reflex 11 229_punc_zero_pers_neg
230 album - major - copper - coon - common 11 230_album_major_copper_coon
231 metal - pop - country - dance - hip 11 231_metal_pop_country_dance
232 energy - common - grass - persian - removal 11 232_energy_common_grass_persian
233 man - double - bird - long - single 11 233_man_double_bird_long
234 17 - 16 - 18 - 13 - 15 11 234_17_16_18_13
235 email - actor - threat - tools - attack 11 235_email_actor_threat_tools
236 space - - - - 11 236_space___
237 type - country - jeep - van - lincoln 11 237_type_country_jeep_van
238 general - - - - 10 238_general___
239 ru - mat - - - 10 239_ru_mat__
240 contradiction - non - entailment - neutral - 10 240_contradiction_non_entailment_neutral
241 city - new - country - location - label_1 10 241_city_new_country_location
242 non - legal - sub - - 9 242_non_legal_sub_
243 tulip - cattle - motorcycle - road - color 8 243_tulip_cattle_motorcycle_road
244 item - color - cc - model - 8 244_item_color_cc_model
245 delivery - product - service - different - environment 7 245_delivery_product_service_different
246 degree - tim - neg - pos - propn 6 246_degree_tim_neg_pos
247 threat - hate - non - unknown - neutral 6 247_threat_hate_non_unknown
248 label_33 label_34 - label_32 label_33 label_34 - label_32 label_33 - label_31 label_32 label_33 - label_31 label_32 6 248_label_33 label_34_label_32 label_33 label_34_label_32 label_33_label_31 label_32 label_33
249 experience - location - - - 6 249_experience_location__
250 nat - gpe - geo - pro - tim 5 250_nat_gpe_geo_pro
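
In the rows above, the Label column appears to follow from the other columns: the topic ID, then the first four keywords, joined with underscores. The sketch below reproduces that pattern for three rows copied from the table. `TopicRow` and `build_label` are illustrative names, not part of BERTopic.

```python
from typing import List, NamedTuple


class TopicRow(NamedTuple):
    topic_id: int
    keywords: List[str]  # empty strings stand for blank keyword slots
    frequency: int


def build_label(row: TopicRow, n_words: int = 4) -> str:
    """Topic ID plus the first n_words keywords, joined by underscores."""
    return f"{row.topic_id}_" + "_".join(row.keywords[:n_words])


# Three rows copied from the overview table.
rows = [
    TopicRow(2, ["negative positive", "positive negative", "negative", "positive", "target"], 803),
    TopicRow(5, ["label_0", "", "", "", ""], 357),
    TopicRow(11, ["snake", "dog", "bear", "wolf", "sea"], 245),
]

for row in rows:
    print(build_label(row))
```

Because keywords such as `label_0` contain underscores themselves, a label cannot be split back into its keywords reliably; the keyword column is the safer source.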

Training hyperparameters

  • calculate_probabilities: False
  • language: None
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
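
For reference, the hyperparameters listed above map directly onto `BERTopic` constructor arguments. This is a hedged sketch, not the original training script: the card does not name the embedding model, so the `all-MiniLM-L6-v2` default here is an assumption, and `build_topic_model` is a hypothetical helper. The multi-word keywords in the topic table suggest a custom vectorizer may also have been used, which the card does not record.

```python
# Values copied from the "Training hyperparameters" list.
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
}


def build_topic_model(embedding_model="all-MiniLM-L6-v2"):
    """Create an untrained BERTopic instance with the card's settings.

    The embedding model is an assumption; the card does not state it.
    """
    # Imported lazily so the settings can be inspected without bertopic.
    from bertopic import BERTopic

    return BERTopic(embedding_model=embedding_model, **HYPERPARAMETERS)
```

A model built this way still has to be fitted with `fit_transform` on the training documents, which are not part of this card.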

Framework versions

  • Numpy: 1.22.4
  • HDBSCAN: 0.8.29
  • UMAP: 0.5.3
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.29.2
  • Numba: 0.56.4
  • Plotly: 5.13.1
  • Python: 3.10.11
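
To check whether a local environment matches the versions listed above, something like the following sketch can help. The mapping from the names in the list to PyPI package names (for example UMAP to `umap-learn`) is filled in here as an assumption, and `installed_version` and `version_mismatches` are hypothetical helpers.

```python
from importlib import metadata
from typing import Callable, Dict, Tuple

# Versions from the "Framework versions" list, keyed by PyPI package name.
CARD_VERSIONS: Dict[str, str] = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.29",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.29.2",
    "numba": "0.56.4",
    "plotly": "5.13.1",
}


def installed_version(package: str) -> str:
    """Return the installed version of a package, or a marker if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


def version_mismatches(
    expected: Dict[str, str],
    lookup: Callable[[str], str] = installed_version,
) -> Dict[str, Tuple[str, str]]:
    """Map each package whose version differs to (expected, found)."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


for package, (wanted, found) in version_mismatches(CARD_VERSIONS).items():
    print(f"{package}: card lists {wanted}, environment has {found}")
```

A mismatch does not necessarily prevent the model from loading; it only flags where the environment differs from the one used for training.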