bertopic_first

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

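from bertopic import BERTopic

# Load the trained model from the Hugging Face Hub
topic_model = BERTopic.load("DobreMihai/bertopic_first")

# Show one row per topic with its ID, document count, and label
topic_model.get_topic_info()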
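
A loaded model can also assign topics to new documents. The following is a minimal sketch, assuming the model has been loaded as shown; the helper name `assign_topics` is hypothetical and not part of this card. It relies only on `BERTopic.transform`, which returns the predicted topic IDs together with probabilities.

```python
def assign_topics(topic_model, docs):
    """Return (document, topic_id) pairs for new documents.

    `topic_model` is expected to be a loaded BERTopic model; only its
    `transform` method is used here.
    """
    # transform returns (topic IDs, probabilities); probabilities are ignored
    topics, _probs = topic_model.transform(docs)
    # Cast to plain ints so the result does not depend on NumPy types
    return list(zip(docs, (int(t) for t in topics)))
```

For example, `assign_topics(topic_model, ["The alarm is very loud"])` would return a single pair; a topic ID of -1 marks a document treated as an outlier.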

Topic overview

  • Number of topics: 50
  • Number of training documents: 24020
Topic ID | Topic Keywords | Topic Frequency | Label
-1 | it - be - to - the - alarm | 10 | -1_it_be_to_the
0 | app - math - the - alarm - to | 9997 | 0_app_math_the_alarm
1 | alarm - snooze - not - be - it | 5678 | 1_alarm_snooze_not_be
2 | subscription - - - - | 1906 | 2_subscription___
3 | picture - photo - take - the - emergency | 1326 | 3_picture_photo_take_the
4 | wake - up - it - annoying - love | 603 | 4_wake_up_it_annoying
5 | snooze - - - - | 293 | 5_snooze___
6 | loud - - - - | 288 | 6_loud___
7 | barcode - scan - code - scanner - qr | 285 | 7_barcode_scan_code_scanner
8 | mission - the - be - you - to | 282 | 8_mission_the_be_you
9 | easy - use - simple - very - and | 279 | 9_easy_use_simple_very
10 | loud - - - - | 256 | 10_loud___
11 | help - up - early - wake - helpful | 246 | 11_help_up_early_wake
12 | ring - not - do - it - sometimes | 243 | 12_ring_not_do_it
13 | app - work - open - not - phone | 195 | 13_app_work_open_not
14 | ads - - - - | 179 | 14_ads___
15 | volume - vibrate - mute - sound - vibration | 176 | 15_volume_vibrate_mute_sound
16 | work - use - great - easy - well | 170 | 16_work_use_great_easy
17 | star - give - it - because - five | 162 | 17_star_give_it_because
18 | prevent - phone - off - power - switch | 140 | 18_prevent_phone_off_power
19 | loud - - - - | 130 | 19_loud___
20 | work - off - sometimes - not - go | 128 | 20_work_off_sometimes_not
21 | music - song - spotify - own - file | 127 | 21_music_song_spotify_own
22 | weather - - - - | 110 | 22_weather___
23 | annoying - job - but - work - it | 88 | 23_annoying_job_but_work
24 | student - helpful - for - useful - very | 73 | 24_student_helpful_for_useful
25 | perfect - word - good - amazing - be | 66 | 25_perfect_word_good_amazing
26 | reliable - easy - dependable - use - and | 58 | 26_reliable_easy_dependable_use
27 | minute - 10 - set - scroll - add | 53 | 27_minute_10_set_scroll
28 | aap - very - this - student - good | 50 | 28_aap_very_this_student
29 | android - 10 - work - update - support | 45 | 29_android_10_work_update
30 | reliable - alarm - very - sometimes - not | 40 | 30_reliable_alarm_very_sometimes
31 | alarmy - thank - premium - wake - much | 37 | 31_alarmy_thank_premium_wake
32 | application - student - very - excellent - study | 31 | 32_application_student_very_excellent
33 | exit - message - quote - love - smile | 28 | 33_exit_message_quote_love
34 | paywall - behind - lock - feature - real | 25 | 34_paywall_behind_lock_feature
35 | mb - space - storage - size - take | 22 | 35_mb_space_storage_size
36 | easy - set - setup - up - to | 21 | 36_easy_set_setup_up
37 | squat - premium - the - mission - do | 21 | 37_squat_premium_the_mission
38 | overheat - hot - phone - heat - run | 19 | 38_overheat_hot_phone_heat
39 | star - give - deserve - it - scarey | 18 | 39_star_give_deserve_it
40 | add - instal - pretty - plaster - interfaceit | 16 | 40_add_instal_pretty_plaster
41 | uninstall - deactivate - logo - not - let | 14 | 41_uninstall_deactivate_logo_not
42 | develper - legend - describe - word - clear | 13 | 42_develper_legend_describe_word
43 | team - thank - alarmy - lot - help | 13 | 43_team_thank_alarmy_lot
44 | update - ui - the - version - new | 13 | 44_update_ui_the_version
45 | love - get - perfect - time - thay | 13 | 45_love_get_perfect_time
46 | scare - medication - it - weapon - secret | 12 | 46_scare_medication_it_weapon
47 | accurate - dependable - 8n - fashion - ti | 11 | 47_accurate_dependable_8n_fashion
48 | procrastinator - exms - help - anoye - hv | 11 | 48_procrastinator_exms_help_anoye
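
The topic frequencies in the table add up to the 24020 training documents, so each topic's share of the corpus follows by simple division. A small sketch using the outlier topic and the four largest topics; the names `TOTAL_DOCS`, `frequencies`, and `topic_share` are illustrative.

```python
# Frequencies copied from the table (outlier topic -1 and the four largest topics)
TOTAL_DOCS = 24020
frequencies = {-1: 10, 0: 9997, 1: 5678, 2: 1906, 3: 1326}


def topic_share(count, total=TOTAL_DOCS):
    """Fraction of training documents assigned to a topic."""
    return count / total


for topic_id, count in frequencies.items():
    print(f"topic {topic_id}: {topic_share(count):.1%}")
```

Topics 0 and 1 together account for roughly 65% of the documents, while the outlier topic -1 holds only 10 documents.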

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 50
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
  • zeroshot_min_similarity: 0.85
  • zeroshot_topic_list: ['android', 'premium*', 'ads', 'math', 'subscription', 'update', 'camera', 'shake', 'weather', 'snooze', 'loud', 'doesn', 'off']
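
The hyperparameters listed above correspond to keyword arguments of the `BERTopic` constructor. The sketch below shows how a comparable model could be configured. The embedding model, the UMAP and HDBSCAN settings, and the training data are not stated in this card, so this is not a reproduction of the original training run. The names `HYPERPARAMS` and `build_model` are illustrative.

```python
# Values copied from the hyperparameter list in this card
HYPERPARAMS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 50,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.85,
    "zeroshot_topic_list": [
        "android", "premium*", "ads", "math", "subscription", "update",
        "camera", "shake", "weather", "snooze", "loud", "doesn", "off",
    ],
}


def build_model():
    """Create an unfitted BERTopic model with the settings above."""
    # Imported lazily so the configuration can be inspected without BERTopic installed
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMS)
```

Calling `build_model().fit_transform(docs)` on a list of document strings would then train the model and return the topic assignments.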

Framework versions

  • Numpy: 1.26.4
  • HDBSCAN: 0.8.38.post1
  • UMAP: 0.5.6
  • Pandas: 2.2.1
  • Scikit-Learn: 1.5.1
  • Sentence-transformers: 3.0.1
  • Transformers: 4.44.0
  • Numba: 0.60.0
  • Plotly: 5.23.0
  • Python: 3.10.14
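
To compare a local environment against the versions listed above, the installed package versions can be read with the standard library. A minimal sketch; the distribution names in `EXPECTED` are assumptions inferred from the list (for example `umap-learn` for UMAP), and the helper name `version_mismatches` is illustrative.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the list above; keys are assumed PyPI distribution names
EXPECTED = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.38.post1",
    "umap-learn": "0.5.6",
    "pandas": "2.2.1",
    "scikit-learn": "1.5.1",
    "sentence-transformers": "3.0.1",
    "transformers": "4.44.0",
    "numba": "0.60.0",
    "plotly": "5.23.0",
}


def version_mismatches(expected=EXPECTED, lookup=version):
    """Map each package to (expected, installed) where the two differ.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

An empty result from `version_mismatches()` means every listed package is installed at the listed version.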