BERTopic-booksum-ngram1-sentence-t5-xl-chapter
This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
Usage
To use this model, please install BERTopic:
pip install -U bertopic safetensors
You can use the model as follows:
from bertopic import BERTopic
topic_model = BERTopic.load("pszemraj/BERTopic-booksum-ngram1-sentence-t5-xl-chapter")
topic_model.get_topic_info()
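Beyond inspecting the topic table, a loaded model can assign topics to new text. The helper below is a minimal sketch, not part of this model card: `assign_topics` is a hypothetical name, and it only assumes the object passed in exposes BERTopic's `transform` method, which returns topic IDs and probabilities. Running it against the real model also assumes the embedding backend stored with the model can be downloaded.

```python
from typing import Any, Iterable, List, Tuple


def assign_topics(topic_model: Any, docs: Iterable[str]) -> List[Tuple[str, int]]:
    """Pair each document with the topic ID predicted by a fitted BERTopic model."""
    docs = list(docs)
    # BERTopic.transform returns (topics, probabilities); only the IDs are used here.
    topics, _probabilities = topic_model.transform(docs)
    # Topic -1 is BERTopic's outlier bucket for documents that fit no topic.
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]


# Hypothetical usage (downloads the model and its embedding backend):
# from bertopic import BERTopic
# topic_model = BERTopic.load("pszemraj/BERTopic-booksum-ngram1-sentence-t5-xl-chapter")
# for doc, topic_id in assign_topics(topic_model, ["Holmes examined the footprints on the moor."]):
#     print(topic_id, doc)
```

Keeping the model as a parameter makes the helper easy to try with any fitted BERTopic instance.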
Topic overview
- Number of topics: 138
- Number of training documents: 70840
Overview of all topics:
Topic ID | Topic Keywords | Topic Frequency | Label
---|---|---|---
-1 | were - her - was - had - she | 30 | -1_were_her_was_had |
0 | were - had - was - could - miss | 28715 | 0_were_had_was_could |
1 | artagnan - athos - musketeers - porthos - treville | 16916 | 1_artagnan_athos_musketeers_porthos |
2 | rama - ravan - brahma - lakshman - raghu | 4563 | 2_rama_ravan_brahma_lakshman |
3 | were - canoe - hist - huron - hutter | 1268 | 3_were_canoe_hist_huron |
4 | slave - were - slavery - had - was | 1011 | 4_slave_were_slavery_had |
5 | holmes - sherlock - watson - moor - baskerville | 580 | 5_holmes_sherlock_watson_moor |
6 | prisoner - milady - felton - were - madame | 549 | 6_prisoner_milady_felton_were |
7 | coriolanus - cassius - brutus - sicinius - titus | 527 | 7_coriolanus_cassius_brutus_sicinius |
8 | confederation - constitution - federal - states - senate | 511 | 8_confederation_constitution_federal_states |
9 | heathcliff - catherine - wuthering - cathy - hindley | 498 | 9_heathcliff_catherine_wuthering_cathy |
10 | were - seemed - rima - was - had | 492 | 10_were_seemed_rima_was |
11 | laws - lawes - law - civill - actions | 452 | 11_laws_lawes_law_civill |
12 | fang - wolf - fangs - musher - growl | 401 | 12_fang_wolf_fangs_musher |
13 | sigurd - thorgeir - thord - gunnar - skarphedinn | 395 | 13_sigurd_thorgeir_thord_gunnar |
14 | achilles - troy - patroclus - aeneas - ulysses | 385 | 14_achilles_troy_patroclus_aeneas |
15 | fogg - passengers - passed - phileas - travellers | 376 | 15_fogg_passengers_passed_phileas |
16 | troy - trojans - aeneas - fates - trojan | 370 | 16_troy_trojans_aeneas_fates |
17 | disciples - jesus - pharisees - temple - jerusalem | 340 | 17_disciples_jesus_pharisees_temple |
18 | helsing - harker - diary - dr - he | 324 | 18_helsing_harker_diary_dr |
19 | lama - who - no - kim - am | 312 | 19_lama_who_no_kim |
20 | sara - princess - herself - she - minchin | 301 | 20_sara_princess_herself_she |
21 | horses - horse - saddle - stable - were | 293 | 21_horses_horse_saddle_stable |
22 | hester - pearl - scarlet - her - human | 292 | 22_hester_pearl_scarlet_her |
23 | candide - inquisitor - friar - cunegonde - philosopher | 286 | 23_candide_inquisitor_friar_cunegonde |
24 | dick - aunt - were - could - had | 275 | 24_dick_aunt_were_could |
25 | wolves - wolf - cub - hunger - were | 261 | 25_wolves_wolf_cub_hunger |
26 | god - gods - consequences - satan - som | 241 | 26_god_gods_consequences_satan |
27 | modesty - women - behaviour - human - woman | 240 | 27_modesty_women_behaviour_human |
28 | society - education - distribution - service - labour | 240 | 28_society_education_distribution_service |
29 | siddhartha - buddha - gotama - kamaswami - om | 237 | 29_siddhartha_buddha_gotama_kamaswami |
30 | ship - captain - aboard - squire - ll | 229 | 30_ship_captain_aboard_squire |
31 | cyrano - roxane - montfleury - hark - love | 227 | 31_cyrano_roxane_montfleury_hark |
32 | alice - were - rabbit - hare - hatter | 225 | 32_alice_were_rabbit_hare |
33 | toto - kansas - dorothy - oz - scarecrow | 211 | 33_toto_kansas_dorothy_oz |
34 | lancelot - camelot - merlin - guinevere - arthur | 209 | 34_lancelot_camelot_merlin_guinevere |
35 | were - soldiers - seemed - soldier - th | 201 | 35_were_soldiers_seemed_soldier |
36 | were - was - fields - seemed - hills | 200 | 36_were_was_fields_seemed |
37 | reason - thyself - actions - thine - life | 179 | 37_reason_thyself_actions_thine |
38 | hetty - her - she - judith - were | 170 | 38_hetty_her_she_judith |
39 | othello - iago - desdemona - ll - roderigo | 170 | 39_othello_iago_desdemona_ll |
40 | wildeve - yes - were - vye - was | 165 | 40_wildeve_yes_were_vye |
41 | utilitarian - morality - morals - virtue - moral | 165 | 41_utilitarian_morality_morals_virtue |
42 | ransom - isaac - thine - thy - shekels | 163 | 42_ransom_isaac_thine_thy |
43 | weasels - rat - ratty - toad - badger | 157 | 43_weasels_rat_ratty_toad |
44 | philip - he - were - vicar - was | 155 | 44_philip_he_were_vicar |
45 | macbeth - banquo - macduff - fleance - murderer | 154 | 45_macbeth_banquo_macduff_fleance |
46 | lydgate - bulstrode - himself - he - had | 145 | 46_lydgate_bulstrode_himself_he |
47 | capulet - romeo - juliet - verona - mercutio | 142 | 47_capulet_romeo_juliet_verona |
48 | dying - her - were - helen - she | 141 | 48_dying_her_were_helen |
49 | anne - avonlea - diana - her - marilla | 141 | 49_anne_avonlea_diana_her |
50 | tartuffe - scene - dorine - pernelle - scoundrel | 140 | 50_tartuffe_scene_dorine_pernelle |
51 | were - yes - had - was - no | 139 | 51_were_yes_had_was |
52 | jekyll - hyde - were - myself - had | 135 | 52_jekyll_hyde_were_myself |
53 | loved - were - philip - was - could | 128 | 53_loved_were_philip_was |
54 | falstaff - mistress - ford - forsooth - windsor | 127 | 54_falstaff_mistress_ford_forsooth |
55 | hurstwood - were - barn - had - was | 127 | 55_hurstwood_were_barn_had |
56 | provost - capell - collier - conj - pope | 126 | 56_provost_capell_collier_conj |
57 | gretchen - highness - chancellor - hildegarde - yes | 125 | 57_gretchen_highness_chancellor_hildegarde |
58 | delamere - watson - dr - ll - no | 124 | 58_delamere_watson_dr_ll |
59 | jem - her - were - felt - margaret | 123 | 59_jem_her_were_felt |
60 | beowulf - grendel - hrothgar - wiglaf - hero | 111 | 60_beowulf_grendel_hrothgar_wiglaf |
61 | verloc - seemed - was - were - had | 102 | 61_verloc_seemed_was_were |
62 | hamlet - guildenstern - rosencrantz - fortinbras - polonius | 102 | 62_hamlet_guildenstern_rosencrantz_fortinbras |
63 | corey - mrs - yes - business - lapham | 101 | 63_corey_mrs_yes_business |
64 | projectiles - cannon - projectile - distance - satellite | 99 | 64_projectiles_cannon_projectile_distance |
65 | piano - musical - music - played - beethoven | 98 | 65_piano_musical_music_played |
66 | wedding - bridegroom - were - marriage - looked | 93 | 66_wedding_bridegroom_were_marriage |
67 | juan - her - fame - some - had | 92 | 67_juan_her_fame_some |
68 | were - looked - felt - her - had | 91 | 68_were_looked_felt_her |
69 | staked - gambling - wildeve - stakes - dice | 91 | 69_staked_gambling_wildeve_stakes |
70 | mistress - leonora - wanted - florence - was | 89 | 70_mistress_leonora_wanted_florence |
71 | delano - ship - sailor - captain - benito | 87 | 71_delano_ship_sailor_captain |
72 | yes - goring - no - robert - room | 85 | 72_yes_goring_no_robert |
73 | stockmann - yes - horster - mayor - dr | 81 | 73_stockmann_yes_horster_mayor |
74 | ll - were - looked - carl - was | 80 | 74_ll_were_looked_carl |
75 | barber - philosophy - no - some - man | 78 | 75_barber_philosophy_no_some |
76 | tom - maggie - came - had - tulliver | 78 | 76_tom_maggie_came_had |
77 | middlemarch - hustings - candidate - brooke - may | 75 | 77_middlemarch_hustings_candidate_brooke |
78 | inspector - verloc - yes - affair - police | 75 | 78_inspector_verloc_yes_affair |
79 | scrooge - merry - no - christmas - man | 73 | 79_scrooge_merry_no_christmas |
80 | coquenard - mutton - served - were - pudding | 70 | 80_coquenard_mutton_served_were |
81 | yes - no - jack - ll - tell | 69 | 81_yes_no_jack_ll |
82 | seth - lisbeth - th - ud - no | 67 | 82_seth_lisbeth_th_ud |
83 | higgins - eliza - her - she - liza | 66 | 83_higgins_eliza_her_she |
84 | yarmouth - were - went - had - was | 65 | 84_yarmouth_were_went_had |
85 | servian - sergius - yes - catherine - no | 64 | 85_servian_sergius_yes_catherine |
86 | service - army - salvation - institution - training | 61 | 86_service_army_salvation_institution |
87 | condemn - ff - pray - mercy - conj | 58 | 87_condemn_ff_pray_mercy |
88 | lucy - bartlett - were - could - she | 57 | 88_lucy_bartlett_were_could |
89 | wills - seemed - bequest - were - testator | 54 | 89_wills_seemed_bequest_were |
90 | scene - iii - malvolio - valentine - cesario | 54 | 90_scene_iii_malvolio_valentine |
91 | fuss - think - ll - thinks - oh | 53 | 91_fuss_think_ll_thinks |
92 | hermia - demetrius - helena - theseus - helen | 50 | 92_hermia_demetrius_helena_theseus |
93 | seemed - rochester - were - had - yes | 50 | 93_seemed_rochester_were_had |
94 | sorrow - mourned - myself - had - was | 48 | 94_sorrow_mourned_myself_had |
95 | gerty - sleepless - tea - weariness - tired | 48 | 95_gerty_sleepless_tea_weariness |
96 | rushworth - crawford - were - sotherton - was | 47 | 96_rushworth_crawford_were_sotherton |
97 | reasoning - syllogisme - names - signification - definitions | 46 | 97_reasoning_syllogisme_names_signification |
98 | could - caleb - sure - work - no | 46 | 98_could_caleb_sure_work |
99 | rose - tears - hope - tell - wish | 46 | 99_rose_tears_hope_tell |
100 | peggotty - em - gummidge - he - ll | 46 | 100_peggotty_em_gummidge_he |
101 | time - future - story - paradox - traveller | 46 | 101_time_future_story_paradox |
102 | cleopatra - antony - caesar - loved - slave | 45 | 102_cleopatra_antony_caesar_loved |
103 | appendicitis - doctors - doctor - dr - wanted | 45 | 103_appendicitis_doctors_doctor_dr |
104 | slept - awoke - waking - sleep - seemed | 44 | 104_slept_awoke_waking_sleep |
105 | parlour - room - seemed - sat - had | 43 | 105_parlour_room_seemed_sat |
106 | prophets - scripture - prophet - moses - prophecy | 43 | 106_prophets_scripture_prophet_moses |
107 | letter - honour - adieu - duval - evelina | 43 | 107_letter_honour_adieu_duval |
108 | complications - cranky - had - tanis - was | 43 | 108_complications_cranky_had_tanis |
109 | fled - armies - brussels - imperial - napoleon | 42 | 109_fled_armies_brussels_imperial |
110 | philip - easel - greco - impressionists - manet | 42 | 110_philip_easel_greco_impressionists |
111 | harlings - harling - frances - were - shimerdas | 40 | 111_harlings_harling_frances_were |
112 | jane - mrs - janet - eyre - her | 40 | 112_jane_mrs_janet_eyre |
113 | prisoner - confinement - prisoners - prison - gaoler | 40 | 113_prisoner_confinement_prisoners_prison |
114 | hardcastle - marlow - impudence - constance - modesty | 40 | 114_hardcastle_marlow_impudence_constance |
115 | horatio - murder - revenge - sorrow - hieronimo | 40 | 115_horatio_murder_revenge_sorrow |
116 | traddles - had - married - room - horace | 39 | 116_traddles_had_married_room |
117 | philip - tell - feelings - was - remember | 38 | 117_philip_tell_feelings_was |
118 | nervous - countenance - seemed - he - huxtable | 38 | 118_nervous_countenance_seemed_he |
119 | rogers - wanted - lapham - could - silas | 38 | 119_rogers_wanted_lapham_could |
120 | titus - timon - varro - servilius - alcibiades | 37 | 120_titus_timon_varro_servilius |
121 | morality - justice - moral - impartiality - unjust | 37 | 121_morality_justice_moral_impartiality |
122 | willard - elmer - were - was - henderson | 37 | 122_willard_elmer_were_was |
123 | had - was - could - circumstances - possession | 37 | 123_had_was_could_circumstances |
124 | monkey - he - sahib - rat - sara | 36 | 124_monkey_he_sahib_rat |
125 | mcmurdo - mcginty - cormac - police - scanlan | 36 | 125_mcmurdo_mcginty_cormac_police |
126 | hetty - herself - she - her - had | 36 | 126_hetty_herself_she_her |
127 | dimmesdale - reverend - chillingworth - clergyman - deacon | 35 | 127_dimmesdale_reverend_chillingworth_clergyman |
128 | formerly - eliza - was - friend - friends | 34 | 128_formerly_eliza_was_friend |
129 | were - seemed - had - was - felt | 34 | 129_were_seemed_had_was |
130 | prisoner - jerry - lorry - tellson - court | 33 | 130_prisoner_jerry_lorry_tellson |
131 | macmurdo - wenham - captain - steyne - crawley | 33 | 131_macmurdo_wenham_captain_steyne |
132 | ducal - duchy - xv - fetes - theatre | 32 | 132_ducal_duchy_xv_fetes |
133 | chapter - book - dows - unt - windowpane | 32 | 133_chapter_book_dows_unt |
134 | money - riches - things - risk - thoughts | 31 | 134_money_riches_things_risk |
135 | bethy - beth - seemed - sister - her | 31 | 135_bethy_beth_seemed_sister |
136 | oliver - pickwick - were - was - inn | 30 | 136_oliver_pickwick_were_was |
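In the table above, each label appears to be the topic ID followed by the first four keywords, joined with underscores. The sketch below parses one row and rebuilds that label, and also computes a topic's share of the 70840 training documents. The names `parse_topic_row`, `expected_label`, and `document_share` are hypothetical helpers for illustration and are not part of BERTopic.

```python
from typing import List, NamedTuple

TRAINING_DOCUMENTS = 70840  # "Number of training documents" from the overview


class TopicRow(NamedTuple):
    topic_id: int
    keywords: List[str]
    frequency: int
    label: str


def parse_topic_row(row: str) -> TopicRow:
    """Split a 'id | kw - kw - ... | frequency | label' row into typed fields."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    topic_id, keywords, frequency, label = cells
    return TopicRow(int(topic_id), keywords.split(" - "), int(frequency), label)


def expected_label(row: TopicRow) -> str:
    """Rebuild the label as '<id>_<first four keywords joined by underscores>'."""
    return "_".join([str(row.topic_id)] + row.keywords[:4])


def document_share(row: TopicRow, total: int = TRAINING_DOCUMENTS) -> float:
    """Fraction of the training documents assigned to this topic."""
    return row.frequency / total


row = parse_topic_row("5 | holmes - sherlock - watson - moor - baskerville | 580 | 5_holmes_sherlock_watson_moor")
print(expected_label(row) == row.label, round(document_share(row), 4))
```

Applied to topic 0 (28715 documents), `document_share` gives roughly 0.405, so a single topic dominated by function words covers about two fifths of the corpus.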
Training hyperparameters
- calculate_probabilities: True
- language: None
- low_memory: False
- min_topic_size: 30
- n_gram_range: (1, 1)
- nr_topics: auto
- seed_topic_list: None
- top_n_words: 10
- verbose: True
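The hyperparameters above map directly onto keyword arguments of the `BERTopic` constructor. The sketch below shows one way they could be passed when training a comparable model. `build_topic_model` is a hypothetical helper, and the embedding model is left to the caller because this section does not name it; the model name suggests a sentence-t5-xl encoder, but that is an assumption.

```python
from typing import Any, Dict, Optional

# Values copied from the "Training hyperparameters" list.
HYPERPARAMETERS: Dict[str, Any] = {
    "calculate_probabilities": True,
    "language": None,
    "low_memory": False,
    "min_topic_size": 30,
    "n_gram_range": (1, 1),
    "nr_topics": "auto",
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
}


def build_topic_model(embedding_model: Optional[Any] = None):
    """Create an unfitted BERTopic instance with the listed hyperparameters."""
    # Imported lazily so the settings can be inspected without bertopic installed.
    from bertopic import BERTopic

    # embedding_model is an assumption supplied by the caller; it is not listed above.
    return BERTopic(embedding_model=embedding_model, **HYPERPARAMETERS)


# Hypothetical usage with your own list of documents:
# topic_model = build_topic_model(embedding_model=my_encoder)
# topics, probabilities = topic_model.fit_transform(docs)
```

Because UMAP and HDBSCAN involve randomness and the original training data and embedding setup are not given here, a retrained model should not be expected to reproduce the 138 topics exactly.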
Framework versions
- Numpy: 1.24.3
- HDBSCAN: 0.8.29
- UMAP: 0.5.3
- Pandas: 2.0.2
- Scikit-Learn: 1.2.2
- Sentence-transformers: 2.2.2
- Transformers: 4.30.2
- Numba: 0.57.1
- Plotly: 5.15.0
- Python: 3.10.11
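If loading the model fails or behaves unexpectedly, comparing the local environment with the versions listed above can help narrow things down. The following is a minimal sketch using the standard library's `importlib.metadata`. The PyPI distribution names (for example `umap-learn` and `scikit-learn`) are my mapping of the listed frameworks, and `compare_versions` is a hypothetical helper.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Versions from the "Framework versions" list, keyed by PyPI distribution name.
CARD_VERSIONS: Dict[str, str] = {
    "numpy": "1.24.3",
    "hdbscan": "0.8.29",
    "umap-learn": "0.5.3",
    "pandas": "2.0.2",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.30.2",
    "numba": "0.57.1",
    "plotly": "5.15.0",
}


def compare_versions(expected: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each distribution to (listed version, installed version or None)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # distribution is not installed in this environment
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(CARD_VERSIONS).items():
    marker = "ok" if installed == wanted else "differs"
    print(f"{name}: listed {wanted}, installed {installed} ({marker})")
```

A differing version is not necessarily a problem; the listing records the training environment rather than a strict requirement.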