BERTopic utilities module
This module provides functions to build BERTopic topic models and to create visualizations from them.
- berttopic_utils.create_bert_model(documents)
Given a list of sentences, creates a BERTopic model from them.
- Parameters
documents – List of sentences (strings).
- Returns
The BERTopic model, the topics and the probabilities.
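A minimal sketch of what such a function might look like, assuming it simply fits a default BERTopic instance on the documents. The name create_bert_model_sketch is hypothetical, and the module's real implementation may configure the model differently (language, embedding model, and so on).

```python
def create_bert_model_sketch(documents):
    """Fit a BERTopic model on a list of strings (illustrative sketch only)."""
    # Imported lazily so that defining this function does not require
    # bertopic to be installed; calling it does.
    from bertopic import BERTopic

    # Assumption: default settings. The real module may pass other options.
    model = BERTopic()

    # fit_transform returns the topic assigned to each document and the
    # associated probabilities.
    topics, probabilities = model.fit_transform(documents)
    return model, topics, probabilities
```

Fitting needs a reasonably large corpus; a handful of sentences is usually not enough for the clustering step to find topics.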
- berttopic_utils.get_cleaned_documents(df_original)
Given a DataFrame with a column Texto (tweets), this function preprocesses the texts in that column by removing punctuation, common words, stop words and URLs.
- Parameters
df_original – A DataFrame with the tweets.
- Returns
A list with the tweets (strings).
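The cleaning steps described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the module's code: the names clean_text and clean_documents, the tiny stop-word set, and the lowercasing step are all hypothetical choices made for the example.

```python
import re
import string

# Matches http(s) links and bare www. links up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")

# Hypothetical, deliberately tiny stop-word set for illustration only.
STOP_WORDS = {"de", "la", "el", "que", "y", "en", "the", "a", "of", "and"}

# Translation table that deletes ASCII punctuation plus Spanish opening marks.
PUNCT_TABLE = str.maketrans("", "", string.punctuation + "¿¡")


def clean_text(text, stop_words=STOP_WORDS):
    """Remove URLs, punctuation and stop words from a single text."""
    # URLs go first: stripping punctuation first would mangle them.
    text = URL_PATTERN.sub(" ", text.lower())
    text = text.translate(PUNCT_TABLE)
    tokens = [tok for tok in text.split() if tok not in stop_words]
    return " ".join(tokens)


def clean_documents(texts, stop_words=STOP_WORDS):
    """Clean every text, keeping one output string per input string."""
    return [clean_text(t, stop_words) for t in texts]
```

With a pandas DataFrame, the input list could be obtained with something like `df_original["Texto"].tolist()`. The sketch keeps empty results in place so that the output stays aligned with the rows of the DataFrame.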
- berttopic_utils.get_heatmap(model)
Wrapper to create the heatmap visualization.
- Parameters
model – The BERTopic model.
- Returns
The heatmap visualization.
- berttopic_utils.get_hierarchical_clusterin(model)
Wrapper to create the hierarchical visualization.
- Parameters
model – The BERTopic model.
- Returns
The hierarchical visualization.
- berttopic_utils.get_intertopic_distance(model, top_n_topics=20)
Wrapper to create the intertopic distance visualization.
- Parameters
model – The BERTopic model.
top_n_topics – The number of topics to show.
- Returns
The intertopic visualization.
- berttopic_utils.get_topics_bar(model, top_n_topics=9)
Wrapper to create the barchart visualization.
- Parameters
model – The BERTopic model.
top_n_topics – The number of topics to show.
- Returns
The barchart visualization.
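The four visualization wrappers above plausibly delegate to the plotting methods of a fitted BERTopic model. The sketch below shows one way that could look. The function names ending in _sketch are hypothetical, and the mapping of each wrapper to a BERTopic method is an assumption based on the descriptions, not taken from the module's source.

```python
def get_heatmap_sketch(model):
    """Topic similarity heatmap."""
    return model.visualize_heatmap()


def get_hierarchy_sketch(model):
    """Hierarchical clustering of topics."""
    return model.visualize_hierarchy()


def get_intertopic_distance_sketch(model, top_n_topics=20):
    """Intertopic distance map, limited to the most frequent topics."""
    return model.visualize_topics(top_n_topics=top_n_topics)


def get_topics_bar_sketch(model, top_n_topics=9):
    """Bar charts of the top words for the most frequent topics."""
    return model.visualize_barchart(top_n_topics=top_n_topics)
```

In BERTopic, each of these methods returns a Plotly figure, so the result can be displayed with `.show()` or exported with `.write_html(...)`.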
- berttopic_utils.get_topics_over_time(df, model, documents, topics)
Wrapper to create the topics over time visualization.
- Parameters
model – The BERTopic model.
- Returns
The topics over time visualization.
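A sketch of how a topics-over-time wrapper could be built on BERTopic's own methods. How the module derives timestamps from df is not documented here, so the example takes them as an explicit, hypothetical timestamps argument; the function name and the nr_bins default are also assumptions. Keyword arguments are used because the positional order of topics_over_time has differed between BERTopic releases.

```python
def get_topics_over_time_sketch(model, documents, topics, timestamps, nr_bins=20):
    """Compute topic frequencies per time bin and plot them (illustrative)."""
    # Each document needs exactly one timestamp.
    if len(documents) != len(timestamps):
        raise ValueError("documents and timestamps must have the same length")

    # Returns a table with the frequency and representation of each topic
    # in each time bin.
    topics_over_time = model.topics_over_time(
        docs=documents,
        topics=topics,
        timestamps=timestamps,
        nr_bins=nr_bins,
    )
    return model.visualize_topics_over_time(topics_over_time)
```

The length check matters in practice: if cleaning drops some tweets, the remaining documents no longer line up with the timestamp column of the original DataFrame.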
- berttopic_utils.load_model(filename)
Given a filename, it loads the BERTopic model.
- Parameters
filename – A string that represents the filename.
- Returns
The BERTopic model.
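Loading most likely relies on BERTopic's own persistence API. The following is a minimal sketch under that assumption; load_model_sketch is a hypothetical name, and the module may add extra arguments or error handling.

```python
def load_model_sketch(filename):
    """Load a previously saved BERTopic model from disk (illustrative)."""
    # Imported lazily so that defining this function does not require
    # bertopic to be installed; calling it does.
    from bertopic import BERTopic

    return BERTopic.load(filename)
```

The file is expected to have been written earlier by calling `save` on a fitted model, for example `model.save("my_model")`.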