Figure generation module
This module is used mainly to generate Dash components, such as filters or figures. It also provides some extra functionality to anonymize usernames and to access data stored in a database.
- dash_utils.acumulate_retweets(df)
Given a DataFrame with the columns Number and Date, it computes the cumulative sum of Number over the dates (the running total of retweets).
- Parameters
df – A dataframe with the columns Date and Number
- Returns
A dataframe with the accumulated result per date.
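The accumulation described above can be sketched with pandas. This is a minimal illustration, not the module's actual implementation: the helper name accumulate_by_date is hypothetical, and summing rows that share a date before accumulating is an assumption.

```python
import pandas as pd


def accumulate_by_date(df: pd.DataFrame) -> pd.DataFrame:
    """Return the running total of Number, one row per Date (hypothetical helper)."""
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"])
    # Assumption: rows sharing a date are summed first; groupby sorts by Date.
    daily = out.groupby("Date", as_index=False)["Number"].sum()
    # Replace each daily value with the cumulative sum up to that date.
    daily["Number"] = daily["Number"].cumsum()
    return daily


sample = pd.DataFrame(
    {
        "Date": ["2021-03-10", "2021-03-11", "2021-03-11", "2021-03-12"],
        "Number": [3, 1, 4, 2],
    }
)
result = accumulate_by_date(sample)
```

In this sample the daily totals are 3, 5 and 2, so the accumulated column holds 3, 8 and 10.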
- dash_utils.get_all_hashtags(df, keywords=None, stopwords=None)
Given a DataFrame with Tweets it returns a DataFrame with the Hashtags and the number of times (Count) they appear.
- Parameters
df – A DataFrame with all the tweets
keywords – A list of words; only tweets that contain them are kept
stopwords – A list of words; tweets that contain them are removed
- Returns
A DataFrame counting the hashtags and the number of times they have appeared.
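A rough sketch of the counting step, using only the standard library. The function name count_hashtags, the regular expression, the case-insensitive matching and the exact keyword/stopword semantics are assumptions made for illustration; the real function works on a DataFrame and may differ.

```python
import re
from collections import Counter

# Assumption: a hashtag is "#" followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def count_hashtags(texts, keywords=None, stopwords=None):
    """Count hashtags in tweet texts (hypothetical helper)."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        # Keep only tweets that mention at least one keyword, if any are given.
        if keywords and not any(k.lower() in lowered for k in keywords):
            continue
        # Drop tweets that mention any stopword, if any are given.
        if stopwords and any(s.lower() in lowered for s in stopwords):
            continue
        counts.update(tag.lower() for tag in HASHTAG_RE.findall(text))
    # List of (hashtag, count) pairs, most frequent first.
    return counts.most_common()


tweets = [
    "Join our #CitizenScience workshop #openscience",
    "New paper on #citizenscience and biodiversity",
    "Buy followers now #spam #citizenscience",
]
hashtag_counts = count_hashtags(tweets, stopwords=["buy followers"])
```

The resulting pairs map directly onto the Hashtags and Count columns of the returned DataFrame.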
- dash_utils.get_all_temporalseries(df, keywords=None)
Given a DataFrame containing all the tweets, the function returns a DataFrame with the hashtags and dates, a list of dates, and the hashtags sorted by number of appearances.
- Parameters
df – DataFrame with all the tweets
keywords – Keywords to filter the DataFrame
- Returns
DataFrame with hashtags and dates, a list of days, and the hashtags sorted by number of appearances
- dash_utils.get_communities(g, algorithm='louvain')
Function to calculate the communities of a given network.
- Parameters
g – Graph that represents the network
algorithm – The algorithm to create the communities (louvain or propagation)
- Returns
A list of communities. Each community is represented as a list of user names
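One way to obtain such a list with NetworkX is sketched below. It assumes an undirected graph and uses NetworkX's built-in Louvain and label propagation routines; the module itself may rely on a different library, and the name detect_communities and the seed parameter are hypothetical.

```python
import networkx as nx
from networkx.algorithms import community as nx_comm


def detect_communities(g, algorithm="louvain", seed=0):
    """Return communities as lists of node names (hypothetical helper)."""
    if algorithm == "louvain":
        parts = nx_comm.louvain_communities(g, seed=seed)
    elif algorithm == "propagation":
        parts = nx_comm.label_propagation_communities(g)
    else:
        raise ValueError(f"unknown algorithm: {algorithm!r}")
    # Both routines yield sets of nodes; convert each to a sorted list.
    return [sorted(part) for part in parts]


# Two 4-node cliques joined by a single edge, relabelled with user-like names.
g = nx.relabel_nodes(nx.barbell_graph(4, 0), lambda n: f"user{n}")
louvain = detect_communities(g, algorithm="louvain")
propagation = detect_communities(g, algorithm="propagation")
```

Either way, every user ends up in exactly one community.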
- dash_utils.get_community_graph(g, communities, i=0)
Function to create the graph of a single community, given the graph of the whole network and the list of communities.
- Parameters
g – The graph that contains the information of the whole network
communities – A list of communities. Each community contains a list of user names
i – The index of the community for which we want to create a graph
- Returns
The graph that represents the requested community
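Extracting one community from the full network can be sketched as an induced subgraph in NetworkX. The helper name community_subgraph is hypothetical, and whether the module copies the subgraph or returns a view is not stated in the source.

```python
import networkx as nx


def community_subgraph(g, communities, i=0):
    """Return the subgraph induced by the users of community i (hypothetical helper)."""
    # subgraph() gives a read-only view; copy() turns it into an independent graph.
    return g.subgraph(communities[i]).copy()


g = nx.Graph([("ana", "ben"), ("ben", "cho"), ("cho", "dev"), ("dev", "eli")])
communities = [["ana", "ben", "cho"], ["dev", "eli"]]
sub = community_subgraph(g, communities, i=0)
```

Only edges whose two endpoints belong to the community survive, so the edge between cho and dev is dropped here.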
- dash_utils.get_controls_activity()
Function to create the filtering options for the Geomap visualizations in Dash
- Returns
The filtering options for the Geomap visualizations.
- dash_utils.get_controls_community2(communities)
Given a list of communities, each of which is a list of users, the function creates the filtering options for the Dash visualization.
- Parameters
communities – A list of communities, each of which is a list of usernames
- Returns
The filtering options for the Dash visualization.
- dash_utils.get_controls_rt(number_id, keyword_id)
Given two ids, the function creates the filtering options for several Dash Visualizations
- Parameters
number_id – Id for the number input
keyword_id – Id for the text input
- Returns
The filtering options for the Dash visualization.
- dash_utils.get_controls_rt_g(keyword_id)
Given an id, it creates the filtering options for the graph of retweets in Dash
- Parameters
keyword_id – Id of the text input
- Returns
The filtering options for the graph of retweets
- dash_utils.get_controls_topics(number_id, keyword_id, topics)
Function to create the filtering options for the topic modelling visualization
- Parameters
number_id – The id for the number input (Number of topics to create)
keyword_id – The id for the text input (List of words to filter the dataframe)
topics – The number of topics.
- Returns
The filtering options for the topic modelling visualization.
- dash_utils.get_controls_ts(number_id, keyword_id, dc_id, df_ts)
Given the ids for the different inputs, it creates the different filters for the time series visualization
- Parameters
number_id – Id for the number input (Number of hashtags to show)
keyword_id – Id for the list of keywords
dc_id – Id for the dropdown options (Search specific hashtags to show)
df_ts – The DataFrame with the hashtag count
- Returns
The filters for the time series visualization.
- dash_utils.get_cstrack_graph(df, type, title)
A function to create the different graphs for the cstrackproject Twitter account.
- Parameters
df – A dataframe with the data
type – The type of graph to draw (Retweets, Tweets, Followers)
title – The title of the graph
- Returns
A figure representing the results according to the given parameters.
- dash_utils.get_degrees(df)
Given a DataFrame with tweets and users it calculates different centrality measures.
- Parameters
df – A DataFrame with tweets and users
- Returns
A Dataframe with the centrality measures of the users
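The source does not say which centrality measures are computed. The sketch below shows one plausible approach, assuming a directed retweet network and three common measures from NetworkX; the function name centrality_table and the edge-list input are hypothetical.

```python
import networkx as nx
import pandas as pd


def centrality_table(edges):
    """Build a per-user table of centrality measures (hypothetical helper)."""
    # Assumption: each edge points from the retweeting user to the original author.
    g = nx.DiGraph()
    g.add_edges_from(edges)
    # Each measure is a dict keyed by user, so the users become the index.
    return pd.DataFrame(
        {
            "in_degree": nx.in_degree_centrality(g),
            "out_degree": nx.out_degree_centrality(g),
            "betweenness": nx.betweenness_centrality(g),
        }
    ).rename_axis("User")


edges = [("alice", "bob"), ("carol", "bob"), ("bob", "dave")]
table = centrality_table(edges)
```

Here bob is retweeted by two users, so he has the highest in-degree centrality.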
- dash_utils.get_df_ts(df, days, hashtags)
Given a DataFrame, a list of days and a list of hashtags it returns a Dataframe with the appearance of each hashtag each day
- Parameters
df – Input Dataframe
days – A list of dates
hashtags – A list of hashtags
- Returns
DataFrame with the count for each hashtag each day
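A count of each hashtag per day can be sketched with a pandas cross-tabulation. The column names Hashtag and Date, the wide output layout and the helper name hashtag_counts_per_day are assumptions; the shape of the real return value is not specified in the source.

```python
import pandas as pd


def hashtag_counts_per_day(df, days, hashtags):
    """Count the appearances of each hashtag on each day (hypothetical helper)."""
    # One row per hashtag, one column per date, cells hold the number of rows.
    counts = pd.crosstab(df["Hashtag"], df["Date"])
    # Restrict to the requested hashtags and days; missing combinations become 0.
    return counts.reindex(index=hashtags, columns=days, fill_value=0)


sample = pd.DataFrame(
    {
        "Hashtag": ["science", "science", "data", "science"],
        "Date": ["2021-03-10", "2021-03-10", "2021-03-10", "2021-03-11"],
    }
)
days = ["2021-03-10", "2021-03-11", "2021-03-12"]
ts = hashtag_counts_per_day(sample, days, ["science", "data"])
```

Days with no tweets, such as 2021-03-12 in this sample, show up as zeros rather than being dropped.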
- dash_utils.get_figure(df)
Given a dataframe where the appearance of each hashtag is counted, it creates a barplot to represent the results.
- Parameters
df – A dataframe with the columns Hashtags and Count
- Returns
A barplot representing the dataframe
- dash_utils.get_graph_rt(df)
Given a Dataframe with Tweets and users it creates a Graph of retweets
- Parameters
df – A Dataframe containing the tweets and users
- Returns
A graph representing the network of retweets
- dash_utils.get_hash_name_list(nodes)
Function to anonymize a list of users.
- Parameters
nodes – list with the user names
- Returns
list with the user names anonymized
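The function name suggests that anonymization is done by hashing. A minimal sketch of that idea follows, assuming SHA-256 with an optional salt and a truncated digest; the hash function, the salt parameter and the name anonymize_users are assumptions, not the module's documented behaviour.

```python
import hashlib


def anonymize_users(nodes, salt=""):
    """Replace each user name with a short, stable pseudonym (hypothetical helper)."""
    pseudonyms = []
    for name in nodes:
        digest = hashlib.sha256((salt + name).encode("utf-8")).hexdigest()
        # Keep the first 12 hex characters so labels stay readable in a graph.
        pseudonyms.append(digest[:12])
    return pseudonyms


names = ["alice", "bob", "alice"]
pseudonyms = anonymize_users(names, salt="demo")
```

Because the mapping is deterministic, the same user always receives the same pseudonym, which keeps the graph structure intact after anonymization.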
- dash_utils.get_map_df()
Function to get the information to create geomap visualizations
- Returns
A DataFrame with geographical information
- dash_utils.get_rt_hashtags(df, keywords=None, stopwords=None, n_hashtags=10)
Given a DataFrame with Tweets it returns a DataFrame with the Hashtags and the number of times (Count) they have been retweeted
- Parameters
df – A DataFrame with all the tweets
keywords – A list of words; only tweets that contain them are kept
stopwords – A list of words; tweets that contain them are removed
n_hashtags – Number of hashtags to get
- Returns
A DataFrame counting the hashtags and the number of times they have been retweeted
- dash_utils.get_rt_temporalseries(df, keywords=None)
Given a DataFrame containing all the tweets the function returns a DataFrame with the retweeted hashtags and dates, a list of dates and the hashtags sorted by number of appearances
- Parameters
df – DataFrame with all the tweets
keywords – Keywords to filter the DataFrame
- Returns
DataFrame with hashtags and dates, a list of days and hashtags sorted by appearance
- dash_utils.get_single_counts(df)
Given a DataFrame with the columns Date and Number, where Number is a running total, it computes the increment between consecutive dates (used for tweets and followers). For instance, having 10-03-2021, 11-03-2021 as Dates and 10, 12 as counts it will return (10-03-2021, 10; 11-03-2021, 2).
- Parameters
df – A dataframe that must have the columns Date and Number
- Returns
A dataframe counting the increments
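The example above can be reproduced with a pandas difference. This is a sketch under assumptions: dates are taken to be in day-month-year format as in the example, the first row keeps its original value, and the helper name increments is hypothetical.

```python
import pandas as pd


def increments(df: pd.DataFrame) -> pd.DataFrame:
    """Turn running totals into per-date increments (hypothetical helper)."""
    out = df.copy()
    # Assumption: dates use the day-month-year format shown in the example.
    out["Date"] = pd.to_datetime(out["Date"], format="%d-%m-%Y")
    out = out.sort_values("Date").reset_index(drop=True)
    diffs = out["Number"].diff()
    # The first row has no predecessor, so it keeps its original value.
    out["Number"] = diffs.fillna(out["Number"]).astype(int)
    return out


totals = pd.DataFrame(
    {"Date": ["10-03-2021", "11-03-2021", "12-03-2021"], "Number": [10, 12, 15]}
)
steps = increments(totals)
```

With totals of 10, 12 and 15, the increments are 10, 2 and 3, which matches the first two values of the documented example.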
- dash_utils.get_temporal_figure(df, n_hashtags=5)
Given a dataframe that contains the number of appearances of each hashtag in each day it creates a time series figure to represent the results.
- Parameters
df – A dataframe with the names of the hashtags, the date, and the number of appearances
n_hashtags – The number of hashtags to show
- Returns
A time series figure representing the dataframe
- dash_utils.get_topic_file(id)
Given an id, it creates an upload box for a file containing keywords (one keyword per line)
- Parameters
id – The id of the button
- Returns
The button to upload a file
- dash_utils.get_twitter_info_df()
A function to return the cstrackproject Twitter user stats
- Returns
A Dataframe with information about followers, retweets and tweets
- dash_utils.get_two_mode_graph(df, keywords=None)
Given a DataFrame containing all the tweets the function returns a two-mode graph connecting users with tweets
- Parameters
df – A DataFrame with all the information
keywords – A list of words to filter the DataFrame
- Returns
A two-mode graph connecting users with tweets
- dash_utils.kcore_graph(df, keywords=None, stopwords=None, interest=None, anonymize=False)
Given a dataframe with tweets, users… creates a graph of retweets.
- Parameters
df – The dataframe containing the information
keywords – A list of words to get the tweets that contain those words
stopwords – A list of words to remove tweets that contain those words
interest – The interest (Lynguo filter)
anonymize – False if we want to get the user names and True if we want to anonymize them
- Returns
The graph
- dash_utils.set_loading(controls, dcc_graph)
Function to create a loading effect when filtering a graph
- Parameters
controls – The filters
dcc_graph – The figure that is being updated
- Returns
The element in which to embed the figure in order to apply the loading effect
- dash_utils.wordcloudmain(df, keywords=None, stopwords=None, interest=None)
Given a DataFrame with all the tweets the function creates a Wordcloud with the words that appear the most.
- Parameters
df – A DataFrame with all the tweets
keywords – A list of words to filter the DataFrame
stopwords – A list of words to filter the DataFrame
interest – The interest to filter the DataFrame (Lynguo)