Figure generation module

This module is mainly used to generate Dash components, such as filters and figures. It also provides helpers to anonymize usernames and to access data stored in a database.

dash_utils.acumulate_retweets(df)

Given a dataframe with the columns Number and Date, it accumulates Number over the dates, giving the running total of retweets.

Parameters: df – A dataframe with the columns Date and Number
Returns: A dataframe with the accumulated result per date.
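
The accumulation described above can be illustrated with a minimal pandas sketch. This is not the module's implementation: the helper name `accumulate_sketch` is hypothetical, and it assumes that Number holds per-date increments and that rows should be ordered by Date before summing.

```python
import pandas as pd


def accumulate_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df where Number is the running total ordered by Date."""
    # sort_values returns a new frame, so the input is left untouched
    out = df.sort_values("Date").reset_index(drop=True)
    out["Number"] = out["Number"].cumsum()
    return out


daily = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2021-03-10", "2021-03-11", "2021-03-12"]),
        "Number": [10, 2, 5],
    }
)
total = accumulate_sketch(daily)
print(total["Number"].tolist())  # running totals: [10, 12, 17]
```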

dash_utils.get_all_hashtags(df, keywords=None, stopwords=None)

Given a DataFrame with tweets, it returns a DataFrame with the hashtags and the number of times (Count) each one appears.

Parameters

df – A DataFrame with all the tweets
keywords – A list of words; only tweets that contain them are kept
stopwords – A list of words; tweets that contain them are removed

Returns

A DataFrame counting the hashtags and the number of times they have appeared.
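
To make the counting step concrete, here is a small sketch under stated assumptions. The helper `count_hashtags_sketch` and the column name `text` are hypothetical, hashtags are compared case-insensitively, and the filter semantics (keep tweets containing any keyword, drop tweets containing any stopword) are an assumption rather than documented behaviour.

```python
import re
from collections import Counter

import pandas as pd

HASHTAG_RE = re.compile(r"#(\w+)")


def count_hashtags_sketch(df, keywords=None, stopwords=None):
    """Count hashtags found in a hypothetical 'text' column."""
    texts = [str(t) for t in df["text"]]
    if keywords:
        # assumed: keep only tweets that mention at least one keyword
        texts = [t for t in texts if any(k.lower() in t.lower() for k in keywords)]
    if stopwords:
        # assumed: drop tweets that mention any stopword
        texts = [t for t in texts if not any(s.lower() in t.lower() for s in stopwords)]
    counts = Counter(tag.lower() for t in texts for tag in HASHTAG_RE.findall(t))
    # most_common() orders hashtags from most to least frequent
    return pd.DataFrame(counts.most_common(), columns=["Hashtags", "Count"])


tweets = pd.DataFrame(
    {
        "text": [
            "Join #CitizenScience today #cstrack",
            "More #citizenscience news",
            "Unrelated #spam post",
        ]
    }
)
hashtag_counts = count_hashtags_sketch(tweets, stopwords=["spam"])
print(hashtag_counts)
```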

dash_utils.get_all_temporalseries(df, keywords=None)

Given a DataFrame containing all the tweets, the function returns three values: a DataFrame with the hashtags and dates, a list of dates, and the hashtags sorted by number of appearances.

Parameters

df – DataFrame with all the tweets
keywords – Keywords to filter the DataFrame

Returns

DataFrame with hashtags and dates, a list of days, and the hashtags sorted by number of appearances

dash_utils.get_communities(g, algorithm='louvain')

Function to calculate the communities of a given network.

Parameters

g – Graph that represents the network
algorithm – The algorithm to create the communities (louvain or propagation)

Returns

A list of communities. Each community is represented as a list of user names
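
As a rough illustration of the two algorithm options, the sketch below uses NetworkX. It is an assumption that the module relies on NetworkX at all; `communities_sketch` is a hypothetical helper, and `louvain_communities` requires a reasonably recent NetworkX release.

```python
from itertools import combinations

import networkx as nx
from networkx.algorithms.community import (
    label_propagation_communities,
    louvain_communities,
)


def communities_sketch(g, algorithm="louvain"):
    """Return communities as sorted lists of node names."""
    if algorithm == "louvain":
        parts = louvain_communities(g, seed=0)  # fixed seed for repeatable output
    elif algorithm == "propagation":
        parts = label_propagation_communities(g)
    else:
        raise ValueError(f"unknown algorithm: {algorithm}")
    return [sorted(c) for c in parts]


# Two tightly connected groups of users joined by a single edge
g = nx.Graph()
g.add_edges_from(combinations(["a", "b", "c", "d"], 2))
g.add_edges_from(combinations(["e", "f", "g", "h"], 2))
g.add_edge("d", "e")

louvain_result = communities_sketch(g, "louvain")
propagation_result = communities_sketch(g, "propagation")
print(louvain_result)
```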

dash_utils.get_community_graph(g, communities, i=0)

Function to create the graph of a single community, given the whole network and the list of communities.

Parameters

g – The graph that contains the information of the whole network
communities – A list of communities. Each community contains a list of user names
i – The index of the community for which we want to create a graph

Returns

The graph that represents the requested community
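
One plausible way to extract such a graph is to take the subgraph induced by the users of the selected community. The sketch below assumes a NetworkX graph and uses a hypothetical helper name; the module may build the community graph differently.

```python
import networkx as nx


def community_graph_sketch(g, communities, i=0):
    """Return the subgraph induced by the users of community i."""
    # copy() detaches the result from the original graph
    return g.subgraph(communities[i]).copy()


network = nx.Graph([("a", "b"), ("b", "c"), ("c", "d")])
communities = [["a", "b"], ["c", "d"]]
second = community_graph_sketch(network, communities, i=1)
print(sorted(second.nodes), second.number_of_edges())
```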

dash_utils.get_controls_activity()

Function to create the filtering options for the Geomap visualizations in Dash

Returns: The filtering options for the Geomap visualizations.

dash_utils.get_controls_community2(communities)

Given a list of communities, each of which is a list of users, the function creates the filtering options for the Dash visualization.

Parameters: communities – A list of communities, each of which is a list of usernames
Returns: The filtering options for the Dash visualization.

dash_utils.get_controls_rt(number_id, keyword_id)

Given two ids, the function creates the filtering options for several Dash visualizations.

Parameters

number_id – Id for the number input
keyword_id – Id for the text input

Returns

The filtering options for the Dash visualization.

dash_utils.get_controls_rt_g(keyword_id)

Given an id, it creates the filtering options for the graph of retweets in Dash.

Parameters: keyword_id – Id of the text input
Returns: The filtering options for the graph of retweets

dash_utils.get_controls_topics(number_id, keyword_id, topics)

Function to create the filtering options for the topic modelling visualization

Parameters

number_id – The id for the number input (Number of topics to create)
keyword_id – The id for the text input (List of words to filter the dataframe)
topics – The number of topics.

Returns

The filtering options for the topic modelling visualization.

dash_utils.get_controls_ts(number_id, keyword_id, dc_id, df_ts)

Given the ids for the different inputs, it creates the different filters for the time series visualization

Parameters

number_id – Id for the number input (Number of hashtags to show)
keyword_id – Id for the list of keywords
dc_id – Id for the dropdown options (Search specific hashtags to show)
df_ts – The DataFrame with the hashtag count

Returns

The filters for the time series visualization.

dash_utils.get_cstrack_graph(df, type, title)

A function to create the different graphs for the cstrackproject Twitter account.

Parameters

df – A dataframe with the data
type – The type of graph to draw (Retweets, Tweets or Followers)
title – The title of the graph

Returns

A figure representing the results according to the given parameters.

dash_utils.get_degrees(df)

Given a DataFrame with tweets and users it calculates different centrality measures.

Parameters: df – A DataFrame with tweets and users
Returns: A Dataframe with the centrality measures of the users

dash_utils.get_df_ts(df, days, hashtags)

Given a DataFrame, a list of days and a list of hashtags, it returns a DataFrame with the number of appearances of each hashtag on each day.

Parameters

df – Input Dataframe
days – A list of dates
hashtags – A list of hashtags

Returns

DataFrame with the count for each hashtag each day
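
The per-day count can be sketched with a pandas cross-tabulation. Everything here is illustrative: the helper name and the column names `date` and `hashtag` are hypothetical, and the input is assumed to hold one row per hashtag occurrence. The layout of the module's actual result may differ.

```python
import pandas as pd


def hashtag_counts_per_day_sketch(df, days, hashtags):
    """Return a table with one row per day and one column per hashtag."""
    counts = pd.crosstab(df["date"], df["hashtag"])
    # reindex keeps only the requested days/hashtags and fills gaps with 0
    return counts.reindex(index=days, columns=hashtags, fill_value=0)


occurrences = pd.DataFrame(
    {
        "date": ["2021-03-10", "2021-03-10", "2021-03-11", "2021-03-11", "2021-03-11"],
        "hashtag": ["cstrack", "cstrack", "cstrack", "openscience", "spam"],
    }
)
table = hashtag_counts_per_day_sketch(
    occurrences,
    days=["2021-03-10", "2021-03-11", "2021-03-12"],
    hashtags=["cstrack", "openscience"],
)
print(table)
```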

dash_utils.get_figure(df)

Given a dataframe where the appearance of each hashtag is counted, it creates a barplot to represent the results.

Parameters: df – A dataframe with the columns Hashtags and Count
Returns: A barplot representing the dataframe

dash_utils.get_graph_rt(df)

Given a DataFrame with tweets and users, it creates a graph of retweets.

Parameters: df – A Dataframe containing the tweets and users
Returns: A graph representing the network of retweets

dash_utils.get_hash_name_list(nodes)

Function to anonymize a list of users.

Parameters: nodes – list with the user names
Returns: list with the user names anonymized
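
The function name suggests that anonymization is done by hashing the user names. The sketch below shows one way to do that with the standard library. The choice of SHA-256, the 12-character truncation, the `salt` parameter and the helper name are all assumptions, not the module's documented behaviour.

```python
import hashlib


def anonymize_sketch(nodes, salt=""):
    """Replace each user name with a short, stable hash-based pseudonym."""
    return [
        hashlib.sha256((salt + name).encode("utf-8")).hexdigest()[:12]
        for name in nodes
    ]


hashed = anonymize_sketch(["alice", "bob", "alice"])
# The same user always maps to the same pseudonym, so graph structure is kept
print(hashed[0] == hashed[2], hashed[0] == hashed[1])
```

Because the mapping is deterministic, edges between users remain consistent after anonymization. A secret salt makes it harder to recover names by hashing a list of known accounts.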

dash_utils.get_map_df()

Function to get the information to create geomap visualizations

Returns: A DataFrame with geographical information

dash_utils.get_rt_hashtags(df, keywords=None, stopwords=None, n_hashtags=10)

Given a DataFrame with tweets, it returns a DataFrame with the hashtags and the number of times (Count) they have been retweeted.

Parameters

df – A DataFrame with all the tweets
keywords – A list of words; only tweets that contain them are kept
stopwords – A list of words; tweets that contain them are removed
n_hashtags – Number of hashtags to get

Returns

A DataFrame counting the hashtags and the number of times they have been retweeted

dash_utils.get_rt_temporalseries(df, keywords=None)

Given a DataFrame containing all the tweets, the function returns three values: a DataFrame with the retweeted hashtags and dates, a list of dates, and the hashtags sorted by number of appearances.

Parameters

df – DataFrame with all the tweets
keywords – Keywords to filter the DataFrame

Returns

DataFrame with hashtags and dates, a list of days and hashtags sorted by appearance

dash_utils.get_single_counts(df)

Given a dataframe with the columns Date and Number, it computes the increment of Number from one date to the next (tweets and follows). For instance, with the dates 10-03-2021 and 11-03-2021 and the counts 10 and 12, it returns (10-03-2021, 10; 11-03-2021, 2).

Parameters: df – A dataframe that must have the columns Date and Number
Returns: A dataframe counting the increments
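
The worked example above can be reproduced with a short pandas sketch. The helper name `single_counts_sketch` is hypothetical and this is not the module's code. It assumes the first date keeps its original count, as in the example.

```python
import pandas as pd


def single_counts_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Turn cumulative counts into per-date increments."""
    out = df.sort_values("Date").reset_index(drop=True)
    # diff() leaves the first row empty, so it is filled with the original value
    out["Number"] = out["Number"].diff().fillna(out["Number"]).astype(int)
    return out


cumulative = pd.DataFrame(
    {
        "Date": pd.to_datetime(["10-03-2021", "11-03-2021"], format="%d-%m-%Y"),
        "Number": [10, 12],
    }
)
increments = single_counts_sketch(cumulative)
print(increments["Number"].tolist())  # [10, 2]
```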

dash_utils.get_temporal_figure(df, n_hashtags=5)

Given a dataframe that contains the number of appearances of each hashtag in each day it creates a time series figure to represent the results.

Parameters

df – A dataframe with the name of the hashtags, the date, and the number of appearances
n_hashtags – The number of hashtags to show

Returns

A time series figure representing the dataframe

dash_utils.get_topic_file(id)

Given an id, it creates an upload box for a file containing keywords (one keyword per line).

Parameters: id – The id of the button
Returns: The button to upload a file

dash_utils.get_twitter_info_df()

A function to return the cstrackproject Twitter user stats

Returns: A Dataframe with information about followers, retweets and tweets

dash_utils.get_two_mode_graph(df, keywords=None)

Given a DataFrame containing all the tweets, the function returns a two-mode graph connecting users with tweets.

Parameters

df – A DataFrame with all the information
keywords – A list of words to filter the DataFrame

Returns

A two-mode graph connecting users with tweets
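
A two-mode (bipartite) graph has two kinds of nodes, here users and tweets, with edges only between the two kinds. The sketch below builds one with NetworkX under assumptions: the helper name and the columns `user` and `tweet_id` are hypothetical, and each edge is taken to mean that the user posted the tweet.

```python
import networkx as nx
import pandas as pd


def two_mode_sketch(df):
    """Build a bipartite graph linking each user to their tweets."""
    g = nx.Graph()
    for user, tweet_id in zip(df["user"], df["tweet_id"]):
        # prefixes keep user and tweet node names from colliding
        user_node = f"user:{user}"
        tweet_node = f"tweet:{tweet_id}"
        g.add_node(user_node, bipartite="user")
        g.add_node(tweet_node, bipartite="tweet")
        g.add_edge(user_node, tweet_node)
    return g


tweets = pd.DataFrame({"user": ["ana", "ana", "ben"], "tweet_id": [1, 2, 3]})
graph = two_mode_sketch(tweets)
print(graph.number_of_nodes(), graph.number_of_edges())
```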

dash_utils.kcore_graph(df, keywords=None, stopwords=None, interest=None, anonymize=False)

Given a dataframe with tweets, users… creates a graph of retweets.

Parameters

df – The dataframe containing the information
keywords – A list of words to get the tweets that contain those words
stopwords – A list of words to remove tweets that contain those words
interest – The interest (Lynguo filter)
anonymize – False if we want to get the user names and True if we want to anonymize them

Returns

The graph

dash_utils.set_loading(controls, dcc_graph)

Function to create a loading effect when filtering a graph

Parameters

controls – The filters
dcc_graph – The figure that is being updated

Returns

The element in which the figure is embedded in order to apply the loading effect

dash_utils.wordcloudmain(df, keywords=None, stopwords=None, interest=None)

Given a DataFrame with all the tweets the function creates a Wordcloud with the words that appear the most.

Parameters

df – A DataFrame with all the tweets
keywords – A list of words; only tweets that contain them are kept
stopwords – A list of words; tweets that contain them are removed
interest – The interest to filter the DataFrame (Lynguo)