Figure generation module

This module mainly generates Dash components, such as filters and figures. It also provides helper functions to anonymize usernames and to access data stored in a database.

dash_utils.acumulate_retweets(df)[source]

Given a dataframe with the columns Number and Date, it accumulates Number over the dates to obtain the running total of retweets.

Parameters

df – A dataframe with the columns Date and Number

Returns

A dataframe with the accumulated result per date.
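
The accumulation can be illustrated with a minimal plain-Python sketch. It is not the module's implementation: the helper name accumulate_counts is hypothetical, a list of (date, number) pairs stands in for the dataframe, and it assumes one row per date with dates already in chronological order.

```python
from itertools import accumulate


def accumulate_counts(rows):
    """Turn per-date counts into running totals.

    `rows` is a list of (date, number) pairs in chronological order,
    a stand-in for a dataframe with the columns Date and Number.
    """
    dates = [date for date, _ in rows]
    totals = accumulate(number for _, number in rows)
    return list(zip(dates, totals))


daily = [("10-03-2021", 10), ("11-03-2021", 2), ("12-03-2021", 5)]
running = accumulate_counts(daily)
print(running)
```

Each output row carries the sum of all counts up to and including its date.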

dash_utils.get_all_hashtags(df, keywords=None, stopwords=None)[source]

Given a DataFrame with Tweets it returns a DataFrame with the Hashtags and the number of times (Count) they appear.

Parameters
  • df – A DataFrame with all the tweets

  • keywords – A list of words to filter the tweets

  • stopwords – A list of words to filter the tweets

Returns

A DataFrame counting the hashtags and the number of times they have appeared.
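
As a rough illustration of the counting step, the sketch below extracts hashtags from tweet texts and counts them. The helper name count_hashtags, the regular expression and the lowercasing of hashtags are assumptions made for this example; the module itself works on a DataFrame and may normalize hashtags differently.

```python
import re
from collections import Counter

# Assumption: a hashtag is '#' followed by word characters.
HASHTAG = re.compile(r"#\w+")


def count_hashtags(texts):
    """Return (hashtag, count) pairs sorted by descending count."""
    counts = Counter()
    for text in texts:
        # Lowercase so '#OpenData' and '#opendata' are counted together.
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts.most_common()


tweets = [
    "Join us! #CitizenScience #OpenData",
    "New paper on #citizenscience",
    "No hashtags here",
]
hashtag_counts = count_hashtags(tweets)
print(hashtag_counts)
```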

dash_utils.get_all_temporalseries(df, keywords=None)[source]

Given a DataFrame containing all the tweets, the function returns a DataFrame with the hashtags and dates, a list of dates, and the hashtags sorted by number of appearances.

Parameters
  • df – DataFrame with all the tweets

  • keywords – Keywords to filter the DataFrame

Returns

A DataFrame with hashtags and dates, a list of days, and the hashtags sorted by number of appearances

dash_utils.get_communities(g, algorithm='louvain')[source]

Function to calculate the communities of a given network.

Parameters
  • g – Graph that represents the network

  • algorithm – The algorithm to create the communities (louvain or propagation)

Returns

A list of communities. Each community is represented as a list of user names

dash_utils.get_community_graph(g, communities, i=0)[source]

Function to create a graph given a list of users (communities) that are connected.

Parameters
  • g – The graph that contains the information of the whole network

  • communities – A list of communities. Each community contains a list of user names

  • i – The community for which we want to create a graph

Returns

The graph that represents the requested community
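
The idea of restricting the network to one community can be sketched without a graph library. In this illustration the graph is a plain list of (user, user) edges and community_edges is a hypothetical helper; the actual function receives and returns graph objects.

```python
def community_edges(edges, communities, i=0):
    """Keep only the edges whose two endpoints both belong to community i."""
    members = set(communities[i])
    return [(a, b) for a, b in edges if a in members and b in members]


edges = [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("dave", "erin")]
communities = [["alice", "bob", "carol"], ["dave", "erin"]]

first = community_edges(edges, communities)        # community 0 (default)
second = community_edges(edges, communities, i=1)  # community 1
print(first, second)
```

The edge between carol and dave is dropped in both cases because it crosses the two communities.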

dash_utils.get_controls_activity()[source]

Function to create the filtering options for the Geomap visualizations in Dash

Returns

The filtering options for the Geomap visualizations.

dash_utils.get_controls_community2(communities)[source]

Given a list of communities, where each community is a list of users, the function creates the filtering options for the Dash visualization.

Parameters

communities – A list of communities, where each community is a list of usernames

Returns

The filtering options for the Dash visualization.

dash_utils.get_controls_rt(number_id, keyword_id)[source]

Given two ids, the function creates the filtering options for several Dash Visualizations

Parameters
  • number_id – Id for the number input

  • keyword_id – Id for the text input

Returns

The filtering options for the Dash visualization.

dash_utils.get_controls_rt_g(keyword_id)[source]

Given an id, it creates the filtering options for the graph of retweets in Dash

Parameters

keyword_id – Id of the text input

Returns

The filtering options for the graph of retweets

dash_utils.get_controls_topics(number_id, keyword_id, topics)[source]

Function to create the filtering options for the topic modelling visualization

Parameters
  • number_id – The id for the number input (Number of topics to create)

  • keyword_id – The id for the text input (List of words to filter the dataframe)

  • topics – The number of topics.

Returns

dash_utils.get_controls_ts(number_id, keyword_id, dc_id, df_ts)[source]

Given the ids for the different inputs, it creates the different filters for the time series visualization

Parameters
  • number_id – Id for the number input (Number of hashtags to show)

  • keyword_id – Id for the list of keywords

  • dc_id – Id for the dropdown options (Search specific hashtags to show)

  • df_ts – The DataFrame with the hashtag count

Returns

dash_utils.get_cstrack_graph(df, type, title)[source]

A function to create the different graphs for the cstrackproject twitter account.

Parameters
  • df – A dataframe with the data

  • type – The type of graph to draw (Retweets, Tweets, Followers)

  • title – The title of the graph

Returns

A figure representing the results according to the given parameters.

dash_utils.get_degrees(df)[source]

Given a DataFrame with tweets and users it calculates different centrality measures.

Parameters

df – A DataFrame with tweets and users

Returns

A Dataframe with the centrality measures of the users
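
The documentation does not say which centrality measures are computed. As one example of such a measure, the sketch below computes normalized degree centrality (number of neighbours divided by the number of other nodes) from a list of undirected edges. The helper name degree_centrality and the edge-list input are assumptions for illustration only.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Map each node to (number of neighbours) / (number of other nodes)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


edges = [("alice", "bob"), ("carol", "bob"), ("dave", "bob")]
centrality = degree_centrality(edges)
print(centrality)
```

Here bob is connected to every other user and therefore gets the maximum value of 1.0.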

dash_utils.get_df_ts(df, days, hashtags)[source]

Given a DataFrame, a list of days and a list of hashtags, it returns a DataFrame with the number of appearances of each hashtag on each day

Parameters
  • df – Input Dataframe

  • days – A list of dates

  • hashtags – A list of hashtags

Returns

DataFrame with the count for each hashtag each day
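
A minimal sketch of the per-day count, assuming the input has already been reduced to (date, hashtag) pairs. The helper name count_per_day and the dictionary output are hypothetical stand-ins for the DataFrames used by the module.

```python
from collections import Counter


def count_per_day(pairs, days, hashtags):
    """Return {hashtag: [count on days[0], count on days[1], ...]}."""
    seen = Counter(pairs)  # missing (day, hashtag) pairs count as 0
    return {tag: [seen[(day, tag)] for day in days] for tag in hashtags}


pairs = [
    ("2021-03-10", "#a"),
    ("2021-03-10", "#a"),
    ("2021-03-11", "#a"),
    ("2021-03-11", "#b"),
]
days = ["2021-03-10", "2021-03-11"]
table = count_per_day(pairs, days, ["#a", "#b"])
print(table)
```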

dash_utils.get_figure(df)[source]

Given a dataframe where the appearance of each hashtag is counted, it creates a barplot to represent the results.

Parameters

df – A dataframe with the columns Hashtags and Count

Returns

A barplot representing the dataframe

dash_utils.get_graph_rt(df)[source]

Given a Dataframe with Tweets and users it creates a Graph of retweets

Parameters

df – A Dataframe containing the tweets and users

Returns

A graph representing the network of retweets
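
How a retweet network can be derived from tweets is sketched below. This is an illustration under stated assumptions, not the module's code: it assumes each tweet is a (user, text) pair and that a retweet's text starts with "RT @original_user:". The helper name retweet_edges is hypothetical, and the result is a weighted edge list rather than a graph object.

```python
import re
from collections import Counter

# Assumption: retweets start with 'RT @<user>:'.
RT_PATTERN = re.compile(r"^RT @(\w+):")


def retweet_edges(tweets):
    """Count directed edges (retweeting user -> retweeted user)."""
    edges = Counter()
    for user, text in tweets:
        match = RT_PATTERN.match(text)
        if match:
            edges[(user, match.group(1))] += 1
    return edges


tweets = [
    ("alice", "RT @bob: great thread"),
    ("carol", "RT @bob: great thread"),
    ("alice", "RT @bob: another one"),
    ("bob", "an original tweet"),
]
rt_edges = retweet_edges(tweets)
print(dict(rt_edges))
```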

dash_utils.get_hash_name_list(nodes)[source]

Function to anonymize a list of users.

Parameters

nodes – list with the user names

Returns

list with the user names anonymized
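
The documentation does not state how the names are anonymized. One common approach, shown here purely as an assumption, is to replace each name with a truncated SHA-256 digest so that the same user always maps to the same label. The helper name hash_names and the 12-character truncation are choices made for this example.

```python
import hashlib


def hash_names(names):
    """Replace each user name with the first 12 hex digits of its SHA-256."""
    return [
        hashlib.sha256(name.encode("utf-8")).hexdigest()[:12]
        for name in names
    ]


hashed = hash_names(["alice", "bob", "alice"])
print(hashed)
```

Because the mapping is deterministic, edges between anonymized users are preserved. An unsalted hash of a known username can be recomputed by anyone, so this hides names from casual inspection only.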

dash_utils.get_map_df()[source]

Function to get the information to create geomap visualizations

Returns

A DataFrame with geographical information

dash_utils.get_rt_hashtags(df, keywords=None, stopwords=None, n_hashtags=10)[source]

Given a DataFrame with Tweets it returns a DataFrame with the Hashtags and the number of times (Count) they have been retweeted

Parameters
  • df – A DataFrame with all the tweets

  • keywords – A list of words to filter the tweets

  • stopwords – A list of words to filter the tweets

  • n_hashtags – Number of hashtags to get

Returns

A DataFrame counting the hashtags and the number of times they have been retweeted

dash_utils.get_rt_temporalseries(df, keywords=None)[source]

Given a DataFrame containing all the tweets the function returns a DataFrame with the retweeted hashtags and dates, a list of dates and the hashtags sorted by number of appearances

Parameters
  • df – DataFrame with all the tweets

  • keywords – Keywords to filter the DataFrame

Returns

DataFrame with hashtags and dates, a list of days and hashtags sorted by appearance

dash_utils.get_single_counts(df)[source]

Given a dataframe with the columns Date and Number, it computes the increment between consecutive dates (Tweets and Follows). For instance, with the dates 10-03-2021 and 11-03-2021 and the counts 10 and 12, it returns (10-03-2021, 10; 11-03-2021, 2).

Parameters

df – A dataframe that must have the columns Date and Number

Returns

A dataframe counting the increments
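
The example in the description can be reproduced with a short plain-Python sketch. The helper name single_counts is hypothetical, a list of (date, total) pairs stands in for the dataframe, and the rows are assumed to be in chronological order.

```python
def single_counts(rows):
    """Convert cumulative totals into per-date increments.

    The first row keeps its own value, since there is no earlier total.
    """
    result = []
    previous = 0
    for date, total in rows:
        result.append((date, total - previous))
        previous = total
    return result


totals = [("10-03-2021", 10), ("11-03-2021", 12)]
increments = single_counts(totals)
print(increments)
```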

dash_utils.get_temporal_figure(df, n_hashtags=5)[source]

Given a dataframe that contains the number of appearances of each hashtag in each day it creates a time series figure to represent the results.

Parameters
  • df – A dataframe with the name of the hashtags, the date, and the number of appearances

  • n_hashtags – The number of hashtags to show

Returns

A time series figure representing the dataframe

dash_utils.get_topic_file(id)[source]

Given an id, it creates an upload box for a file containing keywords (one keyword per line)

Parameters

id – The id of the button

Returns

The button to upload a file

dash_utils.get_twitter_info_df()[source]

A function to return the cstrackproject Twitter user stats

Returns

A Dataframe with information about followers, retweets and tweets

dash_utils.get_two_mode_graph(df, keywords=None)[source]

Given a DataFrame containing all the tweets the function returns a two-mode graph connecting users with tweets

Parameters
  • df – A DataFrame with all the information

  • keywords – A list of words to filter the DataFrame

Returns

A two-mode graph connecting users with tweets

dash_utils.kcore_graph(df, keywords=None, stopwords=None, interest=None, anonymize=False)[source]

Given a dataframe with tweets, users… creates a graph of retweets.

Parameters
  • df – The dataframe containing the information

  • keywords – A list of words to get the tweets that contain those words

  • stopwords – A list of words to remove tweets that contain those words

  • interest – The interest (Lynguo filter)

  • anonymize – False to keep the real user names, True to anonymize them

Returns

The graph
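
The keywords and stopwords parameters described above act as an include filter and an exclude filter. A minimal sketch of that filtering step follows. The helper name filter_tweets is hypothetical, and case-insensitive substring matching is an assumption; the module may match words differently.

```python
def filter_tweets(texts, keywords=None, stopwords=None):
    """Keep texts that contain any keyword and none of the stopwords."""
    kept = []
    for text in texts:
        lowered = text.lower()
        # With keywords given, the text must contain at least one of them.
        if keywords and not any(k.lower() in lowered for k in keywords):
            continue
        # With stopwords given, the text must contain none of them.
        if stopwords and any(s.lower() in lowered for s in stopwords):
            continue
        kept.append(text)
    return kept


tweets = [
    "Citizen science rocks",
    "Citizen science spam offer",
    "Unrelated post",
]
selected = filter_tweets(tweets, keywords=["citizen"], stopwords=["spam"])
print(selected)
```

With both parameters left as None, every tweet is kept, which matches the default values in the signature.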

dash_utils.set_loading(controls, dcc_graph)[source]

Function to create a loading effect when filtering a graph

Parameters
  • controls – The filters

  • dcc_graph – The figure that is being updated

Returns

The element in which to embed the figure to apply the loading effect

dash_utils.wordcloudmain(df, keywords=None, stopwords=None, interest=None)[source]

Given a DataFrame with all the tweets the function creates a Wordcloud with the words that appear the most.

Parameters
  • df – A DataFrame with all the tweets

  • keywords – A list of words to filter the DataFrame

  • stopwords – A list of words to filter the DataFrame

  • interest – The interest to filter the DataFrame (Lynguo)