Introduction

This module covers the basics of doing research using Twitter. We will explore the following topics:

1. Overview of the project

2. Definitions of key terms

3. Scope and limitations of using Twitter for generating insights

4. Class activity: a brief stroll through Twitter land


1. Project Overview


Twitter is a microblogging social media channel that allows users to publish content of up to 280 characters, along with images and videos. Features such as hashtags, mentions, and replies allow users to network and interact with other Twitter users, making the platform amenable to asynchronous interactions in classroom settings. Apart from its communicative function, Twitter allows us to observe real-time social phenomena such as social interaction, information sharing, information seeking, self-documentation, and self-expression (Malik et al., 2019). Data gathered from Twitter can be helpful for measuring short-term effects and/or looking for insights (“trends”) relating to variables of interest.

For this project, we will gather tweets using an OER called TAGS. Using TAGS, we will create an archive of tweets related to a particular hashtag (Module 2). These tweets will be analyzed using qualitative methodologies (Module 3). Finally, we will present our insights, based on what we have observed, in a blog post, video, or infographic (Module 4).


2. Definitions of Key Terms


Social media are computer-mediated communication software that enable users to create, share, and view content in publicly networked one-to-one, one-to-many, and/or many-to-many communications (Hopkins, 2017).

boyd (2014) describes social media as a form of networked public inhabited by young people (and others) that is characterized by four structural affordances:

• Persistence: online expressions are automatically recorded and archived.
• Replicability: social media content can be duplicated with ease.
• Scalability: the potential visibility of content in networked publics is great.
• Searchability: content in networked publics can be accessed through search.

Big Data: Big data refers to data sets that are too large or complex to be dealt with by traditional data-processing application software (Wikipedia). Such data sets can be characterized by four V’s: volume, velocity, variety, and veracity (Chen & Wojcik, 2016).

  • Volume refers to the sheer scale of the data.
  • Velocity refers to the speed at which data is generated and the speed at which analytic processing is required.
  • Variety refers to the many forms that big data can take, including structured numeric data, text documents, audio, video, and social media.
  • Veracity refers to handling the challenges and ambiguities of varying forms of data. Raw, unstructured data must be translated and structured to prepare it for analysis.

Tweet: A tweet is a short message, status update, or other short-form content posted on the social media platform Twitter. A tweet is limited to 280 characters and may contain photos, GIFs, videos, links, and text.

Hashtag: A hashtag is a word or phrase preceded by #. Hashtags are used to index keywords or topics across social media platforms, including Twitter. A hashtag is a way to indicate (for users and algorithms) that a piece of content relates to a specific topic or belongs to a category.
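The definition above can be illustrated with a minimal Python sketch that pulls hashtags out of the text of a tweet. The function name extract_hashtags and the example tweet are made up for illustration, and the matching rule (a # followed by letters, digits, or underscores) is a simplifying assumption; Twitter's own rules for what counts as a hashtag are more involved.

```python
import re

# Assumption: a hashtag is "#" followed by letters, digits, or underscores.
# This approximates, but does not reproduce, Twitter's own matching rules.
HASHTAG_PATTERN = re.compile(r"#(\w+)")

def extract_hashtags(tweet_text):
    """Return the hashtags in a tweet, lowercased and without the leading '#'."""
    return [tag.lower() for tag in HASHTAG_PATTERN.findall(tweet_text)]

# Made-up example tweet.
example = "Staying in this weekend #StayHome #COVID19"
print(extract_hashtags(example))  # ['stayhome', 'covid19']
```

Lowercasing is a design choice here: it treats #StayHome and #stayhome as the same topic, which is usually what you want when indexing tweets by hashtag.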


3. Scope and limitations of using Twitter for generating insights

Text from tweets provides rich information for studying the distribution of sentiments, understanding public discourse and opinions on current affairs or topics of interest, mapping individual characteristics of users (such as political dispositions and personality traits), and observing information-sharing behaviors. URLs (uniform resource locators) and mentions/retweets embedded in tweets can offer key insights into how people connect virtually and how false news spreads on social media (Chen et al., 2021).

Examples of research articles that use Twitter data

Kim, N. J., Lin, J., Hiller, C., Hildebrand, C., & Auerswald, C. (2021). Analyzing US tweets for stigma against people experiencing homelessness. Stigma and Health.

Storer, H. L., Rodriguez, M., & Franklin, R. (2021). “Leaving Was a Process, Not an Event”: The Lived Experience of Dating and Domestic Violence in 140 Characters. Journal of Interpersonal Violence, 36(11–12), NP6553–NP6580.

Walker, L. A., Williams, A., Triche, J., Rainey, L., Evans, M., Calabrese, R., & Martin, N. (2021). #StayMadAbby: Reframing affirmative action discourse and White entitlement on Black Twitter. Journal of Diversity in Higher Education.

Zhang, C., Yu, M. C., & Marin, S. (2021). Exploring public sentiment on enforced remote work during COVID-19. Journal of Applied Psychology, 106(6), 797–810. https://doi.org/10.1037/apl0000933

Limitations

a. Limited history: Without a paid account, you cannot retrieve tweets that are older than 6-9 days.

b. Representation bias: According to a survey conducted by the Pew Research Center, only 23% of Americans are on Twitter (Odabas, 2022). It is therefore difficult to make generalizations about the population as a whole based on Twitter data.

c. Eliminating bots: It may be difficult to distinguish human engagement from engagement generated by automated bots on Twitter. A bot is software that may autonomously perform actions such as tweeting, retweeting, liking, following, unfollowing, or direct messaging other accounts. Varol et al. (2017) estimated that up to 15% of Twitter accounts were automated bots.

d. Missing contextual information: Because of platform limits on the number of characters, the inclusion of other media, and so on, it may be difficult to comprehend the exact context and meaning that authors intend to communicate through their tweets. Many researchers argue that automated computational tools lack the ability to understand context and nuance in human communication and language (Patton et al., 2020).

e. Issues of consent: While there is consensus among researchers that information extracted from public accounts on Twitter does not require the consent of individual participants, whether analyzing Twitter data counts as “human subjects research” is open to debate (Chen & Wojcik, 2016).


4. Class Activity

The objective of this class activity is to familiarize students with Twitter as a platform and to understand how hashtags are used, by whom, and for what kinds of messages.

One of the most common ways that social scientists study behavior on Twitter is by exploring hashtags. Simply put, a hashtag is a word or phrase preceded by #. It is used to index keywords or topics on Twitter, and it lets people indicate (for users and algorithms) that a piece of content relates to a specific topic or belongs to a category. Some examples of popular hashtags include:

  • Social movements: #BLM, #NotinMyName
  • Fandom communities around your favorite shows, movies, video games: #squidgames, #BLT, #sexeducation
  • Sentiment: #happy, #lol, #wow, #love
  • Impact of COVID: #COVID19, #Stayhome
  • Trending hashtags on Twitter in 2020

Instructions for class activity

For this class activity, participants can be divided into small groups of 3-8 members (depending on the class size).

Before beginning, you might want to shortlist one topic that the class is interested in exploring. You could select a topic related to the class syllabus, or something fun like #cats.

Instructions:

a. Using Twitter’s search engine, search for tweets related to your topic.

b. Make a list of popular hashtags that are being used in the tweets. Which ones are most common? Are there some that are often used together?

c. What are the different types of accounts that tweet on this topic regularly (for example, companies, individual users, government agencies, or nonprofits)?

d. Can you discern any pattern in the content of the tweets?

Participants can note their observations as a group and share with the class.
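For groups that want to go beyond counting by hand, step b can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the activity itself: the sample tweets are made up, the names hashtags_in and count_hashtags are hypothetical, and a hashtag is approximated as a # followed by letters, digits, or underscores.

```python
import re
from collections import Counter
from itertools import combinations

def hashtags_in(text):
    # Approximate rule (an assumption): '#' followed by letters, digits, or underscores.
    # Deduplicate and sort so each tweet counts a hashtag once and pairs are ordered consistently.
    return sorted(set(tag.lower() for tag in re.findall(r"#(\w+)", text)))

def count_hashtags(tweets):
    """Count how many tweets use each hashtag, and how many use each pair together."""
    single_counts = Counter()
    pair_counts = Counter()
    for text in tweets:
        tags = hashtags_in(text)
        single_counts.update(tags)
        pair_counts.update(combinations(tags, 2))
    return single_counts, pair_counts

# Made-up tweets, for illustration only.
sample_tweets = [
    "My cat is asleep on the keyboard again #cats #caturday",
    "Adopt, don't shop #cats #adoptdontshop",
    "Weekend plans: naps #caturday #cats",
]

singles, pairs = count_hashtags(sample_tweets)
print(singles.most_common(3))  # [('cats', 3), ('caturday', 2), ('adoptdontshop', 1)]
print(pairs.most_common(1))    # [(('cats', 'caturday'), 2)]
```

The first printout answers "Which ones are most common?" and the second answers "Are there some that are often used together?" for the sample data.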

References:

boyd, d. (2014). It’s complicated: The social lives of networked teens. Yale University Press.

Chen, E. E., & Wojcik, S. P. (2016). A practical guide to big data research in psychology. Psychological Methods, 21(4), 458.

Chen, K., Duan, Z., & Yang, S. (2021). Twitter as research data: Tools, costs, skill sets, and lessons learned. Politics and the Life Sciences, 1-17.

Malik, A., Heyman-Schrum, C., & Johri, A. (2019). Use of Twitter across educational settings: A review of the literature. International Journal of Educational Technology in Higher Education, 16(1), 1-22.

Odabas, M. (2022). 10 facts about Americans and Twitter. Pew Research Center. https://www.pewresearch.org/fact-tank/2022/05/05/10-facts-about-americans-and-twitter/

Patton, D. U., Frey, W. R., McGregor, K. A., Lee, F. T., McKeown, K., & Moss, E. (2020, February). Contextual analysis of social media: The promise and challenge of eliciting context in social media posts with natural language processing. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (pp. 337-342).

Varol, O., Ferrara, E., Davis, C., Menczer, F., & Flammini, A. (2017, May). Online human-bot interactions: Detection, estimation, and characterization. In Proceedings of the International AAAI Conference on Web and Social Media (Vol. 11, No. 1).
