Analysis of Tweets


Welcome to module three. In this section, we discuss two questions: How do we make meaning of this large corpus of tweets? What are systematic approaches to data analysis?

Overview of module:

1. Brief introduction to approaches for data analysis

2. Sampling

1. Brief Introduction to Approaches for Data Analysis

Approaches to analyzing tweets and other social media data are numerous and varied. Describing them in detail is beyond the scope of this resource. The slide show below provides a quick snapshot of different approaches for analyzing tweets. Additional resources are provided in case you are interested in learning more.

 


 

 

2. Sampling

A random sample is one in which all observations or respondents in a population have an equal chance of being selected, for example, picking a number for bingo. For our project, an easy way to do this is to use a random number generator and select the corresponding tweets from the TAGS archive.
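For those who prefer a script to an online random number generator, the idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_row_numbers`, the archive size of 5000 tweets, and the sample size of 100 are hypothetical, and the sketch assumes row 1 of the archive sheet is a header so that tweets start at row 2.

```python
import random


def sample_row_numbers(total_tweets, sample_size, seed=None):
    """Return sorted spreadsheet row numbers for a simple random sample.

    Assumption: row 1 of the sheet holds column headers, so the tweets
    occupy rows 2 through total_tweets + 1.
    """
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    tweet_rows = range(2, total_tweets + 2)
    # random.sample draws without replacement, so no row is picked twice
    return sorted(rng.sample(tweet_rows, sample_size))


# Hypothetical numbers: an archive of 5000 tweets, a sample of 100
rows = sample_row_numbers(total_tweets=5000, sample_size=100, seed=42)
print(rows[:10])  # first few row numbers to look up in the archive
```

Recording the seed alongside your results lets a classmate or reader regenerate exactly the same sample.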

A purposive sample is one in which you select tweets based on certain criteria. These criteria will depend on your topic of interest or your research question. For example, you might decide to select tweets by people residing in a certain region or tweets by users of mental health services. In the former case, you can easily shortlist regions of interest by selecting relevant categories under “user_location” (Column P).

How to select regions of interest using the Google Sheets drop down menu

Note: If the number of tweets in your purposive sample is high, you might want to use a random number generator to select tweets from within the category.
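The same two steps (filter by region, then draw a random subset if the category is large) can also be sketched in Python for readers who export their archive as a CSV file. This is a rough illustration, not part of the TAGS workflow itself: the function `purposive_sample`, the tiny in-memory example data, and the `text` column name are assumptions made for the example; only the `user_location` column name comes from the archive described above.

```python
import csv
import io
import random


def purposive_sample(rows, regions, max_size=None, seed=None):
    """Keep tweets whose user_location matches one of the chosen regions.

    If more than max_size tweets match, draw a random subset of that size.
    Matching is exact apart from letter case and surrounding spaces.
    """
    wanted = {region.strip().lower() for region in regions}
    matches = [
        row for row in rows
        if (row.get("user_location") or "").strip().lower() in wanted
    ]
    if max_size is not None and len(matches) > max_size:
        matches = random.Random(seed).sample(matches, max_size)
    return matches


# Hypothetical stand-in for a CSV export of the archive
example_csv = """text,user_location
First tweet,New York
Second tweet,London
Third tweet,new york
Fourth tweet,
Fifth tweet,Chicago
"""
tweets = list(csv.DictReader(io.StringIO(example_csv)))

selected = purposive_sample(
    tweets, regions=["New York", "Chicago"], max_size=2, seed=1
)
for tweet in selected:
    print(tweet["user_location"], "|", tweet["text"])
```

Because users type their location as free text, exact matching like this will miss variants such as "NYC"; reviewing the distinct values in the column first, as the drop-down menu allows, remains a sensible step.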

Additional resources

Dictionary based Text Analysis: https://sicss.io/2019/materials/day3-text-analysis/dictionary-methods/rmarkdown/Dictionary-Based_Text_Analysis.html

Topic Modeling: https://cbail.github.io/SICSS_Topic_Modeling.html

Grounded Theory: https://www.depts.ttu.edu/education/our-people/Faculty/additional_pages/duemer/epsy_5382_class_materials/Grounded-theory-methodology.pdf

 

Critical pedagogy for social media


A side profile of a woman in a russet-colored turtleneck and white bag. She looks up with her eyes closed.

“The classroom remains the most radical space of possibility in the academy”

― bell hooks

The above quote by bell hooks from her book “Teaching to Transgress” exhorts us to expand our vision of the classroom. hooks provokes us to move beyond passive instructional pedagogy and imagine learning communities that are collaborative and that nurture dialogue and critical thinking.

Reframing this in the context of the current project, the classroom offers radical possibilities of collaborating with “digital natives” (Prensky, 2011) to uncover the socio-historical context of existing digital landscapes and unpack how power is perpetuated, exacerbated, and mitigated by information systems. It also offers us unique possibilities of crafting digital futures that cultivate meaningful relationships and foster social change.

Technical systems, akin to social and legal codes, are entrenched in the inequalities of power plaguing our societies. Within popular discourse, replacing human judgment with AI-based decision making is wrongly considered a viable solution to address issues of bias in institutions such as the criminal justice system. These assumptions stem from beliefs about Big Data as being “unbiased”, “objective”, and “theory-free”. Wrong assumptions about the implicit neutrality of digital architecture are harmful as they can lead to overconfidence in exactitude, underestimation of risks, and minimization of epistemological issues.

In her seminal book “Race after Technology: Abolitionist Tools for the New Jim Code”, Ruha Benjamin (2019) argues that racism and other forms of discrimination are embedded in digital architectures. Across diverse sectors such as health care, criminal justice, and finance, researchers have demonstrated how “unbiased” algorithms systematically discriminate against people of color. Obermeyer et al. (2019) demonstrated how a widely used algorithm in healthcare was more likely to flag white patients for extra medical attention than Black patients who were just as sick.

Algorithms betray the biased assumptions not only of the individuals and institutions who create them but also of society as a whole. All AI-based decision making tools need to be “trained” on existing datasets. By default, their results will continue to perpetuate the disparities in the original datasets themselves.

There is a need to critically examine the technical choices underlying digital infrastructure. These choices determine the nature, purpose, and outcome of digital applications. They also offer important clues to how structural problems in society extend to digital contexts.

Risks and affordances enabled by the rise of social media

Digital technologies act as neural pathways supporting mobilizing efforts both online and offline. On one hand, this has enabled mass mobilization across the globe around issues of racism and gender discrimination. The Black Lives Matter movement is a case in point. Social media enabled the mobilization of mass protests against racial violence across the US and the globe. According to some estimates, the protests that followed the murder of George Floyd may have been the largest movement in the history of the US (Buchanan et al., 2020).

Social media platforms, unfortunately, have also created opportunities for mobilization around extremist ideologies and a re-emergence of the alt-right across the globe. According to Jessie Daniels (2018), “The rise of the alt-right is both a continuation of a centuries-old dimension of racism in the U.S. and part of an emerging media ecosystem powered by algorithms”. The alt-right has been an early adopter of social media and is heavily invested in building community online and recruiting White working class youth into its fold. When such users were banned from mainstream social media sites for spreading false information and violent content, they created their own parallel digital forums such as Gab. By exploiting the affordances of emerging technologies, the alt-right has been able to expand the boundaries of acceptable ideas in public discourse, also termed the “Overton window”. The implications of these shifts, including the resultant rise in intolerance and hate, the dehumanization of the other, and our own vulnerability to false information, are important social justice issues of our times.

Because they customize the content visible to us on the internet, algorithms can create echo chambers and reinforce our existing beliefs. This results in increased polarization of views and aids in the spread of misinformation. It also masks the covert operations of machineries that spread intolerance and hate. Most people are unaware that platforms like Gab, with over 1.1 million followers, exist.

As educators, it is paramount that we provide our students with the necessary intellectual and digital tools to dissect and confront these unjust technological infrastructures. In the words of Paulo Freire (1972), “There is no such thing as neutral education. Education either functions as an instrument to bring about conformity or freedom.”

References

Balazka, D., & Rodighiero, D. (2020). Big data and the little big bang: An epistemological (r)evolution. Frontiers in Big Data, 3, 31.

Buchanan, L., Bui, Q., & Patel, J. K. (2020). Black Lives Matter may be the largest movement in US history. The New York Times, 3.

Daniels, J. (2018). The algorithmic rise of the “alt-right”. Contexts, 17(1), 60-65.

Freire, P. (1972). Pedagogy of the oppressed. New York: Herder and Herder.

Obermeyer, Z., Powers, B., Vogeli, C., & Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science, 366(6464), 447-453.

Ethical Considerations

The rapid expansion of technology and the accompanying rise in social media users have led to a heightened awareness of the risks and potential harms associated with social media usage. Creating a safe forum for students to discuss and share their ideas is crucial to promoting learning in a classroom. These concerns become amplified when digital projects require public-facing engagement by students such as posting, sharing, or reacting to content on social media. In the section below, I summarize some of these ethical quandaries related to privacy, the impact of social media use on mental health, and questions of equity.

Privacy

As more of our everyday lives move online, companies, governments, and even the class bully are able to keep a closer watch on the way we think, feel, or behave. The data trails we leave online through our activities raise serious concerns related to privacy. The ways in which data mined by powerful social media platforms make us vulnerable to manipulation were evident in the Facebook Cambridge Analytica controversy. As educators, it is important to think through privacy concerns, especially how they may impact vulnerable and marginalized students in our classes.

Dr. Cottom suggests a deliberate strategy for social media projects in class: have students use a separate alias class account, distinct from their “real” or personal accounts. Content posted on social media accounts tends to be permanent and gets curated as an artifact of our larger digital persona. Not all students would want their coursework to be in the public eye and part of their online presence. It is common practice for people to “look up” a person on social media, whether it is someone you want to date or a potential employee. Students may not wish to be associated with content they post as part of an undergraduate student project. The (relatively) permanent nature of the content adds further complication over time. Students’ personal and professional identities may shift, and they may no longer identify with content they have created.

An additional and potentially graver concern is online trolling and doxxing. Using words like “racism” or “white supremacy”, or taking part in discussions on trending issues, might expose students to harmful trolling and doxxing. Having a separate social media account for class acts as a buffer against online trolling.

Reviewing and controlling privacy settings is another potential safeguard against trolling. For instance, Twitter allows you to set your account to “protected” so that only those who follow you can access your tweets.

Removing any personally identifiable information (including images) before posting online is another helpful strategy. This also applies to users whose content we are using in our class projects (e.g., tweets in the TAGS project). As an additional reference, you can read more about the ethical guidelines developed by the SAFElab for working with social media data.

Cyberbullying

A concern related to doxxing is cyberbullying. Cyberbullying is the persistent, intentional harassment of another individual through any electronic method, including social media, in order to cause harm. It includes defamation, public disclosure of private facts, and intentional infliction of emotional distress. Researchers have identified seven forms of cyberbullying: flaming, online harassment, cyberstalking, denigration, masquerading, trickery and outing, and exclusion (Watts et al., 2017).

Video: What is cyberbullying?

Researchers estimate that between 5% and 40% of undergraduate students have been victims of some form of cyberbullying. This is a serious concern as it could potentially exacerbate mental health concerns within this population (Martínez-Monteagudo et al., 2020).

The data cited above indicate that cyberbullying is a pervasive problem that is not necessarily associated with introducing social media projects in class. As educators, it is important to be aware of these realities and to be cognizant of how a social media project could create additional opportunities for bullying. For example, a student may try to stalk a fellow classmate via their class social media handle.

Collectively framing guidelines for respectful digital behavior on social media and discussing case studies of cyberbullying can be a starting point to address these issues.

Surveillance and challenges for vulnerable populations

Considerations around privacy and safety become paramount while working with students belonging to vulnerable and/or marginalized communities. Social media provides a “broad and deep tool for surveillance” to law enforcement agencies. During the George Floyd protests across the US in 2020 and the earlier pro-democracy protests in Hong Kong, governments used social media posts to identify and arrest protestors. Information available on social media is used as evidence to incarcerate young people, without adequate consideration of youth digital cultures and norms of exaggeration related to online self-presentation. Black and Latino youth in particular face police surveillance and violence that extends from their neighborhoods to social media (Lane and Ramirez, 2021). Immigrants and undocumented people are similarly at risk of surveillance by the Department of Homeland Security. The department has been notorious for conducting raids using social media activity or even denying visa applications if social media posts have been critical of the US government.

Listening to the concerns of students and identifying those who may be especially vulnerable may be a first step to address these concerns. There are also some great resources on online safety created by CryptoHarlem, a nonprofit that works to promote anti-surveillance, cybersecurity education, and advocacy.

Social Media Use: Impact on mental health

Many studies now suggest that social media usage is correlated with a series of mental health concerns including depression, anxiety, suicidality, and body image concerns, among others (Kross et al., 2021). While it is difficult to establish a direct connection between the two, there are mediating factors such as exposure to cyberbullying, increased social comparison, and reduction in physical activity that occur as a result of high social media usage and exacerbate mental health concerns (Viner et al., 2019). We know, for example, that increased screen time at bedtime affects the quality and duration of sleep, which in turn affects mental health.

When initiating a social media based project in the classroom, it might be helpful to have a conversation with students about self-care and boundaries around social media. This will also give students an opportunity to reflect upon their own social media usage and its impact on their wellbeing. Another possibility to take into consideration is that some students may be on a digital detox (El-Khoury et al., 2021) for mental health reasons and may not wish to be present on a social media platform. In such a scenario, you may want to clarify the requirements of the course at the beginning of the semester and guide students to select alternate courses that might work better for their current situation. You may also consider creating alternate assignments that do not require the use of social media, such as a review of literature.

Addressing Inequities in Access

In 2021, roughly four in ten adults with lower incomes did not have home broadband services (43%) or a desktop or laptop computer (41%).

Pew Research Center, 2021

The COVID-19 pandemic has directed our attention to ethical concerns around inequities in digital access. As everyday activities moved online, lower income families found it increasingly difficult to navigate these digital demands. Family members were forced to share whatever limited digital resources they had, including a broadband connection and devices such as laptops, tablets, and desktops. Researchers have used the term “homework gap” to describe the barriers students face when working on assignments without a reliable Internet source at home. In 2021, roughly four in ten adults with lower incomes did not have home broadband services (43%) or a desktop or laptop computer (41%) (Pew Research Center, 2021). Recognizing this gap, schools and universities used government funds to loan devices and provide low cost internet during the pandemic.

For instructors serving low income communities, the lack of access to a personal digital device and a good internet connection can pose serious limitations to implementing digital projects. In many educational institutions there is a dearth of devices, and computer labs need to be booked months in advance. Digital divides existed before the pandemic and will likely continue into the near future. To what extent institutional supports to counter digital divides will be accessible to students moving forward remains unclear. Some studies estimate that 75% of solutions implemented during the pandemic to bridge the homework gap are expected to expire in the next one to three years (Rideout & Robb, 2021). An important pedagogical consideration, therefore, is to design digital projects with these challenges in mind.

Lack of access to a personal digital device (other than a smartphone) was an important factor incorporated in the design of the resources outlined on this website. Both TAGS and Google Sheets are applications that are stored on the Google Drive cloud. You do not need to install them on a specific computer. The archive of tweets you gather via TAGS is saved on your Google Drive, and you can access it from anywhere, whether a friend’s device or a public computer.

Digital divides also have important implications for designing lesson plans. Students’ technical skills are closely linked with their previous exposure to digital contexts. For example, projects described on this website require the use of Google Sheets or Microsoft Excel. The first time I implemented this project, I wrongly assumed students would be familiar with these platforms. After noticing some students puzzled by the project instructions, I realized they had not used this software before. Acknowledging this error, I held a separate workshop to teach basic spreadsheet skills to the class. Surprisingly, a significant majority of the class participated. I now send out a brief survey at the beginning of the semester asking students about their knowledge of various software and their current access to internet and digital devices. A survey can help us identify the current capabilities of students and plan for the supports they need. This also holds true for students who may have advanced technical skills and need more complex assignments to sustain their interest in class. Mapping the capabilities of students and adapting your lesson plans accordingly is a critical aspect of designing learning projects in the classroom.

The financial costs associated with a digital project are another crucial aspect to consider while thinking about access and equity. Many lab based courses offered to undergraduates require the use of paid proprietary software such as Stata, SPSS, SAS, or ATLAS.ti. This software requires a paid subscription upwards of $100 annually, which may discourage students with strained finances from enrolling in the course. As discussed in the Introduction, this website uses Open Educational Resources (OER) applications. OER are freely accessible teaching or research resources that either have an open-copyright license (such as one from Creative Commons) or are part of the public domain and have no copyright. OER materials not only help to reduce costs but also allow students to collaborate and contribute to a growing universe of shared knowledge resources (Wikipedia is a great example).

References

El-Khoury, J., Haidar, R., Kanj, R. R., Bou Ali, L., & Majari, G. (2021). Characteristics of social media ‘detoxification’ in university students. Libyan Journal of Medicine, 16(1), 1846861.

Pew Research Center (2021). Digital divide persists even as Americans with lower incomes make gains in tech adoption.

Kross, E., Verduyn, P., Sheppes, G., Costello, C. K., Jonides, J., & Ybarra, O. (2021). Social media and well-being: Pitfalls, progress, and next steps. Trends in Cognitive Sciences, 25(1), 55-66.

Lane, J., & Ramirez, F. A. (2021). Social media as criminal evidence: New possibilities, problems. Footnotes, 49(4).

Martínez-Monteagudo, M. C., Delgado, B., García-Fernández, J. M., & Ruíz-Esteban, C. (2020). Cyberbullying in the university setting. Relationship with emotional problems and adaptation to the university. Frontiers in Psychology, 3074.

Rideout, V. J., & Robb, M. B. (2021). The Common Sense Census presents: Research brief. Remote learning and digital equity during the pandemic. San Francisco, CA: Common Sense.

Viner, R. M., Gireesh, A., Stiglic, N., Hudson, L. D., Goddings, A. L., Ward, J. L., & Nicholls, D. E. (2019). Roles of cyberbullying, sleep, and physical activity in mediating the effects of social media use on mental health and wellbeing among young people in England: A secondary analysis of longitudinal data. The Lancet Child & Adolescent Health, 3(10), 685-696.

Watts, L. K., Wagner, J., Velasquez, B., & Behrens, P. I. (2017). Cyberbullying in higher education: A literature review. Computers in Human Behavior, 69, 268-274.

Inquiry Based Learning & Social Media

 

Inquiry-based learning (IBL) is an educational approach that emphasizes learning through student involvement in solving complex, authentic questions or problems (Lippmann, 2020). IBL envisions an active role for students in knowledge construction and puts them in the driver’s seat. Students are encouraged to use the tools that scientists and practitioners use to find a solution or explore a problem. Within the social sciences, this entails that students undertake their own research: framing a hypothesis or a research question, designing and implementing an appropriate research methodology, and finally analyzing and interpreting the results. The role of the instructor in such a learning process is that of a facilitator.

Inquiry-based learning can aid in the development of students’ conceptual, critical, and analytical thinking. Students can learn to formulate hypotheses, evaluate “credible” sources of information, and further enhance their problem solving, analytical, composition, and communication skills. Involvement in such projects can lead to a greater appreciation for scientific and systematic inquiry (Gasper and Gardner, 2013).

Undergraduate social science classes are often segregated into “Lab-based/Methods” courses and theoretical courses. Lab-based courses are highly technical, often smaller in size, and offer opportunities for students to learn about “doing” research without acknowledging the theoretical basis underlying the processes they teach. Conceptual or theory based courses, on the other hand, rely mostly on didactic pedagogies. This artificial binary within academia inhibits students from creatively engaging with theory and developing their application skills. Many experiential educators argue that “learning by doing” and direct experience offer the most powerful intellectual experience to learners (Roberts, 2012).

Inquiry based projects using Social Media

Educators in the social sciences have incorporated social media into the undergraduate classroom in innovative and engaging ways. The table below offers a small glimpse of social media projects that draw on principles of IBL. These range from a 20 minute class activity to a semester long project. Common pedagogical goals underlying these lesson plans or projects include creative engagement with learning materials, application of theoretical constructs in real life settings, and enhancing students’ critical and analytical skills.

Table 1 : Overview of Inquiry based activities/projects using social media

References

Freeman, S., Eddy, S. L., McDonough, M., Smith, M. K., Okoroafor, N., Jordt, H., & Wenderoth, M. P. (2014). Active learning increases student performance in science, engineering, and mathematics. Proceedings of the National Academy of Sciences, 111(23), 8410-8415.

Lippmann, M. (2020). Inquiry-Based Learning in Psychology. International Handbook of Psychology Learning and Teaching, 1-30.

Gasper, B. J., & Gardner, S. M. (2013). Engaging students in authentic microbiology research in an introductory biology laboratory course is correlated with gains in student understanding of the nature of authentic research and critical thinking. Journal of Microbiology & Biology Education, 14(1), 25-34.

Roberts, J. W. (2012). Beyond learning by doing: Theoretical currents in experiential education. Routledge.

Introductory Note

Today, digital spaces are an important and inescapable aspect of our social and personal lives. In 2005, only 5% of Americans reported using one or more social media platforms; by 2019, this number had risen to 72% (Pew Research Center, 2021). Furthermore, young people on average spend between two and six hours a day on social media sites such as Facebook, Instagram, Snapchat, TikTok, and YouTube (Statista, 2022).

Given the amount of time young people spend on social media and how intertwined it is with their existence, connecting concepts taught in the classroom with what is happening on social media can potentially lead to enhanced learning outcomes. Research suggests that learning engagement is key to achieving superior academic outcomes (Freeman et al., 2014). When students perceive course content to be relevant and valuable to their personal and professional lives, they feel motivated to learn. Furthermore, the rapid growth in engagement on social media platforms and its impacts on human behavior can be sites of inquiry themselves.

Most instructors require students to abstain from using social media in the classroom. The insights offered in this resource help us reimagine alternatives to surveillance, by mindfully integrating social media use as part of the lesson plan. This can create a classroom experience that is more “natural” for young people, and also open possibilities for conversations around thoughtful and meaningful social media engagement, and what it means for us.

Experiences shared on social media are “naturally occurring”, i.e. they are not generated for the purposes of research. Unlike information gathered in a laboratory or through self-report questionnaires, which can be artificial or biased, social media data therefore has the potential to offer us unique insights about behavior in everyday life rather than in artificial settings.

Social media platforms offer a relatively economical, quick, and targeted way of collecting data, especially compared to other sources such as polls, surveys, and demographic and economic data. This data can be useful for class projects and can help students analyze social phenomena as they unfold in real time.

Rationale for this project

Designing class projects or assignments that draw on social media can be daunting for instructors in the social sciences for several reasons. First, many of us may lack the necessary resources for such a project, such as access to a dedicated computer lab, proprietary software for data harvesting and analysis, and/or technical assistance. Second, social media research is often conflated with Big Data and computational methods. A related assumption is that instructors and students need to be well versed in coding in programming languages or software such as R or Python to undertake such projects. Third, it involves gaining familiarity with the rapidly evolving landscape of digital technology and social media platforms. The demands of an increasingly precarious, neoliberal academia are exhausting. The additional task of tuning into popular online culture, trending hashtags, or even the basic skills of navigating a new platform can feel overwhelming and like an additional “burden”. This is especially true for those of us (including myself) who do not identify as digital natives.

To address some of the concerns outlined above, this website draws on Open Educational Resources that require no previous familiarity with computer programming and need minimal investment in terms of learning new skills. Open Educational Resources (OER) are learning, teaching, and research materials that are available in the public domain or are under copyright but have been released under an open license that permits no-cost access, re-use, re-purposing, adaptation, and redistribution by others (UNESCO). The OER Commons is a public digital library of open educational resources and a great database to begin exploring the world of OER.

For analyzing social media data, the website suggests methods that require little use of programming languages. These are primarily inductive and qualitative approaches such as thematic analysis and grounded coding, along with the use of descriptive statistics. (For those curious to learn more about coding, there are a host of OER resources available online.)

References

Freeman, S., Eddy, S. L., McDonough, M., Smith, M. K., Okoroafor, N., Jordt, H., & Wenderoth, M. P. (2014). Active learning increases student performance in science, engineering, and mathematics. Proceedings of the National Academy of Sciences, 111(23), 8410-8415.

Pew Research Center (2021). Social Media Factsheet. https://www.pewresearch.org/internet/fact-sheet/social-media/

Statista Research Department (2022). Daily time spent on social networking by internet users worldwide from 2012 to 2022. https://www.statista.com/statistics/433871/daily-social-media-usage-worldwide/