By Indiana Lee.
Perhaps just as disconcerting as the coronavirus pandemic raging across the world is the prevalence of information — both true and misleading — that has propagated online. Social media has helped spread both truth and lies, protecting and disillusioning different segments of the population. The dangers occur for the same reasons these platforms are useful: they are open, transparent sources of data.
Never before have human beings had such broad access to information. Big data, or the vast amounts of data available through social media and recorded digital interactions, gives us benefits that inarguably aid humankind — but there are drawbacks as well.
Here, we will explore how data is used through social media and scientific study, as well as the benefits and drawbacks of big data’s role in information dissemination.
Data Utilization and Science
Social media is, in many ways, a data scientist’s dream. With the number of people freely offering opinions, content, and experiences, scientists can generate comprehensive investigations into the workings of humanity through platforms we use every day like Facebook and Twitter.
In the STEM Fellowship Big Data Challenge, participants used Twitter data to study public perception of the malaria drug hydroxychloroquine, comparing reactions to comments made by prominent figures with reactions to the subsequent release of scientific studies making differing claims.
Emily Chan, Ginah Choi, Kendrew Wong, and Shirley Zang of the University of British Columbia reported their findings in a study based on data collected from Twitter. To gather it, they used an open-source extraction method found on the popular code repository GitHub. Using this method, they were able to pull every instance of the word “hydroxychloroquine” for a predetermined period. This enabled them to divide the tweets by subject matter on a scale of positivity and negativity regarding the perception of the drug. By placing the results on a timeline of information coming out about hydroxychloroquine, they were able to determine how that information affected public perception.
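To make the workflow described above concrete, here is a minimal sketch in Python of its three steps: filter posts by keyword and date range, label each one as positive or negative, and count the labels per day. It is not the team's actual method. The word lists, the sample tweets, and the function names (filter_tweets, classify, sentiment_timeline) are all invented for illustration, and a real study would rely on a validated sentiment model rather than a handful of hand-picked words.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical toy lexicons (assumption): stand-ins for a real sentiment model.
POSITIVE_WORDS = {"works", "promising", "effective", "hope"}
NEGATIVE_WORDS = {"dangerous", "ineffective", "risk", "harmful"}

# Invented example records of (date, text); NOT data from the study.
SAMPLE_TWEETS = [
    (date(2020, 3, 20), "Hydroxychloroquine looks promising, real hope here"),
    (date(2020, 3, 20), "Is hydroxychloroquine dangerous? Seems like a risk"),
    (date(2020, 3, 21), "New phone arrived today"),
    (date(2020, 6, 5), "Study says hydroxychloroquine is ineffective"),
    (date(2020, 7, 1), "hydroxychloroquine works, I am sure of it"),
]


def filter_tweets(tweets, keyword, start, end):
    """Keep tweets that mention the keyword and fall within [start, end]."""
    keyword = keyword.lower()
    return [
        (day, text)
        for day, text in tweets
        if start <= day <= end and keyword in text.lower()
    ]


def classify(text):
    """Label text by counting distinct positive and negative lexicon words."""
    words = {word.strip(".,!?\"'").lower() for word in text.split()}
    score = len(words & POSITIVE_WORDS) - len(words & NEGATIVE_WORDS)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def sentiment_timeline(tweets):
    """Count sentiment labels for each day that has matching tweets."""
    timeline = defaultdict(Counter)
    for day, text in tweets:
        timeline[day][classify(text)] += 1
    return dict(timeline)


matched = filter_tweets(
    SAMPLE_TWEETS, "hydroxychloroquine", date(2020, 3, 1), date(2020, 6, 30)
)
timeline = sentiment_timeline(matched)
for day in sorted(timeline):
    print(day, dict(timeline[day]))
```

Lining up daily counts like these against the dates of public statements and study releases is the general idea behind charting how perception shifts, or fails to shift, over time.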
The results showed that public figures making comments about the drug gained more attention than actual scientific findings, but that “relative distributions of positive to negative sentiments remain unchanged throughout the pandemic since the beginning, due to possible belief perseverance.”
This kind of data analysis is useful to science in a variety of ways. First, it demonstrates how social media data can be utilized to understand how the public perceives and picks up on information in an information-crowded world. Second, it shows the application of big data in the social, political, and health sciences. The right information must find its way to public consciousness amidst the sea of content that exists on the internet; scientific data analysis like this helps us to comprehend how we can break through the fog to form evidence-based approaches to our perception of the world.
With 2.5 quintillion bytes of data generated every day, a number that is constantly rising, the need for judicious analysis of big data and the content spread on the internet becomes more important all the time. Data scientists and big data engineers team up to turn the vast tides of data into narratives we are capable of understanding and acting upon. But amid all the noise of constantly fed social media platforms, is big data utilization more of a benefit or a drawback of the modern world?
The Benefits of Big Data on Social Media
The applications of data utilization through social media are far-reaching. These platforms give us tools to share truth through scientific study without the natural “gates” that occur in traditional scientific journals and spaces. Similarly, the accumulation of public data gives scientists further tools to understand the world around us.
Through social media, the scientific community can spread its findings and attract funding from a broader audience. In 2018, the Pew Research Center found that 30 science-related Facebook pages had between 3 million and 44 million followers. In comparison, the scientific journal Nature has only 63,000 subscribers.
When science and the big data it analyzes are restricted to platforms like peer-reviewed journals, the audience naturally becomes gated, locked in essence behind a resource that many people do not have access to or awareness of. Social media platforms, however, act as a method of democratizing science, spreading actual knowledge and data to more people than a journal ever could.
Additionally, the availability of data on a platform like Twitter gives scientists powerful tools for understanding world events and how the public at large perceives them. The example from the STEM Fellowship Big Data Challenge illustrates the usefulness of this data. In that case, analysts were able to chart public perception through available tweets. The understanding this kind of study creates can then be directed towards finding solutions to the false perceptions that propagate in today’s world of “fake news” and misinformation campaigns.
The Drawbacks of Big Data on Social Media
Unfortunately, the presence of vast amounts of data on social media makes it harder to discern truth amid a landscape of falsehoods and pseudoscience. This noise, on top of the threats of cherry-picked data and stolen personal data, makes for an often troubling reality.
Fake science that creates and perpetuates ignorance does damage to real science and can lead to situations in which people lose their lives because of adherence to incorrect information. For example, the fabricated claims that vaccines for measles and mumps cause autism have led to avoidable disability and even death among children. Such claims are propagated on social media, where context and fact-checking are not required as they are with peer-reviewed scientific literature.
The ability of social media to spread misinformation at a rapid pace also increases misconceptions about, and rejection of, scientific consensus among a broader population. An assessment of the ability of American students to detect bias in political tweets or news-like advertisements found that a majority lacked the media literacy to do so effectively. This makes for a problematic atmosphere in which reality and verified science can be lost among the vastness of data that contradicts or dismisses it.
When such a proliferation of false science exists on the internet, it can be easy for individuals, even scientists, to select the data that reinforces their bias. With so much data to choose from, misinformation can be cherry-picked to further a goal, political or monetary, without regard for the damage that might be done in the process. Sometimes the person using this data is even well-intentioned and simply does not realize that it was falsely generated. This is why researchers must be cautious in navigating the landscape of information, carefully choosing their research and vetting their sources.
Additionally, the danger of data theft and of mined data being sold to advertisers with or without user consent is a real concern on social media platforms. Scandals like the 2018 Cambridge Analytica misuse of Facebook data occur with alarming frequency, demonstrating the power of data misuse in affecting our reality. Protecting consumer data in the big data era is a challenge that has yet to be fully solved. Since that data can be used to target misinformation directly to consumers based on their preferences, a threat both political and scientific is added to the existing financial consequences of cybercrime.
A Complicated Staple of Our Present Landscape
With the reality of big data on social media, the way that data is utilized can represent two sides of a complicated coin. On one side, big data on social media can assist scientists and help spread truth and understanding. On the other, misinformation and the possibility of crafted attacks based on data can proliferate misunderstanding and ignorance.
Social media is a powerful tool, and if any takeaway can be drawn from the way data is utilized on these platforms, it is that caution and respect for that power should be maintained at all times. Far from everything you see on social media is true. Facts should be checked and research backed up.
The use of big data on social media to spread misinformation and mislead the public is an unfortunate staple of our present landscape, but with enhanced media literacy and broader education on how to verify information, social media data is a tool that can be used for knowledge and understanding.