Big Data

What are big data?

“Big data” are still data, but in quantities far larger than can usually be handled on a desktop computer or in a traditional database. Big data are not only huge in volume; they also grow exponentially over time. They are so large and complex that no traditional data-management tool can store or process them efficiently. If you can process your data on your own computer, or in the database on your usual server, without crashing either one, you are likely not working with “big data.”

How does big data work?

The field of big data has evolved as technology’s ability to constantly capture information has skyrocketed. Big data are usually captured without being entered into a database by a human being, in real time: in other words, big data are “passively” captured by digital devices.

The internet provides infinite opportunities to gather information, ranging from so-called meta-information or metadata (geographic location, IP address, time, etc.) to more detailed information about users’ behaviors, often drawn from online social media or credit-card purchasing behavior. Cookies are one of the principal ways that web browsers gather information about users: they are essentially tiny pieces of data stored on a web browser, or little bits of memory about something you did on a website. (For more on cookies, visit this resource).
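
To make the mechanics concrete, here is a minimal sketch, using Python’s standard library, of how a server sets a cookie and how the browser’s stored value is read back on a later request. The cookie name and value are invented for illustration.

```python
from http.cookies import SimpleCookie

# A server response sets a cookie: a small key-value pair the
# browser stores and sends back with every later request to the site.
response_cookie = SimpleCookie()
response_cookie["visitor_id"] = "abc-123"          # hypothetical identifier
response_cookie["visitor_id"]["max-age"] = 86400   # persist for one day
print(response_cookie.output())                    # the Set-Cookie header

# On the next request, the browser echoes the stored value back,
# letting the site recognize the same visitor across visits.
request_header = "visitor_id=abc-123"
parsed = SimpleCookie(request_header)
print(parsed["visitor_id"].value)
```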

Data sets can also be assembled from the Internet of Things, which involves sensors tied to other devices and networks. For example, sensor-equipped streetlights might collect traffic information that can then be analyzed to optimize traffic flow. The collection of data through sensors is a common element of smart city infrastructure.

Healthcare workers in Indonesia. The use of big data can improve health systems and inform public health policies. Photo credit: courtesy of USAID EMAS.

Big data can also be medical or scientific data, such as DNA information or data related to disease outbreaks. This can be useful to humanitarian and development organizations. For example, during the Ebola outbreak in West Africa between 2014 and 2016, UNICEF combined data from a number of sources, including population estimates, information on air travel, estimates of regional mobility from mobile phone records and tagged social media locations, temperature data, and case data from WHO reports to better understand the disease and predict future outbreaks.

Big data are created and used by a variety of actors. In data-driven societies, most actors (private sector, governments, and other organizations) are encouraged to collect and analyze data to notice patterns and trends, measure success or failure, optimize their processes for efficiency, etc. Not all actors create datasets themselves; often they collect publicly available data or even purchase data from specialized companies. For instance, in the advertising industry, data brokers specialize in collecting and processing information about internet users, which they then sell to advertisers. Other actors, like energy providers, railway companies, ride-sharing companies, and governments, create their own datasets. Data are everywhere, and the actors capable of collecting and analyzing them intelligently are numerous.


How is big data relevant in civic space and for democracy?

In Tanzania, an open-source platform allows government and financial institutions to record all land transactions to create a comprehensive dataset. Photo credit: Riaz Jahanpour for USAID / Digital Development Communications.

From forecasting presidential elections to helping small-scale farmers deal with changing climate to predicting disease outbreaks, analysts are finding ways to turn big data into an invaluable resource for planning and decision-making. Big data are capable of providing civil society with powerful insights and the ability to share vital information. Big data tools have been deployed recently in civic space in a number of interesting ways, for example, to:

  • monitor elections and support open government (starting in Kenya with Ushahidi in 2008)
  • track epidemics like Ebola in Sierra Leone and other West African nations
  • track conflict-related deaths worldwide
  • understand the impact of ID systems on refugees in Italy
  • measure and predict agricultural success and distribution in Latin America
  • press forward with new discoveries in genetics and cancer treatment
  • make use of geographic information systems (GIS mapping applications) in a range of contexts, including planning urban growth and traffic flow sustainably, as has been done by the World Bank in various countries in South Asia, East Asia, Africa, and the Caribbean

The use of big data that are collected, processed, and analyzed to improve health systems or environmental sustainability, for example, can ultimately greatly benefit individuals and society. However, a number of concerns and cautions have been raised about the use of big datasets. Privacy and security concerns are foremost: big data are often captured without our awareness and used in ways to which we may not have consented, and datasets are sometimes sold repeatedly through a chain of companies we never interacted with, exposing them to security risks such as data breaches. It is also crucial to remember that supposedly anonymous data can still be used to “re-identify” people represented in the dataset – one study achieved 85% accuracy using as little as postal code, gender, and date of birth – conceivably putting them at risk (see the discussion of “re-identification” below).

There are also power imbalances (divides) in who is represented in the data as opposed to who has the power to use them. Those who are able to extract value from big data are often large companies or other actors with the financial means and capacity to collect (sometimes purchase), analyze, and understand the data.

This means the individuals and groups whose information is put into datasets (shoppers whose credit card data are processed, internet users whose clicks are registered on a website) do not generally benefit from the data they have given. For example, data about what items shoppers buy in a store are more likely to be used to maximize profits than to help customers with their buying decisions. The extractive way that data are taken from individuals’ behaviors and used for profit has been called “surveillance capitalism,” which some believe is undermining personal autonomy and eroding democracy.

The quality of datasets must also be taken into consideration, as those using the data may not know how or where they were gathered, processed, or integrated with other data. And when storing and transmitting big data, security concerns are multiplied by the increased numbers of machines, services, and partners involved. It is also important to keep in mind that big datasets are not inherently useful; they become useful only in combination with the ability to analyze them and draw insights from them, using advanced algorithms, statistical models, etc.

Last but not least, there are important considerations related to protecting the fundamental rights of those whose information appears in datasets. Sensitive, personally identifiable, or potentially personally identifiable information can be used by other parties or for other purposes than those intended, to the detriment of the individuals involved. This is explored below and in the Risks section, as well as in other primers.

Protecting anonymity of those in the dataset

Anyone who has done research in the social or medical sciences should be familiar with the idea that when collecting data on human subjects, it is important to protect their identities so that they do not face negative consequences from being involved in research, such as being known to have a particular disease, to have voted a particular way, or to have engaged in stigmatized behavior (see the Data Protection resource). The traditional ways of protecting identities – removing certain identifying information, or only reporting statistics in aggregate – can and should also be used when handling big datasets to help protect those in the dataset. Data can also be hidden in multiple ways to protect privacy: methods include encryption (encoding), tokenization, and data masking. Talend identifies the strengths and weaknesses of the primary strategies for hiding data using these methods.
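
As a concrete illustration, the sketch below (plain Python over invented sample records) applies two of these strategies: replacing a direct identifier with a token, and reporting only aggregate counts.

```python
import hashlib

# Invented sample records; in practice these would come from a real dataset.
records = [
    {"name": "Amina", "postal_code": "10115", "diagnosis": "flu"},
    {"name": "Boris", "postal_code": "10115", "diagnosis": "flu"},
    {"name": "Chen",  "postal_code": "10117", "diagnosis": "asthma"},
]

def tokenize(value: str, salt: str = "keep-this-secret") -> str:
    """Replace a direct identifier with an irreversible token.
    A salted hash prevents simple dictionary lookups of common names."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:12]

masked = [
    {"person_token": tokenize(r["name"]), "diagnosis": r["diagnosis"]}
    for r in records
]

# Aggregate reporting: publish counts, never individual rows.
counts = {}
for r in records:
    counts[r["diagnosis"]] = counts.get(r["diagnosis"], 0) + 1

print(masked)
print(counts)  # {'flu': 2, 'asthma': 1}
```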

One of the biggest dangers involved in using big datasets is the possibility of re-identification: figuring out the real identities of individuals in the dataset, even if their personal information has been hidden or removed. To give a sense of how easy it can be, one study found that using only three fields of information – postal code, gender, and date of birth – it was possible to identify 87% of Americans individually, and then connect their identities to publicly available databases containing hospital records. With more data points, researchers have demonstrated a near-perfect ability to identify individuals in a dataset: four random pieces of information from credit card records were enough to achieve 90% identifiability, and researchers were able to re-identify individuals with 99.98% accuracy using 15 data points.
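
The linkage attack behind these findings is straightforward to express in code. Below is a minimal sketch with invented toy data: an “anonymized” medical table is joined to a public voter roll on the three quasi-identifier fields, re-attaching names to records.

```python
# An "anonymized" dataset: names removed, but quasi-identifiers kept.
medical = [
    {"zip": "02138", "gender": "F", "dob": "1954-07-31", "diagnosis": "hypertension"},
    {"zip": "02139", "gender": "M", "dob": "1980-02-02", "diagnosis": "asthma"},
]

# A public dataset (e.g., a voter roll) containing the same quasi-identifiers.
voters = [
    {"name": "Jane Doe", "zip": "02138", "gender": "F", "dob": "1954-07-31"},
    {"name": "John Roe", "zip": "02139", "gender": "M", "dob": "1980-02-02"},
]

# Linking on (zip, gender, dob) re-attaches names to medical records.
index = {(v["zip"], v["gender"], v["dob"]): v["name"] for v in voters}
for m in medical:
    key = (m["zip"], m["gender"], m["dob"])
    if key in index:
        print(f'{index[key]} -> {m["diagnosis"]}')
```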

Ten simple rules for responsible big data research, quoted from a paper of the same name by Zook, Barocas, Boyd, Crawford, Keller, Gangadharan, et al., 2017

  1. Acknowledge that data are people and that data can do harm. Most data represent or affect people. Simply starting with the assumption that all data are people until proven otherwise places the difficulty of disassociating data from specific individuals front and center.
  2. Recognize that privacy is more than a binary value. Privacy may be more or less important to individuals as they move through different contexts and situations. Looking at someone’s data in bulk may have different implications for their privacy than looking at one record. Privacy may be important to groups of people (say, by demographic) as well as to individuals.
  3. Guard against the reidentification of your data. Be aware that apparently harmless, unexpected data, like phone battery usage, could be used to re-identify data. Plan to ensure your data sharing and reporting lowers the risk that individuals could be identified.
  4. Practice ethical data sharing. There may be times when participants in your dataset expect you to share (such as with other medical researchers working on a cure), and others where they trust you not to share their data. Be aware that other identifying data about your participants may be gathered, sold, or shared about them elsewhere, and that combining that data with yours could identify participants individually. Be clear about how and when you will share data and stay responsible for protecting the privacy of the people whose data you collect.
  5. Consider the strengths and limitations of your data; big does not automatically mean better. Understand where your large dataset comes from, and how that may evolve over time. Don’t overstate your findings and acknowledge when they may be messy or have multiple meanings.
  6. Debate the tough, ethical choices. Talk with your colleagues about these ethical concerns. Follow the work of professional organizations to stay current with concerns.
  7. Develop a code of conduct for your organization, research community, or industry and engage your peers in creating it to ensure unexpected or under-represented perspectives are included.
  8. Design your data and systems for auditability. This both strengthens the quality of your research and services and can give early warnings about problematic uses of the data.
  9. Engage with the broader consequences of data and analysis practices. Keep social equality, the environmental impact of big data processing, and other society-wide impacts in view as you plan big data collection.
  10. Know when to break these rules. With debate, code of conduct, and auditability as your guide, consider that in a public health emergency or other disaster, you may find there are reasons to put the other rules aside.

Gaining informed consent

Those providing their data may not be aware at the time that their data may be sold later to data brokers who may then re-sell them.

Unfortunately, data privacy consent forms are generally hard for the average person to read, even in the wake of the General Data Protection Regulation’s (GDPR) expansion of privacy protections. Terms of Service (ToS) documents are so notoriously difficult to read that one filmmaker even made a documentary on the subject. Researchers who have studied terms of service and privacy policies have found that users generally accept them without reading them because they are too long and complex. Moreover, users who need to access a platform or service for personal reasons (for example, to get in contact with a relative) or for their livelihood (to deliver their products to customers) may not be able to simply reject the ToS when they have no viable or immediate alternative.

Important work is being done to try to protect users of platforms and services from these kinds of abusive data-sharing situations. For example, Carnegie Mellon’s Usable Privacy and Security laboratory (CUPS) has developed best practices to inform users about how their data may be used. These take the shape of data privacy “nutrition labels” that are similar to FDA-specified food nutrition labels and are evidence-based.

In Chipata, Zambia, a resident draws water from a well. Big data offer invaluable insights for the design of climate change solutions. Photo credit: Sandra Coburn.


Opportunities

Big data can have positive impacts when used to further democracy, human rights, and governance issues. Read below to learn how to more effectively and safely think about big data in your work.

Greater insight

Big datasets can present some of the richest, most comprehensive information that has ever been available in human history. Researchers using big datasets have access to information from a massive population. These insights can be much more useful and convenient than self-reported data or data gathered from logistically tricky observational studies. One major trade-off is between the richness of the insights gained from self-reported or carefully collected data and the generalizability of the insights drawn from big data. Big data gathered from social-media activity or sensors can also allow for real-time measurement of activity at a large scale. Big data insights are very important in the field of logistics. For example, the United States Postal Service collects data across its package deliveries using GPS and vast networks of sensors and other tracking methods, and then processes these data with specialized algorithms. These insights allow it to optimize deliveries for environmental sustainability.

Increased access to data

Making big datasets publicly available can begin to close divides in access to data. Apart from some public datasets, big data often end up as the property of corporations, universities, and other large organizations. Even though the data produced are about individual people and their communities, those individuals and communities may not have the money or technical skills needed to access those data and make productive use of them. This creates the risk of worsening existing digital divides.

Publicly available data have helped communities understand and act on government corruption, municipal issues, human-rights abuses, and health crises, among other things. Again, though, when data are made public, it is particularly important to ensure strong privacy protections for those whose data are in the dataset. The work of the Our Data Bodies project provides additional guidance for how to engage with communities whose data are in the datasets. Their workshop materials can support community understanding and engagement in making ethical decisions around data collection and processing, and in monitoring and auditing data practices.


Risks

The use of emerging technologies to collect data can also create risks in civil society programming. Read below on how to discern the possible dangers associated with big data collection and use in DRG work, as well as how to mitigate unintended – and intended – consequences.

Surveillance

With the potential for re-identification as well as the nature and aims of some uses of big data, there is a risk that individuals included in a dataset will be subjected to surveillance by governments, law enforcement, or corporations. This may put the fundamental rights and safety of those in the dataset at risk.

The Chinese government is routinely criticized for its invasive surveillance of Chinese citizens through the gathering and processing of big data. More specifically, it has been criticized for its system of social ranking of citizens based on their social media, purchasing, and education data, as well as for gathering the DNA of members of the Uighur minority (with the assistance of a US company, it should be noted). China is certainly not the only government to abuse citizen data in this way. Edward Snowden’s revelations about the US National Security Agency’s gathering and use of social media and other data were among the first public warnings about the surveillance potential of big data. Concerns have also been raised about partnerships involved in the development of India’s Aadhaar biometric ID system, a technology whose producers are eager to sell it to other countries. In the United States, privacy advocates have raised concerns about companies and governments gathering data at scale about students through their school-provided devices, a concern that should also be raised in any international context when laptops or mobiles are provided for students.

It must be emphasized that surveillance concerns are not limited to the institutions originally gathering the data, whether governments or corporations. When data are sold or combined with other datasets, it is possible that other actors, from email scammers to abusive domestic partners, could access the data and track, exploit, or otherwise harm people appearing in the dataset.

Data security concerns

Because big data are collected, cleaned, and combined through long, complex pipelines of software and storage, they present significant challenges for security. These challenges are multiplied whenever the data are shared among many organizations. Any stream of data arriving in real time (for example, information about people checking into a hospital) needs specific protection from tampering, disruption, and surveillance. Given that the data may present significant risks to the privacy and safety of those included in the datasets and may be very valuable to criminals, it is important to ensure sufficient resources are provided for security.

Existing security tools for websites are not enough to cover the entire big data pipeline. Major investments in staff and infrastructure are needed to provide proper security coverage and respond to data breaches. And unfortunately, within the industry, there are known shortages of big data specialists, particularly security personnel familiar with the unique challenges big data presents. Internet of Things sensors present a particular risk if they are part of the data-gathering pipeline; these devices are notorious for having poor security. For example, a malicious actor could easily introduce fake sensors into the network or fill the collection pipeline with garbage data in order to render your data collection useless.
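
One partial mitigation for the fake-sensor problem is to authenticate and sanity-check readings before they enter the pipeline. The sketch below is a minimal illustration, with invented device IDs and keys, using an HMAC so that messages from unknown or tampered-with sensors are rejected.

```python
import hashlib
import hmac

# Hypothetical registry of shared secrets for known, enrolled sensors.
DEVICE_KEYS = {"streetlight-17": b"per-device-secret"}

def sign(device_id: str, payload: str) -> str:
    """Signature a legitimate sensor attaches to each reading."""
    return hmac.new(DEVICE_KEYS[device_id], payload.encode(), hashlib.sha256).hexdigest()

def accept_reading(device_id: str, payload: str, signature: str) -> bool:
    """Reject readings from unknown devices or with invalid signatures,
    so fake sensors cannot inject garbage into the pipeline."""
    key = DEVICE_KEYS.get(device_id)
    if key is None:
        return False
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

good = sign("streetlight-17", "cars=42")
print(accept_reading("streetlight-17", "cars=42", good))     # True
print(accept_reading("streetlight-17", "cars=9999", good))   # False: tampered payload
print(accept_reading("rogue-device", "cars=1", "deadbeef"))  # False: unknown device
```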

Exaggerated expectations of accuracy and objectivity

Big data companies and their promoters often claim that big data can be more objective or accurate than traditionally gathered data, supposedly because human judgment does not come into play and because data gathered at such scale are richer. This picture downplays the fact that algorithms and computer code also bring human judgment to bear on data, including biases, and that relevant data may be accidentally excluded. Human interpretation is also always necessary to make sense of patterns in big data; so again, claims of objectivity should be taken with healthy skepticism.

It is important to ask questions about data-gathering methods, the algorithms involved in processing, and the assumptions or inferences made by the data gatherers/programmers and their analyses, to avoid falling into the trap of assuming big data are “better.” For example, while data about the proximity of two cell phones tell you that two people were near each other, only human interpretation can tell you why those two people were near each other, and how an analyst interprets that closeness may differ from what the people carrying the cell phones would tell you. This is a major challenge in using phones for “contact tracing” in epidemiology. During the COVID-19 health crisis, many countries raced to build contact-tracing cellphone apps. The precise purposes and functioning of these apps vary widely (as has their effectiveness), but it is worth noting that major tech companies have preferred to call them “exposure-risk notification” apps rather than contact-tracing apps: the apps can only tell you whether you have been in proximity to someone with the coronavirus, not whether you have contracted the virus.
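
A toy sketch of the proximity logic such apps rest on is below. The thresholds and data are invented; real systems, such as the Google/Apple exposure notification framework, use rotating Bluetooth identifiers and calibrated signal attenuation rather than plain distance estimates.

```python
from dataclasses import dataclass

@dataclass
class Encounter:
    other_device: str
    minutes: float
    est_distance_m: float  # estimated from Bluetooth signal strength

def exposure_risk(encounters, max_distance_m=2.0, min_minutes=15.0):
    """Flag encounters that were both close and sustained.
    Note what this does NOT tell you: whether a wall separated the
    devices, or whether transmission actually occurred."""
    return [e for e in encounters
            if e.est_distance_m <= max_distance_m and e.minutes >= min_minutes]

log = [Encounter("device-a", 45.0, 1.2),   # long and close: flagged
       Encounter("device-b", 3.0, 0.8),    # close but brief: not flagged
       Encounter("device-c", 60.0, 8.0)]   # long but distant: not flagged
print(exposure_risk(log))
```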

Misinterpretation

As with all data, there are pitfalls when it comes to interpreting and drawing conclusions. Because big data are often captured and analyzed in real time, they may be particularly weak at providing historical context for the patterns they highlight. Anyone analyzing big data should also consider what the source or sources were, whether the data were combined with other datasets, and how they were cleaned. Cleaning refers to the process of correcting or removing inaccurate or extraneous data. This is particularly important with social-media data, which can contain lots of “noise” (extra information) and are therefore almost always cleaned.
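
A minimal sketch of what cleaning can look like in practice, over invented sample posts (real pipelines are far more involved, handling language detection, bot filtering, near-duplicates, and more):

```python
import re

posts = [
    "Flooding reported on Main St!!! http://spam.example #flood",
    "WIN A FREE PHONE >>> click here <<<",                         # promotional noise
    "Flooding reported on Main St!!! http://spam.example #flood",  # exact duplicate
    "Water levels rising near the bridge",
]

def clean(raw_posts):
    seen, kept = set(), []
    for p in raw_posts:
        p = re.sub(r"http\S+", "", p)            # strip links
        p = re.sub(r"[!<>]{2,}", "", p).strip()  # strip shouting/decoration
        if "free phone" in p.lower():            # crude spam rule
            continue
        if p in seen:                            # drop exact duplicates
            continue
        seen.add(p)
        kept.append(p)
    return kept

print(clean(posts))  # two substantive reports survive
```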


Questions

If you are trying to understand the implications of big data in your work environment, or are considering using aspects of big data as part of your DRG programming, ask yourself these questions:

  1. Is gathering big data the right approach for the question you’re trying to answer? How would your question be answered differently using interviews, historical research, or a focus on statistical significance?
  2. Do you already have these data, or are they publicly available? Is it really necessary to acquire these data yourself?
  3. What is your plan to make it impossible to identify individuals through their data in your dataset? If the data come from someone else, what kind of anonymization has already been performed?
  4. How could individuals be made more identifiable by someone else when you publish your data and findings? What steps can you take to lower the risk they will be identified?
  5. What is your plan for getting consent from those whose data you are collecting? How will you make sure your consent document is easy for them to understand?
  6. If your data come from another organization, how did they seek consent? Did that consent include consent for other organizations to use the data?
  7. If you are getting data from another organization, what is the original source of these data? Who collected them, and what were they trying to accomplish?
  8. What do you know about the quality of these data? Is someone inspecting them for errors, and if so, how? Did the collection tools fail at any point, or do you suspect that there might be some inaccuracies or mistakes?
  9. Have these data been integrated with other datasets? If data were used to fill in gaps, how was that accomplished?
  10. What is the end-to-end security plan for the data you are capturing or using? Are there third parties involved whose security propositions you need to understand?


Case Studies

Village resident in Tanzania. Big data analytics can pinpoint strategies that work for small-scale farmers. Photo credit: Riaz Jahanpour for USAID / Digital Development Communications.
Big Data for climate-smart agriculture

“Scientists at the International Center for Tropical Agriculture (CIAT) have applied Big Data tools to pinpoint strategies that work for small-scale farmers in a changing climate…. Researchers have applied Big Data analytics to agricultural and weather records in Colombia, revealing how climate variation impacts rice yields. These analyses identify the most productive rice varieties and planting times for specific sites and seasonal forecasts. The recommendations could potentially boost yields by 1 to 3 tons per hectare. The tools work wherever data is available, and are now being scaled out through Colombia, Argentina, Nicaragua, Peru and Uruguay.”

School-Issued Devices and Student Privacy, particularly the Best Practices for Ed Tech Companies section.

“Students are using technology in the classroom at an unprecedented rate…. Student laptops and educational services are often available for a steeply reduced price and are sometimes even free. However, they come with real costs and unresolved ethical questions. Throughout EFF’s investigation over the past two years, [they] have found that educational technology services often collect far more information on kids than is necessary and store this information indefinitely. This privacy-implicating information goes beyond personally identifying information (PII) like name and date of birth, and can include browsing history, search terms, location data, contact lists, and behavioral information…All of this often happens without the awareness or consent of students and their families.”

Big Data and Thriving Cities: Innovations in Analytics to Build Sustainable, Resilient, Equitable and Livable Urban Spaces.

This paper includes case studies of big data used to track changes in urbanization, traffic congestion, and crime in cities. “[I]nnovative applications of geospatial and sensing technologies and the penetration of mobile phone technology are providing unprecedented data collection. This data can be analyzed for many purposes, including tracking population and mobility, private sector investment, and transparency in federal and local government.”

Battling Ebola in Sierra Leone: Data Sharing to Improve Crisis Response.

“Data and information have important roles to play in the battle not just against Ebola, but more generally against a variety of natural and man-made crises. However, in order to maximize that potential, it is essential to foster the supply side of open data initiatives – i.e., to ensure the availability of sufficient, high-quality information. This can be especially challenging when there is no clear policy backing to push actors into compliance and to set clear standards for data quality and format. Particularly during a crisis, the early stages of open data efforts can be chaotic, and at times redundant. Improving coordination between multiple actors working toward similar ends – though difficult during a time of crisis – could help reduce redundancy and lead to efforts that are greater than the sum of their parts.”

Tracking Conflict-Related Deaths: A Preliminary Overview of Monitoring Systems.

“In the framework of the United Nations 2030 Agenda for Sustainable Development, states have pledged to track the number of people who are killed in armed conflict and to disaggregate the data by sex, age, and cause—as per Sustainable Development Goal (SDG) Indicator 16. However, there is no international consensus on definitions, methods, or standards to be used in generating the data. Moreover, monitoring systems run by international organizations and civil society differ in terms of their thematic coverage, geographical focus, and level of disaggregation.”

Balancing data utility and confidentiality in the US census.

Describes how the Census is using differential privacy to protect the data of respondents. “As the Census Bureau prepares to enumerate the population of the United States in 2020, the bureau’s leadership has announced that they will make significant changes to the statistical tables the bureau intends to publish. Because of advances in computer science and the widespread availability of commercial data, the techniques that the bureau has historically used to protect the confidentiality of individual data points can no longer withstand new approaches for reconstructing and reidentifying confidential data. … [R]esearch at the Census Bureau has shown that it is now possible to reconstruct information about and reidentify a sizeable number of people from publicly available statistical tables. The old data privacy protections simply don’t work anymore. As such, Census Bureau leadership has accepted that they cannot continue with their current approach and wait until 2030 to make changes; they have decided to invest in a new approach to guaranteeing privacy that will significantly transform how the Census Bureau produces statistics.”
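
To illustrate the core idea of differential privacy, here is a minimal sketch that adds calibrated Laplace noise to a published count so that the presence or absence of any single person is masked. The parameters are toy values; the Census Bureau’s production system (the TopDown Algorithm) is far more sophisticated.

```python
import random

def dp_count(true_count: int, epsilon: float = 0.5) -> float:
    """Publish a count with Laplace noise of scale 1/epsilon.
    Adding or removing any one person changes a count by at most 1
    (sensitivity 1), so this noise masks any individual's presence."""
    scale = 1.0 / epsilon
    # A Laplace sample is the difference of two exponential samples.
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return true_count + noise

random.seed(7)  # seeded only to make this demo reproducible
print(dp_count(1000))  # a noisy count close to, but not exactly, 1000
```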


Social Media

What is social media?

Social media provides spaces for people and organizations to share and access news and information, communicate with beneficiaries, and advocate for change. Social media content includes text, photos, videos, infographics, or any other material placed on a blog, Facebook page, X (formerly known as Twitter) account, etc. for an audience to consume, interact with, and circulate. This content is curated by platforms and delivered to users according to what is most likely to attract their attention. There is an ever-expanding amount of content available on these platforms.

Digital inclusion center in the Peruvian Amazon. For NGOs, social media platforms can be useful to reach new audiences and to raise awareness of services. Photo credit: Jack Gordon for USAID / Digital Development Communications.

Theoretically, through social media everyone has a way to speak out and reach audiences across the world, which can be empowering and bring people together. At the same time, much of what is shared on social media can be misleading, hateful, and dangerous, which in theory imposes a level of responsibility on the owners of platforms to moderate content.

How does social media work?

Social media platforms are owned by private companies, with business models usually based on advertising and monetization of users’ data. This affects the way that content appears to users, and influences data-sharing practices. Moderating content on these social media spaces brings its own challenges and complications because it requires balancing multiple fundamental freedoms. Understanding the content moderation practices and business models of the platforms is essential to reap the benefits while mitigating the risks of using social media.

Business Models

Most social media platforms rely on advertising. Advertisers pay for engagement, such as clicks, likes, and shares, so sensational and attention-grabbing content is more valuable. This motivates platforms to use automated recommendation technology that relies on algorithmic decision-making to prioritize content likely to grab attention. The main strategy, “user-targeted amplification,” shows users the content most likely to interest them based on the detailed data collected about them. See more in the Risks section under Data Monetization by social media companies and tailored information streams.
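
A toy sketch of engagement-driven ranking is below: invented posts and weights, not any platform’s actual algorithm, but enough to show why attention-grabbing content floats to the top of a feed.

```python
# Hypothetical feed items with predicted engagement signals.
posts = [
    {"title": "City council budget report",     "p_click": 0.02, "p_share": 0.01},
    {"title": "You won't BELIEVE this rumor",   "p_click": 0.30, "p_share": 0.12},
    {"title": "Local vaccination clinic hours", "p_click": 0.05, "p_share": 0.02},
]

def engagement_score(post, w_click=1.0, w_share=3.0):
    """Rank by predicted engagement; shares weighted higher because
    they drive further distribution (and further ad impressions)."""
    return w_click * post["p_click"] + w_share * post["p_share"]

feed = sorted(posts, key=engagement_score, reverse=True)
for p in feed:
    print(round(engagement_score(p), 3), p["title"])
# The sensational item outranks civic information purely on engagement.
```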

The Emergence of Programmatic Advertising

The transition of advertising to digital systems has dramatically altered the advertising business. In an analog world, advertising placements were predicated on aggregate demographics, collected by publishers and measurement firms. These measurements were rough, capable at best of tracking subscribers and household-level engagement. Advertisers hoped their ads would be seen by enough of their target demographic (for example, men between 18 and 35 with income at a certain level) to be worth their while. Even more challenging was tracking the efficacy of the ads. Systems for measuring whether an ad resulted in a sale were limited largely to mail-in cards and special discount codes.

The emergence of digital systems changed all of that. Pioneered for the most part by Google and then supercharged by Facebook in the early 21st century, a new promise emerged: “Place ads through our platform, and we can put the right ad in front of the right person at the right time. Not only that, but we can report back to you (the advertiser) which users saw the ad, whether they clicked on it, and if that click led to a ‘conversion’ or a sale.”

But this promise has come with significant unintended consequences. The way that the platforms—and the massive ad tech industry that has rapidly emerged alongside them—deliver on this promise requires a level of data gathering, tracking, and individual surveillance unprecedented in human history. The tracking of individual behaviors, preferences, and habits powers the wildly profitable digital advertising industry, dominated by platforms that can control these data at scale.

Managing huge consumer data sets at the scale and speed required to deliver value to advertisers has come to mean a heavy dependence on algorithms to do the searching, sorting, tracking, placement, and delivery of ads. This development of sophisticated algorithms led to the emergence of programmatic advertising, which is the placement of ads in real time on websites with no human intervention. Programmatic advertising made up roughly two thirds of the $237 billion global ad market in 2019.
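
To make the mechanics of programmatic advertising concrete, here is a simplified sketch of a real-time auction over one ad impression. The bidders, prices, and targeting signals are invented; real exchanges resolve such auctions, increasingly as first-price rather than second-price auctions, in the time it takes a page to load.

```python
# Each bidder values the impression based on what it knows about the user.
bids = {
    "shoe_retailer": {"bid": 2.10, "matched_on": "recent shoe searches"},
    "travel_site":   {"bid": 1.40, "matched_on": "airport geolocation"},
    "generic_brand": {"bid": 0.30, "matched_on": "broad demographic"},
}

def run_auction(bids):
    """Second-price auction: the highest bidder wins but pays the
    runner-up's bid, a common design in early ad exchanges."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1]["bid"], reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    return winner[0], runner_up[1]["bid"], winner[1]["matched_on"]

winner, price, reason = run_auction(bids)
print(f"{winner} wins, pays {price:.2f}, targeted via: {reason}")
```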

The digitization of the advertising market, particularly the dominance of programmatic advertising, has resulted in a highly uneven playing field. The technology companies possess a significant advantage: they built the new structures and set the terms of engagement. What began as a value-add in the new digital space—“We will give advertisers efficiency and publishers new audiences and revenue streams”—has evolved to disadvantage both groups.

One of the primary challenges is in how audience engagement is measured and tracked. The primary performance indicators in the digital world are views and clicks. As mentioned above, an incentive structure based on views and clicks (engagement) tends to favor sensational and eye-catching content. In the race for engagement, misleading or false content with dramatic headlines and incendiary claims consistently wins out over more balanced news and information. See also the section on digital advertising in the disinformation resource.

Advertising-motivated content

Platforms leverage tools like hashtags and search engine optimization (SEO) to rank and cluster content around certain topics. Unfortunately, automated content curation motivated by advertising does not tend to prioritize healthy, educational, or rigorous content. Instead, conspiracy theories, shocking or violent content, and “click-bait” (misleading phrases designed to entice viewing) tend to spread more widely. Many platforms have upvoting features (“like” buttons) which, similar to hashtags and SEO, influence the algorithmic moderation and promote certain content to circulate more widely. Together these features cause “virality,” one of the defining features of the social-media ecosystem: the tendency of an image, video, or piece of information to be circulated rapidly and widely.

In some cases, virality can spark political activism and raise awareness (as with the #MeToo movement), but it can also amplify tragedies and spread inaccurate information (anti-vaccine information and other health rumors, etc.). Additionally, the business models of the platforms reward quantity over quality (numbers of “likes,” “followers,” and views), encouraging a growth logic that has led to the problem of information saturation or information overload, overwhelming users with seemingly infinite content. Indeed, design decisions like the “infinite scroll,” intended to make our social media spaces ever larger and more entertaining, have been associated with impulsive behaviors, increased distraction, attention-seeking behavior, lower self-esteem, etc.

Many digital advertising strategies raise risks regarding access to information, privacy, and discrimination, in part because of their pervasiveness and subtlety. Influencer marketing, for example, is the practice of sponsoring a social media influencer to promote or use a certain product by working it into their social-media content, while native advertising is the practice of embedding ads in or beside other non-paid content. Most consumers do not know what native advertising is and may not even know when they are being delivered ads.

It is not new for brands to strategically place their content. However, today there is much more advertising, and it is seamlessly integrated with other content. In addition, the design of platforms makes content from diverse sources—advertisers and news agencies, experts and amateurs—indistinguishable. Individuals’ right to information and basic guarantees of transparency are at stake if advertisements are placed on equal footing with desired content.

Content Moderation

Content moderation is at the heart of the services that social-media platforms provide: the hosting and curation of the content uploaded by their users. Content moderation is not just the review of content, but every design decision made by the platforms, from the Terms of Service and their Community Guidelines, to the algorithms used to rank and order content, to the types of content allowed and encouraged through design features (“like”, “follow”, “block”, “restrict”, etc.).

Content moderation is particularly challenging because of the issues it raises around freedom of expression. While it is necessary to address massive quantities of harmful content that circulate widely, educational, historic, or journalistic content is often censored by algorithmic moderation systems. In 2016, for example, Facebook took down a post with a Pulitzer Prize-winning image of a naked 9-year-old girl fleeing a napalm bombing and suspended the account of the journalist who had posted it.

Though nations differ in their stances on freedom of speech, international human rights provide a framework for how to balance freedom of expression against other rights, and against protections for vulnerable groups. Still, content-moderation challenges increase as content itself evolves, for instance through increase of live streaming, ephemeral content, voice assistants, etc. Moderating internet memes is particularly challenging, for instance, because of their ambiguity and ever-changing nature; and yet meme culture is a central tool used by the far right to share ideology and glorify violence. Some information manipulation is also intentionally difficult to detect; for example, “dog whistling” (sending coded messages to subgroups of the population) and “gaslighting” (psychological manipulation to make people doubt their own knowledge or judgment).

Automated moderation

Content moderation is usually performed by a mix of humans and artificial intelligence, with the precise mix dependent on the platform and the category of content. The largest platforms like Facebook and YouTube use automated tools to filter content as it is uploaded. Facebook, for example, claims it is able to detect up to 80% of hate speech content in some languages as it is posted, before it reaches the level of human review. Though the working conditions of the human moderators have been heavily criticized, algorithms are not a perfect alternative. Their accuracy and transparency have been disputed, and experts have warned of concerning biases stemming from algorithmic content moderation.

The complexity of content-moderation decisions does not lend itself easily to automation, and the blurred line between legal and illegal, or permissible and impermissible, content leads both to legitimate content being censored and to harmful or illegal content (cyberbullying, defamation, etc.) passing through the filters.
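
A deliberately crude sketch of automated filtering makes these failure modes visible: a keyword blocklist with invented words, weights, and threshold. Real systems use trained classifiers, but they exhibit versions of the same errors shown in the comments below.

```python
# Toy moderation filter: blocklist plus a score threshold (all invented).
BLOCKLIST = {"attack": 0.6, "kill": 0.8}

def moderate(text: str, threshold: float = 0.7) -> str:
    """Sum the weights of blocklisted words; remove if over threshold."""
    words = text.lower().split()
    score = sum(BLOCKLIST.get(w, 0.0) for w in words)
    return "remove" if score >= threshold else "keep"

print(moderate("we will kill the opposition"))     # remove: likely intended
print(moderate("new drug may kill cancer cells"))  # remove: a false positive
print(moderate("meet at dawn, you know the plan")) # keep: coded speech slips through
```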

The moderation of content posted to social media was increasingly important during the COVID-19 pandemic, when access to misleading and inaccurate information about the virus had the potential to result in severe illness or bodily harm. One characterization of Facebook described “a platform that is effectively at war with itself: the News Feed algorithm relentlessly promotes irresistible click-bait about Bill Gates, vaccines, and hydroxychloroquine; the trust and safety team then dutifully counters it with bolded, underlined doses of reality.”

Community moderation

Some social media platforms have come to rely on their users for content moderation. Reddit was one of the first social networks to popularize community-led moderation and allows subreddits to tack additional rules onto the company’s master content policy. These rules are then enforced by human moderators and, in some cases, automated bots. While the decentralization of moderation gives user communities more autonomy and decision-making power over their conversations, it also relies inherently on unpaid labor and exposes untrained volunteers to potentially problematic content.

Another approach to community-led moderation is X’s Community Notes, essentially a crowd-sourced fact-checking system. The feature allows users who are members of the program to add context to posts (formerly called tweets) that may contain false or misleading information; other users then rate whether they find the added context helpful.
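
A toy sketch of crowd-sourced note rating is below, using a simple vote threshold. This is illustrative only: X’s actual system uses a “bridging” algorithm that requires agreement among raters who usually disagree with one another, which this sketch does not capture.

```python
# Ratings on a proposed note: (rater_id, found_helpful) -- invented data.
ratings = [("u1", True), ("u2", True), ("u3", False),
           ("u4", True), ("u5", True)]

def note_status(ratings, min_ratings=5, helpful_share=0.75):
    """Show the note publicly only once enough raters, by a clear
    margin, judged the added context helpful."""
    if len(ratings) < min_ratings:
        return "needs more ratings"
    share = sum(1 for _, helpful in ratings if helpful) / len(ratings)
    return "shown" if share >= helpful_share else "not shown"

print(note_status(ratings))  # 4/5 = 0.8 -> "shown"
```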

Addressing harmful content

In some countries, local laws may address content moderation, but they relate mainly to child abuse images or illegal content that incites violence. Most platforms also have community standards or safety and security policies that state what kind of content is allowed and set the rules for harmful content. Enforcement of legal requirements and of the platforms’ own standards relies primarily on content being flagged by social media users: platforms are only responsible for harmful content shared on their services once it has been reported to them.

Some platforms have established mechanisms that allow civil society organizations (CSOs) to contribute to the reporting process by becoming so-called “trusted flaggers.” Facebook’s Trusted Partner program, for example, provides partners with a dedicated escalation channel for reporting content that violates the company’s Community Standards. However, even with programs like this in place, limited access to platforms to raise local challenges and trends remains an obstacle for CSOs, marginalized groups, and other communities, especially in the Global South.

Regulation

The question of how to regulate and enforce the policies of social media platforms remains far from settled. As of this writing, there are several common approaches to social-media regulation.

Self-regulation

The standard model of social-media regulation has long been self-regulation, with platforms establishing and enforcing their own standards for safety and equity. Incentives for self-regulation include avoiding the imposition of more restrictive government regulation and building consumer trust to broaden a platform’s user base (ultimately boosting profits). On the other hand, there are obvious limits to self-regulation when these incentives are outweighed by perceived costs. Self-regulation can also be contingent on the ownership of a company, as demonstrated by the reversal of numerous policy decisions in the name of “free speech” by Elon Musk after his takeover of X (known as Twitter at the time).

In 2020, the Facebook Oversight Board was established as an accountability mechanism for users to appeal decisions by Facebook to remove content that violates its policies against harmful and hateful posts. While the Oversight Board’s content decisions on individual cases are binding, its broader policy recommendations are not. For example, Meta was required to remove a video posted by Cambodian Prime Minister Hun Sen that threatened his opponents with physical violence, but it declined to comply with the Board’s recommendation to suspend the Prime Minister’s account entirely. Though the Oversight Board’s mandate and model are promising, there have been concerns about its capacity to respond to the volume of requests it receives in a timely manner.

Government Regulation

In recent years, individual governments and regional blocs have introduced legislation to hold social media companies accountable for the harmful content that spreads on their platforms, as well as to protect the privacy of citizens given the massive amounts of data these companies collect. Perhaps the most prominent and far-reaching example of this kind of legislation is the European Union’s Digital Services Act (DSA), which came into effect for “Very Large Online Platforms” such as Facebook and Instagram (Meta), TikTok, YouTube (Google), and X in late August of 2023. Under the rules of the DSA, online platforms risk significant fines if they fail to prevent and remove posts containing illegal content. The DSA also bans targeted advertising based on a person’s sexual orientation, religion, ethnicity, or political beliefs and requires platforms to provide more transparency on how their algorithms work.

With government regulation comes the risk of over-regulation via “fake news” laws and threats to free speech and online safety. In 2023, for example, security researchers warned that the draft legislation of the U.K.’s Online Safety Bill would compromise the security provided to users of end-to-end encrypted communications services, such as WhatsApp and Signal. Proposed Brazilian legislation to increase transparency and accountability for online platforms was also widely criticized—and received strong backlash from the platforms themselves—as negotiations took place behind closed doors without proper engagement with civil society and other sectors.


How is social media relevant in civic space and for democracy?

Social media encourages and facilitates the spread of information at unprecedented speeds, distances, and volumes. As a result, information in the public sphere is no longer controlled by journalistic “gatekeepers.” Rather, social media provide platforms for groups excluded from traditional media to connect and be heard. Citizen journalism has flourished on social media, enabling users from around the world to supplement mainstream media narratives with on-the-ground local perspectives that previously may have been overlooked or misrepresented. Read more about citizen journalism under the Opportunities section of this resource.

Social media can also serve as a resource for citizens and first responders during emergencies, humanitarian crises, and natural disasters, as described in more detail in the Opportunities section. In the aftermath of the deadly earthquake that struck Turkey and Syria in February 2023, for example, people trapped under the rubble turned to social media to alert rescue crews to their location. Social media platforms have also been used during this and other crises to mobilize volunteers and crowdsource donations for food and medical aid.

Digital inclusion center in the Peruvian Amazon. The business models and content moderation practices of social media platforms directly affect the content displayed to users. Photo Credit: Chandy Mao, Development Innovations.

However, like any technology, social media can be used in ways that negatively affect free expression, democratic debate, and civic participation. Profit-driven companies like X have in the past complied with content takedown requests from individual governments, prompting censorship concerns. When private companies control the flow of information, censorship can occur not only through such direct mechanisms, but also through the determination of which content is deemed most credible or worthy of public attention.

The effects of harassment, hate speech, and “trolling” on social media can spill over into offline spaces, presenting a unique danger for women, journalists, political candidates, and marginalized groups. According to UNESCO, 20% of respondents to a 2020 survey on online violence against women journalists reported being attacked offline in connection with online violence. Read more about online violence and targeted digital attacks in the Risks section of this resource, as well as in the resource on the Digital Gender Divide.

Social media platforms have only become more prevalent in our daily lives (with the average internet user spending nearly 2.5 hours per day on social media), and those not active on the platforms risk missing important public announcements, information about community events, and opportunities to communicate with family and friends. Design features like the “infinite scroll,” which allows users to endlessly swipe through content without clicking, are intentionally addictive—and associated with impulsive behavior and lower self-esteem. The oversaturation of content in curated news feeds makes it ever more difficult for users to distinguish factual, unbiased information from the onslaught of clickbait and sensational narratives. Read about the intentional sharing of misleading or false information to deceive or cause harm in our Disinformation resource.

Social media and elections

Social media platforms have become increasingly important to the engagement of citizens, candidates, and political parties during elections, referendums, and other political events. On the one hand, lesser-known candidates can leverage social media to reach a broader audience by conducting direct outreach and sharing information about their campaigns, while citizens can use social media to communicate with candidates about immediate concerns in their local communities. On the other hand, disinformation circulating on social media can amplify voter confusion, reduce turnout, deepen social cleavages, suppress the political participation of women and marginalized populations, and degrade overall trust in democratic institutions.

Social media companies like Google, Meta, and X do have a track record of adjusting their policies and investing in new products ahead of global elections. They also collaborate directly with electoral authorities and independent fact-checkers to mitigate disinformation and other online harms. However, these efforts often fall short. As one example, despite Facebook’s self-proclaimed efforts to safeguard election integrity, Global Witness found that the platform failed to detect election-related disinformation in ads ahead of the 2022 Brazilian presidential election (a similar pattern was also uncovered in Myanmar, Ethiopia, and Kenya). Facebook and other social media platforms were strongly criticized for their inaction in the lead-up to and during the subsequent riots instigated by far-right supporters of former president Jair Bolsonaro. In fragile democracies, the institutions that could help counter the impact of fake news and disinformation disseminated on social media—such as independent media, agile political parties, and sophisticated civil society organizations—remain nascent.

Meanwhile, online political advertising has introduced new challenges to election transparency and accountability as the undeclared sponsoring of content has become easier through unofficial pages paid for by official campaigns. Social media companies have made efforts to increase the transparency of political ads by making “ad libraries” available in some countries and introducing new requirements for the purchase and identification of political ads. But these efforts have varied by country, with most attention directed to larger or more influential markets.

Social media monitoring can help civil society researchers better understand their local information environment, including common disinformation narratives during election cycles. The National Democratic Institute, for example, used Facebook’s social monitoring platform CrowdTangle to track the online political environment in Moldova following Maia Sandu’s victory in the November 2020 presidential elections. However, social media platforms have made this work more challenging by introducing exorbitant fees to access data or ceasing support for user interfaces that make analysis easier for non-technical users.


Opportunities

Students from the Kandal Province, Cambodia. Social media platforms have opened up new platforms for video storytelling. Photo credit: Chandy Mao, Development Innovations.

Social media can have positive impacts when used to further democracy, human rights, and governance issues. Read below to learn how to more effectively and safely think about social media use in your work.

Citizen Journalism

Social media has been credited with providing channels for citizens, activists, and experts to report instantly and directly—from disaster settings, during protests, from within local communities, etc. Citizen journalism, also referred to as participatory journalism or guerrilla journalism, does not have a definite set of principles and is an important supplement to (but not a replacement for) mainstream journalism. Collaborative journalism, the partnership between citizen and professional journalists, as well as crowdsourcing strategies, are additional techniques facilitated by social media that have enhanced journalism, helping to promote voices from the ground and to magnify diverse voices and viewpoints. The outlet France 24 has developed a network of 5,000 contributors, the “observateurs,” who are able to cover important events directly by virtue of being on scene at the time, as well as to confirm the accuracy of information.

Social media and blogging platforms have allowed for the decentralization of expertise, bridging elite and non-elite forms of knowledge. Without proper fact-checking or supplementary sources and proper context, citizen reporting carries risks—including security risks to the authors themselves—but it is an important democratizing force and source of information.

Crowdsourcing

In crowdsourcing, the public is mobilized to share data that together tell a larger story or accomplish a greater goal. Crowdsourcing can be a method for financing, for journalism and reporting, or simply for gathering ideas. Usually some kind of software tool or platform is put in place that the public can easily access and contribute to. Crisis mapping, for example, is a type of crowdsourcing through which the public shares data in real time during a crisis (a natural disaster, an election, a protest, etc.). These data are then ordered and displayed in a useful way; for instance, crisis mapping can be used in the wake of an earthquake to show first responders the areas that have been hit and need immediate assistance. Ushahidi is an open-source crisis-mapping software developed in Kenya after the outbreak of violence following the 2007 election. The tool was first created to allow Kenyans to flag incidents, form a complete and accurate picture of the situation on the ground, and share information with the media, outside governments, and relevant civil society and relief organizations. In Kenya, the tool gathered texts, posts, and photos and created crowdsourced maps of incidents of violence, election fraud, and other abuse. Ushahidi now has a global team with deployments in more than 160 countries and more than 40 languages.
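
A minimal sketch of the core crisis-mapping step is below: clustering incoming geotagged reports into grid cells so responders can see where incidents concentrate. The reports are invented, and Ushahidi itself adds much more (verification workflows, categories, and a web front end).

```python
from collections import Counter

# Crowdsourced reports: (latitude, longitude, category) -- invented data.
reports = [
    (-1.2921, 36.8219, "violence"),
    (-1.2935, 36.8220, "violence"),
    (-1.2920, 36.8225, "fraud"),
    (-0.0917, 34.7680, "violence"),
]

def grid_cell(lat: float, lon: float, size: float = 0.01):
    """Snap a coordinate to a grid cell so nearby reports group together."""
    return (round(lat / size) * size, round(lon / size) * size)

hotspots = Counter((grid_cell(lat, lon), cat) for lat, lon, cat in reports)
for (cell, category), count in hotspots.most_common():
    print(f"{count} {category} report(s) near {cell}")
```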

Digital Activism

Social media has allowed local and global movements to spring up overnight, inviting broad participation and visibility. Twitter hashtags in particular have been instrumental for coalition building, coordination, and raising awareness among international audiences, media, and governments. Researchers began to take note of digital activism around the 2011 “Arab Spring,” when movements in Tunisia, Morocco, Syria, Libya, Egypt, and Bahrain, among other countries, leveraged social media to galvanize support. This pattern continued with the Occupy Wall Street movement in the United States, the Ukrainian Euromaidan movement in late 2013, and the Hong Kong protests in 2019.

In 2013, the acquittal of George Zimmerman in the death of unarmed 17-year-old Trayvon Martin inspired the creation of the #BlackLivesMatter hashtag. The movement grew stronger in response to the killings of Michael Brown in 2014 and George Floyd in 2020. The hashtag, at the forefront of an organized national protest movement, provided an outlet for people to join an online conversation and articulate alternative narratives in real time about subjects to which the media and the rest of the United States had not paid sufficient attention: police brutality, systemic racism, racial profiling, inequality, and more.

The #MeToo movement against sexual misconduct in the media industry, which also became a global movement, allowed a multitude of people to participate in activism previously bound to a certain time and place.

Some researchers and activists fear that social media will lead to “slacktivism” by giving people an excuse to stay at home rather than make a more dynamic response. Others fear that social media is ultimately insufficient for enacting meaningful social change, which requires nuanced political arguments. (Interestingly, a 2018 Pew Research survey on attitudes toward digital activism showed that just 39% of white Americans believed social media was an important tool for expressing themselves, while 54% of Black Americans said that it was an important tool for them.)

Social media has also enabled new online groups to gather and express a common sentiment as a form of solidarity or as a means to protest. When the COVID-19 pandemic broke out, many physical protests were suspended or canceled, and virtual protests proceeded in their place.

Expansion and engagement with international audiences at low cost

Social media provides a valuable opportunity for CSOs to reach their goals and engage with existing and new audiences. A good social-media strategy is underpinned by a dedicated staff position responsible for building a strong and consistent social media presence based on the organization’s purpose, values, and culture. This person should know how to seek information, be aware of both the risks and benefits of sharing information online, and understand the importance of using sound judgment when posting on social media. The USAID “Social Networking: A Guide to Strengthening Civil Society through Social Media” provides a set of questions to guide the development of a sound social-media policy, asking organizations to think about values, roles, content, tone, controversy, and privacy.

Increased awareness of services

Social media can be integrated into programmatic activities to strengthen the reach and impact of programming, for example, by generating awareness of an organization’s services to a new demographic. Organizations can promote their programs and services while responding to questions and fostering open dialogue. Widely used social media platforms can be useful to reach new audiences for training and consulting activities through webinars or individual meetings designed for NGOs.

Opportunities for Philanthropy and Fundraising

Social-media fundraising presents an important opportunity for nonprofits. After the August 2020 explosion in the port of Beirut, many Lebanese people started online fundraising pages for their organizations. Social media platforms were used extensively to share funding appeals with the global audience watching the disaster unfold, reinforced by traditional media coverage. However, organizations should carefully consider the type of campaign and the platforms they choose. TechSoup, a nonprofit providing tech support for NGOs, offers advice and an online course on fundraising with social media.

Emergency communication

In some contexts, civic actors rely on social media platforms to produce and disseminate critical information, for example, during humanitarian crises or emergencies. Even in a widespread disaster, the internet often remains a significant communication channel, which makes social media a useful, complementary means for emergency teams and the public. Reliance on the internet, however, increases vulnerability in the event of network shutdowns.

Back to top

Risks

In Kyiv, Ukrainian students share pictures at the opening ceremony of a Parliamentary Education Center. Photo credit: Press Service of the Verkhovna Rada of Ukraine, Andrii Nesterenko.

The use of social media can also create risks in civil society programming. Read below to learn how to discern the possible dangers associated with social media platforms in DRG work and how to mitigate both unintended and intended consequences.

Polarization and Ideological Segregation

Because of the platforms’ business models, the way content flows and is presented on social media risks limiting our access to information, particularly information that challenges our preexisting beliefs, by exposing us to content likely to attract our attention and support our views. The concept of the filter bubble refers to the filtering of information by online platforms to exclude information we as users have not already expressed an interest in. When paired with our own intellectual biases, filter bubbles worsen polarization by allowing us to live in echo chambers. This is easily witnessed in a YouTube feed: when you search for a song by an artist, you will likely be directed to more songs by the same artist or similar ones, because the algorithms are designed to prolong your viewing and assume you want more of the same. The same trend has been observed with political content. Social media algorithms encourage confirmation bias, exposing us to content we will agree with and enjoy, often at the expense of the accuracy, rigor, or educational and social value of that content.
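The dynamic is easy to reproduce in a toy model. The sketch below is not any platform’s real algorithm; it simply scores feed items by their overlap with topics the user has already engaged with, so content from outside the user’s history sinks to the bottom. That is the filter bubble in miniature.

    # Toy illustration of engagement-driven ranking (not a real platform's
    # algorithm): items are scored purely by overlap with topics the user
    # has already engaged with, so dissenting content ranks last.

    def score(item_topics, user_history):
        return sum(user_history.get(t, 0) for t in item_topics)

    user_history = {"candidate_a": 5, "music": 2}   # past likes per topic

    feed = [
        ("rally highlights", {"candidate_a"}),
        ("fact-check of candidate_a", {"candidate_a", "fact_check"}),
        ("opposing op-ed", {"candidate_b"}),
        ("new single", {"music"}),
    ]

    ranked = sorted(feed, key=lambda item: score(item[1], user_history),
                    reverse=True)
    for title, _ in ranked:
        print(title)

Even in this tiny example, the opposing op-ed is ranked last every time, not because it is wrong but because the user has never engaged with its topic.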

The massive and precise data amassed by advertisers and social media companies about our preferences and opinions facilitates the practice of micro-targeting, which involves the display of tailored content based on data about users’ online behaviors, connections, and demographics, as will be further explained below.

The increasingly tailored distribution of news and information on social media is a threat to political discourse, diversity of opinions, and democracy. Users can become detached even from factual information that disagrees with their viewpoints, and isolated within their own cultural or ideological bubbles.

Because the tailoring of news and other information on social media is driven largely by opaque algorithms owned by private companies, it is hard for users to avoid these bubbles. Taking in the very diverse information available on social media, with its many viewpoints, perspectives, ideas, and opinions, requires an explicit effort by the individual user to go beyond passive consumption of the content the algorithm presents to them.

Misinformation and Disinformation

The internet and social media provide new tools that amplify and alter the dangers posed by false, inaccurate, or out-of-context information. The online space increasingly drives discourse and is where much of today’s disinformation takes root. Refer to the Disinformation resource for a detailed overview of these problems.

Online Violence and Targeted Digital Attacks

Social media facilitates a number of violent behaviors such as defamation, harassment, bullying, stalking, “trolling,” and “doxxing.” Cyberbullying among children, much like traditional offline bullying, can harm students’ performance in school and cause real psychological damage. Cyberbullying is particularly harmful because victims experience the violence alone, isolated in cyberspace; they often do not seek help from parents and teachers, whom they believe cannot intervene. Cyberbullying is also difficult to address because it can move across social-media platforms, beginning on one and continuing on another. Like cyberbullying, cyber harassment and cyberstalking have very tangible offline effects. Women are most often the victims of cyber harassment and cyberviolence, sometimes through the use of stalkerware installed by their partners to track their movements. A frightening cyber-harassment trend accelerated in France during the COVID-19 pandemic in the form of “fisha” accounts, through which bullies, aggressors, or jilted ex-boyfriends published and circulated naked photos of teenage girls without their consent.

Journalists, women in particular, are often subject to cyber harassment and threats. Online violence against journalists, particularly those who write about socially sensitive or political topics, can lead to self-censorship, affecting the quality of the information environment and democratic debate. Social media provides new ways to spread and amplify hate speech and harassment. The use of fake accounts, bots, and botnets (automated networks of accounts) allows perpetrators to attack, overwhelm, and even disable the social media accounts of their victims. Revealing sensitive information about journalists through doxxing is another strategy used to induce self-censorship.

The 2014 case of Gamergate, when several women video-game developers were attacked by a coordinated harassment campaign that included doxxing and threats of rape and death, illustrates the strength and capacity of loosely connected hate groups online to rally together, inflict real violence, and even drown out criticism. Many of the actions of the most active Gamergate trolls were illegal, but their identities were unknown. Importantly, it has been suggested by supporters of Gamergate that the most violent trolls were a “smaller, but vocal minority” — evidence of the magnifying power of internet channels and their use for coordinated online harassment.

Online hoaxes, scams, and frauds, much like their traditional offline counterparts, usually aim to extract money or sensitive information from a target. The practice of phishing is increasingly common on social media: an attacker pretends to be a contact or a reputable source in order to deliver malware or extract personal information and account credentials. Spearphishing is a targeted phishing attack that leverages information about the recipient and details of the surrounding circumstances to achieve the same aim.
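As an illustration of how simple some of the warning signs can be, the sketch below applies a few naive heuristics that suggest a link may be a phishing attempt. These are toy rules, not a real defense (actual detection relies on reputation feeds, machine learning, and user training), and the domain names are invented.

    import re
    from urllib.parse import urlparse

    # Illustrative heuristics only; the "trusted" domain is made up.
    TRUSTED = {"example-bank.com"}

    def looks_suspicious(url):
        host = urlparse(url).hostname or ""
        reasons = []
        # A raw IP address where a domain name is expected is a red flag.
        if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host):
            reasons.append("raw IP address instead of a domain")
        # A trusted name embedded inside a different domain is a classic
        # lookalike trick (e.g., example-bank.com.attacker.net).
        for good in TRUSTED:
            if good in host and not host.endswith(good):
                reasons.append(f"embeds trusted name '{good}' in another domain")
        # Long hyphenated hostnames are common in lookalike domains.
        if host.count("-") >= 3:
            reasons.append("many hyphens, common in lookalike domains")
        return reasons

    for url in ["https://example-bank.com/login",
                "http://example-bank.com.secure-login-update-account.net/"]:
        print(url, "->", looks_suspicious(url) or "no obvious red flags")

Spearphishing is dangerous precisely because it defeats checks like these: a well-crafted message from a plausible sender, referencing real events, needs no suspicious-looking link at all.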

Data monetization by social media companies and tailored information streams

Most social media platforms are free to use. Rather than receiving revenue directly from users, as in a traditional subscription service, platforms generate profit primarily through digital advertising. Digital advertising is based on the collection of users’ data by social-media companies, which allows advertisers to target their ads to specific users and types of users. Social media platforms monitor their users and build detailed profiles that they sell to advertisers. The data tracked include information about the user’s connections and behavior on the platform, such as friends, posts, likes, searches, clicks, and mouse movements. Data are also extensively collected outside the platforms, including information about users’ location, web pages visited, online shopping, and banking behavior. Additionally, many companies regularly request permission to access the contacts and photos of their users.
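As a rough illustration of the kind of behavioral record described above, the sketch below defines a hypothetical tracking event. The field names are invented for the example; real platforms log far more, at far higher granularity.

    from dataclasses import dataclass

    # Hypothetical sketch of a single behavioral event record; field
    # names are invented and greatly simplified.

    @dataclass
    class TrackedEvent:
        user_id: str
        action: str        # e.g. "like", "click", "search", "scroll"
        target: str        # post ID, ad ID, or search query
        timestamp: float   # Unix time of the event
        location: tuple    # (lat, lon), often inferred from IP or GPS
        device: str        # device or OS identifier
        referrer: str = "" # page or app the user came from

    event = TrackedEvent("u42", "click", "ad_1093", 1_700_000_000.0,
                         (48.85, 2.35), "android", "news_feed")
    print(event)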

In the case of Facebook, this has led to a long-held and widespread conspiracy theory that the company listens to conversations to serve tailored advertisements. No one has ever been able to find clear evidence that this is actually happening. Research has shown that a company like Facebook does not need to listen in to your conversations, because it has the capacity to track you in so many other ways: “Not only does the system know exactly where you are at every moment, it knows who your friends are, what they are interested in, and who you are spending time with. It can track you across all your devices, log call and text metadata on phones, and even watch you write something that you end up deleting and never actually send.”

The massive and precise data amassed by advertisers and social-media companies about our preferences and opinions permit the practice of micro-targeting: displaying targeted advertisements based on what you have recently purchased, searched for, or liked. Just as online advertisers can target us with products, political parties can target us with more relevant or personalized messaging. Studies have attempted to determine the extent to which political micro-targeting is a serious concern for the functioning of democratic elections. Researchers and digital rights activists have also asked how micro-targeting may interfere with our freedom of thought.
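Conceptually, micro-targeting reduces to filtering profiles on inferred attributes. The sketch below is a simplified illustration, with invented profiles and field names, of how an advertiser (commercial or political) might select an audience; real ad platforms expose far richer targeting criteria.

    # Simplified, hypothetical sketch of audience selection for
    # micro-targeting; all profiles and attributes are invented.

    users = [
        {"id": 1, "age": 22, "region": "north", "interests": {"climate", "cycling"}},
        {"id": 2, "age": 45, "region": "south", "interests": {"fishing"}},
        {"id": 3, "age": 29, "region": "north", "interests": {"climate", "cooking"}},
    ]

    def select_audience(users, min_age, max_age, region, interest):
        """Return users matching every targeting criterion."""
        return [u for u in users
                if min_age <= u["age"] <= max_age
                and u["region"] == region
                and interest in u["interests"]]

    # Target a climate-themed message at young adults in one region.
    for user in select_audience(users, 18, 35, "north", "climate"):
        print(f"show ad to user {user['id']}")

The democratic concern is visible even here: users 1 and 3 see a message that user 2 never knows exists, so the targeted claim escapes public scrutiny.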

Government surveillance and access to personal data

The content shared on social media can be monitored by governments, which use social media for censorship, control, and information manipulation. Even democratic governments are known to engage in extensive social-media monitoring for law enforcement and intelligence-gathering purposes. These practices should be guided by robust legal frameworks and data protection laws that safeguard individuals’ rights online, but many countries have yet to enact such legislation.

There are also many examples of authoritarian governments using personal and other data harvested through social media to intimidate activists, silence opposition, and bring development projects to a halt. The information shared on social media often allows bad actors to build extensive profiles of individuals, enabling targeted online and offline attacks. Through social engineering, a phishing email can be carefully crafted based on social media data to trick an activist into clicking on a malicious link that provides access to their device, documents, or social-media accounts.

Sometimes, however, a strong, real-time presence on social media can protect a prominent activist against threats by the government. A disappearance or arrest would be immediately noticed by followers or friends of a person who suddenly becomes silent on social media.

Market power and differing regulation

We rely on social-media platforms to help fulfill our fundamental rights (freedom of expression, assembly, etc.). However, these platforms are massive global monopolies and have been referred to as “the new governors.” This market concentration poses challenges for national and international governance mechanisms. Simply breaking up the biggest platform companies will not fully solve the information disorders and social problems fueled by social media; civil society and governments also need visibility into the design choices made by the platforms in order to understand how to address the harms they facilitate.

The growing influence of social-media platforms has given many governments reason to impose laws on online content. There has been a surge in laws across the world regulating illegal and harmful content, such as incitement to terrorism or violence, false information, and hate speech. These laws often criminalize speech, imposing jail terms or high fines for something as minor as a retweet on X. Even in countries where the rule of law is respected, legal approaches to regulating online content may be ineffective due to the many technical challenges of content moderation, and they risk violating internet users’ freedom of expression by reinforcing imperfect, non-transparent moderation practices and encouraging over-deletion. Lastly, such laws force social media companies to navigate between complying with local law and defending international human rights law.

Impact on journalism

Social media has had a profound impact on the field of journalism. While it has enabled the emergence of the citizen journalist, local reporting, and crowdsourced information, social-media companies have disrupted the advertising relationship that sustained the traditional newspaper. This in turn has created a rewards system that privileges sensationalist, clickbait-style content over quality journalism relevant to local communities.

In addition, the way search tools work dramatically affects local publishers, as search is a powerful vector for news and information. Researchers have found that search rankings have a marked impact on our attention: not only do we tend to think information that is ranked more highly is more trusted and relevant, we also click on top results more often than lower ones. The Google search engine concentrates our attention on a narrow range of news sources, a trend that works against diverse and pluralistic media outlets. It also tends to work against the advertising revenue of smaller and community publishers, which depends on user attention and traffic. The result is a downward spiral: search results favor larger outlets; those results drive more user engagement; the larger outlets’ advertising inventory becomes more valuable; and those publishers grow larger still, earning yet more favorable search rankings.
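This feedback loop can be illustrated with a toy simulation: attention decays with rank position, clicks feed back into the next round’s ranking, and attention steadily concentrates on the largest outlet. All of the numbers below are illustrative assumptions, not measured values.

    # Toy simulation of the rank/attention feedback loop described above.
    # Attention per rank position and all starting values are assumed.

    position_weight = [1.0, 0.5, 0.25, 0.12]   # attention share by rank
    popularity = {"big_outlet": 100.0, "mid_outlet": 60.0,
                  "local_paper": 40.0, "blog": 20.0}

    initial_share = popularity["big_outlet"] / sum(popularity.values())

    for _ in range(5):                          # five ranking rounds
        ranking = sorted(popularity, key=popularity.get, reverse=True)
        for rank, outlet in enumerate(ranking):
            # Clicks earned this round feed the next round's ranking.
            popularity[outlet] += 50 * position_weight[rank]

    final_share = popularity["big_outlet"] / sum(popularity.values())
    print(f"top outlet's attention share: {initial_share:.0%} -> {final_share:.0%}")

Even with these mild assumptions, the top outlet’s share of total attention grows round after round while the smaller publishers never catch up, which is the “downward spiral” in miniature.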

Back to top

Questions

To understand the implications of social media information flows and of the platforms you choose to use in your work, ask yourself these questions:

  1. Does your organization have a social-media strategy? What does your organization hope to achieve through social media use?
  2. Do you have staff who can oversee and ethically moderate your social-media accounts and content?
  3. Which platform do you intend to use to accomplish your organization’s goals? What is the business model of that platform? How does this business model affect you as a user?
  4. How is content ordered and moderated on the platforms you use (by humans, volunteers, AI, etc.)?
  5. Where is the platform legally headquartered? What jurisdiction and legal frameworks does it fall under?
  6. Do the platforms chosen have mechanisms for users to flag harassment and hate speech for review and possible removal?
  7. Do the platforms have mechanisms for users to dispute decisions on content takedowns or blocked accounts?
  8. What user data are the platforms collecting? Who else has access to collected data and how is it being used?
  9. How does the platform engage its community of users and civil society (for instance, in flagging dangerous content, in giving feedback on design features, in fact-checking information, etc.)? Does the platform employ local staff in your country or region?
  10. Do the platforms have privacy features like encryption? If so, what level of encryption do they offer, and for which services (for example, only in the app, only in private message threads)? What are the default settings?

Back to top

Case Studies

Everyone saw Brazil violence coming. Except social media giants

“When far-right rioters stormed Brazil’s key government buildings on January 8, social media companies were again caught flat-footed. In WhatsApp groups—many with thousands of subscribers—viral videos of the attacks quickly spread like wildfire… On Twitter, social media users posted thousands of images and videos in support of the attacks under the hashtag #manifestacao, or protest. On Facebook, the same hashtag garnered tens of thousands of engagements via likes, shares and comments, mostly in favor of the riots… In failing to clamp down on such content, the violence in Brazil again highlights the central role social media companies play in the fundamental machinery of 21st century democracy. These firms now provide digital tools like encrypted messaging services used by activists to coordinate offline violence and rely on automated algorithms designed to promote partisan content that can undermine people’s trust in elections.”

Crowdsourced mapping in crisis zones: collaboration, organization and impact

“Within a crisis, crowdsourced mapping allows geo-tagged digital photos, aid requests posted on Twitter, aerial imagery, Facebook posts, SMS messages, and other digital sources to be collected and analyzed by multiple online volunteers…[to build] an understanding of the damage in an area and help responders focus on those in need. By generating maps using information sourced from multiple outlets, such as social media…a rich impression of an emergency situation can be generated by the power of ‘the crowd’.” Crowdsourced mapping has been employed in multiple countries during natural disasters, refugee crises, and even election periods.

What makes a movement go viral? Social media, social justice coalesce under #JusticeForGeorgeFloyd

A 2022 USC study was among the first to measure the link between social media posts and participation in the #BlackLivesMatter protests after the 2020 death of George Floyd. “The researchers found that Instagram, as a visual content platform, was particularly effective in mobilizing coalitions around racial justice by allowing new opinion leaders to enter public discourse. Independent journalists, activists, entertainers, meme groups and fashion magazines were among the many opinion leaders that emerged throughout the protests through visual communications that went viral. This contrasts with text-based platforms like Twitter that allow voices with institutional power (such as politicians, traditional news media or police departments) to control the flow of information.”

Myanmar: The social atrocity: Meta and the right to remedy for the Rohingya

A 2022 Amnesty International report investigated Meta’s role in the serious human rights violations perpetrated during the Myanmar security forces’ brutal campaign of ethnic cleansing against Rohingya Muslims starting in August 2017. The report found that “Meta’s algorithms proactively amplified and promoted content which incited violence, hatred, and discrimination against the Rohingya – pouring fuel on the fire of long-standing discrimination and substantially increasing the risk of an outbreak of mass violence.”

How China uses influencers to build a propaganda network

“As China continues to assert its economic might, it is using the global social media ecosystem to expand its already formidable influence. The country has quietly built a network of social media personalities who parrot the government’s perspective in posts seen by hundreds of thousands of people, operating in virtual lockstep as they promote China’s virtues, deflect international criticism of its human rights abuses, and advance Beijing’s talking points on world affairs like Russia’s war against Ukraine. Some of China’s state-affiliated reporters have posited themselves as trendy Instagram influencers or bloggers. The country has also hired firms to recruit influencers to deliver carefully crafted messages that boost its image to social media users. And it is benefitting from a cadre of Westerners who have devoted YouTube channels and Twitter feeds to echoing pro-China narratives on everything from Beijing’s treatment of Uyghur Muslims to Olympian Eileen Gu, an American who competed for China in the [2022] Winter Games.”

Why Latin American Leaders Are Obsessed With TikTok

“Latin American heads of state have long been early adopters of new social media platforms. Now they have seized on TikTok as a less formal, more effective tool for all sorts of political messaging. In Venezuela, Nicolas Maduro has been using the platform to share bite-sized pieces of propaganda on the alleged successes of his socialist agenda, among dozens of videos of himself dancing salsa. In Ecuador, Argentina and Chile, presidents use the app to give followers a view behind the scenes of government. In Brazil, former President Jair Bolsonaro and his successor Luiz Inácio Lula da Silva have been competing for views in the aftermath of a contested election…In much of the West, TikTok is the subject of political suspicion; in Latin America, it’s a cornerstone of political strategy.”

Back to top
