Big Data

What are big data?

“Big data” are also data, but they involve far larger volumes than can usually be handled on a desktop computer or in a traditional database. Big data are not only huge in volume; they also grow exponentially over time. Big data are so large and complex that none of the traditional data-management tools can store or process them efficiently. If you can process your data on your own computer, or in the database on your usual server, without it crashing, you are likely not working with “big data.”

How do big data work?

The field of big data has evolved as technology’s ability to constantly capture information has skyrocketed. Big data are usually captured without being entered into a database by a human being, in real time: in other words, big data are “passively” captured by digital devices.

The internet provides infinite opportunities to gather information, ranging from so-called meta-information or metadata (geographic location, IP address, time, etc.) to more detailed information about users’ behaviors, often drawn from social media activity or credit card purchases. Cookies are one of the principal ways that web browsers gather information about users: they are essentially tiny pieces of data stored in a web browser, little bits of memory about something you did on a website. (For more on cookies, visit this resource.)
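As a concrete sketch (the cookie name and value here are invented), this is roughly how a server constructs the cookie it asks a browser to store, using Python’s standard library:

```python
from http.cookies import SimpleCookie

# A server sets a cookie by sending a Set-Cookie header; the browser
# stores it and sends it back with every later request to the same site.
cookie = SimpleCookie()
cookie["session_id"] = "abc123"          # hypothetical identifier value
cookie["session_id"]["max-age"] = 86400  # persist for one day
cookie["session_id"]["httponly"] = True  # hide from in-page scripts

header = cookie["session_id"].OutputString()
print(header)  # e.g. session_id=abc123; Max-Age=86400; HttpOnly
```

Each time the browser returns that small identifier, the site can link the new visit to everything it already recorded about the same user.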

Datasets can also be assembled from the Internet of Things (IoT): sensors tied into other devices and networks. For example, sensor-equipped streetlights might collect traffic information that can then be analyzed to optimize traffic flow. The collection of data through sensors is a common element of smart city infrastructure.

Healthcare workers in Indonesia. The use of big data can improve health systems and inform public health policies. Photo credit: courtesy of USAID EMAS.

Big data can also be medical or scientific data, such as DNA information or data related to disease outbreaks. This can be useful to humanitarian and development organizations. For example, during the Ebola outbreak in West Africa between 2014 and 2016, UNICEF combined data from a number of sources, including population estimates, information on air travel, estimates of regional mobility from mobile phone records and tagged social media locations, temperature data, and case data from WHO reports to better understand the disease and predict future outbreaks.

Big data are created and used by a variety of actors. In data-driven societies, most actors (private sector, governments, and other organizations) are encouraged to collect and analyze data to notice patterns and trends, measure success or failure, optimize their processes for efficiency, and so on. Not all actors create datasets themselves; often they collect publicly available data or even purchase data from specialized companies. For instance, in the advertising industry, data brokers specialize in collecting and processing information about internet users, which they then sell to advertisers. Other actors, such as energy providers, railway companies, ride-sharing companies, and governments, create their own datasets. Data are everywhere, and the actors capable of collecting and analyzing them intelligently are numerous.


How are big data relevant in civic space and for democracy?

In Tanzania, an open source platform allows government and financial institutions to record all land transactions to create a comprehensive dataset. Photo credit: Riaz Jahanpour for USAID / Digital Development Communications.

From forecasting presidential elections to helping small-scale farmers deal with a changing climate to predicting disease outbreaks, analysts are finding ways to turn big data into an invaluable resource for planning and decision-making. Big data can provide civil society with powerful insights and the ability to share vital information. Big data tools have recently been deployed in civic space in a number of interesting ways, for example, to:

  • monitor elections and support open government (starting in Kenya with Ushahidi in 2008)
  • track epidemics like Ebola in Sierra Leone and other West African nations
  • track conflict-related deaths worldwide
  • understand the impact of ID systems on refugees in Italy
  • measure and predict agricultural success and distribution in Latin America
  • press forward with new discoveries in genetics and cancer treatment
  • make use of geographic information systems (GIS mapping applications) in a range of contexts, including planning urban growth and traffic flow sustainably, as has been done by the World Bank in various countries in South Asia, East Asia, Africa, and the Caribbean

The use of big data that are collected, processed, and analyzed to improve health systems or environmental sustainability, for example, can ultimately greatly benefit individuals and society. However, a number of concerns and cautions have been raised about the use of big datasets. Privacy and security concerns are foremost: big data are often captured without our awareness and used in ways to which we may not have consented, sometimes sold many times through a chain of companies we never interacted with, exposing the data to security risks such as breaches. It is also crucial to understand that supposedly anonymous data can still be used to “re-identify” people represented in the dataset – with 87% accuracy using as little as postal code, gender, and date of birth – conceivably putting them at risk (see the discussion of re-identification below).

There are also power imbalances (divides) in who is represented in the data as opposed to who has the power to use them. Those who are able to extract value from big data are often large companies or other actors with the financial means and capacity to collect (sometimes purchase), analyze, and understand the data.

This means the individuals and groups whose information is put into datasets (shoppers whose credit card data are processed, internet users whose clicks are registered on a website) do not generally benefit from the data they have given. For example, data about what items shoppers buy in a store are more likely to be used to maximize profits than to help customers with their buying decisions. The extractive way that data are taken from individuals’ behaviors and used for profit has been called “surveillance capitalism,” which some believe is undermining personal autonomy and eroding democracy.

The quality of datasets must also be taken into consideration, as those using the data may not know how or where they were gathered, processed, or integrated with other data. And when storing and transmitting big data, security concerns are multiplied by the increased numbers of machines, services, and partners involved. It is also important to keep in mind that big datasets are not inherently useful; they become useful only in combination with the ability to analyze them and draw insights from them, using advanced algorithms, statistical models, and the like.

Last but not least, there are important considerations related to protecting the fundamental rights of those whose information appears in datasets. Sensitive, personally identifiable, or potentially personally identifiable information can be used by other parties or for other purposes than those intended, to the detriment of the individuals involved. This is explored below and in the Risks section, as well as in other primers.

Protecting anonymity of those in the dataset

Anyone who has done research in the social or medical sciences should be familiar with the idea that when collecting data on human subjects, it is important to protect their identities so that they do not face negative consequences from being involved in research, such as being known to have a particular disease, to have voted in a particular way, or to have engaged in stigmatized behavior (see the Data Protection resource). The traditional ways of protecting identities – removing certain identifying information, or only reporting statistics in aggregate – can and should also be used when handling big datasets to help protect those in the dataset. Data can also be hidden in multiple ways to protect privacy: methods include encryption (encoding), tokenization, and data masking. Talend identifies the strengths and weaknesses of the primary strategies for hiding data using these methods.

One of the biggest dangers involved in using big datasets is the possibility of re-identification: figuring out the real identities of individuals in the dataset, even if their personal information has been hidden or removed. To give a sense of how easy it can be to identify individuals in a large dataset, one study found that using only three fields of information—postal code, gender, and date of birth—it was possible to identify 87% of Americans individually, and then connect their identities to publicly available databases containing hospital records. With more data points, researchers have demonstrated a near-perfect ability to identify individuals in a dataset: four random pieces of information from credit card records were enough to identify 90% of individuals, and researchers were able to re-identify individuals with 99.98% accuracy using 15 data points.
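The mechanics are simple to demonstrate. The toy sketch below (with invented records) counts how many rows share each combination of postal code, gender, and date of birth; any row that is unique on those fields is a candidate for re-identification by joining with an outside dataset such as a voter roll:

```python
from collections import Counter

# Hypothetical, already "de-identified" rows: names removed, but the
# quasi-identifiers (postal code, gender, date of birth) remain.
records = [
    {"postal": "02138", "gender": "F", "dob": "1960-07-22"},
    {"postal": "02138", "gender": "M", "dob": "1985-03-01"},
    {"postal": "02139", "gender": "F", "dob": "1960-07-22"},
    {"postal": "02138", "gender": "M", "dob": "1985-03-01"},
]

def key(r):
    return (r["postal"], r["gender"], r["dob"])

counts = Counter(key(r) for r in records)
unique = [r for r in records if counts[key(r)] == 1]
print(f"{len(unique)} of {len(records)} records are unique on these fields")
# -> 2 of 4 records are unique on these fields
```

At the scale of a national population, most people turn out to be unique on just a handful of such fields, which is why removing names alone is not anonymization.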

Ten simple rules for responsible big data research, quoted from a paper of the same name by Zook, Barocas, Boyd, Crawford, Keller, Gangadharan et al., 2017

  1. Acknowledge that data are people and that data can do harm. Most data represent or affect people. Simply starting with the assumption that all data are people until proven otherwise places the difficulty of disassociating data from specific individuals front and center.
  2. Recognize that privacy is more than a binary value. Privacy may be more or less important to individuals as they move through different contexts and situations. Looking at someone’s data in bulk may have different implications for their privacy than looking at one record. Privacy may be important to groups of people (say, by demographic) as well as to individuals.
  3. Guard against the reidentification of your data. Be aware that apparently harmless, unexpected data, like phone battery usage, could be used to re-identify data. Plan to ensure your data sharing and reporting lowers the risk that individuals could be identified.
  4. Practice ethical data sharing. There may be times when participants in your dataset expect you to share (such as with other medical researchers working on a cure), and others where they trust you not to share their data. Be aware that other identifying data about your participants may be gathered, sold, or shared about them elsewhere, and that combining that data with yours could identify participants individually. Be clear about how and when you will share data and stay responsible for protecting the privacy of the people whose data you collect.
  5. Consider the strengths and limitations of your data; big does not automatically mean better. Understand where your large dataset comes from, and how that may evolve over time. Don’t overstate your findings and acknowledge when they may be messy or have multiple meanings.
  6. Debate the tough, ethical choices. Talk with your colleagues about these ethical concerns. Follow the work of professional organizations to stay current with concerns.
  7. Develop a code of conduct for your organization, research community, or industry and engage your peers in creating it to ensure unexpected or under-represented perspectives are included.
  8. Design your data and systems for auditability. This both strengthens the quality of your research and services and can give early warnings about problematic uses of the data.
  9. Engage with the broader consequences of data and analysis practices. Keep social equality, the environmental impact of big data processing, and other society-wide impacts in view as you plan big data collection.
  10. Know when to break these rules. With debate, code of conduct, and auditability as your guide, consider that in a public health emergency or other disaster, you may find there are reasons to put the other rules aside.

Gaining informed consent

Those providing their data may not be aware at the time that their data may be sold later to data brokers who may then re-sell them.

Unfortunately, data privacy consent forms are generally hard for the average person to read, even in the wake of the General Data Protection Regulation’s (GDPR) expansion of privacy protections. Terms of Service (ToS) documents are so notoriously difficult to read that one filmmaker even made a documentary on the subject. Researchers who have studied terms of service and privacy policies have found that users generally accept them without reading them, because they are too long and complex. Moreover, users who need to access a platform or service for personal reasons (for example, to get in contact with a relative) or for their livelihood (to deliver their products to customers) may not be able to simply reject the ToS when they have no viable or immediate alternative.

Important work is being done to try to protect users of platforms and services from these kinds of abusive data-sharing situations. For example, Carnegie Mellon’s Usable Privacy and Security laboratory (CUPS) has developed best practices to inform users about how their data may be used. These take the shape of data privacy “nutrition labels” that are similar to FDA-specified food nutrition labels and are evidence-based.

In Chipata, Zambia, a resident draws water from a well. Big data offer invaluable insights for the design of climate change solutions. Photo credit: Sandra Coburn.


Opportunities

Big data can have positive impacts when used to further democracy, human rights, and governance. Read below to learn how to think about big data in your work more effectively and safely.

Greater insight

Big datasets can contain some of the richest, most comprehensive information that has ever been available in human history. Researchers using big datasets have access to information from a massive population. These insights can be much more useful and convenient than self-reported data or data gathered from logistically tricky observational studies, though there is a trade-off between the richness of insights gained from self-reported or carefully collected data and the generalizability of insights drawn from big data. Big data gathered from social media activity or sensors can also allow for real-time measurement of activity at a large scale. Big data insights are very important in the field of logistics. For example, the United States Postal Service collects data across its package deliveries using GPS and vast networks of sensors and other tracking methods, then processes these data with specialized algorithms. These insights allow it to optimize deliveries for environmental sustainability.

Increased access to data

Making big datasets publicly available can begin to close divides in access to data. Apart from some public datasets, big data often end up as the property of corporations, universities, and other large organizations. Even though the data produced are about individual people and their communities, those individuals and communities may not have the money or technical skills needed to access the data and make productive use of them. This creates the risk of worsening existing digital divides.

Publicly available data have helped communities understand and act on government corruption, municipal issues, human rights abuses, and health crises, among other things. Again, though, when data are made public, it is particularly important to ensure strong privacy protections for those whose data are in the dataset. The work of the Our Data Bodies project provides additional guidance on how to engage with communities whose data are in the datasets. Their workshop materials can support community understanding of, and engagement in, ethical decisions around data collection and processing, and around how to monitor and audit data practices.


Risks

The use of emerging technologies to collect data can also create risks in civil society programming. Read below on how to discern the possible dangers associated with big data collection and use in DRG work, as well as how to mitigate unintended – and intended – consequences.

Surveillance

With the potential for re-identification as well as the nature and aims of some uses of big data, there is a risk that individuals included in a dataset will be subjected to surveillance by governments, law enforcement, or corporations. This may put the fundamental rights and safety of those in the dataset at risk.  

The Chinese government is routinely criticized for the invasive surveillance of Chinese citizens through the gathering and processing of big data. More specifically, the Chinese government has been criticized for its system of social ranking of citizens based on their social media, purchasing, and education data, as well as for the gathering of DNA from members of the Uighur minority (with the assistance of a US company, it should be noted). China is certainly not the only government to abuse citizen data in this way. Edward Snowden’s revelations about the US National Security Agency’s gathering and use of social media and other data were among the first public warnings about the surveillance potential of big data. Concerns have also been raised about partnerships involved in the development of India’s Aadhaar biometric ID system, technology whose producers are eager to sell it to other countries. In the United States, privacy advocates have raised concerns about companies and governments gathering data at scale about students through their school-provided devices, a concern that should also be raised in any international context when laptops or mobile phones are provided for students.

It must be emphasized that surveillance concerns are not limited to the institutions originally gathering the data, whether governments or corporations. When data are sold or combined with other datasets, it is possible that other actors, from email scammers to abusive domestic partners, could access the data and track, exploit, or otherwise harm people appearing in the dataset. 

Data security concerns

Because big data are collected, cleaned, and combined through long, complex pipelines of software and storage, they present significant challenges for security. These challenges are multiplied whenever the data are shared among many organizations. Any stream of data arriving in real time (for example, information about people checking into a hospital) will need to be specifically protected from tampering, disruption, or surveillance. Given that the data may present significant risks to the privacy and safety of those included in them and may be very valuable to criminals, it is important to ensure that sufficient resources are provided for security.

Existing security tools for websites are not enough to cover the entire big data pipeline. Major investments in staff and infrastructure are needed to provide proper security coverage and respond to data breaches. Unfortunately, industry faces known shortages of big data specialists, particularly security personnel familiar with the unique challenges big data present. Internet of Things sensors pose a particular risk if they are part of the data-gathering pipeline, as these devices are notorious for having poor security. For example, a malicious actor could easily introduce fake sensors into the network or fill the collection pipeline with garbage data in order to render the data collection useless.

Exaggerated expectations of accuracy and objectivity

Big data companies and their promoters often claim that big data can be more objective or accurate than traditionally gathered data, supposedly because human judgment does not come into play and because of the scale at which the data are gathered. This picture downplays the fact that algorithms and computer code also bring human judgment to bear on data, including biases, and that relevant data may be accidentally excluded. Human interpretation is always necessary to make sense of patterns in big data, so, again, claims of objectivity should be taken with healthy skepticism.

It is important to ask questions about data-gathering methods, the algorithms involved in processing, and the assumptions or inferences made by the data gatherers/programmers and their analyses, to avoid falling into the trap of assuming big data are “better.” For example, while data about the proximity of two cell phones tell you that two people were near each other, only human interpretation can tell you why those two people were near each other, and how an analyst interprets that closeness may differ from what the people carrying the cell phones might tell you. This is a major challenge in using phones for “contact tracing” in epidemiology. During the COVID-19 health crisis, many countries raced to build contact tracing cellphone apps. The precise purposes and functioning of these apps vary widely (as has their effectiveness), but it is worth noting that major tech companies have preferred to call them “exposure-risk notification” apps rather than contact tracing apps: the apps can only tell you whether you have been in proximity to someone with the coronavirus, not whether you have contracted the virus.

Misinterpretation

As with all data, there are pitfalls when it comes to interpreting and drawing conclusions. Because big data are often captured and analyzed in real time, they may be particularly weak at providing historical context for the patterns they highlight. Anyone analyzing big data should also consider what their sources were, whether the data were combined with other datasets, and how they were cleaned. Cleaning refers to the process of correcting or removing inaccurate or extraneous data. This is particularly important with social media data, which can have lots of “noise” (extra information) and are therefore almost always cleaned.
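As an illustration, a minimal cleaning pass over a hypothetical stream of social media posts might normalize whitespace, drop empty or spam-like entries, and remove verbatim duplicates (the posts and the crude spam rule below are invented for the example):

```python
raw_posts = [
    {"id": 1, "text": "Clinic opens Monday  #health"},
    {"id": 2, "text": ""},                              # empty -> drop
    {"id": 1, "text": "Clinic opens Monday  #health"},  # duplicate -> drop
    {"id": 3, "text": "RT @bot: buy now!!!"},           # spam-like -> drop
]

seen_ids = set()
clean = []
for post in raw_posts:
    text = " ".join(post["text"].split())   # normalize stray whitespace
    if not text or text.startswith("RT @bot"):
        continue                            # inaccurate/extraneous data
    if post["id"] in seen_ids:
        continue                            # verbatim duplicate
    seen_ids.add(post["id"])
    clean.append({"id": post["id"], "text": text})

print(f"kept {len(clean)} of {len(raw_posts)} posts")  # kept 1 of 4 posts
```

Every one of these choices (what counts as spam, what counts as a duplicate) is a human judgment baked into the pipeline, which is exactly why analysts need to know how a dataset was cleaned before trusting conclusions drawn from it.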


Questions

If you are trying to understand the implications of big data in your work environment, or are considering using aspects of big data as part of your DRG programming, ask yourself these questions:

  1. Is gathering big data the right approach for the question you’re trying to answer? How would your question be answered differently using interviews, historical research, or a focus on statistical significance?
  2. Do you already have these data, or are they publicly available? Is it really necessary to acquire these data yourself?
  3. What is your plan to make it impossible to identify individuals through their data in your dataset? If the data come from someone else, what kind of anonymization have they already performed?
  4. How could individuals be made more identifiable by someone else when you publish your data and findings? What steps can you take to lower the risk they will be identified?
  5. What is your plan for getting consent from those whose data you are collecting? How will you make sure your consent document is easy for them to understand?
  6. If your data come from another organization, how did they seek consent? Did that consent include consent for other organizations to use the data?
  7. If you are getting data from another organization, what is the original source of these data? Who collected them, and what were they trying to accomplish?
  8. What do you know about the quality of these data? Is someone inspecting them for errors, and if so, how? Did the collection tools fail at any point, or do you suspect that there might be some inaccuracies or mistakes?
  9. Have these data been integrated with other datasets? If data were used to fill in gaps, how was that accomplished?
  10. What is the end-to-end security plan for the data you are capturing or using? Are there third parties involved whose security propositions you need to understand?


Case Studies

Village resident in Tanzania. Big data analytics can pinpoint strategies that work for small-scale farmers. Photo credit: Riaz Jahanpour for USAID / Digital Development Communications.
Digital Identity in the Migration and Refugee Context


For migrants and refugees in Italy, identity data collection processes can “exacerbate existing biases, discrimination, or power imbalances.” One key challenge is obtaining meaningful consent. Often, biometric data are collected as soon as migrants and refugees arrive in a new country, at a moment when they may be vulnerable and overwhelmed. Language barriers exacerbate the issue, making it difficult to provide adequate context around rights to privacy. Identity data are collected inconsistently by different organizations, all of whose data protection and privacy practices vary widely.

Big Data for climate-smart agriculture


“Scientists at the International Center for Tropical Agriculture (CIAT) have applied Big Data tools to pinpoint strategies that work for small-scale farmers in a changing climate…. Researchers have applied Big Data analytics to agricultural and weather records in Colombia, revealing how climate variation impacts rice yields. These analyses identify the most productive rice varieties and planting times for specific sites and seasonal forecasts. The recommendations could potentially boost yields by 1 to 3 tons per hectare. The tools work wherever data is available, and are now being scaled out through Colombia, Argentina, Nicaragua, Peru and Uruguay.”

School Issued Devices and Student Privacy

See particularly the Best Practices for Ed Tech Companies section. “Students are using technology in the classroom at an unprecedented rate…. Student laptops and educational services are often available for a steeply reduced price and are sometimes even free. However, they come with real costs and unresolved ethical questions. Throughout EFF’s investigation over the past two years, [they] have found that educational technology services often collect far more information on kids than is necessary and store this information indefinitely. This privacy-implicating information goes beyond personally identifying information (PII) like name and date of birth, and can include browsing history, search terms, location data, contact lists, and behavioral information…All of this often happens without the awareness or consent of students and their families.”

Big Data and Thriving Cities: Innovations in Analytics to Build Sustainable, Resilient, Equitable and Livable Urban Spaces.


This paper includes case studies of big data used to track changes in urbanization, traffic congestion, and crime in cities. “[I]nnovative applications of geospatial and sensing technologies and the penetration of mobile phone technology are providing unprecedented data collection. This data can be analyzed for many purposes, including tracking population and mobility, private sector investment, and transparency in federal and local government.”

Battling Ebola in Sierra Leone: Data Sharing to Improve Crisis Response.


“Data and information have important roles to play in the battle not just against Ebola, but more generally against a variety of natural and man-made crises. However, in order to maximize that potential, it is essential to foster the supply side of open data initiatives – i.e., to ensure the availability of sufficient, high-quality information. This can be especially challenging when there is no clear policy backing to push actors into compliance and to set clear standards for data quality and format. Particularly during a crisis, the early stages of open data efforts can be chaotic, and at times redundant. Improving coordination between multiple actors working toward similar ends – though difficult during a time of crisis – could help reduce redundancy and lead to efforts that are greater than the sum of their parts.”

Tracking Conflict-Related Deaths: A Preliminary Overview of Monitoring Systems.


“In the framework of the United Nations 2030 Agenda for Sustainable Development, states have pledged to track the number of people who are killed in armed conflict and to disaggregate the data by sex, age, and cause—as per Sustainable Development Goal (SDG) Indicator 16. However, there is no international consensus on definitions, methods, or standards to be used in generating the data. Moreover, monitoring systems run by international organizations and civil society differ in terms of their thematic coverage, geographical focus, and level of disaggregation.”

Balancing data utility and confidentiality in the US census.


Describes how the Census is using differential privacy to protect the data of respondents. “As the Census Bureau prepares to enumerate the population of the United States in 2020, the bureau’s leadership has announced that they will make significant changes to the statistical tables the bureau intends to publish. Because of advances in computer science and the widespread availability of commercial data, the techniques that the bureau has historically used to protect the confidentiality of individual data points can no longer withstand new approaches for reconstructing and reidentifying confidential data. … [R]esearch at the Census Bureau has shown that it is now possible to reconstruct information about and reidentify a sizeable number of people from publicly available statistical tables. The old data privacy protections simply don’t work anymore. As such, Census Bureau leadership has accepted that they cannot continue with their current approach and wait until 2030 to make changes; they have decided to invest in a new approach to guaranteeing privacy that will significantly transform how the Census Bureau produces statistics.”
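At its core, the differential privacy approach the bureau describes works by adding carefully calibrated random noise to each published count, so that no single person’s presence or absence detectably changes the output. A toy sketch of the standard Laplace mechanism (the count and epsilon below are hypothetical, and real deployments use far more careful implementations):

```python
import random

def laplace_noise(scale: float) -> float:
    # A Laplace(0, scale) sample is the difference of two exponentials.
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def private_count(true_count: int, epsilon: float) -> float:
    # For a counting query, one person changes the count by at most 1
    # (sensitivity 1), so the noise scale is 1/epsilon. Smaller epsilon
    # means stronger privacy and noisier published statistics.
    return true_count + laplace_noise(1 / epsilon)

true_population = 10_000  # hypothetical block-level count
print(round(private_count(true_population, epsilon=0.5)))
# close to, but deliberately not exactly, 10000
```

The privacy parameter epsilon makes the utility-confidentiality trade-off explicit: agencies can publish statistics that remain accurate in aggregate while provably limiting what can be inferred about any one respondent.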


References

Find below the works cited in this resource.

Additional Resources



Social Media

What is social media?

Social media provide spaces for people and organizations to share and access news and information, communicate with beneficiaries and advocate for change. Social-media content includes the text, photos, videos, infographics, or any other material placed on a blog, Facebook page, Twitter account, etc. for the audience to consume, interact with, and circulate. Content is curated by platforms and delivered to users according to what is most likely to attract their attention. There is an ever-expanding amount of content available on these platforms.

Digital inclusion center in the Peruvian Amazon. For NGOs, social media platforms can be useful to reach new audiences and to raise awareness of services. Photo credit: Jack Gordon for USAID / Digital Development Communications.

Theoretically, through social media, everyone has a way to speak out and reach audiences across the world, which can be empowering and bring people together. At the same time, much of what is shared on social media can be misleading, hateful, and dangerous, which in theory imposes a level of responsibility on the owners of the platforms to moderate content.

How does social media work?

Social media platforms are owned by private companies, with business models usually based on advertising and monetization of users’ data. This affects the way that content appears to users, and influences data-sharing practices. Moderating content on these social-media spaces brings its own challenges and complications because it requires balancing multiple fundamental freedoms. Understanding the content moderation practices and business models of the platforms is essential to reap the benefits while mitigating the risks of using social media.

Business Models

Most social-media platforms rely on advertising. Advertisers pay for engagement, such as clicks, likes and shares. Therefore, sensational and attention-grabbing content is more valuable. This motivates platforms to use automated-recommendation technology that relies on algorithmic decision-making to prioritize content likely to grab attention. The main strategy of “user-targeted amplification” shows users content that is most likely to interest them based on detailed data that are collected about them. See more in the Risk section under Data Monetization by social media companies and tailored information streams.
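As an illustration of how engagement-based ranking works in principle, the sketch below scores posts by weighted engagement signals and sorts a feed accordingly. The weights and data here are invented for illustration; real platforms use far more complex, proprietary models.

```python
# Toy sketch of engagement-based feed ranking (weights are invented).
from dataclasses import dataclass

@dataclass
class Post:
    clicks: int
    likes: int
    shares: int

def engagement_score(post: Post) -> float:
    # Shares are weighted most heavily here because they spread content
    # furthest; real platforms tune such weights with machine learning.
    return 1.0 * post.clicks + 2.0 * post.likes + 5.0 * post.shares

posts = [Post(100, 10, 1), Post(20, 5, 40), Post(50, 50, 0)]
feed = sorted(posts, key=engagement_score, reverse=True)
# The heavily shared post ranks first, regardless of its accuracy or value.
```

Note that nothing in the score measures accuracy or social value, which is precisely why attention-grabbing content tends to win out.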

The Emergence of Programmatic Advertising

The transition of advertising to digital systems has dramatically altered the advertising business. In an analog world, advertising placements were predicated on aggregate demographics, collected by publishers and measurement firms. These measurements were rough, capable at best of tracking subscribers and household-level engagement. Advertisers hoped their ads would be seen by enough of their target demographic (for example, men between 18 and 35 with income at a certain level) to be worth their while. Even more challenging was tracking the efficacy of the ads. Systems for measuring if an ad resulted in a sale were limited largely to mail-in cards and special discount codes.

The emergence of digital systems changed all of that. Pioneered for the most part by Google and then supercharged by Facebook in the early years of the 21st century, a new promise emerged: “Place ads through our platform, and we can put the right ad in front of the right person at the right time. Not only that, but we can report back to you (advertiser) who saw the ad, if they clicked on it, and if that click led to a ‘conversion’ or a sale.”

But this promise has come with significant unintended consequences. The way that the platforms—and the massive ad tech industry that has rapidly emerged alongside them—deliver on this promise requires a level of data gathering, tracking and individual surveillance unprecedented in human history. The tracking of individual behaviors, preferences and habits powers the wildly profitable digital advertising industry, dominated by platforms that can control these data at scale.

Managing huge consumer data sets at the scale and speed required to deliver value to advertisers has come to mean a heavy dependence on algorithms to do the searching, sorting, tracking, placement and delivery of ads. This development of sophisticated algorithms led to the emergence of programmatic advertising, which is the placement of ads in real time on websites with no human intervention. Programmatic advertising made up roughly two thirds of the $237 billion global ad market in 2019.
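Programmatic placement is typically implemented as a real-time auction: when a page loads, advertisers' automated bids compete for the impression in milliseconds. The sketch below shows a simplified second-price auction, a mechanism commonly used by ad exchanges; the advertiser names and bid values are invented.

```python
# Simplified second-price ad auction: the highest bidder wins the ad slot
# but pays the second-highest bid. Names and bid values are invented.

def run_auction(bids: dict[str, float]) -> tuple[str, float]:
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    # With only one bidder, the winner pays their own bid.
    price = ranked[1][1] if len(ranked) > 1 else ranked[0][1]
    return winner, price

winner, price = run_auction({"brand_a": 2.50, "brand_b": 1.75, "brand_c": 0.90})
# winner == "brand_a", price == 1.75
```

In a real exchange, an auction like this runs for every single page impression, informed by the viewer's tracked profile.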

The digitization of the advertising market, particularly the dominance of programmatic advertising, has resulted in a highly uneven playing field. The technology companies enter with a significant advantage: they built the new structures and set the terms of engagement. What began as a value add in the new digital space— “We will give advertisers efficiency and publishers new audience and revenue streams”—has evolved to disadvantage both.

One of the primary challenges is in how audience engagement is measured and tracked. The primary performance indicators in the digital world are views and clicks. As mentioned above (and well documented in the literature), an incentive structure based on views and clicks (engagement) tends to favor sensational and eye-catching content. In the race for engagement, misleading or false content, with dramatic headlines and incendiary claims, consistently wins out over more balanced news and information. See also the section on digital advertising in the disinformation resource.

Advertising-motivated content

Platforms leverage tools like hashtags and search engine optimization (SEO) to rank and cluster content around certain topics. Unfortunately, automated content curation motivated by advertising does not tend to prioritize healthful, educational, or rigorous content. Instead, frivolous, distracting, potentially untrue or even harmful content tends to spread more widely: conspiracy theories, shocking or violent content and “click-bait” (misleading phrases designed to entice viewing). Many platforms have features of upvoting (“like” buttons) which, similar to hashtags and SEO, influence the algorithmic moderation and promote certain content to circulate more widely. These features together cause “virality,” one of the defining features of the social-media ecosystem: the tendency of an image, video, or piece of information to be circulated rapidly and widely.

In some cases, virality can spark political activism and raise awareness (like the #MeToo hashtag), but it can also amplify tragedies and spread inaccurate information (anti-vaccine information and other health rumors, etc.). Additionally, the business models of the platforms reward quantity over quality (number of “likes”, “followers”, and views), encouraging a growth logic that has led to the problem of information saturation or information overload, overwhelming users with seemingly infinite content. Indeed, design decisions like the “infinite scroll” intended to make our social media spaces ever larger and more entertaining have been associated with impulsive behaviors, increased distraction, attention-seeking behavior, lower self-esteem, etc.

Many digital advertising strategies raise risks regarding access to information, privacy, and discrimination, in part because of their pervasiveness and subtlety. Influencer marketing, for example, is the practice of sponsoring a social media influencer to promote or use a certain product by working it into their social-media content, while native advertising is the practice of embedding ads in or beside other non-paid content. Most consumers do not know what native advertising is and may not even know when they are being delivered ads.

It is not new for brands to strategically place their content. However, today there is much more advertising, and it is seamlessly integrated with other content. In addition, the design of platforms makes content from diverse sources—advertisers and news agencies, experts and amateurs—indistinguishable. Individuals’ right to information and basic guarantees of transparency are at stake if advertisements are placed on equal footing with desired content.

Content Moderation

Content moderation is at the heart of the service that social-media platforms provide: the hosting and curation of the content uploaded by their users. Content moderation is not just the review of content, but every design decision made by the platforms, from the Terms of Service and their Community Guidelines, to the algorithms they use to rank and order content, to the types of content they allow and encourage through design features (“like”, “follow”, “block”, “restrict”, etc.).

Content moderation is particularly challenging because of the issues it raises around freedom of expression. While it is necessary to address massive quantities of harmful content that circulate widely, educational, historic, or journalistic content is often censored by algorithmic moderation systems. In 2016, Facebook took down a post with a Pulitzer Prize-winning image of a naked 9-year-old girl fleeing a napalm bombing and suspended the account of the journalist who had posted it.

Though nations differ in their stances on freedom of speech, international human rights law provides a framework for balancing freedom of expression against other rights and against protections for vulnerable groups. Still, content-moderation challenges grow as content itself evolves, for instance through the rise of live streaming, ephemeral content, voice assistants, etc. Moderating internet memes is particularly challenging because of their ambiguity and ever-changing nature; and yet meme culture is a central tool used by the far right to share ideology and glorify violence. Some manipulation techniques are also intentionally difficult to detect, for example, “dog whistling” (sending coded messages to subgroups of the population) and “gaslighting” (psychological manipulation to make people doubt their own knowledge or judgement).

Automated moderation

Content moderation is usually performed by a mix of humans and artificial intelligence, with the precise mix depending on the platform and the category of content. The largest platforms, like Facebook and YouTube, use automated tools to filter content as it is uploaded. Facebook, for example, claims it can detect up to 80% of hate-speech content in some languages as it is posted, before submitting it for human review. The working conditions of the human moderators have been heavily criticized, but the accuracy and transparency of the algorithms are also disputed and, unsurprisingly, show some concerning biases. Humans are of course subject to biases as well, but algorithmic bias in content moderation poses more serious threats to equity and freedom of expression.
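A minimal sketch of this hybrid pipeline, assuming a hypothetical classifier and invented thresholds: confident detections are removed automatically, uncertain ones are queued for human review, and the rest are published.

```python
# Hybrid content-moderation sketch: automation handles clear-cut cases,
# humans handle uncertain ones. Classifier and thresholds are invented.

BANNED_TERMS = {"slur1", "slur2"}  # placeholder terms for illustration

def harm_score(text: str) -> float:
    """Stand-in for a trained model: fraction of words that are banned."""
    words = text.lower().split()
    return sum(w in BANNED_TERMS for w in words) / max(len(words), 1)

def moderate(text: str, remove_above: float = 0.8, review_above: float = 0.3) -> str:
    score = harm_score(text)
    if score >= remove_above:
        return "removed"        # high confidence: filtered automatically
    if score >= review_above:
        return "human_review"   # uncertain: routed to a human moderator
    return "published"
```

The thresholds encode the trade-off discussed above: lowering them catches more harmful content but also censors more legitimate speech.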

The complexity of content-moderation decisions does not lend itself easily to automation, and the blurred line between legal and illegal, permissible and impermissible content leads both to legitimate content being censored and to harmful and illegal content passing through the filter (cyberbullying, defamation, etc.).

The moderation of content posted to social media has been increasingly important during the COVID-19 pandemic, when access to misleading and inaccurate information about the virus can result in death. The current moderation strategy of Facebook, for example, has been described as creating “a platform that is effectively at war with itself: the News Feed algorithm relentlessly promotes irresistible click-bait about Bill Gates, vaccines, and hydroxychloroquine; the trust and safety team then dutifully counters it with bolded, underlined doses of reality.”

Addressing harmful content

In some countries, local laws may address content moderation, but they relate mainly to child-abuse images or illegal content that incites violence. Most platforms also have community standards or safety and security policies that state the kind of content allowed and set the rules for harmful content. Enforcement of legal requirements and of the platforms’ own standards relies primarily on content being flagged by social-media users. The social-media platforms are only responsible for harmful content shared on their platforms once it has been reported to them.

Some platforms have established mechanisms that allow civil society organizations (CSOs) to contribute to the flagging process by becoming so-called trusted flaggers. With Facebook, trusted-flagger status allows verification of civic organizations’ accounts, provides higher levels of protection and faster responses to incident reports, and makes accounts less likely to be automatically disabled. For example, Access Now’s Digital Security Helpline is a trusted partner, and Facebook also offers access to its Trusted Partners Program to partners of members of the Design 4 Democracy Coalition. However, this program does not make up for the platform’s limited accessibility for CSOs that encounter problems.

How is social media relevant in civic space and for democracy?

The flow of information on social media involves many fundamental human rights and supports the functioning of a democracy, to the extent that it allows freedom of expression, democratic debate, and civic participation. Social-media platforms have become core communication channels for CSOs. As more aspects of our lives take place within digital environments, social media becomes critical to aspects as fundamental as access to education, work and livelihood, health, and other services.

For example, citizen journalism, which has flourished through social media, has allowed internet users across the world to supplement the mainstream media with facts and perspectives ‘from the ground’ that might otherwise be overlooked or misrepresented. In some contexts, civic space actors rely on social-media platforms to produce and disseminate critical information during humanitarian crises or emergencies.

Digital inclusion center in the Peruvian Amazon. The business models and content moderation practices of social media platforms directly affect the content displayed to users. Photo Credit: Chandy Mao, Development Innovations.

However, the information shared over social media is mediated by private companies and governments, who possess new tactics for censorship, control, and information manipulation. Censorship is no longer necessarily the denial of information but can be the denial of attention or credibility. Further, the porosity of online and offline space can be dangerous for individuals and for democracy, as harassment, hate speech, and “trolling” behaviors offer new methods for violence, including organized violence. Doxxing and targeted digital attacks have also been used to intimidate journalists and political minorities or opponents. Read more about online violence and targeted digital attacks in the Risks section.

Social-media content is also pervasive and uniform in presentation—in our personalized news feeds, information shared by amateurs, advertisers, or political actors can be difficult to distinguish from quality news, giving rise to a range of information disorders, from the accidental forwarding of inaccurate information to the intentional sharing of harmful content, as explored in the disinformation primer.

Furthermore, social-media platforms have become gatekeepers of information and connections. It has become harder to work and live without these platforms: those not using social media may miss important public announcements, events, community information or even family updates.

Opportunities

Students from the Kandal Province, Cambodia. Social media platforms have opened up new platforms for video storytelling. Photo credit: Chandy Mao, Development Innovations.

Social media can have positive impacts when used to further democracy, human rights, and governance. Read below to learn how to think more effectively and safely about social-media use in your work.

Citizen Journalism

Social media has been credited with providing channels for citizens, activists, and experts to report instantly and directly, whether from disaster settings, during protests, or from within local communities. Citizen journalism, also referred to as participatory journalism or guerrilla journalism, does not have a definite set of principles and should not be considered a replacement for professional journalism, but it is an important supplement to mainstream journalism. Collaborative journalism, the partnership between citizen and professional journalists, as well as crowdsourcing strategies, are further techniques enabled by social media that have enhanced journalism, helping to promote voices from the ground and to magnify diverse viewpoints. The outlet France 24 has developed a network of 5,000 contributors, the “observateurs,” who are able to cover important events directly by virtue of being on the scene at the time, as well as to confirm the accuracy of information.

Social-media platforms as well as blogging tools have allowed for the decentralization of expertise, bridging elite and non-elite forms of knowledge. Without proper fact-checking or supplementary sources and proper context, citizen reporting carries risks— including security risks to the authors themselves—but it is an important democratizing force and source of information.

Crowdsourcing

In crowdsourcing, the public is mobilized to share data collectively to tell a larger story or accomplish a greater goal. Crowdsourcing can be a method for financing, for journalism and reporting, or simply for gathering ideas. Usually some kind of software tool or platform is put in place that the public can easily access and contribute to. Crisis mapping, for example, is a type of crowdsourcing through which the public shares data in real time during a crisis (a natural disaster, an election, a protest, etc.). These data are then ordered and displayed in a useful way. For instance, crisis mapping can be used in the wake of an earthquake to show first responders the areas that have been hit and need immediate assistance. Ushahidi is open-source crisis-mapping software developed in Kenya after the post-election violence of 2007. The tool was first created to allow Kenyans to flag incidents in order to form a complete and accurate picture of the situation on the ground, to share with the media, outside governments, and relevant civil society and relief organizations. In Kenya, the tool gathered texts, tweets, and photos and created crowdsourced maps of incidents of violence, election fraud, and other abuse. Ushahidi now has a global team and works in 30 different languages.
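At the core of a crisis-mapping tool is the aggregation of geotagged reports so that clusters of incidents become visible. A minimal sketch of that aggregation step, with invented coordinates and categories:

```python
# Sketch of crisis-mapping aggregation: geotagged reports are binned into
# grid cells so responders can see where incidents cluster. Coordinates,
# categories, and the cell size are invented for illustration.
from collections import Counter

def grid_cell(lat: float, lon: float, size: float = 0.5) -> tuple[float, float]:
    """Snap a coordinate to the corner of its grid cell."""
    return (round(lat // size * size, 4), round(lon // size * size, 4))

reports = [
    (-1.28, 36.82, "violence"),   # example report locations (invented)
    (-1.30, 36.85, "violence"),
    (-0.10, 34.75, "fraud"),
]
hotspots = Counter(grid_cell(lat, lon) for lat, lon, _ in reports)
# Two reports fall in the same cell, marking it as a hotspot.
```

A real deployment layers verification, categorization, and map rendering on top of this aggregation step.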

Digital Activism

Social media has allowed local and global movements to spring up overnight, inviting broad participation and visibility. Twitter hashtags in particular have been instrumental for coalition building, coordination, and raising awareness among international audiences, media, and governments. Researchers began to take note of digital activism around the 2011 “Arab Spring,” when movements in Tunisia, Morocco, Syria, Libya, Egypt, and Bahrain, among other countries, leveraged social media and were quickly followed by the Occupy Wall Street movement in the United States. Ukraine’s Euromaidan movement in late 2013 and the Hong Kong protests in 2019 are also examples of political movements that used social media to galvanize support.

In 2013, the acquittal of George Zimmerman in the death of unarmed 17-year-old Trayvon Martin inspired the creation of the #BlackLivesMatter hashtag. This movement grew stronger in response to the tragic killing of Michael Brown. The hashtag, at the front of an organized national protest movement, provided an outlet for people to join an online conversation and articulate alternative narratives in real time about subjects that the media and the rest of the United States (and more recently, with the killing of George Floyd, the world) had not paid sufficient attention to: police brutality, systemic racism, racial profiling, inequality, etc.

The #MeToo movement against sexual misconduct in the media industry, which also became a global movement, has allowed a multitude of people to participate in activism previously bound to a certain time and place.

Some researchers and activists fear “slacktivism,” whereby social media gives people an excuse to stay at home rather than engage more actively, and some fear that the tools of social media are ultimately insufficient for enacting meaningful social change, which requires nuanced political arguments. (Interestingly, a 2018 Pew Research survey on attitudes toward digital activism showed that just 39% of white Americans believed social media was an important tool for expressing themselves, while 54% of Black Americans said it was an important tool for them.)

Social media has enabled new online groups to gather and express a common sentiment as a form of solidarity or as a means of protest. Especially since the COVID-19 pandemic broke out, many physical protests have been suspended or cancelled, and virtual protests have taken place instead.

Expansion and engagement with international audience at low costs

Social media provides a valuable opportunity for CSOs to reach their goals and engage with existing and new audiences. A good social-media strategy is best underpinned by a permanent staff position to grow a strong and consistent social-media presence based on the organization’s purpose, values, and culture. This person should know how to seek information, be aware of both the risks and benefits of sharing information online and understand the importance of using sound judgment when posting. The USAID ‘Social Networking: A Guide to Strengthening Civil Society through Social Media’ provides a set of questions as guidance to develop a sound social-media policy, asking organizations to think about values, roles, content, tone, controversy and privacy.

Increased awareness of services

Social media can be integrated into programmatic activities to strengthen the reach and impact of the program, for example, by generating awareness of an organization’s services to a new demographic. Organizations can promote their programs and services while responding to questions and fostering open dialogue. Widely used social media platforms can be useful to reach new audiences for training and consulting activities through webinars or individual meetings designed for NGOs.

Opportunities for Philanthropy and Fundraising

Social-media fundraising presents an important opportunity for non-profits, but organizations should carefully consider the type of campaign and platforms they choose. TechSoup, a non-profit providing tech support for NGOs, offers advice and an online course on fundraising with social media for non-profits.

After the blast in Beirut’s harbor in the summer of 2020, many Lebanese people started online fundraising pages for their organizations. Social-media platforms were used extensively to share funding suggestions to the global audience watching the disaster unfold, reinforced by traditional media coverage.

Emergency communication

In some contexts, civic actors rely on social media platforms to produce and disseminate critical information, for example, during humanitarian crises or emergencies. Even in a widespread disaster, the internet often remains a significant communication channel, which makes social media a useful complementary means for emergency teams and the public. Reliance on the internet, however, increases vulnerability in case of network shutdowns.

Risks

In Kyiv, Ukrainian students share pictures at the opening ceremony of a Parliamentary Education Center. Photo credit: Press Service of the Verkhovna Rada of Ukraine, Andrii Nesterenko.

The use of emerging technologies can also create risks in civil society programming. Read below on how to discern the possible dangers associated with social-media platforms in DRG work, as well as how to mitigate unintended – and intended – consequences.

Polarization and Ideological Segregation

The ways in which content flows and is presented on social media, driven by the platforms’ business models, risk limiting our access to information, particularly information that challenges our preexisting beliefs, by exposing us to content likely to attract our attention and support our views. The concept of the filter bubble refers to the filtering of information by online platforms, compounded by our own intellectual biases, which worsens polarization by allowing us to live in echo chambers. This is easily witnessed in a YouTube feed: when you search for a song by an artist, you will likely find more songs by the same artist or similar ones; the algorithms are designed to prolong your viewing and assume you want more of something similar. The same has been identified for political content. Social-media algorithms encourage confirmation bias, exposing us to content we will agree with and enjoy, often at the expense of the accuracy, rigor, or educational or social value of that content.

The massive and precise data amassed by advertisers and social media companies about our preferences and opinions permit the practice of micro-targeting, which involves the display of tailored content based on data about users’ online behaviors, connections, and demographics, among others, as will be further explained below.

The increasingly tailored distribution of news and information is a threat to political discourse, diversity of opinions and democracy. Users can become detached even from factual information that disagrees with their viewpoints and isolated within their own cultural or ideological bubbles.

Because the tailoring of news and other information on social media is driven largely by opaque algorithms owned by private companies, it is hard for users to avoid these bubbles. Access to and intake of the very diverse information available on social media, with its many viewpoints, perspectives, ideas, and opinions, requires an explicit effort by the individual user to go beyond passive consumption of the content presented to them.

Misinformation and Disinformation

The internet and the dominant online platforms provide new tools that amplify and alter the danger presented by false, inaccurate, or out-of-context information. The online space increasingly drives discourse and is where much of today’s disinformation takes root. Refer to the Disinformation resource for a detailed overview of these problems.

Online Violence and Targeted Digital Attacks

Social media facilitates a number of violent behaviors such as defamation, harassment, bullying, stalking, “trolling” and “doxxing.” Cyberbullying among children, like traditional offline bullying, can harm students’ performance in school and causes real psychological damage. Cyberbullying is particularly harmful because victims experience the violence alone, isolated in cyberspace. They often do not seek help from parents and teachers, who they believe are not able to intervene. Cyberbullying is also difficult to address because it can move across social-media platforms, beginning on one and moving to another. Like cyberbullying, cyber harassment and cyberstalking have very tangible offline effects. Women are most often the victims of cyber harassment and cyberviolence, sometimes through the use of stalkerware installed by their partners to track their movements. A frightening cyber harassment trend has accelerated in France during the COVID-19 confinement in the form of “fisha” accounts, where bullies, aggressors or jilted ex-boyfriends will publish and circulate naked photos of teenage girls without their consent.

Journalists, women in particular, are often subject to cyber harassment and threats, particularly those who write about socially sensitive or political topics. Online violence against journalists can lead to journalistic self-censorship, affecting the quality of the information environment and democratic debate. Online tools provide new ways to spread and amplify hate speech and harassment. The use of fake accounts, bots, and even bot-nets (automated networks of accounts) allow perpetrators to attack, overwhelm, and even disable the social media accounts of their victims. Doxxing, by revealing sensitive information about journalists, is another strategy that can be used for censorship.

The 2014 case of Gamergate, when several women video-game developers were attacked by a coordinated harassment campaign that included doxxing and threats of rape and death, illustrates the strength and capacity of loosely-connected hate groups online to rally together, inflict real violence, and even drown out criticism. Many of the actions of the most active Gamergate trolls are illegal, but their identities are unknown. Importantly, it has been suggested by supporters of Gamergate that the most violent trolls were a “smaller, but vocal minority”—evidence of the magnifying power of internet channels and their use for coordinated online harassment.

Online hoaxes, scams, and frauds, like in their traditional offline forms, usually aim to extract money or sensitive information from the target. The practice of phishing is increasingly common on social media: an attacker pretends to be a contact or a reputable source in order to send malware or to extract personal information or account credentials. Spearphishing is a targeted phishing attack that leverages information about the recipient and details related to the surrounding circumstances. Coronavirus-related spearphishing attacks have increased steadily during the pandemic.

Data monetization by social media companies and tailored information streams

Most social-media platforms are free to use. Users simply register and then are able to use the platform. Social-media platforms do not receive revenue directly from users, like in a traditional subscription service; rather they generate profit primarily through digital advertising. Digital advertising is based on the collection of users’ data by social-media companies, which allows advertisers to target their ads to specific users and types of users. Social-media platforms monitor their users and build detailed profiles that they sell to advertisers. The data tracked includes information about the user’s connections and behavior on the platform, such as friends, posts, likes, searches, clicks and mouse movements. Data are also extensively collected outside platforms, including information about users’ location, webpages visited, online shopping and banking behavior. Additionally, many companies regularly access the contact book and photos of their users.

In the case of Facebook, this has led to a long-held and widespread conspiracy theory that the company listens to conversations to serve tailored advertisements. No one has ever been able to find clear evidence that this is actually happening. Research has shown that a company like Facebook does not need to listen in to your conversations, because it has the capacity to track you in so many other ways: “Not only does the system know exactly where you are at every moment, it knows who your friends are, what they are interested in, and who you are spending time with. It can track you across all your devices, log call and text metadata on phones, and even watch you write something that you end up deleting and never actually send.”

The massive and precise data amassed by advertisers and social-media companies about our preferences and opinions permit the practice of micro-targeting, that is, displaying targeted advertisements based on what you have recently purchased, searched for or liked. But just as online advertisers can target us with products, political parties can target us with more relevant or personalized messaging. Studies are currently trying to determine the extent to which political micro-targeting is a serious concern for the functioning of democratic elections. The question has also been raised by researchers and digital rights activists as to how micro-targeting may be interfering with our freedom of thought.
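Mechanically, micro-targeting amounts to filtering user profiles on collected attributes to assemble an ad audience. A toy sketch, with invented profiles and attributes:

```python
# Toy sketch of micro-targeting: selecting an ad audience by filtering
# user profiles on collected attributes. All profiles are invented.
users = [
    {"id": 1, "age": 24, "interests": {"fitness", "music"}, "city": "Lima"},
    {"id": 2, "age": 31, "interests": {"politics"}, "city": "Lima"},
    {"id": 3, "age": 27, "interests": {"politics", "fitness"}, "city": "Cusco"},
]

def target(users, *, min_age=18, max_age=35, interest=None, city=None):
    """Return the ids of users matching every supplied criterion."""
    return [u["id"] for u in users
            if min_age <= u["age"] <= max_age
            and (interest is None or interest in u["interests"])
            and (city is None or u["city"] == city)]

segment = target(users, interest="politics", city="Lima")  # -> [2]
```

The same filtering lets a political campaign deliver different, even contradictory, messages to different segments, which is one reason researchers worry about its effect on elections.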

Government surveillance and access to personal data

The content shared over social media is monitored by governments, who use social media for censorship, control and information manipulation. Many democratic governments are known to engage in extensive social-media monitoring for law enforcement and intelligence-gathering purposes. These practices should be guided by robust legal frameworks to safeguard individuals’ rights online, such as privacy and data-protection laws , but many countries have not yet enacted these types of laws.

There are also many examples of authoritarian governments using personal and other data harvested through social media to intimidate activists, silence opposition, and bring development projects to a halt. The information shared on social media often allows bad actors to build extensive profiles of individuals that enable targeted online and offline attacks, often through social engineering techniques. For example, a phishing email can be carefully crafted based on social-media data to trick an activist into clicking on a malicious link that provides access to their device, documents or social-media accounts.

Sometimes, however, a strong, real-time presence on social media can protect a prominent activist against threats by the government. A disappearance or arrest will be immediately noticed by followers or friends of a person who suddenly becomes silent on social media.

Market power and differing regulation

We rely on social-media platforms to help fulfill our fundamental rights (freedom of expression, assembly, etc.). However, these platforms are massive global monopolies and have been referred to as “the new governors.” This market concentration poses challenges for national and international governance mechanisms. Simply breaking up the biggest platform companies will not fully solve the information disorders and social problems presented by social media. Civil society and governments also need visibility into the design choices made by the platforms to understand how to address their negative aspects.

The growing influence of social-media platforms has given many governments reasons to impose laws on online content. There is a surge in laws across the world regulating illegal and harmful content, such as incitement to terrorism or violence, false information, and hate speech. These laws often criminalize speech, imposing jail terms or high fines for something as minor as a retweet on Twitter. Even in countries where the rule of law is respected, legal approaches to regulating online content may be ineffective due to the many technical challenges of content moderation. There is also a risk of violating internet users’ freedom of expression by reinforcing imperfect and non-transparent moderation practices and over-deletion. Lastly, these laws force social-media companies to navigate between compliance with local laws and defending international human rights law.

Amplification/virality

As noted above, virality is one of the defining features of the social-media ecosystem: the tendency of an image, video, or piece of information to circulate rapidly and widely. In some cases, virality can spark political activism and raise awareness (like the #MeToo hashtag), but it can also amplify tragedies (the video of the Christchurch massacre in New Zealand) and spread inaccurate information.

Impact on journalism

Social media has had a profound impact on the field of journalism. While it has enabled the rise of the citizen-journalist and of locally reported, crowd-sourced information, social-media companies have disrupted the relationship between advertising and the traditional newspaper, creating a rewards system that favors sensationalist, click-bait content designed to attract the widest global attention over quality journalism pertinent to local communities. In many places, these monopolies have also been partially responsible for the collapse of local news.

The reason for this impact is that, while advertising has successfully made the transition to digital, with global revenues currently at $247 billion and growing by 4% year over year, very little of that revenue is making its way to publishers. The ad tech supply chain, dominated by Google and Facebook, now consumes 90% of all new growth in the world’s major markets and 61 cents of every dollar spent on digital advertising worldwide.

The disruption of the publishing business model has been a slow-motion disaster for news organizations around the world. Since 2012, digital newspaper ad revenues worldwide have grown from an anemic $7.3 billion to $9.95 billion in 2016, while the Google/Facebook duopoly will earn $174 billion or 61% of the global digital advertising market.

In addition, the way search tools work dramatically affects local publishers, as search is a powerful vector for news and information. Researchers have found that search rankings have a marked impact on our attention. Not only do we tend to think information that is ranked more highly is more trusted and relevant, but we tend to click on top results more often than lower ones. The Google search engine concentrates our attention on a narrow range of news sources, a trend that works against diverse and pluralistic media outlets. It also tends to work against the advertising revenue of smaller and community publishers, which is based on user attention and traffic. It is a downward spiral: search results favor larger outlets; those results drive more user engagement, which makes their inventory more valuable in the advertising market; those publishers then grow larger, driving still more favorable search results, and so on.


Questions

To understand the implications of social-media information flows and of the platforms you choose to use in your work, ask yourself these questions:

  1. Does your organization have a social-media strategy? What does your organization hope to achieve through social media use?
  2. Do you have staff who can oversee and ethically moderate your social-media accounts and content?
  3. Which platform do you intend to use to accomplish your organization’s goals? What is the business model of that platform? How does this business model affect you as a user?
  4. How is content ordered and moderated on the platform used (humans, volunteers, AI, etc.)? Can content go viral?
  5. Where is the platform legally headquartered? What jurisdiction and legal frameworks does it fall under?
  6. Do the platforms chosen have mechanisms for users to signal harassment and hate speech for review and possible removal?
  7. Do the platforms have mechanisms for users to be heard when content is unfairly taken down or accounts unfairly blocked?
  8. Are the platforms collecting data about users? Who else has access to collected data and how is it being used?
  9. How does the platform involve its community of users and civil society (for instance, in flagging dangerous content, in giving feedback on design features, in fact-checking information, etc.)? Are there local representatives?
  10. Do the platforms chosen have privacy features like encryption? If so, what level of encryption and for what precise services (for example, only on the app, only in private message threads)? What are the default settings?


Case Studies

Crowdsourced mapping in crisis zones: collaboration, organisation and impact

Amelia Hunt & Doug Specht, Journal of International Humanitarian Action, 2019

“Crowdsourced mapping has become an integral part of humanitarian response, with high profile deployments of platforms following the Haiti and Nepal earthquakes, and the multiple projects initiated during the Ebola outbreak in North West Africa in 2014, being prominent examples. There have also been hundreds of deployments of crowdsourced mapping projects across the globe that did not have a high profile. This paper, through an analysis of 51 mapping deployments between 2010 and 2016, complimented with expert interviews, seeks to explore the organisational structures that create the conditions for effective mapping actions, and the relationship between the commissioning body, often a non-governmental organisation (NGO) and the volunteers who regularly make up the team charged with producing the map.”

Scaling Social Movements Through Social Media: The Case of Black Lives Matter

Marcia Mundt, Karen Ross, and Charla M. Burnett, Social Media and Society, 2018

“Drawing on a case study of Black Lives Matter (BLM) that includes both analysis of public social media accounts and interviews with BLM groups, [the authors] highlight possibilities created by social media for building connections, mobilizing participants and tangible resources, coalition building, and amplifying alternative narratives. [They] also discuss challenges and risks associated with using social media as a platform for scaling up. Our analysis suggests that while benefits of social media use outweigh its risks, careful management of online media platforms is necessary to mitigate concrete, physical risks that social media can create for activists.”

Facebook’s Role in the Genocide in Myanmar: New Reporting Complicates the Narrative

Evelyn Douek, LawFare, 2018

“Members of the Myanmar military have systematically used Facebook as a tool in the government’s campaign of ethnic cleansing against Myanmar’s Rohingya Muslim minority, according to an incredible piece of reporting by the New York Times on Oct. 15. The Times writes that the military harnessed Facebook over a period of years to disseminate hate propaganda, false news and inflammatory posts. The story adds to the horrors known about the ongoing violence in Myanmar, but it also should complicate the ongoing debate about Facebook’s role and responsibility for spreading hate and exacerbating conflict in Myanmar and other developing countries…”

How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, not Engaged Argument

Gary King, Jennifer Pan & Margaret E. Roberts, American Political Science Review, 2017 (Study)

“The Chinese government has long been suspected of hiring as many as 2,000,000 people to surreptitiously insert huge numbers of pseudonymous and other deceptive writings into the stream of real social media posts, as if they were the genuine opinions of ordinary people. Many academics, and most journalists and activists, claim that these so-called “50c party” posts vociferously argue for the government’s side in political and policy debates. As we show, this is also true of the vast majority of posts openly accused on social media of being 50c. Yet, almost no systematic empirical evidence exists for this claim, or, more importantly, for the Chinese regime’s strategic objective in pursuing this activity. In the first large scale empirical analysis of this operation, we show how to identify the secretive authors of these posts, the posts written by them, and their content. We estimate that the government fabricates and posts about 448 million social media comments a year. In contrast to prior claims, we show that the Chinese regime’s strategy is to avoid arguing with skeptics of the party and the government, and to not even discuss controversial issues. We show that the goal of this massive secretive operation is instead to distract the public and change the subject, as most of these posts involve cheerleading for China, the revolutionary history of the Communist Party, or other symbols of the regime. We discuss how these results fit with what is known about the Chinese censorship program and suggest how they may change our broader theoretical understanding of “common knowledge” and information control in authoritarian regimes.”

Environmental campaigning: Earth Hour

The World Wide Fund For Nature (WWF) launched Earth Hour in 2007. This social media campaign calls for everyone – individuals and businesses alike – to switch off their lights for one hour. In 2017, the campaign’s 10th anniversary, millions of people and thousands of landmarks around the world turned their lights off for a single hour. The WWF uses the #EarthHour hashtag (amongst others) to galvanize its followers. Elements of this successful campaign include the limited time frame, which makes engagement actionable; the push to share individual actions across multiple platforms; and the enticing language and countdown timers.

The Cyber Harassment Helpline

The Cyber Harassment Helpline is an accessible, toll-free helpline for victims and survivors of online harassment and violence in Pakistan. Digital Rights Foundation founded the helpline in 2016 in response to the increasing harassment of social-media users, particularly women. The helpline focuses especially on marginalized groups in Pakistan and prefers not to communicate via social-media platforms for reasons of privacy and confidentiality.


Additional Resources

  • BellingCat: An independent international collective of researchers, investigators and citizen journalists using open source and social media investigation.
  • Documentary “The Social Dilemma.” Preview available here.
  • Graphika: an investigative research company that leverages AI to study online communities, analyzing how online social networks form, evolve, and are manipulated.
  • Tufekci, Zeynep. (2017). Twitter and Tear Gas: The Power and Fragility of Networked Protest. Yale University Press. Read this book excerpt.

