Artificial Intelligence & Machine Learning

What is AI and ML?

Artificial intelligence (AI) is a field of computer science dedicated to solving cognitive problems commonly associated with human intelligence, such as learning, problem solving, and pattern recognition. Put another way, AI is a catch-all term used to describe new types of computer software that can approximate human intelligence. There is no single, precise, universal definition of AI.

Machine learning (ML) is a subset of AI. Essentially, machine learning is one of the ways computers “learn.” ML is an approach to AI that relies on algorithms trained to develop their own rules. This is an alternative to traditional computer programs, in which rules have to be hand-coded. Machine learning extracts patterns from data and places that data into different sets. ML has been described as “the science of getting computers to act without being explicitly programmed.” Two short videos provide simple explanations of AI and ML: “What Is Artificial Intelligence? | AI Explained” and “What is machine learning?”

Other subsets of AI include speech processing, natural language processing (NLP), robotics, cybernetics, vision, expert systems, planning systems, and evolutionary computation.

Figure: Types of artificial intelligence.

The diagram above shows the many different technology fields that make up AI. AI can refer to a broad set of technologies and applications. Machine learning is a tool used to create AI systems. A reference to AI may mean any one or several of these technologies or fields. Applications that use AI, like Siri or Alexa, utilize multiple technologies. For example, if you say to Siri, “Siri, show me a picture of a banana,” Siri utilizes natural language processing (question answering) to understand what you’re asking, and then uses vision (image recognition) to find a banana and show it to you.

As noted above, AI doesn’t have a universal definition. There are many myths surrounding AI—from the fear that AI will take over the world by enslaving humans, to the hope that AI can one day be used to cure cancer. This primer is intended to provide a basic understanding of artificial intelligence and machine learning, as well as to outline some of the benefits and risks posed by AI.

Definitions

Algorithm: An algorithm is defined as “a finite series of well-defined instructions that can be implemented by a computer to solve a specific set of computable problems.” Algorithms are unambiguous, step-by-step procedures. A simple example of an algorithm is a recipe; another is a procedure to find the largest number in a set of randomly ordered numbers. An algorithm may either be created by a programmer or generated automatically. In the latter case, it is generated using data via ML.
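As a minimal sketch of the second example, the procedure for finding the largest number can be written in Python as follows. The function name find_largest and the sample values are illustrative assumptions, not part of the quoted definition.

```python
def find_largest(numbers):
    """Return the largest value in a non-empty list of numbers."""
    if not numbers:
        raise ValueError("numbers must contain at least one value")
    largest = numbers[0]           # treat the first value as the best so far
    for value in numbers[1:]:      # look at every remaining value exactly once
        if value > largest:
            largest = value        # remember the new best
    return largest


print(find_largest([7, 42, 3, 19]))  # prints 42
```

Each step is unambiguous and the loop always ends, which is what makes this a hand-written algorithm rather than one generated from data.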

Algorithmic decision-making/Algorithmic decision system (ADS): Algorithmic decision systems use data and statistical analyses to make automated decisions, such as determining whether people are eligible for a benefit or a penalty. Examples of fully automated algorithmic decision systems include the electronic passport control check-point at airports or an automated decision by a bank to grant a customer an unsecured loan based on the person’s credit history and data profile with the bank. Driver-assistance features that control a vehicle’s brake, throttle, steering, speed, and direction are an example of a semi-automated ADS.

Big Data: There are many definitions of “big data,” but we can generally think of it as extremely large data sets that, when analyzed, may reveal patterns, trends, and associations, including those relating to human behavior. Big Data is characterized by the five V’s: the volume, velocity, variety, veracity, and value of the data in question. This video provides a short introduction to big data and the concept of the five V’s.

Class label: A class label is the category a machine learning system assigns to an input once it has classified it; for example, “spam” or “not spam” for an email.

Data mining: Data mining, also known as knowledge discovery in data, is the “process of analyzing dense volumes of data to find patterns, discover trends, and gain insight into how the data can be used.”

Generative AI[1]: Generative AI is a type of deep-learning model that can generate high-quality text, images, and other content based on training data. See section on Generative AI for more details.

Label: A label is the thing a machine learning model is predicting, such as the future price of wheat, the kind of animal shown in a picture, or the meaning of an audio clip.

Large language model: A large language model (LLM) is “a type of artificial intelligence that uses deep learning techniques and massively large data sets to understand, summarize, generate, and predict new content.” An LLM is a type of generative AI[2] that has been specifically architected to help generate text-based content.

Model: A model is the representation of what a machine learning system has learned from the training data.

Neural network: A biological neural network (BNN) is a system in the brain that makes it possible to sense stimuli and respond to them. An artificial neural network (ANN) is a computing system inspired by its biological counterpart in the human brain. In other words, an ANN is “an attempt to simulate the network of neurons that make up a human brain so that the computer will be able to learn and make decisions in a humanlike manner.” Large-scale ANNs drive several applications of AI.

Profiling: Profiling involves automated data processing to develop profiles that can be used to make decisions about people.

Robot: Robots are programmable, automated devices. Fully autonomous robots (e.g., self-driving vehicles) are capable of operating and making decisions without human control. AI enables robots to sense changes in their environments and adapt their responses and behaviors accordingly in order to perform complex tasks without human intervention.

Scoring: Scoring, also called prediction, is the process of a trained machine learning model generating values based on new input data. The values or scores that are created can represent predictions of future values, but they might also represent a likely category or outcome. When used vis-a-vis people, scoring is a statistical prediction that determines whether an individual fits into a category or outcome. A credit score, for example, is a number drawn from statistical analysis that represents the creditworthiness of an individual.

Supervised learning: In supervised learning, ML systems are trained on well-labeled data. Using labeled inputs and outputs, the model can measure its accuracy and learn over time.

Unsupervised learning: Unsupervised learning uses machine learning algorithms to find patterns in unlabeled datasets without the need for human intervention.

Training: In machine learning, training is the process of determining the ideal parameters that make up a model.
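To make “determining the ideal parameters” concrete, here is a minimal sketch that trains a two-parameter model (a straight line) by gradient descent. The data points, learning rate, step count, and the name train_linear_model are all assumptions chosen for illustration; real systems have far more parameters and typically rely on specialized libraries.

```python
def train_linear_model(xs, ys, learning_rate=0.05, steps=5000):
    """Fit y = w * x + b by repeatedly nudging w and b to reduce squared error."""
    w, b = 0.0, 0.0                       # the model's parameters, initially uninformed
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w       # move each parameter against its gradient
        b -= learning_rate * grad_b
    return w, b


# Hypothetical training data generated from y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
w, b = train_linear_model(xs, ys)
print(w, b)  # values close to 2 and 1
```

The learned pair (w, b) is the “model” in the sense defined above: a compact representation of what was learned from the training data.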

 

How do artificial intelligence and machine learning work?

Artificial Intelligence

Artificial Intelligence is a cross-disciplinary approach that combines computer science, linguistics, psychology, philosophy, biology, neuroscience, statistics, mathematics, logic, and economics to “understand, model, and replicate intelligence and cognitive processes.”

AI applications exist in every domain and industry and across different aspects of everyday life. Because AI is so broad, it is useful to think of AI as made up of three categories:

  • Narrow AI or Artificial Narrow Intelligence (ANI) is AI that specializes in a specific task, like recognizing images, playing Go, or answering a question put to Alexa or Siri.
  • Strong AI or Artificial General Intelligence (AGI) is an AI that matches human intelligence.
  • Artificial Superintelligence (ASI) is an AI that exceeds human capabilities.

Modern AI techniques are developing quickly, and AI applications are already pervasive. However, these applications currently exist only in the “Narrow AI” field. Artificial general intelligence and artificial superintelligence have not yet been achieved and likely will not be for the next few years or decades.

Machine Learning

Machine learning is an application of artificial intelligence. Although we often find the two terms used interchangeably, machine learning is a process by which an AI application is developed. The machine learning process involves an algorithm that makes observations based on data, identifies patterns and correlations in the data, and uses the pattern or correlation to make predictions. Most of the AI in use today is driven by machine learning.

Just as it is useful to break up AI into three categories, machine learning can also be thought of as three different techniques: supervised learning, unsupervised learning, and deep learning.

Supervised Learning

Supervised learning efficiently categorizes data according to pre-existing definitions embodied in a data set containing training examples with associated labels. Take the example of a spam-filtering system that is being trained using spam and non-spam emails. The “input” in this case is all the emails the system processes. After humans have marked certain emails as spam, the system sorts spam emails into a separate folder. The “output” is the categorization of email. The system finds a correlation between the label “spam” and the characteristics of the email message, such as the text in the subject line, phrases in the body of the message, or the email or IP address of the sender. Using this correlation, the system tries to predict the correct label (spam/not spam) to apply to all the future emails it processes.

“Spam” and “not spam” in this instance are called “class labels.” The correlation that the system has found is called a “model” or “predictive model.” The model may be thought of as an algorithm the ML system has generated automatically by using data. The labeled messages from which the system learns are called “training data.” The “target variable” is the feature the system is searching for or wants to know more about—in this case, it is the “spaminess” of an email. The “correct answer,” so to speak, in the categorization of email is called the “desired outcome” or “outcome of interest.”
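The spam example can be sketched with a small word-counting classifier (a naive Bayes model with add-one smoothing). This is one possible technique, not necessarily what any real spam filter uses, and the four training messages and the names train and predict are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train(messages):
    """Learn word frequencies per class label from (text, label) training data."""
    word_counts = defaultdict(Counter)    # label -> how often each word appears
    label_counts = Counter()              # label -> number of training messages
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def predict(model, text):
    """Return the class label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        score = math.log(label_counts[label] / total_messages)
        words_in_label = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocabulary:        # ignore words never seen in training
                count = word_counts[label][word] + 1          # add-one smoothing
                score += math.log(count / (words_in_label + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical labeled training data.
training_data = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]
model = train(training_data)
print(predict(model, "free prize inside"))            # prints spam
print(predict(model, "agenda for the team meeting"))  # prints not spam
```

Here the labeled messages are the training data, “spam” and “not spam” are the class labels, and the word counts returned by train are the model.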

Unsupervised Learning

Unsupervised learning involves algorithms, often neural networks, finding a relationship or pattern without access to previously labeled datasets of input-output pairs. The algorithms organize and group the data on their own, finding recurring patterns and detecting deviations from these patterns. These systems tend to be less predictable than those that use labeled datasets, and are most often deployed in environments that may change at some frequency and are unstructured or partially structured. Examples include:

  1. An optical character-recognition system that can “read” handwritten text, even if it has never encountered the handwriting before.
  2. The recommended products a user sees on retail websites. These recommendations may be determined by associating the user with a large number of variables such as their browsing history, items they purchased previously, their ratings of those items, items they saved to a wish list, the user’s location, the devices they use, their brand preference, and the prices of their previous purchases.
  3. The detection of fraudulent monetary transactions based on timing and location. For instance, if two consecutive transactions happened on the same credit card within a short span of time in two different cities.
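As a rough illustration of grouping data without labels, the sketch below uses k-means clustering on one-dimensional values. K-means is a swapped-in technique chosen for brevity: it is not a neural network, and the examples above may rely on very different methods. The transaction amounts and starting centroids are hypothetical.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group numbers around the nearest centroid, then recompute the centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            # Assign each value to the centroid it is closest to.
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [sum(group) / len(group) if group else centroids[i]
                     for i, group in enumerate(clusters)]
    return centroids, clusters


# Hypothetical transaction amounts with no labels attached.
amounts = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centroids, clusters = k_means_1d(amounts, centroids=[min(amounts), max(amounts)])
print(clusters)  # small amounts end up in one group, large amounts in the other
```

No one told the algorithm which amounts were “small” or “large”; the two groups emerge from the data alone, and a value far from every centroid could be treated as a deviation worth reviewing.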

A combination of supervised and unsupervised learning (called “semi-supervised learning”) is used when a relatively small dataset with labels is available to train the neural network to act upon a larger, unlabeled dataset. An example of semi-supervised learning is software that creates deepfakes, or digitally altered audio, videos, or images.

Deep Learning

Deep learning makes use of large-scale artificial neural networks (ANNs) called deep neural networks to create AI that can detect financial fraud, conduct medical-image analysis, translate large amounts of text without human intervention, and automate the moderation of content on social networking websites. These neural networks learn to perform tasks by utilizing numerous layers of mathematical processes to find patterns or relationships among different data points in the datasets. A key attribute of deep learning is that these ANNs can peruse, examine, and sort huge amounts of data, which theoretically enables them to identify new solutions to existing problems.
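The idea of “layers of mathematical processes” can be sketched with a toy two-layer network. The weights below are hand-picked assumptions so that the network computes “exactly one of the two inputs is on”; a real deep neural network has many more layers and learns its weights from data during training.

```python
def relu(x):
    """A common activation function: pass positive values, clamp negatives to zero."""
    return max(0.0, x)


def dense_layer(inputs, weights, biases, activation):
    """One layer: a weighted sum per neuron, followed by an activation function."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + bias)
            for row, bias in zip(weights, biases)]


def tiny_network(x1, x2):
    # Hidden layer with two neurons (weights chosen by hand for illustration).
    hidden = dense_layer([x1, x2], [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu)
    # Output layer with one neuron and no activation.
    output = dense_layer(hidden, [[1.0, -2.0]], [0.0], lambda value: value)
    return output[0]


for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, tiny_network(a, b))  # outputs 0.0, 1.0, 1.0, 0.0 in that order
```

Neither layer alone can express this rule; stacking them is what allows the pattern to be captured, which is the intuition behind adding depth.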

Generative AI

Generative AI[3] is a type of deep-learning model that can generate high-quality text, images, and other content based on training data. The launch of OpenAI’s chatbot, ChatGPT, in late 2022 placed a spotlight on generative AI and created a race among companies to churn out alternate (and ideally superior) versions of this technology. Excitement over large language models and other forms of generative AI was also accompanied by concerns about accuracy, bias within these tools, data privacy, and how these tools can be used to spread disinformation more efficiently.

Although there are other types of machine learning, these three—supervised learning, unsupervised learning and deep learning—represent the basic techniques used to create and train AI systems.

Bias in AI and ML

Artificial intelligence is built by humans, and trained on data generated by them. Inevitably, there is a risk that individual and societal human biases will be inherited by AI systems.

There are three common types of biases in computing systems:

  • Pre-existing bias has its roots in social institutions, practices, and attitudes.
  • Technical bias arises from technical constraints or considerations.
  • Emergent bias arises in a context of use.

Bias in artificial intelligence may affect, for example, the political advertisements one sees on the internet, the content pushed to the top of social media news feeds, the cost of an insurance premium, the results of a recruitment screening process, or the ability to pass through border-control checks in another country.

Bias in a computing system is a systematic and repeatable error. Because ML deals with large amounts of data, even a small error rate can get compounded or magnified and greatly affect the outcomes from the system. A decision made by an ML system, especially one that processes vast datasets, is often a statistical prediction. Hence, its accuracy is related to the size of the dataset. Larger training datasets are likely to yield decisions that are more accurate and lower the possibility of errors.
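To illustrate how a small error rate scales with volume, the sketch below multiplies a hypothetical error rate by a hypothetical number of automated decisions. Both figures are assumptions chosen for round numbers, not measurements from any real system.

```python
def expected_errors(decisions, error_rate):
    """Expected number of wrong decisions at a given error rate."""
    return round(decisions * error_rate)


# A 1% error rate applied to ten million automated decisions.
print(expected_errors(10_000_000, 0.01))  # prints 100000
```

If those errors are systematic rather than random, they fall repeatedly on the same kinds of people, which is why a seemingly small rate can matter a great deal.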

Bias in AI/ML systems can result in discriminatory practices, ultimately leading to the exacerbation of existing inequalities or the generation of new ones. For more information, see this explainer related to AI bias and the Risks section of this resource.

How are AI and ML relevant in civic space and for democracy?

Elephant tusks pictured in Uganda. In wildlife conservation, AI/ML algorithms and past data can be used to predict poacher attacks. Photo credit: NRCN.

The widespread proliferation, rapid deployment, scale, complexity, and impact of AI on society is a topic of great interest and concern for governments, civil society, NGOs, human rights bodies, businesses, and the general public alike. AI systems may require varying degrees of human interaction or none at all. When applied in the design, operation, and delivery of services, AI/ML offers the potential to provide new services and improve the speed, targeting, precision, efficiency, consistency, quality, or performance of existing ones. It may provide new insights by making apparent previously undiscovered linkages, relationships, and patterns, and offering new solutions. By analyzing large amounts of data, ML systems save time, money, and effort. Some examples of the application of AI/ML in different domains include using AI/ML algorithms and past data in wildlife conservation to predict poacher attacks, and discovering new species of viruses.

Tuberculosis microscopy diagnosis in Uzbekistan. AI/ML systems aid healthcare professionals in medical diagnosis and the detection of diseases. Photo credit: USAID.

The predictive abilities of AI and the application of AI and ML in categorizing, organizing, clustering, and searching information have brought about improvements in many fields and domains, including healthcare, transportation, governance, education, energy, and security, as well as in safety, crime prevention, policing, law enforcement, urban management, and the judicial system. For example, ML may be used to track the progress and effectiveness of government and philanthropic programs. City administrations, including those of smart cities, use ML to analyze data accumulated over time about energy consumption, traffic congestion, pollution levels, and waste in order to monitor and manage these issues and identify patterns in their generation, consumption, and handling.

Digital maps created in Mugumu, Tanzania. Artificial intelligence can support planning of infrastructure development and preparation for disaster. Photo credit: Bobby Neptune for DAI.

AI is also used in climate monitoring, weather forecasting, the prediction of disasters and hazards, and the planning of infrastructure development. In healthcare, AI systems aid professionals in medical diagnosis, robot-assisted surgery, easier detection of diseases, prediction of disease outbreaks, tracing the source(s) of disease spread, and so on. Law enforcement and security agencies deploy AI/ML-based surveillance systems, facial recognition systems, drones, and predictive policing for the safety and security of the citizens. On the other side of the coin, many of these applications raise questions about individual autonomy, privacy, security, mass surveillance, social inequality, and negative impacts on democracy (see the Risks section).

Fish caught off the coast of Kema, North Sulawesi, Indonesia. Facial recognition is used to identify species of fish to contribute to sustainable fishing practices. Photo credit: courtesy of USAID SNAPPER.

AI and ML have both positive and negative implications for public policy and elections, as well as democracy more broadly. While data may be used to maximize the effectiveness of a campaign through targeted messaging to help persuade prospective voters, it may also be used to deliver propaganda or misinformation to vulnerable audiences. During the 2016 U.S. presidential election, for example, Cambridge Analytica used big data and machine learning to tailor messages to voters based on predictions about their susceptibility to different arguments.

During elections in the United Kingdom and France in 2017, political bots were used to spread misinformation on social media and leak private campaign emails. These autonomous bots are “programmed to aggressively spread one-sided political messages to manufacture the illusion of public support” or even dissuade certain populations from voting. AI-enabled deepfakes (audio or video that has been fabricated or altered) also contribute to the spread of confusion and falsehoods about political candidates and other relevant actors. Though artificial intelligence can be used to exacerbate and amplify disinformation, it can also be applied in potential solutions to the challenge. See the Case Studies section of this resource for examples of how the fact-checking industry is leveraging artificial intelligence to more effectively identify and debunk false and misleading narratives.

Cyber attackers seeking to disrupt election processes use machine learning to effectively target victims and develop strategies for defeating cyber defenses. Although the same techniques can also be used to prevent cyber attacks, the level of investment in artificial intelligence technologies by malign actors in many cases exceeds that of legitimate governments or other official entities. Some of these actors also use AI-powered digital surveillance tools to track down and target opposition figures, human rights defenders, and other perceived critics.

As discussed elsewhere in this resource, “the potential of automated decision-making systems to reinforce bias and discrimination also impacts the right to equality and participation in public life.” Bias within AI systems can harm historically underrepresented communities and exacerbate existing gender divides and the online harms experienced by women candidates, politicians, activists, and journalists.

AI-driven solutions can help improve the transparency and legitimacy of campaign strategies, for example, by leveraging political bots for good to help identify articles that contain misinformation or by providing a tool for collecting and analyzing the concerns of voters. Artificial intelligence can also be used to make redistricting less partisan (though in some cases it also facilitates partisan gerrymandering) and prevent or detect fraud or significant administrative errors. Machine learning can inform advocacy by predicting which pieces of legislation will be approved based on algorithmic assessments of the text of the legislation, how many sponsors or supporters it has, and even the time of year it is introduced.

The full impact of the deployment of AI systems on the individual, society, and democracy is not known or knowable, which creates many legal, social, regulatory, technical, and ethical conundrums. The topic of harmful bias in artificial intelligence and its intersection with human rights and civil rights has been a matter of concern for governments and activists. The European Union’s (EU) General Data Protection Regulation (GDPR) has provisions on automated decision-making, including profiling. The European Commission released a white paper on AI in February 2020 as a precursor to potential legislation governing the use of AI in the EU, while another EU body has released recommendations on the human rights impacts of algorithmic systems. Similarly, Germany, France, Japan, and India have drafted AI strategies for policy and legislation. Physicist Stephen Hawking once said, “…success in creating AI could be the biggest event in the history of our civilization. But it could also be the last, unless we learn how to avoid the risks.”

Opportunities

Artificial intelligence and machine learning can have positive impacts when used to further democracy, human rights, and good governance. Read below to learn how to more effectively and safely think about artificial intelligence and machine learning in your work.

Detect and overcome bias

Although artificial intelligence can reproduce human biases, as discussed above, it can also be used to combat unconscious biases in contexts like job recruitment. Responsibly designed algorithms can bring hidden biases into view and, in some cases, nudge people into less-biased outcomes, for example, by masking candidates’ names, ages, and other bias-triggering features on a resume.
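The masking step mentioned above can be sketched as a simple filter applied before a reviewer or a model sees a candidate record. The field names, the placeholder text, and the sample record are hypothetical; which features count as bias-triggering is a policy decision, not something this code can determine.

```python
# Hypothetical set of fields to hide; a real deployment would define its own.
MASKED_FIELDS = frozenset({"name", "age", "gender", "photo_url"})


def mask_candidate(record, masked_fields=MASKED_FIELDS):
    """Return a copy of the record with bias-triggering fields hidden."""
    return {key: ("[hidden]" if key in masked_fields else value)
            for key, value in record.items()}


candidate = {"name": "A. Example", "age": 52, "skills": ["python", "statistics"]}
print(mask_candidate(candidate))
# prints {'name': '[hidden]', 'age': '[hidden]', 'skills': ['python', 'statistics']}
```

Masking explicit fields does not remove proxies: other details, such as graduation year or postal code, may still reveal the hidden attributes indirectly.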

Improve security and safety

AI systems can be used to detect attacks on public infrastructure, such as a cyber attack or credit card fraud. As online fraud becomes more advanced, companies, governments, and individuals need to be able to identify fraud quickly, or even prevent it before it occurs. Machine learning can help identify agile and unusual patterns of activity, including those designed to evade traditional detection strategies.

Moderate harmful online content

Enormous quantities of content are uploaded every second to the internet and social media. There are simply too many videos, photos, and posts for humans to manually review. Filtering tools like algorithms and machine-learning techniques are used by many social media platforms to screen for content that violates their terms of service (like child sexual abuse material, copyright violations, or spam). Indeed, artificial intelligence is at work in your email inbox, automatically filtering unwanted marketing content away from your main inbox. More recently, the arrival of deepfakes and other computer-generated content has required similarly advanced identification tactics. Fact-checkers and other actors working to defuse the dangerous, misleading power of deepfakes are developing their own artificial intelligence to identify these media as false.

Web Search

Search engines run on algorithmic ranking systems. Of course, search engines are not without serious biases and flaws, but they allow us to locate information from the vast stretches of the internet. Search engines on the web (like Google and Bing) or within platforms and websites (like searches within Wikipedia or The New York Times) can enhance their algorithmic ranking systems by using machine learning to favor higher-quality results that may be beneficial to society. For example, Google has an initiative to highlight original reporting, which prioritizes the first instance of a news story rather than sources that republish the information.

Translation

Machine learning has allowed for truly incredible advances in translation. For example, DeepL is a small machine-translation company that has surpassed even the translation abilities of the biggest tech companies. Other companies have also created translation algorithms that allow people across the world to translate texts into their preferred languages, or communicate in languages beyond those they know well, which has advanced the fundamental right of access to information, as well as the right to freedom of expression and the right to be heard.

Risks

The use of emerging technologies like AI can also create risks for democracy and in civil society programming. Read below to learn how to discern the possible dangers associated with artificial intelligence and machine learning in democracy, human rights, and governance (DRG) work, as well as how to mitigate unintended (and intended) consequences.

Discrimination against marginalized groups

There are several ways in which AI may make decisions that can lead to discrimination, including how the “target variable” and the “class labels” are defined; during the process of labeling the training data; when collecting the training data; during the feature selection; and when proxies are identified. It is also possible to intentionally set up an AI system to be discriminatory towards one or more groups. This video explains how commercially available facial recognition systems trained on racially biased data sets discriminate against people with dark skin, women, and gender-diverse people.

The accuracy of AI systems is based on how ML processes Big Data, which in turn depends on the size of the dataset. The larger the size, the more accurate the system’s decisions are likely to be. However, women, Black people and people of color (PoC), disabled people, indigenous people, LGBTQ+ people, and other minorities are less likely to be represented in a dataset because of structural discrimination, group size, or external attitudes that prevent their full participation in society. Bias in training data reflects and systematizes existing discrimination. Because an AI system is often a black box, it is hard to determine why AI makes certain decisions about some individuals or groups of people, or conclusively prove it has made a discriminatory decision. Hence, it is difficult to assess whether certain people were discriminated against on the basis of their race, sex, marginalized status, or other protected characteristics. For instance, AI systems used in predictive policing, crime prevention, law enforcement, and the criminal justice system are, in a sense, tools for risk-assessment. Using historical data and complex algorithms, they generate predictive scores that are meant to indicate the probability of the occurrence of crime, the probable location and time, and the people who are likely to be involved. When relying on biased data or biased decision-making structures, these systems may end up reinforcing stereotypes about underprivileged, marginalized, or minority groups.
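One simple way to check whether a system performs unevenly across groups is to compute accuracy separately for each group instead of a single overall figure. The sketch below assumes each decision is recorded as a (group, predicted, actual) tuple; the group names and records are invented for illustration and do not come from any real system.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute the share of correct predictions separately for each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        if predicted == actual:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical audit log: group B is under-represented and served less accurately.
records = [
    ("A", "approve", "approve"),
    ("A", "deny", "deny"),
    ("A", "approve", "approve"),
    ("A", "deny", "deny"),
    ("B", "deny", "approve"),
    ("B", "approve", "approve"),
]
print(accuracy_by_group(records))  # prints {'A': 1.0, 'B': 0.5}
```

In this made-up log the overall accuracy is five out of six, which hides the fact that half the decisions about group B were wrong. A breakdown like this shows a disparity but does not by itself explain its cause.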

A study by the Royal Statistical Society notes that the “…predictive policing of drug crimes results in increasingly disproportionate policing of historically over‐policed communities… and, in the extreme, additional police contact will create additional opportunities for police violence in over‐policed areas. When the costs of policing are disproportionate to the level of crime, this amounts to discriminatory policy.” Likewise, when mobile applications for safe urban navigation or software for credit-scoring, banking, insurance, healthcare, and the selection of employees and university students rely on biased data and decisions, they reinforce social inequality and negative and harmful stereotypes.

The risks associated with AI systems are exacerbated when AI systems make decisions or predictions involving vulnerable groups such as refugees, or about life or death circumstances, such as in medical care. A 2018 report by the University of Toronto’s Citizen Lab notes, “Many [asylum seekers and immigrants] come from war-torn countries seeking protection from violence and persecution. The nuanced and complex nature of many refugee and immigration claims may be lost on these technologies, leading to serious breaches of internationally and domestically protected human rights, in the form of bias, discrimination, privacy breaches, due process and procedural fairness issues, among others. These systems will have life-and-death ramifications for ordinary people, many of whom are fleeing for their lives.” For medical and healthcare uses, the stakes are especially high because an incorrect decision made by the AI system could potentially put lives at risk or drastically alter the quality of life or wellbeing of the people affected by it.

Security vulnerabilities

Malicious hackers and criminal organizations may use ML systems to identify vulnerabilities in and target public infrastructure or privately owned systems such as internet of things (IoT) devices and self-driving cars.

If malicious entities target AI systems deployed in public infrastructure, such as smart cities, smart grids, nuclear installations, healthcare facilities, and banking systems, among others, these systems “will be harder to protect, since these attacks are likely to become more automated and more complex and the risk of cascading failures will be harder to predict. A smart adversary may either attempt to discover and exploit existing weaknesses in the algorithms or create one that they will later exploit.” Exploitation may happen, for example, through a poisoning attack, which interferes with the training data if machine learning is used. Attackers may also “use ML algorithms to automatically identify vulnerabilities and optimize attacks by studying and learning in real time about the systems they target.”

Privacy and data protection

The deployment of AI systems without adequate safeguards and redress mechanisms may pose many risks to privacy and data protection. Businesses and governments collect immense amounts of personal data in order to train the algorithms of AI systems that render services or carry out specific tasks. Criminals, illiberal governments, and people with malicious intent often target these data for economic or political gain. For instance, health data captured from smartphone applications and internet-enabled wearable devices, if leaked, can be misused by credit agencies, insurance companies, data brokers, cybercriminals, etc. The issue is not only leaks, but also the data that people willingly give out without control over how it will be used down the road. This includes what we share with both companies and government agencies. The breach or abuse of non-personal data, such as anonymized data, simulations, synthetic data, or generalized rules or procedures, may also affect human rights.

Chilling effect

AI systems used for surveillance, policing, criminal sentencing, legal purposes, etc. become a new avenue for abuse of power by the state to control citizens and political dissidents. The fear of profiling, scoring, discrimination, and pervasive digital surveillance may have a chilling effect on citizens’ ability or willingness to exercise their rights or express themselves. Many people will modify their behavior in order to obtain the benefits of a good score and to avoid the disadvantages that come with having a bad score.

Opacity (Black box nature of AI systems)

Opacity may be interpreted as either a lack of transparency or a lack of intelligibility. Algorithms, software code, behind-the-scenes processing, and the decision-making process itself may not be intelligible to those who are not experts or specialized professionals. In legal or judicial matters, for instance, the decisions made by an AI system do not come with explanations, unlike decisions made by judges, who are required to justify their legal order or judgment.

Technological unemployment

Automation systems, including AI/ML systems, are increasingly being used to replace human labor in various domains and industries, eliminating a large number of jobs and causing structural unemployment (known as technological unemployment). With the introduction of AI/ML systems, some types of jobs will be lost, others will be transformed, and new jobs will appear. The new jobs are likely to require specific or specialized skills that are amenable to AI/ML systems.

Loss of individual autonomy and personhood

Profiling and scoring in AI raise apprehensions that people are being dehumanized and reduced to a profile or score. Automated decision-making systems may affect wellbeing, physical integrity, and quality of life. This affects what constitutes an individual’s consent (or lack thereof); the way consent is formed, communicated and understood; and the context in which it is valid. “[T]he dilution of the free basis of our individual consent—either through outright information distortion or even just the absence of transparency—imperils the very foundations of how we express our human rights and hold others accountable for their open (or even latent) deprivation”. – Human Rights in the Era of Automation and Artificial Intelligence

Questions

If you are trying to understand the implications of artificial intelligence and machine learning in your work environment, or are considering using aspects of these technologies as part of your DRG programming, ask yourself these questions:

  1. Is artificial intelligence or machine learning an appropriate, necessary, and proportionate tool to use for this project and with this community?
  2. Who is designing and overseeing the technology? Can they explain what is happening at different steps of the process?
  3. What data are being used to design and train the technology? How could these data lead to biased or flawed functioning of the technology?
  4. What reason do you have to trust the technology’s decisions? Do you understand why you are getting a certain result, or might there be a mistake somewhere? Is anything not explainable?
  5. Are you confident the technology will work as intended when used with your community and on your project, as opposed to in a lab setting (or a theoretical setting)? What elements of your situation might cause problems or change the functioning of the technology?
  6. Who is analyzing and implementing the AI/ML technology? Do these people understand the technology, and are they attuned to its potential flaws and dangers? Are these people likely to make any biased decisions, either by misinterpreting the technology or for other reasons?
  7. What measures do you have in place to identify and address potentially harmful biases in the technology?
  8. What regulatory safeguards and redress mechanisms do you have in place for people who claim that the technology has been unfair to them or abused them in any way?
  9. Is there a way that your AI/ML technology could perpetuate or increase social inequalities, even if the benefits of using AI and ML outweigh these risks? What will you do to minimize these problems and stay alert to them?
  10. Are you certain that the technology abides by relevant regulations and legal standards, including the GDPR?
  11. Is there a way that this technology may not discriminate against people by itself, but that it may lead to discrimination or other rights violations, for instance when it is deployed in different contexts or if it is shared with untrained actors? What can you do to prevent this?

Case Studies

Leveraging artificial intelligence to promote information integrity

The United Nations Development Programme’s eMonitor+ is an AI-powered platform that helps “scan online media posts to identify electoral violations, misinformation, hate speech, political polarization and pluralism, and online violence against women.” Data analysis facilitated by eMonitor+ enables election commissions and media stakeholders to “observe the prevalence, nature, and impact of online violence.” The platform relies on machine learning to track and analyze content on digital media to generate graphical representations for data visualization. eMonitor+ has been used by Peru’s Asociación Civil Transparencia and Ama Llulla to map and analyze digital violence and hate speech in political dialogue, and by the Supervisory Election Commission during the 2022 Lebanese parliamentary election to monitor potential electoral violations, campaign spending, and misinformation. The High National Election Commission of Libya has also used eMonitor+ to monitor and identify online violence against women in elections.

“How Nigeria’s fact-checkers are using AI to counter election misinformation”

Ahead of Nigeria’s 2023 presidential election, the UK-based fact-checking organization Full Fact “offered its artificial intelligence suite—consisting of three tools that work in unison to automate lengthy fact-checking processes—to greatly expand fact-checking capacity in Nigeria.” According to Full Fact, these tools are not intended to replace human fact-checkers but rather to assist with time-consuming, manual monitoring and review, leaving fact-checkers “more time to do the things they’re best at: understanding what’s important in public debate, interrogating claims, reviewing data, speaking with experts and sharing their findings.” The scalable tools, which include search, alerts, and live functions, allow fact-checkers to “monitor news websites, social media pages, and transcribe live TV or radio to find claims to fact check.”

Monitoring crop development: AgroScout

The growing impact of climate change could further cut crop yields, especially in the world’s most food-insecure regions. And our food systems are responsible for about 30% of greenhouse gas emissions. Israeli startup AgroScout envisions a world where food is grown in a more sustainable way. “Our platform uses AI to monitor crop development in real-time, to more accurately plan processing and manufacturing operations across regions, crops and growers,” said Simcha Shore, founder and CEO of AgroScout. “By utilizing AI technology, AgroScout detects pests and diseases early, allowing farmers to apply precise treatments that reduce agrochemical use by up to 85%. This innovation helps minimize the environmental damage caused by traditional agrochemicals, making a positive contribution towards sustainable agriculture practices.”

Machine Learning for Peace

The Machine Learning for Peace Project seeks to understand how civic space is changing in countries around the world using state of the art machine learning techniques. By leveraging the latest innovations in natural language processing, the project classifies “an enormous corpus of digital news into 19 types of civic space ‘events’ and 22 types of Resurgent Authoritarian Influence (RAI) events which capture the efforts of authoritarian regimes to wield influence on developing countries.” Among the civic space “events” being tracked are activism, coups, election activities, legal changes, and protests. The civic space event data is combined with “high frequency economic data to identify key drivers of civic space and forecast shifts in the coming months.” Ultimately, the project hopes to serve as a “useful tool for researchers seeking rich, high-frequency data on political regimes and for policymakers and activists fighting to defend democracy around the world.”

Food security: Detecting diseases in crops using image analysis

“Plant diseases are not only a threat to food security at the global scale, but can also have disastrous consequences for smallholder farmers whose livelihoods depend on healthy crops.” As a first step toward supplementing existing solutions for disease diagnosis with a smartphone-assisted diagnosis system, researchers used a public dataset of 54,306 images of diseased and healthy plant leaves to train a “deep convolutional neural network” to automatically identify 14 different crop species and 26 unique diseases (or the absence of those diseases).

Automation

What is automation?

A worker at the assembly line of a car-wiring factory in Bizerte, Tunisia. The automation of labor disproportionately affects women, the poor, and other vulnerable members of society. Photo credit: Alison Wright for USAID, Tunisia, Africa

Automation involves techniques and methods applied to enable machines, devices, and systems to function with minimal or no human involvement. Automation is used, for example, in applications for managing the operation of traffic lights in a city, navigating aircraft, running and configuring different elements of a telecommunications network, in robot-assisted surgeries, and even for automated storytelling (which uses artificial intelligence software to create verbal stories). Automation can improve efficiency and reduce error, but it also creates new opportunities for error and introduces new costs and challenges for government and society.

How does automation work?

Processes can be automated by programming certain procedures to be performed without human intervention (like a recurring payment for a credit card or phone app) or by linking electronic devices to communicate directly with one another (like self-driving vehicles communicating with other vehicles and with road infrastructure). Automation can involve the use of temperature sensors, light sensors, alarms, microcontrollers, robots, and more. Home automation, for example, may include home assistants such as Amazon Echo, Google Home, and OpenHAB. Some automation systems are virtual, for example, email filters that automatically sort incoming emails into different folders, and AI-enabled moderation systems for online content.

The exact architecture and functioning of automation systems depend on their purpose and application. However, automation should not be confused with artificial intelligence, in which an algorithm-led process ‘learns’ and changes over time. For instance, an algorithm that reviews thousands of job applications and studies and learns from patterns in the applications is using artificial intelligence, while a chatbot that replies to candidates’ questions with pre-programmed answers is using automation.
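
The distinction can be sketched with the email-filter example mentioned above. This is an illustrative toy under stated assumptions: the folder names, keywords, and example messages are hypothetical, and the "learning" half is a simple word count rather than a production ML method. The first function applies fixed rules written by a person; the second derives its word-to-folder associations from labeled examples, so its behavior changes when the examples change.

```python
from collections import Counter

# Automation: fixed, hand-written rules that never change on their own.
RULES = {"invoice": "Finance", "newsletter": "Subscriptions"}

def sort_by_rules(subject):
    """Return the folder of the first rule keyword found in the subject."""
    for keyword, folder in RULES.items():
        if keyword in subject.lower():
            return folder
    return "Inbox"

# Learning: associations between words and folders come from labeled examples.
def learn_word_folders(examples):
    """Count how often each word appears in each folder's subjects."""
    counts = {}
    for subject, folder in examples:
        for word in subject.lower().split():
            counts.setdefault(word, Counter())[folder] += 1
    return counts

def sort_by_learned(counts, subject, default="Inbox"):
    """Let each known word vote for the folders it was seen in."""
    votes = Counter()
    for word in subject.lower().split():
        votes.update(counts.get(word, {}))
    return votes.most_common(1)[0][0] if votes else default

# Hypothetical messages a user has already filed by hand.
examples = [
    ("Payment receipt attached", "Finance"),
    ("Your receipt from the store", "Finance"),
    ("Weekly digest", "Subscriptions"),
]
model = learn_word_folders(examples)

subject = "Receipt for your order"
print("rules:  ", sort_by_rules(subject))           # no rule mentions "receipt"
print("learned:", sort_by_learned(model, subject))  # "receipt" was seen in Finance
```

The rule-based filter only ever does what its author anticipated. The learned filter can handle cases nobody wrote a rule for, but it also inherits whatever patterns, good or bad, are present in its examples.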

For more information on the different components of automation systems, read also the resources about the Internet of Things and sensors, robots and drones, and biometrics.

How is automation relevant in civic space and for democracy?

Automated processes can be built to increase transparency, accuracy, efficiency, and scale. They can help minimize effort (labor) and time; reduce errors and costs; improve the quality and/or precision in tasks/processes; carry out tasks that are too strenuous, hazardous, or beyond the physical capabilities of humans; and generally free humans of repetitive, monotonous tasks.

From a historical perspective, automation is not new: the first industrial revolution in the 1700s harnessed the power of steam and water; the technological revolution of the 1880s relied on railways and telegraphs; and the digital revolution in the 20th century saw the beginning of computing. Each of these transitions brought fundamental changes not only to industrial production and the economy, but to society, government, and international relations.

Now, the fourth industrial revolution, or the ‘automation revolution’ as it is sometimes called, promises to once again disrupt work as we know it as well as relationships between people, machines, and programmed processes.

When used by governments, automated processes promise to deliver government services with greater speed, efficiency, and coverage. These developments are often called e-government, e-governance, or digital government. E-government includes government communication and information sharing on the web (sometimes even the publishing of government budgets and agendas), facilitation of financial transactions online such as electronic filing of tax returns, digitization of health records, electronic voting, and digital IDs.

Additionally, automation can be used in elections to help count votes, register voters, and record voter turnout to increase trust in the integrity of the democratic process. Without automation, counting votes can take weeks or months and can lead to results being challenged by anti-democratic forces and to possible voter disenchantment with democratic systems. E-voting and automated vote counting have already become politicized in many countries like Kazakhstan and Pakistan, although many countries are increasingly adopting e-voting systems to help increase voter turnout and participation and hasten the election process.

A health worker receives information on a disease outbreak in Brewerville, Liberia. Automated processes promise to deliver government services with greater speed, efficiency, and coverage. Photo credit: Sarah Grile.

Automating government services brings numerous benefits, as the UK’s K4D helpdesk explains: lowering the cost of service delivery; improving quality and coverage (for example, through telemedicine or drones); strengthening communication, monitoring, and feedback; and, in some cases, encouraging citizen participation at the local level. In Indonesia, for example, the Civil Service Agency (BKN) introduced a computer-assisted testing system (CAT) to disrupt the previously long-standing manual testing system that created rampant opportunities for corruption in civil service recruitment by line ministry officials. With the new system, the database of questions is tightly controlled, and the results are posted in real time outside the testing center.

In India, an automated system relying on a specifically designed computer (an Advanced Virtual RISC) and the common telecommunications standard GSM (Global System for Mobile) is used to inform farmers about exact field conditions and to point to the necessary next steps with command functions such as irrigating, plowing, deploying seeds and carrying out other farming activities.

Drone used for irrigation scheduling in the southern part of Bangladesh. Automated systems have vast applications in agriculture. Photo credit: Alanuzzaman Kurishi.

As with previous industrial revolutions, automation changes the nature of work, and these changes could bring unemployment in certain sectors if not properly planned. The removal of humans from processes also brings new opportunities for error (such as ‘automation bias’) and raises new legal and ethical questions. See the Risks section below.

Opportunities

Islamabad Electric Supply Company’s (IESCO) Power Distribution Control Center (PDC), Pakistan. Smart meters enable monitoring of power demand, supply, and load shedding in real-time. Photo credit: USAID.

Automation can have positive impacts when used to further democracy, human rights, and governance issues. Read below to learn how to more effectively and safely think about automation in your work.

Increase in productivity

Automation may improve output while reducing the time and labor required, thus increasing the productivity of workers and the demand for other kinds of work. For example, automation can streamline document review, cutting down on the time that lawyers need to search through documents or academics through sources, etc. In Azerbaijan, the government partnered with the private sector in the use of an automated system to reduce the backlog of relatively simple court cases, such as claims for unpaid bills. In instances where automation increases the quality of services or goods and/or brings down their cost, a more significant demand for the goods or services can be served.

Improvements in processes and outputs

Automation can improve the speed, efficiency, quality, consistency, and coverage of service delivery and reduce human error, time spent, and costs. It can therefore allow activities to scale up. For example, the UNDP and the government of the Maldives used automation to create 3-D maps of the islands and chart their topography. Having this information on record would speed up further disaster relief and rescue efforts. The use of drones also reduced the time and money required to conduct this exercise: while mapping 11 islands would normally take almost a year, using a drone reduced the time to one day. See the Robots and Drones resource for additional examples.

Optimizing an automated task generally requires trade-offs among cost, precision, the permissible margin of error, and scale. Automation may sometimes require tolerating more errors in order to reduce costs or achieve greater scale. For more, see the section “Knowing when automation offers a suitable solution to the challenge at hand” in Automation of government processes.

For democratic processes, automation can help facilitate access for voters who cannot travel to polling stations via remote e-voting or using accessible systems at polling stations. Moreover, using automation for counting votes can help decrease user error in some cases and increase trust in the democratic process.

Increase transparency

Automation may increase transparency by making data and information easily available to the public, thus building public trust and aiding accountability. In India, the State Transport Department of Karnataka has automated driving test centers in the hope of eliminating bribery in the issuing of driver’s licenses. A host of high-definition cameras and sensors placed along the test track capture the movement of the vehicle, while a computerized system decides whether the driver has passed or failed the test. See also “Are emerging technologies helping win the fight against corruption in developing countries?”

Risks

The use of emerging technologies can also create risks in civil society programming. Read below on how to discern the possible dangers associated with automation in DRG work, as well as how to mitigate unintended – and intended – consequences.

Labor issues

When automation is used to replace human labor, the resulting loss of jobs causes structural unemployment known as “technological unemployment.” Structural unemployment disproportionately affects women, the poor, and other vulnerable members of society, unless they are re-skilled and provided with adequate protections. Automation also requires skilled labor that can operate, oversee or maintain automated systems, eventually creating jobs for a smaller section of the population. But the immediate impact of this transformation of work can be harmful to people and communities without social safety nets or opportunities for finding other work.

Additionally, links have been drawn between increased automation and a rise in preferences for populist politicians as job loss begins to affect low-wage workers in particular. A study published in the Proceedings of the National Academy of Sciences (PNAS) found a correlation between the impact of globalization and automation and increased vote shares for right-wing populist parties in several European countries. Although automation can have a positive impact on overall profits, low-wage, non-educated workers may feel particularly impacted as wages remain low and their tasks are replaced by automated systems.

Discrimination towards marginalized groups and minorities and increasing social inequality

Automation systems equipped with artificial intelligence (AI) may produce results that are discriminatory towards some marginalized and minority groups when the system has learned from biased learning patterns, from biased datasets, or from biased human decision-making. The outputs of AI-equipped automated systems may reflect real-life societal biases, prejudices, and discriminatory treatment towards some demographics. Biases can also occur from the human implementation of automated systems, for instance, when the systems do not function in the real world as they were able to function in a lab or theoretical setting, or when the humans working with the machines misinterpret or misuse the automated technology.

There are numerous examples of racial and other types of discrimination being either replicated or magnified by automation. To take an example from the field of predictive policing, ProPublica reported after conducting an investigation in 2016 that COMPAS, a data-driven AI tool meant to assist judges in the United States, was biased against Black people when determining whether a convicted offender would commit more crimes in the future. For more on predictive policing see “How to Fight Bias with Predictive Policing” and “A Popular Algorithm Is No Better at Predicting Crimes Than Random People.”

These risks exist in other domains as well. The University of Toronto and Citizen Lab report titled “Bots at the gate: A human rights analysis of automated decision-making in Canada’s immigration and refugee system” notes that “[m]any [asylum seekers and immigrants] come from war-torn countries seeking protection from violence and persecution. The nuanced and complex nature of many refugee and immigration claims is lost on these technologies, leading to serious breaches of internationally and domestically protected human rights, in the form of bias, discrimination, privacy breaches, due process and procedural fairness issues, among others. These systems will have life-and-death ramifications for ordinary people, many of whom are fleeing for their lives.”

Insufficient Legal Protections

Existing laws and regulations may not be applicable to automation systems and, in cases where they are, the application may not be well-defined. Not all countries have laws that protect individuals against these dangers. Under the GDPR (the European General Data Protection Regulation), individuals have the right not to be subject to a decision based only on automated processing, including profiling. In other words, humans must oversee important decisions that affect individuals. But not all countries have or respect such regulations, and even the GDPR is not upheld in all situations. Meanwhile, individuals would have to actively claim their rights and contest these decisions, usually by seeking legal assistance, which is beyond the means of many. Groups at the receiving end of such discrimination tend to have fewer resources and limited access to human rights protections to contest such decisions.

Automation Bias

People tend to have faith in automation and tend to believe that technology is accurate, neutral, and non-discriminating. This can be described as “automation bias”: when humans working with or overseeing automated systems tend to give up responsibility to the machine and trust the machine’s decision-making uncritically. Automation bias has been shown to have harmful impacts across automated sectors, including leading to errors in healthcare. Automation bias also plays a role in the discrimination described above.

Uncharted ethical concerns

The ever-increasing use of automation brings ethical questions and concerns that may not have been considered before the arrival of the technology itself. For example, who is responsible if a self-driving car gets into an accident? How much personal information should be given to health-service providers to facilitate automated health monitoring? In many cases, further research is needed to even begin to address these dilemmas.

Issues related to individual consent

When automated systems make decisions that affect people’s lives, they blur the formation, context, and expression of an individual’s consent (or lack thereof) as described in this quote: “…[T]he dilution of the free basis of our individual consent – either through outright information distortion or even just the absence of transparency – imperils the very foundations of how we express our human rights and hold others accountable for their open (or even latent) deprivation.” See additional information about informed consent in the Data Protection resource.

High capital costs

Large-scale automation technologies require very high capital costs, which is a risk if the use of the technology becomes unviable in the long term or does not otherwise guarantee commensurate returns or recovery of costs. Hence, automation projects funded with public money (for example, some “smart city” infrastructure) require thorough feasibility studies for assessing needs and ensuring long-term viability. Initial costs may also be very high for individuals and communities. An automated solar-power installation or a rainwater-harvesting system is a large investment for a community. However, depending on the tariffs for grid power or water, the expenditure may be recovered in the long run.

Questions

If you are trying to understand the implications of automation in your work environment, or are considering using aspects of automation as part of your DRG programming, ask yourself these questions:

  1. Is automation a suitable method for the problem you are trying to solve?
  2. What are the indicators or guiding factors that determine if automation is a suitable and required solution to a particular problem or challenge?
  3. What risks are involved regarding security, the potential for discrimination, etc? How will you minimize these risks? Do the benefits of using automation or automated technology outweigh these risks?
  4. Who will work with and oversee these technologies? What is their training and what are their responsibilities? Who is liable legally in case of an accident?
  5. What are the long-term effects of using these technologies in the surrounding environment or community? What are the effects on individuals, jobs, salaries, social welfare, etc.? What measures are necessary to ensure that the use of these technologies does not aggravate or reinforce inequality through automation bias or otherwise?
  6. How will you ensure that humans are overseeing any important decisions made about individuals using automated processes? (How will you abide by the GDPR or other applicable regulations?)
  7. What privacy and security safeguards are necessary for applying these technologies in a given context regarding, for example, cybersecurity, protection of personal privacy, protecting operators from accidents, etc.? How will you build in these safeguards?

Case studies

Automated Farming Vehicles

“Forecasts of world population increases in the coming decades demand new production processes that are more efficient, safer, and less destructive to the environment. Industries are working to fulfill this mission by developing the smart factory concept. The agriculture world should follow industry leadership and develop approaches to implement the smart farm concept. One of the most vital elements that must be configured to meet the requirements of the new smart farms is the unmanned ground vehicles (UGV).”

Automated Voting Systems in Estonia

Since 2005, Estonia has allowed e-voting, in which citizens are able to cast their ballots online. In each succeeding election, voters have increasingly chosen to cast online ballots to save time and participate in local and national elections with ease. Voters use digital IDs to help verify their identity and prevent fraud, and ballots cast online are automatically cross-referenced with lists to ensure there is no duplication or voter fraud.
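
The general idea of cross-referencing ballots against a list can be sketched as follows. This is a simplified illustration only and does not describe Estonia’s actual system, whose rules and safeguards are more elaborate. The voter IDs, the roll, and the function name are hypothetical, and the "one ballot per ID, first one counts" rule is an assumption made for the example.

```python
def tally_online_ballots(ballots, voter_roll):
    """Accept one ballot per eligible voter ID and record why others are rejected.

    ballots: list of (voter_id, choice) pairs in the order received.
    voter_roll: set of eligible voter IDs.
    """
    seen = set()
    accepted, rejected = [], []
    for voter_id, choice in ballots:
        if voter_id not in voter_roll:
            rejected.append((voter_id, "not on voter roll"))
        elif voter_id in seen:
            rejected.append((voter_id, "duplicate ballot"))
        else:
            seen.add(voter_id)
            accepted.append(choice)
    return accepted, rejected

# Hypothetical data for illustration.
voter_roll = {"A1", "B2", "C3"}
ballots = [("A1", "X"), ("B2", "Y"), ("A1", "Y"), ("Z9", "X")]

accepted, rejected = tally_online_ballots(ballots, voter_roll)
print(accepted)
print(rejected)
```

Even in a sketch this small, the policy choices are visible in the code: which list counts as authoritative, and what happens when the same ID appears twice. In a real election those choices are legal and political decisions, not just technical ones.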

Automated Mining in South Africa

“Spiraling labour and energy costs are putting pressure on the financial performance of gold mines in South Africa, but the solution could be found in adopting digital technologies. By implementing automation operators can remove underground workers from harm’s way, and that is going to become an ever-bigger imperative if gold miners are to remain investable by international capital. This increased emphasis for the safety of the workforce and mines is motivating the development of the mining automation market. Earlier, old-style techniques of exploration and drilling compromised the security of mine labour force. Such examples have forced operators to develop smart resolutions and tools to confirm security of workers.”

Automating Processing of Uncontested Civil Cases to Reduce Court Backlogs in Azerbaijan, Case Study 14

“In Azerbaijan, the government developed a new approach to dealing with their own backlog of cases, one which addressed both supply side and demand side elements. Recognizing that much of the backlog stemmed from relatively simple civil cases, such as claims for unpaid bills, the government partnered with the private sector in the use of an automated system to streamline the handling of uncontested cases, thus freeing up judges’ time for more important cases.”

Reforming Civil Service Recruitment through Computerized Examinations in Indonesia, Case Study 6

“In Indonesia, the Civil Service Agency (BKN) succeeded in introducing a computer-assisted testing system (CAT) to disrupt the previously long-standing manual testing system that created rampant opportunities for corruption in civil service recruitment by line ministry officials. Now the database of questions is tightly controlled, and the results are posted in real time outside the testing center. Since its launch in 2013, CAT has become the de facto standard for more than 62 ministries and agencies.”

Real Time Automation of Indian Agriculture

“‘Real time automation of Indian agricultural system’ using AVR (Advanced Virtual RISC) microcontroller and GSM (Global System for Mobile) is focused on making the agriculture process easier with the help of automation. The set up consists of processor which is an 8-bit microcontroller. GSM plays an important part by controlling the irrigation on field. GSM is used to send and receive the data collected by the sensors to the farmer. GSM acts as a connecting bridge between AVR microcontroller and farmer. Our study aims to implement the basic application of automation of the irrigation field by programming the components and building the necessary hardware. In our study different type of sensors like LM35, humidity sensor, soil moisture sensor, IR sensor used to find the exact field condition. GSM is used to inform the farmer about the exact field condition so that [they] can carry necessary steps. AT(Attention) commands are used to control the functions like irrigation, ploughing, deploying seeds and carrying out other farming activities.”
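
The kind of decision logic such a system might run can be sketched as below. This is an assumption-laden illustration and not the study’s actual firmware, which runs on a microcontroller: the thresholds, field names, and message format are all hypothetical. The sketch reads two sensor values, decides whether to irrigate, and composes the short status text that would be sent to the farmer.

```python
from dataclasses import dataclass

@dataclass
class FieldReading:
    soil_moisture_pct: float  # hypothetical soil moisture sensor value, 0-100
    temperature_c: float      # hypothetical temperature sensor value

def decide_irrigation(reading, dry_below=30.0, hot_above=35.0):
    """Return 'irrigate' or 'wait' using fixed, illustrative thresholds."""
    if reading.soil_moisture_pct < dry_below:
        return "irrigate"
    # On hot days, start watering a little earlier (assumed 10-point margin).
    if reading.temperature_c > hot_above and reading.soil_moisture_pct < dry_below + 10:
        return "irrigate"
    return "wait"

def status_message(reading):
    """Compose the short text a farmer might receive on their phone."""
    action = decide_irrigation(reading)
    return (f"Soil moisture {reading.soil_moisture_pct:.0f}%, "
            f"temperature {reading.temperature_c:.0f}C: {action}")

print(status_message(FieldReading(soil_moisture_pct=22.0, temperature_c=31.0)))
print(status_message(FieldReading(soil_moisture_pct=55.0, temperature_c=28.0)))
```

Because the thresholds are fixed by whoever programs the device, this is automation in the sense described earlier rather than machine learning: the system does not adjust its rules based on past harvests unless someone reprograms it.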

E-voting terminated in Kazakhstan

A study published in May 2020 on the discontinuation of e-voting in Kazakhstan highlights some of the political challenges around e-voting. Kazakhstan used e-voting between 2004 and 2011 and was considered a leading example. See “Kazakhstan: Voter registration Case Study (2006)” produced by the Ace Project Electoral Knowledge Network. However, the country returned to a traditional paper ballot due to a lack of confidence from citizens and civil society in the government’s ability to ensure the integrity of e-voting procedures. See “Politicization of e-voting rejection: reflections from Kazakhstan,” by Maxat Kassen. It is important to note that Kazakhstan did not employ biometric voting, but rather electronic voting machines that operated via touch screens.

Big Data

What are big data?

“Big data” are also data, but involve far larger amounts of data than can usually be handled on a desktop computer or in a traditional database. Big data are not only huge in volume, but they grow exponentially with time. Big data are so large and complex that none of the traditional data-management tools are able to store them or process them efficiently. If you have an amount of data that you can process on your computer or the database on your usual server without it crashing, “big data” are likely not what you are working with.

How does big data work?

The field of big data has evolved as technology’s ability to constantly capture information has skyrocketed. Big data are usually captured without being entered into a database by a human being, in real time: in other words, big data are “passively” captured by digital devices.

The internet provides infinite opportunities to gather information, ranging from so-called meta-information or metadata (geographic location, IP address, time, etc.) to more detailed information about users’ behaviors. This is often from online social media or credit card-purchasing behavior. Cookies are one of the principal ways that web browsers gather information about users: they are essentially tiny pieces of data stored on a web browser, or little bits of memory about something you did on a website. (For more on cookies, visit this resource).

Data sets can also be assembled from the Internet of Things, which involves sensors tied to other devices and networks. For example, sensor-equipped streetlights might collect traffic information that can then be analyzed to optimize traffic flow. The collection of data through sensors is a common element of smart city infrastructure.

Healthcare workers in Indonesia. The use of big data can improve health systems and inform public health policies. Photo credit: courtesy of USAID EMAS.

Big data can also be medical or scientific data, such as DNA information or data related to disease outbreaks. This can be useful to humanitarian and development organizations. For example, during the Ebola outbreak in West Africa between 2014 and 2016, UNICEF combined data from a number of sources, including population estimates, information on air travel, estimates of regional mobility from mobile phone records and tagged social media locations, temperature data, and case data from WHO reports to better understand the disease and predict future outbreaks.

Big data are created and used by a variety of actors. In data-driven societies, most actors (private sector, governments, and other organizations) are encouraged to collect and analyze data to notice patterns and trends, measure success or failure, optimize their processes for efficiency, etc. Not all actors will create datasets themselves; often they will collect publicly available data or even purchase data from specialized companies. For instance, in the advertising industry, data brokers specialize in collecting and processing information about internet users, which they then sell to advertisers. Other actors will create their own datasets, like energy providers, railway companies, ride-sharing companies, and governments. Data are everywhere, and the actors capable of collecting them intelligently and analyzing them are numerous.

How is big data relevant in civic space and for democracy?

In Tanzania, an open-source platform allows government and financial institutions to record all land transactions to create a comprehensive dataset. Photo credit: Riaz Jahanpour for USAID / Digital Development Communications.

From forecasting presidential elections to helping small-scale farmers deal with changing climate to predicting disease outbreaks, analysts are finding ways to turn Big Data into an invaluable resource for planning and decision-making. Big data are capable of providing civil society with powerful insights and the ability to share vital information. Big data tools have been deployed recently in civic space in a number of interesting ways, for example, to:

  • monitor elections and support open government (starting in Kenya with Ushahidi in 2008)
  • track epidemics like Ebola in Sierra Leone and other West African nations
  • track conflict-related deaths worldwide
  • understand the impact of ID systems on refugees in Italy
  • measure and predict agricultural success and distribution in Latin America
  • press forward with new discoveries in genetics and cancer treatment
  • make use of geographic information systems (GIS mapping applications) in a range of contexts, including planning urban growth and traffic flow sustainably, as has been done by the World Bank in various countries in South Asia, East Asia, Africa, and the Caribbean

The use of big data that are collected, processed, and analyzed to improve health systems or environmental sustainability, for example, can ultimately greatly benefit individuals and society. However, a number of concerns and cautions have been raised about the use of big datasets. Privacy and security concerns are foremost: big data are often captured without our awareness and used in ways to which we may not have consented. They are sometimes sold many times through a chain of different companies we never interacted with, exposing the data to security risks such as data breaches. It is crucial to consider that anonymous data can still be used to “re-identify” people represented in the dataset, conceivably putting them at risk. One study individually identified 87% of Americans using as little as postal code, gender, and date of birth (see discussion of “re-identification” below).

There are also power imbalances (divides) in who is represented in the data as opposed to who has the power to use them. Those who are able to extract value from big data are often large companies or other actors with the financial means and capacity to collect (sometimes purchase), analyze, and understand the data.

This means the individuals and groups whose information is put into datasets (shoppers whose credit card data is processed, internet users whose clicks are registered on a website) do not generally benefit from the data they have given. For example, data about what items shoppers buy in a store is more likely used to maximize profits than to help customers with their buying decisions. The extractive way that data are taken from individuals’ behaviors and used for profit has been called “surveillance capitalism“, which some believe is undermining personal autonomy and eroding democracy.

The quality of datasets must also be taken into consideration, as those using the data may not know how or where they were gathered, processed, or integrated with other data. And when storing and transmitting big data, security concerns are multiplied by the increased numbers of machines, services, and partners involved. It is also important to keep in mind that big datasets themselves are not inherently useful, but they become useful along with the ability to analyze them and draw insights from them, using advanced algorithms, statistical models, etc.

Last but not least, there are important considerations related to protecting the fundamental rights of those whose information appears in datasets. Sensitive, personally identifiable, or potentially personally identifiable information can be used by other parties or for other purposes than those intended, to the detriment of the individuals involved. This is explored below and in the Risks section, as well as in other primers.

Protecting anonymity of those in the dataset

Anyone who has done research in the social or medical sciences should be familiar with the idea that when collecting data on human subjects, it is important to protect their identities so that they do not face negative consequences from being involved in research, such as being known to have a particular disease, voted in a particular way, engaged in stigmatized behavior, etc. (See the Data Protection resource). The traditional ways of protecting identities – removing certain identifying information, or only reporting statistics in aggregate – can and should also be used when handling big datasets to help protect those in the dataset. Data can also be hidden in multiple ways to protect privacy: methods include encryption (encoding), tokenization, and data masking. Talend identifies the strengths and weaknesses of the primary strategies for hiding data using these methods.
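As a rough illustration of two of the methods named above, the Python sketch below masks a phone number and replaces a name with a token. It is a minimal sketch under stated assumptions, not a recommended implementation: the record, the key, and the function names are hypothetical, and tokenization is approximated here with a keyed hash, whereas production systems often use a separate lookup vault instead.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored separately from the data.
SECRET_KEY = b"replace-with-a-real-secret"


def tokenize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a stable token derived from a keyed hash."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened for readability


def mask(value: str, visible: int = 2) -> str:
    """Hide all but the last `visible` characters of a value."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


# Made-up example record.
record = {"name": "Jane Doe", "phone": "5551234567", "diagnosis": "asthma"}

protected = {
    "name": tokenize(record["name"]),   # same name always maps to the same token
    "phone": mask(record["phone"]),     # only the last two digits stay visible
    "diagnosis": record["diagnosis"],   # analytic field left as-is
}
print(protected)
```

Because the same input always yields the same token, records about one person can still be linked for analysis without exposing the name. Note that hiding direct identifiers in this way does not by itself prevent re-identification from the remaining fields.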

One of the biggest dangers involved in using big datasets is the possibility of re-identification: figuring out the real identities of individuals in the dataset, even if their personal information has been hidden or removed. To give a sense of how easy it could be to identify individuals in a large dataset, one study found that using only three fields of information (postal code, gender, and date of birth) it was possible to identify 87% of Americans individually, and then connect their identities to publicly available databases containing hospital records. With more data points, researchers have demonstrated a near-perfect ability to identify individuals in a dataset: four random pieces of data from credit card records could achieve 90% identifiability, and researchers were able to re-identify individuals with 99.98% accuracy using 15 data points.

Ten simple rules for responsible big data research, quoted from a paper of the same name by Zook, Barocas, Boyd, Crawford, Keller, Gangadharan, et al, 2017

  1. Acknowledge that data are people and that data can do harm. Most data represent or affect people. Simply starting with the assumption that all data are people until proven otherwise places the difficulty of disassociating data from specific individuals front and center.
  2. Recognize that privacy is more than a binary value. Privacy may be more or less important to individuals as they move through different contexts and situations. Looking at someone’s data in bulk may have different implications for their privacy than looking at one record. Privacy may be important to groups of people (say, by demographic) as well as to individuals.
  3. Guard against the reidentification of your data. Be aware that apparently harmless, unexpected data, like phone battery usage, could be used to re-identify data. Plan to ensure your data sharing and reporting lowers the risk that individuals could be identified.
  4. Practice ethical data sharing. There may be times when participants in your dataset expect you to share (such as with other medical researchers working on a cure), and others where they trust you not to share their data. Be aware that other identifying data about your participants may be gathered, sold, or shared about them elsewhere, and that combining that data with yours could identify participants individually. Be clear about how and when you will share data and stay responsible for protecting the privacy of the people whose data you collect.
  5. Consider the strengths and limitations of your data; big does not automatically mean better. Understand where your large dataset comes from, and how that may evolve over time. Don’t overstate your findings and acknowledge when they may be messy or have multiple meanings.
  6. Debate the tough, ethical choices. Talk with your colleagues about these ethical concerns. Follow the work of professional organizations to stay current with concerns.
  7. Develop a code of conduct for your organization, research community, or industry and engage your peers in creating it to ensure unexpected or under-represented perspectives are included.
  8. Design your data and systems for auditability. This both strengthens the quality of your research and services and can give early warnings about problematic uses of the data.
  9. Engage with the broader consequences of data and analysis practices. Keep social equality, the environmental impact of big data processing, and other society-wide impacts in view as you plan big data collection.
  10. Know when to break these rules. With debate, code of conduct, and auditability as your guide, consider that in a public health emergency or other disaster, you may find there are reasons to put the other rules aside.

Gaining informed consent

Those providing their data may not be aware at the time that their data may be sold later to data brokers who may then re-sell them.

Unfortunately, data privacy consent forms are generally hard for the average person to read, even in the wake of the General Data Protection Regulation’s (GDPR) expansion of privacy protections. Terms of Service (ToS) documents are so notoriously difficult to read that one filmmaker even made a documentary on the subject. Researchers who have studied terms of service and privacy policies have found that users generally accept them without reading them because they are too long and complex. Moreover, users who need to access a platform or service for personal reasons (for example, to get in contact with a relative) or for their livelihood (to deliver their products to customers) may not be able to simply reject the ToS when they have no viable or immediate alternative.

Important work is being done to try to protect users of platforms and services from these kinds of abusive data-sharing situations. For example, Carnegie Mellon’s Usable Privacy and Security laboratory (CUPS) has developed best practices to inform users about how their data may be used. These take the shape of data privacy “nutrition labels” that are similar to FDA-specified food nutrition labels and are evidence-based.

In Chipata, Zambia, a resident draws water from a well. Big data offer invaluable insights for the design of climate change solutions. Photo credit: Sandra Coburn.

Opportunities

Big data can have positive impacts when used to further democracy, human rights, and governance issues. Read below to learn how to more effectively and safely think about big data in your work.

Greater insight

Big datasets can present some of the richest, most comprehensive information that has ever been available in human history. Researchers using big datasets have access to information from a massive population. These insights can be much more useful and convenient than self-reported data or data gathered from logistically tricky observational studies. One major trade-off is between the richness of the insights gained through self-reported or very carefully collected data, versus the ability to generalize the insights from big data. Big data gathered from social-media activity or sensors also can allow for the real-time measurements of activity at a large scale. Big data insights are very important in the field of logistics. For example, the United States Postal Service collects data from across its package deliveries using GPS and vast networks of sensors and other tracking methods, and they then process these data with specialized algorithms. These insights allow them to optimize their deliveries for environmental sustainability.

Increased access to data

Making big datasets publicly available can begin to take steps toward closing divides in access to data. Apart from some public datasets, big data often ends up as the property of corporations, universities, and other large organizations. Even though the data produced are about individual people and their communities, those individuals and communities may not have the money or technical skills needed to access those data and make productive use of them. This creates the risk of worsening existing digital divides.

Publicly available data have helped communities understand and act on government corruption, municipal issues, human-rights abuses, and health crises, among other things. Again, though, when data are made public, it is particularly important to ensure strong privacy protections for those whose data are in the dataset. The work of the Our Data Bodies project provides additional guidance for how to engage with communities whose data are in the datasets. Their workshop materials can support community understanding and engagement in making ethical decisions around data collection and processing, and about how to monitor and audit data practices.

Risks

The use of emerging technologies to collect data can also create risks in civil society programming. Read below on how to discern the possible dangers associated with big data collection and use in DRG work, as well as how to mitigate for unintended – and intended – consequences.

Surveillance

With the potential for re-identification as well as the nature and aims of some uses of big data, there is a risk that individuals included in a dataset will be subjected to surveillance by governments, law enforcement, or corporations. This may put the fundamental rights and safety of those in the dataset at risk.

The Chinese government is routinely criticized for the invasive surveillance of Chinese citizens through gathering and processing big data. More specifically, the Chinese government has been criticized for its system of social ranking of citizens based on their social media, purchasing, and education data, as well as the gathering of DNA of members of the Uighur minority (with the assistance of a US company, it should be noted). China is certainly not the only government to abuse citizen data in this way. Edward Snowden’s revelations about the US National Security Agency’s gathering and use of social media and other data were among the first public warnings about the surveillance potential of big data. Concerns have also been raised about partnerships involved in the development of India’s Aadhaar biometric ID system, a technology whose producers are eager to sell it to other countries. In the United States, privacy advocates have raised concerns about companies and governments gathering data at scale about students by using their school-provided devices, a concern that should also be raised in any international context when laptops or mobiles are provided for students.

It must be emphasized that surveillance concerns are not limited to the institutions originally gathering the data, whether governments or corporations. When data are sold or combined with other datasets, it is possible that other actors, from email scammers to abusive domestic partners, could access the data and track, exploit, or otherwise harm people appearing in the dataset.

Data security concerns

Because big data are collected, cleaned, and combined through long, complex pipelines of software and storage, they present significant challenges for security. These challenges are multiplied whenever the data are shared between many organizations. Any stream of data arriving in real time (for example, information about people checking into a hospital) will need to be specifically protected from tampering, disruption, or surveillance. Given that data may present significant risks to the privacy and safety of those included in the datasets and may be very valuable to criminals, it is important to ensure sufficient resources are provided for security.

Existing security tools for websites are not enough to cover the entire big data pipeline. Major investments in staff and infrastructure are needed to provide proper security coverage and respond to data breaches. And unfortunately, within the industry, there are known shortages of big data specialists, particularly security personnel familiar with the unique challenges big data presents. Internet of Things sensors present a particular risk if they are part of the data-gathering pipeline; these devices are notorious for having poor security. For example, a malicious actor could easily introduce fake sensors into the network or fill the collection pipeline with garbage data in order to render your data collection useless.

Exaggerated expectations of accuracy and objectivity

Big data companies and their promoters often make claims that big data can be more objective or accurate than traditionally-gathered data, supposedly because human judgment does not come into play and because the scale at which it is gathered is richer. This picture downplays the fact that algorithms and computer code also bring human judgment to bear on data, including biases and data that may be accidentally excluded. Human interpretation is also always necessary to make sense of patterns in big data; so again, claims of objectivity should be taken with healthy skepticism.

It is important to ask questions about data-gathering methods, algorithms involved in processing, and the assumptions or inferences made by the data gatherers/programmers and their analyses to avoid falling into the trap of assuming big data are “better.” For example, while data about the proximity of two cell phones tell you that two people were near each other, only human interpretation can tell you why those two people were near each other. How an analyst interprets that closeness may differ from what the people carrying the cell phones might tell you. This is a major challenge in using phones for “contact tracing” in epidemiology. During the COVID-19 health crisis, many countries raced to build contact tracing cellphone apps. The precise purposes and functioning of these apps vary widely (as has their effectiveness), but it is worth noting that major tech companies have preferred to refer to these apps as “exposure-risk notification” apps rather than contact tracing: this is because the apps can only tell you if you have been in proximity with someone with the coronavirus, not whether or not you have contracted the virus.

Misinterpretation

As with all data, there are pitfalls when it comes to interpreting and drawing conclusions. Because big data are often captured and analyzed in real time, they may be particularly weak in providing historical context for the current patterns they are highlighting. Anyone analyzing big data should also consider what their source or sources were, whether the data were combined with other datasets, and how they were cleaned. Cleaning refers to the process of correcting or removing inaccurate or extraneous data. This is particularly important with social-media data, which can have lots of “noise” (extra information) and are therefore almost always cleaned.

Questions

If you are trying to understand the implications of big data in your work environment, or are considering using aspects of big data as part of your DRG programming, ask yourself these questions:

  1. Is gathering big data the right approach for the question you’re trying to answer? How would your question be answered differently using interviews, historical research, or a focus on statistical significance?
  2. Do you already have these data, or are they publicly available? Is it really necessary to acquire these data yourself?
  3. What is your plan to make it impossible to identify individuals through their data in your dataset? If the data come from someone else, what kind of anonymization have they already performed?
  4. How could individuals be made more identifiable by someone else when you publish your data and findings? What steps can you take to lower the risk they will be identified?
  5. What is your plan for getting consent from those whose data you are collecting? How will you make sure your consent document is easy for them to understand?
  6. If your data come from another organization, how did they seek consent? Did that consent include consent for other organizations to use the data?
  7. If you are getting data from another organization, what is the original source of these data? Who collected them, and what were they trying to accomplish?
  8. What do you know about the quality of these data? Is someone inspecting them for errors, and if so, how? Did the collection tools fail at any point, or do you suspect that there might be some inaccuracies or mistakes?
  9. Have these data been integrated with other datasets? If data were used to fill in gaps, how was that accomplished?
  10. What is the end-to-end security plan for the data you are capturing or using? Are there third parties involved whose security propositions you need to understand?

Case Studies

Village resident in Tanzania. Big data analytics can pinpoint strategies that work for small-scale farmers. Photo credit: Riaz Jahanpour for USAID / Digital Development Communications.
Big Data for climate-smart agriculture

“Scientists at the International Center for Tropical Agriculture (CIAT) have applied Big Data tools to pinpoint strategies that work for small-scale farmers in a changing climate…. Researchers have applied Big Data analytics to agricultural and weather records in Colombia, revealing how climate variation impacts rice yields. These analyses identify the most productive rice varieties and planting times for specific sites and seasonal forecasts. The recommendations could potentially boost yields by 1 to 3 tons per hectare. The tools work wherever data is available, and are now being scaled out through Colombia, Argentina, Nicaragua, Peru and Uruguay.”

School Issued Devices and Student Privacy

School-Issued Devices and Student Privacy, particularly the Best Practices for Ed Tech Companies section.

“Students are using technology in the classroom at an unprecedented rate…. Student laptops and educational services are often available for a steeply reduced price and are sometimes even free. However, they come with real costs and unresolved ethical questions. Throughout EFF’s investigation over the past two years, [they] have found that educational technology services often collect far more information on kids than is necessary and store this information indefinitely. This privacy-implicating information goes beyond personally identifying information (PII) like name and date of birth, and can include browsing history, search terms, location data, contact lists, and behavioral information…All of this often happens without the awareness or consent of students and their families.”

Big Data and Thriving Cities: Innovations in Analytics to Build Sustainable, Resilient, Equitable and Livable Urban Spaces.

This paper includes case studies of big data used to track changes in urbanization, traffic congestion, and crime in cities. “[I]nnovative applications of geospatial and sensing technologies and the penetration of mobile phone technology are providing unprecedented data collection. This data can be analyzed for many purposes, including tracking population and mobility, private sector investment, and transparency in federal and local government.”

Battling Ebola in Sierra Leone: Data Sharing to Improve Crisis Response.

“Data and information have important roles to play in the battle not just against Ebola, but more generally against a variety of natural and man-made crises. However, in order to maximize that potential, it is essential to foster the supply side of open data initiatives – i.e., to ensure the availability of sufficient, high-quality information. This can be especially challenging when there is no clear policy backing to push actors into compliance and to set clear standards for data quality and format. Particularly during a crisis, the early stages of open data efforts can be chaotic, and at times redundant. Improving coordination between multiple actors working toward similar ends – though difficult during a time of crisis – could help reduce redundancy and lead to efforts that are greater than the sum of their parts.”

Tracking Conflict-Related Deaths: A Preliminary Overview of Monitoring Systems.

“In the framework of the United Nations 2030 Agenda for Sustainable Development, states have pledged to track the number of people who are killed in armed conflict and to disaggregate the data by sex, age, and cause—as per Sustainable Development Goal (SDG) Indicator 16. However, there is no international consensus on definitions, methods, or standards to be used in generating the data. Moreover, monitoring systems run by international organizations and civil society differ in terms of their thematic coverage, geographical focus, and level of disaggregation.”

Balancing data utility and confidentiality in the US census.

Describes how the Census is using differential privacy to protect the data of respondents. “As the Census Bureau prepares to enumerate the population of the United States in 2020, the bureau’s leadership has announced that they will make significant changes to the statistical tables the bureau intends to publish. Because of advances in computer science and the widespread availability of commercial data, the techniques that the bureau has historically used to protect the confidentiality of individual data points can no longer withstand new approaches for reconstructing and reidentifying confidential data. … [R]esearch at the Census Bureau has shown that it is now possible to reconstruct information about and reidentify a sizeable number of people from publicly available statistical tables. The old data privacy protections simply don’t work anymore. As such, Census Bureau leadership has accepted that they cannot continue with their current approach and wait until 2030 to make changes; they have decided to invest in a new approach to guaranteeing privacy that will significantly transform how the Census Bureau produces statistics.”

Blockchain

What is Blockchain?

A blockchain is a distributed database existing on multiple computers at the same time, with a detailed and un-changeable transaction history leveraging cryptography. Blockchain-based technologies, perhaps most famous for their use in “cryptocurrencies” such as Bitcoin, are also referred to as “distributed ledger technology (DLT).”

How does Blockchain work?

Unlike hand-written records, like this bed net distribution in Tanzania, data added to a blockchain can’t be erased or manipulated. Photo credit: USAID.

Blockchain is a constantly growing database as new sets of recordings, or ‘blocks,’ are added to it. Each block contains a timestamp and a link to the previous block, so they form a chain. The resulting blockchain is not managed by any particular body; instead, everyone in the network has access to the whole database. Old blocks are preserved forever, and new blocks are added to the ledger irreversibly, making it impossible to erase or manipulate the database records.
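The chaining described above can be sketched in a few lines of Python. This is a single-machine illustration under stated assumptions, not how any particular blockchain is implemented: it assumes the "link" to the previous block is that block's SHA-256 hash, and it leaves out the distributed network and the process by which participants agree on new blocks. The function names and the land-parcel entries are hypothetical.

```python
import hashlib
import json
import time


def block_hash(block: dict) -> str:
    """Hash a block's contents, excluding its own stored hash."""
    payload = {k: block[k] for k in ("index", "timestamp", "data", "prev_hash")}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def add_block(chain: list, data: str) -> None:
    """Append a block holding a timestamp and a link to the previous block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {
        "index": len(chain),
        "timestamp": time.time(),
        "data": data,
        "prev_hash": prev_hash,
    }
    block["hash"] = block_hash(block)
    chain.append(block)


def is_valid(chain: list) -> bool:
    """Check each block's stored hash and its link to the block before it."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        expected_prev = chain[i - 1]["hash"] if i > 0 else "0" * 64
        if block["prev_hash"] != expected_prev:
            return False
    return True


chain = []
add_block(chain, "Parcel 17 registered to owner A")
add_block(chain, "Parcel 17 transferred to owner B")
print(is_valid(chain))  # True

# Quietly rewriting an old record breaks the chain's integrity check.
chain[0]["data"] = "Parcel 17 registered to owner C"
print(is_valid(chain))  # False
```

Because each block's hash depends on its contents and on the hash of the block before it, changing an old record means recomputing every later block. On a real network, those recomputed blocks would also have to be accepted by the other participants.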

Blockchain can provide solutions for very specific problems. The most clear-cut use case is for public, shared data where all changes or additions need to be clearly tracked, and where no data will ever need to be redacted. Different uses require different inputs (computing power, bandwidth, centralized management), which need to be carefully considered based on each context. Blockchain is also an over-hyped concept applied to a range of different problems where it may not be the most appropriate technology, or in some cases, even a responsible technology to use.

There are two core concepts around Blockchain technology: the transaction history aspect and the distributed aspect. They are technically tightly interwoven, but it is worth considering them and understanding them independently as well.

'Immutable' Transaction History

Imagine stacking blocks. With increasing effort, one can continue adding more blocks to the tower, but once a block is in the stack, it cannot be removed without fundamentally and very visibly altering—and in some cases destroying—the tower of blocks. A blockchain is similar in that each “block” contains some amount of information—information that may be used, for example, to track currency transactions and store actual data. (You can explore the bitcoin blockchain, which itself has already been used to transmit messages and more, to learn about a real-life example.)

This is a core aspect of blockchain technology, generally called immutability: data, once stored, cannot be altered. In a practical sense, blockchain is immutable: although 100% agreement among users could permit changes, actually making those changes would be incredibly tedious.
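To see why altering stored data is so visible, consider the following sketch. It assumes a simplified chain in which each block stores a hash of the previous block; the helper names (digest, build_chain, first_broken_link) and the land-record strings are hypothetical and chosen only for illustration.

```python
import hashlib
import json


def digest(block: dict) -> str:
    """SHA-256 digest of a block's contents."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode("utf-8")).hexdigest()


def build_chain(entries: list) -> list:
    """Build a list of blocks, each linked to the hash of the one before it."""
    chain = []
    previous = "0" * 64
    for entry in entries:
        block = {"data": entry, "previous_hash": previous}
        chain.append(block)
        previous = digest(block)
    return chain


def first_broken_link(chain: list):
    """Return the index of the first block whose link no longer matches, or None."""
    previous = "0" * 64
    for index, block in enumerate(chain):
        if block["previous_hash"] != previous:
            return index
        previous = digest(block)
    return None


records = build_chain(["parcel 17: Amina", "parcel 17: Joseph", "parcel 22: Grace"])
intact = first_broken_link(records)   # None: every link matches

records[0]["data"] = "parcel 17: Mallory"  # quietly rewrite history
broken = first_broken_link(records)   # 1: the next block no longer matches

print(intact, broken)
```

Changing an old block breaks the link stored in every block after it, so the alteration would have to be repeated through the rest of the chain and accepted by the other participants. Note that this simple check would not notice a change to the very last block; real systems rely on the many independent copies held across the network for that.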

Blockchain is, at its simplest, a valuable digital tool that replicates online the value of a paper-and-ink logbook. While this can be useful to track a variety of sequential transactions or events (ownership of a specific item, a parcel of land, or a supply chain) and could even be theoretically applied to concepts like voting or community ownership and management of resources, it comes with an important caveat: mistakes can never be truly unmade, and data tracked in a blockchain can never be updated, only appended to.

Many of the potential applications of blockchain would rely on one of the pieces of data tracked being the identity of a person or legal organization. If that entity changes, their previous identity will be forever immutably tracked and linked to the new identity. On top of being damaging to a person fleeing persecution or legally changing their identity (transgender individuals, for example), this is also a violation of the right to privacy established under international human rights law.

Distributed and Decentralized

The second core tenet of blockchain technology is the absence of a central authority or oracle of “truth.” By nature of the unchangeable transaction records, every stakeholder contributing to a blockchain tracks and verifies the data it contains. At scale, this provides powerful protection against problems common not only to NGOs but to the private sector and other fields that are reliant on one service to maintain a consistent data store. This feature can protect a central system from collapsing or being censored, corrupted, lost, or hacked — but at the risk of placing significant hurdles in the development of the protocol and requirements for those interacting with the data.

A common misconception is that blockchain is completely open and transparent. Blockchains may be private, with various forms of permissions applied. In such cases, some users have more control over the data and transactions than others. Privacy settings for blockchain can make for easier management, but also replicate some of the specific challenges that blockchains, in theory, are solving.

Permissionless vs Permissioned Blockchain

Permissionless blockchains are public, so anyone can interact with and participate in them. Permissioned blockchains, on the other hand, are closed networks, which only specific actors can access and contribute to. As such, permissionless blockchains are more transparent and decentralized, while permissioned blockchains are governed by an entity or a group of entities that can customize the platform, choosing who can participate, the level of transparency, and whether or not to use digital assets. Another key difference is that public blockchains tend to be anonymous, while private ones, by nature, cannot be. Because of this, permissioned blockchain is chosen in many human-rights use cases, using identity to hold users accountable.

How is blockchain relevant in civic space and for democracy?

Blockchain technology has the potential to provide substantial benefits in the development sector broadly, as well as specifically for human rights programs. By providing a decentralized, verifiable source of data, blockchain technology can be a more transparent, efficient form of information and data management for improved governance, accountability, financial transparency, and even digital identities. While blockchain can be effective when used strategically on specific problems, practitioners who choose to use it must do so fastidiously. The decisions to use DLTs should be based on a detailed analysis and research on comparable technologies, including non-DLT options. As blockchains are used more and more for governance and in the civic space, irresponsible applications threaten human rights, especially data security and the right to privacy.

By providing a decentralized, verifiable source of data, blockchain technology can enable a more transparent, efficient form of information and data management. Practitioners should understand that blockchain technology can be applied to humanitarian challenges, but it is not a separate humanitarian innovation in itself.

Blockchain for the Humanitarian Sector – Future Opportunities

Blockchains lend themselves to some interesting tools being used by companies, governments, and civil society. Examples of how blockchain technology may be used in civic space include: land titles (necessary for economic mobility and preventing corruption), digital IDs (especially for displaced persons), health records, voucher-based cash transfers, supply chain, censorship-resistant publications and applications, digital currency, decentralized data management, recording votes, crowdfunding, and smart contracts. Some of these examples are discussed below. Specific examples of the use of blockchain technology may be found on this page under case studies.

A USAID-funded project used a mobile app and software to track the sale and transfer of land rights in Tanzania. Blockchain technology may also be used to record land titles. Photo credit: Riaz Jahanpour for USAID / Digital Development Communications.

Blockchain’s core tenets, an immutable transaction history and a distributed, decentralized nature, lend themselves to some interesting tools being used by companies, governments, and civil society. The risks and opportunities these present are explored more fully in the relevant sections below, and specific examples are given in the Case Studies section. At a high level, many actors are looking at leveraging blockchain in the following ways:

Smart Contracts

Smart contracts are agreements that provide automatic payments on the completion of a specific task or event. For example, in civic space, smart contracts could be used to execute agreements between NGOs and local governments to expedite transactions, lower costs, and reduce mutual suspicions. However, since these contracts are “defined” in code, any software bug can interfere with the intent of the contract or become a loophole through which the contract could be exploited. In one such case, an attacker exploited a software bug in The DAO, a smart contract-based firm, taking approximately $50M.
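The idea of an agreement “defined” in code can be illustrated with a toy escrow written in plain Python. This is only a simulation under stated assumptions: real smart contracts run on a blockchain and are written in purpose-built languages, and the class name, party names, and amounts below are hypothetical.

```python
class EscrowContract:
    """Toy escrow: funds are released automatically once the task is confirmed."""

    def __init__(self, payer: str, payee: str, verifier: str, amount: int):
        self.payer = payer
        self.payee = payee
        self.verifier = verifier
        self.amount = amount
        self.balances = {payer: amount, payee: 0}
        self.held = 0        # funds currently locked in the contract
        self.paid = False    # whether the payment has been released

    def fund(self, caller: str) -> None:
        """The payer locks the agreed amount in the contract."""
        if caller != self.payer:
            raise PermissionError("only the payer can fund the contract")
        if self.held or self.paid:
            raise RuntimeError("the contract has already been funded")
        self.balances[self.payer] -= self.amount
        self.held = self.amount

    def confirm_completion(self, caller: str) -> None:
        """The verifier confirms the task; payment follows with no further approval."""
        if caller != self.verifier:
            raise PermissionError("only the verifier can confirm completion")
        if self.held == 0:
            raise RuntimeError("nothing is held in escrow")
        # Clear the held funds before crediting the payee, so a repeated
        # call cannot release the same money twice.
        amount, self.held = self.held, 0
        self.paid = True
        self.balances[self.payee] += amount


contract = EscrowContract("ngo", "local_government", "auditor", 100)
contract.fund("ngo")
contract.confirm_completion("auditor")
print(contract.balances)
```

The checks inside each method are the whole of the agreement. If one of them were missing or wrong, for example if the held funds were not cleared before payment, the code would still run exactly as written, which is how a bug becomes a loophole.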

Liquid Democracy

Liquid democracy is a form of democracy wherein, rather than simply voting for elected leaders, citizens also engage in collective decision making. While direct democracy (each individual having a say on every choice a country makes) is not feasible, blockchain could lower the barriers to liquid democracy, a system which would put more power into the hands of the people. Blockchain would allow citizens to register their opinions on specific subject matters or delegate votes to subject matter experts.
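Delegation is the step that distinguishes liquid democracy from a simple ballot, and it can be sketched as follows. The function names, voter names, and the rule that a delegation loop or dead end counts as an abstention are all assumptions made for this example; an actual system would need to settle these rules deliberately.

```python
from collections import Counter


def resolve_vote(voter, direct_votes, delegations):
    """Follow a voter's delegation chain until reaching someone who voted directly.

    Returns None when the chain loops or ends at someone who did not vote.
    """
    seen = set()
    current = voter
    while current not in direct_votes:
        if current in seen or current not in delegations:
            return None
        seen.add(current)
        current = delegations[current]
    return direct_votes[current]


def tally(voters, direct_votes, delegations):
    """Count one vote per voter, cast either directly or through a delegate."""
    counts = Counter()
    for voter in voters:
        choice = resolve_vote(voter, direct_votes, delegations)
        if choice is not None:
            counts[choice] += 1
    return counts


voters = ["ana", "ben", "chi", "dee", "eli"]
direct_votes = {"ana": "yes", "ben": "no"}
# chi trusts ana, dee trusts chi, and eli delegates to someone who never votes.
delegations = {"chi": "ana", "dee": "chi", "eli": "fay"}

result = tally(voters, direct_votes, delegations)
print(result)
```

In this example "yes" receives three votes (ana, plus chi and dee through delegation), "no" receives one, and eli's vote is not counted. Recording the votes and delegations on a blockchain would make such a tally auditable, but it would not by itself resolve the privacy questions raised elsewhere in this resource.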

Government Transparency

Blockchain can be used to tackle governmental corruption and waste in common areas like public procurement. Governments can use blockchain to publicize the steps of procurement processes and build citizen trust as citizens know the transactions recorded cannot have been tampered with. The tool can also be used to automate tax calculation and collection.

Innovative Currency and Payment Systems

Many new cryptocurrencies are considering ways to leverage blockchain for transactions without the volatility of bitcoin, and with other properties, such as speed, cost, stability and anonymity. Cryptocurrencies are also occasionally combined with smart contracts, to establish shared ownership through funding of projects.

Potential for fund-raising

In addition, the digital currency subset of blockchain is being used to establish shared ownership of projects, not unlike stocks or shares in large companies.

Potential for election integrity

The transparency and immutability of blockchain could be used to increase public confidence in elections by integrating electronic voting machines and blockchain. However, there are privacy concerns with publicly tracking the tally of votes. Additionally, this system relies on electronic voting machines, which raise security concerns because computers can be hacked, and which have been met with mistrust in several societies where they were suggested. Online voting through blockchain faces similar distrust, but integrating blockchain into voting would make audits much easier and more reliable. This traceability would also be a useful feature in transparently transmitting results from polling places to tabulation centers.

Censorship-resistant technology

The decentralized, immutable nature of blockchain provides clear benefits to protecting speech, but not without significant risks. There have been high-visibility uses of blockchain to publish censored speech in China, Turkey, and Catalonia. Article 19 has written an in-depth report specifically on the interplay between freedom of expression and blockchain technologies, which provides a balanced view of the potential benefits and risks and guidance for stakeholders considering engaging in this facet.

Decentralized computation and storage

Micro-payments through a blockchain can be used to formalize and record actions. This can be useful when carrying out activities with multiple stakeholders where trust, transparency, and a permanent record are valuable, for example, automated auctions (to prevent corruption), voting (to build voter trust), signing contracts (to keep a record of ownership and obligations that will outlast crises that destroy paper or even digital systems), and even for copyright purposes and preventing manipulation of facts.

Ethereum is a cryptocurrency focused on using the blockchain system to help manage decentralized computation and storage through smart contracts and digital payments. Ethereum encourages the development of “distributed apps” which are tied to transactions on the Ethereum blockchain. Examples of these apps include an X-like tool and apps that pay for content creation and sharing. See case studies in the cryptocurrencies primer for more detail.

The vast majority of these applications presume some form of micro-payment as part of the transaction. However, this requirement has ramifications for equal access as internet accessibility, capital, and access to online payment systems are all barriers to usage. Furthermore, with funds involved, informed consent is even more essential and challenging to ensure.

Opportunities

Blockchain can have positive impacts when used to further democracy, human rights and governance issues. Read below to learn how to more effectively and safely think about blockchain in your work.

Proof of Digital Integrity

Data stored or tracked using blockchain technologies have a clear, sequential, and unalterable chain of verifications. Once data is added to the blockchain, there is ongoing mathematical proof that it has not been altered. This does not provide any assurance that the original data is valid or true, and it means that any data added cannot be deleted or changed – only appended to. However, in civil society, this benefit has been applied to concepts such as creating records for land titles/ownership; improving voting security by ensuring one person matches with one unchangeable vote; and preventing fraud and corruption while enhancing transparency in international philanthropy. It has been used to keep records of digital identities to help people retain ownership over their identity and documents and, in humanitarian contexts, to make voucher-based cash transfers more efficient. As an enabler for digital currency, in some circumstances, blockchain facilitates cross-border funding of civil society. Blockchain could be used to preserve not only identification documents, but qualifications and degrees as well.

A function such as this can provide a solution to the legal invisibility most often borne by refugees and migrants. Rohingya refugees in Bangladesh, for example, are often at risk of discrimination and exploitation because they are stateless. Proponents of blockchain argue that its distributed system can grant individuals “self-sovereign identity,” a concept by which ownership of identity documents is taken from authorities and put in the hands of individuals. This allows individuals to use their identity documents across a number of authorities, while authorities’ access requires a degree of consent. A self-sovereign identity model could be a solution to requirements raised by the GDPR and similar privacy-rights-supporting legislation.

However, if blockchain architects do not secure transaction permissions and public/private state variables, governments could use machine-learning algorithms to monitor public blockchain activity and gain insight into whatever daily, lower-level activities of their citizens are linkable to their blockchain identities. This might include payments (both interpersonal and business) and services, be they health, financial, or other. Anywhere citizens need to show their ID, their location and time would be tracked. While this is an infringement on privacy rights, it is especially problematic for marginalized groups whose legal status in a country can change rapidly and with no warning. Furthermore, such a use of blockchain assumes that individuals would be prepared and able to adopt that technology, an unlikely possibility given the financial insecurity and lack of access to information and the internet that many vulnerable groups, such as refugees, face. In this context, it is impossible to get meaningful informed consent from these target groups.

Blockchains promise anonymity, or at least pseudonymity, because limited information regarding individuals is stored in transaction logs. However, this does not guarantee that the platforms protect freedom of expression. For instance, the central internet regulator in China proposed regulations that would require local blockchain companies to register users with their real names and national identification card numbers.

Supply Chain Transparency

Blockchain has been used to create transparency in the supply chain and connect consumers directly with the producers of the products they are buying. This enables consumers to know companies are following ethical and sustainable production practices. For example, Moyee Coffee uses blockchain to track their supply chain, and makes this information available to customers, who can confirm the coffee beans were picked by paid, adult farmers and even tip those farmers directly.

Decentralized Store of Data

Around the world, blockchain technology helps displaced people regain IDs and access to other social services. Here, a CARD agent in the Philippines tracks IDs by hand. Photo credit: Brooke Patterson/USAID.

Blockchain is resistant to the traditional problems one central authority or data store faces when being attacked or experiencing outages. In a blockchain, data are constantly being shared and verified across all members—although blockchain has been criticized for requiring large amounts of energy, storage, and bandwidth to maintain a shared data store. This decentralization is most valued in digital currencies, which rely on the scale of their blockchain to balance not having a country or region “owning” and regulating the printing of the currency. Blockchain has also been explored to distribute data and coordinate resources without a reliance on a central authority in order to resist censorship.

Risks

The use of emerging technologies can also create risks in civil society programming. Read below on how to discern the possible dangers associated with blockchain in DRG work, as well as how to mitigate unintended – and intended – consequences.

Unequal Access

The minimal requirements for an individual or group to engage with blockchain present a challenge for many. Connectivity, reliable and robust bandwidth, and local storage are all needed, so mobile phones are often insufficient devices for hosting or downloading blockchains. The infrastructure blockchain requires can serve as a barrier to access in areas where Internet connectivity primarily occurs via mobile devices. Because every full node (host of a blockchain) stores a copy of the entire transaction log, blockchains only grow longer and larger with time, and thus can be extremely resource-intensive to download on a mobile device. For instance, over the span of a few years, the blockchain underlying Bitcoin grew from several gigabytes to several hundred. For a cryptocurrency blockchain, this growth is a necessary sign of healthy economic growth. While the use of blockchain offline is possible, offline components are among the most vulnerable to cyberattacks, and this could put the entire system at risk.
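A rough calculation shows how quickly a full copy of a chain can outgrow a phone. The figures below are illustrative assumptions, not measurements from this resource: blocks of about 1 MB added about every 10 minutes, values chosen to roughly resemble Bitcoin's historical design parameters. The function name is hypothetical.

```python
def yearly_growth_gb(block_size_mb: float, minutes_per_block: float) -> float:
    """Estimate how many gigabytes a chain adds per year if every block is full."""
    minutes_per_year = 365 * 24 * 60
    blocks_per_year = minutes_per_year / minutes_per_block
    return blocks_per_year * block_size_mb / 1000


# Assumed parameters: 1 MB blocks, one block every 10 minutes.
growth = yearly_growth_gb(block_size_mb=1, minutes_per_block=10)
print(f"about {growth:.1f} GB of new data per year")
```

Under these assumptions a chain adds roughly 52.6 GB per year, and a full node must store and download all of it, along with everything added in earlier years.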

Blockchains, whether they are fully independent or part of existing blockchains, require some percentage of actors to lend processing power to the blockchain. As blockchains scale, this requirement either becomes exclusionary or creates classes of privileged users.

Another problem that can undermine the intended benefits of the system is the unequal access to opportunities to convert blockchain-based currencies to traditional currencies. This is especially a problem in relation to philanthropy or to support civil society organizations in restrictive regulatory environments. For cryptocurrencies to have actual value, someone has to be willing to pay money for them.

Lack of digital literacy

Beyond these technical challenges, blockchain technology requires a strong baseline understanding of technology and its use in situations where digital literacy itself is a challenge. Use of the technology without a baseline understanding of the consequences is not really consent and could have dire consequences.

There are paths around some of these problems, but any blockchain use needs to reflect on what potential inequalities could be exacerbated by or with this technology.

Further, these technologies are inherently complex, and outside the atypical case where individuals do possess the technical sophistication and means to install blockchain software and set up nodes, the question remains as to how the majority of individuals can effectively access them. This is especially true of individuals who may have added difficulty interfacing with technologies due to disability, literacy, or age. Ill-equipped users are at increased risk of their investments or information being exposed to hacking and theft.

Blockchain and freedom of expression

Breaches of Privacy

Account ledgers for Nepali Savings and Credit Cooperatives show the burden of paper. Blockchain replicates online the value of paper-and-ink records. Photo credit: Brooke Patterson/USAID.

Storing sensitive information on a blockchain – such as biometrics or gender – combined with the immutable aspects of the system, can lead to considerable risks for individuals when this information is accessed by others with the intention to harm. Even when specific personally identifiable information is not stored on a blockchain, pseudonymous accounts are difficult to protect from being mapped to real-world identities, especially if they are connected with financial transactions, services, and/or actual identities. This can erode rights to privacy and protection of personal data, as well as exacerbate the vulnerability of already marginalized populations and persons who change fundamental aspects of their person (gender, name). Data privacy rights, including explicit consent, modification, and deletion of one’s own data, are often protected through data protection and privacy legislation, such as the General Data Protection Regulation (GDPR) in the EU, which serves as a framework for many other policies around the world. An overview of legislation in this area around the world is kept up to date by the United Nations Conference on Trade and Development.
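The difficulty of protecting pseudonymous accounts can be shown with a small, entirely hypothetical ledger. The addresses, memos, and the single known identity below are invented for illustration; the point is only that one link between an address and a person exposes every entry that address ever touched.

```python
# A public ledger that stores no names, only pseudonymous addresses.
ledger = [
    {"sender": "addr_a1", "receiver": "addr_clinic", "memo": "health service"},
    {"sender": "addr_b2", "receiver": "addr_shop", "memo": "groceries"},
    {"sender": "addr_a1", "receiver": "addr_lender", "memo": "loan repayment"},
    {"sender": "addr_aid", "receiver": "addr_a1", "memo": "cash transfer"},
]


def activity_of(address: str, entries: list) -> list:
    """Return every ledger entry in which the address sent or received."""
    return [e for e in entries if address in (e["sender"], e["receiver"])]


# One piece of outside knowledge, for example an ID shown at a service point.
known_identities = {"addr_a1": "Person A"}

profiles = {
    name: activity_of(address, ledger)
    for address, name in known_identities.items()
}

for name, entries in profiles.items():
    print(name, [e["memo"] for e in entries])
```

Because the ledger is immutable, the linked history cannot be removed afterwards, even if the person later changes their name or legal status.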

For example, in September 2017, concerns surfaced about the Bangladeshi government’s plans to create a ‘merged ID’ that would combine citizens’ biometric, financial, and communications data (Rahman, 2017). At that time, some local organizations had started exploring a DLT solution to identify and serve the needs of local Rohingya asylum-seekers and refugees. Because aid agencies are required to comply with national laws, any data recorded on a DLT platform could be subject to automatic data-sharing with government authorities. If these sets of records were to be combined, they would create an indelible, uneditable, untamperable set of records of highly vulnerable Rohingya asylum-seekers, ready for cross-referencing with other datasets. “As development and humanitarian donors and agencies rush to adopt new technologies that facilitate surveillance, they may be creating and supporting systems that pose serious threats to individuals’ human rights.”

These issues raise questions about meaningful, informed consent – how and to what extent do aid recipients understand DLTs and their implications when they receive assistance? […] Most experts agree that data protection needs to be considered not only in the realm of privacy, empowerment and dignity, but also in terms of potential physical impact or harm (ICRC and Brussels Privacy Hub, 2017; ICRC, 2018a)

Blockchain and distributed ledger technologies in the humanitarian sector

Environmental Impact

As blockchains scale, they require increasing amounts of computational power to stay in sync. In most digital currency blockchains, this scale problem is balanced by rewarding the people who contribute the required processing power with currency. The University of Cambridge estimated in fall 2019 that Bitcoin alone used 0.28% of global electricity consumption, which, if Bitcoin were a country, would have placed it as the 41st most energy-consuming country, just ahead of Switzerland. Further, the negative impact is demonstrated by research showing that each Bitcoin transaction takes as much energy as is needed to run a well-appointed house and all the appliances in it for an entire week.
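Much of this energy goes into a deliberately hard guessing game, often called proof of work, that Bitcoin and similar currencies use to decide who adds the next block. The sketch below is a simplified stand-in rather than Bitcoin's actual procedure: the function name, the input text, and the difficulty value are assumptions chosen so that the example finishes quickly.

```python
import hashlib


def mine(data: str, difficulty: int):
    """Try nonces until the SHA-256 hash of data+nonce starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


data = "block of transactions"
nonce, digest = mine(data, difficulty=4)
print(f"found after {nonce + 1} attempts: {digest}")
```

There is no shortcut to finding a valid nonce other than trying values one after another, and each additional leading zero required multiplies the expected number of attempts by 16. Real networks set the difficulty so high that the combined guessing consumes electricity on the scale described above.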

Regulatory Uncertainty

As is often the case for emerging technology, the regulations surrounding blockchain are either ambiguous or nonexistent. In some cases, such as when the technology may be used to publish censored speech, regulators overcorrect and block access to the entire system or remove pseudonymous protections of the system in-country. In Western democracies, there are evolving financial regulations as well as concerns around the immutable nature of the records stored in a blockchain. Personally identifiable information (see Breaches of Privacy, above) in a blockchain cannot be removed or changed as required by the GDPR right to be forgotten, and widely illegal content has already been inserted into the bitcoin blockchain.

Trust, Control, and Management Issues

While a blockchain has no “central database” which could be hacked, it also has no central authority to adjudicate or resolve problems. A lost or compromised password is almost guaranteed to result in the loss of ability to access funds or worse, digital identities. Compromised passwords or illegitimate use of the blockchain can harm individuals involved, especially when personal information is accessed or when child sexual abuse images are stored forever. Building mechanisms to address this problem undermines the key benefits of the blockchain.

That said, an enormous amount of trust is inherently placed in the software-development process around blockchain technologies, especially those using smart contracts. Any flaw in the software, and any intentional “back door”, could enable an attack that undermines or subverts the entire goal of the project.

Where is trust being placed: in the coders, the developers, or those who design and govern mobile devices or apps? Is trust in fact being shifted from social institutions to private actors? All stakeholders should consider what implications this has and how these actors are accountable to human rights standards.

Blockchain and freedom of expression

Questions

If you are trying to understand the implications of blockchain in your work environment, or are considering using aspects of blockchain as part of your DRG programming, ask yourself these questions:

  1. Does blockchain provide specific, needed features that existing solutions with proven track records and sustainability do not?
  2. Do you really need blockchain, or would a database be sufficient?
  3. How will this implementation respect data privacy and control laws such as the GDPR?
  4. Do your intended beneficiaries have the internet bandwidth needed to use the product you are developing with blockchain?
  5. What external actors/partners will control critical aspects of the tool or infrastructure this project will rely on?
  6. What external actors/partners will have access to the data this project creates? What access conditions, limits, or ownership will they have?
  7. What level of transparency and trust do you have with these actors/partners?
  8. Are there ways to reduce dependency on these actors/partners?
  9. How are you conducting and measuring informed consent processes for any data gathered?
  10. How will this project mitigate technical, financial, and/or infrastructural inequalities and ensure they are not exacerbated?
  11. Will the use of blockchain in your project comply with data protection and privacy laws?
  12. Do other existing laws and policies address the risks and offer mitigating measures related to the use of blockchain in your context, such as anti-money-laundering regulation?
  13. Are there laws in the works that may restrict your project or increase costs?
  14. Do existing laws enable the benefits you have identified for the blockchain-enabled project?
  15. Are these laws aligned with international human rights law, such as the right to privacy, to freedom of expression and opinion, and to enjoy the benefits of scientific progress?

Case Studies

Blockchain and the supply chain

Blockchain has been used for supply chain transparency of products that are commonly not ethically sourced. For example, in 2018, the World Wildlife Fund collaborated with Sea Quest Fiji Ltd., a tuna fishing and processing company, and with ConsenSys, a tech company working with an implementer called TraSeable, to use blockchain to trace the origin of tuna caught in a Fijian longline fishery. Each fish was tagged when caught, and the entire journey of the fish was recorded on the blockchain. This methodology is a weapon for sustainability and ethical business practices in other supply chains as well, including those that rely on child and forced labor.

Blockchain to combat corruption in registering land titles

A program was developed in Georgia to address corruption in land management in the country. Land ownership is a sector particularly vulnerable to corruption, in part because ownership is recognized through titles, which can easily be lost or destroyed, making it very easy for government officials to extract bribes to register land. Blockchain was introduced to provide a transparent and immutable recording of each step of the process to register land, so that the procurement process could be tracked and there would be no danger of losing the record.

Blockchain for COVID-19 vaccine passports

After COVID-19 vaccines became publicly available, many states considered implementing a vaccine passport system, whereby individuals would be required to show documentation proving they were vaccinated in order to enter certain countries or buildings. Blockchain was considered as a tool to more easily store vaccine records and track doses without negative consequences for individuals who lose their records. While there are significant data privacy concerns in a system where there is no alternative to allowing one’s data to be stored on a blockchain, such a system could have significant public health benefits. Furthermore, it suggests that future identification documents are likely to rely on blockchain.

Blockchain to facilitate transactions for humanitarian aid

Humanitarian aid is the sector where blockchain for human rights and democracy has been adopted the most. Blockchain has been embraced as a way to combat corruption and ensure money and aid reach intended targets, to allow access to donations in countries where crises have affected the banking system, and in coordination with Digital IDs to allow donor organizations to better track funding and get money to people without traditional methods of receiving money.

Sikka, a project of the Nepal Innovation Lab, operates through partnerships with local vendors and cooperatives within the community, sending value vouchers and digital tokens to individuals through SMS. Value vouchers can be used to purchase humanitarian goods from vendors, while digital tokens can be exchanged for cash. The initiative also supplies donors with data for monitoring and evaluation purposes. The International Federation of the Red Cross and Red Crescent Societies (IFRC) has a similar project, the Blockchain Open Loop Cash Transfer Pilot Project for cash transfer programming. The Kenya-based project utilized a mobile money transfer service operating in the country, Safaricom M-Pesa, to send payments to the mobile wallets of beneficiaries without the need for national ID documentation, and blockchain was used to track the payments. A management platform called “Red Rose” allowed donor organizations to manage data, and the program explored many of the ethics concerns around the use of blockchain.

The Start Network is another humanitarian aid organization that has experimented with using blockchain to disperse funds because of the reduced transfer fees, transparency, and speed benefits. Using the Disperse platform, a distribution platform for foreign aid, the Start Network hoped to increase the humanitarian sector’s comfort with introducing new tech solutions.

AIDONIC is a private company with a donation management tool that incentivizes humanitarian donation with a platform allowing donors, even individuals, greater control over what their donations are used for. Small donors can choose specific initiatives, which will launch when fully funded, and throughout projects, donors can monitor, track, and trace their contributions.

Blockchain for collaboration

A similar humanitarian application of blockchain is collaboration. The World Food Program’s Building Blocks project allows organizations that work in the same region but offer different types of humanitarian aid to coordinate their efforts. All of the actions of the humanitarian organizations are recorded on a shared private blockchain, and all members of the network must be approved. The program has a policy to support data privacy: it records only the data that is required, releases pseudonymous data only to approved humanitarian organizations, and does not record any sensitive information. Even so, humanitarian aid applications of blockchain raise many cybersecurity and data privacy concerns. The project has not been as successful as hoped; only UN Women and the World Food Program are full members. Still, the network makes it easier for beneficiaries to access aid from both organizations, and it gives aid organizations a clearer picture of what types of aid are being provided and what is missing.

Blockchain in electronic banking

In addition to its applications in humanitarian funding, blockchain has been used to address gaps in financial services outside of crisis zones. Project i2i provides a nontraditional solution for the unbanked population in the Philippines. While standing up the internet technology infrastructure necessary to establish traditional banking in rural areas is extremely challenging and resource intensive, with the blockchain, each bank only needs an iPad. With this, banks connect to the Ethereum network and users have access to a trustworthy and efficient system to process transactions. Though the system has successfully reduced the number of unbanked people in the Philippines, it raises informed consent issues, as the majority of users have no other option, as well as data privacy concerns.

Blockchain and data integrity

While data privacy is a serious concern, blockchain also has the potential to support democracy and human rights work through data collection, verification, and even through supporting data privacy. Chemonics’ 2018 Blockchain for Development Solutions Lab used blockchain to make the process of collecting and verifying the biodata of USAID professionals more efficient. The use of blockchain reduced incidents of error and fraud and provided increased data protection, both because of the natural defense against hacking that blockchains provide and because, instead of sharing ID documents through email, the program utilized encrypted keys on Chemonics.

Blockchain for fact checking images

Truepic is a company that provides fact checking solutions. The company supports information integrity by storing accurate information about pictures that have been verified. Truepic combines camera technology, which records pertinent details of every photo, with blockchain storage to create a database of verified imagery that cannot be tampered with. This database can then be used to fact check manipulated images.
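
The tamper-detection idea described above can be illustrated with a minimal sketch. This is not Truepic’s actual implementation: the function names, the image identifier, and the in-memory dictionary standing in for blockchain storage are all assumptions made for illustration. The sketch only shows that a cryptographic fingerprint recorded when a photo is verified will no longer match once the image is altered.

```python
import hashlib

# Hypothetical stand-in for tamper-resistant storage; a real system would
# write fingerprints to a blockchain rather than to a Python dictionary.
verified_fingerprints = {}


def fingerprint(image_bytes: bytes) -> str:
    """Return the SHA-256 digest of the raw image bytes as a hex string."""
    return hashlib.sha256(image_bytes).hexdigest()


def register_image(image_id: str, image_bytes: bytes) -> None:
    """Record the fingerprint of an image at the moment it is verified."""
    verified_fingerprints[image_id] = fingerprint(image_bytes)


def is_unaltered(image_id: str, image_bytes: bytes) -> bool:
    """Check a later copy of the image against the stored fingerprint."""
    return verified_fingerprints.get(image_id) == fingerprint(image_bytes)


# Placeholder bytes standing in for a real photo file.
original = b"raw bytes of a verified photo"
register_image("photo-001", original)
```

Calling `is_unaltered("photo-001", original)` confirms the untouched copy, while any edited copy, even one that differs by a single byte, fails the check.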

Blockchain to permanently keep news articles

Civil.co was a journalism-supporting organization that harnessed the blockchain to permanently keep news articles online in the face of censorship. Civil’s usage of blockchain aimed to encourage community trust of the news. First, articles were published using the blockchain itself, meaning a user with sufficient technical skills could theoretically verify that the articles came from where they said they did. Civil also supported trust with two non-blockchain “technologies”: a “constitution” which all their newsrooms adopted and a ranking system through which their community of readers and journalists could vote up news and newsrooms they found trustworthy. Publishing on a peer-to-peer blockchain gave their publishing additional resistance to censorship. Readers could also pay journalists for articles using Civil’s tokens. However, Civil struggled from the beginning to raise money, and its newsroom model failed to prove itself.

For more blockchain case studies, check out these resources:

  • New America keeps a Blockchain Impact Ledger with a database of blockchain projects and the people they serve.
  • The 2019 report “Blockchain and distributed ledger technologies in the humanitarian sector” provides multiple examples of humanitarian use of DLTs, including for financial inclusion, land titling, donation transparency, fraud reduction, cross-border transfers, cash programming, grant management and organizational governance, among others.
  • In “Blockchain: Can We Talk About Impact Yet?”, Shailee Adinolfi, John Burg and Tara Vassefi respond to a MERLTech blog post that not only failed to find successful applications of blockchain in international development, but was unable to identify companies willing to talk about the process. This article highlights three case studies of projects with discussion and links to project information and/or case studies.
  • In “Digital Currencies and Blockchain in the Social Sector,” David Lehr and Paul Lamb summarize work in international development leveraging blockchain for philanthropy, international development funding, remittances, identity, land rights, democracy and governance, and environmental protection.
  • Consensys, a company building and investing in blockchain solutions, including some in the civil sector, summarizes (successful) use cases in “Real-World Blockchain Case Studies.”

Cryptocurrency

What are cryptocurrencies?

Cryptocurrency is a type of digital or virtual currency that uses cryptography to secure transactions, keep them private, and control the creation of new units. Unlike traditional currencies issued by governments (like the US Dollar or Euro), cryptocurrencies are typically decentralized and operate on blockchain technology. The first cryptocurrency, Bitcoin, was created in the wake of the 2008 global financial crisis to decentralize the system of financial transactions. Cryptocurrency is almost a direct contrast to the global financial system: no currency is attached to state authority, it is unbound by geographic regulations, and most importantly, the maintenance of the system is community driven by a network of users. All transactions are logged pseudonymously on a public ledger, such as Bitcoin’s blockchain.

Definitions

Blockchain: Blockchain is a type of technology used in many digital currencies as a bank ledger. Unlike a normal bank ledger, copies of that ledger are distributed digitally, among computers all over the world, automatically updating with every transaction.

Cryptography: The practice of employing mathematical techniques to secure and protect information, transforming it into an unreadable format using encryption and hashing. In cryptocurrencies, cryptography safeguards transactions, privacy, and ownership verification using techniques like public-private keys and digital signatures on a blockchain.

Currency: A currency is a widely accepted system of money in circulation, usually designated by a nation or group of nations. Currency is commonly found in the form of paper and coins, but can also be digital (as this primer explores).

Fiat money: Government-issued currency, such as the USD. Sometimes referred to as Fiat currency.

Hashing: The process through which cryptocurrency transactions are verified. When one person pays another using Bitcoin, for example, computers on the blockchain automatically check that the transaction is accurate.

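Hash: The fixed-length output that a cryptographic function produces from any piece of data. In this primer, the term also refers to the mathematical problem computers must solve to add transactions to the blockchain.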

Initial Coin Offering (ICO): The process by which a new cryptocurrency or digital “token” invites investment.

Mining: The process by which a computer solves a hash. The first computer to solve the hash permanently stores the transaction as a block on the blockchain. When a computer successfully adds a block to the blockchain, it is rewarded with newly created coins. Which miner arrives at the right answer first depends largely on how fast its computer can produce hashes. In the early years of Bitcoin, for example, mining could be performed effectively using open-source software on standard desktop computers. More recently, only special-purpose machines known as application-specific integrated circuit (ASIC) miners can mine bitcoin cost-effectively, because they are optimized for the task. Mining pools (groups of miners) and companies now control most Bitcoin mining activity.
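
The definitions of hashing and mining above can be made concrete with a simplified sketch of a proof-of-work puzzle. It is an illustration under stated assumptions rather than how any particular cryptocurrency is implemented: the function names, the text format of the block, and the rule “the hash must start with a given number of zeros” are chosen for readability. Bitcoin, for instance, applies SHA-256 twice and compares the result against a numeric target.

```python
import hashlib


def block_hash(previous_hash: str, transactions: str, nonce: int) -> str:
    """Hash the block contents together with a candidate nonce."""
    payload = f"{previous_hash}|{transactions}|{nonce}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mine(previous_hash: str, transactions: str, difficulty: int = 3) -> tuple[int, str]:
    """Try nonces one by one until the hash starts with `difficulty` zeros."""
    target_prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = block_hash(previous_hash, transactions, nonce)
        if digest.startswith(target_prefix):
            return nonce, digest
        nonce += 1


# Hypothetical first block: there is no earlier block, so use a dummy hash.
genesis_hash = "0" * 64
nonce, digest = mine(genesis_hash, "alice pays bob 1 coin")
```

Finding the nonce takes many attempts, but anyone can confirm the result with a single call to `block_hash`. Because each block includes the previous block’s hash, changing an old transaction would change every later hash, which is why a faster machine wins more often and why recorded blocks are hard to rewrite.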

How do cryptocurrencies work?

Money transfer agencies in Nepal. Cryptocurrencies potentially allow users to send and receive remittances and access foreign financial markets. Photo credit: Brooke Patterson/USAID.

Users purchase cryptocurrency with a credit card, debit card, or bank account, or obtain it through mining. They then store the currency in a digital “wallet,” either online, on a computer, or offline on a portable storage device such as a USB stick. These wallets are used to send and receive money through “public addresses” or keys that link the money to a specific type of cryptocurrency. These addresses are strings of characters that signify a wallet’s identity for transactions. A user’s public address can be shared with anyone to receive funds and can also be represented as a QR code. Anyone with whom a user makes a transaction can see the balance in the public address that he or she uses.

While transactions are publicly recorded, identifying user information is not. For example, on the Bitcoin blockchain, only a user’s public address appears next to a transaction, which makes transactions pseudonymous but not necessarily anonymous.
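
A short sketch can show why anyone is able to see the balance behind a public address. The ledger entries, the address strings, and the `COINBASE` marker for newly created coins are hypothetical, and the account-style bookkeeping is a simplification (Bitcoin itself tracks unspent transaction outputs rather than running balances).

```python
from collections import defaultdict

# Hypothetical ledger entries: (sender address, receiver address, amount).
# "COINBASE" is a made-up marker for newly created coins with no sender.
ledger = [
    ("COINBASE", "addr_1A2b", 50),
    ("addr_1A2b", "addr_9Zy8", 20),
    ("addr_9Zy8", "addr_1A2b", 5),
]


def balances(entries):
    """Replay every public transaction to compute each address's balance."""
    totals = defaultdict(int)
    for sender, receiver, amount in entries:
        if sender != "COINBASE":
            totals[sender] -= amount
        totals[receiver] += amount
    return dict(totals)


public_view = balances(ledger)
```

The ledger contains no names, only addresses, yet every balance and payment is visible to anyone who holds a copy of it.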

Cryptocurrencies have increasingly struggled with intense periods of volatility, most of which stem from the decentralized system of which they are part. The lack of a central body means that cryptocurrencies are not legal tender, they are not regulated, there is little to no insurance if an individual’s digital wallet is hacked, and most payments are not reversible. As a result, cryptocurrencies are inherently speculative. In November 2021, Bitcoin peaked at a price of nearly $65,000 per coin, but it crashed almost a year later following the collapse of FTX, which set off a domino effect in the crypto sector. Prior to the crash, so-called “meme coins” that gained popularity on social media saw substantial price increases as investors flocked to them. The crash that followed drew increased attention to tightening regulatory control over cryptocurrency and trading. Some cryptocurrencies, such as Tether, have attempted to offset volatility by tying their market value to an external reference such as the USD or gold. However, the industry overall has not yet reconciled how to maintain an autonomous, decentralized system with overall stability.

Types of Cryptocurrencies

The value of a certain cryptocurrency is heavily dependent on the faith of its investors, its integration into financial markets, public interest in using it, and its performance compared to other cryptocurrencies. Bitcoin, founded in 2008, was the first and only cryptocurrency until 2011 when “altcoins” began to appear. Estimates for the number of cryptocurrencies vary, but as of June 2023, there were about 23,000 different types of cryptocurrencies.

  • Bitcoin
    Bitcoin has the largest user base and a market capitalization in the hundreds of billions of dollars. While Bitcoin initially attracted financial institutions like Goldman Sachs, the collapse of Bitcoin’s value (along with other cryptocurrencies) in 2018 increased skepticism towards its long-term viability.
  • Ethereum
    Ethereum is a decentralized software platform that enables Smart Contracts and Decentralized Applications (DApps) to be built and automated without interference from a third party. Like Bitcoin, it runs on blockchain technology. Ethereum launched in 2015 and is currently the second-largest cryptocurrency based on market capitalization after Bitcoin.
  • Ripple (XRP)
    Ripple is a real-time payment-processing network that offers both instant and low-cost international payments, to compete with other transaction systems such as SWIFT or VISA. It is the third largest cryptocurrency.
  • Tether (USDT)
    Tether is one of the first and most popular of a group of “stablecoins” — cryptocurrencies that peg their market value to a currency or other external reference point to reduce volatility.
  • Monero
    Monero is the largest of what are known as privacy coins. Unlike Bitcoin, Monero transactions and account balances are not public by default.
  • Zcash
    Another anonymity-preserving cryptocurrency, Zcash, is operated under a foundation of the same name. It is branded as a mission-based, privacy-centric cryptocurrency that enables users “to protect their privacy on their own terms”, regarding privacy as essential to human dignity and to the healthy functioning of civil society.

Fish vendor in Indonesia. Women are the most underbanked sector, and financial technologies can provide tools to address this gap. Photo credit: Afandi Djauhari/NetHope.

How are cryptocurrencies relevant in civic space and for democracy?

Cryptocurrencies are, in many ways, ideal for the needs of NGOs, humanitarians, and other civil society actors. Civic space actors who require blocking-resistant, low-fee transactions might find cryptocurrencies both convenient and secure. The use of cryptocurrencies in the developing world reveals their role as not just vehicles for aid, but also tools that facilitate the development of small- to medium-sized enterprises (SMEs) looking to enter international trade. For example, UNICEF created a cryptofund in 2019 in order to receive and distribute funding in cryptocurrencies (ether and bitcoin). In June of 2020, UNICEF announced its largest investment yet in startups located in developing economies that are helping respond to the COVID-19 pandemic.

However, regarding cryptocurrencies through only a traditional development lens – i.e., that they may only be useful for refugees or countries with unreliable fiat currencies – oversimplifies the economic landscape of low- and middle-income countries. Many countries are home to a significant youth population who are poised to harness cryptocurrency in innovative ways, for instance, to send and receive remittances, to access foreign financial markets and investment possibilities, and even to encourage ecological or ethical purchasing behaviors (see the Case Studies section). During the coronavirus lockdown in India, and after the country’s reserve bank lifted its ban on cryptocurrencies, many young people started trading in Indian cryptocurrencies and using cryptocurrencies to transfer money to one another. Still, the future of crypto in India and elsewhere is uncertain. The frontier nature of cryptocurrencies poses significant risks to users when it comes to insurance and, in some cases, security.

Moreover, as will be discussed below, the distributed technology (blockchain) underlying cryptocurrencies is seen as offering resistance to censorship, as the data are distributed over a large network of computers. The blockchain offers a high level of anonymity, which may help those living under autocratic regimes, as well as democratic activists, conduct transactions that might otherwise be monitored. Cryptocurrencies could also give a broader range of people access to banking, an essential element of economic inclusion.

Opportunities

Cryptocurrencies can have positive impacts when used to further democracy, human rights, and governance issues. Read below to learn how to more effectively and safely think about cryptocurrencies in your work.

Accessibility

Cryptocurrencies are more accessible to a broader range of users than regular cash currency transactions are; they are not subject to government regulation and they don’t have high processing fees. Cross-border transactions in particular benefit from the features of cryptocurrencies; international banking fees and poor exchange rates can be extremely costly. (In some cases, the value of cryptocurrencies may even be more stable than the local currency; see the Cryptocurrencies in Volatile Markets case study below.) Cryptocurrencies that require participants to log in (on “permissioned” systems) necessitate that an organization controls participation in its system. In some cases, certain users also help run the system in other ways, like operating servers. When this is the case, it is important to understand who those users are, how they are selected, and how their ability to use the system could be taken away if they turn out to be bad actors.

Additionally, Initial Coin Offerings (ICOs) lower the entry barrier to investing, cutting venture capitalists and investment banks out of the investing process and thereby democratizing it. While similar to Initial Public Offerings (IPOs), ICOs differ significantly in that they allow companies to interact directly with individual investors. This also poses a risk to investors, as the safeguards offered by investment banks for traditional IPOs do not apply. (See Lack of Governance and Regulatory Uncertainty.) The lack of regulatory bodies has also spurred the growth of scam ICOs. When an ICO or cryptocurrency does not have a legitimate strategy for generating value, it is typically a scam.

Still, broad accessibility has not yet been achieved as a result of a combination of factors including user knowledge-gaps, internet and computing requirements, and incompatibility between traditional banking systems and cryptocurrency fintech. For an understanding of the usability and risk side of cryptocurrency use, and the disproportionate risks marginalized groups face, see section on digital literacy and access requirements.

Anonymity and Censorship Resistance

The decentralized, peer-to-peer nature of cryptocurrencies may be of great comfort to those seeking anonymity, such as human rights defenders working in closed spaces or people simply seeking an equivalent to “cash” for online purchases (see the Cryptocurrencies in Volatile Markets case study, below). Cryptocurrencies can be useful for someone who wishes to donate anonymously to a foundation or organization when that donation could put them at risk if their identity were known, making them a powerful tool for activists. The anonymity of cryptocurrencies has also caused concern amongst advocacy groups who argue that, without open ledgers and tracking, crypto could be used by foreign illiberal actors to fund more authoritarian campaigns.

Since the data that supports the currency is distributed over a large network of computers, it is more difficult for a bad actor to locate and target a transaction or system operation. But a currency’s ability to protect anonymity largely depends on the specific goal of the cryptocurrency. Zcash, for example, was specifically developed to hide transaction amounts and user addresses from public view. Zcash has also played a role in allowing more charitable giving, and several charities tackling research, journalism, and climate change advocacy are powered by Zcash. Cryptocurrencies with a large number of participants are also resistant to more benign, routine system outages, because some data stores in the network can continue to operate if others fail or are breached.

Creating new governance systems

There have been few successful attempts at regulating cryptocurrency at the transnational level, with most governance frameworks remaining at the national level, if they exist at all. Therefore, there are substantial opportunities for international cooperation on crypto governance, and efforts to create multilateral networks and partnerships between the private and public sectors are growing. The Digital Currency Governance Consortium, for example, is composed of 80 organizations across the globe and helps to facilitate discussions around promoting competitiveness, financial stability and protections, and regulatory frameworks in regard to cryptocurrency.

Risks

User in the Philippines received transaction confirmation. Users purchase cryptocurrency with a credit card, debit card, bank account or through mining. Photo credit: Brooke Patterson/USAID.

The use of emerging technologies can also create risks in civil society programming. Read below on how to discern the possible dangers associated with cryptocurrencies in DRG work, as well as how to mitigate unintended – and intended – consequences.

Anonymity

While no central authority records cryptocurrency transactions, the public nature of the transactions does not prevent governments from recording them. An identity that can be associated with records on a blockchain is particularly a problem under totalitarian surveillance regimes. The central internet regulator in China, for example, proposed regulations that would require local blockchain companies to register users with their real names and national identification card numbers. In order to trade or exchange a cryptocurrency into an established fiat currency, a new digital currency would need to incorporate Know Your Customer (KYC), Anti-Money Laundering (AML), and Combating the Financing of Terrorism (CFT) regulations into its process for signing up new users and validating their identities. These processes pose a high barrier to undocumented migrants and anyone else not holding a valid government ID.

As described in the case study below, the partially anarchical environment of cryptocurrencies can also foster criminal activity.

Case Study: The Dark Side of the Anonymous User

Bitcoin and other cryptocurrencies are praised for supporting financial transactions that do not reveal a user’s identity. But this has made them popular on “dark web” sites like Silk Road, where cryptocurrency can be exchanged for illegal goods and services like drugs, weapons, or sex work. The Silk Road was eventually taken down by the U.S. Federal Bureau of Investigation, when its founder, Ross Ulbricht, used the same name to advertise the site and seek employees in another forum, linking to a Gmail address. Google provided the contents of that address to the authorities when subpoenaed.

The lessons to take from the Silk Road case are that anonymity is rarely perfect and unbreakable, and that cryptocurrency’s identity protection is not an ironclad guarantee. Law enforcement officials and governments have also tried to increase the regulatory tools at their disposal and international cooperation on crimes involving cryptocurrency. On a public blockchain, a single identity slip (even in some other forum) can tie all of the transactions of that cryptocurrency account to one user. The owner of that wallet can then be connected to their subsequent purchases, as easily as a cookie tracks a user’s web browsing activity.
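
The effect of a single identity slip on a public ledger can be sketched in a few lines. The addresses, purchase descriptions, and the leaked mapping below are hypothetical, and real blockchain analysis is considerably more involved; the sketch only shows that once one address is tied to a person, every public record under that address is tied to them too.

```python
# Hypothetical public transaction records: (address, purchase description).
public_transactions = [
    ("addr_7Qx", "purchase A"),
    ("addr_3Lm", "purchase B"),
    ("addr_7Qx", "purchase C"),
]

# A single leak, for example a forum post, that ties one address to a person.
leaked_identities = {"addr_7Qx": "known person"}


def link_activity(transactions, identities):
    """Group every transaction made from an address whose owner is known."""
    linked = {}
    for address, description in transactions:
        owner = identities.get(address)
        if owner is not None:
            linked.setdefault(owner, []).append(description)
    return linked


exposed = link_activity(public_transactions, leaked_identities)
```

Here both purchases made from `addr_7Qx` become attributable to one person, while the activity of `addr_3Lm` stays unlinked only because no identity has leaked for it yet.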

Lack of Governance

The lack of a central body greatly increases the risk of investing in a cryptocurrency. There is little to no recourse for users if the system is attacked digitally and their currency is stolen. In 2022, criminals hacked the FTX exchange and stole $415 million worth of cryptocurrency, one of the largest hacks in history, just hours after the company was rocked by an embezzlement scandal. The hack led government regulators to increase scrutiny of the sector, as users were left unable to recover much of the stolen funds.

Regulatory Uncertainty

The legal and regulatory frameworks for blockchain are developing at a slower pace than the technology. Each jurisdiction – whether a country or a financial zone, such as the 27 member states of the European Union – regulates cryptocurrencies differently, and there is yet to be a global financial standard for regulating them. The seven Arab nations bordering the Persian Gulf (Gulf States), for example, have enacted a number of different laws on cryptocurrencies: cryptocurrencies face an outright ban in the United Arab Emirates and Saudi Arabia. Other countries have developed tax laws, anti-money-laundering laws, and anti-terrorism laws to regulate cryptocurrencies. In many places, cryptocurrency is taxed as property, instead of as a currency.

Cryptocurrency’s commitment to autonomy – that is, its separation from a fiat currency – has pitted it against many established regulatory bodies. Observers note that eliminating the ability of intermediaries (e.g., governments or banks) to claim transaction fees, for example, alters existing power balances and may trigger prohibitive regulations even as it temporarily decreases financial costs. Thus, there is always a risk that governments will develop policies unfavorable to financial technologies (fintech), rendering cryptocurrency and mobile money useless within their borders. The constantly evolving nature of laws around fintech poses difficulties for any new digital currency.

Environmental Inefficiency

The larger a blockchain grows, the more computational power it requires. In late 2019, the University of Cambridge estimated that Bitcoin uses 0.55% of global electricity consumption. This consumption level is roughly equivalent to the electricity usage of a country such as Malaysia or Sweden.

Digital Literacy and Access Requirements

Blockchain technology underlying cryptocurrencies requires access to the Internet, so areas with inadequate infrastructure or capacity are not usable contexts for cryptocurrencies, although limited possibilities of using cryptocurrency without Internet access do exist. “This digital divide also extends to technological understanding between those who know how to ‘operate securely on the Internet, and those who do not’”, as noted by the DH Network. Cryptocurrency apps are not usable on lower-end devices, which means users need a smartphone or computer. The apps themselves involve a steep learning curve. Additionally, the slow speed of transactions, which can take minutes or up to an hour, is a significant disadvantage, especially when compared to standard Visa transactions, which take seconds. Lastly, using platforms like Bitcoin can be particularly tricky for groups with lower rates of digital literacy and for those with fewer resources, who are less financially resilient to the volatility of the crypto market. Given the lack of consumer protections and regulation of cryptocurrency in certain areas and the lack of awareness about the existing risks, lower-income users and investors are more likely to face negative financial consequences during market fluctuations. Recently, however, some countries, like Ghana and the Gambia, have begun launching government initiatives to bridge the divide on digital literacy and connect otherwise marginalized groups with the necessary tools to effectively use crypto and other forms of emerging tech.

Questions

If you are trying to understand the implications of cryptocurrencies in your work environment, or are considering using cryptocurrencies as part of your DRG programming, ask yourself these questions:

  1. Do the issues you or your organization are seeking to address require cryptocurrency? Can more traditional currency solutions apply to the problem?
  2. Is cryptocurrency an appropriate currency for the populations you are working with? Will it help them access the resources they need? Is it accepted by the other relevant stakeholders?
  3. Do you or your organization need an immutable database distributed across multiple servers? Would it be ok to have the currency and transactions connected to a central server?
  4. Is the cryptocurrency you wish to use viable? Do you trust the currency and have good reason to assume it will be sufficiently stable in the future?
  5. Is the currency legal in the areas where you will be operating? If not, will this pose problems for your organization?
  6. How will you obtain this currency? What risks are involved? What external actors will you be reliant on?
  7. Will the users of this currency be able to benefit from it easily and safely? Will they have the required devices and knowledge?

Case Studies

Mobile money agency in Ghana. The use of cryptocurrencies in the developing world can facilitate the development of small- to medium-sized enterprises looking to enter international trade. Photo credit: John O’Bryan/USAID.

Crypto is helping connect people in low-income countries to global markets

For many humanitarian actors, the ideal role for cryptocurrencies is to facilitate the transfer of remittances to families across borders. This is especially useful during conflicts when traditional banking systems may shut down. Cross-border transfers can be costly and subject to complicated regulations, but apps like Strike are helping to ease the process. Strike and Bitnob partnered to allow people in Kenya, Nigeria, and Ghana to easily receive instant payments from US bank accounts through the Bitcoin Lightning Network and convert payments to local currency. Bitcoin apps and other fintech are highly useful for upper-middle-class entrepreneurs in lower-income countries who are building international businesses through trade and online commerce, and emerging apps like Strike may help to bring banking accessibility to underbanked areas.

Using Crypto to increase accessibility in authoritarian regimes

Some human rights activists have argued that cryptocurrency has helped those in authoritarian regimes maintain financial ties to the outside world. Given the anonymity associated with transactions in cryptocurrency, the technology can offer opportunities for trade and transactions where they may not otherwise be possible. In China and Russia, for example, financial transactions that would normally be monitored by the state can be shielded from that monitoring by using cryptocurrency. Bitcoin and other platforms also offer ways for refugees and other persons without traditional forms of identity to access their finances. Conversely, critics have argued that various cryptocurrencies are often used to purchase black market goods, which often involve exploitative industries like drug and sex trafficking, or may be used by widely sanctioned countries like North Korea. Still, in situations where people may be cut off from traditional forms of banking, crypto may fill an important gap.

Cryptocurrencies in Volatile Markets

In recent years, countries with volatile markets have been slowly incorporating cryptocurrency in response to financial crises as citizens search for new options. Bitcoin has been used to purchase medicine and Amazon gift cards and to send remittances. Cryptocurrency has also been increasingly adopted at the institutional level. In January of 2023, two years after formally recognizing Bitcoin as legal currency, El Salvador introduced legislation to regulate it. Despite hopes that Bitcoin would ease the process of sending remittances and increase accessibility for underbanked people, widespread use of the currency has not caught on, as users cite high fees as a reason for avoiding it. Moreover, many still cite uncertainty and a lack of knowledge as reasons that they have not switched from traditional forms of banking and exchange. The introduction of Bitcoin has also worsened El Salvador’s credit rating and reportedly caused further division with the International Monetary Fund (IMF). Additionally, Bitcoin remains highly volatile, as its value depends on supply and demand rather than being pegged to an external asset, although the government in El Salvador has introduced legislation to regulate crypto exchanges.

Venezuela, which has faced unprecedented inflation, has also turned to crypto. Between August 2014 and November 2016, the number of Bitcoin users in Venezuela rose from 450 to 85,000. The financial crisis in the country has prompted many of its citizens to search for new options. There are no laws regulating Bitcoin in Venezuela, which has emboldened people further. Some countries whose financial markets have experienced rates of inflation similar to Venezuela’s, such as South Sudan, Zimbabwe, and Argentina, have relatively active cryptocurrency markets.

Cryptocurrencies for Social Impact

Many new cryptocurrencies have attempted to monetize the social impacts of their users. SolarCoin rewards people for installing solar panels. Tree Coin gathers resources for planting trees in the developing world (as one way to fight climate change) and rewards local people for maintaining those trees. Impak Coin is “the first app to reward and simplify responsible consumption” by helping users find socially responsible businesses. The coin it offered was intended to be used to buy products and services from these businesses, and to support users in microlending and crowdlending. It was part of an ecosystem of technologies that included ratings based on the UN’s Sustainable Development Goals and the Impact Management Project. True to its principles, Impak has proposed to begin assessing its impact. In the future, the impact of SolarCoin may be limited, as the value remains relatively low in comparison to set-up costs, potentially deterring people from using it more widely. In contrast, Treecoin may be having a more direct impact on local communities, as demonstrated in the mangrove restoration project.

Data Protection

What is data protection?

Data protection refers to practices, measures, and laws that aim to prevent certain information about a person from being collected, used, or shared in a way that is harmful to that person.

Interview with fisherman in Bone South Sulawesi, Indonesia. Data collectors must receive training on how to avoid bias during the data collection process. Photo credit: Indah Rufiati/MDPI – Courtesy of USAID Oceans.

Data protection isn’t new. Bad actors have always sought to gain access to individuals’ private records. Before the digital era, data protection meant protecting individuals’ private data from someone physically accessing, viewing, or taking files and documents. Data protection laws have been in existence for more than 40 years.

Now that many aspects of people’s lives have moved online, private, personal, and identifiable information is regularly shared with all sorts of private and public entities. Data protection seeks to ensure that this information is collected, stored, and maintained responsibly and that unintended consequences of using data are minimized or mitigated.

What are data?

Data refer to digital information, such as text messages, videos, clicks, digital fingerprints, a bitcoin, search history, and even mere cursor movements. Data can be stored on computers and mobile devices, in clouds, and on external drives. They can be shared via email, messaging apps, and file transfer tools. Your posts, likes and retweets, your videos about cats and protests, and everything else you share on social media are data.

Metadata are a subset of data. They are information stored within a document or file, an electronic fingerprint that describes the document or file itself. Let’s use an email as an example. If you send an email to your friend, the text of the email is data. The email itself, however, contains all sorts of metadata, like who created it, who the recipient is, the IP address of the author, the size of the email, etc.
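The email example can be made concrete with a short sketch. It uses Python's standard email module to separate the text of a message (the data) from its headers (the metadata). The sample message, the addresses, the IP address, and the function name split_data_and_metadata are all made up for illustration; real emails usually carry many more headers than shown here.

```python
from email import message_from_string

# Hypothetical sample message; the addresses and IP address are placeholders.
RAW_EMAIL = """\
From: Alice <alice@example.org>
To: Bob <bob@example.org>
Subject: Lunch on Friday?
Date: Mon, 06 Mar 2023 09:15:00 +0000
Received: from laptop.example.org ([203.0.113.7]) by mail.example.org; Mon, 06 Mar 2023 09:15:02 +0000

Hi Bob, are you free for lunch on Friday?
"""


def split_data_and_metadata(raw: str):
    """Return (body_text, metadata_dict) for a simple plain-text email."""
    msg = message_from_string(raw)
    # Every header is metadata: it describes the message rather than being the message.
    metadata = {name: value for name, value in msg.items()}
    # The size of the email is also metadata, even though it is not a header.
    metadata["Size-Bytes"] = len(raw.encode("utf-8"))
    return msg.get_payload(), metadata


body, metadata = split_data_and_metadata(RAW_EMAIL)
print("Data (the message text):", body.strip())
for name, value in metadata.items():
    print(f"Metadata {name}: {value}")
```

Even in this tiny example, the metadata reveal who wrote to whom, when, and from which network address, without anyone reading the message text.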

Large amounts of data get combined and stored together. These large files containing thousands or millions of individual files are known as datasets. Datasets then get combined into very large datasets. These very large datasets, referred to as big data, are used to train machine-learning systems.

Personal Data and Personally Identifiable Information

Data can seem to be quite abstract, but the pieces of information are very often reflective of the identities or behaviors of actual persons. Not all data require protection, but some data, even metadata, can reveal a lot about a person. Such data are referred to as Personally Identifiable Information (PII), also commonly called personal data. PII is information that can be used to distinguish or trace an individual’s identity, such as a name, passport number, or biometric data like fingerprints and facial patterns. PII is also information that is linked or linkable to an individual, such as date of birth and religion.

Personal data can be collected, analyzed and shared for the benefit of the persons involved, but they can also be used for harmful purposes. Personal data are valuable for many public and private actors. For example, they are collected by social media platforms and sold to advertising companies. They are collected by governments to serve law-enforcement purposes like the prosecution of crimes. Politicians value personal data to target voters with certain political information. Personal data can be monetized by people for criminal purposes such as selling false identities.

“Sharing data is a regular practice that is becoming increasingly ubiquitous as society moves online. Sharing data does not only bring users benefits, but is often also necessary to fulfill administrative duties or engage with today’s society. But this is not without risk. Your personal information reveals a lot about you, your thoughts, and your life, which is why it needs to be protected.”

Access Now’s ‘Creating a Data Protection Framework’, November 2018.

How does data protection relate to the right to privacy?

The right to protection of personal data is closely interconnected to, but distinct from, the right to privacy. The understanding of what “privacy” means varies from one country to another based on history, culture, or philosophical influences. Data protection is not always considered a right in itself. Read more about the differences between privacy and data protection here.

Data privacy is also a common way of speaking about sensitive data and the importance of protecting it against unintentional sharing and undue or illegal gathering and use of data about an individual or group. USAID’s Digital Strategy for 2020 – 2024 defines data privacy as ‘the right of an individual or group to maintain control over and confidentiality of information about themselves’.

How does data protection work?

Participant of the USAID WeMUNIZE program in Nigeria. Data protection must be considered for existing datasets as well. Photo credit: KC Nwakalor for USAID / Digital Development Communications

Personal data can and should be protected by measures that shield a person’s identity and other information from harm and that respect their right to privacy. Examples of such measures include determining which data are vulnerable based on privacy-risk assessments; keeping sensitive data offline; limiting who has access to certain data; anonymizing sensitive data; and only collecting necessary data.

There are several established principles and practices to protect sensitive data. In many countries, these measures are enforced via laws, which contain the key principles that are important to guarantee data protection.

“Data Protection laws seek to protect people’s data by providing individuals with rights over their data, imposing rules on the way in which companies and governments use data, and establishing regulators to enforce the laws.”

Privacy International on data protection

Several important terms and principles are outlined below, based on the European Union’s General Data Protection Regulation (GDPR).

  • Data Subject: any person whose personal data are being processed, such as added to a contacts database or to a mailing list for promotional emails.
  • Processing: any operation performed on personal data, whether manual or automated.
  • Data Controller: the actor that determines the purposes for, and means by which, personal data are processed.
  • Data Processor: the actor that processes personal data on behalf of the controller, often a third-party external to the controller, such as a party that offers mailing lists or survey services.
  • Informed Consent: individuals understand and agree that their personal data are collected, accessed, used, and/or shared and how they can withdraw their consent.
  • Purpose limitation: personal data are only collected for a specific and justified use and the data cannot be used for other purposes by other parties.
  • Data minimization: data collection is limited to essential details.
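As a rough illustration of purpose limitation and data minimization, the sketch below keeps only the fields that a stated purpose needs and discards everything else before storage. The purposes, field names, and the minimize function are hypothetical and are not taken from the GDPR or any specific tool; a real project would define them in its own data-protection plan.

```python
# Hypothetical purposes and the fields each one genuinely needs.
ALLOWED_FIELDS = {
    "newsletter": {"email"},
    "workshop_registration": {"name", "email", "dietary_needs"},
}


def minimize(record: dict, purpose: str) -> dict:
    """Keep only the fields that the stated purpose requires."""
    if purpose not in ALLOWED_FIELDS:
        # Purpose limitation: refuse to process data for an undocumented purpose.
        raise ValueError(f"No documented purpose named {purpose!r}")
    allowed = ALLOWED_FIELDS[purpose]
    return {key: value for key, value in record.items() if key in allowed}


# A sign-up form that asked for more than a mailing list needs.
submitted = {
    "name": "Sample Person",
    "email": "person@example.org",
    "date_of_birth": "1990-01-01",
    "religion": "not needed for a mailing list",
}

print(minimize(submitted, "newsletter"))
```

Filtering at the point of collection means that sensitive details such as date of birth or religion are never stored, so they cannot later be leaked, sold, or reused for another purpose.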

 

Healthcare provider in Eswatini. Quality data and protected datasets can accelerate impact in the public health sector. Photo credit: Ncamsile Maseko & Lindani Sifundza.

Access Now’s guide lists eight data-protection principles that come largely from international standards, in particular the Council of Europe Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data (widely known as Convention 108) and the Organisation for Economic Co-operation and Development (OECD) Privacy Guidelines. Countries that have ratified international data protection frameworks consider these principles to be “minimum standards” for the protection of fundamental rights.

A development project that uses data, whether establishing a mailing list or analyzing datasets, should comply with laws on data protection. When there is no national legal framework, international principles, norms, and standards can serve as a baseline to achieve the same level of protection of data and people. Compliance with these principles may seem burdensome, but implementing a few steps related to data protection from the beginning of the project will help to achieve the intended results without putting people at risk.

The figure above shows how common practices of civil society organizations relate to the terms and principles of the data protection framework of laws and norms.

The European Union’s General Data Protection Regulation (GDPR)

The data protection law in the EU, the GDPR, went into effect in 2018. It is often considered the world’s strongest data protection law. The law aims to enhance how people can access their information and limits what organizations can do with the personal data of people in the EU. Although it comes from the EU, the GDPR can also apply to organizations based outside the region when they handle the data of people in the EU. The GDPR, therefore, has a global impact.

The obligations stemming from the GDPR and other data protection laws may have broad implications for civil society organizations. For information about the GDPR compliance process and other resources, see the European Center for Not-for-Profit Law’s guide on data-protection standards for civil society organizations.

Notwithstanding its protections, the GDPR also has been used to harass CSOs and journalists. For example, a mining company used a provision of the GDPR to try to force Global Witness to disclose sources it used in an anti-mining campaign. Global Witness successfully resisted these attempts.

Personal or organizational protection tactics

How to protect your own sensitive information or the data of your organization will depend on your specific situation in terms of activities and legal environment. The first step is to assess your specific needs in terms of security and data protection. For example, which information could, in the wrong hands, have negative consequences for you and your organization?

Digital-security specialists have developed online resources you can use to protect yourself. One example is the Security Planner, an easy-to-use guide with expert-reviewed advice for staying safer online and recommendations on implementing basic online practices. Another is the Digital Safety Manual, which offers information and practical tips on enhancing digital security for government officials working with civil society and Human Rights Defenders (HRDs). The manual offers 12 cards tailored to various common activities in the collaboration between governments (and other partners) and civil society organizations. The first card helps to assess digital security.

The Digital First Aid Kit is a free resource for rapid responders, digital security trainers, and tech-savvy activists to better protect themselves and the communities they support against the most common types of digital emergencies. Global digital safety responders and mentors can help with specific questions or mentorship, for example, The Digital Defenders Partnership and the Computer Incident Response Centre for Civil Society (CiviCERT).

How is data protection relevant in civic space and for democracy?

Many initiatives that aim to strengthen civic space or improve democracy use digital technology. There is a widespread belief that the increasing volume of data and the tools to process them can be used for good. And indeed, integrating digital technology and the use of data in democracy, human rights, and governance programming can have significant benefits; for example, they can connect communities around the globe, reach underserved populations better, and help mitigate inequality.

“Within social change work, there is usually a stark power asymmetry. From humanitarian work, to campaigning, documenting human rights violations to movement building, advocacy organisations are often led by – and work with – vulnerable or marginalised communities. We often approach social change work through a critical lens, prioritising how to mitigate power asymmetries. We believe we need to do the same thing when it comes to the data we work with – question it, understand its limitations, and learn from it in responsible ways.”

What is Responsible Data?

When quality information is available to the right people when they need it, the data are protected against misuse, and the project is designed with the protection of its users in mind, it can accelerate impact.

  • USAID’s funding of improved vineyard inspection using drones and GIS data in Moldova allows farmers to quickly inspect, identify, and isolate vines infected by a phytoplasma disease of the vine.
  • Círculo is a digital tool for female journalists in Mexico to help them create strong networks of support, strengthen their safety protocols and meet needs related to the protection of themselves and their data. The tool was developed with the end-users through chat groups and in-person workshops to make sure everything built into the app was something they needed and could trust.

At the same time, data-driven development brings a new responsibility to prevent misuse of data when designing, implementing, or monitoring development projects. When the use of personal data is a means to identify people who are eligible for humanitarian services, privacy and security concerns are very real.

  • Refugee camps in Jordan have required community members to allow scans of their irises to purchase food and supplies and take out cash from ATMs. This practice has not integrated meaningful ways to ask for consent or allow people to opt out. Additionally, the use and collection of highly sensitive personal data like biometrics to enable daily purchases is disproportionate, because other less personal digital technologies are available and used in many parts of the world.

Governments, international organizations, and private actors can all – even unintentionally – misuse personal data for other purposes than intended, negatively affecting the well-being of the people related to that data. Some examples have been highlighted by Privacy International:

  • The case of Tullow Oil, the largest oil and gas exploration and production company in Africa, shows how a private actor considered extensive and detailed research by a micro-targeting research company into the behaviors of local communities in order to get ‘cognitive and emotional strategies to influence and modify Turkana attitudes and behavior’ to Tullow Oil’s advantage.
  • In Ghana, the Ministry of Health commissioned a large study on health practices and requirements in Ghana. This resulted in an order from the ruling political party to model future vote distribution within each constituency based on how respondents said they would vote, and a negative campaign trying to get opposition supporters not to vote.

There are resources and experts available to help with this process. The Principles for Digital Development website offers recommendations, tips, and resources to protect privacy and security throughout a project lifecycle, such as the analysis and planning stage, for designing and developing projects and when deploying and implementing. Measurement and evaluation are also covered. The Responsible Data website offers the Illustrated Hand-Book of the Modern Development Specialist with attractive, understandable guidance through all steps of a data-driven development project: designing it, managing data, with specific information about collecting, understanding and sharing it, and closing a project.

NGO worker prepares for data collection in Buru Maluku, Indonesia. When collecting new data, it’s important to design the process carefully and think through how it affects the individuals involved. Photo credit: Indah Rufiati/MDPI – Courtesy of USAID Oceans.

Opportunities

Data protection measures further democracy, human rights, and governance issues. Read below to learn how to more effectively and safely think about data protection in your work.

Privacy respected and people protected

Implementing data-protection standards in development projects protects people against potential harm from abuse of their data. Abuse happens when an individual, company, or government accesses personal data and uses them for purposes other than those for which the data were collected. Intelligence services and law enforcement authorities often have legal and technical means to force access to datasets and abuse the data. Individuals hired by governments can access datasets by hacking the security of software or clouds. This has often led to intimidation, silencing, and arrests of human rights defenders and civil society leaders criticizing their government. Privacy International maps examples of governments and private actors abusing individuals’ data.

Strong protective measures against data abuse ensure respect for the fundamental right to privacy of the people whose data are collected and used. Protective measures allow positive development such as improving official statistics, better service delivery, targeted early warning mechanisms, and effective disaster response.

It is important to determine how data are protected throughout the entire life cycle of a project. Individuals should also be assured of protection after the project ends, either abruptly or as intended, when the project moves into a different phase, or when it receives funding from different sources. Oxfam has developed a leaflet to help anyone handling, sharing, or accessing program data to properly consider responsible data issues throughout the data lifecycle, from making a plan to disposing of data.

Risks

The collection and use of data can also create risks in civil society programming. Read below on how to discern the possible dangers associated with the collection and use of data in democracy, human rights, and governance (DRG) work, as well as how to mitigate unintended and intended consequences.

Unauthorized access to data

Data need to be stored somewhere: on a computer or an external drive, in a cloud, or on a local server. Wherever the data are stored, precautions need to be taken to protect the data from unauthorized access and to avoid revealing the identities of vulnerable persons. The level of protection that is needed depends on the sensitivity of the data, i.e. the extent to which there could be negative consequences if the information fell into the wrong hands.

Data can be stored on a nearby and well-protected server that is connected to drives with strong encryption and very limited access, which is one way to stay in control of the data you own. The free versions of cloud services offered by well-known tech companies often provide only basic protection measures and wide access to the dataset. More advanced security features, such as storage of data in certain jurisdictions with data-protection legislation, are available for paying customers. The guidelines on how to secure private data stored and accessed in the cloud can help you understand various aspects of cloud storage and decide what suits your specific situation.

Every system needs to be secured against cyberattacks and manipulation. One common challenge is finding a way to protect identities in the dataset, for example, by removing all information that could identify individuals from the data, i.e. anonymizing it. Proper anonymization is of key importance and harder than often assumed.
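The sketch below illustrates why anonymization is harder than simply deleting names. It replaces names with keyed hashes (which is pseudonymization, not full anonymization), coarsens age into bands, and then counts how many records share the same remaining attributes, a check often called k-anonymity. The records, field names, key, and threshold are all invented for illustration, and passing such a check does not by itself guarantee that people cannot be re-identified.

```python
import hashlib
import hmac
from collections import Counter

# Assumption: in real use this key is random and stored separately from the data.
SECRET_KEY = b"replace-with-a-random-secret"


def pseudonymize(identifier: str) -> str:
    """Replace a direct identifier with a keyed hash (pseudonymization only)."""
    digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def generalize(record: dict) -> dict:
    """Drop the name and coarsen the age into a ten-year band."""
    decade = (record["age"] // 10) * 10
    return {
        "pid": pseudonymize(record["name"]),
        "age_band": f"{decade}-{decade + 9}",
        "district": record["district"],
    }


def smallest_group(records: list) -> int:
    """Size of the smallest group sharing the same age band and district."""
    groups = Counter((r["age_band"], r["district"]) for r in records)
    return min(groups.values())


# Made-up records for demonstration.
people = [
    {"name": "Person A", "age": 34, "district": "North"},
    {"name": "Person B", "age": 37, "district": "North"},
    {"name": "Person C", "age": 31, "district": "North"},
    {"name": "Person D", "age": 62, "district": "South"},
]

released = [generalize(p) for p in people]
k = smallest_group(released)
print("Smallest group size:", k)
if k < 3:
    print("Warning: at least one person is nearly unique and may be re-identifiable.")
```

Here the one person in the South district remains unique even though no name appears in the released data, so anyone who knows roughly how old they are and where they live could single them out.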

One can imagine that a dataset of GPS locations of People Living with Albinism across Uganda requires strong protection. Persecution is based on the belief that certain body parts of people with albinism can transmit magical powers, or that they are presumed to be cursed and bring bad luck. A spatial-profiling project mapping the exact location of individuals belonging to a vulnerable group can improve outreach and delivery of support services to them. However, hacking of the database or other unlawful access to their personal data might put them at risk of people wanting to exploit or harm them.

One could also imagine that the people operating an alternative system to send out warning sirens for air strikes in Syria run the risk of being targeted by authorities. While data collection and sharing by this group aims to prevent death and injury, it diminishes the impact of air strikes by the Syrian authorities. The location data of the individuals running and contributing to the system needs to be protected against access or exposure.

Another risk is that private actors who run or cooperate in data-driven projects could be tempted to sell data if they are offered large sums of money. Such buyers could be advertising companies or politicians that aim to target commercial or political campaigns at specific people.

The Tiko system designed by social enterprise Triggerise rewards young people for positive health-seeking behaviors, such as visiting pharmacies and seeking information online. Among other things, the system gathers and stores sensitive personal and health information about young female subscribers who use the platform to seek guidance on contraceptives and safe abortions, and it tracks their visits to local clinics. If these data are not protected, governments that have criminalized abortion could potentially access and use that data to carry out law-enforcement actions against pregnant women and medical providers.

Unsafe collection of data

When you are planning to collect new data, it is important to carefully design the collection process and think through how it affects the individuals involved. It should be clear from the start what kind of data will be collected, for what purpose, and that the people involved agree with that purpose. For example, an effort to map people with disabilities in a specific city can improve services. However, the database should not expose these people to risks, such as attacks or stigmatization that can be targeted at specific homes. Also, establishing this database should answer to the needs of the people involved and not be driven by the mere wish to use data. For further guidance, see the chapter Getting Data in the Hand-book of the Modern Development Specialist and the OHCHR Guidance to adopt a Human Rights Based Approach to Data, focused on collection and disaggregation.

If data are collected in person by people recruited for this process, proper training is required. They need to be able to create a safe space to obtain informed consent from people whose data are being collected and know how to avoid bias during the data-collection process.

Unknowns in existing datasets

Data-driven initiatives can either gather new data, for example through a survey of students and teachers in a school, or use existing datasets from secondary sources, for example by using a government census or scraping social media sources. Data protection must also be considered when you plan to use existing datasets, such as images of the Earth for spatial mapping. You need to analyze what kind of data you want to use and whether it is necessary to use a specific dataset to reach your objective. For third-party datasets, it is important to gain insight into how the data that you want to use were obtained, whether the principles of data protection were met during the collection phase, who licensed the data, and who funded the process. If you are not able to get this information, you must carefully consider whether to use the data or not. See the Hand-book of the Modern Development Specialist on working with existing data.

Benefits of cloud storage

A trusted cloud-storage strategy offers greater security and ease of implementation compared to securing your own server. While determined adversaries can still hack into individual computers or local servers, it is significantly more challenging for them to breach the robust security defenses of reputable cloud storage providers like Google or Microsoft. These companies deploy extensive security resources and have a strong business incentive to ensure maximum protection for their users. By relying on cloud storage, common risks such as physical theft, device damage, or malware can be mitigated since most documents and data are securely stored in the cloud. In case of incidents, it is convenient to resynchronize and resume operations on a new or cleaned computer, with little to no valuable information accessible locally.

Backing up data

Regardless of whether data is stored on physical devices or in the cloud, having a backup is crucial. Physical device storage carries the risk of data loss due to various incidents such as hardware damage, ransomware attacks, or theft. Cloud storage provides an advantage in this regard as it eliminates the reliance on specific devices that can be compromised or lost. Built-in backup solutions like Time Machine for Macs and File History for Windows devices, as well as automatic cloud backups for iPhones and Androids, offer some level of protection. However, even with cloud storage, the risk of human error remains, making it advisable to consider additional cloud backup solutions like Backupify or SpinOne Backup. For organizations using local servers and devices, secure backups become even more critical. It is recommended to encrypt external hard drives using strong passwords, utilize encryption tools like VeraCrypt or BitLocker, and keep backup devices in a separate location from the primary devices. Storing a copy in a highly secure location, such as a safe deposit box, can provide an extra layer of protection in case of disasters that affect both computers and their backups.
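A backup is only useful if the copy is intact. As a minimal sketch of that idea, the code below copies a file to a backup folder and compares SHA-256 checksums of the original and the copy. The file names and the function names sha256_of and backup_and_verify are hypothetical, and the sketch does not encrypt anything; encryption would still be handled by tools such as VeraCrypt or BitLocker, as recommended above.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and confirm the copy is byte-identical."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target


# Demonstration with temporary, made-up files.
with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "survey_responses.csv"
    original.write_text("id,answer\n1,yes\n", encoding="utf-8")
    copy = backup_and_verify(original, Path(workdir) / "backup")
    backup_ok = sha256_of(copy) == sha256_of(original)
    print("Backup verified:", backup_ok)
```

In practice the backup folder would sit on a separate, encrypted drive kept in a different location from the primary device.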

Questions

If you are trying to understand the implications of lacking data protection measures in your work environment, or are considering using data as part of your DRG programming, ask yourself these questions:

  1. Are data protection laws adopted in the country or countries concerned? Are these laws aligned with international human rights law, including provisions protecting the right to privacy?
  2. How will the use of data in your project comply with data protection and privacy standards?
  3. What kind of data do you plan to use? Are personal or other sensitive data involved?
  4. What could happen to the persons related to that data if the government accesses these data?
  5. What could happen if the data are sold to a private actor for other purposes than intended?
  6. What precaution and mitigation measures are taken to protect the data and the individuals related to the data?
  7. How is the data protected against manipulation and access and misuse by third parties?
  8. Do you have sufficient expertise integrated during the entire course of the project to make sure that data are handled well?
  9. If you plan to collect data, what is the purpose of the collection of data? Is data collection necessary to reach this purpose?
  10. How are collectors of personal data trained? How is informed consent generated when data are collected?
  11. If you are creating or using databases, how is the anonymity of the individuals related to the data guaranteed?
  12. How is the data that you plan to use obtained and stored? Is the level of protection appropriate to the sensitivity of the data?
  13. Who has access to the data? What measures are taken to guarantee that data are accessed for the intended purpose?
  14. Which other entities – companies, partners – process, analyze, visualize, and otherwise use the data in your project? What measures are taken by them to protect the data? Have agreements been made with them to avoid monetization or misuse?
  15. If you build a platform, how are the registered users of your platform protected?
  16. Can the database, the data storage system, or the platform be audited by independent researchers?

Case Studies

People Living with HIV Stigma Index and Implementation Brief

The People Living with HIV Stigma Index is a standardized questionnaire and sampling strategy to gather critical data on intersecting stigmas and discrimination affecting people living with HIV. It monitors HIV-related stigma and discrimination in various countries and provides evidence for advocacy in those countries. The data in this project are the experiences of people living with HIV. The implementation brief provides insight into data protection measures. People living with HIV are at the center of the entire process, continuously linking the data that are collected about them to the people themselves, starting from research design, through implementation, to using the findings for advocacy. Data are gathered through a peer-to-peer interview process, with people living with HIV from diverse backgrounds serving as trained interviewers. A standard implementation methodology has been developed, including the establishment of a steering committee with key stakeholders and population groups.

RNW Media’s Love Matters Program Data Protection

RNW Media’s Love Matters Program offers online platforms to foster discussion and information-sharing on love, sex and relationships to 18-30 year-olds in areas where information on sexual and reproductive health and rights (SRHR) is censored or taboo. RNW Media’s digital teams introduced creative approaches to data processing and analysis, Social Listening methodologies and Natural Language Processing techniques to make the platforms more inclusive, create targeted content, and identify influencers and trending topics. Governments have imposed restrictions such as license fees or registrations for online influencers as a way of monitoring and blocking “undesirable” content, and RNW Media has invested in security of its platforms and literacy of the users to protect them from access to their sensitive personal information. Read more in the publication ‘33 Showcases – Digitalisation and Development – Inspiration from Dutch development cooperation’, Dutch Ministry of Foreign Affairs, 2019, p 12-14.

Amnesty International Report

Thousands of democracy and human rights activists and organizations rely on secure communication channels every day to maintain the confidentiality of conversations in challenging political environments. Without such security practices, sensitive messages can be intercepted and used by authorities to target activists and break up protests. One prominent and well-documented example of this occurred in the aftermath of the 2010 elections in Belarus. As detailed in this Amnesty International report, phone recordings and other unencrypted communications were intercepted by the government and used in court against prominent opposition politicians and activists, many of whom spent years in prison. In 2020, another swell of post-election protests in Belarus saw thousands of protestors adopt user-friendly, secure messaging apps that were not as readily available just 10 years prior to protect their sensitive communications.

Norway Parliament Data

The Storting, Norway’s parliament, has experienced another cyberattack that involved the exploitation of recently disclosed vulnerabilities in Microsoft Exchange. These vulnerabilities, known as ProxyLogon, were addressed by emergency security updates released by Microsoft. The initial attacks were attributed to a state-sponsored hacking group from China called HAFNIUM, which utilized the vulnerabilities to compromise servers, establish backdoor web shells, and gain unauthorized access to internal networks of various organizations. The repeated cyberattacks on the Storting and the involvement of various hacking groups underscore the importance of data protection, timely security updates, and proactive measures to mitigate cyber risks. Organizations must remain vigilant, stay informed about the latest vulnerabilities, and take appropriate actions to safeguard their systems and data.

Girl Effect

Girl Effect, a creative non-profit working where girls are marginalized and vulnerable, uses media and mobile tech to empower girls. The organization embraces digital tools and interventions and acknowledges that any organisation that uses data also has a responsibility to protect the people it talks to or connects online. Their ‘Digital safeguarding tips and guidance’ provides in-depth guidance on implementing data protection measures while working with vulnerable people. Referring to Girl Effect as inspiration, Oxfam has developed and implemented a Responsible Data Policy and shares many supporting resources online. The publication ‘Privacy and data security under GDPR for quantitative impact evaluation’ provides detailed considerations of the data protection measures Oxfam implements while doing quantitative impact evaluation through digital and paper-based surveys and interviews.

Generative AI

What is Generative AI?

Generative artificial intelligence (GenAI) refers to a class of artificial-intelligence techniques and models that creates new, original content based on data on which the models were trained. The output can be text, images, or videos that reflect or respond to the input. Much as artificial intelligence applications can span many industries, so too can GenAI. Many of these applications are in the area of art and creativity, as GenAI can be used to create art, music, video games, and poetry based on the patterns observed in training data. But its learning of language also makes it well suited to facilitate communication, for example, as chatbots or conversational agents that can simulate human-like conversations, language translation, realistic speech synthesis or text-to-speech. These are just a few examples. This article elaborates on the ways in which GenAI offers both opportunities and risks in civic space and to democracy and what government institutions, international organizations, activists, and civil society organizations can do to capitalize on the opportunities and guard against the risks.

How does GenAI work?

At the core of GenAI are generative models, which are algorithms or information architectures designed to learn the underlying patterns and statistics of training data. Having captured those patterns, the models can produce new outputs that resemble the training data and, in a statistical sense, belong to the same distribution.
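To make the idea of "learning a distribution and sampling from it" concrete, here is a deliberately tiny sketch. It assumes the training data are a handful of numbers that roughly follow a bell curve; real generative models learn far richer patterns with neural networks. The function names and the sample values are invented for illustration only.

```python
import random
import statistics

def fit_gaussian(training_data):
    """'Learn' the data by estimating its mean and standard deviation."""
    return statistics.mean(training_data), statistics.stdev(training_data)

def generate_samples(mean, stdev, n, seed=None):
    """Draw n new values from the learned bell curve."""
    rng = random.Random(seed)  # a seed makes the sketch repeatable
    return [rng.gauss(mean, stdev) for _ in range(n)]

# Hypothetical training data (illustrative values, not from any real dataset).
training_values = [158.0, 162.5, 165.0, 170.2, 171.8, 175.0, 180.4]

mean, stdev = fit_gaussian(training_values)
new_samples = generate_samples(mean, stdev, n=5, seed=0)
print(new_samples)  # new values that resemble, but do not copy, the training data
```

The generated values are not copies of the training data, but they cluster around the same center with a similar spread, which is the sense in which the outputs "belong to the same distribution."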

Steps of the GenAI Process

As the figure above illustrates, GenAI models are developed through a process by which a curated database is used to train neural networks with machine learning techniques. These networks learn to identify patterns in the data, which allows them to generate new content or make predictions based on the learned information. From there, users can input commands in the form of words, numbers, or images into these algorithmic models, and the model produces content that responds based on the input and the patterns learned from the training data. As they are trained on ever-larger datasets, the GenAI models gain a broader range of possible content they can generate across different media, from audio to images and text.

Until recently, GenAI simply mimicked the style and substance of the input. For example, someone could input a snippet of a poem or news article into a model, and the model would output a complete poem or news article that sounded like the original content. An example of what this looks like in the linguistics field that you may have seen in your own email is predictive language along the lines of a Google Smart Compose that completes a sentence based on a combination of the initial words you use and the probabilistic expectation of what could follow. For example, a machine studying billions of words from datasets would generate a probabilistic expectation of a sentence that starts with “please come ___.” In 95% of cases, the machine might have seen “here” as the next word, in 3% of cases “with me” and in 2% of cases “soon.” Thus, when completing sentences or generating outputs, the algorithm that learned the language would use the sentence structure and combination of words that it had seen previously. Because the models are probabilistic, they might sometimes make errors that do not reflect the nuanced intentions of the input.
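The "please come ___" example can be sketched in a few lines of code. This is a minimal illustration that simply counts which word follows a two-word context in a toy corpus; it is not how Smart Compose or any production model is built. The corpus is invented so that the counts match the percentages above, and because the sketch predicts one word at a time, "with me" appears as "with".

```python
from collections import Counter, defaultdict

def train_next_word_model(sentences, context_size=2):
    """Count which word follows each run of `context_size` words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - context_size):
            context = tuple(words[i:i + context_size])
            counts[context][words[i + context_size]] += 1
    return counts

def next_word_probabilities(counts, context):
    """Convert the raw counts for one context into probabilities."""
    followers = counts.get(tuple(context), Counter())
    total = sum(followers.values())
    return {word: n / total for word, n in followers.items()}

# Toy corpus, invented to mirror the 95% / 3% / 2% split in the text.
corpus = (["please come here"] * 95
          + ["please come with me"] * 3
          + ["please come soon"] * 2)

model = train_next_word_model(corpus)
probs = next_word_probabilities(model, ["please", "come"])
print(probs)                       # {'here': 0.95, 'with': 0.03, 'soon': 0.02}
print(max(probs, key=probs.get))   # here
```

A system that always picks the most likely word would complete the sentence with "here," even when the writer meant "soon," which is one way probabilistic models can miss the intention behind an input.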

GenAI now has far more expansive capabilities. Far beyond text, GenAI is also a tool for producing images from text. For example, tools such as DALL-E, Stable Diffusion, and Midjourney allow a user to input text descriptions that the model then uses to produce a corresponding image. These images vary in their realism: some look like a science fiction scene, others like a painting, and others like a photograph. These tools are also constantly improving, so the boundaries of what can be achieved with text-to-image generation continue to expand.

Conversational AI

Recent models have learned not only language patterns but also factual information about politics, society, and economics. They are also able to take input commands from images and voice, further expanding their versatility and utility in various applications.

Consumer-facing models that simulate human conversation–“conversational AI”–have proliferated recently and operate more as chatbots, responding to queries and questions, much in the way that a search engine would function. Some examples include asking the model to answer any of the following:

  • Provide a photo of a political leader playing a ukulele in the style of Salvador Dali.
  • Talk about Kenya’s capital, form of government, or character, or about the history of decolonization in South Asia.
  • Write and perform a song about adolescence that mimics a Drake song.

In other words, these newer models may function like a blend between a Google search and an exchange with a knowledgeable individual about their area of expertise. Much like a socially attentive individual, these models can be taught during a conversation. If you ask a question about the best restaurants in Manila, and the chatbot responds with a list that includes some Continental European restaurants, you can then follow up and express a preference for Filipino restaurants, which will prompt the chatbot to tailor its output to your specific preferences. The model learns based on feedback, although models such as ChatGPT will be quick to point out that they are trained only on data up to a certain date, which means some restaurants they recommend will have gone out of business and some award-winning restaurants that have since cropped up will be missing. The example highlights a fundamental tension between up-to-date models or content and the ability to refine models. If we try to have models learn from information as it is produced, those models will generate up-to-date answers but will not be able to filter outputs for bad information, hate speech, or conspiracy theories.

Definitions

GenAI involves several key concepts:

Generative Models: Generative models are a class of machine learning models designed to create or generate new data outputs that resemble a given set of training data. These models learn underlying patterns and structures from the training data and use that knowledge to generate new, similar data outputs.

ChatGPT: ChatGPT is a conversational application developed by OpenAI and built on Generative Pre-trained Transformer (GPT) models. While researchers had developed and used language models for decades, ChatGPT was among the first language models to reach a mass consumer audience. Trained to understand and produce human-like text in a dialogue setting, it was specifically designed for generating conversational responses and engaging in interactive text-based conversations. As such, it is well-suited for creating chatbots, virtual assistants, and other conversational AI applications.

Neural Network: A neural network is a computational model loosely inspired by the brain’s interconnected neurons. It is the building block of deep learning: each artificial neuron performs a simple calculation on its inputs, and the strength of the connections (weights) between neurons determines how information flows through the network and influences the output.
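As a rough illustration of how weights shape a neuron's output, the sketch below computes a single artificial neuron: a weighted sum of inputs passed through a squashing function (here a sigmoid, chosen as one common option). The input values and weights are arbitrary assumptions for demonstration; real networks stack many such neurons and learn their weights from training data.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs, then the activation function."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(total)

inputs = [1.0, 0.5]  # arbitrary example inputs

weak = neuron_output(inputs, weights=[0.1, 0.1])    # weak connections
strong = neuron_output(inputs, weights=[2.0, 2.0])  # strong connections

print(weak, strong)  # the stronger weights push the output closer to 1
```

The same inputs yield different outputs depending only on the weights, which is why training a network amounts to adjusting those weights.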

Training Data: Training data are the data used to train generative models. These data are crucial since the model learns patterns and structures from these data to create new content. For example, in the context of text generation, training data would consist of a large collection of text documents, sentences, or paragraphs. The quality and diversity of the training data have a significant impact on the performance of the GenAI model because it helps the model generate more relevant content.

Hallucination: In the context of GenAI, the term “hallucination” refers to a phenomenon where the AI model produces outputs that are not grounded in reality or accurate representations of the input data. In other words, the AI generates content that seems real but is in fact fabricated and has no basis in the data on which it was trained. For instance, a language model might produce paragraphs of text that seem coherent and factual but, upon closer inspection, might include false information, events that never happened, or connections between concepts that are logically flawed. The problem can result from noise or gaps in the training data and from the probabilistic way in which models generate text. Addressing and minimizing hallucinations in GenAI is an ongoing research challenge. Researchers and developers strive to improve the models’ understanding of context, coherence, and factual accuracy to reduce the likelihood of generating content that can be considered hallucinatory.

Prompt: A GenAI prompt is a specific input or instruction provided to a GenAI model to guide it in producing a desired output. In image generation, a prompt might involve specifying the style, content, or other attributes you want the generated image to have. The quality and relevance of the generated output often depend on the clarity and specificity of the prompt. A well-crafted prompt can lead to more accurate and desirable generated content.

Evaluation Metrics: Evaluating the quality of outputs from GenAI models can be challenging, but several evaluation metrics have been developed to assess various aspects of generated content. Metrics like Inception Score, Frechet Inception Distance (FID), and Perceptual Path Length (PPL) attempt to measure aspects of model performance such as the diversity of outputs (so that they do not all look or sound like copies of each other), relevance (so that outputs are on topic), and coherence (so that outputs are internally consistent).

Prompt Engineering: Prompt engineering is the process of designing and refining prompts or instructions given to GenAI systems, such as chatbots or language models like GPT-3.5, to elicit specific and desired responses. It involves crafting the input text or query in such a way that the model generates outputs that align with the user’s intent or the desired task. It is useful for optimizing the benefits of GenAI but requires a deep understanding of the model’s behavior and capabilities as well as the specific requirements of the application or task. Well-crafted prompts can enhance the user experience by ensuring that the models provide valuable and accurate responses.

How is GenAI relevant in civic space and for democracy?

The rapid development and diffusion of GenAI technologies–across medicine, environmental sustainability, politics, and journalism, among many other fields–is creating and will create enormous opportunities. GenAI is being used for drug discovery, molecule design, medical-imaging analysis, and personalized treatment recommendations. It is being used to model and simulate ecosystems, predict environmental changes, and devise conservation strategies. It offers more accessible answers about bureaucratic procedures so citizens better understand their government, which is a fundamental change to how citizens access information and how governments operate. It is supporting the generation of written content such as articles, reports, and advertisements.

Across all of these sectors, GenAI also introduces potential risks. Governments, working with the private sector and civil society organizations, are taking different approaches to balancing capitalizing on the opportunities while guarding against the risks, reflecting different philosophies about risk and the role of innovation in their respective economies and different legal precedents and political landscapes across countries. Many of the pioneering efforts are taking place in the countries where AI is being used most, such as in the United States or countries in the European Union, or in tech-heavy countries such as China. Conversations about regulation in other countries have lagged. In Africa, for example, experts at the Africa Tech Week conference in spring 2023 expressed concern about the lag in Africa’s access to AI and the need to catch up to reap the benefits of AI in the economy, medicine, and society, though they also gestured toward privacy issues and the importance of diversity in AI research teams to guard against bias. These conversations suggest that both access and regulation are developing at different rates across different contexts, and those regions developing and testing regulations now may be role models or at least provide lessons learned for other countries as they regulate.

The European Union has moved quickly to regulate AI, using a tiered, risk-based approach that designates some types of “high risk uses” as prohibited. GenAI systems that do not have risk-assessment and -mitigation plans, clear information for users, explainability, activity logging, and other requirements are considered high risk. Most GenAI systems would not meet those standards, according to a 2021 Stanford University study. However, executives from 150 European companies have collectively pushed back against aggressive regulation, suggesting that overly stringent AI regulation will incentivize companies to establish headquarters outside of Europe and stifle innovation and economic development in the region. An open letter acknowledges that some regulation may be warranted but that GenAI will be “decisive” and “powerful” and that “Europe cannot afford to stay on the sidelines.”

China has been one of the most aggressive countries when it comes to AI regulation. The Cyberspace Administration of China requires that AI be transparent, unbiased, and not used for generating misinformation or social unrest. Existing rules highly regulate deepfakes, that is, synthetic media in which a person’s likeness, including their face and voice, is replaced with someone else’s likeness, typically using AI. Any service provider that uses content produced by GenAI must also obtain consent from deepfake subjects, label outputs, and then counter any misinformation. However, enacting such regulations does not mean that state actors will not use AI for malicious purposes or for influence operations themselves, as we discuss below.

The United States has held a number of hearings to better understand the technology and its impact on democracy, but by September 2023 had not put in place any significant legislation to regulate GenAI. The Federal Trade Commission, responsible for promoting consumer protection, issued a 20-page letter to OpenAI, the creator of ChatGPT, requesting responses to its concerns about consumer privacy and security. In addition, the US government has worked with the major GenAI firms to establish voluntary transparency and safety safeguards as the risks and benefits of the technology evolve.

Going beyond regional or country-level regulatory initiatives, the UN Secretary-General, António Guterres, has advocated for transparency, accountability, and oversight of AI. Mr. Guterres observed: “The international community has a long history of responding to new technologies with the potential to disrupt our societies and economies. We have come together at the United Nations to set new international rules, sign new treaties and establish new global agencies. While many countries have called for different measures and initiatives around the governance of AI, this requires a universal approach.” The statement gestures toward the fact that digital space does not know boundaries and that the software technologies innovated in one country will inevitably cross over to others, suggesting that meaningful norms or constraints on GenAI will likely require a coordinated, international approach. To that end, some researchers have proposed an international artificial intelligence organization that would help certify compliance with international standards on AI safety, which also acknowledges the inherently international nature of AI development and deployment.

Opportunities

Enhancing Representation

One of the main challenges in democracy and for civil society is ensuring that constituent voices are heard and represented, which in part involves citizens themselves participating in the democratic process. GenAI may be useful in providing both policymakers and citizens a way to communicate more efficiently and enhance trust in institutions. Another avenue for enhancing representation is for GenAI to provide data that allow researchers and policymakers an opportunity to understand various social, economic, and environmental issues and constituents’ concerns about these issues. For example, GenAI could be used to synthesize large volumes of incoming commentary from open comment lines or emails and then better understand the bottom-up concerns that citizens have about their democracy. To be sure, these data-analysis tools need to ensure data privacy, but can provide data visualization for institutional leaders to understand what people care about.

Easy Read Access

Many regulations and pieces of legislation are dense and difficult to comprehend for anyone outside the decision-making establishment. These accessibility challenges are magnified for individuals with disabilities such as cognitive impairments. GenAI can summarize long pieces of legislation and translate dense governmental publications into an easy read format, with images and simple language. Civil society organizations can also use GenAI to develop social media campaigns and other content that is more accessible to those with disabilities.

Civic Engagement

GenAI can enhance civic engagement by generating personalized content tailored to individual interests and preferences through a combination of data analysis and machine learning. This could involve generating informative materials, news summaries, or visualizations that appeal to citizens and encourage them to participate in civic discussions and activities. The marketing industry has long capitalized on the realization that content specific to individual consumers is more likely to elicit consumption or engagement, and the idea holds in civil society. The more the content is personalized and targeted to a specific individual or category of individual, the more likely that individual will be to respond. Again, classifying citizen preferences inherently relies on user data, and not all societies will endorse this use of data. For example, the European Union has shown a wariness about privacy, suggesting that one size will not fit all in terms of this particular use of GenAI for civic engagement.

That being said, this tool could help dislodge voter apathy that can lead to disaffection and disengagement from politics. Instead of boilerplate communication urging young people to vote, for example, GenAI could produce clever content known to resonate with young women or marginalized groups, helping to counter some of the additional barriers to engagement that marginalized groups face. In an educational setting, personalized content could be used to cater to the needs of students in different regions and with different learning abilities, while also providing virtual tutors or language-learning tools.

Public Deliberation

Another way that GenAI could enable public participation and deliberation is through GenAI-powered chatbots and conversational agents. These tools can facilitate public deliberation by engaging citizens in dialogue, addressing their concerns, and helping them navigate complex civic issues. These agents can provide information, answer questions, and stimulate discussions. Some municipalities have already launched AI-powered virtual assistants and chatbots that automate civic services, streamlining processes such as citizen inquiries, service requests, and administrative tasks. This can lead to increased efficiency and responsiveness in government operations. Lack of municipal resources—for example, staff—can mean that citizens also lack the information they need to be meaningful participants in their society. With relatively limited resources, a chatbot can be trained on local data to provide specific information needed to narrow that gap.

Chatbots can be trained in multiple languages, making civic information and resources more accessible to diverse populations. They can assist people with disabilities by generating alternative formats for information, such as audio descriptions or text-to-speech conversions. GenAI can be trained on local dialects and languages, promoting indigenous cultures and making digital content more accessible to diverse populations.

It is important to note that the deployment of GenAI must be done with sensitivity to local contexts, cultural considerations, and privacy concerns. Adopting a human-centered design approach to collaborations among AI researchers, developers, civil society groups, and local communities can help to ensure that these technologies are adapted appropriately and equitably to address specific needs and challenges.

Predictive Analytics

GenAI can also be used for predictive analytics to forecast potential outcomes of policy decisions. For example, AI-powered generative models can analyze local soil and weather data to optimize crop yield and recommend suitable agricultural practices for specific regions. It can be used to generate realistic simulations to predict potential impacts and develop disaster response strategies for relief operations. It can analyze local environmental conditions and energy demand to optimize the deployment of renewable energy sources like solar and wind power, promoting sustainable power solutions.

By analyzing historical data and generating simulations, policymakers can make more informed and evidence-based choices for the betterment of society. These same tools can assist not only policymakers but also civil society organizations in generating data visualizations or summarizing information about citizen preferences. This can aid in producing more informative and timely content about citizen preferences and the state of key issues, like the number of people who are homeless.

Environmental Sustainability

GenAI can be used in ways that lead to favorable environmental impacts. For example, it can be used in fields such as architecture and product design to optimize designs for efficiency. It can be used to optimize processes in the energy industry that can enhance energy efficiency. It also has potential for use in logistics where GenAI can optimize routes and schedules, thereby reducing fuel consumption and emissions.

Risks

To harness the potential of GenAI for democracy and the civic space, a balanced approach that addresses ethical concerns, fosters transparency, promotes inclusive technology development, and engages multiple stakeholders is necessary. Collaboration among researchers, policymakers, civil society, and technology developers can help ensure that GenAI contributes positively to democratic processes and civic engagement. The ability to generate large volumes of credible content can create opportunities for policymakers and citizens to connect with each other–but those same capabilities of advanced GenAI models create possible risks as well.

Online Misinformation

Although GenAI has improved, the models still hallucinate, producing convincing-sounding outputs such as facts or stories that sound plausible but are not correct. While there are many cases in which these hallucinations are benign–such as a scientific query about the age of the universe–there are other cases where the consequences are destabilizing politically or societally.

Given that GenAI is public facing, individuals can use these technologies without understanding the limitations. They could then inadvertently spread misinformation from an inaccurate answer to a question about politics or history, for example, an inaccurate statement about a political leader that ends up inflaming an already acrimonious political environment. AI-generated misinformation flooding the information ecosystem has the potential to reduce trust in that ecosystem as a whole, leading people to be skeptical of all facts and to conform to the beliefs of their social circles. The spread of misinformation may mean that members of society believe things that are not true about political candidates, election procedures, or wars.

Examples of GenAI generating disinformation include not just text but also deepfakes. While deepfakes have benign potential applications, such as for entertainment or special effects, they can also be misused to create highly realistic videos that spread false information or depict fabricated events. Such videos make it difficult for viewers to discern between fake and real content, which can lead to the spread of misinformation and erode trust in the media. Relatedly, deepfakes can be used for political manipulation, in which videos of politicians or public figures are altered to make them appear to say or do things that could defame them, harm their reputation, or influence public opinion.

GenAI makes it more efficient to generate and amplify disinformation, intentionally created for the purposes of misleading a reader, because it can produce, in large quantities, seemingly original and seemingly credible but nonetheless inaccurate information. None of the stories or comments would necessarily repeat, which could then lead to an even more credible-seeming narrative. Foreign disinformation campaigns have often been identified on the basis of spelling or grammatical errors, but the ability to use these new GenAI technologies means the efficient creation of native-sounding content that can fool the usual filters that a platform might use to identify large-scale disinformation campaigns. GenAI may also proliferate social bots that are indistinguishable from humans and can micro-target individuals with disinformation in a tailored way.

Astroturfing Campaigns

Since GenAI technologies are public facing and easy to use, they can be used to manipulate not only the mass public, but also different levels of government elites. Political leaders are expected to engage with their constituents’ concerns, as reflected in communications such as emails that reveal public opinion and sentiment. But what if a malicious actor used ChatGPT or another GenAI model to create large volumes of advocacy content and distributed it to political leaders as if it were from real citizens? This would be a form of astroturfing, a deceptive practice that masks the source of content with an aim of creating a perception of grassroots support. Research suggests that elected officials in the United States have been susceptible to these attacks. Leaders could well allow this volume of content to influence their political agenda, passing laws or establishing bureaucracies in response to the apparent groundswell of support that in fact was manufactured by the ability to generate large volumes of credible-sounding content.

Bias

GenAI also raises discrimination and bias concerns. If the training data used to create the generative model contains biased or discriminatory information, the model is likely to produce biased or offensive outputs. This could perpetuate harmful stereotypes and contribute to privacy violations for certain groups. If a GenAI model is trained on a dataset containing biased language patterns, it might produce text that reinforces gender stereotypes. For instance, it might associate certain professions or roles with a particular gender, even if there is no inherent connection. If a GenAI model is trained on a dataset with skewed racial or ethnic representation, it can produce images that unintentionally depict certain groups in a negative or stereotypical manner. These models might also, if trained on biased or discriminatory datasets, produce content that is culturally insensitive or uses derogatory terms. Text-to-image GenAI mangles the features of a “Black woman” at high rates, which is harmful to the groups misrepresented. The cause is overrepresentation of non-Black groups in the training datasets. One solution is to use more balanced, diverse datasets rather than relying only on Western and English-language data, which carry Western bias and omit other perspectives and languages. Another is to train the model so that users cannot “jailbreak” it into spewing racist or inappropriate content.

However, the issue of bias extends beyond training data that is openly racist or sexist. AI models draw conclusions from data points. An AI model might look at hiring data, see that the demographic group that has been most successful at getting hired at a tech company is white men, and conclude that white men are the most qualified for working at a tech company. In reality, white men may be more successful because they do not face the same structural barriers that affect other demographics, such as being unable to afford a tech degree, facing sexism in classes, or facing racism in the hiring department.
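The hiring example can be illustrated with a small sketch. It assumes a naive "model" that does nothing more than learn past hire rates per group and reuse them as a score for new candidates. The group labels and counts are invented for illustration; no real hiring system or dataset is being described.

```python
from collections import defaultdict

def learn_hire_rates(records):
    """Estimate the share of past applicants hired in each group."""
    totals = defaultdict(int)
    hires = defaultdict(int)
    for group, hired in records:
        totals[group] += 1
        hires[group] += int(hired)
    return {group: hires[group] / totals[group] for group in totals}

def naive_score(candidate_group, hire_rates):
    """Score a new candidate purely by their group's past hire rate."""
    return hire_rates.get(candidate_group, 0.0)

# Hypothetical history: group_a faced fewer structural barriers in the past.
history = ([("group_a", True)] * 80 + [("group_a", False)] * 20
           + [("group_b", True)] * 20 + [("group_b", False)] * 80)

rates = learn_hire_rates(history)
print(rates)                          # {'group_a': 0.8, 'group_b': 0.2}
print(naive_score("group_b", rates))  # 0.2, regardless of individual merit
```

Nothing in the data is openly discriminatory, yet the learned scores reproduce the historical disparity: the model mistakes the outcome of structural barriers for a signal about qualification.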

Privacy

GenAI raises several privacy concerns. One is that the datasets could contain sensitive or personal information. Unless that content is properly anonymized or protected, personal information could be exposed or misused. Because GenAI outputs are intended to be realistic-looking, generated content that resembles real individuals could be used to re-identify individuals whose data was intended to be anonymized, also undermining privacy protections. Further, during the training process, GenAI models may inadvertently learn and memorize parts of the training data, including sensitive or private information. This could lead to data leakage when generating new content. Policymakers and the GenAI platforms themselves have not yet resolved the concern about how to protect privacy in the datasets, outputs, or even the prompts themselves, which can include sensitive data or reflect a user’s intentions in ways that could be harmful if not secure.
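
One partial mitigation for the prompt-level concern is to strip obvious identifiers from text before it is sent to a GenAI service. The sketch below is a minimal illustration under stated assumptions: the two regular expressions and the `redact` function are hypothetical, they cover only e-mail addresses and phone-like digit sequences, and they should not be read as a complete anonymization method.

```python
import re

# Illustrative patterns only; real personal data takes many more forms
# (names, addresses, ID numbers) that these expressions will not catch.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text):
    """Replace e-mail addresses and phone-like digit runs with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


prompt = "Contact Amina at amina@example.org or +1 202 555 0147 about the case."
print(redact(prompt))
# prints: Contact Amina at [EMAIL] or [PHONE] about the case.
```

Note that the person's name survives redaction, which illustrates why simple pattern matching falls short of the proper anonymization discussed above.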

Copyright and Intellectual Property

One of the fundamental concerns around GenAI is who owns the copyright for work that GenAI creates. Copyright law attributes authorship and ownership to human creators. In the case of AI-generated content, however, determining authorship, the cornerstone of copyright protection, becomes challenging. It is unclear whether the author should be the programmer, the user, the AI system itself, or a combination of these parties. AI systems also learn from existing copyrighted content to generate new work that could resemble existing copyrighted material. This raises questions about whether AI-generated content could be considered derivative work, and thus infringe upon the original copyright holder’s rights, or whether the use of GenAI would be considered fair use, which allows limited use of copyrighted material without permission from the copyright holder. Because the technology is still new, the legal frameworks for judging fair use versus copyright infringement are still evolving and might look different depending on the jurisdiction and its legal culture. As that body of law develops, it should balance innovation with fair treatment of creators, users, and AI systems’ developers.

Environmental Impacts

Training GenAI models, and storing and transmitting the associated data, uses significant computational resources. The hardware involved consumes energy, which contributes to carbon emissions if it is not generated from renewable sources. These impacts can be mitigated in part through the use of renewable energy and by optimizing algorithms to reduce computational demands.

Unequal Access

Although access to GenAI tools is becoming more widespread, the emergence of the technology risks expanding the digital divide between those with access to technology and those without. There are several reasons why unequal access, and its consequences, may be particularly germane in the case of GenAI:

  • The computing power required is enormous, which can strain the infrastructure of countries that have inadequate power supply, internet access, data storage, or cloud computing.
  • Low- and middle-income countries (LMICs) may lack the high-tech talent pool necessary for AI innovation and implementation. One report suggests that the whole continent of Africa has 700,000 software developers, compared with 630,000 in California. This problem is exacerbated by the fact that, once qualified, developers from LMICs often leave for countries where they can earn more.
  • Mainstream, consumer-facing models like ChatGPT were trained on a handful of languages, including English, Spanish, German, and Chinese. Individuals seeking to use GenAI in these languages therefore have access advantages unavailable to Swahili speakers, for example, not to mention speakers of local dialects.
  • Localizing GenAI requires large amounts of data from the particular context, and low-resourced environments often rely on models developed by larger tech companies in the United States or China.

The ultimate result may be the disempowerment of marginalized groups who have fewer opportunities and means to share their stories and perspectives through AI-generated content. Because these technologies may enhance an individual’s economic prospects, unequal access to GenAI can in turn increase economic inequality as those with access are able to engage in creative expression, content generation, and business innovation more efficiently.

Questions

If you are conducting a project and considering whether to use GenAI for it, ask yourself these questions:

  1. Are there cases where individual interactions between people might be more effective, more empathetic, and even more efficient than using AI for communication?
  2. What ethical concerns—whether from biases or privacy—might the use of GenAI introduce? Can they be mitigated?
  3. Can local sources of data and talent be employed to create localized GenAI?
  4. Are there legal, regulatory, or security measures that will guard against the misuse of GenAI and protect the populations that might be vulnerable to that misuse?
  5. Can sensitive or proprietary information be protected in the process of developing datasets that serve as training data for GenAI models?
  6. In what ways can GenAI technology bridge the digital divide and increase digital access in a tech-dependent society (or as societies become more tech-dependent)? How can we mitigate the tendency of new GenAI technologies to widen the digital divide?
  7. Are there forms of digital literacy for members of society, civil society, or a political class that can mitigate the risks of deepfakes or large-scale generated misinformation?
  8. How can you mitigate the negative environmental impacts associated with the use of GenAI?
  9. Can GenAI be used to tailor approaches to education, access to government and civil society, and opportunities for innovation and economic advancement?
  10. Is the data your model was trained on accurate and representative of all identities, including marginalized groups? What inherent biases might the dataset carry?

Case Studies

GenAI largely emerged in a widespread, consumer-facing way in the first half of 2023, which limits the number of real-world case studies. This section on case studies therefore includes cases where forms of GenAI have proved problematic in terms of deception or misinformation; ways that GenAI may conceivably affect all sectors, including democracy, to increase efficiencies and access; and experiences or discussions of specific country approaches to privacy-innovation tradeoffs.

Experiences with Disinformation and Deception

In Gabon, a possible deepfake played a significant role in the country’s politics. The president had reportedly experienced a stroke and had not been seen in public. The government ultimately issued a video on New Year’s Eve 2018 intended to assuage concerns about the president’s health, but critics pointed to what they saw as inauthentic blinking patterns and facial expressions and suggested that the video was a deepfake. Rumors that the video was inauthentic proliferated, leading many to conclude that the president was not in good health. An attempted coup followed, driven by the belief that the president’s ability to withstand an overthrow attempt would be weakened. The example demonstrates the serious ramifications of a loss of trust in the information environment.

In March 2023, a GenAI image of the Pope in a Balenciaga puffy coat went viral on the internet, fooling viewers because of how closely the image resembled the Pope. Several months earlier, Balenciaga had faced backlash over an ad campaign that featured children in harnesses and bondage. An image of the Pope seemingly wearing Balenciaga therefore implied that he and the Catholic church embraced these practices. The internet consensus ultimately concluded that the image was a deepfake after identifying telltale signs such as a blurry coffee cup and resolution problems with the Pope’s eyelid. Nonetheless, the incident illustrated just how easily these images can be generated and can fool viewers. It also illustrated how reputations can be stained through deepfakes.

In September 2023, the Microsoft Threat Analysis Center released a report pointing to numerous instances of online influence operations. Ahead of the 2022 US midterm elections, Microsoft identified Chinese Communist Party (CCP)-affiliated social media accounts that were impersonating American voters and responding to comments in order to influence opinions through exchanges and persuasion. In 2023, Microsoft observed the use of AI-created visuals that portrayed American symbols such as the Statue of Liberty in a negative light. These images had hallmarks of AI, such as the wrong number of fingers on a hand, but were nonetheless provocative and convincing. In early 2023, Meta similarly found the CCP engaged in an influence operation that posted comments critical of American foreign policy. Meta was able to identify the operation from characteristic spelling and grammatical mistakes combined with the time of day of the posts (appropriate hours for China rather than the US).

Current and Future Applications

As GenAI tools improve, they will become even more effective in these online influence campaigns. On the other hand, applications with positive outcomes will also become more effective. GenAI, for example, will increasingly step in to fill gaps in government resources. An estimated four billion people lack access to basic health services, with a significant limitation being the low number of health care providers. While GenAI is not a substitute for direct access to an individual health care provider, it can at least bridge some access gaps in certain settings. One healthcare chatbot, Ada Health, is powered by OpenAI and can correspond with individuals about their symptoms. Although ChatGPT has demonstrated an ability to pass medical qualification exams, it should not be used as a stand-in for a doctor; in resource-constrained environments, however, it could at least provide an initial screening, saving costs, time, and resources. Analogous tools can be used in mental health settings. The World Economic Forum reported in 2021 that an estimated 100 million individuals in Africa have clinical depression, but there are only 1.4 mental health care providers per 100,000 people, compared to the global average of 9 providers per 100,000 people. People in need of care who lack better options are increasingly relying on mental health chatbots until a more comprehensive approach can be implemented because, while the level of care the chatbots can provide is limited, it is better than nothing. These GenAI-based resources are not without challenges, including potential privacy problems and suboptimal responses. Societies and individuals will have to determine whether these tools are better than the alternatives, but they may be worth considering in resource-constrained environments.

Other future scenarios involve using GenAI to increase government efficiency on a range of tasks. One such scenario entails a government bureaucrat trained in economics who is assigned to work on a policy brief related to the environment. The individual begins the policy brief but then puts the question into a GenAI tool, which helps draft an outline of ideas, reminds the individual about points that had been missed, identifies key relevant international legal guideposts, and then translates the English-language brief into French. Another scenario involves an individual citizen using a GenAI tool to figure out where to vote or how to pay taxes, to clarify government processes, to make sense of policies when deciding between candidates, or to have certain policy concepts explained. These scenarios are already possible and accessible at all levels within society and will only become more prevalent as individuals become more familiar with the technology. However, it is important that users understand the technology’s limitations and how to use it appropriately, so that they avoid spreading misinformation or failing to find accurate information.

In an electoral context, GenAI can help evaluate aspects of democracy, such as electoral integrity. Manual tabulation of votes, for example, takes time and is onerous. However, new AI tools have played a role in ascertaining the degree of electoral irregularities. Neural networks have been used in Kenya to “read” paper forms submitted at the local level and enumerate the degree of electoral irregularities and then correlate those with electoral outcomes to assess whether these irregularities were the result of fraud or human error. These technologies may actually alleviate some of the workload burden placed on electoral institutions. In the future, advances in GenAI will be able to provide data visualization that further eases the cognitive load of efforts to adjudicate electoral integrity.
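
The "correlate with outcomes" step can be sketched in a few lines. The figures below are invented for illustration and do not come from the Kenyan study or any real election; the `pearson` helper is a plain implementation of the standard Pearson correlation coefficient, used here as one simple way to relate irregularity counts to results, and is not necessarily the method the researchers used.

```python
from math import sqrt

# Hypothetical per-polling-station data: number of irregularities flagged on
# the result form, and the leading candidate's vote share at that station.
IRREGULARITIES = [0, 1, 0, 3, 2, 0, 4, 1]
LEADER_SHARE = [0.52, 0.55, 0.49, 0.61, 0.50, 0.58, 0.54, 0.47]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(IRREGULARITIES, LEADER_SHARE)
print(f"correlation between irregularities and leader's vote share: {r:.2f}")
```

A coefficient near zero would be consistent with irregularities that do not systematically favor anyone, as might be expected from human error, while a strong positive value would warrant closer scrutiny. Correlation alone does not establish fraud.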

Approaches to the Privacy-Innovation Dilemma

Countries such as Brazil have raised concerns about the potential misuses of GenAI. After the release of ChatGPT in November 2022, the Brazilian government received a detailed report, written by academic and legal experts as well as company leaders and members of a national data-protection watchdog, urging that these technologies be regulated. The report raised three main concerns:

  • That citizen rights be protected by ensuring that there be “non-discrimination and correction of direct, indirect, illegal, or abusive discriminatory biases” as well as clarity and transparency as to when citizens were interacting with AI.
  • That the government categorize risks and inform citizens of the potential risks. Based on this analysis, “high risk” sectors included essential services, biometric verification and job recruitment, and “excessive risk” included the exploitation of vulnerable peoples and social scoring (a system that tracks individual behavior for trustworthiness and blacklists those with too many demerits or equivalents), both practices that should be scrutinized closely.
  • That the government issue governance measures and administrative sanctions, first by determining how businesses that run afoul of regulations would be penalized and second by recommending a penalty of 2% of revenue for mild non-compliance and the equivalent of 9 million USD for more serious harms.

At the time of this writing in 2023, the government was debating next steps, but the report and deliberations are illustrative of the concerns and recommendations that have been issued with respect to GenAI in the Global South.  

In India, the government has approached AI in general and GenAI in particular with a less skeptical eye, which sheds light on the differences in how governments may approach these technologies and the basis for those differences. In 2018, the Indian government proposed a National Strategy for AI, which prioritized the development of AI in agriculture, education, healthcare, smart cities, and smart mobility. In 2020, the National Artificial Intelligence Strategy called for all systems to be transparent, accountable, and unbiased. In March 2021, the Indian government announced that it would use “light touch” regulation and that the bigger risk was not from AI but from not seizing on the opportunities presented by AI. India has an advanced technological research and development sector that is poised to benefit from AI. Advancing this sector is, according to the Ministry of Electronics and Information Technology, “significant and strategic,” although the ministry acknowledged the need for policies and infrastructure measures to address bias, discrimination, and ethical concerns.
