Web design plays a role in how much we reveal online


European Commission: “A JRC study, “Nudges to Privacy Behaviour: Exploring an Alternative Approach to Privacy Notices”, used behavioural sciences to look at how individuals react to different types of privacy notices. Specifically, the authors analysed users’ reactions to modified choice architecture (i.e. the environment in which decisions take place) of web interfaces.

Two types of privacy behaviour were measured: passive disclosure, when people unwittingly disclose personal information, and direct disclosure, when people make an active choice to reveal personal information. After testing different designs with over 3 000 users from the UK, Italy, Germany and Poland, results show web interface affects decisions on disclosing personal information. The study also explored differences related to country of origin, gender, education level and age.

A depiction of a person’s face on the website led people to reveal more personal information. Also, this design choice and the visualisation of the user’s IP or browsing history had an impact on people’s awareness of a privacy notice. If confirmed, these features are particularly relevant for habitual and instinctive online behaviour.

With regard to education, users who had attended (though not necessarily graduated from) college felt significantly less observed or monitored and more comfortable answering questions than those who never went to college. This result challenges the assumption that the better educated are more aware of information tracking practices. Further investigation, perhaps of a qualitative nature, could help dig deeper into this issue. On the other hand, people with a lower level of education were more likely to reveal personal information unwittingly. This behaviour appeared to be due to the fact that non-college attendees were simply less aware that some online behaviour revealed personal information about themselves.

Strong differences between countries were noticed, indicating a relation between cultures and information disclosure. Even though participants in Italy revealed the most personal information in passive disclosure, in direct disclosure they revealed less than in other countries. Approximately 75% of participants in Italy chose to answer positively to at least one stigmatised question, compared to 81% in Poland, 83% in Germany and 92% in the UK.

Approximately 73% of women answered ‘never’ to the questions asking whether they had ever engaged in socially stigmatised behaviour, compared to 27% of males. This large difference could be due to the nature of the questions (e.g. about alcohol consumption, which might be more acceptable for males). It could also suggest women feel under greater social scrutiny or are simply more cautious when disclosing personal information.

These results could offer valuable insights to inform European policy decisions, despite the fact that the study has targeted a sample of users in four countries in an experimental setting. Major web service providers are likely to have extensive amounts of data on how slight changes to their services’ privacy controls affect users’ privacy behaviour. The authors of the study suggest that collaboration between web providers and policy-makers can lead to recommendations for web interface design that allow for conscientious disclosure of privacy information….(More)”

Five principles for applying data science for social good


Jake Porway at O’Reilly: “….Every week, a data or technology company declares that it wants to “do good” and there are countless workshops hosted by major foundations musing on what “big data can do for society.” Add to that a growing number of data-for-good programs from Data Science for Social Good’s fantastic summer program to Bayes Impact’s data science fellowships to DrivenData’s data-science-for-good competitions, and you can see how quickly this idea of “data for good” is growing.

Yes, it’s an exciting time to be exploring the ways new datasets, new techniques, and new scientists could be deployed to “make the world a better place.” We’ve already seen deep learning applied to ocean health, satellite imagery used to estimate poverty levels, and cellphone data used to elucidate Nairobi’s hidden public transportation routes. And yet, for all this excitement about the potential of this “data for good movement,” we are still desperately far from creating lasting impact. Many efforts will not only fall short of lasting impact — they will make no change at all….

So how can these well-intentioned efforts reach their full potential for real impact? Embracing the following five principles can drastically accelerate a world in which we truly use data to serve humanity.

1. “Statistics” is so much more than “percentages”

We must convey what constitutes data, what it can be used for, and why it’s valuable.

There was a packed house for the March 2015 release of the No Ceilings Full Participation Report. Hillary Clinton, Melinda Gates, and Chelsea Clinton stood on stage and lauded the report, the culmination of a year-long effort to aggregate and analyze new and existing global data, as the biggest, most comprehensive data collection effort about women and gender ever attempted. One of the most trumpeted parts of the effort was the release of the data in an open and easily accessible way.

I ran home and excitedly pulled up the data from the No Ceilings GitHub, giddy to use it for our DataKind projects. As I downloaded each file, my heart sank. The 6MB size of the entire global dataset told me what I would find inside before I even opened the first file. Like a familiar ache, the first row of the spreadsheet said it all: “USA, 2009, 84.4%.”

What I’d encountered was a common situation when it comes to data in the social sector: the prevalence of inert, aggregate data. ….

2. Finding problems can be harder than finding solutions

We must scale the process of problem discovery through deeper collaboration between the problem holders, the data holders, and the skills holders.

In the immortal words of Henry Ford, “If I’d asked people what they wanted, they would have said a faster horse.” Right now, the field of data science is in a similar position. Framing data solutions for organizations that don’t realize how much is now possible can be a frustrating search for faster horses. If data cleaning is 80% of the hard work in data science, then problem discovery makes up nearly the remaining 20% when doing data science for good.

The plague here is one of education. …

3. Communication is more important than technology

We must foster environments in which people can speak openly, honestly, and without judgment. We must be constantly curious about each other.

At the conclusion of one of our recent DataKind events, one of our partner nonprofit organizations lined up to hear the results from their volunteer team of data scientists. Everyone was all smiles — the nonprofit leaders had loved the project experience, the data scientists were excited with their results. The presentations began. “We used Amazon RedShift to store the data, which allowed us to quickly build a multinomial regression. The p-value of 0.002 shows …” Eyes glazed over. The nonprofit leaders furrowed their brows in telegraphed concentration. The jargon was standing in the way of understanding the true utility of the project’s findings. It was clear that, like so many other well-intentioned efforts, the project was at risk of gathering dust on a shelf if the team of volunteers couldn’t help the organization understand what they had learned and how it could be integrated into the organization’s ongoing work…..

4. We need diverse viewpoints

To tackle sector-wide challenges, we need a range of voices involved.

One of the most challenging aspects to making change at the sector level is the range of diverse viewpoints necessary to understand a problem in its entirety. In the business world, profit, revenue, or output can be valid metrics of success. Rarely, if ever, are metrics for social change so cleanly defined….

Challenging this paradigm requires diverse, or “collective impact,” approaches to problem solving. The idea has been around for a while (h/t Chris Diehl), but has not yet been widely implemented due to the challenges in successful collective impact. Moreover, while there are many diverse collectives committed to social change, few have the voice of expert data scientists involved. DataKind is piloting a collective impact model called DataKind Labs, that seeks to bring together diverse problem holders, data holders, and data science experts to co-create solutions that can be applied across an entire sector-wide challenge. We just launched our first project with Microsoft to increase traffic safety and are hopeful that this effort will demonstrate how vital a role data science can play in a collective impact approach.

5. We must design for people

Data is not truth, and tech is not an answer in-and-of-itself. Without designing for the humans on the other end, our work is in vain.

So many of the data projects making headlines — a new app for finding public services, a new probabilistic model for predicting weather patterns for subsistence farmers, a visualization of government spending — are great and interesting accomplishments, but don’t seem to have an end user in mind. The current approach appears to be “get the tech geeks to hack on this problem, and we’ll have cool new solutions!” I’ve opined that, though there are many benefits to hackathons, you can’t just hack your way to social change….(More)”

Harnessing the Data Revolution for Sustainable Development


US State Department Fact Sheet on “U.S. Government Commitments and Collaboration with the Global Partnership for Sustainable Development Data”: “On September 27, 2015, the member states of the United Nations agreed to a set of Sustainable Development Goals (Global Goals) that define a common agenda to achieve inclusive growth, end poverty, and protect the environment by 2030. The Global Goals build on tremendous development gains made over the past decade, particularly in low- and middle-income countries, and set actionable steps with measurable indicators to drive progress. The availability and use of high quality data is essential to measuring and achieving the Global Goals. By harnessing the power of technology, mobilizing new and open data sources, and partnering across sectors, we will achieve these goals faster and make their progress more transparent.

Harnessing the data revolution is a critical enabler of the global goals—not only to monitor progress, but also to inclusively engage stakeholders at all levels – local, regional, national, global—to advance evidence-based policies and programs to reach those who need it most. Data can show us where girls are at greatest risk of violence so we can better prevent it; where forests are being destroyed in real-time so we can protect them; and where HIV/AIDS is enduring so we can focus our efforts and finish the fight. Data can catalyze private investment; build modern and inclusive economies; and support transparent and effective investment of resources for social good…..

The Global Partnership for Sustainable Development Data (Global Data Partnership), launched on the sidelines of the 70th United Nations General Assembly, is mobilizing a range of data producers and users—including governments, companies, civil society, data scientists, and international organizations—to harness the data revolution to achieve and measure the Global Goals. Working together, signatories to the Global Data Partnership will address the barriers to accessing and using development data, delivering outcomes that no single stakeholder can achieve working alone….The United States, through the U.S. President’s Emergency Plan for AIDS Relief (PEPFAR), is joining a consortium of funders to seed this initiative. The U.S. Government has many initiatives that are harnessing the data revolution for impact domestically and internationally. Highlights of our international efforts are found below:

Health and Gender

Country Data Collaboratives for Local Impact – PEPFAR and the Millennium Challenge Corporation (MCC) are partnering to invest $21.8 million in Country Data Collaboratives for Local Impact in sub-Saharan Africa that will use data on HIV/AIDS, global health, gender equality, and economic growth to improve programs and policies. Initially, the Country Data Collaboratives will align with and support the objectives of DREAMS, a PEPFAR, Bill & Melinda Gates Foundation, and Girl Effect partnership to reduce new HIV infections among adolescent girls and young women in high-burden areas.

Measurement and Accountability for Results in Health (MA4Health) Collaborative – USAID is partnering with the World Health Organization, the World Bank, and over 20 other agencies, countries, and civil society organizations to establish the MA4Health Collaborative, a multi-stakeholder partnership focused on reducing fragmentation and better aligning support to country health-system performance and accountability. The Collaborative will provide a vehicle to strengthen country-led health information platforms and accountability systems by improving data and increasing capacity for better decision-making; facilitating greater technical collaboration and joint investments; and developing international standards and tools for better information and accountability. In September 2015, partners agreed to a set of common strategic and operational principles, including a strong focus on 3–4 pathfinder countries where all partners will initially come together to support country-led monitoring and accountability platforms. Global actions will focus on promoting open data, establishing common norms and standards, and monitoring progress on data and accountability for the Global Goals. A more detailed operational plan will be developed through the end of the year, and implementation will start on January 1, 2016.

Data2X: Closing the Gender Gap – Data2X is a platform for partners to work together to identify innovative sources of data, including “big data,” that can provide an evidence base to guide development policy and investment on gender data. As part of its commitment to Data2X—an initiative of the United Nations Foundation, Hewlett Foundation, Clinton Foundation, and Bill & Melinda Gates Foundation—PEPFAR and the Millennium Challenge Corporation (MCC) are working with partners to sponsor an open data challenge to incentivize the use of gender data to improve gender policy and practice….(More)”

See also: Data matters: the Global Partnership for Sustainable Development Data. Speech by UK International Development Secretary Justine Greening at the launch of the Global Partnership for Sustainable Development Data.

Who you are/where you live: do neighbourhood characteristics explain co-production?


Paper by Peter Thijssen and Wouter Van Dooren in the International Review of Administrative Sciences: “Co-production establishes an interactive relationship between citizens and public service providers. Successful co-production hence requires the engagement of citizens. Typically, individual characteristics such as age, gender, and income are used to explain why citizens co-produce. In contrast, neighbourhood-level variables receive less attention. Nevertheless, the co-production literature, as well as social capital and urban planning theory, provides good arguments why neighbourhood variables may be relevant. In this study, we examine the administrative records of citizen-initiated contacts in a reporting programme for problems in the public domain. This co-production programme is located in the district of Deurne in the city of Antwerp, Belgium. A multilevel analysis is used to simultaneously assess the impact of neighbourhood characteristics and individual variables. While the individual variables usually found to explain co-production are present in our case, we also find that neighbourhood characteristics significantly explain co-production. Thus, our findings suggest that participation in co-production activities is determined not only by who you are, but also by where you live.

Points for practitioners: In order to facilitate co-production and participation, the neighbourhood should be the first place to look. Co-production benefits may disproportionately accrue to strong citizens, but also to strong neighbourhoods. Social corrections should take both into account. More broadly, a good understanding of the neighbourhoods in the city is needed to grasp citizen behaviour. Place-based policies in the city should focus on the neighbourhood….(More)”

EveryPolitician


“The clue’s in the name. EveryPolitician aims to provide data about, well, every politician. In the world. It’s a simple but ambitious project to collect and share that data, in a consistent, open format that anyone can use.

Why? Because this resource doesn’t yet exist. And it would be incredibly useful, for a huge number of people and organisations all around the world.

When data is in a consistent, structured format, it can be reused by developers everywhere. You don’t have to waste time scraping data and converting it into a format you can work with; instead, you can simply concentrate on making tools. And those tools can more easily be picked up, used and adapted to local needs anywhere in the world, saving everyone time and effort.

The data

The long term aim is to include every elected official in the world, but let’s start simple. Our first goal is to have data for all present-day national-level legislators.

To see how far we’ve got, pick a country.

There’s more to this data than you’ll see there, though. For most datasets there is richer information available, including contact details, photos, gender, and more.

If you want to use that data, you can download it in two useful formats:

  • CSV format (great for spreadsheets)
  • JSON in Popolo format (ideal for developers)

A note about the Popolo standard: it’s a rich, expressive format that, like a language, is used in many different ways by different authors. However, when we add data to EveryPolitician we always use Popolo according to the same, defined principles. It’s because of this consistency that the tools you build will work with EveryPolitician data from any country, for any country.
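
As a rough illustration of why a consistent structure helps tool builders, here is a minimal Python sketch that groups people by organization from a Popolo-style JSON document. The sample data is invented, and the exact fields EveryPolitician publishes are an assumption here: the sketch relies only on the general Popolo idea of `persons`, `organizations` and `memberships` linked by IDs, and the helper name `members_by_organization` is hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical, hand-written sample in the general shape of Popolo JSON.
# It is NOT real EveryPolitician data; names and IDs are placeholders.
SAMPLE_POPOLO = """
{
  "persons": [
    {"id": "p1", "name": "Alex Example", "gender": "female"},
    {"id": "p2", "name": "Sam Sample", "gender": "male"},
    {"id": "p3", "name": "Jo Placeholder"}
  ],
  "organizations": [
    {"id": "party-a", "name": "Party A", "classification": "party"},
    {"id": "party-b", "name": "Party B", "classification": "party"}
  ],
  "memberships": [
    {"person_id": "p1", "organization_id": "party-a"},
    {"person_id": "p2", "organization_id": "party-b"},
    {"person_id": "p3", "organization_id": "party-a"}
  ]
}
"""


def members_by_organization(popolo):
    """Group person names under the organization named in each membership."""
    persons = {p["id"]: p for p in popolo.get("persons", [])}
    orgs = {o["id"]: o for o in popolo.get("organizations", [])}
    grouped = defaultdict(list)
    for membership in popolo.get("memberships", []):
        person = persons.get(membership.get("person_id"))
        org = orgs.get(membership.get("organization_id"))
        if person is None or org is None:
            continue  # skip memberships that point at records we do not have
        grouped[org["name"]].append(person["name"])
    # Sort names so the output is stable regardless of input order.
    return {name: sorted(people) for name, people in grouped.items()}


data = json.loads(SAMPLE_POPOLO)
result = members_by_organization(data)
for org_name, people in sorted(result.items()):
    print(org_name, people)
```

Because the function depends only on the shared structure, the same code should in principle work on a file from any country that follows the same conventions, which is the point the note above makes.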

Want more detail? Interested in using this data in a web application or tool you’re building? See the technical overview of EveryPolitician.”

Beyond the Jailhouse Cell: How Data Can Inform Fairer Justice Policies


Alexis Farmer at DataDrivenDetroit: “Government-provided open data is a value-added approach to providing transparency, analytic insights for government efficiency, innovative solutions for products and services, and increased civic participation. Two of the least transparent public institutions are jails and prisons. The majority of the population has limited knowledge about jail and prison operations and the demographics of the jail and prison population, even though the costs of incarceration are substantial. The absence of public knowledge about one of the many establishments public tax dollars support can be resolved with an open data approach to criminal justice. Increasing access to administrative jail information enables communities to collectively and effectively find solutions to the challenges the system faces….

The data analysis that complements open data practices is a part of the formula for creating transformational policies. There are numerous ways that recording and publishing data about jail operations can inform better policies and practices:

1. Better budgeting and allocation of funds. By monitoring the rate at which dollars are expended for a specific function, administrations can use data to make accurate estimates of future expenditures.

2. More effective deployment of staff. Knowing the average daily population and annual average bookings can help inform staffing decisions to determine a total need of officers, shift responsibilities, and room arrangements. The population information also helps with facility planning, reducing overcrowding, controlling violence within the facility, staffing, determining appropriate programs and services, and policy and procedure development.

3. Program participation and effectiveness. Gauging the number of inmates involved in jail work programs, educational training services, rehabilitation/detox programs, and the like is critical to evaluating methods to improve and expand such services. Quantifying participation and effectiveness of these programs can potentially lead to a shift in jail rehabilitation services.

4. Jail suicides. “The rate of jail suicides is about three times the rate of prison suicides.” Jails are isolating spaces that separate inmates from social support networks, diminish personal control, and often lack mental health resources. Most people in jail face minor charges and spend less time incarcerated due to shorter sentences. Reviewing the previous jail suicide statistics aids in pinpointing suicide risk, identifying high-risk groups, and ultimately, prescribing intervention procedures and best practices to end jail suicides.

5. Gender and race inequities. It is well known that Black men are disproportionately incarcerated, and the number of Black women in jails and prisons has rapidly increased. It is important to view this disparity as it relates to the demographics of the total population of an area. Providing data that show trends in particular crimes committed by race and gender might lead to further analysis and policy changes in the root causes of these crimes (poverty, employment, education, housing, etc.).

6. Prior interaction with the juvenile justice system. The school-to-prison pipeline describes the systematic school discipline policies that increase a student’s interaction with the juvenile justice system. Knowing how many incarcerated persons have been suspended, expelled, or incarcerated as a juvenile can encourage schools to examine their discipline policies and institute more restorative justice programs for students. It would also encourage transitional programs for formerly incarcerated youth in order to decrease the recidivism rate among young people.

7. Sentencing reforms. Evaluating the charges on which a person is arrested, the length of stay, average length of sentences, charges for which sentences are given, and the length of time from the first appearance to arraignment and trial disposition can inform more just and balanced sentencing laws enforced by the judicial branch….(More)”
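
The population figures mentioned in point 2 of the list above (average daily population) and the length-of-stay figures in point 7 can be derived from basic booking records. The Python sketch below is only an illustration under stated assumptions: the booking and release dates are invented, the record layout and function names are hypothetical, and average daily population is taken here to mean total inmate-days in a period divided by the number of days in that period.

```python
from datetime import date

# Hypothetical booking records as (booking_date, release_date) pairs.
# Invented for illustration; not drawn from any real jail.
BOOKINGS = [
    (date(2015, 1, 1), date(2015, 1, 11)),
    (date(2015, 1, 5), date(2015, 1, 8)),
    (date(2015, 1, 20), date(2015, 2, 10)),  # extends past the reporting period
]


def average_daily_population(bookings, period_start, period_end):
    """Inmate-days inside [period_start, period_end) divided by days in the period."""
    period_days = (period_end - period_start).days
    if period_days <= 0:
        raise ValueError("period_end must be after period_start")
    inmate_days = 0
    for booked, released in bookings:
        # Clip each stay to the reporting period before counting its days.
        start = max(booked, period_start)
        end = min(released, period_end)
        if end > start:
            inmate_days += (end - start).days
    return inmate_days / period_days


def average_length_of_stay(bookings):
    """Mean number of days between booking and release, over all records."""
    stays = [(released - booked).days for booked, released in bookings]
    return sum(stays) / len(stays)


adp = average_daily_population(BOOKINGS, date(2015, 1, 1), date(2015, 1, 31))
alos = average_length_of_stay(BOOKINGS)
print(f"Average daily population: {adp:.2f}")
print(f"Average length of stay (days): {alos:.2f}")
```

A real analysis would also need to handle people still in custody (no release date) and same-day releases, which this sketch ignores.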

Accur8Africa


“Accur8Africa aims to be the leading platform supporting the accuracy of data in the continent. If we intend to meet the Sustainable Development Goals (SDGs) in the next fifteen years, accurate data remains a non-negotiable necessity. Accur8Africa recognizes that nothing less than a data revolution is required. To achieve this we are building the statistical capacity of institutions across Africa and encouraging the use of data-driven decisions alongside better development metrics for key sectors such as gender equality, climate change, equity and social inclusion and health.

Africa has data in abundance but it exists in a fragmented and disorganized manner. As a result, the achievements of the Millennium Development Goals will be largely unquantifiable. As we transition from the MDGs to the Sustainable Development Goals, and national governments meet to discuss the 17 goals that could transform the world by 2030, we believe that the African Continent deserves better and more accurate data…..Africa has a great role to play in the next fifteen years. The United Nations development agenda has generated momentum for a worldwide “data revolution,” shining a much-needed light on the need for better development data in Africa and elsewhere. Governments, international institutions, and donors need accurate data on basic development metrics such as inflation, vaccination coverage, and school enrolment in order to accurately plan, budget, and evaluate their activities. Governments, citizens, and civil society can then use this data as a “currency” for accountability. When statistical systems function properly, good-quality data are exchanged freely amongst all stakeholders ensuring that funding and development efforts are producing the desired results….(More)”

The internet is the answer to all the questions of our time


Cory Doctorow in The Guardian: “…Why do people work for these organisations? Because they are utopians. Not utopians in the sense of believing that the internet is predestined to come out all right no matter what. Rather, we are utopians because, on the one hand, we are terrified of what kind of surveillance and control the internet enables, and because, on the other hand, we believe that the future is up for grabs: that we can work together to change what the internet is and what it will become. Nothing is more utopian than a belief that, when things are bad, we can make them better.

The internet has become the nervous system of the 21st century, wiring together devices that we carry, devices that are in our bodies, devices that our bodies are in. It is woven into the fabric of government service delivery, of war-fighting systems, of activist groups, of major corporations and teenagers’ social groups and the commerce of street-market hawkers.

There are many fights more important than the fight over how the internet is regulated. Equity in race, gender, sexual preference; the widening wealth gap; the climate crisis – each one far more important than the fight over the rules for the net.

Except for one thing: the internet is how every one of these fights will be won or lost. Without a free, fair and open internet, proponents of urgent struggles for justice will be outmaneuvered and outpaced by their political opponents, by the power-brokers and reactionaries of the status quo. The internet isn’t the most important fight we have; but it’s the most foundational….

The questions of the day are “How do we save the planet from the climate crisis?” and “What do we do about misogyny, racial profiling and police violence, and homophobic laws?” and “How do we check mass surveillance and the widening power of the state?” and “How do we bring down autocratic, human-rights-abusing regimes without leaving behind chaos and tragedy?”

Those are the questions.

But the internet is the answer. If you propose to fix any of these things without using the internet, you’re not being serious. And if you want to free the internet to use in all those fights, there’s a quarter century’s worth of Internet Utopians who’ve got your back….(More)”

Democratising the Data Revolution


Jonathan Gray at Open Knowledge: “What will the “data revolution” do? What will it be about? What will it count? What kinds of risks and harms might it bring? Whom and what will it serve? And who will get to decide?

Today we are launching a new discussion paper on “Democratising the Data Revolution”, which is intended to advance thinking and action around civil society engagement with the data revolution. It looks beyond the disclosure of existing information, towards more ambitious and substantive forms of democratic engagement with data infrastructures.

It concludes with a series of questions about what practical steps institutions and civil society organisations might take to change what is measured and how, and how these measurements are put to work.

You can download the full PDF report here, or continue to read on in this blog post.

What Counts?

How might civil society actors shape the data revolution? In particular, how might they go beyond the question of what data is disclosed towards looking at what is measured in the first place? To kickstart discussion around this topic, we will look at three kinds of intervention: changing existing forms of measurement, advocating new forms of measurement and undertaking new forms of measurement.

Changing Existing Forms of Measurement

Rather than just focusing on the transparency, disclosure and openness of public information, civil society groups can argue for changing what is measured with existing data infrastructures. One example of this is recent campaigning around company ownership in the UK. Advocacy groups wanted to unpick networks of corporate ownership and control in order to support their campaigning and investigations around tax avoidance, tax evasion and illicit financial flows.

While the UK company register recorded information about “nominal ownership”, it did not include information about so-called “beneficial ownership”, or who ultimately benefits from the ownership and control of companies. Campaigners undertook an extensive programme of activities to advocate for changes and extensions to existing data infrastructures – including via legislation, software systems, and administrative protocols.

Advocating New Forms of Measurement

As well as changing or recalibrating existing forms of measurement, campaigners and civil society organisations can make the case for the measurement of things which were not previously measured. For example, over the past several decades social and political campaigning has resulted in new indicators about many different issues – such as gender inequality, health, work, disability, pollution or education. In such cases activists aimed to establish a given indicator as important and relevant for public institutions, decision makers, and broader publics – in order to, for example, inform policy development or resource allocation.

Undertaking New Forms of Measurement

Historically, many civil society organisations and advocacy groups have collected their own data to make the case for action on issues that they work on – from human rights abuses to endangered species….(More)”

The death of data science – and rise of the citizen scientist


Ben Rossi at Information Age: “The notion of data science was born from the recent idea that if you have enough data, you don’t need much (if any) science to divine the truth and foretell the future – as opposed to the long-established rigours of statistical or actuarial science, which most times require painstaking efforts and substantial time to produce their version of ‘the truth’. …. Rather than embracing this untested and, perhaps, doomed form of science, and aimlessly searching for unicorns (also known as data scientists) to pay vast sums to, many organisations are now embracing the idea of making everyone data and analytics literate.

This leads me to what my column is really meant to focus on: the rise of the citizen scientist. 

The citizen scientist is not a new idea, having seen action in the space and earth sciences world for decades now, and has really come into its own as we enter the age of open data.

Cometh the hour

Given the exponential growth of open data initiatives across the world – the UK remains the leader, but has growing competition from all locations – the need for citizen scientists is now paramount. 

As governments open up vast repositories of new data of every type, the opportunity for these same governments (and commercial interests) to leverage the passion, skills and collective know-how of citizen scientists to help garner deeper insights into the scientific and civic challenges of the day is substantial. 

They can then take this knowledge and the collective energy of the citizen scientist community to develop common solution sets and applications to meet the needs of all their constituencies without expending much in terms of financial resources or suffering substantial development time lags. 

This can be a windfall of benefits for every level or type of government found around the world. The use of citizen scientists to tackle so-called ‘grand challenge’ problems has been a driving force behind many governments’ commitment to and investment in open data to date. 

There are so many challenges in governing today that it would be foolish not to employ these very capable resources to help tackle them. 

The benefits manifested from this approach are substantial and well proven. Many are well articulated in the open data success stories to date. 

Additionally, you only need to attend a local ‘hack fest’ to see how engaged citizen scientists of any age, gender and race can be, and feel the sense of community that these events foster as everyone focuses on the challenges at hand and works diligently to surmount them using very creative approaches. 

As open data becomes pervasive in use and matures in respect to the breadth and richness of the data sets being curated, the benefits returned to both government and its constituents will be manifold. 

The catalyst to realising these benefits and achieving return on investment will be the role of citizen scientists, who are not going to be statisticians, actuaries or so-called data gurus, but ordinary people with a passion for science and learning and a desire to contribute to solving the many grand challenges facing society at large….(More)”