All European scientific articles to be freely accessible by 2020


EU Presidency: “All scientific articles in Europe must be freely accessible as of 2020. EU member states want to achieve optimal reuse of research data. They are also looking into a European visa for foreign start-up founders.

And, according to the new Innovation Principle, new European legislation must take account of its impact on innovation. These are the main outcomes of the meeting of the Competitiveness Council in Brussels on 27 May.

Sharing knowledge freely

Under the presidency of Netherlands State Secretary for Education, Culture and Science Sander Dekker, the EU ministers responsible for research and innovation decided unanimously to take these significant steps. Mr Dekker is pleased that these ambitions have been translated into clear agreements to maximise the impact of research. ‘Research and innovation generate economic growth and more jobs and provide solutions to societal challenges,’ the state secretary said. ‘And that means a stronger Europe. To achieve that, Europe must be as attractive as possible for researchers and start-ups to locate here and for companies to invest. That calls for knowledge to be freely shared. The time for talking about open access is now past. With these agreements, we are going to achieve it in practice.’

Open access

Open access means that scientific publications on the results of research supported by public and public-private funds must be freely accessible to everyone. That is not yet the case. The results of publicly funded research are currently not accessible to people outside universities and knowledge institutions. As a result, teachers, doctors and entrepreneurs do not have access to the latest scientific insights that are so relevant to their work, and universities have to take out expensive subscriptions with publishers to gain access to publications.

Reusing research data

From 2020, all scientific publications on the results of publicly funded research must be freely available. It must also be possible to reuse research data optimally. To achieve that, the data must be made accessible, unless there are well-founded reasons for not doing so, for example intellectual property rights or security or privacy issues….(More)”

Time for sharing data to become routine: the seven excuses for not doing so are all invalid


Paper by Richard Smith and Ian Roberts: “Data are more valuable than scientific papers but researchers are incentivised to publish papers not share data. Patients are the main beneficiaries of data sharing but researchers have several incentives not to share: others might use their data to get ahead in the academic rat race; they might be scooped; their results might not be replicable; competitors may reach different conclusions; their data management might be exposed as poor; patient confidentiality might be breached; and technical difficulties make sharing impossible. All of these barriers can be overcome and researchers should be rewarded for sharing data. Data sharing must become routine….(More)”

If you build it… will they come?


Laura Bacon at Omidyar Network: “What do datasets on Danish addresses, Indonesian elections, Singapore Dengue Fever, Slovakian contracts, Uruguayan health service provision, and Global weather systems have in common? Read on to learn more…

On May 12, 2016, more than 40 nations’ leaders gathered in London for an Anti-Corruption Summit, convened by UK Prime Minister David Cameron. Among the commitments made, 40 countries pledged to make their procurement processes open by default, with 14 countries specifically committing to publish to the Open Contracting Data Standard.

This conference and these commitments can be seen as part of a larger global norm toward openness and transparency, also embodied by the Open Government Partnership, Open Data Charter, and increasing numbers of Open Data Portals.

As government data is increasingly published openly in the public domain, valid questions have been raised about what impact the data will have: As governments release this data, will it be accessed and used? Will it ultimately improve lives, root out corruption, hold answers to seemingly intractable problems, and lead to economic growth?

Omidyar Network — having supported several Open Data organizations and platforms such as Open Data Institute, Open Knowledge, and Web Foundation — sought data-driven answers to these questions. After a public call for proposals, we selected NYU’s GovLab to conduct research on the impact open data has already had. Not the potential or prospect of impact, but past proven impact. The GovLab research team, led by Stefaan Verhulst, investigated a variety of sectors — health, education, elections, budgets, contracts, etc. — in a variety of locations, spanning five continents.

Their findings are promising and exciting, demonstrating that open data is changing the world by empowering people, improving governance, solving public problems, and leading to innovation. A summary is contained in this Key Findings report, and is accompanied by many open data case studies posted in this Open Data Impact Repository.

Of course, stories such as this are not 100% rosy, and the report is clear about the challenges ahead. There are plenty of cases in which open data has had minimal impact. There are cases where there was negative impact. And there are obstacles to open data reaching its full potential: namely, open data projects that don’t respond to citizens’ questions and needs, a lack of technical capacity on either the data provider or the data user side, inadequate protections for privacy and security, and a shortage of resources.

But this research holds good news: Danish addresses, Indonesian elections, Singapore Dengue Fever, Slovakian contracts, Uruguayan health service provision, Global weather systems, and others were all opened up. And all changed the world by empowering citizens, improving governance, solving public problems, and leading to innovation. Please see this report for more….(More)”

See also odimpact.org

Smart crowds in smart cities: real life, city scale deployments of a smartphone based participatory crowd management platform


Tobias Franke, Paul Lukowicz and Ulf Blanke at the Journal of Internet Services and Applications: “Pedestrian crowds are an integral part of cities. Planning for crowds, monitoring crowds and managing crowds are fundamental tasks in city management. As a consequence, crowd management is a sprawling R&D area (see related work) that includes theoretical models, simulation tools, as well as various support systems. There has also been significant interest in using computer vision techniques to monitor crowds. However, overall, the topic of crowd management has been given only little attention within the smart city domain. In this paper we report on a platform for smart, city-wide crowd management based on a participatory mobile phone sensing platform. Originally, the apps based on this platform were conceived as a technology validation tool for crowd based sensing within a basic research project. However, the initial deployments at the Notte Bianca Festival in Malta and at the Lord Mayor’s Show in London generated so much interest within the civil protection community that it has gradually evolved into a full-blown participatory crowd management system and is now in the process of being commercialized through a startup company. Until today it has been deployed at 14 events in three European countries (UK, Netherlands, Switzerland) and used by well over 100,000 people….

Obtaining knowledge about the current size and density of a crowd is one of the central aspects of crowd monitoring. For the last decades, automatic crowd monitoring in urban areas has mainly been performed by means of image processing. One use case for such video-based applications is a CCTV camera-based system that automatically alerts the staff of subway stations when the waiting platform is congested. However, one of the downsides of video-based crowd monitoring is that video cameras tend to be considered privacy invading. Therefore, a privacy-preserving approach to video-based crowd monitoring has been presented in which crowd sizes are estimated without people models or object tracking.

With respect to the mitigation of catastrophes induced by panicking crowds (e.g. during an evacuation), city planners and architects increasingly rely on tools simulating crowd behaviors in order to optimize infrastructures. Murakami et al. present an agent based simulation for evacuation scenarios. Shendarkar et al. present a work that is also based on BDI (belief, desire, intention) agents; those agents, however, are trained in a virtual reality environment, thereby giving greater flexibility to the modeling. Kluepfel et al., on the other hand, use a cellular automaton model for the simulation of crowd movement and egress behavior.

With smartphones becoming everyday items, the concept of crowd sourcing information from users of mobile applications has significantly gained traction. Roitman et al. present a smart city system where the crowd can send eyewitness reports, thereby creating deeper insights for city officials. Szabo et al. take this approach one step further and employ the sensors built into smartphones for gathering data for city services such as live transit information. Ghose et al. utilize the same principle for gathering information on road conditions. Pan et al. use a combination of crowd sourcing and social media analysis for identifying traffic anomalies….(More)”.

Case Studies of Government Use of Big Data in Latin America: Brazil and Mexico


Chapter by Roberto da Mota Ueti, Daniela Fernandez Espinosa, Laura Rafferty, Patrick C. K. Hung in Big Data Applications and Use Cases: “Big Data is changing our world with masses of information stored in huge servers spread across the planet. This new technology is changing not only companies but governments as well. Mexico and Brazil, two of the most influential countries in Latin America, are entering a new era and, as a result, facing challenges in all aspects of public policy. Using Big Data, the Brazilian Government is trying to decrease spending and use public money better by grouping public information with stored information on citizens in public services. With new reforms in education, finances and telecommunications, the Mexican Government is taking on a bigger role in efforts to channel the country’s economic policy into an improvement of the quality of life of its inhabitants. Technology is known to play an important part in developing countries that are trying to make a difference in areas such as reducing inequality or regulating the proper use of economic resources. Good use of Big Data, a new technology for managing large quantities of information, can be crucial for the Mexican Government to reach the goals that have been set under Peña Nieto’s administration. This article focuses on how the Brazilian and Mexican Governments are managing the emerging technologies of Big Data and how they include them in social and industrial projects to enhance the growth of their economies. The article also discusses the benefits of these uses of Big Data and the possible problems related to security and privacy of information….(More)”

Scientists Are Just as Confused About the Ethics of Big-Data Research as You


Sarah Zhang at Wired: “When a rogue researcher last week released 70,000 OkCupid profiles, complete with usernames and sexual preferences, people were pissed. When Facebook researchers manipulated stories appearing in Newsfeeds for a mood contagion study in 2014, people were really pissed. OkCupid filed a copyright claim to take down the dataset; the journal that published Facebook’s study issued an “expression of concern.” Outrage has a way of shaping ethical boundaries. We learn from mistakes.

Shockingly, though, the researchers behind both of those big data blowups never anticipated public outrage. (The OkCupid research does not seem to have gone through any kind of ethical review process, and a Cornell ethics review board approved the Facebook experiment.) And that shows just how untested the ethics of this new field of research is. Unlike medical research, which has been shaped by decades of clinical trials, the risks—and rewards—of analyzing big, semi-public databases are just beginning to become clear.

And the patchwork of review boards responsible for overseeing those risks are only slowly inching into the 21st century. Under the Common Rule in the US, federally funded research has to go through ethical review. Rather than one unified system though, every single university has its own institutional review board, or IRB. Most IRB members are researchers at the university, most often in the biomedical sciences. Few are professional ethicists.

Even fewer have computer science or security expertise, which may be necessary to protect participants in this new kind of research. “The IRB may make very different decisions based on who is on the board, what university it is, and what they’re feeling that day,” says Kelsey Finch, policy counsel at the Future of Privacy Forum. There are hundreds of these IRBs in the US—and they’re grappling with research ethics in the digital age largely on their own….

Or maybe other institutions, like the open science repositories asking researchers to share data, should be picking up the slack on ethical issues. “Someone needs to provide oversight, but the optimal body is unlikely to be an IRB, which usually lacks subject matter expertise in de-identification and re-identification techniques,” Michelle Meyer, a bioethicist at Mount Sinai, writes in an email.

Even among Internet researchers familiar with the power of big data, attitudes vary. When Katie Shilton, an information technology researcher at the University of Maryland, interviewed 20 online data researchers, she found “significant disagreement” over issues like the ethics of ignoring Terms of Service and obtaining informed consent. Surprisingly, the researchers also said that ethical review boards had never challenged the ethics of their work, but peer reviewers and colleagues had. Various groups like the Association of Internet Researchers and the Center for Applied Internet Data Analysis have issued guidelines, but the people who actually have power, those on institutional review boards, are only just catching up.

Outside of academia, companies like Microsoft have started to institute their own ethical review processes. In December, Finch at the Future of Privacy Forum organized a workshop called Beyond IRBs to consider processes for ethical review outside of federally funded research. After all, modern tech companies like Facebook, OkCupid, Snapchat, Netflix sit atop a trove of data 20th century social scientists could have only dreamed up.

Of course, companies experiment on us all the time, whether it’s websites A/B testing headlines or grocery stores changing the configuration of their checkout line. But as these companies hire more data scientists out of PhD programs, academics are seeing an opportunity to bridge the divide and use that data to contribute to public knowledge. Maybe updated ethical guidelines can be forged out of those collaborations. Or it just might be a mess for a while….(More)”

Building Data Responsibility into Humanitarian Action


Stefaan Verhulst at The GovLab: “Next Monday, May 23rd, governments, non-profit organizations and citizen groups will gather in Istanbul at the first World Humanitarian Summit. A range of important issues will be on the agenda, not least of which is the refugee crisis confronting the Middle East and Europe. Also on the agenda will be an issue of growing importance and relevance, even if it does not generate front-page headlines: the increasing potential (and use) of data in the humanitarian context.

To explore this topic, a new paper, “Building Data Responsibility into Humanitarian Action,” is being released today, and will be presented tomorrow at the Understanding Risk Forum. This paper is the result of a collaboration between the United Nations Office for the Coordination of Humanitarian Affairs (OCHA), The GovLab (NYU Tandon School of Engineering), the Harvard Humanitarian Initiative, and Leiden University Centre for Innovation. It seeks to identify the potential benefits and risks of using data in the humanitarian context, and begins to outline an initial framework for the responsible use of data in humanitarian settings.

Both anecdotal and more rigorously researched evidence points to the growing use of data to address a variety of humanitarian crises. The paper discusses a number of data risk case studies, including the use of call data to fight malaria in Africa; satellite imagery to identify security threats on the border between Sudan and South Sudan; and transaction data to increase the efficiency of food delivery in Lebanon. These early examples (along with a few others discussed in the paper) have begun to show the opportunities offered by data and information. More importantly, they also help us better understand the risks, including and especially those posed to privacy and security.

One of the broader goals of the paper is to integrate the specific and the theoretical, in the process building a bridge between the deep, contextual knowledge offered by initiatives like those discussed above and the broader needs of the humanitarian community. To that end, the paper builds on its discussion of case studies to begin establishing a framework for the responsible use of data in humanitarian contexts. It identifies four “Minimum Humanitarian standards for the Responsible use of Data” and four “Characteristics of Humanitarian Organizations that use Data Responsibly.” Together, these eight attributes can serve as a roadmap or blueprint for humanitarian groups seeking to use data. In addition, the paper also provides a four-step practical guide for a data responsibility framework (see also earlier blog)….(More)” Full Paper: Building Data Responsibility into Humanitarian Action

Twelve principles for open innovation 2.0


Martin Curley in Nature: “A new mode of innovation is emerging that blurs the lines between universities, industry, governments and communities. It exploits disruptive technologies — such as cloud computing, the Internet of Things and big data — to solve societal challenges sustainably and profitably, and more quickly and ably than before. It is called open innovation 2.0 (ref. 1).

Such innovations are being tested in ‘living labs’ in hundreds of cities. In Dublin, for example, the city council has partnered with my company, the technology firm Intel (of which I am a vice-president), to install a pilot network of sensors to improve flood management by measuring local rainfall and river levels, and detecting blocked drains. Eindhoven in the Netherlands is working with electronics firm Philips and others to develop intelligent street lighting. Communications-technology firm Ericsson, the KTH Royal Institute of Technology, IBM and others are collaborating to test self-driving buses in Kista, Sweden.

Yet many institutions and companies remain unaware of this radical shift. They often confuse invention and innovation. Invention is the creation of a technology or method. Innovation concerns the use of that technology or method to create value. The agile approaches needed for open innovation 2.0 conflict with the ‘command and control’ organizations of the industrial age (see ‘How innovation modes have evolved’). Institutional or societal cultures can inhibit user and citizen involvement. Intellectual-property (IP) models may inhibit collaboration. Government funders can stifle the emergence of ideas by requiring that detailed descriptions of proposed work are specified before research can begin. Measures of success, such as citations, discount innovation and impact. Policymaking lags behind the market place….

Keys to collaborative innovation

  1. Purpose. Efforts and intellects aligned through commitment rather than compliance deliver an impact greater than the sum of their parts. A great example is former US President John F. Kennedy’s vision of putting a man on the Moon. Articulating a shared value that can be created is important. A win–win scenario is more sustainable than a win–lose outcome.
  2. Partner. The ‘quadruple helix’ of government, industry, academia and citizens joining forces aligns goals, amplifies resources, attenuates risk and accelerates progress. A collaboration between Intel, University College London, Imperial College London and Innovate UK’s Future Cities Catapult is working in the Intel Collaborative Research Institute to improve people’s well-being in cities, for example to enable reduction of air pollution.
  3. Platform. An environment for collaboration is a basic requirement. Platforms should be integrated and modular, allowing a plug-and-play approach. They must be open to ensure low barriers to use, catalysing the evolution of a community. Challenges in security, standards, trust and privacy need to be addressed. For example, the Open Connectivity Foundation is securing interoperability for the Internet of Things.
  4. Possibilities. Returns may not come from a product but from the business model that enabled it, a better process or a new user experience. Strategic tools are available, such as industrial designer Larry Keeley’s breakdown of innovations into ten types in four categories: finance, process, offerings and delivery.
  5. Plan. Adoption and scale should be the focus of innovation efforts, not product creation. Around 20% of value is created when an innovation is established; more than 80% comes when it is widely adopted. Focus on the ‘four Us’: utility (value to the user); usability; user experience; and ubiquity (designing in network effects).
  6. Pyramid. Enable users to drive innovation. They inspired two-thirds of innovations in semiconductors and printed circuit boards, for example. Lego Ideas encourages children and others to submit product proposals — submitters must get 10,000 supporters for their idea to be reviewed. Successful inventors get 1% of royalties.
  7. Problem. Most innovations come from a stated need. Ethnographic research with users, customers or the environment can identify problems and support brainstorming of solutions. Create a road map to ensure the shortest path to a solution.
  8. Prototype. Solutions need to be tested and improved through rapid experimentation with users and citizens. Prototyping shows how applicable a solution is, reduces the risks of failures and can reveal pain points. ‘Hackathons’, where developers come together to rapidly try things, are increasingly common.
  9. Pilot. Projects need to be implemented in the real world on small scales first. The Intel Collaborative Research Institute runs research projects in London’s parks, neighbourhoods and schools. Barcelona’s Laboratori — which involves the quadruple helix — is pioneering open ‘living lab’ methods in the city to boost culture, knowledge, creativity and innovation.
  10. Product. Prototypes need to be converted into viable commercial products or services through scaling up and new infrastructure globally. Cloud computing allows even small start-ups to scale with volume, velocity and resilience.
  11. Product service systems. Organizations need to move from just delivering products to also delivering related services that improve sustainability as well as profitability. Rolls-Royce sells ‘power by the hour’ — hours of flight time rather than jet engines — enabled by advanced telemetry. The ultimate goal of open innovation 2.0 is a circular or performance economy, focused on services and reuse rather than consumption and waste.
  12. Process. Innovation is a team sport. Organizations, ecosystems and communities should measure, manage and improve their innovation processes to deliver results that are predictable, probable and profitable. Agile methods supported by automation shorten the time from idea to implementation….(More)”

We know where you live


MIT News Office: “From location data alone, even low-tech snoopers can identify Twitter users’ homes, workplaces….Researchers at MIT and Oxford University have shown that the location stamps on just a handful of Twitter posts — as few as eight over the course of a single day — can be enough to disclose the addresses of the poster’s home and workplace to a relatively low-tech snooper.

The tweets themselves might be otherwise innocuous — links to funny videos, say, or comments on the news. The location information comes from geographic coordinates automatically associated with the tweets.

Twitter’s location-reporting service is off by default, but many Twitter users choose to activate it. The new study is part of a more general project at MIT’s Internet Policy Research Initiative to help raise awareness about just how much privacy people may be giving up when they use social media.

The researchers describe their research in a paper presented last week at the Association for Computing Machinery’s Conference on Human Factors in Computing Systems, where it received an honorable mention in the best-paper competition, a distinction reserved for only 4 percent of papers accepted to the conference.

“Many people have this idea that only machine-learning techniques can discover interesting patterns in location data,” says Ilaria Liccardi, a research scientist at MIT’s Internet Policy Research Initiative and first author on the paper. “And they feel secure that not everyone has the technical knowledge to do that. With this study, what we wanted to show is that when you send location data as a secondary piece of information, it is extremely simple for people with very little technical knowledge to find out where you work or live.”

Conclusions from clustering

In their study, Liccardi and her colleagues — Alfie Abdul-Rahman and Min Chen of Oxford’s e-Research Centre in the U.K. — used real tweets from Twitter users in the Boston area. The users consented to the use of their data, and they also confirmed their home and work addresses, their commuting routes, and the locations of various leisure destinations from which they had tweeted.

The time and location data associated with the tweets were then presented to a group of 45 study participants, who were asked to try to deduce whether the tweets had originated at the Twitter users’ homes, their workplaces, leisure destinations, or locations along their commutes. The participants were not recruited on the basis of any particular expertise in urban studies or the social sciences; they just drew what conclusions they could from location clustering….

Predictably, participants fared better with map-based representations, correctly identifying Twitter users’ homes roughly 65 percent of the time and their workplaces at closer to 70 percent. Even the tabular representation was informative, however, with accuracy rates of just under 50 percent for homes and a surprisingly high 70 percent for workplaces….(More; Full paper)”
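
To make the "low-tech snooper" point concrete, here is a minimal Python sketch of how location clustering alone can suggest a home and a workplace. It is not the method used in the study, where human participants inspected maps and tables; the time windows, the grid size, the function names and the sample coordinates are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime


def grid_cell(lat, lon, precision=3):
    """Snap a coordinate to a coarse grid cell (3 decimals is roughly 100 m)."""
    return (round(lat, precision), round(lon, precision))


def guess_home_and_work(posts):
    """Guess home and work cells from (iso_timestamp, lat, lon) tuples.

    Assumed heuristic (hypothetical, not from the paper): the busiest cell
    at night is the home guess, and the busiest cell during weekday
    office hours is the work guess.
    """
    night, office = Counter(), Counter()
    for stamp, lat, lon in posts:
        t = datetime.fromisoformat(stamp)
        cell = grid_cell(lat, lon)
        if t.hour >= 21 or t.hour < 6:
            night[cell] += 1
        elif t.weekday() < 5 and 9 <= t.hour < 17:
            office[cell] += 1
        # Posts outside both windows (e.g. commutes) are ignored.
    home = night.most_common(1)[0][0] if night else None
    work = office.most_common(1)[0][0] if office else None
    return home, work


# Eight invented posts over two weekdays; the coordinates are made up.
sample_posts = [
    ("2016-05-16T22:15:00", 42.37361, -71.10972),
    ("2016-05-16T23:40:00", 42.37363, -71.10975),
    ("2016-05-17T05:40:00", 42.37358, -71.10969),
    ("2016-05-16T10:00:00", 42.36012, -71.09421),
    ("2016-05-16T14:30:00", 42.36008, -71.09418),
    ("2016-05-17T11:45:00", 42.36011, -71.09423),
    ("2016-05-16T12:30:00", 42.36541, -71.10382),
    ("2016-05-16T18:30:00", 42.36820, -71.10110),
]

home, work = guess_home_and_work(sample_posts)
print("home cell:", home)
print("work cell:", work)
```

The sketch uses only the standard library and a fixed grid rather than a real clustering algorithm, which is the point: nothing here requires machine learning, only counting where posts pile up at particular times of day.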

The promises and pitfalls of open urban data


Keynote by Robert M. Goerge at the 2016 Third International Conference on eDemocracy & eGovernment (ICEDEG): “Open data portals are springing up around the world. Municipalities, states and countries have made available data that has never been as accessible to the general public. These data have led to many applications that have informed the public of new urban conditions or provided information to make urban life easier. However, it should be clear that these data have limitations in the effort to solve many urban problems because in many cases they do not provide all of the information that is needed by government and NGOs to get at the cause or at least correlations of the problem at hand. It is still necessary to have access to data that cannot be made public to address some of the most serious urban problems. While this seems just to apply to public access, it is also the case that government employees or those with legitimate access to the necessary non-open data lack access because of legal, organizational, privacy, or bureaucratic issues. This limits the promise of increasing data-driven efforts to address the most critical urban issues. Solutions to these problems in the context of ethical behavior will be discussed….(More)”