Paper by Usamah A. Algemili: “In recent years, we have witnessed increasing interest in government data. Many governments around the world have recognized the value of their passive data sets and have launched Open Data policies, yet many countries are still working to convert raw data into useful representations. This paper surveys previous Open Data initiatives and discusses the various challenges that open data projects may encounter in the transformation from passive data sets to an Open Data culture. It also reaches out to project teams to gather their practical assessments: an online questionnaire, developed in alignment with the previous literature on data-integration challenges, was distributed among project teams. 138 eligible professionals participated, and their responses were analyzed by the researcher. The results section identifies the most critical challenges from the project teams’ point of view, and the findings show four obstacles that stand out as critical challenges facing project teams. This paper sheds light on these challenges and attempts to identify the gap between current guidelines and practical experience. Accordingly, it presents the current infrastructure of the Open Data framework, followed by recommendations that may lead to successful implementation of Open Data development….(More)”
Twelve principles for open innovation 2.0
Martin Curley in Nature: “A new mode of innovation is emerging that blurs the lines between universities, industry, governments and communities. It exploits disruptive technologies — such as cloud computing, the Internet of Things and big data — to solve societal challenges sustainably and profitably, and more quickly and ably than before. It is called open innovation 2.0 (ref. 1).
Such innovations are being tested in ‘living labs’ in hundreds of cities. In Dublin, for example, the city council has partnered with my company, the technology firm Intel (of which I am a vice-president), to install a pilot network of sensors to improve flood management by measuring local rainfall and river levels, and detecting blocked drains. Eindhoven in the Netherlands is working with electronics firm Philips and others to develop intelligent street lighting. Communications-technology firm Ericsson, the KTH Royal Institute of Technology, IBM and others are collaborating to test self-driving buses in Kista, Sweden.
Yet many institutions and companies remain unaware of this radical shift. They often confuse invention and innovation. Invention is the creation of a technology or method. Innovation concerns the use of that technology or method to create value. The agile approaches needed for open innovation 2.0 conflict with the ‘command and control’ organizations of the industrial age (see ‘How innovation modes have evolved’). Institutional or societal cultures can inhibit user and citizen involvement. Intellectual-property (IP) models may inhibit collaboration. Government funders can stifle the emergence of ideas by requiring that detailed descriptions of proposed work are specified before research can begin. Measures of success, such as citations, discount innovation and impact. Policymaking lags behind the marketplace….
Keys to collaborative innovation
- Purpose. Efforts and intellects aligned through commitment rather than compliance deliver an impact greater than the sum of their parts. A great example is former US President John F. Kennedy’s vision of putting a man on the Moon. Articulating a shared value that can be created is important. A win–win scenario is more sustainable than a win–lose outcome.
- Partner. The ‘quadruple helix’ of government, industry, academia and citizens joining forces aligns goals, amplifies resources, attenuates risk and accelerates progress. A collaboration between Intel, University College London, Imperial College London and Innovate UK’s Future Cities Catapult is working in the Intel Collaborative Research Institute to improve people’s well-being in cities, for example to enable reduction of air pollution.
- Platform. An environment for collaboration is a basic requirement. Platforms should be integrated and modular, allowing a plug-and-play approach. They must be open to ensure low barriers to use, catalysing the evolution of a community. Challenges in security, standards, trust and privacy need to be addressed. For example, the Open Connectivity Foundation is securing interoperability for the Internet of Things.
- Possibilities. Returns may not come from a product but from the business model that enabled it, a better process or a new user experience. Strategic tools are available, such as industrial designer Larry Keeley’s breakdown of innovations into ten types in four categories: finance, process, offerings and delivery.
- Plan. Adoption and scale should be the focus of innovation efforts, not product creation. Around 20% of value is created when an innovation is established; more than 80% comes when it is widely adopted (ref. 7). Focus on the ‘four Us’: utility (value to the user); usability; user experience; and ubiquity (designing in network effects).
- Pyramid. Enable users to drive innovation. They inspired two-thirds of innovations in semiconductors and printed circuit boards, for example. Lego Ideas encourages children and others to submit product proposals — submitters must get 10,000 supporters for their idea to be reviewed. Successful inventors get 1% of royalties.
- Problem. Most innovations come from a stated need. Ethnographic research with users, customers or the environment can identify problems and support brainstorming of solutions. Create a road map to ensure the shortest path to a solution.
- Prototype. Solutions need to be tested and improved through rapid experimentation with users and citizens. Prototyping shows how applicable a solution is, reduces the risks of failures and can reveal pain points. ‘Hackathons’, where developers come together to rapidly try things, are increasingly common.
- Pilot. Projects need to be implemented in the real world on small scales first. The Intel Collaborative Research Institute runs research projects in London’s parks, neighbourhoods and schools. Barcelona’s Laboratori — which involves the quadruple helix — is pioneering open ‘living lab’ methods in the city to boost culture, knowledge, creativity and innovation.
- Product. Prototypes need to be converted into viable commercial products or services through scaling up and new infrastructure globally. Cloud computing allows even small start-ups to scale with volume, velocity and resilience.
- Product service systems. Organizations need to move from just delivering products to also delivering related services that improve sustainability as well as profitability. Rolls-Royce sells ‘power by the hour’ — hours of flight time rather than jet engines — enabled by advanced telemetry. The ultimate goal of open innovation 2.0 is a circular or performance economy, focused on services and reuse rather than consumption and waste.
- Process. Innovation is a team sport. Organizations, ecosystems and communities should measure, manage and improve their innovation processes to deliver results that are predictable, probable and profitable. Agile methods supported by automation shorten the time from idea to implementation….(More)”
Policy Compass
“The Policy Compass helps you to utilise, interact with, visualise and interpret the growing amount of publicly available Open Data. By offering easy-to-use, web-based data processing tools, it opens up the potential of Open Data for people with limited knowledge about technology. By doing so, it enables journalists, scientists and citizens to easily execute transparent and accountable policy evaluation or analysis.
The website already has selected Open Data from a number of reliable international sources, but it is also possible to upload your own data for processing. Once you have found the data you’re looking for, you can visualise it or create a causal model. The results of your processing activities are then easily shared for discussion or use in your work….(More)”
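The workflow Policy Compass packages for non-specialists, picking an open indicator dataset and turning it into a comparable visualisation, can be approximated in a few lines of code. The sketch below is only an illustration of that workflow under invented data, not the platform’s API.

```python
# A minimal sketch of the kind of processing Policy Compass makes
# point-and-click: take an open indicator dataset and visualise it for
# policy comparison. All values below are invented for illustration.
import pandas as pd
import matplotlib.pyplot as plt

data = pd.DataFrame({
    "year":    [2012, 2013, 2014, 2015] * 2,
    "country": ["A"] * 4 + ["B"] * 4,
    "rate":    [10.2, 9.8, 9.1, 8.7, 24.5, 26.1, 25.8, 24.0],  # e.g. an unemployment rate
})

# Reshape so each country becomes a line on the chart.
pivot = data.pivot(index="year", columns="country", values="rate")
pivot.plot(marker="o")
plt.ylabel("Indicator value (invented)")
plt.title("Comparing an open-data indicator across two countries")
plt.tight_layout()
plt.show()
```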
We know where you live
MIT News Office: “From location data alone, even low-tech snoopers can identify Twitter users’ homes, workplaces….Researchers at MIT and Oxford University have shown that the location stamps on just a handful of Twitter posts — as few as eight over the course of a single day — can be enough to disclose the addresses of the poster’s home and workplace to a relatively low-tech snooper.
The tweets themselves might be otherwise innocuous — links to funny videos, say, or comments on the news. The location information comes from geographic coordinates automatically associated with the tweets.
Twitter’s location-reporting service is off by default, but many Twitter users choose to activate it. The new study is part of a more general project at MIT’s Internet Policy Research Initiative to help raise awareness about just how much privacy people may be giving up when they use social media.
The researchers describe their research in a paper presented last week at the Association for Computing Machinery’s Conference on Human Factors in Computing Systems, where it received an honorable mention in the best-paper competition, a distinction reserved for only 4 percent of papers accepted to the conference.
“Many people have this idea that only machine-learning techniques can discover interesting patterns in location data,” says Ilaria Liccardi, a research scientist at MIT’s Internet Policy Research Initiative and first author on the paper. “And they feel secure that not everyone has the technical knowledge to do that. With this study, what we wanted to show is that when you send location data as a secondary piece of information, it is extremely simple for people with very little technical knowledge to find out where you work or live.”
Conclusions from clustering
In their study, Liccardi and her colleagues — Alfie Abdul-Rahman and Min Chen of Oxford’s e-Research Centre in the U.K. — used real tweets from Twitter users in the Boston area. The users consented to the use of their data, and they also confirmed their home and work addresses, their commuting routes, and the locations of various leisure destinations from which they had tweeted.
The time and location data associated with the tweets were then presented to a group of 45 study participants, who were asked to try to deduce whether the tweets had originated at the Twitter users’ homes, their workplaces, leisure destinations, or locations along their commutes. The participants were not recruited on the basis of any particular expertise in urban studies or the social sciences; they just drew what conclusions they could from location clustering….
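The study’s point is that no sophisticated tooling is required, but the heuristic the participants applied is easy to state in code. The sketch below is a rough illustration of that heuristic, not the researchers’ protocol: the coordinates are invented and a coarse grid bucket stands in for proper clustering.

```python
# A crude illustrative sketch (not the study's method): group a user's
# geotagged posts into location clusters and guess which cluster is "home"
# (night-time posts) and which is "work" (office-hours posts).
from collections import defaultdict

# Hypothetical data: (latitude, longitude, hour_of_day) for each tweet.
tweets = [
    (42.3601, -71.0589, 23), (42.3602, -71.0590, 7), (42.3600, -71.0588, 22),
    (42.3736, -71.1097, 10), (42.3738, -71.1098, 14), (42.3737, -71.1096, 16),
]

def cluster_key(lat, lon, precision=3):
    """Bucket coordinates onto a coarse grid (roughly 100 m) as a stand-in for clustering."""
    return (round(lat, precision), round(lon, precision))

clusters = defaultdict(list)
for lat, lon, hour in tweets:
    clusters[cluster_key(lat, lon)].append(hour)

for key, hours in clusters.items():
    night = sum(h >= 21 or h <= 8 for h in hours)
    label = "home?" if night > len(hours) / 2 else "work?"
    print(key, label, "hours:", sorted(hours))
```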
Predictably, participants fared better with map-based representations, correctly identifying Twitter users’ homes roughly 65 percent of the time and their workplaces at closer to 70 percent. Even the tabular representation was informative, however, with accuracy rates of just under 50 percent for homes and a surprisingly high 70 percent for workplaces….(More; Full paper)”
City planners tap into wealth of cycling data from Strava tracking app
Peter Walker in The Guardian: “Sheila Lyons recalls the way Oregon used to collect data on how many people rode bikes. “It was very haphazard, two-hour counts done once a year,” said the woman in charge of cycling policy for the state government. “Volunteers, sitting on the street corner because they wanted better bike facilities. Pathetic, really.”
But in 2013 a colleague had an idea. She recorded her own bike rides using an app called Strava, and thought: why not ask the company to share its data? And so was born Strava Metro, both an inadvertent tech business spinoff and a similarly accidental urban planning tool, one that is now quietly helping to reshape streets in more than 70 places around the world and counting.
Using the GPS tracking capability of a smartphone and similar devices, Strava allows people to plot how far and fast they go and compare themselves against other riders. Users create designated route segments, each of which has a leaderboard ranked by speed.
Originally aimed just at cyclists, Strava soon incorporated running and now has options for more than two dozen pursuits. But cycling remains the most popular, and while the company is coy about overall figures, it says it adds 1 million new members every two months and has more than six million uploads a week.
For city planners like Lyons, used to very occasional single-street bike counts, this is a near-unimaginable wealth of data. While individual details are anonymised, it still shows how many Strava-using cyclists, plus their age and gender, ride down any street at any time of the day, and the entire route they take.
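Strava Metro’s pipeline is proprietary, but the general shape of the output a planner receives, anonymised rides rolled up into per-street, per-hour counts, can be illustrated with a short sketch. All of the streets, fields and numbers below are invented.

```python
# Illustrative only (not Strava Metro's pipeline): aggregate anonymised ride
# records into per-street, per-hour counts of the kind a city planner might use.
from collections import Counter

# Hypothetical anonymised records: (street_segment, hour_of_day, age_band, gender)
rides = [
    ("SW Naito Pkwy", 8, "25-34", "F"),
    ("SW Naito Pkwy", 8, "35-44", "M"),
    ("SW Naito Pkwy", 17, "25-34", "M"),
    ("N Williams Ave", 8, "25-34", "F"),
    ("N Williams Ave", 17, "45-54", "M"),
]

counts = Counter((street, hour) for street, hour, _, _ in rides)
for (street, hour), n in sorted(counts.items()):
    print(f"{street:>15}  {hour:02d}:00  {n} rider(s)")
```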
The company says it initially had no idea how useful the information could be, and only began visualising data on heatmaps as a fun project for its engineers. “We’re not city planners,” said Michael Horvath, one of two former Harvard University rowers and relatively veteran 40-something tech entrepreneurs who co-founded Strava in 2009.
“One of the things that we learned early on is that these people just don’t have very much data to begin with. Not only is ours a novel dataset, in many cases it’s the only dataset that speaks to the behaviour of cyclists and pedestrians in that city or region.”…(More)”
Robot Regulators Could Eliminate Human Error
Regblog: “Long a fixture of science fiction, artificial intelligence is now part of our daily lives, even if we do not realize it. Through the use of sophisticated machine learning algorithms, for example, computers now work to filter out spam messages automatically from our email. Algorithms also identify us by our photos on Facebook, match us with new friends on online dating sites, and suggest movies to watch on Netflix.
These uses of artificial intelligence hardly seem very troublesome. But should we worry if government agencies start to use machine learning?
Complaints abound even today about the uncaring “bureaucratic machinery” of government. Yet seeing how machine learning is starting to replace jobs in the private sector, we can easily imagine a literal machinery of government in which decisions once made by human public servants are increasingly made by machines.
Technologists warn of an impending “singularity,” when artificial intelligence surpasses human intelligence. Entrepreneur Elon Musk cautions that artificial intelligence poses one of our “biggest existential threats.” Renowned physicist Stephen Hawking eerily forecasts that artificial intelligence might even “spell the end of the human race.”
Are we ready for a world of regulation by robot? Such a world is closer than we think—and it could actually be worth welcoming.
Already government agencies rely on machine learning for a variety of routine functions. The Postal Service uses learning algorithms to sort mail, and cities such as Los Angeles use them to time their traffic lights. But while uses like these seem relatively benign, consider that machine learning could also be used to make more consequential decisions. Disability claims might one day be processed automatically with the aid of artificial intelligence. Licenses could be awarded to airplane pilots based on what kinds of safety risks complex algorithms predict each applicant poses.
Learning algorithms are already being explored by the Environmental Protection Agency to help make regulatory decisions about what toxic chemicals to control. Faced with tens of thousands of new chemicals that could potentially be harmful to human health, federal regulators have supported the development of a program to prioritize which of the many chemicals in production should undergo more in-depth testing. By some estimates, machine learning could save the EPA up to $980,000 per toxic chemical positively identified.
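As a hedged illustration of how such prioritisation might work (this is not the EPA’s actual program, and the features and data are synthetic), the sketch below trains a classifier on chemicals with known outcomes and ranks untested chemicals by predicted risk so the riskiest are examined first.

```python
# A hedged sketch of machine-learning prioritisation, not the EPA's system:
# fit a classifier on chemicals with known toxicity outcomes, then rank
# untested chemicals by predicted risk so the riskiest are tested first.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins for chemical descriptors (e.g. molecular weight, logP,
# count of structural alerts) and known toxicity labels.
X_known = rng.normal(size=(200, 3))
y_known = (X_known[:, 0] + 0.5 * X_known[:, 2] + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_known, y_known)

# Score a batch of untested chemicals and surface the highest predicted risks.
X_untested = rng.normal(size=(10, 3))
risk = model.predict_proba(X_untested)[:, 1]  # probability of the "toxic" class
for idx in np.argsort(risk)[::-1]:
    print(f"chemical {idx}: predicted toxicity risk {risk[idx]:.2f}")
```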
It’s not hard then to imagine a day in which even more regulatory decisions are automated. Researchers have shown that machine learning can lead to better outcomes when determining whether parolees ought to be released or domestic violence orders should be imposed. Could the imposition of regulatory fines one day be determined by a computer instead of a human inspector or judge? Quite possibly so, and this would be a good thing if machine learning could improve accuracy, eliminate bias and prejudice, and reduce human error, all while saving money.
But can we trust a government that bungled the initial rollout of Healthcare.gov to deploy artificial intelligence responsibly? In some circumstances we should….(More)”
Crowdsourcing corruption in India’s maternal health services
MSMA (“My Health, My Voice”) is part of SAHAYOG, a non-governmental umbrella organization that helped launch the project. MSMA uses an Ushahidi platform to map and collect data on unofficial fees that plague India’s ostensibly “free” maternal health services. It is one of the many projects showcased in DW Akademie’s recently launched Digital Innovation Library. SAHAYOG works closely with grassroots organizations to promote gender equality and women’s health issues from a human rights perspective…
Sandhya and her colleagues are convinced that promoting transparency and accountability through the data collected can empower the women. If they’re aware of their entitlements, she says, they can demand their rights and in the process hold leaders accountable.
“Information is power,” Sandhya explains. Without this information, she says, “they aren’t in a position to demand what is rightly theirs.”
Health care providers hold a certain degree of power when entrusted with taking care of expectant mothers. Many women give in to bribes for fear of being otherwise neglected or abused.
With the MSMA project, however, poor rural women have technology that is easy to use and accessible on their mobile phones, and that empowers them to make complaints and report bribes for services that are supposed to be free.
MSMA is an innovative data-driven platform that combines a toll free number, an interactive voice response system (IVRS) and a website that contains accessible reports. In addition to enabling poor women to air their frustrations anonymously, the project aggregates actionable data which can then be used by the NGO as well as the government to work towards improving the situation for mothers in India….(More)”
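The aggregation step that turns individual calls into actionable data can be illustrated with a short sketch. The code below simply tallies hypothetical reports by district and service; it is not the MSMA or Ushahidi software, and the districts, services and amounts are invented.

```python
# Illustrative only (not the MSMA/Ushahidi platform): tally anonymous reports
# of unofficial fees by district and service so patterns of demands stand out.
from collections import defaultdict

# Hypothetical reports: (district, service, amount_demanded_in_rupees)
reports = [
    ("District A", "delivery", 500),
    ("District A", "delivery", 300),
    ("District A", "ambulance", 200),
    ("District B", "delivery", 400),
]

summary = defaultdict(lambda: {"count": 0, "total": 0})
for district, service, amount in reports:
    entry = summary[(district, service)]
    entry["count"] += 1
    entry["total"] += amount

for (district, service), entry in sorted(summary.items()):
    print(f"{district:<11} {service:<10} reports={entry['count']} total=Rs.{entry['total']}")
```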
Big Data for public policy: the quadruple helix
Julia Lane in the Journal of Policy Analysis and Management: “Data from the federal statistical system, particularly the Census Bureau, have long been a key resource for public policy. Although most of those data have been collected through purposive surveys, there have been enormous strides in the use of administrative records on business (Jarmin & Miranda, 2002), jobs (Abowd, Haltiwanger, & Lane, 2004), and individuals (Wagner & Layne, 2014). Those strides are now becoming institutionalized. The President has allocated $10 million to an Administrative Records Clearing House in his FY2016 budget. Congress is considering a bill to use administrative records, entitled the Evidence-Based Policymaking Commission Act, sponsored by Patty Murray and Paul Ryan. In addition, the Census Bureau has established a Center for “Big Data.” In my view, these steps represent important strides for public policy, but they are only part of the story. Public policy researchers must look beyond the federal statistical system and make use of the vast resources now available for research and evaluation.
All politics is local; “Big Data” now mean that policy analysis can increasingly be local. Modern empirical policy should be grounded in data provided by a network of city/university data centers. Public policy schools should partner with scholars in the emerging field of data science to train the next generation of policy researchers in the thoughtful use of the new types of data; the apparent secular decline in the applications to public policy schools is coincident with the emergence of data science as a field of study in its own right. The role of national statistical agencies should be fundamentally rethought—and reformulated to one of four necessary strands in the data infrastructure: that of providing benchmarks, confidentiality protections, and national statistics….(More)”
Global sharing of HIV vaccine research
Springwise: “Noticing that a global, collaborative effort is missing in the world of HIV vaccine research, scientists came together to make it a reality. Populated by research from the Collaboration for AIDS Vaccine Discovery — an international network of laboratories — DataSpace is a partnership between the Statistical Center for HIV/AIDS Research and Prevention, data management and software development company LabKey, and technology product development company Artefact.
Through pooled research results, scientists hope to make data more accessible and comparable. Two aspects make the platform particularly powerful. The Artefact team hand-coded a number of research points to allow results from multiple studies to be compared like-for-like. And rather than discard the findings of failed or inconclusive studies, DataSpace includes them in analysis, vastly increasing the volume of available information.
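The hand-coding the Artefact team did amounts to mapping each study’s variable names and units onto a shared vocabulary so that results can be pooled and compared like-for-like. The sketch below illustrates that idea with invented column names and values; it is not DataSpace’s actual schema.

```python
# A hedged sketch of cross-study harmonisation (not DataSpace's real schema):
# map study-specific column names onto shared variable codes, then pool the
# studies so their results can be compared like-for-like.
import pandas as pd

study_a = pd.DataFrame({"ptid": [1, 2], "nab_titer": [120, 45], "visit_wk": [4, 4]})
study_b = pd.DataFrame({"subject": [7, 8], "neut_ab": [80, 200], "week": [4, 4]})

# Hand-curated mapping from each study's columns to a shared vocabulary.
mapping = {
    "A": {"ptid": "participant_id", "nab_titer": "neutralizing_titer", "visit_wk": "week"},
    "B": {"subject": "participant_id", "neut_ab": "neutralizing_titer", "week": "week"},
}

pooled = pd.concat(
    [
        study_a.rename(columns=mapping["A"]).assign(study="A"),
        study_b.rename(columns=mapping["B"]).assign(study="B"),
    ],
    ignore_index=True,
)
print(pooled)
print(pooled.groupby("study")["neutralizing_titer"].mean())
```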
Material is added as study results become available, creating a constantly developing resource. Being able to quickly test ideas online helps researchers make serendipitous connections and avoid duplicating efforts….(More)”
Big data’s ‘streetlight effect’: where and how we look affects what we see
Big data offers us a window on the world. But large and easily available datasets may not show us the world we live in. For instance, epidemiological models of the recent Ebola epidemic in West Africa that used big data consistently overestimated the risk of the disease’s spread and underestimated the local initiatives that played a critical role in controlling the outbreak.
Researchers are rightly excited about the possibilities offered by the availability of enormous amounts of computerized data. But there’s reason to stand back for a minute to consider what exactly this treasure trove of information really offers. Ethnographers like me use a cross-cultural approach when we collect our data because family, marriage and household mean different things in different contexts. This approach informs how I think about big data.
We’ve all heard the joke about the drunk who is asked why he is searching for his lost wallet under the streetlight, rather than where he thinks he dropped it. “Because the light is better here,” he said.
This “streetlight effect” is the tendency of researchers to study what is easy to study. I use this story in my course on Research Design and Ethnographic Methods to explain why so much research on disparities in educational outcomes is done in classrooms and not in students’ homes. Children are much easier to study at school than in their homes, even though many studies show that knowing what happens outside the classroom is important. Nevertheless, schools will continue to be the focus of most research because they generate big data and homes don’t.
The streetlight effect is one factor that prevents big data studies from being useful in the real world – especially studies analyzing easily available user-generated data from the Internet. Researchers assume that this data offers a window into reality. It doesn’t necessarily.
Looking at WEIRDOs
Based on the number of tweets following Hurricane Sandy, for example, it might seem as if the storm hit Manhattan the hardest, not the New Jersey shore. Another example: the since-retired Google Flu Trends, which in 2013 tracked online searches relating to flu symptoms to predict doctor visits, but gave estimates twice as high as reports from the Centers for Disease Control and Prevention. Without checking facts on the ground, researchers may fool themselves into thinking that their big data models accurately represent the world they aim to study.
The problem is similar to the “WEIRD” issue in many research studies. Harvard professor Joseph Henrich and colleagues have shown that findings based on research conducted with undergraduates at American universities – whom they describe as “some of the most psychologically unusual people on Earth” – apply only to that population and cannot be used to make any claims about other human populations, including other Americans. Unlike the typical research subject in psychology studies, they argue, most people in the world are not from Western, Educated, Industrialized, Rich and Democratic societies, i.e., WEIRD.
Twitter users are also atypical compared with the rest of humanity, giving rise to what our postdoctoral researcher Sarah Laborde has dubbed the “WEIRDO” problem of data analytics: most people are not Western, Educated, Industrialized, Rich, Democratic and Online.
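The distortion that comes from estimating a population quantity from its online subset alone is easy to demonstrate with a toy simulation. In the hedged sketch below, every value is invented: a synthetic population is constructed so that the attribute of interest differs between people who are online and people who are not, and the population mean is compared with the mean the online-only data would show.

```python
# A toy simulation of the "WEIRDO" problem: estimate a population average
# from only the online subpopulation and watch the estimate drift from the
# truth. All numbers are invented for illustration.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

online = rng.random(n) < 0.3                        # 30% of people are "online"
attribute = rng.normal(loc=2.0, scale=1.0, size=n)  # some quantity of interest
attribute[online] += 1.5                            # online users differ systematically

print(f"population mean:  {attribute.mean():.2f}")
print(f"online-only mean: {attribute[online].mean():.2f}  <- what easily available data shows")
```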
Context is critical
Understanding the differences between the vast majority of humanity and that small subset of people whose activities are captured in big data sets is critical to correct analysis of the data. Considering the context and meaning of data – not just the data itself – is a key feature of ethnographic research, argues Michael Agar, who has written extensively about how ethnographers come to understand the world….(More: https://theconversation.com/big-datas-streetlight-effect-where-and-how-we-look-affects-what-we-see-58122)”