Geospatial big data and cartography: research challenges and opportunities for making maps that matter


International Journal of Cartography: “Geospatial big data present a new set of challenges and opportunities for cartographic researchers in technical, methodological and artistic realms. New computational and technical paradigms for cartography are accompanying the rise of geospatial big data. Additionally, the art and science of cartography needs to focus its contemporary efforts on work that connects to outside disciplines and is grounded in problems that are important to humankind and its sustainability. Following the development of position papers and a collaborative workshop to craft consensus around key topics, this article presents a new cartographic research agenda focused on making maps that matter using geospatial big data. This agenda provides both long-term challenges that require significant attention and short-term opportunities that we believe could be addressed in more concentrated studies….(More)”.

Big data helps Belfort, France, allocate buses on routes according to demand


In Digital Trends: “As modern cities smarten up, the priority for many will be transportation. Belfort, a mid-sized French industrial city of 50,000, serves as proof of concept for improved urban transportation that does not require the time and expense of covering the city with sensors and cameras.

Working with Tata Consultancy Services (TCS) and GFI Informatique, the Board of Public Transportation of Belfort overhauled bus service management of the city’s 100-plus buses. The project entailed a combination of ID cards, GPS-equipped card readers on buses, and big data analysis. The collected data was used to measure bus speed from stop to stop, passenger flow to observe when and where people got on and off, and bus route density. From start to finish, the proof of concept project took four weeks.

Using the TCS Intelligent Urban Exchange system, operations managers were able to detect when and where about 20 percent of all bus passengers boarded and got off on each city bus route. Utilizing big data and artificial intelligence, the city’s urban planners were able to use that analysis to make cost-effective adjustments, including allocating additional buses to routes and times of greater passenger demand. They were also able to cut back on buses for minimally used routes and stops. In addition, the system provided feedback on the effect of city construction projects on bus service….
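The article does not publish the underlying analysis, but the kind of demand estimation it describes can be sketched roughly as follows. The column names, sample data and capacity figures are illustrative assumptions, not details of the TCS Intelligent Urban Exchange system:

```python
# Rough sketch (not the TCS system): estimate boardings per route and hour from
# smart-card reader events, then flag route-hours that look over- or under-served.
import pandas as pd

# Hypothetical card-reader log: one row per tap-on (column names are assumptions)
events = pd.DataFrame({
    "card_id":   [101, 102, 103, 104, 105, 106, 107, 108],
    "route_id":  ["R3", "R3", "R3", "R3", "R7", "R7", "R3", "R7"],
    "stop_id":   ["S1", "S1", "S2", "S3", "S9", "S9", "S2", "S8"],
    "timestamp": pd.to_datetime([
        "2017-03-06 08:05", "2017-03-06 08:07", "2017-03-06 08:15",
        "2017-03-06 08:20", "2017-03-06 08:10", "2017-03-06 14:30",
        "2017-03-06 14:40", "2017-03-06 14:50",
    ]),
})

# Count boardings per route and hour bucket
boardings = (
    events.assign(hour=events["timestamp"].dt.floor("60min"))
          .groupby(["route_id", "hour"])
          .size()
          .rename("boardings")
          .reset_index()
)

BUS_CAPACITY = 60    # assumed passengers per bus
BUSES_PER_HOUR = 1   # assumed current allocation per route

boardings["load_factor"] = boardings["boardings"] / (BUS_CAPACITY * BUSES_PER_HOUR)
print(boardings.sort_values("load_factor", ascending=False))
# Route-hours with persistently high load factors are candidates for extra buses;
# persistently low ones are candidates for reduced service.
```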

Going forward, continued data analysis will help the city budget wisely for infrastructure changes and new equipment purchases. The goal is to put the money where the needs are greatest rather than just spending and then waiting to see if usage justified the expense. The push for smarter cities has to be not just about improved services, but also smart resource allocation — in the Belfort project, the use of big data showed how to do both….(More)”

Watchdog to launch inquiry into misuse of data in politics


Alice Gibbs and others in The Guardian: “The UK’s privacy watchdog is launching an inquiry into how voters’ personal data is being captured and exploited in political campaigns, cited as a key factor in both the Brexit and Trump victories last year.

The intervention by the Information Commissioner’s Office (ICO) follows revelations in last week’s Observer that a technology company part-owned by a US billionaire played a key role in the campaign to persuade Britons to vote to leave the European Union.

It comes as privacy campaigners, lawyers, politicians and technology experts express fears that electoral laws are not keeping up with the pace of technological change.

“We are conducting a wide assessment of the data-protection risks arising from the use of data analytics, including for political purposes, and will be contacting a range of organisations,” an ICO spokeswoman confirmed. “We intend to publicise our findings later this year.”

The ICO spokeswoman confirmed that it had approached Cambridge Analytica over its apparent use of data following the story in the Observer. “We have concerns about Cambridge Analytica’s reported use of personal data and we are in contact with the organisation,” she said….

In the US, companies are free to use third-party data without seeking consent. But Gavin Millar QC, of Matrix Chambers, said this was not the case in Europe. “The position in law is exactly the same as when people would go canvassing from door to door,” Millar said. “They have to say who they are, and if you don’t want to talk to them you can shut the door in their face. That’s the same principle behind the Data Protection Act. It’s why if telephone canvassers ring you, they have to say that whole long speech. You have to identify yourself explicitly.”…

Dr Simon Moores, visiting lecturer in the applied sciences and computing department at Canterbury Christ Church University and a technology ambassador under the Blair government, said the ICO’s decision to shine a light on the use of big data in politics was timely.

“A rapid convergence in the data mining, algorithmic and granular analytics capabilities of companies like Cambridge Analytica and Facebook is creating powerful, unregulated and opaque ‘intelligence platforms’. In turn, these can have enormous influence to affect what we learn, how we feel, and how we vote. The algorithms they may produce are frequently hidden from scrutiny and we see only the results of any insights they might choose to publish.” …(More)”

AI, machine learning and personal data


Jo Pedder at the Information Commissioner’s Office Blog: “Today sees the publication of the ICO’s updated paper on big data and data protection.

But why now? What’s changed in the two and a half years since we first visited this topic? Well, quite a lot actually:

  • big data is becoming the norm for many organisations, using it to profile people and inform their decision-making processes, whether that’s to determine your car insurance premium or to accept/reject your job application;
  • artificial intelligence (AI) is stepping out of the world of science-fiction and into real life, providing the ‘thinking’ power behind virtual personal assistants and smart cars; and
  • machine learning algorithms are discovering patterns in data that traditional data analysis couldn’t hope to find, helping to detect fraud and diagnose diseases.

The complexity and opacity of these types of processing operations mean that it’s often hard to know what’s going on behind the scenes. This can be problematic when personal data is involved, especially when decisions are made that have significant effects on people’s lives. The combination of these factors has led some to call for new regulation of big data, AI and machine learning, to increase transparency and ensure accountability.

In our view though, whilst the means by which personal data are processed are changing, the underlying issues remain the same. Are people being treated fairly? Are decisions accurate and free from bias? Is there a legal basis for the processing? These are issues that the ICO has been addressing for many years, through oversight of existing European data protection legislation….(More)”

When the Big Lie Meets Big Data


Peter Bruce in Scientific American: “…The science of predictive modeling has come a long way since 2004. Statisticians now build “personality” models and tie them into other predictor variables. … One such model bears the acronym “OCEAN,” standing for the personality characteristics (and their opposites) of openness, conscientiousness, extroversion, agreeableness, and neuroticism. Using Big Data at the individual level, machine learning methods might classify a person as, for example, “closed, introverted, neurotic, not agreeable, and conscientious.”
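Neither the article nor Cambridge Analytica publishes the underlying models, but the supervised-learning pattern described here can be sketched roughly as follows. The features, labels and data are synthetic stand-ins, and logistic regression is just one plausible choice of classifier:

```python
# Hypothetical sketch of one "OCEAN" trait model: predict high vs. low neuroticism
# from behavioral features, given survey-labelled training data. Synthetic data;
# this illustrates the general approach described above, not any firm's actual model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in features: e.g., purchase categories, page likes, media consumption
n_people, n_features = 5000, 40
X = rng.normal(size=(n_people, n_features))
# Stand-in labels: 1 = scored "high neuroticism" on a personality questionnaire
y = (X[:, :5].sum(axis=1) + rng.normal(scale=2.0, size=n_people) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# In practice one model per trait would be fitted, and the five predicted scores
# joined with other predictor variables (location, subscriptions, voter files, etc.).
```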

Alexander Nix, CEO of Cambridge Analytica (owned by Trump’s chief donor, Rebekah Mercer), says he has thousands of data points on you, and every other voter: what you buy or borrow, where you live, what you subscribe to, what you post on social media, etc. At a recent Concordia Summit, using the example of gun rights, Nix described how messages will be crafted to appeal specifically to you, based on your personality profile. Are you highly neurotic and conscientious? Nix suggests the image of a sinister gloved hand reaching through a broken window.

In his presentation, Nix noted that the goal is to induce behavior, not communicate ideas. So where does truth fit in? Johan Ugander, Assistant Professor of Management Science at Stanford, suggests that, for Nix and Cambridge Analytica, it doesn’t. In counseling the hypothetical owner of a private beach how to keep people off his property, Nix eschews the merely factual “Private Beach” sign, advocating instead a lie: “Sharks sighted.” Ugander, in his critique, cautions all data scientists against “building tools for unscrupulous targeting.”

The warning is needed, but may be too late. What Nix described in his presentation involved carefully crafted messages aimed at his target personalities. His messages pulled subtly on various psychological strings to manipulate us, and they obeyed no boundary of truth, but they required humans to create them.  The next phase will be the gradual replacement of human “craftsmanship” with machine learning algorithms that can supply targeted voters with a steady stream of content (from whatever source, true or false) designed to elicit desired behavior. Cognizant of the Pandora’s box that data scientists have opened, the scholarly journal Big Data has issued a call for papers for a future issue devoted to “Computational Propaganda.”…(More)”

Handbook of Big Data Technologies


Handbook by Albert Y. Zomaya and Sherif Sakr: “…offers comprehensive coverage of recent advancements in Big Data technologies and related paradigms. Chapters are authored by international leading experts in the field, and have been reviewed and revised for maximum reader value. The volume consists of twenty-five chapters organized into four main parts. Part One covers the fundamental concepts of Big Data technologies including data curation mechanisms, data models, storage models, programming models and programming platforms. It also dives into the details of implementing Big SQL query engines and big stream processing systems. Part Two focuses on the semantic aspects of Big Data management including data integration and exploratory ad hoc analysis in addition to structured querying and pattern matching techniques. Part Three presents a comprehensive overview of large scale graph processing. It covers the most recent research in large scale graph processing platforms, introducing several scalable graph querying and mining mechanisms in domains such as social networks. Part Four details novel applications that have been made possible by the rapid emergence of Big Data technologies such as Internet-of-Things (IoT), Cognitive Computing and SCADA Systems. All parts of the book discuss open research problems, including potential opportunities, that have arisen from the rapid progress of Big Data technologies and the associated increasing requirements of application domains.
Designed for researchers, IT professionals and graduate students, this book is a timely contribution to the growing Big Data field. Big Data has been recognized as one of the leading emerging technologies that will have a major impact on the various fields of science and many aspects of human society over the coming decades. Therefore, the content in this book will be an essential tool to help readers understand the development and future of the field….(More)”

Fighting Illegal Fishing With Big Data


Emily Matchar in Smithsonian: “In many ways, the ocean is the Wild West. The distances are vast, the law enforcement agents few and far between, and the legal jurisdiction often unclear. In this environment, illegal activity flourishes. Illegal fishing is so common that experts estimate as much as a third of fish sold in the U.S. was fished illegally. This illegal fishing decimates the ocean’s already dwindling fish populations and gives rise to modern slavery, where fishermen are tricked onto vessels and forced to work, sometimes for years.

A new use of data technology aims to help curb these abuses by shining a light on the high seas. The technology uses ships’ satellite signals to detect instances of transshipment, when two vessels meet at sea to exchange cargo. As transshipment is a major way illegally caught fish makes it into the legal supply chain, tracking it could potentially help stop the practice.

“[Transshipment] really allows people to do something out of sight,” says David Kroodsma, the research program director at Global Fishing Watch, an online data platform launched by Google in partnership with the nonprofits Oceana and SkyTruth. “It’s something that obscures supply chains. It’s basically being able to do things without any oversight. And that’s a problem when you’re using a shared resource like the oceans.”

Global Fishing Watch analyzed some 21 billion satellite signals broadcast by ships, which are required to carry transceivers for collision avoidance, between 2012 and 2016. It then used an artificial intelligence system it created to identify which ships were refrigerated cargo vessels (known in the industry as “reefers”). They then verified this information with fishery registries and other sources, eventually identifying 794 reefers—90 percent of the world’s total number of such vessels. They tracked instances where a reefer and a fishing vessel were moving at similar speeds in close proximity, labeling these instances as “likely transshipments,” and also traced instances where reefers were traveling in a way that indicated a rendezvous with a fishing vessel, even if no fishing vessel was present—fishing vessels often turn off their satellite systems when they don’t want to be seen. All in all, there were more than 90,000 likely or potential transshipments recorded.
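Global Fishing Watch's published methodology is more involved, but the rendezvous logic described above can be sketched roughly as follows. The distance, speed and duration thresholds, the column names and the sample track are assumptions for illustration only:

```python
# Illustrative sketch of "likely transshipment" detection from AIS-style position
# reports: a reefer and a fishing vessel close together, both nearly stationary,
# for a sustained period. Thresholds and data are assumptions, not Global Fishing
# Watch's published parameters.
import numpy as np
import pandas as pd

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = np.sin((lat2 - lat1) / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))

# Hypothetical resampled AIS reports (one row per vessel per 10-minute interval)
times = pd.date_range("2016-05-01 03:00", periods=18, freq="10min")
tracks = pd.concat([
    pd.DataFrame({"vessel_id": "REEFER_1", "vessel_type": "reefer",
                  "timestamp": times, "lat": -12.500, "lon": -78.200, "speed_knots": 0.4}),
    pd.DataFrame({"vessel_id": "FISHER_9", "vessel_type": "fishing",
                  "timestamp": times, "lat": -12.501, "lon": -78.201, "speed_knots": 0.5}),
])

reefers = tracks[tracks["vessel_type"] == "reefer"]
fishers = tracks[tracks["vessel_type"] == "fishing"]

# Pair reefer and fishing-vessel reports that share a timestamp
pairs = reefers.merge(fishers, on="timestamp", suffixes=("_r", "_f"))
pairs["dist_km"] = haversine_km(pairs["lat_r"], pairs["lon_r"], pairs["lat_f"], pairs["lon_f"])

# Encounter criteria (assumed): under 500 m apart, both moving at under 2 knots
close = pairs[(pairs["dist_km"] < 0.5) &
              (pairs["speed_knots_r"] < 2) & (pairs["speed_knots_f"] < 2)]

# A "likely transshipment" = the criteria hold across at least ~2 hours of reports
likely = (close.groupby(["vessel_id_r", "vessel_id_f"])
               .agg(start=("timestamp", "min"), end=("timestamp", "max"),
                    reports=("timestamp", "size"))
               .query("reports >= 12"))
print(likely)
```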

Even if these encounters were in fact transshipments, they would not all have been for nefarious purposes. They may have taken place to refuel or load up on supplies. But looking at the patterns of where the potential transshipments happen is revealing. Very few are seen close to the coasts of the U.S., Canada and much of Europe, all places with tight fishery regulations. There are hotspots off the coast of Peru and Argentina, all over Africa, and off the coast of Russia. Some 40 percent of encounters happen in international waters, far enough off the coast that no country has jurisdiction.

The tracked reefers were flying flags from some 40 different countries. But that doesn’t necessarily tell us much about where they really come from. Nearly half of the reefers tracked were flying “flags of convenience,” meaning they’re registered in countries other than where the ship’s owners are from to take advantage of those countries’ lax regulations….(More)”


The Datafied Society. Studying Culture through Data


(Open Access) book edited by Mirko Tobias Schäfer & Karin van Es: “As machine-readable data comes to play an increasingly important role in everyday life, researchers find themselves with rich resources for studying society. The novel methods and tools needed to work with such data require not only new knowledge and skills, but also a new way of thinking about best research practices. This book critically reflects on the role and usefulness of big data, challenging overly optimistic expectations about what such information can reveal, introducing practices and methods for its analysis and visualization, and raising important political and ethical questions regarding its collection, handling, and presentation….(More)”.

From big data to smart data: FDA’s INFORMED initiative


Sean Khozin, Geoffrey Kim & Richard Pazdur in Nature: “….Recent advances in our understanding of disease mechanisms have led to the development of new drugs that are enabling precision medicine. For example, the co-development of kinase inhibitors that target ‘driver mutations’ in metastatic non-small-cell lung cancer (NSCLC) with companion diagnostics has led to substantial improvements in the treatment of some patients. However, growing evidence suggests that most patients with metastatic NSCLC and other advanced cancers may not have tumours with single driver mutations. Furthermore, the generation of clinical evidence in genomically diverse and geographically dispersed groups of patients using traditional trial designs and multiple competing therapies is becoming more costly and challenging.

Strategies aimed at creating new efficiencies in clinical evidence generation and extending the benefits of precision medicine to larger groups of patients are driving a transformation from a reductionist approach to drug development (for example, a single drug targeting a driver mutation and traditional clinical trials) to a holistic approach (for example, combination therapies targeting complex multiomic signatures and real-world evidence). This transition is largely fuelled by the rapid expansion in the four dimensions of biomedical big data, which has created a need for greater organizational and technical capabilities (Fig. 1). Appropriate management and analysis of such data requires specialized tools and expertise in health information technology, data science and high-performance computing. For example, efforts to generate clinical evidence using real-world data are being limited by challenges such as capturing clinically relevant variables from vast volumes of unstructured content (such as physician notes) in electronic health records and organizing various structured data elements that are primarily designed to support billing rather than clinical research. So, new standards and quality-control mechanisms are needed to ensure the validity of the design and analysis of studies based on electronic health records.
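As a toy illustration of why capturing clinically relevant variables from unstructured notes is hard, consider a naive rule-based attempt at something as simple as smoking status. The patterns and notes below are invented; production systems rely on validated clinical NLP pipelines rather than hand-written rules:

```python
# Toy illustration of the challenge mentioned above: pulling a clinically relevant
# variable (smoking status) out of free-text physician notes with simple rules.
# This is only a sketch; real systems need far more robust extraction and validation.
import re

NEGATION = re.compile(r"\b(denies|no history of|never|quit|former)\b.*\bsmok", re.IGNORECASE)
MENTION = re.compile(r"\bsmok(er|ing|es)\b", re.IGNORECASE)

def smoking_status(note: str) -> str:
    """Very rough rule-based classification of a note's smoking status."""
    if NEGATION.search(note):
        return "non-smoker / former smoker"
    if MENTION.search(note):
        return "current smoker (possible)"
    return "unknown"

notes = [
    "Patient denies smoking, drinks socially.",
    "45 y/o male, smokes 1 pack per day for 20 years.",
    "No acute distress. Follow-up in 6 weeks.",
]
for n in notes:
    print(smoking_status(n), "<-", n)
```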

Figure 1: Conceptual map of technical and organizational capacity for biomedical big data. Big data can be defined as having four dimensions: volume (data size), variety (data type), veracity (data noise and uncertainty) and velocity (data flow and processing). Currently, FDA approval decisions are generally based on data of limited variety, mainly from clinical trials and preclinical studies, that are mostly structured, in data sets usually no more than a few gigabytes in size, and that are processed intermittently as part of regulatory submissions. The expansion of big data in all four dimensions calls for increasing organizational and technical capacity. This could transform big data into smart data by enabling a holistic approach to personalization of therapies that takes patient, disease and environmental characteristics into account….(More)”

Will Democracy Survive Big Data and Artificial Intelligence?


Dirk Helbing, Bruno S. Frey, Gerd Gigerenzer, Ernst Hafen, Michael Hagner, Yvonne Hofstetter, Jeroen van den Hoven, Roberto V. Zicari, and Andrej Zwitter in Scientific American: “….In summary, it can be said that we are now at a crossroads (see Fig. 2). Big data, artificial intelligence, cybernetics and behavioral economics are shaping our society—for better or worse. If such widespread technologies are not compatible with our society’s core values, sooner or later they will cause extensive damage. They could lead to an automated society with totalitarian features. In the worst case, a centralized artificial intelligence would control what we know, what we think and how we act. We are at the historic moment, where we have to decide on the right path—a path that allows us all to benefit from the digital revolution. Therefore, we urge adherence to the following fundamental principles:

1. to increasingly decentralize the function of information systems;

2. to support informational self-determination and participation;

3. to improve transparency in order to achieve greater trust;

4. to reduce the distortion and pollution of information;

5. to enable user-controlled information filters;

6. to support social and economic diversity;

7. to improve interoperability and collaborative opportunities;

8. to create digital assistants and coordination tools;

9. to support collective intelligence, and

10. to promote responsible behavior of citizens in the digital world through digital literacy and enlightenment.

Following this digital agenda we would all benefit from the fruits of the digital revolution: the economy, government and citizens alike. What are we waiting for?

A strategy for the digital age

Big data and artificial intelligence are undoubtedly important innovations. They have an enormous potential to catalyze economic value and social progress, from personalized healthcare to sustainable cities. It is totally unacceptable, however, to use these technologies to incapacitate the citizen. Big nudging and citizen scores abuse centrally collected personal data for behavioral control in ways that are totalitarian in nature. This is not only incompatible with human rights and democratic principles, but also inappropriate to manage modern, innovative societies. In order to solve the genuine problems of the world, far better approaches in the fields of information and risk management are required. The research area of responsible innovation and the initiative ”Data for Humanity” (see “Big Data for the benefit of society and humanity”) provide guidance as to how big data and artificial intelligence should be used for the benefit of society….(More)”