Better “nowcasting” can reveal what weather is about to hit within 500 meters


MIT Technology Review: “Weather forecasting is impressively accurate given how changeable and chaotic Earth’s climate can be. It’s not unusual to get 10-day forecasts with a reasonable level of accuracy.

But there is still much to be done.  One challenge for meteorologists is to improve their “nowcasting,” the ability to forecast weather in the next six hours or so at a spatial resolution of a square kilometer or less.

In areas where the weather can change rapidly, that is difficult. And there is much at stake. Agricultural activity is increasingly dependent on nowcasting, and the safety of many sporting events depends on it too. Then there is the risk that sudden rainfall could lead to flash flooding, a growing problem in many areas because of climate change and urbanization. That has implications for infrastructure, such as sewage management, and for safety, since this kind of flooding can kill.

So meteorologists would dearly love to have a better way to make their nowcasts.

Enter Blandine Bianchi from EPFL in Lausanne, Switzerland, and a few colleagues, who have developed a method for combining meteorological data from several sources to produce nowcasts with improved accuracy. Their work has the potential to change the utility of this kind of forecasting for everyone from farmers and gardeners to emergency services and sewage engineers.

Current forecasting is limited by the data and the scale on which it is gathered and processed. For example, satellite data has a spatial resolution of 50 to 100 km and allows the tracking and forecasting of large cloud cells over a time scale of six to nine hours. By contrast, radar data is updated every five minutes, with a spatial resolution of about a kilometer, and leads to predictions on the time scale of one to three hours. Another source of data is the microwave links used by telecommunications companies, which are degraded by rainfall….(More)”
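To make the idea of merging such different sources concrete, here is a toy sketch (in Python) of one generic way multi-source rain estimates could be blended: weighting each source's reading by its assumed uncertainty. The readings and uncertainties are hypothetical placeholders, and this is not the EPFL team's actual method.

```python
# Toy sketch (not the EPFL method): blend rain-rate estimates from several
# sources by inverse-variance weighting. All readings and uncertainties below
# are hypothetical placeholders.

def combine_estimates(estimates):
    """estimates: list of (rain_rate_mm_per_h, std_dev) tuples."""
    weights = [1.0 / sigma ** 2 for _, sigma in estimates]
    blended = sum(w * rate for (rate, _), w in zip(estimates, weights))
    return blended / sum(weights)

# Hypothetical readings for a single ~1 km grid cell:
satellite = (4.0, 3.0)        # coarse (50-100 km) estimate, large uncertainty
radar = (6.5, 1.0)            # ~1 km resolution, refreshed every five minutes
microwave_link = (7.2, 1.5)   # inferred from rain-induced attenuation on a telecom link

print(round(combine_estimates([satellite, radar, microwave_link]), 2))
```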

Big data analytics to identify illegal construction waste dumping: A Hong Kong study


Weisheng Lu at Resources, Conservation and Recycling: “Illegal dumping, referring to the intentional and criminal abandonment of waste in unauthorized areas, has long plagued governments and environmental agencies worldwide. Despite the tremendous resources spent to combat it, the surreptitious nature of illegal dumping indicates the extreme difficulty in its identification. In 2006, the Construction Waste Disposal Charging Scheme (CWDCS) was implemented, regulating that all construction waste must be disposed of at government waste facilities if not otherwise properly reused or recycled.

While the CWDCS has significantly improved construction waste management in Hong Kong, it has also triggered illegal dumping problems. Inspired by the success of big data in combating urban crime, this paper aims to identify illegal dumping cases by mining a publicly available data set containing more than 9 million waste disposal records from 2011 to 2017. Using behavioral indicators and up-to-date big data analytics, possible drivers for illegal dumping (e.g., long queuing times) were identified. The analytical results also produced a list of 546 waste hauling trucks suspected of involvement in illegal dumping. This paper contributes to the understanding of illegal dumping behavior and joins the global research community in exploring the value of big data, particularly for combating urban crime. It also presents a three-step big data-enabled urban crime identification methodology comprising ‘Behavior characterization’, ‘Big data analytical model development’, and ‘Model training, calibration, and evaluation’….(More)”.
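As a rough illustration of the "behavior characterization" step, the sketch below flags trucks whose disposal-facility visits fall far below the fleet norm, one hypothetical indicator only; it is not the paper's analytical model, and the records are invented.

```python
# Rough sketch of one behavioural indicator (not the paper's model): flag trucks
# whose number of facility visits falls well below the fleet norm. The disposal
# records below are invented.
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical records: (truck_id, iso_week) for each disposal-facility visit.
records = [("T1", 1), ("T1", 1), ("T1", 2), ("T2", 1), ("T2", 2), ("T2", 2),
           ("T3", 1)]  # T3 turns up far less often than its peers

visits = defaultdict(int)
for truck, _week in records:
    visits[truck] += 1

counts = list(visits.values())
mu, sigma = mean(counts), stdev(counts)
suspects = [t for t, c in visits.items() if sigma > 0 and (c - mu) / sigma < -1.0]
print(suspects)  # candidates for follow-up inspection, not proof of dumping
```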

Positive deviance, big data, and development: A systematic literature review


Paper by Basma Albanna and Richard Heeks: “Positive deviance is a growing approach in international development that identifies those within a population who are outperforming their peers in some way, e.g., children in low‐income families who are well nourished when those around them are not. Analysing and then disseminating the behaviours and other factors underpinning positive deviance are demonstrably effective in delivering development results.

However, positive deviance faces a number of challenges that are restricting its diffusion. In this paper, using a systematic literature review, we analyse the current state of positive deviance and the potential for big data to address the challenges facing positive deviance. From this, we evaluate the promise of “big data‐based positive deviance”: This would analyse typical sources of big data in developing countries—mobile phone records, social media, remote sensing data, etc—to identify both positive deviants and the factors underpinning their superior performance.

While big data cannot solve all the challenges facing positive deviance as a development tool, they could reduce time, cost, and effort; identify positive deviants in new or better ways; and enable positive deviance to break out of its current preoccupation with public health into domains such as agriculture, education, and urban planning. In turn, positive deviance could provide a new and systematic basis for extracting real‐world development impacts from big data…(More)”.
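A minimal sketch of the core identification step, not the authors' method: within each peer group, flag units whose outcome sits well above comparable peers. The groups, scores, and the two-standard-deviation threshold below are all hypothetical choices.

```python
# Hedged sketch of positive-deviant identification (not the authors' method):
# within each peer group, flag units scoring well above comparable peers.
# All units, groups, and scores below are hypothetical.
from statistics import mean, stdev

# (unit_id, peer_group, outcome), e.g. households grouped by income bracket,
# with a nutrition score derived from some big-data proxy.
units = [("A", "low-income", 52), ("B", "low-income", 49), ("C", "low-income", 85),
         ("D", "low-income", 50), ("E", "high-income", 80), ("F", "high-income", 80),
         ("G", "high-income", 79)]

groups = {}
for uid, group, score in units:
    groups.setdefault(group, []).append((uid, score))

for group, members in groups.items():
    deviants = []
    for uid, score in members:
        peers = [s for other, s in members if other != uid]
        if len(peers) < 2:
            continue  # too few peers to estimate a spread
        mu, sigma = mean(peers), stdev(peers)
        if sigma > 0 and (score - mu) / sigma > 2:  # well above its peers
            deviants.append(uid)
    print(group, deviants)  # e.g. low-income ['C']
```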

Algorithmic Government: Automating Public Services and Supporting Civil Servants in using Data Science Technologies


Zeynep Engin and Philip Treleaven in the Computer Journal:  “The data science technologies of artificial intelligence (AI), Internet of Things (IoT), big data and behavioral/predictive analytics, and blockchain are poised to revolutionize government and create a new generation of GovTech start-ups. The impact from the ‘smartification’ of public services and the national infrastructure will be much more significant in comparison to any other sector given government’s function and importance to every institution and individual.

Potential GovTech systems include Chatbots and intelligent assistants for public engagement, Robo-advisors to support civil servants, real-time management of the national infrastructure using IoT and blockchain, automated compliance/regulation, public records securely stored in blockchain distributed ledgers, online judicial and dispute resolution systems, and laws/statutes encoded as blockchain smart contracts. Government is potentially the major ‘client’ and also ‘public champion’ for these new data technologies. This review paper uses our simple taxonomy of government services to provide an overview of data science automation being deployed by governments world-wide. The goal of this review paper is to encourage the Computer Science community to engage with government to develop these new systems to transform public services and support the work of civil servants….(More)”.

Statistics Canada promises more detailed portrait of Canadians with fewer surveys


Bill Curry at The Globe and Mail: “Canadians are increasingly shunning phone surveys, but they could still be providing Statistics Canada with valuable data each time they flush the toilet or flash their debit card.

The national statistics agency laid out an ambitious plan Thursday to overhaul the way it collects and reports on issues ranging from cannabis and opioid use to market-moving information on unemployment and economic growth.

According to four senior Statscan officials, the agency is in the midst of a major transformation as it adapts to a world of big data collected by other government agencies as well as private sector actors such as banks, cellphone companies and digital-based companies like Uber.

At its core, the shift means the agency will become less reliant on traditional phone surveys or having businesses fill out forms to report their sales data. Instead, Statscan is reaching agreements with other government departments and private companies in order to gain access to their raw data, such as point-of-sale information. According to agency officials, such arrangements reduce the reporting paperwork faced by businesses while creating the potential for Statscan to produce faster and more reliable information.

Key releases such as labour statistics or reporting on economic growth could come out sooner, reducing the lag time between the end of a quarter and reporting on results. Officials said economic data that is released quarterly could shift to monthly reporting. The greater access to raw data sources will also allow for more localized reporting at the neighbourhood level….(More)”.

Rationality and politics of algorithms. Will the promise of big data survive the dynamics of public decision making?


Paper by H.G. (Haiko) van der Voort et al.: “Big data promises to transform public decision-making for the better by making it more responsive to actual needs and policy effects. However, much recent work on big data in public decision-making assumes a rational view of decision-making, which has been much criticized in the public administration debate.

In this paper, we apply this view, and a more political one, to the context of big data and offer a qualitative study. We question the impact of big data on decision-making, realizing that big data – including its new methods and functions – must inevitably encounter existing political and managerial institutions. By studying two illustrative cases of big data use processes, we explore how these two worlds meet. Specifically, we look at the interaction between data analysts and decision makers.

In this we distinguish between a rational view and a political view, and between an information logic and a decision logic. We find that big data provides ample opportunities for both analysts and decision makers to do a better job, but this doesn’t necessarily imply better decision-making, because big data also provides opportunities for actors to pursue their own interests. Big data enables both data analysts and decision makers to act as autonomous agents rather than as links in a functional chain. Therefore, big data’s impact cannot be interpreted only in terms of its functional promise; it must also be acknowledged as a phenomenon set to impact our policymaking institutions, including their legitimacy….(More)”.

When AI Misjudgment Is Not an Accident


Douglas Yeung at Scientific American: “The conversation about unconscious bias in artificial intelligence often focuses on algorithms that unintentionally cause disproportionate harm to entire swaths of society—those that wrongly predict black defendants will commit future crimes, for example, or facial-recognition technologies developed mainly by using photos of white men that do a poor job of identifying women and people with darker skin.

But the problem could run much deeper than that. Society should be on guard for another twist: the possibility that nefarious actors could seek to attack artificial intelligence systems by deliberately introducing bias into them, smuggled inside the data that helps those systems learn. This could introduce a worrisome new dimension to cyberattacks, disinformation campaigns or the proliferation of fake news.

According to a U.S. government study on big data and privacy, biased algorithms could make it easier to mask discriminatory lending, hiring or other unsavory business practices. Algorithms could be designed to take advantage of seemingly innocuous factors that can be discriminatory. Employing existing techniques, but with biased data or algorithms, could make it easier to hide nefarious intent. Commercial data brokers collect and hold onto all kinds of information, such as online browsing or shopping habits, that could be used in this way.

Biased data could also serve as bait. Corporations could release biased data with the hope competitors would use it to train artificial intelligence algorithms, causing competitors to diminish the quality of their own products and consumer confidence in them.

Algorithmic bias attacks could also be used to more easily advance ideological agendas. If hate groups or political advocacy organizations want to target or exclude people on the basis of race, gender, religion or other characteristics, biased algorithms could give them either the justification or more advanced means to directly do so. Biased data also could come into play in redistricting efforts that entrench racial segregation (“redlining”) or restrict voting rights.

Finally, national security threats from foreign actors could use deliberate bias attacks to destabilize societies by undermining government legitimacy or sharpening public polarization. This would fit naturally with tactics that reportedly seek to exploit ideological divides by creating social media posts and buying online ads designed to inflame racial tensions….(More)”.
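A toy sketch of the underlying mechanism: if an attacker slips mislabelled records into training data, even a very simple model shifts its decisions. The "approve/deny" data and nearest-centroid model below are hypothetical, chosen only to make the effect visible; they do not come from the article.

```python
# Toy illustration of deliberate training-data poisoning (hypothetical data,
# not an attack described in the article): a nearest-centroid "approve/deny"
# model trained on clean versus poisoned records.

def train_centroids(samples):
    """samples: list of (feature, label); returns the mean feature per label."""
    sums, counts = {}, {}
    for x, y in samples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    return min(centroids, key=lambda label: abs(x - centroids[label]))

clean = [(0.2, "deny"), (0.3, "deny"), (0.7, "approve"), (0.8, "approve")]
# An attacker injects mislabelled mid-range applicants so they get denied:
poisoned = clean + [(0.55, "deny"), (0.6, "deny"), (0.65, "deny"), (0.7, "deny")]

for name, data in [("clean", clean), ("poisoned", poisoned)]:
    model = train_centroids(data)
    print(name, predict(model, 0.6))  # clean -> approve, poisoned -> deny
```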

Babbage among the insurers: big 19th-century data and the public interest


D. C. S. Wilson at History of the Human Sciences: “This article examines life assurance and the politics of ‘big data’ in mid-19th-century Britain. The datasets generated by life assurance companies were vast archives of information about human longevity. Actuaries distilled these archives into mortality tables – immensely valuable tools for predicting mortality and so pricing risk. The status of the mortality table was ambiguous, being both a public and a private object: often computed from company records, they could also be extrapolated from quasi-public projects such as the Census or clerical records. Life assurance more generally straddled the line between private enterprise and collective endeavour, though its advocates stressed the public interest in its success. Reforming actuaries such as Thomas Rowe Edmonds wanted the data on which mortality tables were based to be made publicly available, but faced resistance. Such resistance undermined insurers’ claims to be scientific in spirit and hindered Edmonds’s personal quest for a law of mortality. Edmonds pushed instead for an open actuarial science alongside fellow-travellers at the Statistical Society of London, which was populated by statisticians such as William Farr (whose subsequent work, it is argued, was influenced by Edmonds) as well as by radical mathematicians such as Charles Babbage. The article explores Babbage’s little-known foray into the world of insurance, both as a budding actuary and as a fierce critic of the industry. These debates over the construction, ownership, and accessibility of insurance datasets show that concern about the politics of big data did not begin in the 21st century….(More)”.
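For readers unfamiliar with the object at the centre of the debate, here is a minimal sketch of how a mortality table is distilled from company records and then used to price risk. The figures are hypothetical, not drawn from any historical table.

```python
# Minimal sketch of how a mortality table is distilled from records and then
# used to price risk. All figures are hypothetical, not from any historical table.

# Hypothetical company records: age -> (lives exposed to risk, deaths observed)
records = {40: (10_000, 62), 41: (9_800, 68), 42: (9_600, 75)}

# q_x: probability that a life aged x dies within the year.
mortality_table = {age: deaths / exposed for age, (exposed, deaths) in records.items()}

def net_single_premium(age, sum_assured, interest_rate=0.03):
    """Expected discounted payout of a one-year term assurance taken out at `age`."""
    return sum_assured * mortality_table[age] / (1 + interest_rate)

print({age: round(q, 4) for age, q in mortality_table.items()})
print(round(net_single_premium(40, 1_000), 2))  # premium per 1,000 assured
```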

Privacy and Synthetic Datasets


Paper by Steven M. Bellovin, Preetam K. Dutta and Nathan Reitinger: “Sharing is a virtue, instilled in us from childhood. Unfortunately, when it comes to big data — i.e., databases possessing the potential to usher in a whole new world of scientific progress — the legal landscape prefers a hoggish motif. The historic approach to the resulting database–privacy problem has been anonymization, a subtractive technique incurring not only poor privacy results, but also lackluster utility. In anonymization’s stead, differential privacy arose; it provides better, near-perfect privacy, but is nonetheless subtractive in terms of utility.

Today, another solution is leaning into the fore, synthetic data. Using the magic of machine learning, synthetic data offers a generative, additive approach — the creation of almost-but-not-quite replica data. In fact, as we recommend, synthetic data may be combined with differential privacy to achieve a best-of-both-worlds scenario. After unpacking the technical nuances of synthetic data, we analyze its legal implications, finding both over and under inclusive applications. Privacy statutes either overweigh or downplay the potential for synthetic data to leak secrets, inviting ambiguity. We conclude by finding that synthetic data is a valid, privacy-conscious alternative to raw data, but is not a cure-all for every situation. In the end, computer science progress must be met with proper policy in order to move the area of useful data dissemination forward….(More)”.
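One common recipe for pairing the two ideas, offered here only as an assumption-laden sketch rather than the paper's specific proposal: perturb a histogram of the sensitive attribute with Laplace noise (the standard epsilon-differential-privacy mechanism for counts) and then sample synthetic records from the noisy counts. The attribute, bins, and epsilon below are hypothetical.

```python
# Assumption-laden sketch of one way to pair synthetic data with differential
# privacy (not the paper's specific proposal): add Laplace noise to a histogram
# of the sensitive attribute, then sample synthetic records from the noisy counts.
import random

def dp_histogram(values, bins, epsilon):
    """Per-bin counts with Laplace(1/epsilon) noise; each record touches one bin."""
    counts = {b: 0 for b in bins}
    for v in values:
        counts[v] += 1
    # A difference of two Exp(rate=epsilon) draws follows Laplace with scale 1/epsilon.
    return {b: max(0.0, c + random.expovariate(epsilon) - random.expovariate(epsilon))
            for b, c in counts.items()}

def sample_synthetic(noisy_counts, n):
    bins = list(noisy_counts)
    weights = [noisy_counts[b] for b in bins]
    return random.choices(bins, weights=weights, k=n) if sum(weights) > 0 else []

raw_ages = ["<30"] * 40 + ["30-60"] * 70 + [">60"] * 15  # hypothetical sensitive data
noisy = dp_histogram(raw_ages, ["<30", "30-60", ">60"], epsilon=0.5)
print(sample_synthetic(noisy, 10))  # synthetic records drawn from the noisy counts
```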

Internet of Things for Smart Cities: Technologies, Big Data and Security


Book by Waleed Ejaz and Alagan Anpalagan: “This book introduces the concept of smart city as the potential solution to the challenges created by urbanization. The Internet of Things (IoT) offers novel features with minimum human intervention in smart cities. This book describes different components of Internet of Things (IoT) for smart cities including sensor technologies, communication technologies, big data analytics and security….(More)”.