Paper by Sara Makki et al.: “Fraudulent activities (e.g., suspicious credit card transactions, financial reporting fraud, and money laundering) are critical concerns to various entities including banks, insurance companies, and public service organizations. Typically, these activities lead to detrimental effects on the victims such as a financial loss. Over the years, fraud analysis techniques underwent a rigorous development. However, lately, the advent of Big Data led to vigorous advancement of these techniques since Big Data resulted in extensive opportunities to combat financial frauds. Given the massive amount of data that investigators need to sift through, integrating massive volumes of data from multiple heterogeneous sources (e.g., social media, blogs) to find fraudulent patterns is emerging as a feasible approach….(More)”.
On the cultural ideology of Big Data
Nathan Jurgenson in The New Inquiry: “Modernity has long been obsessed with, perhaps even defined by, its epistemic insecurity, its grasping toward big truths that ultimately disappoint as our world grows only less knowable. New knowledge and new ways of understanding simultaneously produce new forms of nonknowledge, new uncertainties and mysteries. The scientific method, based in deduction and falsifiability, is better at proliferating questions than it is at answering them. For instance, Einstein’s theories about the curvature of space and motion at the quantum level provide new knowledge and generate new unknowns that previously could not be pondered.
Since every theory destabilizes as much as it solidifies in our view of the world, the collective frenzy to generate knowledge creates at the same time a mounting sense of futility, a tension looking for catharsis — a moment in which we could feel, if only for an instant, that we know something for sure. In contemporary culture, Big Data promises this relief.
As the name suggests, Big Data is about size. Many proponents of Big Data claim that massive databases can reveal a whole new set of truths because of the unprecedented quantity of information they contain. But the big in Big Data is also used to denote a qualitative difference — that aggregating a certain amount of information makes data pass over into Big Data, a “revolution in knowledge,” to use a phrase thrown around by startups and mass-market social-science books. Operating beyond normal science’s simple accumulation of more information, Big Data is touted as a different sort of knowledge altogether, an Enlightenment for social life reckoned at the scale of masses.
As with the similarly inferential sciences like evolutionary psychology and pop-neuroscience, Big Data can be used to give any chosen hypothesis a veneer of science and the unearned authority of numbers. The data is big enough to entertain any story. Big Data has thus spawned an entire industry (“predictive analytics”) as well as reams of academic, corporate, and governmental research; it has also sparked the rise of “data journalism” like that of FiveThirtyEight, Vox, and the other multiplying explainer sites. It has shifted the center of gravity in these fields not merely because of its grand epistemological claims but also because it’s well-financed. Twitter, for example, recently announced that it is putting $10 million into a “social machines” Big Data laboratory.
The rationalist fantasy that enough data can be collected with the “right” methodology to provide an objective and disinterested picture of reality is an old and familiar one: positivism. This is the understanding that the social world can be known and explained from a value-neutral, transcendent view from nowhere in particular. The term comes from Positive Philosophy (1830-1842), by Auguste Comte, who also coined the term sociology in this image. As Western sociology began to congeal as a discipline (departments, paid jobs, journals, conferences), Emile Durkheim, another of the field’s founders, believed it could function as a “social physics” capable of outlining “social facts” akin to the measurable facts that could be recorded about the physical properties of objects. It’s an arrogant view, in retrospect — one that aims for a grand, general theory that can explain social life, a view that became increasingly rooted as sociology became focused on empirical data collection.
A century later, that unwieldy aspiration has been largely abandoned by sociologists in favor of reorienting the discipline toward recognizing complexities rather than pursuing universal explanations for human sociality. But the advent of Big Data has resurrected the fantasy of a social physics, promising a new data-driven technique for ratifying social facts with sheer algorithmic processing power…(More)”
Policy Analytics, Modelling, and Informatics
The Rise of Big Data Policing: Surveillance, Race, and the Future of Law Enforcement
How We Can Stop Earthquakes From Killing People Before They Even Hit
Justin Worland in Time Magazine: “…Out of that realization came a plan to reshape disaster management using big data. Just a few months later, Wani worked with two fellow Stanford students to create a platform to predict the toll of natural disasters. The concept is simple but also revolutionary. The One Concern software pulls geological and structural data from a variety of public and private sources and uses machine learning to predict the impact of an earthquake down to individual city blocks and buildings. Real-time information input during an earthquake improves how the system responds. And earthquakes represent just the start for the company, which plans to launch a similar program for floods and eventually other natural disasters….
Previous software might identify a general area where responders could expect damage, but it would appear as a “big red blob” that wasn’t helpful when deciding exactly where to send resources, Dayton says. The technology also integrates information from many sources and makes it easy to parse in an emergency situation when every moment matters. The instant damage evaluations mean fast and actionable information, so first responders can prioritize search and rescue in areas most likely to be worst-hit, rather than responding to 911 calls in the order they are received.
One Concern is not the only company that sees an opportunity to use data to rethink disaster response. The mapping company Esri has built rapid-response software that shows expected damage from disasters like earthquakes, wildfires and hurricanes. And the U.S. government has invested in programs to use data to shape disaster response at agencies like the National Oceanic and Atmospheric Administration (NOAA)….(More)”.
Mobility Score
“MobilityScore® helps you understand how easy it is to get around. It works at any location or address within the US and Canada and gives you a score ranging from 0 (no mobility choices) to 100 (excellent mobility choices).
What do we mean by mobility? Any transportation option that can help you move around your city. Transportation is changing massively as new choices emerge: ridesharing, bikesharing, carsharing. Private and on-demand mobility services have sprung up. However, tools for measuring transportation access have not kept up. That’s why we created MobilityScore as an easy-to-understand measure of transportation access.
Technical Details
MobilityScore includes all the transportation choices that can be found on TransitScreen displays, including the following services:
- Public transit (subways, trains, buses, ferries, cable cars…)
- Car sharing services (Zipcar, Enterprise, and one-way services like car2go)
- Bike sharing services
- Hailed ride sharing services (e.g. taxis, Uber, Lyft)
We have developed a common way of comparing how choices that might seem very different contribute to your mobility. For each mobility choice, we measure how long it will take you until you can start moving on it – for example, the time it takes you to leave your building, walk to a subway station, and wait for a train.
Because we’re measuring how easy it is for you to move around the city, we also consider what mobility choices look like at different times of the day and different days of the week. Mobility data is regularly collected for most services, while ridehailing (Uber/Lyft) data is based on a geographic model of arrival times.
MobilityScore’s framework is future-proof. Just like we do with TransitScreen, we will integrate future services into the calculation as they emerge (e.g. microtransit, autonomous vehicles, mobility-as-a-service)….(More)”
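The time-to-start measure described under Technical Details can be illustrated with a minimal sketch. The excerpt does not publish MobilityScore's actual formula, so the code below only adds up the three components it names (leaving the building, walking to the station, waiting for a vehicle). The class name, field names, and all minute values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MobilityChoice:
    """One transportation option near a location (all values hypothetical)."""
    name: str
    exit_minutes: float   # time to leave the building
    walk_minutes: float   # time to walk to the stop, dock, or pickup point
    wait_minutes: float   # expected wait until the vehicle is available

    def time_to_start(self) -> float:
        """Minutes until you can start moving on this choice."""
        return self.exit_minutes + self.walk_minutes + self.wait_minutes


def quickest(choices: list[MobilityChoice]) -> MobilityChoice:
    """Return the choice with the shortest time to start moving."""
    return min(choices, key=lambda c: c.time_to_start())


# Hypothetical example values for a single address.
choices = [
    MobilityChoice("subway", 2.0, 6.0, 4.0),
    MobilityChoice("bike share", 2.0, 3.0, 0.0),
    MobilityChoice("ride hailing", 2.0, 0.0, 7.0),
]

for c in choices:
    print(f"{c.name}: {c.time_to_start():.0f} min")
print("quickest:", quickest(choices).name)
```

How these times are combined across modes, times of day, and days of the week into the 0 to 100 score is not described in the excerpt, so that step is left out here.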
Polish activists turn to digital democracy
Zosia Wasik in the Financial Times: “Opponents of the Polish government have mounted a series of protests on issues ranging from reform of the judiciary to an attempt to ban abortion. In February, they staged yet another, less public but intensely emotive, battle — to save the country’s trees.
At the beginning of the year, a new law allowed property owners to cut trees on their land without official permission. As a result, hundreds of trees disappeared from the centres of Polish cities as more valuable treeless plots were sold off to developers. In parallel, the government authorised extensive logging of the ancient forest in Bialowieza, a Unesco world heritage site.
“People reacted very emotionally to these practices,” says Wojciech Sanko, a co-ordinator at Code for Poland, a programme run by ePanstwo (eState), the country’s biggest non-governmental organisation in this field.
The group aims to deploy new technology tools designed to explain local and national policies, and to make it easier for citizens to take part in public life. As no one controlled the tree-cutting, for example, Mr Sanko thought technology could at least help to monitor it. First, he wanted to set up a simple digital map of trees cut in Warsaw. But as the controversial liberalisation of tree-cutting was reversed, the NGO together with local activists decided to work on another project — to map trees still standing, along with data about species and their absorption of carbon dioxide associated with climate change.
The group has also started to create an app for activists in Bialowieza forest: an open-source map that will gather all documentation from civic patrols monitoring the site, and will indicate the exact places of logging.
A trend towards recruiting technology for civic projects has been slowly gathering pace in a country that is hard to describe as socially-engaged: only 59 per cent of Poles say they have done volunteer work for the community, according to a 2016 survey by the Centre of Public Opinion Research.
Election turnout barely surpasses 50 per cent. Yet since the election of the rightwing Law and Justice government in 2015, which has introduced rapid and controversial reforms across all domains of public life, citizens have started to take a closer look at politicians and their actions.
In addition to the tree map, Code for Poland has developed a website that aggregates public data, such as tax spending or air pollution.
Mr Sanko underlines, however, that Code for Poland is much more about local communities than national politics. Many of the group’s projects are small scale, ranging from a mobile app for an animal shelter in Gdansk to a tool that shows people where they can take their garbage.
Piotr Micula, board member of Miasto Jest Nasze (The City is Ours), an urban movement in Warsaw, says that increasing access to data is fuelling the development of civic tech. “Even as a small organisation, we try to use big data and visualise it,” he says….(More)”.
The role of eGovernment in deepening the single market
Using big data to predict suicide risk among Canadian youth
SAS Insights: “Suicide is the second leading cause of death among youth in Canada, according to Statistics Canada, accounting for one-fifth of deaths of people under the age of 25 in 2011. The Canadian Mental Health Association states that among 15 – 24 year olds the number is an even more frightening 24 percent – the third highest in the industrialized world. Yet despite these disturbing statistics, the signals that an individual plans on self-injury or suicide are hard to isolate….
Team members …collected 2.3 million tweets and used text mining software to identify 1.1 million of them as likely to have been authored by 13 to 17 year olds in Canada by building a machine learning model to predict age, based on the open source PAN author profiling dataset. Their analysis made use of natural language processing, predictive modelling, text mining, and data visualization….
However, there were challenges. Ages are not revealed on Twitter, so the team had to figure out how to tease out the data for 13 – 17 year olds in Canada. “We had a text data set, and we created a model to identify if people were in that age group based on how they talked in their tweets,” Soehl said. “From there, we picked some specific buzzwords and created topics around them, and our software mined those tweets to collect the people.”
Another issue was the restrictions Twitter places on pulling data, though Soehl believes that once this analysis becomes an established solution, Twitter may work with researchers to expedite the process. “Now that we’ve shown it’s possible, there are a lot of places we can go with it,” said Soehl. “Once you know your path and figure out what’s going to be valuable, things come together quickly.”
The team looked at the percentage of people in the group who were talking about depression or suicide, and what they were talking about. Horne said that when SAS’ work went in front of a Canadian audience working in health care, they said that it definitely filled a gap in their data — and that was the validation he’d been looking for. The team also won $10,000 for creating the best answer to this question (the team donated the award money to two mental health charities: Mind Your Mind and Rise Asset Development)
What’s next?
That doesn’t mean the work is done, said Jos Polfliet. “We’re just scraping the surface of what can be done with the information.” Another way to use the results is to look at patterns and trends….(More)”
Why Information Matters
Essay by Luciano Floridi in Special Issue of Atlantis on Information, Matter and Life: “…As information technologies come to affect all areas of life, they are becoming implicated in our most important problems — their causes, effects, and solutions, the scientific investigations aimed at explaining them, the concepts created to understand them, the means of discussing them, and even, as in the case of Bill Gates, the wealth required to tackle them.
Furthermore, information technologies don’t just modify how we act in the world; they also profoundly affect how we understand the world, how we relate to it, how we see ourselves, how we interact with each other, and how our hopes for a better future are shaped. All these are old philosophical issues, of course, but we must now consider them anew, with the concept of information as a central concern.
This means that if philosophers are to help enable humanity to make sense of our world and to improve it responsibly, information needs to be a significant field of philosophical study. Among our mundane and technical concepts, information is currently not only one of the most important and widely used, but also one of the least understood. We need a philosophy of information.
How to Ask a Question
In the fall of 1999, NASA lost radio contact with its Mars Climate Orbiter, a $125 million weather satellite that had been launched the year before. In a maneuver to enter the spacecraft into orbit around Mars, the trajectory had put the spacecraft far closer to Mars than planned, so that it directly entered the planet’s atmosphere, where it probably disintegrated. The reason for this unhappy event was that for a particular software file, the Lockheed Martin engineering team had used English (imperial) units of measurement instead of the metric units specified by the agency, whose trajectory modelers assumed the data they were looking at was provided in metric.
This incident illustrates a simple lesson: successful cooperation depends on an agreement between all parties that the information being exchanged is fixed at a specified level. Wrongly assuming that everyone will follow the rules that specify the level — for example, that impulse will be expressed not as pound-seconds (the English unit) but as newton-seconds (the metric unit) — can lead to costly mistakes. Even though this principle may seem obvious, it is one of the most valuable contributions that philosophy can offer to our understanding of information. This is because, as we will see, failing to specify a level at which we ask a given philosophical question can be the reason for deep confusions and useless answers. Another simple example will help to illustrate the problem…(More)”
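The unit mix-up in the Mars Climate Orbiter example can be made concrete with a small sketch. This is not the software used on the mission. It is a hypothetical illustration in which every impulse value carries its unit explicitly, so a pound-second figure cannot be silently read as newton-seconds. The class and function names are invented for the example; the conversion factor (1 lbf = 4.4482216152605 N) is the standard definition.

```python
from dataclasses import dataclass

LBF_S_TO_N_S = 4.4482216152605  # one pound-force second in newton-seconds


@dataclass(frozen=True)
class Impulse:
    """An impulse value that always carries its unit (hypothetical type)."""
    value: float
    unit: str  # "N*s" (metric) or "lbf*s" (English)

    def to_newton_seconds(self) -> float:
        """Convert to newton-seconds, rejecting unknown units."""
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        raise ValueError(f"unknown impulse unit: {self.unit!r}")


def total_impulse_newton_seconds(readings: list[Impulse]) -> float:
    """Sum readings only after converting each one to a common unit."""
    return sum(r.to_newton_seconds() for r in readings)


# Hypothetical readings: one in English units, one in metric units.
readings = [Impulse(10.0, "lbf*s"), Impulse(10.0, "N*s")]

naive = sum(r.value for r in readings)            # ignores units entirely
correct = total_impulse_newton_seconds(readings)  # converts before summing

print(f"naive sum (units ignored): {naive:.2f}")
print(f"unit-aware sum: {correct:.2f} N*s")
```

The naive sum treats both numbers as if they were in the same unit, which is the kind of unstated assumption the passage warns about. The unit-aware version makes the agreed unit part of the data being exchanged.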