This War of Mine – The Ultimate Serious Game


The Escapist Magazine: “…there are not many games about the effect of war. Paweł Miechowski thinks that needs to be changed, and he’s doing it with a little game called This War of Mine from the Polish outfit 11 Bit Studios.
“We’re in the moment where we want to talk about important things via games,” Miechowski said. “We are used to the fact that important topics are covered by music, novels, movies, while games are mostly about fun. Laughing ‘ha ha ha’ fun.”
In fact, he believes games are well-suited for showing harsh truths and realities, not by ham-fistedly repeating political phrases or mantras, but by allowing you to draw your own conclusions from the circumstances. “Games are perfect for this because they are interactive. Novels or movies are not,” he said. “Games can take you through the experience through your hands, by your eyes. You are not a spectator. You are part of the experience.”
What is the experience of This War of Mine then? 11 Bit Studios was inspired by the firsthand accounts of people who tried to survive within a modern city that had no law, no order or infrastructure due to an ongoing war between militaries. “Everything we did in this game, we did after extensive research. Any mechanics in the game are just a translation of our knowledge of situations in recent history,” he said. “Yugoslavia, Syria, Serbia. Anywhere civilians survived within a besieged city after war. They were all pretty similar, struggling for water, hygiene items, food, simple tools to make something, wood to heat the house up.”
Miechowski showed me an early build of This War of Mine and that’s exactly what it is. Your only goal, which is emblazoned on the screen when you start the game, is to “Survive for 30 days.” You begin inside a 2D representation of a bombed-out building with several floors. You have a few allies with names like Boris or Yvette, each of whom has traits such as “good cook” or “strong, but slow.” Orders can be given to your team, such as to build a bed or to scavenge the piles of junk within your stronghold for any useful items. You usually start out with nothing, but over time you’ll accumulate all sorts of items and materials. The game runs in real time; the hours slowly tick by, but once you assign tasks it can be useful to advance the timeline by clicking the “Start Night” button.”

“Open-washing”: The difference between opening your data and simply making them available


Christian Villum at the Open Knowledge Foundation Blog: “Last week, the Danish IT magazine Computerworld, in an article entitled “Check-list for digital innovation: These are the things you must know”, emphasised how more and more companies are discovering that giving your users access to your data is a good business strategy. Among other things, they wrote:

(Translation from Danish) According to Accenture it is becoming clear to many progressive businesses that their data should be treated as any other supply chain: it should flow easily and unhindered through the whole organisation and perhaps even out into the whole eco-system – for instance through fully open APIs.

They then use Google Maps as an example, which firstly isn’t entirely correct, as also pointed out by the Neogeografen, a geodata blogger, who explains that Google Maps isn’t offering raw data, but merely an image of the data. You are not allowed to download and manipulate the data – or serve them from your own server.

But secondly I don’t think it’s very appropriate to highlight Google and their Maps project as a golden example of a business that lets its data flow unhindered to the public. It’s true that they are offering some data, but only in a very limited way – and definitely not as open data – and thereby not as progressive as the article suggests.

Certainly it’s hard to accuse Google of not being progressive in general. The article states that Google Maps’ data are used by over 800,000 apps and businesses across the globe. So yes, Google has opened its silo a little bit, but only in a very controlled and limited way, which leaves these 800,000 businesses dependent on the continual flow of data from Google and unable to control the very commodities they’re basing their businesses on. This particular way of releasing data brings me to the problem that we’re facing: knowing the difference between making data available and making them open.

Open data is characterized by not only being available, but being both legally open (released under an open license that allows full and free reuse, conditioned at most on giving credit to its source and sharing under the same license) and technically available in bulk and in machine-readable formats – contrary to the case of Google Maps. It may be that their data are available, but they’re not open. This – among other reasons – is why the global community around the 100% open alternative OpenStreetMap is growing rapidly and an increasing number of businesses choose to base their services on this open initiative instead.

But why is it important that data are open and not just available? Open data strengthens society and builds a shared resource, where all users, citizens and businesses are enriched and empowered, not just the data collectors and publishers. “But why would businesses spend money on collecting data and then give them away?” you ask. Opening your data and making a profit are not mutually exclusive. A quick Google search reveals many businesses that both offer open data and drive a business on them – and I believe these are the ones that should be highlighted as particularly progressive in articles such as the one from Computerworld….

We are seeing a rising trend of what can be termed “open-washing” (inspired by “greenwashing”) – data publishers claiming their data are open when they are in fact merely available under limiting terms. If we – at this critical time in the formative period of the data-driven society – aren’t critically aware of the difference, we’ll end up putting our vital data streams in siloed infrastructure built and owned by international corporations, and giving our praise and support to the wrong kind of unsustainable technological development.”

The Problem with Easy Technology


New post by Tim Wu at The New Yorker: “In the history of marketing, there’s a classic tale that centers on the humble cake mix. During the nineteen-fifties, there were differences of opinion over how “instant” powdered cake mixes should be, and, in particular, over whether adding an egg ought to be part of the process. The first cake mixes, invented in the nineteen-thirties, merely required water, and some people argued that this approach, the easiest, was best. But others thought bakers would want to do more. Urged on by marketing psychologists, Betty Crocker herself began to instruct housewives to “add water, and two of your own fresh eggs.”…
The choice between demanding and easy technologies may be crucial to what we have called technological evolution. We are, as I argued in my most recent piece in this series, self-evolving. We make ourselves into what we, as a species, will become, mainly through our choices as consumers. If you accept these premises, our choice of technological tools becomes all-important; by the logic of biological atrophy, our unused skills and capacities tend to melt away, like the tail of an ape. It may sound overly dramatic, but the use of demanding technologies may actually be important to the future of the human race.
Just what is a demanding technology? Three elements are defining: it is technology that takes time to master, whose usage is highly occupying, and whose operation includes some real risk of failure. By this measure, a piano is a demanding technology, as is a frying pan, a programming language, or a paintbrush. So-called convenience technologies, in contrast—like instant mashed potatoes or automatic transmissions—usually require little concentrated effort and yield predictable results.
There is much to be said for the convenience technologies that have remade human society over the past century. They often open up life’s pleasures to a wider range of people (downhill skiing, for example, can be exhausting without lifts). They also distribute technological power more widely: consider that, nowadays, you don’t need special skills to take pretty good photos, or to capture a video of police brutality. Nor should we neglect that promise first made to all Americans in the nineteen-thirties: freedom from a life of drudgery to focus on what we really care about. Life is hard enough; do we need to be churning our own butter? Convenience technologies promised more space in our lives for other things, like thought, reflection, and leisure.
That, at least, is the idea. But, even on its own terms, convenience technology has failed us. Take that promise of liberation from overwork. In 1964, Life magazine, in an article about “Too Much Leisure,” asserted that “there will certainly be a sharp decline in the average work week” and that “some prophets of what automation is doing to our economy think we are on the verge of a 30-hour week; others as low as 25 or 20.” Obviously, we blew it. Our technologies may have made us prosthetic gods, yet they have somehow failed to deliver on the central promise of free time. The problem is that, as every individual task becomes easier, we demand much more of both ourselves and others. Instead of fewer difficult tasks (writing several long letters) we are left with a larger volume of small tasks (writing hundreds of e-mails). We have become plagued by a tyranny of tiny tasks, individually simple but collectively oppressive. And, when every task in life is easy, there remains just one profession left: multitasking.
The risks of biological atrophy are even more important. Convenience technologies supposedly free us to focus on what matters, but sometimes the part that matters is what gets eliminated. Everyone knows that it is easier to drive to the top of a mountain than to hike; the views may be the same, but the feeling never is. By the same logic, we may evolve into creatures that can do more but find that what we do has somehow been robbed of the satisfaction we hoped it might contain.
The project of self-evolution demands an understanding of humanity’s relationship with tools, which is mysterious and defining. Some scientists, like the archaeologist Timothy Taylor, believe that our biological evolution was shaped by the tools our ancestors chose eons ago. Anecdotally, when people describe what matters to them, second only to human relationships is usually the mastery of some demanding tool. Playing the guitar, fishing, golfing, rock-climbing, sculpting, and painting all demand mastery of stubborn tools that often fail to do what we want. Perhaps the key to these and other demanding technologies is that they constantly require new learning. The brain is stimulated and forced to change. Conversely, when things are too easy, as a species we may become like unchallenged schoolchildren, sullen and perpetually dissatisfied.
I don’t mean to insist that everything need be done the hard way, or that we somehow need to suffer like our ancestors to achieve redemption. It isn’t somehow wrong to use a microwave rather than a wood fire to reheat leftovers. But we must take seriously our biological need to be challenged, or face the danger of evolving into creatures whose lives are more productive but also less satisfying.
There have always been groups, often outcasts, who have insisted on adhering to harder ways of doing some things. Compared to Camrys, motorcycles are unreliable, painful, and dangerous, yet some people cannot leave them alone. It may seem crazy to use command-line or plain-text editing software in an age of advanced user interfaces, but some people still do. In our times, D.I.Y. enthusiasts, hackers, and members of the maker movement are some of the people who intuitively understand the importance of demanding tools, without rejecting the idea that technology can improve the human condition. Derided for lacking a “political strategy,” they nonetheless realize that there are far more important agendas than the merely political. Whether they know it or not, they are trying to work out the future of what it means to be human, and, along the way, trying to find out how to make that existence worthwhile.”

Tim Berners-Lee: we need to re-decentralise the web


Wired: “Twenty-five years on from the web’s inception, its creator has urged the public to re-engage with its original design: a decentralised internet that, at its very core, remains open to all.
Speaking with Wired editor David Rowan at an event launching the magazine’s March issue, Tim Berners-Lee said that although part of this is about keeping an eye on for-profit internet monopolies such as search engines and social networks, the greatest danger is the emergence of a balkanised web.
“I want a web that’s open, works internationally, works as well as possible and is not nation-based,” Berners-Lee told the audience… “What I don’t want is a web where the Brazilian government has every social network’s data stored on servers on Brazilian soil. That would make it so difficult to set one up.”
It’s the role of governments, startups and journalists to keep that conversation at the fore, he added, because the pace of change is not slowing — it’s going faster than ever before. For his part Berners-Lee drives the issue through his work at the Open Data Institute, World Wide Web Consortium and World Wide Web Foundation, but also as an MIT professor whose students are “building new architectures for the web where it’s decentralised”. On the issue of monopolies, Berners-Lee did say it’s concerning to be “reliant on big companies, and one big server”, something that stalls innovation, but that competition has historically resolved these issues and will continue to do so.
The kind of balkanised web he spoke about, as typified by Brazil’s home-soil servers argument or Iran’s emerging intranet, is partially being driven by revelations of NSA and GCHQ mass surveillance. The distrust that it has brewed, from a political level right down to the threat of self-censorship among ordinary citizens, threatens an open web and is, said Berners-Lee, a greater threat than censorship. Knowing the NSA may be breaking commercial encryption services could result in the emergence of more networks like China’s Great Firewall, to “protect” citizens. This, Berners-Lee suggested, is why we need a bit of anti-establishment push-back.”

The Moneyball Effect: How smart data is transforming criminal justice, healthcare, music, and even government spending


TED: “When Anne Milgram became the Attorney General of New Jersey in 2007, she was stunned to find out just how little data was available on who was being arrested, who was being charged, who was serving time in jails and prisons, and who was being released. “It turns out that most big criminal justice agencies like my own didn’t track the things that matter,” she says in today’s talk, filmed at TED@BCG. “We didn’t share data, or use analytics, to make better decisions and reduce crime.”
Milgram’s idea for how to change this: “I wanted to moneyball criminal justice.”
Moneyball, of course, is the name of a 2011 movie starring Brad Pitt and the book it’s based on, written by Michael Lewis in 2003. The term refers to a practice adopted by the Oakland A’s general manager Billy Beane in 2002 — the organization began basing decisions not on star power or scout instinct, but on statistical analysis of measurable factors like on-base and slugging percentages. This worked exceptionally well. On a tiny budget, the Oakland A’s made it to the playoffs in 2002 and 2003, and — since then — nine other major league teams have hired sabermetric analysts to crunch these types of numbers.
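As a rough illustration of the measurable factors sabermetric analysts lean on, the two statistics named above are simple ratios. The formulas below follow standard baseball scoring conventions, and the season line is invented for the example:

```python
def on_base_pct(h, bb, hbp, ab, sf):
    """Times on base per plate appearance that counts toward OBP."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)

def slugging_pct(singles, doubles, triples, homers, ab):
    """Total bases per at-bat."""
    return (singles + 2 * doubles + 3 * triples + 4 * homers) / ab

# A made-up season line for a hypothetical hitter
# (the 150 hits break down as 100 singles, 30 doubles, 5 triples, 15 homers):
obp = on_base_pct(h=150, bb=70, hbp=5, ab=500, sf=5)
slg = slugging_pct(singles=100, doubles=30, triples=5, homers=15, ab=500)
print(f"OBP {obp:.3f}, SLG {slg:.3f}")  # OBP 0.388, SLG 0.470
```

Note that OBP credits walks the same as hits, which is essentially the inefficiency the A’s exploited: traditional scouting undervalued players who drew them.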
Milgram is working hard to bring smart statistics to criminal justice. To hear the results she’s seen so far, watch this talk. And below, take a look at a few surprising sectors that are getting the moneyball treatment as well.

Moneyballing music. Last year, Forbes magazine profiled the firm Next Big Sound, a company using statistical analysis to predict how musicians will perform in the market. The idea is that — rather than relying on the instincts of A&R reps — past performance on Pandora, Spotify, Facebook, etc. can be used to predict future potential. The article reads, “For example, the company has found that musicians who gain 20,000 to 50,000 Facebook fans in one month are four times more likely to eventually reach 1 million. With data like that, Next Big Sound promises to predict album sales within 20% accuracy for 85% of artists, giving labels a clearer idea of return on investment.”
Moneyballing human resources. In November, The Atlantic took a look at the practice of “people analytics” and how it’s affecting employers. (Billy Beane had something to do with this idea — in 2012, he gave a presentation at the TLNT Transform Conference called “The Moneyball Approach to Talent Management.”) The article describes how Bloomberg reportedly logs its employees’ keystrokes and the casino, Harrah’s, tracks employee smiles. It also describes where this trend could be going — for example, how a video game called Wasabi Waiter could be used by employers to judge potential employees’ ability to take action, solve problems and follow through on projects. The article looks at the ways these types of practices are disconcerting, but also how they could level an inherently unequal playing field. After all, the article points out that gender, race, age and even height biases have been demonstrated again and again in our current hiring landscape.
Moneyballing healthcare. Many have wondered: what about a moneyball approach to medicine? (See this call out via Common Health, this piece in Wharton Magazine or this op-ed on The Huffington Post from the President of the New York State Health Foundation.) In his TED Talk, “What doctors can learn from each other,” Stefan Larsson proposed an idea that feels like something of an answer to this question. In the talk, Larsson gives a taste of what can happen when doctors and hospitals measure their outcomes and share this data with each other: they are able to see which techniques are proving the most effective for patients and make adjustments. (Watch the talk for a simple way surgeons can make hip surgery more effective.) He imagines a continuous learning process for doctors — that could transform the healthcare industry to give better outcomes while also reducing cost.
Moneyballing government. This summer, John Bridgeland (the director of the White House Domestic Policy Council under President George W. Bush) and Peter Orszag (the director of the Office of Management and Budget in Barack Obama’s first term) teamed up to pen a provocative piece for The Atlantic called, “Can government play moneyball?” In it, the two write, “Based on our rough calculations, less than $1 out of every $100 of government spending is backed by even the most basic evidence that the money is being spent wisely.” The two explain how, for example, there are 339 federally-funded programs for at-risk youth, the grand majority of which haven’t been evaluated for effectiveness. And while many of these programs might show great results, some that have been evaluated show troubling results. (For example, Scared Straight has been shown to increase criminal behavior.) Yet, some of these ineffective programs continue because a powerful politician champions them. While Bridgeland and Orszag show why Washington is so averse to making data-based appropriation decisions, the two also see the ship beginning to turn around. They applaud the Obama administration for a 2014 budget with an “unprecedented focus on evidence and results.” The pair also gave a nod to the nonprofit Results for America, which advocates that for every $99 spent on a program, $1 be spent on evaluating it. The pair even suggest a “Moneyball Index” to encourage politicians not to support programs that don’t show results.
In any industry, figuring out what to measure, how to measure it and how to apply the information gleaned from those measurements is a challenge. Which of the applications of statistical analysis has you the most excited? And which has you the most terrified?”

Bad Data


Bad Data is a site providing real-world examples of how not to prepare or provide data. It showcases the poorly structured, the mis-formatted, or the just plain ugly. Its primary purpose is to educate – though there may also be some aspect of entertainment.
As a side-product it also provides a source of good practice material for budding data wranglers (the repo in fact began as a place to keep practice data for Data Explorer).
New examples are wanted and welcome – they can be submitted via the project’s repository.
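To give a flavor of the kind of practice material involved, here is a miniature “bad data” specimen and a minimal Python clean-up pass. The sample CSV and its defects (padded headers, inconsistent casing, numbers stored with thousands separators) are invented for illustration, not taken from the repository:

```python
import csv
import io

# A tiny invented specimen of badly prepared data.
raw = """ Country , population , Year
Denmark ,"5,602,628",2013
 sweden ,"9,555,893",2013
"""

# First pass: parse the CSV and strip/normalise the header names.
cleaned = []
for row in csv.DictReader(io.StringIO(raw), skipinitialspace=True):
    cleaned.append({key.strip().lower(): value.strip()
                    for key, value in row.items()})

# Second pass: normalise casing and turn the numbers back into numbers.
for row in cleaned:
    row["country"] = row["country"].title()
    row["population"] = int(row["population"].replace(",", ""))

print(cleaned)
```

Most “bad data” in the wild needs exactly this sort of wrangling before any analysis can start, which is what makes such specimens useful practice material.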

Garbage In, Garbage Out… Or, How to Lie with Bad Data


Medium: For everyone who slept through Stats 101, Charles Wheelan’s Naked Statistics is a lifesaver. From batting averages and political polls to Schlitz ads and medical research, Wheelan “illustrates exactly why even the most reluctant mathophobe is well advised to achieve a personal understanding of the statistical underpinnings of life” (New York Times). What follows is adapted from the book, out now in paperback.
Behind every important study there are good data that made the analysis possible. And behind every bad study . . . well, read on. People often speak about “lying with statistics.” I would argue that some of the most egregious statistical mistakes involve lying with data; the statistical analysis is fine, but the data on which the calculations are performed are bogus or inappropriate. Here are some common examples of “garbage in, garbage out.”

Selection Bias

…Selection bias can be introduced in many other ways. A survey of consumers in an airport is going to be biased by the fact that people who fly are likely to be wealthier than the general public; a survey at a rest stop on Interstate 90 may have the opposite problem. Both surveys are likely to be biased by the fact that people who are willing to answer a survey in a public place are different from people who would prefer not to be bothered. If you ask 100 people in a public place to complete a short survey, and 60 are willing to answer your questions, those 60 are likely to be different in significant ways from the 40 who walked by without making eye contact.
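The airport effect is easy to simulate. In the sketch below (all numbers invented), incomes are drawn from a skewed distribution, and the “survey” reaches only frequent flyers, assumed here to come from the top of that distribution, and only the fraction of them willing to stop:

```python
import random

random.seed(0)

# Hypothetical income distribution for the general public.
population = [random.lognormvariate(10.5, 0.6) for _ in range(100_000)]

# Assume frequent flyers sit in the top 30% of incomes, and only about
# half of the people approached agree to answer the survey.
threshold = sorted(population)[int(len(population) * 0.7)]
airport_sample = [x for x in population
                  if x > threshold and random.random() < 0.5]

pop_mean = sum(population) / len(population)
survey_mean = sum(airport_sample) / len(airport_sample)
print(f"true population mean income: {pop_mean:,.0f}")
print(f"airport-survey mean income:  {survey_mean:,.0f}")
```

The survey mean lands far above the true mean, and no amount of extra sampling at the airport fixes it: the bias is in who gets asked, not in how many.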

Publication Bias

Positive findings are more likely to be published than negative findings, which can skew the results that we see. Suppose you have just conducted a rigorous, longitudinal study in which you find conclusively that playing video games does not prevent colon cancer. You’ve followed a representative sample of 100,000 Americans for twenty years; those participants who spend hours playing video games have roughly the same incidence of colon cancer as the participants who do not play video games at all. We’ll assume your methodology is impeccable. Which prestigious medical journal is going to publish your results?

None, for two reasons. First, there is no strong scientific reason to believe that playing video games has any impact on colon cancer, so it is not obvious why you were doing this study. Second, and more relevant here, the fact that something does not prevent cancer is not a particularly interesting finding. After all, most things don’t prevent cancer. Negative findings are not especially sexy, in medicine or elsewhere.
The net effect is to distort the research that we see, or do not see. Suppose that one of your graduate school classmates has conducted a different longitudinal study. She finds that people who spend a lot of time playing video games do have a lower incidence of colon cancer. Now that is interesting! That is exactly the kind of finding that would catch the attention of a medical journal, the popular press, bloggers, and video game makers (who would slap labels on their products extolling the health benefits of their products). It wouldn’t be long before Tiger Moms all over the country were “protecting” their children from cancer by snatching books out of their hands and forcing them to play video games instead.
Of course, one important recurring idea in statistics is that unusual things happen every once in a while, just as a matter of chance. If you conduct 100 studies, one of them is likely to turn up results that are pure nonsense—like a statistical association between playing video games and a lower incidence of colon cancer. Here is the problem: The 99 studies that find no link between video games and colon cancer will not get published, because they are not very interesting. The one study that does find a statistical link will make it into print and get loads of follow-on attention. The source of the bias stems not from the studies themselves but from the skewed information that actually reaches the public. Someone reading the scientific literature on video games and cancer would find only a single study, and that single study will suggest that playing video games can prevent cancer. In fact, 99 studies out of 100 would have found no such link.
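That 1-in-100 arithmetic can be checked with a quick simulation: run many “studies” of a null effect and count how many clear the conventional 5% significance bar. The study design, sample size, and 5% cancer base rate below are invented for the sketch:

```python
import random

random.seed(1)

def fake_study(n=500):
    """One 'study' of a null effect: cancer counts among gamers and
    non-gamers drawn from the same 5% base-rate population."""
    gamers = sum(random.random() < 0.05 for _ in range(n))
    non_gamers = sum(random.random() < 0.05 for _ in range(n))
    # Crude two-proportion z-test on the difference in incidence.
    p1, p2 = gamers / n, non_gamers / n
    pooled = (gamers + non_gamers) / (2 * n)
    se = (2 * pooled * (1 - pooled) / n) ** 0.5
    z = (p1 - p2) / se if se else 0.0
    return abs(z) > 1.96  # "significant" at the 5% level

published = sum(fake_study() for _ in range(100))
print(f"{published} of 100 null studies came out 'significant'")
```

With a 5% threshold you expect roughly five spurious “discoveries” per hundred null studies, and under publication bias those few are the only ones anyone reads about.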

Recall Bias

Memory is a fascinating thing—though not always a great source of good data. We have a natural human impulse to understand the present as a logical consequence of things that happened in the past—cause and effect. The problem is that our memories turn out to be “systematically fragile” when we are trying to explain some particularly good or bad outcome in the present. Consider a study looking at the relationship between diet and cancer. In 1993, a Harvard researcher compiled a data set comprising a group of women with breast cancer and an age-matched group of women who had not been diagnosed with cancer. Women in both groups were asked about their dietary habits earlier in life. The study produced clear results: The women with breast cancer were significantly more likely to have had diets that were high in fat when they were younger.
Ah, but this wasn’t actually a study of how diet affects the likelihood of getting cancer. This was a study of how getting cancer affects a woman’s memory of her diet earlier in life. All of the women in the study had completed a dietary survey years earlier, before any of them had been diagnosed with cancer. The striking finding was that women with breast cancer recalled a diet that was much higher in fat than what they actually consumed; the women with no cancer did not.

The New York Times Magazine described the insidious nature of this recall bias:

The diagnosis of breast cancer had not just changed a woman’s present and the future; it had altered her past. Women with breast cancer had (unconsciously) decided that a higher-fat diet was a likely predisposition for their disease and (unconsciously) recalled a high-fat diet. It was a pattern poignantly familiar to anyone who knows the history of this stigmatized illness: these women, like thousands of women before them, had searched their own memories for a cause and then summoned that cause into memory.

Recall bias is one reason that longitudinal studies are often preferred to cross-sectional studies. In a longitudinal study the data are collected contemporaneously. At age five, a participant can be asked about his attitudes toward school. Then, thirteen years later, we can revisit that same participant and determine whether he has dropped out of high school. In a cross-sectional study, in which all the data are collected at one point in time, we must ask an eighteen-year-old high school dropout how he or she felt about school at age five, which is inherently less reliable.

Survivorship Bias

Suppose a high school principal reports that test scores for a particular cohort of students have risen steadily for four years. The sophomore scores for this class were better than their freshman scores. The scores from junior year were better still, and the senior year scores were best of all. We’ll stipulate that there is no cheating going on, and not even any creative use of descriptive statistics. Every year this cohort of students has done better than it did the preceding year, by every possible measure: mean, median, percentage of students at grade level, and so on. Would you (a) nominate this school leader for “principal of the year” or (b) demand more data?

I say “b.” I smell survivorship bias, which occurs when some or many of the observations are falling out of the sample, changing the composition of the observations that are left and therefore affecting the results of any analysis. Let’s suppose that our principal is truly awful. The students in his school are learning nothing; each year half of them drop out. Well, that could do very nice things for the school’s test scores—without any individual student testing better. If we make the reasonable assumption that the worst students (with the lowest test scores) are the most likely to drop out, then the average test scores of those students left behind will go up steadily as more and more students drop out. (If you have a room of people with varying heights, forcing the short people to leave will raise the average height in the room, but it doesn’t make anyone taller.)
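Wheelan’s awful principal can be reproduced in a few lines: hold every student’s score fixed, let the weakest half drop out each year, and watch the cohort average climb. The cohort size and score range below are arbitrary:

```python
import statistics

# A hypothetical cohort of 60 students whose scores never change.
cohort = list(range(40, 100))

means = []
for year in ("freshman", "sophomore", "junior", "senior"):
    means.append(statistics.mean(cohort))
    print(f"{year}: {len(cohort)} students, mean score {means[-1]:.1f}")
    # The weakest half drop out before the next year.
    cohort = sorted(cohort)[len(cohort) // 2:]
```

The mean rises every single year even though not one student’s score moved, which is why answer (b), more data, is the right call: you need the dropout numbers, not just the averages of whoever is left.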

Healthy User Bias

People who take vitamins regularly are likely to be healthy—because they are the kind of people who take vitamins regularly! Whether the vitamins have any impact is a separate issue. Consider the following thought experiment. Suppose public health officials promulgate a theory that all new parents should put their children to bed only in purple pajamas, because that helps stimulate brain development. Twenty years later, longitudinal research confirms that having worn purple pajamas as a child does have an overwhelmingly large positive association with success in life. We find, for example, that 98 percent of entering Harvard freshmen wore purple pajamas as children (and many still do) compared with only 3 percent of inmates in the Massachusetts state prison system.

Of course, the purple pajamas do not matter; but having the kind of parents who put their children in purple pajamas does matter. Even when we try to control for factors like parental education, we are still going to be left with unobservable differences between those parents who obsess about putting their children in purple pajamas and those who don’t. As New York Times health writer Gary Taubes explains, “At its simplest, the problem is that people who faithfully engage in activities that are good for them—taking a drug as prescribed, for instance, or eating what they believe is a healthy diet—are fundamentally different from those who don’t.” This effect can potentially confound any study trying to evaluate the real effect of activities perceived to be healthful, such as exercising regularly or eating kale. We think we are comparing the health effects of two diets: kale versus no kale. In fact, if the treatment and control groups are not randomly assigned, we are comparing two diets that are being eaten by two different kinds of people. We have a treatment group that is different from the control group in two respects, rather than just one.
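The pajama thought experiment is a textbook confounder, and simulating it makes the mechanism concrete. In the sketch below (all probabilities invented), success depends only on a hidden “conscientious parents” variable, which also drives the pajama choice; the pajamas themselves have exactly zero causal effect:

```python
import random

random.seed(2)

n = 100_000
successes = {True: 0, False: 0}
counts = {True: 0, False: 0}

for _ in range(n):
    conscientious = random.random() < 0.3                       # hidden confounder
    pajamas = random.random() < (0.9 if conscientious else 0.05)
    success = random.random() < (0.6 if conscientious else 0.1)  # no pajama term!
    counts[pajamas] += 1
    successes[pajamas] += success

rates = {pj: successes[pj] / counts[pj] for pj in (True, False)}
print(f"success rate, wore purple pajamas: {rates[True]:.2f}")
print(f"success rate, no purple pajamas:   {rates[False]:.2f}")
```

The observed association is enormous even though the causal effect is exactly zero, which is why randomly assigning the treatment, rather than observing who chooses it, is the only clean way to separate the two.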

If statistics is detective work, then the data are the clues. My wife spent a year teaching high school students in rural New Hampshire. One of her students was arrested for breaking into a hardware store and stealing some tools. The police were able to crack the case because (1) it had just snowed and there were tracks in the snow leading from the hardware store to the student’s home; and (2) the stolen tools were found inside. Good clues help. So do good data. But first you have to get good data, and that is a lot harder than it seems.

Public Open Sensor Data: Revolutionizing Smart Cities


New paper in IEEE Technology and Society Magazine (vol. 32, no. 4): “Local governments have decided to take advantage of the presence of wireless sensor networks (WSNs) in their cities to efficiently manage several applications in their daily responsibilities. The enormous amount of information collected by sensor devices allows the automation of several real-time services to improve city management by using intelligent traffic-light patterns during rush hour, reducing water consumption in parks, or efficiently routing garbage collection trucks throughout the city [1]. The sensor information required by these examples is mostly self-consumed by city-designed applications and managers.”

Imagining Data Without Division


Thomas Lin in Quanta Magazine: “As science dives into an ocean of data, the demands of large-scale interdisciplinary collaborations are growing increasingly acute…Seven years ago, when David Schimel was asked to design an ambitious data project called the National Ecological Observatory Network, it was little more than a National Science Foundation grant. There was no formal organization, no employees, no detailed science plan. Emboldened by advances in remote sensing, data storage and computing power, NEON sought answers to the biggest question in ecology: How do global climate change, land use and biodiversity influence natural and managed ecosystems and the biosphere as a whole?…
For projects like NEON, interpreting the data is a complicated business. Early on, the team realized that its data, while mid-size compared with the largest physics and biology projects, would be big in complexity. “NEON’s contribution to big data is not in its volume,” said Steve Berukoff, the project’s assistant director for data products. “It’s in the heterogeneity and spatial and temporal distribution of data.”
Unlike the roughly 20 critical measurements in climate science or the vast but relatively structured data in particle physics, NEON will have more than 500 quantities to keep track of, from temperature, soil and water measurements to insect, bird, mammal and microbial samples to remote sensing and aerial imaging. Much of the data is highly unstructured and difficult to parse — for example, taxonomic names and behavioral observations, which are sometimes subject to debate and revision.
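One way to picture that heterogeneity is a single observation stream whose values range from calibrated numbers to free-form taxonomic notes. The record type below is purely hypothetical (NEON's actual data model is certainly far richer), but it shows why such data resists a single flat table:

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# Hypothetical record type for a heterogeneous observation stream:
# the same container holds calibrated sensor readings and free-form
# field notes that may later be revised.
@dataclass
class Observation:
    site: str
    quantity: str               # e.g. "soil_temperature", "taxon_sighting"
    value: Any                  # float for sensors, str for field notes
    units: Optional[str] = None
    tags: dict = field(default_factory=dict)

stream = [
    Observation("HARV", "soil_temperature", 12.4, units="degC"),
    Observation("HARV", "taxon_sighting", "Poecile atricapillus",
                tags={"behavior": "foraging", "status": "provisional"}),
]

# Only the numeric readings drop straight into a conventional table;
# the rest needs parsing, controlled vocabularies, and revision tracking.
numeric = [o for o in stream if isinstance(o.value, (int, float))]
print(len(numeric), "of", len(stream), "observations are directly tabular")
```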
And, as daunting as the looming data crush appears from a technical perspective, some of the greatest challenges are wholly nontechnical. Many researchers say the big science projects and analytical tools of the future can succeed only with the right mix of science, statistics, computer science, pure mathematics and deft leadership. In the big data age of distributed computing — in which enormously complex tasks are divided across a network of computers — the question remains: How should distributed science be conducted across a network of researchers?
Part of the adjustment involves embracing “open science” practices, including open-source platforms and data analysis tools, data sharing and open access to scientific publications, said Chris Mattmann, 32, who helped develop a precursor to Hadoop, a popular open-source data analysis framework that is used by tech giants like Yahoo, Amazon and Apple and that NEON is exploring. Without developing shared tools to analyze big, messy data sets, Mattmann said, each new project or lab will squander precious time and resources reinventing the same tools. Likewise, sharing data and published results will obviate redundant research.
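The shared-tooling argument is easier to see against the map/shuffle/reduce pattern that Hadoop popularized. The sketch below runs all three phases in one Python process purely for illustration; in a real Hadoop cluster, each mapper and reducer could run on a different machine:

```python
from collections import defaultdict
from itertools import chain

# Minimal single-process illustration of the map/shuffle/reduce
# pattern (word counting, the classic example).

def mapper(line):
    # map: emit a (word, 1) pair for every word in one line of input
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # shuffle: group all emitted values by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    # reduce: combine all values for one key into a single result
    return key, sum(values)

lines = ["big data needs shared tools", "shared tools save time"]
mapped = chain.from_iterable(mapper(line) for line in lines)
counts = dict(reducer(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

Because the three phases only communicate through key-value pairs, the same mapper and reducer code can be reused across projects, which is precisely the kind of shared tooling Mattmann argues saves each new lab from reinventing.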
To this end, international representatives from the newly formed Research Data Alliance met this month in Washington to map out their plans for a global open data infrastructure.”

User-Generated Content Is Here to Stay


In the Huffington Post: “The way media are transmitted has changed dramatically over the last 10 years. User-generated content (UGC) has completely changed the landscape of social interaction, media outreach, consumer understanding, and everything in between. Today, UGC is media generated by the consumer instead of by traditional journalists and reporters. This is a movement defying and redefining traditional norms at the same time. Current events are largely publicized on Twitter and Facebook by the average person, not by a photojournalist hired by a news organization. In the past, these large news corporations dominated the headlines — literally — and owned the monopoly on public media. Yet with the advent of smartphones and the spread of social media, everything has changed. The entire industry has been transformed; smartphones have changed how information is collected, packaged, edited, and conveyed for mass distribution. UGC allows for raw and unfiltered movement of content at lightning speed. With the way that the world works today, it is the most reliable way to get information out. One thing that is certain is that UGC is here to stay whether we like it or not, and it is driving much more of modern journalistic content than the average person realizes.
Think about recent natural disasters where images are captured by citizen journalists using their iPhones. During Hurricane Sandy, 800,000 photos were uploaded to Instagram with “#Sandy.” Time magazine even hired five iPhoneographers to photograph the wreckage for its Instagram page. During the May 2013 Oklahoma City tornadoes, the first photo released was actually captured by a smartphone. This real-time footage brings environmental chaos to your doorstep in a chillingly personal way, especially considering that the photographer of the first tornado photos was ultimately killed by the tornado. UGC has been monumental for criminal investigations and man-made catastrophes. Most notably, the Boston Marathon bombing was covered by UGC in the most unforgettable way. Dozens of images poured in identifying possible Boston bombers, to both the detriment and benefit of public officials and investigators. Though these images inflicted considerable damage on innocent bystanders sporting suspicious backpacks, ultimately it was also smartphone images that highlighted the presence of the Tsarnaev brothers. This phenomenon isn’t limited to America. Would the so-called Arab Spring have happened without social media and UGC? Syrians, Egyptians, and citizens from numerous nations facing protests can easily publicize controversial images and statements to be shared worldwide….
This trend is not temporary but will only expand. The first iPhone launched in 2007, and the world has never been the same. New smartphones are released each month with better cameras and faster processors than computers had even just a few years ago….”