Need for Co-creating Urban Data Collaborative


Blog by Gaurav Godhwani: “…The Government of India has initiated various urban reforms for our cities like — Atal Mission for Rejuvenation and Urban Transformation 2.0 (AMRUT 2.0), Smart Cities Mission (SCM), Swachh Bharat Mission 2.0 (SBM-Urban 2.0) and development of Urban & Industrial Corridors. To help empower cities with data, the Ministry of Housing & Urban Affairs (MoHUA) has also launched various data initiatives including — the DataSmart Cities Strategy, Data Maturity Assessment Framework, Smart Cities Open Data Portal, City Innovation Exchange, India Urban Data Exchange and the India Urban Observatory.

Unfortunately, most urban data remains in silos, and the capacities of our cities to harness urban data to improve decision-making and strengthen citizen participation continue to be limited. As per the last Data Maturity Assessment Framework (DMAF) assessment conducted in November 2020 by MoHUA, among 100 smart cities only 45 have drafted/approved their City Data Policies, with just 32 cities having a dedicated data budget in 2020–21 for data-related activities. Moreover, in terms of fostering data collaborations, only 12 cities formed data alliances to achieve tangible outcomes. We hope smart cities continue this practice by conducting a yearly self-assessment to progress in their journey to harness data for improving their urban planning.

Seeding Urban Data Collaborative to advance City-level Data Engagements

There is a need to bring together a diverse set of stakeholders including governments, civil societies, academia, businesses and startups, volunteer groups and more to share and exchange urban data in a secure, standardised and interoperable manner, deriving more value from re-using data for participatory urban development. Along with improving data sharing among these stakeholders, it is necessary to regularly convene, ideate and conduct capacity building sessions and institutionalise data practices.

Urban Data Collaborative can bring together such diverse stakeholders who could address some of these perennial challenges in the ecosystem while spurring innovation…(More)”

Improving Governance Outcomes Through AI Documentation: Bridging Theory and Practice 


Report by Amy Winecoff and Miranda Bogen: “AI documentation is a foundational tool for governing AI systems, serving stakeholders both within and outside AI organizations. It offers a range of stakeholders insight into how AI systems are developed, how they function, and what risks they may pose. For example, it might help internal model development, governance, compliance, and quality assurance teams communicate about and manage risk throughout the development and deployment lifecycle. Documentation can also help external technology developers determine what testing they should perform on models they incorporate into their products, or it could guide users on whether or not to adopt a technology. While documentation is essential for effective AI governance, its success depends on how well organizations tailor their documentation approaches to meet the diverse needs of stakeholders, including technical teams, policymakers, users, and other downstream consumers of the documentation.

This report synthesizes findings from an in-depth analysis of academic and gray literature on documentation, encompassing 37 proposed methods for documenting AI data, models, systems, and processes, along with 21 empirical studies evaluating the impact and challenges of implementing documentation. Through this synthesis, we identify key theoretical mechanisms through which AI documentation can enhance governance outcomes. These mechanisms include informing stakeholders about the intended use, limitations, and risks of AI systems; facilitating cross-functional collaboration by bridging different teams; prompting ethical reflection among developers; and reinforcing best practices in development and governance. However, empirical evidence offers mixed support for these mechanisms, indicating that documentation practices can be more effectively designed to achieve these goals…(More)”.

China’s Hinterland Becomes A Critical Datascape


Article by Gary Zhexi Zhang: “In 2014, the southwestern province of Guizhou, a historically poor and mountainous area, beat out rival regions to become China’s first “Big Data Comprehensive Pilot Zone,” as part of a national directive to develop the region — which is otherwise best known as an exporter of tobacco, spirits and coal — into the infrastructural backbone of the country’s data industry. Since then, vast investment has poured into the province. Thousands of miles of highway and high-speed rail tunnel through the mountains. Driving through the province can feel vertiginous: Of the hundred highest bridges in the world, almost half are in Guizhou, and almost all were built in the last 15 years.

In 2015, Xi Jinping visited Gui’an New Area to inaugurate the province’s transformation into China’s “Big Data Valley,” exemplifying the central government’s goal to establish “high quality social and economic development,” ubiquitously advertised through socialist-style slogans plastered on highways and city streets…(More)”.

Why Is There Data?


Paper by David Sisson and Ilan Ben-Meir: “In order for data to become truly valuable (and truly useful), that data must first be processed. The question animating this essay is thus a straightforward one: What sort of processing must data undergo, in order to become valuable? While the question may be obvious, its answers are anything but; indeed, reaching them will require us to pose, answer – and then revise our answers to – several other questions that will prove trickier than they first appear: Why is data valuable – what is it for? What is “data”? And what does “working with data” actually involve?…(More)”

AI in Global Development Playbook


USAID Playbook: “…When used effectively and responsibly, AI holds the potential to accelerate progress on sustainable development and close digital divides, but it also poses risks that could further impede progress toward these goals. With the right enabling environment and ecosystem of actors, AI can enhance efficiency and accelerate development outcomes in sectors such as health, education, agriculture, energy, manufacturing, and delivering public services. The United States aims to ensure that the benefits of AI are shared equitably across the globe.

Distilled from consultations with hundreds of government officials, non-governmental organizations, technology firms and startups, and individuals from around the world, the AI in Global Development Playbook is a roadmap to develop the capacity, ecosystems, frameworks, partnerships, applications, and institutions to leverage safe, secure, and trustworthy AI for sustainable development.

The United States’ current efforts are grounded in the belief that AI, when developed and deployed responsibly, can be a powerful force for achieving the Sustainable Development Goals and addressing some of the world’s most urgent challenges. Looking ahead, the United States will continue to support low- and middle-income countries through funding, advocacy, and convening efforts, collectively navigating the complexities of the digital age and working toward a future in which the benefits of technological development are widely shared.

This Playbook seeks to underscore AI as a uniquely global opportunity with far-reaching impacts and potential risks. It highlights that safe, secure, and trustworthy design, deployment, and use of AI is not only possible but essential. Recognizing that international cooperation and multi-stakeholder partnerships are key in achieving progress, we invite others to contribute their expertise, resources, and perspectives to enrich and expand this framework.

The true measure of progress in responsible AI is not in the sophistication of our machines but in the quality of life the technology enhances. Together we can work toward ensuring the promise of AI is realized in service of this goal…(More)”

Artificial intelligence (AI) in action: A preliminary review of AI use for democracy support


Policy paper by Grahm Tuohy-Gaydos: “…provides a working definition of AI for Westminster Foundation for Democracy (WFD) and the broader democracy support sector. It then provides a preliminary review of how AI is being used to enhance democratic practices worldwide, focusing on several themes including: accountability and transparency, elections, environmental democracy, inclusion, openness and participation, and women’s political leadership. The paper also highlights potential risks and areas of development in the future. Finally, the paper shares five recommendations for WFD and democracy support organisations to consider advancing their ‘digital democracy’ agenda. This policy paper also offers additional information regarding AI classification and other resources for identifying good practice and innovative solutions. Its findings may be relevant to WFD staff members, international development practitioners, civil society organisations, and persons interested in using emerging technologies within governmental settings…(More)”.

China’s biggest AI model is challenging American dominance


Article by Sam Eifling: “So far, the AI boom has been dominated by U.S. companies like OpenAI, Google, and Meta. In recent months, though, a new name has been popping up on benchmarking lists: Alibaba’s Qwen. Over the past few months, variants of Qwen have been topping the leaderboards of sites that measure an AI model’s performance.

“Qwen 72B is the king, and Chinese models are dominating,” Hugging Face CEO Clem Delangue wrote in June, after a Qwen-based model first rose to the top of his company’s Open LLM leaderboard.

It’s a surprising turnaround for the Chinese AI industry, which many thought was doomed by semiconductor restrictions and limitations on computing power. Qwen’s success is showing that China can compete with the world’s best AI models — raising serious questions about how long U.S. companies will continue to dominate the field. And by focusing on capabilities like language support, Qwen is breaking new ground on what an AI model can do — and who it can be built for.

Those capabilities have come as a surprise to many developers, even those working on Qwen itself. AI developer David Ng used Qwen to build the model that topped the Open LLM leaderboard. He has also built models using Meta’s and Google’s technology, but says Alibaba’s gave him the best results. “For some reason, it works best on the Chinese models,” he told Rest of World. “I don’t know why.”…(More)”
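For readers curious how developers build on such open checkpoints, below is a minimal sketch of loading a Qwen model through the Hugging Face transformers library; the model ID, generation settings, and prompt are illustrative assumptions rather than details from the article.

```python
# Minimal sketch of building on an open Qwen checkpoint with Hugging Face
# transformers. The model ID and settings are assumptions for illustration;
# smaller Qwen variants exist for machines without large GPUs.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2-72B-Instruct"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize the benefits of multilingual language support."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```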

Why is it so hard to establish the death toll?


Article by Smriti Mallapaty: “Given the uncertainty of counting fatalities during conflict, researchers use other ways to estimate mortality.

One common method uses household surveys, says Debarati Guha-Sapir, an epidemiologist who specializes in civil conflicts at the University of Louvain in Louvain-la-Neuve, Belgium, and is based in Brussels. A sample of the population is asked how many people in their family have died over a specific period of time. This approach has been used to count deaths in conflicts elsewhere, including in Iraq [3] and the Central African Republic [4].

The situation in Gaza right now is not conducive to a survey, given the level of movement and displacement, say researchers. And it would be irresponsible to send data collectors into an active conflict and put their lives at risk, says Ball.

There are also ethical concerns around intruding on people who lack basic access to food and medication to ask about deaths in their families, says Jamaluddine. Surveys will have to wait for the conflict to end and movement to ease, say researchers.

Another approach is to compare multiple independent lists of fatalities and calculate mortality from the overlap between them. The Human Rights Data Analysis Group used this approach to estimate the number of people killed in Syria between 2011 and 2014. Jamaluddine hopes to use the ministry fatality data in conjunction with fatality lists posted on social media by several informal groups to estimate mortality in this way. But Guha-Sapir says this method relies on the population being stable and not moving around, which is often not the case in conflict-affected communities.
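The overlap method described above is generally implemented as capture-recapture (multiple-systems) estimation; the simplest two-list form is the Lincoln-Petersen estimator. The sketch below uses invented list sizes purely for illustration; they are not figures from the article or from the Syria work.

```python
# Minimal two-list capture-recapture (Lincoln-Petersen) sketch.
# List sizes are invented for illustration, not figures from the article.

def lincoln_petersen(n_list_a: int, n_list_b: int, n_both: int) -> float:
    """Estimate total deaths from two independently compiled fatality lists.

    n_list_a -- records on list A (e.g. a ministry list)
    n_list_b -- records on list B (e.g. lists compiled from social media)
    n_both   -- records matched on both lists
    """
    if n_both == 0:
        raise ValueError("the lists must share at least one matched record")
    return n_list_a * n_list_b / n_both

# Hypothetical example: 8,000 records on one list, 5,000 on another,
# 2,000 matched on both -> roughly 20,000 total deaths estimated.
print(lincoln_petersen(8_000, 5_000, 2_000))
```

The estimate is only as good as its assumptions: the lists must be compiled independently and the population must be closed, which is the stability caveat Guha-Sapir raises.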

In addition to deaths immediately caused by the violence, some civilians die from the spread of infectious diseases, starvation or lack of access to health care. In February, Jamaluddine and her colleagues used modelling to make projections of excess deaths due to the war and found that, in a continued scenario of six months of escalated conflict, 68,650 people could die from traumatic injuries, 2,680 from non-communicable diseases such as cancer and 2,720 from infectious diseases — along with thousands more if an epidemic were to break out. On 30 July, the ministry declared a polio epidemic in Gaza after detecting the virus in sewage samples, and in mid-August it confirmed the first case of polio in 25 years, in a 10-month-old baby…

The longer the conflict continues, the harder it will be to get reliable estimates, because “reports by survivors get worse as time goes by”, says Jon Pedersen, a demographer at !Mikro in Oslo, who advises international agencies on mortality estimates…(More)”.

Germany’s botched data revamp leaves economists ‘flying blind’


Article by Olaf Storbeck: “Germany’s statistical office has suspended some of its most important indicators after botching a data update, leaving citizens and economists in the dark at a time when the country is trying to boost flagging growth.

In a nation once famed for its punctuality and reliability, even its notoriously diligent beancounters have become part of a growing perception that “nothing works any more” as Germans moan about delayed trains, derelict roads and bridges, and widespread staff shortages.

“There used to be certain aspects in life that you could just rely on, and the fact that official statistics are published on time was one of them — not any more,” said Jörg Krämer, chief economist of Commerzbank, adding that the suspended data was also closely watched by monetary policymakers and investors.

Since May the Federal Statistical Office (Destatis) has not updated time-series data for retail and wholesale sales, as well as revenue from the services sector, hospitality, car dealers and garages.

These indicators, which are published monthly and adjusted for seasonal changes, are a key component of GDP and crucial for assessing consumer demand in the EU’s largest economy.

Private consumption accounted for 52.7 per cent of German output in 2023. Retail sales made up 28 per cent of private consumption but shrank 3.4 per cent from a year earlier. Overall GDP declined 0.3 per cent last year, Destatis said.

The Wiesbaden-based authority, which was established in 1948, said the outages had been caused by IT issues and a complex methodological change in EU business statistics in a bid to boost accuracy.

Destatis has been working on the project since the EU directive in 2019, and the deadline for implementing the changes is December.

But a series of glitches, data issues and IT delays meant Destatis has been unable to publish retail sales and other services data for four months.

A key complication is that the revenues of companies that operate in both services and manufacturing will now be reported separately for each sector. In the past, all revenue was treated as either services or manufacturing, depending on which unit was bigger…(More)”
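As a rough illustration of the change described above (firm figures and field names below are hypothetical, not drawn from Destatis documentation), the old rule assigned a mixed firm's entire revenue to its larger unit, while the new rule reports each unit's revenue under its own sector:

```python
# Hypothetical illustration of the reporting change; numbers are invented.
firm = {"services": 40.0, "manufacturing": 60.0}  # revenue in EUR mn

# Old treatment: all revenue booked to whichever unit is bigger.
dominant = max(firm, key=firm.get)
old_report = {sector: 0.0 for sector in firm}
old_report[dominant] = sum(firm.values())

# New treatment: each unit's revenue reported under its own sector.
new_report = dict(firm)

print(old_report)  # {'services': 0.0, 'manufacturing': 100.0}
print(new_report)  # {'services': 40.0, 'manufacturing': 60.0}
```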

Synthetic Data and Social Science Research


Paper by Jordan C. Stanley & Evan S. Totty: “Synthetic microdata – data retaining the structure of original microdata while replacing original values with modeled values for the sake of privacy – presents an opportunity to increase access to useful microdata for data users while meeting the privacy and confidentiality requirements for data providers. Synthetic data could be sufficient for many purposes, but lingering accuracy concerns could be addressed with a validation system through which the data providers run the external researcher’s code on the internal data and share cleared output with the researcher. The U.S. Census Bureau has experience running such systems. In this chapter, we first describe the role of synthetic data within a tiered data access system and the importance of synthetic data accuracy in achieving a viable synthetic data product. Next, we review results from a recent set of empirical analyses we conducted to assess accuracy in the Survey of Income & Program Participation (SIPP) Synthetic Beta (SSB), a Census Bureau product that made linked survey-administrative data publicly available. Given this analysis and our experience working on the SSB project, we conclude with thoughts and questions regarding future implementations of synthetic data with validation…(More)”
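As a toy illustration of the core idea the paper describes (original values replaced with draws from a model fitted to them), the sketch below synthesizes a single income column; the numbers and the simple normal model are assumptions for illustration, and real products such as the SSB rely on far richer joint models plus validation against the confidential data.

```python
# Toy sketch of synthetic microdata: keep the structure of the original
# column but release modeled draws instead of the confidential values.
# Values and the simple normal model are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(seed=0)

# Pretend these are confidential survey incomes (invented numbers).
confidential_income = np.array([31_000, 45_500, 52_000, 28_750, 61_200])

# "Model" step: fit a simple parametric distribution to the real values.
mu = confidential_income.mean()
sigma = confidential_income.std(ddof=1)

# "Synthesis" step: release draws from the fitted model, not the originals.
synthetic_income = rng.normal(mu, sigma, size=confidential_income.size)

print(synthetic_income.round(-2))  # same shape, modeled values
```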