Federal Sources of Entrepreneurship Data: A Compendium


Compendium developed by Andrew Reamer: “The E.M. Kauffman Foundation has asked the George Washington Institute of Public Policy (GWIPP) to prepare a compendium of federal sources of data on self-employment, entrepreneurship, and small business development. The Foundation believes that the availability of useful, reliable federal data on these topics would enable robust descriptions and explanations of entrepreneurship trends in the United States and so help guide the development of effective entrepreneurship policies.


Achieving these ends first requires the identification and detailed description of available federal datasets, as provided in this compendium. Its contents include:

  • An overview and discussion of 18 datasets from four federal agencies, organized by two categories and five subcategories.
  • Tables providing information on each dataset, including:
    • scope of coverage of self-employed, entrepreneurs, and businesses;
    • data collection methods (nature of data source, periodicity, sampling frame, sample size);
    • dataset variables (owner characteristics, business characteristics and operations, geographic areas);
    • data release schedule; and
    • data access by format (including fixed tables, interactive tools, API, FTP download, public use microdata samples [PUMS], and confidential microdata).
  • For each dataset, examples of studies, if any, that use the data source to describe and explain trends in entrepreneurship.

The author’s aim is for the compendium to facilitate an assessment of the strengths and weaknesses of currently available federal datasets, discussion about how data availability and value can be improved, and implementation of desired improvements…(More)”

The Neuroscience of Trust


Paul J. Zak at Harvard Business Review: “…About a decade ago, in an effort to understand how company culture affects performance, I began measuring the brain activity of people while they worked. The neuroscience experiments I have run reveal eight ways that leaders can effectively create and manage a culture of trust. I’ll describe those strategies and explain how some organizations are using them to good effect. But first, let’s look at the science behind the framework.

What’s Happening in the Brain

Back in 2001 I derived a mathematical relationship between trust and economic performance. Though my paper on this research described the social, legal, and economic environments that cause differences in trust, I couldn’t answer the most basic question: Why do two people trust each other in the first place? Experiments around the world have shown that humans are naturally inclined to trust others—but don’t always. I hypothesized that there must be a neurologic signal that indicates when we should trust someone. So I started a long-term research program to see if that was true….

How to Manage for Trust

Through the experiments and the surveys, I identified eight management behaviors that foster trust. These behaviors are measurable and can be managed to improve performance.

Recognize excellence.

The neuroscience shows that recognition has the largest effect on trust when it occurs immediately after a goal has been met, when it comes from peers, and when it’s tangible, unexpected, personal, and public. Public recognition not only uses the power of the crowd to celebrate successes, but also inspires others to aim for excellence. And it gives top performers a forum for sharing best practices, so others can learn from them….(More)”.

Assessing employer intent when AI hiring tools are biased


Report by Caitlin Chin at Brookings: “When it comes to gender stereotypes in occupational roles, artificial intelligence (AI) has the potential to either mitigate historical bias or heighten it. In the case of the Word2vec model, AI appears to do both.

Word2vec is a publicly available algorithmic model built on millions of words scraped from online Google News articles, which computer scientists commonly use to analyze word associations. In 2016, Microsoft and Boston University researchers revealed that the model picked up gender stereotypes existing in online news sources—and furthermore, that these biased word associations were overwhelmingly job related. Upon discovering this problem, the researchers neutralized the biased word correlations in their specific algorithm, writing that “in a small way debiased word embeddings can hopefully contribute to reducing gender bias in society.”

Their study draws attention to a broader issue with artificial intelligence: Because algorithms often emulate the training datasets that they are built upon, biased input datasets could generate flawed outputs. Because many contemporary employers utilize predictive algorithms to scan resumes, direct targeted advertising, or even conduct face- or voice-recognition-based interviews, it is crucial to consider whether popular hiring tools might be susceptible to the same cultural biases that the researchers discovered in Word2vec.

In this paper, I discuss how hiring is a multi-layered and opaque process and how it will become more difficult to assess employer intent as recruitment processes move online. Because intent is a critical aspect of employment discrimination law, I ultimately suggest four ways to include it in the discussion surrounding algorithmic bias….(More)”

This report from The Brookings Institution’s Artificial Intelligence and Emerging Technology (AIET) Initiative is part of “AI and Bias,” a series that explores ways to mitigate possible biases and create a pathway toward greater fairness in AI and emerging technologies.
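
To make the idea of "neutralizing" a biased word association concrete, here is a minimal Python sketch. It is not the researchers' code and does not use real Word2vec vectors: the three-dimensional vectors, the word list, and the function names are all invented for illustration. It assumes that a gender direction can be approximated by the difference between the vectors for "he" and "she", and that neutralizing a word means removing the part of its vector that lies along that direction.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(v, v))
    return [a / length for a in v]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return dot(normalize(u), normalize(v))

def neutralize(word_vec, direction):
    """Remove the component of word_vec that lies along direction."""
    g = normalize(direction)
    projection = dot(word_vec, g)
    return [w - projection * gi for w, gi in zip(word_vec, g)]

# Toy 3-dimensional vectors, invented for illustration only.
# Real Word2vec embeddings have hundreds of dimensions.
embeddings = {
    "he": [1.0, 0.2, 0.1],
    "she": [-1.0, 0.2, 0.1],
    "engineer": [0.4, 0.9, 0.3],
}

# Assumed gender direction: the difference between "he" and "she".
gender_direction = [a - b for a, b in zip(embeddings["he"], embeddings["she"])]

before = cosine(embeddings["engineer"], gender_direction)
after = cosine(neutralize(embeddings["engineer"], gender_direction), gender_direction)

print(f"association before neutralizing: {before:.3f}")
print(f"association after neutralizing:  {after:.3f}")
```

In this toy setup the occupation word starts with a positive association with the gender direction and ends with none. Whether a hiring tool built on such embeddings behaves fairly would still depend on everything else in its pipeline.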

Civic Duty Days: One Way Employers Can Strengthen Democracy


Blog by Erin Barnes: “As an employer, I’m always looking for structural ways to support my team in their health and wellbeing. We know that individual health is so often tied to community health: strong communities mean, among other things, better health outcomes, reduced crime, and better education for our children, so making space for my team to be able to be active participants in their neighborhoods gives them and their families better health outcomes. So, from my perspective, allowing time to give back to the community is just as important as providing sick days.

When my cofounder Brandon Whitney and I started ioby — a nonprofit focused on building civic leadership in our neighborhoods — we wanted our internal organizational values to reflect our mission. For example, we’ve always given Election Day off, and Brandon created ioby’s Whole Person Policy inspired by the work of Parker Palmer. And a few years ago, after a series of high-profile killings of people of color by police made it difficult for many of our staff to feel fully present at work while also showing up for those in their community who were struggling with pain and grief, we decided to add an additional 5 days of Paid Time Off (PTO) for civic duty.

At ioby, a Civic Duty Day is not the same as jury duty. Civic Duty Days are designed to give ioby staff the time to do what we need to do to be active participants involved in everyday democracy. Activities can include neighborhood volunteering, get-out-the-vote volunteering, fundraising, self-care and community-care to respond to local and national emergencies, writing letters, meeting with local elected officials, making calls, going to a healing workshop, and personal health to recover from civic duty activities that fall on weekends.

A couple weeks ago, at a retreat with other nonprofit leaders, we were discussing structural ways to increase civic participation in the United States. Given that nearly 15% of Americans cite lack of time as their reason for not voting, and 75% of Americans cite it as their reason for not volunteering, employers can make a big difference in how Americans show up in public life.

I asked my team what sorts of things they’ve used Civic Duty Days for. In addition to the typical answers about park cleanups, phone banking, door knocking and canvassing, postcard writing, attending demonstrations like the Women’s March and the Climate Strike, I heard some interesting stories.

  • One ioby staff person used her Civic Duty Days to attend Reverse Ride Alongs, where she acts as a guide for cadets for the entire day. This program lets cadets see the community they will be serving and gives the community a voice in how it sees policing and how it would best like to be approached by new police officers.
  • An ioby staff person used Civic Duty Days to attend the trial of an activist who was arrested for protesting; this would have been impossible to attend otherwise since trials are often during the day.
  • Another ioby staff person used his days to stay home with his kids while his wife attended demonstrations….(More)”

The Passion Economy and the Future of Work


Li Jin at Andreessen-Horowitz: “The top-earning writer on the paid newsletter platform Substack earns more than $500,000 a year from reader subscriptions. The top content creator on Podia, a platform for video courses and digital memberships, makes more than $100,000 a month. And teachers across the US are bringing in thousands of dollars a month teaching live, virtual classes on Outschool and Juni Learning.

These stories are indicative of a larger trend: call it the “creator stack” or the “enterprization of consumer.” Whereas previously, the biggest online labor marketplaces flattened the individuality of workers, new platforms allow anyone to monetize unique skills. Gig work isn’t going anywhere—but there are now more ways to capitalize on creativity. Users can now build audiences at scale and turn their passions into livelihoods, whether that’s playing video games or producing video content. This has huge implications for entrepreneurship and what we’ll think of as a “job” in the future….(More)”.

Tracking the Labor Market with “Big Data”


Tomaz Cajner, Leland Crane, Ryan Decker, Adrian Hamins-Puertolas, and Christopher Kurz at FEDSNotes: “Payroll employment growth is one of the most reliable business cycle indicators. Each postwar recession in the United States has been characterized by a year-on-year drop in payroll employment as measured by the BLS Current Employment Statistics (CES) survey, and, outside of these recessionary declines, the year-on-year payroll employment growth has always been positive. Thus, it is not surprising that policymakers, financial markets, and the general public pay a great deal of attention to the CES payroll employment gains reported at the beginning of each month.

However, while the CES survey is one of the most carefully conducted measures of labor market activity and uses an extremely large sample, it is still subject to significant sampling error and nonsampling errors. For example, when the BLS first reported that private nonfarm payroll gains were 148,000 in July 2019, the associated 90 percent confidence interval was +/- 100,000 due to sampling error alone….

One such source of alternative labor market data is the payroll-processing company ADP, which covers 20 percent of the private workforce. These are the data that underlie ADP’s monthly National Employment Report (NER), which forecasts BLS payroll employment changes by using a combination of ADP-derived data and other publicly available data. In our research, we explore the information content of the ADP microdata alone by producing an estimate of employment changes independent from the BLS payroll series as well as from other data sources.

A potential concern when using the ADP data is that only the firms which hire ADP to manage their payrolls will appear in the data, and this may introduce sample selection issues….(More)”
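
As a worked illustration of the figures quoted above, a reported gain of 148,000 with a 90 percent confidence interval of +/- 100,000 implies a range of roughly 48,000 to 248,000 jobs. The Python sketch below reproduces that arithmetic and backs out the standard error that such a margin would imply. It assumes the sampling error is approximately normal, which is an assumption made here for illustration and not a statement from the authors; the function name is also hypothetical.

```python
from statistics import NormalDist

def interval_from_margin(estimate, margin, confidence=0.90):
    """Return (low, high, implied_standard_error) for a symmetric interval.

    Assumes approximately normal sampling error; this is an illustrative
    assumption, not something stated in the excerpt.
    """
    # Two-sided critical value, about 1.645 for 90 percent confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate - margin, estimate + margin, margin / z

# Figures from the excerpt: July 2019 private nonfarm payroll gain and margin.
low, high, se = interval_from_margin(148_000, 100_000)

print(f"90% interval: {low:,.0f} to {high:,.0f}")
print(f"implied standard error: about {se:,.0f}")
```

Under that assumption the implied standard error is on the order of 60,000 jobs, which helps show why a single monthly reading is hard to interpret on its own.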

The Internet Relies on People Working for Free


Owen Williams at OneZero: “When you buy a product like Philips Hue’s smart lights or an iPhone, you probably assume the people who wrote their code are being paid. While that’s true for those who directly author a product’s software, virtually every tech company also relies on thousands of bits of free code, made available through “open-source” projects on sites like GitHub and GitLab.

Often these developers are happy to work for free. Writing open-source software allows them to sharpen their skills, gain perspectives from the community, or simply help the industry by making innovations available at no cost. According to Google, which maintains hundreds of open-source projects, open source “enables and encourages collaboration and the development of technology, solving real-world problems.”

But when software used by millions of people is maintained by a community of people, or a single person, all on a volunteer basis, sometimes things can go horribly wrong. The catastrophic Heartbleed bug of 2014, which compromised the security of hundreds of millions of sites, was caused by a problem in an open-source library called OpenSSL, which relied on a single full-time developer not making a mistake as they updated and changed that code, used by millions. Other times, developers grow bored and abandon their projects, which can be breached while they aren’t paying attention.

It’s hard to demand that programmers who are working for free troubleshoot problems or continue to maintain software that they’ve lost interest in for whatever reason — though some companies certainly try. Not adequately maintaining these projects, on the other hand, makes the entire tech ecosystem weaker. So some open-source programmers are asking companies to pay, not for their code, but for their support services….(More)”.

Raw data won’t solve our problems — asking the right questions will


Stefaan G. Verhulst in apolitical: “‘If I had only one hour to save the world, I would spend fifty-five minutes defining the questions, and only five minutes finding the answers,’ is a famous aphorism attributed to Albert Einstein.

Behind this quote is an important insight about human nature: Too often, we leap to answers without first pausing to examine our questions. We tout solutions without considering whether we are addressing real or relevant challenges or priorities. We advocate fixes for problems, or for aspects of society, that may not be broken at all.

This misordering of priorities is especially acute — and represents a missed opportunity — in our era of big data. Today’s data has enormous potential to solve important public challenges.

However, policymakers often fail to invest in defining the questions that matter, focusing mainly on the supply side of the data equation (“What data do we have or must have access to?”) rather than the demand side (“What is the core question and what data do we really need to answer it?” or “What data can or should we actually use to solve those problems that matter?”).

As such, data initiatives often provide marginal insights while at the same time generating unnecessary privacy risks by accessing and exploring data that may not in fact be needed at all in order to address the root of our most important societal problems.

A new science of questions

So what are the truly vexing questions that deserve attention and investment today? Toward what end should we strategically seek to leverage data and AI?

The truth is that policymakers and other stakeholders currently don’t have a good way of defining questions or identifying priorities, nor a clear framework to help us leverage the potential of data and data science toward the public good.

This is a situation we seek to remedy at The GovLab, an action research center based at New York University.

Our most recent project, the 100 Questions Initiative, seeks to begin developing a new science and practice of questions, one that identifies the most urgent questions in a participatory manner. Launched last month, the project aims to develop a process that takes advantage of distributed and diverse expertise on a range of given topics or domains so as to identify and prioritize those questions that are high impact, novel and feasible.

Because we live in an age of data and much of our work focuses on the promises and perils of data, we seek to identify the 100 most pressing problems confronting the world that could be addressed by greater use of existing, often inaccessible, datasets through data collaboratives – new forms of cross-disciplinary collaboration beyond public-private partnerships focused on leveraging data for good….(More)”.

Governance sinkholes


Blog post by Geoff Mulgan: “Governance sinkholes appear when shifts in technology, society and the economy throw up the need for new arrangements. Each industrial revolution has created many governance sinkholes – and prompted furious innovation to fill them. The fourth industrial revolution will be no different. But most governments are too distracted to think about what to do to fill these holes, let alone to act. This blog sets out my diagnosis – and where I think the most work is needed to design new institutions….

It’s not too hard to get a map of the fissures and gaps – and to see where governance is needed but is missing. There are all too many of these now.

Here are a few examples. One is long-term care, currently missing adequate financing, regulation, information and navigation tools, despite its huge and growing significance. The obvious contrast is with acute healthcare, which, for all its problems, is rich in institutions and governance.

A second example is lifelong learning and training. Again, there is a striking absence of effective institutions to provide funding, navigation, policy and problem solving, and again, the contrast with the institution-rich fields of primary, secondary and tertiary education is striking. The position on welfare is not so different, as is the absence of institutions fit for purpose in supporting people in precarious work.

I’m particularly interested in another kind of sinkhole: the absence of the right institutions to handle data and knowledge – at global, national and local levels – now that these dominate the economy, and much of daily life. In field after field, there are huge potential benefits to linking data sets and connecting artificial and human intelligence to spot patterns or prevent problems. But we lack any institutions with either the skills or the authority to do this well, and in particular to think through the trade-offs between the potential benefits and the potential risks….(More)”.

What can the labor flow of 500 million people on LinkedIn tell us about the structure of the global economy?


Paper by Jaehyuk Park et al: “…One of the most popular concepts for policy makers and business economists to understand the structure of the global economy is “cluster”, the geographical agglomeration of interconnected firms such as Silicon Valley, Wall Street, and Hollywood. By studying those well-known clusters, we come to understand the advantage of participating in a geo-industrial cluster for firms and how it is related to the economic growth of a region.

However, the existing definition of geo-industrial cluster is not systematic enough to reveal the whole picture of the global economy. Often, after being defined as a group of firms in a certain area, geo-industrial clusters are treated as independent of each other. Just as we should consider the interaction between the accounting team and the marketing team to understand the organizational structure of a firm, the relationships among geo-industrial clusters are an essential part of the whole picture….

In this new study, my colleagues and I at Indiana University, with support from LinkedIn, have finally overcome these limitations by defining geo-industrial clusters through labor flow and constructing a global labor flow network from LinkedIn’s individual-level job history dataset. Our access to this data was made possible by our selection as one of 11 teams to participate in the LinkedIn Economic Graph Challenge.

The transitioning of workers between jobs and firms — also known as labor flow — is considered central in driving firms towards geo-industrial clusters due to knowledge spillover and labor market pooling. In response, we mapped the cluster structure of the world economy based on labor mobility between firms during the last 25 years, constructing a “labor flow network.” 

To do this, we leverage LinkedIn’s data on professional demographics and employment histories from more than 500 million people between 1990 and 2015. The network, which captures approximately 130 million job transitions between more than 4 million firms, is the first-ever flow network of global labor.

The resulting “map” allows us to:

  • identify geo-industrial clusters systematically and organically using network community detection;
  • verify the importance of region and industry in labor mobility;
  • compare the relative importance of the two constraints at different hierarchical levels;
  • reveal the practical advantage of the geo-industrial cluster as a unit of future economic analyses;
  • show, at the same time, a better picture of which industry in which region leads the economic growth of that industry or region; and
  • find emerging and declining skills based on their representativeness in growing and declining geo-industrial clusters…(More)”.