Enrollment algorithms are contributing to the crises of higher education

Paper by Alex Engler: “Hundreds of higher education institutions are procuring algorithms that strategically allocate scholarships to convince more students to enroll. In doing so, these enrollment management algorithms help colleges vary the cost of attendance to students’ willingness to pay, a crucial aspect of competition in the higher education market. This paper elaborates on the specific two-stage process by which these algorithms first predict how likely prospective students are to enroll, and second help decide how to disburse scholarships to convince more of those prospective students to attend the college. These algorithms are valuable to colleges for institutional planning and financial stability, as well as to help reach their preferred financial, demographic, and scholastic outcomes for the incoming student body.

Unfortunately, the widespread use of enrollment management algorithms may also be hurting students, especially due to their narrow focus on enrollment. The prevailing evidence suggests that these algorithms generally reduce the amount of scholarship funding offered to students. Further, algorithms excel at identifying a student’s exact willingness to pay, meaning they may drive enrollment while also reducing students’ chances to persist and graduate. The use of this two-step process also opens many subtle channels for algorithmic discrimination to perpetuate unfair financial aid practices. Higher education is already suffering from low graduation rates, high student debt, and stagnant inequality for racial minorities—crises that enrollment algorithms may be making worse.

This paper offers a range of recommendations to ameliorate the risks of enrollment management algorithms in higher education. Categorically, colleges should not use predicted likelihood to enroll in either the admissions process or in awarding need-based aid—these determinations should only be made based on the applicant’s merit and financial circumstances, respectively. When colleges do use algorithms to distribute scholarships, they should proceed cautiously and document their data, processes, and goals. Colleges should also examine how scholarship changes affect students’ likelihood to graduate, or whether they may deepen inequities between student populations. Colleges should also ensure an active role for humans in these processes, such as exclusively using people to evaluate application quality and hiring internal data scientists who can challenge algorithmic specifications. State policymakers should consider the expanding role of these algorithms too, and should try to create more transparency about their use in public institutions. More broadly, policymakers should consider enrollment management algorithms as a concerning symptom of pre-existing trends towards higher tuition, more debt, and reduced accessibility in higher education….(More)”.
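The two-stage process described in the abstract can be illustrated with a minimal sketch. It is not the method of any vendor or of the paper: the logistic form, the coefficient names (`base_logit`, `aid_sensitivity`), the greedy allocation rule and the three students are all assumptions made up for illustration. Stage one predicts enrollment probability for a given aid offer; stage two spends a fixed budget where that probability rises fastest.

```python
import math

def enroll_probability(aid, base_logit, aid_sensitivity):
    """Stage 1 (assumed logistic model): probability of enrolling at a given aid offer."""
    return 1.0 / (1.0 + math.exp(-(base_logit + aid_sensitivity * aid)))

def marginal_gain(student, current_aid, step):
    """Increase in predicted enrollment probability from one more unit of aid."""
    before = enroll_probability(current_aid, student["base_logit"], student["aid_sensitivity"])
    after = enroll_probability(current_aid + step, student["base_logit"], student["aid_sensitivity"])
    return after - before

def allocate_aid(students, budget, step=1):
    """Stage 2 (assumed greedy rule): give each unit of aid to the student
    whose predicted enrollment probability rises the most."""
    offers = {s["id"]: 0 for s in students}
    for _ in range(budget // step):
        best = max(students, key=lambda s: marginal_gain(s, offers[s["id"]], step))
        offers[best["id"]] += step
    return offers

# Hypothetical applicants; aid is measured in arbitrary units.
students = [
    {"id": "A", "base_logit": 2.0, "aid_sensitivity": 0.1},    # likely to enroll regardless
    {"id": "B", "base_logit": -0.5, "aid_sensitivity": 0.4},   # undecided and price-sensitive
    {"id": "C", "base_logit": -3.0, "aid_sensitivity": 0.05},  # unlikely to enroll
]
offers = allocate_aid(students, budget=10)
print(offers)
```

In this toy setup the budget concentrates on student B, the applicant whose decision is most responsive to price. Financial need and merit never enter the calculation, which is the narrow focus on enrollment that the paper warns about.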

Artificial intelligence masters’ programs

An analysis “of curricula building blocks” by JRC-European Commission: “This report identifies building blocks of master programs on Artificial Intelligence (AI), on the basis of the existing programs available in the European Union. These building blocks provide a first analysis that requires acceptance and sharing by the AI community. The proposal analyses first, the knowledge contents, and second, the educational competences declared as the learning outcomes, of 45 post-graduate academic masters’ programs related with AI from universities in 13 European countries (Belgium, Denmark, Finland, France, Germany, Italy, Ireland, Netherlands, Portugal, Spain, and Sweden in the EU; plus Switzerland and the United Kingdom).

As a closely related and relevant part of Informatics and Computer Science, major AI-related curricula on data science have been also taken into consideration for the analysis. The definition of a specific AI curriculum besides data science curricula is motivated by the necessity of a deeper understanding of topics and skills of the former that build up the foundations of strong AI versus narrow AI, which is the general focus of the latter. The body of knowledge with the proposed building blocks for AI consists of a number of knowledge areas, which are classified as Essential, Core, General and Applied.

First, the AI Essentials cover topics and competences from foundational disciplines that are fundamental to AI. Second, topics and competences showing a close interrelationship and specific of AI are classified in a set of AI Core domain-specific areas, plus one AI General area for non-domain-specific knowledge. Third, AI Applied areas are built on top of topics and competences required to develop AI applications and services under a more philosophical and ethical perspective. All the knowledge areas are refined into knowledge units and topics for the analysis. As the result of studying core AI knowledge topics from the master programs sample, machine learning is observed to prevail, followed in order by: computer vision; human-computer interaction; knowledge representation and reasoning; natural language processing; planning, search and optimisation; and robotics and intelligent automation. A significant number of master programs analysed are significantly focused on machine learning topics, despite being initially classified in another domain. It is noteworthy that machine learning topics, along with selected topics on knowledge representation, depict a high degree of commonality in AI and data science programs. Finally, the competence-based analysis of the sample master programs’ learning outcomes, based on Bloom’s cognitive levels, outputs that understanding and creating cognitive levels are dominant.

Besides, analysing and evaluating are the most scarce cognitive levels. Another relevant outcome is that master programs on AI under the disciplinary lenses of engineering studies show a notable scarcity of competences related with informatics or computing, which are fundamental to AI….(More)”.

Alliance formed to create new professional standards for data science

Press Release: “A new alliance has been formed to create industry-wide professional standards for data science. ‘The Alliance for Data Science Professionals’ is defining the standards needed to ensure an ethical and well-governed approach so the public, organisations and governments can have confidence in how their data is being used. 

While the skills of data scientists are increasingly in demand, there is currently no professional framework for those working in the field. These new industry-wide standards, which will be finalised by the autumn, look to address current issues, such as data breaches, the misuse of data in modelling and bias in artificial intelligence. They can give people confidence that their data is being used ethically, stored safely and analysed robustly. 

The Alliance members, who initially convened in July 2020, are the Royal Statistical Society, BCS, The Chartered Institute for IT, the Operational Research Society, the Institute of Mathematics and its Applications, the Alan Turing Institute and the National Physical Laboratory (NPL). They are supported by the Royal Academy of Engineering and the Royal Society.  

Since convening, the Alliance has worked with volunteers and stakeholders to develop draft standards for individuals, standards for universities seeking accreditation of their courses and a certification process that will enable both individuals and education providers to gain recognition based on skills and knowledge within data science.  

Governed by a memorandum of understanding, the Alliance is committed to:  

  • Defining the standards of professional competence and behaviour expected of people who work with data which impacts life and livelihoods. These include data scientists, data engineers, data analysts and data stewards.  
  • Using an open-source process to maintain and update the standards. 
  • Delivering these standards as data science certifications offered by the Alliance members to their professional members, with processes to hold certified members accountable for their professional status in this area. 
  • Using these standards as criteria for Alliance members to accredit data science degrees, and data science modules of associated degrees, as contributing to certification. 
  • Creating a single searchable public register of certified data science professionals….(More)”.

From open policy-making to crowd-sourcing: illustrative forms of open government in education

Policy Brief by Muriel Poisson: “As part of its research project on ‘Open government (OG) in education: Learning from experience’, the UNESCO International Institute for Educational Planning (IIEP) has prepared five thematic briefs illustrating various forms of OG as applied to the education field: open government, open budgeting, open contracting, open policy-making and crowd-sourcing, and social auditing. This brief deals specifically with open policy-making and crowd-sourcing….(More)”.

We need to regulate mind-reading tech before it exists

Abel Wajnerman Paz at Rest of the World: “‘Neurotechnology’ is an umbrella term for any technology that can read and transcribe mental states by decoding and modulating neural activity. This includes technologies like closed-loop deep brain stimulation that can both detect neural activity related to people’s moods and can suppress undesirable symptoms, like depression, through electrical stimulation.

Despite their evident usefulness in education, entertainment, work, and the military, neurotechnologies are largely unregulated. Now, as Chile redrafts its constitution — disassociating it from the Pinochet surveillance regime — legislators are using the opportunity to address the need for closer protection of people’s rights from the unknown threats posed by neurotechnology. 

Although the technology is new, the challenge isn’t. Decades ago, similar international legislation was passed following the development of genetic technologies that made possible the collection and application of genetic data and the manipulation of the human genome. These included the Universal Declaration on the Human Genome and Human Rights in 1997 and the International Declaration on Human Genetic Data in 2003. The difference is that, this time, Chile is a leading light in the drafting of neuro-rights legislation.

In Chile, two bills — a constitutional reform bill, which is awaiting approval by the Chamber of Deputies, and a bill on neuro-protection — will establish neuro-rights for Chileans. These include the rights to personal identity, free will, mental privacy, equal access to cognitive enhancement technologies, and protection against algorithmic bias….(More)”.

COVID data is complex and changeable – expecting the public to heed it as restrictions ease is optimistic

Manuel León Urrutia at The Conversation: “I find it tempting to celebrate the public’s expanding access to data and familiarity with terms like “flattening the curve”. After all, a better informed society is a successful society, and the provision of data-driven information to the public seems to contribute to the notion that together we can beat COVID.

But increased data visibility shouldn’t necessarily be interpreted as increased data literacy. For example, at the start of the pandemic it was found that the portrayal of COVID deaths in logarithmic graphs confused the public. Logarithmic graphs control for data that’s growing exponentially by using a scale which increases by a factor of ten on the y, or vertical axis. This led some people to radically underestimate the dramatic rise in COVID cases.

Figure: Two graphs comparing linear with logarithmic curves. A logarithmic graph (on the right) flattens exponential curves, which can confuse the public. Source: LSE.
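A short worked example may make the flattening effect concrete. The case counts below are hypothetical (a series that doubles every week), chosen only to show how the same data advances by ever larger steps on a linear axis but by equal steps on a base-10 logarithmic axis.

```python
import math

# Hypothetical weekly case counts that double each week.
cases = [100 * 2 ** week for week in range(6)]

# On a linear axis, the gap between consecutive points keeps growing.
linear_steps = [later - earlier for earlier, later in zip(cases, cases[1:])]

# On a log10 axis, each point sits at log10(count), so doubling adds a constant amount.
log_positions = [math.log10(count) for count in cases]
log_steps = [later - earlier for earlier, later in zip(log_positions, log_positions[1:])]

print(cases)
print(linear_steps)
print([round(step, 3) for step in log_steps])
```

Because every doubling adds the same distance, log10(2), on the logarithmic axis, exponential growth appears there as a straight line. A reader who expects a steep curve can misread that line as slow growth.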

The vast amount of data we now have available doesn’t even guarantee consensus. In fact, instead of solving the problem, this data deluge can contribute to the polarisation of public discourse. One study recently found that COVID sceptics use orthodox data presentation techniques to spread their controversial views, revealing how more data doesn’t necessarily result in better understanding. Though data is supposed to be objective and empirical, it has assumed a political, subjective hue during the pandemic….

This is where educators come in. The pandemic has only strengthened the case presented by academics for data literacy to be included in the curriculum at all educational levels, including primary. This could help citizens navigate our data-driven world, protecting them from harmful misinformation and journalistic malpractice.

Data literacy does in fact already feature in many higher education roadmaps in the UK, though I’d argue it’s a skill the entire population should be equipped with from an early age. Misconceptions about vaccine efficacy and the severity of the coronavirus are often based on poorly presented, false or misinterpreted data. The “fake news” these misconceptions generate would spread less ferociously in a world of data literate citizens.

To tackle misinformation derived from the current data deluge, the European Commission has funded projects such as MediaFutures and YourDataStories….(More)”.

Enhancing teacher deployment in Sierra Leone: Using spatial analysis to address disparity

Blog by Paul Atherton and Alasdair Mackintosh: “Sierra Leone has made significant progress towards educational targets in recent years, but is still struggling to ensure equitable access to quality teachers for all its learners. The government is exploring innovative solutions to tackle this problem. In support of this, Fab Inc. has brought their expertise in data science and education systems, merging the two to use spatial analysis to unpack and explore this challenge….

Figure 1: Pupil-teacher ratio for primary education by district (left); and within Kailahun district, Sierra Leone, by chiefdom (right), 2020.


Source: Mackintosh, A., A. Ramirez, P. Atherton, V. Collis, M. Mason-Sesay, & C. Bart-Williams. 2019. Education Workforce Spatial Analysis in Sierra Leone. Research and Policy Paper. Education Workforce Initiative. The Education Commission.

…Spatial analysis, also referred to as geospatial analysis, is a set of techniques to explain patterns and behaviours in terms of geography and locations. It uses geographical features, such as distances, travel times and school neighbourhoods, to identify relationships and patterns.

Our team, using its expertise in both data science and education systems, examined issues linked to remoteness to produce a clearer picture of Sierra Leone’s teacher shortage. To see how the current education workforce was distributed across the country, and how well it served local populations, we drew on geo-processed population data from the Grid-3 initiative and the Government of Sierra Leone’s Education Data Hub. The project benefited from close collaboration with the Ministry and Teaching Service Commission (TSC).

Our analysis focused on teacher development, training and the deployment of new teachers across regions, drawing on exam data. Surveys of teacher training colleges (TTCs) were conducted to assess how many future teachers will need to be trained to make up for shortages. Gender and subject speciality were analysed to better address local imbalances. The team developed a matching algorithm for teacher deployment, to illustrate how schools’ needs, including aspects of qualifications and subject specialisms, can be matched to teachers’ preferences, including aspects of language and family connections, to improve allocation of both current and future teachers….(More)”
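The blog does not publish the matching algorithm itself, so the following is only a minimal sketch of the general idea, under assumptions of our own: a hypothetical scoring function that rewards a subject match (the school's need) and language and home-district matches (the teacher's preferences), followed by a simple greedy assignment. All field names, weights and records are invented for illustration.

```python
def match_score(teacher, school):
    """Hypothetical score combining a school's needs with a teacher's preferences."""
    score = 0
    if teacher["subject"] == school["subject_needed"]:
        score += 2  # school's need: subject specialism (weight is an assumption)
    if school["language"] in teacher["languages"]:
        score += 1  # teacher's preference: speaks the local language
    if teacher["home_district"] == school["district"]:
        score += 1  # teacher's preference: family connections nearby
    return score

def greedy_match(teachers, schools):
    """Assign each teacher to at most one school, taking the best-scoring pairs first."""
    pairs = sorted(
        ((match_score(t, s), t["name"], s["name"]) for t in teachers for s in schools),
        key=lambda pair: -pair[0],
    )
    assignment, placed_teachers, filled_schools = {}, set(), set()
    for score, teacher_name, school_name in pairs:
        if teacher_name not in placed_teachers and school_name not in filled_schools:
            assignment[teacher_name] = school_name
            placed_teachers.add(teacher_name)
            filled_schools.add(school_name)
    return assignment

# Hypothetical records; names, languages and districts are placeholders.
teachers = [
    {"name": "Teacher 1", "subject": "maths", "languages": {"lang_a"}, "home_district": "district_1"},
    {"name": "Teacher 2", "subject": "science", "languages": {"lang_b"}, "home_district": "district_2"},
]
schools = [
    {"name": "School X", "subject_needed": "science", "language": "lang_b", "district": "district_2"},
    {"name": "School Y", "subject_needed": "maths", "language": "lang_a", "district": "district_1"},
]
assignment = greedy_match(teachers, schools)
print(assignment)
```

A greedy pass is easy to follow but is not guaranteed to find the best overall allocation. A production system would more plausibly use an optimal assignment or stable-matching method, and the weights would need to be agreed with the bodies responsible for deployment.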

Introduction to Modern Statistics

Free-to-download book by Mine Cetinkaya-Rundel and Johanna Hardin: “…a re-imagining of a previous title, Introduction to Statistics with Randomization and Simulation. The new book puts a heavy emphasis on exploratory data analysis (specifically exploring multivariate relationships using visualization, summarization, and descriptive models) and provides a thorough discussion of simulation-based inference using randomization and bootstrapping, followed by a presentation of the related Central Limit Theorem based approaches. Other highlights include:

Web native book. The online book is available in HTML, which offers easy navigation and searchability in the browser. The book is built with the bookdown package and the source code to reproduce the book can be found on GitHub. Along with the bookdown site, this book is also available as a PDF and in paperback. Read the book online here.

Tutorials. While the main text of the book is agnostic to statistical software and computing language, each part features 4-8 interactive R tutorials (for a total of 32 tutorials) that walk you through the implementation of the part content in R with the tidyverse for data wrangling and visualisation and the tidyverse-friendly infer package for inference. The self-paced and interactive R tutorials were developed using the learnr R package, and only an internet browser is needed to complete them. Browse the tutorials here.

Labs. Each part also features 1-2 R based labs. The labs consist of data analysis case studies and they also make heavy use of the tidyverse and infer packages. View the labs here.

Datasets. Datasets used in the book are marked with a link to where you can find the raw data. The majority of these point to the openintro package. You can install the openintro package from CRAN or get the development version on GitHub. Find out more about the package here….(More)”.
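For readers unfamiliar with the simulation-based inference the book emphasises, here is a minimal sketch of a percentile bootstrap confidence interval. The book's own tutorials use R with the infer package. This sketch uses only the Python standard library instead, and the sample values, the function name `bootstrap_ci` and its defaults are assumptions for illustration.

```python
import random
import statistics

def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap: resample with replacement, recompute the statistic,
    and read the interval off the sorted resampled estimates."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = sorted(
        stat(rng.choices(sample, k=len(sample))) for _ in range(n_resamples)
    )
    lower_index = int((1 - level) / 2 * n_resamples)
    upper_index = int((1 + level) / 2 * n_resamples) - 1
    return estimates[lower_index], estimates[upper_index]

# Hypothetical observations.
sample = [12, 15, 9, 20, 14, 11, 17, 13, 16, 10]
low, high = bootstrap_ci(sample)
print(f"sample mean = {statistics.mean(sample):.2f}, 95% interval = ({low:.2f}, {high:.2f})")
```

The interval comes entirely from resampling the observed data, with no appeal to a theoretical sampling distribution. The book presents this kind of approach before the Central Limit Theorem based methods.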

Measuring What Matters for Child Well-being and Policies

Blog by Olivier Thévenon at the OECD: “Childhood is a critical period in which individuals develop many of the skills and abilities needed to thrive later in life. Promoting child well-being is not only an important end in itself, but is also essential for safeguarding the prosperity and sustainability of future generations. As the COVID-19 pandemic exacerbates existing challenges—and introduces new ones—for children’s material, physical, socio-emotional and cognitive development, improving child well-being should be a focal point of the recovery agenda.

To design effective child well-being policies, policy-makers need comprehensive and timely data that capture what is going on in children’s lives. Our new report, Measuring What Matters for Child Well-being and Policies, aims to move the child data agenda forward by laying the groundwork for better statistical infrastructures that will ultimately inform policy development. We identify key data gaps and outline a new aspirational measurement framework, pinpointing the aspects of children’s lives that should be assessed to monitor their well-being….(More)”.

The Returns to Public Library Investment

Working Paper by the Federal Reserve Bank of Chicago: “Local governments spend over 12 billion dollars annually funding the operation of 15,000 public libraries in the United States. This funding supports widespread library use: more than 50% of Americans visit public libraries each year. But despite extensive public investment in libraries, surprisingly little research quantifies the effects of public libraries on communities and children. We use data on the near-universe of U.S. public libraries to study the effects of capital spending shocks on library resources, patron usage, student achievement, and local housing prices. We use a dynamic difference-in-difference approach to show that library capital investment increases children’s attendance at library events by 18%, children’s checkouts of items by 21%, and total library visits by 21%. Increases in library use translate into improved children’s test scores in nearby school districts: a $1,000 or greater per-student capital investment in local public libraries increases reading test scores by 0.02 standard deviations and has no effects on math test scores. Housing prices do not change after a sharp increase in public library capital investment, suggesting that residents internalize the increased cost and improved quality of their public libraries….(More)”.
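The difference-in-difference logic behind these estimates can be sketched in its simplest two-group, two-period form. The paper's dynamic version instead estimates effects period by period around the investment, so this is a simplification. The visit counts below are hypothetical and unrelated to the paper's data or results.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the control group.
    The control group's change stands in for what would have happened
    to the treated group without the investment (parallel-trends assumption)."""
    return (treated_after - treated_before) - (control_after - control_before)

# Hypothetical average annual visits (thousands) per library.
effect = diff_in_diff(
    treated_before=200.0, treated_after=260.0,   # libraries with a capital investment
    control_before=180.0, control_after=210.0,   # comparable libraries without one
)
print(effect)  # 60 - 30 = 30.0
```

Subtracting the control group's change removes trends that affect all libraries, so the remainder is attributed to the investment, provided the two groups would otherwise have moved in parallel.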