Trying to detect welfare fraud, where people claim benefits they are not entitled to, is not new. Nor is it unreasonable: taxpayers rightly want to know that their money is going to those who truly need it. But the authorities’ natural tendency to turn to computers to help spot fraud is fraught with problems. Michigan’s experience shows what can go wrong. From October 2013 to September 2015, some 34,000 individuals were wrongfully accused of unemployment fraud. The problems in Michigan were twofold. One was that key records may not have been imported into a new system properly, leading to erroneous decisions based on incomplete data. The other was a hesitancy by human overseers to question the results that the system’s algorithm generated.
More recently, this problem has become worse with the application of advanced artificial intelligence techniques to fraud detection. The obscure nature of the machine-learning algorithms deployed makes it hard for anyone to challenge the decisions. Even worse is a growing tendency for such automated decision-making systems to draw on highly personal data as a matter of routine. A good example of this is SyRI, which stands for “Systeem Risico Indicatie”, or System Risk Indicator. It was created by the Dutch Ministry of Social Affairs in 2014 to identify people who were considered to be at high risk of committing benefit fraud.
If a government agency suspected someone of social welfare fraud, SyRI could be used to analyze that person’s neighborhood in order to find “citizen profiles” of those who might be cheating the system. SyRI was brought in despite objections from the Dutch Data Protection Authority, and without any transparency for citizens about what happens with their personal data, which is drawn from multiple sources:
According to the official resolution that is the legal basis for SyRI, the system can cross-reference data about work, fines, penalties, taxes, properties, housing, education, retirement, debts, benefits, allowances, subsidies, permits and exemptions, and more. These are described so broadly that in 2014 the Council of State concluded in its negative opinion on SyRI that there is “hardly any personal data that cannot be processed”.
Despite this massive assault on privacy, the system has failed to deliver any of the claimed benefits. In its first five years, five municipalities requested neighborhood analyses. Only two were actually carried out, and even for these, no new cases of fraud were detected. In February 2020, a Dutch court in The Hague ordered the immediate halt of SyRI because it violates Article 8 of the European Convention on Human Rights (ECHR), which protects the right to respect for private and family life.
The high-profile case of SyRI is unusual because the project was halted. But as AI systems based on machine learning become cheaper and easier to apply, so they are being rolled out more widely for the purpose of algorithmic decision making, often without people being aware of the move. That makes it hard to oppose them. A new report from the Greenlining Institute looks at how decision-making algorithms, often with built-in biases, are starting to appear in healthcare, at the workplace, within government, in the housing market, in finance, education, and in the pricing of goods and services.
The report offers three ideas for improving the situation: algorithmic transparency and accountability; race-aware algorithms to counter racial bias; and what it calls “algorithmic greenlining”, that is, “using automated decision systems in ways that promote equity and help close the racial wealth gap.” It’s interesting to note that Amsterdam and Helsinki have already done work on transparency. They have created registers that track the AI decision-making systems used in their cities. Each entry includes an overview of the system, information on what data it uses, including personal data, the logic applied, and what human oversight there is. Here, for example, is the register’s description of Amsterdam’s algorithmic decision-making system for determining whether people are renting out their homes illegally:
The algorithm helps prioritize the reports so that the limited enforcement capacity can be used efficiently and effectively. By analyzing the data of related housing fraud cases of the past 5 years, it calculates the probability of an illegal holiday rental situation on the reported address.
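To make the idea of a register entry concrete, here is a minimal sketch in Python of how such a record might be represented. It is not the format Amsterdam or Helsinki actually use: the class name `RegisterEntry`, its field names, and the `missing_fields` helper are all hypothetical, chosen only to mirror the four kinds of information mentioned above (overview, data used, logic, human oversight). The example values come from the description quoted above; anything that passage does not state is left empty rather than guessed.

```python
from dataclasses import dataclass, asdict
from typing import Optional
import json


@dataclass
class RegisterEntry:
    """One system's entry in a hypothetical public algorithm register."""
    name: str
    overview: str                        # plain-language summary of what the system does
    datasets: list[str]                  # the data the system draws on
    uses_personal_data: Optional[bool]   # None means "not stated"
    logic: str                           # the overall logic applied
    human_oversight: str                 # how people review or override the output

    def missing_fields(self) -> list[str]:
        """Return the names of fields left empty, i.e. gaps in transparency."""
        return [
            key
            for key, value in asdict(self).items()
            if value is None or value == "" or value == []
        ]


# Example values taken from the description quoted above.
entry = RegisterEntry(
    name="Illegal holiday rental prioritization",  # hypothetical label
    overview="Helps prioritize reports so that limited enforcement "
             "capacity can be used efficiently and effectively.",
    datasets=["related housing fraud cases of the past 5 years"],
    uses_personal_data=None,   # not stated in the quoted description
    logic="Calculates the probability of an illegal holiday rental "
          "situation at the reported address.",
    human_oversight="",        # not stated in the quoted description
)

# Publish the entry as JSON and list what a reader still cannot find out.
print(json.dumps(asdict(entry), indent=2))
print("Undisclosed:", entry.missing_fields())
```

A structure like this makes gaps visible: an empty field is itself information about how transparent a system is.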
Another report, called “Poverty Lawgorithms”, offers “A Poverty Lawyer’s Guide to Fighting Automated Decision-Making Harms on Low-Income Communities”. One group planning to use what it calls “strategic litigation” to make sure that public impact algorithms are used fairly, and cause no harm, is the Open Knowledge Foundation:
We aim to make public impact algorithms more accountable by equipping legal professionals, including campaigners and activists, with the know-how and skills they need to challenge the effects of these technologies in their practice. We also work with those deploying public impact algorithms to raise awareness of the potential risks and build strategies for mitigating them.
One of the key problems with automated decision making is that it is opaque: the results are simply presented, with little or no explanation of how they were arrived at. That makes challenging them hard or even impossible. Since such systems now routinely draw on personal data, they represent secret assaults on people’s privacy. The only way to respond to that attack is to demand transparency, in at least two ways. First, it is vital for people to know what classes of personal data are being used for decision making. That should be simple enough to provide. Second, some details about the overall logic that lies behind the decision making need to be given. That’s more problematic, and part of a larger issue of how to open up what are too often inscrutable black boxes. Work on this is underway, but it urgently needs to be deepened and put on a firmer legal footing.
Feature image by Pete Linforth.