The Pitfalls of AI Bias in Healthcare

AI algorithms in healthcare often pick up unwanted biases inadvertently, leading to incorrect diagnoses and recommendations. Explainable AI could be key to resolving the problem.


The advent of artificial intelligence (AI) has made it possible to use aggregated healthcare data to build complex models that automate diagnosis. This supports practitioners by helping them tailor treatments and focus on patients more effectively. The market for AI in healthcare is expected to reach $36.15 billion by 2025, with a growth rate of 50.2%. Hospitals are being built on the premise that AI systems are the future, and start-ups using AI for healthcare have raised more than $5 billion in venture capital.

Most improvements in AI systems stem from advances in machine learning and deep learning. In contrast to traditional systems, where rules are crafted manually, AI systems are developed from past data and examples. For instance, engineers created an algorithm to help diagnose breast cancer and fed it mammograms so that it could learn to identify cancer cells. The algorithm discovered common patterns across more than 60,000 patients that distinguish cancerous cells from non-cancerous ones, with an accuracy of 31% compared to 18% without it, and cancer risk could be predicted up to 5 years in advance. AI systems have also performed better than humans in areas such as radiology and medical imaging.

However, one of the fundamental problems of learning algorithms is that they tend to adopt unwanted biases from the data used for training. In healthcare, this can lead to improper diagnosis and care recommendations. Recently, several stories of algorithmic bias have emerged in which AI tends to favor certain groups based on gender, age, or race. In medicine, algorithmic bias could prove fatal.

Biased data sets for AI algorithms:

Data plays a major role in how biases creep into the process. For instance, a Canadian company developed an algorithm for auditory tests for neurological diseases. It records the way a person speaks and analyzes the data to detect early-stage Alzheimer’s disease. The test had >90% accuracy; however, the training data consisted of samples from native English speakers only. Hence, when non-native English speakers took the test, it would identify pauses and mispronunciations as indicators of the disease.

In a similar vein, an algorithm was developed to diagnose malignant skin lesions from images. People with white skin are more likely to suffer from skin cancer; hence, more data was available for that skin type. The system was 11%-19% more accurate in diagnosing people with light skin, but 34% less accurate on darker-skinned individuals. An algorithm trained on a dataset containing only a very small sample of a certain patient type is not suitable for clinical use in a hospital whose patient population spans a wide variety of skin types.
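An accuracy gap like the one described for skin types only becomes visible when results are broken down by patient group instead of being reported as a single overall figure. The following is a minimal sketch of such a breakdown, not the method used in the study; the function name, the group labels, and the tiny dataset are made up for illustration.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for predictions split by a patient attribute."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


# Made-up labels for illustration only (1 = malignant, 0 = benign).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1, 1, 1]
skin_tone = ["light"] * 6 + ["dark"] * 4  # hypothetical group labels

per_group = accuracy_by_group(y_true, y_pred, skin_tone)
gap = max(per_group.values()) - min(per_group.values())

print(per_group)  # {'light': 1.0, 'dark': 0.25}
print(gap)        # 0.75
```

In this toy data the overall accuracy is 70%, which hides the fact that the model is right only a quarter of the time for the smaller group.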

The above examples highlight biases that arise when the training data misrepresents the patient population. Hence, when designing an algorithm, it is paramount to consider all known factors related to the patient population.
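One simple way to catch this kind of misrepresentation before training is to compare each group's share of the training data with its share of the population the model is meant to serve. The sketch below assumes the population shares are known; the function name, the 5-point tolerance, and all numbers are hypothetical.

```python
from collections import Counter


def representation_gaps(train_groups, population_share, tolerance=0.05):
    """Flag groups whose share of the training data falls short of their
    share of the target patient population by more than `tolerance`."""
    counts = Counter(train_groups)
    n = len(train_groups)
    flagged = {}
    for group, expected in population_share.items():
        observed = counts.get(group, 0) / n
        if expected - observed > tolerance:
            flagged[group] = (observed, expected)
    return flagged


# Hypothetical training set and hospital population, for illustration only.
train_groups = ["light"] * 90 + ["dark"] * 10
population_share = {"light": 0.6, "dark": 0.4}

flagged = representation_gaps(train_groups, population_share)
print(flagged)  # {'dark': (0.1, 0.4)}
```

A flagged group is a prompt to collect more data or to restrict where the model is deployed; the check itself says nothing about how well the model performs for that group.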

Lack of Transparency

While most algorithmic bias studies focus on known factors, i.e. gender, race, age, etc., recent research shows that algorithms can pick up hidden biases that are difficult to identify. Developers therefore have very little visibility into these issues, which makes finding and fixing them arduous.

In some cases, biases arise from correlations between the factors considered in the training data. Take an algorithm designed to decide on the hospital stay/admission of a person. The model assigned low risk scores to asthmatic patients and high scores to patients with pneumonia, so it would recommend admission for less critical pneumonia patients rather than for highly critical asthma cases. Hence, the output depended strongly on certain factors while failing to consider critical ones.

Can AI biases be fixed?

Whether through failing to include the right subjects in datasets or choosing the wrong factors for training, there are many situations where crucial decisions can be influenced by bias. Bias is evidently a challenge, but not one that is impossible to overcome.

To address biases, the first step is to understand and accept that the data could be a problem. The second is to determine how and why algorithms are biased toward certain decisions. Machine learning models are often black boxes that offer very little visibility into their inner workings. This is where explainability comes in: it provides justification for the results in an interpretable way. An explainable and interpretable AI would allow developers, business users, and end users to understand why certain decisions are made. Subsequently, the necessary course correction can be initiated.
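As one concrete illustration of explainability, permutation feature importance treats the model as a black box: shuffle one input feature, measure how much accuracy drops, and repeat for each feature. This is only one of several explainability techniques and is not one the article specifically names. The sketch below is a hypothetical example loosely modeled on the speech test described earlier; the model, the feature names, and the data are all assumptions made for illustration.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_features, repeats=20, seed=0):
    """Average drop in accuracy when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / repeats)
    return importances


def black_box(row):
    """Hypothetical model: flags a patient based on speech pauses alone."""
    pause_rate, age_decade = row
    return int(pause_rate > 0.5)


# Made-up rows of (pause_rate, age_decade); labels are invented for the demo.
rows = [(0.1, 6), (0.2, 7), (0.8, 6), (0.9, 8),
        (0.3, 7), (0.7, 5), (0.6, 8), (0.4, 6)]
labels = [0, 0, 1, 1, 0, 1, 1, 0]

importances = permutation_importance(black_box, rows, labels, n_features=2)
for name, score in zip(["pause_rate", "age_decade"], importances):
    print(f"{name}: {score:.3f}")
```

Here the score for pause_rate is positive while the score for age_decade is zero, showing that the decision rests entirely on speech pauses. A developer who sees that can ask whether pauses mean the same thing for every group of patients, which is the kind of course correction explainability is meant to enable.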

Explainable AI will help build trust in AI algorithms, which is critical in the healthcare system. It supports fairness, inclusivity, reliability, and transparency in design, while maintaining privacy and driving accountability. This involves monitoring data for inequalities, as well as talking to healthcare providers and patients to find out whether they see any fairness concerns. This could enable organizations not only to identify potential discrimination, but also to rectify algorithms beforehand.

Current AI models are limited to offering predictions. As they improve, models could evolve to pair their outcomes with explanations and medical reasoning, which would lead to fairer decision-making, increase adoption, and build trust in AI systems in healthcare.