This machine learning model may improve heart transplant outcomes

With thousands of Americans on the heart transplant waiting list, it's imperative that this vital organ not go to waste.

By Betsy Vereckey 

When it comes to matters of the heart, no one likes rejection. Heart transplant patients know this better than anyone.

One of the most significant risks of a heart transplant is organ rejection. This occurs when the recipient’s immune system reacts to the foreign antigens in the donor organ and begins to attack it by mistake. These rejections account for about 10% of all deaths within the first three years after a heart transplant, according to the National Center for Biotechnology Information, a division of the National Institutes of Health.

Given that around 3,500 people in the U.S. are currently on the waiting list for a heart transplant, there’s a pressing need to make sure precious hearts aren’t going to waste.

“This is a serious business, and the stakes are very high,” says Anant Madabhushi, a biomedical engineer who co-authored a research study that looked at whether artificial intelligence (AI) could be used to help improve heart transplant outcomes. “When you’ve got high stakes like this, you really want to have a whole suite and armory of quantitative decision-support tools that can provide the best possible decision.”

A less-than-perfect match

Today, when cardiac pathologists begin the process of assessing whether a donor heart will work in a recipient, they take a biopsy of the recipient’s heart (known as an endomyocardial biopsy). Then, they examine it under a microscope. Certain patterns—such as a large number of lymphocytes (a type of white blood cell that is important in the functioning of the immune system)—suggest harmful inflammation and can point to a higher likelihood of organ rejection.

“The strength of the immune response is an indication of the extent of injury and shows the extent to which a patient is susceptible to rejection,” says Madabhushi, who is also the Donnell Institute Professor of Biomedical Engineering at Case Western Reserve.

When looking at these biopsies, cardiac pathologists assign each biopsy a grade, from 0 to 4. (Zero indicates no chance of rejection, while 4 suggests a high likelihood of rejection.)

But there is one big problem with this approach: There is little consistency. One pathologist might give a biopsy a score of 0, whereas someone else might give it a higher rating.

When the researchers asked experienced pathologists to review and grade a collection of biopsies the team had gathered, the pathologists agreed on a score only 62.6% of the time, meaning that if you asked five pathologists about an image, roughly two of them would come to a different conclusion.
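To make the agreement figure concrete, here is a minimal Python sketch of one common way to measure it: simple pairwise percent agreement. The article does not say how the study computed its 62.6%, so the function name, the method and the grades below are illustrative assumptions, not the researchers' procedure or data.

```python
from itertools import combinations

def pairwise_agreement(grades_by_reader):
    """Share of (slide, reader-pair) comparisons in which both readers
    assigned the same grade. Each inner list holds one reader's grades,
    in the same slide order for every reader."""
    agree = 0
    total = 0
    for reader_a, reader_b in combinations(grades_by_reader, 2):
        for grade_a, grade_b in zip(reader_a, reader_b):
            total += 1
            if grade_a == grade_b:
                agree += 1
    return agree / total

# Hypothetical grades (0 to 4) from three readers on five slides.
readers = [
    [0, 1, 2, 0, 1],
    [0, 1, 1, 0, 2],
    [0, 2, 2, 0, 1],
]

rate = pairwise_agreement(readers)
print(f"Pairwise agreement: {rate:.1%}")
```

With these made-up grades, 9 of the 15 comparisons match, which is close to the level of agreement the article describes.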

“This is what made it even more baffling to some extent,” Madabhushi says. “We actually took people who had significant expertise in looking at endomyocardial biopsies, and the fact that there was only 62.6% agreement among expert cardiac pathologists was quite stunning. It was really a wake-up call that this is the extent of variability in the interpretation of these biopsies, and further reiterates the need for machine-based decision support to overcome this disagreement among human readers.”

Madabhushi notes that it’s difficult to say exactly how accurate the human readers were in terms of predicting heart transplant rejection, since they don’t directly predict outcomes, only grade the slides. But having no consensus is a problem.

“The challenge, unfortunately, with this grading criteria for cellular rejection is that it’s not very reproducible across different pathologists,” Madabhushi says. “That really is the fundamental issue.”

Improving transplant outcomes

Given that this poor agreement creates uncertainty, Madabhushi and his team began thinking about how technology could be used to help create better consensus. Was it possible that machine learning (ML) could see trends that humans couldn’t?

In 2018, the team began acquiring more than 2,000 biopsy slide images of heart transplant patients from three major U.S. transplant centers: the Hospital of the University of Pennsylvania, University Hospitals Cleveland Medical Center and the Ohio State University Wexner Medical Center. The biopsy images were stored on HIPAA-compliant servers with no patient-specific information attached.

Then, the team used ML to identify cardiomyocytes (the cells that make up the heart muscle) and lymphocytes in each image. The team wondered whether the distance between the two mattered in predicting transplant success, so they developed an algorithm that could pinpoint where each type of cell sat in the image. For some of the images, but not all, the team knew whether the patient eventually rejected the transplanted heart, so they used those images to work out how spatial arrangement played into rejection.
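As a rough illustration of what a spatial measurement could look like, the Python sketch below computes, for each lymphocyte, the distance to its nearest cardiomyocyte. The article does not describe the team's actual features, so this nearest-neighbor measure, the function name and the coordinates are all assumptions chosen for simplicity.

```python
import math
from statistics import mean

def nearest_cardiomyocyte_distances(lymphocytes, cardiomyocytes):
    """For each lymphocyte (x, y), return the Euclidean distance to the
    closest cardiomyocyte (x, y). Units follow the input coordinates."""
    return [
        min(math.dist(lymph, myo) for myo in cardiomyocytes)
        for lymph in lymphocytes
    ]

# Hypothetical cell centers, as if already detected in one image.
lymphocytes = [(0.0, 0.0), (3.0, 4.0)]
cardiomyocytes = [(0.0, 1.0), (6.0, 8.0)]

distances = nearest_cardiomyocyte_distances(lymphocytes, cardiomyocytes)
print("Nearest distances:", [round(d, 2) for d in distances])
print("Mean nearest distance:", round(mean(distances), 2))
```

A summary such as the mean nearest distance says something about arrangement that a plain cell count cannot: two images with the same number of lymphocytes can differ in how closely those cells sit to the heart muscle.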

“One of the things we know about pathology is that the number of cells is important, but perhaps even more important compared to the number of cells is architecture and the arrangement,” Madabhushi says. “Cellular architecture in some sense supersedes just the density of the cells. So rather than just using the machine to capture the number of immune cells, we decided to start to look at the spatial arrangement of the immune cells.”

AI to solve pathology disagreements

When the team combined their algorithm with the pathologists’ data, they found that agreement improved to about 66% (meaning that humans agreed with other humans 62% of the time, while humans agreed with the machine’s assessment 66% of the time). Madabhushi says that the increase “might not appear like much,” but in medicine, it is significant, especially when you think in terms of patients—having more reliable information will help improve health outcomes. Even if it only saves the lives of a few patients, it’s still worth it.

Madabhushi believes the ML model helped reach a better consensus because it provided new information: While cardiac pathologists typically look for the density of cells, the ML model uncovered the importance of spatial statistics.

“By invoking a separate set of features or patterns that the pathologists weren’t using, it actually had better agreement because all the pathologists were trying to estimate the density, and that’s where they were off,” says Madabhushi, who notes that machines are much better at capturing features like spatial arrangement than humans (who are better at counting and seeing density). “But then the machine comes along, and it’s using a completely different set of attributes, a different set of features, and I suspect that’s why the machine had better agreement with the pathologists.”

Having a machine provide a second opinion can help improve heart transplant outcomes because it provides more confidence—if a pathologist and the machine both suggest a higher ranking, then that provides greater confidence that someone is likely to experience rejection. Alternatively, if there is discordance, it might suggest the need for bringing in another independent reader to provide an additional opinion, a tie-breaker of sorts.
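The second-opinion idea can be sketched as a simple rule in Python. This is only an illustration of the logic described above: the function name, the labels and the cutoff separating lower from higher grades are hypothetical, and nothing here reflects a clinical protocol.

```python
def second_opinion(pathologist_grade, machine_grade, elevated_cutoff=2):
    """Compare two grades on the 0 to 4 scale.

    elevated_cutoff is a hypothetical threshold: grades at or above it
    count as 'elevated' for the purpose of this sketch."""
    pathologist_elevated = pathologist_grade >= elevated_cutoff
    machine_elevated = machine_grade >= elevated_cutoff

    if pathologist_elevated and machine_elevated:
        return "concordant: elevated"
    if not pathologist_elevated and not machine_elevated:
        return "concordant: low"
    # The two readings fall on opposite sides of the cutoff.
    return "discordant: bring in another reader"

print(second_opinion(3, 3))
print(second_opinion(0, 1))
print(second_opinion(1, 3))
```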

In the future, the team hopes to expand on the research and predict how patients will do over the long term, that is, not only whether they will accept or reject a donor heart but whether they will remain healthy and for how long.

“We think that in the short to medium term, this tool could serve as a machine-based support tool for cardiac pathologists, but beyond that, we want to really create a tool for cardiologists to be able to predict a clinical outcome in a way that is independent of rejection grading,” he says.

Patients can wait anywhere from days to months or even years for a donor heart. Given the scarcity of organs available, it’s important that a donor heart is matched with the person who is most likely to survive the transplant.

Says Madabhushi, “If a recipient is not going to benefit from a particular heart, maybe there’s somebody else down the line who might actually benefit.”