Machine Learning and a Slippery Prognosis

A Google subsidiary invites data scientists to crowd-solve big problems, like predicting lung function decline in patients with pulmonary fibrosis.


More than anything else, a patient who receives a serious diagnosis wants to know what to expect.

What will happen next? And if the disease worsens over time, how long before my health problems are severe?

Those questions have been difficult to answer for someone who has pulmonary fibrosis, a type of interstitial lung disease that scars the lungs and impairs breathing. The cause is unknown and a patient can deteriorate rapidly or enjoy a long period of stability. Doctors don’t understand why similar-seeming patients experience different outcomes.

To the Open Source Imaging Consortium (OSIC), it sounded like a perfect problem for Kaggle. OSIC, a not-for-profit, co-operative effort between academia, industry and philanthropy, enables rapid advances in the fight against idiopathic pulmonary fibrosis.

Kaggle, a global online competition platform and a subsidiary of Google, calls itself the world's largest online community of data scientists and machine learning practitioners. The platform lets users publish data and create data science challenges. Give Kaggle's users the data, and they will analyze it for hidden patterns and try to develop predictive algorithms.

So OSIC launched a Kaggle challenge and, using a Creative Commons license model, supplied anonymized high-resolution CT scans paired with clinical data about a large group of patients. CSL Behring is among the sponsors of the challenge, which includes thousands of high-resolution images of patients’ lungs that can be analyzed for minute data points. Which similarities hint at a more positive outcome and which are associated with a swift decline? That’s what the data scientists will try to figure out.

And as with “open source” software, the data scientists competing in a Kaggle challenge share their findings and analyses to speed progress. The pulmonary fibrosis Kaggle challenge will award prizes for first, second and third place. Top prize is $30,000, second prize is $15,000 and third prize is $10,000.

“What we love about it is that there will be people competing who are experts in self-driving cars and facial recognition who are going to look at this problem very differently than it’s been looked at before,” said Elizabeth Estes, OSIC’s Executive Director.

CSL Behring, which develops medicines for rare and serious diseases, partnered with OSIC to support cutting-edge research into interstitial lung diseases. The challenge runs through October 6.

“We’re proud to join the OSIC collaboration, a unique, industry-leading initiative,” said CSL Behring’s Lars Groenke, Global Clinical Development Lead for Respiratory. “The partnership is a positive step towards achieving earlier and more accurate diagnosis algorithms in clinical practice, and that will directly benefit patients and ultimately change lives.”

Google acquired Kaggle in 2017. The platform boasts more than 5 million registered users in 250 countries. Since Kaggle’s founding in 2010, “Kagglers” have wrestled with data to tackle health-related and non-health-related problems.

Through Kaggle, data scientists developed gesture technology for Microsoft’s Kinect and they’ve searched for answers in treating HIV. Right now, there’s also a Kaggle challenge to identify the birds chirping in many hours of recorded birdsong.

The Cornell Lab of Ornithology’s Center for Conservation Bioacoustics says a technical glitch has made it difficult to use machine learning to identify recorded birdsong. It’s something they hope the Kaggle team can unravel. Knowing where birds are can tell you a lot about the health of the environment – and often it’s easier to identify birds by sound rather than sight, according to the challenge summary. Top prize is $12,000.

“The eventual conservation outcomes could greatly improve the quality of life for many living organisms – birds and human beings included.”