

Big Picture Thinking

Two MScBMI students organize unruly data for future use.

Written by Philip Baker

For their capstone project, two students teamed up to perform an analysis of Inflammatory Bowel Disease treatments and learned important lessons about working with healthcare data along the way.

Brought together by their common interests and capstone goals, Erika Lin and Trevor Peters, both recent graduates of the Master of Science in Biomedical Informatics (MScBMI) program, used clinical trial data in their capstone project to conduct a patient-level meta-analysis comparing two treatments for Inflammatory Bowel Disease (IBD). 

Because both had spent most of their careers in wet labs, the data-focused dry lab setting gave them a new experience as well as the challenge of putting their newly acquired coding skills to the test.

“That’s exactly what Trevor and I came to the MScBMI program to learn,” said Lin, who saw the MScBMI program as a great way to acquire valuable new skill sets before moving on to medical school. “We both have primarily biology backgrounds, so needing to use R to draw conclusions from the data forced us to get up to speed with it really fast.”

“It’s the best way to learn it,” added Peters, who had no prior experience with R—a statistical language and computational environment—before their capstone began. “In certain ways, it’s like learning a spoken language. Being immersed in an environment where you need to use it means you have no choice but to learn it.”

No data is perfect. It brought home for me the need to think about the big picture when developing a prospective clinical trial. It’s a matter of organizing the data so that it will be most useful for anyone looking to use and understand it down the line.

Erika Lin, MScBMI '19

Pivotal Expertise from a Medical Advisor

To assist them with the clinical side of their project, Lin and Peters were introduced to Atsushi Sakuraba, MD, PhD, a gastroenterologist and assistant professor at UChicago Medicine who specializes in the diagnosis and management of IBD. His expertise was pivotal as they moved forward on their capstone project, which involved assessing whether data from eight clinical trials showed better outcomes in reducing IBD-related inflammation with combination therapy or with monotherapy.

Specifically, their research centered on whether using a higher dosage of the immunomodulator Azathioprine in combination with Infliximab, a therapy commonly administered for the treatment of IBD, was more effective than using Infliximab on its own.

“In the beginning we were emailing Dr. Sakuraba constantly,” said Peters. “Despite his extremely busy schedule, he was great at getting back to us. Plus, his familiarity with similar projects that had reported conflicting results in the past meant he could steer us in the right direction and answer questions we had about the data.”

In certain ways, it’s like learning a spoken language. Being immersed in an environment where you need to use it means you have no choice but to learn it.

Trevor Peters, MScBMI '19

The Challenges and Rewards of Real-World Health Informatics Data

During the first quarter of the project, Lin and Peters undertook a literature review of IBD while focusing on the specific areas and therapies their project would cover. Vital during this time was the assistance they received from their science advisor, Matthew Dapas, PhD, who had experience working in the area and whose R coding expertise proved a particular boon.

“By the end of the first quarter we’d put together the proposal for our project,” Lin said. “We worked closely with Matt in laying out our game plan, and he was especially helpful when it came to preparing us for the programming challenges up ahead. Getting the R script up and running was one of the biggest challenges of the project for us.”

The data for the eight clinical trials was provided by the Yale Open Data Access (YODA) Project, a data storage platform that makes clinical data readily available to researchers and physicians for research purposes. To access the clinical trial data, they logged on remotely using UChicago computers at NBC Towers; the large screens there were essential for reviewing the many columns of data. A key hurdle during this time was missing data sets, which led them to reach out to the YODA Project and request that the missing sets be tracked down.

“But the missing data sets were only part of the challenge,” said Peters. “More significant was that the eight clinical trials had different objectives and they organized their data in very different ways, so putting together the data tables was hard work. It was a great lesson in what working with real data entails.”

For Lin, too, this was a surprise and also a key learning point that she took away from the project. In fact, in her present position at the Ann & Robert H. Lurie Children’s Hospital, she’s creating a database from the ground up and taking particular care that the variables she selects will be useful for anyone who uses the database in the future.

“That was really a major lesson,” she said. “No data is perfect. It brought home for me the need to think about the big picture when developing a prospective clinical trial. It’s a matter of organizing the data so that it will be most useful for anyone looking to use and understand it down the line.”

Drawing Conclusions and Looking Ahead

Having taken on the major challenge presented by the data during the second quarter, they were not entirely out of the woods once the third quarter began. As the YODA Project continued to track down and send them the missing data sets they had requested, incorporating the new data at this late stage often came with unsettling consequences for results they believed were certain.

“It actually became a little exhausting,” Peters said about the experience of writing and rewriting their capstone report to accommodate the new data. “But it was a great lesson, too. That’s what happens in data science work. No matter what you do, the variables and conclusions are liable to change at any time as new data comes in. It’s definitely part of the value of the capstone project that you come away with lessons like that.”

In the end, they concluded that, in combination therapy, a high dosage of Azathioprine was associated with more cases of remission than a low dosage. But these results may be due to a different reason than they had originally anticipated. Based on previous clinical trials investigating the efficacy of combination therapy, they expected to find that Azathioprine would increase blood levels of Infliximab. However, their results suggested that there may be other explanations, which led Lin and Peters to speculate about their findings as well as possible next steps.

“I believe we used the data we had to its full potential,” Lin said. “As for what the next step could be, ideally a prospective clinical trial could be developed using uniform data capture practices. But securing the necessary funding and time isn’t easy, so, in lieu of that path, tracking down the missing data sets could be another way to yield additional significant results.”


