Ming “Tommy” Tang on Data Challenges in Immuno-oncology, the Role of the Cloud, and Growing a Computational Biology Team
A conversation with Tommy Tang
Tommy Tang is one of the most prominent online voices on computational biology. Tommy speaks to Data in Biotech on a range of topics, from his early career to growing a team, and from the challenges of bioinformatics data to the potential of the cloud, as we take a close look at the discipline of computational biology.
The Interview
Guest Profile
Tommy Tang's career began when he pursued his Ph.D. in genetics and genomics at the University of Florida. Initially trained in molecular biology in the wet lab, he was driven to explore computational biology after encountering the limitations of traditional analysis methods. Through self-study, Tommy developed skills that enabled him to analyze complex genomic data sets.
Following his Ph.D., Tommy joined MD Anderson Cancer Center and later moved to Harvard and the Dana-Farber Cancer Institute, where he worked on single-cell RNA sequencing. Currently, Tommy serves as the Director of Computational Biology at Immunitas Therapeutics, a single-cell genomics company focused on immuno-oncology.
The Highlights
Throughout the interview, Tommy discusses his career path, computational biology's role in immuno-oncology, the impact of data on research, and the importance of asking the right questions to drive progress. Let's delve into five of the highlights from our interview with Tommy:
The challenges of data in immuno-oncology: While the decreasing cost of sequencing has led to a surge in data availability, effectively utilizing and analyzing these data sets is still a problem. One of the major challenges in immuno-oncology is the quality and quantity of the data. High-quality data is essential for accurate analysis, but generating this data is not easy. Publicly available data can be useful but is typically messy and requires significant effort to clean and homogenize for analysis. Wet lab biologists can generate high-quality data, but getting the scale needed can be a challenge. Constructing a baseline dataset is the starting point for any computational biology practice.
Building the computational biology function: We asked Tommy to give us an overview of the process of coming into Immunitas and building the computational biology function from scratch. He emphasized the importance of not letting perfect be the enemy of the good. The first priority is getting a baseline in place that works and, from there, gradually enhancing the most critical elements. This is particularly true for small biotechs, where spending too much time on a “perfect” solution may mean that deadlines are missed. He also emphasized that although machine learning models and AI show promise, simple statistical tests and conventional methods can often provide valuable insights that should not be overlooked.
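To make the point about simple statistical tests concrete, here is a minimal sketch of a permutation test comparing the expression of one gene between two groups of samples. It is an illustration only, not a method Tommy described: the function name `permutation_test`, the group labels, and the expression values are all hypothetical, and the numbers are made up for the example.

```python
import random


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns the observed difference (mean of a minus mean of b) and an
    estimated p-value. All names here are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign samples to the two groups.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        # Small tolerance guards against floating-point rounding on ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    # Add-one correction keeps the estimate away from an impossible p = 0.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up, illustrative expression values (not real data).
responders = [5.1, 6.3, 5.8, 6.9, 6.1, 5.7]
non_responders = [3.2, 4.1, 3.8, 4.4, 3.5, 4.0]

observed_diff, p_value = permutation_test(responders, non_responders)
print(f"difference in means: {observed_diff:.2f}, p-value: {p_value:.4f}")
```

A test like this makes no distributional assumptions and takes a few lines of standard-library Python, which is the kind of baseline that can be put in place before reaching for a more complex model.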
Mutual learning process: By working closely with wet lab biologists, computational biologists gain a deeper understanding of experimental design and biological processes, enabling them to work more effectively. Likewise, wet lab biologists stand to gain useful insight when collaborating with computational biologists. For example, a computational biologist might flag a subset of cells that seems interesting, and a wet lab technician can add context that helps to explain it; the cause might be a purely technical issue, such as a change in temperature that has affected the results. Conversely, a wet lab technician may ask a question like, “Why is the gene expression level on the project I am working on so low when I expected it to be high?” and a computational biologist can dig into the possible causes. Interactions between the two disciplines make both groups perform better.
The benefits of using the cloud: One of the big benefits of the cloud, from Tommy’s perspective, is its ability to deal with scale. A data set can have millions of cells, which is a difficult volume of data to handle on a local computer. Cloud computing has played a vital role in advancing computational biology by providing scalable infrastructure and storage for the large-scale genomic data being analyzed. Tommy also explains how the cloud offers a cost-effective way to store data while automating repeatable processes, so that data is handled efficiently and is well organized for future use. He suggests a few areas where cloud services customized for biotech could eliminate a lot of the friction for computational biology teams.
Growing a team: We asked Tommy about his approach to growing a computational biology team, and he emphasized the importance of focusing on career development. He points to the saying, “If you want to go fast, go alone, but if you want to go far, go together.” His view is that you can learn much more as a team if you learn together. Therefore, there is a big focus within his team on knowledge sharing, with regular learning and collaboration sessions. Fostering a great learning environment benefits the team and the company as a whole.
Further Reading: As we discussed with Tommy on the podcast, one of his aims with his social media channels is to curate and create resources for those breaking into the industry, so rather than give our usual book or paper recommendation, here are links to Tommy’s GitHub repository and website as an invaluable tool for anyone on a computational biology journey.
Continuing the Conversation
One of the overriding threads of the conversation with Tommy was the value of creating a multi-disciplinary team that integrates a deep understanding of biology with a deep understanding of computational techniques like machine learning. He also emphasized how much progress can be achieved by encouraging team members to immerse themselves in fields that are complementary to their work but not necessarily central to their current roles. Finally, he discussed how critical it is to create an environment where team members share insights across projects so that the team can grow together and learn from each other’s challenges and solutions.
At CorrDyn, we have a similar ethos when it comes to building a data team that can interface with scientific domain experts in fields including biology, chemistry, physics, engineering, automation, operations, and logistics. Here are three tactics we view as essential to growing data teams that collaborate better with domain experts in biotech:
Read and listen to the language of your domain experts: Read what they are reading and writing. Discover how they think about the challenges they face and which methodologies they use to address those challenges. Where might their deep disciplinary perspective result in a potential blind spot? Read their manuals and blog posts. Attend their team meetings. Find friends who are willing to complain to you.
Learn what your domain experts care about most (and hate most about their jobs): Domain experts, like any stakeholder group, have unique goals that they want to achieve and unique ways of articulating those goals. They often enjoy the parts of their roles most closely tied to the domain from which they came (otherwise, they wouldn’t have become a domain expert). They also have aspects of their jobs that divert attention from those goals. Learn both sides of the equation.
Engage your domain experts using their language, on their terms: When discussing a problem space with domain experts, learn to use their language to discuss their goals and the challenges they face in achieving them. A successful data team learns how to translate the goals and challenges of domain experts into the systems and processes that empower those domain experts. When you enable them to focus on the goals they care about, and help them achieve those goals with less BS, domain experts will begin to share knowledge with your team, knowing that you can help. That trust creates a virtuous cycle that allows your data team to continue to deliver value.
As Tommy emphasized, collaboration allows much faster and greater progress. Accessibility attracts more talent and expands diversity. With a greater variety of perspectives and expertise, we are again able to move forward faster, which is critical in the fast-paced world of life sciences.