Ana Conesa
Spanish National Research Council, Spain
Ana Conesa is a Computational Biologist, Research Professor at the Spanish National
Research Council (CSIC) and Courtesy Professor at the University of Florida. She is a
member of the Spanish Royal Academy of Engineers, Fellow and Vice-president of the
International Society for Computational Biology (ISCB), and President of the Spanish
Society of Bioinformatics and Computational Biology. She directs the CSIC network
for Computational Biology and the CSIC node of the EU infrastructure ELIXIR. She
is co-founder of Biobam Bioinformatics, a start-up that provides bioinformatics tools
for biologists. Additionally, Ana Conesa is a member of the CSIC Sustainability
Committee and of the green ISCB task force.
Ana Conesa’s lab is interested in understanding functional aspects of gene expression
at the genome-wide level and across different organisms. Her group has developed
over 20 statistical methods and software tools for transcriptomics analysis; she has
pioneered the development of methods for multi-omics integration and long-read
transcriptomics. A strong drive in her research is helping the genomics community to
bridge the gap between data and knowledge by creating bioinformatics tools that
everybody can use. Some of her group's popular software tools, including Blast2GO, PaintOmics,
maSigPro, NOISeq, Qualimap, SQANTI, and tappAS, have together received over 46,000
citations. She has led multiple EU projects to develop methods for the analysis of the
transcriptome, more recently with a focus on the utilization of long read sequencing to
characterize transcriptome complexity.
Title: Transitioning from short to long read transcriptomics: accuracy, bias and
analysis challenges.
Recent advances in long-read sequencing technologies like PacBio and Oxford Nanopore have
revolutionized the generation of full-length transcript sequences. These technologies facilitate a
deeper understanding of complex isoforms and transcript structures. As the precision and depth
of sequencing improve, long-read methods are becoming more prevalent in transcriptomics
studies for identifying differential gene expression and isoform utilization across various
conditions using multiple replicates. Concurrently, new algorithms for transcript reconstruction
and quantification have emerged, adapting to the influx of long-read data. With the field's shift
from short to long reads, there is an imperative to establish optimal data preprocessing,
experimental designs, quantification, and normalization strategies tailored to these data types.
Critical questions arise: What is the quality of my transcript identification and quantification calls
using long-read transcriptomics data? What is the best approach for constructing a long-read-
based quantification table? How many replicates are necessary? What is the ideal sequencing
depth? How can one identify and correct potential biases in transcript quantification? Which
data analysis strategies, if any, are different in long-read transcriptomics? Do these
considerations vary depending on the chosen sequencing technology or the algorithm used for
processing long reads?
I will present the efforts from my lab to evaluate the quality and utilization of long-read
transcriptomics data and discuss what challenges are still present to realize a complete shift
from short to long reads in transcriptomics studies. I will present the expanded suite of SQANTI tools,
designed to comprehensively address these challenges. For benchmarking, SQANTI-SIM
stands out as it simulates long-read and orthogonal data with precise control over transcript
novelty, enabling robust evaluations of both annotated and novel transcript detection. The
BUGSI framework offers a set of universal single-isoform genes, serving as internal standards
to identify RNA degradation and library preparation issues. SQANTI-reads provides a critical
evaluation of raw data quality in multi-sample experiments, identifying outliers and technological
biases while ensuring data quality standards for discovery are met. SQANTI3 evaluates
transcript reconstruction algorithms, aiding in the accurate identification of transcript models
from long-read data. Modules such as Filter, Rescue, and Requant refine transcript models,
enhancing transcriptome quality and precision in quantification.
Our research highlights distinct quantification biases in lrRNA-seq compared to short-read RNA-
seq, underscoring the need for specialized normalization approaches. I will also explore
alternative methods for defining joint transcriptomes in multi-sample experiments and their
implications for transcript detection. Finally, I will introduce IsoAnnot, now incorporated into the
SQANTI suite, which differentiates productive from unproductive transcripts and provides
functional labels to deepen our understanding of the biological roles of alternative splicing.
Kentaro Tomii
National Institute of Advanced Industrial Science and Technology (AIST), Japan
Title:
From Binding-Site Prediction to Inhibitor Design: PoSSuM, PoSSuMAF, and Co-folding
Abstract:
Advances in biomolecular structure determination have greatly increased structural data on protein-ligand complexes. In addition, co-folding methods now enable highly accurate modeling of these complexes. In this presentation, we will introduce PoSSuM (Pocket Similarity Search using Multiple-sketches), a database of similarity search results for known and putative ligand-binding sites, and PoSSuMAF, an expanded version of PoSSuM that incorporates AlphaFold-predicted structures of human proteins. We will also present our collaboration with experimental groups to develop potent inhibitors using co-folding methods.
