Workshop 2 – Introduction to Machine Learning in Bioinformatics

SKU: 1030
Price: 50.00 AUD
Availability: OutOfStock

12th November 2023, 1300 – 1700, Room 2003, TRI

Category: Workshop

Description

Abstract
The emerging field of Genome-NLP (Natural Language Processing) aims to analyse biological sequence data using machine learning (ML), offering significant advancements in data-driven diagnostics. Three key challenges exist in Genome-NLP. First, long biomolecular sequences require "tokenisation" into smaller subunits, which is non-trivial since many biological "words" remain unknown. Second, ML methods are highly nuanced, reducing interoperability and usability. Third, comparing models and reproducing results are difficult due to the large volume and poor quality of biological data. To tackle these challenges, we developed the first automated Genome-NLP workflow that integrates feature engineering and ML techniques. The workflow is designed to be species- and sequence-agnostic.

In this workflow:

(a) We introduce a new transformer-based model for genomes called genomicBERT, which empirically tokenises sequences while retaining biological context. This approach minimises manual preprocessing, reduces vocabulary sizes, and effectively handles out-of-vocabulary "words".
(b) We enable the comparison of ML model performance even in the absence of raw data.

To facilitate widespread adoption and collaboration, we have made genomicBERT available as part of the publicly accessible conda package genomeNLP. We have demonstrated genomeNLP on multiple case studies, showcasing its effectiveness in the field of Genome-NLP.
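To make the tokenisation challenge concrete, here is a minimal sketch of fixed-length k-mer tokenisation, a common baseline for splitting biological sequences into "words". It is not the genomicBERT tokeniser and does not use the genomeNLP package; the function names, the toy sequences, and the "[UNK]" placeholder are all assumptions made for illustration. It shows how a fixed vocabulary leaves unseen k-mers out-of-vocabulary, which is the problem empirical tokenisation aims to reduce.

```python
def kmer_tokenise(sequence, k=3):
    """Split a sequence into overlapping k-mers using a sliding window of stride 1."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_vocab(sequences, k=3):
    """Collect every k-mer seen in the training sequences (hypothetical helper)."""
    return {token for seq in sequences for token in kmer_tokenise(seq, k)}


def encode(sequence, vocab, k=3, unk="[UNK]"):
    """Replace any k-mer missing from the vocabulary with a placeholder token."""
    return [tok if tok in vocab else unk for tok in kmer_tokenise(sequence, k)]


# Toy data, invented for this example only.
training_sequences = ["ATGCGT", "ATGCCA"]
vocab = build_vocab(training_sequences)

tokens = kmer_tokenise("ATGCGT")
encoded = encode("ATGAAA", vocab)

print(tokens)   # k-mers of a training sequence
print(encoded)  # unseen k-mers are mapped to the placeholder
```

With k = 3 the vocabulary can hold at most 4^3 = 64 DNA k-mers, but it grows exponentially with k, and any k-mer absent from the training data is lost to the placeholder. Data-driven subword tokenisers learn variable-length units from the corpus instead, which is the general idea behind the empirical tokenisation mentioned above.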

Related products

Workshop 1 – Getting Tools into Galaxy

Workshop 5 – Unfolding the Future: Protein Modeling, Targeted Drug Design, and Mutation Insights

Workshop 4 – Quantum Machine Learning

Workshop 3 – Fundamentals Single-cell RNA-seq analysis