Image courtesy of the Hong Kong University of Science and Technology, Hong Kong SAR
Panoramic View of Clear Water Bay from HKUST Conference Auditorium

Sixth International
Conference on Bioinformatics

HKUST, Hong Kong
Hanoi, Vietnam and
Nansha, PR China

An APBioNet Meeting

Endorsed by

ISCB International Society for Computational Biology



Supported by

Hong Kong University of Science and Technology

Hong Kong Event
Sponsored by

KC Wong Education Foundation

Hong Kong Research Grants Council RGC

Hong Kong Science and Technology Parks

Inforsense Inc


KOOPrime Consulting

ISCB International Society for Computational Biology

Hanoi Event
Sponsored by

United Nations Educational, Scientific and Cultural Organisation

IUBMB International Union of Biochemistry and Molecular Biology

FAOBMB Federation of Asian and Oceanian Biochemists and Molecular Biologists

Institute of Biotechnology IBT

Nansha Event
Sponsored by

HKUST Fok Ying Tung Graduate School

Publication Partners

BioMedCentral Bioinformatics

BMF Biomolecular Frontiers

Internet2 Partners

Asian Institute of Technology, Thailand

TransEurasia Information Network TEIN


27 - 31 August 2007

Keynote Speakers - Hong Kong
Professor Burkhard Rost (confirmed)
Department of Biochemistry and Molecular Biophysics
Columbia University
President, International Society for Computational Biology (ISCB)
Professor Minoru Kanehisa (confirmed)
Director, Bioinformatics Center,
Institute for Chemical Research, Kyoto University, Japan
Human Genome Center, Institute of Medical Science, University of Tokyo, Japan

TOPIC: Genome to Life and Environment
Since the completion of the Human Genome Project, high-throughput experimental projects have been initiated for uncovering genomic information in an extended sense, including transcriptome and proteome information, as well as metabolome, glycome, and other genome-encoded information. Together with traditional genome sequencing for an increasing number of organisms from bacteria to higher eukaryotes, we are beginning to understand the genomic space of possible genes and proteins that make up the biological system. In contrast, we have very limited knowledge about the chemical space of possible chemical substances that exists as an interface between the biological world and the natural world. Chemical genomics is an emerging discipline for systematic analysis of the chemical space. Experimentally, this is being achieved by high-throughput screening of large-scale chemical compound libraries with various biological assays at the molecular, cellular, and organism levels. In order to best utilize the bioassay data being generated, bioinformatics methods have to be developed to extract biological information encoded in the chemical structures, and to understand the information in the context of molecular interactions and reactions involving proteins and other biomolecules. This would eventually lead to a basic understanding of the chemical environment that interacts with and drives the evolution of the biological system.

KEGG is a database of biological systems, integrating molecular building block information (KEGG GENES and KEGG LIGAND) and higher-level functional information (KEGG PATHWAY and KEGG BRITE). KEGG provides a reference knowledge base for linking genome to life by the process of PATHWAY mapping, which is to map, for example, a genomic or transcriptomic content of genes to KEGG reference pathways to infer systemic behaviors of the cell or the organism. In addition, KEGG now provides a reference knowledge base for linking genome to environment, such as for the analysis of drug-target relationships, by the process of BRITE mapping. KEGG BRITE is a collection of hierarchically structured vocabularies representing functional hierarchies of various biological objects, including molecules, cells, organisms, diseases, and drugs, as well as relationships among them. I will discuss bioinformatics methods that we have developed for integrated analysis of genomic and chemical spaces. In particular, I will show how KEGG can be used to understand the chemical repertoire of endogenous molecules and also to extract reaction/interaction information from the small molecular structures.

Professor David Wishart (confirmed)
Departments of Computing Science and Biological Sciences
University of Alberta, Edmonton, Alberta, Canada

How Bioinformatics Helped Reveal the Human Metabolome
Metabolomics (or metabonomics as it is sometimes called) is a newly emerging field of omics research concerned with the high-throughput identification and quantification of the small molecule metabolites in the metabolome. The metabolome can be defined as the complete collection of all small molecule (<1500 Da) metabolites found in a specific cell, organ or organism. It is a close counterpart to the genome, the transcriptome and the proteome. While the technology to characterize the genome and the proteome has only been available for the past 20 years, the technology to characterize the metabolome has actually been around for much longer. In fact, for the past 100 years chemists, biochemists and clinical chemists have been "inadvertently" characterizing the human metabolome in their quest to characterize metabolic disorders and to identify useful biomarkers of disease. However, rather than depositing their results in a coordinated way into electronic repositories, this metabolomic data has been haphazardly appearing in journals, books and dissertations. As a result, most of the information about the human metabolome was essentially buried in dusty bookshelves.

In this presentation I will describe how we used a variety of custom bioinformatics tools (text mining, screen scraping, prediction and machine learning) in combination with "old-fashioned" library research to extract, assemble and annotate the human metabolome from existing data resources. With a first draft of the human metabolome now complete, we are now in the process of using advanced experimental methods to confirm or validate this draft data and to provide more complete metabolite annotations. We are also developing a number of databases and software tools to support the dissemination of information about the human metabolome and to facilitate metabolomics research in general. I will describe some of these tools in detail and provide examples of some of their applications to medical research and systems biology.

Professor Roderic Guigó (confirmed)
Head, Bioinformatics and Genomics Program
Center for Genomic Regulation
Professor, Universitat Pompeu Fabra.
Barcelona, Spain.
The complexity of the human genome transcriptional landscape
Transcribed regions have long been regarded as a distinguishing characteristic of functional portions of the human genome. As part of the Encyclopedia of DNA Elements (ENCODE) project, the sites of transcription in the non-repeat sequences across a representative 1% of the human genome have been determined in a large number of different cell line/tissue samples using high-throughput transcription interrogation techniques. In addition, a detailed annotation of the protein-coding content of the ENCODE regions has been obtained through a combination of computational, experimental and manual methods. Overall, at least 90% of the ENCODE regions appear to be transcribed as primary nuclear transcripts, and about 15% are present as mature processed polyadenylated transcripts. Interestingly, up to 30% of these sites of transcription have not been previously identified.

In addition, using a combination of 5' Rapid Amplification of cDNA Ends (RACE) and high-density resolution tiling arrays, we have systematically explored the transcriptional diversity of protein-coding loci. RACE allows detection of low copy number transcripts/isoforms and a high-resolution analysis of genes individually, while pooling strategies and array hybridization permit high-throughput readout. We identified previously unannotated and often tissue/cell line specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds, for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene. The 5'-most novel detected exons are significantly associated with independently derived evidence of transcription initiation. Notably, more than 50% of the novel transcripts resulting from inclusion of novel exons have changes in their open reading frames. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minor splice forms. These results might revise our current understanding of the architecture of protein-coding genes. They have significant implications for our views on locations of regulatory regions in the genome and for the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "non-coding", ultimately relating to the identification of disease-related sequence alterations.

Professor Terry Speed (confirmed)
Head, Bioinformatics Division,
The Walter and Eliza Hall Institute of Medical Research,
Parkville Victoria 3050 Australia.
Department of Statistics and Program in Biostatistics
University of California at Berkeley
Berkeley, CA, USA
Estimating chromosomal copy number
Mutations in the human genome range from single base-pair substitutions to changes in the numbers of whole chromosomes. There are several contexts in which people are interested in estimating chromosomal copy number from DNA samples. In pre-natal diagnoses, testing for trisomy 21 (Down's syndrome) is now commonplace, as is more general screening for chromosomal abnormalities. In tumors, selective amplification and deletion of specific chromosomal segments is frequent, with different types of cancer being associated with different patterns of gain and loss. Determining these patterns for particular tumors can be very relevant to the diagnosis, prognosis and treatment of the cancer. Recently it has been shown that there is a great deal of normal copy number variation in human populations, and interest in the association of such variation with different human diseases has increased. Cytogeneticists have long had a variety of methods for detecting such changes, including classic staining followed by microscopic visualization, fluorescence in situ hybridization (FISH), and comparative genomic hybridization (CGH). However, in the last few years much excitement has been generated in the use of high-density microarray technologies for this purpose, with the Affymetrix, Illumina and more recently NimbleGen platforms being at the forefront. As in all such endeavours, interest focusses on the accuracy and resolution of these technologies, and this is inextricably linked to the statistical/computational methods used to analyze the data they give. In this talk I will give a general introduction to the topic, a brief review of some analysis methods currently used, and finish by mentioning some challenges ahead.

Keynote Speaker - Nansha
Dr Emmanouil (Manolis) Dermitzakis (confirmed)
Population and Comparative Genomics
The Wellcome Trust Sanger Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge, UK
Causes of regulatory variation in the human genome
The recent comparative analysis of the human genome has revealed a large fraction of functionally constrained non-coding DNA in mammalian genomes. However, our understanding of the function of non-coding DNA is very limited. In this talk I will present recent analyses by my group and collaborators that aim to identify functionally variable regulatory regions in the human genome by correlating SNPs and copy number variants with gene expression data. I will also present some analysis on the inference of trans regulatory interactions and the evolutionary consequences of gene expression variation.

Conference Organisers

Professor Shoba Ranganathan
Chair Professor of Bioinformatics
Dept of Chemistry and Biomolecular Sciences &
Biotechnology Research Institute
Macquarie University, Australia
Assoc Professor Hannah Xue Hong
Department of Biochemistry
Director, HKUST Bioinformatics Centre
Hong Kong University of Science and Technology, Hong Kong SAR
Assoc Professor Tan Tin Wee
Deputy Head, Department of Biochemistry,
National University of Singapore

Conference Invited Speakers (tentative list below)

José R. Valverde, PhD MD MSci
Scientific Computing Service, CNB/CSIC
Campus Univ. Autonoma
Madrid, 28049. SPAIN

EMBnet, past, present and future
(Final Abstract soon)
At EMBnet we have recently been testing and setting up a number of advanced services and infrastructures, and we are now looking to collaborate with other groups and organizations, sharing our expertise and knowledge to build and extend cooperative projects. Among these, we would like to highlight at the conference:

  • e-Learning initiative: we are trying to nucleate an e-learning community willing to share knowledge, methods, materials and expertise, as well as to host introductory and advanced courses over the Net
  • e-Science initiative: we have participated from the outset in a number of Grid initiatives such as EMBRACE and EGEE, which have already set out to collaborate with Asian partners, and we believe it is now time to start deploying a true production infrastructure for Bioinformatics
  • Bioinformatics standards: we have started designing an open procedure that would give everybody the opportunity to participate on an equal basis in the development and definition of standards for Bioinformatics
Yike Guo
Founder and CEO, InforSense Ltd
Professor, Imperial College London
Shanghai Bioinformatics Institute

Bioinformatics Workflow Integration
Bioinformatics research, database management, data mining and other related activities are scaling up to levels that cannot be handled efficiently with manual techniques. Many organizations handling database curation and management are resorting to workflow integration, and many workflow integration systems are now available for bioinformatics applications. The experience of InforSense in delivering rapid workflow integration solutions for researchers in academia and industry is described. Use cases will be discussed along with the concepts of workflow integration.

About Speaker
Prof. Yike Guo founded InforSense in November 1999 to commercialize his group's pioneering Open Discovery Workflow technology for high-performance large-scale integrative data analysis, rapid application building and process knowledge management. He has led the company's growth since then. He is a world-leading expert in large-scale data mining and Grid computing and also serves as Technical Director of the Parallel Computing Center and Head of the Data Mining Group at Imperial College, University of London. Over the last four years he has led a number of significant academic and industrial research and development projects targeted at building next-generation e-Science platforms, for which he has gained UK and European funding in excess of £10 million. He holds a PhD in Computing Science from Imperial College.

Qiang Yang
Professor, Dept of Computer Science
Hong Kong University of Science and Technology

Data Mining for Bioinformatics: Some Challenging Problems
In this talk, I will give a selective overview of the intersection of data mining and bioinformatics. I will list some challenges for data mining and the corresponding problems in bioinformatics, point out where the difficulties lie, and survey some recent solutions. I hope to bring out where the hot topics are in a few selected areas, such as dimensionality reduction and feature selection.

Tutorial Speakers Tentative List below (To be finalised)

José R. Valverde, PhD MD MSci
Scientific Computing Service, CNB/CSIC
Campus Univ. Autonoma
Madrid, 28049. SPAIN

EGEE Grid Computing in the Life Sciences using gLite
A course on Grid computing using gLite (the EGEE middleware) with a short, half-day tutorial on-site integrated in an introductory e-learning course. Hands-on tutorial with the theoretical part imparted remotely through web-based e-learning will also be provided. All examples will be using biological and bioinformatics scenarios. See for tutorial materials.

Student Speakers

Kang Sungsoo
Daejon, Korea

The Student Council of the International Society for Computational Biology ISCB-SC: Regional Student Groups
This talk is open to all students and anyone interested in helping to form Regional Student Groups in bioinformatics and computational biology in your own country or region.

First Created: April 2007 Tan Tin Wee
Last Updated: 17, 5 Aug; 20 June; April 2007