Background: The major bottleneck in genome sequencing is no longer data generation, but the computational challenges around data analysis, display and integration. New approaches and methods are, therefore, required to meet these challenges. Visual analytics is the representation and presentation of data that exploits human visual perception abilities in order to amplify cognition. Opportunities exist for African researchers to expand the use of visual discovery tools and curated datasets to enable visual discovery (exploration, mining and analysis via interactive visual interfaces) of bioinformatics results from high-quality genomics research.
Methods: We are developing a system of visual analytics resources based on molecular and clinical data, including the molecular consequences of single nucleotide variants, the RNA-seq expression levels of transcripts, and the functional sites in protein sequences.
Results: We have developed an initial set of visual analytics resources, using the major intrinsic protein family of water and glycerol transporters as the use case. Members of this protein family have been implicated in diverse cardiometabolic diseases. The computational resources developed can be adapted for gene lists, including those obtained from high-throughput assays. The long-term goal of the project is to empower researchers to make discoveries from large-scale molecular and clinical datasets to support decision-making on genetic and environmental determinants of cardiometabolic diseases in Africa.
Presenter: Oyekanmi Nash, PhD
Node Principal Investigator,
H3Africa Bioinformatics Network Node at National Biotechnology Development Agency (NABDA)
Abuja, Nigeria
Visual Analytical Screening System for Disease Linked Gene Variants
Map of Africa showing the distribution of nodes in the H3ABioNet network
H3Africa: Bioinformatics Network
• H3ABioNet: a sustainable African Bioinformatics Network for H3Africa
The network provides:
• computational infrastructure and hardware,
• human resources,
• tools and computational solutions for genomic and population-based research, and
• communications among African researchers and other interested parties.
These aims are achieved by:
• providing user support,
• training and capacity development,
• research and tools development, and
• outreach and communication.
ICCAC Country Representative: Prof. Oyekanmi Nash, Alternate Representative: Hadiza Rasheed-Jada
Reports directly to the DG/CEO, NABDA/FMST
ORGANIZATION OF THE HVP NIGERIA NODE II
The staff members of the Node include:
• Alternate Representative - Hadiza Rasheed-Jada
• Node Manager - Atinuke Hassan
• Systems Administrator - Adekunle Farouk
• Research Associates - Abimbola Kashim, Deborah Fasesan, Taoheed Abdulkareem, Ayodele Fakoya, Adijat Ozohu Jimoh
• Post-doctoral Researcher - Dr. Segun Fatumo
Institutional and researcher affiliations to the Node will drive its activities.
Background – Cardiometabolic Diseases
• Worldwide, cardiometabolic diseases are the major causes of:
  • Disability, rising healthcare costs and deaths
• Examples:
  • Type 2 diabetes, hypertension, dyslipidemia, coronary heart disease and chronic kidney disease
• Over the next 7 years:
  • Africa is projected to experience the largest increase in death rates from cardiovascular disease, cancer, respiratory disease and diabetes (Aikins et al., 2010)
Source: Global Health Estimates (GHE) 2013: Deaths by age, sex and cause
A Strategy in Africa to Address Burden of Cardiometabolic Diseases
• Genomic and Environmental Determinants (H3Africa Projects)
• H3Africa Kidney Disease Research Network
• Genomic and environmental risk factors for cardiometabolic disease in Africans
• Burden, spectrum and etiology of type 2 diabetes in sub-Saharan Africa
• …..
Examples of Projected Massive and Complex Datasets from H3Africa Projects (2013….
Type 2 Diabetes Project
• 12,000 Cases and 12,000 Controls
• Sequencing of known T2DM regions
• Genome-wide genotyping arrays
• Whole exome/genome sequencing
Body Composition Project
• African genome structure
• Phenotyping and sampling for Cohorts
• Genetic and environmental contribution to body composition (~12,000 individuals)
These research investigations rely significantly on bioinformatics analysis and inferences from large and heterogeneous datasets obtained from populations inside and outside Africa.
DATA SCIENCE
• Data Flow
• Data Curation
• Data Analysis
“The major bottleneck in genome sequencing is no longer data generation—the computational challenges around data analysis, display and integration are now rate limiting. New approaches and methods are required to meet these challenges”.
National Human Genome Research Institute Strategic Plan: Charting a course for genomic medicine from base pairs to bedside http://www.genome.gov/Pages/About/Planning/2011NHGRIStrategicPlan.pdf
Making Discoveries from the Massive and Complex Genomics Datasets and Bioinformatics Results
Use Case – Gene Families
AQUAPORIN – Water and glycerol transporter
• 13 Mammalian Aquaporins (AQP0-AQP12).
• Malfunction or absence linked to disease.
• Adipose AQP7 deficiency is associated with an increase of intracellular glycerol content.
• Up-regulation of AQP1 in the glomeruli of most diseased kidneys.
Reference: Hibuse et al. (2005). Aquaporin 7 deficiency is associated with development of obesity through activation of adipose glycerol kinase. Proc Natl Acad Sci U S A. 2005 Aug 2;102(31):10993-8. http://www.ncbi.nlm.nih.gov/pubmed/16009937
Visual Analytical System for Screening Disease Linked Gene Variants
Integrates data from ENSEMBL and the Database of Alternate Transcript Expression (DBATE)
Data Sources
Blending of Data Dimensions from multiple Data Sources
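Blending data dimensions from two sources can be pictured as a join on a shared key. The following is only a minimal sketch, assuming that variant records (as might be exported from ENSEMBL) and expression records (as might be exported from DBATE) share a transcript name. The field names, the `blend` helper, the `var_A` identifier, the AQP7-001 row and all expression values are hypothetical placeholders, not data or code from the presented system.

```python
from collections import defaultdict

# Hypothetical rows standing in for a variant export (placeholder data).
variant_rows = [
    {"variant": "rs199936776", "transcript": "AQP7-004"},
    {"variant": "var_A", "transcript": "AQP7-001"},
    {"variant": "var_A", "transcript": "AQP7-004"},
]

# Hypothetical rows standing in for a transcript expression export
# (expression values are invented for illustration only).
expression_rows = [
    {"transcript": "AQP7-001", "tissue": "adipose", "expression": 12.5},
    {"transcript": "AQP7-004", "tissue": "adipose", "expression": 3.0},
]


def blend(variants, expression, key="transcript"):
    """Inner-join two lists of dicts on a shared key."""
    by_key = defaultdict(list)
    for row in expression:
        by_key[row[key]].append(row)
    blended = []
    for v in variants:
        for e in by_key.get(v[key], []):
            # Merge the fields of both rows into one record.
            blended.append({**v, **e})
    return blended


blended_rows = blend(variant_rows, expression_rows)
```

Each blended record then carries both the variant and the expression dimensions, which is the shape an interactive visual interface would need in order to filter variants by transcript and tissue.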
Identifies Variants linked to Transcripts
Insights: rs199936776 is unique to AQP7-004 and could affect expression of the transcript or properties of the protein isoform
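The idea of a variant being unique to one transcript can be expressed as a simple set check. This is a hedged sketch rather than the system's actual method: the `variants_unique_to` function, the `var_shared` identifier and the AQP7-001 pairings are hypothetical, and only the rs199936776 / AQP7-004 association comes from the slide.

```python
from collections import defaultdict


def variants_unique_to(pairs, transcript):
    """Return variants whose only associated transcript is `transcript`."""
    transcripts_by_variant = defaultdict(set)
    for variant, tx in pairs:
        transcripts_by_variant[variant].add(tx)
    return sorted(
        v for v, txs in transcripts_by_variant.items() if txs == {transcript}
    )


# (variant, transcript) pairs; all but the first are placeholders.
pairs = [
    ("rs199936776", "AQP7-004"),
    ("var_shared", "AQP7-001"),
    ("var_shared", "AQP7-004"),
]

unique = variants_unique_to(pairs, "AQP7-004")
```

With these placeholder pairs, `unique` contains only rs199936776, because `var_shared` is also associated with a second transcript.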
Identification of variants that could affect transcript expression in adipose tissues
Variation Name | PolyPhen prediction | Associated Transcript Name | Ensembl Transcript ID