Philippine Journal of Science 141 (1): 25-34, June 2012
ISSN 0031-7683
Date Received: 19 Apr 2011
Key Words: 5' untranslated region, bioinformatics tools,
hepatitis C virus subtypes, non-structural 5B region
*Corresponding author: [email protected]
Michael O. Baclig1*, Juliet Gopez-Cervantes2, and Filipinas F.
Natividad1
1Research and Biotechnology Division,2Liver Disease and
Transplant Center, St. Luke's Medical Center,
279 E. Rodriguez Sr., Blvd., Quezon City
Bioinformatics Tools for Identifying Hepatitis C Virus
Subtypes
With the development of freeware bioinformatics software as well
as the availability of web-based software, it is now possible to
use various bioinformatics tools to identify viral subtypes such as
hepatitis C virus (HCV). This study aimed to demonstrate the role
of bioinformatics tools in identifying HCV subtypes and to compare
the accuracy of HCV-1 subtyping by 5’UTR PCR-RFLP analysis and DNA
sequencing. From a clinical viewpoint, accurate genotype and
subtype identification of HCV is important because it may be
used as a guide for deciding which therapy is appropriate for
a particular patient. From 2005 to 2008, we had a total of 30
HCV genotype 1 (HCV-1) positive samples. HCV-1 subtypes were
identified by an in-house PCR-RFLP analysis and through direct
nucleic acid sequencing using nested primers specific to the 5’UTR
and non-structural 5B (NS5B) region. Bioinformatics tools play an
important role in identifying HCV-1 subtypes by predicting the size
of the amplicon; determining the specific restriction enzyme to cut
a given nucleic acid sequence; viewing and editing the
electropherogram; aligning nucleotide sequences with prototypes;
searching for identical sequences; and understanding the evolution
and relationships of various subtypes. The HCV nucleotide sequences
reported in this study have been deposited to GenBank. Overall,
this information can be utilized to generate molecular diagnostic
tests in the future.
INTRODUCTION
There are a number of different methods for HCV
genotyping and subtyping. The most frequently used typing methods
are line probe assay (LiPA) and sequencing of the 5’UTR. The
Versant HCV genotype assay (LiPA) manufactured by Innogenetics has
been developed based on hybridization of 5’UTR amplification
products with genotype specific probes. On the other hand, the
TruGene HCV 5’NCR genotyping kit (Bayer
Healthcare, CA) is based on semi-automated sequencing (Verbeeck
et al. 2008; Chevaliez et al. 2009). However, it has been shown
that genotyping methods using the 5’UTR, including LiPA, may not
discriminate subtypes 1a from 1b in 5% to 10% of cases. Thus, other
investigators have used different regions of the HCV genome using
RFLP analysis or sequencing of the 5’UTR and NS5B for genotyping
and subtyping (Zein 2000; Chen and Weck 2002; Zheng et al. 2003;
Martro et al. 2008; Qiu et al. 2009; Mora et al. 2010).
PCR-RFLP analysis of the nested RT-PCR amplified 5’UTR is
generally used for the identification of HCV genotypes in the
Philippines (Maramag et al. 2006). It has
been suggested that as the virus continues to evolve and more
HCV-infected individuals are tested, new subtypes such as HCV-1c
will emerge (Ross et al. 2008; Verma and Chakravarti 2008; Utama et
al. 2010). To date, only three confirmed HCV-1 subtypes
(1a, 1b, and 1c) and 10 provisionally assigned subtypes
(1d to 1m) have been described (Bracho et al. 2006;
Bracho et al. 2008; Martro et al. 2011). Thus, it is likely that
typing methods including PCR-RFLP analysis will have to be modified
to accommodate the rapidly increasing database of information
collected on HCV sequence heterogeneity (Davidson et al. 1995;
Buoroa et al. 1999; Lee et al. 2010). In addition, there is little
doubt that HCV typing methods require careful redesign.
The importance of identifying HCV genotypes and subtypes using
bioinformatics tools transcends mere academic interest because it
provides clinicians and scientists with invaluable information
about HCV genomics, which can be used for epidemiologic studies.
Furthermore, molecular characterization of HCV subtypes is likely
to facilitate the development of an effective vaccine. From a
clinical point of view, current therapeutic decisions for
chronically infected HCV patients are made on the basis of
genotyping and subtyping. Thus, accurate identification of subtypes
will enable the clinicians to make the proper choice of new
antiviral compounds which are likely to show distinct activities
against isolates belonging to different subtypes of HCV (Chandra et
al. 2007; Chevaliez et al. 2009; Koletzki et al. 2010; Panduro et
al. 2010; Pickett et al. 2011).
Bioinformatics tools have been developed to generate, store,
analyze, and visualize biological data. The challenge is to choose
user-friendly tools that would give clear and meaningful biological
information, without being overwhelmed by the complexity of the
data. In this paper, we demonstrate how bioinformatics tools can be
used to identify HCV-1 subtypes and we highlight selected freeware
bioinformatics software and web-based software. We also report the
accuracy of HCV-1 subtyping by 5’UTR PCR-RFLP analysis compared to
direct DNA sequencing. Additionally, we compare the HCV-1 subtypes
by phylogenetic analysis of the 5’UTR and non-structural 5B (NS5B)
region.
MATERIALS AND METHODS
Isolation of HCV RNA
Viral ribonucleic acid (RNA) was obtained
from the St. Luke’s BioBank. These RNA samples were extracted from
the blood of patients who tested positive for hepatitis C.
cDNA synthesis
cDNA synthesis was carried out from the HCV RNA
extract using SuperScript III reverse transcriptase.
Nested PCR amplification of the 5’UTR
The 5’UTR amplification was carried out with 1.0 pmol each of the
primers, 0.5X Phusion HF Buffer, 0.10 mM dNTPs, 0.01 U of Phusion
DNA polymerase, and RNA template, made up to a volume of 50 µL.
Nested PCR was carried out using a programmable thermocycler
(G-Storm) at 94 °C for 1 minute, followed by 25 cycles of 94 °C for 25 seconds,
50 °C for 40 seconds, and 72 °C for 1 minute, and then 72 °C for 5
minutes. First round PCR was done using the outer sense primer
(5’-CTGTGAGGAACTACTGTCTT-3’) and outer antisense primer
(5’-ATACTCGAGGTGCACGGTCTACGAGACCT-3’). One µL of the first
round PCR product was reamplified with internal primers
(5’-TTCACGCAGAAAGCGTCTAG-3’ and 5’-CACTCTCGAGCACCCTATCAGGCAGT-3’)
for another 25 cycles under the same conditions (Chan et al. 1992).
Negative and positive controls were included in the nucleic acid
extraction, reverse transcription, as well as amplification for
quality control to exclude false positive results due to cross
contamination. The amplicons were analyzed on 2% agarose gels
followed by staining with ethidium bromide, and visualized under a
UV transilluminator (Gel Doc). A 25-bp ladder (Invitrogen) was used
as molecular weight marker.
Nested PCR amplification of the NS5B region
NS5B nested PCR
amplification was carried out with the following thermal profile:
initial denaturation at 95 °C for 1 minute followed by 40 cycles at
95 °C for 20 seconds, 56 °C for 30 seconds, 72 °C for 1 minute, and
72 °C for 10 minutes as previously described (Baclig et al.
2010).
Simulation and testing of the PCR assay
Amplify 3 version 3.1.4
for Mac OS (http://engels.genetics.wisc.edu/amplify) and AmplifX
1.5.4 (http://ifrjr.nord.univ-mrs.fr/AmplifX) were used to predict
the size of the amplicon.
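The kind of size prediction performed by such programs can be illustrated with a minimal Python sketch. It assumes exact primer matching on one strand, which is a simplification and not the algorithm used by Amplify 3 or AmplifX. The function names and the template are hypothetical; the template is synthetic and was constructed only so that the toy amplicon is 251 bp long.

```python
# Minimal in silico PCR sketch (illustrative; not the Amplify 3 / AmplifX algorithm).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def predict_amplicon_size(template, forward, reverse):
    """Return the amplicon length in bp, or None if a primer has no exact match."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so search for its
    # reverse complement downstream of the forward primer.
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Inner primers as listed in the text; the template is synthetic (assumption).
FORWARD = "TTCACGCAGAAAGCGTCTAG"
REVERSE = "CACTCTCGAGCACCCTATCAGGCAGT"
toy_template = (
    "GGGG" + FORWARD + "ACGT" * 51 + "A" + reverse_complement(REVERSE) + "CCCC"
)

amplicon_size = predict_amplicon_size(toy_template, FORWARD, REVERSE)
print(amplicon_size)
```

Dedicated tools additionally tolerate mismatches and report primer-dimer and stability information, which this sketch does not attempt.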
Restriction fragment length polymorphism (RFLP) of the 5’UTR
NEBcutter version 2.0 was used to map the enzyme used to cut
the 251-bp amplicon (http://tools.neb.com/NEBcutter2). The 251-bp
amplicon of the 5’UTR was digested using BstUI for 16 hours at 60
°C. HCV-1 subtypes were identified from the resulting DNA fragments
which were visualized using polyacrylamide gel electrophoresis
(PAGE) and ethidium bromide staining.
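Restriction mapping of this kind can be sketched in a few lines of Python. BstUI recognizes CGCG and cuts between the two CG dinucleotides (CG^CG), leaving blunt ends. The function below is a hypothetical helper, not NEBcutter code, and the two toy sequences are synthetic: the positions of the sites were chosen (as an assumption) to give fragment sets that sum to 251 bp.

```python
import re


def digest(seq, site="CGCG", cut_offset=2):
    """Return fragment lengths after cutting at every occurrence of `site`.

    cut_offset is the cut position inside the site; 2 models the blunt
    CG^CG cut of BstUI.
    """
    seq = seq.upper()
    # A lookahead finds overlapping sites (e.g. CGCGCG) as well.
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Synthetic 251-bp sequences (assumption): one site versus two sites.
toy_one_site = "A" * 207 + "CGCG" + "A" * 40
toy_two_sites = "A" * 177 + "CGCG" + "A" * 26 + "CGCG" + "A" * 40

print(digest(toy_one_site))
print(digest(toy_two_sites))
```

The fragment lengths always sum to the length of the input, which gives a quick consistency check on any predicted pattern.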
Visualization of the electropherogram and sequence alignment
ChromasPro version 1.5 was used to visualize and edit the
electropherogram (http://www.technelysium.com.au/ChromasPro.html).
Sequence alignment and analysis of the 5’UTR and NS5B region were
carried out using the ClustalW2 multiple sequence alignment
(http://www.ebi.ac.uk/Tools/msa/clustalw2) and Molecular
Evolutionary Genetics Analysis (MEGA) version 5 software.
DNA purification and sequencing
A portion of the nested PCR
products was purified using the GFX PCR DNA and Gel Band
Purification kit (GE Healthcare), prior to direct nucleotide
sequencing. PCR products in the 5’UTR and NS5B region were
sequenced using Big Dye Terminator Sequencing Ready Reaction Kit
and Applied Biosystems 3730xl Automated Sequencer (Macrogen,
Korea). The sequence data were aligned with the consensus sequences
of confirmed subtypes using ClustalW2 and MEGA5 multiple sequence
alignment software. The DNA sequences were then compared for
identification with sequences from the National Center for Biotechnology
Information (NCBI) using the Basic Local Alignment Search Tool
(BLAST) program. The GenBank/EMBL/DDBJ accession numbers of HCV
sequences used in the analysis were M62321 (for 1a), D90208 (for
1b), D14853 (for 1c), AF238485 (for 2a), D10988 (for 2b), D17763
(for 3a), Y11604 (for 4a), Y13184 (for 5a), and Y12083 (for
6a).
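The idea of comparing an aligned query against subtype prototypes can be sketched as follows. This is a simplified stand-in that scores percent identity over pre-aligned sequences; it is not the BLAST or ClustalW2 algorithm. The function names and the short sequences are hypothetical and are not real HCV reference sequences.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def closest_reference(query, references):
    """Return (label, identity) for the reference most similar to the query."""
    scores = {label: percent_identity(query, seq) for label, seq in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical 12-column alignment, not real HCV sequences.
toy_refs = {"1a": "ACGTACGTACGA", "1b": "ACGTACGTGCGG", "1c": "ACTTACCTACGA"}
toy_query = "ACGTACGTGCGG"

print(closest_reference(toy_query, toy_refs))
```

In practice the references would be the prototype sequences listed above, aligned together with the query before scoring.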
Phylogenetic analysis and basic local alignment search
Phylogenetic trees of the 5’UTR and NS5B regions were
constructed using MEGA5 software with the
neighbor-joining (NJ) and maximum parsimony (MP) methods. The
sequence data were then aligned and tested for homology with
existing sequences in GenBank. The robustness of the trees was
evaluated with 1000 bootstrap replicates, and a grouping was
considered significant when its bootstrap value was greater than
70%.
Nucleotide sequence accession numbers
The nucleic acid sequences
reported in this study have been deposited in GenBank at NCBI.
They can be retrieved under GenBank accession numbers GQ844690 to
GQ844700 and GQ866987 to GQ867012.
RESULTS AND DISCUSSION
Many relatively rapid and simple typing
methods for identifying the genotypes and subtypes of HCV have
been described. These methods are based on the amplification of
subgenomic regions of the virus from clinical specimens by reverse
transcription-polymerase chain reaction (RT-PCR) followed by
digestion with restriction enzymes, amplification with
type-specific primers, or hybridization with type-specific probes.
However, the role of bioinformatics tools in identifying HCV
subtypes remains poorly described. In our institution,
PCR-RFLP analysis of the nested RT-PCR amplified 5’UTR is generally
used for the identification of HCV genotypes and subtypes. In this
study, the simulation and testing of the PCR product were performed
using Amplify 3 and AmplifX. The PCR product size was predicted to
be 251-bp (Figure 1). This is consistent with the actual size of
the amplicon following agarose gel electrophoresis and ethidium
bromide staining (Figure 2). Based on the results of this study, we
have demonstrated that it is possible to theoretically determine
the size of the amplicon using bioinformatics tools such as Amplify
3 and AmplifX. Engels (1993) has previously shown that Amplify 3
may be used to design PCR experiments and predict the size of the
amplicon. In addition, Amplify 3 is a freeware Macintosh program
which can also be used as a tool for designing primers.
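As a small illustration of the checks made when designing or evaluating primers, the Python sketch below computes the length, GC fraction, and an approximate melting temperature of the outer sense primer. The Wallace rule used here, Tm = 2(A+T) + 4(G+C), is a rough estimate intended for short oligonucleotides and is an assumption of this example; it is not stated to be the method used by Amplify 3. The function names are hypothetical.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Approximate Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Outer sense primer as listed in the Materials and Methods.
OUTER_SENSE = "CTGTGAGGAACTACTGTCTT"

print(len(OUTER_SENSE), gc_fraction(OUTER_SENSE), wallace_tm(OUTER_SENSE))
```

Primer design software typically uses more detailed thermodynamic models, so these values should be read only as a first approximation.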
NEBcutter is an on-line DNA sequence tool used to find large,
non-overlapping, open reading frames and sites for all restriction
enzymes from New England BioLabs (Vincze et al. 2003). NEBcutter
DNA restriction mapper was used to determine the specific
restriction enzyme to differentiate HCV-1a and 1b viruses in this
study. Based on this, BstUI was identified as the restriction
enzyme. Of the 30 HCV-1 samples, 16 (53%) were identified as 1a, 13
(43%) were identified as 1b, and 1 (3%) was identified as mixed
subtype (1a/1b) by PCR-RFLP analysis of the 5’UTR. Results showed
that subtypes of HCV-1 viruses can be identified on the basis of
the electropherotypes produced following restriction enzyme
digestion of the amplified PCR products using BstUI resulting in
DNA fragments of 209-bp and 42-bp for HCV-1a viruses. In contrast,
the resulting DNA fragments for HCV-1b were 179-bp, 42-bp, and
30-bp. Subtypes 1a and 1b differ by a single nucleotide (A/G) at
position -99 in the 5’UTR. The presence of a G residue at position
-99 produces a sequence in nested RT-PCR products that is
recognized by the enzyme BstUI (Martro et al. 2008).
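The reading of banding patterns described above can be expressed as a simple rule, sketched here in Python. The expected fragment sizes are those reported in the text; the function name and the size tolerance are hypothetical and were added only for illustration.

```python
# Expected BstUI fragment sizes (bp) for each subtype, as reported in the text.
PATTERNS = {
    "1a": {209, 42},
    "1b": {179, 42, 30},
}


def call_subtype(bands, tolerance=3):
    """Assign an HCV-1 subtype from observed band sizes (bp).

    `tolerance` (bp) is an illustrative allowance for gel sizing error.
    """
    def present(expected):
        return any(abs(expected - band) <= tolerance for band in bands)

    has_1a = all(present(size) for size in PATTERNS["1a"])
    has_1b = all(present(size) for size in PATTERNS["1b"])
    if has_1a and has_1b:
        return "mixed 1a/1b"
    if has_1a:
        return "1a"
    if has_1b:
        return "1b"
    return "undetermined"


print(call_subtype([209, 42]))
print(call_subtype([179, 42, 30]))
print(call_subtype([209, 179, 42, 30]))
```

Because the 179-bp and 30-bp fragments together make up 209 bp, an incompletely digested 1b product could in principle give the same combined pattern as a mixed sample, so a mixed call from such a rule is best treated as provisional.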
The electropherogram was viewed and edited using ChromasPro
(Figure 3). Multiple sequence alignment was performed using
ClustalW2. In this study, 5’UTR sequencing classified 15 (50%) samples
as subtype 1a and the remaining 15 (50%) as
subtype 1b. Relative to 5’UTR nucleotide sequencing, PCR-RFLP analysis
correctly identified subtype 1a in 15/15 (100%)
and subtype 1b in 13/15 (87%) of the samples. The predictive value
of the 5’UTR PCR-RFLP
analysis for subtype 1a was 15/16 (94%). For subtype 1b, the
predictive value of 5’UTR PCR-RFLP analysis was 13/13 (100%). One
sample (#743) that was typed as mixed subtype (1a/1b) by PCR-RFLP
analysis was confirmed as 1b by 5’UTR sequencing. This result could
be explained by an A/G polymorphism that may exist at nucleotide
position -99. Overall, 5’UTR PCR-RFLP analysis accurately
subtyped 28 of 30 (93%) samples, missing 2 subtype 1b viruses.
Thus, the 5’UTR-based PCR-RFLP typing method could not
accurately discriminate 1a from 1b in 2 of 30 (7%) cases.
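The percentages above follow from a cross-tabulation of PCR-RFLP calls against 5’UTR sequencing, which can be checked with a short Python sketch. The table is reconstructed from the figures reported in the text (an assumption of this example is that the second missed 1b sample was the one called 1a by PCR-RFLP); the variable and function names are hypothetical.

```python
# Rows: PCR-RFLP call; columns: subtype by 5'UTR sequencing.
# Counts reconstructed from the reported figures (assumption).
counts = {
    "1a": {"1a": 15, "1b": 1},
    "1b": {"1a": 0, "1b": 13},
    "mixed": {"1a": 0, "1b": 1},
}


def predictive_value(call):
    """Fraction of samples with a given PCR-RFLP call confirmed by sequencing."""
    row = counts[call]
    return row[call] / sum(row.values())


total = sum(sum(row.values()) for row in counts.values())
concordant = counts["1a"]["1a"] + counts["1b"]["1b"]

print(round(100 * predictive_value("1a")))
print(round(100 * predictive_value("1b")))
print(round(100 * concordant / total))
```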
Figure 1. Using the software Amplify 3 and AmplifX, the
simulated 5’UTR PCR product is 251-bp.
Although the 5’UTR PCR-RFLP analysis provides an easy and rapid
method for screening of HCV-1 samples, and is a widely used tool in
HCV genotyping and subtyping, sequence and phylogenetic analysis is
the gold standard for determining HCV genotypes and subtypes. It is
noteworthy that one sample
(#743) was identified as a mixed subtype (1a/1b) by PCR-RFLP
analysis. It was shown to be HCV-1b by 5’UTR sequence and
phylogenetic analysis through bioinformatics tools such as
ChromasPro, ClustalW2, MEGA, and BLAST search. Thus, bioinformatics
can resolve ambiguous identities. The standard and most definitive
method for subtype determination is direct sequencing, which has
lower cost for reagents, but requires more time than commercial
assay kits. In this study, sequence alignment and analysis were
done using ChromasPro, ClustalW2, and MEGA5. It has been shown that
the ChromasPro software includes most of the functionality of
Chromas, such as assembly of overlapping sequences into a consensus
and automatic display of ambiguities for editing. On the other
hand, ClustalW2 is a multiple sequence alignment program for DNA or
proteins. It calculates the best match for the selected sequences
and lines them up, so that the identities can be seen. ClustalW2 is
currently maintained at the Conway Institute, UCD Dublin. MEGA5 is
user-friendly software for building sequence alignments; it also provides
statistical analyses of DNA or protein sequence data (Larkin et al.
2007; Goujon et al. 2010; Tamura et al. 2011).
The GenBank database is an annotated collection of all publicly
available DNA sequences produced at NCBI as part of an
international collaboration with the European Molecular Biology
Laboratory (EMBL) and the DNA Data Bank of Japan (DDBJ). GenBank
receives nucleotide sequences produced in many laboratories
throughout the world from more than 100,000 distinct organisms
including viruses. Each database has its own set of submission and
retrieval tools, but the three major public DNA databases exchange
information daily so that all three databases should contain the
same set of DNA sequences (Mizrachi 2002; Teufel et al. 2006). In
order to identify HCV-1 subtypes, the DNA sequences
were compared with reference sequences from NCBI using the BLAST
program (http://www.ncbi.nlm.nih.gov/BLAST).
Figure 2. Representative gels of digested PCR products. [Gel labels: lanes Digested, Undigested, and M; HCV-1b: 179 bp, 42 bp, and 30 bp; HCV-1a: 209 bp and 42 bp; undigested product: 251 bp.]
Figure 3. Representative electropherogram and nucleotide
sequence of partial 5’UTR.
The HCV sequences reported in this study may provide the
essential tools for studies on molecular virology, pathogenesis of
hepatitis C, drug design, and vaccine development. Because of the
diversity of HCV variants, when designing a vaccine, we must
consider specific genotypes and subtypes in various geographical
regions in order to achieve broad protection.
Phylogenetic trees are calculated using statistical models to
infer evolutionary relationships between organisms. Several methods
have been described for phylogenetic analysis such as
neighbor-joining (NJ), minimum evolution (ME), maximum likelihood
(ML), maximum parsimony (MP), and Bayesian approaches (Procter et
al. 2010). In this study, phylogenetic analysis was carried out
using MEGA5. This software has evolved to include the creation and
exploration of sequence alignments, estimation of sequence
divergence, and construction and visualization of phylogenetic trees
(Tamura et al. 2007; Tamura et al. 2011). MEGA includes various
tests for examining the reliability of the tree such as bootstrap
test and the standard error test. In bootstrap test, the same
number of sites is randomly sampled with replacement from the
original sequences, and a tree is constructed from the resampled
data. This process is repeated and the reliability of a sequence
cluster is evaluated by its relative frequency of the appearance in
bootstrap replications. In the standard error test, the branch
lengths of the tree are re-estimated by using the ordinary
least-squares method, and the standard errors of these estimates
are computed (Kumar et al. 1994; Kumar et al. 2001). Phylogenetic
analysis of the partial 5’UTR showed that not all of the subtype 1a
viruses clustered together and not all of the subtype 1b viruses
grouped together (Figure 4A). This is probably because only a
partial sequence was subjected to phylogenetic analysis. It has
been suggested that the discriminatory power of the phylogenetic
tree analysis depends on the length of the fragment analyzed. Thus,
full-length analysis of the 5’UTR is recommended to verify these
findings. Additionally, subtyping based on the use of variable
genomic regions such as the core, E1, and NS5B is recommended to
confirm these results (Figure 5). Phylogenetic analysis of the NS5B
region showed that all subtype 1b viruses grouped together and all
subtype 1a viruses clustered together, supported by a bootstrap value
of 99% (Figure 4B). One possible way to validate the
results is to run the data in different software packages and evaluate their
robustness using various methods such as maximum likelihood,
maximum parsimony, and Bayesian inference for comparison. Other
software that can be used to construct phylogenetic trees includes
the phylogeny inference package (PHYLIP) and phylogenetic analysis
using parsimony (PAUP) (Felsenstein 1989; Swofford 2003; Kumar et al.
2008).
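The bootstrap test described in this section can be illustrated with a minimal Python sketch. Alignment columns are sampled with replacement and the frequency with which a grouping recurs is recorded. As a simplified stand-in for building a full NJ or MP tree from each replicate, as MEGA5 does, the sketch only asks whether two sequences remain each other's closest relative by p-distance. All function names and the toy alignment are hypothetical.

```python
import random


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def resample_columns(alignment, rng):
    """Sample alignment columns with replacement, keeping the original length."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


def nearest(sample, name):
    """Name of the sequence with the smallest p-distance to `name`."""
    others = [n for n in sample if n != name]
    return min(others, key=lambda n: p_distance(sample[name], sample[n]))


def pair_support(alignment, a, b, replicates=1000, seed=0):
    """Fraction of replicates in which a and b are mutual nearest neighbours."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        sample = resample_columns(alignment, rng)
        if nearest(sample, a) == b and nearest(sample, b) == a:
            hits += 1
    return hits / replicates


# Hypothetical toy alignment, not real HCV sequences.
toy_alignment = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAT",
    "s3": "TTGTACCTGC",
    "s4": "TTGAACCTGC",
}
print(pair_support(toy_alignment, "s1", "s2", replicates=200))
```

A support value computed this way plays the same role as the percentages shown at the nodes of the trees, where only values over 70% were reported.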
Figure 4A. Phylogenetic analysis of partial 5’UTR sequences of
30 HCV-1 samples. HCV reference sequences from GenBank/EMBL/DDBJ
were included. The evolutionary history was inferred using
neighbor-joining (NJ) method. The numbers at the nodes represent
the percent bootstrap support for 1000 replicates. Only values over
70% are shown. Bar at the base of the tree shows the genetic
divergence. Phylogenetic analysis was conducted in MEGA5.
Recombination is a cause of genetic diversity and has important
implications for the pathogenesis, diagnosis, and treatment of HCV
infection. Natural intergenotypic (2b/1b) and intratypic (1b/1a)
recombinants have been reported in some families of RNA viruses,
including HCV (Colina et al. 2004; Kageyama et al. 2006; Lee et al.
2010; Mes et al. 2010). In this study, phylogenetic tree analysis
did not show any evidence for recombinant forms. This result was
not surprising because HCV recombination occurs rarely. However,
further analysis of the full-length genome is needed to confirm that no
recombination event has occurred.
In the practical sense, bioinformatics tools can be used to
design primers in order to amplify a target DNA. They
can also be used to identify genotypes and subtypes for
epidemiologic studies, as well as for molecular modeling of drug
targets. In addition, they can be utilized to identify biomarkers in
various disease processes as well as in genome-wide association
studies and gene expression profiling.
Figure 4B. Phylogenetic analysis of partial NS5B sequences of 30
HCV-1 samples. HCV reference sequences from GenBank/EMBL/DDBJ were
included. The evolutionary history was inferred using maximum
parsimony (MP) method. The numbers at the nodes represent the
percent bootstrap support for 1000 replicates. Only values over 70%
are shown. Bar at the base of the tree shows the genetic
divergence. Phylogenetic analysis was conducted in MEGA5.
CONCLUSION
Bioinformatics tools play an important role in
identifying HCV subtypes by predicting the size of the amplicon;
determining the specific restriction enzyme to cut a given DNA
sequence; viewing and editing the electropherogram; aligning
nucleotide sequences with prototypes; searching
for identical sequences; and understanding the evolution and
relationships of nucleic acid sequences. Overall, this information
can be utilized to generate molecular diagnostic tests in the
future.
ACKNOWLEDGEMENTS
This work was supported by a research grant
from St. Luke’s Medical Center through the Research and
Biotechnology Division.
REFERENCES
BACLIG MO, CHAN VF, RAMOS JDA, GOPEZ-CERVANTES J, NATIVIDAD FF. 2010. Correlation of the
5’untranslated region (5’UTR) and non-structural 5B (NS5B)
nucleotide sequences in hepatitis C virus subtyping. Int J Mol
Epidemiol Genet 1:236-244.
BRACHO M, CARILLO-CRUZ F, ORTEGA E, MOYA A, GONZALEZ-CANDELAS
F. 2006. A new subtype of hepatitis C virus genotype 1: Complete
genome and phylogenetic relationships of an Equatorial Guinea
isolate. J Gen Virol 87:1697-1702.
BRACHO MA, SALUDES V, MARTRO E, BARGALLO A, GONZALEZ-CANDELAS F,
AUSINA V. 2008. Complete genome of a European hepatitis C virus
subtype 1g isolate: Phylogenetic and genetic analyses. Virol J
5:72.
BUOROA S, PIZZIGHELLAB S, BOSCHETTOA R, PELLIZZARIA L, CUSANA M,
BONAGUROA R, MENGOLIA C, CAUDAIC C, PADULAC M, EGISTO P, VALENSINC
P, PALUA G. 1999. Typing of hepatitis C virus by a new method based
on restriction fragment length polymorphism. Intervirol 42:1-8.
Figure 5. The HCV genome consists of a single open reading frame
and two untranslated regions. It encodes a polyprotein of
approximately 3011 amino acids. Adapted from Lindenbach and Rice
2005 (Nature 436:933-938).
CHAN S, MCOMISH F, HOLMES E, DOW B, PEUTHERER J, FOLLETT E, YAP
P, SIMMONDS P. 1992. Analysis of a new hepatitis C virus type and
its phylogenetic relationship to existing variants. J Gen Virol
73:1131-1141.
CHANDRA M, THIPPAVUZZULA R, RAMACHANDRA RAO VV, HABIB AM,
HABIBULLAH CM, NARASU L, PRAMEELA Y, KHAJA MN. 2007. Genotyping of
hepatitis C virus in infected patients from South India. Infect
Genet Evol 7:724-730.
CHEN Z, WECK K. 2002. Hepatitis C virus genotyping:
Interrogation of the 5’untranslated region cannot accurately
distinguish genotypes 1a and 1b. J Clin Microbiol 40:3127-3134.
CHEVALIEZ S, BOUVIER-ALIAS M, BRILLET R, PAWLOTSKY JM. 2009.
Hepatitis C virus genotype 1 subtype identification in new HCV drug
development and future clinical practice. PLoS ONE 4:1-9.
COLINA R, CASANE D, VASQUEZ S, GARCIA-AGUIRRE L, CHUNGA A,
ROMERO H, KHAN B, CRISTINA J. 2004. Evidence of intratypic
recombination in natural populations of hepatitis C virus. J Gen
Virol 85:31-37.
DAVIDSON F, SIMMONDS P, FERGUSON J, JARVIS L, DOW B, FOLLETT E,
SEED C, KRUSIUS T, LIN C, MEDGYESI G, KIYOKAWA H, OLIM G, DURAISAMY
G, CUYPERS T, SAEED A, TEO D, CONRADIE J, KEW M, LIN M,
NUCHAPRAYOON C, NDIMBIE O, YAP P. 1995. Survey of major genotypes
and subtypes of hepatitis C virus using RFLP of sequences amplified
from the 5’non-coding region. J Gen Virol 76:1197-1204.
ENGELS W. 1993. Contributing software to the internet: The
amplify program. TIBS 18.
FELSENSTEIN J. 1989. PHYLIP. Phylogeny inference package
(version 3.2). Cladistics 5:164-166.
GOUJON M, MCWILLIAM H, LI W, VALENTIN F, SQUIZZATO S, PAERN J,
LOPEZ R. 2010. A new bioinformatics analysis tools framework at
EMBL-EBI. Nucleic Acids Res 38:S695-699.
KAGEYAMA S, AGDAMAG D, ALESNA E, LEANO P, HEREDIA A, TAC-AN A,
JEREZA L, TANIMOTO T, YAMAMURA J, ICHIMURA H. 2006. A natural
intergenotypic (2b/1b) recombinant of hepatitis C virus in the
Philippines. J Med Virol 78:1423-1428.
KOLETZKI D, DUMONT S, VERMEIREN H, FEVERY B, DE SMET P, STUYVER
LJ. 2010. Development and evaluation of an automated hepatitis C
virus NS5B sequence-based subtyping assay. Clin Chem Lab Med
48:1095-1102.
KUMAR S, TAMURA K, NEI M. 1994. MEGA: Molecular evolutionary
genetics analysis software for microcomputers. Comput Appl Biosci
10:189-191.
KUMAR S, TAMURA K, JAKOBSEN IB, NEI M. 2001. MEGA2: Molecular
evolutionary genetics analysis software. Bioinformatics
17:1244-1245.
KUMAR S, NEI M, DUDLEY J, TAMURA K. 2008. MEGA: A
biologist-centric software for evolutionary analysis of DNA and
protein sequences. Brief Bioinform 9:299-306.
LARKIN MA, BLACKSHIELDS G, BROWN NP, CHENNA R, MCGETTIGAN PA,
MCWILLIAM H, VALENTIN F, WALLACE IM, WILM A, LOPEZ R, THOMPSON JD,
GIBSON TJ, HIGGINS DG. 2007. ClustalW and ClustalX version 2.
Bioinformatics 23: 2947-2948.
LINDENBACH B, RICE C. 2005. Unraveling hepatitis C virus
replication from genome to function. Nature 436:933-938.
LEE Y, LIN H, CHEN Y, LEE C, WANG S, CHANG J, CHEN T, LIU H,
CHEN Y. 2010. Molecular epidemiology of HCV genotypes among
injection drug users in Taiwan: Full-length sequences of two new
subtype 6w strains and a recombinant form 2b6w. J Med Virol
82:57-68.
MARAMAG F, RIVERA M, PREDICALA R, BACLIG M, MATIAS R, CERVANTES
J. 2006. Hepatitis C genotypes among Filipinos. Phil J
Gastroenterol 2:30-32.
MARTRO E, GONZALES V, BUCKTON A, SALUDES V, FERNANDEZ G, MATAS
L, PLANAS R, AUSINA V. 2008. Evaluation of a new assay for
hepatitis C virus genotyping targeting both 5’NC and NS5B genomic
regions in comparison with reverse hybridization and sequencing
methods. J Clin Microbiol 46:192-197.
MARTRO E, VALERO A, JORDANA-LLUCH E, SALUDES V, PLANAS R,
GONZALEZ-CANDELAS F, AUSINA V, BRACHO MA. 2011. Hepatitis C virus
sequences from different patients confirm the existence and
transmissibility of subtype 2q, a rare subtype circulating in the
metropolitan area of Barcelona, Spain. J Med Virol 83:820-826.
MES TH, VAN DOORNUM GJ. 2010. Recombination in hepatitis C virus
genotype 1 evaluated by phylogenetic and population-genetic
methods. J Gen Virol 92:279-286.
MIZRACHI I. 2002. GenBank: The nucleotide sequence database. The
NCBI handbook 1-15.
MORA MV, ROMANO CM, GOMES-GOUVEA MS, GUTIERREZ MF, CARRILHO FJ,
PINHO JR. 2010. Molecular characterization distribution and
dynamics of hepatitis C virus genotypes in blood donors in
Colombia. J Med Virol 82:1889-1898.
PANDURO A, ROMAN S, KHAN A, TANAKA Y, KURBANOV F, LOPEZ E,
CAMPOLLO O, NAZARA Z, MIZOKAMI M. 2010. Molecular epidemiology of
hepatitis C virus genotypes in West Mexico. Virus Res
151:19-25.
PICKETT B, STRIKER R, LEFKOWITZ E. 2011. Evidence for separation
of HCV subtype 1a into two distinct clades. J Viral Hepat
18:608-618.
PROCTER JB, THOMPSON J, LETUNIC I, CREEVEY C, JOSSINET F, BARTON
GJ. 2010. Visualization of multiple alignments phylogenies and gene
family evolution. Nature Methods 7:S16-25.
QIU P, CAI XY, DING W, ZHANG Q, NORRIS ED, GREENE JR. 2009. HCV
genotyping using statistical classification approach. J Biomed Sci
16:62.
ROSS RS, VERBEECK J, VIAZOV S, LEMEY P, RANST MV, ROGGENDORF M.
2008. Evidence for a complex mosaic genome pattern in a full-length
hepatitis C virus sequence. Evolutionary Bioinformatics
4:249-254.
SWOFFORD D. 2003. PAUP. Phylogenetic analysis using parsimony
(version 4). Sunderland, Massachusetts: Sinauer Associates.
TAMURA K, DUDLEY J, NEI M, KUMAR S. 2007. Molecular evolutionary
genetics analysis software version 4.0. Mol Biol Evol
24:1596-1599.
TAMURA K, PETERSON D, PETERSON N, STECHER G, NEI M, KUMAR S.
2011. MEGA5: Molecular evolutionary genetics analysis using maximum
likelihood, evolutionary distance and maximum parsimony methods.
Mol Biol Evol 28:2731-2739.
TEUFEL A, KRUPP M, WEINMANN A, GALLE PR. 2006. Current
bioinformatics tools in genomic biomedical research. Int J Mol Med
17:967-973.
UTAMA A, TANIA NP, DHENNI R, GANI RA, HASAN I, SANITYOSO A,
LELOSUTAN S, MARTAMALA R, LESMANA LA, SULAIMAN A, TAI S. 2010.
Genotype diversity of hepatitis C virus in HCV associated liver
disease patients in Indonesia. Liver Int 30:1152-1160.
VERBEECK J, STANLEY M, SHIEH J, CELIS L, HUYCK E, WOLLANTS E,
MORIMOTO J, FARRIOR A, SABLON E, JANKOWSKI-HENNIG M, SCHAPER C,
JOHNSON P, VAN RANST M, VAN BRUSSEL M. 2008. Evaluation of Versant
HCV genotype assay (LiPA) 2.0. J Clin Microbiol 46:1901-1906.
VERMA V, CHAKRAVARTI A. 2008. Comparison of 5’non coding core
with 5’non-coding regions of HCV by RT-PCR: Importance and clinical
implications. Curr Microbiol 57:206-211.
VINCZE T, POSFAI J, ROBERTS RJ. 2003. NEBcutter: A program to
cleave DNA with restriction enzymes. Nucleic Acids Res
31:3688-3691.
ZEIN N. 2000. Clinical significance of hepatitis C virus
genotypes. Clin Microbiol Rev 13:223-235.
ZHENG X, PANG M, CHAN A, ROBERTO A, WARNER D, YEN-LIEBERMAN B.
2003. Direct comparison of hepatitis C virus genotypes tested by
INNO-LiPA HCV II and TruGene HCV genotyping methods. J Clin Virol
28:214-216.