VIROSEQ ™ HIV-1 GENOTYPE TESTING WITH THE NEW ABI PRISM ® 3100 GENETIC ANALYZER

P BAYBAYAN, B HOO, N MARLOWE, N BERNARD, C COCHRAN and J DILEANIS
Applied Biosystems, Foster City, CA

OBJECTIVE
To evaluate the performance of HIV-1 genotyping on the new 3100 Genetic Analyzer, Applied Biosystems' 16-capillary instrument.

INTRODUCTION
The ViroSeq ™ HIV-1 Genotyping System (HGS) is a sequencing-based test method (for Research Use Only) that identifies mutations associated with drug resistance in the protease gene and the 5' end of the reverse transcriptase gene. This system provides all the reagents needed to prepare RNA from plasma samples, to perform a two-step RT-PCR reaction, and to sequence the resulting 1.8 kb PCR product. In addition, the system includes the ViroSeq ™ HGS Software, which automatically assembles the sequences from a clinical research specimen into one consensus sequence (project) and compares the consensus sequence with a reference sequence. The user can then manually edit the data before a report is generated. This report lists all the amino acid differences between the reference (HIV-1 wild type) sequence and the specimen's consensus sequence. These differences are sorted by gene (protease vs. reverse transcriptase) and classified either as variants associated with drug resistance (per the Los Alamos HIV-1 Database) or as “novel” variants (not identified by the Los Alamos HIV-1 Database as conferring drug resistance).

For sequencing, the ViroSeq HIV-1 Genotyping System uses 6–7 sequencing primers, which provide double coverage of the entire protease gene and the first 315 codons of the reverse transcriptase gene. These sequencing reactions can be analyzed on a variety of instruments.
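The report logic described in the introduction (compare the consensus with the reference, list the amino acid differences, then separate resistance-associated variants from novel ones) can be illustrated with a minimal Python sketch. This is not the ViroSeq HGS Software; it assumes the two amino acid sequences are already aligned with no insertions or deletions. The function names, the short toy sequences, and the one-entry "known" set are all hypothetical and are not taken from the Los Alamos HIV-1 Database.

```python
from typing import Dict, List, Set


def list_differences(reference: str, consensus: str) -> List[str]:
    """Return substitutions such as 'M36I' between two aligned amino acid strings.

    Assumption: both strings are aligned and of equal length (no indels).
    """
    if len(reference) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref}{pos}{obs}"
        for pos, (ref, obs) in enumerate(zip(reference, consensus), start=1)
        if ref != obs
    ]


def sort_report(differences: List[str], known: Set[str]) -> Dict[str, List[str]]:
    """Split differences into resistance-associated and novel variants."""
    report: Dict[str, List[str]] = {"resistance_associated": [], "novel": []}
    for diff in differences:
        key = "resistance_associated" if diff in known else "novel"
        report[key].append(diff)
    return report


# Toy 10-residue strings and a toy lookup set, for illustration only.
toy_reference = "PQITLWQRPL"
toy_consensus = "PQVTLWQRPV"
toy_known = {"L10V"}  # hypothetical; a real list would come from a curated database

toy_report = sort_report(list_differences(toy_reference, toy_consensus), toy_known)
print(toy_report)
```

In practice the comparison would be run once per gene so that protease and reverse transcriptase differences are reported separately, as the report described above does.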
To date, the 377 instrument has been the platform of choice for customers who require mid to high throughput HIV genotyping; this acrylamide gel-based platform permits the genotyping of up to 13 specimens per 96-lane gel with a 7 hour run. For laboratories with a lower specimen throughput, the 310 instrument can be used; this single-capillary electrophoresis platform permits the genotyping of 1–2 specimens per 24 hour run. For research laboratories with very high throughput requirements, the 3700 instrument can be used; this 96-capillary electrophoresis instrument allows the genotyping of up to 13 specimens in 2.5 hours. User feedback suggested that a capillary instrument with the capabilities of the 377 would be preferred, since there would be no acrylamide gel and no manual loading of the samples.

The 3100 instrument, an automated walk-away genetic analyzer, was released in April 2000. This instrument has a 16-capillary array and permits the analysis of 2 HIV-1 clinical samples in 2.5 hours using the standard sequencing module. Therefore, 20–24 samples can be analyzed within a 24 hour period.

MATERIALS AND METHODS:
Randomly selected clinical research samples from 41 HIV-1 infected individuals were processed using the ViroSeq ™ HIV-1 Genotyping Kit (version 2) per the manufacturer’s protocol. For some of these samples, a second round of RT-PCR was performed with nested PCR primers. (These nested PCR primers are not included in the ViroSeq kit.) Viral loads of the samples ranged from <50 to 384,000 copies/mL. The sample preparation procedure used guanidinium thiocyanate lysis of the pelleted viral particles. The resulting RNA was precipitated with isopropanol. After RNA resuspension, a two-step RT-PCR reaction was performed with the MuLV reverse transcriptase and AmpliTaq Gold ® enzymes.
(In the ViroSeq kit, the PCR reaction contains dUTP nucleotides and the enzyme AmpErase ® UNG (uracil N-glycosylase) for contamination control.) PCR products were purified with Microcon ® -100 columns and subsequently analyzed on an agarose gel. After appropriate dilutions of the PCR products were made, they were sequenced with 7 sequencing primers using the BigDye ™ terminator chemistry. The sequencing samples were analyzed with (a) a 5% Long Ranger gel on the 377 DNA Sequencer and (b) POP-6 ™ polymer on the 3100 Genetic Analyzer (50 cm capillary array). Sequences were analyzed with Sequencing Analysis Software v.3.3 and the ViroSeq HIV-1 Genotyping Software v.2.2.

RESULTS:
• Sequencing results showed concordant genotypes for sequencing reactions from the same PCR product run on both the 3100 and the 377 instruments.

Table 1. Genotypes of Samples Analyzed on the 3100 and 377 Instruments

Sample | Protease mutations | Reverse transcriptase mutations
1677 | M36I, G48V, L63P, A71V, L90M | M41L, M184V, G196E, R211K, T215Y
1680 | M36I | none identified
1684 | M46I, L63P, G73S, V77I, N88S, L90M | V179D, M184V
1679 | M36I, L63P, V77I | G190A, R211K
1603 | M36I, G48V, V82A | S68G, M184V, T215Y
1606 | K20R, V77I | none identified
1607 | L10V, M36I, L63P | R211K
1608 | V77I | none identified
1610 | L10V, M36I, L63P | V106I, R211K
1666-3 | M36I | [not listed]
1611 | A71V | K70R, L74V, L100I, K103N, R211K, T215F, K219Q
1612 | L63P | K70R, K103N, M184V, R211K
1613 | L10F, D30N, L63P, A71V, V75I, V77I, N88D | M41L, M184V, T215Y
1614 | L10V, M46I, G73S, V77I, L90M | D67N, K70R, K103N, V108I, G190A, T215F, K219E
1615 | D30N, M36I, M46I, K55R*, D60E, L63P, N88D | D67N, K70R, K103R, M184V, K219Q
1617 | L63P | G196E*, R211K
1616 | none identified | R211K
1619 | G16E, L63P, A71T | G196E
1666-10 | L63P | none identified
1666-13 | M36I | R211K
1628 | D60E, L36P, V77I | R211K
1629 | V77I | K166R, G196E, R211K
1642 | L63P, V77I | V106I

• Longer read lengths were obtained with 3100 capillary electrophoresis than with slab gel electrophoresis.
On average, 700 bases could be read with the 3100 using the standard StdSeq50_POP6DefaultModule with the 50 cm capillary array. The cycle time was 2.5 hours for 16 sequencing reactions. (Figures 2 and 3)

Figure 2.
Figure 3.

• An alternate method allows for more rapid sequencing. We tested this method with 12 specimens. With the rapid sequencing module (RapidSeq36_POP6DefaultModule), read length decreased to approximately 480 bases. If double coverage of the protease and reverse transcriptase sequences is desired, the data collection time of the rapid sequencing module needs to be increased from 2100 seconds to 2580 seconds, which increases the cycle time from 60 minutes to 68 minutes. Note: Since resolution was decreased, additional edits were needed in the HGS software, thereby eliminating the advantage of the longer standard 3100 method, which reduced edits by 60%. (Figures 4, 5, 6, and 7)

[Electropherogram printouts omitted: Model 3100, Basecaller-3100opt.bcp, files 11_primerG_F11_11.ab1 and 11_primerB_B11_03.ab1, Version 3.3; bases called to roughly position 800.]

Standard Rapid 3100 Sequencing Module Data: Figure 4. Figure 5.
3100 Rapid Sequencing Module Data with Additional Run Time: Figure 6. Figure 7.

• HIV-1 samples were processed using the manufacturer’s protocol. The subsequent sequencing reactions were run on both the 377 and the 3100. After HGS software analysis, the number of edits made in the projects with 377 data was compared with the number of edits made in the projects with 3100 data. The 377 projects showed a total of 5603 edits; the 3100 projects showed a total of 1743 edits. Therefore, approximately 3-fold fewer edits were needed when sequencing reactions were analyzed on the 3100 instrument, indicating that basecalling accuracy was substantially improved with the 3100 data. (Figures 8 and 9)

Figure 8.
Figure 9.
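The comparison of edit totals reported above can be checked with a short calculation. The sketch below takes only the two totals from the text (5603 edits for the 377 projects, 1743 for the 3100 projects); the function name is illustrative, and per-sample edit counts are not available here, so only the aggregate ratio is computed.

```python
from typing import Tuple


def edit_reduction(edits_377: int, edits_3100: int) -> Tuple[float, float]:
    """Return (fold reduction, percent reduction) in edits, 377 relative to 3100."""
    fold = edits_377 / edits_3100
    percent = 100.0 * (edits_377 - edits_3100) / edits_377
    return fold, percent


fold, percent = edit_reduction(5603, 1743)
print(f"{fold:.1f}-fold fewer edits, {percent:.0f}% reduction")
# prints "3.2-fold fewer edits, 69% reduction"
```

The ratio of about 3.2 is consistent with the "approximately 3-fold" figure stated in the text.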
[Bar chart omitted: number of edits per sample, 377 vs. 3100.]
[Electropherogram printouts omitted: Model 3100, Basecaller-3100RR, samples 4-2 and 2-2, Version 3.4.1.]

• In 16 random pairs of data, mixtures were called automatically more often in the 3100 data than in the 377 data for 8 samples, more often in the 377 data for 4 samples, and equally often for the remaining 4 samples. Therefore, improvements in automatic base calling of mixtures were seen in the 3100 data. These improvements reduced the time needed for manual editing. (Figures 10, 11, and 12)

Figure 10.
Figure 11.
Figure 12.

CONCLUSIONS:
• Both instruments, the 377 and 3100, provided double coverage sequences for the entire HIV-1 protease gene and for the first 315 codons of the reverse transcriptase gene.
• Longer read lengths were observed when sequencing reactions were analyzed on the 3100 capillary electrophoresis instrument. These longer read lengths were the result of better peak resolution at 580 base pairs, and hence fewer edits were needed in the ViroSeq ™ HGS software. This decrease in the number of edits reduced the time needed to review a project.
• Longer read lengths also provide additional sequencing coverage for the targeted genes. This additional information results in higher confidence in base calling.
• Concordant genotypes were generated from the 377 and 3100 instruments.
• In addition, use of the capillary electrophoresis instrument eliminated the need to pour an acrylamide sequencing gel and to manually load the samples onto it, thereby permitting a more automated, walk-away system, a reduction in labor cost, and increased ease of use in research laboratories.

The ViroSeq ™ HIV-1 Genotyping System, Version 2, is for Research Use Only. Not for use in diagnostic procedures. Under separate procedures and labeling, this type of product can be used for investigational use. Please contact Applied Biosystems for information.

ViroSeq and BigDye are trademarks and MicroAmp and Applied Biosystems are registered trademarks of PE Corporation or its subsidiaries in the U.S. and certain other countries. AmpErase is a registered trademark of Roche Molecular Systems, Inc. Microcon is a registered trademark of Millipore. All other trademarks are properties of their respective owners.