
J. gen. Virol. (1989), 70, 2541-2553. Printed in Great Britain

Key words: porcine parvovirus/nucleotide sequence/parvoviruses


Porcine Parvovirus: DNA Sequence and Genome Organization

By ANA I. RANZ, JUAN J. MANCLÚS, ESMERALDA DIAZ-AROCA AND JOSÉ I. CASAL*

Inmunologia y Genetica Aplicada, S.A. (Ingenasa), Hermanos Garcia Noblejas, 41-2°, 28037 Madrid, Spain

(Accepted 15 June 1989)

SUMMARY

We have determined the nucleotide sequence of an almost full-length clone of porcine parvovirus (PPV). The sequence is 4973 nucleotides (nt) long. The 3' end of virion DNA shows a Y-shaped configuration homologous to rodent parvoviruses. The 5' end of virion DNA shows a repetition of 127 nt at the carboxy terminus of the capsid proteins. The overall organization of the PPV genome is similar to those of other autonomous parvoviruses. There are two large open reading frames (ORFs) that almost entirely cover the genome, both located in the same frame of the complementary strand. The left ORF encodes the non-structural protein NS1 and the right ORF encodes the capsid proteins (VP1, VP2 and VP3). Promoter analysis, location of splicing sites and putative amino acid sequences for the viral proteins show a high homology of PPV with feline panleukopenia virus and canine parvoviruses (FPV and CPV) and rodent parvovirus. Therefore we conclude that PPV is related to the Kilham rat virus (KRV) group of autonomous parvoviruses formed by KRV, minute virus of mice, Lu III, H-1, FPV and CPV.

INTRODUCTION

Porcine parvovirus (PPV) is a major cause of reproductive failure in swine, resulting in foetal death and mummification, stillbirths, and delayed return to oestrus (Joo & Johnson, 1976; Mengeling, 1978). PPV is an autonomously replicating parvovirus, containing a ssDNA molecule of approximately 5000 nucleotides (nt) (Molitor et al., 1984); only the minus (genomic) strand is packaged into virions. Four virus-specific proteins have been described: three capsid proteins (A or VP1, B or VP2 and C or VP3, of Mr values 83000, 64000 and 60000, respectively) and one non-structural protein (NS1; Mr 84000) (Molitor et al., 1983, 1985).

DNA sequences of AAV2 [serotype 2 of adeno-associated virus (AAV)] (Srivastava et al., 1983), minute virus of mice (MVM) (Astell et al., 1986), H-1 (Rhode & Paradiso, 1983), canine parvovirus (CPV) (Reed et al., 1988), feline panleukopenia virus (FPV) (Carlson et al., 1985), bovine parvovirus (BPV) (Chen et al., 1986) and B19 (human parvovirus B19) (Shade et al., 1986) have been reported. These studies indicate that autonomous parvoviruses show several common features: there are two large open reading frames (ORFs); the mRNAs from both ORFs are polyadenylated and 3'-coterminal at about map unit (m.u.) 95; the left ORF encodes non-capsid proteins which are necessary for viral DNA replication; and the right ORF encodes the major capsid proteins of the virus as a nested set.

PPV is difficult to propagate in vitro because of its strong tendency to produce defective interfering particles (Choi et al., 1987), particularly when used at a high m.o.i. or when it is highly passaged. These defective particles contain random genome deletions or duplications. To learn more about the genetic strategy and organization of PPV and its relationship to other parvoviruses we decided to clone and sequence it.

In this paper we report the nucleotide sequence of PPV. We have used the NADL-2 strain (Mengeling & Cutlip, 1976), a tissue culture-adapted strain, which shows the presence of two different replicative forms (RFs) of DNA (Molitor et al., 1984). One of the RFs, NADL-2, is infective and apparently identical to the RF from the NADL-8 strain (a highly pathogenic isolate). The other RF, NADL-2*, is not infective and shows a deletion of 300 nt near the 5' terminus. We have cloned and sequenced the NADL-2 RF. The sequence shows major homologies with FPV and rodent parvoviruses, which should be useful in the construction of molecular probes for clinical detection, in the design of vaccines and in establishing the evolutionary relationships between parvoviruses. This is the first report of the entire PPV genome.

0000-8946 © 1989 SGM

METHODS

Materials. Restriction and DNA modification enzymes were purchased from Boehringer Mannheim or New England Biolabs, and deoxynucleotides and dideoxynucleotides were purchased from Pharmacia. [α-35S]dATP was obtained from New England Nuclear. Chemical reagents for oligonucleotide synthesis were purchased from Applied Biosystems and other chemical reagents were obtained from Sigma. For cloning and sequencing, the vectors used were the plasmid pUC18 and the phages M13mp18 and M13mp19 (Messing, 1983).

Virus and cells. The swine testis (ST) cell line was a gift from Dr L. Enjuanes (Centro de Biologia Molecular, Madrid, Spain). The cells were grown in Dulbecco's modified Eagle's medium supplemented with 10% foetal calf serum and antibiotics. All virus passages were done on ST cells (Pirtle, 1984). The viral strains used were NADL-2 (ATCC VR-742), a tissue culture-adapted virus (Mengeling & Cutlip, 1976), and NADL-8 (a gift from Dr W. Mengeling, National Animal Disease Center, Ames, Ia., USA), a highly pathogenic strain of PPV. To minimize the presence of defective particles we used a low m.o.i. and a low-passage virus as the starting material for all the cloning experiments.

DNA isolation. RF DNA was extracted from infected cells at 48 h post-infection; the cells were harvested by trypsinization followed by centrifugation. The method was basically similar to that described by Molitor et al. (1984). Cells were lysed with 4 volumes of lysis buffer (0.75% SDS, 1.25 M-NaCl, 20 mM-Tris-HCl pH 7.5, 10 mM-EDTA and 250 µg/ml proteinase K) for 2 h at 37 °C, followed by overnight incubation on ice. Chromosomal DNA and particulate cell debris were eliminated by centrifugation (Kontron, TST 55.5, 30000 r.p.m., 90 min, 4 °C). Supernatants were precipitated with isopropanol, resuspended in Tris-EDTA (TE) buffer (10:1), treated with RNase A and phenol-extracted. Further purification was achieved by electrophoresis of the RF DNA in 1% low melting point (LMP) agarose (FMC) gels in Tris-borate-EDTA (TBE) buffer. DNA was extracted from LMP agarose gel slices as previously described (Langridge et al., 1980).

DNA cloning and transformation. The virus genome orientation is given according to the convention of Armentrout et al. (1978), with the 3' end of the minus-strand DNA to the left. The cloning strategy for the PPV genome will be described elsewhere (J. J. Manclús et al., unpublished). Briefly, purified RF DNA was digested with appropriate restriction enzymes (PstI and EcoRI) and the individual bands were separated on 1% LMP agarose gels in TBE buffer and eluted as described (Langridge et al., 1980). The fragments containing the 3' and 5' ends were treated respectively with mung bean nuclease and the Klenow fragment to leave blunt ends (Maniatis et al., 1982). These DNA fragments were ligated with appropriately restricted pUC18 DNA and transformed into Escherichia coli JM109 cultures as described by Hanahan (1983). All transformation mixtures were plated onto LB plates containing 100 µg of ampicillin per ml and X-gal as the indicator. Cloned fragments were then assembled in a clone called pPPV-10 (Fig. 1), which contained a complete copy of the viral genome since it was able to infect ST cells by transfection (J. J. Manclús et al., unpublished results).

Sequencing PPV DNA. The strategy for sequencing PPV DNA was to clone five large fragments, obtained by digestion of pPPV-10, into M13mp18 or M13mp19 double-stranded RF DNA by using combinations of appropriate restriction enzymes (Fig. 1). Ligated DNA was transformed into JM109 cells, and white plaques were selected for single-stranded template preparation. The template DNAs were then sequenced by using the dideoxynucleotide chain termination method (Sanger et al., 1977) and PPV sequence-specific oligonucleotides as primers. Oligonucleotides were synthesized on a 381A apparatus (Applied Biosystems). The oligonucleotides were made consecutively at 250 bp intervals, once the sequence had been read through (Fig. 1). This process considerably speeds up the sequencing. The sequence of each fragment was determined on a minimum of two separate gels and two different templates.

Computer program and analysis. The DNA sequence analysis was performed using the Beckman Microgenie Software (Queen & Korn, 1984) and the Sequence Analysis Package (Stephens, 1985) on an IBM PC AT computer.

RESULTS

Nucleotide sequence of the 5' end

The cloning of the 5' end was done by filling in PPV RF DNA with the Klenow fragment and then digesting with EcoRI. The fragments generated, containing the blunt 5' end of RF DNA and an internal cohesive restriction enzyme site, were directionally cloned into pUC18, and then into M13mp18 and mp19 (Fig. 1). DNA sequence data and comparison with available data from rodent parvoviruses (Astell et al., 1986) indicate the possible absence of some nucleotides (70 to 80) in the palindrome region with a U-shaped configuration which forms the 5' end. Fig. 2(b) shows a significant homology of PPV with the MVM 5'-terminal sequence (Astell et al., 1986) (23 of 40 nt) in the sequenced side of the loop. Since the sequenced 5' end is derived from a clone, pPPV-10, it shows only the flip orientation. A 127 bp direct repeat begins at nt 4519 (Fig. 3), five bases before the stop codon of the VP1 and VP2 coding sequences, and finishes at nt 4772. This perfect repeat in the 5' end is a feature present in CPV (Reed et al., 1988) and also in the rodent parvoviruses (Astell et al., 1986). Inside the 127 bp repetition there are also small 24 bp repetitions (nt 4538 to 4562 and nt 4612 to 4635), although the matching is not as perfect (20 of 24 nt).

[Fig. 1 diagram: restriction map of pPPV-10 in pUC18 (sites shown include BamHI, PstI, TaqI, MspI, NcoI, BglII, PvuII, HindIII, EcoRI and SacI), with the subclones M13PPV1, M13PPV2, M13PPV3, M13PPV4 and M13PPV9 aligned beneath a scale of 0 to 5000 (nucleotide no.).]

Fig. 1. Sequencing strategy and restriction map of PPV. The PPV genome was cloned in pUC18 to provide a clone called pPPV-10. Five fragments of pPPV-10 covering the whole genome of PPV were cloned in M13mp18 or -mp19 for sequencing.

[Fig. 2 diagrams: (a) hairpin drawn from the 3' terminal sequence; (b) hairpin drawn from the 5' terminal sequence.]

Fig. 2. Terminal structures of PPV DNA. (a) 3' end nucleotide sequence (minus strand); (b) 5' end nucleotide sequence (minus strand).
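The coordinates quoted above for the 127 bp direct repeat can be checked with simple arithmetic. The following is a minimal Python sketch, not part of the original analysis: it assumes the quoted positions are 1-based and inclusive, and the helper names (`span_length`, `repeat_copies`, `find_tandem_repeat`) are hypothetical. Under those assumptions the span from nt 4519 to nt 4772 is consistent with two tandem copies of a 127 nt unit.

```python
# Coordinates quoted in the text; assumed here to be 1-based and inclusive.
REPEAT_START = 4519   # first nt of the direct repeat
REPEAT_END = 4772     # last nt of the direct repeat
REPEAT_UNIT = 127     # length of one repeat unit (bp)
ORF_B_STOP = 4524     # stop codon position quoted for ORF B


def span_length(start, end):
    """Length of a 1-based, inclusive coordinate span."""
    return end - start + 1


def repeat_copies(start, end, unit):
    """Return (whole copies, leftover nt) of a repeat unit within a span."""
    return divmod(span_length(start, end), unit)


def find_tandem_repeat(seq, unit):
    """0-based positions i where seq[i:i+unit] is immediately repeated."""
    last_start = len(seq) - 2 * unit
    return [i for i in range(last_start + 1)
            if seq[i:i + unit] == seq[i + unit:i + 2 * unit]]


if __name__ == "__main__":
    print(span_length(REPEAT_START, REPEAT_END))
    print(repeat_copies(REPEAT_START, REPEAT_END, REPEAT_UNIT))
    print(ORF_B_STOP - REPEAT_START)
```

`find_tandem_repeat` is included only to show how a perfect tandem duplication could be located in a sequence string; it was not run on the PPV sequence here.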

Nucleotide sequence of the 3' end of the minus strand

The 3' end of the genome was sequenced from clone pPPV-3 in M13mp18 (Fig. 1), which contains a 270 bp insert between the 3' end and the PstI site. This region of the DNA has extensive secondary structure, which makes resolution difficult. The sequence is slightly shorter (102 nt compared to 120) but closely resembles that of CPV and rodent parvoviruses (Hauswirth, 1984; Reed et al., 1988), forming a Y-shaped configuration (Fig. 2a). The total length of the 3' end was 102 nt, with 82 nt making up the stem structure and 20 nt forming the Y structure (Fig. 2a). Several regions of the Y structure are well conserved in rodent parvoviruses (Astell et al., 1986) and PPV, but not so well in CPV (Reed et al., 1988): nt 2 to 19 of the rodent parvoviruses and nt 2 to 16 of PPV (13 of 16 match) and nt 28 to 44 of the rodent parvoviruses and nt 27 to 40 of PPV (13 of 16 match). As in all the parvoviruses sequenced to date, except B19 (Shade et al., 1986), the 3' end is not related in sequence to the 5' end.

Genome organization and assignment of PPV genes

Clones PPV-1, PPV-2 and PPV-4 were used to determine the sequence of the major portion of the genome (Fig. 1). PPV DNA-specific oligonucleotides were synthesized to cover the entire genome. The average distance between them was 250 nt. In a few cases it was necessary to reduce the distance between oligonucleotides to get a good sequence. The sequence and single-letter amino acid translation of PPV are shown in Fig. 3. The nucleotide sequence of PPV was 4973 bases long, which is the shortest sequence reported to date for autonomous parvoviruses. However, if we add the 70 to 80 nt probably lost from the 5' end during cloning, we arrive at about 5053 nt, a length similar to that reported for rodent parvoviruses (5151 nt) (Astell et al., 1986).

The final DNA sequence of the PPV genome was analysed by the computer programs already mentioned (Queen & Korn, 1984; Stephens, 1985). The potential coding domains for both the complementary strand (C strand, plus polarity) and the viral strand (V strand, minus polarity) are shown in Fig. 4. There are two predominant ORFs (A and B), both occurring in frame 3 of the C strand (Fig. 4). No ORFs of significant size were found in the minus strand of the virus.
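The kind of reading-frame analysis described above can be sketched in a few lines. This is an illustrative Python sketch only, not the Microgenie or Sequence Analysis Package procedure used by the authors: the function names are hypothetical, frames are numbered 0 to 2 here (so the paper's frame 3 would correspond to frame 2 only if its frames are counted from the first nucleotide, which is an assumption), and the toy sequence is invented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_positions(seq, frame):
    """0-based start positions of stop codons in one frame (0, 1 or 2)."""
    return [i for i in range(frame, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS]


def longest_open_stretch(seq, frame):
    """Longest stop-free run of codons in a frame, as a 0-based
    half-open interval (start, end)."""
    best = (frame, frame)
    run_start = frame
    last = frame
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            if i - run_start > best[1] - best[0]:
                best = (run_start, i)
            run_start = i + 3
        last = i + 3
    if last - run_start > best[1] - best[0]:
        best = (run_start, last)
    return best


def reverse_complement(seq):
    """Reverse complement, used to scan the opposite strand."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    toy = "ATGAAATAGCCCGGGTTTAAATGA"  # invented sequence, not PPV
    for frame in range(3):
        print(frame, stop_positions(toy, frame), longest_open_stretch(toy, frame))
```

Applying the same two functions to `reverse_complement(seq)` would give the three frames of the other strand, which is how a six-frame stop-codon map of the type shown in Fig. 4 can be built.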

A computer search of the C strand for possible promoter regions was done by using the information on eukaryotic promoters described by Bensimhon et al. (1983). These features include an enhancer region (E) about 100 bp upstream of the cap site; a G + C-rich activator region (A) approximately 50 to 75 bp upstream of the cap site; and an A + T-rich domain (TATA box), which usually lies about 30 ± 5 bp upstream of the cap site and positions the RNA polymerase for initiation of transcription. Several possible promoters were characterized by searching for the consensus sequence TATA(A/T)A. Three possible promoters were localized, one at m.u. 3.7 (P3.7) (TATAAA, nt 183), a second at m.u. 38 (P38) (TATAT, nt 1923) and a third at m.u. 46 (P46) (TATAAA, nt 2329) which contained all the promoter components. Promoters P3.7 and P38 are both analogous in map position to other parvovirus promoters (Astell et al., 1986; Reed et al., 1988) and probably initiate, respectively, transcription of ORFs A and B. Unfortunately, no RNA mapping data are available for PPV and we therefore cannot correlate our results with transcription data.
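The relation between the nucleotide positions and the map units quoted for the three candidate promoters can be reproduced as follows. This is a minimal Python sketch under stated assumptions: map units are taken as 100 × position / genome length with the 4973 nt cloned sequence as the denominator (the text does not state the formula), and the search pattern `TATA[AT]` is a simplification chosen only because it covers both quoted hits (TATAAA and TATAT). The function names are hypothetical.

```python
import re

GENOME_LENGTH = 4973  # nt, length of the cloned sequence given in the text


def map_units(nt, genome_length=GENOME_LENGTH):
    """Position as a percentage of genome length (assumed definition)."""
    return 100.0 * nt / genome_length


# Lookahead so that overlapping candidates are also reported.
TATA_PATTERN = re.compile(r"(?=TATA[AT])")


def find_tata(seq):
    """1-based positions of TATA[AT] motifs in a DNA string."""
    return [m.start() + 1 for m in TATA_PATTERN.finditer(seq)]


if __name__ == "__main__":
    for name, nt in (("P3.7", 183), ("P38", 1923), ("P46", 2329)):
        print(name, nt, round(map_units(nt), 1))
    print(find_tata("GGTATAAAGGTATATCC"))  # invented sequence, not PPV
```

With these assumptions nt 183 falls at about 3.7 m.u., while nt 1923 and nt 2329 fall between 38 and 39 and between 46 and 47 m.u. respectively, so the promoter names appear to use truncated rather than rounded values.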

A characteristic signal at the 3' end of eukaryotic mRNAs is the sequence AAUAAA about 20 nt before the poly(A) tract (Wickens & Stephenson, 1984), an essential but not sufficient signal for polyadenylation. Several additional signals have been suggested, such as downstream G/T clusters (Birnstiel et al., 1985) or a CAYUG sequence upstream or downstream from the poly(A) site (Berget, 1984). A search of the C strand of PPV shows 16 possible polyadenylation signals. Most of them (nine of 16) are located in the 5' end, downstream from the ORF B stop codon at nt 4524. Only these sites fulfil all the theoretical requirements and they are probably functional. The others were located within the coding region of ORF A (nt 791, 918, 1041, 1152, 1581, 1991 and 3191) and are possibly not functional, as occurs with AAV2 (Srivastava et al., 1983) and FPV (Carlson et al., 1985).
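A search of this kind amounts to locating every AATAAA hexamer on the plus-strand DNA and sorting the hits by position relative to the ORF B stop codon. The Python sketch below illustrates the idea; it is not the authors' program, the function names are hypothetical, and the additional criteria mentioned above (G/T clusters, CAYUG) are not modelled.

```python
ORF_B_STOP = 4524  # nt, stop codon position quoted in the text


def find_motif(seq, motif="AATAAA"):
    """1-based start positions of every (possibly overlapping) motif hit."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(motif, start + 1)
    return positions


def split_by_boundary(positions, boundary=ORF_B_STOP):
    """Separate hits at or before a boundary from those after it."""
    upstream = [p for p in positions if p <= boundary]
    downstream = [p for p in positions if p > boundary]
    return upstream, downstream


if __name__ == "__main__":
    # Positions listed in the text for the signals inside the coding region.
    internal = [791, 918, 1041, 1152, 1581, 1991, 3191]
    total_reported = 16
    print(len(internal), total_reported - len(internal))
    print(find_motif("CCAATAAAGGAATAAATAAA"))  # invented sequence, not PPV
```

The seven internal positions listed in the text plus the nine downstream sites account for the 16 signals reported.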


[Fig. 3 sequence listing: nt 1 to 4973 of the cloned PPV genome with the one-letter amino acid translation of the major ORFs beneath; marked restriction sites include PstI, HaeII, PvuII, HindIII, NcoI, EcoRI, BglII and SacI, and the start of VP2 is indicated.]

Fig. 3. The nucleotide sequence of the cloned PPV genome. The nucleotide sequence corresponds to the complementary (plus) strand (5' to 3'). Below the nucleotide sequence are the one-letter amino acid sequences of putative polypeptides corresponding to the major ORFs. Proposed mRNA 5' donor splice junctions are overlined and the 3' acceptor splice junctions are underlined. Polyadenylation signal regions are indicated as A+. The sequence is complete except for approx. 80 nt at the right end.

[Fig. 4 diagram: stop codon maps for the three reading frames of the C strand (5' to 3') and the three reading frames of the V strand (3' to 5'), plotted against nucleotide number (0, 994, 1989, 2983, 3978, 4973) and map units (0 to 100).]

Fig. 4. Genomic organization of the complementary (C) (plus-polarity) and viral (V) (minus-polarity) strands. Each line designates the stop codon position of each frame in both strands.


Table 1. Location of 5' donor and 3' acceptor splicing sites*

          5' Donors                      3' Acceptors
Sequence      Location (nt)      Sequence      Location (nt)
AAGGTATAT          580           GTTCAGGT          1651
CAAGTGACT          607           TGCCAGGT          1831
ATGGTGAGT          817           TACCAGGT          2377
AAGGTAAAG         1681           CTACAGGT          2911
CAGGTGATT         1834           CCACAGGA          3574
AAGGTAGGA         2260           GCTCAGGT          3783
GAGGTAAGG         2293           CAACAGGC          4020
GAGGTAAGA         2658
CAGGTAGTT         2914
ATGGGGAGT         3134
CAGGTAGGA         3786
AAAGTAAA!         3982

* The consensus sequence used for the 5' donor was AAGGTAAGT and that for the 3' acceptor was (Py)6XCAGGC (Mount, 1982). The computer searched for a maximum of two mismatches in nine nucleotides. The regions underlined are homologous to the consensus sequences.
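The search rule stated in the footnote (at most two mismatches in nine nucleotides) corresponds to a sliding-window comparison against the consensus. The Python sketch below shows one way to do this; it is not the program used by the authors, the function names are hypothetical, and the donor consensus is taken literally as the nine-letter string printed in the footnote, so any degenerate positions in the original consensus are not modelled.

```python
DONOR_CONSENSUS = "AAGGTAAGT"  # string as printed in the Table 1 footnote


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def scan(seq, consensus=DONOR_CONSENSUS, max_mismatches=2):
    """Slide the consensus along seq and report windows that differ from it
    at no more than max_mismatches positions.
    Returns (1-based position, window, mismatch count) tuples."""
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        d = mismatches(window, consensus)
        if d <= max_mismatches:
            hits.append((i + 1, window, d))
    return hits


if __name__ == "__main__":
    # Two donor entries from Table 1 compared with the literal consensus.
    for site in ("AAGGTATAT", "AAGGTAGGA"):
        print(site, mismatches(site, DONOR_CONSENSUS))
    print(scan("CCAAGGTAAGTCC"))  # invented sequence, not PPV
```

The same `scan` function could be applied to the acceptor consensus once its degenerate positions (Py, X) are expanded, for example by replacing the equality test with a set-membership test per position.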

By analogy with other parvoviruses (MVM, H-1, CPV, FPV), from the amino acid homologies observed for the putative proteins, and from expression studies of chimeric gene fusions containing PPV DNA fragments in a prokaryotic host (Halling & Smith, 1985), we deduce that the left ORF of PPV codes for the non-structural protein NS1 and the right ORF codes for the capsid proteins. NS1 is encoded by the left half of the genome in AAV, H-1, MVM, BPV and CPV and this multifunctional polypeptide seems necessary for viral DNA replication (Cotmore & Tattersall, 1987). NS1 has been detected as an immunoprecipitation product of PPV-infected cell lysates, with an Mr of 84000 (Molitor et al., 1985). The capsid proteins are always encoded by the right ORF as a nested set. Usually there are two encoded capsid proteins. In the case of PPV we have three capsid proteins, VP1, VP2 and VP3, of Mr 83000, 64000 and 60000, respectively (Molitor et al., 1983). VP3 comes from proteolytic cleavage of VP2 and is the most abundant protein in viral capsids.

There is an ATG codon located at nt 279, which could serve as an initiator for the NS1 protein. This codon conforms to the Kozak consensus sequence (AnnATGG) for initiation (Kozak, 1983) and is the first in-frame ATG in the left ORF, downstream from the TATA sequence. Therefore we assign this codon as the initiation site for NS1. There is also strong homology with the initial sequence of amino acids for the non-structural proteins of rodent parvoviruses (Fig. 5). However, the polypeptide encoded by the left ORF would be of Mr 75307. To explain an Mr of 84000 for NS1, calculated by gel electrophoresis, we should accept that the observed difference is due to the extent of phosphorylation of NS1 (Jongeneel et al., 1986). In fact, Molitor et al. (1985) have described the presence of forms with slower electrophoretic mobilities for the PPV NS1 which are probably due to a different level of phosphorylation.
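The Kozak criterion used above can be expressed as a short pattern search. The following Python sketch is illustrative only: the function name is hypothetical, the pattern encodes just the AnnATGG form quoted in the text, and the example strings are invented rather than taken from the PPV sequence.

```python
import re

# AnnATGG: an A three nt before the ATG and a G immediately after it.
# A lookahead is used so that overlapping candidates are all reported.
KOZAK_PATTERN = re.compile(r"(?=A[ACGT]{2}ATGG)")


def kozak_atgs(seq):
    """1-based positions of the A of each ATG lying in an AnnATGG context."""
    return [m.start() + 4 for m in KOZAK_PATTERN.finditer(seq)]


if __name__ == "__main__":
    print(kozak_atgs("CCACCATGGCC"))  # invented sequence with one match
    print(kozak_atgs("CCTCCATGTCC"))  # invented sequence with no match
```

Restricting the hits to a given reading frame, for example by keeping only positions with the same remainder modulo 3 as the ORF, would identify the first in-frame candidate.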

The initiation codon for VP1 also conforms to Kozak's rule and is found at nt 2268. This codon is the first in-frame ATG after the P38 promoter and is soon followed by an in-frame termination codon at nt 2331. The exact splicing pattern of the mRNA for VP1 is not certain; a computer search for possible splice donor (consensus sequence AAGGTAAGT) and acceptor [consensus sequence (Py)6XCAGGC] sites (Mount, 1982) for VP1 and VP2 revealed several sites, which are shown in Table 1. One donor site lies immediately upstream (nt 2260) of the proposed ATG start codon for VP1 (nt 2268); another donor lies a few bases downstream (nt 2293). A total of seven possible acceptor sites had greater than 65% homology with the consensus sequence and retained the core CAGG region (Table 1). By homology with other parvoviruses, the most probable acceptor site used by the capsid protein mRNAs should be located at nt 2374 (CTCTACCAGGT). The first donor mentioned could serve as a splicing site for the transcription of the VP2-specific mRNA, removing the ATG of the VP1-encoding mRNA and

[Figure: aligned amino acid sequences of PPV, CPV and MVM]

Fig. 5. Homology of the translated left ORFs between parvoviruses PPV, CPV and MVM. Homologous regions are enclosed by boxes.

[Figure: aligned amino acid sequences of PPV, CPV and MVM]

Fig. 6. Homology of the VP2 capsid protein between parvoviruses PPV, CPV and MVM. Homologous regions are enclosed by boxes.

allowing the next ATG at nt 2787 to be the initial ATG for VP2. The second donor could be used as a splicing site for the VP1-specific mRNA yielding a protein of Mr 83000, very close to that estimated by gel electrophoresis (Molitor et al., 1983). The predicted Mr for VP2 would be 64416, similar to that estimated previously.
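A predicted Mr of this kind is obtained by summing residue masses over the translated sequence. The sketch below shows the arithmetic under stated assumptions: it uses standard average residue masses (not values taken from this paper), adds one water molecule per chain, and ignores any post-translational modification such as phosphorylation. The function name is hypothetical, and the published figures cannot be reproduced without the full translated ORFs.

```python
# Average residue masses in Da for the 20 standard amino acids (textbook
# values, an assumption of this sketch rather than data from the paper).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water per polypeptide chain (free N and C termini)


def protein_mr(sequence: str) -> float:
    """Approximate Mr of an unmodified polypeptide given in one-letter code."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


if __name__ == "__main__":
    # Short peptide used only to exercise the function.
    print(round(protein_mr("MAPPAKRAKR"), 2))
```

Comparing such a calculated value with a gel estimate shows how much of an apparent Mr must be attributed to modification or anomalous migration, which is the reasoning applied to NS1 (75307 calculated against 84000 observed).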

Sequence homology between PPV and other parvoviruses

There is a striking overall DNA homology between MVM, CPV and PPV (approx. 63%). This homology is maintained, or even improved, at the amino acid sequence level. The amino acid homologies among the MVM, CPV and PPV NS1 and VP2 proteins are depicted in Figs 5 and 6. The left ORF is highly conserved between these parvoviruses, with a homology of approx. 70%. Homology with human parvovirus B19 (Shade et al., 1986) and BPV (Chen et al., 1986) is much lower at approx. 20%. Even so, there are certain regions of the PPV DNA sequence within the NS1 coding region that are highly conserved in all the parvoviruses (amino acids 389 to 408 of PPV have 90% or greater homology with all the other parvoviruses) (Fig. 5). The right ORF is not as conserved as the left ORF, showing a homology of approx. 50 to 60% with FPV and rodent parvoviruses and around 20% with other autonomous parvoviruses. VP1-specific sequences are well conserved with respect to MVM and CPV. The amino terminus of VP1, formed by 13 amino acids, shares the same sequence in all these viruses. There is also a high homology (73%) between MVM and CPV in the first 120 amino acids of VP1. No significant homologies with BPV or B19 are found in this VP1-specific sequence. Four small blocks of strong amino acid homology are found in the VP2-specific sequence (a glycine-rich region at the beginning of the protein, TPWS at nt 3105, YNNDLTA at nt 3279 and PIWXK at nt 4161). These four blocks are well conserved in all the autonomous parvoviruses except the glycine-rich region in B19 (Shade et al., 1986).

DISCUSSION

An almost full-length copy of the PPV genome was sequenced from clones made from RF DNA. The copy contained all the transcription signals and coding sequences of PPV. Therefore we can compare our data with those described to date for other parvoviruses.

The 3' end sequence was cloned intact (except for possibly 4 nt), as indicated by the restriction map and DNA sequence analysis, and it resembles the Y-stem structure of CPV and rodent parvoviruses (Astell et al., 1986; Reed et al., 1988). The 5' end was not a complete copy; we believe that 70 to 80 nt are lost from one of the sides of the 5' loop, probably due to a cloning artefact. Even so, the absence of these nucleotides seems not to affect the replication ability of this copy of the PPV genome in the plasmid pPPV-10, as it was able to infect ST cells by transfection and generate mature virions (J. J. Manclús et al., unpublished results). We have observed a perfect 127 bp duplication in the 5' end, which partially covers the carboxy terminus of the VP1-VP2 coding region. This direct repeat is the largest one reported to date in a parvovirus genome, about twice the size of the direct repeat described in MVM (65 nt) (Astell et al., 1986). It has been suggested that these sequence duplications may be caused by high passage of virus in tissue culture to accommodate the viral genome into the capsid. In the NADL-2 strain of PPV the presence of two forms of RF DNA (differing by approximately 300 bp) has been described (Molitor et al., 1984). However, this deletion, which involves the loss of the SacI site and one of the BglII sites, does not coincide in position with the repetition, being situated approximately between nt 3900 and 4200. We did not observe any significant feature in this region to predict a possible deletion signal.
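One simple way to flag candidate direct repeats such as the duplication described above is to index every k-mer of a sequence and keep those that occur at more than one position. The sketch below is illustrative only: the function name is hypothetical, the method used by the authors to detect the 127 bp repeat is not stated, and the toy input is not PPV DNA.

```python
from collections import defaultdict


def repeated_kmers(sequence: str, k: int) -> dict:
    """Map each k-mer seen more than once to its 1-based start positions."""
    sequence = sequence.upper()
    positions = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        positions[sequence[start:start + k]].append(start + 1)
    # Keep only k-mers that occur at two or more positions.
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}


if __name__ == "__main__":
    toy = "ACGTTTACGTAAACGT"  # toy sequence, not PPV DNA
    for kmer, starts in repeated_kmers(toy, 4).items():
        print(kmer, starts)
```

For a genome of about 5 kb, choosing k close to the expected repeat length keeps chance matches rare; runs of consecutive shared k-mers at a constant offset then mark the extent of a longer direct repeat.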

Similar to all the parvoviruses described to date, there are two large ORFs located in the C strand. PPV presents multiple promoter-like sites, but by homology with other parvoviruses (Astell et al., 1986; Reed et al., 1988) we assign two main promoters, located at m.u. 3.7 and 38, that initiate the transcription of the left and right ORFs, respectively. No TATA-like sequences have been observed in the 5' end, in contrast to FPV and CPV. The viral genome contains 16 potential polyadenylation signals, eight of them inside the 127 bp duplication, a much higher number than any other reported to date for autonomous parvoviruses. It is noteworthy that a similar phenomenon has been described for MVM (Astell et al., 1986). In that case two polyadenylation sites are repeated twice. In PPV, in which the duplication is twice as long (127 nt rather than 65 nt), there are four polyadenylation sites repeated twice. The functional significance of this cluster of poly(A) signals is not well understood.
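Counting potential polyadenylation signals and converting positions to map units are both simple scans. The sketch below assumes that a potential signal means the canonical hexamer AATAAA (AAUAAA in RNA) on the strand examined; the exact criterion used for the count of 16 is not given here. The function names and the toy sequence are hypothetical, and the default genome length of 4973 nt is the length of the sequenced clone.

```python
def find_signal_positions(sequence: str, signal: str = "AATAAA") -> list:
    """Return 1-based start positions of every (possibly overlapping) signal."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(signal)
    while start != -1:
        positions.append(start + 1)
        # Resume one base further on so overlapping hits are not skipped.
        start = sequence.find(signal, start + 1)
    return positions


def map_unit(position: int, genome_length: int = 4973) -> float:
    """Express a nucleotide position as a percentage of the genome length."""
    return 100.0 * position / genome_length


if __name__ == "__main__":
    toy = "GGAATAAACCAATAAAGG"  # toy sequence, not PPV DNA
    for position in find_signal_positions(toy):
        print(position, round(map_unit(position, len(toy)), 1))
```

With a genome length of 4973 nt, one map unit corresponds to roughly 50 nt, which is the scale on which the promoter positions above are quoted.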

The predicted splice positions are analogous to those described for MVM, H-1 and CPV. Unlike B19, BPV or MVM, these transcripts splice within the same frame and not in different frames. We propose two different donor sites: one at nt 2260, which removes the VP1 ATG used by the mRNA encoding VP2 and the other at nt 2293, which would be used by VP1 mRNA. These two donors would use the same acceptor site at nt 2374.

We have shown that the putative amino acid sequence for the PPV NS1 protein is highly homologous to those of CPV and the rodent parvoviruses (Fig. 5). Also, PPV retains the GKRN region, common to all the parvoviruses, which may be used as a diagnostic probe for parvovirus identification. This sequence also conforms to the consensus sequence G(X)4GKT/S(X)5-6I/L/V, which has been recognized as a feature of purine triphosphate binding sites present in proteins of different organisms, suggesting the presence of ATPase or GTPase activities in the NS1 protein of parvoviruses.
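The purine triphosphate binding consensus quoted above translates directly into a regular expression. The sketch below is a minimal illustration: the function name is hypothetical, and the test fragment is an assumed example chosen to fit the pattern, not an authoritative copy of the PPV NS1 sequence.

```python
import re

# G(X)4GKT/S(X)5-6I/L/V written as a regular expression over one-letter codes.
NTP_BINDING_MOTIF = re.compile(r"G.{4}GK[TS].{5,6}[ILV]")


def find_ntp_binding_motifs(protein: str) -> list:
    """Return (1-based position, matched segment) for each motif occurrence."""
    return [(match.start() + 1, match.group())
            for match in NTP_BINDING_MOTIF.finditer(protein.upper())]


if __name__ == "__main__":
    fragment = "GPASTGKSIIAQAIAQAV"  # assumed illustrative fragment
    print(find_ntp_binding_motifs(fragment))
```

Because `finditer` reports non-overlapping matches only, a lookahead form of the pattern would be needed if overlapping candidates had to be listed.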

With regard to MVM, H-1, CPV and FPV, VP1-specific sequences also show a high homology (approx. 73%). The amino terminus of the VP1 protein contains the basic, proline-rich sequence MAPPAKRAKR, which has been implicated in the translocation of the virus to the cell nucleus (Cotmore & Tattersall, 1987). VP2 is not as highly conserved, but also shows a good homology (approx. 50 to 60%) and contains several regions common to all the parvoviruses. The region around the sequence PIW, conserved among all the parvoviruses, shows only slight homology, as shown below:

AAV    V Y L Q G P I W A K I P    nt 4027
BPV    I S R Y N P I W V K V P    nt 4973
MVM    V Y P Q G Q I W D K E L    nt 4174
PPV    V F P N G Q I W D K E L    nt 4161
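The degree of homology in a short aligned block like the one above can be quantified as percent identity. The sketch below uses the four 12-residue segments as listed; the function name is hypothetical, and the scoring (identical residues divided by aligned length, no gaps, no similarity matrix) is a simplifying assumption rather than the method used for the percentages quoted in the text.

```python
# The four aligned 12-residue segments listed above, in one-letter code.
SEGMENTS = {
    "AAV": "VYLQGPIWAKIP",
    "BPV": "ISRYNPIWVKVP",
    "MVM": "VYPQGQIWDKEL",
    "PPV": "VFPNGQIWDKEL",
}


def percent_identity(first: str, second: str) -> float:
    """Percentage of identical residues between two equal-length segments."""
    if not first or len(first) != len(second):
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(first, second) if a == b)
    return 100.0 * matches / len(first)


if __name__ == "__main__":
    for name, segment in SEGMENTS.items():
        if name != "PPV":
            score = percent_identity(segment, SEGMENTS["PPV"])
            print(f"{name} vs PPV: {score:.1f}%")
```

On these segments PPV and MVM are identical at 10 of 12 positions, whereas PPV and AAV agree at only 5 of 12, which reflects the grouping of PPV with the rodent parvoviruses.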

Since this region is not present in the defective genome NADL-2*, which has been reported to be non-infective, its role could be important in producing intact viral structural proteins and therefore mature virions.

The glycine-rich region, located 70 nt downstream from the initiation codon of the VP2 protein, has been implicated in the VP2 to VP3 cleavage (Paradiso, 1984). It has been suggested (Rhode, 1985) that a run of glycines distorts the α-helical structure, creating a possible site for the proteolytic cleavage that yields the smallest capsid protein, VP3. The TPW and YNN regions (Chen et al., 1986) are also found in PPV, making these regions good candidates in the search for universal probes for detecting autonomous parvoviruses. The functional implications of these regions are not known at present.

Finally, these results are in good agreement with the antigenic relationship among autonomous parvoviruses previously reported (Mengeling et al., 1986) and confirm the inclusion of PPV in the KRV-type group (which also includes parvoviruses MVM, H-1, Lu III, FPV and CPV), making the relationship between PPV and other autonomous parvoviruses such as B19 or BPV very distant.

We thank Ana I. Sebastián for assistance in preparation of the manuscript. This work was supported in part by grants from the Comisión Interministerial para la Ciencia y Tecnología and from the Unión Española de Explosivos Rio Tinto.

REFERENCES

ARMENTROUT, R., BATES, R., BERNS, K., CARTER, B., CHOW, M., DRESSLER, D., FIFE, K., HAUSWIRTH, W., HAYWARD, G., LAVELLE, G., RHODE, S., STRAUS, S., TATTERSALL, P. & WARD, D. (1978). A standardized nomenclature for restriction endonuclease fragments. In Replication of Mammalian Parvoviruses, pp. 523-526. Edited by D. Ward & P. Tattersall. New York: Cold Spring Harbor Laboratory.

ASTELL, C. R., GARDINER, E. M. & TATTERSALL, P. (1986). DNA sequence of the lymphotropic variant of minute virus of mice, MVM(i), and its comparison with the DNA sequence of the fibrotropic prototype strain. Journal of Virology 57, 656-659.

BENSINHON, M., GABARRO-ARPA, J., EHRLICH, R. & REISS, C. (1983). Physical characteristics in eucaryotic promoters. Nucleic Acids Research 11, 4521-4540.

BERGET, S. M. (1984). Are U4 small nuclear riboproteins involved in polyadenylation? Nature, London 309, 179-182.

BIRNSTIEL, M. L., BUSSLINGER, M. & STRUB, K. (1985). Transcription termination and 3' processing: the end is in site. Cell 41, 349-359.

CARLSON, J. R., RUSHLOW, K., MAXWELL, I., MAXWELL, F., WINSTON, S. & HAHN, W. (1985). Cloning and sequence of DNA encoding structural proteins of the autonomous parvovirus feline panleukopenia virus. Journal of Virology 55, 574-582.

CHEN, K. C., SHULL, B. C., MOSES, E. A., LEDERMAN, M., STOUT, F. R. & BATES, R. C. (1986). Complete nucleotide sequence and genome organization of bovine parvovirus. Journal of Virology 60, 1085-1097.

CHOI, C. S., MOLITOR, T. W. & JOO, H. S. (1987). Inhibition of porcine parvovirus replication by empty virus particles. Archives of Virology 96, 75-87.

COTMORE, S. F. & TATTERSALL, P. (1987). The autonomously replicating parvoviruses of vertebrates. Advances in Virus Research 33, 91-174.

HALLING, S. M. & SMITH, S. (1985). Expression in Escherichia coli of multiple products from a chimaeric gene fusion: evidence for the presence of procaryotic translational control regions within eucaryotic genes. Bio/Technology 3, 715-720.

HANAHAN, D. (1983). Studies on transformation of Escherichia coli with plasmids. Journal of Molecular Biology 166, 557-580.

HAUSWIRTH, W. W. (1984). Autonomous parvovirus DNA structure and replication. In The Parvoviruses, pp. 129-150. Edited by K. I. Berns. New York: Plenum Press.

JONGENEEL, C. V., SAHLI, R., McMASTER, G. K. & HIRT, B. (1986). A precise map of splice junctions in the mRNAs of minute virus of mice, an autonomous parvovirus. Journal of Virology 59, 564-573.

JOO, H. S. & JOHNSON, R. H. (1976). Porcine parvovirus: a review. Veterinary Bulletin 46, 653-660.

KOZAK, M. (1983). Comparison of initiation of protein synthesis in procaryotes, eucaryotes and organelles. Microbiological Reviews 47, 1-45.

LANGRIDGE, J., LANGRIDGE, P. & BERGQUIST, P. L. (1980). Extraction of nucleic acids from agarose gel. Analytical Biochemistry 103, 264-271.

MANIATIS, T., FRITSCH, E. F. & SAMBROOK, J. (1982). Molecular Cloning: A Laboratory Manual. New York: Cold Spring Harbor Laboratory.

MENGELING, W. L. (1978). Prevalence of porcine parvovirus induced reproductive failure: an abortion study. Journal of the American Veterinary Medical Association 172, 1291-1294.

MENGELING, W. L. & CUTLIP, R. C. (1976). Reproductive disease experimentally induced by exposing pregnant gilts to porcine parvovirus. American Journal of Veterinary Research 37, 1393-1399.

MENGELING, W. L., PAUL, P. S., BUNN, T. O. & RIDPATH, J. F. (1986). Antigenic relationships among autonomous parvoviruses. Journal of General Virology 67, 2839-2844.

MESSING, J. (1983). New M13 vectors for cloning. Methods in Enzymology 101, 20-76.

MOLITOR, T. W., JOO, H. S. & COLLETT, M. S. (1983). Porcine parvovirus: virus purification and structural and antigenic properties of virion polypeptides. Journal of Virology 45, 842-854.

MOLITOR, T. W., JOO, H. S. & COLLETT, M. S. (1984). Porcine parvovirus DNA: characterization of the genomic and replicative form DNA of two virus isolates. Virology 137, 241-254.

MOLITOR, T. W., JOO, H. S. & COLLETT, M. S. (1985). Identification and characterization of a porcine parvovirus nonstructural polypeptide. Journal of Virology 55, 554-559.

MOUNT, S. M. (1982). A catalogue of splice junction sequences. Nucleic Acids Research 10, 459-472.

PARADISO, V. R. (1984). Identification of multiple forms of the noncapsid parvovirus protein NCVP1 in H-1 parvovirus-infected cells. Journal of Virology 52, 82-87.

PIRTLE, E. C. (1984). Titration of two porcine respiratory viruses in mammalian cell culture by direct fluorescent antibody staining. American Journal of Veterinary Research 34, 249-250.

QUEEN, C. & KORN, L. S. (1984). A comprehensive sequence analysis program for the IBM PC. Nucleic Acids Research 12, 581-599.

REED, A. P., JONES, E. V. & MILLER, T. J. (1988). Nucleotide sequence and genome organization of canine parvovirus. Journal of Virology 62, 266-276.

RHODE, S. L. (1985). Nucleotide sequence of the coat protein gene of canine parvovirus. Journal of Virology 54, 630-633.

RHODE, S. L. & PARADISO, P. K. (1983). Parvovirus genome: nucleotide sequence of H-1 and mapping of its genes by hybrid-arrested translation. Journal of Virology 45, 173-184.

SANGER, F., NICKLEN, S. & COULSON, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, U.S.A. 74, 5463-5467.

SHADE, R. O., BLUNDELL, M. C., COTMORE, S. F., TATTERSALL, P. & ASTELL, R. C. (1986). Nucleotide sequence and genome organization of human parvovirus B19 isolated from the serum of a child during aplastic crisis. Journal of Virology 58, 921-936.

SRIVASTAVA, A., LUBSKY, E. W. & BERNS, K. I. (1983). Nucleotide sequence and organization of the adeno-associated virus 2 genome. Journal of Virology 45, 555-564.

STEPHENS, R. M. (1985). A sequencers' sequence analysis package for the IBM PC. Gene Analysis Techniques 2, 67-75.

WICKENS, M. & STEPHENSON, P. (1984). Role of the conserved AAUAAA sequence: four AAUAAA point mutations prevent messenger RNA 3' end formation. Science 226, 1045-1051.

(Received 8 March 1989)