Proc. Natl. Acad. Sci. USA
Vol. 90, pp. 9330-9334, October 1993
Immunology

A hypothesis for the HLA-B27 immune dysregulation in spondyloarthropathy: Contributions from enteric organisms, B27 structure, peptides bound by B27, and convergent evolution

R. H. SCOFIELD, W. L. WARREN, G. KOELSCH, AND J. B. HARLEY†

Oklahoma Medical Research Foundation, University of Oklahoma Health Sciences Center, Department of Veterans Affairs Medical Center, Oklahoma City, OK 73104

Communicated by Susumu Ohno, April 5, 1993

Abbreviation: PIR, Protein Identification Resource.
†To whom reprint requests should be addressed at: Oklahoma Medical Research Foundation, 825 NE 13th Street, Oklahoma City, OK 73104.
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

ABSTRACT

Several human rheumatic diseases occur predominately in persons who carry the histocompatibility (HLA) class I allele B27. They have also been related to Gram-negative enteric microorganisms. In addition, the recent recovery of peptides bound to B27 has allowed an understanding of the structural requirements for their binding. Using the accumulated data base of protein sequences, we have tested a series of hypotheses. First, we have asked whether the primary amino acid sequence of the hypervariable regions of HLA-B27 shares short sequences with the proteins of Gram-negative enteric bacteria. The data demonstrate that, unique among the HLA-B molecules, the hypervariable regions of HLA-B27 unexpectedly share short peptide sequences with proteins from these bacteria. Second, we have asked whether the enteric proteins tend to satisfy the structural requirements for peptide binding to B27 in those regions of the sequence shared with B27. This hypothesis also tends to be true, especially in an allelically variable part of the B27 sequence which is predicted to bind B27 if it were to be presented as a free peptide. We conclude that HLA-B27 and enteric Gram-negative bacteria have undergone a previously unappreciated form of convergent evolution which may be important in the process leading to these rheumatic diseases. Moreover, the regions of the enteric bacterial proteins which are contiguous with the short sequences shared with B27 tend to have structures which are also predicted to bind B27. These observations suggest a mechanism for autoimmunity and lead to the prediction that the B27-associated diseases are mediated by a subset of T-cell receptors, B27, and the peptides bound by B27.

Ankylosing spondylitis, Reiter syndrome, and other reactive arthritides are referred to as spondyloarthropathies. They affect the spine, joints, eyes, and skin and are predominately found in individuals with the B27 class I histocompatibility allele (1). B27 is also associated with idiopathic anterior uveitis, which is likely to share many immunologic features with the spondyloarthropathies.

Patients affected by spondyloarthropathy tend to harbor a variety of Gram-negative enteric bacteria, including Yersinia, Shigella, Salmonella, and Klebsiella (1-3). Several lines of immunologic evidence have associated these bacteria with the spondyloarthropathies (4-6). For example, data implicating Klebsiella pneumoniae include crossreactivity of an anti-B27 monoclonal antibody with Klebsiella antigens. Immune associations between Klebsiella and ankylosing spondylitis have not been consistently found, however (7-11).

A possible structural explanation for the association of the spondyloarthropathies and Gram-negative bacteria has been proposed, based upon the six consecutive amino acids, QTDRED, shared between HLA-B*2705 and K. pneumoniae nitrogenase reductase (12) (also in Fig. 1). Antibodies in the sera of some patients with ankylosing spondylitis bind the region of both B*2705 and the nitrogenase which contains the shared sequence (12, 13). Subsequently, two other shorter sequences shared by B27 and Gram-negative enteric bacteria associated with spondyloarthropathies have been identified. The Yersinia outer protein 1 (YOP1) shares four consecutive amino acids, QTDR, with the first hypervariable region of B27 (14). Shigella flexneri shares a pentapeptide, AQTDR, with the first hypervariable region (15). These peptides are also reputed to be bound by some spondyloarthropathy patient sera (16, 17).

We first sought to determine whether HLA-B27 tended to be related to proteins of the Gram-negative enteric bacteria. The hypervariable regions of the HLA-B molecules were compared with the known sequenced proteins for short, identical, shared consecutive amino acids. We found that, unique to the HLA-B proteins, HLA-B27 shared an unexpected number of hexapeptides and pentapeptides with Gram-negative bacterial proteins. These data suggest convergent evolution between HLA-B27 and these proteins which is likely to contribute to the powerful association of HLA-B27 and the spondyloarthropathies.

In addition, Wiley et al. (18, 19) recently determined the sequence of endogenous peptides found in the binding cleft of crystallized B27, and Ohno (ref. 20 and personal communication) expanded on these sequences to develop a B27 binding motif. The motif includes an invariant arginine in position 2 of a nonapeptide. B27 contains a nonapeptide sequence that fits the motif at positions 168-176 within the third hypervariable region. Thus, we determined whether proteins which have sequence identity with B27 also contained a nonapeptide which fits the B27-binding motif. We found that proteins from Gram-negative enteric organisms contain a binding motif at their site of sequence identity significantly more often than do proteins from other organisms.

METHODS

Computer Searches. Sequences of the Protein Identification Resource (PIR) data bank (release 27, December 1990) were scanned for short segments of identity to the sequence of HLA B27 (entry Hlhub2). For this purpose we designed Fortran programs linked to the Genetics Computer Group (21) software package (version 6.2) operating on a VAX 8350. Existing sequence data bank searching programs were designed to uncover sequences related to a query sequence by divergent evolution. Therefore, a modification to a Genetics Computer Group software module was required to demonstrate the occurrence of sequences possibly related to a query
sequence by convergent evolution through the conservation of short sequences. These programs were used to determine short consecutive amino acid matches, 4 to 6 amino acids in length, between the HLA alleles and the data base proteins. No substitutions were allowed, so that the matched sequences were required to be identical in amino acid composition. Peptide sequences related to the query sequence by divergent evolution are expected to have high numbers of short sequence identities with the other sequences of the data bank. Given that we wished to find amino acid identity between HLA-B alleles and proteins not related to them by divergent evolution, we eliminated the 3119 sequences of the immunoglobulin superfamily from the search of 28,796 possible sequences of the PIR data bank. The algorithm of Wilbur and Lipman (22) was used to determine the number of k-tuples (short sequences, of length 4-6 in this study) in common between the query and a given data bank sequence.

Nonapeptides which meet the criteria for binding to HLA-B27 as defined by Wiley (18, 19) and expanded upon by Ohno (ref. 20 and personal communication) were sought in the data base proteins by using the existing Genetics Computer Group programs (21). The patterns queried were as follows: (K,R)RXXXXXX(A,L,Y) and (A,G,F,L)RXXXXXX(K,R), where X is any amino acid.
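The exact-match search described above can be illustrated with a minimal Python sketch. It is not the authors' Fortran code and does not implement the Wilbur and Lipman algorithm; it simply indexes every overlapping k-tuple of a target sequence and reports those that also occur in the query. The function names and the `toy_protein` sequence are hypothetical and chosen only for illustration; the HV1 sequence is taken from Fig. 1.

```python
from typing import Dict, List, Tuple


def kmer_index(sequence: str, k: int) -> Dict[str, List[int]]:
    """Map every overlapping k-mer in `sequence` to its 0-based start positions."""
    index: Dict[str, List[int]] = {}
    for start in range(len(sequence) - k + 1):
        index.setdefault(sequence[start:start + k], []).append(start)
    return index


def shared_kmers(query: str, target: str, k: int) -> List[Tuple[str, int, int]]:
    """Return (k-mer, query_start, target_start) for every exact shared k-mer.

    No substitutions are allowed: a hit needs k identical consecutive residues.
    """
    target_index = kmer_index(target, k)
    hits = []
    for q_start in range(len(query) - k + 1):
        kmer = query[q_start:q_start + k]
        for t_start in target_index.get(kmer, []):
            hits.append((kmer, q_start, t_start))
    return hits


# First hypervariable region of HLA-B*2705 (residues 68-83, from Fig. 1).
HV1 = "KAKAQTDREDLRTLLR"
# Hypothetical target: a made-up sequence that embeds the QTDRED hexapeptide.
toy_protein = "MSGLNQTDREDELIVAG"

hexa_hits = shared_kmers(HV1, toy_protein, k=6)
penta_hits = shared_kmers(HV1, toy_protein, k=5)
```

Start positions are 0-based offsets within each string; adding 68 to a query offset gives the residue number within HV1. A search over a whole data bank would repeat `shared_kmers` for each entry after excluding the immunoglobulin superfamily sequences.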

Statistics. The number of amino acid sequence identities found for B*2705 was compared with the composition of the data base and with the number of identities found for other HLA-B proteins by χ² analysis of a two-by-two contingency table. Calculations were performed using the Statistical Analysis System 5.6 software package (23).
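As a rough illustration of the two-by-two χ² analysis, the sketch below computes the Pearson statistic and its 1-degree-of-freedom P value with the Python standard library. It is an assumption-laden stand-in for the SAS calculation: the paper does not say whether a continuity correction was applied, and none is used here. The function names are hypothetical; the counts are those quoted in the Results for proteins carrying the B27-binding motif (9 of 47 versus 16 of 239).

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_1df(chi2: float) -> float:
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Counts quoted in the Results: 9 of 47 Gram-negative matches carry the motif,
# against 16 of 239 matches from other organisms.
chi2 = chi_square_2x2(9, 47 - 9, 16, 239 - 16)
p = p_value_1df(chi2)
```

For these counts the uncorrected statistic comes out close to the χ² = 7.6 reported in the Results. Other published values need not be reproduced by this sketch, since they depend on exactly how the counts were partitioned.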

RESULTS

Given how commonly short sequences are shared between unrelated proteins (24-26), we sought to determine whether a general relationship of primary amino acid sequence exists between HLA-B27 and the proteins of enteric bacteria. To that end, we have searched the PIR protein data base (release 27) for short sequence identity to HLA-B27 and other HLA-B proteins. Ten hexapeptide matches with the hypervariable regions of B*2705 have been found (Table 1). Five of these (50%) are from enteric organisms. In contrast, only 2581 of the 25,677 non-immunoglobulin superfamily sequences (10.1%) in the PIR data base are from Gram-negative enteric bacteria (Table 2). The tendency of proteins from Gram-negative organisms to share a hexapeptide sequence with the hypervariable regions of B*2705 is unexpected (P < 0.002, Fisher's exact test).

Table 1. Proteins sharing at least six consecutive amino acids with the hypervariable regions (HV) of HLA-B27

HV   Protein                           Organism
1    Nitrogenase reductase             Klebsiella pneumoniae
1    Histidyl-PO4 aminotransferase     Salmonella typhimurium
3    Site-specific methyltransferase   Pseudomonas aeruginosa
3    Lethal factor precursor           Bacillus anthracis
3    Glutamate synthetase              Escherichia coli
1    pinF1                             Agrobacterium
1    T2                                Epstein-Barr virus
1    CurC                              Streptomyces curacoi
2    pol polyprotein                   Avian reticuloendotheliosis virus
3    Monophenol monooxygenase          Human melanoma cells

Searches were performed so that every possible overlapping hexapeptide of the three hypervariable regions (HV) of HLA-B*2705 (PIR access code hlhub2) was compared to every possible overlapping hexapeptide of each of the proteins within the PIR data base (release 27). Sequences which were part of the immunoglobulin superfamily were eliminated from the data base, resulting in 25,677 sequences which were compared with HLA-B*2705. The PIR access codes for the proteins which had a hexapeptide shared with HLA-B*2705 are, in order from the table: nikbfp, xnebhc, xyps7a, jq0032, b29617, a33073, a24938, a32306, fovdar, and yrmsb6. E. coli glutamate synthetase (b29617) shares 8 consecutive amino acids with HLA-B*2705 in, and adjacent to, the third hypervariable region. The shared hexapeptide of each sequence is identified by its PIR access code in Fig. 1.

The protein sequence data base was also examined for hexapeptides shared with the hypervariable region sequences of HLA-B proteins other than HLA-B27. Only 23 of 163 (14.1%) shared hexapeptides are from enteric bacteria, a percentage that is not different from the percentage (10.1%) of the PIR data base (P > 0.05) but is different from the 50% for B27 (P = 0.01) (Table 2). Each HLA-B protein was also examined individually, and no protein, other than B27, had a propensity to share hexapeptides with Gram-negative microorganisms (data not shown).

At the pentapeptide level the number of shared peptides is much larger. There are 286 pentapeptide or longer matches with the hypervariable regions of B*2705 in the PIR data base, of which 47 (16.5%) are from Gram-negative enteric bacteria (Table 2). Relative to the PIR data base this is unlikely to have occurred by chance (P < 0.001). Also, compared to HLA-B*0701, as an example of a typical HLA-B protein, where 28 of 308 (9.1%) pentapeptides shared with PIR sequences are from Gram-negative enteric bacteria, the tendency of B*2705 to share pentapeptides with Gram-negative enteric bacteria is unexpected (P = 0.005). When searches for pentapeptide amino acid sequence identities were carried out for all the HLA-B proteins, we found that there were 5325 matches in the PIR data base to the hypervariable regions. Among these were 492 (9.2%) from proteins of Gram-negative enteric bacteria. This was not different than the representation of these proteins in the PIR data base but was significantly different than that found for B*2705 (P < 10⁻⁹).

The PIR data base contains entries of some completely

identical sequences. Sometimes, these are entries of the partial sequence. There are, however, no duplicate sequences contained among those with a hexapeptide match to B27. For the presentation above and in Fig. 1, the duplications of identical sequences with pentapeptide matches have been included. The three sets of duplicate sequences among the shared pentapeptides of the hypervariable regions of B*2705 and enteric organisms are as follows (identified by access code): niavf and a35405; a25103, s08320, and s10429; and a34192 and syepg. There are 23 sets of duplicate sequences in the pentapeptides shared between B*2705 and the other sequences in the modified PIR data base. The statistical significance of the results is not meaningfully changed by removing these duplicates from consideration.

Table 2. Number of six and five consecutive amino acid identities found among the HLA-B molecule hypervariable regions when compared with protein sequences contained in the PIR data base

                                               Hexapeptide matches       Pentapeptide matches
Organisms                                      HLA-B27   Other HLA-B     HLA-B27   Other HLA-B
Gram-negative bacteria (total in PIR = 2581)   5*        23              47†       492
Other organisms (total in PIR = 23,096)        5         140             239       4823

Searches were performed as described for Table 1. The other HLA-B molecules included in these searches are given here according to the 1990 World Health Organization nomenclature and are as follows: B*0701, B*0702, B*0801, B*1301, B*1302, B*1401, B*1402, B*1501, B*1801, B*3501, B*3502, B*3701, B*3801, B*3901, B*4001, B*4002, B*4101, B*4201, B*4401, B*4402, B*4601, B*4701, B*4901, B*5101, B*5201, B*5301, B*5701, B*5801, and B*7801. B*3801 and B*3901 were excluded from the search because only partial sequences are available for these proteins. The HLA serologic specificity, previous nomenclature, and sequences of the HLA-B proteins are available elsewhere (27, 28).
*P = 0.01 comparing the number of shared hexapeptides for B27 with the number for the other HLA-B proteins by Fisher's exact test. Hexapeptides were partitioned according to whether or not the protein containing the hexapeptide was found in an enteric bacterium.
†P = 10⁻⁹, χ² = 69. Pentapeptide amino acid matches to the HLA-B27 hypervariable regions were partitioned in the same manner as were the hexapeptides.
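The Fisher's exact comparison of hexapeptide counts in Table 2 can be sketched with the standard library alone. This is a minimal two-sided implementation under the usual "sum all tables no more probable than the observed one" convention, which is an assumption; the paper does not state which convention or software routine was used. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hexapeptide counts from Table 2: rows are HLA-B27 and other HLA-B proteins,
# columns are Gram-negative bacteria and other organisms.
p_hexa = fisher_exact_two_sided(5, 5, 23, 140)
```

With the Table 2 hexapeptide counts this sketch gives a P value of roughly 0.01, in line with the footnote to the table.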

Finally, we searched for a B27-binding motif (refs. 18-20; S. Ohno, personal communication) in the peptides which had sequence identity to B27. Each protein within the data base that had sequence identity with B27 was searched for a nonapeptide which fit the B27-binding motif and which was also contiguous with the amino acid identity to B27. We found that proteins from Gram-negative organisms were more likely to contain a peptide conforming to the B27-binding motif which overlapped with the sequence identity to B27 than other proteins. This relationship held true when all the hypervariable regions of B27 were considered: 9 of 47 Gram-negative bacterial proteins with a match had the binding motif, while only 16 of 239 of the matching proteins from other origins also satisfied this motif (χ² = 7.6, P < 0.01).

Of particular interest, we noted that the B27 molecule contained a nonapeptide beginning at amino acid residue 168 that conformed to the requirements for binding to and presentation by B27 in conjunction with β2-microglobulin. We found that, of Gram-negative enteric organisms with a sequence match at B27 residues 168-176, 5 of 7 had a nonapeptide consistent with binding by B27 which was at the site of the match to the B27 molecule. On the other hand, only 6 of 49 proteins not from Gram-negative enteric organisms with short sequence similarity to B27 residues 168-176 also had a binding motif sequence (χ² = 19.6, P = 0.0002). These sequences with both sequence identity at B27 residues 168-176 and a binding motif are shown in Table 3 along with their protein of origin.
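The motif search can be sketched as a regular-expression scan. The two patterns are those given in the Methods, (K,R)RXXXXXX(A,L,Y) and (A,G,F,L)RXXXXXX(K,R); the translation into Python regex syntax, the function name, and the `first_residue` parameter are assumptions made for illustration, and the original work used Genetics Computer Group programs rather than this code. The HV3 sequence is taken from Fig. 1.

```python
import re
from typing import List, Tuple

# (K,R)RXXXXXX(A,L,Y) or (A,G,F,L)RXXXXXX(K,R); X is any amino acid letter.
# The lookahead lets overlapping nonapeptides be reported.
B27_MOTIF = re.compile(r"(?=([KR]R[A-Z]{6}[ALY]|[AGFL]R[A-Z]{6}[KR]))")


def find_b27_motifs(sequence: str, first_residue: int = 1) -> List[Tuple[int, str]]:
    """Return (residue number, nonapeptide) for every motif match in `sequence`."""
    return [(m.start() + first_residue, m.group(1)) for m in B27_MOTIF.finditer(sequence)]


# Third hypervariable region of HLA-B*2705, residues 166-184 (from Fig. 1).
HV3 = "EWLRRYLENGKETLQRVDP"
hits = find_b27_motifs(HV3, first_residue=166)
```

On this sequence the sketch reports a single nonapeptide, LRRYLENGK, starting at residue 168, which agrees with the nonapeptide at positions 168-176 described above.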

[Figure 1: alignment diagram. Hypervariable region sequences of HLA-B*2705 shown in the figure: HV1 (residues 68-83) KAKAQTDREDLRTLLR; HV2 (residues 105-120) GPDGRLLRGYHQDAYG; HV3 (residues 166-184) EWLRRYLENGKETLQRVDP.]

FIG. 1. Hexapeptide and pentapeptide sequence identities to the hypervariable regions (HV) of HLA-B*2705 and Gram-negative enteric organisms found in the PIR data base. The amino acid sequence of each HV is given on the center line while the residue position is given along the top and bottom. The access code for each protein from the data base overlies its shared sequence with B27 and is underlined if the protein contains a sequence which satisfies the B27-binding motif at the site of the shared sequence. At the top, the proteins of origin for the shared hexapeptides are listed in order from right to left from the Gram-negative sequences in Table 1. HV 2 and 3 are extended (in lightface type) to encompass additional amino acids shared by the octapeptide of b29617 and the hexapeptide of zpecp3. The protein sequences containing pentapeptide identities with the hypervariable regions of B*2705 are as follows: for HV 1, jq0559, E. coli plasmid RK2 KfrA protein; ju0135, Acetobacter sp. aldehyde dehydrogenase; deecpe, E. coli phosphoribosylaminoimidazole carboxylase; s01424, Frankia sp. nitrogenase; niavf, Azotobacter vinelandii nitrogenase iron protein; a25103, Azotobacter chroococcum nitrogenase; a35405, A. vinelandii nitrogenase; s08320, A. vinelandii nitrogenase; s10429, A. chroococcum nitrogenase iron protein; s11793, Pseudomonas aeruginosa phosphate-specific porin P; syech, E. coli histidine-tRNA ligase; f27733, A. vinelandii protein 5; s11886, E. coli fimD protein; qqecfj, E. coli pho regulon 26-kDa protein; for HV 2, zpecp3, E. coli penicillin-binding protein 3 precursor; a32354, Bacillus subtilis CTP synthase; s09214, Pseudomonas syringae acetyltransferase; s00836, E. coli plasmid MccB17 McbE protein; s06302, E. coli transposon Tn2501; s00252, E. coli shikimate dehydrogenase; ju0380, E. coli sensor protein PhoQ; vzebpt, Salmonella typhimurium virulence membrane protein PhoQ; s07355, E. coli ornithine carbamoyltransferase; s00302, E. coli shikimate dehydrogenase protein; a28214, Pseudomonas diminuta phosphotriesterase; a34192, E. coli phosphoribosylformylglycinamidine synthase; a31862, E. coli phosphoribosylformylglycinamidine synthase; nqeca, E. coli 1,4-α-glucan branching enzyme; for HV 3, s01840, K. pneumoniae nitrogenase molybdenum-iron protein NifN; jq0612, E. coli hypothetical protein; js0383, Bacillus megaterium 26.2-kDa protein; a33465, Haemophilus influenzae lic-1 phase/variation protein D; mmecof, E. coli outer membrane FeaD protein; s00920, E. coli site-specific methyltransferase; s09207, E. coli β-galactosidase; ncecxv, E. coli exodeoxyribonuclease; xuecag, E. coli glycerol-3-phosphate acetyltransferase; s04021, Shigella dysenteriae shiga toxin A precursor; a28626, S. dysenteriae shiga toxin A precursor; xubph9, bacteriophage H19B shigalike toxin A precursor; a32360, bacteriophage 933W shigalike toxin II variant A chain precursor; s01032, bacteriophage 933W shigalike toxin II chain A precursor.

Table 3. Proteins from Gram-negative enteric organisms which share consecutive amino acid sequence with HLA-B27 and have the binding motif for HLA-B27

Protein      Sequence
B27 HV 3     EWLRRYLENGKETLQRVDP
xyps7a       LPRLRRYLEARRDVI
s01840       DIEWLRRCVEArGLQP
jq0612       ADARRYLEIGATFVA
js0383       ARVTA FLLE
mmecof       TGSYRYSDDI§GRTG

The motif in each protein is in boldface and the sequence identity with B27 is underlined. The proteins are named by their PIR access codes. The proteins and organisms are as follows: xyps7a, Pseudomonas aeruginosa site-specific methyltransferase (adenine specific); s01840, K. pneumoniae nitrogenase molybdenum-iron protein NifN; jq0612, E. coli hypothetical protein 168; js0383, Bacillus megaterium 26.2-kDa protein; mmecof, E. coli outer membrane FeaD protein.

DISCUSSION

We have performed a systematic search of the known protein sequences and determined the relatedness of these sequences to the hypervariable regions of HLA-B proteins. The data reveal that unique to this set of class I molecules, HLA-B*2705 shares short amino acid sequences with Gram-negative enteric bacteria. At the hexapeptide length, there were 10 proteins in the PIR data base which had a common sequence with one of the three hypervariable regions of B*2705. Half of these proteins were from enteric bacteria. The remaining HLA-B proteins shared hexapeptides with Gram-negative organisms no more commonly than expected by chance, based on the number of proteins from Gram-negative organisms relative to the total number of proteins in the data bank. This analysis has been extended to pentapeptides, and the B*2705 primary amino acid sequence continues to be related to the primary sequence of proteins from Gram-negative bacteria. Forty-seven of 286 pentapeptide matches to B*2705 are from proteins of Gram-negative bacteria. Again, this is significantly different from the composition of the PIR data base and from the number of Gram-negative organism protein matches to the other HLA-B proteins.

On average, over 80% of the hexapeptide sequences shared

by B27 and enteric organisms as well as nearly 40% of pentapeptide or longer sequences are in excess of the proportion that would be expected by chance. There may be other structural or functional properties shared by the proteins that share short sequences with B27. There are 20 shared sequences from E. coli, 8 are nitrogenases, 5 are Shigella toxins, and 4 are methyl- or acetyltransferases. E. coli is, of course, a ubiquitous inhabitant of the human gut and its genome has been extensively sequenced. When E. coli is removed from consideration a significant relationship of B*2705 with non-E. coli enteric bacteria is maintained (data not shown).

Though genetic adaptation is presumed to be more rapid in bacteria than in humans, an evolutionary contribution from humans must have occurred, since B27 is unique among the HLA-B proteins in its relationship to proteins of the Gram-negative enteric bacteria. Examples of a like nature involving structural similarity between host immune system proteins and pathogen proteins exist (29), including pathogen proteins which share short primary amino acid sequences for structural and functional advantage (30-32). Our analysis extends previous observations and establishes the existence of a unique structural relationship between HLA-B27 and a class of microorganisms which are potentially related to the spondyloarthropathies. The shared sequences themselves (Fig. 1 and Table 3), as well as the relationship of the HLA-B27 protein to the spondyloarthropathies and to the enteric microorganisms, cannot be simply fortuitous. Rather, they must be a product of selective evolutionary pressures buried deep within the history of the symbiotic relationship of humans and their enteric microorganisms which culminate in the sharing of the short peptide sequences described herein.

These observations provide an important link for an explanation of the tripartite relationship between B27, the spondyloarthropathies, and enteric bacteria. The central role of B27 and the inflammatory nature of the spondyloarthropathies strongly suggest that the convergent evolution implied by our data has an immunologic origin. While the association between B27 and the spondyloarthropathies is among the strongest of those known between HLA and disease, previous work neither suggests the molecular mechanism nor generally establishes a role for B27 rather than a closely linked gene product. The present work is a comprehensive and systematic comparison of the HLA-B sequences to the known sequenced proteins and is, along with work utilizing rats transgenic for B27 (33), the best evidence available to date that B27 itself is involved in the pathogenesis of the diseases with which it is associated.

The shared sequences limit the potential mechanisms of disease which should be considered. Indeed, precedents exist for the potential pathologic relevance of sequence similarity between HLA antigens and microorganisms. The demonstration that protection from lethal malaria is related to particular HLA alleles may, in some respect, be an analogous example (34, 35). A hexapeptide shared between the Epstein-Barr virus gp110 glycoprotein and the DR4 sequence most associated with risk for rheumatoid arthritis has been described (36). Peptides containing the shared sequence from both proteins not only are recognized by human peripheral blood T cells but also appear to stimulate some of the same T cells. More recently, the same investigators have reported a sequence match between the same DR4 sequence in the third hypervariable region and an E. coli heat shock protein (37).

Other examples of histocompatibility structures from self being recognized by syngeneic T cells or being crossreactive with heteroantigen-responsive T cells are known (38-41). For example, in experimental autoimmune encephalomyelitis (EAE), responding T cells in animals immunized with myelin basic protein preferentially utilize the variable-region protein Vβ8.2 in their T-cell receptors. Recent work has shown that immunization of animals with established disease with peptides from Vβ8.2 results in a remission of EAE (49). Thus, in the EAE model, peptides from immunoregulatory proteins can modify the immune response. Ohno (20) has found that T-cell epitopes from sites of pathogen proteins which share sequence with the host are recognized by a restricted set of HLA class I molecules. In addition, T-cell epitopes which do not share sequence with the host are recognized by many class I molecules. These examples, and results from the present study, suggest a general model of autoimmunity in which the immune response is modulated when immunoregulatory elements are mimicked. By structurally imitating critical molecular decision-making machinery of the immune response, the subsequent immunologic interactions are modulated, leading to immune dysregulation and the consequent pathologic expression of disease.

Past data as well as data presented herein which link the spondyloarthropathies, HLA-B27 and enteric bacteria do produce a paradox. HLA-B27 and the products of the other major histocompatibility complex (MHC) class I alleles bind and present peptides which are synthesized intracellularly. Thus HLA-B27 would not normally be involved in the immune response to extracellular organisms such as enteric bacteria. However, recent work shows that, perhaps unique among class I molecules, B27 is present on the cell surface without a peptide in its binding groove much of the time. This may allow B27 to bind extracellular peptides (48). In addition, recent data have demonstrated that bacteria move from the gut lumen to extraintestinal sites by penetrating an intact intestinal tract (42, 43). Extraintestinal translocation of bacteria may also involve transitory intracellular location in macrophages (43-45). The link among the diseases, HLA-B27, and enteric organisms may be completed if MHC class I molecules can bind bacterially derived peptides during an intracellular phase.

The relevance of our observations is also dependent upon whether the B27 sequence LRRYLENGK, which is predicted to bind B27, is actually processed and presented in the B27 peptide-binding cleft. Preliminary data suggest that this peptide is indeed bound by B27 (R.H.S., W.L.W., and J.B.H., unpublished observations). This structural relationship does predict a mechanism for T-cell-mediated autoimmunity by which the variable regions of HLA antigens become peptides bound by HLA antigens and then presented to T-cell receptors. If this is a more general phenomenon and explains some of the other, albeit weaker, HLA associations with idiopathic inflammatory diseases, then a subset of HLA proteins may contribute peptides that are bound by another subset of HLA proteins. As possible examples of such a mechanism, consider the gene interaction of DQ1/DQ2 heterozygotes associated with anti-Ro autoantibodies and of DR3/DR4 heterozygotes with insulin-dependent diabetes (46, 47). Here one HLA antigen could contribute the peptide for binding to the other.
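As a rough illustration of what "predicted to bind B27" can mean, the sketch below screens peptides against a deliberately simplified motif: nine residues with arginine at position 2, the dominant feature reported for peptides eluted from B27 (19). This simplified rule and the function names are assumptions made for illustration; the sketch is not the prediction method used in this study and ignores all other anchor preferences.

```python
def fits_simplified_b27_motif(peptide):
    """True if peptide is a nonamer with Arg (R) at position 2 (simplified assumption)."""
    return len(peptide) == 9 and peptide[1] == "R"


def candidate_nonamers(sequence):
    """Slide a 9-residue window over sequence; keep windows with R at position 2."""
    return [
        (start, sequence[start:start + 9])
        for start in range(len(sequence) - 8)
        if sequence[start + 1] == "R"
    ]


# The B27-derived sequence discussed in the text.
b27_peptide = "LRRYLENGK"
print(fits_simplified_b27_motif(b27_peptide))
print(candidate_nonamers(b27_peptide))
```

Passing such a filter is at most a necessary condition under the stated assumption; whether a peptide is actually processed and presented remains an experimental question, as noted above.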

Within the context of current immunologic knowledge, the observations of the present study lead to an obvious scenario for the autoimmune pathogenesis of the spondyloarthropathies which requires the following steps. First, the host is tolerized to its own B27 molecule as well as to the endogenous peptides that are bound by B27. Then the host is exposed to the enteric organisms that partially imitate B27 in ways that lead these particular enteric antigenic structures to become immunogenic. Those enteric peptides that share sequences with B27 and that satisfy the requirements to bind B27 hold the potential of breaking tolerance to B27 sequences. Once tolerance to the self-peptides of B27 is broken, the requirements are met to establish a chronic inflammatory condition.

The authors appreciate the suggestions of Barbara R. Neas and the members of the Management and Information Systems division of the Oklahoma Medical Research Foundation. Also, we appreciate the financial support from the Oklahoma Chapter of the Arthritis Foundation, the U.S. Department of Veterans Affairs, and the National Institutes of Health (Grants AR39577, AI24717, AI21568, AI31584, and AR01844).

1. Calin, A. (1989) in Textbook of Rheumatology, eds. Kelley, W. N., Harris, E. D., Ruddy, S. & Sledge, C. B. (Saunders, Philadelphia), 3rd Ed., pp. 1023-1039.

2. Rayborne, R. B., Williams, K. M., Cheng, X. K. & Yu, D. T. Y. (1990) Scand. J. Rheumatol. 87, S134-S139.

3. Ebringer, R. W., Cadwell, D. R., Cowling, P. & Ebringer, A. (1978) Ann. Rheum. Dis. 37, 146-151.

4. Ogawasara, M., Kono, D. H. & Yu, D. T. Y. (1986) Infect. Immunol. 51, 901-908.

5. Ebringer, A., Cowling, P., Ngwa-Suh, N., James, D. C. O. & Ebringer, R. B. (1976) in HLA and Disease, eds. Dausset, J. & Svejaard, J. (INSERM, Paris), p. 27.

6. Geczy, A. F., Alexander, K. I. & Bashir, H. V. (1980) Nature (London) 283, 782-784.

7. Cameron, F. H., Russell, P. H., Easter, J. F., Wakefield, D. & Ziff, M. (1987) Arthritis Rheum. 30, 300-305.

8. Kinsella, T., Fritzler, M. & Lewkonta, R. (1986) Arthritis Rheum. 29, 358-362.

9. Terasaki, P. & Yu, D. T. Y. (1987) Arthritis Rheum. 30, 353-354.

10. Cavender, D. & Ziff, M. (1986) Arthritis Rheum. 29, 352-357.

11. Tsuchiya, N., Husby, G. & Williams, R. C., Jr. (1989) Clin. Exp. Immunol. 76, 354-360.

12. Schwimmbeck, P. L., Yu, D. T. Y. & Oldstone, M. B. (1987) J. Exp. Med. 166, 173-181.

13. Ewing, C., Ebringer, R., Tribbick, G. & Geysen, H. M. (1987) J. Exp. Med. 171, 1635-1647.

14. Lahasmaa, R., Skurnik, M., Granfors, K., Mottonen, T., Saaria, R. & Toivanen, P. (1990) Scand. J. Rheumatol. 88, S70-S71.

15. Stieglitz, H., Fosmire, S. & Lipskey, P. (1989) Arthritis Rheum. 32, 937-946.

16. Tsuchiya, N., Husby, G., Williams, R. C., Jr., Stieglitz, H., Lipsky, P. & Inman, R. D. (1990) J. Clin. Invest. 86, 1193-1203.

17. Tsuchiya, N., Husby, G. & Williams, R. C., Jr. (1990) Clin. Exp. Immunol. 82, 493-498.

18. Madden, D. R., Gorga, J. C., Strominger, J. L. & Wiley, D. C. (1991) Nature (London) 353, 321-325.

19. Jardetzky, T. S., Lane, W. S., Robinson, R. A., Madden, D. R. & Wiley, D. C. (1991) Nature (London) 353, 326-329.

20. Ohno, S. (1992) Proc. Natl. Acad. Sci. USA 89, 4643-4647.

21. Devereaux, J., Haeberli, P. & Smithies, O. (1984) Nucleic Acids Res. 12, 387-395.

22. Wilbur, W. J. & Lipman, D. J. (1983) Proc. Natl. Acad. Sci. USA 80, 726-730.

23. SAS Institute (1985) SAS User's Guide: Statistics (SAS Inst., Cary, NC).

24. Ohno, S. (1991) Proc. Natl. Acad. Sci. USA 88, 3065-3068.

25. Harley, J. B. & Scofield, R. H. (1991) J. Clin. Immunol. 11, 297-316.

26. Scofield, R. H. & Harley, J. B. (1991) Proc. Natl. Acad. Sci. USA 88, 3343-3347.

27. Bodmer, J. G., Marsh, S. G. E., Albert, E. D., Bodmer, W. F., Dupont, B., Erlich, H. A., Mach, B., Mayr, W. R., Parham, P., Sasazuki, T., Schreuder, G. M. T., Strominger, J. L., Svejgaard, A. & Terasaki, P. I. (1991) Hum. Immunol. 31, 186-194.

28. Marsh, S. G. E. & Bodmer, J. G. (1991) Hum. Immunol. 31, 207-227.

29. Cooper, N. R. (1991) Immunol. Today 12, 327-331.

30. Nemmerow, G. R., Houghton, R. A., Moore, M. D. & Cooper, N. R. (1989) Cell 56, 369-377.

31. Russel, D. G. & Talamas-Rohana, P. (1990) Immunol. Today 10, 328-333.

32. Bullock, W. E. & Wright, S. D. (1987) J. Exp. Med. 165, 195-210.

33. Hammer, R. E., Maika, S. D., Richardson, J. A., Tang, J.-P. & Taurog, J. D. (1990) Cell 63, 1099-1112.

34. Hill, A. B. S., Allsop, C. E. M., Kwiatkowski, D., Anstey, M. N., Twumasi, P., Rowe, P. A., Bennett, S., Brewster, D., McMichael, A. J. & Greenwood, B. M. (1991) Nature (London) 352, 595-600.

35. Howard, J. C. (1991) Nature (London) 352, 565-567.

36. Roudier, J., Petersen, J., Rhodes, G. H., Luka, J. & Carson, D. A. (1989) Proc. Natl. Acad. Sci. USA 86, 5104-5108.

37. Albani, S., Tuckwell, J. E., Esparza, L., Carson, D. A. & Roudier, J. (1992) J. Clin. Invest. 89, 327-331.

38. Anderson, D. C., van Schooten, W. C. A., Barry, M. E., Janson, A. A. M., Buchanan, J. M. & deVries, R. P. P. (1988) Science 242, 259-261.

39. Saskia do Koster, H., Anderson, D. C. & Termijtelein, A. (1989) J. Exp. Med. 169, 1191-1196.

40. Roudier, J., Sette, A., Lamont, A., Albani, S., Karras, J. G. & Carson, D. A. (1991) Eur. J. Immunol. 21, 2063-2067.

41. Agrawal, B., Manickasundari, M., Fraga, E. & Singh, B. (1991) J. Immunol. 147, 338-390.

42. Wells, C. L., Jechorek, R. P. & Gillingham, K. (1991) Arch. Surg. 126, 247-252.

43. Wells, C. L. & Erlandsen, S. L. (1991) Infect. Immun. 59, 4693-4697.

44. Wells, C. L., Jechorek, R. P. & Erlandsen, S. L. (1991) J. Infect. Dis. 162, 82-90.

45. Wells, C. L., Maddaus, M. A. & Simmons, R. L. (1987) Arch. Surg. 122, 48-53.

46. Fujisaku, A., Frank, M. B., Neas, B., Reichlin, M. & Harley, J. B. (1990) J. Clin. Invest. 86, 606-611.

47. Tiwari, J. L. & Terasaki, P. I. (1985) Juvenile Diabetes Mellitus (Springer, New York), pp. 33, 36, and 185-210.

48. Benjamin, R. J., Madrigal, J. A. & Parham, P. (1991) Nature (London) 351, 74-77.

49. Offner, H., Hashim, G. A. & Vandenbark, A. A. (1991) Science 251, 430-432.
