We have undertaken the inventory and assembly of the ABC transporter systems in the complete genome of Bacillus subtilis. We combined the identification of the three protein partners that compose an ABC transporter (Nucleotide Binding Domain (NBD), Membrane Spanning Domain (MSD) and Solute-Binding Protein (SBP)) with constraints on the genetic organization. This strategy allowed the identification of 86 NBDs in 78 proteins, 103 MSD proteins and 37 solute-binding proteins. The analysis of transcriptional units allowed the reconstruction of 59 ABC transporters, each of which includes at least one NBD and one MSD. A particular class of five dimeric ATPases was not associated with MSD partners and is assumed to be involved either in macrolide resistance or in the regulation of translation elongation. In addition, we detected five genes encoding ATPases with no MSD-encoding gene in their neighborhood and 11 operons that encode only the membrane and solute-binding proteins. On the basis of similarities, three ATP-binding proteins are proposed to energize ten incomplete systems, suggesting that one ATPase may be recruited by more than one transporter. Finally, we estimate that the B. subtilis genome encodes at least 78 ABC transporters, which have been divided into 38 importers and 40 extruders. The ABC systems have been further classified into 11 sub-families according to the tree obtained from the NBDs and the clustering of the MSDs and the SBPs. Comparisons with Escherichia coli show that the extruders are overrepresented in B. subtilis, corresponding to an expansion of the sub-families of antibiotic and drug resistance systems.
Table: summary of the different transport systems classified according to the sub-families.
Tree obtained from the 86 Nucleotide Binding Domains.
To assign new NBDs to the sub-families of the classification described in our work, we used the MEME program (Bailey and Elkan (1994), Ismb, 2, 28-36) to derive models for each sub-family that includes at least five members in B. subtilis. To increase sensitivity and selectivity, the program was trained with both B. subtilis and E. coli proteins. For each sub-family, the models include the well-known Walker A, Walker B and Signature motifs, as well as two other conserved regions, one located between the Walker A and the Signature motifs and the other downstream of the Walker B motif.
These models are available for the following sub-families and can be used with the MAST program (Bailey and Gribskov (1997), J. Comput. Biol., 4, 45-59).
For each sub-family, the NBD alignment highlighting the 5 conserved motifs and the derived PROSITE signature are provided in the following:
Sub-family 2
Sub-family 3 (N-terminal NBDs)
Sub-family 3 model (C-terminal NBDs)
Sub-family 4
Sub-family 5
Sub-family 6
Sub-family 7
Sub-family 8
Sub-family 9
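As an illustration of how a PROSITE-style signature can be matched against a protein sequence, the following minimal Python sketch translates a simple PROSITE pattern into a regular expression and reports where it matches. It is not the MEME/MAST procedure used to build and apply the sub-family models, and it does not use the sub-family signatures provided here. The pattern shown is the generic Walker A (P-loop) consensus `[AG]-x(4)-G-K-[ST]`, used here only as an assumption for illustration. The helper names `prosite_to_regex` and `find_motif` are hypothetical, and the test sequence is the first 50 residues of the YvrO entry listed on this page.

```python
import re

# Generic Walker A (P-loop) consensus in PROSITE syntax.
# Assumption for illustration: this is NOT one of the sub-family
# signatures derived from the MEME models described above.
WALKER_A = "[AG]-x(4)-G-K-[ST]"

# One PROSITE element: x, a residue, a [set] or an {exclusion},
# optionally followed by a repeat count (n) or (n,m).
_ELEMENT = re.compile(
    r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?"
)


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regex string.

    Anchors (< and >) are not supported in this sketch.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported PROSITE element: {element!r}")
        core, low, high = match.groups()
        if core == "x":                      # any residue
            core = "."
        elif core.startswith("{"):           # excluded residues
            core = "[^" + core[1:-1] + "]"
        if low is not None:                  # repeat count
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


def find_motif(sequence, pattern):
    """Return (1-based start, matched text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


# First 50 residues of YvrO, copied from the sequences on this page.
yvro_start = "MLTLNNISKSYKLGKEEVPILKHINLTVQAGEFLAIMGPSGSGKSTLMNI"
print(find_motif(yvro_start, WALKER_A))
```

A regular expression of this kind only tests for an exact consensus match. The MEME models are position-specific profiles scored by MAST, so they can rank divergent NBDs that a fixed pattern would miss.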
>YcbN
LTYIVQTNGLTKTYQGKEVVSNVSMHIKKGEIYGFLGPNGAGKTTIMKML
TSLVKPTSGEIIILGNKFTHTSYEVLGNIGSMIEYPIFYENLTAEENLER
SACEYMGYHNKKAIQEVLDMVNLKQIDKKPVKTFSLGMKQRLGIARAILT
KPDLLILDEPVNGLDPLGIKKIRQLFQVLSKEYGMTLLISSHLLGEIEQI
ADTIGVIRDGRLLEEVSMEDVRGQNTEYIELLTPNQTRACFVLEKELQLT
NFKILNEKTIRIYEAEASQAAISKALILNDVDTWINEHKSIRRWRIISSN
>YvrO
MLTLNNISKSYKLGKEEVPILKHINLTVQAGEFLAIMGPSGSGKSTLMNI
IGCLDRPTSGTYTLDQIDILKGKDGALAEIRNESIGFVFQTFHLLPRLTA
LQNVELPMIYNKVKKKERRQRAYEALEKVGLKDRVSYKPPKLSGGQKQRV
AIARALVNQPRFILADEPTGALDTKSSEQILALFSELHREGKTIIMITHD
PDVAKKADRTVFIRDGELVLDERGDISHA
>YesO
LKKICYVLLSLVCVFLFSGCSAGEEASGKKEDVTLRIAWWGGQPRHDYTT
KVIELYEKKNPHVHIEAEFANWDDYWKKLAPMSAAGQLPDVIQMDTAYLA
QYGKKNQLEDLTPYTKDGTIDVSSIDENMLSGGKIDNKLYGFTLGVNVLS
VIANEDLLKKAGVSINQENWTWEDYEKLAYDLQEKAGVYGSNGMHPPDIF
FPYYLRTKGERFYKEDGTGLAYQDDQLFVDYFERQLRLVKAKTSPTPDES
AQIKGMEDDFIVKGKSAITWNYSNQYLGFARLTDSPLSLYLPPEQMQEKA
LTLKPSMLFSIPKSSEHKKEAAKFINFFVNNEEANQLIKGERGVPVSDKV
ADAIKPKLNEEETNIVEYVETASKNISKADPPEPVGSAEVIKLLKDTSDQ
ILYQKVSPEKAAKTFRKKANEILERNN
>YhfQ
MKKTLIILTVLLLSVLTAACSSSSGNQNSKEHKVAVTHDLGKTNVPEHPK
RVVVLELGFIDTLLDLGITPVGVADDNKAKQLINKDVLKKIDGYTSVGTR
SQPSMEKIASLKPDLIIADTTRHKKVYDQLKKIAPTIALNNLNADYQDTI
DASLTIAKAVGKEKEMEKKLTAHEEKLSETKQKISANSQSVLLIGNTNDT
IMARDENFFTSRLLTQVGYRYAISTSGNSDSSNGGDSVNMKMTLEQLLKT
DPDVIILMTGKTDDLDADGKRPIEKNVLWKKLKAVKNGHVYHVDRAVWSL
RRSVDGANAILDELQKEMPAAKK
>YtcQ
MGNKWRVLDLCLVLALGGVLAGCKGTDQSSAEGKAGPDSKVKLSWMAILY
HQQPPKDRAIKEIEKLTNTELDITWVPDAVKEDRLNAALAAGNLPQIVTI
QDIKNSSVMNAFRSGMFWEIGDYIKDYPNLNKMNKLINKNVTIDGKLYGI
YRERPLSRQGIVIRKDWLDNLNLKTPKTLDELYEVAKAFTEDDPDKDGKD
DTFGLADRNDLIYGAFKTIGSYEGMPTDWKESGGKFTPDFMTQEYKDTMN
YMKKLRDNGYMNKDFPVTSKTQQQELFSQGKAGIYIGNMVDAVNLRDHAS
DKSMKLEIINRIKGPDGKERVWASGGHNGVFAFPKTSVKTEAELKRILAF
FDRIAEEDVYSLMTYGIDGVHYNKGEDKTFTRKESQVKDWQTDIQPLSAL
IAIDKAYLKNTGDPLRTAYEELTEDNEKIIVSNPAESCIPHRSRNGVTN
>YteQ
MKTAEAQAPAVDAVIFKKEKRKRLLIKLIQQKYLYLMILPGCIYFLLFKY
VPMWGIVIAFQDYQPFLGILGSEWVGLKHFIRLFTEPTFFLLLKNTLVLF
ALNLAIFFPVPILLALLLNEVRIALFKKFVQTLIYIPHFMSWVIVVSLSF
VLLTVDGGLINELIVFFGGEKINFLLNEEWFRPLYILQVIWREAGWSTII
YLAAITAVDPQLYEAAKMDGAGRLRQMWHITLPAIKSVIVVLLILKIGDT
LELGFEHVYLLLNATNREVAEIFDTYVYTAGLKQGQFSYSTAVGVFKAAV
GLILVMLANRLAKKFGEEGIY