Model Internals

Each PhenDB model is trained on the presence or absence of bacterial ENOGs (orthologous groups from EggNOG 4.5) in the training genomes. Each ENOG is assigned a weight: its magnitude reflects how important that ENOG is for the final prediction, and its sign indicates whether the presence (positive weight) or the absence (negative weight) of the ENOG is indicative of the trait.
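To make the role of sign and magnitude concrete, here is a minimal sketch that assumes the weights combine as a plain linear score over binary presence/absence features. That assumption, the omission of any intercept or scaling, and the function name `linear_evidence` are illustrative only; the classifier PhenDB actually uses is not described here. The three weights are taken from the top of the table.

```python
# Excerpt of a model as a mapping ENOG name -> weight (first three table rows).
weights = {
    "ENOG4105DIJ": -0.482170,
    "ENOG4108R1Q": -0.478161,
    "ENOG4108XA2": 0.463407,
}

def linear_evidence(weights, present_enogs):
    """Sum the weights of all ENOGs found in a genome.

    Assumption: a linear score with 0/1 features and no intercept.
    A present ENOG with a positive weight raises the score; a present
    ENOG with a negative weight lowers it, so its absence favours the trait.
    """
    return sum(w for enog, w in weights.items() if enog in present_enogs)

# A genome carrying only the transposase ENOG gets positive evidence.
only_transposase = linear_evidence(weights, {"ENOG4108XA2"})

# Adding the phosphoethanolamine transferase ENOG pulls the score down.
with_transferase = linear_evidence(weights, {"ENOG4108XA2", "ENOG4105DIJ"})
```

Under this reading, a genome lacking every ENOG in the excerpt scores zero, and each detected ENOG shifts the score by exactly its weight.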

This table lists the 250 highest-ranking ENOGs of this model.

rank in model    ENOG name    ENOG description    weight in model
1 ENOG4105DIJ Phosphoethanolamine transferase -0.482170
2 ENOG4108R1Q NA -0.478161
3 ENOG4108XA2 Transposase 0.463407
4 ENOG4108NJP Tape measure protein -0.293486
5 ENOG4105IS8 NA -0.282588
6 ENOG41078UC NA 0.198214
7 ENOG4107F7K Inherit from COG: transposase 0.196553
8 ENOG4105CYF SMP-30 gluconolaconase LRE domain protein 0.181395
9 ENOG4105FJ8 phage repressor 0.067042
10 ENOG4105NAH NA 0.065223
11 ENOG41063K6 NA 0.054258
12 ENOG4105CVR type I restriction-modification system 0.052592
13 ENOG41067T3 NA 0.047281
14 ENOG4108BJB Protein of unknown function (DUF497) 0.046960
15 ENOG4105JJH Protein of unknown function (DUF1173) 0.044271
16 ENOG41064SZ Uncharacterized protein conserved in bacteria (DUF2171) -0.043222
17 ENOG4105S5X Replication protein 0.043039
18 ENOG4106NS9 C-5 cytosine-specific DNA methylase 0.042604
19 ENOG4105TGS NA 0.042132
20 ENOG4105F9F transposase -0.041528
21 ENOG41072XF NA -0.040467
22 ENOG410604F Helix-turn-helix 0.020182
22 ENOG410685U NA 0.020182
23 ENOG4107FAD NA 0.040144
24 ENOG4105TZD TrbL VirB6 plasmid conjugal transfer protein 0.039713
25 ENOG4105KMS anti-termination protein 0.039041
26 ENOG4107R04 NA 0.038694
27 ENOG4105ZXY NA 0.038247
28 ENOG41064UG Terminase, large subunit 0.038073
29 ENOG41068W6 Capsular polysaccharide biosynthesis protein -0.038063
30 ENOG4107ET6 NA 0.037423
31 ENOG4106YDW NA -0.037301
32 ENOG41066S5 ORF6C domain 0.037088
33 ENOG4106N1G Conjugative transfer protein 0.037008
34 ENOG4106318 fimbrial-like protein YdeR -0.036759
35 ENOG4108XYH NA 0.036719
36 ENOG4108Y31 NA -0.036579
37 ENOG4105DFP Integrase 0.036386
38 ENOG41073HG NA -0.036022
39 ENOG4108PXA Replication Protein -0.035108
40 ENOG4105YQ0 NA 0.034951
41 ENOG4108EQ0 fad dependent oxidoreductase -0.034933
42 ENOG41061TZ Short-chain dehydrogenase reductase Sdr -0.034758
43 ENOG4107MRY Nitroreductase family -0.034752
44 ENOG410682D short chain dehydrogenase -0.033977
45 ENOG4105XT3 NA 0.033591
46 ENOG41077W1 NA 0.033393
47 ENOG41076QY NA 0.033306
48 ENOG4108WEB transposase (IS4,) 0.033239
49 ENOG4105X8W Polysaccharide Biosynthesis Protein 0.033109
50 ENOG410862C low temperature requirement -0.032901
51 ENOG41074WQ Protein of unknown function (DUF2522) -0.032887
52 ENOG4107N08 NA 0.032842
53 ENOG4108MRP Dehydrogenase 0.032350
54 ENOG4106VV6 NA -0.032212
55 ENOG4105ZMG Addiction module antitoxin, RelB DinJ family -0.032159
56 ENOG4106GUI NA -0.032159
57 ENOG41071TB NA -0.032008
58 ENOG4108C33 Resolvase 0.031913
59 ENOG41073AU Protein of unknown function (DUF551) 0.031865
60 ENOG4107YE4 Phage integrase family 0.031864
61 ENOG4106FQJ NA 0.031700
62 ENOG4106PCE Phage replication protein CRI 0.031588
63 ENOG4107T2H Integrase, catalytic region 0.031453
64 ENOG4106CEI NA -0.031202
65 ENOG4108MGJ head-to-tail joining protein 0.031045
66 ENOG4105FJ9 Integrase catalytic subunit 0.030958
67 ENOG4105KF3 Short-chain dehydrogenase reductase Sdr -0.015399
67 ENOG4107ZQV short-chain dehydrogenase reductase -0.015399
68 ENOG4107DDN NA -0.030702
69 ENOG41060CS Transcriptional regulator 0.030510
70 ENOG4108KM8 NA 0.030432
71 ENOG4107Y4X Sel1 domain protein repeat-containing protein 0.030121
72 ENOG41090K2 KR domain 0.030104
73 ENOG41068CK NA 0.029806
74 ENOG4105XT9 zeta toxin 0.029793
75 ENOG4105V3W Stress-induced bacterial acidophilic repeat motif -0.029753
76 ENOG4107B4G NA -0.029685
77 ENOG41082PI Sel1 domain protein repeat-containing protein -0.029455
78 ENOG4107AR2 NA 0.029348
79 ENOG41071C0 NA 0.029348
80 ENOG4107IS3 DNA maturase 0.029348
81 ENOG4107FR3 ankyrin repeat-containing protein -0.029278
82 ENOG4108N26 phage protein 0.014530
82 ENOG4108RZQ NA 0.014530
83 ENOG41075KM NA 0.028728
84 ENOG41064HS Inherit from COG: Hemolysin-type calcium-binding -0.028620
85 ENOG4106H70 NA 0.028509
86 ENOG4107MJP Domain of unknown function (DUF1788) 0.028446
87 ENOG4108ITA resolvase 0.028442
88 ENOG4108UST Zonular occludens toxin 0.028420
89 ENOG4106B4Y Protein of unknown function (DUF1192) 0.014186
89 ENOG41078G5 NA 0.014186
90 ENOG41061PY Histidine kinase 0.028055
91 ENOG41075PX NA 0.027993
92 ENOG4105F2I Transposase 0.006986
92 ENOG41086X3 Transposase 0.006986
92 ENOG4108MVM Transposase 0.006986
92 ENOG41090WB Transposase is3 is911 0.006986
93 ENOG4105H9N is1 orf1 0.009315
93 ENOG4108TAU NA 0.009315
93 ENOG4108Z8X NA 0.009315
94 ENOG4107R8N Glycosyl transferase, family 2 -0.027946
95 ENOG4105WDA Protein of unknown function (DUF1653) -0.027946
96 ENOG4105C4W S-formylglutathione hydrolase -0.027946
97 ENOG41090BE (LipO)protein -0.027946
98 ENOG4105C4P Facilitates transcription termination by a mechanism that involves Rho binding to the nascent RNA, activation of Rho's RNA-dependent ATPase activity, and release of the mRNA from the DNA template (By similarity) -0.009315
98 ENOG4107W9U HPP family -0.009315
98 ENOG41082AT scp-like extracellular -0.009315
99 ENOG4105F94 Vwa containing coxe family protein -0.027946
100 ENOG4105VGH gCN5-related N-acetyltransferase 0.027911
101 ENOG4105MFG Antirestriction protein 0.027893
102 ENOG4105FDY integral membrane protein 0.027860
103 ENOG4108R0M Aminoglycoside 0.027858
104 ENOG4108PP2 Transcriptional regulator, IclR family 0.027853
105 ENOG41069NN phage protein -0.027703
106 ENOG4105EBM Alpha Beta Hydrolase Fold protein 0.027659
107 ENOG4108MQ4 ABC, transporter -0.027650
108 ENOG41087EE NA 0.027554
109 ENOG4107TDB pilus assembly protein, tip-associated adhesin PilY1 0.027430
110 ENOG4105U2X NA -0.027370
111 ENOG4105YMR NA -0.027217
112 ENOG4107T5C Molybdenum cofactor synthesis domain protein 0.027018
113 ENOG41083U5 Transposase -0.026967
114 ENOG4108S1V cytoplasmic protein 0.026899
115 ENOG4108410 Transcriptional regulator 0.026802
116 ENOG4105T32 Protein of unknown function (DUF3611) -0.026749
117 ENOG4105NDF NAD-dependent protein deacetylase which modulates the activities of several enzymes which are inactive in their acetylated form. May also have NAD-dependent lysine demalonylase and desuccinylase activity (By similarity) 0.026433
118 ENOG4105ZCR Pilus assembly protein TraF 0.026402
119 ENOG4105JYE NA 0.026256
120 ENOG4105K7A Phage prohead protease, HK97 family -0.026202
121 ENOG41090EB tir protein -0.026159
122 ENOG41080NW Alkylphosphonate utilization operon protein PhnA -0.026101
123 ENOG4108Z8G Alkylphosphonate utilization operon protein PhnA 0.026101
124 ENOG4105G4R mu-like prophage protein gp16 0.025959
125 ENOG41062VS NA 0.025959
126 ENOG4105JG6 Integrase core domain 0.008653
126 ENOG4105N9A NA 0.008653
126 ENOG4107RG9 type II secretory pathway, component ExeA 0.008653
127 ENOG4108QNS NA 0.025959
128 ENOG4108UKV benzoate 1,2-dioxygenase 0.025646
129 ENOG4107310 Transcriptional regulator -0.025605
130 ENOG4107H8V Resolvase -0.025552
131 ENOG4106GYG NA -0.025544
132 ENOG4105RKQ tetratricopeptide tpr_2 repeat protein -0.025534
133 ENOG4107IHG Aminoglycoside-2''-adenylyltransferase 0.025446
134 ENOG4108N8A Major Facilitator 0.025431
135 ENOG4107YH7 Beta-lactamase (EC 3.5.2.6) 0.025421
136 ENOG4107JS7 NA 0.025344
137 ENOG4105UNC NA -0.025343
138 ENOG4105TM2 NA -0.025324
139 ENOG4107R0M Integrase 0.025242
140 ENOG41078HT type 4 fimbrial biogenesis protein PilO 0.025214
141 ENOG41060YD NA 0.003149
141 ENOG41068IM transcriptional regulator, copG family 0.003149
141 ENOG410705U Transglutaminase-like superfamily 0.003149
141 ENOG4107NVD NA 0.003149
141 ENOG4107U6T NA 0.003149
141 ENOG41086B1 Bacterial protein of unknown function (DUF896) 0.003149
141 ENOG41088UI Sodium hydrogen exchanger 0.003149
141 ENOG4108M0H Extracellular lipase 0.003149
142 ENOG410658P NA 0.025097
143 ENOG4106CHF NA 0.025079
144 ENOG4105KK6 nuclease 0.025054
145 ENOG4106C41 NA 0.025025
146 ENOG4106ETY NA -0.024935
147 ENOG41069BA NA 0.024877
148 ENOG4105QIT septicolysin 0.024785
149 ENOG4107PKQ NA 0.024781
150 ENOG41086QX Protein of unknown function (DUF1311) 0.024774
151 ENOG4108DKD HemX 0.024683
152 ENOG41065X9 hemagglutinin-related transmembrane protein 0.024682
153 ENOG4106B8V Host-nuclease inhibitor protein Gam -0.024669
154 ENOG4105YBS Protein of unknown function (DUF3192) -0.024669
155 ENOG4105EAK ubiquinone biosynthesis hydroxylase, ubiH ubiF VisC COQ6 family 0.012334
155 ENOG4108I5K Secretin and TonB N terminus short domain 0.012334
156 ENOG4106G0D acyl-Coa dehydrogenase 0.024669
157 ENOG4105VQU rdd domain containing protein 0.024669
158 ENOG4105VNG YcgL domain-containing protein 0.012334
158 ENOG4107CX4 NA 0.012334
159 ENOG4108111 DJ-1/PfpI family -0.024621
160 ENOG4105EKS Tetr family transcriptional regulator -0.024571
161 ENOG410620U Protein of unknown function (DUF3296) -0.024529
162 ENOG4108X81 AraC family transcriptional regulator 0.024443
163 ENOG4105VXN addiction module toxin, RelE StbE family -0.024435
164 ENOG4105MNN NA -0.024338
165 ENOG41084KN transcriptional regulator -0.024332
166 ENOG41072UF Tape measure protein -0.024319
167 ENOG4106KU4 NA -0.024286
168 ENOG4105SX5 NA 0.024256
169 ENOG4105XHP NA -0.024132
170 ENOG4105VQR NA 0.024107
171 ENOG41061WC NA 0.023940
172 ENOG41068AG Protein of unknown function (DUF1451) -0.023918
173 ENOG4105Y4E Stress-induced bacterial acidophilic repeat motif -0.023892
174 ENOG410852Z Protein of unknown function (DUF2834) 0.023886
175 ENOG41077UX NA -0.023878
176 ENOG4105GYD Terminase Small Subunit -0.023769
177 ENOG4106B9S NA 0.007921
177 ENOG4106FI0 NA 0.007921
177 ENOG4107BTD NA 0.007921
178 ENOG410788D NA 0.023757
179 ENOG4107XGH Saccharopine dehydrogenase -0.023733
180 ENOG410698T Required for coenzyme pyrroloquinoline quinone (PQQ) biosynthesis. PQQ is probably formed by cross-linking a specific glutamate to a specific tyrosine residue and excising these residues from the peptide (By similarity) -0.023653
181 ENOG4106EZ5 Bacteriophage CII protein 0.023592
182 ENOG41084X9 prevent-host-death family 0.023585
183 ENOG4108C1Q NA -0.023442
184 ENOG4105K4U addiction module toxin, RelE StbE family 0.023425
185 ENOG4107RHE UDP-glucose 6-dehydrogenase -0.023359
186 ENOG4107QJZ Glycosyl transferase, family 2 -0.023300
187 ENOG41066PX NA -0.023269
188 ENOG4105CXP Major Facilitator -0.023087
189 ENOG41082ST This protein specifically catalyzes the removal of signal peptides from prolipoproteins (By similarity) 0.023042
190 ENOG4106I5A Uncharacterized small protein (DUF2158) 0.022996
191 ENOG4106BJK DNA transfer system protein TraJ -0.022965
192 ENOG41090JY secretion activator protein -0.022846
193 ENOG4107RGY fad dependent oxidoreductase 0.022807
194 ENOG4108C3H Magnesium-importing ATPase 0.022781
195 ENOG4108K39 ATP-dependent endonuclease of the OLD 0.022699
196 ENOG4108WJ6 Tail assembly protein 0.022598
197 ENOG4108W7V carboxymuconolactone decarboxylase 0.022568
198 ENOG41081M4 Peptidase S24-like 0.022480
199 ENOG4105EP1 Efflux transporter rnd family, mfp subunit -0.022428
200 ENOG4105WB5 Acetyltransferase (GNAT) family 0.022407
201 ENOG4106CQ0 transcriptional regulator antitoxin, MazE -0.022403
202 ENOG4105MJK Protein of unknown function (DUF2786) 0.022402
203 ENOG4105NFP P22 coat protein - gene protein 5 -0.004480
203 ENOG410743V gCN5-related N-acetyltransferase -0.004480
203 ENOG4107544 NA -0.004480
203 ENOG4107BV5 NA -0.004480
203 ENOG4108T14 lysozyme -0.004480
204 ENOG4108MV4 NA -0.022393
205 ENOG4105WXW Antirepressor 0.022354
206 ENOG4105ZKY NA 0.022331
207 ENOG4105CDV Ompa motb domain protein -0.022326
208 ENOG4105E9I Appr-1-p processing domain protein 0.011114
208 ENOG4108K7R NA 0.011114
209 ENOG4106FKH NA 0.022178
210 ENOG41062H4 flgn family 0.022169
211 ENOG4106ZVZ Escherichia coli IMT2125 genomic chromosome, IMT2125 0.022050
212 ENOG4106DZS NA -0.021963
213 ENOG4107UJA Pfam:DUF2081 0.021962
214 ENOG4106019 Endonuclease that resolves Holliday junction intermediates made during homologous genetic recombination and DNA repair. Exhibits sequence and structure-selective cleavage of four-way DNA junctions, where it introduces symmetrical nicks in two strands of the same polarity at the 5' side of dinucleotides. Corrects the defects in genetic recombination and DNA repair associated with inactivation of ruvAB or ruvC (By similarity) 0.021948
215 ENOG4106UZG NA 0.021933
216 ENOG4107QR5 DNA helicase -0.021834
217 ENOG4108JFQ HTH_XRE 0.021815
218 ENOG41066U0 NA 0.021734
219 ENOG41089T1 Inherit from COG: phosphoserine phosphatase activity -0.021723
220 ENOG4105EUZ Beta-lactamase domain protein -0.021675
221 ENOG4108TE0 Metal Dependent Phosphohydrolase 0.021662
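The rows above can be read back into structured records with a short script. The sketch below assumes each row is whitespace-delimited, with the rank as the first token, the ENOG name as the second, the weight as the last, and everything in between as the description; the helper names `parse_row` and `group_by_rank` are hypothetical. It also groups ENOGs that share a rank. In the rows shown, such ENOGs carry identical weights whose sum fits the surrounding order (for rank 22, 2 × 0.020182 = 0.040364, which lies between the magnitudes at ranks 21 and 23); this is an observation from the table, not a documented rule.

```python
from collections import defaultdict

def parse_row(line):
    """Split one table row into (rank, enog, description, weight)."""
    tokens = line.split()
    rank = int(tokens[0])          # first token: rank in model
    enog = tokens[1]               # second token: ENOG name
    weight = float(tokens[-1])     # last token: weight in model
    description = " ".join(tokens[2:-1])  # may contain spaces, or be "NA"
    return rank, enog, description, weight

def group_by_rank(lines):
    """Collect the parsed rows of each rank into one list."""
    groups = defaultdict(list)
    for line in lines:
        rank, enog, description, weight = parse_row(line)
        groups[rank].append((enog, description, weight))
    return dict(groups)

# Four rows copied from the table, including the tied rank 22.
rows = [
    "21 ENOG41072XF NA -0.040467",
    "22 ENOG410604F Helix-turn-helix 0.020182",
    "22 ENOG410685U NA 0.020182",
    "23 ENOG4107FAD NA 0.040144",
]

groups = group_by_rank(rows)

# Summed weight per rank, useful for comparing tied groups with their neighbours.
summed = {rank: sum(w for _, _, w in members) for rank, members in groups.items()}
```

Taking the weight from the last token keeps the parser robust to long descriptions that themselves contain numbers, such as "benzoate 1,2-dioxygenase".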