EFI - Enzyme Similarity Tool

This web resource is supported by a Research Resource from the National Institute of General Medical Sciences (R24GM141196-01).
The tools are available without charge or license to both academic and commercial users.

Cluster Analyses and Downloads

Uploaded Filename: 26123_IP91_IPR004184_UniRef90_NoFragments_Proteobacteria_Minlen650_AS240_PP_full_ssn.xgmml

The Data File Download tab provides the Color SSN with the nodes colored according to node.fillColor (Cluster Sequence Count).

Six node attributes were added to the input SSN: Cluster Sequence Count, Sequence Count Cluster Number, Cluster Node Count, Node Count Cluster Number, node.fillColor (according to Cluster Sequence Count, hexadecimal), and Node Count Fill Color (according to Cluster Node Count, hexadecimal).
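The relationship between the two numbering schemes can be illustrated with a short Python sketch. It assumes (this page does not state it explicitly) that Sequence Count Cluster Numbers follow descending UniProt ID count and Node Count Cluster Numbers follow descending node count. The counts are those of the seven largest clusters listed further down this page; the helper name `rank_clusters` and the tie-breaking rule are hypothetical.

```python
# (UniProt ID count, UniRef90 node count) for seven clusters listed on this page.
clusters = [
    (3789, 279), (304, 75), (86, 67), (839, 57),
    (1693, 56), (93, 52), (126, 44),
]

def rank_clusters(counts):
    """Return (sequence_cluster_number, node_cluster_number) per input cluster.

    Assumption: numbers are 1-based ranks by descending size. Ties are broken
    by input order here, which may differ from what the EFI tools do.
    """
    by_seq = sorted(range(len(counts)), key=lambda i: -counts[i][0])
    by_node = sorted(range(len(counts)), key=lambda i: -counts[i][1])
    seq_num = {idx: rank for rank, idx in enumerate(by_seq, start=1)}
    node_num = {idx: rank for rank, idx in enumerate(by_node, start=1)}
    return [(seq_num[i], node_num[i]) for i in range(len(counts))]

numbers = rank_clusters(clusters)
for (seq_n, node_n), (n_uniprot, n_nodes) in zip(numbers, clusters):
    print(f"Sequence Cluster {seq_n} / Node Cluster {node_n}: "
          f"UniProt {n_uniprot}, UniRef90 {n_nodes}")
```

Under these assumptions the sketch reproduces the pairings shown on this page for those seven clusters, for example Sequence Cluster 4 / Node Cluster 2 for the cluster with 304 UniProt IDs and 75 UniRef90 nodes.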

To change the node colors in Cytoscape to Node Count Fill Color: 1) select all nodes; 2) on the Style Panel, click on the "?" in the Fill Color Property; 3) select "Remove Bypass"; 4) deselect the nodes (default node color); and 5) open the Fill Color Property and select "Node Count Fill Color" as the Column and "Passthrough Mapping" as the Mapping Type. The nodes will be recolored.
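To preview which color each node would receive under either column before opening Cytoscape, the attribute values can be read directly from the XGMML file. The sketch below is a minimal illustration, assuming that each node is a `node` element with `att` children carrying `name` and `value` attributes; the sample document and the function name `node_colors` are hypothetical, and a real colored SSN may be structured differently.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written stand-in for a colored SSN (not real EFI output).
SAMPLE = """<graph label="demo" xmlns="http://www.cs.rpi.edu/XGMML">
  <node id="n1" label="n1">
    <att name="node.fillColor" type="string" value="#FF0000"/>
    <att name="Node Count Fill Color" type="string" value="#0000FF"/>
  </node>
  <node id="n2" label="n2">
    <att name="node.fillColor" type="string" value="#00FF00"/>
    <att name="Node Count Fill Color" type="string" value="#0000FF"/>
  </node>
</graph>"""

def local(tag):
    """Strip an XML namespace prefix such as '{uri}node' down to 'node'."""
    return tag.rsplit("}", 1)[-1]

def node_colors(xgmml_text, column):
    """Map node id -> value of the named attribute column, where present."""
    root = ET.fromstring(xgmml_text)
    colors = {}
    for elem in root.iter():
        if local(elem.tag) != "node":
            continue
        for att in elem:
            if local(att.tag) == "att" and att.get("name") == column:
                colors[elem.get("id")] = att.get("value")
    return colors

print(node_colors(SAMPLE, "node.fillColor"))
print(node_colors(SAMPLE, "Node Count Fill Color"))
```

For a real file, the text would be read from disk first; very large SSNs may be better handled with a streaming parser such as `xml.etree.ElementTree.iterparse`.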

The Data File Download tab also provides files for 1) UniProt ID-Color-Cluster Number mapping table, 2) ID Lists and FASTA Files for each cluster, 3) cluster sizes, and 4) SwissProt annotations for clusters and singletons. The number of UniRef/UniProt IDs for each cluster is displayed in the WebLogos, HMMs, and Length Histograms tabs.

The WebLogos tab provides the WebLogo (generated using http://weblogo.threeplusone.com/) and MSA (generated using MUSCLE) for the node IDs in each SSN cluster containing at least the specified "Minimum Node Count". The MSA can be viewed with Jalview (https://www.jalview.org/). This tab also provides the percent identity matrix for the multiple sequence alignment, as computed by Clustal-Omega.
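As a rough illustration of what a percent identity matrix contains, the following Python sketch computes pairwise identities from rows of an alignment. It assumes identity is the number of matching residues divided by the number of columns in which neither sequence has a gap; Clustal-Omega's exact definition may differ, and the toy alignment and function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity between two rows of the same alignment.

    Assumption: columns where either row has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

def identity_matrix(rows):
    """Square matrix of pairwise percent identities."""
    return [[percent_identity(a, b) for b in rows] for a in rows]

msa = ["MKC-LA", "MKCALA", "MRC-LV"]  # toy alignment, not real data
matrix = identity_matrix(msa)
for row in matrix:
    print("\t".join(f"{value:.1f}" for value in row))
```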

The Consensus Residues tab provides a tab-delimited text file listing, for each specified residue, the number of conserved residues and their MSA positions in each SSN cluster (numbered by Cluster Sequence Count) containing at least the specified "Minimum Node Count".
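The idea of a conserved residue at a given threshold can be sketched in Python as follows. This is an illustration only: it assumes a position counts as conserved when the residue occurs in at least the threshold fraction of all sequences in the MSA (gaps included in the denominator), which may not match the exact rule the EFI tools apply. The toy alignment and the function name `conserved_positions` are hypothetical; the residue (C) and thresholds are the ones listed for this job.

```python
def conserved_positions(msa, residue, threshold):
    """Return 1-based MSA columns where `residue` meets the threshold.

    Assumption: fraction = rows with `residue` in the column / all rows.
    """
    n_rows = len(msa)
    width = len(msa[0])
    hits = []
    for col in range(width):
        count = sum(1 for row in msa if row[col].upper() == residue)
        if count / n_rows >= threshold:
            hits.append(col + 1)
    return hits

msa = ["MKC-LAC", "MKCALAC", "MRC-LVS", "MKCGLAC"]  # toy alignment
thresholds = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
for t in thresholds:
    positions = conserved_positions(msa, "C", t)
    print(f"C\t{t}\t{len(positions)}\t{positions}")
```

In the toy alignment, column 3 is C in every sequence and column 7 is C in three of four, so column 7 only appears once the threshold drops to 0.7 or below.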

The HMMs tab provides the HMM for each SSN cluster containing at least the specified "Minimum Node Count". The Skylign download provides the image of the HMM generated from the MSA (https://skylign.org/). The HMM text file can be viewed interactively by uploading to https://skylign.org/ and selecting "Information Content – Above Background"; the probability of each amino acid residue and probability and length of an insert at each position is provided. The p

The Length Histograms tab provides length histograms for each cluster containing at least the specified "Minimum Node Count".
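A comparable histogram can be derived from any of the per-cluster FASTA files. The sketch below is a minimal example, assuming standard FASTA text with `>` header lines; the bin width, the sample sequences, and the function names are hypothetical and not taken from the EFI tools.

```python
from collections import Counter

def read_fasta_lengths(text):
    """Return the sequence length of each FASTA record in `text`."""
    lengths = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths

def length_histogram(lengths, bin_width=10):
    """Count sequences per length bin, keyed by the bin's lower edge."""
    counts = Counter((n // bin_width) * bin_width for n in lengths)
    return dict(sorted(counts.items()))

fasta = ">a\nMKCLA\nMK\n>b\nMKCLAMKCLAMK\n>c\nMKC\n"  # toy records
lengths = read_fasta_lengths(fasta)
histogram = length_histogram(lengths, bin_width=5)
for lower_edge, count in histogram.items():
    print(f"{lower_edge}-{lower_edge + 4}\t{count}")
```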

Submission Summary Table

Job Number: 26235
Input Option: Cluster Analysis
Uploaded Filename: 26123_IP91_IPR004184_UniRef90_NoFragments_Proteobacteria_Minlen650_AS240_PP_full_ssn.xgmml
Database Version: UniProt: 2022-04 / InterPro: 91
Analysis Options: WebLogo, HMM, Consensus Residue, Length Histogram (AAs=C; Thresholds=0.9,0.8,0.7,0.6,0.5,0.4,0.3,0.2,0.1)
Number of SSN clusters: 61
Number of SSN singletons: 40
SSN sequence source: UniRef90
Number of SSN (meta)nodes: 1,048
Number of accession IDs in SSN: 7,555

Please cite your use of the EFI tools:

Rémi Zallot, Nils Oberg, and John A. Gerlt, The EFI Web Resource for Genomic Enzymology Tools: Leveraging Protein, Genome, and Metagenome Databases to Discover Novel Enzymes and Metabolic Pathways. Biochemistry 2019 58 (41), 4169-4182. https://doi.org/10.1021/acs.biochem.9b00735

Nils Oberg, Rémi Zallot, and John A. Gerlt, EFI-EST, EFI-GNT, and EFI-CGFP: Enzyme Function Initiative (EFI) Web Resource for Genomic Enzymology Tools. J Mol Biol 2023. https://doi.org/10.1016/j.jmb.2023.168018

Colored SSN

Each cluster in the submitted SSN has been identified and assigned a unique number and color.

Supplementary Files

Mapping Tables
  UniProt ID-Color-Cluster number mapping table

ID Lists and FASTA Files per Cluster
  UniProt ID lists per cluster
  UniRef90 ID lists per cluster
  FASTA files per UniProt cluster
  FASTA files per UniRef90 cluster

Miscellaneous Files
  Cluster sizes
  SwissProt annotations by cluster

Consensus Residues

C
Consensus residue position summary table (full) <1 MB

HMMs

If the HMM is missing for Node Cluster 1 (and additional clusters with large numbers of nodes), repeat the job with a "Maximum Node Count" in the Sequence Filter input window. MUSCLE can fail with a "large" number of sequences (variable, anywhere from >750 to >1500).

HMMs for FASTA UniProt cluster (full length sequences) 18 MB


Sequence Cluster 1 / Node Cluster 1

Full Sequences
Number of IDs: UniProt: 3,789, UniRef90: 279
Cluster 1

Sequence Cluster 4 / Node Cluster 2

Full Sequences
Number of IDs: UniProt: 304, UniRef90: 75
Cluster 4

Sequence Cluster 7 / Node Cluster 3

Full Sequences
Number of IDs: UniProt: 86, UniRef90: 67
Cluster 7

Sequence Cluster 3 / Node Cluster 4

Full Sequences
Number of IDs: UniProt: 839, UniRef90: 57
Cluster 3

Sequence Cluster 2 / Node Cluster 5

Full Sequences
Number of IDs: UniProt: 1,693, UniRef90: 56
Cluster 2

Sequence Cluster 6 / Node Cluster 6

Full Sequences
Number of IDs: UniProt: 93, UniRef90: 52
Cluster 6

Sequence Cluster 5 / Node Cluster 7

Full Sequences
Number of IDs: UniProt: 126, UniRef90: 44
Cluster 5

Sequence Cluster 11 / Node Cluster 8

Full Sequences
Number of IDs: UniProt: 35, UniRef90: 33
Cluster 11

Sequence Cluster 8 / Node Cluster 9

Full Sequences
Number of IDs: UniProt: 76, UniRef90: 31
Cluster 8

Sequence Cluster 12 / Node Cluster 10

Full Sequences
Number of IDs: UniProt: 32, UniRef90: 25
Cluster 12

Sequence Cluster 9 / Node Cluster 11

Full Sequences
Number of IDs: UniProt: 44, UniRef90: 25
Cluster 9

Sequence Cluster 17 / Node Cluster 12

Full Sequences
Number of IDs: UniProt: 23, UniRef90: 21
Cluster 17

Sequence Cluster 10 / Node Cluster 13

Full Sequences
Number of IDs: UniProt: 43, UniRef90: 20
Cluster 10

Sequence Cluster 18 / Node Cluster 14

Full Sequences
Number of IDs: UniProt: 22, UniRef90: 17
Cluster 18

Sequence Cluster 13 / Node Cluster 15

Full Sequences
Number of IDs: UniProt: 26, UniRef90: 15
Cluster 13

Sequence Cluster 15 / Node Cluster 16

Full Sequences
Number of IDs: UniProt: 24, UniRef90: 13
Cluster 15

Sequence Cluster 21 / Node Cluster 17

Full Sequences
Number of IDs: UniProt: 12, UniRef90: 11
Cluster 21

Sequence Cluster 22 / Node Cluster 18

Full Sequences
Number of IDs: UniProt: 12, UniRef90: 11
Cluster 22

Sequence Cluster 16 / Node Cluster 19

Full Sequences
Number of IDs: UniProt: 23, UniRef90: 10
Cluster 16

Sequence Cluster 20 / Node Cluster 20

Full Sequences
Number of IDs: UniProt: 12, UniRef90: 10
Cluster 20

Sequence Cluster 23 / Node Cluster 21

Full Sequences
Number of IDs: UniProt: 11, UniRef90: 10
Cluster 23

Sequence Cluster 14 / Node Cluster 22

Full Sequences
Number of IDs: UniProt: 24, UniRef90: 8
Cluster 14

Sequence Cluster 26 / Node Cluster 23

Full Sequences
Number of IDs: UniProt: 8, UniRef90: 8
Cluster 26

Sequence Cluster 29 / Node Cluster 24

Full Sequences
Number of IDs: UniProt: 7, UniRef90: 7
Cluster 29

Sequence Cluster 27 / Node Cluster 25

Full Sequences
Number of IDs: UniProt: 7, UniRef90: 6
Cluster 27

Sequence Cluster 31 / Node Cluster 26

Full Sequences
Number of IDs: UniProt: 6, UniRef90: 6
Cluster 31

Sequence Cluster 33 / Node Cluster 27

Full Sequences
Number of IDs: UniProt: 5, UniRef90: 5
Cluster 33

Sequence Cluster 25 / Node Cluster 28

Full Sequences
Number of IDs: UniProt: 9, UniRef90: 5
Cluster 25

Length Histograms

Length Histograms for FASTA UniProt cluster (full length sequences, UniProt) 1 MB


Sequence Cluster 1 / Node Cluster 1

Full Sequences
Number of IDs: UniProt: 3,789, UniRef90: 279
Cluster 1

Sequence Cluster 4 / Node Cluster 2

Full Sequences
Number of IDs: UniProt: 304, UniRef90: 75
Cluster 4

Sequence Cluster 7 / Node Cluster 3

Full Sequences
Number of IDs: UniProt: 86, UniRef90: 67
Cluster 7

Sequence Cluster 3 / Node Cluster 4

Full Sequences
Number of IDs: UniProt: 839, UniRef90: 57
Cluster 3

Sequence Cluster 2 / Node Cluster 5

Full Sequences
Number of IDs: UniProt: 1,693, UniRef90: 56
Cluster 2

Sequence Cluster 6 / Node Cluster 6

Full Sequences
Number of IDs: UniProt: 93, UniRef90: 52
Cluster 6

Sequence Cluster 5 / Node Cluster 7

Full Sequences
Number of IDs: UniProt: 126, UniRef90: 44
Cluster 5

Sequence Cluster 11 / Node Cluster 8

Full Sequences
Number of IDs: UniProt: 35, UniRef90: 33
Cluster 11

Sequence Cluster 8 / Node Cluster 9

Full Sequences
Number of IDs: UniProt: 76, UniRef90: 31
Cluster 8

Sequence Cluster 12 / Node Cluster 10

Full Sequences
Number of IDs: UniProt: 32, UniRef90: 25
Cluster 12

Sequence Cluster 9 / Node Cluster 11

Full Sequences
Number of IDs: UniProt: 44, UniRef90: 25
Cluster 9

Sequence Cluster 17 / Node Cluster 12

Full Sequences
Number of IDs: UniProt: 23, UniRef90: 21
Cluster 17

Sequence Cluster 10 / Node Cluster 13

Full Sequences
Number of IDs: UniProt: 43, UniRef90: 20
Cluster 10

Sequence Cluster 18 / Node Cluster 14

Full Sequences
Number of IDs: UniProt: 22, UniRef90: 17
Cluster 18

Sequence Cluster 13 / Node Cluster 15

Full Sequences
Number of IDs: UniProt: 26, UniRef90: 15
Cluster 13

Sequence Cluster 15 / Node Cluster 16

Full Sequences
Number of IDs: UniProt: 24, UniRef90: 13
Cluster 15

Sequence Cluster 21 / Node Cluster 17

Full Sequences
Number of IDs: UniProt: 12, UniRef90: 11
Cluster 21

Sequence Cluster 22 / Node Cluster 18

Full Sequences
Number of IDs: UniProt: 12, UniRef90: 11
Cluster 22

Sequence Cluster 16 / Node Cluster 19

Full Sequences
Number of IDs: UniProt: 23, UniRef90: 10
Cluster 16

Sequence Cluster 20 / Node Cluster 20

Full Sequences
Number of IDs: UniProt: 12, UniRef90: 10
Cluster 20

Sequence Cluster 23 / Node Cluster 21

Full Sequences
Number of IDs: UniProt: 11, UniRef90: 10
Cluster 23

Sequence Cluster 14 / Node Cluster 22

Full Sequences
Number of IDs: UniProt: 24, UniRef90: 8
Cluster 14

Sequence Cluster 26 / Node Cluster 23

Full Sequences
Number of IDs: UniProt: 8, UniRef90: 8
Cluster 26

Sequence Cluster 29 / Node Cluster 24

Full Sequences
Number of IDs: UniProt: 7, UniRef90: 7
Cluster 29

Sequence Cluster 27 / Node Cluster 25

Full Sequences
Number of IDs: UniProt: 7, UniRef90: 6
Cluster 27

Sequence Cluster 31 / Node Cluster 26

Full Sequences
Number of IDs: UniProt: 6, UniRef90: 6
Cluster 31

Sequence Cluster 33 / Node Cluster 27

Full Sequences
Number of IDs: UniProt: 5, UniRef90: 5
Cluster 33

Sequence Cluster 25 / Node Cluster 28

Full Sequences
Number of IDs: UniProt: 9, UniRef90: 5
Cluster 25

Sequence Cluster 30 / Node Cluster 29

Full Sequences
Number of IDs: UniProt: 6, UniRef90: 4
Cluster 30

Sequence Cluster 28 / Node Cluster 30

Full Sequences
Number of IDs: UniProt: 7, UniRef90: 4
Cluster 28

Sequence Cluster 35 / Node Cluster 31

Full Sequences
Number of IDs: UniProt: 4, UniRef90: 4
Cluster 35

Sequence Cluster 36 / Node Cluster 32

Full Sequences
Number of IDs: UniProt: 4, UniRef90: 4
Cluster 36

Sequence Cluster 37 / Node Cluster 33

Full Sequences
Number of IDs: UniProt: 4, UniRef90: 4
Cluster 37

Sequence Cluster 38 / Node Cluster 34

Full Sequences
Number of IDs: UniProt: 3, UniRef90: 3
Cluster 38

Sequence Cluster 39 / Node Cluster 35

Full Sequences
Number of IDs: UniProt: 3, UniRef90: 3
Cluster 39

Sequence Cluster 40 / Node Cluster 36

Full Sequences
Number of IDs: UniProt: 3, UniRef90: 3
Cluster 40

Sequence Cluster 41 / Node Cluster 37

Full Sequences
Number of IDs: UniProt: 3, UniRef90: 3
Cluster 41

Sequence Cluster 32 / Node Cluster 38

Full Sequences
Number of IDs: UniProt: 6, UniRef90: 3
Cluster 32

Sequence Cluster 42 / Node Cluster 39

Full Sequences
Number of IDs: UniProt: 3, UniRef90: 3
Cluster 42

Sequence Cluster 43 / Node Cluster 40

Full Sequences
Number of IDs: UniProt: 3, UniRef90: 3
Cluster 43

Sequence Cluster 44 / Node Cluster 41

Full Sequences
Number of IDs: UniProt: 3, UniRef90: 3
Cluster 44

Sequence Cluster 34 / Node Cluster 42

Full Sequences
Number of IDs: UniProt: 5, UniRef90: 3
Cluster 34

Sequence Cluster 46 / Node Cluster 43

Full Sequences
Number of IDs: UniProt: 3, UniRef90: 3
Cluster 46

Sequence Cluster 47 / Node Cluster 44

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 2
Cluster 47

Sequence Cluster 48 / Node Cluster 45

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 2
Cluster 48

Sequence Cluster 49 / Node Cluster 46

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 2
Cluster 49

Sequence Cluster 50 / Node Cluster 47

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 2
Cluster 50

Sequence Cluster 51 / Node Cluster 48

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 2
Cluster 51

Sequence Cluster 45 / Node Cluster 49

Full Sequences
Number of IDs: UniProt: 3, UniRef90: 2
Cluster 45

Sequence Cluster 52 / Node Cluster 50

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 2
Cluster 52

Sequence Cluster 19 / Node Cluster 51

Full Sequences
Number of IDs: UniProt: 20, UniRef90: 2
Cluster 19

Sequence Cluster 53 / Node Cluster 52

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 2
Cluster 53

Sequence Cluster 54 / Node Cluster 53

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 2
Cluster 54

Sequence Cluster 55 / Node Cluster 54

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 2
Cluster 55

Sequence Cluster 56 / Node Cluster 55

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 2
Cluster 56

Sequence Cluster 57 / Node Cluster 56

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 2
Cluster 57

Sequence Cluster 58 / Node Cluster 63

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 1
Cluster 58

Sequence Cluster 24 / Node Cluster 71

Full Sequences
Number of IDs: UniProt: 11, UniRef90: 1
Cluster 24

Sequence Cluster 59 / Node Cluster 79

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 1
Cluster 59

Sequence Cluster 60 / Node Cluster 86

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 1
Cluster 60

Sequence Cluster 61 / Node Cluster 97

Full Sequences
Number of IDs: UniProt: 2, UniRef90: 1
Cluster 61
