EFI - Genome Neighborhood Tool

This web resource is supported by a Research Resource from the National Institute of General Medical Sciences (R24GM141196-01).
The tools are available without charge or license to both academic and commercial users.
Important Notice

The UniProtKB database used by the EFI tools is undergoing major reorganization starting with the 2025_04 release (https://www.uniprot.org/release-notes/forthcoming-changes). When the reorganization is fully implemented (2026_02 release, Spring 2026), the number of proteins in UniProtKB is expected to decrease from ~253M accessions in the current 2025_03 release to ~141M accessions in the 2026_02 release.

In response to these changes, we are planning to provide the current 2025_03 release until the 2026_02 release is available.

More information about the changes is located here.

Results

Submitted Network Name: 26123_IP91_IPR004184_UniRef90_NoFragments_Bacteroidetes_Minlen650_AS240_PB_full_ssn

The parameters for computing the GNN and associated files are summarized in the table.

Uploaded Filename: 26123_IP91_IPR004184_UniRef90_NoFragments_Bacteroidetes_Minlen650_AS240_PB_full_ssn.xgmml.zip
Neighborhood Size: 10
Input % Co-Occurrence: 20

Please cite your use of the EFI tools:

Rémi Zallot, Nils Oberg, and John A. Gerlt, The EFI Web Resource for Genomic Enzymology Tools: Leveraging Protein, Genome, and Metagenome Databases to Discover Novel Enzymes and Metabolic Pathways. Biochemistry 2019 58 (41), 4169-4182. https://doi.org/10.1021/acs.biochem.9b00735

Nils Oberg, Rémi Zallot, and John A. Gerlt, EFI-EST, EFI-GNT, and EFI-CGFP: Enzyme Function Initiative (EFI) Web Resource for Genomic Enzymology Tools. J Mol Biol 2023. https://doi.org/10.1016/j.jmb.2023.168018

Colored Sequence Similarity Network (SSN)

Each cluster in the submitted SSN has been identified and assigned a unique number and color. Node attributes for "Neighbor Pfam Families" and "Neighbor InterPro Families" have been added.

Number of Nodes: 333
Number of Edges: 13,658
File Size (Zipped MB): <1

Genome Neighborhood Networks (GNNs)

GNNs provide a representation of the neighboring Pfam families for each SSN cluster identified in the colored SSN. To be displayed, a neighboring Pfam family must be detected within the specified neighborhood window and at a co-occurrence frequency higher than the specified minimum.
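
The display rule above can be sketched as a small filter. This is only an illustration, not the EFI-GNT implementation: the function name, the `(query_id, pfam_id, offset)` tuple layout, the Pfam identifiers, and the choice of denominator (the number of query sequences in the cluster) are all assumptions. The defaults mirror the parameters of this job (neighborhood size 10, minimum co-occurrence 20%).

```python
from collections import defaultdict


def filter_neighbor_pfams(neighbors, cluster_size, window=10, min_cooccurrence=0.20):
    """Return {pfam_id: co-occurrence} for Pfam families that pass both checks.

    neighbors: iterable of (query_id, pfam_id, offset) tuples for one SSN
        cluster, where offset is the signed gene distance from the query
        (hypothetical layout, assumed for this sketch).
    cluster_size: number of query sequences in the cluster (assumed
        denominator for the co-occurrence frequency).
    """
    queries_per_pfam = defaultdict(set)
    for query_id, pfam_id, offset in neighbors:
        # Keep only neighbors inside the +/- window around the query gene.
        if offset != 0 and abs(offset) <= window:
            queries_per_pfam[pfam_id].add(query_id)

    kept = {}
    for pfam_id, queries in queries_per_pfam.items():
        frequency = len(queries) / cluster_size
        # "Higher than the specified minimum": strict comparison.
        if frequency > min_cooccurrence:
            kept[pfam_id] = frequency
    return kept


# Made-up example data for a cluster of five query sequences.
example_neighbors = [
    ("A", "PF00001", -2), ("B", "PF00001", 3), ("C", "PF00001", 1),
    ("A", "PF00002", 4),                      # only 1 of 5 queries
    ("A", "PF00003", 12), ("B", "PF00003", -15), ("C", "PF00003", 11),  # outside window
]
passing = filter_neighbor_pfams(example_neighbors, cluster_size=5)
```

In this made-up example only `PF00001` passes: `PF00002` occurs next to one of five queries, which does not exceed 20%, and every `PF00003` hit lies outside the 10-gene window.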

SSN Cluster Hub-Nodes: Genome Neighborhood Network (GNN)

Each hub-node in the network represents an SSN cluster. The spoke nodes represent Pfam families that have been identified as neighbors of the sequences from the center hub.

File Size (Zipped MB): <1

Pfam Family Hub-Nodes: Genome Neighborhood Network (GNN)

Each hub-node in the network represents a Pfam family identified as a neighbor. The spoke nodes represent the SSN clusters in which the Pfam family of the center hub was identified as a neighbor.

File Size (Zipped MB): <1
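
The two GNN layouts described above carry the same cluster-to-Pfam relationships, viewed from opposite ends. A minimal sketch of that relationship follows, assuming the cluster-hub network is available as a plain dictionary; the function name, cluster numbers, and Pfam identifiers are hypothetical, and this is not how the EFI-GNT files are actually parsed or produced.

```python
from collections import defaultdict


def invert_hub_spoke(cluster_to_pfams):
    """Turn {SSN cluster: set of neighbor Pfams} into {Pfam: sorted list of clusters}."""
    pfam_to_clusters = defaultdict(set)
    for cluster, pfams in cluster_to_pfams.items():
        for pfam in pfams:
            # Each spoke of a cluster hub becomes a hub with that cluster as a spoke.
            pfam_to_clusters[pfam].add(cluster)
    return {pfam: sorted(clusters) for pfam, clusters in pfam_to_clusters.items()}


# Hypothetical cluster-hub view: cluster number -> neighboring Pfam families.
cluster_hubs = {1: {"PF00001", "PF00002"}, 2: {"PF00001"}}
pfam_hubs = invert_hub_spoke(cluster_hubs)
```

Here `PF00001` becomes a hub with clusters 1 and 2 as spokes, while `PF00002` is a hub with cluster 1 as its only spoke.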

Mapping Tables, FASTA Files, ID Lists, and Supplementary Files
