The parameters used for the initial submission and the finalization are summarized in the table below.
|Analysis Job Number||27256|
|Total Number of Sequences After Filtering||20,089|
|EST Job Number||26121 (Original Dataset)|
|Database Version||UniProt: 2022-04 / InterPro: 91|
|Input Option||Families (Option B)|
|E-Value for SSN Edge Calculation||5|
|Pfam / InterPro Family||IPR004184|
|Number of IDs in Pfam / InterPro Family||21,636|
|Total Number of Sequences in Dataset||21,636|
|Total Number of Edges||217,671,851|
|Number of Unique Sequences||17,505|
The panels below provide files for full and representative node SSNs for download, with the indicated numbers of nodes and edges. As an approximate guide, SSNs with ~2M edges can be opened with 16 GB RAM, ~5M edges with 32 GB RAM, ~10M edges with 64 GB RAM, ~20M edges with 128 GB RAM, ~40M edges with 256 GB RAM, and ~120M edges with 768 GB RAM.
Files may be transferred to the Genome Neighborhood Tool (GNT), the Color SSN utility, the Cluster Analysis utility, or the Neighborhood Connectivity utility.
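The approximate memory guide above can be expressed as a small lookup, which may help when deciding which SSN file to download. This is only a sketch: the function name `suggested_ram_gb` and the table `RAM_GUIDE` are hypothetical, and treating each "~" figure as an upper bound for its RAM tier is an assumption, not something the guide states.

```python
from bisect import bisect_left

# (edge count, GB RAM) pairs copied from the approximate guide in the text.
# Assumption: each edge count is treated as the upper bound for its RAM tier.
RAM_GUIDE = [
    (2_000_000, 16),
    (5_000_000, 32),
    (10_000_000, 64),
    (20_000_000, 128),
    (40_000_000, 256),
    (120_000_000, 768),
]


def suggested_ram_gb(num_edges):
    """Return the smallest listed RAM tier covering num_edges, or None if
    the edge count is larger than anything the guide covers."""
    limits = [limit for limit, _ in RAM_GUIDE]
    i = bisect_left(limits, num_edges)
    if i == len(RAM_GUIDE):
        return None  # beyond the guide; no estimate is given
    return RAM_GUIDE[i][1]


if __name__ == "__main__":
    for edges in (1_500_000, 8_000_000, 217_671_851):
        print(edges, suggested_ram_gb(edges))
```

A network with more edges than the largest listed tier returns `None` rather than an extrapolated value, because the guide gives no figure beyond ~120M edges.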
Each node in the network represents a single protein sequence.
|# Nodes||# Edges|
In representative node (RepNode) networks, each node in the network represents a collection of proteins grouped according to percent identity. For example, for a 75% identity RepNode network, all connected sequences that share 75% or more identity are grouped into a single node (meta node). Collapsing sequences in this way reduces the overall number of nodes, producing less complicated networks that are easier to load in Cytoscape.
The cluster organization is not changed, and the clustering of sequences remains identical to the full network.
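The grouping described above can be illustrated with a short sketch. It assumes the pairwise percent identities are already available as a dictionary and merges sequences transitively with a union-find structure. The function `group_into_repnodes` and the input layout are hypothetical, and the actual tool may use a different grouping procedure.

```python
def group_into_repnodes(sequence_ids, pairwise_identity, threshold=75.0):
    """Group sequences into meta nodes.

    pairwise_identity maps (id_a, id_b) tuples to percent identity.
    Assumption: any pair at or above the threshold is merged, and merging
    is transitive (single linkage); this is an illustration only.
    """
    parent = {s: s for s in sequence_ids}

    def find(x):
        # Follow parent links to the group root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), identity in pairwise_identity.items():
        if identity >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for s in sequence_ids:
        groups.setdefault(find(s), []).append(s)
    # Sort members and groups so the output is deterministic.
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    ids = ["A", "B", "C", "D"]
    identities = {
        ("A", "B"): 80.0,
        ("B", "C"): 76.0,
        ("A", "C"): 60.0,
        ("C", "D"): 40.0,
    }
    print(group_into_repnodes(ids, identities))
```

In this toy input, A, B, and C collapse into one meta node because A-B and B-C both meet the 75% threshold, while D remains a node of its own.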
|% ID||# Nodes||# Edges|
Portions of these data are derived from the Universal Protein Resource (UniProt) databases.