EFI - Enzyme Similarity Tool

Release Notes

December 7, 2018

The EST database now uses UniProt release 2018_10 and InterPro 71. UniProt release 2018_10 includes a total of 134,066,004 entries: 133,507,323 in TrEMBL and 558,681 in SwissProt.

Users of the EST can now create GNNs and color SSNs directly from the EST network files download page. UniRef90 and UniRef50 sequences can be used instead of an entire family to speed computation. For example, PF05544 contains 10,914 sequences in total, but only 4,198 UniRef90 and 552 UniRef50 seed sequences. The UniRef page at UniProt further discusses UniRef50 and UniRef90.
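As a rough illustration of the PF05544 figures quoted above, the following Python sketch computes how much smaller the UniRef90 and UniRef50 seed sets are than the full family. The helper name reduction_factor is hypothetical and is not part of the EST; only the three sequence counts come from this release note.

```python
# Sequence counts for PF05544 as quoted in the release note.
FULL = 10_914
UNIREF90 = 4_198
UNIREF50 = 552


def reduction_factor(full: int, reduced: int) -> float:
    """Return how many times smaller the reduced set is than the full set.

    Hypothetical helper for illustration only; not an EST function.
    """
    if full <= 0 or reduced <= 0:
        raise ValueError("sequence counts must be positive")
    return full / reduced


if __name__ == "__main__":
    for label, count in (("UniRef90", UNIREF90), ("UniRef50", UNIREF50)):
        factor = reduction_factor(FULL, count)
        print(f"{label}: {count:,} seed sequences, {factor:.1f}x fewer than the full family")
```

For this family the UniRef90 set is between two and three times smaller than the full set, and the UniRef50 set is close to twenty times smaller.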

July 11, 2018

The EST database now uses UniProt release 2018_06 and InterPro 69. UniProt release 2018_06 includes a total of 116,587,823 entries: 116,030,110 in TrEMBL and 557,713 in SwissProt.

This database includes 16,712 Pfam families, 34,358 InterPro families, and 604 Pfam clans. Tables of family sizes are available here.

May 9, 2018

The EST database now uses UniProt release 2018_04 and InterPro 68. UniProt release 2018_04 includes a total of 115,316,915 entries: 114,759,640 in TrEMBL and 557,275 in SwissProt.

This database includes 16,712 Pfam families, 33,947 InterPro families, and 604 Pfam clans. Tables of family sizes are available here.

March 27, 2018

The EST database now uses UniProt release 2018_02 and InterPro 67. UniProt release 2018_02 includes a total of 109,414,541 entries: 108,857,716 in TrEMBL and 556,825 in SwissProt.

This database includes 16,712 Pfam families, 33,707 InterPro families, and 604 Pfam clans. Tables of family sizes are available here.

January 17, 2018

The EFI-EST and EFI-GNT tools were updated with the following changes:

December 15, 2017

The EST database now uses UniProt release 2017_11 and InterPro 66. UniProt release 2017_11 includes a total of 99,261,416 entries: 98,705,220 in TrEMBL and 556,196 in SwissProt.

This database includes 16,712 Pfam families, 32,568 InterPro families, and 604 Pfam clans. Lists of the families/clans, along with the number of sequences in each (full and UniRef90), can be accessed with the link below. The reductions in the number of sequences when using UniRef90 seed sequences are provided; the time required for the BLAST is decreased by the square of this reduction. Use of UniRef90 seed sequences also allows SSNs to be generated for larger families/clans (305,000 sequence limit). Tables of family sizes are available here.
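To make the effect on BLAST time concrete, here is a minimal Python sketch. It assumes that the cost of an all-by-all BLAST grows with the number of sequence pairs compared, that is, with the square of the number of input sequences. That scaling is an assumption made for illustration, not a measured EST timing, and the function names are hypothetical. The example counts are the PF05544 figures from the December 7, 2018 note (10,914 full sequences, 4,198 UniRef90 seed sequences).

```python
def pairwise_comparisons(n: int) -> int:
    """Number of query/subject pairs in an all-by-all comparison of n sequences.

    Assumption: every sequence is compared against every sequence (n * n).
    """
    if n <= 0:
        raise ValueError("sequence count must be positive")
    return n * n


def estimated_speedup(full: int, reduced: int) -> float:
    """Estimated speedup from using the reduced set instead of the full set.

    Under the quadratic-cost assumption this equals (full / reduced) ** 2.
    """
    return pairwise_comparisons(full) / pairwise_comparisons(reduced)


if __name__ == "__main__":
    full, uniref90 = 10_914, 4_198  # PF05544 counts from the release notes
    reduction = full / uniref90
    speedup = estimated_speedup(full, uniref90)
    print(f"Reduction in sequences: {reduction:.2f}x")
    print(f"Estimated reduction in BLAST time: {speedup:.2f}x")
```

Under this assumption, a reduction of about 2.6-fold in sequence count corresponds to a reduction of between six- and seven-fold in the number of comparisons.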

Support for Pfam clans has now been added to the Families option. Pfam clans are collections of multiple Pfam families that define superfamilies. The sequences in the families in a clan are not mutually exclusive. A list of the families in each clan is available here. Pfam clans can also be specified in the FASTA and Accession IDs options as supplementary sequences.

Need help or have suggestions or comments? Please click here.