A general overview of the protein sequence set for the mouse transcriptome produced during the FANTOM2 sequencing project is presented here. We applied different algorithms to characterize protein sequences derived from a nonredundant representative protein set (RPS) and a variant protein set (VPS) of the mouse transcriptome. The functional characterization and assignment of Gene Ontology terms was done by analysis of the proteome using InterPro. The Superfamily database analyses gave a detailed structural classification according to SCOP and provide additional evidence for the functional characterization of the proteome data. The MDS database analysis revealed new domains which are not presented in existing protein domain databases. Thus the transcriptome gives us a unique source of data for the detection of new functional groups. The data obtained for the RPS and VPS sets facilitated the comparison of different patterns of protein expression. A comparison of other existing mouse and human protein sequence sets (e.g., the International Protein Index) demonstrates the common patterns in mammalian proteomes. The analysis of the membrane organization within the transcriptome of multiple eukaryotes provides valuable statistics about the distribution of secretory and transmembrane proteins.
ASJC Scopus subject areas