Impact of alternative initiation, splicing, and termination on the diversity of the mRNA transcripts encoded by the mouse transcriptome

Mihaela Zavolan, Shinji Kondo, Christian Schönbach, Jun Adachi, David A. Hume, Takahiro Arakawa, Piero Carninci, Jun Kawai, Yoshihide Hayashizaki, Terry Gaasterland

Research output: Contribution to journal (Article)

156 Citations (Scopus)

Abstract

We analyzed the FANTOM2 clone set of 60,770 RIKEN full-length mouse cDNA sequences and 44,122 public mRNA sequences. We developed a new computational procedure to identify and classify the forms of splice variation evident in this data set and organized the results into a publicly accessible database that can be used for future expression array construction, structural genomics, and analyses of the mechanism and regulation of alternative splicing. Statistical analysis shows that at least 41% and possibly as much as 60% of multiexon genes in mouse have multiple splice forms. Of the transcription units with multiple splice forms, 49% contain transcripts in which the apparent use of an alternative transcription start (stop) is accompanied by alternative splicing of the initial (terminal) exon. This implies that alternative transcription may frequently induce alternative splicing. The fact that 73% of all exons with splice variation fall within the annotated coding region indicates that most splice variation is likely to affect the protein form. Finally, we compared the set of constitutive (present in all transcripts) exons with the set of cryptic (present only in some transcripts) exons and found statistically significant differences in their length distributions, the nucleotide distributions around their splice junctions, and the frequencies of occurrence of several short sequence motifs.
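
The abstract's distinction between constitutive and cryptic exons can be illustrated with a minimal sketch. It is not the authors' procedure, which works from cDNA-to-genome alignments that are not described here. It assumes each transcript of one transcription unit has already been reduced to a set of exon identifiers; the function name `classify_exons`, the transcript names, and the exon labels are all hypothetical.

```python
from typing import Dict, Set, Tuple


def classify_exons(transcripts: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Split the exons of one transcription unit into two sets.

    constitutive: exons present in every transcript
    cryptic:      exons present in only some transcripts
    Assumes exons are already matched across transcripts and carry
    shared identifiers (a simplification made for illustration).
    """
    if not transcripts:
        return set(), set()
    exon_sets = list(transcripts.values())
    all_exons = set().union(*exon_sets)          # every exon seen in any transcript
    constitutive = set.intersection(*exon_sets)  # exons shared by all transcripts
    cryptic = all_exons - constitutive           # the remainder appears only in some
    return constitutive, cryptic


# Hypothetical transcription unit with three transcripts.
unit = {
    "transcript_a": {"e1", "e2", "e3", "e4"},
    "transcript_b": {"e1", "e3", "e4"},
    "transcript_c": {"e1", "e2", "e4"},
}

constitutive, cryptic = classify_exons(unit)
print("constitutive:", sorted(constitutive))
print("cryptic:", sorted(cryptic))
```

In this toy unit, `e1` and `e4` occur in all three transcripts, while `e2` and `e3` are each missing from one. In real data, exons would have to be matched by genomic coordinates rather than by pre-assigned labels.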

Original language: English
Pages (from-to): 1290-1300
Number of pages: 11
Journal: Genome Research
Volume: 13
Issue number: 6 B
DOIs: https://doi.org/10.1101/gr.1017303
Publication status: Published - Jun 1 2003
Externally published: Yes

Fingerprint

Alternative Splicing
Transcriptome
Exons
Messenger RNA
Genomics
Nucleotides
Complementary DNA
Clone Cells
Databases
Genes
Proteins

ASJC Scopus subject areas

  • Genetics

Cite this

Impact of alternative initiation, splicing, and termination on the diversity of the mRNA transcripts encoded by the mouse transcriptome. / Zavolan, Mihaela; Kondo, Shinji; Schönbach, Christian; Adachi, Jun; Hume, David A.; Arakawa, Takahiro; Carninci, Piero; Kawai, Jun; Hayashizaki, Yoshihide; Gaasterland, Terry.

In: Genome Research, Vol. 13, No. 6 B, 01.06.2003, p. 1290-1300.

Zavolan, M, Kondo, S, Schönbach, C, Adachi, J, Hume, DA, Arakawa, T, Carninci, P, Kawai, J, Hayashizaki, Y & Gaasterland, T 2003, 'Impact of alternative initiation, splicing, and termination on the diversity of the mRNA transcripts encoded by the mouse transcriptome', Genome Research, vol. 13, no. 6 B, pp. 1290-1300. https://doi.org/10.1101/gr.1017303
@article{80e5f6748e204ab6b5e0576e57482892,
title = "Impact of alternative initiation, splicing, and termination on the diversity of the mRNA transcripts encoded by the mouse transcriptome",
author = "Mihaela Zavolan and Shinji Kondo and Christian Sch{\"o}nbach and Jun Adachi and Hume, {David A.} and Takahiro Arakawa and Piero Carninci and Jun Kawai and Yoshihide Hayashizaki and Terry Gaasterland",
year = "2003",
month = "6",
day = "1",
doi = "10.1101/gr.1017303",
language = "English",
volume = "13",
pages = "1290--1300",
journal = "Genome Research",
issn = "1088-9051",
publisher = "Cold Spring Harbor Laboratory Press",
number = "6 B",

}

TY - JOUR

T1 - Impact of alternative initiation, splicing, and termination on the diversity of the mRNA transcripts encoded by the mouse transcriptome

AU - Zavolan, Mihaela

AU - Kondo, Shinji

AU - Schönbach, Christian

AU - Adachi, Jun

AU - Hume, David A.

AU - Arakawa, Takahiro

AU - Carninci, Piero

AU - Kawai, Jun

AU - Hayashizaki, Yoshihide

AU - Gaasterland, Terry

PY - 2003/6/1

Y1 - 2003/6/1

UR - http://www.scopus.com/inward/record.url?scp=0038349624&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0038349624&partnerID=8YFLogxK

U2 - 10.1101/gr.1017303

DO - 10.1101/gr.1017303

M3 - Article

VL - 13

SP - 1290

EP - 1300

JO - Genome Research

JF - Genome Research

SN - 1088-9051

IS - 6 B

ER -