
Protein innovations

Protein C-terminal additions and the yeast prion [PSI+]

Genetic variation beyond stop codons is subject to little selective pressure. The [PSI+] prion is an epigenetically inherited aggregate of the Sup35 protein, which is a release factor required for translation to terminate at stop codons. When [PSI+] appears, elevated readthrough occurs at every gene in the genome, and a range of pre-existing cryptic genetic variation is phenotypically revealed. As an epigenetically inherited protein aggregate, [PSI+] can easily be lost after some generations. This returns the lineage to its normal [psi-] state and restores translation fidelity. If a subset of revealed phenotypic variation is adaptive, it may have lost its dependence on [PSI+] by this time. This process of genetic assimilation may, for example, involve one or more point mutations in stop codons. This leaves the yeast with a new adaptive trait and with no permanent load of other, deleterious variation. The yeast prion [PSI+] is a wonderful model system for studying evolutionary capacitance, because the relevant molecular biology is well understood.
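To make concrete what readthrough reveals, here is a minimal sketch that translates a 3′UTR in frame with the upstream coding sequence until the next stop codon is reached. It assumes the standard genetic code and a DNA string that begins immediately after the original stop codon. The function name `readthrough_extension` is hypothetical and is not taken from any of our published pipelines.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def readthrough_extension(utr3):
    """Translate a 3'UTR in frame until the next stop codon.

    utr3 is assumed to be DNA (A, C, G, T only) starting right after the
    original stop codon. Returns (peptide, found_stop), where found_stop
    is False if the sequence ends before any in-frame stop codon.
    """
    peptide = []
    for i in range(0, len(utr3) - 2, 3):
        aa = CODON_TABLE[utr3[i:i + 3].upper()]
        if aa == "*":
            return "".join(peptide), True
        peptide.append(aa)
    return "".join(peptide), False


# Illustrative, made-up sequence: three sense codons, then a TAA stop.
extension, found_stop = readthrough_extension("GCTAAATGGTAAGGG")
```

In this toy input, the extension is the short peptide that would be appended to the C-terminus if the original stop codon were read through.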

A great advantage of this system is that we can use the rich resource of fully sequenced, closely related Saccharomyces species. We classified events in which 3′UTR was incorporated into coding regions, both in Saccharomyces and in rodents. Based on mutational bias, most such events are expected to result from indels that occur shortly before the stop codon and shift it out of frame. This leads to the inclusion of frameshifted 3′UTR in the new allele, and this is indeed what we found in most cases in rodents. In contrast, in Saccharomyces a high proportion of 3′UTR incorporation events led to the inclusion of in-frame 3′UTR through precise mutation of the stop codon. This is compatible with the genetic assimilation of in-frame readthrough products produced by [PSI+] (Giacomelli et al. 2007).
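The distinction between the two kinds of event can be sketched as a simple decision rule. This is only an illustration of the logic, not the procedure used in Giacomelli et al. (2007). It assumes that an alignment has already been made, so that we know the codon in the derived allele aligned to the ancestral stop codon and the net indel length upstream of it. The function and parameter names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_incorporation(derived_codon_at_stop, upstream_indel_net):
    """Classify a candidate 3'UTR incorporation event.

    derived_codon_at_stop: the three nucleotides in the derived allele
        aligned to the ancestral stop codon.
    upstream_indel_net: inserted minus deleted nucleotides in the coding
        region upstream of the ancestral stop (assumed to come from a
        prior alignment step).
    """
    # An indel whose net length is not a multiple of three moves the
    # old stop codon out of frame, so frameshifted 3'UTR is translated.
    if upstream_indel_net % 3 != 0:
        return "frameshifted"
    # With the frame preserved, a point mutation turning the stop codon
    # into a sense codon adds in-frame 3'UTR to the protein.
    if derived_codon_at_stop.upper() not in STOP_CODONS:
        return "in-frame"
    return "no incorporation"


# Illustrative calls with made-up inputs.
event_a = classify_incorporation("TAA", -1)  # one-base deletion upstream
event_b = classify_incorporation("CAA", 0)   # stop codon mutated to a sense codon
```

Under [PSI+]-mediated assimilation one would expect an excess of events of the second kind, which is the pattern described above for Saccharomyces.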

De novo gene birth

Our theories of preadaptation may help explain how new protein-coding genes evolve de novo from non-coding sequences. In agreement with our theories, “non-coding” sequences are often found in association with ribosomes, which allows them to be translated. This provides an opportunity for the most deleterious amino acid sequences to be eliminated by selection (Wilson & Masel 2011). The resulting pre-screened set of sequences provides the raw material from which de novo protein-coding genes could be co-opted. We have also identified a new 28-amino-acid de novo gene in S. cerevisiae (Wilson & Masel 2011).

We are currently undertaking a range of studies concerning the importance of avoiding aggregation as a constraint on protein sequence evolution.

Scientific American ran a nice article about a recent SMBE symposium on this topic.


  • Foy SG, Wilson BA, Cordes MHJ, Masel J. Progressively more subtle aggregation avoidance strategies form a long-term arrow of protein evolutionary time, manuscript submitted.
  • Wilson BA, Foy SG, Neme R, Masel J. (2017) Young genes are highly disordered, falsifying the continuum hypothesis of de novo gene birth, manuscript submitted.
  • Kosinski LJ, Masel J. Stop codon readthrough errors purge deleterious cryptic sequences, facilitating the later co-option of non-coding sequences into coding, manuscript in preparation.
  • Andreatta, M.E., Levine, J.A., Foy, S.G., Guzman, L., Kosinski, L., Cordes, M.H.J., Masel, J. (2015) The recent de novo origin of protein C-termini, Genome Biology & Evolution 7(6):1686-1701.
  • Wilson, B. & Masel, J. (2011). Putatively noncoding transcripts show extensive association with ribosomes. Genome Biology & Evolution, 3, 1245-1252.
  • Giacomelli, M. G., Hancock, A. S., & Masel, J. (2007). The conversion of 3′ UTRs into coding regions. Mol Biol Evol, 24(2), 457-64.