Current Undergraduate Research opportunities

Transposable Elements

Expression of transposable elements (TEs) in different tissues:

Transcription is a vital step in the process of transposition, as TE encoded proteins need to be generated. Although RNA-seq datasets exist for a large number of tissues and developmental stages in maize, particularly the inbred line B73, most analysis pipelines focus only on protein coding genes. This project would use Kallisto, a relatively new program for quickly measuring gene expression differences, to quantify how much transcription is occurring for a given TE copy or TE family. For a TE to make it to the next generation, it has to transpose in tissue that will become pollen or eggs, so we would focus on these tissues and their precursors. Candidate TE families would be Dotted, Ac, and a number of LTR retrotransposon families. It would be neat to follow up on a result from Kunze that has never been published, of increased transcription of Ds when Ac is present in the genome. This seems counter-intuitive to host control of a TE.

B73 polymorphism:

The maize genome consists of over 85% TEs, and 75% LTR retrotransposons, but little is known about the transpositional activity of LTR retrotransposons. Only one LTR TE (bs1) has been observed to jump. This project would use short read resequencing of B73 to identify candidate TE regions that appear absent across accessions of the inbred line B73. Although B73 is often thought of as a completely inbred stock, differences have accumulated, as labs across the world grow these plants, and generate their own stocks. Transposition may contribute to differences between these stocks.

Identifying the autonomous partner to rDt:

Dotted is a DNA TE discovered by Marcus Rhoades in the 1940’s, named such because it generates small dots on the aleurone. Now, we know this is due to a small nonautonomous TE, rDt, in an anthocyanin pigment producing gene. Experiments in the 1980’s showed that unlike other elements like Ac and Mu which stop transposing once there are high copy numbers in the genome, Dotted just keeps jumping more no matter how many copies introduced into one background. This project would include genotyping Dotted insertions across lines, and testing for their presence and transposition in a number of controlled crosses.

Characterizing the genomic niche of a retrotransposon family:

The maize genome is a majority DNA derived from LTR retrotransposons, but there is a stunning diversity of types and ecological strategies of these TEs. Simply knowing a sequence is from an LTR TE doesn’t tell you much. This project would entail characterizing every insertion in the genome of a small family of LTR retrotransposon (~10-30 insertions), and identifying whether they are present in the same positions in other individuals. This would be done through a combination of bioinformatic (e.g. relocaTE, TEPID, mcclintock) and wet-lab techniques to verify position in different genotypes. This would address whether TEs that always insert into the 5’ UTR of one maize inbred line do this in every inbred line, or whether the ecological environment of another genome disrupts the insertion preference of the TE. A somewhat similar approach is described here.

Molecular evolution of TE proteins:

We have evidence of activity of some TE families in maize, but not all. Is this due to innate deficiencies in their protein coding sequence for transposon proteins? This project would mine genomic sequences to identify TE protein coding domains, and use population genetic methods to measure the extent of purifying and positive selection acting on individual families of TEs. There are lots of ways to take this, but I like the theoretical framework proposed here.

Tree Genetics

Identify of locus controlling bitter/sweet kernel (Sk) in almond

In almond breeding one of the first phenotypes evaluated is whether an almond tree produces sweet or bitter kernels. Obviously only those genotypes with sweet kernels are advanced for further evaluation and possibly for variety development. The kernel sweetness is controlled in a dominant/co-dominant manner at the Sk locus with the sweet kernel being dominant. (Literature indicates that the sweet kernel allele may be mostly dominant and produce a bittersweet kernel phenotype when in combination with the bitter kernel allele but distinction between bittersweet and bitter phenotypes is unclear.) Markers with some linkage to the phenotype have been identified, identifying an approximately 2 Mb region in the peach genome. Identifying the genetic control of the Sk locus and alleles contributing to bitter and sweet kernels would be of great benefit to almond breeding programs and could be used to investigate the origin and evolution of the sweet kernel allele(s) in almond.


When chromosomes break, sometimes the cell’s repair machinery doesn’t work as accurately as expected, and occasionally the broken chunk of chromosome flips around before it is stuck back in place. These changes, called inversions, can play an important role in evolutionary genetics, such as disrupting gene function via breakage, modifying gene action by rearranging regulatory sequence, suppressing recombination and fomenting adaptive divergence and introgression. Many cool examples of inversions have been identified in the nature populations such as perenniality in Mimulus, mating behavior in birds, marine and fresh water divergence in sticklebacks, and mimicry polymorphism of wing patterns in butterflies.

So, what about inversions in maize and maize wild relative teosinte? So far, we have identified nearly 130Mb of Zea inversions, which is nearly the entire genome length of the Arabidopsis. The biggest inversion identified so far on Chr1 is about 50Mb. This inversion identified in teosinte an altitudinal cline and associates with phenotype culm diameter, a trait difference between maize and teosinte. When we surveyed a handful of teosinte populations using relatively low-density genotyping, we easily pick up several megabase-scale inversions. Of course, smaller inversions will be harder to detect, but also less likely to be disrupted by crossing over and probably much more abundant in the genome. So far, we have developed an R package to identify inversions based on genotype data. The plan is to use this package to identify novel inversions across a large diverse panel of inbred lines, then verify our approach using population genetics and comparison to several fully sequenced genomes.

Adaptation to UV radiation

In order to photosynthesize enough to survive, plants must be able to capture energy from solar radiation while avoiding the harmful effects of UV radiation that is also present. Models of future climates suggest that, as in current high elevation environments, increase UV radiation will be damaging to plant growth and lead to decreased plant productivity and increased mortality. These fitness reductions have obvious consequences for agricultural output and the long term stability of natural plant populations. Domesticated maize and its close relatives the teosintes have been shown to have a number of adaptations to mitigate UV damage like an increase in chemical sunscreen compounds as well as a dense layer of hair on stems and leaves. While the genes that control many of these traits have been characterized by coarse genetic surveys of mutant plants, the specific genetic changes that have been naturally selected to confer these advantages are still unknown. In this experiment we will survey high elevation maize and teosinte plants’ ability to avoid the harmful effects of UV damage in leaves and pollen. This data, combined with sequencing data from these same plants, will allow us to characterize genetic predictors of adaptive UV resistance. With this knowledge we will be able to characterize the size and scope of UV adaptive responses in high elevation maize genomes and begin to generate improvement strategies to meet the demands of agricultural output in changing climates.