Plenty of Fish in the Sea, or “Analysis & Results”

There are plenty of fish in the sea… Especially in the Green, Tennessee, Ohio, Cumberland, and Laurel river systems. I couldn’t go the whole summer without throwing that line in, could I?


Liz Lemon likes my joke. 

Update: I finished my pipetting. I finished gathering the genetic information at 25 loci for 169 individuals. (Or should I say fin-dividuals?) I finished sifting through those numbers to weed out the false alleles. In short, I FINISHED MY DATA COLLECTION.

I’m as happy as this guy.

I’m left with three days for analysis before I leave New Haven. My computer and I have become fast friends. I have never considered myself a particularly tech-savvy person, but lo and behold, I am writing rudimentary code. Along the way I’ve picked up the tools Tandem (which rounds nucleotide repeats) and STRUCTURE (which analyses genotype data to infer distinct populations).
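For the curious, here is a toy Python sketch of what "rounding nucleotide repeats" can mean: snapping a measured fragment length onto the nearest whole number of repeat units. It is an illustration only. The function name, the 4-base repeat size, and the measurements are all made up, and this is not how Tandem itself is implemented.

```python
def bin_allele(fragment_length, repeat_size, offset=0.0):
    """Snap a measured fragment length (in base pairs) to the nearest repeat-unit bin.

    Hypothetical helper for illustration; real binning tools are more careful.
    """
    # How many whole repeat units best describe this fragment?
    units = round((fragment_length - offset) / repeat_size)
    # Convert the whole number of units back into a fragment length.
    return offset + units * repeat_size


# Made-up measurements for a locus with an assumed 4-base repeat.
raw_lengths = [150.2, 153.9, 158.1, 161.7]
binned = [bin_allele(length, repeat_size=4) for length in raw_lengths]
print(binned)
```

Noisy measurements that sit close together land in the same bin, which is the point: two fish with "150.2" and "153.9" are treated as carrying the same allele.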

My first round of analysis yielded five distinct populations. Each of the six sections in the picture below represents a locality that we sampled in the river system.

Bar plot, K = 5

At first glance, it appears that populations 1, 2, the majority of 3, and 4 each form their own genetic clusters. A handful of specimens in population 3 (the dark blue amid the red) were mislabelled and will be taken out of the dataset.

Populations 5 and 6 are different from the others but similar to each other. However, when I ran only populations 5 and 6 together, I saw that…


…at a more specific level of analysis, they form their own genetic clusters as well!

I’m working with Professor Near and a postdoc in the lab, Rich, to continue this analysis. The next steps involve reorganizing the fin-dividuals in the dataset to reflect the geography of the river system. For instance, if population 4 represents the Upper Tennessee River, we will place the individuals from the top of the river system early in the dataset and the individuals from the bottom of the river system at the end of the dataset.
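A minimal sketch of that reordering step in Python, assuming each sample is stored with the name of its locality. The sample IDs and the upstream-to-downstream order below are hypothetical placeholders; the real order would come from the river map.

```python
# Hypothetical upstream-to-downstream ordering (an assumption for illustration).
river_order = ["Upper Tennessee", "Lower Tennessee", "Green"]

# Map each locality to its position along the river.
rank = {name: position for position, name in enumerate(river_order)}

# Made-up samples: (sample ID, locality).
samples = [
    ("fish_03", "Green"),
    ("fish_01", "Upper Tennessee"),
    ("fish_02", "Lower Tennessee"),
    ("fish_04", "Upper Tennessee"),
]

# sorted() is stable, so fish from the same locality keep their original order.
ordered = sorted(samples, key=lambda sample: rank[sample[1]])

for sample_id, locality in ordered:
    print(sample_id, locality)
```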

Further work will show whether the microsatellite data support the already established gene tree formed with mitochondrial data.

A Brief Aside…

I’ve been very busy. Instead of chatting up gel electrophoresis today, I’ll give a brief update on life in the lab and take you on a picture tour.

After many rounds of primer screening, I’ve found twenty-five viable loci for my project! Hooraaaaaayyy! The next step involves running microsatellite PCR for all 172 samples of E. kennicotti. Keep in mind that my summer job ends in two weeks, so it’s a sprint to the finish.

What did last week look like, you ask? The same as this week and, most likely, the next: pipetting, pipetting, and more pipetting. To keep me on my toes while working, I listen to upbeat tunes (this week the musical soundtracks to Hamilton, Ragtime, & Sweeney Todd) and podcasts (This American Life, Invisibilia, & an interesting new show about being a single mother by choice called Not By Accident).

Want to see where I work?
Come on in…



What’s that on the wall? 


Oh, it’s just Elvis. 


My lab space on a sunny day. 


That sunny day. 



Fish galore.

Now off to power through for a full data set!

Seeing DNA with My Own Eyes, or “The Polymerase Chain Reaction”

In my last post I mentioned the process of gene amplification through Polymerase Chain Reaction (PCR). I thought I might expand on the topic, since most of my week has revolved around diluting primers and combining them with other reagents to amplify specific sequences of DNA.

Some vocab that might help to read this post:

  • Polymerase Chain Reaction (PCR): a molecular biology technique used to amplify a piece of DNA over and over again, allowing analysis of that gene
  • Primer: a short complementary nucleotide sequence created to bind a specific gene
  • Locus: the specific location of a gene’s DNA sequence on a chromosome (plural: loci)
  • Genome: the complete set of genetic material present in an organism

Again, the big picture: our project goal is to establish the magnitude of relation between six populations of E. kennicotti in the Upper Cumberland, Laurel River, Ohio-Clarks, Green, Lower Tennessee, and Upper Tennessee.


Populations of E. kennicotti have already been compared by morphology (the physical structures of organisms, e.g., scale count), nuclear loci (genetic material located in the nucleus), and mitochondrial loci (genetic material located in the mitochondria, an organelle inherited from an individual’s mother). We are taking the project one step further by comparing ~25 microsatellite loci.

Though I talked about microsatellites briefly in my second blog post, here’s a refresher: microsatellites are nucleotide repeats found in an individual’s genome. The repeats don’t code for anything – while some DNA sequences make proteins, others like microsatellites act as glorified placeholders. (There are various theories on what they actually do.) The number of microsatellite sequence repeats differs between individuals. We can estimate the amount of genetic diversity between populations by establishing the variation in number of repeats.
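To make "number of repeats" concrete, here is a small Python sketch that counts back-to-back copies of a repeat motif in a DNA string. The sequences, the "CA" motif, and the function name are invented for illustration. In the lab, repeat number is read from the size of the PCR product, not from a typed-out sequence.

```python
import re


def longest_repeat_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    # Find every stretch made of the motif repeated one or more times.
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    # Convert each stretch's length into a repeat count and keep the biggest.
    return max((len(run) // len(motif) for run in runs), default=0)


# Two made-up fish that differ only in how many "CA" repeats they carry.
fish_a = "TTG" + "CA" * 4 + "GG"
fish_b = "TTG" + "CA" * 7 + "GG"

print(longest_repeat_run(fish_a, "CA"))
print(longest_repeat_run(fish_b, "CA"))
```

Comparing counts like these across many fish and many loci is the raw material for estimating how different two populations are.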

Okay. The stage is set. So how do we visualize the number of microsatellite repeats across 25 loci in an individual? PCR! Wahoo!
We combine various reagents, including DNA polymerase, free nucleotides, and primers. We put this mixture into individual tubes, add DNA from a different individual to each, then stick the tubes in a thermal cycler. The cycler’s program allows the DNA to:

  1. Denature. The strands of DNA are usually in a double helix, but in high temperatures the helix “unzips.”
  2. Anneal to the primer. Primers have been specifically constructed to bind to these DNA sequences. A forward primer binds to one of the unzipped strands, and a reverse primer binds to the other.
  3. Elongate. The DNA polymerase creates a new DNA strand complementary to the template strands in step 2.

Steps 1-3 repeat on a cycle to amplify the DNA. Image from Wikipedia.
To visualize the product, we perform gel electrophoresis. But I’ll save that for another time.
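A quick back-of-the-envelope sketch of why repeating steps 1-3 is so powerful: under ideal conditions each cycle doubles the number of copies, so n cycles turn one template into 2^n copies. Real reactions fall short of perfect doubling, and the cycle counts below are illustrative assumptions, not this lab's protocol.

```python
def ideal_pcr_copies(starting_copies, cycles):
    """Copies after a number of cycles, assuming every cycle doubles the DNA perfectly."""
    return starting_copies * 2 ** cycles


# Start from a single template molecule and watch the count grow.
for cycles in (1, 10, 30):
    print(cycles, "cycles:", ideal_pcr_copies(1, cycles), "copies")
```

Thirty ideal cycles already give over a billion copies from one starting molecule, which is why a tiny tissue sample is enough.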

A History Lesson, or “Phylogenetic Revelations Inspired by Molecular & Morphological Data”

Last week I spent most of my time…

…learning how to extract DNA from tissue samples …




…pipetting liquid A into liquid B to amplify segments of DNA through Polymerase Chain Reaction (PCR)




…and running gels to make sure all of that worked!


These skills are crucial for molecular biology work.

On Friday, I met with Professor Near and learned more about the project, particularly how my molecular work relates to his phylogenetic theories for E. kennicotti. But first: what is phylogenetics?
Phylogeneticists study the evolutionary history and relationships among individuals or groups of organisms. (Thanks, Google.) We construct trees that show the links between species based on their most recent common ancestor.


This phylogenetic tree shows that A and B share a more recent common ancestor (signified by the nodes) than either does with C. Thus lineages A and B diverged from each other more recently than either diverged from lineage C.
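The same three-lineage tree can be written down as a tiny Python sketch that looks up the most recent common ancestor of any two lineages. The node labels "AB" and "root" are hypothetical names for the two nodes in the picture.

```python
# Each entry points from a node to its parent; the root has no parent.
parent = {"A": "AB", "B": "AB", "AB": "root", "C": "root", "root": None}


def ancestors(node):
    """List a node followed by all of its ancestors, ending at the root."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def most_recent_common_ancestor(x, y):
    """Walk up from y and return the first node that is also above (or equal to) x."""
    seen = set(ancestors(x))
    for node in ancestors(y):
        if node in seen:
            return node
    return None


print(most_recent_common_ancestor("A", "B"))
print(most_recent_common_ancestor("A", "C"))
```

A and B meet at the inner node, while A and C only meet at the root, which is the code version of "A and B are each other's closest relatives."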


Professor Near believes that the group known as E. kennicotti actually contains multiple different species (Rich Harrington, E. France, M. Thomas, & T. Near unpublished). We examine this theory through phylogeography, the examination of geographic distributions of individuals in relation to their genetic lineages; morphological variation, the similarities and differences between the physical structures of individuals; and molecular variation, the similarities and differences between the genetic material of individuals.

Much work has been done already, and more lies ahead.

The Stripetail Darter, or “Etheostoma kennicotti”

I’ll be working on a new species description of the stripetail darter, also known as Etheostoma kennicotti.

Here’s a shot of my new friend. Ken insisted that I get his good side.




E. kennicotti is found in small rivers in the eastern US – Tennessee, Ohio, Kentucky, etc.


To do this, I’ll be analyzing the microsatellites of individuals. If you’d like the long explanation of microsatellite genotyping, click away! Here’s the short story: microsatellites are nucleotide repeats found in an individual’s genome (all of its genetic material). The repeats don’t code for anything – some DNA sequences make proteins, while other sequences like microsatellites are more of a mystery. That being said, the number of repeats differs between individuals. We will use the differences in repeat number to gauge the amount of genetic diversity between populations.


My Summer Experience with Fish, or “The Phylogenomics and the Evolution of North American Freshwater Fishes”

Are you interested in genetics? What about fish? (Think swimming in water, not fried on a plate.) When picking a trendy outfit, do you gravitate towards lab coats and safety goggles? Most importantly, how do you feel about identifying and amplifying fish genes while wearing the aforementioned trendy outfit?

Well, you’ve come to the right place. A bit about me: I’m Claire, an undergraduate student studying at Yale University. I’m originally from Texas, listen to R&B, and practice yoga. And I study Ecology & Evolutionary Biology! Because you learn the mechanics of how life works. A few months ago I learned about biology research opportunities organized and funded by the Yale Peabody Museum of Natural History. Check it out if you’re ever in the area; you can never outgrow dinosaur fossils.

So here I am, doing summer research in the Near Lab at Yale thanks to the Peabody. I don’t know too much about the evolutionary history of North American freshwater fish – yet – but I know someone who does. That would be Professor Thomas Near, the driving force behind the lab.

In his own words: “North America is home to the most species rich non-tropical fauna of freshwater fishes on Earth. Investigating the mechanisms responsible for this incredible diversity can add light to some of the oldest questions in evolutionary biology. This research involves the generation of phylogenetic trees using molecular data. A natural byproduct of this work is the discovery and description of new species.”

Translation: a lot of freshwater fish are swimming around North America, and we have yet to discover them all. If you didn’t understand all of the words in Professor Near’s research description, don’t worry. I’m writing about the things that I learn over the summer to make phylogenetics research more accessible to those of us who aren’t science encyclopedias.

Please enjoy the lovely photograph below. I’ll be in lab.