NCBI Gene: Variations (Part 3): Finding Common Variants

Open the Human TH record in the NCBI Gene Database in another browser window to work through this tutorial side by side.

We are exploring questions about genetic variation with the human TH gene.

In this section we'll explore the question: Are there any common protein variants in this gene?

We define "common variants" as those that have a minor allele frequency (MAF) of 1% or more in the population.
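To make the definition concrete, here is a minimal sketch of how MAF could be computed from allele counts and compared to the 1% cutoff. The function names and the counts are hypothetical, made up for illustration; they are not taken from the TH record or from any NCBI tool. The sketch assumes MAF means the frequency of the second most common allele at a site.

```python
def minor_allele_frequency(allele_counts):
    """Return the frequency of the second most common allele at a site."""
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no alleles observed")
    freqs = sorted((count / total for count in allele_counts.values()), reverse=True)
    # With only one observed allele there is no minor allele, so MAF is 0.
    return freqs[1] if len(freqs) > 1 else 0.0


def is_common(allele_counts, threshold=0.01):
    """Apply the tutorial's definition: common means MAF of 1% or more."""
    return minor_allele_frequency(allele_counts) >= threshold


# Hypothetical counts for illustration only (not real data for any variant).
example_counts = {"C": 940, "T": 60}
example_maf = minor_allele_frequency(example_counts)
example_is_common = is_common(example_counts)
```

With these made-up counts, 60 of 1000 observed alleles are the minor allele, so the variant would count as common under the 1% definition.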

You can find the common variants for a gene from the NCBI gene record.

From this TH Gene record, go to the Variation section using the table of contents.

[Image: table of contents in the Gene record]

This time we'll look at the Variation Viewer. Note that there are two links: one for GRCh37.p13 and one for GRCh38.

[Image: Variation Viewer links in the Gene record]

These are two different assemblies of the human reference genome, so the same variant can have different coordinates in each. Many researchers and labs continue to use an older assembly's coordinates for a long time after a new assembly becomes available.

Follow the link "See Variation Viewer (GRCh38)."

You are now in the Variation Viewer. The Variation Viewer shows you the genomic context of the variations.

We won't be exploring the Variation Viewer in detail right now, but there is a 4-minute tour of the Variation Viewer available, if you would like to learn more.

Scroll down to the "Molecular consequence" filters in the left menu.

Select "missense" under "Molecular consequence" to limit the table to coding-region variants that change an amino acid.

[Image: "Molecular consequence" filter in the Variation Viewer]

Scroll further to the "1000 Genomes MAF" section to filter for only common variants.

What is "1000 Genomes"?

What was MAF, again?

Remember that a "common" variant has a frequency of greater than or equal to 1% in the general population.

We'll use a stricter cutoff for this example: >= 0.05 (5%).
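The two filters applied so far combine as a logical AND: a row stays in the table only if it is a missense variant and its 1000 Genomes MAF is at least 0.05. Here is a minimal sketch of that logic. The `Variant` class, the `filter_variants` function and the rows are all hypothetical, made up for illustration; this is not how the Variation Viewer is implemented, and the rows are not real TH data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    variant_id: str
    consequence: str
    maf: Optional[float]  # None when no 1000 Genomes frequency is reported


def filter_variants(variants, consequence="missense", min_maf=0.05):
    """Keep variants that match the consequence and meet the MAF cutoff."""
    return [
        v for v in variants
        if v.consequence == consequence
        and v.maf is not None
        and v.maf >= min_maf
    ]


# Hypothetical rows for illustration only.
rows = [
    Variant("example_1", "missense", 0.002),   # missense but rare
    Variant("example_2", "missense", 0.31),    # missense and common
    Variant("example_3", "intron", 0.40),      # common but not missense
    Variant("example_4", "missense", None),    # no frequency reported
]
kept_ids = [v.variant_id for v in filter_variants(rows)]
```

Only `example_2` passes both filters. Variants with no reported frequency are dropped here, which is an assumption of this sketch.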

[Image: "1000 Genomes MAF" filter in the Variation Viewer]

One variant remains, rs6356. You may need to scroll back up the page to view the result.

This Variant ID is an "rs" number, or "Reference SNP" ID, from the Single Nucleotide Polymorphism database (dbSNP). Follow the link from rs6356 to go to dbSNP to learn more about this variant.

You're now in the SNP (single nucleotide polymorphism) database.

What is a single nucleotide polymorphism?

Earlier we used the ClinVar filters to find single nucleotide variants that cause disease. dbSNP contains information about human single nucleotide variations as well as other short genetic variations.

In this class we'll generally be following links from other databases to dbSNP, but keep in mind that you can also search dbSNP directly with an rs number. [HINT: You might be doing this in a future exercise.]
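If you ever want to look up an rs number from a script rather than the search box, NCBI's E-utilities provide an ESummary endpoint. Here is a minimal sketch that only builds the request URL and does not contact NCBI. It assumes the dbSNP identifier passed to E-utilities is the rs number without its "rs" prefix and that JSON output is requested; the helper name `dbsnp_summary_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESummary service.
EUTILS_ESUMMARY = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def dbsnp_summary_url(rs_id):
    """Build an ESummary URL for a dbSNP rs number such as 'rs6356'."""
    text = rs_id.strip().lower()
    if text.startswith("rs"):
        text = text[2:]  # assumption: the service takes the numeric part only
    if not text.isdigit():
        raise ValueError(f"not an rs number: {rs_id!r}")
    query = urlencode({"db": "snp", "id": text, "retmode": "json"})
    return f"{EUTILS_ESUMMARY}?{query}"


url = dbsnp_summary_url("rs6356")
```

Pasting the resulting URL into a browser should return a summary of the record. The exact fields in the response are not covered here, so check the E-utilities documentation before relying on them.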

Back to our example: the dbSNP record for rs6356 shows that this variation, a change from C to T at this position, changes the reference amino acid valine in the protein product to methionine.
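It may seem odd that a C to T change turns valine into methionine, since valine codons start with G. Here is a small worked sketch of one way to reconcile this. It rests on two assumptions, neither stated in this tutorial: the affected codon is GTG (the only valine codon one base away from ATG, the sole methionine codon), and the C to T change is reported on the DNA strand opposite the coding strand. The dbSNP record remains the authority on the actual alleles and strand.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Partial codon table: only the two codons this example needs.
CODON_TO_AMINO_ACID = {"GTG": "Val", "ATG": "Met"}


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


reference_codon = "GTG"  # assumed valine codon on the coding strand

# The same three bases as seen on the opposite strand.
opposite_ref = reverse_complement(reference_codon)

# Apply the C to T change at the base that pairs with the codon's first G.
assert opposite_ref[-1] == "C"
opposite_alt = opposite_ref[:-1] + "T"

# Convert back to the coding strand to read the new codon.
variant_codon = reverse_complement(opposite_alt)

reference_aa = CODON_TO_AMINO_ACID[reference_codon]
variant_aa = CODON_TO_AMINO_ACID[variant_codon]
```

Under these assumptions, C to T on one strand is G to A on the coding strand, which changes GTG (valine) to ATG (methionine).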

Now let's go back to our example in the Variation Viewer. If you've wandered off, follow this link to get back.

Remember how we got here: we followed the link from the TH gene record in the Gene database to the Variation Viewer, then applied filters to find missense variations with a frequency of >= 0.05 in the 1000 Genomes project.

We could also have found this variant in ClinVar. However, many common variants (especially non-coding ones) are not in ClinVar. You can still find and filter for ClinVar variants in the Variation Viewer. Deselect the earlier missense and MAF >= 0.05 choices to see the number of variants in ClinVar.

You have reached the end of the tutorial for the question:

What variations are present in the gene and are they associated with disease?

Close both windows to end the Guide.

Powered by Guide on the Side from the University of Arizona Libraries
Developed resources reported in this site are supported by the National Library of Medicine (NLM), National Institutes of Health (NIH) under cooperative agreement number UG4LM012344 with the University of Utah Spencer S. Eccles Health Sciences Library. The content is solely the responsibility of the authors and does not necessarily represent the official views of NIH.