PubChem Part 1

Open PubChem in another browser window to work through this tutorial side by side.

In this section you will review how to find chemical information in PubChem starting with chemical names or identifiers, molecular formulas, gene symbols, proteins, pathways, and taxonomies. Part 2 will review structure searching.

Search with Chemical Names or Identifiers

PubChem recognizes chemical names, many synonyms, and other identifiers, such as the International Union of Pure and Applied Chemistry (IUPAC) name and the Chemical Abstracts Service (CAS) number.

Type citric acid into the search box on the PubChem homepage. As you type, PubChem will start to auto-fill a list of potential Compound names. 

[Image: citric acid autocomplete menu]

You can either click on the Compound you’re looking for in the auto-fill list, or press Enter on the keyboard to search.

Press Enter now.

On the search results page, the first result at the very top of the page is the Best Match, which is the result that PubChem suggests is most relevant to your search.

[Image: citric acid search]

To see how many total Compound and Substance records PubChem has for citric acid, scroll below the Best Match result to the menu of tabs, which includes Compounds (>300), Substances (>2,900), and other data type categories.

[Image: citric acid search record types]

Scroll back up to the Best Match box on the results page. It includes identifiers for citric acid, including the Compound CID, molecular formula (MF), and molecular weight (MW).

[Image: citric acid best match]

The Compound ID (CID) is the Compound’s unique PubChem identifier. The Compound CID for citric acid is 311. To return to this Compound directly, you can search PubChem with the Compound CID.
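
The name and CID lookups described here can also be scripted. The following is a minimal Python sketch, not part of the tutorial's own steps. It assumes PubChem's PUG REST web service and its `compound/name/.../cids` and `compound/cid/.../property` routes. The helper names (`name_to_cids_url`, `cid_properties_url`, `parse_cids`, `parse_properties`, `fetch_json`) are hypothetical and chosen only for illustration.

```python
import json
import urllib.parse
import urllib.request

# Base address of PubChem's PUG REST service (assumed to be reachable).
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def name_to_cids_url(name: str) -> str:
    """Build the URL that lists the CIDs PubChem associates with a name."""
    return f"{PUG_REST}/compound/name/{urllib.parse.quote(name)}/cids/JSON"


def cid_properties_url(cid: int) -> str:
    """Build the URL that returns molecular formula and weight for a CID."""
    props = "MolecularFormula,MolecularWeight"
    return f"{PUG_REST}/compound/cid/{cid}/property/{props}/JSON"


def parse_cids(payload: dict) -> list:
    """Extract the CID list from an IdentifierList response."""
    return payload.get("IdentifierList", {}).get("CID", [])


def parse_properties(payload: dict) -> dict:
    """Return the first record of a PropertyTable response, or {} if empty."""
    rows = payload.get("PropertyTable", {}).get("Properties", [])
    return rows[0] if rows else {}


def fetch_json(url: str) -> dict:
    """Download and decode a JSON response (requires network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)
```

With network access, `parse_cids(fetch_json(name_to_cids_url("citric acid")))` should return a list that includes 311, and `parse_properties(fetch_json(cid_properties_url(311)))` should return the MF and MW shown in the Best Match box.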

Exercise 3:

Search PubChem to answer the questions below:

What is the Compound CID for Apixaban?

Which Compound has the CID of 2244?

Open the PubChem record for Aspirin (CID 2244) by clicking on the record title (the link labeled Aspirin; ACETYLSALICYLIC ACID; 50-78…).

PubChem pages have a few features that help you navigate to the information you need:

  • A Contents menu with sections and subsections that organize different types of data, like BioAssay results and gene or protein targets.
  • Tooltips that describe what each section and subsection contains. Look for the question mark icon beside a section’s title and click on it to reveal details about that section.

[Image: PubChem Contents menu]

On the Compound summary page for Aspirin (CID 2244), find the Contents menu located on the right-hand side of the page or below the summary box. This menu displays the different sections of available information in PubChem.

Find the Biological Test Results section and click on the small arrow beside it to view its subsections. For Aspirin, this section has one subsection, BioAssay Results.

Click on BioAssay Results to view available bioactivity information in PubChem about Aspirin.

Search with Molecular Formula

You can also search with a chemical’s molecular formula.

Return to the PubChem homepage by clicking on either the PubChem logo or the Search PubChem button in the upper right-hand corner of the screen.

Search PubChem for Al2O12S3. (Hint: copy and paste)

Notice that the results page looks different this time. Below the search box is a note that PubChem is “Treating this as a molecular formula query.”

Also, PubChem does not retrieve a Best Match result for this search. Instead, you see four Compounds with different structures to choose from.
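
To make the idea of a molecular formula query more concrete, here is a short Python sketch. It is an illustration only, not a description of how PubChem itself detects formulas: `parse_formula` is a hypothetical helper that handles only simple formulas without parentheses or charges, and `formula_to_cids_url` assumes the `compound/fastformula` route of PubChem's PUG REST service. Counts returned by that service may differ from what the website displays.

```python
import re
import urllib.parse

# Base address of PubChem's PUG REST service (assumed to be reachable).
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# A simple formula is one or more element symbols, each with an optional count.
_WHOLE_FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")
_ELEMENT_COUNT = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> dict:
    """Split a simple formula such as 'Al2O12S3' into element counts."""
    if not _WHOLE_FORMULA.fullmatch(formula):
        raise ValueError(f"not a simple molecular formula: {formula!r}")
    counts = {}
    for element, number in _ELEMENT_COUNT.findall(formula):
        # A missing number means a single atom of that element.
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def formula_to_cids_url(formula: str) -> str:
    """Build the URL that lists CIDs matching a molecular formula."""
    quoted = urllib.parse.quote(formula)
    return f"{PUG_REST}/compound/fastformula/{quoted}/cids/JSON"
```

A name such as `citric acid` fails the formula check and raises `ValueError`, which is one simple way a script could tell a name query from a formula query.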

Exercise 4:

How many Compounds does PubChem retrieve when you search C18H40O4P2?

What is the molecular formula for SynuClean-D?

Search with Genes or Proteins

Return to the PubChem homepage by clicking on either the PubChem logo or the Search PubChem button in the upper right-hand corner of the screen.

If a gene or protein target has been tested in a BioAssay or is involved in a Pathway, it will have a record in PubChem.

For example, if you want to find information about the Vitamin D Receptor gene, search PubChem for Vitamin D Receptor. As you type, PubChem will start to auto-fill a list of potential Gene names. You can either click on the Gene you’re looking for in the auto-fill list, or press Enter on the keyboard to search.

Click on vitamin D receptor under Gene to continue:

[Image: vitamin D receptor search]

Below the search box are the different PubChem data types with a count of how many summaries are available for each one. Click on the Genes tab (if you're not already there) to view your options.

[Image: results by data type]

Now you should see results for summaries about the Vitamin D Receptor. Each result includes the organism it corresponds to in parentheses at the end of the title. For example, one result is for vdrb – vitamin D receptor b (zebrafish).

In the list, locate the summary for VDR – vitamin D receptor (human)

[Image: VDR human summary]

You can already see a lot of information about the gene on the results page, including how many linked BioAssays and linked Pathways it has in PubChem. These numbers appear in blue boxes with the summary on the results page. Clicking on those numbers will take you directly to a list of related BioAssay and Pathway PubChem summaries.

Click on VDR – vitamin D receptor (human) to view the full record. Use the Contents menu on the right-hand side of the screen or below the summary box to jump to different parts of the summary.
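
A gene summary can also be requested from a script. The Python sketch below is a hedged illustration rather than a documented recipe from this tutorial. It assumes the `gene/genesymbol/.../summary` route of PubChem's PUG REST service and assumes the response nests its records under `GeneSummaries` and `GeneSummary` with fields such as `GeneID`, `Symbol`, `Name`, and `TaxonomyID`. Check a live response before relying on those field names. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base address of PubChem's PUG REST service (assumed to be reachable).
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def gene_summary_url(symbol: str) -> str:
    """Build the URL that requests summaries for a gene symbol."""
    quoted = urllib.parse.quote(symbol)
    return f"{PUG_REST}/gene/genesymbol/{quoted}/summary/JSON"


def parse_gene_summaries(payload: dict) -> list:
    """Reduce a gene summary response to a few fields per record.

    Assumed shape: {"GeneSummaries": {"GeneSummary": [ {...}, ... ]}}.
    Missing fields come back as None instead of raising an error.
    """
    records = payload.get("GeneSummaries", {}).get("GeneSummary", [])
    return [
        {
            "gene_id": record.get("GeneID"),
            "symbol": record.get("Symbol"),
            "name": record.get("Name"),
            "taxonomy_id": record.get("TaxonomyID"),
        }
        for record in records
    ]


def fetch_json(url: str) -> dict:
    """Download and decode a JSON response (requires network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)
```

With network access, `parse_gene_summaries(fetch_json(gene_summary_url("VDR")))` would be one way to list matching gene records along with the taxonomy each one belongs to.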

Exercise 5:

Locate the Gene page for tbx2 (human) to answer the following questions:

Which protein target does PubChem list as being mapped to the tbx2 (human) gene target? (Hint: Locate the Proteins section of the Contents menu and look for Protein Targets)

What is the Compound CID for the drug listed under the Drug-Gene Interactions section for tbx2 (human)? (Hint: Look for the Interactions and Pathways section of the Contents menu)

Search with Pathways

If you need to know which chemicals or genes interact with a specific biological pathway, you can find that in PubChem.

Return to the PubChem homepage by clicking on either the PubChem logo or the Search PubChem button in the upper right-hand corner of the screen.

Search PubChem for lidocaine metabolism.

On the results page, you’ll see at least 3 Pathway records. When new pathway data for any taxonomy is added to PubChem, a new Pathway record is created.

In the list for lidocaine metabolism, you see a result from the source PathBank for the Homo sapiens (human) taxonomy. You also see two entries from the source WikiPathways, one for the Bos taurus (cattle) taxonomy and one for the Homo sapiens (human) taxonomy. You can read more about how Pathway records are organized in PubChem in this article.

The results page also displays how many PubChem Compounds, Gene, and Protein records are linked to each Pathway. These are displayed as blue boxes with numbers.

[Image: lidocaine pathway example]

Click on the record for Lidocaine (Local Anaesthetic) Metabolism Pathway from PathBank.

[Image: contents for lidocaine pathway example]

Use the Contents menu to jump to the Interactions, Chemicals, Proteins, or Genes involved in this Pathway. Some Pathways also have linked Related Pathways.

Exercise 6:

Locate the Pathway summary page for Glycolysis and Gluconeogenesis from the data source INOH for the taxonomy Homo sapiens (human) to answer the following questions:

True or False: This Glycolysis and Gluconeogenesis Pathway has Compound, Gene, and Protein records linked to it in PubChem.

True or False: The chemical Phosphoric acid is listed in PubChem as being involved with the Glycolysis and Gluconeogenesis Pathway.

Search with Taxonomies

Taxonomy summaries include the data available in PubChem for a specific organism. This includes biological experiments archived in PubChem BioAssay that were conducted against the whole organism or against one of its genes or proteins, as well as the compounds tested in those experiments.

Return to the PubChem homepage by clicking on either the PubChem logo or the Search PubChem button in the upper right-hand corner of the screen.

Search for mosquitos. As you type, PubChem will start to auto-fill a list of potential Taxonomy names. You can either click on the Taxonomy you’re looking for in the auto-fill list, or press Enter on the keyboard to search.

[Image: mosquito taxonomy]

PubChem will return at least one Taxonomy result for Culicidae (mosquitos). Each result displays a Linked BioAssay Count with the number of BioAssays connected to this organism. This is shown as a blue box with a number in it.

[Image: mosquitos summary]

Click on Culicidae (mosquitos) to view the full summary.

Use the Contents menu to jump to related Chemicals and Bioactivities, BioAssays, and other information.
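
The Linked BioAssay Count for a taxonomy could, in principle, be reproduced from a script. The Python sketch below rests on several assumptions that are not stated in this tutorial: that PubChem's PUG REST service accepts `taxonomy/taxid/.../summary` and `taxonomy/taxid/.../aids` routes, that the `aids` response lists assay IDs under `InformationList`, `Information`, and `AID`, and that the NCBI Taxonomy ID for Culicidae is 7157. The helper names are hypothetical, and the count may not match the website exactly.

```python
import json
import urllib.request

# Base address of PubChem's PUG REST service (assumed to be reachable).
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# Assumption: NCBI Taxonomy ID for the family Culicidae (mosquitos).
CULICIDAE_TAXID = 7157


def taxonomy_summary_url(taxid: int) -> str:
    """Build the URL that requests the summary for a taxonomy ID."""
    return f"{PUG_REST}/taxonomy/taxid/{taxid}/summary/JSON"


def taxonomy_aids_url(taxid: int) -> str:
    """Build the URL that lists BioAssay IDs (AIDs) linked to a taxonomy."""
    return f"{PUG_REST}/taxonomy/taxid/{taxid}/aids/JSON"


def count_linked_aids(payload: dict) -> int:
    """Count distinct AIDs in a response.

    Assumed shape: {"InformationList": {"Information": [{"AID": [...]}]}}.
    """
    entries = payload.get("InformationList", {}).get("Information", [])
    aids = set()
    for entry in entries:
        aids.update(entry.get("AID", []))
    return len(aids)


def fetch_json(url: str) -> dict:
    """Download and decode a JSON response (requires network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)
```

With network access, `count_linked_aids(fetch_json(taxonomy_aids_url(CULICIDAE_TAXID)))` would give a number to compare against the blue box on the results page.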

This concludes Part 1. You now know how to find chemical information in PubChem starting with chemical names or identifiers, molecular formulas, gene symbols, proteins, pathways, and taxonomies.

Continue to Part 2 to learn how to search by structures.

Powered by Guide on the Side from the University of Arizona Libraries
Developed resources reported in this site are supported by the National Library of Medicine (NLM), National Institutes of Health (NIH) under cooperative agreement number UG4LM012344 with the University of Utah Spencer S. Eccles Health Sciences Library. The content is solely the responsibility of the authors and does not necessarily represent the official views of NIH.