PubChem Part 2: Searching with Structures

Open PubChem in another browser window to work through this tutorial side by side.

This section reviews finding chemical information in PubChem with chemical structures. It includes:

  • the basics of searching with line notations and drawings,
  • identity searching, and
  • finding similar structures, substructures, and superstructures.

Search with Structures: Line Notations

You can search PubChem with chemical structures to find exact matches or chemicals that share a similar structure or substructure.

PubChem recognizes structure drawings and line notations, including SMILES and InChI, both of which are used in this tutorial.

You should be viewing the PubChem homepage in your other browser window. If you are not, open the PubChem homepage before moving on.

Enter the SMILES identifier for calcium carbonate into the PubChem search box (hint: copy and paste):
C(=O)([O-])[O-].[Ca+2]

PubChem automatically recognizes that you’re searching for a structure and will try to identify the Compound. Relevant results are listed under the Identity tab.

For this structure, you should see one result for calcium carbonate with Compound CID 10112.
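
The same lookup can also be scripted through PubChem's PUG REST interface. The sketch below only builds the request URL and does not contact PubChem. It assumes the `compound/smiles/cids/JSON` endpoint with the SMILES passed as a `smiles` query argument; the function name `smiles_to_cid_url` is a hypothetical helper, not part of any PubChem library.

```python
from urllib.parse import urlencode

# Base address of PubChem's PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def smiles_to_cid_url(smiles: str) -> str:
    """Build a URL that asks PUG REST for the CIDs matching a SMILES string."""
    # Passing the SMILES as a query argument lets urlencode escape characters
    # such as '[', '+', and '=' that are awkward inside a URL path.
    query = urlencode({"smiles": smiles})
    return f"{PUG_REST}/compound/smiles/cids/JSON?{query}"


if __name__ == "__main__":
    # Calcium carbonate, the structure used in this step of the tutorial.
    print(smiles_to_cid_url("C(=O)([O-])[O-].[Ca+2]"))
```

Opening the printed URL in a browser should return a short JSON document listing the matching CID, which you can compare with the result shown on the Identity tab.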

Let's try a couple of other searches using line notations.

Exercise 7:

Which amino acid has the structure CSCC[C@@H](C(=O)O)N?

Which element is represented by InChI=1S/Cu?

Search with Structures: Drawings

You can also search by manually drawing a structure using the PubChem Sketcher.

Return to the PubChem homepage by clicking on either the PubChem logo or the Search PubChem button in the upper right-hand corner of the screen.

Scroll below the search box and click on the Draw Structure icon.

The PubChem Sketcher should open in a new window.

This tutorial will walk you through some basic PubChem Sketcher functions and mechanics.

The PubChem Sketcher window has three parts:

  • The left side of the window has buttons and controls for drawing,
  • The right side of the window is where the structure drawing will appear, and
  • The top of the window has a search bar that will display a drawn structure as a line notation or other identifier.

There are two main ways to draw a structure:

  • Input an identifier, like SMILES or SMARTS, into the search bar, or
  • Use the buttons to manually draw a structure.

Let’s try using the InChI for benzaldehyde to draw a structure.

In the PubChem Sketcher window, change the SMILES drop-down menu to StdInChI.

Copy and paste the InChI into the search bar:
InChI=1S/C7H6O/c8-6-7-4-2-1-3-5-7/h1-6H

Press Enter on the keyboard to search.

Search with Structures: Drawings

5 of 8

The structure should appear in the white box below the search bar.

To search PubChem with this structure, click the blue Search for this Structure button at the bottom of the Sketcher window. Do this now.

You should now see the PubChem results page. In the search bar, the SMILES string for the query structure is shown. You should see one result for Benzaldehyde under the Identity tab.
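
An InChI can be resolved programmatically as well. Because an InChI contains slashes, it does not fit in a URL path, so this sketch prepares a POST request instead. It assumes PUG REST's `compound/inchi/cids/JSON` endpoint with the identifier sent as an `inchi` form field; `inchi_lookup_request` is a hypothetical helper name, and the request is prepared but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base address of PubChem's PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def inchi_lookup_request(inchi: str) -> Request:
    """Prepare (but do not send) a POST request that resolves an InChI to CIDs."""
    # The InChI travels in the form-encoded request body, not in the URL.
    body = urlencode({"inchi": inchi}).encode("ascii")
    return Request(
        f"{PUG_REST}/compound/inchi/cids/JSON",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Benzaldehyde, the InChI pasted into the Sketcher earlier in this section.
request = inchi_lookup_request("InChI=1S/C7H6O/c8-6-7-4-2-1-3-5-7/h1-6H")
print(request.get_method(), request.full_url)
```

To run the lookup for real, pass `request` to `urllib.request.urlopen` on a machine with network access.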

Return to the PubChem homepage and click Draw Structure to return to the PubChem Sketcher window.

This time, let’s try drawing the structure for benzaldehyde in the PubChem Sketcher. This exercise will introduce you to the basics of using the PubChem Sketcher.

Follow the steps below to draw the benzaldehyde structure using the PubChem Sketcher:

1. Select the benzene ring button. It will turn yellow when selected.

2. Move the mouse to the right side of the window and click in the middle of the white space. The ring will appear when you click.

3. Select the propane symbol button. It will turn yellow when selected.

4. Click on the top of the benzene ring to place the propane symbol.

5. Select the mirror button. It will turn yellow when selected.

6. With the mirror button selected, click anywhere on the structure. It will flip the structure to match the direction of our drawing.

7. Select the double bond button. It will turn yellow when selected.

8. Click on the propane symbol to add the double bond.

9. Select the oxygen button. It will turn yellow when selected.

10. Click the top of the double bond symbol to add the oxygen symbol.

You've now created the structure for benzaldehyde.

Click the blue Search for this Structure button.

The PubChem results page should return one matching Compound, Benzaldehyde.

Search with Structures: Identity Search

What we just did with the benzaldehyde query is an identity search. An identity search returns compounds identical to the query molecule. When using identity search, you have some control over what is meant by “identical” compounds.

The default search considers two molecules to be identical if they have the same connectivity, isotopism, and stereochemistry. You can tell PubChem to ignore isotopism or stereochemistry by clicking the Settings button at the top right of the search results and selecting the appropriate option from the list.

When stereochemistry is ignored, compounds with the same connectivity and isotopism, but with varying stereochemistry, are returned. If isotopism is ignored, the identity search finds compounds with the same connectivity and stereochemistry, but with different isotopes.
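
The definitions of "identical" described above can be sketched as a lookup table. The `identity_type` values below are the ones documented for the PUG REST `fastidentity` operation; pairing them with the web interface's settings is an assumption based on the descriptions in this section, and `identity_search_url` is a hypothetical helper that only builds a URL.

```python
# Base address of PubChem's PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# (ignore_stereo, ignore_isotopes) -> identity_type value for PUG REST.
IDENTITY_TYPES = {
    (False, False): "same_stereo_isotope",  # connectivity, stereo, and isotopes all match
    (True, False): "same_isotope",          # stereochemistry is ignored
    (False, True): "same_stereo",           # isotopes are ignored
    (True, True): "same_connectivity",      # only connectivity has to match
}


def identity_search_url(cid: int, ignore_stereo: bool = False,
                        ignore_isotopes: bool = False) -> str:
    """Build an identity-search URL that uses an existing Compound CID as the query."""
    identity_type = IDENTITY_TYPES[(ignore_stereo, ignore_isotopes)]
    return (f"{PUG_REST}/compound/fastidentity/cid/{cid}/cids/JSON"
            f"?identity_type={identity_type}")


# Example: an identity search for CID 10112 that ignores stereochemistry.
print(identity_search_url(10112, ignore_stereo=True))
```

Loosening the definition can only keep or enlarge the result set, which is why ignoring stereochemistry or isotopes is a convenient way to find stereoisomers and isotopomers.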

For more information on using identity search with different definitions of chemical identity to find stereoisomers and isotopomers, view this article.

Find Similar Structures, Substructures, and Superstructures

Once you’ve identified the structure you’re looking for, you can use links on the results page to jump to other structure searches.

Return to the PubChem homepage by clicking on either the PubChem logo or the Search PubChem button in the upper right-hand corner of the screen.

Search for the SMILES identifier:
CC(C)OP(=O)(C)F

PubChem should identify this as Sarin.

The search results displayed below the search box have five tabs: Identity, Similarity, Substructure, Superstructure, and 3D Similarity.

  • Similarity - locates Compounds that are similar to the query structure, using pre-specified similarity thresholds.
  • Substructure - locates chemical structures that contain the query's connectivity and valence-bond pattern.
  • Superstructure - locates chemical structures that are contained within the query structure, that is, compounds that make up part or all of it.
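
The three search types listed above have counterparts in PUG REST. This sketch builds one URL per search type for the Sarin query. It assumes the `fastsimilarity_2d`, `fastsubstructure`, and `fastsuperstructure` operations and the `Threshold` option (a percentage) for similarity. Percent-encoding the SMILES inside the URL path is also an assumption, and `structure_search_url` is a hypothetical helper name.

```python
from urllib.parse import quote

# Base address of PubChem's PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# Results tab -> PUG REST operation name.
SEARCH_OPERATIONS = {
    "similarity": "fastsimilarity_2d",
    "substructure": "fastsubstructure",
    "superstructure": "fastsuperstructure",
}


def structure_search_url(kind: str, smiles: str, threshold=None) -> str:
    """Build a structure-search URL for a SMILES query without sending it."""
    operation = SEARCH_OPERATIONS[kind]
    # safe="" escapes every reserved character, including '(' and '='.
    encoded = quote(smiles, safe="")
    url = f"{PUG_REST}/compound/{operation}/smiles/{encoded}/cids/JSON"
    # A similarity threshold only makes sense for the similarity search.
    if kind == "similarity" and threshold is not None:
        url += f"?Threshold={threshold}"
    return url


SARIN = "CC(C)OP(=O)(C)F"
for kind in SEARCH_OPERATIONS:
    print(kind, structure_search_url(kind, SARIN))
```

Each URL should return a list of CIDs comparable to what the corresponding tab shows, although the web interface may apply different default settings.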

PubChem has automatically run those searches, and you can view the results for each search type by clicking the corresponding tab.

For example, click the Similarity tab. PubChem now displays the Similarity results.

Right below the Similarity tab is a description of the search: Fingerprint Tanimoto-based 2-dimensional similarity search.
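
To make the description concrete: a fingerprint records which structural features a molecule has, and the Tanimoto coefficient is the number of features two molecules share divided by the number of features present in either. The sketch below uses small made-up sets of feature numbers as toy fingerprints; they are not real PubChem fingerprints, and returning 0.0 for two empty fingerprints is a convention chosen for this example.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    if not fp_a and not fp_b:
        return 0.0  # convention for this sketch: two empty fingerprints score zero
    shared = len(fp_a & fp_b)
    # Bits set in either fingerprint = |A| + |B| - |A and B|.
    return shared / (len(fp_a) + len(fp_b) - shared)


# Toy fingerprints: illustrative feature numbers, not real PubChem data.
query = {1, 4, 7, 9, 12}
candidate = {1, 4, 7, 10, 12, 15}

score = tanimoto(query, candidate)
print(f"Tanimoto similarity: {score:.2f}")
```

A similarity search keeps only the compounds whose score against the query meets the chosen threshold, so raising the threshold returns fewer, more closely related structures.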

If you want to see results for 3D similarity, go back to the results page and click on the 3D Similarity tab.

For all of these structure searches, you can adjust some parameters to perform customized searches based on what you need. Click on the blue Settings button to see which customizations are available.

View step-by-step directions for different customized structure searches at the links below:

Powered by Guide on the Side from the University of Arizona Libraries
Developed resources reported in this site are supported by the National Library of Medicine (NLM), National Institutes of Health (NIH) under cooperative agreement number UG4LM012344 with the University of Utah Spencer S. Eccles Health Sciences Library. The content is solely the responsibility of the authors and does not necessarily represent the official views of NIH.