OPSIN

Open Parser for Systematic IUPAC Nomenclature

Example IUPAC names: 2,4,6-tri-O-methyl-D-…, (3β)-cholest-5-en-3-ol, 1,3,7-trimethyl-3,7-…, (1S,2R,18R,19R,22S,25R,28R,40S)-…


What is OPSIN?

OPSIN is freely available software that interprets systematic IUPAC nomenclature for chemistry and biochemistry and converts it into chemical structures, reported as SMILES, InChI and CML (Chemical Markup Language). OPSIN was originally developed at the University of Cambridge. This website is hosted by the Chemical Biology Services Team at EMBL-EBI in collaboration with the developer of OPSIN, Dr Daniel Lowe. It lets users run the software online as an aid to chemical and biochemical curation. In addition, web services are provided to facilitate usage from scripts.

Specifying Non-ASCII Characters

Greek characters may be specified by:

  • The appropriate Unicode character, e.g. λ
  • The romanised name for the letter, e.g. lambda
  • The romanised name for the letter surrounded by dots, e.g. .lambda.
  • The corresponding Latin letter prefixed with a dollar sign, e.g. $l

Superscripts may be specified by:

  • A caret with or without bracketing, e.g. N^(2) or N^2
  • Bracketing on its own, e.g. N(2)
  • Tildes or asterisks, e.g. N~2~ or N*2*
  • Angle brackets, e.g. N<2>

Note that for locants OPSIN can infer that N2 means N^2, and for von Baeyer systems OPSIN will intelligently guess what should have been superscripted. For spiro systems, however, superscripts must be indicated explicitly.
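
The alternative spellings listed above can be produced mechanically when preparing ASCII-only input. The following is a minimal sketch, not part of OPSIN: the helper names `romanise_greek` and `superscript_forms` and the deliberately partial `GREEK` map are assumptions made for illustration; only the notations themselves come from the lists above.

```python
# Hypothetical helpers for illustration; not an OPSIN API.
# Partial map from Greek letters to their romanised names.
GREEK = {"α": "alpha", "β": "beta", "λ": "lambda"}


def romanise_greek(name, dotted=True):
    """Replace Greek letters with romanised names, optionally dot-delimited."""
    for char, roman in GREEK.items():
        name = name.replace(char, f".{roman}." if dotted else roman)
    return name


def superscript_forms(base, sup):
    """Return the ASCII spellings of a superscript described in the text."""
    return [
        f"{base}^({sup})",  # caret with bracketing
        f"{base}^{sup}",    # caret without bracketing
        f"{base}({sup})",   # bracketing on its own
        f"{base}~{sup}~",   # tildes
        f"{base}*{sup}*",   # asterisks
        f"{base}<{sup}>",   # angle brackets
    ]


print(romanise_greek("(3β)-cholest-5-en-3-ol"))
print(superscript_forms("N", "2"))
```

Any one of the generated spellings should be enough; which one to use is a matter of what the surrounding tooling can store and transmit safely.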


Supported Nomenclature

The goal of OPSIN is to interpret the entirety of IUPAC nomenclature. However, since this is a large and ever-expanding set, there are still gaps in coverage. The table below gives examples of the classes of nomenclature supported.


| Nomenclature | Examples |
|---|---|
| Alk/ane/ene/yne | hexane |
| | hex-1-ene |
| | hex-1-yne |
| Heteroatom chains | tetrasilane |
| | tetrasiloxane |
| Cyclised chains | cyclohexane |
| | cyclotriborazane |
| Trivial acids and derivatives | maleic acid |
| | maleamic acid |
| | maleamide |
| | maleimide |
| Hantzsch-Widman rings | 1,3-oxazole |
| Spiro compounds | spiro[4.5]decane |
| | pentaspiro[2.0.2^4.1.1.2^10.0.2^13.1^8.2^3]octadecane |
| | 1H,1'H-2,2'-spirobi[naphthalene] |
| | 2λ6,2',2''-spiroter[[1,3,2]benzodioxathiole] |
| | 1'H,1''H,2H,8'H-1,2':7',2''-dispiroter[naphthalen]-1'-one |
| | spiro[1,2-benzodithiole-3,2'-[1,3]benzodithiole] |
| von Baeyer systems | pentacyclo[13.7.4.3^(3,8).0^(18,20).1^(13,28)]triacontane |
| Hydro/dehydro | 2,3-dihydropyridine |
| | 1,2-didehydrobenzene |
| Indicated hydrogen | 1H-benzoimidazole |
| | phosphinin-2(1H)-one |
| Heteroatom replacement | 3-aza-pentane |
| | 3-azonia-pentane |
| | 3-azanylia-pentane |
| | 3-azanida-pentane |
| | 3-azanuida-pentane |
| Specification of charge: ium/ide/ylium/uide | azanium |
| | boranuide |
| Multiplicative nomenclature | ethylenediaminetetraacetic acid |
| | 3,3'-[ethane-1,2-diylbis(oxy)]bis{4-[2-(furan-3-yloxy)ethoxy]furan} |
| Conjunctive nomenclature | 1,3,5-benzenetriacetic acid |
| Fused ring systems | imidazo[4,5-d]pyridine |
| | phenothiazino[3',4':5,6][1,4]oxazino[2,3-i]benzo[5,6][1,4]thiazino[3,2-c]phenoxazine |
| | phenanthro[4,5-bcd:1,2-c']difuran |
| Simple bridges | 2,3-methanoindene |
| | 3,4-methylenedioxy-β-methoxyphenethylamine |
| | 3,4-epoxy-3,4-dihydrophenanthrene |
| Ring assemblies | biphenyl |
| | 2,2':6',2''-terpyridine |
| Prefix functional replacement | peroxybenzoic acid |
| Infix functional replacement | benzoperoxoic acid |
| Lambda convention | λ5-phosphane |
| Perhalogenation | perchloro-3,4-dimethylenecyclobutene |
| Radicofunctional nomenclature | acetals, acids, alcohols, amides, anhydrides, anilides, azetidides, azides, bromides, chlorides, cyanates, cyanides, esters, di/tri/tetra esters, ethers, fluorides, fulminates, glycol ethers, glycols, hemiacetals, hemiketals, hydrazides, hydrazones, hydrides, hydroperoxides, hydroxides, imides, iodides, isocyanates, isocyanides, isoselenocyanates, isothiocyanates, ketals, ketones, lactams, lactims, lactones, mercaptans, morpholides, oxides, oximes, peroxides, piperazides, piperidides, pyrrolidides, selenides, selenocyanates, selenoketones, selenols, selenosemicarbazones, selenones, selenoxides, selones, semicarbazones, sulfides, sulfones, sulfoxides, sultams, sultims, sultines, sultones, tellurides, telluroketones, tellurones, tellurosemicarbazones, telluroxides, thiocyanates, thioketones, thiols and thiosemicarbazones |
| Amino acids and derivatives | glycinol |
| | L-2-aminobutyric acid |
| | L-alanyl-L-glutaminyl-L-arginyl-O-phosphono-L-seryl-L-alanyl-L-proline |
| Nucleosides, nucleotides and their esters | adenosine 5'-(tetrahydrogen triphosphate) |
| Steroids including alpha/beta stereochemistry | (3β)-cholest-5-en-3-ol |
| Open-chain monosaccharides | 4-amino-4,6-dideoxy-3-C-methyl-2-O-methyl-L-mannose |
| Cyclised monosaccharides and glycosides | β-D-ribofuranose |
| | β-D-fructofuranosyl α-D-glucopyranosyl-(1->4)-α-D-glucopyranoside |
| Deoxy and anhydro | 5-acetamido-2,7-anhydro-3,5-dideoxy-D-glycero-α-D-galacto-non-2-ulopyranosonic acid |
| Basic inorganic support | aluminium(3+) chloride |
| | mercury(II) chloride |
| Isotope specification | (2H6)propan-2-one |
| | acetone-d6 |
| | hexadeuterioacetone |
| R/S stereochemistry | (1R,3S)-3-amino-3-methylcyclohexanecarboxylic acid |
| E/Z stereochemistry | (2Z)-but-2-ene |
| cis/trans indicating relative stereochemistry on rings | cis-1,4-dimethylcyclohexane |
| Structure-based polymer names | poly(2,2'-diamino-5-hexadecylbiphenyl-3,3'-diyl) |

Web service

The OPSIN web service is accessed using the base URL https://www.ebi.ac.uk/opsin/ws.

To use it, send a GET request to https://www.ebi.ac.uk/opsin/ws/CHEMICAL_NAME.EXTENSION. For example, accessing https://www.ebi.ac.uk/opsin/ws/benzene.png will return a PNG depiction of benzene, while accessing https://www.ebi.ac.uk/opsin/ws/benzene.smi will return a SMILES string. Note that the chemical name must be URL-encoded. The list of supported extensions is as follows:

.smi
A SMILES string.
.cml and .no2d.cml
CML with or without 2D coordinates, respectively.
.stdinchi, .stdinchikey and .inchi
A standard InChI, a standard InChIKey, or a non-standard tautomer-specific InChI, respectively.
.png and .svg
A PNG or SVG depiction.
.json
A JSON string with keys cml, inchi, stdinchi, stdinchikey, smiles, message and status (plus warnings where applicable).
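
Because the chemical name must be URL-encoded, it helps to build request URLs programmatically. The sketch below uses only the Python standard library and the base URL given above. The function names `opsin_url`, `fetch` and `smiles_from_json` are hypothetical helpers introduced for illustration, and the 30-second timeout is an arbitrary choice.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://www.ebi.ac.uk/opsin/ws"  # base URL given in the text


def opsin_url(name, extension="smi"):
    """Build a request URL, percent-encoding the chemical name."""
    # safe="" also encodes "/" so a name can never add extra path segments.
    return f"{BASE_URL}/{quote(name, safe='')}.{extension}"


def fetch(name, extension="smi", timeout=30):
    """Send the GET request and return the body as text.

    Suitable for text formats only; .png responses are binary.
    """
    with urlopen(opsin_url(name, extension), timeout=timeout) as response:
        return response.read().decode("utf-8")


def smiles_from_json(body):
    """Extract the value of the "smiles" key from a .json response body."""
    return json.loads(body).get("smiles")


print(opsin_url("2,2':6',2''-terpyridine", "json"))
# Network call, left commented out so the sketch runs offline:
# print(smiles_from_json(fetch("benzene", "json")))
```

Requesting `.json` once and reading several keys from the result avoids issuing separate requests for the SMILES, InChI and InChIKey of the same name.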

Bulk conversion

If you need to convert large numbers of IUPAC names to structures, we recommend using a local installation of OPSIN rather than the web service. This will yield faster results and ensures that the website is not impacted for other users. Here are two ways to do this:

  1. Download the latest version of OPSIN and follow the instructions on the OPSIN GitHub page to run it from the command line.
  2. Use a third-party Python integration such as pyopsin, py2opsin or MoleculeResolver.
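
The first option can be driven from a script. The sketch below rests on assumptions that should be checked against the instructions on the OPSIN GitHub page: that the download is a runnable jar, that it accepts an input file with one name per line followed by an output file, and that `-osmi` selects SMILES output. The jar file name and the helpers `opsin_command` and `convert_file` are hypothetical.

```python
import subprocess

# Assumed jar location; substitute the file you actually downloaded.
OPSIN_JAR = "opsin-cli-jar-with-dependencies.jar"


def opsin_command(input_path, output_path, jar=OPSIN_JAR, output_flag="-osmi"):
    """Build the argument list for a batch run (assumed CLI layout)."""
    return ["java", "-jar", jar, output_flag, input_path, output_path]


def convert_file(input_path, output_path, **kwargs):
    """Run OPSIN locally; raises CalledProcessError on a non-zero exit."""
    subprocess.run(opsin_command(input_path, output_path, **kwargs), check=True)


# Show the command that would be executed, without running Java.
print(opsin_command("names.txt", "structures.smi"))
```

Passing the arguments as a list rather than a shell string means file paths containing spaces or quotes need no extra escaping.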

Contact

For questions about this website, please contact opsin-website@ebi.ac.uk.

For questions about OPSIN, or to report a bug, please file an issue on OPSIN's GitHub page.

For questions about IUPAC Nomenclature, users are referred to the IUPAC documentation.


How to cite

If you use OPSIN via this website, please consider citing:

  • Daniel M. Lowe, Peter T. Corbett, Peter Murray-Rust, Robert C. Glen. J. Chem. Inf. Model. 2011, 51, 739.
  • OPSIN webserver. https://www.ebi.ac.uk/opsin (accessed DATE)