OPSIN Information

Contents:

History

Supported Nomenclature

References

Libraries

License and Warranty

History

During the development of OSCAR, the need arose for a program to convert identified chemical names to connection tables. Because no open source project offered broad coverage of organic nomenclature, Peter Murray-Rust and Joe Townsend started work on such a program. Peter Corbett continued this work, culminating in a system broadly similar to the current incarnation (Corbett and Murray-Rust 2006). In 2008 Daniel Lowe took over development of the project as part of his PhD, during which the range of supported nomenclature was expanded substantially and the parser was improved to cope with the complexity of the grammar of chemical names (Lowe et al. 2011). A comprehensive description of OPSIN, its algorithms and its performance (as of mid-2012) is included in his PhD thesis (Lowe 2012). Development of OPSIN is ongoing, with more nomenclature continuing to be added (Lowe et al. 2013).

Examples of Supported Nomenclature

Nomenclature
    Examples

Alk/ane/ene/yne
    hexane
    hex-1-ene
    hex-1-yne
Heteroatom chains
    tetrasilane
    tetrasiloxane
Cyclised chains
    cyclohexane
    cyclotriborazane
Trivial acids and derivatives
    maleic acid
    maleamic acid
    maleamide
    maleimide
Hantzsch-Widman rings
    1,3-oxazole
Spiro compounds
    spiro[4.5]decane
    pentaspiro[2.0.24.1.1.210.0.213.18.23]octadecane
    1H,1'H-2,2'-spirobi[naphthalene]
    2λ6,2',2''-spiroter[[1,3,2]benzodioxathiole]
    1'H,1''H,2H,8'H-1,2':7',2''-dispiroter[naphthalen]-1'-one
    spiro[1,2-benzodithiole-3,2'-[1,3]benzodithiole]
von Baeyer systems
    pentacyclo[13.7.4.33,8.018,20.113,28]triacontane
Hydro/dehydro
    2,3-dihydropyridine
    1,2-didehydrobenzene
Indicated hydrogen
    1H-benzoimidazole
    phosphinin-2(1H)-one
Heteroatom replacement
    3-aza-pentane
    3-azonia-pentane
    3-azanylia-pentane
    3-azanida-pentane
    3-azanuida-pentane
Specification of charge: ium/ide/ylium/uide
    azanium
    boranuide
Multiplicative nomenclature
    ethylenediaminetetraacetic acid
    3,3'-[ethane-1,2-diylbis(oxy)]bis{4-[2-(furan-3-yloxy)ethoxy]furan}
Conjunctive nomenclature
    1,3,5-benzenetriacetic acid
Fused ring systems
    imidazo[4,5-d]pyridine
    phenothiazino[3',4':5,6][1,4]oxazino[2,3-i]benzo[5,6][1,4]thiazino[3,2-c]phenoxazine
    phenanthro[4,5-bcd:1,2-c']difuran
Simple bridges
    2,3-methanoindene
    3,4-methylenedioxy-β-methoxyphenethylamine
    3,4-epoxy-3,4-dihydrophenanthrene
Ring assemblies
    biphenyl
    2,2':6',2''-terpyridine
Prefix functional replacement
    peroxybenzoic acid
Infix functional replacement
    benzoperoxoic acid
Lambda convention
    λ5-phosphane
Perhalogenation
    perchloro-3,4-dimethylenecyclobutene
Radicofunctional nomenclature
    acetals, acids, alcohols, amides, anhydrides, anilides, azetidides, azides, bromides, chlorides, cyanates, cyanides, esters, di/tri/tetra esters, ethers, fluorides, fulminates, glycol ethers, glycols, hemiacetals, hemiketals, hydrazides, hydrazones, hydrides, hydroperoxides, hydroxides, imides, iodides, isocyanates, isocyanides, isoselenocyanates, isothiocyanates, ketals, ketones, lactams, lactims, lactones, mercaptans, morpholides, oxides, oximes, peroxides, piperazides, piperidides, pyrrolidides, selenides, selenocyanates, selenoketones, selenols, selenosemicarbazones, selenones, selenoxides, selones, semicarbazones, sulfides, sulfones, sulfoxides, sultams, sultims, sultines, sultones, tellurides, telluroketones, tellurones, tellurosemicarbazones, telluroxides, thiocyanates, thioketones, thiols and thiosemicarbazones
Amino acids and derivatives
    glycinol
    L-2-aminobutyric acid
    L-alanyl-L-glutaminyl-L-arginyl-O-phosphono-L-seryl-L-alanyl-L-proline
Nucleosides, nucleotides and their esters
    adenosine 5'-(tetrahydrogen triphosphate)
Steroids including alpha/beta stereochemistry
    (3β)-cholest-5-en-3-ol
Open-chain monosaccharides
    4-amino-4,6-dideoxy-3-C-methyl-2-O-methyl-L-mannose
Cyclised monosaccharides and glycosides
    β-D-ribofuranose
    β-D-fructofuranosyl α-D-glucopyranosyl-(1->4)-α-D-glucopyranoside
Deoxy and anhydro
    5-acetamido-2,7-anhydro-3,5-dideoxy-D-glycero-α-D-galacto-non-2-ulopyranosonic acid
Basic inorganic support
    aluminium(3+) chloride
    mercury(II) chloride
Isotope specification
    (2H6)propan-2-one
    acetone-d6
    hexadeuterioacetone
R/S stereochemistry
    (1R,3S)-3-amino-3-methylcyclohexanecarboxylic acid
E/Z stereochemistry
    (2Z)-but-2-ene
cis/trans indicating relative stereochemistry on rings
    cis-1,4-dimethylcyclohexane
Structure-based polymer names
    poly(2,2'-diamino-5-hexadecylbiphenyl-3,3'-diyl)
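
To make the simplest entries above concrete (hexane, hex-1-ene, hex-1-yne), the following is a minimal Java sketch of name-to-structure conversion for unbranched chains. It is a toy written for illustration, not OPSIN's algorithm or API: the class name ToyChainNamer, its stem table and its regular expression are all hypothetical, and it handles only methane to hexane with at most one double or triple bond. It emits SMILES strings rather than a full connection table.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy converter for unbranched C1-C6 chains; an illustrative assumption, NOT OPSIN's algorithm. */
final class ToyChainNamer {
    // Hypothetical lookup table: chain stem -> number of carbon atoms.
    private static final Map<String, Integer> STEMS = Map.of(
        "meth", 1, "eth", 2, "prop", 3, "but", 4, "pent", 5, "hex", 6);

    // Matches "hexane" or a single located unsaturation such as "hex-1-ene" / "hex-1-yne".
    private static final Pattern NAME =
        Pattern.compile("(meth|eth|prop|but|pent|hex)(?:ane|-(\\d+)-(ene|yne))");

    static String toSmiles(String name) {
        Matcher m = NAME.matcher(name);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported name: " + name);
        }
        int carbons = STEMS.get(m.group(1));
        // Locant 0 means "no multiple bond" (the -ane case).
        int locant = m.group(2) == null ? 0 : Integer.parseInt(m.group(2));
        if (m.group(2) != null && (locant < 1 || locant >= carbons)) {
            throw new IllegalArgumentException("Locant out of range: " + name);
        }
        String bond = "ene".equals(m.group(3)) ? "=" : "#";
        StringBuilder smiles = new StringBuilder();
        for (int atom = 1; atom <= carbons; atom++) {
            smiles.append('C');
            if (atom == locant) {
                smiles.append(bond); // bond between atom `locant` and the next atom
            }
        }
        return smiles.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSmiles("hexane"));    // prints CCCCCC
        System.out.println(toSmiles("hex-1-ene")); // prints C=CCCCC
        System.out.println(toSmiles("hex-1-yne")); // prints C#CCCCC
    }
}
```

A single pattern like this does not scale beyond the first row of the table; the remaining rows need a real grammar, which is the problem OPSIN's parser addresses.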

References

[1] Peter Corbett and Peter Murray-Rust. High-Throughput Identification of Chemistry in Life Science Texts. Lecture Notes in Computer Science. 2006, 4216, pp. 107-118. DOI: 10.1007/11875741_11

[2] Daniel M. Lowe, Peter T. Corbett, Peter Murray-Rust, Robert C. Glen. Chemical Name to Structure: OPSIN, an Open Source Solution. Journal of Chemical Information and Modeling. 2011, 51 (3), pp. 739-753. DOI: 10.1021/ci100384d

[3] Daniel M. Lowe. Extraction of chemical structures and reactions from the literature. Ph.D. Thesis, University of Cambridge, 2012. Available from Apollo.

[4] Daniel M. Lowe, Peter Murray-Rust, Robert C. Glen. OPSIN: Taming the Jungle of IUPAC Chemical Nomenclature. 6th Joint Sheffield Conference on Chemoinformatics, 2013.

Libraries

OPSIN uses dk.brics.automaton to provide a deterministic finite-state automaton for parsing chemical names. Woodstox is used as the XML framework for reading resource files and writing CML. JNI-InChI is used to generate InChIs. JUnit and Mockito are used for testing. The web interface is powered by Restlet, with the Indigo toolkit used for 2D coordinate generation and depiction.
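
For readers unfamiliar with automaton-based parsing: a deterministic finite-state automaton reads its input one character at a time and has exactly one next state for each state and character. The sketch below hand-codes such an automaton in plain Java for a deliberately tiny language: comma-separated locants with optional primes, such as 1,3,5 or 2,2',2''. It does not use dk.brics.automaton and does not reflect OPSIN's actual grammar; the class LocantListDfa and its states are hypothetical names chosen for illustration.

```java
/** Hand-written deterministic finite automaton; an illustrative assumption, not OPSIN code. */
final class LocantListDfa {
    private static final int START = 0;   // expecting the first digit of a locant
    private static final int NUMBER = 1;  // inside the digits of a locant
    private static final int PRIMES = 2;  // inside the primes that follow a locant
    private static final int REJECT = -1; // dead state: no valid continuation

    // Transition function: exactly one next state for each (state, character) pair.
    private static int step(int state, char c) {
        boolean digit = c >= '0' && c <= '9';
        switch (state) {
            case START:
                return digit ? NUMBER : REJECT;
            case NUMBER:
                return digit ? NUMBER : c == '\'' ? PRIMES : c == ',' ? START : REJECT;
            case PRIMES:
                return c == '\'' ? PRIMES : c == ',' ? START : REJECT;
            default:
                return REJECT;
        }
    }

    /** Returns true if the whole input is a well-formed locant list. */
    static boolean accepts(String input) {
        int state = START;
        for (int i = 0; i < input.length() && state != REJECT; i++) {
            state = step(state, input.charAt(i));
        }
        // Accepting states: the input must end inside a locant, not after a comma.
        return state == NUMBER || state == PRIMES;
    }

    public static void main(String[] args) {
        System.out.println(accepts("1,3,5"));    // prints true
        System.out.println(accepts("2,2',2''")); // prints true
        System.out.println(accepts("1,,3"));     // prints false
    }
}
```

Because each character triggers a single table-like lookup, matching time grows linearly with the length of the name, which is one general reason automata suit this kind of tokenisation.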

OPSIN's developers use YourKit to profile and optimise code.

YourKit supports open source projects with its full-featured Java Profiler. YourKit, LLC is the creator of YourKit Java Profiler and YourKit .NET Profiler, innovative and intelligent tools for profiling Java and .NET applications.

License and Warranty

OPSIN is licensed under the MIT License.

OPSIN is made available in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.