Does this do structural formulae too?
Was thinking of InChI[0] but on Googling SMILES and SELFIES I found this[1] talk, this[2] paper and my goodness I've been down a few rabbit holes since...
[0] https://en.wikipedia.org/wiki/International_Chemical_Identif...
[1] https://www.inchi-trust.org/wp/wp-content/uploads/2019/12/18...
[2] https://pubs.rsc.org/en/content/articlehtml/2022/dd/d1dd0001...
Note: There are two standardized formats for this called SMILES and SELFIES. SMILES is much better supported, but SELFIES is more robust. I'm integrating them into some bio and chem software I'm working on.
Given a SMILES string, you can do things like look up similar molecules through PubChem's API.
I believe most molecule editors can load and save SMILES.
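For anyone wanting to try the PubChem lookup, here is a rough sketch of what a similarity query might look like. It assumes PubChem's PUG REST `fastsimilarity_2d` operation and its `Threshold` / `MaxRecords` options; the function names `similarity_url` and `fetch_similar_cids` are made up for illustration, and the JSON layout read in the second function is my assumption, so check the PUG REST docs before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of PubChem's PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def similarity_url(smiles, threshold=90, max_records=10):
    """Build a 2D-similarity search URL for a SMILES string.

    The SMILES goes in the query string so that characters such as
    '#', '/' and '=' are percent-encoded instead of breaking the path.
    """
    query = urlencode(
        {"smiles": smiles, "Threshold": threshold, "MaxRecords": max_records}
    )
    return f"{PUG_REST}/compound/fastsimilarity_2d/smiles/cids/JSON?{query}"


def fetch_similar_cids(smiles, **options):
    """Hypothetical helper: run the query and return PubChem compound IDs.

    Needs network access. The response shape used here
    ({"IdentifierList": {"CID": [...]}}) is an assumption.
    """
    with urlopen(similarity_url(smiles, **options)) as response:
        payload = json.load(response)
    return payload.get("IdentifierList", {}).get("CID", [])


if __name__ == "__main__":
    # Only prints the URL; call fetch_similar_cids() to hit the network.
    print(similarity_url("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin
```

Building the URL separately from fetching it makes it easy to eyeball the request (or paste it into a browser) before wiring it into anything.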
What about InChI? Isn't that a common way of describing molecules as well?
Good point!
Does SMILES (Simplified Molecular Input Line Entry System) have an EBNF definition? https://en.wikipedia.org/wiki/Simplified_Molecular_Input_Lin... claims there is a context-free grammar.
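Not an EBNF, but to give a feel for how small the surface syntax is, here is a toy tokenizer plus a nesting check. It covers only a subset I picked for illustration (organic-subset atoms, bracket atoms treated as opaque, bonds, branches, ring-closure digits); it is not the official grammar, and `tokenize` and `is_well_formed` are names I made up.

```python
import re

# Token classes for a simplified subset of SMILES (an assumption, not the spec).
TOKEN_SPEC = [
    ("bracket", r"\[[^\[\]]+\]"),               # bracket atom, e.g. [NH4+]
    ("atom", r"Br|Cl|[BCNOPSFI]|[bcnops]|\*"),  # organic subset, aromatic, wildcard
    ("bond", r"[-=#$:/\\]"),                    # single, double, triple, ...
    ("ring", r"%\d{2}|\d"),                     # ring-closure label: 1 or %12
    ("open", r"\("),                            # branch start
    ("close", r"\)"),                           # branch end
    ("dot", r"\."),                             # disconnected components
]
TOKEN = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(smiles):
    """Split a SMILES string into (kind, text) pairs, left to right."""
    tokens, pos = [], 0
    while pos < len(smiles):
        match = TOKEN.match(smiles, pos)
        if match is None:
            raise ValueError(f"unexpected character {smiles[pos]!r} at position {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def is_well_formed(smiles):
    """Check that branches nest properly and every ring label is closed.

    This is purely syntactic: it says nothing about valence or chemistry.
    """
    depth = 0
    open_rings = set()
    for kind, text in tokenize(smiles):
        if kind == "open":
            depth += 1
        elif kind == "close":
            depth -= 1
            if depth < 0:
                return False
        elif kind == "ring":
            # A label opens a ring on first sight and closes it on the second.
            open_rings ^= {int(text.lstrip("%"))}
    return depth == 0 and not open_rings


if __name__ == "__main__":
    for s in ["c1ccccc1", "CC(=O)O", "C1CC", "CC(=O"]:
        print(s, is_well_formed(s))
```

The branch nesting is the part that needs a context-free grammar; matching up ring-closure labels goes beyond what the grammar alone expresses, which is why it is handled as a separate pass here.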