
medchem.utils

medchem.utils.smarts

SMARTSUtils

A collection of utilities that help users with little SMARTS experience build complex SMARTS queries more efficiently.

aliphatic_chain(min_size=6, unbranched=False, unsaturated_bondtype=None, allow_hetero_atoms=True) classmethod

Returns a query that can match a long aliphatic chain

Parameters:

Name Type Description Default
min_size int

minimum size of the long chain

6
unbranched bool

whether the chain should be unbranched

False
unsaturated_bondtype Optional[str]

additional unsaturated bond type to allow in the query. By default, any bond type (~) is used. Single bonds are always allowed, and the bond type cannot be aromatic

None
allow_hetero_atoms bool

whether the chain can contain hetero atoms

True
Example

To build a query for a long aliphatic chain of at least 5 atoms (e.g. 'CCC(C)CCC'):

SMARTSUtils.aliphatic_chain(min_size=5)

Returns:

Name Type Description
smarts str

smarts pattern matching a long aliphatic chain
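To make the parameters more concrete, here is a minimal sketch of how such a chain query could be assembled as a string. The helper name `sketch_aliphatic_chain` and the exact atom and bond primitives are assumptions made for illustration; the SMARTS that medchem actually returns may differ.

```python
from typing import Optional


def sketch_aliphatic_chain(
    min_size: int = 6,
    unbranched: bool = False,
    unsaturated_bondtype: Optional[str] = None,
    allow_hetero_atoms: bool = True,
) -> str:
    """Hypothetical helper: build a SMARTS for an acyclic aliphatic chain."""
    # "A" is any aliphatic atom, "C" an aliphatic carbon; R0 means "not in a ring".
    atom = "A;R0" if allow_hetero_atoms else "C;R0"
    if unbranched:
        # At most two explicit connections per atom, so no side chains.
        atom += ";D1,D2"
    atom = f"[{atom}]"
    # "~" matches any bond; otherwise allow single bonds plus the requested type.
    bond = "~" if unsaturated_bondtype is None else f"-,{unsaturated_bondtype}"
    return bond.join([atom] * min_size)


print(sketch_aliphatic_chain(min_size=3))
# [A;R0]~[A;R0]~[A;R0]
print(sketch_aliphatic_chain(min_size=3, unbranched=True, unsaturated_bondtype="="))
# [A;R0;D1,D2]-,=[A;R0;D1,D2]-,=[A;R0;D1,D2]
```

The sketch only shows how each argument could tighten or relax the pattern: `allow_hetero_atoms` changes the atom primitive, `unbranched` adds a degree constraint, and `unsaturated_bondtype` narrows the bond expression.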

atom_in_env(*smarts_strs, include_atoms=False, union=False) classmethod

Returns a recursive/group smarts to find an atom that fits in the environments defined by all the input smarts

Parameters:

Name Type Description Default
smarts_strs str

list of input patterns defining the environments the atom must fit in. The first atom of each pattern should be the atom to match, unless include_atoms is set to True, in which case [*:99] is added at the start of each pattern

()
include_atoms bool

whether to include an additional first atom that needs to be in the required environment or not

False
union bool

whether to use the union of the environments or the intersection

False
Example

You can use this function to construct a complex query when you are not sure how to write the SMARTS. For example, to find a carbon atom that is in a ring of size 6, bonded to a methoxy group, and has a fluorine in meta:

SMARTSUtils.atom_in_env("[#6;r6][OD2][C&D1]", "[c]aa[F]", union=False) # there are alternative ways to write this

Returns:

Name Type Description
smarts str

smarts pattern matching the group/environment
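The difference between intersection and union can be illustrated with plain SMARTS syntax: each environment becomes a recursive `$(...)` expression, and the expressions are joined with `;` (logical AND) or `,` (logical OR) inside a single atom bracket. The helper below, `sketch_atom_in_env`, is a hypothetical stand-in written for this illustration and is not guaranteed to reproduce medchem's output character for character.

```python
def sketch_atom_in_env(*smarts_strs: str, include_atoms: bool = False, union: bool = False) -> str:
    """Hypothetical helper: combine environments into one recursive SMARTS atom."""
    wrapped = []
    for pattern in smarts_strs:
        if include_atoms:
            # Prepend a generic atom that becomes the matched atom.
            pattern = "[*:99]" + pattern
        # $(...) tests whether the atom is the first atom of the pattern.
        wrapped.append(f"$({pattern})")
    # "," is OR (union of environments), ";" is low-precedence AND (intersection).
    separator = "," if union else ";"
    return "[" + separator.join(wrapped) + "]"


print(sketch_atom_in_env("[#6;r6][OD2][C&D1]", "[c]aa[F]"))
# [$([#6;r6][OD2][C&D1]);$([c]aa[F])]
print(sketch_atom_in_env("[#6;r6][OD2][C&D1]", "[c]aa[F]", union=True))
# [$([#6;r6][OD2][C&D1]),$([c]aa[F])]
```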

meta(smarts_str1, smarts_str2, aromatic_only=False) classmethod

Returns a recursive SMARTS string connecting the two input SMARTS in meta position to each other. The connection points need to be through single or double bonds.

Parameters:

Name Type Description Default
smarts_str1 str

first smarts pattern defining the first functional group

required
smarts_str2 str

second smarts pattern defining the second functional group

required
aromatic_only bool

whether the ring needs to be aromatic or not

False
Example

To build a SMARTS for a methyl group in meta to an oxygen (e.g. 'c1c(C)cc(O)cc1'):

SMARTSUtils.meta('[#6;!R]', '[#8]')

Returns:

Name Type Description
smarts str

smarts pattern connecting the two input smarts in meta of each other

ortho(smarts_str1, smarts_str2, aromatic_only=False) classmethod

Returns a recursive SMARTS string connecting the two input SMARTS in ortho position to each other. The connection points need to be through single or double bonds.

Parameters:

Name Type Description Default
smarts_str1 str

first smarts pattern defining the first functional group

required
smarts_str2 str

second smarts pattern defining the second functional group

required
aromatic_only bool

whether the ring needs to be aromatic or not

False
Example

To build a SMARTS for a methyl group in ortho to an oxygen (e.g. 'C1CC(C)C(O)CC1'):

SMARTSUtils.ortho('[#6;!R]', '[#8]')

Returns:

Name Type Description
smarts str

smarts pattern connecting the two input smarts in ortho of each other

para(smarts_str1, smarts_str2, aromatic_only=False) classmethod

Returns a recursive SMARTS string connecting the two input SMARTS in para position to each other. The connection points need to be through single or double bonds.

Parameters:

Name Type Description Default
smarts_str1 str

first smarts pattern defining the first functional group

required
smarts_str2 str

second smarts pattern defining the second functional group

required
aromatic_only bool

whether the ring needs to be aromatic or not

False
Example

To build a SMARTS for a methyl group in para to an oxygen (e.g. 'c1(C)ccc(O)cc1'):

SMARTSUtils.para('[#6;!R]', '[#8]')

Returns:

Name Type Description
smarts str

smarts pattern connecting the two input smarts in para of each other
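The ortho, meta and para helpers differ only in how far apart the two attachment points sit on the ring. As a small illustration, the following sketch classifies two substituent positions on a six-membered ring by their ring distance. The function `ring_relationship` and the 0-based position numbering are assumptions made for this example; they are not part of medchem.

```python
from typing import Optional


def ring_relationship(pos1: int, pos2: int, ring_size: int = 6) -> Optional[str]:
    """Hypothetical helper: name the relationship between two ring positions."""
    # Shortest distance around the ring, in bonds.
    distance = abs(pos1 - pos2) % ring_size
    distance = min(distance, ring_size - distance)
    return {1: "ortho", 2: "meta", 3: "para"}.get(distance)


# Ring positions of the substituents in the example SMILES used above,
# counting ring atoms from 0 in the order they are written.
print(ring_relationship(2, 3))  # 'C1CC(C)C(O)CC1' -> ortho
print(ring_relationship(1, 3))  # 'c1c(C)cc(O)cc1' -> meta
print(ring_relationship(0, 3))  # 'c1(C)ccc(O)cc1' -> para
```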

standardize_attachment(smiles, attach_tokens='[*:1]') classmethod

Standardize the attachment points in a SMILES string

Parameters:

Name Type Description Default
smiles str

SMILES string

required
attach_tokens str

Attachment point token to use as standard token

'[*:1]'
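As a rough illustration of what standardizing attachment points means, the sketch below rewrites a few common dummy-atom notations (`*`, `[*]`, `[2*]`, `[*:3]`) to a single token using a regular expression. The function `sketch_standardize_attachment` and the set of notations it recognizes are assumptions; medchem's own handling may cover more or different cases.

```python
import re

# Bracketed dummy atoms such as [*], [2*] or [*:3], or a bare "*" outside brackets.
_ATTACHMENT = re.compile(r"\[\d*\*(?::\d+)?\]|(?<!\[)\*")


def sketch_standardize_attachment(smiles: str, attach_tokens: str = "[*:1]") -> str:
    """Hypothetical helper: replace every recognized attachment token."""
    # A function is used as replacement so the token is inserted literally.
    return _ATTACHMENT.sub(lambda match: attach_tokens, smiles)


print(sketch_standardize_attachment("[2*]CCO"))
# [*:1]CCO
print(sketch_standardize_attachment("*CC[*:3]"))
# [*:1]CC[*:1]
```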

medchem.utils.loader

get_data_path(filename, module='medchem.data') cached

Return the filepath of an internal data file.

get_grammar(grammar=None, as_string=False)

Return the default lark grammar file for the medchem query system

Parameters:

Name Type Description Default
grammar Optional[Union[PathLike, str]]

The path to the grammar file. If None, the default medchem grammar file is used.

None
as_string bool

If True, return the grammar as a string. Defaults to False.

False

medchem.utils.graph

automorphism(mol, standardize=True, node_attrs=DEFAULT_NODE_ATTR, edge_attrs=DEFAULT_EDGE_ATTR)

Compute the automorphisms of a molecular graph

Parameters:

Name Type Description Default
mol Union[str, Mol]

input molecular graph

required
standardize bool

whether to standardize the compound or not

True
node_attrs List[str]

list of categorical atom attributes/properties to consider for node matching

DEFAULT_NODE_ATTR
edge_attrs List[str]

list of categorical bond attributes/properties to consider for edge matching

DEFAULT_EDGE_ATTR
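For readers unfamiliar with the term, an automorphism is a relabeling of the atoms that maps the graph onto itself while preserving atom and bond attributes. The brute-force sketch below enumerates automorphisms for tiny graphs using only the standard library. It assumes atoms are described by a single categorical label and that bonds carry no attributes; the function `brute_force_automorphisms` is hypothetical and is not how medchem computes them.

```python
from itertools import permutations


def brute_force_automorphisms(labels, edges):
    """Hypothetical helper: list all label-preserving automorphisms of a small graph."""
    n_atoms = len(labels)
    edge_set = {frozenset(edge) for edge in edges}
    found = []
    for perm in permutations(range(n_atoms)):
        # Node matching: an atom may only map to an atom with the same label.
        if any(labels[i] != labels[perm[i]] for i in range(n_atoms)):
            continue
        # Edge matching: the relabeled edges must equal the original edges.
        mapped = {frozenset((perm[a], perm[b])) for a, b in edges}
        if mapped == edge_set:
            found.append(perm)
    return found


# Heavy-atom graphs: propane (C-C-C) and ethanol (C-C-O).
print(brute_force_automorphisms(["C", "C", "C"], [(0, 1), (1, 2)]))
# [(0, 1, 2), (2, 1, 0)]
print(brute_force_automorphisms(["C", "C", "O"], [(0, 1), (1, 2)]))
# [(0, 1, 2)]
```

Enumerating all permutations scales factorially, so this only works for a handful of atoms. It is meant to show the role of node matching and edge matching, which is what the node_attrs and edge_attrs parameters control.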

score_symmetry(mol, exclude_self_mapped_edged=False, **automorphism_kwargs)

Provide a symmetry score for a given input molecule

Note

This is a heuristic, and our definition of symmetry is fairly loose. We define symmetry according to any plane (or set of planes) dividing the molecule into two very similar subgraphs. We include both edge and vertex transitivity. For example, the star-shaped molecular graph (e.g. neopentane) is symmetrical here, although it is not vertex-transitive.

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
exclude_self_mapped_edged bool

Whether to exclude edges that map to themselves in an automorphism.

False
automorphism_kwargs Any

keyword arguments used for determining the automorphisms

{}
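The neopentane remark in the note can be checked on a toy graph. A graph is vertex-transitive when every atom can be mapped onto every other atom by some automorphism, that is, when all atoms fall into a single orbit. The sketch below computes atom orbits by brute force for small, unlabeled heavy-atom graphs. The function `atom_orbits` is hypothetical and does not compute medchem's symmetry score; it only illustrates the notion of vertex transitivity.

```python
from itertools import permutations


def atom_orbits(n_atoms, edges):
    """Hypothetical helper: group atoms that automorphisms can map onto each other."""
    edge_set = {frozenset(edge) for edge in edges}
    reachable = [{i} for i in range(n_atoms)]
    for perm in permutations(range(n_atoms)):
        mapped = {frozenset((perm[a], perm[b])) for a, b in edges}
        if mapped == edge_set:  # perm is an automorphism
            for atom in range(n_atoms):
                reachable[atom].add(perm[atom])
    # Atoms in the same orbit end up with identical reachable sets.
    return {frozenset(group) for group in reachable}


# Neopentane heavy atoms: central carbon 0 bonded to four methyl carbons.
neopentane = atom_orbits(5, [(0, 1), (0, 2), (0, 3), (0, 4)])
print(len(neopentane))  # 2 orbits: the center and the four methyls

# Cyclohexane heavy atoms: a six-membered ring.
cyclohexane = atom_orbits(6, [(i, (i + 1) % 6) for i in range(6)])
print(len(cyclohexane))  # 1 orbit: vertex-transitive
```

Neopentane has two orbits, so it is not vertex-transitive, yet its four methyl groups are interchangeable, which is why the heuristic described in the note still treats it as symmetrical.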