medchem.utils
medchem.utils.smarts
SMARTSUtils
A collection of utilities that help less experienced users build complex SMARTS queries more efficiently.
aliphatic_chain(min_size=6, unbranched=False, unsaturated_bondtype=None, allow_hetero_atoms=True)
classmethod
Returns a query that can match a long aliphatic chain
Parameters:
Name | Type | Description | Default |
---|---|---|---|
min_size | int | minimum size of the long chain | 6 |
unbranched | bool | whether the chain should be unbranched | False |
unsaturated_bondtype | Optional[str] | additional unsaturated bond type to allow in the query. By default, any bond type (~) is used. Single bonds are always allowed, and the bond type cannot be aromatic | None |
allow_hetero_atoms | bool | whether the chain can contain hetero atoms | True |
Example
To build a query for a long aliphatic chain of at least 5 atoms (e.g. 'CCC(C)CCC'):
SMARTSUtils.aliphatic_chain(min_size=5)
Returns:
Name | Type | Description |
---|---|---|
smarts | str | smarts pattern matching a long aliphatic chain |
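To give a feel for how these parameters could translate into a SMARTS string, here is a minimal sketch in plain Python. It is not medchem's implementation: the helper name `chain_smarts_sketch` and the atom primitives `[A;R0]` and `[C;R0]` are assumptions chosen for illustration, and the `unbranched` option is deliberately left out.

```python
def chain_smarts_sketch(min_size=6, unsaturated_bondtype=None, allow_hetero_atoms=True):
    """Hypothetical helper: join `min_size` acyclic aliphatic atoms into one SMARTS."""
    if min_size < 1:
        raise ValueError("min_size must be at least 1")
    if unsaturated_bondtype == ":":
        raise ValueError("the bond type cannot be aromatic")
    # A = any aliphatic atom, C = aliphatic carbon, R0 = not in any ring
    atom = "[A;R0]" if allow_hetero_atoms else "[C;R0]"
    # "~" matches any bond; otherwise allow a single bond or the requested type
    bond = "~" if unsaturated_bondtype is None else "-," + unsaturated_bondtype
    return bond.join([atom] * min_size)


any_bond = chain_smarts_sketch(min_size=3)
carbon_only = chain_smarts_sketch(min_size=3, unsaturated_bondtype="=", allow_hetero_atoms=False)
print(any_bond)     # [A;R0]~[A;R0]~[A;R0]
print(carbon_only)  # [C;R0]-,=[C;R0]-,=[C;R0]
```

The pattern returned by `SMARTSUtils.aliphatic_chain` may well look different; the sketch only shows how the atom and bond choices described in the table can be combined.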
atom_in_env(*smarts_strs, include_atoms=False, union=False)
classmethod
Returns a recursive/group smarts to find an atom that fits in the environments defined by all the input smarts
Parameters:
Name | Type | Description | Default |
---|---|---|---|
smarts_strs | str | input patterns defining the environments the atom must fit in. The first atom of each pattern should be the atom to match, unless include_atoms is set to True, in which case [*:99] is added at the start of each pattern | () |
include_atoms | bool | whether to prepend an additional first atom that must sit in the required environment | False |
union | bool | whether to use the union of the environments (True) or their intersection (False) | False |
Example
You can use this function to construct a complex query when you are not sure how to write the SMARTS. For example, to find a carbon atom that is in a ring of size 6, bonded to a methoxy group, and has a fluorine in meta:
SMARTSUtils.atom_in_env("[#6;r6][OD2][C&D1]", "[c]aa[F]", union=False) # there are alternative ways to write this
Returns:
Name | Type | Description |
---|---|---|
smarts | str | smarts pattern matching the group/environment |
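The intersection and union described above map naturally onto recursive SMARTS, where `$(...)` wraps an environment, `;` is a logical AND and `,` is a logical OR. The following is a rough sketch of that composition in plain Python. The function name `combine_envs_sketch` is hypothetical, and the exact string produced by `SMARTSUtils.atom_in_env` may differ.

```python
def combine_envs_sketch(*smarts_strs, include_atoms=False, union=False):
    """Hypothetical helper: wrap each pattern in $(...) and join them in one atom."""
    if not smarts_strs:
        raise ValueError("at least one SMARTS pattern is required")
    if include_atoms:
        # prepend the extra atom that has to sit in each environment
        smarts_strs = tuple("[*:99]" + s for s in smarts_strs)
    operator = "," if union else ";"  # OR for union, AND for intersection
    return "[" + operator.join("$(" + s + ")" for s in smarts_strs) + "]"


both = combine_envs_sketch("[#6;r6][OD2][C&D1]", "[c]aa[F]")
either = combine_envs_sketch("[#6;r6][OD2][C&D1]", "[c]aa[F]", union=True)
print(both)    # [$([#6;r6][OD2][C&D1]);$([c]aa[F])]
print(either)  # [$([#6;r6][OD2][C&D1]),$([c]aa[F])]
```

Because a recursive SMARTS matches a single atom, the first atom of every pattern is the one being tested, which is why the order of atoms inside each input pattern matters.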
meta(smarts_str1, smarts_str2, aromatic_only=False)
classmethod
Returns a recursive SMARTS string connecting the two input SMARTS in meta position relative to each other.
Connection points need to be through single or double bonds.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
smarts_str1 | str | first smarts pattern defining the first functional group | required |
smarts_str2 | str | second smarts pattern defining the second functional group | required |
aromatic_only | bool | whether the ring needs to be aromatic or not | False |
Example
to build a smarts for a methyl group in meta to an oxygen (e.g: 'c1c(C)cc(O)cc1')
SMARTSUtils.meta('[#6;!R]', '[#8]')
Returns:
Name | Type | Description |
---|---|---|
smarts | str | smarts pattern connecting the two input smarts in meta |
ortho(smarts_str1, smarts_str2, aromatic_only=False)
classmethod
Returns a recursive SMARTS string connecting the two input SMARTS in ortho position relative to each other.
Connection points need to be through single or double bonds.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
smarts_str1 | str | first smarts pattern defining the first functional group | required |
smarts_str2 | str | second smarts pattern defining the second functional group | required |
aromatic_only | bool | whether the ring needs to be aromatic or not | False |
Example
to build a smarts for a methyl group in ortho to an oxygen (e.g: 'C1CC(C)C(O)CC1')
SMARTSUtils.ortho('[#6;!R]', '[#8]')
Returns:
Name | Type | Description |
---|---|---|
smarts | str | smarts pattern connecting the two input smarts in ortho |
para(smarts_str1, smarts_str2, aromatic_only=False)
classmethod
Returns a recursive SMARTS string connecting the two input SMARTS in para position relative to each other.
Connection points need to be through single or double bonds.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
smarts_str1 | str | first smarts pattern defining the first functional group | required |
smarts_str2 | str | second smarts pattern defining the second functional group | required |
aromatic_only | bool | whether the ring needs to be aromatic or not | False |
Example
to build a smarts for a methyl group in para to an oxygen (e.g: 'c1(C)ccc(O)cc1')
SMARTSUtils.para('[#6;!R]', '[#8]')
Returns:
Name | Type | Description |
---|---|---|
smarts | str | smarts pattern connecting the two input smarts in para |
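As a reminder of what ortho, meta and para mean on a six-membered ring, the relation depends only on how many ring bonds separate the two substituted atoms. The snippet below is an illustration of that definition in plain Python, not part of medchem; the function name `ring_relation` and the 0-based position numbering are assumptions.

```python
def ring_relation(pos_a, pos_b, ring_size=6):
    """Name the relation between two substituted positions of a six-membered ring."""
    diff = abs(pos_a - pos_b) % ring_size
    # shortest path around the ring, in number of bonds
    distance = min(diff, ring_size - diff)
    return {1: "ortho", 2: "meta", 3: "para"}.get(distance)


print(ring_relation(0, 1))  # ortho
print(ring_relation(0, 2))  # meta
print(ring_relation(0, 3))  # para
print(ring_relation(0, 5))  # ortho (neighbours across the ring closure)
```

The three classmethods above encode the same distances as SMARTS patterns, so that the two functional groups are attached to ring atoms one, two or three bonds apart.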
standardize_attachment(smiles, attach_tokens='[*:1]')
classmethod
Standardize an attachment point in a smiles
Parameters:
Name | Type | Description | Default |
---|---|---|---|
smiles | str | SMILES string | required |
attach_tokens | str | Attachment point token to use as standard token | '[*:1]' |
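One way to picture this kind of standardization is a regular-expression substitution that rewrites common dummy-atom spellings (`*`, `[*]`, `[2*]`, `[*:3]`) to a single token. The sketch below assumes exactly that and nothing more: the name `standardize_attachment_sketch` and the regular expression are hypothetical, more complex bracket atoms such as `[*H]` are not handled, and medchem's own behaviour may differ.

```python
import re

# Matches "[*]", "[2*]", "[*:3]", "[2*:3]" or a bare "*"
_ATTACHMENT = re.compile(r"\[\d*\*(?::\d+)?\]|\*")


def standardize_attachment_sketch(smiles, attach_token="[*:1]"):
    """Hypothetical helper: rewrite every simple attachment point to one token."""
    # a function is used as replacement so the token is inserted literally
    return _ATTACHMENT.sub(lambda _match: attach_token, smiles)


print(standardize_attachment_sketch("*CCO"))          # [*:1]CCO
print(standardize_attachment_sketch("[2*]c1ccccc1"))  # [*:1]c1ccccc1
print(standardize_attachment_sketch("CC[*:3]"))       # CC[*:1]
```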
medchem.utils.loader
get_data_path(filename, module='medchem.data')
cached
Return the filepath of an internal data file.
get_grammar(grammar=None, as_string=False)
Return the default lark grammar file for the medchem query system
Parameters:
Name | Type | Description | Default |
---|---|---|---|
grammar | Optional[Union[PathLike, str]] | The path to the grammar file. If None, the default medchem grammar file is used. | None |
as_string | bool | If True, return the grammar as a string. Defaults to False. | False |
medchem.utils.graph
automorphism(mol, standardize=True, node_attrs=DEFAULT_NODE_ATTR, edge_attrs=DEFAULT_EDGE_ATTR)
Compute the automorphisms of a molecular graph
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[str, Mol] | input molecular graph | required |
standardize | bool | whether to standardize the compound or not | True |
node_attrs | List[str] | list of categorical atom attributes/properties to consider for node matching | DEFAULT_NODE_ATTR |
edge_attrs | List[str] | list of categorical bond attributes/properties to consider for edge matching | DEFAULT_EDGE_ATTR |
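An automorphism is a relabelling of the atoms that maps the graph onto itself while respecting the chosen node (and edge) attributes. The brute-force sketch below illustrates the idea on tiny hand-written graphs. It is not how medchem computes automorphisms: the function name `automorphisms_sketch` is hypothetical, only one categorical node attribute (the element symbol) is considered, and edge attributes are ignored.

```python
from itertools import permutations


def automorphisms_sketch(labels, edges):
    """Brute-force enumeration; only practical for very small graphs."""
    n = len(labels)
    edge_set = {frozenset(e) for e in edges}
    found = []
    for perm in permutations(range(n)):
        # node matching: an atom may only map onto an atom with the same label
        if any(labels[i] != labels[perm[i]] for i in range(n)):
            continue
        # edge matching: the relabelled edge set must equal the original one
        if {frozenset((perm[a], perm[b])) for a, b in edges} == edge_set:
            found.append(perm)
    return found


chain = [(0, 1), (1, 2)]  # three heavy atoms in a row
propane = automorphisms_sketch(["C", "C", "C"], chain)  # identity + end-to-end flip
ethanol = automorphisms_sketch(["C", "C", "O"], chain)  # identity only
print(len(propane), len(ethanol))  # 2 1
```

The comparison between the two chains shows why the attribute lists matter: the more attributes are used for matching, the fewer mappings count as automorphisms.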
score_symmetry(mol, exclude_self_mapped_edged=False, **automorphism_kwargs)
Provide a symmetry score for a given input molecule
Note
This is a heuristic, and our definition of symmetry is fairly loose. We define symmetry according to any plane (or set of planes) dividing the molecule into two very similar subgraphs. We include both edge and vertex transitivity. For example, the star molecular graph (e.g. neopentane) is symmetrical here, although it is not vertex-transitive.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[Mol, str] | input molecule | required |
exclude_self_mapped_edged | bool | Whether to exclude edges that map to themselves in an automorphism. | False |
automorphism_kwargs | Any | keyword arguments used when determining the automorphisms | {} |
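The neopentane remark in the note can be checked by hand on its carbon skeleton, a star with one central atom and four leaves. The sketch below enumerates the automorphisms by brute force and tests both kinds of transitivity. It only illustrates the graph-theory terms and says nothing about how `score_symmetry` computes its score; all function names are hypothetical.

```python
from itertools import permutations


def automorphisms(n, edges):
    """All node permutations that map the edge set onto itself (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    return [
        p for p in permutations(range(n))
        if {frozenset((p[a], p[b])) for a, b in edges} == edge_set
    ]


def is_vertex_transitive(n, autos):
    # the orbit of node 0 must cover every node
    return {p[0] for p in autos} == set(range(n))


def is_edge_transitive(edges, autos):
    # the orbit of the first edge must cover every edge
    a, b = edges[0]
    return {frozenset((p[a], p[b])) for p in autos} == {frozenset(e) for e in edges}


star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]  # node 0 is the central carbon
autos = automorphisms(5, star_edges)
vertex_transitive = is_vertex_transitive(5, autos)
edge_transitive = is_edge_transitive(star_edges, autos)
print(len(autos), vertex_transitive, edge_transitive)  # 24 False True
```

The central atom can never be mapped onto a leaf, so the graph is not vertex-transitive, yet every bond can be mapped onto every other bond, which is the sense in which the note treats neopentane as symmetrical.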