medchem.groups
medchem.groups.list_default_chemical_groups(hierarchy=False)
List all the available chemical groups.
Note
Chemical groups define how a collection of patterns is organized. They do not correspond to individual pattern names.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
hierarchy | bool | whether to return the full hierarchy or the group name only | False |
Returns:
Type | Description |
---|---|
list | List of chemical groups |
medchem.groups.list_functional_group_names(unique=True)
List common functional group names
Parameters:
Name | Type | Description | Default |
---|---|---|---|
unique | bool | whether to return only unique names | True |
Returns:
Type | Description |
---|---|
list | List of functional group names |
medchem.groups.get_functional_group_map()
cached
Map functional groups to their corresponding SMARTS string.
Returns:
Type | Description |
---|---|
dict | Dictionary mapping each functional group name to its SMARTS string |
medchem.groups.ChemicalGroup
Build a library of chemical groups from a list of structures parsed from a file.
The default library of structures has been curated from https://github.com/Sulstice/global-chem and additional open source data.
Note
For new chemical groups, provide at minimum the 'smiles'/'smarts', 'name' and 'group' columns; the 'hierarchy' column is optional.
Warning
The SMILES and SMARTS used in the default list of substructures do not result in the same matches. Unless specified otherwise, the SMILES are used in the matching done by this class, whereas, due to an RDKit limitation, the SMARTS are used in the matching done by the generated catalog.
dataframe
property
Get the dataframe of the chemical groups
mol_adjusted
property
Get the molecule objects of the SMILES, adjusted for a stricter match, for the chemical groups in this instance
mol_smarts
property
Get the molecule objects built from the SMARTS of the chemical groups in this instance
mols
property
Get the molecule objects of the SMILES for the chemical groups in this instance
name
property
Get the names of the chemical groups in this instance
smarts
property
Get the SMARTS of the chemical groups in this instance
smiles
property
Get the SMILES of the chemical groups in this instance
__init__(groups=None, n_jobs=None, groups_db=None)
Build a chemical group library
Parameters:
Name | Type | Description | Default |
---|---|---|---|
groups | Optional[Union[str, List[str]]] | List of groups to use. Defaults to None where all functional groups are used | None |
n_jobs | Optional[int] | Optional number of jobs to run in parallel for internally building the data. Defaults to None. | None |
groups_db | Optional[Union[PathLike, str]] | Path to a file containing the dump of the chemical groups. Default is internal dataset | None |
filter(names, fuzzy=False)
Filter the groups to keep only those whose names are given in the input
Parameters:
Name | Type | Description | Default |
---|---|---|---|
names | List[str] | list of names to use for filters | required |
fuzzy | bool | whether to use exact or fuzzy matching | False |
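The difference between exact and fuzzy name filtering can be illustrated with a small standalone sketch. It does not call medchem: it assumes that "fuzzy" means a case-insensitive substring match, which is only one plausible reading of the parameter. The helper `filter_names` and the sample names are hypothetical.

```python
from typing import List


def filter_names(available: List[str], names: List[str], fuzzy: bool = False) -> List[str]:
    """Keep the entries of `available` selected by `names`.

    Hypothetical helper, not part of medchem. Assumption: fuzzy matching is a
    case-insensitive substring test, exact matching is plain string equality.
    """
    if not fuzzy:
        wanted = set(names)
        return [name for name in available if name in wanted]
    lowered = [query.lower() for query in names]
    return [name for name in available if any(q in name.lower() for q in lowered)]


# Made-up group names, for illustration only.
available = ["Amide", "Sulfonamide", "Ketone"]

exact = filter_names(available, ["Amide"])              # equality only
fuzzy = filter_names(available, ["amide"], fuzzy=True)  # substring, case-insensitive
```

Under this assumption, the exact filter keeps only the entry spelled exactly as requested, while the fuzzy filter also keeps names that merely contain the query.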
get_catalog(exact_match=True)
Build an RDKit catalog from the current chemical group data
Parameters:
Name | Type | Description | Default |
---|---|---|---|
exact_match | bool | whether to adjust the queries for a more stringent match | True |
get_matches(mol, use_smiles=True, exact_match=False, terminal_only=False)
Get all the functional groups in this instance that match the input molecule
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[Mol, str] | input molecule | required |
use_smiles | bool | whether to use the SMILES representation of the catalog or the SMARTS | True |
exact_match | bool | whether to use exact matching by adjusting the query | False |
terminal_only | bool | whether to ensure that the matches to the functional group are terminal, meaning that a matched subgraph should not be in the middle of the molecule | False |
has_match(mol, exact_match=False, terminal_only=False)
Check whether the input molecule has any functional group in this instance
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[Mol, str] | input molecule | required |
exact_match | bool | whether to use exact matching by adjusting the query | False |
terminal_only | bool | ensure the matches to the functional group are terminal | False |
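A minimal usage sketch of the matching methods, relying only on the signatures documented on this page. It assumes that the names returned by list_default_chemical_groups() are valid values for the groups argument, and it restricts the library to the first of them to keep construction light. The helper name `find_groups` is hypothetical, and the return type of get_matches is not stated here, so the result is passed through untouched. medchem is imported inside the function so the snippet can be loaded without it installed.

```python
def find_groups(smiles: str):
    """Return (found, matches) for one molecule given as a SMILES string.

    Hypothetical helper built on the documented ChemicalGroup API.
    """
    # Lazy import: medchem is only required when the function is called.
    from medchem.groups import ChemicalGroup, list_default_chemical_groups

    # Assumption: these names are accepted by the `groups` argument.
    group_names = list_default_chemical_groups()
    chem_group = ChemicalGroup(groups=group_names[:1])

    # Quick yes/no check, then the detailed matches using the SMILES patterns.
    found = chem_group.has_match(smiles)
    matches = chem_group.get_matches(smiles, use_smiles=True)
    return found, matches
```

Calling `find_groups("CC(=O)Nc1ccc(O)cc1")`, for instance, would check one molecule against that single chemical group; passing `terminal_only=True` to either method would further restrict the matches as described in the parameter tables.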
list_groups()
List all the chemical groups available
list_hierarchy_groups()
List all the hierarchy in chemical groups available.
To get the full hierarchy on each path, split by the "." character.
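Splitting the hierarchy paths can be sketched as follows. The paths below are made up to stand in for the output of list_hierarchy_groups(), and the helper `split_hierarchy` is hypothetical; only the "." separator comes from the description above.

```python
from typing import Dict, List


def split_hierarchy(paths: List[str]) -> Dict[str, List[str]]:
    """Map each dot-separated hierarchy path to its list of levels."""
    return {path: path.split(".") for path in paths}


# Made-up paths, for illustration only.
paths = ["rings.aromatic", "rings.aromatic.heterocycles"]

levels = split_hierarchy(paths)
# The last level of each path is the most specific group on that path.
leaf_names = [parts[-1] for parts in levels.values()]
```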