
medchem.groups

medchem.groups.list_default_chemical_groups(hierarchy=False)

List all the available chemical groups.

Note

Chemical groups define how a collection of patterns is organized. They do not correspond to individual pattern names.

Parameters:

Name Type Description Default
hierarchy bool

whether to return the full hierarchy or the group name only

False

Returns:

Type Description
list

List of chemical groups

medchem.groups.list_functional_group_names(unique=True)

List common functional group names

Parameters:

Name Type Description Default
unique bool

whether to return only unique names

True

Returns:

Type Description
list

List of functional group names

medchem.groups.get_functional_group_map() cached

Map functional groups to their corresponding SMARTS string.

Returns:

Type Description
dict

Mapping of functional group names to their SMARTS strings
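The `cached` marker suggests that repeated calls return the same object. The sketch below illustrates that behaviour with a hypothetical stand-in function, `get_group_map`, memoized with `functools.lru_cache`. This is an assumption about how the caching works, not a description of the library's internals. The three SMARTS strings are generic sample patterns and are not taken from the medchem dataset.

```python
import functools


@functools.lru_cache(maxsize=None)
def get_group_map():
    """Hypothetical stand-in for a cached name -> SMARTS lookup."""
    # Sample data only: generic SMARTS patterns, not the medchem dataset.
    return {"hydroxyl": "[OX2H]", "carbonyl": "[CX3]=[OX1]"}


first = get_group_map()
second = get_group_map()

# With a memoized function, both calls hand back the very same dict object,
# so a caller who wants to add entries should work on a copy.
safe_copy = dict(first)
safe_copy["nitrile"] = "[NX1]#[CX2]"
```

If the real function behaves the same way, mutating the returned dict in place would affect every later caller, so copying before editing is the safer habit.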

medchem.groups.ChemicalGroup

Build a library of chemical groups using a list of structures parsed from a file

The default library of structures has been curated from https://github.com/Sulstice/global-chem and additional open-source data.

Note

For new chemical groups, please provide at minimum the 'smiles'/'smarts', 'name', and 'group' columns, and optionally a 'hierarchy' column.

Warning

The SMILES and SMARTS used in the default list of substructures do not produce the same matches. Unless specified otherwise, this class matches with the SMILES, whereas the generated catalog matches with the SMARTS because of an RDKit limitation.

dataframe property

Get the dataframe of the chemical groups

mol_adjusted property

Get the Molecule objects of the SMILES, adjusted for a stricter match, for the chemical groups in this instance

mol_smarts property

Get the Molecule objects of the SMARTS for the chemical groups in this instance

mols property

Get the Molecule objects of the SMILES for the chemical groups in this instance

name property

Get the names of the chemical groups in this instance

smarts property

Get the SMARTS of the chemical groups in this instance

smiles property

Get the SMILES of the chemical groups in this instance

__init__(groups=None, n_jobs=None, groups_db=None)

Build a chemical group library

Parameters:

Name Type Description Default
groups Optional[Union[str, List[str]]]

Group name or list of group names to use. Defaults to None, in which case all groups are used.

None
n_jobs Optional[int]

Optional number of jobs to run in parallel for internally building the data. Defaults to None.

None
groups_db Optional[Union[PathLike, str]]

Path to a file containing the dump of the chemical groups. Defaults to the internal dataset.

None

filter(names, fuzzy=False)

Restrict the groups to only the names given as input

Parameters:

Name Type Description Default
names List[str]

list of names to filter by

required
fuzzy bool

whether to use fuzzy matching instead of exact matching

False
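To make the difference between exact and fuzzy filtering concrete, here is a minimal standard-library sketch. The helper `filter_names` and the sample names are hypothetical. It also assumes that "fuzzy" means a case-insensitive substring match, which may differ from the rule the library actually applies.

```python
import re
from typing import List


def filter_names(available: List[str], names: List[str], fuzzy: bool = False) -> List[str]:
    """Hypothetical helper: keep the entries of `available` selected by `names`."""
    if not names:
        return list(available)
    if fuzzy:
        # Assumption: fuzzy = case-insensitive substring match on any given name.
        pattern = re.compile("|".join(re.escape(n) for n in names), re.IGNORECASE)
        return [a for a in available if pattern.search(a)]
    # Exact mode: the name must be identical, including case.
    wanted = set(names)
    return [a for a in available if a in wanted]


available = ["pyridine", "piperidine", "Pyrimidine", "benzene"]
exact = filter_names(available, ["pyridine"])
loose = filter_names(available, ["pyri"], fuzzy=True)
```

Under these assumptions, `exact` keeps only the identical name, while `loose` also picks up names that merely contain the query.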

get_catalog(exact_match=True)

Build an RDKit catalog from the current chemical group data

Parameters:

Name Type Description Default
exact_match bool

whether to adjust the queries for a more stringent match

True

get_matches(mol, use_smiles=True, exact_match=False, terminal_only=False)

Get all the functional groups in this instance that match the input molecule

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
use_smiles bool

whether to match with the SMILES representation of the groups (True) or with the SMARTS (False)

True
exact_match bool

whether to use exact matching by adjusting the query

False
terminal_only bool

whether to ensure that the matches to the functional group are terminal, meaning that a matched subgraph should not lie in the middle of the molecule

False

has_match(mol, exact_match=False, terminal_only=False)

Check whether the input molecule contains any functional group from this instance

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
exact_match bool

whether to use exact matching by adjusting the query

False
terminal_only bool

whether to ensure that the matches to the functional group are terminal

False
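A common use of `has_match` is to screen a list of molecules. The sketch below shows one way to do that. It relies only on the documented signature `has_match(mol, exact_match=False, terminal_only=False)` and on `mol` accepting a string. `StubGroup` and `keep_matching` are hypothetical names introduced for illustration. The stub uses a plain text comparison rather than real substructure matching, so the example runs without medchem installed.

```python
from typing import Iterable, List


class StubGroup:
    """Hypothetical stand-in exposing the documented has_match signature."""

    def __init__(self, fragments: Iterable[str]):
        self.fragments = list(fragments)

    def has_match(self, mol: str, exact_match: bool = False, terminal_only: bool = False) -> bool:
        # Toy rule: substring test on the SMILES text, NOT substructure matching.
        return any(fragment in mol for fragment in self.fragments)


def keep_matching(smiles_list: Iterable[str], group, exact_match: bool = False,
                  terminal_only: bool = False) -> List[str]:
    """Return the SMILES for which group.has_match(...) is true."""
    return [
        smiles
        for smiles in smiles_list
        if group.has_match(smiles, exact_match=exact_match, terminal_only=terminal_only)
    ]


hits = keep_matching(["CCO", "c1ccccc1", "CC(=O)O"], StubGroup(["c1ccccc1"]))
```

With medchem installed, a `ChemicalGroup` instance built from names returned by `list_default_chemical_groups()` could be passed in place of the stub, since `keep_matching` only calls `has_match`.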

list_groups()

List all the chemical groups available

list_hierarchy_groups()

List all the hierarchies available in the chemical groups. To get the full hierarchy on each path, split by the . character.
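Splitting a hierarchy path on the `.` character can be sketched as follows. The helper `expand_hierarchy` and the example path are hypothetical and only illustrate the string handling. Real paths would come from `list_hierarchy_groups()`.

```python
from typing import List


def expand_hierarchy(path: str) -> List[str]:
    """Hypothetical helper: list every ancestor level of a dotted path."""
    parts = path.split(".")
    # Rebuild each prefix so the result goes from the root to the full path.
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


# Made-up path for illustration; it is not taken from the medchem dataset.
levels = expand_hierarchy("rings.aromatic.heteroaromatic")
```

Each entry in `levels` is one level of the hierarchy, ordered from the most general group to the most specific one.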