
medchem.catalogs

medchem.catalogs.list_named_catalogs()

List all available named catalogs. Chemical groups are not included in this list.

Tip

For a list of the chemical groups that can be queried through NamedCatalogs.chemical_groups, use medchem.groups.list_default_chemical_groups.

medchem.catalogs.merge_catalogs(*catalogs)

Merge several catalogs into a single one

Returns:

| Name | Type | Description |
| --- | --- | --- |
| catalog | FilterCatalog | merged catalog |
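As a minimal sketch of how merging might be used, the function below builds two small catalogs with catalog_from_smarts and merges them. The SMARTS patterns and labels are illustrative only, and the example assumes the merged object is an RDKit FilterCatalog exposing GetNumEntries(). The imports are kept inside the function so the RDKit-backed import only happens when it is called.

```python
def merge_custom_catalogs():
    """Build two small SMARTS catalogs and merge them into one."""
    # Lazy import: medchem (and RDKit underneath) is only loaded on call.
    from medchem.catalogs import catalog_from_smarts, merge_catalogs

    # Illustrative patterns only, not a curated alert list.
    acyl_halides = catalog_from_smarts(["C(=O)[Cl,Br,I]"], labels=["acyl halide"])
    nitro = catalog_from_smarts(["[N+](=O)[O-]"], labels=["nitro"])

    merged = merge_catalogs(acyl_halides, nitro)

    # Assumption: the result is an RDKit FilterCatalog, so GetNumEntries()
    # reports how many entries ended up in the merged catalog.
    return merged.GetNumEntries()
```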

medchem.catalogs.catalog_from_smarts(smarts, labels=None, mincounts=None, maxcounts=None, entry_as_inds=False)

Build a catalog from a list of SMARTS patterns

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| smarts | Union[Sequence[str], ndarray, Series] | list of input SMARTS patterns to add to the catalog | required |
| labels | Optional[Union[Sequence[str], ndarray, Series]] | list of labels, one per SMARTS pattern | None |
| mincounts | Optional[Union[Sequence[int], ndarray, Series]] | minimum number of matches before a pattern is recognized as a hit | None |
| maxcounts | Optional[Union[Sequence[int], ndarray, Series]] | maximum number of matches for a hit to remain valid | None |
| entry_as_inds | bool | if True, use the index as the entry id; otherwise use the label | False |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| catalog | FilterCatalog | filter catalog built from the input SMARTS |
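The following sketch shows one way the returned catalog might be queried. The helper name flag_smiles, the SMARTS patterns, and the labels are hypothetical; the example assumes the returned FilterCatalog is the RDKit class, whose GetMatches() returns entries with a GetDescription() accessor.

```python
def flag_smiles(smiles):
    """Return the labels of the custom patterns matched by a SMILES string."""
    # Lazy imports: RDKit and medchem are only loaded when the helper runs.
    from rdkit import Chem
    from medchem.catalogs import catalog_from_smarts

    catalog = catalog_from_smarts(
        smarts=["[CX3](=O)[OX2H1]", "c1ccccc1"],
        labels=["carboxylic acid", "benzene ring"],
        # One minimum count per pattern: the second pattern is only
        # reported when it matches at least twice.
        mincounts=[1, 2],
    )

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")

    # With entry_as_inds=False, each entry is expected to carry its label.
    return [entry.GetDescription() for entry in catalog.GetMatches(mol)]
```

Passing maxcounts works the same way, with one upper bound per pattern.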

medchem.catalogs.NamedCatalogs

Holder for substructure matching catalogs. This class provides several popular and custom catalogs that can be used for substructure matching.

All the catalogs are cached using functools.lru_cache to avoid reloading them every time.
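The caching pattern can be illustrated with the standard library alone. This is a sketch of the general technique, not medchem's actual code: CatalogHolder, patterns, and load_count are hypothetical names, and the tuple stands in for an expensive catalog load.

```python
import functools


class CatalogHolder:
    """Hypothetical stand-in for a class exposing cached catalog loaders."""

    load_count = 0  # how many times the expensive load actually ran

    @staticmethod
    @functools.lru_cache(maxsize=32)
    def patterns(name="default"):
        CatalogHolder.load_count += 1
        # Stand-in for an expensive read and parse of pattern files.
        return (f"{name}-pattern-1", f"{name}-pattern-2")


first = CatalogHolder.patterns("demo")
second = CatalogHolder.patterns("demo")  # served from the cache

# The loader ran once, and both calls return the very same object.
print(CatalogHolder.load_count, first is second)
```

Because lru_cache keys on the call arguments, they must be hashable: a string or tuple works, while a list raises TypeError.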

Note

A filter catalog is a collection of substructures and molecular patterns (SMARTS) used to flag molecules with (un)desirable structural properties.

alerts(subset=None) staticmethod

Alert filter catalogs commonly used in molecule filtering

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| subset | Optional[Union[List[str], str]] | subset of alert providers to consider | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| catalog | FilterCatalog | filter catalog |

bredt() cached staticmethod

Bredt filter rules: a catalog for filtering out unstable molecules, well suited to molecules generated by deep learning models or by chemical space enumeration. See the example of usage by surge.

chemical_groups(filters='medicinal') cached staticmethod

Chemical group filter catalogs

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| filters | Union[str, List[str]] | tag or list of tags to filter the catalog on | 'medicinal' |

nibr() cached staticmethod

Catalog from NIBR (Novartis Institutes for BioMedical Research)

Warning

This returns every entry in the catalog, regardless of severity (FLAG, EXCLUDE, ANNOTATION). You likely don't want to use it for blind prioritization.

tox(pains_a=True, pains_b=True, pains_c=False, brenk=True, nih=False, zinc=False) cached staticmethod

Common toxicity and interference catalog

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| pains_a | bool | whether to include PAINS filters (assay A) | True |
| pains_b | bool | whether to include PAINS filters (assay B) | True |
| pains_c | bool | whether to include PAINS filters (assay C) | False |
| brenk | bool | whether to include BRENK filters, also known as Dundee filters | True |
| nih | bool | whether to include NIH filters | False |
| zinc | bool | whether to include ZINC filters | False |
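A hedged usage sketch for these flags: the helper names below are hypothetical, and the example assumes tox() returns an RDKit FilterCatalog (as the other methods on this page document), so that HasMatch() is available.

```python
def build_tox_catalog():
    """Select a custom mix of toxicity and interference filters."""
    # Lazy import: medchem is only loaded when the helper runs.
    from medchem.catalogs import NamedCatalogs

    # Keep PAINS A and Brenk, drop PAINS B, and add the NIH filters.
    return NamedCatalogs.tox(
        pains_a=True,
        pains_b=False,
        pains_c=False,
        brenk=True,
        nih=True,
        zinc=False,
    )


def has_tox_alert(smiles):
    """Return True if the molecule matches any entry of the catalog."""
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")

    # Assumption: the catalog is an RDKit FilterCatalog exposing HasMatch().
    return build_tox_catalog().HasMatch(mol)
```

Since the method is cached, repeated calls with the same flag combination should not rebuild the catalog.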

unstable_graph(severity_threshold=5) cached staticmethod

Catalog of unstable molecular graph patterns to filter out, well suited to de novo generated molecules.

Warning

This method returns the problematic patterns, that is, those whose severity is at least the threshold.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| severity_threshold | int | minimum severity for a pattern to be returned | 5 |
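To get a feel for the threshold, one might compare catalog sizes at different severities. This is a sketch under the assumption that unstable_graph() returns an RDKit FilterCatalog with GetNumEntries(); the helper name is hypothetical.

```python
def count_unstable_patterns(severity_threshold=5):
    """Return how many patterns are kept at a given severity threshold."""
    # Lazy import: medchem is only loaded when the helper runs.
    from medchem.catalogs import NamedCatalogs

    catalog = NamedCatalogs.unstable_graph(severity_threshold=severity_threshold)

    # Assumption: the catalog is an RDKit FilterCatalog.
    return catalog.GetNumEntries()
```

Raising the threshold should keep the same number of patterns or fewer, since only the more severe ones remain.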