medchem.rules

medchem.rules.RuleFilters

Build a filter based on a compound's physicochemical properties. For a list of default rules, use RuleFilters.list_available_rules(). Most of these rules have been collected from the literature, including https://fafdrugs4.rpbs.univ-paris-diderot.fr/descriptors.html

__call__(mols, n_jobs=-1, progress=False, progress_leave=False, scheduler='auto', keep_props=False, fail_if_invalid=True)

Compute the rules for a list of molecules

Parameters:

Name Type Description Default
mols Sequence[Union[str, Mol]]

list of input molecule objects.

required
n_jobs Optional[int]

number of jobs to run in parallel.

-1
progress bool

whether to show progress or not.

False
progress_leave bool

whether to leave the progress bar or not.

False
scheduler str

which scheduler to use. If "auto", will use "processes" if len(mols) > 500 else "threads".

'auto'
keep_props bool

whether to keep the properties columns computed by the rules.

False
fail_if_invalid bool

whether to fail if a rule fails or not.

True

Returns:

Name Type Description
df DataFrame

Dataframe where each row is a molecule and each column is the outcome of applying self.rules[column].
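The "auto" scheduler choice described in the parameters above can be sketched in plain Python. `resolve_scheduler` is a hypothetical helper name used only for illustration and is not part of medchem; only the 500-molecule threshold comes from the parameter description.

```python
from typing import Sequence


def resolve_scheduler(mols: Sequence, scheduler: str = "auto") -> str:
    """Hypothetical helper mirroring the documented "auto" behavior."""
    if scheduler != "auto":
        # Any explicit choice is passed through unchanged.
        return scheduler
    # Documented rule: "processes" if len(mols) > 500 else "threads".
    return "processes" if len(mols) > 500 else "threads"
```

Under this reading, a batch of exactly 500 molecules would still use threads, since the comparison is strict.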

__getitems__(name)

Return a specific rule

__init__(rule_list, rule_list_names=None)

Build a rule filtering object

Parameters:

Name Type Description Default
rule_list List[Union[str, Callable]]

list of rules to apply. Either a callable that takes a molecule as input (with kwargs) or a string of the name of a pre-defined rule as defined in the basic_rules module

required
rule_list_names Optional[List[Optional[str]]]

Name of the rules passed as inputs. Defaults to None.

None
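As a rough illustration of how a mixed list of rule names and callables might be resolved, here is a minimal sketch. It does not use medchem: `PREDEFINED`, `resolve_rules`, and `small_and_polar` are hypothetical names, the custom thresholds are arbitrary, and the naming fallback (function name when no explicit name is given) is an assumption.

```python
from typing import Callable, Dict, List, Optional, Union

# Hypothetical stand-in for the registry of pre-defined rules (assumption).
PREDEFINED: Dict[str, Callable[..., bool]] = {
    "rule_of_gsk_4_400": lambda mol, mw=None, clogp=None, **_: mw <= 400 and clogp <= 4,
}


def resolve_rules(
    rule_list: List[Union[str, Callable]],
    rule_list_names: Optional[List[Optional[str]]] = None,
) -> Dict[str, Callable]:
    """Map each entry to a named callable; a sketch, not medchem's code."""
    names = rule_list_names or [None] * len(rule_list)
    resolved: Dict[str, Callable] = {}
    for rule, name in zip(rule_list, names):
        if isinstance(rule, str):
            # Strings are looked up among the pre-defined rules.
            resolved[name or rule] = PREDEFINED[rule]
        else:
            # Callables are used as-is; fall back to the function name.
            resolved[name or rule.__name__] = rule
    return resolved


def small_and_polar(mol, mw=None, tpsa=None, **kwargs) -> bool:
    """Custom rule on precomputed descriptors (illustrative thresholds)."""
    return mw <= 250 and tpsa >= 40


rules = resolve_rules(["rule_of_gsk_4_400", small_and_polar], [None, "custom"])
```

Each resolved callable takes the molecule first and descriptors as keyword arguments, matching the "callable that takes a molecule as input (with kwargs)" description.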

__len__()

Return the number of rules inside this filter

list_available_rules(*query) cached staticmethod

List all the available rules and their properties

list_available_rules_names(*query) cached staticmethod

List only the names of the available rules

Basic Rules

medchem.rules.basic_rules

rule_of_chemaxon_druglikeness(mol, mw=None, clogp=None, n_hba=None, n_hbd=None, n_rotatable_bonds=None, n_rings=None, **kwargs)

Compute the druglikeness filter according to chemaxon:

It computes: MW < 400 & logP < 5 & HBA <= 10 & HBD <= 5 & ROTBONDS < 5 & RINGS > 0

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed cLogP.

None
n_hba Optional[float]

precomputed number of HBA.

None
n_hbd Optional[float]

precomputed number of HBD.

None
n_rotatable_bonds Optional[int]

precomputed number of rotatable bonds in the molecule.

None
n_rings Optional[int]

precomputed number of rings in the molecule.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
roc bool

True if molecule is compliant, False otherwise
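The optional descriptor arguments let a caller skip recomputation. The sketch below illustrates that pattern under stated assumptions: the "molecule" is a plain dict of descriptor values standing in for a real descriptor calculation, and `_get` and `chemaxon_druglikeness_sketch` are hypothetical names. Only the thresholds come from the formula above.

```python
from typing import Dict, List, Optional

calls: List[str] = []  # records which descriptors had to be "computed"


def _get(name: str, value: Optional[float], mol: Dict[str, float]) -> float:
    """Use the precomputed value if given, else look it up from mol."""
    if value is not None:
        return value
    calls.append(name)
    return mol[name]  # stand-in for a real descriptor calculation


def chemaxon_druglikeness_sketch(mol, mw=None, clogp=None, n_hba=None,
                                 n_hbd=None, n_rotatable_bonds=None,
                                 n_rings=None, **kwargs) -> bool:
    mw = _get("mw", mw, mol)
    clogp = _get("clogp", clogp, mol)
    n_hba = _get("n_hba", n_hba, mol)
    n_hbd = _get("n_hbd", n_hbd, mol)
    n_rot = _get("n_rotatable_bonds", n_rotatable_bonds, mol)
    n_rings = _get("n_rings", n_rings, mol)
    # MW < 400 & logP < 5 & HBA <= 10 & HBD <= 5 & ROTBONDS < 5 & RINGS > 0
    return (mw < 400 and clogp < 5 and n_hba <= 10 and n_hbd <= 5
            and n_rot < 5 and n_rings > 0)


# Illustrative descriptor values, not measured data.
mol = {"mw": 180.2, "clogp": 1.3, "n_hba": 3, "n_hbd": 1,
       "n_rotatable_bonds": 2, "n_rings": 1}
compliant = chemaxon_druglikeness_sketch(mol, mw=180.2)
```

Because `mw` was supplied, it never appears in `calls`; every other descriptor is looked up once.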

rule_of_cns(mol, mw=None, clogp=None, n_hba=None, n_hbd=None, tpsa=None, **kwargs)

Computes drug likeness rule for CNS penetrant molecules as described in: Jeffrey & Summerfield (2010) Assessment of the blood-brain barrier in CNS drug discovery.

It computes: MW in [135, 582] & logP in [-0.2, 6.1] & TPSA in [3, 118] & HBD <= 3 & HBA <= 5

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed logP.

None
n_hba Optional[float]

precomputed number of HBA.

None
n_hbd Optional[float]

precomputed number of HBD.

None
tpsa Optional[int]

precomputed TPSA.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
roc bool

True if molecule is compliant, False otherwise

rule_of_druglike_soft(mol, mw=None, clogp=None, n_hba=None, n_hbd=None, tpsa=None, n_rotatable_bonds=None, n_rings=None, n_hetero_atoms=None, charge=None, **kwargs)

Compute the DrugLike Soft rule available in FAF-Drugs4.

It computes:

MW in [100, 600] & logP in [-3, 6] & HBD <= 7 & HBA <= 12 & TPSA <= 180 & ROTBONDS <= 11 &
RIGBONDS <= 30 & N_RINGS <= 6 & MAX_SIZE_RING <= 18 & N_CARBONS in [3, 35] &  N_HETEROATOMS in [1, 15] &
HC_RATIO in [0.1, 1.1] & CHARGE in [-4, 4] & N_ATOM_CHARGE <= 4

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed cLogP.

None
n_hba Optional[float]

precomputed number of HBA.

None
n_hbd Optional[float]

precomputed number of HBD.

None
tpsa Optional[float]

precomputed TPSA.

None
n_rotatable_bonds Optional[int]

precomputed number of rotatable bonds.

None
n_rings Optional[int]

precomputed number of rings in the molecules.

None
n_hetero_atoms Optional[int]

precomputed number of heteroatoms.

None
charge Optional[float]

precomputed charge.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

rule_of_egan(mol, clogp=None, tpsa=None, **kwargs)

Compute passive intestinal absorption according to Egan Rules as described in: Egan, William J., Kenneth M. Merz, and John J. Baldwin (2000) Prediction of drug absorption using multivariate statistics.

It computes: TPSA in [0, 132] & logP in [-1, 6]

Summary of the paper

The authors built a multivariate statistical model of passive intestinal absorption with robust outlier detection. Outliers were identified as being actively transported. They chose PSA and AlogP98 (cLogP) based on consideration of the physical processes involved in membrane permeability and the interrelationships and redundancies between other available descriptors.

Compounds, which had been assayed for Caco-2 cell permeability, demonstrated a good rate of successful predictions (74−92%)

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
clogp Optional[float]

precomputed cLogP.

None
tpsa Optional[float]

precomputed TPSA.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
roe bool

True if molecule is compliant, False otherwise

rule_of_five(mol, mw=None, clogp=None, n_lipinski_hbd=None, n_lipinski_hba=None, **kwargs)

Compute Lipinski's rule-of-5 for a molecule, also known as Pfizer's rule of five or RO5. This rule is a rule of thumb to evaluate the druglikeness of a chemical compound.

It computes: MW <= 500 & logP <= 5 & HBD <= 5 & HBA <= 10

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed cLogP.

None
n_lipinski_hbd Optional[float]

precomputed number of HBD.

None
n_lipinski_hba Optional[float]

precomputed number of HBA.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
ro5 bool

True if molecule is compliant, False otherwise
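To make the conjunction above concrete, the following sketch evaluates the four documented criteria on precomputed descriptors and reports which ones fail. It is not medchem's implementation; `ro5_violations` and `rule_of_five_sketch` are hypothetical names, and the descriptor values in the usage lines are illustrative.

```python
from typing import List


def ro5_violations(mw: float, clogp: float, n_hbd: float, n_hba: float) -> List[str]:
    """Return the documented criteria that the descriptors violate."""
    checks = {
        "MW <= 500": mw <= 500,
        "logP <= 5": clogp <= 5,
        "HBD <= 5": n_hbd <= 5,
        "HBA <= 10": n_hba <= 10,
    }
    return [name for name, ok in checks.items() if not ok]


def rule_of_five_sketch(mw: float, clogp: float, n_hbd: float, n_hba: float) -> bool:
    """Compliant only when every criterion holds, as in the formula above."""
    return not ro5_violations(mw, clogp, n_hbd, n_hba)


passes = rule_of_five_sketch(mw=350.0, clogp=2.1, n_hbd=2, n_hba=5)
failed = ro5_violations(mw=650.0, clogp=6.2, n_hbd=2, n_hba=5)
```

Listing the failed criteria is an addition for readability; the documented function returns only a boolean.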

rule_of_five_beyond(mol, mw=None, clogp=None, n_hbd=None, n_hba=None, tpsa=None, n_rotatable_bonds=None, **kwargs)

Compute the Beyond rule-of-5 rule for a molecule. This rule illustrates the potential of compounds far beyond rule-of-5 space to modulate novel and difficult target classes that have large, flat, and groove-shaped binding sites and has been described in:

Doak, Bradley C., et al. (2015) How Beyond Rule of 5 Drugs and Clinical Candidates Bind to Their Targets.

It computes: MW <= 1000 & logP in [-2, 10] & HBD <= 6 & HBA <= 15 & TPSA <=250 & ROTBONDS <= 20

Note

This is a very permissive rule and is likely to not be a good predictor for druglikeness as known for small molecules.

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed cLogP.

None
n_hbd Optional[float]

precomputed number of HBD.

None
n_hba Optional[float]

precomputed number of HBA.

None
tpsa Optional[float]

precomputed TPSA.

None
n_rotatable_bonds Optional[int]

precomputed number of rotatable bonds.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
ro5 bool

True if molecule is compliant, False otherwise

rule_of_four(mol, mw=None, clogp=None, n_hba=None, n_rings=None, **kwargs)

Compute the rule-of-4 for a molecule. The rule-of-4 defines a rule of thumb for PPI inhibitors, which are typically larger and more lipophilic than inhibitors of standard binding sites. It has been published in: Morelli X, Bourgeas R, Roche P. (2011) Chemical and structural lessons from recent successes in protein–protein interaction inhibition.

Also see: Shin et al. (2020) Current Challenges and Opportunities in Designing Protein–Protein Interaction Targeted Drugs

It computes: MW >= 400 & logP >= 4 & RINGS >=4 & HBA >= 4

Warning

Do not use this for small molecules that are not PPI inhibitors!

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed cLogP.

None
n_hba Optional[float]

precomputed number of HBA.

None
n_rings Optional[int]

precomputed number of rings in the molecules.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
ro4 bool

True if molecule is compliant, False otherwise
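Unlike most rules on this page, the rule-of-4 uses lower bounds, which is why it rejects typical small molecules. A minimal sketch on precomputed descriptors follows; `rule_of_four_sketch` is a hypothetical name and the example values are illustrative.

```python
def rule_of_four_sketch(mw: float, clogp: float, n_hba: float, n_rings: int) -> bool:
    """MW >= 400 & logP >= 4 & RINGS >= 4 & HBA >= 4, on precomputed values."""
    return mw >= 400 and clogp >= 4 and n_rings >= 4 and n_hba >= 4


# A large, lipophilic, ring-rich compound satisfies every lower bound.
ppi_like = rule_of_four_sketch(mw=620.0, clogp=5.2, n_hba=6, n_rings=5)
# A small polar compound fails, even though it may be perfectly drug-like.
small_molecule = rule_of_four_sketch(mw=180.0, clogp=1.2, n_hba=3, n_rings=1)
```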

rule_of_generative_design(mol, mw=None, clogp=None, n_lipinski_hba=None, n_lipinski_hbd=None, tpsa=None, n_rotatable_bonds=None, n_hetero_atoms=None, charge=None, **kwargs)

Compute druglikeness rule of generative design.

This set of rules is proprietary to © Valence Discovery and has been curated to better prioritize molecules suggested by generative models for small molecules.

It computes:

MW in [200, 600] & logP in [-3, 6] & HBD <= 7 & HBA <= 12 & TPSA in [40, 180] &
ROTBONDS <= 15 & RIGID BONDS <= 30 & N_AROMATIC_RINGS <= 5 & N_FUSED_AROMATIC_RINGS_TOGETHER <= 2 &
MAX_SIZE_RING_SYSTEM <= 18  & N_CARBONS in [3, 40] & N_HETEROATOMS in [1, 15] & CHARGE in [-2, 2] &
N_ATOM_CHARGE <= 2 & N_TOTAL_ATOMS < 70 & N_HEAVY_METALS < 1

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed cLogP.

None
n_lipinski_hba Optional[float]

precomputed number of HBA.

None
n_lipinski_hbd Optional[float]

precomputed number of HBD.

None
tpsa Optional[float]

precomputed TPSA.

None
n_rotatable_bonds Optional[int]

precomputed number of rotatable bonds.

None
n_hetero_atoms Optional[int]

precomputed number of heteroatoms.

None
charge Optional[float]

precomputed charge.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

rule_of_generative_design_strict(mol, mw=None, clogp=None, n_lipinski_hba=None, n_lipinski_hbd=None, tpsa=None, n_rotatable_bonds=None, n_hetero_atoms=None, charge=None, **kwargs)

Compute the strict version of the druglikeness rule of generative design, which includes long aliphatic chains.

This set of rules is proprietary to © Valence Discovery and has been curated to better prioritize molecules suggested by generative models for small molecules.

It computes:

MW in [200, 600] & logP in [-3, 6] & HBD <= 7 & HBA <= 12 & TPSA in [40, 180] &
ROTBONDS <= 15 & RIGID BONDS <= 30 & N_AROMATIC_RINGS <= 5 & N_FUSED_AROMATIC_RINGS_TOGETHER <= 2 &
MAX_SIZE_RING_SYSTEM <= 18  & N_CARBONS in [3, 40] & N_HETEROATOMS in [1, 15] & CHARGE in [-2, 2] &
N_ATOM_CHARGE <= 2 & N_TOTAL_ATOMS < 70 & N_HEAVY_METALS < 1 & N_STEREO_CENTER <= 3 &
HAS_NO_SPIDER_SIDE_CHAINS & FRACTION_RING_SYSTEM >= 0.25

By default, SPIDER_SIDE_CHAINS are defined as having at least 2 'chains' of >= 4 consecutive atoms in side chains (not part of any ring system).

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed cLogP.

None
n_lipinski_hba Optional[float]

precomputed number of HBA.

None
n_lipinski_hbd Optional[float]

precomputed number of HBD.

None
tpsa Optional[float]

precomputed TPSA.

None
n_rotatable_bonds Optional[int]

precomputed number of rotatable bonds.

None
n_hetero_atoms Optional[int]

precomputed number of heteroatoms.

None
charge Optional[float]

precomputed charge.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

rule_of_ghose(mol, mw=None, clogp=None, mr=None, **kwargs)

Compute the Ghose filter. The Ghose filter is a drug-like filter described in: Ghose, AK.; Viswanadhan, VN.; Wendoloski JJ. (1999) A knowledge-based approach in designing combinatorial or medicinal chemistry libraries for drug discovery.1. A qualitative and quantitative characterization of known drug databases.

It computes: MW in [160, 480] & logP in [-0.4, 5.6] & Natoms in [20, 70] & refractivity in [40, 130]

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed cLogP.

None
mr Optional[float]

precomputed molecule refractivity.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
rog bool

True if molecule is compliant, False otherwise

rule_of_gsk_4_400(mol, mw=None, clogp=None, **kwargs)

Compute GSK Rule (4/400) for druglikeness using interpretable ADMET rule of thumb. It has been described in: Gleeson, M. Paul (2008). Generation of a set of simple, interpretable ADMET rules of thumb.

It computes: MW <= 400 & logP <= 4.

Summary of the paper

  • The rules are based on a set of consistent structure-property guides determined from an analysis of a number of key ADMET assays run within GSK: solubility, permeability, bioavailability, volume of distribution, plasma protein binding, CNS penetration, brain tissue binding, P-gp efflux, hERG inhibition, and cytochrome P450 1A2/2C9/2C19/2D6/3A4 inhibition.
  • Conclusion: It is clear from the analyses reported herein that almost all ADMET parameters deteriorate with either increasing molecular weight, logP, or both, with ionization state having either a beneficial or detrimental effect depending on the parameter in question.

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
clogp Optional[float]

precomputed cLogP.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
rog bool

True if molecule is compliant, False otherwise

rule_of_leadlike_soft(mol, mw=None, clogp=None, n_hba=None, n_hbd=None, tpsa=None, n_rotatable_bonds=None, n_rings=None, n_hetero_atoms=None, charge=None, **kwargs)

Compute the Lead-Like Soft rule available in FAF-Drugs4.

It computes:

MW in [150, 400] & logP in [-3, 4] & HBD <= 4 & HBA <= 7 & TPSA <= 160 & ROTBONDS <= 9 &
RIGBONDS <= 30 & N_RINGS <= 4 & MAX_SIZE_RING <= 18 & N_CARBONS in [3, 35] &  N_HETEROATOMS in [1, 15] &
HC_RATIO in [0.1, 1.1] & CHARGE in [-4, 4] & N_ATOM_CHARGE <= 4 & N_STEREO_CENTER <= 2

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed cLogP.

None
n_hba Optional[float]

precomputed number of HBA.

None
n_hbd Optional[float]

precomputed number of HBD.

None
tpsa Optional[float]

precomputed TPSA.

None
n_rotatable_bonds Optional[int]

precomputed number of rotatable bonds.

None
n_rings Optional[int]

precomputed number of rings in the molecules.

None
n_hetero_atoms Optional[int]

precomputed number of heteroatoms.

None
charge Optional[float]

precomputed charge.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

rule_of_oprea(mol, n_hba=None, n_hbd=None, n_rotatable_bonds=None, n_rings=None, **kwargs)

Computes Oprea's rule of drug likeness obtained by comparing drug vs non drug compounds across multiple datasets. The rules have been described in: Oprea (2000) Property distribution of drug-related chemical databases

It computes: HBD in [0, 2] & HBA in [2, 9] & ROTBONDS in [2,8] and RINGS in [1, 4]

Summary of the paper

Seventy percent of the drug-like compounds were found between the following limits: 0 ≤ HDO ≤ 2, 2 ≤ HAC ≤ 9, 2 ≤ RTB ≤ 8, and 1 ≤ RNG ≤ 4

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
n_hba Optional[float]

precomputed number of HBA.

None
n_hbd Optional[float]

precomputed number of HBD.

None
n_rotatable_bonds Optional[int]

precomputed number of rotatable bonds in the molecule.

None
n_rings Optional[int]

precomputed number of rings in the molecule.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
roo bool

True if molecule is compliant, False otherwise

rule_of_pfizer_3_75(mol, clogp=None, tpsa=None, **kwargs)

Compute the Pfizer 3/75 rule for in vivo toxicity. It has been described in: Hughes, et al. (2008) Physiochemical drug properties associated with in vivo toxicological outcomes; and Price et al. (2009) Physicochemical drug properties associated with in vivo toxicological outcomes: a review.

It computes: TPSA >= 75 & logP <= 3

Summary of the paper

  • In vivo toleration (IVT) studies on 245 preclinical Pfizer compounds found an increased likelihood of toxic events for less polar, more lipophilic compounds.
  • Compounds with low clogP / high TPSA are ∼ 2.5 times more likely to not have any toxicity issue at a fixed concentration of 10 uM (total) or 1 uM (free);
  • Compounds with high clogP / low TPSA are ∼ 2.5 times more likely to have a toxicity finding; this represents an overall odds >= 6.

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
clogp Optional[float]

precomputed cLogP.

None
tpsa Optional[float]

precomputed TPSA.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
rop bool

True if molecule is compliant, False otherwise
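The 3/75 criterion combines a lower bound on TPSA with an upper bound on logP. A minimal sketch on precomputed values is shown below; `pfizer_3_75_sketch` is a hypothetical name, and treating both bounds as inclusive follows the formula above.

```python
def pfizer_3_75_sketch(clogp: float, tpsa: float) -> bool:
    """TPSA >= 75 & logP <= 3: the polar, less lipophilic quadrant passes."""
    return tpsa >= 75 and clogp <= 3


low_risk = pfizer_3_75_sketch(clogp=2.0, tpsa=90.0)   # low clogP / high TPSA
flagged = pfizer_3_75_sketch(clogp=4.5, tpsa=40.0)    # high clogP / low TPSA
```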

rule_of_reos(mol, mw=None, clogp=None, n_hba=None, n_hbd=None, charge=None, n_rotatable_bonds=None, n_heavy_atoms=None, **kwargs)

Compute the REOS filter. The REOS filter is designed to remove unhelpful compounds from HTS screening results. The filter is described in: Walters & Namchuk (2003) Designing screens: how to make your hits a hit.

It computes: MW in [200, 500] & logP in [-5, 5] & HBA in [0, 10] & HBD in [0, 5] & charge in [-2, 2] & ROTBONDS in [0, 8] & NHeavyAtoms in [15, 50]

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed cLogP.

None
n_hba Optional[float]

precomputed number of HBA.

None
n_hbd Optional[float]

precomputed number of HBD.

None
charge Optional[int]

precomputed formal charge.

None
n_rotatable_bonds Optional[int]

precomputed number of rotatable bonds in the molecule.

None
n_heavy_atoms Optional[int]

precomputed number of heavy atoms in the molecule.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
ror bool

True if molecule is compliant, False otherwise
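Rules made entirely of closed intervals, like REOS, lend themselves to a table-driven check. The sketch below is one way to express that, not the library's code: `REOS_RANGES` and `reos_sketch` are hypothetical names, the intervals are copied from the formula above, and the inclusive bounds are an assumption.

```python
from typing import Dict, Tuple

# Intervals copied from the REOS formula; bounds assumed inclusive.
REOS_RANGES: Dict[str, Tuple[float, float]] = {
    "mw": (200, 500),
    "clogp": (-5, 5),
    "n_hba": (0, 10),
    "n_hbd": (0, 5),
    "charge": (-2, 2),
    "n_rotatable_bonds": (0, 8),
    "n_heavy_atoms": (15, 50),
}


def reos_sketch(descriptors: Dict[str, float]) -> bool:
    """True when every precomputed descriptor falls inside its interval."""
    return all(lo <= descriptors[key] <= hi
               for key, (lo, hi) in REOS_RANGES.items())


# Illustrative descriptor values, not measured data.
example = {"mw": 320.0, "clogp": 2.4, "n_hba": 5, "n_hbd": 2,
           "charge": 0, "n_rotatable_bonds": 4, "n_heavy_atoms": 23}
ok = reos_sketch(example)
too_flexible = reos_sketch({**example, "n_rotatable_bonds": 9})
```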

rule_of_respiratory(mol, mw=None, clogp=None, n_hba=None, n_hbd=None, tpsa=None, n_rotatable_bonds=None, n_rings=None, **kwargs)

Computes drug likeness rule for Respiratory (nasal/inhalatory) molecules as described in: Ritchie et al. (2009) Analysis of the Calculated Physicochemical Properties of Respiratory Drugs: Can We Design for Inhaled Drugs Yet?

It computes: MW in [240, 520] & logP in [-2, 4.7] & HBONDS in [6, 12] & TPSA in [51, 135] & ROTBONDS in [3,8] & RINGS in [1,5]

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed logP.

None
n_hba Optional[float]

precomputed number of HBA.

None
n_hbd Optional[float]

precomputed number of HBD.

None
tpsa Optional[int]

precomputed TPSA.

None
n_rotatable_bonds Optional[int]

precomputed number of rotatable bonds in the molecule.

None
n_rings Optional[int]

precomputed number of rings.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
roc bool

True if molecule is compliant, False otherwise

rule_of_three(mol, mw=None, clogp=None, n_hba=None, n_hbd=None, n_rotatable_bonds=None, **kwargs)

Compute the rule-of-3. The rule-of-three is a rule of thumb for molecular fragments (and not small molecules) published in: Congreve M, Carr R, Murray C, Jhoti H. (2003) A "rule of three" for fragment-based lead discovery?

It computes: MW <= 300 & logP <= 3 & HBA <= 3 & HBD <= 3 & ROTBONDS <= 3

Note

TPSA is not used in this version of the rule of three. Other versions additionally use TPSA <= 60 and logP in [-3, 3].

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed cLogP.

None
n_hba Optional[float]

precomputed number of HBA.

None
n_hbd Optional[float]

precomputed number of HBD.

None
n_rotatable_bonds Optional[int]

precomputed number of rotatable bonds in the molecule.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
ro3 bool

True if molecule is compliant, False otherwise

rule_of_three_extended(mol, mw=None, clogp=None, n_hba=None, n_hbd=None, tpsa=None, n_rotatable_bonds=None, **kwargs)

Compute the extended rule-of-3, an extension of the rule of three that includes TPSA and relaxes the HBA constraint.

It computes: MW <= 300 & logP in [-3, 3] & HBA <= 6 & HBD <= 3 & ROTBONDS <= 3 & TPSA <= 60

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed cLogP.

None
n_hba Optional[float]

precomputed number of HBA.

None
n_hbd Optional[float]

precomputed number of HBD.

None
tpsa Optional[float]

precomputed TPSA.

None
n_rotatable_bonds Optional[int]

precomputed number of rotatable bonds in the molecule.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
ro3 bool

True if molecule is compliant, False otherwise
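The difference between the rule of three and its extended form is easiest to see side by side. The sketch below encodes both formulas on precomputed descriptors; `ro3_sketch` and `ro3_extended_sketch` are hypothetical names and the fragment values are illustrative.

```python
def ro3_sketch(mw, clogp, n_hba, n_hbd, n_rotatable_bonds) -> bool:
    """MW <= 300 & logP <= 3 & HBA <= 3 & HBD <= 3 & ROTBONDS <= 3."""
    return (mw <= 300 and clogp <= 3 and n_hba <= 3
            and n_hbd <= 3 and n_rotatable_bonds <= 3)


def ro3_extended_sketch(mw, clogp, n_hba, n_hbd, tpsa, n_rotatable_bonds) -> bool:
    """Adds TPSA <= 60 and a logP lower bound; relaxes HBA to <= 6."""
    return (mw <= 300 and -3 <= clogp <= 3 and n_hba <= 6
            and n_hbd <= 3 and n_rotatable_bonds <= 3 and tpsa <= 60)


# A fragment with 5 acceptors fails the basic rule but passes the extended one.
basic = ro3_sketch(mw=210.0, clogp=1.0, n_hba=5, n_hbd=1, n_rotatable_bonds=2)
extended = ro3_extended_sketch(mw=210.0, clogp=1.0, n_hba=5, n_hbd=1,
                               tpsa=55.0, n_rotatable_bonds=2)
```

The extended rule is not strictly looser: a fragment with TPSA above 60 or logP below -3 can pass the basic rule and fail the extended one.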

rule_of_two(mol, mw=None, clogp=None, n_hba=None, n_hbd=None, **kwargs)

Computes the rule-of-2 for reagents (building block design). It aims to prioritize reagents that typically do not add more than 200 Da in MW or 2 units of clogP. The rule of two has been described in:

Goldberg et al. (2015) Designing novel building blocks is an overlooked strategy to improve compound quality

Note

Their analysis showed that molecular weight (MW) and clogP were important factors in the frequency of use of reagents. Other parameters, such as TPSA, HBA, HBD and ROTBONDS count, were less important.

It computes MW <= 200 & logP <= 2 & HBA <= 4 & HBD <= 2

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
mw Optional[float]

precomputed molecular weight.

None
clogp Optional[float]

precomputed cLogP.

None
n_hba Optional[float]

precomputed number of HBA.

None
n_hbd Optional[float]

precomputed number of HBD.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
ro2 bool

True if molecule is compliant, False otherwise

rule_of_veber(mol, tpsa=None, n_rotatable_bonds=None, **kwargs)

Compute the Veber filter. The Veber filter is a druglike filter for orally active drugs described in:

Veber et al. (2002) Molecular Properties That Influence the Oral Bioavailability of Drug Candidates

It computes: ROTBONDS <= 10 & TPSA < 140

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
tpsa Optional[float]

precomputed TPSA.

None
n_rotatable_bonds Optional[int]

precomputed number of rotatable bonds.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
rov bool

True if molecule is compliant, False otherwise
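The Veber formula mixes an inclusive bound (ROTBONDS <= 10) with a strict one (TPSA < 140), which matters for values exactly at the threshold. A minimal sketch on precomputed descriptors follows; `veber_sketch` is a hypothetical name.

```python
def veber_sketch(tpsa: float, n_rotatable_bonds: int) -> bool:
    """ROTBONDS <= 10 (inclusive) & TPSA < 140 (strict), as documented."""
    return n_rotatable_bonds <= 10 and tpsa < 140


at_rotbond_limit = veber_sketch(tpsa=139.9, n_rotatable_bonds=10)  # passes
at_tpsa_limit = veber_sketch(tpsa=140.0, n_rotatable_bonds=10)     # fails
```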

rule_of_xu(mol, n_hba=None, n_hbd=None, n_rotatable_bonds=None, n_rings=None, n_heavy_atoms=None, **kwargs)

Computes Xu's rule of drug likeness as described in: Xu & Stevenson (2000), Drug-like Index: A New Approach To Measure Drug-like Compounds and Their Diversity.

It computes HBD <= 5 & HBA <= 10 & ROTBONDS in [2, 35] & RINGS in [1, 7] & NHeavyAtoms in [10, 50].

Note

A compound's Drug Likeness Index is calculated based upon the knowledge derived from known drugs selected from Comprehensive Medicinal Chemistry (CMC) database.

Parameters:

Name Type Description Default
mol Union[Mol, str]

input molecule

required
n_hba Optional[float]

precomputed number of HBA.

None
n_hbd Optional[float]

precomputed number of HBD.

None
n_rotatable_bonds Optional[int]

precomputed number of rotatable bonds in the molecule.

None
n_rings Optional[int]

precomputed number of rings in the molecule.

None
n_heavy_atoms Optional[int]

precomputed number of heavy atoms in the molecule.

None
**kwargs Any

Allow extra arguments for descriptors pre-computation.

{}

Returns:

Name Type Description
rox bool

True if molecule is compliant, False otherwise

rule_of_zinc(mol, mw=None, clogp=None, n_hba=None, n_hbd=None, tpsa=None, n_rotatable_bonds=None, n_rings=None, charge=None, **kwargs)

Compute the Zinc rule for a molecule. This rule is a rule of thumb to evaluate the druglikeness of a chemical compound, based on:

Irwin & Shoichet (2005) ZINC - A Free Database of Commercially Available Compounds for Virtual Screening

Also see: https://fafdrugs4.rpbs.univ-paris-diderot.fr/filters.html

It computes: MW in [60, 600] & logP in [-4, 6] & HBD <= 6 & HBA <= 11 & TPSA <= 150 & ROTBONDS <= 12 & RIGBONDS <= 50 & N_RINGS <= 7 & MAX_SIZE_RING <= 12 & N_CARBONS >= 3 & HC_RATIO <= 2.0 & CHARGE in [-4, 4]

Args:

mol: input molecule
mw: precomputed molecular weight.
clogp: precomputed cLogP.
n_hba: precomputed number of HBA.
n_hbd: precomputed number of HBD.
tpsa: precomputed TPSA.
n_rotatable_bonds: precomputed number of rotatable bonds.
n_rings: precomputed number of rings in the molecules.
charge: precomputed charge.
**kwargs: Allow extra arguments for descriptors pre-computation.

Utilities

medchem.rules.in_range(x, min_val=-float('inf'), max_val=float('inf'))

Check if a value is in a range

Parameters:

Name Type Description Default
x float

value to check

required
min_val float

minimum value

-float('inf')
max_val float

maximum value

float('inf')
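A range check with infinite defaults leaves either side unbounded when it is omitted. The sketch below shows that behavior; `in_range_sketch` is a hypothetical name, and treating both bounds as inclusive is an assumption, since the description above does not say.

```python
def in_range_sketch(x: float,
                    min_val: float = -float("inf"),
                    max_val: float = float("inf")) -> bool:
    """Closed-interval check; inclusive bounds are an assumption."""
    return min_val <= x <= max_val


unbounded = in_range_sketch(1e9)                  # no bounds given
upper_edge = in_range_sketch(5, 0, 5)             # on the upper bound
below_floor = in_range_sketch(-1, min_val=0)      # only a lower bound
```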

medchem.rules.n_heavy_metals(mol, allowed_metals=['Li', 'Be', 'K', 'Na', 'Ca', 'Mg'])

Count the number of heavy metals in a molecule

Metal atoms are defined using the M notation in marvinjs. It's quicker to exclude atoms than to list all metals

Parameters:

Name Type Description Default
mol Mol

input molecule

required
allowed_metals List[str]

list of metals not counted as heavy metals. Default is ["Li", "Be", "K", "Na", "Ca", "Mg"]

['Li', 'Be', 'K', 'Na', 'Ca', 'Mg']
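The exclusion approach described above can be sketched without a cheminformatics toolkit by working on element symbols. This is an illustration only: `count_heavy_metals` and `NON_METALS` are hypothetical names, the non-metal set is an assumption about what "not a metal" means here, and the real function takes a Mol rather than a list of symbols.

```python
from typing import Iterable, Optional

# Elements treated as non-metals in this sketch (an assumption, not
# necessarily the exact definition used by medchem).
NON_METALS = {
    "H", "He", "B", "C", "N", "O", "F", "Ne", "Si", "P", "S", "Cl",
    "Ar", "As", "Se", "Br", "Kr", "Te", "I", "Xe", "At", "Rn",
}
DEFAULT_ALLOWED = ("Li", "Be", "K", "Na", "Ca", "Mg")


def count_heavy_metals(symbols: Iterable[str],
                       allowed_metals: Optional[Iterable[str]] = None) -> int:
    """Count atoms that are neither non-metals nor explicitly allowed metals."""
    allowed = set(DEFAULT_ALLOWED if allowed_metals is None else allowed_metals)
    return sum(1 for s in symbols if s not in NON_METALS and s not in allowed)


sodium_salt = count_heavy_metals(["C", "C", "O", "O", "Na"])
platinum_complex = count_heavy_metals(["Pt", "N", "N", "Cl", "Cl"])
```

The sketch uses `None` as the default and builds the set inside the function, which avoids sharing a mutable default list between calls.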

medchem.rules.has_spider_chains(mol, min_appendage=2, min_appendage_len=4)

Check whether a molecule has multiple appendage-like structures

Parameters:

Name Type Description Default
mol Mol

input molecule

required
min_appendage int

minimum number of appendages (>=)

2
min_appendage_len int

minimum length (number of atoms in a straight line) of an appendage (>=)

4

medchem.rules.n_fused_aromatic_rings(mol, require_all_aromatic=True, pairwise=False)

Count the number of fused aromatic rings in a molecule

Warning

There is no such thing as a spiroaromatic ring in this implementation

Parameters:

Name Type Description Default
mol Mol

input molecule

required
require_all_aromatic bool

whether to require all simple rings in the fused system to be aromatic

True
pairwise bool

whether to compute the number of fused aromatic rings pairwise, meaning phenanthrene and anthracene would each count for 2 fused aromatic rings

False

medchem.rules.fraction_atom_in_scaff(mol)

Compute the fraction of atoms that belong to any ring system of a molecule as defined by its murcko scaffold

Parameters:

Name Type Description Default
mol Mol

input molecule

required

medchem.rules.list_descriptors()

List all descriptors available for computation