medchem.query

medchem.query.QueryFilter

Query filtering system based on a custom query grammar

__call__(mols, scheduler='processes', n_jobs=-1, progress=True)

Call the internal chemical filter that has been built

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mols | List[Union[str, Mol]] | list of input molecules to filter | required |
| n_jobs | int | number of jobs to run in parallel | -1 |
| scheduler | str | joblib scheduler to use | 'processes' |
| progress | bool | whether to show job progress | True |

__init__(query, grammar=None, parser='lalr')

Constructor for query filtering system

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| query | str | input unparsed query | required |
| grammar | Optional[str] | path to grammar language to use. Defaults to None, which will use the default grammar. | None |
| parser | str | which Lark language parser to use. Defaults to "lalr". | 'lalr' |

medchem.query.QueryOperator

A class to hold all the operators that can be used in queries

AVAILABLES_PROPERTIES = list_descriptors() class-attribute instance-attribute

Default list of available properties in medchem's query system

AVAILABLE_CATALOGS = list_named_catalogs() class-attribute instance-attribute

Default list of available catalogs in medchem's query system

AVAILABLE_FUNCTIONAL_GROUPS = list_functional_group_names() class-attribute instance-attribute

Default list of available functional groups in medchem's query system

AVAILABLE_RULES = RuleFilters.list_available_rules_names() class-attribute instance-attribute

Default list of available rules in medchem's query system

getprop(mol, prop) staticmethod

Compute a molecular property of a molecule. This is an alternative to the hasprop function that does not enforce any comparison.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mol | Union[Mol, str] | input molecule | required |
| prop | str | molecular property to compute for the molecule | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| property | float | computed property value |

hasalert(mol, alert) staticmethod

Check if a molecule matches a named alert catalog. The alert catalog needs to be one supported by the medchem package.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mol | Union[Mol, str] | input molecule | required |
| alert | str | named catalog to apply as filter on the molecule | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| has_alert | bool | whether the molecule has a given alert |
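
Since the catalog name must be a supported one, an operator has to reject unknown names somewhere. The sketch below shows one way such a check could look. It is not medchem's implementation: `validate_catalog` and the hard-coded `SUPPORTED_CATALOGS` list are hypothetical, and raising `ValueError` is an assumption about how an unknown name might be handled.

```python
# Hypothetical list of supported names; in medchem the real list is
# exposed as QueryOperator.AVAILABLE_CATALOGS.
SUPPORTED_CATALOGS = ["pains", "example_catalog"]


def validate_catalog(alert, supported=SUPPORTED_CATALOGS):
    """Return the catalog name unchanged, or raise if it is not supported."""
    if alert not in supported:
        raise ValueError(
            f"Unknown catalog {alert!r}; expected one of {sorted(supported)}"
        )
    return alert


checked = validate_catalog("pains")

# An unsupported name is rejected before any filtering would take place.
try:
    validate_catalog("not_a_catalog")
    rejected = False
except ValueError:
    rejected = True

print(checked, rejected)
```

The same pattern would apply to functional group names and rule names, each checked against its own list of available values.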

hasgroup(mol, group) staticmethod

Check if a molecule has a specific functional group.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mol | Union[Mol, str] | input molecule | required |
| group | str | functional group to check on the molecule. | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| has_group | bool | whether the molecule has the given functional group |

hasprop(mol, prop, comparator, limit) staticmethod

Check if a molecule has a property within a desired range

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mol | Union[Mol, str] | input molecule | required |
| prop | str | molecular property to apply as filter on the molecule | required |
| comparator | Callable | operator function to apply to check whether the molecule property matches the expected value | required |
| limit | float | limit value for determining whether the molecule property is within desired range | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| has_property | bool | whether the molecule has a given property within a desired range |
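
The way a comparator and a limit combine can be shown without any chemistry. The following is a minimal sketch, not medchem's implementation: `check_property` and the hard-coded `PROPERTIES` table are hypothetical stand-ins for a real descriptor calculation, and the comparators come from the standard `operator` module.

```python
import operator

# Hypothetical precomputed values standing in for a real descriptor calculation.
PROPERTIES = {"CCO": {"tpsa": 20.23, "mw": 46.07}}


def check_property(mol, prop, comparator, limit):
    """Return comparator(value, limit) for the named property of `mol`."""
    value = PROPERTIES[mol][prop]
    return bool(comparator(value, limit))


# Any two-argument callable returning a truth value can act as comparator.
below_limit = check_property("CCO", "tpsa", operator.lt, 120)
above_limit = check_property("CCO", "tpsa", operator.gt, 120)
print(below_limit, above_limit)
```

Passing the comparison as a callable keeps the property lookup and the decision rule separate, so the same function covers "less than", "greater than", and equality checks.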

hassubstructure(mol, query, is_smarts=False, operator='min', limit=1) staticmethod

Check if a molecule has the substructure provided by a query

Parameters:

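| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mol | Union[Mol, str] | input molecule | required |
| query | str | input substructure query (interpreted as SMARTS when is_smarts is True) | required |
| is_smarts | bool | whether the query is a SMARTS pattern | False |
| operator | Optional[str] | one of min or max, specifying whether limit is a minimum or a maximum | 'min' |
| limit | int | limit on the number of substructure matches | 1 |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| has_substructure | bool | whether the query is a subgraph of the molecule |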
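
The interaction between `operator` and `limit` can be sketched on a plain match count. This assumes that 'min' means "at least `limit` matches" and 'max' means "at most `limit` matches", which the description above suggests but does not state outright. `count_within_limit` is a hypothetical helper, and the match count is passed in directly instead of being computed from a molecule.

```python
def count_within_limit(n_matches, operator="min", limit=1):
    """Check a substructure match count against a limit (assumed semantics)."""
    if operator == "min":
        return n_matches >= limit
    if operator == "max":
        return n_matches <= limit
    raise ValueError(f"operator must be 'min' or 'max', got {operator!r}")


# With the defaults, a molecule passes as soon as the query matches once.
at_least_one = count_within_limit(2)
# With 'max', two matches exceed a limit of one.
at_most_one = count_within_limit(2, operator="max", limit=1)
print(at_least_one, at_most_one)
```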

hassuperstructure(mol, query) staticmethod

Check if a molecule has a superstructure defined by a query. Note that the superstructure cannot be a SMARTS query.

Parameters:

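| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mol | Union[Mol, str] | input molecule | required |
| query | str | input query molecule (not a SMARTS pattern) | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| has_superstructure | bool | whether the molecule is a subgraph of the query |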

like(mol, query, comparator, limit) staticmethod

Check if a molecule is similar enough to, or distant enough from, another molecule using Tanimoto similarity on ECFP fingerprints.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mol | Union[Mol, str] | input molecule | required |
| query | Union[Mol, str] | input molecule to compare with | required |
| comparator | Callable[[float, float], bool] | operator function used to compare the computed similarity with the limit. Takes computed_similarity and limit as arguments and returns a boolean. | required |
| limit | float | limit value against which the computed similarity is compared | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| is_similar | bool | whether the molecule is similar or distant enough from the query |

matchrule(mol, rule) staticmethod

Check if a molecule matches a druglikeness rule

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mol | Union[Mol, str] | input molecule | required |
| rule | str | druglikeness rule to check on the molecule. | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| match_rule | bool | whether the molecule matches the given rule |

similarity(mol, query) staticmethod

Compute the ECFP Tanimoto similarity between two molecules. This is an alternative to the like function that does not enforce any comparison and lets Python handle the binary comparison operators.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mol | Union[Mol, str] | input molecule | required |
| query | Union[Mol, str] | input query molecule to compute similarity against | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| similarity | float | computed similarity value between mol and query |
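
Tanimoto similarity between two fingerprints is the number of shared on-bits divided by the number of on-bits set in either fingerprint. The sketch below uses plain Python sets of bit indices as stand-ins for ECFP fingerprints, so fingerprint generation is out of scope. `tanimoto` is a hypothetical helper, not a medchem function, and the bit sets are made up for illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    union = bits_a | bits_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / len(union)


# Made-up on-bit indices for two fingerprints.
fp_mol = {1, 5, 9, 12}
fp_query = {1, 5, 7, 12, 20}

# 3 shared bits out of 6 distinct bits.
score = tanimoto(fp_mol, fp_query)

# The comparison is left to ordinary Python operators.
is_similar = score >= 0.4
print(score, is_similar)
```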

medchem.query.EvaluableQuery

Parser of a query into a list of evaluable function nodes

__call__(mol, exec=True)

Evaluate a query on an input molecule

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| mol | Union[Mol, str] | input molecule | required |
| exec | bool | whether to interpret the resulting query or not | True |

Returns:

| Type | Description |
| --- | --- |
| str | query string or boolean value corresponding to the query result |

__init__(parsed_query, verbose=False)

Constructor for query evaluation

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| parsed_query | Union[str, ParseTree] | query that has been parsed and transformed | required |
| verbose | bool | whether to print debug information | False |

medchem.query.QueryParser

Bases: Transformer

Query parser for the custom molecular query language. It parses the input query and builds a parseable and evaluable representation. The trick for lazy evaluation is to wrap each expression that needs to be evaluated in a custom guard of the form 'fn(*)'.

Note that you SHOULD NOT HAVE TO INTERACT WITH THIS CLASS DIRECTLY.

Example
from lark import Lark
import medchem
from medchem.query import QueryParser
QUERY_GRAMMAR = medchem.utils.loader.get_grammar(as_string=True)
QUERY_PARSER = Lark(QUERY_GRAMMAR, parser="lalr", transformer=QueryParser())
# Note how the string arguments need to be "quoted". This builds on the JSON quoting requirements to avoid unwanted outcomes.
example = """(HASPROP("tpsa" > 120 ) | HASSUBSTRUCTURE("c1ccccc1")) AND NOT HASALERT("pains") OR HASSUBSTRUCTURE("[OH]", max)"""
t = QUERY_PARSER.parse(example)
print(t)
# Output:
# ((((`fn(getprop, prop='tpsa')` > 120.0) or `fn(hassubstructure, query='c1ccccc1', operator='None', limit=None, is_smarts=None)`) and not `fn(hasalert, alert='pains')`) or `fn(hassubstructure, query='[OH]', operator='max', limit=None, is_smarts=None)`)
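
To make the lazy-evaluation guard more concrete, here is a toy evaluator for strings shaped like the printed parse result above. It is a sketch under stated assumptions, not how EvaluableQuery works internally: the `evaluate` function, the `GUARD` regular expression (which only handles guards with a single keyword argument), the `OPERATORS` table, and the dict-based "molecule" are all hypothetical.

```python
import re


# Hypothetical stand-ins for the real operators; the "molecule" is a plain dict.
def getprop(mol, prop):
    return mol["props"][prop]


def hasalert(mol, alert):
    return alert in mol["alerts"]


OPERATORS = {"getprop": getprop, "hasalert": hasalert}

# Matches guards of the form `fn(name, key='value')` with one keyword argument.
GUARD = re.compile(r"`fn\((\w+), (\w+)='([^']*)'\)`")


def evaluate(parsed_query, mol):
    """Resolve each guard against `mol`, then evaluate the boolean expression."""

    def resolve(match):
        name, key, value = match.groups()
        # The guarded call only runs now, once a molecule is available.
        return repr(OPERATORS[name](mol, **{key: value}))

    expression = GUARD.sub(resolve, parsed_query)
    # After substitution only literals and boolean operators remain.
    return eval(expression, {"__builtins__": {}}, {})


parsed = "((`fn(getprop, prop='tpsa')` > 120.0) and not `fn(hasalert, alert='pains')`)"
mol = {"props": {"tpsa": 140.2}, "alerts": set()}
result = evaluate(parsed, mol)
print(result)
```

The parsed string can be built once and reused for every molecule, because nothing inside a guard is computed until a molecule is supplied.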

bool_expr(bool_term, *others)

Define how boolean expressions should be parsed

bool_term(bool_factor, *others)

Define how boolean terms should be parsed

hasalert(value)

Format the hasalert node in the query

Note

The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.

hasgroup(value)

Format the hasgroup node in the query

Note

The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.

hasprop(value, comparator, limit)

Format the hasprop node in the query

Note

The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.

hassubstructure(value, is_smarts, operator, limit)

Format the substructure node in the query

Note

The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.

hassuperstructure(value)

Format the superstructure node in the query

Note

The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.

like(value, comparator, limit)

Format the like node in the query

Note

The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.

matchrule(value)

Format the matchrule node in the query

Note

The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.

not_bool_factor(*args)

Define representation of a negation