medchem.query
medchem.query.QueryFilter
Query filtering system based on a custom query grammar
__call__(mols, scheduler='processes', n_jobs=-1, progress=True)
Call the internal chemical filter that has been built.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mols | List[Union[str, Mol]] | list of input molecules to filter | required |
n_jobs | int | whether to run jobs in parallel, and how many jobs to use. | -1 |
scheduler | str | joblib scheduler to use. | 'processes' |
progress | bool | whether to show job progress. | True |
__init__(query, grammar=None, parser='lalr')
Constructor for the query filtering system
Parameters:
Name | Type | Description | Default |
---|---|---|---|
query | str | input unparsed query | required |
grammar | Optional[str] | path to the grammar to use. Defaults to None, which will use the default grammar. | None |
parser | str | which Lark language parser to use. Defaults to "lalr". | 'lalr' |
medchem.query.QueryOperator
A class to hold all the operators that can be used in queries
AVAILABLES_PROPERTIES = list_descriptors()
class-attribute
instance-attribute
Default list of available properties in medchem's query system
AVAILABLE_CATALOGS = list_named_catalogs()
class-attribute
instance-attribute
Default list of available catalogs in medchem's query system
AVAILABLE_FUNCTIONAL_GROUPS = list_functional_group_names()
class-attribute
instance-attribute
Default list of available functional groups in medchem's query system
AVAILABLE_RULES = RuleFilters.list_available_rules_names()
class-attribute
instance-attribute
Default list of available rules in medchem's query system
getprop(mol, prop)
staticmethod
Compute a molecular property of a molecule. This is an alternative to the hasprop function that does not enforce any comparison.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[Mol, str] | input molecule | required |
prop | str | molecular property to compute on the molecule | required |
Returns:
Name | Type | Description |
---|---|---|
property | float | computed property value |
hasalert(mol, alert)
staticmethod
Check if a molecule matches a named alert catalog. The alert catalog must be one supported by the medchem package.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[Mol, str] | input molecule | required |
alert | str | named catalog to apply as filter on the molecule | required |
Returns:
Name | Type | Description |
---|---|---|
has_alert | bool | whether the molecule has a given alert |
hasgroup(mol, group)
staticmethod
Check if a molecule has a specific functional group.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[Mol, str] | input molecule | required |
group | str | functional group to check on the molecule. | required |
Returns:
Name | Type | Description |
---|---|---|
has_group | bool | whether the molecule has the given functional group |
hasprop(mol, prop, comparator, limit)
staticmethod
Check if a molecule has a property within a desired range
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[Mol, str] | input molecule | required |
prop | str | molecular property to apply as filter on the molecule | required |
comparator | Callable | operator function to apply to check whether the molecule property matches the expected value | required |
limit | float | limit value for determining whether the molecule property is within the desired range | required |
Returns:
Name | Type | Description |
---|---|---|
has_property | bool | whether the molecule has a given property within a desired range |
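The comparator argument can be any two-argument callable, such as the functions in Python's standard `operator` module. The sketch below is not medchem's implementation: `check_limit` is a hypothetical helper, the property value is hard-coded rather than computed from a molecule, and the call order `comparator(value, limit)` is an assumption.

```python
import operator
from typing import Callable


def check_limit(value: float, comparator: Callable[[float, float], bool], limit: float) -> bool:
    """Hypothetical helper: return comparator(value, limit) as a bool.

    Assumes the computed value is passed first and the limit second.
    """
    return bool(comparator(value, limit))


# Hypothetical, hard-coded property value (no molecule is parsed here).
tpsa = 95.4

below_limit = check_limit(tpsa, operator.lt, 120.0)  # is 95.4 < 120.0 ?
above_limit = check_limit(tpsa, operator.gt, 120.0)  # is 95.4 > 120.0 ?
print(below_limit, above_limit)
```

Passing the comparison as a callable keeps the check generic: the same helper covers "less than", "greater than", and inclusive bounds such as `operator.le` or `operator.ge`.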
hassubstructure(mol, query, is_smarts=False, operator='min', limit=1)
staticmethod
Check if a molecule has the substructure provided by a query
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[Mol, str] | input molecule | required |
query | str | input substructure query (a SMARTS pattern when is_smarts is True) | required |
is_smarts | bool | whether the query is a SMARTS pattern or not | False |
operator | Optional[str] | one of min or max, to specify whether limit is a minimum or a maximum | 'min' |
limit | int | limit on the number of substructures to be found | 1 |
Returns:
Name | Type | Description |
---|---|---|
has_substructure | bool | whether the query is a subgraph of the molecule |
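One way to read the operator and limit parameters is as a bound on the number of substructure matches. The sketch below works on a precomputed match count and assumes that 'min' means at least `limit` matches and 'max' means at most `limit` matches. `count_satisfies` is a hypothetical helper, and this interpretation is an assumption that the reference above does not confirm.

```python
def count_satisfies(n_matches: int, operator: str = "min", limit: int = 1) -> bool:
    """Hypothetical helper: test a match count against a min or max bound."""
    if operator == "min":
        return n_matches >= limit  # assumed meaning: at least `limit` matches
    if operator == "max":
        return n_matches <= limit  # assumed meaning: at most `limit` matches
    raise ValueError(f"operator must be 'min' or 'max', got {operator!r}")


# Hypothetical match counts, e.g. how many times a pattern was found.
two_matches_default = count_satisfies(2)                        # at least 1 match
two_matches_max_one = count_satisfies(2, operator="max", limit=1)  # at most 1 match
no_match_default = count_satisfies(0)                           # at least 1 match
print(two_matches_default, two_matches_max_one, no_match_default)
```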
hassuperstructure(mol, query)
staticmethod
Check if a molecule has a superstructure defined by a query. Note that the superstructure cannot be a SMARTS query.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[Mol, str] | input molecule | required |
query | str | input superstructure query (not a SMARTS pattern) | required |
Returns:
Name | Type | Description |
---|---|---|
has_superstructure | bool | whether the molecule is a subgraph of the query |
like(mol, query, comparator, limit)
staticmethod
Check if a molecule is similar or distant enough from another molecule using the Tanimoto ECFP distance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[Mol, str] | input molecule | required |
query | Union[Mol, str] | input molecule to compare with | required |
comparator | Callable[[float, float], bool] | operator function used to compare the computed similarity with the limit. Takes the computed similarity and the limit as arguments. | required |
limit | float | limit value against which the computed similarity is compared | required |
Returns:
Name | Type | Description |
---|---|---|
is_similar | bool | whether the molecule is similar or distant enough from the query |
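Tanimoto similarity between two binary fingerprints is the number of shared on-bits divided by the number of on-bits set in either fingerprint. The sketch below uses plain Python sets of made-up bit indices instead of real ECFP fingerprints. `tanimoto` and `is_like` are hypothetical helpers, and the comparator is assumed to be called as `comparator(similarity, limit)`.

```python
import operator
from typing import Callable, Set


def tanimoto(bits_a: Set[int], bits_b: Set[int]) -> float:
    """Tanimoto similarity of two sets of on-bit indices."""
    union = bits_a | bits_b
    if not union:
        return 0.0  # convention chosen for this sketch when both sets are empty
    return len(bits_a & bits_b) / len(union)


def is_like(
    bits_a: Set[int],
    bits_b: Set[int],
    comparator: Callable[[float, float], bool],
    limit: float,
) -> bool:
    """Hypothetical helper: compare the similarity against a limit."""
    return bool(comparator(tanimoto(bits_a, bits_b), limit))


# Made-up on-bit indices standing in for two fingerprints.
mol_bits = {1, 5, 9, 12}
query_bits = {1, 5, 7, 12, 20}

sim = tanimoto(mol_bits, query_bits)  # 3 shared bits out of 6 distinct bits
similar_enough = is_like(mol_bits, query_bits, operator.ge, 0.4)
very_similar = is_like(mol_bits, query_bits, operator.ge, 0.7)
print(sim, similar_enough, very_similar)
```

Swapping `operator.ge` for `operator.le` turns the same check into a "distant enough" test.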
matchrule(mol, rule)
staticmethod
Check if a molecule matches a druglikeness rule
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[Mol, str] | input molecule | required |
rule | str | druglikeness rule to check on the molecule. | required |
Returns:
Name | Type | Description |
---|---|---|
match_rule | bool | whether the molecule matches the given rule |
similarity(mol, query)
staticmethod
Compute the ECFP Tanimoto similarity between two molecules. This is an alternative to the like function that does not enforce any comparison and lets Python handle the binary comparison operators.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[Mol, str] | input molecule | required |
query | Union[Mol, str] | input query molecule to compute similarity against | required |
Returns:
Name | Type | Description |
---|---|---|
similarity | float | computed similarity value between mol and query |
medchem.query.EvaluableQuery
Parser of a query into a list of evaluable function nodes
__call__(mol, exec=True)
Evaluate a query on an input molecule
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mol | Union[Mol, str] | input molecule | required |
exec | bool | whether to interpret the resulting query or not | True |
Returns:
Type | Description |
---|---|
str | query string or boolean value corresponding to the query result |
__init__(parsed_query, verbose=False)
Constructor for query evaluation
Parameters:
Name | Type | Description | Default |
---|---|---|---|
parsed_query | Union[str, ParseTree] | query that has been parsed and transformed | required |
verbose | bool | whether to print debug information | False |
medchem.query.QueryParser
Bases: Transformer
Query parser for the custom molecule query language. It parses the input language and builds a parseable and evaluable representation.
The trick for lazy evaluation is to define a custom guard, 'fn(*)', around each expression that needs to be evaluated.
Note that you SHOULD NOT HAVE TO INTERACT WITH THIS CLASS DIRECTLY.
Example
import medchem
from lark import Lark
from medchem.query import QueryParser
QUERY_GRAMMAR = medchem.utils.loader.get_grammar(as_string=True)
QUERY_PARSER = Lark(QUERY_GRAMMAR, parser="lalr", transformer=QueryParser())
# see how the string needs to be "quoted". This builds on the json quote requirements to avoid dealing with unwanted outcomes
example = """(HASPROP("tpsa" > 120 ) | HASSUBSTRUCTURE("c1ccccc1")) AND NOT HASALERT("pains") OR HASSUBSTRUCTURE("[OH]", max)"""
t = QUERY_PARSER.parse(example)
print(t)
((((`fn(getprop, prop='tpsa')` > 120.0) or `fn(hassubstructure, query='c1ccccc1', operator='None', limit=None, is_smarts=None)`) and not `fn(hasalert, alert='pains')`) or `fn(hassubstructure, query='[OH]', operator='max', limit=None, is_smarts=None)`)
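To make the guard idea concrete, the sketch below resolves a guarded expression like the one printed above against a molecule. It is a simplified illustration and not medchem's evaluator: `evaluate`, `GUARD`, and the stub operators in `REGISTRY` are hypothetical, the stubs return hard-coded values instead of computing anything from the molecule, and `eval` is used only on strings built inside the example. The `interpret` flag is loosely modelled on the `exec` parameter of `EvaluableQuery.__call__`.

```python
import re
from typing import Callable, Dict, Union

# Matches a guard such as `fn(getprop, prop='tpsa')` (backticks included).
GUARD = re.compile(r"`fn\((\w+), (.*?)\)`")

# Stub operators standing in for real chemistry (hard-coded answers).
REGISTRY: Dict[str, Callable[..., object]] = {
    "getprop": lambda mol, prop: {"tpsa": 95.4}[prop],
    "hasalert": lambda mol, alert: False,
}


def evaluate(expr: str, mol: str, interpret: bool = True) -> Union[str, bool]:
    """Resolve every guard for `mol`, then optionally interpret the result."""

    def resolve(match: "re.Match[str]") -> str:
        name, raw_kwargs = match.group(1), match.group(2)
        # Turn "prop='tpsa'" into {"prop": "tpsa"} with builtins disabled.
        kwargs = eval(f"dict({raw_kwargs})", {"__builtins__": {}}, {"dict": dict})
        return repr(REGISTRY[name](mol, **kwargs))

    # Guards are only called here, once a molecule is available.
    resolved = GUARD.sub(resolve, expr)
    if not interpret:
        return resolved
    return bool(eval(resolved, {"__builtins__": {}}, {}))


expr = "((`fn(getprop, prop='tpsa')` > 120.0) or not `fn(hasalert, alert='pains')`)"
resolved_text = evaluate(expr, "CCO", interpret=False)
result = evaluate(expr, "CCO")
print(resolved_text)
print(result)
```

Because the guards stay as text until a molecule is supplied, the same parsed expression can be reused for every molecule in a list.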
bool_expr(bool_term, *others)
Define how boolean expressions should be parsed
bool_term(bool_factor, *others)
Define how boolean terms should be parsed
hasalert(value)
Format the hasalert node in the query
Note
The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.
hasgroup(value)
Format the hasgroup node in the query
Note
The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.
hasprop(value, comparator, limit)
Format the hasprop node in the query
Note
The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.
hassubstructure(value, is_smarts, operator, limit)
Format the substructure node in the query
Note
The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.
hassuperstructure(value)
Format the superstructure node in the query
Note
The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.
like(value, comparator, limit)
Format the like node in the query
Note
The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.
matchrule(value)
Format the matchrule node in the query
Note
The parser does not enforce any validity on the argument and the underlying function is supposed to handle it.
not_bool_factor(*args)
Define representation of a negation