Mark Davies - BioMedBridges

RDF Workshop
ChEMBL Examples and Tutorial
Mark Davies
ChEMBL Group, Technical Lead
30/04/2014
Query 1 – Find All Small Molecules
PREFIX cco: <http://rdf.ebi.ac.uk/terms/chembl#>

SELECT ?molecule
WHERE {
  ?molecule a cco:SmallMolecule .
}
http://tinyurl.com/o5gsunm
Query 2 – Find All FDA Approved Drugs
PREFIX cco: <http://rdf.ebi.ac.uk/terms/chembl#>

SELECT ?molecule
WHERE {
  ?molecule a cco:SmallMolecule .
  ?molecule cco:highestDevelopmentPhase ?phase .
  FILTER(?phase = 4)
}
http://tinyurl.com/pssp75u
Query 3 – Find GPCR Bioactivity Data
•  Find all molecules which bind GPCRs
PREFIX cco: <http://rdf.ebi.ac.uk/terms/chembl#>
PREFIX cpc: <http://rdf.ebi.ac.uk/resource/chembl/protclass/>

SELECT ?molecule ?activity ?assay ?target
WHERE {
  ?molecule a cco:SmallMolecule .
  # Same variable as above, so the type constraint applies to the molecule with the activity
  ?molecule cco:hasActivity ?activity .
  ?activity cco:hasAssay ?assay .
  ?assay cco:hasTarget ?target .
  # CHEMBL_PC_1020 is the GPCR protein class
  ?target cco:hasProteinClassification cpc:CHEMBL_PC_1020 .
}
http://tinyurl.com/punz758
ChEMBL Protein Classification
Query 4 – Find Secreted Proteins
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX cco: <http://rdf.ebi.ac.uk/terms/chembl#>
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?uniprot ?target ?target_name ?target_type
WHERE {

  { # Select targets - in this example Secreted Proteins (CHEMBL_PC_3)
    <http://rdf.ebi.ac.uk/resource/chembl/protclass/CHEMBL_PC_3> cco:hasTargetDescendant ?target .
    ?target rdfs:label ?target_name ;
            cco:hasTargetComponent ?target_component ;
            cco:organismName ?organism ;
            cco:targetType ?target_type .
    ?target_component skos:exactMatch ?uniprot .
    ?uniprot a cco:UniprotRef .
  }

}
http://tinyurl.com/njyo9au
Query 5 – Secreted Protein Approved Drugs
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX cco: <http://rdf.ebi.ac.uk/terms/chembl#>
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?uniprot ?target ?target_name ?target_type (COUNT(DISTINCT ?molecule) AS ?drug_binding_count)
WHERE {

  { # Select approved drug molecules which bind targets
    ?target cco:hasAssay ?assay .
    ?assay cco:hasActivity ?activity .
    ?activity cco:hasMolecule ?molecule ;
              cco:pChembl ?molecule_pchembl .
    ?molecule cco:highestDevelopmentPhase ?molecule_phase .
    FILTER(?molecule_pchembl > 6)
    FILTER(?molecule_phase = 4)
  }

  { # Select some targets - in this example GPCRs (CHEMBL_PC_1020)
    <http://rdf.ebi.ac.uk/resource/chembl/protclass/CHEMBL_PC_1020> cco:hasTargetDescendant ?target .
    ?target rdfs:label ?target_name ;
            cco:hasTargetComponent ?target_component ;
            cco:organismName ?organism ;
            cco:targetType ?target_type .
    ?target_component skos:exactMatch ?uniprot .
    ?uniprot a cco:UniprotRef .
  }

}
GROUP BY ?uniprot ?target ?target_name ?target_type
ORDER BY DESC(COUNT(DISTINCT ?molecule))
http://tinyurl.com/oukfhbv
Query 6 – Secreted Protein Drug-Like Mols
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX cco: <http://rdf.ebi.ac.uk/terms/chembl#>
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?uniprot ?target ?target_name ?target_type (COUNT(DISTINCT ?molecule) AS ?drug_binding_count)
WHERE {

  { # Select approved drug molecules which bind targets
    ?target cco:hasAssay ?assay .
    ?assay cco:hasActivity ?activity .
    ?activity cco:hasMolecule ?molecule ;
              cco:pChembl ?molecule_pchembl .
    ?molecule cco:highestDevelopmentPhase ?molecule_phase .
    FILTER(?molecule_pchembl > 6)
    FILTER(?molecule_phase = 4)
  }

  { # Select some targets - in this example Secreted Proteins (CHEMBL_PC_3)
    <http://rdf.ebi.ac.uk/resource/chembl/protclass/CHEMBL_PC_3> cco:hasTargetDescendant ?target .
    ?target rdfs:label ?target_name ;
            cco:hasTargetComponent ?target_component ;
            cco:organismName ?organism ;
            cco:targetType ?target_type .
    ?target_component skos:exactMatch ?uniprot .
    ?uniprot a cco:UniprotRef .
  }

}
GROUP BY ?uniprot ?target ?target_name ?target_type
ORDER BY DESC(COUNT(DISTINCT ?molecule))
http://tinyurl.com/lk5d34w
Reactome Pathways - Apoptosis
TNF Signaling = http://identifiers.org/reactome/REACT_1432.5
Query 7 – Reactome Proteins
Running a federated query against Reactome SPARQL endpoint
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX biopax3: <http://www.biopax.org/release/biopax-level3.owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT DISTINCT ?pathwayname ?uniprot
WHERE {
  # Pull back proteins from the Reactome endpoint involved in TNF Signaling (REACT_1432.5)
  SERVICE <http://www.ebi.ac.uk/rdf/services/reactome/sparql> {
    <http://identifiers.org/reactome/REACT_1432.5> rdf:type biopax3:Pathway ;
        biopax3:pathwayComponent ?reaction ;
        biopax3:displayName ?pathwayname .
    ?reaction rdf:type biopax3:BiochemicalReaction .
    {
      { ?reaction ?rel ?protein }
      UNION
      { ?reaction ?rel ?complex .
        ?complex rdf:type biopax3:Complex .
        ?complex ?comp ?protein
      }
    }
    ?protein rdf:type biopax3:Protein .
    ?protein biopax3:entityReference ?uniprot .
  }
}
http://tinyurl.com/qxn87kx
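The federated pattern in Query 7 comes down to wrapping the remote triple patterns in a SERVICE block. The following is a stripped-down sketch that reuses only the endpoint, pathway IRI and predicate from Query 7. It assumes the query is submitted to an endpoint that supports SPARQL 1.1 federation.

```sparql
PREFIX biopax3: <http://www.biopax.org/release/biopax-level3.owl#>

# Ask the remote Reactome endpoint for the display name of one pathway.
SELECT ?pathwayname
WHERE {
  SERVICE <http://www.ebi.ac.uk/rdf/services/reactome/sparql> {
    <http://identifiers.org/reactome/REACT_1432.5> biopax3:displayName ?pathwayname .
  }
}
```

Everything inside the SERVICE braces is evaluated remotely. Variables bound there, such as ?pathwayname, can then be joined with local patterns outside the block.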
Query 8 – Reactome Proteins + Drug-Like Mols
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX cco: <http://rdf.ebi.ac.uk/terms/chembl#>
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX biopax3: <http://www.biopax.org/release/biopax-level3.owl#>

SELECT ?pathwayname ?uniprot ?target ?target_name ?molecule
WHERE {
  { # ChEMBL Block 1
    ?molecule cco:highestDevelopmentPhase ?molecule_phase .
    ?activity cco:hasMolecule ?molecule .
    ?activity cco:pChembl ?pChembl .
    ?assay cco:hasActivity ?activity .
  }

  # Pull back proteins from the Reactome endpoint involved in TNF Signaling (REACT_1432.5)
  SERVICE <http://www.ebi.ac.uk/rdf/services/reactome/sparql> {
    <http://identifiers.org/reactome/REACT_1432.5> a biopax3:Pathway ;
        biopax3:pathwayComponent ?reaction ;
        biopax3:displayName ?pathwayname .
    ?reaction a biopax3:BiochemicalReaction .
    {
      { ?reaction ?rel ?protein }
      UNION
      { ?reaction ?rel ?complex .
        ?complex a biopax3:Complex .
        ?complex ?comp ?protein
      }
    }
    ?protein a biopax3:Protein .
    ?protein biopax3:entityReference ?uniprot .
  }

  { # ChEMBL Block 2
    ?target_component skos:exactMatch ?uniprot .
    ?target_component cco:hasTarget ?target .
    ?target cco:hasAssay ?assay ;
            rdfs:label ?target_name .
  }

  # Workaround to get the RO5 value: build the IRI of the molecule's RO5 node.
  # STR() converts the molecule IRI to a string, which CONCAT requires in standard SPARQL 1.1.
  BIND(IRI(CONCAT(STR(?molecule), '#num_ro5_violations')) AS ?ro5_violations_iri)
  ?ro5_violations_iri sio:SIO_000300 ?ro5_violations_value .

  FILTER(?molecule_phase < 4)
  FILTER(?pChembl > 6)
  FILTER(?ro5_violations_value = 0)
}
http://tinyurl.com/nfm4xo4
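The Rule-of-Five workaround in Query 8 builds a new IRI by appending a fragment to the molecule IRI. The following minimal sketch isolates that step. The molecule IRI is an assumption: it is modelled on the target IRI pattern used elsewhere in these slides, with CHEMBL22 as the example. STR() is used so that CONCAT receives a string rather than an IRI.

```sparql
SELECT ?ro5_violations_iri
WHERE {
  # Hypothetical molecule IRI, assumed to follow the .../chembl/target/... pattern
  BIND(<http://rdf.ebi.ac.uk/resource/chembl/molecule/CHEMBL22> AS ?molecule)
  # Append the fragment and turn the resulting string back into an IRI
  BIND(IRI(CONCAT(STR(?molecule), '#num_ro5_violations')) AS ?ro5_violations_iri)
}
```

Under that assumption the query returns one row holding the molecule IRI with '#num_ro5_violations' appended. Query 8 then uses that IRI as the subject of the sio:SIO_000300 triple.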
Revisiting Reactome
Identifying the proteins in the pathway shown which interact with drug-like molecules
[Figure: pathway diagram with numeric node annotations (7, 12, 34, 46); labelled proteins: Caspase-8, TNF-alpha, TNF-R1, RIPK1]
Query 9 – Select the Caspase-8 Molecules
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX cco: <http://rdf.ebi.ac.uk/terms/chembl#>
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX biopax3: <http://www.biopax.org/release/biopax-level3.owl#>

SELECT DISTINCT ?molecule ?molecule_name
WHERE {
  {
    ?molecule cco:highestDevelopmentPhase ?molecule_phase ;
              sio:SIO_000008 ?molecule_ro5 ;
              rdfs:label ?molecule_name .
    ?molecule_ro5 a sio:CHEMINF_000312 ;
                  sio:SIO_000300 ?molecule_ro5_val .

    ?activity cco:hasMolecule ?molecule .
    ?activity cco:pChembl ?pChembl .
    ?assay cco:hasActivity ?activity .
    # Target CHEMBL3776 (Caspase-8, per the slide title)
    <http://rdf.ebi.ac.uk/resource/chembl/target/CHEMBL3776> cco:hasAssay ?assay .
  }

  FILTER(?molecule_phase < 4)
  FILTER(?pChembl > 6)
  FILTER(?molecule_ro5_val = 0)
}
!
http://tinyurl.com/nhc6tm7
Questions
1.  How many Assays are in ChEMBL_18 RDF? (Hint: use type Assay)
2.  How many Activities are in ChEMBL_18 RDF? (Hint: use type Activity)
3.  How many human kinase targets are there in ChEMBL_18 RDF? (Hint: look under Enzyme classification to find Kinases)
4.  Find all drug-like molecules in the ChEMBL_18 RDF.
    •  How many molecules have been returned?
    •  Explain your search criteria.
5.  Choose a protein target which is bound by a molecule from the list above and find the following information:
    •  Protein name
    •  Protein classification
    •  Organism name
Answers
1.  http://tinyurl.com/pdmna9e
2.  http://tinyurl.com/pc2wzj8
3.  http://tinyurl.com/ow4nz87
4.  http://tinyurl.com/ofofsgq (search criterion: zero Rule of Five violations)
5.  Taking ChEMBL molecule CHEMBL22 as the selection, an example query could be: http://tinyurl.com/p24uebu. Click on the target link to get the Protein Classification details.