Help:WikiPathways Metabolomics
From WikiPathways
Revision as of 12:38, 8 January 2013
On this page we collect SPARQL queries to see the state of the Metabolome in WikiPathways. Triggered by User:Andra's RDF / SPARQL work, curation started with metabolites without database identifiers. But this soon led to the observation that metabolites are often not even annotated as being a metabolite (using <Label> rather than <DataNode>). Therefore, User:Egonw started at Pathway:WP1 to curate them one by one and fix these issues:
- connect lines between metabolites
- convert metabolites to use <DataNode> rather than <Label>
These are basic properties that metabolomics research depends on.
Metabolome
The following queries provide an overview of the Metabolome captured by WikiPathways.
The key type for metabolites is wp:Metabolite. We can see all available properties with:
prefix wp: <http://vocabularies.wikipathways.org/wp#>

select distinct ?p where {
  ?mb a wp:Metabolite ;
      ?p [] .
}
To restrict the analysis to the most recent pathways, add this snippet to your SPARQL, assuming ?pathway is the variable name in use:
?mb dcterms:isPartOf ?pathway .
?pathway pav:version ?version .
FILTER NOT EXISTS {
  ?mb dcterms:isPartOf ?pathway2 .
  ?pathway2 pav:version ?version2 .
  FILTER (?version2 > ?version)
}
All Metabolites
Count
prefix wp: <http://vocabularies.wikipathways.org/wp#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>

select (count(?mb) as ?count) where {
  ?mb a wp:Metabolite .
}
List
prefix wp: <http://vocabularies.wikipathways.org/wp#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>

select ?mb ?label where {
  ?mb a wp:Metabolite ;
      rdfs:label ?label .
}
Metabolic Data Sources
Sorted by use
prefix wp: <http://vocabularies.wikipathways.org/wp#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dc: <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>

select ?datasource (count(?identifier) as ?count) where {
  ?mb a wp:Metabolite ;
      dc:source ?datasource ;
      dc:identifier ?identifier .
}
group by ?datasource
order by desc(?count)
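If the endpoint returns the result of this query in the standard SPARQL JSON results format, the usage table can be rebuilt on the client side. The following is a minimal sketch under that assumption. The function name `datasource_counts` and the `SAMPLE` response are hypothetical, and the source names and numbers in the sample are made up for illustration, not taken from WikiPathways.

```python
from typing import Any


def datasource_counts(results: dict[str, Any]) -> list[tuple[str, int]]:
    """Return (datasource, count) pairs from SPARQL JSON results, highest count first."""
    rows = []
    for binding in results["results"]["bindings"]:
        # Every bound value arrives as a string under the "value" key.
        rows.append((binding["datasource"]["value"], int(binding["count"]["value"])))
    # Sort again on the client in case the endpoint ordering was not requested.
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical response; names and numbers are invented for illustration only.
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
SAMPLE = {
    "head": {"vars": ["datasource", "count"]},
    "results": {
        "bindings": [
            {
                "datasource": {"type": "literal", "value": "SourceA"},
                "count": {"type": "literal", "datatype": XSD_INTEGER, "value": "3"},
            },
            {
                "datasource": {"type": "literal", "value": "SourceB"},
                "count": {"type": "literal", "datatype": XSD_INTEGER, "value": "7"},
            },
        ]
    },
}
```

Calling `datasource_counts(SAMPLE)` yields the pairs ordered by descending count, which mirrors the `order by desc(?count)` clause of the query.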
All metabolites from one source
All KEGG identifiers
prefix wp: <http://vocabularies.wikipathways.org/wp#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dc: <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select distinct ?identifier where {
  ?mb a wp:Metabolite ;
      dc:source "Kegg Compound"^^xsd:string ;
      dc:identifier ?identifier .
  FILTER (!isIRI(?identifier))
}
order by ?identifier
All HMDB identifiers
prefix wp: <http://vocabularies.wikipathways.org/wp#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dc: <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select distinct ?identifier where {
  ?mb a wp:Metabolite ;
      dc:source "HMDB"^^xsd:string ;
      dc:identifier ?identifier .
  FILTER (!isIRI(?identifier))
}
order by ?identifier
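The per-source queries in this section follow one template in which only the data source literal should change. A small helper can generate the query text for any source. This is a sketch, not part of WikiPathways: `identifiers_query` is a hypothetical name, and the data source string passed in must match the literal actually stored in the RDF (for example "Kegg Compound" as used above).

```python
# Prefixes needed by the generated query; dc: and xsd: are declared explicitly
# so the query does not rely on endpoint-specific default prefixes.
PREFIXES = (
    "prefix wp: <http://vocabularies.wikipathways.org/wp#>\n"
    "prefix dc: <http://purl.org/dc/elements/1.1/>\n"
    "prefix xsd: <http://www.w3.org/2001/XMLSchema#>\n"
)


def identifiers_query(datasource: str) -> str:
    """Return a SPARQL query listing the distinct identifiers of one data source."""
    # Escape backslashes and double quotes so the name stays a valid SPARQL literal.
    literal = datasource.replace("\\", "\\\\").replace('"', '\\"')
    return (
        PREFIXES
        + "select distinct ?identifier where {\n"
        + "  ?mb a wp:Metabolite ;\n"
        + f'      dc:source "{literal}"^^xsd:string ;\n'
        + "      dc:identifier ?identifier .\n"
        + "  FILTER (!isIRI(?identifier))\n"
        + "} order by ?identifier\n"
    )
```

For example, `print(identifiers_query("Kegg Compound"))` prints a query equivalent to the KEGG one above, and generating the text from one template avoids copy-and-paste mistakes between sources.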
Metabolic Pathways
Curation
Metabolites not classified as such
One can list all data sources for non-metabolites with this query:
prefix wp: <http://vocabularies.wikipathways.org/wp#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dc: <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>

select ?datasource (count(?identifier) as ?count) where {
  ?mb dc:source ?datasource ;
      dc:identifier ?identifier .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
}
group by ?datasource
order by desc(?count)
This mostly lists gene identifier sources and the like, but watch out for metabolite identifier data sources in the output. Metabolites that are not marked as such but do carry a metabolite identifier can be found with the queries below.
Non-Metabolites with CAS identifier
Note that a CAS identifier can also refer to mixtures, compound classes, etc.
prefix wp: <http://vocabularies.wikipathways.org/wp#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dc: <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?identifier where {
  ?mb dc:source "CAS"^^xsd:string ;
      dc:identifier ?identifier ;
      dcterms:isPartOf ?pathway .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
  FILTER (!isIRI(?identifier))
}
order by ?pathway
Non-Metabolites with PubChem identifier
These might have been curated by the time of reading.
prefix wp: <http://vocabularies.wikipathways.org/wp#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dc: <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?identifier where {
  ?mb dc:source "PubChem"^^xsd:string ;
      dc:identifier ?identifier ;
      dcterms:isPartOf ?pathway .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
  FILTER (!isIRI(?identifier))
}
order by ?pathway
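The CAS and PubChem queries above share one pattern: nodes that carry an identifier from a metabolite data source but lack the wp:Metabolite type. To run the same check over several metabolite data sources, the query text can be generated per source. The sketch below assumes exactly that pattern. The names `non_metabolite_query` and `curation_queries` are hypothetical, and the list of sources to check is up to the curator.

```python
# Prefixes used by the generated curation query.
CURATION_PREFIXES = (
    "prefix wp: <http://vocabularies.wikipathways.org/wp#>\n"
    "prefix dc: <http://purl.org/dc/elements/1.1/>\n"
    "prefix dcterms: <http://purl.org/dc/terms/>\n"
    "prefix xsd: <http://www.w3.org/2001/XMLSchema#>\n"
)


def non_metabolite_query(datasource: str) -> str:
    """Return a query for nodes with this data source that are not typed wp:Metabolite."""
    # Escape backslashes and double quotes so the name stays a valid SPARQL literal.
    literal = datasource.replace("\\", "\\\\").replace('"', '\\"')
    return (
        CURATION_PREFIXES
        + "select distinct ?pathway ?mb ?identifier where {\n"
        + f'  ?mb dc:source "{literal}"^^xsd:string ;\n'
        + "      dc:identifier ?identifier ;\n"
        + "      dcterms:isPartOf ?pathway .\n"
        + "  FILTER NOT EXISTS { ?mb a wp:Metabolite }\n"
        + "  FILTER (!isIRI(?identifier))\n"
        + "} order by ?pathway\n"
    )


def curation_queries(datasources: list[str]) -> dict[str, str]:
    """Map each data source name to its curation query text."""
    return {name: non_metabolite_query(name) for name in datasources}
```

For instance, `curation_queries(["CAS", "PubChem"])` returns the two query texts keyed by source name, ready to be submitted to a SPARQL endpoint one at a time.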
Metabolites with an identifier but undefined data source
prefix wp: <http://vocabularies.wikipathways.org/wp#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dc: <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?identifier where {
  ?mb a wp:Metabolite ;
      dc:source ""^^xsd:string ;
      dc:identifier ?identifier ;
      dcterms:isPartOf ?pathway .
  FILTER (!isIRI(?identifier))
  FILTER (str(?identifier) != "")
}
order by ?pathway
Metabolites with an Entrez Gene identifier
prefix wp: <http://vocabularies.wikipathways.org/wp#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix dc: <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?identifier where {
  ?mb a wp:Metabolite ;
      dc:source "Entrez Gene"^^xsd:string ;
      dc:identifier ?identifier ;
      dcterms:isPartOf ?pathway .
  FILTER (!isIRI(?identifier))
  FILTER (str(?identifier) != "")
}
order by ?pathway