Help:WikiPathways Metabolomics

Revision as of 07:41, 9 January 2013

This page collects SPARQL queries that show the state of the metabolome in WikiPathways. Triggered by User:Andra's RDF / SPARQL work, curation started with metabolites that lack database identifiers. It soon became clear that many metabolites are not even annotated as metabolites (they are drawn as a <Label> rather than a <DataNode>). User:Egonw therefore started at Pathway:WP1 to curate the pathways one by one and make two kinds of fixes:

  • connect lines between metabolites
  • convert metabolites to use <DataNode> rather than <Label>

Both properties are basic requirements for using the pathways in metabolomics research.


Metabolome

The following queries provide an overview of the metabolome captured by WikiPathways.

The key type for metabolites is wp:Metabolite. We can see all available properties with:

prefix wp:      <http://vocabularies.wikipathways.org/wp#>

select distinct ?p where {
  ?mb a wp:Metabolite ;
    ?p [] .
}
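
To see which of these properties are used most often, the same pattern can be extended with a count per property. The following is a sketch in standard SPARQL 1.1 that relies only on the wp:Metabolite type used above; the variable name ?uses is an arbitrary choice, and the endpoint is assumed to support aggregates.

```sparql
prefix wp:      <http://vocabularies.wikipathways.org/wp#>

# Count how many triples use each property on a wp:Metabolite
select ?p (count(*) as ?uses) where {
  ?mb a wp:Metabolite ;
    ?p [] .
} group by ?p
order by desc(?uses)
```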

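To restrict the analysis to the most recent version of each pathway, add this snippet to your SPARQL query, assuming ?pathway is the variable name used and the dcterms: and pav: prefixes are declared:

  ?mb dcterms:isPartOf ?pathway .
  ?pathway pav:version ?version .
  # drop ?pathway if the same node also occurs in a pathway with a higher version
  FILTER NOT EXISTS {
    ?mb dcterms:isPartOf ?pathway2 .
    ?pathway2 pav:version ?version2 .
    FILTER (?version2 > ?version)
  }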


All Metabolites

Count

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select (count(?mb) as ?count) where {
  ?mb a wp:Metabolite .
}

List

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select ?mb ?label where {
  ?mb a wp:Metabolite ;
     rdfs:label ?label .
}

Metabolic Data Sources

Sorted by use

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix dc:      <http://purl.org/dc/elements/1.1/>

select ?datasource (count(?identifier) as ?count)
where {
  ?mb a wp:Metabolite ;
    dc:source ?datasource ;
    dc:identifier ?identifier .
} group by ?datasource
order by desc(?count)

All metabolites from one source

All KEGG identifiers

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?identifier
where {
  ?mb a wp:Metabolite ;
    dc:source "Kegg Compound"^^xsd:string ;
    dc:identifier ?identifier .
  FILTER (!isIRI(?identifier))
} order by ?identifier

All HMDB identifiers

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?identifier
where {
  ?mb a wp:Metabolite ;
    dc:source "HMDB"^^xsd:string ;
    dc:identifier ?identifier .
  FILTER (!isIRI(?identifier))
} order by ?identifier
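
The per-source queries differ only in the data source literal, so several sources can be listed in one pass. The following is a sketch that assumes the endpoint supports the SPARQL 1.1 VALUES clause and that dc: is the Dublin Core elements namespace. The data source names are the ones used in the queries above.

```sparql
prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?datasource ?identifier
where {
  # Data source names taken from the queries above; extend the list as needed
  values ?datasource { "Kegg Compound"^^xsd:string "HMDB"^^xsd:string }
  ?mb a wp:Metabolite ;
    dc:source ?datasource ;
    dc:identifier ?identifier .
  FILTER (!isIRI(?identifier))
} order by ?datasource ?identifier
```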

Metabolic Pathways

Pathways with the most metabolites

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>
prefix pav:     <http://purl.org/pav/>

select ?pathway (count(?mb) as ?mbCount)
where {
  ?mb a wp:Metabolite ;
    dcterms:isPartOf ?pathway .
} group by ?pathway
order by desc(?mbCount)

Curation

Metabolites not classified as such

One can list all data sources for non-metabolites with this query:

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix dc:      <http://purl.org/dc/elements/1.1/>

select ?datasource (count(?identifier) as ?count)
where {
  ?mb dc:source ?datasource ;
    dc:identifier ?identifier .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
} group by ?datasource
order by desc(?count)

This mostly lists gene identifier sources, but watch for metabolite identifier data sources among them. An entity that carries a metabolite identifier but is not marked as a metabolite is a candidate for curation, and the following queries find such entities.

Non-Metabolites with CAS identifier

Note that a CAS identifier can also refer to mixtures, compound classes, etc.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>

select distinct ?pathway ?mb ?identifier 
where {
  ?mb dc:source "CAS"^^xsd:string ;
    dc:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
  FILTER (!isIRI(?identifier))
} order by ?pathway

Non-Metabolites with PubChem identifier

The entries found by this query may already have been curated by the time you read this.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>

select distinct ?pathway ?mb ?identifier 
where {
  ?mb dc:source "PubChem"^^xsd:string ;
    dc:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
  FILTER (!isIRI(?identifier))
} order by ?pathway
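
To gauge how much curation work remains per chemical database, the per-source queries can be folded into a single summary. This is a sketch only: it assumes SPARQL 1.1 support (VALUES, aggregates) and that dc: is the Dublin Core elements namespace. The list of data source names is simply the set that appears on this page, not a complete list of metabolite databases.

```sparql
prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

# Count entities with a chemical identifier that are not typed as wp:Metabolite
select ?datasource (count(distinct ?mb) as ?count)
where {
  values ?datasource { "CAS"^^xsd:string "PubChem"^^xsd:string
                       "Kegg Compound"^^xsd:string "HMDB"^^xsd:string }
  ?mb dc:source ?datasource ;
    dc:identifier ?identifier .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
  FILTER (!isIRI(?identifier))
} group by ?datasource
order by desc(?count)
```

A source whose count drops to zero has no remaining candidates under this test. As noted for CAS, some hits may be mixtures or compound classes rather than mislabelled metabolites.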

Metabolites with an identifier but undefined data source

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>

select distinct ?pathway ?mb ?identifier 
where {
  ?mb a wp:Metabolite ;
    dc:source ""^^xsd:string ;
    dc:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER (!isIRI(?identifier))
  FILTER (str(?identifier) != "")
} order by ?pathway

Metabolites with an Entrez Gene identifier

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>

select distinct ?pathway ?mb ?identifier 
where {
  ?mb a wp:Metabolite ;
    dc:source "Entrez Gene"^^xsd:string ;
    dc:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER (!isIRI(?identifier))
  FILTER (str(?identifier) != "")
} order by ?pathway