Help:WikiPathways Metabolomics

From WikiPathways

(Difference between revisions)
Jump to: navigation, search
(Metabolic Data Sources)
(Metabolic Pathways)
Line 155: Line 155:
} order by ?mb
} order by ?mb
</pre>
</pre>
 +
 +
[http://goo.gl/rjhu8 Run]
== Pathways with the most metabolites ==
== Pathways with the most metabolites ==
Line 171: Line 173:
} order by desc(?mbCount)
} order by desc(?mbCount)
</pre>
</pre>
 +
 +
[http://goo.gl/Tf2v3 Run]
== Metabolites in the most Pathways ==
== Metabolites in the most Pathways ==
Line 189: Line 193:
} order by desc(?pwCount)
} order by desc(?pwCount)
</pre>
</pre>
 +
 +
[http://goo.gl/VSli7 Run]
= Curation =
= Curation =

Revision as of 14:56, 20 January 2013

On this page we collect SPARQL queries to see the state of the Metabolome in WikiPathways. Triggered by User:Andra's RDF / SPARQL work, curation started with metabolites without database identifiers. But this soon led to the observation that metabolites are often not even annotated as being a metabolite (using <Label> rather than <DataNode>). Therefore, User:Egonw started at Pathway:WP1 to curate them one by one and fix these issues:

  • connect lines between metabolites
  • convert metabolites to use <Label> rather than <DataNode>

The reason for this is that these are some basic underlying properties we need for metabolomics research fields.

Contents

Metabolome

The following queries provide an overview of the Metabolome captures by WikiPathways.

The key type for metabolites is the wp:Metabolite. We can see all available properties with:

prefix wp:      <http://vocabularies.wikipathways.org/wp#>

select distinct ?p where {
  ?mb a wp:Metabolite ;
    ?p [] .
}

Run

Likewise, we can get all pathway properties with:

prefix wp:      <http://vocabularies.wikipathways.org/wp#>

select distinct ?p where {
  ?mb a wp:Pathway ;
    ?p [] .
}

Run

Latest data only

To only get analysis of the most recent pathways, add this snippet to your SPARQL, assuming ?pathway is the used variable name:

  ?mb dcterms:isPartOf ?pathway .
  ?pathway pav:version ?version .
  ?mb dcterms:isPartOf ?pathway2 .
  ?pathway2 pav:version ?version2 .
  FILTER (?version2 > ?version)

However, it should be kept in mind that this is not a fool-proof solution.

All Metabolites

Count

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select count(?mb) where {
  ?mb a wp:Metabolite .
}

Run

List

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select ?mb ?label where {
  ?mb a wp:Metabolite ;
     rdfs:label ?label .
}

Run

Metabolic Data Sources

Sorted by use

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select ?datasource count(?identifier) as ?count
where {
  ?mb a wp:Metabolite ;
    dc:source ?datasource ;
    dc:identifier ?identifier .
} order by desc(?count)

Run

All metabolites from one source

All KEGG identifiers

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select distinct ?identifier
where {
  ?mb a wp:Metabolite ;
    dc:source "Kegg Compound"^^xsd:string ;
    dc:identifier ?identifier .
  FILTER (!isIRI(?identifier))
} order by ?identifier

Run

All HMDB identifiers

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select distinct ?identifier
where {
  ?mb a wp:Metabolite ;
    dc:source "HMDB"^^xsd:string ;
    dc:identifier ?identifier .
  FILTER (!isIRI(?identifier))
} order by ?identifier

Run

Metabolic Pathways

Metabolomes

Human Metabolome

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix ncbi:    <http://purl.obolibrary.org/obo/NCBITaxon_>

select distinct ?mb where {
  ?mb a wp:Metabolite ;
    dcterms:isPartOf ?pw .
  ?pw wp:organism ncbi:9606 .
} order by ?mb

Run

Pathways with the most metabolites

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>
prefix pav:     <http://purl.org/pav/>

select ?pathway count(?mb) as ?mbCount
where {
  ?mb a wp:Metabolite ;
    dcterms:isPartOf ?pathway .
} order by desc(?mbCount)

Run

Metabolites in the most Pathways

With the remark that BridgeDB is not involved yet.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>
prefix pav:     <http://purl.org/pav/>

select ?mb count(?pathway) as ?pwCount
where {
  ?mb a wp:Metabolite ;
    dcterms:isPartOf ?pathway .
} order by desc(?pwCount)

Run

Curation

Common wrong identifiers

PubChem-compound 1004

Wrongly used for phosphate. It is the uncharged compound. Phosphate is, instead, and particularly thinkgs like "Pi", CID 1061 for ortho-phosphate, aka [PO4]2-.

Metabolites not classified as such

One can list all data sources for non-metabolites with this query:

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select ?datasource count(?identifier) as ?count
where {
  ?mb dc:source ?datasource ;
    dc:identifier ?identifier .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
} order by desc(?count)

That mostly lists gene identifier sources, etc, but watch out for the metabolite identifier data sources. For example, metabolites not marked as such but with a metabolite identifier can be found this way.

Non-Metabolites with CAS identifier

Note that a CAS identifier can also refer to mixtures, compound classes, etc.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?identifier 
where {
  ?mb dc:source "CAS"^^xsd:string ;
    dc:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
  FILTER (!isIRI(?identifier))
} order by ?pathway

Non-Metabolites with PubChem identifier

These might have been curated by the time of reading.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?identifier 
where {
  ?mb dc:source "PubChem"^^xsd:string ;
    dc:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
  FILTER (!isIRI(?identifier))
} order by ?pathway

Metabolites with an identifier but undefined data source

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?identifier 
where {
  ?mb a wp:Metabolite ;
    dc:source ""^^xsd:string ;
    dc:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER (!isIRI(?identifier))
  FILTER (str(?identifier) != "")
} order by ?pathway

Metabolites with an Entrez Gene identifier

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?identifier 
where {
  ?mb a wp:Metabolite ;
    dc:source "Entrez Gene"^^xsd:string ;
    dc:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER (!isIRI(?identifier))
  FILTER (str(?identifier) != "")
} order by ?pathway
Personal tools