Help:WikiPathways Metabolomics

From WikiPathways


Revision as of 07:39, 9 January 2013

On this page we collect SPARQL queries that show the state of the Metabolome in WikiPathways. Triggered by User:Andra's RDF / SPARQL work, curation started with metabolites without database identifiers. This soon led to the observation that metabolites are often not even annotated as metabolites (they are drawn with <Label> rather than <DataNode>). Therefore, User:Egonw started at Pathway:WP1 to curate pathways one by one and fix these issues:

  • connect lines between metabolites
  • convert metabolites to use <DataNode> rather than <Label>

These are basic properties that metabolomics research needs from the pathway data.

Metabolome

The following queries provide an overview of the Metabolome captured by WikiPathways.

The key type for metabolites is the wp:Metabolite. We can see all available properties with:

prefix wp:      <http://vocabularies.wikipathways.org/wp#>

select distinct ?p where {
  ?mb a wp:Metabolite ;
    ?p [] .
}
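
As a variation on the property listing, the sketch below also counts how often each property is used on metabolites, which helps to spot properties that are only rarely filled in. It assumes an endpoint that supports SPARQL 1.1 aggregates; the variable name ?uses is made up for this example.

```sparql
prefix wp: <http://vocabularies.wikipathways.org/wp#>

# Every property used on a wp:Metabolite, with the number of
# triples that use it, most frequent first.
select ?p (count(*) as ?uses)
where {
  ?mb a wp:Metabolite ;
    ?p [] .
}
group by ?p
order by desc(?uses)
```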

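To restrict an analysis to the most recent version of each pathway, add this snippet to your SPARQL (it needs the pav prefix, <http://purl.org/pav/>), assuming ?pathway is the variable name used. It keeps a pathway only if no version with a higher number exists:

  ?mb dcterms:isPartOf ?pathway .
  ?pathway pav:version ?version .
  FILTER NOT EXISTS {
    ?mb dcterms:isPartOf ?pathway2 .
    ?pathway2 pav:version ?version2 .
    FILTER (?version2 > ?version)
  }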
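
Combined with a counting query, the version restriction might look like the sketch below. It wraps the version comparison in FILTER NOT EXISTS, so a pathway is kept only when no linked pathway has a higher pav:version. Two assumptions: the same ?mb is linked to every version of its pathway (as the snippet implies), and the endpoint supports SPARQL 1.1. The variable names ?newer and ?newerVersion are made up for this example.

```sparql
prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix pav:     <http://purl.org/pav/>

# Metabolites per pathway, counting only the newest version
# of each pathway.
select ?pathway (count(distinct ?mb) as ?mbCount)
where {
  ?mb a wp:Metabolite ;
    dcterms:isPartOf ?pathway .
  ?pathway pav:version ?version .
  # Drop this pathway if a newer version is linked to the same metabolite.
  FILTER NOT EXISTS {
    ?mb dcterms:isPartOf ?newer .
    ?newer pav:version ?newerVersion .
    FILTER (?newerVersion > ?version)
  }
}
group by ?pathway
order by desc(?mbCount)
```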


All Metabolites

Count

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select (count(?mb) as ?count) where {
  ?mb a wp:Metabolite .
}

List

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms:  <http://purl.org/dc/terms/>

select ?mb ?label where {
  ?mb a wp:Metabolite ;
     rdfs:label ?label .
}

Metabolic Data Sources

Sorted by use

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>

# Number of metabolite identifiers per data source, most used first
select ?datasource (count(?identifier) as ?count)
where {
  ?mb a wp:Metabolite ;
    dc:source ?datasource ;
    dc:identifier ?identifier .
}
group by ?datasource
order by desc(?count)

All metabolites from one source

All KEGG identifiers

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?identifier
where {
  ?mb a wp:Metabolite ;
    dc:source "Kegg Compound"^^xsd:string ;
    dc:identifier ?identifier .
  FILTER (!isIRI(?identifier))
} order by ?identifier

All HMDB identifiers

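prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

# The data source label is assumed to be "HMDB"; check the output of
# the "Sorted by use" query for the exact spelling.
select distinct ?identifier
where {
  ?mb a wp:Metabolite ;
    dc:source "HMDB"^^xsd:string ;
    dc:identifier ?identifier .
  FILTER (!isIRI(?identifier))
} order by ?identifier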
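
The per-source lists in this section can also be fetched in one query. The sketch below uses a SPARQL 1.1 VALUES block to name the data sources of interest, so it needs an endpoint that supports VALUES. The "HMDB" label is an assumption; "Kegg Compound" is the label used above.

```sparql
prefix wp:  <http://vocabularies.wikipathways.org/wp#>
prefix dc:  <http://purl.org/dc/elements/1.1/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

select distinct ?datasource ?identifier
where {
  # Edit this list to choose the data sources ("HMDB" is an assumed label).
  VALUES ?datasource { "Kegg Compound"^^xsd:string "HMDB"^^xsd:string }
  ?mb a wp:Metabolite ;
    dc:source ?datasource ;
    dc:identifier ?identifier .
  # Keep literal identifiers only, as in the queries above.
  FILTER (!isIRI(?identifier))
} order by ?datasource ?identifier
```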

Metabolic Pathways

Pathways with the most metabolites

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>
prefix pav:     <http://purl.org/pav/>

select ?pathway (count(?mb) as ?mbCount)
where {
  ?mb a wp:Metabolite ;
    dcterms:isPartOf ?pathway .
}
group by ?pathway
order by desc(?mbCount)

Curation

Metabolites not classified as such

One can list all data sources for non-metabolites with this query:

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>

# Number of identifiers per data source for everything that is
# not typed as wp:Metabolite
select ?datasource (count(?identifier) as ?count)
where {
  ?mb dc:source ?datasource ;
    dc:identifier ?identifier .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
}
group by ?datasource
order by desc(?count)

The result mostly lists data sources for gene and protein identifiers, but watch for metabolite identifier data sources among them. An entry with a metabolite identifier that is not typed as a metabolite is a candidate for curation; the queries below find such entries for specific data sources.
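
To get a quick overview before drilling into one data source, the following sketch counts the non-metabolite entries for a few metabolite data sources at once. It assumes a SPARQL 1.1 endpoint with VALUES support; the labels are those used elsewhere on this page and the list can be extended.

```sparql
prefix wp:  <http://vocabularies.wikipathways.org/wp#>
prefix dc:  <http://purl.org/dc/elements/1.1/>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>

# Entries that carry a metabolite identifier but are not typed
# as wp:Metabolite, counted per data source.
select ?datasource (count(distinct ?mb) as ?count)
where {
  VALUES ?datasource {
    "CAS"^^xsd:string
    "PubChem"^^xsd:string
    "Kegg Compound"^^xsd:string
  }
  ?mb dc:source ?datasource ;
    dc:identifier ?identifier .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
  FILTER (!isIRI(?identifier))
}
group by ?datasource
order by desc(?count)
```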

Non-Metabolites with CAS identifier

Note that a CAS identifier can also refer to mixtures, compound classes, etc.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?identifier 
where {
  ?mb dc:source "CAS"^^xsd:string ;
    dc:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
  FILTER (!isIRI(?identifier))
} order by ?pathway

Non-Metabolites with PubChem identifier

These might have been curated by the time of reading.

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?identifier 
where {
  ?mb dc:source "PubChem"^^xsd:string ;
    dc:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER NOT EXISTS { ?mb a wp:Metabolite }
  FILTER (!isIRI(?identifier))
} order by ?pathway

Metabolites with an identifier but undefined data source

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?identifier 
where {
  ?mb a wp:Metabolite ;
    dc:source ""^^xsd:string ;
    dc:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER (!isIRI(?identifier))
  FILTER (str(?identifier) != "")
} order by ?pathway

Metabolites with an Entrez Gene identifier

prefix wp:      <http://vocabularies.wikipathways.org/wp#>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
prefix dc:      <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix xsd:     <http://www.w3.org/2001/XMLSchema#>

select distinct ?pathway ?mb ?identifier 
where {
  ?mb a wp:Metabolite ;
    dc:source "Entrez Gene"^^xsd:string ;
    dc:identifier ?identifier ;
    dcterms:isPartOf ?pathway .
  FILTER (!isIRI(?identifier))
  FILTER (str(?identifier) != "")
} order by ?pathway