Download Pathways

From WikiPathways

(Difference between revisions)
(April data release)
(110 intermediate revisions not shown.)
Line 1: Line 1:
__NOEDITSECTION__ <!-- Turn off section editing -->
__NOEDITSECTION__ <!-- Turn off section editing -->
  {| align="right"
  {| align="right"
-
   | __TOC__
+
   | __NOTOC__
   |}
   |}
-
Pathways available for download have been explicitly marked with the '''Curated collection''' curation tag. This includes pathways that have been carefully curated and typically excludes draft or test pathways not intended for distribution. '''This is the recommended set for download'''.
+
=== Versioned Releases ===
 +
Each month we release an updated set of pathways in various data and image formats. These pathways have been reviewed and tagged as approved, and are considered ready for analysis and data overlays.  
-
A larger set of pathways, which includes non-curated content, is available [[:Download_All_Pathways | here]].
 
-
== GPML ==
+
<font size=4>'''Current version: [http://data.wikipathways.org/20240410/ 20240410 (10 April 2024)]'''</font>
-
Click on one of the links below to download all pathways in GPML format. You can view and edit GPML files in [http://pathvisio.org PathVisio]. You can also use GPML files for analysis in both PathVisio and [http://cytoscape.org Cytoscape] using the [http://apps.cytoscape.org/apps/wikipathways WikiPathways app].
+
-
{|class="prettytable"
 
-
|- valign="top"
 
-
|<batchDownload filetype="gpml" tag="Curation:AnalysisCollection"></batchDownload>
 
-
|}
 
-
The pathways from the Human collection of Reactome are available through the Reactome portal at WikiPathways in the GPML format as well. Click [http://www.wikipathways.org//wpi/batchDownload.php?species=Homo%20sapiens&fileType=gpml&tag=Curation:Reactome_Approved  here] to download them.
+
=== Vertebrates ===
-
== BioPAX ==
+
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Bos_taurus.zip Bos taurus]'''
-
Click on one of the links below to download all pathways in [http://www.biopax.org BioPAX] level 3 format.
+
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Canis_familiaris.zip Canis familiaris]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Danio_rerio.zip Danio rerio]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Equus_caballus.zip Equus caballus]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Gallus_gallus.zip Gallus gallus]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Homo_sapiens.zip Homo sapiens]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Mus_musculus.zip Mus musculus]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Pan_troglodytes.zip Pan troglodytes]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Rattus_norvegicus.zip Rattus norvegicus]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Sus_scrofa.zip Sus scrofa]'''
-
<batchDownload filetype="owl" tag="Curation:AnalysisCollection"></batchDownload>
+
=== Invertebrates ===
-
== Eu.Gene ==
+
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Anopheles_gambiae.zip Anopheles gambiae]'''
-
Click on one of the links below to download all pathways in the [http://www.duccioknights.org/?page_id=169 Eu.Gene] format (pwf). Eu.Gene is a tool for microarray analysis in context of biological pathways ([http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=retrieve&db=pubmed&list_uids=17599938&dopt=AbstractPlus read more]).
+
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Caenorhabditis_elegans.zip Caenorhabditis elegans]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Drosophila_melanogaster.zip Drosophila melanogaster]'''
-
{|class="prettytable"
+
=== Plants ===
-
|- valign="top"
+
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Arabidopsis_thaliana.zip Arabidopsis thaliana]'''
-
  |<batchDownload filetype="pwf" tag="Curation:AnalysisCollection"></batchDownload>
+
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Hordeum_vulgare.zip Hordeum vulgare]'''
-
|}
+
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Oryza_sativa.zip Oryza sativa]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Populus_trichocarpa.zip Populus trichocarpa]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Solanum_lycopersicum.zip Solanum lycopersicum]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Zea_mays.zip Zea mays]'''
 +
  <!-- * '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Beta_vulgaris.zip Beta vulgaris]''' -->
-
== Plain text ==
+
=== Eukaryotic microorganisms ===
-
Click on one of the links below to download all pathways in plain text format. This format contains a list of all datanodes, with the identifier in the first
+
-
column, and the database system in the second column.
+
-
{|class="prettytable"
+
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Gibberella_zeae.zip Gibberella zeae]'''
-
|- valign="top"
+
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Saccharomyces_cerevisiae.zip Saccharomyces cerevisiae]'''
-
|<batchDownload filetype="txt" tag="Curation:AnalysisCollection"></batchDownload>
+
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Plasmodium_falciparum.zip Plasmodium falciparum]'''
-
|}
+
-
== PDF ==
+
=== Bacteria ===
-
Click on one of the links below to download all pathways in Portable Document Format (PDF).
+
-
<batchDownload filetype="pdf" tag="Curation:AnalysisCollection"></batchDownload>
+
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Bacillus_subtilis.zip Bacillus subtilis]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Escherichia_coli.zip E.coli]'''
 +
* '''[http://data.wikipathways.org/20240410/gpml/wikipathways-20240410-gpml-Mycobacterium_tuberculosis.zip Mycobacterium tuberculosis]'''
-
== SVG ==
 
-
Click on one of the links below to download all pathways in [http://www.w3.org/Graphics/SVG/ SVG] format. The [http://www.w3.org/Graphics/SVG/ SVG] files can be used to create high quality images for publication purposes. You can edit and convert the SVG files using [http://www.inkscape.org Inkscape].
 
-
{|class="prettytable"
+
== Programmatic Access ==
-
|- valign="top"
+
The archive of current and past collections of pathways in various formats at data.wikipathways.org is accessible programmatically as well. Depending on your preferences, there are many ways to identify and download the collection you need.
-
|<batchDownload filetype="svg" tag="Curation:AnalysisCollection"></batchDownload>
+
-
|}
+
-
== PNG ==
+
''Note: Our files contain the date of creation in their names so that you can be sure which collection you are using and avoid overwriting local copies of these files.''
-
Click on one of the links below to download all pathways in the Portable Network Graphics (png) format.
+
-
{|class="prettytable"
+
# '''[https://github.com/wikipathways/rwikipathways rWikiPathways]''' is an R package that provides a helper function called ''downloadPathwayArchive'' that retrieves the latest file for you per species and format, e.g.,  <pre>downloadPathwayArchive(organism="Mus musculus", format="gmt")</pre>
-
  |- valign="top"
+
# '''Filename pattern''' lets you infer the filename of the latest collection from the current date. Because we release our archive collections on the 10th of each month, the latest filename carries the nearest date on or before today that matches that pattern, e.g., 20180910 would be the current file from Sep 10 through Oct 9, 2018. ''Caution: this might break if for some unforeseen reason we are unable to produce the archive on schedule.''
-
  |<batchDownload filetype="png" tag="Curation:AnalysisCollection"></batchDownload>
+
# '''Bash scripting''' allows you to scrape the currently available filenames and guarantee that you are getting the latest file no matter what the name might be. Here is an example of a one-liner to get a list of all the current GMT files: <pre>echo "cat //html/body/div/table/tbody/tr/td/a" | xmllint --html --shell http://data.wikipathways.org/current/gmt/ | grep -o -E ">(.*gmt)<" | sed -E 's/(<|>)//g'</pre> And here is a version that would return the latest GMT for mouse: <pre>echo "cat //html/body/div/table/tbody/tr/td/a" |  xmllint --html --shell http://data.wikipathways.org/current/gmt/ | grep -o -E ">.*Mus_musculus.gmt<" | sed -E 's/(<|>)//g'</pre>
-
  |}
+
   
 +
== Other Collections ==
-
== GMT Gene Sets ==
+
<font size=3>
-
 
+
* [http://data.wikipathways.org Prior monthly releases]
-
You can download the complete WikiPathways gene set collection [http://pathvisio.org/data/bots/gmt/wikipathways.gmt here].
+
* [[Daily_Download|Daily curated releases]]
-
 
+
* [http://www.wikipathways.org//wpi/batchDownload.php?species=Homo%20sapiens&fileType=gpml&tag=Curation:Reactome_Approved  Reactome Human Collection]
-
This file gets updated every day. We archive earlier versions of the collection [http://pathvisio.org/data/bots/gmt/ here].
+
* [http://data.wikipathways.org/current/gmt Gene lists per pathway (GMT)]
 +
* [http://data.wikipathways.org/current/svg Pathway image files (SVG)]
 +
* [http://data.wikipathways.org/current/rdf Linked data files (RDF)]
 +
* [http://data.wikipathways.org/current/index Database index files (index)]
 +
* [[Help:FileFormats|Other file formats]]
 +
</font>

Revision as of 14:50, 11 April 2024

Versioned Releases

Each month we release an updated set of pathways in various data and image formats. These pathways have been reviewed and tagged as approved, and are considered ready for analysis and data overlays.


Current version: 20240410 (10 April 2024)


Vertebrates

Invertebrates

Plants

Eukaryotic microorganisms

Bacteria


Programmatic Access

The archive of current and past collections of pathways in various formats at data.wikipathways.org is accessible programmatically as well. Depending on your preferences, there are many ways to identify and download the collection you need.

Note: Our files contain the date of creation in their names so that you can be sure which collection you are using and avoid overwriting local copies of these files.

  1. rWikiPathways is an R package that provides a helper function called downloadPathwayArchive that retrieves the latest file for you per species and format, e.g.,
    downloadPathwayArchive(organism="Mus musculus", format="gmt")
  2. Filename pattern lets you infer the filename of the latest collection from the current date. Because we release our archive collections on the 10th of each month, the latest filename carries the nearest date on or before today that matches that pattern, e.g., 20180910 would be the current file from Sep 10 through Oct 9, 2018. Caution: this might break if for some unforeseen reason we are unable to produce the archive on schedule.
  3. Bash scripting allows you to scrape the currently available filenames and guarantee that you are getting the latest file no matter what the name might be. Here is an example of a one-liner to get a list of all the current GMT files:
    echo "cat //html/body/div/table/tbody/tr/td/a" |  xmllint --html --shell http://data.wikipathways.org/current/gmt/ | grep -o -E ">(.*gmt)<" | sed -E 's/(<|>)//g'
    And here is a version that would return the latest GMT for mouse:
    echo "cat //html/body/div/table/tbody/tr/td/a" |  xmllint --html --shell http://data.wikipathways.org/current/gmt/ | grep -o -E ">.*Mus_musculus.gmt<" | sed -E 's/(<|>)//g'
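
For readers who prefer Python, options 2 and 3 above might be sketched roughly as follows. This is an illustration under stated assumptions, not an official client: the function and class names are hypothetical, the URL layout is copied from the GPML links on this page, and the directory listing is assumed to expose each file as an ordinary link whose href ends in the filename.

```python
from datetime import date
from html.parser import HTMLParser

BASE_URL = "http://data.wikipathways.org"  # archive root named in the text
RELEASE_DAY = 10  # releases are stated to appear on the 10th of each month


def latest_release_date(today):
    """Return the nearest scheduled release date on or before `today`."""
    if today.day >= RELEASE_DAY:
        return today.replace(day=RELEASE_DAY)
    # Before the 10th: fall back to the 10th of the previous month.
    if today.month == 1:
        return date(today.year - 1, 12, RELEASE_DAY)
    return date(today.year, today.month - 1, RELEASE_DAY)


def gpml_archive_url(release, species):
    """Build a GPML archive URL following the pattern of the links on this page."""
    stamp = release.strftime("%Y%m%d")
    name = species.replace(" ", "_")
    return f"{BASE_URL}/{stamp}/gpml/wikipathways-{stamp}-gpml-{name}.zip"


class LinkCollector(HTMLParser):
    """Collect href values from anchor tags (assumed listing structure)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.hrefs.append(value)


def files_with_suffix(listing_html, suffix):
    """Return linked filenames in a listing page that end with `suffix`."""
    parser = LinkCollector()
    parser.feed(listing_html)
    return [href for href in parser.hrefs if href.endswith(suffix)]
```

Calling gpml_archive_url(latest_release_date(date.today()), "Homo sapiens") yields a candidate URL based on the schedule alone, so it carries the same caution as option 2. Passing the fetched HTML of http://data.wikipathways.org/current/gmt/ to files_with_suffix(html, "Mus_musculus.gmt") does not depend on the schedule, though it does depend on the listing markup staying as assumed here.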


Other Collections
