1001proteomes

1001 Proteomes

1001 Proteomes: an Arabidopsis thaliana non-synonymous SNP browser. The 1001 Proteomes portal provides a simple way to browse changes to proteins caused by non-synonymous single nucleotide polymorphisms (nsSNPs) in accessions (natural strains) of Arabidopsis thaliana. This portal is part of a collection of resources developed by the Proteomics Subcommittee of the Multinational Arabidopsis Steering Committee (MASCP) and the 1001 Genomes consortium.


SNP data for Arabidopsis thaliana accessions

The underlying data in 1001 Proteomes are obtained from the Arabidopsis 1001 Genomes project as sets of processed SNP calls. We have created a pipeline that takes these processed data and constructs pseudo chromosomes for each accession, based on the Arabidopsis reference accession Col-0. Reference genome sequence data are obtained from The Arabidopsis Information Resource (currently TAIR10). From the accession-specific pseudo chromosomes, we extract and compile protein sets using BioPerl (based on the TAIR gene reference files). These protein sets are then compared to the reference protein set generated at TAIR for Col-0.
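The idea behind this pipeline can be illustrated with a minimal Python sketch. It is not the project's actual code (which uses BioPerl and the TAIR gene reference files); it assumes a single, already-spliced coding sequence rather than a whole chromosome with gene models, and the function names and the short example sequence are hypothetical. SNPs are substituted into the reference, both sequences are translated, and the differing residues are reported as putative amino acid substitutions.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def apply_snps(reference, snps):
    """Return a pseudo sequence: the reference with each SNP substituted.

    `snps` maps a 1-based nucleotide position to the alternate base.
    """
    seq = list(reference)
    for position, alt_base in snps.items():
        seq[position - 1] = alt_base
    return "".join(seq)


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def nonsynonymous_changes(ref_protein, alt_protein):
    """List (1-based position, reference residue, substituted residue)."""
    return [
        (i, ref_aa, alt_aa)
        for i, (ref_aa, alt_aa) in enumerate(zip(ref_protein, alt_protein), start=1)
        if ref_aa != alt_aa
    ]


# Hypothetical toy coding sequence (Met-Thr-Glu-stop), not a real gene.
reference_cds = "ATGACTGAATAA"
# Two illustrative SNPs: position 3 (G to A) and position 6 (T to C).
accession_cds = apply_snps(reference_cds, {3: "A", 6: "C"})

changes = nonsynonymous_changes(translate(reference_cds), translate(accession_cds))
print(changes)
```

In this toy case the first SNP changes the codon ATG to ATA (Met to Ile) and is reported, while the second changes ACT to ACC, which still encodes Thr and is therefore silent. A realistic implementation would also need to handle strand, introns and SNPs that create or remove stop codons.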

Note: The resultant accession 'proteomes' are based on SNP calls from various resequencing projects and analysis procedures and are completely reliant on the analysis and SNP calls by the original data providers. Consequently, all nsSNPs should be treated as putative until demonstrated experimentally through targeted sequence validation or mass spectrometry.


Browser compatibility

Because browsers differ in how closely they conform to web standards, we recommend one of the following browsers for correct rendering: Mozilla Firefox, Google Chrome or Apple Safari. The tool can also be used with Internet Explorer after installing the Google Chrome Frame add-on.

News

Initial Release: November 2011

91 Accessions

Using 1001 Proteomes

1. When first visiting the site, a pop-up box prompts for an AGI (Arabidopsis Genome Initiative identifier). A pre-filled AGI (at3g15450.1) enables instant access to the browser.


2. Data for an AGI can be retrieved at any point by entering the code into the window on the top left and clicking the "Retrieve" button. The current description for the protein is displayed (TAIR10), with the protein represented as a bar scale whose numbers indicate the amino acid position in the protein. The red/grey spheres indicate that an nsSNP, and therefore an amino acid substitution, has occurred at that point in the protein sequence. This 'track' is a consolidated summary of all nsSNPs from all associated accessions. The letter in the red/grey sphere is the substituted amino acid (1 letter code) resulting from the nsSNP. The size of the red/grey sphere indicates the number of accessions (natural variants) that carry a specific nsSNP or substitution: the larger the sphere, the more accessions carry the change.


3. Clicking the triangle next to nsSNPs in the dark grey "Control Box" displays a list of all Arabidopsis thaliana accessions that contain nsSNPs in the current protein, grouped by data provider: currently MPI (Max Planck Institute for Developmental Biology) and JGI (Joint Genome Institute). Specific accessions can be displayed by toggling the triangles and then the + symbol next to each accession's name. This gives complete control over the visualized tracks and allows specific focus on a single accession.


4. Using either the mouse wheel or the + and - symbols above the "Control Box", it is possible to zoom in on a specific amino acid that has been substituted by an nsSNP in a specific accession. Use the mouse to grab and pan the track horizontally so that the region of interest stays in view. In this example, Bur-0 has a SNP that causes a substitution at Met (106), resulting in an Ile, while Ped-0 contains a SNP that causes Thr at position 141 to be substituted with a Ser.


5. The number of accessions contributing to a specific nsSNP determines the size of the red/grey sphere. The more accessions contributing, the larger the sphere relative to the others.


6. The order of 'tracks' can be changed by clicking the 'Edit' button on the "Control Box" and moving tracks with the mouse. The "Control Box" can be removed by clicking the X in its top right corner. Finally, the toolbar at the bottom of the page indicates when data for this AGI was last loaded. The data can be refreshed by clicking the circular arrow.
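The consolidated summary track described in steps 2 and 5 can be thought of as a grouping of per-accession substitutions by position and residue change. The Python sketch below is an illustration only, not the portal's implementation: the function name and data layout are assumptions, and "Example-1" is an invented accession added purely to show a substitution shared by more than one accession. The Bur-0 and Ped-0 substitutions are those from the example in step 4.

```python
from collections import defaultdict


def consolidate(nssnps_by_accession):
    """Group substitutions across accessions.

    Input maps an accession name to a list of
    (position, reference residue, substituted residue) tuples.
    Output maps each substitution to the sorted accessions carrying it.
    """
    summary = defaultdict(set)
    for accession, changes in nssnps_by_accession.items():
        for change in changes:
            summary[change].add(accession)
    return {change: sorted(accessions) for change, accessions in summary.items()}


per_accession = {
    "Bur-0": [(106, "M", "I")],      # from the example in step 4
    "Ped-0": [(141, "T", "S")],      # from the example in step 4
    "Example-1": [(106, "M", "I")],  # hypothetical accession, for illustration
}

summary = consolidate(per_accession)
for (position, ref_aa, alt_aa), accessions in sorted(summary.items()):
    # The accession count is what the sphere size reflects in the browser.
    print(f"{ref_aa}{position}{alt_aa}: {len(accessions)} accession(s): {', '.join(accessions)}")
```

The number of accessions per substitution is the quantity that the sphere size reflects; how the browser scales sphere size from that count is not specified here, so the sketch reports only the counts.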

Available Datasets

1001 Proteomes: November 2011

JGIHeazlewood2011 - 6 accessions from DOE Joint Genome Institute

MPICao2010 - 80 Arabidopsis thaliana accessions

JGIHeazlewood2008 - Bay-0 and Sha from DOE Joint Genome Institute

MPIOssowski2008 - Col-0, Bur-0 and Tsu-1

Chromosome Data

The individual pseudo chromosomes are available upon request.

Please contact: Joshua Heazlewood, Lawrence Berkeley National Laboratory, Joint BioEnergy Institute