Specht et al, 2019

Article: Specht et al, 2019

Peer-reviewed article: Specht, H., Emmott, E., Petelski, A.A. et al. Single-cell proteomic and transcriptomic analysis of macrophage heterogeneity using SCoPE2. Genome Biol 22, 50 (2021). doi:10.1186/s13059-021-02267-5

Single-cell proteomics method: SCoPE2

Sample preparation method: mPOP

Model systems: Monocytic cell line (U937) differentiating into macrophage-like cells

SCoPE2 data processed to ASCII text matrices

All Processed Data
Peptides-raw.csv
- Peptides x single cells at 1% FDR. The first 2 columns list the corresponding protein identifiers and peptide sequences and each subsequent column corresponds to a single cell. Peptide identification is based on spectra analyzed by MaxQuant and is enhanced by using DART-ID to incorporate retention time information. See Specht et al., 2019 for details.

Proteins-processed.csv
- Proteins x single cells at 1% FDR, imputed and batch corrected.

Cells.csv
- Annotation x single cells. Each column corresponds to a single cell, and the rows hold relevant metadata, such as the cell type (if known), measurements from the isolation of the cell, and derived quantities, i.e., rRI, CVs, and reliability.
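
The layouts described above (two identifier columns followed by one column per cell, and an annotation table with one column per cell) can be read with a few lines of Python. The following is only a minimal sketch on made-up toy data: the header names, cell identifiers, annotation names, values, and the use of "NA" for missing entries are all assumptions for illustration, not taken from the actual files.

```python
import csv
import io

# Toy stand-in for Peptides-raw.csv; header names and values are made up.
PEPTIDES_CSV = """protein,peptide,cell_1,cell_2,cell_3
P12345,AAAK,0.5,NA,1.2
P12345,CCCR,0.7,0.9,1.0
Q67890,DDDK,NA,1.1,0.8
"""

# Toy stand-in for Cells.csv: one row per annotation, one column per cell.
CELLS_CSV = """annotation,cell_1,cell_2,cell_3
celltype,monocyte,macrophage,macrophage
rRI,0.02,0.03,0.01
"""


def parse_value(text):
    """Convert a matrix entry to float; blanks and 'NA' (assumed) are missing."""
    return None if text in ("", "NA") else float(text)


def read_peptide_matrix(handle, n_id_columns=2):
    """Split each row into its identifier columns and its per-cell values."""
    reader = csv.reader(handle)
    header = next(reader)
    cell_ids = header[n_id_columns:]
    matrix = {}
    for row in reader:
        key = tuple(row[:n_id_columns])  # (protein identifier, peptide sequence)
        values = [parse_value(v) for v in row[n_id_columns:]]
        matrix[key] = dict(zip(cell_ids, values))
    return cell_ids, matrix


def read_cell_annotations(handle):
    """Transpose annotation-by-cell rows into one metadata dict per cell."""
    reader = csv.reader(handle)
    cell_ids = next(reader)[1:]
    cells = {cell_id: {} for cell_id in cell_ids}
    for row in reader:
        name, values = row[0], row[1:]
        for cell_id, value in zip(cell_ids, values):
            cells[cell_id][name] = value
    return cells


cell_ids, peptides = read_peptide_matrix(io.StringIO(PEPTIDES_CSV))
cells = read_cell_annotations(io.StringIO(CELLS_CSV))

# Count quantified (non-missing) peptides in each cell.
quantified = {
    cell_id: sum(row[cell_id] is not None for row in peptides.values())
    for cell_id in cell_ids
}
```

To use the sketch on the real files, replace the io.StringIO objects with open file handles and check the actual header and missing-value conventions first.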

sdrf_meta_data.tsv
- Metadata following the Sample and Data Relationship Format (SDRF) for Proteomics project guidelines, covering all single cells used in the analyses behind the figures.

Joint protein-RNA data
- Genes x single cells. The protein and RNA data sets were imputed and batch corrected separately, then combined, keeping only genes common to both data sets. UniProt accession numbers are used to denote genes.
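
The combining step described above, keeping only genes present in both data sets, amounts to an inner join on the gene identifier. The sketch below illustrates that one step on toy values; it does not show imputation or batch correction. The function name, cell names, column prefixes, and numbers are hypothetical.

```python
def join_on_common_genes(protein, rna):
    """Combine protein and RNA cell columns, keeping genes present in both."""
    common = sorted(set(protein) & set(rna))
    joint = {}
    for gene in common:
        # Prefix cell names so the modality of each column stays visible.
        row = {f"prot_{cell}": value for cell, value in protein[gene].items()}
        row.update({f"rna_{cell}": value for cell, value in rna[gene].items()})
        joint[gene] = row
    return joint


# Toy matrices keyed by UniProt accession; the values are made up.
protein = {
    "P04406": {"c1": 0.2, "c2": -0.1},
    "P60709": {"c1": 0.5, "c2": 0.4},
}
rna = {
    "P60709": {"c9": 1.3},
    "P68371": {"c9": -0.7},
}

joint = join_on_common_genes(protein, rna)
```

Only P60709 appears in both toy inputs, so it is the only row in the joint matrix.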

Signal-to-noise data
- Peptides and Proteins x single cells at 1% FDR. The first 2 columns list the corresponding protein identifiers and peptide sequences and each subsequent column corresponds to a single cell. The quantitation is the Signal-to-noise (S/N) ratio for each single cell’s corresponding reporter ion extracted from the RAW file. The single cell identification numbers are mapped to cell type and RAW file. Complete extracted S/N for each RAW file can be found here.

DART-ID input

GSEA: GOrilla output
Minimal data files necessary for generating Peptides-raw.csv and Proteins-processed.csv

Additional data files necessary for generating figures from the SCoPE2 article.

Processed Data from the second version (v2) of the SCoPE2 preprint

Processed Data from the first version (v1) of the SCoPE2 preprint

Single-cell proteomics data processing: The analysis of the data described here has been replicated by Christophe Vanderaa and Laurent Gatto with the scp Bioconductor package. The scp package is used to process and analyze mass spectrometry-based single-cell proteomics data and is freely available from their GitHub repository. The scp package and the replication are described in this video.

SCoPE2 RAW data and search results from MaxQuant

The repositories below contain RAW mass-spectrometry data files generated by a Q Exactive instrument, as well as the search results from analyzing the RAW files with MaxQuant and DART-ID. The files in Repository 1 were generated in the spring of 2019 and are described in the first version of Specht et al., 2019. The files in Repository 2 were generated in the fall of 2019 from biological replicates and are described in the third version of Specht et al., 2019. The data in Repository 1 were generated with 11-plex TMT, while the data in Repository 2 were generated with 16-plex TMTpro.

MaxQuant results descriptions
Raw file descriptions
MassIVE Repository 1:
- http: MSV000083945
- ftp: MSV000083945
MassIVE Repository 2:
- http: MSV000084660
- ftp: MSV000084660

scRNA-seq 10x Genomics RAW and processed data

A cellular mixture identical to that used for the single-cell proteomics was assessed with scRNA-seq using the 10x Genomics Chromium platform and the Single Cell 3’ Library & Gel Bead Kit (v2). Two biological replicates of the cell suspension (stock concentration: 1200 cells/μl) were loaded into independent lanes of the device. On average, about 10,000 cells per lane were recovered. Following library preparation and sample QC on an Agilent BioAnalyzer High Sensitivity chip, the two libraries were pooled, quantified with the KAPA Library Quantification kit, and sequenced on the Illumina NovaSeq 6000 system (Nova S1 100 flow cell) with the following run parameters: Read1: 26 cycles, i7 index: 8 cycles, Read2: 93 cycles.
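
For readers working from the FASTQ files: with the v2 chemistry of the Single Cell 3’ kit, the 26 cycles of Read1 cover a 16-nt cell barcode followed by a 10-nt UMI, while Read2 carries the transcript sequence. The sketch below only illustrates that layout; the function name and the example sequence are made up, and standard pipelines perform this split (with barcode error correction) internally.

```python
BARCODE_LEN = 16   # cell barcode length in Single Cell 3' v2 chemistry
UMI_LEN = 10       # UMI length in v2 chemistry
READ1_CYCLES = 26  # Read1 length from the run parameters above


def split_read1(sequence):
    """Split a Read1 sequence into (cell barcode, UMI)."""
    if len(sequence) != READ1_CYCLES:
        raise ValueError(f"expected {READ1_CYCLES} bases, got {len(sequence)}")
    return sequence[:BARCODE_LEN], sequence[BARCODE_LEN:]


# Made-up 26-base read: 16 barcode bases followed by 10 UMI bases.
barcode, umi = split_read1("ACGTACGTACGTACGT" + "TTGCATGCAA")
```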

GEO Repository
- http: GSE142392
Processed data matrices: Transcript x single cells in UMI counts.
- mRNA biological replicate one
- mRNA biological replicate two