transform = TRUE argument to
[ extraction method, allowing the user to skip automatic DESeq2 transformations, which can be CPU intensive for large datasets.
plotMeanAverage(). An MA-plot by definition is not a “Mean Average” plot, so this function name is misleading. We will keep the
plotMeanAverage() working, but it is now soft deprecated.
label argument, similar to
tximport() call to handle transcript version mismatch with tx2gene data.frame. This can result if the bcbio pipeline is using an old genome build.
genomeBuild is detected from AnnotationHub
rowRangesMetadata if applicable, and not left
NULL in the metadata.
aes_string(), which uses tidyeval and quasiquotation.
plotGene(): reduced the number of
return options to simply “facet” and “wide”. Previously, this also supported “grid”, “list”, and “markdown”, but these were removed because they are not frequently used.
plotGene(): Switched back to internal
lapply() call instead of using
BiocParallel::bplapply(). This doesn’t always work perfectly in an HPC environment (e.g. HMS O2 cluster).
plotMeanAverage() in favor of
aggregateReplicates()support has been added back. This function returns a
RangedSummarizedExperiment instead of a
bcbioRNASeq object, containing only an aggregate raw counts matrix in the
organism as new parameter arguments. We’ve reduced the number of parameters required here to run clusterProfiler.
bcbioRNASeq object, in favor of
DESeqDataSet only. This function is only useful when a proper design formula has been defined.
metrics() now contains an informative error for datasets that were analyzed using the
bcbioRNASeq object doesn’t attempt to run
DESeq() command anymore. This call was unnecessary, and skipping it improves speed.
bcbioSingleCell() constructor now supports
censorSamples parameter. This is useful for removing known poor quality samples upon loading.
NULL in the quality control functions. This behavior doesn’t change the appearance of the plot colors, which will still default to
ggplot2::scale_fill_hue(). The upcoming ggplot2 v2.3.0 update supports global options for color and fill palettes, so these parameters may be deprecated in a future release.
rlog counts by default in plots, where applicable.
plotDEGPCA() to default differential expression R Markdown template.
colData() factors are correctly releveled upon object subset with
[. This helps avoid unwanted downstream errors when creating a
DESeqDataSet and running differential expression with DESeq2.
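The releveling behavior described here can be illustrated with base R alone. This is a minimal sketch and not the package's internal code; the sample metadata and its column names are made up for illustration.

```r
# Hypothetical sample metadata; names and values are illustrative only.
col_data <- data.frame(
    sampleName = c("ctrl_1", "ctrl_2", "ko_1", "ko_2"),
    genotype = factor(c("control", "control", "knockout", "knockout")),
    stringsAsFactors = FALSE
)

# Subsetting rows keeps the original factor levels, including unused ones.
subset_data <- col_data[col_data$genotype == "control", ]
levels_before <- levels(subset_data$genotype)

# Dropping unused levels avoids empty groups in a downstream design formula.
subset_data$genotype <- droplevels(subset_data$genotype)
levels_after <- levels(subset_data$genotype)
```

Comparing `levels_before` with `levels_after` shows the stale level that would otherwise be carried into a design formula.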
facet return method by default for
plotGene(). Updated the working example to reflect this.
sample label has been removed from axis title for QC plot functions.
bcbio_geom_label_repel(). These are also used by bcbioSingleCell for improved graphical consistency.
plotPCA() functions to match
plotDEGPCA(), matching the other DEG functions. Also added directionality to
DESeqDataSet method support to
plotCorrelationHeatmap(), using the normalized counts.
resultsTables() now writes local files to
tempdir() when Dropbox mode is enabled using
Last set of code fixes before F1000v2 resubmission.
rle return support for
counts(), which are calculated on the fly.
organism = NULL again, for datasets with poorly annotated genomes.
assay() containing raw counts is now named
raw, for consistency with other
DESeqDataSet) and the bcbioSingleCell S4 class definition.
assays(), even when rlog and vst transformations are skipped.
metadata<- assignment methods, to avoid unwanted coercion to
SummarizedExperiment. Objects extending
RangedSummarizedExperiment shouldn’t be doing this, so we may need to file a bug report with Bioconductor or check our class definition in the package.
message() rather than the alternate rlang functions
bcbioRNASeq S4 class object is now extending
SummarizedExperiment. Consequently, the row annotations are now stored in the
GRanges class, instead of in the
rowData slot as a
rowData() accessor still works and returns a data frame of gene/transcript annotations, but these are now coerced from the internally stored
GRanges object is acquired automatically from Ensembl using
basejump::ensembl(). By default,
GRanges are acquired from Ensembl using AnnotationHub and ensembldb. Legacy GRCh37 genome build is supported using the EnsDb.Hsapiens.v75 package.
assays() now only slot matrices. We’ve moved the tximport data from the now defunct
bcbio() slot to assays. This includes the
lengths matrix from tximport. Additionally, we are optionally slotting DESeq2 variance-stabilized counts (“
"vst"). DESeq2 normalized counts and edgeR TMM counts are calculated on the fly and no longer stored inside the
colData() now defaults to returning as
DataFrame, for easy piping to tidyverse functions.
bcbio() slot is now defunct.
isSpike argument during the
loadRNASeq() data import step.
plotCountDensity(). Note that we are subsetting the nonzero genes as defined by the raw counts here.
tximport() code to no longer attempt to strip transcript versions. This is required for working with C. elegans transcripts.
as(object, "DESeqDataSet") coercion method support for
bcbioRNASeq class. This helps us set up the differential expression analysis easily.
counts() function now returns DESeq2 normalized counts (
normalized = TRUE) and edgeR TMM counts (
normalized = "tmm") on the fly, as suggested by the F1000 reviewers.
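To illustrate what computing normalized counts on the fly involves, here is a base R sketch of median-of-ratios size factor normalization, the approach DESeq2 uses by default. It is not the package's code: the counts matrix and the helper name `median_ratio_normalize` are hypothetical, and in practice DESeq2 and edgeR perform these calculations.

```r
# Hypothetical raw counts matrix: genes in rows, samples in columns.
raw <- matrix(
    c(10, 20, 30,
      100, 200, 300,
      5, 10, 15,
      0, 4, 8),
    nrow = 4, byrow = TRUE,
    dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)

# Illustrative helper (not part of bcbioRNASeq or DESeq2).
median_ratio_normalize <- function(counts) {
    # Per-gene mean of log counts; genes with any zero count become -Inf.
    log_geo_means <- rowMeans(log(counts))
    size_factors <- apply(counts, 2, function(sample_counts) {
        keep <- is.finite(log_geo_means) & sample_counts > 0
        exp(median((log(sample_counts) - log_geo_means)[keep]))
    })
    # Divide each sample (column) by its size factor.
    sweep(counts, 2, size_factors, FUN = "/")
}

normalized <- median_ratio_normalize(raw)
```

Because nothing is cached, the normalized matrix always reflects the current raw counts, at the cost of recomputing it on each call.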
bcbioRNASeq object, since we’re not stashing a
validObject() is now required for all plotting functions. This check is also called in the R Markdown template. Legacy objects can be updated using
metrics() now returns columns sorted alphabetically.
contrastName() as a generic function.
plotDEGPCA() generics no longer have
counts defined in the signature. The
counts argument is now only defined in the methods.
prepareRNASeqTemplate() has been converted from a generic to a standard function.
plotCorrelationHeatmap() matrix method has been moved to basejump package, for improved consistency with the other heatmap code.
plotGenderMarkers() internal code has been reworked to match
plotMA() appearance has changed, providing a line at the 0 y-intercept, similar to
.sampleDirs() code is now exported in bcbioBase as a generic.
interestingGroups() method support are now defined for
SummarizedExperiment in the bcbioBase package.
bcbio() slot is now defunct, since we have moved all data into the
plot5x3Bias() in favor of
plot5Prime3PrimeBias(). This is less confusing as to what this function plots.
flatFiles() has been deprecated in favor of
as(object, "list") coercion method. See bcbioBase package for
bcbioRNADataSet method support has been removed.
gene2symbol argument not renaming rows in
[ subset method dropping metrics in metadata.
resultsTables(), for use with the Stem Cell Commons database.
plotGenderMarkers() to run faster.
genomeBuild parameters are now user-definable in the main
intersect() in the featureCounts matrix.
aggregateReplicates() code. This needs to be reworked and added back in a future release.
if statements to be more class specific.
normalized, for consistency with the
title support to plots, where applicable.
resultsTables() function now defaults to
summary = TRUE.
res, etc.) into a single
examples object. This helps avoid accidental use of example
bcb in an analysis.
internal-ggplot.R to above each function.
transformationLimit. If there are more samples than this limit, then the DESeq2 transformations will be skipped. In this case,
vst will not be slotted into
loadRNASeq() to ensure rows are in the same order as the columns in the counts matrix. Otherwise, DESeq will report an error at the
DESeqDataSetFromTximport() step. We’re also ensuring the factor levels get updated here.
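The ordering requirement can be sketched in base R. This is an illustration under assumed inputs, not the loadRNASeq() implementation; the counts matrix, the sample metadata, and the column name `treatment` are all hypothetical.

```r
# Hypothetical counts matrix and sample metadata with mismatched order.
counts <- matrix(
    1:6, nrow = 2,
    dimnames = list(c("gene1", "gene2"), c("s1", "s2", "s3"))
)
sample_data <- data.frame(
    treatment = factor(c("drug", "vehicle", "drug")),
    row.names = c("s3", "s1", "s2")
)

# Fail early if the two objects do not describe the same samples.
stopifnot(setequal(rownames(sample_data), colnames(counts)))

# Reorder metadata rows to match the counts columns.
sample_data <- sample_data[colnames(counts), , drop = FALSE]

# Refresh factor levels so that no unused levels remain.
sample_data[] <- lapply(sample_data, function(x) {
    if (is.factor(x)) droplevels(x) else x
})

# TRUE when metadata rows line up with counts columns.
aligned <- identical(rownames(sample_data), colnames(counts))
```

Indexing by `colnames(counts)` rather than sorting both objects keeps the counts matrix untouched and makes the metadata follow it.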
str() in examples, where applicable.
colData<- assignment method support. This requires a
DataFrame class object. Upon assignment, the internal colData at
assays(object)[["vst"]] are also updated to match.
design, which will update the internal DESeqDataSet.
gene2symbol() generic, which will now return a 2 column
symbol columns. This is helpful for downstream gene to symbol mapping operations.
interestingGroups<- in the documentation.
viridis::scale_fill_viridis(discrete = TRUE). This makes it clearer to the user in the documentation where these palettes are located.
plotHeatmap() now uses internal
gene2symbol mappings from stashed annotable, instead of always querying Ensembl. The user can define custom mappings with the
gene2symbol argument, if desired.
plotPCA() now supports custom color palettes. The
shapes parameter has been removed because it doesn’t work well and is limited to datasets with few samples. This behavior matches the PCA functionality in DESeq2.
plotVolcano(). Added support for
gene2symbol argument, like in
plotHeatmap(). If left missing, the function will query Ensembl for the gene2symbol mappings. We’re now using
stats as the main data source.
dependencies argument, which allows for automatic install of suggested packages along with imports.
f1000v1 branch containing the reproducible code used to generate the figures in our workflow.
plotMA() to support vertical or horizontal layout return. Also added an argument to remove the color legend, which is typically not that informative.
bcbioRNADataSet (< 0.1.0) to
bcbioRNASeq class object is now possible using
bcbioRNADataSet objects must be upgraded to
bcbioRNASeq S4 object using
bcbioRNASeq method support for
bcbioRNADataSet S4 class to
bcbioRNASeq. This matches the naming conventions in the bcbioSingleCell package.
loadRNASeq() from using S4 dispatch to a standard function.
loadRNASeq() that enables request of a specific Ensembl release version for gene annotations.
interestingGroup argument in quality control functions to
interestingGroups for better consistency.
bcbioRNADataSet or a metrics
interesting_group declaration for visualization.
design = formula(~1) for quality control. This enables automatic generation of
bcbioRnaDataSet S4 definition updates.
plot_pca() and gene-level heatmaps.
load_run() that saves to S4 object instead of list.
project-summary.yaml saved in the final run directory.