Title: | Various functions for microbial community data |
---|---|
Description: | Collection of functions for microbiome analyses. E.g. fitting neutral models and standardized effect sizes of phylogenetic beta diversities, and much more. |
Authors: | Jakob Russel [aut, cre] |
Maintainer: | Jakob Russel <[email protected]> |
License: | GPL (>= 3) | file LICENSE |
Version: | 0.9.19 |
Built: | 2024-11-06 18:20:01 UTC |
Source: | https://github.com/Russel88/MicEco |
Calculate (partial) Omega-squared (effect-size calculation) for PERMANOVA and add it to the input object
adonis_OmegaSq(adonisOutput, partial = TRUE)
adonis_OmegaSq(adonisOutput, partial = TRUE)
adonisOutput |
An adonis object |
partial |
Should partial omega-squared be calculated (sample size adjusted). Default TRUE |
Original adonis object with the (partial) Omega-squared values added
This function performs a CLR transformation on the input. Prior to transformation it does a multiplicative zero replacement, which adds a pseudocount, but corrects the non-zero abundances such that log-ratios between non-zero elements are unchanged after correction.
clr(mat, delta = 1)
clr(mat, delta = 1)
mat |
A community matrix with features as rows |
delta |
The pseudocount to exchange zeroes with. Zero-correction is multiplicative such that the log-ratios between any entirely non-zero features will not be affected by the pseudocount. |
CLR transformed community matrix
Parallel calculation of MPD (mean pairwise distance) separating taxa in two communities, a measure of phylogenetic beta diversity
comdist.par(comm, dis, abundance.weighted = FALSE, cores = 1, progress = TRUE)
comdist.par(comm, dis, abundance.weighted = FALSE, cores = 1, progress = TRUE)
comm |
Community data matrix with samples as rows |
dis |
Distance matrix (generally a phylogenetic distance matrix) |
abundance.weighted |
Should mean nearest taxon distances for each species be weighted by species abundance? (default = FALSE) |
cores |
How many cores should be used for parallel computing |
progress |
Show a progress bar |
A distance matrix of MPD values
Parallel calculation of MNTD (mean nearest taxon distance) separating taxa in two communities, a measure of phylogenetic beta diversity
comdistnt.par( comm, dis, abundance.weighted = FALSE, exclude.conspecifics = FALSE, cores = 1, progress = TRUE )
comdistnt.par( comm, dis, abundance.weighted = FALSE, exclude.conspecifics = FALSE, cores = 1, progress = TRUE )
comm |
Community data matrix with samples as rows |
dis |
Distance matrix (generally a phylogenetic distance matrix) |
abundance.weighted |
Should mean nearest taxon distances for each species be weighted by species abundance? (default = FALSE) |
exclude.conspecifics |
Should conspecific taxa in different communities be exclude from MNTD calculations? (default = FALSE) |
cores |
How many cores should be used for parallel computing' |
progress |
Show a progress bar |
A distance matrix of MNTD values
Calculate the average 16S rRNA copy number for each sample in an OTU table
community_rrna(x, copy.database = "v13.5", weighted = TRUE)
community_rrna(x, copy.database = "v13.5", weighted = TRUE)
x |
A phyloseq object OR an OTU-table with taxa as rows and OTU names as rownames. OTUs should be picked against the Greengenes v13.5 database, unless a another copy number database is provided. |
copy.database |
What Greengenes database version was used to find OTUs. Atm only "v13.5" is available. Alternatively, A dataframe with two variables: "ID" is the OTU id matched by rownames in x and "Copy" is the copy number. |
weighted |
Logical. Should the average copy number be weighted by the relative abundances of the OTUs? If not, it is adviced to rarefy the otu-table first if there are large differences in sample reads. |
A dataframe with an average copy number for each sample
Precalculated for the PICRUSt pipeline (https://picrust.github.io/picrust/picrust_precalculated_files.html)
Fit neutral model developed by Sloan et al. (2006, Environ Microbiol 8(4):732-740) and implemented by Burns et al. (2015, ISME J 10(3):655-664).
neutral.fit(otu)
neutral.fit(otu)
otu |
An OTU-table with taxa as columns and samples as rows. |
A list of length two; first element contains fit statistics, the second element contains predictions.
Fit neutral model developed by Sloan et al. (2006, Environ Microbiol 8(4):732-740) and implemented by Burns et al. (2015, ISME J 10(3):655-664) several times on ramdomly picked samples and with 16S rRNA gene copy number corrected rarefaction.
neutral.rand( data, n = NULL, s = NULL, rRNA = NULL, rn = NULL, cores = 1, naming = NULL )
neutral.rand( data, n = NULL, s = NULL, rRNA = NULL, rn = NULL, cores = 1, naming = NULL )
data |
A phyloseq object |
n |
Integer. Number of times to repeat analysis |
s |
Integer. Number of random samples to for each repetition. |
rRNA |
What Greengenes database version was used to find OTUs. Atm only "v13.5" is available. Alternatively, A dataframe with two variables: "ID" is the OTU id matched by names in x and "Copy" is the copy number. |
rn |
Integer. Number of reads to sample for rarefaction |
cores |
Integer. Number of cores to use for parallel computing. |
naming |
Optional. A list for naming the output e.g. list(Time="1 Week",Type="Gut") |
A list of length two; first element contains fit statistics, the second element contains predictions.
Calculate proportionality as proposed by Lovell et al. 2016 Proportionality: a valid alternative to correlation for relative data
proportionality(x, delta = 1)
proportionality(x, delta = 1)
x |
An otu-table with OTUs as rows OR a phyloseq object |
delta |
The pseudocount to exchange zeroes with. Zero-correction is multiplicative such that the proportionality between any entirely non-zero OTUs will not be affected by the pseudocount. |
The network association matrix with proportionality values
Make Euler diagram of shared taxa (ASVs, OTUs) across sample groups from a phyloseq object. Overlap can be weighted by relative abundance
ps_euler( ps, group, fraction = 0, shape = "circle", weight = FALSE, relative = TRUE, plot = TRUE, ... )
ps_euler( ps, group, fraction = 0, shape = "circle", weight = FALSE, relative = TRUE, plot = TRUE, ... )
ps |
A phyloseq object |
group |
The grouping factor. Should match variable in sample_data(ps) |
fraction |
The fraction (0 to 1) of samples in a group in which the taxa should be present to be included in the count. |
shape |
Shape of the plot "circle" or "ellipse"? |
weight |
If TRUE, the overlaps are weighted by abundance |
relative |
Should abundances be made relative |
plot |
If TRUE return a plot, if FALSE return a list with shared and unique taxa |
... |
Additional arguments |
Any further arguments to this function are passed to the plot.euler function from the eulerr package. This can be used to change colors, fonts, and other graphical parameters. For example: ps_euler(phy, "Time", quantities = list(type=c("percent","counts"), font = 2), labels = list(cex = 2), col = "red", fill = c("red","blue","green"))
An euler plot
Make pheatmap directly from a phyloseq object, including agglomoration and filtering. Default color map is viridis with absent taxa as black.
ps_pheatmap( ps, annot_samp = NULL, annot_taxa = NULL, relative = TRUE, log10 = TRUE, tax_agg = NULL, order_taxa = TRUE, min_samples = 1, min_reads = 1, min_abundance = 0, label_rank = NULL, color = c("black", viridis::viridis(10)), ... )
ps_pheatmap( ps, annot_samp = NULL, annot_taxa = NULL, relative = TRUE, log10 = TRUE, tax_agg = NULL, order_taxa = TRUE, min_samples = 1, min_reads = 1, min_abundance = 0, label_rank = NULL, color = c("black", viridis::viridis(10)), ... )
ps |
A phyloseq object |
annot_samp |
Sample variables to annotate with |
annot_taxa |
Taxa variables to annotate with |
relative |
If TRUE, abundances total sum scaled. Default TRUE |
log10 |
If TRUE, log10 transform abundances. Default TRUE |
tax_agg |
Taxa rank to agglomorate. Default NULL |
order_taxa |
If TRUE, taxa are ordered from most to least abundant. Default TRUE |
min_samples |
Minimum number of samples the features should be present in. Default 1 |
min_reads |
Minimum number of total reads the features should have. Default 1 |
min_abundance |
Minimum mean relative abundance features should have. Default 0 |
label_rank |
Taxa rank to label the taxa. If NULL will label by taxa_names(ps). Default NULL |
color |
Color palette. Default viridis with absent |
... |
Additional arguments to pheatmap function |
A pheatmap
Prune taxa from phyloseq object by their prevalence or abundance
ps_prune(data, min.samples = 0, min.reads = 0, min.abundance = 0)
ps_prune(data, min.samples = 0, min.reads = 0, min.abundance = 0)
data |
|
min.samples |
Minimum number of samples the features should be present in. Default 0 |
min.reads |
Minimum number of total reads the features should have. Default 0 |
min.abundance |
Minimum mean relative abundance features should have. Default 0 |
Similar to input, but with features not reaching the criteria given grouped as "Others"
Relevel the Sample variable in a psmelted phyloseq object, such that similar samples are plotted together with ggplot barcharts
ps_refactor(psmelted, ...)
ps_refactor(psmelted, ...)
psmelted |
A phyloseq object melted into a data.frame with psmelt |
... |
Arguments passed to hclust |
Example: An OTU is annotated as Proteobacteria in the Phylum rank, but NAs in the more specific ranks. All annotations below Phylum for this OTU will be replaced with Phylum_Proteobacteria.
ps_tax_clean(data)
ps_tax_clean(data)
data |
|
Similar to input, but with NAs replaced in the tax_table
Make Venn diagram of shared taxa (ASVs, OTUs) across sample groups from a phyloseq object. Overlap can be weighted by relative abundance #' Any further arguments to this function are passed to the plot.venn function from the eulerr package. This can be used to change colors, fonts, and other graphical parameters. For example: ps_venn(phy, "Time", quantities = list(type=c("percent","counts"), font = 2), labels = list(cex = 2), col = "red", fill = c("red","blue","green"))
ps_venn( ps, group, fraction = 0, weight = FALSE, relative = TRUE, plot = TRUE, ... )
ps_venn( ps, group, fraction = 0, weight = FALSE, relative = TRUE, plot = TRUE, ... )
ps |
A phyloseq object |
group |
The grouping factor. Should match variable in sample_data(ps) |
fraction |
The fraction (0 to 1) of samples in a group in which the taxa should be present to be included in the count. |
weight |
If TRUE, the overlaps are weighted by abundance |
relative |
Should abundances be made relative |
plot |
If TRUE return a plot, if FALSE return a list with shared and unique taxa |
... |
Additional arguments |
An venn plot
Rarefy an OTU-table with the probability of the inverse 16S rRNA copy numbers: The result is a normalized AND copy number corrected OTU-table.
rarefy_rrna( x, reads, copy.database = "v13.5", seed = NULL, trim = FALSE, replace = FALSE ) rarefy_rrna.matrix(x, reads, copy.database, seed = NULL, trim, replace) rarefy_rrna.phyloseq(x, reads, copy.database, seed = NULL, trim, replace)
rarefy_rrna( x, reads, copy.database = "v13.5", seed = NULL, trim = FALSE, replace = FALSE ) rarefy_rrna.matrix(x, reads, copy.database, seed = NULL, trim, replace) rarefy_rrna.phyloseq(x, reads, copy.database, seed = NULL, trim, replace)
x |
A phyloseq object OR an OTU-table with taxa as columns and OTU names as colnames. OTUs should be picked against the Greengenes v13.5 database, unless another copy number database is provided. |
reads |
Number of reads to sample. |
copy.database |
What Greengenes database version was used to find OTUs. Atm only "v13.5" is available. Alternatively, A dataframe with two variables: "ID" is the OTU id matched by names in x and "Copy" is the copy number. |
seed |
Random seed for sampling. |
trim |
Should samples with less than the set amount of reads be trimmed away? |
replace |
Should reads be sampled with replacement? Default FALSE. If FALSE, rRNA copy numbers will have small effect if the rarefied depth is close to the actual depth. If TRUE, some taxa can end up with more reads in the rarefied matrix that they had in the input. |
A rarefied otu-table
Rarefaction curve on phyloseq object Theoretical richness with vegan's rarefy function
rcurve(physeq, subsamp = 10^c(1:5), trim = TRUE, add_sample_data = TRUE)
rcurve(physeq, subsamp = 10^c(1:5), trim = TRUE, add_sample_data = TRUE)
physeq |
phyloseq object |
subsamp |
Vector of number of reads to subsample |
trim |
Remove richness estimations from subsamples larger than the library size |
add_sample_data |
Add sample data to data.frame |
Standardized effect size of MPD (mean pairwise distance) separating taxa in two communities, a measure of phylogenetic beta diversity
ses.comdist( samp, dis, null.model = c("taxa.labels", "richness", "frequency", "sample.pool", "phylogeny.pool", "independentswap", "trialswap"), abundance.weighted = FALSE, runs = 999, iterations = 1000, cores = 1 )
ses.comdist( samp, dis, null.model = c("taxa.labels", "richness", "frequency", "sample.pool", "phylogeny.pool", "independentswap", "trialswap"), abundance.weighted = FALSE, runs = 999, iterations = 1000, cores = 1 )
samp |
Community data matrix with samples as rows |
dis |
Distance matrix (generally a phylogenetic distance matrix) |
null.model |
Null model to use (see Details section for description) |
abundance.weighted |
Should mean nearest taxon distances for each species be weighted by species abundance? (default = FALSE) |
runs |
Number of randomizations |
iterations |
Number of iterations to use for each randomization (for independent swap and trial null models) |
cores |
Number of cores to use for parallel computing |
Currently implemented null models (arguments to null.model):
taxa.labels - Shuffle distance matrix labels (across all taxa included in distance matrix)
richness - Randomize community data matrix abundances within samples (maintains sample species richness)
frequency - Randomize community data matrix abundances within species (maintains species occurence frequency)
sample.pool - Randomize community data matrix by drawing species from pool of species occurring in at least one community (sample pool) with equal probability
phylogeny.pool- Randomize community data matrix by drawing species from pool of species occurring in the distance matrix (phylogeny pool) with equal probability
independentswap - Randomize community data matrix with the independent swap algorithm (Gotelli 2000) maintaining species occurrence frequency and sample species richness
trialswap - Randomize community data matrix with the trial-swap algorithm (Miklos & Podani 2004) maintaining species occurrence frequency and sample species richness
A list of results:
ntaxa - Number of taxa in community
comdist.obs - Observed mpd between communities
comdist.rand.mean - Mean mpd between null communities
comdist.rand.sd - Standard deviation of mpd between null communities
comdist.obs.rank - Rank of observed mpd vs. null mpd
comdist.obs.z - Standardized effect size of mpd vs. null mpd (= (comdist.obs - comdist.rand.mean) / comdist.rand.sd, equivalent to -betaNRI)
comdist.obs.p - P-value (quantile) of observed mpd vs. null communities (= comdist.obs.rank / runs + 1)
runs - Number of randomizations
Standardized effect size of MPD (mean pairwise distance) separating taxa in two communities, a measure of phylogenetic beta diversity
ses.comdist2( samp, dis, method = "quasiswap", fixedmar = "both", shuffle = "both", strata = NULL, mtype = "count", burnin = 0, thin = 1, abundance.weighted = FALSE, runs = 999, cores = 1 )
ses.comdist2( samp, dis, method = "quasiswap", fixedmar = "both", shuffle = "both", strata = NULL, mtype = "count", burnin = 0, thin = 1, abundance.weighted = FALSE, runs = 999, cores = 1 )
samp |
Community data matrix with samples as rows |
dis |
Distance matrix (generally a phylogenetic distance matrix) |
method |
Character for method used for the swap algorithm ("swap", "tswap", "quasiswap", "backtrack"). If NULL no swap algorithm is applied (uses permatfull from vegan). If mtype="count" the "quasiswap", "swap", "swsh" and "abuswap" methods are available (see details). |
fixedmar |
Character, stating which of the row/column sums should be preserved ("none", "rows", "columns", "both"). |
shuffle |
Character, indicating whether individuals ("ind"), samples ("samp") or both ("both") should be shuffled, see details. |
strata |
Numeric vector or factor with length same as nrow(m) for grouping rows within strata for restricted permutations. Unique values or levels are used. |
mtype |
Matrix data type, either "count" for count data, or "prab" for presence-absence type incidence data. |
burnin |
Number of null communities discarded before proper analysis in sequential ("swap", "tswap") methods. |
thin |
Number of discarded permuted matrices between two evaluations in sequential ("swap", "tswap") methods. |
abundance.weighted |
Should mean nearest taxon distances for each species be weighted by species abundance? (default = FALSE) |
runs |
Number of randomizations |
cores |
Number of cores to use for parallel computing |
See permat (vegan) for detailed options
A list of results:
ntaxa - Number of taxa in community
comdist.obs - Observed mpd between communities
comdist.rand.mean - Mean mpd between null communities
comdist.rand.sd - Standard deviation of mpd between null communities
comdist.obs.rank - Rank of observed mpd vs. null mpd
comdist.obs.z - Standardized effect size of mpd vs. null mpd (= (comdist.obs - comdist.rand.mean) / comdist.rand.sd, equivalent to -betaNRI)
comdist.obs.p - P-value (quantile) of observed mpd vs. null communities (= comdist.obs.rank / runs + 1)
runs - Number of randomizations
Standardized effect size of MNTD (mean nearest taxon distance) separating taxa in two communities, a measure of phylogenetic beta diversity
ses.comdistnt( samp, dis, null.model = c("taxa.labels", "richness", "frequency", "sample.pool", "phylogeny.pool", "independentswap", "trialswap"), abundance.weighted = FALSE, exclude.conspecifics = FALSE, runs = 999, iterations = 1000, cores = 1 )
ses.comdistnt( samp, dis, null.model = c("taxa.labels", "richness", "frequency", "sample.pool", "phylogeny.pool", "independentswap", "trialswap"), abundance.weighted = FALSE, exclude.conspecifics = FALSE, runs = 999, iterations = 1000, cores = 1 )
samp |
Community data matrix with samples as rows |
dis |
Distance matrix (generally a phylogenetic distance matrix) |
null.model |
Null model to use (see Details section for description) |
abundance.weighted |
Should mean nearest taxon distances for each species be weighted by species abundance? (default = FALSE) |
exclude.conspecifics |
Should conspecific taxa in different communities be exclude from MNTD calculations? (default = FALSE) |
runs |
Number of randomizations |
iterations |
Number of iterations to use for each randomization (for independent swap and trial null models) |
cores |
Number of cores to use for parallel computing |
Currently implemented null models (arguments to null.model):
taxa.labels - Shuffle distance matrix labels (across all taxa included in distance matrix)
richness - Randomize community data matrix abundances within samples (maintains sample species richness)
frequency - Randomize community data matrix abundances within species (maintains species occurence frequency)
sample.pool - Randomize community data matrix by drawing species from pool of species occurring in at least one community (sample pool) with equal probability
phylogeny.pool- Randomize community data matrix by drawing species from pool of species occurring in the distance matrix (phylogeny pool) with equal probability
independentswap - Randomize community data matrix with the independent swap algorithm (Gotelli 2000) maintaining species occurrence frequency and sample species richness
trialswap - Randomize community data matrix with the trial-swap algorithm (Miklos & Podani 2004) maintaining species occurrence frequency and sample species richness
A list of results:
ntaxa - Number of taxa in community
comdistnt.obs - Observed mntd between communities
comdistnt.rand.mean - Mean mntd between null communities
comdistnt.rand.sd - Standard deviation of mntd between null communities
comdistnt.obs.rank - Rank of observed mntd vs. null mntd
comdistnt.obs.z - Standardized effect size of mntd vs. null mntd (= (comdistnt.obs - comdistnt.rand.mean) / comdistnt.rand.sd, equivalent to -betaNTI)
comdistnt.obs.p - P-value (quantile) of observed mntd vs. null communities (= comdistnt.obs.rank / runs + 1)
runs - Number of randomizations
Standardized effect size of MNTD (mean nearest taxon distance) separating taxa in two communities, a measure of phylogenetic beta diversity
ses.comdistnt2( samp, dis, method = "quasiswap", fixedmar = "both", shuffle = "both", strata = NULL, mtype = "count", burnin = 0, thin = 1, abundance.weighted = FALSE, exclude.conspecifics = FALSE, runs = 999, cores = 1 )
ses.comdistnt2( samp, dis, method = "quasiswap", fixedmar = "both", shuffle = "both", strata = NULL, mtype = "count", burnin = 0, thin = 1, abundance.weighted = FALSE, exclude.conspecifics = FALSE, runs = 999, cores = 1 )
samp |
Community data matrix with samples as rows |
dis |
Distance matrix (generally a phylogenetic distance matrix) |
method |
Character for method used for the swap algorithm ("swap", "tswap", "quasiswap", "backtrack"). If NULL no swap algorithm is applied (uses permatfull from vegan). If mtype="count" the "quasiswap", "swap", "swsh" and "abuswap" methods are available (see details). |
fixedmar |
Character, stating which of the row/column sums should be preserved ("none", "rows", "columns", "both"). |
shuffle |
Character, indicating whether individuals ("ind"), samples ("samp") or both ("both") should be shuffled, see details. |
strata |
Numeric vector or factor with length same as nrow(m) for grouping rows within strata for restricted permutations. Unique values or levels are used. |
mtype |
Matrix data type, either "count" for count data, or "prab" for presence-absence type incidence data. |
burnin |
Number of null communities discarded before proper analysis in sequential ("swap", "tswap") methods. |
thin |
Number of discarded permuted matrices between two evaluations in sequential ("swap", "tswap") methods. |
abundance.weighted |
Should mean nearest taxon distances for each species be weighted by species abundance? (default = FALSE) |
exclude.conspecifics |
Should conspecific taxa in different communities be exclude from MNTD calculations? (default = FALSE) |
runs |
Number of randomizations |
cores |
Number of cores to use for parallel computing |
See permat (vegan) for detailed options
A list of results:
ntaxa - Number of taxa in community
comdistnt.obs - Observed mntd between communities
comdistnt.rand.mean - Mean mntd between null communities
comdistnt.rand.sd - Standard deviation of mntd between null communities
comdistnt.obs.rank - Rank of observed mntd vs. null mntd
comdistnt.obs.z - Standardized effect size of mntd vs. null mntd (= (comdistnt.obs - comdistnt.rand.mean) / comdistnt.rand.sd, equivalent to -betaNTI)
comdistnt.obs.p - P-value (quantile) of observed mntd vs. null communities (= comdistnt.obs.rank / runs + 1)
runs - Number of randomizations
Parallel calculation of standardized effect size of mean nearest taxon distances in communities. When used with a phylogenetic distance matrix, equivalent to -1 times the Nearest Taxon Index (NTI).
ses.mntd.par( samp, dis, null.model = c("taxa.labels", "richness", "frequency", "sample.pool", "phylogeny.pool", "independentswap", "trialswap"), abundance.weighted = FALSE, runs = 999, iterations = 1000, cores = 1 )
ses.mntd.par( samp, dis, null.model = c("taxa.labels", "richness", "frequency", "sample.pool", "phylogeny.pool", "independentswap", "trialswap"), abundance.weighted = FALSE, runs = 999, iterations = 1000, cores = 1 )
samp |
Community data matrix with samples as rows |
dis |
Distance matrix (generally a phylogenetic distance matrix) |
null.model |
Null model to use (see Details section for description) |
abundance.weighted |
Should mean nearest taxon distances for each species be weighted by species abundance? (default = FALSE) |
runs |
Number of randomizations |
iterations |
Number of iterations to use for each randomization (for independent swap and trial null models) |
cores |
Number of cores to use for parallel computing |
Faster than ses.mntd from picante
when there are many samples and taxa.
Currently implemented null models (arguments to null.model):
taxa.labels - Shuffle distance matrix labels (across all taxa included in distance matrix)
richness - Randomize community data matrix abundances within samples (maintains sample species richness)
frequency - Randomize community data matrix abundances within species (maintains species occurence frequency)
sample.pool - Randomize community data matrix by drawing species from pool of species occurring in at least one community (sample pool) with equal probability
phylogeny.pool- Randomize community data matrix by drawing species from pool of species occurring in the distance matrix (phylogeny pool) with equal probability
independentswap - Randomize community data matrix with the independent swap algorithm (Gotelli 2000) maintaining species occurrence frequency and sample species richness
trialswap - Randomize community data matrix with the trial-swap algorithm (Miklos & Podani 2004) maintaining species occurrence frequency and sample species richness
A data frame of results for each community
ntaxaNumber of taxa in community
mntd.obsObserved mntd in community
mntd.rand.meanMean mntd in null communities
mntd.rand.sdStandard deviation of mntd in null communities
mntd.obs.rankRank of observed mntd vs. null communities
mntd.obs.zStandardized effect size of mntd vs. null communities (= (mntd.obs - mntd.rand.mean) / mntd.rand.sd, equivalent to -NRI)
mntd.obs.pP-value (quantile) of observed mntd vs. null communities (= mntd.obs.rank / runs + 1)
runsNumber of randomizations
Parallel calculation of standardized effect size of mean pairwise distances in communities. When used with a phylogenetic distance matrix, equivalent to -1 times the Nearest Relative Index (NRI).
ses.mpd.par( samp, dis, null.model = c("taxa.labels", "richness", "frequency", "sample.pool", "phylogeny.pool", "independentswap", "trialswap"), abundance.weighted = FALSE, runs = 999, iterations = 1000, cores = 1 )
ses.mpd.par( samp, dis, null.model = c("taxa.labels", "richness", "frequency", "sample.pool", "phylogeny.pool", "independentswap", "trialswap"), abundance.weighted = FALSE, runs = 999, iterations = 1000, cores = 1 )
samp |
Community data matrix with samples as rows |
dis |
Distance matrix (generally a phylogenetic distance matrix) |
null.model |
Null model to use (see Details section for description) |
abundance.weighted |
Should mean nearest taxon distances for each species be weighted by species abundance? (default = FALSE) |
runs |
Number of randomizations |
iterations |
Number of iterations to use for each randomization (for independent swap and trial null models) |
cores |
Number of cores to use for parallel computing |
Faster than ses.mpd from picante
when there are many samples and taxa.
Currently implemented null models (arguments to null.model):
taxa.labels - Shuffle distance matrix labels (across all taxa included in distance matrix)
richness - Randomize community data matrix abundances within samples (maintains sample species richness)
frequency - Randomize community data matrix abundances within species (maintains species occurence frequency)
sample.pool - Randomize community data matrix by drawing species from pool of species occurring in at least one community (sample pool) with equal probability
phylogeny.pool- Randomize community data matrix by drawing species from pool of species occurring in the distance matrix (phylogeny pool) with equal probability
independentswap - Randomize community data matrix with the independent swap algorithm (Gotelli 2000) maintaining species occurrence frequency and sample species richness
trialswap - Randomize community data matrix with the trial-swap algorithm (Miklos & Podani 2004) maintaining species occurrence frequency and sample species richness
A data frame of results for each community
ntaxaNumber of taxa in community
mpd.obsObserved mpd in community
mpd.rand.meanMean mpd in null communities
mpd.rand.sdStandard deviation of mpd in null communities
mpd.obs.rankRank of observed mpd vs. null communities
mpd.obs.zStandardized effect size of mpd vs. null communities (= (mpd.obs - mpd.rand.mean) / mpd.rand.sd, equivalent to -NRI)
mpd.obs.pP-value (quantile) of observed mpd vs. null communities (= mpd.obs.rank / runs + 1)
runsNumber of randomizations
Permutation test of z-matrix from ses.comdist(2), ses.comdistnt and ses.UniFrac. Test if means within or between groups is higher or lower than the overall mean of the matrix
ses.permtest(zmat, sampleGroups, R = 10000)
ses.permtest(zmat, sampleGroups, R = 10000)
zmat |
Symmetric matrix with z-values |
sampleGroups |
Vector with the grouping of the samples in the same order as in zmat |
R |
Number of permutations |
A dataframe with p-values (pval), fdr corrected p-values (pval.adj), and average z-values (avg) for each group and combination of groups.
Standardized effect size of unifrac
ses.UniFrac( physeq, method = "taxa.labels", fixedmar = "both", shuffle = "both", strata = NULL, mtype = "count", burnin = 0, thin = 1, weighted = TRUE, normalized = TRUE, runs = 99, cores = 1 )
ses.UniFrac( physeq, method = "taxa.labels", fixedmar = "both", shuffle = "both", strata = NULL, mtype = "count", burnin = 0, thin = 1, weighted = TRUE, normalized = TRUE, runs = 99, cores = 1 )
physeq |
phyloseq-class, containing at minimum a phylogenetic tree and otu table |
method |
"taxa.labels" shuffles labels in phylogenetic tree (Ignores fixedmar, shuffle, strata, mtype). If NULL then no swap algorithm is applied (i.e. uses permatfull from vegan). Else the method used for the swap algorithm ("swap", "tswap", "quasiswap", "backtrack"). If mtype="count" the "quasiswap", "swap", "swsh" and "abuswap" methods are available (see details). |
fixedmar |
Character, stating which of the row/column sums should be preserved ("none", "rows", "columns", "both"). |
shuffle |
Character, indicating whether individuals ("ind"), samples ("samp") or both ("both") should be shuffled. |
strata |
Numeric vector or factor for grouping samples within strata for restricted permutations. Unique values or levels are used. |
mtype |
Matrix data type, either "count" for count data, or "prab" for presence-absence type incidence data. |
burnin |
Number of null communities discarded before proper analysis in sequential ("swap", "tswap") methods. |
thin |
Number of discarded permuted matrices between two evaluations in sequential ("swap", "tswap") methods. |
weighted |
Should unifrac be weighted by species abundance? (default = TRUE) |
normalized |
(Optional). Logical. Should the output be normalized such that values range from 0 to 1 independent of branch length values? Default is TRUE. Note that (unweighted) UniFrac is always normalized by total branch-length, and so this value is ignored when weighted == FALSE. |
runs |
Number of randomizations |
cores |
Number of cores to use for UniFrac of null communities. Default is 1 |
See permat (vegan) for detailed options on permutation
A list of results:
unifrac.obs - Observed unifrac between communities
unifrac.rand.mean - Mean unifrac between null communities
unifrac.rand.sd - Standard deviation of unifrac between null communities
unifrac.obs.rank - Rank of observed unifrac vs. null unifrac
unifrac.obs.z - Standardized effect size of unifrac vs. null unifrac (= (unifrac.obs - unifrac.rand.mean) / unifrac.rand.sd)
unifrac.obs.p - P-value (quantile) of observed unifrac vs. null communities (= unifrac.obs.rank / runs + 1)
UniFrac
multiple times in parallel and take the averageWith unrooted phylogenies UniFrac
sets the root randomly on the tree. The position of the root affects the results.
The function runs UniFrac multiple times, with different roots, and takes the average to smooth potential bias.
UniFrac.multi(physeq, R = 100, seed = 42, cores = 1, ...)
UniFrac.multi(physeq, R = 100, seed = 42, cores = 1, ...)
physeq |
Phyloseq object. Required |
R |
Number of times to repeat calculation. Default 100 |
seed |
Random seed for reproducibility. Default 42 |
cores |
Number of cores to use for parallel computing. Default 1 aka not parallel |
... |
Additional arguments passed to the |
A distance object with the average UniFrac
distances
Robust distance-based multivariate analysis of variance (https://doi.org/10.1186/s40168-019-0659-9)
WdS.test(dm, f, nrep = 999, strata = NULL)
WdS.test(dm, f, nrep = 999, strata = NULL)
dm |
Distance matrix |
f |
Factor |
nrep |
Number of permutations |
strata |
Factor for permuting in strata |
This code is taken from https://github.com/alekseyenko/WdStar/
A list with a p-value, test statitic, and number of permutation