Title: | Multivariate Methods with Unbiased Variable Selection |
---|---|
Description: | Predictive multivariate modelling for metabolomics. Types: Classification and regression. Methods: Partial Least Squares, Random Forest and Elastic Net. Data structures: Paired and unpaired. Validation: Repeated double cross-validation (Westerhuis et al. (2008) <doi:10.1007/s11306-007-0099-6>, Filzmoser et al. (2009) <doi:10.1002/cem.1225>). Variable selection: Performed internally, through tuning in the inner cross-validation loop. |
Authors: | Carl Brunius [aut], Yingxiao Yan [aut, cre] |
Maintainer: | Yingxiao Yan <[email protected]> |
License: | GPL-3 |
Version: | 0.1.0 |
Built: | 2024-11-17 05:41:44 UTC |
Source: | https://github.com/metabocomp/muvr2 |
Makes a biplot of a fitted object (e.g. from a MUVR with PLS core).
biplotPLS( fit, comps = 1:2, xCol, labPlSc = TRUE, labs, vars, labPlLo = TRUE, pchSc = 16, colSc, colLo = 2, supLeg = FALSE )
fit |
A PLS fit (e.g. from MUVRclassObject$Fit[[2]]) |
comps |
Which components to plot |
xCol |
(Optional) Continuous vector for grey scale gradient of observation (sample) color (e.g. Y vector in regression analysis) |
labPlSc |
Boolean to plot observation (sample) names (defaults to TRUE) |
labs |
(Optional) Label names |
vars |
Which variables to plot (names in rownames(loadings)) |
labPlLo |
Boolean to plot variable names (defaults to TRUE) |
pchSc |
Plotting character for observation scores |
colSc |
Colors for observation scores (only if xCol omitted) |
colLo |
Colors for variable loadings (defaults to red) |
supLeg |
Boolean for whether to suppress legends |
A PLS biplot
data("freelive2")
nRep <- 2         # Number of MUVR2 repetitions
nOuter <- 3       # Number of outer cross-validation segments
varRatio <- 0.75  # Proportion of variables kept per iteration
method <- 'PLS'   # Selected core modeling algorithm
regrModel <- MUVR2(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, method = method, modReturn = TRUE)
biplotPLS(regrModel$Fit[[2]], comps = 1:2, xCol = YR2, labPlSc = FALSE, labPlLo = FALSE)
Checks whether the supplied parameters contradict each other and validates the structure of the data. If a problem is found, warning messages are given.
checkinput( X, Y, ML, DA, method, fitness, nInner, nOuter, varRatio, scale, modReturn, logg, parallel )
X |
The original data of X, not the result of one-hot encoding |
Y |
The original data of Y |
ML |
ML in MUVR2 |
DA |
DA in MUVR2 |
method |
RF or PLS so far in MUVR2 |
fitness |
fitness in MUVR2 |
nInner |
nInner in MUVR2 |
nOuter |
nOuter in MUVR2 |
varRatio |
varRatio in MUVR2 |
scale |
scale in MUVR2 |
modReturn |
modReturn in MUVR2 |
logg |
logg in MUVR2 |
parallel |
parallel in MUVR2 |
correct_input: the original input (call) and the actual input used in MUVR2
data("freelive2")
checkinput(X = XRVIP2,
           Y = YR2,  ## YR2 a numeric variable
           DA = FALSE,
           fitness = "RMSEP")
Make a confusion matrix from a MUVR object.
confusionMatrix(MVObj, model = "mid")
MVObj |
A MUVR object (classification analysis) |
model |
min, mid or max model |
A confusion matrix of actual vs predicted class
data("mosquito")
data("crisp")
nRep <- 2        # Number of MUVR2 repetitions
nOuter <- 4      # Number of outer cross-validation segments
varRatio <- 0.6  # Proportion of variables kept per iteration
classModel <- MUVR2_EN(X = Xotu, Y = Yotu, nRep = nRep, nOuter = nOuter, DA = TRUE, modReturn = TRUE)
confusionMatrix(classModel)
MLModel <- MUVR2(X = crispEM, ML = TRUE, nRep = nRep, nOuter = nOuter, varRatio = varRatio, method = "RF", modReturn = TRUE)
confusionMatrix(MLModel)
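As a rough illustration of what a confusion matrix of actual vs predicted class contains, the base R sketch below cross-tabulates two label vectors. The labels are hypothetical and the code does not call confusionMatrix() itself; how the package extracts predictions from a MUVR object is not shown here.

```r
# Hypothetical labels for illustration only (not package data)
actual    <- factor(c("A", "A", "B", "B", "C", "C"), levels = c("A", "B", "C"))
predicted <- factor(c("A", "B", "B", "B", "C", "A"), levels = c("A", "B", "C"))

# Rows: actual class; columns: predicted class
cm <- table(actual = actual, predicted = predicted)
print(cm)

# Correct classifications lie on the diagonal
n_correct <- sum(diag(cm))
```

Off-diagonal cells show which classes are confused with each other, which a single error count cannot reveal.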
Effect matrix for the crisp multilevel tutorial
data(crisp)
Make custom parameters for MUVR internal modelling, not rdCV. Please note that, at present, there is no mtryMax for the outer (consensus) loop in effect.
customParams( method = c("RF", "PLS", "SVM", "ANN"), robust = 0.05, ntreeIn = 150, ntreeOut = 300, mtryMaxIn = 150, compMax = 5, nodes = 200, threshold = 0.1, stepmax = 1e+08, neuralMaxIn = 10, kernel = "notkernel", nu = 0.1, gamma = 1, degree = 1, oneHot, NZV, rfMethod = c("randomForest", "ranger"), svmMethod = c("svm", "ksvm", "svmlight"), annMethod = c("nnet", "neuralnet") )
method |
Core modelling method: 'RF' (default), 'PLS', 'SVM' or 'ANN' |
robust |
Robustness (slack) criterion for determining min and max knees (defaults to 0.05) |
ntreeIn |
RF parameter: Number of trees in inner cross-validation loop models (defaults to 150) |
ntreeOut |
RF parameter: Number of trees in outer (consensus) cross-validation loop models (defaults to 300) |
mtryMaxIn |
RF parameter: Max number of variables to sample from at each node in the inner CV loop (defaults to 150). Will be further limited by standard RF rules (see randomForest documentation) |
compMax |
PLS parameter: Maximum number of PLS components (defaults to 5) |
nodes |
ann parameter: |
threshold |
ann parameter: |
stepmax |
ann parameter: |
neuralMaxIn |
ANN parameter: Maximum number of ANN (defaults to 10) |
kernel |
SVM parameter: Kernel function to use, which includes sigmoid, radial, polynomial |
nu |
SVM parameter: Ratio of errors allowed in the training set, ranging from 0 to 1 |
gamma |
SVM parameter: Needed for "vanilladot", "polydot", "rbfdot" kernel in svm |
degree |
SVM parameter: Needed for polynomial kernel in svm |
oneHot |
TRUE or FALSE: whether to use one-hot encoding |
NZV |
TRUE or FALSE: whether to apply near-zero variance filtering |
rfMethod |
Random forest method, which includes randomForest and ranger |
svmMethod |
Support vector machine method, which includes svm, ksvm and svmlight |
annMethod |
Artificial neural network method, which includes nnet and neuralnet |
a 'methParam' object
# Standard parameters for random forest
methParam <- customParams()
# or
methParam <- customParams('RF')
# Custom ntreeOut parameters for random forest
methParam <- customParams('RF', ntreeOut = 50)
# or
methParam <- customParams('RF')
methParam$ntreeOut <- 50
methParam
Get Root Mean Square Error of Prediction (RMSEP) in classification.
get_rmsep(actual, predicted)
actual |
Vector of actual classifications of samples |
predicted |
Vector of predicted classifications of samples |
RMSEP
data("freelive2")  # YR2 is part of the freelive2 data
actual <- YR2
predicted <- sampling_from_distribution(actual)
get_rmsep(actual, predicted)
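For reference, RMSEP is conventionally the square root of the mean squared difference between actual and predicted values. The sketch below uses the hypothetical helper rmsep_sketch() and made-up numbers; it is assumed, not confirmed, that get_rmsep() follows this standard definition.

```r
# Hypothetical helper: standard RMSEP definition (assumed, not taken from the package)
rmsep_sketch <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  sqrt(mean((actual - predicted)^2))
}

# Made-up values: only the last prediction is off, by 2 units
actual    <- c(1, 2, 3, 4)
predicted <- c(1, 2, 3, 6)
rmsep_value <- rmsep_sketch(actual, predicted)
rmsep_value
```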
Get Balanced Error Rate (BER) in classification.
getBER(actual, predicted, weigh_added = FALSE, weighing_matrix)
actual |
Vector of actual classifications of samples |
predicted |
Vector of predicted classifications of samples |
weigh_added |
Boolean for whether to add a weighing matrix in classification |
weighing_matrix |
The matrix used to get a misclassification score |
BER
data("mosquito")
actual <- Yotu
predicted <- sampling_from_distribution(actual)
getBER(actual, predicted)
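The balanced error rate is commonly defined as the mean of the per-class error rates, so that small classes count as much as large ones. The following sketch uses the hypothetical helper ber_sketch() on made-up labels and covers the unweighted case only; whether getBER() computes exactly this, and how weighing_matrix enters, is an assumption not verified here.

```r
# Hypothetical helper: unweighted BER as the mean of per-class error rates
ber_sketch <- function(actual, predicted) {
  actual <- factor(actual)
  predicted <- factor(predicted, levels = levels(actual))
  cm <- table(actual, predicted)                    # rows: actual, columns: predicted
  per_class_error <- 1 - diag(cm) / rowSums(cm)     # error rate within each actual class
  mean(per_class_error)
}

# Made-up labels: class A has 1 of 4 wrong, class B has 1 of 2 wrong
actual    <- c("A", "A", "A", "A", "B", "B")
predicted <- c("A", "A", "A", "B", "B", "A")
ber_value <- ber_sketch(actual, predicted)
ber_value
```

Here the overall error rate is 2/6, whereas the balanced error rate is (0.25 + 0.5) / 2, which penalises the poorer performance on the smaller class.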
Get number of misclassifications from classification analysis.
getMISS(actual, predicted, weigh_added = FALSE, weighing_matrix)
actual |
Vector of actual classifications of samples |
predicted |
Vector of predicted classifications of samples |
weigh_added |
Boolean, add a weighing matrix when it is classification |
weighing_matrix |
The matrix used to get a misclassification score |
number of misclassifications
data("mosquito")
actual <- Yotu
predicted <- sampling_from_distribution(actual)
getMISS(actual, predicted)
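In the unweighted case, the number of misclassifications is simply the count of samples whose predicted class differs from the actual class. A minimal sketch with the hypothetical helper miss_sketch() and made-up labels follows; the weighted variant using weighing_matrix is not covered because its exact semantics are not described here.

```r
# Hypothetical helper: count of samples where predicted and actual class differ
miss_sketch <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  # Compare as character so that factors with different level sets are handled
  sum(as.character(actual) != as.character(predicted))
}

# Made-up labels: samples 2 and 4 are misclassified
actual    <- factor(c("A", "A", "B", "B", "C"))
predicted <- factor(c("A", "B", "B", "C", "C"))
miss_value <- miss_sketch(actual, predicted)
miss_value
```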
Obtain the min, mid, or max number of variables for an object generated from the rdCVnet() function.
getVar( rdCVnetObject, option = c("quantile", "fitness"), fit_curve = c("loess", "gam"), span = 1, k = 5, outlier = c("none", "IQR", "residual"), robust = 0.05, quantile = 0.25 )
rdCVnetObject |
an object obtained from the rdCVnet() function |
option |
quantile or fitness: which way to perform variable selection |
fit_curve |
gam or loess method for fitting the curve in the fitness option |
span |
parameter for using loess to fit curve in the fitness option: how smooth the curve needs to be |
k |
parameter for using gam to fit curve in the fitness option |
outlier |
Whether to remove outlier variables or not. There are 3 options: "none", "IQR", "residual" |
robust |
If the option is fitness, the robust parameter decides how much deviation from the optimal prediction performance is allowed for the min and max variable selection |
quantile |
if the option is quantile, this value decides the cut for the first quantile, ranging from 0 to 0.5 |
a rdCVnet object
data("mosquito")
nRep <- 2
nOuter <- 4
varRatio <- 0.6
classModel <- MUVR2_EN(X = Xotu, Y = Yotu, nRep = nRep, nOuter = nOuter, DA = TRUE, modReturn = TRUE)
classModel <- getVar(classModel)
Extract autoselected variables from MUVR model object.
getVIRank(MUVRclassObject, model = "mid", n, all = FALSE)
MUVRclassObject |
an object of MUVR class |
model |
which model to use ("min", "mid" (default), or "max") |
n |
customize values |
all |
logical, whether to get the ranks of all variables or not |
data frame with order, name and average rank of variables ('order', 'name' & 'rank')
data("freelive2")
nRep <- 2
nOuter <- 4
varRatio <- 0.6
regrModel <- MUVR2(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, varRatio = varRatio, method = "PLS", modReturn = TRUE)
getVIRank(regrModel, model = "min")
Make reference distribution for resampling tests to assess overfitting.
H0_reference(Y, n = 1000, fitness = c("Q2", "BER", "MISS", "AUROC"), ...)
Y |
the target variable |
n |
number of permutations to run |
fitness |
fitness metric used for the reference distribution: 'Q2', 'BER', 'MISS' or 'AUROC' |
... |
additional arguments for sampling from distribution |
a histogram of reference distribution
data("freelive2")
H0_reference(YR2)
This function extracts data and parameter settings from a MUVR object and runs a standard permutation or resampling test. This fits the standard case of multivariate predictive modelling in a regression, classification or multilevel setting. However, if an analysis has a complex sample dependency that requires constrained permutation of the response vector, or if variable pre-selection is performed to decrease the computational burden, then permutation/resampling loops should be constructed manually. In those cases, View(H0_test) can be a starting point from which to build custom solutions for permutation analysis.
H0_test( MUVRclassObject, n = 50, nRep, nOuter, varRatio, parallel, type = c("resampling", "permutation") )
MUVRclassObject |
a 'MUVR' class object |
n |
number of permutations to run |
nRep |
number of repetitions for each permutation (defaults to value of actual model) |
nOuter |
number of outer validation segments for each permutation (defaults to value of actual model) |
varRatio |
varRatio for each permutation (defaults to value of actual model) |
parallel |
whether to run calculations using parallel processing which requires registered backend (defaults to parallelization for the actual model) |
type |
either permutation or resampling: decides whether the sampling is performed on the original Y values or on the probabilities (if Y is categorical) or distribution (if Y is continuous) of the Y values, respectively |
permutation_output: A permutation matrix with permuted fitness statistics (nrow=n and ncol=3 for min/mid/max)
data("freelive2")
nRep <- 2
nOuter <- 4
varRatio <- 0.6
regrModel <- MUVR2(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, varRatio = varRatio, method = "PLS", modReturn = TRUE)
H0_test(regrModel)
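The difference between the two test types can be sketched in base R on made-up data. Permutation shuffles the observed Y values, while resampling draws new values from a distribution estimated from Y. The use of a normal distribution for continuous Y and of class proportions for categorical Y below is an assumption made for illustration; the package's own sampling_from_distribution() may work differently.

```r
set.seed(1)

# Made-up continuous response
y_num <- c(2.1, 3.4, 1.8, 5.0, 4.2)

# Permutation: the same values in shuffled order
y_perm <- sample(y_num)

# Resampling (assumed form): new values drawn from a normal distribution fitted to Y
y_resamp_num <- rnorm(length(y_num), mean = mean(y_num), sd = sd(y_num))

# Made-up categorical response
y_cat <- factor(c("a", "a", "b", "b", "b", "c"))

# Resampling (assumed form): classes drawn according to the observed class proportions
class_prob <- table(y_cat) / length(y_cat)
y_resamp_cat <- factor(
  sample(names(class_prob), length(y_cat), replace = TRUE, prob = as.numeric(class_prob)),
  levels = levels(y_cat)
)
```

In a manual loop, each such Y vector would be modelled with the same settings as the actual model and its fitness stored to build the null distribution.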
Subject identifiers for the rye metabolomics regression tutorial
data(freelive)
Subject identifiers for the rye metabolomics regression tutorial, using unique individuals
data(freelive2)
Merge two MUVR class objects that use regression for PLS or RF methods. The resultant MUVR class object has the same indata except that nRep is different.
mergeModels(MV1, MV2)
MV1 |
a MUVR class Object |
MV2 |
a MUVR class Object |
A merged MUVR class object
data("freelive2")
nRep <- 2
nOuter <- 4
varRatio <- 0.6
regrModel <- MUVR2(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, varRatio = varRatio, method = "PLS", modReturn = TRUE)
mergedModel <- mergeModels(regrModel, regrModel)
"Multivariate modelling with Unbiased Variable selection" using PLS and RF. Repeated double cross validation with tuning of variables in the inner loop.
MUVR2( X, Y, ID, scale = TRUE, nRep = 5, nOuter = 6, nInner, varRatio = 0.75, DA = FALSE, fitness = c("AUROC", "MISS", "BER", "RMSEP", "wBER", "wMISS"), method = c("PLS", "RF", "ANN", "SVM"), methParam, ML = FALSE, modReturn = FALSE, logg = FALSE, parallel = TRUE, weigh_added = FALSE, weighing_matrix = NULL, keep, ... )
X |
Predictor variables. NB: Variables (columns) must have names/unique identifiers. NAs not allowed in data. For multilevel, only the positive half of the difference matrix is specified. |
Y |
Response vector (Dependent variable). For classification, a factor (or character) variable should be used. For multilevel, Y is calculated automatically. |
ID |
Subject identifier (for sampling by subject; Assumption of independence if not specified) |
scale |
If TRUE, the predictor variable matrix is scaled to unit variance for PLS modeling. |
nRep |
Number of repetitions of double CV. (Defaults to 5) |
nOuter |
Number of outer CV loop segments. (Defaults to 6) |
nInner |
Number of inner CV loop segments. (Defaults to nOuter - 1) |
varRatio |
Ratio of variables to include in subsequent inner loop iteration. (Defaults to 0.75) |
DA |
Boolean for Classification (discriminant analysis) (By default, if Y is numeric -> DA = FALSE. If Y is factor (or character) -> DA = TRUE) |
fitness |
Fitness function for model tuning (choose either 'AUROC' or 'MISS' (default) for classification; or 'RMSEP' (default) for regression.) |
method |
Multivariate method. Supports 'PLS' and 'RF' (default) |
methParam |
List with parameter settings for specified MV method (see function code for details) |
ML |
Boolean for multilevel analysis (defaults to FALSE) |
modReturn |
Boolean for returning outer segment models (defaults to FALSE). Setting modReturn = TRUE is required for making MUVR predictions using predMV(). |
logg |
Boolean for whether to sink model progressions to 'log.txt' |
parallel |
Boolean for whether to perform 'foreach' parallel processing (Requires a registered parallel backend; Defaults to 'TRUE') |
weigh_added |
Boolean for whether to add a weighing matrix in classification |
weighing_matrix |
The matrix used to get a misclassification score |
keep |
Confounder variables can be added. NB: Variables (columns) must match column names. |
... |
additional argument |
A 'MUVR' object
data(freelive2)
nRep <- 2        # Number of MUVR2 repetitions
nOuter <- 3      # Number of outer cross-validation segments
varRatio <- 0.6  # Proportion of variables kept per iteration
method <- 'PLS'  # Selected core modeling algorithm
regrModel <- MUVR2(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, varRatio = varRatio, method = method, modReturn = TRUE)
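To get a feel for how varRatio shapes the inner loop, the sketch below lists how many variables would remain at each iteration if a fixed proportion were kept every time. The helper var_schedule() is hypothetical, and the rounding rule (floor, with a minimum of one variable) is an assumption; the exact schedule used inside MUVR2 may differ.

```r
# Hypothetical helper: number of variables kept at each inner-loop iteration,
# assuming floor(n * var_ratio) is kept each time until one variable remains
var_schedule <- function(n_var, var_ratio = 0.75) {
  stopifnot(n_var >= 1, var_ratio > 0, var_ratio < 1)
  counts <- n_var
  while (tail(counts, 1) > 1) {
    nxt <- floor(tail(counts, 1) * var_ratio)
    counts <- c(counts, max(nxt, 1))
  }
  counts
}

schedule <- var_schedule(20, 0.75)
schedule
```

A lower varRatio gives fewer iterations and faster computation at the cost of a coarser search over the number of variables.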
"Multivariate modelling with Unbiased Variable selection" using Elastic Net (EN). Repeated double cross validation with tuning of variables using Elastic Net.
MUVR2_EN( X, Y, ID, alow = 1e-05, ahigh = 1, astep = 11, alog = TRUE, nRep = 5, nOuter = 6, nInner, NZV = TRUE, DA = FALSE, fitness = c("AUROC", "MISS", "BER", "RMSEP", "wBER", "wMISS"), methParam, ML = FALSE, modReturn = FALSE, parallel = TRUE, keep = NULL, weigh_added = FALSE, weighing_matrix = NULL, ... )
X |
Predictor variables. NB: Variables (columns) must have names/unique identifiers. NAs not allowed in data. For multilevel, only the positive half of the difference matrix is specified. |
Y |
Response vector (Dependent variable). For classification, a factor (or character) variable should be used. For multilevel, Y is calculated automatically. |
ID |
Subject identifier (for sampling by subject; Assumption of independence if not specified) |
alow |
alpha tuning: lowest value of alpha |
ahigh |
alpha tuning: highest value of alpha |
astep |
alpha tuning: number of alphas to try from low to high |
alog |
alpha tuning: Whether to space tuning of alpha in logarithmic scale (TRUE; default) or normal/arithmetic scale (FALSE) |
nRep |
Number of repetitions of double CV. (Defaults to 5) |
nOuter |
Number of outer CV loop segments. (Defaults to 6) |
nInner |
Number of inner CV loop segments. (Defaults to nOuter-1) |
NZV |
Boolean for whether to filter out near zero variance variables (defaults to TRUE) |
DA |
Boolean for Classification (discriminant analysis) (By default, if Y is numeric -> DA=FALSE. If Y is factor (or character) -> DA=TRUE) |
fitness |
Fitness function for model tuning (choose either 'AUROC' or 'MISS' (default) for classification; or 'RMSEP' (default) for regression.) |
methParam |
List with parameter settings for specified MV method (see function code for details) |
ML |
Boolean for multilevel analysis (defaults to FALSE) |
modReturn |
Boolean for returning outer segment models (defaults to FALSE). Setting modReturn=TRUE is required for making MUVR predictions using predMV(). |
parallel |
Boolean for whether to perform 'foreach' parallel processing (Requires a registered parallel backend; Defaults to 'TRUE') |
keep |
A group of confounders that you want to manually set as non-zero |
weigh_added |
Boolean for whether to add a weighing matrix in classification |
weighing_matrix |
The matrix used to get a misclassification score |
... |
Pass additional arguments |
A MUVR object
data("freelive2")
nRep <- 2    # Number of MUVR2 repetitions
nOuter <- 4  # Number of outer cross-validation segments
regrModel <- MUVR2_EN(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, modReturn = TRUE)
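The alpha tuning arguments (alow, ahigh, astep, alog) describe a grid of candidate alpha values. One plausible way to build such a grid is sketched below with the hypothetical helper alpha_grid(); it is an assumption that MUVR2_EN constructs its grid in this way.

```r
# Hypothetical helper: grid of astep alpha values between alow and ahigh,
# spaced evenly on a logarithmic scale (alog = TRUE) or an arithmetic scale (alog = FALSE)
alpha_grid <- function(alow = 1e-5, ahigh = 1, astep = 11, alog = TRUE) {
  if (alog) {
    exp(seq(log(alow), log(ahigh), length.out = astep))
  } else {
    seq(alow, ahigh, length.out = astep)
  }
}

alphas_log <- alpha_grid()              # 1e-5, ..., 1 with a constant ratio between steps
alphas_lin <- alpha_grid(alog = FALSE)  # 1e-5, ..., 1 with a constant difference between steps
```

With the default settings, logarithmic spacing places most candidates close to zero, whereas arithmetic spacing spreads them evenly between alow and ahigh.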
Adapted and stripped down from mixOmics v 5.2.0 (https://cran.r-project.org/web/packages/mixOmics/).
nearZeroVar(x, freqCut = 95/5, uniqueCut = 10)
x |
a numeric vector or matrix, or a data frame with all numeric data. |
freqCut |
the cutoff for the ratio of the most common value to the second most common value. |
uniqueCut |
the cutoff for the percentage of distinct values out of the number of total samples. |
nzv object
data("freelive2")
nearZeroVar(XRVIP2)
data("mosquito")
nearZeroVar(Xotu)
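The two cutoffs can be illustrated on a single vector. The sketch below uses the hypothetical helper nzv_sketch() to compute the frequency ratio and the percentage of distinct values described for freqCut and uniqueCut. The rule for combining them (flag when the ratio exceeds freqCut and the percentage is at most uniqueCut) follows the common convention for near-zero variance filters and is an assumption about this implementation.

```r
# Hypothetical helper: near-zero variance metrics for a single vector
nzv_sketch <- function(x, freqCut = 95 / 5, uniqueCut = 10) {
  counts <- sort(table(x), decreasing = TRUE)
  # Ratio of the most common value to the second most common value
  freq_ratio <- if (length(counts) > 1) {
    as.numeric(counts[1]) / as.numeric(counts[2])
  } else {
    Inf  # only one distinct value: zero variance
  }
  # Percentage of distinct values out of the total number of samples
  percent_unique <- 100 * length(counts) / length(x)
  list(freq_ratio = freq_ratio,
       percent_unique = percent_unique,
       flagged = freq_ratio > freqCut && percent_unique <= uniqueCut)
}

mostly_zero <- nzv_sketch(c(rep(0, 98), 1, 2))  # dominated by a single value
varied      <- nzv_sketch(seq_len(100))         # every value distinct
```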
Each factor or character variable with n categories (n > 2) is transformed to n variables. Each factor or character variable with 2 categories is transformed to one 0/1 numeric dummy variable. Each factor or character variable with 1 category is transformed to one numeric variable that only has the value 1. Each factor or character variable with 0 categories is transformed to one numeric variable that only has the value -999. Each logical variable is transformed to one 0/1 numeric dummy variable.
onehotencoding(X)
X |
data frame data with numeric, factor, character and/or logical variables |
matrix with all variables transformed to numeric variables
# Test the scenario where X contains factor and character variables when using PLS.
# Factor, character and logical variables are added to the freelive X data
# (XRVIP, which originally has 112 samples and 1147 numeric variables).
# The factor variables have 3, 6 and 5 levels (the first with near-zero variance);
# the character variables have 7 and 2 categories.
data("freelive")  # provides XRVIP
factor_variable1 <- as.factor(c(rep("33", 105), rep("44", 3), rep("55", 4)))
factor_variable2 <- as.factor(c(rep("AB", 20), rep("CD", 10), rep("EF", 30),
                                rep("GH", 15), rep("IJ", 25), rep("KL", 12)))
factor_variable3 <- as.factor(c(rep("Tessa", 25), rep("Olle", 30), rep("Yan", 12),
                                rep("Calle", 25), rep("Elisa", 20)))
factor_variable4 <- as.factor(c(rep(NA, 112)))
character_variable1 <- c(rep("one", 16), rep("two", 16), rep("three", 16),
                         rep("four", 16), rep("five", 16), rep("six", 16), rep("seven", 16))
character_variable2 <- c(rep("yes", 28), rep("no", 28),
                         rep("yes", 28), rep("no", 28))
character_variable3 <- c(rep("Hahahah", 112))
character_variable4 <- as.character(c(rep(NA, 112)))
logical_variable1 <- c(rep(TRUE, 16), rep(FALSE, 16), rep(TRUE, 16),
                       rep(FALSE, 16), rep(TRUE, 16), rep(FALSE, 32))
logical_variable2 <- c(rep(TRUE, 28), rep(FALSE, 28), rep(TRUE, 28), rep(FALSE, 28))
# Empty data frame with 112 rows (row.names must be passed with '=', not '<-')
X <- data.frame(row.names = 1:112)
X <- cbind(X, XRVIP,
           factor_variable1, factor_variable2, factor_variable3, factor_variable4,
           character_variable1, character_variable2, character_variable3, character_variable4,
           logical_variable1, logical_variable2)
onehotencoding(X)
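The transformation rules stated for onehotencoding() can be sketched for a single column in base R. The helper encode_column() is hypothetical: the column naming scheme and the choice of which of two categories is coded as 1 are assumptions, and the package's own implementation may differ in these details.

```r
# Hypothetical helper applying the stated rules to one column
encode_column <- function(v, name) {
  one_col <- function(values) matrix(values, ncol = 1, dimnames = list(NULL, name))
  # Logical: one 0/1 numeric dummy variable
  if (is.logical(v)) return(one_col(as.numeric(v)))
  f <- factor(v)
  lev <- levels(f)
  if (length(lev) > 2) {
    # n > 2 categories: n indicator variables, one per category
    out <- 1 * outer(as.character(f), lev, "==")
    colnames(out) <- paste(name, lev, sep = "_")  # naming scheme is an assumption
    return(out)
  }
  # 2 categories: one 0/1 dummy (second level coded as 1, an assumption)
  if (length(lev) == 2) return(one_col(as.numeric(f == lev[2])))
  # 1 category: constant 1
  if (length(lev) == 1) return(one_col(rep(1, length(v))))
  # 0 categories (all missing): constant -999
  one_col(rep(-999, length(v)))
}

enc_colour <- encode_column(c("red", "green", "blue", "red"), "colour")
enc_smoker <- encode_column(c("yes", "no", "yes"), "smoker")
enc_const  <- encode_column(rep("x", 3), "const")
enc_empty  <- encode_column(rep(NA_character_, 3), "empty")
enc_flag   <- encode_column(c(TRUE, FALSE), "flag")
```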
Plot permutation analysis using actual model and permutation result. This is basically a wrapper for the MUVR2::plotPerm() function using model objects to make coding nicer and cleaner.
permutationPlot( MUVRclassObject, permutation_result, model = "Mid", type = "t", side = c("greater", "smaller"), pos, xlab = NULL, xlim, ylim = NULL, breaks = "Sturges", main = NULL )
MUVRclassObject |
A 'MUVR' class object |
permutation_result |
A permutation result: a list with one item, permutation_output |
model |
'Min', 'Mid', or 'Max' |
type |
't' (default; for Student's t) or 'non' for "non-parametric" (i.e. rank) Student's t |
side |
'smaller' for actual lower than H0 or 'greater' for actual larger than H0 (automatically selected if not specified) |
pos |
which side of actual to put p-value on |
xlab |
optional xlabel |
xlim |
optional x-range |
ylim |
optional y-range |
breaks |
optional custom histogram breaks (defaults to 'Sturges') |
main |
optional plot title (or TRUE for autoname) |
A permutation plot
data("freelive2")
nRep <- 2
nOuter <- 4
varRatio <- 0.6
regrModel <- MUVR2(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, varRatio = varRatio, method = "PLS", modReturn = TRUE)
permutation_result <- H0_test(regrModel, n = 10)
permutationPlot(regrModel, permutation_result)
Plot predicted and actual target variables, with different plots depending on modelling approach.
plotMV(MUVRclassObject, model = "min", factCols, sampLabels, ylim = NULL)
MUVRclassObject |
An MUVR class object |
model |
What type of model to plot ('min', 'mid' or 'max'). Defaults to 'min'. |
factCols |
An optional vector with colors for the factor levels (in the same order as the levels) |
sampLabels |
Sample labels (optional; implemented for classification) |
ylim |
Optional for imposing y-limits for regression and classification analysis |
A plot of results from multivariate predictions
data("freelive2")
nRep <- 2
nOuter <- 4
varRatio <- 0.6
regrModel <- MUVR2(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, varRatio = varRatio, method = "PLS", modReturn = TRUE)
plotMV(regrModel, model = "min")
Customised PCA score plots with the possibility to choose PCs, exporting to png and the possibility to add color or different plotting symbols according to variable.
plotPCA(pca, PC1 = 1, PC2 = 2, file, colVar, symbVar, main = "")
pca |
A 'prcomp' object |
PC1 |
Principal component on x-axis |
PC2 |
Principal component on y-axis |
file |
If specified provides the name of a png export file. Otherwise normal plot. |
colVar |
Continuous variable for coloring observations (40 cuts) |
symbVar |
Categorical/discrete variable for multiple plot symbols |
main |
Optional main title of the plot |
A PCA score plot. Exported as png if 'file' specified in function call.
data("freelive2")
pca_object <- prcomp(XRVIP2)
plotPCA(pca_object)
Plots histogram of null hypothesis (permutation/resampling) distribution, actual model fitness and cumulative p-value. Plot defaults to "greater than" or "smaller than" tests and cumulative probability in Student's t-distribution.
plotPerm( actual, distribution, xlab = NULL, side = c("greater", "smaller"), type = "t", ylab = NULL, xlim, ylim = NULL, breaks = "Sturges", pos, main = NULL, permutation_visual = "none", curve = TRUE, extend = 0.1, multiple_p_shown = NULL, show_actual_value = TRUE, show_p = TRUE, round_number = 4 )
actual |
Actual model fitness (e.g. Q2, AUROC or number of misclassifications) |
distribution |
Null hypothesis (permutation) distribution of similar metric as 'actual' |
xlab |
Label for x-axis (e.g. 'Q2 using real value',"Q2 using distributions","BER" 'AUROC', or 'Misclassifications') |
side |
Cumulative p either "greater" or "smaller" than H0 distribution (defaults to side of median(H0)) |
type |
c('t','non',"smooth","rank","ecdf") |
ylab |
label for y-axis |
xlim |
Choice of user-specified x-limits (if default is not adequate) |
ylim |
Choice of user-specified y-limits (if default is not adequate) |
breaks |
Choice of user-specified histogram breaks (if default is not adequate) |
pos |
Choice of position of p-value label (if default is not adequate) |
main |
Choice of user-specified plot title |
permutation_visual |
Choice of showing the median, the mean or none |
curve |
Whether to add a curve or not, based on the mid |
extend |
Percentage of the original range at which to start |
multiple_p_shown |
Show multiple p-values |
show_actual_value |
Whether to show the actual value on the vertical line or not |
show_p |
Whether the p-value is added to the figure |
round_number |
Number of digits to keep when rounding |
Plot
data("freelive2")
actual <- sample(YR2, 1)
distribution <- YR2
plotPerm(actual, distribution)
At present, this function only supports predictions for PLS regression type problems.
plotPred(Ytrue, Ypreds)
Ytrue |
True values of Y; should be a vector |
Ypreds |
Predicted values of Y; can be a vector or a data frame with the same number of rows |
A plot of the predictions
data("freelive2")
Ytrue <- YR2
Ypreds <- sampling_from_distribution(YR2)
plotPred(Ytrue, Ypreds)
Ytrue <- YR2
nRep <- 2
nOuter <- 4
varRatio <- 0.6
regrModel <- MUVR2(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, varRatio = varRatio, method = "PLS", modReturn = TRUE)
Ypreds <- regrModel$yPred
plotPred(Ytrue, Ypreds)
Plot stability of selected variables and prediction fitness as a function of number of repetitions.
plotStability(MUVRrdCVclassObject, model = "min", VAll, nVarLim, missLim)
MUVRrdCVclassObject |
MUVR class object or rdCV object |
model |
'min' (default), 'mid' or 'max' |
VAll |
Option of specifying which variables (i.e. names) to consider as reference set. Defaults to variables selected from the 'model' of the 'MUVRrdCVclassObject' |
nVarLim |
Option of specifying upper limit for number of variables |
missLim |
Option of specifying upper limit for number of misclassifications |
Plot of number of variables, proportion of variables overlapping with reference and prediction accuracy (Q2 for regression; MISS otherwise) as a function of number of repetitions.
data("freelive2")
nRep <- 2
nOuter <- 4
varRatio <- 0.6
regrModel <- MUVR2(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, varRatio = varRatio, method = "PLS", modReturn = TRUE)
plotStability(regrModel, model = "min")
Produces a plot of validation metric vs number of variables in model (inner segment).
plotVAL(MUVRclassObject, show_outlier = TRUE)
MUVRclassObject |
An object of class 'MUVR' |
show_outlier |
Boolean, show outliers |
A plot
data("freelive2")
nRep <- 2
nOuter <- 4
varRatio <- 0.6
regrModel <- MUVR2(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, varRatio = varRatio, method = "PLS", modReturn = TRUE)
plotVAL(regrModel)
Plot variable importance ranking in MUVR object. Regardless of MV core method, variables are sorted by rank, where lower is better. 'plotVIRank' produces boxplots of variable rankings for all model repetitions.
plotVIRank( MUVRclassObject, n, model = "min", cut, maptype = c("heatmap", "dotplot"), add_blank = 4, cextext = 1 )
MUVRclassObject |
A MUVR class object (applies only to PLS and RF, not rdCVnet) |
n |
Number of top ranking variables to plot (defaults to those selected by MUVR2) |
model |
Which model to choose ('min' (default), 'mid' or 'max') |
cut |
Optional value to cut length of variable names to 'cut' number of characters |
maptype |
For rdCVnet: 'heatmap' or 'dotplot' |
add_blank |
Adds extra blank space when the row names are too long |
cextext |
Character expansion (cex) of the text |
Barplot of variable rankings (lower is better)
data("freelive2")
nRep <- 2
nOuter <- 4
varRatio <- 0.6
regrModel <- MUVR2(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, varRatio = varRatio, method = "PLS", modReturn = TRUE)
plotVIRank(regrModel, n = 20)
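The idea of sorting variables by rank across repetitions can be illustrated without the package. The following is a minimal base R sketch, assuming a hypothetical rank matrix `rank_mat` (variables in rows, repetitions in columns); it is not how 'plotVIRank' extracts ranks from a MUVR object.

```r
# Hypothetical ranks for 4 variables over 3 repetitions (lower is better).
rank_mat <- matrix(
  c(1, 2, 1,
    3, 1, 2,
    2, 3, 3,
    4, 4, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("var", 1:4), paste0("rep", 1:3))
)

# Average rank per variable, then keep the top_n best-ranked variables.
mean_rank <- rowMeans(rank_mat)
top_n <- 3
top_vars <- names(sort(mean_rank))[seq_len(top_n)]
top_vars
```

Calling `boxplot(t(rank_mat[top_vars, ]), horizontal = TRUE)` on this matrix would draw one box per variable, which is similar in spirit to the rank plot described above.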
Calculate permutation p-value of actual model performance vs null hypothesis distribution. 'pPerm' will calculate the cumulative (1-tailed) probability of 'actual' belonging to 'permutation_distribution'. 'side' is guessed from the actual value compared to median(permutation_distribution). The test is performed on the original data OR on ranks for non-parametric statistics.
pPerm( actual, permutation_distribution, side = c("smaller", "greater"), type = "t", extend = 0.1 )
actual |
Actual model performance (e.g. misclassifications or Q2) |
permutation_distribution |
Null hypothesis distribution from permutation test (same metric as 'actual') |
side |
Smaller or greater than (automatically guessed if omitted) (Q2 and AUC is a "greater than" test, whereas misclassifications is "smaller than") |
type |
One of 't', 'non', 'smooth', 'ecdf' or 'rank' (defaults to 't') |
extend |
Extension of the range (defaults to 0.1) |
p-value
data("freelive2")
actual <- sample(YR2, 1)
permutation_distribution <- YR2
pPerm(actual, permutation_distribution)
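To make the one-tailed calculation concrete, here is a minimal base R sketch of two ways such a p-value could be computed: from a Student's t-distribution fitted to the null values, and from the empirical proportion of null values at least as extreme as the actual one. The helper `perm_p` and its internals are assumptions for illustration and may differ from what 'pPerm' does; in particular, this sketch does not guess 'side' from the median.

```r
# Hypothetical helper, not part of MUVR2.
perm_p <- function(actual, h0, side = c("greater", "smaller"),
                   type = c("t", "ecdf")) {
  side <- match.arg(side)
  type <- match.arg(type)
  if (type == "t") {
    # Standardise 'actual' against the null distribution and read the
    # one-tailed probability from Student's t with n - 1 degrees of freedom.
    t_stat <- (actual - mean(h0)) / sd(h0)
    pt(t_stat, df = length(h0) - 1, lower.tail = (side == "smaller"))
  } else {
    # Empirical p-value with the usual +1 correction.
    n_extreme <- if (side == "greater") sum(h0 >= actual) else sum(h0 <= actual)
    (n_extreme + 1) / (length(h0) + 1)
  }
}

# Hypothetical null distribution of Q2 values from 99 permutations.
h0 <- seq(-0.2, 0.2, length.out = 99)
p_t <- perm_p(0.6, h0, side = "greater", type = "t")
p_ecdf <- perm_p(0.6, h0, side = "greater", type = "ecdf")
```

With no permuted value reaching the actual one, the empirical p-value is bounded below by 1 / (number of permutations + 1), whereas the t-based value can be arbitrarily small.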
Predict outcomes using a MUVR class object and an X testing set. At present, this function only supports predictions for PLS regression type problems.
predMV(MUVRclassobject, newdata, model = "min")
MUVRclassobject |
An 'MUVR' class object |
newdata |
New data for which to predict outcomes |
model |
What type of model to use ('min', 'mid' or 'max'). Defaults to 'min'. |
The predicted result based on the MUVR model and the newdata
data("freelive2")
nRep <- 2
nOuter <- 4
varRatio <- 0.6
regrModel <- MUVR2(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, varRatio = varRatio, method = "PLS", modReturn = TRUE)
predMV(regrModel, XRVIP2)
Perform matrix pre-processing
preProcess( X, offset = 0, zeroOffset = 0, trans = "none", center = "none", scale = "none" )
X |
Data matrix with samples in rows and variables in columns |
offset |
Add offset to all data points (defaults to 0) |
zeroOffset |
Add offset to zero data (defaults to 0) |
trans |
Either 'log', 'sqrt' or 'none' (default is 'none') |
center |
Either 'mean', 'none' or a numeric vector of length equal to the number of columns of X (defaults to 'none'). |
scale |
Either 'UV', 'Pareto', 'none' or a numeric vector of length equal to the number of columns of X (defaults to 'none'). |
A pre-processed data matrix
data("freelive2")
preProcess(XRVIP2)
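The transformation, centering and scaling options can be sketched in base R as follows. This is an illustration only: the helper `pre_process_sketch`, the order of operations (transform, then center, then scale) and the omission of 'offset' and 'zeroOffset' are assumptions, not the package implementation. UV scaling divides each column by its standard deviation and Pareto scaling divides by the square root of the standard deviation.

```r
# Hypothetical helper, not part of MUVR2.
pre_process_sketch <- function(X, trans = "none", center = "none",
                               scale = "none") {
  X <- as.matrix(X)
  if (trans == "log") X <- log(X)
  if (trans == "sqrt") X <- sqrt(X)
  if (identical(center, "mean")) X <- sweep(X, 2, colMeans(X), "-")
  sds <- apply(X, 2, sd)  # column standard deviations (unaffected by centering)
  if (identical(scale, "UV")) X <- sweep(X, 2, sds, "/")
  if (identical(scale, "Pareto")) X <- sweep(X, 2, sqrt(sds), "/")
  X
}

# Samples in rows, variables in columns.
X <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), ncol = 2)
X_uv <- pre_process_sketch(X, center = "mean", scale = "UV")
X_par <- pre_process_sketch(X, center = "mean", scale = "Pareto")
```

After UV scaling every column has unit standard deviation, while Pareto scaling leaves columns with larger variance somewhat more influential.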
Q2 calculation
Q2_calculation(yhat, y)
yhat |
Predicted values |
y |
Observed (real) values |
Q2
data("freelive2")
actual <- YR2
predicted <- MUVR2::sampling_from_distribution(actual)
Q2_calculation(yhat = predicted, y = actual)
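For orientation, Q2 is commonly defined as 1 - PRESS/TSS, where PRESS is the sum of squared prediction errors and TSS is the total sum of squares of the observed values around their mean. The sketch below implements that common definition in base R; the helper `q2_sketch` is hypothetical and the package function may differ in detail.

```r
# Hypothetical helper using the common definition Q2 = 1 - PRESS / TSS.
q2_sketch <- function(yhat, y) {
  press <- sum((y - yhat)^2)     # prediction error sum of squares
  tss <- sum((y - mean(y))^2)    # total sum of squares of observed values
  1 - press / tss
}

y <- c(1, 2, 3, 4, 5)
yhat <- c(2, 2, 3, 4, 4)

q2_perfect <- q2_sketch(y, y)               # perfect predictions
q2_mean <- q2_sketch(rep(mean(y), 5), y)    # predicting the mean everywhere
q2_model <- q2_sketch(yhat, y)
q2_swapped <- q2_sketch(y, yhat)            # arguments in the wrong order
```

Because TSS is computed from the observed values only, the result is not symmetric in its arguments, so passing predictions and observations in the right order matters.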
Wrapper for speedy access to MUVR2 (autosetup of parallelization)
qMUVR2( X, Y, ML = FALSE, method = "RF", varRatio = 0.65, nCore, repMult = 1, nOuter = 5, ... )
X |
X-data |
Y |
Y-data |
ML |
Boolean for multilevel |
method |
'RF' (default) or 'PLS' |
varRatio |
Proportion of variables to keep in each loop of the recursive feature elimination |
nCore |
Number of threads to use for calculation (defaults to detectCores()-1) |
repMult |
Multiplier of cores -> nRep = repMult * nCore |
nOuter |
Number of outer segments |
... |
Additional arguments (see MUVR) |
MUVR object
data("freelive2")
regrModel <- qMUVR2(X = XRVIP2, Y = YR2, nCore = 1)
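The relation between 'nCore', 'repMult' and the number of repetitions described in the argument table can be written out as a short sketch. It only reproduces the documented arithmetic (nCore defaulting to detectCores() - 1 and nRep = repMult * nCore); the fallback for an undetectable core count is an assumption added for robustness.

```r
library(parallel)

# Documented default: one thread fewer than the number of detected cores.
n_detected <- detectCores()
if (is.na(n_detected)) n_detected <- 2L  # assumption: fallback if detection fails
nCore <- max(1L, n_detected - 1L)

# Documented relation: nRep = repMult * nCore.
repMult <- 1
nRep <- repMult * nCore
```

On a machine with 8 detected cores this would give nCore = 7 and, with repMult = 1, nRep = 7.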
Wrapper for repeated double cross-validation without variable selection
rdCV( X, Y, ID, nRep = 5, nOuter = 6, nInner, DA = FALSE, fitness = c("AUROC", "MISS", "RMSEP", "BER"), method = c("PLS", "RF"), methParam, ML = FALSE, modReturn = FALSE, logg = FALSE )
X |
Independent variables. NB: Variables (columns) must have names/unique identifiers. NAs not allowed in data. For ML, X is upper half only (X1-X2) |
Y |
Response vector (Dependent variable). For DA (classification), Y should be factor or character. For ML, Y is omitted. For regression, Y is numeric. |
ID |
Subject identifier (for sampling by subject; Assumption of independence if not specified) |
nRep |
Number of repetitions of double CV. |
nOuter |
Number of outer CV loop segments. |
nInner |
Number of inner CV loop segments. |
DA |
Logical for classification (discriminant analysis) (defaults to FALSE, i.e. regression). PLS is limited to two-class problems (see 'Y' above). |
fitness |
Fitness function for model tuning (choose either 'AUROC', 'MISS' or 'BER' for classification, or 'RMSEP' (default) for regression) |
method |
Multivariate method. Supports 'PLS' and 'RF' (default) |
methParam |
List with parameter settings for specified MV method (defaults to ???) |
ML |
Logical for multilevel analysis (defaults to FALSE) |
modReturn |
Logical for returning outer segment models (defaults to FALSE) |
logg |
Logical for whether to sink model progressions to 'log.txt' |
An rdCV object with the results of the repeated double cross-validation
data("freelive2")
nRep <- 2 # Number of MUVR2 repetitions
nOuter <- 4 # Number of outer cross-validation segments
varRatio <- 0.75 # Proportion of variables kept per iteration
method <- 'RF' # Selected core modeling algorithm
regrModel <- rdCV(X = XRVIP2, Y = YR2, nRep = nRep, nOuter = nOuter, method = method, modReturn = TRUE)
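The structure of repeated double cross-validation (tuning in an inner loop, prediction of held-out outer segments, repeated over several random splits) can be sketched in base R. This is an illustrative skeleton under stated assumptions: simulated data, a polynomial `lm` standing in for the PLS or RF core, the polynomial degree as the only tuning parameter, and no sampling by subject 'ID'. It is not the package implementation.

```r
set.seed(1)
n <- 60
x <- runif(n, -2, 2)
dat <- data.frame(x = x, y = x^2 + rnorm(n, sd = 0.3))  # simulated data

nRep <- 2; nOuter <- 4; nInner <- 3
degrees <- 1:3  # hypothetical tuning grid
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

yPred <- matrix(NA_real_, nrow = n, ncol = nRep)
for (r in seq_len(nRep)) {
  outer_id <- sample(rep(seq_len(nOuter), length.out = n))
  for (o in seq_len(nOuter)) {
    train <- dat[outer_id != o, ]
    test <- dat[outer_id == o, ]
    inner_id <- sample(rep(seq_len(nInner), length.out = nrow(train)))
    # Inner loop: tune using the outer training data only.
    inner_err <- sapply(degrees, function(d) {
      mean(sapply(seq_len(nInner), function(i) {
        fit <- lm(y ~ poly(x, d, raw = TRUE), data = train[inner_id != i, ])
        rmse(train$y[inner_id == i], predict(fit, train[inner_id == i, ]))
      }))
    })
    best <- degrees[which.min(inner_err)]
    # Outer loop: refit with the tuned setting, predict the held-out segment.
    fit <- lm(y ~ poly(x, best, raw = TRUE), data = train)
    yPred[outer_id == o, r] <- predict(fit, test)
  }
}
rmsep <- apply(yPred, 2, function(p) rmse(dat$y, p))  # one RMSEP per repetition
```

Each observation is predicted exactly once per repetition by a model that never saw it during tuning or fitting, which is what keeps the reported prediction error from being optimistically biased.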
Custom parameters can be set in the function call or by manually setting "slots" in the resulting methParam object.
rdcvNetParams( robust = 0.05, family = "gaussian", nRepInner = 1, NZV = TRUE, oneHot = TRUE )
robust |
Robustness (slack) criterion for determining min and max knees (defaults to 0.05) |
family |
One of "gaussian", "binomial", "poisson", "multinomial", "cox" or "mgaussian" (defaults to "gaussian") |
nRepInner |
Number of inner repetitions (defaults to 1) |
NZV |
NZV |
oneHot |
Boolean for whether to use one-hot encoding (defaults to TRUE) |
a 'methParam' object
# Standard parameters for rdcvNet
methParam <- rdcvNetParams()
Samples values from the distribution of a vector
sampling_from_distribution(X, upperlimit, lowerlimit, extend, n)
X |
A vector (numeric or factor) from which the distribution/probability will be generated |
upperlimit |
If X is numeric, the upper limit |
lowerlimit |
If X is numeric, the lower limit |
extend |
If X is numeric, how much to extend beyond the lowest and highest existing values of X |
n |
Number of values to sample |
A vector of values sampled from the distribution of X
data("mosquito")
sampling_from_distribution(Yotu)
data("freelive2")
sampling_from_distribution(YR2, upperlimit = 200, lowerlimit = 0, n = length(YR2))
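One possible way to sample from the distribution of a vector is sketched below in base R: class frequencies are used as sampling probabilities for factors, and a kernel density estimate restricted to the given limits is used for numeric vectors. The helper `sample_sketch` and the choice of kernel density estimation are assumptions; the actual sampling scheme of the package function is not documented here, and 'extend' is not implemented in this sketch.

```r
# Hypothetical helper, not part of MUVR2.
sample_sketch <- function(X, n = length(X),
                          lowerlimit = min(X), upperlimit = max(X)) {
  if (is.factor(X) || is.character(X)) {
    # Categorical: sample classes with their observed frequencies.
    freq <- table(X)
    factor(sample(names(freq), n, replace = TRUE,
                  prob = as.numeric(freq) / sum(freq)),
           levels = names(freq))
  } else {
    # Numeric: sample grid points of a kernel density estimate,
    # evaluated only between the lower and upper limits.
    dens <- density(X, from = lowerlimit, to = upperlimit)
    sample(dens$x, n, replace = TRUE, prob = dens$y)
  }
}

set.seed(1)
num_sample <- sample_sketch(c(10, 20, 30, 40, 50), n = 20,
                            lowerlimit = 0, upperlimit = 60)
fac_sample <- sample_sketch(factor(c("a", "a", "b")), n = 10)
```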
Reports names and numbers of variables: all as well as optimal (min model), redundant (from min up to max) and noisy (the rest).
varClass(MUVRclassObject)
MUVRclassObject |
A MUVR class object |
A list with names and numbers of variables: all as well as optimal (corresponding to the 'min' or minimal-optimal model), redundant (from 'min' up to 'max' or all-relevant) and noisy (the rest)
data("mosquito")
nRep <- 2
nOuter <- 4
classModel <- MUVR2_EN(X = Xotu, Y = Yotu, nRep = nRep, nOuter = nOuter, DA = TRUE, modReturn = TRUE)
classModel <- getVar(classModel, option = "quantile")
varClass(classModel)
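The three-way split into optimal, redundant and noisy variables amounts to simple set operations. The sketch below uses hypothetical variable names and hypothetical 'min' and 'max' selections to show the logic; it does not read anything from a MUVR object.

```r
# Hypothetical variable sets.
all_vars <- paste0("v", 1:10)
min_vars <- c("v1", "v2")                    # selected by the 'min' model
max_vars <- c("v1", "v2", "v3", "v4", "v5")  # selected by the 'max' model

var_classes <- list(
  all = all_vars,
  optimal = min_vars,                       # minimal-optimal set
  redundant = setdiff(max_vars, min_vars),  # in 'max' but not in 'min'
  noisy = setdiff(all_vars, max_vars)       # the rest
)
n_per_class <- lengths(var_classes)
```

Here the counts are 10 variables in total, 2 optimal, 3 redundant and 5 noisy, and the last three classes partition the full set.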
Microbiota composition in mosquitos for the classification tutorial
data(mosquito)
Metabolomics data for the rye metabolomics regression tutorial
data(freelive)
Metabolomics data for the rye metabolomics regression tutorial, using unique individuals
data(freelive2)
Village of capture of mosquitos for the classification tutorial
data(mosquito)
Rye consumption for the rye metabolomics regression tutorial
data(freelive)
Rye consumption for the rye metabolomics regression tutorial, using unique individuals
data(freelive2)