r function list all parameters including ellipses - r

Is there any way to list all parameters of an R function, including those passed through the ellipsis (the additional parameters represented by three dots)? For example, I want to know the parameters of the "qplot" function. The only way I found is args(qplot), which results in:
> args(qplot)
function (x, y = NULL, ..., data, facets = NULL, margins = FALSE,
geom = "auto", xlim = c(NA, NA), ylim = c(NA, NA), log = "",
main = NULL, xlab = deparse(substitute(x)), ylab = deparse(substitute(y)),
asp = NA, stat = NULL, position = NULL)
But I really want to know which additional parameters can be passed into this function through the three dots, for example the "shape" parameter.

The three-dot ellipsis ... refers to any number of function arguments that get processed or passed on within the function body.
For example, in the case of qplot, the function body (which you can see by typing qplot at the console) reveals that any additional function arguments will be used as additional aesthetic specifications.
The relevant lines are:
arguments <- as.list(match.call()[-1])
env <- parent.frame()
aesthetics <- compact(arguments[.all_aesthetics])
where
.all_aesthetics <- c("adj", "alpha", "angle", "bg", "cex", "col", "color",
"colour", "fg", "fill", "group", "hjust", "label", "linetype", "lower",
"lty", "lwd", "max", "middle", "min", "pch", "radius", "sample", "shape",
"size", "srt", "upper", "vjust", "weight", "width", "x", "xend", "xmax",
"xmin", "xintercept", "y", "yend", "ymax", "ymin", "yintercept", "z")
The definition of .all_aesthetics can be found in the ggplot2 source code.
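As a minimal sketch of the mechanism (the function describe_point is hypothetical and not part of ggplot2): formals() only shows the literal ... placeholder, while list(...) inside the body captures whatever extra named arguments the caller supplied. This is why the accepted names can only be found by reading the function body or its documentation.

```r
# Hypothetical function, for illustration only
describe_point <- function(x, y, ...) {
  extras <- list(...)   # everything matched by the ellipsis
  names(extras)         # names of the extra arguments the caller supplied
}

# The formal argument list only contains the placeholder "..."
formal_names <- names(formals(describe_point))
# The extra arguments are only visible at call time
extra_names <- describe_point(1, 2, shape = 3, colour = "red")

print(formal_names)
print(extra_names)
```

What a function does with those extras (forwarding them, filtering them against a list such as .all_aesthetics, or ignoring them) is decided entirely in its body.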

Related

Create an R function that normalizes data based on input values

I don't make too many complicated functions and typically stick with very basic ones. How do I create a function that takes a dataset, normalizes it with a chosen normalization method, and boxplots the output? Currently norm_method differs between the normalization methods, and I was wondering if there is a way to specify it at the start of the function so that the correct method is used. Below is the code I created, but I am stuck on how to proceed.
library(reshape2) # for melt
library(cowplot)
demoData;
# target_demoData will need to be changed at some point
TestFunc <- function(demoData) {
# Q3 norm (75th percentile)
target_demoData <- normalize(demoData ,
norm_method = "quant",
desiredQuantile = .75,
toElt = "q_norm")
# Background normalization without spike
target_demoData <- normalize(demoData ,
norm_method = "neg",
fromElt = "exprs",
toElt = "neg_norm")
boxplot(assayDataElement(demoData[,1:10], elt = "q_norm"),
col = "red", main = "Q3",
log = "y", names = 1:10, xlab = "Segment",
ylab = "Counts, Q3 Normalized")
boxplot(assayDataElement(demoData[,1:10], elt = "neg_norm"),
col = "blue", main = "Neg",
log = "y", names = 1:10, xlab = "Segment",
ylab = "Counts, Neg. Normalized")
}
You might want to consider designing your normalize() and assayDataElement() functions to take ..., which provides more flexibility.
In lieu of that, given the examples above, you could make a simple configuration list, and elements of that configuration are passed to your normalize() and assayDataElement() functions, like this:
TestFunc <- function(demoData, method = c("quant", "neg")) {
  method <- match.arg(method)
  # per-method arguments for normalize() and for boxplot()
  method_config <- list(
    "quant" = list("norm_args" = list(norm_method = "quant", desiredQuantile = 0.75, toElt = "q_norm"),
                   "plot_args" = list(col = "red", main = "Q3", ylab = "Counts, Q3 Normalized")),
    "neg" = list("norm_args" = list(norm_method = "neg", fromElt = "exprs", toElt = "neg_norm"),
                 "plot_args" = list(col = "blue", main = "Neg", ylab = "Counts, Neg. Normalized"))
  )
  mcn <- method_config[[method]][["norm_args"]]
  mcp <- method_config[[method]][["plot_args"]]
  # normalize the data
  target_demoData <- do.call(normalize, c(list(demoData), mcn))
  # get the plot (toElt is stored with the normalization arguments)
  boxplot(assayDataElement(target_demoData[, 1:10], elt = mcn[["toElt"]]),
          col = mcp[["col"]], main = mcp[["main"]],
          log = "y", names = 1:10, xlab = "Segment", ylab = mcp[["ylab"]])
}
Again, this approach is not as flexible as using ... in the underlying functions. Also consider splitting this into two functions: one that returns the normalized data and a second that generates the plot.
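A minimal sketch of the two-function split with ... forwarding is shown below. All names here (quantile_normalize, normalize_data, plot_normalized) are hypothetical stand-ins written in base R so the example runs on its own; they are not the normalize() or assayDataElement() functions from the question.

```r
# Hypothetical stand-in for a real normalization routine
quantile_normalize <- function(x, desiredQuantile = 0.75) {
  x / unname(quantile(x, desiredQuantile))
}

# Step 1: return normalized data; extra arguments reach `normalizer` via ...
normalize_data <- function(x, normalizer, ...) {
  normalizer(x, ...)
}

# Step 2: plot only; extra arguments reach boxplot() via ...
plot_normalized <- function(x, ...) {
  boxplot(x, log = "y", xlab = "Segment", ...)
}

counts <- c(10, 20, 30, 40)
q_norm <- normalize_data(counts, quantile_normalize, desiredQuantile = 1)
plot_normalized(q_norm, col = "red", main = "Q3", ylab = "Counts, Q3 Normalized")
```

With this layout the caller chooses the method and its options at the call site, and the plotting step can be reused for any normalized result.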

Complexheatmap with multiple files plotting

I would like to use ComplexHeatmap to plot a separate heatmap for each of multiple files (data frames).
So far I have been able to do this for a small subset of files.
Reading files as list
list_of_files <- list.files('Model_hmap/',pattern = '\\.txt$', full.names = TRUE)
#Further arguments to read.csv can be passed in ...
#all_csv <- lapply(list_of_files,read_delim,delim = "\t", escape_double = FALSE,trim_ws = TRUE)
all_csv <- lapply(list_of_files,read.table,strip.white = FALSE,check.names = FALSE,header=TRUE,row.names=1)
#my_names = c("gene","baseMean","log2FoldChange","lfcSE","stat","pvalue","padj","UP_DOWN")
my_names = c("Symbol","baseMean","log2FoldChange","lfcSE","stat","pvalue","padj","UP_DOWN")
#my_names = c['X2']
#my_names = c("Peak","annotation","ENSEMBL","log2FoldChange","padj","UP_DOWN")
result_abd = lapply(all_csv, FUN = function(x) subset(x, select=-c(1:7,155)))
names(result_abd) <- gsub(".txt","",
list.files("Model_hmap/",full.names = FALSE),
fixed = TRUE)
Then Scaling the data
fun <- function(result_abd) {
p <- t(scale(t(result_abd[,1:ncol(result_abd)])))
}
p2 <- mapply(fun, result_abd, SIMPLIFY = FALSE)
The next step is to use the metadata with which I would like to annotate my heatmap.
My metadata looks like this:
dput(head(metadata))
structure(list(patient = c("TCGA-AB-2856", "TCGA-AB-2849", "TCGA-AB-2971",
"TCGA-AB-2930", "TCGA-AB-2891", "TCGA-AB-2872"), prior_malignancy = c("no",
"no", "no", "no", "no", "no"), FAB = c("M4", "M0", "M4", "M2",
"M1", "M3"), Risk_Cyto = c("Intermediate", "Poor", "Intermediate",
"Intermediate", "Poor", "Good")), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
To read the above metadata I'm doing the following; I'm not sure if it's the right approach.
list_of_files1 <- list.files('Model_hmap_meta/',pattern = '\\.txt$', full.names = TRUE)
#Further arguments to read.csv can be passed in ...
meta1 <- lapply(list_of_files1,read.table, row.names = 1,sep = "\t",header = TRUE)
Now I'm stuck at the above step. I'm not sure how to pass the metadata as a list, as I did for the gene expression data frames, for which I calculated z-scores and which are stored in a list. I think the metadata should be of the same class if I am to use it this way.
For a single file, this is how I used to add the annotation to my final plot:
metadata = read_delim("Model_hmap_meta/FAB_table.txt",delim = "\t", escape_double = FALSE,
trim_ws = TRUE)
head(metadata)
dim(metadata)
ann <- data.frame(metadata$FAB, metadata$Risk_Cyto)
colnames(ann) <- c('FAB', 'Risk_Cyto')
colours <- list('FAB' = c('M0' = 'red2', 'M1' = 'royalblue', 'M2'='gold','M3'='forestgreen','M4'='chocolate','M5'='Purple'),
'Risk_Cyto' = c('Good' = 'limegreen', 'Intermediate' = 'navy' , 'N.D.' ='magenta','Poor'='black'))
colAnn <- HeatmapAnnotation(df = ann,
which = 'col',
col = colours,
annotation_width = unit(c(1, 4), 'cm'),
gap = unit(1, 'mm'))
Now this is what I need to apply to the list, if I understand correctly, which I'm not able to do.
My plotting function.
This is the code I use to plot:
hm1 <- Heatmap(heat,
col= colorRamp2(c(-2.6,-1,0,1,2.6),c("blue","skyblue","white","lightcoral","red")),
#heatmap_legend_param=list(at=c(-2.6,-1,0,1,2.6),color_bar="continuous",
# legend_direction="vertical", legend_width=unit(5,"cm"),
# title_position="topcenter", title_gp=gpar(fontsize=10, fontface="bold")),
name = "Z-score",
#Row annotation configurations
cluster_rows=T,
show_row_dend=FALSE,
row_title_side="right",
row_title_gp=gpar(fontsize=8),
show_row_names=FALSE,
row_names_side="left",
#Column annotation configurations
cluster_columns=T,
show_column_dend=T,
column_title="DE genes",
column_title_side="top",
column_title_gp=gpar(fontsize=15, fontface="bold"),
show_column_names = FALSE,
column_names_gp = gpar(fontsize = 12, fontface="bold"),
#Dendrogram configurations: columns
clustering_distance_columns="euclidean",
clustering_method_columns="complete",
column_dend_height=unit(10,"mm"),
#Dendrogram configurations: rows
clustering_distance_rows="euclidean",
clustering_method_rows="complete",
row_dend_width=unit(4,"cm"),
row_dend_side = "left",
row_dend_reorder = TRUE,
#Splits
border=T,
row_km = 1,
column_km = 1,
#plot params
#width = unit(5, "inch"),
#height = unit(4, "inch"),
#height = unit(0.4, "cm")*nrow(mat),
#Annotations
top_annotation = colAnn)
# plot heatmap
draw(hm1, annotation_legend_side = "right", heatmap_legend_side="right")
Objective
How do I wrap all of the above into a small function that takes multiple files as input and plots them?
UPDATE
Data files
My data files, my metadata file
Using the code you provided I made the following function (make_heatmap). Some of the read in statements are altered to match what I was working with on my machine. I also only used 2 of your files but it should work with all 4 that you're using.
This function will allow you to pass the counts matrix (which you normalize and set up before passing to the function). The assumption is that you're using the same metadata/annotation for each file you're passing. If you have different annotation files you could set up the heatmap annotation before the function and then pass that to the function. This is a bit more tedious though.
Usually the way that I set up my heatmap analyses is that I have a script containing all of my functions (one for each type of heatmap I have to make), and then every time I need to make a new heatmap I have another script where I read in and prepare (i.e. median center) my counts matrix and then call the heatmap function I need.
list_of_files <- dir(pattern = 'MAP', full.names = TRUE)
#Further arguments to read.csv can be passed in ...
#all_csv <- lapply(list_of_files,read_delim,delim = "\t", escape_double = FALSE,trim_ws = TRUE)
all_csv <- lapply(list_of_files,read.table,strip.white = FALSE,check.names = FALSE,header=TRUE,row.names=1)
#my_names = c("gene","baseMean","log2FoldChange","lfcSE","stat","pvalue","padj","UP_DOWN")
my_names = c("Symbol","baseMean","log2FoldChange","lfcSE","stat","pvalue","padj","UP_DOWN")
#my_names = c['X2']
#my_names = c("Peak","annotation","ENSEMBL","log2FoldChange","padj","UP_DOWN")
result_abd = lapply(all_csv, FUN = function(x) subset(x, select=-c(1:7,155)))
names(result_abd) <- gsub(".txt","",
basename(list_of_files),
fixed = TRUE)
fun <- function(result_abd) {
p <- t(scale(t(result_abd[,1:ncol(result_abd)])))
}
p2 <- mapply(fun, result_abd, SIMPLIFY = FALSE)
# list_of_files1 <- list.files('Model_hmap_meta/',pattern = '\\.txt$', full.names = TRUE)
# #Further arguments to read.csv can be passed in ...
# meta1 <- lapply(list_of_files1,read.table, row.names = 1,sep = "\t",header = TRUE)
make_heatmap<-function(counts_matrix){
metadata = read.table("FAB_table.txt",sep = "\t", header=1)
head(metadata)
dim(metadata)
ann <- data.frame(metadata$FAB, metadata$Risk_Cyto)
colnames(ann) <- c('FAB', 'Risk_Cyto')
colours <- list('FAB' = c('M0' = 'red2', 'M1' = 'royalblue', 'M2'='gold','M3'='forestgreen','M4'='chocolate','M5'='Purple'),
'Risk_Cyto' = c('Good' = 'limegreen', 'Intermediate' = 'navy' , 'N.D.' ='magenta','Poor'='black'))
colAnn <- HeatmapAnnotation(df = ann,
which = 'col',
col = colours,
annotation_width = unit(c(1, 4), 'cm'),
gap = unit(1, 'mm'))
hm1 <- Heatmap(counts_matrix,
col= colorRamp2(c(-2.6,-1,0,1,2.6),c("blue","skyblue","white","lightcoral","red")),
#heatmap_legend_param=list(at=c(-2.6,-1,0,1,2.6),color_bar="continuous",
# legend_direction="vertical", legend_width=unit(5,"cm"),
# title_position="topcenter", title_gp=gpar(fontsize=10, fontface="bold")),
name = "Z-score",
#Row annotation configurations
cluster_rows=T,
show_row_dend=FALSE,
row_title_side="right",
row_title_gp=gpar(fontsize=8),
show_row_names=FALSE,
row_names_side="left",
#Column annotation configurations
cluster_columns=T,
show_column_dend=T,
column_title="DE genes",
column_title_side="top",
column_title_gp=gpar(fontsize=15, fontface="bold"),
show_column_names = FALSE,
column_names_gp = gpar(fontsize = 12, fontface="bold"),
#Dendrogram configurations: columns
clustering_distance_columns="euclidean",
clustering_method_columns="complete",
column_dend_height=unit(10,"mm"),
#Dendrogram configurations: rows
clustering_distance_rows="euclidean",
clustering_method_rows="complete",
row_dend_width=unit(4,"cm"),
row_dend_side = "left",
row_dend_reorder = TRUE,
#Splits
border=T,
row_km = 1,
column_km = 1,
#plot params
#width = unit(5, "inch"),
#height = unit(4, "inch"),
#height = unit(0.4, "cm")*nrow(mat),
#Annotations
top_annotation = colAnn)
# plot heatmap
draw(hm1, annotation_legend_side = "right", heatmap_legend_side="right")
}
make_heatmap(as.matrix(p2[[1]])) #just call the function with the counts matrix
make_heatmap(as.matrix(p2[[2]]))
If you need to output the heatmap to a pdf or something, you can do that before calling the function or you can put that command inside of the heatmap function (just make sure to call dev.off() inside the function too in that case).
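A minimal sketch of the "open the device before calling the function" variant, applied to a whole named list of matrices, might look like this. The helper save_each_plot is hypothetical, and a base image() call stands in for make_heatmap so the example runs without ComplexHeatmap; in practice one would presumably pass make_heatmap as plot_fun and p2 as matrices.

```r
# Hypothetical helper: write one PDF per named matrix using any plotting function
save_each_plot <- function(matrices, plot_fun, out_dir = tempdir()) {
  stopifnot(!is.null(names(matrices)))
  out_files <- file.path(out_dir, paste0(names(matrices), ".pdf"))
  for (i in seq_along(matrices)) {
    pdf(out_files[i], width = 7, height = 7)
    # close the device even if the plotting function fails
    tryCatch(plot_fun(matrices[[i]]), finally = dev.off())
  }
  invisible(out_files)
}

# Stand-in data and plotting function, for illustration only
demo <- list(file_a = matrix(1:20, nrow = 4), file_b = matrix(1:30, nrow = 5))
files <- save_each_plot(demo, function(m) image(t(m)))
print(basename(files))
```

Keeping the device handling outside the plotting function means the same function can still be used interactively without writing a file.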

plot_generic() function in plot_roc_components not found in R

I am using the plot_roc_components function from the rmda package. Its definition calls a plot_generic() function, but I am not able to find the definition of that function. Why is that?
The reason I am looking for it is to see whether there is an option such as legend.size. plot_roc_components gives me a figure, but I want to change the legend size. There is an option for legend.position, but not for the legend's font size.
Could you please explain?
Thanks!
https://github.com/mdbrown/rmda/blob/57553a4cf5b6972176a0603b412260e367147619/R/plot_functions_sub.R
You were looking in one file but it was defined in another file.
plot_generic<- function(xx, predictors, value, plotNew,
standardize, confidence.intervals,
cost.benefit.axis = TRUE, cost.benefits, n.cost.benefits,
cost.benefit.xlab, xlab, ylab,
col, lty, lwd,
xlim, ylim, legend.position,
lty.fpr = 2, lty.tpr = 1,
tpr.fpr.legend = FALSE,
impact.legend = FALSE,
impact.legend.2 = FALSE,
population.size = 1000,
policy = policy, ...){
## xx is output from get_DecisionCurve,
## others are directly from the function call
#save old par parameters and reset them once the function exits.
old.par<- par("mar"); on.exit(par(mar = old.par))
xx.wide <- reshape::cast(xx, thresholds~model, value = value, add.missing = TRUE, fill = NA)
xx.wide$thresholds <- as.numeric(as.character(xx.wide$thresholds))
if(is.numeric(confidence.intervals)){
val_lower <- paste(value, "lower", sep = "_")
val_upper <- paste(value, "upper", sep = "_")
xx.lower <- cast(xx, thresholds~model, value = val_lower, add.missing = TRUE, fill = NA)
xx.upper <- cast(xx, thresholds~model, value = val_upper, add.missing = TRUE, fill = NA)
xx.lower$thresholds <- as.numeric(as.character(xx.lower$thresholds))
xx.upper$thresholds <- as.numeric(as.character(xx.upper$thresholds))
}
# adjust margins to add extra x-axis
if(cost.benefit.axis) par(mar = c(7.5, 4, 3, 2) + 0.1)
#set default ylim if not provided
#initial call to plot and add gridlines
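When a helper called inside a package function cannot be found at the prompt, it is usually defined in the package but not exported. A minimal sketch of how to look such a function up from within R, using plot.lm from the stats package as a stand-in (an assumption made only so the example runs without rmda installed); with rmda loaded, the same calls would presumably take "plot_generic" and asNamespace("rmda"):

```r
# plot.lm lives in the stats namespace but is not exported
exported <- "plot.lm" %in% getNamespaceExports("stats")
print(exported)

# Fetch the function directly from the package namespace
helper <- get("plot.lm", envir = asNamespace("stats"))
print(names(formals(helper)))

# getAnywhere() also searches loaded namespaces for unexported objects
found <- utils::getAnywhere("plot.lm")
print(found$where)
```

Inspecting the formals this way shows whether a helper has a dedicated argument for something or only accepts it, if at all, through ... .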

Assigning type to xyplot

Complete beginner at R here trying to perform nonmetric multidimensional scaling on a 95x95 matrix of similarities where 8 corresponds to very similar and 1 corresponds to very dissimilar. I also have an additional column (96th) signifying type and ranging from 0 to 1.
First I load the data:
dsimilarity <- read.table("d95x95matrix.txt",
header = T,
row.names = c("Y1", "Y2", "Y3", "Y4", "Y5", "Y6", "Y7", "Y8", "Y9", "Y10", "Y11", "Y12", "Y13", "Y14", "Y15", "Y16", "Y17", "Y18", "Y19", "Y20",
"Y21", "Y22", "Y23", "Y24", "Y25", "Y26", "Y27", "Y28", "Y29", "Y30", "Y31", "Y32", "Y33", "Y34", "Y35", "Y36", "Y37", "Y38", "Y39", "Y40",
"Y41", "Y42", "Y43", "Y44", "Y45", "Y46", "Y47", "Y48", "Y49", "Y50", "Y51", "Y52", "Y53", "Y54", "Y55", "Y56", "Y57", "Y58", "Y59", "Y60",
"Y61", "Y62", "Y63", "Y64", "Y65", "Y66", "Y67", "Y68", "Y69", "Y70", "Y71", "Y72", "Y73", "Y74", "Y75", "Y76", "Y77", "Y78", "Y79", "Y80",
"Y81", "Y82", "Y83", "Y84", "Y85", "Y86", "Y87", "Y88", "Y89", "Y90", "Y91", "Y92", "Y93", "Y94", "Y95"))
I convert the matrix of similarities into a matrix of dissimilarities, and exclude the 96th column:
ddissimilarity <- dsimilarity; ddissimilarity[1:95, 1:95] = 8 - ddissimilarity[1:95, 1:95]
Then I perform the nonmetric MDS using the smacofSym function from the smacof package:
ordinal.mds.results <- smacofSym(ddissimilarity[1:95, 1:95],
type = c("ordinal"),
ndim = 2,
ties = "primary",
verbose = T )
I create a new data frame (I'm following a guide and don't really know what's going on here):
mds.config <- as.data.frame(ordinal.mds.results$conf)
All well and good thus far (to my knowledge). At this point I create an xyplot of the data and get a good result using this code:
xyplot(D2 ~ D1, data = mds.config,
aspect = 1,
main = "Figure 1. MDS solution",
panel = function (x, y) {
panel.xyplot(x, y, col = "black")
panel.text(x, y-.03, labels = rownames(mds.config),
cex = .75)
},
xlab = "MDS Axis 1",
ylab = "MDS Axis 2",
xlim = c(-1.1, 1.1),
ylim = c(-1.1, 1.1))
Now I want to create a figure that incorporates the type in the 96th column and assigns different colors to observations of the two different types. However, I can't quite figure out how to do so. Does anyone have any idea where I'm going wrong here?
xyplot(D2 ~ D1, data = mds.config ~ ddissimilarity[96:96, 96:96],
aspect = 1,
main = "Figure 1. MDS solution",
panel = function (x, y) {
panel.xyplot(x, y, col = "black")
panel.text(x, y-.03, labels = rownames(mds.config),
cex = .75)
},
xlab = "MDS Axis 1",
ylab = "MDS Axis 2",
xlim = c(-1.1, 1.1),
ylim = c(-1.1, 1.1),
group = "Type")
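One possible direction, offered as a sketch under assumptions rather than a definitive fix: in lattice, data expects a single data frame and groups expects a variable (not a quoted string), so the type column can be added to the configuration data frame and then named in groups. The data below are made-up stand-ins for mds.config and the 96th column; with the real objects, the type column would presumably be attached with something like mds.config$Type <- factor(ddissimilarity[[96]]).

```r
library(lattice)

set.seed(1)
# Hypothetical stand-in for mds.config (10 points instead of 95)
config <- data.frame(D1 = runif(10, -1, 1), D2 = runif(10, -1, 1),
                     row.names = paste0("Y", 1:10))
# Hypothetical stand-in for the 96th column, stored as a factor
config$Type <- factor(rep(c(0, 1), each = 5))

p <- xyplot(D2 ~ D1, data = config, groups = Type, aspect = 1,
            # one colour per level of Type; auto.key picks up the same settings
            par.settings = list(superpose.symbol = list(col = c("black", "red"), pch = 1)),
            auto.key = list(title = "Type", columns = 2),
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)  # ... carries groups and subscripts
              panel.text(x, y - 0.03, labels = rownames(config), cex = 0.75)
            },
            xlab = "MDS Axis 1", ylab = "MDS Axis 2",
            xlim = c(-1.1, 1.1), ylim = c(-1.1, 1.1))
print(p)
```

The panel function takes ... and forwards it to panel.xyplot so that the grouping information reaches the drawing code; hard-coding col = "black" there would override the per-group colours.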

Why do two ggplot objects pass an all.equal() test, but fail identical() test?

I want to test if two graphs generated by ggplot are the same. One option would be to use all.equal on the plot objects, but I'd rather have a stricter test to ensure they're the same, which seems to be what identical() provides.
However, when I tested two plot objects created with the same data and the same aes, I found that all.equal() recognizes them as being the same, whereas the objects didn't pass the identical() test. I'm not sure why and I'd love to learn more.
Basic example:
graph <- ggplot2::ggplot(data = iris, ggplot2::aes(x = Species, y = Sepal.Length))
graph2 <- ggplot2::ggplot(data = iris, ggplot2::aes(x = Species, y = Sepal.Length))
all.equal(graph, graph2)
# [1] TRUE
identical(graph, graph2)
# [1] FALSE
The graph and graph2 objects contain environments and each time an environment is generated it is different even if it holds the same values. R lists are identical if they have the same contents. This can be stated by saying that environments have object identity apart from their values whereas the values of the list form the identity of the list. Try:
dput(graph)
giving the following which includes environments denoted by <environment> in the dput output: (continued after output)
...snip...
), class = "factor")), .Names = c("Sepal.Length", "Sepal.Width",
"Petal.Length", "Petal.Width", "Species"), row.names = c(NA,
-150L), class = "data.frame"), layers = list(), scales = <environment>,
mapping = structure(list(x = Species, y = Sepal.Length), .Names = c("x",
"y"), class = "uneval"), theme = list(), coordinates = <environment>,
facet = <environment>, plot_env = <environment>, labels = structure(list(
x = "Species", y = "Sepal.Length"), .Names = c("x", "y"
))), .Names = c("data", "layers", "scales", "mapping", "theme",
"coordinates", "facet", "plot_env", "labels"), class = c("gg",
"ggplot"))
For example, consider:
g <- new.env()
g$a <- 1
g2 <- new.env()
g2$a <- 1
identical(as.list(g), as.list(g2))
## [1] TRUE
all.equal(g, g2) # the values are the same
## [1] TRUE
identical(g, g2) # but they are not identical
## [1] FALSE
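To see which parts of two objects break identical(), the comparison can be run component by component. The sketch below uses plain lists built by a hypothetical make_obj helper rather than real ggplot objects, so it runs in base R; for plot objects stored as lists, the same idea would presumably apply to their named components.

```r
# Two lists with equal contents; each holds a freshly created environment
make_obj <- function() {
  e <- new.env()
  e$a <- 1
  list(data = 1:3, env = e)
}
obj1 <- make_obj()
obj2 <- make_obj()

# Compare component by component to see where identity breaks
same <- vapply(names(obj1),
               function(n) identical(obj1[[n]], obj2[[n]]),
               logical(1))
print(same)
```

Only the component that holds an environment fails the check, which matches the explanation above: plain values are compared by content, environments by identity.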
