Placing rows at user-defined distances in filled.contour in R - r

I am trying to create filled.contour using the following matrix.
row1 <- rep(10,100)
row2 <- sample(c(10:30),100,replace=TRUE)
row3 <- rep(30,100)
z1 <- cbind(row1,row2,row3)
col1 <- colorRampPalette(c('red','yellow','deepskyblue'))(20)
filled.contour(z=z1,col=col1,cex.lab=2,cex.main=1.1,nlevels=20,main=('Heat map'))
I get the following plot:
(You won't see the border in the legend because I have modified filled.contour as stated here)
You can see that row2 is placed exactly in the middle (at position 0.5 on the axis). My question is as follows:
Is it possible to place the rows at user-defined locations rather than symmetrically? For example, I need the rows at positions c(0,.33,1) instead of the default c(0,.5,1).

While waiting for a response here, I kept trying, and it turned out to be a very simple problem.
You just need to change the default y argument of filled.contour.
Here is the (not significantly) modified call, which gives the following image:
filled.contour(z=z1,y=c(0,.33,1),col=col1,cex.lab=2,cex.main=1.1,nlevels=20,main=('Heat map with custom y placement'))
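For anyone adapting this, here is a minimal sketch of the constraint behind the fix: as far as I understand it, filled.contour wants one y value per column of z, in increasing order (and likewise one x value per row). The name ypos and the set.seed call are my own additions, not part of the original code.

```r
set.seed(1)  # only so the random middle row is reproducible
z1 <- cbind(row1 = rep(10, 100),
            row2 = sample(10:30, 100, replace = TRUE),
            row3 = rep(30, 100))
col1 <- colorRampPalette(c("red", "yellow", "deepskyblue"))(20)

# hypothetical name: one y position per column of z1, strictly increasing
ypos <- c(0, 0.33, 1)
stopifnot(length(ypos) == ncol(z1), !is.unsorted(ypos, strictly = TRUE))

# x is spelled out here; it is the same as the default for a 100-row matrix
filled.contour(x = seq(0, 1, length.out = nrow(z1)), y = ypos, z = z1,
               col = col1, nlevels = 20,
               main = "Heat map with custom y placement")
```

The stopifnot line is just a guard so that a wrong-length or unsorted position vector fails early with a readable message.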

Related

I want to use heatmap in my code but I am getting an error

heatmap(Web_Data$Timeinpage)
str(Web_Data)
heat = c(t(as.matrix(Web_Data$Timeinpage[,-1])))
heatmap(heat)
A few items to note here:
1) By wrapping the expression in c(), as in c(t(as.matrix(Web_Data$Timeinpage[,-1]))), you are creating a single vector and not a matrix. You can see this by running the following: is.matrix(c(t(as.matrix(Web_Data$Timeinpage[,-1])))). heatmap (I believe) is checking for a matrix because...
2) You need to provide a matrix with at least two rows and two columns for this function to work. Currently, you are only giving it one vector, the time. You will need to provide some other feature of interest for it to work correctly, such as Continent.
3) If you intend to plot ONLY one field, you may consider doing as suggested here and using the image() function. (I included an example below.)
4) I find the look of the heatmap function somewhat dated. You may want to consider other popular functions, such as ggplot's geom_tile. (see here).
Below is an example code that should produce an output:
#fake data
Web_Data <- data.frame("Timeinpage" = c(123,321,432,555,332,1221,2,43,0, NA,10, 44),
OTHER = rep(c("good", "bad"), 6))
#a matrix with TWO rows built from my data frame. Notice the c() around t() is removed and I am not transposing. Also removing the , from [,-1]
#OTHER is converted to numeric codes because heatmap() requires a numeric matrix
heat <- matrix(c(Web_Data$Timeinpage[-1], as.numeric(factor(Web_Data$OTHER[-1]))), 2, 11, byrow = TRUE)
#output
heatmap(heat)
#one row
heat2 <- as.matrix(sort(Web_Data$Timeinpage[-1])) #sorting as well
#output
image(heat2)
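To make point 1 concrete, here is a small sketch on a toy matrix (not the Web_Data from the question). The helper ok_for_heatmap is a made-up name that only mirrors the two requirements mentioned above; it is not part of heatmap itself.

```r
m <- matrix(1:6, nrow = 2)   # a 2 x 3 numeric matrix
v <- c(t(m))                 # c() drops the dim attribute, leaving a plain vector

is.matrix(m)     # TRUE
is.matrix(v)     # FALSE
is.null(dim(v))  # TRUE

# hypothetical helper: numeric matrix with at least 2 rows and 2 columns
ok_for_heatmap <- function(x) {
  is.matrix(x) && is.numeric(x) && all(dim(x) >= 2)
}

ok_for_heatmap(m)  # TRUE
ok_for_heatmap(v)  # FALSE
```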

How to check if a column has numeric or categorical levels in R?

I am trying to plot 9 barplots in a 3X3 matrix in R using base-R wrapped inside a for loop. (I am working on a workhorse solution for visualizing every column before I begin working on manipulating data) Below is the code:
library(ISLR);
library(ggplot2);
# load wage data
data(Wage)
par(mfrow=c(3,3))
for(i in 1:(dim(Wage)[2]-2)){
plot(Wage[,i],main = paste0(names(Wage)[i]),las = 2)
}
Unfortunately this does not work properly for the first 2 columns, because they are numeric and actually need a histogram. I understand that I need to fit an if-else condition somewhere inside the for() statement, but that is giving me errors. Below is the output, where the first 2 columns are plotted wrong. (Age and year are actually numeric and I may need to put them on the x-axis instead of defaulting them to y.)
Could you suggest an edit/hack? I also learnt that I can't use par() when I am wrapping ggplot inside a for loop, so I had to use base R; otherwise ggplot would have been great aesthetically.
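One possible way to do the if-else is to test each column with is.numeric(). The sketch below uses the built-in iris data as a stand-in for Wage (that substitution is my assumption, so it runs without ISLR), and the kinds vector is only there to record which branch was taken.

```r
# iris stands in for Wage here (assumption): 4 numeric columns and 1 factor
dat <- iris
old_par <- par(mfrow = c(2, 3))
kinds <- character(ncol(dat))

for (i in seq_along(dat)) {
  col_i <- dat[[i]]
  if (is.numeric(col_i)) {
    # numeric column: histogram, values on the x-axis
    hist(col_i, main = names(dat)[i], xlab = names(dat)[i])
    kinds[i] <- "numeric"
  } else {
    # factor or character column: bar plot of the level counts
    barplot(table(col_i), main = names(dat)[i], las = 2)
    kinds[i] <- "categorical"
  }
}

par(old_par)  # restore the previous layout
```

With Wage, the same loop should work by setting dat <- Wage and a 3x3 layout, though I have not run it on that data.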

R statbin_mean with fixed bins of x

I would like to plot in R the equivalent of the binscatter command that you can find in Stata.
I have found the statar package that should give the same with the command stat_binmean.
I am having problems setting the bins, though. I want to specify the exact values of x at which the bins are constructed. So far I have only managed to set the number of bins, leaving it to R to choose the corresponding values of x.
The following is my code:
library(statar)
library(ggplot2)
g<-ggplot( df , aes(x=var_x , y=var_y))
g + stat_binmean(n=0)
From the statar documentation: "Set (n) to zero if you want to use distinct value of x for grouping", but how do I specify the values used for the grouping?
PS: I am also fine with other commands, like stat_summary_bin, but my problem stays the same.
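One approach that does not depend on statar is to bin x yourself with cut() and average within each bin. This is only a sketch on simulated data: df, var_x and var_y follow the names in the question, while the edges vector and the simulated values are made up. I believe ggplot2's stat_summary_bin also accepts a breaks argument, which may achieve the same thing inside a ggplot call, but I have not checked it for this case.

```r
set.seed(42)
df <- data.frame(var_x = runif(200, 0, 10))
df$var_y <- 2 * df$var_x + rnorm(200)

# hypothetical bin edges chosen by the user
edges <- c(0, 1, 2.5, 5, 10)
df$bin <- cut(df$var_x, breaks = edges, include.lowest = TRUE)

# mean of x and y within each user-defined bin
binned <- aggregate(cbind(var_x, var_y) ~ bin, data = df, FUN = mean)

# one point per bin, as in a binned scatter plot
plot(binned$var_x, binned$var_y, pch = 19,
     xlab = "var_x (bin mean)", ylab = "var_y (bin mean)")
```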

concise way to generate ordered sets of line segment coordinates

I wrote a quick hack to generate the coordinates of the endpoints of all "cell walls" in a plain old array of squares on integer coordinates.
dimx <- 4
dimy <- 5
xvert<-rep(1:(dimx+1),each=dimy)
yvert<-1:dimy
yvert<-rep(yvert,times=dimx+1)
vertwall<-cbind(xvert, xvert,yvert,yvert+1)
And similarly for the horizontal walls. It feels like I just reinvented some basic function, so: Faster, Better, Cleaner?
EDIT: consider a grid of cells. The bottom-left cell's two walls of interest have the coordinate x,y pairs (1,1),(1,2) and (1,1),(2,1) . Similar to the definition of crystal unit cells in solid-state physics, that's all that is required, as the next cell "up" has walls (1,2),(1,3) and (1,2),(2,2) and so on. Thus the reason for repeating the "xvert" data in my sample.
I am not sure I understand what you are trying to do (your column names are duplicated, which is confusing). You can try this, for example:
df = expand.grid( yvert= seq_len(dimy),xvert= seq_len(dimx))
transform(df,xvert1=xvert,yvert1=yvert+1)
CGW added for completeness' sake: generate both horizontal and vertical walls:
df = expand.grid( xvert= seq_len(dimx),yvert= seq_len(dimy))
transform(df,xvert1=xvert,yvert1=yvert+1) ->dfv
df2 <- expand.grid(yvert= seq_len(dimy), xvert= seq_len(dimx))
transform(df2,yvert1=yvert,xvert1=xvert+1) ->dfh
# make x,y same order in both arrays (select by name so the column names move with the values)
dfh <- dfh[,c("xvert","yvert","xvert1","yvert1")]
The expand.grid function creates Cartesian products of arrays, which provides most of what you need to do.
expand.grid(x=1:5,y=1:5)
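As a quick check, the sketch below rebuilds the vertical walls from the question with expand.grid and compares the two results. The names g, x0, y0, vertwall2 and same are mine; the point is only that putting the y sequence first makes it vary fastest, which matches the rep(..., each=) / rep(..., times=) ordering of the original.

```r
dimx <- 4
dimy <- 5

# original construction from the question
xvert <- rep(1:(dimx + 1), each = dimy)
yvert <- rep(1:dimy, times = dimx + 1)
vertwall <- cbind(xvert, xvert, yvert, yvert + 1)

# expand.grid version: the first argument varies fastest
g <- expand.grid(y0 = seq_len(dimy), x0 = seq_len(dimx + 1))
vertwall2 <- with(g, cbind(x0, x0, y0, y0 + 1))

# compare values only, ignoring the differing column names
same <- all(unname(vertwall) == unname(vertwall2))
```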

r producing multiple violin plots with one graphic device

I have a data frame that looks like that:
bin_with_regard_to_strand CLONE3
31 0.14750872
33 0.52735917
28 0.48559060
. .
. .
I want to use this data frame to generate violin plots in such a way that all of the values in CLONE3 corresponding to a given value of bin_with_regard_to_strand will generate one plot.
Further, I want all of the plots to appear in the same graphic device (I'm using R-studio, and I want all of the plots to appear in one plot window).
Theoretically I could do this with:
vioplot(df$CLONE3[which(df$bin_with_regard_to_strand==1)],
df$CLONE3[which(df$bin_with_regard_to_strand==2)]...)
but since bin_with_regard_to_strand has 60 different values, this seems a bit ridiculous.
I tried using tapply:
tapply(df$CLONE3, df$bin_with_regard_to_strand,vioplot)
But that would open 60 different windows (one for each plot).
Or, if I used the add parameter:
tapply(df$CLONE3, df$bin_with_regard_to_strand,vioplot(add=TRUE))
it generated a single plot with the data from all values of bin_with_regard_to_strand (separated by lines).
Is there a way to do this?
You could use par(mfrow=c(rows, columns)) (see ?par for details).
(see also ?layout for more complex arrangements)
d <- lapply(1:6, function(x)runif(100)) # generate some example data
library("vioplot")
par(mfrow=c(3, 2)) # use a 3x2 (rows x columns) layout
lapply(d, vioplot) # call plot for each list element
par(mfrow=c(1, 1)) # reset layout
Another alternative to mfrow is to use layout. It is very handy for organizing your plots. You just create a matrix of plot indices. Here is what you can do. Note that 60 violin plots is a huge number; maybe you should spread them over 2 pages.
The code below is parameterized by N (the number of plots):
library(vioplot)
N <- 60
par(mar=rep(2,4))
layout(matrix(c(1:N),
nrow=10,byrow=T))
dat <- data.frame(bin_with_regard_to_strand=gl(N,10),CLONE3=rnorm(10*N))
with(dat ,
tapply(CLONE3,bin_with_regard_to_strand ,vioplot))
This is an old question, but I thought I would put out a different solution for getting vioplot to draw multiple violin plots on the same graph (i.e. the same axes), rather than on different graphics objects like the above answers.
Basically, use do.call to apply vioplot to a list of data. Ultimately, vioplot is not very well written (you can't even set the title, axis names, etc.). I usually prefer base R, but this is a case where ggplot2 is probably the way to go.
x<-rnorm(1000)
fac<-rep(c(1:10),each=100)
listOfData<-tapply(x,fac,function(x){x},simplify=FALSE)
names(listOfData)[[1]]<-"x" #because vioplot requires a 'x' argument
do.call(vioplot,listOfData)
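A variation on the same idea, as a sketch: split() builds the per-bin list directly from the data frame, and dropping the list names lets the first element be matched to vioplot's x argument by position, so no renaming is needed. The data frame here is simulated with 6 bins instead of 60, and I am assuming the installed vioplot accepts a names argument for the axis labels; the call is skipped if the package is missing.

```r
set.seed(7)
# simulated stand-in for the real data: 6 bins of 50 values each
df <- data.frame(bin_with_regard_to_strand = rep(1:6, each = 50),
                 CLONE3 = rnorm(300))

# one numeric vector per bin, ordered by bin value
groups <- split(df$CLONE3, df$bin_with_regard_to_strand)
labels <- names(groups)
groups <- unname(groups)  # unnamed, so arguments are matched by position

if (requireNamespace("vioplot", quietly = TRUE)) {
  # all violins on the same axes, labelled by bin
  do.call(vioplot::vioplot, c(groups, list(names = labels)))
}
```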