I'm trying to visualise missing data with the R package VIM.
I'm using R version 3.4.0 with RStudio
I've used the function aggr() but the colnames of my dataframe seem to be too long. Thus, some labels of the x axis don't appear.
I would like to increase the space at the bottom of the x axis.
library(VIM)
aggr(df)
Here is my dataframe df and the plot I obtain
I've tried the par() function, but it doesn't change anything.
aggr(df,mar=c(10,5,5,3))
or
par(mar=c(10,5,5,3))
g=aggr(df,plot=FALSE)
plot(g)
I can reduce the font size with cex.axis but then labels are too small.
aggr(df,cex.axis=.7)
Here is the plot with small axis labels:
I haven't found many examples using aggr(), which is why I'm asking for your help.
Thank you in advance.
I think you are looking for the graphical parameter oma (the outer margins), which lets you change the space around the main plot. The help reference states:
For plot.aggr, further graphical parameters to be passed down. par("oma") will be set appropriately unless supplied (see par).
In your case you could do something like:
aggr(df, prop = TRUE, numbers = FALSE, combined = FALSE,
     labels = names(df), cex.axis = .9, oma = c(10,5,5,3))
Obviously, you need to play around with cex.axis and other parameters to find out what works best for your data.
I am trying to use base R to plot a time series both as a bar plot and as an ordinary line plot. I want to write a flexible function that draws such a plot without axes and then adds a common axis manually.
Now I am hampered by a strange problem: the same ylim values result in different axes. Consider the following example:
data(presidents)
# shorten this series a bit
pw <- window(presidents,start=c(1965))
barplot(t(pw),ylim = c(0,80))
par(new=T)
plot(pw,ylim = c(0,80),col="blue",lwd=3)
I intentionally draw the y-axes from both plots here to show that they are not the same. I know I can achieve the intended result by plotting the bar plot first and then adding lines using the x and y args of lines().
But I am looking for a flexible solution that lets you add lines to bar plots the way you add lines to point plots or other line plots. So is there a way to make sure the y-axes are the same?
EDIT: adding the usr parameter to par() doesn't help me here either.
par(new=T,usr = par("usr"))
Add yaxs="i" to your line plot, like this:
plot(pw,ylim = c(0,80),col="blue",lwd=3, yaxs="i")
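barplot() draws its value axis with the "internal" style (yaxs="i"), so the plot region spans exactly the ylim you give and the bars start right at the bottom edge. plot() defaults to yaxs="r", which extends the ylim range by 4% at each end, so that a line lying at y=0 stays visible instead of coinciding with the x-axis line. Passing yaxs="i" to plot() makes both calls use the same y range.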
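As a rough check of the difference between the two axis styles, the sketch below reads back par("usr") after each call. The pdf(NULL) device and the variable names usr_regular and usr_internal are only there for illustration and are not part of the original answer.

```r
# Draw to a null PDF device so no file is written.
pdf(NULL)

# Default axis style "r": the requested ylim is extended by 4% at each end.
plot(1:10, ylim = c(0, 80))
usr_regular <- par("usr")[3:4]

# Axis style "i": the plot region spans exactly the requested ylim.
plot(1:10, ylim = c(0, 80), yaxs = "i")
usr_internal <- par("usr")[3:4]

invisible(dev.off())

print(usr_regular)   # -3.2 83.2
print(usr_internal)  # 0 80
```

This only concerns the y-axis; the x coordinates used by barplot() and by a time-series plot still differ.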
I'm new to R. Previously, I've been able to overlay 2 separate plots that were of the same kind, p1 and p2, using plot (p1); plot (p2, add=T).
I'm struggling with the definition of factors when overlaying a barplot with a point plot showing all individual points.
I can plot the barplot on its own as I want it. The point plot also looks the way I want, but I realize I'm incorrectly defining phase as numeric to force plot() to display each value, rather than defaulting to a boxplot (as happens when I use plot(my.df$cond, my.df$val)).
Any tips on defining my variable types correctly or whether I'm using the correct barplot and plot functions, would be greatly appreciated. Thank you so much.
shpad <- c(1,2,5,6,1,2,5,6,1,2,5,6,1,2,5,6)
my.df <- data.frame(val=c(0.0738,0.0518,0.002,0.0397,0.1452,0.1152,0.1774,0.0658,0.0218,0.0497,-0.0296,0.0653,0.0848,0.1296,0.1416,0.0923),
phase=c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4),
sub=c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4),
cond=c("NsNm", "NsNm", "NsNm", "NsNm", "NsLm", "NsLm", "NsLm", "NsLm", "LsNm", "LsNm", "LsNm", "LsNm", "LsLm", "LsLm", "LsLm", "LsLm"))
avg <-tapply(my.df$val, my.df$phase, mean)
barplot(avg, border=NA, names.arg=c("NsNm", "NsLm", "LsNm", "LsLm"),col=c("blue","darkblue","red", "darkred"),ylab = "score",ylim=c(-0.03,0.25))
plot(my.df$phase, my.df$val, type="p", ylim=c(-0.03,0.25), ylab = "score", pch=shpad)
tl;dr: problem is that if instead of the last line, I have plot(my.df$phase, my.df$val, type="p", ylim=c(-0.03,0.25), ylab = "score", pch=shpad, add=T), the formats are incongruent.
Alright, so I've tried for a bit to accomplish what you wanted, but the best I could do with the base plotting system is this:
It uses your lines of code above unchanged, except for the last line, which I replaced with
points(my.df$phase,my.df$val,type="p",pch=shpad)
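One possible refinement, offered as a sketch rather than as part of the original answer: barplot() invisibly returns the x positions of the bar midpoints, and with the default bar width and spacing these are not 1, 2, 3, 4. Indexing those midpoints by phase puts each point over its own bar. The name bar_mid and the pdf(NULL) device are illustrative assumptions.

```r
# Draw to a null PDF device so no file is written.
pdf(NULL)

shpad <- rep(c(1, 2, 5, 6), times = 4)
val <- c(0.0738, 0.0518, 0.002, 0.0397, 0.1452, 0.1152, 0.1774, 0.0658,
         0.0218, 0.0497, -0.0296, 0.0653, 0.0848, 0.1296, 0.1416, 0.0923)
phase <- rep(1:4, each = 4)
avg <- tapply(val, phase, mean)

# barplot() returns the midpoint of each bar on the x-axis.
bar_mid <- barplot(avg, border = NA,
                   names.arg = c("NsNm", "NsLm", "LsNm", "LsLm"),
                   col = c("blue", "darkblue", "red", "darkred"),
                   ylab = "score", ylim = c(-0.03, 0.25))

# Place every observation at the midpoint of the bar for its phase.
points(bar_mid[phase], val, pch = shpad)

invisible(dev.off())
print(bar_mid)  # 0.7 1.9 3.1 4.3
```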
However, I think you can do much better, if you want to keep the same kind of plot, using the ggplot2 library. Using this code:
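library(ggplot2)
# One row per phase: the mean score plus the condition label used for the fill
new.df <- data.frame(avg = as.numeric(avg), phase = factor(names(avg)),
                     cond = c("NsNm","NsLm","LsNm","LsLm"))
ggplot(new.df) +
  geom_bar(stat = "identity", aes(x = phase, y = avg, fill = cond)) +
  # individual observations come from my.df, on the same discrete x scale
  geom_point(data = my.df, aes(x = factor(phase), y = val, shape = factor(shpad))) +
  scale_x_discrete(name = "Type", labels = c("NsNm","NsLm","LsNm","LsLm")) +
  ylab("Score")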
you can make this chart:
I didn't adjust the coloring and the point types and the legend titles (not sure how important they are, but those can be fiddled with). However, you can see this probably produces the result you were aiming for.
In R I often add legends to my plots like this
legend("topright",c("a=1","b=1"),lwd=c(1,2))
However, what I want to do is produce a plot which contains nothing but that legend. How do I do it? (Preferably without using a package such as ggplot.)
You can generate a new, empty plot frame using frame() or plot.new():
plot.new()
legend("topright",c("a=1","b=1"),lwd=c(1,2))
Use the type='n' parameter as in:
plot(x,y,type='n')
See ?plot.default for details. If you want to add text, points, or lines to the plot afterward, you may want to provide the x and y parameters and/or the xlim and ylim parameters in order to set up the plotting region.
You can also drop the axes with the argument axes=F, and you can set the xlab,ylab, and main to NA, if you really want a blank plot.
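Putting these options together, a legend-only plot could look roughly like the sketch below. The dummy coordinates (0, 0), the pdf(NULL) device, and the variable name lg are assumptions made for illustration.

```r
# Draw to a null PDF device so no file is written.
pdf(NULL)

# Set up an empty plotting region: no points, no axes, no labels, no title.
plot(0, 0, type = "n", axes = FALSE, xlab = NA, ylab = NA, main = NA)

# The legend is then the only thing drawn; legend() returns its layout invisibly.
lg <- legend("topright", c("a=1", "b=1"), lwd = c(1, 2))

invisible(dev.off())
```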
I am working with time series with millions of points. I normally plot this data with
plot(x,type='l')
Things slow down terribly if I accidentally type
plot(x)
because the default is type='p'.
Is there any way, using setHook() or something else, to modify the default plot(type=...) during an R session?
I see from "How to set a color by default in R for all plot.default, plot or lines calls" that this can be done for par() parameters like 'col'. But there doesn't appear to be any points-vs-line setting in par().
A lightweight solution is to define a wrapper function that calls plot() with type="l" plus any other arguments you give it. This approach has some possible advantages over changing an existing function's defaults, a few of which are mentioned here:
lplot <- function(...) plot(..., type="l")
x <- rnorm(9)
par(mfcol=c(1,2))
plot(x, col="red", main="plot(x)")
lplot(x, col="red", main="lplot(x)")
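One caveat with the wrapper above: calling lplot(x, type="p") fails, because type is then supplied twice to plot(). A variant that keeps "l" as the default but still allows an override might look like this sketch; the name lplot2 is hypothetical and not from the original answer.

```r
# Like lplot(), but `type` is a formal argument with default "l",
# so callers can still override it when they want points.
lplot2 <- function(x, ..., type = "l") plot(x, ..., type = type)

# Draw to a null PDF device so no file is written.
pdf(NULL)
x <- rnorm(9)
lplot2(x, col = "red", main = "lines by default")
lplot2(x, type = "p", main = "points on request")
invisible(dev.off())
```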