I am attempting to use the aggr() function of the VIM package to plot missing data patterns. My plots don't show the frequencies/proportions of missing data patterns to the outside of the right-side axis. Should look like this. I'm getting an error that there is "not enough horizontal space to display frequencies".
library(VIM)
aggr(sleep, prop = T, numbers = T)
I don't do much base R plotting. I think this has something to do with margins. I reviewed this informative tutorial on margins, but I did not find my way to a solution.
I had two problems: one localized and one related to aggr().
1) This function helped me to reset par("pin"). Resetting made the toy example work.
resetPar <- function() {
dev.new()
op <- par(no.readonly = TRUE)
dev.off()
op
}
par(resetPar())
2) My actual use case was still failing with the same horizontal space error. I realized I needed to set the cex.numbers parameter of aggr() to something less than 1.
Related
I want to remove the light gray grid lines from a Lattice dotplot. After searching R help pages, Sarkar's book, and the web, the one answer I've found is this post, which explains that you can set the grid line width to zero for all dotplots using this magic:
## turn off grid lines
d1 <- trellis.par.get("dot.line")
d1$lwd <- 0 ## hack -- set line width to 0
trellis.par.set("dot.line",d1)
Example: Try dotplot(VADeaths[,"Rural Female"]) before and after doing the preceding.
This solution works, but I would have thought that there would be a way to control the grid lines from inside the dotplot function, perhaps using a panel function. Is there a way to do that? (An authoritative "No" could count as a correct answer.)
Setting col.line = "transparent" inside panel.dotplot should solve this issue. See also ?panel.dotplot.
dotplot(VADeaths[, "Rural Female"], panel = function(...) {
panel.dotplot(..., col.line = "transparent")
})
This is probably going to sound like a really stupid question but my plot in R will not change color!
this is my line in R
plot(MasterModCfs$V3,MasterModCfs$V5,col="blue")
I have tried it with spaces and without, with different colors, reloading, everything I could think of. If it's important, MasterModCfs contains literally thousands of data values.
I did one of the examples just to check
cars <- c(1, 3, 6, 4, 9)
plot(cars)
plot(cars, col="blue")
and it's blue. So that works.
Why won't my plot change colors?
Googling the same problem, I came across this present page - but subsequently found the source of my own issue through trial and error so posting here in case it helps. My x axis values were set up as factors - when I re-converted them to a normal string using as.character(), I was able to re-apply my own colours finally.
Without knowing what your par() settings are and what the classes of your columns, it's impossible to say.
You might try adding 'pch=21, bg="red"' to your plot command and see if that changes anything. It might give you a hint.
I suggest using the aes command in ggplot2. You can try
ggplot(data = MasterModCfs) + geom_point (mapping =aes (x=V3, y=V5), color="blue")
That should work!
Using plot directly I am not sure how to do it. I try your simple example (with cars) but it also fails if you add another vector for the "y" axis (as you wish to do in the end). To be honest, I have no idea why it works using only "cars" and the color.
I have this pairs plot
I want to make this plot bigger, but I don't know how.
I've tried
window.options(width = 800, height = 800)
But nothing changes.
Why?
That thing's huge. I would send it to a pdf.
> pdf(file = "yourPlots.pdf")
> plot(...) # your plot
> dev.off() # important!
Also, there is an answer to the window sizing issue in this post.
If your goal is to explore the pairwise relationships between your variables, you could consider using the shiny interface from the pairsD3 R package, which provides a way to interact with (potentially large) scatter plot matrices by selecting a few variables at a time.
An example with the iris data set:
install.packages("pairsD3")
require("pairsD3")
shinypairs(iris)
More reference here
I had the same problem with the pairs() function. Unfortunately, I couldn't find a direct answer to your question.
However, something that could help you is to plot a selected number of variables only. For this, you can either subset the default plot. Refer to this answer I received on a different question.
Alternatively, you can use the pairs2 function which I came across through this post.
To make the plot bigger, write it to a file. I found that a PDF file works well for this. If you use "?pdf", you will see that it comes with height and width options. For something this big, I suggest 6000 (pixels) for both the height and width. For example:
pdf("pairs.pdf", height=6000, width=6000)
pairs(my_data, cex=0.05)
dev.off()
The "cex=0.05" is to handle a second issue here: The points in the array of scatter plots are way too big. This will make them small enough to show the arrangements in the embedded scatter plots.
The labels not fitting into the diagonal boxes is resolved by the increased plot size. It could also be handled by changing the font size.
I don't know if you have seen some unwanted bold-face font like picture below:
As you see the third line is bold-faced, while the others are not. This happens to me when I try to use ggplot() with lapply() or specially mclapply(), to make the same chart template based on different data, and put all the results as different charts in a single PDF file.
One solution is to avoid using lapply(x, f) when f() is a function that returns a ggplot() plot, but I have to do so for combining charts (i.e. as input for grid.arrange()) in some situation.
Sorry not able to provide you reproducible example, I tried really hard but was not successful because the size of code and data is too big with several nested functions and when I reduced complexity to make a reproducible example, the problem did not happen.
I asked the question because I guessed maybe someone has faced the same experience and know how to solve it.
My intuition is that it's not actually being printed in bold, but rather double-printed for some reason, which then looks bold. This would explain why it doesn't come up with a simpler example. Especially given your mention of nested functions and probably other complicated structures where it's easy to get an off-by-one or similar error, I would try doing something where you can see exactly what's being plotted -- perhaps by examining the length() of the return value from apply().
Changing the order of elements of the vector, so that the order of the elements in the key is different, may also help. If you consistently get the bold-face on the last element, that also tells you a little bit more about where something is going wrong.
As #Dinre also mentioned, it could also be related to your plotting device. You can try out changing your plotting device. I have my doubts about this though, seeing as it's not a consistent problem. You could also try changing the position of the key, which depending on your plotting device and settings, may move you in or out of a compression block, thus changing which artifacts crop up.
Reproducible example and a solution may be as follows:
library(ggplot2)
d <- data.frame(x=1:10, y=1:10)
ggplot(data = d, aes(x=x, y=y)) +
geom_point() +
geom_text(aes(3,7,label = 'some text 10 times')) +
geom_text(data = data.frame(x=1,y=1),
aes(7,3, label = 'some text one time'))
When we try to add a label by geom_text() manually inserting x and y do not shorten the data. Then same label happen to be printed as many times as the number of rows our data has. Data length may be forced to 1 by replacing data within geom_text().
I hope this isn't redundant as I've googled extensively and still have not found an answer. I'm plotting intraday data and want to place a vertical line at a specific point in time. It seems I have to use the function addTA but it always plots below my graph in some weird empty white space. Here's some sample code and data. Thanks for any help.
Data:
date,value
29-DEC-2010:00:02:04.000,99.75
29-DEC-2010:00:03:44.000,99.7578125
29-DEC-2010:00:05:04.000,99.7578125
29-DEC-2010:00:07:53.000,99.7421875
29-DEC-2010:00:07:57.000,99.71875
29-DEC-2010:00:09:20.000,99.7421875
29-DEC-2010:00:11:04.000,99.75
29-DEC-2010:00:12:56.000,99.7421875
29-DEC-2010:00:13:05.000,99.7421875
Code:
#set up data
data = read.csv("foo.csv")
values = data[,2]
time = c(strptime(data[,1],format="%d-%b-%Y:%H:%M:%S",tz="GMT"))
dataxts = xts(values, order.by=time,tzone="GMT")
# chart data
chartSeries(dataxts)
# add vertical line - this is where I have no clue what's going on.
addTA(xts(TRUE,as.POSIXlt("2010-12-29 00:11:00",tz="GMT"),on=1))
What ends up happening is that I get a vertical line where I want it, 2010-12-29 00:11:00, but it sits in a new section below the graph instead of overlaid on it. Any ideas?
You're passing on as an argument to xts, but you should be passing it to addTA.
I think you mean to be doing this:
addTA(xts(TRUE,as.POSIXlt("2010-12-29 00:11:00",tz="GMT")),on=1)
That said, it still doesn't work for me with your sample data. However, if what you had works with your real data except it puts the line in a new panel, then this should work and not open a new panel.
I have an addVLine function in my qmao package that is essentially the same thing.
Edit
Aside from the typo with the parenthesis, the xts object also needs a column name in order for addTA to work (I think... at least in this case anyway). Also, you must give an x value that exists on your chart (i.e. 11:04, not 11:00)
colnames(dataxts) <- "x"
chartSeries(dataxts)
addTA(xts(TRUE,as.POSIXlt("2010-12-29 00:11:04",tz="GMT")),on=1, col='blue')
# or
# library(qmao)
# addVLine(index(dataxts[7]))
Edit 2
Perhaps a better approach is to use the addLines function
addLines(v=7)
Or, if you know the time, but don't know the row number, you could do this
addLines(v=which(index(dataxts) == as.POSIXlt("2010-12-29 00:11:04", tz="GMT")))
which gives