Overlapping Field Names in mice::md.pattern - r

I am generating a graphic in R with the md.pattern function from the mice package to show the rows and columns of a data.frame that contain missing values. The function creates the plot pasted below. The field headers overlap and are illegible. I've tried enlarging the dimensions of the image, but that doesn't change anything.
Any ideas on how I can work around this? Any insight is appreciated.

You can try the rotate.names argument, which rotates the column names so they no longer overlap:
md.pattern(x, plot = TRUE, rotate.names = TRUE)

Related

R Plot of Two Detrended Series Shows Line Chart Rather Than Scatterplot

I have a set of data which are within the exact same time frame, with the exact same number of points. I have detrended both so comovement can be analyzed. When I plot them against each other the graph attempts to create a line chart including dates.
(image: plot)
This is what the series look like in the environment:
(image: environment variables)
This is what the data looks like:
(image: data screenshot)
I would like this in a scatterplot measuring against both variables, just points and no lines or dates in the plot.
So I sorta figured it out, but it's super botched and I do not recommend that anyone else do this.
Essentially, I bound the two datasets together doing:
testvar <- cbind(dewagerealM, dewagerealF)
I was then able to select all the data on the left and the right, then plot them against each other like so:
plot(testvar[1:23,1], testvar[1:23,2])
This seems to have worked. It's not pretty and definitely not how it should be done, but it got the job done.
The easiest way to do this is to set the options xy.lines and xy.labels to FALSE:
plot(dewagerealM, dewagerealF, type = 'p',xy.lines = FALSE, xy.labels = FALSE)
Since you are plotting time series (ts) type objects, the help function help("plot.ts") can give you more details on the options you can use to plot these objects.
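As a rough sketch of the two approaches above, the example below uses two made-up annual series (series_m and series_f are hypothetical stand-ins for dewagerealM and dewagerealF). It shows the plot.ts options, and also the alternative of dropping the ts class with as.numeric() so that plot() treats the values as plain vectors.

```r
# Hypothetical data: 23 yearly observations, standing in for the detrended series
set.seed(1)
series_m <- ts(rnorm(23), start = 1990)
series_f <- ts(0.5 * series_m + rnorm(23, sd = 0.3), start = 1990)

# Option 1: keep the ts objects and switch off the connecting lines and date labels
plot(series_m, series_f, xy.lines = FALSE, xy.labels = FALSE)

# Option 2: strip the ts class so plot() draws an ordinary scatterplot
x_vals <- as.numeric(series_m)
y_vals <- as.numeric(series_f)
plot(x_vals, y_vals, xlab = "series_m", ylab = "series_f")
```

The second option avoids the cbind() and row-index workaround, since no subsetting is needed once the values are plain numeric vectors.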

Creating a list/data.frame of images to graph with ggplot2

I have a dataset (vg) with observations uniquely identified by an image's filename. I have a folder with these images, named the same way (e.g., 001.jpg ... 501.jpg). I would like to graph my data using the actual images as the "geom" in ggplot2.
I was able to figure out how to plot an image thanks to this answer. However, I am left with two issues: (1) plotting many images using a loop and (2) plotting the images on the graph based on the data in my dataset. I am presently concerned with reading the images into some sort of list or data frame. Here is what I have tried:
imgdir = "c:/mydir"
setwd(imgdir)
l = length(dir())
img <- graph.data.frame()
for (i in l) {
load = readJPEG(dir()[i], TRUE) %>%
rasterGrob(interpolate=TRUE)
img[1:length(load)] = load
img[1:length(load)] = dir()[i]
}
It's cute but it obviously does not work. Frankly, I do not understand how the object this creates functions. Ideally, I would have a dataframe with the filename and raster (is that what it is?), or even just load the image into my dataset. Then I would have to figure out how to give ggplot 700+ images to plot!
I am rather new to R and recognize this may not be possible the way I envision it. Thank you for your patience.

How to get the actual data from the function hist

I am very new to R, so I apologize if this is a basic question.
Is there any way to have the data behind the graph the function "hist" produces?
I don't need the graphic, I just need the data.
In general, it would be nice if I have the option to only get the data behind the functions that produce graphs and prevent drawing the actual plots.
Thank you,
There is no way to recover the original data from the object that hist returns.
If you are referring just to the data required to generate the plot, they are stored in hist(x)$mids and hist(x)$counts, which contain the bin midpoints and the counts, respectively. If you want just those data, you can call this function on the object returned by hist:
dataHist<-function(y){
rbind(y$mids,y$counts)
}
Try using hist(*yourvectorname*, plot = FALSE)
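A small sketch of the plot = FALSE suggestion, using a made-up vector x and explicit breaks chosen only for illustration. The returned object is a list, so the bin edges, counts, and midpoints can be read directly without anything being drawn.

```r
# Made-up sample; breaks are set explicitly so the bins are easy to follow
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5)
h <- hist(x, breaks = 0:5, plot = FALSE)

h$breaks   # bin edges
h$counts   # number of observations per bin
h$mids     # bin midpoints

# Collect the plotted values in a data frame for further use
hist_data <- data.frame(mid = h$mids, count = h$counts)
hist_data
```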

Graphing multiple variables in R

I am currently attempting to graph multiple columns of a matrix in R. So far, I have figured things out, but here is my problem: when I submit a matrix with 5 columns, I only get a graph with 4 lines. I've noticed that the missing line is always the line closest to the x-axis. I've been working on this for several hours now, and I have tried several different things. Any advice or help on how to get R to produce that 5th line (with a corresponding color filling the space between the x-axis and the line) would be greatly appreciated.
gender=cbind(matrix(malepop),matrix(femalepop))
plotmat(year,gender)
#a sample set
biggen=cbind(malepop,femalepop,malepop,femalepop)
#start of the function
plotmat2=function(years,m,colors){
n=m/1000000
#create a plot with the base line
plot(years,n[,1],type='l',ylim=c(0,10))
##create a for loop to generate all other lines and fill in the spaces
for (i in ncol(n):2) {
newpop=matrix(rowSums(n[,1:i]))
lines(year,newpop)
cord.xmat=c(min(years),years,max(years))
cord.ymat=c(-1,newpop[,1],-1)
polygon(cord.xmat,cord.ymat,col=clrs[i])
next
cord.xmat=c(min(years),years,max(years))
cord.ymat1=c(-1,n[,1]/1000000,-1)
polygon(cord.xmat,cord.ymat,col="purple")
}
}
#sample color set
clrs=c("red","blue","yellow","pink","purple", "cyan", "hotpink")
#run the function
plotmat2(year,biggen,clrs)
Thanks for any and all help you can provide!
It might be that you are unintentionally covering up your first line with the other colored sections, and that you may be skipping the creation of the polygon for n[,1].
From the way you tried to graph the columns in descending order, I am assuming you know that your columns are in ascending size order (the section that is pink in your example plot would be the final column in the matrix "biggen"). In case I am wrong about this, it might be a good idea to change your polygon shading using the density argument, which may help you see if you are covering up other sections by accident.
## plotmat2 function
plotmat2=function(years,m,colors){
n=m/1000000
#create a blank plot based on the baseline
plot(years,n[,1],type='n',ylim=c(0,10))
##loop from the largest cumulative total down to the first column, filling each area
for (i in ncol(n):1) {
#drop=FALSE keeps n[,1:i] a matrix when i is 1, so rowSums still works
newpop=matrix(rowSums(n[,1:i,drop=FALSE]))
#use the years argument rather than a global variable
lines(years,newpop)
cord.xmat=c(min(years),years,max(years))
cord.ymat=c(-1,newpop[,1],-1)
polygon(cord.xmat,cord.ymat,col=colors[i], density=10)
}
}
P.S. If this doesn't help fix the problem, it might help if you provided a portion of your dataset. I am still learning about R and about StackOverflow, but that seems to be sensible advice that is given on a lot of the threads I have read on here. Good luck!
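For comparison, here is a minimal sketch of the same stacking idea with the running totals computed once up front. The data, the variable names (years, m, cum, cols), and the colors are all made up for illustration. Drawing from the largest total down to the first column means each smaller area is painted on top of the larger ones, so none gets covered.

```r
# Hypothetical data: 5 columns observed over 10 years (column j is constant j)
years <- 2001:2010
m <- matrix(rep(1:5, each = 10), nrow = 10)

# Running row totals: column i of cum is the sum of columns 1..i of m
cum <- t(apply(m, 1, cumsum))

cols <- c("red", "blue", "yellow", "pink", "purple")

# Empty plot sized to the largest total, then fill from the top layer down
plot(years, cum[, ncol(cum)], type = "n", ylim = c(0, max(cum)), ylab = "total")
for (i in ncol(cum):1) {
  polygon(c(min(years), years, max(years)), c(0, cum[, i], 0), col = cols[i])
}
```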

Heatmap generation

I have an Excel file with two columns. One column has values ranging from 2 to 15, and the other column has values ranging from negative to positive numbers.
I want to produce a heatmap in which each value in the first column gets a different color and the second column is shown as a gradient.
I tried using Excel conditional formatting to do this.
But I want to know: is there any way to do it in R?
The R command image() takes a matrix and makes a heatmap from it. See the help page: ?image. Also worth considering is the heatmap function, which is basically image() with some clustering applied. Below are two examples from these two plotting routines:
image(volcano,col = terrain.colors(30))
heatmap(volcano,col = terrain.colors(30))
Probably the easiest way to export your data from Excel to R is to save it as a .csv file (comma or tab-separated text file), and then import it using read.table()
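Putting the CSV import and image() together, the sketch below is one possible way to get a distinct color per value in the first column and a gradient for the second. Everything about the data is an assumption: the temporary file, the column names group and score, and the values are invented stand-ins for the exported Excel sheet.

```r
# Hypothetical stand-in for the sheet exported from Excel
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(group = c(2, 5, 9, 15, 5),
                     score = c(-1.5, 0.2, 3.1, -0.4, 2.0)),
          csv_path, row.names = FALSE)

dat <- read.csv(csv_path)
rows <- seq_len(nrow(dat))

# Map each distinct group value to an integer code 1..k
group_code <- as.integer(factor(dat$group))
k <- max(group_code)

op <- par(mfrow = c(1, 2))
# Column 1: one color per distinct value
image(x = 1, y = rows, z = t(group_code), col = rainbow(k),
      xaxt = "n", xlab = "group", ylab = "row")
# Column 2: continuous gradient
image(x = 1, y = rows, z = t(dat$score), col = heat.colors(20),
      xaxt = "n", xlab = "score", ylab = "row")
par(op)
```

Each column is drawn as its own one-cell-wide image so that the two palettes do not have to share a scale.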
You can easily create an interactive heatmap in R using plotly:
library(plotly)
plot_ly(z = volcano, type = "heatmap")
More instructions here.
