Obtaining all pairwise scatterplots amongst variables - r

I'm trying to solve this question with the data frame stackloss:
Use the pairs() function to obtain all pairwise scatterplots among the
four variables.
However, when I use the pairs function I get a single graph with all the variables plotted together. How can I make sure that I get the variables pairwise, so that only two variables appear per graph window?
My code is:
pairs(stackloss,pch=21,bg=c("red","green","yellow","blue"))
Thank you

It is not quite clear how you want to obtain the plots. I put the plot() function inside two nested loops and used Sys.sleep() to pause briefly between calls. If you use RStudio, you can step back through the previously shown plots.
for (ii in 1:(ncol(stackloss) - 1)) {
  begin <- ii + 1
  for (i in begin:ncol(stackloss)) {
    plot(x = stackloss[, ii], y = stackloss[, i],
         xlab = colnames(stackloss)[ii], ylab = colnames(stackloss)[i])
    Sys.sleep(1)
  }
}
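As a possible variation on the nested loops, the column pairs can be enumerated with combn() first and then plotted one per window. This is only a sketch; the names pair_idx, a and b are made up for illustration.

```r
# Every unordered pair of column indices, one pair per matrix column.
pair_idx <- combn(ncol(stackloss), 2)

for (k in seq_len(ncol(pair_idx))) {
  a <- pair_idx[1, k]
  b <- pair_idx[2, k]
  # One scatterplot per pair; each call to plot() starts a new plot.
  plot(stackloss[[a]], stackloss[[b]],
       xlab = names(stackloss)[a], ylab = names(stackloss)[b])
}
```

With four variables this draws choose(4, 2) = 6 plots, the same pairs that the nested loops visit.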

Related

Adding multiple lines to plot, without ggplot

I would like to plot multiple lines on the same plot, without using ggplot.
I have scores for different individuals across a set time period and wish to plot a line between yearly scores for each individual. Data is organised with each row representing an individual and each column an observed value in a given year.
Currently I am using a for loop, but am aware that this is often not efficient in R, and am interested if there are any more suitable approaches available within base R.
I will be working with up to 100,000 individuals.
Thanks.
Code:
df=data.frame(runif(10,0,100),runif(10,0,100),runif(10,0,100),runif(10,0,100))
df=data.frame(t(df))
Years=seq(1,10,1)
plot(1,type="n",xlab="Year",ylab="Score", xlim=c(1,10), ylim=c(0,100))
for(x in 1:4){lines(Years,df[x,])}
Efficiency is not much of a consideration when plotting since plotting to a device is a slow operation in itself. You can use matplot (which uses a loop internally). It's basically a more sophisticated version of your code wrapped in a function.
matplot(Years, t(df), xlab="Year", ylab="Score", type = "l")
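If the scores are kept in a matrix from the start, the double transpose can be avoided. A minimal sketch, assuming one row per individual and one column per year; the names scores, n_ind and n_years and the simulated values are placeholders, not the asker's data.

```r
set.seed(1)
n_ind   <- 4    # number of individuals (placeholder value)
n_years <- 10   # number of yearly observations (placeholder value)

# Rows are individuals, columns are years.
scores <- matrix(runif(n_ind * n_years, 0, 100), nrow = n_ind)

# matplot() draws one line per column, so transpose to years x individuals.
matplot(seq_len(n_years), t(scores), type = "l", lty = 1,
        xlab = "Year", ylab = "Score")
```

With many thousands of individuals, a single colour (for example col = "grey40") is probably easier to read than matplot's default colour cycling.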

Plot multiple similar equations in r

I am new to R. I want to plot a graph like this.
The curves are created by these equations :
(log(0.4)-(0.37273*log(x)-1.79389))/0.17941
(log(0.5)-(0.37273*log(x)-1.79389))/0.17941
(log(0.6)-(0.37273*log(x)-1.79389))/0.17941
etc. The equations are similar; the only difference is the first log(XXX). I have already drawn the graph manually by repeating plot() for each equation.
But I think there must be a way to just assign a simple variable like
x<-c(0.4,0.5,0.6,0.7)
and then plot all the curves automatically. I tried to use data frame to make a set of equations, but failed.
You can create a function-generating function and then loop over values of interest. For example
# takes a value, returns a function
logfn <- function(b) {
  function(x) (log(b) - (0.37273 * log(x) - 1.79389)) / 0.17941
}
x <- c(0.4, 0.5, 0.6, 0.7)
# empty plot
plot(0, 0, type = "n", ylim = c(-5, 5), xlim = c(1, 8), xlab = "Length", ylab = "Z-score")
# add one curve per value with `curve()`
for (v in x) {
  curve(logfn(v)(x), add = TRUE)
}
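An alternative that avoids the explicit loop is to evaluate all the curves on a common grid and hand the resulting matrix to matplot(). This is a sketch under the same equation as above; the names b, len and z are chosen here for illustration.

```r
b   <- c(0.4, 0.5, 0.6, 0.7)          # the value inside the first log()
len <- seq(1, 8, length.out = 200)    # grid of x values

# One column of z per value of b.
z <- sapply(b, function(bi) (log(bi) - (0.37273 * log(len) - 1.79389)) / 0.17941)

matplot(len, z, type = "l", lty = 1, xlab = "Length", ylab = "Z-score")
legend("topright", legend = b, col = seq_along(b), lty = 1, title = "b")
```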

qqnorm plotting for multiple subsets

I am very new to R. I have figured out how to make qqnorm plots on a subset of my dataframe. However, I would like to make qqnorm plots on subsets that are defined by two factors (one factor has 48 categories (brain_region) and each of those categories can be further subdivided by another factor, which has three levels (GroupID)). I have tried the following:
by(t, t[,"GroupID"], function(x) tapply(t$FA,t$brain_region,qqnorm))
but it does not seem to be working. I'm also not sure if this is the best approach, as I'm new to this program.
I would also like to save each separately generated qqnorm plot, with the x axis labeled "FA" and a title giving the specific level of each of the two factors (brain region/GroupID). Thank you very much for any help.
Plotting is one of the few things where apply isn't the optimal solution. ggplot offers you enough possibilities to get this done, as shown in this answer.
Plotting all levels in one go
If you use base graphics, a for loop is the better choice here. If you also want several plots on the same graphics device, you can use e.g. par(mfrow=) or layout() (see the help page ?layout).
Let's take the built-in data set iris as an example:
data(iris)
op <- par(mfrow=c(1,3))
for(i in levels(iris$Species)){
tmp <- with(iris, Petal.Width[Species==i])
qqnorm(tmp,xlab="Petal.Width",main=i)
qqline(tmp)
}
par(op)
rm(i,tmp)
gives: [figure: normal Q-Q plots of Petal.Width, one panel per species]
Don't forget to clean up your workspace after using a for loop. Not really obligatory, but it can prevent serious confusion later on.
Combine two factors
In order to get this done for 2 factor levels at the same time, you can either construct a nested for-loop, or combine both factors into a single factor. Take the dataset mtcars:
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)
mtcars$am <- factor(mtcars$am,
labels=c('automatic','manual'))
To combine both levels, you can use this simple construct :
mtcars$combined <- factor(paste(mtcars$cyl,mtcars$am,sep='/'))
And then do the same again. With two for loops, your code would look like the code below. Be warned, though, that this only works if you have data for every combination of the factors and you don't have too many levels. If you have a lot of levels, it is better to save the plots using e.g. png() (see ?png for info) instead of plotting them all on the same graphics device.
lcyl <- levels(mtcars$cyl)
lam <- levels(mtcars$am)
par(mfrow = c(length(lam), length(lcyl)))
for (i in lam) {
  for (j in lcyl) {
    tmp <- with(mtcars, mpg[am == i & cyl == j])
    # the plotted variable is mpg, so label the axis accordingly
    qqnorm(tmp, xlab = "mpg",
           main = paste(i, j, sep = "/"))
    qqline(tmp)
  }
}
gives: [figure: grid of normal Q-Q plots of mpg, one panel per am/cyl combination]
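For the part of the question about saving each plot separately, one possible approach is to split the variable by both factors and open a new file device inside the loop. This is a sketch only: the output directory, the file naming scheme and the names groups and files are assumptions. pdf() is used here only because it is available on every R build; png() takes a file name in the same way.

```r
data(mtcars)

# Split mpg by every cyl/am combination; drop = TRUE skips empty combinations.
groups <- split(mtcars$mpg, list(mtcars$cyl, mtcars$am), drop = TRUE)

out_dir <- tempdir()      # assumption: replace with your own folder
files <- character(0)

for (g in names(groups)) {
  f <- file.path(out_dir, paste0("qq_", g, ".pdf"))
  pdf(f)                  # open one file per group
  qqnorm(groups[[g]], xlab = "mpg", main = g)
  qqline(groups[[g]])
  dev.off()               # close the file before moving on
  files <- c(files, f)
}
```

Because empty combinations are dropped before the loop starts, this version does not need data for every pairing of the two factors.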

for loop generates multiple plots

I have a question about the for loop in combination with the plot function.
I want to use a for loop (see below) to plot multiple points in one plot, but my loop generates a separate plot for each point. So with i running to 35, I generate 35 plots. Is there a way to plot all the points in the same plot?
pdf("test plot.pdf")
for (i in 1:nrow(MYC)){
  plot(MYC[i,1], MYC[i,2])
}
dev.off()
Thank you all!
As mentioned in the comments, you are in essence drawing a new plot on every pass through the loop. R doesn't understand that you actually want to add only points. There's a cure for that, and it comes in vials of points(). Before starting the loop, construct your plot using the type argument. This will make an empty plot, something along the lines of:
plot(your.data, type = "n")
You can then use your loop (with points) to add points to this existing plot.
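Put together, the idea might look like the sketch below. The MYC data frame here is a simulated stand-in with 35 rows, since the original data is not shown.

```r
set.seed(42)
# Hypothetical stand-in for the asker's MYC data: 35 rows, two numeric columns.
MYC <- data.frame(x = rnorm(35), y = rnorm(35))

# Set up the axes from the full data without drawing anything.
plot(MYC[, 1], MYC[, 2], type = "n", xlab = "x", ylab = "y")

# Add the points one at a time to the existing plot.
for (i in seq_len(nrow(MYC))) {
  points(MYC[i, 1], MYC[i, 2], pch = 19)
}
```

If the points need no per-point treatment, a single call plot(MYC[, 1], MYC[, 2]) draws them all at once and the loop is unnecessary.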

Producing statistics over levels

I've generated a set of levels from my dataset, and now I want to find a way to sum the rest of the data columns in order to plot it while plotting my first column. Something like:
levelSet <- cut(frame$x1, "cutting")
boxplot(frame$x1~levelSet)
for (l in levelSet)
{
x2Sum<-sum(frame$x2[levelSet==l])
}
or maybe the inside of the loop should look like:
lines(sum(frame$x2[levelSet==l]))
Any thoughts? I am new to R, and I can't seem to get the hang of the indexing and ~ notation thus far.
I know R doesn't work this way, but I'd like functionality that 'looks' like
hist(frame$x2~levelSet)
## Or
hist(frame$x2, breaks = levelSet)
To plot a histogram, boxplot, etc. over a level set:
Try the lattice package:
library(lattice)
histogram(~x2|equal.count(x1),data=frame)
Substitute shingle for equal.count to set your own break points.
ggplot2 would also work nicely for this.
To put a histogram over a boxplot:
par(mfrow=c(2,1))
hist(frame$x2)
boxplot(frame$x2)
You can also use the layout() command to fine-tune the arrangement.
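For the per-level sums asked about in the question, tapply() can replace the explicit loop. A minimal sketch, assuming a data frame called frame with numeric columns x1 and x2; the simulated values and the choice of four bins are made up for illustration.

```r
set.seed(1)
# Hypothetical data with the column names used in the question.
frame <- data.frame(x1 = runif(100, 0, 10), x2 = rexp(100))

# Cut x1 into four equal-width intervals (the number of bins is arbitrary).
levelSet <- cut(frame$x1, breaks = 4)

# Sum x2 within each level of levelSet.
x2Sum <- tapply(frame$x2, levelSet, sum)

barplot(x2Sum, xlab = "x1 interval", ylab = "Sum of x2")
```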
