ggplot with points and plots of certain columns - r

Let's say I have a data.frame of three columns:
x <- seq(1,10)
y <- 0.1*x^2
z <- y+rnorm(10,0,10)
d <- data.frame(x,y,z)
I now want a ggplot that plots the points (x,z) and somewhat smooth lines going through (x,y).
How can I achieve that?

"%>%" <- magrittr::"%>%"
d %>%
  ggplot2::ggplot(ggplot2::aes(x=x)) +
  ggplot2::geom_point(ggplot2::aes(y=z)) +   # raw points (x, z)
  ggplot2::geom_smooth(ggplot2::aes(y=y))    # smooth line through (x, y)
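Because y is an exact function of x here, the default confidence band adds little. The following is a minimal sketch of the same plot without the pipe, assuming loess smoothing with the band switched off. The seed and the `method`, `formula` and `se` settings are assumptions added for illustration, not part of the answer above.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the noisy z values are reproducible
x <- seq(1, 10)
y <- 0.1 * x^2
z <- y + rnorm(10, 0, 10)
d <- data.frame(x, y, z)

p <- ggplot(d, aes(x = x)) +
  geom_point(aes(y = z)) +                    # noisy observations (x, z)
  geom_smooth(aes(y = y), method = "loess",   # smooth curve through (x, y)
              formula = y ~ x, se = FALSE)    # se = FALSE hides the band
print(p)
```

If a line that passes exactly through the (x, y) values is enough, geom_line(aes(y = y)) could replace the geom_smooth layer.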

Related

In ggplot2 is there a relatively simple way of using different geoms for different groups in the data?

I have a set of data with multiple groups. I'd like to plot them on the same graph but with, say, a smooth line for one group and the data points for the other. Or with smooth lines for both, but data points for only one of them. An example:
library(reshape)
library(ggplot2)
set.seed(123)
x <- 1:1000
y <- 5 + rnorm(1000)
z <- 5 + 0.005*x + rnorm(1000)
df <- as.data.frame(cbind(x,y,z))
df <- melt(df,id=c("x"))
ggplot(df,aes(x=x,y=value,color=variable)) +
geom_point() + #here I want only the y variable graphed
geom_smooth() #here I want only the z variable graphed
They are both graphed against the x variable, and are on the same scale. Is there a relatively easy way to accomplish this?
Set the data argument of each geom to the subset of rows that geom should draw:
library(ggplot2)
library(reshape)
set.seed(123)
x <- 1:1000
y <- 5 + rnorm(1000)
z <- 5 + 0.005*x + rnorm(1000)
df <- as.data.frame(cbind(x,y,z))
df <- reshape::melt(df,id=c("x"))
df
ggplot(df,aes(x=x,y=value,color=variable)) +
geom_point(data=df[df$variable=="y",]) + # points are drawn for the y variable only
geom_smooth(data=df[df$variable=="z",]) # the smooth line is drawn for the z variable only
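A layer's data argument can also be a function, which ggplot2 calls on the plot's data. This lets each geom filter its own rows without repeating the data frame name. The sketch below assumes that approach. `only_var()` is a hypothetical helper introduced for illustration, and the long-format data is built directly in base R instead of with melt.

```r
library(ggplot2)

set.seed(123)
n <- 1000
x <- 1:n
# Long-format data built directly, so no reshaping package is needed
df <- data.frame(
  x = rep(x, times = 2),
  variable = rep(c("y", "z"), each = n),
  value = c(5 + rnorm(n), 5 + 0.005 * x + rnorm(n))
)

# only_var() is a hypothetical helper: it returns a function that keeps
# only the rows belonging to one group of the plot data
only_var <- function(v) function(d) d[d$variable == v, ]

p <- ggplot(df, aes(x = x, y = value, color = variable)) +
  geom_point(data = only_var("y")) +   # points for y only
  geom_smooth(data = only_var("z"))    # smooth line for z only
print(p)
```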

Give color to scatter plot points based on value threshold

I have a data.frame with 2 columns whose values range from -10 to 10, and I want to plot it with ggplot.
I need to color the points whose values are greater than 8 or less than -8.
How can I do this in ggplot with geom_point()?
I agree with the comments above, but I think this is what you are looking for:
p <- runif(100, min=-10, max=10)
g <- 1:100
dat <- data.frame(p, g)
dat$colors <- 1
dat[which(dat$p < (-8) | dat$p > 8),"colors"] <- 0
library(ggplot2)
ggplot(dat, aes(x=g, y=p, group=colors)) + geom_point(aes(color=as.factor(colors)))
This results in a scatter plot in which the points beyond ±8 are drawn in a different color from the rest.
Edit:
In a previous version of this answer the different colors were expressed as a continuous variable. I changed this to a dichotomous format with as.factor.
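The same thresholding can be written more compactly by mapping color to a logical column and choosing the two colors explicitly. This is a sketch only. The column name `extreme`, the seed and the color choices are assumptions made for illustration.

```r
library(ggplot2)

set.seed(42)  # assumption: seed added for reproducibility
dat <- data.frame(g = 1:100, p = runif(100, min = -10, max = 10))

# TRUE for points beyond the +/-8 threshold, FALSE otherwise
dat$extreme <- abs(dat$p) > 8

plt <- ggplot(dat, aes(x = g, y = p, color = extreme)) +
  geom_point() +
  # Named values tie each logical level to a fixed color
  scale_color_manual(values = c("FALSE" = "grey50", "TRUE" = "red"),
                     name = "|p| > 8")
print(plt)
```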

Graphing 3 axis accelerometer data in R

I have data from a 3 axis accelerometer that I would like to create a graph of in R.
The data is currently in a CSV file that looks like this.
time,X_value,Y_value,Z_value
0.000,0.00000,0.00000,0.00000
0.014,-0.76674,3.02088,10.41717
0.076,-0.64344,3.08493,8.82323
0.132,-0.68893,3.01071,8.82862
0.193,0.48483,2.40438,9.73482
0.255,-0.71168,2.07637,8.94174
0.312,-0.32920,0.79188,10.77690
0.389,-0.54468,2.08236,9.77732
0.434,-1.53648,-0.00898,11.77887
I want to show the change in all three over time in one graph. Any suggestions on how I might do that?
You'll want to read up on plotting in R. This is a fairly common analysis.
R: plot multiple lines in one graph
https://stats.stackexchange.com/questions/7439/how-to-change-data-between-wide-and-long-formats-in-r
You will want to melt the data frame into long format and then plot it with one line per variable (the X, Y and Z accelerometer readings).
library(ggplot2)
library(reshape2)
t <- 1:10
x <- rnorm(10)
y <- rnorm(10)
z <- rnorm(10)
df <- data.frame(t,x,y,z)
dfm <- melt(df, id.vars = "t")
ggplot(dfm, aes(x=t, y=value)) + geom_line(aes(color=variable))
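Applied to the data from the question, the same idea might look like the sketch below. The CSV rows are inlined with read.csv(text = ...) so the example is self-contained. With a real file, a path would be passed to read.csv instead. The object names `acc` and `acc_long` are assumptions.

```r
library(ggplot2)
library(reshape2)

# Rows from the question, inlined here for a self-contained example
acc <- read.csv(text = "time,X_value,Y_value,Z_value
0.000,0.00000,0.00000,0.00000
0.014,-0.76674,3.02088,10.41717
0.076,-0.64344,3.08493,8.82323
0.132,-0.68893,3.01071,8.82862
0.193,0.48483,2.40438,9.73482
0.255,-0.71168,2.07637,8.94174
0.312,-0.32920,0.79188,10.77690
0.389,-0.54468,2.08236,9.77732
0.434,-1.53648,-0.00898,11.77887")

# Wide to long: one row per (time, axis) pair
acc_long <- melt(acc, id.vars = "time")

p <- ggplot(acc_long, aes(x = time, y = value, color = variable)) +
  geom_line()   # one line per accelerometer axis
print(p)
```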

adding layer to a plot in R

Taking some generic data
A <- c(1997,2000,2000,1998,2000,1997,1997,1997)
B <- c(0,0,1,0,0,1,0,0)
df <- data.frame(A,B)
counts <- t(table(A,B))
frac <- counts[1,]/(counts[2,]+counts[1,])
C <- c(1998,2001,2000,1995,2000,1996,1998,1999)
D <- c(1,0,1,0,0,1,0,1)
df2 <- data.frame(C,D)
counts2 <- t(table(C,D))
frac2 <- counts2[1,]/(counts2[2,]+counts2[1,])
If we then want to create a scatterplot of the two datasets on one scale, we can do:
plot(frac, pch=22)
points(frac2, pch=19)
But this leaves two problems:
First, we want the year values (df$A and df2$C) along the x axis.
Second, we want the x axis to rescale automatically when the second dataset is added.
A solution using ggplot2 or base R would be welcome.
ggplot will do the scaling for you. You can convert the fracs to data frames and use them with ggplot:
library(ggplot2)
ggplot(data.frame(y=frac, x=names(frac)), aes(x, y)) +
geom_point(col="salmon") +
geom_point(data=data.frame(y=frac2, x=names(frac2)), aes(x, y), col="steelblue") +
theme_bw()
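Note that names(frac) are character strings, so the x axis above is treated as discrete categories. If the years should sit on a numeric axis, it may be preferable to convert them first. The sketch below assumes that variant. `frac_zero()` is a hypothetical helper that computes the same fraction as counts[1,]/(counts[2,]+counts[1,]), and the `set` labels are illustrative.

```r
library(ggplot2)

A <- c(1997, 2000, 2000, 1998, 2000, 1997, 1997, 1997)
B <- c(0, 0, 1, 0, 0, 1, 0, 0)
C <- c(1998, 2001, 2000, 1995, 2000, 1996, 1998, 1999)
D <- c(1, 0, 1, 0, 0, 1, 0, 1)

# frac_zero() is a hypothetical helper: share of zeros per year,
# returned with the year as a numeric column
frac_zero <- function(year, flag) {
  f <- tapply(flag == 0, year, mean)
  data.frame(year = as.numeric(names(f)), frac = as.numeric(f))
}

# Stack both datasets and label where each row came from
pts <- rbind(cbind(frac_zero(A, B), set = "A/B"),
             cbind(frac_zero(C, D), set = "C/D"))

p <- ggplot(pts, aes(x = year, y = frac, color = set)) +
  geom_point() +
  theme_bw()
print(p)
```

Putting both datasets in one data frame also gives a legend for free, since color is mapped to the `set` column.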

y and x not the same length, scatterplot R

Take the following as a simple example:
A <- c(1,1,1,2,2,3,3,4,4,4)
B <- c(1,0,0,1,0,1,0,1,0,0)
C <- c(6,3,2,4,1,2,6,8,4,3)
data <- data.frame(A,B,C)
data
I want to create a scatterplot that looks like so:
without the blue and red borders; they are there as an explanatory guide.
So I want to plot:
Each time B=1, I want to use its C value for the horizontal scale and plot the C values where B=0 along the vertical scale.
So, for example: where x=6, we have points at y=3 and 2
where x=4, we have a point at y=1
where x=2, we have a point at y=6
where x=8, we have points at y=4 and 3
Must I manipulate/melt/reshape my data somehow?
With na.locf (last observation carried forward) from the zoo package, there is no need for reshaping:
library(zoo)
#extract the part of C that we need for mapping x
data$D = ifelse(data$B==1,data$C,NA)
#fill in the blanks
data$D = na.locf(data$D)
#Extract from C what we need for y
data$E = ifelse(data$B==1,NA,data$C)
#Done!
plot(data$D,data$E)
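For comparison, the same mapping can be sketched in base R without zoo by numbering the groups with cumsum. This assumes the first row has B == 1, so that every B == 0 row follows some B == 1 row. The names `grp`, `x_val` and `pts` are introduced for illustration.

```r
A <- c(1, 1, 1, 2, 2, 3, 3, 4, 4, 4)
B <- c(1, 0, 0, 1, 0, 1, 0, 1, 0, 0)
C <- c(6, 3, 2, 4, 1, 2, 6, 8, 4, 3)
data <- data.frame(A, B, C)

# Each B == 1 row starts a new group; cumsum gives the group index per row
# (assumption: the first row has B == 1, so indices start at 1)
grp <- cumsum(data$B == 1)

# Look up the C value of the B == 1 row that heads each group
x_val <- data$C[data$B == 1][grp]

# Keep only the B == 0 rows: x is the carried-forward value, y is their own C
keep <- data$B == 0
pts <- data.frame(x = x_val[keep], y = data$C[keep])

plot(pts$x, pts$y)
```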
