Help reproduce graph from excel - r

I would like to reproduce the following graph:
On the horizontal axis I would like to have the 8 question numbers, and I would like to plot two results for each question.
For example:
questionnumbers<-c(1,2,3,4,5,6,7,8)
result1<-c(0.2,0.4,0.3,0.6,0.9,0.3,0.4,0.8)
result2<-c(0.4,0.9,0.3,0.1,0.4,0.6,0.3,0.2)
And I'd like to get a graph similar to this:
http://dl.dropbox.com/u/22681355/chart.tiff
Preferably I'd like to know how to do this in qplot using ggplot2

library(reshape2)
library(ggplot2)
qs <- data.frame(
questionnumbers = c(1,2,3,4,5,6,7,8),
result1 = c(0.2,0.4,0.3,0.6,0.9,0.3,0.4,0.8),
result2 = c(0.4,0.9,0.3,0.1,0.4,0.6,0.3,0.2)
)
mqs <- melt(qs, id.vars="questionnumbers")
ggplot(mqs, aes(x=questionnumbers, y=value, colour=variable)) + geom_line()
Edited.
Your follow-on question asks what is different about your other data set. The answer is that your x variable (questionnumbers) is now categorical, not continuous. By default, ggplot groups the data by every categorical variable in the plot, including x, so each group ends up with a single point and geom_line has nothing to connect. In that case you need to make the grouping variable explicit in the aes call in ggplot, as follows: `aes(..., group=variable, ...)`
qs<-data.frame(
questionnumbers = c("1red","1blue","2red","2blue","3red","3blue","4red","4blue"),
Probability=c(0.59,0.60,0.55,0.55,0.60,0.58,0.67,0.68),
Chosing.colour=c(0.16,0.21,0.26,0.53,0.84,0.89,0.84,0.947))
mqs <-melt(qs, id.vars="questionnumbers")
str(mqs)
ggplot(mqs, aes(x=questionnumbers, y=value, group=variable, colour=variable)) +
geom_line()

In base graphics it would be...
questionnumbers<-c(1,2,3,4,5,6,7,8)
result1<-c(0.2,0.4,0.3,0.6,0.9,0.3,0.4,0.8)
result2<-c(0.4,0.9,0.3,0.1,0.4,0.6,0.3,0.2)
plot(questionnumbers, result2, type = 'b', ylim = c(0,0.9), col = 'green', xlab = 'Question Numbers', ylab = '', main = 'Chart 2', panel.first = grid(nx = NA, ny = NULL))
lines(questionnumbers, result1, col = 'blue', type = 'b')
legend('bottomleft', c('result1','result2'), fill = c('blue', 'green'), cex = 0.8, bty = 'n', horiz = TRUE)
(you should really provide a y-axis label)
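As a possible alternative in base graphics, matplot() can draw both series in one call. This is only a sketch using the data from the question; the axis labels and the `results` matrix name are my own choices, not something from the original chart.

```r
questionnumbers <- 1:8
result1 <- c(0.2, 0.4, 0.3, 0.6, 0.9, 0.3, 0.4, 0.8)
result2 <- c(0.4, 0.9, 0.3, 0.1, 0.4, 0.6, 0.3, 0.2)

# One column per series; matplot() draws each column as its own line
results <- cbind(result1, result2)
series_cols <- c("blue", "green")

matplot(questionnumbers, results, type = "b", pch = 1, lty = 1,
        col = series_cols, xlab = "Question number", ylab = "Result")

# Legend keys use the same line and point style as the plotted series
legend("bottomleft", legend = colnames(results), col = series_cols,
       lty = 1, pch = 1, bty = "n", horiz = TRUE)
```

Because the legend is built from colnames(results), adding a third series only means adding a column and a colour.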

Related

x-y scatter-plot in r with labels on points

I am trying to make an x-y scatter-plot. I don't mind if it's in plot or ggplot2. I don't know much about each, but I would like an example in both if you don't mind. I would like a label on the points.
Below is code and dput:
tickers <- rownames(x2)
library(zoo)
plot(x2,
main= "Vol vs Div",
xlab= "Vol (in %)",
ylab= "Div",
col= "blue", pch = 19, cex = 1, lty = "solid", lwd = 2)
text(x=x2$Volatility101,y=x2$`12m yield`, labels=tickers,cex= 0.7, pos= 3)
x2:
structure(list(Volatility101 = c(25.25353177644, 42.1628734949414,
28.527736824123), `12m yield` = c("3.08", "7.07", "4.72")), class = "data.frame", row.names = c("EUN",
"HRUB", "HUKX"))
Here is a tidyverse solution.
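library(ggplot2)
library(tibble)   # provides rownames_to_column()
library(dplyr)
library(ggrepel)
x2 %>%
rownames_to_column(var = "tickers") %>%
# `12m yield` is stored as character in the question's data, so convert it
mutate(`12m yield` = as.numeric(`12m yield`)) %>%
ggplot(aes(x = Volatility101, y = `12m yield`)) +
geom_point(color = "blue") +
geom_text_repel(aes(label = tickers)) +
ggtitle("Vol vs Div") +
xlab("Vol (in %)") +
ylab("Div") +
theme_classic()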
I was surprised that the plot function worked at all, because the y-values are character values. Converting them to numeric in the text call places the labels in the expected locations:
text(x=x2$Volatility101,y=as.numeric(x2$`12m yield`)+.1, labels=tickers,
cex= 0.7, col='black')
A couple of notes about the question presentation: It's unclear (and misleading) why ggplot2 is a tag. The plot function is generic and in this case it uses base-graphics rather than either ggplot2 specifically or grid graphics more generally. I also think that the library(zoo) call is probably unnecessary. There is a plot.zoo function, but it would not be called in this case.
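For what it's worth, converting the column once, before any plotting, may be simpler than converting inside each call. Below is a minimal base R sketch using the dput data from the question; `col_classes` is just an illustrative name.

```r
x2 <- structure(list(Volatility101 = c(25.25353177644, 42.1628734949414,
                                       28.527736824123),
                     `12m yield` = c("3.08", "7.07", "4.72")),
                class = "data.frame",
                row.names = c("EUN", "HRUB", "HUKX"))

# Inspect the column types: `12m yield` is character, not numeric
col_classes <- sapply(x2, class)
print(col_classes)

# Convert once, so every later plotting call sees numbers
x2$`12m yield` <- as.numeric(x2$`12m yield`)

plot(x2$Volatility101, x2$`12m yield`,
     main = "Vol vs Div", xlab = "Vol (in %)", ylab = "Div",
     col = "blue", pch = 19)
# pos = 3 places each label just above its point
text(x2$Volatility101, x2$`12m yield`, labels = rownames(x2),
     cex = 0.7, pos = 3)
```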

ggplot2: solid line for one group, points for the other

I have four series that I would like to plot.
There are 2 models : xg and algo30.
There are two types of data: predicted and observed.
This means we have the following 4 series: "predicted xg","observed xg", "predicted 30", "observed 30".
I want "xg" to be blue, "algo30" to be red.
I also want predicted to be a solid line and observed to be points.
Here is what I mean, using base plot:
library(magrittr)
library(ggplot2)
library(dplyr)
set.seed(123)
gr <- 1:10
obs.xg <- sort(runif(10, 0.5, 1))
obs.30 <- sort(runif(10, 0.5, 1))
pred.xg <- lm(obs.xg~gr) %>% predict() %>% add(rnorm(10,0,.01))
pred.30 <- lm(obs.30~gr) %>% predict() %>% add(rnorm(10,0,.01))
plot(gr, obs.xg, col="darkblue", ylim=range(c(obs.xg,obs.30)), pch=20)
lines(gr, pred.xg, col="darkblue", lwd=2)
points(gr, obs.30, col="firebrick", pch=20)
lines(gr, pred.30, col="firebrick", lwd=2)
legend("bottomright",
pch=c(20,NA,NA,NA,NA),
lty=c(NA,1,NA,1,1),
lwd=c(NA,1,NA,2,2),
col = c("black","black",NA, "darkblue","firebrick"),
legend=c("observé","prédit",NA,"xgboost","algo30"),
bty='n')
Here is my best attempt using ggplot. Notice that the legend doesn't work as I want.
xg.data <- data.frame(model= "xg", decile = seq(1:10), observed = obs.xg, predicted = pred.xg)
algo30.data <- data.frame(model = "algo30",decile = seq(1:10), observed = obs.30, predicted = pred.30)
ggplotdata <- bind_rows(xg.data, algo30.data)
ggplotdata %>%
ggplot( aes(x=decile, y= predicted, color= model))+ geom_line()+
geom_point(aes(x=decile, y= observed, color = model))
Most of the time when making a legend like this I look to override.aes in guide_legend().
The idea here is to make a legend using an additional aesthetic that you don't want mapped onto the plot itself and then using constants instead of a variable for that aesthetic. I used alpha, since both points and lines use that aesthetic.
Then the heavy lifting is done in scale_alpha_manual: removing the legend name, making sure the plot still looks right by setting the values, and then, finally, picking the correct point type and lines along with blanks for the legend.
ggplot(ggplotdata, aes(x=decile, y= predicted, color= model))+
geom_line( aes(alpha = "prédit") )+
geom_point(aes(x=decile, y= observed, alpha = "observé")) +
scale_alpha_manual(name = NULL, values = c(1, 1),
guide = guide_legend(override.aes = list(linetype = c(0, 1), shape = c(16, NA)))) +
scale_color_manual(name = NULL, values = c("firebrick", "darkblue"))

R: Stacked bar plot with barplot, ggplot or plotly

I've been searching to find a solution, but none of the already existing questions fit my problem.
I have a data.frame:
Pat <- c(1,1,1,1,1,1,2,2,2,2,2,2)
V_ID <- c(1,1,6,6,9,9,1,1,6,6,9,9)
T_ID <- c("A","B","A","B","A","B","A","B","A","B", "A","B")
apples <- c(1,1,1,1,1,1,1,1,1,1,1,1)
bananas <- c(2,2,2,2,2,2,2,2,2,2,2,2)
cranberries <- c(3,3,3,3,3,3,3,3,3,3,3,3)
df <- data.frame(Pat,V_ID, T_ID, apples, bananas, cranberries)
I am trying to plot:
barplot(as.matrix(df[,4:6]) ,
main="tobefound", horiz = FALSE,width = 1,
names.arg=colnames(df[,4:6]),
las=2,
col = c("blue", "red"),
legend = df[,3],
args.legend = list(x="topleft"),
beside= FALSE)
I need two changes:
First, I would like to have all the "B"s (the red part in every stack) piled up together, with the blue ones on top. Second, is there a way to reduce the legend to show A and B only once, other than by using
legend = df[1:2,3],
I am also looking for a solution using plotly or ggplot.
Thanks,
First reshape:
df_long <- tidyr::gather(df, 'key', 'value', apples:cranberries)
Then plot:
ggplot(df_long, aes(key, value, fill = T_ID)) + geom_col(col = 'black')
Or perhaps without the borders:
ggplot(df_long, aes(key, value, fill = T_ID)) + geom_col()
Using base graphics, you need to sort df by T_ID first.
df = df[order(df$T_ID), ]
barplot(as.matrix(df[,4:6]) ,
main="tobefound", horiz = FALSE,width = 1,
names.arg=colnames(df[,4:6]),
las=2,
ylim = c(0,40),
col = 1+as.numeric(as.factor(df$T_ID)),
border = NA,
beside= FALSE)
box()
# levels() returns NULL for a character column, so convert to a factor first
lev = levels(as.factor(df$T_ID))
legend('topleft', fill = 1+seq_along(lev), legend = lev)
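Another option in base graphics, sketched here under the assumption that only the per-group totals matter, is to sum the values by T_ID before plotting. barplot() then receives a matrix with one row per group, so each stack has exactly two segments and the legend lists A and B once. The name `totals` is my own.

```r
Pat <- c(1,1,1,1,1,1,2,2,2,2,2,2)
V_ID <- c(1,1,6,6,9,9,1,1,6,6,9,9)
T_ID <- c("A","B","A","B","A","B","A","B","A","B","A","B")
df <- data.frame(Pat, V_ID, T_ID,
                 apples = 1, bananas = 2, cranberries = 3)

# Sum every fruit column within each T_ID: rows are A and B, columns are fruits
totals <- sapply(df[, 4:6], function(v) tapply(v, df$T_ID, sum))

# Put B first so it ends up at the bottom of every stack
totals <- totals[c("B", "A"), ]

barplot(totals, col = c("red", "blue"), las = 2, main = "tobefound",
        legend.text = rownames(totals),
        args.legend = list(x = "topleft"))
```

This loses the borders between individual rows of df, which may or may not be what you want.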

Plot multipoints and a best fit line

I want to create one plot with the Roundrobin and Prediction points, without colors, where the Roundrobin and Prediction points use different symbols, and with a legend. I also want to add a best fit line for the results.
I am having trouble adding all these features to one graph that has two sets of points. I am used to Gnuplot, but I don't know how to do this with R. How do I do this with R?
[1] Input data
Inputdata,Roundrobin,Prediction
1,178,188
2,159,185
3,140,175
[2] Script to generate data
no_faults_data <- read.csv("testresults.csv", header=TRUE, sep=",")
# Graph 1
plot(no_faults_data$Inputdata, no_faults_data$Roundrobin,ylim = range(c(no_faults_data$Roundrobin,no_faults_data$Prediction)),xlab="Input data size (MB)", ylab="Makespan (seconds)")
points(no_faults_data$Inputdata, no_faults_data$Prediction)
abline(no_faults_data$Inputdata, no_faults_data$Roundrobin, untf = FALSE, \dots)
abline(no_faults_data$Inputdata, no_faults_data$Prediction, untf = FALSE, \dots)
legend("top", notitle, c("Round-robin","Prediction"), fill=terrain.colors(2), horiz=TRUE)
In base R you will have to create a fitted model first:
robin <- lm(Roundrobin ~ Inputdata, data = no_faults_data)
pred <- lm(Prediction ~ Inputdata, data = no_faults_data)
plot(no_faults_data$Inputdata, no_faults_data$Roundrobin,
ylim = range(c(no_faults_data$Roundrobin,no_faults_data$Prediction)),
xlab = "Input data size (MB)", ylab = "Makespan (seconds)",
col = "green", pch = 19, cex = 1.5)
points(no_faults_data$Inputdata, no_faults_data$Prediction, pch = 22, cex = 1.5)
abline(robin, lty = 1)
abline(pred, lty = 5)
legend(1.1, 155, legend = c("Round-robin","Prediction"), pch = c(19,22), col = c("green","black"),
bty = "n", cex = 1.2)
For further customization of the base R plot, see ?par and ?legend.
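To see what abline() actually draws when given a fitted model, you can look at the coefficients: the first is the intercept and the second is the slope. A short sketch with the three data points from the question:

```r
no_faults_data <- read.csv(text = "Inputdata,Roundrobin,Prediction
1,178,188
2,159,185
3,140,175", header = TRUE)

robin <- lm(Roundrobin ~ Inputdata, data = no_faults_data)
pred  <- lm(Prediction ~ Inputdata, data = no_faults_data)

# Intercept and slope of each best fit line
print(coef(robin))  # intercept 197, slope -19 (these points are exactly linear)
print(coef(pred))

plot(no_faults_data$Inputdata, no_faults_data$Roundrobin,
     ylim = range(c(no_faults_data$Roundrobin, no_faults_data$Prediction)),
     xlab = "Input data size (MB)", ylab = "Makespan (seconds)", pch = 19)
points(no_faults_data$Inputdata, no_faults_data$Prediction, pch = 22)

# Equivalent to abline(robin) and abline(pred), written out explicitly
abline(a = coef(robin)[[1]], b = coef(robin)[[2]], lty = 1)
abline(a = coef(pred)[[1]],  b = coef(pred)[[2]],  lty = 5)
```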
With ggplot2 you will need to reshape your data into long format:
library(reshape2)
library(ggplot2)
ggplot(melt(no_faults_data, id="Inputdata"),
aes(x=Inputdata, y=value, shape=variable, color=variable)) +
geom_point(size=4) +
geom_smooth(method = "lm", se = FALSE) +
theme_minimal()
Used data:
no_faults_data <- read.csv(text="Inputdata,Roundrobin,Prediction
1,178,188
2,159,185
3,140,175", header=TRUE)
You should look into the ggplot2 package for plotting. Maybe not needed for the 3 points data you provided but it makes much nicer plots than the default.
df <- data.frame("Inputdata" = c(1,2,3,1,2,3), "score" = c(178,159,140,188,185,175), "scoreType" = c(rep("Roundrobin",3), rep("Prediction",3)))
p <- ggplot(data=df, aes(x=Inputdata, y=score, group=scoreType, shape = scoreType)) + geom_point(size=5)
p <- p + ggtitle("My Title")
p+stat_smooth(method="lm",se = FALSE)
Here you group by the type of score and let ggplot make the legend for you. stat_smooth fits a linear model (lm) to each group.
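If you would rather not type the long-format data frame by hand, it can probably be built from the original wide data with stack() from base R. This is a sketch; the column names `score` and `scoreType` simply mirror the ones used above, and `long` is an illustrative name.

```r
no_faults_data <- read.csv(text = "Inputdata,Roundrobin,Prediction
1,178,188
2,159,185
3,140,175", header = TRUE)

# stack() puts the selected columns on top of each other: `values` holds the
# numbers and `ind` records which column each number came from
long <- data.frame(
  Inputdata = rep(no_faults_data$Inputdata, times = 2),
  stack(no_faults_data[c("Roundrobin", "Prediction")])
)
names(long) <- c("Inputdata", "score", "scoreType")

print(long)
```

The resulting `long` can be passed to ggplot in place of the hand-built df.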

How to superimpose bar plots in R?

I'm trying to create a figure similar to the one below (taken from Ro, Russell, & Lavie, 2001). In their graph, they are plotting bars for the errors (i.e., accuracy) within the reaction time bars. Basically, what I am looking for is a way to plot bars within bars.
I know there are several challenges with creating a graph like this. First, Hadley points out that it is not possible to create a graph with two scales in ggplot2 because those graphs are fundamentally flawed (see Plot with 2 y axes, one y axis on the left, and another y axis on the right)
Nonetheless, the graph with superimposed bars seems to solve this dual scaling problem, and I'm trying to figure out a way to create it in R. Any help would be appreciated.
It's fairly easy in base R, by using par(new = T) to add to an existing graph:
set.seed(54321) # for reproducibility
data.1 <- sample(1000:2000, 10)
data.2 <- sample(seq(0, 5, 0.1), 10)
# Use xpd = F to avoid plotting the bars below the axis
barplot(data.1, las = 1, col = "black", ylim = c(500, 3000), xpd = F)
par(new = T)
# Plot the new data with a different ylim, but don't plot the axis
barplot(data.2, las = 1, col = "white", ylim = c(0, 30), yaxt = "n")
# Add the axis on the right
axis(4, las = 1)
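A variation on the same idea, in case par(new = T) feels fragile: rescale the second series onto the first axis yourself and draw it with add = TRUE. This is only a sketch; `rescale_to`, `left_range` and `right_range` are names I made up, and the ranges are the ylim values used above.

```r
# Linearly map x from the range `from` onto the range `to`
rescale_to <- function(x, from, to) {
  (x - from[1]) / diff(from) * diff(to) + to[1]
}

set.seed(54321) # for reproducibility
data.1 <- sample(1000:2000, 10)
data.2 <- sample(seq(0, 5, 0.1), 10)

left_range  <- c(500, 3000) # scale of the black bars (left axis)
right_range <- c(0, 30)     # scale the right axis should display

data.2.scaled <- rescale_to(data.2, from = right_range, to = left_range)

barplot(data.1, las = 1, col = "black", ylim = left_range, xpd = FALSE)
# add = TRUE draws into the existing plot instead of starting a new one
barplot(data.2.scaled, col = "white", add = TRUE, axes = FALSE, xpd = FALSE)

# Label the right axis in the original units of data.2
ticks <- seq(0, 30, by = 5)
axis(4, at = rescale_to(ticks, from = right_range, to = left_range),
     labels = ticks, las = 1)
```

The advantage is that both sets of bars share one coordinate system, so anything added later (lines, text) lands where you expect.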
It is pretty easy to make the bars in ggplot. Here is some example code. No two y-axes though (although look here for a way to do that too).
library(ggplot2)
data.1 <- sample(1000:2000, 10)
data.2 <- sample(500:1000, 10)
ggplot(mapping = aes(x, y)) +
geom_bar(data = data.frame(x = 1:10, y = data.1), width = 0.8, stat = 'identity') +
geom_bar(data = data.frame(x = 1:10, y = data.2), width = 0.4, stat = 'identity', fill = 'white') +
theme_classic() + scale_y_continuous(expand = c(0, 0))

Resources