I have a large matrix mdat (1000 rows and 16 columns) in which the first column is the x variable and the other columns are y variables. I want to make scatter plots in R, with all 15 figures in the same window. For example:
mdat <- matrix(c(1:50), nrow = 10, ncol=5)
The matrix above has 10 rows and 5 columns. Is it possible to use the first column as the variable on the x axis and the other columns as variables on the y axis, so that I get four different scatterplots in the same window? Keep in mind that I would prefer not to use par(mfrow=, because then I have to run each graph separately to get them in the same window. What I need is a package to which I give just the data and the x and y variables, and which puts the graphs in the same window.
Is there some package available that can do this? I cannot find one.
Perhaps the simplest base R way is mfrow (or mfcol)
par(mfrow = c(2, 2)) ## the window will have 2 rows and 2 columns of plots
for (i in 2:ncol(mdat)) plot(mdat[, 1], mdat[, i])
See ?par for everything you might want to know about further adjustments.
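If the goal is to hand over just the data, the loop above can be wrapped in a small function. This is only a sketch: plot_against_first and its nrow argument are made-up names for illustration, not part of any package.

```r
# Hypothetical helper: plot every other column against the first column.
plot_against_first <- function(m, nrow = 2) {
  n_panels <- ncol(m) - 1
  old_par <- par(mfrow = c(nrow, ceiling(n_panels / nrow)))
  on.exit(par(old_par))  # restore the previous layout when the function exits
  for (i in seq_len(n_panels) + 1) {
    plot(m[, 1], m[, i], xlab = "column 1", ylab = paste("column", i))
  }
  invisible(n_panels)  # number of panels drawn
}

mdat <- matrix(1:50, nrow = 10, ncol = 5)
n_drawn <- plot_against_first(mdat)  # four panels in a 2 x 2 grid
```

For the 16-column matrix, something like plot_against_first(mdat, nrow = 3) would give a 3 x 5 grid, provided the plotting window is large enough for the margins.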
Another good option in base R is layout (the help has some nice examples). To be fancy and pretty, you could use the ggplot2 package, but you'll need to reshape your data into a long format.
require(ggplot2)
require(reshape2)
molten <- melt(as.data.frame(mdat), id = "V1")
ggplot(molten, aes(x = V1, y = value)) +
facet_wrap(~ variable, nrow = 2) +
geom_point()
Alternatively with colors instead of facets:
ggplot(molten, aes(x = V1, y = value, color = variable)) +
geom_point()
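For reference, the long format used above can also be built in base R without reshape2. This is a sketch that assumes the same default column names (V1, V2, ...) that as.data.frame() gives an unnamed matrix; the result should have the same three columns as molten.

```r
mdat <- matrix(1:50, nrow = 10, ncol = 5)
y_cols <- mdat[, -1, drop = FALSE]           # everything except the x column
y_names <- paste0("V", seq_len(ncol(y_cols)) + 1)

long <- data.frame(
  V1       = rep(mdat[, 1], times = ncol(y_cols)),    # x repeated once per y column
  variable = factor(rep(y_names, each = nrow(mdat)),  # which column each value came from
                    levels = y_names),
  value    = as.vector(y_cols)                        # y values, stacked column by column
)
head(long, 3)
```

Each row of long is one point: its x value, the column it came from, and its y value. That is the shape facet_wrap() and the color aesthetic expect.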
@user4299 You can rewrite shujaa's ggplot commands using qplot ('quick plot'), which is easier when starting out. The first command below produces the same faceted plot as shujaa's answer (with the panels in a single row); the second uses variable to drive the color instead of faceting, which puts all the points on one plot with different colors and a legend.
qplot(data = molten, x = V1, y = value, facets = . ~ variable, geom = "point")
qplot(data = molten, x = V1, y = value, color = variable, geom = "point")
Maybe
library(lattice)
x = mdat[,1]; y = mdat[,-1]
df = data.frame(X = x, Y = as.vector(y),
Grp = factor(rep(seq_len(ncol(y)), each=length(x))))
xyplot(Y ~ X | Grp, df)
Related
I have been working on plotting several lines according to different probability levels and am stuck adding labels to each line to represent the probability level.
Since each curve plotted has varying x and y coordinates, I cannot simply have a large data-frame on which to perform usual ggplot2 functions.
The end goal is to have each line with a label next to it according to the p-level.
What I have tried:
To access the data comfortably, I have created a list df with for example 5 elements, each element containing a nx2 data frame with column 1 the x-coordinates and column 2 the y-coordinates. To plot each curve, I create a for loop where at each iteration (i in 1:5) I extract the x and y coordinates from the list and add the p-level line to the plot by:
plot = plot +
geom_line(data=df[[i]],aes(x=x.coor, y=y.coor),color = vector_of_colors[i])
where vector_of_colors contains varying colors.
I have looked at using ggrepel and its geom_label_repel() or geom_text_repel() functions, but being unfamiliar with ggplot2 I could not get it to work. Below is a simplification of my code so that it may be reproducible. I could not include an image of the actual curves I am trying to add labels to since I do not have 10 reputation.
# CREATION OF DATA
plevel0.5 = cbind(c(0,1),c(0,1))
colnames(plevel0.5) = c("x","y")
plevel0.8 = cbind(c(0.5,3),c(0.5,1.5))
colnames(plevel0.8) = c("x","y")
# geom_line() needs data frames, so convert the matrices
data = list(data1 = as.data.frame(plevel0.5), data2 = as.data.frame(plevel0.8))
# CREATION OF PLOT
plot = ggplot()
for (i in 1:2) {
plot = plot + geom_line(data=data[[i]],mapping=aes(x=x,y=y))
}
Thank you in advance and let me know what needs to be clarified.
EDIT :
I have now attempted the following :
Using bind_rows(), I have created a single dataframe with columns x.coor and y.coor as well as a column called "groups" detailing the p-level of each coordinate.
This is what I have tried:
plot = ggplot(data) +
geom_line(aes(coors.x,coors.y,group=groups,color=groups)) +
geom_text_repel(aes(label=groups))
But it gives me the following error:
geom_text_repel requires the following missing aesthetics: x and y
I do not know how to specify x and y in the correct way since I thought it did this automatically. Any tips?
Your approach is probably a bit too complicated. As far as I understand it, you could work with one dataset and use the group aesthetic to get the same result you are trying to achieve with your for loop and multiple geom_line calls. To this end I use dplyr::bind_rows to bind your datasets together. Whether ggrepel is needed depends on your real dataset. In my code below I simply use geom_text to add a label at the rightmost point of each line:
plevel0.5 <- data.frame(x = c(0, 1), y = c(0, 1))
plevel0.8 <- data.frame(x = c(0.5, 3), y = c(0.5, 1.5))
library(dplyr)
library(ggplot2)
data <- list(data1 = plevel0.5, data2 = plevel0.8) |>
bind_rows(.id = "id")
ggplot(data, aes(x = x, y = y, group = id)) +
geom_line(aes(color = id)) +
geom_text(data = ~ group_by(.x, id) |> filter(x %in% max(x)), aes(label = id), vjust = -.5, hjust = .5)
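The "missing aesthetics: x and y" error in the question's edit comes from mapping x and y only inside geom_line(): aesthetics set in one layer are not passed on to other layers, while those set in ggplot() are inherited by every layer. A minimal sketch of that fix, using only ggplot2 and base R; the names curves and ends are made up here, and the groups column mirrors the one described in the question:

```r
library(ggplot2)

# Same toy data as above, stacked into one data frame with a p-level column.
curves <- rbind(
  data.frame(x = c(0, 1),   y = c(0, 1),     groups = "0.5"),
  data.frame(x = c(0.5, 3), y = c(0.5, 1.5), groups = "0.8")
)

# One label row per curve: the point with the largest x in each group.
ends <- do.call(rbind, lapply(split(curves, curves$groups),
                              function(d) d[which.max(d$x), ]))

# x, y and colour are mapped in ggplot(), so both layers inherit them.
p <- ggplot(curves, aes(x = x, y = y, colour = groups)) +
  geom_line() +
  geom_text(data = ends, aes(label = groups), vjust = -0.5, show.legend = FALSE)
print(p)
```

If ggrepel is installed, ggrepel::geom_text_repel(data = ends, aes(label = groups)) should be a drop-in replacement for the geom_text() layer, since it inherits x and y in the same way.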
I have a question about using geom_segment in R ggplot2.
For example, I have three facets and two clusters of points(points which have the same y values) in each facets, how do I draw multiple vertical line segments for each clustering with geom_segment?
Like if my data is
x <- (1:24)
y <- (rep(1,2),2,rep(2,2),1,rep(3,2),4, rep(4,1),5,6, ..rep(8,2),7)
facets <-(1,2,3)
factors <-(1,2,3,4,5,6)
xmean <- ( (1+2+3)/3, (4+5+6)/3, ..., (22+23+24)/3)
Note: (1+2+3)/3 is the mean of the first cluster in the first facet, (4+5+6)/3 is the mean of the second cluster in the first facet, and (7+8+9)/3 is the mean of the first cluster in the second facet.
My Code:
ggplot(, aes(x = as.numeric(x), y = as.numeric(y), color = factors)) + geom_point(alpha = 0.85, size = 1.85) + facet_grid(~facets) +
geom_segment(what should I put here to draw this line in different factors?)
Desired result:
Please see the picture!
Please see the updated picture!
Thank you so much! Have a nice day :).
Maybe this is what you are looking for. Instead of working with vectors, put your data in a data frame. That way you can easily build an aggregated data frame with the mean values per facet and cluster, which makes it easy to draw the segments:
Note: Wasn't sure about the setup of your data. You talk about two clusters per facet but your data has 8. So I slightly changed the example data.
library(ggplot2)
library(dplyr)
df <- data.frame(
x = 1:24,
y = rep(1:6, each = 4),
facets = rep(1:3, each = 8)
)
df_sum <- df %>%
group_by(facets, y) %>%
summarise(x = mean(x))
#> `summarise()` has grouped output by 'facets'. You can override using the `.groups` argument.
ggplot(df, aes(x, y, color = factor(y))) +
geom_point(alpha = 0.85, size = 1.85) +
geom_segment(data = df_sum, aes(x = x, xend = x, y = y - .25, yend = y + .25), color = "black") +
facet_wrap(~facets)
Let's say I have a data frame:
df <- data.frame(x = c("A","B","C"), y = c(10,20,30))
and I wish to plot it with ggplot2 so that I get a plot like a histogram, where instead of plotting counts I plot the y column values from the data frame. (I don't mind whether the x column is a factor or a character column.)
I will add that I know how to reorder a bar chart in descending or ascending order, but ordering it like a histogram (highest values in the middle, around the mean, and decreasing to both sides) is still beyond me.
I thought of transforming the data so that it fits a histogram, e.g. creating a vector with 10 "A" objects, 20 "B" and 30 "C" and then running a histogram on that. But it's not practical for what I'm trying to do, as it seems like a lazy and highly inefficient way to do it. Also, the df data frame is huge as it is, so multiplying it by millions is not going to be kind to my system.
This seems like a strange thing to want to do, since if the ordering is not already implicit in your x variables, then ordering as a bell curve is at best artificial. However, it's fairly trivial to implement if you really want to...
library(ggplot2)
df <- data.frame(yvals = floor(abs(rnorm(26)) * 100),
xvals = LETTERS,
stringsAsFactors = FALSE)
ggplot(data = df, aes(x = xvals, y = yvals)) + geom_bar(stat = "identity")
ordered <- order(df$yvals)
left_half <- ordered[seq(1, length(ordered), 2)]
right_half <- rev(ordered[seq(2, length(ordered), 2)])
new_order <- c(left_half, right_half)
df2 <- df[new_order,]
df2$xvals <- factor(df2$xvals, levels = df2$xvals)
ggplot(data = df2, aes(x = xvals, y = yvals)) + geom_bar(stat = "identity")
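Applied to the three-row data frame from the question, the same reordering can be sketched as a small function. The name bell_order is made up for illustration; it uses the same odd/even split as the code above, but with a logical mask so that it also works for a single row.

```r
# Hypothetical helper: returns row indices ordered low -> high -> low.
bell_order <- function(values) {
  idx <- order(values)              # indices from smallest to largest value
  odd <- seq_along(idx) %% 2 == 1   # every other index goes to the left half
  c(idx[odd], rev(idx[!odd]))       # left half ascending, right half descending
}

df <- data.frame(x = c("A", "B", "C"), y = c(10, 20, 30))
df_bell <- df[bell_order(df$y), ]
df_bell$x <- factor(df_bell$x, levels = df_bell$x)  # freeze this order for ggplot2
as.character(df_bell$x)  # "A" "C" "B": the largest value (C = 30) is in the middle
```

df_bell can then be passed to the same ggplot() + geom_bar(stat = "identity") call as df2.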
I'm having trouble constructing a histogram from a matrix in R.
The matrix contains 3 treatments (lambda0.001, lambda0.002, lambda0.005) for 4 populations (rec1, rec2, rec3, con1). The matrix is:
     lambda0.001 lambda0.002 lambda.003
rec1   1.0881688   1.1890554  1.3653264
rec2   1.0119031   1.0687678  1.1751051
rec3   0.9540271   0.9540271  0.9540271
con1   0.8053506   0.8086985  0.8272758
My goal is to plot a histogram with lambda on the y axis and four groups of three treatments on the x axis. Those four groups should be separated from each other by a small gap.
I need help; it doesn't matter whether it is done in ggplot2 or just in base R plotting.
Thanks a lot!
Agree with docendo discimus that maybe a barplot is what you're looking for. Based on what you're asking though I would reshape your data to make it a little easier to work with first and you can still get it done with stat = "identity"
library(tidyr)    # gather() comes from tidyr, not dplyr
library(ggplot2)
# b is the matrix from the question
# convert from matrix to data frame and preserve row names as a column
b <- data.frame(population = row.names(b), as.data.frame(b), row.names = NULL)
# gather into a tidy (long) format for ease of use in ggplot2
b <- gather(b, lambda, value, -1)
# plot 1 as described in the question
ggplot(b, aes(x = population, y = value)) + geom_bar(aes(fill = lambda), stat = "identity", position = "dodge")
# plot 2 using facets to separate as an alternative
ggplot(b, aes(x = population, y = value)) + geom_bar(stat = "identity") + facet_grid(. ~ lambda)
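Since the question also allows base R, a grouped barplot() is another option. The sketch below retypes the matrix from the question under the made-up name lam, with the column names copied as printed there. barplot() draws one group per column, so the matrix is transposed to get one group per population.

```r
lam <- matrix(c(1.0881688, 1.1890554, 1.3653264,
                1.0119031, 1.0687678, 1.1751051,
                0.9540271, 0.9540271, 0.9540271,
                0.8053506, 0.8086985, 0.8272758),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("rec1", "rec2", "rec3", "con1"),
                              c("lambda0.001", "lambda0.002", "lambda.003")))

# beside = TRUE puts the three treatments side by side within each population
# and leaves a gap between the population groups.
mids <- barplot(t(lam), beside = TRUE, ylab = "lambda",
                legend.text = TRUE, args.legend = list(x = "topright"))
```

barplot() returns the bar midpoints (here one row per treatment and one column per population), which can be useful for adding labels or error bars later.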
I am very new to R and I am trying to plot a third variable to a plot using ggplot2. I have searched for an answer and I could not find anything similar (or I didn't know the right words to search).
I have three columns of data which will be my x, y and z variable.
I want a graph that can show the values for x and y axis (as in the first and second column variables). However, I want the "points" (as a scatter plot) in the graph to be the values shown in variable z. Is there a way of doing that?
Everything that I have tried plots x against y.
Thanks for any help!
I believe this is what you are asking: Map two variables: (x,y) in their axis and display the "text" of a third variable.
Let's use this data frame - We'll try to "write" X1 and X3
df <- data.frame(X1 = 1:5, X2 = 2*1:5, X3 = rnorm(1:5))
With base graphics, pch draws only a single character per point, so
plot(df$X1, df$X2, pch = paste(df$X1))
works, but
plot(df$X1, df$X2, pch = paste(df$X3))
doesn't seem to work well.
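For multi-character labels in base graphics, text() is one option. A minimal sketch, with fixed illustrative values for X3 in place of rnorm() so that the result is reproducible:

```r
# Illustrative data; the X3 values are made up.
df <- data.frame(X1 = 1:5, X2 = 2 * 1:5, X3 = c(-1.2, 0.4, 2.1, -0.3, 0.9))

labs <- format(round(df$X3, 2))     # one text label per point
plot(df$X1, df$X2, type = "n")      # set up the axes without drawing points
text(df$X1, df$X2, labels = labs)   # write each label at its (X1, X2) position
```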
Using ggplot2:
ggplot(df, aes(x = X1, y = X2)) + geom_text(label = df$X1)
ggplot(df, aes(x = X1, y = X2)) + geom_text(label = df$X3)
a fancier alternative is adding colour in the aes()
ggplot(df, aes(x = X1, y = X2, color=X3)) + geom_text(label = df$X3)
I want the "points" (as a scatter plot) in the graph to be the values shown in variable z. Is there a way of doing that?
Definitely. The bit that you need to think about is how to present the data in your z variable. By that I mean do you want the information in z to be shown by the points' colour, size or area? There are some great examples of how to do this at the R cookbook.
If you have a data frame called my.data, which has columns x, y, and z, you need to set up your plot like this:
my.plot <- ggplot(data = my.data,
aes(x = x,
y = y))
The example above says "plot the data in my.data using my.data$x to set the x location and my.data$y to set the y location". If your x variable was grid.x and y was grid.y you would have
my.plot <- ggplot(data = my.data,
aes(x = grid.x,
y = grid.y))
Then you need to add your points. This time we'll assume that the information in z is going to be used to set the colour of the points, which in this case is the colour aesthetic:
my.plot <- my.plot + geom_point(aes(colour = z))
print(my.plot)
And that should be that. You don't need to tell geom_point() what x and y are, because you already did that when you set up the plot.
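Put together as a self-contained sketch, with a made-up my.data standing in for the real data:

```r
library(ggplot2)

# Made-up data purely for illustration.
my.data <- data.frame(x = 1:10, y = (1:10)^2, z = seq(0, 1, length.out = 10))

# x and y are mapped once in ggplot(); geom_point() only adds the colour mapping.
my.plot <- ggplot(data = my.data, aes(x = x, y = y)) +
  geom_point(aes(colour = z))
print(my.plot)

# The same idea with point size instead of colour.
my.plot.size <- ggplot(data = my.data, aes(x = x, y = y)) +
  geom_point(aes(size = z))
```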