I don't know how to plot this in a better way.
I have:
df1 <- data.frame(x=c(1,3,5), y=c(2,4,6))
df2 <- data.frame(x=c(2,6,10,12), y=c(1,4,7,15))
In both data frames, x is the time and y is the measured value.
The data frames have different numbers of elements.
I want to combine these data by x (time) and show them on one plot, using one of two methods: a) put df1.y on the x axis so that the plot shows the distribution of df2 by df1 (the two data frames are matched by time x, but each is shown on its own axis), or b) use three axes, with the y axis for df1.y on the right side of the plot.
For a better terminology, I will rename your example variables according to your sample plots.
df1 <- data.frame(time=c(1,3,5), memory=c(2,4,6))
df2 <- data.frame(time=c(2,6,10,12), threads=c(1,4,7,15))
Your first plot:
From your description, I assume that you want to do the following: for each available time value, get the value of df1$memory and df2$threads. However, that value may not always be available. One suitable approach is to fill in missing values by linear interpolation. This may be done using the approx function:
merged.time <- sort(unique(c(df1$time, df2$time)))
merged.data <- data.frame(time = merged.time,
  memory = approx(df1$time, df1$memory, xout=merged.time)$y,
  threads = approx(df2$time, df2$threads, xout=merged.time)$y
)
Note that approx(...)$y just extracts the interpolated data.
Plotting may now be done using standard plotting commands (or, as your tags suggest, using ggplot2):
library(ggplot2)
ggplot(data=merged.data, aes(x=memory, y=threads)) + geom_line()
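One caveat with the interpolation above: by default, approx returns NA for any time outside the range of the series being interpolated, so only the overlapping time span yields complete rows. A minimal sketch using the renamed example data (the name `complete` is only for illustration):

```r
df1 <- data.frame(time = c(1, 3, 5), memory = c(2, 4, 6))
df2 <- data.frame(time = c(2, 6, 10, 12), threads = c(1, 4, 7, 15))

merged.time <- sort(unique(c(df1$time, df2$time)))
merged.data <- data.frame(
  time    = merged.time,
  memory  = approx(df1$time, df1$memory,  xout = merged.time)$y,
  threads = approx(df2$time, df2$threads, xout = merged.time)$y
)

# Rows outside the overlap of the two time ranges contain NA;
# keep only the rows where both series are defined.
complete <- merged.data[complete.cases(merged.data), ]
print(complete)
```

Passing rule = 2 to approx would instead hold the first or last observed value constant outside the range; whether that is appropriate depends on the data.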
Your second plot
... is not possible with ggplot2. That is for numerous reasons, for example see here.
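Outside ggplot2, base graphics can overlay two plots that share the time axis and draw the second y axis on the right. This is only a sketch of that general technique, not part of the answer above; the margin sizes, colours and the shared `xlim` are illustrative choices:

```r
df1 <- data.frame(time = c(1, 3, 5), memory = c(2, 4, 6))
df2 <- data.frame(time = c(2, 6, 10, 12), threads = c(1, 4, 7, 15))
xlim <- range(c(df1$time, df2$time))   # common time range for both layers

op <- par(mar = c(5, 4, 4, 4) + 0.1)   # extra room on the right for the second axis
plot(df2$time, df2$threads, type = "l", col = "blue",
     xlim = xlim, xlab = "time", ylab = "threads")
par(new = TRUE)                        # draw the next plot on top of the current one
plot(df1$time, df1$memory, type = "l", col = "red",
     xlim = xlim, axes = FALSE, xlab = "", ylab = "")
axis(side = 4)                         # y axis for memory on the right-hand side
mtext("memory", side = 4, line = 3)
par(op)                                # restore the previous margins
```

The two y scales are independent, so the points where the lines appear to cross carry no meaning.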
I would like to create an interactive 3D surface plot of depths in a lake, ideally using the plotly or rgl libraries. I have extracted my data from a SpatialLinesDataFrame of contour lines in Gauss-Krueger/EPSG:31468 CRS, i.e. metric units. Now each contour line produces a set of coordinates with the same depth value. The resulting data frame is rather large, but looks something like this:
set.seed(41)
xx <- rnorm(100,4448929,100)
yy <- rnorm(100,5308097,100)
zz <- c(rep(-10,10),rep(-20,10),rep(-30,10),rep(-40,10),rep(-50,10),rep(-60,10),rep(-70,10),rep(-80,10),rep(-90,10),rep(-100,10))
df <- data.frame(xx,yy,zz)
I have tried plotting the data with plotly as in this example and with rgl as in this post. In both cases I get error messages relating to my data not being in a matrix format, i.e. where x- and y-values are represented as row- and column-numbers.
What does work, is using the add_trace command in plotly:
plot_ly(data = df) %>% add_trace(x = ~xx, y = ~yy, z = ~zz, type = "mesh3d")
However, the resulting graph not only lacks the fancy colour legend of the add_surface command, but more importantly, warps the x- and y-values in relation to the z-values. The z-values are shown much too large, although all have the same metric unit.
I have also tried reshaping the data frame to a matrix as in this post, but it either doesn't work at all, or gives me a matrix consisting almost entirely of NAs. I can only speculate that the number of coordinates that have depth values attached is very small in comparison to all x-y-combinations of coordinates in that range?
Any suggestions will be much appreciated - thanks!
Those are randomly located points, so rgl::persp3d can't handle them directly. However, you can follow the example in ?rgl::persp3d.deldir to triangulate them and then plot. For example,
library(rgl)  # provides persp3d(); the deldir package must also be installed
dxyz <- deldir::deldir(df$xx, df$yy, z = df$zz, suppressMsgs=TRUE)
persp3d(dxyz, col = "lightblue")
This results in a pretty ugly picture, but with some work (e.g. fixing the axis labels, using real data) you should get something reasonable.
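The mostly-NA matrix described in the question follows from the same fact: with scattered points, almost every x and y value is unique, so a matrix with one row per x and one column per y has a single filled cell per point. A small sketch with the example data (the name `m` is only for illustration):

```r
set.seed(41)
xx <- rnorm(100, 4448929, 100)
yy <- rnorm(100, 5308097, 100)
zz <- rep(seq(-10, -100, by = -10), each = 10)
df <- data.frame(xx, yy, zz)

# One row per distinct x, one column per distinct y, depth in the cells
m <- tapply(df$zz, list(df$xx, df$yy), mean)
dim(m)           # matrix dimensions
mean(is.na(m))   # share of empty cells
```

This is why scattered points need to be triangulated, or interpolated onto a regular grid, before a surface function can use them.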
I am trying to plot several histograms for the same data set, but with different numbers of bins. I am using Gadfly.
Suppose x is just an array of real values, plotting each histogram works:
plot(x=x, Geom.histogram(bincount=10))
plot(x=x, Geom.histogram(bincount=20))
But I'm trying to put all the histograms together. I've added the number of bins as another dimension to my data set:
x2 = vcat(hcat(10*ones(length(x)), x), hcat(20*ones(length(x)), x))
df = DataFrame(Bins=x2[:,1], X=x2[:,2])
Is there any way to send the number of bins (the value from the first column) to Geom.histogram when using Geom.subplot_grid? Something like this:
plot(df, x="X", ygroup="Bins", Geom.subplot_grid(Geom.histogram(?)))
I think you would be better off not using subplot_grid here, and instead combining the plots with vstack or hstack. From the docs:
Plots can also be stacked horizontally with ``hstack`` or vertically with
``vstack``. This allows more customization in regards to tick marks, axis
labeling, and other plot details than is available with ``subplot_grid``.
Sequential portions of my time series are under different treatments, and I'd like to separately color a line connecting observations in each portion.
For example, in the series under treatment A I'd have a red line, and in the succeeding series under treatment B I'd have a blue line.
plot(response, type="l", col="treatment") failed: all observations were connected by a line of the same color.
A mailing list posting proposed splitting the data by treatment and then plotting each subset separately on the same plot (http://r.789695.n4.nabble.com/Can-R-plot-multicolor-lines-td791081.html).
Is there a more elegant way?
An alternative using Map that avoids manually plotting segments:
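dat <- data.frame(treatment=factor(rep(LETTERS[1:2],3:4)),
                  response=c(6,5,2,1,5,6,7),time=1:7)
plot(response ~ time, data=dat, type="n")
# draw one line per treatment; the factor's integer code (1, 2) selects the colour
invisible(Map(
  function(x) lines(response ~ time, data=x, col=as.integer(x$treatment[1])),
  split(dat, dat$treatment)
))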
There are two popular, more elegant ways. One is to use the ggplot2 package; without more information it's hard to advise you beyond looking at its help pages or examples. The other is the function matplot. That requires you to first restructure your data as a matrix, but it can then easily do what you want. Keep in mind that while the help says "Plot the columns of one matrix against the columns of another", the x argument can be a vector of the same length as one column of the matrix containing your line information. The function will just recycle the x vector.
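As a rough sketch of the matplot route, assuming a data frame with treatment, response and time columns and one matrix column per treatment; the colours red and blue follow the question, and the name `m` is only for illustration:

```r
dat <- data.frame(treatment = rep(LETTERS[1:2], 3:4),
                  response  = c(6, 5, 2, 1, 5, 6, 7),
                  time      = 1:7)

# One column per treatment; NA wherever the other treatment applies
m <- cbind(A = ifelse(dat$treatment == "A", dat$response, NA),
           B = ifelse(dat$treatment == "B", dat$response, NA))

# The single time vector is recycled against both columns
matplot(dat$time, m, type = "l", lty = 1, col = c("red", "blue"),
        xlab = "time", ylab = "response")
```

As with splitting the data, the two lines are not joined between the last A observation and the first B observation.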
I have a data frame data_2 and wish to create a Bland-Altman plot to compare the differences between the data in the columns alog1 vs. dig1.
Please help with the function for this and how to execute this. Would the function be barplot()?
Thanks for your time.
Another name for a Bland-Altman plot is a Tukey mean-difference plot. (I have nothing against Bland and Altman, but I think 'mean-difference' is more descriptive.) Note that this is different from a boxplot (compare the pictures on the two Wikipedia pages). The mean-difference plot is simply a regular scatterplot, except that instead of plotting x versus y, you plot the difference x-y against the mean of x and y (or in your case, alog1 and dig1). Probably the easiest way to make this is to form these two new variables first, and then simply plot them as you would any other scatterplot. Here is some sample code:
mn <- (data_2$alog1 + data_2$dig1)/2
dif <- data_2$alog1 - data_2$dig1
plot(mn, dif)
If you wanted to add arguments to customize your plot, you could do that just as you normally would, for example:
plot(mn, dif, main="Bland-Altman plot", xlab="mean of alog1 & dig1",
ylab="difference between alog1 & dig1")
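Bland-Altman plots usually also show the mean difference and the limits of agreement (mean difference plus or minus 1.96 standard deviations). A minimal sketch; the values in `data_2` here are made up purely for illustration, and `bias` and `loa` are hypothetical names:

```r
# Hypothetical stand-in for the real data_2
data_2 <- data.frame(alog1 = c(10.2, 11.5, 9.8, 12.1, 10.9, 11.0),
                     dig1  = c(10.0, 11.9, 9.5, 12.4, 10.6, 11.3))

mn  <- (data_2$alog1 + data_2$dig1) / 2
dif <- data_2$alog1 - data_2$dig1

bias <- mean(dif)                          # mean difference
loa  <- bias + c(-1.96, 1.96) * sd(dif)    # limits of agreement

plot(mn, dif, ylim = range(c(dif, loa)),
     xlab = "mean of alog1 & dig1", ylab = "difference between alog1 & dig1")
abline(h = bias)            # solid line at the mean difference
abline(h = loa, lty = 2)    # dashed lines at the limits of agreement
```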
I have been searching for hours, but I can't find a function that does this.
How do I generate a plot like
Let's say I have an array x1 = c(2,13,4) and y2 = c(5,23,43). I want to create 3 blocks with heights spanning 2-5, 13-23, ...
How would I approach this problem? I'm hoping that I could be pointed in the right direction as to what built-in function to look at?
I have not used your data because you say you are working with an array, but you gave us two vectors. Moreover, the data you showed us is overlapping. This means that if you chart three bars, you only see two.
Based on the little image you provided, you have three ranges you want to plot for each individual or date. With time series, this kind of chart is commonly used to show the min/max, the standard deviation and the current data.
The trick is to chart the series as layers. The first series is the one with the largest range (the beige band in this example). In the following example, I chart an empty plot first and I add three layers of rectangles, one for beige, one for gray and one for red.
# Create data.frame
n <- 100
df <- data.frame(id = 1:n,
                 beige.min = runif(n)*10,      beige.max = 60 + runif(n)*10,
                 gray.min  = 25 + runif(n)*10, gray.max  = 40 + runif(n)*10,
                 red.min   = 35 - runif(n)*10, red.max   = 35 + runif(n)*10)
# Create chart
plot(x=df$id, y=NULL, ylim=range(df[,-1]), type="n")  # blank chart, ylim is the range of the data
rect(df$id-0.5, df$beige.min, df$id+0.5, df$beige.max, col="beige",   border=NA)  # first layer
rect(df$id-0.5, df$gray.min,  df$id+0.5, df$gray.max,  col="gray",    border=NA)  # second layer
rect(df$id-0.5, df$red.min,   df$id+0.5, df$red.max,   col="darkred", border=NA)  # third layer
It's not entirely clear what you want based on the png, but based on what you've written:
x1 <- c(2,13,4)
y2 <- c(5,23,43)
foo <- data.frame(id=1:3, x1, y2)
library(ggplot2)
ggplot(data=foo) + geom_rect(aes(ymin=x1, ymax=y2, xmin=id-0.4, xmax=id+0.4))
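The same picture can be sketched in base graphics with rect, using the vectors from the question; the bar half-width of 0.4 mirrors the ggplot2 call above, and the fill colour is an arbitrary choice:

```r
x1 <- c(2, 13, 4)
y2 <- c(5, 23, 43)
id <- seq_along(x1)

# Empty plot sized to the data, then one rectangle per (x1, y2) pair
plot(1, type = "n", xlim = c(0.5, length(id) + 0.5), ylim = range(c(x1, y2)),
     xlab = "id", ylab = "value", xaxt = "n")
axis(side = 1, at = id)
rect(id - 0.4, x1, id + 0.4, y2, col = "gray", border = NA)
```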