My problem is the following:
I have to plot a curve showing the number of breakdowns (y) against service life (x), but cumulatively, and that is where I am struggling.
The solution is shown in the second picture and my code in the first. (I think only the plot type needs to change.)
[Image: my code]
[Image: solution]
Thanks so much for any help!
I can't replicate your data, so this is more of a comment than a complete solution. I assume h is the object returned by hist() for your data.
n <- sum(h$counts) # This should sum up to the number of observations
y <- cumsum(h$counts) / n # Your y values
x <- h$mids # I assume these to be your x-axis value, but this might need an edit.
plot(x = x, y = y, type = "l")
Finally, you can add the vertical and horizontal lines via the abline() function at the respective points.
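Putting the pieces together, here is a minimal sketch. The service-life data are simulated (the name life and the exponential distribution are assumptions, not the asker's data), and the right bin edges h$breaks[-1] are used as x values, which is one possible choice besides h$mids.

```r
# Simulated service lives; a stand-in for the real data (assumption)
set.seed(1)
life <- rexp(200, rate = 1 / 1000)

h <- hist(life, breaks = 20, plot = FALSE)  # bin the data without drawing
n <- sum(h$counts)                          # number of observations
y <- cumsum(h$counts) / n                   # cumulative share of breakdowns
x <- h$breaks[-1]                           # right edge of each bin

plot(x, y, type = "l",
     xlab = "Service life", ylab = "Cumulative share of breakdowns")

# Mark the first bin edge at which at least half of the units have failed
x50 <- x[which(y >= 0.5)[1]]
abline(h = 0.5, v = x50, lty = 2)
```

Dropping the division by n would give cumulative counts instead of shares.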
So, I've spent the last four hours trying to find an efficient way of plotting the curve(s) of a function with two variables - to no avail. The only answer that I could actually put to practice wasn't producing a multiple-line graph as I expected.
I created a function with two variables, x and y, and it returns a continuous numeric value. I wanted to plot in a single screen the result of this function with certain values of x and all possible values of y within a given range (y is also a continuous variable).
Something like that:
These two questions did help a little, but I still can't get there:
Plotting a function curve in R with 2 or more variables
How to plot function of multiple variables in R by initializing all variables but one
I also used the mosaic package and plotFun function, but the results were rather unappealing and not very readable: https://www.youtube.com/watch?v=Y-s7EEsOg1E.
Maybe the problem is my lack of proficiency with R - though I've been using it for months so I'm not such a noob. Please enlighten me.
Say we have a simple function with two arguments:
fun <- function(x, y) 0.5*x - 0.01*x^2 + sqrt(abs(y)/2)
And we want to evaluate it on the following x and y values:
xs <- seq(-100, 100, by=1)
ys <- c(0, 100, 300)
This line below might be a bit hard to understand but it does all of the work:
res <- mapply(fun, list(xs), ys)
mapply allows us to run a function with multiple arguments across a range of values. Here we give it only one value for the "x" argument (xs is a long vector, but because it is wrapped in a list it counts as a single element, which mapply recycles). We also give it multiple values for the "y" argument. So the function runs 3 times, each with the same x values and a different value of y.
Results are arranged column-wise, so we end up with a matrix of 3 columns. Now we only have to plot:
cols <- c("black", "cornflowerblue", "orange")
matplot(xs, res, col=cols, type="l", lty=1, lwd=2, xlab="x", ylab="result")
legend("bottomright", legend=ys, title="value of y", lwd=2, col=cols)
Here the matplot function does all the work - it plots a line for every column in the provided matrix. Everything else is decoration.
Here is the result:
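As a hedged alternative that is not part of the original answer: because fun is built only from vectorised operations, outer() should give the same matrix as the mapply call, with one column per value of y. The names res_mapply and res_outer are introduced here for illustration.

```r
fun <- function(x, y) 0.5*x - 0.01*x^2 + sqrt(abs(y)/2)
xs <- seq(-100, 100, by = 1)
ys <- c(0, 100, 300)

# The approach from the answer: one call of fun per value of y
res_mapply <- mapply(fun, list(xs), ys)

# outer() evaluates fun on every (x, y) pair in a single vectorised call
res_outer <- outer(xs, ys, fun)

matplot(xs, res_outer, type = "l", lty = 1, lwd = 2,
        xlab = "x", ylab = "result")
```

Note that outer() only works this way when the function accepts vectors for both arguments; for a function that is not vectorised, the mapply version is the safer choice.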
I have around 20,000 points in my scatter plot. I have a list of interesting points and want to show them in the scatter plot in a different color. Is there a simple way to do it? Thank you.
Further explanation:
I have a matrix consisting of 20,000 rows, say R1 to R20000, and 4 columns, say A, B, C, and D. Each row has its own row name. I want to make a scatter plot of A against C. That is easy with plot(data$A,data$C).
I also have a list of row names whose points I want to locate in the scatter plot, say R1, R3, R5, R10, R20, R25.
I just want the points R1, R3, R5, R10, R20, R25 to have a different color from the other points. Sorry if my explanation is not clear.
If your data is in a simple form, then it is easy to do. For example:
# Make some toy data
dat <- data.frame(x = rnorm(1000), y = rnorm(1000))
# List of indices (or a logical vector) defining your interesting points
is.interesting <- sample(1000, 30)
# Create vector/column of colours
dat$col <- "lightgrey"
dat$col[is.interesting] <- "red"
# Plot
with(dat, plot(x, y, col = col, pch = 16))
Without a reproducible example, it's hard to say anything more specific.
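If the interesting points are identified by row name, as in the question, the same idea can be sketched with %in%. The data frame below is toy data with assumed column names A and C and row names R1 to R100; drawing the highlighted points last with points() keeps them from being hidden under the grey ones.

```r
# Toy data with row names R1..R100 (a stand-in for the real matrix)
set.seed(42)
dat <- data.frame(A = rnorm(100), C = rnorm(100),
                  row.names = paste0("R", 1:100))

# Row names of the points to highlight
interesting <- c("R1", "R3", "R5", "R10", "R20", "R25")
hit <- rownames(dat) %in% interesting   # logical vector, TRUE for matches

# Draw all points in grey, then overplot the interesting ones in red
plot(dat$A, dat$C, col = "lightgrey", pch = 16, xlab = "A", ylab = "C")
points(dat$A[hit], dat$C[hit], col = "red", pch = 16)
```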
Here is some data to work with.
df <- data.frame(x1=c(234,543,342,634,123,453,456,542,765,141,636,3000),x2=c(645,123,246,864,134,975,341,573,145,468,413,636))
If I plot these data, it will produce a simple scatter plot with an obvious outlier:
plot(df$x2,df$x1)
Then I can always write the code below to remove the y-axis outlier(s).
plot(df$x2,df$x1,ylim=c(0,800))
So my question is: is there a way to exclude obvious outliers in scatterplots automatically? Something like what outline=F does for boxplots, for example. To my knowledge, outline=F doesn't work with scatterplots.
This is relevant because I have hundreds of scatterplots and I want to exclude all obvious outlying data points without setting ylim(...) for each individual scatterplot.
You could write a function that flags what you define as an obvious outlier, then use it to subset your data before plotting.
Here all observations with "a" exceeding 5 * median of "a" are excluded.
df <- data.frame(a = c(1,3,4,2,100), b=c(1,3,2,4,2))
f <- function(x){
  x$a > 5*median(x$a) # TRUE marks an outlier; a logical mask stays safe when nothing is flagged
}
with(df[!f(df),], plot(b, a))
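Since the question mentions the boxplot behaviour, one possible definition of "obvious outlier" is the boxplot rule itself. This is only a sketch of that choice, applied to the data from the question: boxplot.stats() from grDevices returns the values beyond the whiskers in its out component, and the name keep is introduced here for illustration.

```r
df <- data.frame(
  x1 = c(234, 543, 342, 634, 123, 453, 456, 542, 765, 141, 636, 3000),
  x2 = c(645, 123, 246, 864, 134, 975, 341, 573, 145, 468, 413, 636)
)

# Values of x1 that a boxplot would draw as outliers (default coef = 1.5)
out <- boxplot.stats(df$x1)$out

# Keep only the rows whose x1 value is not among those outliers
keep <- !(df$x1 %in% out)
plot(df$x2[keep], df$x1[keep], xlab = "x2", ylab = "x1")
```

Wrapping these lines in a function would let the same rule be applied to each of many data frames.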
There is no easy yes/no option to do what you are looking for (the question of defining what is an "obvious outlier" for a generic scatterplot is potentially quite problematic).
That said, it should not be too difficult to write a reasonable function to give y-axis limits from a set of data points. If we take "obvious outlier" to mean a point with y value significantly above or below the bulk of the sample (which could be justified assuming a sufficient distribution of x values), then you could use something like:
ybounds <- function(y){ # y is the response variable in the dataframe
  bounds <- quantile(y, probs=c(0.05, 0.95), type=3, names=FALSE) # use the argument y, not a global data frame
  return(bounds + c(-1,1) * 0.1 * (bounds[2]-bounds[1]) )
}
Then plot each dataframe with plot(df$x, df$y, ylim=ybounds(df$y))
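As a quick check of this idea on the data from the question, here is a sketch in which the helper takes its quantiles from the argument y. The 5% and 95% quantiles and the 10% padding are the answer's choices; other cut-offs may suit other data.

```r
df <- data.frame(
  x1 = c(234, 543, 342, 634, 123, 453, 456, 542, 765, 141, 636, 3000),
  x2 = c(645, 123, 246, 864, 134, 975, 341, 573, 145, 468, 413, 636)
)

# y-axis limits: 5% and 95% quantiles of y, padded by 10% of their spread
ybounds <- function(y) {
  bounds <- quantile(y, probs = c(0.05, 0.95), type = 3, names = FALSE)
  bounds + c(-1, 1) * 0.1 * (bounds[2] - bounds[1])
}

lims <- ybounds(df$x1)
plot(df$x2, df$x1, ylim = lims)   # the point at x1 = 3000 falls outside the limits
```

Unlike subsetting, this leaves the data untouched and only changes what is visible on the plot.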
I want to create a 3d plot with densities.
I use the density function to first create a 2D plot for a specific set of x values; the function computes the densities and I store them in a y variable. Then I pass a second set of x values to density and get a second set of y values, and so on.
I want to put those sets into a 3D plot (I hope you know what I mean), so that I get a surface of densities.
E.g. I have:
x1<-c(1:10)
x2<-c(2:11)
y1<-c(1,1,2,1,3,4,2,3,2,2)
y2<-c(1,2,3,1,3,6,2,8,2,2)
...
Now, for the first value, 1, on the x axis I want to put the first set, with the corresponding x values on the y axis and the densities on the z axis. So I have a "disk" for x=1, a second "disk" for x=2, and so on, giving a density "mountain".
I hope this is understandable; if you have a better idea for how to do it, you are welcome to share it!
I want to do it with the persp function, so it would be nice if you could give an example using that function.
Thanks a lot for your help.
I'm afraid I can't make head or tail out of your question. But here is how you draw a plot of the sort I think you are looking for from a two dimensional dataset for which you first estimate the bivariate density:
x <- rnorm(1000)
y <- 2 + x*rnorm(1000,1,.1) + rnorm(1000)
library(MASS)
den3d <- kde2d(x, y)
persp(den3d, box=FALSE)
Then there are many options for persp, check out
?persp
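If the goal is instead to stack several one-dimensional density estimates side by side, as the question seems to describe, one possible sketch is to evaluate density() for every sample on a common grid and pass the resulting matrix to persp. The samples, the grid limits and all variable names below are made up for illustration.

```r
# Ten hypothetical samples, one per "disk" (assumption, not the asker's data)
set.seed(1)
samples <- lapply(1:10, function(k) rnorm(200, mean = k / 2))

# Evaluate every density on the same grid so the slices line up
grid_from <- -4
grid_to <- 9
n_grid <- 128
dens <- sapply(samples, function(s)
  density(s, from = grid_from, to = grid_to, n = n_grid)$y)
# dens has one row per grid point and one column per sample

slice <- seq_along(samples)                          # x axis: sample index
grid <- seq(grid_from, grid_to, length.out = n_grid) # y axis: data values

# persp expects z with length(x) rows and length(y) columns, hence t()
persp(x = slice, y = grid, z = t(dens), theta = 40, phi = 30,
      xlab = "sample", ylab = "value", zlab = "density")
```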
Building on Peter's answer: the plot can be made more interesting, prettier and interactive with the plotly library.
x <- rnorm(1000)
y <- 2 + x*rnorm(1000,1,.1) + rnorm(1000)
library(MASS)
den3d <- kde2d(x, y)
# the new part:
library(plotly)
plot_ly(x=den3d$x, y=den3d$y, z=den3d$z) %>% add_surface()
which gives:
How can I plot the density of a single-column dataset as dots? For example:
x <- c(1:40)
How can I then add, on the same plot and with the same x-axis and y-axis scales, a line for the density of a second dataset defined by the following equation?
y = exp(-x)
(The equation has been corrected to y = exp(-x).)
By calling plot(density(x)) or plot(density(y)) I get two separate figures. How can I draw them on the same axes, using dots for x and a smoothed line for y?
You can add a line to a plot with the lines() function. Your code, modified to do what you asked for, is the following:
x <- 1:40
y <- exp(-x)
plot(density(x), type = "p")
lines(density(y))
Note that we specified the plot to give us points with the type parameter and then added the density curve for y with lines. The help pages for ?plot, ?par, ?lines would be some insightful reading. Also, check out the R Graph Gallery to view some more sophisticated graphs that generally have the source code attached to them.
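One caveat: lines() draws only within the axis ranges set by the first plot() call, so a second density that lies outside them may be cut off. A hedged sketch of one way around this is to compute both densities first and pass their combined ranges as xlim and ylim (the names dx, dy, xl and yl are introduced here). Because the values of exp(-x) are concentrated near zero, the two curves can differ greatly in scale, so one of them may still look flat on shared axes.

```r
x <- 1:40
y <- exp(-x)

# Estimate both densities before plotting so the axes can cover both
dx <- density(x)
dy <- density(y)
xl <- range(dx$x, dy$x)
yl <- range(dx$y, dy$y)

plot(dx$x, dx$y, type = "p", xlim = xl, ylim = yl,
     xlab = "value", ylab = "Density")   # dots for the density of x
lines(dy)                                # smoothed line for the density of y
```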