Setting custom LCL and UCL limits with qcc (RStudio)

Perhaps this is an easy question, but I am quite new to R and am struggling to define custom UCL and LCL limits in xBar control charts. In production we already have tolerances that must be met, and I would like to set the limits (LCL and UCL) according to those tolerances, but I do not know how.
Here is a simple example to illustrate:
library(qcc)
data(pistonrings)
diameter <- pistonrings$diameter
q1 <- qcc(diameter, type = "xbar.one", plot = TRUE)
This creates the xBar chart, with the two limits computed from the measurements and a confidence interval. I would like to set them as follows (just as an example) and have the results calculated from these values:
LCL: 73.99
UCL: 74.02
Is it possible?

I fixed the issue. It was enough to specify the limits through the limits argument of the qcc function:
q1 <- qcc(diameter, type = "xbar.one", plot = TRUE, limits = c(73.99,74.02))
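As a cross-check on the chart, the same tolerances can be applied directly to the measurements to list which observations fall outside them. The sketch below uses only base R and simulated values standing in for the piston-ring diameters; the simulated data and the names lcl, ucl, out_of_tolerance, and share_out are assumptions for illustration, not part of qcc.

```r
# Simulated stand-in for pistonrings$diameter (assumption for illustration)
set.seed(123)
diameter <- rnorm(200, mean = 74.0, sd = 0.01)

# Tolerance limits taken from the question
lcl <- 73.99
ucl <- 74.02

# Indices of the observations that fall outside the tolerance band
out_of_tolerance <- which(diameter < lcl | diameter > ucl)

# Share of measurements that violate the tolerances
share_out <- length(out_of_tolerance) / length(diameter)
share_out
```

With real data, replacing the simulated vector with pistonrings$diameter should give the points that the chart with custom limits marks as beyond the limits.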

Related

Finding the x-value at a certain y-value on a ggplot

I am currently having some difficulty finding the 50% effective concentration (EC50) for one of my datasets. In short, the data show how glutathione (GSH) levels in cells are depleted from 100% when the cells are exposed to a substance known as HEMA.
GSH50 <- read.table("Master list for all GSH data T9 TVN.csv", header = TRUE, sep = ";", dec = ",")
After some further subsetting, I end up with a plot like this:
(image: GSH plot)
I have several more plots in addition to this one, so I need to find the EC50 value for each of them in order to compare them with each other (the problem is consistent across several plots, so if it can be fixed here it should be fixed for the others as well).
From an earlier dataset with almost the same setup (the only difference being x-axis values) I managed to get fairly correct EC50 using a setup like this:
HG <- approxfun(x, y)
optimize(function(t0) abs(HG(t0) - 50), interval = range(x))
I then took my EC50 value from the result of optimize. However, it does not work on this data for some reason: if I plug in the value from optimize, I end up with this plot instead (image: GSH plot).
If somebody has any idea how I can fix this issue, it would be most appreciated.
Edit
If you want a reproducible dataset I gathered the averages of the data, and as such the plot should still be similar to the GSH plots I have shown:
library(ggplot2)
Concentration <- seq(from = 0, to = 9, by = 1)
GSH <- c(100, 67.405, 47.78, 39.2325, 33.97, 28.435, 26.97, 24.5125, 23.5275, 21.565)
df <- data.frame(Concentration, GSH)
ggplot(df, aes(Concentration, GSH)) + geom_smooth()
I am quite certain that the dose is high enough to reach the lower level, but I have not stored the model anywhere. I hope the example data provided is enough.
Edit2
I should mention that the approxfun and optimize code does work for the example when we use geom_line(), but for some reason it is not as accurate with geom_smooth().
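A likely reason for the mismatch is that geom_smooth() does not draw the raw points: for small datasets it fits a loess curve by default, so interpolating through the raw data with approxfun() answers a slightly different question than reading 50% off the smoothed line. A minimal sketch, assuming the stats::loess() defaults are what the plot used, fits the same kind of smoother explicitly and solves for the crossing with uniroot(). The names fit, smooth_at, ec50_linear, and ec50_smooth are illustrative.

```r
Concentration <- 0:9
GSH <- c(100, 67.405, 47.78, 39.2325, 33.97, 28.435, 26.97, 24.5125, 23.5275, 21.565)
df <- data.frame(Concentration, GSH)

# Linear interpolation through the raw points (what a geom_line() plot shows)
HG <- approxfun(df$Concentration, df$GSH)
ec50_linear <- uniroot(function(x) HG(x) - 50,
                       interval = range(df$Concentration), tol = 1e-9)$root

# Loess fit with stats::loess() defaults (assumed to mirror geom_smooth())
fit <- loess(GSH ~ Concentration, data = df)
smooth_at <- function(x) predict(fit, newdata = data.frame(Concentration = x))
ec50_smooth <- uniroot(function(x) smooth_at(x) - 50,
                       interval = range(df$Concentration), tol = 1e-9)$root

# The two estimates differ because they are read off different curves
c(linear = ec50_linear, smooth = ec50_smooth)
```

uniroot() is used here instead of optimize() because it looks for the sign change of the curve minus 50 directly, which is well defined as long as the curve crosses 50 exactly once within the interval.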

Error in axis(side = side, at = at, labels = labels, ...) : invalid value specified for graphical parameter "pch"

I have applied the DBSCAN algorithm to the built-in iris dataset in R, but I get an error when I try to visualise the output using plot().
Here is my code:
library(fpc)
library(dbscan)
data("iris")
head(iris,2)
data1 <- iris[,1:4]
head(data1,2)
set.seed(220)
db <- dbscan(data1,eps = 0.45,minPts = 5)
table(db$cluster,iris$Species)
plot(db,data1,main = 'DBSCAN')
Error: Error in axis(side = side, at = at, labels = labels, ...) :
invalid value specified for graphical parameter "pch"
How to rectify this error?
I have a suggestion below, but first I see two issues:
You're loading two packages, fpc and dbscan, both of which have different functions named dbscan(). This could create tricky bugs later (e.g. if you change the order in which you load the packages, different functions will be run).
It's not clear what you're trying to plot, either what the x- or y-axes should be or the type of plot. The function plot() generally takes a vector of values for the x-axis and another for the y-axis (although not always, consult ?plot), but here you're passing it a data.frame and a dbscan object, and it doesn't know how to handle it.
Here's one way of approaching it, using ggplot() to make a scatterplot, and dplyr for some convenience functions:
# load our packages
# note: only loading dbscan, not loading fpc since we're not using it
library(dbscan)
library(ggplot2)
library(dplyr)
# run dbscan::dbscan() on the first four columns of iris
db <- dbscan::dbscan(iris[,1:4],eps = 0.45,minPts = 5)
# create a new data frame by binding the derived clusters to the original data
# this keeps our input and output in the same dataframe for ease of reference
data2 <- bind_cols(iris, cluster = factor(db$cluster))
# make a table to confirm it gives the same results as the original code
table(data2$cluster, data2$Species)
# using ggplot, make a point plot with "jitter" so each point is visible
# x-axis is species, y-axis is cluster, also coloured according to cluster
ggplot(data2) +
  geom_point(mapping = aes(x = Species, y = cluster, colour = cluster),
             position = "jitter") +
  labs(title = "DBSCAN")
Here's the image it generates:
If you're looking for something else, please be more specific about what the final plot should look like.
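If the goal is simply to see the clusters in the original feature space, a scatterplot matrix coloured by cluster label is another option. The sketch below is one possible approach rather than a fix for the original plot() call. cluster_colours() is a hypothetical helper written for this example, and it assumes the dbscan package's convention that label 0 marks noise points.

```r
# Hypothetical helper: map cluster labels to colours, with noise (label 0) in grey
cluster_colours <- function(cluster) {
  pal <- palette()
  ifelse(cluster == 0L, "grey60", pal[(cluster - 1L) %% length(pal) + 1L])
}

# Run the clustering and the plot only if the dbscan package is installed
if (requireNamespace("dbscan", quietly = TRUE)) {
  features <- iris[, 1:4]
  # call the function with its namespace to avoid clashing with fpc::dbscan()
  db <- dbscan::dbscan(features, eps = 0.45, minPts = 5)
  pairs(features, col = cluster_colours(db$cluster), pch = 19, main = "DBSCAN")
}
```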

Lag 0 is not plotted in ggCcf

With the following code I plotted the cross-correlation of my data. Everything works well, except that the plot does not show lag 0, which is very important for my study.
p = ggCcf(
  df_ccf$Asia_Co,
  df_ccf$EU_USA,
  lag.max = 10,
  type = c("correlation", "covariance"),
  plot = TRUE,
  na.action = na.contiguous)
plot(p)
The plot looks like this:
Head of data:
I encountered the same issue; it might be an issue/bug with 'ggCcf' from the forecast library. I couldn't get ggCcf to work, no matter what I tried. Anyone who wants to reproduce this behaviour, try:
ggCcf(c(1,2,3,4),c(2,3,4,6))
The workaround is using regular/base R ccf:
max_lag = 10
# series1 and series2 are the two series being compared
result = ccf(series1, series2, lag.max = max_lag)
y = result$acf
x = c(-max_lag:max_lag)
You can use these two series to plot the ccf using ggplot2 and choosing an appropriate ylim.
The downside of all this is less convenience, but the upside is that you can add some flair/styling to your plot now that you are doing everything yourself anyway ;).
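To make the workaround concrete, here is a minimal sketch that computes the cross-correlation with stats::ccf(), collects all lags (including lag 0) in a data frame, and draws it with ggplot2 if that package is available. The simulated series1 and series2, the names ccf_df and bound, and the styling choices are assumptions for illustration. The dashed lines use the same approximate 95% bound that the base ccf plot draws.

```r
# Simulated stand-ins for the two series (assumption for illustration)
set.seed(42)
series1 <- rnorm(100)
series2 <- series1 + rnorm(100, sd = 0.5)

max_lag <- 10
# plot = FALSE returns the values without drawing the base R plot
result <- ccf(series1, series2, lag.max = max_lag, plot = FALSE)

# One row per lag, from -max_lag to max_lag, lag 0 included
ccf_df <- data.frame(lag = as.vector(result$lag), acf = as.vector(result$acf))

# Approximate 95% significance bound
bound <- qnorm(0.975) / sqrt(result$n.used)

# Draw the plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(ccf_df, aes(x = lag, y = acf)) +
    geom_hline(yintercept = 0) +
    geom_hline(yintercept = c(-bound, bound), linetype = "dashed", colour = "blue") +
    geom_segment(aes(xend = lag, yend = 0)) +
    scale_x_continuous(breaks = -max_lag:max_lag) +
    labs(x = "Lag", y = "CCF")
  print(p)
}
```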

R: Histogram with both custom breaks and constant width

I have some skewed data and want to create a histogram with custom breaks, but want it to actually look readable w/ constant widths for the bins (which would throw off the scale of the x axis, but that's fine). Does anyone know how to do this in ggplot/R?
This is what I don't want, but I don't know how to make breaks not override the width argument:
library(ggplot2)
test_data = rep(c(1,2,3,4,5,8,9,14,20,42,98,101,175), c(50,40,30,20,10,6,6,7,9,5,6,4,1))
buckets = c(-.5,.5,1.5,2.5,3.5,4.5,5.5,10.5,99.5,200)
q1 = qplot(test_data,geom="histogram",breaks=buckets)
print(q1)
Not the histogram I want :(
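As ulfelder suggested, use cut(). It turns each value into a bin label (a factor), so every bin is drawn with the same width:
library(ggplot2)
test_data = rep(c(1,2,3,4,5,8,9,14,20,42,98,101,175),
                c(50,40,30,20,10,6,6,7,9,5,6,4,1))
buckets = c(-.5,.5,1.5,2.5,3.5,4.5,5.5,10.5,99.5,200)
# the binned values are discrete, so draw them as bars (one per bin);
# recent ggplot2 versions reject geom="histogram" for a discrete x
q1 = qplot(cut(test_data, buckets), geom="bar")
print(q1)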
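The reason cut() helps is that each bin becomes one level of a factor, and factor levels are spaced evenly on a discrete axis regardless of the numeric range they cover. A short sketch of the counts behind such a plot, with an optional bar chart built with ggplot() directly, follows. The names binned and bin_counts are illustrative, and drop = FALSE is an added choice so that the empty first bin stays on the axis.

```r
test_data <- rep(c(1, 2, 3, 4, 5, 8, 9, 14, 20, 42, 98, 101, 175),
                 c(50, 40, 30, 20, 10, 6, 6, 7, 9, 5, 6, 4, 1))
buckets <- c(-.5, .5, 1.5, 2.5, 3.5, 4.5, 5.5, 10.5, 99.5, 200)

# Assign every value to its bin; the result is a factor with one level per bin
binned <- cut(test_data, breaks = buckets)

# Counts per bin, which are the bar heights
bin_counts <- table(binned)
bin_counts

# Draw the bars only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data.frame(bin = binned), aes(x = bin)) +
    geom_bar() +
    scale_x_discrete(drop = FALSE) +
    labs(x = "Bin", y = "Count")
  print(p)
}
```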

How to plot a violin scatter boxplot (in R)?

I just came by the following plot:
I wondered how it can be done in R (or other software)?
Update 10.03.11: Thank you to everyone who participated in answering this question; you gave wonderful solutions! I've compiled all the solutions presented here (as well as some others I came across online) in a post on my blog.
Make.Funny.Plot does more or less what I think it should do. To be adapted according to your own needs, and might be optimized a bit, but this should be a nice start.
Make.Funny.Plot <- function(x){
  unique.vals <- length(unique(x))
  N <- length(x)
  N.val <- min(N/20,unique.vals)
  if(unique.vals>N.val){
    x <- ave(x,cut(x,N.val),FUN=min)
    x <- signif(x,4)
  }
  # construct the outline of the plot
  outline <- as.vector(table(x))
  outline <- outline/max(outline)
  # determine some correction to make the V shape,
  # based on the range
  y.corr <- diff(range(x))*0.05
  # Get the unique values
  yval <- sort(unique(x))
  plot(c(-1,1),c(min(yval),max(yval)),
       type="n",xaxt="n",xlab="")
  for(i in 1:length(yval)){
    n <- sum(x==yval[i])
    x.plot <- seq(-outline[i],outline[i],length=n)
    y.plot <- yval[i]+abs(x.plot)*y.corr
    points(x.plot,y.plot,pch=19,cex=0.5)
  }
}
N <- 500
x <- rpois(N,4)+abs(rnorm(N))
Make.Funny.Plot(x)
EDIT : corrected so it always works.
I recently came upon the beeswarm package, that bears some similarity.
The bee swarm plot is a one-dimensional scatter plot like "stripchart", but with closely-packed, non-overlapping points.
Here's an example:
library(beeswarm)
beeswarm(time_survival ~ event_survival, data = breast,
         method = 'smile',
         pch = 16, pwcol = as.numeric(ER),
         xlab = '', ylab = 'Follow-up time (months)',
         labels = c('Censored', 'Metastasis'))
legend('topright', legend = levels(breast$ER),
       title = 'ER', pch = 16, col = 1:2)
(source: eklund at www.cbs.dtu.dk)
I have come up with code similar to Joris's, though I think this is more than a stem plot. By that I mean that the y value in each series is the absolute value of the distance to the in-bin mean, and the x value reflects whether the value is lower or higher than the mean.
Example code (sometimes throws warnings but works):
px<-function(x,N=40,...){
  x<-sort(x);
  #Cutting in bins
  cut(x,N)->p;
  #Calculate the means over bins
  sapply(levels(p),function(i) mean(x[p==i]))->meansl;
  means<-meansl[p];
  #Calculate the mins over bins
  sapply(levels(p),function(i) min(x[p==i]))->minl;
  mins<-minl[p];
  #Each dot is one value.
  #X is the rank of a value inside its bin, shifted so that values lower than the bin mean fall below 0
  X<-rep(0,length(x));
  for(e in levels(p)) X[p==e]<-(1:sum(p==e))-1-sum((x-means)[p==e]<0);
  #Y is the bin minimum + the absolute difference between the value and its bin mean
  plot(X,mins+abs(x-means),pch=19,cex=0.5,...);
}
Try the vioplot package:
library(vioplot)
vioplot(rnorm(100))
(with awful default color ;-)
There is also wvioplot() in the wvioplot package, for weighted violin plot, and beanplot, which combines violin and rug plots. They are also available through the lattice package, see ?panel.violin.
Since this hasn't been mentioned yet: there is also ggbeeswarm, a relatively new R package based on ggplot2. It adds another geom to ggplot2 that can be used instead of geom_jitter or the like.
In particular, geom_quasirandom (see the second example below) produces really good results, and I have in fact adopted it as my default plot.
Noteworthy is also the package vipor (VIolin POints in R) which produces plots using the standard R graphics and is in fact also used by ggbeeswarm behind the scenes.
set.seed(12345)
install.packages('ggbeeswarm')
library(ggplot2)
library(ggbeeswarm)
ggplot(iris,aes(Species, Sepal.Length)) + geom_beeswarm()
ggplot(iris,aes(Species, Sepal.Length)) + geom_quasirandom()
#compare to jitter
ggplot(iris,aes(Species, Sepal.Length)) + geom_jitter()
