I am trying to make a good boxplot. As you can see in the picture, to get a clear visualization it is necessary to "zoom in" on the bulk of the data.
I did this with the ylim option.
As you can see in the picture below, I added a main title, but the outliers run through the title, and that is the problem.
I think I could solve the problem by deleting the outliers from the original data, but I was wondering whether it is possible to cut off the boxplot's outlier line at 0.10, so that the boxplot stays inside the figure.
My code so far:
boxplot (genergy$Measurevalue, ylim= c(0,0.1), ylab = "Measured Value",
main="Boxplot Measured Value", col = "red")
UPDATE:
@Twitch_City: I don't think that using a different ylim is the solution. For example:
boxplot (genergy$Measurevalue, ylim= c(0,0.50), ylab = "Measured Value",
main="Boxplot Measured Value", col = "red")
@akash87, sure. The data is:
You could use outline=FALSE to avoid plotting the outliers completely. You could then provide data about the outliers separately (for example, using fivenum or other summary).
Here's an example using random data generated from a chi-squared distribution with df=3; the data are quite positively skewed, as your data seem to be. Save the boxplot stats to obtain info on the outliers.
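N <- 500000
dat <- rchisq(N, 3)
# outline=FALSE hides the outlier points; the returned stats still hold them in $out
dat.box <- boxplot(dat, cex=.5, outline=FALSE, las=1)
# five-number summary of the outliers only
cat(fivenum(dat.box$out))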
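If the outliers are hidden, it may help to report how many there are and where they lie. The following is a minimal sketch using boxplot.stats on simulated data; the sample size, the seed, and the names bs, n_out, share_out and out_range are illustrative assumptions, not part of the original question.

```r
# Simulated, positively skewed data (assumption: stands in for the real measurements)
set.seed(1)
dat <- rchisq(5000, df = 3)

# boxplot.stats computes the same statistics boxplot() uses, without drawing anything
bs <- boxplot.stats(dat)

n_out     <- length(bs$out)        # number of points beyond the whiskers
share_out <- n_out / length(dat)   # proportion of the sample they represent
out_range <- range(bs$out)         # smallest and largest outlier

cat("outliers:", n_out, "share:", round(share_out, 3),
    "range:", round(out_range, 2), "\n")
```

These numbers could go in a caption or in the text next to a boxplot drawn with outline=FALSE.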
Another alternative is to plot a kernel density curve and add lines corresponding to the desired quantiles. As below:
plot(density(dat), las=1)
abline(v=median(dat), col='black')
abline(v=quantile(dat, .25), lty=3, col='red')
abline(v=quantile(dat, .75), lty=3, col='red')
Related
I am trying to plot two sets of data on one histogram, but I don't want the bars to overlap; I want them next to each other in the same plot. Currently I am using this code:
plot(baxishist1,freq=FALSE, xlab = 'B-Axis (mm)', ylab = 'Percent of Sample', main = 'Distribution of B-Axis on Moraine 1', ylim=c(0,30),breaks=seq(25,60,1), col='blue')
par(new=T)
plot(baxishist2,freq=FALSE, xlab = 'B-Axis (mm)', ylab = 'Percent of Sample', main = 'Distribution of B-Axis on Moraine 2', ylim=c(0,30),breaks=seq(25,60,1), col='red')
and the result is a histogram with overlapping bars.
Can anyone help me put the bars in the same bins without overlapping, so that I can see both histograms?
You can make this a little easier to interpret by using transparent colors.
Let's first generate some data:
a <- rnorm(100)
b <- rnorm(100, mean=3)
And now plot the histograms:
hist(a, col=rgb(1,0,0,0.5))
hist(b, col=rgb(0,1,0,0.5), add=T)
As you can see, both are now somewhat visible, but we would have to adjust the x-axis manually to accommodate both distributions. In any case, it is still not easy to read or interpret, so I would rather plot two separate histograms, a boxplot, or a violin plot.
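If the goal is bars sitting next to each other within the same bins, one possible approach (a sketch, not the method used in the answer above) is to bin both samples with a shared set of breaks and hand the counts to barplot with beside=TRUE. The bin width of 1 and the names breaks, ha, hb, counts and bar_pos are assumptions chosen for illustration.

```r
set.seed(42)
a <- rnorm(100)
b <- rnorm(100, mean = 3)

# Shared breaks so both samples are counted in identical bins (bin width 1 is an assumption)
breaks <- seq(floor(min(a, b)), ceiling(max(a, b)), by = 1)

# plot = FALSE returns the binned counts without drawing
ha <- hist(a, breaks = breaks, plot = FALSE)
hb <- hist(b, breaks = breaks, plot = FALSE)

# One row per sample, one column per bin
counts <- rbind(a = ha$counts, b = hb$counts)

# beside = TRUE draws the two bars of each bin side by side
bar_pos <- barplot(counts, beside = TRUE,
                   col = c("red", "green"),
                   names.arg = ha$mids,
                   xlab = "Bin midpoint", ylab = "Count",
                   legend.text = TRUE)
```

The x-axis then shows bin midpoints as category labels rather than a continuous scale, which is the price of using barplot instead of hist.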
I am a beginner with R. I managed to plot my data as overlapping histograms. However, I would like to place all the histograms on one page. I am struggling because I cannot tell R which sets to pick (I only manage to plot one of the plots).
This is the code:
df<-read.csv("Salt dshalo sizes.csv",header=T)
#View(df)
library(ggplot2)
DSA<-df[,1]
DS1<-df[,2]
DSB<-df[,5]
DS2<-df[,6]
DSC<-df[,9]
DS3<-df[,10]
#remove the NA column by columns separately or it will chop the data
DSA=na.omit(DSA)
DS1=na.omit(DS1)
DSB=na.omit(DSB)
DS2=na.omit(DS2)
DSC=na.omit(DSC)
DS3=na.omit(DS3)
#plot histograms for DSA, DSB and DSC on one same graph
hist(DSA, prob=TRUE, main="Controls", xlab="Sizes (um)", ylab="Frequency", col="yellowgreen",xlim= c(5,25), ylim=c(0,0.5), breaks=10)
hist(DSB, prob=TRUE, col=rgb(0,0,1,0.5),add=T)
hist(DSC, prob=TRUE, col=rgb(0.8,0,1,0.5),add=T)
#add a legend to the histogram
legend("topright", c("Control 1", "Control2", "Control3"), text.width=c(1,1,1),lwd=c(2,2,2),
col=c(col="yellowgreen", col="blue", col="pink",cex= 1))
box()
#plot histograms for DS1, DS2 and DS3 on one same graph
hist(DS1, prob=TRUE, main="Monoculture Stressed", xlab="Sizes (um)", ylab="Frequency", col="yellowgreen",xlim= c(5,25), ylim=c(0,0.5), breaks=10)
hist(DS2, prob=TRUE, col=rgb(0,0,1,0.5),add=T)
hist(DS3, prob=TRUE, col=rgb(0.8,0,1,0.5),add=T)
#add a legend to the histogram
legend("topright", c("DS1", "DS2", "DS3"), text.width=c(1,1,1),lwd=c(2,2,2),
col=c(col="yellowgreen", col="blue", col="pink",cex= 1))
box()
# put both overlapping histograms onto one page
combined <- par(mfrow=c(1, 2))
plot(hist(DSA),main="Controls")
plot(hist(DS1),main="Monoculture stressed")
par(combined)
Basically, I get two separate overlapping histograms, but cannot put them on the same page.
EDIT: I evidently didn't read your question thoroughly. I see you figured out the add =T.
I assume what you are looking for then is the comment I made first:
par(mfrow = c(a,b)), where a and b are the numbers of rows and columns in which you want the graphics to be laid out. I used c(2,2) for this pic.
I made a comment, but it sounds like you may be talking about the add=T option.
a=rnorm(100, 2, 1)
b=rnorm(100, 4, 1)
hist(a, xlim=c(0,10), col="yellow")
hist(b, add=T, col="purple" )
You can play around with transparency options on colors to see where both overlap, for example rgb(1,0,0,1/4) as the color.
With transparent colors:
a=rnorm(100, 2, 1)
b=rnorm(100, 4, 1)
hist(a, xlim=c(0,10), col=rgb(1,1,0,1/4))
hist(b, add=T, col=rgb(1,0,0,1/4) )
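Combining the two ideas above, the par(mfrow) layout and add=TRUE, might look like the sketch below. The layout is set before any plotting; each hist call without add=TRUE then moves to the next panel, and the add=TRUE calls draw into the current one. The data are simulated and the names ctrl_a, ctrl_b, str_a, str_b, old_par and panels are hypothetical stand-ins for the columns in the question.

```r
set.seed(1)
# Simulated stand-ins for the control and stressed size measurements (assumption)
ctrl_a <- rnorm(100, mean = 12, sd = 2)
ctrl_b <- rnorm(100, mean = 15, sd = 2)
str_a  <- rnorm(100, mean = 10, sd = 2)
str_b  <- rnorm(100, mean = 17, sd = 2)

# Set the 1 x 2 layout first; par() returns the previous settings for later restore
old_par <- par(mfrow = c(1, 2))

# Panel 1: overlaid control histograms
hist(ctrl_a, freq = FALSE, main = "Controls", xlab = "Sizes (um)",
     col = rgb(1, 1, 0, 1/4), xlim = c(0, 25), ylim = c(0, 0.5))
hist(ctrl_b, freq = FALSE, col = rgb(0, 0, 1, 1/4), add = TRUE)

# Panel 2: a new hist() without add = TRUE advances to the next panel
hist(str_a, freq = FALSE, main = "Monoculture stressed", xlab = "Sizes (um)",
     col = rgb(1, 1, 0, 1/4), xlim = c(0, 25), ylim = c(0, 0.5))
hist(str_b, freq = FALSE, col = rgb(0, 0, 1, 1/4), add = TRUE)

panels <- par("mfrow")  # layout in effect while plotting
par(old_par)            # restore the single-panel layout
```

A legend call placed right after each pair of hist calls would land in the corresponding panel.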
I am trying to make a grouped bar chart. Right now this is what I have:
Grouped Bar Chart
For background: I have quite a large data set with multiple variables. What I am interested in for this bar chart is visually representing the median inspection distance (cm) that male guppies (yes, fish) will inspect a predator in the presence and absence of females. As you can see, below the two bar charts there is "A" and "B".... I want these to say "Bright" and "Drab"... I cannot seem to get anything to work!!
This is my code right now:
barplot(matrix(c(18.41,7.20,21.40,11.17),nr=2), beside=T,
col=c("aquamarine3","snow3"), ylim=c(0, 25),
names.arg=LETTERS[1:2], xlab = "Colour", ylab = "Inspection Frequency (cm)")
legend("topright", c("Present","Absent"), pch=15, col=c("aquamarine3","snow3"),
bty="n")
Thank you in advance. I know this is a super basic question, but I am fairly new at this!
You can suppress the plotting of the x axis labels and then add labels of your own. To see where barplot draws each of the bars, save it to an object.
myplot <- barplot(matrix(c(18.41,7.20,21.40,11.17),nr=2),
beside=T, xaxt="n",
col=c("aquamarine3","snow3"), ylim=c(0, 25),
names.arg=LETTERS[1:2], xlab = "Colour",
ylab = "Inspection Frequency (cm)")
axis( 1, at=colMeans(myplot), labels=c("Bright","Drab"))
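If custom group labels are all that is needed, a possibly simpler variant is to pass them straight to names.arg instead of suppressing and redrawing the axis. This is a sketch under that assumption; heights, bar_pos and group_centres are illustrative names.

```r
# Medians from the question: columns are colour groups, rows are female presence
heights <- matrix(c(18.41, 7.20, 21.40, 11.17), nrow = 2)

# With beside = TRUE and a matrix, names.arg labels each group of bars
bar_pos <- barplot(heights, beside = TRUE,
                   col = c("aquamarine3", "snow3"), ylim = c(0, 25),
                   names.arg = c("Bright", "Drab"),
                   xlab = "Colour", ylab = "Inspection Frequency (cm)")
legend("topright", c("Present", "Absent"), pch = 15,
       col = c("aquamarine3", "snow3"), bty = "n")

# barplot returns the bar midpoints; their column means are the group centres
group_centres <- colMeans(bar_pos)
```

The axis() approach remains useful when labels need to sit somewhere other than the group centres.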
I want to make a histogram for multiple variables.
I used the following code :
set.seed(2)
dataOne <- runif(10)
dataTwo <- runif(10)
dataThree <- runif(10)
one <- hist(dataOne, plot=FALSE)
two <- hist(dataTwo, plot=FALSE)
three <- hist(dataThree, plot=FALSE)
plot(one, xlab="Beta Values", ylab="Frequency",
labels=TRUE, col="blue", xlim=c(0,1))
plot(two, col='green', add=TRUE)
plot(three, col='red', add=TRUE)
But the problem is that they cover each other, as shown below.
I just want them to be added to each other (showing the bars over each other) i.e. not overlapping/ not covering each other.
How can I do this ?
Try replacing your last three lines with:
# draw the first histogram, then add the other two as points (bin midpoint vs. count)
plot(one, xlab = "Beta Values", ylab = "Frequency", col = "blue")
points(two$mids, two$counts, col = 'green')
points(three$mids, three$counts, col = 'red')
The first time, you need to call plot. If you call plot again, it starts a new plot, which means you lose the first data. Instead, add more data to the existing plot, either as a scatter chart using points or as a line chart using lines.
It's not quite clear what you are looking for here.
One approach is to place the plots in separate plotting spaces:
par("mfcol"=c(3, 1))
hist(dataOne, col="blue")
hist(dataTwo, col="green")
hist(dataThree, col="red")
par("mfcol"=c(1, 1))
Is this what you're after?
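If "added to each other" means stacked bars, another option (a sketch, assuming that reading of the question) is to count all three samples with the same breaks and let barplot stack them. The breaks of width 0.1 and the names breaks, counts, bin_labels and mids are assumptions.

```r
set.seed(2)
dataOne   <- runif(10)
dataTwo   <- runif(10)
dataThree <- runif(10)

# Common bins on [0, 1] so the counts line up column by column (width 0.1 is an assumption)
breaks <- seq(0, 1, by = 0.1)
counts <- rbind(
  one   = hist(dataOne,   breaks = breaks, plot = FALSE)$counts,
  two   = hist(dataTwo,   breaks = breaks, plot = FALSE)$counts,
  three = hist(dataThree, breaks = breaks, plot = FALSE)$counts
)

# Label each bar with its bin interval
bin_labels <- paste0(head(breaks, -1), "-", breaks[-1])

# beside = FALSE (the default) stacks the rows on top of each other in every bin
mids <- barplot(counts, beside = FALSE, space = 0,
                col = c("blue", "green", "red"),
                names.arg = bin_labels, las = 2,
                xlab = "Beta Values", ylab = "Frequency",
                legend.text = TRUE)
```

Each bar's total height is then the combined count for that bin, with the colours showing each sample's share.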
I want to plot a distribution and a single value (with abline) that is much smaller than the minimum value in my distribution, so the abline doesn't appear in the plot. How can I show both in the same plot, by manipulating the x-axis scale or maybe by inserting axis breaks?
data <- rnorm(1000, -3500, 27)
estimate <- -80000
plot(density(data))
abline(v = estimate)
Here's a rough solution; it's not particularly pretty:
library(plotrix)
d <- density(data)
gap.plot(c(-8000,d$x), c(0,d$y), gap=range(c(-7990,-3620)),
gap.axis="x", type="l", xlab="x", ylab="Density",
xtics=c(-8000,seq(-3600,-3300,by=100)))
abline(v=-8000, col="red", lwd=2)
It's not exactly clear what is needed, but this might be progress:
plot(density(data), xlim=range(c(data, estimate+10) ) )
abline(v = estimate, col='red')
The plotrix package has functions for plotting with broken axes.
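Without an extra package, a broken axis can also be imitated by hand in base graphics: draw the estimate at a stand-in position just left of the density and label that tick with the true value. This is only a rough sketch; the offset of 100, the "//" break marker and the names d, fake_pos and ticks are arbitrary assumptions, and the resulting axis is not to scale left of the break.

```r
set.seed(1)
data <- rnorm(1000, -3500, 27)
estimate <- -80000

d <- density(data)

# Stand-in x position for the estimate, a little left of the density (offset is arbitrary)
fake_pos <- min(d$x) - 100

# Draw the density without an x-axis so the axis can be built manually
plot(d, xlim = c(fake_pos - 20, max(d$x)), xaxt = "n",
     main = "Density with out-of-range estimate")

# Regular ticks for the density region
ticks <- pretty(d$x)
axis(1, at = ticks)

# Tick at the stand-in position, labelled with the true estimate
axis(1, at = fake_pos, labels = estimate)

# Marker hinting that the axis is interrupted between the two regions
axis(1, at = fake_pos + 50, labels = "//", tick = FALSE)

abline(v = fake_pos, col = "red", lwd = 2)
```

Because the distance between the estimate and the data is no longer shown to scale, the break should be stated clearly in the caption.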