I'm trying to compare the conditional distributions of some data using barplots. When I look at a different conditional distribution of a contingency table, I would like the variable on the x-axis to change as well, but R does not do so. It keeps the x-axis variable and the plotted variable the same (with the frequency distribution on the y-axis).
Here is my code:
eyecolour<-matrix(c(43, 62, 48, 27,35, 26, 30, 29,27,39,61,33), ncol=4, byrow=T)
colnames(eyecolour)<-c("Blue", "Brown", "Green", "Other")
rownames(eyecolour)<-c("Glasgow", "Sheffield", "London")
barplot(prop.table(eyecolour, 1), legend=T, beside=T)
barplot(prop.table(eyecolour, 2), legend=T, beside=T)
I was expecting one barplot to show cities on the x-axis and the other to show eye colours on the x-axis. I wasn't sure which would be which; I'm just learning.
Can anyone help me to produce that result?
To answer your first question: you can simply use t() so that the plot groups by city rather than by eye colour. You might notice that the two outputs of prop.table() have the same structure (eye colours in the columns). They just contain different numbers depending on the margin you specify. The documentation in ?barplot says:
If height is a matrix and beside is FALSE then each bar of the plot corresponds to a column of height, with the values in the column giving the heights of stacked sub-bars making up the bar. If height is a matrix and beside is TRUE, then the values in each column are juxtaposed rather than stacked.
In other words, barplot draws one bar (or one group of bars) per column of the matrix, which is why you need to transpose your matrix so that the columns are cities instead of eye colours.
Something like:
barplot(t(prop.table(eyecolour, 1)), legend=T, beside=T)
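A minimal sketch of how the margin and the transpose interact, reusing the eyecolour matrix from the question (the names by_city and by_colour are only illustrative, not from the original post). prop.table() never changes the shape of the matrix, so it is t() that decides which variable ends up on the x-axis:

```r
eyecolour <- matrix(c(43, 62, 48, 27, 35, 26, 30, 29, 27, 39, 61, 33),
                    ncol = 4, byrow = TRUE)
colnames(eyecolour) <- c("Blue", "Brown", "Green", "Other")
rownames(eyecolour) <- c("Glasgow", "Sheffield", "London")

# Margin 1: each row (city) sums to 1, i.e. eye colour distribution within a city.
by_city <- prop.table(eyecolour, 1)
# Margin 2: each column (eye colour) sums to 1, i.e. city distribution within a colour.
by_colour <- prop.table(eyecolour, 2)

# Both results are still 3 x 4 with eye colours in the columns.
dim(by_city)
dim(by_colour)

# barplot() draws one group per column, so transpose to group by city.
barplot(t(by_city), beside = TRUE, legend.text = TRUE)  # cities on the x-axis
barplot(by_colour, beside = TRUE, legend.text = TRUE)   # eye colours on the x-axis
```

Pairing margin 1 with the transposed matrix and margin 2 with the untransposed one means the bars within each group sum to 1 in both plots, which is usually what you want when comparing conditional distributions.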
I'm analyzing numeric data with values between 1 and 7. I want to draw boxplots and show the significance across categories. My problem is that adding the significance labels also extends the values on the y-axis. This might imply that the possible data range goes beyond 7, which is not ideal. I tried using ylim(), but that cuts off the significance labels. Is there a way to make the axis values run from 1 to 7 without cutting off the information that should appear beyond this range?
[Images not shown: the current plot, the plot when using ylim(), and the desired outcome.]
As mentioned in the comments, the solution is to set the breaks explicitly:
ggboxplot(...) + scale_y_continuous(breaks = seq(0, 7, by = 1))
I use Julia with Plots to generate my plots.
I want to plot data (A, B), and I know that all the interesting data lies in two regions of A. The two regions should be plotted next to each other in one plot.
My A data is evenly spaced, so I cut out the interesting pieces and glued them together into one object.
My problem is that I don't know how to manipulate the scale on the x-axis.
When I just plot the B data against its array index, I basically get the shape I want. I just need the numbers from A on the x-axis.
Here is a toy example:
using Plots
N=5000
B=rand(N)
A=(1:1:N)
xl_1=100
xu_1=160
xl_2=600
xu_2=650
A_new=vcat(A[xl_1:xu_1],A[xl_2:xu_2])
B_new=vcat(B[xl_1:xu_1],B[xl_2:xu_2])
plot(A_new,B_new) # This keeps the gap between the two regions
plot(B_new) # This gives basically the right spacing, but
# without the right x-axis labels
I did not find any way to use two successive xlims, which is why I tried it this way.
You can't pass two successive xlims, because you can't have a break in the axis. That is by design in Plots.
So your options are: 1) use two subplots, one for each part of the data, or 2) plot against the index and just change the axis labels.
The second approach would use something like xticks = ([1, 50, 100, 150], ["1", "50", "600", "650"]), but I'd recommend the first, as it is, strictly speaking, a more correct way of displaying the data:
plot(
    plot(A[xl_1:xu_1], B[xl_1:xu_1], legend = false),
    plot(A[xl_2:xu_2], B[xl_2:xu_2], yshowaxis = false),
    link = :y
)
I have molecular sequencing data giving the relative abundance (in %) of the various phyla in 9 different samples, and I am trying to plot it as a colour-coded bar chart (where each phylum corresponds to a different colour). This is simple enough in Excel, but as a complete newbie to R I am struggling quite a bit. My data is in an Excel file (formatted as tabs), where the first line holds the labels (e.g. sample name). When I plot it, the bar labels are misplaced and do not match, and R plots the first line of my Excel file (the names) as a separate value (pictures attached). What I have so far is:
attach(data)
data.1<-as.matrix(data)
par(mfrow=c(1,1))
barplot(data.1, col=c("aquamarine3","azure2","blue2","brown3","cadetblue3","deepskyblue3","firebrick3","gold3","darkorange3","darkorchid3","darkseagreen","darkslateblue","darkviolet","deeppink4"), main=".", xlab="Unit/Treatment", ylab="% Relative abundance")
detach(data)
legend("topright", inset=c(-0.2,0),
legend = c("Unassigned", "Acidobacteria","Actinobacteria","Bacteroidetes","Chlorobi","Chloroflexi","Firmicutes","Gemmatimonadetes","Planctomycetes","Proteobacteria","Verrucromicrobia","Euryarchaeota","Crenarchaeota","Parvarchaeota"),
fill = c("aquamarine3","azure2","blue2","brown3","cadetblue3","deepskyblue3","firebrick3","gold3","darkorange3","darkorchid3","darkseagreen","darkslateblue","darkviolet","deeppink4"))
par(mar=c(5.1, 4.1, 4.1, 8.1), xpd=TRUE)
layout(mat, widths = rep.int(1, ncol(mat)),
heights = rep.int(1, nrow(mat)), respect = FALSE)
As a result, I get this:
My bar chart attempt: R plots my sample names as x_1 and thus shifts the other labels. Also, my legend covers most of the bar chart and I cannot seem to adjust it.
Thanks very much in advance. Any help with getting the bar chart to look decent would be highly appreciated.
I am trying to plot several histograms for the same data set, but with different numbers of bins. I am using Gadfly.
Suppose x is just an array of real values, plotting each histogram works:
plot(x=x, Geom.histogram(bincount=10))
plot(x=x, Geom.histogram(bincount=20))
But I'm trying to put all the histograms together. I've added the number of bins as another dimension to my data set:
x2 = vcat(hcat(10*ones(length(x)), x), hcat(20*ones(length(x)), x))
df = DataFrame(Bins=x2[:,1], X=x2[:,2])
Is there any way to send the number of bins (the value from the first column) to Geom.histogram when using Geom.subplot_grid? Something like this:
plot(df, x="X", ygroup="Bins", Geom.subplot_grid(Geom.histogram(?)))
I think you would be better off not using subplot_grid at that point, and instead just combining the plots with vstack or hstack. From the docs:
Plots can also be stacked horizontally with ``hstack`` or vertically with
``vstack``. This allows more customization in regards to tick marks, axis
labeling, and other plot details than is available with ``subplot_grid``.
I ran into some problems while trying to plot a histogram that shows the frequency of every value and also labels each bar with the value itself. For example, suppose I use the following code:
x <- sample(1:10,1000,replace=T)
hist(x,label=TRUE)
The result is a plot with labels over the bars, but with the frequencies of 1 and 2 merged into a single bar.
Apart from separating this bar into two bars for 1 and 2, I also need to put the values under each bar.
For example, with the code above the number 10 appears under the tick at the right edge of its bar, whereas I need the values plotted right under the bars.
Is there any way to do both in a single histogram with the hist function?
Thanks in advance!
Calling hist silently returns information you can use to modify the plot. You can pull out the midpoints and the heights and use them to put the labels where you want them. You can use the pos argument of text to specify where the label should sit in relation to the point (thanks @rawr). Setting breaks = 0:10 gives each integer its own bin, because hist uses right-closed intervals by default, so 1 and 2 no longer share a bar.
x <- sample(1:10,1000,replace=T)
## Histogram
info <- hist(x, breaks = 0:10)
with(info, text(mids, counts, labels=counts, pos=1))
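The snippet above places the counts. For the other half of the question (the values 1 to 10 directly under their bars), one possible approach, which is not part of the original answer, is to suppress the default x-axis and draw a new one at the bar midpoints. A sketch, with the seed and the empty title added only as assumptions for a reproducible, uncluttered example:

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible
x <- sample(1:10, 1000, replace = TRUE)

# One bin per integer; xaxt = "n" suppresses the default x-axis.
info <- hist(x, breaks = 0:10, xaxt = "n", main = "", xlab = "x")

# Counts just below the top of each bar (pos = 1 means below the point).
text(info$mids, info$counts, labels = info$counts, pos = 1)

# Custom x-axis: the values 1..10 centred under their bars.
axis(side = 1, at = info$mids, labels = 1:10)
```

Because the bins are (0,1], (1,2], and so on, the midpoints 0.5, 1.5, ... line up with the values 1, 2, ... in order, which is why labels = 1:10 can be paired with info$mids directly.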