I am getting an error using geom_map that I do not get while using geom_polygon when trying to make a choropleth map.
I am following the ggplot2 documentation as closely as possible.
I have a data.frame of positions plus an id column (lease_number) to reference later:
dfmap<- read.table(text="id long lat order hole piece group lease_number
1 -90.38103 28.78907 1 FALSE 1 1.1 A0016
1 -90.38065 28.82965 2 FALSE 1 1.1 A0016
1 -90.33457 28.82930 3 FALSE 1 1.1 A0016
1 -90.33497 28.78872 4 FALSE 1 1.1 A0016
1 -90.38103 28.78907 5 FALSE 1 1.1 A0016", header=T)
And a data.frame of values with the corresponding id column (just one value here):
df <- data.frame(lease_number="A0016", var1=10)
Following the documentation's structure exactly:
ggplot(df, aes(fill=var1)) +
  geom_map(aes(map_id=lease_number), map=dfmap) +
  expand_limits(dfmap)
Gives the following error:
Error in unit(x, default.units) : 'x' and 'units' must have length > 0
By merging the data like so I can produce a correct plot,
ggplot() +
geom_polygon(data=merge(dfmap, df, by='lease_number', all=T),
aes(x=long, y=lat, group=lease_number, fill=var1))
but I want to avoid this because I will need to reference a lot of different things and it will be much better to be able to reference between two data.frames with the lease_number column.
I have seen this question but that answer does not apply here since I cannot even get a base map to show up with geom_map if I remove the fill= argument. Does anyone know how to take on this error?
Thanks.
I'm quite new to RStudio, so I apologise in advance. I need help creating a grouped barplot. I have three variables:
"Time": converted to a continuous variable
"Treatment": "Con", "Hya"
"Trial": "T1", "T2", "T3"
I want to produce something like this:
There should be three groups of columns side by side, one group per trial. Time on the Y-axis; Trial (1, 2 & 3) on the X-axis; Treatment shown as coloured columns (Hya=grey, Con=white), with a legend explaining the Treatment colours.
Here is the structure of my data:
'data.frame': 102 obs. of 3 variables:
$ Trial : int 1 1 1 1 1 1 1 1 1 1 ...
$ Treatment: Factor w/ 2 levels "Control","Hyaluronan": 1 1 1 1 1 1 1 1 1 1 ...
$ Time : num 11 7 7.68 7.7 7 3 5 5.48 4 6 ...
I get this error message:
> barplot(table(Biopsy$Time, Biopsy$Treatment, Biopsy$Trial))
Error in barplot.default(table(Biopsy$Time, Biopsy$Treatment, Biopsy$Trial)) :
'height' must be a vector or a matrix
Please if anyone is able to help I would so appreciate it, I've been trying for so long :(
I think it is useful to mention the "ggplot2" package here. With this package, creating a grouped bar plot is quite easy. I was not sure about the data frame you are working with, since you only provide a snapshot of your data structure, but I hope the example data frame I created will show you the basic function used to create such a plot. (You can copy-paste the code into RStudio and run it. Make sure to install the ggplot2 package [install.packages("ggplot2")] before running the library() function.)
trial <- c(1,1,1,1,2,2,2,2,3,3,3,3)
treatment <- c("Hya","Hya","Con","Con","Hya","Hya","Con","Con","Hya","Hya","Con","Con")
time <- c(1,7,1,7,2,8,2,8,3,9,3,9)
df <- data.frame(trial,treatment,time)
library(ggplot2)
ggplot(df, aes(y = time,
               x = trial,
               group = treatment)) +
  geom_bar(stat = "identity", position = "dodge", aes(fill = treatment))
The resulting plot can be found here.
The above code will create a data frame and a bar plot. The grouping is done with the argument "group", and the colour is set with "fill". Of course, you can modify the colouring etc. Since you are new to RStudio/R, I recommend you check out the documentation of ggplot2.
I hope this example helps...
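To get closer to the layout described in the question (Trial on the x-axis, one bar per Treatment, Hyaluronan in grey and Control in white), something like the following might work. This is only a sketch: the Biopsy data frame here is randomly generated stand-in data, and plotting the mean Time per Trial/Treatment combination is an assumption about what the bars should show.

```r
library(ggplot2)

# Hypothetical stand-in for the Biopsy data frame from the question
set.seed(1)
Biopsy <- data.frame(
  Trial     = rep(1:3, each = 8),
  Treatment = factor(rep(c("Control", "Hyaluronan"), times = 12)),
  Time      = round(runif(24, min = 3, max = 11), 2)
)

# Assumption: each bar shows the mean Time for one Trial/Treatment combination
means <- aggregate(Time ~ Trial + Treatment, data = Biopsy, FUN = mean)

p <- ggplot(means, aes(x = factor(Trial), y = Time, fill = Treatment)) +
  geom_col(position = "dodge", colour = "black") +   # bars side by side
  scale_fill_manual(values = c(Control = "white", Hyaluronan = "grey")) +
  labs(x = "Trial", y = "Mean time")
print(p)
```

The original barplot() call fails because table() with three variables returns a three-dimensional array, while barplot() expects a vector or a matrix.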
I'm trying to extract information about the limits and transform of an existing ggplot object. I'm getting close, but need some help. Here's my code
data = data.frame(x=c(1,10,100),y=(c(1,10,100)))
p = ggplot(data=data,aes(x=x,y=y)) + geom_point()
p = p + scale_y_log10()
q = ggplot_build(p)
r = q$panel$y_scales
trans.y = (q$panel$y_scales)[[1]]$trans$name
range.y = (q$panel$y_scales)[[1]]$range
print(trans.y) gives me exactly what I want
[1] "log-10"
But range.y is a funky S4 object (see below).
> print(range.y)
Reference class object of class "Continuous"
Field "range":
[1] 0 2
> unclass(range.y)
<S4 Type Object>
attr(,".xData")
<environment: 0x11c9a0630>
I don't really understand S4 objects or how to query their attributes and methods. Or, if I'm just going down the wrong rabbit hole here, a better solution would be great :) In Matlab, I could just use the commands "get(gca,'YScale')" and "get(gca,'YLim')", so I wonder if I'm making this harder than it needs to be.
As @MikeWise points out in the comments, this all becomes a lot easier if you update ggplot2 to v2.0. It now uses ggproto objects instead of proto, and these are more convenient to get info from.
It's easy to find now what you need. Just printing ggplot_build(p) gives you a nice list of all that's there.
ggplot_build(p)$panel$y_scales[[1]]$range here gives you a ggproto object. You can see that contains several parts, one of which is range (again), which contains the data range. All the way down, you end up with:
ggplot_build(p)$panel$y_scales[[1]]$range$range
# [1] 0 2
Where 0 is 10^0 = 1 and 2 is 10^2 = 100.
Another way might be to just look it up in $data part like this:
apply(ggplot_build(p)$data[[1]][1:2], 2, range)
#      y   x
# [1,] 0   1
# [2,] 2 100
You can also get the actual range of the plotting window with:
ggplot_build(p)$panel$ranges[[1]]$y.range
# [1] -0.1 2.1
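The `$panel` paths above are tied to the ggplot2 version this answer was written for, and the internal layout of the built object has changed in later releases. A less version-dependent route may be layer_scales(), which returns the trained x and y scales. This is a sketch that assumes a ggplot2 release that provides layer_scales() (2.2.0 or later, as far as I know):

```r
library(ggplot2)

data <- data.frame(x = c(1, 10, 100), y = c(1, 10, 100))
p <- ggplot(data, aes(x = x, y = y)) + geom_point() + scale_y_log10()

# layer_scales() builds the plot and returns the scales of the first layer/panel
y_scale <- layer_scales(p)$y

# Data range and limits, both on the transformed (log10) scale
range.y  <- y_scale$range$range
limits.y <- y_scale$get_limits()
print(range.y)
```

As before, the values are on the log10 scale, so 10^range.y converts them back to data units.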
I have a list object as shown below ->
> myaggregate
input$AgeAndGender input$CTR
1 Female_<18 0.030041698
2 Female_18-24 0.010918938
3 Female_25-34 0.009839806
4 Female_35-44 0.010193773
5 Female_45-54 0.009996056
6 Female_55-64 0.020024678
7 Female_65+ 0.030060728
8 Male_<18 0.028356698
9 Male_18-24 0.011031902
10 Male_25-34 0.010218562
11 Male_35-44 0.010168911
12 Male_45-54 0.010021256
13 Male_55-64 0.020191223
14 Male_65+ 0.029717747
I'm trying to plot a bar graph representing the CTR levels (Y axis) for each value in AgeAndGender (X axis).
When I attempt a simple plot however I run into the following issue ->
> ggplot(data= myaggregate,aes(x=input$AgeAndGender,y=input$CTR))+geom_bar()
Error in data.frame(x = c("Male_35-44", "Female_65+", "Male_25-34", "Female_45-54", :
arguments imply differing number of rows: 3378934, 14
I'm sure I'm missing something pretty basic. Any help is appreciated!
If you are just wanting to plot the values, then you need stat="identity" like in the following example:
library(ggplot2)
AgeAndGender <- c("f1","f2","f3")
CTR <- c(.1,.15,.12)
myaggregate <- data.frame(AgeAndGender, CTR)
ggplot(data= myaggregate,aes(x=AgeAndGender, y=CTR)) + geom_bar(stat = "identity")
Which results in the following:
Looking at your comment about your data being in a list concerns me. Try making myaggregate a dataframe.
I was able to plot with something like what you are using, but it's a rather weird construction. Dataframes do not generally have dollar signs in their column names, because $ is an infix function in R. I read in the data with read.table, and the dollar signs got converted to periods. I put back the column names as you have them with:
names(myaggregate) <- c('input$AgeAndGender', 'input$CTR')
And then you can get a rather messy barplot with:
ggplot(data= myaggregate,aes(x=`input$AgeAndGender`,y=`input$CTR`))+ geom_bar(stat = "identity")
When you just put your code in, the unquoted names get interpreted as x being the "AgeAndGender" column in the input dataframe. If you use ordinary quotes rather than backticks, it does not work.
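An alternative that avoids backticks altogether is to rename the columns to ordinary syntactic names before plotting. A minimal sketch, using three rows from the question as stand-in data; the name `clean` is made up for illustration:

```r
library(ggplot2)

# Small stand-in for myaggregate, with the awkward column names from the question
myaggregate <- data.frame(c("Female_<18", "Female_18-24", "Male_<18"),
                          c(0.030041698, 0.010918938, 0.028356698))
names(myaggregate) <- c("input$AgeAndGender", "input$CTR")

# Rename to syntactic names so aes() can refer to the columns directly
clean <- setNames(myaggregate, c("AgeAndGender", "CTR"))

p <- ggplot(clean, aes(x = AgeAndGender, y = CTR)) +
  geom_bar(stat = "identity")
print(p)
```

With plain names, aes() looks the columns up in the data frame passed to ggplot() rather than in a separate object called input.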
My data is in .csv format, as shown below. I am new to ggplot2 and have not been able to plot it:
T L
141.5453333 1
148.7116667 1
154.7373333 1
228.2396667 1
148.4423333 1
131.3893333 1
139.2673333 1
140.5556667 2
143.719 2
214.3326667 2
134.4513333 3
169.309 8
161.1313333 4
I tried to plot a line graph using the following code:
data<-read.csv("sample.csv",head=TRUE,sep=",")
ggplot(data,aes(T,L))+geom_line()
but the resulting image is not what I want.
I want an image like the following:
Can anybody help me?
You want to use a variable for the x-axis that has lots of duplicated values and expect the software to guess that the order you want those points plotted is given by the order they appear in the data set. This also means the values of the variable for the x-axis no longer correspond to the actual coordinates in the coordinate system you're plotting in, i.e., you want to map a value of "L=1" to different locations on the x-axis depending on where it appears in your data.
This kind of fairly nonsensical mapping does not work in ggplot2 out of the box. You have to define a separate variable that has a proper mapping to values on the x-axis ("id" in the code below) and then overwrite the labels with the values for "L".
The code below shows you how to do this, but it seems like a different graphical display would probably be better suited for this kind of data.
data <- as.data.frame(matrix(scan(text="
141.5453333 1
148.7116667 1
154.7373333 1
228.2396667 1
148.4423333 1
131.3893333 1
139.2673333 1
140.5556667 2
143.719 2
214.3326667 2
134.4513333 3
169.309 8
161.1313333 4
"), ncol=2, byrow=TRUE))
names(data) <- c("T", "L")
data$id <- 1:nrow(data)
ggplot(data,aes(x=id, y=T))+geom_line() + xlab("L") +
scale_x_continuous(breaks=data$id, labels=data$L)
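As one possible example of a different display, the values of T could be plotted as points grouped by L, which keeps L as a proper coordinate. This is only a sketch of one option, not necessarily what the asker is after:

```r
library(ggplot2)

# Same values as in the question
data <- data.frame(
  T = c(141.5453333, 148.7116667, 154.7373333, 228.2396667, 148.4423333,
        131.3893333, 139.2673333, 140.5556667, 143.719, 214.3326667,
        134.4513333, 169.309, 161.1313333),
  L = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 8, 4)
)

# Treat L as a category so each distinct value gets one position on the x-axis
p <- ggplot(data, aes(x = factor(L), y = T)) +
  geom_point() +
  labs(x = "L", y = "T")
print(p)
```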
You have an error in your code, try this:
ggplot(data,aes(x=L, y=T))+geom_line()
Default arguments for aes are:
aes(x, y, ...)
Good day,
I have a data frame such as:
sample.df = data.frame(a=c(-1,1,0,-1),b=c(2,NA,1,2),c=c(0,0,1,2),d=c(-1,-2,0,0))
and I would like to produce a stacked barplot for each column showing the number of times each unique value occurs in that column (ignoring NAs).
My first thought is to create a new data frame with the possible values as the rownames (depicted here as Score) and the data values being the count of times that value occurs for that column:
Score a b c d
2 0 2 1 0
1 1 1 1 0
0 1 0 2 2
-1 2 0 0 1
-2 0 0 0 1
I've tried using ddply, table and aggregate from other examples but don't see a way to get to that structure.
My thought is that once I have that it should be straightforward to hand it to barplot to get the stacked barplot showing the number of occurrences of each value in each column.
I appreciate any guidance you can give me.
Thank you,
Dave
To close the loop on this, here is where I ended up.
I used melt as suggested by tcash21 ("Score" wasn't in the original data just attributes a, b, c and d). Also, ggplot() gave me an error about using "y" and stat="bin" so I removed the assignment of "y" in the ggplot() call. Here are the steps I ended up taking to get to the desired plot:
library(reshape2)  # provides melt()
library(ggplot2)
sample.df = data.frame(a=c(-1,1,0,-1),b=c(2,NA,1,2),c=c(0,0,1,2),d=c(-1,-2,0,0))
new.df<-melt(sample.df)
new.df<-new.df[complete.cases(new.df),]
new.df$Score<-as.factor(new.df$value)
ggplot(new.df, aes(x=variable, fill=Score)) + geom_bar()
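For completeness, the count table described in the question can also be built in base R and handed straight to barplot(). This is a sketch of one way to do it; the names `scores` and `counts` are made up for illustration:

```r
sample.df <- data.frame(a = c(-1, 1, 0, -1), b = c(2, NA, 1, 2),
                        c = c(0, 0, 1, 2), d = c(-1, -2, 0, 0))

# All score values to tabulate, in the order used in the question
scores <- 2:-2

# Count how often each score occurs in each column; table() drops NAs by default
counts <- vapply(sample.df,
                 function(col) as.integer(table(factor(col, levels = scores))),
                 integer(length(scores)))
rownames(counts) <- scores
print(counts)

# One stacked bar per column, one segment per score
barplot(counts, legend.text = rownames(counts))
```

Fixing the factor levels up front means every column gets a row for every score, including scores that never occur in that column.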
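You need to use melt to get the data into the proper long shape for ggplot. This assumes sample.df already holds the count table with a Score column; because y is mapped, geom_bar needs stat="identity":
new.df<-melt(sample.df, id.vars="Score")
ggplot(new.df, aes(x=Score, y=value, fill=variable)) + geom_bar(stat="identity")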