I have a matrix (C) of output from a Bayesian model. It has 3000 rows, and each column is a bird breeding behavior (singing, incubating, fledglings, etc.). Each entry is the week number (1-13) in which that behavior is most likely to occur. I have visualized kernel density estimates of the most likely week for each behavior using this code:
G <- mcmc_dens(C, pars = c("Singing", "Building", "Incubating", "Nestlings", "Empty Nest", "Fledglings Observed", "Fledgling/Adult Interactions", "Fledgling Foraging"))
G <- G + theme(axis.title = element_text(face="plain",size=12)) + labs(x ="Week") + scale_x_continuous(breaks = 1:13)
...which produces these figures:
I would like to stack the figures above one another so that I have one figure with the same x-axis where you can easily see which behaviors peak at the same time, but I don't know how to do this with mcmc_dens (i.e. I want the graph for singing to be above building, both singing and building to be above incubating, and so on so that I have eight vertically aligned graphs).
Data sample from matrix C (does not include all columns):
Singing Building Incubating Nestlings Empty Nest
[1,] 8 8 8 8 13
[2,] 8 8 8 11 4
[3,] 9 8 8 12 13
[4,] 5 4 8 11 13
[5,] 9 8 8 8 13
[6,] 9 8 8 8 13
[7,] 5 8 8 11 13
[8,] 9 8 10 11 12
[9,] 9 4 8 10 8
[10,] 5 7 12 10 8
Figured it out! mcmc_dens has a facet_args argument, which controls how the panel for each parameter is laid out as a facet (it took me so long because I was unfamiliar with facets). Modifying the first line of my original code gave me the figure I was looking for:
pars <- c("Singing", "Building", "Incubating", "Nestlings",
"Empty", "Fledglings", "Interactions", "Foraging")
G <- mcmc_dens(C, pars=pars, facet_args=list(ncol=1, strip.position="left"))
This is what the images look like now:
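For readers without the original matrix, here is a minimal sketch of the same idea on simulated data. The matrix below is made up for illustration (the peak weeks and the three column names are assumptions, not the real model output); only the facet_args usage mirrors the code above.

```r
library(bayesplot)
library(ggplot2)

set.seed(1)
n <- 3000
# Hypothetical stand-in for C: one column per behavior, values are weeks 1-13
sim_week <- function(peak) pmin(pmax(round(rnorm(n, mean = peak, sd = 1.5)), 1), 13)
C_sim <- cbind(Singing = sim_week(5), Building = sim_week(7), Incubating = sim_week(9))

# ncol = 1 stacks the panels vertically; strip.position = "left" moves the labels to the side
G <- mcmc_dens(C_sim, pars = colnames(C_sim),
               facet_args = list(ncol = 1, strip.position = "left")) +
  labs(x = "Week") +
  scale_x_continuous(breaks = 1:13)
G
```

Because every panel sits in a single column, the panels are aligned vertically, which makes it easier to compare where the peaks fall.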
Given these quadratic models for the odor dataset from the faraway package:
> lmod <- lm(odor ~ I(temp) + I(gas) + I(pack)+I(temp^2)+I(gas^2)+I(pack^2)+I(temp*gas)+I(gas*pack)+I(pack*temp),odor)
> lmod6 <- lm(odor ~ polym(temp,gas,pack,degree = 2),odor)
Both models have the same fitted values:
> lmod$fitted
1 2 3 4 5 6 7 8 9 10 11 12
86.62500 45.87500 36.12500 28.37500 42.50000 15.25000 -3.25000 -24.50000 59.87500 29.37500 20.62500 -16.87500
13 14 15
-30.66667 -30.66667 -30.66667
> lmod6$fitted
1 2 3 4 5 6 7 8 9 10 11 12
86.62500 45.87500 36.12500 28.37500 42.50000 15.25000 -3.25000 -24.50000 59.87500 29.37500 20.62500 -16.87500
13 14 15
-30.66667 -30.66667 -30.66667
However, when I compare these fitted values to each other, they are not identical. Why is that?
> table(lmod6$fitted==lmod$fitted)
FALSE TRUE
13 2
This is a classic example of the floating point trap, as described in chapter 1 of The R Inferno by Patrick Burns and in the R FAQ.
The differences are zero up to floating point accuracy:
options(digits=20)
lmod$fitted - lmod6$fitted
1 2
0.0000000000000000000e+00 0.0000000000000000000e+00
3 4
7.1054273576010018587e-15 1.0658141036401502788e-14
5 6
-7.1054273576010018587e-15 -3.5527136788005009294e-15
7 8
4.4408920985006261617e-16 -3.5527136788005009294e-15
9 10
-1.4210854715202003717e-14 -1.4210854715202003717e-14
11 12
7.1054273576010018587e-15 3.5527136788005009294e-15
13 14
7.1054273576010018587e-15 7.1054273576010018587e-15
15
7.1054273576010018587e-15
The all.equal() function is designed for testing whether two vectors are "almost equal" (up to floating point accuracy):
all.equal(lmod$fitted,lmod6$fitted)
[1] TRUE
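The same effect can be reproduced without any model at all. The sketch below uses two made-up numbers (not the odor data) to show why == fails and how all.equal() can be used safely in a condition.

```r
a <- 0.1 + 0.2
b <- 0.3

a == b                    # FALSE: the two doubles differ in the last bits
a - b                     # a tiny non-zero difference

# all.equal() returns TRUE or a character description of the difference,
# so wrap it in isTRUE() before using it in if()
same <- isTRUE(all.equal(a, b))

# An explicit tolerance check is an alternative; 1e-8 is an arbitrary choice here
close_enough <- abs(a - b) < 1e-8
```

For element-wise comparison of two vectors, something like abs(x - y) < tol gives one logical value per element, whereas all.equal() summarizes the whole vector.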
I want to use R to extract values from a raster. My raster has values from 0 to 6, and I want to extract the value of every single pixel, so that I end up with a data table containing the pixel and its value.
Thank you for your help. I hope my explanation is precise enough.
Example data
library(raster)
r <- raster(ncol=5, nrow=5, vals=1:25)
To get all values, you can do
values(r)
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#as.matrix(r)
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1 2 3 4 5
#[2,] 6 7 8 9 10
#[3,] 11 12 13 14 15
#[4,] 16 17 18 19 20
#[5,] 21 22 23 24 25
Also see ?getValues
You can also use indexing
r[2,2]
#7
r[7:8]
#[1] 7 8
For more complex extractions using points, lines or polygons, see ?extract
x is the raster object you are trying to extract values from; y may be a SpatialPoints, SpatialPolygons, SpatialLines or Extent object, or a vector of cell numbers (take a look at ?extract). Your code values_raster <- extract(x = values, df=TRUE) will not work because you are not passing the function any y object or vector.
You could build a vector with all the cell numbers of your raster. Suppose your raster has 200 cells. If you run values_raster <- extract(x = values, y=seq(1,200,1), df=TRUE), you'll get a data frame with a value for each cell.
How about simply doing
as.data.frame(s, xy=TRUE) # s is your raster file
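Putting the pieces above together, here is a minimal sketch that builds a table with one row per pixel by hand. It reuses the 5 x 5 example raster from the earlier answer; the column names cell and value are my own choice.

```r
library(raster)

r <- raster(ncol = 5, nrow = 5, vals = 1:25)

cells <- 1:ncell(r)                  # cell numbers run row by row from the top left
pix <- data.frame(cell  = cells,
                  xyFromCell(r, cells),   # adds x and y columns (cell centre coordinates)
                  value = values(r))
head(pix)
```

This should contain the same information as as.data.frame(r, xy=TRUE), with an explicit cell number added.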
I have this function and it takes a few parameters.
I have this part of the function here:
sort.order <- order(inputs[,input.of.interest])
If I print inputs I get something like:
Status Quo Vaccination
[1,] 10.409146 16.252537
[2,] 5.834875 9.373437
[3,] 5.784903 15.935623
[4,] 12.208484 18.654250
[5,] 9.786787 16.467321
[6,] 6.560276 9.689887
But what is input.of.interest supposed to be?
What does it mean, how is this function used?
Should it be a number? For example, if it's 2, what would it do?
It chooses the column to sort by. If it's 1 it sorts by Status Quo and if it's 2 it sorts by Vaccination.
x <- seq(20, 11, -1)
x
# [1] 20 19 18 17 16 15 14 13 12 11
order(x)
# [1] 10 9 8 7 6 5 4 3 2 1
x[order(x)]
# [1] 11 12 13 14 15 16 17 18 19 20
Hope you see better how it works.
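Applied to the matrix from the question, the same idea looks like this. This is a sketch: the matrix is retyped from the printed sample, and input.of.interest is set to 2 purely as an example.

```r
inputs <- cbind(
  "Status Quo"  = c(10.409146, 5.834875, 5.784903, 12.208484, 9.786787, 6.560276),
  "Vaccination" = c(16.252537, 9.373437, 15.935623, 18.654250, 16.467321, 9.689887)
)

input.of.interest <- 2                        # 2 selects the Vaccination column
sort.order <- order(inputs[, input.of.interest])
sort.order
# [1] 2 6 3 1 5 4

# Using the result as a row index reorders every column together
sorted.inputs <- inputs[sort.order, ]
```

A column name works as well: order(inputs[, "Vaccination"]) gives the same result.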
Column data$form contains 170 unique values (numbers from 1 to ~800).
I would like to merge values that are close together (e.g. into bins 10 wide).
I need to do this in order to use:
colors = rainbow(length(unique(data$form)))
In a plot and provide a better visual result.
Thank you in advance for your help.
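You can use %/% to group the values, mean to combine each group, and a normalize function to rescale the results (normalize is not part of base R, so it has to come from a package or be defined by you).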
# if you want specifically 20 groups:
groups <- sort(form) %/% (800/20)
x <- c(by(sort(form), groups, mean))
x <- normalize(x, TRUE) * 19 + 1
0 1 2 3 4
1.000000 1.971781 2.957476 4.103704 4.948560
5 6 7 8 9
5.950617 7.175309 7.996914 8.953086 9.952263
10 11 12 13 14
10.800705 11.901235 12.888889 13.772291 14.888889
15 16 17 18 19
15.927984 16.864198 17.918519 18.860082 20.000000
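Since normalize() is not a base R function, here is a base-only sketch of the same steps. It assumes that normalize(x, TRUE) performs min-max scaling to [0, 1]; rescale01 is a hypothetical helper defined here, and form is simulated.

```r
set.seed(42)
form <- runif(170, min = 1, max = 800)        # stand-in for data$form

# Integer division assigns each value to a bin 40 wide (800 / 20)
groups <- sort(form) %/% (800 / 20)

# Mean of the values in each bin
x <- tapply(sort(form), groups, mean)

# Assumed behaviour of normalize(): min-max scaling to [0, 1]
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))
x <- rescale01(x) * 19 + 1                    # now runs from 1 to 20
```

If a bin happens to be empty, tapply simply returns fewer than 20 groups.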
You could also use cut. If you use the argument labels=FALSE, you get an integer value:
form <- runif(170, min=1,max=800)
> cut(form, breaks=20)
[1] (518,558] (280,320] (240,280] (121,160] (757,797]
[6] (160,200] (320,359] (598,638] (80.8,121] (359,399]
[7] (121,160] (200,240] ...
20 Levels: (1.18,41] (41,80.8] (80.8,121] (121,160] (160,200] (200,240] (240,280] (280,320] (320,359] (359,399] (399,439] ... (757,797]
> cut(form, breaks=20, labels=FALSE)
[1] 14 8 7 4 20 5 9 16 3 10 4 6 5 18 18 6 2 12
[19] 2 19 13 11 13 11 14 12 17 5 ...
On a side note, I would encourage you to reconsider plotting with rainbow colours, as they distort the reading of the data; cf. Rainbow Color Map (Still) Considered Harmful.
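To connect the binning back to the plot, the integer codes from cut can index directly into a palette. The sketch below uses simulated data and hcl.colors() (available in R 3.6 and later) as one possible non-rainbow palette; the choice of palette is mine, not part of the question.

```r
set.seed(1)
form <- runif(170, min = 1, max = 800)        # stand-in for data$form

# Integer bin code (1..20) for every value
bin <- cut(form, breaks = 20, labels = FALSE)

# One colour per bin, then one colour per observation
pal  <- hcl.colors(20)
cols <- pal[bin]

plot(form, col = cols, pch = 19)
```

This way only 20 colours are needed instead of one per unique value.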
I'm trying to create a stacked bar graph using ggplot2. My data, in its wide form, looks like this. The number in each cell is the frequency of responses.
activity yes no dontknow
Social events 27 3 3
Academic skills workshops 23 5 8
Summer research 22 7 7
Research fellowship 20 6 9
Travel grants 18 8 7
Resume preparation 17 4 12
RAs 14 11 8
Faculty preparation 13 8 11
Job interview skills 11 9 12
Preparation of manuscripts 10 8 14
Courses in other campuses 5 11 15
Teaching fellowships 4 14 16
TAs 3 15 15
Access to labs in other campuses 3 11 18
Interdisciplinary research 2 11 18
Interdepartamental projects 1 12 19
I melted this table using reshape2:
melted.data <- melt(wide.data,id.vars=c("activity"),measure.vars=c("yes","no","dontknow"),variable.name="haveused",value.name="responses")
That's as far as I can get. I want to create a stacked bar chart with activities on the x axis, frequency of responses on the y axis, and each bar showing the distribution of the yes, no and dontknow responses.
I've tried
ggplot(melted.data,aes(x=activity,y=responses))+geom_bar(aes(fill=haveused))
but I'm afraid that's not the right solution
Any help is much appreciated.
You haven't said what is wrong with your solution. Some things that could be seen as problems, with one possible solution for each, are:
The x axis tick mark labels run into each other. SOLUTION - rotate the tick mark labels;
The order in which the labels (and their corresponding bars) appear is not the same as the order in the original data frame. SOLUTION - reorder the levels of the factor 'activity';
The counts are not shown inside the bars. SOLUTION - add text labels and set the vjust parameter in position_stack to 0.5.
The following might be a start.
# Load required packages
library(ggplot2)
library(reshape2)
# Read in data
df = read.table(text = "
activity yes no dontknow
Social.events 27 3 3
Academic.skills.workshops 23 5 8
Summer.research 22 7 7
Research.fellowship 20 6 9
Travel.grants 18 8 7
Resume.preparation 17 4 12
RAs 14 11 8
Faculty.preparation 13 8 11
Job.interview.skills 11 9 12
Preparation.of.manuscripts 10 8 14
Courses.in.other.campuses 5 11 15
Teaching.fellowships 4 14 16
TAs 3 15 15
Access.to.labs.in.other.campuses 3 11 18
Interdisciplinary.research 2 11 18
Interdepartamental.projects 1 12 19", header = TRUE, sep = "")
# Melt the data frame
dfm = melt(df, id.vars=c("activity"), measure.vars=c("yes","no","dontknow"),
variable.name="haveused", value.name="responses")
# Reorder the levels of activity
dfm$activity = factor(dfm$activity, levels = df$activity)
# Draw the plot
ggplot(dfm, aes(x = activity, y = responses, group = haveused)) +
geom_col(aes(fill=haveused)) +
theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.25)) +
geom_text(aes(label = responses), position = position_stack(vjust = .5), size = 3) # labels inside the bar segments
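One detail worth spelling out: the answer uses geom_col rather than geom_bar. By default geom_bar counts rows (stat = "count") and does not expect a y aesthetic, which is the likely reason the attempt in the question did not work. The sketch below, on a two-activity subset of the data, shows the two equivalent ways of plotting precomputed frequencies.

```r
library(ggplot2)

# Small subset of the melted data, typed in by hand for illustration
d <- data.frame(
  activity  = rep(c("RAs", "TAs"), each = 3),
  haveused  = rep(c("yes", "no", "dontknow"), times = 2),
  responses = c(14, 11, 8, 3, 15, 15)
)

# geom_col() uses the y values as bar heights
p1 <- ggplot(d, aes(x = activity, y = responses, fill = haveused)) +
  geom_col()

# geom_bar() needs stat = "identity" to do the same
p2 <- ggplot(d, aes(x = activity, y = responses, fill = haveused)) +
  geom_bar(stat = "identity")
```

Both versions stack the segments by default, so each bar's total height is the sum of its responses.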