I have a data frame with the two columns bloodlevel and sex (F & M only), with 14 male and 11 female.
bloodlevel sex
1 14.9 M
2 12.9 M
3 14.7 M
4 14.7 M
5 14.8 M
6 14.7 M
7 13.9 M
8 14.1 M
9 16.1 M
10 16.1 M
11 15.3 M
12 12.8 M
13 14.0 M
14 14.9 M
15 11.2 F
16 14.5 F
17 12.1 F
18 14.8 F
19 15.2 F
20 11.2 F
21 15.0 F
22 13.2 F
23 14.4 F
24 14.7 F
25 13.2 F
I am trying to create two histograms that differentiate females' and males' blood levels with facet_wrap.
I have tried
ggplot(Physiology, aes(x=sex, y=bloodlevel))+
geom_histogram(binwidth=5, fill="white", color="black")+
facet_wrap(~Physiology)+
xlab("sex")
but I’m getting the error
Error in `combine_vars()`:
! At least one layer must contain all faceting variables: `Physiology`.
* Plot is missing `Physiology`
* Layer 1 is missing `Physiology`
I am trying to facet the plot by this variable.
Is this what you're trying?
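library(ggplot2)
# Simulated data standing in for the asker's data frame
df <- data.frame(bloodlevel = sample(12:16, 25, TRUE),
                 sex = sample(c("M", "F"), 25, TRUE))
# Map only x; geom_histogram() computes the counts for the y axis
ggplot(df, aes(x = bloodlevel)) + geom_histogram() +
  facet_wrap(~sex)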
Next time, please provide a working code sample for us to use (copying the table you printed doesn't do the trick).
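For reference, the error in the question comes from facet_wrap(~Physiology): Physiology is the name of the data frame, not one of its columns, so ggplot2 cannot find a faceting variable with that name. Below is a minimal sketch using the values from the question. It assumes the data frame is rebuilt by hand as shown and that a binwidth of 0.5 is acceptable (with the original binwidth=5, nearly all values fall into one or two bars).

```r
library(ggplot2)

# Data copied from the question: 14 males followed by 11 females
Physiology <- data.frame(
  bloodlevel = c(14.9, 12.9, 14.7, 14.7, 14.8, 14.7, 13.9, 14.1, 16.1, 16.1,
                 15.3, 12.8, 14.0, 14.9,
                 11.2, 14.5, 12.1, 14.8, 15.2, 11.2, 15.0, 13.2, 14.4, 14.7, 13.2),
  sex = rep(c("M", "F"), times = c(14, 11))
)

# A histogram takes only an x aesthetic; the counts on y are computed.
# Facet on the column `sex`, not on the name of the data frame.
p <- ggplot(Physiology, aes(x = bloodlevel)) +
  geom_histogram(binwidth = 0.5, fill = "white", color = "black") +
  facet_wrap(~sex) +
  xlab("Blood level")
print(p)
```

This should give one panel per sex, each with its own histogram of bloodlevel.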
When I tried to plot a line, the x-axis came out in a different order from my data. This is my data:
Month num temp
1 2016-1-1 61 4.5
2 2016-2-1 50 3.8
3 2016-3-1 51 5.3
4 2016-4-1 48 6.5
5 2016-5-1 49 11.3
6 2016-6-1 48 13.9
7 2016-7-1 50 15.3
8 2016-8-1 48 15.5
9 2016-9-1 52 14.6
10 2016-10-1 54 9.8
11 2016-11-1 69 4.9
12 2016-12-1 80 5.9
13 2017-1-1 59 3.8
14 2017-2-1 52 5.2
15 2017-3-1 51 7.3
16 2017-4-1 47 8.0
17 2017-5-1 50 12.1
18 2017-6-1 47 14.4
and my code was:
ggplot(data=trendsData,aes(x=Month, y=temp,group=1))+geom_line()+theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = .5))
but the months on the x-axis came out in the wrong order.
Could anyone help with the ordering? Thanks!
R can only sort those dates correctly when it knows that they are in fact dates.
ymd() from the package lubridate is nice for that.
library(lubridate)
trendsData$Month <- ymd(trendsData$Month)
Then your plot should be fine.
EDIT:
If you want more date points to show on the x axis, you can use scale_x_date() like so:
+ scale_x_date( breaks=trendsData$Month )
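To see the whole thing in one place, here is a rough sketch with a handful of rows from the question. It uses base R's as.Date() as a stand-in for lubridate's ymd() (an assumption on my part, chosen so that the example needs only ggplot2); the date_labels format is also just an illustrative choice.

```r
library(ggplot2)

# A few rows of the question's data; Month starts out as character
trendsData <- data.frame(
  Month = c("2016-1-1", "2016-2-1", "2016-9-1", "2016-10-1", "2016-11-1", "2016-12-1"),
  temp  = c(4.5, 3.8, 14.6, 9.8, 4.9, 5.9)
)

# As character, the axis is ordered as text, so "2016-10-1" lands before "2016-2-1".
# Converting to Date gives a continuous, chronological axis.
trendsData$Month <- as.Date(trendsData$Month, format = "%Y-%m-%d")

p <- ggplot(trendsData, aes(x = Month, y = temp)) +
  geom_line() +
  scale_x_date(breaks = trendsData$Month, date_labels = "%Y-%m") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = .5))
print(p)
```

Once Month is a Date, group=1 is no longer needed, because the x axis is continuous rather than discrete.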
When I try to fit an exponential decay and my x axis has decimal numbers, the fit is never correct. Here's my data below:
exp.decay = data.frame(time,counts)
time counts
1 0.4 4458
2 0.6 2446
3 0.8 1327
4 1.0 814
5 1.2 549
6 1.4 401
7 1.6 266
8 1.8 182
9 2.0 140
10 2.2 109
11 2.4 83
12 2.6 78
13 2.8 57
14 3.0 50
15 3.2 31
16 3.4 22
17 3.6 23
18 3.8 20
19 4.0 19
20 4.2 9
21 4.4 7
22 4.6 4
23 4.8 6
24 5.0 4
25 5.2 6
26 5.4 2
27 5.6 7
28 5.8 2
29 6.0 0
30 6.2 3
31 6.4 1
32 6.6 1
33 6.8 2
34 7.0 1
35 7.2 2
36 7.4 1
37 7.6 1
38 7.8 0
39 8.0 0
40 8.2 0
41 8.4 0
42 8.6 1
43 8.8 0
44 9.0 0
45 9.2 0
46 9.4 1
47 9.6 0
48 9.8 0
49 10.0 1
fit.one.exp <- nls(counts ~ A*exp(-k*time),data=exp.decay, start=c(A=max(counts),k=0.1))
plot(exp.decay, col='darkblue',xlab = 'Track Duration (seconds)',ylab = 'Number of Particles', main = 'Exponential Fit')
lines(predict(fit.one.exp), col = 'red', lty=2, lwd=2)
I always get this weird fit. Seems to me that the fit is not recognizing the right x axis, because when I use a different set of data, with only integers in the x axis (time) the fit works! I don't understand why it's different with different units.
You need one small modification:
lines(predict(fit.one.exp), col = 'red', lty=2, lwd=2)
should be
lines(exp.decay$time, predict(fit.one.exp), col = 'red', lty=2, lwd=2)
This way you make sure to plot against the desired values on your abscissa.
I tested it like this:
data = read.csv('exp_fit_r.csv')
A0 <- max(data$count)
k0 <- 0.1
fit <- nls(count ~ A*exp(-k*time), start=list(A=A0, k=k0), data=data)
plot(data)
lines(data$time, predict(fit), col='red')
With that change the fitted curve lies on top of the data. The fit itself was fine; it was just a matter of plotting against the correct abscissa values.
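Why the original call misplaces the curve: when lines() gets a single vector, it plots it against the index 1, 2, ..., n. That presumably also explains why the integer-valued data set looked right, since there the index and the time values coincide. A small sketch of this, where pred is a made-up stand-in for predict(fit.one.exp) and not the actual fitted values:

```r
# Time axis from the question: 0.4, 0.6, ..., 10.0 (49 points)
time <- seq(0.4, 10, by = 0.2)

# Hypothetical stand-in for predict(fit.one.exp); the numbers are made up
pred <- 4000 * exp(-2 * time)

# xy.coords() is what lines() uses to work out its coordinates.
# With a single vector, x falls back to the index 1, 2, ..., n.
xy_index <- xy.coords(pred)
range(xy_index$x)   # runs from 1 to 49, not from 0.4 to 10

# Passing time explicitly puts the curve on the data's own x scale
xy_time <- xy.coords(time, pred)
range(xy_time$x)
```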
Can someone show me how to make a fully labelled scatter plot for two variables, with axis labels that include units (such as "cm") and a chart title? For example, how would I make such a scatter plot in R for age and height, using the following data?
Distance Age Height Coning
1 21.4 18 3.3 Yes
2 13.9 17 3.4 Yes
3 23.9 16 2.9 Yes
4 8.7 18 3.6 No
5 241.8 6 0.7 No
6 44.5 17 1.3 Yes
7 30.0 15 2.5 Yes
8 32.3 16 1.8 Yes
9 31.4 17 5.0 No
10 32.8 13 1.6 No
11 53.3 12 2.0 No
12 54.3 6 0.9 No
13 96.3 11 2.6 No
14 133.6 4 0.6 No
15 32.1 15 2.3 No
16 57.9 12 2.4 Yes
17 30.8 17 1.8 No
18 59.9 7 0.8 No
19 42.7 15 2.0 Yes
20 20.6 18 1.7 Yes
21 62.0 8 1.3 No
22 53.1 7 1.6 No
23 28.9 16 2.2 Yes
24 177.4 5 1.1 No
25 24.8 14 1.5 Yes
26 75.3 14 2.3 Yes
27 51.6 7 1.4 No
28 36.1 9 1.1 No
29 116.1 6 1.1 No
30 28.1 16 2.5 Yes
31 8.7 19 2.2 Yes
32 105.1 6 0.8 No
33 46.0 15 3.0 Yes
34 102.6 7 1.2 No
35 15.8 15 2.2 No
36 60.0 7 1.3 No
37 96.4 13 2.6 No
38 24.2 14 1.7 No
39 14.5 15 2.4 No
40 36.6 14 1.5 No
41 65.7 5 0.6 No
42 116.3 7 1.6 No
43 113.6 8 1.0 No
44 16.7 15 4.3 Yes
45 66.0 7 1.0 No
46 60.7 7 1.0 No
47 90.6 7 0.7 No
48 91.3 7 1.3 No
49 14.4 18 3.1 Yes
50 72.8 14 3.0 Yes
With base graphics:
df <- read.table(header=T, sep=" ", text="
Row Distance Age Height Coning
1 21.4 18 3.3 Yes
2 13.9 17 3.4 Yes
3 23.9 16 2.9 Yes
4 8.7 18 3.6 No
5 241.8 6 0.7 No
6 44.5 17 1.3 Yes
7 30.0 15 2.5 Yes
8 32.3 16 1.8 Yes
9 31.4 17 5.0 No
10 32.8 13 1.6 No
11 53.3 12 2.0 No
12 54.3 6 0.9 No
13 96.3 11 2.6 No
14 133.6 4 0.6 No
15 32.1 15 2.3 No
16 57.9 12 2.4 Yes
17 30.8 17 1.8 No
18 59.9 7 0.8 No
19 42.7 15 2.0 Yes
20 20.6 18 1.7 Yes
21 62.0 8 1.3 No
22 53.1 7 1.6 No
23 28.9 16 2.2 Yes
24 177.4 5 1.1 No
25 24.8 14 1.5 Yes
26 75.3 14 2.3 Yes
27 51.6 7 1.4 No
28 36.1 9 1.1 No
29 116.1 6 1.1 No
30 28.1 16 2.5 Yes
31 8.7 19 2.2 Yes
32 105.1 6 0.8 No
33 46.0 15 3.0 Yes
34 102.6 7 1.2 No
35 15.8 15 2.2 No
36 60.0 7 1.3 No
37 96.4 13 2.6 No
38 24.2 14 1.7 No
39 14.5 15 2.4 No
40 36.6 14 1.5 No
41 65.7 5 0.6 No
42 116.3 7 1.6 No
43 113.6 8 1.0 No
44 16.7 15 4.3 Yes
45 66.0 7 1.0 No
46 60.7 7 1.0 No
47 90.6 7 0.7 No
48 91.3 7 1.3 No
49 14.4 18 3.1 Yes
50 72.8 14 3.0 Yes")
attach(df)
lab <- sprintf("%.1fcm, %dyr", Height, Age)
plot(Age ~ Height, main="The Title", pch=20, xlab="Height in cm", ylab="Age in years")
text(y=Age, x=Height, labels=lab, cex=.7, col=rgb(0,0,0,.5), pos=4)
detach(df)
And with the help of wordcloud::textplot():
if (!require(wordcloud)) {
install.packages("wordcloud")
library(wordcloud)
}
# df was detached above, so refer to its columns explicitly
plot(Age ~ Height, data=df, main="The Title", pch=20, xlab="Height in cm", ylab="Age in years", type="n")
textplot(y=df$Age, x=df$Height, words=lab, cex=.5, new=FALSE, show.lines=TRUE)
You can use the ggplot2 library. Example -
library(ggplot2)
ggplot(mtcars, aes(x=wt, y=mpg, label=rownames(mtcars)))+
geom_point() +
geom_text()
What that code snippet is doing is taking the 'mtcars' dataset, assigning the x variable as the wt column, the y variable as the mpg column, and the labels as the rownames. geom_point adds a scatterplot based on the above x,y, and geom_text places the labels at the x,y coordinates.
Check out the help entry on geom_text to see the formatting options.
Examples taken from ggplot2 documentation, page 98
p <- ggplot(mtcars, aes(x=wt, y=mpg, label=rownames(mtcars)))
p + geom_text()
# Change size of the label
p + geom_text(size=10)
p <- p + geom_point()
# Set aesthetics to fixed value
p + geom_text()
p + geom_point() + geom_text(hjust=0, vjust=0)
p + geom_point() + geom_text(angle = 45)
# Add aesthetic mappings
p + geom_text(aes(colour=factor(cyl)))
p + geom_text(aes(colour=factor(cyl))) + scale_colour_discrete(l=40)
p + geom_text(aes(size=wt))
p + geom_text(aes(size=wt)) + scale_size(range=c(3,6))
# You can display expressions by setting parse = TRUE. The
# details of the display are described in ?plotmath, but note that
# geom_text uses strings, not expressions.
p + geom_text(aes(label = paste(wt, "^(", cyl, ")", sep = "")),
parse = TRUE)
# Add an annotation not from a variable source
c <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
c + geom_text(data = NULL, x = 5, y = 30, label = "plot mpg vs. wt")
# Or, you can use annotate
c + annotate("text", label = "plot mpg vs. wt", x = 2, y = 15, size = 8, colour = "red")
# Use qplot instead (note: qplot() is deprecated as of ggplot2 3.4.0)
qplot(wt, mpg, data = mtcars, label = rownames(mtcars),
geom=c("point", "text"))
qplot(wt, mpg, data = mtcars, label = rownames(mtcars), size = wt) +
geom_text(colour = "red")
# You can specify family, fontface and lineheight
p <- ggplot(mtcars, aes(x=wt, y=mpg, label=rownames(mtcars)))
p + geom_text(fontface=3)
p + geom_text(aes(fontface=am+1))
p + geom_text(aes(family=c("serif", "mono")[am+1]))
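The examples above deal with labelling the points. For the axis labels with units and the chart title that the question asks about, labs() can be added to the plot. A minimal sketch with the first six rows of the question's data; the data frame name d, the title text and the units are placeholders I chose for illustration:

```r
library(ggplot2)

# First six rows of the question's data (Age and Height only)
d <- data.frame(
  Age    = c(18, 17, 16, 18, 6, 17),
  Height = c(3.3, 3.4, 2.9, 3.6, 0.7, 1.3)
)

# labs() sets the title and both axis labels in one call
p <- ggplot(d, aes(x = Height, y = Age)) +
  geom_point() +
  labs(title = "Age against height",
       x = "Height (cm)",
       y = "Age (years)")
print(p)
```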
So I know that R has a built-in function for t-tests, which is basically just t.test(y1,y2) etc.
But I'm having trouble accessing the data I wish to compare.
I basically have two separate data outputs.
The first one is just called 'data' and it has an output similar to this
Time Kilometres
0 0
1 0.05
2 0.1
3 0.15
4 0.2
5 0.25
6 0.3
7 0.35
8 0.4
9 0.45
10 0.5
The other output is called 'hunt' and has this output
cuts: [20,25)
Time Kilometres
21 20 7.3
22 21 8.4
23 22 9.5
24 23 10.6
25 24 11.7
------------------------------------------------------------
cuts: [25,30)
Time Kilometres
26 25 12.8
27 26 13.9
28 27 15.0
29 28 16.1
30 29 17.2
------------------------------------------------------------
cuts: [30,35)
Time Kilometres
31 30 18.3
32 31 19.4
33 32 20.5
34 33 21.6
35 34 22.7
My question is, would it be possible to do separate t-tests for each cut? That is, get a separate p value for each cut while comparing each cut with my first data set, called 'data'.
so the p value for cuts:[0,5] =
cuts:[5:10] =
etc
Thanks again
So far it is not clear what you want to test within each group. The t.test can be used as either a two-group test or a one group test. You have only one group per cut. When used as a one-group test, there needs to be an assumed value for the mean against which the test gets run, but it's not clear what sort of reference value would be appropriate. I'm wondering if what you really want to do is a linear regression within each cut-category to test for a trend? This would implicitly be testing against a slope of zero. This is lightly tested code:
# Assumed here: dat is the full data frame and cut.grp the factor of Time intervals
lapply(split(dat, cut.grp),
       function(dgrp) summary(lm(Kilometres ~ Time, data = dgrp))$coefficients[, "Pr(>|t|)"])
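If what is wanted really is a two-sample comparison of each cut against 'data', the same split-and-apply pattern works with t.test(). This is only a sketch of the mechanics, under the assumption that comparing Kilometres in each cut with data$Kilometres is the intended test; whether that is statistically sensible for time-ordered distances is a separate question. The data frames are rebuilt by hand from the rows shown in the question.

```r
# Reference data from the question ('data')
data <- data.frame(Time = 0:10, Kilometres = seq(0, 0.5, by = 0.05))

# The rows shown in the 'hunt' output
hunt <- data.frame(
  Time = 20:34,
  Kilometres = c(7.3, 8.4, 9.5, 10.6, 11.7, 12.8, 13.9, 15.0, 16.1, 17.2,
                 18.3, 19.4, 20.5, 21.6, 22.7)
)

# Rebuild the cut factor: [20,25), [25,30), [30,35)
cut.grp <- cut(hunt$Time, breaks = seq(20, 35, by = 5), right = FALSE)

# One Welch two-sample t-test per cut, each against data$Kilometres
p.values <- sapply(split(hunt$Kilometres, cut.grp),
                   function(km) t.test(km, data$Kilometres)$p.value)
p.values
```

The result is a named vector with one p value per cut, which is the shape of output the question asks for.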