Wrong Fit using nls function - r

When I try to fit an exponential decay and my x axis has decimal number, the fit is never correct. Here's my data below:
exp.decay = data.frame(time,counts)
time counts
1 0.4 4458
2 0.6 2446
3 0.8 1327
4 1.0 814
5 1.2 549
6 1.4 401
7 1.6 266
8 1.8 182
9 2.0 140
10 2.2 109
11 2.4 83
12 2.6 78
13 2.8 57
14 3.0 50
15 3.2 31
16 3.4 22
17 3.6 23
18 3.8 20
19 4.0 19
20 4.2 9
21 4.4 7
22 4.6 4
23 4.8 6
24 5.0 4
25 5.2 6
26 5.4 2
27 5.6 7
28 5.8 2
29 6.0 0
30 6.2 3
31 6.4 1
32 6.6 1
33 6.8 2
34 7.0 1
35 7.2 2
36 7.4 1
37 7.6 1
38 7.8 0
39 8.0 0
40 8.2 0
41 8.4 0
42 8.6 1
43 8.8 0
44 9.0 0
45 9.2 0
46 9.4 1
47 9.6 0
48 9.8 0
49 10.0 1
fit.one.exp <- nls(counts ~ A*exp(-k*time),data=exp.decay, start=c(A=max(counts),k=0.1))
plot(exp.decay, col='darkblue',xlab = 'Track Duration (seconds)',ylab = 'Number of Particles', main = 'Exponential Fit')
lines(predict(fit.one.exp), col = 'red', lty=2, lwd=2)
I always get this weird fit. Seems to me that the fit is not recognizing the right x axis, because when I use a different set of data, with only integers in the x axis (time) the fit works! I don't understand why it's different with different units.

You need one small modification:
lines(predict(fit.one.exp), col = 'red', lty=2, lwd=2)
should be
lines(exp.decay$time, predict(fit.one.exp), col = 'red', lty=2, lwd=2)
This way you make sure to plot against the desired values on your abscissa.
I tested it like this:
data = read.csv('exp_fit_r.csv')
A0 <- max(data$count)
k0 <- 0.1
fit <- nls(data$count ~ A*exp(-k*data$time), start=list(A=A0, k=k0), data=data)
plot(data)
lines(data$time, predict(fit), col='red')
which gives me the following output:
As you can see, the fit describes the actual data very well, it was just a matter of plotting against the correct abscissa values.

Related

ggplot x axis different from database

When i was trying to plot a line, the x-axis came out different from the database. This is my data:
Month num temp
1 2016-1-1 61 4.5
2 2016-2-1 50 3.8
3 2016-3-1 51 5.3
4 2016-4-1 48 6.5
5 2016-5-1 49 11.3
6 2016-6-1 48 13.9
7 2016-7-1 50 15.3
8 2016-8-1 48 15.5
9 2016-9-1 52 14.6
10 2016-10-1 54 9.8
11 2016-11-1 69 4.9
12 2016-12-1 80 5.9
13 2017-1-1 59 3.8
14 2017-2-1 52 5.2
15 2017-3-1 51 7.3
16 2017-4-1 47 8.0
17 2017-5-1 50 12.1
18 2017-6-1 47 14.4
and my code was:
ggplot(data=trendsData,aes(x=Month, y=temp,group=1))+geom_line()+theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = .5))
but it came out:
enter image description here
Could anyone help with the disorder, thanks!
R can only sort those dates correctly when it knows what they are infact dates.
ymd() from the package lubridate is nice for that.
trendsData$Month <- ymd( trendsData$Month )
Then your plot should be fine.
EDIT:
If you want more date points to show on the x axis, you can use scale_x_date() like so:
+ scale_x_date( breaks=trendsData$Month )

Does ggplot2 exclude some data?

I want to create some basic grouped barplots with ggplot2 but it seems to exclude some data. If I review my input data everything is there, but some bars are missing and it is also messing with the error bars. I tried to convert into multiple variable types, regrouped, loaded again, saved everything in .csv and loaded all new... I just don't know what is wrong.
Here is my code:
library(ggplot2)
limits <- aes(ymax = DataCm$mean + DataCm$sd,
ymin = DataCm$mean - DataCm$sd)
p <- ggplot(data = DataCm, aes(x = factor(DataCm$Zeit), y = factor(DataCm$mean)
) )
p + geom_bar(stat = "identity",
position = position_dodge(0.9),fill =DataCm$group) +
geom_errorbar(limits, position = position_dodge(0.9),
width = 0.25) +
labs(x = "Time [min]", y = "Individuals per foodsource")
This is DataCm:
Zeit mean sd group
1 30 0.1 0.3162278 1
2 60 0.0 0.0000000 2
3 90 0.1 0.3162278 3
4 120 0.0 0.0000000 4
5 150 0.1 0.3162278 5
6 180 0.1 0.3162278 6
7 240 0.3 0.6749486 1
8 300 0.3 0.6749486 2
9 360 0.3 0.6749486 3
10 30 0.1 0.3162278 4
11 60 0.1 0.3162278 5
12 90 0.2 0.4216370 6
13 120 0.3 0.4830459 1
14 150 0.3 0.4830459 2
15 180 0.4 0.5163978 3
16 240 0.3 0.4830459 4
17 300 0.4 0.5163978 5
18 360 0.4 0.5163978 6
19 30 1.2 1.1352924 1
20 60 1.8 1.6865481 2
21 90 2.2 2.0976177 3
22 120 2.2 2.0976177 4
23 150 2.0 1.8856181 5
24 180 2.3 1.9465068 6
25 240 2.4 2.0655911 1
26 300 2.1 1.8529256 2
27 360 2.0 2.1602469 3
28 30 0.2 0.4216370 4
29 60 0.1 0.3162278 5
30 90 0.1 0.3162278 6
31 120 0.1 0.3162278 1
32 150 0.0 0.0000000 2
33 180 0.1 0.3162278 3
34 240 0.1 0.3162278 4
35 300 0.1 0.3162278 5
36 360 0.1 0.3162278 6
37 30 1.3 1.5670212 1
38 60 1.5 1.5811388 2
39 90 1.5 1.7159384 3
40 120 1.5 1.9002924 4
41 150 1.9 2.1317703 5
42 180 1.9 2.1317703 6
43 240 2.2 2.3475756 1
44 300 2.4 2.3190036 2
45 360 2.2 2.1499354 3
46 30 2.1 2.1317703 4
47 60 3.0 2.2110832 5
48 90 3.3 2.1628171 6
49 120 3.2 2.1499354 1
50 150 3.4 2.6331224 2
51 180 3.5 2.4152295 3
52 240 3.7 2.6267851 4
53 300 3.7 2.4060110 5
54 360 3.8 2.6583203 6
The output is:
Maybe you can help me. Thanks in advance!
Best wishes,
Benjamin
Solved it:
I reshaped everything in Excel and exported it another way. The group variable was also not the way I wanted it. Now it is fixed, but I can't really tell you why.
Your data looks malformed. I guess you wanted to have 6 different group values for each time point, but now the group variable just loops over, and you have:
1 30 0.1 0.3162278 1
...
10 30 0.1 0.3162278 4
...
19 30 1.2 1.1352924 1
...
28 30 0.2 0.4216370 4
geom_bar then probably omits rows that have identical mean and time. Although I am not sure why it chooses to do so, you should solve the group problem first anyway.

Selecting matching columns and actions on those lines

I have a file where I wanted to select lines where column $3 is the same. Now I have grouped them but I want to do certain actions on those lines in case column $1 (and/or $2) satisfies a certain condition.
For example - if all the values in $1 and $2 (within the group of lines that have the same value in $3) are within 0.1 from each other, I want to take the average of columns $1 and $2 (for that group that has the same $3). If it is larger, I want to just print those lines without taking an average.
My input is something like:
1.3 22.5 ALFA 45 50
1.4 22.6 ALFA 45 50
1.5 22.7 ALFA 45 50
1.6 22.8 ALFA 45 51
5.5 8.5 BETA 53 15
5.6 8.6 BETA 53 15
5.5 8.5 BETA 53 15
7.6 10.6 GAMA 75 13
7.7 10.7 GAMA 76 13
12 11.5 GAMA 75 13
4.5 4.5 DELTA 65 12
4.6 5.7 DELTA 65 12
12.1 8 EPS 44 16
12.2 8 EPS 44 16
I want my output to be:
out1.txt:
5.53 8.53 BETA 53 15
12.15 8 EPS 44 16
out2.txt:
1.3 22.5 ALFA 45 50
1.4 22.6 ALFA 45 50
1.5 22.7 ALFA 45 50
1.6 22.8 ALFA 45 50
7.6 10.6 GAMA 75 13
7.7 10.7 GAMA 76 13
12 11.5 GAMA 75 13
4.5 5.6 DELTA 65 12
4.6 9 DELTA 65 12
awk to the rescue!
awk '{k=$3;
if(!(k in min1)) {max1[k]=min1[k]=$1; max2[k]=min2[k]=$2}
sum1[k]+=$1; sum2[k]+=$2; count[k]++;
if(max1[k]<$1) max1[k]=$1; if(min1[k]>$1) min1[k]=$1;
if(max2[k]<$2) max2[k]=$2; if(min2[k]>$2) min2[k]=$2}
END {for(k in sum1)
if(max1[k]-min1[k]<=0.1 && max2[k]-min2[k]<=0.1)
printf "%.2f\t%.2f\t%s\n",sum1[k]/count[k],sum2[k]/count[k],k}' file
12.15 8.00 EPS
5.53 8.53 BETA
a lot of code but almost trivial, the order is not preserved though.

making a fully labelled scatter plot using R

can someone help me show me how I could make a fully labelled scatter plot for 2 variables, showing the axis labels with units(such as "cm"), and also including the chart title. Forexample, how would i make a fully labelled scatter plot including all the above listed features for age and height, using the following data using R?
Distance Age Height Coning
1 21.4 18 3.3 Yes
2 13.9 17 3.4 Yes
3 23.9 16 2.9 Yes
4 8.7 18 3.6 No
5 241.8 6 0.7 No
6 44.5 17 1.3 Yes
7 30.0 15 2.5 Yes
8 32.3 16 1.8 Yes
9 31.4 17 5.0 No
10 32.8 13 1.6 No
11 53.3 12 2.0 No
12 54.3 6 0.9 No
13 96.3 11 2.6 No
14 133.6 4 0.6 No
15 32.1 15 2.3 No
16 57.9 12 2.4 Yes
17 30.8 17 1.8 No
18 59.9 7 0.8 No
19 42.7 15 2.0 Yes
20 20.6 18 1.7 Yes
21 62.0 8 1.3 No
22 53.1 7 1.6 No
23 28.9 16 2.2 Yes
24 177.4 5 1.1 No
25 24.8 14 1.5 Yes
26 75.3 14 2.3 Yes
27 51.6 7 1.4 No
28 36.1 9 1.1 No
29 116.1 6 1.1 No
30 28.1 16 2.5 Yes
31 8.7 19 2.2 Yes
32 105.1 6 0.8 No
33 46.0 15 3.0 Yes
34 102.6 7 1.2 No
35 15.8 15 2.2 No
36 60.0 7 1.3 No
37 96.4 13 2.6 No
38 24.2 14 1.7 No
39 14.5 15 2.4 No
40 36.6 14 1.5 No
41 65.7 5 0.6 No
42 116.3 7 1.6 No
43 113.6 8 1.0 No
44 16.7 15 4.3 Yes
45 66.0 7 1.0 No
46 60.7 7 1.0 No
47 90.6 7 0.7 No
48 91.3 7 1.3 No
49 14.4 18 3.1 Yes
50 72.8 14 3.0 Yes
With base graphics:
df <- read.table(header=T, sep=" ", text="
Yes Distance Age Height Coning
1 21.4 18 3.3 Yes
2 13.9 17 3.4 Yes
3 23.9 16 2.9 Yes
4 8.7 18 3.6 No
5 241.8 6 0.7 No
6 44.5 17 1.3 Yes
7 30.0 15 2.5 Yes
8 32.3 16 1.8 Yes
9 31.4 17 5.0 No
10 32.8 13 1.6 No
11 53.3 12 2.0 No
12 54.3 6 0.9 No
13 96.3 11 2.6 No
14 133.6 4 0.6 No
15 32.1 15 2.3 No
16 57.9 12 2.4 Yes
17 30.8 17 1.8 No
18 59.9 7 0.8 No
19 42.7 15 2.0 Yes
20 20.6 18 1.7 Yes
21 62.0 8 1.3 No
22 53.1 7 1.6 No
23 28.9 16 2.2 Yes
24 177.4 5 1.1 No
25 24.8 14 1.5 Yes
26 75.3 14 2.3 Yes
27 51.6 7 1.4 No
28 36.1 9 1.1 No
29 116.1 6 1.1 No
30 28.1 16 2.5 Yes
31 8.7 19 2.2 Yes
32 105.1 6 0.8 No
33 46.0 15 3.0 Yes
34 102.6 7 1.2 No
35 15.8 15 2.2 No
36 60.0 7 1.3 No
37 96.4 13 2.6 No
38 24.2 14 1.7 No
39 14.5 15 2.4 No
40 36.6 14 1.5 No
41 65.7 5 0.6 No
42 116.3 7 1.6 No
43 113.6 8 1.0 No
44 16.7 15 4.3 Yes
45 66.0 7 1.0 No
46 60.7 7 1.0 No
47 90.6 7 0.7 No
48 91.3 7 1.3 No
49 14.4 18 3.1 Yes
50 72.8 14 3.0 Yes")
attach(df)
lab <- sprintf("%.1fcm, %dyr", Height, Age)
plot(Age ~ Height, main="The Title", pch=20, xlab="Height in cm", ylab="Age in years")
text(y=Age, x=Height, labels=lab, cex=.7, col=rgb(0,0,0,.5), pos=4)
detach(df)
And with the help of wordcloud::textplot():
if (!require(wordcloud)) {
install.packages("wordcloud")
library(wordcloud)
}
plot(Age ~ Height, main="The Title", pch=20, xlab="Height in cm", ylab="Age in years", type="n")
textplot(y=Age, x=Height, words=lab, cex=.5, new=F, show.lines=T)
You can use the ggplot2 library. Example -
library(ggplot2)
ggplot(mtcars, aes(x=wt, y=mpg, label=rownames(mtcars)))+
geom_point() +
geom_text()
What that code snippet is doing is taking the 'mtcars' dataset, assigning the x variable as the wt column, the y variable as the mpg column, and the labels as the rownames. geom_point adds a scatterplot based on the above x,y, and geom_text places the labels at the x,y coordinates.
Check out the help entry on geom_text to see the formatting options.
Examples taken from ggplot2 documentation, page 98
p <- ggplot(mtcars, aes(x=wt, y=mpg, label=rownames(mtcars)))
p + geom_text()
# Change size of the label
p + geom_text(size=10)
p <- p + geom_point()
# Set aesthetics to fixed value
p + geom_text()
p + geom_point() + geom_text(hjust=0, vjust=0)
p + geom_point() + geom_text(angle = 45)
# Add aesthetic mappings
p + geom_text(aes(colour=factor(cyl)))
p + geom_text(aes(colour=factor(cyl))) + scale_colour_discrete(l=40)
p + geom_text(aes(size=wt))
p + geom_text(aes(size=wt)) + scale_size(range=c(3,6))
# You can display expressions by setting parse = TRUE. The
# details of the display are described in ?plotmath, but note that
# geom_text uses strings, not expressions.
p + geom_text(aes(label = paste(wt, "^(", cyl, ")", sep = "")),
parse = TRUE)
# Add an annotation not from a variable source
c <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
c + geom_text(data = NULL, x = 5, y = 30, label = "plot mpg vs. wt")
# Or, you can use annotate
c + annotate("text", label = "plot mpg vs. wt", x = 2, y = 15, size = 8, colour = "red")
# Use qplot instead
qplot(wt, mpg, data = mtcars, label = rownames(mtcars),
geom=c("point", "text"))
qplot(wt, mpg, data = mtcars, label = rownames(mtcars), size = wt) +
geom_text(colour = "red")
# You can specify family, fontface and lineheight
p <- ggplot(mtcars, aes(x=wt, y=mpg, label=rownames(mtcars)))
p + geom_text(fontface=3)
p + geom_text(aes(fontface=am+1))
p + geom_text(aes(family=c("serif", "mono")[am+1]))

Adding information on a graph using R

I would like to add some information on my graph which was plotted from this data set:
EDITTED:
#data set:
day <- c(0:28)
ndied <- c(342,335,240,122,74,64,49,60,51,44,35,48,41,34,38,27,29,23,20,15,20,16,17,17,14,10,4,1,2)
pdied <- c(19.1,18.7,13.4,6.8,4.1,3.6,2.7,3.3,2.8,2.5,2.0,2.7,2.3,1.9,2.1,1.5,1.6,1.3,1.1,0.8,1.1,0.9,0.9,0.9,0.8,0.6,0.2,0.1,0.1)
pmort <- data.frame(day,ndied,pdied)
> pmort
day ndied pdied
1 0 342 19.1
2 1 335 18.7
3 2 240 13.4
4 3 122 6.8
5 4 74 4.1
6 5 64 3.6
7 6 49 2.7
8 7 60 3.3
9 8 51 2.8
10 9 44 2.5
11 10 35 2.0
12 11 48 2.7
13 12 41 2.3
14 13 34 1.9
15 14 38 2.1
16 15 27 1.5
17 16 29 1.6
18 17 23 1.3
19 18 20 1.1
20 19 15 0.8
21 20 20 1.1
22 21 16 0.9
23 22 17 0.9
24 23 17 0.9
25 24 14 0.8
26 25 10 0.6
27 26 4 0.2
28 27 1 0.1
29 28 2 0.1
I have put together this script and still trying to improve on it so that the rest of the information can be added:
> barplot(pmort$pdied,xlab="Age(days)",ylab="Percent",xlim=c(0,28),ylim=c(0,20),legend="Mortality")
I am trying to insert the numbers 0 to 28 (age in days) on the x-axis but could not and I know that it could be a simple script. Secondly, I would like to add the number died or ndied (342 to 2) below each day(0 to 28) along the x-axis.
Example:
0 1 2 3 4 5 and so on...
(N=342) (N=335) (N=240) (N=122) (N=74) (N=64)
Graph:
Any help would be appreciated.
Baz
I gave you two ways to plot the info: one above the bars and one below. You can tweak it to meet your needs.
barX <- barplot(pmort$pdied,xlab="Age(days)",
ylab="Percent", names=pmort$day,
xlim=c(0,28),ylim=c(0,20),legend="Mortality")
text(cex=.5, x=barX, y=pmort$pdied+par("cxy")[2]/2, pmort$ndied, xpd=TRUE)
barX <- barplot(pmort$pdied,xlab="Age(days)",
ylab="Percent", names=pmort$day,
xlim=c(0,28),ylim=c(0,20),legend="Mortality")
text(cex=.5, x=barX, y=-.5, pmort$ndied, xpd=TRUE)

Resources