R: Trimming / setting boundaries on a filled.contour plot

I made a filled.contour plot and it came out nicely; however, I would like to trim off part of the triangle along its hypotenuse. Basically, I want to get rid of some of that yellow stripe along the hypotenuse.
Is there any way I could go about doing that in R?
Here is my code:
library(akima)
attach(asc)

# interpolate the irregular points onto a regular grid
test <- interp(AvgDepth, AvgMaxDepth, Gs)

filled.contour(test, color.palette = heat.colors, xlab = "Depth (m)", ylab = "Maximum Depth (m)", ylim = c(5, 90))
Here is my data set:
AvgDepth AvgMaxDepth Gs
1 5 5 0.022706473
2 5 15 -0.006287207
3 15 15 -0.002071806
4 5 25 -0.002569846
5 15 25 -0.005698020
6 25 25 -0.013394740
7 5 35 -0.001723604
8 15 35 -0.004575939
9 25 35 -0.001260225
10 35 35 0.025808307
11 5 45 -0.008369802
12 15 45 -0.004661506
13 25 45 0.003438334
14 35 45 0.004066056
15 5 55 -0.004517855
16 15 55 0.001577937
17 25 55 -0.000761080
18 35 55 0.004597452
19 45 55 0.015894575
20 5 65 -0.003023326
21 15 65 0.001327518
22 25 65 -0.000967222
23 35 65 -0.005843258
24 45 65 -0.000534109
25 55 65 0.001292299
26 5 75 -0.003593511
27 15 75 0.000484908
28 25 75 -0.008013139
29 35 75 -0.013281240
30 45 75 -0.009767021
31 55 75 -0.019364488
32 65 75 -0.019202670
33 5 85 -0.004487259
34 15 85 -0.001588138
35 25 85 -0.004464418
36 35 85 -0.007797982
37 45 85 -0.013272495
38 55 85 -0.022616793
39 65 85 -0.017740362
40 75 85 0.012021166
41 5 95 0.002236271
42 15 95 0.002102761
43 25 95 -0.001748743
44 35 95 -0.003063959
45 45 95 -0.001264025
46 55 95 -0.004662023
47 65 95 0.002980029
48 75 95 0.015868836
49 85 95 0.008842697
50 95 95 0.036387641
Any help is appreciated.
Thanks
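One possible way to trim the stripe, sketched here as an assumption rather than an answer from the thread, is to set the interpolated z values close to the diagonal to NA before plotting, since filled.contour should leave NA cells blank. The grid, the surface and the `margin` value below are made up for illustration; with the question's data the same mask would be built from `test$x` and `test$y` and applied to `test$z`.

```r
# Illustrative grid and surface (not the asker's data)
x <- seq(5, 95, length.out = 40)
y <- seq(5, 95, length.out = 40)
z <- outer(x, y, function(x, y) sin(x / 20) + cos(y / 20))

# Hypothetical width of the stripe to remove along the hypotenuse (y = x)
margin <- 5

# mask[i, j] is TRUE where the point (x[i], y[j]) lies on or below the
# diagonal, or less than `margin` above it
mask <- outer(x, y, function(x, y) y - x < margin)
z[mask] <- NA

# NA cells are not filled, so the stripe along the hypotenuse disappears
filled.contour(x, y, z, color.palette = heat.colors,
               xlab = "Depth (m)", ylab = "Maximum Depth (m)")
```

Increasing `margin` removes a wider band; a value of 0 only removes the part below the diagonal.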

Related

How to cut the values in a regular interval and define them into the separate group? [duplicate]

This question already has answers here:
Split a vector into chunks
(22 answers)
Closed 3 years ago.
How can I cut the values 1 to 100 at a regular interval of 25 and place them into 4 groups, as below?
sdr <- c(1:100)
Group1: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Group2: 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
Group3: 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
Group4: 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
Any suggestions, please?
You could use split:
sdr <- 1:100
split(sdr, rep(1:4, each = 25))
#$`1`
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#
#$`2`
# [1] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
#
#$`3`
# [1] 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
#
#$`4`
# [1] 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
This returns a list with 4 vector elements.
Also note that the c() around 1:100 is not necessary.
Or we can define the number of groups and derive the group size from it:
ngroup <- 4
split(sdr, rep(1:ngroup, each = length(sdr) %/% ngroup))
giving the same result.
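If it is more natural to specify the chunk size than the number of groups, a variant of the same idea is sketched below. The name `chunk_size` is just an illustrative choice; the grouping factor is built with ceiling(seq_along(...) / chunk_size), which also copes with a vector whose length is not a multiple of the chunk size.

```r
sdr <- 1:100
chunk_size <- 25  # illustrative name: how many values go into each group

# seq_along(sdr) / chunk_size is 0.04, 0.08, ..., 4; rounding up gives 1, 1, ..., 4
groups <- split(sdr, ceiling(seq_along(sdr) / chunk_size))
lengths(groups)

# A vector of length 103 gets a shorter fifth group instead of an error
uneven <- split(1:103, ceiling(seq_along(1:103) / chunk_size))
lengths(uneven)
```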
You can also make a data frame with one column per group and then transpose it using t (note that t returns a matrix, with one row per group):
df <- t(data.frame(Group1 = 1:25, Group2 = 26:50, Group3 = 51:75, Group4 = 76:100))

Poisson distribution too narrow, negative binomial too broad

I'm trying to fit a distribution to some count data for the number of fish purchased by anglers (grey in the image) using optim in R. I've fitted both a Poisson (red) and a negative binomial distribution (blue), but as you can see neither seems to be right. What should my next steps be for getting a better fit?
My graph:
# fit Poisson curve to data using optim
minus.logL.s <- function(lambda, dat) {
  -sum(dpois(dat, lambda, log = TRUE))
}
mle <- optim(par = 45, fn = minus.logL.s, method = "BFGS", hessian = TRUE, dat = survey.responses.baitusers$fish.per.trip)
mle
# simulate data coming from a Poisson distribution with mean 38
simspois <- as.data.frame(rpois(1000, 38))
colnames(simspois) <- "simulated_values"
# fit negative binomial curve
minus.logL.nb <- function(pars, dat) {
  mu <- pars[1]
  size <- pars[2]
  -sum(dnbinom(dat, mu = mu, size = size, log = TRUE))
}
mlenb <- optim(par = c(mu = 38, size = 1), fn = minus.logL.nb, method = "BFGS", hessian = TRUE, dat = survey.responses.baitusers$fish.per.trip)
mlenb
# simulate data from a negative binomial with size 4 and mean 38
simsnegbin <- as.data.frame(rnbinom(1000, size = 4, mu = 38))
colnames(simsnegbin) <- "simulated_valuesnb"
# graph both
library(ggplot2)
graph <- ggplot(survey.responses.baitusers) + aes(fish.per.trip) + geom_histogram() +
  geom_smooth(data = simspois, aes(simulated_values), stat = "count", color = "red") +
  geom_smooth(data = simsnegbin, aes(simulated_valuesnb), stat = "count", color = "blue")
graph
Output from negative binomial fitting:
$par
mu size
38.333338 4.107287
Output from Poisson fitting:
$par
[1] 38.33333
My data:
> survey.responses.baitusers$fish.per.trip
[1] 15 34 42 38 8 38 21 29 58 29 40 35 33 51 50 40 8 45 44 45 34 57 8 28 63 54 22 44 65 54 54 15 12
[34] 42 59 40 43 95 80 15 54 19 44 27 53 95 21 38 40 13 25 27 79 38 85 40 33 74 34 77 34 34 33 35 89 34
[67] 34 37 16 60 17 21 18 37 34 27 30 62 48 35 55 50 23 32 56 34 11 21 34 48 15 34 26 54 8 95 8 58 54
[100] 44 34 47 35 13 21 53 52 52 40 40 33 8 15 15 25 41 63 34 38 87 14 68 58 59 34 55 24 24 35 33 21 8
[133] 8 15 51 48 8 21 39 29 50 54 62 16 54 33 58 22 49 40 30 51 21 19 51 40 34 27 40 45 80 69 8 42 33
[166] 62 40 82 17 14 30 61 45 70 33 33 16 49 32 34 31 31 18 64 33 39 21 56 40 52 71 34 30 27 54 8 64 16
[199] 54 127 13 51 40 33 63 31 30 63 56 57 77 46 64 22 34 50 66 33 34 59 45 16 21 60 58 15 64 29 40 44 29
[232] 8 21 16 72 34 49 57 34 34 15 33 54 40 32 33 95 107 49 64 59 64 37 70 45 16 16 40 19 53 34 39 21 36
[265] 34 17 8 34 51 13 20 34 21 38 36 36 41 34 83 27 8 45 29 34 21 37 44 15 50 25 27 8 27 19 24 40 8
[298] 28 36 24 40 21 70 20 34 21 46 16 20 8 33 34 54 44 77 80 15 34 40 29 48 59 29 8 15 47 45 21 41 23
[331] 34 51 14 40 25 45 64 59 107 21 59 27 56 48 34 45 59 35 30 37 32 8 51 11 48 64 32 8 52 14 20 18 8
[364] 53 52 53 33 34 48 62 34 34 8 46 39 21 33 34 40 49 52 19 24 29 43 19 29 27 46 52 29 51 61 16 17 35
[397] 34 40 25 28 34 42 66 35 49 35 51 66 21 51 45 14 53 22 42 64 8 48 28 66 52 40 29 34 34 41 59 34 52
[430] 16 32 20 35 8 8 21 49 40 33 16 24 8 42 23 63 26 21 33 8 23 112 57 8 46 18 67 34 30 33 40 43 57
[463] 60 33 14 27 44 21 31 30 27 49 57 69 66 22 28 55 11 43
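A numerical comparison can complement the visual one. The sketch below is not from the thread: it takes only the first 33 values of the data above (to stay short), computes the variance-to-mean ratio as a rough overdispersion check, and compares the two fits by AIC computed from the minimised negative log-likelihood that optim returns. The names `pois_nll`, `nb_nll` and the bounds passed to optim are illustrative choices.

```r
# First 33 observations of fish.per.trip, copied from the data above
dat <- c(15, 34, 42, 38, 8, 38, 21, 29, 58, 29, 40, 35, 33, 51, 50, 40, 8,
         45, 44, 45, 34, 57, 8, 28, 63, 54, 22, 44, 65, 54, 54, 15, 12)

# A Poisson variable has variance equal to its mean, so a ratio far above 1
# indicates overdispersion
disp <- var(dat) / mean(dat)

pois_nll <- function(lambda, dat) -sum(dpois(dat, lambda, log = TRUE))
nb_nll <- function(pars, dat) {
  -sum(dnbinom(dat, mu = pars[1], size = pars[2], log = TRUE))
}

# One-parameter problem: Brent needs finite bounds (chosen loosely here)
fit_pois <- optim(par = mean(dat), fn = pois_nll, dat = dat,
                  method = "Brent", lower = 0.01, upper = 200)
# Two-parameter problem: keep mu and size strictly positive
fit_nb <- optim(par = c(mu = mean(dat), size = 1), fn = nb_nll, dat = dat,
                method = "L-BFGS-B", lower = c(0.01, 0.01))

# AIC = 2k + 2 * (minimised negative log-likelihood)
aic_pois <- 2 * 1 + 2 * fit_pois$value
aic_nb   <- 2 * 2 + 2 * fit_nb$value
c(dispersion = disp, AIC_poisson = aic_pois, AIC_negbin = aic_nb)
```

The same code runs unchanged on the full vector by replacing `dat` with survey.responses.baitusers$fish.per.trip.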

AIC for probability density function

I want R code to calculate AIC, CAIC, BIC, HQIC, W and A for this pdf
f(x) = ((a*log(b))/(x^2*b - x^2))*exp(-(a/x))*b^(exp(-(a/x)))
and cdf
F(x) = (b^(exp(-(a/x))) - 1)/(b - 1)
using these data:
1 11 4 32 23 45 115 37 29 71 39 23 21 37 20 12 13 135
49 32 64 40 77 97 97 85 10 27 7 48 35 61 79 63 16 80
108 20 52 82 50 64 59 39 9 16 78 35 66 122 89 110 44 28
65 22 59 23 31 44 21 9 45 168 73 76 118 84 85 96 78 73
91 47 32 20 23 21 24 44 21 28 9 13 46 18 13 24 16 13
23 36 7 14 30 14 18 20
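A minimal sketch of the information-criterion part is given below, assuming a > 0, b > 0 and b != 1 and fitting both parameters by maximum likelihood with optim. The parameterisation on the log scale, the starting values and the function name `neg_loglik` are illustrative assumptions. CAIC is taken here to be Bozdogan's consistent AIC; some papers use the same abbreviation for the small-sample corrected AIC, so check which one is required. The W (Cramér-von Mises) and A (Anderson-Darling) statistics are not covered.

```r
x <- c(1, 11, 4, 32, 23, 45, 115, 37, 29, 71, 39, 23, 21, 37, 20, 12, 13, 135,
       49, 32, 64, 40, 77, 97, 97, 85, 10, 27, 7, 48, 35, 61, 79, 63, 16, 80,
       108, 20, 52, 82, 50, 64, 59, 39, 9, 16, 78, 35, 66, 122, 89, 110, 44, 28,
       65, 22, 59, 23, 31, 44, 21, 9, 45, 168, 73, 76, 118, 84, 85, 96, 78, 73,
       91, 47, 32, 20, 23, 21, 24, 44, 21, 28, 9, 13, 46, 18, 13, 24, 16, 13,
       23, 36, 7, 14, 30, 14, 18, 20)

# Negative log-likelihood of
#   f(x) = a*log(b) / (x^2 * (b - 1)) * exp(-a/x) * b^exp(-a/x)
# with par = c(log(a), log(b)) so that a and b stay positive
neg_loglik <- function(par, x) {
  a  <- exp(par[1])
  lb <- par[2]                      # lb = log(b)
  g  <- exp(-a / x)
  # log(b) / (b - 1) is written as lb / expm1(lb) for numerical stability
  -sum(log(a) + log(lb / expm1(lb)) - 2 * log(x) - a / x + g * lb)
}

# Illustrative starting values a = 10, b = 10; Nelder-Mead is optim's default
fit <- optim(par = c(log(10), log(10)), fn = neg_loglik, x = x,
             control = list(maxit = 2000))

k  <- 2                # number of estimated parameters
n  <- length(x)
ll <- -fit$value       # maximised log-likelihood

aic  <- 2 * k - 2 * ll
bic  <- k * log(n) - 2 * ll
caic <- k * (log(n) + 1) - 2 * ll      # Bozdogan's consistent AIC (assumed definition)
hqic <- 2 * k * log(log(n)) - 2 * ll

estimates <- c(a = exp(fit$par[1]), b = exp(fit$par[2]))
c(AIC = aic, CAIC = caic, BIC = bic, HQIC = hqic)
```

It is worth checking fit$convergence and trying a few different starting values before trusting the estimates.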

Generate sequence with alternating increments in R? [duplicate]

This question already has answers here:
Get a seq() in R with alternating steps
(6 answers)
Closed 6 years ago.
I want to use R to create the sequence of numbers 1:8, 11:18, 21:28, etc. through 1000 (or the closest it can get, i.e. 998). Obviously typing that all out would be tedious, but since the sequence increases by one 7 times and then jumps by 3 I'm not sure what function I could use to achieve this.
I tried seq(1, 998, c(1,1,1,1,1,1,1,3)) but it does not give me the results I am looking for so I must be doing something wrong.
This is a perfect case for vectorisation (and recycling) in R; it is worth reading about both:
(1:100)[rep(c(TRUE,FALSE), c(8,2))]
# [1] 1 2 3 4 5 6 7 8 11 12 13 14 15 16 17 18 21 22 23 24 25 26 27 28 31 32
#[27] 33 34 35 36 37 38 41 42 43 44 45 46 47 48 51 52 53 54 55 56 57 58 61 62 63 64
#[53] 65 66 67 68 71 72 73 74 75 76 77 78 81 82 83 84 85 86 87 88 91 92 93 94 95 96
#[79] 97 98
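To spell out what the logical index is doing, here is a short sketch extended to the 1:1000 range asked about in the question (the variable names are illustrative). The pattern of 8 TRUE and 2 FALSE values has length 10 and is recycled along the whole vector, so in every block of ten the first eight numbers are kept and the last two are dropped.

```r
# Keep-pattern for one block of ten: eight TRUE followed by two FALSE
mask <- rep(c(TRUE, FALSE), c(8, 2))

# The length-10 mask is recycled 100 times along 1:1000
x <- (1:1000)[mask]

length(x)   # 100 blocks of 8 values each
tail(x, 8)  # the last block
```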
Or add a recycled 1:8 to each multiple of 10:
rep(seq(0,990,by=10), each=8) + seq(1,8)
You want to exclude numbers that are 0 or 9 (mod 10). So you can try this too:
n <- 1000 # upper bound
x <- 1:n
x <- x[! (x %% 10) %in% c(0,9)] # filter out (0, 9) mod (10)
head(x,80)
# [1] 1 2 3 4 5 6 7 8 11 12 13 14 15 16 17 18 21 22 23 24 25 26 27
# [24] 28 31 32 33 34 35 36 37 38 41 42 43 44 45 46 47 48 51 52 53 54 55 56 57
# [48] 58 61 62 63 64 65 66 67 68 71 72 73 74 75 76 77 78 81 82 83 84 85
# [70] 86 87 88 91 92 93 94 95 96 97 98
Or in a single line using Filter:
Filter(function(x) !((x %% 10) %in% c(0,9)), 1:100)
# [1] 1 2 3 4 5 6 7 8 11 12 13 14 15 16 17 18 21 22 23 24 25 26 27 28 31 32 33 34 35 36 37 38 41 42 43 44 45 46 47 48 51 52 53 54 55 56 57
# [48] 58 61 62 63 64 65 66 67 68 71 72 73 74 75 76 77 78 81 82 83 84 85 86 87 88 91 92 93 94 95 96 97 98
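With a loop (the result vector has to be initialised first, otherwise vector refers to the base R function of that name):
result <- integer(0)
for (value in seq(1, 991, 10)) { result <- c(result, seq(value, value + 7)) }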

How to do efficient vectorized update on multiple columns using data.tables?

I have the following code using data.frames, and I'm wondering how to write this using data.tables, using the most efficient, most vectorized code?
data.frame code:
set.seed(1)
to <- cbind(data.frame(time=seq(1:5),bananas=sample(100,5),apples=sample(100,5)),setNames(data.frame(matrix(sample(100,90,replace=T),nrow=5)),paste0(1:18)))
from <- cbind(data.frame(time=seq(1:5),blah=sample(100,5),foo=sample(100,5)),setNames(data.frame(matrix(sample(100,90,replace=T),nrow=5)),paste0(1:18)))
from
to
rownames(to) <- to$time
to[as.character(from$time),paste0(1:18)] <- from[,paste0(1:18)]
to
Running this:
> set.seed(1)
> to <- cbind(data.frame(time=seq(1:5),bananas=sample(100,5),apples=sample(100,5)),setNames(data.frame(matrix(sample(100,90,replace=T),nrow=5)),paste0(1:18)))
> from <- cbind(data.frame(time=seq(1:5),blah=sample(100,5),foo=sample(100,5)),setNames(data.frame(matrix(sample(100,90,replace=T),nrow=5)),paste0(1:18)))
> from
time blah foo 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1 1 66 22 98 2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
2 2 35 13 74 72 50 52 8 57 61 18 56 53 90 7 85 65 20 76 39 12
3 3 27 47 36 11 49 21 4 53 24 75 33 8 45 34 86 75 89 73 11 85
4 4 97 90 44 45 18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
5 5 61 58 15 65 76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79
> to
time bananas apples 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1 1 27 90 21 50 94 39 49 67 83 79 48 10 92 26 34 90 44 21 24 80
2 2 37 94 18 72 22 2 60 80 65 3 87 32 30 48 84 87 72 72 6 46
3 3 57 65 69 100 66 39 50 11 79 48 44 52 46 77 35 39 40 13 65 42
4 4 89 62 39 39 13 87 19 73 56 74 25 67 34 9 34 78 33 25 88 82
5 5 20 6 77 78 27 35 83 42 53 70 8 41 66 88 48 97 76 15 78 61
>
> rownames(to) <- to$time
> to[as.character(from$time),paste0(1:18)] <- from[,paste0(1:18)]
> to
time bananas apples 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1 1 27 90 98 2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
2 2 37 94 74 72 50 52 8 57 61 18 56 53 90 7 85 65 20 76 39 12
3 3 57 65 36 11 49 21 4 53 24 75 33 8 45 34 86 75 89 73 11 85
4 4 89 62 44 45 18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
5 5 20 6 15 65 76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79
Basically, we update columns paste0(1:18) of to from columns paste0(1:18) of from, matching up the times.
data.tables apparently have some advantages, such as not needing head when printing them at the console, so I'm thinking about using them.
However, I'd like not to have to write the := expressions by hand, i.e. try to avoid:
to[from,`1`:=i.`1`,`2`:=i.`2`, ..]
I'd also prefer to use vectorized syntax if possible, rather than some kind of for loop, i.e. try to avoid something like:
for( i in 1:18 ) {
to[from, sprintf("%d",i) := i.sprintf("%d",i)]
}
I read through the faq vignette, and the datatable-intro vignette, though I admit I probably haven't understood everything 100%.
I looked at "Loop through columns in a data.table and transform those columns", but I can't say I understand it 100%, and it seems to say that I need to use a for loop?
There does seem to be some kind of a hint at the bottom of 8374816 that it might be possible to just use data frame syntax, adding with=FALSE? But since the data.frame procedure is hacking on the row names, I'm not sure how well / if that will work, and I wonder to what extent that makes use of the efficiencies of data.table?
Good question. The base construct you've shown :
to[as.character(from$time),paste0(1:18)] <- from[,paste0(1:18)]
works assuming row names can't be duplicated, or if they are then only the first is matched to. Here, the LHS of <- has the same number of rows as the RHS of <-.
data.table is different since routinely, multiple rows in to may match; the default for mult is "all". data.table also prefers long format to wide. So this question is kind of putting data.table through its paces for something it wasn't really designed for. If you have any NA in those 18 columns (i.e. sparse), then a long format may be more appropriate. If all 18 columns are the same type, then a matrix may be more appropriate.
That said, here are three data.table options for completeness.
1. Using := but without a for loop (multiple LHS and multiple RHS in LHS:=RHS)
from = as.data.table(from)
to = as.data.table(to)
from
time blah foo 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1: 1 66 22 98 2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
2: 2 35 13 74 72 50 52 8 57 61 18 56 53 90 7 85 65 20 76 39 12
3: 3 27 47 36 11 49 21 4 53 24 75 33 8 45 34 86 75 89 73 11 85
4: 4 97 90 44 45 18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
5: 5 61 58 15 65 76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79
to
time bananas apples 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1: 1 27 90 21 50 94 39 49 67 83 79 48 10 92 26 34 90 44 21 24 80
2: 2 37 94 18 72 22 2 60 80 65 3 87 32 30 48 84 87 72 72 6 46
3: 3 57 65 69 100 66 39 50 11 79 48 44 52 46 77 35 39 40 13 65 42
4: 4 89 62 39 39 13 87 19 73 56 74 25 67 34 9 34 78 33 25 88 82
5: 5 20 6 77 78 27 35 83 42 53 70 8 41 66 88 48 97 76 15 78 61
setkey(to,time)
setkey(from,time)
to[from,paste0(1:18):=from[.GRP,paste0(1:18),with=FALSE]]
time bananas apples 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1: 1 27 90 98 2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
2: 2 37 94 74 72 50 52 8 57 61 18 56 53 90 7 85 65 20 76 39 12
3: 3 57 65 36 11 49 21 4 53 24 75 33 8 45 34 86 75 89 73 11 85
4: 4 89 62 44 45 18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
5: 5 20 6 15 65 76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79
or
to[from,paste0(1:18):=from[,paste0(1:18),with=FALSE],mult="first"]
time bananas apples 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1: 1 27 90 98 2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
2: 2 37 94 74 72 50 52 8 57 61 18 56 53 90 7 85 65 20 76 39 12
3: 3 57 65 36 11 49 21 4 53 24 75 33 8 45 34 86 75 89 73 11 85
4: 4 89 62 44 45 18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
5: 5 20 6 15 65 76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79
Note I'm using latest v1.8.3, which is needed for option 1 to work (.GRP has just been added, and the outer with=FALSE is no longer needed).
2. Use one list column to store the length 18 vectors, rather than 18 columns
to = data.table( time=seq(1:5),
bananas=sample(100,5),
apples=sample(100,5),
v18=replicate(5,sample(100,18),simplify=FALSE))
from = data.table( time=seq(1:5),
blah=sample(100,5),
foo=sample(100,5),
v18=replicate(5,sample(100,18),simplify=FALSE))
setkey(to,time)
setkey(from,time)
from
time blah foo v18
1: 1 56 97 88,47,1,71,69,18,
2: 2 69 40 96,99,60,3,33,27,
3: 3 65 84 100,38,56,72,84,55,
4: 4 98 74 91,69,24,63,27,100,
5: 5 46 52 65,4,59,41,8,51,
to
time bananas apples v18
1: 1 66 73 100,36,74,77,68,46,
2: 2 19 37 84,88,92,8,37,52,
3: 3 94 77 37,94,13,7,93,43,
4: 4 88 2 27,93,71,16,46,66,
5: 5 91 91 85,94,58,49,19,1,
to[from,v18:=i.v18]
to
time bananas apples v18
1: 1 66 73 88,47,1,71,69,18,
2: 2 19 37 96,99,60,3,33,27,
3: 3 94 77 100,38,56,72,84,55,
4: 4 88 2 91,69,24,63,27,100,
5: 5 91 91 65,4,59,41,8,51,
If you are not used to list column printing, the trailing comma signifies that more items are in that vector. Just the first 6 are printed.
3. Use data.frame syntax on the data.table
to = as.data.table(to)
from = as.data.table(from)
setkey(to,time)
setkey(from,time)
from
time blah foo 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1: 1 66 22 98 2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
2: 2 35 13 74 72 50 52 8 57 61 18 56 53 90 7 85 65 20 76 39 12
3: 3 27 47 36 11 49 21 4 53 24 75 33 8 45 34 86 75 89 73 11 85
4: 4 97 90 44 45 18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
5: 5 61 58 15 65 76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79
to
time bananas apples 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1: 1 27 90 21 50 94 39 49 67 83 79 48 10 92 26 34 90 44 21 24 80
2: 2 37 94 18 72 22 2 60 80 65 3 87 32 30 48 84 87 72 72 6 46
3: 3 57 65 69 100 66 39 50 11 79 48 44 52 46 77 35 39 40 13 65 42
4: 4 89 62 39 39 13 87 19 73 56 74 25 67 34 9 34 78 33 25 88 82
5: 5 20 6 77 78 27 35 83 42 53 70 8 41 66 88 48 97 76 15 78 61
to[from, paste0(1:18)] <- from[,paste0(1:18),with=FALSE]
to
time bananas apples 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
1: 1 27 90 98 2 100 46 58 60 69 46 62 19 29 42 64 90 30 19 72 60
2: 2 37 94 74 72 50 52 8 57 61 18 56 53 90 7 85 65 20 76 39 12
3: 3 57 65 36 11 49 21 4 53 24 75 33 8 45 34 86 75 89 73 11 85
4: 4 89 62 44 45 18 23 65 99 26 11 46 28 78 73 40 61 51 95 93 32
5: 5 20 6 15 65 76 60 93 51 73 87 51 22 89 34 39 91 88 55 29 79
So the LHS of <- can use data.table keyed join syntax; i.e. to[from]. It's just that this method (currently in R) will copy the entire to dataset. That's what := was introduced to avoid, by providing update by reference. Also, if each row in from matches multiple rows in to, then the RHS of <- would need to be expanded to line up (by you, the user); otherwise the RHS would be recycled to fill up the LHS. That's one reason why, in data.table, we like := being inside j, all inside [...].
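The options above were written against v1.8.3. As a hedged sketch for more recent data.table releases (those that support the on= argument), a multi-column update join can be written without a loop by building the i.-prefixed names programmatically and fetching them with mget. The small tables and the `cols` vector below are made up for illustration, with 3 value columns instead of 18.

```r
library(data.table)

cols <- paste0(1:3)  # value columns to update, named "1", "2", "3"

to <- data.table(time = 1:5, bananas = 11:15,
                 `1` = rep(0L, 5), `2` = rep(0L, 5), `3` = rep(0L, 5))
from <- data.table(time = c(5L, 3L, 1L), blah = 1:3,
                   `1` = c(51L, 31L, 11L),
                   `2` = c(52L, 32L, 12L),
                   `3` = c(53L, 33L, 13L))

# Update join by reference: for every row of `to` whose time matches a row of
# `from`, overwrite the columns in `cols` with the i.-prefixed columns of `from`
to[from, (cols) := mget(paste0("i.", cols)), on = "time"]
to
```

Rows of to without a match in from (time 2 and 4 here) are left untouched, and no setkey call is needed because the join column is given in on=.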
