ANOVA test on groups with replicates - r

I have 3 independent groups and want to know which of these group means are different.
My dataset looks as below
T1 T2 T3 T4 H1 H2 H3 S1 S2 S3
A 22 19 16 13 10 19 16 13 10 7
B 55 52 49 46 43 52 49 46 43 40
C 26 23 20 17 14 23 20 17 14 11
D 84 81 78 75 72 81 78 75 72 69
E 95 92 89 86 83 92 89 86 83 80
F 45 42 39 36 33 42 39 36 33 30
G 35 32 29 26 23 32 29 26 23 20
H 84 81 78 75 72 81 78 75 72 69
I 39 36 33 30 27 36 33 30 27 24
I know how to run an ANOVA when each group is a single vector, but here I have 3 groups (T, H and S) with replicates. Can someone please help me run a one-way ANOVA on the above data?
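One possible way to set this up is sketched below. It assumes that each row (A to I) is a separate variable to be tested on its own, and that the group of each column can be read from the letter prefix of its name (T, H, S); neither assumption is stated in the question. The helper name anova_p is made up for illustration.

```r
# Data from the question; the first column becomes the row names because
# the header line has one field fewer than the data rows.
dat <- read.table(header = TRUE, text = "
T1 T2 T3 T4 H1 H2 H3 S1 S2 S3
A 22 19 16 13 10 19 16 13 10 7
B 55 52 49 46 43 52 49 46 43 40
C 26 23 20 17 14 23 20 17 14 11
D 84 81 78 75 72 81 78 75 72 69
E 95 92 89 86 83 92 89 86 83 80
F 45 42 39 36 33 42 39 36 33 30
G 35 32 29 26 23 32 29 26 23 20
H 84 81 78 75 72 81 78 75 72 69
I 39 36 33 30 27 36 33 30 27 24
")

# Group label per column: strip the replicate number from the column name.
group <- factor(sub("[0-9]+$", "", names(dat)), levels = c("T", "H", "S"))

# One-way ANOVA for a single row; returns the p-value of the group effect.
anova_p <- function(values) {
  d <- data.frame(value = as.numeric(values), group = group)
  summary(aov(value ~ group, data = d))[[1]][["Pr(>F)"]][1]
}

# One p-value per row (A to I).
p_values <- apply(dat, 1, anova_p)

# Pairwise group comparisons for one row, here row A.
fit_A <- aov(value ~ group,
             data = data.frame(value = unlist(dat["A", ]), group = group))
tukey_A <- TukeyHSD(fit_A)
```

TukeyHSD reports which pairs of group means differ for that row. If many rows are tested, adjusting the p-values (for example with p.adjust) is probably worth considering.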


Create new variables by dividing all pre-existing variables by all other variables

I would like to create new variables by dividing all pre-existing variables by each other
e.g.
X1/X1, X1/X2, X1/X3, X1/X4, X1/X5, X1/X6, X1/X7, X1/X8, X1/X9, X1/X10,
X2/X1, X2/X2, X2/X3, X2/X4, X2/X5, X2/X6, X2/X7, X2/X8, X2/X9, X2/X10,
X3/X1, X3/X2 ...
I started by trying to do each individually, as below, but I need to replicate this with multiple variable names so an automation (I assume a function/lapply) would be ideal.
ds$rom_3_5m <- (ds$roll_open_mean_3m/ds$roll_open_mean_5m)
ds$rom_3_10m <- (ds$roll_open_mean_3m/ds$roll_open_mean_10m)
ds$rom_3_15m <- (ds$roll_open_mean_3m/ds$roll_open_mean_15m)
ds$rom_3_30m <- (ds$roll_open_mean_3m/ds$roll_open_mean_30m)
ds$rom_3_60m <- (ds$roll_open_mean_3m/ds$roll_open_mean_60m)
ds$rom_3_120m <- (ds$roll_open_mean_3m/ds$roll_open_mean_120m)
ds$rom_3_240m <- (ds$roll_open_mean_3m/ds$roll_open_mean_240m)
ds$rom_3_480m <- (ds$roll_open_mean_3m/ds$roll_open_mean_480m)
ds$rom_3_960m <- (ds$roll_open_mean_3m/ds$roll_open_mean_960m)
ds$rom_3_1920m <- (ds$roll_open_mean_3m/ds$roll_open_mean_1920m)
ds$rom_3_3840m <- (ds$roll_open_mean_3m/ds$roll_open_mean_3840m)
ds$rom_3_7680m <- (ds$roll_open_mean_3m/ds$roll_open_mean_7680m)
ds$rom_3_15360m <- (ds$roll_open_mean_3m/ds$roll_open_mean_15360m)
ds$rom_3_30720m <- (ds$roll_open_mean_3m/ds$roll_open_mean_30720m)
ds$rom_3_61440m <- (ds$roll_open_mean_3m/ds$roll_open_mean_61440m)
ds$rom_3_122880m <- (ds$roll_open_mean_3m/ds$roll_open_mean_122880m)
ds$rom_3_245760m <- (ds$roll_open_mean_3m/ds$roll_open_mean_245760m)
ds$rom_3_491520m <- (ds$roll_open_mean_3m/ds$roll_open_mean_491520m)
#5m
ds$rom_5_3m <- (ds$roll_open_mean_5m/ds$roll_open_mean_3m)
ds$rom_5_10m <- (ds$roll_open_mean_5m/ds$roll_open_mean_10m)
ds$rom_5_15m <- (ds$roll_open_mean_5m/ds$roll_open_mean_15m)
ds$rom_5_30m <- (ds$roll_open_mean_5m/ds$roll_open_mean_30m)
ds$rom_5_60m <- (ds$roll_open_mean_5m/ds$roll_open_mean_60m)
ds$rom_5_120m <- (ds$roll_open_mean_5m/ds$roll_open_mean_120m)
ds$rom_5_240m <- (ds$roll_open_mean_5m/ds$roll_open_mean_240m)
ds$rom_5_480m <- (ds$roll_open_mean_5m/ds$roll_open_mean_480m)
ds$rom_5_960m <- (ds$roll_open_mean_5m/ds$roll_open_mean_960m)
ds$rom_5_1920m <- (ds$roll_open_mean_5m/ds$roll_open_mean_1920m)
ds$rom_5_3840m <- (ds$roll_open_mean_5m/ds$roll_open_mean_3840m)
ds$rom_5_7680m <- (ds$roll_open_mean_5m/ds$roll_open_mean_7680m)
ds$rom_5_15360m <- (ds$roll_open_mean_5m/ds$roll_open_mean_15360m)
ds$rom_5_30720m <- (ds$roll_open_mean_5m/ds$roll_open_mean_30720m)
ds$rom_5_61440m <- (ds$roll_open_mean_5m/ds$roll_open_mean_61440m)
ds$rom_5_122880m <- (ds$roll_open_mean_5m/ds$roll_open_mean_122880m)
ds$rom_5_245760m <- (ds$roll_open_mean_5m/ds$roll_open_mean_245760m)
ds$rom_5_491520m <- (ds$roll_open_mean_5m/ds$roll_open_mean_491520m)
#10m
ds$rom_10_3m <- (ds$roll_open_mean_10m/ds$roll_open_mean_3m)
ds$rom_10_5m <- (ds$roll_open_mean_10m/ds$roll_open_mean_5m)
ds$rom_10_15m <- (ds$roll_open_mean_10m/ds$roll_open_mean_15m)
I have a data frame with 40+ variables and 6 million rows; a smaller example data frame is attached below.
Thanks in advance!
Charlie
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 57 77 48 8 31 43 47 13 26 88
2 25 75 86 77 4 65 5 49 31 57
3 91 90 42 69 82 33 56 99 47 39
4 35 96 86 77 67 77 20 17 77 92
5 6 100 50 62 16 31 0 39 72 4
6 90 34 74 89 71 37 73 45 24 28
7 24 22 92 13 57 97 32 2 12 80
8 74 59 49 2 97 100 15 37 15 67
9 43 38 66 97 8 20 85 25 97 67
10 82 4 56 40 42 46 44 98 98 76
11 60 68 92 99 81 92 78 59 23 81
12 22 57 37 100 7 1 89 41 40 56
13 69 13 1 82 89 45 83 24 71 29
14 8 14 66 48 94 8 20 3 28 63
15 26 70 56 62 9 34 11 86 71 64
16 7 55 15 100 91 89 46 74 98 14
17 29 68 19 66 83 29 84 76 90 45
18 27 76 6 48 17 28 8 7 52 37
19 68 58 51 75 60 57 74 46 98 93
20 15 15 89 55 23 3 3 8 32 37
21 78 49 57 48 96 89 4 95 67 58
22 12 36 42 59 27 92 48 0 92 28
23 51 17 77 61 84 53 46 22 27 36
24 40 84 83 35 19 13 80 78 96 87
25 44 80 25 72 43 17 74 70 52 36
26 14 61 63 82 16 47 32 93 19 84
27 93 19 28 62 74 1 85 65 50 9
28 80 62 6 58 48 97 97 18 65 43
29 12 58 95 79 37 89 89 83 22 85
30 57 73 22 88 99 63 58 87 90 66
As 27 ϕ 9 suggested in the comments, you should use the lapply solution.
This also gives you a single data frame with correct column names:
l <- lapply(df, `/`, df)
l <- unlist(l, recursive = FALSE)
data.frame(l)
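If the self-ratios (X1/X1 and so on) are not wanted, a variant along the following lines might help. It is only a sketch on a three-column toy data frame taken from the first rows of the example; the "_over_" naming scheme and the pairs helper table are assumptions made for illustration.

```r
# Toy data: the first three rows and columns of the example data frame.
df <- data.frame(X1 = c(57, 25, 91), X2 = c(77, 75, 90), X3 = c(48, 86, 42))

# All ordered (numerator, denominator) pairs, without a column over itself.
pairs <- expand.grid(den = names(df), num = names(df), stringsAsFactors = FALSE)
pairs <- pairs[pairs$num != pairs$den, ]

# One ratio vector per pair, named e.g. "X1_over_X2".
ratios <- Map(function(n, d) df[[n]] / df[[d]], pairs$num, pairs$den)
names(ratios) <- paste(pairs$num, pairs$den, sep = "_over_")
ratios <- as.data.frame(ratios)
```

With 40 variables this produces 40 * 39 = 1560 columns, so on 6 million rows the result would need roughly 75 GB as doubles. Computing only the ratios that are actually needed, or working in chunks, may be more practical.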

What ways exist to create an array with given dimensions from a given sequence in Julia?

I'm new to Julia and I could not find any useful information on the following: I would like to create an array of given dimensions and fill it with a given sequence.
m,n = 10,10 # dimensions
i = 1:100 # sequence
I've tried to use collect, but this gives me a single column array. I have also tried it the Julia way
[? for i in 1:m, j in 1:n]
but I don't know what I could insert for ?.
The easiest way is reshape(i, m,n) (potentially together with a collect if you really need an Array{Int64,2}):
julia> reshape(i,m,n)
10×10 reshape(::UnitRange{Int64}, 10, 10) with eltype Int64:
1 11 21 31 41 51 61 71 81 91
2 12 22 32 42 52 62 72 82 92
3 13 23 33 43 53 63 73 83 93
4 14 24 34 44 54 64 74 84 94
5 15 25 35 45 55 65 75 85 95
6 16 26 36 46 56 66 76 86 96
7 17 27 37 47 57 67 77 87 97
8 18 28 38 48 58 68 78 88 98
9 19 29 39 49 59 69 79 89 99
10 20 30 40 50 60 70 80 90 100
julia> collect(ans)
10×10 Array{Int64,2}:
1 11 21 31 41 51 61 71 81 91
2 12 22 32 42 52 62 72 82 92
3 13 23 33 43 53 63 73 83 93
4 14 24 34 44 54 64 74 84 94
5 15 25 35 45 55 65 75 85 95
6 16 26 36 46 56 66 76 86 96
7 17 27 37 47 57 67 77 87 97
8 18 28 38 48 58 68 78 88 98
9 19 29 39 49 59 69 79 89 99
10 20 30 40 50 60 70 80 90 100
To answer your question about what to put in place of ? in the array comprehension approach: you must convert the Cartesian index to a linear index, for example like so:
julia> [i[LinearIndices((m,n))[p,q]] for p in 1:m, q in 1:n]
10×10 Array{Int64,2}:
1 11 21 31 41 51 61 71 81 91
2 12 22 32 42 52 62 72 82 92
3 13 23 33 43 53 63 73 83 93
4 14 24 34 44 54 64 74 84 94
5 15 25 35 45 55 65 75 85 95
6 16 26 36 46 56 66 76 86 96
7 17 27 37 47 57 67 77 87 97
8 18 28 38 48 58 68 78 88 98
9 19 29 39 49 59 69 79 89 99
10 20 30 40 50 60 70 80 90 100
Of course, you can also calculate the linear index yourself, [i[(q-1)*m + p] for p in 1:m, q in 1:n].
Alternatively, you can preallocate the array and fill it in a linear fashion:
julia> result = Matrix{Int64}(undef, m,n);
julia> result[:] .= i;
julia> result
10×10 Array{Int64,2}:
1 11 21 31 41 51 61 71 81 91
2 12 22 32 42 52 62 72 82 92
3 13 23 33 43 53 63 73 83 93
4 14 24 34 44 54 64 74 84 94
5 15 25 35 45 55 65 75 85 95
6 16 26 36 46 56 66 76 86 96
7 17 27 37 47 57 67 77 87 97
8 18 28 38 48 58 68 78 88 98
9 19 29 39 49 59 69 79 89 99
10 20 30 40 50 60 70 80 90 100
which is basically equivalent to the naive, explicit solution
julia> result = Matrix{Int64}(undef, m,n);
julia> for k in eachindex(i) result[k] = i[k] end
julia> result
10×10 Array{Int64,2}:
1 11 21 31 41 51 61 71 81 91
2 12 22 32 42 52 62 72 82 92
3 13 23 33 43 53 63 73 83 93
4 14 24 34 44 54 64 74 84 94
5 15 25 35 45 55 65 75 85 95
6 16 26 36 46 56 66 76 86 96
7 17 27 37 47 57 67 77 87 97
8 18 28 38 48 58 68 78 88 98
9 19 29 39 49 59 69 79 89 99
10 20 30 40 50 60 70 80 90 100

How to cut values at a regular interval and assign them to separate groups? [duplicate]

How to cut the values (1 to 100) in a regular interval (25) and place them into 4 groups as below:
sdr <- c(1:100)
Group1: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Group2: 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
Group3: 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
Group4: 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100
Any suggestion, please.
You could use split
sdr <- 1:100
split(sdr, rep(1:4, each = 25))
#$`1`
# [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#
#$`2`
# [1] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
#
#$`3`
# [1] 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
#
#$`4`
# [1] 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94
#[20] 95 96 97 98 99 100
This returns a list with 4 vector elements.
Also note that the c() around 1:100 is not necessary.
Or we can define the number of groups
ngroup <- 4
split(sdr, rep(1:ngroup, each = length(sdr) %/% ngroup))
giving the same result.
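Both versions above rely on the length of the vector being a multiple of the group size. A sketch that fixes the chunk size instead, and simply leaves the last group shorter when the length is not a multiple, could look like this (the names size and groups are chosen here for illustration):

```r
sdr <- 1:100
size <- 25

# Element i goes to group ceiling(i / size).
groups <- split(sdr, ceiling(seq_along(sdr) / size))
names(groups) <- paste0("Group", seq_along(groups))

# The same idea with a length that is not a multiple of the chunk size:
# 10 elements in chunks of 4 give groups of length 4, 4 and 2.
uneven <- split(1:10, ceiling(seq_along(1:10) / 4))
```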
You can build a data frame with one column per group and then transpose it using t (note that the result is a matrix with one row per group):
df <- t(data.frame(Group1 = c(1:25), Group2 = c(26:50), Group3 = c(51:75), Group4 = c(76:100)))

AIC for probability density function

I would like R code to calculate the AIC, CAIC, BIC, HQIC, W and A for this pdf
f(x) = ((a*log(b))/(x^2*b - x^2))*exp(-(a/x))*b^(exp(-(a/x)))
with cdf
F(x) = (b^(exp(-(a/x))) - 1)/(b - 1)
using these data
1 11 4 32 23 45 115 37 29 71 39 23 21 37 20 12 13 135
49 32 64 40 77 97 97 85 10 27 7 48 35 61 79 63 16 80
108 20 52 82 50 64 59 39 9 16 78 35 66 122 89 110 44 28
65 22 59 23 31 44 21 9 45 168 73 76 118 84 85 96 78 73
91 47 32 20 23 21 24 44 21 28 9 13 46 18 13 24 16 13
23 36 7 14 30 14 18 20
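A possible starting point is sketched below: fit a and b by maximum likelihood with optim, then plug the maximised log-likelihood into the criteria. Several things here are assumptions rather than anything given in the question: CAIC is taken to be -2*loglik + 2kn/(n - k - 1), HQIC to be -2*loglik + 2k*log(log(n)), and W and A are taken to be the plain Cramér-von Mises and Anderson-Darling statistics computed from the fitted cdf (some papers use modified versions instead). The starting values are also arbitrary.

```r
x <- c(1, 11, 4, 32, 23, 45, 115, 37, 29, 71, 39, 23, 21, 37, 20, 12, 13, 135,
       49, 32, 64, 40, 77, 97, 97, 85, 10, 27, 7, 48, 35, 61, 79, 63, 16, 80,
       108, 20, 52, 82, 50, 64, 59, 39, 9, 16, 78, 35, 66, 122, 89, 110, 44, 28,
       65, 22, 59, 23, 31, 44, 21, 9, 45, 168, 73, 76, 118, 84, 85, 96, 78, 73,
       91, 47, 32, 20, 23, 21, 24, 44, 21, 28, 9, 13, 46, 18, 13, 24, 16, 13,
       23, 36, 7, 14, 30, 14, 18, 20)

# Negative log-likelihood of the pdf in the question.
# par = c(log(a), log(b)) keeps both parameters positive during optimisation.
negloglik <- function(par, x) {
  a <- exp(par[1])
  b <- exp(par[2])
  ll <- sum(log(a) + log(log(b) / (b - 1)) - 2 * log(x) - a / x +
              exp(-a / x) * log(b))
  if (is.finite(ll)) -ll else 1e10  # also guards the singular point b = 1
}

# Arbitrary starting values (an assumption): a = median(x), b = 2.
fit <- optim(c(log(median(x)), log(2)), negloglik, x = x)
a_hat <- exp(fit$par[1])
b_hat <- exp(fit$par[2])
loglik <- -fit$value

n <- length(x)  # sample size
k <- 2          # number of estimated parameters

aic  <- -2 * loglik + 2 * k
bic  <- -2 * loglik + k * log(n)
caic <- -2 * loglik + 2 * k * n / (n - k - 1)  # assumed definition
hqic <- -2 * loglik + 2 * k * log(log(n))      # assumed definition

# Fitted cdf evaluated at the sorted data.
cdf <- function(x, a, b) (b^exp(-a / x) - 1) / (b - 1)
u <- sort(cdf(x, a_hat, b_hat))
i <- seq_len(n)

# Unmodified Cramér-von Mises (W) and Anderson-Darling (A) statistics.
W <- 1 / (12 * n) + sum((u - (2 * i - 1) / (2 * n))^2)
A <- -n - mean((2 * i - 1) * (log(u) + log(1 - rev(u))))
```

It is worth checking fit$convergence and trying a few different starting values before trusting the estimates.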

Generate sequence with alternating increments in R? [duplicate]

I want to use R to create the sequence of numbers 1:8, 11:18, 21:28, etc. through 1000 (or the closest it can get, i.e. 998). Obviously typing that all out would be tedious, but since the sequence increases by one 7 times and then jumps by 3 I'm not sure what function I could use to achieve this.
I tried seq(1, 998, c(1,1,1,1,1,1,1,3)) but it does not give me the results I am looking for so I must be doing something wrong.
This is a perfect case for vectorisation (and recycling) in R; both are worth reading about.
(1:100)[rep(c(TRUE,FALSE), c(8,2))]
# [1] 1 2 3 4 5 6 7 8 11 12 13 14 15 16 17 18 21 22 23 24 25 26 27 28 31 32
#[27] 33 34 35 36 37 38 41 42 43 44 45 46 47 48 51 52 53 54 55 56 57 58 61 62 63 64
#[53] 65 66 67 68 71 72 73 74 75 76 77 78 81 82 83 84 85 86 87 88 91 92 93 94 95 96
#[79] 97 98
Another option is to repeat each multiple of 10 eight times and let the offsets 1 to 8 recycle:
rep(seq(0,990,by=10), each=8) + seq(1,8)
You want to exclude numbers that are 0 or 9 (mod 10). So you can try this too:
n <- 1000 # upper bound
x <- 1:n
x <- x[! (x %% 10) %in% c(0,9)] # filter out (0, 9) mod (10)
head(x,80)
# [1] 1 2 3 4 5 6 7 8 11 12 13 14 15 16 17 18 21 22 23 24 25 26 27
# 28 31 32 33 34 35 36 37 38 41 42 43 44 45 46 47 48 51 52 53 54 55 56 57
# 58 61 62 63 64 65 66 67 68 71 72 73 74 75 76 77 78 81 82 83 84 85
# 86 87 88 91 92 93 94 95 96 97 98
Or in a single line using Filter:
Filter(function(x) !((x %% 10) %in% c(0,9)), 1:100)
# [1] 1 2 3 4 5 6 7 8 11 12 13 14 15 16 17 18 21 22 23 24 25 26 27 28 31 32 33 34 35 36 37 38 41 42 43 44 45 46 47 48 51 52 53 54 55 56 57
# [48] 58 61 62 63 64 65 66 67 68 71 72 73 74 75 76 77 78 81 82 83 84 85 86 87 88 91 92 93 94 95 96 97 98
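With a loop (the result vector has to be initialised first, otherwise vector refers to the base R function of that name):
vector <- c()
for(value in seq(1,991,10)){vector <- c(vector,seq(value,value+7))}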
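As a quick sanity check, the approaches above can be run up to 1000 and compared. This is only a sketch; the names by_index, by_offset and by_filter are made up here.

```r
n <- 1000

# Logical index of length 10 (8 TRUE, 2 FALSE), recycled along 1:n.
by_index <- (1:n)[rep(c(TRUE, FALSE), c(8, 2))]

# Multiples of 10 repeated 8 times each, plus the recycled offsets 1..8.
by_offset <- rep(seq(0, n - 10, by = 10), each = 8) + 1:8

# Drop everything that is 0 or 9 modulo 10.
by_filter <- Filter(function(v) !((v %% 10) %in% c(0, 9)), 1:n)

# Compare values with ==, since by_offset is stored as double
# while the other two are integer vectors.
same <- all(by_index == by_offset) && all(by_index == by_filter)
```

Each vector has 800 elements and ends at 998, the closest value to 1000 that the pattern allows.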
