intervention analysis using Dummy data - r

I am working on a homework assignment using intervention analysis. The question is:
Generate a simulation of the difference equation y_t=a_0+〖a_1 y〗_(t-1)+〖c_0 z〗_t+x_t where x_t is the forcing process x_t=w_t, w_t is a white noise, and 〖|a〗_1 |<1. Define the intervention variable z_t as binary (0,1) but you may choose the start time of the intervention; assume the intervention lasts for 2 units of time.
So I wrote this code:
set.seed(50)
y <- w <- rnorm(200, sd=1)
alpha0 <- 1
alpha1 <- 0.9
cee0 <- 1
z <-rep(0, 200)
for (t in 1:200) {z[t] <- ifelse( t = 78:79,1,0)}
So the intervention would occur at the 78th and 79th instant.
But this does not work. I keep getting this error/warning message:
In z[t] <- ifelse(t = 77:78, 1, 0) :
number of items to replace is not a multiple of replacement length
I have tried the analysis using a continuous intervention at the 100th instant and it works fine:
z <-rep(0, 200)
for (t in 1:200) {z[t] <- ifelse( t > 100,1,0)}
So why does the t > 100 work but t = 77:78 not work? Is there something I am missing here?

You could change your command as follows.
for (t in 1:200) {z[t] <- ifelse( t %in% 78:79,1,0)}
> z
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[57] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[113] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[169] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Related

Genetic Algorithm in R: Specify number of 1s in binary chromosomes

I am using the rbga function, but my question still stands for other genetic algorithm implementations in R. Is there a way to specify the number of 1s in binary chromosomes?
I have the following example provided by the library documentation.
data(iris)
library(MASS)
X <- as.data.frame(cbind(scale(iris[,1:4]), matrix(rnorm(36*150), 150, 36)))
Y <- iris[,5]
iris.evaluate <- function(indices) {
print("Chromosome")
print(indices)
print("================================")
result = 1
if (sum(indices) > 2) {
huhn <- lda(X[,indices==1], Y, CV=TRUE)$posterior
result = sum(Y != dimnames(huhn)[[2]][apply(huhn, 1,
function(x)
which(x == max(x)))]) / length(Y)
}
result
}
monitor <- function(obj) {
minEval = min(obj$evaluations);
plot(obj, type="hist");
}
woppa <- rbga.bin(size=40, mutationChance=0.05, zeroToOneRatio=10,
evalFunc=iris.evaluate, showSettings=TRUE, verbose=TRUE)
Here are some of the chromosomes.
"Chromosome"
0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
"================================"
"Chromosome"
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0
"================================"
"Chromosome"
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0
"================================"
"Chromosome"
0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
"================================"
The 1s (i.e., the chosen characteristics) are 5, 8, 5 and 4 respectively.
I am trying to follow the technique specified in a paper and they claim that they apply a genetic algorithm and in the end they pick a specific number of characteristics.
Is it possible to specify in a genetic algorithm the number of characteristics that I want my solution(s)/chromosome(s) to have?
Could this be done on the final solution/chromosome and if yes how?

Split variable into multiple multiple factor variables

I have some dataset similar to this:
df <- data.frame(n = seq(1:1000000), x = sample(LETTERS, 1000000, replace = T))
I'm looking for a guidance in finding a way to split variable x into multiple categorical variables with range 0-1
In the end it would look like this:
n x A B C D E F G H . . .
1 D 0 0 0 1 0 0 0 0 . . .
2 B 0 1 0 0 0 0 0 0 . . .
3 F 0 0 0 0 0 1 0 0 . . .
In my dataset, there's way more codes in variable x so adding each new variable manually would be too time consuming.
I was thinking about sorting codes in var x and assigning them an unique number each, then creating an iterating loop that creates new variable for each code in variable x.
But i feel like i'm overcomplicating things
A fast and easy way is to use fastDummies::dummy_cols:
fastDummies::dummy_cols(df, "x")
An alternative with tidyverse functions:
library(tidyverse)
df %>%
left_join(., df %>% mutate(value = 1) %>%
pivot_wider(names_from = x, values_from = value, values_fill = 0) %>%
relocate(n, sort(colnames(.)[-1])))
output
> dummmy <- fastDummies::dummy_cols(df, "x")
> colnames(dummy)[-c(1,2)] <- LETTERS
> dummy
n x A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
1 1 Z 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
2 2 Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
3 3 E 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 4 H 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
5 5 T 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
6 6 X 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
7 7 R 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
8 8 F 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9 9 Z 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
10 10 S 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
Benchmark
Since there are many solutions and the question involves a large dataset, a benchmark might help. The nnet solution is the fastest according to the benchmark.
set.seed(1)
df <- data.frame(n = seq(1:1000000), x = sample(LETTERS, 1000000, replace = T))
library(microbenchmark)
bm <- microbenchmark(
fModel.matrix(),
fContrasts(),
fnnet(),
fdata.table(),
fFastDummies(),
fDplyr(),
times = 10L,
setup = gc(FALSE)
)
autoplot(bm)
Using match. First create a vector of zeroes, then match letter of df row with vector from the alphabet and turn to 1. You may use builtin LETTERS constant. Finally Vectorize the thing and cbind.
f <- \(x) {
z <- numeric(length(LETTERS))
z[match(x, LETTERS)] <- 1
setNames(z, LETTERS)
}
cbind(df, t(Vectorize(f)(df$x)))
# n x A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
# Q 1 Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
# E 2 E 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# A 3 A 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# Y 4 Y 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
# J 5 J 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# D 6 D 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# R 7 R 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
# Z 8 Z 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
# Q.1 9 Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
# O 10 O 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
Alternatively, transform x to a factor with LETTERS as levels and use model.matrix.
df <- transform(df, x=factor(x, levels=LETTERS))
cbind(df, `colnames<-`(model.matrix(~ 0 + x, df), LETTERS))
# n x A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
# 1 1 Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
# 2 2 E 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# 3 3 A 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# 4 4 Y 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
# 5 5 J 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# 6 6 D 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# 7 7 R 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
# 8 8 Z 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
# 9 9 Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
# 10 10 O 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
Data:
n <- 10
set.seed(42)
df <- data.frame(n = seq(1:n), x = sample(LETTERS, n, replace = T))
using data.table
library(data.table)
setDT(df) #make df a data.table if needed
merge(df, dcast(df, n ~ x, fun.agg = length), by = c("n"))
The main question here is that of resources? I think. I found using nnet is a fast solution:
library(nnet)
library(dplyr)
df %>% cbind(class.ind(.$x) == 1) %>%
mutate(across(-c(n, x), ~.*1))
n x A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
1 1 E 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 2 H 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 3 L 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 4 M 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
5 5 R 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
6 6 A 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
7 7 Y 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
8 8 Y 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
9 9 F 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
10 10 U 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
11 11 O 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
12 12 I 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
13 13 O 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
14 14 Z 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
15 15 P 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
16 16 T 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
17 17 F 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
18 18 K 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
19 19 H 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
20 20 V 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
21 21 V 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
22 22 G 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
23 23 P 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
24 24 Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
25 25 V 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
26 26 R 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
27 27 Q 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
28 28 B 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
29 29 D 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
30 30 M 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
31 31 E 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
32 32 V 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
33 33 S 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
34 34 Y 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
35 35 T 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
[ reached 'max' / getOption("max.print") -- omitted 999965 rows ]
>
Another option would be to use ==.
. <- unique(df$x)
cbind(df, +do.call(cbind, lapply(setNames(., .), `==`, df$x)))
# n x C I L T Y
#1 1 I 0 1 0 0 0
#2 2 C 1 0 0 0 0
#3 3 C 1 0 0 0 0
#4 4 Y 0 0 0 0 1
#5 5 L 0 0 1 0 0
#6 6 T 0 0 0 1 0
#...
Or in one line using sapply.
cbind(df, +sapply(unique(df$x), `==`, df$x))
Or use contrasts and match them to df$x.
. <- contrasts(as.factor(df$x), FALSE)
#. <- contrasts(as.factor(unique(df$x)), FALSE) #Alternative
cbind(df, .[match(df$x, rownames(.)),])
#cbind(df, .[fastmatch::fmatch(df$x, rownames(.)),]) #Alternative
Or indexing in a matrix.
. <- unique(df$x) #Could be sorted
#. <- collapse::funique(df$x) #Alternative
#. <- kit::funique(df$x) #Alternative
i <- match(df$x, .)
#i <- fastmatch::fmatch(df$x, .) #Alternative
#i <- data.table::chmatch(df$x, .) #Alternative
nc <- length(.)
nr <- length(i)
cbind(df, matrix(`[<-`(integer(nc * nr), 1:nr + nr * (i - 1), 1), nr, nc,
dimnames=list(NULL, .)))
Or using outer.
. <- unique(df$x)
cbind(df, +outer(df$x, setNames(., .), `==`))
Or using rep and m̀atrix`.
. <- unique(df$x)
n <- nrow(df)
cbind(df, +matrix(df$x == rep(., each=n), n, dimnames=list(NULL, .)))
Benchmark of some methods which will work for more codes in variable x and not only for e.g. LETTERS.
set.seed(42)
df <- data.frame(n = seq(1:1000000), x = sample(LETTERS, 1000000, replace = T))
library(nnet)
library(dplyr)
microbenchmark::microbenchmark(times = 10L, setup = gc(FALSE), control=list(order="block")
, "nnet" = df %>% cbind(class.ind(.$x) == 1) %>%
mutate(across(-c(n, x), ~.*1))
, "contrasts" = {. <- contrasts(as.factor(df$x), FALSE)
cbind(df, .[match(df$x, rownames(.)),])}
, "==" = {. <- unique(df$x)
cbind(df, +do.call(cbind, lapply(setNames(., .), `==`, df$x)))}
, "==Sapply" = cbind(df, +sapply(unique(df$x), `==`, df$x))
, "matrix" = {. <- unique(df$x)
i <- match(df$x, .)
nc <- length(.)
nr <- length(i)
cbind(df, matrix(`[<-`(integer(nc * nr), 1:nr + nr * (i - 1), 1), nr, nc,
dimnames=list(NULL, .)))}
, "outer" = {. <- unique(df$x)
cbind(df, +outer(df$x, setNames(., .), `==`))}
, "rep" = {. <- unique(df$x)
n <- nrow(df)
cbind(df, +matrix(df$x == rep(., each=n), n, dimnames=list(NULL, .)))}
)
Result
Unit: milliseconds
expr min lq mean median uq max neval
nnet 208.6898 220.2304 326.2210 305.5752 386.3385 541.0621 10
contrasts 1110.0123 1168.7651 1263.5357 1216.1403 1357.0532 1514.4411 10
== 146.2217 156.8141 208.2733 185.1860 275.3909 278.8497 10
==Sapply 290.0458 291.4543 301.3010 295.0557 298.0274 358.0531 10
matrix 302.9993 304.8305 312.9748 306.8981 310.0781 363.0773 10
outer 524.5230 583.5224 603.3300 586.3054 595.4086 807.0260 10
rep 276.2110 285.3983 389.8187 434.2754 435.8607 442.3403 10

Simplifying/condensing long time series functions in R

I have a long time series loop that I would like to simplify/condense. I am trying to simulate the calving of a cattle herd over a period of ten years (monthly intervals) using a random binomial distribution. The function starts with the assumption that the cattle have been covered by the bull. Each variable is affected by the previous. The variables are as follows:
G1:G9 gestation for each month.
MC1:MC7 mothers with calves for 7 months, then after the calves are weaned.
Rest1:Rest6 periods of rest before they are covered by the bull again.
DeadCows based on the mortality rate.
NPreg non-pregnant cows based on the conception rate.
Inputs:
size_cowherd, number of cattle in the herd.
concep, conception rate.
Thanks in advance.
The code I have is as follows:
size_cowherd<-100
concep<-0.95
cows <- function(t=119, mort=0.0005){
G1<- numeric(length = t + 1)
G2<- numeric(length = t + 1)
G3<- numeric(length = t + 1)
G4<- numeric(length = t + 1)
G5<- numeric(length = t + 1)
G6<- numeric(length = t + 1)
G7<- numeric(length = t + 1)
G8<- numeric(length = t + 1)
G9<- numeric(length = t + 1)
MC1<- numeric(length = t + 1)
MC2<- numeric(length = t + 1)
MC3<- numeric(length = t + 1)
MC4<- numeric(length = t + 1)
MC5<- numeric(length = t + 1)
MC6<- numeric(length = t + 1)
MC7<- numeric(length = t + 1)
Rest1<- numeric(length = t + 1)
Rest2<- numeric(length = t + 1)
Rest3<- numeric(length = t + 1)
Rest4<- numeric(length = t + 1)
Rest5<- numeric(length = t + 1)
Rest6<- numeric(length = t + 1)
DeadCows <- numeric(length = t + 1)
NPreg <- numeric(length = t + 1)
G1[1]<- rbinom(1,size_cowherd,concep)
G2[1]<- 0
G3[1]<- 0
G4[1]<- 0
G5[1]<- 0
G6[1]<- 0
G7[1]<- 0
G8[1]<- 0
G9[1]<- 0
MC1[1]<- 0
MC2[1]<- 0
MC3[1]<- 0
MC4[1]<- 0
MC5[1]<- 0
MC6[1]<- 0
MC7[1]<- 0
Rest1[1]<-0
Rest2[1]<-0
Rest3[1]<-0
Rest4[1]<-0
Rest5[1]<-0
Rest6[1]<-0
DeadCows[1] <- 0
NPreg[1] <- size_cowherd - G1[1]
for(step in 1:t){
G2[step+1] <- rbinom(1, G1[step], (1-mort))
G3[step+1] <- rbinom(1, G2[step], (1-mort))
G4[step+1] <- rbinom(1, G3[step], (1-mort))
G5[step+1] <- rbinom(1, G4[step], (1-mort))
G6[step+1] <- rbinom(1, G5[step], (1-mort))
G7[step+1] <- rbinom(1, G6[step], (1-mort))
G8[step+1] <- rbinom(1, G7[step], (1-mort))
G9[step+1] <- rbinom(1, G8[step], (1-mort))
MC1[step+1] <- rbinom(1, G9[step], (1-mort))
MC2[step+1] <- rbinom(1, MC1[step], (1-mort))
MC3[step+1] <- rbinom(1, MC2[step], (1-mort))
MC4[step+1] <- rbinom(1, MC3[step], (1-mort))
MC5[step+1] <- rbinom(1, MC4[step], (1-mort))
MC6[step+1] <- rbinom(1, MC5[step], (1-mort))
MC7[step+1] <- rbinom(1, MC6[step], (1-mort))
Rest1[step+1] <- rbinom(1,MC7[step],(1-mort))
Rest2[step+1] <- rbinom(1,Rest1[step],(1-mort))
Rest3[step+1] <- rbinom(1,Rest2[step],(1-mort))
Rest4[step+1] <- rbinom(1,Rest3[step],(1-mort))
Rest5[step+1] <- rbinom(1,Rest4[step],(1-mort))
Rest6[step+1] <- rbinom(1,Rest5[step],(1-mort))
G1[step+1] <- rbinom(1, Rest6[step], (1-mort))
DeadCows[step+1] <-sum(G1[step]-G2[step+1],G2[step]-G3[step+1],G3[step]-
G4[step+1],G4[step]-G5[step+1],G5[step]-G6[step+1],G6[step]-
G7[step+1],G7[step]-G8[step+1],G8[step]-G9[step+1],G9[step]-
MC1[step+1],MC1[step]-MC2[step+1],MC2[step]-MC3[step+1],MC3[step]-
MC4[step+1],MC4[step]-MC5[step+1],MC5[step]-MC6[step+1],MC6[step]-
MC7[step+1],MC7[step]-Rest1[step+1],Rest1[step]-
Rest2[step+1],Rest2[step]-Rest3[step+1],Rest3[step]-
Rest4[step+1],Rest4[step]-Rest5[step+1],Rest5[step]-
Rest6[step+1],Rest6[step]-G1[step+1])
if(G1[step]<size_cowherd){
G1[step+1]<- rbinom(1,Rest6[step], concep)
NPreg[step+1]<-Rest6[step]-G1[step+1]
}
}
out <-cbind(G1,G2,G3,G4,G5,G6,G7,G8,G9,MC1,MC2,MC3,MC4,MC5,MC6,MC7,Rest1,R
est2,Rest3,Rest4,Rest5,Rest6,DeadCows,NPreg)
return(out)
}
Below is a sample of what the output should look like. In the 23rd month, the cycle restarts again.
G1 G2 G3 G4 G5 G6 G7 G8 G9 MC1 MC2 MC3 MC4 MC5 MC6 MC7 Rest1 Rest2 Rest3
Rest4 Rest5 Rest6 DeadCows NPreg
1 96 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 4
2 0 96 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0
3 0 0 96 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0
4 0 0 0 96 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0
5 0 0 0 0 96 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0
6 0 0 0 0 0 96 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0
7 0 0 0 0 0 0 96 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0
8 0 0 0 0 0 0 0 96 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0
9 0 0 0 0 0 0 0 0 96 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0
10 0 0 0 0 0 0 0 0 0 96 0 0 0 0 0 0 0 0 0
0 0 0 0 0
11 0 0 0 0 0 0 0 0 0 0 96 0 0 0 0 0 0 0 0
0 0 0 0 0
12 0 0 0 0 0 0 0 0 0 0 0 96 0 0 0 0 0 0 0
0 0 0 0 0
13 0 0 0 0 0 0 0 0 0 0 0 0 96 0 0 0 0 0 0
0 0 0 0 0
14 0 0 0 0 0 0 0 0 0 0 0 0 0 96 0 0 0 0 0
0 0 0 0 0
15 0 0 0 0 0 0 0 0 0 0 0 0 0 0 96 0 0 0 0
0 0 0 0 0
16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 96 0 0 0
0 0 0 0 0
17 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 96 0 0
0 0 0 0 0
18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 96 0
0 0 0 0 0
19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 95
0 0 0 1 0
20 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
95 0 0 0 0
21 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 95 0 0 0
22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 94 1 0
Something like this should work for you. I think the trick here is to make use of matrices to keep the bookkeeping a bit more straightforward.
size_cowherd <- 100
concep <- 0.95
stage_names <- c(paste0("G",seq(9)), paste0("MC",seq(7)), paste0("Rest",seq(6)))
cows <- function(size_cowherd, concep, t=220, mort=0.0005, names=stage_names) {
n_stages <- length(names)
stages <- matrix(0, t, n_stages)
dead_cows <- n_preg <- rep(NA, t)
stages[1,1] <- rbinom(1, size_cowherd, concep)
dead_cows[1] <- 0
n_preg[1] <- size_cowherd - stages[1,1]
for(tt in 2:t) {
stages[tt,1] <- rbinom(1, stages[tt-1,n_stages], 1-mort)
for(i in 2:n_stages) {
stages[tt,i] <- rbinom(1, stages[tt-1,i-1], 1-mort)
}
dead_cows[tt] <- sum(stages[tt-1,] - stages[tt,c(2:n_stages,1)])
if(stages[tt-1,1] < size_cowherd) {
stages[tt, 1] <- rbinom(1, stages[tt-1,n_stages], concep)
n_preg[tt] <- stages[tt-1,n_stages] - stages[tt,1]
}
}
res <- cbind(stages, dead_cows, n_preg)
colnames(res) <- c(names, "Dead", "N_Preg")
return(res)
}
head(cows(100, 0.95), 24)
G1 G2 G3 G4 G5 G6 G7 G8 G9 MC1 MC2 MC3 MC4 MC5 MC6 MC7 Rest1 Rest2 Rest3 Rest4 Rest5 Rest6 Dead N_Preg
[1,] 96 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4
[2,] 0 96 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[3,] 0 0 96 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[4,] 0 0 0 96 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[5,] 0 0 0 0 96 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[6,] 0 0 0 0 0 96 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[7,] 0 0 0 0 0 0 95 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
[8,] 0 0 0 0 0 0 0 95 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[9,] 0 0 0 0 0 0 0 0 95 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[10,] 0 0 0 0 0 0 0 0 0 95 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[11,] 0 0 0 0 0 0 0 0 0 0 95 0 0 0 0 0 0 0 0 0 0 0 0 0
[12,] 0 0 0 0 0 0 0 0 0 0 0 95 0 0 0 0 0 0 0 0 0 0 0 0
[13,] 0 0 0 0 0 0 0 0 0 0 0 0 95 0 0 0 0 0 0 0 0 0 0 0
[14,] 0 0 0 0 0 0 0 0 0 0 0 0 0 95 0 0 0 0 0 0 0 0 0 0
[15,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 95 0 0 0 0 0 0 0 0 0
[16,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 95 0 0 0 0 0 0 0 0
[17,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 94 0 0 0 0 0 1 0
[18,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 94 0 0 0 0 0 0
[19,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 94 0 0 0 0 0
[20,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 94 0 0 0 0
[21,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 94 0 0 0
[22,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 94 0 0
[23,] 92 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
[24,] 0 92 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

R circlize: Error in circos.initialize

I am able to follow the Circlize example in the description of the package on CRAN easily:
library('circlize')
set.seed(123)
mat = matrix(sample(1:100, 18, replace = TRUE), 3, 6)
rownames(mat) = letters[1:3]
colnames(mat) = LETTERS[1:6]
### basic settings
par(mfrow = c(3, 2))
par(mar = c(1, 1, 1, 1))
chordDiagram(mat)
however, when I replace mat with myMatrix I get this error:
Error in circos.initialize(factors = factor(cate, levels = cate), xlim = cbind(rep(0, :
Since `xlim` is a matrix, it should have same number of rows as the length of the level of `factors` and number of columns of 2.
Can somebody explain why I am getting that message? I do not see a difference between mat and myMatrix other than myMatrix is larger:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z A2 B2 C2 D2
A 1060360.659 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
B 0 32143148.75 996976.8445 0 4944648.524 5688385.041 61990.5913 0 0 0 0 -1563.225 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 31922242.6
C 0 0 6342776.843 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
D 0 0 0 28617385.81 17842142.64 0 0 0 0 0 0 0 0 409444.5633 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
E 0 0 0 4990921.202 105686446.3 536246.2188 0 0 0 0 0 0 0 8587899.583 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 378565.5746
F 0 92732.7741 0 4282.9319 33543553.89 36773976.59 1894761.93 0 0 333209.342 0 20739.0655 327956.7365 0 1022673.163 12229.0255 0 0 386112.1743 224039.3207 0 2395066.197 268247.2897 0 0 0 0 0 0 11926701.96
G 0 0 0 0 0 0 7753767.003 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
H 0 0 0 0 0 5184133.29 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
I 0 0 0 0 462767.7374 0 0 0 8992223.296 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
J 0 0 0 0 0 0 0 0 0 1950552.642 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
K 0 0 0 0 891032.5584 0 0 0 0 0 520107.9821 0 0 0 0 0 0 0 0 0 0 0 0 0 0 26724.8402 0 0 0 418902.5203
L 0 0 0 0 32044317.54 28147.5693 0 0 0 0 0 5383919.293 0 489912.5412 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4559115.003
M 0 0 0 0 0 3125823.41 0 0 0 0 0 0 1738293.164 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
N 0 1053825.966 -8526.9758 1283429.314 60333051.34 2621812.931 -1130.1924 0 -779545.8004 8055145.684 918.8702 -379747.1919 -177.6205 298563606.5 -9316.8654 0 0 0 0 0 2631991.077 0 0 0 0 0 1107369.803 0 0 118812465
O 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1500451.292 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7432418.396
P 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Q 0 0 1496058.76 0 -4056617.74 294503 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 410.4 0 0 0 0 0 0 0 1765984767
Code
dd <- read.table(header = TRUE, text = " rn A B C D E F G H I J K L M N O P Q R S T U V W X Y Z A2 B2 C2 D2
A 1060360.659 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
B 0 32143148.75 996976.8445 0 4944648.524 5688385.041 61990.5913 0 0 0 0 -1563.225 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 31922242.6
C 0 0 6342776.843 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
D 0 0 0 28617385.81 17842142.64 0 0 0 0 0 0 0 0 409444.5633 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
E 0 0 0 4990921.202 105686446.3 536246.2188 0 0 0 0 0 0 0 8587899.583 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 378565.5746
F 0 92732.7741 0 4282.9319 33543553.89 36773976.59 1894761.93 0 0 333209.342 0 20739.0655 327956.7365 0 1022673.163 12229.0255 0 0 386112.1743 224039.3207 0 2395066.197 268247.2897 0 0 0 0 0 0 11926701.96
G 0 0 0 0 0 0 7753767.003 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
H 0 0 0 0 0 5184133.29 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
I 0 0 0 0 462767.7374 0 0 0 8992223.296 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
J 0 0 0 0 0 0 0 0 0 1950552.642 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
K 0 0 0 0 891032.5584 0 0 0 0 0 520107.9821 0 0 0 0 0 0 0 0 0 0 0 0 0 0 26724.8402 0 0 0 418902.5203
L 0 0 0 0 32044317.54 28147.5693 0 0 0 0 0 5383919.293 0 489912.5412 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4559115.003
M 0 0 0 0 0 3125823.41 0 0 0 0 0 0 1738293.164 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
N 0 1053825.966 -8526.9758 1283429.314 60333051.34 2621812.931 -1130.1924 0 -779545.8004 8055145.684 918.8702 -379747.1919 -177.6205 298563606.5 -9316.8654 0 0 0 0 0 2631991.077 0 0 0 0 0 1107369.803 0 0 118812465
O 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1500451.292 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7432418.396
P 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Q 0 0 1496058.76 0 -4056617.74 294503 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 410.4 0 0 0 0 0 0 0 1765984767")
myMatrix <- as.matrix(dd[, -1])
rownames(myMatrix) <- dd[, 1]
chordDiagram(myMatrix)
In the old version of circlize, the matrix must be of a matrix class instead of a data.frame, so you need to convert the data frame explicitly by:
myMatrix = as.matrix(A + B)
In circlize, a data frame is for data stored as a adjacency list (e.g the first column for group1, second column for group2, third column for the strength of the relation).
Since read.table() always returns a data.frame class, in the newer version of circlize, it is fine if the matrix represents as a data frame. When it is a data frame, the chordDiagram() will first check whether the number of columns is larger than 3 and all columns are numeric. If so, it will be converted to a matrix internally.

Remove labels from imported file in R

I am importing a file and trying to display only the numbers in each row, with any commas or labels. With the following code, my output is given below:
mydata <- read.table("/home/mukhera3/Desktop/Test/part-r-00000", sep=",")
mydata
Output
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
V461 V462 V463 V464 V465 V466 V467 V468 V469 V470 V471 V472 V473 V474 V475 V476 V477 V478 V479 V480 V481 V482 V483
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
V484 V485 V486 V487 V488 V489 V490 V491 V492 V493 V494 V495 V496 V497 V498 V499 V500 V501 V502 V503 V504 V505 V506
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
V507 V508 V509 V510 V511 V512 V513 V514 V515 V516 V517 V518 V519 V520 V521 V522 V523 V524 V525 V526 V527 V528 V529
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
V530 V531 V532 V533 V534 V535 V536 V537 V538 V539 V540 V541 V542 V543 V544 V545 V546 V547 V548 V549 V550 V551 V552
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
V553 V554 V555 V556 V557 V558 V559 V560 V561 V562 V563 V564 V565 V566 V567 V568 V569 V570 V571 V572 V573 V574 V575
When I replace the "," for sep with whitespace (sep=""), keeping everything else the same. this is what I get:
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
I want to display the numbers 0,1 .. without any commas or other row numbers etc. I am new to R programming, and do not know how to do this. Any help would be appreciated.
If you want your file to be read directly as a vector and not as a dataframe, you can, for instance, use scan instead of read.table. Example with your example file saved as a.txt in my working directory:
> mydata <- scan(file="a.txt",sep=",")
Read 46 items
> mydata
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
You can also get that result from read.table with some additional steps:
> mydata <- read.table("a.txt",sep=",") # Reads your file as a data.frame
> mydata <- unlist(mydata) # Transforms into a named vector
> names(mydata) <- NULL # Gets rid of the names
> mydata
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
If you just want to "display" it like that but don't want to change the nature of your table, you can simply use cat (combined with unlist):
> mydata <- read.table("a.txt",sep=",")
> cat(unlist(mydata))
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0

Resources