R - replace values in data frame using lookup table

I was having some trouble recently trying to replace specific values in a data frame or matrix using a lookup table.
This represents the original.data to be modified ...
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14
1 255 255 255 255 255 255 255 255 255 255 255 255 255 255
2 255 255 255 255 255 255 255 255 3 3 255 255 255 255
3 255 255 255 255 255 1 3 3 3 3 3 255 255 255
4 255 255 5 5 5 1 3 3 4 4 3 255 255 255
5 255 5 5 5 5 1 3 4 4 4 4 255 255 255
6 255 5 5 5 1 3 3 3 4 4 3 3 255 255
7 255 255 5 1 3 3 3 3 6 6 6 3 255 255
8 255 255 1 1 1 1 2 2 3 3 6 3 255 255
9 255 255 1 1 1 2 2 2 2 2 3 3 3 255
10 255 255 255 1 2 2 2 2 2 2 2 3 3 255
11 255 255 255 2 2 2 2 2 7 7 7 2 255 255
12 255 255 255 2 2 8 8 8 7 255 255 255 255 255
13 255 255 255 255 8 8 255 255 255 255 255 255 255 255
14 255 255 255 255 255 255 255 255 255 255 255 255 255 255
... and the following is the lookup.table (rows = 1:9, column 1 = "Sub", column 2 = "Main"):
Sub Main
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 255 255
7 6 3
8 7 2
9 8 2
The aim is to compare, e.g., original.data[11,11] (value 7) with lookup.table[8,"Sub"] (value 7) ...
... and to write a new matrix in which modified.data[11,11] holds lookup.table[8,"Main"] (value 2).
Until now, all I came up with is for-loops and an if-statement:
for (i in 1:ncol(original.data)){
  for (j in 1:nrow(lookup.table)){
    if (original.data[i,i]==lookup.table[j,1]){
      origingal.data[j,i]<-lookup.table[j,2]
    }
  }
}
which leads to
Error in origingal.data[j, i] <- lookup.table[j, 2] :
object 'origingal.data' not found
but I cannot figure out my errors in reasoning.
I'd love to get some hints.
Thanks
PROBLEM SOLVED
for (i in 1:ncol(original.data)){
  for (j in 1:nrow(original.data)){
    for (x in 1:nrow(lookup.table)){
      if (original.data[j,i]==lookup.table[x,1]){
        original.data[j,i]<-lookup.table[x,2]
      }
    }
  }
}
... works, but this is a much faster method:
original.data_modified <- original.data
for(i in 1:nrow(lookup.table)){
  c <- lookup.table[i, "Sub"]
  d <- lookup.table[i, "Main"]
  # match against the unmodified original.data so replacements don't cascade
  original.data_modified[original.data == c] <- d
}

You can try:
# x is the original.data (a matrix)
# y is the lookup.table (column 1 = Sub, column 2 = Main)
x2 <- y[match(x, y[,1]), 2]
dim(x2) <- dim(x)
table(x, x2)
     x2
x       1   2   3   4   5 255
  1    13   0   0   0   0   0
  2     0  22   0   0   0   0
  3     0   0  29   0   0   0
  4     0   0   0   8   0   0
  5     0   0   0   0  11   0
  6     0   0   4   0   0   0
  7     0   4   0   0   0   0
  8     0   5   0   0   0   0
  255   0   0   0   0   0 100
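For completeness, a minimal sketch of the same lookup with a named vector (my own variant, assuming lookup.table is a data frame with columns "Sub" and "Main" as shown above):
# build a name -> value map from the lookup table
map <- setNames(lookup.table$Main, lookup.table$Sub)
# index the map by the character form of each cell, then restore the dimensions
modified.data <- matrix(map[as.character(original.data)],
                        nrow = nrow(original.data),
                        dimnames = dimnames(original.data))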

Related

Using lme to generate an MMRM model in R - incompatible formulas for groups in 'random' and 'correlation'

I'm trying to fit an MMRM to a dataset; specific covariates have been pre-specified as random.
Previously, I have used gls() and simply accounted for the within-subject correlation via an unstructured correlation matrix, for example:
nlme::corSymm(form = ~time | pt_num)
Since gls() doesn't allow random effects, I'm trying to use lme(), but I'm getting the error:
Error in lme.formula(la_mm2 ~ factor(arm) + factor(partI_bool) + la_base_mm2, :
incompatible formulas for groups in 'random' and 'correlation'
I've pasted some of the dummy data below, as well as the model I'm using; any clarity on what is causing the issue is highly appreciated!
# Model
gather1_mmrm <- lme(la_mm2 ~ factor(arm) + factor(partI_bool) + la_base_mm2,
                    data = gather1,
                    method = "REML",
                    random = ~ t_factor + t_factor*factor(arm) + t_factor*factor(partI_bool) + t_factor*la_base_mm2 | pt_id,
                    na.action = na.omit,
                    correlation = nlme::corSymm(form = ~ t_factor | pt_num),
                    weights = nlme::varIdent(form = ~ 1 | t_factor))
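One thing that stands out: the random formula groups by pt_id while the correlation formula groups by pt_num, and lme() requires these grouping factors to agree. A minimal sketch with the groupings aligned, assuming pt_id and pt_num identify the same patients (and with a simplified random part for illustration):
# sketch only: both 'random' and 'correlation' now group by pt_id
# (assumes pt_id and pt_num label the same patients, and that
# t_factor is numeric 1, 2, 3 as in the dummy data)
gather1_mmrm <- lme(la_mm2 ~ factor(arm) + factor(partI_bool) + la_base_mm2,
                    data = gather1,
                    method = "REML",
                    random = ~ t_factor | pt_id,
                    na.action = na.omit,
                    correlation = nlme::corSymm(form = ~ t_factor | pt_id),
                    weights = nlme::varIdent(form = ~ 1 | t_factor))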
run pt_id arm partI_bool la_base_mm2 la_base_mm t_factor t_months la_mm2 la_mm ch_la_mm2 ch_la_mm
1 4 SHAM 1 4.302596153 2.074270029 1 0 4.302596153 2.074270029 0 0
1 4 SHAM 1 4.302596153 2.074270029 2 6 NA NA NA NA
1 4 SHAM 1 4.302596153 2.074270029 3 12 NA NA NA NA
1 12 SHAM 1 14.34691312 3.787731923 1 0 14.34691312 3.787731923 0 0
1 12 SHAM 1 14.34691312 3.787731923 2 6 15.64964748 3.955963533 1.302734357 1.302734357
1 12 SHAM 1 14.34691312 3.787731923 3 12 17.78969962 4.217783733 3.442786498 3.442786498
1 13 SHAM 1 5.356110596 2.314327245 1 0 5.356110596 2.314327245 0 0
1 13 SHAM 1 5.356110596 2.314327245 2 6 7.10663327 2.665826939 1.750522674 1.750522674
1 13 SHAM 1 5.356110596 2.314327245 3 12 9.041039758 3.00683218 3.684929162 3.684929162
1 14 SHAM 1 11.25759063 3.35523332 1 0 11.25759063 3.35523332 0 0
1 14 SHAM 1 11.25759063 3.35523332 2 6 13.61993655 3.690519821 2.362345918 2.362345918
1 14 SHAM 1 11.25759063 3.35523332 3 12 16.42931581 4.053309242 5.171725175 5.171725175
1 19 SHAM 1 2.758491755 1.660870782 1 0 2.758491755 1.660870782 0 0
1 19 SHAM 1 2.758491755 1.660870782 2 6 3.739763552 1.933846828 0.981271798 0.981271798
1 19 SHAM 1 2.758491755 1.660870782 3 12 4.772096698 2.18451292 2.013604943 2.013604943
1 35 SHAM 0 3.268341978 1.80785563 1 0 3.268341978 1.80785563 0 0
1 35 SHAM 0 3.268341978 1.80785563 2 6 3.617463276 1.901963006 0.349121298 0.349121298
1 35 SHAM 0 3.268341978 1.80785563 3 12 4.042171268 2.010515175 0.77382929 0.77382929
1 39 SHAM 0 5.806441903 2.409655972 1 0 5.806441903 2.409655972 0 0
1 39 SHAM 0 5.806441903 2.409655972 2 6 6.668033763 2.582253621 0.86159186 0.86159186
1 39 SHAM 0 5.806441903 2.409655972 3 12 6.657201504 2.580155326 0.850759601 0.850759601
1 51 SHAM 0 6.726560406 2.593561336 1 0 6.726560406 2.593561336 0 0
1 51 SHAM 0 6.726560406 2.593561336 2 6 6.752801108 2.598615229 0.026240702 0.026240702
1 51 SHAM 0 6.726560406 2.593561336 3 12 7.14950777 2.673856348 0.422947364 0.422947364
1 55 SHAM 0 5.281358119 2.298120562 1 0 5.281358119 2.298120562 0 0
1 55 SHAM 0 5.281358119 2.298120562 2 6 3.504983367 1.872160081 -1.776374752 -1.776374752
1 55 SHAM 0 5.281358119 2.298120562 3 12 1.453285056 1.205522732 -3.828073063 -3.828073063
1 103 SHAM 0 14.14175968 3.760553108 1 0 14.14175968 3.760553108 0 0
1 103 SHAM 0 14.14175968 3.760553108 2 6 15.83732887 3.979614161 1.695569189 1.695569189
1 103 SHAM 0 14.14175968 3.760553108 3 12 17.85259321 4.225232918 3.710833529 3.710833529
1 105 SHAM 0 8.118501066 2.849298346 1 0 8.118501066 2.849298346 0 0
1 105 SHAM 0 8.118501066 2.849298346 2 6 NA NA NA NA
1 105 SHAM 0 8.118501066 2.849298346 3 12 NA NA NA NA
1 107 SHAM 0 16.25989413 4.032355903 1 0 16.25989413 4.032355903 0 0
1 107 SHAM 0 16.25989413 4.032355903 2 6 NA NA NA NA
1 107 SHAM 0 16.25989413 4.032355903 3 12 NA NA NA NA
1 110 SHAM 0 3.108610876 1.763125315 1 0 3.108610876 1.763125315 0 0
1 110 SHAM 0 3.108610876 1.763125315 2 6 NA NA NA NA
1 110 SHAM 0 3.108610876 1.763125315 3 12 NA NA NA NA
1 122 2mg 1 14.23013034 3.772284499 1 0 14.23013034 3.772284499 0 0
1 122 2mg 1 14.23013034 3.772284499 2 6 16.31252812 4.038877086 2.082397778 2.082397778
1 122 2mg 1 14.23013034 3.772284499 3 12 17.26079811 4.154611667 3.030667766 3.030667766
1 129 2mg 1 5.839250811 2.416454181 1 0 5.839250811 2.416454181 0 0
1 129 2mg 1 5.839250811 2.416454181 2 6 7.464223376 2.732073091 1.624972565 1.624972565
1 129 2mg 1 5.839250811 2.416454181 3 12 9.640338241 3.104889409 3.80108743 3.80108743
1 133 2mg 1 4.413316452 2.100789483 1 0 4.413316452 2.100789483 0 0
1 133 2mg 1 4.413316452 2.100789483 2 6 5.064445674 2.25043233 0.651129222 0.651129222
1 133 2mg 1 4.413316452 2.100789483 3 12 NA NA NA NA
1 139 2mg 0 11.85320922 3.442848998 1 0 11.85320922 3.442848998 0 0
1 139 2mg 0 11.85320922 3.442848998 2 6 13.7069869 3.702294815 1.853777679 1.853777679
1 139 2mg 0 11.85320922 3.442848998 3 12 14.36968627 3.790736904 2.516477054 2.516477054
1 155 2mg 0 9.226600647 3.037531999 1 0 9.226600647 3.037531999 0 0
1 155 2mg 0 9.226600647 3.037531999 2 6 8.73090644 2.954810728 -0.495694207 -0.495694207
1 155 2mg 0 9.226600647 3.037531999 3 12 9.042412659 3.007060468 -0.184187988 -0.184187988
1 160 2mg 0 12.06173395 3.473000713 1 0 12.06173395 3.473000713 0 0
1 160 2mg 0 12.06173395 3.473000713 2 6 12.47205033 3.531579013 0.410316377 0.410316377
1 160 2mg 0 12.06173395 3.473000713 3 12 NA NA NA NA
1 161 2mg 0 6.051340791 2.459947315 1 0 6.051340791 2.459947315 0 0
1 161 2mg 0 6.051340791 2.459947315 2 6 NA NA NA NA
1 161 2mg 0 6.051340791 2.459947315 3 12 5.371776645 2.317709353 -0.679564146 -0.679564146
1 168 2mg 0 13.47495294 3.670824559 1 0 13.47495294 3.670824559 0 0
1 168 2mg 0 13.47495294 3.670824559 2 6 15.60077961 3.949782224 2.125826674 2.125826674
1 168 2mg 0 13.47495294 3.670824559 3 12 17.56628938 4.19121574 4.091336443 4.091336443
1 184 4mg 0 11.41583874 3.378733304 1 0 11.41583874 3.378733304 0 0
1 184 4mg 0 11.41583874 3.378733304 2 6 13.27277419 3.643181877 1.856935451 1.856935451
1 184 4mg 0 11.41583874 3.378733304 3 12 15.59128977 3.948580729 4.175451032 4.175451032
1 189 4mg 0 5.743634485 2.396588092 1 0 5.743634485 2.396588092 0 0
1 189 4mg 0 5.743634485 2.396588092 2 6 5.228250162 2.286536718 -0.515384323 -0.515384323
1 189 4mg 0 5.743634485 2.396588092 3 12 4.484122564 2.117574689 -1.259511921 -1.259511921
1 197 4mg 0 9.077401292 3.012872598 1 0 9.077401292 3.012872598 0 0
1 197 4mg 0 9.077401292 3.012872598 2 6 NA NA NA NA
1 197 4mg 0 9.077401292 3.012872598 3 12 NA NA NA NA
1 214 4mg 0 11.90323176 3.450106051 1 0 11.90323176 3.450106051 0 0
1 214 4mg 0 11.90323176 3.450106051 2 6 13.60561692 3.688579255 1.70238516 1.70238516
1 214 4mg 0 11.90323176 3.450106051 3 12 15.36408886 3.9197052 3.460857095 3.460857095
1 218 4mg 0 10.8365073 3.291885068 1 0 10.8365073 3.291885068 0 0
1 218 4mg 0 10.8365073 3.291885068 2 6 12.0130533 3.465985184 1.176545997 1.176545997
1 218 4mg 0 10.8365073 3.291885068 3 12 13.63051081 3.691952168 2.794003509 2.794003509
1 231 4mg 0 10.32451428 3.213178221 1 0 10.32451428 3.213178221 0 0
1 231 4mg 0 10.32451428 3.213178221 2 6 11.11654749 3.33414869 0.792033208 0.792033208
1 231 4mg 0 10.32451428 3.213178221 3 12 11.13792298 3.33735269 0.813408696 0.813408696
1 235 4mg 0 6.26445047 2.502888425 1 0 6.26445047 2.502888425 0 0
1 235 4mg 0 6.26445047 2.502888425 2 6 NA NA NA NA
1 235 4mg 0 6.26445047 2.502888425 3 12 7.748527082 2.783617625 1.484076612 1.484076612
1 237 4mg 0 5.393093093 2.322303402 1 0 5.393093093 2.322303402 0 0
1 237 4mg 0 5.393093093 2.322303402 2 6 6.382859882 2.526432244 0.989766789 0.989766789
1 237 4mg 0 5.393093093 2.322303402 3 12 6.65907103 2.58051759 1.265977937 1.265977937
1 250 4mg 0 3.578464204 1.891682903 1 0 3.578464204 1.891682903 0 0
1 250 4mg 0 3.578464204 1.891682903 2 6 3.922992029 1.980654445 0.344527825 0.344527825
1 250 4mg 0 3.578464204 1.891682903 3 12 4.405971911 2.099040712 0.827507707 0.827507707

Summing up different elements in a matrix in R

I'm trying to perform calculations on different elements of a matrix in R. My matrix is 18x18 and I would like to get, e.g., the mean of each 6x6 sub-matrix (9 in total). My desired arrays would be:
A1 <- df[1:6,1:6]
A2 <- df[1:6,7:12]
A3 <- df[1:6,13:18]
B1 <- df[7:12,1:6]
B2 <- df[7:12,7:12]
B3 <- df[7:12,13:18]
C1 <- df[13:18,1:6]
C2 <- df[13:18,7:12]
C3 <- df[13:18,13:18]
The matrix looks like this:
5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
5 14 17 9 10 8 4 10 12 18 9 13 14 NA NA 19 15 10 10
10 30 32 23 27 17 28 25 12 28 29 28 26 19 25 34 24 11 17
15 16 16 16 9 17 27 17 16 30 13 18 13 15 13 19 8 7 9
20 15 12 18 18 18 6 4 6 9 11 10 10 13 11 8 10 15 15
25 7 13 21 7 3 5 2 5 5 4 3 2 3 5 2 1 5 6
30 5 9 1 7 7 4 4 12 8 9 2 0 5 2 1 0 2 6
35 3 0 2 0 0 4 4 7 4 4 5 2 0 0 1 0 0 0
40 0 4 0 0 0 1 3 9 10 10 1 0 0 0 1 0 1 0
45 0 0 0 0 0 3 10 9 17 9 1 0 0 0 0 0 0 0
50 0 0 2 0 0 0 2 8 20 0 0 0 0 0 1 0 0 0
55 0 0 0 0 0 0 7 3 21 0 0 0 0 0 0 0 0 0
60 0 0 0 0 3 4 10 2 2 0 0 1 0 0 0 0 0 0
65 0 0 0 0 0 4 8 4 8 11 0 0 0 0 0 0 0 0
70 0 0 0 0 0 6 2 5 14 0 0 0 0 0 0 0 0 0
75 0 0 0 0 0 4 0 5 9 0 0 0 0 0 0 0 0 0
80 0 0 0 0 0 4 4 0 4 2 0 0 0 0 0 0 0 0
85 0 0 0 0 0 0 0 4 1 1 0 0 0 0 0 0 0 0
90 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Is there a clean way to solve this issue with a loop?
Thanks a lot in advance,
Paul
Given your matrix, e.g.
x <- matrix(1:(18*18), ncol=18)
Try, for example, for sub-matrices of size 6:
step <- 6
nx <- nrow(x)
if((nx %% step) != 0) stop("nx %% step should be 0")
indI <- seq(1, nx, by=step)
nbStep <- length(indI)
for(Col in 1:nbStep){
  for(Row in 1:nbStep){
    name <- paste0(LETTERS[Col], Row)
    theCol <- indI[Col]:(indI[Col]+step-1)
    theRow <- indI[Row]:(indI[Row]+step-1)
    assign(name, sum(x[theCol, theRow]))
  }
}
You'll get your results in A1, A2, A3, ...
This is the idea; tweak the code for non-square matrices, different sub-matrix sizes, and so on.
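If you'd rather collect the results in a single object than in loose variables, mget() can gather them afterwards (a small sketch of my own, reusing nbStep from above):
# names A1, B1, C1, A2, ... as created by assign() in the loop
block_names <- as.vector(outer(LETTERS[1:nbStep], 1:nbStep, paste0))
block_sums <- mget(block_names)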
Here's one way:
# generate fake data
set.seed(47)
n = 18
m = matrix(rpois(n * n, lambda = 5), nrow = n)
# generate starting indices
n_array = 6
start_i = seq(1, n, by = n_array)
arr_starts = expand.grid(row = start_i, col = start_i)
# calculate sums
with(arr_starts, mapply(function(x, y) sum(m[(x + 1:n_array) - 1, (y + 1:n_array) - 1]), row, col))
# [1] 158 188 176 201 188 201 197 206 204
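Since the question actually asks for the mean of each block, here is a compact variant of my own using tapply() with gl() to label every cell with its row/column block (na.rm guards against the NAs in the posted matrix):
block <- 6
# gl() repeats each block label 6 times; row(m) and col(m) map every
# cell of the matrix to its row and column index
row_grp <- gl(nrow(m) / block, block)
col_grp <- gl(ncol(m) / block, block)
block_means <- tapply(m, list(row_grp[row(m)], col_grp[col(m)]), mean, na.rm = TRUE)
block_means  # a 3x3 matrix of block means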

R inspect() function, from tm package, only returns 10 outputs when using dictionary terms

I have 70 PDFs of scientific papers that I'm trying to narrow down by looking for specific terms within them, using the dictionary function of inspect(), which is part of the tm package. My PDFs are stored in a VCorpus object. Here's an example of what my code looks like using the crude dataset and common terms that would show up in (probably) every example paper in crude:
library(tm)
output.matrix <- inspect(DocumentTermMatrix(crude,
                                            list(dictionary = c("i", "and", "all", "of",
                                                                "the", "if", "i'm", "looking",
                                                                "for", "but", "because", "has",
                                                                "it", "was"))))
output <- data.frame(output.matrix)
output <- data.frame(output.matrix)
This search only ever returns 10 papers into output.matrix. The outcome given is:
Docs all and because but for has i i'm the was
144 0 9 0 5 5 2 0 0 17 1
236 0 7 4 2 4 5 0 0 15 7
237 1 11 1 3 3 2 0 0 30 2
246 0 9 0 0 6 1 0 0 18 2
248 1 6 1 1 2 0 0 0 27 4
273 0 5 2 2 4 1 0 0 21 1
368 0 1 0 1 0 0 0 0 11 2
489 0 5 0 0 4 0 0 0 8 0
502 0 6 0 1 5 0 0 0 13 0
704 0 5 1 0 3 2 0 0 21 0
For my actual dataset of 70 papers, I know the count should be greater than 10: as I add more PDFs to my VCorpus that I know contain at least one of my search terms, I still only get 10 in the output. I want to adjust the outcome to be a list, like the one shown, that gives every paper from the VCorpus that contains a term, not just what I assume are the first 10.
Using R version 4.0.2, macOS High Sierra 10.13.6
You are misinterpreting what inspect() does. For a document-term matrix it shows only the first 10 rows and columns. inspect() should only be used to check whether your corpus or document-term matrix looks as you expect, never for transforming data into a data.frame. If you want the data of the document-term matrix in a data.frame, the following piece of code does this, using your example code and removing all rows and columns that don't have a value for any of the documents or terms.
# do not use inspect as this will give a wrong result!
output.matrix <- DocumentTermMatrix(crude,
                                    list(dictionary = c("i", "and", "all", "of",
                                                        "the", "if", "i'm", "looking",
                                                        "for", "but", "because", "has",
                                                        "it", "was")))
# remove rows and columns that are 0 staying inside a sparse matrix for speed
out <- output.matrix[slam::row_sums(output.matrix) > 0,
slam::col_sums(output.matrix) > 0]
# transform to data.frame
out_df <- data.frame(docs = row.names(out), as.matrix(out), row.names = NULL)
out_df
docs all and because but for. has the was
1 127 0 1 0 0 2 0 5 1
2 144 0 9 0 5 5 2 17 1
3 191 0 0 0 0 2 0 4 0
4 194 1 1 0 0 2 0 4 1
5 211 0 2 0 0 2 0 8 0
6 236 0 7 4 2 4 5 15 7
7 237 1 11 1 3 3 2 30 2
8 242 0 3 0 1 1 1 6 1
9 246 0 9 0 0 6 1 18 2
10 248 1 6 1 1 2 0 27 4
11 273 0 5 2 2 4 1 21 1
12 349 0 2 0 0 0 0 5 0
13 352 0 3 0 0 0 0 7 1
14 353 0 1 0 0 2 1 4 3
15 368 0 1 0 1 0 0 11 2
16 489 0 5 0 0 4 0 8 0
17 502 0 6 0 1 5 0 13 0
18 543 0 0 0 0 3 0 5 1
19 704 0 5 1 0 3 2 21 0
20 708 0 0 0 0 0 0 0 1
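As a small follow-up sketch of my own (not part of the original answer): if all you need is the IDs of the documents that contain at least one dictionary term, the sparse row sums already give you that:
# document IDs with at least one dictionary hit, computed on the
# DocumentTermMatrix 'output.matrix' built above
docs_with_terms <- row.names(output.matrix)[slam::row_sums(output.matrix) > 0]
docs_with_terms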

Calculate post error slowing in R

For my research, I would like to calculate post-error slowing in the stop-signal task, to find out whether people become slower after they failed to inhibit their response. Here is some data, and I would like to do the following:
1. For each subject, determine first whether a trial was a stop-trial (signal = 1).
2. For each correct stop-trial (signal = 1 & correct = 2), determine whether the next trial (the trial directly after the stop-trial) is a go-trial (signal = 0), and calculate the average reaction time over all these go-trials on which the response is correct (signal = 0 & correct = 2).
3. For each incorrect stop-trial (signal = 1 & correct = 0), determine whether the next trial is a go-trial (signal = 0), and calculate the average reaction time over all these go-trials on which the response is correct (correct = 2).
4. Calculate the difference between the RTs from steps 2 and 3 (= post-error slowing).
I'm not experienced enough in R to achieve this; I hope someone can help me with this script.
subject trial signal correct RT
1 1 0 2 755
1 2 0 2 543
1 3 1 0 616
1 4 0 2 804
1 5 0 2 594
1 6 0 2 705
1 7 1 2 0
1 8 1 2 0
1 9 0 2 555
1 10 1 0 604
1 11 0 2 824
1 12 0 2 647
1 13 0 2 625
1 14 0 2 657
1 15 1 0 578
1 16 0 2 810
1 17 1 2 0
1 18 0 2 646
1 19 0 2 574
1 20 0 2 748
1 21 0 0 856
1 22 0 2 679
1 23 0 2 738
1 24 0 2 620
1 25 0 2 715
1 26 1 2 0
1 27 0 2 675
1 28 0 2 560
1 29 1 0 584
1 30 0 2 564
1 31 0 2 994
1 32 1 2 0
1 33 0 2 715
1 34 0 2 644
1 35 0 2 545
1 36 0 2 528
1 37 1 2 0
1 38 0 2 636
1 39 0 2 684
1 40 1 2 0
1 41 0 2 653
1 42 0 2 766
1 43 0 2 747
1 44 0 2 821
1 45 0 2 612
1 46 0 2 624
1 47 0 2 665
1 48 1 2 0
1 49 0 2 594
1 50 0 2 665
1 51 1 0 658
1 52 0 2 800
1 53 1 2 0
1 54 1 0 738
1 55 0 2 831
1 56 0 2 815
1 57 0 2 776
1 58 0 2 710
1 59 0 2 842
1 60 1 0 516
1 61 0 2 758
1 62 1 2 0
1 63 0 2 628
1 64 0 2 713
1 65 0 2 835
1 66 1 0 791
1 67 0 2 871
1 68 0 2 816
1 69 0 2 769
1 70 0 2 930
1 71 0 2 676
1 72 0 2 868
2 1 0 2 697
2 2 0 2 689
2 3 0 2 584
2 4 1 0 788
2 5 0 2 448
2 6 0 2 564
2 7 0 2 587
2 8 1 0 553
2 9 0 2 706
2 10 0 2 442
2 11 1 0 245
2 12 0 2 601
2 13 0 2 774
2 14 1 0 579
2 15 0 2 652
2 16 0 2 556
2 17 0 2 963
2 18 0 2 725
2 19 0 2 751
2 20 0 2 709
2 21 0 2 741
2 22 1 0 613
2 23 0 2 781
2 24 1 2 0
2 25 0 2 634
2 26 1 2 0
2 27 0 2 487
2 28 1 2 0
2 29 0 2 692
2 30 0 2 745
2 31 1 2 0
2 32 0 2 610
2 33 0 2 836
2 34 1 0 710
2 35 0 2 757
2 36 0 2 781
2 37 0 2 1029
2 38 0 2 832
2 39 1 0 626
2 40 1 2 0
2 41 0 2 844
2 42 0 2 837
2 43 0 2 792
2 44 0 2 789
2 45 0 2 783
2 46 0 0 0
2 47 0 0 468
2 48 0 2 686
This may be too late to be useful, but here's my solution: I first split the data frame by subject and then apply the same algorithm to each subject. The result is:
#         1         2
# -74.60317  23.39286
X <- read.table(
text=" subject trial signal correct RT
1 1 0 2 755
1 2 0 2 543
1 3 1 0 616
1 4 0 2 804
1 5 0 2 594
1 6 0 2 705
1 7 1 2 0
1 8 1 2 0
1 9 0 2 555
1 10 1 0 604
1 11 0 2 824
1 12 0 2 647
1 13 0 2 625
1 14 0 2 657
1 15 1 0 578
1 16 0 2 810
1 17 1 2 0
1 18 0 2 646
1 19 0 2 574
1 20 0 2 748
1 21 0 0 856
1 22 0 2 679
1 23 0 2 738
1 24 0 2 620
1 25 0 2 715
1 26 1 2 0
1 27 0 2 675
1 28 0 2 560
1 29 1 0 584
1 30 0 2 564
1 31 0 2 994
1 32 1 2 0
1 33 0 2 715
1 34 0 2 644
1 35 0 2 545
1 36 0 2 528
1 37 1 2 0
1 38 0 2 636
1 39 0 2 684
1 40 1 2 0
1 41 0 2 653
1 42 0 2 766
1 43 0 2 747
1 44 0 2 821
1 45 0 2 612
1 46 0 2 624
1 47 0 2 665
1 48 1 2 0
1 49 0 2 594
1 50 0 2 665
1 51 1 0 658
1 52 0 2 800
1 53 1 2 0
1 54 1 0 738
1 55 0 2 831
1 56 0 2 815
1 57 0 2 776
1 58 0 2 710
1 59 0 2 842
1 60 1 0 516
1 61 0 2 758
1 62 1 2 0
1 63 0 2 628
1 64 0 2 713
1 65 0 2 835
1 66 1 0 791
1 67 0 2 871
1 68 0 2 816
1 69 0 2 769
1 70 0 2 930
1 71 0 2 676
1 72 0 2 868
2 1 0 2 697
2 2 0 2 689
2 3 0 2 584
2 4 1 0 788
2 5 0 2 448
2 6 0 2 564
2 7 0 2 587
2 8 1 0 553
2 9 0 2 706
2 10 0 2 442
2 11 1 0 245
2 12 0 2 601
2 13 0 2 774
2 14 1 0 579
2 15 0 2 652
2 16 0 2 556
2 17 0 2 963
2 18 0 2 725
2 19 0 2 751
2 20 0 2 709
2 21 0 2 741
2 22 1 0 613
2 23 0 2 781
2 24 1 2 0
2 25 0 2 634
2 26 1 2 0
2 27 0 2 487
2 28 1 2 0
2 29 0 2 692
2 30 0 2 745
2 31 1 2 0
2 32 0 2 610
2 33 0 2 836
2 34 1 0 710
2 35 0 2 757
2 36 0 2 781
2 37 0 2 1029
2 38 0 2 832
2 39 1 0 626
2 40 1 2 0
2 41 0 2 844
2 42 0 2 837
2 43 0 2 792
2 44 0 2 789
2 45 0 2 783
2 46 0 0 0
2 47 0 0 468
2 48 0 2 686", header=TRUE)
sapply(split(X, X["subject"]), function(D){
  PCRT <- with(D, RT[which(c(signal[-1], NA)==1 & c(correct[-1], NA)==2 & signal==0)])
  PERT <- with(D, RT[which(c(signal[-1], NA)==1 & c(correct[-1], NA)==0 & signal==0)])
  mean(PERT) - mean(PCRT)
})
This is ok if you can be sure that every respondent has at least 1 correct and 1 incorrect "stop" trial followed by a "go" trial. A more general case would be (giving NA if they are either always correct or always mistaken):
sapply(split(X, X["subject"]), function(D){
  PCRT <- with(D, RT[which(c(signal[-1], NA)==1 & c(correct[-1], NA)==2 & signal==0)])
  PERT <- with(D, RT[which(c(signal[-1], NA)==1 & c(correct[-1], NA)==0 & signal==0)])
  if(length(PCRT)>0 & length(PERT)>0) mean(PERT) - mean(PCRT) else NA
})
Does that help you? A little redundant maybe, but I tried to follow your steps as closely as possible (not sure whether I mixed something up; please check for yourself against the table). The idea is to put the data in a csv file first and treat it as a data frame. Find the raw csv file here: http://pastebin.com/X5b2ysmQ
data <- read.csv("datatable.csv", header=TRUE)
data[,"condition1"] <- data[,"signal"] == 1
data[,"condition2"] <- data[,"condition1"] & data[,"correct"] == 2
data[,"RT1"] <- NA
for(i in which(data[,"condition2"])){
  # next is a go trial
  if( nrow(data)>i && !data[i+1,"condition1"] && data[i+1,"correct"] == 2 )
    data[i+1,"RT1"] <- data[i+1,"RT"]
}
averageRT1 <- mean( data[ !is.na(data[,"RT1"]), "RT1"] )
data[,"RT2"] <- NA
for(i in which(data[,"condition1"] & data[,"correct"] == 0)){
  # next is a go trial
  if( nrow(data)>i && !data[i+1,"condition1"] && data[i+1,"correct"] == 2 )
    data[i+1,"RT2"] <- data[i+1,"RT"]
}
averageRT2 <- mean( data[ !is.na(data[,"RT2"]), "RT2"] )
postErrorSlowing <- abs(averageRT2 - averageRT1)
@Nilsole I just tried it, and it is almost perfect. How could the code be improved so that postErrorSlowing is calculated for each subject and placed in a data frame? That is, a new data frame consisting of the subject number (1, 2, 3, etc.) and the postErrorSlowing variable, something like this (the postErrorSlowing values are made up):
subject postErrorSlowing
1 50
2 75
....
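One hedged way to get there (my own sketch, following the steps in the question rather than either answer verbatim): wrap the per-subject computation in split()/sapply() and collect the results in a data frame.
pes <- sapply(split(data, data$subject), function(D){
  n <- nrow(D)
  # correct go-trials directly after a correct stop-trial
  after_correct_stop <- which(D$signal[-n]==1 & D$correct[-n]==2 &
                              D$signal[-1]==0 & D$correct[-1]==2) + 1
  # correct go-trials directly after a failed stop-trial
  after_failed_stop  <- which(D$signal[-n]==1 & D$correct[-n]==0 &
                              D$signal[-1]==0 & D$correct[-1]==2) + 1
  mean(D$RT[after_failed_stop]) - mean(D$RT[after_correct_stop])
})
data.frame(subject = as.integer(names(pes)), postErrorSlowing = pes, row.names = NULL)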

Piecewise HLM model using nlme package in R

I have two time periods of interest and four observation points (0 months, 4 months, 12 months, 16 months) for my subjects. The first time period of interest is between observation 1 and observation 3; the second is between observation 3 and observation 4.
I would like to run an HLM to account for the correlation of observations on the same subject. Some sample data, my code, and the output are pasted below.
When I compare the model output to the actual means, they are very similar in this case; however, when I use my actual data set, they are less similar. What does this imply? Can you tell me whether I have coded time appropriately? My goal is to compare the effect of treatment during time period 1 to the effect of treatment during time period 2. Thank you!
library(nlme)
# Run model and get output
model <- lme(Response ~ Time1*Treatment + Time2*Treatment,
             random = ~ Time1 + Time2 | Subject,
             data = test, control = list(opt = "optim"))
round(summary(model)$tTable, dig = 3)
# Output
Value Std.Error DF t-value p-value
(Intercept) 172.357 2.390 41 72.110 0.000
Time1 0.464 0.062 41 7.496 0.000
Treatment -10.786 3.499 13 -3.083 0.009
Time2 -0.795 0.130 41 -6.113 0.000
Time1:Treatment -0.089 0.091 41 -0.985 0.331
Treatment:Time2 0.563 0.190 41 2.956 0.005
# Means by Treatment and Time vs. Model
mean(test$Response[test$Treatment==1 & test$Observation==1])
[1] 161.1429
#model
172.357-10.786
[1] 161.571
mean(test$Response[test$Treatment==0 & test$Observation==1])
[1] 171.75
#model
[1] 172.357
Sample Data Used for this Output:
Subject Treatment Observation Time1 Time2 Response
1 0 1 0 0 170
1 0 2 4 0 175
1 0 3 12 0 177
1 0 4 12 4 173
2 1 1 0 0 160
2 1 2 4 0 162
2 1 3 12 0 165
2 1 4 12 4 165
3 0 1 0 0 172
3 0 2 4 0 177
3 0 3 12 0 180
3 0 4 12 4 175
4 1 1 0 0 162
4 1 2 4 0 166
4 1 3 12 0 168
4 1 4 12 4 167
5 1 1 0 0 163
5 1 2 4 0 167
5 1 3 12 0 169
5 1 4 12 4 167
6 0 1 0 0 179
6 0 2 4 0 182
6 0 3 12 0 184
6 0 4 12 4 180
7 0 1 0 0 155
7 0 2 4 0 158
7 0 3 12 0 160
7 0 4 12 4 157
8 1 1 0 0 152
8 1 2 4 0 155
8 1 3 12 0 157
8 1 4 12 4 157
9 0 1 0 0 170
9 0 2 4 0 174
9 0 3 12 0 179
9 0 4 12 4 177
10 1 1 0 0 162
10 1 2 4 0 164
10 1 3 12 0 165
10 1 4 12 4 165
11 1 1 0 0 164
11 1 2 4 0 165
11 1 3 12 0 168
11 1 4 12 4 167
12 0 1 0 0 174
12 0 2 4 0 175
12 0 3 12 0 176
12 0 4 12 4 175
13 0 1 0 0 184
13 0 2 4 0 185
13 0 3 12 0 186
13 0 4 12 4 184
14 1 1 0 0 165
14 1 2 4 0 167
14 1 3 12 0 169
14 1 4 12 4 168
15 0 1 0 0 170
15 0 2 4 0 175
15 0 3 12 0 179
15 0 4 12 4 177
Thanks.
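For reference, a hedged reading of the time coding implied by the sample data (my own sketch, not from the original post): with months since baseline equal to 0, 4, 12, 16, Time1 appears to be the months capped at 12 and Time2 the months beyond 12, i.e., a standard piecewise coding with a knot at 12 months.
# hypothetical 'months' column; Time1/Time2 reproduce the values
# shown in the sample data above
test$months <- c(0, 4, 12, 16)[test$Observation]
test$Time1  <- pmin(test$months, 12)      # clock runs during period 1 (0-12 months)
test$Time2  <- pmax(test$months - 12, 0)  # clock runs during period 2 (12-16 months)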
