Turn extraction list to csv file - r

I have loaded a raster file and a polyline shapefile into R and used the extract function to extract the data from every pixel along the polyline. How do I turn the list output by extract into a CSV file?

Here is a simple self-contained reproducible example (this one is taken from ?raster::extract)
library(raster)
r <- raster(ncol=36, nrow=18, vals=1:(18*36))
cds1 <- rbind(c(-50,0), c(0,60), c(40,5), c(15,-45), c(-10,-25))
cds2 <- rbind(c(80,20), c(140,60), c(160,0), c(140,-55))
lines <- spLines(cds1, cds2)
e <- extract(r, lines)
e is a list
> e
[[1]]
[1] 126 127 161 162 163 164 196 197 200 201 231 232 237 266 267 273 274 302 310 311 338 346 381 382 414 417 450 451 452 453 487 488
[[2]]
[1] 139 140 141 174 175 177 208 209 210 213 243 244 249 250 279 286 322 358 359 394 429 430 465 501 537
and you cannot directly write this to a csv because the list elements (vectors) have different lengths.
So first make them all the same length
x <- max(sapply(e, length))
ee <- sapply(e, `length<-`, x)
Let's see
head(ee)
# [,1] [,2]
#[1,] 126 139
#[2,] 127 140
#[3,] 161 141
#[4,] 162 174
#[5,] 163 175
#[6,] 164 177
tail(ee)
# [,1] [,2]
#[27,] 450 NA
#[28,] 451 NA
#[29,] 452 NA
#[30,] 453 NA
#[31,] 487 NA
#[32,] 488 NA
And now you can write to a csv file
write.csv(ee, "test.csv", row.names=FALSE)
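If a padded wide table is awkward to work with, another option is a long table with one row per extracted value. The following is only a sketch: the list `e` here is a small hypothetical stand-in for the output of extract, and the column names `line_id` and `value` are made up for illustration.

```r
# Hypothetical stand-in for the list returned by extract():
# two vectors of unequal length, one per line
e <- list(c(126, 127, 161), c(139, 140))

# One row per extracted pixel, plus a column recording which line it came from
long <- data.frame(
  line_id = rep(seq_along(e), lengths(e)),
  value   = unlist(e)
)

# Write to a temporary file; replace with a real path as needed
out_file <- tempfile(fileext = ".csv")
write.csv(long, out_file, row.names = FALSE)
```

This layout needs no NA padding, because each value carries its own line identifier.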

If I understand what it is you're asking, I think you could resolve your situation by using unlist().
d <- c(1:10) # creates a sample vector to use
d <- as.list(d) # converts the vector into a list
d <- unlist(d) # flattens the list back into a vector

Related

Populate new dataframe with rowMeans for every four columns minus other rowMeans value

I am trying to calculate my data's means by populating a new dataframe with data corrected by my experiment's blank.
So far, I have created my new data frame:
data_mean <- data.frame(matrix(ncol = 17, # As many columns as experimental conditions plus one for "Time(h)"
nrow = nrow(data)))
Copied the data corresponding to time:
data_mean[,1] <- data[,1]
And attempted to populate the dataframe by assigning the mean of every condition minus the mean of the blanks to each column:
data_mean[,2] <- rowMeans(data[,5:8])-rowMeans(data[,2:4])
data_mean[,3] <- rowMeans(data[,9:12])-rowMeans(data[,2:4])
data_mean[,4] <- rowMeans(data[,13:16])-rowMeans(data[,2:4])
data_mean[,5] <- rowMeans(data[,17:20])-rowMeans(data[,2:4])
and so on.
Is there an easier way to do this rather than typing the same code over and over?
res <- sapply(split.default(data[, -1], seq(ncol(data) - 1)%/%4), rowSums)
res[,-1] - res[,1] # Should give you all the differences above
example:
data <- data.frame(matrix(1:200, 10))
res <- sapply(split.default(data[, -1], seq(ncol(data) - 1)%/%4), rowSums)
res[,-1] - res[,1]
1 2 3 4
[1,] 161 321 481 641
[2,] 162 322 482 642
[3,] 163 323 483 643
[4,] 164 324 484 644
[5,] 165 325 485 645
[6,] 166 326 486 646
[7,] 167 327 487 647
[8,] 168 328 488 648
[9,] 169 329 489 649
[10,] 170 330 490 650
and you can check:
rowSums(data[, 5:8]) - rowSums(data[,2:4])
[1] 161 162 163 164 165 166 167 168 169 170 # first column
rowSums(data[, 9:12]) - rowSums(data[,2:4])
[1] 321 322 323 324 325 326 327 328 329 330 # second column
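The answer above works with rowSums, while the question asks for means. A minimal sketch of the same grouping idea with rowMeans, assuming the same layout (column 1 is time, columns 2 to 4 are blanks, then blocks of four columns per condition); the names `grp`, `res_mean` and `corrected` are illustrative:

```r
data <- data.frame(matrix(1:200, 10))

# Group id for each column after dropping column 1:
# 0 = the three blank columns, 1..4 = the four-column conditions
grp <- seq(ncol(data) - 1) %/% 4

# Mean of each group of columns, row by row
res_mean <- sapply(split.default(data[, -1], grp), rowMeans)

# Subtract the blank mean (first column of res_mean) from every condition mean
corrected <- res_mean[, -1] - res_mean[, 1]
```

The first column of `corrected` should agree with rowMeans(data[, 5:8]) - rowMeans(data[, 2:4]).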

Generate a sequence of numbers with repeated intervals

I am trying to create a sequence made up of runs of 6 consecutive numbers, with the runs spaced 144 apart.
Like this one for example
c(1:6, 144:149, 288:293)
1 2 3 4 5 6 144 145 146 147 148 149 288 289 290 291 292 293
How could I generate such a sequence automatically with seq or with another function?
I find the sequence function to be helpful in this case. If you had your data in a structure like this:
(info <- data.frame(start=c(1, 144, 288), len=c(6, 6, 6)))
# start len
# 1 1 6
# 2 144 6
# 3 288 6
then you could do this in one line with:
sequence(info$len) + rep(info$start-1, info$len)
# [1] 1 2 3 4 5 6 144 145 146 147 148 149 288 289 290 291 292 293
Note that this solution works even if the sequences you're combining are different lengths.
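To illustrate the point about different lengths, here is a small sketch with the same starting values but made-up lengths of 2, 4 and 3 (the variable name `res` is illustrative):

```r
# Same starts as above, but each run has a different (made-up) length
info <- data.frame(start = c(1, 144, 288), len = c(2, 4, 3))

# sequence() counts 1..len within each run; rep() shifts each run to its start
res <- sequence(info$len) + rep(info$start - 1, info$len)
res
# [1]   1   2 144 145 146 147 288 289 290
```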
Here's one approach:
unlist(lapply(c(0L,(1:2)*144L-1L),`+`,seq_len(6)))
# or...
unlist(lapply(c(1L,(1:2)*144L),function(x)seq(x,x+5)))
Here's a way I like a little better:
rep(c(0L,(1:2)*144L-1L),each=6) + seq_len(6)
Generalizing...
rlen <- 6L
rgap <- 144L
rnum <- 3L
starters <- c(0L,seq_len(rnum-1L)*rgap-1L)
rep(starters, each=rlen) + seq_len(rlen)
# or...
unlist(lapply(starters+1L,function(x)seq(x,x+rlen-1L)))
This can also be done using seq or seq.int
x = c(1, 144, 288)
c(sapply(x, function(y) seq.int(y, length.out = 6)))
#[1] 1 2 3 4 5 6 144 145 146 147 148 149 288 289 290 291 292 293
As @Frank mentioned in the comments, here is another way to achieve this using @josilber's data structure (this is particularly useful when different intervals need different sequence lengths)
c(with(info, mapply(seq.int, start, length.out=len)))
#[1] 1 2 3 4 5 6 144 145 146 147 148 149 288 289 290 291 292 293
From R >= 4.0.0, you can now do this in one line with sequence:
sequence(c(6,6,6), from = c(1,144,288))
[1] 1 2 3 4 5 6 144 145 146 147 148 149 288 289 290 291 292 293
The first argument, nvec, is the length of each sequence; the second, from, is the starting point for each sequence.
As a function, with n being the number of intervals you want:
f <- function(n) sequence(rep(6,n), from = c(1,144*seq_len(n-1))) # seq_len() keeps n = 1 working
f(3)
[1] 1 2 3 4 5 6 144 145 146 147 148 149 288 289 290 291 292 293
I am using R 3.3.2. OSX 10.9.4
I tried:
a<-c() # stores expected sequence
f<-288 # starting number of final sub-sequence
it<-144 # interval
for (d in seq(0,f,by=it))
{
if (d==0)
{
d=1
}
a<-c(a, seq(d,d+5))
print(d)
}
print(a)
and the expected sequence is stored in a.
[1] 1 2 3 4 5 6 144 145 146 147 148 149 288 289 290 291 292 293
And another try:
a<-c() # stores expected sequence
it<-144 # interval
lo<-4 # number of sub-sequences
for (d in seq(0,by=it, length.out = lo))
{
if (d==0)
{
d=1
}
a<-c(a, seq(d,d+5))
print(d)
}
print(a)
The result:
[1] 1 2 3 4 5 6 144 145 146 147 148 149 288 289 290 291 292 293 432 433 434 435 436 437
I tackled this with the cumsum function:
seq_n <- 3 # number of sequences
rep(1:6, seq_n) + rep(c(0, cumsum(rep(144, seq_n-1))-1), each = 6)
# [1] 1 2 3 4 5 6 144 145 146 147 148 149 288 289 290 291 292 293
There is no need to calculate the starting values of the sequences as in @josilber's solution, but the length of each sequence has to be constant.

Calculating ratios and put them into a matrix in R

I have a table similar to this one, and want to calculate the ratio between column A and B. For example:
A B C D E F
[1,] 187 174 183 115 101 104
[2,] 451 166 177 842 101 133
[3,] 727 171 187 12803 98 134
[4,] 1532 181 196 730 98 108
[5,] 4139 188 214 20358 105 159
[6,] 689 185 211 1633 110 162
[7,] 1625 184 195 2283 109 114
[8,] 771 181 190 904 105 110
[9,] 950 177 190 1033 106 112
[10,] 703 180 191 463 106 110
[11,] 2052 178 188 2585 100 105
[12,] 1161 178 187 2874 99 110
[13,] 214 175 184 173 98 110
[14,] 473 184 191 971 104 111
[15,] 756 185 193 14743 107 114
I want to create a new matrix that has all of those previous rows as new rows and columns (15 rows and 15 columns) like so (values in parentheses are placeholders for the calculated ratios):
[,1] [,2] [,3] [,4]
[1,] (A1:B1) (A1:B2) (A1:B3) (A1:B4) ...
[2,]
[3,]
[4,]
...
That is maybe not the best example, but I hope it is not too confusing.
To calculate the ratios A1:B1, A2:B2, A3:B3 I could do something like:
data.matrix(data["A"]/data["B"])
And to do it for all, I would do something like:
data.matrix(data[1,]/data[1,1])
data.matrix(data[1,]/data[1,2])
...
and so on.
This seems to be a lot of work and maybe someone knows a quicker and more efficient method.
EDIT
I thought the combn function would work, but then I figured out it doesn't. When I have a 2 column matrix, such as:
A B
[1,] 187 115
[2,] 451 842
[3,] 727 12803
[4,] 1532 730
[5,] 4139 20358
[6,] 689 1633
[7,] 1625 2283
[8,] 771 904
[9,] 950 1033
[10,] 703 463
[11,] 2052 2585
[12,] 1161 2874
[13,] 214 173
[14,] 473 971
[15,] 756 14743
And I use the combn function to calculate all possible ratios (A1:B1, A1:B2, ... A2:B1, A2:B2...) I get just the result for A1 vs all values of B.
> combn(ncol(data), 2, function(x) data[,x[1]]/data[,x[2]])
[,1]
[1,] 1.62608696
[2,] 0.53562945
[3,] 0.05678357
[4,] 2.09863014
[5,] 0.20331074
[6,] 0.42192284
[7,] 0.71178274
[8,] 0.85287611
[9,] 0.91965150
[10,] 1.51835853
[11,] 0.79381044
[12,] 0.40396660
[13,] 1.23699422
[14,] 0.48712667
[15,] 0.05127857
Or maybe I just don't understand the combn function and I am doing something wrong here.
You can achieve what you want by using expand.grid, apply and matrix functions as below
I am assuming what you want is matrix like
A1/B1 A1/B2 A1/B3 ...
A2/B1 A2/B2 A2/B3 ...
... ... ... ...
... ... ... ...
Here is the code to do that. Explanation in comments
txt <- "A B C D E F\n187 174 183 115 101 104\n451 166 177 842 101 133\n727 171 187 12803 98 134\n1532 181 196 730 98 108\n4139 188 214 20358 105 159\n689 185 211 1633 110 162\n1625 184 195 2283 109 114\n771 181 190 904 105 110\n950 177 190 1033 106 112\n703 180 191 463 106 110\n2052 178 188 2585 100 105\n1161 178 187 2874 99 110\n214 175 184 173 98 110\n473 184 191 971 104 111\n756 185 193 14743 107 114"
data <- as.matrix(read.table(textConnection(txt), header = TRUE))
# expand.grid : creates every combination of one element each from column A and
# B with elements of B repeated first
# apply : calls function(x) x[1]/x[2] for every combination produced by
# expand.grid
# matrix : converts the result of apply into matrix. dimnames arguments sets
# rownames and colnames for easy verification for us
result <- matrix(apply(expand.grid(data[, "A"], data[, "B"]), 1, function(x) x[1]/x[2]),
nrow = nrow(data), dimnames = list(data[, "A"], data[, "B"]))
# note that we have set rownames for result to be values of A and colnames for
# result to be value of B
result
## 174 166 171 181 188 185 184
## 187 1.074713 1.126506 1.093567 1.033149 0.9946809 1.010811 1.016304
## 451 2.591954 2.716867 2.637427 2.491713 2.3989362 2.437838 2.451087
## 727 4.178161 4.379518 4.251462 4.016575 3.8670213 3.929730 3.951087
## 1532 8.804598 9.228916 8.959064 8.464088 8.1489362 8.281081 8.326087
## 4139 23.787356 24.933735 24.204678 22.867403 22.0159574 22.372973 22.494565
## 689 3.959770 4.150602 4.029240 3.806630 3.6648936 3.724324 3.744565
## 1625 9.339080 9.789157 9.502924 8.977901 8.6436170 8.783784 8.831522
## 771 4.431034 4.644578 4.508772 4.259669 4.1010638 4.167568 4.190217
## 950 5.459770 5.722892 5.555556 5.248619 5.0531915 5.135135 5.163043
## 703 4.040230 4.234940 4.111111 3.883978 3.7393617 3.800000 3.820652
## 2052 11.793103 12.361446 12.000000 11.337017 10.9148936 11.091892 11.152174
## 1161 6.672414 6.993976 6.789474 6.414365 6.1755319 6.275676 6.309783
## 214 1.229885 1.289157 1.251462 1.182320 1.1382979 1.156757 1.163043
## 473 2.718391 2.849398 2.766082 2.613260 2.5159574 2.556757 2.570652
## 756 4.344828 4.554217 4.421053 4.176796 4.0212766 4.086486 4.108696
## 181 177 180 178 178 175 184
## 187 1.033149 1.056497 1.038889 1.050562 1.050562 1.068571 1.016304
## 451 2.491713 2.548023 2.505556 2.533708 2.533708 2.577143 2.451087
## 727 4.016575 4.107345 4.038889 4.084270 4.084270 4.154286 3.951087
## 1532 8.464088 8.655367 8.511111 8.606742 8.606742 8.754286 8.326087
## 4139 22.867403 23.384181 22.994444 23.252809 23.252809 23.651429 22.494565
## 689 3.806630 3.892655 3.827778 3.870787 3.870787 3.937143 3.744565
## 1625 8.977901 9.180791 9.027778 9.129213 9.129213 9.285714 8.831522
## 771 4.259669 4.355932 4.283333 4.331461 4.331461 4.405714 4.190217
## 950 5.248619 5.367232 5.277778 5.337079 5.337079 5.428571 5.163043
## 703 3.883978 3.971751 3.905556 3.949438 3.949438 4.017143 3.820652
## 2052 11.337017 11.593220 11.400000 11.528090 11.528090 11.725714 11.152174
## 1161 6.414365 6.559322 6.450000 6.522472 6.522472 6.634286 6.309783
## 214 1.182320 1.209040 1.188889 1.202247 1.202247 1.222857 1.163043
## 473 2.613260 2.672316 2.627778 2.657303 2.657303 2.702857 2.570652
## 756 4.176796 4.271186 4.200000 4.247191 4.247191 4.320000 4.108696
## 185
## 187 1.010811
## 451 2.437838
## 727 3.929730
## 1532 8.281081
## 4139 22.372973
## 689 3.724324
## 1625 8.783784
## 771 4.167568
## 950 5.135135
## 703 3.800000
## 2052 11.091892
## 1161 6.275676
## 214 1.156757
## 473 2.556757
## 756 4.086486
Edit: I seem to have misunderstood the question. The answer is even simpler using outer:
# mm is the data matrix from the question (column 1 is A, column 2 is B)
# gives the same 15*15 matrix as geektrader's
outer(mm[,1], mm[,2], '/')
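As a small sketch of what outer computes here, using only the first three values of A and B from the question (the names `A`, `B` and `ratios`, and the dimnames, are illustrative):

```r
A <- c(187, 451, 727)
B <- c(174, 166, 171)

# ratios[i, j] is A[i] / B[j], so every A value is paired with every B value
ratios <- outer(A, B, "/")
dimnames(ratios) <- list(paste0("A", 1:3), paste0("B", 1:3))
ratios
```

Row i of the result holds A[i] divided by each element of B in turn, which is the layout asked for in the question.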
Old answer (not correct):
You should use combn:
# combn(ncol(mm), 2) gives you all possible combinations
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
# [1,] 1 1 1 1 1 2 2 2 2 3 3 3 4 4 5
# [2,] 2 3 4 5 6 3 4 5 6 4 5 6 5 6 6
# it also accepts a function argument. we can use it to divide
# respective columns
mm.div <- combn(ncol(mm), 2, function(x) mm[,x[1]]/mm[,x[2]])
# set column names of the matrix
colnames(mm.div) <- combn(colnames(mm), 2, paste, collapse="")
I might be completely missing the point here, but why not just use a couple of for loops? I wrote a quick function that you could pass the pairs to.
For example:
A <- rnorm(15)
B <- rnorm(15)
data <- data.frame(A,B)
ratio <- function(input1, input2){
  # out[i, j] holds input1[i] / input2[j]
  out <- matrix(0, nrow=length(input1), ncol=length(input2))
  for (i in seq_along(input1)){
    for (j in seq_along(input2)){
      out[i, j] <- input1[i] / input2[j]
    }
  }
  return(out)
}
ratio(data$A, data$B)
EDIT
Another thought. To then use the function to do all possible pairs of ratios, you could simply add another for loop, like this:
combs <- combn(ncol(data), 2) # all pairs of column indices, one pair per column
out <- list()
for (i in seq_len(ncol(combs))){
  out[[i]] <- ratio(data[,combs[1,i]], data[,combs[2,i]])
}
Hope that helps!

Mapping cut factor variable to numeric in R

I have a factor variable representing histogram bins, with values '660-664', ..., '740-744', '745-749', ...
How can I map the factor variable to its mean value, e.g. mapping '660-664' to 662?
Basically, what I'm looking for is the inverse of the "cut" function.
You can make use of the plot = FALSE argument from hist to extract the breaks, then use that to get your midpoints:
set.seed(1)
x <- sample(300, 30)
x
# [1] 80 112 171 270 60 266 278 194 184 18 296 52 198 111 221 142
# [17] 204 281 108 219 262 290 182 35 74 107 4 105 237 93
temp <- hist(x, plot = FALSE)$breaks
temp
# [1] 0 50 100 150 200 250 300
rowMeans(cbind(head(temp, -1),
tail(temp, -1)))
# [1] 25 75 125 175 225 275
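As a side note, the object returned by hist also has a mids component, so the midpoints can be read off directly. A small sketch with made-up data (the vector `x` and the name `h` are illustrative):

```r
# Made-up data; any numeric vector would do
x <- c(4, 18, 35, 52, 60, 74, 80, 93, 105, 270)

h <- hist(x, plot = FALSE)

# Midpoints as stored by hist() ...
h$mids

# ... and the same values computed from the breaks, as above
manual <- rowMeans(cbind(head(h$breaks, -1), tail(h$breaks, -1)))
```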
Update: Calculating the mean from a character string of ranges
Judging by your comments, you might be looking for something more like this:
myVec <- c("735-739", "715-719", "690-694", "695-699", "695-699",
"670-674", "720-724", "705-709", "685-689")
myVec
# [1] "735-739" "715-719" "690-694" "695-699" "695-699" "670-674"
# [7] "720-724" "705-709" "685-689"
sapply(strsplit(myVec, "-"), function(x) mean(as.numeric(x)))
# [1] 737 717 692 697 697 672 722 707 687
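Since the question mentions a factor rather than a character vector, here is a hedged sketch of the same idea applied to a factor: compute the midpoint once per level, then look it up through the factor's integer codes. The names `bins`, `mid_by_level` and `mids` are made up for illustration.

```r
# A small made-up factor of bin labels
bins <- factor(c("660-664", "740-744", "745-749", "660-664"))

# Midpoint of each distinct level, in the order of levels(bins)
mid_by_level <- sapply(strsplit(levels(bins), "-"),
                       function(x) mean(as.numeric(x)))

# Map every element of the factor to the midpoint of its level
mids <- mid_by_level[as.integer(bins)]
mids
# [1] 662 742 747 662
```

Working per level avoids splitting the same label repeatedly when the factor is long but has few distinct bins.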

New vector based on comparing elements of two other vectors "lagged"?

I have two vectors, subject and target. I want to create a new vector based on comparisons between the two existing vectors, with elements being compared lagged. I've solved this okay using the loop below, but I'm essentially wondering whether there's a more elegant solution using apply?
subject <- c(200, 195, 190, 185, 185, 185, 188, 189, 195, 200, 210, 210)
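perweek <- 2 # weekly decrement; not given in the question, inferred from the target values shown further down
target <- c(subject[1], subject[1]-cumsum(rep(perweek, length(subject)-1)))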
adjtarget <- target
for (i in 1:(length(subject)-1)) {
  if (subject[i] > adjtarget[i]) {
    adjtarget[i+1] <- adjtarget[i]
  } else {
    adjtarget[i+1] <- adjtarget[i]-perweek
  }
}
This doesn't exactly solve your problem, but may point in a helpful direction. I'm disregarding the interplay between changing adjtarget and comparing to it, and show a similar problem, where we compare to the constant target. Then it's possible to change the if in the loop to a vector comparison:
but.last <- function(x) head(x, -1) # helper, not part of base R: all elements except the last
lv <- but.last(subject) > but.last(target)
ind <- which(lv)
Prepare the result vector (I'll call it x, as it won't be the same result as your adjtarget) as a shifted copy of target and assign the changes to it:
x <- c(target[1], but.last(target)) # corresponds to the true branch of the `if`
x[which(!lv)+1] <- target[which(!lv)] - perweek # corresponds to the false branch
Alternatively,
x <- c(target[1], but.last(target) - (!lv)*perweek)
As I said, this doesn't solve your problem, but perhaps we could start from here.
Just for clarification, if I understand your code, this is the kind of result you're looking for...
> (goal <- cbind(subject,target,adjtarget))
subject target adjtarget
[1,] 200 200 200
[2,] 195 198 198
[3,] 190 196 196
[4,] 185 194 194
[5,] 185 192 192
[6,] 185 190 190
[7,] 188 188 188
[8,] 189 186 186
[9,] 195 184 186
[10,] 200 182 186
[11,] 210 180 186
[12,] 210 178 186
If I'm right, then the challenge to vectorizing this is the repeated assignment of 186 in adjtarget. Vectorized code will evaluate the right hand side (RHS) before assigning it to the left hand side (LHS). So, the vectorized code won't see the updated value in adjtarget at row 9 until after the assignment is finished.
> y <- ifelse(subject > target, 1, 0) # matches TRUE case
> x <- target
> x[ind+1] <- target[ind]
> cbind(goal, x, y)
subject target adjtarget x y
[1,] 200 200 200 200 0
[2,] 195 198 198 198 0
[3,] 190 196 196 196 0
[4,] 185 194 194 194 0
[5,] 185 192 192 192 0
[6,] 185 190 190 190 0
[7,] 188 188 188 188 0
[8,] 189 186 186 186 1
[9,] 195 184 186 186 1 # assigned correctly (?)
[10,] 200 182 186 184 1 # incorrect x; should be 186
[11,] 210 180 186 182 1 # incorrect x; should be 186
[12,] 210 178 186 180 1 # incorrect x; should be 186
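Because each adjtarget value depends on the one before it, one loop-free option is Reduce with accumulate = TRUE, which carries the previous result into the next step. This is only a sketch: perweek <- 2 is an assumption inferred from the target column above, and the names `step` and `adj_reduce` are illustrative.

```r
subject <- c(200, 195, 190, 185, 185, 185, 188, 189, 195, 200, 210, 210)
perweek <- 2  # assumed weekly decrement, inferred from the target column

# One step of the recurrence: hold the target if the subject is above it,
# otherwise lower it by perweek
step <- function(prev, s) if (s > prev) prev else prev - perweek

# Start from subject[1] and feed in subject[1], ..., subject[11] one at a time;
# accumulate = TRUE keeps every intermediate value
adj_reduce <- Reduce(step, head(subject, -1), accumulate = TRUE, init = subject[1])
adj_reduce
# [1] 200 198 196 194 192 190 188 186 186 186 186 186
```

This is still sequential under the hood, so it is tidier than the loop rather than faster.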
