I have a set of time-series data (GPS speed data, specifically), which includes gaps of missing values where the signal was lost. For short gaps I am able to fill the missing values simply by using na.spline; however, this is inappropriate for longer gaps. I would like to ramp the values from the last true value down to zero, based on predefined acceleration limits.
#create sample data frame
test <- as.data.frame(c(6,5.7,5.4,5.14,4.89,4.64,4.41,4.19,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,5,5.1,5.3,5.4,5.5))
names(test)[1] <- "speed"
#set rate of acceleration for ramp
ramp <- 6
#set sampling rate of receiver
Hz <- 1/10
So for missing data, the ramp would use the previous value and the rate of acceleration to get the next data point until the speed reaches zero (i.e. last speed [4.19] - (Hz * ramp)), yielding the following values:
3.59
2.99
2.39
1.79
1.19
0.59
0
Lastly, I need to do this in the reverse fashion, to ramp up from zero when the signal picks back up again.
Hope this is clear.
Cheers
It's not really elegant, but you can do it in a loop.
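na.pos <- which(is.na(test$speed))
acc <- FALSE  # becomes TRUE once the ramp has passed zero
for (i in na.pos) {
  if (acc) {
    # ramp up from the previous value
    speed <- test$speed[i-1] + (Hz*ramp)
  } else {
    # ramp down from the previous value
    speed <- test$speed[i-1] - (Hz*ramp)
    if (round(speed, 1) < 0) {
      # passed zero: switch to ramping up
      acc <- TRUE
      speed <- test$speed[i-1] + (Hz*ramp)
    }
  }
  test[i,] <- speed
}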
The result is:
   speed
1   6.00
2   5.70
3   5.40
4   5.14
5   4.89
6   4.64
7   4.41
8   4.19
9   3.59
10  2.99
11  2.39
12  1.79
13  1.19
14  0.59
15 -0.01
16  0.59
17  1.19
18  1.79
19  2.39
20  2.99
21  3.59
22  4.19
23  4.79
24  5.00
25  5.10
26  5.30
27  5.40
28  5.50
Note the -0.01 in row 15: 0.59 - (6 * 1/10) is -0.01, not 0. You can round it later; I decided not to.
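If the small negative value matters, one option is to clamp at zero after the loop rather than round. A minimal sketch, using a few values copied from rows 13 to 17 of the result above (the vector name filled is just for illustration):

```r
# Rows 13 to 17 of the loop result, copied by hand for illustration
filled <- c(1.19, 0.59, -0.01, 0.59, 1.19)

# pmax() compares element-wise, so any value below zero becomes zero
clamped <- pmax(filled, 0)
print(clamped)

# On the full data frame the same idea would be:
# test$speed <- pmax(test$speed, 0)
```

This leaves every non-negative value untouched, which rounding would not.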
When the question says "ramp the values from the last true value down to zero", I assume this means that, in each run of NAs, any NAs remaining after the ramp reaches zero are also to be replaced by zero.
Now, use rleid from data.table to create a grouping vector the same length as test$speed that identifies each run in is.na(test$speed), and use ave to create sequence numbers within those groups, seqno. Then calculate the declining sequences, ramp_down, by combining na.locf(test$speed) (from zoo) and seqno. Finally, replace the NAs.
library(data.table)
library(zoo)
test_speed <- test$speed
seqno <- ave(test_speed, rleid(is.na(test_speed)), FUN = seq_along)
ramp_down <- pmax(na.locf(test_speed) - seqno * ramp * Hz, 0)
result <- ifelse(is.na(test_speed), ramp_down, test_speed)
giving:
> result
[1] 6.00 5.70 5.40 5.14 4.89 4.64 4.41 4.19 3.59 2.99 2.39 1.79 1.19 0.59 0.00
[16] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 5.00 5.10 5.30 5.40 5.50
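The result above only covers the ramp down. The question also asks for the reverse ramp when the signal returns, so here is a rough sketch of one way that might be added, in base R only so it runs without extra packages. fill_ramp is a hypothetical helper, not a function from zoo or data.table, and timing the ramp up so that it arrives at the first valid value after the gap is my assumption about what is wanted.

```r
# Sample data from the question
speed <- c(6, 5.7, 5.4, 5.14, 4.89, 4.64, 4.41, 4.19, rep(NA, 15),
           5, 5.1, 5.3, 5.4, 5.5)
step <- 6 * (1/10)  # ramp * Hz: maximum change in speed per sample

# Hypothetical helper: fill each run of NAs with a ramp down from the
# value before the gap and a ramp up towards the value after it
fill_ramp <- function(x, step) {
  r <- rle(is.na(x))
  ends <- cumsum(r$lengths)
  starts <- ends - r$lengths + 1
  for (k in which(r$values)) {
    n <- r$lengths[k]
    idx <- starts[k]:ends[k]
    # a gap at the very start or end has no neighbour, so use 0 there
    before <- if (starts[k] > 1) x[starts[k] - 1] else 0
    after <- if (ends[k] < length(x)) x[ends[k] + 1] else 0
    down <- before - seq_len(n) * step     # falls away from the left edge
    up <- after - rev(seq_len(n)) * step   # rises towards the right edge
    x[idx] <- pmax(down, up, 0)            # never below zero
  }
  x
}

filled <- fill_ramp(speed, step)
print(round(filled, 2))
```

Taking the element-wise maximum of the two ramps and zero also handles short gaps, where the ramps cross before either reaches zero.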
I have used the specaccum() command to develop species accumulation curves for my samples.
Here is some example data:
site1<-c(0,8,9,7,0,0,0,8,0,7,8,0)
site2<-c(5,0,9,0,5,0,0,0,0,0,0,0)
site3<-c(5,0,9,0,0,0,0,0,0,6,0,0)
site4<-c(5,0,9,0,0,0,0,0,0,0,0,0)
site5<-c(5,0,9,0,0,6,6,0,0,0,0,0)
site6<-c(5,0,9,0,0,0,6,6,0,0,0,0)
site7<-c(5,0,9,0,0,0,0,0,7,0,0,3)
site8<-c(5,0,9,0,0,0,0,0,0,0,1,0)
site9<-c(5,0,9,0,0,0,0,0,0,0,1,0)
site10<-c(5,0,9,0,0,0,0,0,0,0,1,6)
site11<-c(5,0,9,0,0,0,5,0,0,0,0,0)
site12<-c(5,0,9,0,0,0,0,0,0,0,0,0)
site13<-c(5,1,9,0,0,0,0,0,0,0,0,0)
species_counts<-rbind(site1,site2,site3,site4,site5,site6,site7,site8,site9,site10,site11,site12,site13)
accum <- specaccum(species_counts, method="random", permutations=100)
plot(accum)
To ensure I have sampled sufficiently, I need to make sure the species accumulation curve reaches an asymptote, defined as a slope of <0.3 between the last two points (i.e. between sites 12 and 13).
results <- with(accum, data.frame(sites, richness, sd))
Produces this:
sites richness sd
1 1 3.46 0.9991916
2 2 4.94 1.6625403
3 3 5.94 1.7513054
4 4 7.05 1.6779918
5 5 8.03 1.6542263
6 6 8.74 1.6794660
7 7 9.32 1.5497149
8 8 9.92 1.3534841
9 9 10.51 1.0492422
10 10 11.00 0.8408750
11 11 11.35 0.7017295
12 12 11.67 0.4725816
13 13 12.00 0.0000000
I feel like I'm getting there. I could generate an lm with site vs richness and extract the exact slope (tangent?) between sites 12 and 13. Going to search a bit longer here.
Streamlining your data generation process a little bit:
species_counts <- matrix(c(0,8,9,7,0,0,0,8,0,7,8,0,
5,0,9,0,5,0,0,0,0,0,0,0, 5,0,9,0,0,0,0,0,0,6,0,0,
5,0,9,0,0,0,0,0,0,0,0,0, 5,0,9,0,0,6,6,0,0,0,0,0,
5,0,9,0,0,0,6,6,0,0,0,0, 5,0,9,0,0,0,0,0,7,0,0,3,
5,0,9,0,0,0,0,0,0,0,1,0, 5,0,9,0,0,0,0,0,0,0,1,0,
5,0,9,0,0,0,0,0,0,0,1,6, 5,0,9,0,0,0,5,0,0,0,0,0,
5,0,9,0,0,0,0,0,0,0,0,0, 5,1,9,0,0,0,0,0,0,0,0,0),
byrow=TRUE,nrow=13)
Always a good idea to set.seed() before running randomization tests (and let us know that specaccum is in the vegan package):
set.seed(101)
library(vegan)
accum <- specaccum(species_counts, method="random", permutations=100)
Extract the richness and sites components from within the returned object and compute d(richness)/d(sites). Note that the slope vector is one element shorter than the original sites/richness vectors, so be careful if you're trying to match up slopes with particular numbers of sites.
(slopes <- with(accum,diff(richness)/diff(sites)))
## [1] 1.45 1.07 0.93 0.91 0.86 0.66 0.65 0.45 0.54 0.39 0.32 0.31
In this case, the slope never actually goes below 0.3, so this code for finding the first time that the slope falls below 0.3:
which(slopes<0.3)[1]
returns NA.
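Since the criterion in the question is specifically the slope between the last two points, it may be simpler to test only that one. A small sketch, using the richness values copied from the table in the question (so the numbers differ slightly from the seeded run above) and labelling each slope with the interval it covers; the names last_slope and reached_asymptote are just for illustration:

```r
# Mean richness per number of sites, copied from the table in the question
sites <- 1:13
richness <- c(3.46, 4.94, 5.94, 7.05, 8.03, 8.74, 9.32, 9.92,
              10.51, 11.00, 11.35, 11.67, 12.00)

slopes <- diff(richness) / diff(sites)

# Label each slope with the pair of sites it lies between, e.g. "12-13"
names(slopes) <- paste(head(sites, -1), tail(sites, -1), sep = "-")

# Slope between the last two points, and the asymptote test from the question
last_slope <- tail(slopes, 1)
reached_asymptote <- unname(last_slope < 0.3)

print(last_slope)
print(reached_asymptote)
```

With a real specaccum result, sites and richness would come from the returned object, as in with(accum, ...) above.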
I basically want to format how my data frame is printed. I would like to choose for each column whether the text is centered, and to insert a 1px separator after certain rows. It would also be great to define the width of each column. Which function makes that possible? I would like to write the output to a text file later on, and I don't want to use LaTeX.
EDIT:
I just want to print a dataframe to a textfile, but as a nicely formatted table. So that it looks like an Excel sheet where you hide the gridlines. After x rows I want to basically just have a separator line ("----------") filling the whole width of the table.
Example:
My Data Frame consists of the following data:
Row1: "Test Entry 1", 5, 75, 0.3
Row2: "Test 2", 0.3, 1, 0.5
Output should be
Test Entry 1 5 75 0.3
------------------------------
Test 2 0.3 1 0.5
I hope it's more clear now :)
You are probably better off using one of the table packages but if you really really want to do it you can try something like this (pretty rudimentary, but can be expanded)
df <- data.frame(Test=sample(c("Test 1","Test 2","Test 3"),10,replace=TRUE),
                 D1=round(runif(10)*10,2),
                 D2=round(runif(10)*10,2),
                 D3=round(runif(10)*10,2))
sepwidth <- 60
colwidth <- 10
library(plyr)
# for each group: print its rows, each cell left-padded to colwidth
# characters, followed by a separator line
ddply(df, .(Test), function(d) {
  print(noquote(apply(d, c(1,2), function(p) paste0(strrep(" ", max(colwidth-nchar(p), 0)), p))))
  print(noquote(paste0(rep("-", sepwidth), collapse="")))
  return(NULL)
})
Test D1 D2 D3
[1,] Test 1 5.37 3.48 1.19
[2,] Test 1 9.49 9.51 9.44
[3,] Test 1 8.52 6.53 4.10
[4,] Test 1 0.72 0.20 0.20
[5,] Test 1 2.70 6.19 8.17
[1] ------------------------------------------------------------
Test D1 D2 D3
[1,] Test 2 0.61 0.96 2.17
[2,] Test 2 6.85 2.36 6.90
[3,] Test 2 8.99 2.86 2.32
[1] ------------------------------------------------------------
Test D1 D2 D3
[1,] Test 3 0.23 6.42 9.41
[2,] Test 3 1.53 1.84 4.60
[1] ------------------------------------------------------------
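For the per-column width and alignment asked for in the question, base R's format() may be enough on its own. The following is only a sketch: format_table is a hypothetical helper, the widths and alignments are made-up values, and the output goes to a temporary file. Cells longer than the requested width are not truncated, so widths need to be at least as wide as the longest entry.

```r
# Example data from the question
df <- data.frame(label = c("Test Entry 1", "Test 2"),
                 a = c(5, 0.3), b = c(75, 1), c = c(0.3, 0.5))

# Hypothetical helper: pad every column to its own width and alignment,
# then insert a separator line after every sep_every rows
format_table <- function(df, widths, justify, sep_every = 1) {
  cols <- Map(function(col, w, j) format(as.character(col), width = w, justify = j),
              df, widths, justify)
  rows <- do.call(paste0, unname(cols))  # glue the padded cells together
  sep <- strrep("-", sum(widths))        # separator spans the full table width
  out <- character(0)
  for (i in seq_along(rows)) {
    out <- c(out, rows[i])
    if (i %% sep_every == 0 && i < length(rows)) out <- c(out, sep)
  }
  out
}

lines <- format_table(df,
                      widths = c(14, 6, 6, 6),
                      justify = c("left", "centre", "centre", "right"))

out_file <- tempfile(fileext = ".txt")  # replace with a real path as needed
writeLines(lines, out_file)
cat(lines, sep = "\n")
```

Building the lines as a character vector first keeps the formatting separate from the output step, so the same lines can be printed to the console or written to a file.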