I found a formula for computing the angle between two 3D vectors (e.g. as shown in https://stackoverflow.com/a/12230083/6607497).
However, when trying to implement it in PostScript, I realized that PostScript lacks the arc cosine function needed to get the angle.
Why doesn't PostScript have that function, and what would a work-around look like?
In Wikipedia I found the formula
$\arccos(x) = \frac{\pi}{2} - \arctan\left(\frac{x}{\sqrt{1-x^{2}}}\right)$, but that looks a bit complicated; and if it's not: why didn't they add acos (arc cosine) using that definition?
Did I miss something obvious?
As indicated in Wikipedia and done in pst-math, acos can be implemented rather easily using atan and sqrt (among other primitives) like this:
GS>% arc cosine, using degrees
GS>/acos { dup dup mul neg 1.0 add sqrt exch atan } bind def
GS>1 acos ==
0.0
GS>-1 acos ==
180.0
GS>0 acos ==
90.0
However this may be less efficient than a native implementation.
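The same construction can be cross-checked outside PostScript. Here is a small Python sketch (my own illustration, not from the original post) of the identical atan/sqrt work-around, compared against the built-in acos:

```python
import math

def acos_deg(x):
    """acos built from atan2 and sqrt, in degrees -- mirrors the PostScript
    definition `dup dup mul neg 1.0 add sqrt exch atan`."""
    return math.degrees(math.atan2(math.sqrt(1.0 - x * x), x))

for x in (1.0, -1.0, 0.0, 0.5):
    # Both columns agree: acos(x) == atan2(sqrt(1 - x^2), x)
    print(x, acos_deg(x), math.degrees(math.acos(x)))
```

Note that PostScript's two-argument `atan` (numerator, denominator, result in degrees) plays the role of `atan2` here.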
As the comments indicated, if the only purpose of the arc cosine is to get the angle between two vectors, another approach using the two-parameter atan is preferable:
I ran some experiments in C and Kahan's formula 2 * atan2 (norm (x * norm (y) - norm (x) * y), norm (x * norm (y) + norm (x) * y)) works well, as does the more commonly used atan2 (norm (cross (x, y)), dot (x, y)). Best I can tell, C's atan2 is equivalent to PostScript's two-argument atan function.
(njuffa, Oct 23 at 23:21)
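As a quick sanity check of Kahan's formula (my own illustration; `norm` and `angle_kahan` are names I made up), it can be tried in a few lines of Python:

```python
import math

def norm(v):
    """Euclidean length of a vector given as a sequence of components."""
    return math.sqrt(sum(c * c for c in v))

def angle_kahan(x, y):
    """Kahan's 2 * atan2(|x*|y| - |x|*y|, |x*|y| + |x|*y|), in degrees."""
    nx, ny = norm(x), norm(y)
    a = [xi * ny - nx * yi for xi, yi in zip(x, y)]  # x*|y| - |x|*y
    b = [xi * ny + nx * yi for xi, yi in zip(x, y)]  # x*|y| + |x|*y
    return math.degrees(2.0 * math.atan2(norm(a), norm(b)))

print(angle_kahan((0, 0, 1), (1, 1, 1)))  # ~54.7356
```

The formula is numerically well-behaved even for nearly parallel or nearly opposite vectors, where the plain acos(dot/(|x||y|)) approach loses precision.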
So I tried to implement it (for 3D vectors (x y z)):
First the norm (length), named n3:
% norm 3D vector
%(x y z)
/n3 { dup mul exch dup mul add exch dup mul add sqrt } def
Then three helpers to multiply (v3m), subtract (v3s), and add (v3a) 3D vectors (for those not fearing stack rotations):
% multiply 3D vector with factor
%(x y z f)
/v3m {
dup 3 1 roll mul 4 1 roll %(z*f x y f)
dup 3 1 roll mul 3 1 roll %(z*f y*f x f)
mul exch 3 -1 roll %(x*f y*f z*f)
} def
% subtract 3D vectors
%(x1 y1 z1 x2 y2 z2)
/v3s {
6 -3 roll %(x2 y2 z2 x1 y1 z1)
4 -1 roll sub %(x2 y2 x1 y1 z1-z2)
5 1 roll 3 -1 roll sub %(z1-z2 x2 x1 y1-y2)
4 1 roll exch sub %(y1-y2 z1-z2 x1-x2)
3 1 roll %(x1-x2 y1-y2 z1-z2)
} def
% add 3D vectors
%(x1 y1 z1 x2 y2 z2)
/v3a {
4 -1 roll add %(x1 y1 x2 y2 z2+z1)
5 1 roll 3 -1 roll add %(z2+z1 x1 x2 y2+y1)
4 1 roll add %(y2+y1 z2+z1 x1+x2)
3 1 roll %(x1+x2 y2+y1 z2+z1)
} def
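To follow the stack comments in these helpers, it may help to model PostScript's `n j roll` operator. A quick Python sketch (my own illustration; the stack is a list from bottom to top):

```python
def roll(stack, n, j):
    """Model of PostScript's `n j roll`: circularly shift the top n
    elements by j places; positive j moves elements toward the top."""
    j %= n
    top = stack[-n:]
    stack[-n:] = top[-j:] + top[:-j] if j else top
    return stack

# v3s starts with `6 -3 roll`, which swaps the two vectors:
print(roll(['x1', 'y1', 'z1', 'x2', 'y2', 'z2'], 6, -3))
# -> ['x2', 'y2', 'z2', 'x1', 'y1', 'z1']
```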
The final angle function (a) looks a bit different, because I wanted to avoid heavy stack shuffling, so I converted the vectors to arrays and assigned each array to a name.
I think this also makes the code a bit more readable, while being a bit less efficient (you may try using copy to duplicate the vectors' coordinates without the arrays, names, and the additional local dictionary):
% angle between two 3D vectors
%(x1 y1 z1 x2 y2 z2)
/a {
[ 4 1 roll ]
4 1 roll
[ 4 1 roll ]
%([v2] [v1])
2 dict begin
/v1 exch def
/v2 exch def
v1 aload pop n3 v2 aload pop
4 -1 roll v3m
%(|v1|*v2)
v2 aload pop n3 v1 aload pop
4 -1 roll v3m
%(|v1|*v2 |v2|*v1)
v3s n3
%(||v1|*v2-|v2|*v1|)
v1 aload pop n3 v2 aload pop
4 -1 roll v3m
%(|v1|*v2)
v2 aload pop n3 v1 aload pop
4 -1 roll v3m
%(|v1|*v2 |v2|*v1)
v3a n3
%(||v1|*v2-|v2|*v1| ||v1|*v2+|v2|*v1|)
atan
%(atan(||v1|*v2-|v2|*v1|,||v1|*v2+|v2|*v1|))
2.0 mul
end
} def
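Before trusting the PostScript version, the expected values can be cross-checked in Python using the cross/dot variant from the quoted comment (illustration only; the function name is mine):

```python
import math

def angle(v1, v2):
    """atan2(|v1 x v2|, v1 . v2) in degrees -- the cross/dot variant
    mentioned in the comments, used here to cross-check `a`."""
    cx = v1[1] * v2[2] - v1[2] * v2[1]
    cy = v1[2] * v2[0] - v1[0] * v2[2]
    cz = v1[0] * v2[1] - v1[1] * v2[0]
    cross = math.sqrt(cx * cx + cy * cy + cz * cz)
    dot = sum(a * b for a, b in zip(v1, v2))
    return math.degrees(math.atan2(cross, dot))

print(angle((0, 0, 1), (1, 1, 1)))  # ~54.7356, matching the PostScript tests
```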
Finally here are some test cases:
GS>0 0 1 0 0 1 a ==
0.0
GS>0 0 1 0 1 0 a ==
90.0
GS>0 0 1 0 1 1 a ==
45.0
GS>0 0 1 1 0 0 a ==
90.0
GS>0 0 1 1 0 1 a ==
45.0
GS>0 0 1 1 1 0 a ==
90.0
GS>0 0 1 1 1 1 a ==
54.7356071
%
GS>0 1 0 0 0 1 a ==
90.0
GS>0 1 0 0 1 0 a ==
0.0
GS>0 1 0 0 1 1 a ==
45.0
GS>0 1 0 1 0 0 a ==
90.0
GS>0 1 0 1 0 1 a ==
90.0
GS>0 1 0 1 1 0 a ==
45.0
GS>0 1 0 1 1 1 a ==
54.7356071
%
GS>0 1 1 0 0 1 a ==
45.0
GS>0 1 1 0 1 0 a ==
45.0
GS>0 1 1 0 1 1 a ==
0.0
GS>0 1 1 1 0 0 a ==
90.0
GS>0 1 1 1 0 1 a ==
59.9999962
GS>0 1 1 1 1 0 a ==
59.9999962
GS>0 1 1 1 1 1 a ==
35.264389
%
GS>1 0 0 0 0 1 a ==
90.0
GS>1 0 0 0 1 0 a ==
90.0
GS>1 0 0 0 1 1 a ==
90.0
GS>1 0 0 1 0 0 a ==
0.0
GS>1 0 0 1 0 1 a ==
45.0
GS>1 0 0 1 1 0 a ==
45.0
GS>1 0 0 1 1 1 a ==
54.7356071
%
GS>1 0 1 0 0 1 a ==
45.0
GS>1 0 1 0 1 0 a ==
90.0
GS>1 0 1 0 1 1 a ==
59.9999962
GS>1 0 1 1 0 0 a ==
45.0
GS>1 0 1 1 0 1 a ==
0.0
GS>1 0 1 1 1 0 a ==
59.9999962
GS>1 0 1 1 1 1 a ==
35.264389
%
GS>1 1 0 0 0 1 a ==
90.0
GS>1 1 0 0 1 0 a ==
45.0
GS>1 1 0 0 1 1 a ==
59.9999962
GS>1 1 0 1 0 0 a ==
45.0
GS>1 1 0 1 0 1 a ==
59.9999962
GS>1 1 0 1 1 0 a ==
0.0
GS>1 1 0 1 1 1 a ==
35.264389
%
GS>1 1 1 0 0 1 a ==
54.7356071
GS>1 1 1 0 1 0 a ==
54.7356071
GS>1 1 1 0 1 1 a ==
35.264389
GS>1 1 1 1 0 0 a ==
54.7356071
GS>1 1 1 1 0 1 a ==
35.264389
GS>1 1 1 1 1 0 a ==
35.264389
GS>1 1 1 1 1 1 a ==
0.0
I hope I didn't mess it up.
I am currently working on a huge file containing the stops/starts of several machines (about 60) over a long period (more than 60,000 rows).
I have already encoded the table with 1 if the device is working and 0 if it is not working.
**Date n°1 n°2 n°3 n°4 n°5 n°6 n°7**
1 2011-12-13 00:00:00 0 1 1 1 1 1 1
2 2011-12-13 01:00:00 0 1 1 1 1 1 1
3 2011-12-13 02:00:00 0 1 1 1 1 1 1
4 2011-12-13 03:00:00 0 1 1 1 1 1 1
5 2011-12-13 04:00:00 0 1 1 1 1 1 1
6 2011-12-13 05:00:00 0 1 1 1 1 1 1
7 2011-12-13 06:00:00 0 1 1 1 1 1 1
Sometimes the devices have to be stopped (not at the same time) for a longer period (more than 480 hours) for specific purposes. This shows up as more than 480 successive rows of not working.
I would like to identify those specific periods, separate them from regular stops (0), and replace them with -1 in order to get the beginning date of those long periods.
I have code that already works. The problem is that it takes a long time to run... I guess it is because of the nested loops. But I tried and cannot figure out another way of processing, using lapply for instance.
for (c in 2:ncol(dataframe)) {
  for (r in 1:(nrow(dataframe) - 480)) {
    if (sum(dataframe[r:(r + 480), c]) == 0) {
      dataframe[r, c] <- -1
    } else {
      dataframe[r, c] <- dataframe[r, c]
    }
  }
}
for (c in 2:ncol(dataframe)) {
  for (r in 1:(nrow(dataframe) - 1)) {
    if (dataframe[r, c] == -1 && dataframe[r + 1, c] == 0) {
      dataframe[r + 1, c] <- -1
    }
  }
}
This code replaces a 0 with -1 if it is followed by at least 480 consecutive zeros in the column. Any trailing zeros still following such a run are then also turned into -1.
I would just like to know how I can improve this coding scheme and save computation time...
Thank you in advance
You can use rle for that (thanks to @A.Suliman for the helpful comment).
f <- function(x, thres = 480, replacement = -1) {
r <- rle(x)
r$values <- with(r, replace(values, lengths >= thres & values == 0, replacement))
inverse.rle(r)
}
Apply the function to each column; I use 5 consecutive 0's as an example (you would need to exclude the first column and set thres = 480, i.e. dat[-1] <- lapply(dat[-1], f)):
dat[] <- lapply(dat, f, thres = 5)
dat
# X1 X2 X3 X4 X5 X6 X7
#1 0 1 1 1 0 0 1
#2 0 -1 0 -1 1 0 0
#3 0 -1 1 -1 0 0 0
#4 1 -1 0 -1 0 1 0
#5 0 -1 0 -1 1 0 1
#6 1 -1 1 -1 0 0 -1
#7 1 -1 0 -1 1 0 -1
#8 -1 -1 0 1 -1 0 -1
#9 -1 1 1 0 -1 1 -1
#10 -1 -1 0 1 -1 0 -1
#11 -1 -1 0 0 -1 1 -1
#12 -1 -1 1 1 -1 1 -1
#13 -1 -1 -1 0 -1 0 -1
#14 -1 -1 -1 0 1 0 -1
#15 1 1 -1 0 1 0 1
#16 0 0 -1 1 1 0 0
#17 1 1 -1 1 0 1 0
#18 1 0 -1 0 0 0 0
#19 0 1 -1 1 1 0 1
#20 1 0 -1 1 0 0 0
data
set.seed(1)
dat <- data.frame(replicate(7, expr = sample(c(0, 1), 20, TRUE, prob = c(.7, .3))))
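For readers coming from other languages, the same run-length idea can be sketched in Python with itertools.groupby (my own illustration, not part of the answer; R's rle/inverse.rle pair plays the role of groupby here):

```python
from itertools import groupby

def mark_long_stops(xs, thres=480, replacement=-1):
    """Replace every run of >= thres consecutive zeros by `replacement`,
    mirroring the rle / replace / inverse.rle trick."""
    out = []
    for value, group in groupby(xs):
        run = list(group)
        if value == 0 and len(run) >= thres:
            out.extend([replacement] * len(run))
        else:
            out.extend(run)
    return out

print(mark_long_stops([0, 0, 0, 1, 0, 0], thres=3))  # [-1, -1, -1, 1, 0, 0]
```

Like the rle approach, this touches each element once instead of re-summing a 480-row window at every position, which is where the nested loop loses its time.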
I am trying to create a function to apply to a variable in a dataframe: looking at a window of 2 days forward from the current observation, the value should be reset to 0 if VarD takes the value 1 throughout that window.
The dataframe looks like this:
VarA VarB Date Diff VarD
1 1 2007-04-09 NA 0
1 1 2007-04-10 0 0
1 1 2007-04-11 -2 1
1 1 2007-04-12 0 1
1 1 2007-04-13 2 0
1 1 2007-04-14 0 0
1 1 2007-04-15 -2 1
1 1 2007-04-16 1 0
1 1 2007-04-17 -4 1
1 1 2007-04-18 0 1
1 1 2007-04-19 0 1
1 1 2007-04-20 0 1
The new dataframe should look like the following:
VarA VarB Date Diff VarD VarC
1 1 2007-04-09 NA 0 0
1 1 2007-04-10 0 0 0
1 1 2007-04-11 -2 1 1
1 1 2007-04-12 0 1 1
1 1 2007-04-13 2 0 0
1 1 2007-04-14 0 0 0
1 1 2007-04-15 -2 1 1
1 1 2007-04-16 1 0 0
1 1 2007-04-17 -4 1 0
1 1 2007-04-18 0 1 0
1 1 2007-04-19 0 1 0
1 1 2007-04-20 0 1 0
I have tried the following code:
db$VarC <- 0
for (i in unique(db$VarA)) {
  for (j in unique(db$VarB)) {
    for (n in 1:length(db$Date)) {
      if (db$VarD[n] == 0) { db$VarC[n] <- 0 }
      else { db$VarC[n] <- ifelse(0 %in% db[db$Date >= n & db$Date < n+3, ]$VarC, 1, 0) }
    }
  }
}
But I obtain just zeroes in VarC. I have checked the code without the else part and it works fine. R gives no error when the complete code is run. I have no clue where the problem could be.
Here are some alternatives. The first one avoids some messy indexing but the last two do not require any packages.
1) rollapply This applies the VarC function in a rolling fashion to each 3 elements of db$VarD. align = "left" says that when it passes x to function VarC that x[1] is the current element, x[2] the next and x[3] the next, i.e. the current element is the leftmost. partial = TRUE says that if there are not 3 elements available (which would be the case for the last and next to last elements) then just pass however many there are remaining.
library(zoo)
VarC <- function(x) if (all(x[-1] == 1)) 0 else x[1]
db$VarC <- rollapply(db$VarD, 3, VarC, partial = TRUE, align = "left")
giving:
> db
VarA VarB Date Diff VarD VarC
1 1 1 2007-04-09 NA 0 0
2 1 1 2007-04-10 0 0 0
3 1 1 2007-04-11 -2 1 1
4 1 1 2007-04-12 0 1 1
5 1 1 2007-04-13 2 0 0
6 1 1 2007-04-14 0 0 0
7 1 1 2007-04-15 -2 1 1
8 1 1 2007-04-16 1 0 0
9 1 1 2007-04-17 -4 1 0
10 1 1 2007-04-18 0 1 0
11 1 1 2007-04-19 0 1 0
12 1 1 2007-04-20 0 1 0
2) sapply or using VarC from above:
n <- nrow(db)
db$VarC <- sapply(1:n, function(i) VarC(db$VarD[i:min(i+2, n)]))
3) for or using n and VarC from above:
db$VarC <- NA
for(i in 1:n) db$VarC[i] <- VarC(db$VarD[i:min(i+2, n)])
Note: The input db in reproducible form is:
Lines <- "VarA VarB Date Diff VarD VarC
1 1 2007-04-09 NA 0 0
1 1 2007-04-10 0 0 0
1 1 2007-04-11 -2 1 1
1 1 2007-04-12 0 1 1
1 1 2007-04-13 2 0 0
1 1 2007-04-14 0 0 0
1 1 2007-04-15 -2 1 1
1 1 2007-04-16 1 0 0
1 1 2007-04-17 -4 1 0
1 1 2007-04-18 0 1 0
1 1 2007-04-19 0 1 0
1 1 2007-04-20 0 1 0 "
db <- read.table(text = Lines, header = TRUE)
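The left-aligned partial rolling window that all three alternatives implement can also be sketched in plain Python (my own illustration; `var_c` mirrors the VarC function above):

```python
def var_c(vals):
    """If every remaining value in the (up to 3-wide) window is 1,
    return 0; otherwise keep the current value -- same rule as VarC."""
    return 0 if all(v == 1 for v in vals[1:]) else vals[0]

def roll_left(xs, width=3):
    """Left-aligned partial rolling apply, like
    rollapply(..., align = "left", partial = TRUE)."""
    return [var_c(xs[i:i + width]) for i in range(len(xs))]

var_d = [0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1]  # the VarD column
print(roll_left(var_d))
# -> [0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0]      (the VarC column)
```

The slicing `xs[i:i + width]` naturally shortens near the end of the list, which is exactly what `partial = TRUE` provides in zoo.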
Let's say I have 3 vectors (each of length 10):
X <- c(1,1,0,1,0, 1,1, 0, NA,NA)
H <- c(0,0,1,0,NA,1,NA,1, 1, 1 )
I <- c(0,0,0,0,0, 1,NA,NA,NA,1 )
Data.frame Y contains 10 columns and 6 rows:
1 2 3 4 5 6 7 8 9 10
0 1 0 0 1 1 1 0 1 0
1 1 1 0 1 0 1 0 0 0
0 0 0 0 1 0 0 1 0 1
1 0 1 1 0 1 1 1 0 0
0 0 0 0 0 0 1 0 0 0
1 1 0 1 0 0 0 0 1 1
I'd like to use the vectors X, H and I to make column selections in data.frame Y, using the 1's and 0's in the vector as the selection criterion.
So the result for vector X, using 1 as the selection criterion, should be:
X <- c(1,1,0,1,0, 1,1, 0, NA,NA)
1 2 4 6 7
0 1 0 1 1
1 1 0 0 1
0 0 0 0 0
1 0 1 1 1
0 0 0 0 1
1 1 1 0 0
For vector H, using 1 as the selection criterion:
H <- c(0,0,1,0,NA,1,NA,1, 1, 1 )
3 6 8 9 10
0 1 0 1 0
1 0 0 0 0
0 0 1 0 1
1 1 1 0 0
0 0 0 0 0
0 0 0 1 1
For vector I, using 1 as the selection criterion:
I <- c(0,0,0,0,0, 1,NA,NA,NA,1 )
6 10
1 0
0 0
0 1
1 0
0 0
0 1
For convenience and speed I'd like to use an apply-style loop. It might be something like this:
all.ones <- lapply[,function(x) x %in% 1]
In the outcome (all.ones), the result for each vector should stay separate. For example:
X 1,2,4,6,7
H 3,6,8,9,10
I 6,10
The standard way of doing this is using the %in% operator:
Y[, X %in% 1]
To do this for multiple vectors (assuming you want an AND operation):
mylist = list(X, H, I, D, E, K)
Y[, Reduce(`&`, lapply(mylist, function(x) x %in% 1))]
The problem is the NA, use which to get round it. Consider the following:
x <- c(1,0,1,NA)
x[x==1]
[1] 1 1 NA
x[which(x==1)]
[1] 1 1
How about this?
idx <- which(X==1)
Y[,idx]
EDIT: For six vectors, do
idx <- which(X==1 & H==1 & I==1 & D==1 & E==1 & K==1)
Y[,idx]
Replace & with | if you want all columns of Y where at least one of the lists has a 1.
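The NA pitfall is not R-specific: in any language the mask has to treat "unknown" as "not selected". A small Python sketch of the same idea (my own illustration; `None` stands in for NA):

```python
def select_columns(rows, mask):
    """Keep the columns whose mask entry is exactly 1; None (the NA
    stand-in) never selects, mirroring x %in% 1 / which(x == 1)."""
    keep = [j for j, m in enumerate(mask) if m == 1]
    return [[row[j] for j in keep] for row in rows]

y = [[0, 1, 0],
     [1, 1, 1]]
x = [1, None, 1]
print(select_columns(y, x))  # [[0, 0], [1, 1]]
```

The `m == 1` test is False for `None`, just as `%in%` and `which()` silently drop NA positions, whereas a naive truthiness test could let an unknown slip through.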
I have a txt file (data5.txt):
1 0 1 0 0
1 1 1 0 0
0 0 1 0 0
1 1 1 0 1
0 0 0 0 1
0 0 1 1 1
1 0 0 0 0
1 1 1 1 1
0 1 0 0 1
1 1 0 0 0
I need to count the frequency of 1's and 0's in each column.
If the frequency of 1's >= the frequency of 0's, then I want to print 1 after the last row for that column.
I'm new to R, but I tried this and got an error:
Error in if (z >= d) data[n, i] = 1 else data[n, i] = 0 :
missing value where TRUE/FALSE needed
my code:
data<-read.table("data5.txt", sep="")
m =length(data)
d=length(data[,1])/2
n=length(data[,1])+1
for(i in 1:m)
{
z=sum(data[,i])
if (z>=d) data[n,i]=1 else data[n,i]=0
}
You may try this:
rbind(df, ifelse(colSums(df == 1) >= colSums(df == 0), 1, NA))
# V1 V2 V3 V4 V5
# 1 1 0 1 0 0
# 2 1 1 1 0 0
# 3 0 0 1 0 0
# 4 1 1 1 0 1
# 5 0 0 0 0 1
# 6 0 0 1 1 1
# 7 1 0 0 0 0
# 8 1 1 1 1 1
# 9 0 1 0 0 1
# 10 1 1 0 0 0
# 11 1 1 1 NA 1
Update, thanks to a nice suggestion from @Arun:
rbind(df, ifelse(colSums(df == 1) >= ceiling(nrow(df)/2), 1, NA))
or even:
rbind(df, ifelse(colSums(df == 1) >= nrow(df)/2, 1, NA))
Thanks to @SvenHohenstein.
Possibly I misinterpreted your intended results. If you want 0 (rather than NA) when the frequency of 1's is not at least the frequency of 0's, then this suffices:
rbind(df, colSums(df) >= nrow(df) / 2)
Again, thanks to @SvenHohenstein for his useful comments!
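The column-wise majority rule itself is simple enough to sketch in Python (my own illustration; `None` stands in for NA):

```python
def majority_row(rows):
    """Return the row to append: 1 under each column where the count of
    1's is at least the count of 0's, None elsewhere -- the same rule as
    the rbind/ifelse answer."""
    cols = list(zip(*rows))  # transpose to iterate columns
    return [1 if col.count(1) >= col.count(0) else None for col in cols]

rows = [[1, 0, 1, 0, 0],
        [1, 1, 1, 0, 0],
        [0, 0, 1, 0, 1]]
print(majority_row(rows))  # [1, None, 1, None, None]
```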
I have a data frame that looks generally like this
df.data <- data.frame(x=sample(1:9, 10, replace = T), y=sample(1:9, 10, replace=T), vx=sample(-1:1, 10, replace=T), vy=sample(-1:1, 10, replace=T))
x and y are positions. vx and vy are the x, y components of a 2D vector. I want to take this data frame and "bin" it based on the x and y values, while performing a calculation on vx and vy. This function does that, except it uses a loop, which is going to be too slow for my data set.
slowWay <- function(df)
{
df.bin <- data.frame(expand.grid(x=0:3, y=0:3, vx=0, vy=0, count=0))
for(i in 1:nrow(df))
{
x.bin <- floor(df[i, ]$x / 3)
y.bin <- floor(df[i, ]$y / 3)
print(c(x.bin, y.bin))
df.bin[df.bin$x == x.bin & df.bin$y == y.bin, ]$vx = df.bin[df.bin$x == x.bin & df.bin$y == y.bin, ]$vx + df[i, ]$vx
df.bin[df.bin$x == x.bin & df.bin$y == y.bin, ]$vy = df.bin[df.bin$x == x.bin & df.bin$y == y.bin, ]$vy + df[i, ]$vy
df.bin[df.bin$x == x.bin & df.bin$y == y.bin, ]$count = df.bin[df.bin$x == x.bin & df.bin$y == y.bin, ]$count + 1
}
return(df.bin)
}
Is this type of 2D binning possible in a non looping way?
Here's another faster way to do it, one that includes unpopulated bin combinations:
fasterWay <- function(df.data) {
a1 <- aggregate(df.data[,3:4], list(x=floor(df.data$x/3), y=floor(df.data$y/3)), sum)
a2 <- aggregate(list(count=rep(NA,nrow(df.data))), list(x=floor(df.data$x/3), y=floor(df.data$y/3)), length)
result <- merge(expand.grid(y=0:3,x=0:3), merge(a1,a2), by=c("x","y"), all=TRUE)
result[is.na(result)] <- 0
result <- result[order(result$y, result$x),]
rownames(result) <- NULL
result
}
It gives me:
x y vx vy count
1 0 0 0 0 1
2 0 1 0 0 0
3 0 2 -1 -1 1
4 0 3 0 0 0
5 1 0 -1 -1 1
6 1 1 0 0 0
7 1 2 0 0 0
8 1 3 -1 0 2
9 2 0 -1 -1 1
10 2 1 0 0 0
11 2 2 -1 1 2
12 2 3 0 0 1
13 3 0 0 0 0
14 3 1 0 0 0
15 3 2 -1 0 1
16 3 3 0 0 0
This is one way, but you will probably need to do it in a couple of steps if you want the full record with unpopulated bin combinations:
> by(df.data[, c("vx", "vy")], # input data
list(x.bin=floor(df.data$x / 3), y.bin=floor(df.data$y / 3)), # grouping
function(df) sapply(df, function(x) c(Sum=sum(x), Count=length(x) ) ) ) #calcs
x.bin: 0
y.bin: 1
vx vy
Sum 0 1
Count 1 1
---------------------------------------------------------------------
x.bin: 1
y.bin: 1
vx vy
Sum 0 1
Count 2 2
---------------------------------------------------------------------
x.bin: 2
y.bin: 1
vx vy
Sum -1 -2
Count 2 2
---------------------------------------------------------------------
x.bin: 0
y.bin: 2
vx vy
Sum 1 0
Count 1 1
---------------------------------------------------------------------
x.bin: 1
y.bin: 2
NULL
---------------------------------------------------------------------
x.bin: 2
y.bin: 2
vx vy
Sum 2 1
Count 4 4
Here is a data.table version:
library(data.table)
dt.data<-as.data.table(df.data) # Convert to data.table
dt.data[,c("x.bin","y.bin"):=list(floor(x/3),floor(y/3))] # Add bin columns
setkey(dt.data,x.bin,y.bin)
dt.bin<-CJ(x=0:3, y=0:3) # Cross join to create bin combinations
dt.data.2<-dt.data[dt.bin,list(vx=sum(vx),vy=sum(vy),count=.N)] # Join the bins and data; sum vx/vy and count matching rows
dt.data.2[is.na(vx),vx:=0L] # Replace NA with 0
dt.data.2[is.na(vy),vy:=0L] # Replace NA with 0
dt.data.2[order(y.bin,x.bin)] # Display the final data.table output
## x.bin y.bin vx vy count
## 1: 0 0 0 0 0
## 2: 1 0 0 0 0
## 3: 2 0 1 1 1
## 4: 3 0 0 0 0
## 5: 0 1 0 0 0
## 6: 1 1 0 -2 3
## 7: 2 1 0 0 0
## 8: 3 1 0 0 0
## 9: 0 2 0 0 1
## 10: 1 2 0 0 0
## 11: 2 2 0 2 3
## 12: 3 2 -1 1 1
## 13: 0 3 0 0 0
## 14: 1 3 0 0 0
## 15: 2 3 0 0 0
## 16: 3 3 1 -1 1
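The grouping that all of these answers perform (aggregate, by, data.table join) amounts to a single pass with a dictionary of accumulators. A Python sketch of that idea (my own illustration; names are mine):

```python
from collections import defaultdict

def bin2d(points, size=3, nbins=4):
    """Sum vx/vy and count rows per (floor(x/size), floor(y/size)) bin,
    including empty bins -- the same grouping the R answers perform."""
    acc = defaultdict(lambda: [0, 0, 0])  # (bx, by) -> [vx, vy, count]
    for x, y, vx, vy in points:
        b = acc[(x // size, y // size)]
        b[0] += vx
        b[1] += vy
        b[2] += 1
    # Materialize every bin combination, like expand.grid / CJ:
    return {(bx, by): tuple(acc[(bx, by)])
            for bx in range(nbins) for by in range(nbins)}

pts = [(1, 1, 1, -1), (2, 2, 0, 1), (7, 1, -1, 0)]
print(bin2d(pts)[(0, 0)])  # (1, 0, 2)
```

One pass over the rows plus one pass over the bin grid, which is the same O(n) shape that makes the aggregate and data.table versions fast compared with the row-by-row data.frame updates in slowWay.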