I need to calculate the results of a very simple formula (a weighted average) that uses two variables (A and B) and two weight factors (A_prop and B_prop). The calculation has to be performed on a very large data set, and the weight factors are stored in another data frame that I call grid here.
My approach was first to create repetitions of the data for each combination of weight factors and then perform the calculations. Nothing strange so far. However, I then thought about calculating the values inside a loop. Everything seemed to be in place, but when I checked the results of both approaches they do not match: the results from the calculation inside the loop are incorrect.
I know I should just move on and keep the approach that gives me the correct results, especially because the number of rows is quite small. No big problem. However... I can't just live with this. I'm about to tear my hair out.
Can anyone explain to me why the results are not the same? What's wrong with the loop calculation?
Also, if you have any suggestion for a more elegant approach, it will be welcome.
(Note: this is my first time using a reprex. I hope it is as it should be.)
>require(tidyverse)
>require(magicfor)
>require(readxl)
>require(reprex)
> dput(dt)
structure(list(X = 1:5, A = c(83.73, 50.4, 79.59, 62.96, 0),
B = c(100, 86.8, 80.95, 81.48, 0), weight = c(201.6, 655,
220.5, 280, 94.5), ind = c(733L, 26266L, 6877L, 8558L, 16361L
)), class = "data.frame", row.names = c(NA, -5L))
> dput(grid)
structure(list(A_prop = c(0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8,
0.85, 0.9, 0.95, 1), B_prop = c(0.5, 0.45, 0.4, 0.35, 0.3, 0.25,
0.2, 0.15, 0.1, 0.05, 0), id = 1:11, tag = structure(1:11, .Label = c("Aprop_0.5",
"Aprop_0.55", "Aprop_0.6", "Aprop_0.65", "Aprop_0.7", "Aprop_0.75",
"Aprop_0.8", "Aprop_0.85", "Aprop_0.9", "Aprop_0.95", "Aprop_1"
), class = "factor")), class = "data.frame", row.names = c(NA,
-11L))
> foo<-function(data,i){
+ val<-(data$A*grid[i,1])+(data$B*grid[i,2])
+ return(val)
+ }
> magic_for(print, progress=FALSE,silent = TRUE)
> for(i in grid$id){
+
+ score<-(dt$A*grid[i,1])+(dt$B*grid[i,2])
+
+ weight=dt$weight
+ A<-dt$A
+ B<-dt$B
+
+ ind=dt$ind
+
+ print(score)
+ print(weight)
+ print(ind)
+ print(A)
+ print(B)
+ }
> rest<-magic_result_as_dataframe()
> magic_free()
> rest2<-left_join(rest,grid,by=c("i"="id"))%>%
+ arrange(ind,tag)%>%
+ mutate(score2=(A*A_prop)+(B*B_prop))
> head(rest2)
i score weight ind A B A_prop B_prop tag score2
1 1 91.8650 201.6 733 83.73 100 0.50 0.50 Aprop_0.5 91.8650
2 2 84.5435 201.6 733 83.73 100 0.55 0.45 Aprop_0.55 91.0515
3 3 86.1705 201.6 733 83.73 100 0.60 0.40 Aprop_0.6 90.2380
4 4 87.7975 201.6 733 83.73 100 0.65 0.35 Aprop_0.65 89.4245
5 5 89.4245 201.6 733 83.73 100 0.70 0.30 Aprop_0.7 88.6110
6 6 91.0515 201.6 733 83.73 100 0.75 0.25 Aprop_0.75 87.7975
The problem is actually your left_join and NOT the for loop. For future posts, I would recommend that you also provide a minimal(istic) example.
I will demonstrate what went wrong in your code.
Say, we have these data frames, which should be similar to your real-world data:
dt <- data.frame(
A = c(2,3,4),
B = c(20,30,40)
)
grid <- data.frame(
A_prop = c(0.5, 0.6),
B_prop = c(0.5, 0.4),
id = c(1,2),
tag = c("A_prop0.5", "A_prop0.6"))
We expect the following outputs:
Expected Output dt[1,] & A_prop 0.5 and B_prop 0.5
2 * 0.5 + 20 * 0.5 #= 11
Expected Output dt[2,] & A_prop 0.5 and B_prop 0.5
3 * 0.5 + 30 * 0.5 #= 16.5
Expected Output dt[3,] & A_prop 0.5 and B_prop 0.5
4 * 0.5 + 40 * 0.5 #= 22
Expected Output dt[1,] & A_prop 0.6 and B_prop 0.4
2 * 0.6 + 20 * 0.4 #= 9.2
Expected Output dt[2,] & A_prop 0.6 and B_prop 0.4
3 * 0.6 + 30 * 0.4 #= 13.8
Expected Output dt[3,] & A_prop 0.6 and B_prop 0.4
4 * 0.6 + 40 * 0.4 #= 18.4
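These six expected values can also be computed in one vectorised step, which is handy for checking later (a quick sketch using the toy dt and grid above):
# One column per grid row, one row per dt row
outer_scores <- sapply(seq_len(nrow(grid)),
                       function(i) dt$A * grid$A_prop[i] + dt$B * grid$B_prop[i])
outer_scores
##      [,1] [,2]
## [1,] 11.0  9.2
## [2,] 16.5 13.8
## [3,] 22.0 18.4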
I have never used the "magicfor" library, but the problem lies in your way of joining i and id.
I would write the for loop as follows:
l <- list()
for(i in grid$id){
score<-(dt$A*grid[i,1])+(dt$B*grid[i,2])
A<-dt$A
B<-dt$B
iteration <- rep(i, 3) # to keep track of the iteration in which the result was created.
l[[i]] <- list(
score = score,
A = A,
B = B,
iteration = iteration
)
}
Now I bind the list to a data frame and do the left_join as you did in your example:
l <- bind_rows(l)
l_merged <- grid %>% left_join(l, by = c("id"="iteration")) %>%
mutate(score2 = (A*A_prop + B*B_prop))
Now we test that score and score2 are the same:
transmute(l_merged, identical = score == score2)
identical
1 TRUE
2 TRUE
3 TRUE
4 TRUE
5 TRUE
6 TRUE
Now to the actual problem
I have adapted your code a little bit. I have added the iteration number to the output.
magic_for(print, progress=FALSE,silent = TRUE)
for(i in grid$id){
score<-(dt$A*grid[i,1])+(dt$B*grid[i,2])
A<-dt$A
B<-dt$B
iteration <- rep(i, 3)
print(score)
print(A)
print(B)
print(iteration)
}
rest<-magic_result_as_dataframe()
magic_free()
Now, if we look at the output and compare i and iteration, we can see that these are not identical. Therefore your left_join() has produced a confusing result.
rest %>% arrange(i)
i score A B iteration
1 1 11.0 2 20 1
2 1 22.0 4 40 1
3 1 13.8 3 30 2
4 2 16.5 3 30 1
5 2 9.2 2 20 2
6 2 18.4 4 40 2
To finalise, we can test it:
grid %>% left_join(rest, by = c("id"="i")) %>% # using i for the join
mutate(score2 = (A*A_prop + B*B_prop)) %>%
transmute(identical = score == score2)
identical
1 TRUE
2 TRUE
3 FALSE
4 FALSE
5 TRUE
6 TRUE
The join with i does not produce the correct results.
But the join with iteration will:
grid %>% left_join(rest, by = c("id"="iteration")) %>% # using the "manually" produced iteration for the join
mutate(score2 = (A*A_prop + B*B_prop)) %>%
transmute(identical = score == score2)
identical
1 TRUE
2 TRUE
3 TRUE
4 TRUE
5 TRUE
6 TRUE
I am not sure why the i from "magicfor" differs from the manually created iteration, so I certainly understand your confusion...
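As for the request for something more elegant: the loop (and magicfor) can be skipped entirely. Here is a minimal sketch, assuming the dt and grid from your reprex, where tidyr::crossing() builds every combination of a dt row and a grid row (essentially your "repetition" approach in one step):
library(tidyverse)
# One row per (dt row, weight combination); merge(dt, grid) would do the same in base R
all_scores <- crossing(dt, grid) %>%
  mutate(score = A * A_prop + B * B_prop) %>%
  arrange(ind, tag)
head(all_scores)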
Related
I'd like to create a new data frame (new.df) from an original one (df), with a new column (Age) computed via a complex operation with the mutate function from the dplyr package. My steps are:
# Artificial dataframe
IDtest<-c(1,1,1,1,1,1,2,2,2,3,3,3,3)
Class<-c(1,1,2,2,2,3,1,1,2,1,2,2,3)
Day<-c(0,47,76,100,150,173,0,47,76,0,47,76,100)
Area<-c(0.45,0.85,1.50,1.53,1.98,5.2,
0.36,0.58,1.2,
0.85,1.36,2.26,3.59)
df<-data.frame(cbind(IDtest, Class, Day, Area))
str(df)
#Split each IDtest
df[df[,1]==1,]
# IDtest Class Day Area
#1 1 1 0 0.45
#2 1 1 47 0.85
#3 1 2 76 1.50
#4 1 2 100 1.53
#5 1 2 150 1.98
#6 1 3 173 5.20
Inside each IDtest:
1) Take the last Area inside each Class minus a factor (if that Area < 1 the factor is 0; < 2.9 it is 1; < 8.9 it is 3; < 24.9 it is 9; > 25 it is 25); and
2) Then divide 1) by the difference between the last and first Area, itself divided by the Day inside each Class;
3) Then add to 1) and 2) the Day inside each Class minus the last Day. Finally, divide the sum of all of this by 365 and create a new column Age.
#For Class 1
(0.85-0)/((0.85-0.45)/47) + (47 - 0)
#For Class 2
(1.98-1)/((1.98-0.85)/150) + (157 - 47)
#For Class 3
(5.20-3)/((5.2-1.98)/173) + (173 - 150)
#Final
Age<-((0.85-0)/((0.85-0.45)/47) + (47 - 0) +
(1.98-1)/((1.98-0.85)/150) + (157 - 47) +
(5.20-3)/((5.2-1.98)/173) + (173 - 150))/365
Age
#[1] 1.44702
# Desirable output
new.df
# IDtest Class Day Area Age
#1 1 1 0 0.45 1.44702
#2 1 1 47 0.85 1.44702
#3 1 2 76 1.50 1.44702
#4 1 2 100 1.53 1.44702
#5 1 2 150 1.98 1.44702
#6 1 3 173 5.20 1.44702
Please any ideas?
It's quite tricky, so I have made all the steps separately, to make it easier for you to detect any possible misunderstanding.
Is it possible that a mistake exists in this line of yours?
(1.98-1)/((1.98-0.85)/150) + (157 - 47) # 157? wouldn't it be 150?
That said, my results for Class 1 are the same as yours, but please be careful with Classes 2 and 3, because I'm not sure I have properly understood the second and third steps; in particular, I'm not absolutely sure about your use of "last" (i.e. "last" within the Class or in the "previous" Class).
In the second step I use "last" within the Class, and in the third one I use a for loop to use "the previous". I think you can adapt the idea.
df2 <- df %>%
group_by(IDtest, Class) %>%
mutate(
DayOrder = row_number()
)
df2 <- df2 %>%
mutate(step1a = Area[max(DayOrder)], # I divide step1 in several steps to make it clearer
minus = # what you want to substract
case_when(
step1a < 1 ~ 0,
step1a < 2.9 ~ 1,
step1a < 8.9 ~ 3,
step1a < 24.9 ~ 9,
step1a > 25 ~ 25
),
step1done = step1a - minus,
step2a = Area[max(DayOrder)] - Area[min(DayOrder)], # "Last" inside the same Class (as it is inside mutate, which is under group_by)
step2b = Day[max(DayOrder)],
step2done = step2a / step2b,
step1by2 = step1done / step2done
)
df2$step3 <- NA
for (i in 1:max(df2$Class)){
if(i == 1){
df2$step3[df2$Class == i] <- max(df2$Day[df2$Class == i]) - 0 # quite silly
}else{
df2$step3[df2$Class == i] <- max(df2$Day[df2$Class == i]) - max(df2$Day[df2$Class == i - 1]) # "Last" as the "previous" Class, not inside the same Class
}}
df2 %>%
mutate(
step3done = step1by2 + step3,
Age = step3done / 365 # Do you want "Age" as a single value, not a value for each person? In that case I would do this outside mutate and add it as a new column
)
If I have misunderstood you, I hope you can at least take some ideas!
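For what it's worth, here is a more compact sketch of the same interpretation: it summarises each Class first and uses lag() for the "previous" Class values, and it takes 150 (the last Day of Class 2) where your formula had 157, so the result will be slightly lower than 1.44702 if 157 was intended.
library(dplyr)
age_df <- df %>%
  group_by(IDtest, Class) %>%
  summarise(first_area = first(Area),
            last_area  = last(Area),
            last_day   = last(Day),
            .groups    = "drop_last") %>%  # keeps the grouping by IDtest (dplyr >= 1.0)
  mutate(minus = case_when(last_area < 1    ~ 0,
                           last_area < 2.9  ~ 1,
                           last_area < 8.9  ~ 3,
                           last_area < 24.9 ~ 9,
                           TRUE             ~ 25),
         prev_area = lag(last_area, default = first(first_area)),
         prev_day  = lag(last_day, default = 0),
         term = (last_area - minus) / ((last_area - prev_area) / last_day) +
                (last_day - prev_day)) %>%
  summarise(Age = sum(term) / 365)
new.df <- df %>% left_join(age_df, by = "IDtest")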
I am creating a 3 x 3 data.tree of conditional probabilities and would like the last branch of each node to be equal to 1 - sum of the siblings. The idea is that I will tweak each node of the tree and want to make sure the probabilities at the level do not sum up to be greater than 100%.
I have looked at https://cran.r-project.org/web/packages/data.tree/vignettes/data.tree.html and see there is a way to define custom fields as a function, but how do I take such an example and allow the function to look at the nodes at the same level as itself?
library(data.tree)
Probability <- function(node) {
result <- node$prob
if (node$position == 3)
result <- 1 - node$parent$children[[1]]$prob - node$parent$children[[2]]$prob
else
sapply(node$children, Probability)
return(result)
}
df <- data.frame(pathString = c("R/1", "R/2", "R/3", "R/1/1", "R/1/2", "R/1/3", "R/2/1", "R/2/2", "R/2/3", "R/3/1", "R/3/2", "R/3/3"),
prob = c(0.1, 0.2, 0, 0.2, 0.2, 0, 0.5, 0.35, 0, 0.35, 0.35, 0))
tree <- as.Node(df)
tree$prob <- Probability(tree)
The output I get with
print(tree, "prob")
levelName prob
1 R NA
2 ¦--1 0.10
3 ¦ ¦--1 0.20
4 ¦ ¦--2 0.20
5 ¦ °--3 0.00
6 ¦--2 0.20
7 ¦ ¦--1 0.50
8 ¦ ¦--2 0.35
9 ¦ °--3 0.00
10 °--3 0.00
11 ¦--1 0.35
12 ¦--2 0.35
13 °--3 0.00
The 3rd branch is still 0, whereas I would have hoped it would be what I expected, which is 1 - prob[node1] - prob[node2]. So:
R/3 should be 0.7 (= 1 - 0.10 - 0.20)
R/1/3 should be 0.6 (= 1 - 0.20 - 0.20)
R/2/3 should be 0.15 (= 1 - 0.50 - 0.35)
R/3/3 should be 0.30 (= 1 - 0.35 - 0.35)
I am new to data.tree. Any help would be greatly appreciated!!
I have managed to do this using recursion and the Cumulate function
refresh.probs <- function(node) {
if (node$isRoot) {
sapply(node$children, refresh.probs)
} else {
if (node$position == node$parent$count) { # this is the last sibling at its level
node$prob <- 0 # zero it first so Cumulate() only sums the other siblings
node$prob <- 1 - Cumulate(node, "prob", sum)
}
if (!node$isLeaf)
sapply(node$children, refresh.probs)
}
}
refresh.probs(tree)
This seems to do the trick.
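As a quick check against the expected values in the question (just re-printing the tree after the refresh.probs(tree) call above):
print(tree, "prob")
# R/3 should now show 0.70, R/1/3 0.60, R/2/3 0.15 and R/3/3 0.30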
I am trying to identify the appropriate thresholds for two activities which generate the greatest success rate.
Listed below is an example of what I am trying to accomplish. For each location I am trying to identify the thresholds to use for activities 1 & 2, so that if either criterion is met then we would guess 'yes' (1). I then need to make sure that we are guessing 'yes' for only a certain percentage of the total volume for each location, and that we are maximizing our accuracy (our guess of 'yes' matches an 'outcome' of 1).
location <- c(1,2,3)
testFile <- data.frame(location = rep.int(location, 20),
activity1 = round(rnorm(20, mean = 10, sd = 3)),
activity2 = round(rnorm(20, mean = 20, sd = 3)),
outcome = rbinom(20,1,0.5)
)
set.seed(145)
act_1_thresholds <- seq(7,12,1)
act_2_thresholds <- seq(19,24,1)
I was able to accomplish this by creating a table that contains all of the possible unique combinations of thresholds for activities 1 & 2, and then merging it with each observation within the sample data set. However, with ~200 locations in the actual data set, each with thousands of observations, I quickly ran out of space.
I would like to create a function that takes the location id, the set of possible thresholds for activity 1, and the set for activity 2, and then calculates how often we would have guessed yes (i.e. the values in 'activity1' or 'activity2' exceed their respective thresholds being tested) to ensure our application rate stays within our desired range (50% - 75%). Then, among the sets of thresholds which produce an application rate within our desired range, we want to store only the set which maximizes accuracy, along with its location id, application rate, and accuracy rate. The desired output is listed below.
location act_1_thresh act_2_thresh application_rate accuracy_rate
1 1 13 19 0.52 0.45
2 2 11 24 0.57 0.53
3 3 14 21 0.67 0.42
I tried writing this as a for loop, but was not able to navigate my way through the number of nested conditions I would need in order to account for all of these requirements. I would appreciate assistance from anyone who has attempted a similar problem. Thank you!
An example of how to calculate the application and accuracy rate for a single set of thresholds is listed below.
### Create yard IDs
location <- c(1,2,3)
### Create a single set of thresholds
single_act_1_threshold <- 12
single_act_2_threshold <- 20
### Calculate the simulated application, and success rate of thresholds mentioned above using historical data
as.data.table(testFile)[,
list(
application_rate = round(sum(ifelse(single_act_1_threshold <= activity1 | single_act_2_threshold <= activity2, 1, 0))/
nrow(testFile),2),
accuracy_rate = round(sum(ifelse((single_act_1_threshold <= activity1 | single_act_2_threshold <= activity2) & (outcome == 1), 1, 0))/
sum(ifelse(single_act_1_threshold <= activity1 | single_act_2_threshold <= activity2, 1, 0)),2)
),
by = location]
Consider expand.grid, which builds a data frame of all combinations between the two threshold vectors. Then use Map to iterate elementwise over the two columns of that data frame to build a list of data tables (each of which now includes columns for the threshold values).
library(data.table) # for as.data.table() and rbindlist()
act_1_thresholds <- seq(7,12,1)
act_2_thresholds <- seq(19,24,1)
# ALL COMBINATIONS
thresholds_df <- expand.grid(th1=act_1_thresholds, th2=act_2_thresholds)
# USER-DEFINED FUNCTION
calc <- function(th1, th2)
as.data.table(testFile)[, list(
act_1_thresholds = th1, # NEW COLUMN
act_2_thresholds = th2, # NEW COLUMN
application_rate = round(sum(ifelse(th1 <= activity1 | th2 <= activity2, 1, 0)) /
nrow(testFile),2),
accuracy_rate = round(sum(ifelse((th1 <= activity1 | th2 <= activity2) & (outcome == 1), 1, 0)) /
sum(ifelse(th1 <= activity1 | th2 <= activity2, 1, 0)),2)
), by = location]
# LIST OF DATA TABLES
dt_list <- Map(calc, thresholds_df$th1, thresholds_df$th2)
# NAME ELEMENTS OF LIST
names(dt_list) <- paste(thresholds_df$th1, thresholds_df$th2, sep="_")
# SAME RESULT AS POSTED EXAMPLE
dt_list$`12_20`
# location act_1_thresholds act_2_thresholds application_rate accuracy_rate
# 1: 1 12 20 0.23 0.5
# 2: 2 12 20 0.23 0.5
# 3: 3 12 20 0.23 0.5
And if you need to append all elements use data.table's rbindlist:
final_dt <- rbindlist(dt_list)
final_dt
# location act_1_thresholds act_2_thresholds application_rate accuracy_rate
# 1: 1 7 19 0.32 0.47
# 2: 2 7 19 0.32 0.47
# 3: 3 7 19 0.32 0.47
# 4: 1 8 19 0.32 0.47
# 5: 2 8 19 0.32 0.47
# ---
# 104: 2 11 24 0.20 0.42
# 105: 3 11 24 0.20 0.42
# 106: 1 12 24 0.15 0.56
# 107: 2 12 24 0.15 0.56
# 108: 3 12 24 0.15 0.56
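From there, the selection step described in the question (keep only threshold pairs whose application rate falls in the 50%-75% window, then the most accurate pair per location) could be sketched as follows; note that with this small random example the window may be empty, so adjust the bounds as needed:
# Restrict to the desired application-rate window, then keep the best pair per location
best_dt <- final_dt[application_rate >= 0.5 & application_rate <= 0.75]
best_dt <- best_dt[, .SD[which.max(accuracy_rate)], by = location]
best_dt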
I am performing a computation that involves two data frames. I created reproducible examples of the two data frames below:
> df1
Day1 Day2 Day3 Day4 Day5 Day6 Day7 Day8 Day9 Day10
Time1 0.03 0.43 0.39 0.41 0.94 0.70 0.18 0.65 0.72 0.72
Time2 0.42 0.63 0.93 0.53 0.19 0.55 0.22 0.16 0.56 0.04
and
> df2
Day Time X3 X4 X5
1 1 1 9.252042 19.512621 11.601671
2 1 2 5.021522 17.712484 5.044728
3 2 1 9.603795 19.404302 17.206771
4 2 2 19.686793 18.791541 12.655874
5 3 1 7.546551 18.810526 19.865979
6 3 2 18.233872 19.596584 11.653980
7 4 1 17.499680 14.014276 15.553013
8 4 2 8.115352 17.898786 12.841630
9 5 1 10.719540 8.518823 19.126440
10 5 2 12.853401 6.026599 14.041490
11 6 1 19.984946 10.693528 6.890835
12 6 2 16.360035 15.778092 18.087471
13 7 1 15.498714 15.039444 5.259257
14 7 2 13.179111 17.533358 7.382507
15 8 1 5.124188 15.507194 12.547365
16 8 2 8.008336 10.463382 6.934014
17 9 1 11.246527 6.975527 14.464758
18 9 2 17.914083 18.039384 19.324091
19 10 1 9.876625 19.216317 8.787550
20 10 2 11.851955 15.729080 5.741095
The columns in df1 represent the days on which the values were recorded and the rows indicate the hours/times (Time 1 or 2). In df2, the first two columns represent the days and times respectively, and the other columns are the locations where data was recorded.
What I would like to do with R is create another data frame of the same size as df2 that divides the values in df2[,3:5] by the corresponding df1 value, i.e. depending on the values in the Day and Time columns of df2, select the corresponding value of df1.
For example, for the first value of df2$X3, in the new data frame I will have 9.252042 divided by 0.03, and for the third value of df2$X3, I will have 9.603795 divided by 0.43.
Thank you in advance for any help!
I suppose you create your data (df1 and df2) as below:
df1 = data.frame(time=c(1:10),time1=c(0.03,0.43,0.39,0.41,.94,.70,.18,.65,0.72,0.72),time2 = c(.42,.63,.93,.53,.19,.55,.22,.16,.56,.04))
df2 = data.frame(Day = rep(c(1:10),each=2),Time = rep(c(1,2),10),X3=c(9.2,5.02,9.6,19.6,7.5,18.2,17.4,8.1,10.7,12.8,19.9,16.3,15.4,13.1,5.1,8,11.2,17.9,9.8,11.8),X4=c(19.5,17.7,19.4,18.8,18,19.5,14.01,17.8,8.5,6,10.6,15.7,15,17.5,15,10,6,18,19,15),X5=c(11.6,5,17,12,19,11,15,12,19,14,6,18,5,7,12,6,14,19,8,5))
Then the code that you will need to create df3 is this:
w = ifelse(df2$Time == 1, df1$time1[df2$Day], df1$time2[df2$Day]) # pick the df1 value matching each row's Day and Time
df3 = data.frame(df2$Day,df2$Time,newx3 = df2$X3 / w,newx4 = df2$X4 / w,newx5 = df2$X5 / w)
My suggestion is to follow tidy data principles
Here I provide an example with the same structure as your dataframes but more simplified and only with days 1-3:
library(dplyr)
library(tidyr)
untidy = tibble(day1 = c(0.03,0.42), day2 = c(0.43,0.63), day3 = c(0.39,0.93))
tidy = tibble(day = c(1,1,2,2,3,3), time = c(1,2,1,2,1,2), val1 = c(9.252042,5.012522,9.603795,19.686793,7.546551,18.233872))
untidy_to_tidy = untidy %>%
gather(day,val2) %>%
mutate(day = as.double(gsub("day","",day)),
time = rep(c(1,2), (ncol(untidy) * nrow(untidy))/2)) %>%
select(day,time,val2)
tidy %>%
left_join(untidy_to_tidy, by = c("day","time")) %>%
mutate(division = val1 / val2)
If you are new to R, please keep it simple and do it like this:
read your CSV/TSV/etc. using read_csv("YOUR_FILE.CSV") from the readr package
in my example replace
untidy = tibble(day1 = c(0.03,0.42), day2 = c(0.43,0.63), day3 = c(0.39,0.93))
by
untidy = read_csv("YOUR_FILE.CSV")
and
tidy = tibble(day = c(1,1,2,2,3,3), time = c(1,2,1,2,1,2), val1 = c(9.252042,5.012522,9.603795,19.686793,7.546551,18.233872))
by
tidy = read_csv("YOUR_OTHER_FILE.CSV")
What you need to do is be careful: your two data frames happen to be arranged in exactly the right order, so an element-wise division lines up Day/Time correctly. The code is as follows:
df2[3:5]/unlist(df1)
X3 X4 X5
1 308.401400 650.420700 386.72237
2 11.956005 42.172581 12.01126
3 22.334407 45.126284 40.01575
4 31.248878 29.827843 20.08869
5 19.350131 48.232118 50.93841
6 19.606314 21.071596 12.53116
: : : :
: : : :
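If you would rather not rely on that row order, you can index df1 explicitly by (Time, Day) with a two-column index matrix. A sketch, assuming df1 as shown in the question (Time1/Time2 rows, Day1..Day10 columns):
# Pick df1[Time, Day] for every row of df2, regardless of row order
w <- as.matrix(df1)[cbind(df2$Time, df2$Day)]
df3 <- cbind(df2[1:2], df2[3:5] / w)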
Say I have two vectors with values that come from formulas:
A <- c(0.11, -0.11, -.20, .20, -0.18, 0.18)
B <- c(-0.11, 0.11, .20, -.20, 0.18, -0.18)
What I wish to accomplish is to merge the vectors into one vector where I have the first two values of A, then the 3rd and 4th values of B, then the 5th and 6th values of A (in the actual data set the vectors are 96 values long), to end up with:
V <- c(0.11, -0.11, .20, -.20, -0.18, 0.18)
I wish to accomplish the same with four vectors where it switches between vectors every 4 values. Seeing as the vectors are long, I don't want to have to resort to the use of indices.
I've fumbled around a lot with combinations of the c() and rbind() functions, but always end up merging incorrectly.
An example of code I've tried (with called objects substituted with possible values):
c(rbind(1.2 - (1.2 + 1.2/2),
        1.2 - 1.2/2),
  rbind(1.2 - 1.2/2,
        1.2 - (1.2 + 1.2/2)))
This would end up with the vectors being merged after the first one ends. I've tried different combinations, but none worked out for me.
Does anybody have a nifty trick up their sleeve?
Here's a wrapper function that will accept any number of vectors and give you the desired result (though the vectors are assumed to be of the same length):
Myfunc <- function(...){
temp <- cbind(...) # put the vectors side by side as columns
len <- ncol(temp)
# Build a (row, column) index matrix: column 1 for the first block of rows,
# column 2 for the next, and so on; the index recycles, hence suppressWarnings()
suppressWarnings(temp[cbind(seq(nrow(temp)), rep(seq(len), each = len))])
}
Myfunc(A, B)
## [1] 0.11 -0.11 0.20 -0.20 -0.18 0.18
On 4 vectors (Provided by OP in comments)
A <- 1:16 ; B <- 21:36 ; C <- 41:56 ; D <- 61:76
Myfunc(A, B, C, D)
## [1] 1 2 3 4 25 26 27 28 49 50 51 52 73 74 75 76
> (1:6) %% 4 %in% c(1,2)
[1] TRUE TRUE FALSE FALSE TRUE TRUE
> (1:12) %% 8 %in% c(1,2,3,4)
[1] TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE
etc.
or for your example
> D <- rep(0,6)
> D[(1:6) %% 4 %in% c(1,2)] <- A[(1:6) %% 4 %in% c(1,2)]
> D[!(1:6) %% 4 %in% c(1,2)] <- B[!(1:6) %% 4 %in% c(1,2)]
> D
[1] 0.11 -0.11 0.20 -0.20 -0.18 0.18
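For completeness, a sketch of the same idea generalised to the four-vector case (switching every 4 values; uses the A, B, C, D example from the other answer):
A <- 1:16 ; B <- 21:36 ; C <- 41:56 ; D <- 61:76
block <- ((seq_along(A) - 1) %/% 4) %% 4 + 1   # 1,1,1,1, 2,2,2,2, 3,3,3,3, 4,4,4,4, ...
out <- numeric(length(A))
for (k in 1:4) out[block == k] <- list(A, B, C, D)[[k]][block == k]
out
## [1]  1  2  3  4 25 26 27 28 49 50 51 52 73 74 75 76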