Compare one variable to other variables by group in R - r

I have the following data frame:
data.frame(id = c("a", "a", "a", "d", "d"),
value = c(5, 46, 12, 14, 32),
low = c(46, 8, NA, 0, 34),
high = c(56, 20, NA, 12, 60))
id value low high
1 a 5 46 56
2 a 46 8 20
3 a 12 NA NA
4 d 14 0 12
5 d 32 34 60
I need to set a new variable to TRUE if value falls outside every interval defined by low and high across all rows with the same id.
My desired dataframe would be:
id value low high result
1 a 5 46 56 TRUE # 5 not in 46-56, 8-20
2 a 46 8 20 FALSE # 46 in 46-56
3 a 12 NA NA FALSE # 12 in 8-20
4 d 14 0 12 TRUE # 14 not in 0-12, 34-60
5 d 32 34 60 TRUE # 32 not in 0-12, 34-60
How can I do it in base R? I work in a restrictive environment where I only have access to base R.

I figured out an ugly, unoptimized solution, but it works. Note that it builds each interval with seq(), so it only handles whole-number values. Here is the code:
df <- data.frame(id = c("a", "a", "a", "d", "d"),
value = c(5, 46, 12, 14, 32),
low = c(46, 8, NA, 0, 34),
high = c(56, 20, NA, 12, 60))
list.inter <- list()
for(i in 1:nrow(df)){
if(is.na(df$low[i]) | is.na(df$high[i])) {
list.inter[[i]] <- NA
}else{
list.inter[[i]] <- seq(from = df$low[i], to = df$high[i])
}
}
result <- c()
for(i in 1:nrow(df)){
result[i] <- ! df$value[i] %in% unlist(list.inter[which(df$id[i]==df$id)])
}
df$result <- result
I hope it helps and I am curious to see some optimized code from other users!
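A vectorized base R sketch of the same idea, for comparison. It assumes inclusive bounds and that rows with a missing low or high simply contribute no interval; the helper name outside_all is made up for illustration. Because it compares values against the bounds directly, it is not limited to whole numbers.

```r
df <- data.frame(id = c("a", "a", "a", "d", "d"),
                 value = c(5, 46, 12, 14, 32),
                 low = c(46, 8, NA, 0, 34),
                 high = c(56, 20, NA, 12, 60))

# Hypothetical helper: for one id group, TRUE when a value lies outside
# every complete interval of that group.
outside_all <- function(g) {
  ok <- !is.na(g$low) & !is.na(g$high)   # skip rows with a missing bound
  # inside[i, j] is TRUE when value i falls within interval j (inclusive)
  inside <- outer(g$value, g$low[ok], ">=") & outer(g$value, g$high[ok], "<=")
  rowSums(inside) == 0
}

# unsplit() puts the per-group results back in the original row order
df$result <- unsplit(lapply(split(df, df$id), outside_all), df$id)
df
```

The outer() calls build one small logical matrix per group, so memory grows with the square of the largest group rather than with the whole table.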

Without apply, sapply and map function:
isInDataframe <- function(data, value = "value", from = "low", to = "high", id = "id"){
result <- c()
for (i in 1:length(data[,1])) {
deeta <- data[data[id] == as.character(data[id][i,1]),]
subresult <- c()
for (j in 1:nrow(deeta)) {
subresult[j] <- (data[value][i,1] >= deeta[from][j,1] & data[value][i,1] <= deeta[to][j,1])
}
result[i] <- !any(subresult, na.rm = TRUE)
}
data$result <- result
return(data)
}
isInDataframe(data = data, value = "value", from = "low", to = "high", id = "id")
id value low high result
1 a 5 46 56 TRUE
2 a 46 8 20 FALSE
3 a 12 NA NA FALSE
4 d 14 0 12 TRUE
5 d 32 34 60 TRUE

I finally chose to put id and value in one data frame and id, low and high in another data frame for this analysis.
However, here is a solution for the original single data frame, heavily inspired by the solutions suggested for that new approach:
df <- data.frame(id = c("a", "a", "a", "d", "d"),
value = c(5, 46, 12, 14, 32),
low = c(46, 8, NA, 0, 34),
high = c(56, 20, NA, 12, 60))
temp <- merge(x = df[c("id",
"value")],
y = df[c("id",
"low",
"high")])
temp$result <- temp$value < temp$low | temp$value > temp$high
merge(x = df,
y = aggregate(result ~ id + value, # formula passed by position so it works across R versions
data = temp,
FUN = all))
id value low high result
1 a 12 NA NA FALSE
2 a 46 8 20 FALSE
3 a 5 46 56 TRUE
4 d 14 0 12 TRUE
5 d 32 34 60 TRUE

Related

Routine for non-manual argument of a set of variables in coalesce() dplyr function [duplicate]

I have a list of dfs to be combined into one. These dfs have some matching columns and rows and some distinct or missing ones.
The minimum structure (for understanding) of the first two dfs.
df1:
df1 <- structure(list(id = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6),
Name = c("LI","NO","WH","MA","BU","SO","FO","AT","CO","IN","SP","CE"),
H_A = c("H", "A", "H", "A", "H", "A", "H", "A", "H", "A", "H", "A"),
W = c(15, 13, 5, 13, 9, 12, 10, 13, 1, 8, 4, 2),
X = c(NA, NA, NA, NA, NA, NA, 12, 7, 5, 13, 1, 3),
Y = c(0, 0, 0, 0, 0,0, NA, NA, NA, NA, NA, NA)),
row.names = c(NA,-12L), class = c("tbl_df","tbl", "data.frame"))
df2:
df2 <- structure(list(id = c(1, 1, 2, 2, 3, 3),
Name = c("LI","NO", "WH", "MA", "BU", "SO"),
H_A = c("H", "A", "H", "A", "H", "A"),
W = c(15, 13, 5, 13, 9, 12),
X = c(10, 12, 11, 15, 6, 14),
Z = c(4, 14, 16, 16, 25, 30)),
row.names = c(NA,-6L),class = c("tbl_df", "tbl", "data.frame"))
This can be solved with this alternative:
df_combined <- full_join(df1, df2, by = c("id", "Name", "H_A")) %>%
mutate(X = coalesce(X.x, X.y),
W = coalesce(W.x, W.y)) %>%
select(-contains("."))
I would like to automate this routine so that the variables passed to coalesce() inside mutate() do not have to be typed by hand. In the real data there are several variables like X and W above. I also need to repeat the routine for df3, df4 and df5, which share the same minimal set of matching columns with df1.
Joins do not fill in missing values on their own, so a fix has to be applied after the join. Although you can use if_else statements as shown in the answer above, coalesce() is a much cleaner function to use.
See this post for another example (this could arguably be seen as a repeated question):
Using dplyr to fill in missing values (through a join?)
library(tidyverse)
df_test <- full_join(df1, df2, by = c("id", "Name", "H_A")) %>%
mutate(X = coalesce(X.x, X.y),
W = coalesce(W.x, W.y)) %>%
select(id, Name, H_A, W, X, Y, Z)
df_test == df_combined
id Name H_A W X Y Z
[1,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[2,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[3,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[4,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[5,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[6,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[7,] TRUE TRUE TRUE TRUE TRUE NA NA
[8,] TRUE TRUE TRUE TRUE TRUE NA NA
[9,] TRUE TRUE TRUE TRUE TRUE NA NA
[10,] TRUE TRUE TRUE TRUE TRUE NA NA
[11,] TRUE TRUE TRUE TRUE TRUE NA NA
[12,] TRUE TRUE TRUE TRUE TRUE NA NA
The NAs are expected: comparing two NAs with a simple == returns NA rather than TRUE.
You can use left_join from dplyr and substitute the NAs like this, where I am guessing id and H_A together make up a key:
library(dplyr)
df1 <- structure(list(id = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6),
Name = c("LI","NO","WH","MA","BU","SO","FO","AT","CO","IN","SP","CE"),
H_A = c("H", "A", "H", "A", "H", "A", "H", "A", "H", "A", "H", "A"),
W = c(15, 13, 5, 13, 9, 12, 10, 13, 1, 8, 4, 2),
X = c(NA, NA, NA, NA, NA, NA, 12, 7, 5, 13, 1, 3),
Y = c(0, 0, 0, 0, 0,0, NA, NA, NA, NA, NA, NA)),
row.names = c(NA,-12L), class = c("tbl_df","tbl", "data.frame"))
df2 <- structure(list(id = c(1, 1, 2, 2, 3, 3),
Name = c("LI","NO", "WH", "MA", "BU", "SO"),
H_A = c("H", "A", "H", "A", "H", "A"),
W = c(15, 13, 5, 13, 9, 12),
X = c(10, 12, 11, 15, 6, 14),
Z = c(4, 14, 16, 16, 25, 30)),
row.names = c(NA,-6L),class = c("tbl_df", "tbl", "data.frame"))
df_combined <- left_join(df1,
df2 %>%
select(id, H_A, "df2_X" = X, Z)) %>%
mutate(X = if_else(is.na(X), df2_X, X)) %>%
select(-df2_X)
#> Joining, by = c("id", "H_A")
df_combined
#> # A tibble: 12 × 7
#> id Name H_A W X Y Z
#> <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 1 LI H 15 10 0 4
#> 2 1 NO A 13 12 0 14
#> 3 2 WH H 5 11 0 16
#> 4 2 MA A 13 15 0 16
#> 5 3 BU H 9 6 0 25
#> 6 3 SO A 12 14 0 30
#> 7 4 FO H 10 12 NA NA
#> 8 4 AT A 13 7 NA NA
#> 9 5 CO H 1 5 NA NA
#> 10 5 IN A 8 13 NA NA
#> 11 6 SP H 4 1 NA NA
#> 12 6 CE A 2 3 NA NA
data.table approach
library(data.table)
# set to data.table format
setDT(df1); setDT(df2)
# perform an update join, overwriting NA-values in W, X and Y, and
# adding Z, based on key-columns ID, Name and H_A
df1[df2, `:=`(W = ifelse(is.na(W), i.W, W),
X = ifelse(is.na(X), i.X, X),
Y = ifelse(is.na(Y), i.Y, Y),
Z = i.Z),
on = .(id, Name, H_A)][]
# id Name H_A W X Y Z
# 1: 1 LI H 15 10 0 4
# 2: 1 NO A 13 12 0 14
# 3: 2 WH H 5 11 0 16
# 4: 2 MA A 13 15 0 16
# 5: 3 BU H 9 6 0 25
# 6: 3 SO A 12 14 0 30
# 7: 4 FO H 10 12 NA NA
# 8: 4 AT A 13 7 NA NA
# 9: 5 CO H 1 5 NA NA
#10: 5 IN A 8 13 NA NA
#11: 6 SP H 4 1 NA NA
#12: 6 CE A 2 3 NA NA
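For the original request (not typing each coalesce() by hand, and repeating the step for df3, df4, ...), here is a base R sketch. The helper name coalesce_merge and the small stand-in tables d1, d2, d3 are made up for illustration; it assumes every column shared by both tables, other than the keys, should be filled left to right, and that no original column name already ends in ".x" or ".y".

```r
# Hypothetical helper: full join on `keys`, then collapse every shared
# non-key column pair (col.x, col.y) into a single column `col`.
coalesce_merge <- function(a, b, keys) {
  shared <- setdiff(intersect(names(a), names(b)), keys)
  out <- merge(a, b, by = keys, all = TRUE, suffixes = c(".x", ".y"))
  for (col in shared) {
    x <- out[[paste0(col, ".x")]]
    y <- out[[paste0(col, ".y")]]
    out[[col]] <- ifelse(is.na(x), y, x)   # keep left value, fall back to right
    out[[paste0(col, ".x")]] <- NULL
    out[[paste0(col, ".y")]] <- NULL
  }
  out
}

# Small stand-in tables (made-up values, same shape as the question's data)
d1 <- data.frame(id = c(1, 2, 3), W = c(15, 5, NA), X = c(NA, NA, 12), Y = c(0, 0, NA))
d2 <- data.frame(id = c(1, 2), W = c(15, 5), X = c(10, 11), Z = c(4, 16))
d3 <- data.frame(id = 3, W = 9)

# Fold the whole list of tables into one
combined <- Reduce(function(a, b) coalesce_merge(a, b, keys = "id"),
                   list(d1, d2, d3))
combined
```

With the question's data the keys would be c("id", "Name", "H_A"). Note that ifelse() drops attributes, so factor or date columns would need a different fill step.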

Is there a way to create new columns in R based on manipulations from multiple data frames?

Does anyone know if it is possible to use a variable in one dataframe (in my case the "deploy" dataframe) to create a variable in another dataframe?
For example, I have two dataframes:
df1:
deploy <- data.frame(ID = c("20180101_HH1_1_1", "20180101_HH1_1_2", "20180101_HH1_1_3"),
Site_Depth = c(42, 93, 40), Num_Depth_Bins_Required = c(5, 100, 4),
Percent_Column_in_each_bin = c(20, 10, 25))
df2:
sp.c <- data.frame(species = c("RR", "GS", "GT", "BR", "RS", "BA", "GS", "RS", "SH", "RR"),
ct = c(25, 66, 1, 12, 30, 6, 1, 22, 500, 6),
percent_dist_from_surf = c(11, 15, 33, 68, 71, 100, 2, 65, 5, 42))
I want to create new columns in df2 that assigns each species and count to a bin based on the Percent_Column_in_each_bin for each ID. For example, in 20180101_HH1_1_3 there would be 4 bins that each make up 25% of the column and all species that are within 0-25% of the column (in df2) would be in bin 1 and species within 25-50% of the column would be in depth bin 2, and so on. What I'm imagining this looking like is:
i.want.this <- data.frame(species = c("RR", "GS", "GT", "BR", "RS", "BA", "GS", "RS", "SH", "RR"),
ct = c(25, 66, 1, 12, 30, 6, 1, 22, 500, 6),
percent_dist_from_surf = c(11, 15, 33, 68, 71, 100, 2, 65, 5, 42),
'20180101_HH1_1_1_Bin' = c(1, 1, 2, 4, 4, 5, 1, 4, 1, 3),
'20180101_HH1_1_2_Bin' = c(2, 2, 4, 7, 8, 10, 1, 7, 1, 5),
'20180101_HH1_1_3_Bin' = c(1, 1, 2, 3, 3, 4, 1, 3, 1, 2))
I am pretty new to R and I'm not sure how to make this happen. I need to do this for over 100 IDs (all with different depths, number of depth bins, and percent of the column in each bin) so I was hoping that I don't need to do them all by hand. I have tried mutate in dplyr but I can't get it to pull from two different dataframes. I have also tried ifelse statements, but I would need to run the ifelse statement for each ID individually.
I don't know if what I am trying to do is possible but I appreciate the feedback. Thank you in advance!
Edit: my end goal is to find the max count (max ct) for each species within each bin for each ID. What I've been doing to find this (using the bins generated with suggestions from @Ben) is using dplyr to slice out the row with the max ct, like this:
`20180101_HH1_1_1` <- sp.c %>%
group_by(`20180101_HH1_1_1`, species) %>%
arrange(desc(ct)) %>%
slice(1) %>%
group_by(`20180101_HH1_1_1`) %>%
mutate(Count_Total_Per_Bin = sum(ct)) %>%
group_by(species, .add = TRUE) %>%
mutate(species_percent_of_total_in_bin =
paste0(100 * ct / Count_Total_Per_Bin)) %>%
mutate(ID = "20180101_HH1_1_1") %>%
ungroup()
but I have to do this for over 100 IDs. My desired output would be something like:
end.goal <- data.frame(ID = c(rep("20180101_HH1_1_1", 8)),
species = c("RR", "GS", "SH", "GT", "RR", "BR", "RS", "BA"),
bin = c(1, 1, 1, 2, 3, 4, 4, 5),
Max_count_of_each_species_in_each_bin = c(11, 66, 500, 1, 6, 12, 30, 6),
percent_dist_from_surf = c(11, 15, 5, 33, 42, 68, 71, 100),
percent_each_species_max_in_each_bin = c((11/577)*100, (66/577)*100, (500/577)*100, 100, 100, (12/42)*100, (30/42)*100, 100))
I was thinking that by answering the original question I could get to this but I see now that there's still a lot you have to do to get this for each ID.
Here is another approach, which does not require a loop.
Using sapply, you can apply cut to every percent_dist_from_surf value in sp.c, once for each bin width in your deploy dataframe.
res <- sapply(deploy$Percent_Column_in_each_bin, function(x) {
cut(sp.c$percent_dist_from_surf, seq(0, 100, by = x), include.lowest = TRUE, labels = 1:(100/x))
})
colnames(res) <- deploy$ID
cbind(sp.c, res)
Or using purrr:
library(purrr)
cbind(sp.c, imap(setNames(deploy$Percent_Column_in_each_bin, deploy$ID),
~ cut(sp.c$percent_dist_from_surf, seq(0, 100, by = .x), include.lowest = TRUE, labels = 1:(100/.x))
))
Output
species ct percent_dist_from_surf 20180101_HH1_1_1 20180101_HH1_1_2 20180101_HH1_1_3
1 RR 25 11 1 2 1
2 GS 66 15 1 2 1
3 GT 1 33 2 4 2
4 BR 12 68 4 7 3
5 RS 30 71 4 8 3
6 BA 6 100 5 10 4
7 GS 1 2 1 1 1
8 RS 22 65 4 7 3
9 SH 500 5 1 1 1
10 RR 6 42 3 5 2
Edit:
To determine the maximum ct value for each species, site, and bin, put the combined result from above into a dataframe called res and do the following.
First, reshape it into long form with pivot_longer. Then you can group_by species, site, and bin, and determine the maximum ct for each combination.
library(tidyverse)
res %>%
pivot_longer(cols = starts_with("2018"), names_to = "site", values_to = "bin") %>%
group_by(species, site, bin) %>%
summarise(max_ct = max(ct)) %>%
arrange(site, bin)
Output
# A tibble: 26 x 4
# Groups: species, site [21]
species site bin max_ct
<fct> <chr> <fct> <dbl>
1 GS 20180101_HH1_1_1 1 66
2 RR 20180101_HH1_1_1 1 25
3 SH 20180101_HH1_1_1 1 500
4 GT 20180101_HH1_1_1 2 1
5 RR 20180101_HH1_1_1 3 6
6 BR 20180101_HH1_1_1 4 12
7 RS 20180101_HH1_1_1 4 30
8 BA 20180101_HH1_1_1 5 6
9 GS 20180101_HH1_1_2 1 1
10 SH 20180101_HH1_1_2 1 500
11 GS 20180101_HH1_1_2 2 66
12 RR 20180101_HH1_1_2 2 25
13 GT 20180101_HH1_1_2 4 1
14 RR 20180101_HH1_1_2 5 6
15 BR 20180101_HH1_1_2 7 12
16 RS 20180101_HH1_1_2 7 22
17 RS 20180101_HH1_1_2 8 30
18 BA 20180101_HH1_1_2 10 6
19 GS 20180101_HH1_1_3 1 66
20 RR 20180101_HH1_1_3 1 25
21 SH 20180101_HH1_1_3 1 500
22 GT 20180101_HH1_1_3 2 1
23 RR 20180101_HH1_1_3 2 6
24 BR 20180101_HH1_1_3 3 12
25 RS 20180101_HH1_1_3 3 30
26 BA 20180101_HH1_1_3 4 6
It is helpful to distinguish between the contents of your two dataframes.
df2 appears to contain measurements from some sites
df1 appears to contain parameters by which you want to process/summarise the measurements in df2
Given these different purposes of the two dataframes, your best approach is probably to loop over all the rows of df1 each time adding a column to df2. Something like the following:
max_dist = max(df2$percent_dist_from_surf)
for(ii in 1:nrow(df1)){
# extract parameters
this_ID = df1[[ii,"ID"]]
this_depth = df1[[ii,"Site_Depth"]]
this_bins = df1[[ii,"Num_Depth_Bins_Required"]]
this_percent = df1[[ii,"Percent_Column_in_each_bin"]]
# add column to df2
df2 = df2 %>%
mutate(!!sym(this_ID) := insert_your_calculation_here)
}
The !!sym(this_ID) := part of the code is to allow dynamic naming of your output columns.
And as best I can determine, the formula you want for insert_your_calculation_here is ceiling(percent_dist_from_surf / max_dist * this_bins)
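A runnable base R sketch of that loop, under a slightly different assumption: it derives the bin from Percent_Column_in_each_bin (the bin width) rather than from Num_Depth_Bins_Required, and it uses pmax() so that a value of exactly 0 lands in bin 1. The "_Bin" column suffix follows the desired output shown in the question.

```r
deploy <- data.frame(ID = c("20180101_HH1_1_1", "20180101_HH1_1_2", "20180101_HH1_1_3"),
                     Percent_Column_in_each_bin = c(20, 10, 25))
sp.c <- data.frame(species = c("RR", "GS", "GT", "BR", "RS", "BA", "GS", "RS", "SH", "RR"),
                   ct = c(25, 66, 1, 12, 30, 6, 1, 22, 500, 6),
                   percent_dist_from_surf = c(11, 15, 33, 68, 71, 100, 2, 65, 5, 42))

for (ii in seq_len(nrow(deploy))) {
  width <- deploy$Percent_Column_in_each_bin[ii]          # bin width in percent
  bins <- ceiling(sp.c$percent_dist_from_surf / width)    # 0-20 -> 1, 20-40 -> 2, ...
  # assumption: a distance of exactly 0 belongs to the first bin
  sp.c[[paste0(deploy$ID[ii], "_Bin")]] <- pmax(bins, 1)
}
sp.c
```

Assigning with [[ ]] keeps the column names exactly as written, including the leading digits, which data.frame() would otherwise alter.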

Matching data replacement in R

I have two datasets with the same dimensions and the same column names. The goal is to check whether NA values exist in one of the datasets and replace them with the corresponding values from the other dataset, as shown in the example below.
I tried to solve the problem with nested for loops, but that failed.
df is a new data frame created with NAs.
loop = for (a in 1:nrow(data1)) {
for (b in 1:ncol(data1)) {
for (c in 1:nrow(data2)) {
for (d in 1:ncol(data2)) {
for (x in 1:nrow(df)) {
for (y in 1:ncol(df)) {
df[x,y]<- ifelse(data1[a,b] != "NA", data1[a,b], data2[c,d])
return(df)
}
}
}
}
}
}
Example
# The first data frame
structure(list(age = c(23, 22, 21, 20), gender = c("M", "F",
NA, "F")), row.names = c(NA, -4L), class = c("tbl_df", "tbl",
"data.frame"))
# age gender
# 1 23 M
# 2 22 F
# 3 21 NA
# 4 20 F
# The second data frame
structure(list(age = c(23, 22, 21, 20), gender = c("M", "F",
"M", "F")), row.names = c(NA, -4L), class = c("tbl_df", "tbl",
"data.frame"))
# age gender
# 1 23 M
# 2 22 F
# 3 21 M
# 4 20 F
Desired output
Age Gender
23 M
22 F
21 M
20 F
You might try this:
library(tibble)
df1 <- tibble(age = c(23,22,21,20),
gender = c("M", "F", NA, "F"))
# -------------------------------------------------------------------------
#> df1
# # A tibble: 4 x 2
# age gender
# <dbl> <chr>
# 1 23 M
# 2 22 F
# 3 21 NA
# 4 20 F
# -------------------------------------------------------------------------
df2 <- tibble(age = c(23,22,21,20),
gender = c("M", "F", "M", "F"))
# -------------------------------------------------------------------------
#> df2
# # A tibble: 4 x 2
# age gender
# <dbl> <chr>
# 1 23 M
# 2 22 F
# 3 21 M
# 4 20 F
# -------------------------------------------------------------------------
# get the na in df1 of gender var
df1.na <- is.na(df1$gender)
#> df1.na
# [1] FALSE FALSE TRUE FALSE
# -------------------------------------------------------------------------
# use the values in df2 to replace na in df1 (Note that this is index based)
df1$gender[df1.na] <- df2$gender[df1.na]
df1
# -------------------------------------------------------------------------
#> df1
# A tibble: 4 x 2
# age gender
# <dbl> <chr>
# 1 23 M
# 2 22 F
# 3 21 M
# 4 20 F
# -------------------------------------------------------------------------
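The same index-based replacement can be applied to every column instead of only gender. This is a minimal sketch, assuming both data frames have the same column names and the same row order; data1 and data2 are the names used in the question.

```r
data1 <- data.frame(age = c(23, 22, 21, 20),
                    gender = c("M", "F", NA, "F"),
                    stringsAsFactors = FALSE)
data2 <- data.frame(age = c(23, 22, 21, 20),
                    gender = c("M", "F", "M", "F"),
                    stringsAsFactors = FALSE)

# Fill column by column so each column keeps its own type
for (col in names(data1)) {
  miss <- is.na(data1[[col]])                  # positions to fill in this column
  data1[[col]][miss] <- data2[[col]][miss]     # take the value at the same row
}
data1
```

Working one column at a time avoids converting the whole data frame to a character matrix, which is what happens when a mixed-type data frame is indexed with a logical matrix.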
This can be done using the natural_join function from the rqdatatable library. The function does require an index to merge on, so we will need to create one.
Creating a reproducible example will help other people help you. Here I've created two simple data frames that should cover most cases for your problem.
# Create example data
tbl1 <-
data.frame(
w = c(1, 2, 3, 4),
x = c(1, 2, 3, NA),
y = c(1, 2, 3, 4),
z = c(1, NA, NA, NA)
)
tbl2 <-
data.frame(
w = c(9, 9, 9, 9), # check value doesnt overwrite value,
x = c(1, 2, 3, 4), # check na gets filled in
y = c(1, 2, 3, NA), # check NA doesnt overwrite value
z = c(9, NA, NA, NA) # check NA in both stays NA
)
# Create join index
tbl1$indx <- 1:nrow(tbl1)
tbl2$indx <- 1:nrow(tbl2)
# Use natural_join
library("rqdatatable")
natural_join(tbl1, tbl2, by = "indx")

R Recode Variables In A Loop

Ciao,
Here is a reproducible example.
df <- data.frame("STUDENT"=c(1,2,3,4,5),
"TEST1"=c(6,88,17,5,18),
"TEST2"=c(34,NA,87,88,82),
"TEST3"=c(87,62,13,8,71),
"TEST1NEW"=c(0,1,0,0,0),
"TEST2NEW"=c(0,NA,1,1,1),
"TEST3NEW"=c(1,1,0,0,1))
Given a data frame df with STUDENT, TEST1, TEST2 and TEST3, I want to create TEST1NEW, TEST2NEW and TEST3NEW such that each new variable equals 1 when the corresponding TEST variable is greater than or equal to 50 and 0 when it is below 50. My attempt below is insufficient, and I believe this may require a loop.
COLUMNS <- c("TEST1", "TEST2", "TEST3")
df[paste0(COLUMNS)] <- replace(df[COLUMNS],df[COLUMNS] < 50, 0 , 1, NA)
You could do
df[, paste0("TEST", 1:3, "_NEW")] <- as.integer(df[,-1] >= 50)
df
# STUDENT TEST1 TEST2 TEST3 TEST1_NEW TEST2_NEW TEST3_NEW
#1 1 6 34 87 0 0 1
#2 2 88 NA 62 1 NA 1
#3 3 17 87 13 0 1 0
#4 4 5 88 8 0 1 0
#5 5 18 82 71 0 1 1
data
df <- data.frame(
"STUDENT" = c(1, 2, 3, 4, 5),
"TEST1" = c(6, 88, 17, 5, 18),
"TEST2" = c(34, NA, 87, 88, 82),
"TEST3" = c(87, 62, 13, 8, 71)
)
In cases where the assignment is more complex, we can make use of dplyr::case_when
library(dplyr)
df[, paste0("TEST", 1:3, "_NEW")] <- case_when(df[,-1] < 20 ~ 4L,
df[,-1] >= 65 ~ 8L,
is.na(df[,-1]) ~ NA_integer_,
TRUE ~ 7L)
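Selecting the test columns by name, rather than with df[,-1], keeps the recode safe to rerun after the new columns exist. A base R sketch, assuming the TEST1NEW naming from the question and that NA scores should stay NA:

```r
df <- data.frame(STUDENT = c(1, 2, 3, 4, 5),
                 TEST1 = c(6, 88, 17, 5, 18),
                 TEST2 = c(34, NA, 87, 88, 82),
                 TEST3 = c(87, 62, 13, 8, 71))

COLUMNS <- c("TEST1", "TEST2", "TEST3")

# lapply() plays the role of the loop: one recode per selected column.
# The comparison returns NA for NA scores, and as.integer() keeps that NA.
df[paste0(COLUMNS, "NEW")] <- lapply(df[COLUMNS], function(x) as.integer(x >= 50))
df
```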

Calculate means of specific rows in a column when meeting condition from another column

Essentially, I need to calculate means of values in rows under certain conditions.
Name = c("A", "A", "A", "A", "B", "B", "B", "B")
temp = c(22, 22, 26, 23, 18, 20, 18, 17)
peak = c(0, 0, 1, 0, 0, 1, 0, 0)
new = NA
d<- data.frame(Name, temp, peak, new)
When peak = 1, calculate the average of temp i-1 and i+1, place that value in 'new' column. Otherwise, the value in new should be the same as temp. I would like to do this only within "Name" groups so that group A temp values are not mixed with group B.
Then, the output will look like this:
Name temp peak new
1 A 22 0 22.0
2 A 22 0 22.0
3 A 26 1 22.5
4 A 23 0 23.0
5 B 18 0 18.0
6 B 20 1 18.0
7 B 18 0 18.0
8 B 17 0 17.0
I started writing an ifelse statement, which might look something like this:
d$new<-ifelse(d$peak==1, mean(peak[i-1, i+1]), d$temp)
I also thought about lapply, but I think this needs a loop. Any suggestions?
This should do the trick, with no loops. Note that it does not group by Name and assumes no peak sits on the first or last row.
Name = c("A", "A", "A", "A", "B", "B", "B", "B")
temp = c(22, 22, 26, 23, 18, 20, 18, 17)
peak = c(0, 0, 1, 0, 0, 1, 0, 0)
d<- data.frame(Name, temp, peak)
d$new = temp
ind = which(d$peak==1)
d$new[ind] = (d$temp[ind-1]+d$temp[ind+1])/2
Try rollapply from the zoo package:
library(zoo)
rollfun <- function(i) with(d[i, ], if (peak[2]) mean(temp[-2]) else temp[2])
transform(d, temp.new = rollapply(seq(0, nrow(d)+1), 3, rollfun))
Note that this assumes that there are no peaks at boundaries (which is the case in the question).
REVISED Some simplifications.
Here is the output:
> Name = c("A", "A", "A", "A", "B", "B", "B", "B")
> temp = c(22, 22, 26, 23, 18, 20, 18, 17)
> peak = c(0, 0, 1, 0, 0, 1, 0, 0)
> new = NA
> d<- data.frame(Name, temp, peak, new)
> library(zoo)
>
> rollfun <- function(i) with(d[i, ], if (peak[2]) mean(temp[-2]) else temp[2])
> transform(d, temp.new = rollapply(seq(0, nrow(d)+1), 3, rollfun))
Name temp peak new temp.new
1 A 22 0 NA 22.0
2 A 22 0 NA 22.0
3 A 26 1 NA 22.5
4 A 23 0 NA 23.0
5 B 18 0 NA 18.0
6 B 20 1 NA 18.0
7 B 18 0 NA 18.0
8 B 17 0 NA 17.0
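Neither answer above restricts the neighbours to the same Name group. A base R sketch that does, using ave() over row indices; the helper name smooth_peaks is made up, and averaging only the neighbour that exists when a peak sits at a group boundary is an assumption, not something the question specifies.

```r
d <- data.frame(Name = c("A", "A", "A", "A", "B", "B", "B", "B"),
                temp = c(22, 22, 26, 23, 18, 20, 18, 17),
                peak = c(0, 0, 1, 0, 0, 1, 0, 0))

# Hypothetical helper: receives the row indices of one Name group
smooth_peaks <- function(idx) {
  temp <- d$temp[idx]
  peak <- d$peak[idx]
  n <- length(idx)
  prev <- c(NA, temp[-n])   # previous temp within the group
  nxt <- c(temp[-1], NA)    # next temp within the group
  # at a group boundary only the existing neighbour is averaged (assumption)
  ifelse(peak == 1, rowMeans(cbind(prev, nxt), na.rm = TRUE), temp)
}

# ave() applies the helper per group and returns values in row order
d$new <- ave(seq_len(nrow(d)), d$Name, FUN = smooth_peaks)
d
```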
