How to check if pairs remain the same between years?

I have a data frame with pairs of individual birds (male and female) that were observed in several years. I am trying to figure out whether these pairs have changed from one year to the next so that I can do some further analyses.
My data is structured like this:
library(tibble)
dat <- tibble(year = rep(1:3, each = 3),
              Male = c("A1", "B1", "C1",
                       "A1", "B1", "C1",
                       "A1", "B1", "C2"),
              Female = c("X1", "Y1", "Z1",
                         "X1", "Y2", "Z2",
                         "X1", "Y2", "Z2"))
# A tibble: 9 x 3
year Male Female
<int> <chr> <chr>
1 1 A1 X1
2 1 B1 Y1
3 1 C1 Z1
4 2 A1 X1
5 2 B1 Y2
6 2 C1 Z2
7 3 A1 X1
8 3 B1 Y2
9 3 C2 Z2
And my expected output is something like:
# A tibble: 9 x 5
year Male Female male_state female_state
<int> <chr> <chr> <chr> <chr>
1 1 A1 X1 new new
2 1 B1 Y1 new new
3 1 C1 Z1 new new
4 2 A1 X1 reunited reunited
5 2 B1 Y2 divorced new
6 2 C1 Z2 divorced new
7 3 A1 X1 reunited reunited
8 3 B1 Y2 reunited reunited
9 3 C2 Z2 new divorced
I cannot figure out how to check whether a value from a different column is the same in the year before (e.g. if the male ID is the same for a certain female in year 2 or 3 as in the year prior). Any ideas?

This (probably overcomplicated) pipe produces the following output.
library(dplyr)
dat <- tibble(year = rep(1:3, each = 3),
              Male = c("A1", "B1", "C1",
                       "A1", "B1", "C1",
                       "A1", "B1", "C2"),
              Female = c("X1", "Y1", "Z1",
                         "X1", "Y2", "Z2",
                         "X1", "Y2", "Z2"))
dat %>%
  # one key per couple
  mutate(pair = paste0(Male, Female)) %>%
  arrange(pair, year) %>%
  # same pair as the previous row, in a later year -> 'old couple'
  mutate(check = if_else((pair == lag(pair)) & (year > lag(year)), 'old couple', 'new couple')) %>%
  mutate(check = if_else(is.na(check), 'new couple', check)) %>%
  # same male as the previous row but a different female -> 'divorce'
  mutate(divorced = if_else((Male == lag(Male)) & (Female != lag(Female)), 'divorce', '')) %>%
  mutate(divorced = if_else(is.na(divorced), '', divorced))
OUTPUT:
# A tibble: 9 × 6
year Male Female pair check divorced
<int> <chr> <chr> <chr> <chr> <chr>
1 1 A1 X1 A1X1 new couple ""
2 2 A1 X1 A1X1 old couple ""
3 3 A1 X1 A1X1 old couple ""
4 1 B1 Y1 B1Y1 new couple ""
5 2 B1 Y2 B1Y2 new couple "divorce"
6 3 B1 Y2 B1Y2 old couple ""
7 1 C1 Z1 C1Z1 new couple ""
8 2 C1 Z2 C1Z2 new couple "divorce"
9 3 C2 Z2 C2Z2 new couple ""

Try this:
library(tidyverse)
dat <- tibble(
year = rep(1:3, each = 3),
Male = c(
"A1", "B1", "C1",
"A1", "B1", "C1",
"A1", "B1", "C2"
),
Female = c(
"X1", "Y1", "Z1",
"X1", "Y2", "Z2",
"X1", "Y2", "Z2"
)
)
dat |>
mutate(pairing = str_c(Male, "|", Female)) |>
add_count(pairing) |>
group_by(pairing) |>
mutate(male_state = if_else(pairing == lag(pairing), "reunited", NA_character_),
female_state = if_else(pairing == lag(pairing), "reunited", NA_character_)) |>
group_by(Male) |>
mutate(
male_state = if_else(row_number() == 1, "new", male_state),
male_state = if_else(is.na(male_state), "divorced", male_state)
) |>
group_by(Female) |>
mutate(
female_state = if_else(row_number() == 1, "new", female_state),
female_state = if_else(is.na(female_state), "divorced", female_state)
) |>
arrange(year, Male)
#> # A tibble: 9 × 7
#> # Groups: Female [5]
#> year Male Female pairing n male_state female_state
#> <int> <chr> <chr> <chr> <int> <chr> <chr>
#> 1 1 A1 X1 A1|X1 3 new new
#> 2 1 B1 Y1 B1|Y1 1 new new
#> 3 1 C1 Z1 C1|Z1 1 new new
#> 4 2 A1 X1 A1|X1 3 reunited reunited
#> 5 2 B1 Y2 B1|Y2 2 divorced new
#> 6 2 C1 Z2 C1|Z2 1 divorced new
#> 7 3 A1 X1 A1|X1 3 reunited reunited
#> 8 3 B1 Y2 B1|Y2 2 reunited reunited
#> 9 3 C2 Z2 C2|Z2 1 new divorced
Created on 2022-05-03 by the reprex package (v2.0.1)
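The same labels can also be derived by looking at each bird's partner history directly. The following is a minimal sketch rather than the method used in the answers above: `pair_state()` is a hypothetical helper name, and the sketch assumes each bird appears at most once per year and that "previous" means the previous year in which that bird was observed.

```r
library(dplyr)

dat <- tibble(
  year   = rep(1:3, each = 3),
  Male   = c("A1", "B1", "C1", "A1", "B1", "C1", "A1", "B1", "C2"),
  Female = c("X1", "Y1", "Z1", "X1", "Y2", "Z2", "X1", "Y2", "Z2")
)

# Hypothetical helper: classify one bird's records from its partner sequence.
# First record -> "new"; same partner as the previous record -> "reunited";
# different partner -> "divorced".
pair_state <- function(partner) {
  prev <- dplyr::lag(partner)
  dplyr::case_when(
    is.na(prev)     ~ "new",
    partner == prev ~ "reunited",
    TRUE            ~ "divorced"
  )
}

res <- dat %>%
  arrange(year) %>%
  group_by(Male) %>%
  mutate(male_state = pair_state(Female)) %>%
  group_by(Female) %>%
  mutate(female_state = pair_state(Male)) %>%
  ungroup() %>%
  arrange(year, Male)
```

On the example data, `res` carries the same `male_state` and `female_state` values as the expected output in the question. If birds can skip years, the comparison would need an extra check on `year - lag(year)`.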

Related

How to collect members of a column based on the value of a specific member in that column in R

In the following data frame, for each ID I want to collect the members of B1 whose value in B2 is equal to or greater than the value of "b" in B2. Then, from this filtered result, I want to count how many times each of the B1 members occurs.
dataframe:
ID B1 B2
z1 a 2.5
z1 b 1.7
z1 c 170
z1 c 9
z1 d 3
y2 a 0
y2 b 21
y2 c 15
y2 c 101
y2 d 30
y2 d 3
y2 d 15.5
x3 a 30.8
x3 a 54
x3 a 0
x3 b 30.8
x3 c 30.8
x3 d 7
so the result would be:
ID B1 B2
z1 a 2.5
z1 c 170
z1 c 9
z1 d 3
y2 c 101
y2 d 30
x3 a 30.8
x3 a 54
x3 c 30.8
and
ID B1 count
z1 a 1
z1 c 2
z1 d 1
y2 a 0
y2 c 1
y2 d 1
x3 a 2
x3 c 1
x3 d 0

Grouped by 'ID', filter the rows where 'B2' is greater than or equal to the 'B2' value of the row where 'B1' is 'b', and add a second condition that drops the rows where 'B1' equals 'b'
library(dplyr)
out1 <- df1 %>%
group_by(ID) %>%
filter(any(B1 == "b") & B2 >= min(B2[B1 == "b"]), B1 != 'b')
-output
> out1
# A tibble: 9 × 3
# Groups: ID [3]
ID B1 B2
<chr> <chr> <dbl>
1 z1 a 2.5
2 z1 c 170
3 z1 c 9
4 z1 d 3
5 y2 c 101
6 y2 d 30
7 x3 a 30.8
8 x3 a 54
9 x3 c 30.8
The second output can be obtained by grouping and using summarise to get the number of rows, then filling the missing combinations with complete
library(tidyr)
out1 %>%
group_by(B1, .add = TRUE) %>%
summarise(count = n(), .groups = "drop_last") %>%
complete(B1 = unique(.$B1), fill = list(count = 0)) %>%
ungroup
# A tibble: 9 × 3
ID B1 count
<chr> <chr> <int>
1 x3 a 2
2 x3 c 1
3 x3 d 0
4 y2 a 0
5 y2 c 1
6 y2 d 1
7 z1 a 1
8 z1 c 2
9 z1 d 1
data
df1 <- structure(list(ID = c("z1", "z1", "z1", "z1", "z1", "y2", "y2",
"y2", "y2", "y2", "y2", "y2", "x3", "x3", "x3", "x3", "x3", "x3"
), B1 = c("a", "b", "c", "c", "d", "a", "b", "c", "c", "d", "d",
"d", "a", "a", "a", "b", "c", "d"), B2 = c(2.5, 1.7, 170, 9,
3, 0, 21, 15, 101, 30, 3, 15.5, 30.8, 54, 0, 30.8, 30.8, 7)),
class = "data.frame", row.names = c(NA,
-18L))
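The zero-filling step in the answer above is easier to see on a toy table. This is a small sketch with made-up values (the `counts` table is an illustration, not the question's data): `complete()` adds every missing ID/B1 combination and `fill` sets its count to 0.

```r
library(dplyr)
library(tidyr)

# Toy counts: the ("y", "c") combination is absent
counts <- tibble(
  ID    = c("x", "x", "y"),
  B1    = c("a", "c", "a"),
  count = c(2L, 1L, 1L)
)

# complete() expands to all ID/B1 combinations; missing counts become 0
filled <- counts %>%
  complete(ID, B1, fill = list(count = 0L))
```

Here `filled` has four rows, with a count of 0 for the ("y", "c") pair that never occurred.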

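Using tidyverse:
library(tidyverse)
# df is the question's data frame
df %>%
  group_by(ID) %>%
  # keep rows at or above the group's "b" value, dropping the "b" row itself
  filter(B2 >= B2[B1 == "b"], B1 != "b") %>%
  group_by(ID, B1) %>%
  count(name = "count") %>%
  as.data.frame()
#>   ID B1 count
#> 1 x3  a     2
#> 2 x3  c     1
#> 3 y2  c     1
#> 4 y2  d     1
#> 5 z1  a     1
#> 6 z1  c     2
#> 7 z1  d     1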
Created on 2022-04-26 by the reprex package (v2.0.1)

tidyverse: Simulating random sample with nested factor

I want to simulate a random sample with a nested factor. The factor Dept has two levels, A and B. Level A has two nested levels, A1 and A2; level B has three nested levels, B1, B2 and B3. I want to simulate a random sample from 2022-01-01 to 2022-01-31 using some R code. Part of the desired output is given below (from 2022-01-01 to 2022-01-03 only, for reference).
library(tibble)
set.seed(12345)
df1 <-
tibble(
Date = c(rep("2022-01-01", 5), rep("2022-01-02", 4), rep("2022-01-03", 4))
, Dept = c("A", "A", "B", "B", "B", "A", "B", "B", "B", "A", "A", "B", "B")
, Prog = c("A1", "A2", "B1", "B2", "B3", "A1", "B1", "B2", "B3", "A1", "A2", "B2", "B3")
, Amount = runif(n = 13, min = 50000, max = 100000)
)
df1
#> # A tibble: 13 x 4
#> Date Dept Prog Amount
#> <chr> <chr> <chr> <dbl>
#> 1 2022-01-01 A A1 86045.
#> 2 2022-01-01 A A2 93789.
#> 3 2022-01-01 B B1 88049.
#> 4 2022-01-01 B B2 94306.
#> 5 2022-01-01 B B3 72824.
#> 6 2022-01-02 A A1 58319.
#> 7 2022-01-02 B B1 66255.
#> 8 2022-01-02 B B2 75461.
#> 9 2022-01-02 B B3 86385.
#> 10 2022-01-03 A A1 99487.
#> 11 2022-01-03 A A2 51727.
#> 12 2022-01-03 B B2 57619.
#> 13 2022-01-03 B B3 86784.

If we want to sample randomly, create the expanded data with crossing and then filter/slice to keep a random number of rows for each 'Date'
library(dplyr)
library(tidyr)
library(stringr)
crossing(Date = seq(as.Date("2022-01-01"), as.Date("2022-01-31"),
by = "1 day"), Dept = c("A", "B"), Prog = 1:3) %>%
mutate(Prog = str_c(Dept, Prog)) %>%
filter(Prog != "A3") %>%
mutate(Amount = runif(n = n(), min = 50000, max = 100000)) %>%
group_by(Date) %>%
slice(seq_len(sample(row_number(), 1))) %>%
ungroup
-output
# A tibble: 102 × 4
Date Dept Prog Amount
<date> <chr> <chr> <dbl>
1 2022-01-01 A A1 83964.
2 2022-01-01 A A2 93428.
3 2022-01-01 B B1 85187.
4 2022-01-01 B B2 79144.
5 2022-01-01 B B3 65784.
6 2022-01-02 A A1 86014.
7 2022-01-03 A A1 76060.
8 2022-01-03 A A2 56412.
9 2022-01-03 B B1 87365.
10 2022-01-03 B B2 66169.
# … with 92 more rows
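Note that `slice(seq_len(...))` in the answer above keeps the first k rows of each date, so a later programme such as B3 only appears when all earlier ones do. If any subset of programmes should be possible on a given day, one option is to draw the row positions at random as well. This is a sketch under that assumption, not the answer's code; the names `dates`, `progs` and `sim` are illustrative.

```r
library(dplyr)
library(tidyr)

set.seed(12345)

dates <- seq(as.Date("2022-01-01"), as.Date("2022-01-31"), by = "1 day")
progs <- c("A1", "A2", "B1", "B2", "B3")

sim <- crossing(Date = dates, Prog = progs) %>%
  # the department is the first character of the programme code
  mutate(Dept = substr(Prog, 1, 1)) %>%
  group_by(Date) %>%
  # per date: draw a subset size k in 1..5, then k distinct row positions
  filter(row_number() %in% sample(n(), sample(n(), 1))) %>%
  ungroup() %>%
  mutate(Amount = runif(n(), min = 50000, max = 100000)) %>%
  select(Date, Dept, Prog, Amount)
```

Every date keeps at least one programme, and no Date/Prog combination is repeated.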

how to create new variables from one variable using two rules

I would appreciate any help to create new variables from one variable.
Specifically, I need help to simultaneously create one row per ID and several columns of E, where each new column (E1, E2, E3) contains the values of E for that ID's rows. I tried doing this with melt followed by spread, but I am getting the error:
Error: Duplicate identifiers for rows (4, 7, 9), (1, 3, 6), (2, 5, 8)
Additionally, I tried the solutions discussed here and here but these did not work for my case because I need to be able to create row identifiers for rows (4, 1, 2), (7, 3, 5), and (9, 6, 8). That is, E for rows (4, 1, 2) should be named E1, E for rows (7, 3, 5) should be named E2, E for rows (9, 6, 8) should be named E3, and so on.
#data
dT<-structure(list(A = c("a1", "a2", "a1", "a1", "a2", "a1", "a1",
"a2", "a1"), B = c("b2", "b2", "b2", "b1", "b2", "b2", "b1",
"b2", "b1"), ID = c("3", "4", "3", "1", "4", "3", "1", "4", "1"
), E = c(0.621142094943352, 0.742109450696123, 0.39439152996948,
0.40694392882818, 0.779607277916503, 0.550579323666347, 0.352622183880119,
0.690660491345867, 0.23378944873769)), class = c("data.table",
"data.frame"), row.names = c(NA, -9L))
#my attempt
A B ID E
1: a1 b2 3 0.6211421
2: a2 b2 4 0.7421095
3: a1 b2 3 0.3943915
4: a1 b1 1 0.4069439
5: a2 b2 4 0.7796073
6: a1 b2 3 0.5505793
7: a1 b1 1 0.3526222
8: a2 b2 4 0.6906605
9: a1 b1 1 0.2337894
aTempDF <- melt(dT, id.vars = c("A", "B", "ID"))
A B ID variable value
1: a1 b2 3 E 0.6211421
2: a2 b2 4 E 0.7421095
3: a1 b2 3 E 0.3943915
4: a1 b1 1 E 0.4069439
5: a2 b2 4 E 0.7796073
6: a1 b2 3 E 0.5505793
7: a1 b1 1 E 0.3526222
8: a2 b2 4 E 0.6906605
9: a1 b1 1 E 0.2337894
aTempDF%>%spread(variable, value)
Error: Duplicate identifiers for rows (4, 7, 9), (1, 3, 6), (2, 5, 8)
#expected output
A B ID E1 E2 E3
1: a1 b2 3 0.6211421 0.3943915 0.5505793
2: a2 b2 4 0.7421095 0.7796073 0.6906605
3: a1 b1 1 0.4069439 0.3526222 0.2337894
Thanks in advance for any help.

You can use dcast from data.table
library(data.table)
dcast(dT, A + B + ID ~ paste0("E", rowid(ID)))
# A B ID E1 E2 E3
#1 a1 b1 1 0.4069439 0.3526222 0.2337894
#2 a1 b2 3 0.6211421 0.3943915 0.5505793
#3 a2 b2 4 0.7421095 0.7796073 0.6906605
You need to create the correct 'time variable' first which is what rowid(ID) does.
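To see what `rowid(ID)` contributes, it helps to look at it on its own. A small sketch using the ID column from the question (the variable names `occurrence` and `new_cols` are illustrative):

```r
library(data.table)

ID <- c("3", "4", "3", "1", "4", "3", "1", "4", "1")

# rowid() numbers each occurrence of a value in order of appearance
occurrence <- rowid(ID)
# occurrence is 1 1 2 1 2 3 2 3 3

# These become the column names that dcast() spreads over
new_cols <- paste0("E", occurrence)
```

Each ID therefore gets its own E1, E2, E3 sequence, which removes the duplicate identifiers that made `spread` fail.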

For those looking for a tidyverse solution:
library(tidyverse)
dT <- structure(
  list(
    A = c("a1", "a2", "a1", "a1", "a2", "a1", "a1", "a2", "a1"),
    B = c("b2", "b2", "b2", "b1", "b2", "b2", "b1", "b2", "b1"),
    ID = c("3", "4", "3", "1", "4", "3", "1", "4", "1"),
    E = c(0.621142094943352, 0.742109450696123, 0.39439152996948, 0.40694392882818,
          0.779607277916503, 0.550579323666347, 0.352622183880119, 0.690660491345867,
          0.23378944873769)),
  class = c("data.table",
            "data.frame"),
  row.names = c(NA, -9L))
dT %>%
as_tibble() %>% # since dataset is a data.table object
group_by(A, B, ID) %>%
# Just so columns are "E1", "E2", etc.
mutate(rn = glue::glue("E{row_number()}")) %>%
ungroup() %>%
spread(rn, E) %>%
# not necessary, just making output in the same order as your expected output
arrange(desc(B))
# A tibble: 3 x 6
# A B ID E1 E2 E3
# <chr> <chr> <chr> <dbl> <dbl> <dbl>
#1 a1 b2 3 0.621 0.394 0.551
#2 a2 b2 4 0.742 0.780 0.691
#3 a1 b1 1 0.407 0.353 0.234
As mentioned in the accepted answer, you need a "key" variable to spread on first. This is created using row_number() and glue where glue just gives you the proper E1, E2, etc. variable names.
The group_by piece just makes sure that the row numbers are with respect to A, B and ID.
EDIT for tidyr >= 1.0.0
The (not-so) new pivot_ functions supersede gather and spread and eliminate the need to glue the new variable names together in a mutate.
dT %>%
as_tibble() %>% # since dataset is a data.table object
group_by(A, B, ID) %>%
# no longer need to glue (or paste) the names together but still need a row number
mutate(rn = row_number()) %>%
ungroup() %>%
pivot_wider(names_from = rn, values_from = E, names_glue = "E{.name}") %>% # names_glue argument allows for easy transforming of the new variable names
# not necessary, just making output in the same order as your expected output
arrange(desc(B))
# A tibble: 3 x 6
# A B ID E1 E2 E3
# <chr> <chr> <chr> <dbl> <dbl> <dbl>
#1 a1 b2 3 0.621 0.394 0.551
#2 a2 b2 4 0.742 0.780 0.691
#3 a1 b1 1 0.407 0.353 0.234
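When the new names only need a fixed prefix, `names_prefix` is a simpler alternative to `names_glue`. The sketch below assumes tidyr >= 1.0.0 and uses rounded E values in a plain tibble for brevity, so it is an illustration rather than a rerun of the answer.

```r
library(dplyr)
library(tidyr)

dT <- tibble(
  A  = c("a1", "a2", "a1", "a1", "a2", "a1", "a1", "a2", "a1"),
  B  = c("b2", "b2", "b2", "b1", "b2", "b2", "b1", "b2", "b1"),
  ID = c("3", "4", "3", "1", "4", "3", "1", "4", "1"),
  E  = c(0.62, 0.74, 0.39, 0.41, 0.78, 0.55, 0.35, 0.69, 0.23)
)

wide <- dT %>%
  group_by(A, B, ID) %>%
  mutate(rn = row_number()) %>%  # occurrence number within each A/B/ID group
  ungroup() %>%
  # names_prefix turns 1, 2, 3 into E1, E2, E3
  pivot_wider(names_from = rn, values_from = E, names_prefix = "E")
```

The result has one row per A/B/ID combination and the columns E1, E2 and E3.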

Convert column to comma separated in R

I have two columns, A and B, in Excel with a large amount of data. Considering both columns A and B, I am trying to produce column C as the output. Right now I am doing everything in Excel. I think there may be a way to do this in R, but I don't know how. Any help is appreciated. Thanks.
I have
ColumnA ColumnB ColumnC (output column)
A1 10 A2
A2 10 A1
B1 3 B2,B3,B4
B2 3 B1,B3,B4
B3 3 B1,B2,B4
B4 3 B1,B2,B3
C1 6 C2,C3
C2 6 C1,C3
C3 6 C1,C2

We can group by column B and then take the set difference between all the column A values in the group and the current column A value:
library(tidyverse)
df %>%
group_by(ColumnB) %>%
mutate(ColumnC=map_chr(ColumnA, ~toString(setdiff(ColumnA, .x))))
# A tibble: 9 x 3
# Groups: ColumnB [3]
ColumnA ColumnB ColumnC
<fct> <int> <chr>
1 A1 10 A2
2 A2 10 A1
3 B1 3 B2, B3, B4
4 B2 3 B1, B3, B4
5 B3 3 B1, B2, B4
6 B4 3 B1, B2, B3
7 C1 6 C2, C3
8 C2 6 C1, C3
9 C3 6 C1, C2
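The core of this approach can be tried on a single group in base R. A minimal sketch, with `group_vals` standing in for the ColumnA values that share one ColumnB value (the names are illustrative):

```r
# Toy group: the ColumnA values that share one ColumnB value
group_vals <- c("B1", "B2", "B3", "B4")

# For each value, drop it from the group and collapse what is left
others <- vapply(
  group_vals,
  function(x) toString(setdiff(group_vals, x)),
  character(1)
)
```

`toString()` separates items with ", "; use `paste(..., collapse = ",")` instead if the output should have no space after the comma. Note that `setdiff()` also drops duplicated values, which matters only if ColumnA can repeat within a group.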

I don't think the question is phrased very clearly but I am interpreting the desired results to be that you want Column C to have all the values from each group of Column B, leaving out the value of Column A. You can do this as follows:
nest Column A and join it back onto the original data frame
flatten it so you now have a vector of the Column A values
use setdiff to get the values that are not Column A
collapse into comma separated string with str_c
You can see that your desired Column C is reproduced.
library(tidyverse)
tbl <- structure(list(ColumnA = c("A1", "A2", "B1", "B2", "B3", "B4", "C1", "C2", "C3"), ColumnB = c(10L, 10L, 3L, 3L, 3L, 3L, 6L, 6L, 6L), ColumnC = c("A2", "A1", "B2,B3,B4", "B1,B3,B4", "B1,B2,B4", "B1,B2,B3", "C2,C3", "C1,C3", "C1,C2")), problems = structure(list(row = 9L, col = "ColumnC", expected = "", actual = "embedded null", file = "literal data"), row.names = c(NA, -1L), class = c("tbl_df", "tbl", "data.frame")), row.names = c(NA, -9L), class = c("tbl_df", "tbl", "data.frame"), spec = structure(list(cols = list(ColumnA = structure(list(), class = c("collector_character", "collector")), ColumnB = structure(list(), class = c("collector_integer", "collector")), ColumnC = structure(list(), class = c("collector_character", "collector"))), default = structure(list(), class = c("collector_guess", "collector"))), class = "col_spec"))
tbl %>%
left_join(
tbl %>% select(-ColumnC) %>% nest(ColumnA)
) %>%
mutate(
data = flatten(data),
output = map2(data, ColumnA, ~ setdiff(.x, .y)),
output = map_chr(output, ~ str_c(., collapse = ","))
)
#> Joining, by = "ColumnB"
#> # A tibble: 9 x 5
#> ColumnA ColumnB ColumnC data output
#> <chr> <int> <chr> <list> <chr>
#> 1 A1 10 A2 <chr [2]> A2
#> 2 A2 10 A1 <chr [2]> A1
#> 3 B1 3 B2,B3,B4 <chr [4]> B2,B3,B4
#> 4 B2 3 B1,B3,B4 <chr [4]> B1,B3,B4
#> 5 B3 3 B1,B2,B4 <chr [4]> B1,B2,B4
#> 6 B4 3 B1,B2,B3 <chr [4]> B1,B2,B3
#> 7 C1 6 C2,C3 <chr [3]> C2,C3
#> 8 C2 6 C1,C3 <chr [3]> C1,C3
#> 9 C3 6 C1,C2 <chr [3]> C1,C2
Created on 2018-08-21 by the reprex package (v0.2.0).

My understanding is that you want to find all OTHER entries of column A that share the current value of column B.
Grouping by B and collecting all A's associated with that value should do the trick (some clean-up afterward removes the current entry of A from the resulting column C):
library(dplyr)
library(stringr)
a <- c("a1", "a2", "b1", "b2", "b3", "b4", "c1", "c2", "c3", "d1")
b <- c(10, 10, 3, 3, 3, 3, 6, 6, 6, 5)
dta <- data.frame(a, b, stringsAsFactors = FALSE)
dta <- dta %>%
  group_by(b) %>%
  # all A values of the group in one string
  mutate(c = paste0(a, collapse = ",")) %>%
  ungroup() %>%
  # remove the row's own A value, whether it sits after or before a comma
  mutate(c = str_replace(c, pattern = paste0(",", a), replacement = "")) %>%
  mutate(c = str_replace(c, pattern = paste0(a, ","), replacement = "")) %>%
  # a group with a single member has no other entries
  mutate(c = ifelse(c == a, NA, c))

Another tidyverse solution. The separate function is handy for splitting an existing column into new columns. Doing so creates the Group column, which ensures that every operation happens within each group. The map2 and map functions are well suited to element-wise operations. dat2 is the final output.
library(tidyverse)
dat2 <- dat %>%
separate(ColumnA, into = c("Group", "Number"), remove = FALSE, convert = TRUE, sep = 1) %>%
group_by(Group) %>%
mutate(List = list(ColumnA)) %>%
mutate(List = map2(List, ColumnA, ~.x[!(.x %in% .y)])) %>%
mutate(ColumnC = map_chr(List, ~str_c(.x, collapse = ","))) %>%
ungroup() %>%
select(starts_with("Column"))
dat2
# # A tibble: 9 x 3
# ColumnA ColumnB ColumnC
# <chr> <int> <chr>
# 1 A1 10 A2
# 2 A2 10 A1
# 3 B1 3 B2,B3,B4
# 4 B2 3 B1,B3,B4
# 5 B3 3 B1,B2,B4
# 6 B4 3 B1,B2,B3
# 7 C1 6 C2,C3
# 8 C2 6 C1,C3
# 9 C3 6 C1,C2
DATA
dat <- read.table(text = "ColumnA ColumnB
A1 10
A2 10
B1 3
B2 3
B3 3
B4 3
C1 6
C2 6
C3 6",
stringsAsFactors = FALSE, header = TRUE)

df = read.table(text = "
ColumnA ColumnB
A1 10
A2 10
B1 3
B2 3
B3 3
B4 3
C1 6
C2 6
C3 6
", header=T, stringsAsFactors=F)
library(tidyverse)
df %>%
group_by(ColumnB) %>% # for each ColumnB value
mutate(vals = list(ColumnA), # create a list of all Column A values for each row
vals = map2(vals, ColumnA, ~.x[.x != .y]), # exclude the value in Column A from that list
vals = map_chr(vals, ~paste0(.x, collapse = ","))) %>% # combine remaining values in the list
ungroup() # forget the grouping
# # A tibble: 9 x 3
# ColumnA ColumnB vals
# <chr> <int> <chr>
# 1 A1 10 A2
# 2 A2 10 A1
# 3 B1 3 B2,B3,B4
# 4 B2 3 B1,B3,B4
# 5 B3 3 B1,B2,B4
# 6 B4 3 B1,B2,B3
# 7 C1 6 C2,C3
# 8 C2 6 C1,C3
# 9 C3 6 C1,C2

Adding a pvalue column to dataframe in R

I have a dataframe that looks like this:
A1 A2 A3 B1 B2 B3
0 1 0 2 3 3
5 6 4 4 6 6
I would like to add a column based on t-testing the significance of the difference between As and Bs:
A1 A2 A3 B1 B2 B3 PValue
0 1 0 2 3 3 <some small number>
5 6 4 4 6 6 <some large number>
I tried using dplyr like this:
data %>%
  mutate(PValue = t.test(unlist(c(A1,A2,A3)), unlist(c(B1,B2,B3)))$p.value)
However, the resulting PValue column is constant for some reason. I would appreciate any help.

If we are doing this by row, then pmap is one way
library(tidyverse)
pmap_dbl(data, ~ c(...) %>%
{t.test(.[1:3], .[4:6])$p.value}) %>%
bind_cols(data, PValue = .)
# A1 A2 A3 B1 B2 B3 PValue
#1 0 1 0 2 3 3 0.007762603
#2 5 6 4 4 6 6 0.725030185
or another option is rowwise with do
data %>%
rowwise() %>%
do(data.frame(., PValue = t.test(unlist(.[1:3]), unlist(.[4:6]))$p.value))
# A tibble: 2 x 7
# A1 A2 A3 B1 B2 B3 PValue
#* <int> <int> <int> <int> <int> <int> <dbl>
#1 0 1 0 2 3 3 0.00776
#2 5 6 4 4 6 6 0.725
Or we can gather to 'long' format and then do the group by t.test
data %>%
rownames_to_column('rn') %>%
gather(key, val, -rn) %>% group_by(rn) %>%
summarise(PValue = t.test(val[str_detect(key, "A")],
val[str_detect(key, "B")])$p.value) %>%
pull(PValue) %>%
bind_cols(data, PValue = .)
data
data <- structure(list(A1 = c(0L, 5L), A2 = c(1L, 6L), A3 = c(0L, 4L),
B1 = c(2L, 4L), B2 = c(3L, 6L), B3 = c(3L, 6L)), .Names = c("A1",
"A2", "A3", "B1", "B2", "B3"), class = "data.frame", row.names = c(NA,
-2L))

Also with apply in Base R:
data$PValue = apply(data, 1, function(x) t.test(x[1:3], x[4:6])$p.value)
or:
library(dplyr)
data %>%
mutate(PValue = apply(., 1, function(x) t.test(x[1:3], x[4:6])$p.value))
Result:
A1 A2 A3 B1 B2 B3 PValue
1 0 1 0 2 3 3 0.007762603
2 5 6 4 4 6 6 0.725030185
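The attempt in the question returns a constant because, inside a plain `mutate()`, `A1`, `A2`, ... refer to whole columns, so a single t-test is run on all rows pooled together. With dplyr >= 1.0.0, `rowwise()` combined with `c_across()` is another way to run the test once per row. This is a sketch under that version assumption; it selects the columns by name instead of by position.

```r
library(dplyr)

data <- data.frame(
  A1 = c(0L, 5L), A2 = c(1L, 6L), A3 = c(0L, 4L),
  B1 = c(2L, 4L), B2 = c(3L, 6L), B3 = c(3L, 6L)
)

res <- data %>%
  rowwise() %>%
  # c_across() collects the selected columns of the current row into a vector
  mutate(PValue = t.test(c_across(A1:A3), c_across(B1:B3))$p.value) %>%
  ungroup()
```

`res$PValue` should agree with the values shown in the answers above (about 0.00776 and 0.725).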
