Combining rows from the same data frame [duplicate] - r

This question already has answers here:
Merge multiple variables in R
(6 answers)
How to implement coalesce efficiently in R
(9 answers)
Closed 3 years ago.
I am trying to write code that creates a new column combining the values of two existing columns. The idea is to fill in the value from the other column wherever one of them is NA.
The new column will be "EventDate".
Here is a sample data frame:
Id SDate CDate EventDate
101 2013-03-27 NA 2013-03-27
101 2013-05-09 NA 2013-05-09
101 NA 2013-05-30 2013-05-30
101 NA 2013-07-26 2013-07-26

We can use coalesce from dplyr, which returns the first non-NA value across its arguments for each row:
library(tidyverse)
df1 %>%
  mutate(EventDate = coalesce(SDate, CDate))
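If you would rather not load a package, the same idea can be sketched in base R with ifelse. This is only a minimal sketch: it assumes the dates are stored as character strings (not Date objects), and the data frame below is rebuilt by hand from the sample in the question.

```r
# Sample data rebuilt from the question; dates kept as character strings (assumption)
df1 <- data.frame(
  Id = 101,
  SDate = c("2013-03-27", "2013-05-09", NA, NA),
  CDate = c(NA, NA, "2013-05-30", "2013-07-26"),
  stringsAsFactors = FALSE
)

# Take SDate where it is present, otherwise fall back to CDate
df1$EventDate <- ifelse(is.na(df1$SDate), df1$CDate, df1$SDate)
df1
```

With real Date columns ifelse would drop the Date class, so coalesce is probably the safer choice there.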

Related

Join numbers together in R [duplicate]

This question already has answers here:
Paste multiple columns together
(11 answers)
Closed 7 months ago.
I'm a beginner in R. What I want to do is join numbers together. I created a data frame as follows:
data<-data.frame(year=c(2020,2021,2022),month=c(10,11,12))
My expected output is as follows:
data=data.frame(year=c(2020,2021,2022),month=c(10,11,12),year_month=c(202010,202111,202212))
year_month is the column joining year and month together.
How can I do this?
You could concatenate the columns using paste0 like this:
data<-data.frame(year=c(2020,2021,2022),month=c(10,11,12))
data$year_month <- do.call(paste0, data)
data
#> year month year_month
#> 1 2020 10 202010
#> 2 2021 11 202111
#> 3 2022 12 202212
Created on 2022-07-30 by the reprex package (v2.0.1)
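One thing to be aware of: paste0 returns character strings, while the expected output in the question is numeric. A hedged sketch of two ways to get a numeric column follows; the column name year_month_arith is just an illustrative name, not something from the question.

```r
data <- data.frame(year = c(2020, 2021, 2022), month = c(10, 11, 12))

# Option 1: paste the two columns, then convert the strings back to numbers
data$year_month <- as.numeric(paste0(data$year, data$month))

# Option 2: plain arithmetic; this also keeps two digits for months 1 to 9
data$year_month_arith <- data$year * 100 + data$month
data
```

For the months in this example both options give the same values. They differ for single-digit months, where pasting 2020 and 5 gives 20205 while the arithmetic version gives 202005.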

Subsetting a dataframe based on another dataframe column value in R [duplicate]

This question already has answers here:
Subset rows in a data frame based on a vector of values
(4 answers)
Subsetting a data frame based on contents of another data frame
(1 answer)
Closed last year.
I have the following dataframe, df:
studID Name
023 John
283 Mary
842 Jacob
211 Amy
and another dataframe, df_2:
studID
023
999
100
211
575
I want to subset the first dataframe, df, so that it contains only the rows whose studID exists in the dataframe df_2.
So I would get:
studID Name
023 John
211 Amy
This dataframe would contain only the records for John and Amy, since their studID is found in df_2.
I tried the following:
df_3 <- df[intersect(df$studID, df_2$studID),]
But I'm getting NA values.
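A likely cause of the NA rows: intersect returns the matching studID values, and df[values, ] then treats those values as row positions (or row names), not as IDs to look up. A minimal sketch of the usual fix with %in% is below. It assumes studID is stored as character so the leading zero in "023" survives; the data frames are rebuilt by hand from the question.

```r
# Data rebuilt from the question; studID kept as character (assumption)
df <- data.frame(
  studID = c("023", "283", "842", "211"),
  Name = c("John", "Mary", "Jacob", "Amy"),
  stringsAsFactors = FALSE
)
df_2 <- data.frame(
  studID = c("023", "999", "100", "211", "575"),
  stringsAsFactors = FALSE
)

# %in% gives a logical vector, one entry per row of df, suitable for row subsetting
df_3 <- df[df$studID %in% df_2$studID, ]
df_3
```

The linked duplicates cover the same approach in more detail.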

Insert values from one data frame into another based on values of one column in R [duplicate]

This question already has answers here:
Adding value from one data.frame to another data.frame by matching a variable
(4 answers)
Closed 2 years ago.
I have a dataframe, df1, in R that has two columns BANK-CODE and BANK-NAME.
BANK-CODE BANK-NAME
1 B001 Bank of America
2 B002 Bank of China
3 B003 Barclays
4 B004 BNP Paribas
5 B005 Citibank
A second dataframe, df2, also contains these columns along with a few others
DATE TIME-ZONE BANK-NAME BANK-CODE
1 2019-01-01T11:10:20+00:00 NA Mizuho NA
2 2019-01-04T17:51:11+00:00 NA Sberbank NA
3 2019-01-05T02:46:11+00:00 NA Lloyds NA
4 2019-01-05T06:13:46+00:00 NA Barclays NA
5 2019-01-05T07:52:16+00:00 NA Emirates NBD NA
My goal is to replace the NA values for BANK-CODE in df2, with those in df1, corresponding to the name of the bank, so df2[4,4] should be B003. What is the best way to do this?
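You can use match. Assuming the column names really contain hyphens as shown, refer to them with [[ ]] (or backticks) rather than $ with underscores:
df2[["BANK-CODE"]] <- df1[["BANK-CODE"]][match(df2[["BANK-NAME"]], df1[["BANK-NAME"]])]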
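To make the match approach concrete, here is a small self-contained sketch. The data are rebuilt by hand from the question, the DATE and TIME-ZONE columns are left out for brevity, and check.names = FALSE is used on the assumption that the hyphenated column names should be kept as they are. The helper variable idx is just an illustrative name.

```r
# Lookup table (df1) and target table (df2), rebuilt from the question
df1 <- data.frame(
  `BANK-CODE` = c("B001", "B002", "B003", "B004", "B005"),
  `BANK-NAME` = c("Bank of America", "Bank of China", "Barclays", "BNP Paribas", "Citibank"),
  check.names = FALSE, stringsAsFactors = FALSE
)
df2 <- data.frame(
  `BANK-NAME` = c("Mizuho", "Sberbank", "Lloyds", "Barclays", "Emirates NBD"),
  `BANK-CODE` = NA_character_,
  check.names = FALSE, stringsAsFactors = FALSE
)

# match() returns, for each name in df2, its position in df1 (NA if absent)
idx <- match(df2[["BANK-NAME"]], df1[["BANK-NAME"]])
df2[["BANK-CODE"]] <- df1[["BANK-CODE"]][idx]
df2
```

Banks that do not appear in df1 simply keep NA as their code.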
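Another suggestion: you could try something like this with the imputeTS package (you need to install the package first):
library("imputeTS")
df2$`BANK-CODE` <- na.replace(df2$`BANK-CODE`)  # or something like this
Note that this kind of imputation fills in missing values without looking anything up by bank name, so it does not reproduce the mapping from df1.
Does that help?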

Trying to find a specific element based on a condition [duplicate]

This question already has answers here:
Find value corresponding to maximum in other column [duplicate]
(2 answers)
Closed 2 years ago.
This is my data frame in RStudio. I'm trying to find code that will produce the name of the student with the highest age.
students.df #Name of dataframe
name DAD BDA gender nationality age
1 Amy 80 70 F IRL 20
2 Bill 65 50 M UK 21
3 Carl 50 80 M IRL 22
as.character(subset(students.df,students.df$age==max(students.df$age))$name)
library(dplyr)
students.df %>% filter(age==max(age)) %>% select(name)
You can try this, which returns the whole row for the oldest student:
students.df[which.max(students.df$age),]
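For reference, a self-contained base R sketch that returns just the name is shown below. The data frame is rebuilt by hand from the question, and the variable names oldest and all_oldest are only illustrative.

```r
# Data rebuilt from the question
students.df <- data.frame(
  name = c("Amy", "Bill", "Carl"),
  DAD = c(80, 65, 50),
  BDA = c(70, 50, 80),
  gender = c("F", "M", "M"),
  nationality = c("IRL", "UK", "IRL"),
  age = c(20, 21, 22),
  stringsAsFactors = FALSE
)

# which.max() returns the position of the first maximum only
oldest <- students.df$name[which.max(students.df$age)]

# Comparing against max() keeps every student tied for the highest age
all_oldest <- students.df$name[students.df$age == max(students.df$age)]
oldest
```

The difference between the two only matters when several students share the highest age.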

Merge records in data frame in R [duplicate]

This question already has answers here:
How to reshape data from long to wide format
(14 answers)
Convert data from long format to wide format with multiple measure columns
(6 answers)
Closed 6 years ago.
I have the following example data set:
data.frame(SEX=c("M","F","M","F"),COMPLAINT=c("headache","headache", "dizziness", "dizziness"),
reports=c(5,4,9,12), users = c(1250,3460,2500,1850))
SEX COMPLAINT reports users
1 M headache 5 1250
2 F headache 4 3460
3 M dizziness 9 2500
4 F dizziness 12 1850
My question is how to merge rows 1 and 2, and rows 3 and 4, so that my data frame is as follows:
COMPLAINT reports_male reports_female users_male users_female
1 headache 5 4 1250 3460
2 dizziness 9 12 2500 1850
Anyone got a quick solution that I can use for a (much) larger dataset?
We can use dcast from data.table, which can take multiple value.var columns and is quite efficient on big datasets:
library(data.table)
dcast(setDT(df1), COMPLAINT ~ SEX, value.var = c("reports", "users"))
# COMPLAINT reports_F reports_M users_F users_M
#1: dizziness 12 9 1850 2500
#2: headache 4 5 3460 1250
As seen in How to reshape data from long to wide format?, we can use reshape from base R (it is part of the stats package, so no extra library is needed):
reshape(df, idvar = "COMPLAINT", timevar = "SEX", direction = "wide")
COMPLAINT reports.M users.M reports.F users.F
1 headache 5 1250 4 3460
3 dizziness 9 2500 12 1850
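Here is a self-contained sketch of the base R route, with an extra renaming step to get closer to the column names asked for in the question. The renaming with sub is my own addition for illustration, not part of the original answers.

```r
# Data rebuilt from the question
df <- data.frame(
  SEX = c("M", "F", "M", "F"),
  COMPLAINT = c("headache", "headache", "dizziness", "dizziness"),
  reports = c(5, 4, 9, 12),
  users = c(1250, 3460, 2500, 1850),
  stringsAsFactors = FALSE
)

# One row per COMPLAINT; reports and users are spread across the SEX values
wide <- reshape(df, idvar = "COMPLAINT", timevar = "SEX", direction = "wide")

# Optional: turn the ".M" / ".F" suffixes into "_male" / "_female"
names(wide) <- sub("\\.M$", "_male", names(wide))
names(wide) <- sub("\\.F$", "_female", names(wide))
wide
```

For a much larger dataset the data.table answer above is probably the faster option.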
