Conditionally mapping values from one data frame to another in R [duplicate]

This question already has answers here:
How to join (merge) data frames (inner, outer, left, right)
(13 answers)
Closed 6 years ago.
I have two dataframes:
> SubObj
sNumber runningTrialNo wordTar SubObj_ind
1 34 nerd 3
1 32 hooligan 1
1 7 villager 3
2 32 oak 2
2 8 deer 2
3 8 mammal 3
> df
sNumber runningTrialNo wordTar
1 34 nerd
1 34 nerd
1 34 nerd
1 32 hooligan
1 32 hooligan
1 7 villager
2 32 oak
2 32 oak
2 8 deer
3 8 mammal
3 8 mammal
I want to map the values from SubObj$SubObj_ind into a new column df$SubObj_ind, so that all the values line up by sNumber (subject number) and runningTrialNo (trial number). It should look something like this:
> df
sNumber runningTrialNo wordTar SubObj_ind
1 34 nerd 3
1 34 nerd 3
1 34 nerd 3
1 32 hooligan 1
1 32 hooligan 1
1 7 villager 3
2 32 oak 2
2 32 oak 2
2 8 deer 2
3 8 mammal 3
3 8 mammal 3
I wrote code that should hypothetically do the work, but it doesn't map over trial and subject number:
df$SubObj_indO <- array(0, nrow(df))
for(i in 1:nrow(SubObj)) {
  index <- df$runningTrialNo == SubObj[i, "runningTrialNo"] &
    df$sNumber == SubObj[i, "sNumber"]
  df$SubObj_ind[index] <- SubObj[index, "SubObj_ind"]
}
What is wrong in this piece of code?

We can use match
df$SubObj_ind <- with(df, SubObj$SubObj_ind[match(wordTar, SubObj$wordTar)])
df
# sNumber runningTrialNo wordTar SubObj_ind
#1 1 34 nerd 3
#2 1 34 nerd 3
#3 1 34 nerd 3
#4 1 32 hooligan 1
#5 1 32 hooligan 1
#6 1 7 villager 3
#7 2 32 oak 2
#8 2 32 oak 2
#9 2 8 deer 2
#10 3 8 mammal 3
#11 3 8 mammal 3
Or use data.table
library(data.table)
setDT(df)[SubObj[c("wordTar", "SubObj_ind")], on = "wordTar"]
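A note on the original loop (my addition, not from the answers above): it initializes df$SubObj_indO but then fills df$SubObj_ind, and SubObj[index, "SubObj_ind"] subsets SubObj with a logical vector built against df, so the wrong rows get picked up. Also, matching on wordTar alone works here only because each target word corresponds to a single SubObj row; if the same word could appear for several subjects or trials, joining on both keys would be safer. A minimal sketch with base merge(), assuming the objects are named as in the question:
df_joined <- merge(df,
                   SubObj[, c("sNumber", "runningTrialNo", "SubObj_ind")],
                   by = c("sNumber", "runningTrialNo"),
                   all.x = TRUE, sort = FALSE)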

Related

How to make a histogram with a data frame

I was trying to make a histogram of the frequencies of a name list. The list looks like this:
> x[1:15,]
X x
1 1 JUAN DOMINGOGONZALEZDELGADO
2 2 FRANCISCO JAVIERVELARAMIREZ
3 3 JUAN CARLOSPEREZMEDINA
4 4 ARMANDOSALINASSALINAS
5 5 JOSE FELIXZAMBRANOAMEZQUITA
6 6 GABRIELMONTIELRIVAS
7 7 DONACIANOSANCHEZHERRERA
8 8 JUAN MARTINXHUERTA
9 9 ALVARO ALEJANDROGONZALEZRAMOS
10 10 OMAR ROMANCASTAÑEDALOPEZ
11 11 IGNACIOBUENOCANO
12 12 RAFAELBETANCOURTDELGADO
13 13 LUIS ALBERTOCASTILLOESCOBEDO
14 14 VICTORHERNANDEZGONZALEZ
15 15 FATIMAROMOTORRES
In order to do that I changed it to a frequency table, which looks like this:
> y[1:15,]
X x Freq
1 1 15
2 2 JULIO CESAR ORDAZFLORES 1
3 3 MARCOS ANTONIOCUEVASNAVARRO 1
4 4 DULEY DILTON TRIBOUILLIERLOARCA 1
5 5 ANTONIORAMIREZLOPEZ 2
6 6 BRAYAN ALEJANDROOJEDARAMIREZ 1
7 7 JOSE DE JESUSESCOTOCORTEZ 1
8 8 AARONFLORESGARCIA 1
9 9 ABIGAILNAVARROAMBRIZ 1
10 10 ABILENYRODRIGUEZORTEGA 1
11 11 ABRAHAMHERNANDEZRAMIREZ 1
12 12 ABRAHAMPONCEALCANTARA 1
13 13 ADRIAN VAZQUEZ BUSTAMANTE 2
14 14 ADRIANHERNANDEZBERMUDEZ 28
15 15 ALAN ORLANDOCASTILLALOPEZ 11
When I try hist(x) or hist(x[,2]) I get:
Error in hist.default(x) : 'x' must be numeric
and if I try hist(y[,3]) I get a strange histogram which is not what I want. How can I make a histogram of the frequencies of the name list?
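No answer is attached to this related question here, but a quick sketch of the usual fix (my assumptions about the intent, using the frequency table y above, where y$x holds the names and y$Freq the counts): hist() requires a numeric vector, so the numeric Freq column can be plotted directly, while one bar per name calls for barplot() instead.
# Histogram of the counts themselves (how many names occur once, twice, ...)
hist(y$Freq, main = "Distribution of name frequencies", xlab = "Freq")
# Or one bar per name, with the names as categorical labels
barplot(y$Freq, names.arg = y$x, las = 2, cex.names = 0.6)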

How to sort with multiple conditions in R [duplicate]

This question already has answers here:
Sort (order) data frame rows by multiple columns
(19 answers)
Closed 3 years ago.
I have a very simple dataframe in R:
x <- data.frame("SN" = 1:7, "Age" = c(21,15,22,33,21,15,25), "Name" = c("John","Dora","Paul","Alex","Bud","Chad","Anton"))
My goal is to sort the dataframe by Age and Name. I am able to achieve this task partially if I type the following command:
x[order(x[, 'Age']),]
which returns:
SN Age Name
2 2 15 Dora
6 6 15 Chad
1 1 21 John
5 5 21 Bud
3 3 22 Paul
7 7 25 Anton
4 4 33 Alex
As you can see, the dataframe is ordered by Age but not by Name.
Question: how can I order the dataframe by age and name at the same time? This is what the result should look like:
SN Age Name
6 6 15 Chad
2 2 15 Dora
5 5 21 Bud
1 1 21 John
3 3 22 Paul
7 7 25 Anton
4 4 33 Alex
Note: I would like to avoid additional packages and use only the default ones.
With dplyr:
library(dplyr)
x %>%
  arrange(Age, Name)
SN Age Name
1 6 15 Chad
2 2 15 Dora
3 5 21 Bud
4 1 21 John
5 3 22 Paul
6 7 25 Anton
7 4 33 Alex
Or, with base R only:
x[with(x, order(Age, Name)), ]
SN Age Name
6 6 15 Chad
2 2 15 Dora
5 5 21 Bud
1 1 21 John
3 3 22 Paul
7 7 25 Anton
4 4 33 Alex
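For completeness (my addition): order() accepts any number of vectors and breaks ties from left to right, so the same base-R idiom extends to more keys.
x[order(x$Age, x$Name), ]        # equivalent to the with() form above
x[order(x$Age, x$Name, x$SN), ]  # further tie-break on SN, if ever needed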

Using tidyverse gather() to output multiple value vectors with a single key in a data frame

Despite the conventions of R, data collection and entry are, for me, most easily done in vertical columns. Therefore, I have a question about efficiently converting to horizontal rows with the gather() function in the tidyverse. I find myself using gather() over and over, which seems inefficient. Is there a more efficient way? And can an existing vector serve as the key? Here is an example:
Let's say we have the following health metrics on baby birds.
bird day_1_mass day_2_mass day_1_heart_rate day_3_heart_rate
1 1 5 6 60 55
2 2 6 8 62 57
3 3 3 3 45 45
Using the gather function I can reorganize the mass data into rows.
horizontal.data <- gather(vertical.data,
                          key = age,
                          value = mass,
                          day_1_mass:day_2_mass,
                          factor_key = TRUE)
Giving us
bird day_1_heart_rate day_3_heart_rate age mass
1 1 60 55 day_1_mass 5
2 2 62 57 day_1_mass 6
3 3 45 45 day_1_mass 3
4 1 60 55 day_2_mass 6
5 2 62 57 day_2_mass 8
6 3 45 45 day_2_mass 3
And use the same function again to similarly reorganize heart rate data.
horizontal.data.2 <- gather(horizontal.data,
                            key = age2,
                            value = heart_rate,
                            day_1_heart_rate:day_3_heart_rate,
                            factor_key = TRUE)
Producing a new dataframe
bird age mass age2 heart_rate
1 1 day_1_mass 5 day_1_heart_rate 60
2 2 day_1_mass 6 day_1_heart_rate 62
3 3 day_1_mass 3 day_1_heart_rate 45
4 1 day_2_mass 6 day_1_heart_rate 60
5 2 day_2_mass 8 day_1_heart_rate 62
6 3 day_2_mass 3 day_1_heart_rate 45
7 1 day_1_mass 5 day_3_heart_rate 55
8 2 day_1_mass 6 day_3_heart_rate 57
9 3 day_1_mass 3 day_3_heart_rate 45
10 1 day_2_mass 6 day_3_heart_rate 55
11 2 day_2_mass 8 day_3_heart_rate 57
12 3 day_2_mass 3 day_3_heart_rate 45
So it took two steps, but it worked. The questions are 1) Is there a way to do this in one step? and 2) Can it alternatively be done with one key (the "age" vector) that I can then simply replace as numeric data?
If I get the question right, you can do that by first gathering everything together, and then "spreading" on mass and heart rate:
library(tidyr)
library(forcats)
library(dplyr)
mass_levs <- names(vertical.data)[grep("mass", names(vertical.data))]
hearth_levs <- names(vertical.data)[grep("heart", names(vertical.data))]
horizontal.data <- vertical.data %>%
  gather(variable, value, -bird, factor_key = TRUE) %>%
  mutate(day = stringr::str_sub(variable, 5, 5)) %>%
  mutate(variable = fct_collapse(variable,
                                 "mass" = mass_levs,
                                 "hearth_rate" = hearth_levs)) %>%
  spread(variable, value)
giving:
bird day mass hearth_rate
1 1 1 5 60
2 1 2 6 NA
3 1 3 NA 55
4 2 1 6 62
5 2 2 8 NA
6 2 3 NA 57
7 3 1 3 45
8 3 2 3 NA
9 3 3 NA 45
We can see how it works by going through the pipe one step at a time.
First, we gather everything into long format:
horizontal.data <- vertical.data %>%
  gather(variable, value, -bird, factor_key = TRUE)
bird variable value
1 1 day_1_mass 5
2 2 day_1_mass 6
3 3 day_1_mass 3
4 1 day_2_mass 6
5 2 day_2_mass 8
6 3 day_2_mass 3
7 1 day_1_heart_rate 60
8 2 day_1_heart_rate 62
9 3 day_1_heart_rate 45
10 1 day_3_heart_rate 55
11 2 day_3_heart_rate 57
12 3 day_3_heart_rate 45
Then, if we want to keep a "proper" long table, as the OP suggested, we have to create a single key variable. In this case, it makes sense to use the day (= age). To create the day variable, we can extract it from the character strings now in variable:
%>% mutate(day = stringr::str_sub(variable, 5,5))
Here, str_sub extracts the character at position 5, which is the day (note that if the full dataset has multiple-digit days, you'll have to tweak this a bit, probably by splitting on _):
bird variable value day
1 1 day_1_mass 5 1
2 2 day_1_mass 6 1
3 3 day_1_mass 3 1
4 1 day_2_mass 6 2
5 2 day_2_mass 8 2
6 3 day_2_mass 3 2
7 1 day_1_heart_rate 60 1
8 2 day_1_heart_rate 62 1
9 3 day_1_heart_rate 45 1
10 1 day_3_heart_rate 55 3
11 2 day_3_heart_rate 57 3
12 3 day_3_heart_rate 45 3
Now, to finish, we have to "spread" the table so that we get a mass column and a heart rate column.
Here we have a problem, because currently the variable column has two levels for mass and two for heart rate. Therefore, applying spread on variable would again give us four columns.
To prevent that, we need to collapse the four levels in variable into two. We can do that with forcats::fct_collapse, by providing the association between the new level names and the "old" ones. Outside of a pipe, that would correspond to:
horizontal.data$variable <- fct_collapse(horizontal.data$variable,
                                         mass = c("day_1_mass", "day_2_mass"),
                                         hearth_rate = c("day_1_heart_rate", "day_3_heart_rate"))
However, if you have many levels it is cumbersome to write them all out. Therefore, I find the level names corresponding to the two "categories" beforehand using:
mass_levs <- names(vertical.data)[grep("mass", names(vertical.data))]
hearth_levs <- names(vertical.data)[grep("heart", names(vertical.data))]
mass_levs
[1] "day_1_mass" "day_2_mass"
hearth_levs
[1] "day_1_heart_rate" "day_3_heart_rate"
Therefore, the third step of the pipe can be shortened to:
%>% mutate(variable = fct_collapse(variable,
                                   "mass" = mass_levs,
                                   "hearth_rate" = hearth_levs))
after which we have:
bird variable value day
1 1 mass 5 1
2 2 mass 6 1
3 3 mass 3 1
4 1 mass 6 2
5 2 mass 8 2
6 3 mass 3 2
7 1 hearth_rate 60 1
8 2 hearth_rate 62 1
9 3 hearth_rate 45 1
10 1 hearth_rate 55 3
11 2 hearth_rate 57 3
12 3 hearth_rate 45 3
We are now in a position to "spread" the table again according to variable, using:
%>% spread(variable, value)
bird day mass hearth_rate
1 1 1 5 60
2 1 2 6 NA
3 1 3 NA 55
4 2 1 6 62
5 2 2 8 NA
6 2 3 NA 57
7 3 1 3 45
8 3 2 3 NA
9 3 3 NA 45
HTH
If you insist on a single command, I can give you one.
First set up the data (as a data.table):
library(data.table)
c1 <- c(1, 2, 3)
c2 <- c(5, 6, 3)
c3 <- c(6, 8, 3)
c4 <- c(60, 62, 45)
c5 <- c(55, 57, 45)
dt <- as.data.table(cbind(c1, c2, c3, c4, c5))
colnames(dt) <- c("bird", "day_1_mass", "day_2_mass", "day_1_heart_rate", "day_3_heart_rate")
Now use this single command to get the final outcome
merge(melt(dt[, c("bird", "day_1_mass", "day_2_mass")],
           id.vars = c("bird"), variable.name = "age", value.name = "mass"),
      melt(dt[, c("bird", "day_1_heart_rate", "day_3_heart_rate")],
           id.vars = c("bird"), variable.name = "age2", value.name = "heart_rate"),
      by = "bird")
The final outcome is
bird age mass age2 heart_rate
1: 1 day_1_mass 5 day_1_heart_rate 60
2: 1 day_1_mass 5 day_3_heart_rate 55
3: 1 day_2_mass 6 day_1_heart_rate 60
4: 1 day_2_mass 6 day_3_heart_rate 55
5: 2 day_1_mass 6 day_1_heart_rate 62
6: 2 day_1_mass 6 day_3_heart_rate 57
7: 2 day_2_mass 8 day_1_heart_rate 62
8: 2 day_2_mass 8 day_3_heart_rate 57
9: 3 day_1_mass 3 day_1_heart_rate 45
10: 3 day_1_mass 3 day_3_heart_rate 45
11: 3 day_2_mass 3 day_1_heart_rate 45
12: 3 day_2_mass 3 day_3_heart_rate 45
Though already answered, I have a different solution in which you save a list of the gather parameters you would like to run, and then run the gather_() command for each set of parameters in the list.
# Create a list of gather parameters
# Format is key, value, columns_to_gather
gather.list <- list(c("age", "mass", "day_1_mass", "day_2_mass"),
                    c("age2", "heart_rate", "day_1_heart_rate", "day_3_heart_rate"))
# Run gather command for each list item
for(i in gather.list){
  df <- gather_(df, key_col = i[1], value_col = i[2], gather_cols = c(i[3:length(i)]), factor_key = TRUE)
}
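A later note (my addition, assuming a tidyr version that provides pivot_longer, i.e. tidyr >= 1.0.0): pivot_longer() can do both reshapes in one call. The ".value" sentinel in names_to tells it that this part of each column name should become an output column, so mass and heart_rate end up as separate columns keyed by a single day variable:
library(tidyr)
pivot_longer(vertical.data,
             cols = -bird,
             names_pattern = "day_(\\d+)_(.*)",
             names_to = c("day", ".value"))
This gives one row per bird and day, with NA where a measurement was not taken on that day (e.g. no mass recorded on day 3), matching the spread-based result above.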

Group dataframe rows by consecutive ascending IDs [duplicate]

This question already has answers here:
Find consecutive values in vector in R [duplicate]
(2 answers)
Closed 6 years ago.
I am fairly new to the art of programming (loops etc.), and I would be grateful for an opinion on whether my approach is fine or whether it would definitely need to be optimized if it were used on a much bigger sample.
Currently I have approximately 20,000 observations, and one of the columns is the ID of a receipt. What I would like to achieve is to assign each row to a group consisting of IDs that ascend consecutively (n + 1). Whenever this rule is broken, a new group should start, until the rule is broken again.
To illustrate, let's say I have this table (an important note is that IDs are not necessarily unique and can repeat, like ID 10 in my example):
MyTable <- data.frame(ID = c(1,2,3,4,6,7,8,10,10,11,17,18,19,200,201,202,2010,2011,2013))
MyTable
ID
1
2
3
4
6
7
8
10
10
11
17
18
19
200
201
202
2010
2011
2013
The result of my grouping should be following:
ID GROUP
1 1
2 1
3 1
4 1
6 2
7 2
8 2
10 3
10 3
11 3
17 4
18 4
19 4
200 5
201 5
202 5
2010 6
2011 6
2013 7
I used dplyr to order the IDs in ascending order, then created the variable MyTable$GROUP, which I simply filled with 1's:
MyTable$GROUP <- rep(1, length(MyTable$ID))
for (i in 2:length(MyTable$ID)) {
  if (MyTable$ID[i] == MyTable$ID[i-1] + 1 | MyTable$ID[i] == MyTable$ID[i-1]) {
    MyTable$GROUP[i] <- MyTable$GROUP[i-1]
  } else {
    MyTable$GROUP[i] <- MyTable$GROUP[i-1] + 1
  }
}
This code worked for me and I got the results fairly easily. However, I wonder whether, in the eyes of more experienced programmers, this piece of code would be considered "bad", "average", "good", or whatever rating you come up with.
EDIT: I am sure this topic has been touched on already, and I am not arguing against that. The main difference is that I would like to focus on optimization here and see whether my approach meets standards.
Thanks!
To make a long story short:
MyTable$Group <- cumsum(c(1, diff(MyTable$ID) != 1))
# ID Group
#1 1 1
#2 2 1
#3 3 1
#4 4 1
#5 6 2
#6 7 2
#7 8 2
#8 10 3
#9 11 3
#10 12 3
#11 17 4
#12 18 4
#13 19 4
#14 200 5
#15 201 5
#16 202 5
#17 2010 6
#18 2011 6
#19 2013 7
You are searching for all the differences in your vector MyTable$ID that are not 1; these are your "breaks". Then you take the cumulative sum over these values. If you do not know cumsum, type ?cumsum. (The output above corresponds to a version of the data without the repeated ID 10; the update below handles repeats.)
That's all!
UPDATE: with repeating IDs, you can use this:
MyTable <- data.frame(ID = c(1,2,3,4,6,7,8,10,10,11,17,18,19,200,201,202,2010,2011,2013))
MyTable$Group <- cumsum(c(1, !diff(MyTable$ID) %in% c(0,1) ))
# ID Group
#1 1 1
#2 2 1
#3 3 1
#4 4 1
#5 6 2
#6 7 2
#7 8 2
#8 10 3
#9 10 3
#10 11 3
#11 17 4
#12 18 4
#13 19 4
#14 200 5
#15 201 5
#16 202 5
#17 2010 6
#18 2011 6
#19 2013 7
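To make the mechanics concrete, here are the intermediate pieces of that one-liner on the same data (a quick sketch, my addition):
diff(MyTable$ID)
# 1 1 1 2 1 1 2 0 1 6 1 1 181 1 1 1808 1 2
!diff(MyTable$ID) %in% c(0, 1)   # TRUE exactly where a new group should start
cumsum(c(1, !diff(MyTable$ID) %in% c(0, 1)))
# 1 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 7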

How to create unique rows in a data frame

I have a dataframe where rows are duplicated. I need to create unique rows from this. I tried a couple of options but they don't seem to work
l1 <-summarise(group_by(l,bowler,wickets),economyRate,d=unique(date))
This works for some rows but also gives the error "Expecting a single value". The dataframe 'l' looks like this
bowler overs maidens runs wickets economyRate date opposition
(fctr) (int) (int) (dbl) (dbl) (dbl) (date) (chr)
1 MA Starc 9 0 51 0 5.67 2010-10-20 India
2 MA Starc 9 0 27 4 3.00 2010-11-07 Sri Lanka
3 MA Starc 9 0 27 4 3.00 2010-11-07 Sri Lanka
4 MA Starc 9 0 27 4 3.00 2010-11-07 Sri Lanka
5 MA Starc 9 0 27 4 3.00 2010-11-07 Sri Lanka
6 MA Starc 6 0 33 2 5.50 2012-02-05 India
7 MA Starc 6 0 33 2 5.50 2012-02-05 India
8 MA Starc 10 0 50 2 5.00 2012-02-10 Sri Lanka
9 MA Starc 10 0 50 2 5.00 2012-02-10 Sri Lanka
10 MA Starc 8 0 49 0 6.12 2012-02-12 India
The date is unique and can be used to select the required rows. Please let me know how this can be done.
In the example dataset, there is more than one unique 'date' for each 'bowler', 'wickets' combination. One option would be to paste the unique 'date' values together:
l %>%
  group_by(bowler, wickets) %>%
  summarise(economyRate = mean(economyRate), d = toString(unique(date)))
Or create 'd' as a list column
l %>%
  group_by(bowler, wickets) %>%
  summarise(economyRate = mean(economyRate), d = list(unique(date)))
With respect to 'economyRate', I am guessing the OP needs the mean of that.
If we need to create a column of unique dates in the original dataset, use mutate:
l %>%
  group_by(bowler, wickets) %>%
  mutate(d = list(unique(date)))
As the OP didn't provide the expected output, the following could also be the result:
l %>%
  group_by(bowler, wickets) %>%
  distinct(date)
Or, as #Frank mentioned:
l %>%
  group_by(bowler, wickets, date) %>%
  slice(1L)
If I get the intention of the OP right, they are asking to simply remove the duplicate rows. So I would use
unique(l1)
That's what ?unique says:
unique returns a vector, data frame or array like x but with duplicate elements/rows removed.
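Applied to the question's data (the data frame l, reproduced in the Data block below), a quick sketch:
l_unique <- unique(l)
nrow(l_unique)   # 5: one row per distinct scorecard line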
Data
l <- read.table(text = "bowler overs maidens runs wickets economyRate date opposition
1 MA_Starc 9 0 51 0 5.67 2010-10-20 India
2 MA_Starc 9 0 27 4 3.00 2010-11-07 Sri-Lanka
3 MA_Starc 9 0 27 4 3.00 2010-11-07 Sri-Lanka
4 MA_Starc 9 0 27 4 3.00 2010-11-07 Sri-Lanka
5 MA_Starc 9 0 27 4 3.00 2010-11-07 Sri-Lanka
6 MA_Starc 6 0 33 2 5.50 2012-02-05 India
7 MA_Starc 6 0 33 2 5.50 2012-02-05 India
8 MA_Starc 10 0 50 2 5.00 2012-02-10 Sri-Lanka
9 MA_Starc 10 0 50 2 5.00 2012-02-10 Sri-Lanka
10 MA_Starc 8 0 49 0 6.12 2012-02-12 India")
Distinct
Use dplyr::distinct to remove duplicated rows.
ldistinct <- distinct(l)
# bowler overs maidens runs wickets economyRate date
# 1 MA_Starc 9 0 51 0 5.67 2010-10-20
# 2 MA_Starc 9 0 27 4 3.00 2010-11-07
# 3 MA_Starc 6 0 33 2 5.50 2012-02-05
# 4 MA_Starc 10 0 50 2 5.00 2012-02-10
# 5 MA_Starc 8 0 49 0 6.12 2012-02-12
# opposition
# 1 India
# 2 Sri-Lanka
# 3 India
# 4 Sri-Lanka
# 5 India
l2 <- summarise(group_by(ldistinct,bowler,wickets),
economyRate,d=unique(date))
# Error: expecting a single value
But it's not enough here; there are still many dates for one combination of bowler and wickets.
Collapse values together
By pasting multiple values together you will see that there are many dates and many economyRate for a single combination of bowler and wickets.
l3 <- summarise(group_by(l, bowler, wickets),
                economyRate = paste(unique(economyRate), collapse = ", "),
                d = paste(unique(date), collapse = ", "))
l3
# bowler wickets economyRate d
# (fctr) (int) (chr) (chr)
# 1 MA_Starc 0 5.67, 6.12 2010-10-20, 2012-02-12
# 2 MA_Starc 2 5.5, 5 2012-02-05, 2012-02-10
# 3 MA_Starc 4 3 2010-11-07
So, I took an unusual route to this dissection: I let the date remain a factor when it came over from the csv file I created. You could easily convert the date column to a factor with
l1$date <- as.factor(l1$date)
This makes that column a non-date column; you could also convert it to character, and either will work fine. This is what it looks like structurally.
str(l1)
'data.frame': 10 obs. of 10 variables:
$ bowler : Factor w/ 2 levels "(fctr)","MA": 2 2 2 2 2 2 2 2 2 2
$ overs : Factor w/ 2 levels "(int)","Starc": 2 2 2 2 2 2 2 2 2 2
$ maidens : Factor w/ 5 levels "(int)","10","6",..: 5 5 5 5 5 3 3 2 2 4
$ runs : Factor w/ 2 levels "(dbl)","0": 2 2 2 2 2 2 2 2 2 2
$ wickets : Factor w/ 6 levels "(dbl)","27","33",..: 6 2 2 2 2 3 3 5 5 4
$ economyRate: Factor w/ 4 levels "(dbl)","0","2",..: 2 4 4 4 4 3 3 3 3 2
$ date : Factor w/ 6 levels "(date)","3","5",..: 5 2 2 2 2 4 4 3 3 6
$ opposition : Factor w/ 6 levels "(chr)","10/20/2010",..: 2 3 3 3 3 6 6 4 4 5
$ X.1 : Factor w/ 3 levels "","India","Sri": 2 3 3 3 3 2 2 3 3 2
$ X.2 : Factor w/ 2 levels "","Lanka": 1 2 2 2 2 1 1 2 2 1
After that, it is about making sure you use the subsetting grammar properly, with the most concise query:
l2 <- l1[!duplicated(l1$date), ]
And this is what is returned, 5 rows of unique data:
bowler overs maidens runs wickets economyRate date opposition X.1 X.2
2 MA Starc 9 0 51 0 5.67 10/20/2010 India
3 MA Starc 9 0 27 4 3 11/7/2010 Sri Lanka
7 MA Starc 6 0 33 2 5.5 2/5/2012 India
9 MA Starc 10 0 50 2 5 2/10/2012 Sri Lanka
11 MA Starc 8 0 49 0 6.12 2/12/2012 India
The only thing you need to be careful of is to keep the comma after !duplicated(l1$date), so that ALL columns are included in the final subset.
If you want dates or characters, you can use as.POSIXct or as.character to convert them to a usable format for the rest of your manipulation.
I hope this is useful to you!
