Creating time periods/intervals in R

I have an issue with creating a time period in R.
The data that I have on hand was posted as an image (not reproduced here).
Now I want to do the following:
Identify the hour intervals between the start and end time and create a list.
Identify the hour intervals between the start and the break time.
Finally, remove the break intervals to find the total time,
and then create an output.
Could you please assist me?

I am not entirely sure what exactly you need, but from what I understand, here is something to get you going:
library(tidyverse)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
t <- tibble(
  place = "xyz",
  st = "9:00",
  et = "13:00"
)
t <- t %>%
  mutate(stdur = period_to_seconds(hm(st))) %>%
  mutate(etdur = period_to_seconds(hm(et))) %>%
  mutate(interval = dseconds(etdur - stdur)) %>%
  mutate(interval_hours = seconds_to_period(interval))
glimpse(t)
#> Observations: 1
#> Variables: 7
#> $ place <chr> "xyz"
#> $ st <chr> "9:00"
#> $ et <chr> "13:00"
#> $ stdur <dbl> 32400
#> $ etdur <dbl> 46800
#> $ interval <Duration> 14400s (~4 hours)
#> $ interval_hours <Period> 4H 0M 0S
Created on 2020-02-10 by the reprex package (v0.3.0)
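The answer above stops at the total span between start and end. A minimal base R sketch of the remaining steps the question asks for (listing the hourly slots, then removing the break) might look like the following. The break times (11:00 to 12:00) and the helper to_hours() are assumptions made for illustration, since the original data was only posted as an image.

```r
# Hypothetical helper: convert "H:MM" strings to decimal hours.
to_hours <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) + as.numeric(p[2]) / 60, numeric(1))
}

start_h <- to_hours("9:00")
end_h <- to_hours("13:00")
# Assumed break window; not taken from the original post.
break_start_h <- to_hours("11:00")
break_end_h <- to_hours("12:00")

# Hourly slots between start and end, labelled by their starting hour.
slots <- seq(start_h, end_h - 1, by = 1)

# Drop the slots that fall inside the break.
worked <- slots[!(slots >= break_start_h & slots < break_end_h)]

# Total time is the full span minus the break.
total_hours <- (end_h - start_h) - (break_end_h - break_start_h)

print(worked)       # 9 10 12
print(total_hours)  # 3
```

This only handles whole-hour boundaries and a single break; partial hours or several breaks would need interval arithmetic rather than slot labels.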

Related

Create date of "X" column, when I have age in days at "X" column and birth date column in R

I'm having some trouble finding out how to do a specific thing in R.
In my dataset, I have a column with the date of birth of participants. I also have a column giving me the age in days at which a disease was diagnosed.
What I want to do is to create a new column showing the date of diagnosis. I'm guessing it's a pretty easy thing to do since I have all the information needed, basically it's birth date + X number of days = Date of diagnosis, but I'm unable to figure out how to do it.
All of my searches give me information on the opposite, going from date to age. So if you're able to help me, it would be much appreciated!
library(tidyverse)
library(lubridate)
df <- tibble(
  birth = sample(
    seq(as.Date("1950-01-01"), today(), by = "day"),
    10, replace = TRUE
  ),
  age = sample(3650:15000, 10, replace = TRUE)
)
df %>%
  mutate(diagnosis_date = birth %m+% days(age))
#> # A tibble: 10 x 3
#> birth age diagnosis_date
#> <date> <int> <date>
#> 1 1955-01-16 6684 1973-05-05
#> 2 1958-11-03 6322 1976-02-24
#> 3 2007-02-23 4312 2018-12-14
#> 4 2002-07-11 8681 2026-04-17
#> 5 2021-12-28 11892 2054-07-20
#> 6 2017-07-31 3872 2028-03-07
#> 7 1995-06-30 14549 2035-04-30
#> 8 1955-09-02 12633 1990-04-04
#> 9 1958-10-10 4534 1971-03-10
#> 10 1980-12-05 6893 1999-10-20
Created on 2022-06-30 by the reprex package (v2.0.1)
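For whole days, the same result can be sketched without lubridate, because adding a number to a Date in base R moves it by that many days. The two rows below are copied from the first two rows of the output above; the variable names are only illustrative.

```r
# Birth dates and ages in days, taken from rows 1 and 2 of the output above.
birth <- as.Date(c("1955-01-16", "1958-11-03"))
age_days <- c(6684L, 6322L)

# Date + integer adds that many days.
diagnosis_date <- birth + age_days

print(diagnosis_date)  # "1973-05-05" "1976-02-24"
```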

Count and aggregate by date [duplicate]

The dataset:
Date
2021-09-25T17:07:24.222Z
2021-09-25T16:17:20.376Z
2021-09-24T09:30:53.013Z
2021-09-24T09:06:24.565Z
I would like to count the number of rows per day. For example, 2021-09-25 will be 2.
To solve said challenge I looked at the following post:
Count and Aggregate Date in R
The answer of Rorshach is the solution. However, I do not understand how I can format my rows in the Date column to 2021/09/24 instead of 2021-09-24T09:06:24.565Z.
Could someone explain to me how to format the entries in the Date column?
After converting the strings to dates, you can use table() to count the occurrences of each date.
table(as.Date(df$Date))
#2021-09-24 2021-09-25
# 2 2
Parse the string into a datetime object and then extract the date (without the hours and minutes) to be able to count:
library(dplyr)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
tibble::tribble(
  ~Date,
  "2021-09-25T17:07:24.222Z",
  "2021-09-25T16:17:20.376Z",
  "2021-09-24T09:30:53.013Z",
  "2021-09-24T09:06:24.565Z"
) %>%
  mutate(
    # parse_datetime() is provided by readr, not by dplyr or lubridate
    day = Date %>% readr::parse_datetime() %>% as.Date()
  ) %>%
  count(day)
#> # A tibble: 2 × 2
#> day n
#> <date> <int>
#> 1 2021-09-24 2
#> 2 2021-09-25 2
@RonakShah's answer is good, but to get the result as a data frame, use the count function from the plyr package:
library(plyr)
count(as.Date(df$Date))
Output:
x freq
1 2021-09-24 2
2 2021-09-25 2
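None of the answers shows the 2021/09/24 display format the question asks about. A small base R sketch, assuming the strings always start with a ten-character YYYY-MM-DD date as in the sample, could be:

```r
# Sample timestamps from the question.
x <- c(
  "2021-09-25T17:07:24.222Z",
  "2021-09-25T16:17:20.376Z",
  "2021-09-24T09:30:53.013Z",
  "2021-09-24T09:06:24.565Z"
)

# Keep the date part as written in the string (UTC here, given the trailing Z).
day <- as.Date(substr(x, 1, 10))

# format() controls how the date is displayed; the result is a character vector.
formatted <- format(day, "%Y/%m/%d")

# Count rows per day and return a data frame.
counts <- as.data.frame(
  table(day = formatted),
  responseName = "n",
  stringsAsFactors = FALSE
)
print(counts)
```

Note that the formatted values are plain strings, so it is usually easier to count on the Date column and apply format() only for display.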

Combine tidy text with synonyms to create dataframe

I have a sample data frame as below:
quoteiD <- c("q1","q2","q3","q4", "q5")
quote <- c("Unthinking respect for authority is the greatest enemy of truth.",
"In the middle of difficulty lies opportunity.",
"Intelligence is the ability to adapt to change.",
"Science is not only a disciple of reason but, also, one of romance and passion.",
"If I have seen further it is by standing on the shoulders of Giants.")
library(dplyr)
quotes <- tibble(quoteiD = quoteiD, quote= quote)
quotes
I have created some tidy text as below
library(tidytext)
data(stop_words)
tidy_words <- quotes %>%
unnest_tokens(word, quote) %>%
anti_join(stop_words) %>%
count( word, sort = TRUE)
tidy_words
Further, I have looked up the synonyms using the qdap package as below
library(qdap)
syns <- synonyms(tidy_words$word)
The qdap output is a list, and I am looking to pick the first 5 synonyms for each word in the tidy data frame and create a column called synonyms as below:
word n synonyms
ability 1 adeptness, aptitude, capability, capacity, competence
adapt 1 acclimatize, accommodate, adjust, alter, apply,
authority 1 ascendancy, charge, command, control, direction
What is an elegant way of merging the list of 5 words from qdap synonym function and separate by commas?
One way this can be done with a tidyverse solution is:
library(plyr)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:plyr':
#>
#> arrange, count, desc, failwith, id, mutate, rename, summarise,
#> summarize
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidytext)
library(qdap)
#> Loading required package: qdapDictionaries
#> Loading required package: qdapRegex
#>
#> Attaching package: 'qdapRegex'
#> The following object is masked from 'package:dplyr':
#>
#> explain
#> Loading required package: qdapTools
#>
#> Attaching package: 'qdapTools'
#> The following object is masked from 'package:dplyr':
#>
#> id
#> The following object is masked from 'package:plyr':
#>
#> id
#> Loading required package: RColorBrewer
#>
#> Attaching package: 'qdap'
#> The following object is masked from 'package:dplyr':
#>
#> %>%
#> The following object is masked from 'package:base':
#>
#> Filter
library(tibble)
library(tidyr)
#>
#> Attaching package: 'tidyr'
#> The following object is masked from 'package:qdap':
#>
#> %>%
quotes <- tibble(quoteiD = paste0("q", 1:5),
quote= c(".\n\nthe ebodac consortium consists of partners: janssen (efpia), london school of hygiene and tropical medicine (lshtm),",
"world vision) mobile health software development and deployment in resource limited settings grameen\n\nas such, the ebodac consortium is well placed to tackle.",
"Intelligence is the ability to adapt to change.",
"Science is a of reason of romance and passion.",
"If I have seen further it is by standing on ."))
quotes
#> # A tibble: 5 x 2
#> quoteiD quote
#> <chr> <chr>
#> 1 q1 ".\n\nthe ebodac consortium consists of partners: janssen (efpia~
#> 2 q2 "world vision) mobile health software development and deployment~
#> 3 q3 Intelligence is the ability to adapt to change.
#> 4 q4 Science is a of reason of romance and passion.
#> 5 q5 If I have seen further it is by standing on .
data(stop_words)
tidy_words <- quotes %>%
unnest_tokens(word, quote) %>%
anti_join(stop_words) %>%
count( word, sort = TRUE)
#> Joining, by = "word"
tidy_words
#> # A tibble: 33 x 2
#> word n
#> <chr> <int>
#> 1 consortium 2
#> 2 ebodac 2
#> 3 ability 1
#> 4 adapt 1
#> 5 change 1
#> 6 consists 1
#> 7 deployment 1
#> 8 development 1
#> 9 efpia 1
#> 10 grameen 1
#> # ... with 23 more rows
syns <- synonyms(tidy_words$word)
#> no match for the following:
#> consortium, ebodac, consists, deployment, efpia, grameen, janssen, london, lshtm, partners, settings, software, tropical
#> ========================
syns %>%
plyr::ldply(data.frame) %>% # Change the list to a dataframe (See https://stackoverflow.com/questions/4227223/r-list-to-data-frame)
rename("Word_DefNumber" = 1, "Syn" = 2) %>% # Rename the columns with a name that is more intuitive
separate(Word_DefNumber, c("Word", "DefNumber"), sep = "\\.") %>% # Find the word part of the word and definition number
group_by(Word) %>% # Group by words, so that when we select rows it is done for each word
slice(1:5) %>% # Keep the first 5 rows for each word
summarise(synonyms = paste(Syn, collapse = ", ")) %>% # Combine the synonyms together comma separated using paste
ungroup() # So there are not unintended effects of having the data grouped when using the data later
#> # A tibble: 20 x 2
#> Word synonyms
#> <chr> <chr>
#> 1 ability adeptness, aptitude, capability, capacity, competence
#> 2 adapt acclimatize, accommodate, adjust, alter, apply
#> 3 change alter, convert, diversify, fluctuate, metamorphose
#> 4 development advance, advancement, evolution, expansion, growth
#> 5 health fitness, good condition, haleness, healthiness, robustness
#> 6 hygiene cleanliness, hygienics, sanitary measures, sanitation
#> 7 intelligence acumen, alertness, aptitude, brain power, brains
#> 8 limited bounded, checked, circumscribed, confined, constrained
#> 9 medicine cure, drug, medicament, medication, nostrum
#> 10 mobile ambulatory, itinerant, locomotive, migrant, motile
#> 11 passion animation, ardour, eagerness, emotion, excitement
#> 12 reason apprehension, brains, comprehension, intellect, judgment
#> 13 resource ability, capability, cleverness, ingenuity, initiative
#> 14 romance affair, affaire (du coeur), affair of the heart, amour, at~
#> 15 school academy, alma mater, college, department, discipline
#> 16 science body of knowledge, branch of knowledge, discipline, art, s~
#> 17 standing condition, credit, eminence, estimation, footing
#> 18 tackle accoutrements, apparatus, equipment, gear, implements
#> 19 vision eyes, eyesight, perception, seeing, sight
#> 20 world earth, earthly sphere, globe, everybody, everyone
Created on 2019-04-05 by the reprex package (v0.2.1)
Please note that plyr should be loaded before dplyr, so that dplyr's versions of functions such as rename and summarise are the ones that get used.
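The core of the pipeline above (group the definitions by word, keep the first five synonyms, paste them together) can also be sketched in base R. The list below is a small hand-made stand-in for the qdap output, not real qdap results: the "word.def_N" naming and the sixth "ability" entry are assumptions made for illustration.

```r
# Hypothetical stand-in for the list returned by qdap::synonyms().
syns <- list(
  ability.def_1 = c("adeptness", "aptitude", "capability",
                    "capacity", "competence", "competency"),
  adapt.def_1 = c("acclimatize", "accommodate", "adjust"),
  adapt.def_2 = c("alter", "apply", "change")
)

# Strip the definition suffix to recover the word for each list element.
word <- sub("\\.def_.*$", "", names(syns))

# Pool the synonyms of all definitions of the same word, keeping their order.
by_word <- split(unlist(syns, use.names = FALSE), rep(word, lengths(syns)))

# Keep at most five synonyms per word and collapse them into one string.
first_five <- vapply(
  by_word,
  function(s) paste(head(s, 5), collapse = ", "),
  character(1)
)

result <- data.frame(
  word = names(first_five),
  synonyms = unname(first_five),
  stringsAsFactors = FALSE
)
print(result)
```

The resulting data frame can then be joined back to tidy_words on the word column.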

Tableau LOD R Equivalent

I'm using a Tableau Fixed LOD function in a report, and was looking for ways to mimic this functionality in R.
Data set looks like:
Soldto<-c("123456","122456","123456","122456","124560","125560")
Shipto<-c("123456","122555","122456","124560","122560","122456")
IssueDate<-as.Date(c("2017-01-01","2017-01-02","2017-01-01","2017-01-02","2017-01-01","2017-01-01"))
Method<-c("Ground","Ground","Ground","Air","Ground","Ground")
Delivery<-c("000123","000456","000123","000345","000456","000555")
df1<-data.frame(Soldto,Shipto,IssueDate,Method,Delivery)
What I'm looking to do is "For each Sold-to/Ship-to/Method count the number of unique delivery IDs".
The intent is to find the number of unique deliveries that could potentially be "aggregated."
In Tableau that function looks like:
{FIXED [Soldto],[Shipto],[IssueDate],[Method] : COUNTD([Delivery])}
Could this be done with aggregate or summarize as in an example below:
df.new <- ddply(df1, c("Soldto", "Shipto", "Method"), summarise,
  Deliveries = n_distinct(Delivery))
This is fairly easy with dplyr. You are looking for the number of unique deliveries for each combination of soldto, shipto and method, which is just group_by followed by summarise:
library(tidyverse)
tbl <- tibble(
  soldto = c("123456","122456","123456","122456","124560","125560"),
  shipto = c("123456","122555","122456","124560","122560","122456"),
  issuedate = as.Date(c("2017-01-01","2017-01-02","2017-01-01","2017-01-02","2017-01-01","2017-01-01")),
  method = c("Ground","Ground","Ground","Air","Ground","Ground"),
  delivery = c("000123","000456","000123","000345","000456","000555")
)
tbl %>%
  group_by(soldto, shipto, method) %>%
  summarise(uniques = n_distinct(delivery))
#> # A tibble: 6 x 4
#> # Groups: soldto, shipto [?]
#> soldto shipto method uniques
#> <chr> <chr> <chr> <int>
#> 1 122456 122555 Ground 1
#> 2 122456 124560 Air 1
#> 3 123456 122456 Ground 1
#> 4 123456 123456 Ground 1
#> 5 124560 122560 Ground 1
#> 6 125560 122456 Ground 1
Created on 2018-03-02 by the reprex package (v0.2.0).
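One difference worth noting: a Tableau FIXED expression makes the aggregate available on every row, whereas summarise collapses the data to one row per group. If the row-level behaviour is what is needed, a hedged base R sketch using ave() on the question's data could look like this (the Deliveries column name is borrowed from the question's ddply attempt):

```r
# Data from the question (IssueDate omitted, as it is not part of the grouping).
df1 <- data.frame(
  Soldto = c("123456", "122456", "123456", "122456", "124560", "125560"),
  Shipto = c("123456", "122555", "122456", "124560", "122560", "122456"),
  Method = c("Ground", "Ground", "Ground", "Air", "Ground", "Ground"),
  Delivery = c("000123", "000456", "000123", "000345", "000456", "000555"),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each Soldto/Shipto/Method group and returns a
# vector as long as the input, so every row keeps its group's distinct count.
# Row indices are passed in so that the result stays integer.
df1$Deliveries <- ave(
  seq_len(nrow(df1)), df1$Soldto, df1$Shipto, df1$Method,
  FUN = function(i) length(unique(df1$Delivery[i]))
)
print(df1)
```

With dplyr loaded, replacing summarise with mutate in the answer's pipeline should give the same row-level result.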

reformat data frame in R

I am new to R.
I need to reformat the following data frame:
`Sample Name` `Target Name` 'CT values'
<chr> <chr> <dbl>
1 Sample 1 actin 19.69928
2 Sample 1 Ho-1 27.71864
3 Sample 1 Nrf-2 26.00012
9 Sample 9 Ho-1 25.31180
10 Sample 9 Nrf-2 26.41421
11 Sample 9 C3 26.16980
...
15 Sample 1 actin 19.49202
Actually, I want to have the different 'Target Names' as column names, and the individual 'Sample Names' as row names. The table should then display the respective CT values.
But note that there are duplicates: for example, Sample 1 appears twice with the same target name ("actin"). I want the table to show each such combination only once, with the mean of the two CT values.
I guess this is a very basic R data frame manipulation, but as I said, I am quite new to R and messing around with different tutorials.
Thank you very much in advance!
One way of doing that using the tidyverse ecosystem of packages:
library(tidyverse)
tab <- tribble(
  ~`Sample Name`, ~`Target Name`, ~`CT values`,
  "Sample 1", "actin", 19.69928,
  "Sample 1", "Ho-1", 27.71864,
  "Sample 1", "Nrf-2", 26.00012,
  "Sample 9", "Ho-1", 25.31180,
  "Sample 9", "Nrf-2", 26.41421,
  "Sample 9", "C3", 26.16980,
  "Sample 1", "actin", 19.49202
)
tab %>%
  # calculate the mean of your duplicates
  group_by(`Sample Name`, `Target Name`) %>%
  summarise(`CT values` = mean(`CT values`)) %>%
  # reshape the data
  spread(`Target Name`, `CT values`)
#> # A tibble: 2 x 5
#> # Groups: Sample Name [2]
#> `Sample Name` actin C3 `Ho-1` `Nrf-2`
#> * <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Sample 1 19.6 NA 27.7 26.0
#> 2 Sample 9 NA 26.2 25.3 26.4
You can also use data.table, whose dcast reshape function offers a more concise way of doing this:
library(data.table)
#>
#> Attaching package: 'data.table'
#> The following objects are masked from 'package:dplyr':
#>
#> between, first, last
#> The following object is masked from 'package:purrr':
#>
#> transpose
setDT(tab)
dcast(tab, `Sample Name` ~ `Target Name`, fun.aggregate = mean)
#> Using 'CT values' as value column. Use 'value.var' to override
#> Sample Name C3 Ho-1 Nrf-2 actin
#> 1: Sample 1 NaN 27.71864 26.00012 19.59565
#> 2: Sample 9 26.1698 25.31180 26.41421 NaN
Created on 2018-01-13 by the reprex package (v0.1.1.9000).
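If no extra packages are available, the same average-then-reshape step can be sketched in base R with tapply(), which returns a matrix with samples as row names and targets as column names. The short column names below are illustrative renames of the question's columns.

```r
# The seven rows shown in the answer above, with simplified column names.
tab <- data.frame(
  sample = c("Sample 1", "Sample 1", "Sample 1", "Sample 9",
             "Sample 9", "Sample 9", "Sample 1"),
  target = c("actin", "Ho-1", "Nrf-2", "Ho-1", "Nrf-2", "C3", "actin"),
  ct = c(19.69928, 27.71864, 26.00012, 25.31180, 26.41421, 26.16980, 19.49202),
  stringsAsFactors = FALSE
)

# Mean CT value for every sample/target combination; combinations that
# never occur are filled with NA.
wide <- tapply(tab$ct, list(tab$sample, tab$target), mean)
print(wide)

# The duplicated "Sample 1" / "actin" rows are averaged.
print(wide["Sample 1", "actin"])  # 19.59565
```

The column order of the matrix depends on the locale's sort order, so it is safer to index it by name than by position.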
