R crashes while using data.table

sac[,treatment_days := as.character(seq(from = SACDPDAT, to = SACRTDAT, by = "1 day")), by = PACKID]
I have a data.table named sac; dput(sac[1:2,]) gives:
structure(list(SUBJECT_Blinded = c(1201001, 1101001), LINE = c(8,
4), MODULE = c("SAC", "SAC"), CENTRE_Blinded = c(1201, 1201),
STUDYPER = c(7, 4), PACKID = c(10096, 10595), SACDPDAT = structure(c(1335304800,
1325545200), class = c("POSIXct", "POSIXt"), tzone = ""),
SACDP1 = c(35, 35), C_SACDP = c(NA_character_, NA_character_
), SACRTDAT = structure(c(1340316000, 1327964400), class = c("POSIXct",
"POSIXt"), tzone = ""), SACRT1 = c(0, 9), C_SACRT = c(NA_character_,
NA_character_)), .Names = c("SUBJECT_Blinded", "LINE", "MODULE",
"CENTRE_Blinded", "STUDYPER", "PACKID", "SACDPDAT", "SACDP1",
"C_SACDP", "SACRTDAT", "SACRT1", "C_SACRT"), sorted = c("SUBJECT_Blinded",
"PACKID"), class = c("data.table", "data.frame"), row.names = c(NA,
-2L))
When I run the code:
sac[,treatment_days := list(format(seq(from = SACDPDAT, to = SACRTDAT, by = "1 day"),"%Y-%m-%d")), by = PACKID]
RStudio crashes and reports:
Problem signature:
Problem Event Name: APPCRASH
Application Name: rsession.exe
Application Version: 0.98.501.0
Application Timestamp: 52e8371d
Fault Module Name: R.dll
Fault Module Version: 3.3.65126.0
Fault Module Timestamp: 53185fd3
Exception Code: c0000005
Exception Offset: 0000000000028c36
OS Version: 6.1.7601.2.1.0.256.48
Locale ID: 1045
Additional Information 1: 4fc0
Additional Information 2: 4fc0e6e5b53a870c89fb6e37a38d7e6b
Additional Information 3: 9d6e
Additional Information 4: 9d6e8f79167930945e5a5d06afac680e
It's the same with pure R. Any ideas how to do it another way?

There are a couple of problems with your new code:
A numeric by is interpreted as days only when seq is run on Date objects. SACDPDAT and SACRTDAT are POSIXct, where a number means seconds, so keep the string form here ("1 day" is accepted for both classes):
seq(from = SACDPDAT, to = SACRTDAT, by = "1 day")
More importantly, you cannot create a new column from this sequence with := by group, because each PACKID yields many values while := expects one value per row. Instead, you can generate the sequence of days by PACKID and then join it onto the old data.table.
So try:
setkey(sac, PACKID)
sac <- sac[sac[, list(treatment_day = seq(from = SACDPDAT, to = SACRTDAT, by = "1 day")), by = PACKID]]
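To see what the generate-then-join approach produces, here is a minimal sketch on toy data. The table sac_toy and its values are made up for illustration, and the dates are plain Date objects rather than POSIXct to keep the example short; it assumes data.table is installed.

```r
library(data.table)

# Toy data (hypothetical values): one dispensing/return date pair per pack
sac_toy <- data.table(
  PACKID   = c(1L, 2L),
  SACDPDAT = as.Date(c("2012-01-01", "2012-02-10")),
  SACRTDAT = as.Date(c("2012-01-03", "2012-02-11"))
)
setkey(sac_toy, PACKID)

# One row per pack and treatment day
days <- sac_toy[, list(treatment_day = seq(SACDPDAT, SACRTDAT, by = "1 day")),
                by = PACKID]

# Join the days back onto the original columns; every original row is
# repeated once per treatment day of its pack
expanded <- sac_toy[days]
```

Here expanded has one row per pack and day, with the original columns carried along next to treatment_day.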

Related

How to NOT write_csv if data frame is empty

I have a dataframe that is gathered every day via a SQL query. Sometimes it'll have rows in it, sometimes it won't. I then write_csv it into a OneDrive location, which triggers an automated email.
df and code like this if relevant:
df<-structure(list(PROTOCOL_ID = numeric(0), PROTOCOL_NO = character(0),
STATUS = character(0), STATUS_DATE = structure(numeric(0), tzone = "", class = c("POSIXct",
"POSIXt")), PROCESSED_FLAG = character(0), INITIATOR_CODE = numeric(0),
CHANGE_REASON_CODE = numeric(0), PR_STATUS_ID = numeric(0),
COMMENTS = character(0), CREATED_DATE = structure(numeric(0), tzone = "", class = c("POSIXct",
"POSIXt")), CREATED_USER = character(0), MODIFIED_DATE = structure(numeric(0), tzone = "", class = c("POSIXct",
"POSIXt")), MODIFIED_USER = character(0), OUTCOME_ID = numeric(0),
IRB_NO = character(0), NCT_NUMBER = character(0), PI_NAMES = character(0)), row.names = integer(0), class = "data.frame")
write_csv(df, "df.csv")
If the dataframe has zero rows that day, I'd rather it DIDN'T write the csv. I'm sure I could figure out a step that deletes the data frame if empty and then the write_csv line would error, but I'd rather not do that. Is there an easy way to 'turn off' the write?
We could add a condition so that the csv is only written when the number of rows is greater than 0:
if(nrow(df) > 0) readr::write_csv(df, "df.csv")
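If the same check is needed in several scripts, it could be wrapped in a small function. This is only a sketch: write_if_nonempty is a hypothetical name, and it uses base R's write.csv so that it runs without extra packages; readr::write_csv(df, path) could be swapped in.

```r
# Hypothetical helper: write the file only when there is at least one row.
# Returns TRUE invisibly if a file was written, FALSE otherwise.
write_if_nonempty <- function(df, path) {
  if (nrow(df) == 0) {
    message("No rows; skipping ", path)
    return(invisible(FALSE))
  }
  # Assumption: base write.csv stands in for readr::write_csv here
  utils::write.csv(df, path, row.names = FALSE)
  invisible(TRUE)
}
```

Since no file is created on empty days, the OneDrive trigger should simply not fire.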

Lexis function not found in R

I am using this code from the R help guide in the Epi package:
# A small bogus cohort
xcoh <- structure( list( id = c("A", "B", "C"),
birth = c("14/07/1952", "01/04/1954",
"10/06/1987"),
entry = c("04/08/1965", "08/09/1972",
"23/12/1991"),
exit = c("27/06/1997", "23/05/1995",
"24/07/1998"),
fail = c(1, 0, 1) ),
.Names = c("id", "birth", "entry", "exit",
"fail"),
row.names = c("1", "2", "3"),
class = "data.frame" )
# Define a Lexis object with timescales calendar time and
age
Lcoh <- Lexis( entry = list( per=entry ),
exit = list( per=exit,
age=exit-birth ),
exit.status = fail,
data = xcoh )
But I get this error:
Error in Lexis(entry = list(per = entry), exit = list(per = exit, age = exit - :
could not find function "Lexis"
Any thoughts?
The Epi package first needs to be installed:
install.packages("Epi")
Then the package needs to be loaded:
library(Epi)
So your code becomes:
install.packages("Epi")
library(Epi)
xcoh <- structure( list( id = c("A", "B", "C"),
birth = c("14/07/1952", "01/04/1954",
"10/06/1987"),
entry = c("04/08/1965", "08/09/1972",
"23/12/1991"),
exit = c("27/06/1997", "23/05/1995",
"24/07/1998"),
fail = c(1, 0, 1) ),
.Names = c("id", "birth", "entry", "exit",
"fail"),
row.names = c("1", "2", "3"),
class = "data.frame" )
# Define a Lexis object with timescales calendar time and
Lcoh <- Lexis( entry = list( per=entry ),
exit = list( per=exit,
age=exit-birth ),
exit.status = fail,
data = xcoh )
Note: I have removed the stray line that says age, assuming it is not relevant to the question posted here.
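Since install.packages only needs to run once per machine, the install step can be guarded so it is skipped when the package is already there. A minimal sketch, where ensure_package is a hypothetical helper name (not part of Epi or base R):

```r
# Hypothetical helper: install a package only when it is missing,
# then attach it so its functions can be found.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}
```

Calling ensure_package("Epi") before the Lexis() call would then take the place of the two separate lines above.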

r data.table appears to have duplicated rows but unique doesn't find them

I have a data.table, dt, where there appear to be two copies of each row. But when I run unique(dt), there are no duplicates.
The output of dput on the file is below
structure(list(region_code.IMPACT159 = c("CHM", "CHM"), c_Crust.elas = c(1, 1),
c_Mllsc.elas = c(0.437389655806453,
0.437389655806453),
c_FrshD.elas = c(0.361233613522818,
0.361233613522818),
c_OPelag.elas = c(0.361774165068678,
0.361774165068678
),
c_ODmrsl.elas = c(1, 1),
c_OMarn.elas = c(-0.09, -0.09),
c_FshOil.elas = c(0.382700000000001,
0.382700000000001),
c_aqan.elas = c(0, 0),
c_aqpl.elas = c(0,
0)),
sorted = "region_code.IMPACT159",
class = c("data.table",
"data.frame"),
row.names = c(NA, -2L),
.internal.selfref = <pointer: 0x7fd6af00e2e0>)
I can't run this directly because of the .internal.selfref pointer. But when I delete that, the resulting table does show that one row is duplicated. I've been running this type of code for a long time, so I'm not sure what has changed to cause this. I'm using version 1.11.9 of data.table.

Using literal month names with year in rAmCharts

Here is my code to generate a bar plot using rAmCharts:
library(rAmCharts)
amBarplot(x = "month", y = "value", data = dataset,
dataDateFormat = "MM/YYYY", minPeriod = "MM",
show_values = FALSE, labelRotation = -90, depth = 0.1)
However, is there a way to use month names & year in my x axis? I am trying to use MMM-YY formats.
Sample dataset:
structure(list(value = c(11544, 9588, 9411, 10365, 11154, 12688
), month = c("05/2012", "06/2012", "07/2012", "08/2012", "09/2012",
"10/2012")), .Names = c("value", "month"), row.names = c(NA,
6L), class = "data.frame")
Thanks.
It appears that rAmCharts doesn't expose AmCharts' dateFormats setting in the categoryAxis, so you have to access it through the init event and create your own dateFormats array with a modified format string for the MM period. I'm not very experienced with R, but here's how I managed to make it work using R 3.4.2 and rAmCharts 2.1.5:
chart <- amBarplot( ... settings omitted ... )
addListener(.Object = chart,
name = 'init',
expression = paste(
"function(e) {",
"e.chart.categoryAxis.dateFormats = ",
'[{"period":"fff","format":"JJ:NN:SS"},{"period":"ss","format":"JJ:NN:SS"},',
'{"period":"mm","format":"JJ:NN"},{"period":"hh","format":"JJ:NN"},{"period":"DD","format":"MMM DD"},',
'{"period":"WW","format":"MMM DD"},',
'{"period":"MM","format":"MMM-YY"},', # add YY to the default MM format
'{"period":"YYYY","format":"YYYY"}]; ',
'e.chart.validateData();',
"}")
)
Here is a different solution:
library(rAmCharts)
dataset <- structure(list(value = c(11544, 9588, 9411, 10365, 11154, 12688
), month = c("05/2012", "06/2012", "07/2012", "08/2012", "09/2012",
"10/2012")), .Names = c("value", "month"), row.names = c(NA,
6L), class = "data.frame")
dataset$month <- as.character(
format(
as.Date(paste0("01/",dataset$month), "%d/%m/%Y"),
"%B %Y"))
amBarplot(x = "month", y = "value", data = dataset,
show_values = FALSE, labelRotation = -90, depth = 0.1)
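If you specifically want the MMM-YY style from the question, the same base R conversion can produce it by changing the output format to "%b-%y". A minimal sketch; to_month_label is a hypothetical helper name, and the month abbreviations follow the session's LC_TIME locale.

```r
# Hypothetical helper: turn "MM/YYYY" strings into "Mon-YY" labels.
to_month_label <- function(month_year) {
  # Prepend a day so as.Date() can parse the string
  d <- as.Date(paste0("01/", month_year), format = "%d/%m/%Y")
  # %b = abbreviated month name (locale-dependent), %y = two-digit year
  format(d, "%b-%y")
}
```

It would be applied as dataset$month <- to_month_label(dataset$month) before calling amBarplot as above.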

date format change with DT and shiny

My problem is that the output of formatDate differs between my computer and the server when I use datatable.
I know I'm using method = 'toLocaleDateString'; maybe it's not the right method.
On my computer it gives me the format I want:
1 février 2000
21 mars 2000
On the Shiny server it gives me:
01/02/2000
21/03/2000
Both the local computer and the server return the same Sys.timezone():
[1] "Europe/Paris"
I'm trying to do it like this:
a <-structure(list(timestamp = structure(c(949363200, 953596800,
961286400, 962582400, 965347200, 969667200),
class = c("POSIXct", "POSIXt"), tzone = "UTC"),
anoms = c(1, 1, 1, 1, 1, 2), syndrome = c("Acrosyndrome",
"Acrosyndrome", "Acrosyndrome", "Acrosyndrome", "Acrosyndrome",
"Acrosyndrome")), .Names = c("timestamp", "anoms", "syndrome"
), row.names = c(NA, 6L), class = "data.frame")
datatable(a) %>% formatDate( 1, method = 'toLocaleDateString')
a
Thank you
With the development version of DT (>= 0.2.2) on GitHub, you can pass additional parameters to the date conversion method, e.g.
datatable(a) %>%
formatDate(1, method = 'toLocaleDateString', params = list('fr-FR'))
Or more parameters:
datatable(a) %>% formatDate(
1, method = 'toLocaleDateString',
params = list('fr-FR', list(year = 'numeric', month = 'long', day = 'numeric'))
)
