How to handle and convert null datetime field into unixtime stamp in Scala - rdd

I have some code snippet as below that is not accepted by Scala, it would be appreciated if someone can help to fix it, thanks.
train_no_header is an RDD generated from a csv file, its first line is shown as below:
scala> train_no_header.first
res4: String = 87540,12,1,13,497,2017-11-07 09:30:38,,0
Now, I want to generate another RDD to parse and transform records with null or empty value for the 6th field which should be a DateTime (in the above sample the field is empty), some records might have that and some might not, for those having that, the format is same as 5th which is a UTC DateTime.
I need to calculate the delta between the two DateTime, I plan to convert them into Unixtime format, that being said, the final RDD should have both the two date fields converted into Unixtime format.
So my question is:
with the sample data and format, how do I create the RDD with the needed result?
for records with empty value in the 6th field, how should I handle it so that no exception would be generated in the future query in data frame (which is what I intend to work in)
Thank you very much in advance, any clue is appreciated.

Related

Has anyone used 128 bit integer for datetime

I am working with an S3 bucket and one of the fields is a datetime group and the values appear to be 128-bit integers. An example: 45376204963002810833065984. Has anyone seen this before? If so, how should this be parsed? This should work out to a date in 2022.
This is giving me problems when I try to get the data into a pandas (or polars) dataframe. The data outputs as a JSON string, and then when I try to put it into a pandas dataframe, I get an error: Value too big!. So I'm hoping to use SQL to cast this integer as a datetime before outputting as a JSON file.

Date parameter mis-read in Delphi SQLite Query

What is wrong with my code:
ExecSql('DELETE FROM STLac WHERE RegN=99 AND BegDate>= 2016-12-14');
This runs, but deletes ALL the rows in STLac for RegN, not just the rows with BegDate on or after 2016-12-14.
Originally I had:
ExecSql('DELETE FROM STLac WHERE RegN=99 AND BegDate>= :myDdate,[myDate]);
which has the advantage I hoped of not being particular to the date format. So I tried the literal date should in the format SQLite likes. Either way I get all rows deleted, not just those on or after the specified date.
Scott S.
Try double quote while putting date. As any value must be provided in between quotes until and unless that column is not int
ExecSql('DELETE FROM STLac WHERE RegN=99 AND BegDate>= "2016-12-14"');
SQLite does not have datetime format as such, so you have to figure out how date is actually represented in the table and change your query to provide the same format. First execute the "select" statement in some kind of management tool,
select * from STLac where RegN = 99 and BegDate >= '2016-12-14' --(or '2016.12.04' or something else)
which displays the result in the grid; when you see the expected rows, change it to "delete" query and copy into your Delphi program.

How to convert the format of inserted datetime in Informix?

I insert my date/time data into a CHAR column in the format: '6/4/2015 2:08:00 PM'.
I want that this should get automatically converted to format:
'2015-06-04 14:08:00' so that it can be used in a query because the format of DATETIME is YYYY-MM-DD hh:mm:ss.fffff.
How to convert it?
Given that you've stored the data in a string format (CHAR or VARCHAR), you have to decide how to make it work as a DATETIME YEAR TO SECOND value. For computational efficiency, and for storage efficiency, it would be better to store the value as a DATETIME YEAR TO SECOND value, converting it on input and (if necessary) reconverting on output. However, if you will frequently display the value without doing computations (including comparisons or sorting) it, then maybe a rococo locale-dependent string notation is OK.
The key function for converting the string to a DATETIME value is TO_DATE. You also need to look at the TO_CHAR function because that documents the format codes that you need to use, and because you'll use that to convert a DATETIME value to your original format.
Assuming the column name is time_string, then you need to use:
TO_DATE(time_string, '%m/%d/%Y %I:%M %x') -- What goes in place of x?
to convert to a DATETIME YEAR TO SECOND — or maybe DATETIME YEAR TO MINUTE — value (which will be further manipulated as if by EXTEND as necessary).
I would personally almost certainly convert the database column to DATETIME YEAR TO SECOND and, when necessary, convert to the string format on output with TO_CHAR. The column name would now be time_value (for sake of concreteness):
TO_CHAR(time_value, '%m/%d/%Y %I:%M %x') -- What goes in place of x?
The manual pages referenced do not immediately lead to a complete specification of the format strings. I think a relevant reference is GL_DATETIME environment variable, but finding that requires more knowledge of the arcana of the Informix product set than is desirable (it is not the first thing that should spring to anyone's mind — not even mine!). If that's correct (it probably is), then one of %p and %r should be used in place of %x in my examples. I have to get Informix (re)configured on my machine to be able to test it.

PHPExcel - Reading Date values

I am using PHPExcel library to read spread sheet data. This library is giving me trouble reading the 'DATE' values.
Question:
How to instruct the PHPExcel to read the 'DATE' values properly even if 'setReadDataOnly' is set as 'false'?
Description:
PHPExcel version: 2.1, Environment: Ubuntu 12
Here is the code block:
$objReader = \PhpExcel\PHPExcel_IOFactory::createReaderForFile($spreadSheetFullFilePathString);
$objReader->setReadDataOnly(true);
$objPHPExcel = $objReader->load($spreadSheetFullFilePathString);
$objWorksheetObject = $objPHPExcel->getActiveSheet();
Am getting 'integer' values for the 'DATE' column values by default.
I found that, the line '$objReader->setReadDataOnly(true)' is causing the trouble.
So, Changed it as '$objReader->setReadDataOnly(false)'.
I am getting the Date column value properly after this change. But now the reader is reading ALL the ROWS + COLUMNS found in the excel.
Any help would be greatly appreciated.
If $objReader->setReadDataOnly(true);, then the only reader that can identify dates is the Gunumeric Reader. A date value can only be identified by the number format mask, and setReadDataOnly(true) tells the Reader not to read the formatting, so dates cannot subsequently be identified (Gnumeric is the exception, because it was written after the problem was identified, but none of the other Readers have yet been modified to read this information regardless of the setReadDataOnly value. If you need to identify dates, then the only option at this point is $objReader->setReadDataOnly(false);
However, for date cells, you shouldn't get an integer value; you should get a float: it's the decimal that identifies the time part of the Excel datetime serialized value.
If you know which cells contain dates, then you can convert them to unix timestamps or PHPExcel DateTime objects using the helper functions defined in the PHPExcel_Shared_DateTime class (and can then use all the standard PHP date handling functions).
You say that the reader is now reading ALL the ROWS + COLUMNS found in the excel: it should, irrespective of the setReadDataOnly value. If you don't want to read all the rows and columns, then you need to set a Read Filter.

SELECT clause with a DATETIME column in Sybase 15

I'm trying to do a query like this on a table with a DATETIME column.
SELECT * FROM table WHERE the_date =
2011-03-06T15:53:34.890-05:00
I have the following as an string input from an external source:
2011-03-06T15:53:34.890-05:00
I need to perform a query on my database table and extract the row which contains this same date. In my database it gets stored as a DATETIME and looks like the following:
2011-03-06 15:53:34.89
I can probably manipulate the outside input slightly ( like strip off the -5:00 ). But I can't figure out how to do a simple select with the datetime column.
I found the convert function, and style 123 seems to match my needs but I can't get it to work. Here is the link to reference about style 123
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.help.ase_15.0.blocks/html/blocks/blocks125.htm
I think that convert's slightly wrongly documented in that version of the docs.
Because this format always has century I think you only need use 23. Normally the 100 range for convert adds the century to the year format.
That format only goes down to seconds what's more.
If you want more you'll need to past together 2 x converts. That is, past a ymd part onto a convert(varchar, datetime-column, 14) and compare with your trimmed string. milliseconds comparison is likely to be a problem depending on where you got your big time string though because the Sybase binary stored form has a granularity of 300ms I think, so if your source string is from somewhere else it's not likely to compare. In other words - strip the milliseconds and compare as strings.
So maybe:
SELECT * FROM table WHERE convert(varchar,the_date,23) =
'2011-03-06T15:53:34'
But the convert on the column would prevent the use of an index, if that's a problem.
If you compare as datetimes then the convert is on the rhs - but you have to know what your milliseconds are in the_date. Then an index can be used.

Resources