KeyError: 'date' in format dd-mm-yyyy - datetime

#read the data in csv
df_sales = pd.read_csv("C:\\Users\\Yuvraj\\Downloads\\sales_data.csv",'utf-8')
#convert date field from string to datetime
df_sales = {1: "date", 2: "store", 3: "item", 4:"sales"}
df_sales['date'] = pd.to_datetime(df_sales['date'])
#show first 10 rows
df_sales.head(10)
I was trying to convert the 'date' field from string to datetime.
This is the output: KeyError: 'date'
What should I do?
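The KeyError most likely comes from the second assignment: df_sales is replaced by a plain dict whose keys are the integers 1 to 4, so 'date' is no longer a valid key. A minimal sketch of one way to do the conversion, assuming the file has a header row with the columns date, store, item and sales, and dates written as dd-mm-yyyy (the inline CSV text is a hypothetical stand-in for sales_data.csv):

```python
import io

import pandas as pd

# Hypothetical stand-in for sales_data.csv; the column names and the
# dd-mm-yyyy layout are assumptions based on the question.
csv_text = """date,store,item,sales
01-01-2013,1,1,13
02-01-2013,1,1,11
13-01-2013,1,1,14
"""

# With a real file this would be pd.read_csv(path, encoding="utf-8");
# the encoding has to be passed by keyword, not as a second positional argument.
df_sales = pd.read_csv(io.StringIO(csv_text))

# Do not reassign df_sales to a dict here: that replaces the DataFrame
# and is what makes df_sales['date'] raise KeyError.
df_sales["date"] = pd.to_datetime(df_sales["date"], format="%d-%m-%Y")

print(df_sales.head(10))
print(df_sales.dtypes)
```

If the file has no header row, passing header=None together with names=["date", "store", "item", "sales"] to read_csv should name the columns without overwriting the DataFrame.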

Related

Transform integer column to date

I have a column in the format 19990101 that I would like to change to date format. It is currently stored as an integer.
I've tried the following, but it results in:
Error: unexpected symbol in: mutate(Date = Date, Date = as.integer(format)date.
mutate(Date = Date,
Date = as.integer(format)date, "%Y%m%d")
mutate(Date = as.Date(format(Date), "%Y%m%d"))
The above code solved my issue.
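For comparison, the same integer-to-date conversion can be sketched in pandas. This is only an illustration: the DataFrame and its values are hypothetical, and the third value is deliberately an impossible date to show how errors="coerce" turns unparseable entries into NaT instead of raising.

```python
import pandas as pd

# Hypothetical data; 19990231 is intentionally not a real calendar date.
df = pd.DataFrame({"Date": [19990101, 19991231, 19990231]})

# Convert the integers to strings first, then parse them as yyyymmdd.
# errors="coerce" replaces values that do not form a valid date with NaT.
df["Date"] = pd.to_datetime(df["Date"].astype(str), format="%Y%m%d", errors="coerce")

print(df)
```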

How to format properly date-time column in R using mutate?

I am trying to convert a string column to a date-time series.
The rows in the column look like this example: "2019-02-27T19:08:29+000"
(dateTime is the column, the variable)
mutate(df,dateTime=as.Date(dateTime, format = "%Y-%m-%dT%H:%M:%S+0000"))
But the result is:
2019-02-27
What about the hours, minutes and seconds?
I need them in order to filter by date-time.
Your code is almost correct. Only the extra 0 in the format string and the use of as.Date (which drops the time of day) were wrong; as.POSIXct keeps the time:
library("dplyr")
df <- data.frame(dateTime = "2019-02-27T19:08:29+000",
stringsAsFactors = FALSE)
mutate(df, dateTime = as.POSIXct(dateTime, format = "%Y-%m-%dT%H:%M:%S+000"))
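The same idea, keeping the time of day and then filtering on it, can be sketched in pandas. This assumes the trailing "+000" is a malformed offset that can simply be discarded; the second row and the cutoff time are hypothetical values added only to make the filter visible.

```python
import pandas as pd

# First value is from the question; the second is a hypothetical extra row.
df = pd.DataFrame(
    {"dateTime": ["2019-02-27T19:08:29+000", "2019-02-27T21:30:00+000"]}
)

# Keep the first 19 characters (yyyy-mm-ddTHH:MM:SS) and parse them,
# which drops the "+000" suffix but preserves hours, minutes and seconds.
df["dateTime"] = pd.to_datetime(
    df["dateTime"].str.slice(0, 19), format="%Y-%m-%dT%H:%M:%S"
)

# Filter by date-time using an arbitrary, hypothetical cutoff.
late = df[df["dateTime"] >= pd.Timestamp("2019-02-27 20:00:00")]
print(late)
```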

Error in converting daily stock series into monthly ones

I am new to R and am having difficulty converting daily stock series into monthly ones in the xts class.
I have stock data for the ticker IOO, downloaded from Yahoo! Finance into a .csv (comma-delimited) file, that looks like this:
Date        Open   High   Low    Close  Adjusted Close  Volume
12/31/2012  63     63.95  63     63.95  56.11           87900
1/2/2013    64.94  65.19  64.62  65.08  57.09           77900
1/3/2013    64.94  64.98  64.63  64.68  56.74           36000
1/4/2013    64.70  65.13  64.63  65.12  57.13           49400
1/7/2013    64.83  64.91  64.61  64.88  56.93           102000
1/8/2013    64.65  64.76  64.37  64.58  56.66           31600
I have written the following in R in order to read it, convert it to xts and convert daily to monthly:
library(tseries)
library(xts)
library(PerformanceAnalytics)
IOO <- read.csv(file = "IOO.csv", header = TRUE, sep = ",")
IOO <- subset(IOO[, c(1,2,3,4,6)])
colnames(IOO) <- c("Date", "Open", "High", "Low", "Close")
IOO[,"Date"] <- as.Date(IOO[,"Date"], format = "%m / %d / %Y")
IOO <- as.xts(IOO, order.by = as.Date(rownames(IOO), "%Y-%m-%d"), dateFormat = "POSIXct", frequency = NULL, .RECLASS = FALSE)
IOO_monthly <- to.monthly(IOO, indexAt='yearmon', drop.time = TRUE, name = NULL)
Error in to.period(x, "months", indexAt = indexAt, name = name, ...) :
unsupported type
IOO_monthly <- to.period(IOO, period = "months", indexAt='yearmon', name = NULL)
Error in to.period(IOO, period = "months", indexAt = "yearmon", name = NULL) :
unsupported type
IOO_monthly <- to.period(IOO, period = "months", indexAt= NULL, name = NULL)
Error in to.period(IOO, period = "months", indexAt = NULL, name = NULL) :
unsupported type
I have tried many other combinations of arguments in to.period and to.monthly, but it did not work out.
Thank you in advance for your help.
To answer your actual question: the problem is that you include the "Date" column in the coredata of your xts object. xts and zoo objects are simply a matrix with an index attribute (not an index column). Since you can't have mixed column types in a matrix, all your data are converted to a single type (character in this case).
library(xts)
# Example data
IOO <- read.csv(text="Date,Open,High,Low,Close,Adjusted Close,Volume
12/31/2012,63,63.95,63,63.95,56.11,87900
1/2/2013,64.94,65.19,64.62,65.08,57.09,77900
1/3/2013,64.94,64.98,64.63,64.68,56.74,36000
1/4/2013,64.70,65.13,64.63,65.12,57.13,49400
1/7/2013,64.83,64.91,64.61,64.88,56.93,102000
1/8/2013,64.65,64.76,64.37,64.58,56.66,31600")
# Omit 'Close' and 'Volume' columns
IOO <- IOO[, c(1,2,3,4,6)]
# Rename 'Adjusted' column to 'Close'
colnames(IOO) <- c("Date", "Open", "High", "Low", "Close")
# Convert 'Date' column to actual Date class
IOO[, "Date"] <- as.Date(IOO[, "Date"], format = "%m/%d/%Y")
# Create xts object, using 'Date' column as index, and everything else as coredata
IOO <- xts(IOO[,-1], IOO[,1])
# Aggregate to monthly
IOO_monthly <- to.monthly(IOO)
IOO_monthly
#          IOO.Open IOO.High IOO.Low IOO.Close
# Dec 2012    63.00    63.95   63.00     56.11
# Jan 2013    64.94    65.19   64.37     56.66
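The same principle (dates belong in the index, not among the numeric columns) can be illustrated with a rough pandas equivalent of the aggregation. This is a sketch rather than a port of to.monthly: it assumes that "monthly" means first open, highest high, lowest low and last adjusted close per calendar month, and it reuses the sample rows from the question.

```python
import io

import pandas as pd

# Sample rows from the question, written out as CSV text.
csv_text = """Date,Open,High,Low,Close,Adjusted Close,Volume
12/31/2012,63,63.95,63,63.95,56.11,87900
1/2/2013,64.94,65.19,64.62,65.08,57.09,77900
1/3/2013,64.94,64.98,64.63,64.68,56.74,36000
1/4/2013,64.70,65.13,64.63,65.12,57.13,49400
1/7/2013,64.83,64.91,64.61,64.88,56.93,102000
1/8/2013,64.65,64.76,64.37,64.58,56.66,31600
"""

daily = pd.read_csv(io.StringIO(csv_text))

# Parse the dates and move them into the index, so that only numeric
# columns remain as data.
daily["Date"] = pd.to_datetime(daily["Date"], format="%m/%d/%Y")
daily = daily.set_index("Date")

# Group by calendar month and aggregate each column with its own rule.
monthly = (
    daily.groupby(daily.index.to_period("M"))
    .agg({"Open": "first", "High": "max", "Low": "min", "Adjusted Close": "last"})
    .rename(columns={"Adjusted Close": "Close"})
)

print(monthly)
```

Grouping on the monthly period of the index is used here instead of resample only to avoid depending on a particular frequency alias.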

How to convert datetime from string format into datetime format in pyspark?

I created a dataframe using sqlContext and I have a problem with the datetime format as it is identified as string.
df2 = sqlContext.createDataFrame(i[1])
df2.show()
df2.printSchema()
Result:
2016-07-05T17:42:55.238544+0900
2016-07-05T17:17:38.842567+0900
2016-06-16T19:54:09.546626+0900
2016-07-05T17:27:29.227750+0900
2016-07-05T18:44:12.319332+0900
string (nullable = true)
Since the column's type is string, I want to convert it to a datetime type as follows:
df3 = df2.withColumn('_1', df2['_1'].cast(datetime()))
Here I got an error:
TypeError: Required argument 'year' (pos 1) not found
What should I do to solve this problem?
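Try this, casting with Spark's DateType instead of Python's datetime class (note that DateType keeps only the date, not the time of day):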
from pyspark.sql.types import DateType
ndf = df2.withColumn('_1', df2['_1'].cast(DateType()))
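The TypeError in the question does not come from Spark at all: datetime() is Python's own constructor, and calling it without a year fails before cast is ever reached. The sketch below reproduces that with the standard library only and, as a side illustration outside Spark, parses one of the sample strings; the format string is an assumption based on the rows shown in the question.

```python
from datetime import datetime

# Calling the constructor with no arguments raises TypeError, because
# year, month and day are required. The exact wording of the message
# varies between Python versions.
try:
    datetime()
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Parsing one of the sample values in plain Python.
# %f reads the microseconds and %z reads the +0900 offset.
raw = "2016-07-05T17:42:55.238544+0900"
parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%f%z")
print(parsed.isoformat())
```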

R read dates in format yyyymmdd [duplicate]

I have a dataset whose first column holds dates in yyyymmdd format. I tried to convert these dates to R's Date class using:
yield.usd$Fecha <- as.Date(yield.usd$Fecha, format='%Y%m%d')
and the output is:
Error in as.Date.numeric(yield.usd$Fecha, format = "%Y%m%d") : 'origin' must be supplied
Then I tried:
yield.dates <- as.Date(yield.usd[1], format='%Y%m%d')
The output is as follows:
Error in as.Date.default(yield.usd[1], format = "%Y%m%d") : do not know how to convert 'yield.usd[1]' to class “Date”
How can I make R read those dates?
The output of dput(head(yield.usd)) is as follows:
structure(list(Fecha = c(20120815L, 20120815L, 20120815L, 20120815L,
20120815L, 20120815L), Plazo = 1:6, Soberana = c(0.001529738,
0.001558628, 0.001587518, 0.001616408, 0.001645299, 0.001674189
), AAA = c(0.009642716, 0.009671607, 0.009700497, 0.009729387,
0.009758277, 0.009787168), AA. = c(0.017483959, 0.01751285, 0.01754174,
0.01757063, 0.01759952, 0.017628411), AA = c(0.017762383, 0.017791273,
0.017820163, 0.017849053, 0.017877944, 0.017906834), AA..1 = c(0.018207843,
0.018236733, 0.018265624, 0.018294514, 0.018323404, 0.018352294
), A = c(0.036340293, 0.036369183, 0.036398073, 0.036426964,
0.036455854, 0.036484744)), .Names = c("Fecha", "Plazo", "Soberana",
"AAA", "AA.", "AA", "AA..1", "A"), row.names = c(NA, 6L), class = "data.frame")
Your dates are stored as numbers, so R treats them as a count of days from an origin (which is why it asks for 'origin', usually 1 January 1970). What you want is to parse them as strings, so convert the numbers to characters first:
as.Date(as.character(yield.usd$Fecha), format='%Y%m%d')
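pandas shows the same pitfall in a slightly different form, which may help make the "origin" explanation concrete. In this sketch, which reuses the Fecha value from the question, passing the integers directly makes pandas read them as nanoseconds since 1 January 1970, while converting to strings first gives the intended dates.

```python
import pandas as pd

# Values taken from the Fecha column in the question.
fecha = pd.Series([20120815, 20120815])

# Integers passed directly are interpreted as offsets from the epoch
# (nanoseconds by default), so the result lands in 1970.
as_offsets = pd.to_datetime(fecha)

# Converting to strings first lets the yyyymmdd pattern be parsed as a date.
as_dates = pd.to_datetime(fecha.astype(str), format="%Y%m%d")

print(as_offsets.iloc[0])
print(as_dates.iloc[0])
```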
