I have read a csv file into R. After reading it, I need to convert the Date column to a Date object and the Time column to a time object. How do I do it? The data is already in memory at this point.
Also, how do I get the classes of all columns in the file? I tried lapply and sapply. They print the column names to the console but say nothing about the class.
You're going to need the strptime function. The exact code depends on the date-time format used in the .csv file; the help page (?strptime) lists the available format codes.
So if your date-time is given like this:
2014/01/04 12:30:36
Then your code will look something like this:
strptime(data$column_name, format="%Y/%m/%d %H:%M:%S")
For finding the class, simply use the class() function.
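As a minimal sketch of both steps, assuming hypothetical column names Date and Time and the formats shown above: strptime returns a POSIXlt object, so as.Date and as.POSIXct are often more convenient for data frame columns. Base R has no time-only class, so the sketch combines date and time into one POSIXct column. The data frame here is made up for illustration and stands in for the CSV.

```r
# Hypothetical data standing in for the imported CSV
df <- data.frame(
  Date = c("2014/01/04", "2014/01/05"),
  Time = c("12:30:36", "08:15:00"),
  stringsAsFactors = FALSE
)

# Convert the text dates to Date objects
df$Date <- as.Date(df$Date, format = "%Y/%m/%d")

# Combine date and time into a single POSIXct date-time column
df$DateTime <- as.POSIXct(
  paste(format(df$Date), df$Time),
  format = "%Y-%m-%d %H:%M:%S",
  tz = "UTC"
)

# Class of every column; a column can have more than one class,
# so the classes are pasted together into one string per column
col_classes <- vapply(
  df,
  function(col) paste(class(col), collapse = "/"),
  character(1)
)
print(col_classes)
```

Calling lapply(df, class) gives the same information as a list, one element per column.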
These tools can be discovered fairly easily with a little bit of research. Next time put in a little bit more effort before asking your question.
Hope this helps.
Related
I have a data set that will be growing. It is categorical observations (i.e., 1=yes, 2=no) by date and hour. Is the following an acceptable method of formatting for import to R or is there a better way?
I would use a template like this:
Using one column for the date makes it much easier to read/import into R. Also, the YYYY-MM-DD is the default format in R for date columns. Trying to write date and hour together in one column could be done but seems like it could be tedious and not as easy to see what is going on in the data. As was mentioned in the comments above, each observation should be on a separate row. Once you save the data as a csv, it will be easily imported into R.
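A layout along those lines might be imported as sketched below. The column names (date, hour, observation) and the sample rows are assumptions made up for illustration, not the original template. The text = argument stands in for a file path so the example is self-contained.

```r
# Hypothetical CSV content: one observation per row, date in YYYY-MM-DD
csv_text <- "date,hour,observation
2014-01-04,8,1
2014-01-04,9,2
2014-01-05,8,1"

# colClasses = "Date" parses the default YYYY-MM-DD format directly
obs <- read.csv(
  text = csv_text,
  colClasses = c("Date", "integer", "integer")
)

# Label the categorical codes (1 = yes, 2 = no)
obs$observation <- factor(
  obs$observation,
  levels = c(1, 2),
  labels = c("yes", "no")
)

str(obs)
```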
Good luck.
So I've used the decompose function and I want to export all the lists it generates not just the plot it creates. I tried converting the lists into either a matrix or data frame but then that gets rid of the date header and year columns so if someone knows how to convert it and keep the list formatting that would solve my issue I think.
Anyway, the closest I've got to doing this while keeping the list format is:
capture.output(decompose, file = "filename.csv")
As you can see from the image attached though:
Sometimes the months aren't all together in a row, which is really not helpful or what I want. It also puts everything in one column, so I have to go into Excel afterwards and use the text-to-columns option, which is going to get old really quickly.
Any help would be greatly appreciated. I'm really new to R, so I apologise if there is an obvious fix I'm missing.
Is there a way to save a data frame and keep its column classes and attributes? I find it very annoying to read the file back and convert, e.g., from character to fastposix or from numeric to factor, especially knowing this was all set before.
As @nicola pointed out, the functions saveRDS and readRDS do the job.
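A minimal round trip might look like this; the data frame and the temporary file path are made up for illustration.

```r
# Hypothetical data frame with a date-time column and a factor column
df <- data.frame(
  when  = as.POSIXct("2014-01-04 12:30:36", tz = "UTC"),
  group = factor("a", levels = c("a", "b"))
)

# Write the object, classes and attributes included, to an .rds file
path <- tempfile(fileext = ".rds")
saveRDS(df, path)

# Read it back; no re-conversion of columns is needed
restored <- readRDS(path)
print(sapply(restored, function(col) class(col)[1]))
```

Unlike a CSV, the .rds file is an R-specific binary format, so it is not meant to be opened in other tools.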
So full disclosure, I am new to R and programming in general. Because of that, it is very hard for me to search when I have problems because I am not even sure what keywords to use. I am learning, and all I am hoping for y'all to do is point me in the right direction.
I have a very large csv file that I imported into R. Around 2 million observations (don't worry, I am not planning on using all 2 million). The only problem is that the people recording the data formatted the file to record prices as "$10.00". Because of this, R recognizes the data as a factor, and also treats each individual price as a separate variable because of the dollar sign. I would like to reformat this column as a numeric variable.
I am sure there is some way to go about reformatting this in R; the only problem is I am not sure which functions I need. Sorry for the very basic question, I have just hit a wall and figured I would reach out.
Any and all help is much appreciated!
Thank you!
We could also use sub
as.numeric(sub('\\D+', '', x))
#[1] 10.00 11.24 15.22
data
x<-c("$10.00","$11.24","$15.22")
Suppose that your data looks like this:
x<-c("$10.00","$11.24","$15.22")
You can use the substring function to trim the initial dollar sign (which will still leave you with strings) and then use as.numeric to turn it to a numeric vector.
newx<-as.numeric(substring(x,2))
will produce a vector named newx with value
c(10.00,11.24,15.22)
We tell the substring to start at the 2nd character (strings in R are 1-indexed), and then cast to numeric.
In your data frame (suppose it is called df), you can replace the column like
df$MoneyColumn <- as.numeric(substring(df$MoneyColumn,2))
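If some prices also contain thousands separators, a slightly more general sketch is to strip both characters with gsub. The sample values below are made up. Since the question says the column was read as a factor, the sketch converts to character first: calling as.numeric directly on a factor returns the internal level codes rather than the prices.

```r
# Hypothetical price column, read in as a factor
prices <- factor(c("$10.00", "$1,234.50", "$15.22"))

# Convert to character, drop every "$" and ",", then cast to numeric
clean <- as.numeric(gsub("[$,]", "", as.character(prices)))
print(clean)
```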
I would like to import a time-series where the first field indicates a period:
08:00-08:15
08:15-08:30
08:30-08:45
Does R have any features to do this neatly?
Thanks!
Update:
The most promising solution I found, as suggested by Godeke, was the chron package, using substring() to extract the start of the interval.
I'm still working on related issues, so I'll update with the solution when I get there.
CRAN shows a package that is actively updated called "chron" that handles dates. You might want to check that and some of the other modules found here: http://cran.r-project.org/web/views/TimeSeries.html
xts and zoo handle irregular time series data on top of that. I'm not familiar with these packages, but a quick look over indicates you should be able to use them fairly easily by splitting on the hyphen and loading into the structures they provide.
So you're given a character vector like c("08:00-08:15","08:15-08:30") and you want to convert to an internal R data type for consistency? Check out the help files for POSIXt and strftime.
How about a function like this:
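importTimes <- function(t){
  # Split each "HH:MM-HH:MM" string into its start and end parts
  t <- strsplit(t, "-")
  # The input has no seconds, so the format is "%H:%M" (with "%H:%M:%S" every value would parse to NA)
  return(lapply(t, strptime, format = "%H:%M"))
}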
This will take a character vector like you described, and return a list of the same length, each element of which is a POSIXt 2-vector giving the start and end times (on today's date). If you want you could add a paste("1970-01-01",x) somewhere inside the function to standardize the date you're looking at if it's an issue.
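As one possible variant of the fixed-date idea, the sketch below pastes a date onto each time and returns a data frame with start and end columns. The helper name parse_periods, its default date and the UTC time zone are assumptions for illustration, and the periods are assumed to carry hours and minutes only.

```r
# Hypothetical helper: turn "HH:MM-HH:MM" strings into start/end date-times
parse_periods <- function(periods, date = "1970-01-01", tz = "UTC") {
  # One row per period, column 1 = start text, column 2 = end text
  parts <- do.call(rbind, strsplit(periods, "-", fixed = TRUE))
  data.frame(
    start = as.POSIXct(paste(date, parts[, 1]),
                       format = "%Y-%m-%d %H:%M", tz = tz),
    end   = as.POSIXct(paste(date, parts[, 2]),
                       format = "%Y-%m-%d %H:%M", tz = tz)
  )
}

periods <- c("08:00-08:15", "08:15-08:30", "08:30-08:45")
intervals <- parse_periods(periods)
print(intervals)
```

Keeping start and end as POSIXct columns makes it straightforward to compute durations with difftime or to hand the start times to a time-series package as an index.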
Does that help at all?