This question already has answers here:
Extracting unique numbers from string in R
(7 answers)
Closed 2 years ago.
I have day data entered in several different formats, and I would like to keep only the number from each value. What should I do?
The data can be created with this code:
Days<-c("Day 1","Day 4"," Day_6", "Day7")
Sample.data <- data.frame(Days)
Basically, I want to get the number out of Days. I was thinking of using 'str_replace' or 'gsub', but I don't know how to handle the different patterns. Please give me as many methods as possible for this problem. Thanks.
Does this work:
> # Remove every non-digit character so that multi-digit days are kept whole
> Sample.data$Day <- as.numeric(gsub('\\D+', '', Sample.data$Days))
> Sample.data
    Days Day
1  Day 1   1
2  Day 4   4
3  Day_6   6
4   Day7   7
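Since the question asks for several methods, here are two base R alternatives as a minimal sketch. The value "Day 12" is a hypothetical addition (it is not in the original data), included only to show that multi-digit numbers survive.

```r
# Original values plus one hypothetical multi-digit entry ("Day 12")
Days <- c("Day 1", "Day 4", " Day_6", "Day7", "Day 12")

# Method 1: delete every non-digit character, then convert
n_gsub <- as.numeric(gsub("\\D", "", Days))

# Method 2: extract the first run of digits from each string
# (assumes every element contains at least one digit)
n_match <- as.numeric(regmatches(Days, regexpr("[0-9]+", Days)))

print(n_gsub)
print(n_match)
```

The two differ when a string holds several separate numbers: method 1 glues all digits together, while method 2 keeps only the first run.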
This question already has answers here:
Convert currency with commas into numeric
(4 answers)
Closed 1 year ago.
I am working on a dataset that has two price variables that are currently read as character. I need them to be numeric. I have tried the following, but each attempt produced more than a thousand NAs when I ran it.
dataframe$Price <-as.numeric(dataframe$Price)
dataframe$Price <-as.numeric(as.character(dataframe$Price))
If I run it as
as.numeric(dataframe$Price)
It doesn't change the variable. I am relatively new to R (about 2 months) and I have no idea what I'm doing. I appreciate any help!
If every element of dataframe$Price looks like $12.12, for example,
dummy <- data.frame(
  Price = c("$12.11", "$11.14", "$10.12")
)
   Price
1 $12.11
2 $11.14
3 $10.12
you can strip the dollar sign with the stringr::str_replace function before converting:
dummy$Price <- as.numeric(stringr::str_replace(dummy$Price, "\\$", ""))
  Price
1 12.11
2 11.14
3 10.12
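The NAs in the question appear because as.numeric() cannot parse a string that contains "$" (or a thousands separator). A minimal base R sketch that removes both in one pass is shown here; the sample values, including "$1,234.50", are hypothetical and only illustrate the comma case from the linked duplicate.

```r
# Hypothetical price strings, one with a thousands separator
prices <- c("$12.11", "$1,234.50", "$10")

# Converting directly fails: the "$" makes the string non-numeric
direct <- suppressWarnings(as.numeric(prices))

# Drop every "$" and "," first, then convert
clean <- as.numeric(gsub("[$,]", "", prices))

print(direct)
print(clean)
```

This assumes "." is the decimal mark and "," is only ever a thousands separator.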
This question already has answers here:
Aggregating by unique identifier and concatenating related values into a string [duplicate]
(4 answers)
Closed 5 years ago.
I have a time-series of events:
time <-c("01-01-1970","01-01-1971","01-01-1971","01-01-1972")
event <-c("A","A","B","B")
df <- data.frame(time, event)
        time event
1 01-01-1970     A
2 01-01-1971     A
3 01-01-1971     B
4 01-01-1972     B
Now, I would like to put events that happen at the same time in one line. In my example that would be rows 2 and 3. The outcome should look like this:
        time event
1 01-01-1970     A
2 01-01-1971 A & B
4 01-01-1972     B
Any ideas how to do this?
Best,
Felix
You can use aggregate:
aggregate(df$event, by = list(df$time), FUN = paste, collapse = " & ")
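The call above names its result columns Group.1 and x. As a sketch of one way to keep the original column names, the formula interface of aggregate can be used instead; the data frame is rebuilt here so the example runs on its own.

```r
time  <- c("01-01-1970", "01-01-1971", "01-01-1971", "01-01-1972")
event <- c("A", "A", "B", "B")
df <- data.frame(time, event, stringsAsFactors = FALSE)

# Group event by time; extra arguments (collapse) are passed on to paste
res <- aggregate(event ~ time, data = df, FUN = paste, collapse = " & ")
print(res)
```

Here res has one row per distinct time, with the events of that time joined by " & ".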
This question already has answers here:
R adding days to a date [duplicate]
(4 answers)
Closed 6 years ago.
I have a data frame "Invoice" which looks like this:
Invoice_ID  Invoice_DATE  Nr_of_days_until_deadline
101         1/20/2017     7
102         1/25/2017     4
103         1/29/2017     5
104         2/01/2017     4
105         2/05/2017     3
I have to populate the data frame Deadline_Invoces by writing R modules that determine the deadline date for each invoice, where Deadline_DATE = Invoice_DATE + Nr_of_days_until_deadline.
Thus I have to obtain data frame Deadline_Invoces:
Invoice_ID Invoice_DATE Deadline_DATE
How can I add a NUMBER of days to a DATE and obtain a DATE?
Thank you in advance!!!
Make sure your date column is stored as a Date. Because the dates are written as month/day/year, as.Date() needs an explicit format. Then make a new column, call it Deadline or something like that:
Invoice$Deadline <- as.Date(Invoice$Invoice_DATE, format = "%m/%d/%Y") + Invoice$Nr_of_days_until_deadline
Check out the lubridate package in R
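As a minimal base R sketch of the whole task, the sample table from the question can be rebuilt and the deadline computed as follows. It assumes the dates are month/day/year, which the value 1/20/2017 implies; adding a number to a Date advances it by that many days.

```r
# Rebuild the sample data from the question
Invoice <- data.frame(
  Invoice_ID = 101:105,
  Invoice_DATE = c("1/20/2017", "1/25/2017", "1/29/2017", "2/01/2017", "2/05/2017"),
  Nr_of_days_until_deadline = c(7, 4, 5, 4, 3),
  stringsAsFactors = FALSE
)

# Parse the text dates (assumed month/day/year)
Invoice$Invoice_DATE <- as.Date(Invoice$Invoice_DATE, format = "%m/%d/%Y")

# Date + number of days gives a Date
Deadline_Invoces <- data.frame(
  Invoice_ID    = Invoice$Invoice_ID,
  Invoice_DATE  = Invoice$Invoice_DATE,
  Deadline_DATE = Invoice$Invoice_DATE + Invoice$Nr_of_days_until_deadline
)
print(Deadline_Invoces)
```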
This question already has answers here:
Quick question about subsetting via character-class
(3 answers)
Closed 8 years ago.
I have a vector called gas
gas <- c("Hydrogen","Methane")
I also have a data frame called df that looks like
df <- ID Hydrogen Methane
       1        2      20
       1        3      19
       1        2      23
       2        8      13
etc.
Normally, to use a variable in a data frame I would write df$Hydrogen, for example. What I want to know is whether I can also refer to Hydrogen by using the vector above, e.g.
df$gas[1]
#In other words I would like the following to be true:
df$gas[1] == df$Hydrogen
What syntax, if any, would I use to obtain this?
Thanks
If you want a specific gas, try:
df[,gas[1]]
For all gases:
df[gas]
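The reason df$gas[1] does not work is that $ takes its right-hand side literally, so it looks for a column named "gas". Bracket indexing evaluates the expression instead. A small sketch, using the four rows shown in the question:

```r
gas <- c("Hydrogen", "Methane")
df <- data.frame(
  ID       = c(1, 1, 1, 2),
  Hydrogen = c(2, 3, 2, 8),
  Methane  = c(20, 19, 23, 13)
)

# [[ ]] evaluates gas[1] to "Hydrogen" and returns that column as a vector
h <- df[[gas[1]]]
print(identical(h, df$Hydrogen))

# Single brackets with the whole vector return a data frame of both columns
both <- df[gas]
print(names(both))
```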
This question already has answers here:
Read a list of files with R, each file contains a list of float numbers. what's the proper way to do it?
(2 answers)
Closed 9 years ago.
I've got many text files named by year, e.g. yob1940.txt, yob1941.txt. Each file has 3 columns of data. I'm trying to import the data into R as a single data table and add the year of each file as a 4th column.
Any help would be much appreciated.
Thanks
Something like this will work (it needs the data.table package):
library(data.table)
rbindlist(lapply(list.files(pattern = "yob[0-9]+\\.txt"),
                 function(x) data.table(year = sub('.*?([0-9]+).*', '\\1', x),
                                        fread(x))))
Assuming you have read these files as x1 and x2
df.list<-list(x1,x2)
kk<-do.call(rbind,df.list)
year<-data.frame(rep(c(1940,1941),c(nrow(x1),nrow(x2))))
names(year)<-"year"
mydata<-data.frame(cbind(kk,year))
A small example:
x1<-data.frame(x=c(1,3),y=c(2,3))
x2<-data.frame(x=c(3,3),y=c(2,2))
df.list<-list(x1,x2)
kk<-do.call(rbind,df.list)
year<-data.frame(rep(c(1940,1941),c(nrow(x1),nrow(x2))))
names(year)<-"year"
mydata<-data.frame(cbind(kk,year))
mydata
  x y year
1 1 2 1940
2 3 3 1940
3 3 2 1941
4 3 2 1941
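The approach above hard-codes the years. A base R sketch that reads every matching file and takes the year from the file name might look like this. The demo files, their comma separator, and the helper name read_one are assumptions made for illustration; the real files may use a different delimiter.

```r
# Create two hypothetical demo files in a temporary directory
tmp <- tempfile("yob_demo")
dir.create(tmp)
write.table(data.frame(c(1, 3), c(2, 3), c(5, 6)), file.path(tmp, "yob1940.txt"),
            sep = ",", row.names = FALSE, col.names = FALSE)
write.table(data.frame(c(3, 3), c(2, 2), c(7, 8)), file.path(tmp, "yob1941.txt"),
            sep = ",", row.names = FALSE, col.names = FALSE)

# Find the files; list.files returns them in alphabetical order
files <- list.files(tmp, pattern = "^yob[0-9]+\\.txt$", full.names = TRUE)

# Read one file and append the year parsed from its name as a 4th column
read_one <- function(path) {
  d <- read.csv(path, header = FALSE)
  d$year <- as.integer(sub("^yob([0-9]+)\\.txt$", "\\1", basename(path)))
  d
}

# Stack all files into a single data frame
all_years <- do.call(rbind, lapply(files, read_one))
print(all_years)
```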