I have a text column in a data frame:
stocksavailable
140,13-,3-,40-,2-
The numbers 13-, 3-, 40- and 2- were extracted incorrectly, with the minus sign trailing. Can we get something like this using R code?
stocksavailable
140,-13,-3,-40,-2
Assuming the numbers are integers, try: (\d+)(\-)?
You should be able to use gsub with this pattern, swapping the two capture groups in the replacement.
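The code for this answer did not survive, so here is a minimal sketch of the idea. It uses a slightly simpler variant of the pattern above (an assumption, not the answerer's exact code): it matches only digit runs that are followed by a minus sign and writes the sign back in front of them.

```r
# Sample value taken from the question
stocksavailable <- "140,13-,3-,40-,2-"

# Capture each run of digits that is directly followed by a minus sign,
# then put the minus sign in front of the captured digits.
fixed <- gsub("(\\d+)-", "-\\1", stocksavailable)
fixed
# [1] "140,-13,-3,-40,-2"
```

Numbers without a trailing minus sign (140 here) are left untouched because the pattern requires the sign.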
So I have a data frame that includes a column like this:
[image of the column]
And I would like to remove the operator as well as the numbers to the right of it, i.e. so the first entry would just say 51.81 rather than 51.81 - 11.19. How would I go about this? I feel like using a for loop might work but I'm unsure of the syntax required.
Thanks
We can use sub to match zero or more spaces (\\s*) followed by a - or + and any remaining characters, and replace the match with an empty string ("").
df1$xG <- as.numeric(sub("\\s*[-+]+.*", "", df1$xG))
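As a quick check, the call above can be run on a small made-up data frame. Only the first value comes from the question; the second row is a hypothetical example added to show the + case.

```r
# Hypothetical sample data; only "51.81 - 11.19" appears in the question
df1 <- data.frame(xG = c("51.81 - 11.19", "40.20 + 3.50"),
                  stringsAsFactors = FALSE)

# Drop everything from the first - or + onwards (including the spaces
# before it), then convert what is left to numeric.
df1$xG <- as.numeric(sub("\\s*[-+]+.*", "", df1$xG))
df1$xG
# [1] 51.81 40.20
```

Note that a value with a leading sign, such as "-5.2 - 1", would be reduced to an empty string and become NA, so this sketch assumes the first number is unsigned.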
I have a dataset like the one below, in which I am trying to convert the column "Installs" to numeric. My code is below:
Original Dataset
My code:
Data$Installs<-substr(Data$Installs,1,nchar(Data$Installs)-1)
Data$Installs<-gsub(",","",gsub("\\s+","",Data$Installs))
Data$Installs<-as.numeric(Data$Installs)
After running the code I get the following:
This is the result I get
Any help?
From what I can see, you need only to remove commas and a possible trailing plus sign. So, the following should work:
Data$Installs <- as.numeric(gsub("[+,]", "", Data$Installs))
You might want to create a new column though and keep the original one.
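A minimal sketch of that suggestion, writing the cleaned values to a new column. The sample values and the column name InstallsNum are assumptions for illustration, since the original dataset was only shown as an image.

```r
# Hypothetical values; the real dataset is not available here
Data <- data.frame(Installs = c("10,000+", "5,000,000+", "500"),
                   stringsAsFactors = FALSE)

# Remove every comma and plus sign, then convert to numeric.
# The original column is kept; the result goes into a new column.
Data$InstallsNum <- as.numeric(gsub("[+,]", "", Data$Installs))
Data$InstallsNum
# [1] 1e+04 5e+06 5e+02
```

Any entry that still contains non-numeric characters after this step would come out as NA with a warning, which makes leftover problem values easy to find.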
I am used to working with numbers and matrices in R, but when it comes to strings and characters I am lost. I want to analyze some data where the time is read into R as follows:
>my.time.char[1]
[1] "\"2011-10-05 15:55:00\""
I want to end up with a string containing only:
"2011-10-05 15:55:00"
Using the function sub() (which I barely understand...), I got the following result:
> sub("(\")","",my.time.char[1])
[1] "2011-10-05 15:55:00\""
This is closer to the format I am looking for, but I still need to get rid of the last two characters (\").
The second line from ?sub explains:
sub and gsub perform replacement of the first and all matches respectively.
which should tell you to use gsub instead.
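Applied to the value from the question, that looks roughly like this (a minimal sketch; the variable name cleaned is an assumption):

```r
# Value as shown in the question: the string itself contains quote characters
my.time.char <- "\"2011-10-05 15:55:00\""

# gsub replaces every match, so both the leading and trailing quote go away
cleaned <- gsub("\"", "", my.time.char[1])
cleaned
# [1] "2011-10-05 15:55:00"
```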
So I have this list of names:
names <- c("stewart,pat", "peterson,greg")
from which I extract only the lastname,firstname items with the following regular expression:
myregexpr <- "(\\w+),(\\w+)?"
str_view(str_extract_all(names, myregexpr), myregexpr)
This yields a view like:
stewart,pat
peterson,greg
My question: Is there a way for me to write the regular expression such that the result would instead look like:
pat_stewart
greg_peterson
i.e. where the result is first_last? I believe there is a way to do it, as I've seen in other, similar questions. I've tried:
myregexpr <- "(\\w+),(\\w+)?\\2_\\1"
but that returns only `character(0)`. I've attempted many versions, some of which crash RStudio. Any ideas?
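No answer is recorded here, so the following is only one possible approach. Backreferences such as \\2 and \\1 reorder text when they appear in the replacement argument of sub, not in the pattern itself. The variable name swapped is an assumption for illustration.

```r
names <- c("stewart,pat", "peterson,greg")

# Group 1 captures the last name, group 2 the first name;
# the replacement emits them in swapped order joined by "_".
swapped <- sub("(\\w+),(\\w+)", "\\2_\\1", names)
swapped
# [1] "pat_stewart"   "greg_peterson"
```

Inside the pattern, \\2_\\1 instead asks the regex to match a repeat of the captured text, which is why the attempt above finds nothing.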
When doing some textual data cleaning in R, I found some special characters. In order to get rid of them, I have to know their Unicode code points; for example, € is \u20AC. I would like to know whether it is possible to "see" the code points with a function that takes the string containing the special character as input.
Referring to Cath's comment, iconv can do the job:
iconv("é", toRaw = TRUE)
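Then, you may want to unlist the result and paste it with \u00. Note that iconv returns the bytes of the string in its encoding, so this matches the code point only for single-byte encodings such as Latin-1; in UTF-8, é is stored as two bytes.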
special_char <- "%"
Unicode::as.u_char(utf8ToInt(special_char))
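If the Unicode package is not installed, a base R sketch along the same lines is to format the integers from utf8ToInt as hexadecimal. The variable names here are assumptions for illustration.

```r
# The euro sign, written as an escape so the script does not depend
# on the encoding of the source file
special_char <- "\u20AC"

# utf8ToInt returns one integer code point per character
code_points <- utf8ToInt(special_char)

# Format each code point as U+XXXX
labels <- sprintf("U+%04X", code_points)
labels
# [1] "U+20AC"
```

Because utf8ToInt returns one value per character, passing a longer string gives the code point of every character in it.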