I am doing something which should be quite simple. I would like to convert a vector of character strings to unique integers. From researching how to do this I found the strtoi function, which claims to convert strings to integers. However, I am getting weird results and I cannot understand why. When I run the code from the documentation below, it works fine:
strtoi(c("ffff", "FFFF"), 16L)
[1] 65535 65535
However, when I apply this function to actual data I get a vector of NAs. Consider the following example:
strtoi(c('spy','spx'),16L)
[1] NA NA
Why does it return NAs in this example? Is there a way to get strtoi to work or do I need to write my own function?
In
strtoi(c("ffff", "FFFF"), 16L)
[1] 65535 65535
you are converting hexadecimal strings to numbers.
In this case
strtoi(c('spy','spx'),16L)
[1] NA NA
s, p, y and x are not valid hexadecimal digits (only 0-9 and a-f are),
which is why you get NA.
If you try another base it might work. For instance, base 36 uses the digits 0-9 and the letters a-z:
strtoi(c('spy','spx'),36L)
[1] 37222 37221
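To see where 37222 comes from, and to get at the original goal of unique integers, here is a minimal sketch. The vector x and the variable names are made up for illustration, and match() against unique() is just one common base R idiom for integer IDs; it is not something the answer above proposes.

```r
# Base 36 uses the digits 0-9 followed by a-z, so 's' = 28, 'p' = 25, 'y' = 34.
digits <- match(strsplit("spy", "")[[1]], c(0:9, letters)) - 1
manual <- sum(digits * 36^(2:0))   # 28*36^2 + 25*36 + 34
manual == strtoi("spy", 36L)       # TRUE

# Hypothetical input: one integer ID per distinct string,
# numbered in order of first appearance.
x <- c("spy", "spx", "spy", "qqq")
ids <- match(x, unique(x))
ids                                # 1 2 1 3
```

Note that strtoi() returns NA for values that would overflow R's integer range, so base 36 stops working for longer strings, whereas the match() approach does not depend on string length.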
How can I convert the vector y into a numeric vector?
y <- c("1+2", "0101", "5*5")
When I use
as.numeric(y)
the output is
[1] NA 101 NA
The following code
sapply(y, function(txt) eval(parse(text=txt)))
should do the work.
The problem is quite deep and you need to know about metaprogramming.
The problem with as.numeric is that it only converts a string to a number if the whole string is a single numeric literal (for example digits with an optional sign, decimal point, or exponent). Everything else is converted to NA. In your case, "1+2" contains a plus sign and "5*5" contains a multiplication sign, hence NA. To tell R that it should "perform the operation given by a string", you need parse and eval.
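Because eval(parse(text = ...)) runs whatever code the string contains, it may be worth guarding it when the strings come from an untrusted source. The following is a minimal sketch under that assumption; safe_eval is a hypothetical helper name, and the whitelist of characters is an illustrative choice, not part of the answer above.

```r
y <- c("1+2", "0101", "5*5")

# Hypothetical helper: evaluate only strings made of digits, spaces,
# parentheses, dots and the four arithmetic operators; otherwise NA.
safe_eval <- function(txt) {
  if (!grepl("^[0-9+*/() .-]+$", txt)) return(NA_real_)
  out <- tryCatch(eval(parse(text = txt)), error = function(e) NA_real_)
  if (is.numeric(out) && length(out) == 1) as.numeric(out) else NA_real_
}

result <- vapply(y, safe_eval, numeric(1), USE.NAMES = FALSE)
result   # 3 101 25
```

Strings that contain anything else, such as a function call, come back as NA instead of being executed.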
An option with map
library(purrr)
map_dbl(y, ~ eval(rlang::parse_expr(.x)))
#[1] 3 101 25
In a dataframe, I have a column that has numeric values and some mixed in character data for some rows. I want to remove all rows with the character data and keep those rows with a number value. The df I have is 6 million rows, so I simply made a small object to try to solve my issue and then implement at a larger scale.
Here is what I did:
a <- c("fruit", "love", 53)
b <- str_replace_all("^[:alpha:]", 0)
Reading answers to other UseMethod errors on here (about factors), I tried to change "a" to as.character(a) and attempt "b" again. But, I get the same error. I'm trying to simply make any alphabetic value into the number zero and I'm fairly new at all this.
There are several issues here, even in these two lines of code. First, a is a character vector, because c() coerces all of its elements to a common type and two of them are character strings. This means that your numeric 53 is coerced into a character.
> print(a)
[1] "fruit" "love" "53"
You've got the wrong syntax for str_replace_all. See the documentation for how to use it correctly. But that's not what you want here, because you want numerics.
The first thing you need to do is convert a to a numeric. A crude way of doing this is simply
> b <- as.numeric(a)
Warning message:
NAs introduced by coercion
> b
[1] NA NA 53
And then subset to include only the numeric values in b:
> b <- b[!is.na(b)]
> b
[1] 53
But whether that's what you want to do with a 6 million row dataframe is another matter. Please think about exactly what you would like to do, supply us with better test data, and ask your question again.
There's probably a more efficient way of doing this on a large data frame (e.g. something column-wise, instead of row-wise), but to answer your specific question about the vector a:
as.numeric(stringr::str_replace_all(a, "[a-z]+", "0"))
Note that the replacing value must be a character (the last argument in the function call, "0"). (You can look up the documentation from your R-console by: ?stringr::str_replace_all)
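For the data frame case described in the question (dropping rows whose value is not a number), a column-wise sketch could look like the following. The data frame df and its column names are invented for illustration; the approach simply combines as.numeric() with is.na() as in the answer above.

```r
# Hypothetical small data frame standing in for the 6 million row one.
df <- data.frame(id = 1:4,
                 value = c("fruit", "love", "53", "7.5"),
                 stringsAsFactors = FALSE)

# Convert the whole column at once; non-numeric strings become NA.
num <- suppressWarnings(as.numeric(df$value))

# Keep only the rows that converted, and store the numeric values.
clean <- df[!is.na(num), , drop = FALSE]
clean$value <- num[!is.na(num)]
clean
```

This is vectorised over the column, so it avoids any per-row loop. Note that it also drops rows whose value was already missing.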
I have a number in an excel file that is equal to -29998,1500000003
When I try to open it in R I get
> library(openxlsx)
> posotest <- as.character(read.xlsx("sofile.xlsx"))
> posotest
[1] "-29998.1500000004"
Any help? Desired result: -29998,1500000003
EDIT: with options(digits=13) I get -29998.150000000373 which could explain why the rounding is done, however even with options(digits=13) I get
> as.character(posotest)
[1] "-29998.1500000004"
Do you have any function that would allow me to get the full number in characters?
EDIT2 format does this but it adds artificial noise at the end.
x <- -29998.150000000373
format(x,digits=22)
[1] "-29998.15000000037252903"
How can I know how many digits to use in format since nchar will give me a wrong value?
The file is here
You can get a string with up to 22 significant digits via format(), although a double only carries about 15 to 17 meaningful ones:
x <- -29998.150000000373
format(x,digits=22)
[1] "-29998.15000000037252903"
Of course, this will show you all sorts of ugliness related to trying to represent a decimal number in a binary representation with finite precision ...
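As a rough guide to how many digits to ask for: 15 significant digits is what as.character() showed in the question, and 17 significant digits are enough to identify a double uniquely. The sketch below uses sprintf() with the %g format to make the digit count explicit; the variable names are illustrative.

```r
x <- -29998.150000000373

s15 <- sprintf("%.15g", x)   # 15 significant digits
s17 <- sprintf("%.17g", x)   # 17 significant digits

s15   # "-29998.1500000004", the same as as.character() in the question
s17   # longer, and closer to the value that is actually stored
```

Anything beyond 17 significant digits, such as the trailing 252903 in the format(x, digits=22) output, is the exact binary value being spelled out rather than information that was in the original decimal number.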
What is the equivalent of SQL Server's ISNUMERIC function in R (RStudio)? I am trying to migrate some SQL logic to R, and I have a column that holds both character and integer values. I want to take only the integer values and update them to -1 in an R data.table. Please help me solve the problem.
I have attached the results as an image: column "A" holds the current values, and I expect to get values like those in column B.
There are also data type tools in R (as in SQL and other languages), such as is.numeric() and is.integer(). Normally these return logical values, but you could use sub() or gsub() to turn TRUE into -1:
example <- list(123, 321, "not numeric", as.Date("2018/01/01"))
gsub(TRUE, -1, sapply(example, is.numeric))
[1] "-1" "-1" "FALSE" "FALSE"
Also, note that in R numeric is different from integer.
example <- list(as.integer(123), 321, "not numeric", as.Date("2018/01/01"))
example[sapply(example, is.integer)] <- -1
example
[[1]]
[1] -1
[[2]]
[1] 321
[[3]]
[1] "not numeric"
[[4]]
[1] "2018-01-01"
You can convert them back and forth with as.numeric() and as.integer(). Also, note that in R data types in this sense are referred to as the class or classes of the data, whereas the type in R refers to the storage or R internal data type.
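The difference between class and type can be checked directly with class() and typeof(). This is a small illustrative sketch; the variable names are made up.

```r
x_int  <- 123L
x_dbl  <- 321
x_date <- as.Date("2018-01-01")

class(x_int);  typeof(x_int)    # "integer", "integer"
class(x_dbl);  typeof(x_dbl)    # "numeric", "double"
class(x_date); typeof(x_date)   # "Date",    "double"

# A Date is stored as a double, but is.numeric() still reports FALSE for it.
is.numeric(x_date)
```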
I think if you're specifically interested in integers, then the question above is a duplicate of the following:
Check if the number is integer
Your if condition would be something like x == round(x, 0). This will be TRUE for whole-number values, whether they are stored as integer or double, and FALSE for numbers with a fractional part. It is not meaningful for non-numeric classes.
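A minimal sketch of such a check, wrapped in a function that also copes with non-numeric input, might look like this. The name is_whole and the tolerance default are assumptions made for illustration.

```r
# Hypothetical helper: TRUE where x holds a whole number (integer or double),
# FALSE for fractions, NA, infinite values and non-numeric input.
is_whole <- function(x, tol = sqrt(.Machine$double.eps)) {
  if (!is.numeric(x)) return(rep(FALSE, length(x)))
  is.finite(x) & abs(x - round(x)) < tol
}

is_whole(c(1, 2.5, NA, Inf, 4L))   # TRUE FALSE FALSE FALSE TRUE
is_whole("not numeric")            # FALSE
```

The tolerance is there because values computed in floating point can miss a whole number by a tiny amount.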
Finally, I fixed this issue by following the steps below.
First, I captured all numeric values in a separate data table using the script below:
CustomDerivedL2AMID <- subset(DimCombinedEnduser$DRVDEUL2AMID, grepl('^\\d+$', DimCombinedEnduser$DRVDEUL2AMID))
library(data.table)
HandleDerivedL2AMID <- data.table(CustomDerivedL2AMID)
Then I matched the HandleDerivedL2AMID results against the original data table and replaced all matching values with -1:
DCE$DRVDEUL2AMID <- replace(DCE$DRVDEUL2AMID, DCE$DRVDEUL2AMID %in% HandleDerivedL2AMID$CustomDerivedL2AMID, '-1')
Now I see only character values (and -1); no other numeric values remain in the data set under DRVDEUL2AMID.
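The same result can probably be reached in one step, without the intermediate table. The sketch below uses base R on an invented data frame; the column names A and B mirror the question's description, and the digits-only pattern is an assumption about what counts as numeric here.

```r
# Hypothetical column mixing digit-only IDs with free text.
df <- data.frame(A = c("12345", "ABC", "678", "X9"),
                 stringsAsFactors = FALSE)

# TRUE where the whole string consists of digits only.
is_digits <- grepl("^[0-9]+$", df$A)

# Replace the digit-only entries with "-1" and keep everything else.
df$B <- ifelse(is_digits, "-1", df$A)
df
```

SQL Server's ISNUMERIC is more permissive than this pattern (it also accepts signs and decimal points, for example), so the regular expression would need widening if those forms can occur in the data.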
My numbers have a "," separator for 1,000 and above, and R treats them as factors. I want to convert two such variables from factor to numeric. (Both variables are actually numbers, but R treats them as factors for some reason; the data is imported from Excel.) To change the factor variable mydata$x1 to numeric I use the following code, but it does not seem to work properly and some values change. For example, it changes 8180 to zero, and the same happened to many other values. Is there another way to do this without such issues?
mydata$x1<- as.numeric(as.character(mydata$x1))
Since it seems as though the problem is that you have saved your numeric data as characters in Excel (instead of using format to display the commas) you may want a function like this.
#' Replace Commas Function
#'
#' This function converts a character representation of a number that contains a comma separator into a numeric value.
#' @keywords read data
#' @export
replaceCommas <- function(x) {
  # Strip the thousands separators, then convert to numeric.
  as.numeric(gsub(",", "", x, fixed = TRUE))
}
Then
rcffull$RetBackers <- replaceCommas(rcffull$Returning.Backers)
rcffull$NewBackers <- replaceCommas(rcffull$New.Backers)
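To show why the comma matters and why as.character() is needed for factors, here is a small sketch on invented values. The variable names are illustrative only.

```r
# Hypothetical factor as it might arrive from a spreadsheet import.
x1 <- factor(c("8,180", "1,000", "950"))

# Converting the factor directly returns the internal level codes.
codes <- as.numeric(x1)                                         # 2 1 3

# Going through as.character() keeps the text, but the commas yield NA.
with_commas <- suppressWarnings(as.numeric(as.character(x1)))   # NA NA 950

# Removing the commas first gives the intended numbers.
cleaned <- as.numeric(gsub(",", "", as.character(x1), fixed = TRUE))
cleaned                                                         # 8180 1000 950
```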
The reason that G5W is asking for dput output is that he (we) are unable to figure out why something that displays as 8180 when it's a factor would not be properly converted by that code. It's not because of leading or trailing spaces (which would not appear in the printed version of a factor). Witness this test:
> as.numeric(as.character(factor(" 8180")))
[1] 8180
> as.numeric(as.character(factor(" 8180 ")))
[1] 8180
And the fact that it gets converted to 0 is a real puzzle since generally items that do not get recognized as parseable R numerics will get coerced to NA (with a warning).
> as.numeric(as.character(factor(" 0 8180 ")))
[1] NA
Warning message:
NAs introduced by coercion
We really need the dput output from the item that displays as "8180" and its neighbors.