Converting factor to numerical giving odd results [duplicate] - r

This question already has answers here:
How to convert a factor to integer\numeric without loss of information?
(12 answers)
Closed 8 years ago.
I have a data frame and I need to convert two variables from factor to numeric. I ran:
df$QTY.SHIPPED=as.numeric(df$QTY.SHIPPED)
df$PRE.TAX.TOTAL.=as.numeric(df$PRE.TAX.TOTAL.)
The quantity shipped converts well because it is already in integer format. However, the PRE.TAX.TOTAL. column does not convert well:
PRE.TAX.TOTAL. (factor)   PRE.TAX.TOTAL. (numeric)
57.8                      3856
210                       2159
Does anybody have an idea why it is converting this way?
Thank you

Convert to character first and then to numeric. Otherwise as.numeric just returns the underlying integer code that encodes each factor level.
> v<-factor(c("57.8","82.9"))
> as.numeric(v)
[1] 1 2
> as.numeric(as.character(v))
[1] 57.8 82.9
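As a minimal sketch of where a number like 3856 can come from: the integer code is the position of the value's label within levels(), and the labels are sorted as strings. The values below are hypothetical, not the asker's data.

```r
# Hypothetical values; not the asker's data.
v <- factor(c("57.8", "210", "9.5", "57.8"))

levels(v)                              # "210" "57.8" "9.5" (sorted as strings)
codes <- as.numeric(v)                 # 2 1 3 2: positions within levels(v)
codes_by_hand <- match(as.character(v), levels(v))
all(codes == codes_by_hand)            # TRUE

values <- as.numeric(as.character(v))  # 57.8 210.0 9.5 57.8
```

With thousands of distinct totals in a column, a code such as 3856 simply means "the 3856th label in sorted order".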

You actually could read the documentation. Typing ?factor in console produces
Warning
The interpretation of a factor depends on both the codes and the
"levels" attribute. Be careful only to compare factors with the same
set of levels (in the same order). In particular, as.numeric applied
to a factor is meaningless, and may happen by implicit coercion. To
transform a factor f to approximately its original numeric values,
as.numeric(levels(f))[f] is recommended and slightly more efficient
than as.numeric(as.character(f)).
Thus, the more proper way would probably be as.numeric(levels(f))[f]
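A short sketch of why the recommended form works: as.numeric(levels(f)) converts each distinct label once, and indexing that vector with the factor uses the factor's integer codes to expand it back to full length. The vector f below is made up for illustration.

```r
# Made-up factor for illustration.
f <- factor(c("57.8", "82.9", "57.8", "82.9", "57.8"))

lev_num    <- as.numeric(levels(f))      # one conversion per distinct level
via_levels <- lev_num[f]                 # a factor index is used as integer codes
via_char   <- as.numeric(as.character(f))

identical(via_levels, via_char)          # TRUE
```

The efficiency gain mentioned in the help page comes from converting only the distinct levels rather than every element.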

Related

as.double()/as.numeric() not working [duplicate]

I want the program to read the data as double float but when I use as.double and as.numeric it changes the data itself.
Original data:
[image: original data, in fractions]
After applying as.double to each column separately and combining the columns into a data frame, the data look like this:
[image: changed data values after applying as.double()]
Your data are probably factor (not character).
To convert column x to numeric use as.numeric(levels(x))[x]
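One way to check the "probably factor" guess before converting is to inspect the column classes. The sketch below uses a hypothetical helper, to_numeric, and a made-up data frame; neither comes from the question.

```r
# Hypothetical helper: convert factors via their levels, everything else directly.
to_numeric <- function(x) {
  if (is.factor(x)) as.numeric(levels(x))[x] else as.numeric(x)
}

# Made-up data frame with one factor column and one character column.
df <- data.frame(a = factor(c("0.5", "0.25")),
                 b = c("1.5", "2"),
                 stringsAsFactors = FALSE)

col_classes <- sapply(df, class)   # a: "factor", b: "character"
df[] <- lapply(df, to_numeric)     # df[] keeps the data frame structure
```

Calling as.double directly on column a would have returned the codes 2 and 1 instead of 0.5 and 0.25.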

How to convert data.frame column from Factor to numeric [duplicate]

I have a data.frame whose class column is a Factor. I'd like to convert it to numeric so that I can compute a correlation matrix.
> str(breast)
'data.frame': 699 obs. of 10 variables:
....
$ class : Factor w/ 2 levels "2","4": 1 1 1 1 1 2 1 1 1 1 ...
> table(breast$class)
2 4
458 241
> cor(breast)
Error in cor(breast) : 'x' must be numeric
How can I convert a Factor column to a numeric column?
breast$class <- as.numeric(as.character(breast$class))
If you have many columns to convert to numeric
indx <- sapply(breast, is.factor)
breast[indx] <- lapply(breast[indx], function(x) as.numeric(as.character(x)))
Another option is to use stringsAsFactors=FALSE while reading the file using read.table or read.csv
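A small sketch of the stringsAsFactors option, using an inline CSV string in place of a file. The data are hypothetical. Note that since R 4.0.0 the default is already stringsAsFactors = FALSE, so this mainly matters on older versions or when the argument was set to TRUE.

```r
# Hypothetical inline CSV; "n/a" forces the amount column to be read as text.
csv_text <- "id,amount\na,57.8\nb,n/a\nc,210"

as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
as_plain  <- read.csv(text = csv_text, stringsAsFactors = FALSE)

class(as_factor$amount)   # "factor"
class(as_plain$amount)    # "character"

# A character column converts directly; "n/a" becomes NA
# (the coercion warning is suppressed here).
amount_num <- suppressWarnings(as.numeric(as_plain$amount))
```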
Just in case, other options to create/change columns
breast[,'class'] <- as.numeric(as.character(breast[,'class']))
or
breast <- transform(breast, class=as.numeric(as.character(class)))
From ?factor:
To transform a factor f to approximately its original numeric values, as.numeric(levels(f))[f] is recommended and slightly more efficient than as.numeric(as.character(f)).
This is FAQ 7.10. Others have shown how to apply this to a single column in a data frame, or to multiple columns in a data frame. But this is really treating the symptom, not curing the cause.
A better approach is to use the colClasses argument to read.table and related functions to tell R that the column should be numeric, so that it never creates a factor in the first place. Entries matching na.strings are read as NA; any other entry that cannot be read as a number typically makes the read fail with an error, which points you at the bad value.
Another better option is to figure out why R does not recognize the column as numeric (usually a non numeric character somewhere in that column) and fix the original data so that it is read in properly without needing to create NAs.
Best is a combination of the last two: make sure the data is correct before reading it in, and specify colClasses so R does not need to guess (this can speed up reading as well).
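The colClasses approach might look like the sketch below. It assumes the only non-numeric entries are a known missing-value marker ("n/a" here, a made-up example) that can be declared through na.strings; the inline CSV stands in for a real file.

```r
# Hypothetical inline CSV; "n/a" is an assumed missing-value marker.
csv_text <- "id,amount\na,57.8\nb,n/a\nc,210"

dat <- read.csv(text = csv_text,
                colClasses = c("character", "numeric"),  # one class per column
                na.strings = "n/a")                      # read "n/a" as NA

class(dat$amount)   # "numeric"
```

Because the column type is declared up front, no factor is ever created and no conversion step is needed afterwards.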
As an alternative to $dollarsign notation, use a within block:
breast <- within(breast, {
class <- as.numeric(as.character(class))
})
Note that you want to convert your vector to character before converting it to numeric. Simply calling as.numeric(class) will return the ids corresponding to each factor level (1, 2) rather than the levels themselves.

Converting factors to numeric but getting NAs [duplicate]

This question already has answers here:
R cleaning up a character and converting it into a numeric
(2 answers)
Closed 9 years ago.
I am converting factors to numbers and have tried both solutions previously posted:
as.numeric(as.character(factor))
as.numeric(levels(factor))
In both cases I get lots of NAs and a warning message: NAs introduced by coercion. When I type levels(factor), I get many percentages (i.e. these are interest rates).
Is there any way I can convert these interest rates, whose class is factor, into numeric?
Thanks,
Shelley
A "number" with percentage symbol is not considered as a numeric or integer in R, so you need to remove this symbol in every number first using for example gsub before doing the coercion.
perc <- factor(c("10%", "21.6%", "15%"))
as.numeric(as.character(perc))
[1] NA NA NA
Warning message:
NAs introduced by coercion
as.numeric(gsub("\\%", "", perc))
[1] 10.0 21.6 15.0
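A variant of the same idea, as a sketch: strip the symbol from the levels only, convert those once, and index by the factor. Dividing by 100 at the end is an assumption about how the rates should be stored, not something the question asks for.

```r
perc <- factor(c("10%", "21.6%", "15%", "10%"))

# fixed = TRUE treats "%" as a literal character, so no escaping is needed.
lev   <- sub("%", "", levels(perc), fixed = TRUE)
rates <- as.numeric(lev)[perc]   # 10.0 21.6 15.0 10.0

# Assumption: store the rates as proportions rather than percentages.
proportions <- rates / 100
```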

What's wrong with as.numeric in R? [duplicate]

> X864291X8X74
[1] 8.0000000000 9.0000000000 10.0000000000 6.0000000000 8.0000000000
10 Levels: 0.0000000000 10.0000000000 12.0000000000 3.0000000000 4.0000000000 6.0000000000 ... NULL
> as.numeric(X864291X8X74)
[1] 8 9 2 6 8
What did I misunderstand? Shouldn't the result of as.numeric be 8 9 10 6 8?
How do I get the correct result?
Your vector is a factor. This question has been asked quite a few times. In order to convert a factor to numeric, you'll have to convert to character first. Try:
as.numeric(as.character(my_vec))
The documentation at ?factor states:
To transform a factor f to approximately its original numeric values,
as.numeric(levels(f))[f] is recommended and slightly more efficient
than as.numeric(as.character(f)).
So the following works as well:
as.numeric(levels(my_vec))[my_vec]
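To see why 10 turned into 2 in the question, it may help to look at how the levels are ordered. The sketch below uses a shortened, made-up version of the vector; my_vec is the same placeholder name used above.

```r
# Shortened, made-up version of the vector in the question.
my_vec <- factor(c("8", "9", "10", "6", "8"))

levels(my_vec)                                  # "10" "6" "8" "9": string order
codes <- as.numeric(my_vec)                     # 3 4 1 2 3
fixed <- as.numeric(levels(my_vec))[my_vec]     # 8 9 10 6 8
```

The label "10" sorts ahead of "6", "8", and "9" as a string, so its code is small. In the question, "10.0000000000" is the second level, which is why it came back as 2, while 8 and 9 only matched their codes by coincidence.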

Convert factor to integer [duplicate]

I am manipulating a data frame using the reshape package. When using the melt function, it factorizes my value column, which is a problem because a subset of those values are integers that I want to be able to perform operations on.
Does anyone know of a way to coerce a factor into an integer? Using as.character() will convert it to the correct character, but then I cannot immediately perform an operation on it, and as.integer() or as.numeric() will convert it to the number that the system is storing that factor as, which is not helpful.
Thank you!
Jeff
Quoting directly from the help page for factor:
To transform a factor f to its original numeric values, as.numeric(levels(f))[f] is recommended and slightly more efficient than as.numeric(as.character(f)).
You can combine the two functions; coerce to characters thence to numerics:
> fac <- factor(c("1","2","1","2"))
> as.numeric(as.character(fac))
[1] 1 2 1 2
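Since the question asks for integers specifically, the same two-step pattern works with as.integer. This is a sketch that assumes every level is a whole number written as text.

```r
fac <- factor(c("1", "2", "1", "2"))

# Assumes every level is a whole number stored as text.
int_vals <- as.integer(as.character(fac))   # 1 2 1 2
class(int_vals)                             # "integer"

# The result supports arithmetic directly.
scaled <- int_vals * 10L                    # 10 20 10 20
```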
