How to subtract 1 from a column in a dataframe? - r

I have a data frame with one column stored as a factor. I want to subtract 1 from every row in that column, but when I try, I get a message saying that "-" is not meaningful for factors.
How can I do this?

Factors aren't numbers, even though there is an integer coding underneath them. So when you try to subtract 1 from a factor, R complains. This is a logic error, not a software error.
Did you want factors, or was your data converted to factors when you imported it? If you want numeric data, you can convert a factor to numbers with a single command such as as.numeric(as.character(x)), where x is the factor column.
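As a minimal sketch of that conversion, assuming the factor's labels are all valid numbers (the data frame `df` and its column names are hypothetical, made up for illustration):

```r
# Hypothetical data: 'score' holds numbers but is stored as a factor
df <- data.frame(id = 1:4, score = factor(c("10", "20", "20", "35")))

# as.integer() on a factor returns the internal level codes, not the labels
codes <- as.integer(df$score)

# Go through character first to recover the labels as numbers
df$score <- as.numeric(as.character(df$score))

# Arithmetic now works as expected
df$score <- df$score - 1
```

Going through as.character() matters here: converting the factor directly would subtract 1 from the level codes rather than from the values you see printed.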

Related

How can I get numeric data from all this character data?

In the data set I use, there is no numeric information other than the measurement values, which are coded as 0 and 1. The remaining columns hold values such as location and education information. How can I get numeric data from all this character data? By the way, I'm using the R language.
I got some frequency values, but I don't know what to do about columns like location and education.

Transform factor variable to numeric R

I have tried multiple things so I'll ask my question here.
I have a dataset consisting of 5 columns. The first one lists countries (text), the second is Year (integer), and columns 3-5 are my variables, which are currently factors.
I want to run a regression with my 3 variables, which is not possible right now because (I guess) my variables are not numeric/integer. I tried to transform them to numeric directly, but that only gave me ranks. I also tried to transform them to character first and then to integer/numeric (I tried both), but that also only turned my 3 variables into ranks. I used transform and as.integer, thus creating a new dataset.
x<-transform(GDPall, HardWork = as.integer(HardWork), FamilyImportance = as.integer(FamilyImportance), GDPWorker = as.integer(GDPWorker))
How can I transform my 3 variables into a class which allows me to run my regression?
Thank you in advance!

NAs introduced by coercion with a line of code that worked a few days ago

I am trying to convert a variable coded as a factor to numeric. I have two variables in the database that are stored the same way. For one of them the code I tried works, but for the other one all the values are converted to NAs, and I don't know why, because until a few days ago it worked too.
The database is called 'Visitas'
The variables are 'result' and 'dose'. Both contain numbers, representing the result of a medical trial and the administered dose in mg.
The code I used is the following:
Visitas[, 'result'] <- as.numeric(as.character(Visitas[, 'result']))
Visitas[, 'dose'] <- as.numeric(as.character(Visitas[, 'dose']))
I have to convert them to character first so that I get the number itself rather than the index of the factor level. Remember that this is how the columns are coded when I import the database.
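One way to investigate this kind of problem is to list the labels that fail to convert. The sketch below uses made-up values, and the decimal comma and the "n/a" text are only assumed examples of what might have crept into the imported column; the actual cause in 'Visitas' is not known from the question.

```r
# Hypothetical factor column with two labels that are not valid R numbers
dose <- factor(c("2.5", "5", "7,5", "10", "n/a"))

dose_chr <- as.character(dose)

# Suppress the "NAs introduced by coercion" warning while diagnosing
dose_num <- suppressWarnings(as.numeric(dose_chr))

# Labels that were present but could not be parsed as numbers
bad <- unique(dose_chr[is.na(dose_num) & !is.na(dose_chr)])
print(bad)
```

If every value turns out to be NA, it may be worth checking whether the import step changed, for example a different decimal separator or an extra text row in the file.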

Change decimal digits for data frame column in R

Questions about displaying of certain numbers of digits have been posted, however, just for single values or vectors, so I hope someone can help me with this.
I have a data frame with several columns and want to display all values in one column with two decimal digits (this column only). I have tried round() and format() and options(digits) but none worked on a column (numerical). I wonder if there is a method to do this without going the extra way of converting the column to a vector and gluing all together again.
Thanks a lot!
Here's an example of how to do this with the cars data.frame that comes installed with R.
First I'll add some variability so that we have numbers with decimal places:
data=cars+runif(nrow(cars))
Then to round just a single column (in this case the dist column to 2 decimal places):
data[,'dist']=round(data[,'dist'],2)
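If your data contain whole numbers, you can guarantee that all values are shown with 2 decimal places by using format() with nsmall. Note that format() returns character strings, so the column is no longer numeric afterwards: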
cars[,'dist']=format(round(cars[,'dist'],2),nsmall=2)
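The difference between the two approaches can be seen in a small sketch (the data frame `df` and its values are hypothetical, chosen only for illustration):

```r
# Hypothetical data with a whole number and values with extra decimals
df <- data.frame(speed = c(4, 7, 8), dist = c(2, 10.456, 4.1))

# round() keeps the column numeric; trailing zeros are not stored
df$dist_rounded <- round(df$dist, 2)

# format() with nsmall = 2 pads to two decimals but yields character strings
df$dist_text <- format(round(df$dist, 2), nsmall = 2)

str(df)
```

Keeping the rounded numeric column for calculations and using the formatted text column only for display avoids accidentally doing arithmetic on strings.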

Cluster analysis on two columns that contain name of person in R

I am a beginner in R. I have to do a cluster analysis on data that contains two columns with names of persons. I converted it to a data frame, but the columns are of character type. To use the dist() function, the data frame must be numeric. Example of my data:
Interviewed.Type interviewed.Relation.Type
1. An1 Xuan
2. An2 The
3. An3 Ngoc
4. Bui Thi
5. ANT feed
7. Bach Thi
8. Gian1 Thi
9. Lan5 Thi
.
.
.
1100. Xung Van
I will be grateful for your help.
You can convert a character vector to a factor using factor. A factor is basically a vector of integers together with an attribute giving the text associated with each integer; these texts are called levels in R. You can use as.numeric or unclass to get at the raw numbers. These can then be fed into algorithms that require numbers, such as dist.
Note that the order in which numbers are associated with texts is pretty much arbitrary (in fact alphabetical), so the difference between numbers has no meaning in most applications. Therefore calling dist on this result is technically possible, but not necessarily meaningful. For this reason, the author of this answer is not satisfied with it, even if the original poster seems to be happy about it. :-)
Also note that if there are several vectors, converting each one separately means that the same number will represent different textual values and vice versa, unless both vectors are composed of exactly the same set of distinct values. Additional care has to be taken if you want the same levels for both factors. One way would be to concatenate both vectors, turn the result into a factor, and then split it into two factor vectors.
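The concatenate-then-split approach could look like the following sketch. The vectors `a` and `b` are hypothetical stand-ins for the two name columns, and the caveat about the codes being arbitrary still applies.

```r
# Hypothetical stand-ins for the two name columns
a <- c("An1", "Bui", "Xung")
b <- c("Xuan", "Thi", "Bui")

# Build one factor over both vectors so they share a single set of levels
combined <- factor(c(a, b))

# Split the combined factor back into its two halves
fa <- combined[seq_along(a)]
fb <- combined[length(a) + seq_along(b)]

# Integer codes are now comparable across the two columns
codes <- data.frame(a = as.integer(fa), b = as.integer(fb))
print(codes)
```

With shared levels, "Bui" gets the same code in both columns, which would not be guaranteed if each vector were converted on its own.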
