This question already has answers here:
How to convert a factor to integer\numeric without loss of information?
(12 answers)
Count number of occurences for each unique value
(14 answers)
Closed 2 years ago.
I am just trying to convert a column of numbers, which R thinks are characters, to numerical values.
I have the following table:
> longtab=as.data.frame(table(long));head(longtab)
long Freq
1 189485 1
2 189486 1
3 189487 1
4 189488 1
5 189489 1
6 189490 1
I've created a new table from those data as follows:
> q=head(longtab);q
long Freq
1 189485 1
2 189486 1
3 189487 1
4 189488 1
5 189489 1
6 189490 1
When I test whether the "long" column is numeric, R tells me that it is not.
> is.numeric(q$long)
[1] FALSE
When I try to coerce "long" values to be numeric using as.numeric(), I get the following:
> as.numeric(q$long)
[1] 1 2 3 4 5 6
But these look like the row numbers, not the values in the "long" column. This seems like it should be a simple problem to fix, but I have been struggling with it for a while. Any help would be greatly appreciated.
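A minimal sketch of what is probably going on, assuming `long` is a plain numeric vector like the one shown (the sample values below are made up to match the printed table): `as.data.frame(table(long))` stores the `long` column as a factor, so `as.numeric()` returns the internal level codes rather than the labels. Converting through character, or indexing the levels, recovers the original numbers.

```r
# Hypothetical input matching the values printed in the question
long <- 189485:189490
q <- head(as.data.frame(table(long)))

is.factor(q$long)                 # the column is a factor, not character

# as.numeric() on a factor returns the level codes (1, 2, 3, ...)
codes <- as.numeric(q$long)

# Convert the labels instead: go through character first
vals <- as.numeric(as.character(q$long))

# Equivalent, and usually a little faster on long factors
vals2 <- as.numeric(levels(q$long))[q$long]
```

Both `vals` and `vals2` hold the values 189485 to 189490 rather than 1 to 6.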
This question already has answers here:
Faster ways to calculate frequencies and cast from long to wide
(4 answers)
Closed 3 years ago.
I have a data frame with two columns, as below, and I want to reshape it into a data frame with three columns:
df <- data.frame(key=c('a','a','a','b','b'),value=c(1,2,2,1,3))
I managed to do this in Python, but I have no idea how to do it in R.
The expected output should look like this:
1 2 3
a 1 2 0
b 1 0 1
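library(data.table)
# data.table's dcast() is written for data.table input, so convert df first;
# fun.aggregate = length counts the rows in each key/value cell
dcast(as.data.table(df), key ~ value, fun.aggregate = length)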
# key 1 2 3
# 1 a 1 2 0
# 2 b 1 0 1
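For a simple count like this, base R's `table()` may be enough on its own. This is a sketch of an alternative, not the answerer's method; the name `counts` is only illustrative.

```r
df <- data.frame(key = c('a', 'a', 'a', 'b', 'b'), value = c(1, 2, 2, 1, 3))

# Cross-tabulate key against value; each cell is a frequency
counts <- table(df$key, df$value)

# Optionally turn the contingency table into a wide data frame
wide <- as.data.frame.matrix(counts)
```

`counts` has one row per key and one column per distinct value, with zeros where a combination does not occur.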
This question already has answers here:
collapse mulitple columns into one column and generate an index variable
(4 answers)
Reshaping data.frame from wide to long format
(8 answers)
Closed 3 years ago.
I have been tasked to tidy up some data and am having issues with trying to transform the data from this format:
id occupation_busdriver occupation_cashier occupation_nurse
1 0 0 1
2 0 1 0
3 1 0 0
My actual dataset is significantly larger, but this is the part I am struggling with, so an example for this sample would be much appreciated.
I have already tried using the gather and select functions.
I am looking to have the data in this format:
id occupation
1 nurse
2 cashier
3 busdriver
We can use max.col to get the column index of the maximum value in each row and then use that index to look up the column names:
data.frame(df1[1], occupation = sub(".*_", "", names(df1))[-1][max.col(df1[-1])])
# id occupation
#1 1 nurse
#2 2 cashier
#3 3 busdriver
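The answer does not define `df1`, so here is a reproducible sketch of the same idea, assuming `df1` matches the sample in the question. It passes `ties.method = "first"` because the default for `max.col` breaks ties at random, which matters if a row ever has no 1 or more than one; `labels` and `idx` are illustrative names.

```r
# Sample data reconstructed from the question
df1 <- data.frame(
  id = 1:3,
  occupation_busdriver = c(0, 0, 1),
  occupation_cashier   = c(0, 1, 0),
  occupation_nurse     = c(1, 0, 0)
)

# Strip the common prefix from the indicator column names
labels <- sub("^occupation_", "", names(df1)[-1])

# Column index of the largest value in each row (deterministic on ties)
idx <- max.col(df1[-1], ties.method = "first")

result <- data.frame(id = df1$id, occupation = labels[idx])
```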
This question already has answers here:
Reshaping multiple sets of measurement columns (wide format) into single columns (long format)
(8 answers)
Reshape a dataframe to long format with multiple sets of measure columns [duplicate]
(3 answers)
Closed 4 years ago.
I have 3 tables that look something like this (but with 40,000+ observations and 40 variables)
in1 out1 in2 out2 in3 out3
1 2 2 4 3 5
1 3 2 5 3 6
1 3 2 6 3 7
I want to take columns out1, out2, and out3 and make them one column, then create a new table that looks like this:
in out
1 2
1 3
1 3
2 4
2 5
2 6
3 5
3 6
3 7
So basically I want to take 3 huge tables I have, combine them into 1 table and then merge (stack? I don't know the correct wording) 3 specific columns together into 1 column with a new name.
I've tried a few methods such as:
table$out <- cbind(table1$out1, table2$out2, table3$out3)
but I get errors like this:
Error in `$<-.data.frame`(`tmp`, out, value = c(0.98, 0.59, 0.69, :
replacement has 31467 rows, data has 42141
number of rows of result is not a multiple of vector length (arg 1)
I'm sorry if this is a very simple question.. I might just be overthinking it
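One possible approach, sketched in base R under the assumption that the data look like the small example above (`tab`, `pieces`, and `long` are hypothetical names): build one two-column data frame per in/out pair and stack them with `rbind`. Because stacking works row-wise, it would not matter if the pieces had different numbers of rows, which is what made the `cbind` attempt fail.

```r
# Sample data reconstructed from the question
tab <- data.frame(
  in1 = c(1, 1, 1), out1 = c(2, 3, 3),
  in2 = c(2, 2, 2), out2 = c(4, 5, 6),
  in3 = c(3, 3, 3), out3 = c(5, 6, 7)
)

# One small data frame per pair; `in` is a reserved word in R, so it is
# backquoted and check.names = FALSE keeps the column name as-is
pieces <- lapply(1:3, function(i) {
  data.frame(
    `in` = tab[[paste0("in", i)]],
    out  = tab[[paste0("out", i)]],
    check.names = FALSE
  )
})

# Stack the pieces on top of each other
long <- do.call(rbind, pieces)
```

If the pairs live in three separate tables, the same pattern should apply with each piece built from its own table, for example from `table1$in1` and `table1$out1`.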
This question already has answers here:
How to create a consecutive group number
(13 answers)
Closed 5 years ago.
I have a vector of integers, for example, v <- c(1,5,1,2,2,4,7,5,7). If I sort(unique(v)), the values 3 and 6 would be missing in the sequence. How can I transform v into a vector where sort(unique(v)) is an actual sequence of integers? That is, transforming v into c(1,4,1,2,2,3,5,4,5) (in general, of course).
Converting v to a factor and back to numeric could do the trick:
as.numeric(as.factor(v))
#[1] 1 4 1 2 2 3 5 4 5
Using OP's method, we get the expected output with match
match(v, sort(unique(v)))
#[1] 1 4 1 2 2 3 5 4 5
This question already has answers here:
Mean per group in a data.frame [duplicate]
(8 answers)
Closed 7 years ago.
I've got a data frame like so:
Treatment Type Numerical Value
1 A 3
1 B 2
1 A 8
1 B 7
2 B 4
2 B 1
2 A 2
2 A 2
I want to make a table of means for each combination of type and treatment.
Using aggregate, I have: aggregate(df[,3], list(Treatment) ,mean) which gives me the means for each treatment but not separated by type too. I was thinking this could be rectified by a for-loop.
Note: This is just a subset of the data, and the list of numerical values is hundreds for each type and treatment.
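Since I don't have enough reputation to comment: group by both columns, and aggregate only the numeric column (taking the mean of the whole df would also try to average Type):
aggregate(df[, 3], list(Treatment = df$Treatment, Type = df$Type), mean)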
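The same grouping can also be written with the formula interface of `aggregate`, with no for-loop needed. This is a sketch that assumes the third column is named `Value`; the question shows it as "Numerical Value", so adjust the name to the real data.

```r
# Sample data reconstructed from the question; the column name Value is assumed
df <- data.frame(
  Treatment = c(1, 1, 1, 1, 2, 2, 2, 2),
  Type      = c("A", "B", "A", "B", "B", "B", "A", "A"),
  Value     = c(3, 2, 8, 7, 4, 1, 2, 2)
)

# Mean of Value for every Treatment/Type combination
means <- aggregate(Value ~ Treatment + Type, data = df, FUN = mean)
```

`means` has one row per Treatment/Type pair, with the group mean in the `Value` column.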