How to convert 0=..., 1=... columns, into 1 single column [duplicate] - r

This question already has answers here:
collapse mulitple columns into one column and generate an index variable
(4 answers)
Reshaping data.frame from wide to long format
(8 answers)
Closed 3 years ago.
I have been tasked to tidy up some data and am having issues with trying to transform the data from this format:
id occupation_busdriver occupation_cashier occupation_nurse
1 0 0 1
2 0 1 0
3 1 0 0
My actual dataset is significantly larger, but this is the part I am struggling with, so an example for this subset would be much appreciated.
I have already tried using the gather and select functions.
I am looking to have the data in this format:
id occupation
1 nurse
2 cashier
3 busdriver

We can use max.col to get the column index of the maximum value in each row, then use that index to look up the matching column name:
data.frame(df1[1], occupation = sub(".*_", "", names(df1))[-1][max.col(df1[-1])])
# id occupation
#1 1 nurse
#2 2 cashier
#3 3 busdriver
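For anyone wanting to run the answer end to end, here is a minimal self-contained sketch. The construction of df1 is an assumption based on the example table in the question, and the intermediate names occ_names and idx are just illustrative. It also assumes every row has exactly one 1; a row of all zeros would be silently assigned the first occupation column.

```r
# Example data rebuilt from the table in the question (assumed layout)
df1 <- data.frame(
  id = 1:3,
  occupation_busdriver = c(0, 0, 1),
  occupation_cashier   = c(0, 1, 0),
  occupation_nurse     = c(1, 0, 0)
)

# Strip the "occupation_" prefix from the indicator column names
occ_names <- sub("^occupation_", "", names(df1)[-1])

# For each row, find the position of the largest value (the 1);
# ties.method = "first" makes the result deterministic
idx <- max.col(df1[-1], ties.method = "first")

result <- data.frame(id = df1$id, occupation = occ_names[idx])
print(result)
```

Setting ties.method explicitly is a design choice here: the default for max.col is "random", which only matters if a row contains more than one 1.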

Related

How to distill this data.frame into more condense data? [duplicate]

This question already has answers here:
Mean per group in a data.frame [duplicate]
(8 answers)
Calculate the mean by group
(9 answers)
Aggregate / summarize multiple variables per group (e.g. sum, mean)
(10 answers)
Closed 1 year ago.
ID Error
EID0062, EID0175 1
EID0063 1
EID0063 1
EID0064 1
EID0069 1
EID0069 0
EID0072 0
EID0075 0
EID0075 0
EID0093 1
EID0023 0
EID0013 1
EID0062, EID0175 1
I have ~200 rows with ~150 unique IDs. I would like to create a new data.frame with just the unique IDs, where the Error column indicates whether that person ever had an error. For example, EID0069 has both an error row and a non-error row, but I would like the new df to show that person as an error. Like this:
ID Error
EID0062, EID0175 1
EID0063 1
EID0064 1
EID0069 1
EID0072 0
EID0075 0
EID0093 1
EID0023 0
EID0013 1
All the best!
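One possible approach, sketched under the assumption that Error is coded as numeric 0/1: taking the maximum of Error within each ID yields 1 whenever any row for that ID is an error. The data frame name df and the result name out are placeholders, and the data below are copied from the example in the question.

```r
# Example data copied from the question (df is a placeholder name)
df <- data.frame(
  ID = c("EID0062, EID0175", "EID0063", "EID0063", "EID0064",
         "EID0069", "EID0069", "EID0072", "EID0075", "EID0075",
         "EID0093", "EID0023", "EID0013", "EID0062, EID0175"),
  Error = c(1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1),
  stringsAsFactors = FALSE
)

# One row per ID; max(Error) is 1 if the ID ever had an error
out <- aggregate(Error ~ ID, data = df, FUN = max)
print(out)
```

Note that aggregate returns the groups sorted by ID rather than in order of first appearance, so the row order differs from the desired output shown above.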

R won't interpret table column as numerical values [duplicate]

This question already has answers here:
How to convert a factor to integer\numeric without loss of information?
(12 answers)
Count number of occurences for each unique value
(14 answers)
Closed 2 years ago.
I am just trying to convert a column of numbers, which R thinks are characters, to numerical values.
I have the following table:
> longtab=as.data.frame(table(long));head(longtab)
long Freq
1 189485 1
2 189486 1
3 189487 1
4 189488 1
5 189489 1
6 189490 1
I've created a new table from those data as follows:
> q=head(longtab);q
long Freq
1 189485 1
2 189486 1
3 189487 1
4 189488 1
5 189489 1
6 189490 1
When I test whether the "long" column is numeric, R tells me that it is not.
> is.numeric(q$long)
[1] FALSE
When I try to coerce "long" values to be numeric using as.numeric(), I get the following:
> as.numeric(q$long)
[1] 1 2 3 4 5 6
But these are the row numbers, not the values in the "long" column. This seems like it should be a simple problem to fix, but I have been struggling with it for a while. Any help would be greatly appreciated.
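A sketch of what is likely going on: as.data.frame(table(...)) stores the tabulated values as a factor, and as.numeric() on a factor returns the internal level codes (1, 2, 3, ...) rather than the labels. Converting through character first recovers the original numbers. The input vector long below is an assumption reconstructed from the printed output; long_num and long_num2 are illustrative names.

```r
# Assumed input, reconstructed from the output shown in the question
long <- c(189485, 189486, 189487, 189488, 189489, 189490)

longtab <- as.data.frame(table(long))
q <- head(longtab)

is.factor(q$long)   # the column is a factor, not character or numeric

# Convert the factor labels, not the level codes
long_num <- as.numeric(as.character(q$long))

# Equivalent alternative: convert the levels once, then index by the factor
long_num2 <- as.numeric(levels(q$long))[q$long]

print(long_num)
```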

Is there a way in R to make all possible combinations between rows of different columns? [duplicate]

This question already has answers here:
Unique combination of all elements from two (or more) vectors
(6 answers)
Generate list of all possible combinations of elements of vector
(10 answers)
Closed 2 years ago.
I have a df with one column, and I would like to build all combinations of the values in this column to get a new df with two columns, like the simple example below (note: my df has ~5000 rows):
df
CG
1
2
3
##I would like a result similar to this:
> head(df1)
C1 C2
1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
Could someone help me?
Thank you in advance.
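One way this could be done in base R is with expand.grid, which builds every pairing of the supplied vectors. This is a sketch using the three-row example; df and df1 follow the names in the question. Because expand.grid varies its first argument fastest, C2 is passed first and the columns are reordered afterwards to match the desired layout.

```r
# Example data from the question
df <- data.frame(CG = c(1, 2, 3))

# All ordered pairs of CG with itself; C2 is listed first so that it
# cycles fastest, then the columns are put back in C1, C2 order
df1 <- expand.grid(C2 = df$CG, C1 = df$CG)[, c("C1", "C2")]
print(df1)
```

With ~5000 input rows the result has 5000 * 5000 = 25 million rows, so memory use is worth checking before running this on the full data.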

R: dataframe turns into vector after deleting columns [duplicate]

This question already has answers here:
How do I extract a single column from a data.frame as a data.frame?
(3 answers)
Closed 2 years ago.
I have a data frame:
L1 2020 NA
1 1 0 0
2 2 1 0
3 3 1 0
I want to delete the first and last columns to get a data frame like this:
2020
1 0
2 1
3 1
I tried:
1)
df <- df[,-c(1,ncol(df))]
or 2)
df <- subset(df, select = -c(1,ncol(df)))
For both I get result:
[1] 0 1 1
So I guess it changed the data frame into a vector. How can I delete these columns and keep it as a data frame? It is important for me to keep it like this. I don't have this problem when there are more columns. It changes only when a single column is left.
After specifying the columns in the square brackets, add , drop = FALSE right after them.
The drop argument is TRUE by default in this case, and that default is what is causing the problem.
df <- data.frame(a=1:10,b=1:10)
df[,1] #R simplifies to a vector via implicit drop=TRUE default
df[,1,drop=FALSE] #dataframe-structure remains
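Applied to the data frame from the question, this might look as follows. This is a sketch: the frame is rebuilt from the printed example, with check.names = FALSE assumed so that the unusual column names 2020 and NA are kept as shown, and the names keep1 and keep2 are illustrative.

```r
# Data frame rebuilt from the question; check.names = FALSE keeps the
# non-syntactic column names "2020" and "NA" unchanged
df <- data.frame(L1 = 1:3, `2020` = c(0, 1, 1), `NA` = c(0, 0, 0),
                 check.names = FALSE)

# Matrix-style indexing with drop = FALSE keeps the data frame structure
keep1 <- df[, -c(1, ncol(df)), drop = FALSE]

# List-style indexing (no comma) never simplifies to a vector
keep2 <- df[-c(1, ncol(df))]

print(keep1)
```

Both forms return a one-column data frame; the list-style form avoids the drop argument altogether.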

Row numbering by group and date [duplicate]

This question already has answers here:
Numbering rows within groups in a data frame
(10 answers)
numbering by groups [duplicate]
(8 answers)
Closed 6 years ago.
I have a question about numbering rows by group AND by one further condition. I know how to do this by group but not by adding one further condition.
Suppose I have the ID and the DATE and want to create NUM as shown in the table:
ID DATE     NUM
1  20160103 1
1  20160104 1
1  20160104 2
1  20160105 1
1  20160105 2
1  20160105 3
1  20160106 1
2  20160103 1
2  20160103 2
2  20160105 1
Does anyone know how to do this?
We can use ave from base R:
df$NUM <- with(df, ave(ID, ID, DATE, FUN =seq_along))
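A runnable sketch of the same idea on the example data follows. The data frame is rebuilt from the table in the question; as a small variation (an assumption on my part, not part of the answer above), seq_along(ID) is passed as the first argument so the result is an integer counter even if ID is stored as character.

```r
# Example data rebuilt from the table in the question
df <- data.frame(
  ID   = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2),
  DATE = c(20160103, 20160104, 20160104, 20160105, 20160105,
           20160105, 20160106, 20160103, 20160103, 20160105)
)

# Number the rows within each ID/DATE combination, in order of appearance
df$NUM <- with(df, ave(seq_along(ID), ID, DATE, FUN = seq_along))
print(df)
```

ave applies FUN within each combination of the grouping variables and writes the results back in the original row order, so the data do not need to be sorted first.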
