R: dataframe turns into vector after deleting columns [duplicate] - r

This question already has answers here:
How do I extract a single column from a data.frame as a data.frame?
(3 answers)
Closed 2 years ago.
I have a data frame:
  L1 2020 NA
1  1    0  0
2  2    1  0
3  3    1  0
I want to delete first and last column, to get dataframe like this:
  2020
1    0
2    1
3    1
I tried:
1)
df <- df[,-c(1,ncol(df))]
or 2)
df <- subset(df, select = -c(1,ncol(df)))
For both I get result:
[1] 0 1 1
So I guess it changed the data frame into a vector. How can I delete these columns and keep the result as a data frame? It is important for me to keep it like this. I don't have this problem when more columns remain; it only happens when a single column is supposed to be left.

After specifying the columns in the square brackets, add drop = FALSE as the last argument.
The drop argument of [ is TRUE by default, and that default is what collapses a single remaining column into a vector.
df <- data.frame(a=1:10,b=1:10)
df[,1] # simplified to a vector by the implicit drop=TRUE default
df[,1,drop=FALSE] # stays a data frame
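Applied to the data frame from the question, a minimal sketch might look like this. The data frame is rebuilt here from the printed output, so the column types are an assumption, and kept and kept_too are names chosen for illustration.

```r
# Rebuild the example data (values assumed from the printed output).
df <- data.frame(c(1, 2, 3), c(0, 1, 1), c(0, 0, 0))
names(df) <- c("L1", "2020", "NA")

# Matrix-style indexing: drop = FALSE stops the single remaining
# column from collapsing into a vector.
kept <- df[, -c(1, ncol(df)), drop = FALSE]

# List-style indexing (no comma) returns a data frame as well.
kept_too <- df[-c(1, ncol(df))]

print(kept)
print(class(kept_too))
```

Either form should leave a one-column data frame; the list-style form avoids having to remember drop at all.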


Remove columns from data frame that only contain zeros [duplicate]

This question already has answers here:
Remove columns with zero values from a dataframe
(10 answers)
Closed 4 years ago.
I'm trying to recreate a data frame (DC5_prod) that has hundreds of columns, but many without any values other than zero.
The first column in the data frame is text and the rest are numeric. Is there a way to ignore the first column while simultaneously eliminating the remainder of columns that are composed entirely of zeros?
DC5_Prod
   a b c d e f
1 AK 0 0 0 0 1
2 JI 0 0 0 0 0
The above is a snippet of how it currently stands and would want an output of:
DC5_Prod
   a f
1 AK 1
2 JI 0
When I try the solution given for a similar question on the site:
DC5_prod[, colSums(DC5_prod != 0) > 0]
it essentially returns the first column and does not remove any of the others.
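Try this base R approach: count the zeros in each column and keep only the columns where that count differs from the number of rows.
> ind <- sapply(DC5_Prod, function(x) sum(x==0)) != nrow(DC5_Prod)
> DC5_Prod[,ind]
   a f
1 AK 1
2 JI 0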
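A variant that sets the text column aside explicitly could be sketched as follows. The sample data is rebuilt from the snippet in the question, nonzero and result are illustrative names, and na.rm = TRUE is an added assumption so that missing values do not turn the column test into NA.

```r
# Sample data rebuilt from the question's snippet.
DC5_Prod <- data.frame(a = c("AK", "JI"), b = c(0, 0), c = c(0, 0),
                       d = c(0, 0), e = c(0, 0), f = c(1, 0))

# Test only the numeric columns (everything except the first):
# TRUE where a column holds at least one non-zero value.
nonzero <- colSums(DC5_Prod[-1] != 0, na.rm = TRUE) > 0

# Always keep the first column, then the columns that passed the test.
# drop = FALSE keeps a data frame even if only one column survives.
result <- DC5_Prod[, c(TRUE, nonzero), drop = FALSE]
print(result)
```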

remove R Dataframe rows based on zero values in one column [duplicate]

This question already has answers here:
Subset / filter rows in a data frame based on a condition in a column
(3 answers)
Closed 5 years ago.
I have this dataframe with a set of zero values along two columns. df
Sl_No No_of_Mails No_of_Responses
1 10 2
2 0 0
3 20 0
4 10 10
5 0 0
6 0 NA
7 10 NA
8 10 0
I want to remove those rows where No_of_Mails equals zero without disturbing the other column.
I have tried the following code
row_sub = apply(df, 1, function(row) all(row !=0 ))
df[row_sub,]
This removes every row that contains a 0 in any column, including rows where the only 0 is in No_of_Responses. I want that column left undisturbed.
I have also tried this:
df[df$No_of_Mails][!(apply(df, 1, function(y) any(y == 0))),]
This deletes all the rows and gives me a table with zero rows.
Just subset the data frame based on the value in the No_of_Mails column:
df[df$No_of_Mails != 0, ]
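On the data from the question, that subset could be checked with a sketch like the one below. The data frame is rebuilt from the printed table. The which() variant is an addition, not part of the answer: it is a common way to guard against NA values in the filter column, which this sample does not have.

```r
# Data rebuilt from the table in the question.
df <- data.frame(
  Sl_No = 1:8,
  No_of_Mails = c(10, 0, 20, 10, 0, 0, 10, 10),
  No_of_Responses = c(2, 0, 0, 10, 0, NA, NA, 0)
)

# Keep rows where No_of_Mails is non-zero; other columns are untouched.
result <- df[df$No_of_Mails != 0, ]

# If No_of_Mails could contain NA, a logical index would yield NA rows;
# which() keeps only the positions where the comparison is TRUE.
result_na_safe <- df[which(df$No_of_Mails != 0), ]

print(result)
```

Rows 2, 5 and 6 are removed, while the zeros and NAs in No_of_Responses stay as they are.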

R get every column with NA [duplicate]

This question already has answers here:
Remove columns from dataframe where some of values are NA
(8 answers)
Closed 6 years ago.
I need a function in R which is doing the following.
I have a matrix with some data
mydata <- data.frame (matrix(c(1,2,3,NA,2,3,NA,NA,2), 3,3))
mydata
  X1 X2 X3
1  1 NA NA
2  2  2 NA
3  3  3  2
Now I want to check every column of this matrix for NAs and create a vector which stores 0 if a column contains an NA, or 1 if it does not.
So check column X1: if NA is in this column write 0 to the vector, if not write 1 to the vector. Then check the next column and so on.
After checking mydata the vector should look like this
(1 0 0)
with
colnames(mydata)[colSums(is.na(mydata)) > 0]
I get the column names which have NA. But how can I use this function to create the vector?
colSums(is.na(mydata)) counts the NAs in each column. Negating that count with ! gives TRUE for columns without any NA, and as.integer coerces the logical vector to 1/0:
as.integer(!colSums(is.na(mydata)))
#[1] 1 0 0
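An alternative sketch that keeps the column names on the result uses anyNA() on each column. This is a swapped-in technique rather than the answer's, and flags is an illustrative name.

```r
mydata <- data.frame(matrix(c(1, 2, 3, NA, 2, 3, NA, NA, 2), 3, 3))

# For each column: 1 if it contains no NA, 0 otherwise.
# sapply keeps the column names on the resulting vector.
flags <- sapply(mydata, function(col) as.integer(!anyNA(col)))
print(flags)

# Names of the columns that are free of NA.
print(names(flags)[flags == 1])
```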

R reset cumsum when it found 0 [duplicate]

This question already exists:
R ffdfdply reset cumsum using data.table
Closed 9 years ago.
I am using the ff package to load an excel file.
i=as.ffdf(data.frame(a=c(1,1,1,1,1,1), b=c(1,4,6,2,5,3), c=c(1,1,1,1,1,1), d=c(1,0,1,1,0,1)))
I am trying to compute the cumulative sum of column d, resetting it whenever a 0 is found. The output I want is:
a b c d Result
1 1 1 1 1
1 4 1 0 0
1 6 1 1 1
1 2 1 1 2
1 5 1 0 0
1 3 1 1 1
I know I can easily achieve this with ddply, but I have a large data set, i.e. more than 5000000 rows.
Thanks
This works, but it is a little slow with 24385601 rows. I created a unique key from the combinations of columns a and c and used Arun's solution. The key column (key_a_c) is used to split the data set, i.e. to reset the cumsum.
Create a unique key on columns a and c:
i$key_a_c <- ikey(i[c("a", "c")])
Generate the cumulative series by splitting on key_a_c:
p1 <- ffdfdply(i, split = as.character(i$key_a_c), FUN = function(x) {
  x$Result <- as.ff(x[, "d"] * sequence(rle(x[, "d"])$lengths))
  as.data.frame(x)
}, trace = TRUE)
Please share your views and code if you have some optimized solution.
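The core of the reset logic can be tried on a plain vector without ff. The sketch below uses the d column from the question. The rle/sequence form relies on d containing only 0 and 1; the ave() form is an added alternative, not from the answer, that should also handle other non-negative values.

```r
d <- c(1, 0, 1, 1, 0, 1)

# rle() finds runs of equal values; sequence() numbers the positions
# within each run; multiplying by d zeroes out the runs of 0.
# Assumes d holds only 0 and 1.
result_rle <- d * sequence(rle(d)$lengths)

# Alternative: start a new group at every 0, then cumsum within groups.
result_ave <- ave(d, cumsum(d == 0), FUN = cumsum)

print(result_rle)
print(result_ave)
```

Both vectors match the Result column requested in the question.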

Apply a function to each row in a data frame in R [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
how to apply a function to every row of a matrix (or a data frame) in R
R - how to call apply-like function on each row of dataframe with multiple arguments from each row of the df
I want to apply a function to each row in a data frame, however, R applies it to each column by default. How do I force it otherwise?
> a = as.data.frame(list(c(1,2,3),c(10,0,6)),header=T)
> a
  c.1..2..3. c.10..0..6.
1          1          10
2          2           0
3          3           6
> sapply(a,min)
 c.1..2..3. c.10..0..6.
          1           0
I wanted something like
1 2
2 0
3 3
You want apply (see the docs for it). The second argument is the margin: apply(var,1,fun) applies fun to each row, apply(var,2,fun) to each column.
> apply(a,1,min)
[1] 1 0 3
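A short sketch comparing apply with a vectorised alternative for this particular function follows. The column names x and y are made up for readability, and pmin is a swapped-in technique that only covers row-wise minima, not arbitrary functions.

```r
a <- data.frame(x = c(1, 2, 3), y = c(10, 0, 6))

# MARGIN = 1: apply min to each row (the data frame is first
# converted to a matrix, so all columns should be of one type).
row_min <- apply(a, 1, min)

# MARGIN = 2: apply min to each column, like sapply(a, min).
col_min <- apply(a, 2, min)

# For row-wise minima specifically, pmin() works element-wise
# across the columns without converting to a matrix.
row_min_vec <- do.call(pmin, a)

print(row_min)
print(col_min)
print(row_min_vec)
```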
