Finding Index from Vector/Matrix or Dataframe in R

I have data in R as follows:
data <- c(1,12,22,0,8,1,0,0)
Is there any way to find the indices of the elements that are greater than 0? The result should be:
1 2 3 5 6
I tried as.factor(data), but it takes several more steps to get the result I am aiming for. Thanks.

We can use which on a logical vector:
which(data > 0)
#[1] 1 2 3 5 6
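Since the title also mentions matrices and data frames, here is a minimal sketch of the same idea for those cases. The matrix m and the data frame df are hypothetical reshapings of data and are not part of the question; arr.ind = TRUE is a documented argument of which that returns row/column positions instead of linear indices.

```r
data <- c(1, 12, 22, 0, 8, 1, 0, 0)

# Hypothetical 2 x 4 matrix, filled column by column
m <- matrix(data, nrow = 2)

# Linear (column-major) indices of the positive entries
lin_idx <- which(m > 0)

# One row per match, with columns "row" and "col"
pos <- which(m > 0, arr.ind = TRUE)

# Hypothetical data frame: index the column like any other vector
df <- data.frame(value = data)
df_idx <- which(df$value > 0)
```

With arr.ind = TRUE the result is a matrix, so pos[, "row"] and pos[, "col"] can be used to look up the matching cells.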

Another option is seq_along (though not as straightforward as the which method by @akrun):
> seq_along(data)[data>0]
[1] 1 2 3 5 6
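One practical difference between the two approaches shows up with missing values. The sketch below uses a hypothetical vector x (not from the question) that contains an NA: which silently drops positions where the comparison is NA, while logical subsetting keeps them as NA.

```r
# Hypothetical vector containing a missing value
x <- c(1, NA, 0, 2)

# which() ignores positions where the comparison is NA
idx_which <- which(x > 0)

# Logical subsetting propagates the NA into the result
idx_seq <- seq_along(x)[x > 0]
```

Here idx_which holds only positions 1 and 4, whereas idx_seq has an NA in its second slot.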

Related

R: how to remove consecutive repeated values while keeping one value per run [duplicate]

For example, I have this vector:
v <- c(3,3,3,3,3,1,1,1,1,1,1,
3,3,3,3,3,3,3,3,3,3,3,3,
3,3,3,2,2,2,2,2,2,2,3,3,
3,3,3,3,3,3,3,3,3,3,3)
I would like to simplify the vector to this expected output:
exp_output <- c(3,1,3,2,3)
What is the best and most convenient way to do this? Thank you.
Try rle(v)$values which results in [1] 3 1 3 2 3.
Another option using diff and which.
v[c(1, which(diff(v) != 0) + 1)]
#[1] 3 1 3 2 3
Another option is with lag (the default must differ from the first element of v, otherwise that element is dropped):
library(dplyr)
v[v != lag(v, default = 1)]
[1] 3 1 3 2 3
We can use rleid
library(data.table)
tapply(v, rleid(v), FUN = first)
1 2 3 4 5
3 1 3 2 3
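The answers above all keep an element whenever it differs from its predecessor. A minimal base R sketch of that rule follows, using a shortened vector and a second hypothetical vector w that starts with 1. The prev_fixed line only emulates a lag with a fixed default of 1 and is not the dplyr function itself; it illustrates why a fixed default can lose the first run.

```r
# Shortened version of the vector from the question
v <- c(3, 3, 3, 1, 1, 3, 3, 2, 2, 3)

# Keep the first element, then every element that differs from the one before it
keep <- c(TRUE, v[-1] != v[-length(v)])
collapsed <- v[keep]

# Same result as the run values reported by rle()
same_as_rle <- identical(collapsed, rle(v)$values)

# Hypothetical vector whose first value equals the fixed default 1
w <- c(1, 1, 2, 2, 1)

# Emulated "lag with default = 1": shift right and pad with 1
prev_fixed <- c(1, w[-length(w)])
dropped_first <- w[w != prev_fixed]

# Always keeping the first element avoids the problem
kept_first <- w[c(TRUE, w[-1] != w[-length(w)])]
```

In this sketch dropped_first loses the leading run of 1s, while kept_first retains it.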

Error Merging Two Columns

I'm trying to combine two columns, Previous$Col1 and Previous$Col3, into one data.frame.
This is what I tried to do:
x<-data.frame(Previous$Col1)
y<-data.frame(Previous$Col3)
z<-merge(x,y)
And for some reason I got this error on the console:
Error: cannot allocate vector of size 24.0 Gb
What's going wrong and what should I do to fix this?
How could a data frame with two columns with 80000ish rows take up 24 GB of memory?
Thanks!
You are creating a full Cartesian product: x and y share no column names, so merge pairs every row of x with every row of y. The result has 80000*80000 rows and two columns, a total of 1.28e+10 elements (about 51 GB, if I am correct). What are you trying to accomplish with your merge?
> x<-data.frame(a= c("a","b"))
> y<-data.frame(b= c(1,2,3))
> z<-merge(x,y)
> x
a
1 a
2 b
> y
b
1 1
2 2
3 3
> z
a b
1 a 1
2 b 1
3 a 2
4 b 2
5 a 3
6 b 3
You could do data.frame(Col3 = Previous$Col3, Col1= Previous$Col1) to achieve what you want.
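The size estimate above can be checked with a few lines of arithmetic. This is only a sketch under the assumption of 4 bytes per element (as for integer vectors or factor codes); the question does not say what type the columns actually have.

```r
n <- 80000

# Rows in the Cartesian product of two 80000-row data frames
rows <- n * n

# Two columns in the merged result
elements <- rows * 2

# Assumption: 4 bytes per element (integer or factor codes)
bytes_per_element <- 4

# Size of a single result column in GiB
gib_per_column <- rows * bytes_per_element / 2^30

# Size of both columns together in GB (10^9 bytes)
gb_total <- elements * bytes_per_element / 1e9
```

Under that assumption a single column needs about 23.8 GiB, which appears consistent with the 24.0 Gb vector in the error message, and both columns together come to about 51 GB.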
Try using bind_cols from the dplyr package or cbind from base R (note that cbind on two plain vectors returns a matrix, not a data.frame).
bind_cols(Previous$Col1,Previous$Col3)
or
cbind(Previous$Col1,Previous$Col3)
Additionally, since these columns come from the same original data.frame, select() from the dplyr package could be used:
select(Previous,Col1,Col3)
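To see the difference on a small scale, the sketch below builds a hypothetical four-row stand-in for Previous (the real data is not shown in the question) and compares merge with plain column subsetting in base R.

```r
# Hypothetical stand-in for the data frame from the question
Previous <- data.frame(Col1 = 1:4,
                       Col2 = letters[1:4],
                       Col3 = c(10, 20, 30, 40))

x <- data.frame(Col1 = Previous$Col1)
y <- data.frame(Col3 = Previous$Col3)

# No shared column names, so merge() returns every combination of rows
crossed <- merge(x, y)

# Subsetting the original data frame keeps the rows aligned
subset_cols <- Previous[, c("Col1", "Col3")]
```

Here crossed has 4 * 4 = 16 rows, while subset_cols keeps the original 4 rows with Col1 and Col3 side by side.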

How do you convert information from rle into a data frame

I want to convert the information returned by the "rle" function in R into a data frame, but couldn't find out how. For example, for the vector
x <- c(1,1,1,2,2,3,4,4,4)
I want a data frame with two columns: the values 1 2 3 4 and the run lengths 3 2 1 3.
Any help would be greatly appreciated!
Use unclass to remove the rle class. Then you can just use data.frame on the resulting list.
data.frame(unclass(rle(x)))
## lengths values
## 1 3 1
## 2 2 2
## 3 1 3
## 4 3 4
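As a quick sanity check on this conversion, the sketch below rebuilds the original vector from the data frame with rep. The names runs and rebuilt are only illustrative.

```r
x <- c(1, 1, 1, 2, 2, 3, 4, 4, 4)

# rle() returns a classed list with components "lengths" and "values"
runs <- data.frame(unclass(rle(x)))

# Repeating each value by its run length reproduces the input
rebuilt <- rep(runs$values, times = runs$lengths)
```

If rebuilt is identical to x, the data frame holds the complete run-length encoding.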
You can do it directly with the data.frame function. rle actually returns a list of two components (lengths and values).
rleX <- rle(x)
data.frame(values = rleX$values, lengths = rleX$lengths)
You can use this one-liner to convert to a data frame:
data <- with(rle(x), data.frame(values, lengths))
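Try this (table counts total occurrences rather than runs, so it matches rle only when each value occurs in a single run, as it does here):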
data.frame(table(x))
x Freq
1 1 3
2 2 2
3 3 1
4 4 3
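The rle and table approaches agree for the vector in the question, but not in general. The sketch below uses a hypothetical vector x2 in which the value 1 appears in two separate runs.

```r
# Hypothetical vector where the value 1 occurs in two separate runs
x2 <- c(1, 1, 2, 2, 1)

# Run-length view: one row per run
runs <- with(rle(x2), data.frame(values, lengths))

# Frequency view: one row per distinct value
counts <- data.frame(table(x2))
```

Here runs has three rows (values 1, 2, 1 with lengths 2, 2, 1), while counts has two rows (1 occurs three times, 2 occurs twice).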

length of table read from a data file in r

I have a simple table with the following entries.
1
2
3
4
5
The file name is "test.txt". I have used the following command to read in the file.
mydata<-read.table("test.txt")
But when I enter
length(mydata)
it shows 1 instead of 5. Why does it show 1 and not 5?
I believe
nrow(mydata)
should return the number of rows (5)
The length of a data frame is its number of columns. In this case it is 1.
mydata <- data.frame(1:5)
The above code creates a data frame:
X1.5
1 1
2 2
3 3
4 4
5 5
Let's look at some commands:
length(mydata)
[1] 1
Case 1: to get the number of rows
nrow(mydata)
[1] 5
Case 2: to get the number of elements in the first column of the data frame
length(mydata$X1.5)
[1] 5
length(mydata[[1]])
[1] 5
length is mostly used for vectors; for a data frame it is better to use nrow.
Regards,
Ganesh
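The same behaviour can be reproduced without a file. The sketch below passes the five lines through the text argument of read.table as a stand-in for test.txt; with no header, read.table names the single column V1.

```r
# Stand-in for test.txt: five lines, one value per line
mydata <- read.table(text = "1\n2\n3\n4\n5")

n_cols <- length(mydata)     # a data frame is a list of columns
n_rows <- nrow(mydata)       # number of rows
dims   <- dim(mydata)        # rows first, then columns
n_vals <- length(mydata$V1)  # length of the column vector itself
```

So length(mydata) is 1 because there is one column, while nrow(mydata) and length(mydata$V1) both give 5.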

apply which.max to second, third, etc. highest value

I have a vector:
x <- rnorm(100)
I would like to create a vector that stores the positions of the first, second, third, ..., 100th highest values in x.
For example, if x = 4,9,2,0,10,11, then the desired vector would be 6,5,2,1,3,4. Is there a function for doing this?
Try using order
> order(x, decreasing =TRUE)
[1] 6 5 2 1 3 4
Try this:
> order(-x)
[1] 6 5 2 1 3 4
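A short sketch of how the result of order relates to which.max and to the sorted values, using the six-element example from the question. The names ord, sorted and top3 are only illustrative.

```r
x <- c(4, 9, 2, 0, 10, 11)

# Positions of the values from highest to lowest
ord <- order(x, decreasing = TRUE)

# Indexing x by those positions gives the values in descending order
sorted <- x[ord]

# The first entry is the position of the maximum, as which.max() reports
first_is_max <- ord[1] == which.max(x)

# Positions of the three highest values
top3 <- head(ord, 3)
```

The k-th entry of ord is the position of the k-th highest value, so ord[2] plays the role of a "second which.max".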
