Keep column names after calling function combn - r

After looking at another post about column names and combn function here consider the same data.frame. We make a combn with all 2 possible vectors:
foo <- data.frame(x=1:5,y=4:8,z=10:14, w=8:4)
all_comb <- combn(foo,2)
Is there a way to keep column names after the combn call so in this case we could get "x y" instead of "X1.5 X4.8" as shown below ?
comb_df <- data.frame(all_comb[1,1],all_comb[2,1])
print(comb_df)
X1.5 X4.8
1 1 4
2 2 5
3 3 6
4 4 7
5 5 8

I suspect you really want to use expand.grid() instead.
Try this:
head(expand.grid(foo))
x y z w
1 1 4 10 8
2 2 4 10 8
3 3 4 10 8
4 4 4 10 8
5 5 4 10 8
6 1 5 10 8
or
head(expand.grid(foo[, 1:2]))
x y
1 1 4
2 2 4
3 3 4
4 4 4
5 5 4
6 1 5

Related

Transforming a looping factor variable into a sequence of numerics

I have a factor variable with 6 levels, which simplified looks like:
1 1 2 2 2 3 3 3 4 4 4 4 5 5 5 6 6 6 1 1 1 2 2 2 2... 1 1 1 2 2... (with n = 78)
Note, that each number is repeated mostly but not always three times.
I need to transform this variable into the following pattern:
1 1 2 2 2 3 3 3 4 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 8...
where each repetition of the 6 levels continuous counting ascending.
Is there any way / any function that lets me do that?
Sorry for my bad description!
Assuming that you have a numerical vector that represents your simplified version you posted. i.e. x = c(1,1,1,2,2,3,3,3,1,1,2,2), you can use this:
library(dplyr)
cumsum(x != lag(x, default = 0))
# [1] 1 1 1 2 2 3 3 3 4 4 5 5
which compares each value to its previous one and if they are different it adds 1 (starting from 1).
Maybe you can try rle, i.e.,
v <- rep(seq_along((v<-rle(x))$values),v$lengths)
Example with dummy data
x = c(1,1,1,2,2,3,3,3,4,4,5,6,1,1,2,2,3,3,3,4,4)
then we can get
> v
[1] 1 1 1 2 2 3 3 3 4 4 5 6 7 7 8 8 9 9
[19] 9 10 10
In base you can use diff and cumsum.
c(1, cumsum(diff(x)!=0)+1)
# [1] 1 1 2 2 2 3 3 3 4 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 8
Data:
x <- c(1,1,2,2,2,3,3,3,4,4,4,4,5,5,5,6,6,6,1,1,1,2,2,2,2)

Sequentially remove vector elements

I want to replicate a vector with one value within this vector is missing (sequentially).
For example, my vector is
value <- 1:7
First, the series is without 1, second without 2, and so on. In the end, the series is in one vector.
The intended output looks like
2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5 6
Is there any smart way to do this?
You could use the diagonal matrix to set up a logical vector, using it to remove the appropriate values.
n <- 7
rep(1:n, n)[!diag(n)]
# [1] 2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5
# [36] 7 1 2 3 4 5 6
Well, you can certainly do it as a one-liner but I am not sure it qualifies as smart. For example:
x <- 1:7
do.call("c", lapply(as.list(-1:-length(x)), function(a)x[a]))
This simple uses lapply to create a list of copies of x with each of its entries deleted, and then concatenates them using c. The do.call function applies its first argument (a function) to its second argument (a list of arguments to the function).
For fun, it's also possible to just use rep:
> n <- 7
> rep(1:n, n)[rep(c(FALSE, rep(TRUE, n)), length.out=n^2)]
[1] 2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5 7 1 2
[39] 3 4 5 6
But lapply is cleaner, I think.
You could also do:
n <- 7
rep(seq(n), n)[-seq(1,n*n,n+1)]
#[1] 2 3 4 5 6 7 1 3 4 5 6 7 1 2 4 5 6 7 1 2 3 5 6 7 1 2 3 4 6 7 1 2 3 4 5 7 1 2 3 4 5 6

group cases by shared values in r [duplicate]

This question already has answers here:
R: define distinct pattern from values of multiple variables [duplicate]
(3 answers)
Closed 5 years ago.
I have a dataset like this:
case x y
1 4 5
2 4 5
3 8 9
4 7 9
5 6 3
6 6 3
I would like to create a grouping variable.
This variable should have the same values when both x and y are the same.
I do not care what this value is but it is to group them. Because in my dataset if x and y are the same for two cases they are probably part of the same organization. I want to see which organizations there are.
So my preferred dataset would look like this:
case x y org
1 4 5 1
2 4 5 1
3 8 9 2
4 7 9 3
5 6 3 4
6 6 3 4
How would I have to program this in R?
As you said , I do not care what this value is, you can just do following
dt$new=as.numeric(as.factor(paste(dt$x,dt$y)))
dt
case x y new
1 1 4 5 1
2 2 4 5 1
3 3 8 9 4
4 4 7 9 3
5 5 6 3 2
6 6 6 3 2
A solution from dplyr using the group_indices.
library(dplyr)
dt2 <- dt %>%
mutate(org = group_indices(., x, y))
dt2
case x y org
1 1 4 5 1
2 2 4 5 1
3 3 8 9 4
4 4 7 9 3
5 5 6 3 2
6 6 6 3 2
If the group numbers need to be in order, we can use the rleid from the data.table package after we create the org column as follows.
library(dplyr)
library(data.table)
dt2 <- dt %>%
mutate(org = group_indices(., x, y)) %>%
mutate(org = rleid(org))
dt2
case x y org
1 1 4 5 1
2 2 4 5 1
3 3 8 9 2
4 4 7 9 3
5 5 6 3 4
6 6 6 3 4
Update
Here is how to arrange the columns in dplyr.
library(dplyr)
dt %>%
arrange(x)
case x y
1 1 4 5
2 2 4 5
3 5 6 3
4 6 6 3
5 4 7 9
6 3 8 9
We can also do this for more than one column, such as arrange(x, y) or use desc to reverse the oder, like arrange(desc(x)).
DATA
dt <- read.table(text = " case x y
1 4 5
2 4 5
3 8 9
4 7 9
5 6 3
6 6 3",
header = TRUE)

Order data frame by column and display WITH indices

I have the following R data frame
> df
a
1 3
3 2
4 1
5 3
6 6
7 7
8 2
10 8
I order it by the a column with the order function df[ order(df), ]:
[1] 1 2 2 3 3 6 7 8
This is the result I want, BUT, how can list the whole data frame with the permuted indices?
The only thing that works is the following, but it seems sloppy and I don't really understand what it does:
> df[ order(df), c(1,1) ] # I want this but without the a.1 column!!!!
a a.1
4 1 1
3 2 2
8 2 2
1 3 3
5 3 3
6 6 6
7 7 7
10 8 8
Thanks
If we need the indices as well, use sort with index.return = TRUE
data.frame(sort(df$a, index.return=TRUE))

How to reverse a column in R

I have a dataframe as described below. Now I want to reverse the order of column B without hampering the total order of the dataframe. So now the column B has 5,4,3,2,1. I want to change it to 1,2,3,4,5. I don't want to sort as it will hamper the total ordering.
A B C
1 5 6
2 4 8
3 3 5
4 2 5
5 1 3
You can replace just that column:
x$B <- rev(x$B)
On your data:
> x$B <- rev(x$B)
> x
A B C
1 1 1 6
2 2 2 8
3 3 3 5
4 4 4 5
5 5 5 3
transform is also handy for this:
> transform(x, B = rev(B))
A B C
1 1 1 6
2 2 2 8
3 3 3 5
4 4 4 5
5 5 5 3
This doesn't modify x so you need to assign the result to something (perhaps back to x).

Resources