How to turn a 1D named int into a wide dataframe [duplicate] - r

I have a large matrix from which I want to select rows to further process them.
Ultimately, I want to convert the matrix to a data frame that contains only my rows of interest.
The easiest thing would be to simply run something like data.frame(my_matrix)[c(3,5),]. The problem is that I am working with a very big sparse matrix, so converting all of it to a data frame just to select a few rows is inefficient.
The following option does what I want, but it only returns the intended result if I supply at least 2 indices.
m <- matrix(1:25,nrow = 5)
rownames(m) <- 1:5
colnames(m) <- c("A","B","C","D","E")
data.frame(m[c(3,5),])
If I want to select only 1 row with the code above, the result is not a "wide" data frame but a long one, which looks like this:
data.frame(m[c(2),])
m.c.2....
A 2
B 7
C 12
D 17
E 22
Is there a simple way to get a dataframe with just one row out of the matrix without converting the whole matrix first? It feels like I am overlooking something very obvious here...
Any help is much appreciated!

You need to use drop=FALSE in the matrix subset; otherwise R drops the now-redundant dimension and returns a named vector instead of a matrix, as you saw.
m <- matrix(1:25,nrow = 5)
rownames(m) <- 1:5
colnames(m) <- c("A","B","C","D","E")
data.frame(m[c(2),, drop=FALSE])
#> A B C D E
#> 2 2 7 12 17 22
Created on 2022-10-12 by the reprex package (v2.0.1)
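To see why the single-row case behaves differently, it can help to compare the dimensions of the two subsets. The sketch below reuses the matrix m from the question; the names dropped, kept and wide are only illustrative.

```r
m <- matrix(1:25, nrow = 5)
rownames(m) <- 1:5
colnames(m) <- c("A", "B", "C", "D", "E")

# Default subsetting drops the row dimension: the result is a named vector
dropped <- m[2, ]
is.null(dim(dropped))
#> [1] TRUE

# drop = FALSE keeps a 1 x 5 matrix, so data.frame() yields one wide row
kept <- m[2, , drop = FALSE]
dim(kept)
#> [1] 1 5

wide <- data.frame(kept)
dim(wide)
#> [1] 1 5
```

Because data.frame() treats a plain vector as a single column, the dimension has to be preserved before the conversion, not after it.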

Related

Finding the maximum value for each row and extract column names [duplicate]

Say we have the following matrix,
x <- matrix(1:9, nrow = 3, dimnames = list(c("X","Y","Z"), c("A","B","C")))
What I'm trying to do is:
1- Find the maximum value of each row. For this part, I'm doing the following,
df <- apply(X=x, MARGIN=1, FUN=max)
2- Then, I want to extract the column names of the maximum values and put them next to the values. Following the reproducible example, it would be "C" for the three rows.
Any assistance would be wonderful.
You can use apply like
maxColumnNames <- apply(x,1,function(row) colnames(x)[which.max(row)])
Since you have a numeric matrix, you can't add the names as an extra column: a matrix holds a single type, so the whole matrix would be converted to character.
You can choose a data.frame and do
resDf <- cbind(data.frame(x),data.frame(maxColumnNames = maxColumnNames))
resulting in
resDf
A B C maxColumnNames
X 1 4 7 C
Y 2 5 8 C
Z 3 6 9 C
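As an alternative to the apply() call, base R's max.col() returns the column index of each row's maximum directly. This is only a sketch: the names idx, res and maxValue are illustrative, and ties.method = "first" is chosen here so that ties resolve deterministically.

```r
x <- matrix(1:9, nrow = 3, dimnames = list(c("X", "Y", "Z"), c("A", "B", "C")))

# For each row, the column index of the row maximum
idx <- max.col(x, ties.method = "first")

res <- data.frame(
  maxValue = x[cbind(seq_len(nrow(x)), idx)],  # one cell per row via matrix indexing
  maxColumnName = colnames(x)[idx],
  row.names = rownames(x)
)
res
```

This keeps the maximum value and the matching column name side by side without a per-row function call.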

How to subset the first column (rownames) in R [duplicate]

I have xy data for gene expression in multiple samples. I wish to subset the first column so I can order the genes alphabetically and perform some other filtering.
> setwd("C:/Users/Will/Desktop/BIOL3063/R code assignment");
> df = read.csv('R-assignments-dataset.csv', stringsAsFactors = FALSE);
Here is a simplified example of the dataset I'm working with, it has 270 columns (tissue samples) and 7065 rows (gene names).
The first column is a list of gene names (A2M, AAAS, AACS etc.) and each column is a different tissue sample, thus showing the gene expression in each tissue sample.
The question being asked is "Sort the gene names alphabetically (A-Z) and print out the first 20 gene names"
My thought process would be to subset the first column (gene names) and then perform order() to sort alphabetically, after which I can use head() to print the first 20.
However when I try
> genes <- df[1]
It simply subsets the first column that has data in it (TCGA-A6-2672_TissueA) rather than the one to its left.
Also
> genes <- df[,df$col1];
> genes;
data frame with 0 columns and 7065 rows
> order(genes);
integer(0)
Appears to create a list of gene names in R studio's viewer but I cannot perform any manipulation on it.
I am unable to correctly locate the first column in the data.frame, since it does not have a column header, and I also have the same problem when doing the same thing with row 1 (sample names) as well.
I'm a complete novice at R and this is part of an assignment I'm working on, it seems I'm missing something fundamental but I can not figure out what.
Cheers guys
Please include a sample of your text file as text instead of an image.
I have created a dataset similar to yours:
X Y
1 a b
2 c d
3 d g
Note that your tissue columns have a header but your gene names do not. Therefore these will be interpreted as rownames, see ?read.table:
If row.names is not specified and the header line has one less entry
than the number of columns, the first column is taken to be the row
names.
Reading it in R:
df <- read.table(text = ' X Y
1 a b
2 c d
3 d g')
So your gene names are not in df[1] but in rownames(df). To get them, use genes <- rownames(df); to add them to the existing df as a column, use df$gene <- rownames(df).
There are numerous ways to convert your row names to a column; see for example this question.
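Putting this together for the assignment task, a minimal sketch could look as follows. The three-gene table and the column names TissueA and TissueB are made up for illustration; only the header-with-one-fewer-entry layout mirrors the real file.

```r
# Hypothetical miniature of the dataset: the header has one entry fewer
# than the data lines, so the first field becomes the row names
df <- read.table(text = "TissueA TissueB
AACS 1.2 3.4
A2M 5.6 7.8
AAAS 9.0 1.1", header = TRUE)

genes <- rownames(df)          # gene names live in the row names
sorted_genes <- sort(genes)    # alphabetical order
head(sorted_genes, 20)         # first 20 names (only 3 exist here)

# Reorder the whole data frame by gene name
df_sorted <- df[order(rownames(df)), ]
```

Sorting the row names directly avoids having to create a separate gene column first.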
If you are asking what I think you are asking, you just need to subset inside data.frame() and give the new column a "header", as you call it. Here it is named V1 explicitly; without a name, R derives the column name from the subsetting expression rather than calling it V1.
genes <- data.frame(V1 = df[,1])
genes$V1
1 A
2 C
3 A
4 B
5 C
6 D
7 A
8 B
As per the comment below, the issue could be avoided if you remove the comma from your subsetting syntax. When you select columns from a data.frame, you only need to index the column, not the rows.
genes <- df[1]

Reshaping matrix into vector of alternate columns [duplicate]

I have a matrix measuring 91 x 2 (i.e 91 rows and two columns).
mat1 <- matrix(1:182, 91, 2)
I need to turn this matrix into a vector, i.e. a single row. I can do that with the following:
mat2 <- matrix(mat1, nrow = 1, byrow = TRUE)
However, I would like the rows of the original matrix to follow one after another. Currently it takes all of column 1, then all of column 2, and joins them sequentially, whereas I need them in one long row like this: 1, 92, 2, 93, 3, 94, etc. The final structure would be 1 x 182 (i.e. one row with 182 columns).
How can I achieve this?
Thanks.
We can transpose the matrix and convert it to a vector
c(t(mat1))
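A short check of why this works, using the matrix from the question (vec and the second mat2 assignment are illustrative names): R stores matrices column by column, so flattening the transpose walks the original matrix row by row.

```r
mat1 <- matrix(1:182, 91, 2)

# t() turns the 91 x 2 matrix into 2 x 91; flattening that column by column
# visits the original matrix one row at a time
vec <- c(t(mat1))
head(vec)
#> [1]  1 92  2 93  3 94

# If a 1 x 182 matrix is needed rather than a plain vector
mat2 <- matrix(vec, nrow = 1)
dim(mat2)
#> [1]   1 182
```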

select multiple ranges of columns in data.table using column names [duplicate]

I can select multiple ranges of columns in a data.table using a numeric vector like c(1:5,27:30). Is there any way to do the same with column names? For example, in some form similar to col1:col5,col27:col30?
You can with dplyr:
df <- data.frame(a=1, b=2, c=3, d=4, e=5, f=6, g=7)
dplyr::select(df, a:c, f:g)
  a b c f g
1 1 2 3 6 7
I am not sure whether my answer is efficient, but it may give you a workaround if you need to stay with data.table.
My proposal is to use data.table in conjunction with cbind. Note that df has to be a data.table (not a data.frame) for the a:c range syntax to work inside the brackets:
library(data.table)
df <- data.table(a=1, b=2, c=3, d=4, e=5, f=6, g=7)
multColSelectedByName <- cbind(df[,a:c],df[,f:g])
#a b c f g
#1: 1 2 3 6 7
One thing to be careful about: if one of the selections contains only a single column, for example df[,f], it is returned as a plain vector, so the resulting column is named something like V2 rather than f. In that case you could use:
multColSelectedByName<- cbind(df[,a:c],f=df[,f])
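Another way to approach this, without relying on range syntax inside the brackets, is to expand the name ranges into a character vector first. The helper col_ranges below is hypothetical (it is not part of data.table or dplyr) and is shown on a plain data.frame so that it runs with base R only.

```r
# Hypothetical helper: expand c("from", "to") name pairs into column names
col_ranges <- function(all_names, ...) {
  ranges <- list(...)
  idx <- unlist(lapply(ranges, function(r) {
    match(r[1], all_names):match(r[2], all_names)
  }))
  all_names[idx]
}

df <- data.frame(a = 1, b = 2, c = 3, d = 4, e = 5, f = 6, g = 7)
cols <- col_ranges(names(df), c("a", "c"), c("f", "g"))
cols
#> [1] "a" "b" "c" "f" "g"

df[, cols]
# With a data.table dt, the same character vector can be used as
# dt[, cols, with = FALSE] or dt[, ..cols]
```

Since the selection ends up as ordinary column names, single-column ranges keep their original name.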

R, accessing a column vector of a matrix by name [duplicate]

In R I can access the data in a column vector of a column matrix by the following:
mat2[,1]
Each column of mat2 has a name. How can I retrieve the data from the first column by using the name attribute instead of [,1]?
For example suppose my first column had the name "saturn". I want something like
mat2[,1] == mat2[saturn]
The following should do it:
mat2[,'saturn']
For example:
> x <- matrix(1:21, nrow=7, ncol=3)
> colnames(x) <- paste('name', 1:3)
> x[,'name 1']
[1] 1 2 3 4 5 6 7
Bonus information (adding to the first answer)
x[,c('name 1','name 2')]
would return two columns just as if you had done
x[,1:2]
And finally, the same operations can be used to subset rows
x[1:2,]
And if rows were named...
x[c('row 1','row 2'),]
Note the position of the comma within the brackets and with respect to the indices.
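The pieces above can be combined into one runnable sketch. The row names "row 1" to "row 7" are assumed here for illustration, since the original example only names the columns.

```r
x <- matrix(1:21, nrow = 7, ncol = 3)
colnames(x) <- paste("name", 1:3)
rownames(x) <- paste("row", 1:7)   # assumed row names

by_name <- x[, "name 1"]           # select the column by name
by_pos <- x[, 1]                   # select the same column by position
identical(by_name, by_pos)
#> [1] TRUE

# drop = FALSE keeps the result as a 7 x 1 matrix instead of a vector
one_col <- x[, "name 1", drop = FALSE]
dim(one_col)
#> [1] 7 1

# Names work for rows and columns at the same time
x[c("row 1", "row 2"), c("name 1", "name 2")]
```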
