Exporting Seurat Object Data by Cluster - r

I'm using Seurat to perform a single cell analysis and am interested in exporting the data for all cells within each of my clusters. I tried to use the below code but have had no success.
My Seurat object is called Patients. I also attached a screenshot of my Seurat object. I am looking to extract all the clusters (i.e. Ductal1, Macrophage1, Macrophage2, etc...)
meta.data.cluster <- unique(x = Patients@meta.data$active.ident)
for(group in meta.data.cluster) {
  group.cells <- WhichCells(object = Patients, subset.name = "active.ident" , accept.value = group)
  data_to_write_out <- as.data.frame(x = as.matrix(x = Patients@raw.data[, group.cells]))
  write.csv(x = data_to_write_out, row.names = TRUE, file = paste0(save_dir,"/",group, "_cluster_outfile.csv"))
}
I am new to R and coding so any help is greatly appreciated! :)

It doesn't work because there is no active.ident column under your metadata. For example if we use an example dataset like yours and set the ident:
library(Seurat)
M = matrix(rnbinom(5000,mu=20,size=1),ncol=50)
colnames(M) = paste0("P",1:50)
rownames(M) = paste0("gene",1:100)
Patients = CreateSeuratObject(M)
Patients$grp = sample(c("Ductal1","Macrophage1","Macrophage2"),50,replace=TRUE)
Idents(Patients) = Patients$grp
You can see this line of code gives you no value:
meta.data.cluster <- unique(x = Patients@meta.data$active.ident)
meta.data.cluster
NULL
You can do:
meta.data.cluster <- unique(Idents(Patients))
for(group in meta.data.cluster) {
  group.cells <- WhichCells(object = Patients, idents = group)
  # as.matrix() turns the sparse counts into a dense matrix before the data frame conversion
  data_to_write_out <- as.data.frame(as.matrix(GetAssayData(Patients, slot = 'counts')[, group.cells]))
  write.csv(data_to_write_out, row.names = TRUE, file = paste0(save_dir,"/",group, "_cluster_outfile.csv"))
}
Note that GetAssayData() also lets you pull out the counts for a single group. For example, to subset one group and write it out:
wh <- which(Idents(Patients) == "Macrophage1")
da = as.data.frame(as.matrix(GetAssayData(Patients, slot = 'counts')[, wh]))
write.csv(da,...)
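The per-cluster export pattern can also be tried without Seurat installed. Below is a minimal base R sketch, assuming a plain count matrix with genes in rows and cells in columns plus one cluster label per cell. The names counts, clusters and out_dir are made up for the illustration; they stand in for GetAssayData(), Idents() and save_dir.

```r
# Simulated stand-in for a counts matrix: 100 genes x 50 cells
set.seed(1)
counts <- matrix(rnbinom(5000, mu = 20, size = 1), ncol = 50,
                 dimnames = list(paste0("gene", 1:100), paste0("P", 1:50)))

# Hypothetical cluster label per cell (stand-in for Idents(Patients))
clusters <- rep(c("Ductal1", "Macrophage1", "Macrophage2"), length.out = 50)
names(clusters) <- colnames(counts)

# Output folder; a temporary directory is a placeholder for save_dir
out_dir <- tempdir()

# split() groups the cell names by cluster label
cells_by_cluster <- split(names(clusters), clusters)

# Write one CSV per cluster and keep the file paths
out_files <- vapply(names(cells_by_cluster), function(group) {
  # drop = FALSE keeps a matrix even if a cluster has a single cell
  group_counts <- counts[, cells_by_cluster[[group]], drop = FALSE]
  out_file <- file.path(out_dir, paste0(group, "_cluster_outfile.csv"))
  write.csv(as.data.frame(group_counts), file = out_file, row.names = TRUE)
  out_file
}, character(1))
```

Using split() up front avoids calling a lookup function inside the loop, and file.path() builds the output path without hard-coding the separator.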

Related

How do I resolve an integration error in Seurat?

I am new to Seurat, and am trying to run an integrated analysis of two different single-nuclei RNAseq datasets. I have been following the Seurat tutorial on integrated analysis (https://satijalab.org/seurat/articles/integration_introduction.html) to guide me, but when I ran the last line of code, I got an error.
# Loading required libraries
library(Seurat)
library(cowplot)
library(patchwork)
# Set up the Seurat Object
vgat.data <- Read10X(data.dir = "~/Desktop/VGAT Viral Data 1/")
vglut.data <- Read10X(data.dir = "~/Desktop/VGLUT3 Viral/")
# Initialize the Seurat object with the raw (non-normalized data)
vgat <- CreateSeuratObject(counts = vgat.data, project = "VGAT/VGLUT Integration", min.cells = 3, min.features = 200)
vglut <- CreateSeuratObject(counts = vglut.data, project = "VGAT/VGLUT Integration", min.cells = 3, min.features = 200)
# Merging the datasets
vgat <- AddMetaData(vgat, metadata = "VGAT", col.name = "Cell")
vglut <- AddMetaData(vglut, metadata = "VGLUT", col.name = "Cell")
merged <- merge(vgat, y = vglut, add.cell.ids = c("VGAT", "VGLUT"), project = "VGAT/VGLUT Integration")
# Split the dataset into a list of two seurat objects (vgat and vglut)
merged.list <- SplitObject(merged, split.by = "Cell")
# Normalize and Identify variable features for each dataset independently
merged.list <lapply(X = merged.list, FUN = function(x) {
x <- NormalizeData(x)
x <- FindVariableFeatures(x, selection.method = "vst", nFeatures = 2000)
})
After running the last line of code, I get the following error:
Error in merged.list < lapply(X = merged.list, FUN = function(x) { :
  comparison of these types is not implemented
I was wondering if anyone is familiar with Seurat and knows how I can troubleshoot this error. Any help would be greatly appreciated.
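The error message quotes merged.list < lapply(...), which suggests R parsed the line as a less-than comparison: the assignment arrow <- appears to have lost its hyphen. Here is a small base R sketch of the difference, using plain numeric vectors as stand-ins for the Seurat objects (an assumption, so that it runs without Seurat):

```r
# Stand-in for the list returned by SplitObject(); the elements are
# plain vectors here, not Seurat objects
merged.list <- list(VGAT = c(1, 2, 3), VGLUT = c(4, 5, 6))

# "<" without the hyphen compares two lists, which R rejects with an error
comparison_result <- tryCatch(
  merged.list < lapply(merged.list, function(x) x * 2),
  error = function(e) e
)
inherits(comparison_result, "error")

# "<-" assigns the result of lapply() back to merged.list
merged.list <- lapply(X = merged.list, FUN = function(x) {
  x <- x * 2
  x
})
```

If that is the cause, writing merged.list <- lapply(...) in the original code should get past this error. Separately, the argument of FindVariableFeatures() is spelled nfeatures (all lowercase) in the Seurat tutorial code.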

Weird characters appearing in the plot legend when using DoHeatmap

I was using Seurat to analyse single-cell RNA-seq data and managed to draw a heatmap with DoHeatmap() after clustering and marker selection, but a bunch of random characters appeared in the legend. They really are random: they change every time the code is run. I was worried it was something related to my own dataset, so I tried the test Seurat object 'ifnb' and still got the same issue (see the red oval in the example plot).
example plot
I also tried importing the Seurat object into R in the terminal (via readRDS) and ran the plotting function, but got the same issue there, so it's not an RStudio thing.
Here is the code I ran:
library(Seurat)
library(SeuratData)
library(patchwork)
library(dplyr)  # provides %>%, group_by() and top_n() used below
InstallData("ifnb")
LoadData("ifnb")
ifnb.list <- SplitObject(ifnb, split.by = "stim")
ifnb.list <- lapply(X = ifnb.list, FUN = function(x) {
  x <- NormalizeData(x)
  x <- FindVariableFeatures(x, selection.method = "vst", nfeatures = 2000)
})
features <- SelectIntegrationFeatures(object.list = ifnb.list)
immune.anchors <- FindIntegrationAnchors(object.list = ifnb.list, anchor.features = features)
immune.combined <- IntegrateData(anchorset = immune.anchors)
immune.combined <- ScaleData(immune.combined, verbose = FALSE)
immune.combined <- RunPCA(immune.combined, npcs = 30, verbose = FALSE)
immune.combined <- RunUMAP(immune.combined, reduction = "pca", dims = 1:30)
immune.combined <- FindNeighbors(immune.combined, reduction = "pca", dims = 1:30)
immune.combined <- FindClusters(immune.combined, resolution = 0.5)
DefaultAssay(immune.combined) <- 'RNA'
immune_markers <- FindAllMarkers(immune.combined, latent.vars = "stim", test.use = "MAST", assay = 'RNA')
immune_markers %>%
  group_by(cluster) %>%
  top_n(n = 10, wt = avg_log2FC) -> top10_immune
DoHeatmap(immune.combined, slot = 'data',features = top10_immune$gene, group.by = 'stim', assay = 'RNA')
Does anyone have any idea how to solve this issue other than reinstalling everything?
I have been having the same issue myself. I got around it because I did not need the legend, but I think a similar approach could work for you:
DoHeatmap(immune.combined, slot = 'data',features = top10_immune$gene, group.by = 'stim', assay = 'RNA') +
  scale_color_manual(
    values = my_colors,  # my_colors: your own vector of colours, one per group
    limits = c('CTRL', 'STIM'))
Let me know if this works! It doesn't solve the source of the odd text values but it does the job! If you haven't already, I would recommend creating a forum question on the Seurat forums to see where these characters are coming from!
I ran into the same problem with Seurat 4.0.
After I loaded 4.1, it disappeared.

Repeating a process of hexagon polygons that define raster boundaries for a large set of polygons

My apologies. This is my first time using Stackoverflow, so I'm not used to posting questions. Here's what I'm coding
library(raster)
library(landscapemetrics)
library(landscapetools)
# Add raster data for 2000
hex1_2000<-raster('2000_hex1.tif')
hex2_2000<-raster('2000_hex2.tif')
hex3_2000<-raster('2000_hex3.tif')
hex4_2000<-raster('2000_hex4.tif')
...
hex23_2000<-raster('2000_hex23.tif')
# Add raster data for 2010
hex1_2010<-raster('2010_hex1.tif')
hex2_2010<-raster('2010_hex2.tif')
hex3_2010<-raster('2010_hex3.tif')
hex4_2010<-raster('2010_hex4.tif')
...
hex23_2010<-raster('2010_hex23.tif')
#Create data frame as table
hex1 = data.frame(
lc00 = values(hex1_2000),
lc10 = values(hex1_2010))
hex2 = data.frame(
lc00 = values(hex2_2000),
lc10 = values(hex2_2010))
hex3 = data.frame(
lc00 = values(hex3_2000),
lc10 = values(hex3_2010))
hex4 = data.frame(
lc00 = values(hex4_2000),
lc10 = values(hex4_2010))
...
hex23 = data.frame(
lc00 = values(hex23_2000),
lc10 = values(hex23_2010))
...
hex1 = table(hex1[,c('lc00','lc10')])
hex2 = table(hex2[,c('lc00','lc10')])
hex3 = table(hex3[,c('lc00','lc10')])
hex4 = table(hex4[,c('lc00','lc10')])
...
hex23 = table(hex23[,c('lc00','lc10')])
#Define crosstabulation matrix
Hex1_Trans = as.matrix(hex1 / rowSums(hex1))
write.csv(Hex1_Trans, 'hex1Trans.csv')
Hex2_Trans = as.matrix(hex2 / rowSums(hex2))
write.csv(Hex2_Trans, 'hex2Trans.csv')
Hex3_Trans = as.matrix(hex3 / rowSums(hex3))
write.csv(Hex3_Trans, 'hex3Trans.csv')
Hex4_Trans = as.matrix(hex4 / rowSums(hex4))
write.csv(Hex4_Trans, 'hex4Trans.csv')
...
Hex23_Trans = as.matrix(hex23 / rowSums(hex23))
write.csv(Hex23_Trans, 'hex23Trans.csv')
As you can see, I repeat the same process in numerous places. I would be delighted to know how I can make this code simpler and more elegant. My code always ends up like this, and it is obviously highly inefficient. Thank you everyone for your help.
Here is an incomplete draft illustrating how to use Map to iterate simultaneously through the 2000 and 2010 data.
fn_y2000 <- c("2000_hex1.tif", "2000_hex2.tif", "2000_hex3.tif")
fn_y2010 <- c("2010_hex1.tif", "2010_hex2.tif", "2010_hex3.tif")
lst <- Map(
function(x1, x2) {
hex1 <- raster(x1)
hex2 <- raster(x2)
tbl <- table(values(hex1), values(hex2))
#... Normalise and write output
},
fn_y2000, fn_y2010)
The return object is a list.
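To fill in the "normalise and write output" step, here is a minimal sketch that runs without the raster package. It assumes the cell values have already been extracted; the simulated vectors in lc_2000 and lc_2010 stand in for values(raster(<file>)), and classes and out_dir are placeholder names.

```r
# Simulated land-cover class values for two hexagons in 2000 and 2010;
# in the real workflow these would come from values(raster(<file>))
set.seed(42)
classes <- c(1, 2, 3)
lc_2000 <- list(hex1 = sample(classes, 200, replace = TRUE),
                hex2 = sample(classes, 200, replace = TRUE))
lc_2010 <- list(hex1 = sample(classes, 200, replace = TRUE),
                hex2 = sample(classes, 200, replace = TRUE))

out_dir <- tempdir()  # placeholder output folder

trans_list <- Map(function(v00, v10, id) {
  # Fixing the factor levels keeps every table the same shape
  tbl <- table(lc00 = factor(v00, levels = classes),
               lc10 = factor(v10, levels = classes))
  # Row-normalise: each row shows how a 2000 class is distributed in 2010
  trans <- prop.table(tbl, margin = 1)
  # unclass() turns the table into a plain matrix before writing
  write.csv(unclass(trans), file.path(out_dir, paste0(id, "Trans.csv")))
  trans
}, lc_2000, lc_2010, names(lc_2000))
```

prop.table(tbl, margin = 1) does the same job as dividing by rowSums(), and passing names(lc_2000) as a third argument gives each output file its hexagon name.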
Maybe something like the following will do what the question asks for.
It is a repeated use of lapply to read in the data files and table the required columns.
hexnames <- list.files(pattern = "2000_hex\\d+\\.tif")
hex_list <- lapply(hexnames, raster)
names(hex_list) <- paste0("hex", seq_along(hex_list), "_2000")
hex_table <- lapply(hex_list, function(X) table(X[, c('lc00','lc10')]))
Very simple solution, try assign(). This code is from Data Camp's documentation page.
for(i in 1:6) {
#-- Create objects 'r.1', 'r.2', ... 'r.6' --
nam <- paste("r", i, sep = ".")
assign(nam, 1:i)
}
ls(pattern = "^r..$")
Here is the link to the page. Look at the 'Examples' section. rdocumentation.org/packages/base/versions/3.6.1/topics/assign
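Objects created with assign() have to be looked up by name again later. A short sketch of reading them back with get() and mget(), reusing the r.1 ... r.6 names from the example (r_names, r_list and r_lengths are names chosen here for illustration):

```r
# Create r.1 ... r.6 in the current environment, as in the example above
for (i in 1:6) {
  assign(paste("r", i, sep = "."), 1:i)
}

# get() fetches a single object by its name
third <- get("r.3")

# mget() collects several objects into one named list
r_names <- paste("r", 1:6, sep = ".")
r_list <- mget(r_names)

# Once in a list, the objects can be processed together
r_lengths <- vapply(r_list, length, integer(1))
```

Once the objects are in a list, lapply() or Map() can be applied to all of them at once, which is what the other answers do from the start.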

R - XGBoost: Error building DMatrix

I am having trouble using the XGBoost in R.
I am reading a CSV file with my data:
get_data = function()
{
#Loading Data
path = "dados_eye.csv"
data = read.csv(path)
#Dividing into two groups
train_porcentage = 0.05
train_lines = nrow(data)*train_porcentage
train = data[1:train_lines,]
test = data[train_lines:nrow(data),]
rownames(train) = c(1:nrow(train))
rownames(test) = c(1:nrow(test))
return (list("test" = test, "train" = train))
}
This function is called by main.R:
lista_dados = get_data()
#machine = train_svm(lista_dados$train)
#machine = train_rf(lista_dados$train)
machine = train_xgt(lista_dados$train)
The problem is here in the train_xgt
train_xgt = function(train_data)
{
data_train = data.frame(train_data[,1:14])
label_train = data.frame(factor(train_data[,15]))
print(is.data.frame(data_train))
print(is.data.frame(label_train))
dtrain = xgb.DMatrix(data_train, label=label_train)
machine = xgboost(dtrain, num_class = 4 ,max.depth = 2,
eta = 1, nround = 2,nthread = 2,
objective = "binary:logistic")
return (machine)
}
This is the Error:
becchi@ubuntu:~/Documents/EEG_DATA/Dados_Eye$ Rscript main.R
[1] TRUE
[1] TRUE
Error in xgb.DMatrix(data_train, label = label_train) :
  xgb.DMatrix: does not support to construct from list
Calls: train_xgt -> xgb.DMatrix
Execution halted
becchi@ubuntu:~/Documents/EEG_DATA/Dados_Eye$
As you can see, they are both data frames.
I don't know what I am doing wrong, please help!
Just convert the data frame to a matrix first using as.matrix() and then pass it to xgb.DMatrix().
Check that all columns hold numeric data. I think this could be because some column is stored as a factor or character, which cannot be converted to a numeric matrix. If you have factor variables, you can use one-hot encoding to convert them into dummy variables.
Try:
dtrain = xgb.DMatrix(as.matrix(sapply(data_train, as.numeric)), label=label_train)
instead of just:
dtrain = xgb.DMatrix(data_train, label=label_train)
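The conversions suggested in these answers can be checked in base R before xgboost is involved. This is a minimal sketch on a made-up data frame (the columns f1, f2, f3 and class are hypothetical, not the columns of dados_eye.csv). It also turns the label into a plain numeric vector, since xgboost expects a numeric label vector rather than a data frame.

```r
# Hypothetical training data: two numeric features, one factor feature,
# and a factor class label
train_data <- data.frame(
  f1 = c(0.1, 0.4, 0.3, 0.9),
  f2 = c(10, 12, 9, 15),
  f3 = factor(c("a", "b", "a", "c")),
  class = factor(c("open", "closed", "open", "closed"))
)

# One-hot encode the factor feature; "- 1" drops the intercept column
feature_matrix <- model.matrix(~ . - 1, data = train_data[, c("f1", "f2", "f3")])

# Convert the factor label to a 0-based numeric vector
label_vector <- as.numeric(train_data$class) - 1

# Both checks should hold before the data goes to xgboost
is.matrix(feature_matrix) && is.numeric(feature_matrix)
is.numeric(label_vector)

# With the xgboost package installed, these would then be passed as:
# dtrain <- xgb.DMatrix(feature_matrix, label = label_vector)
```

model.matrix() is one way to do the one-hot encoding mentioned above. sapply(data_train, as.numeric) also yields a numeric matrix, but it replaces factor levels with their integer codes instead of dummy columns.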

Unused arguments in R error

I am new to R and am trying to run an example given in the "rebmix-help pdf". It uses the galaxy dataset; here is the code:
library(rebmix)
devAskNewPage(ask = TRUE)
data("galaxy")
write.table(galaxy, file = "galaxy.txt", sep = "\t",eol = "\n", row.names = FALSE, col.names = FALSE)
REBMIX <- array(list(NULL), c(3, 3, 3))
Table <- NULL
Preprocessing <- c("histogram", "Parzen window", "k-nearest neighbour")
InformationCriterion <- c("AIC", "BIC", "CLC")
pdf <- c("normal", "lognormal", "Weibull")
K <- list(7:20, 7:20, 2:10)
for (i in 1:3) {
for (j in 1:3) {
for (k in 1:3) {
REBMIX[[i, j, k]] <- REBMIX(Dataset = "galaxy.txt",
Preprocessing = Preprocessing[k], D = 0.0025,
cmax = 12, InformationCriterion = InformationCriterion[j],
pdf = pdf[i], K = K[[k]])
if (is.null(Table))
Table <- REBMIX[[i, j, k]]$summary
else Table <- merge(Table, REBMIX[[i, j,k]]$summary, all = TRUE, sort = FALSE)
}
}
}
It gives me this error:
unused argument (InformationCriterion = InformationCriterion[j])
Please help.
I'm running R 3.0.2 (Windows), and in the rebmix library the REBMIX function has no named argument InformationCriterion; the argument is called Criterion.
In brief, invoke REBMIX as:
REBMIX[[i, j, k]] <- REBMIX(Dataset = "galaxy.txt",
Preprocessing = Preprocessing[k], D = 0.0025,
cmax = 12, Criterion = InformationCriterion[j],
pdf = pdf[i], K = K[[k]])
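One way to see which argument names a function accepts is formals(). Here is a minimal sketch with a made-up function, fit_model, standing in for a package function whose argument was renamed; it is not the real REBMIX signature.

```r
# Hypothetical function standing in for a package function whose
# argument is called `Criterion` in the installed release
fit_model <- function(Dataset, Criterion = "AIC") {
  paste("fitting", Dataset, "with", Criterion)
}

# Calling it with the old argument name raises an "unused argument" error
old_call <- tryCatch(
  fit_model(Dataset = "galaxy.txt", InformationCriterion = "BIC"),
  error = function(e) conditionMessage(e)
)

# formals() lists the argument names the function actually accepts
accepted_args <- names(formals(fit_model))

# The call works once the accepted name is used
new_call <- fit_model(Dataset = "galaxy.txt", Criterion = "BIC")
```

For a real package function, args() or the help page (?REBMIX) gives the same information. For S4 generics the help page is likely the more reliable place to look, because formals() may only show the generic's own arguments.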
It looks as though there have been substantial changes to the rebmix package since the example mentioned in the OP was created. Among the most noticeable changes is the use of S4 classes.
There's also an updated demo in the rebmix package using the galaxy data (see demo("rebmix.galaxy"))
To get the above example to produce results (note: I am not familiar with this package or the rebmix algorithm!):
Change the argument to Criterion as mentioned by @Giupo
Use the S4 slot access operator @ instead of $
Don't name the results object REBMIX because that's already the function name
library(rebmix)
data("galaxy")
## Don't re-name the REBMIX object!
myREBMIX <- array(list(NULL), c(3, 3, 3))
Table <- NULL
Preprocessing <- c("histogram", "Parzen window", "k-nearest neighbour")
InformationCriterion <- c("AIC", "BIC", "CLC")
pdf <- c("normal", "lognormal", "Weibull")
K <- list(7:20, 7:20, 2:10)
for (i in 1:3) {
for (j in 1:3) {
for (k in 1:3) {
myREBMIX[[i, j, k]] <- REBMIX(Dataset = list(galaxy),
Preprocessing = Preprocessing[k], D = 0.0025,
cmax = 12, Criterion = InformationCriterion[j],
pdf = pdf[i], K = K[[k]])
if (is.null(Table)) {
Table <- myREBMIX[[i, j, k]]@summary
} else {
Table <- merge(Table, myREBMIX[[i, j,k]]@summary, all = TRUE, sort = FALSE)
}
}
}
}
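The difference between slot access and $ on S4 objects can be seen with a toy class. This sketch uses only the methods package that ships with R. The class FitResult and its summary slot are invented for the illustration and are not the actual rebmix class.

```r
library(methods)

# Hypothetical S4 class with a `summary` slot, standing in for the
# object returned by REBMIX()
setClass("FitResult", representation(summary = "data.frame"))

fit <- new("FitResult", summary = data.frame(Criterion = "AIC", c = 3))

# Slots are read with @ (or slot()), not with $
via_at <- fit@summary
via_slot <- slot(fit, "summary")

# $ is not defined for this S4 class, so it raises an error
via_dollar <- tryCatch(fit$summary, error = function(e) "error")

# slotNames() lists the slots an object has
slotNames(fit)
```

isVirtualClass() aside, slotNames() and isS4() are usually enough to find out whether an object returned by a package needs @ rather than $.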
I guess this is late, but I ran into a similar problem just a few minutes ago, and it showed me what is usually behind this kind of error message: a version conflict.
You may be using a different version of the R package than the tutorial does, so argument names can differ between the code you are running and the code the tutorial was written for.
So check the version first before you try to edit anything manually. It can also happen that an old version of the package is still on the library path and overrides the new one. That was exactly my situation, since I had manually installed the old and new versions separately.
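A quick way to check which version is installed and where it is loaded from, sketched with base R utilities. The package "stats" is used only because it ships with every R installation; substitute the package in question, e.g. "rebmix".

```r
# Version of an installed package
pkg_version <- packageVersion("stats")

# Library folders R searches, in order; the first match wins
lib_paths <- .libPaths()

# Folder the package is actually found in
pkg_location <- find.package("stats")

# Versions can be compared directly against a string
is_recent <- pkg_version >= "3.0.0"
```

If pkg_location points into a library folder you did not expect, an older copy earlier in .libPaths() may be shadowing the newer one.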
