I'm trying to extract the data matrix computed by the Signac R package when running RegionMatrix(). I know I can plot the data with RegionHeatmap(), but I want to plot it with another package.
So, if I created a region matrix like this:
my_object <- RegionMatrix(
  my_object,
  key = "my_region_matrix",
  regions = StringToGRanges(top_cluster_genes$gene),
  upstream = 2500,
  downstream = 2500
)
How do I go about extracting the data stored within the object under the key "my_region_matrix"? I know it's a pretty basic question, but I've been browsing the returned object and can't find any assay/matrix with that key name. (I'm putting this under Seurat because apparently there is no tag for Signac and the object is a Seurat object anyway)
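Not a full answer, but here is where I would look first: as far as I can tell, RegionMatrix() stores its result in the positionEnrichment slot of the ChromatinAssay, keyed by the key argument. The assay name ("peaks") below is an assumption; check DefaultAssay(my_object) for yours, and use slotNames() if the slot name differs in your Signac version.
library(Signac)
library(Seurat)

# Assumption: the ChromatinAssay is called "peaks"; use DefaultAssay(my_object)
# to find the actual name in your object.
chrom_assay <- my_object[["peaks"]]

# Inspect the assay's slots to confirm where the region matrix was stored
slotNames(chrom_assay)

# Assumed location: a named list of matrices in the positionEnrichment slot,
# keyed by the `key` argument given to RegionMatrix()
region_mat <- chrom_assay@positionEnrichment[["my_region_matrix"]]
str(region_mat, max.level = 1)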
I am trying to create a phylogenetic correlogram from my data using phyloCorrelogram() from the phylosignal package, in order to test for the presence of a phylogenetic signal. My data is in phylo4d format and is called tree.
Now, when I run phyloCorrelogram() on tree, I get the following error:
library("phylobase")
library("ape")
library("phylosignal") # contains phyloCorrelogram()
> phyloCorrelogram(tree, ci.bs = 10, n.points = 10)
Error in boot::boot(X, function(x, z) moranTest(xr = x[z], Wr = prop.table(Wi[z, :
no data in call to 'boot'
I have already searched extensively online for ways to solve this issue, but without success.
Since the data I am using is too large to post here, I have uploaded it in .rda format to Dropbox.
Does anyone know what is going wrong?
You have no data associated with your tree. Your tree is a phylo4d object that contains the tree structure but no trait data attached to it. You need something like this:
library(phylobase)
library(phylosignal)                     # for phyloCorrelogram()
g1 <- as(geospiza_raw$tree, "phylo4")    # the tree alone
geodata <- geospiza_raw$data             # trait data to attach
g2 <- phylo4d(g1, geodata)               # tree + data combined
pc <- phyloCorrelogram(g2)
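Applied to the original tree object, a couple of phylobase accessors confirm whether any trait data is actually attached (a small check; hasTipData() and tdata() come from phylobase):
hasTipData(tree)                  # TRUE only if trait data is attached to the tips
head(tdata(tree, type = "tip"))   # the tip trait table; empty here, hence the boot error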
I'm trying to use DESeq2's plotPCA() function in a meta-analysis of data.
Most of the files I have received are raw counts, pre-normalization. I'm running DESeq2 to normalize them and then running plotPCA().
One of the files I received does not have raw counts or even the FASTQ files, just the data that has already been normalized by DESeq2.
How could I go about importing this data (non-integers) as a DESeqDataSet object after it has already been normalized?
The consensus in the vignettes and other discussions seems to be that a DESeqDataSet can only be constructed from a matrix of integer counts.
I was mostly concerned with getting the format the same between plots. Ultimately, I just used a workaround to get the plots looking the same via ggfortify.
If anyone is curious, I just ended up doing this. Note that the "names" file is organized like the metadata file passed as colData when building a DESeqDataSet with DESeqDataSetFromMatrix(), but I changed the name of the design column from "conditions" to "group" so it would match the output of plotPCA(). The plots should look identical.
library(ggfortify)
data <- read.csv("COUNTS.csv", sep = ",", header = TRUE, row.names = 1)  # normalized counts, genes x samples
names <- read.csv("NAMES.csv")                                           # sample metadata with a "group" column
PCA <- prcomp(t(data))       # transpose so samples are rows, as prcomp() expects
autoplot(PCA, data = names, colour = "group", size = 3)
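For the files that do come with raw counts, the DESeq2-native route would look roughly like the sketch below, which also lets plotPCA() return its underlying data frame so the figure can be rebuilt with ggplot2/ggfortify to match the others. The file names, the "group" column, and the blind setting are assumptions, not part of the original post.
library(DESeq2)

# Assumed inputs: integer raw counts and matching sample metadata
counts <- as.matrix(read.csv("RAW_COUNTS.csv", row.names = 1))
meta   <- read.csv("NAMES.csv", row.names = 1)

dds <- DESeqDataSetFromMatrix(countData = counts, colData = meta, design = ~ group)

# Variance-stabilizing transform, then PCA on the transformed values
vsd <- vst(dds, blind = TRUE)

# returnData = TRUE returns the PC coordinates instead of drawing the plot
pca_df <- plotPCA(vsd, intgroup = "group", returnData = TRUE)
head(pca_df)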
I have a stack of Landsat raster data and I want to extract its values at a set of SpatialPoints in R, then plot the extracted values against the associated variables in the point data, and finally export the extracted values along with the point attributes. I have used the extract() function to do this, but after extracting I get errors almost every time; when it does work, it only gives me a matrix/data frame that I can't match back to the observation points.
My script:
# Raster stack of the Landsat bands
lsat <- stack(b1, b2, b3, b4, b5, b6_1, b6_2, b7)
# SpatialPoints built from the soil observation coordinates
soil_sp <- SpatialPoints(cbind(soil.clean2$x, soil.clean2$y))
# Extract the values from the stack at each point
soil_sp$ref <- extract(lsat, soil_sp)
# Plot the extracted values against the observed values for each band
plot(soil_sp$ref ~ ., data = soil_sp)
Finally, I want to export the extracted values together with the point attributes as a single data frame or SpatialPointsDataFrame.
The solution was to use a SpatialPointsDataFrame to extract the values of the raster stack, and then write the result to a CSV file (or whatever format you like).
Implementation:
# Create a SpatialPointsDataFrame out of the SpatialPoints and the soil data
soil_spdf <- SpatialPointsDataFrame(coords = soil_sp, data = soil.clean2,
                                    proj4string = soil_crs)
# Extract the values of the raster stack at each point
soil_spdf$ref <- extract(lsat, soil_spdf)
# Write the data to a CSV file in the desired directory on your PC
write.csv(x = soil_spdf, file = "C:/lsat2.csv")
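Because extract() on a multi-layer stack returns one column per band, it can be cleaner to bind those columns onto the attribute table rather than storing the whole matrix in a single $ref column. A small sketch, reusing lsat and soil_spdf from above (the output file name is just an example):
# extract() on a RasterStack returns a matrix with one column per band
band_vals <- extract(lsat, soil_spdf)

# Bind the band values onto the point attributes
soil_spdf@data <- cbind(soil_spdf@data, band_vals)

# as.data.frame() on a SpatialPointsDataFrame should also append the
# coordinate columns, so everything ends up in one exportable table
out <- as.data.frame(soil_spdf)
write.csv(out, "C:/lsat2_bands.csv", row.names = FALSE)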
I would like to extract the returns from the backtest package which, according to the manual, are stored within a five-dimensional array called 'results'.
This is the backtest package:
https://cran.r-project.org/web/packages/backtest/backtest.pdf
A simple example looks like this:
library(backtest)
data(starmine)
bt <- backtest(starmine, in.var = c("smi"),
               ret.var = "ret.0.1.m", date.var = "date",
               id.var = "id", buckets = 10,
               natural = TRUE, by.period = TRUE)
summary(bt)
When you run the summary command, it will print out the return series for each decile. I would like to extract those into a dataframe that I can use for further analysis.
Does someone know, how I can access the return series or extract it?
The bt object has class backtest (which we can see from class(bt)). The summary() function has a method defined for backtest objects that only prints the information to the screen. If you try to assign that information via stuff <- summary(bt), the stuff object will be NULL. To access the data that summary(bt) prints to the screen, you should use the accessor functions created for that class (they are described in ?'backtest-class'). These functions include:
means()
counts()
summary()
marginals()
summaryStats()
turnover()
In order to access the data frame of summary statistics by month printed as the side effect of summary(bt), you can run summaryStats(bt). Please see pages 5-8 of the backtest help files for more information.
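For example (a small sketch; I'm assuming the object returned by summaryStats() coerces cleanly to a data frame with as.data.frame()):
# Per-period summary statistics, including the returns printed by summary(bt)
stats <- summaryStats(bt)
class(stats)

# Coerce to a plain data frame for further analysis
ret_df <- as.data.frame(stats)
head(ret_df)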
Using leaflet, I'm trying to plot some lines and set their color based on a 'speed' variable. My data start as encoded polylines (i.e. a series of lat/long points encoded as an alphanumeric string), with a single speed value for each polyline.
I'm able to decode the polylines to get series of lat/long points (thanks to Max, here), and I'm able to create segments from those series of points and format them as a SpatialLines object (thanks to Kyle Walker, here).
My problem: I can plot the lines properly using leaflet, but I can't join the SpatialLines object to the base data to create a SpatialLinesDataFrame, and so I can't code the line color based on the speed var. I suspect the issue is that the IDs I'm assigning SL segments aren't matching to those present in the base df.
The objects I've tried to join, with SpatialLinesDataFrame():
"sl_object", a SpatialLines object with ~140 observations, one for each segment; I'm using Kyle's code, linked above, with one key change - instead of creating an arbitrary iterative ID value for each segment, I'm pulling the associated ID from my base data. (Or at least I'm trying to.) So, I've replaced:
id <- paste0("line", as.character(p))
with
lguy <- data.frame(paths[[p]][1])
id <- unique(lguy[,1])
"speed_object", a df with ~140 observations of a single speed var and row.names set to the same id var that I thought I created in the SL object above. (The number of observations will never exceed but may be smaller than the number of segments in the SL object.)
My joining code:
splndf <- SpatialLinesDataFrame(sl = sl_object, data = speed_object)
And the result:
row.names of data and Lines IDs do not match
Thanks, all. I'm posting this in part because I've seen some similar questions - including some referring specifically to changing the ID output of Kyle's great tool - and haven't been able to find a good answer.
EDIT: Including data samples.
From sl_obj, a single segment:
print(sl_obj)
Slot "ID":
[1] "4763655"
[[151]]
An object of class "Lines"
Slot "Lines":
[[1]]
An object of class "Line"
Slot "coords":
lon lat
1955 -74.05228 40.60397
1956 -74.05021 40.60465
1957 -74.04182 40.60737
1958 -74.03997 40.60795
1959 -74.03919 40.60821
And the corresponding record from speed_obj:
row.names speed
... ...
4763657 44.74
4763655 34.8 # this one matches the ID above
4616250 57.79
... ...
To get rid of this error message, either make the row.names of data and Lines IDs match by preparing sl_object and/or speed_object, or, in case you are certain that they should be matched in the order they appear, use
splndf <- SpatialLinesDataFrame(sl = sl_object, data = speed_object, match.ID = FALSE)
This is documented in ?SpatialLinesDataFrame.
All right, I figured it out. The error was complaining that my speed_obj wasn't the same length as my sl_obj, as mentioned here ("data: object of class data.frame; the number of rows in data should equal the number of Lines elements in sl").
Resolution: I used a quick loop to pull out all of the unique Lines IDs, then performed a left join against that list of unique IDs to create an exhaustive speed_obj (with NAs, which seem to be OK).
library(plyr)  # for join()

# Pull the ID out of each Lines element of the SpatialLines object
ids <- data.frame()
for (i in 1:length(sl_obj)) {
  id <- data.frame(sl_obj@lines[[i]]@ID)
  ids <- rbind(ids, id)
}
colnames(ids)[1] <- "linkId"

# Left join so every segment gets a speed value (NA where missing)
speed_full <- join(ids, speed_obj)

# Drop the ID column and move it into row.names, as SpatialLinesDataFrame() expects
speed_full_short <- data.frame(speed = speed_full[, -1])
row.names(speed_full_short) <- speed_full$linkId

splndf <- SpatialLinesDataFrame(sl_obj, data = speed_full_short, match.ID = TRUE)
Works fine now!
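With the SpatialLinesDataFrame built, coloring the lines by speed in leaflet then looks something like this (a sketch; the palette choice, and the assumption that the speed column is named speed, are mine):
library(leaflet)

# Map speed values to colors; the palette choice is arbitrary
pal <- colorNumeric(palette = "RdYlGn", domain = splndf$speed)

leaflet(splndf) %>%
  addTiles() %>%
  addPolylines(color = ~pal(speed), weight = 3, opacity = 0.8)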
I may have deciphered the issue.
When I pull in my spatial lines data and check the class, it reads as "SpatialLinesDataFrame" even though I know it's a simple line shapefile. I'm using readOGR() to bring the data in, and I believe this is where the conversion occurs. With that in mind, the speed assignment is relatively easy.
sl_object$speed <- speed_object[match(sl_object$ID, row.names(speed_object)), "speed"]
This should do the trick, as I'm willing to bet your class(sl_object) is "SpatialLinesDataFrame".
EDIT: I had received the same error as the OP, which drove me to check class().
I am under the impression that the error you saw came from trying to coerce a data frame into a data frame, and R wasn't a fan of that.