Reading .h5ad file in R using Convert

I'm trying to read a .h5ad file into RStudio.
I first converted the .h5ad file to a .h5Seurat file using the Convert() function from the SeuratDisk package.
Here is the code for my attempt:
> library(Seurat)
> library(SeuratDisk)
> Convert("train.h5ad", "train.h5Seurat")
Warning: Unknown file type: h5ad
Warning: 'assay' not set, setting to 'RNA'
Creating h5Seurat file for version 3.1.5.9900
Adding X as data
Adding X as counts
Adding meta.features from var
Adding X_Compartment_tSNE as cell embeddings for Compartment_tSNE
Adding X_tSNE as cell embeddings for tSNE
Adding layer counts as data in assay counts
Adding layer counts as counts in assay counts
> train_seurat <- LoadH5Seurat("train.h5Seurat")
Validating h5Seurat file
Error: Ambiguous assays
The data which I'm trying to read can be found here: https://drive.google.com/drive/folders/1cXYoKNU9qY0f1bbYNh2uykWG6juVJln7
To add, I tried:
> train_seurat <- LoadH5Seurat("train.h5Seurat", assays = "RNA")
But I faced the same issue. I'm hoping for a quick solution.

Try the anndata library, but note that the data type won't be a Seurat object as you would want; it will be an AnnData class object.
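For example, here is a minimal sketch using the anndata R package (assuming it is installed and can reach a Python anndata installation through reticulate):
library(anndata)

# read_h5ad() returns an AnnData object, not a Seurat object
train_ad <- read_h5ad("train.h5ad")

# The pieces you would normally look for in a Seurat object:
train_ad$X       # expression matrix (cells x genes)
train_ad$obs     # cell-level metadata
train_ad$var     # gene-level metadata
train_ad$obsm    # embeddings such as X_tSNE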

Related

Error while using colSums with tab-delimited file

I'm new to R and I'm currently trying to get some statistical data from a file. It is a large data set in a tab-delimited .txt file. Importing the file went without problems and all of the data is shown correctly as a table in RStudio. However, when I try to make any sort of calculation using colSums, I receive an error:
> colSums("Wages and salaries")
Error in colSums("Wages and salaries") : 'x' must be an array of at
least two dimensions
"Wages and salaries" is the name of the column I'm trying to get the sum of.
Using V1 or any other column name that was created by R gives me another error:
> colSums(V2)
Error in is.data.frame(x) : object 'V2' not found
The way I'm importing the file is
rm(list=ls())
filename <- read.delim("~/filename.txt", header=FALSE)
> is.data.frame(filename)
[1] TRUE
This gives me a matrix-like data table with rows and columns, the same way Excel would show the data.
The reason I'm trying to get the sum of all of the numbers in a column is so that I can later get the sums of several different columns.
I'm very new to R and I could not find an answer to my question, as most of the examples use just a very small data set created directly in R.
In R you can access a column in two ways:
filename["Wages and salaries"]
or
filename$`Wages and salaries`
So, please try:
colSums(filename["Wages and salaries"])
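Note that with header=FALSE the columns are only named V1, V2, ..., so "Wages and salaries" only exists as a column name if the file's header row is actually read. A minimal sketch, assuming the file does have a header row (the second column name below is made up for illustration):
# Re-read with the header so the real column names are kept
filename <- read.delim("~/filename.txt", header = TRUE, check.names = FALSE)

# Sum of one column
colSums(filename["Wages and salaries"])

# Sums of several columns at once (replace with your actual column names)
colSums(filename[c("Wages and salaries", "Another column")])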

Difficulty opening a package data file of unknown type

I am trying to load the state map from the maps package into an R object. I am hoping it is a SpatialPolygonsDataFrame or something I can turn into one after I have inspected it. However I am failing at the first step – getting it into an R object. I do not know the file type.
I first tried to assign the map() output to an R object directly:
st_m <- maps::map(database = "state")
This draws the map, but str(st_m) appears to do nothing, unless it is redrawing the same map.
Then I tried loading it as a dataset: st_m <- data("stateMapEnv", package="maps") but this just returns a string:
> str(stateMapEnv)
chr "R_MAP_DATA_DIR"
I opened the maps directory win-library/3.4/maps/mapdata/ and found what I think is the map file, “state.L”.
I tried reading it with scan and got an error message I do not understand:
scan(file = "D:/Documents/R/win-library/3.4/maps/mapdata/state.L")
Error in scan(file = "D:/Documents/R/win-library/3.4/maps/mapdata/state.L") :
scan() expected 'a real', got '#'
I then opened the file with Notepad++. It appears to be a binary or compressed file.
So I thought it might be an R data file with an unusual extension. But my attempt to load it returned a “bad magic number” error:
st_m <- load("D:/Documents/R/win-library/3.4/maps/mapdata/state.L")
Error in load("D:/Documents/R/win-library/3.4/maps/mapdata/state.L") :
bad restore file magic number (file may be corrupted) -- no data loaded
Observing that these responses have progressed from the unhelpful through the incomprehensible to the occult, I thought it best to seek assistance from the wizards of Stack Overflow.
This should be able to export the 'state' or any other maps dataset for you:
library(ggplot2)
state_dataset <- map_data("state")
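If you want to inspect the underlying data rather than draw it, here is a small sketch (map() has a plot = FALSE argument, and map_data() returns a plain data frame with long, lat, group, order, region and subregion columns):
library(maps)
library(ggplot2)

# Ask map() not to plot; it then returns a "map" object
# (lists of x/y coordinates plus polygon names)
st_m <- maps::map(database = "state", plot = FALSE, fill = TRUE)
str(st_m)

# map_data() flattens the same data into a data frame suitable for ggplot2
state_dataset <- map_data("state")
head(state_dataset)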

cummeRbund Create Gene Set Error

I am having trouble creating a Gene Set using cummeRbund (R software used to analyze cufflinks, cuffdiff output).
I have been working from the cummeRbund manual, which can be found here. The directions have worked up until the point of creating the gene sets.
Before creating the gene sets you need to create a vector of gene_ids to include. In the example they enclose each item in this list in quotation marks. I have created a gene_ids .txt file named OtoSCOPE_v7_list_oneline.txt; the first 4 entries in this list are shown below.
“Adcy1” “Bdp1” “Bsnd” “Cabp2”
Here is the gene set creation portion of the script that I have been using.
###################################
# Creating Gene Sets
###################################
# first, create a vector of the gene_ids that you want included in your gene set
base_dir <- "/Users/paulranum/Documents/cummeRbund"
otoscope_genes <- read.table(file.path(base_dir, "OtoSCOPE_v7_list_oneline.txt"), stringsAsFactors=FALSE)
data(cuff)
myGeneIds<-otoscope_genes
myGeneIds
myGenes<-getGenes(cuff, myGeneIds)
myGenes
When I run this I get the following output and errors.
> data(cuff)
Warning message:
In data(cuff) : data set 'cuff' not found
> myGeneIds<-otoscope_genes
> myGeneIds
V1 V2 V3 V4
1 “Adcy1” “Bdp1” “Bsnd” “Cabp2”
> myGenes<-getGenes(cuff, myGeneIds)
Error in rsqlite_send_query(conn@ptr, statement) :
cannot start a transaction within a transaction
> myGenes
Error: object 'myGenes' not found
From what I can tell, there are two main issues going on:
1. It is not recognizing my data(cuff) command. cuff is the name of my CuffSet data file, and this file has worked for everything else. Is this not the correct data file?
2. The error after the myGenes<-getGenes(cuff, myGeneIds) command:
Error in rsqlite_send_query(conn@ptr, statement) : cannot start a
transaction within a transaction
Thanks for reading; any help would be very much appreciated.
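Not a confirmed fix, but for reference, here is a hedged sketch of reading the IDs as a plain character vector instead of a one-row data frame and stripping the curly quotes that are literally in the file (my assumption being that getGenes() expects a character vector of gene IDs):
base_dir <- "/Users/paulranum/Documents/cummeRbund"

# Read the IDs as character data rather than as a data frame
raw_ids <- scan(file.path(base_dir, "OtoSCOPE_v7_list_oneline.txt"),
                what = character(), quiet = TRUE)

# Remove the curly (and straight) quotation marks around each ID
myGeneIds <- gsub("[\u201c\u201d\"]", "", raw_ids)
myGeneIds          # "Adcy1" "Bdp1" "Bsnd" "Cabp2" ...

myGenes <- getGenes(cuff, myGeneIds)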

Reading circular read mapping (BAM) into R with readGAlignments()

I am trying to create a circular genome map in R using the ggbio package. I am new to ggbio and related packages like GenomicAlignments and GenomicRanges.
I exported my read mapping as a BAM file (with associated index file) and tried to use readGAlignmentsFromBam() to read in the file.
myreads <- readGAlignmentsFromBam("final.assembly", index = "final.assembly", use.names = TRUE)
But I always get
Warning message:
In GenomicRanges:::valid.GenomicRanges.seqinfo(x) :
GAlignments object contains 12006 out-of-bound ranges located on sequence
Consensus. Note that only ranges located on a non-circular sequence whose
length is not NA can be considered out-of-bound (use seqlengths() and
isCircular() to get the lengths and circularity flags of the underlying
sequences).
Which makes sense - it's a circular chromosome, so some reads will be outside of the "linear" reference sequence. The question is, how do I fix it? I've attempted adding isCircular = c(TRUE) as an argument, but that did not help. It would seem there is a flag somewhere (in the BAM file? in the R code?) that should be set which isn't, but I can't figure out where.
Apologies for not having a reproducible example, but this is a huge BAM file and I am not familiar enough with the file type to mock up the data.
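One possibility, offered only as a sketch (not tested against this file, and the BAM file name is assumed): set the circularity flag on the object's Seqinfo after reading, using the GenomeInfoDb accessors that the warning itself points to:
library(GenomicAlignments)
library(GenomeInfoDb)

# Read the alignments (readGAlignments() is the current name
# for readGAlignmentsFromBam(); file name assumed here)
myreads <- readGAlignments("final.assembly.bam", use.names = TRUE)

# Inspect the sequence metadata the warning refers to
seqinfo(myreads)

# Flag the reference sequence(s) as circular
isCircular(myreads) <- rep(TRUE, length(seqlevels(myreads)))
isCircular(myreads)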

How can I bin data in bigvis package R for a non-numeric data set?

I am trying to use bin() on my big data set. I am using the Lahman data set as an example: http://www.seanlahman.com/baseball-archive/statistics/
I am using the comma-delimited version and looking at the 'Batting' csv file. My data set will be much, much bigger, but if my program cannot handle this, it can't handle my bigger data set.
This is what I am trying to do currently:
> require(devtools)
> require(bigvis)
> bigData <- read.csv("GCdataViz/lahman2012-csv/Batting.csv")
> bigDataNum <- bigData[,sapply(bigData,is.numeric)]
> bin(bigDataNum)
Error: is.numeric(x) is not TRUE
I first got an error when I tried to use bin() because my data set wasn't all numeric. So I used sapply() with is.numeric to keep only the numeric columns, but I still get the error that my data set isn't numeric.
The bigvis library doesn't have much documentation. Should I smooth() after the condense() step, or go straight to autoplot()? Is there any way I can specify plot types like bar graphs, line graphs, box plots, etc.?
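For what it's worth, a hedged sketch of how the bigvis pipeline is usually strung together (my reading of the package README: bin() works on a single numeric vector, not a whole data frame, and condense() then counts the bins; H is the hits column in the Lahman Batting table):
require(bigvis)

bigData <- read.csv("GCdataViz/lahman2012-csv/Batting.csv")

# bin() takes one numeric vector at a time, so pick a column
hits <- bigData$H

# bin + condense, then optionally smooth, then autoplot
binned   <- bin(hits, width = 10)
counts   <- condense(binned)
smoothed <- smooth(counts, h = 20)
autoplot(smoothed)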
EDIT: my error:
