How do I count treated and untreated in R

I'm trying to learn R again and want to count the total number of genes that are "treated" and "untreated" with dex in the Bioconductor airway dataset (https://bioconductor.org/packages/release/data/experiment/html/airway.html).
I'm trying:
airway$dex=='trted'
#[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
and it's not working.

After installing that package I performed the following actions at my console (including all output):
> library(airway)
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Loading required package: matrixStats
Attaching package: ‘matrixStats’
The following object is masked from ‘package:dplyr’:
count
Attaching package: ‘MatrixGenerics’
The following objects are masked from ‘package:matrixStats’:
colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse, colCounts, colCummaxs, colCummins,
colCumprods, colCumsums, colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs, colMads, colMaxs,
colMeans2, colMedians, colMins, colOrderStats, colProds, colQuantiles, colRanges, colRanks, colSdDiffs,
colSds, colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads, colWeightedMeans,
colWeightedMedians, colWeightedSds, colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods, rowCumsums, rowDiffs, rowIQRDiffs,
rowIQRs, rowLogSumExps, rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins, rowOrderStats,
rowProds, rowQuantiles, rowRanges, rowRanks, rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs,
rowVars, rowWeightedMads, rowWeightedMeans, rowWeightedMedians, rowWeightedSds, rowWeightedVars
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply,
parCapply, parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:bit64’:
match, order, rank
The following objects are masked from ‘package:dplyr’:
combine, intersect, setdiff, union
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind, colnames, dirname, do.call, duplicated, eval,
evalq, Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order,
paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
table, tapply, union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following object is masked from ‘package:Matrix’:
expand
The following objects are masked from ‘package:data.table’:
first, second
The following objects are masked from ‘package:tidygraph’:
active, rename
The following object is masked from ‘package:tidyr’:
expand
The following objects are masked from ‘package:dplyr’:
first, rename
The following object is masked from ‘package:base’:
expand.grid
Loading required package: IRanges
Attaching package: ‘IRanges’
The following object is masked from ‘package:data.table’:
shift
The following object is masked from ‘package:nlme’:
collapse
The following object is masked from ‘package:tidygraph’:
slice
The following object is masked from ‘package:purrr’:
reduce
The following objects are masked from ‘package:dplyr’:
collapse, desc, slice
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Attaching package: ‘Biobase’
The following object is masked from ‘package:MatrixGenerics’:
rowMedians
The following objects are masked from ‘package:matrixStats’:
anyMissing, rowMedians
The following object is masked from ‘package:bit64’:
cache
Attaching package: ‘SummarizedExperiment’
The following object is masked from ‘package:SeuratObject’:
Assays
The following object is masked from ‘package:Seurat’:
Assays
I looked at the help page
> help(pac=airway)
So after reading that I thought the airway dataset might be accessible, but no:
> str(airway)
Error in str(airway) : object 'airway' not found
So I tried loading it with the data function (and no error was reported) so I looked at its structure:
> data(airway)
> str(airway)
Formal class 'RangedSummarizedExperiment' [package "SummarizedExperiment"] with 6 slots
..@ rowRanges :Formal class 'GRangesList' [package "GenomicRanges"] with 3 slots
.. .. ..@ elementMetadata:Formal class 'DataFrame' [package "IRanges"] with 6 slots
.. .. .. .. ..@ rownames : NULL
.. .. .. .. ..@ nrows : int 64102
.. .. .. .. ..@ listData : Named list()
.. .. .. .. ..@ elementType : chr "ANY"
.. .. .. .. ..@ elementMetadata: NULL
.. .. .. .. ..@ metadata : list()
.. .. ..@ elementType : chr "GRanges"
.. .. ..@ metadata :List of 1
.. .. .. ..$ genomeInfo:List of 20
.. .. .. .. ..$ Db type : chr "TranscriptDb"
.. .. .. .. ..$ Supporting package : chr "GenomicFeatures"
.. .. .. .. ..$ Data source : chr "BioMart"
.. .. .. .. ..$ Organism : chr "Homo sapiens"
.. .. .. .. ..$ Resource URL : chr "www.biomart.org:80"
.. .. .. .. ..$ BioMart database : chr "ensembl"
.. .. .. .. ..$ BioMart database version : chr "ENSEMBL GENES 75 (SANGER UK)"
.. .. .. .. ..$ BioMart dataset : chr "hsapiens_gene_ensembl"
.. .. .. .. ..$ BioMart dataset description : chr "Homo sapiens genes (GRCh37.p13)"
.. .. .. .. ..$ BioMart dataset version : chr "GRCh37.p13"
.. .. .. .. ..$ Full dataset : chr "yes"
.. .. .. .. ..$ miRBase build ID : chr NA
.. .. .. .. ..$ transcript_nrow : chr "215647"
.. .. .. .. ..$ exon_nrow : chr "745593"
.. .. .. .. ..$ cds_nrow : chr "537555"
.. .. .. .. ..$ Db created by : chr "GenomicFeatures package from Bioconductor"
.. .. .. .. ..$ Creation time : chr "2014-07-10 14:55:55 -0400 (Thu, 10 Jul 2014)"
.. .. .. .. ..$ GenomicFeatures version at creation time: chr "1.17.9"
.. .. .. .. ..$ RSQLite version at creation time : chr "0.11.4"
.. .. .. .. ..$ DBSCHEMAVERSION : chr "1.0"
..@ colData :Formal class 'DataFrame' [package "IRanges"] with 6 slots
.. .. ..@ rownames : chr [1:8] "SRR1039508" "SRR1039509" "SRR1039512" "SRR1039513" ...
.. .. ..@ nrows : int 8
.. .. ..@ listData :List of 9
.. .. .. ..$ SampleName: Factor w/ 8 levels "GSM1275862","GSM1275863",..: 1 2 3 4 5 6 7 8
.. .. .. ..$ cell : Factor w/ 4 levels "N052611","N061011",..: 4 4 1 1 3 3 2 2
.. .. .. ..$ dex : Factor w/ 2 levels "trt","untrt": 2 1 2 1 2 1 2 1
.. .. .. ..$ albut : Factor w/ 1 level "untrt": 1 1 1 1 1 1 1 1
.. .. .. ..$ Run : Factor w/ 8 levels "SRR1039508","SRR1039509",..: 1 2 3 4 5 6 7 8
.. .. .. ..$ avgLength : int [1:8] 126 126 126 87 120 126 101 98
.. .. .. ..$ Experiment: Factor w/ 8 levels "SRX384345","SRX384346",..: 1 2 3 4 5 6 7 8
.. .. .. ..$ Sample : Factor w/ 8 levels "SRS508567","SRS508568",..: 2 1 3 4 5 6 7 8
.. .. .. ..$ BioSample : Factor w/ 8 levels "SAMN02422669",..: 1 4 6 2 7 3 8 5
.. .. ..@ elementType : chr "ANY"
.. .. ..@ elementMetadata: NULL
.. .. ..@ metadata : list()
..@ assays :Reference class 'ShallowSimpleListAssays' [package "GenomicRanges"] with 1 field
.. ..$ data:Formal class 'SimpleList' [package "IRanges"] with 4 slots
.. .. .. ..@ listData :List of 1
.. .. .. .. ..$ counts: int [1:64102, 1:8] 679 0 467 260 60 0 3251 1433 519 394 ...
.. .. .. ..@ elementType : chr "ANY"
.. .. .. ..@ elementMetadata: NULL
.. .. .. ..@ metadata : list()
.. ..and 12 methods.
..@ NAMES : NULL
..@ elementMetadata:Formal class 'DataFrame' [package "S4Vectors"] with 6 slots
.. .. ..@ rownames : NULL
.. .. ..@ nrows : int 64102
.. .. ..@ listData : Named list()
.. .. ..@ elementType : chr "ANY"
.. .. ..@ elementMetadata: NULL
.. .. ..@ metadata : list()
..@ metadata :List of 1
.. ..$ :Formal class 'MIAME' [package "Biobase"] with 13 slots
.. .. .. ..@ name : chr "Himes BE"
.. .. .. ..@ lab : chr NA
.. .. .. ..@ contact : chr ""
.. .. .. ..@ title : chr "RNA-Seq transcriptome profiling identifies CRISPLD2 as a glucocorticoid responsive gene that modulates cytokine"| __truncated__
.. .. .. ..@ abstract : chr "Asthma is a chronic inflammatory respiratory disease that affects over 300 million people worldwide. Glucocorti"| __truncated__
.. .. .. ..@ url : chr "http://www.ncbi.nlm.nih.gov/pubmed/24926665"
.. .. .. ..@ pubMedIds : chr "24926665"
.. .. .. ..@ samples : list()
.. .. .. ..@ hybridizations : list()
.. .. .. ..@ normControls : list()
.. .. .. ..@ preprocessing : list()
.. .. .. ..@ other : list()
.. .. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. .. .. .. ..@ .Data:List of 2
.. .. .. .. .. .. ..$ : int [1:3] 1 0 0
.. .. .. .. .. .. ..$ : int [1:3] 1 1 0
Scanning through that list of S4 structured data I saw this line:
.. .. .. ..$ dex : Factor w/ 2 levels "trt","untrt": 2 1 2 1 2 1 2 1
So the dex items do have "trt" and "untrt" as values, but that "column" sits somewhat deeper inside the RangedSummarizedExperiment structure. There might be a specific function, that I do not know the name of, to pull values out of such structures, but we now have enough information to answer (or hack together an answer to) the question. Follow the names and operators in that nested list backward to its origin, using the S4 slot-extraction operator "@" where appropriate and "$" where not:
sum(airway@colData@listData$dex == "trt")
#[1] 4
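To see why "@" and "$" chain this way, here is a minimal base R sketch. The classes ToyDataFrame and ToyExperiment are made up for illustration and only mimic the colData/listData nesting reported by str(); they are not the Bioconductor classes. On the real object, the SummarizedExperiment accessor colData(airway)$dex should reach the same column without touching slots directly.

```r
library(methods)

# Hypothetical classes that only mimic the nesting seen in str(airway)
setClass("ToyDataFrame", representation(listData = "list"))
setClass("ToyExperiment", representation(colData = "ToyDataFrame"))

# Same levels and codes as the dex line in the str() output
dex <- factor(c("untrt", "trt", "untrt", "trt", "untrt", "trt", "untrt", "trt"),
              levels = c("trt", "untrt"))
toy <- new("ToyExperiment",
           colData = new("ToyDataFrame", listData = list(dex = dex)))

# "@" walks S4 slots; "$" indexes the ordinary list stored in the last slot
n_treated   <- sum(toy@colData@listData$dex == "trt")
n_untreated <- sum(toy@colData@listData$dex == "untrt")
print(c(treated = n_treated, untreated = n_untreated))
```

Reaching into slots works, but a documented accessor is usually the safer choice because slot layouts can change between package versions.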

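Use the sum() function to count TRUE values. Note that the factor levels are "trt" and "untrt", so compare against 'trt' rather than 'trted':
sum(airway$dex == 'trt')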
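A misspelled level such as 'trted' does not raise an error; the comparison silently returns all FALSE, so the count is 0. The sketch below checks the labels first and counts both groups at once with table(). The dex factor here is rebuilt by hand from the levels and codes shown in the str() output; it is not loaded from the airway package.

```r
# Hand-built stand-in for airway$dex (levels and order taken from the str() output)
dex <- factor(c("untrt", "trt", "untrt", "trt", "untrt", "trt", "untrt", "trt"),
              levels = c("trt", "untrt"))

print(levels(dex))       # inspect the exact labels before comparing

counts <- table(dex)     # one call gives the count for every level
print(counts)

n_typo <- sum(dex == "trted")  # a wrong label gives zero matches, not an error
print(n_typo)
```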

Related

Reading in a .geojson file with geojsonio, geojsonR

I am trying to read a GeoJSON file (https://www.svz-bw.de/fileadmin/verkehrszentrale/RadNETZ-BW_Daten_GeoJSON_2018-20.zip) into R.
I have tried different packages, but my knowledge is too limited to find the errors and solve them. I'm new to spatial data in R, especially to reading the GeoJSON file format.
Googling and searching Stack Overflow hasn't helped.
geojsonR::FROM_geojson("../Sonstiges/RadNETZ.geojson")
Error in unlink(x) : file name conversion problem -- name too long?
geojsonR::FROM_GeoJson("../Sonstiges/RadNETZ.geojson")
Error in export_From_geojson(url_file_string, Flatten_Coords,
Average_Coordinates, : invalid GeoJson geometry object -->
geom_OBJ() function
Your file does not comply with the current GeoJSON standard; it uses a projected coordinate reference system, which goes against RFC 7946 - https://www.rfc-editor.org/rfc/rfc7946#page-12
This may or may not be the reason why GeoJSON-specific packages have a hard time interpreting it.
To process your file I suggest using {sf}, which is able to digest the file via GDAL and PROJ.
library(dplyr)
library(sf)
asdf <- st_read("RadNETZ.geojson") %>%
st_transform(4326) # safety of unprojected CRS
plot(st_geometry(asdf))
As @Jindra Lacko mentioned, your 'RadNETZ.geojson' file does not comply with RFC 7946, which is why you receive the error. If you don't have GDAL installed on your operating system alongside the 'sf' package, you can use either geojsonR::shiny_from_JSON (which does not follow the RFC and is meant to be used in shiny applications),
dat = geojsonR::shiny_from_JSON("../Sonstiges/RadNETZ.geojson")
str(dat)
List of 4
$ crs :List of 2
..$ properties:List of 1
.. ..$ name: chr "urn:ogc:def:crs:EPSG::31467"
..$ type : chr "name"
$ features:List of 70097
..$ :List of 3
.. ..$ geometry :List of 2
.. .. ..$ coordinates:List of 6
.. .. .. ..$ :List of 2
.. .. .. .. ..$ : num 3563993
.. .. .. .. ..$ : num 5353055
.. .. .. ..$ :List of 2
.. .. .. .. ..$ : num 3564002
.. .. .. .. ..$ : num 5353070
.. .. .. ..$ :List of 2
.. .. .. .. ..$ : num 3564009
.. .. .. .. ..$ : num 5353087
.. .. .. ..$ :List of 2
.. .. .. .. ..$ : num 3564013
.. .. .. .. ..$ : num 5353103
.. .. .. ..$ :List of 2
.. .. .. .. ..$ : num 3564016
.. .. .. .. ..$ : num 5353109
.. .. .. ..$ :List of 2
.. .. .. .. ..$ : num 3564030
.. .. .. .. ..$ : num 5353121
.. .. ..$ type : chr "LineString"
.. ..$ properties:List of 24
.....
or the jsonlite::fromJSON function,
dat = jsonlite::fromJSON("../Sonstiges/RadNETZ.geojson")
str(dat)
List of 4
$ type : chr "FeatureCollection"
$ name : chr "sql_statement"
$ crs :List of 2
..$ type : chr "name"
..$ properties:List of 1
.. ..$ name: chr "urn:ogc:def:crs:EPSG::31467"
$ features:'data.frame': 70097 obs. of 3 variables:
..$ type : chr [1:70097] "Feature" "Feature" "Feature" "Feature" ...
..$ properties:'data.frame': 70097 obs. of 24 variables:
.. ..$ gid : int [1:70097] 4 15 23 22 45 72 60 74 13072 75 ...
.. ..$ lrvn_kat: int [1:70097] 3 1 1 3 1 1 3 1 3 1 ...
.....
For the record, I'm the author and maintainer of the geojsonR package.
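One way to spot this kind of file before handing it to a strict GeoJSON reader is to look for a top-level "crs" member, which RFC 7946 no longer uses. A minimal sketch with jsonlite, assuming the package is installed; the inline JSON string is a made-up miniature of the file's structure, not the real data:

```r
library(jsonlite)

# Made-up miniature with the same layout as the str() output above
txt <- '{
  "type": "FeatureCollection",
  "crs": {"type": "name",
          "properties": {"name": "urn:ogc:def:crs:EPSG::31467"}},
  "features": [
    {"type": "Feature",
     "properties": {"gid": 4},
     "geometry": {"type": "LineString",
                  "coordinates": [[3563993, 5353055], [3564002, 5353070]]}}
  ]
}'

dat <- fromJSON(txt)

# A "crs" member signals a pre-RFC 7946 file that may carry projected coordinates
crs_name <- dat$crs$properties$name
has_crs_member <- !is.null(crs_name)

# Strip everything up to the last "::" to get the EPSG code as text
epsg <- sub(".*::", "", crs_name)
print(c(crs = crs_name, epsg = epsg))
```

If a "crs" member is present, reading with a GDAL-backed tool and transforming to EPSG:4326, as in the {sf} answer, is the more robust route.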

How can I add a class name to numeric raster values in a terra SpatRaster?

I'm working with the Circumpolar Arctic Vegetation map. Stored as a SpatRaster with terra, the raster has 21 land cover classes.
> str(lc_2003)
Formal class 'SpatRaster' [package "terra"] with 1 slot
..@ ptr:Reference class 'Rcpp_SpatRaster' [package "terra"] with 17 fields
.. ..$ depth : num 0
.. ..$ extent :Reference class 'Rcpp_SpatExtent' [package "terra"] with 2 fields
.. .. ..$ valid : logi TRUE
.. .. ..$ vector: num [1:4] 3946387 7965081 2200504 5681579
.. .. ..and 27 methods, of which 13 are possibly relevant:
.. .. .. align, as.points, ceil, compare, finalize, floor, initialize, intersect, round, sample,
.. .. .. sampleRandom, sampleRegular, union
.. ..$ filenames: chr ""
.. ..$ hasRange : logi TRUE
.. ..$ hasTime : logi FALSE
.. ..$ hasValues: logi TRUE
.. ..$ inMemory : logi TRUE
.. ..$ messages :Reference class 'Rcpp_SpatMessages' [package "terra"] with 2 fields
.. .. ..$ has_error : logi FALSE
.. .. ..$ has_warning: logi FALSE
.. .. ..and 18 methods, of which 4 are possibly relevant:
.. .. .. finalize, getError, getWarnings, initialize
.. ..$ names : chr "PHYSIOG"
.. ..$ origin : num [1:2] 102.7 91.3
.. ..$ range_max: num 21
.. ..$ range_min: num 1
.. ..$ res : num [1:2] 5172 3881
.. ..$ rgb : logi FALSE
.. ..$ time : num 0
.. ..$ timestep : chr "seconds"
.. ..$ units : chr ""
However, I would like to associate each value in the layer PHYSIOG with its actual land cover class name. This would be useful to me for viewing the file in ArcGIS, as well as for assessing which habitat type certain survey plots fall in.
landcover_classes <- data.frame(lc_code = 1:21,
lc_class = c(
"Cryptogam, herb barren",
"Rush/grass, forb, cryptogam tundra",
"Cryptogam barren complex (bedrock)",
"Prostrate dwarf-shrub, herb tundra",
"Graminoid, prostrate dwarf-shrub, forb tundra",
"Prostrate/Hemiprostrate dwarf-shrub tundra",
"Nontussock sedge, dwarf-shrub, moss tundra",
"Tussock-sedge, dwarf-shrub, moss tundra",
"Erect dwarf-shrub tundra",
"Low-shrub tundra",
"Missing (Cryprogram dwarf-shrub?)",
"Sedge/grass, moss wetland",
"Sedge, moss, dwarf-shrub wetland",
"Sedge, moss, low-shrub wetland",
"Noncarbonate mountain complex",
"Carbonate mountain complex",
"Nunatak complex",
"Glaciers",
"Water",
"Lagoon",
"Non-Arctic areas"))
How could I add this data to the SpatRaster?
(I'm not sure how to make a reproducible example of a SpatRaster. I'm going to ask this in a separate question)
You should be able to do
levels(lc_2003) <- landcover_classes
And see the results with
plot(lc_2003)
See the examples in ?terra::levels.
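For the second goal (finding the habitat type of survey plots), the same lookup table can also be applied to extracted cell values in plain base R. A minimal sketch, assuming the values at the plot locations have already been pulled out of the raster (for example with terra::extract()); the plot_values vector and the shortened three-row table are made up for illustration:

```r
# Shortened copy of the lookup table from the question
landcover_classes <- data.frame(
  lc_code  = 1:3,
  lc_class = c("Cryptogam, herb barren",
               "Rush/grass, forb, cryptogam tundra",
               "Cryptogam barren complex (bedrock)"),
  stringsAsFactors = FALSE
)

# Hypothetical raster values at five survey plots (NA = plot outside the raster)
plot_values <- c(3, 1, 1, 2, NA)

# match() finds the table row for each code; unmatched or NA codes stay NA
plot_classes <- landcover_classes$lc_class[
  match(plot_values, landcover_classes$lc_code)
]
print(data.frame(value = plot_values, class = plot_classes))
```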

no method for coercing this S4 class to a vector for utilization of mclust

I'm trying to use mclust on an .FCS file (a flow cytometry file format), which I read into R as a flowFrame object.
install.packages("openCyto") # since the old version segfaulted my R session
library( openCyto )
library( flowCore)
library( mclust)
trial1=read.FCS("export_Alcina TregMAIT_AV 10-1974 P1_CD4.fcs")
a=as.matrix(trial1)
Editor's note: some of these are Bioconductor packages, and you should install them according to the help pages for that environment.
However, mclust does not accept the .fcs data directly, so I tried to convert it to a matrix with the as.matrix function, and I get this error:
Error in as.vector(data) :
no method for coercing this S4 class to a vector
I've found similar questions explaining that you have to add importMethodsFrom(S4Vectors,as.matrix) to the NAMESPACE of mclust, which I did. I also added importMethodsFrom(BiocGenerics,as.vector) to the NAMESPACE of mclust. However, I'm still not able to use mclust.
P.S. Any advice or reading would be appreciated!
If anyone knows of other clustering methods that use a GMM and could accept the .FCS format without conversion, I'd be very happy.
I've edited your question to show what you should have done originally and also didn't do later (instead of including code in a comment, you should have edited the question, as was specifically suggested). My response is based on the first example in ?flowCore::read.FCS (since you also did not include a pointer to the dataset you were loading from disk), so rather than "trial1" I will be referring to the "samp" object I get from running that code.
The "samp" object returns this from class and str:
> class(samp)
[1] "flowFrame"
attr(,"package")
[1] "flowCore"
> str(samp)
Formal class 'flowFrame' [package "flowCore"] with 3 slots
..@ exprs : num [1:10000, 1:8] 382 628 1023 373 1023 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : Named chr [1:8] "FSC-H" "SSC-H" "FL1-H" "FL2-H" ...
.. .. .. ..- attr(*, "names")= chr [1:8] "$P1N" "$P2N" "$P3N" "$P4N" ...
.. ..- attr(*, "ranges")= num [1:8] 1023 1023 10000 10000 10000 ...
..@ parameters :Formal class 'AnnotatedDataFrame' [package "Biobase"] with 4 slots
.. .. ..@ varMetadata :'data.frame': 5 obs. of 1 variable:
.. .. .. ..$ labelDescription: chr [1:5] "Name of Parameter" "Description of Parameter" "Range of Parameter" "Minimum Parameter Value after Transforamtion" ...
.. .. ..@ data :'data.frame': 8 obs. of 5 variables:
.. .. .. ..$ name :Class 'AsIs' Named chr [1:8] "FSC-H" "SSC-H" "FL1-H" "FL2-H" ...
.. .. .. .. .. ..- attr(*, "names")= chr [1:8] "$P1N" "$P2N" "$P3N" "$P4N" ...
.. .. .. ..$ desc :Class 'AsIs' Named chr [1:8] "FSC-H" "SSC-H" NA NA ...
.. .. .. .. .. ..- attr(*, "names")= chr [1:8] "$P1S" "$P2S" "$P3S" "$P4S" ...
.. .. .. ..$ range : num [1:8] 1024 1024 1024 1024 1024 ...
.. .. .. ..$ minRange: num [1:8] 0 0 1 1 1 0 1 0
.. .. .. ..$ maxRange: num [1:8] 1023 1023 10000 10000 10000 ...
.. .. ..@ dimLabels : chr [1:2] "rowNames" "columnNames"
.. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. .. .. ..@ .Data:List of 1
.. .. .. .. .. ..$ : int [1:3] 1 1 0
..@ description:List of 164
.. ..$ FCSversion : chr "2"
.. ..$ $BYTEORD : chr "4,3,2,1"
.. ..$ $DATATYPE : chr "F"
#----- output truncated -----------
So "samp" is not a rectangular object in any sense but rather a complex, list-like S4 structure with lots of the associated information in attributes. My guess is that you want the information in the @exprs slot, which is a matrix.
A further difficulty is that there is no function named mclust in the mclust package, although looking at ?mclust we do see an example demonstrating the use of an Mclust function. R is unforgiving in its insistence on correct capitalization of function names.
Mclust(exprs(samp)[1:100,])
#-----------
'Mclust' model object:
best model: ellipsoidal, equal orientation (VVE) with 4 components
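The coercion failure itself can be reproduced without flowCore. The sketch below uses a made-up S4 class, ToyFrame, that only imitates a flowFrame by holding a numeric matrix in an exprs slot; it shows as.matrix() failing on the whole object and the slot yielding an ordinary matrix. With a real flowFrame, the exprs() accessor used above is the documented way to get that matrix.

```r
library(methods)

# Hypothetical stand-in for a flowFrame: a matrix slot plus some metadata
setClass("ToyFrame", representation(exprs = "matrix", description = "list"))

set.seed(1)
toy <- new("ToyFrame",
           exprs = matrix(rnorm(20), ncol = 2,
                          dimnames = list(NULL, c("FSC-H", "SSC-H"))),
           description = list(FCSversion = "2"))

# Coercing the whole S4 object fails because no as.vector method exists for it
res <- tryCatch(as.matrix(toy), error = function(e) e)
coercion_failed <- inherits(res, "error")
if (coercion_failed) message("as.matrix(toy) failed: ", conditionMessage(res))

# The data a clustering function needs is the matrix stored in the slot
m <- toy@exprs
print(dim(m))
print(is.matrix(m))
```

A plain numeric matrix like m is the kind of input that Mclust() is called with in the answer above.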

How to find out which index is out of bounds in object in R

Although I understand OOP, I've only just encountered it in R.
I am using a package from Bioconductor to churn through some genomic data.
The object it creates is called readCounts, and typing this at the command line gives the following:
QDNAseqReadCounts (storageMode: lockedEnvironment)
assayData: 206391 features, 1 samples
element names: counts
protocolData: none
phenoData
sampleNames: SLX-10457.FastSeqA.BloodDMets_11AF_-AHMMH.s_1.r_1.fq.gz
varLabels: name total.reads used.reads expected.variance
varMetadata: labelDescription
featureData
featureNames: 1:825001-840000 1:840001-855000 ... 22:51165001-51180000 (168063 total)
fvarLabels: chromosome start ... use (9 total)
fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation:
I am trying to plot readcounts on a simple xy graph as follows:
plot(readCounts, logTransform=TRUE, ylim=c(-1000, binSize * 15))
However when I do so I get the following error:
Error in sort.int(x, partial = unique(c(lo, hi))) :
index 180 outside bounds
with the traceback() showing:
6: sort.int(x, partial = unique(c(lo, hi)))
5: FUN(newX[, i], ...)
4: apply(copynumber, 2, sdFUN, na.rm = TRUE)
3: .local(x, y, ...)
2: plot(readCounts, logTransform = TRUE, ylim = c(-1000, binSize *
15))
1: plot(readCounts, logTransform = TRUE, ylim = c(-1000, binSize *
15))
Having googled, I thought it might be a missing-values problem, so I tried na.omit(readCounts), but I got the same error again, this time reporting the out-of-bounds index as 207.
I have tried to inspect the data but I can't find anything wrong at row 207, although I'm not really sure which slot this refers to. I really don't know how to debug this. I'm happy to give more information about what I'm trying to do, but I don't know how to determine what the problem is with this error in an R object.
When I do str(readCounts) I get:
Formal class 'QDNAseqReadCounts' [package "QDNAseq"] with 7 slots
..@ assayData :<environment: 0x13a99ed90>
..@ phenoData :Formal class 'AnnotatedDataFrame' [package "Biobase"] with 4 slots
.. .. ..@ varMetadata :'data.frame': 4 obs. of 1 variable:
.. .. .. ..$ labelDescription: chr [1:4] NA NA NA NA
.. .. ..@ data :'data.frame': 1 obs. of 4 variables:
.. .. .. ..$ name : chr "SLX-10457.FastSeqA.BloodDMets_11AF_-AHMMH.s_1.r_1.fq.gz"
.. .. .. ..$ total.reads : num 0
.. .. .. ..$ used.reads : num 0
.. .. .. ..$ expected.variance: num Inf
.. .. ..@ dimLabels : chr [1:2] "sampleNames" "sampleColumns"
.. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. .. .. ..@ .Data:List of 1
.. .. .. .. .. ..$ : int [1:3] 1 1 0
..@ featureData :Formal class 'AnnotatedDataFrame' [package "Biobase"] with 4 slots
.. .. ..@ varMetadata :'data.frame': 9 obs. of 1 variable:
.. .. .. ..$ labelDescription: chr [1:9] "Chromosome name" "Base pair start position" "Base pair end position" "Percentage of non-N nucleotides (of full bin size)" ...
.. .. ..@ data :'data.frame': 168063 obs. of 9 variables:
.. .. .. ..$ chromosome : chr [1:168063] "1" "1" "1" "1" ...
.. .. .. ..$ start : num [1:168063] 825001 840001 855001 870001 885001 ...
.. .. .. ..$ end : num [1:168063] 840000 855000 870000 885000 900000 915000 930000 945000 960000 975000 ...
.. .. .. ..$ bases : num [1:168063] 100 100 100 100 100 100 100 100 100 100 ...
.. .. .. ..$ gc : num [1:168063] 48 61.8 65.1 65.5 62.6 ...
.. .. .. ..$ mappability: num [1:168063] 58.6 91.5 94.1 93.2 93.9 ...
.. .. .. ..$ blacklist : num [1:168063] 0.727 0 0 0 0 ...
.. .. .. ..$ residual : num [1:168063] -0.0627 0.05036 0.09384 0.00541 -0.00588 ...
.. .. .. ..$ use : logi [1:168063] TRUE TRUE TRUE TRUE TRUE TRUE ...
.. .. .. ..- attr(*, "na.action")=Class 'omit' Named int [1:38328] 1 2 3 4 5 6 7 8 9 10 ...
.. .. .. .. .. ..- attr(*, "names")= chr [1:38328] "1:1-15000" "1:15001-30000" "1:30001-45000" "1:45001-60000" ...
.. .. ..@ dimLabels : chr [1:2] "featureNames" "featureColumns"
.. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. .. .. ..@ .Data:List of 1
.. .. .. .. .. ..$ : int [1:3] 1 1 0
..@ experimentData :Formal class 'MIAME' [package "Biobase"] with 13 slots
.. .. ..@ name : chr ""
.. .. ..@ lab : chr ""
.. .. ..@ contact : chr ""
.. .. ..@ title : chr ""
.. .. ..@ abstract : chr ""
.. .. ..@ url : chr ""
.. .. ..@ pubMedIds : chr ""
.. .. ..@ samples : list()
.. .. ..@ hybridizations : list()
.. .. ..@ normControls : list()
.. .. ..@ preprocessing : list()
.. .. ..@ other : list()
.. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. .. .. ..@ .Data:List of 2
.. .. .. .. .. ..$ : int [1:3] 1 0 0
.. .. .. .. .. ..$ : int [1:3] 1 1 0
..@ annotation : chr(0)
..@ protocolData :Formal class 'AnnotatedDataFrame' [package "Biobase"] with 4 slots
.. .. ..@ varMetadata :'data.frame': 0 obs. of 1 variable:
.. .. .. ..$ labelDescription: chr(0)
.. .. ..@ data :'data.frame': 1 obs. of 0 variables
.. .. ..@ dimLabels : chr [1:2] "sampleNames" "sampleColumns"
.. .. ..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. .. .. ..@ .Data:List of 1
.. .. .. .. .. ..$ : int [1:3] 1 1 0
..@ .__classVersion__:Formal class 'Versions' [package "Biobase"] with 1 slot
.. .. ..@ .Data:List of 4
.. .. .. ..$ : int [1:3] 3 1 2
.. .. .. ..$ : int [1:3] 2 26 0
.. .. .. ..$ : int [1:3] 1 3 0
.. .. .. ..$ : int [1:3] 1 2 4

R error while using cbind

I am trying to combine 2 vectors using cbind. Both vectors are the same size, but I get an error when I run the code. The vectors are quite big, length = 57605.
final=cbind (counts1,tx_by_gene)
Error: cannot allocate vector of size 225 Kb
R(473,0xa0cb8540) malloc: *** mmap(size=233472) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
R(473,0xa0cb8540) malloc: *** mmap(size=233472) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Can anyone help me understand why I am getting this error, or suggest some other way of combining the 2 vectors?
Thank you.
> str(counts1)
int [1:57605] 0 0 0 0 0 0 0 0 0 0 ...
> str(tx_by_gene)
Formal class 'GRangesList' [package "GenomicRanges"] with 5 slots
..@ partitioning :Formal class 'PartitioningByEnd' [package "IRanges"] with 5 slots
.. .. ..@ end : int [1:57605] 3 5 12 17 27 36 42 46 58 60 ...
.. .. ..@ NAMES : chr [1:57605] "ENSG00000000003" "ENSG00000000005" "ENSG00000000419" "ENSG00000000457" ...
.. .. ..@ elementMetadata: NULL
.. .. ..@ elementType : chr "integer"
.. .. ..@ metadata : list()
..@ unlistData :Formal class 'GRanges' [package "GenomicRanges"] with 7 slots
.. .. ..@ seqnames :Formal class 'Rle' [package "IRanges"] with 5 slots
.. .. .. .. ..@ values : Factor w/ 93 levels "chr1","chr2",..: 8 20 1 6 1 8 6 3 7 13 ...
.. .. .. .. ..@ lengths : int [1:41694] 5 7 30 18 21 6 2 9 43 23 ...
.. .. .. .. ..@ elementMetadata: NULL
.. .. .. .. ..@ elementType : chr "ANY"
.. .. .. .. ..@ metadata : list()
.. .. ..@ ranges :Formal class 'IRanges' [package "IRanges"] with 6 slots
.. .. .. .. ..@ start : int [1:191891] 99883667 99887538 99888439 99839799 99848621 49551404 49551404 49551404 49551433 49551482 ...
.. .. .. .. ..@ width : int [1:191891] 8137 4149 6550 15084 3908 23684 23684 23689 10966 23577 ...
.. .. .. .. ..@ NAMES : NULL
.. .. .. .. ..@ elementMetadata: NULL
.. .. .. .. ..@ elementType : chr "integer"
.. .. .. .. ..@ metadata : list()
.. .. ..@ strand :Formal class 'Rle' [package "IRanges"] with 5 slots
.. .. .. .. ..@ values : Factor w/ 3 levels "+","-","*": 2 1 2 1 2 1 2 1 2 1 ...
.. .. .. .. ..@ lengths : int [1:28670] 3 2 12 10 9 6 16 2 13 8 ...
.. .. .. .. ..@ elementMetadata: NULL
.. .. .. .. ..@ elementType : chr "ANY"
.. .. .. .. ..@ metadata : list()
.. .. ..@ seqlengths : Named int [1:93] 249250621 243199373 198022430 191154276 180915260 171115067 159138663 155270560 146364022 141213431 ...
.. .. .. ..- attr(*, "names")= chr [1:93] "chr1" "chr2" "chr3" "chr4" ...
.. .. ..@ elementMetadata:Formal class 'DataFrame' [package "IRanges"] with 6 slots
.. .. .. .. ..@ rownames : NULL
.. .. .. .. ..@ nrows : int 191891
.. .. .. .. ..@ elementMetadata: NULL
.. .. .. .. ..@ elementType : chr "ANY"
.. .. .. .. ..@ metadata : list()
.. .. .. .. ..@ listData :List of 2
.. .. .. .. .. ..$ tx_id : int [1:191891] 93738 93739 93740 93736 93737 175481 175482 175480 175483 175484 ...
.. .. .. .. .. ..$ tx_name: chr [1:191891] "ENST00000373020" "ENST00000496771" "ENST00000494424" "ENST00000373031" ...
.. .. ..@ elementType : chr "ANY"
.. .. ..@ metadata : list()
..@ elementMetadata:Formal class 'DataFrame' [package "IRanges"] with 6 slots
.. .. ..@ rownames : NULL
.. .. ..@ nrows : int 57605
.. .. ..@ elementMetadata: NULL
.. .. ..@ elementType : chr "ANY"
.. .. ..@ metadata : list()
.. .. ..@ listData : list()
..@ elementType : chr "GRanges"
..@ metadata : list()
The object tx_by_gene isn't a vector. You can check using the is.vector function:
is.vector(counts1)
is.vector(tx_by_gene)
Of course, there could be a method defined so that the two objects can be combined.
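The is.vector() check can be tried on a toy object without loading GenomicRanges. In the sketch below, ToyRangesList is a made-up S4 class that only imitates the NAMES and end slots seen in the str() output, and pairing counts with gene names in a data.frame is just one possible alternative to cbind(), not something the original code did.

```r
library(methods)

# Hypothetical stand-in for the GRangesList: names plus partition ends
setClass("ToyRangesList", representation(NAMES = "character", end = "integer"))

tx <- new("ToyRangesList",
          NAMES = c("ENSG00000000003", "ENSG00000000005", "ENSG00000000419"),
          end   = c(3L, 5L, 12L))
counts1 <- c(0L, 0L, 0L)

counts_is_vector <- is.vector(counts1)  # plain integer vector
tx_is_vector     <- is.vector(tx)       # S4 object, not a vector
print(c(counts1 = counts_is_vector, tx = tx_is_vector))

# One way to keep counts next to gene identifiers: combine plain vectors only
final <- data.frame(gene = tx@NAMES, count = counts1, stringsAsFactors = FALSE)
print(final)
```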
Those vectors should not be too big for R. You probably used up a lot of memory before the cbind() operation. Look at what objects you currently have with ls() and delete those you don't need any more with rm().
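A minimal sketch of that clean-up, using a throwaway object (tmp_big is a hypothetical name) so that nothing real is deleted:

```r
# Create a throwaway object of roughly 8 MB to have something to clean up
tmp_big <- numeric(1e6)
print(object.size(tmp_big), units = "MB")

print(ls())          # list the objects currently in the workspace

rm(tmp_big)          # remove the object that is no longer needed
invisible(gc())      # run garbage collection so the memory can be reused

still_there <- exists("tmp_big")
print(still_there)
```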
