RangedSummarizedExperiment for DESeq2 - R

I'm trying to use the DESeq2 package in R for differential gene expression, but I'm having trouble creating the required RangedSummarizedExperiment object from my input data. I have found several tutorials and vignettes for doing this, but they all seem to apply to raw data sets that are different from mine. My data has gene names as row names and patient IDs as column names, and the values are simply integer counts. There has to be a simple way to create the RangedSummarizedExperiment object from this type of input data, but I haven't found one yet. Can anybody help? Thanks.

I had a similar problem understanding how to use this data structure. I eventually managed to do without it by using DESeqDataSetFromMatrix. You can see an example in the first code block of Modify r object with rpy2 (this code is pure R, rpy2 stuff comes after). In this example, I have genes as rows and samples as columns, so it is likely you will be able to adopt the same approach.
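As a minimal sketch of the DESeqDataSetFromMatrix approach described above: the toy counts, the gene and patient labels, and the `condition` column are all made up for illustration, and the design formula is an assumption that has to be replaced with whatever grouping applies to the real samples. The DESeq2 call is guarded so the data-preparation part still runs where the package is not installed.

```r
# Hypothetical toy data: 4 genes (rows) x 4 patients (columns) of integer counts.
# matrix() fills column by column, so the first four values belong to patient1.
counts <- matrix(
  c(10L, 0L, 5L, 20L,
    12L, 1L, 7L, 18L,
    30L, 3L, 2L, 25L,
    28L, 2L, 4L, 22L),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("patient", 1:4))
)

# One row of sample information per column of `counts`, in the same order.
# `condition` is an assumed grouping variable, not something from the question.
col_data <- data.frame(
  condition = factor(c("control", "control", "treated", "treated")),
  row.names = colnames(counts)
)

# The sample names must line up between the two objects.
stopifnot(identical(rownames(col_data), colnames(counts)))

if (requireNamespace("DESeq2", quietly = TRUE)) {
  # Build the DESeqDataSet directly from the count matrix,
  # without constructing a RangedSummarizedExperiment first.
  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = counts,
    colData   = col_data,
    design    = ~ condition
  )
}
```

With real data, `counts` would typically come from something like `as.matrix(read.table(...))` on the count file, and the resulting `dds` object is what gets passed on to `DESeq()`.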

Related

Data structure and package for a radial dendrogram in R

I'd like to create a radial dendrogram in R, but being new to the software, I don't know if I chose the correct data structure and package.
I've created a YAML file that looks as follows:
[Image: Data structure]
I know the exact hierarchy of the languages, but I need R to calculate the x and y values. I'd use hclust for that, I think?
I found this instruction here for example: https://stats.stackexchange.com/questions/4062/how-to-plot-a-fan-polar-dendrogram-in-r, but it uses the mtcars dataset. I'd just like to know whether it makes sense to set up my data as above or whether I should use a different structure. When I try to import the datasets I get an error message saying I've got more columns than column headers so I must be doing something wrong.

R pivot table with multiple column levels

I would be grateful if anyone could tell me how to create a pivot table in R, like in Python's pandas, with a chosen aggregation function and more than one level in the columns.
I would like to get something in R like this in Python:
Iris.pivot_table(index='Sepal.Length',columns=['Sepal.Width','Species'],values='Petal.Length',aggfunc=sum)
I know there is the pivottabler package, but its default method of rendering to HTML is too slow for somewhat larger tables.
I have also found the ftable function from the stats package, but it is only for contingency tables, in which I can't specify my own aggregation function.
Thank you.
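One possible base-R sketch for the layout asked about above, offered as an assumption rather than an established answer: `xtabs()` sums its left-hand side within each combination of the right-hand side variables, and `ftable()` can then spread two of those variables across the columns. For aggregation functions other than a sum, `tapply()` gives a 3-D array instead of a flat table.

```r
# iris ships with base R (package 'datasets').
# xtabs() sums Petal.Length within each Sepal.Length x Sepal.Width x Species cell;
# combinations with no data come out as 0, not NA.
sums <- xtabs(Petal.Length ~ Sepal.Length + Sepal.Width + Species, data = iris)

# Flatten to two dimensions: Sepal.Length in rows,
# Sepal.Width x Species as a two-level column header.
pivot <- ftable(sums,
                row.vars = "Sepal.Length",
                col.vars = c("Sepal.Width", "Species"))

# For an arbitrary aggregation function, tapply() returns a 3-D array
# indexed as [Sepal.Length, Sepal.Width, Species], with NA for empty cells.
means <- with(iris,
              tapply(Petal.Length,
                     list(Sepal.Length, Sepal.Width, Species),
                     mean))
```

Printing `pivot` shows the multi-level column header; `means["5.1", "3.5", "setosa"]` picks out a single cell of the array.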

Convention for R function to read a file and return a collection of objects

I would like to find out what the "R way" would be to let users do the following: I have a file that can contain the data of one or more analysis runs of some other software. My R package should provide additional ways to calculate statistics or produce plots for those analyses. So the first thing a user would have to do is read in the file (with one or more analyses), then select an analysis and work with it.
An analysis is uniquely identified by two names (an analysis name and an analysis type where the type should later correspond to an S3 class).
What I am not sure about is how to best represent the collection of analyses that is returned when reading in the file: should this be an object or simply a list of lists (since there are two ids for identifying an analysis, the first list could be indexed by name and the second by type). Using a list feels very low-level and clumsy though.
If the read function returns a special kind of container object what would be a good method to access one of the contained objects based on name and type?
There are probably many ways to do this, but since I have only just started writing R code that others should eventually use, I am not sure how best to follow existing R conventions when designing this.
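One way the container idea could look, as a rough sketch and not an established R convention: every name here (`new_analysis`, `new_analysis_set`, `get_analysis`, the `"regression"` and `"clustering"` types) is hypothetical, and the file reading is replaced by hand-built objects. Each analysis carries its type as an S3 class, and a small accessor looks up one analysis by name and type.

```r
# Hypothetical constructor for a single analysis; the S3 class encodes the type.
new_analysis <- function(name, type, data) {
  structure(list(name = name, data = data), class = c(type, "analysis"))
}

# Hypothetical container that a read function could return.
new_analysis_set <- function(analyses) {
  structure(list(analyses = analyses), class = "analysis_set")
}

# Accessor: return the first analysis matching both name and type.
get_analysis <- function(x, name, type) {
  stopifnot(inherits(x, "analysis_set"))
  for (a in x$analyses) {
    if (identical(a$name, name) && inherits(a, type)) {
      return(a)
    }
  }
  stop("no analysis named '", name, "' of type '", type, "'")
}

# Print a compact overview instead of dumping the nested list.
print.analysis_set <- function(x, ...) {
  for (a in x$analyses) {
    cat(a$name, " <", class(a)[1], ">\n", sep = "")
  }
  invisible(x)
}

# Stand-in for reading a file that holds two analyses with the same name.
runs <- new_analysis_set(list(
  new_analysis("run1", "regression", data = 1:3),
  new_analysis("run1", "clustering", data = 4:6)
))
picked <- get_analysis(runs, "run1", "clustering")
```

Compared with a bare list of lists, the class on the container leaves room for `print`, `summary`, or `[[` methods later without changing how users call the package.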

reaching max.print on R

I just found a bunch of weather data that I would like to play around with using glmnet in R. So far I've been reading and organizing the data in R, and right now I am just trying to look at the raw data for each variable. Unfortunately, each variable has a lot of data and R isn't able to print it all. Is there a way I can view all the raw data in R, or just in the file itself? I've tried opening the file in Excel, without success. Thanks!
Try using frequency tables; you can group the values into segments.
str(), summary(), table(), pairs(), plot(), etc. There are several libraries (such as descr) that facilitate analyzing numerical variables and factor levels. Let me know if you need help with any.
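A short sketch of what those suggestions might look like in practice. The `weather` data frame and its columns are simulated stand-ins, since the actual file is not shown. The last lines show that the printing limit itself is an option (`max.print`) that can be raised temporarily.

```r
# Simulated stand-in for the weather data (hypothetical columns).
set.seed(1)
weather <- data.frame(
  temp    = rnorm(5000, mean = 15, sd = 8),
  station = sample(c("A", "B", "C"), 5000, replace = TRUE)
)

str(weather)            # compact structure: column types and first values
summary(weather$temp)   # min, quartiles, mean, max
table(weather$station)  # frequency table of a categorical variable
head(weather, 10)       # first 10 rows only

# Frequency table by segments: cut the numeric variable into 5 intervals.
temp_bins <- table(cut(weather$temp, breaks = 5))
temp_bins

# The printing limit is an option; raise it temporarily, then restore it.
old <- options(max.print = 20000)
getOption("max.print")
options(old)
```

In an interactive session, `View(weather)` opens a spreadsheet-style viewer, which is often more practical than printing everything to the console.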

The internal implementation of R's dataset

I am trying to build a data processing program. Currently I use a matrix of doubles to represent the data table: each row is an instance and each column represents a feature. I also have an extra vector holding the target value for each instance; it is of double type for regression and of integer type for classification.
I want to make it more general. I am wondering what kind of structure R uses to store a dataset, i.e. the internal implementation in R.
Maybe if you inspect the rpy2 package, you can learn something about how data structures are represented (and can be accessed).
The data structure R uses for a dataset is the 'data.frame'; a detailed introduction to data frames can be found here.
http://cran.r-project.org/doc/manuals/R-intro.html#Data-frames
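To illustrate what a data frame looks like from inside R, here is a small sketch with made-up columns: a data frame is a list of equal-length column vectors with a class attribute and row names, so each column can have its own type, unlike a matrix.

```r
# Made-up example: two numeric features and a categorical target.
df <- data.frame(
  x1    = c(1.5, 2.0, 3.2),
  x2    = c(0.1, 0.4, 0.9),
  label = factor(c("a", "b", "a"))
)

typeof(df)           # "list": the underlying storage is a list of columns
class(df)            # "data.frame"
sapply(df, typeof)   # per-column storage: double, double, integer (factor codes)
names(attributes(df))  # the attributes that turn the list into a data frame

# Columns are ordinary vectors and can be extracted like list elements.
col_x1 <- df[["x1"]]
levels(df$label)     # a factor stores integer codes plus a table of levels
```

For a program written in another language, the analogous design would be a column-oriented table: one typed array per feature, with categorical columns stored as integer codes and a separate table of level names.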
