Can someone please give me a small example of what microarray data should look like when I import it into R for use with limma?
I am trying to identify differentially regulated genes from a microarray sample. Thanks.
Generally speaking: a tab-separated (or otherwise delimited) file of normalized expression levels, with one column of probeset IDs (or other gene identifiers) and a header row that names the samples.
For an example of the code needed, I suggest inspecting a GEO2R-generated script (accessible from any GEO dataset) and reading the limma vignette.
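To make the layout concrete, here is a minimal sketch of such a file and a loader for it. It is written in Python purely for illustration (in R you would typically load the same file with something like read.delim); the probe IDs, sample names and values are made up.

```python
import csv
import io

# Hypothetical example file: the first column holds probeset IDs, the header
# names the samples, and every other cell is a normalized expression value.
EXAMPLE = (
    "ProbeID\tcontrol_1\tcontrol_2\ttreated_1\ttreated_2\n"
    "probe_0001\t7.12\t7.05\t9.31\t9.40\n"
    "probe_0002\t5.50\t5.62\t5.48\t5.57\n"
    "probe_0003\t10.20\t10.11\t8.02\t7.95\n"
)


def read_expression_table(text):
    """Return (sample_names, {probe_id: [values, ...]}) from tab-separated text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    samples = header[1:]  # everything after the ID column is a sample
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            # A row with more or fewer cells than the header is malformed.
            raise ValueError(f"row {row[0]!r} has {len(row)} cells, expected {len(header)}")
        table[row[0]] = [float(value) for value in row[1:]]
    return samples, table


samples, expression = read_expression_table(EXAMPLE)
```

Each row is one probe and each column one sample, which is the genes-by-samples orientation the answer above describes.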
Related
I analyzed DNA sequences in a bioinformatics pipeline to identify genetic variants in my samples. The effects of these variants have been estimated using the software snpEff. It returns a VCF file like this example file.
Since I have a multitude of those vcf files, I'd like to read in the vcf files and extract data from the annotation field (ANN=). The problem I have is that every line after the header contains an ANN field, but the number of annotations can vary from line to line. Thus, I'm looking for a simple way to convert the annotation subfields into a list of data frames (one row for every annotation, columns for the annotation subfields).
I'd be happy if you'd help and suggest a way on how to succeed in extracting the annotation info. Thanks a lot in advance!
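One possible approach, sketched in Python rather than R just to show the logic: split the INFO column on ";", take the ANN entry, split it on "," to get one string per annotation, then split each of those on "|". The field names below follow the snpEff ANN layout as I understand it, so treat them as an assumption and check them against the ##INFO=<ID=ANN,...> header line of your own files. The example record is invented.

```python
# Assumed ANN subfield names; verify against the ANN description in your VCF header.
ANN_FIELDS = [
    "Allele", "Annotation", "Annotation_Impact", "Gene_Name", "Gene_ID",
    "Feature_Type", "Feature_ID", "Transcript_BioType", "Rank", "HGVS.c",
    "HGVS.p", "cDNA.pos/cDNA.length", "CDS.pos/CDS.length",
    "AA.pos/AA.length", "Distance", "ERRORS/WARNINGS/INFO",
]


def parse_ann(info):
    """Return one dict per annotation found in a VCF INFO string."""
    for entry in info.split(";"):
        if entry.startswith("ANN="):
            annotations = entry[len("ANN="):].split(",")
            return [dict(zip(ANN_FIELDS, ann.split("|"))) for ann in annotations]
    return []  # no ANN entry on this line


def parse_vcf_lines(lines):
    """Map (CHROM, POS, REF, ALT) to the list of annotation dicts for that variant."""
    result = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        key = (cols[0], int(cols[1]), cols[3], cols[4])
        result[key] = parse_ann(cols[7])  # INFO is the 8th VCF column
    return result


# Invented record with two annotations for a single variant.
EXAMPLE_LINES = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t12345\t.\tA\tG\t50\tPASS\tDP=20;ANN="
    "G|missense_variant|MODERATE|GENE1|GENE1|transcript|TX1|protein_coding|2/5|"
    "c.100A>G|p.Thr34Ala|100/900|100/600|34/199||,"
    "G|upstream_gene_variant|MODIFIER|GENE2|GENE2|transcript|TX2|protein_coding||"
    "c.-200A>G|||||200|\n",
]

records = parse_vcf_lines(EXAMPLE_LINES)
```

Each variant ends up with a list whose length equals its number of annotations, which maps naturally onto "one data frame per line, one row per annotation" once translated to R.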
Example output of tab_model
I have created a table with tab_model that includes multiple models, and I wish to extract all p-values and estimates/odds ratios into a data frame. The output of tab_model is an HTML file, and I am unable to find a function that pulls this information out of it. Any ideas on how I could do this?
For example, I want to retrieve the p-values and estimates for the variable 'age' in all of my models. There are only 3 in the example image, but I have hundreds.
You should get these values from the regression models themselves, instead of writing them to an HTML table and then extracting them from it.
Without further knowledge of your process and data it is difficult to provide a more concrete answer.
I'd like to create a radial dendrogram in R, but being new to the software, I don't know if I chose the correct data structure and package.
I've created a YAML file that looks as follows:
Data structure
I know the exact hierarchy of the languages, but I need R to calculate x and y values. I'd use hclust for that, I think?
I found this instruction here for example: https://stats.stackexchange.com/questions/4062/how-to-plot-a-fan-polar-dendrogram-in-r, but it uses the mtcars dataset. I'd just like to know whether it makes sense to set up my data as above or whether I should use a different structure. When I try to import the datasets I get an error message saying I've got more columns than column headers so I must be doing something wrong.
I'm trying to use the DESeq2 package in R for differential gene expression, but I'm having trouble creating the required RangedSummarizedExperiment object from my input data. I have found several tutorials and vignettes for doing this, but they all seem to apply to a raw data set that is different from mine. My data has gene names as row names and patient id as column names, and the data is simply integer count data. There has to be a simple way to create the RangedSummarizedExperiment object from this type of input data, but I haven't yet found a way. Can anybody help? Thanks.
I had a similar problem understanding how to use this data structure. I eventually managed to do without it by using DESeqDataSetFromMatrix. You can see an example in the first code block of Modify r object with rpy2 (this code is pure R, rpy2 stuff comes after). In this example, I have genes as rows and samples as columns, so it is likely you will be able to adopt the same approach.
I just found a bunch of weather data that I would like to play around with in glmnet in R. So far I've been reading and organizing the data in R, and right now I am just trying to look at the raw data of each variable. Unfortunately, each variable has a lot of data and R isn't able to print it all. Is there a way I can view all the raw data in R or just in the file itself? I've tried opening the file in Excel without success. Thanks!
Try using frequency tables; you can group the values into segments (bins) instead of printing every one.
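As a rough illustration of grouping by segments, here is a minimal sketch in Python (in R, something like table(cut(x, breaks)) does the same job). The temperature values and the bin width of 5 are made up.

```python
from collections import Counter


def binned_frequency(values, width):
    """Count how many values fall into each half-open bin [start, start + width)."""
    counts = Counter()
    for value in values:
        start = (value // width) * width  # lower edge of the bin holding this value
        counts[start] += 1
    return dict(sorted(counts.items()))


# Hypothetical weather readings.
temperatures = [12.5, 14.1, 15.0, 19.9, 20.0, 23.4, 8.7, 15.2]
table = binned_frequency(temperatures, 5)

for start, count in table.items():
    print(f"[{start}, {start + 5}): {count}")
```

A summary like this stays readable no matter how many raw values the variable has.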
str(), summary(), table(), pairs(), plot(), etc. There are several libraries (such as descr) which facilitate analyzing numeric variables and factor levels. Let me know if you need help with any of them.