I am attempting to use the anc.clim function in phyloclim, but am stuck on an error I don't know how to fix.
I have three items in my workspace:
etopo is a 50 x 14 double matrix whose first column holds 50 bins of an environmental variable; each subsequent column is labeled with a taxon name.
targetTree is an object of class phylo containing 13 taxa, with tip labels matching the taxa in etopo (generated by reading a .tre file from MrBayes with read.nexus).
prunedPosteriorTrees is an object of class multiPhylo containing 1000 phylogenetic trees of the same 13 taxa, again with tip labels matching the taxa in etopo (generated by reading a .t file from MrBayes with read.nexus).
I have confirmed that the taxa in all three match using geiger's treedata function.
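Roughly how I ran that check; the dummy one-column matrix is only there so treedata() has row names to compare against the tip labels:
library(geiger)

# taxon names come from the etopo column headers (the first column is the bins)
taxa <- colnames(etopo)[-1]
dummy <- matrix(0, nrow = length(taxa), ncol = 1,
                dimnames = list(taxa, "dummy"))

# warns about any taxa present in the tree but not the data, and vice versa
treedata(targetTree, dummy)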
When I go to implement anc.clim with these data, the following occurs:
> climateReconstruction <- anc.clim(targetTree, posterior = prunedPosteriorTrees, pno = etopo, n = 2)
Error in noi(old, clades, monophyletic = TRUE) :
tips are not numbered consecutively. Type '?fixTips' for help.
When I type ?fixtips, or ??fixTips for that matter, no documentation is found. I have also searched the web, and the package documentation, to no avail. Has anyone had experience with this error? What do I do?
I have solved this problem. For the aid of others:
targetTree needs to be an ultrametric tree, such as one produced by a BEAST analysis; MrBayes trees are not ultrametric. The same is true for the prunedPosteriorTrees file.
fixTips no longer exists; it has been replaced with fixNodes. Using that function resolves the error.
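For completeness, a minimal sketch of the two fixes, assuming ape's chronos() as one way to make the MrBayes trees ultrametric (re-running the analysis in BEAST is the more rigorous route):
library(ape)
library(phyloclim)

# make the consensus tree and each posterior tree ultrametric
# (chronos() is an assumption here; a clock-based BEAST run also works)
targetTreeUM <- chronos(targetTree)
prunedPosteriorUM <- lapply(prunedPosteriorTrees, chronos)
class(prunedPosteriorUM) <- "multiPhylo"

# renumber tips/nodes consecutively; fixNodes() replaces the fixTips()
# referred to in the error message
targetTreeUM <- fixNodes(targetTreeUM)

climateReconstruction <- anc.clim(targetTreeUM,
                                  posterior = prunedPosteriorUM,
                                  pno = etopo, n = 2)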
I'm trying to use DESeq2's PCAPlot function in a meta-analysis of data.
Most of the files I have received contain raw, pre-normalization counts; I run DESeq2 to normalize them and then run PCAPlot.
One of the files I received does not have raw counts or even the FASTQ files, just the data that has already been normalized by DESeq2.
How could I go about importing this data (non-integers) as a DESeqDataSet object after it has already been normalized?
The consensus in the vignettes and other discussions seems to be that these objects can only be constructed from matrices of integer counts.
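For context, this is roughly the path the raw-count files go through (object and column names below are placeholders, and plotPCA is the DESeq2 function I mean by PCAPlot):
library(DESeq2)

# counts: integer matrix (genes x samples); coldata: sample table with a "group" column
dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData   = coldata,
                              design    = ~ group)
dds <- DESeq(dds)                  # size-factor normalization etc.
vsd <- vst(dds)                    # variance-stabilizing transform
plotPCA(vsd, intgroup = "group")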
I was mostly concerned with keeping the plot format consistent between datasets. Ultimately, I used a workaround via ggfortify to get the plots looking the same.
If anyone is curious, this is what I ended up doing. Note that the "names" file is organized like the metadata file you would pass as colData when building a DESeq object with DESeqDataSetFromMatrix, except that I renamed the design column from "conditions" to "group" so it matches the output of PCAPlot. The plots should look identical.
library(ggfortify)
# counts table: genes in rows, samples in columns
data <- read.csv("COUNTS.csv", sep = ",", header = TRUE, row.names = 1)
# sample metadata with a "group" column (organized like the colData table)
names <- read.csv("NAMES.csv")
# PCA across samples, so transpose the counts matrix first
PCA <- prcomp(t(data))
autoplot(PCA, data = names, colour = "group", size = 3)
I am analyzing some microarray data. For each donor, I have a "before intervention" and an "after intervention" idat file. I have successfully read these into R using the Limma package with the read.idat() function. However, the resulting object only has one column in targets: "IDATfile". I believe that if I was using read.ilmn() I would specify a targets.txt file but I can't see this option when using read.idat(). E.g. in the Limma user guide Illumina example, the targets are "Donor", "Age", and "Cell Type". How do I tell Limma what to put as targets? I would like to have "Donor" and "Intervention".
An example of what I mean:
idatfiles <- dir(pattern="idat")
bgxfile <- dir(pattern="bgx")
x <- read.idat(idatfiles, bgxfile)
colnames(x$targets)
[1] "IDATfile"
Instead of "IDATfile", I would like this to be "Donor" and "Intervention". I can include some other columns of the original IDAT files as further targets by doing read.idat(..., dateinfo=TRUE), but I don't know how to edit these columns to make them "Donor" and "Intervention":
[1] "IDATfile" "ScanInfo" "DecodeInfo"
Let me know if any more info is needed; I really appreciate any help!
If you simply want to edit the column names, you can use:
colnames(x$targets) <- c("Donor")
But rows correspond to samples in the targets data frame, so is renaming that one existing column really what you want?
http://web.mit.edu/~r/current/arch/i386_linux26/lib/R/library/limma/html/EList.html
targets: data.frame containing information on the target RNA samples. Rows correspond to samples. May have any number of columns.
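If the goal is extra sample annotation rather than renaming, note that x$targets is an ordinary data frame with one row per sample, so you can simply add columns to it. A sketch, assuming a hypothetical sample sheet Targets.txt that has Donor and Intervention columns listed in the same order as idatfiles:
library(limma)

idatfiles <- dir(pattern = "idat")
bgxfile   <- dir(pattern = "bgx")
x <- read.idat(idatfiles, bgxfile)

# Targets.txt (assumed): one row per IDAT file, in the same order as idatfiles
sampleinfo <- readTargets("Targets.txt")
x$targets$Donor        <- sampleinfo$Donor
x$targets$Intervention <- sampleinfo$Intervention

colnames(x$targets)
# [1] "IDATfile" "Donor" "Intervention"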
I am trying to create a classification tree in RStudio with the rpart package.
The data are answers from a survey that ends by recording a level of media consumption. I want to see whether a classification tree can help me create classes of users.
I imported my dataset from Excel, randomized the rows, and then ran the function:
m <- rpart(Calcio, data = DBRr[1:160,], method = "class")
m: the model
Calcio: the last column, representing consumption
DBRr: my randomized dataset (only the first 160 rows; the remaining 80 are held back for testing)
When I run that, I get the error:
Error in `[.data.frame`(m, labs) : undefined columns selected
How can I fix it? Thank you very much in advance for your help
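For anyone hitting the same message: rpart() expects a model formula as its first argument, so the call presumably needs to look roughly like this (assuming Calcio is a column of DBRr):
library(rpart)

# train on the first 160 rows; model Calcio as a function of all other columns
m <- rpart(Calcio ~ ., data = DBRr[1:160, ], method = "class")

# then classify the 80 held-back rows
pred <- predict(m, newdata = DBRr[161:240, ], type = "class")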
I have a dataset with some 100,000 tweets and their sentiment scores attached. The original dataset has just two columns: one for the tweets and one for their sentiment scores.
I am trying to build a data dictionary for it using the dataMeta package. Here is the code that I have written so far:
# Data dictionary
library(dataMeta)
library(knitr)

# one description and one type flag per column of tweets_train
var_desc <- c("Sentiment Score: 0 for Negative sentences and 4 for Positive sentences",
              "The tweets collected")
var_type <- c(0, 1)

# Creating the linker data frame
linker <- build_linker(tweets_train, variable_description = var_desc, variable_type = var_type)
linker

# Build the data dictionary
dict <- build_dict(my.data = tweets_train, linker = linker,
                   option_description = NULL, prompt_varopts = FALSE)
kable(dict, format = "html", caption = "Data dictionary for the Training dataset")
My problem: in the data dictionary I have provided the variable name and the variable description, but the Variable Options column seems to be trying to print all 100,000 tweets, which I want to avoid. Is it possible to set that column up manually as well? Would the option_description argument of build_dict be of any help here?
I tried to find guidance online, but to no avail. Here is the link I have been following so far:
https://cran.r-project.org/web/packages/dataMeta/vignettes/dataMeta_Vignette.html
This is the first time I am trying to build a data dictionary and hence the struggle. Any suggestions would be extremely appreciated. Thanks in advance.
I'm working with the igraph library and need to run some statistical analyses on the network. I compute several variables with igraph and then want to use those indicators as the dependent variable in a few regressions, with the vertex attributes as the independent variables in the model.
I'm able to load the data and run the igraph analysis, but I'm having trouble turning the igraph object back into a data frame. I don't need the edges preserved; I just want each vertex to become an observation, with its attributes as the columns of that row.
I tried the following:
library(igraph)

# community detection on uncompg (the undirected, simplified version of compg)
fg <- fastgreedy.community(uncompg, merges = TRUE)
z <- which.max(fg$modularity)
fgc <- community.to.membership(uncompg, fg$merges, z)

names <- array(V(uncompg)$name)
fccommunity <- array(fgc$membership)
fcresult <- as.matrix(cbind(names, fccommunity))

# store community membership on the original graph, then rebuild uncompg
compg <- set.vertex.attribute(compg, "community", value = fccommunity)
uncompg <- simplify(as.undirected(compg))

hubscore <- hub.score(compg)$vector
authscore <- authority.score(compg)$vector

netdata <- as.data.frame(compg)   # <- this is the call that fails
But it throws the following error:
cannot coerce class '"igraph"' into a data.frame
Any help or pointers would be greatly appreciated.
I am not quite sure what you are trying to do. Do you want the relationships as a data frame, or the node attributes as a data frame?
To do the former:
> compg.edges <- as.data.frame(get.edgelist(compg))
To do the latter:
> compg.df <- as.data.frame(list(Vertex=V(compg), Community=fccommunity, Hubscore=hubscore, Authscore=authscore), stringsAsFactors=FALSE)
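Newer igraph releases also let you pull all vertex attributes at once with as_data_frame(); a sketch, assuming the computed measures have first been stored as vertex attributes:
library(igraph)

# attach the computed measures to the graph as vertex attributes
V(compg)$community <- fccommunity
V(compg)$hubscore  <- hub.score(compg)$vector
V(compg)$authscore <- authority.score(compg)$vector

# one row per vertex, one column per vertex attribute
netdata <- as_data_frame(compg, what = "vertices")
head(netdata)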