I have an object (Seurat object) an I need to get certain data out of it
> sc#misc[["colors"]][["seurat_clusters"]]
0 1 2 3 4 5 6 7
"#CC0C00FF" "#5C88DAFF" "#84BD00FF" "#FFCD00FF" "#7C878EFF" "#00B5E2FF" "#00AF66FF" "#CC0C00B2"
This data is needed as an vector but I don't know how to pull "#CC0C00FF" "#5C88DAFF" etc. out of it.
In order to hand this data to the next function, the result should look like this:
> vec
[1] "#CC0C00FF" "#5C88DAFF" "#84BD00FF"
Thanks in advance!
Solved it! I'm pretty disappointed by myself, because I didn't know this function existed:
> as.vector(sc#misc[["colors"]][["seurat_clusters"]])
[1] "#CC0C00FF" "#5C88DAFF" "#84BD00FF" "#FFCD00FF" "#7C878EFF" "#00B5E2FF" "#00AF66FF" "#CC0C00B2"
I have a list which included a mix of data type (character) and data structure (dataframe).
I want to keep only the dataframes and remove the rest.
> head(list)
[[1]]
[1] "/Users/Jane/R/12498798.txt error"
[[2]]
match
1 Japan arrests man for taking gun
2 Extradition bill turns ugly
file
1 /Users/Jane/R/12498770.txt
2 /Users/Jane/R/12498770.txt
[[3]]
[1] "/Users/Jane/R/12498780.txt error"
I expect the final list to contain only dataframes:
[[2]]
match
1 Japan arrests man for taking gun
2 Extradition bill turns ugly
file
1 /Users/Jane/R/12498770.txt
2 /Users/Jane/R/12498770.txt
Based on the example, it is possible that the OP's list elements are vectors and want to remove any element having 'error' substring
list[!sapply(list, function(x) any(grepl("error$", x)))]
I am trying to read a TSV file in R using the read.table function.
myTable <- read.table("file_path", sep='\t', header=T)
But when I try the command
names(myTable)
It gives me column names which are odd numbered, while merging the even numbered columns with those.
[1] "GeneSymbol" "GSM480304_JK_C_05.07.mas5.chp"
[3] "GSM480355_JK_C_05.07.mas5.chp" "GSM480480_JK_C_05.07.mas5.chp"
[5] "GSM480555_JK_C_05.07.mas5.chp" "GSM480634_JK_C_05.07.mas5.chp"
These are exact column names and you can see that two column names are separated by space while only ODD numbered column names are listed.
The output should be like this:
[1] "GeneSymbol"
[2] "GSM480304_JK_C_05.07.mas5.chp"
[3] "GSM480355_JK_C_05.07.mas5.chp"
[4] "GSM480480_JK_C_05.07.mas5.chp"
[5] "GSM480555_JK_C_05.07.mas5.chp"
[6] "GSM480634_JK_C_05.07.mas5.chp"
This is creating problem in assigning names to another table where I want to use these column names. Any suggestions ?
As noted in the comments, R is displaying all the columns, but not in the format you expect. This can be forced by casting the result of names() with as.data.frame() as follows:
rawData <- "
Number,Name,Type1,Type2,Total,HP,Attack,Defense,SpecialAtk,SpecialDef,Speed,Generation,Legendary
1,Bulbasaur,Grass,Poison,318,45,49,49,65,65,45,1,False
2,Ivysaur,Grass,Poison,405,60,62,63,80,80,60,1,False
3,Venusaur,Grass,Poison,525,80,82,83,100,100,80,1,False
3,VenusaurMega Venusaur,Grass,Poison,625,80,100,123,122,120,80,1,False
4,Charmander,Fire,,309,39,52,43,60,50,65,1,False
5,Charmeleon,Fire,,405,58,64,58,80,65,80,1,False
6,Charizard,Fire,Flying,534,78,84,78,109,85,100,1,False
6,CharizardMega Charizard X,Fire,Dragon,634,78,130,111,130,85,100,1,False
6,CharizardMega Charizard Y,Fire,Flying,634,78,104,78,159,115,100,1,False
7,Squirtle,Water,,314,44,48,65,50,64,43,1,False
8,Wartortle,Water,,405,59,63,80,65,80,58,1,False
9,Blastoise,Water,,530,79,83,100,85,105,78,1,False"
gen01 <- read.csv(textConnection=rawData,header=TRUE)
as.data.frame(names(gen01))
...and the output:
> as.data.frame(names(gen01))
names(gen01)
1 Number
2 Name
3 Type1
4 Type2
5 Total
6 HP
7 Attack
8 Defense
9 SpecialAtk
10 SpecialDef
11 Speed
12 Generation
13 Legendary
I have two seemingly identical zoo objects created by the same commands from csv files for different time periods. I try to combine them into one long zoo but I'm failing with "indexes overlap" error. ('merge' 'c' or 'rbind' all produce variants of the same error text.) As far as I can see there are no duplicates and the time periods do not overlap. What am I doing wrong? Am using R version 3.0.1 on Windows 7 64bit if that makes a difference.
> colnames(z2)
[1] "Amb" "HWS" "Diff"
> colnames(t.tmp)
[1] "Amb" "HWS" "Diff"
> max(index(z2))
[1] "2012-12-06 02:17:45 GMT"
> min(index(t.tmp))
[1] "2012-12-06 03:43:45 GMT"
> anyDuplicated(c(index(z2),index(t.tmp)))
[1] 0
> c(z2,t.tmp)
Error in rbind.zoo(...) : indexes overlap
>
UPDATE: In trying to make a reproducible case I've concluded this is an implementation error due to the large number of rows I'm dealing with: it fails if the final result is more than 311434 rows long.
> nrow(c(z2,head(t.tmp,n=101958)))
Error in rbind.zoo(...) : indexes overlap
> nrow(c(z2,head(t.tmp,n=101957)))
[1] 311434
# but row 101958 inserts fine on its own so its not a data problem.
> nrow(c(z2,tail(head(t.tmp,n=101958),n=2)))
[1] 209479
I'm sorry but I dont have the R scripting skills to produce a zoo of the critical length, hopefully someone might be able to help me out..
UPDATE 2- Responding to Jason's suggestion.. : The problem is in the MATCH but my R skills arent sufficient to know how to interpret it- does it mean MATCH finds a duplicate value in x.t whereas anyDuplicated does not?
> x.t <- c(index(z2),index(t.tmp));
> length(x.t)
[1] 520713
> ix <- ORDER (x.t)
> length(ix)
[1] 520713
> x.t <- x.t[ix]
> length(ix)
[1] 520713
> length(x.t)
[1] 520713
> tx <- table(MATCH(x.t,x.t))
> max(tx)
[1] 2
> tx[which(tx==2)]
311371 311373 311378 311383 311384 311386 311389 311392 311400 311401
2 2 2 2 2 2 2 2 2 2
> anyDuplicated(x.t)
[1] 0
After all the testing and head scratching it seems that the problem I'm having is timezone related. Setting the environment to the same time zone as the original data makes it work just fine.
Sys.setenv(TZ="GMT")
> z3<-rbind(z2,t.tmp)
> nrow(z3)
[1] 520713
Thanks to how to guard against accidental time zone conversion for the inspiration to look in that direction.
How to read the following vector "c" of strings into a list of tables? Which way is the shortest read.table strsplit? e.g. I cant see how to read the table Edit:c[4:6] a[4:6] in one command.
require(car)
m<-matrix(rnorm(16),4,4,byrow=T)
a<-Anova(lm(m~1),type=3,idata=data.frame(treatment=factor(1:4)),idesign=~treatment)
c<-capture.output(summary(a,multivariate=F))
c
This returns lines 4:6
c[4:6]
Now if you wanted to parse this I would do it in two steps. First on the column values from rows 5:6 and then add back the names.
> vals <- read.table(text=c[5:6])
> txt <- " \t SS\t num Df\t Error SS\t den Df\t F\t Pr(>F)"
> names(vals) <- names(read.delim(text=txt))
> vals
X SS num.Df Error.SS den.Df F Pr..F.
1 (Intercept) 0.57613392 1 0.4219563 3 4.09616 0.13614
2 treatment 1.85936442 3 8.2899759 9 0.67287 0.58996
EDIT --
you could look at the source code of the summary function and calculate the quantities required by yourself
getAnywhere(summary.Anova.mlm)
The original idea seems not to work.
c2 <- summary(a)
# find out what 'properties' the summary object has
# turns out, it is just the Anova object
class(c2) <- "list"
names(c2)
This returns
[1] "SSP" "SSPE" "P" "df" "error.df"
[6] "terms" "repeated" "type" "test" "idata"
[11] "idesign" "icontrasts" "imatrix" "singular"
and we can get access them
c2$SSP
c2$SSPE
It seems not a good idea to use R internal c function as a variable name