I'm using the R table() function, but it only gives me 4222 rows. Is there some kind of configuration to accept more rows?
The table function is not limited to 4222 rows. Most likely it is the printing limit that is giving you trouble.
Try:
options(max.print = 20000)
Also, check the "real" number of rows:
tbl <- table(state.division, state.region)
nrow(tbl)
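As a quick check that the limit is only in printing, here is a minimal sketch with made-up factors (x and y are illustrative, not your data). It shows that the dimensions of the table do not depend on max.print, and that you can look at a slice instead of printing everything:

```r
# Illustrative data: 5000 distinct levels, so the table has 5000 rows
x <- factor(1:5000)
y <- factor(rep(1:3, length.out = 5000))
tbl <- table(x, y)

dim(tbl)                 # full dimensions, independent of max.print
getOption("max.print")   # current printing limit

# Raise the limit temporarily and restore it afterwards
old <- options(max.print = 20000)
first_rows <- tbl[1:10, ]   # inspect a slice rather than the whole table
print(first_rows)
options(old)
```

Restoring the old option at the end keeps the change from leaking into the rest of the session.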
Nothing wrong with larger tables? What gave you that impression?
> set.seed(123)
> fac <- factor(sample(10000, 10000, rep = TRUE))
> fac2 <- factor(sample(10000, 10000, rep = TRUE))
> tab <- table(fac, fac2)
> str(tab)
'table' int [1:6282, 1:6279] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "dimnames")=List of 2
..$ fac : chr [1:6282] "1" "5" "7" "9" ...
..$ fac2: chr [1:6279] "1" "2" "3" "4" ...
Printing tab will cause problems - it takes a while to generate and then you'll get this message:
[ reached getOption("max.print") -- omitted 6267 rows ]
You can alter that with options(max.print = XXXXX), where XXXXX is some large number. But I don't see what is gained by printing such a large table. If you were trying to check that the table produced has the correct size, then
> dim(tab)
[1] 6282 6279
> str(tab)
'table' int [1:6282, 1:6279] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "dimnames")=List of 2
..$ fac : chr [1:6282] "1" "5" "7" "9" ...
..$ fac2: chr [1:6279] "1" "2" "3" "4" ...
help with that.
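If the aim is to check the contents rather than just the size, a sketch along these lines avoids printing the whole table. It uses a smaller sample than above to keep it quick; the names small_tab and nonzero are only for illustration:

```r
set.seed(123)
fac  <- factor(sample(2000, 2000, replace = TRUE))
fac2 <- factor(sample(2000, 2000, replace = TRUE))
small_tab <- table(fac, fac2)

dim(small_tab)          # size check
sum(small_tab)          # total count equals the number of observations
small_tab[1:5, 1:5]     # look at one corner only

# Row and column indices of the non-zero cells
nonzero <- which(small_tab > 0, arr.ind = TRUE)
head(nonzero)
```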
I have a list of data.frames called tagMatrixList from the package ChIPseeker. Is there a way to generate tagHeatmap (tagMatrixList, xlim=c(-3000, 3000), color="blue") for each data frame and then save each single plot in a file?
List of 41
$ : int [1:11715, 1:6001] 0 0 0 0 0 0 0 0 1 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:11715] "5" "7" "8" "10" ...
.. ..$ : NULL
$ : int [1:9414, 1:6001] 0 0 0 0 0 0 0 0 0 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:9414] "4" "5" "7" "10" ...
.. ..$ : NULL
$ : int [1:10498, 1:6001] 0 0 0 0 0 1 0 0 0 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:10498] "4" "6" "7" "9" ...
.. ..$ : NULL
$ : int [1:6849, 1:6001] 0 0 0 0 0 0 0 0 0 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:6849] "5" "6" "10" "12" ...
.. ..$ : NULL
$ : int [1:10823, 1:6001] 0 0 0 0 0 1 0 0 0 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:10823] "6" "7" "9" "12" ...
I tried:
plots = lapply(tagMatrixList, function(x) tagHeatmap(tagMatrix, xlim=c(-3000, 3000),color="blue")) but the output is:
List of 41
$ : NULL
$ : NULL
$ : NULL
$ : NULL
So I cannot save the plots.
You need to open an image device (such as png(), pdf(), etc.) before drawing so that the plot is saved to it, and then close it with dev.off() when you're done.
Since your list doesn't have names and you want a file for each plot, we'll loop over a vector of numbers instead to both index the object, and add a file name with that index number.
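lapply(seq_along(tagMatrixList), function(i) {
  # open the device first: the plot is drawn directly (the NULL results above suggest nothing is returned)
  png(paste0("heatmap_", i, ".png"), height = 5, width = 5, units = "in", res = 300)
  # [[i]] extracts the matrix itself; [i] would give a one-element list
  tagHeatmap(tagMatrixList[[i]], xlim=c(-3000, 3000), color="blue")
  dev.off()
})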
The tidyverse package purrr has a nice convenience function for this too, called iwalk.
purrr::iwalk(tagMatrixList, function(x, y) {
  # for an unnamed list, y is the element's position
  png(paste0("heatmap_", y, ".png"), height = 5, width = 5, units = "in", res = 300)
  tagHeatmap(x, xlim=c(-3000, 3000), color="blue")
  dev.off()
})
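Since ChIPseeker may not be installed everywhere, here is a self-contained sketch of the same open-device / draw / close pattern. It uses base image() as a stand-in for tagHeatmap and random matrices as a stand-in for tagMatrixList; mats, out_dir and files are hypothetical names, and pdf() is used only because it needs no extra graphics capabilities:

```r
set.seed(1)
# Stand-in for tagMatrixList: three small count matrices
mats <- lapply(1:3, function(i) matrix(rpois(200, 1), nrow = 20))
out_dir <- tempdir()

files <- vapply(seq_along(mats), function(i) {
  f <- file.path(out_dir, paste0("heatmap_", i, ".pdf"))
  pdf(f, height = 5, width = 5)    # sizes are in inches for pdf()
  image(t(mats[[i]]), col = colorRampPalette(c("white", "blue"))(12))
  dev.off()                        # closing the device writes the file
  f
}, character(1))

file.exists(files)
```

Returning the file name from each iteration makes it easy to check afterwards that every plot was actually written.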
I googled my error, but that didn't help me.
I have a data frame with a column x.
unique(df$x)
The result is:
[1] "fc_social_media" "fc_banners" "fc_nat_search"
[4] "fc_direct" "fc_paid_search"
When I try this:
df <- spread(data = df, key = x, value = x, fill = "0")
I got the error:
Error in `[.data.frame`(data, setdiff(names(data), c(key_var, value_var))) :
undefined columns selected
But that is very weird, because I have used the spread function several times in the same script without problems.
So I googled and found some "solutions":
I removed all the "special" characters. As you can see, my unique
values do not contain special characters (cleaned it). But this didn't
help.
I checked if there are any columns with the same name. But all column names
are unique.
#Gregor, #Akrun:
> str(df)
'data.frame': 100 obs. of 22 variables:
$ visitor_id : chr "321012312666671237877-461170125342559040419" "321012366667112237877-461121705342559040419" "321012366661271237877-461170534255901240419" "321012366612671237877-461170534212559040419" ...
$ visit_num : chr "1" "1" "1" "1" ...
$ ref_domain : chr "l.facebook.com" "X.co.uk" "x.co.uk" "" ...
$ x : chr "fc_social_media" "fc_social_media" "fc_social_media" "fc_social_media" ...
$ va_closer_channel : chr "Social Media" "Social Media" "Social Media" "Social Media" ...
$ row : int 1 2 3 4 5 6 7 8 9 10 ...
$ : chr "0" "0" "0" "0" ...
$ Hard Drive : chr "0" "0" "0" "0" ...
The error could be due to a column without a name, i.e. "" (your str output shows one such column). Using a reproducible example:
library(tidyr)
spread(df, x, x)
Error in `[.data.frame`(data, setdiff(names(data), c(key_var,
value_var))) : undefined columns selected
We could make it work by changing the column name
names(df) <- make.names(names(df))
spread(df, x, x, fill = "0")
# X fc_banners fc_direct fc_nat_search fc_paid_search fc_social_media
#1 1 0 0 0 0 fc_social_media
#2 2 fc_banners 0 0 0 0
#3 3 0 0 fc_nat_search 0 0
#4 4 0 fc_direct 0 0 0
#5 5 0 0 0 fc_paid_search 0
data
df <- data.frame(x = c("fc_social_media", "fc_banners",
"fc_nat_search", "fc_direct", "fc_paid_search"), x1 = 1:5, stringsAsFactors = FALSE)
names(df)[2] <- ""
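To find such columns in your own data before calling spread, a minimal base R sketch is shown below. The names bad and fixed are illustrative, and the data frame is a cut-down version of the example above:

```r
df <- data.frame(x = c("fc_social_media", "fc_banners"), x1 = 1:2,
                 stringsAsFactors = FALSE)
names(df)[2] <- ""   # reproduce the unnamed column

# Positions of columns whose name is empty or missing
bad <- which(is.na(names(df)) | names(df) == "")
bad

# make.names() replaces the empty name with a syntactically valid one
fixed <- df
names(fixed) <- make.names(names(fixed), unique = TRUE)
names(fixed)
```

Passing unique = TRUE also guards against two columns ending up with the same repaired name.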
I have a problem with data conversion in R.
I have two objects stored in variables named lung.X and lung.y; they are described below.
> str(lung.X)
chr [1:86, 1:7129] " 170.0" " 104.0" " 53.7" " 119.0" " 105.5" " 130.0" ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:86] "V3" "V4" "V5" "V6" ...
..$ : chr [1:7129] "A28102_at" "AB000114_at" "AB000115_at" "AB000220_at" ...
and
> str(lung.y)
num [1:86] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 ...
lung.X is a matrix (86 rows, 7129 columns) and lung.y is a numeric vector (86 entries).
Does anyone know how to convert lung.X into the format below?
> str(lung.X)
num [1:86, 1:7129] 170 104 53.7 119 105.5 130...
I thought I should do it like this
lung.X <- as.numeric(lung.X)
but I got this instead
> str(lung.X)
num [1:613094] 170 104 53.7 119 105.5 130...
I need this because lung.X has to be numeric only.
Thank you.
You could change the mode of your matrix to numeric:
## example data
m <- matrix(as.character(1:10), nrow=2,
dimnames = list(c("R1", "R2"), LETTERS[1:5]))
m
# A B C D E
# R1 "1" "3" "5" "7" "9"
# R2 "2" "4" "6" "8" "10"
str(m)
# chr [1:2, 1:5] "1" "2" "3" "4" ...
# - attr(*, "dimnames")=List of 2
# ..$ : chr [1:2] "R1" "R2"
# ..$ : chr [1:5] "A" "B" "C" "D" ...
mode(m) <- "numeric"
str(m)
# num [1:2, 1:5] 1 2 3 4 5 6 7 8 9 10
# - attr(*, "dimnames")=List of 2
# ..$ : chr [1:2] "R1" "R2"
# ..$ : chr [1:5] "A" "B" "C" "D" ...
m
# A B C D E
# R1 1 3 5 7 9
# R2 2 4 6 8 10
Give this a try: m <- matrix(as.numeric(lung.X), nrow = 86, ncol = 7129, dimnames = dimnames(lung.X))
If you need it in dataframe/list format, df <- data.frame(m)
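To see both approaches side by side on data shaped like yours, here is a small sketch. X is a made-up 2 x 2 stand-in for lung.X with the same leading spaces in the strings, and by_mode and by_dim are illustrative names:

```r
X <- matrix(c(" 170.0", " 104.0", "  53.7", " 119.0"), nrow = 2,
            dimnames = list(c("V3", "V4"), c("A28102_at", "AB000114_at")))

# Option 1: change the mode in place; dim and dimnames are kept
by_mode <- X
mode(by_mode) <- "numeric"

# Option 2: convert to a plain vector, then put the attributes back
by_dim <- as.numeric(X)
dim(by_dim) <- dim(X)
dimnames(by_dim) <- dimnames(X)

str(by_mode)
all.equal(by_mode, by_dim)
```

The leading whitespace is not a problem here, since as.numeric ignores it when parsing the strings.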
I know there is a lot of information in Google about this problem, but I could not solve it.
I have a data frame:
> str(myData)
'data.frame': 1199456 obs. of 7 variables:
$ A: num 3064 82307 4431998 1354 193871 ...
$ B: num 6067 403916 2709997 2743 203434 ...
$ C: num 299 11752 33282 170 2748 ...
$ D: num 105 6676 7065 20 1593 ...
$ E: num 8 572 236 3 170 ...
$ F: num 0 21 95 0 13 ...
$ G: num 583 18512 961328 348 42728 ...
Then I convert it to a matrix in order to apply the Cramer-von Mises test from "cramer" library:
> myData = as.matrix(myData)
> str(myData)
num [1:1199456, 1:7] 3064 82307 4431998 1354 193871 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:1199456] "8" "32" "48" "49" ...
..$ : chr [1:7] "A" "B" "C" "D" ...
After that, if I apply a "cramer.test(myData[x1:y1,], myData[x2:y2,])" I get the following error:
Error in rep(0, (RVAL$m + RVAL$n)^2) : invalid 'times' argument
In addition: Warning message:
In matrix(rep(0, (RVAL$m + RVAL$n)^2), ncol = (RVAL$m + RVAL$n)) :
NAs introduced by coercion
I also tried to convert the data frame to a matrix like this, but the error is the same:
> myData = as.matrix(sapply(myData, as.numeric))
> str(myData)
num [1:1199456, 1:7] 3064 82307 4431998 1354 193871 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:7] "A" "B" "C" "D" ...
Your problem is that your data set is too large for the algorithm that cramer.test is using (at least the way it's coded). The code tries to create a lookup table according to
lookup <- matrix(rep(0, (RVAL$m + RVAL$n)^2),
ncol = (RVAL$m + RVAL$n))
where RVAL$m and RVAL$n are the number of rows of the two samples. The standard maximum length of an R vector is 2^31-1 on a 32-bit platform. If each of your samples has N rows, you'll be trying to create a vector of length (2*N)^2, which for N = 1199456 is 5.754779e+12 -- probably too big even if R would let you create the vector.
You may have to look for another implementation of the test, or another test.
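The arithmetic can be checked directly. The sketch below assumes two samples of N rows each and 8 bytes per element (rep(0, ...) creates a double vector); the 4 GiB budget is a hypothetical figure, only there to show how the feasible sample size could be estimated:

```r
N <- 1199456              # rows per sample, as assumed above
cells <- (2 * N)^2        # length of the vector behind the lookup matrix
cells
cells > .Machine$integer.max   # beyond the 2^31 - 1 limit?

bytes <- cells * 8        # 8 bytes per double
bytes / 2^40              # size in TiB

# Largest N per sample whose lookup table fits in a hypothetical 4 GiB
budget_bytes <- 4 * 2^30
max_N <- floor(sqrt(budget_bytes / 8) / 2)
max_N
```

So even with long-vector support, the table is far larger than ordinary memory; running the test on random subsamples of a size near max_N might be one way around it, if that is acceptable for your analysis.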
I've imported a dataset into R in which a column that is supposed to contain numeric values also contains NULL. This makes R set the column class to character or factor, depending on whether you use the stringsAsFactors argument.
To give you an idea, this is the structure of the dataset.
> str(data)
'data.frame': 1016 obs. of 10 variables:
$ Date : Date, format: "2014-01-01" "2014-01-01" "2014-01-01" "2014-01-01" ...
$ Name : chr "Chi" "Chi" "Chi" "Chi" ...
$ Impressions: chr "229097" "3323" "70171" "1359" ...
$ Revenue : num 533.78 11.62 346.16 3.36 1282.28 ...
$ Clicks : num 472 13 369 1 963 161 1 7 317 21 ...
$ CTR : chr "0.21" "0.39" "0.53" "0.07" ...
$ PCC : chr "32" "2" "18" "0" ...
$ PCOV : chr "3470.52" "94.97" "2176.95" "0" ...
$ PCROI : chr "6.5" "8.17" "6.29" "NULL" ...
$ Dimension : Factor w/ 11 levels "100x72","1200x627",..: 1 3 4 5 7 8 9 10 11 1 ...
I would like to convert the PCROI column to numeric, but the NULLs make this harder.
I've tried to get around the issue setting the value 0 to all observations where current value is NULL, but I got the following error message:
> data$PCROI[which(data$PCROI == "NULL"), ] <- 0
Error in data$PCROI[which(data$PCROI == "NULL"), ] <- 0 :
incorrect number of subscripts on matrix
My idea was to change all the NULL observations to 0 and afterwards convert the whole column to numeric using the as.numeric function.
You have an indexing error: data$PCROI is a vector, so it takes a single subscript with no comma.
data$PCROI[which(data$PCROI == "NULL"), ] <- 0 # will not work
data$PCROI[which(data$PCROI == "NULL")] <- 0 # will work
By the way, since the column is character you can simply say:
data$PCROI = as.numeric(data$PCROI)
It will convert your "NULL" to NA automatically (with an "NAs introduced by coercion" warning).
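Here is a small sketch of both routes on made-up values (pcroi, csv_text and dat are illustrative names, not your data). The first part converts an existing character column without the coercion warning; the second uses the na.strings argument of read.csv, so the column is numeric from the moment it is imported:

```r
# Route 1: mark "NULL" as missing, then convert
pcroi <- c("6.5", "8.17", "6.29", "NULL")
pcroi[pcroi == "NULL"] <- NA
pcroi_num <- as.numeric(pcroi)
pcroi_num

# Route 2: treat "NULL" as NA while reading the file
csv_text <- "Name,PCROI\nChi,6.5\nChi,NULL\n"
dat <- read.csv(text = csv_text, na.strings = "NULL")
str(dat)
```

If you really want 0 instead of NA, you can replace afterwards with pcroi_num[is.na(pcroi_num)] <- 0, though keeping NA usually represents missing values more honestly.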