R character strings are limited to 2^31-1 bytes

I am working with the neo4r library in R. When I use this function
call_neo4j(con, type = "graph")
I get the error
Error in readBin(content, character()) : R character strings are limited to 2^31-1 bytes
Anyone have any idea about it?

As the error would suggest, you reached the limit on the size of character strings. From the documentation:
The number of bytes in a character string is limited to 2^31 - 1 ~ 2*10^9, which is also the limit on each dimension of an array.
Without more information, we can't help you solve this problem; see here for how to create a minimal reproducible example that would help us diagnose it.
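If you want to check whether the response itself is what hits the limit, a rough sketch along these lines might help (content and the file name below are placeholders for illustration, not part of the neo4r API):
content <- c(charToRaw("example payload"), as.raw(0L))  # stand-in for the raw response body

if (length(content) < 2^31 - 1) {
  txt <- readBin(content, character())     # fits in a single R string
} else {
  writeBin(content, "neo4j_response.bin")  # too large for one string; dump to disk instead
}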

Related

Understanding vector/mean functions in R

Hi, I was looking to see if anyone could help me.
So I am new to using R and I'm working through a workbook provided by my university, trying the functions out in R and changing various numbers to see how this changes the output.
Generally, I have been understanding most of it, but I don't quite understand some of the following code.
Could anyone explain to me what
vec.mean <- numeric(N)
means, as I'm not entirely sure what this is doing.
Thanks in advance for the help.
If you read the help file with help(numeric):
Description -
Creates or coerces objects of type "numeric".
Arguments -
length
A non-negative integer specifying the desired length. Double values will be coerced to integer: supplying an argument of length other than one is an error.
The first argument of numeric() is length =. Therefore, numeric(N) creates a numeric vector of length N; for example, numeric(500) creates a numeric vector of 500 zeros.
Pre-allocating the vector like this is considered best practice: filling vec.mean inside the for loop that follows is faster when the vector already exists than when it is grown one element at a time.
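A minimal sketch of how this pattern is typically used (N and the loop body are assumptions for illustration, not taken from your workbook):
N <- 500
vec.mean <- numeric(N)            # pre-allocate a vector of N zeros

for (i in seq_len(N)) {
  vec.mean[i] <- mean(rnorm(30))  # fill each slot, e.g. with the mean of a random sample
}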

ImpulseDE2, matrix counts contains non-integer elements

Possibly it's a stupid question (but be patient, I'm a beginner in R's world)... I'm working with ImpulseDE2, a package designed for RNA-seq data analysis across different time points (see the article for more information).
The running function (runImpulseDE2) requires a count matrix and an annotation data frame. I've created both, but I get this error message:
Error in checkCounts(matCountData, "matCountData"): ERROR: matCountData contains non-integer elements. Requires count data.
I have tried some solutions and nothing seems to work (and I've not found any solution on the Internet)...
as.matrix(data)
(data + 1) > there are no NAs or zero values that could cause this error: both which(is.na(data)) and which(data < 1) return integer(0)
as.numeric(data) > this produces another error: ERROR: [Rownames of matCountData] was not given as input.
I think that's something I'm not realizing, but I'm totally locked. Every tip will be welcome!
And here is the (silly) solution! This function seems not to accept floating-point numbers... so applying a simple round() is enough to fix this error.
Thanks for your help!
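For anyone hitting the same message, a hedged sketch of that fix (the matrix values and names below are made up for illustration):
matCountData <- matrix(c(10.2, 3.7, 0.9, 25.1), nrow = 2,
                       dimnames = list(c("geneA", "geneB"), c("sample1", "sample2")))

matCountData <- round(matCountData)       # ImpulseDE2 expects integer counts
storage.mode(matCountData) <- "integer"   # optionally store them as integers as well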

Are as.character() and paste() limited by the size of the numeric values they are given?

I'm running into some problems with the R function as.character() and paste(): they do not give back what they're being fed...
as.character(1415584236544311111)
## [1] "1415584236544311040"
paste(1415584236544311111)
## [1] "1415584236544311040"
What could be the problem, and is there a workaround to paste my number as a string?
update
I found that using the bit64 library allowed me to retain the extra digits I needed with the function as.integer64().
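A minimal sketch of that workaround (assuming bit64 is installed; pass the value as a string so it never goes through a double literal):
library(bit64)
x <- as.integer64("1415584236544311111")
as.character(x)
## [1] "1415584236544311111"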
Remember that numbers are stored in a fixed number of bytes, depending on the hardware you are running on. Can you show that your very big integer is treated properly by normal arithmetic operations? If not, you're probably trying to store a number too large to represent exactly in the numeric type your R installation uses; the number you see is just the closest value that fit.
You could try storing the number as a double which is technically less precise but can store larger numbers in scientific notation.
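For example, a quick check along the lines suggested above, using the value from the question:
x <- 1415584236544311111
x + 1 == x           # TRUE: adding 1 changes nothing at this magnitude
## [1] TRUE
sprintf("%.0f", x)   # the nearest double that was actually stored
## [1] "1415584236544311040"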
EDIT
Consider the answers in long/bigint/decimal equivalent datatype in R which list solutions including arbitrary precision packages.

Data length is not power of two / Sample size is not divisible by 2^J (wavelet analysis)

I have multiple time-series of length 149 and I would like to denoise them by using wavelet transformations.
This is an example of my data:
t=ts(rnorm(149,5000,1000),start=1065,end=1213)
When I try using the packages wavethresh and waveslim, they both run into the same underlying problem:
library(wavethresh)
wd(t)
Error in wd(t) : Data length is not power of two
library(waveslim)
dwt(t)
Error in dwt(t) : Sample size is not divisible by 2^J
I understand that my data should be of length 2^x, but I can't overcome this problem. I thought the function up.sample() in waveslim was supposed to help with this, but it didn't do the trick (e.g. up.sample(t, 2^8) gives a vector of length 38144). So how do I increase my vector length without introducing an error? I know I could pad with zeros,... but I want to know the best way to do this.
Also, when looking at the example in waveslim, it looks as if the length of the input series doesn't fulfil this requirement either (although the example of course does work):
data(ibm)
ibm.returns <- diff(log(ibm))
ibmr.haar <- dwt(ibm.returns, "haar") #works
log2(length(ibm.returns))
[1] 8.523562
I feel like I'm missing something basic, but I can't figure it out.
Thanks for any help.
Ps: I know I can use other techniques to do this, but I really want to test this approach.
I had a look into the code of dwt, and the reason why it works is that dwt does not check whether the length is a power of 2 but whether the length is a multiple of 2^J (which is exactly what the error message says: Error in dwt(t) : Sample size is not divisible by 2^J).
With the default J = 4, the length of your time series must thus be a multiple of 16. As you were assuming, up.sample can be used to overcome this issue, as it pads the time series with 0's. But you don't provide the final length; you provide the upsampling factor.
Thus
dwt(up.sample(t, 16, 0))
should do the trick.
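Putting that together with the example series from the question (the seed is only added here for reproducibility):
library(waveslim)

set.seed(1)
t <- ts(rnorm(149, 5000, 1000), start = 1065, end = 1213)

t.padded <- up.sample(t, 16, 0)  # pads with zeros; length becomes 149 * 16 = 2384
length(t.padded) %% 2^4          # 0, so the length is divisible by 2^J with the default J = 4
t.dwt <- dwt(t.padded)           # no longer errors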

R claims that data is non-numeric, but after writing to file is numeric

I have read a table into R, and am trying to take the log of the data. This gives me an error that the last column contains non-numeric values:
> log(TD_complete)
Error in Math.data.frame(list(X2011.01 = c(187072L, 140815L, 785077L, :
non-numeric variable in data frame: X2013.05
The data "looks" numeric, i.e. when I read it my brain interprets it as numbers. I can't be totally wrong since the following will work:
> write.table(TD_complete,"C:\\tmp\\rubbish.csv", sep = ",")
> newdata = read.csv("C:\\tmp\\rubbish.csv")
> log(newdata)
The last line will happily output numbers.
This doesn't make any sense to me - either the data is numeric when I read it in the first time round, or it is not. Any ideas what might be going on?
EDIT: Unfortunately I can't share the data, it's confidential.
Review the colClasses argument of read.csv(), where you can specify what type each column should be read and stored as. That might not be so helpful if you have a large number of columns, but using it makes sure R doesn't have to guess what type of data you're using.
Just because "the last line will happily output numbers" doesn't mean R is treating the values as numeric.
Also, it would help to see some of your data.
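A minimal sketch of what that could look like (the file name and the choice to force every column to numeric are assumptions for illustration):
TD_complete <- read.csv("TD_complete.csv", colClasses = "numeric")  # make every column numeric
sapply(TD_complete, class)                                          # check how each column was stored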
If you provide the actual data or a sample of it, help will be much easier.
In this case I assume R has the column in question stored as a string (or factor) and writes it out to the CSV file as text. When reading it back in, read.csv sees a field that contains nothing but digits and interprets it as a number. In other words, by writing and reading a CSV file you converted a string containing only digits into a proper integer (or double).
But without the actual data or the rest of the code this is mere conjecture.
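If that conjecture is right, the round trip through a CSV file can be skipped by converting the offending column directly (X2013.05 is the column named in the error message):
TD_complete$X2013.05 <- as.numeric(as.character(TD_complete$X2013.05))  # works for factor or character columns
log(TD_complete)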
