I have a set of 0's and 1's represented as a list initially created with sample(c(0,1), n, replace=TRUE), where n is the length of my binary number. I'm currently using a BCD converter to convert my binary number to a decimal number, as seen here:
BCD.to.Decimal <- function(binaryNumb)
{
  binaryLength = length(binaryNumb)
  decimalNumb = 0
  for (i in 1:binaryLength)
  {
    if (binaryNumb[i] == 1)
      decimalNumb = decimalNumb + 2^(binaryLength - i)
  }
  decimalNumb
}
I would like to use a GrayCode.to.Decimal converter instead, which does the same job as my BCD.to.Decimal converter but takes Gray code as input.
Note: Speed DOES matter here, and I would like to do this in the most efficient way possible. I'm aware that my BCD converter is probably not the most efficient; it's just the simplest. If you have a significantly more efficient way of handling the BCD conversion, I'd also be interested in hearing about that.
What is Gray Code?: http://en.wikipedia.org/wiki/Gray_code
Well, there's a conversion algorithm on that Wiki page, albeit in C, so you'll have to port it.
Again on the wiki page, there's this link http://aggregate.org/MAGIC/#Gray%20Code%20Conversion
which lists a number of conversion algorithms, most of which appear pretty simple to code up.
BTW, oh whatever: GA::gray2binary and GA::binary2gray already exist. bah :-)
ETA - I was lucky enough to find this via Mr. Google, but in general the package sos is a great R-search tool.
Here is the simple solution to my question; the algorithm ended up being much easier than it first appeared. The algorithm used can be found here.
GrayCode.to.Decimal <- function(grayNumb)
{
  binaryNumb = vector("numeric", length(grayNumb))
  binaryNumb[1] = grayNumb[1]
  for (i in 2:length(grayNumb))
  {
    binaryNumb[i] = xor(grayNumb[i], binaryNumb[i - 1])
  }
  return(Binary.to.Decimal(binaryNumb))
}
This code converts the Gray code into binary, after which a binary-to-decimal conversion finishes the job. I'm choosing to use the code provided by flodel in the comments section.
Binary.to.Decimal <- function(binaryNumb)
{
  sum(2L^(seq_along(binaryNumb) - 1L) * rev(binaryNumb))
}
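Side note: because the bits are all 0/1, the cumulative XOR that turns Gray code into binary is just a running parity, so the whole conversion can be vectorized with cumsum(). A minimal sketch of that idea (the function name is my own):
GrayCode.to.Decimal.vec <- function(grayNumb)
{
  # The i-th binary bit is the XOR of the first i Gray bits,
  # i.e. the parity of their running sum
  binaryNumb <- cumsum(grayNumb) %% 2
  sum(2L^(seq_along(binaryNumb) - 1L) * rev(binaryNumb))
}
This drops the explicit loop entirely, which should help given that speed matters here.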
Related
I am trying to learn how to work with nls.lm in the R library minpack.lm by using the Rosenbrock function to see if the algorithm converges to the global minimum at f(x,y) = (1,1). I do so both with and without the analytic Jacobian. In both instances, I get a warning telling me that the algorithm has decided to revert the maximum number of iterations specified in the call to nls.lm to 1024:
Warning messages:
1: In nls.lm(par = initpar, fn = objective_rosenbrock, jac = gradient_rosenbrock, :
resetting `maxiter' to 1024!
2: In nls.lm(par = initpar, fn = objective_rosenbrock, jac = gradient_rosenbrock, :
lmder: info = -1. Number of iterations has reached `maxiter' == 1024.
The algorithm never quite reaches (1,1) as a result given my initial guess of (-1.2, 1.0). I found the source code for the library on GitHub and the following lines of code are pertinent here:
https://github.com/cran/minpack.lm/blob/master/src/nls_lm.c
OS->maxiter = INTEGER_VALUE(getListElement(control, "maxiter"));
if (OS->maxiter > 1024) {
    OS->maxiter = 1024;
    warning("resetting `maxiter' to 1024!");
}
Is there any logic to why the maximum number of iterations is capped to 1024? Something with bits and 2^10? I would like to use the library for a different application, but this cap on iterations might prevent that. Any insight would be appreciated.
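For reference, a minimal sketch that reproduces the warning; the residual form of the Rosenbrock function and the control setting are assumptions on my part, not taken from the original code:
library(minpack.lm)

# Rosenbrock residuals: the sum of squares is (1 - x)^2 + 100*(y - x^2)^2,
# with the global minimum at (1, 1)
objective_rosenbrock <- function(par) {
  c(10 * (par[2] - par[1]^2), 1 - par[1])
}

initpar <- c(-1.2, 1.0)
# Requesting more than 1024 iterations triggers the "resetting" warning
fit <- nls.lm(par = initpar, fn = objective_rosenbrock,
              control = nls.lm.control(maxiter = 2000))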
Git blame says that this code limiting the max iterations was introduced in version 1.1-0, in 2008. The NEWS file for the package only goes back as far as version 1.1-6, and I can't find the code in any public repo other than the one you point to (which is only a CRAN mirror, so it doesn't contain any comments, commit messages, etc. from the developers that might give us clues).
Other than contacting the maintainer I think it's going to be hard to figure out what the rationale is for this limit.
I do have some guesses though.
The only places that maxiter is actually used in the code are here and here - in R code, not Fortran or C code, so it seems extremely unlikely that we are dealing with something like a 10-bit unsigned integer type (which seems an unlikely choice in any case). I think the limitation is there because we also have a buffer defined for holding trace information here:
double rsstrace[1024];
which, as you can see, is hard-coded to a length of 1024. Presumably bad things would happen if we tried to stuff 1025 iterations'-worth of tracing information into this array ...
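Incidentally, that trace is visible from the R side: the object returned by nls.lm carries the residual sum of squares at each iteration, so the buffer length is effectively user-facing. Using the fit from the sketch in the question above:
# rsstrace is filled from that fixed 1024-element C buffer,
# one entry per iteration
length(fit$rsstrace)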
My suggestions:
change all instances of '1024' in the code to something larger and see what happens. There are only four:
$ find . -type f -exec grep -Hn 1024 {} \;
./src/nls_lm.c:141: if(OS->maxiter > 1024) {
./src/nls_lm.c:142: OS->maxiter = 1024;
./src/nls_lm.c:143: warning("resetting `maxiter' to 1024!");
./src/minpack_lm.h:20: double rsstrace[1024];
it would be best to #define MAXITER 2048 (or whatever) in src/minpack_lm.h and use that instead of the numerical value.
Contact the maintainer (maintainer("minpack.lm")) and ask them about this issue.
I am trying to better understand functions and found an example online, but I can't get it to work.
I want to solve an equation for a, which has two arguments v and r, with v = 10 and r = 3. Here is my code. What am I missing? Thanks for your insights.
solve <- function(r=3, v=10) {
  a <- pi*r*(sqrt(r^2+(9*v^2)/pi^2*r^4))
}
return(a)
Based on the inputs, here is the updated code. But it looks like the result is not accurate.
solve <- function(r, v) {
  a <- pi*r*(sqrt(r^2+(9*v^2)/pi^2*r^4))
  return(a)
}
solve(3,10)
R is giving me a result of 810.4933. But the example says the result is 29.9906.
Here is the formula for A:
A = πr √(r² + 9v² / (π²r⁴))
You need to know the order of operations within math expressions. If you read ?Ops (kind of obscure, granted), you'll see
2. Group '"Ops"':
• '"+"', '"-"', '"*"', '"/"', '"^"', '"%%"', '"%/%"'
• '"&"', '"|"', '"!"'
• '"=="', '"!="', '"<"', '"<="', '">="', '">"'
This shows that * and / share the same precedence and are evaluated left to right. Unfortunately, your denominator of
... / pi^2*r^4
is being interpreted as
(... / pi^2) * (r^4)
which brings r^4 into the numerator.
Add parens to enforce the order of operations.
.../(pi^2*r^4)
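With that fix in place, a corrected version looks like the sketch below; I've renamed the function so it doesn't mask base R's solve() (the name is my own choice):
area <- function(r, v) {
  # Parentheses keep pi^2 * r^4 together in the denominator
  pi * r * sqrt(r^2 + (9 * v^2) / (pi^2 * r^4))
}
area(3, 10)  # 29.9906, matching the expected result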
I have read these SO posts on getting RStudio to print output without truncating:
list output truncated - How to expand listed variables with str() in R
avoid string printed to console getting truncated (in RStudio)
The answers there involve making an adjustment to RStudio settings, which would then cover all future output to the console.
Is there an ad hoc way to get R to print an entire string to the console?
I tried:
library(tidyverse)
library(foreach)
mystring <- foreach(i = 1:52) %do% {
  paste0("'_gaWeek", i, "'!A16:B;")
} %>% unlist %>% toString()
print(mystring, len = length(mystring))
> print(mystring, len = length(mystring))
[1] "'_gaWeek1'!A16:B;, '_gaWeek2'!A16:B;, '_gaWeek3'!A16:B;, '_gaWeek4'!A16:B;, '_gaWeek5'!A16:B;, '_gaWeek6'!A16:B;, '_gaWeek7'!A16:B;, '_gaWeek8'!A16:B;, '_gaWeek9'!A16:B;, '_gaWeek10'!A16:B;, '_gaWeek11'!A16:B;, '_gaWeek12'!A16:B;, '_gaWeek13'!A16:B;, '_gaWeek14'!A16:B;, '_gaWeek15'!A16:B;, '_gaWeek16'!A16:B;, '_gaWeek17'!A16:B;, '_gaWeek18'!A16:B;, '_gaWeek19'!A16:B;, '_gaWeek20'!A16:B;, '_gaWeek21'!A16:B;, '_gaWeek22'!A16:B;, '_gaWeek23'!A16:B;, '_gaWeek24'!A16:B;, '_gaWeek25'!A16:B;, '_gaWeek26'!A16:B;, '_gaWeek27'!A16:B;, '_gaWeek28'!A16:B;, '_gaWeek29'!A16:B;, '_gaWeek30'!A16:B;, '_gaWeek31'!A16:B;, '_gaWeek32'!A16:B;, '_gaWeek33'!A16:B;, '_gaWeek34'!A16:B;, '_gaWeek35'!A16:B;, '_gaWeek36'!A16:B;, '_gaWeek37'!A16:B;, '_gaWeek38'!A16:B;, '_gaWeek39'!A16:B;, '_gaWeek40'!A16:B;, '_gaWeek41'!A16:B;, '_gaWeek42'!A16:B;, '_gaWeek43'!A16:B;, '_gaWeek44'!A16:B;, '_gaWeek45'!A16:B;, '_gaWeek46'!A16:B;, '_gaWeek47'!A16:B;, '_gaWeek48'!A16:B;, '_gaWeek49'!A16:B;, '_gaWeek50'!A16:B;, '_ga... <truncated>
It's truncated. Is there an ad hoc way around this without changing RStudio settings, such as by a function argument? I tried print() here.
Also, how do I get rid of the comma separator between each instance above?
The short answer is "no": the option limiting the print lives in the IDE itself, not in the language, and you can't control the IDE from your program (I'm assuming you're not some crazy hacker here). It's like trying to stop Windows from doing things (well, almost).
It seems to me the easiest way (ad hoc) is to change the setting, do whatever you need, then change it back. If you insist on not doing that, you need to write your own function:
myprint <- function(somestring, idelimit = 100) {
  # Print in chunks of idelimit characters so no single line
  # exceeds the console's truncation width
  for (i in seq(1, nchar(somestring), by = idelimit)) {
    print(substr(somestring, i, i + idelimit - 1))
  }
}
I'm not a fluent R coder, so let me know if you catch a syntax error. The idea is simple: idelimit should be wherever RStudio truncates (I chose 100 arbitrarily), and you do the splitting yourself so the string is printed line after line without truncation. Each iteration takes a chunk at most idelimit characters long from somestring and prints it.
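As for the comma separator: toString() is what inserts the ", " between elements. Joining with paste0() and collapse = "" avoids it entirely. A minimal sketch, using sprintf() in place of the foreach loop:
# sprintf() is vectorized over 1:52, and collapse = "" joins the
# pieces with no separator; cat() then prints without quoting
mystring <- paste0(sprintf("'_gaWeek%d'!A16:B;", 1:52), collapse = "")
cat(mystring, "\n")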
Is there a way to resize a chunked dataset in HDF5 using Julia's HDF5.jl? I didn't see anything in the documentation. Looking through the source, all I found was set_dims!(), but that cannot extend a dataset (only shrink it). Does HDF5.jl have the ability to enlarge an existing (chunked) dataset? This is a very important feature for me, and I would rather not have to call into another language.
The docs have a brief mention of extendible dimensions in hdf5.md excerpted below.
You can use extendible dimensions,
d = d_create(parent, name, dtype, (dims, max_dims), "chunk", (chunk_dims), [lcpl, dcpl, dapl])
set_dims!(d, new_dims)
where dims is a tuple of integers. For example
b = d_create(fid, "b", Int, ((1000,),(-1,)), "chunk", (100,)) #-1 is equivalent to typemax(Hsize)
set_dims!(b, (10000,))
b[1:10000] = [1:10000]
I believe I've got it figured out. The issue is that I forgot to give the dataspace a large enough max_dims. Doing that required digging into the lower-level API. The solution I found was:
dspace = HDF5.dataspace((6,20)::Dims, max_dims=(6,typemax(Int64)))
dtype = HDF5.datatype(Float64)
dset = HDF5.d_create(prt, "trajectory", dtype, dspace, "chunk", (6,10))
Once I created a dataset that can be resized appropriately, the set_dims! function resizes the dataset correctly.
I think I located a few minor issues with the API, which I had to work around or change in my local version. I will get in touch with the HDF5.jl owner regarding those. For those interested:
The constant H5S_UNLIMITED is of type Uint64, but the dataspace function will only accept tuples of Int64, which is why I used typemax(Int64) for my max_dims to imitate how H5S_UNLIMITED is derived.
The form of d_create which I used calls h5d_create incorrectly; it passes parent instead of checkvalid(parent).id (can be seen by comparison with other forms of d_create).
I have the following code running and it's taking a long time. How do I know whether it's still doing its job or it got stuck somewhere?
noise4 <- NULL
for (i in 1:length(noise3))
{
  if (is.na(noise3[i]))
  {
    next
  }
  else
  {
    noise4 <- c(noise4, noise3[i])
  }
}
noise3 is a vector with 2418233 data points.
You just want to remove the NA values. Do it like this:
noise4 <- noise3[!is.na(noise3)]
This will be pretty much instant.
Or as Joshua suggests, a more readable alternative:
noise4 <- na.omit(noise3)
Your code was slow because:
It uses explicit loops, which tend to be slow under the R interpreter.
It reallocates memory on every iteration.
The memory reallocation is probably the biggest handicap in your code.
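You can see the reallocation cost in isolation with a toy comparison (a minimal sketch; sizes picked arbitrarily):
x <- numeric(0)
system.time(for (i in 1:1e5) x <- c(x, i))   # grows (and copies) x every pass

y <- numeric(1e5)                            # pre-allocated once
system.time(for (i in 1:1e5) y[i] <- i)      # just fills in place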
I wanted to illustrate the benefits of pre-allocation, so I tried to run your code... but I killed it after ~5 minutes. I recommend you use noise4 <- na.omit(noise3) as I said in my comments. This code is solely for illustrative purposes.
# Create some random data
set.seed(21)
noise3 <- rnorm(2418233)
noise3[sample(2418233, 100)] <- NA
noise <- function(noise3) {
  # Pre-allocate the result to its final length
  noise4 <- vector("numeric", sum(!is.na(noise3)))
  j <- 1
  for (i in seq_along(noise3)) {
    if (!is.na(noise3[i])) {
      noise4[j] <- noise3[i]
      j <- j + 1
    }
  }
  noise4
}
system.time(noise(noise3)) # MUCH less than 5+ minutes
# user system elapsed
# 9.50 0.44 9.94
# Let's see what we gain from compiling
library(compiler)
cnoise <- cmpfun(noise)
system.time(cnoise(noise3)) # a decent reduction
# user system elapsed
# 3.46 0.49 3.96
The other answers have given you much, much better ways to do the task that you actually set out to achieve (removing NA values from your data). But to answer the specific question you asked ("how do I know if R is actually working, or if it has instead gotten stuck?"): introduce some output (cat) statements in your loop, as follows:
rpt <- 10000  ## reporting interval
noise4 <- NULL
for (i in 1:length(noise3))
{
  if (i %% rpt == 0) cat(i, "\n")
  if (is.na(noise3[i]))
  {
    next
  }
  else
  {
    noise4 <- c(noise4, noise3[i])
  }
}
If you run this code you can immediately see that it slows down radically as it gets farther into the loop (a consequence of the failure to pre-allocate space) ...
The others have all given correct ways to do the same problem, so you needn't worry about speed. @BenBolker also gave a good pointer regarding regular output.
A different thing to note is that if you find yourself stuck in a loop, you can break out of it and inspect the current value of i. Assuming that restarting from that value of i won't harm things, i.e. using that value twice won't be a problem, you can restart. Or you can just finish the job as the others have described.
A separate trick: if the loop is slow (and can't be vectorized, or you're not eager to break out of it), AND you don't have any reporting, you can still use an external method to see whether R is actually consuming cycles on your computer. On Linux, the top command is your best bet; on Windows, the Task Manager will do the trick (I prefer the SysInternals / Microsoft program Process Explorer). top also exists on Macs, though I believe there are other more popular tools.
One other word of advice: if you have a really long loop to run, I strongly encourage saving the results regularly. I typically create a file with a name like myPrefix_YYYYMMDDHHMMSS.rdat. This way everything can go to hell and you can still restart your loop where you left off.
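A minimal sketch of that checkpointing idea (the file-name pattern and the objects saved are just examples):
# Save intermediate state under a timestamped name; reload with load(fname)
fname <- format(Sys.time(), "myPrefix_%Y%m%d%H%M%S.rdat")
save(noise4, i, file = fname)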
I don't always iterate, but when I do, I use these tricks. Stay speedy, my friend.