What in this loop rewrites b to 29? - r

I have the following loop:
n_range <- 4:29
bResultsStra <- vector("double", 27)
for (b in n_range) {
bRowsStra<-Stratified[sample(nrow(Stratified), b), ]
base <- read.table("base.csv", header=T, sep="," )
base$area<-5036821
base$quadrea <- base$area * 16
bootStra <- boot(data=bRowsStra$count, average, R=1000)
base$data<- bootStra$t
base$popsize<-(base$data*base$quadrea)
basesortStra <- base[order(base$popsize),]
bValue <- basesortStra$popsize[975] - basesortStra$popsize[25]
bResultsStra[[b - 1]] <- bValue
}
which I changed from
n_range <- 2:29
bResultsStra <- vector("double", 29)
Basically, I no longer want to take values from n=2,3.
However, when I run it and try to put the results in a data frame, I get:
Error in data.frame(n = n_range, bResultsStra) :
arguments imply differing number of rows: 26, 28
For some reason, b ends up as 29L rather than 27, which is what I expected from n_range. Why is it doing this?
Thank you

The offending line is
bResultsStra[[b - 1]] <- bValue
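With n_range running from 4 to 29, b - 1 runs from 3 to 28: positions 1 and 2 are never filled, and the final write to position 28 silently grows the vector past its declared length of 27. That is why the data frame sees 28 results against the 26 values in n_range. b itself ends at 29 simply because that is the last value in n_range.
That’s the problem with magic constants: once you change one of them, the relationship with the other constants in the code gets lost.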
To fix this and make your code more robust at the same time, change your code as follows:
from <- 4L
to <- 29L
n_range <- seq(from, to)
bResultsStra <- vector("double", length(n_range))
for (b in n_range) {
# […]
bResultsStra[[b - from + 1L]] <- bValue
}
That said, this code could be improved further by choosing better variable names (even just consistent naming conventions go a long way!) and by replacing the entire loop with a call to lapply.
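As a rough sketch of the loop-free version mentioned above: the block below uses vapply (a stricter sibling of lapply that returns a plain numeric vector) so that the result always has exactly one entry per element of n_range. The helper ci_width is hypothetical and only stands in for the bootstrap step of the original loop, since Stratified, average and base.csv are not available here.

```r
# Hypothetical stand-in for the per-sample-size work done in the loop body:
# simulate 1000 means and return the width between the 25th and 975th
# sorted values, mirroring basesortStra$popsize[975] - ...[25].
ci_width <- function(n) {
  sims <- replicate(1000, mean(rnorm(n)))
  sorted <- sort(sims)
  sorted[975] - sorted[25]
}

set.seed(1)
n_range <- 4:29

# One double per element of n_range; no index arithmetic to get wrong.
bResultsStra <- vapply(n_range, ci_width, numeric(1))

results <- data.frame(n = n_range, width = bResultsStra)
```

Because the result vector is built from n_range itself, changing the range to 2:29 or 4:29 cannot put the two lengths out of sync.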

Related

Search for specific line in R function body

I wish to "copy and modify" a function at a specific point in its body. Currently, what I have is
nearest_psd <- function(mat) {
ed <- eigen(mat)
eigvecs <- ed$vectors
eigvals <- ed$values
eigvals[eigvals<0] <- 0
eigvecs %*% diag(eigvals) %*% t(eigvecs)
}
nearest_pd <- nearest_psd
formals(nearest_pd)$pdeps <- 1e-08
body(nearest_pd)[[c(7,3)]] <- quote(pdeps)
This makes nearest_pd a copy of nearest_psd, except that the line eigvals[eigvals<0] <- 0 becomes eigvals[eigvals<0] <- pdeps.
However, the line number (7, in this case) is hard-coded, and I would prefer to have a robust way to determine this line number. How can I search for the line that contains the expression eigvals[eigvals<0] <- 0?
You can use identical to compare two expressions; that way, you can identify and replace the expression in question:
to_replace = vapply(body(nearest_pd), function (e) identical(e, quote(eigvals[eigvals < 0] <- 0)), logical(1L))
body(nearest_pd)[to_replace] = list(quote(eigvals[eigvals < pdeps] <- pdeps))
However, this is no more readable, nor more robust, than your code: in both cases you’re forced to hard-code the relevant information (the indices in your code, the expression in mine). For that reason I wouldn’t recommend using this.
… of course you could instead use an AST walker to replace all occurrences of 0 in the function’s body with pdeps. But is that better? No, since 0 could be used for other purposes. It currently isn’t, but who knows what happens once the original function changes. And if the original function can’t be assumed to change, why not hard-code the new function entirely? That is, write this:
nearest_pd <- function (mat, pdeps = 1e-08) {
ed <- eigen(mat)
eigvecs <- ed$vectors
eigvals <- ed$values
eigvals[eigvals < pdeps] <- pdeps
eigvecs %*% diag(eigvals) %*% t(eigvecs)
}
… no need to use metaprogramming just for the sake of it.
The following might do what you want.
nearest_psd <- function(mat) {
ed <- eigen(mat)
eigvecs <- ed$vectors
eigvals <- ed$values
eigvals[eigvals<0] <- 0
eigvecs %*% diag(eigvals) %*% t(eigvecs)
}
nearest_pd <- nearest_psd
formals(nearest_pd)$pdeps <- 1e-08
nearest_psd_body <- body(nearest_psd)
# Find the string we are looking for and replace it ...
new.code <- gsub("eigvals[eigvals < 0] <- 0",
"MY_NEW_CODE",
nearest_psd_body, fixed = TRUE)
# Building the function body as a string.
new.code <- new.code[-1] # delete first { such that ...
new.code <- paste(new.code, collapse = ";") # we can collapse the remaining here ....
new.code <- paste("{", new.code, "}", sep = "", collapse = "") # and then wrap the remaining in { }
# parse returns an expression.
body(nearest_pd) <- parse(text = new.code)
See "At a basic level, what does eval-parse do in R?" for an explanation of parse, or "In programming, what is an expression?" for what an expression is.
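For readers unfamiliar with parse, here is a minimal illustration of what it returns and how that relates to quote and eval. The variable names are illustrative only.

```r
# parse() turns source text into an expression vector (here of length 1).
expr <- parse(text = "1 + 2")
is.expression(expr)   # TRUE

# Each element of that vector is a language object, in this case a call.
cl <- expr[[1]]
is.call(cl)           # TRUE

# eval() runs the call; nothing is computed until this point.
value <- eval(cl)     # 3
```

This is why body<- accepts the parsed text in the answer above: the braces-wrapped string parses to a single call to `{`.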

Why is my if statement storing the wrong data types?

Okay, so I have combed the internet for an answer to my problem and I can only put it down to me being a little naive in how R works.
Below is my code for a function that generates public and private keys from the system clock and uses it to attempt to decrypt an encrypted message. This bit works fine, but obviously as it goes through different random generations it comes back with a lot of garbage and NULL data.
I wanted to filter this out by using grep and testing whether the result of that grep was 1; if so, the decoded message would be put into a list.
The problem is that, no matter how I propose the if statement, my list gets cluttered with both the nonsense entries and the NULL entries.
I've tried !is.null, is.character, test == 1, etc., but nothing seems to work. Either the list doesn't get populated at all, or it gets populated by every entry that runs through the if statement.
Any advice would be appreciated. Thanks :)
Edit: Okay, forgive me, for these are copy and paste jobs to provide clarity. The first code is the code I'm using to encrypt the message.
require(gmp)
source("convert.R")
p <- nextprime(urand.bigz(size=51, seed=as.bigz(Sys.time())))
q <- nextprime(urand.bigz(size=50))
n <- p*q
finde <- function(phi) {
r <- floor(log(phi, base = 2))
y <- 0 # initialise
while(y != 1) {
e <- urand.bigz(nb = 1, size = r)
y <- gcd.bigz(e, phi)
}
return(e)
}
phi <- (p-1) * (q-1)
e <-finde(phi)
d <- inv.bigz(e, phi)
text1 <- c("I want to eat a baby panda with my bare teeth and hands. Just so I know there's something else in this world suffering more than myself, right now.")
m <- blocks(text1, n) # arguments are text1 (message) and n (public key)
u <- as.bigz((as.bigz(m, n)^e))
dput(u, file="codedmessage.R")
The second is the code contained in the "convert.R" source file:
blocks <- function(txt, n) {
x <- strtoi(charToRaw(txt), 16L)
ll <- length(x)
bl <- floor(log(n, base=256)) # block length (how large the blocks must be)
nb <- floor(ll / bl)
wp <- bl*nb
rem <- ll - wp
s <- as.bigz(vector(mode="numeric", length=0))
u <- 0
while(u < wp) {
total <- as.bigz(0)
for(i in 1:bl) {
total <- 256 * total + x[i+u]
}
u <- u + bl
s <- c(s, total)
}
if(rem > 0) {
total <- as.bigz(0)
for(i in 1:rem) {
total <- 256 * total + x[i + wp]
}
s <- c(s, total)
}
return(s)
}
words <- function(blocknum) {
w <- vector(mode="numeric", length=0)
wl <- blocknum
while(as.bigz(wl) > 0) {
rem <- as.bigz(wl) %% 256
w <- c(rem, w)
wl <- (as.bigz(wl) - as.bigz(rem)) / 256
}
return(w)
}
dectext <- function(listnum) {
len <- length(listnum)
newls <- as.integer(vector(mode="numeric", length=0))
for(i in 1:len) {
temp <- as.integer(words(listnum[i]))
newls <- c(newls, temp)
}
return(rawToChar(as.raw(newls)))
}
And finally the last code is the decrypt and compile list function that I'm having issues with.
finde <- function(phi) {
r <- floor(log(phi, base = 2))
y <- 0 # initialise
while(y != 1) {
e <- urand.bigz(nb = 1, size = r)
y <- gcd.bigz(e, phi)
}
return(e)
}
FindKey <- function(a, y) {
x <<- 1 #initialisation
decodedlist <<- list() #initialisation
while (x<7200) {
print(x)
print(a)
p <- nextprime(urand.bigz(size=51, seed=as.bigz(a)))
q <- nextprime(urand.bigz(size=50))
n <- p*q
phi <- (p-1) * (q-1)
phi
e <-finde(phi)
d <- inv.bigz(e, phi)
recieved<-dget(file=y)
v<-as.bigz(as.bigz(recieved, n)^d)
tryCatch({
decodetext<-dectext(v)
Decrypt<- capture.output(cat(decodetext))
print(Decrypt)
test <- grep("and", Decrypt)
if (!is.null(Decrypt)){
if (is.character(Decrypt)){
decodedlist[[x]] <<- Decrypt
}else{return}}else{return}
}, warning = function(war) {
return()
}, error = function(err){
return()
}, finally = {
x=x+1
a=a-1})
}
}
Sorry it's long.. But I really don't know what to do :(
I found a "sort of" solution to my problem, albeit within a different code I've written.
I'm not very knowledgeable in the reasoning behind why this works but I believe the problem lay in the fact that the list was storing something with a NULL reference (Reps to Acccumulation for the hint ;D) and therefore was not technically NULL itself.
My workaround for this was to avoid using an if statement altogether; instead, I found a more efficient method of filtering out NULL list entries in a program I had written for generating large prime numbers.
Extra points for anyone who can figure out what I'm currently studying ;)
#Combine two lists and remove NULL entries therein.
Prime_List2 <<- PrimeList[-which(sapply(PrimeList, is.null))]
Prime_List1 <<- PrimeList[-which(sapply(PrimeList, is.null))]
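One caveat with the x[-which(...)] idiom above, shown as a small sketch with made-up data: if the list happens to contain no NULL entries, which() returns integer(0) and the negative index selects nothing at all. Logical indexing avoids that, and grepl gives a logical test for the "contains this word" filter the question was aiming for. The list contents below are hypothetical.

```r
# Hypothetical decoded results: real strings mixed with NULL entries.
decoded <- list("you and me", NULL, "x7#q", NULL)

# Pitfall: with no NULLs present, which() is integer(0) and
# x[-integer(0)] returns an empty list instead of the full list.
no_nulls <- list("a", "b")
dropped_everything <- no_nulls[-which(sapply(no_nulls, is.null))]
length(dropped_everything)   # 0

# Safer: build a logical mask and negate it.
is_null <- vapply(decoded, is.null, logical(1))
kept <- decoded[!is_null]

# Keep only the strings that contain the literal text "and".
strings <- unlist(kept)
hits <- strings[grepl("and", strings, fixed = TRUE)]
```

Filter(Negate(is.null), decoded) is an equivalent base R spelling of the NULL-dropping step.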

Is it possible to generate a list of seq() in a for loop in R?

I am very new to R and am having trouble writing a loop that uses seq() and lists. I have searched the Q&A on SO but have yet to find the same problem. I apologize if this is a duplicate.
I know the basics of generating a sequence of numbers and building a list, but I am wondering whether we can generate a list of sequences, one for each loop iteration.
This is an example of my code:
J <- seq(50,200,50) # (I actually wanted to use 1: J to generate a sequence of each combinations . i.e: 1:50, 1:100 etc)
K <- seq(10,100,10) #(same as the above)
set.seed(1234)
for (i in J) {
for (j in K){
f <- rnorm(i + 1) # the f value I would like it to be generate in terms of list, since the j have 4 sequence value, if possible, could it adhere to that?
}
}
I tried using both the sequence and list functions, but I keep getting one of these messages:
if print(i)
output
[1]1
.
.
.
[1]50
Warning message:
In 1:(seq(50, 200, 50)) :
numerical expression has 6 elements: only the first used
for (i in 1:list(seq(50,200,50)))
Error in 1:list(seq(50, 200, 50)) : NA/NaN argument
Can such loop combinations be performed? Could you please guide me on this? Thank you very much.
I'm not yet sure what you are asking, but is this what you are looking for? It was difficult to post this as a comment.
J <- seq(50,200,50)
l1 <- vector(length = length(J), mode = "list")
for (i in seq_along(J)){ # you know of seq_along() right?
l1[[i]] = rnorm(J[i])
}
For the second question, where you want lists (J) of lists (K) of matrices: please note that <<- is not considered good practice, but for now this is what I could come up with.
Note: to understand what is actually happening, use debug mode: after defining func, call debug(func), and the next call to func will run step by step.
l1 <- vector(length = length(J), mode = "list")
l2 <- vector(length = length(K), mode = "list")
func <- function(x){
l1[[x]] <- l2
func1 <- function(y) {
l1[[x]][[y]] <<- matrix(rnorm(J[x]*K[y]),
ncol = J[x],
nrow = K[y])
}
lapply(seq_along(l1[[x]]),func1)
}
lapply(seq_along(l1), func)
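As an alternative sketch that avoids <<- entirely, nested lapply calls can return the list of lists directly. This assumes the goal is one K[y] by J[x] matrix per combination, as in the code above; the result names nested and seqs are illustrative.

```r
J <- seq(50, 200, 50)
K <- seq(10, 100, 10)

set.seed(1234)

# Outer lapply walks over J, inner lapply over K; each inner call
# returns a k-by-j matrix, so the result is a list of lists of matrices.
nested <- lapply(J, function(j) {
  lapply(K, function(k) matrix(rnorm(j * k), nrow = k, ncol = j))
})

dim(nested[[2]][[3]])   # matrix for J[2] = 100 and K[3] = 30

# The "1:50, 1:100, ..." sequences from the question, one per element of J.
seqs <- lapply(J, seq_len)
```

Since each function returns its value instead of assigning into an outer variable, no global state is modified.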

R for-loop iterating from central value out to extremes

I'm trying to improve the speed of my code, which is trying to optimise a value using 3 variables which have large ranges. The most likely output uses values in the middle of the ranges, so it is wasting time starting from the lowest possible value of each variable. I want to start from the middle value and iterate out! The actual problem has thousands of lines with numbers from 150-650. C,H and O limits will be defined somewhat based on the starting number, but will always be more likely at a central value in the defined range. Is there a way to define the for loop to work outwards like I want? The only, quite shabby, way I can think of is to simply redefine the value within the loop from a vector (e.g. 1=20, 2=21, 3=19, etc). See current code below:
set_error<-2.5
ct<-c(325.00214,325.00952,325.02004,325.02762,325.03535,325.03831,325.04588, 325.05641,325.06402,325.06766,325.07167,325.07454,325.10396)
FormFun<-function(x){
for(C in 1:40){
for(H in 1:80){
for(O in 1:40){
test_mass=C*12+H*1.007825+O*15.9949146-1.0072765
error<-1000000*abs(test_mass-x)/x
if(error<set_error){
result<-paste("C",C,"H",H,"O",O,sep ="")
return(result)
break;break;break;break
}
}
}
}
}
old_t <- Sys.time()
ct2<-lapply(ct,FormFun)
new_t <- Sys.time() - old_t # calculate difference
print(new_t)
Use vectorization and create a closure:
FormFun1_fac <- function(gr) {
force(gr) # evaluate the argument now so the grid is captured in the closure's environment
function(x, set_error){
test_mass <- with(gr, C*12+H*1.007825+O*15.9949146-1.0072765)
error <- 1000000 * abs(test_mass - x) / x
ind <- which(error < set_error)[1]
if (is.na(ind)) return(NULL)
paste0("C", gr[ind, "C"],"H", gr[ind, "H"],"O", gr[ind, "O"])
}
}
FormFun1 <- FormFun1_fac(expand.grid(C = 1:40, H = 1:80, O = 1:40))
ct21 <- lapply(ct, FormFun1, set_error = set_error)
all.equal(ct2, ct21)
#[1] TRUE
This saves a grid of all combinations of C, H, O in the function environment and calculates the error for all combinations (which is fast in vectorized code). The first combination that passes the test is returned.
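The question also asked how to search from the middle of each range outwards. One possible sketch, not part of the answer above: reorder the candidates by their distance from the centre, so that the first match found is also the most central one. The helper names center_out and find_formula are hypothetical, and the Manhattan distance used to rank grid rows is just one reasonable choice.

```r
# Reorder a vector so iteration starts at the middle and alternates outwards.
center_out <- function(v) {
  mid <- (length(v) + 1) / 2
  v[order(abs(seq_along(v) - mid))]
}
center_out(1:5)   # 3 2 4 1 5

# Same idea for the full grid: sort rows by distance from the range centres.
gr <- expand.grid(C = 1:40, H = 1:80, O = 1:40)
dist <- abs(gr$C - 20.5) + abs(gr$H - 40.5) + abs(gr$O - 20.5)
gr_central <- gr[order(dist), ]

# Vectorised search over a grid; the first hit is the most central match
# when the grid has been ordered as above.
find_formula <- function(x, grid, tol_ppm = 2.5) {
  mass <- grid$C * 12 + grid$H * 1.007825 + grid$O * 15.9949146 - 1.0072765
  err <- 1e6 * abs(mass - x) / x
  i <- which(err < tol_ppm)[1]
  if (is.na(i)) return(NULL)
  paste0("C", grid$C[i], "H", grid$H[i], "O", grid$O[i])
}

# Mass of C10H20O5 under the same formula, used here as a test input.
res <- find_formula(10 * 12 + 20 * 1.007825 + 5 * 15.9949146 - 1.0072765,
                    gr_central)
```

With the vectorised search, the ordering changes which match is reported rather than how long the search takes, since all combinations are evaluated at once either way.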

Looping through numerous functions with a vector of characters

I thought this would be easier than it is, but I am learning to code in R so looping is certainly not my strong point.
What I am attempting to do here is take a series of functions that all have a common theme of MISO. As you can see in the first batch of code below MISO is common, but I would like to swap the MISO for i as one would do in a for loop and then loop through a few different names in a character vector. Let's call that character vector ID so that ID <- c("MISO","PJM","SERC")
At this point in the code, all_Cities_MISO is already a data frame in my environment. I just want to break it up and perform some calculations.
meanAvgHighMISO <- mean(all_Cities_MISO$Col21)
meanAvgLowMISO <- mean(all_Cities_MISO$Col20)
meanAvgMISO <- mean(cbind(meanAvgHighMISO,meanAvgLowMISO))
names(meanAvgMISO) <- ifelse(meanAvgMISO<65,"HDD","CDD")
MISO_Avg_DD <- ifelse(meanAvgMISO<65,(65-meanAvgMISO),(meanAvgMISO-65)) #average degree days for each period
MISO_op_mean <- apply(all_Cities_MISO[,1:19],2, mean)
So I attempted this a few different times, but keep getting errors like Error: unexpected string constant in:
" meanAvgHigh"i""
I feel like it should be simple to replace the MISO with a PJM throughout the code above with a simple for loop, but no luck. It must be something with it not liking MISO as a character.
Here is my attempt at the for loop:
ID <- c("MISO","PJM","SERC")
for(i in ID){
meanAvgHigh"i" <- mean(all_Cities_"i"$Col21)
meanAvgLow"i" <- mean(all_Cities_"i"$Col20)
meanAvg"i" <- mean(cbind(meanAvgHigh"i",meanAvgLow"i"))
names(meanAvg"i") <- ifelse(meanAvg"i"<65,"HDD","CDD")
"i"_Avg_DD <- ifelse(meanAvg"i"<65,(65-meanAvg"i"),(meanAvg"i"-65)) #average degree days for each period
"i"_op_mean <- apply(all_Cities_"i"[,1:19],2, mean)
}
I attempted using [i] instead of "i", but that didn't work either. I understand that MISO itself in the first code I displayed is not a character, but I'm not sure how R would recognize it in the loop otherwise... I just would like to do a simple swap of names in a loop. MISO for PJM or for SERC etc etc.
Any help is greatly appreciated, thank you.
While you can do that by using the environment() and assign() methods, I would advise against it. You should instead use a nested list to save those values.
However, for completeness, this is how I think one would do it (untested):
env <- environment()
ID <- c("MISO","PJM","SERC")
for(i in ID){
assign(paste0("meanAvgHigh", i), mean(env[[ paste0("all_Cities_", i) ]]$Col21))
assign(paste0("meanAvgLow", i), mean(env[[ paste0("all_Cities_", i) ]]$Col20))
assign(paste0("meanAvg", i), mean(cbind(env[[ paste0("meanAvgHigh", i) ]], env[[ paste0("meanAvgLow", i) ]])))
names(env[[ paste0("meanAvg", i) ]]) <- ifelse(env[[ paste0("meanAvg", i) ]] < 65,"HDD","CDD")
##### Note: The ifelse can probably be replaced by an abs
assign(paste0(i, "_Avg_DD"), ifelse(env[[ paste0("meanAvg", i) ]] < 65,
(65 - env[[ paste0("meanAvg", i) ]]),
(env[[ paste0("meanAvg", i) ]] - 65)
)) #average degree days for each period
assign(paste0(i, "_op_mean"), apply(env[[ paste0("all_Cities_", i) ]][,1:19], 2, mean))
}
The basic idea is using assign to set the values in the current environment and then using the current environment env to get them via indexing.
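For comparison, here is a sketch of the nested-list approach recommended at the top of this answer. It assumes the per-region data frames are collected into one named list; the data below is randomly generated stand-in data, and make_region, all_cities and summaries are hypothetical names. In a real session, mget(paste0("all_Cities_", ID)) could be used to gather existing data frames into such a list.

```r
set.seed(42)

# Hypothetical stand-in for one region's data frame with columns Col1..Col21.
make_region <- function() {
  as.data.frame(matrix(runif(21 * 5, min = 40, max = 90), ncol = 21,
                       dimnames = list(NULL, paste0("Col", 1:21))))
}

all_cities <- list(MISO = make_region(),
                   PJM  = make_region(),
                   SERC = make_region())

# One summary per region, kept together in a named list instead of
# scattered across variables like meanAvgMISO, meanAvgPJM, ...
summaries <- lapply(all_cities, function(df) {
  mean_avg <- mean(c(mean(df$Col21), mean(df$Col20)))
  list(
    mean_avg = mean_avg,
    type     = if (mean_avg < 65) "HDD" else "CDD",
    avg_dd   = abs(mean_avg - 65),     # average degree days
    op_mean  = colMeans(df[, 1:19])
  )
})

summaries$PJM$avg_dd
```

Adding a region then only means adding one element to all_cities, and individual values remain reachable by name, for example summaries$SERC$op_mean.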
