Numerical error in toy script to convert decimals into binary digits


I am trying to implement a toy script that converts decimal fractions into binary digits (0.21 is the input in the example below). Everything works fine, except for a numerical error that I don't know how to prevent:
bin_dec <- function() {
  b <- as.numeric(readline("Input Binary digit: "))
  dec=9999999
  P=9999999
  N=b
  print("N B P U")
  while (dec != b & P != 0) {
    P = N*2
    U = floor(P)
    dec = P%%1
    print(sprintf("%s 2 %s %s", N, P, U))
    N = dec
  }
}
> bin_dec()
Input Binary digit: 0.21
[1] "N B P U"
[1] "0.21 2 0.42 0"
[1] "0.42 2 0.84 0"
[1] "0.84 2 1.68 1"
[1] "0.68 2 1.36 1"
[1] "0.36 2 0.72 0"
[1] "0.72 2 1.44 1"
[1] "0.440000000000000 2 0.879999999999999 0"
[1] "0.879999999999999 2 1.76000000000000 1"
[1] "0.759999999999998 2 1.52000000000000 1"
[1] "0.519999999999996 2 1.03999999999999 1"
[1] "0.039999999999992 2 0.079999999999984 0"
[1] "0.079999999999984 2 0.159999999999968 0"
[1] "0.159999999999968 2 0.319999999999936 0"
[1] "0.319999999999936 2 0.639999999999873 0"
[1] "0.639999999999873 2 1.27999999999975 1"
[1] "0.279999999999745 2 0.559999999999491 0"
[1] "0.559999999999491 2 1.11999999999898 1"
[1] "0.119999999998981 2 0.239999999997963 0"
[1] "0.239999999997963 2 0.479999999995925 0"
[1] "0.479999999995925 2 0.95999999999185 0"
[1] "0.95999999999185 2 1.9199999999837 1"
[1] "0.919999999983702 2 1.83999999996740 1"
[1] "0.839999999967404 2 1.67999999993481 1"
[1] "0.679999999934807 2 1.35999999986961 1"
[1] "0.359999999869615 2 0.71999999973923 0"
[1] "0.71999999973923 2 1.43999999947846 1"
[1] "0.439999999478459 2 0.879999998956919 0"
[1] "0.879999998956919 2 1.75999999791384 1"
[1] "0.759999997913837 2 1.51999999582767 1"
[1] "0.519999995827675 2 1.03999999165535 1"
[1] "0.0399999916553497 2 0.0799999833106995 0"
[1] "0.0799999833106995 2 0.159999966621399 0"
[1] "0.159999966621399 2 0.319999933242798 0"
[1] "0.319999933242798 2 0.639999866485596 0"
[1] "0.639999866485596 2 1.27999973297119 1"
[1] "0.279999732971191 2 0.559999465942383 0"
[1] "0.559999465942383 2 1.11999893188477 1"
[1] "0.119998931884766 2 0.239997863769531 0"
[1] "0.239997863769531 2 0.479995727539062 0"
[1] "0.479995727539062 2 0.959991455078125 0"
[1] "0.959991455078125 2 1.91998291015625 1"
[1] "0.91998291015625 2 1.8399658203125 1"
[1] "0.8399658203125 2 1.679931640625 1"
[1] "0.679931640625 2 1.35986328125 1"
[1] "0.35986328125 2 0.7197265625 0"
[1] "0.7197265625 2 1.439453125 1"
[1] "0.439453125 2 0.87890625 0"
[1] "0.87890625 2 1.7578125 1"
[1] "0.7578125 2 1.515625 1"
[1] "0.515625 2 1.03125 1"
[1] "0.03125 2 0.0625 0"
[1] "0.0625 2 0.125 0"
[1] "0.125 2 0.25 0"
[1] "0.25 2 0.5 0"
[1] "0.5 2 1 1"
[1] "0 2 0 0"
> sessionInfo()
R version 2.7.2 (2008-08-25)
i386-pc-mingw32

Found it out myself; I'll paste the working code for the record.
However, I still have one issue with cat() and print(): used together they give me an undesired NULL at the end of each line. Strangely, cat() alone (without print()) breaks the loop, and print() alone outputs quoted strings, which I don't want.
bin_dec <- function() {
  b <- as.numeric(readline("Input Binary digit: "))
  dec=9999999
  P=9999999
  N=b
  N_list=list()
  bin=list()
  cat("N B P U\n")
  while (!dec%in%N_list[1:length(N_list)-1] & P != 0) {
    P = N*2
    U = floor(P)
    dec = signif(P%%1)
    print(cat(sprintf("%s 2 %s %s ", N, P, U)))
    N = dec
    N_list <- append(N_list, dec)
    bin <- append(bin, U)
  }
  dig <- paste(rev(unlist(bin)), collapse = " ")
  cat(paste("\nThe resulting binary digit is:\n\n***",
            "0 . ", dig, "***\n\n", sep=""))
}
bin_dec()
Input Binary digit: 0.21
N B P U
0.21 2 0.42 0 NULL
0.42 2 0.84 0 NULL
0.84 2 1.68 1 NULL
0.68 2 1.36 1 NULL
0.36 2 0.72 0 NULL
0.72 2 1.44 1 NULL
0.44 2 0.88 0 NULL
0.88 2 1.76 1 NULL
0.76 2 1.52 1 NULL
0.52 2 1.04 1 NULL
0.04 2 0.08 0 NULL
0.08 2 0.16 0 NULL
0.16 2 0.32 0 NULL
0.32 2 0.64 0 NULL
0.64 2 1.28 1 NULL
0.28 2 0.56 0 NULL
0.56 2 1.12 1 NULL
0.12 2 0.24 0 NULL
0.24 2 0.48 0 NULL
0.48 2 0.96 0 NULL
0.96 2 1.92 1 NULL
0.92 2 1.84 1 NULL
The resulting binary digit is:
***0 . 1 1 0 0 0 1 0 1 0 0 0 0 1 1 1 0 1 0 1 1 0 0***
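A note on the two open points above, offered as a sketch rather than a definitive fix. The drift in the first trace is consistent with 0.21 having no exact binary representation: doubling is exact, so each step just exposes more of the stored approximation until the mantissa runs out. The trailing NULL appears because cat() returns NULL, which print() then prints; ending the cat() string with "\n" avoids needing print() at all. Also, as far as I can tell, the digits of a fractional part are read in the order they are produced, so this sketch does not reverse them. The function name frac_to_binary and its arguments max_digits and digits are made up for illustration.

```r
# Hypothetical helper: binary expansion of the fractional number x (0 <= x < 1).
# Stops after max_digits digits, when the remainder is 0, or when a remainder repeats.
frac_to_binary <- function(x, max_digits = 22, digits = 6) {
  stopifnot(is.numeric(x), length(x) == 1, x >= 0, x < 1)
  bits <- integer(0)   # digits in the order they are produced
  seen <- numeric(0)   # remainders already visited (cycle detection)
  n <- x
  cat("N B P U\n")
  while (length(bits) < max_digits && n != 0 && !(n %in% seen)) {
    seen <- c(seen, n)
    p <- n * 2
    u <- floor(p)
    # cat() with a trailing newline; no print() wrapper, so no NULL is shown
    cat(sprintf("%s 2 %s %s\n", n, p, u))
    bits <- c(bits, u)
    # round the remainder so representation error does not accumulate
    n <- signif(p - u, digits)
  }
  paste0("0.", paste(bits, collapse = ""))
}

# Show more of the value that is actually stored for 0.21
cat(sprintf("%.25f\n", 0.21))

result <- frac_to_binary(0.21)
cat("The resulting binary digits are:", result, "\n")
```

Rounding with signif() mirrors the approach in the answer above; it assumes the input has only a few significant decimal digits.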

Related

Odd behaviour of quotient in R [duplicate]

This question already has answers here:
Why are these numbers not equal?
(6 answers)
Closed 4 years ago.
The third line of code in the below example is giving 11 while all the other lines give 12. Any reason why this should happen? If there is a reason, any way to fix it?
> .03 %/% 0.0025
[1] 12
> .03 / 0.0025
[1] 12
> .3 %/% 0.025
[1] 11
> .3 / 0.025
[1] 12
> 3 %/% 0.25
[1] 12
> 3 / 0.25
[1] 12
This happens with multiple numbers btw, some more examples below -
> 0.35 %/% 0.025
[1] 13
> 0.35 / 0.025
[1] 14
> 0.85 %/% 0.025
[1] 33
> 0.85 / 0.025
[1] 34
> 0.425 %/% 0.025
[1] 16
> 0.425 / 0.025
[1] 17
> 0.975 %/% 0.025
[1] 38
> 0.975 / 0.025
[1] 39

I am not sure, but it was too long for a comment. From ?"%%":
%% and x %/% y can be used for non-integer y, e.g. 1 %/% 0.2, but the
results are subject to representation error and so may be
platform-dependent. Because the IEC 60559 representation of 0.2 is a
binary fraction slightly larger than 0.2, the answer to 1 %/% 0.2
should be 4 but most platforms give 5.
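One possible workaround, as a minimal sketch: do the ordinary division first and add a small tolerance before flooring, so that a quotient sitting a hair below an integer is not pushed down. The helper name safe_quotient and the tolerance value are assumptions for illustration, and the approach only makes sense when the inputs are meant to be "nice" decimal values.

```r
# Hypothetical helper: integer quotient that tolerates representation error.
# tol is an assumed relative tolerance; adjust it to the scale of your data.
safe_quotient <- function(x, y, tol = 1e-9) {
  q <- x / y
  floor(q + tol * pmax(1, abs(q)))
}

safe_quotient(0.3, 0.025)    # compare with 0.3 %/% 0.025
safe_quotient(0.35, 0.025)   # compare with 0.35 %/% 0.025
```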

Confusion on base reshape example

Can you please explain this behavior: why are wide and wide2 not identical, and why does reshape work on wide but not on wide2?
wide <- reshape(Indometh, v.names = "conc", idvar = "Subject",
timevar = "time", direction = "wide")
wide
# Subject conc.0.25 conc.0.5 conc.0.75 conc.1 conc.1.25 conc.2 conc.3 conc.4 conc.5 conc.6 conc.8
# 1 1 1.50 0.94 0.78 0.48 0.37 0.19 0.12 0.11 0.08 0.07 0.05
# 12 2 2.03 1.63 0.71 0.70 0.64 0.36 0.32 0.20 0.25 0.12 0.08
# 23 3 2.72 1.49 1.16 0.80 0.80 0.39 0.22 0.12 0.11 0.08 0.08
# 34 4 1.85 1.39 1.02 0.89 0.59 0.40 0.16 0.11 0.10 0.07 0.07
# 45 5 2.05 1.04 0.81 0.39 0.30 0.23 0.13 0.11 0.08 0.10 0.06
# 56 6 2.31 1.44 1.03 0.84 0.64 0.42 0.24 0.17 0.13 0.10 0.09
reshape(wide) # ok
wide2 <- wide[,1:ncol(wide)]
reshape(wide2) # Error in match.arg(direction, c("wide", "long")) : argument "direction" is missing, with no default
Some diagnosis:
identical(wide,wide2) # FALSE
dplyr::all_equal(wide,wide2) # TRUE
all.equal(wide,wide2)
# [1] "Attributes: < Names: 1 string mismatch >" "Attributes: < Length mismatch: comparison on first 2 components >"
# [3] "Attributes: < Component 2: Modes: list, numeric >" "Attributes: < Component 2: names for target but not for current >"
# [5] "Attributes: < Component 2: Length mismatch: comparison on first 5 components >" "Attributes: < Component 2: Component 1: Modes: character, numeric >"
# [7] "Attributes: < Component 2: Component 1: target is character, current is numeric >" "Attributes: < Component 2: Component 2: Modes: character, numeric >"
# [9] "Attributes: < Component 2: Component 2: target is character, current is numeric >" "Attributes: < Component 2: Component 3: Modes: character, numeric >"
# [11] "Attributes: < Component 2: Component 3: target is character, current is numeric >" "Attributes: < Component 2: Component 4: Numeric: lengths (11, 1) differ >"
# [13] "Attributes: < Component 2: Component 5: Modes: character, numeric >" "Attributes: < Component 2: Component 5: Lengths: 11, 1 >"
# [15] "Attributes: < Component 2: Component 5: Attributes: < Modes: list, NULL > >" "Attributes: < Component 2: Component 5: Attributes: < Lengths: 1, 0 > >"
# [17] "Attributes: < Component 2: Component 5: Attributes: < names for target but not for current > >" "Attributes: < Component 2: Component 5: Attributes: < current is not list-like > >"
# [19] "Attributes: < Component 2: Component 5: target is matrix, current is numeric >"

Because the subset operation on the wide data.frame removes the custom attributes that reshape adds and later uses to automatically perform the opposite reshaping.
As you can see, the attributes of wide include reshapeWide, which stores all the information needed to revert the reshape:
> names(attributes(wide))
[1] "row.names" "names" "class" "reshapeWide"
> attributes(wide)$reshapeWide
$v.names
[1] "conc"
$timevar
[1] "time"
$idvar
[1] "Subject"
$times
[1] 0.25 0.50 0.75 1.00 1.25 2.00 3.00 4.00 5.00 6.00 8.00
$varying
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
[1,] "conc.0.25" "conc.0.5" "conc.0.75" "conc.1" "conc.1.25" "conc.2" "conc.3" "conc.4" "conc.5" "conc.6" "conc.8"
while wide2 does not :
> names(attributes(wide2))
[1] "names" "class" "row.names"
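If that is the cause, copying the attribute back onto wide2 should be enough for reshape to undo the operation again. A minimal sketch, assuming the built-in Indometh data set and the same calls as in the question; it relies on an attribute that reshape manages internally, so treat it as a workaround rather than a recommended pattern:

```r
wide <- reshape(Indometh, v.names = "conc", idvar = "Subject",
                timevar = "time", direction = "wide")
wide2 <- wide[, 1:ncol(wide)]

# Restore the bookkeeping attribute that reshape() uses to revert itself
attr(wide2, "reshapeWide") <- attr(wide, "reshapeWide")

long2 <- reshape(wide2)   # direction is now inferred from the attribute
head(long2)
```

Passing direction = "long" together with varying, v.names, idvar and timevar explicitly would avoid depending on the attribute at all.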

Matrix to Matrix Votes conversion

I have the output of a classification algorithm in the form of probabilities. They look like this:
> head(testProb)
B U
1 0.98 0.02
2 0.80 0.20
3 0.14 0.86
4 0.91 0.09
5 0.25 0.75
6 0.86 0.14
When I checked the class, it was:
> class(testProb)
[1] "matrix" "votes"
If I take a subset of it, for example the first column, I get:
a <- testProb[,1]
> head(a)
1 2 3 4 5 6
0.98 0.80 0.14 0.91 0.25 0.86
> class(a)
[1] "numeric"
I have another such classification output matrix, but it does not have the class "matrix" "votes". How can I convert it into "matrix" "votes", so that when I take a subset I get the values in the same form as before?
> head(prediction)
B U
[1,] 0.9413505 0.05864955
[2,] 0.8758474 0.12415256
[3,] 0.2271516 0.77284845
[4,] 0.9227356 0.07726441
[5,] 0.1838987 0.81610128
[6,] 0.9253403 0.07465969
> class (prediction)
[1] "matrix"
> a <- prediction[,1]
> head(a)
[1] 0.9413505 0.8758474 0.2271516 0.9227356 0.1838987 0.9253403
> class(a)
[1] "numeric"
In this case as well, I want a to look as it does in the first case. Your help will be appreciated.
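Judging only from the printed output, the visible difference between the two subsets is the names ("1", "2", ...), and those normally come from the matrix row names rather than from the "votes" class. A minimal sketch under that assumption, using a small made-up stand-in for prediction:

```r
# Hypothetical stand-in for the 'prediction' matrix shown above
prediction <- matrix(c(0.94, 0.88, 0.23, 0.06, 0.12, 0.77), ncol = 2,
                     dimnames = list(NULL, c("B", "U")))

# Give the matrix row names so that column subsets become named vectors
rownames(prediction) <- as.character(seq_len(nrow(prediction)))

# Optional: also tag the object with the extra "votes" class
class(prediction) <- c(class(prediction), "votes")

a <- prediction[, 1]
a          # named numeric vector, names "1", "2", "3"
class(a)
```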

Parallel Processing in R using "parallel" package

I have two data frames:
> head(k)
V1
1 1814338070
2 1199215279
3 1283239083
4 1201972527
5 404900682
6 3093614019
> head(g)
start end state value
1 16777216 16777471 queensland 15169
2 16777472 16778239 fujian 0
3 16778240 16779263 victoria 56203
4 16779264 16781311 guangdong 0
5 16781312 16781823 tokyo 0
6 16781824 16782335 aichi 0
> dim(k)
[1] 624979 1
> dim(g)
[1] 5510305 4
I want to take each value in data.frame(k), check whether it falls within the range between start and end of a row in data.frame(g), and if it does, return state and value from that row of data.frame(g).
The problem is the size of the two data frames: doing the match and returning the desired values takes 5 hours on my computer. I've tried the following method, but I'm unable to make use of all the cores on my computer, or even to make it work correctly:
return_first_match_position <- function(int, start,end) {
  match = which(int >= start & int <= end)
  if(length(match) > 0){
    return(match[1])
  }
  else {
    return(match)
  }
}
library(parallel)
cl = makeCluster(detectCores())
matches = Vectorize(return_first_match_position, 'int')(k$V1,g$start, g$end)
p = parSapply(cl, Vectorize(return_first_match_position, 'int')(k$V1,g$start, g$end), return_first_match_position)
stopCluster(cl)
The desired output is the percentage of times each state and value shows up across all matches of the numbers from data.frame(k) in data.frame(g).
I was wondering whether there is an intelligent way of doing parallel processing in R?
And can anyone please suggest any sources from which I can learn to write better functions in R?

I think you want to do a rolling join. This can be done very efficiently with data.table:
DF1 <- data.frame(V1=c(1.5, 2, 0.3, 1.7, 0.5))
DF2 <- data.frame(start=0:3, end=0.9:3.9,
state=c("queensland", "fujian", "victoria", "guangdong"),
value=1:4)
library(data.table)
DT1 <- data.table(DF1, key="V1")
DT1[, pos:=V1]
# V1 pos
#1: 0.3 0.3
#2: 0.5 0.5
#3: 1.5 1.5
#4: 1.7 1.7
#5: 2.0 2.0
DT2 <- data.table(DF2, key="start")
# start end state value
#1: 0 0.9 queensland 1
#2: 1 1.9 fujian 2
#3: 2 2.9 victoria 3
#4: 3 3.9 guangdong 4
DT2[DT1, roll=TRUE]
# start end state value pos
#1: 0 0.9 queensland 1 0.3
#2: 0 0.9 queensland 1 0.5
#3: 1 1.9 fujian 2 1.5
#4: 1 1.9 fujian 2 1.7
#5: 2 2.9 victoria 3 2.0
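For comparison, here is a base R sketch of the same idea using findInterval() instead of a data.table rolling join (a different technique from the one above). It assumes the ranges do not overlap and that g can be sorted by start; the toy data mirror the example, with one extra value (5) that falls outside every range.

```r
# Toy data in the spirit of the example above (the value 5 matches no range)
k <- data.frame(V1 = c(1.5, 2, 0.3, 1.7, 0.5, 5))
g <- data.frame(start = 0:3, end = 0.9:3.9,
                state = c("queensland", "fujian", "victoria", "guangdong"),
                value = 1:4, stringsAsFactors = FALSE)

g <- g[order(g$start), ]              # findInterval() needs sorted breaks
idx <- findInterval(k$V1, g$start)    # row with the largest start <= V1 (0 if none)
idx[idx == 0] <- NA
hit <- !is.na(idx) & k$V1 <= g$end[idx]   # keep only values that are also <= end

states <- g$state[idx[hit]]
states
prop.table(table(states))             # share of matches per state
```

findInterval() does a binary search per value, so this avoids comparing every value of k against every row of g.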

So instead of editing my last answer heavily (pretty much making a new one), is this what you want?
I noticed that your end is always 1 before the next row's start, so what you want (I think) is to find out how many values fall within each interval and give each of them the state and value for that range. So:
set.seed(123)
c1=seq(1,25,4)
c2=seq(4,30,4)
c3=letters[1:7]
c4=sample(seq(1,7),7)
c.all=cbind(c1,c2,c3,c4)
> c.all ### example data.frame that looks similar to yours
c1 c2 c3 c4
[1,] "1" "4" "a" "3"
[2,] "5" "8" "b" "7"
[3,] "9" "12" "c" "2"
[4,] "13" "16" "d" "1"
[5,] "17" "20" "e" "6"
[6,] "21" "24" "f" "5"
[7,] "25" "28" "g" "4"
k1 <- sample(seq(1,18),20,replace=T)
k1
[1] 2 1 15 14 4 15 3 17 18 1 4 3 16 15 2 4 8 11 7 16
# c.all is a character matrix, so convert to numeric before taking max();
# the last break is max(end) + 1 because the intervals are closed on the left
fallsin <- cut(k1, c(as.numeric(c.all[,1]), max(as.numeric(c.all[,2])) + 1), labels=paste(c.all[,3], c.all[,4],sep=':'), right=F)
fallsin
[1] a:3 a:3 d:1 d:1 a:3 d:1 a:3 e:6 e:6 a:3 a:3 a:3 d:1 d:1 a:3 a:3 b:7 c:2 b:7 d:1
Levels: a:3 b:7 c:2 d:1 e:6 f:5 g:4
prop.table(table(fallsin))
a:3 b:7 c:2 d:1 e:6 f:5 g:4
0.45 0.10 0.05 0.30 0.10 0.00 0.00
where the column names are the 'state:value' labels and the numbers are the proportion of k1 values that fall within the range of that label.

Fixing RStudio's default par() settings

Upon starting RStudio (v0.97.551 on OSX 10.8.4) running plot(1:10, 1:10) fails with Error in plot.new() : figure margins too large. This seems unrelated to the other SO questions featuring this error message as it only happens in RStudio - R's basic GUI is unaffected.
par() yields:
> par()
$xlog
[1] FALSE
$ylog
[1] FALSE
$adj
[1] 0.5
$ann
[1] TRUE
$ask
[1] FALSE
$bg
[1] "white"
$bty
[1] "o"
$cex
[1] 1
$cex.axis
[1] 1
$cex.lab
[1] 1
$cex.main
[1] 1.2
$cex.sub
[1] 1
$cin
[1] 0.2000000 0.2666667
$col
[1] "black"
$col.axis
[1] "black"
$col.lab
[1] "black"
$col.main
[1] "black"
$col.sub
[1] "black"
$cra
[1] 14.4 19.2
$crt
[1] 0
$csi
[1] 0.2666667
$cxy
[1] 0.02915216 -0.46109510
$din
[1] 8.513889 1.875000
$err
[1] 0
$family
[1] ""
$fg
[1] "black"
$fig
[1] 0 1 0 1
$fin
[1] 8.513889 1.875000
$font
[1] 1
$font.axis
[1] 1
$font.lab
[1] 1
$font.main
[1] 2
$font.sub
[1] 1
$lab
[1] 5 5 7
$las
[1] 0
$lend
[1] "round"
$lheight
[1] 1
$ljoin
[1] "round"
$lmitre
[1] 10
$lty
[1] "solid"
$lwd
[1] 1
$mai
[1] 1.360000 1.093333 1.093333 0.560000
$mar
[1] 5.1 4.1 4.1 2.1
$mex
[1] 1
$mfcol
[1] 1 1
$mfg
[1] 1 1 1 1
$mfrow
[1] 1 1
$mgp
[1] 3 1 0
$mkh
[1] 0.001
$new
[1] FALSE
$oma
[1] 0 0 0 0
$omd
[1] 0 1 0 1
$omi
[1] 0 0 0 0
$pch
[1] 1
$pin
[1] 6.8605556 -0.5783333
$plt
[1] 0.1284176 0.9342251 0.7253333 0.4168889
$ps
[1] 16
$pty
[1] "m"
$smo
[1] 1
$srt
[1] 0
$tck
[1] NA
$tcl
[1] -0.5
$usr
[1] 0 1 0 1
$xaxp
[1] 0 1 5
$xaxs
[1] "r"
$xaxt
[1] "s"
$xpd
[1] FALSE
$yaxp
[1] 0 1 5
$yaxs
[1] "r"
$yaxt
[1] "s"
$ylbias
[1] 0.2
Setting par(mai=c(0,0,0,0)) stops the error message but messes up the plot (I think it pushes plot axes outside the viewable plot area). In any case I don't see why this should be necessary - it should plot ok without the need to customise par.
Does anyone know why this is happening, and if there's any way to fix it?

If the plot window in RStudio is too small, you will get this error.
Keeping that in mind, two options to try are:
Enlarge the plot window :-)
Try plotting with the standard graphics device (which I believe should be x11() for you).
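The par() output in the question seems to bear this out. The device is only 1.875 inches tall (din), while the bottom and top margins (mai) add up to more than that, which leaves a negative plot height (pin). A short sketch of the arithmetic, with the numbers copied from the question:

```r
# Values copied from the par() output in the question
din <- c(8.513889, 1.875)                    # device width and height, in inches
mai <- c(1.36, 1.093333, 1.093333, 0.56)     # margins: bottom, left, top, right

# Plot region = device size minus the margins on each side
pin <- c(din[1] - mai[2] - mai[4],           # width  left for the plot
         din[2] - mai[1] - mai[3])           # height left for the plot
pin

if (any(pin <= 0)) cat("No room left for the plot: margins too large\n")
```

The computed pin agrees with the $pin reported in the question, so enlarging the plot pane (or shrinking the margins) should make the height positive again.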

After having set par(mai=c(0,0,0,0)), close and open a new graphical device:
par(mai=c(0,0,0,0))
dev.off()
dev.new()
