In R, I am trying to convert binary data to integer values, but instead of 1 value being stored in 1 byte, multiple values are stored within and across bytes.
I know there are 12 integer values stored across 64 bits (8 bytes). The 12 integers have the following bit count: 5,6,5,5,4,7,5,6,5,5,4,7
After the following code:
time <- readBin(fid, integer(), size=1, n=8, signed=FALSE)
The return is:
[1] 25 156 113 63 214 158 113 63
The correct data should be:
25 32 19 17 11 31 22 54 19 17 11 31
I have tried using bitAnd and bitShiftL (package bitops), but have had no real success. Any help would be greatly appreciated.
Note that the operation on each 4-byte integer is the same (the pattern is repeated twice). Thus, it suffices to solve the problem for a 4-byte integer, and loop over the 4-byte integers in the file (retrieved via readBin). This is much simpler than considering the problem byte-by-byte.
# length(x) should be 1
bitint <- function(x, bitlens) {
  result <- integer(length(bitlens))
  for (i in seq_along(bitlens)) {
    result[i] <- bitwAnd(x, (2^bitlens[i])-1)
    x <- bitwShiftR(x, bitlens[i])
  }
  return(result)
}
bitlens <- c(5,6,5,5,4,7)
x <- c(1064410137L, 1064410838L)
c(sapply(x, function(i) bitint(i, bitlens)))
## [1] 25 32 19 17 11 31 22 54 19 17 11 31
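As a hedged sketch of the retrieval step: assuming the data are little-endian (the byte values in the question are consistent with that), readBin can return the two 4-byte integers directly. Here the eight bytes from the question stand in for the file connection, and `raw_bytes` is an illustrative name, not something from the original code.

```r
# The eight bytes reported in the question, used in place of the file connection
raw_bytes <- as.raw(c(25, 156, 113, 63, 214, 158, 113, 63))

# Read two 4-byte integers; little-endian byte order is an assumption
x <- readBin(raw_bytes, integer(), n = 2, size = 4, endian = "little")
x
## [1] 1064410137 1064410838
```

These are the same two values used for x above, so they can be passed straight to bitint.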
I don't know of a clean, elegant way to do this with the standard base functions for reading data (functions like readBin seem to read no less than a byte at a time). So I've created a function that does the messy calculations to extract bits from bytes and turn them into numbers. I did end up using the bitwise operators in base R (see ?bitwAnd). Here's the function:
bitints <- function(bytes, bitlengths) {
  stopifnot(sum(bitlengths) <= 8*length(bytes))
  stopifnot(all(bitlengths <= 8))
  bytebits <- rep.int(8, length(bytes))
  masks <- c(1L,3L,7L,15L,31L,63L,127L, 255L)
  outs <- numeric(length(bitlengths))
  for(i in seq_along(bitlengths)) {
    need <- bitlengths[i]
    got <- 0
    r <- 0
    while(need>0) {
      j <- which(bytebits>0)[1]
      bitget <- min(need, bytebits[j])
      r <- r + bitwShiftL(bitwAnd(bytes[j],masks[bitget]), got)
      bytebits[j] <- bytebits[j]-bitget
      bytes[j] <- bitwShiftR(bytes[j], bitget)
      need <- need - bitget
      got <- got + bitget
    }
    outs[i] <- r
  }
  outs
}
You just pass in your array of byte values and your array of bit sizes to get the values you need. Here's an example using your data.
bytes <- c(25L, 156L, 113L, 63L, 214L, 158L, 113L, 63L)
bitlens <- c(5,6,5,5,4,7,5,6,5,5,4,7)
bitints(bytes, bitlens)
# [1] 25 32 19 17 11 31 22 54 19 17 11 31
Note that I had to change around some of your bit lengths to get the values you were expecting. You might want to double check that you either had the expected output correct or that your bit lengths were correct.
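For comparison, here is a minimal sketch of another way to get the same result, assuming the fields are packed least significant bit first (which is what the function above does). It expands the bytes into individual bits with intToBits and sums the weighted bits per field. The name `bitints_bits` is made up for this illustration.

```r
bitints_bits <- function(bytes, bitlengths) {
  stopifnot(sum(bitlengths) <= 8 * length(bytes))
  # Expand every byte into its 8 bits, least significant bit first
  bits <- unlist(lapply(bytes, function(b) as.integer(intToBits(b))[1:8]))
  # Label each bit with the index of the field it belongs to
  field <- rep(seq_along(bitlengths), bitlengths)
  bits <- bits[seq_along(field)]
  # Weight of each bit within its field: 1, 2, 4, ...
  weight <- 2^(sequence(bitlengths) - 1)
  as.vector(tapply(bits * weight, field, sum))
}

bytes <- c(25L, 156L, 113L, 63L, 214L, 158L, 113L, 63L)
bitints_bits(bytes, c(5,6,5,5,4,7,5,6,5,5,4,7))
## [1] 25 32 19 17 11 31 22 54 19 17 11 31
```

Because it works on single bits, this variant does not need the restriction that each field is at most 8 bits wide, at the cost of building a longer intermediate vector.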
aa <- order(maxstaCode$gateInComingCnt, decreasing=TRUE)[1:10]
aa
[1] 11 121 19 79 13 21 43 10 15 138
for(i in aa){
  maxinum <- c(maxstaCode$gateInComingCnt[i])
}
maxinum
I wanted the loop to use each element of aa in turn as an index into the column and collect the corresponding values, giving the result below:
[1] 6235770 2805043 2772432 2592227 2461369 2428441 1990890 1821025 1595055
[10] 1491299
but it turned out:
[1] 1491299
The problem with the for loop is that maxinum is overwritten on each iteration, so only the last value is kept. Instead, append to it with c(maxinum, ...):
maxinum <- c()
for(i in aa){
  maxinum <- c(maxinum, maxstaCode$gateInComingCnt[i])
}
maxinum
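As a side note, the loop is not strictly needed: indexing with the whole vector aa returns all the values in one step. A minimal sketch follows, using a small made-up data frame in place of the real maxstaCode (its values are purely illustrative).

```r
# Hypothetical stand-in for maxstaCode, for illustration only
maxstaCode <- data.frame(gateInComingCnt = c(50, 10, 40, 30, 20))
aa <- order(maxstaCode$gateInComingCnt, decreasing = TRUE)[1:3]

# Indexing with the vector aa picks out all values at once, in the order of aa
maxinum <- maxstaCode$gateInComingCnt[aa]
maxinum
## [1] 50 40 30
```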
After removing values from the vector 1:100, I am left with the following vector:
w
[1] 2 5 13 23 24 39 41 47 48 51 52 58 61 62 70 71 72 90
I am now trying to draw values from this vector with the sample function
for(x in roznica)
{
  if(licznik_2 != licznik_1)
  {
    roznica_proces_2 <- sample(1:w, roznica)
  } else {
    roznica_proces_2 <- NA
  }
}
I tried various combinations with the sample function.
If w is the name of the vector then you would NOT use sample(1:w, ...). For one thing, 1:w doesn't really make sense, since the : operator expects its second argument to be a single number, while w apparently holds 18 values. Depending on what roznica is (and hopefully it is a single integer), you might use:
sample(w, roznica) # returns roznica values drawn at random from `w`, without replacement
The other problem is that you are currently overwriting any values from prior iterations of the for loop. So you might want to use:
roznica_proces_2[[x]] <- sample(w, roznica)
You would of course need to have initialized roznica_proces_2, perhaps with:
roznica_proces_2 <- list()
Regarding your query in the comment :
I am only concerned with the sample function itself. Here is an example:
w
[1] 31
Now I want to draw 1 number from it (which is 31):
proces_nr_2 <- sample(w, 1)
What do I get?
proces_nr_2
[1] 26
That happens because when a vector has length 1, sample draws from 1 to that number. This is explained in the help page ?sample:
If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x
So if you have only 1 number to sample, just return that number directly instead of passing it to sample.
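One way to guard against this, sketched here along the lines of the example given in ?sample, is to sample positions rather than values. The helper name `resample` follows that help page; it is not a base R function.

```r
# Sample indices of x, then look the values up, so a length-1 x is safe
resample <- function(x, ...) x[sample.int(length(x), ...)]

w <- 31
resample(w, 1)
## [1] 31
```

With a longer vector, resample(w, n) behaves like sample(w, n).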
I have a binary string like this:
0010000
I'd like to have all these permutations:
1010000
0110000
0011000
0010100
0010010
0010001
Does anybody know which function in R could give me these results?
R has functions for bitwise operations, so we can get the desired numbers with bitwOr:
bitwOr(16, 2^(6:0))
#> [1] 80 48 16 24 20 18 17
...or if we want to exclude the original,
setdiff(bitwOr(16, 2^(6:0)), 16)
#> [1] 80 48 24 20 18 17
However, bitwOr only works on integers, not on strings of binary digits. That's ok, though; we can build some conversion functions:
bin_to_int <- function(bin){
  vapply(strsplit(bin, ''),
         function(x){sum(as.integer(x) * 2 ^ seq(length(x) - 1, 0))},
         numeric(1))
}
int_to_bin <- function(int, bits = 32){
  vapply(int,
         function(x){paste(as.integer(rev(head(intToBits(x), bits))), collapse = '')},
         character(1))
}
Now:
input <- bin_to_int('0010000')
output <- setdiff(bitwOr(input, 2^(6:0)),
input)
output
#> [1] 80 48 24 20 18 17
int_to_bin(output, bits = 7)
#> [1] "1010000" "0110000" "0011000" "0010100" "0010010" "0010001"
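As a small aside, base R's strtoi can do the string-to-integer direction on its own when given base = 2. A minimal sketch, assuming the string fits in a 32-bit integer (strtoi returns NA otherwise):

```r
# Parse the binary string directly with base R
input <- strtoi("0010000", base = 2L)
input
## [1] 16

# Same bit-setting step as above
output <- setdiff(bitwOr(input, 2^(6:0)), input)
output
## [1] 80 48 24 20 18 17
```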
library(stringr)
bin <- '0010000'
ones <- str_locate_all(bin, '1')[[1]][,1]
zeros <- (1:str_length(bin))[-ones]
sapply(zeros, function(x){
  str_sub(bin, x, x) <- '1'
  bin
})
[1] "1010000" "0110000" "0011000" "0010100" "0010010" "0010001"
We assume that the problem is to successively replace each 0 in the input with a 1 for an input string of 0's and 1's.
Replace each character successively with a "1", regardless of its value and then remove any components of the result equal to the input. No packages are used.
input <- "0010000"
setdiff(sapply(1:nchar(input), function(i) `substr<-`(input, i, i, "1")), input)
## [1] "1010000" "0110000" "0011000" "0010100" "0010010" "0010001"
I have two vectors, p1 and p2. They report the same information, except that p2 is more precise. So I want to compare the two and pick the value from p2, unless the difference between the two vectors is > k. In that case I want the value from p1 to be picked in the final product "pd".
k <- 5
p1 <- c(21,43,62,88,119,156,264)
p2 <- c(19,42,62,84,104,156,262)
pd should look like:
pd <- c(19,42,62,84,119,156,262)
I have seen code that specified the selection condition inside the square brackets, but can't figure out how to duplicate it. Something similar to pd <- p2[p1, p1-p2 >5], but not exactly, because this obviously doesn't evaluate. p2[p1-p2<5] works to select the positive cases, but the 5th case, where the condition evaluates to FALSE, is skipped.
Maybe:
ifelse(abs(p2-p1) <=k, p2, p1)
#[1] 19 42 62 84 119 156 262
Or without using ifelse
indx <- abs(p1-p2) >k
pd <- p2
pd[indx] <- p1[indx]
pd
#[1] 19 42 62 84 119 156 262
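To illustrate why the bracket-only attempt from the question skips the 5th case, here is a short sketch comparing it with ifelse. Logical subsetting keeps only the elements where the condition is TRUE, so the result gets shorter, whereas ifelse returns one value per position. The name `keep` is introduced just for this example.

```r
k <- 5
p1 <- c(21,43,62,88,119,156,264)
p2 <- c(19,42,62,84,104,156,262)

keep <- abs(p1 - p2) <= k
p2[keep]              # position 5 is dropped, leaving 6 values
## [1]  19  42  62  84 156 262
ifelse(keep, p2, p1)  # position 5 is filled from p1, leaving 7 values
## [1]  19  42  62  84 119 156 262
```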
I am new to this forum. I guess something like this has been asked before but, I am not really sure if that is what I want.
I have a sequence like this,
1 2 3 4 5 8 9 10 12 14 15 17 18 19
What I wish to do is get all the sets of numbers that form a series, i.e. each number in the set should have a constant difference from the previous element, and the set should contain at least 3 elements.
For example, (1,2,3,4,5) forms one such series: the numbers appear at an interval of 1, and the set has 5 elements, which satisfies the minimum threshold.
(1,3,5) forms another such pattern, in which the numbers appear at an interval of 2.
(8,10,12,14) forms another such pattern with an interval of 2. So, as you can see, the interval of repetition can be anything.
Also, for a particular set, I want only the maximal one. I don't want (8,10,12) as output (although it satisfies the minimum threshold of 3 and has a constant difference); I want only the one of maximal length, i.e. (8,10,12,14).
Similarly, for (1,2,3,4,5), I don't want (1,2,3) or (2,3,4,5) as output, only the one of maximal length, i.e. (1,2,3,4,5).
How can I do this in R?
Edit: That is, I want any set which forms a basic AP (arithmetic progression) series with any difference; however, the series should contain at least 3 elements and it should be maximal.
Edit2: I have tried using rle and acf in R, but that doesn't entirely solve my problem.
Edit3: When I used acf, it basically gave me the maximum peak difference that I could have used. However, I want all possible differences. Also, rle is quite different: it gives the longest run of repeated values, which does not occur in my case.
If you are looking for sequences of consecutive numbers, then cgwtools::seqle will find them for you in the same way rle finds a sequence of repeated values.
In the general case of basically any subset of your data which form such a sequence, such as the 8,10,12,14 case you cite, your criteria are so general as to be very difficult to satisfy. You'd have to start at each element of your series and do a forward-looking search for x[j] +1, x[j]+2, x[j]+3 ... ad infinitum. This suggests using some tree-based algorithms.
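As a rough illustration of such a forward-looking search, here is a brute-force sketch, not an efficient or definitive algorithm. It assumes integer-valued input (exact comparisons with %in%), treats the data as a set (sorted, duplicates removed), and tries every pair of elements as the first two terms of a progression. The function name `max_arith_seqs` and its `min_len` argument are made up for this example.

```r
max_arith_seqs <- function(x, min_len = 3) {
  x <- sort(unique(x))
  out <- list()
  n <- length(x)
  if (n < 2) return(out)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      d <- x[j] - x[i]
      # If x[i] - d is present, this progression extends to the left,
      # so it is not maximal when started at x[i]; skip it
      if ((x[i] - d) %in% x) next
      # Extend forward for as long as the next term is present
      s <- x[i]
      while ((s[length(s)] + d) %in% x) s <- c(s, s[length(s)] + d)
      if (length(s) >= min_len) out[[length(out) + 1]] <- s
    }
  }
  out
}

x <- c(1, 2, 3, 4, 5, 8, 9, 10, 12, 14, 15, 17, 18, 19)
res <- max_arith_seqs(x)
```

For the example data, res contains (1,2,3,4,5), (1,3,5) and (8,10,12,14) among others, but not sub-series such as (8,10,12). The number of pairs grows quadratically with the length of the input, so this is only practical for short vectors.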
Here's a potential solution - albeit a very ugly, sloppy one:
##
arithSeq <- function(x=nSeq, minSize=4){
  ##
  dx <- diff(x,lag=1)
  Runs <- rle(dx)
  ##
  rLens <- Runs[[1]]
  rVals <- Runs[[2]]
  pStart <- c(
    rep(1,rLens[1]),
    rep(cumsum(1+rLens[-length(rLens)]),times=rLens[-1])
  )
  pEnd <- pStart + c(
    rep(rLens[1]-1, rLens[1]),
    rep(rLens[-1],times=rLens[-1])
  )
  pGrp <- rep(1:length(rLens),times=rLens)
  pLen <- rep(rLens, times=rLens)
  dAll <- data.frame(
    pStart=pStart,
    pEnd=pEnd,
    pGrp=pGrp,
    pLen=pLen,
    runVal=rep(rVals,rLens)
  )
  ##
  dSub <- subset(dAll, pLen >= minSize - 1)
  ##
  uVals <- unique(dSub$runVal)
  ##
  maxSub <- subset(dSub, runVal==uVals[1])
  maxLen <- max(maxSub$pLen)
  maxSub <- subset(maxSub, pLen==maxLen)
  ##
  if(length(uVals) > 1){
    for(i in 2:length(uVals)){
      iSub <- subset(dSub, runVal==uVals[i])
      iMaxLen <- max(iSub$pLen)
      iSub <- subset(iSub, pLen==iMaxLen)
      maxSub <- rbind(
        maxSub,
        iSub)
    }
    ##
  }
  ##
  deDup <- maxSub[!duplicated(maxSub),]
  seqStarts <- as.numeric(rownames(deDup))
  outList <- list(NULL); length(outList) <- nrow(deDup)
  for(i in 1:nrow(deDup)){
    outList[[i]] <- list(
      Sequence = x[seqStarts[i]:(seqStarts[i]+deDup[i,"pLen"])],
      Length=deDup[i,"pLen"]+1,
      StartPosition=seqStarts[i],
      EndPosition=seqStarts[i]+deDup[i,"pLen"])
  }
  ##
  return(outList)
  ##
}
##
There are things that can definitely be improved in this function. For instance, I made a mistake somewhere in the calculation of pStart and pEnd, the start and end indices of a given arithmetic sequence; it just so happened that the true start positions of such sequences are given by the row names of one of the intermediate data.frames, so I used those as a hacky workaround. Anyway, the function accepts a numeric vector x and a minimum length parameter, minSize, and returns a list containing information about the sequences that meet the criteria you outlined above.
set.seed(1234)
lSeq <- sample(1:25,100000,replace=TRUE)
nSeq <- c(1:10,12,33,13:17,16:26)
##
> arithSeq(nSeq)
[[1]]
[[1]]$Sequence
[1] 16 17 18 19 20 21 22 23 24 25 26
[[1]]$Length
[1] 11
[[1]]$StartPosition
[1] 18
[[1]]$EndPosition
[1] 28
##
> arithSeq(x=lSeq,minSize=5)
[[1]]
[[1]]$Sequence
[1] 13 16 19 22 25
[[1]]$Length
[1] 5
[[1]]$StartPosition
[1] 12760
[[1]]$EndPosition
[1] 12764
[[2]]
[[2]]$Sequence
[1] 11 13 15 17 19
[[2]]$Length
[1] 5
[[2]]$StartPosition
[1] 37988
[[2]]$EndPosition
[1] 37992
Like I said, it's sloppy and inelegant, but it should get you started.