reading 0 in front of numeric

vec1 <- c(26, 12, 13, 20, 9)
vac1 <- decode_vec(vec1)
The result:
vac1 : "11010" "01100" "01101" "10100" "01001"
I'm converting vac1 to numeric, but the leading 0s keep disappearing.
test_1 <- as.numeric(vac1)
Result:
11010 1100 1101 10100 1001
I tried to restore the 0s with sprintf(), but that turns the vector back into character.
test_2 <- sprintf("%05d", test_1)
"11010" "01100" "01101" "10100" "01001"
I want to convert the Morse codes to numeric form without losing any leading 0.

Numeric vectors never print with leading 0s, but we can define our own S3 class that does. The code below defines as.bin.numeric, as.data.frame.bin, format.bin and print.bin methods; encode() comes from the dst package. No [.bin method is defined, so subscripted values have to be cast back with as.bin(), as shown below. Internally the bin class is stored as ordinary numbers. Define other methods as needed.
library(dst)
library(zoo)
as.bin <- function(x, ...) UseMethod("as.bin")
as.bin.numeric <- function(x, ...) structure(x, class = "bin")
as.data.frame.bin <- zoo:::as.data.frame.yearmon
format.bin <- function(x, ...) {
  x <- unclass(x)
  n <- max(floor(log2(x)) + 1)
  base <- rep(2, n)
  sapply(x, function(y) paste0(encode(base, y), collapse = ""))
}
print.bin <- function(x, ...) print(format(x), ...)
Now test these
v0 <- c(26, 12, 13, 20, 9)
v <- as.bin(v0)
as.numeric(v)
## [1] 26 12 13 20 9
v
## [1] "11010" "01100" "01101" "10100" "01001"
v + 1
## [1] "11011" "01101" "01110" "10101" "01010"
as.bin(v[1])
## [1] "11010"
data.frame(v = v)
## v
## 1 11010
## 2 01100
## 3 01101
## 4 10100
## 5 01001
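The missing [.bin method mentioned above could be sketched as follows. This is only an illustration, not part of the original answer: the `[.bin` definition is hypothetical, as.bin is simplified to a plain function, and format.bin is replaced by a base-R stand-in built on intToBits() so the block runs without dst. It assumes whole numbers between 0 and 2^31 - 1.

```r
# Simplified constructor for this sketch (the answer uses an S3 generic).
as.bin <- function(x, ...) structure(x, class = "bin")

# Base-R stand-in for the dst-based format method.
# Assumption: whole numbers in the range 0 .. 2^31 - 1.
format.bin <- function(x, ...) {
  x <- unclass(x)
  n <- floor(log2(max(x, 1))) + 1               # digits needed for the largest value
  vapply(x, function(y) {
    bits <- as.integer(intToBits(as.integer(y)))  # 32 bits, least significant first
    paste(rev(bits[seq_len(n)]), collapse = "")
  }, character(1))
}
print.bin <- function(x, ...) print(format(x), ...)

# Hypothetical subsetting method: subset the underlying numbers,
# then restore the class so the result still prints in binary.
`[.bin` <- function(x, i, ...) structure(unclass(x)[i], class = "bin")

v <- as.bin(c(26, 12, 13, 20, 9))
v[c(1, 4)]
## [1] "11010" "10100"
```

Note that the width is recomputed from the largest value in the subset, so v[2:3] would print with four digits rather than five.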

Related

Alternative to for loop for fast calculations when equations depend on each other

I am using a for-loop to do step-by-step calculations where several equations depend on each other. Because of this dependence, I cannot find a solution where I do the calculations inside a dataframe. My main motivation is to speed up the calculations when the Time vector is very large in the reprex below.
Could you please suggest alternatives to the following for-loop based calculations, preferably inside a dataframe in R? The only thing I can think of is using for-loop in Rcpp.
Reproducible Example
last_time <- 10
STEP = 1
Time <- seq(from = 0, to = last_time, by = STEP)
## empty vectors
eq1 <- vector(mode = "double", length = length(Time))
eq2 <- vector(mode = "double", length = length(Time))
eq <- vector(mode = "double", length = length(Time))
eq3 <- vector(mode = "double", length = length(Time))
eq4 <- vector(mode = "double", length = length(Time))
## adding the first values
eq1[1] <- 25
eq2[1] <- 25
eq[1] <- 25
eq3[1] <- 100
eq4[1] <- 2
for (t in 2:length(Time)) {
  ## eq1
  eq1[t] <- eq[t-1] + (2.5 * STEP * (1 - (eq[t-1])/25))
  ## eq2
  eq2[t] <- (-2 * STEP) + ((-2^2) * (STEP^2)) - (2 * eq3[t-1]) - (eq[t-1] * STEP)
  ## min.
  eq[t] <- min(eq1[t], eq2[t] )
  ## eq3
  eq3[t] <- (eq[t] - eq[t-1])/(STEP)
  ## eq4
  eq4[t] <- eq4[t-1] + (eq[t-1] * STEP) + (0.5 * eq3[t-1] * (STEP)^2)
}
Output:
my_data <- data.frame(Time, eq1, eq2, eq, eq3, eq4)
my_data
#> Time eq1 eq2 eq eq3 eq4
#> 1 0 25.00000 25.00000 25.00000 -256.00000 2.0000
#> 2 1 25.00000 -231.00000 -231.00000 25.60000 -101.0000
#> 3 2 -205.40000 225.00000 -205.40000 23.04000 -319.2000
#> 4 3 -182.36000 199.40000 -182.36000 20.73600 -513.0800
#> 5 4 -161.62400 176.36000 -161.62400 18.66240 -685.0720
#> 6 5 -142.96160 155.62400 -142.96160 16.79616 -837.3648
#> 7 6 -126.16544 136.96160 -126.16544 15.11654 -971.9283
#> 8 7 -111.04890 120.16544 -111.04890 13.60489 -1090.5355
#> 9 8 -97.44401 105.04890 -97.44401 12.24440 -1194.7819
#> 10 9 -85.19961 91.44401 -85.19961 11.01996 -1286.1037
#> 11 10 -74.17965 79.19961 -74.17965 0.00000 -1365.7934
Created on 2021-02-28 by the reprex package (v1.0.0)

You could define a recursive function. A loop is faster than recursion though.
g <- function(m, STEP, time, x=2) {
  if (time == 0) m
  else {
    ## eq1
    m[x, 2] <- m[x - 1, 1] + 2.5*STEP*(1 - (m[x - 1, 1])/25)
    ## eq2
    m[x, 3] <- -2*STEP + -2^2*STEP^2 - 2*m[x - 1, 4] - m[x - 1, 1]*STEP
    ## min.
    m[x, 1] <- min(m[x, 2], m[x, 3])
    ## eq3
    m[x - 1, 4] <- (m[x, 1] - m[x - 1, 1])/STEP
    ## eq4
    m[x, 5] <- m[x - 1, 5] + m[x - 1, 1]*STEP + 0.5*m[x - 1, 4]*STEP^2
    g(m, STEP, time - 1, x + 1)
  }
}
Usage
last_time <- 10; STEP <- 1
First <- c(eq0=25, eq1=25, eq2=25, eq3=100, eq4=2)
m <- matrix(0, last_time + 1, length(First), dimnames=list(NULL, names(First)))
m[1, ] <- First
g(m, STEP, last_time)
# eq0 eq1 eq2 eq3 eq4
# [1,] 25.00000 25.00000 25.00000 -256.00000 2.0000
# [2,] -231.00000 25.00000 -231.00000 25.60000 -101.0000
# [3,] -205.40000 -205.40000 225.00000 23.04000 -319.2000
# [4,] -182.36000 -182.36000 199.40000 20.73600 -513.0800
# [5,] -161.62400 -161.62400 176.36000 18.66240 -685.0720
# [6,] -142.96160 -142.96160 155.62400 16.79616 -837.3648
# [7,] -126.16544 -126.16544 136.96160 15.11654 -971.9283
# [8,] -111.04890 -111.04890 120.16544 13.60489 -1090.5355
# [9,] -97.44401 -97.44401 105.04890 12.24440 -1194.7819
# [10,] -85.19961 -85.19961 91.44401 11.01996 -1286.1037
# [11,] -74.17965 -74.17965 79.19961 0.00000 -1365.7934

Since you asked how it works:
The recursive method of stats::filter can be applied to every column of a data frame with mapply, as follows:
dataframe <-
  mapply(stats::filter,
         dataframe,
         filter = vector,
         method = "recursive")
where vector is a filter coefficient such as c(25), which could correspond to your first value eq1[1] <- 25.
The recursive filter works like a loop in which each result feeds the next step, but it is a bit more elegant. For a column holding 20, 30, 40 and the coefficient c(25), the recursive filter computes:
row or timepoint 1: 20
row or timepoint 2: 30 + (20 * 25)
row or timepoint 3: 40 + (((20 * 25) + 30) * 25)
It calculates the value in the first row and carries it into the next row, where it is multiplied by the filter coefficient and added to that row's input. If you experiment with stats::filter and the recursive method, you may be able to get the same result. It is a row-based calculation over time, similar to what you would write in Rcpp, but more flexible.
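The three-row illustration can be checked directly. The sketch below runs the recursive filter on the inputs 20, 30, 40 with coefficient 25 and compares it with an explicit loop; the variable names are made up for the example, and it does not attempt to reproduce the eq1 to eq4 system from the question.

```r
x <- c(20, 30, 40)

# Recursive filter: y[i] = x[i] + 25 * y[i - 1], with y[0] taken as 0
y <- stats::filter(x, filter = 25, method = "recursive")
as.numeric(y)
## [1]    20   530 13290

# The same recurrence written as an explicit loop
manual <- numeric(length(x))
manual[1] <- x[1]
for (i in 2:length(x)) manual[i] <- x[i] + 25 * manual[i - 1]
all.equal(as.numeric(y), manual)
## [1] TRUE
```

stats::filter returns a time-series object, hence the as.numeric() call before comparing.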

Create a group of numbers that does not exceed 34

I need to create groups of numbers whose sums do not exceed 34.
For example: I have an array x<-c(28,26,20,5,3,2,1) and I need to create the following groups: a=(28,5,1), b=(26,3,2), c=(20) because the sums of the groups a, b and c do not exceed 34.
Is it possible to implement this procedure in R?

If I understand correctly this is what you want to do:
create_groups <- function(input, threshold) {
  # place the largest values first
  input <- sort(input, decreasing = TRUE)
  result <- vector("list", length(input))
  sums <- rep(0, length(input))
  for (k in input) {
    # first group that still has room for k (NA if k alone exceeds the threshold)
    i <- match(TRUE, sums + k <= threshold)
    if (!is.na(i)) {
      result[[i]] <- c(result[[i]], k)
      sums[i] <- sums[i] + k
    }
  }
  # drop the groups that were never used
  result[sapply(result, is.null)] <- NULL
  result
}
create_groups(x, 34)
# [[1]]
# [1] 28 5 1
#
# [[2]]
# [1] 26 3 2
#
# [[3]]
# [1] 20
However it is not guaranteed that this greedy algorithm will output the optimal solution in terms of number of groups. For instance:
y <- c(18, 15, 11, 9, 8, 7)
create_groups(y, 34)
# [[1]]
# [1] 18 15
#
# [[2]]
# [1] 11 9 8
#
# [[3]]
# [1] 7
while the optimal solution in this case consists of only 2 groups: list(c(18, 9, 7), c(15, 11, 8)).
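One way to see that two groups really is the minimum here is a simple lower bound: no grouping can use fewer than ceiling(sum / threshold) groups. The sketch below computes that bound and checks the two-group solution quoted above. The names lower_bound and best are introduced only for this example.

```r
y <- c(18, 15, 11, 9, 8, 7)
threshold <- 34

# No valid grouping can use fewer groups than this
lower_bound <- ceiling(sum(y) / threshold)
lower_bound
## [1] 2

# The two-group solution: every group stays within the threshold
best <- list(c(18, 9, 7), c(15, 11, 8))
sapply(best, sum)
## [1] 34 34
```

The bound is not always attainable, so it can confirm that a solution is optimal but cannot by itself produce one.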

Assuming you want all possible combinations of subsets of x that meet this condition, you can use
x = c(28,26,20,5,3,2,1)
y = lapply(seq_along(x), function(y) combn(x, y)) # all combinations for every subset size
le34 = lapply(y, function(z) colSums(z) <= 34) # which sums are at most 34
lapply(seq_along(y), function(i) y[[i]][, le34[[i]], drop = FALSE]) # combinations that meet the condition

R error when calling a function with lapply

I have a dataset with a column of numbers:
dati$score[10:15]
[1] 7576 6362 764663 676164 764676 6364
I have this function, which I found here on Stack Overflow, that calculates the sum of the digits of the number in a cell. It works when I apply it to a single value:
digitsum <- function(x) sum(floor(x / 10^(0:(nchar(x) - 1))) %% 10)
I can't apply it to the whole column dati$score. I've tried both a for loop and lapply:
for (i in 1:lunghscore){
f <- dati[i,"score"]
post <- sum(floor(f / 18^(0:(nchar(f) - 1))) %% 18)
dati[i,"score"] <- post
i <- i + 1
}
lapply
dati[,"score"] <- lapply(X = dati[,"score"],FUN = digitsum)
I get this error
2: In `[<-.data.frame`(`*tmp*`, , "score", value = list(20, 17, 26, :
provided 66121 variables to replace 1 variables
How can I apply the function digitsum to every cell in that column?

The problem is that lapply always returns a list, and you are trying to fill a vector (the column) with the elements of that list. Your code works if you unlist the result of lapply, as shown in the toy example below:
> digitsum <- function(x) sum(floor(x / 10^(0:(nchar(x) - 1))) %% 10)
> dati <- data.frame(matrix(250:255, ncol = 2))
> dati
X1 X2
1 250 253
2 251 254
3 252 255
> lapply(dati[, "X2"], digitsum)
[[1]]
[1] 10
[[2]]
[1] 11
[[3]]
[1] 12
> dati[, "X2"]<-lapply(dati[, "X2"], digitsum)
Warning message:
In `[<-.data.frame`(`*tmp*`, , "X2", value = list(10, 11, 12)) :
provided 3 variables to replace 1 variables
And the solution:
> dati[, "X2"]<-unlist(lapply(dati[, "X2"], digitsum))
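As an alternative to unlist(lapply(...)), vapply() returns a plain vector directly and checks that every result is a single number. This is a sketch of that variant on the same toy data, not the original answer's code.

```r
digitsum <- function(x) sum(floor(x / 10^(0:(nchar(x) - 1))) %% 10)
dati <- data.frame(matrix(250:255, ncol = 2))

# vapply() returns a numeric vector, so it can replace the column directly
dati$X2 <- vapply(dati$X2, digitsum, numeric(1))
dati$X2
## [1] 10 11 12
```

If digitsum ever returned something other than one number, vapply() would stop with an error instead of silently building a list.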

How to write a data-frame with one column a list to a file?

Here is my dummy dataset:
dataset<-data.frame(a=c(1,2,3,4),b=c('a','b','c','d'), c=c("HI","DD","gg","ff"))
g=list(c("a","b"),c(2,3,4), c(44,33,11,22),c("chr","ID","i","II"))
dataset$l<-g
dataset
a b c l
1 1 a HI a, b
2 2 b DD 2, 3, 4
3 3 c gg 44, 33, 11, 22
4 4 d ff chr, ID, i, II
> mode(dataset$l)
[1] "list"
When I try to write the dataset to a file:
> write.table(dataset, "dataset.txt", quote=F, sep="\t")
Error in write.table(x, file, nrow(x), p, rnames, sep, eol, na, dec, as.integer(quote), :
unimplemented type 'list' in 'EncodeElement'
How can I solve this problem?

I can think of a few options, depending on what you're trying to achieve.
If it is for display only, then you might simply want capture.output() or sink(); neither of these would be very convenient to read back into R:
capture.output(dataset, file="myfile.txt")
### Result is a text file that looks like this:
# a b c l
# 1 1 a HI a, b
# 2 2 b DD 2, 3, 4
# 3 3 c gg 44, 33, 11, 22
# 4 4 d ff chr, ID, i, II
sink("myfile.txt")
dataset
sink()
## Same result as `capture.output()` approach
If you want to be able to read the resulting table back into R (albeit without preserving the fact that column "l" is a list), you can take an approach similar to what @DWin suggested.
In the code below, the dataset2[sapply... line identifies which variables are lists and concatenates them into a single string. Thus, they become simple character variables, allowing you to use write.table().
dataset2 <- dataset # make a copy just to be on the safe side
dataset2[sapply(dataset2, is.list)] <- apply(dataset2[sapply(dataset2, is.list)],
1, function(x)
paste(unlist(x),
sep=", ", collapse=", "))
str(dataset2)
# 'data.frame': 4 obs. of 4 variables:
# $ a: num 1 2 3 4
# $ b: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
# $ c: Factor w/ 4 levels "DD","ff","gg",..: 4 1 3 2
# $ l: chr "a, b" "2, 3, 4" "44, 33, 11, 22" "chr, ID, i, II"
write.table(dataset2, "myfile.txt", quote=FALSE, sep="\t")
# can be read back in with: dataset3 <- read.delim("myfile.txt")
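If the list structure matters after reading the file back, the concatenated strings can be split again with strsplit(). The sketch below uses a small made-up data frame and a temporary file. Every element comes back as character, so the original types (numeric versus character) are not recovered.

```r
tmp <- tempfile(fileext = ".txt")

# A small flattened table, like dataset2 above (made-up values)
flat <- data.frame(a = 1:2, l = c("a, b", "2, 3, 4"), stringsAsFactors = FALSE)
write.table(flat, tmp, quote = FALSE, sep = "\t")

# Read it back and split the strings into a list column again
back <- read.delim(tmp, stringsAsFactors = FALSE)
back$l <- strsplit(back$l, ", ", fixed = TRUE)
str(back$l)
```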

Output from save is unreadable. Output from dump or dput is ASCII and is readable to people who understand the structure of R objects, but I'm guessing you wanted it more conventionally arranged.
> apply(dataset, 1, function(x) paste(x, sep=",", collapse=","))
[1] "1,a,HI,c(\"a\", \"b\")"
[2] "2,b,DD,c(2, 3, 4)"
[3] "3,c,gg,c(44, 33, 11, 22)"
[4] "4,d,ff,c(\"chr\", \"ID\", \"i\", \"II\")"
The backslashes do not appear in the text-file output:
writeLines(con="test.txt", apply(dataset, 1, function(x) paste(x, sep=",", collapse=",")))
#-------output-----
1,a,HI,c("a", "b")
2,b,DD,c(2, 3, 4)
3,c,gg,c(44, 33, 11, 22)
4,d,ff,c("chr", "ID", "i", "II")

If one of the requirements is to preserve the formatting for Excel, etc., this might help:
writableTable <- tableFlatten(dataset, filler="")
# a b c l.01 l.02 l.03 l.04
# 1 a HI a b
# 2 b DD 2 3 4
# 3 c gg 44 33 11 22
# 4 d ff chr ID i II
write.csv(writableTable, "myFile.csv")
tableFlatten uses a function listFlatten which, as the name implies, takes nested lists and flattens them.
However, if the elements within the lists are of different sizes, it adds filler (which can be NAs, blank spaces, or any other user defined option)
The code for it is below.
tableFlatten <- function(tableWithLists, filler="") {
# takes as input a table with lists and returns a flat table
# empty spots in lists are filled with value of `filler`
#
# depends on: listFlatten(.), findGroupRanges(.), fw0(.)
# index which columns are lists
listCols <- sapply(tableWithLists, is.list)
tableWithLists[listCols]
tableWithLists[!listCols]
# flatten lists into table
flattened <- sapply(tableWithLists[listCols], listFlatten, filler=filler, simplify=FALSE)
# fix names
for (i in 1:length(flattened)) colnames(flattened[[i]]) <- fw0(ncol(flattened[[i]]), 2)
# REASSEMBLE, IN ORDER
# find pivot point counts
pivots <- sapply(findGroupRanges(listCols), length)
#index markers
indNonList <- indList <- 1
# nonListGrp <- (0:(length(pivots)/2)) * 2 + 1
# ListGrp <- (1:(length(pivots)/2)) * 2
final <- data.frame(row.names=row.names(tableWithLists))
for (i in 1:length(pivots)) {
if(i %% 2 == 1) {
final <- cbind(final,
tableWithLists[!listCols][indNonList:((indNonList<-indNonList+pivots[[i]])-1)]
)
} else {
final <- cbind(final,
flattened[indList:((indList<-indList+pivots[[i]])-1)]
)
}
}
return(final)
}
#=====================================
listFlatten <- function(obj, filler=NA) {
## Flattens obj like rbind, but if elements are of different length, plugs in value filler
# Initialize Vars
bind <- FALSE
# IF ALL ELEMENTS ARE MATRIX-LIKE OR VECTORS, MAKE SURE SAME NUMBER OF COLUMNS
matLike <- sapply(obj, function(x) !is.null(dim(x)))
vecLike <- sapply(obj, is.vector)
# If all matrix-like.
if (all(matLike)) {
maxLng <- max(sapply(obj[matLike], ncol))
obj[matLike] <- lapply(obj[matLike], function(x) t(apply(x, 1, c, rep(filler, maxLng - ncol(x)))))
bind <- TRUE
# If all vector-like
} else if (all(vecLike)) {
maxLng <- max(sapply(obj[vecLike], length))
obj[vecLike] <- lapply(obj[vecLike], function(x) c(x, rep(filler, maxLng - length(x))))
bind <- TRUE
# If all are either matrix- or vector-like
} else if (all(matLike | vecLike)) {
maxLng <- max(sapply(obj[matLike], ncol), sapply(obj[vecLike], length))
# Add in filler's as needed
obj[matLike] <-
lapply(obj[matLike], function(x) t(apply(x, 1, c, rep(filler, maxLng - ncol(x)))))
obj[vecLike] <-
lapply(obj[vecLike], function(x) c(x, rep(filler, maxLng - length(x))))
bind <- TRUE
}
# If processed and ready to be returned, then just clean it up
if(bind) {
ret <- (do.call(rbind, obj))
colnames(ret) <- paste0("L", fw0(1:ncol(ret), digs=2))
return(ret)
}
# Otherwise, if obj is still a list, continue recursively
if (is.list(obj)) {
return(lapply(obj, listFlatten))
}
# If none of the above, return an error.
stop("Unknown object type")
}
#--------------------------------------------
findGroupRanges <- function(booleanVec) {
# returns list of indexes indicating a series of identical values
pivots <- which(sapply(2:length(booleanVec), function(i) booleanVec[[i]] != booleanVec[[i-1]]))
### THIS ISNT NEEDED...
# if (identical(pivots, numeric(0)))
# pivots <- length(booleanVec)
pivots <- c(0, pivots, length(booleanVec))
lapply(seq(2, length(pivots)), function(i)
seq(pivots[i-1]+1, pivots[i])
)
}
#--------------------------------------------
fw0 <- function(num, digs=NULL, mkSeq=TRUE) {
## formats digits with leading 0's.
## num should be an integer or range of integers.
## if mkSeq=T, then a num of length 1 will be expanded to seq(1, num).
# TODO 1: put more error check
# lists are handled element by element; return so the result is not discarded
if (is.list(num))
return(lapply(num, fw0))
if (!is.vector(num)) {
stop("num should be integer or vector")
}
# convert strings to numbers
num <- as.numeric(num)
# If num is a single number and mkSeq is T, expand to seq(1, num)
if(mkSeq && !length(num)>1)
num <- (1:num)
# number of digits is that of largest number or digs, whichever is max
digs <- max(nchar(max(abs(num))), digs)
# if there are a mix of neg & pos numbers, add a space for pos numbs
posSpace <- ifelse(sign(max(num)) != sign(min(num)), " ", "")
# return: paste appropriate 0's and preface neg/pos mark
sapply(num, function(x) ifelse(x<0,
paste0("-", paste0(rep(0, max(0, digs-nchar(abs(x)))), collapse=""), abs(x)),
paste0(posSpace, paste0(rep(0, max(0, digs-nchar(abs(x)))), collapse=""), x)
))
}
#-----------------------------------------------

You can use dput for this.
dput(dataset, "dataset.txt")
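A file written with dput() can be read back with dget(), and the list column survives the round trip. The sketch below uses a shortened, made-up version of the dataset and a temporary file path.

```r
# Shortened example data with a list column (made-up values)
dataset <- data.frame(a = c(1, 2), b = c("a", "b"))
dataset$l <- list(c("a", "b"), c(2, 3, 4))

path <- tempfile(fileext = ".txt")
dput(dataset, path)      # writes R code that recreates the object

restored <- dget(path)   # evaluates that code again
str(restored$l)
```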

you can also use save()
save(dataset, file="dataset.RData")

The answer provided by @Ananda is excellent. However, I ran into an issue when I had a data frame with two columns that were lists.
dataset<-data.frame(a=c(1,2,3,4),b=c('a','b','c','d'), c=c("HI","DD","gg","ff"))
g=list(c("a","b"),c(2,3,4), c(44,33,11,22),c("chr","ID","i","II"))
dataset$l<-g
dataset$l2<-g
dataset
a b c l l2
1 1 a HI a, b a, b
2 2 b DD 2, 3, 4 2, 3, 4
3 3 c gg 44, 33, 11, 22 44, 33, 11, 22
4 4 d ff chr, ID, i, II chr, ID, i, II
Using the original answer, both list columns contain the concatenated contents of both columns.
a b c l l2
1 1 a HI a, b, a, b a, b, a, b
2 2 b DD 2, 3, 4, 2, 3, 4 2, 3, 4, 2, 3, 4
3 3 c gg 44, 33, 11, 22, 44, 33, 11, 22 44, 33, 11, 22, 44, 33, 11, 22
4 4 d ff chr, ID, i, II, chr, ID, i, II chr, ID, i, II, chr, ID, i, II
Instead, try this modified version:
dataset2 <- dataset # make a copy just to be on the safe side
dataset2[sapply(dataset2, is.list)] <-
sapply(dataset2[sapply(dataset2, is.list)],
function(x)sapply(x, function(y) paste(unlist(y),collapse=", ") ) )
dataset2
a b c l l2
1 1 a HI a, b a, b
2 2 b DD 2, 3, 4 2, 3, 4
3 3 c gg 44, 33, 11, 22 44, 33, 11, 22
4 4 d ff chr, ID, i, II chr, ID, i, II

I stumbled across this and while there are a lot of great answers, I ended up doing something else. Sharing for posterity.
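library(dplyr)
library(readr) # provides write_csv()
flatten_list = function(x){
  # leave ordinary (non-list) columns untouched
  if (typeof(x) != "list") {
    return(x)
  }
  # collapse each list element into a single " | "-separated string
  sapply(x, function(y) paste(y, collapse = " | "))
}
data %>%
  mutate(across(everything(), flatten_list)) %>% # needs dplyr >= 1.0.0; replaces mutate_each(funs(...))
  write_csv("data.csv")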

Function argument as a part of the output name

Perhaps a silly question, but I can't find an answer to it anywhere (that I've looked :P ). I am trying to create a function with two arguments, which will be vectors (e.g. x = c(a, b, c) and y = c(50, 75, 100)). The function should calculate all combinations of these and use the argument as part of the output name. E.g.
function(x,y)
df$output_a_50 = a*2+50^2
df$output_a_75 = a*2+75^2
.....
Any suggestions will be appreciated :)

As @Spacedman and others discussed, your problem is that if you pass c(a, b, c) to your function, the names will be lost. The best alternative, in my opinion, is to pass a list:
foo <- function(x, y) {
  df <- list()
  for (xx in names(x)) {
    for (yy in y) {
      varname <- paste("output", xx, yy, sep = "_")
      df[[varname]] <- x[[xx]]*2 + yy^2
    }
  }
  df
}
foo(x = list(a = NA, b = 1, c = 2:3),
y = c(50, 75, 100))
# $output_a_50
# [1] NA
#
# $output_a_75
# [1] NA
#
# $output_a_100
# [1] NA
#
# $output_b_50
# [1] 2502
#
# $output_b_75
# [1] 5627
#
# $output_b_100
# [1] 10002
#
# $output_c_50
# [1] 2504 2506
#
# $output_c_75
# [1] 5629 5631
#
# $output_c_100
# [1] 10004 10006
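If every element of x is a single number, the nested loops can be replaced by a vectorised variant built on expand.grid(). This is a sketch under that assumption. The values of x and the names grid and out are made up for the example.

```r
x <- c(a = 1, b = 2, c = 3)   # hypothetical values, one number per name
y <- c(50, 75, 100)

# All name/value combinations; y varies fastest, matching the loop order above
grid <- expand.grid(y = y, name = names(x), stringsAsFactors = FALSE)

out <- setNames(
  unname(x[grid$name]) * 2 + grid$y^2,
  paste("output", grid$name, grid$y, sep = "_")
)
out[["output_b_75"]]
## [1] 5629
```

The result is a named numeric vector rather than a list, so as.list(out) would be needed to get the same shape as foo() returns.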
