R, sink/cat: Output something else than numbers? - r

I'm rather new to R and I guess there's more than one inadequate practice in my code (like using a for loop). I think in this example it could be solved better with something from the apply family, but I would have no idea how to do that in my original problem, so if possible, please let the for loop stay a for loop. If something else is bad, I'm happy to hear your opinion.
But my real problem is this. I have:
name <- c("a", "a", "a", "a", "a", "a","a", "a", "a", "b", "b", "b","b", "b", "b","b", "b", "b")
class <- c("c1", "c1", "c1", "c2", "c2", "c2", "c3", "c3", "c3","c1", "c1", "c1", "c2", "c2", "c2", "c3", "c3", "c3")
value <- c(100, 33, 80, 90, 80, 100, 100, 90, 80, 90, 80, 100, 100, 90, 80, 99, 80, 100)
df <- data.frame(name, class, value)
df
And I want to print it out. I use sink as well as hwriter (to output it as HTML) later on. I get the problem with both, so I hope it has the same cause and it's enough to solve it for sink. That's the code:
sink("stuff.txt")
for (i in 1:nrow(df)) {
cat(" Name:")
cat(df$name[i-1])
cat("\n")
cat(" Class:")
cat(df$class[i-1])
cat("\n")
}
sink()
file.show("stuff.txt")
Part of the output I get is something like:
Name:1
Class:1
Name:1
Class:2
Name:1
Class:2
On the other hand, the output I want should be like:
Name:a
Class:c1
Name:a
Class:c2
Name:a
Class:c2

cat was printing numbers because your character variables were converted to factors when you put them in the data.frame. This was the default behavior of data.frame() before R 4.0.0 (since then, stringsAsFactors defaults to FALSE). A factor is often a more efficient way to store the values because each distinct string is stored once as a level and every element holds only an integer code. Those integer codes are what you see when you cat the value.
If you don't want to use factors in your data.frame, you can use
df <- data.frame(name, class, value, stringsAsFactors=FALSE)
and this will keep the values as characters. Alternatively, you can convert to character when you print
cat(as.character(df$name[i]))
Note the index i rather than i-1: on the first iteration i-1 is 0, and indexing with 0 selects nothing.
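A minimal sketch of the behavior described above, using a made-up factor f rather than the data from the question. It only illustrates that cat emits the integer codes of a factor and that as.character recovers the labels.

```r
# Toy factor, invented for illustration (not the asker's data)
f <- factor(c("a", "b", "a"))

# cat() sees the underlying integer codes, not the labels
cat("Code: ", f[2], "\n")

# Converting first gives the label itself
cat("Label:", as.character(f[2]), "\n")

# The same happens for data.frame columns created with stringsAsFactors = TRUE
df_demo <- data.frame(name = c("a", "b"), stringsAsFactors = TRUE)
is.factor(df_demo$name)
```

Checking is.factor() on a column is a quick way to confirm whether this is what is happening in your own data.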

Related

R Shiny DataTables with column filters - preserve the styling

I use the DT package in Shiny and want to use filters, however I have a problem with the styling of the filters:
Code: DT::datatable(overview,editable=TRUE, rownames=FALSE, selection='none', filter='top')
This is the way my DT looks when I add items. When I leave the dropdown, it looks like the following picture (collapsed) which is of course pretty bad.
I want to preserve the styling with the grey box and not have anything like "[..." for my users. How can I do this? If that is not possible with DT, using another package is also an option for me.
Edit: Reproducible example
df <- data.frame(a = c("a", "a", "b", "b", "c", "c"), b = c(1,2,3,4,5,6), c("A", "B"))
df <- setNames(df, c("A", "B"))
DT::datatable(df,editable=TRUE, rownames=FALSE, selection='none', filter='top')
Thank you!

replacement has length zero on ave function in R

I'm working on renaming several pictures. length(filename) equals length(name), both with approximately 1.4 million entries; name_idx has about 350,000 entries.
filename <- c("1.jpg", "2.jpg", "3.jpg", ...., )
name <- c("a", "b", "c", "c", "a", "b",....,)
name_idx <- c("a", "b", "c", "d", ..., )
I want to create another list containing the new name of each picture: the first entry would be a.1.jpg, the 5th would be a.2.jpg, the 4th would be c.2.jpg, and so on.
Someone on Stack Overflow gave me this solution:
paste0(name, ".", ave(name, name, FUN = seq_along), ".jpg")
but I'm getting this error:
Error in x[...] <- m : replacement has length zero
I was hoping you could help me understand why this happens and how to solve it.
I finally did it in Python, with the glob and os libraries =3
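The thread does not establish what caused the error, so the following is only a sketch of the per-group counter idea on a six-element toy vector (the first six values shown in the question). Passing seq_along(name) as the first argument of ave, so that the counter is computed on integers rather than on the names themselves, is a common variant and is an assumption here, not a confirmed fix for the 1.4 million row case.

```r
# Toy data: the first six names from the question
name <- c("a", "b", "c", "c", "a", "b")

# Running counter within each group of identical names
idx <- ave(seq_along(name), name, FUN = seq_along)

# Combine name, counter and extension
new_name <- paste0(name, ".", idx, ".jpg")
print(new_name)
```

If name is a factor in the real data, checking for unused levels or NA values would be a reasonable first diagnostic step.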

How to save plots by names of items in a list

I can save a bunch of plots but it names them as the first value from each list item rather than the name of the variable.
delm2<-data.frame(N = c(5.881, 5.671, 7.628, 4.643, 6.598, 4.485, 4.465, 4.978, 4.698, 3.685, 4.915, 4.983, 3.288, 5.455, 5.411, 2.585, 4.321, 4.661),
t1 = c("N", "N", "T", "T", "N", "N", "T", "N", "N", "N", "N", "T", "T", "T", "T", "T", "T", "N"),
t3 = c("r","v", "r", "v", "v", "r", "c", "c", "v", "r", "c", "c", "r", "v","c", "r", "v", "c"),
B = c(1.3, 1.3, 1.33, 1.25, 1.4, 1.34, 1.36, 1.39, 1.36, 1.42, 1.38, 1.31, 1.37, 1.44, 1.22, 1.4, 1.46, 1.35))
library(boot)
lapply(as.list(delm2[,c('N','B')]),
function(i){
bmp(filename = paste0(i,".bmp"), width = 350, height = 400)
glm.diag.plots(glm(i ~ t1*t3,data=del))
dev.off()
})
This saves the plots but they are named with number values from the data rather than the name of each target of lapply...
i.e. current output is two files named "5.881" and "1.3", when I want the same two files but named "N" and "B"
I thought I could change paste0(i,".bmp") to paste0(names(i),".bmp") but that just saves the first one, with no name at all.
It looks like you can give names that are just integers in How to save and name multiple plots with R but I want the names of the variables from the list or the two numerics N and B in delm2.
It looks from Saving a list of plots by their names() like this would be easier with ggplot output but ggsave didn't work on one glm.diag.plots output.
(Disregard my previous suggestion using Map.)
The big takeaway is how to derive the formula dynamically. One way is with as.formula, which takes a string and converts into a formula that can be used in a model-generating function (for example).
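As a small illustration of that point, here is a sketch that builds a formula from a string. The variable name response is hypothetical; the column names N, t1 and t3 are the ones from the question.

```r
# Name of the response column, held as a plain string
response <- "N"

# Paste the pieces together, then convert the string to a formula
f <- as.formula(paste(response, "~ t1*t3"))

print(f)
class(f)     # "formula"
all.vars(f)  # the variables the formula refers to
```

The resulting object can be passed wherever a literal formula would be written, such as the first argument of glm.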
One problem with using lapply(as.list(delm2[,c('N','B')]), ...) is that the remainder of the data (i.e., columns t1 and t3) are not passed, just one vector at a time. (I'm wondering if your reference to del is a typo, un-released/hidden data, or something else.)
Try this:
lapply(c("N", "B"), function(nm) {
  # paste0 avoids the space that paste would insert before ".bmp"
  bmp(filename = paste0(nm, ".bmp"), width = 350, height = 400)
  glm.diag.plots(glm(as.formula(paste(nm, "~ t1*t3")), data = delm2))
  dev.off()
})
In general, I don't like breaching scope inside these functions. That is, I try to not reach outside lapply for data when I can pass it fairly easily. The above in a pedantic-language way could look like:
lapply(c("N", "B"), function(nm, x) {
  # x is the data passed in explicitly through lapply's extra argument
  bmp(filename = paste0(nm, ".bmp"), width = 350, height = 400)
  glm.diag.plots(glm(as.formula(paste(nm, "~ t1*t3")), data = x))
  dev.off()
}, x = delm2)
While this preserves scope, it may be confusing if you do not understand what is going on.
This might be a great time to use for instead of one of the *apply* functions. Everything you want is in side-effect, and since for and *apply are effectively the same speed, you gain readability:
for (nm in c("N", "B")) {
  # paste0 avoids the space that paste would insert before ".bmp"
  bmp(filename = paste0(nm, ".bmp"), width = 350, height = 400)
  glm.diag.plots(glm(as.formula(paste(nm, "~ t1*t3")), data = delm2))
  dev.off()
}
(In this case, there is no "scope breach", so I used the original variable.)
Parenthetically, to tie in my now-edited-out and incorrect answer that included Map, here's an example using Map that does more to demonstrate what Map (and mapply) are doing than to improve on your immediate need.
If for some reason you wanted them named something distinct from "N.bmp", you could do this:
Map(function(fn, vn, x) {
  # fn is the output file name, vn the response variable name, x the data
  bmp(filename = fn, width = 350, height = 400)
  glm.diag.plots(glm(as.formula(paste(vn, "~ t1*t3")), data = x))
  dev.off()
}, c("N2.bmp", "b3456.bmp"), c("N", "B"), list(delm2))
Two things to note from this:
The use of list(delm2) is to wrap that structure into a single "thing" that is passed repeatedly to the mapped function. If we passed just delm2 (no list(...)), then Map would try to pair the first column of delm2 with the first element of each of the other arguments. This might be useful in other scenarios, but in your glm example you need the other columns present, so you cannot include just one column at a time. (Well, there are ways to do that, too ... but they are not important at the moment.)
The first time the anonymous function is called, fn is "N2.bmp", vn is "N", and x is the full dataset of delm2. The second time the anon-func is called, fn is "b3456.bmp", vn is "B", and x is again the full dataset of delm2.
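To make the recycling of list(...) visible without writing any files, here is a minimal sketch with a made-up two-column data frame d. The function only reports what it receives on each call; it does not fit a model or open a graphics device.

```r
# Toy data frame, invented for illustration
d <- data.frame(N = 1:3, B = 4:6)

# Each call receives one file name, one variable name, and the whole data frame
res <- Map(function(fn, vn, x) {
  paste(fn, vn, nrow(x))
}, c("N2.bmp", "b3456.bmp"), c("N", "B"), list(d))

print(res)
```

Because list(d) has length one, it is recycled, so both calls see all three rows of d.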
I label this portion "parenthetic" because it really doesn't add to this problem, but since I started that way in my first-cut answer, I thought I'd continue with the methodology, the "why" of my choice of Map. In the end, I think the for solution or one of the lapply solutions should be fine for you.

Kable and pandoc.table takes only the first six rows

I'm trying to build a table using either pandoc.table or kable and have problems getting them to print all 10 rows of my table; at the moment they both print only the first six. I have since written the table manually, which works, but it would be nice to know what's wrong with my code. I haven't seen anything to suggest that 6 rows is the limit, so my code should be working. Does anyone know why it doesn't? If I subset the data.table I can print the last 4 rows as well, so maybe 6 rows is a limit after all. Code below:
library("data.table")
library("knitr")
library("pander")
count.mark <- 35
dt.tbl1 <- data.table(Var = c("Geo", "A", "A",
"Cust", "A",
"Ins", "A",
"Vei", "A",
"Brand"),
RangeR = c("A1", "S1", "T1",
"Com", "Pri",
"T", "B",
"Pa", "Pe",
paste("A1 - A99 (",
count.mark, ")", sep="")
)
)
pandoc.table(head(dt.tbl1), justify = c("left", "centre"))
kable(head(dt.tbl1), justify = c("left", "centre"))
That's because you're using head(dt.tbl1), which by default shows the first six rows. You should just do, e.g.
pandoc.table(dt.tbl1, justify = c("left", "centre"))
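A quick base R check of that explanation, using a made-up ten-row data frame tbl instead of the data.table from the question:

```r
# Ten-row toy table, invented for illustration
tbl <- data.frame(id = 1:10, label = letters[1:10])

nrow(head(tbl))          # head() defaults to n = 6
nrow(head(tbl, n = 10))  # asking for more rows explicitly
nrow(tbl)                # the full table
```

So dropping head(), or passing a larger n, is enough to get all rows into the printed table.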

How to define if else statement whereby the else statement can remove unwanted \t

I have a function, where one part reads as follows:
conefor.input <- function(conefor.file, onlyoverall, probmin, index)
{
  for(i in 1:100)
  {
    cat(paste(conefor.file,
              if(onlyoverall=="TRUE")
                {onlyoverall<-"onlyoverall"},
              distance,
              if(probmin=="TRUE")
                {probmin<-paste("-pcHeur", min)},
              index, sep="\t"),file="conef_command.txt", sep="\n")
  }
  return("conef_command.txt")
}
My objective is to generate 100 lines where each input is divided by a tab.
There is no problem when the two 'if' statements above are true. With the resulting being:
conefor.file\tonlyoverall\tdistance\tpcHeur min\tindex
However, when the two 'if' statements above are false, I am left with two tabs where 'onlyoverall' and 'pcHeur' would have been, whereas, what I really want is no action at all, and only one tab separating each argument.
Example when 'if' statements are false:
conefor.file\t\tdistance\t\tindex
What I want when 'if' statements are false:
conefor.file\tdistance\tindex
Many thanks in advance
It's hard to suggest any form of vectorization or similar for your for loop, since it's not clear (to me at least) what differences there are between iterations.
However, as for the paste statement inside of cat, you can use the following instead:
paste(ifelse(onlyoverall,
paste(conefor.file, onlyoverall<-"onlyoverall", sep="\t"),
conefor.file),
ifelse(probmin,
paste(distance, probmin<-paste("-pcHeur", min), sep="\t"),
distance),
index,
sep="\t")
Notice that this takes the argument that precedes each if statement and moves it inside an ifelse call, where it is supplied in both the TRUE and the FALSE branch, so no empty field is left behind. For example:
x1 <- x2 <- TRUE
paste(ifelse(x1, paste("A", "B" , sep="#"), "A"), ifelse(x2, paste("C", "D", sep="#"), "C"), "E", sep="#")
# [1] "A#B#C#D#E"
x1 <- x2 <- FALSE
paste(ifelse(x1, paste("A", "B" , sep="#"), "A"), ifelse(x2, paste("C", "D", sep="#"), "C"), "E", sep="#")
# [1] "A#C#E"
x2 <- TRUE
paste(ifelse(x1, paste("A", "B" , sep="#"), "A"), ifelse(x2, paste("C", "D", sep="#"), "C"), "E", sep="#")
# [1] "A#C#D#E"
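An alternative sketch, not taken from the answer above: collect the fields in a vector and let c() drop the ones that are absent, since an if without else returns NULL when its condition is false. The helper name build_line and the arguments distance and min_prob are hypothetical; the question leaves distance and min undefined, so they are passed in explicitly here.

```r
# Hypothetical helper: builds one tab-separated line, skipping optional fields
build_line <- function(conefor.file, onlyoverall, distance, probmin, min_prob, index) {
  fields <- c(
    conefor.file,
    if (isTRUE(onlyoverall)) "onlyoverall",          # NULL when FALSE, dropped by c()
    distance,
    if (isTRUE(probmin)) paste("-pcHeur", min_prob), # NULL when FALSE, dropped by c()
    index
  )
  paste(fields, collapse = "\t")
}

both_on  <- build_line("nodes.txt", TRUE, 500, TRUE, 0.01, "PC")
both_off <- build_line("nodes.txt", FALSE, 500, FALSE, 0.01, "PC")
print(both_on)
print(both_off)
```

With both flags FALSE the line contains only the three mandatory fields, each separated by a single tab.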
