How to create a custom write.table function? - r

I can use the write.table function to write output from a data.frame:
> write.table(head(cars), sep = "|", row.names=FALSE)
"speed"|"dist"
4|2
4|10
7|4
7|22
8|16
9|10
How can I create my own write.table function that produces output like this (a header with double pipes, and data rows with leading and trailing pipes)?
||"speed"||"dist"||
|4|2|
|4|10|
|7|4|
|7|22|
|8|16|
|9|10|

write.table can get you part of the way, but you will still need to do some fiddling around to get things to work just as you want.
Here's an example:
x <- capture.output(
write.table(head(cars), sep = "|", row.names = FALSE, eol = "|\n"))
x2 <- paste0("|", x)
x2[1] <- gsub("|", "||", x2[1], fixed=TRUE)
cat(x2, sep = "\n")
# ||"speed"||"dist"||
# |4|2|
# |4|10|
# |7|4|
# |7|22|
# |8|16|
# |9|10|
As a function, I guess in its most basic form it could look something like:
write.myOut <- function(inDF, outputFile) {
  x <- capture.output(
    write.table(inDF, sep = "|", row.names = FALSE, eol = "|\n"))
  x <- paste0("|", x)
  x[1] <- gsub("|", "||", x[1], fixed = TRUE)
  cat(x, sep = "\n", file = outputFile)
}

I don't think that it is possible with write.table. Here is a workaround:
# function for formatting a row
rowFun <- function(x, sep = "|") {
paste0(sep, paste(x, collapse = sep), sep)
}
# create strings
rows <- apply(head(cars), 1, rowFun)
header <- rowFun(sprintf("\"%s\"", names(cars)), sep = "||")
# combine header and row strings
vec <- c(header, rows)
# write the vector
write(vec, sep = "\n", file = "myfile.sep")
The resulting file:
||"speed"||"dist"||
|4|2|
|4|10|
|7|4|
|7|22|
|8|16|
|9|10|
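Both answers build the lines as strings and then write them out. A minimal sketch of the same idea without capture.output or apply is given here; the name format_piped is hypothetical and not part of base R, and the sketch assumes every column can be converted with as.character.

```r
# Hypothetical helper: format a data frame as pipe-delimited lines,
# with a "||"-delimited, double-quoted header.
format_piped <- function(df) {
  # Header: wrap each column name in double quotes, join with "||".
  header <- paste0("||", paste0('"', names(df), '"', collapse = "||"), "||")
  # Body: convert each column to character, then paste row-wise with "|".
  cols <- lapply(df, as.character)
  body <- paste0("|", do.call(paste, c(cols, sep = "|")), "|")
  c(header, body)
}

lines <- format_piped(head(cars))
writeLines(lines)                  # print to the console
# writeLines(lines, "myfile.sep")  # or write to a file
```

Returning the lines rather than writing them directly keeps the formatting step easy to inspect before anything touches the disk.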

Related

generate variable names in for loop

Hope you don't mind if this is too easy for you.
In R, I am using fromJSON() to read from 3 URLs (tier 1 URLs). Each JSON file has a "link" field that gives me another URL (tier 2 URL), which I pass to read.table() to get my final data. My code currently looks like this:
# note, this code does not run
urlJohn <- www.foo1.com
urlJane <- www.foo2.com
urlJoe <- www.foo3.com
tempJohn <- fromJson(urlJohn)
tempJohn[["data"]][["rows"]]$link %<>%
{clean up this data}
dataJohn <- read.table(tempJohn[["data"]][["rows"]]$link,
header = TRUE,
sep = ",")
tempJane <- fromJson(urlJane)
tempJane[["data"]][["rows"]]$link %<>%
{clean up this data}
dataJane <- read.table(tempJane[["data"]][["rows"]]$link,
header = TRUE,
sep = ",")
tempJoe <- fromJson(urlJoe)
tempJoe[["data"]][["rows"]]$link %<>%
{clean up this data}
dataJoe <- read.table(tempJoe[["data"]][["rows"]]$link,
header = TRUE,
sep = ",")
As you can see, I am just copying-n-pasting code blocks. What I wish is this:
# note, this code also does not run
urlJohn <- www.foo1.com
urlJane <- www.foo2.com
urlJoe <- www.foo3.com
source <- c("John", "Jane", "joe")
for (i in source){
temp <- paste(temp, i, sep = "")
url <- paste(url, i, sep = "")
data <- paste(data, i, sep = "")
temp <- fromJson(url)
temp[["data"]][["rows"]]$link %<>%
{clean up this data}
data <- read.table(temp[["data"]][["rows"]]$link,
header = TRUE,
sep = ",")
}
What do I need to do to make the for loop work? If my question is not clear, please ask me to clarify it.
I usually find lapply more convenient than a for loop, although you can easily convert this to a for loop if needed.
URLs <- c('www.foo1.com', 'www.foo2.com', 'www.foo3.com')
lapply(URLs, function(x) {
temp <- jsonlite::fromJSON(x)
temp[["data"]][["rows"]]$link %<>% {clean up this data}
read.table(temp[["data"]][["rows"]]$link,header = TRUE,sep = ",")
}) -> list_data
list_data
Thanks to @Ronak Shah. The R community strongly favors "non-for-loop" solutions.
The way to get my desired result is lapply.
Below is non-running pseudocode:
URLs <- c('www.foo1.com', 'www.foo2.com', 'www.foo3.com')
lapply(URLs, function(x) {
temp <- jsonlite::fromJSON(x)
x <- temp[["data"]][["rows"]]$link %<>% {clean up this data}
y <- read.table(temp[["data"]][["rows"]]$link,header = TRUE,sep = ",")
return(list(x, y))
})
And this is a running example.
x <- list(alpha = 1:10,
          beta = exp(-3:3),
          logic = c(TRUE, FALSE, FALSE, TRUE))
lapply(x, function(x) {
  temp <- sum(x) / 2
  temp2 <- list(x, temp)
  return(temp2)
})
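The original question asked for per-person variables such as dataJohn and dataJane. One way to get the same effect with lapply is to name the input vector, since lapply carries the names over to the result. The sketch below assumes a hypothetical read_one function standing in for the fromJSON() and read.table() steps, so it runs without network access.

```r
# Hypothetical reader standing in for fromJSON() + read.table();
# here it just returns a small data frame labelled with the URL.
read_one <- function(url) {
  data.frame(source = url, value = nchar(url))
}

# Name the URLs so the results carry the same names.
urls <- c(John = "www.foo1.com", Jane = "www.foo2.com", Joe = "www.foo3.com")
data_list <- lapply(urls, read_one)

names(data_list)      # "John" "Jane" "Joe"
data_list[["Jane"]]   # takes the place of a separate dataJane variable
```

Keeping the results in one named list avoids building variable names with paste() and assign().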

Conditionally read a csv with a given separator and if that separator does not work read.csv using a different separator

I am trying to download some CSV files from a set of links. Most of them are separated by ; but one or two are separated by ,. Running the following code:
foo <- function(csvURL){
downloadedCSV = read.csv(csvURL, stringsAsFactors = FALSE, fileEncoding = "latin1", sep = ";")
return(downloadedCSV)
}
dat <- purrr::map(links, foo)
This gives me a list of 3 data.frames. Two of them have 2 columns (read correctly with the ; separator) and one has 1 column (read incorrectly, because that file uses the , separator).
How can I make the function re-read the data with , instead of ; when the number of columns is 1? I tried passing sep = ";|," to read.csv but had no luck.
Links data:
links <- c("https://dadesobertes.gva.es/dataset/686fc564-7f2a-4f22-ab4e-0fa104453d47/resource/bebd28d6-0de6-4536-b522-d013301ffd9d/download/covid-19-total-acumulado-de-casos-confirmados-pcr-altas-epidemiologicas-personas-fallecidas-y-da.csv",
"https://dadesobertes.gva.es/dataset/686fc564-7f2a-4f22-ab4e-0fa104453d47/resource/b4b4d90b-08cf-49e4-bef1-5608311ce78a/download/covid-19-total-acumulado-de-casos-confirmados-pcr-altas-epidemiologicas-personas-fallecidas-y-da.csv",
"https://dadesobertes.gva.es/dataset/686fc564-7f2a-4f22-ab4e-0fa104453d47/resource/62990e05-9530-4f2f-ac41-3fad722b8515/download/covid-19-total-acumulado-de-casos-confirmados-pcr-altas-epidemiologicas-personas-fallecidas-y-da.csv"
)
We can also specify the sep as an argument:
foo <- function(csvURL, sep){
  downloadedCSV = read.csv(csvURL, stringsAsFactors = FALSE,
                           fileEncoding = "latin1", sep = sep)
  return(downloadedCSV)
}
lstdat <- purrr::map2(links, c(";", ",", ";"), ~ foo(.x, sep = .y))
Or use fread from data.table, which can pick up the delimiter automatically
foo <- function(csvURL){
downloadedCSV = data.table::fread(csvURL, encoding = "Latin-1")
return(downloadedCSV)
}
dat <- purrr::map(links, foo)
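The question itself asked for a re-read when only one column comes back. A minimal sketch of that fallback follows; read_csv_fallback is a hypothetical name, and two small temporary files stand in for the downloaded CSVs. Extra arguments such as fileEncoding = "latin1" could be passed through the dots.

```r
# Hypothetical helper: try each separator in turn and stop as soon as
# the result has more than one column.
read_csv_fallback <- function(path, seps = c(";", ","), ...) {
  for (s in seps) {
    out <- read.csv(path, sep = s, stringsAsFactors = FALSE, ...)
    if (ncol(out) > 1) break
  }
  out
}

# Two small local files stand in for the downloaded CSVs.
f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")
writeLines(c("a;b", "1;2", "3;4"), f1)
writeLines(c("a,b", "1,2", "3,4"), f2)

dat <- lapply(c(f1, f2), read_csv_fallback)
sapply(dat, ncol)
```

Note that with a URL as input this approach downloads the file a second time whenever the first separator fails, and a file that genuinely has a single column is returned as read with the last separator.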

How to correct the "condition has length > 1" warning from the `if` function? [duplicate]

I have written a loop that reads input from a list of files and appends it using rbind.
dat <- NA
file.names <- list.files(paste(in.path2,"CSV",sep =""))
for(f in file.names){
file <- paste(in.path2,"CSV/", f, sep = "")
tmp <- read.csv(file, stringsAsFactors = F, na.strings = c("", " "))
if (is.na(dat)) {
dat <- tmp
} else {
colnames(tmp) <- colnames(dat)
dat <- rbind(dat, tmp)
}
print(f)
}
I am getting this warning:
1: In if (is.na(dat)) { ... :
the condition has length > 1 and only the first element will be used.
How to correct this?
I would strongly suggest not growing your data.frame like that. Collect the pieces in a list and bind them once instead:
file.names <- list.files(paste(in.path2,"CSV",sep =""))
input_list <- list()
for(f in file.names){
file <- paste(in.path2,"CSV/", f, sep = "")
input_list[[f]] <- read.csv(file, stringsAsFactors = F, na.strings = c("", " "))
print(f)
}
dat <- do.call(rbind, input_list)
This is much faster, and you do not need to test whether dat is NA.
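As for why the warning appears: once dat holds a data.frame, is.na(dat) returns one logical value per cell rather than a single TRUE or FALSE, so if() receives a condition of length greater than one (from R 4.2.0 onward this is an error rather than a warning). A small sketch, assuming a toy data frame in place of the CSV files, shows the length of that result and a NULL sentinel that gives a length-one test:

```r
# is.na() on a data frame returns a logical matrix, one value per cell,
# so it is not a valid single condition for if().
df <- data.frame(a = 1:2, b = c("x", NA))
length(is.na(df))   # 4 values, not 1

# A NULL sentinel gives a length-one test with is.null().
dat <- NULL
for (piece in list(df, df)) {
  if (is.null(dat)) {
    dat <- piece
  } else {
    dat <- rbind(dat, piece)
  }
}
nrow(dat)
```

This keeps the shape of the original loop; the list-then-bind approach is still the better choice for many files.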
We can do this more easily with lapply, without worrying about the NA assignment and the if/else clauses:
filenames <- list.files(paste0(in.path2,"CSV"), full.names = TRUE)
do.call(rbind,lapply(filenames, read.csv, na.strings = c("", " "), stringsAsFactors = FALSE))
Or another option is fread from data.table
library(data.table)
rbindlist(lapply(filenames, fread, na.strings = c("", " ")), fill = TRUE)
Or with tidyverse
library(tidyverse)
map_df(filenames, read_csv, na = c("", " "))
If the columns are not the same, then
map(filenames, read_csv, na = c("", " ")) %>%
bind_rows

R solving hackerrank challenge

I would like to solve the challenge, and my preferred language is R. I am not sure how to receive input. The HackerRank coding window says:
"# Enter your code here. Read input from STDIN. Print output to STDOUT"
So far I have been reading input with
v1 <- readline("Enter two integers: ")
How should I receive input on HackerRank? I looked for solved examples but couldn't find any.
update 1
The code below works in R. The only problem is that the number of steps and the ball values are not read from keyboard input; they have to be set manually on lines 1 and 2. How can I change the solution so that it works on HackerRank?
steps=4
ball_numbers=c(1,2,2,2)
d=as.data.frame(c(0,1))
for (i in (1:(length(ball_numbers)-1)))
{
assign(x = paste("A", i, sep = ""),value = c(0,1))
e <- as.data.frame(get(paste("A", i, sep = "")))
colnames(e) <- paste("A", i, sep="")
d <- merge(d,e)
}
d=as.matrix(t(d))
answer=sum(ball_numbers %*% d)/ncol(d)
update 2
The code below produces the correct answer:
# Enter your code here. Read input from STDIN. Print output to STDOUT
nums <- read.table("/dev/stdin", sep=" ");
nums <- as.matrix(as.data.frame(t(nums)))
steps=nums[1]
ball_numbers=nums[2:length(nums)]
d=as.data.frame(c(0,1))
for (i in (1:(length(ball_numbers)-1)))
{
assign(paste("A", i, sep = ""),value = c(0,1))
e <- as.data.frame(get(paste("A", i, sep = "")))
colnames(e) <- paste("A", i, sep="")
d <- merge(d,e)
}
d=as.matrix(t(d))
#answer=as.numeric(format(round(sum(ball_numbers %*% d)/ncol(d),1),nsmall=1))
answer = print(format(sum(ball_numbers %*% d)/ncol(d),nsmall=1, digits = 1), quote = F)
write.table(as.numeric(answer), sep = "", append=T, row.names = F, col.names = F,quote = FALSE,)
I get the output below:
[1] 2.0
2
which differs from the expected output below. How can I modify my code to get the correct output format?
2.0
Look at the "warmup".
data <- suppressWarnings(read.table("stdin", sep=" "));
Alternatively you can use
data <- suppressWarnings(readLines(file("stdin")))
Also refer to this page on HackerRank.
I faced a similar issue reading input in R on HackerRank. To use readLines, I did the following:
input<-file('stdin', 'r')
x <- readLines(input, n=1)
To read a second value y, use the same approach:
y <- readLines(input, n=1)
#---this solves the problem
# Enter your code here. Read input from STDIN. Print output to STDOUT
nums <- suppressWarnings(readLines(file("stdin")))
#nums <- suppressWarnings(readLines(file("new.txt")))
nums <- as.matrix(as.data.frame(t(nums)))
class(nums) <- "numeric"
steps=nums[1]
ball_numbers=nums[2:length(nums)]
d=as.data.frame(c(0,1))
for (i in (1:(length(ball_numbers)-1)))
{
assign(paste("A", i, sep = ""),value = c(0,1))
e <- as.data.frame(get(paste("A", i, sep = "")))
colnames(e) <- paste("A", i, sep="")
d <- merge(d,e)
}
d=as.matrix(t(d))
answer=sum(ball_numbers %*% d)/ncol(d)
cat(format(answer, nsmall = 1), sep = "\n")
Another approach:
con = file('stdin', open ='r')
input = readLines(con)
z = c()
for(i in 2:length(input)){
z = c(z, as.numeric(input[[i]]))
}
cat(format(round(sum(z)/2, 1), nsmall = 1), sep = "\n")
A very handy one-liner to read in from standard input is the scan function, for instance:
text <- scan(file = 'stdin', what = 'character', sep = '\r')
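Pulling the pieces above together, here is a minimal sketch of reading numbers and printing a result with exactly one decimal place. The helper name read_numbers is hypothetical, and a textConnection stands in for file("stdin") so the sketch can be run outside HackerRank; the sum/2 formula is taken from the "another approach" answer above rather than from the challenge statement.

```r
# Hypothetical helper: read whitespace-separated numbers from a connection.
# On HackerRank the connection would be file("stdin"); a textConnection
# stands in for it here so the sketch can be run anywhere.
read_numbers <- function(con) {
  lines <- readLines(con)
  as.numeric(unlist(strsplit(trimws(lines), "[[:space:]]+")))
}

con <- textConnection(c("4", "1 2 2 2"))
nums <- read_numbers(con)
close(con)

steps <- nums[1]
ball_numbers <- nums[-1]

# Same formula as the sum(z)/2 approach above.
answer <- sum(ball_numbers) / 2

# sprintf("%.1f", ...) always keeps one decimal place, and cat() prints
# without the "[1]" prefix that print() adds.
cat(sprintf("%.1f", answer), "\n", sep = "")
```

Using cat() instead of print() or write.table() is what removes the "[1]" prefix and the duplicate line seen in the question's output.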

How can I write a comma between each value of a vector?

I would like to export a vector to a file with a comma between each value. I've tried to use paste with sep = "," but it does not work. Does anyone know why?
> x <- rnorm(10)
> paste(x, sep = ",")
[1] "-1.08574649988891" "2.9580381152357" "-0.549880906960338" "1.79794352588269" "-1.06316832291584" "-1.56636513327118" "0.363867572492577"
[8] "-0.644707355221403" "0.617827074223129" "-1.50892267338431"
Because sep separates objects; collapse separates elements.
> paste(letters[1:3], 1:3, sep=",", collapse="|")
[1] "a,1|b,2|c,3"
Try this:
> write.table(matrix(1:10, 1), sep = ",", row.names = FALSE, col.names = FALSE)
1,2,3,4,5,6,7,8,9,10
or this:
> cat(1:10, sep = ","); cat("\n")
1,2,3,4,5,6,7,8,9,10
Also note that both write.table and cat support file= and append= arguments.
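Applied to the original question, the sep versus collapse distinction can be sketched as follows. A fixed vector replaces rnorm(10) so the result is reproducible, and a temporary file stands in for the real export path.

```r
x <- c(1.5, 2, 3.25)

# collapse joins the elements of one vector into a single string.
line <- paste(x, collapse = ",")
line

# Write that single line to a (temporary) file and read it back.
out_file <- tempfile(fileext = ".txt")
writeLines(line, out_file)
readLines(out_file)
```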

Resources