I have dumped several .txt files to an SQLite database on my computer's hard disk using the RSQLite package. Since the .txt files have no headers, I have to use the "header = FALSE" argument. Here is what my code looks like:
for (i in 1:8) {
  dbWriteTable(conn = db, name = tbls[i], value = paths[i],
               row.names = FALSE, header = FALSE, sep = "\t",
               overwrite = TRUE)
}
Now I want to add column names to the tables in the SQLite database. How can I do that?
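One way to add the names afterwards is to send ALTER TABLE statements through DBI. This is only a sketch: it assumes SQLite 3.25+ (which introduced RENAME COLUMN) and that all eight tables share the same layout; new_names is a hypothetical vector of the names you want, and the current names are fetched with dbListFields() rather than guessed.
library(DBI)
new_names <- c("id", "date", "value")      # hypothetical target names
for (i in 1:8) {
  old_names <- dbListFields(db, tbls[i])   # the default names, e.g. "V1", "V2", ...
  for (j in seq_along(new_names)) {
    dbExecute(db, sprintf('ALTER TABLE "%s" RENAME COLUMN "%s" TO "%s"',
                          tbls[i], old_names[j], new_names[j]))
  }
}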
Related
I have a folder containing several .csv files. I need to delete the first three rows and the very last row of all those .csv files and then save them all as .txt. All the files have the same format so it's always the same rows that I would need to delete.
I know how to modify a single data frame, but I do not know how to load, modify, and save several data frames as .txt.
I am a beginner using R so I do not have examples of things I have tried yet.
Any help will be really appreciated!
Starting out on Stack Overflow is hard, but the other comments about reproducible examples are worth keeping in mind for the future. My suggestion would be to write a function that reads, modifies, and writes one file, and then loop it across all the files.
I can't tell exactly how to do this since I can't see your data, but something like this should work:
library('tidyverse')
old_paths = list.files(
  path = your_folder,
  pattern = '\\.csv$',
  full.names = TRUE
)
read_write = function(path){
  new_filename = str_replace(
    string = path,
    pattern = '\\.csv$',
    replacement = '.txt'
  )
  read_csv(path) %>%
    slice(-(1:3)) %>%        # drop the first three rows
    slice(-n()) %>%          # drop the last row
    write_tsv(new_filename) %>%
    invisible()
}
lapply(old_paths, read_write)
Let's do this for one data frame first, referencing only its file name:
input_file = "my_data_1.csv"
data = read.csv(input_file)
# modify
data = data[-(1:3), ] # delete first 3 rows
data = data[-nrow(data), ] # delete last row
# save as .txt
output_file = sub("csv$", "txt", input_file)
write.table(x = data, file = output_file, sep = "\t", row.names = FALSE)
Now we can turn it into a function taking the file name as an argument:
my_txt_convert = function(input_file) {
  data = read.csv(input_file)
  # modify
  data = data[-(1:3), ]        # delete first 3 rows
  data = data[-nrow(data), ]   # delete last row
  # save as .txt
  output_file = sub("csv$", "txt", input_file)
  write.table(x = data, file = output_file, sep = "\t", row.names = FALSE)
}
Then we call the function on all your files:
to_convert = list.files(pattern = '\\.csv$')
for (file in to_convert) {
  my_txt_convert(file)
}
# or
lapply(to_convert, my_txt_convert)
I have multiple EEG data files in .txt format all saved in a single folder, and I would like R to read all the files in said folder, add column headings (i.e., electrode numbers denoted by ordered numbers from 1 to 129) to every file, and overwrite old files with new ones.
rm(list=ls())
setwd("C:/path/to/directory")
files <- Sys.glob("*.txt")
for (file in files){
  # read data:
  df <- read.delim(file, header = TRUE, sep = ",")
  # add header to every file:
  colnames(df) <- paste("electrode", 1:129, sep = "")
  # overwrite old text files with new text files:
  write.table(df, file, append = FALSE, quote = FALSE, sep = ",", row.names = FALSE, col.names = TRUE)
}
I expect the column headings of ordered numbers (i.e., electrode1 to electrode129) to appear on the first row of every text file, but the code doesn't seem to work.
I bet the solution is ridiculously simple, but I just haven't found any useful information regarding this issue...
Try this one:
for (file in files) {
  df = read.delim(file, header = FALSE, sep = ",")
  colnames(df) = paste("electrode", 1:129, sep = "")
  # write back to the same path so the old file is overwritten:
  write.table(df, file = file, quote = FALSE, sep = ",", row.names = FALSE)
}
Some background for my question: This is an R script that a previous research assistant wrote, but he did not provide any guidance to me on using it for myself. After working through an R textbook, I attempted to use the code on my data files.
What this code is supposed to do is load multiple .csv files, delete certain items/columns from them, and then write the new cleaned .csv files to a specified directory.
When I run my code, I don't get any errors, but the code isn't doing anything. I originally thought this was a problem with file permissions, but I'm still having the problem after changing them. Not sure what to try next.
Here's the code:
library(data.table)
library(magrittr)
library(stringr)
# create a function to delete unnecessary variables from a CAFAS or PECFAS
# data set and save the reduced copy
del.items <- function(file)
{
  data <- read.csv(input = paste0("../data/pecfas|cafas/raw",
    str_match(pattern = "cafas|pecfas", string = file) %>% tolower, "/raw/",
    file), sep = ",", header = TRUE, na.strings = "", stringsAsFactors = FALSE,
    skip = 0, colClasses = "character", data.table = FALSE)
  data <- data[-grep(pattern = "^(CA|PEC)FAS_E[0-9]+(T(Initial|[0-9]+|Exit)|SP[a-z])_(G|S|Item)[0-9]+$", x = names(data))]
  write.csv(data, file = paste0("../data/pecfas|cafas/items-del",
    str_match(pattern = "cafas|pecfas", string = file) %>% tolower,
    "/items-del/", sub(pattern = "ExportData_", x = file, replacement = "")) %>%
    tolower, sep = ",", row.names = FALSE, col.names = TRUE)
}
# delete items from all cafas data sets
cafas.files <- list.files("../data/cafas/raw/", pattern = ".csv")
for (file in cafas.files){
  del.items(file)
}
# delete items from all pecfas data sets
pecfas.files <- list.files("../data/pecfas/raw/", pattern = ".csv")
for (file in pecfas.files){
  del.items(file)
}
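One diagnostic worth trying (a guess, not confirmed by the poster): if list.files() returns an empty character vector, both loops run zero times, which exactly matches code that finishes without errors but does nothing. Note also that input = and data.table = are arguments of data.table::fread(), not of read.csv(), so the reading step may not be doing what was intended.
# quick checks, run from the script's working directory:
length(cafas.files)                                   # 0 means the loop never runs
normalizePath("../data/cafas/raw", mustWork = FALSE)  # where R is actually looking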
I'm writing to an sqlite db from R using the following command:
dbWriteTable(con, 'topics',as.data.frame(topics), row.names = NA, overwrite = FALSE, append = TRUE, field.types = NULL)
The resulting sqlite table contains a column named row_names, taken from the row names of the data frame [as.data.frame(topics)]. How can I rename that row_names column?
This is what the row.names argument to dbWriteTable is for: Set it to a character value to rename the column, set it to NULL to avoid writing it altogether. Explore the guessRowName() function in the DBI package for other options.
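For example (a minimal sketch; topic_id is just a placeholder for whatever name you want):
# write the data frame's row names to a column called "topic_id"
# instead of the default "row_names":
dbWriteTable(con, 'topics', as.data.frame(topics),
             row.names = "topic_id", append = TRUE)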
I am working on combining csv files into one large csv file that will not be able to fit into my machine's RAM. Is there any way to go about doing that in R? I realize that I could load each individual csv file into R and append it to an existing database table, but for quirky reasons I'm looking to end up with a large csv file.
Try to read each csv file one by one and write it out with write.table and the option append = T.
Something like this:
read one csv file;
write.table(..., append = T) to the final csv file;
remove the table with rm();
call gc().
Repeat until all files are written out.
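A sketch of that loop (the input list csv_files and the output name combined.csv are placeholders):
csv_files <- list.files(pattern = "\\.csv$")
for (i in seq_along(csv_files)) {
  tbl <- read.csv(csv_files[i])
  # write the header only for the first file, append for the rest:
  write.table(tbl, "combined.csv", sep = ",", row.names = FALSE,
              col.names = i == 1, append = i > 1)
  rm(tbl)  # drop the table before reading the next file
  gc()
}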
You can use the option append = TRUE:
first <- data.frame(x = c(1,2), y = c(10,20))
second <- data.frame(x = c(3,4), y = c(30,40))
write.table(first, "file.csv", sep = ",", row.names = FALSE)
write.table(second, "file.csv", append = TRUE, sep = ",", row.names = FALSE, col.names = FALSE)
First create 3 test files, and then create a variable Files containing their names. We used Sys.glob to get the vector of file names, but you may need to modify this statement. Then define outFile as the name of the output file. For each component of Files, read in the file with that name and write it out: if it is the first file, write it out in full; if it is a subsequent file, write all of it except the header, being sure to use append = TRUE. Note that L is overwritten each time a file is read in, so only one file takes up space at a time.
# create test files using built in data frame BOD
write.csv(BOD, "BOD1.csv", row.names = FALSE)
write.csv(BOD, "BOD2.csv", row.names = FALSE)
write.csv(BOD, "BOD3.csv", row.names = FALSE)
Files <- Sys.glob("BOD*.csv") # modify as appropriate
outFile <- "out.csv"
for(f in Files) {
  L <- readLines(f)
  if (f == Files[1]) cat(L, file = outFile, sep = "\n")
  else cat(L[-1], file = outFile, sep = "\n", append = TRUE)
}
# check that the output file was written properly
file.show(outFile)
The loop could alternately be replaced with this:
for(f in Files) {
  d <- read.csv(f)
  first <- f == Files[1]
  write.table(d, outFile, sep = ",", row.names = FALSE, col.names = first, append = !first)
}