Julia: Error reading tabular data from a txt file

I have a file called testfile.txt in my working directory, with the following contents:
5.00000000E+06 1.00000000E+07 1.86965370E+13 2.00000000E+04
1.50000000E+07 1.00000000E+07 1.67889215E+13 2.00000000E+04
2.50000000E+07 1.00000000E+07 1.50764483E+13 2.00000000E+04
3.50000000E+07 1.00000000E+07 1.35391442E+13 2.00000000E+04
4.50000000E+07 1.00000000E+07 1.21590771E+13 2.00000000E+04
5.50000000E+07 1.00000000E+07 1.09201484E+13 2.00000000E+04
6.50000000E+07 1.00000000E+07 9.80790597E+12 2.00000000E+04
I want to store this tabular data in a dataframe.
# using necessary packages
using DataFrames, Queryverse
df = load("testfile.txt",
header_exists = false) |> DataFrame
When I try this though, I get the error:
No applicable_loaders found for UNKNOWN
How can I resolve this?

The error
No applicable_loaders found for UNKNOWN
indicates that load cannot work out the file's format: it does not associate the .txt extension with any loader, so the format is reported as UNKNOWN. There are two options to fix this:
Explicitly state that the file is CSV, which it does know how to load:
df = load(File(format"CSV", "testfile.txt"), spacedelim=true, header_exists=false) |> DataFrame
Rename your data file to have a .csv extension:
df = load("testfile.csv", spacedelim=true, header_exists=false) |> DataFrame
Both will produce the output:
7×4 DataFrame
 Row │ Column1  Column2  Column3     Column4
     │ Float64  Float64  Float64     Float64
─────┼───────────────────────────────────────
   1 │   5.0e6    1.0e7  1.86965e13  20000.0
   2 │   1.5e7    1.0e7  1.67889e13  20000.0
   3 │   2.5e7    1.0e7  1.50764e13  20000.0
   4 │   3.5e7    1.0e7  1.35391e13  20000.0
   5 │   4.5e7    1.0e7  1.21591e13  20000.0
   6 │   5.5e7    1.0e7  1.09201e13  20000.0
   7 │   6.5e7    1.0e7  9.80791e12  20000.0
Note the added spacedelim=true keyword, which tells the loader that the values are separated by spaces, as they are in your example file.

Related

dir_tree() to HTML/Table

I have a very large folder with many subfolders and hence a large number of files. I would like to create an HTML file showing the folder structure, with dropdown options for the different levels as well as a search bar. I thought about reactable or a small shiny app, but maybe someone has an idea. My first problem is getting the structure from fs::dir_tree into a suitable format.
Consider the following folder structure:
fs::dir_tree()
├── folder1
├── folder2
│   └── readme.R
├── folder3
│   ├── subfolder1
│   │   ├── example.R
│   │   └── example2.R
│   └── subfolder2
│       └── plot.R
You can use the jsTreeR package.
You don't need a Shiny app, since a "jsTree" is an HTML widget that you can save as an HTML file with htmlwidgets::saveWidget.
Here is the folder example of this package:
library(jsTreeR)
# make the nodes list from a vector of file paths
makeNodes <- function(leaves){
  dfs <- lapply(strsplit(leaves, "/"), function(s){
    item <-
      Reduce(function(a,b) paste0(a,"/",b), s[-1], s[1], accumulate = TRUE)
    data.frame(
      item = item,
      parent = c("root", item[-length(item)]),
      stringsAsFactors = FALSE
    )
  })
  dat <- dfs[[1]]
  for(i in 2:length(dfs)){
    dat <- merge(dat, dfs[[i]], all = TRUE)
  }
  f <- function(parent){
    i <- match(parent, dat$item)
    item <- dat$item[i]
    children <- dat$item[dat$parent==item]
    label <- tail(strsplit(item, "/")[[1]], 1)
    if(length(children)){
      list(
        text = label,
        data = list(value = item),
        children = lapply(children, f)
      )
    }else{
      list(text = label, data = list(value = item))
    }
  }
  lapply(dat$item[dat$parent == "root"], f)
}
folder <-
  list.files(Sys.getenv("R_HOME"), recursive = TRUE)
nodes <- makeNodes(folder)
jstree(nodes, search = TRUE)
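As a rough illustration of the intermediate step inside makeNodes, the sketch below builds the flat item/parent table from a vector of relative paths using only base R. The paths vector and the path_table helper are hypothetical names made up for this example; they are not part of jsTreeR. A table like this could also feed another widget.

```r
# Hypothetical relative paths, mirroring the folder structure in the question
paths <- c(
  "folder2/readme.R",
  "folder3/subfolder1/example.R",
  "folder3/subfolder1/example2.R",
  "folder3/subfolder2/plot.R"
)

# For each path, list every ancestor together with its parent ("root" at the top)
path_table <- function(paths) {
  rows <- lapply(strsplit(paths, "/", fixed = TRUE), function(parts) {
    # cumulative paths: "folder3", "folder3/subfolder1", ...
    item <- vapply(seq_along(parts),
                   function(i) paste(parts[seq_len(i)], collapse = "/"),
                   character(1))
    data.frame(item = item,
               parent = c("root", item[-length(item)]),
               stringsAsFactors = FALSE)
  })
  unique(do.call(rbind, rows))  # folders shared by several files appear once
}

tab <- path_table(paths)
tab
```

Note that list.files(recursive = TRUE) returns files only by default, so an empty folder such as folder1 would not show up in a table built this way.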

Move files with specific name pattern to specific subfolders

So I would like to copy files to a specific folder based on a certain part of their name. My folder structure is shown below. The folders D1 and D2 each contain multiple files (two example names are shown here) and the subfolders Brightfield and FITC. I would like to move the .TIF files to either Brightfield or FITC, depending on whether the file name contains brightfield or FITC (see "What I would like").
Current situation:
main Directory
|
|___ Experiment
        ├── D1
             ├── Brightfield
             │   └── FITC
             |__ 20210205_DML_3_4_D1_PM_flow__w1brightfield 100 - CAM_s3_t5.TIF
             |__ 20210205_DML_3_4_D1_PM_flow__w2FITC 100- CAM_s3_t5.TIF
        └── D2
             ├── temperature
             └── weather
             |__ 20210219_DML_3_4_D2_AM_flow__w1brightfield 100 - CAM_s4_t10.TIF
             |__ 20210219_DML_3_4_D2_AM_flow__w2FITC 100- CAM_s4_t10.TIF
What I would like:
main Directory
|
|___ Experiment
        ├── D1
             ├── Brightfield
                  |__20210205_DML_3_4_D1_PM_flow__w1brightfield 100 - CAM_s3_t5.TIF
             └── FITC
                  |__ 20210205_DML_3_4_D1_PM_flow__w2FITC 100- CAM_s3_t5.TIF
        ├── D2
             ├── Brightfield
                  |__20210219_DML_3_4_D2_AM_flow__w1brightfield 100 - CAM_s4_t10.TIF
             └── FITC
                  |__20210219_DML_3_4_D2_AM_flow__w2FITC 100- CAM_s4_t10.TIF
In another question on Stack Overflow I found code that I thought I could adapt to my situation, but I get this error: Error in mapply(FUN = function (path, showWarnings = TRUE, recursive = FALSE, : zero-length inputs cannot be mixed with those of non-zero length. Apparently the data frame that should be built (in parts) contains only NA. The code I used is below:
files <- c("20210205_DML_3_4_D0_PM_flow__w1brightfield 100 - CAM_s3_t5.TIF", "20210205_DML_3_4_D0_PM_flow__w2FITC 100- CAM_s3_t5.TIF",
"20210219_DML_3_4_D1_AM_flow__w1brightfield 100 - CAM_s4_t10.TIF", "20210219_DML_3_4_D1_AM_flow__w2FITC 100- CAM_s4_t10.TIF")
# write some temp (empty) files for copying
for (f in files) writeLines(character(0), f)
parts <- strcapture(".*_(D[01])_*_([brightfield]|[FITC])_.*", files, list(d="", tw=""))
parts
# d tw
# 1 D0 Brightfield
# 2 D0 FITC
# 3 D1 Brightfield
# 4 D1 FITC
dirs <- do.call(file.path, parts[complete.cases(parts),])
dirs
# [1] "D0/Brightfield" "D0/FITC" "D1/Brightfield" "D1/FITC"
### pre-condition, only files, no dir-structure
list.files(".", pattern = "D[0-9]", full.names = TRUE, recursive = TRUE)
# [1] "./20210205_DML_3_4_D0_PM_flow__w1brightfield 100 - CAM_s3_t5.TIF" "./20210205_DML_3_4_D0_PM_flow__w2FITC 100- CAM_s3_t5.TIF"
### create dirs, move files
Vectorize(dir.create)(unique(dirs), recursive = TRUE) # creates both D0 and D0/Brightfield, ...
# D0/Brightfield D0/FITC D1/Brightfield D1/FITC
# TRUE TRUE TRUE TRUE
file.rename(files, file.path(dirs, files))
# [1] TRUE TRUE TRUE TRUE
### post-condition, files in the correct locations
list.files(".", pattern = "D[0-9]", full.names = TRUE, recursive = TRUE)
Where is it going wrong?
The pattern in the parts <- line is wrong. It should be:
parts <- strcapture(".*_(D[01])_.*(brightfield|FITC).*", files, list(d="", tw=""))
parts
Output:
> parts
   d          tw
1 D0 brightfield
2 D0        FITC
3 D1 brightfield
4 D1        FITC
There were a couple of errors:
You forgot a . in _*_. On its own, _* matches zero or more underscores; to match any run of characters you need .*, as in _.*_.
Don't put [] around the words brightfield and FITC. Square brackets define a character class that matches a single character from the set, not the whole word.
There aren't underscores around brightfield or FITC in your filenames, so don't put underscores around them in your regular expression.
May I recommend reading an introductory article or tutorial on regular expressions? It does not take much to learn what you need to overcome your problems here.
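To make the difference between a character class and an alternation concrete, here is a small base R sketch using one file name from the question. The variable names (f, bad, good) are only illustrative.

```r
# One file name from the question
f <- "20210205_DML_3_4_D0_PM_flow__w1brightfield 100 - CAM_s3_t5.TIF"

# [brightfield] is a character class: it matches ONE character out of
# b, r, i, g, h, t, f, e, l, d, not the whole word
one_char <- regmatches(f, regexpr("[brightfield]", f))

# (brightfield|FITC) is an alternation: it matches either whole word
whole_word <- regmatches(f, regexpr("(brightfield|FITC)", f))

# The pattern from the question does not match, so every column is NA
bad <- strcapture(".*_(D[01])_*_([brightfield]|[FITC])_.*", f,
                  list(d = "", tw = ""))

# The corrected pattern captures both parts
good <- strcapture(".*_(D[01])_.*(brightfield|FITC).*", f,
                   list(d = "", tw = ""))
```

Note that the captured text is the lowercase brightfield, whereas the folder in the question is named Brightfield. On a case-sensitive file system these are different directories, so the captured value may need to be mapped to the folder name before building the target path.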

R move files with specific name pattern to folder in different subdirectories

I would like to copy files to a specific folder based on a certain part of their name. My folder structure and the location of the files are shown below. Both the D0 and D1 folders contain files named like 20210308_DML_D0_Temp_s1_t1.txt or 20210308_DML_D1_weather_s3_t6.txt, where D0/D1 is the folder the file sits in, Temp/weather says whether it is a temperature or a weather file, s1/s3 is the location, and t1/t6 is the timepoint. The first thing I want to do is loop over the txt files in both the D0 and D1 folders and move the files with Temp in their name to the temperature subfolder and the files with weather in their name to the weather subfolder, in both D0 and D1.
main Directory
|
|___ weather_day
        ├── D0
             ├── temperature
             │   └── weather
             |__ 20210308_DML_D0_Temp_s1_t1.txt
             |__ 20210308_DML_D1_weather_s3_t6.txt
        └── D1
             ├── temperature
             └── weather
             |__ 20210308_DML_D0_Temp_s1_t1.txt
             |__ 20210308_DML_D1_weather_s3_t6.txt
I tried to do it with a for loop such as:
wd = getwd() #set working directory to subfolder
pathway = paste0(wd,"/weather_day/")
for (i in pathway){
  file.copy(i,"temperature")
  file.copy(i,"weather")
}
In the end I want the txt files to sit in the subfolder that matches their name, temperature or weather, like this:
main Directory
|
|___ weather_day
        ├── D0
             ├── temperature
                  |__20210308_DML_D0_Temp_s1_t1.txt
             └── weather
                  |__ 20210308_DML_D0_weather_s3_t6.txt
        ├── D1
             ├── temperature
                  |__20210308_DML_D1_Temp_s1_t1.txt
             └── weather
                  |__20210308_DML_D1_weather_s3_t6.txt
However, it does not work for me. I think I have to use file.copy, but how can I use it to move a file based on a pattern in its name? And can I nest one for loop inside another to go over the folders D0 and D1 and then over the txt files in those folders?
You didn't provide very much information to go off of. If I understand what you're asking, this should work.
library(tidyverse)
# collect a list of files with their paths
collector = list.files(paste0(getwd(), "/weather_day"),
                       full.names = TRUE, # capture the file names along with the full path
                       recursive = TRUE)  # look in subfolders
# establish the new 'weather' path
weather = paste0(getwd(), "/weather/")
# establish the new 'temp' path
temp = paste0(getwd(), "/temp/")
collector = data.frame("files" = collector) %>% # original path
  mutate(files2 = ifelse(str_detect(str_extract(files, "([^\\/]+$)"),
                                    "weath"), # if weather, make a new path
                         paste0(weather, str_extract(files, "([^\\/]+$)")),
                         ifelse(str_detect(str_extract(files, "([^\\/]+$)"),
                                           # the file names use "Temp", so ignore case
                                           regex("temp", ignore_case = TRUE)), # if temp, make a new path
                                paste0(temp, str_extract(files, "([^\\/]+$)")),
                                files) # if not weather or temp, no change
                         ) # end if
         ) # end mutate
dir.create(weather) # create directories
dir.create(temp)
# move the files
file.rename(from = collector[,1],
            to = collector[,2])
# validate the change
list.files(weather) # see what's different
list.files(temp) # see what's different
Based on what @alexdegrote1995 added, how about this:
# collect a list of files with their paths
collector = list.files(paste0(getwd(), "/weather_day"),
                       full.names = TRUE, # capture the file names along with the full path
                       recursive = TRUE)  # look in subfolders
# establish the new 'weather' path
weather = paste0(getwd(), "/D0/weather/")
# establish the new 'temp' path
temp = paste0(getwd(), "/D0/temperature/")
collector = data.frame("files" = collector) %>%
  mutate(files2 = ifelse(str_detect(str_extract(files, "([^\\/]+$)"), "weath"),
                         paste0(weather, str_extract(files, "([^\\/]+$)")),
                         ifelse(str_detect(str_extract(files, "([^\\/]+$)"),
                                           # the file names use "Temp", so ignore case
                                           regex("temp", ignore_case = TRUE)),
                                paste0(temp, str_extract(files, "([^\\/]+$)")),
                                files)), # if not weather or temp, don't change
         filesD1 = gsub(pattern = "D0", # make a third column for the D1 folder
                        replacement = "D1",
                        x = files2)) # end mutate
file.rename(from = collector[,1], # move files to the D0 folder
            to = collector[,2])
file.copy(from = collector[,2], # add copy to the D1 folder
          to = collector[,3])
Edited to include more filenames, pre-conditions (no dir structure), and post-conditions. (Plus move instead of copy.)
files <- c("20210308_DML_D0_Temp_s1_t1.txt", "20210308_DML_D0_weather_s3_t6.txt",
"20210308_DML_D1_Temp_s1_t1.txt", "20210308_DML_D1_weather_s3_t6.txt")
# write some temp (empty) files for copying
for (f in files) writeLines(character(0), f)
parts <- strcapture(".*_(D[01])_([Tt]emp|[Ww]eather)_.*", files, list(d="", tw=""))
parts
# d tw
# 1 D0 Temp
# 2 D0 weather
# 3 D1 Temp
# 4 D1 weather
dirs <- do.call(file.path, parts[complete.cases(parts),])
dirs
# [1] "D0/Temp" "D0/weather" "D1/Temp" "D1/weather"
### pre-condition, only files, no dir-structure
list.files(".", pattern = "D[0-9]", full.names = TRUE, recursive = TRUE)
# [1] "./20210308_DML_D0_Temp_s1_t1.txt" "./20210308_DML_D0_weather_s3_t6.txt" "./20210308_DML_D1_Temp_s1_t1.txt"
# [4] "./20210308_DML_D1_weather_s3_t6.txt"
### create dirs, move files
Vectorize(dir.create)(unique(dirs), recursive = TRUE) # creates both D0 and D0/Temp, ...
# D0/Temp D0/weather D1/Temp D1/weather
# TRUE TRUE TRUE TRUE
file.rename(files, file.path(dirs, files))
# [1] TRUE TRUE TRUE TRUE
### post-condition, files in the correct locations
list.files(".", pattern = "D[0-9]", full.names = TRUE, recursive = TRUE)
# [1] "./D0/Temp/20210308_DML_D0_Temp_s1_t1.txt" "./D0/weather/20210308_DML_D0_weather_s3_t6.txt"
# [3] "./D1/Temp/20210308_DML_D1_Temp_s1_t1.txt" "./D1/weather/20210308_DML_D1_weather_s3_t6.txt"
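For the layout in the question, where the temperature and weather subfolders already exist inside D0 and D1, a vectorised base R sketch without nested loops might look like the following. It builds a throwaway copy of the layout under tempdir() so that it can be run safely. The demo files and the names root, day_dirs, target and new_paths are assumptions made for this example.

```r
# Build a throwaway copy of the layout from the question (hypothetical demo data)
root <- file.path(tempdir(), "weather_day")
for (d in c("D0", "D1")) {
  for (sub in c("temperature", "weather")) {
    dir.create(file.path(root, d, sub), recursive = TRUE, showWarnings = FALSE)
  }
  file.create(file.path(root, d, sprintf("20210308_DML_%s_Temp_s1_t1.txt", d)))
  file.create(file.path(root, d, sprintf("20210308_DML_%s_weather_s3_t6.txt", d)))
}

# txt files sitting directly in D0 and D1 (not yet in a subfolder)
day_dirs <- file.path(root, c("D0", "D1"))
files <- list.files(day_dirs, pattern = "\\.txt$", full.names = TRUE)

# Pick the target subfolder from the file name; NA means "leave the file alone"
target <- ifelse(grepl("_Temp_", basename(files)), "temperature",
          ifelse(grepl("_weather_", basename(files)), "weather", NA))

# Move each matching file into the subfolder next to it
keep <- !is.na(target)
new_paths <- file.path(dirname(files[keep]), target[keep], basename(files[keep]))
moved <- file.rename(files[keep], new_paths)
```

Because the target is built from dirname(files), each file stays inside its own D0 or D1 folder. Replacing file.rename with file.copy would copy instead of move.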

R Shiny not reading file path

Consider a shiny app with the following folder structure (github here):
├── data
│ └── import_finance.R
│ └── data.xlsx
├── dashboard.R
├── ui_elements
│ └── sidebar.R
│ └── body.R
The import_finance.R runs correctly when not run in the shiny app. However, when running the shiny app it fails to recognize the path.
# list of quarterly earnings worksheets
file_paths <- list.files('dashboard/data', pattern = 'xlsx', full.names = TRUE)
path <- file_paths[1]
# import all sheets from each workbook and merge to one df
import <-
  path %>%
  excel_sheets() %>%
  set_names() %>%
  map_df(xlsx_cells, path = path) %>%
  mutate(workbook = word(path, -1, sep = '/')) # add column with file_name
Shiny says Error: path does not exist: ‘NA’. Please note that import_finance.R is called in dashboard.R via source("data/import_finance.R").
Even when specifying the full path the error persists. This thread claims the same error:
> system.file("/Users/pblack/Documents/Git Projects/opensource-freetoplay-economics/dashboard/data/dashboard/data/ATVI 12-Quarter Financial Model Q1 CY20a.xlsx")
[1] ""
Any idea of the mistake I'm making? It's strange the script runs fine, but just not when running as a shiny app.
The error here was in
# list of quarterly earnings worksheets
file_paths <- list.files('dashboard/data', pattern = 'xlsx', full.names = TRUE)
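When the Shiny app runs, its working directory is the app folder (dashboard), which is also why source("data/import_finance.R") resolves. Relative to that folder, 'dashboard/data' does not exist, so list.files returns an empty vector and file_paths[1] is NA, hence the path does not exist: ‘NA’ error.
To fix it, give the path relative to the app folder: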
# list of quarterly earnings worksheets
file_paths <- list.files('data', pattern = 'xlsx', full.names = TRUE)
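The NA in the error message comes from indexing an empty result. The sketch below reproduces this in base R with a throwaway directory layout under tempdir(). The project folder and the empty data.xlsx placeholder are hypothetical stand-ins, and the setwd calls only imitate the working directory of an interactive session versus that of the running app.

```r
# Stand-in layout: <tmp>/project/dashboard/data/data.xlsx (hypothetical)
project <- file.path(tempdir(), "project")
dir.create(file.path(project, "dashboard", "data"),
           recursive = TRUE, showWarnings = FALSE)
file.create(file.path(project, "dashboard", "data", "data.xlsx"))

# e.g. an interactive session started at the project root
old <- setwd(project)
from_root <- list.files("dashboard/data", pattern = "xlsx", full.names = TRUE)

# what the app sees: working directory = app folder
setwd("dashboard")
from_app_bad  <- list.files("dashboard/data", pattern = "xlsx", full.names = TRUE)
from_app_good <- list.files("data", pattern = "xlsx", full.names = TRUE)
setwd(old)

length(from_app_bad)  # 0: the directory does not exist from here, but no error is raised
from_app_bad[1]       # NA: indexing an empty character vector
```

Printing getwd() at the top of the sourced script is a quick way to check which directory relative paths are resolved against.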

Reading and naming multiple .txt files in R

I want to read and name multiple .txt files in R. To be more clear (sample): I have 2 subfolders, each one with three .txt files (they have the same name). Subfolder 'test' has 3 .txt files with names 'alpha.txt','bita.txt','gamma.txt' and subfolder 'train' has 3 .txt files with names 'alpha.txt','bita.txt','gamma.txt'. I am using the following code:
files <- dir(recursive=TRUE,pattern ='\\.txt$')
List <- lapply(files,read.table,fill=TRUE)
which gives a List with 6 elements, each one a data frame. I know that the first element is 'alpha' from the test folder, the second element is 'bita' from the test folder, and so on. But since there are more files in practice, I would like to read the data so that the environment contains the variables 'test_alpha','test_bita','test_gamma','train_alpha','train_bita','train_gamma'. Is there a way to do it?
I created two folders in my working directory, /train and /test. We create two matrices and write one to each folder.
m1 <- matrix(rnorm(9), 3, 3)
m2 <- matrix(runif(12), 4, 3)
# write() expects an atomic vector or matrix and writes 5 values per line by default
write(m1, './test/alpha.txt')
write(m2, './train/alpha.txt')
We run your code:
files <- dir(recursive=TRUE,pattern ='\\.txt$')
List <- lapply(files,read.table,fill=TRUE)
files
[1] "test/alpha.txt" "train/alpha.txt"
It works to isolate the files we need. Next we take out the forward slash and file extension.
newnames <- gsub('/', '_', files)
newnames1 <- gsub('\\.txt', '', newnames)
newnames1
[1] "test_alpha" "train_alpha"
This vector can now be assigned to names(List) to name each data frame.
names(List) <- newnames1
List
$test_alpha
          V1          V2         V3         V4        V5
1 -0.6594299 -0.01881557  0.7076588 -0.7096888 0.3629274
2 -1.4401000  1.59659000 -1.9041430  2.3079960        NA

$train_alpha
         V1        V2        V3        V4        V5
1 0.9307107 0.6257928 0.6903179 0.5143920 0.6798936
2 0.3652738 0.9297527 0.1902556 0.7243708 0.4541548
3 0.5565041 0.5276907        NA        NA        NA
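If separate variables are really wanted rather than a named list, base R's list2env can create one variable per list element. The sketch below uses a toy stand-in for List, not the real files, and a fresh environment called env so the demo does not clutter the workspace. Both are assumptions made for illustration.

```r
# Toy stand-in for the named list built above (not the real files)
List <- list(
  test_alpha  = data.frame(V1 = 1:2),
  train_alpha = data.frame(V1 = 3:5)
)

# Create one variable per list element in a fresh environment;
# use envir = globalenv() to place them in the workspace instead
env <- new.env()
list2env(List, envir = env)

ls(env)                 # names of the variables that were created
nrow(env$train_alpha)   # each variable is the corresponding data frame
```

Keeping the data in the named list is often more convenient, since List$test_alpha works directly and lapply can still process all the elements at once.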
