There is a file named haha in C:\test, and haha contains the text "look for me". In Linux, I can search for that text to get the filename:
grep -rl "look for me" /
Can I search for the file with some kind of R command on Windows XP?
If I don't know that the file containing "look for me" is named haha, how can I find it?
or with plyr:
library(plyr) # uses plyr
textFiles <- list.files(pattern = "\\.txt$") # only looks at .txt files; change or omit the pattern as needed
# alply reads each file and returns
# a list of filenames which pass the grep test,
# each with the first matching line number
mylist <- alply(textFiles,
  1,
  function(f) {
    fline <- grep("look for me", readLines(f), fixed = TRUE)
    # grep returns integer(0) when nothing matches, so test the length
    if (length(fline) > 0) paste(f, fline[1], sep = " - line:") else NULL
  })
Filter(is.character, mylist) # gives you a list of all files containing the term
This code will find a filename containing the phrase 'haha', and then check whether the string "look for me" occurs anywhere within that file. Is that what you want?
whichfile <- grep(
x = list.files(),
pattern = "haha",
value = TRUE
)
sum(
grepl(
x = readLines(whichfile),
pattern = 'look for me')
)
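If the file name is not known at all, one possible approach is to scan every file under a directory and keep those whose contents contain the text. The following is only a minimal base-R sketch: the helper name find_files_containing and the temporary demo directory are made up for illustration, and reading every file this way may be slow on large directory trees.

```r
# Hypothetical helper: return the files under `dir` whose contents contain `text`.
find_files_containing <- function(text, dir = ".") {
  files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  hits <- vapply(files, function(f) {
    # Unreadable files are treated as having no lines
    lines <- tryCatch(readLines(f, warn = FALSE),
                      error = function(e) character(0))
    # fixed = TRUE searches for the literal string, not a regular expression
    any(grepl(text, lines, fixed = TRUE, useBytes = TRUE))
  }, logical(1))
  files[hits]
}

# Self-contained demo in a temporary directory (assumed layout)
demo_dir <- file.path(tempdir(), "find_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("first line", "look for me"), file.path(demo_dir, "haha"))
writeLines("nothing here", file.path(demo_dir, "other"))

found <- find_files_containing("look for me", demo_dir)
basename(found)
```

On Windows the same call would take something like "C:/test" as the dir argument.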
I can create a list of csv files in folder_A:
list1 <- dir_ls("path to folder_A")
I can define a function to add a column with filenames and combine these files into one dataframe:
read_and_save_combo <- function(fileX){
read_csv(fileX) %>%
mutate(fileX = path_file(fileX))}
combo_df <- map_df(list1, read_and_save_combo)
I want to add another column with the name of the enclosing folder (it would be the same for all files: folder_A). If I use dirname() on an individual file, I get the full parent directory path to folder_A, but I only want the characters "folder_A". If I use dirname() as part of the function, I get another column, but it's filled with ".". Less importantly, I don't know why I get "." instead of the full path. More importantly, is there a function like path_parentfoldername that would let me add a new column containing only the name of the folder that holds each file, for every row of the combined dataframe?
Thanks!
Edit:
New function for clarity after answers:
read_and_save_combo <- function(fileX){
read_csv(fileX) %>%
mutate(filename = path_file(fileX), foldername = dirname(fileX) %>%
str_replace(pattern = ".*/", replacement = ""))}
This works because . matches any character and * repeats it zero or more times, so ".*/" greedily matches everything up to and including the last /. Gregor said this but now I understand it.
Also, I was getting the column filled with ".", because in the function, I was reading one file, but then trying to mutate with dirname operating on the list, which is a vector with multiple elements (more than one file).
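For other beginners: one way a column of "." can arise (offered as a possibility, not as a diagnosis of the code above) is that dirname() returns "." whenever the path it is given has no directory component, for example a bare file name. A small sketch with made-up paths:

```r
# A bare file name has no directory part, so dirname() returns "."
bare <- dirname("file1.csv")

# With a directory component, dirname() returns the parent path ...
parent <- dirname("path/to/folder_A/file1.csv")

# ... and basename() of that keeps only the last folder name
folder <- basename(parent)

c(bare, parent, folder)
# [1] "."                "path/to/folder_A" "folder_A"
```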
You can use dirname + basename :
list1 <- list.files('folder_A_path', full.names = TRUE)
read_and_save_combo <- function(fileX) {
readr::read_csv(fileX) %>%
dplyr::mutate(fileX = basename(dirname(fileX)))
}
combo_df <- purrr::map_df(list1, read_and_save_combo)
If your file is at the path 'Users/Downloads/FolderA/Filename.csv' :
dirname('Users/Downloads/FolderA/Filename.csv')
#[1] "Users/Downloads/FolderA"
basename(dirname('Users/Downloads/FolderA/Filename.csv'))
#[1] "FolderA"
"path to folder_A" is a bad example, use "path/to/folder_A". You need to delete everything from the start through the last /:
library(stringr)
str_replace("path/to/folder_A", pattern = ".*/", replacement = "")
# [1] "folder_A"
If you're worried about \\ or other non-standard things, use dirname() as the input.
Here are two ways to do what I wanted, using the helpful answers above:
read_and_save_combo <- function(file){
read_csv(file) %>%
mutate(filename = path_file(file), foldername = basename(dirname(file)))}
read_and_save_combo <- function(file){
read_csv(file) %>%
mutate(filename = path_file(file), foldername = dirname(file) %>%
str_replace(pattern = ".*/", replacement = ""))}
Other basic things I learned that could be helpful for other beginners:
(1) While writing the function, point all the functions (read_csv(), dirname(), etc.) at a uniform variable (here written as "file" but it could be just a letter "g" or whatever you choose). Then you will avoid the problem I had where part of the function is acting on one file and another part is acting on a list.
(2) filex and fileX appear far too similar to each other in certain fonts, which can mess you up (capitalization).
After running my R script in the terminal I get two output data files: a.dat and b.dat. My goal is to directly divert these output files into a new folder.
Is there any way to do something like this:
Rscript myscript.R > folder
Note: For writing the output file I simply use this:
write(t(result1), file = "a.dat", ncolumns = 5, append=TRUE)
I solved my problem by doing the following:
I created an output folder 'output'
I added the full path of the output in myscript.R as
write(t(result1), file = "home/Documents/output/a.dat", ncolumns = 5, append=TRUE)
Solved! :)
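A variation on this, sketched under assumptions: the output folder can be created from inside the script and the file name assembled with file.path(), so the path is written in one place only. The folder location (a temporary directory here) and the stand-in result1 matrix are placeholders, not the original data.

```r
# Placeholder output location; in a real script this might be a fixed folder
out_dir <- file.path(tempdir(), "output")

# Create the folder if it does not exist yet
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Stand-in for the real result
result1 <- matrix(1:10, nrow = 2)

# Same write() call as above, but pointed at the output folder
out_file <- file.path(out_dir, "a.dat")
write(t(result1), file = out_file, ncolumns = 5, append = TRUE)

file.exists(out_file)
```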
You could simply use write.table to create two csv files, like this:
A minimal working example:
This uses an R script called "Rfile.r" in the directory "adir" in my "Dokumente" folder. The script reads its first two command-line arguments: a numeric value that is passed to the function, and a character string giving the target directory for the output. (You could also pass file names, etc., of course.)
Rfile.r ::
# Read the command-line arguments supplied in the terminal:
# one numeric value and one target directory
arg <- commandArgs(trailingOnly = TRUE)
n <- as.numeric(arg[1])
path <- as.character(arg[2])
## An arbitrary function to create the data for two csv files
fun <- function(n) {
  data.a <- data.frame(rep("Some Data", n))
  data.b <- data.frame(rnorm(n))
  data <- list(data.a, data.b)
  return(data)
}
# create data using input arg[1], aka 'n'
data <- fun(n)
# now the important part: using write.table with arg[2], aka 'path'
write.table(data[[1]], file = paste(path, "/data_a.csv", sep = ""))
write.table(data[[2]], file = paste(path, "/data_b.csv", sep = ""))
## write a terminal output message using cat()
cat(paste("Your input was :", arg[1], sep = "\t"),
    paste("your target path was:", arg[2], sep = "\t"), sep = "\n")
Then run in a terminal:
$ Rscript ~/Dokumente/adir/Rfile.r 3 ~/Dokumente/bdir
This creates two csv files, "data_a.csv" and "data_b.csv", in the directory "bdir" (which must already exist), where 3 is the numeric input for the function in Rfile.r.
I am looking for an elegant way to insert a character string (a name) into a file path and create a .csv file. I found one possible solution, but I am looking for another that works by "inserting" text between specific characters rather than "replacing" it.
#lets start
df <-data.frame()
name <- c("John Johnson")
dir <- c("C:/Users/uzytkownik/Desktop/.csv")
#how to insert "name" vector between "Desktop/" and "." to get:
dir <- c("C:/Users/uzytkownik/Desktop/John Johnson.csv")
write.csv(df, file=dir)
#???
#I found the answer but it is not very elegant in my opinion
library(qdapRegex)
dir2 <- c("C:/Users/uzytkownik/Desktop/ab.csv")
dir2<-rm_between(dir2,'a','b', replacement = name)
> dir2
[1] "C:/Users/uzytkownik/Desktop/John Johnson.csv"
write.csv(df, file=dir2)
I like sprintf syntax for "fill-in-the-blank" style string construction:
name <- c("John Johnson")
sprintf("C:/Users/uzytkownik/Desktop/%s.csv", name)
# [1] "C:/Users/uzytkownik/Desktop/John Johnson.csv"
Another option, if you can't put the %s in the directory string, is to use sub. This is replacing, but it replaces .csv with <name>.csv.
dir <- c("C:/Users/uzytkownik/Desktop/.csv")
sub(".csv", paste0(name, ".csv"), dir, fixed = TRUE)
# [1] "C:/Users/uzytkownik/Desktop/John Johnson.csv"
This should get you what you need.
dir <- "C:/Users/uzytkownik/Desktop/.csv"
name <- "joe depp"
dirsplit <- strsplit(dir,"\\/\\.")
paste0(dirsplit[[1]][1],"/",name,".",dirsplit[[1]][2])
[1] "C:/Users/uzytkownik/Desktop/joe depp.csv"
I find that paste0() is the way to go, so long as you store your directory and extension separately:
path <- "some/path/"
file <- "file"
ext <- ".csv"
write.csv(myobj, file = paste0(path, file, ext))
For those unfamiliar, paste0() is shorthand for paste( , sep="").
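As a related sketch, base R's file.path() joins path components with "/" so the directory does not need to carry its own trailing slash; paste0() is still used to glue the file name and extension together. The values are the same placeholders as above, with the trailing slash dropped from the directory.

```r
path <- "some/path"   # no trailing slash needed with file.path()
file <- "file"
ext  <- ".csv"

# file.path() inserts the separator between directory and file name
full <- file.path(path, paste0(file, ext))
full
# [1] "some/path/file.csv"
```

The result can be passed straight to write.csv(myobj, file = full).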
Let's suppose you have a vector with the desired names for some data structures you want to save, for instance:
names <- c("file_1", "file_2", "file_3")
Now you want to build the path where each file will be saved by adding the name plus the extension:
path <- "/Users/Documents/Test_Folder/"
extension <- ".csv"
A simple way to achieve this is to use paste0() to create the full path as input for write.csv() inside an lapply(), as follows (paste() with its default separator would insert spaces into the path):
lapply(names, function(x) {
  write.csv(x = data,
            file = paste0(path, x, extension))
}
)
The good thing about this approach is that you can iterate over the vector containing the file names and the final path is updated automatically. One possible extension is to define a vector of extensions and update the path accordingly.
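The extension idea could look roughly like the sketch below. It assumes a second vector of extensions of the same length as the names, a small stand-in data frame, and a temporary output folder; none of these come from the original example.

```r
# Assumed inputs: one extension per file name
file_names <- c("file_1", "file_2", "file_3")
extensions <- c(".csv", ".csv", ".txt")

# Temporary output folder so the sketch can run anywhere
out_dir <- file.path(tempdir(), "Test_Folder")
dir.create(out_dir, showWarnings = FALSE)

# Stand-in for the data to be saved
data <- data.frame(x = 1:3, y = c("a", "b", "c"))

# paste0() is vectorised, so names and extensions are combined pairwise
paths <- file.path(out_dir, paste0(file_names, extensions))

# Write the same data to every path
invisible(lapply(paths, function(p) write.csv(data, file = p, row.names = FALSE)))

basename(paths)
# [1] "file_1.csv" "file_2.csv" "file_3.txt"
```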
Let's say I have a directory with .txt files in it like this (note that every file has the same kind of content but a different file name):
dir('tstdir')
[1]"file1_err1.txt"
[2]"file2_ree1.txt"
[3]"file_test.txt"
To go through this directory I use a for loop (simplified for readability):
for (i in dir('tstdir')) {
  tst <- read.table(paste('tstdir/', i, sep=''), stringsAsFactors=F)
  # DO SOME MODIFICATION (randomizing the data)
  write.table(tst, file = paste('tst', i, sep=''))
}
So I want to do something to each txt file and then write it back to a text file named after the loaded file plus the name of the data frame. (I know how to randomize the data, but that is not needed for the example.)
I know I am doing something wrong with the renaming of the data and with putting i in the correct place. I thought about an if statement but want to see if this can be done without one. Unfortunately I have had no success; any help or hints are appreciated.
Is something like this what you want to do? Without an example of the randomising it's hard to say exactly this will work, but it should do...
f <- list.files( 'tstdir' , pattern = "\\.txt$" , full.names = TRUE )
require( R.utils ) # for getAbsolutePath()
lapply( seq_along(f) , function(x){
  dat <- read.table( f[x] , stringsAsFactors = F )
  # randomise dat code here
  write.table( dat , file = getAbsolutePath(f[x]) )
}
)
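If the modified data should go to new files named after the originals rather than overwrite them, a base-R sketch along the following lines may be closer to what the question describes. Everything here is an assumption for illustration: the temporary input and output folders, the "tst_" prefix, and row shuffling as a stand-in for the real randomisation.

```r
# Set up a throwaway input folder with two small files (assumed data)
in_dir  <- file.path(tempdir(), "tstdir")
out_dir <- file.path(tempdir(), "tstdir_out")
dir.create(in_dir, showWarnings = FALSE)
dir.create(out_dir, showWarnings = FALSE)
write.table(data.frame(a = 1:3, b = 4:6), file.path(in_dir, "file1_err1.txt"))
write.table(data.frame(a = 7:9, b = 1:3), file.path(in_dir, "file_test.txt"))

for (f in list.files(in_dir, pattern = "\\.txt$")) {
  tst <- read.table(file.path(in_dir, f), stringsAsFactors = FALSE)
  # Stand-in modification: shuffle the rows
  tst <- tst[sample(nrow(tst)), , drop = FALSE]
  # New file name = prefix + original file name
  write.table(tst, file = file.path(out_dir, paste0("tst_", f)))
}

written <- sort(list.files(out_dir))
written
```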
In the shell, make a directory:
mkdir /home/test
Then create a file named ".test" in "/home/test" and list the directory from R:
a=list.files(path = "/home/test",include.dirs = FALSE)
a
character(0)
a=list.files(path = "/home/test",include.dirs = TRUE)
a
character(0)
a=list.files(path = "/home/test/",include.dirs = TRUE)
a
character(0)
list.files(path = '/home/test', all.files=TRUE,include.dirs=FALSE)
[1] "." ".." ".test"
a=list.files(path = '/home/test', all.files=TRUE)
length(a)
[1] 3
How can I get length(a) = 1 by using the pattern= (regular expression) argument of list.files to prune . and ..?
Use all.files=TRUE to show all file names including hidden files.
list.files(path = '/home/test', all.files=TRUE)
To answer your edit, one way would be to use a negative number with tail
tail(list.files(path = '/home/test', all.files=TRUE), -2)
Using only the pattern argument:
list.files(path='/home/test', all.files=TRUE, pattern="^[^\\.]|\\.[^\\.]")
The pattern says "anything that starts with something other than a dot or anything that starts with a dot followed by anything other than a dot."
Although it breaks your requirement to use the pattern argument of list.files, I would probably wrap grep around list.files in this case.
grep("^\\.*\\.$", list.files(path='/home/test', all.files=TRUE),
invert=TRUE, value=TRUE)
The above will find any file names that only contain dots, then return everything else. invert=TRUE means "find the names that do not match", and value=TRUE means "return the names instead of their location."
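For completeness, and stepping outside the pattern-only requirement: recent versions of R (3.0.0 and later, as far as I know) give list.files a no.. argument that drops "." and ".." directly. A small sketch, using a temporary directory in place of /home/test:

```r
# Temporary stand-in for /home/test containing one hidden file
test_dir <- file.path(tempdir(), "hidden_demo")
dir.create(test_dir, showWarnings = FALSE)
invisible(file.create(file.path(test_dir, ".test")))

# all.files = TRUE includes hidden files; no.. = TRUE excludes "." and ".."
a <- list.files(path = test_dir, all.files = TRUE, no.. = TRUE)
length(a)
# [1] 1
```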