Let's say I have a directory with product inventories that are saved per day:
$ ls *.csv
2014_01_01.csv
2014_01_02.csv
...
Is there a glob pattern that will only grab the newest file? Or do I need to chain it with other commands? Basically I'm just looking to do what would amount to a LIMIT 1 based on the filename sort.
Assuming your shell is bash, ksh93 or zsh, and your files have the same naming convention as the example in your question:
files=( *.csv )
printf "The newest file is %s\n" "${files[-1]}"
Since the date in the filenames is in a format that naturally sorts, and the shell expands a glob in sorted order, storing all of the names in an array and taking the last element gives you the newest one (and, conversely, the first element is the oldest one).
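If your shell has no arrays (plain POSIX sh, for instance), the same idea can be sketched with the positional parameters. This is only an illustration: the scratch directory and the three sample filenames are made up for the demo, and it assumes at least one *.csv file exists.

```shell
# Work in a scratch directory so no real files are touched (demo data only).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch 2014_01_01.csv 2014_01_02.csv 2014_01_03.csv

# The glob expands in sorted order; load the names into "$@".
set -- *.csv
oldest=$1

# Walk the list; after the loop, newest holds the last (highest-sorting) name.
for f in "$@"; do
  newest=$f
done

printf 'oldest: %s\nnewest: %s\n' "$oldest" "$newest"
```

The loop is a deliberately simple way to reach the last argument without relying on array syntax.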
What is the difference between these commands:
find . -type f -name '*txt*'
and
find . -type f | grep 'txt'
I ran both and the output differs, but I want to know why.
The major difference is that find searches for files and directories using filters, while grep searches for a pattern inside files or other text input, such as a process listing.
find is a command for searching for files and folders using filters such as size, access time, and modification time.
The find command lists all of the files within a directory and its sub-directories that match a set of filters. This command is most commonly used to find all of the files that have a certain name.
To find all of the files named theFile.txt in your current directory and all of its sub-directories, enter:
find . -name theFile.txt -print
To look in your current directory and its sub-directories for all of the files that end in the extension .txt, enter:
find . -name "*.txt" -print
grep (Globally search a Regular Expression and Print):
Searches files for a specified string or expression.
grep searches for lines containing a specified pattern and, by default, writes them to the standard output.
grep myText theFile.txt
Result: grep will print each line containing the string myText.
In your first example, you are using the find utility to list the filenames of regular files where the filename includes the string txt.
In your second example, you are using the find utility to list the filenames of regular files and feeding the resultant filenames via a pipe to the grep utility which searches the contents of the pipe (a list of filenames, one per line) for the string txt. Each time this string is found, the corresponding line (which is a filename) is outputted.
When you have a path with txt in the directory name, the second command will find a match. When you do not want to match paths like txtfiles/allfiles.tgz and transactions/txtelevisions/bigscreen.jpg you will want to use the first.
The difference between the two is that in the first case, find is looking for files whose name (just name) matches the pattern.
In the second case, find is looking for all files of type 'f' and outputting their relative paths as strings. That result gets piped to grep, which filters the input strings to those matching the pattern. The pattern 'txt' will filter the filepath results for the pattern. Importantly, the second case will include filepaths that match anywhere in the path, not just in the filename. The first case will not do that.
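A small experiment makes the difference visible. This is just a sketch: the scratch directory and the two sample files (notes.txt and txtfiles/allfiles.tgz) are invented for the demo.

```shell
# Build a tiny tree: one file with txt in its name, one with txt only in its directory.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/txtfiles"
touch "$tmpdir/notes.txt" "$tmpdir/txtfiles/allfiles.tgz"

# -name tests only the last path component (the filename itself).
by_name=$(cd "$tmpdir" && find . -type f -name '*txt*' | sort)

# grep sees each whole path as a line of text, so a directory name can match too.
by_path=$(cd "$tmpdir" && find . -type f | grep 'txt' | sort)

printf 'find -name:\n%s\n\nfind | grep:\n%s\n' "$by_name" "$by_path"
```

Here the first command reports only ./notes.txt, while the second also reports ./txtfiles/allfiles.tgz because txt appears in its directory name.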
The first command will display files having txt in their name.
Whereas the second command will display files having txt anywhere in their path, because grep is filtering the list of paths printed by find, not the contents of the files.
I know I can search for a string with:
grep -n -d recurse 'snoopy' *
and then it shows every file name and instance that contains that string, like:
file/name.txt:23 some snoopy here
file/name2.txt:59 another snoopy there
file/name2.txt:343 some more snoopy
etc...
The problem is that with many occurrences, the list is huge. How do I make it show only the actual file names that contain the string, without duplicates and without the occurrence?
Only like:
file/name1.txt
file/name52.txt
file/name28293.txt
Thanks a lot for any help :)
The -l flag (or, in both BSD and GNU grep, --files-with-matches) does what you want.
From the POSIX spec:
Write only the names of files containing selected lines to standard output. Pathnames shall be written once per file searched. If the standard input is searched, a pathname of "(standard input)" shall be written, in the POSIX locale. In other locales, "standard input" may be replaced by something more appropriate in those locales.
Both BSD and GNU also explicitly guarantee that this will be more efficient. (Older BSD versions say "… grep will only search a file until a match has been found, making searches potentially less expensive", newer BSD and GNU say "The scanning will stop on the first match".) If you don't know which grep you have and which options it has, just type man grep at the shell and you should get the manpage.
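As a quick check of -l, here is a sketch using two throwaway files (a.txt and b.txt are made up for the demo). Only the file that contains the string is listed, and it is listed once even though it matches on two lines.

```shell
# Two sample files: one with two matching lines, one with none.
tmpdir=$(mktemp -d)
printf 'some snoopy here\nmore snoopy\n' > "$tmpdir/a.txt"
printf 'nothing to see\n' > "$tmpdir/b.txt"

# -l prints each matching file name once instead of every matching line.
matches=$(cd "$tmpdir" && grep -l 'snoopy' ./*.txt)

printf '%s\n' "$matches"
```

The same flag combines with the recursive search from the question, for example grep -l -d recurse 'snoopy' * or, on GNU and BSD grep, grep -rl 'snoopy' .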
I have the following line in a bash script:
find . -name "paramsFile.*" | xargs -n131072 cat > parameters.txt
I need to make sure the order the files are concatenated in does not change when I use this command. For example, if I run this command twice on the same set of paramsFile.*, parameters.txt should be the same both times. My question is, is this the case? And if it isn't, how can I make sure it is?
Thanks!
Edit: the same question goes for xargs: would that change how the files are fed to cat?
Edit2: as William Pursell pointed out, this question is actually about find. Does find always return files in the same order?
From the description in man cat:
The cat utility reads files sequentially, writing them to the standard output. The file operands are processed in command-line order. If file is a single dash (`-') or absent, cat reads from the standard input. If file is a UNIX domain socket, cat connects to it and then reads it until EOF. This complements the UNIX domain binding capability available in inetd(8).
So yes, as long as you pass the files to cat in the same order every time, you'll be OK.
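As for find itself: it makes no promise about output order (it generally reports entries in whatever order the filesystem returns them), so one way to pin the order down is to sort the list before it reaches xargs. A minimal sketch, assuming filenames without whitespace or newlines; the two sample files are made up for the demo.

```shell
# Create the sample files in "reverse" order on purpose.
tmpdir=$(mktemp -d)
printf 'b\n' > "$tmpdir/paramsFile.2"
printf 'a\n' > "$tmpdir/paramsFile.1"

# LC_ALL=C makes sort's ordering independent of the user's locale.
(
  cd "$tmpdir" || exit 1
  find . -name 'paramsFile.*' | LC_ALL=C sort | xargs cat > parameters.txt
)

result=$(cat "$tmpdir/parameters.txt")
printf '%s\n' "$result"
```

If the names may contain spaces, GNU find, sort and xargs offer -print0, -z and -0 respectively for NUL-separated lists; check your man pages, since those options are extensions rather than POSIX.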
I have a folder with materials for university study, sorted by semesters:
$ ls University
semester1 semester2 semester3 semester4
I'm trying to make one of them a named directory, and I want zsh to always point to the directory ending with the highest number (so I don't have to update my directory shortcut every semester).
So far I found only the zsh expansion <->:
$ ls semester<->
semester1 semester2 semester3 semester4
but I cannot find a way to extract only the last dirname from that.
Any idea how I should proceed or what I should change?
latestSemester=$(ls -d semester<-> | tail -1)
echo $latestSemester
Actually, this also works:
latestSemester=$(ls -d semester<->([-1]))
The -d makes ls print the directory names themselves rather than their contents.
EDIT: Fixed the second line, whose first version missed brackets.
From the zsh manual:
[beg[,end]]
specifies which of the matched filenames should be included in the returned list. The syntax is the same as for array subscripts. beg and the optional end may be mathematical expressions. As in parameter subscripting they may be negative to make them count from the last match backward. E.g.: ‘*(-OL[1,3])’ gives a list of the names of the three largest files.
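For comparison, outside zsh (where neither <-> nor glob qualifiers exist) the same selection can be sketched with a plain loop that compares the numeric suffixes. This is an illustration only: the scratch directory and the semester10 entry are made up, the latter to show that the comparison is numeric rather than alphabetical.

```shell
# Sample layout; semester10 would sort before semester2 alphabetically.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/semester1" "$tmpdir/semester2" "$tmpdir/semester10"

latest=
max=-1
for d in "$tmpdir"/semester*/; do
  d=${d%/}                      # drop the trailing slash added by the glob
  n=${d##*/semester}            # keep only what follows "semester"
  case $n in
    ''|*[!0-9]*) continue ;;    # skip entries whose suffix is not a plain number
  esac
  if [ "$n" -gt "$max" ]; then
    max=$n
    latest=${d##*/}
  fi
done

printf 'latest: %s\n' "$latest"
```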
Have two folders with approx. 150 java property files.
In a shell script, how can I compare both folders to see whether there is any new property file in either of them, and what the differences between the property files are?
The output should be in a report format.
To get a summary of new/missing files, and which files differ:
diff -arq folder1 folder2
-a treats all files as text, -r recursively searches subdirectories, and -q reports briefly: only whether files differ, not the differences themselves.
diff -r will do this, telling you both if any files have been added or deleted, and what's changed in the files that have been modified.
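To see what such a report looks like, here is a sketch with two tiny sample folders (the property file names and contents are invented for the demo). diff exits with status 1 when it finds differences, which a script can use to decide whether a report is needed at all.

```shell
# Two sample folders: one identical file, one changed file, one file only in folder2.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/folder1" "$tmpdir/folder2"
printf 'a=1\n' > "$tmpdir/folder1/same.properties"
printf 'a=1\n' > "$tmpdir/folder2/same.properties"
printf 'b=1\n' > "$tmpdir/folder1/changed.properties"
printf 'b=2\n' > "$tmpdir/folder2/changed.properties"
printf 'c=1\n' > "$tmpdir/folder2/new.properties"

# Capture the brief report; remember the exit status (0 = identical, 1 = differences).
status=0
report=$(cd "$tmpdir" && diff -rq folder1 folder2) || status=$?

printf 'exit status: %s\n%s\n' "$status" "$report"
```

Dropping -q (plain diff -r) adds the line-by-line changes for each file that differs, which can be redirected to a file to serve as the report.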
I used
diff -rqyl folder1 folder2 --exclude=node_modules
in my nodejs apps.
Could you use dircmp?
The diff command in Unix is used to find the differences between files of all types. Since a directory is also a type of file, the differences between two directories can easily be figured out with diff. For more options, run man diff on your Unix box.
-b Ignores trailing blanks (spaces and tabs)
and treats other strings of blanks as
equivalent.
-i Ignores the case of letters. For example,
`A' will compare equal to `a'.
-t Expands <TAB> characters in output lines.
Normal or -c output adds character(s) to the
front of each line that may adversely affect
the indentation of the original source lines
and make the output lines difficult to
interpret. This option will preserve the
original source's indentation.
-w Ignores all blanks (<SPACE> and <TAB>
characters) and treats all other strings of
blanks as equivalent. For example,
`if ( a == b )' will compare equal to
`if(a==b)'.
and there are many more.