To replace the first character of the last line of a unix file with the file name - unix

We need a shell script that retrieves all txt files in the current directory and for each file checks if it is an empty file or contains any data in it (which I believe can be done with wc command).
If it is empty then ignore it else since in our condition, all txt files in this directory will either be empty or contain huge data wherein the last line of the file will be like this:
Z|11|21||||||||||
That is the last line has the character Z then | then an integer then | then an integer then certain numbers of | symbols.
If the file is not empty, then we just assume it to have this format. Data before the last line are garbled and not necessary for us but there will be at least one line before the last line, i.e. there will be at least two lines guaranteed if the file is non-empty.
We need a code wherein, if the file is non-empty, then it takes the file, replaces the 'Z' in the last line with 'filename.txt' and writes the new data into another file say tempfile. The last line will thus become as:
filename.txt|11|21|||||||
Remaining part of the line remains same. From the tempfile, the last line, i.e., filename.txt|int|int||||| is taken out and merged into a finalfile. The contents of tempfile is cleared to receive data from next filename.txt in the same directory. finalfile has the edited version of the last lines of all non-empty txt files in that directory.
Eg: file1.txt has data as
....
....
....
Z|1|1|||||
and file2.txt has data as
....
....
....
Z|2|34|||||
After running the script, new data of file1.txt becomes
.....
.....
.....
file1.txt|1|1||||||
This will be written into a new file say temp.txt which is initially empty. From there the last line is merged into a file final.txt. So, the data in final.txt is:
file1.txt|1|1||||||
After this merging, the data in temp.txt is cleared
New data of file2.txt becomes
...
...
...
file2.txt|2|34||||||
This will be written into the same file temp.txt. From there the last line is merged into the same file final.txt.
So, the data in final.txt is
file1.txt|1|1||||||
file2.txt|2|34||||||
After considering N number of files that was returned to be as of type txt and non-empty and within the same directory, the data in final.txt becomes
file1.txt|1|1||||||
file2.txt|2|34||||||
file3.txt|8|3||||||
.......
.......
.......
fileN.txt|22|3|||||
For some of the conditions, I already know the command, like
For finding files in a directory of type text,
find <directory> -type f -name "*.txt"
For taking the last line and merging it into another file
tail -1 file.txt>>destination.txt

You can use 'sed' to replace the "z" character. You'll be in a loop, so you can use the filename that you have in that. This just removes the Z, and then echos the line and filename.
Good luck.
#!/bin/bash
filename=test.txt
line=`tail -1 $filename | sed "s/Z/$filename/"`
echo $line
Edit:
Did you run your find command first, and see the output? It has of course a ./ at the start of each line. That will break sed, since sed uses / as a delimiter. It also will not work with your problem statement, which does not have an extra "/" before the filename. You said current directory, and the command you give will traverse ALL subdirectories. Try being simple and using LS.
# `2>/dev/null` puts stderr to null, instead of writing to screen. this stops
# us getting the "no files found" (error) and thinking it's a file!
for filename in `ls *.txt 2>/dev/null` ; do
... stuff ...
done

Related

Is there a way to make Unix diff -r compare only differences in filenames, but not check if any single file actually differs?

I need to compare two large directories with a lot of files in them. I tried using:
diff -r Directory1 Directory2
but the process is really slow due to the amount of files and their huge size.
So I thought about making the process faster by just comparing the content of the folders and not the actual content of the files.
Is there a way to make diff recursively check only if every subdirectory of Directory1 and Directory2 match in name and file content, but not check if every single file in Directory1 actually matches every single file in Directory2?
For example, let's say I have Directory1/SubDirectory1 and Directory2/Subdirectory1.
I want to check only if Directory1/SubDirectory1.1 and Directory2/Subdirectory2.1 have the same number of files with the same filenames (let's say, file1, file2, ... fileN), but I don't care about matching every file1, file2 ... fileN of Directory1/SubDirectory1.1 to every file1, file2 ... fileN of SubDirectory2.1 to see if their content is actually the same.
Is there a way of doing this?
Edit:
I tried using:
diff <(path1) <(path2)
but unfortunately, diff outputs the full path for each file. The output I get is thus:
< /Volume1/.../.../Directory1/SubDirectory1.1/file1
< /Volume1/.../.../Directory1/SubDirectory1.1/file2
...
> /Volume2/.../.../Directory2/SubDirectory2.1/file1
> /Volume2/.../.../Directory2/SubDirectory2.1/file2
...
Here every single filename clearly differs, because the full paths differ.
Is there a way to force find to output paths only starting from the directory you give as argument? For example:
find -(some option I'm not aware of) /Volume1/.../.../Directory1
outputs:
/Directory1/SubDirectory1.1/file1
/Directory1/SubDirectory1.1/file2
...
A simple way:
cd /.../Directory1
find . | sort >/tmp/dir1.lst
cd /.../Directory2
find . | sort >/tmp/dir2.lst
diff /tmp/dir1.lst /tmp/dir2.lst
It will fail if your filenames contain newlines, but in many cases that isn't a concern.
If scripting this, make sure to use auto-generated temp file names, e.g. with mktemp(1), to avoid symlink attacks and other problems.
Nate Eldredge, thank you for your answer!
However, I was able to solve my problem creating a script named fast_diff.sh, with just a line of code, as follows:
diff <(find "$1" | sed "s|$1\/||g" | sort) <(find "$2" | sed "s|$2\/||g" | sort)
The script takes two arguments, let's say path1 and path2:
./fast_diff.sh /Volume1/.../.../Directory1 /Volume2/.../.../Directory2
Now the variable $1 is equal to "/Volume1/.../.../Directory1" and the variable $2 is equal to "/Volume2/.../.../Directory2".
The command find gives as output something like:
/Volume1/.../.../Directory1/SubDirectory1.1/file1
/Volume1/.../.../Directory1/SubDirectory1.1/file2
...
Now I pipe this output to sed, using:
sed "s|$1||g"
which replaces every occurrence of "/Volume1/.../.../Directory1" with nothing. I used | as a separator instead of / because there are many occurrences of / in the directory path.
Employing the previous line of code, though, lists all subdirectories and files starting with a slash:
/SubDirectory1.1/file1
/SubDirectory1.1/file2
...
To remove the slash, I added \/:
sed "s|$1\/||g"

Delete files from a list in a text file

I have a text file containing around 500 lines. Each line is an absolute path to a file. I want to delete these files using a script.
There's a suggestion here but my files have spaces in them. They have been treated with \ to escape the space but it still doesn't work. There is discussion on that thread about problems with white spaces but no solutions.
I can't simply use the find command as that won't give me the precise result, I need to use the list (which was created by running find and editing out the discrepancies).
Edit: some context. I noticed that iTunes has re-downloaded and copied multiple songs and put them in the same directory as the original songs, e.g., inside a particular album directory is '01 This Song.aac' and '01 This Song 1.aac'.
I ran a find to produce a text file with all songs matching "* 1.*" to get songs ending in 1 but of any file type. I ran this in my iTunes Media/Music directory.
Some of these songs included in the file had the number 1 in but weren't actually duplicates (victims of circumstance), so I manually deleted them.
The file I am left with is around 500 lines with songs all including spaces in the filenames. Because it's an iTunes issue, there are just a few songs in one directory, then more in another, then another, and so on -- I can't just run a script on a single directory, it has to work recursively and run only on the files named in my list.txt
As you would expect, the trick is to get the quoting right:
while read line; do rm "$line"; done < filename
To remove the file which name has spaces you can just wrap the whole path in quotes.
And to delete the list of files I would recommend to change each line of your file so that it looks like rm call. The fastest way is to use sed. So if your file is in following format:
/home/path/file name.asd
/opt/some/string/another name.wasd
...
The oneliner for that would be something like this:
sed -e 's/^/rm -f "/' file.txt | sed -e 's/$/" ;/' > newfile.sh
First sed replaces beginning of the line with rm -f ", second sed end of the line with " ;.
It would produce file with following content:
rm -rf "/home/path/file name.asd" ;
rm -rf "/opt/some/string/another name.wasd" ;
...
So you can just execute this file as a bash script.

Converting Filename to Filename_Inode

I'm writing my first script that takes a file and moves it to another folder, except that I want to change the filename of the file to filename_inode instead of just filename incase there are any files with the same name
I've figured out how to show this by creating the following 4 variables
inode=$(ls -i $1 | cut -c1-7) #lists the file the user types, cuts the inode from it
space="_" #used to put inbetween the filename and bname
bname=$(basename $1) #gets the basename of the file without the directory etc
bnamespaceinode=$bname$space$inode #combines the 3 values into one variable
echo "$bnamespaceinode #prints filename_inode to the window
So the bottom echo shows filename_inode which is what I want, except now when I try to move this using mv or cp i'm getting the following errors
I dont think it's anything wrong with the syntax i'm using for the mv and cv commands, and so I'm thinking I need to concatenate the 3 variables into a new file or use the result of the first and then append the other 2 to that file?
I've tried both of the above but still not having any luck, any ideas?
Thanks
Without clearer examples, I guess this could work:
$TARGETDIR=/my/target/directory
mv $1 $TARGETDIR/$(basename "$1" | sed 's/_.*/_inode/')

Using the mv command to move a file to a destination stored in a variable

I want to move a file called dog to $HOME/deleted2.
The unix command I use is:
mv dog $HOME/deleted2
However I want to move it to the exact same destination but this time $HOME/deleted2 is stored in a hidden file called .rm.cfg
I want to extract the location from .rm.cfg, this file contains one line which says $HOME/deleted2.
Here is what I did:
pathname=$(cat $HOME/.rm.cfg),
mv dog $pathname.
However this time I get an error saying $HOME/deleted2 does not exist. Why is this?
Sorry for not putting it in code format, I tried to indent by fours spaces but it did not work.
cat $HOME/.rm.cfg will only "outputs" the raw file, but it does not evaluate variables.
To put the full interpreted string in your pathname variable, you need to evaluate it:
pathname=$(eval echo $(cat $HOME/.rm.cfg))

Why did my use of the read command not do what I expected?

I did some havoc on my computer, when I played with the commands suggested by vezult [1]. I expected the one-liner to ask file-names to be removed. However, it immediately removed my files in a folder:
> find ./ -type f | while read x; do rm "$x"; done
I expected it to wait for my typing of stdin:s [2]. I cannot understand its action. How does the read command work, and where do you use it?
What happened there is that read reads from stdin. When you put it at the end of a pipe, it read from that pipe.
So your find becomes
file1
file2
and so on; read reads that and replaces x successively with file1 then file2, and so your loop becomes
rm "file1"
rm "file2"
and sure enough, that rm's every file starting at the current directory ".".
A couple hints.
You didn't need the "/".
It's better and safer to say
find . -type f
because should you happen to type ". /" (ie, dot SPACE slash) find will start at the current directory and then go look starting at the root directory. That trick, given the right privileges, would delete every file in the computer. "." is already the name of a directory; you don't need to add the slash.
The find or rm commands will do this
It sounds like what you wanted to do was go through all the files in all the directories starting at the current directory ".", and have it ASK if you want to delete it. You could do that with
find . -type f -exec rm -i {} \;
or
find . -type f -ok rm {} \;
and not need a loop at all. You can also do
rm -r -i *
and get nearly the same effect, except that it will try to delete directories too. If the directory is empty, that'll even work.
Another thought
Come to think of it, unless you have a LOT of files, you could also do
rm -i `find . -type f`
Now the find in backquotes will become a bunch of file names on the command line, and the '-i' interactive flag on rm will ask the yes or no question.
Charlie Martin gives you a good dissection and explanation of what went wrong with your specific example, but doesn't address the general question of:
When should you use the read command?
The answer to that is - when you want to read successive lines from some file (quite possibly the standard output of some previous sequence of commands in a pipeline), possibly splitting the lines into several separate variables. The splitting is done using the current value of '$IFS', which normally means on blanks and tabs (newlines don't count in this context; they separate lines). If there are multiple variables in the read command, then the first word goes into the first variable, the second into the second, ..., and the residue of the line into the last variable. If there's only one variable, the whole line goes into that variable.
There are many uses. This is one of the simpler scripts I have that uses the split option:
#!/bin/ksh
#
# #(#)$Id: mkdbs.sh,v 1.4 2008/10/12 02:41:42 jleffler Exp $
#
# Create basic set of databases
MKDUAL=$HOME/bin/mkdual.sql
ELEMENTS=$HOME/src/sqltools/SQL/elements.sql
cat <<! |
mode_ansi with log mode ansi
logged with buffered log
unlogged
stores with buffered log
!
while read dbs logging
do
if [ "$dbs" = "unlogged" ]
then bw=""; cw=""
else bw="-ebegin"; cw="-ecommit"
fi
sqlcmd -xe "create database $dbs $logging" \
$bw -e "grant resource to public" -f $MKDUAL -f $ELEMENTS $cw
done
The cat command with a here-document has its output sent to a pipe, so the output goes into the while read dbs logging loop. The first word goes into $dbs and is the name of the (Informix) database I want to create. The remainder of the line is placed into $logging. The body of the loop deals with unlogged databases (where begin and commit do not work), then run a program sqlcmd (completely separate from the Microsoft new-comer of the same name; it's been around since about 1990) to create a database and populate it with some standard tables and data - a simulation of the Oracle 'dual' table, and a set of tables related to the 'table of elements'.
Other scripts that use the read command are bigger (by far), but generally read lines containing one or more file names and some other attributes of relevance, and then apply an appropriate transform to the files using the attributes.
Osiris JL: file * | grep 'sh.*script' | sed 's/:.*//' | xargs wgrep read
esqlcver:read version letter
jlss: while read directory
jlss: read x || exit
jlss: read x || exit
jlss: while read file type link owner group perms
jlss: read x || exit
jlss: while read file type link owner group perms
kb: while read size name
mkbod: while read directory
mkbod:while read dist comp
mkdbs:while read dbs logging
mkmsd:while read msdfile master
mknmd:while read gfile sfile version notes
publictimestamp:while read name type title
publictimestamp:while read name type title
Osiris JL:
'Osiris JL: ' is my command line prompt; I ran this in my 'bin' directory. 'wgrep' is a variant of grep that only matches entire words (to avoid words like 'already'). This gives some indication of how I've used it.
The 'read x || exit' lines are for an interactive script that reads a response from standard input, but exits if the command gets EOF (for example, if standard input comes from /dev/null).

Resources