I have many large text files, and I would like to add a line at the very beginning. I saw someone had asked this already here. However, this involves reading the entire text file and appending it to the single line. Is there a better (faster) way?
I tested this on Windows 7 and it works. Essentially, you use the shell function and do everything in the Windows command prompt (cmd), which is quite fast.
write_beginning <- function(text, file){
#write your text to a temp file i.e. temp.txt
write(text, file='temp.txt')
#print temp.txt to a new file
shell(paste('type temp.txt >' , 'new.txt'))
#append your file to new.txt
shell(paste('type', file, '>> new.txt'))
#remove temp.txt - I use capture output to get rid of the
#annoying TRUE printed by file.remove
dump <- capture.output(file.remove('temp.txt'))
#uncomment the last line below to rename new.txt with the name of your file
#and essentially modify your old file
#dump <- capture.output(file.rename('new.txt', file))
}
#assuming your file is test.txt and you want to add 'hello' at the beginning just do:
write_beginning('hello', 'test.txt')
On Linux you need the corresponding command to send one file into another (I believe you replace type with cat, but I cannot test this right now).
You'd use the system() function on a Linux distro:
system('cp file.txt temp.txt; echo " " > file.txt; cat temp.txt >> file.txt; rm temp.txt')
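For the Linux side, here is a minimal shell sketch of the same idea that could be passed to system() or run directly. It streams the file through a temporary file rather than loading it into memory. The helper name prepend_line and the demo file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: prepend one line to a file by streaming through a temporary file.
# prepend_line is a hypothetical helper name, not a standard tool.
prepend_line() {
    line=$1
    target=$2
    # Create the temp file next to the target so mv stays on one filesystem.
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    # Write the new first line, then stream the original content after it.
    { printf '%s\n' "$line"; cat -- "$target"; } > "$tmp" && mv -- "$tmp" "$target"
}

# Demo in a scratch directory.
workdir=$(mktemp -d)
printf 'first\nsecond\n' > "$workdir/test.txt"
prepend_line 'hello' "$workdir/test.txt"
result=$(cat "$workdir/test.txt")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Note that the replaced file takes on the permissions of the temporary file, so this may need a chmod afterwards if permissions matter.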
I use UNIX fairly infrequently, so I apologize if this seems like an easy question. I am trying to loop through subdirectories and files, generate an output from the specific files that the loop grabs, and then pipe that output to a file in another directory whose name will be identifiable from the input file. So far I have:
for file in /home/sub_directory1/samples/SSTC*/
do
samtools depth -r chr9:218026635-21994999 < $file > /home/sub_directory_2/level_2/${file}_out
done
I was hoping to generate an output from file_1_novoalign.bam in sub_directory1/samples/SSTC*/ and to send that output to /home/sub_directory_2/level_2/ as an output file called file_1_novoalign_out.bam however it doesn't work - it says 'bash: /home/sub_directory_2/level_2/file_1_novoalign.bam.out: No such file or directory'.
I would ideally like to be able to strip off the '_novoalign.bam' part of the outfile and replace with '_out.txt'. I'm sure this will be easy for a regular unix user but I have searched and can't find a quick answer and don't really have time to spend ages searching. Thanks in advance for any suggestions building on the code I have so far or any alternate suggestions are welcome.
p.s. I don't have permission to write files to the directory containing the input folders
Below is an explanation for filenames without spaces, to keep it simple.
When you want files, not directories, you should end your for-loop with * and not */.
When you only want to process files ending with _novoalign.bam, you should tell this to unix.
The easiest way is to use sed to replace part of the string.
A dollar-sign is for the end of the string. The total script will be
OUTDIR=/home/sub_directory_2/level_2
for file in /home/sub_directory1/samples/SSTC/*_novoalign.bam; do
echo "Debug: Inputfile including path: ${file}"
# Drop the directory part, then swap the suffix (the dot is escaped so it matches literally)
OUTPUTFILE=$(basename "$file" | sed -e 's/_novoalign\.bam$/_out.txt/')
echo "Debug: Outputfile without path: ${OUTPUTFILE}"
samtools depth -r chr9:218026635-21994999 < "${file}" > "${OUTDIR}/${OUTPUTFILE}"
done
Note 1:
You can use parameter expansion like file=${fullfile##*/} to get the filename without path, but you will forget the syntax in one hour.
Easier to remember are basename and dirname, but you still have to do some processing.
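For comparison, here is a small sketch of the parameter-expansion route mentioned in Note 1. The directory name SSTC01 is an assumed example of something the SSTC* pattern could match; the file does not need to exist for the expansions to work.

```shell
#!/bin/sh
# Hypothetical input path; SSTC01 is an assumed directory name.
file=/home/sub_directory1/samples/SSTC01/file_1_novoalign.bam

name=${file##*/}              # remove the longest prefix ending in "/"
stem=${name%_novoalign.bam}   # remove the suffix from the end
outfile=${stem}_out.txt

printf '%s\n' "$outfile"
```

Both expansions are done by the shell itself, so no basename or sed process is started per file.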
Note 2:
When your script first changes the directory to /home/sub_directory_2/level_2 you can skip the basename call.
When all the files in the dir are to be processed, you can use the asterisk.
When all files have at most one underscore, you can use cut.
You might want to add some error handling. When you want the STDERR from samtools in your outputfile, add 2>&1.
These will turn your script into
OUTDIR=/home/sub_directory_2/level_2
cd /home/sub_directory1/samples/SSTC
for file in *; do
echo Debug: Inputfile: ${file}
OUTPUTFILE="$(basename $file | cut -d_ -f1)_out.txt"
echo Debug: Outputfile: ${OUTPUTFILE}
samtools depth -r chr9:218026635-21994999 < ${file} > ${OUTDIR}/${OUTPUTFILE} 2>&1
done
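The error handling mentioned in Note 2 could look roughly like the sketch below. To keep it runnable without samtools, depth_cmd is a made-up stand-in for the real samtools call, and the two scratch directories stand in for the sample and output directories.

```shell
#!/bin/sh
# depth_cmd is a hypothetical stand-in for "samtools depth ...".
depth_cmd() { wc -c < "$1"; }

indir=$(mktemp -d)     # stands in for the samples directory
outdir=$(mktemp -d)    # stands in for /home/sub_directory_2/level_2
printf 'data' > "$indir/file_1_novoalign.bam"

processed=0
for file in "$indir"/*_novoalign.bam; do
    # If nothing matches, the pattern is left unexpanded; skip it.
    [ -e "$file" ] || continue
    # basename with a second argument also strips that suffix.
    name=$(basename "$file" _novoalign.bam)
    if depth_cmd "$file" > "$outdir/${name}_out.txt"; then
        processed=$((processed + 1))
    else
        echo "failed on $file" >&2
    fi
done

outputs=$(ls "$outdir")
echo "processed $processed file(s): $outputs"
rm -rf "$indir" "$outdir"
```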
Please help. I need to turn this in before 4pm for this Unix class. I have been working on it since 7pm last night. Haven't slept. There are three parts to this assignment. I only need help with the last part. If I can't complete this I fail the class.
Stage 3
In that same directory, write a script asciiFix.sh that takes an arbitrary number of file paths from the command line and carries out the same analysis on each one. If a file is not Windows ASCII, your script should do nothing to it. For each file that is Windows ASCII, your script should print the message
converting fileName
and should then convert the CR/LF line terminators in that file to Unix-style LF line terminators.
For example:
cp ~cs252/Assignments/ftpAsst/d3.dat wintest.txt
./asciiFix.sh /usr/share/dict/words wintest.txt fileType.sh
converting wintest.txt
and, after the script has finished, you should be able to determine that wintest.txt is now a Unix ASCII file.
When you believe that you have your script working, run
~cs252/bin/scriptAsst.pl
If all three scripts are working correctly, you will receive your access code.
My attempts:
#!/bin/sh
for file in "$@" do
if file "$file" | grep "ASCII text, with CRLF"; then
echo "converting $file"
sed -e s/[\\r\\n]//g "$file"
fi
done
result:
./asciiFix.sh: 3: ./asciiFix.sh: Syntax error: "if" unexpected (expecting "do")
aardvark.cpp /home/cs252/Assignments/scriptAsst/winscrubbed.dat differ: byte 50, line 1
Failed: incorrect file conversion when running ./asciiFix.sh 'aardvark.cpp' 'bongo.dat' 'cat.dog.bak'
I have tried taking out if and then. I have tried sed -i 's/^M//g' "$file", dos2unix, and some other things I don't remember, but it always reports an incorrect conversion for those files.
After adding the ; and switching to dos2unix:
#!/bin/sh
for file in "$@";
do
if file "$file" | grep "ASCII text, with CRLF"; then
echo "converting $file"
dos2unix "$file"
fi
done
the Error that I now get:
dos2unix: converting file aardvark.cpp to Unix format ...
Failed when running: ./asciiFix.sh 'aardvark.cpp' 'bongo.dat' 'cat.dog.bak'
Thanks for all your help.
The code that finally worked was:
#!/bin/sh
for file in "$@";
do
if file "$file" | grep -q "ASCII text, with CRLF"; then
echo "converting $file"
dos2unix "$file"
fi
done
You forgot the ; before do; do counts as a new statement. Alternatively, you could place do on a new line. In my opinion, the most comfortable way to convert DOS line endings (CRLF) to Unix line endings (LF only) is dos2unix. Once you fix the missing ;, using dos2unix instead of sed should be straightforward.
Since dos2unix 7.1 you can use dos2unix itself to test for CRLF. This way you are not limited to ASCII.
#!/bin/sh
for file in "$@";
do
if [ -n "$(dos2unix -ic "$file")" ]; then
echo "converting $file"
dos2unix "$file"
fi
done
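On a machine without dos2unix, roughly the same loop can be sketched with grep and sed alone. This is an approximation: it treats any line ending in a carriage return as CRLF and does not reproduce the "ASCII text" check that file performs. The two sample files are created only for the demo.

```shell
#!/bin/sh
workdir=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$workdir/wintest.txt"   # CRLF line endings
printf 'one\ntwo\n' > "$workdir/unix.txt"          # LF line endings
cr=$(printf '\r')                                  # a literal carriage return

count=0
last_converted=""
for file in "$workdir/wintest.txt" "$workdir/unix.txt"; do
    # A line ending in CR is taken as a sign of CRLF terminators.
    if grep -q "${cr}\$" "$file"; then
        echo "converting $file"
        # Strip only a CR at the end of a line; write through a temp file.
        sed "s/${cr}\$//" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
        count=$((count + 1))
        last_converted=${file##*/}
    fi
done

bytes=$(wc -c < "$workdir/wintest.txt" | tr -d ' ')
echo "converted $count file(s); wintest.txt is now $bytes bytes"
rm -rf "$workdir"
```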
In a Unix environment, I need to write a report to the file x_out, and the file needs to be removed at the end of the process. But it always throws the following error.
grep: can't open /XYZ/123/Tmp/x_out
rm: /XYZ/123/Tmp/x_out non-existent
But I can find the file x_out at the corresponding location, and I'm able to open it and view its contents. I have found that the file name sometimes changes, with characters like '~' appended to it. Is there a way to resolve this?
Edit: There is no '~' appended to it, but I suspect that some unreadable characters like that may have been appended.
Edit: I have added the actual error here
Edit: the command I used
grep "Report_values" ${REPORTOUT}|cut -d "|" -f 6
rm ${REPORTOUT}
Well, there are two possibilities I can see off the top of my head. There are undoubtedly more but the top of my head isn't a very big space :-)
The first is that the file doesn't exist despite your assertions.
The second is that it does exist but you're looking for it in the wrong place (for example, you've changed into a different directory).
If you place a line similar to:
( pwd ; cd ../.. ; pwd ; ls )
in your script before the grep/rm, it should tell you if either of those two possibilities is correct.
It will give your current directory, the directory you're looking in for the file and the files in that directory.
Check whether the filename contains non-printable or non-graphic characters. Use the -Q or -q flag of ls to see them; the demo session below shows how it looks.
flag description from ls man page
-q, --hide-control-chars
print ? instead of non graphic characters
--show-control-chars
show non graphic characters as-is (default unless program is `ls' and output is a terminal)
-Q, --quote-name
enclose entry names in double quotes
--quoting-style=WORD
use quoting style WORD for entry names: literal, locale, shell, shell-always, c, escape
Demo Session
$ ls
demo.txt test.dat
$ ls -1
demo.txt
test.dat
$ cat demo.txt
cat: demo.txt: No such file or directory
$ rm demo.txt
rm: cannot remove `demo.txt': No such file or directory
$ ls -Q
"demo.txt " "test.dat"
$ ls -1Q
"demo.txt "
"test.dat"
$ rm "demo.txt "
$
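Where ls lacks the -Q flag (it is a GNU option), a similar check can be sketched with printf and od. The file name with a trailing space is recreated here in a scratch directory purely for the demo.

```shell
#!/bin/sh
workdir=$(mktemp -d)
: > "$workdir/demo.txt "      # note the trailing space in the name

for path in "$workdir"/demo*; do
    name=${path##*/}
    # Brackets make leading or trailing whitespace visible.
    printf '[%s]\n' "$name"
    # od -c spells out every byte, including control characters.
    printf '%s' "$name" | od -c
    found=$name
done
rm -rf "$workdir"
```

Looping over a glob like this also gives the script the exact name, hidden characters included, which can then be passed quoted to grep or rm.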
Is there a Unix command to prepend some string data to a text file?
Something like:
prepend "to be prepended" text.txt
printf '%s\n%s\n' "to be prepended" "$(cat text.txt)" >text.txt
sed -i.old '1s;^;to be prepended;' inFile
-i writes the change in place and takes a backup if an extension is given (in this case, .old).
1s;^;to be prepended; substitutes the beginning of the first line by the given replacement string, using ; as a command delimiter.
Process Substitution
I'm surprised no one mentioned this.
cat <(echo "before") text.txt > newfile.txt
which is arguably more natural than the accepted answer (printing something and piping it into a substitution command is lexicographically counter-intuitive).
...and hijacking what ryan said above, with sponge you don't need a temporary file:
sudo apt-get install moreutils
<<(echo "to be prepended") < text.txt | sponge text.txt
EDIT: Looks like this doesn't work in Bourne Shell /bin/sh
Here String (zsh only)
Using a here-string - <<<, you can do:
<<< "to be prepended" < text.txt | sponge text.txt
This is one possibility:
(echo "to be prepended"; cat text.txt) > newfile.txt
you'll probably not easily get around an intermediate file.
Alternatives (can be cumbersome with shell escaping):
sed -i '0,/^/s//to be prepended/' text.txt
If it's acceptable to replace the input file:
Note:
Doing so may have unexpected side effects, notably potentially replacing a symlink with a regular file, ending up with different permissions on the file, and changing the file's creation (birth) date.
sed -i, as in Prince John Wesley's answer, tries to at least restore the original permissions, but the other limitations apply as well.
Here's a simple alternative that uses a temporary file (it avoids reading the whole input file into memory the way that shime's solution does):
{ printf 'to be prepended'; cat text.txt; } > tmp.txt && mv tmp.txt text.txt
Using a group command ({ ...; ...; }) is slightly more efficient than using a subshell ((...; ...)), as in 0xC0000022L's solution.
The advantages are:
It's easy to control whether the new text should be directly prepended to the first line or whether it should be inserted as new line(s) (simply append \n to the printf argument).
Unlike the sed solution, it works if the input file is empty (0 bytes).
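The empty-file difference can be checked with a short sketch like the following. The file name empty.txt is just a demo name, and sed is run without -i here so that only its output is compared.

```shell
#!/bin/sh
workdir=$(mktemp -d)
: > "$workdir/empty.txt"      # 0-byte input file

# sed has no first line to act on, so it prints nothing.
sed_out=$(sed '1s/^/to be prepended/' "$workdir/empty.txt")

# The group command writes the text regardless of the input's size.
{ printf 'to be prepended\n'; cat "$workdir/empty.txt"; } > "$workdir/tmp.txt" &&
    mv "$workdir/tmp.txt" "$workdir/empty.txt"
group_out=$(cat "$workdir/empty.txt")

printf 'sed: [%s]\ngroup: [%s]\n' "$sed_out" "$group_out"
rm -rf "$workdir"
```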
The sed solution can be simplified if the intent is to prepend one or more whole lines to the existing content (assuming the input file is non-empty):
sed's i function inserts whole lines:
With GNU sed:
# Prepends 'to be prepended' *followed by a newline*, i.e. inserts a new line.
# To prepend multiple lines, use '\n' as part of the text.
# -i.old creates a backup of the input file with extension '.old'
sed -i.old '1 i\to be prepended' inFile
A portable variant that also works with macOS / BSD sed:
# Prepends 'to be prepended' *followed by a newline*
# To prepend multiple lines, escape the ends of intermediate
# lines with '\'
sed -i.old -e '1 i\
to be prepended' inFile
Note that the literal newline after the \ is required.
If the input file must be edited in place (preserving its inode with all its attributes):
Using the venerable ed POSIX utility:
Note:
ed invariably reads the input file as a whole into memory first.
To prepend directly to the first line (as with sed, this won't work if the input file is completely empty (0 bytes)):
ed -s text.txt <<EOF
1 s/^/to be prepended/
w
EOF
-s suppresses ed's status messages.
Note how the commands are provided to ed as a multi-line here-document (<<EOF\n...\nEOF), i.e., via stdin; by default string expansion is performed in such documents (shell variables are interpolated); quote the opening delimiter to suppress that (e.g., <<'EOF').
1 makes the 1st line the current line
function s performs a regex-based string substitution on the current line, as in sed; you may include literal newlines in the substitution text, but they must be \-escaped.
w writes the result back to the input file (for testing, replace w with ,p to only print the result, without modifying the input file).
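The effect of quoting the here-document delimiter can be seen without running ed at all. In this sketch cat stands in for ed so that the text each variant would feed to the editor is simply captured; the variable name text is an assumption for the demo.

```shell
#!/bin/sh
text='to be prepended'

# Unquoted delimiter: the shell expands $text inside the document.
expanded=$(cat <<EOF
1 s/^/$text/
EOF
)

# Quoted delimiter: the document is passed through literally.
literal=$(cat <<'EOF'
1 s/^/$text/
EOF
)

printf '%s\n%s\n' "$expanded" "$literal"
```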
To prepend one or more whole lines:
As with sed, the i function invariably adds a trailing newline to the text to be inserted.
ed -s text.txt <<EOF
0 i
line 1
line 2
.
w
EOF
0 i makes 0 (the beginning of the file) the current line and starts insert mode (i); note that line numbers are otherwise 1-based.
The following lines are the text to insert before the current line, terminated with . on its own line.
This will work to form the output. The - means standard input, which is provided via the pipe from echo.
echo -e "to be prepended \n another line" | cat - text.txt
To rewrite the file, a temporary file is required, because you cannot pipe back into the input file.
echo "to be prepended" | cat - text.txt > text.txt.tmp
mv text.txt.tmp text.txt
Prefer Adam's answer
We can make this easier by using sponge. Now we don't need to create a temporary file and rename it:
echo -e "to be prepended \n another line" | cat - text.txt | sponge text.txt
Probably nothing built-in, but you could write your own pretty easily, like this:
#!/bin/bash
echo -n "$1" > /tmp/tmpfile.$$
cat "$2" >> /tmp/tmpfile.$$
mv /tmp/tmpfile.$$ "$2"
Something like that at least...
Editor's note:
This command will result in data loss if the input file happens to be larger than your system's pipeline buffer size, which is typically 64 KB nowadays. See the comments for details.
In some circumstances the text to prepend may be available only from stdin.
Then this combination should work.
echo "to be prepended" | cat - text.txt | tee text.txt
If you want to omit tee output, then append > /dev/null.
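Given the data-loss warning in the editor's note, a more cautious sketch for the stdin case is to let cat write to a temporary file and rename it afterwards. The helper name prepend_stdin is made up, and the test file is generated to be well above 64 KB.

```shell
#!/bin/sh
# prepend_stdin is a hypothetical helper: it prepends its stdin to the file.
prepend_stdin() {
    target=$1
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    # "-" makes cat read standard input first, then the existing file.
    cat - "$target" > "$tmp" && mv "$tmp" "$target"
}

workdir=$(mktemp -d)
# 20000 lines of 10 bytes each, i.e. about 200 KB.
awk 'BEGIN { for (i = 0; i < 20000; i++) print "line data" }' > "$workdir/text.txt"

echo "to be prepended" | prepend_stdin "$workdir/text.txt"

first=$(head -n 1 "$workdir/text.txt")
lines=$(wc -l < "$workdir/text.txt" | tr -d ' ')
echo "first line: $first; total lines: $lines"
rm -rf "$workdir"
```

Because the original file is only replaced after cat has finished reading it, the size of the pipe buffer no longer matters.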
Another way using sed:
sed -i.old '1 {i to be prepended
}' inFile
If the line to be prepended is multiline:
sed -i.old '1 {i\
to be prepended\
multiline
}' inFile
Solution:
printf '%s\n%s' 'text to prepend' "$(cat file.txt)" > file.txt
Note that this is safe for all kinds of input, because there are no expansions. For example, if you want to prepend !@#$%^&*()ugly text\n\t\n, it will just work:
printf '%s\n%s' '!@#$%^&*()ugly text\n\t\n' "$(cat file.txt)" > file.txt
The last part left for consideration is whitespace removal at end of file during command substitution "$(cat file.txt)". All work-arounds for this are relatively complex. If you want to preserve newlines at end of file.txt, see this: https://stackoverflow.com/a/22607352/1091436
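The trailing-newline loss is easy to reproduce. The sketch below builds a small file that ends in two empty lines (the content is invented for the demo) and compares byte counts before and after the prepend.

```shell
#!/bin/sh
workdir=$(mktemp -d)
f="$workdir/file.txt"
printf 'last line\n\n\n' > "$f"     # ends with two empty lines

before=$(wc -c < "$f" | tr -d ' ')
# Command substitution drops every trailing newline from the old content.
printf '%s\n%s' 'text to prepend' "$(cat "$f")" > "$f"
after=$(wc -c < "$f" | tr -d ' ')
newlines=$(wc -l < "$f" | tr -d ' ')

echo "before: $before bytes; after: $after bytes, $newlines newline(s)"
rm -rf "$workdir"
```

After the prepend the file contains a single newline, the one between the new text and the old content, so the file no longer ends with a newline at all.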
As tested in Bash (in Ubuntu), if starting with a test file via;
echo "Original Line" > test_file.txt
you can execute;
echo "$(echo "New Line"; cat test_file.txt)" > test_file.txt
or, if the version of bash is too old for $(), you can use backticks;
echo "`echo "New Line"; cat test_file.txt`" > test_file.txt
and receive the following contents of "test_file.txt";
New Line
Original Line
No intermediary file, just bash/echo.
Another fairly straightforward solution is:
$ echo -e "string\n" $(cat file)
% echo blaha > blaha
% echo fizz > fizz
% cat blaha fizz > buzz
% cat buzz
blaha
fizz
You can do that easily with awk
cat text.txt|awk '{print "to be prepended"$0}'
It seems like the question is about prepending a string to the file not each line of the file, in this case as suggested by Tom Ekberg the following command should be used instead.
awk 'BEGIN{print "to be prepended"} {print $0}' text.txt
If you like vi/vim, this may be more your style.
printf '0i\n%s\n.\nwq\n' prepend-text | ed file
For future readers who want to prepend one or more lines of text (with variables or even subshell code) and keep it readable and formatted, you may enjoy this:
echo "Lonely string" > my-file.txt
Then run
cat <<EOF > my-file.txt
Hello, there!
$(cat my-file.txt)
EOF
Results of cat my-file.txt:
Hello, there!
Lonely string
This works because the read of my-file.txt happens first and in a subshell. I use this trick all the time to prepend important rules to config files in Docker containers rather than copy over entire config files.
you can use variables
Even though a bunch of answers here work pretty well, I want to contribute this one-liner, just for completeness. At least it is easy to keep in mind, and it may contribute to a general understanding of bash for some people.
PREPEND="new line 1"; FILE="text.txt"; printf "${PREPEND}\n`cat $FILE`" > $FILE
In this snippet, just replace text.txt with the text file you want to prepend to and new line 1 with the text to prepend.
example
$ printf "old line 1\nold line 2" > text.txt
$ cat text.txt; echo ""
old line 1
old line 2
$ PREPEND="new line 1"; FILE="text.txt"; printf "${PREPEND}\n`cat $FILE`" > $FILE
$ cat text.txt; echo ""
new line 1
old line 1
old line 2
$
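One caveat with this one-liner: the file content ends up inside printf's format string, so backslash sequences (and % signs) in the file are interpreted. A small sketch of the difference, using an invented file whose content is C:\new:

```shell
#!/bin/sh
workdir=$(mktemp -d)
FILE="$workdir/text.txt"
PREPEND="new line 1"
printf '%s\n' 'C:\new' > "$FILE"     # content with a backslash sequence

# Content used as the format string: \n inside the data becomes a newline.
unsafe=$(printf "${PREPEND}\n$(cat "$FILE")")
# Content passed as an argument to %s: it is copied through untouched.
safe=$(printf '%s\n%s\n' "$PREPEND" "$(cat "$FILE")")

unsafe_lines=$(printf '%s\n' "$unsafe" | wc -l | tr -d ' ')
safe_second=$(printf '%s\n' "$safe" | sed -n 2p)
echo "unsafe variant yields $unsafe_lines lines; safe second line: $safe_second"
rm -rf "$workdir"
```

If the file may contain arbitrary text, passing it as an argument to a fixed '%s' format is the more predictable choice.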
# create a file with content..
echo foo > /tmp/foo
# prepend a line containing "jim" to the file
sed -i "1s/^/jim\n/" /tmp/foo
# verify that the new line has been prepended to the file
cat /tmp/foo
I'd recommend defining a function and then importing and using that where needed.
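prepend_to_file() {
file=$1
text=$2
# Create the file first if it does not exist yet
if ! [[ -f $file ]]; then
touch "$file"
fi
# Write the new text followed by the old content, then replace the original
echo "$text" | cat - "$file" > "$file.new"
mv -f "$file.new" "$file"
}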
Then use it like so:
prepend_to_file test.txt "This is first"
prepend_to_file test.txt "This is second"
Your file contents will then be:
This is second
This is first
I'm about to use this approach for implementing a change log updater.
With ex,
ex - $file << PREPEND
-1
i
prepended text
.
wq
PREPEND
The ex commands are
-1 Go to the very beginning of the file
i Begin insert mode
. End insert mode
wq Save (write) and quit