I have a .sqliterc file containing .mode column.
Even though I run
sqlite3 -separator $'\t' .....
the output is padded with spaces instead of being separated by tabs:
sqlite3 -separator $'\t' ..... | cat -A
-- Loading resources from /home/xyz/.sqliterc
Ensembl_Gene_ID gene $
------------------ ----------$
ENSMUSG00000038503 Mesdc2 $
ENSMUSG00000038503 Mesdc2 $
ENSMUSG00000038503 Mesdc2 $
ENSMUSG00000038503 Mesdc2 $
How can I override some of the options in .sqliterc? If possible, I would rather not disable .sqliterc entirely, only override some of its options.
The -separator $'\t' option has no effect because the output is in column mode, where each record is shown on its own line and the values are padded with spaces so that they align in columns. The separator is not used in that mode.
Command-line options do override the settings in .sqliterc. Override the output mode with the -list option and -separator $'\t' takes effect:
sqlite3 -separator $'\t' -list -header test.db "select * from test" | cat -A
The output is in list mode, \t as the separator:
-- Loading resources from /home/test/.sqliterc
Ensembl_Gene_ID^Igene$
ENSMUSG00000038503^IMesdc2$
ENSMUSG00000038503^IMesdc2$
ENSMUSG00000038503^IMesdc2$
ENSMUSG00000038503^IMesdc2$
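To try the override without touching a real ~/.sqliterc, a scratch init file can be passed with -init. This is only a sketch: the scratch file, the in-memory database and the literal SELECT are stand-ins made up for illustration, not the asker's data.

```shell
# Hypothetical scratch file standing in for ~/.sqliterc.
rc=$(mktemp)
printf '.mode column\n' > "$rc"
tab=$(printf '\t')

# -list comes from the command line, so it wins over ".mode column" in the init file,
# and the tab separator is then actually used.
out=$(sqlite3 -init "$rc" -list -header -separator "$tab" :memory: \
  "select 'ENSMUSG00000038503' as Ensembl_Gene_ID, 'Mesdc2' as gene;" 2>/dev/null)

printf '%s\n' "$out" | od -c | head -n 4   # tabs are displayed as \t
rm -f "$rc"
```

Dropping -list from the command line brings back the space-padded column output.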
I have a .sql file containing hundreds of Hive queries, and I want the output of each query in its own file: abc.txt is created for the 1st query, xyz.txt for the 2nd, and so on, so that 100 queries produce 100 output files with their respective results.
If your main .sql file contains semicolon-separated queries, you can use tr and awk like this to generate a separate hive command, with its own output file, for each query.
tr '\n' ' ' < yourqueryfile | awk 'BEGIN {RS=";"} \
{gsub(/(^ +| +$)/, "", $0); if (length($0)) printf "hive -e \"%s\" >OUT_%d.txt\n", $0, NR}'
RS=";" - sets the record separator to ";".
tr - replaces the newlines inside and between queries with single spaces.
gsub - trims leading and trailing spaces.
NR - the record number, so every query gets a distinct output file. (Numbering by NF, the number of words in the query, would let queries with the same word count overwrite each other's output.)
The command generates hive command lines like this; pipe them to sh to run them.
hive -e "select 1" >OUT_1.txt
hive -e "select 3 from ( select 4 )" >OUT_2.txt
hive -e "select name from t union select n from t2" >OUT_3.txt
hive -e "select * from c" >OUT_4.txt
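Here is a small self-contained run of that idea, assuming a made-up three-query file. This variant numbers the output files by record number (NR) so that each query gets a distinct file, and it skips the empty record left after the final semicolon. It only prints the hive commands; nothing is executed.

```shell
# Hypothetical query file, for illustration only.
qfile=$(mktemp)
printf 'select 1;\nselect name\nfrom t;\nselect * from c;\n' > "$qfile"

cmds=$(tr '\n' ' ' < "$qfile" | awk 'BEGIN { RS = ";" }
{
  gsub(/(^ +| +$)/, "", $0)        # trim leading and trailing spaces
  if (length($0) > 0)              # skip the blank record after the last ";"
    printf "hive -e \"%s\" >OUT_%d.txt\n", $0, NR
}')

printf '%s\n' "$cmds"
rm -f "$qfile"
```

Queries that themselves contain double quotes or semicolons inside string literals would need more careful handling than this sketch provides.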
How would I apply the word count tool to a task of this nature:
Д
е
с
я
т
ь
д
н
е
й
I want to know how many characters appear but I want to ignore the white space in between the characters.
How can I specify that in the unix word count utility?
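With tr you can delete newlines and spaces like this (tr takes a plain set of characters, so no brackets are needed; brackets inside the set would be deleted as well):
$ tr -d '\n ' < file
Десятьдней
Then pipe to wc with -m to count characters (in a UTF-8 locale, so that each Cyrillic letter counts as one character):
$ tr -d '\n ' < file | wc -m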
10
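One detail worth checking with tr: its argument is a set of characters, not a regular expression, so square brackets in the set are treated as literal characters to delete. The sketch below uses a made-up ASCII sample (so the result does not depend on the locale) to show the difference.

```shell
# Made-up sample containing brackets, spaces and a newline.
sample=$(printf 'a [b]\nc d\n')

# Brackets inside the set are deleted too.
with_brackets=$(printf '%s\n' "$sample" | tr -d '[\n ]')

# Only newlines and spaces are deleted.
without_brackets=$(printf '%s\n' "$sample" | tr -d '\n ')

# Character count of what is left; the extra tr strips the padding some wc versions add.
count=$(printf '%s\n' "$sample" | tr -d '\n ' | wc -m | tr -d ' ')

printf '%s\n%s\n%s\n' "$with_brackets" "$without_brackets" "$count"
```

For the one-letter-per-line file in the question the two forms give the same answer, since it contains no brackets.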
I have two .txt files "test1.txt" and "test2.txt" and I want to use inverse grep (UNIX) to find out all lines in test2.txt that do not contain any of the lines in test1.txt
test1.txt contains only user names, while test2.txt contains longer strings of text. I only want the lines in test2.txt that DO NOT contain the usernames found in test1.txt
Would it be something like?
grep -v test1.txt test2.txt > answer.txt
You were almost there; you just missed one option, -f, which makes grep read its patterns from a file instead of treating test1.txt as the pattern itself.
The sample session below demonstrates it.
Demo Session
$ # first file
$ cat a.txt
xxxx yyyy
kkkkkk
zzzzzzzz
$ # second file
$ cat b.txt
line doesnot contain any name
This person is xxxx yyyy good
Another line which doesnot contain any name
Is kkkkkk a good name ?
This name itself is sleeping ...zzzzzzzz
I can't find any other name
Let's try the command now
$ # -i is used to ignore the case while searching
$ # output contains only lines from second file not containing text for first file lines
$ grep -v -i -f a.txt b.txt
line doesnot contain any name
Another line which doesnot contain any name
I can't find any other name
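A caveat that may matter for user names: with -f alone, every line of the pattern file is treated as a regular expression. Adding -F makes grep treat them as fixed strings. The sketch below uses made-up file contents to show a case where the two differ.

```shell
# Hypothetical name list and text file, for illustration only.
names=$(mktemp)
lines=$(mktemp)
printf 'j.doe\nkkkkkk\n' > "$names"
printf 'mail from jxdoe arrived\nIs kkkkkk a good name ?\nnothing to see here\n' > "$lines"

# As regexes, "j.doe" also matches "jxdoe", so that line is dropped as well.
regex_result=$(grep -v -f "$names" "$lines")

# As fixed strings (-F), only lines containing the literal names are dropped.
fixed_result=$(grep -v -F -f "$names" "$lines")

printf 'regex:\n%s\nfixed:\n%s\n' "$regex_result" "$fixed_result"
rm -f "$names" "$lines"
```

Also note that an empty line in the pattern file matches every line, so with -v nothing would be printed.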
There are probably better ways to do this (i.e. without grep), but here is a solution that will work:
grep -v -P "($(sed ':a;N;$!ba;s/\n/)|(/g' test1.txt))" test2.txt > answer.txt
To explain this:
$(sed ':a;N;$!ba;s/\n/)|(/g' test1.txt) is an embedded sed command that outputs the contents of test1.txt with each newline replaced by )|(. The result is wrapped in an outer pair of parentheses to form a Perl-style regex (-P, a GNU grep option) of the form (name1)|(name2)|..., so grep searches test2.txt for every line of test1.txt and, because of the -v option, returns only the lines of test2.txt that contain none of them.
What flavor of Unix are you using? Knowing that gives us a better idea of what is available to you on the command line. What you currently have will not work; you are looking for the diff command, which compares two files.
You can do the following; I have tested it at home on OS X 10.6.
diff -i -y FILE1 FILE2
diff compares the files. -i ignores case, so Hi and HI are treated as the same. -y prints the results side by side. To write the output to a file, use diff -i -y FILE1 FILE2 >> /tmp/Results.txt
I want to sort a tab-delimited file in descending order according to the 5th field of each record.
I tried
sort -r -k5n filename
But it didn't work.
The presence of the n option attached to the -k5 causes the global -r option to be ignored for that field. You have to specify both n and r at the same level (globally or locally).
sort -t $'\t' -k5,5rn
or
sort -rn -t $'\t' -k5,5
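A quick way to check this is to run it on a few made-up rows. The sketch below builds the tab character with printf instead of $'\t', so it also works in shells that lack the $'...' syntax.

```shell
tab=$(printf '\t')

# Made-up sample: five tab-separated fields, numeric values in the 5th.
data=$(printf 'a\tb\tc\td\t9\ne\tf\tg\th\t100\ni\tj\tk\tl\t25\n')

# n and r are both attached to the key, so the 5th field is sorted numerically, descending.
sorted=$(printf '%s\n' "$data" | sort -t "$tab" -k5,5rn)

printf '%s\n' "$sorted"
```

The rows come out in the order 100, 25, 9 on the 5th field.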
If you want to sort only on the 5th field, use -k5,5.
Also, use the -t command-line switch to set the delimiter to a tab. Try this:
sort -k5,5 -r -n -t $'\t' filename
Note that writing -t \t unquoted does not work: the shell strips the backslash, so sort receives the letter t as the separator. The $'\t' form is expanded to a real tab character by bash and similar shells.
The man page for sort states:
-t, --field-separator=SEP
use SEP instead of non-blank to blank transition
Finally, this SO question Unix Sort with Tab Delimiter might be helpful.
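To list files larger than 1000 MB by size in ascending order (sort -h, available in GNU sort, understands the human-readable sizes printed by ls -h; sort -n would compare only the leading number and ignore the unit):
find ./ -size +1000M -exec ls -lh {} \; | awk '{print $5, $9}' | sort -h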
I'm trying to bulk load a lot of data ( 5.5 million rows ) into an SQLite database file.
Loading via INSERTs seems to be far too slow, so I'm trying to use the sqlite3 command line tool and the .import command.
It works perfectly if I enter the commands by hand, but I can't for the life of me work out how to automate it from a script ( .bat file or python script; I'm working on a Windows machine ).
The commands I issue at the command line are these:
> sqlite3 database.db
sqlite> CREATE TABLE log_entry ( <snip> );
sqlite> .separator "\t"
sqlite> .import logfile.log log_entry
But nothing I try will get this to work from a bat file or python script.
I've been trying things like:
sqlite3 "database.db" .separator "\t" .import logfile.log log_entry
echo '.separator "\t" .import logfile.log log_entry' | sqlite3 database.db
Surely I can do this somehow?
Create a text file with the lines you want to enter into the sqlite command line program, like this:
CREATE TABLE log_entry ( <snip> );
.separator "\t"
.import logfile.log log_entry
and then just call sqlite3 database.db < commands.txt
Alternatively, you can put everything in one shell script file (which simplifies maintenance) using a heredoc. For example, import.sh:
#!/bin/bash --
sqlite3 -batch "$1" <<"EOF"
CREATE TABLE log_entry ( <snip> );
.separator "\t"
.import logfile.log log_entry
EOF
...and run it:
import.sh database.db
That way there is just one script file to maintain.
By the way, if you need to run it under Windows, PowerShell offers a similar construct (here-strings).
In addition, this approach works around sqlite3's lack of script parameter support: you can use bash variables, provided the heredoc delimiter is left unquoted so that the shell expands them:
#!/bin/bash --
table_name=log_entry
sqlite3 -batch "$1" <<EOF
CREATE TABLE ${table_name} ( <snip> );
.separator "\t"
.import logfile.log ${table_name}
EOF
Or even do a trick like this:
#!/bin/bash --
table_name=$2
sqlite3 -batch "$1" <<EOF
CREATE TABLE ${table_name} ( <snip> );
.separator "\t"
.import logfile.log ${table_name}
EOF
...and run it: import.sh database.db log_entry
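The scripts above differ in one detail that is easy to miss: the first one quotes the heredoc delimiter ("EOF"), the later ones do not. With a quoted delimiter the shell passes the text through untouched; with an unquoted one it expands variables first. The sketch below shows this with cat standing in for sqlite3, so nothing is imported.

```shell
table_name=log_entry

# Quoted delimiter: the body is passed through literally.
quoted=$(cat <<"EOF"
.import logfile.log ${table_name}
EOF
)

# Unquoted delimiter: the shell expands ${table_name} before the command sees it.
unquoted=$(cat <<EOF
.import logfile.log ${table_name}
EOF
)

printf '%s\n%s\n' "$quoted" "$unquoted"
```

So the quoted form is the safer default when the SQL contains $ or backticks, and the unquoted form is needed when the script should fill in values.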
Create a separate text file containing all the commands you would normally type into the sqlite3 shell app:
CREATE TABLE log_entry ( <snip> );
.separator "\t"
.import /path/to/logfile.log log_entry
Save it as, say, impscript.sql.
Create a batch file which calls the sqlite3 shell with that script:
sqlite3.exe yourdatabase.db < /path/to/impscript.sql
Call the batch file.
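On a side note: if you load the data with INSERT statements, make sure to wrap them in a single transaction! Without one, every INSERT is committed separately, and wrapping them can make the load dramatically faster, often by orders of magnitude.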
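As a rough sketch of what wrapping the INSERTs in a transaction looks like from a script, the following generates 1000 made-up rows and feeds them to sqlite3 between BEGIN and COMMIT. The two-column table definition and the row contents are assumptions for illustration, since the real schema is elided in the question.

```shell
# Scratch database file; sqlite3 treats an empty file as a new database.
db=$(mktemp)

{
  echo 'CREATE TABLE log_entry (id INTEGER, msg TEXT);'   # hypothetical schema
  echo 'BEGIN;'
  i=1
  while [ "$i" -le 1000 ]; do
    echo "INSERT INTO log_entry VALUES ($i, 'row $i');"
    i=$((i + 1))
  done
  echo 'COMMIT;'
} | sqlite3 "$db"

# -list -noheader keeps the output plain even if a .sqliterc changes the defaults.
count=$(sqlite3 -list -noheader "$db" 'SELECT count(*) FROM log_entry;')
echo "$count"
rm -f "$db"
```

Removing the BEGIN and COMMIT lines gives the same table contents, but each INSERT then runs in its own transaction.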
I just recently had a similar problem while converting Firefox' cookies.sqlite to a text file (for some downloading tool) and stumbled across this question.
I wanted to do that with a single shell line and that would be my solution applied to the above mentioned problem:
echo -e ".mode tabs\n.import logfile.log log_entry" | sqlite3 database.db
I haven't tested that line yet, but the same approach worked fine for the Firefox problem mentioned above (via Bash on Mac OS X):
echo -e ".mode tabs\nselect host, case when host glob '.*' then 'TRUE' else 'FALSE' end, path, case when isSecure then 'TRUE' else 'FALSE' end, expiry, name, value from moz_cookies;" | sqlite3 cookies.sqlite
sqlite3 abc.db ".read scriptname.sql"
The only thing I can add at this point is that I had some trouble passing a Unix environment variable to the bash script suggested by nad2000.
I was running this:
bash dbmake.sh database.db <(sed '1d' $DATA/logfile.log | head -n 1000)
I needed to import from stdin as a workaround, and I found this solution:
sqlite3 "$1" <<"EOF"
CREATE TABLE log_entry ( <snip> );
EOF
sqlite3 -separator $'\t' "$1" ".import $2 log_entry"
Because the second sqlite3 call sits outside the quoted heredoc, the shell expands $2, so the file argument (full path and everything) reaches .import.
On Windows, this should work:
(echo CREATE TABLE log_entry ( <snip> ); & echo .separator "\t" & echo .import logfile.log log_entry) | sqlite3.exe database.db
I haven't tested this particular command, but while working out how to pipe multiple commands I found that the key is to enclose the echoed commands in parentheses. That said, you may need to tweak the above command to escape some of the characters. For example:
(echo CREATE TABLE log_entry ^( ^<snip^> ^); & echo .separator "\t" & echo .import logfile.log log_entry) | sqlite3.exe database.db
I'm not sure whether the escaping is needed in this case, but it is quite likely: the inner parentheses may conflict with the enclosing ones, and the "less than" and "greater than" symbols are normally interpreted as input and output redirection. An extensive list of escape characters can be found here: http://www.robvanderwoude.com/escapechars.php
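Here trans is the table name and trans.csv is a CSV file containing 1959 rows of data.
The separator has to be set in the same sqlite3 invocation as the import, because a setting made in one invocation is not remembered by the next:
$ sqlite3 -separator ',' abc.db ".import 'trans.csv' trans"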
$ sqlite3 abc.db "select count(*) from trans;"
1959
But it is not possible to write it the way you wrote it.