SQLite3 database or disk is full on csv imports - sqlite

This issue has been discussed on a number of threads, but none of the proposals seem to apply to my case.
I have a very large sqlite database (4Tb). I am trying to import csv files from the terminal
sqlite3 -csv -separator " " /data/mydb.db ".import '|cat *.csv' mytable"
I intermittently receive SQLite3 database or disk is full errors. Re-running the command after an error usually succeeds.
Some notes:
/data has 3.2Tb free
/tmp has 1.8Tb free.
*.csv takes up approximately 802Gb.
Both /tmp and /data are using ext4 which has a maximum file size of 16tb.
The only process accessing the database is the one mentioned above.
PRAGMA integrity_check returns ok.
Test on both
-sqlite3 --version - 3.38.1 2022-03-12 13:37:29 38c210fdd258658321c85ec9c01a072fda3ada94540e3239d29b34dc547a8cbc and 3.31.1 2020-01-27 19:55:54 3bfa9cc97da10598521b342961df8f5f68c7388fa117345eeb516eaa837balt1
OS - Ubuntu 20.04
Any thoughts on what could be happening?
(Unless there is an informed reason for why I am exceeding the limits sqlite, I would prefer to avoid suggestions that I move to a client/server RDBMS.)

i didn't figure it out, but someone else did, am pretty sure this will "fix it" until you reach 8TB-ish:
sqlite3 ... "PRAGMA main.max_page_count=2147483647; .import '|cat *.csv' mytable"
However the invocation
sqlite3 ... "PRAGMA main.journal_mode=DELETE; PRAGMA main.max_page_count; PRAGMA main.max_page_count=2147483647; PRAGMA main.page_size=65536;VACUUM; import '|cat *.csv' mytable;"
should allow the db to grow to ~200TB, but that VACUUM command, which is needed to apply the new page_size, requires a lot of free space to run, and will probably use a long time =/
good news is that you only need to run that once and it should be a permanent change to your db, your next invocation only needs sqlite3 ... "import '|cat *.csv' mytable;"
notably, this will probably break again around ~200TB

Related

How can I quit sqlite from a command batch file?

I am trying to create a sealed command for my build pipeline which inserts data and quits.
So far I have created my data files
things-to-import-001.sql and 002 etc, which contains all the INSERT statements I'd like to run, with a file per table.
I have created a command file to run them
-- import-all.sql
.read ./things-to-import-001.sql
.read ./things-to-import-002.sql
.quit
However when I run my command
sqlite3 -init ./import-all.sql ./database.sqlite
..the data is inserted, but the program remains running and shows the sqlite> prompt, despite the .quit command. I have also tried using .exit 0.
From the sqlite3 --help
-init FILENAME read/process named file
Docs: https://www.sqlite.org/cli.html#reading_sql_from_a_file
How can I tell sqlite to exit once my inserts have finished?
I have managed to find a dirty workaround for this issue.
I have updated my import file to include a bad command, and executed using -bail to quit on first error.
-- import-all.sql
.read ./things-to-import-001.sql
.read ./things-to-import-002.sql
.fakeErrorToQuitWithBail
Then you can execute with
sqlite3 -init import-all.sql -bail
and it should quit with
Error: unknown command or invalid arguments: "fakeErrorToQuitWithBail". Enter ".help" for help
Try using ".exit" at the place of ".quit". For some reason SQLite dont doccumented this commands.
https://www.tutorialspoint.com/sqlite/sqlite_commands.htm

sqlite3 disk I/O error on cli

sqlite version 3.6.20, running through VNC.
Starting sqlite3 cli session. When trying to run commands ".tables", ".databases", "create table" I get "Error: disk I/O error". I don't know how to get more accurate description. I want to write in my home directory where I have permissions.
I tried some suggested fixes in .sqliterc with journal mode and temp storage - they do not help. Some commands like "PRAGMA synchronous = OFF;" also cause disk io error.
.output /dev/null
PRAGMA journal_mode = MEMORY;
PRAGMA locking_mode = EXCLUSIVE;
PRAGMA temp_store_directory = '/home/username/tmp';
How to find out more about error and solve this?
It was VMWARE related error. Solution: move files to /tmp. Sqlite works there.
It is described here https://dba.stackexchange.com/questions/93575/sqlite-disk-i-o-error-3850

fast export unexplained failure

I have roughly 14 million records that I am attempting to export from a Teradata table to file using a fast export connection object.
There is no size limit for fast export files on our Linux system, and there is 1.2 TB of available space in the target directory.
The session fails, and gives the following errors:
READER_2_1_1 FEXP_87011 Process [16022] exited with status [12]
SDKS_38200 Partition-level [SOURCE_TABLE_NAME]: Plug-in #305400 failed in deinit()
I googled the error message, and found this post:
Here
I followed the recommendations in the port to delete the .out file in the temp directory, delete the files that were partially filled in the target directory, and drop the error table and delete the log file. This did not fix the issue and the session still fails with the same error messages.
Try to use TPT Export plug-in instead. Also you can try to execute this FastExport using bteq scripts directly on your unix environment.

DMP File Compression - Oracle EXPDP

I am using Oracle 11g. Here i am Exporting the database using EXPDP. My database dmp file will be around 50 GB. So i am running out of space in Production Server. So i had tried COMPRESSION = "ALL" in ,my EXPDP command. While running this, i am getting something like "Not Enabled".
Here is EXPDP command.
for /f "tokens=2,3,4 delims=/ " %%a in ('date /t') do set fdate=%%c%%a%%b
EXPDP username/password#sid COMPRESSION=ALL DIRECTORY=EXPDP_CUSTOM_DIR TABLESPACES=USER DUMPFILE = user.dmp
Whether i need to change anything in this..
You need to have licensed the Advanced Compression Option to use this feature. Options for export compression are pretty slim with data pump, otherwise. With the older export you could pipe the output through a compression program, but I don't think that's possible here.
You might consider specifying a maximum file size (1GB, say) and include a substitution variable in the dumpfile name so you produce a bunch of smaller files, and have a cron job watching for them and compressing them as soon as they export process releases them.

Writing log files in disk instead of db in drupal

Is there a way (or a module) to write log files into the disk instead of writing them to db? Because i really don't want my db getting fatter just because log lines.
Yes there is. It's part of the core of Drupal. It's call syslog. However, it logs in the system log file by default.
I hope you have some fast disks... you could easily create a bottleneck by doing so. Instead, I would regularly dump the log tables to a file, say using a cron job.
You could add this to a file called drupal_logs.sh:
NOW=$(date +"%Y%m%d")_$(date +"%H%M.%S")
mysqldump -p - -user=username dbname tableName1 tableName1 > /path/to/drupal_$NOW.sql
And schedule that to run every 15 minutes by adding the following cron job:
15 * * * * /path/to/drupal_logs.sh > /dev/null
And if you're worried about the log files in the database getting to large, you can follow your mysqldump command in your drupal_logs.sh with a truncate command of the exported tables.

Resources