Writing log files to disk instead of the DB in Drupal - drupal

Is there a way (or a module) to write log files to disk instead of writing them to the database? I really don't want my database getting bigger just because of log lines.

Yes, there is. It's part of Drupal core: the Syslog module. By default, it logs to the system log file.
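If you use Drush, switching over can be as simple as enabling Syslog and disabling the Database logging (dblog) module. A minimal sketch, assuming Drupal 7 with Drush 8 or earlier:
# Enable the core Syslog module
drush en syslog -y
# Stop writing log entries to the database (watchdog table)
drush dis dblog -y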

I hope you have some fast disks... you could easily create a bottleneck by doing so. Instead, I would regularly dump the log tables to a file, say using a cron job.
You could add this to a file called drupal_logs.sh:
NOW=$(date +"%Y%m%d")_$(date +"%H%M.%S")
mysqldump -p --user=username dbname tableName1 tableName2 > /path/to/drupal_$NOW.sql
And schedule that to run every 15 minutes by adding the following cron job:
*/15 * * * * /path/to/drupal_logs.sh > /dev/null
And if you're worried about the log tables in the database getting too large, you can follow the mysqldump command in drupal_logs.sh with a TRUNCATE of the exported tables, as sketched below.
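Putting that together, drupal_logs.sh might look something like this; a minimal sketch, where watchdog is Drupal's default log table and the credentials and paths are placeholders:
#!/bin/sh
# Dump the Drupal log table to a timestamped file...
NOW=$(date +"%Y%m%d")_$(date +"%H%M.%S")
mysqldump --user=username --password=secret dbname watchdog > /path/to/drupal_$NOW.sql
# ...then empty it so the database stays small.
mysql --user=username --password=secret dbname -e "TRUNCATE TABLE watchdog;"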

Related

SQLite3 database or disk is full on csv imports

This issue has been discussed on a number of threads, but none of the proposals seem to apply to my case.
I have a very large SQLite database (4TB). I am trying to import CSV files from the terminal:
sqlite3 -csv -separator " " /data/mydb.db ".import '|cat *.csv' mytable"
I intermittently receive SQLite3 database or disk is full errors. Re-running the command after an error usually succeeds.
Some notes:
/data has 3.2TB free.
/tmp has 1.8TB free.
*.csv takes up approximately 802GB.
Both /tmp and /data use ext4, which has a maximum file size of 16TB.
The only process accessing the database is the one mentioned above.
PRAGMA integrity_check returns ok.
Tested on both:
sqlite3 --version: 3.38.1 2022-03-12 13:37:29 38c210fdd258658321c85ec9c01a072fda3ada94540e3239d29b34dc547a8cbc
sqlite3 --version: 3.31.1 2020-01-27 19:55:54 3bfa9cc97da10598521b342961df8f5f68c7388fa117345eeb516eaa837balt1
OS: Ubuntu 20.04
Any thoughts on what could be happening?
(Unless there is an informed reason why I am exceeding the limits of SQLite, I would prefer to avoid suggestions that I move to a client/server RDBMS.)
I didn't figure it out myself, but someone else did. I'm pretty sure this will "fix it" until you reach roughly 8TB (the default 4KB page size times the 2,147,483,647-page cap):
sqlite3 ... "PRAGMA main.max_page_count=2147483647; .import '|cat *.csv' mytable"
However, the invocation
sqlite3 ... "PRAGMA main.journal_mode=DELETE; PRAGMA main.max_page_count=2147483647; PRAGMA main.page_size=65536; VACUUM; .import '|cat *.csv' mytable;"
should allow the db to grow to roughly 140TB (2,147,483,647 pages at 64KB each), but the VACUUM, which is needed to apply the new page_size, requires a lot of free space to run and will probably take a long time =/
The good news is that you only need to run that once and it should be a permanent change to your db; your next invocation only needs sqlite3 ... ".import '|cat *.csv' mytable;"
Notably, this will probably break again around ~140TB.
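If you want to see where the cap currently sits, it can help to check the relevant pragmas before and after the change; a quick check, using the database path from the question (note that max_page_count is reported per connection):
sqlite3 /data/mydb.db "PRAGMA page_size; PRAGMA page_count; PRAGMA max_page_count;"
Multiplying page_size by max_page_count gives the size ceiling in bytes that the current settings allow.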

Default Authorization Required response (401) - taskscheduleR

I'm trying to run a daily taskscheduleR script that pulls data into R from an API. It works when I run it as a one-time task, but for some reason it won't work as a daily task. I keep getting the following error in the log file:
<HEAD><TITLE>Authorization Required</TITLE></HEAD>
<BODY BGCOLOR=white FGCOLOR=black>
<H1>Authorization Required</H1><HR>
<FONT FACE=Helvetica,Arial>
<B>Description: Authorization is required for access to this proxy</B>
</FONT>
<HR>
<!-- default Authorization Required response (401) -->
Here's the code:
library(httr)
library(jsonlite)
library(tidyverse)
library(taskscheduleR)
# Url to feed into GET function
url<-"https://urldefense.com/v3/__http://files.airnowtech.org/airnow/yesterday/daily_data_v2.dat__;!!J30X0ZrnC1oQtbA!Yh5wIss-mzbpMRXugALJoWEKLKcg1-7VmERQwcx2ESK0PZpM5NWNml5s9MVgwHr5LD1i5w$ "
# Sends request to AirNow API to get access to data
my_raw_result<-httr::GET(url)
# Retrieve contents of a request
my_content<-httr::content(my_raw_result,as="text")
# Parse content into a dataframe
my_content_from_delim <- my_content %>% textConnection %>% readLines %>% read.delim(text = ., sep = "|",header = FALSE)
head(my_content_from_delim)
I have been using the Rstudio add-in to create the task.
If you are trying to access this on a work computer, you may need to allow downloads from the url link. Open a browser, paste that url, click 'allow downloads', run the script.
I am not sure whether the solution I offer will work for you, but it won't hurt to try. If the problem is with the task scheduler, the following solution might work. However, if the problem is an authorization issue, you may need help from IT at your workplace.
For the task scheduler issue, you can send your script directly to the Windows Task Scheduler with a batch file and create a schedule for it.
To make it easy, you can use the following code. First, create a new folder and copy your R script into it. For the code below to work, name your R script My Script.r.
Then, in the same folder, create a batch file with the following code. To create a batch file, copy the code into Notepad and save it as Run R Script.bat in the same folder.
cd %~dp0
"C:\PROGRA~1\R\R-40~1.0\bin\R.exe" -e "setwd(%~dp0)" CMD BATCH --vanilla --slave "%~dp0My Script.r" Log.txt
Here, cd %~dp0 sets the batch file's working directory to the folder it is run from. "C:\PROGRA~1\R\R-40~1.0\bin\R.exe" specifies your R.exe; you may need to change the path to match your R installation.
-e "setwd(%~dp0)" sets R's working directory to the same folder in which the batch file and script are run.
"%~dp0My Script.r" Log.txt defines the R script's path and the log file for the batch run.
Second, to create a daily schedule, we are going to create another batch file. Copy the following code into Notepad and save it as Daily Schedule.bat.
When you double-click Daily Schedule.bat, it creates a daily task that runs for the first time one minute later and then repeats every day at that same time.
@echo off
for /F "tokens=1*" %%A in ('
powershell -NoP -C "(Get-Date).AddMinutes(1).ToString('MM/dd/yyyy HH:mm:ss')"
') do (
Set "MyDate=%%A"
set "MyTime=%%B"
)
:: Change directory to the folder containing this batch file
cd %~dp0
::Create Task
SchTasks /Create /SC DAILY /TN "MY R TASK" /TR "%~dp0Run R Script.bat" /sd %MyDate% /st %MyTime%
This code will create a task called "MY R TASK". To check whether it has been scheduled, run taskschd.msc from the Windows Run prompt; this opens the Task Scheduler, where you can find your task. If you want to modify or delete it, you can do that from the Task Scheduler as well; it has a nice GUI and is easy to navigate.
For more details about the Task scheduler syntax, see the following link
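If you prefer to stay on the command line, you can also verify or remove the scheduled task with SchTasks itself; a quick sketch, using the task name created above:
SchTasks /Query /TN "MY R TASK" /V /FO LIST
SchTasks /Delete /TN "MY R TASK" /F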
If you have any questions, let me know.

fast export unexplained failure

I have roughly 14 million records that I am attempting to export from a Teradata table to file using a fast export connection object.
There is no size limit for fast export files on our Linux system, and there is 1.2 TB of available space in the target directory.
The session fails, and gives the following errors:
READER_2_1_1 FEXP_87011 Process [16022] exited with status [12]
SDKS_38200 Partition-level [SOURCE_TABLE_NAME]: Plug-in #305400 failed in deinit()
I googled the error message, and found this post:
Here
I followed the recommendations in the post to delete the .out file in the temp directory, delete the files that were partially written in the target directory, drop the error table, and delete the log file. This did not fix the issue, and the session still fails with the same error messages.
Try using the TPT Export plug-in instead. You can also try executing this FastExport using BTEQ scripts directly in your Unix environment.

How can I check whether my cron job is running or failing?

How can I tell whether my cron job ran successfully or failed in Unix? For example:
1 * * * * sh /usr/Script/Pri_Ip_Status.sh > /tmp/pri_log
Check the /tmp/pri_log file: whatever the job writes to standard output lands there, and the file is overwritten on each run, so look for error messages in it.
Redirecting like this collects the job's logs in one place.
Also, make sure the script sets its working directory explicitly at the start; otherwise cron runs every script with the home directory as the current directory, no matter where the script itself lives. A sketch of a more defensive crontab entry follows below.
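A more robust crontab entry also captures errors and records the exit status; a minimal sketch, using the script path from the question (the /tmp/pri_status file is just an illustrative name):
# Append stdout and stderr to the log, then record the exit code of each run
1 * * * * sh /usr/Script/Pri_Ip_Status.sh >> /tmp/pri_log 2>&1; echo "$(date) exit=$?" >> /tmp/pri_status
You can also confirm that cron actually launched the job by checking the system log, e.g. grep CRON /var/log/syslog on Debian/Ubuntu.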

How to find which process generates the most read/write disk operations

My cloud server has started generating a lot of disk read/write operations. I want to find a script that produces a top-style report per process (process name | TOTAL | READ | WRITE).
You can use iotop to see the reads and writes of each process in a top-like interface.
Another way is to look at the /proc/[PID]/io files.
Example:
$ cat /proc/1944/io
read_bytes: 17961091072
write_bytes: 8192000
cancelled_write_bytes: 32768
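If you want the requested one-shot report (process name | TOTAL | READ | WRITE), a minimal sketch that sums these counters across /proc is below; note that reading /proc/[PID]/io for other users' processes requires root:
#!/bin/sh
# Top 20 processes by total bytes read + written since they started
for p in /proc/[0-9]*; do
    comm=$(cat "$p/comm" 2>/dev/null) || continue
    io=$(cat "$p/io" 2>/dev/null) || continue
    r=$(printf '%s\n' "$io" | awk '/^read_bytes/ {print $2}')
    w=$(printf '%s\n' "$io" | awk '/^write_bytes/ {print $2}')
    echo "$comm $((r + w)) $r $w"
done | sort -k2 -rn | head -20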
There's a monitor much like top available: Iotop.
Since you're using Debian Linux, you can simply retrieve it via APT:
apt-get install iotop
Done.
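Once iotop is installed, a handy invocation for tracking down a heavy writer is:
# -o: only show processes actually doing I/O, -a: accumulate totals, -P: show processes rather than threads
iotop -oaP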
