Symfony logging with Monolog, confused about STDERR - symfony

I am trying to align my logging with the best practice of using STDERR.
So, I would like to understand what happens with the logs sent to STDERR.
Symfony official docs (https://symfony.com/doc/current/logging.html):
In the prod environment, logs are written to STDERR PHP stream, which
works best in modern containerized applications deployed to servers
without disk write permissions.
If you prefer to store production logs in a file, set the path of your
log handler(s) to the path of the file to use (e.g. var/log/prod.log).
This time I want to follow the STDERR stream option.
When I was writing to a specific file, I knew exactly where to look for that file, open it and check the logged messages.
But with STDERR, I don't know where to look for my logs.
So, using monolog, I have the configuration:
monolog:
    handlers:
        main:
            type: fingers_crossed
            action_level: error
            handler: nested
            excluded_http_codes: [404, 405]
        nested:
            type: stream
            path: "php://stderr"
            level: debug
Suppose next morning I want to check the logs. Where would I look?

Several hours of reading docs later, my understanding is as follows:
First, STDERR is preferred over STDOUT for errors because it is not buffered (STDOUT may gather all output and wait for the script to end), so errors are pushed to the STDERR stream immediately. It also keeps normal output from getting mixed with errors.
Secondly, the immediately intuitive usage is when running a shell script, because in the terminal you directly see the STDOUT and STDERR messages (by default, both streams go to the screen), and you can redirect each stream separately.
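As a quick illustration of keeping the two streams apart (the script name is hypothetical):
php my_script.php 1> output.log 2> errors.log
Here output.log receives the normal STDOUT output, while errors.log receives only what was written to STDERR.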
But then there is the less intuitive usage of STDERR: logging for a website/API. We want to log the errors, and we want to be able to monitor errors that have already occurred, that is, to come back later and check them. Traditional practice stores errors in custom-defined log files. More modern practice recommends sending errors to STDERR. Regarding Symfony, Fabien Potencier (the creator of Symfony) says:
in production, stderr is a better option, especially when using
Docker, SymfonyCloud, lambdas, ... So, I now recommend to use
php://stderr
(https://symfony.com/blog/logging-in-symfony-and-the-cloud).
And he further recommends using STDERR even for development.
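A minimal sketch of what that recommendation might look like for the dev environment, mirroring the handler from the question (the file location config/packages/dev/monolog.yaml is an assumption):
monolog:
    handlers:
        main:
            type: stream
            path: "php://stderr"
            level: debug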
Now, what I believe is missing from the picture (at least for me, as a non-expert) is guidance on HOW to access and check the error logs. Okay, we send the errors to STDERR, and then? Where am I going to check the errors the next morning? I get that containerized platforms (clouds, Docker, etc.) have specific tools to easily and nicely monitor logs (tools that intercept STDERR and parse the messages in order to organize them in specific files/DBs; see the sketch below), but that's not the case on a simple server, be it a local server or shared hosting.
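For instance, on a Docker host the container runtime captures both streams, so something along these lines would show them (the container name my-app is an assumption):
docker logs my-app              # prints everything the container wrote to STDOUT/STDERR
docker logs -f my-app           # follows the stream live
docker logs --since 1h my-app   # limits the output to the last hour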
Therefore, my understanding is that sending errors to STDERR is a good standardization when:
Resorting to using a third-party tool for log monitoring (like ELK, Grail, Sentry, Rollbar etc.)
When you know exactly where your web server stores the STDERR output. For instance, if you try (I defined a new STD_ERR constant to avoid any pre-existing configuration):
// open the STDERR stream under a custom constant (avoids relying on any predefined STDERR configuration)
define('STD_ERR', fopen('php://stderr', 'wb'));
fputs(STD_ERR, "ABC error message.");
you can find the "ABC error message" at:
XAMPP Apache default (Windows):
..\xampp\apache\logs\error.log
Symfony5 server (Windows):
C:\Users\your_user\.symfony5\log\ [in the most recent folder, as the logs rotate]
Symfony server (Linux):
/home/your_user/.symfony/log/ [in the most recent folder, as the logs rotate]
For the Symfony server, you can actually see the log paths when starting the server, or via the command "symfony server:log".
One immediate advantage is that these STDERR logs are stored outside of the app folders, so you do not need to maintain extra writable folders or deal with permissions. Of course, when developing/hosting multiple sites/apps, you need to configure the error log (the STDERR storage) location per app (in Apache that would be inside each <VirtualHost> conf; with the Symfony server, I am not sure). Personally, without a third-party tool for monitoring logs, I would stick with custom-defined log files (no STDERR), but feel free to contradict me.
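A hedged sketch of the per-site Apache configuration mentioned above (server name and paths are assumptions, and this matches the mod_php setup described for XAMPP, not necessarily a php-fpm setup):
<VirtualHost *:80>
    ServerName example-app.local
    DocumentRoot /var/www/example-app/public
    # with mod_php, whatever the app writes to php://stderr lands in this per-site error log
    ErrorLog /var/log/apache2/example-app_error.log
</VirtualHost>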

Related

airflow logs: break long logs into smaller multiple files

I see that Airflow logs are stored at
base_log_folder/dag_id/task_id/date_time/1.log
i.e.:
base_log_folder/dag_id={dag_id}/run_id={run_id}/task_id={task_id}/attempt={try_number}.log
Sometimes my logs are huge and I know it's not a good idea to check them from the web UI, because Chrome can't handle that much log data.
I have access to the server and can check the logs.
So how can I break the logs into smaller files, i.e.:
{try_number}_1.log
{try_number}_2.log
{try_number}_3.log
...
I also noticed that the log file {try_number}.log is only created once the task has completed.
While the task is running I can check the logs in the web UI, but I don't see any file in the corresponding log folder.
So I need two things for logging on the server side:
break large log files into smaller files
see the log file live while the task is running, not only after the task is completed
In Airflow 2.4.0 there is an option to view the full logs or only the first fragment, so huge logs are not loaded automatically.
Starting with Airflow 2.5.0, the web UI also auto-tails logs (PR).
Airflow does show live logs. If you set up, for example, a Sensor task that pokes a resource, you will see the poking attempts in the log while the task is running. It's important to note that there are local logs and remote logs (docs):
In the Airflow UI, remote logs take precedence over local logs when remote logging is enabled. If remote logs can not be found or accessed, local logs will be displayed. Note that logs are only sent to remote storage once a task is complete (including failure). In other words, remote logs for running tasks are unavailable (but local logs are available).
Huge logs are often a sign of not using log levels. If you have entries that are only relevant for debugging, emit them at DEBUG rather than INFO; that way you can control the size of the log displayed in the UI with the AIRFLOW__LOGGING__LOGGING_LEVEL variable, as sketched below.
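For example (an assumed shell export; the same value can also be set in airflow.cfg):
export AIRFLOW__LOGGING__LOGGING_LEVEL=INFO   # keep DEBUG-only entries out of the displayed logs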

Google Cloud Composer (Apache Airflow) cannot access log files

I'm running a DAG in Google Cloud Composer (hosted Airflow) which runs fine in Airflow locally. All it does is print "Hello World". However, when I run it through Cloud Composer I receive the error:
*** Log file does not exist: /home/airflow/gcs/logs/matts_custom_dag/main_test/2020-04-20T23:46:53.652833+00:00/2.log
*** Fetching from: http://airflow-worker-d775d7cdd-tmzj9:8793/log/matts_custom_dag/main_test/2020-04-20T23:46:53.652833+00:00/2.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-d775d7cdd-tmzj9', port=8793): Max retries exceeded with url: /log/matts_custom_dag/main_test/2020-04-20T23:46:53.652833+00:00/2.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8825920160>: Failed to establish a new connection: [Errno -2] Name or service not known',))
I've also tried making the DAG add data to a database, and it actually succeeds 50% of the time. However, it always returns this error message (and no other print statements or logs). Any help on why this might be happening would be much appreciated.
We also faced the same issue, then raised a support ticket with GCP and got the following reply.
The message is related to the latency of syncing logs from Airflow workers to WebServer, it takes at least some minutes (depending on the number of objects and their size)
The total log size seems not large but it’s enough to noticeably slow down synchronization, hence, we recommend cleanup/archive the logs
Basically we recommend relying on Stackdriver logs instead, because of latency due to the design of this sync
I hope this will help you solve the problem.
I had the same problem after upgrading Google Composer from 1.10.3 to 1.10.6.
I can see in my logs that Airflow is trying to get the logs from a bucket whose name ends with -tenant, while the bucket in my account ends with -bucket.
In the configuration, I can see something weird too.
## airflow.cfg
[core]
remote_base_log_folder = gs://us-east1-dada-airflow-xxxxx-bucket/logs
## also in the running configuration says
core remote_base_log_folder gs://us-east1-dada-airflow-xxxxx-tenant/logs env var
I wrote to google support and they said the team is working on a fix.
EDIT:
I've been accessing my logs with gsutil, replacing the bucket name suffix with -bucket:
gsutil cat gs://us-east1-dada-airflow-xxxxx-bucket/logs/...../5.logs
I faced the same situation on multiple occasions.
As soon as a job finished, looking at its log in the Airflow web UI used to give me the same error, but when I checked the same log in the UI a minute or two later, I could see it properly.
As per the above answers, it's a sync issue between the webserver and the worker node.
In general, the issue described here tends to be sporadic.
In certain situations, what can help is setting default_task_retries to a value that allows a task to be retried at least once, as sketched below.
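A hedged sketch of that setting, following the airflow.cfg style shown earlier (the value 1 is illustrative; retries can also be set per DAG or per task):
## airflow.cfg
[core]
default_task_retries = 1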
This issue is resolved at least since Airflow version: 1.10.10+composer.

ERROR: [_parse_http_data] invalid HTTP method in shiny app

When I load my docker shiny app domain name in the browser, it crashes (greys out) and I get this "ERROR: [_parse_http_data] invalid HTTP method".
I have developed a web application that consists of a Shiny app (with a login feature connected to an RMySQL database), a website and a MariaDB database. I put them together in a docker-compose file and tested it on my local computer, and it works fine. I then deployed them to a Kubernetes cluster in GCE, which was also successful. I used Cloudflare to install an SSL certificate for the Shiny app domain (i.e. trnddaapp.com). Now when I load the Shiny app domain in the browser, it appends the https and loads the app successfully, but after about a minute it crashes (greys out). I loaded the Shiny app external IP with http and this doesn't crash.
The closest solution I have come to is https://github.com/rstudio/shiny-server/issues/392, but there doesn't seem to be any other solution to my problem. I would be grateful if anyone could help me resolve this problem.
This is the error message I get when I check with kubectl logs [app pod name]:
ERROR: [_parse_http_data] invalid HTTP method
ERROR: [_parse_http_data] invalid HTTP method
ERROR: [_parse_http_data] invalid HTTP method
I expect the app not to crash when the shiny app domain (trnddaapp.com) is appended with the https.
Let's start with an analysis of the error message. It says:
[_parse_http_data]
So we know that your app is receiving something, but it doesn't understand what it is (it may be malformed HTTP/1.0 or HTTP/1.1, or even binary data). Then we have an
invalid HTTP method
Now we are sure it is not an HTTP/1.x call but a stream of unrecognized data.
We also know it is not the instance itself, since it deploys and delivers the service; something inside it is breaking.
There are a few things that may be happening. Since the app runs on your local machine (where, I assume, it has access to more resources, especially memory), it may be an issue of resource allocation: once run in a container, it could exhaust its allocated resources and break (perhaps a library called at runtime that uses a chunk of memory?). We won't be sure unless we can debug it inside the container, so could you add a debug library that records your requests, to see whether it parses all of them, at what point it stops, and why? I know a person from RStudio created a httpuv branch that logs every request; it can be installed as follows:
devtools::install_github('rstudio/httpuv#wch-print-req')
After that, maybe share the output so we can see why the application is behaving like that and killing its own service.
Thanks in advance; hopefully with those logs we will be able to shed more light on this matter.
Thanks once again!
-JP

.net output in Docker logs

I'm trying to get log output (Console.WriteLine(..)) to show up in my Docker logs, but to no avail.
I've tried:
Console.WriteLine(..)
Trace.WriteLine(..)
Flushing the console, flushing the trace.
I can see these outputs in a VS output window when I'm debugging, so they go somewhere.
I'm on Windows containers, using microsoft/aspnet:4.7.1-windowsservercore-1709 and .NET 4.7.
These are the logs I get on container start
docker logs -f exportapi
ERROR ( message:Cannot find requested collection element. )
Applied configuration changes to section "system.applicationHost/applicationPools" for "MACHINE/WEBROOT/APPHOST" at configuration commit path "MACHINE/WEBROOT/APPHOST"
You have many good lateral options, like self-contained/server-contained executables (e.g. .NET Core using microsoft/dotnet:runtime would proxy Console.WriteLine by default from the dotnet new web scaffold). Zero-configuration STDOUT logging has never been a common approach on IIS, but these modern options adopt it as best practice (logging should be a transparent backing service).
If you want or need a chain of three programs/assemblies to get your web service up (ServiceMonitor, W3SVC, and finally your assembly), then you need something like this: https://blog.sixeyed.com/relay-iis-log-entries-to-read-them-in-docker/
Overriding the entrypoint to tail more logs than the image does by default is unfortunately a common hack (not just in Microsoft land). So, in your case, I believe you need at least a trace listener config to emit Trace.WriteLine (see the sketch below), and then the above approach to relay it: https://learn.microsoft.com/en-us/dotnet/framework/debug-trace-profile/how-to-create-and-initialize-trace-listeners
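A hedged sketch of such a trace listener in web.config (the listener name is arbitrary; this simply routes Trace.WriteLine output to the console so the relay approach above can pick it up):
<configuration>
  <system.diagnostics>
    <trace autoflush="true">
      <listeners>
        <!-- send Trace.WriteLine(..) output to the console -->
        <add name="console" type="System.Diagnostics.ConsoleTraceListener" />
      </listeners>
    </trace>
  </system.diagnostics>
</configuration>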

Spring integration sftp move remote file issue

I'm using the inbound-channel-adapter from Spring Integration to retrieve files over SFTP from a remote server. Everything works fine.
But I have an additional requirement: after a file is received on the local side, that file needs to moved to a "send" directory on the remote server.
The "SFTP Outbound Gateway" has the appropriate method for that move action, but my problem is when to call it.
Situation: 10 files on remote server, 0 on local server
When I start my application it will receive all 10 files from the remote server and write them to my local file system. Perfect.
Situation: 1 file on remote server, 10 on local server
In this situation the remote file is received, but for every file on the local file system the receive method of the QueueChannel is also called.
Example log from one file: (file1.zip)
18:12:52.118 [task-scheduler-1] INFO o.s.i.file.FileReadingMessageSource - Created message: [[Payload File content=C:\Downloads\sftpTest\file1.zip][Headers=...]
18:12:52.119 [task-scheduler-1] DEBUG o.s.i.e.SourcePollingChannelAdapter - Poll resulted in Message: [Payload File content=C:\Downloads\sftpTest\file1.zip][Headers=...]
18:12:52.119 [task-scheduler-1] DEBUG o.s.integration.channel.QueueChannel - preSend on channel 'fromChannel', message: [Payload File content=C:\Downloads\sftpTest\file1.zip][...]
18:12:52.119 [task-scheduler-1] DEBUG o.s.integration.channel.QueueChannel - postSend (sent=true) on channel 'fromChannel', message: [Payload File content=C:\Downloads\sftpTest\file1.zip][Headers=...]
18:12:52.119 [main] DEBUG o.s.integration.channel.QueueChannel - postReceive on channel 'fromChannel', message: [Payload File content=C:\Downloads\sftpTest\file1.zip][Headers=......]
So even when the file is not physically retrieved from the remote server, the channel.receive() method still receives a message with that file as payload.
This confuses me, because I can't determine from the message if the file was already on the local file system or was just retrieved from the remote server.
I experimented using a custom org.springframework.messaging.support.ChannelInterceptorAdapter, FileFilter, ServiceActivator, but the problem still remains.
My application will process high volumes, so re-sending the received file to the required directory on the remote server is not an option. Simply trying to move the remote file for every message that is received locally is not an option either, since it would clutter the log files with exceptions about not being able to move the file; that way, a real error situation would go undetected.
One solution might be a hook in the copyFileToLocalDirectory method of org.springframework.integration.file.remote.synchronizer.AbstractInboundFileSynchronizer.
There is a check there to decide whether the remote file should be deleted, and that code path is only reached for files that were actually transferred from the remote server. My attempts to override this method and add my move behaviour did not succeed, since Spring has already instantiated the classes that handle this.
So what is the best way to achieve this? I know the problem will probably be located between my keyboard and my chair, but I've run out of options and any help is highly appreciated.
Thanks a lot,
Frank
You would probably be better off using MGET and an outbound gateway to retrieve the files instead of using the inbound adapter which, as you say, is two-stage - synchronize, and emit message(s) for file(s) in the local dir (unless you use a persistent file list filter, in which case you'll only see "new" files).
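A rough sketch of what that MGET gateway could look like with the SFTP XML namespace (channel names, the local directory, and the session factory bean are assumptions):
<int-sftp:outbound-gateway session-factory="sftpSessionFactory"
        request-channel="toMget"
        reply-channel="results"
        command="mget"
        expression="payload"
        local-directory="C:/Downloads/sftpTest"/>
<!-- send a message whose payload is the remote path pattern (e.g. "/remote/dir/*");
     the reply is the list of files that were actually retrieved -->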
