Any method for going through large log files? - unix

// Java programmers: when I say method, I mean a 'way of doing things'...
Hello All,
I'm writing a log miner script to monitor various log files at my company. It's written in Perl, though I have access to Python and, if I REALLY need to, C (though my company doesn't like binary files). It needs to go through the last 24 hours of logs, take each log code, and check whether we should ignore it or email the appropriate people (me). The script would run as a cron job on Solaris servers. Here is what I had in mind (this is only pseudo-ish... and badly written pseudo):
sub main {
    my $today     = Get_Current_Date();
    my $yesterday = Subtract_One_Day($today);
    `grep $yesterday '/path/to/log' > /tmp/log`;       # Get logs from the previous day
    `awk '{print \$X}' /tmp/log > /tmp/log_codes`;     # Pull out the log code (column X)
    SubRoutine_to_Compare_Log_Codes('/tmp/log_codes');
}
Another thought was to load the log file into memory and read it there... that is all fine and dandy except for two small problems.
These servers are production servers and serve a couple million customers...
The Log files average 3.3GB (which are logs for about two days)
So not only would grep take a while to go through each file, but it would use up CPU and memory that need to be used elsewhere. And loading a 3.3GB file into memory is not one of the wisest ideas (at least IMHO). I had a crazy idea involving assembly code and memory locations, but I don't know SPARC assembly, so flush that idea.
Anyone have any suggestions?
Thanks for reading this far =)

Possible solutions: 1) have the system start a new log file every midnight (on Solaris, logadm can handle the rotation) -- this way you could mine the finite-size log file from the previous day at a reduced priority (e.g. under nice); and 2) modify the logging system so that it automatically extracts certain messages for further processing on the fly.

Related

Download CSV in Shiny app every 24 hours & display download time

I have a CSV that I want to download. I do not want it to download every time a user joins or uses the app.
I want to run the code every 24 hours and also display any of: 1) a timer since the last download, 2) a timer until the next download, or 3) the timestamp of the last download.
Below is what I have right now, which works, but will probably cause unnecessary downloads. Is doing something with invalidateLater going to work, or is there a better way?
CSV.Path <- "https://oracleselixir-downloadable-match-data.s3-us-west-2.amazonaws.com/2021_LoL_esports_match_data_from_OraclesElixir_20210404.csv"
download.file(CSV.Path, "lol2021")
lol2021 <- read.csv("lol2021")
There are two ways to approach this:
Check to see if it should be downloaded when the app starts; if the file is more recent than 24h, do not re-download it. This can be resolved fairly easily with:
fileage <- difftime(Sys.time(), file.info("lol2021")$mtime, units = "days")
if (is.na(fileage) || fileage > 1) {
CSV.Path <- "https://oracleselixir-downloadable-match-data.s3-us-west-2.amazonaws.com/2021_LoL_esports_match_data_from_OraclesElixir_20210404.csv"
download.file(CSV.Path, "lol2021")
}
lol2021 <- read.csv("lol2021")
(The is.na is there in case the file does not exist.)
One complicating factor with this is that two simultaneous users might attempt to download it at the same time. There should likely be some mutex file-access control here if that is a possibility.
Make sure this script is run every 24h, regardless of what users are or are not using the app. On what type of server are you running this app? Something like shiny-server does not do cron-like running, I believe, and you might not be able to guarantee that the app is "awake" every 24h. RStudio Connect does allow scheduled jobs, which might be a consideration for you.
Lacking that, if you have good access to the server, you might just add it as a cron job using Rscript or similar to download and overwrite the file.
Note about mutex file access: many networked filesystems (common in cloud and server architectures) do not guarantee file locking. A common technique is to download into a temporary file and then move (or copy) this temp file into the "real" file name in one step. This guards against the possibility that one process is reading from the file while another process is writing to it ... partial-file reads will be a frustrating and difficult-to-reproduce bug.
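For illustration, here is a minimal sketch of that temp-file-then-rename pattern in R. The destination name "lol2021" comes from the question; the refresh_csv() helper and the 24-hour check are assumptions of mine:
# Download into a temp file in the same directory, then move it into place
# in one step so a reader never sees a partially written "lol2021".
refresh_csv <- function(url, dest = "lol2021") {
  age <- difftime(Sys.time(), file.info(dest)$mtime, units = "days")
  if (is.na(age) || age > 1) {               # missing file or older than 24h
    tmp <- tempfile(tmpdir = dirname(dest))  # same directory, so same filesystem
    ok  <- try(download.file(url, tmp), silent = TRUE)
    if (!inherits(ok, "try-error") && ok == 0) {
      file.rename(tmp, dest)                 # swap the new file in, in one step
    } else {
      unlink(tmp)                            # clean up a failed download
    }
  }
  read.csv(dest)
}
The same function could also be called from a cron-driven Rscript, as suggested above. Keeping the temp file in the same directory keeps the rename on one filesystem, which narrows (but does not eliminate) the partial-read window on networked filesystems.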

Problem: C++ application changes directory permissions on shutdown

I'm in the (ever-going) process of diagnosing a baffling problem with an application we use at work.
First, some notes about this application:
Required to run as the root user
Runs on the Solaris 10 operating system
Compiled for C++14
Normal shutdown is conducted by receiving SIGTERM
Writes a log file (explicitly sets permissions to 660) to a data directory (with default 770 permissions)
The application runs fine and does everything it's supposed to do, up until the point it terminates. Upon termination, the application is changing the permissions on the data directory from 770 to 660.
My coworkers are as baffled as I am. Even our system administrator doesn't understand why this is happening.
Things I've tried:
Print statements: The application reports the directory permissions are 770 until the exit or return statements
Check the logging: The logging mechanism is shared with several other applications, none of which have this issue
Running as myself: The directory's permissions are not changed on termination
Change umask to 027: The directory's permissions are still changed to 660
Check system logs: The sudo and messages logs do not show any calls to chmod for the directory (except those made to change the permissions back)
Due to the nature of the application, I cannot provide any of the code here for review/inspection. Further, many of the standard diagnostics tools are unavailable on the system in question.
However, I'm hopeful the gurus here can provide insight into what might be causing this problem, where to look going forward, or (ideally) how to fix it.
You can use the following simple DTrace script to get a stack trace from any process that calls chmod() (or any of its variants, such as fchmod() or fchmodat()):
#!/usr/sbin/dtrace -s
syscall::fchmodat:entry
{
printf( "\nExecname: %s\n", execname );
ustack();
}
You can filter by execname to only print chmod call stacks from your executable with
#!/usr/sbin/dtrace -s
syscall::fchmodat:entry
/ execname == "yourExecName" /
{
ustack();
}
You can print more or fewer stack frames with, for example, ustack( 10 ); to print 10 stack frames. If you want longer or shorter function names in the stack trace, you can specify the string length with ustack( 10, 50 ); to print 10 stack frames with each function name printing up to 50 characters.
If your binary has been completely stripped of symbol names you may not get function names, only addresses.
As it's a C++ binary, you might have to demangle the function names (e.g. with c++filt).
Once you get a stack trace, you can start working on what exactly is happening.

Why isn't Carbon writing Whisper data points as per updated storage-schema retention?

My original Carbon storage-schema config was set to 10s:1w, 60s:1y and was working fine for months. I've recently updated it to 1s:7d, 10s:30d, 60s:1y. I've resized all my Whisper files to reflect the new retention schema using the following bit of bash:
collectd_dir="/opt/graphite/storage/whisper/collectd/"
retention="1s:7d 1m:30d 15m:1y"
find $collectd_dir -type f -name '*.wsp' | parallel whisper-resize.py \
--nobackup {} $retention \;
I've confirmed that they've been updated using whisper-info.py with the correct retention and data points. I've also confirmed that the storage-schema is valid using a storage-schema validation script.
The carbon-cache{1..8}, carbon-relay, carbon-aggregator, and collectd services have been stopped before the whisper resizing, then started once the resizing was complete.
However, when checking a Grafana dashboard, I'm seeing empty graphs with correct per-second data points (but no data) on the collectd plugin charts; and the graphs that are providing data show data points every 10s (the old retention) instead of every 1s.
The /var/log/carbon/console.log is looking good, and the collectd whisper files all have carbon user access, so no permission denied issues when writing.
When running an ngrep on port 2003 on the graphite host, I'm seeing connections to the relay, along with metrics being sent. Those metrics are then getting relayed to a pool of 8 caches to their pickle port.
Has anyone else experienced similar issues, or can possibly help me diagnose the issue further? Have I missed something here?
So it took me a little while to figure this out. It had nothing to do with the local_settings.py file, as some of the older responses suggested; it was the Interval setting in collectd.conf.
A lot of the older responses mentioned that you needed to include 'Interval 1' inside each Plugin block. That would have been nice for the per-metric control it gives, but it created config errors in my logs and broke the metrics. Setting 'Interval 1' at the top level of the config resolved my issues.

Why is Symfony3 so slow?

I installed the Symfony3 framework-standard-edition. When I open the home page (app.php, prod environment), it takes 300-400 ms to load.
This is my profiler information:
I am also using PHP 7.
Why does it take so long?
You can try to optimize Zend OPCache.
Here are some recommended settings
opcache.revalidate_freq
Simply put, this is how often (in seconds) the code cache should expire and check whether your code has changed. 0 means it checks your PHP code on every single request (which adds lots of stat syscalls). Set it to 0 in your development environment. In production it doesn't matter, because of the next setting.
opcache.validate_timestamps
When this is enabled, PHP will check the file timestamp per your opcache.revalidate_freq value.
When it's disabled, opcache.revalidate_freq is ignored and PHP files are NEVER checked for updated code. So, if you modify your code, the changes won't actually run until you restart or reload PHP (you can force a reload with kill -SIGUSR2).
Yes, this is a pain in the ass, but you should use it. Why? While you're updating or deploying code, new code files can get mixed with old ones and the results are unknown. It's unsafe as hell.
opcache.max_accelerated_files
Controls how many PHP files, at most, can be held in memory at once. It's important that your project has FEWER FILES than whatever you set this to. For a codebase of ~6000 files, I use the prime number 8000 for max_accelerated_files.
You can run find . -type f -print | grep php | wc -l to quickly calculate the number of files in your codebase.
opcache.memory_consumption
The default is 64MB. You can use the function opcache_get_status() to tell how much memory OPcache is consuming and whether you need to increase the amount.
opcache.interned_strings_buffer
A pretty neat setting with basically zero documentation. PHP uses a technique called string interning to improve performance: for example, if you have the string "foobar" 1000 times in your code, internally PHP will store one immutable copy of the string and just use a pointer to it for the other 999 uses. Cool.
This setting takes it to the next level: instead of having a pool of these immutable strings for each SINGLE php-fpm process, this setting shares the pool across ALL of your php-fpm processes. It saves memory and improves performance, especially in big applications.
The value is set in megabytes, so set it to "16" for 16MB. The default is low, 4MB.
opcache.fast_shutdown
Another interesting setting with no useful documentation. "Allows for faster shutdown".
Oh okay. Like that helps me. What this actually does is provide a faster mechanism for calling the destructors in your code at the end of a single request, so the response is sent sooner and PHP workers are recycled and ready for the next incoming request faster.
Set it to 1 and turn it on.
opcache.enable=1
opcache.memory_consumption=256
opcache.interned_strings_buffer=16
opcache.max_accelerated_files=8000
opcache.validate_timestamps=0
opcache.revalidate_freq=0
opcache.fast_shutdown=1
I hope this helps improve your performance.
[EDIT]
You might also want to look at this answer:
Are Doctrine relations affecting application performance?
TheMrbikus, try some optimization with the following elements:
Use APC
Use Bootstrap files
Reference: http://symfony.com/doc/current/performance.html
Use the PHP 7 OPcache
Use PHP-FPM with Apache.
The e-mail sending process and form rendering operations may also slow things down; create a blank test controller to compare against.

have R halt the EC2 machine it's running on

I have a few work flows where I would like R to halt the Linux machine it's running on after completion of a script. I can think of two similar ways to do this:
run R as root and then call system("halt")
run R from a root shell script (could run the R script as any user) then have the shell script run halt after the R bit completes.
Are there other easy ways of doing this?
The use case here is for scripts running on AWS, where I would like the instance to stop after the script completes so that I don't get charged for machine time after the job runs. The instance I use for data analysis is EBS-backed, so I don't want to terminate it, simply suspend it. Issuing a halt command from inside the instance has the same effect as a stop/suspend from the AWS console.
I'm impressed that works. (For anyone else surprised that an instance can stop itself, see notes 1 & 2.)
You can also try "sudo halt", so you don't need to run R as the root user, as long as the account running R is capable of running sudo. This is pretty common on a lot of AMIs on EC2.
Be careful about assuming R will quit cleanly - believe it or not, one can crash R. Running the halt command inside R means that if R crashes, it never reaches that call; calling it from within another script can be dangerous, too. It may be better to have a separate script that watches the R PID and, once that PID is no longer active, stops the instance: if you know Linux well, grab the PID when you start R, pass it to another script that checks ps (say, every second), and stop the instance once the PID is no longer running.
I think a better solution is to use the EC2 API tools (see: http://docs.amazonwebservices.com/AWSEC2/latest/APIReference/ for documentation) to terminate OR stop instances. There's a difference between the two of these, and it matters if your instance is EBS backed or S3 backed. You needn't run as root in order to terminate the instance - the fact that you have the private key and certificate shows Amazon that you're the BOSS, way above the hoi polloi who merely have root access on your instance.
Because these credentials can be used for mischief, be careful about running the API tools from a given server: you'll need your certificate and private key on that server, which is a bad idea in the event that you have a security problem. It would be better to send a message to a master server and have it shut down the instance. If you have messaging set up in any way between instances, this can do all the work for you.
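As a hedged sketch only: the modern aws CLI can perform the same stop from inside R, assuming the CLI is installed with credentials available (ideally via an IAM instance role rather than keys stored on the box) and the plain instance-metadata endpoint is enabled:
# Look up this instance's ID from the metadata service, then ask EC2 to stop it.
instance_id <- system(
  "curl -s http://169.254.169.254/latest/meta-data/instance-id",
  intern = TRUE
)
system(paste("aws ec2 stop-instances --instance-ids", instance_id))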
Note 1: Eric Hammond reports that halt will only suspend an EBS instance, so you still have storage fees. If you happen to start a lot of such instances, this can clutter things up. Your original question seems unclear about whether you mean to terminate or stop an instance. He has other good advice on this page.
Note 2: A short thread on the EC2 developers forum gives advice for Linux & Windows users.
Note 3: EBS instances are billed for partial hours, even when restarted. (See this thread from the developer forum.) Having an auto-suspend fire close to the hour mark can be useful, assuming the R process is no longer working, in case one might re-task that instance (i.e. to save by not having to restart it). Other useful tools to consider: setTimeLimit and setSessionTimeLimit, and various checkpointing tools (I have a Q that mentions a couple). Using an auto-kill is useful if one has potentially badly behaved code.
Note 4: I recently learned of the shutdown command in package fun. This is multi-platform. See this blog post for commentary, and code is here. Dangerous stuff, but it could be useful if you want to adapt to Windows. I haven't tried it, though.
Update 1. Three more ideas:
You could use .Last() and runLast = TRUE for q() and quit(), which could shut down the instance (see the sketch after this list).
If using littler or a script that invokes the script via Rscript, the same command line functions could be used.
My favorite package of today, tcltk2, has a neat timer mechanism called tclTaskSchedule() that can be used to schedule the execution of an expression. You could then go crazy executing stuff just before an hourly interval has elapsed.
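As a rough sketch of the first idea in the list above (assuming the account running R can run /sbin/halt without a password, e.g. via the sudoers entry shown at the end of this thread):
# Register a cleanup hook that halts the machine when the R session ends.
.Last <- function() {
  system("sudo /sbin/halt")
}
# ... long-running analysis goes here ...
quit(save = "no", runLast = TRUE)  # runLast = TRUE ensures .Last() is executed
The caveat from earlier still applies: if R crashes hard, .Last() never runs, so this only covers normal termination.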
system("echo 'rootpassword' | sudo -S halt")
However, the downside is having your root password in plain text in the script.
AFAIK those ways you mentioned are the only ones. In any case the script will have to run as root to be able to shut down the machine (if you find a way to do it without root that's possibly an exploit). You ask for an easier way but system("halt") is just an additional line at the end of your script.
sudo is an option -- it allows you to run certain commands without prompting for any password. Just put something like this in /etc/sudoers
<username> ALL=(ALL) PASSWD: ALL, NOPASSWD: /sbin/halt
(of course replacing <username> with the name of the user running R) and system('sudo halt') should just work.
