nginx gzip_static won't automatically create the .gz file if it doesn't exist?

I'm just curious...
nginx looks for the .gz file in the same directory; if it does not exist, it falls back to on-the-fly gzip and returns a compressed response (if gzip is on).
So when we turn gzip_static on, why doesn't nginx save that gzipped output as a .gz file? Is it because of chunked encoding or something else?
So I really do need to write a bash script to create/update the .gz files every time I modify the static files, right?
Thanks ^_^

You're right; as far as I can tell the two modules (gzip and gzip_static) don't really interact. Anything compressed on the fly by gzip may be cached for a short period of time, but it will not be saved for gzip_static. A bash script to automatically update the .gz files is a good idea, and if you're using source control it could be run as a post-commit hook in Git or Hg.
It's worth noting that for small files the overhead is arguably in the disk access rather than the compression... but every little bit helps.
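If it helps, here is a minimal sketch of such a script (the web root and file extensions are assumptions; adjust them to your setup). It regenerates foo.gz whenever foo has no .gz sibling or is newer than it:

#!/bin/sh
# Pre-compress static assets so gzip_static can serve the .gz versions.
ROOT=/var/www/static
find "$ROOT" -type f \( -name '*.css' -o -name '*.js' -o -name '*.html' \) |
while read -r f; do
    # (re)create foo.gz only if it is missing or older than foo
    if [ ! -e "$f.gz" ] || [ "$f" -nt "$f.gz" ]; then
        gzip -9 -c "$f" > "$f.gz"
    fi
done

Run it after editing your static files, or hook it into your deploy step / post-commit hook.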


static files shown as uncompressed even when web server has been configured for gzip compression

I have hosted my website on Amazon Elastic Beanstalk. It uses nginx as the proxy server and has gzip compression enabled. But when I run PageSpeed Insights on the site, it reports that many of my static content files need to be gzipped. Why is PageSpeed Insights not recognizing the compression? Is there something extra that needs to be done?
I think I actually found the answer.
By enabling gzip compression in nginx, you enable it only for text/html (that is the nginx default: http://nginx.org/en/docs/http/ngx_http_gzip_module.html#gzip_types).
In order to enable it for other types, you have to do so explicitly. In the case of Beanstalk, create the following file in your project
.ebextensions/gzip.config
and put the code there (make sure you keep the indentation, it is important):
files:
  /etc/nginx/conf.d/gzip.conf:
    content: |
      gzip_types application/json;
As you can see, in my case I needed to gzip JSON files; you are probably having problems with PageSpeed complaining about CSS and JS files, right? As the link above suggests, you can use a * wildcard to compress everything, but if not, just list the MIME types you need in the config, deploy it, and check PageSpeed Insights again.
Dmitry's answer only works when there is no gzip_types entry in the default config that Amazon sets for you. The default config now does include one, so you will need to write an .ebextensions config file that overwrites the entire config with a custom one. To do this you need to:
Download the default config from one of your instances via SSH. It will be in the folder /etc/nginx/conf.d and be called 00_elastic_beanstalk_proxy.conf
Create a new file in your .ebextensions folder called proxy.conf that follows this template:
files:
  "/etc/nginx/conf.d/proxy.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      # Paste the contents of the config you downloaded here
      # at this indentation level

container_commands:
  00_remove:
    command: "rm -f /tmp/deployment/config/#etc#nginx#conf.d#00_elastic_beanstalk_proxy.conf /etc/nginx/conf.d/00_elastic_beanstalk_proxy.conf"
Change the config to have the gzip_types you want.
Deploy your application
For reference this is what my working proxy.conf file looks like: https://pastebin.com/raw/KGvdsZc4
Word of Caution:
I have been assured that overwriting the entire config this way is a common use case, and while it makes changing the config easier in the future, it will break some functionality of the AWS EB web tools. In particular, anything that affects the nginx config (static file paths, gzip compression, etc.) will no longer work from the console; to make those changes you will have to edit the config directly in proxy.conf.
techwes' solution was very helpful and worked great (in my case it allowed me to add application/javascript to the gzip_types), with one modification: the file in your .ebextensions folder must be named with a .config extension, so it should be proxy.config. (I tried to add a comment to techwes' post but don't have enough rep!)
It should also be noted that if you turn off gzip in your EB environment using the AWS Console (Environment > Configuration > Software Configuration), the gzip lines are removed from 00_elastic_beanstalk_proxy.conf, so you can use a .config file to add another .conf file without having to replace the entire 00_elastic_beanstalk_proxy.conf file.
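For completeness, such a .config could look roughly like this (a sketch only; the file name and MIME types are examples, not AWS defaults):

files:
  "/etc/nginx/conf.d/gzip.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      # only needed if the console toggle removed gzip from the main config
      gzip on;
      gzip_types text/css application/javascript application/json;

Since everything in /etc/nginx/conf.d/*.conf is included alongside 00_elastic_beanstalk_proxy.conf, this adds the extra gzip settings without replacing the file that Beanstalk manages.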

An appendable compressed archive

I have a requirement to maintain a compressed archive of log files. The log filenames are unique and the archive, once expanded, is simply one directory containing all the log files.
The current solution isn't scaling well, since it involves a gzipped tar file: every time a log file is added, the entire archive is decompressed, the file added, and everything re-gzipped.
Is there a Unix archive tool that can add to a compressed archive without completely expanding and re-compressing? Or can gzip perform this, given the right combination of arguments?
I'm using zip -Zb for that (appending text logs incrementally to a compressed archive):
fast append (the index is at the end of the archive, so it is cheap to update)
-Zb uses the bzip2 compression method instead of deflate. In 2018 this seems safe to use (you'll need a reasonably modern unzip -- note that some tools assume deflate when they see a zip file, so YMMV)
7z was a good candidate: compression ratio is vastly better than zip when you compress all files in the same operation. But when you append files one by one to the archive (incremental appending), compression ratio is only marginally better than standard zip, and similar to zip -Zb. So for now I'm sticking with zip -Zb.
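By way of illustration, a minimal sketch of that workflow (the archive and log file names are made up):

$ zip -Zb logs.zip foo-20180101.log    # create the archive with a bzip2-compressed entry
$ zip -Zb logs.zip foo-20180102.log    # append: only the new entry and the trailing index are written
$ unzip -l logs.zip                    # list entries (extraction needs a bzip2-capable unzip)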
To clarify what happens and why having the index at the end is useful for an "appendable" archive format (with entries compressed individually):
Before:
##############  ###########  #################  #
[foo1.png    ]  [foo2.png ]  [foo3.png       ]  ^-- index
After:
##############  ###########  #################  ###########  #
[foo1.png    ]  [foo2.png ]  [foo3.png       ]  [foo4.png ]  ^-- new index
So this is not fopen in append mode, but presumably fopen in write mode, then fseek, then write (that's my mental model of it, someone let me know if this is wrong). I'm not 100% certain that it would be so simple in reality, it might depend on OS and file system (e.g. a file system with snapshots might have a very different opinion about how to deal with small writes at the end of a file… huge "YMMV" here 🤷🏻‍♂️)
It's rather easy to have an appendable archive of compressed files (not same as appendable compressed archive, though).
tar has an option to append files to the end of an archive (assuming that you have GNU tar):
-r, --append
append files to the end of an archive
You can gzip the log files before adding to the archive and can continue to update (append) the archive with newer files.
$ ls -1
foo-20130101.log
foo-20130102.log
foo-20130103.log
$ gzip foo*
$ ls -1
foo-20130101.log.gz
foo-20130102.log.gz
foo-20130103.log.gz
$ tar cvf backup.tar foo*gz
Now you have another log file to add to the archive:
$ ls -1
foo-20130104.log
$ gzip foo-20130104.log
$ tar rvf backup.tar foo-20130104.log
$ tar tf backup.tar
foo-20130101.log.gz
foo-20130102.log.gz
foo-20130103.log.gz
foo-20130104.log.gz
If you don't need to use tar, I suggest 7-Zip. It has an 'add' command, which I believe does what you want.
See related SO question: Is there a way to add a folder to existing 7za archive?
Also, the 7-Zip documentation: https://sevenzip.osdn.jp/chm/cmdline/commands/add.htm
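For instance (the file name is made up), adding to an existing archive looks like:

$ 7z a logs.7z foo-20130104.log.gz

where a is the 'add' command referred to above.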

PIG UDF load .gz file failed

I wrote my own UDF to load files into Pig. It works well for loading text files; however, now I also need to be able to read .gz files. I know I could unzip the file and then process it, but I want to read the .gz file without unzipping it.
My UDF extends LoadFunc, and my custom input format MyInputFile extends TextInputFormat. I also implemented MyRecordReader. Just wondering if extending TextInputFormat is the problem? I tried FileInputFormat and still cannot read the file. Has anyone written a UDF that reads data from a .gz file before?
TextInputFormat handles gzip files as well. Have a look at its RecordReader's (LineRecordReader) initialize() method, where the proper CompressionCodec is initialized. Also note that gzip files aren't splittable (even if they are located on S3), so you may need to use either a splittable format (e.g. LZO) or uncompressed data to get the desired level of parallel processing.
If your gzipped data is stored locally, you can uncompress it and copy it to HDFS in one step as described here. Or, if it's already on HDFS,
hadoop fs -cat /data/data.gz | gzip -d | hadoop fs -put - /data/data.txt
would be more convenient.
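For the local-file case the same streaming trick works (paths are examples): decompress on the fly and pipe the result straight into HDFS:

$ gunzip -c /local/logs/data.gz | hadoop fs -put - /data/data.txt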

What is the Unix way for a console script to use config files?

Let's imagine we have some script 'm12' (I've just invented this name) that runs
on Linux computers. If it is situated in your $PATH, you can easily run it
from the console like this:
m12
It will work with the default parameters. But you can customize the behaviour of this script by running something like:
m12 --enable_feature --select=3
It is great and it will work. But I want to create a config file ~/.m12rc so I
will not need to specify --enable_feature --select=3 every time I run it.
It can be easily done.
The difficult part starts here.
So, I have the ~/.m12rc config file, but I want to start m12 without the parameters that are stored in that config file. What is the Unix way to do this? Should I run the script like this:
m12 --ignore_config
or is there a better solution?
Next, let's imagine I have a config file ~/.m12rc and I want some parameters from that file, but want to change them a bit. How should I run the script, and how should the script behave?
And the last question: is it a good idea for the script to look for .m12rc first in the current directory, then in ~/, and then in /etc?
I'm asking all these questions because I want to implement config files in my small script and I want to make the correct design decisions.
The book 'The Art of Unix Programming' by E S Raymond discusses such issues.
You can override the config file with --config-file=/dev/null.
You would normally use the order:
System-wide configuration (/etc/m12/m12rc, or just /etc/m12).
User's personal configuration (~/.m12rc)
Local directory configuration (./.m12rc)
Command-line options
with each later-listed item overriding earlier listed items. You should be able to specify the configuration file to read on the command line; arguably, that should be given precedence over other options. Think about --no-system-config or --no-user-config or --no-local-config. Many scripts do not warrant a system config file. Most scripts I've developed would not use both local config and user config. But that's the way my mind works.
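A minimal sketch of that precedence order in shell (the file names follow the ones above; option parsing is left out):

#!/bin/sh
# Source configs from lowest to highest precedence; later files override earlier ones.
for conf in /etc/m12/m12rc "$HOME/.m12rc" ./.m12rc; do
    [ -f "$conf" ] && . "$conf"
done
# ...then parse command-line options here so they override anything sourced above.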
The way I package standard options is to have a script in $HOME/bin (say m12a) that does it for me:
#!/bin/sh
exec m12 --enable_feature --select=3 "$@"   # "$@" forwards any extra arguments
If I want those options, I run m12a. If I want some other options, I run raw m12 with the requisite options. I have several hundred files in my personal bin directory (about 500 on my main machine, a Mac; some of those are executables, but many are scripts).
Let me share my experience. I normally source the config file at the beginning of the script. In the config file I also handle all the parameter switches:
DEFAULT_USER=blabla
# -u <user> overrides the default; the trailing ':' means -u takes an argument
while getopts ":u:" opt; do
  case $opt in
    u)
      export APP_USER=$OPTARG
      ;;
  esac
done
# fall back to the default if -u was not given
export APP_USER=${APP_USER-$DEFAULT_USER}
Then within the script I just use the variables; this lets me have a number of scripts sharing the same input parameters.
In your case I imagine you would move the getopts section into the script itself and source the config file after it (unless a switch was given to skip sourcing).
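Something along these lines, for instance (the -n "skip the config" switch and the ~/.m12rc path are just placeholders):

#!/bin/sh
# Parse the switches first and remember them, then source the config unless told not to.
SKIP_CONFIG=
while getopts ":u:n" opt; do
  case $opt in
    u) CLI_USER=$OPTARG ;;
    n) SKIP_CONFIG=1 ;;
  esac
done

if [ -z "$SKIP_CONFIG" ] && [ -f "$HOME/.m12rc" ]; then
  . "$HOME/.m12rc"    # may set DEFAULT_USER and friends
fi

# command line beats config file, which beats the built-in default
export APP_USER=${CLI_USER:-${DEFAULT_USER:-blabla}}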
You should not put your script's config file in /etc; that requires root privileges, and you can simply live with a config file in the home directory.
If you nevertheless want to ship it so other users can share it, it should go under /usr/share...
Another solution is thor (a Ruby gem); it is much simpler for handling input parameters and avoids the work needed to get the same result in bash, e.g. getopts supports only single-letter switches.

'Overwrite' php.ini settings

I have a folder, and for all php files in that folder (or even better, in that folder or any folders within it) I'd like to make some changes to the php settings. Can I just place a php.ini file in that folder with those settings I'd like to change?
If so, any reason why this wouldn't be working for me? It's my own server.
Thanks!
edit: I'd like to be able to use a local php.ini file, as I've been able to do with several webhosts. Is this a possibility?
It looks like you're wanting to use per-directory php.ini files which are available as of PHP 5.3. If it's your own server, I'd like to think you're happy to keep up with the latest stable releases (currently 5.3.2). Back to ini files, to quote that manual page:
Since PHP 5.3.0, PHP includes support for .htaccess-style INI files on a per-directory basis. These files are processed only by the CGI/FastCGI SAPI. This functionality obsoletes the PECL htscanner extension. If you are using Apache, use .htaccess files for the same effect.
In addition to the main php.ini file, PHP scans for INI files in each directory, starting with the directory of the requested PHP file, and working its way up to the current document root (as set in $_SERVER['DOCUMENT_ROOT']). Only INI settings with the modes PHP_INI_PERDIR and PHP_INI_USER will be recognized in .user.ini-style INI files.
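For instance, under the CGI/FastCGI SAPI a .user.ini dropped into the folder could look like this (the settings are only examples):

; .user.ini -- only PHP_INI_PERDIR / PHP_INI_USER settings are honoured here
upload_max_filesize = 16M
post_max_size = 20M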
You'll have to use a .htaccess file for that. There is a section in the PHP manual about this:
http://php.net/manual/en/configuration.changes.php
For more general information on htaccess files you can read:
http://en.wikipedia.org/wiki/Htaccess
or
http://httpd.apache.org/docs/2.0/howto/htaccess.html
The .htaccess files are typically the best way to go for an Apache server. However, to answer your original question: yes, you can put a php.ini file in every directory if you want, but for it to work PHP must be running as CGI (PHP-CGI). My guess is that you are running PHP as an Apache module.
See this link for reference on where PHP looks for php.ini and when it looks for it: http://www.php.net/manual/en/configuration.file.php
You could also use ini_set(), if you wanted to do it in code.
Instead of modifying a php.ini file for each folder, you would need to use a .htaccess file. Keep one in each folder with whatever settings you like. You can't do this with php.ini, since changes to php.ini are applied server-wide.
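For example, with PHP running as an Apache module (and AllowOverride permitting it), a .htaccess in the folder can carry the overrides; the particular settings below are just placeholders:

# .htaccess -- per-directory PHP overrides when PHP runs as an Apache module
php_value upload_max_filesize 16M
php_flag display_errors Off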
