Unix - server gets polluted - find out where new files get stored - unix

My server has no available disk space left. Yesterday I deleted 200 GB of data; today the disk is full again. Some process must be writing files. How do I find out where new, possibly huge, files are being stored?

Run df to check partition usage.
Use du to find the sizes of folders.
I tend to do this:
du -sm /mount/point/* | sort -n
This gives you a list of the folders under /mount/point with their sizes in MB, smallest first.
Also, if you have X, you can use baobab or similar utilities to explore disk usage.
PS: Check the log files. For example, if you have Tomcat installed, it tends to generate a huge amount of log output if not configured properly.
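Since the disk filled up again within a day, it can also help to look only at files that are both large and recently modified. Here is a minimal Python sketch of that idea; the function name, the thresholds and the /mount/point path are placeholders, not anything from the original setup.

```python
import heapq
import os
import time


def recent_large_files(root, max_age_hours=24.0, min_size_mb=100.0, top=20):
    """Return up to `top` (size_bytes, path) pairs, largest first, for files
    under `root` that were modified within the last `max_age_hours`."""
    cutoff = time.time() - max_age_hours * 3600
    min_bytes = min_size_mb * 1024 * 1024
    hits = []
    # onerror swallows unreadable directories instead of aborting the walk.
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if st.st_mtime >= cutoff and st.st_size >= min_bytes:
                hits.append((st.st_size, path))
    return heapq.nlargest(top, hits)


if __name__ == "__main__":
    # "/mount/point" is a placeholder; substitute the partition that fills up.
    for size, path in recent_large_files("/mount/point"):
        print("%10.1f MB  %s" % (size / 2.0 ** 20, path))
```

Run it as root if you want it to see every directory. A file that was deleted but is still held open by a process will not show up here, because it no longer has a path.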

Related

I can't pack my Data.fs because it is too large (more than 500GB)

Unfortunately, I have a ZODB Data.fs of more than 500GB in my Plone site (Plone 5.05).
So I have no way to use bin/zeopack to pack it,
and this is seriously affecting performance.
What should I do?
I assume you're running out of space on the volume containing your data.
First, try turning off pack-keep-old in your zeoserver settings:
[zeoserver]
recipe = plone.recipe.zeoserver
...
pack-keep-old = false
This will disable the creation of a .old copy of the Data.fs file and matching blobs. That may allow you to complete your pack.
Alternatively, create a matching Zope/Plone install on a separate machine or volume with more storage and copy over the data files. Run zeopack there. Copy the now-packed storage back.

Overcoming inode limitation

What's the best practice for storing a large (expanding) number of small files on a server, without running into inode limitations?
For a project, I am storing a large number of small files on a server with 2TB of HD space, but my limitation is the 2560000 allowed inodes. Recently the server used up all the inodes and was unable to write new files. I subsequently moved some files into databases, but others (images and json files) remain on the drive. I am currently at 58% inode usage, so a solution is needed soon.
The reason for storing the files individually is to limit the number of database calls. Basically the scripts check whether the file exists and, if so, return results accordingly. Performance-wise this makes sense for my application, but as stated above it has limitations.
As I understand it, moving the files into sub-directories does not help, because each inode points to a file (or a directory file), so in fact I would just use up more inodes.
Alternatively I might be able to bundle the files together in an archive type of file, but that will require some sort of indexing.
Perhaps I am going about this all wrong, so any feedback is greatly appreciated.
On the advice of arkascha I looked into loop devices and found some documentation about losetup. Remains to be tested.
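For the archive idea mentioned in the question, one possible sketch (not something tested on the server in question) is a ZIP file: the format carries its own index of member names, and the whole archive occupies a single inode no matter how many documents it holds. The helper names and the JSON payload below are made up for illustration.

```python
import json
import zipfile


def add_record(archive_path, name, payload):
    """Append one JSON document to the archive under `name`.

    Mode "a" creates the archive if it does not exist yet."""
    with zipfile.ZipFile(archive_path, "a", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, json.dumps(payload))


def read_record(archive_path, name):
    """Return the decoded document, or None if `name` is not in the archive."""
    with zipfile.ZipFile(archive_path, "r") as zf:
        try:
            data = zf.read(name)  # raises KeyError for an unknown member
        except KeyError:
            return None
    return json.loads(data.decode("utf-8"))
```

The "does the file exist" check becomes a lookup by member name. Caveats: reopening the archive for every lookup re-reads the index, so a long-running script would want to keep the ZipFile object open; members cannot be deleted or replaced in place; and writing the same name twice stores a second copy.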

Meteor mongodb file size too big

I am just starting with Meteor, creating some test/practice apps. After I create an app and run it, the .meteor folder balloons to 500 MB. Each practice app adds 500 MB or so to my working folder.
I am not working with any huge data sets; my database will be less than 10 MB.
As I sync my work folder with my laptop, it is a major pain to back it up. How can I reduce the size of the default mongodb when creating a practice app, so that backing it up or syncing the folder is easier?
Also, even when I copy the whole app folder to the new location, it does not run, likely because the database is stored somewhere else.
Can I save the database to the same folder as the app, so that just copying the folder over will enable me to continue working on the laptop as well?
Sorry if the question is too noobish.
Thanks for your time.
meteor reset >>> deletes my database. I want to be able to preserve it.
Yes, this can be a pain and is unavoidable by default at present. However, a couple of ideas that might be useful:
If you have multiple meteor apps, it's possible to use the same DB for each, as per #elfoslav: link. However, note that you have to supply the env variable every time or create a shell script for when you start meteor, otherwise it'll create a new db for you if you run meteor on its own just once!
If it's just portability of the app you're concerned about, get comfortable with mongodump and mongorestore, which will yield bson files containing just your database contents (i.e. about 10mb) which are pretty easy to insert back into another instance of mongoDB, so that you only have to copy these backwards and forwards. Here is a guide to doing this with your Meteor DB, and here is a great gist from #olizilla.
Have you tried the MongoDB configuration options below to limit the space it occupies?
storage.smallFiles
Type: boolean
Default: False
Sets MongoDB to use a smaller default file size. The
storage.smallFiles option reduces the initial size for data files and
limits the maximum size to 512 megabytes. storage.smallFiles also
reduces the size of each journal file from 1 gigabyte to 128
megabytes. Use storage.smallFiles if you have a large number of
databases that each holds a small quantity of data.
storage.journal.enabled
Type: boolean
Default: true on 64-bit systems, false on 32-bit systems
Enables the durability journal to ensure data files remain valid and
recoverable. This option applies only when you specify the --dbpath
option. The mongod enables journaling by default on 64-bit builds of
versions after 2.0.
Refer to: http://docs.mongodb.org/manual/reference/configuration-options/

"sqlite3.OperationalError: database or disk is full" on Lustre

I have this error in my application log:
sqlite3.OperationalError: database or disk is full
As plenty of disk space is available and my SQLite database does not appear to be corrupted (integrity_check did not report any error), why is this happening and how can I debug it?
I am using the Lustre filesystem (with flock set), and until now, it worked perfectly.
Versions are:
Python 2.6.6
SQLite 3.3.6
It's probably too late for the original poster, but I just had this problem and couldn't find an answer so I'll document my findings in the hope that it will help others:
As it turns out, an SQLite database actually can get full even if there's plenty of disk space, because it has a limit for the number of pages in a database file:
http://www.sqlite.org/pragma.html#pragma_max_page_count
In my case the value was 1073741823, which meant that in combination with a page size of 1024 Bytes the database maxed out at 1 TB and returned the "database or disk is full" error.
The good news is that you can raise the limit; for example double it by issuing PRAGMA max_page_count = 2147483646;.
The limit doesn't seem to be saved in the database file, though, so you have to run it in your application every time you open the database.
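Because the limit has to be reapplied on every connection, it is easiest to wrap the connect call. A minimal sketch of that, with made-up helper names (open_db, size_ceiling_bytes); the default of 2147483646 is just the doubled value from above:

```python
import sqlite3


def open_db(path, max_pages=2147483646):
    """Open an SQLite database and set its page-count limit for this connection."""
    conn = sqlite3.connect(path)
    # PRAGMA values cannot be bound as SQL parameters, so format a checked int.
    conn.execute("PRAGMA max_page_count = %d" % int(max_pages))
    return conn


def size_ceiling_bytes(conn):
    """Return page_size * max_page_count: the largest size the file may reach."""
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    max_pages = conn.execute("PRAGMA max_page_count").fetchone()[0]
    return page_size * max_pages
```

The arithmetic matches the numbers above: 1073741823 pages of 1024 bytes is 1024 bytes short of 2**40 bytes, i.e. about 1 TB. Calling size_ceiling_bytes on a connection tells you where the database will hit "database or disk is full" with its current settings.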
By default, SQLite stores its temporary files in the /tmp directory (not in memory). If /tmp is too small, you will get the disk full error. In that case, change the temporary directory like this: export TMPDIR=<big file system>.
I had the same problem, too.
If your host's or PC's storage is full, delete some files on the system and the problem goes away.

Transferring millions of images -- RSync not good enough

We've got a folder, 130GB in size, with millions of tiny (5-20k) image files, and we need to move it from our old server (EC2) to our new server (Hetzner, Germany).
Our SQL files SCP'd over really quickly -- 20-30mb/s at least -- and the first ~5gb or so of images transferred pretty quickly, too.
Then we went home for the day, and coming back in this morning, our images have slowed to only ~5kb/s in transfer. RSync seems to slow down as it hits the middle of the workload. I've looked into alternatives, like gigasync (which doesn't seem to work), but everyone seems to agree rsync is the best option.
We have so many files that doing ls -al takes over an hour, and all my attempts at using Python to batch up our transfer into smaller parts have eaten all available RAM without successfully completing.
How can I transfer all these files at a reasonable speed, using readily available tools and some light scripting?
I don't know if it will be significantly faster, but maybe a
cd /folder/with/data && tar czf - . | ssh target 'cd /target/folder && tar xzf -'
will do the trick.
If you can, maybe restructure your file arrangement. In similar situations, I group the files by project or just in batches of 1000, so that a single folder doesn't have too many entries at once.
But I can imagine that the necessity of rsync (which I otherwise like very well, too) to keep a list of transferred files is responsible for the slowness. If the rsync process occupies so much RAM that it has to swap, all is lost.
So another option could be to rsync folder by folder.
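If the files all sit in one folder, the same "smaller pieces" idea can be approximated by cutting the directory listing into batches. The sketch below is only an illustration with made-up function names; it reads the directory lazily with os.scandir, so memory use is bounded by one batch rather than by the full listing, which may avoid the RAM problem described in the question.

```python
import itertools
import os


def iter_file_names(directory):
    """Yield regular-file names one at a time, without building a full listing."""
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                yield entry.name


def write_batches(directory, list_dir, batch_size=10000):
    """Write the names into numbered list files of at most `batch_size` lines.

    Returns the paths of the list files in the order they were written.
    `list_dir` should be outside `directory`."""
    os.makedirs(list_dir, exist_ok=True)
    names = iter_file_names(directory)
    written = []
    for index in itertools.count():
        batch = list(itertools.islice(names, batch_size))
        if not batch:
            break
        list_path = os.path.join(list_dir, "batch-%06d.txt" % index)
        with open(list_path, "w") as handle:
            handle.write("\n".join(batch) + "\n")
        written.append(list_path)
    return written
```

Each list file can then be fed to a separate rsync run through its --files-from option, so that every run only has to track one batch. File names containing newlines would break this format.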
It's likely that the performance issue isn't with rsync itself, but a result of having that many files in a single directory. Very few file systems perform well with a single huge folder like that. You might consider refactoring that storage to use a hierarchy of subdirectories.
Since it sounds like you're doing essentially a one-time transfer, though, you could try something along the lines of a tar cf - -C <directory> . | ssh <newhost> tar xf - -C <newdirectory> - that might eliminate some of the extra per-file communication rsync does and the extra round-trip delays, but I don't think that will make a significant improvement...
Also, note that, if ls -al is taking an hour, then by the time you get near the end of the transfer, creating each new file is likely to take a significant amount of time (seconds or even minutes), since it first has to check every entry in the directory to see if it's in fact creating a new file or overwriting an old one.
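As for the hierarchy of subdirectories suggested above, one common way to get it (an assumption on my part, not something the poster's application already does) is to derive the subfolder from a hash of the file name. A minimal sketch with a hypothetical sharded_path helper:

```python
import hashlib
import os


def sharded_path(root, filename, levels=2, width=2):
    """Map `filename` to root/ab/cd/filename, where "ab" and "cd" are the
    leading hex digits of the SHA-256 digest of the file name."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *(parts + [filename]))
```

With two levels of two hex digits each there are 256 * 256 = 65536 leaf folders, so a few million files end up at a few dozen per folder. The mapping depends only on the name, so the application can compute where a file lives without any lookup table.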
