Clean up unused blobs in filestorage in Plone

Is there a way to find and remove unused blob storage space in a Plone site?
I'm looking for something like bin/zeopack, but one that detects unused blobs in the blobstorage directory.

The ZODB takes care of unused blobs by itself: packing removes blobs together with the removed transactions, so no separate cleanup tool is needed.
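For example, on a non-ZEO setup you can trigger the pack directly from Python. A minimal sketch, assuming the usual Plone buildout layout (the paths are assumptions; stop the instance first, since FileStorage allows only one process):

# Pack a FileStorage together with its blobstorage directory.
# Run with the site's own Python so the ZODB versions match.
from ZODB.DB import DB
from ZODB.FileStorage import FileStorage

storage = FileStorage('var/filestorage/Data.fs',
                      blob_dir='var/blobstorage')
db = DB(storage)
db.pack(days=7)  # keep 7 days of history; blobs of packed-away revisions are deleted
db.close()

On a ZEO setup, bin/zeopack does the same thing through the running server.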

Related

Sonatype nexus3 server Manually remove items

I'm using Nexus 3 (sonatype-work/nexus3). It has no cleanup process configured, so the disk has filled up. Are there any commands I can use to find unused images and data? And how can I delete files manually?
No, you should not delete any of the files manually; doing so will lead to issues retrieving data from your repositories. Instead, set up cleanup policies that maintain the disk space for you. You can learn more from these resources:
Cleanup policies
Keeping disk usage low
Storage management

Sonatype Nexus - Full Blob folder

The blobs folder on my Sonatype Nexus has completely filled the server's disk.
Does anyone know how to make room? Is there an automatic way to free that space, or do I have to do it manually?
And, at last: what happens if I completely remove all the data in the directory blobs/default/content?
Thank you all in advance
Marco
In NXRM3, the blobstore contains all the components of your repository manager. If your disk is full, you will not be able to write anything more to NXRM and you risk corrupting existing data.
Cleanup can be performed using scheduled tasks. What you need varies based on which formats your system is using. You can find more general information here: https://help.sonatype.com/display/NXRM3/System+Configuration#SystemConfiguration-ConfiguringandExecutingTasks
It is important to note that you must run the "Compact blob store" task after any cleanup is done, otherwise the space will not be freed.
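If you want to script this, recent NXRM3 releases also expose tasks through the REST API. A hedged sketch in Python, where the base URL, the credentials, and the 'blobstore.compact' type string are assumptions to verify against your version:

# Find and run the "Compact blob store" task via the NXRM3 REST API.
import requests

BASE = 'http://localhost:8081/service/rest/v1'  # assumed server address
AUTH = ('admin', 'admin123')                    # assumed credentials

tasks = requests.get(BASE + '/tasks', auth=AUTH).json()['items']
for task in tasks:
    if task['type'] == 'blobstore.compact':     # assumed task type string
        requests.post('%s/tasks/%s/run' % (BASE, task['id']), auth=AUTH)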
However, if you have already reached full disk space, it is advisable to shut down and restore from a backup in case there is corruption, preferably giving yourself a larger disk for your blobstore before restarting.
RE "what happens if I completety remove all the data in the directory blobs/default/content": That is in effect removing all data from NXRM in the default blobstore. You will have no components if you do that.

I can't pack my Data.fs because it is too large (more than 500 GB)

Unfortunately, I have a ZODB (Data.fs) of more than 500 GB in my Plone site (Plone 5.05), so I have no way to use bin/zeopack to pack it. This is seriously affecting performance.
What should I do?
I assume you're running out of space on the volume containing your data.
First, try turning off pack-keep-old in your zeoserver settings:
[zeoserver]
recipe = plone.recipe.zeoserver
...
pack-keep-old = false
This disables the creation of a .old copy of the Data.fs file and the matching blobs, which otherwise roughly doubles the disk space needed during a pack. That may allow you to complete your pack.
Alternatively, create a matching Zope/Plone install on a separate machine or volume with more storage, copy over the data files, run zeopack there, and copy the now-packed storage back.
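The pack on the copy does not need a running ZEO server; it can be done directly with ZODB. A minimal sketch, where the paths on the larger volume are assumptions:

# Pack the copied storage offline; run with the site's own Python.
from ZODB.DB import DB
from ZODB.FileStorage import FileStorage

storage = FileStorage('/mnt/big/filestorage/Data.fs',
                      blob_dir='/mnt/big/blobstorage')
db = DB(storage)
db.pack(days=0)  # discard all history
db.close()

Note that packing a FileStorage temporarily needs extra space for the packed copy it writes alongside the original, which is exactly why the larger volume helps.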

A file storage format for file sharing site

I am implementing a file sharing system in ASP.NET MVC3. I suppose most file sharing sites store files in a standard binary format on a server's file system, right?
I have two options storage wise - a file system, or binary data field in a database.
Are there any advantages to storing files (including large ones) in a database rather than on the file system?
MORE INFO:
The expected average file size is 800 MB, and users will typically request about 3 files per minute for download.
If the files are as big as that, then using the filesystem is almost certainly a better option. Databases are designed to contain relational data grouped into small rows and are optimized for consulting and comparing the values in these relations. Filesystems are optimized for storing fairly large blobs and recalling them by name as a bytestream.
Putting files that big into a database will also make it difficult to manage the space the database occupies. The tools for querying the space used in a filesystem, and for removing and replacing data, are better.
The only caveat to using the filesystem is that your application has to run under an account with the necessary permissions to write to the (portion of the) filesystem you use to store these files.
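To illustrate that split, here is a minimal, framework-agnostic sketch in Python (the names, paths, and schema are all assumptions): the bytes are streamed to the filesystem in chunks, and the database keeps only the metadata.

# Store file bytes on the filesystem; keep only metadata in the database.
import os
import sqlite3
import uuid

STORE = '/var/fileshare'  # the app account needs write permission here

db = sqlite3.connect('files.db')
db.execute('CREATE TABLE IF NOT EXISTS files (id TEXT PRIMARY KEY, name TEXT, path TEXT)')

def save_upload(stream, original_name):
    file_id = uuid.uuid4().hex
    path = os.path.join(STORE, file_id)
    with open(path, 'wb') as out:
        # Copy in 1 MB chunks so an 800 MB upload never sits in memory.
        for chunk in iter(lambda: stream.read(1024 * 1024), b''):
            out.write(chunk)
    db.execute('INSERT INTO files (id, name, path) VALUES (?, ?, ?)',
               (file_id, original_name, path))
    db.commit()
    return file_id

The same shape carries over to ASP.NET MVC3: the controller streams the request body to disk, and the database row holds only the name and path.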
Use FILESTREAM when:
Objects that are being stored are, on average, larger than 1 MB.
Fast read access is important.
You are developing applications that use a middle tier for application logic.
Here is the MSDN link: https://msdn.microsoft.com/en-us/library/gg471497.aspx
How to use it: https://www.simple-talk.com/sql/learn-sql-server/an-introduction-to-sql-server-filestream/

Can you use rsync to replicate block changes in a Berkeley DB file?

I have a Berkeley DB file that is quite large (~1 GB), and I'd like to replicate the small changes that occur weekly to an alternate location without having the entire file re-written at the target.
Does rsync's block-level algorithm properly handle Berkeley DB files?
Alternatively, does anyone know a way to have only the changes written to the Berkeley DB files that are the targets of replication?
Thanks!
Rsync handles files perfectly well at the block level. With databases, problems can come into play in a number of ways:
Caching
File locking
Synchronization/transaction logs
If you can ensure that no application has the Berkeley DB open for the duration of the rsync, then rsync should work fine and offer a significant advantage over copying the entire file. However, depending on the configuration and version of BDB, there are transaction logs to consider. You probably want to investigate the same mechanisms used for backups and hot backups. BDB also has a "snapshot" feature that might better facilitate a working solution.
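A hedged sketch of that approach, driving the standard command-line tools from Python (the paths and remote host are assumptions): db_hotbackup takes a consistent snapshot while applications may still be running, and rsync then ships only the changed blocks. Note that --inplace matters here: by default rsync rebuilds the destination in a temporary copy, so --inplace is what avoids rewriting the whole file at the target.

# Snapshot with BDB's hot-backup tool, then rsync the snapshot.
import subprocess

ENV_DIR = '/srv/bdb/env'         # Berkeley DB environment directory
SNAP_DIR = '/srv/bdb/backup'     # consistent snapshot lands here
TARGET = 'replica:/srv/bdb/env'  # assumed remote destination

# db_hotbackup copies the databases plus the logs needed for consistency.
subprocess.run(['db_hotbackup', '-h', ENV_DIR, '-b', SNAP_DIR], check=True)

# --inplace updates changed blocks in the existing target files rather
# than rewriting them via a temporary copy.
subprocess.run(['rsync', '-av', '--inplace', SNAP_DIR + '/', TARGET],
               check=True)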
You should probably read this carefully: http://www.cs.sunysb.edu/documentation/BerkeleyDB/ref/transapp/archival.html
I'd also recommend you consider using replication, an alternative solution that is blessed by BDB: https://idlebox.net/2010/apidocs/db-5.1.19.zip/programmer_reference/rep.html
They now call this High Availability: http://www.oracle.com/technetwork/database/berkeleydb/overview/high-availability-099050.html