USN journal for hard links

If I have a directory with a few hard links, all pointing to files outside the directory, will a change to one of the hard links affect the USN journal entries associated with that directory, or the journal entries of the original directory containing the actual file to which the hard links were linked at the time of their creation?

The journal will get an entry with reason USN_REASON_HARD_LINK_CHANGE when you add the hard link. Then, as time goes on, any of the hard links may be opened and changes made. The subsequent USN entries will all reference the original file's FileReferenceNumber, but will contain the FileName and ParentFileReferenceNumber of whichever link was actually opened. This is what you have available to distinguish between links. Note that it might be tempting to distinguish using only the ParentFileReferenceNumber, but this isn't really safe: while the most common pattern is a same-named link in different directories, you could have a link in the same directory but with a different name.
Note on moved links: if you read the USN journal in "summary mode" (your READ_USN_JOURNAL_DATA_V0 has ReturnOnlyOnClose = 1), where you only read the entries that have accumulated by the time the file closes, you can miss the USN_REASON_RENAME_OLD_NAME entries...and lose track of which link the rename was made through. This kind of USN record doesn't accumulate into the file-close event...I'm guessing because of the potential collision of ParentFileReferenceNumber and FileName.
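A minimal sketch of that bookkeeping, in Python, assuming the journal has already been read (e.g. via FSCTL_READ_USN_JOURNAL) and decoded into dicts whose keys mirror the USN_RECORD fields:

    # Sketch: group USN change records by the hard link they were made through.
    # Assumes the raw journal has already been read and decoded; field names
    # mirror the USN_RECORD structure.

    from collections import defaultdict

    def changes_per_link(records):
        """Map each link to the list of reasons recorded through it.

        The file itself is identified by FileReferenceNumber; the specific
        link is the (ParentFileReferenceNumber, FileName) pair. The parent
        alone is not enough: two links to the same file can live in the
        same directory under different names.
        """
        by_link = defaultdict(list)
        for rec in records:
            link = (rec["FileReferenceNumber"],
                    rec["ParentFileReferenceNumber"],
                    rec["FileName"])
            by_link[link].append(rec["Reason"])
        return by_link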

Related

I have the AdvAgg module for Drupal 7.x and many files in the advagg_js/advagg_css folders... why?

I have Drupal 7.x and Advanced CSS/JS Aggregation 7.x-2.7.
In the folders advagg_js and advagg_css (the path is sites/default/files) I have too many seemingly identical files, and I don't understand why...
This is the name of a file in advagg_css:
css____tQ6DKNpjnnLOLLOo1chze6a0EuAzr40c2JW8LEnlk__CmbidT93019ZJXjBPnKuAOSV78GHKPC3vgAjyUWRvNg__U78DXVtmNgrsprQhJ0bcjElTm2p5INlkJg6oQm4a72o
How can I delete all these files without doing damage?
Maybe under performance/advagg/operations, in the Cron Maintenance Tasks box, I should check "Clear All Stale Files" ("Remove all stale files. Scan all files in the advagg_css/js directories and remove the ones that have not been accessed in the last 30 days.")?
I hope you can help me. Thanks a lot!
I can guarantee that there are very few duplicate files in those directories. If you really want, you can manually delete every file in there; a lot of them will be generated again, so you'll be back to having a lot of files (the CSS/JS files get auto-created on demand, just like image styles). AdvAgg is very good at preventing a 404 from happening when an aggregated CSS/JS file is requested.
You can adjust how old a file needs to be in order for it to be considered "stale". Inside of the core drupal_delete_file_if_stale() function is the drupal_stale_file_threshold variable. Changing this inside of your settings.php file to something like 2 days:

    $conf['drupal_stale_file_threshold'] = 172800;

will make Drupal more aggressive in terms of removing aggregated CSS and JS files.
Long term, if you want to reduce the number of different CSS/JS files being created, you'll need to reduce the number of combinations/variations that are possible with your CSS and JS assets. On the "admin/config/development/performance/advagg/bundler" page, under raw grouping info, it will tell you how many different groupings are currently possible. Take that number and multiply it by the number of bundles (usually 2-6 if following a guide like https://www.drupal.org/node/2493801, or 6-12 if using the default settings), and that's the number of files that can currently be generated. Multiply that by 2 for gzip. On one of our sites that gives us over 4k files.
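As a hypothetical worked example (both counts are made up): 350 possible groupings × 6 bundles = 2,100 aggregates; doubled for the gzip copies, that is 4,200 files, the same order of magnitude as the 4k figure above.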
In terms of file names: the first base64 group corresponds to the file names, the second base64 group to the file contents, and the third base64 group to the AdvAgg settings. This allows the aggregate's contents to be recreated just by knowing the file name, as all of this additional info is stored in the database.
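A tiny sketch of that naming scheme, using a made-up aggregate name (real AdvAgg hashes use the base64url alphabet, which itself contains "_", so a naive split like this is only illustrative):

    # Illustrative only: the three "__"-separated base64 groups in an
    # AdvAgg aggregate name, per the description above. The name here is
    # hypothetical, not a real hash.

    name = "css__NAMEHASH__CONTENTSHASH__SETTINGSHASH"
    kind, names_group, contents_group, settings_group = name.split("__")
    # kind           -> "css"
    # names_group    -> derived from the list of source file names
    # contents_group -> derived from the aggregated file contents
    # settings_group -> derived from the AdvAgg settings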

How to change special file attributes using R?

On a Windows machine, when I look at the properties of any mp3 file (select the mp3, right-click, Properties), there are various subfields for the title, subtitle, artist, album, etc.
I am looking for a way to access these properties and change them. For instance, some files may indicate that the artist is "GreatArtist", whereas other files indicate "The GreatArtist" or "Great Artist".
I know I can change all of them manually by selecting all files that correspond to the same artist, right-clicking, and entering everything manually. I am looking for a way to automate this, though, so that it becomes easy for many folders, artists, and files, and R is my software of choice.
How can I access these properties using R? file.info() does not display these properties.

Efficiency of reading file names of a directory in ASP.NET

How efficient is reading the names of files in a directory in ASP.NET?
Background: I want to update pictures on a webserver automatically and deploy them in advance. E.g. until April 1 I want to pick 'image1.png'; after April 1, 'image2.png'. To achieve this I have to map every image name to a date which indicates whether that image should be picked.
In order to avoid mapping between file name and date in a separate file or database, the idea is to put the date in the file name. Iterating over the directory and parsing the dates lets me find my file.
E.g.:
image_2013-01-01.png
image_2013-05-01.png
The second one will be picked from May to eternity, unless an image with a later date is dropped in.
So I wonder how this solution impacts the speed of a website assuming <20 files.
If you are using something like Directory.GetFiles, that is one call to the OS.
This will access the disk to get the listing.
For fewer than 20 files this will be very quick. However, since this data is unlikely to change very often, consider caching the name of your image.
You could store it in the application context to share it among all users of your site.
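A minimal sketch of the whole idea, in Python rather than C# and with hypothetical names, showing the date parsing plus a simple cache so the directory is listed only once:

    import os
    import re
    from datetime import date, datetime

    IMAGE_DIR = "images"  # hypothetical directory
    PATTERN = re.compile(r"image_(\d{4}-\d{2}-\d{2})\.png$")

    _cache = None  # cached {date: filename} mapping

    def _scan():
        """One directory listing: parse the date out of each file name."""
        mapping = {}
        for name in os.listdir(IMAGE_DIR):
            m = PATTERN.match(name)
            if m:
                d = datetime.strptime(m.group(1), "%Y-%m-%d").date()
                mapping[d] = name
        return mapping

    def current_image(today=None):
        """Return the file whose embedded date is the latest one <= today."""
        global _cache
        if _cache is None:  # hit the disk only on the first request
            _cache = _scan()
        today = today or date.today()
        eligible = [d for d in _cache if d <= today]
        return _cache[max(eligible)] if eligible else None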

One-to-one correspondence to files in Unix - log files

I am writing a Log Unifier program. That is, I have a system that produces logs:
my.log, my.log.1, my.log.2, my.log.3...
I want on each iteration to store the number of lines I've read from a certain file, so that on the next iteration I can continue reading from that place.
The problem is that when the files are full, they roll:
The last log is deleted
...
my.log.2 becomes my.log.3
my.log.1 becomes my.log.2
my.log becomes my.log.1
and a new my.log is created
I can of course keep track of them using inodes, which are almost a one-to-one correspondence to files.
I say "almost" because I fear the following scenario:
Between two of my iterations, some files are deleted (let's say the logging is very fast) and new files are created, some of which reuse the inodes of the files just deleted. The problem is that I will then mistake these new files for old ones and start reading from line 500 (for example) instead of from line 0.
So I am hoping to find a way to solve this - here are a few directions I thought about that may help you help me:
Another one-to-one correspondence, other than inodes.
An ability to mark a file. I thought about using chmod +x to mark a file as an existing file; new files that don't have this permission would be recognizably new. But if somebody were to change the permissions manually, that would confuse my program. So, any other way to mark files would help.
I thought about creating soft links to each file; a link whose target has been deleted would tell me which files got deleted.
Any way to get the "creation date"
Any idea that comes to mind - maybe using timestamps (atime, ctime, mtime) in some clever way - would be good, as long as it lets me know which files are new; any idea for creating a one-to-one correspondence to files is welcome.
Thank you
I can think of a few alternatives:
Use POSIX extended attributes to store metadata about each log file that your program can use for its operation.
It should be a safe assumption that the contents of old log files are not modified after being archived, i.e. after my.log becomes my.log.1. You could generate a hash for each file (e.g. SHA-256) to uniquely identify it.
All decent log formats embed a timestamp in each entry. You could use the timestamp of the first entry - or even the whole first entry itself - as the file's identity. Log files are usually rolled on a periodic basis, which ensures a different starting timestamp for each file (see the sketch after this list).
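A minimal sketch combining the last two ideas, in Python: fingerprint each log by a hash of its first entry, which stays stable across renames as long as archived logs are never rewritten:

    # Sketch: identify each rolled log by a content fingerprint rather than
    # by name or inode. Assumes (as noted above) that a log's existing lines
    # are never rewritten after archiving, so a hash of the first line is a
    # stable identity even as my.log becomes my.log.1, my.log.2, ...

    import hashlib

    def log_identity(path):
        """Fingerprint = SHA-256 of the file's first line (first entry).

        Returns None for an empty file, which has no stable identity yet.
        """
        with open(path, "rb") as f:
            first_line = f.readline()
        if not first_line:
            return None
        return hashlib.sha256(first_line).hexdigest()

    # state maps fingerprint -> number of lines already read
    state = {}

    def resume_line(path):
        ident = log_identity(path)
        if ident is None or ident not in state:
            return 0            # new or empty file: start from the top
        return state[ident]     # seen before: continue where we left off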

How can I find missing Global Resource records?

I have 3 Global resource files:
WebResources.resx
WebResources.resx.es
WebResources.resx.it
When making changes to my application, I always add the default global resource records (English) to the WebResources.resx file. However, I don't always have the Spanish and Italian versions at the time, so these need to be added at a later stage.
My project manager has suggested that whenever adding a record into the WebResources.resx file, then a blank record should be added into the .es and .it versions. Then when it comes to finding out which records need a translation, we can order by the value and see a list of the blanks.
I like the fall-back behaviour of Global Resources: if there is no record in the specified resource file, the default record is returned. Adding a blank record prevents this fall-back.
Is there a better way of finding out what records are missing from the .es and .it resource files?
There are some tools around that should help you do what you need, e.g. resource-file comparison tools that diff a default .resx against its localized versions and list the missing keys.
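If you would rather script it, here is a minimal sketch in Python that lists the keys present in the default file but missing from the localized ones (file names taken from the question; .resx resources are XML <data name="..."> elements):

    # Sketch: report keys in the default .resx that are missing from a
    # localized one, so no blank placeholder records are needed.

    import xml.etree.ElementTree as ET

    def resx_keys(path):
        """Return the set of resource key names in a .resx file."""
        root = ET.parse(path).getroot()
        return {d.get("name") for d in root.findall("data")}

    default_keys = resx_keys("WebResources.resx")
    for localized in ("WebResources.resx.es", "WebResources.resx.it"):
        missing = default_keys - resx_keys(localized)
        for key in sorted(missing):
            print(f"{localized}: missing '{key}'")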
