How efficient is reading the names of files in a directory in ASP.NET?
Background: I want to update pictures on a webserver automatically and deploy them in advance. E.g., until the 1st of April I want to pick 'image1.png'; after the 1st of April, 'image2.png'. To achieve this I have to map every image name to a date which indicates whether that image should be picked or not.
In order to avoid mapping between file name and date in a separate file or database, the idea is to put the date in the file name. Iterating over the directory and parsing the dates lets me find my file.
E.g.:
image_2013-01-01.png
image_2013-04-30.png
The second one will be picked from May to eternity, unless an image with a later date is dropped in.
So I wonder how this solution impacts the speed of a website, assuming fewer than 20 files.
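Here is a minimal sketch of the lookup I have in mind (the folder path, class name, and date format are just placeholders):

```csharp
// Minimal sketch (names and paths are placeholders): pick the image
// whose embedded date is the latest one that is not in the future.
using System;
using System.Globalization;
using System.IO;
using System.Linq;

public static class ImagePicker
{
    public static string PickCurrentImage(string directory)
    {
        DateTime today = DateTime.Today;
        return Directory.GetFiles(directory, "image_*.png")
            .Select(p => new
            {
                FullPath = p,
                Date = ParseDate(Path.GetFileNameWithoutExtension(p))
            })
            .Where(x => x.Date.HasValue && x.Date.Value <= today)
            .OrderByDescending(x => x.Date.Value)
            .Select(x => x.FullPath)
            .FirstOrDefault();
    }

    private static DateTime? ParseDate(string name)
    {
        // Expects names like "image_2013-01-01".
        int underscore = name.IndexOf('_');
        if (underscore < 0)
            return null;

        DateTime date;
        if (DateTime.TryParseExact(name.Substring(underscore + 1), "yyyy-MM-dd",
                CultureInfo.InvariantCulture, DateTimeStyles.None, out date))
            return date;
        return null;
    }
}
```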
If you are using something like Directory.GetFiles, that is one call to the OS.
This will access the disk to get the listing.
For fewer than 20 files this will be very quick. However, since this data is unlikely to change very often, consider caching the name of your image.
You could store it in the application context to share it among all users of your site.
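For example, a hedged sketch of that caching idea, reusing the PickCurrentImage helper sketched in the question and the ASP.NET runtime cache (the cache key and the midnight expiry are my own choices):

```csharp
// Sketch: cache the resolved file name so the directory is scanned
// at most once per day instead of on every request.
using System;
using System.Web;
using System.Web.Caching;

public static class CurrentImageCache
{
    private const string CacheKey = "CurrentImageFileName"; // arbitrary key

    public static string GetCurrentImage(string directory)
    {
        string cached = HttpRuntime.Cache[CacheKey] as string;
        if (cached != null)
            return cached;

        string fileName = ImagePicker.PickCurrentImage(directory);
        if (fileName != null)
        {
            // Expire at midnight so a newly valid image is picked up.
            HttpRuntime.Cache.Insert(CacheKey, fileName, null,
                DateTime.Today.AddDays(1), Cache.NoSlidingExpiration);
        }
        return fileName;
    }
}
```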
I have Drupal 7.x and Advanced CSS/JS Aggregation 7.x-2.7.
In the folders advagg_js and advagg_css (path is sites/default/files) I have too many identical files, and I don't understand why...
This is the name of a file in advagg_css:
css____tQ6DKNpjnnLOLLOo1chze6a0EuAzr40c2JW8LEnlk__CmbidT93019ZJXjBPnKuAOSV78GHKPC3vgAjyUWRvNg__U78DXVtmNgrsprQhJ0bcjElTm2p5INlkJg6oQm4a72o
How can I delete all these files without doing damage?
Maybe in performance/advagg/operations, in the box Cron Maintenance Tasks, I must check "Clear All Stale Files" ("Remove all stale files. Scan all files in the advagg_css/js directories and remove the ones that have not been accessed in the last 30 days.")?
I hope you can help me...
Thanks a lot
I can guarantee that there are very few duplicate files in those directories. If you really want, you can manually delete every file in there; a lot of them will be generated again, so you're back to having a lot of files (the CSS/JS files get auto-created on demand, just like image styles). AdvAgg is very good at preventing a 404 from happening when an aggregated CSS/JS file is requested.
You can adjust how old a file needs to be in order for it to be considered "stale". Inside of the core drupal_delete_file_if_stale() function is the drupal_stale_file_threshold variable. Setting this in your settings.php file to something like 2 days,
$conf['drupal_stale_file_threshold'] = 172800;
will make Drupal more aggressive in terms of removing aggregated CSS and JS files.
Long term, if you want to reduce the number of different CSS/JS files being created, you'll need to reduce the number of combinations/variations that are possible with your CSS and JS assets. On the "admin/config/development/performance/advagg/bundler" page, under raw grouping info, it will tell you how many different groupings are currently possible. Take that number and multiply it by the number of bundles (usually 2-6 if following a guide like https://www.drupal.org/node/2493801, or 6-12 if using the default settings), and that's the number of files that can currently be generated. Multiply it by 2 for gzip. On one of our sites that gives us over 4k files.
In terms of file names, the first base64 group identifies the file names, the second base64 group the file contents, and the third base64 group the AdvAgg settings. This allows the aggregate's contents to be recreated just by knowing the filename, as all of this additional info is stored in the database.
In my Windows 8/RT app I use an SQLite database (sqlite-net), which is stored in Isolated Storage. In the database I have a lot of data, including links to files (images, PDFs and others). I get those links from a web server. When I get a link, I want to download the file and store it locally.
My question is: what is the best way to store a large number of files (100+)? One important thing: I need to be able to find the desired file quickly.
I have three ideas:
Create another database only for files (I can't modify the existing one).
Create a folder in Isolated Storage and store the files there directly.
Create a list of files and store it in Isolated Storage.
Which would be better/faster? Or does somebody have another great solution?
100 files isn't such a big number as you can easily store up to 100k files (or folders) in a single (NTFS) directory.
If you receive the files from a webserver, then the question is whether the source makes sure there are no duplicate filenames. If this can't be assured, I'd recommend having a database table mapping the original filename and metadata to the file's hash (SHA-256 or similar), and storing the file under a filename derived from its hash.
Then, when using the file, you can pass it to the user under the original filename using the StorageFile API.
Going beyond 100k files, you could create a subfolder structure from the first two letters of the hash.
Either way, storing the file metadata in a database and the files in a directory has been the most useful approach for us in the past.
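A minimal sketch of that hash-based naming (class and folder names are mine; the database mapping is only hinted at in a comment):

```csharp
// Sketch: store each file under a name derived from the SHA-256 hash
// of its contents; use the first two hex characters as a subfolder
// so no single directory grows too large.
using System;
using System.IO;
using System.Security.Cryptography;

public static class HashedFileStore
{
    public static string Store(string rootFolder, byte[] contents)
    {
        string hashHex;
        using (var sha = SHA256.Create())
        {
            hashHex = BitConverter.ToString(sha.ComputeHash(contents))
                                  .Replace("-", "").ToLowerInvariant();
        }

        string folder = Path.Combine(rootFolder, hashHex.Substring(0, 2));
        Directory.CreateDirectory(folder); // no-op if it already exists

        File.WriteAllBytes(Path.Combine(folder, hashHex), contents);

        // A database row mapping (original filename, metadata) -> hashHex
        // would be inserted here; the schema is up to you.
        return hashHex;
    }
}
```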
100 files with average size of 1MB is only 100MB.
Most people say that storing binary files in a database is wrong, and suggest storing the files separately and keeping only the file names in the database, but I think it is fine, provided you know what you are doing and why.
A big advantage of storing files in the database is that you keep the files together with their properties, logically in one place. Also, you can simply copy one file and that backs up everything.
A database also affords you transaction support. You may have some problems reading and writing BLOBs, but it is not very difficult.
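For instance, a rough sketch with sqlite-net, which maps byte[] properties to BLOB columns (the class and method names are my own):

```csharp
// Sketch: storing file contents as BLOBs with sqlite-net.
using SQLite;

public class StoredFile
{
    [PrimaryKey, AutoIncrement]
    public int Id { get; set; }

    [Indexed] // for quick lookup by name
    public string Name { get; set; }

    public byte[] Contents { get; set; } // stored as a BLOB column
}

public static class BlobStore
{
    public static void Save(SQLiteConnection db, string name, byte[] contents)
    {
        db.CreateTable<StoredFile>(); // no-op if the table already exists
        db.Insert(new StoredFile { Name = name, Contents = contents });
    }

    public static byte[] Load(SQLiteConnection db, string name)
    {
        var row = db.Table<StoredFile>()
                    .Where(f => f.Name == name)
                    .FirstOrDefault();
        return row == null ? null : row.Contents;
    }
}
```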
I am writing a Log Unifier program. That is, I have a system that produces logs:
my.log, my.log.1, my.log.2, my.log.3...
On each iteration I want to store the number of lines I've read from a certain file, so that on the next iteration I can continue reading from that place.
The problem is that when the files are full, they roll:
The last log is deleted
...
my.log.2 becomes my.log.3
my.log.1 becomes my.log.2
my.log becomes my.log.1
and a new my.log is created
I can of course keep track of them using inodes, which are almost in one-to-one correspondence with files.
I say "almost" because I fear the following scenario:
Between two of my iterations, some files are deleted (let's say the logging is very fast) and new files are created, some of which get the inodes of the files just deleted. The problem is that I will then mistake these new files for old ones and start reading from line 500 (for example) instead of 0.
So I am hoping to find a way to solve this. Here are a few directions I've thought about that may help you help me:
Another one-to-one correspondence other than inodes.
An ability to mark a file. I thought about using chmod +x to mark a file as an existing file; new files that lack this permission would be known to be new. But if somebody were to change the permissions manually, that would confuse my program, so any other way to mark files would help.
I thought about creating soft links to the files; a link whose target has been deleted would tell me which files got deleted.
Any way to get the "creation date".
Any idea that comes to mind, maybe using timestamps (atime, ctime, mtime) in some clever way. Anything will do, as long as it lets me know which files are new, or gives a one-to-one correspondence to files.
Thank you
I can think of a few alternatives:
Use POSIX extended attributes to store metadata about each log file that your program can use for its operation.
It should be a safe assumption that the contents of old log files are not modified after being archived, i.e. after my.log becomes my.log.1. You could generate a hash for each file (e.g. SHA-256) to uniquely identify it.
All decent log formats embed a timestamp in each entry. You could use the timestamp of the first entry - or even the whole entry itself - in the file for identification purposes. Log files are usually rolled on a periodic basis, which would ensure a different starting timestamp for each file.
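A minimal sketch combining the second and third ideas: fingerprint each file by hashing its first line, which should be stable once written (the class and method names are mine):

```csharp
// Sketch: identify a log file by a hash of its first line (typically
// the first entry's timestamp) instead of by inode, so identity
// survives the my.log -> my.log.1 rename and any inode reuse.
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class LogIdentity
{
    public static string Fingerprint(string path)
    {
        using (var reader = new StreamReader(path))
        {
            string firstLine = reader.ReadLine();
            if (firstLine == null)
                return null; // empty file: no stable identity yet

            using (var sha = SHA256.Create())
            {
                byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(firstLine));
                return BitConverter.ToString(hash).Replace("-", "");
            }
        }
    }
}
```

The unifier would then keep a map from fingerprint to lines-read, which stays valid across rolls.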
I have 3 Global resource files:
WebResources.resx
WebResources.es.resx
WebResources.it.resx
When making changes to my application, I always add the default global resource records (English) to the WebResources.resx file. However, I don't always have the Spanish and Italian versions at the time, so these need to be added at a later stage.
My project manager has suggested that whenever a record is added to the WebResources.resx file, a blank record should be added to the .es and .it versions. Then, when it comes to finding out which records need a translation, we can order by value and see a list of the blanks.
I like the fall-back behaviour of Global Resources: if there is no record in the specified resource file, the default record is returned. Adding a blank record prevents this fall-back.
Is there a better way of finding out what records are missing from the .es and .it resource files?
There are some tools around that should help you do what you need.
Here is one you could try, for example.
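If you'd rather script the check yourself, here is a rough sketch using ResXResourceReader (it lives in System.Windows.Forms.dll; the file names in the usage comment are assumed):

```csharp
// Sketch: list keys present in the default resource file but missing
// (or blank) in a localized one. Requires a reference to
// System.Windows.Forms.dll for ResXResourceReader.
using System;
using System.Collections;
using System.Collections.Generic;
using System.Resources;

public static class ResxDiff
{
    private static Dictionary<string, string> ReadAll(string path)
    {
        var entries = new Dictionary<string, string>();
        using (var reader = new ResXResourceReader(path))
        {
            foreach (DictionaryEntry entry in reader)
                entries[(string)entry.Key] = entry.Value as string;
        }
        return entries;
    }

    public static void ReportMissing(string defaultPath, string localizedPath)
    {
        var defaults = ReadAll(defaultPath);
        var localized = ReadAll(localizedPath);

        foreach (var pair in defaults)
        {
            string value;
            bool present = localized.TryGetValue(pair.Key, out value);
            if (!present || string.IsNullOrEmpty(value))
                Console.WriteLine("Missing translation: " + pair.Key);
        }
    }
}

// Usage (file names assumed):
// ResxDiff.ReportMissing("WebResources.resx", "WebResources.es.resx");
```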
Greetings!
I am using the ASP.NET FileUpload control to allow users to upload text files to our web server. Everything works great in terms of saving the file where we want it, using the SaveAs() method of the control.
But we were caught off guard by one seemingly simple caveat: the original timestamps of the uploaded file, such as the date last modified and date created, were lost. The date last modified and date created become the actual date and time when the file is saved to the server.
My question is: is there any way to retain the original timestamp by setting some attributes that I am not aware of yet, or is it possible to read the metadata of the file to get its original timestamp?
Any insight and suggestions are greatly appreciated.
John
Unless the file format being uploaded itself contains this data, then no.
When a file is uploaded to a web server, the binary data for the file is sent to the server, not the "file" as it is represented in the filesystem. You don't, for example, know that your file is coming from a compatible filesystem; you only get its data. Hence, the metadata is inaccessible.
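To illustrate the point, here is roughly everything the FileUpload control can give you server-side (a Web Forms code-behind sketch; the handler and control names are assumed). Note there is no timestamp property anywhere:

```csharp
// Sketch: everything available about the upload comes from the
// multipart request body: name, length, content type, and the bytes.
protected void UploadButton_Click(object sender, EventArgs e)
{
    if (FileUpload1.HasFile)
    {
        System.Web.HttpPostedFile posted = FileUpload1.PostedFile;
        string clientName = System.IO.Path.GetFileName(posted.FileName);
        int size = posted.ContentLength;  // bytes in the request body
        string mime = posted.ContentType; // browser-reported MIME type

        // There is no CreationTime or LastWriteTime here: the browser
        // never sends them, so the server cannot restore them.
        FileUpload1.SaveAs(Server.MapPath("~/uploads/" + clientName));
    }
}
```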