Assumptions: Microsoft stack (ASP.NET; SQL Server).
Some content management systems handle user-generated content (images, file attachments) by storing it in the file system. Others store these items in the back end database.
Some examples of both:
In the filesystem: Community Server, Graffiti CMS
In the database: Microsoft SharePoint
I can see pros and cons of each approach.
In the filesystem
Lightweight
Avoids bloating the database
Backup and restore potentially simpler
In the Database
All content together in one repository (the database)
Complete separation of concerns (content vs format)
Easier deployment of web site (e.g. directly from Subversion repository)
What's the best approach, and why? What are the pros and cons of keeping user files in the database? Is there another approach?
I'm making this question Community Wiki because it is somewhat subjective.
If you are using SQL Server 2008 or higher, you can use the FILESTREAM functionality to get the best of both worlds. That is, you can access documents from the database (for queries, etc.), but still have access to the file via the file system (using SMB). More details here.
Erick
I picked the file system because it made editing of documents in place easier, that is when the user edits a file or document it can be saved in the location it is loaded from with no intervention by the program or user.
IMO, as of right now with the current functionality available in databases, the file system is the better choice.
The file system has no practical limit on file size, and user content can easily include files larger than 2 GB, which is the cap on a varbinary(max) column.
It makes the database size much smaller which means less pressure on memory.
You can design your system to use UNCs and NASs or even cloud storage, whereas you cannot do this with FILESTREAM.
The biggest downside with using the file system is the potential for orphaning files and keeping the database information on files in sync with the actual files on disk. Admittedly, this is a huge issue but until solutions like FILESTREAM are more flexible, it is the price you have to pay.
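The sync problem described above is usually handled with a periodic reconciliation job. The following is a minimal sketch in Python (chosen only for brevity; the same shape applies in C#). The function name `reconcile` and the idea that the database hands back a list of relative paths are assumptions for illustration, not part of any particular CMS.

```python
import os


def reconcile(db_paths, root):
    """Compare paths recorded in the database with files actually on disk.

    db_paths: relative paths as a database query might return them (assumed).
    Returns (orphans, missing):
      orphans - files on disk that no database row refers to
      missing - database rows whose file is no longer on disk
    """
    # Normalise separators so Windows-style and POSIX-style paths compare equal.
    recorded = {p.replace("\\", "/") for p in db_paths}
    on_disk = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            on_disk.add(os.path.relpath(full, root).replace(os.sep, "/"))
    return sorted(on_disk - recorded), sorted(recorded - on_disk)
```

Orphans can then be deleted or quarantined, and missing entries flagged for restore from backup. Running this on a schedule does not remove the underlying risk; it only bounds how long the two stores can stay out of step.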
Actually it's door #3, Chuck.
I think storing images in the database is bad news unless you need to keep them private; otherwise, just put them on a CDN and store the URLs of the images instead. I've built some huge sites for ecommerce, and putting the load on a CDN like Akamai or Amazon CloudFront is a really nice way to speed up your website dramatically. I'm not a big fan of burning your bandwidth, CPU and memory on serving up images. It seems a silly waste of resources nowadays since CDNs are so cheap. It also means deployment doesn't have to care about the images, because your stuff is already in a globally accessible location. You can take a look at my profile to see the sites I've done and how they use CDNs to offload static requests. It just makes sense, and it gets even better if you can gzip it.
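One way to keep that flexible is to store only a relative key per image and build the full URL at render time, so the CDN host can change without touching stored data. This is a small Python sketch; `CDN_BASE`, the `cdn.example.com` host, and the `image_url` helper are all hypothetical names used for illustration.

```python
from urllib.parse import quote

# Hypothetical CDN host; in practice this would come from configuration.
CDN_BASE = "https://cdn.example.com"


def image_url(key, base=CDN_BASE):
    """Build a public URL from the relative key stored in the database."""
    # quote() percent-encodes spaces and other unsafe characters but keeps "/".
    return base.rstrip("/") + "/" + quote(key.lstrip("/"))
```

For example, `image_url("products/red shoe.jpg")` yields `https://cdn.example.com/products/red%20shoe.jpg`. Storing full URLs, as the answer suggests, also works; the relative-key variant just makes a later provider switch a one-line configuration change.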
Related
I've got an ASP.NET project that I want to publish on the Azure platform. My project contains various static content: images, JavaScript, CSS, HTML pages and so on. I want to store this content in Azure blob storage. So, my questions are:
1) Is there any way to automate the process of migrating this content from my application to blob storage?
2) How can I use data retrieved from blob storage? Any examples would be great!
Best regards,
Alexander
First off, what you're trying to do could create cross-site scripting issues (the files will be on a different domain name) or security issues (if you're using SSL). So make sure you really want to separate the static files from the rest of your web site.
That said, the simplest approach would be to use any one of a number of Windows Azure Storage management utilities (Storage Explorer or Cerebrata's Storage Studio would both work) to upload the static content to a Windows Azure Storage blob container. Then set the permissions on that container to public read so that anyone with a web browser can access the contents of the container.
Finally, change all references to the content to point to the new URIs in blob storage and deploy your ASP.NET web role.
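For the "automate the migration" part of the question, the upload itself depends on whichever tool or SDK you use, but the planning step is generic. The sketch below, in Python for brevity, walks a static-content folder and produces the blob name, content type, and resulting URI for each file. It deliberately makes no Azure calls; `plan_uploads` and the container URL shown are assumptions for illustration.

```python
import mimetypes
import os


def plan_uploads(static_root, container_url):
    """Map each local static file to a blob name, content type and public URI.

    Returns a dict keyed by blob name; the actual upload is left to the
    storage tool or SDK of your choice.
    """
    base = container_url.rstrip("/")
    plan = {}
    for dirpath, _dirnames, filenames in os.walk(static_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Blob names use forward slashes regardless of the local OS.
            blob_name = os.path.relpath(full, static_root).replace(os.sep, "/")
            content_type = mimetypes.guess_type(name)[0] or "application/octet-stream"
            plan[blob_name] = {
                "local_path": full,
                "content_type": content_type,
                "uri": base + "/" + blob_name,
            }
    return plan
```

Setting the content type explicitly matters because browsers may refuse to apply a stylesheet or script served with a generic binary type. The `uri` values are what the rewritten references in your pages would point to.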
Again though, if I were you, I'd really look at what you're trying to accomplish with this approach. By putting it in blob storage, you do gain access to a few things (like CDN enablement), but as a trade-off, you lose control over many others (like simplified access control via IIS, or request logs that tell you when someone is downloading your image files a trillion times to try and run up your bill). So unless there's a solid NEED for this, I'd generally recommend against it.
Adding a bit to @Brent's answer: you'll get a few more benefits when offloading static content to blob storage, such as a reduction in load on your Web Role instances.
I wrote up a more detailed answer on this similar StackOverflow question.
In light of your comment to Brent, you may want to consider uploading the content into Blob storage and then proxying it through a WebRole. You can use something like an HttpModule to accomplish that fairly seamlessly.
This has 2 main advantages:
You can add/modify files without reloading your web roles or losing them on role refresh.
If you're migrating a site, the files can stay at the same URLs they were pre-migration.
The disadvantages:
You're paying the monetary cost for Blob accesses and the performance cost to your web roles.
You can't use the Azure CDN.
Blob storage is generally slower (higher latency) than disk access.
I've got a fairly simple module I wrote to do exactly this. I haven't gotten around to posting it anywhere public, but if you're going to do this I can send you the code or something.
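This is not the module mentioned above, only a minimal sketch of the proxying idea in Python: the web tier fetches a blob on first request and keeps it in a short-lived cache, which softens the latency and per-access cost listed under the disadvantages. `BlobProxy` and the `fetch` callable are hypothetical; in a real ASP.NET site the equivalent logic would live in an HttpModule or handler.

```python
import time


class BlobProxy:
    """Read-through cache in front of a slower blob store."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch        # callable: path -> bytes (assumed blob client)
        self._ttl = ttl_seconds    # how long a cached copy is considered fresh
        self._clock = clock        # injectable for testing
        self._cache = {}           # path -> (time stored, bytes)

    def get(self, path):
        now = self._clock()
        hit = self._cache.get(path)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # fresh copy: no blob access needed
        data = self._fetch(path)   # miss or stale: go to blob storage
        self._cache[path] = (now, data)
        return data
```

The time-to-live is the trade-off knob: a longer value means fewer blob accesses, but edits to a file take longer to appear. The cache here is unbounded, so a production version would also need an eviction policy.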
I maintain a web application (ASP.NET/IIS7/SQL2K8/Win2K8) that needs to access documents, actually hundreds of thousands of documents, and growing. Currently, they are all on a Windows 2K8 Server fileshare, being accessed by UNC path (SMB). The files are in a single flat directory and I'm trying to plan how to best improve this solution. I don't want to use the SQL Filestream attribute as it would be significant effort to migrate it all into that, and would really lock in to SQL Server. I also need to find a way to replicate the data for disaster recovery, so perhaps a solution can help with that too.
Options could be:
Segment files into multiple directories?
Application would add metadata for which directory it's on (or segment by other means)
Segment files into separate servers? (virtualize)
Backup becomes more complicated.
Application would add metadata for which server it's on
NAS Storage
SAN Storage
Put a service (WCF) in front of the files and have the app talk to the service
bonus of being reusable across many applications
Assuming I'm going to store on the filesystem and not in the database (I've read those discussions here), which would be a more scalable solution?
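For the first option, segmenting files into multiple directories, one common approach is to derive the directory from a hash of the file name, so no extra metadata is needed to find a file again. A minimal Python sketch follows; `sharded_path` and its parameters are illustrative names, and SHA-1 is used only to spread names evenly, not for security.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(file_name, levels=2, width=2):
    """Return a relative path like 'ab/cd/<file_name>' derived from a hash.

    With the defaults there are 256 * 256 = 65,536 possible directories,
    and the same name always maps to the same directory.
    """
    digest = hashlib.sha1(file_name.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return str(PurePosixPath(*parts, file_name))
```

Because the path is computed from the name, the application does not have to store which directory a file is in. The cost is that directory names carry no meaning for a human browsing the share.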
You've got a couple issues:
- managing a large volume of (static?) files
- preparing for backups and disaster recovery of said files
I'll throw this out there, even though I'm not a fan of the answer, but you might poke around with the free SharePoint 2010 Foundation that's included with server 2k8. If you're having issues with finding the documents you need (either by search, taxonomy via tagging or other metadata) as well as document expiration and you don't want to buy a full blown document management system, this might be a solution. Of course it introduces new problems...
If your only desire is to have these files available to spit out on the web, then a file store like the one you're using now really is the simplest solution. For DR/redundancy purposes, I'd look at a) running them on a RAID/SAN of some sort and b) auto-syncing them with the cloud (either Azure or Amazon). For b) you can get apps that make the cloud appear as a mapped drive and then use rsync-type software to keep the cloud up to date.
If you want to build something new and cool, you might think about moving the entire file archive into the cloud and just write a table in a db to manage the file name, old location, new cloud location and a redirector code that can provide the access tokens to requestors.
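The lookup table and redirector from that last idea could look roughly like the following. This is a sketch under stated assumptions: it uses an in-memory SQLite database purely so the example is self-contained (the thread's stack would use SQL Server), the table and column names are invented, and issuing access tokens is left out entirely.

```python
import sqlite3


def open_catalog():
    """Create the (hypothetical) table tracking where each file lives."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE file_location ("
        " file_name TEXT PRIMARY KEY,"
        " old_location TEXT NOT NULL,"
        " cloud_location TEXT)"  # NULL until the file has been migrated
    )
    return conn


def resolve(conn, file_name):
    """Return where a request should be redirected, or None if unknown."""
    row = conn.execute(
        "SELECT old_location, cloud_location FROM file_location"
        " WHERE file_name = ?",
        (file_name,),
    ).fetchone()
    if row is None:
        return None
    old_location, cloud_location = row
    # Prefer the cloud copy once it exists; fall back to the file share.
    return cloud_location or old_location
```

Keeping both locations in the row lets the archive be migrated gradually: files that have not moved yet still resolve to the old share.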
3 different approaches... your choice.
I am working on an application that creates video files and stores them in a folder in the C:\ drive. I speculate that there will be a large number of these files in the future and we would run out of disk space at some point of time (on our VPS). When the time comes that we have to upgrade, we either plan to use one of the Cloud providers to store files or our existing provider can add another disk (say D:\ drive).
Either way, I would want to design the app now in a way that in future, moving to different locations would not be an issue and would be transparent to the end user.
The code that creates these files supports 2 ways:
myObj.SetOutputToDisk(<path to store>); or
myObj.SetOutputToMemoryStream(ms);
If we go with the Cloud architecture, I assume we might have the following combination:
Cloud Files + Existing VPS or
Cloud Files + Cloud Windows Server
Given the unknowns at this time, how would I go about designing this?
Serve the files up from a subdomain. Say: media.yourdomain.com.
That way, you can trivially repoint DNS records to the new storage provider at some point in the future.
Also, I'd recommend storing the media files on another physical disk to the OS disk. So have a D:\ drive and store the media there.
You might want to look at the Managed Extensibility Framework as a way of adding extensions to your app for new storage methods without the need to rebuild the whole thing.
You need some way to record the storage location and method used, I'd expect some kind of database store that you could migrate to the cloud later if required.
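A minimal sketch of that idea, in Python for brevity: each storage method sits behind the same two-method interface, and a catalog records which method and location each file uses. All class names here are hypothetical, `MemoryStore` merely stands in for a cloud provider, and the in-memory `_records` dict stands in for the database table.

```python
import os


class DiskStore:
    method = "disk"

    def __init__(self, root):
        self.root = root

    def save(self, name, data):
        path = os.path.join(self.root, name)
        with open(path, "wb") as fh:
            fh.write(data)
        return path  # the location that gets recorded

    def load(self, location):
        with open(location, "rb") as fh:
            return fh.read()


class MemoryStore:
    """Stand-in for a cloud provider; a real one would call its SDK."""
    method = "memory"

    def __init__(self):
        self._blobs = {}

    def save(self, name, data):
        self._blobs[name] = bytes(data)
        return name

    def load(self, location):
        return self._blobs[location]


class Catalog:
    """Records the storage method and location used for each file."""

    def __init__(self, stores):
        self._stores = {store.method: store for store in stores}
        self._records = {}  # name -> (method, location); a DB table in practice

    def put(self, name, data, method):
        location = self._stores[method].save(name, data)
        self._records[name] = (method, location)

    def get(self, name):
        method, location = self._records[name]
        return self._stores[method].load(location)

    def migrate(self, name, new_method):
        # Copy to the new store, then repoint the record; callers are unaffected.
        self.put(name, self.get(name), new_method)
```

Callers only ever use `Catalog.get`, so moving a video from the C:\ drive to another disk or a cloud provider changes a record rather than the calling code. Note that `migrate` leaves the old copy in place; cleaning it up is a separate step.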
Your question is very vague, you haven't put much work in yourself and as such you are unlikely to get the level of detail you are hoping for in the answers. At least try to implement the system and then ask specific questions around issues that you are having problems with.
I am building an ASP.NET C# web application that will be using lots of sound files and image files. Considering performance, would it be best to store all files in SQL Server as the image data type and retrieve them from the database, or to store the file itself on the server and store the path in SQL Server? I'm curious about the pros and cons, other than the obvious ones of storage space and manageability.
My current client is currently looking at the same options. There are a few tradeoffs to consider:
Storing as IMAGE data type:
You only need to backup your database rather than the DB and places on the file system
You don't have to worry about files being moved without the DB being updated with the new location or any other issues with hanging pointers to non-existent files
Storing as a file with a path in the DB:
Slightly faster access (we'll be quantifying this in the next few days)
Originally I thought that there would also be a problem with client-side caching of images. For example, when .NET gets the image out of the DB the client browser can't cache it - it looks like a new image every time. I then learned that unless you are giving users file-level access (a security no-no) you run into the same problem using direct file access.
From a performance perspective you should be better off storing the sounds/images as files and just keeping a reference to the location in the database. This will save you from having to transfer the data from the database to the web server and reconstitute the "file" via a handler whenever it is referenced. Of course, caching could help this but you'd still pay the penalty on each cache miss. This solution is also, for my money, somewhat less complicated in terms of the code that needs to be developed though you do have issues with collisions and some extra security setup (potentially) to do if you are enabling upload.
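The collision issue mentioned above is commonly avoided by never using the uploaded file's own name on disk. A minimal Python sketch, with `store_upload` as a hypothetical helper: the file is saved under a random name, and that name is what gets stored in the database alongside the original name.

```python
import os
import uuid


def store_upload(upload_dir, original_name, data):
    """Save uploaded bytes under a generated name and return that name."""
    # Keep only the extension; the user-supplied name is never used as a path.
    ext = os.path.splitext(original_name)[1].lower()
    stored_name = uuid.uuid4().hex + ext
    path = os.path.join(upload_dir, stored_name)
    # Mode "xb" raises FileExistsError instead of silently overwriting.
    with open(path, "xb") as fh:
        fh.write(data)
    return stored_name
```

Two users uploading `photo.jpg` then get two distinct files. Discarding the user-supplied name also sidesteps path tricks such as `..\` in the upload name, though the extension should still be checked against an allow list.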
I'm building an ASP.NET web solution that will include a lot of pictures and hopefully a fair amount of traffic. Performance really matters to me.
Should I save the pictures in the database or on the file system? Whatever the answer, I'm most interested in the reasons for choosing one way over the other.
Store the pictures on the file system and picture locations in the database.
Why? Because...
You will be able to serve the pictures as static files.
No database access or application code will be required to fetch the pictures.
The images could be served from a different server to improve performance.
It will reduce database bottleneck.
The database ultimately stores its data on the file system.
Images can be easily cached when stored on the file system.
In my recently developed projects, I stored images (and all kinds of binary documents) as image columns in database tables.
The advantage of having files stored in the database is obviously that you do not end up with unreferenced files on the harddisk if a record is deleted, since synchronization between database (= meta data) and harddisk (= file storage) is not built-in and has to be programmed manually.
Using today's technology, I suggest you store images in SQL Server 2008 FILESTREAM columns (at least that's what I am going to do with my next project), since they combine the advantage of storing data in database AND having large binaries in separate files (at least according to advertising ;) )
The adage has always been "Files in the filesystem, file metadata in the database"
Better to store files as files. Different databases handle BLOB data differently, so if you have to migrate your back end you might get into trouble.
When serving the images, an <img src=...> pointing to a file that already exists on the server is likely to be quicker than making a temporary file from the database field and pointing the <img> tag to that.
I found this answer from googling your question and reading the comments at http://databases.aspfaq.com/database/should-i-store-images-in-the-database-or-the-filesystem.html
I usually like to have binary files in the database because:
data integrity : no unreferenced file, no path in the db without any file associated
data consistency : take a database dump and that's all. no "O i forgot to targz this data directory."
Storing images in the database adds DB overhead to serving each image and makes it hard to offload to alternate storage (S3, Akamai) if you grow to that level. On the other hand, storing them in the database makes it much easier to move your app to a different server, since only the DB needs to move.
Storing images on the disk makes it easy to offload to alternate storage, makes images static elements so you don't have to mess about with HTTP headers in your web app to make the images cacheable. The downside is if you ever move your app to a different server you need to remember to move the images too; something that's easily forgotten.
For web based applications, you're going to get better performance out of using the file system for storing your images. Doing so will allow you to easily implement caching of the images at multiple levels within your application. There are some advantages to storing images in a database, but most of the time those advantages come with client based applications.
Just to add some more to the already good answers so far: you can still get the benefits of caching at both the web level and the database level if you go the route of keeping your images in the database.
I think for the database you can achieve this through how you store the images in relation to the textual data associated with them, and by isolating access to the images in a particular query so that the database can cache that query (just theory though, so feel free to nuke me on that part).
On the web side, I would guess, since your question is tagged asp.net, that you would go the route of using an HTTP handler to serve up the images. Then you have all the benefits of the framework at your disposal, and you can keep your domain logic cleaner by only having to pass the key of your image to the HTTP handler.
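Web-level caching for handler-served images mostly comes down to sending validators and honouring conditional requests. The sketch below shows the core logic in Python, independent of any framework: `serve_image` is a hypothetical function that takes the image bytes (as loaded from the database) and the request's `If-None-Match` header value, and returns a status, headers, and body. The one-day `max-age` is an arbitrary choice.

```python
import hashlib


def serve_image(image_bytes, if_none_match=None):
    """Return (status, headers, body) for an image loaded from the database."""
    # A content hash makes a stable validator: same bytes, same ETag.
    etag = '"' + hashlib.sha256(image_bytes).hexdigest() + '"'
    headers = {"ETag": etag, "Cache-Control": "public, max-age=86400"}
    if if_none_match == etag:
        # The browser already holds this exact image; send no body.
        return 304, headers, b""
    return 200, headers, image_bytes
```

This does not avoid the database read on revalidation, since the bytes are needed to compute the hash. Storing the hash or a last-modified timestamp in a column next to the image would let the handler answer with 304 without loading the binary at all.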
Here is a step-by-step example (general approach, Spring implementation, Eclipse) of storing images in file system and holding their metadata in DB --
http://www.devmanuals.com/tutorials/java/spring/spring3/mvc/Spring3MVCImageUpload.html
Here is an example too -- http://www.journaldev.com/2573/spring-mvc-file-upload-example-tutorial-single-and-multiple-files
Also you can investigate a codebase of this project -- https://github.com/jdmr/fileUpload . Pay attention to this controller.