I am interested in images, but the question is quite general. This is how I currently generate a name:
private static final SecureRandom RANDOM = new SecureRandom();
// Number of random bits (not characters) passed to BigInteger; a guess
private static final int FILENAME_LENGTH = 73;
private static String nextId() { // synchronized ?
return new BigInteger(FILENAME_LENGTH, RANDOM).toString(32);
} // https://stackoverflow.com/a/41156/281545
Questions:
Are there pros and cons to storing the files under the session id plus a timestamp? Pros as in being able to use this information later, cons as in security.
Is there any standard way (in the servlet API or in Java) of generating a name? Any standard practices? Any container-specific tips (GlassFish and Tomcat)?
I understand that keeping the original filename, the username, etc. can lead to security holes.
Related:
File uploading : What should be the name of the file to save to?
JSP: Best practices uploading files to server
static File getImageFile() throws IOException {
return File.createTempFile("upload_", ".jpg", new File(upload_path));
}
// String filename = getImageFile().getName();
This is guaranteed to be unique (docs), and it is not a temporary file at all, provided you control upload_path, which must be a path to an existing directory (although the docs are not explicit about this).
Obviously you should have a better way to specify the extension but this is another question.
No session ids, user input etc.
Got the idea from a BalusC blog post :
It is necessary to know the file upload location in the MultipartMap as well, because we can then make use of File#createTempFile() to create files with an unique filename to avoid them being overwritten by another file with a (by coincidence) same name. Once you have the uploaded file at hands in the servlet or bean, you can always make use of File#renameTo() to do a fast rename/move.
Notice that createTempFile used to be rather insecure before Java 6.11 (see here for an exposition and here for a general exposition of tmp file security). Also see this SO question: there is a window of vulnerability between file creation and opening. These issues, however, have nothing to do with filenames. Still, createTempFile is the only way to guarantee uniqueness (I hope you are using the latest JDK, to avoid the predictable filenames createTempFile suffered from).
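A minimal sketch of the same idea with the java.nio.file API (available since Java 7), assuming the upload arrives as an InputStream. The class name UploadStore, the method store and the "upload_" prefix are made up for illustration; only Files.createTempFile, Files.createDirectories and Files.copy are real JDK calls.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class UploadStore {

    // Hypothetical helper: saves the upload under a name chosen by the JDK.
    static Path store(InputStream upload, Path uploadDir, String extension) throws IOException {
        // The target directory has to exist before createTempFile is called.
        Files.createDirectories(uploadDir);
        // Creates a new, empty file whose name was not in use before the call.
        Path target = Files.createTempFile(uploadDir, "upload_", extension);
        // The empty file already exists, so the copy must be allowed to replace it.
        Files.copy(upload, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }
}
```

No user input reaches the file name here except the extension, which the caller is assumed to have validated already.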
You may want to use a Universally Unique Identifier. They are nicely supported in Java 7. If you use the static method UUID.randomUUID(), you should have a reasonably unique identifier. Note that in theory you could run across a duplicate, but the chances of that are extremely small, so much so that it is considered a very strong solution for what you are trying to do (see the discussion on the Wikipedia link).
Mind you, the generated sequence of characters is not user-friendly at all, but from what I understand of your requirements, that is all right.
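For illustration, a small sketch of building a file name from UUID.randomUUID(). The class UploadNames and the method randomName are hypothetical names, and the extension is assumed to have been validated elsewhere.

```java
import java.util.UUID;

final class UploadNames {

    // Hypothetical helper: a random (version 4) UUID followed by the given extension.
    static String randomName(String extension) {
        return UUID.randomUUID().toString() + extension;
    }

    public static void main(String[] args) {
        // Prints a 36-character UUID followed by ".png"; the value differs on every run.
        System.out.println(randomName(".png"));
    }
}
```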
Good luck!
I have written downloading a file in a simple manner:
@ResourceMapping(value = "content")
public void download(ResourceRequest request, ResourceResponse response) {
//...
SerializableInputStream serializableInputStream = someService.getSerializableInputStream(id_of_some_file);
response.addProperty(HttpHeaders.CACHE_CONTROL, "max-age=3600, must-revalidate");
response.setContentType(contentType);
response.addProperty(HttpHeaders.CONTENT_TYPE, contentType);
response.addProperty(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename*=UTF-8''"
+ URLEncoder.encode(fileName, "UTF-8"));
OutputStream outputStream = response.getPortletOutputStream();
byte[] parcel = new byte[4096];
int bytesRead;
while ((bytesRead = serializableInputStream.read(parcel)) != -1)
outputStream.write(parcel, 0, bytesRead); // write only the bytes actually read
outputStream.flush();
serializableInputStream.close();
outputStream.close();
//...
}
The SerializableInputStream is described here - JavaDocs. It allows an InputStream to be serialized and, for instance, passed over remoting.
I read from the input and write to the output in chunks, not all bytes at once. Unfortunately, the portlet isn't "streaming" the contents: the file (e.g. an image) seems to be sent to the browser only after the entire input stream has been read. I see the file being read from the database (in the live logs), but I don't see any "growing" image on the screen.
What am I doing wrong? Is it possible to really stream a file in Liferay 6.0.6 and Spring Portlet MVC?
Where are you doing this? I fear that you're doing this instead of rendering your portlet's HTML (e.g. render phase). Typically the portlet content is embedded in an HTML page, thus you need the resource phase, which (roughly) behaves like a servlet.
Also, the code you give does not match the actual question you ask: You use a comment //read from input stream (file), write file to os and ask what to do differently in order to not have the full content in memory.
As the comment does not have anything in memory and you could loop through reading from the input file while writing to the output stream: What's the underlying question? Do you have problems with implementing download-streaming in a portal environment or difficulties (i.e. using too much memory) reading from a file while writing to a stream?
Edit: Thanks for clarifying. Have you tried to flush the stream earlier? You can do that whenever you want - e.g. every loop (though that might be a bit too much). Also, keep in mind that the browser as well as the file itself must handle it in a way that you expect: If an image is not encoded "incrementally" a browser might not show it that way.
Have you tried this with huge files as well? It might be that the automatic flushing is just not triggered because your files are too small for it to be triggered...
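A rough sketch of what flushing earlier could look like, written against plain java.io streams rather than the portlet API. StreamCopy, copy and the flushEveryBytes threshold are invented for this example; whether the container or an intermediate proxy still buffers the response is outside the control of this code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopy {

    // Copies in to out, flushing each time roughly flushEveryBytes have been written.
    static long copy(InputStream in, OutputStream out, int flushEveryBytes) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int sinceFlush = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // only the bytes actually read
            total += n;
            sinceFlush += n;
            if (sinceFlush >= flushEveryBytes) {
                out.flush();
                sinceFlush = 0;
            }
        }
        out.flush(); // push out whatever is left
        return total;
    }
}
```

Flushing every few buffers instead of on every iteration keeps the number of small writes down while still letting data leave before the whole input has been read.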
Also, I think that filename*=UTF-8'' looks strange. It might be a valid encoding, but I've never seen it.
I want to limit the allowed uploaded file types to images, pdfs, and docs. What is the recommended way to approach this?
I assume checking the file extension alone is not enough, since an attacker can change the file extension as he wishes.
I also thought about checking against MIME Type using PostedFile.ContentType.
I still don't know whether this adds anything beyond checking the file extension alone, or whether an attacker can change this information just as easily.
This is basically for a course management system for students to upload assignments and teachers to download and view them.
Thanks.
I agree with validating the extension as shown by pranay_stacker, and checking against PostedFile.ContentType will provide another layer of security. But it still relies on the Content-Type header set by the client and is therefore susceptible to attack.
If you want to guarantee the file types, then you need to upload the file and check the first 2 bytes. Something along the lines of (untested):
string fileclass = "";
using(System.IO.BinaryReader r = new System.IO.BinaryReader(fileUpload1.PostedFile.InputStream))
{
byte buffer = r.ReadByte();
fileclass = buffer.ToString();
buffer = r.ReadByte();
fileclass += buffer.ToString();
r.Close();
}
if(fileclass!="3780")//.pdf 208207=.doc 7173=.gif 255216=.jpg 6677=.bmp 13780=.png
{
errorLiteral.Text = "<p>Error - The upload file must be in PDF format.</p>";
return;
}
This is very rough and not robust, hopefully someone can expand on this.
To be 99% sure, you'll have to check the magic numbers of the uploaded files, just like the UNIX file utility does.
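As a hedged illustration of a magic-number check, here is a Java sketch that compares the first bytes of an upload against a few well-known signatures (the byte values agree with the decimal pairs listed in the snippet above). MagicNumbers and detect are hypothetical names; the table is deliberately tiny, and a matching signature does not prove that the rest of the file is well-formed.

```java
final class MagicNumbers {

    private static final byte[] PDF  = {'%', 'P', 'D', 'F'};
    private static final byte[] PNG  = {(byte) 0x89, 'P', 'N', 'G'};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8};
    private static final byte[] GIF  = {'G', 'I', 'F', '8'};
    private static final byte[] DOC  = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0}; // legacy .doc container

    // Returns a short type label for the leading bytes, or "unknown".
    static String detect(byte[] head) {
        if (startsWith(head, PDF)) return "pdf";
        if (startsWith(head, PNG)) return "png";
        if (startsWith(head, JPEG)) return "jpg";
        if (startsWith(head, GIF)) return "gif";
        if (startsWith(head, DOC)) return "doc";
        return "unknown";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```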
I have a FileUploader control in my web form. If the file being uploaded is already present, I want to delete it, and overwrite it with the newly-uploaded file. But I get an error, as the file is in use by another process, and thus the application can't delete it. Sample code:
if (FUpload.HasFile)
{
string FileName = Path.GetFileName(FUpload.PostedFile.FileName);
string Extension = Path.GetExtension(FUpload.PostedFile.FileName);
string FolderPath = ConfigurationManager.AppSettings["FolderPath"];
string FilePath = Server.MapPath(FolderPath + FileName);
if (File.Exists(FilePath))
{
File.Delete(FilePath);
}
FUpload.SaveAs(FilePath);
}
Is there anything I can do apart from writing the code in try/catch blocks?
Generate a unique temporary file name. Rename it to your destination when complete. You may still have collisions if someone uploads the "same" file name at the same time. You should always be catching file system errors somewhere. If you don't do it here, may I suggest a global error handler in global.asax.
You can save your file under some other name and then, if the original exists, use File.Replace to replace the old file.
At the end of the day, due to potential race conditions on your web site (due to, hopefully, concurrent users), you can't get around try/catch. (Why are you averse to it?)
Utkarsh and No Refunds No Returns have the basic answer right -- save it with a temporary file name, then replace/overwrite the existing one if needed. A good approach for this is to use a GUID as the temporary file name, to ensure that there are no collisions on the filename alone.
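The save-then-replace approach, sketched in Java for illustration (the thread itself is about ASP.NET). SafeReplace and saveReplacing are made-up names, the ".tmp" suffix is arbitrary, and fileName is assumed to be sanitized already. Any IOException is left for the caller to handle, in line with the point about try/catch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.UUID;

final class SafeReplace {

    // Writes the upload to a uniquely named temp file, then moves it over the target.
    static Path saveReplacing(InputStream upload, Path dir, String fileName) throws IOException {
        Path temp = dir.resolve(UUID.randomUUID() + ".tmp");
        try {
            Files.copy(upload, temp); // fails if temp already exists, which a random UUID makes very unlikely
            Path target = dir.resolve(fileName);
            return Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // After a successful move this is a no-op; after a failure it cleans up the partial file.
            Files.deleteIfExists(temp);
        }
    }
}
```

The existing file is only touched once the new content has been written completely, so a failed upload leaves the old version in place.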
Depending on the nature of your application, you could get quite a few files stacked up, uploaded by different users, with lots of potential name conflicts. Depending on the nature and scale of your app, as well as its security boundaries, you might consider giving each user his/her own directory, based on user ID (how you'd identify the user in the database). Each user uploads his/her files there. If there's a name collision, you can bounce back to the user (holding the GUID name in session if needed) and ask if he/she wants to overwrite, and know with confidence that the answer is safe.
If the user declines to overwrite, you can delete your temp file.
If the user agrees to overwrite, you can delete the original and write the new one.
In either event, all of this is localized to the user's own directory, and thus (unless multiple users are signed on with the same ID) the behavior is safe.
In general, this will be more robust and safe than arbitrarily overwriting file name collisions.
Again, due to race conditions and other situations beyond your control, you need to use a try/catch block any time you attempt to write to the file system. Why? What if the drive is out of space? What if the file you are attempting to overwrite is legitimately in use by another process? What if the file you are attempting to overwrite has NTFS permissions forbidding the web process from touching it? So on and so forth. You need to be prepared to handle these kinds of exceptions.
I run a rather large site where my members add thousands of images every day. Obviously there is a lot of duplication, and I was wondering whether, during an image upload, I can somehow generate a signature or hash of the image and store it. Every time someone uploads a picture, I would simply check whether this signature already exists and show an error stating that the image already exists. I'm not sure if this kind of technology already exists for ASP.NET, but I am aware of tineye.com, which sort of does it already.
If you think you can help, I would appreciate your input.
Kris
A keyword that might be of interest is perceptual hashing.
You can use any class derived from HashAlgorithm to generate a hash from the byte array of the file. Usually MD5 is used, but you could substitute any of those provided in the System.Security.Cryptography namespace. This works for any binary, not just images.
Lots of sites provide MD5 hashes when you download files to verify if you've downloaded the file properly. For instance, an ISO CD/DVD image may be missing bytes when you've received the whole thing. Once you've downloaded the file, you generate the hash for it and make sure it's the same as the site says it should be. If all compares, you've got an exact copy.
I would probably use something similar to this:
public static class Helpers
{
//If you're running .NET 2.0 or lower, remove the 'this' keyword from the
//method signature as 2.0 doesn't support extension methods.
public static string GetHashString(this byte[] bytes, HashAlgorithm cryptoProvider)
{
byte[] hash = cryptoProvider.ComputeHash(bytes);
return Convert.ToBase64String(hash);
}
}
Requires:
using System.Security.Cryptography;
Call using:
byte[] bytes = File.ReadAllBytes("FilePath");
string filehash = bytes.GetHashString(new MD5CryptoServiceProvider());
or if you're running in .NET 2.0 or lower:
string filehash = Helpers.GetHashString(File.ReadAllBytes("FilePath"), new MD5CryptoServiceProvider());
If you were to decide to go with a different hashing method instead of MD5 because of the minuscule probability of collisions:
string filehash = bytes.GetHashString(new SHA1CryptoServiceProvider());
This way your hash method isn't crypto-provider specific, and if you decide to change which crypto provider you're using, you just inject a different one into the cryptoProvider parameter.
You can use any of the other hashing classes just by changing the service provider you pass in:
string md5Hash = bytes.GetHashString(new MD5CryptoServiceProvider());
string sha1Hash = bytes.GetHashString(new SHA1CryptoServiceProvider());
string sha256Hash = bytes.GetHashString(new SHA256CryptoServiceProvider());
string sha384Hash = bytes.GetHashString(new SHA384CryptoServiceProvider());
string sha512Hash = bytes.GetHashString(new SHA512CryptoServiceProvider());
Typically you'd just use MD5 or similar to create a hash. This isn't guaranteed to be unique though, so I'd recommend you use the hash as a starting point. Identify if the image matches any known hashes you stored, then individually load the ones that it does match and do a full byte comparison on the potential collisions to be sure.
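The "hash first, then compare bytes" idea can be sketched as follows, here in Java with SHA-256 and an in-memory map standing in for whatever storage the site really uses. ImageDeduplicator, sha256Hex and addIfNew are hypothetical names for this example.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ImageDeduplicator {

    // Hash (hex) -> every stored image with that hash; a stand-in for a database index.
    private final Map<String, List<byte[]>> byHash = new HashMap<>();

    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // every Java platform must provide SHA-256
        }
    }

    // Returns true if the image was stored, false if an identical image was already known.
    boolean addIfNew(byte[] image) {
        List<byte[]> candidates = byHash.computeIfAbsent(sha256Hex(image), k -> new ArrayList<>());
        for (byte[] candidate : candidates) {
            if (Arrays.equals(candidate, image)) {
                return false; // same hash and same bytes: a true duplicate
            }
        }
        candidates.add(image.clone());
        return true;
    }
}
```

Note that this only catches byte-identical files; a resized or re-encoded copy of the same picture produces a different hash.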
Another, simpler technique is to pick a smallish number of bits and read only the first part of the image, storing those starting bits as if they were a hash. This still leaves you with a small number of potential collisions that you'd need to check, but it has much less overhead.
Look in the System.Security.Cryptography namespace. You have your choice of several hashing algorithms/implementations. Here's an example using md5, but since you have a lot of these you might want something bigger like SHA1:
public byte[] HashImage(Stream imageData)
{
return new MD5CryptoServiceProvider().ComputeHash(imageData);
}
I don't know if it already exists or not, but I can't think of a reason you can't do this yourself. Something similar to this will get you a hash of the file.
var fileStream = Request.Files[0].InputStream;//the uploaded file
// MD5.Create() gives an unkeyed hash; HMACMD5 is keyed, so its output depends on the key
var hasher = System.Security.Cryptography.MD5.Create();
var theHash = hasher.ComputeHash(fileStream);
System.Security.Cryptography
I am building an MVC application in which I read a list of files from the file system, and I want to pass the relative URL of each file to the view, preferably prefixed with "~/" so that whatever view is selected can render the URL appropriately.
To do this, I need to enumerate the files in the file system and convert their physical paths back to relative URLs. There are a few algorithms I've experimented with, but I am concerned about efficiency and minimal string operations. Also, I believe there's nothing in the .Net Framework that can perform this operation, but is there something in the latest MVC release that can?
At the moment I don't know any built-in method to do it, but it's not difficult, I do it like this:
We need to get the Application root, and replace it in our new path with ~
We need to convert the backslashes to slashes
public string ReverseMapPath(string path)
{
string appPath = HttpContext.Current.Server.MapPath("~");
string res = string.Format("~{0}", path.Replace(appPath, "").Replace("\\", "/"));
return res;
}
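For comparison, the same two steps (strip the application root, normalize the separators) sketched in Java with java.nio.file. AppPaths and toAppRelativeUrl are hypothetical names, and appRoot stands in for whatever Server.MapPath("~") would return. Unlike a plain string replace, relativize only removes the root when it really is a prefix of the path.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

final class AppPaths {

    // Turns a physical path below appRoot into a "~/"-prefixed, slash-separated path.
    static String toAppRelativeUrl(Path appRoot, Path file) {
        Path root = appRoot.toAbsolutePath().normalize();
        Path absolute = file.toAbsolutePath().normalize();
        if (!absolute.startsWith(root)) {
            throw new IllegalArgumentException(file + " is not inside " + appRoot);
        }
        // On Windows the separator is a backslash; URLs always use forward slashes.
        String relative = root.relativize(absolute).toString().replace(File.separatorChar, '/');
        return "~/" + relative;
    }

    public static void main(String[] args) {
        Path root = Paths.get("app").toAbsolutePath();
        System.out.println(toAppRelativeUrl(root, root.resolve("images").resolve("a.png")));
    }
}
```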
Isn't this what UrlHelper.Content method does? http://msdn.microsoft.com/en-us/library/system.web.mvc.urlhelper.content.aspx
I did some digging, trying to get the UrlHelper class to work outside of a controller, then I remembered an old trick to do the same thing within an aspx page:
string ResolveUrl(string pathWithTilde)
Hope this helps!
See:
https://msdn.microsoft.com/en-us/library/system.web.ui.control.resolveurl(v=vs.110).aspx