How does StackOverflow (and other sites) handle orphaned uploads? [closed]

I've been wrestling with a problem related to allowing users to upload files to my web server through a web page. How can I prevent uploaded files from being "orphaned" when a user uploads a file and then forgets about it?
For example, when I create a question on StackOverflow, I have the option of adding an image to my question. This uploads the file and provides a URL to the resource on an SO server. What if I upload a file, but never submit my question (either I changed my mind, or my computer crashed, or I just closed the browser, etc...). Now there's a file on the SO server that isn't being used by anything. Over time, these could build up.
How can I handle this? Would a background process / automated task that periodically checks for unused or expired files be sufficient? Maybe I'm overcomplicating this?

I can't speak for SO, but the way I've always done it is to run a scheduled task (e.g. a cron job) that goes through the uploads directory and deletes any file that has no matching entry in the uploads table and whose creation date is more than 24 hours old.
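For illustration, here is a minimal sketch of that task in Java, meant to be run from cron. Everything specific in it is an assumption: the /var/www/uploads path, the JDBC URL and credentials, and the uploads table with a filename column are stand-ins for whatever your application actually uses.

```java
import java.io.File;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.HashSet;
import java.util.Set;

public class OrphanCleanup {
    public static void main(String[] args) throws Exception {
        // Assumption: uploads live here and every kept file has a row
        // in an "uploads" table. Adjust names to your own schema.
        File uploadDir = new File("/var/www/uploads");
        long cutoff = System.currentTimeMillis() - 24L * 60 * 60 * 1000;

        // Collect every filename the application still references.
        Set<String> referenced = new HashSet<>();
        try (Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost/app", "app", "secret");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT filename FROM uploads")) {
            while (rs.next()) {
                referenced.add(rs.getString(1));
            }
        }

        // Delete files that are unreferenced AND older than 24 hours,
        // so an upload still in progress is never touched.
        for (File f : uploadDir.listFiles()) {
            if (f.isFile()
                    && !referenced.contains(f.getName())
                    && f.lastModified() < cutoff) {
                f.delete();
            }
        }
    }
}
```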
Secondly, having files uploaded to /tmp or %TEMP% first and then copied over to a proper uploads directory does wonders for this kind of thing: a half-finished or abandoned upload never touches the real uploads directory, and the OS's temp-file cleanup will eventually clear it away on its own.
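A sketch of that two-step pattern, again with illustrative paths: the file lands in the system temp directory first and is only moved into the permanent uploads folder when the user actually submits.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class TwoStepUpload {

    // Step 1: while the form is still open, the upload lands in the
    // system temp directory (/tmp or %TEMP%), not the real uploads folder.
    static Path receive(InputStream body) throws Exception {
        Path tmp = Files.createTempFile("upload-", ".part");
        Files.copy(body, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    // Step 2: only when the user actually submits do we move the file
    // into the permanent uploads directory. Anything never submitted
    // stays in temp, where the OS's cleanup eventually reclaims it.
    static Path commit(Path tmp, String finalName) throws Exception {
        Path dest = Paths.get("/var/www/uploads", finalName);
        // Note: a move across filesystems degrades to copy-and-delete.
        return Files.move(tmp, dest, StandardCopyOption.REPLACE_EXISTING);
    }
}
```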

Related

How to restructure the Alfresco Repository [closed]

We are using Alfresco Community Edition 5.0d. Unfortunately, best practices have not been followed from the beginning, so all documents are stored in the repository root folder. This folder now holds 800,000 records, which is causing performance issues in the application.
After looking at several recommendations to keep fewer files per folder, we want to move all existing documents into year-wise folders. What is the recommended way to move the documents?
I would suggest using a BatchProcessor in Java.
Your implementation of BatchProcessWorkProvider would fetch the documents under the repository root folder, and your implementation of BatchProcessWorker would move each document into a date folder (creating the folder first if it doesn't exist).
The BatchProcessor could be launched either manually from a Java web script or automatically via a patch on startup.
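To make that concrete, here is a rough sketch of the worker pair. It is only a sketch: the BatchProcessor constructor arguments and service calls are written from memory of the 5.x API, and repositoryRoot plus the injected services are assumed to be wired via Spring, so verify the details against your 5.0d code base.

```java
import java.text.SimpleDateFormat;
import java.util.Collection;
import java.util.Collections;
import java.util.Date;

import org.alfresco.model.ContentModel;
import org.alfresco.repo.batch.BatchProcessWorkProvider;
import org.alfresco.repo.batch.BatchProcessWorkerAdaptor;
import org.alfresco.repo.batch.BatchProcessor;
import org.alfresco.service.cmr.model.FileFolderService;
import org.alfresco.service.cmr.model.FileInfo;
import org.alfresco.service.cmr.repository.NodeRef;
import org.alfresco.service.cmr.repository.NodeService;
import org.alfresco.service.transaction.TransactionService;
import org.apache.commons.logging.LogFactory;

public class MoveToYearFolders {

    // Assumed to be injected via Spring, as usual for Alfresco beans.
    private TransactionService transactionService;
    private FileFolderService fileFolderService;
    private NodeService nodeService;
    private NodeRef repositoryRoot; // the folder holding the 800,000 documents

    public void execute() {
        BatchProcessWorkProvider<FileInfo> workProvider =
                new BatchProcessWorkProvider<FileInfo>() {
            private boolean served = false;

            @Override
            public int getTotalEstimatedWorkSize() {
                return 800000; // an estimate is enough for progress logging
            }

            @Override
            public Collection<FileInfo> getNextWork() {
                if (served) {
                    return Collections.emptyList();
                }
                served = true;
                // With 800k nodes you would page this query rather
                // than load every child in one go.
                return fileFolderService.listFiles(repositoryRoot);
            }
        };

        BatchProcessor<FileInfo> processor = new BatchProcessor<>(
                "MoveToYearFolders",
                transactionService.getRetryingTransactionHelper(),
                workProvider,
                4,     // worker threads
                100,   // nodes per transaction
                null,  // no ApplicationEventPublisher
                LogFactory.getLog(MoveToYearFolders.class),
                1000); // log progress every 1000 entries

        processor.process(new BatchProcessWorkerAdaptor<FileInfo>() {
            @Override
            public void process(FileInfo doc) throws Throwable {
                Date created = (Date) nodeService.getProperty(
                        doc.getNodeRef(), ContentModel.PROP_CREATED);
                String year = new SimpleDateFormat("yyyy").format(created);
                // A real worker would guard folder creation against
                // concurrent workers creating the same year folder.
                NodeRef yearFolder = fileFolderService.searchSimple(repositoryRoot, year);
                if (yearFolder == null) {
                    yearFolder = fileFolderService
                            .create(repositoryRoot, year, ContentModel.TYPE_FOLDER)
                            .getNodeRef();
                }
                fileFolderService.move(doc.getNodeRef(), yearFolder, null);
            }
        }, true);
    }
}
```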
If you choose this method, you might have to perform a full Solr reindex after the batch has run: I remember a bug in 5.0 that caused a moved node to be duplicated in the Solr indexes, with one version indexed at its original path and a copy indexed at its new path.
You can test for this by moving a single node and then searching for it by name in Share (or with any query that should return only that node). If you get two results, you have the bug.
A full Solr reindex can take a long time, depending on the number of files in the repository and their size.

Possible vulnerability within my application [closed]

Apologies if this is in the wrong category. I'm currently developing an application in ASP, and due to my inexperience with it I'm worried about vulnerabilities that a user could exploit.
My application is being coded from scratch: no templates or Visual Studio defaults, completely blank projects. The user is greeted with a login page, and their access rights in Active Directory determine which pages they can reach.
The exploit I'm worried about is a user performing a directory traversal to reach a page they're not allowed to access and change critical information there.
I'm afraid my inexperience has caught up with me. Could someone explain how I can limit user access or, if I'm overthinking this, correct me? Constructive criticism is welcome.
Microsoft's defaults do a lot to protect your application. If you're running in IIS, make sure the user the application pool runs under has write access only to the folders it actually needs to write into.
This is a very open-ended question and depends on many factors such as version of .net, server OS/IIS version, other handlers installed, etc. But a good start is to review the OWASP Top 10:
https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project#OWASP_Top_10_for_2013
Here's a list of some automated tools you can use for testing your implementation:
https://geekflare.com/online-scan-website-security-vulnerabilities/

How to detect website leaking data [closed]

I have recently built a website based on WordPress. I got a free theme from a source in Pakistan.
I have to use this theme because it perfectly serves my purpose, but I want to know whether it is quietly establishing a connection with another server and sending my data.
How can I detect whether my website is internally sending anything to the theme developer's server? I also need to know which servers are being communicated with: for example, whether any image is loaded from their server, any code is imported from it, or anything else is fetched from it at runtime.
Since you have the source code, you can simply look at what the theme does. A theme should be mostly (if not entirely) HTML and CSS; if there is a lot of suspicious PHP or JavaScript, I wouldn't use it.
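As a starting point for that inspection, here is a quick-and-dirty sketch that walks the theme directory and flags lines containing external URLs or the usual PHP obfuscation suspects. The default path and the patterns are illustrative; treat it as a triage aid, not a malware scanner.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;

public class ThemeScan {

    // Things worth a second look: hard-coded external URLs, PHP
    // obfuscation/eval tricks, and WordPress's own HTTP helpers.
    private static final Pattern SUSPICIOUS = Pattern.compile(
            "https?://|base64_decode|eval\\s*\\(|gzinflate|curl_exec|wp_remote_(get|post)");

    public static void main(String[] args) throws IOException {
        Path themeDir = Paths.get(args.length > 0 ? args[0]
                : "wp-content/themes/my-theme"); // illustrative default
        Files.walk(themeDir)
                .filter(p -> p.toString().matches(".*\\.(php|js|css|html)"))
                .forEach(ThemeScan::scan);
    }

    private static void scan(Path file) {
        try {
            List<String> lines = Files.readAllLines(file);
            for (int i = 0; i < lines.size(); i++) {
                if (SUSPICIOUS.matcher(lines.get(i)).find()) {
                    System.out.printf("%s:%d: %s%n", file, i + 1, lines.get(i).trim());
                }
            }
        } catch (IOException e) {
            System.err.println("skipping unreadable file: " + file);
        }
    }
}
```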
If you want to see whether it connects to outside sources, run it in a controlled environment and use a network sniffing tool such as Wireshark.
Generally speaking: if you don't trust the source you got your theme from and you aren't proficient enough to check for malicious code yourself, don't use it!
I would recommend downloading one of the themes provided directly by wordpress.org instead; those should be safe.

Could hackers steal dlls from an Azure Website? [closed]

I have an Asp.net MVC website that is hosted in Azure Websites. It uses dlls to access databases etc.
Can hackers potentially download those dlls?
No, DLL files in the bin folder can't be downloaded: the web server refuses to serve certain file types and folders, and bin is one of them.
A hacker could of course still get at the files, but they would have to break into the server and access them directly on disk; it can't be done just by downloading them.
Basically: no.
Of course, if you open up the web site permissions, and move DLLs to places they should not be, etc - all bets are off.

copying .tar.gz file while writing it [closed]

Is writing a .tar.gz file purely sequential?
I needed to copy a large file, so I started compressing it into a .tar.gz, and while it was still compressing I began scp'ing the archive to a different machine. Afterwards I checked the md5sum on both machines, and they did not match. I guess it wasn't the best idea ever to start reading the .tar.gz before it was finished; I had assumed that writing the .tar.gz would only ever append to the end, so that reading it part-way through would work out fine.
Does anybody know anything about the mechanics of this? What specifically is happening here?
Writing a .tar.gz is sequential in the sense that both tar and gzip only ever append to their output; neither seeks back to rewrite earlier bytes. tar in particular is a sequential archiver, originally designed for writing to tape.
But that doesn't make copying the file mid-write safe: scp just copies whatever bytes exist at the moment it reads them, so your copy was simply truncated, and the checksums can't match.
On top of that, gzip writes its trailer (a CRC-32 and the uncompressed length of the data) only when compression finishes, so until the writer is done, the partial file isn't even a valid gzip stream.
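You can check a suspect copy directly: this minimal sketch streams through a .tar.gz with Java's GZIPInputStream, which verifies that trailer as it reaches the end of the stream. Run it against a copy taken mid-write and it should fail.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

public class GzipCheck {
    public static void main(String[] args) {
        try (InputStream in = new GZIPInputStream(new FileInputStream(args[0]))) {
            byte[] buf = new byte[8192];
            // Decompress and discard; GZIPInputStream verifies the CRC-32
            // and length stored in the trailer as it reaches the end.
            while (in.read(buf) != -1) {
                // nothing to do with the bytes themselves
            }
            System.out.println("complete: gzip trailer verified");
        } catch (IOException e) {
            // A copy taken before gzip finished typically fails here with
            // "Unexpected end of ZLIB input stream" or a corrupt-trailer error.
            System.out.println("truncated or corrupt: " + e.getMessage());
        }
    }
}
```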
