The question speaks for itself: is there a convenient wrapper for system-specific functions in Qt, so I can tell how much of the system's resources are currently in use?
I want to execute an expensive task when the system is idle. For your information (I might need to put this in another question), I want to calculate the content hash of a file. I thought of doing it with streams instead of a basic readAll followed by a call to QCryptographicHash, but I haven't found how to do it yet, so I'm stuck with reading the whole file and calling hash()...
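For concreteness, this is the kind of incremental hashing I have in mind (an untested sketch; only QFile and QCryptographicHash are assumed, and the 1 MB chunk size is arbitrary):

#include <QFile>
#include <QByteArray>
#include <QCryptographicHash>

// Hash a file in bounded chunks so memory use stays flat even for huge files.
QByteArray hashFileIncrementally(const QString &path)
{
    QFile file(path);
    if (!file.open(QIODevice::ReadOnly))
        return QByteArray();                      // caller decides how to handle open errors

    QCryptographicHash hash(QCryptographicHash::Sha1);
    while (!file.atEnd())
        hash.addData(file.read(1024 * 1024));     // feed the hash one chunk at a time
    return hash.result().toHex();
}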
You need to use platform-specific code to detect resource usage.
For the fastest file access when calculating hashes, use memory-mapped files (QFile::map()).
To get system data on Linux, you can read the virtual files under /proc.
On Windows, you may want to look at WMI queries.
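On the Linux side, a minimal sketch of an "is the machine busy?" check is reading /proc/loadavg, whose first field is the 1-minute load average. The 0.5 threshold below is only an illustration; on Windows you would query WMI or the performance counters instead.

#include <QFile>
#include <QByteArray>
#include <QList>

// Returns true when the 1-minute load average looks low enough to start heavy work.
bool systemLooksIdle()
{
    QFile loadavg("/proc/loadavg");
    if (!loadavg.open(QIODevice::ReadOnly))
        return false;                             // not Linux, or /proc is unavailable

    const QList<QByteArray> fields = loadavg.readLine().split(' ');
    return !fields.isEmpty() && fields.first().toDouble() < 0.5;
}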
I am looking for a good solution to prevent an .exe file from being uploaded to the server.
It would be best if we could discard the upload by reading the file headers as soon as we receive them, rather than waiting for the entire file to upload.
I have already implemented an extension check; I am looking for a better solution.
There is a how part and a when/where part. The how is fairly simple: binary files contain a header, and the header is fairly easy to strip out and check. For Windows files, see the article Executable-File Header Format. Similar formats are used for other binary types, so you can determine which types you allow and which you do not.
NOTE: The linked article covers full querying of the file. There are cheap, down-and-dirty shortcuts where you only examine a few bytes.
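To make the "few bytes" shortcut concrete: Windows executables (the PE format from the linked article) begin with the two-byte signature 'M' 'Z', so a first-pass filter can be as small as the sketch below. It only catches the obvious case; a determined user can still disguise a file, so the full header walk remains the robust route.

#include <cstddef>

// Quick first-pass check on the first bytes of an upload: PE/MZ executables
// start with 'M' (0x4D) followed by 'Z' (0x5A).
bool looksLikeWindowsExecutable(const unsigned char *buffer, std::size_t length)
{
    return length >= 2 && buffer[0] == 'M' && buffer[1] == 'Z';
}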
The when/where depends on how you are getting the files. If you are using a highly abstracted methodology (an upload library), which is fairly common, you may have to stream the entire file before you can start querying the bits. Whether it is streamed into memory or you have to save and delete depends on your code and possibly on the library. If you control the upload streaming yourself, you can examine the first bytes (the header portion) as they arrive and abort the process mid-stream.
The first point of access to uploaded data would be in an HttpModule.
Technically you can determine whether you have an .exe on your hands before all the bytes are sent and cancel the upload. It can get quite complicated, depending on how far you want to take this.
I suggest you look at the HttpModule in Brettle's NeatUpload. Maybe it will give you a lead on how to deal with this at the level you want.
I think you can do that with JavaScript by checking whether the file name ends with .exe before submitting the data, and also do the check server-side.
Is there a way to identify what process is using a particular DirectShow filter? Specifically a video capture filter.
If our application throws an exception trying to use a DirectShow filter because it's already in use, we would like to identify the process that is using the filter and kill it. Of course this is not a general purpose or distributed application but one installed on a dedicated computer whose sole purpose is to run our application.
Thanks,
Ideally, I think killing a process should be avoided by all means... many bad things can happen as a result. That said, my proposal consists of five parts:
Locating the filter DLL file in the file system.
Enumerating all running processes.
Enumerating all loaded modules of each process.
Identifying which process is using the filter.
Killing the process.
Since you did not specify any language or programming framework, I will assume C#/.net just for convenience.
1- DirectShow filters are just COM objects, so they are registered in the system as such. You need to figure out the GUID of your filter; using that GUID, you can locate the registry key where the object's information is stored, and from there retrieve the location of the DLL in the file system. Microsoft.Win32.Registry can be used to access the registry.
2- System.Diagnostics.Process.GetProcesses() can be used to enumerate all running processes.
3- System.Diagnostics.Process.Modules can be used to enumerate all modules (DLLs) loaded by a process.
The rest of the steps should be trivial.
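The answer above assumes C#/.NET. For completeness, here is a rough native Win32/C++ sketch of steps 2-4 using the ToolHelp snapshot API; it assumes you have already resolved the filter's DLL path from its registry entry, and the function name is made up for the example.

#include <windows.h>
#include <tlhelp32.h>
#include <string>
#include <vector>

// Return the IDs of all processes that currently have the given DLL loaded.
std::vector<DWORD> processesUsingModule(const std::wstring &filterDllPath)
{
    std::vector<DWORD> owners;
    HANDLE procSnap = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    if (procSnap == INVALID_HANDLE_VALUE)
        return owners;

    PROCESSENTRY32W pe = { sizeof(pe) };
    for (BOOL ok = Process32FirstW(procSnap, &pe); ok; ok = Process32NextW(procSnap, &pe)) {
        HANDLE modSnap = CreateToolhelp32Snapshot(TH32CS_SNAPMODULE, pe.th32ProcessID);
        if (modSnap == INVALID_HANDLE_VALUE)
            continue;                                // access denied, system process, etc.

        MODULEENTRY32W me = { sizeof(me) };
        for (BOOL more = Module32FirstW(modSnap, &me); more; more = Module32NextW(modSnap, &me)) {
            if (lstrcmpiW(me.szExePath, filterDllPath.c_str()) == 0) {
                owners.push_back(pe.th32ProcessID);  // candidate to inspect (or, reluctantly, kill)
                break;
            }
        }
        CloseHandle(modSnap);
    }
    CloseHandle(procSnap);
    return owners;
}

Note that a 32-bit process cannot snapshot the modules of 64-bit processes (and vice versa), so the bitness of this helper should match the processes you expect to be holding the filter.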
I have a program that monitors certain files for change. As soon as a file gets updated, it is processed. So far I've come up with this general approach for handling "real-time analysis" in R. I was hoping you have other approaches; maybe we can discuss their advantages and disadvantages.
monitor <- TRUE
watched.file <- "data.txt"       # path of the file to monitor (example)
sleep.time <- 5                  # seconds to wait between checks
start.state <- file.info(watched.file)$mtime  # modification time of the file when initiating

while (monitor) {
  change.state <- file.info(watched.file)$mtime
  if (start.state < change.state) {
    # process the updated file here
    start.state <- change.state  # reset the baseline so each change is processed once
  } else {
    print("Nothing new.")
  }
  Sys.sleep(sleep.time)
}
Similar to the suggestion to use a system API, this can also be done using qtbase, which provides a cross-platform way to do it from within R:
dir_to_watch <- "/tmp"
library(qtbase)
fsw <- Qt$QFileSystemWatcher()
fsw$addPath(dir_to_watch)
id <- qconnect(fsw, "directoryChanged", function(path) {
message(sprintf("directory %s has changed", path))
})
cat("abc", file="/tmp/deleteme.txt")
If your system provides an API for monitoring filesystem changes, then you should use that. I believe Macs come with this. Not sure about other platforms though.
Edit:
A quick Google search gave me:
Linux - http://wiki.linuxquestions.org/wiki/FAM
Win32 - http://msdn.microsoft.com/en-us/library/aa364417(VS.85).aspx
Obviously, these APIs will eliminate any polling that you require. On the other hand, they may not always be available.
Java has this: http://jnotify.sourceforge.net/ and http://java.sun.com/developer/technicalArticles/javase/nio/#6
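For reference, a minimal sketch of the Win32 API behind the second link (FindFirstChangeNotification); the watched directory is just an example path. This removes the polling at the cost of platform-specific code.

#include <windows.h>
#include <cstdio>

int main()
{
    // Ask Windows to signal us whenever something in the directory is written to.
    HANDLE change = FindFirstChangeNotificationW(
        L"C:\\data\\incoming",                      // directory to watch (example)
        FALSE,                                      // do not watch subdirectories
        FILE_NOTIFY_CHANGE_LAST_WRITE);
    if (change == INVALID_HANDLE_VALUE)
        return 1;

    for (;;) {
        if (WaitForSingleObject(change, INFINITE) != WAIT_OBJECT_0)
            break;
        std::printf("something in the directory changed\n");
        if (!FindNextChangeNotification(change))    // re-arm for the next event
            break;
    }
    FindCloseChangeNotification(change);
    return 0;
}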
I have a hack in mind: you can set up a cron job/scheduled task to run an R script every n seconds (or whatever). The script checks the file hash, and if the hashes don't match, it runs the analysis. You can use the digest::digest function; just check out the manual.
If you have lots of files that you want to monitor, then R may be too slow for this purpose. Go to your c: or / dir and see how long it takes to do file.info(dir(recursive = TRUE)). A dos or bash script may be quicker.
Otherwise, the code looks fine.
You could use the tclTaskSchedule function in the tcltk2 package to set up a function that checks for updates and runs your code. This would then be run on a regular basis (you set the timing) but would still allow you to use your R session.
I'll offer another solution for Windows that I have been using in a production environment. It works perfectly, I find it very easy to set up, and under the hood it accesses the system API for monitoring folder changes, as others have mentioned, but all the "hard work" is taken care of for you. I use a freely available piece of software called Folder Monitor by Nodesoft (well described here). Once you run the program, it appears in your system tray, and from there you can specify a given directory to monitor. When files are written to the directory (or changed or modified - there are a few options to choose from), the program executes any program you like.

I simply point it at a Windows batch file that calls my R script. For example, I have Folder Monitor set up to watch a "\\myservername\DropOff" UNC path for any new data files written to it. When Folder Monitor detects new files, it executes RunBatch.bat, which simply runs an R script (see here for information on setting that up) that validates the format of the expected file based on an expected naming convention, then unzips and processes the data, creates a data frame, and ultimately loads it into a SQL Server database. It just doesn't get any easier.
One note if you decide to use this solution: take a look at the optional delay execution parameter, which might be important if files take a while to copy into the target directory from the source location.
We're creating a web service where we're writing files to disk. Sometimes these files will be read at the same time as they are written.
If we do this - writing to and reading from the same file - we sometimes end up with files that are of the same length, but where some of the data inside is not the same. With a 350 MB file we get maybe 20-40 bytes that differ.
This problem mostly occurs when we have 3-4 files being written and read at the same time. Could this be because there is no guarantee that after a "write" the data has actually reached the disk, i.e., the write is asynchronous?
Also, the computer we're testing on is just a standard macbook pro, so no fancy disks of any kind.
The bug might be somewhere else, but we just wanted to ask the question and see if anybody knew something about this writing+reading thing.
All modern OSs support concurrent reading and writing to files (obviously, given a single writer). So this is not an OS level bug. But do make sure you do not have multiple threads/processes trying to append data to the file.
Check your application code. Check the buffers you are using. Make sure your application is synchronized and there are no race conditions between readers and writers.
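One common culprit is buffered data that has not actually reached the file by the time a reader opens it. A hedged sketch of what "make sure it is really written" can look like on POSIX systems (on Windows, FlushFileBuffers plays the role of fsync):

#include <stdio.h>
#include <unistd.h>    // fsync, fileno

// Write a chunk and force it out of the stdio buffer and the kernel cache
// before telling any reader that the chunk is available.
bool writeChunk(FILE *out, const char *data, size_t length)
{
    if (fwrite(data, 1, length, out) != length)
        return false;
    if (fflush(out) != 0)                // flush the stdio buffer to the kernel
        return false;
    return fsync(fileno(out)) == 0;      // flush the kernel buffers to the disk
}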
Suppose I have a file, urls.txt, that contains a list of URLs I'm monitoring. My monitoring script edits that file occasionally, say, to indicate whether each URL is reachable. I'd like to also manually edit that file, to add to or change the list of URLs. How can I allow that such that I don't have to think about it when manually editing?
Here are some possible answers. What would you do?
Engage in hackery like having the program check for the lockfiles that vim or emacs create. Since this is just for me, this would actually work.
If the human edits always take precedence, just always have the human clobber the program's changes (e.g., ignore the editor's warning that the file has changed on disk). The program can then just redo its changes on its next loop. Still, changing the file while the user is editing it is not so nice.
Never let a human touch a file that a program makes ongoing modifications to. Rethink the design and have one file that only the human edits and another file that only the program edits.
Give the human a custom tool to edit the file that does the appropriate file locking. That could be as crude as locking the file and then launching an editor, or a custom interface (perhaps a simple command line interface) for inserting/changing/deleting entries from the file.
Use a database instead of a flat file and then the locking is all taken care of automatically.
(Note that I concocted the URL monitoring example to make this more concrete and because what I actually have in mind is perhaps too weird and distracting -- this question is strictly about how to let humans and programs both modify the same state file.)
I'd use a database since that's basically what you're going to have to build to achieve what you want. Why re-invent the wheel?
If a full-blown DBMS is too much of a load, split the file into two and synchronize them periodically. Whether a URL is reachable doesn't sound like something the user would be changing, so it should not be editable by them.
During the synchronization process (which would have to lock out both the monitor and the user, although it could be a sub-function of the monitor), remove entries from the monitor file that aren't in the user file. Also, add to the monitor file those entries that have been added to the user file (and start monitoring them).
But, I'd go the database method with a special front-end for the user, since you can get relatively good light-weight databases nowadays.
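If a light-weight database is acceptable, SQLite is one concrete option where the file locking is handled by the library: the monitor and a small human-facing front end can both open the same database file, and SQLite serializes writers for you. A minimal sketch (the urls.db name and table layout are made up for the example):

#include <sqlite3.h>
#include <stdio.h>

int main(void)
{
    sqlite3 *db = NULL;
    if (sqlite3_open("urls.db", &db) != SQLITE_OK)
        return 1;

    // If the other party (monitor or editor) holds the write lock, wait up to 2 s.
    sqlite3_busy_timeout(db, 2000);

    char *err = NULL;
    int rc = sqlite3_exec(db,
        "CREATE TABLE IF NOT EXISTS urls (url TEXT PRIMARY KEY, reachable INTEGER);",
        NULL, NULL, &err);
    if (rc != SQLITE_OK) {
        fprintf(stderr, "%s\n", err);
        sqlite3_free(err);
    }
    sqlite3_close(db);
    return rc == SQLITE_OK ? 0 : 1;
}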
Use a sensible version control system!
(Git would work well here).
That said, the nature of the problem implies that a real database would be best - and they will generally have either database-level, table-level, or row-level locking - but then put any scripts you need into version control.
I would go with option 3. In fact, I would have the program read the human-edited input file, and append the results of each query to a log file. In this way, you can also analyse the reachability of sites over time. You can also have the program maintain a file that indicates the current reachability state of each site in the input file, as a snapshot of the current state.
One other option is using two files, one for automated access and one for manual. You'd need a way in the user file to indicate modifications or deletions but you'd have similar problems in some of the other solutions as well.