I have developed a Client/Server application for IM with Qt. So far messages are sent and displayed at the client side, but when the program is closed the messages are no longer available since a proper storage is missing.
I would like to keep the messages on the client devices and avoid to store everything on the server. I don't want to use a DB either since it needs to be installed and I would like to keep everything quite easy.
Therefore I was thinking of simply storing everything in an encrypted file, but I couldn't think of a proper format to do that.
Has anyone experience with that or any suggestions how to save the messages from different clients?
You do have a concern with data integrity in face of unplanned termination of your software, due to bugs in your code, transient hardware errors, power outages, etc. That's the problem that everyone using "plain files" usually ignores, as it's a hard problem to solve and requires extensive testing and know-how.
That's why you should use an embedded database. It will solve that, and many other problems as well. SQLite is a de-facto standard for applications such as yours. You can add any encryption you wish, as SQLite provides hooks that let you implement writing and reading of the pages. You'd do the encryption there.
One little-appreciated aspect of SQLite specifically is the amount of testing it gets during development. The test harness, most of it non-public, is probably worth way more than the published SQLite code (>1M USD). SQLite is used in aerospace applications, e.g. IIRC in code classified as DAL-B under DO-178B.
Related
I have developed a tool that loads in an configuration file at runtime. Some of the values are encrypted with an AES key.
The tool will be scheduled to run on a regular basis from a remote machine. What is an acceptable way to provide the decryption key to the program. It has a command line interface which I can pass it through. I can currently see three options
Provide the full key via CLI, meaning the key is available in the clear at OS config level (i.e. CronJob)
Hardcode the key into the binary via source code. Not a good idea for a number of reasons. (Decompiling and less portable)
Use a combination of 1 and 2 i.e. Have a base key in exe and then accept partial key via CLI. This way I can use the same build for multiple machines, but it doesn't solve the problem of decompiling the exe.
It is worth noting that I am not too worried about decompiling the exe to get key. If i'm sure there are ways I could address via obfuscation etc.
Ultimately if I was really conscious I wouldn't be storing the password anywhere.
I'd like to hear what is considered best practice. Thanks.
I have added the Go tag because the tool is written in Go, just in case there is a magical Go package that might help, other than that, this question is not specific to a technology really.
UPDATE:: I am trying to protect the key from external attackers. Not the regular physical user of the machine.
Best practice for this kind of system is one of two things:
A sysadmin authenticates during startup, providing a password at the console. This is often extremely inconvenient, but is pretty easy to implement.
A hardware device is used to hold the credential. The most common and effective are called HSMs (Hardware Security Modules). They come in all kinds of formats, from USB keys to plug-in boards to external rack-mounted devices. HSMs come with their own API that you would need to interface with. The main feature of an HSM is that it never divulges its key, and it has physical safeguards to protect against it being extracted. Your app sends it some data and it signs the data and returns it. That proves that that the hardware module was connected to this machine.
For specific OSes, you can make use of the local secure credential storage, which can provide some reasonable protection. Windows and OS X in particular have these, generally keyed to some credential the admin is required to type at startup. I'm not aware of a particularly effective one for Linux, and in general this is pretty inconvenient in a server setting (because of manual sysadmin intervention).
In every case that I've worked on, an HSM was the best solution in the end. For simple uses (like starting an application), you can get them for a few hundred bucks. For a little more "roll-your-own," I've seen them as cheap as $50. (I'm not reviewing these particularly. I've mostly worked with a bit more expensive ones, but the basic idea is the same.)
what's the proper way to capture biometric information (pressure, speed...) by signing with a stylus on a canvas developed in a JSP web Page
Alright, since no one else has attempted to answer this question, I shall elaborate on my comment and opefully it will serve as an answer to others as well.
First, Java Server Pages (JSP) is a server-side language. It is meant to run on the web-server and not on the user's browser. The same goes for other server-side languages like PHP and ASP.
So a server-side language is not able to directly interact with devices (keyboard, scanners, cameras, etc). Only when the data is submitted by the browser or client program, the server receives it for processing.
For a device to receive input, there are two key pieces of software needed.
The device driver: which must be installed on the user's machine
The application program to capture inputs and do any processing.
If either one is missing, the device cannot function. And then there's another issues. Depending on the device, there's various feedback from the driver/API that should go back to the application that reads it. For example, if a fingerprint scan was not very successful for some reason, the scanner should tell this to the user. So again, there's the need for interactivity between the device and the user's application.
Thus, using any server-side language is out of the question for such applicatoins.
Now, in order to make this possible, you may use a client-side program. Here are some options.
A native application in VB, C/C++, Pascal or other language. If this is an option, the user must install this application on their computer.
A browser-based program. This can be a program created using JAVA (not Javascript or JSP), or ActiveX component. ActiveX is largely OS/browser dependent. And the TRUTH is that even Java is not truly platform independent when it comes to different operating systems. There are some technical differences that you'll need to look into. But for the most part of interactivity and high-level operations, yes, Java is more platform-independent than the others. But on a personal note, Java is my worst language. I try not to use it anywhere anymore. That's a different story.
In both options above, every client machine must have their own proprietory drivers and often some sort of API for browser integration.
A year or so ago, I had to program a Bio-Mini fingerprint scanner using VB. It was all sweet in the beginning. Then due to the restrictions of networkability and concurrent usage, the drivers/SDK could not take the load and things were going wrong. By the way, the drivers/SDK were meant for MS-Access. Knowing that the DB was the problem, I started to port this to MySQL. And it was a severe climb from there. I had to do a near-rewrite of the SDK for capturing and comparing data using arrays in VB. And to make things worst, the device was changed and things went wrong again. But do note that the new device was from the same manufacturer.
So keep in mind that even a simple change like that can cause a problem.
I've been having nothing but problems with Blackberry development and SQLite for Blackberry in general.
I'm thinking of alternatives for storing data on the device.
First of all the data stored on the device comes from web service calls 99% of the time. The web service response can range from less than 0.5kB up to 10 or maybe even 20 Kb.
A lot of the trouble I've been having revolves around the fact that I use threads to make my web service calls asynchronous, and many conflicts arise between database connections. I've also been having trouble with a DatabaseOutOfMemoryException, which I haven't even found in the documentation.
Is storing the web service response in it's raw XML (as an xml or txt file on the device) and just reading it from there everytime I want to load something on the UI a good idea?? Right now I just get the raw XML in a string and parse it (using DocumentBuilder etc...), storing the contents into different tables of my SQLite.
Would doing away with SQLite and using XML exclusively be faster?? Would it be easier?? Would there be conflicts with read/write access to open files? My app has a lot of read/write going on, so I'd like to make it as easy as possible to manage.
Any ideas would be great, thanks!!
You can use the persistent store, instead of SqLite. One big advantage of the persistent store is that it is always available - no worries about missing SDCards or the filesystem being mounted while the device is USB connected. By "big", I mean this is absolutely huge from a support perspective. Explaining all the edge cases around when a SqLite database is usable on BlackBerry is a huge pain.
The biggest disadvantage of the persistent store is the 64kb limit per object. If you know all your XML fragments never exceed that, then you're fine. However, if you might exceed 64kb, then you'll need to come up with a persistable object that intentionally fragments any large streams into components under 64kb each.
Our company has people in every catastrophic event here in the U.S. and parts of Canada. An example is they were quite prevalent in Katrina immediately after the event.
We are constructing an application to improve their job in the field which may be either ASP.NET or WPF, and the disconnect requirement makes us believe it will be a WPF application. Our people need to be able to create their jobs, provide all of the insurance and measurement data, and save it as if in the database whether or not the internet is available.
The issue we are trying to get our heads around is that when at catastrophic events our people need to be able to use our new application even when the internet is not available. (They were offline for 3 days in Katrina)
Has anyone else had to address requirements like this and suggestions on how they approached functioning on small-footprint devices while saving data as if they were still connected to the backend services and database? We also have to incorporate security into this as well, and do it well enough that their entered data loads into the connected database without issues.
Our longterm goal is to also provide this application for Android and IPad Tablet devices as well as laptops. Our initial desire for ASP.NET was it gave us an immediate application for the tablet environment. In the old application they have, they run a local server, run remote connections on the tablets and run the application through terminal server. Not pretty. Not pretty.
I feel this is a serious question that is not subjective so hopefully this won't get deleted.
Our current architecture on the server side is Entity Framework with a repository pattern, WCF services to satisfy CRUD requests returning composite data transfer objects, and a proxy for use by the clients.
I'm interested in hearing other developers' input and this design puzzle.
Additional Information Added to the Discussion
Lots of good information provided!!! I'll have to look at Microsoft Sync for sure. For the disconnected database I would be placing only list tables (enumerations) in the initial database. Jobs and, if needed, an item we call dry books, will be added for each client we are helping. (though I hope the internet returns by the time we are cleaning and drying out the homes) These are the tables that would then populate back to the host once we have a stable link. In the case of Katrina we also lost internet connectivity in our offices which meant the office provided no communication relief for days as well.
Last night I realized that our client proxy is the key to everything working! The client remains unaware of the fact that it is online or offline and leaves the synchronization process within that library. We are discovering how much data we are talking about today. I also want to make it clear that ASP.NET was a like-to-have but a thick client (actually WPF with XAML) may end up being our end state.
Now -- for multiple updates. The disconnected work will be going to individual homes by a single franchise. In fact our home office dispatches specific franchises to specific events. So we have a reduced likelihood (if any) of the problem of multiple people updating a record. The reason is that they are creating records for each job (person's home/office/business) and only that one franchise will deal with it. Of course this also means that if they are disconnected for days that the device that creates the job (record of who, where, condition, insurance company, etc) is also the only device that knows of the job. But that can be lived with. In fact we may be able to have a facility to sync the franchise devices on a hub.
I'm looking forward to hearing additional stories of how you've implemented your disconnected environment.
Thanks!!!
Looking at new technology from Microsoft
I was directed to look at a video from TechEd 2012 and thought I might have an answer. The talk was on using ASP.NET and MVC4 along with 2 libraries for disconnected behavior. At first I thought it would be great but then as it continued it worried me quite a bit.
First the use of a javascript backend to support disconnected I/O does not generate confidence. As a compiler guy (and one who wrote two interpretive languages) I really do not like having a critical business model reliant upon interpretive javascript. And script at that! It may be me but it just makes me shudder.
Then they show their "great"(???) programming model having your ViewModel exist as just javascript. I do not care for an application (asp.net and javascript) that can be, and may as well be (for lack of intellisense ) written in notepad.
No offense meant to any asp lovers, but a well written C# program that has been syntactically and type checked gives me stronger confidence in software than something written with a hope and prayer that a class namespace has been properly typed without any means of cross check. I've seen too many hours of debugging looking for a bug that ended up in a huge namespace with transposed ie in it's name. I ran my thought past the other senior developers in my group and we are all in consensus on this technology.
But we continue to look. (I feel this is becoming more of a diary than a question) :)
Looks like a perfect example for Microsoft Sync Framework
http://msdn.microsoft.com/en-us/sync/bb736753.aspx
A comprehensive synchronization platform that enables collaboration
and offline access for applications, services, and devices with
support for any data type, any data store, any transfer protocol, and
any network topology.
I often find that building a lightweight framework to fit my specific needs is more beneficial to me than using an existing one. However, always look at what's available and weigh the pros and cons before making that decision.
I haven't use the Microsoft Sync Framework, but it sounds like that's a good one to research first. If you have Sql Server Standard (or some other version other than the Express version) then replication might also be an option.
If you want to develop your own homegrown solution, then be sure to put lastupdated and dateadded fields on any tables that need to stay in sync. It doesn't 'sound' like your scenario will be burdened by concurrency issues (i.e. if person A and B both modify a field at the same time, who wins?). If that's the case then developing your own lightweight solution will be pretty straightforward.
As Jeremy pointed out, you will need a way to get the changes. In addition to using a web service, you can also use WCF which is similar to a web service in some ways. But my personal bias would be towards just accessing a SQL server remotely over the internet. The downside of that solution is added security concerns, while the upside is decreased development overhead (i.e. faster/easier development now and less maintenance over time). Also, the direct SQL solution is also assuming that this is an internal application... that you're in charge of all development and not working with 3rd parties who need access to your data and wouldn't be allowed to access it this way.
Not really a full answer but too much for a comment.
I have two apps one that synchs one way and the other two way.
I do a one way synch to client for disconnected operation. At the server full SQL Server and at the client Compact Edition. TimeStamp is a prefect for finding any rows that needs to be synched. I also don't copy the whole database as some of the largest table are non nonessential. The common use is the user marks identified records they want to synch.
If synch does what you need great +1 for Jakub. For me I don't have the option to synch the whole MSSQL both based on size and security.
Have another smaller application that synchs two way but in this case it has regions and update are only within the region. So a region only synchs their data and in disconnected mode they can only add new records. Update to an existing records must be performed in connected mode. That was mangeable. In that case MSSQL for the master and used XML for the client.
No news to you but the hard part of a raw synch is that two parties may have added or revised the same record.
Background:
Enterprise application - very will written for its time in 2004.
Stack:
.NET, Heavy use of Remoting, ASMX style web services, SQL Server
Problem:
The application allows user to go through various wizards for lack of a better term, all of their actions are stored in what we call "wiz state", which is essentially XML that is persisted to a SQL server database very frequently because we allow users to pause/resume their application. Often in these wizards, the XML that comprises the wizard state grows very large, I'm talking 5-8 MB of data, and we noticed that when we had a sudden influx of simultaneous users, we started receiving occasional timeouts against the database, because a lot of what the wizard state is comprised of, is keeping track of collections of "things". Sometimes these custom collections grow very large.
Question:
We were in a meeting today and we're expecting a flurry of activity in October that will test the system like never before, and possibly result in huge wizard states that go back and forth from the web server to the database. The crux of the situation is that there is only one database and one web server.
For arguments sake, because of the complexity of the application, lets say adding any kind of clustering/mirroring to increase database throughput is out of the question. I spoke up in the meeting and said the quickest way to address this in the shortest time period would be to add more servers to the front end web application so the load could be distributed amongst web servers. The development lead said I was completely wrong and it would have no effect because we only have one database, so adding more web power would do nothing. He is having one of the other developers reduce the xml bloat that we persist frequently to the database. Probably in the long run, reducing the size of the xml that we pass back and forth is the right idea, but will adding additional web servers truly have no effect, I just think in terms of simultaneous users, it should help.
Any responses thoughts are appreciated, proof that more web servers would help would be pure win.
Thanks.
EDIT: We use binary serialization to store the XML in the database in an image field.
I haven't heard anything about locating the "bottlenecks". Isn't that the first thing to do? Here's the method I use.
Otherwise you're just investing in guesses. That won't work.
I've been in meetings like that, where everybody gets excited throwing ideas around, and "management" wants to make "decisions", but it's the blind leading the blind. Knuckle down and find out what's going on. You can't do that in meetings.
Some time ago I looked at a performance problem with some similarity to yours. The biggest "bottleneck" was in writing and parsing XML, with attendant memory allocation, setup, and destruction. Then there were others as well. You might find the same thing, or something different.
P.S. I keep quoting "bottleneck" because all the performance problems I've found have been nothing at all like the necks of bottles. Rather they are like way over-bushy call trees that need radical pruning, such as making and reading mountains of XML for no good reason.
If the rate at which the data is written by SQL is the bottleneck, feeding data to SQL more quickly should have no effect.
I am not sure exactly what the data structure is, but perhaps compressing the XML data on the web server(s) before writing may have a positive effect.
If the bottleneck is the database, then more web services will not help you a lot.
The problem may be that the problem is not only the size of the data, but the number of concurrent request to the same table. The number of writes will be the big problem. If your XML write is in a transaction with other queries you may try to break out the XML write from that transaction to reduce locking time of the XML table.
As stated by vdeych you may try compression to reduce the data size. (That would increase the load on the web servers.)
You may also try caching the data. Only read from the SQL server if the data is not already in the cache. Make sure you don't update the SQL server if your data has not changed.
No one seems to have suggest this, what about replacing your XML serialization of your wizard with JsonSerialization.
Not only should this give you a minor boost in performance in the serialization itself since both the DataContractSerializer (faster) and Newtonsoft Json.NET (fastest) out perform the XML serializers in .NET. This should easily reduce the size of your object graph by upwards of 50% or more (depending on number of properties vs large strings in the XML).
This should dramatically lower the IO that is inflicted upon Sql server. This should also limit the amount of scope required to alter your application significantly (assuming it's well designed and works through common calls for serialization/deserialization).
If you choose to go this route also invest time comparing BSON vs JSON as I think it would be likely that the binary encoded one will offer even more space savings (and further IO reduction) due to the size of your object graphs.
I'm not a .NET expert but maybe using a binary serialization would increase throughput. Making sure that the XML isn't stored as text (fairly obvious but thought I'd mention it). Also relational databases are best for storing relational data, so perhaps substituting an ORM layer in place of the serialization (sounds feasible) could speed things up.
Mike is spot on, without understanding the resource constaint leading to the performance issues, no amount of discussion will resolve the problem. I'll add that socket timeouts that affect running statements are a symptom, and are never imposed by SQL Server, they're an artifact of your driver configuration or a firewall or similar device between app and db imposing them (unless you're talking about timeouts for new connections, then you have a host in serious distress under load).
Given your symptom is database timeouts, you need to start there. If they're indicative of long running statements that result in a socket timeout, use SQL Server profiler to capture the workload while simultaneously monitoring system resources. Given it's a mature application and the type of workload you mention, it's unlikely to be statement tuning related, it probably boils down to resource limitations CPU, memory or disk IO capacity
This Technet guide is a very good place to start:
http://technet.microsoft.com/en-us/library/cc966540.aspx
If it's resource contention, then it's a simple discussion about how the resource contention can be tuned, configured for or addressed by adding more of whatever is needed.
Edit: I should add that given a database performance issue, more applications servers is likely to worsen the problem as you increase the amount of concurrency, that might otherwise be kept in check by connection pool, request processing or other limits.