Is there a stand-alone database for Adobe AIR that supports large amounts of data?

I have considered SQLite, but from what I've read, it is very unstable at sizes bigger than 2 GB. I need a database that in theory can grow up to 10 GB.
It would be best if it was stand-alone since it is easier to implement for non-techie users, instead of having the extra step of installing something like MySQL which most likely will require assistance.
Any recommendations?

SQLite should handle your file sizes just fine. The only caveat worth mentioning is that SQLite is not suitable for highly concurrent environments, since the entire database file is locked exclusively while a write is in progress.
So if you are writing an application that needs to handle several users concurrently, a better choice would be PostgreSQL.

I believe SQLite will actually work fine for you with large databases, especially if you index them appropriately. Considering SQLite's popularity it seems unlikely that it would have fundamental bugs.
I would suggest that you revisit the decision to rule out SQLite, and you might try to compensate for the selection bias of negative reports. That is, people tend to publicize bug reports, not non-bug reports, and if SQLite were the most popular embedded database then you might expect to see more negative experiences than with less popular packages even if it were superior.
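
One way to check the size concern directly rather than rely on reports is to ask SQLite for its own limits. The following is a minimal sketch using Python's standard sqlite3 module; the items table and idx_items_name index are made-up names for illustration, and the numbers you get depend on the SQLite build in use.

```python
import sqlite3


def size_limits(conn):
    """Return the current and maximum database size in bytes, as reported by SQLite."""
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    max_pages = conn.execute("PRAGMA max_page_count").fetchone()[0]
    return {
        "current_bytes": page_size * page_count,
        "max_bytes": page_size * max_pages,
    }


# An in-memory database keeps the sketch self-contained; use a file path in practice.
conn = sqlite3.connect(":memory:")

# Hypothetical table and index, only to show what "index appropriately" looks like.
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, created_at TEXT)")
conn.execute("CREATE INDEX idx_items_name ON items (name)")

limits = size_limits(conn)

# EXPLAIN QUERY PLAN shows whether a lookup would use the index or scan the table.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM items WHERE name = ?", ("example",)
).fetchall()

print(limits)
for row in plan:
    print(row[-1])
```

If max_bytes comes out well above 10 GB and the query plan mentions the index, the remaining question is how your own workload performs, which only a test with realistic data can answer.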


How to profile the performance of AWS DynamoDB calls across many Java apps

For AWS DynamoDB there is a lot of literature on scan vs. query, using a proper GSI vs. LSI, etc., but nothing on how to actually analyze or profile (other than by manually reviewing each one) whether your current query usage patterns are 'ideal' or 'less than ideal'.
Are there any practices, dashboards, or logging I may be missing that would show which queries are less than optimal across a wide range of Java apps, with enough information to tackle the problem (i.e. more than just 'it's slow, go figure it out')? :-)
Thanks!

How well does UnQLite perform? How does it compare to SQLite (in performance)?

I've researched what I can about SQLite and UnQLite, but there are still a few things that haven't quite been answered yet. UnQLite appears to have been released within the past few years, which would explain the lack of benchmarks. "Performance" (read/write speed, querying, avg. database size before significant slowdown, etc.) comparisons may be somewhat apples-to-oranges here.
From all that I have seen the two have very few differences comparatively speaking, namely that SQLite is a relational database whereas UnQLite is a key-value pair and document (via Jx9) database. They're both portable, cross-platform, and 32/64-bit friendly, and can have single-write and multi-read connections. Very little can be found on UnQLite benchmarks while SQLite has quite a few with different implementations across various (scripting) languages. SQLite has some varied performance across in-memory databases, indexed data, and read/write modes with varying data size. Overall SQLite appears quick and reliable.
All that I can find on UnQLite is unreliable and confusing. I cannot seem to find anything helpful. What read/write speeds does UnQLite seem to peak at? What languages are (not) recommended when using UnQLite? What are some known disadvantages and bugs?
If it helps at all to explain my intrigue, I'm developing a network utility that will be reading and processing packets with hot-swapping between network interfaces. Since the connections can, though unlikely, reach speeds up to 1 Gbps, there will be a lot of raw data being written out to a database. It's still in the early stages of development and I'm having to find a way to balance out performance. There are a lot of factors such as missed packets, how large each write is, how quickly it can process and move data, how much organization will be required, how many tables will be needed, whether I can implement multiprocessing, how reliant each database is on HDD speeds, etc. My data will need tables, but whether or not I have to store them as relational is still up in the air. Seeing how the two stack up with their own pros and cons (aside from the usual KVP vs. relational debate) may push me towards either one or, if I'm crazy enough, a mix of both.
I've done a bit of fooling around with UnQLite using Python bindings I wrote. The Python bindings use Cython and are quite fast.
What I've found from my experimentation is that UnQLite's key/value APIs are pretty damn fast, comparable to other DBMs. Things slow down a bit when you start using Jx9 and the document store, though.
It basically depends on what you need:
If you want SQL and ad-hoc querying, I'd suggest using SQLite. It is plenty fast and quite flexible.
If you want just keys and values, I'd use something like LevelDB or RocksDB.
If you want a lightweight JSON document store, or key/value with a bit "extra", then UnQLite may be a good fit.
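
To illustrate what "SQL and ad-hoc querying" buys you over a plain key/value store, here is a minimal sketch using Python's standard sqlite3 module. The packets table, its columns, and the sample rows are invented for illustration and are not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per captured packet.
conn.execute("CREATE TABLE packets (ts REAL, iface TEXT, length INTEGER)")

sample = [
    (0.001, "eth0", 60),
    (0.002, "eth0", 1500),
    (0.003, "wlan0", 400),
]
with conn:  # one transaction for the whole batch
    conn.executemany("INSERT INTO packets VALUES (?, ?, ?)", sample)

# An ad-hoc question asked after the fact: packets and bytes per interface.
per_iface = conn.execute(
    "SELECT iface, COUNT(*), SUM(length) FROM packets GROUP BY iface ORDER BY iface"
).fetchall()

print(per_iface)
```

With a pure key/value store the same summary would have to be computed in application code by iterating over the stored values.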

Does it make sense to make multiple SQLite databases to improve performance?

I'm just learning SQL/SQLite, and plan to use SQLite 3 for a new website I'm building. It's replacing XML, so concurrency isn't a big concern. But I would like to make it as performant as possible with the technology I'm using. Are there any benefits to using multiple databases for performance, or is the best performance keeping all the data for the site in one file? I ask because 99% of the data will be read-only 99% of the time, but that last 1% will be written to 99% of the time. I know databases don't read in and re-write the whole file for every little change, but I guess I'm wondering if the writes will be much faster if the data is going to a separate 5KB database, rather than part of the ~ 250MB main database.
With proper performance tuning, SQLite can do around 63 300 inserts per second. Unless you're planning on some really heavy volume, I would avoid optimizing prematurely. Splitting into two databases doesn't feel right to me, and if you plan on doing joins in the future they become more awkward, since you would have to ATTACH one database file to the other. Especially since you say concurrency isn't a big problem, I would avoid complicating the database design.
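
Much of that tuning usually comes down to batching many inserts into a single transaction instead of committing each row separately. A minimal sketch with Python's standard sqlite3 module follows; the hits table and the bulk_insert helper are hypothetical names, and no particular throughput is implied.

```python
import sqlite3


def bulk_insert(conn, rows):
    """Insert all rows inside one transaction, so there is a single commit."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany("INSERT INTO hits (page, visits) VALUES (?, ?)", rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (page TEXT, visits INTEGER)")

# A generator is fine here: executemany consumes any iterable of parameter tuples.
bulk_insert(conn, ((f"/page/{i}", i) for i in range(10000)))

count = conn.execute("SELECT COUNT(*) FROM hits").fetchone()[0]
print(count)
```

Timing this against a loop that commits after every row on your own hardware is the quickest way to see whether any further optimization is needed.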
Actually, with 50 000 databases you will have very bad performance.
You should try several tables in a single database; sometimes that really can speed things up. But the description of the initial task is very general, so it is hard to say exactly what you need. Try a single table and multiple tables, and measure the speed.
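
If you do want to measure the separate-file variant from the question, SQLite's ATTACH DATABASE statement lets one connection query two files together, so the read-mostly and write-heavy data can still be joined. A minimal sketch with Python's standard sqlite3 module; the file names, the pages and counters tables, and the hot schema alias are all hypothetical.

```python
import os
import sqlite3
import tempfile

tmp = tempfile.mkdtemp()
main_path = os.path.join(tmp, "main.db")  # large, read-mostly data
hot_path = os.path.join(tmp, "hot.db")    # small, frequently written data

conn = sqlite3.connect(main_path)
# Attach the second file under the schema name "hot".
conn.execute("ATTACH DATABASE ? AS hot", (hot_path,))

conn.execute("CREATE TABLE main.pages (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE hot.counters (page_id INTEGER PRIMARY KEY, views INTEGER)")

with conn:
    conn.execute("INSERT INTO main.pages VALUES (1, 'Home')")
    conn.execute("INSERT INTO hot.counters VALUES (1, 42)")

# A join that spans both database files.
rows = conn.execute(
    "SELECT p.title, c.views "
    "FROM main.pages AS p JOIN hot.counters AS c ON c.page_id = p.id"
).fetchall()

print(rows)
```

Whether the split is actually faster than two tables in one file is exactly what the measurement should decide.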

GUI programs for creating, designing and administering SQLite databases

Which programs do you know of for this purpose? Some quick googling reveals two programs:
sqlite-manager (Firefox extension)
Not all features are available through the GUI, but it is really easy to use and open source.
SQLite Administrator: the screenshots look pretty, but it is for Windows only.
Please tell me your opinion about the program, not just a link. Thanks.
Well, I'm using Navicat Premium. It is not free, but it is a very nice tool when working with multiple database systems, including SQLite. It has many nice features, such as working with multiple databases from one window, and importing, exporting and synchronizing data and schemas across different databases.
There is also Navicat for SQLite only, which costs less I think.
I found this table; maybe this information will help someone.
And this question duplicates this one. Just in time, heh.
You can try SQLiteSpy. I found it very useful. Its GUI makes it very easy to explore, analyze, and manipulate SQLite3 databases.

How to Convince Programming Team to Let Go of Old Ways?

This is more of a business-oriented programming question that I can't seem to figure out how to resolve. I work with a team of programmers who have been working with BASIC for over 20 years. I was brought in to help write the same software in .NET, only with updates and modern practices. The problem is that I can't seem to get any of the other 3 team members (all BASIC programmers, though one does .NET now as well) to understand how to correctly design a relational database. Here's the thing they won't understand:
We basically have a transaction that keeps track of a customer's tag information. We need to be able to track current transactions and past transactions. In the old system, a flat-file database was used that had one table that contained records with the basic current transaction of the customer, and another table that contained all the previous transactions of the customer along with important money information. To prevent redundancy, they would overwrite the current transaction with the history transactions (the history file was updated first, then the current one). It's totally unnecessary since you only need one transaction table, but neither my supervisor nor either of my other two co-workers can seem to understand this. How exactly can I convince them to see the light so that we won't have to do ridiculous amounts of work and end up hitting the database too many times? Thanks for the input!
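
To make the single-table idea concrete, here is a minimal sketch using Python's standard sqlite3 module (chosen only to keep the example self-contained; the real system is .NET). The tag_transactions table, its columns, and the helper functions are assumptions for illustration, not the actual schema. The "current" transaction is simply the newest row for a customer, so nothing has to be copied or overwritten.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: every transaction, past and present, lives in one table.
conn.execute(
    """
    CREATE TABLE tag_transactions (
        id          INTEGER PRIMARY KEY AUTOINCREMENT,
        customer_id INTEGER NOT NULL,
        tag         TEXT    NOT NULL,
        amount      REAL    NOT NULL
    )
    """
)
# The index keeps "latest row for this customer" cheap as the table grows.
conn.execute("CREATE INDEX idx_tx_customer ON tag_transactions (customer_id, id)")

with conn:
    conn.executemany(
        "INSERT INTO tag_transactions (customer_id, tag, amount) VALUES (?, ?, ?)",
        [(1, "ABC123", 25.0), (1, "ABC123", 27.5), (2, "XYZ789", 30.0)],
    )


def current_transaction(conn, customer_id):
    """The current transaction is the most recently inserted row for the customer."""
    return conn.execute(
        "SELECT tag, amount FROM tag_transactions "
        "WHERE customer_id = ? ORDER BY id DESC LIMIT 1",
        (customer_id,),
    ).fetchone()


def history(conn, customer_id):
    """All transactions for the customer, oldest first."""
    return conn.execute(
        "SELECT tag, amount FROM tag_transactions WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    ).fetchall()


print(current_transaction(conn, 1))
print(history(conn, 1))
```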
Firstly I must admit it's not absolutely clear to me from your description what the data structures and logic flows in the existing structures actually are. This does imply to me that perhaps you are not making yourself clear to your co-workers either, so one of your priorities must be to be able to explain, either verbally or preferably in writing and diagrams, the current situation and the proposed replacement. Please take this as an observation rather than any criticism of your question.
Secondly I do find it quite remarkable that programmers of 20 years' experience do not understand relational databases and transactions. Flat file coding went out of the mainstream a very long time ago - I first handled relational databases in a commercial setting back in 1988 and they were pretty commonplace by the mid-90s. What sector and product type are you working on? It sounds possible to me that you might be dealing with some sort of embedded or otherwise 'unusual' system, in which case you do need to make sure that you don't have some sort of communication issue and you're overlooking a large elephant that hasn't been pointed out to you - you wouldn't be the first 'consultant' brought into a team who has been set up in some manner by not being fed the appropriate information. That said such archaic shops do still exist - one of my current client's systems interfaces to a flat-file based system coded in COBOL, and yes, it is hell to manage ;-)
Finally, if you are completely sure of your ground and you are faced with a team who won't take on board your recommendations - and demonstration code is a good idea if you can spare the time - then you'll probably have to accept the decision gracefully and move on. Myself in this position I would attempt to abstract out the issue - can the database updates be moved into stored procedures, for example, so the code to update both tables is in the SP and can be modified at a later date to move to your schema without a corresponding application change? Make sure your arguments are well documented and recorded so you can revisit them later should the opportunity arise.
You will not be the first coder who's had to implement a sub-optimal solution because of office politics - use it as a learning experience for your own personal development about handling such situations and console yourself with the thought you'll get paid for the additional work. Often the deciding factor in such arguments is not the logic, but the 'weight of reputation' you yourself bring to the table - it sounds like having been brought in you don't have much of that sort of leverage with your team, so you may have to work on gaining a reputation by excelling at implementing what they do agree to do before you have sufficient reputation in subsequent cases - you need to be modded up first!
Sometimes you can't.
If you read some XP books, they often say that one of your biggest hurdles will be convincing your team to abandon what they have always done.
Generally they will recommend letting people who can't adapt go to other projects (Or just letting them go).
Code reviews might help in your case. Mandatory code reviews of every line of code are not unheard of.
Sometimes the best argument is an example. I'd write a prototype (or a replacement if not too much work). With an example to examine it will be easier to see the pros and cons of a relational database.
As an aside, flat-file databases have their places since they are so much easier to "administer" than a true relational database. Keep an open mind. ;-)
I think you may have to lead by example - when people see that the "new" way is less work they will adopt it (as long as you don't rub their noses in it).
I would also ask yourself whether the old design is actually causing a problem or whether it is just aesthetically annoying. It's important to pick your battles - if the old design isn't causing a performance problem or making the system hard to maintain you may want to leave the old design alone.
Finally, if you do leave the old design in place, try and abstract the interface between your new code and the old database so if you do persuade your co-workers to improve the design later you can drop the new schema in without having to change anything else.
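
As a rough sketch of what such an abstraction could look like (in Python with the standard sqlite3 module, purely for illustration; TagStore, LegacyTwoTableStore, renew_tag and the table names are all hypothetical): the application code depends only on a small interface, and the two-table layout is hidden behind it.

```python
import sqlite3
from typing import Optional, Protocol, Tuple


class TagStore(Protocol):
    """The only surface the application code is allowed to depend on."""

    def record(self, customer_id: int, tag: str, amount: float) -> None: ...

    def current(self, customer_id: int) -> Optional[Tuple[str, float]]: ...


class LegacyTwoTableStore:
    """Hides the existing design: a history table plus an overwritten current table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute("CREATE TABLE tx_history (customer_id INTEGER, tag TEXT, amount REAL)")
        conn.execute(
            "CREATE TABLE tx_current (customer_id INTEGER PRIMARY KEY, tag TEXT, amount REAL)"
        )

    def record(self, customer_id: int, tag: str, amount: float) -> None:
        # Both writes happen in one transaction: history first, then current.
        with self.conn:
            self.conn.execute(
                "INSERT INTO tx_history VALUES (?, ?, ?)", (customer_id, tag, amount)
            )
            self.conn.execute(
                "INSERT OR REPLACE INTO tx_current VALUES (?, ?, ?)",
                (customer_id, tag, amount),
            )

    def current(self, customer_id: int) -> Optional[Tuple[str, float]]:
        return self.conn.execute(
            "SELECT tag, amount FROM tx_current WHERE customer_id = ?", (customer_id,)
        ).fetchone()


def renew_tag(store: TagStore, customer_id: int, tag: str, amount: float):
    """Application logic: knows nothing about how many tables exist."""
    store.record(customer_id, tag, amount)
    return store.current(customer_id)


store = LegacyTwoTableStore(sqlite3.connect(":memory:"))
result = renew_tag(store, 1, "ABC123", 27.5)
print(result)
```

A later single-table implementation would only need to provide the same two methods; renew_tag and the rest of the application would not change.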
It is difficult to extract a whole lot except general frustration from the original question.
Yes, there are a lot of techniques and habits long-timers pick up over time that can be useless and even costly in light of technology changes. Some things that made sense when processing power, memory, and even disk was expensive can be foolish attempts at optimization now. It is also very much the case that people accumulate bad habits and bad programming patterns over time.
You have to be careful though.
Sometimes there are good reasons for the things those old timers do. Sadly, they may not even be able to verbalize the "why" - if they even know why anymore.
I see a lot of this sort of frustration when newbies come into an enterprise software development shop. It can be bad even when the environment is all fairly modern technology and tools. If most of your experience is in writing small-community desktop and Web applications a lot of what you "know" may be wrong.
Often there are requirements for transaction journaling at a level above what your DBMS may do. Quite often it can be necessary to go beyond DB transaction semantics in order to ensure time-sequence correctness, once-and-only-once updating, resiliency, and non-repudiation.
And this doesn't even begin to address the issues involved in enterprise or inter-enterprise scalability. When you begin to approach half a million complex transactions a day you will find that RDBMS technology fails you. Because relational databases are not designed to handle high transaction volumes you must often break with standard paradigms for normalization and updating. Conventional RDBMS locking techniques can destroy scalability no matter how much hardware you throw at the problem.
It is easy to dismiss all of it as stodginess or general wrong-headedness - even incompetence. But be careful because this isn't always the case.
And by the way: There are other models besides the RDBMS, and the alternative to an RDBMS is not necessarily "flat files" - contrary to the experience of most coders today. There are transactional hierarchical DBMSs that can handle much higher throughput than an RDBMS. IMS is still very much alive in large IBM shops, for example. Other vendors offer similar software for different platforms.
Of course in a 4-man shop maybe none of this applies.
Sign them up for some decent training and then it's up to you to convince them that with new technologies a lot more is possible (or at least easier!).
But I think the most important thing here is that professional, certified trainers teach them the basics first. They will be more impressed by that instead of just one of their colleagues telling them: "hey, why not use this?"
Related post here.
The following may not apply in your situation, but you make very little mention of technical details, so I thought I'd mention it...
Sometimes the access patterns are very different for current data than for historical data. I'm making this example up, but say that current data is accessed 1000s of times per second, touches a small subset of columns, and fits in less than 1 GB, whereas historical data uses 1000s of GBs, is accessed only 100s of times per day, and access is to all columns.
Then what your co-workers are doing would make perfect sense as a performance optimization. By separating the current data (albeit redundantly) you can optimize the indices and data structures in that table for the higher-frequency access patterns in a way that you could not in the historical table.
Not everything that is "academically", or "technically" correct from a purely relational perspective makes sense when applied in an actual practical situation.
