I've researched what I can about SQLite and UnQLite, but a few things still haven't been answered. UnQLite appears to have been released within the past few years, which would account for the lack of benchmarks. "Performance" comparisons (read/write speed, querying, average database size before significant slowdown, etc.) may be somewhat apples-to-oranges here.
From all that I have seen, the two have comparatively few differences, the main one being that SQLite is a relational database whereas UnQLite is a key-value and document (via Jx9) store. They're both portable, cross-platform, and 32/64-bit friendly, and both allow a single writer with multiple readers. Very little can be found on UnQLite benchmarks, while SQLite has quite a few, with different implementations across various (scripting) languages. SQLite's performance varies across in-memory databases, indexed data, and read/write modes with varying data size. Overall, SQLite appears quick and reliable.
Everything I can find on UnQLite is unreliable and confusing. I cannot seem to find anything helpful. What read/write speeds does UnQLite seem to peak at? What languages are (not) recommended when using UnQLite? What are some known disadvantages and bugs?
If it helps at all to explain my interest, I'm developing a network utility that will be reading and processing packets with hot-swapping between network interfaces. Since the connections can, though it's unlikely, reach speeds up to 1 Gbps, there will be a lot of raw data being written out to a database. It's still in the early stages of development and I'm having to find a way to balance performance. There are a lot of factors, such as missed packets, how large each write is, how quickly it can process and move data, how much organization will be required, how many tables will be needed, whether I can implement multiprocessing, how reliant each database is on HDD speeds, etc. My data will need tables, but whether or not I have to store them as relational is still up in the air. Seeing how the two stack up with their own pros and cons (aside from the usual KVP vs. relational debate) may push me towards either one or, if I'm crazy enough, a mix of both.
I've done a bit of fooling around with UnQLite using Python bindings I wrote. The Python bindings use Cython and are quite fast.
What I've found from my experimentation is that UnQLite's key/value APIs are pretty damn fast, comparable to other DBMs. Things slow down a bit when you start using Jx9 and the document store, though.
It basically depends on what you need:
If you want SQL and ad-hoc querying, I'd suggest using SQLite. It is plenty fast and quite flexible.
If you want just keys and values, I'd use something like LevelDB or RocksDB.
If you want a lightweight JSON document store, or key/value with a bit "extra", then UnQLite may be a good fit.
I need my program to run with big natural numbers and zero. The program itself is not important to this question, or at least I think it is not. I looked up which primitive data type would suit my aim best and I found the unsigned long.
According to the website, unsigned longs are supported from Java 8 onwards. However, it does not say how to declare a variable as an unsigned long.
By googling, I find pages complaining about the lack of unsigned data types compared to C++ (which is where I know the principle of unsigned primitive types from).
So my question is: how do I declare an unsigned long type in Java?
The aim of the big number is to make the implementation slower. The reason is to compare two methods that do the same job. It is a university assignment, so I am not interested in how much sense this makes.
If unsigned types do not work in Java, or only very inconveniently, which primitive data type allows the highest positive whole numbers? Is long or double better suited?
Don't think you can.
You could however try this
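A minimal sketch of one option, assuming Java 8 or later: there is still no unsigned primitive to declare, but a plain long can hold an unsigned 64-bit bit pattern, and static helpers on Long (parseUnsignedLong, toUnsignedString, compareUnsigned, divideUnsigned) interpret it as unsigned. The class and method names below are made up for illustration. The sketch also shows why double is a poor fit for large whole numbers (it stops representing every integer above 2^53) and that BigInteger is available when 64 bits are not enough.

```java
import java.math.BigInteger;

public class UnsignedLongSketch {
    // Largest unsigned 64-bit value, 2^64 - 1, stored in an ordinary (signed) long.
    static final long MAX_UNSIGNED = Long.parseUnsignedLong("18446744073709551615");

    // Round-trips a long through double; values above 2^53 may not survive.
    static long throughDouble(long value) {
        return (long) (double) value;
    }

    public static void main(String[] args) {
        // Printed as signed, the bit pattern of 2^64 - 1 reads as -1.
        System.out.println(MAX_UNSIGNED);
        // The unsigned helpers reinterpret the same 64 bits.
        System.out.println(Long.toUnsignedString(MAX_UNSIGNED));
        System.out.println(Long.compareUnsigned(MAX_UNSIGNED, 1L) > 0); // true
        System.out.println(Long.divideUnsigned(MAX_UNSIGNED, 2L));      // 9223372036854775807

        // double cannot represent 2^53 + 1 exactly, so the round trip changes the value.
        long odd = (1L << 53) + 1;
        System.out.println(odd == throughDouble(odd)); // false

        // BigInteger has no fixed upper bound (this is 2^64, one past MAX_UNSIGNED).
        System.out.println(BigInteger.ONE.shiftLeft(64));
    }
}
```

Addition, subtraction and multiplication need no special helpers, because two's-complement arithmetic produces the same bits for signed and unsigned operands; only parsing, printing, comparison, division and remainder differ.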
This question arises in a context where Isabelle is used for formal software development rather than for pure mathematical theorization (and from a standalone developer's perspective).
At best, it seems that SML programs generated from an Isabelle theory use SML's IntInf.int rather than the native integer type, Int.int, even if Code_Target_Int, Code_Binary_Nat or Code_Target_Nat is used. Reading the sources of these theories seems to confirm that this is all they can do. Native platform integers may be required for several reasons, including efficiency and the case where the imperative SML program is to be optionally translated into a subset of an imperative language (e.g. C or Ada), which is relevant when the theory relies on the Imperative_HOL theory. The codegen.pdf document that comes with the Isabelle distribution did not help, except in suggesting the first of the options below.
Options may be:
1. Not using Isabelle's int and nat, and instead re‑creating a new numeric type from scratch, then using the code_printing command (with its type_constructor and constant) to give it the native platform representation and operations (this implies including range limitations in the theory in some way): this must be tedious, although hopefully not error‑prone, thanks to the formal environment. Note that this does not seem feasible with Isabelle's own int and nat… it makes code generation fail, and nothing tells which constants are missing in the code_printing command.
2. If the SML program is to be compiled directly (e.g. with MLton), tweaking the SML environment with a replacement IntInf structure: this may be unsafe or not feasible, and still requires embedding the range limitations in the theory, so the previous option may in the end be better than this one.
3. Editing the generated program to change IntInf into Int: easy, but is it safe? (At least IntInf implements the same signature as Int does, so maybe it's safe.) As above, this requires specifying bounds in the theory in some way, which is OK.
4. Diving into Isabelle internals: surely unreasonable, even worse than the second option.
There exists a Word theory, but according to what I have read, it seems not to be suited for that purpose.
Are there other known options not listed here? Are there comments on the listed options?
If there is no ready‑made solution (I feel there is none at this time), what hints or leads would be best? (e.g. links to documents, mentions of concepts).
Update
Points #2 and #3 of the list may be OK (if they really are) only if there is a single integer type. If the program uses more than one, they are not applicable.
Directly generating native words from Isabelle int would be unsound, because your formalisation would not take overflow into account where it exists in reality.
It looks like the AFP entry Native_Word does what you want, though:
http://afp.sourceforge.net/entries/Native_Word.shtml
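The soundness concern can be illustrated outside Isabelle. The following is only a sketch in Java (chosen for a runnable example, not because it is part of the Isabelle toolchain), with made-up class and method names: a property that holds for mathematical integers, x + 1 > x, silently fails once the same arithmetic is carried out on fixed-width words. A formalisation proved over unbounded int says nothing about the wrapped result unless the bounds are part of the theory.

```java
import java.math.BigInteger;

public class OverflowSketch {
    // Over unbounded integers, x + 1 > x holds for every x.
    static boolean successorIsLarger(BigInteger x) {
        return x.add(BigInteger.ONE).compareTo(x) > 0;
    }

    // Over 32-bit words, Java's + wraps around silently, so the property can fail.
    static boolean successorIsLargerNative(int x) {
        return x + 1 > x;
    }

    // Math.addExact makes the overflow visible instead of wrapping,
    // similar in spirit to SML's Int raising Overflow.
    static boolean incrementFits(int x) {
        try {
            Math.addExact(x, 1);
            return true;
        } catch (ArithmeticException overflow) {
            return false;
        }
    }

    public static void main(String[] args) {
        BigInteger max = BigInteger.valueOf(Integer.MAX_VALUE);
        System.out.println(successorIsLarger(max));                     // true
        System.out.println(successorIsLargerNative(Integer.MAX_VALUE)); // false
        System.out.println(incrementFits(Integer.MAX_VALUE));           // false
    }
}
```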
As with most new technologies, after a while a standard emerges.
Is there anything cooking for NoSQL?
The whole point of NoSQL is that there are no standard solutions. Every data storage problem is different, and you need to choose the data storage technology that is appropriate for your specific problem and not the one that is "the standard".
That's the whole premise of "Not Only SQL".
Take ACID (here's a piece of advice you never thought you'd get on StackOverflow, or really anywhere after 1987 :-) ), for example. There is a wide array of problems which don't need ACID guarantees. For those problems, ACID is overkill. Overkill that translates into wasted I/O, wasted CPU cycles, wasted performance. Which means wasted heat and wasted energy, which in turn means wasted money on electrical and utility bills.
Some problems only need weaker forms of those guarantees. For example, for a wide array of web applications the so-called eventual consistency is plenty. Other problems need stronger guarantees than what SQL-style ACID provides.
So, some NoSQL databases don't have ACID guarantees or only have them in a weaker form. Some can turn them on and off on a per-DB basis. Some can turn A, C, I and D on and off individually on a per-DB basis. Some can not only turn A, C, I and D on and off individually, they can fine-tune them on a sliding scale. Some can even do that on a per-query basis.
If you have hierarchical data, store it in a hierarchical database. If you have graph data, store it in a graph database. If you have key-value data, store it in a key-value database. If you have semi-structured document data, store it in a document database. If you have semantic RDF data, store it in a triple database. If you build a data warehouse, store it in a column database. And if you have relational data, then, by all means store it in a relational database. (But only if you actually have relational data!)
There is no single standard NoSQL solution, as Jörg explained (+1). The term NoSQL covers a wide array of database types, each tailored for a specific data domain.
Ayende's That No SQL Thing series takes a look at some of the mainstream NoSQL solutions and highlights the strengths and weaknesses of each type. He discusses the following:
Key/value stores
Column-family stores
Document databases
Graph databases
You can think of these different types as standards within NoSQL. Just remember that each of them is specialized for certain data storage problems. There's no "one size fits all" solution: all of them will continue to exist.
A query language for JSON, semi-structured and document databases called UnQL is being developed:
http://www.unqlspec.org/display/UnQL/Home
Some people have contemplated standards for document databases: http://nosql.mypopescu.com/post/731261002/a-common-nosql-query-language
However, key-value stores and document databases don't do joins, which means that their query languages are simple and easy to learn. There is less need for a common language like SQL.
That said, .NET developers can use LINQ to access the document databases MongoDB and RavenDB, and some people are developing a LINQ provider for the document database CouchDB: http://github.com/sinesignal/ottoman . LINQ isn't a NoSQL standard but a standard for everything that is related to data. You can use it to talk to a relational database or an XML file too.
Graph databases are very different from key-value stores and document databases. I don't think you can unite them in one standard. I really don't know if it is possible to develop a LINQ provider for a graph database. I guess not, but I'm not sure.
Some NoSQL products support SQL or a superset of it. This is the case for OrientDB, a document-graph NoSQL DBMS with SQL support. It's released under the Apache 2 license.
Furthermore, it can export documents in JSON format (you can export/import the entire database in JSON). Other NoSQL products read/write JSON.
bye,
Lvc#
(Speaking specifically about the subset of NoSQL known as document databases.)
Many document databases do not expose a "query language". Instead, they often provide query APIs, and these APIs are specific to the implementation and controlled by the individual sponsors/owners of the implementations (10gen for MongoDB, for example).
In the XML database space (a subset of document databases), there is the W3C standard XQuery. It is a query and functional programming language designed for querying collections of XML data (says Wikipedia).
It is not yet clear whether there is any need/desire for a standard query API (or language) for JSON data. JSONPath (analogous to XPath) has been proposed, but it has received little attention other than its use by Kynetx.
One potentially interesting one is AppScale which provides a unified API for HBase, Hypertable, MySQL Cluster, Cassandra, Voldemort, MongoDB, MemcacheDB and Redis. The API is defined by Google for the Google App Engine and is available for Java, Python and Go.
Is there a Java BigInt equivalent for Standard ML? The normal int type throws an exception when it overflows.
Yes, see the IntInf structure.
The official SML'97 standard basis library introduces a zoo of structures like Int, IntInf, Int32, Int64, LargeInt etc.
To actually use them in practice to make things work as expected, and make them work efficiently, you need to look closely at the SML implementation at hand.
One family of implementations imitates the memory layout of C and Java, so Int32 will really be a 32-bit machine word (but with overflow checking), and Int64 a 64-bit machine word. SML/NJ is a notable example of that, and its small int arithmetic is fast, but its big int arithmetic is slow.
Another family of implementations comes from the background of symbolic computation (LISP or Computer Algebra), where Poly/ML is a notable example. Here you have Int = IntInf = LargeInt by default, and the implementation first uses (part of) the native machine word as an approximation, until it overflows and then switches to really big integers that are allocated on the heap (as boxed values). Poly/ML uses the GNU MP library for that big part.
Thus Int/IntInf is very efficient as long as your application is about integers, not machine words of a specific size: Int32 in the symbolic model won't fit into a single word on 32bit hardware due to the extra tag bits that are required. So some algorithms that are actually about word arithmetic will degrade, for example SHA1 on 32bit hardware.
On the other hand, the implicit upgrade of shorter-than-wordsize int to heap-allocated big int gives you something better than BigInt in Java, because you won't need the full object overhead for small values: 42 will be just some bit pattern in a register (with additional tag bit), but not a heavy box on the heap.
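The "start in a machine word, switch to big integers on overflow" idea can be imitated by hand in Java. This is only an analogy with a hypothetical helper, not how Poly/ML is implemented: Poly/ML does the switch implicitly with tagged words, whereas here the fallback is explicit and the small result still ends up boxed as a Long when returned as a Number.

```java
import java.math.BigInteger;

public class HybridFactorial {
    // Computes n! in a primitive long while the result fits,
    // and continues with BigInteger from the first overflow onwards.
    static Number factorial(int n) {
        long small = 1;
        for (int i = 2; i <= n; i++) {
            try {
                // Overflow-checked multiply: throws instead of wrapping around.
                small = Math.multiplyExact(small, (long) i);
            } catch (ArithmeticException overflow) {
                // 'small' still holds (i - 1)!; finish the product on the heap.
                BigInteger big = BigInteger.valueOf(small);
                for (int j = i; j <= n; j++) {
                    big = big.multiply(BigInteger.valueOf(j));
                }
                return big;
            }
        }
        return small; // boxed as Long
    }

    public static void main(String[] args) {
        System.out.println(factorial(20)); // 2432902008176640000, still a long
        System.out.println(factorial(21)); // 51090942171709440000, needs BigInteger
    }
}
```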
The BigInt-equivalent is called LargeInt. See these lecture notes to see some functions on how to convert between int (aka Int) and LargeInt.
While this isn't exactly what you were asking, you don't actually want an equivalent to the Java BigInteger class. Before Java 8, BigInteger implemented multiplication in O(n^2) time (essentially multiplying the way it's taught in elementary school), whereas asymptotically faster algorithms approaching O(n log n) are possible; since Java 8 it switches to Karatsuba and Toom-Cook multiplication for large operands. This is really important, as a lot of trivial big-integer programming simply doesn't work with the n^2 version.
Well, int puts a nasty limit on stuff like calculating permutations. SML needs a large numeric datatype that's more natural to use.