Why does MariaDB 10.2 use InnoDB again instead of Percona XtraDB?

The MariaDB website says that they used Percona XtraDB up to 10.1 and that from 10.2 on they are going back to plain InnoDB (https://mariadb.com/kb/en/mariadb/xtradb-and-innodb/).
This does not seem reasonable to me, because XtraDB appears to be the improved version of InnoDB (https://www.percona.com/software/mysql-database/percona-server/feature-comparison). So is this a typo, are there legal issues, or is the new version of InnoDB simply better than XtraDB?
There is even a question about this on the MariaDB site, but it has gone unanswered for weeks.
Sorry, I could not include all the related links because of Stack Overflow rules.

Keeping InnoDB (or XtraDB) up to date with MySQL (Percona) is a complex task. It took us more than half a year to migrate from InnoDB-5.6 to InnoDB-5.7 in 10.2. Doing it again for XtraDB would probably have taken only slightly less time. For us to embark on such a project, it must bring significant benefits to our users.
XtraDB had many great improvements over InnoDB in 5.1 and 5.5. But over time, MySQL has implemented almost all of them. InnoDB has caught up and XtraDB is only marginally better. Not enough to justify a multi-month merge that would delay 10.2-GA for everyone.
In particular, the only real improvement that XtraDB 5.7 seems to have is for a write-intensive I/O-bound workload, where innodb_thread_concurrency control is disabled.
With a proper innodb_thread_concurrency, XtraDB is only marginally better. We didn't want to delay 10.2-GA by up to half a year for the sake of those few users who have a write-intensive I/O-bound InnoDB workload and don't know how to configure innodb_thread_concurrency.
Note that we are still considering incorporating XtraDB optimizations, but as individual patches rather than XtraDB as a whole, since it no longer has numerous improvements spread all over the code.
https://mariadb.com/kb/en/library/why-does-mariadb-102-use-innodb-instead-of-xtradb/
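For readers wondering what "configuring innodb_thread_concurrency" looks like in practice: it is a server variable that can be set in the [mysqld] section of the option file. Below is a minimal Python sketch that renders such a fragment. The value 16 is purely an assumption for illustration (the answer above gives no number), and the right value depends on your hardware and workload.

```python
import configparser
import io

# Hypothetical starting value; the answer above does not recommend a number.
# 0 is the server default and means "no concurrency limit".
ASSUMED_THREAD_CONCURRENCY = 16


def render_mysqld_fragment(thread_concurrency: int) -> str:
    """Return a [mysqld] option-file fragment setting innodb_thread_concurrency."""
    if thread_concurrency < 0:
        raise ValueError("innodb_thread_concurrency cannot be negative")
    config = configparser.ConfigParser()
    config["mysqld"] = {"innodb_thread_concurrency": str(thread_concurrency)}
    buffer = io.StringIO()
    # Write "key=value" without spaces around the delimiter.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_mysqld_fragment(ASSUMED_THREAD_CONCURRENCY))
```

Treat the number as something to benchmark against your own write-heavy workload, not as a recommendation.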

As far as I have seen, they did it for better compatibility with MySQL. During my training at M17 they did not say anything about it. I discovered this during the last 10 minutes of the social hour, while I was providing feedback.
I'm sure it's because it's not GA yet.

Related

MariaDB ColumnStore Questions

I have some questions regarding MariaDB ColumnStore:
1. Is it free, and can it be used in a production system?
2. Is it an extension of MariaDB (with MariaDB as a prerequisite), or can it be installed on its own?
3. Can it be installed on a single machine in production, and does it give better performance as a column store?
4. Does it support all MariaDB functionality? That is, is a direct migration from MariaDB to MariaDB ColumnStore possible?
5. Does it also support procedures/functions? I ask because I have used vectorwise actian columnar DB, and it does not support them.
MariaDB ColumnStore is a GPLv2 storage engine that enables columnar storage of data. Currently ColumnStore is distributed as a separate package that includes all the MariaDB functionality it was built with.
The latest ColumnStore has the same features as MariaDB 10.2 plus the COLUMNSTORE storage engine. There is also a small set of additional functions that ColumnStore implements (e.g. some extra window functions).
For analytical queries, ColumnStore tables are almost always faster but they are not suited for OLTP workloads. You can have InnoDB and ColumnStore tables in the same database and even do cross-engine joins.
Stored procedures are supported in the same manner as they are in a normal MariaDB installation. The ColumnStore documentation has a list of features that it supports in addition to the base MariaDB functionality.
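To make the mixed-engine point concrete, here is a rough Python sketch of an InnoDB table and a ColumnStore table in one database, joined in a single query. The table and column names are hypothetical, `run_report` accepts any DB-API 2.0 cursor from a MariaDB driver of your choice, and a real ColumnStore server may need extra configuration before cross-engine joins work, so treat this as an illustration only.

```python
# Hypothetical schema: a small transactional table plus a large fact table.
CREATE_CUSTOMERS = """
CREATE TABLE customers (
    id INT PRIMARY KEY,
    name VARCHAR(100)
) ENGINE=InnoDB
"""

CREATE_SALES = """
CREATE TABLE sales (
    customer_id INT,
    amount DECIMAL(10, 2)
) ENGINE=ColumnStore
"""

# Analytical query that joins the ColumnStore table to the InnoDB table.
CROSS_ENGINE_REPORT = """
SELECT c.name, SUM(s.amount) AS total
FROM sales AS s
JOIN customers AS c ON c.id = s.customer_id
GROUP BY c.name
"""


def run_report(cursor):
    """Create both tables and run the join on a DB-API 2.0 cursor."""
    cursor.execute(CREATE_CUSTOMERS)
    cursor.execute(CREATE_SALES)
    cursor.execute(CROSS_ENGINE_REPORT)
    return cursor.fetchall()
```

The split follows the advice above: keep the OLTP-style table on InnoDB and put only the analytical fact table on ColumnStore.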
(Caveat: My knowledge of ColumnStore is limited, so these answers are suspect.
The Question is 22 days old, so I feel the Question deserves some Answer.)
1. Yes? The manual makes it clear what parts of MariaDB require money to change hands.
2. ColumnStore used to be available as InfiniDB. But after MariaDB took it on, it became integrated with MariaDB.
3. (Unclear question) ColumnStore has a niche market. It is not practical to attempt answering how its performance compares to non-column-store engines. Perhaps with a discussion of the application we can discuss this further.
4. All(?) MariaDB functionality is available.
5. I don't know anything about "vectorwise actian columnar DB".

Why am I getting InnoDB tables in information_schema in MariaDB 10.1?

MariaDB 10.1 uses XtraDB as the default engine, but I am still getting InnoDB tables in information_schema. Why is that?
Since XtraDB is a "drop-in replacement" for InnoDB, they probably say "innodb" in order to avoid confusing scripts, code, etc. that use those names.
(Caveat: I can't say the above as a "fact", I have watched the evolution of XtraDB and MariaDB over the years, and feel that it is a safe guess.)
Some history...
Several years ago, Percona modified InnoDB (then 'owned' by MySQL AB or Sun, I forget the exact timing) to create XtraDB. XtraDB had some desperately needed fixes for performance. Since then, Oracle acquired MySQL (including InnoDB) and made numerous changes to it, especially in 5.6 and 5.7. Some of those changes were to incorporate (or replicate) the improvements that made XtraDB so good. Meanwhile, Percona continued development of XtraDB. Today, both produce good code, which is sometimes incorporated into the other.
Meanwhile, MariaDB was branching off, and making other improvements in MySQL overall. At some point (10.x?), they chose to use Percona's XtraDB instead of Oracle's InnoDB.
To the casual observer, InnoDB and XtraDB feel, act, and smell the same. But if you dig hard enough, you can produce a test case that works better in one than in the other. Apples versus Oranges.
Bottom line: Not a problem.
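If you want to see for yourself which names the server exposes, something like the following Python sketch lists the INNODB-prefixed tables in information_schema and the comment each engine reports via SHOW ENGINES. It assumes a DB-API 2.0 cursor from whichever MariaDB/MySQL driver you use; the helper names are made up for this example.

```python
LIST_INNODB_METADATA_TABLES = """
SELECT TABLE_NAME
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'information_schema'
  AND TABLE_NAME LIKE 'INNODB%'
ORDER BY TABLE_NAME
"""


def innodb_metadata_tables(cursor):
    """Return the names of the INNODB* tables in information_schema."""
    cursor.execute(LIST_INNODB_METADATA_TABLES)
    return [row[0] for row in cursor.fetchall()]


def engine_comments(cursor):
    """Map each storage engine name to the comment reported by SHOW ENGINES."""
    cursor.execute("SHOW ENGINES")
    # SHOW ENGINES rows start with: Engine, Support, Comment.
    return {row[0]: row[2] for row in cursor.fetchall()}
```

The engine comment is where you would expect the server to say which implementation sits behind the "InnoDB" name.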

ASP.NET NHibernate CPU performance after upgrade

Has anyone else had CPU spikes after switching over to NHibernate?
We switched to NHibernate about 2 years ago. Since then we've had issues with the server's CPU usage running at around 60-80 percent. We also had issues with the server running out of memory.
We've consistently been told to optimize our query, which we did with only limited success. It wasn't until I recently upgraded from NHibernate 2.1 to 3.2 that we finally saw an improvement in CPU usage. It dropped from a 60 percent average to about 30 percent. I was amazed, since many who consider themselves experts had told me that upgrading NHibernate would produce only limited improvements, if any at all.
My question is: has anyone else noticed CPU spikes with NHibernate, and have they seen any improvement after doing a major version upgrade? And lastly, why exactly is the new version performing so much better? I know NHibernate 3 has much better support for LINQ, and about 70 percent of my queries use LINQ, so my guess is that this may be part of the reason I'm seeing better performance.
Also, does anyone have any ideas on how I can optimize NHibernate for even better CPU performance, other than upgrading the DLLs, which I have already done?
I'm currently running NHibernate 3.2 and fluent NHibernate 1.2 upgraded from 2.1 and 1.0 respectively.
I suspect you have already been told what I am about to recommend, but I urge you to look at all the possibilities and rule them out one by one.
"We've consistently been told to optimize our query" - Suspicion always falls on either the SQL generated by the ORM or the time the DB takes to execute the query. This is sound advice, and you need to rule it out using the following methods.
First I would set up a trace on the live database server that runs for a week. Once this is done you may find that you get suggestions on indexes or SQL related issues.
Secondly, I would fire up NHProf on my development box and run some stress tests against heavily used pages, or pages that make a lot of database trips, to see what is going on behind the scenes with NHibernate. NHProf will give you advice about various problems, including select N+1, unbounded result sets, large numbers of rows returned, queries with too many joins, etc. This tool is invaluable for bridging the gap between SQL server and your code.
Hopefully, after this exercise you will have ideas on how to fix certain issues or where to introduce caching. If you find you have nothing to address, you will at least have valuable feedback that you can then post to the NHusers group.
After all, if you think about it, tens of thousands of users use NHibernate. I have used NHibernate myself for several years and subscribe to the NHusers group, and I have not seen the CPU spike issue before. It always turns out to be one of: the SQL generated, a database under pressure, or large recordsets being hydrated.
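The select N+1 pattern that NHProf flags is easy to reproduce outside of NHibernate. Here is a minimal sketch in Python, with the standard sqlite3 module standing in for the ORM and the real database; the authors/posts tables are made up for illustration. Both functions return the same data, but the first issues one query per parent row while the second uses a single join.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical authors and posts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
        INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
    """)
    return conn


def titles_n_plus_one(conn):
    """One query for the authors, then one extra query per author."""
    queries = 1
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    result = {}
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_single_join(conn):
    """The same data fetched with a single LEFT JOIN."""
    rows = conn.execute("""
        SELECT a.name, p.title
        FROM authors AS a
        LEFT JOIN posts AS p ON p.author_id = a.id
        ORDER BY a.id, p.id
    """).fetchall()
    result = {}
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:  # authors without posts yield a NULL title
            result[name].append(title)
    return result, 1
```

With three authors the first version runs four queries and the second runs one; with thousands of parent rows per page, that difference is the kind of thing that shows up as load on both the database and the web server.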

Cassandra and asp.net (C#)

I am interested in building a portal on Cassandra, since I faced some performance and scale issues starting from 1 million records.
They could definitely be solved, but I am interested in other options.
My main issue is the cost of updating all the necessary indexes to keep reads fast.
First, is Cassandra a good fit for ASP.NET programmers? Or are there other projects worth taking a look at?
And second, can you point me to any documentation or samples on how to get started with Cassandra programming in C#?
"since I faced performance and scale issues starting from 1 million of records."
Maybe your design was not that good; NoSQL is not a magic bullet for bad design. I have multi-billion-row tables, and 95% of responses are sub-second. Also, what do you mean by updating indexes? Do you mean updating statistics or rebuilding indexes?
"since I faced performance and scale issues starting from 1 million of records."
You know, for modern databases the one million mark is only where a table stops being "totally ridiculously small", that is, so small that you can get away without actually knowing what you are doing. Below one million is "tiny". I have an 800 million row table and run a LOT of SQL against it, with no problem at all.
"First, is cassandra is good way for asp.net programmers?"
I would rather suggest a basic book about SQL, reading the documentation, and POSSIBLY throwing some hardware at the problem. As in: totally bad hardware will kill any data management system.
If you are using Cassandra for your .NET application, take a look at Aquiles. I developed it based on my company's needs. If you find it useful or need any help, let me know.
You can't really speak of Cassandra documentation. There's a myriad of partial tutorials on the web.
You may want to set up Linux in a virtual machine, because the Windows build process is quite challenging, to say the least. (http://www.virtualbox.org, http://www.ubuntu.com)
Here's the howto:
http://www.ridgway.co.za/archive/2009/11/06/net-developers-guide-to-getting-started-with-cassandra.aspx
Note that the Cassandra SVN URL and the code sample have changed since this tutorial was written.
Here's another C# client:
http://github.com/mattvv/hectorsharp
And here is some sample code:
http://www.copypastecode.com/26752/
Note that you need to download the latest Java Development Kit (JDK) from Sun for Linux.
It's not in the repositories of Ubuntu 10.04.
Then you need to type
export JAVA_HOME="/path/to/jdk"
in order for Cassandra to find your Java installation.
You might also want to take a look at:
http://en.wikipedia.org/wiki/NoSQL
The taxonomy section is especially interesting.
Make sure Cassandra is the right type of NoSQL solution for your problem, e.g. use Neo4J if your problem actually is a graph problem.
Also, you need to make sure your NoSQL solution is ACID-compliant.
For example, Neo4J is the only ACID-compliant NoSQL graph engine.
Edit: Here are some jump-start guides for Windows, without compiling:
http://coderjournal.com/2010/03/cassandra-jump-start-for-the-windows-developer/
http://www.ronaldwidha.net/2010/06/23/running-cassandra-on-windows-first-attempt/
http://www.yafla.com/dforbes/Getting_Started_with_Apache_Cassandra_a_NoSQL_frontrunner_on_Windows/
Instead of Cassandra you might take a look at RavenDB. Supposedly it is a document store made with and created for .NET. It has LINQ integration and is (again, supposedly) very fast.
As with any new technology, check whether it helps with your specific case and whether it is proven technology (do they have mainstream clients using it?).
Before you go down this route, see if you can optimize your current solution first. Check whether your queries are fast, whether the indexes are set up correctly, and whether you can remove load by adding caching.
Last but not least, if adding some processors to your SQL machine might fix the issues, that is typically a much cheaper solution.
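As a rough illustration of "checking whether the indexes are set up correctly", the Python sketch below asks the database for its query plan before and after adding an index. SQLite (via the standard sqlite3 module) stands in for whatever SQL server you actually run, the `records` table is hypothetical, and other databases expose the plan through their own tools, so only the idea carries over.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, owner TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO records (owner, payload) VALUES (?, ?)",
    [(f"user{i % 100}", "x") for i in range(1000)],
)

LOOKUP = "SELECT payload FROM records WHERE owner = ?"

# Without an index on owner, the lookup has to scan the whole table.
plan_before = query_plan(conn, LOOKUP, ("user7",))

conn.execute("CREATE INDEX idx_records_owner ON records (owner)")

# With the index in place, the plan should mention idx_records_owner.
plan_after = query_plan(conn, LOOKUP, ("user7",))

if __name__ == "__main__":
    print("before:", plan_before)
    print("after: ", plan_after)
```

If the plan for a hot query still shows a full scan after you believe it is indexed, that is usually a cheaper thing to fix than a move to a different data store.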
If you want to do something new, then instead of going for NoSQL, you might want to consider trying a database cluster.
The idea is that when two machines each search half of the original database at the same time, the search time is roughly halved without totally redesigning your existing database.
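The "each machine searches half" idea can be sketched in a few lines of Python. Here worker threads stand in for the cluster nodes and plain lists stand in for the partitions, so this shows only the scatter and gather shape of the approach, not a real speedup; all names are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def search_partition(partition, predicate):
    """Scan one partition and return the matching rows."""
    return [row for row in partition if predicate(row)]


def clustered_search(partitions, predicate):
    """Search every partition concurrently and merge results in partition order."""
    if not partitions:
        return []
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        chunks = pool.map(
            lambda part: search_partition(part, predicate), partitions
        )
        return [row for chunk in chunks for row in chunk]


if __name__ == "__main__":
    rows = list(range(10))
    halves = [rows[:5], rows[5:]]  # two "machines", half the data each
    print(clustered_search(halves, lambda r: r % 3 == 0))
```

In a real cluster the merge step and the network round trips add overhead, which is why the gain is "roughly" rather than exactly half.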

Does hyperthreading lead to unstable systems?

I'm building a PC with the new Intel i7 quad-core processor. With hyperthreading turned on, it will report 8 cores in Task Manager.
Some of my colleagues are saying that hyperthreading will make the system unreliable and suggest turning it off.
Can any of you good people enlighten me and the rest of the Stack Overflow users?
Follow-up: I've been using hyperthreading constantly, and it's been spot on. No instability whatsoever. I'm using:
Microsoft Server 2008 64 bit
Microsoft SQL Server 2008 64 bit
Microsoft Visual Studio 2008
Diskeeper Server
Lots of controls (Telerik, Dundas, Rebex, Resharper)
Stability isn't likely to be affected, since the abstraction is very low level and the OS just sees it as another CPU to provide work to. However, performance is another matter.
In all honesty I can't say if this is still the case, but at least when the HT-enabled CPUs first came out, there were known problems with at least some applications. For example, MySQL, and multi-threaded apps like the Java application I support for my day job were known to have decreased performance when HT was enabled. We always recommended it be removed, at least for our particular use case of a server-side enterprise application.
It's possible that this is no longer an issue, and in a desktop environment this is less likely to be a problem for most use cases. The ability to split work on the CPU generally would lead to more responsive applications when the CPU is heavily utilized. However, the context switching and overhead could be a detriment when the app is already heavily threaded and CPU-intensive, such as in the case of a database server.
Off the top of my head I can think of a few reasons your colleagues might say this.
There were several articles about SQL performance suffering under hyperthreading. I believe it winds up doing too much context switching or cache thrashing; I can't remember exactly.
Early on, going from single-proc to multi-proc (or, more likely for most people, hyperthreaded procs) brought many threading issues into the open: race conditions, deadlocks, etc. that people had never seen before. Even though it's a code problem, some people blamed the procs.
Are they making the same claims about multi-core/multi-proc or just about hyperthreaded?
As for me, I've been developing on a hyperthreaded box for 4 years now, only problem has been a UI deadlock issue of my own making.
Hyperthreading will mainly make a difference in scheduler behaviour and performance when threads are dispatched to the same CPU as opposed to different CPUs.
It will show up in a badly coded application that does not handle race conditions between threads.
So it is usually bad design or code that suddenly hits a failure mode.
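For anyone unsure what such a latent race looks like, here is a minimal Python sketch: several threads increment a shared counter, once without and once with a lock. The unlocked version is a read-modify-write that may lose updates depending on the interpreter and on timing (it can also happen to work, which is exactly why such bugs stay hidden); the locked version always ends at the expected total. The function name and parameters are made up for this example.

```python
import threading


def run_counter(use_lock, workers=4, increments=20_000):
    """Increment a shared counter from several threads and return the total."""
    counter = 0
    lock = threading.Lock()

    def work():
        nonlocal counter
        for _ in range(increments):
            if use_lock:
                with lock:  # only one thread at a time may update
                    counter += 1
            else:
                counter += 1  # read-modify-write, not guaranteed atomic

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter


if __name__ == "__main__":
    print("with lock:   ", run_counter(use_lock=True))
    print("without lock:", run_counter(use_lock=False))
```

Code like the unlocked branch can run for years on a single CPU and only start failing once threads truly run side by side, which is when the hardware gets blamed.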
Unreliable? I doubt it. The only disadvantage of hyperthreading that I can think of is that if the OS is not aware of it, it might schedule two threads on one physical processor while other physical processors are idle, which will degrade performance.
There was a problem with SQL Server and hyperthreading for some queries, because SQL Server has its own scheduler; setting MAXDOP to 1 would solve that.
To whatever degree Windows is unstable, it's highly unlikely that hyperthreading contributes significantly (or it would have made big news by now.)
I've had a hyperthreading PC for a couple years now. Not that many cores, but it's worked fine for me.
Wish I had test data to prove your colleagues wrong, but it sounds like it's just my opinion versus theirs at this point. ;)
The threads in a hyperthreaded CPU share the same cache, and as such don't suffer from the cache consistency problems that a multiple-CPU architecture can. Though, if the developer of a piece of software is programming with multiple CPUs in mind, they will (or should) be writing with read semantics (IIRC, that's the term), i.e. all writes are flushed from the cache immediately.
As far as I know, the OS doesn't see hyperthreading as any different from having actual multiple cores. From its point of view there is no difference; it's isolated.
So, aside from the fact that hyperthreading's "extra cores" aren't "real" (in the strictly technical sense) and don't have the full performance of "real" CPU cores, I can't see that it'd be any less reliable. Slower, perhaps, in some rare instances, but not less reliable.
Of course, it depends on what you're running - I suppose some applications might get "down & dirty" with the CPU and hyperthreading might confuse them, but that's probably pretty rare.
I myself have been running a PC with hyperthreading for several years now, and I have seen no stability problems.
Sorry I don't have more concrete data!
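Since the "extra cores" show up to software as ordinary processors, a program normally just sees the logical count. A small Python sketch: `os.cpu_count()` returns the number of logical processors the OS reports (8 on a hyperthreaded quad core like the one in the question). The `worker_pool_size` rule, which halves that number for CPU-bound work, is purely a hypothetical heuristic that assumes two hardware threads per core; it is not something the standard library can verify.

```python
import os


def logical_cpu_count():
    """Number of logical processors the OS reports, or 1 if unknown."""
    return os.cpu_count() or 1


def worker_pool_size(logical_cpus, assume_two_threads_per_core=True):
    """Hypothetical sizing rule: one CPU-bound worker per physical core."""
    if assume_two_threads_per_core:
        return max(1, logical_cpus // 2)
    return max(1, logical_cpus)


if __name__ == "__main__":
    cpus = logical_cpu_count()
    print("logical CPUs reported by the OS:", cpus)
    print("workers if HT is assumed on:", worker_pool_size(cpus))
```

Whether the halved or the full number performs better is workload-dependent, so measure both before settling on one.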
I own an i7 system, and I haven't had any issues.
If it works with multiple cores, it works with hyperthreading.
The short answer: yes.
The long answer, as with almost every question, is "it depends". Depends on the OS, the software, the CPU revision, etc. I have personally had to disable hyperthreading on two occasions to get software working properly (one, with the Synergy application, and two, with the Windows NT 4.0 installer), but your mileage may vary.
As long as Windows is installed with HT enabled, so that it detects the multiple HT cores from the beginning (it loads some relevant drivers and such), you can always disable (and re-enable) HT after the fact. If you have bizarre stability issues with specific software that you can't resolve, it's not hard to disable HT to see if it has any impact.
I wouldn't disable it to start with because, frankly, it will probably work fine in 99.99% of your daily use. But be aware that yes, it can occasionally cause bizarre behaviors, so don't rule it out if you happen to be troubleshooting something very odd down the road.
Personally, I've found that hyperthreading, while not causing any problems, doesn't actually help all that much either. It might be like having an extra .1 of a processor. On my HT machine at work, I only very seldom see my CPU go above 50%. I don't know if HT has gotten any better with newer processors like the i7, but I'm not optimistic.
Other than hearing a few reports about SQL Server, all I can report is positive. I get about 25% better performance on heavy multi-threaded apps with HT on. Have never run into a problem with it, and I'm using a first generation HT processor...
Late to the party, but for future reference:
I'm currently having an issue with this on SQL Server. Basically, my understanding is that hyperthreads on the same processor share the same L1 and L2 cache, which can cause issues between the two. Citrix also appears to have this problem, from what I'm reading.
Slava Oks wrote a good blog post on it.
I'm here very late, but I found this page via Google. I may have discovered a very subtle problem. I have an i7 950 running 2003 Server and it's great. Initially I left hyperthreading on in the BIOS, but during some testing and pushing things hard, I ran a program called "crashme" by Carrette. This program tries to crash an OS by spawning a process and feeding it garbage to try and run. My dual Opteron setup ran it forever without a problem, but the 950 crashed within the hour. It didn't crash for anything else unless I did something stupid, so it was very surprising. On a whim I turned off HT and ran the program again. It ran all night, even with multiple instances of it. One anecdote doesn't mean much, but try it and see what happens. Also, it seems that the processor runs slightly cooler at any given load if HT is turned off. YMMV.
