SQL Contest – Win Amazon Gift Cards – Learn How to Get Started with ClustrixDB

It has been a long time since we ran a contest where we can learn something new and win something cool. I reached out to the good folks at Clustrix and asked them to help me build a contest where readers can learn and explore new technology while standing a good chance of winning something good.

Before we jump into the contest, let us quickly understand what Clustrix is all about in four simple points:

  • ClustrixDB can scale reads, writes, updates and analytics near linearly as you add nodes. The scale-out architecture of the cloud means that the new cloud applications use scale-out NoSQL or SQL databases, and most often a combination of both.
  • Real-time analytics is analytics on your live operational database, up to date to the current moment. ClustrixDB allows real-time insights into your business and fast, current reports for your business and self-serve customers.
  • ClustrixDB has powered mission-critical business applications for more than three years, with trillions of transactions per month running on ClustrixDB. Proven in massive transaction volume environments with an appliance form factor, ClustrixDB is now being used by multiple customers across the globe as software and on public clouds such as AWS.
  • Several databases and data management platforms are now on the market. We created a series of landscapes to help our customers determine the right solution to their problem. Whether the database is primary or analytics, SQL or NoSQL, comparing features gives clarity.

If you have skipped directly to this statement, I encourage you to read the four bullet points above. It is indeed interesting to know that there are solutions in the market which can help with our mission-critical problems. Now let us jump to the details of the contest.

Contest Details

Step 1: Download and Install ClustrixDB here using the community license

Click Here to Download

Step 2: Do one thing with ClustrixDB 
ClustrixDB is nearly plug-and-play compatible with MySQL. You can do any of the following tasks:

  • Load Data
  • Create a Table
  • Perform an online schema change
  • Run an Analytics query
  • Or any other task which is a core database task
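
If you are not sure what a minimal entry might look like, here is a rough sketch of the tasks listed above. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs without any server; the table name contest_demo and its columns are made up for illustration. Against ClustrixDB you would connect with a MySQL-compatible client instead, and the exact syntax may differ slightly.

```python
import sqlite3

# Stand-in database: sqlite3 in memory (an assumption; this is not ClustrixDB itself).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Task: create a table (hypothetical name and columns).
cur.execute(
    "CREATE TABLE contest_demo (id INTEGER PRIMARY KEY, city TEXT, amount REAL)"
)

# Task: load data (a few hand-written sample rows).
rows = [(1, "Boston", 10.0), (2, "Boston", 15.5), (3, "Austin", 7.25)]
cur.executemany(
    "INSERT INTO contest_demo (id, city, amount) VALUES (?, ?, ?)", rows
)

# Task: schema change (add a column to the existing table).
cur.execute("ALTER TABLE contest_demo ADD COLUMN note TEXT")
columns = [info[1] for info in cur.execute("PRAGMA table_info(contest_demo)")]

# Task: analytics-style query (one aggregate row per city).
summary = cur.execute(
    "SELECT city, COUNT(*), SUM(amount) FROM contest_demo "
    "GROUP BY city ORDER BY city"
).fetchall()
conn.commit()

print(columns)
print(summary)
```

A screenshot of any one of these steps, run on ClustrixDB rather than on the stand-in, is the kind of thing Step 3 asks for.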

Step 3: Post it on the ClustrixDB forum (http://support.clustrix.com/forums/):

  1. The current database you use
  2. Snapshot (or screenshot) of what you did with ClustrixDB

That’s it! It is that simple. It will give you exposure to how to get going with ClustrixDB, as well as a chance to win an Amazon Gift Card.

Prize for Winner (Total worth USD 200)

  • 10 Amazon Gift Cards of USD 20 each
  • 3 Gift Cards will go to the first 3 posts and the remaining 7 will be raffled among the other posts.
  • Post your entry before midnight EST on March 20th.

If you are an early bird, there is indeed a guaranteed prize for you, as you can read in the second bullet point above.


If you would like some help with any of the above tasks, here are a few articles which you can follow:

Let me know what you think of this contest and ClustrixDB. I have a surprise prize for one of the participants in this contest. I will announce it with the winner list after March 20th.

Reference: Pinal Dave (http://blog.sqlauthority.com)

Big Data – Real-Time Analytics Performance with ClustrixDB

Note: The product used in comparison is ClustrixDB. It is available to download for FREE.

NewSQL databases provide the scale-out of NoSQL without giving up on SQL or ACID transactions. While most NewSQL databases focus only on transactions, ClustrixDB also provides fast real-time analytics, which are becoming increasingly important to many businesses. ClustrixDB does this by bringing Massively Parallel Processing (MPP), used in data warehouses, to the primary database.

So, I decided to get a workload and try it out to see what kind of performance improvements one can get, if any. Since joins and aggregates are the workhorses of real-time analytics processing, they are a good place to start.


I built a simple dataset with three tables: USERS (100K rows), USER_ADDRESSES (200K rows) and BIDS (10M rows). This dataset has 2GB of data (mysqldump). For the platform, I used AWS and got ClustrixDB from AWS Marketplace. For comparison, I decided to use MySQL 5.6, since the exact same data and queries can be run on both databases. For both databases, the instance types are m1.xlarge.
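
To give a feel for this kind of workload, here is a rough sketch that builds a much smaller version of the three tables and times a count and a join with aggregates. It is only an illustration under several assumptions: sqlite3 stands in for the real databases so that it runs anywhere, the column names are invented (the post does not list the actual schema), the row counts are scaled down, and the queries are my guess at the general shape rather than the exact ones used in the test.

```python
import random
import sqlite3
import time

random.seed(42)  # reproducible toy data

# Scaled-down row counts; the test in the post used 100K / 200K / 10M.
N_USERS, N_ADDRESSES, N_BIDS = 100, 200, 10_000

conn = sqlite3.connect(":memory:")  # stand-in engine (assumption)
cur = conn.cursor()
# Hypothetical columns: the original schema is not shown in the post.
cur.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE user_addresses (address_id INTEGER PRIMARY KEY,
                             user_id INTEGER, state TEXT);
CREATE TABLE bids (bid_id INTEGER PRIMARY KEY,
                   user_id INTEGER, amount REAL);
""")

states = ["CA", "NY", "TX", "WA"]
cur.executemany("INSERT INTO users VALUES (?, ?)",
                [(i, f"user{i}") for i in range(1, N_USERS + 1)])
cur.executemany("INSERT INTO user_addresses VALUES (?, ?, ?)",
                [(i, (i - 1) % N_USERS + 1, random.choice(states))
                 for i in range(1, N_ADDRESSES + 1)])
cur.executemany("INSERT INTO bids VALUES (?, ?, ?)",
                [(i, random.randint(1, N_USERS),
                  round(random.uniform(1, 500), 2))
                 for i in range(1, N_BIDS + 1)])
conn.commit()


def timed(sql):
    """Run a query and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    result = cur.execute(sql).fetchall()
    return result, time.perf_counter() - start


# A simple count, and a join with aggregates (top bidders by total amount).
count_rows, count_secs = timed("SELECT COUNT(*) FROM bids")
join_rows, join_secs = timed("""
    SELECT u.user_id, COUNT(b.bid_id) AS n_bids, SUM(b.amount) AS total
    FROM users u JOIN bids b ON b.user_id = u.user_id
    GROUP BY u.user_id
    ORDER BY total DESC
    LIMIT 5
""")

print(count_rows[0][0], f"{count_secs:.4f}s")
print(join_rows, f"{join_secs:.4f}s")
```

Timings from a toy run like this say nothing about MySQL or ClustrixDB; the point is only the shape of the data and of the queries.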

MySQL does not scale beyond a single server and is usually deployed with a master and two read slaves. Since ClustrixDB provides horizontal scale-out within one cluster, rather than master-slave (with multiple copies of data), the equivalent configuration is 3 servers. ClustrixDB horizontal scaling allows all nodes to participate in all query types. For measuring performance, a single MySQL server is enough because the performance for one query will be the same whether we use the master or a read slave.

For ClustrixDB, I also tried out 6 servers to see if analytics get faster as you add servers.

[Image: ClustrixDB table 1]

Here is the resulting table:

[Image: ClustrixDB results table]


We see that some queries get significantly faster; however, one query showed no performance improvement. The count query on users only counts 100K rows, so there is likely not enough work to distribute. The count query on the bids table (counting 10M rows) shows a speedup with 3 nodes, but with 6 nodes we don’t get as much additional improvement. This is still a very simple query. The queries with aggregates and joins get significantly faster (23x and 8.79x) on 3 nodes. These queries also get nearly twice as fast as you go from 3-node ClustrixDB to 6-node ClustrixDB, because of MPP in ClustrixDB.
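
For anyone repeating this kind of test, the speedup figures above are simply ratios of elapsed times. Here is a minimal sketch of the arithmetic. The timings in it are made up for illustration and are not the measurements from this test, and the helper names are my own.

```python
def speedup(baseline_secs, candidate_secs):
    """How many times faster the candidate is than the baseline."""
    return baseline_secs / candidate_secs


def scaling_efficiency(secs_small, secs_large, nodes_small, nodes_large):
    """Speedup from adding nodes, divided by the growth in node count.

    A value of 1.0 means perfectly linear scaling.
    """
    return speedup(secs_small, secs_large) / (nodes_large / nodes_small)


# Hypothetical timings in seconds (NOT the numbers measured in the post).
single_server_secs = 12.0
three_node_secs = 1.5
six_node_secs = 0.8

vs_single = speedup(single_server_secs, three_node_secs)
efficiency = scaling_efficiency(three_node_secs, six_node_secs, 3, 6)

print(f"3 nodes vs single server: {vs_single:.2f}x")
print(f"3 -> 6 node scaling efficiency: {efficiency:.2f}")
```

An efficiency close to 1.0 when doubling the node count is what "nearly twice as fast" corresponds to.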

Overall, we see that for more complex analytical queries ClustrixDB gains a significant advantage. This means reports will get much faster with ClustrixDB. For some other queries, there is not enough work, or being distributed does not offer that much advantage, and here the performance is about the same. For real-time analytics requirements, ClustrixDB seems like a good solution.


Reference: Pinal Dave (http://blog.sqlauthority.com)