Like many competitors, XtremeData began with an open-source database software package. Unlike others, we then re-engineered the core query execution code into a truly parallel, vectorized SQL engine developed from first principles. The reason is simple. Legacy database software, including all open-source packages, was developed decades ago and is not optimized for today's key computing resources: many-core CPUs, large amounts of memory, and high-speed networks.
Unlike “federated” systems, where multiple complete instances of a database run in parallel, XtremeData offers a single database instance that contains a truly parallel SQL execution engine. The core software layer manages all peer-to-peer communication and data exchange between nodes. It has been designed to excel at what other databases find difficult or impossible: running complex SQL against complex schemas on big data. XtremeData is data-model agnostic and does not require careful data partitioning or placement to deliver performance. This enables us to excel at complex n-way joins and aggregates across multiple big tables, at scales from 1 TB to hundreds of TB.
At XtremeData, we have benchmarked our engine against federation-based competitors and against “NoSQL” solutions like Hive, and measured performance gains of 10x. What does this mean? Put simply, a federated system would need 10x the hardware resources to match XtremeData's performance.