Automatic Load Balancing

In today's data centers, it is not unusual to find large clusters of hundreds to thousands of Linux computers. Scaling the hardware is relatively simple, but scaling software across this hardware infrastructure is very challenging. One of the critical factors limiting the scalability of parallel processing systems is load balancing: the ability to ensure that all nodes perform an equal amount of work. If the load is unbalanced, a few nodes end up performing most of the work while the rest sit idle. This is especially true for large parallel databases, where the load distribution for a particular SQL query depends on the profile of the data. The profiles of intermediate data at the stages within a query execution plan are largely unknown to the database software. The database engine therefore makes rough guesses and attempts only primitive load balancing across nodes. More often than not, these attempts fail, resulting in a highly unbalanced load distribution and poor performance.
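To make the effect of the data profile concrete, the short sketch below routes rows to nodes by key and measures how uneven the result is. It is an illustration only: the function names, the modulo routing rule, and the sample data are assumptions chosen for clarity, not a description of any particular database engine.

```python
def partition_loads(keys, num_nodes):
    """Count rows per node when each row is routed by key modulo num_nodes."""
    loads = [0] * num_nodes
    for key in keys:
        loads[key % num_nodes] += 1
    return loads


def imbalance(loads):
    """Busiest node's load divided by the mean load (1.0 means perfectly balanced)."""
    return max(loads) / (sum(loads) / len(loads))


# Hypothetical data: 1000 rows with evenly spread keys.
uniform_keys = list(range(1000))
# Hypothetical data: 1000 rows where a single key accounts for 900 of them.
skewed_keys = [7] * 900 + list(range(100))

uniform_loads = partition_loads(uniform_keys, 4)
skewed_loads = partition_loads(skewed_keys, 4)

print(uniform_loads, imbalance(uniform_loads))
print(skewed_loads, imbalance(skewed_loads))
```

With the same routing rule and the same row count, the skewed data set leaves one node with far more rows than the others, which is why a fixed distribution scheme cannot guarantee balance without knowing the data.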

XtremeData's dbX software implements automatic load balancing by collecting detailed statistics in real time on all data being processed and using these statistics to distribute the workload dynamically. This is a unique strength of our technology: it enables dbX to scale without side effects, providing predictable and consistent performance at all scales.
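The general idea of statistics-driven distribution can be sketched in a few lines. The example below is not dbX's algorithm, which is not described here. It is a minimal illustration that assumes a one-time frequency count per key and a simple greedy rule (largest key group first, placed on the currently least-loaded node). All names and data are hypothetical.

```python
import heapq
from collections import Counter


def collect_stats(keys):
    """Gather per-key row counts (a stand-in for statistics collected at run time)."""
    return Counter(keys)


def assign_by_stats(stats, num_nodes):
    """Greedily place each key group, largest first, on the least-loaded node."""
    heap = [(0, node) for node in range(num_nodes)]  # (current load, node id)
    heapq.heapify(heap)
    assignment = {}
    for key, count in stats.most_common():
        load, node = heapq.heappop(heap)
        assignment[key] = node
        heapq.heappush(heap, (load + count, node))
    return assignment


def node_loads(keys, assignment, num_nodes):
    """Count rows per node under a given key-to-node assignment."""
    loads = [0] * num_nodes
    for key in keys:
        loads[assignment[key]] += 1
    return loads


# Hypothetical data: four heavy keys that all collide under modulo-4 routing.
keys = [0] * 200 + [4] * 200 + [8] * 200 + [12] * 200
keys += [1] * 50 + [2] * 50 + [3] * 50 + [5] * 50

# Baseline: static routing by key modulo the node count.
static_loads = [0] * 4
for key in keys:
    static_loads[key % 4] += 1

# Statistics-driven routing using the observed key frequencies.
balanced_loads = node_loads(keys, assign_by_stats(collect_stats(keys), 4), 4)

print(static_loads)
print(balanced_loads)
```

In this sketch the static rule sends all four heavy keys to the same node, while the assignment informed by the observed counts spreads them out. A production system would also need to update its statistics as data flows through each stage of the query plan, which this one-shot example does not attempt.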