Difference between Cassandra and HBase

Cassandra vs HBase

| Cassandra | HBase |
| --- | --- |
| Based on Dynamo (Amazon). | Based on Bigtable (Google). |
| Many Cassandra deployments use Cassandra + Storm (which uses ZooKeeper) and/or Cassandra + Hadoop. | Uses the Hadoop infrastructure (ZooKeeper, NameNode, HDFS). |
| Uses a single node type, with every node performing the same functions. | Uses several “moving parts” (ZooKeeper, NameNode, HBase Master, and DataNodes), each performing a different function. |
| Does not support range-based row scans. | Supports range-based row scans. |
| Random partitioning provides for replication of a single row across a WAN. | Facilitates asynchronous replication of an HBase cluster across a WAN. |
| Officially supports ordered partitioning, but production users avoid it because of its limitations. | Supports ordered partitioning. |
| The practical limit on row size is tens of MB when data is stored in columns to support range scans. | Scales horizontally with ease thanks to ordered partitioning, while still supporting row-key range scans. |
| Does not support atomic compare-and-set. | Supports atomic compare-and-set. |
| Supports read load balancing against a single row. | Does not support read load balancing against a single row. |
| Uses Bloom filters for key lookup. | Bloom filters can be used as another form of indexing. |
| Does not support coprocessor-like functionality. | The coprocessor capability supports triggers. |
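The partitioning and range-scan rows above can be illustrated with a toy model. The sketch below is not how either database is implemented; the class names `OrderedStore` and `HashedStore` are hypothetical, and MD5 is only a stand-in for whatever hash a random partitioner uses. It shows why keeping rows sorted by key makes a row-key range scan cheap, while placing rows by hashed token forces a scan to examine every row.

```python
import bisect
import hashlib


def hash_token(key: str) -> int:
    """Map a row key to a pseudo-random token (stand-in for a random partitioner)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class OrderedStore:
    """Toy model of ordered partitioning: rows are kept sorted by row key."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.rows_examined = 0

    def range_scan(self, start, stop):
        # Binary search finds both ends of the range; only matching rows are read.
        lo = bisect.bisect_left(self.keys, start)
        hi = bisect.bisect_left(self.keys, stop)
        self.rows_examined = hi - lo
        return self.keys[lo:hi]


class HashedStore:
    """Toy model of random partitioning: rows are kept sorted by hashed token."""

    def __init__(self, keys):
        self.rows = sorted((hash_token(k), k) for k in keys)
        self.rows_examined = 0

    def range_scan(self, start, stop):
        # Token order says nothing about key order, so every row must be checked.
        self.rows_examined = len(self.rows)
        return sorted(k for _, k in self.rows if start <= k < stop)


keys = [f"user{i:02d}" for i in range(10)]
ordered = OrderedStore(keys)
hashed = HashedStore(keys)

ordered_result = ordered.range_scan("user03", "user06")
hashed_result = hashed.range_scan("user03", "user06")
```

Both stores return the same three keys, but `ordered.rows_examined` is 3 while `hashed.rows_examined` is 10. In this simplified picture, that gap is the cost of a range scan under random partitioning.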

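The Bloom filter row in the comparison refers to a probabilistic set-membership structure. As a minimal sketch of the general technique, and not the on-disk format either system uses, the hypothetical `BloomFilter` class below shows how a store can skip a file that definitely does not contain a key. The bit-array size, hash count, and the use of SHA-256 are assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the example simple; real filters pack bits.
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


stored_keys = ["row-a", "row-b", "row-c"]
bloom = BloomFilter()
for key in stored_keys:
    bloom.add(key)
```

A lookup calls `might_contain` first and reads the underlying data only when the answer is `True`. A key that was added always returns `True`, so the filter never hides existing rows; it only saves reads for keys that are absent.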
