Cassandra incorporates a number of architectural best practices that affect performance. None are unique to Cassandra, but Cassandra is the only NoSQL system that incorporates all of them.
Fully distributed: Every Cassandra machine handles a proportionate share of every activity in the system. There are no special cases like the HDFS namenode, MongoDB mongos, or the MySQL Fabric process that require special treatment. And with every node the same, Cassandra is far simpler to install and operate, which has long-term implications for troubleshooting. Even when everything works perfectly, master/slave designs have a bottleneck at the master. Cassandra leverage’s its masterless design to deliver lower latency as well as uninterrupted uptime.
Accelerate your career with Cassandra Training and become expertise Cassandra Developer.
Log-structured storage engine: A log-structured engine that avoids overwrites to turn updates into sequential i/o is essential both on hard disks (HDD) and solid-state disks (SSD). On HDD, because the seek penalty is so high; on SSD, to avoid write amplification and disk failure. This is why you see mongodb performance go through the floor as the dataset size exceeds RAM. Couchbase’s append-only b-trees avoids overwrites, but requires several seeks when updating or inserting new documents and does not support durable writes without a large performance penalty.
Locally-managed storage: HBase has an integrated, log-structured storage engine, but relies on HDFS for replication instead of managing storage locally. This means HBase is architecturally incapable of supporting Cassandra-style optimizations like putting the commitlog on a separate disk, mixing SSD and HDD in a single cluster with appropriate data pinned to each, orincrementally pulling compacted sstables into the page cache.
Learn more about Cassandra Interview Questions in this blog post.
Prepared statements: Five years ago, NoSQL systems were characterized by only allowing primary key lookups, and there was no query planning to speak of. Today, Cassandra and most other systems2 support indexes and increasingly complex queries. The Cassandra Query Language allows Cassandra to pre-parse and re-use query plans, reducing overhead. Others remain stuck with primitive JSON APIs or even raw Java Scanner objects. CQL also allows Cassandra to express more sophisticated operations like lightweight transactions with a minimal impact on clients, resulting in wide support across many languages. The closest alternative is Apache Phoenix, a Java-only SQL layer for HBase.
For an Indepth knowledge on Cassandra, click on below