In Software development principles and philosophies I wrote about the principles and philosophies that guide software development. Here I outline some observations about large distributed systems built over the last decade.
Distributed System Trends
Last Updated: Feb 27, 2024
- Systems are becoming more configurable, exposing tradeoffs between performance, cost, and consistency (a short sketch follows this item).
- E.g., Napa can trade off query latency, data freshness, and cost.
- Eventual vs. strong consistency in FoundationDB and Memcache.
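
A minimal sketch of what such a tunable tradeoff can look like from the caller's side: the query carries a staleness and cost budget, and the planner picks the cheapest strategy that satisfies it. `QueryOptions`, `plan_query`, and the strategy names are hypothetical illustrations, not Napa's or FoundationDB's actual API.

```python
from dataclasses import dataclass


@dataclass
class QueryOptions:
    """Hypothetical per-query knobs a caller uses to state its tradeoff."""
    max_staleness_seconds: float  # 0 means "freshest data / strongest consistency"
    max_cost_units: float         # rough budget the caller is willing to spend


def plan_query(options: QueryOptions) -> str:
    """Pick the cheapest execution strategy that satisfies the caller's budget."""
    if options.max_staleness_seconds == 0:
        # Freshest data: read the primary and merge recent deltas. Most expensive.
        return "read-primary-and-merge-deltas"
    if options.max_cost_units < 1.0:
        # Tight budget: serve from a periodically refreshed materialized view.
        return "read-stale-materialized-view"
    # Middle ground: read a replica whose lag stays within the staleness bound.
    return "read-bounded-staleness-replica"


# Example: an interactive dashboard tolerates 5 minutes of staleness to cut cost.
print(plan_query(QueryOptions(max_staleness_seconds=300, max_cost_units=0.5)))
```
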
- The transition from Shared Nothing to Shared Data architectures.
- In the early days, Hadoop and MapReduce brought compute to where the data was; in recent years compute and storage are kept separate, which allows more flexible systems whose compute and storage layers can be scaled and tuned independently (a short sketch follows this item). ✅ The Snowflake Elastic Data Warehouse [Snowflake, 2016]
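
A minimal sketch of the shared-data shape, using hypothetical `SharedStorage` and `ComputeCluster` classes (not Snowflake's actual components): compute clusters are stateless and all read the same storage layer, so either side can be scaled on its own.

```python
class SharedStorage:
    """Stands in for a shared object store holding immutable table files."""
    def __init__(self):
        self.files = {"orders/part-0001": [("order-1", 100), ("order-2", 250)]}

    def read(self, path):
        return self.files[path]


class ComputeCluster:
    """Stateless workers; any cluster can scan any file in shared storage."""
    def __init__(self, name, workers, storage: SharedStorage):
        self.name, self.workers, self.storage = name, workers, storage

    def scan_sum(self, path):
        return sum(amount for _, amount in self.storage.read(path))


storage = SharedStorage()
etl = ComputeCluster("etl", workers=8, storage=storage)  # sized for ingestion jobs
bi = ComputeCluster("bi", workers=2, storage=storage)    # sized for dashboards
# Both clusters see the same data; resizing one never requires moving data.
print(etl.scan_sum("orders/part-0001"), bi.scan_sum("orders/part-0001"))
```
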
- Decoupling
- Decoupling ingestion from serving. If ingestion is tightly coupled to the storage engine, ingestion is bounded by the storage engine's write capacity; when that engine is optimized for reads, ingestion performance degrades. ✅ Napa: Powering Scalable Data Warehousing with Robust Query Performance at Google [Google, 2021]
- The unbundled architecture of FoundationDB, which decouples an in-memory transaction management system, a distributed storage system, and a built-in distributed configuration system. Because storage is decoupled from the transaction system, storage can scale through increased replication without affecting write throughput. A write-optimized replicated log allows multiple replicas to be eventually consistent with the latest state while supporting high throughput; a sequence number can be used to achieve strong consistency (a sketch follows this item). ✅ FoundationDB: A Distributed Unbundled Transactional Key Value Store [Apple, Snowflake, 2021]
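
The sketch below illustrates that log-based decoupling, with hypothetical `ReplicatedLog` and `StorageReplica` classes rather than FoundationDB's or Napa's real interfaces: writers append to a write-optimized log and get back a sequence number, replicas apply the log asynchronously (eventually consistent), and a read that pins a sequence number waits for the replica to catch up to it (strongly consistent).

```python
class ReplicatedLog:
    """Write-optimized, append-only log; ingestion is bounded only by log throughput."""
    def __init__(self):
        self.entries = []

    def append(self, record) -> int:
        self.entries.append(record)
        return len(self.entries)  # sequence number of the committed record


class StorageReplica:
    """Read-optimized replica that catches up to the log asynchronously."""
    def __init__(self, log: ReplicatedLog):
        self.log = log
        self.applied_seq = 0  # how far this replica has applied the log
        self.state = {}

    def apply_pending(self, up_to=None):
        """Apply outstanding log entries; normally driven by a background loop."""
        target = up_to if up_to is not None else len(self.log.entries)
        while self.applied_seq < target:
            key, value = self.log.entries[self.applied_seq]
            self.state[key] = value
            self.applied_seq += 1

    def read(self, key, at_seq=None):
        """Eventually consistent by default; pass at_seq to force a strong read."""
        if at_seq is not None:
            self.apply_pending(up_to=at_seq)  # catch up to the pinned sequence number
        return self.state.get(key)


log = ReplicatedLog()
replica = StorageReplica(log)
seq = log.append(("user:42", "alice"))      # write path touches only the log
print(replica.read("user:42"))              # None: replica has not applied the entry yet
print(replica.read("user:42", at_seq=seq))  # "alice": strong read at the commit sequence
```
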
- OLAP database query latency is becoming increasingly important, driven by engineer and analyst productivity and by user-facing products that require complex aggregations. LinkedIn and Uber use Pinot; Google uses Napa and Procella.
- Organizations are slowly moving back toward relational data models. In the early 2010s, with the release and adoption of several NoSQL storage solutions, many organizations adopted eventually consistent models that provided better reliability and simpler, more cost-effective operations. This came at a cost to product development: product engineers now had to reason about the implications of eventual consistency, which increased complexity. The release and adoption of systems like Spanner, Amazon Aurora, and CockroachDB demonstrated that engineers still prefer the simplicity and flexibility of relational data models.
- A good example is how engineers within Google opted to use Megastore, which provided a simpler interface and stronger consistency guarantees at the cost of performance. The alternative at the time was Bigtable, which had much better performance but a limited interface and was complex to use. As a result, Spanner was created to provide the best of both worlds.