This is an invaluable roadmap for meeting the rapid demand to deliver scalable applications in a startup environment. With a focus on core concepts and best practices rather than on individual languages, platforms, or technologies, Web Scalability for Startup Engineers describes how infrastructure and software architecture work together to support a scalable environment. You’ll learn, step by step, how scalable systems work and how to solve common challenges. Helpful diagrams are included throughout, and real-world examples illustrate the concepts presented. Even if you have limited time and resources, you can successfully develop and deliver robust, scalable web applications with help from this practical guide.
High-level overview of how to design highly scalable systems. Each component is described in detail and broken down into potential use cases, so it's easy to understand how it fits in the architecture: benefits, caveats, challenges, tradeoffs. It feels like a great combination of breadth and depth, and it also includes a vast reference list of books, white papers, talks and links to expand on each topic.
Notes:
#1: Core Concepts
- Avoid full rewrites: they always cost more and take longer than expected, and you end up with similar issues.
- Single-server configuration means running everything off one machine. It is a good option for small applications and can be scaled vertically.
- Vertical scaling gets more expensive over time and has a fixed ceiling.
- Isolation of services means running different components (DNS, cache, data storage, static assets, etc.) on different servers, which can then be vertically scaled independently.
- CDNs bring things like geoDNS for your static assets, so requests are routed to the geographically closest servers.
- SOA is an architecture centred around loosely coupled, autonomous services that solve specific business needs.
- Layered architectures divide functionality into layers, where the lower layers provide an API for the upper ones and don't know about them.
- Hexagonal architecture assumes the business logic is in the center and all interactions with other components, like data stores or users, are equal; everything outside the business logic requires a strict contract.
- Event-driven architecture is a different way of thinking about actions: it is about reacting to events that have already happened, while traditional architectures are about responding to requests.
#2: Principles of Good Software Design
- Simplicity.
- Hide complexity behind abstractions.
- Avoid overengineering.
- Try TDD, use SOLID.
- Promote loose coupling; drawing diagrams helps spot coupling.
- DRY, and don't waste time following inefficient processes or reinventing the wheel.
- Functional partitioning: dividing the system based on functionality, where each subsystem has everything it needs to operate (data store, cache, message queue, queue workers); e.g. a profile web service, a scheduling web service.
- Design for self-healing.
#3: Building the FE Layer
- Your frontend can be a traditional multipage web application, an SPA, or a hybrid.
- Web servers can be stateless (any server can handle requests from any user) or stateful (a given server handles every request of some users).
- Prefer stateless servers, where the session is not kept within the servers but pushed to another layer like a shared session store. This makes for better horizontal scalability, since you can add or remove clones without affecting users' sessions.
- HTTP is stateless, but cookies make it stateful.
- On the first HTTP request the server can send a Set-Cookie:SID=XYZ... response header; after that, every subsequent request needs to contain the cookie Cookie:SID=XYZ...
- Session data can be stored in cookies, which means every request (CSS, images, etc.) contains the full body of session data. When state changes, the server needs to re-send a Set-Cookie:SDATA=ABCDE... header. This increases the request payload, so only use it if the session state is small. The advantage is that you don't have to store any session data on the server.
- Keep session state in a shared data store like Memcached, Redis, or Cassandra: you then only attach the session id to every request, while the full session data is kept in the shared store, which can be distributed and partitioned by session id.
- Or use a load balancer that supports sticky sessions. This is not flexible, as it makes the load balancer know about every user, and scaling is hard since you can't restart or decommission web servers without breaking users' sessions.
- Components of scalable frontends: DNS, CDN, load balancer, FE web server cluster.
- DNS is the first component hit by users; use a third-party hosted service in almost all cases. There are geoDNS services and latency-based DNS services like Amazon Route 53; the latter can be better than geoDNS because they take real-time network congestion and outages into consideration when routing traffic.
- Load balancers help with horizontal scaling since users don't hit web servers directly.
  * They can provide SSL offloading (or SSL termination), where the LB encrypts and decrypts HTTPS connections and your web servers then talk plain HTTP internally.
  * There are hosted providers like Amazon ELB, software-based LBs like Nginx and HAProxy, and hardware-based LBs, which can be more expensive but are easier to scale vertically.
  * Apart from an external LB routing traffic to FE web servers, you can have an internal LB to distribute traffic from the FE to web service instances and gain all the benefits of an LB internally.
  * Some LBs route the TCP connections themselves, allowing the use of protocols other than HTTP.
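To make the stateless-server note above concrete, here is a minimal sketch of two web server clones sharing one session store. Everything in it is an assumption for illustration: `SessionStore` is an in-memory stand-in for something like Redis or Memcached, and `WebServer.handle` is a hypothetical request handler, not an API from the book.

```python
import secrets


class SessionStore:
    """In-memory stand-in for a shared store such as Redis or Memcached."""

    def __init__(self):
        self._data = {}

    def load(self, sid):
        return self._data.get(sid, {})

    def save(self, sid, session):
        self._data[sid] = dict(session)


class WebServer:
    """A stateless clone: it keeps no session data of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, cookies, item=None):
        # Reuse the session id from the cookie, or issue a new one.
        sid = cookies.get("SID") or secrets.token_hex(8)
        session = self.store.load(sid)
        cart = session.get("cart", [])
        if item is not None:
            cart = cart + [item]
        self.store.save(sid, {"cart": cart})
        # Only the session id travels back to the client, never the session body.
        return {"Set-Cookie": {"SID": sid}, "cart": cart, "served_by": self.name}


store = SessionStore()
web1, web2 = WebServer("web1", store), WebServer("web2", store)

first = web1.handle({}, item="book")                     # first request, no cookie yet
second = web2.handle(first["Set-Cookie"], item="pen")    # a different clone serves it
print(second["cart"], second["served_by"])
```

Because the cart lives in the shared store, the second request can land on any clone, which is what lets you add or remove servers freely.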
#4: Web Services
- The monolithic approach: a single MVC web app contains all the web application controllers, mobile services controllers, third-party integration service controllers, and shared business logic.
  * Here, every time you need to integrate a new partner, you need to build that into the same MVC app as a set of controllers and views, since there's no concept of a separate web services layer.
- The API-first approach: all clients talk to the web application using the same API, which is a set of shared web services containing all the business logic.
- Function-centric web services: the concept of calling functions or objects on remote machines without needing to know how they're implemented, as in SOAP, CORBA, XML-RPC, DCOM.
  * SOAP dominates this space. It uses WSDL files to describe the available methods and endpoints and to provide service discovery, and XSD files to describe data structures.
  * Your client code doesn't need to know that it is calling a remote web service; it only needs to know the objects generated from the web service contract (WSDL + XSD files).
  * Dozens of additional specifications were created for SOAP (referred to as ws-*, like ws-security) for features like transactions, multiphase commits, authentication, encryption, etc. This reduced interoperability: integrating between development stacks became difficult because different providers had different levels of support for the ws-* specifications, especially in dynamic languages like PHP, Ruby, Perl, and Python. This led to the alternative, easier-to-implement JSON + REST.
  * You also couldn't cache HTTP calls on the URL alone, since it was the same every time: request params and method names were part of the XML document itself.
  * Some of the ws-* specifications introduced state, preventing you from treating web service machines as stateless clones.
  * Having a strict contract and the ability to discover functions and data types provides a lot of value.
- Resource-centric web services treat every resource as a type of object. You can model them as you like, but the operations you can perform on them are standard (POST, GET, PUT, DELETE), unlike SOAP, where you have arbitrary functions that take and produce arbitrary values.
  * REST is predictable: you know the operations will always be the same, while in SOAP each service had its own conventions, standards, and ws-* specifications.
  * It uses JSON, which is lighter and easier to read than XML.
  * A REST framework or container is pretty much just a simple HTTP server that maps URLs to your code.
  * It doesn't provide the discoverability and auto-generation of client code that SOAP has with WSDL and XSD, but by being more flexible it allows nonbreaking changes to be released on the server without the need to redeploy client code.
#5: Data Layer
- Replication scales reads, not writes.
  * On MySQL you can have a master and slave topology. There's a limit on the slave count, but you can then have multilevel slaves to increase the limit.
  * All writes go to the master, while replicas are read-only.
  * Replication happens asynchronously.
  * The master records all data-modifying statements in its binlog file; each slave copies them into its own relay log and then executes them.
  * Promoting a slave to master is a manual process.
  * You can have a master-master topology for a faster/easier failover process, but it is still not automatic in MySQL. In this topology, you write to either one and the other replicates from its binlog.
  * The more masters you have, the longer the replication lag, hence worse write times.
- Active data set: the amount of data accessed within a time window (an hour, day, week). It's the live data that's most frequently read and written.
  * A large active data set is a common scalability issue.
  * Replication can help increase concurrent reads, but it does not help with a growing active data set, since an entire copy of the data set needs to be kept on each slave.
- When sharding, you want to keep together sets of data that will be accessed together and spread the load evenly among servers.
- You can apply sharding to object caches, message queues, file systems, etc.
  * Cross-shard queries are a big challenge and should probably be avoided.
  * Cross-shard transactions are not ACID.
- Try to avoid distributed transactions.
- NoSQL databases make compromises in order to support their priority features.
- The famous "pick two" phrase related to the CAP theorem is not entirely true.
  * If your system is still operational after a number of failures, you still have partition tolerance.
  * Quorum consistency provides consistency while still having availability, at the price of latency.
- Quorum consistency means the majority of the nodes agree on the result (for reads and/or writes). You can implement this for some of your queries, trading latency for consistency on eventually consistent systems, so you can still enforce read-after-write semantics.
- Eventual consistency: property of a system where nodes may have different versions of the data until it eventually propagates.
  * These systems favor availability; they give no guarantees that the data you're getting is the freshest.
  * Amazon Dynamo (designed to support the checkout process) works this way: it saves all conflicting versions and sends them to the client, where the reconciliation happens, which results in both shopping cart versions being merged. They prefer showing a previously removed item in a shopping cart to losing data.
- Cassandra topology: all nodes are the same and all accept reads and writes. When designing your cluster you decide how many nodes you want the data replicated to, which happens automatically; all nodes know which node has what data.
  * It's a mix of Google's BigTable and Amazon's Dynamo. It has huge tables that most of the time are not related; rows can be partitioned among nodes by the row key, and you can add columns on the fly; not all rows need to have all the columns.
  * When a client connects to a node, that node acts as the session coordinator: it finds which nodes have the requested data and delegates to them.
  * It implements a self-healing technique where 10% of all transactions trigger an async background check against all replicas and send updates to the ones with stale data.
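A rough sketch of the quorum idea from the notes above: with N replicas, if a write is acknowledged by W of them and a read asks R of them, then R + W > N guarantees the read overlaps at least one replica holding the latest write. The `Replica` class, the version counter and both functions are assumptions made for illustration; this is not how Cassandra or Dynamo are actually implemented.

```python
class Replica:
    """One copy of a single value, tagged with a version number."""

    def __init__(self):
        self.version = 0
        self.value = None
        self.up = True


def quorum_write(replicas, value, version, w):
    """Write to every reachable replica; succeed only if at least w acknowledge."""
    acks = 0
    for rep in replicas:
        if rep.up:
            rep.version, rep.value = version, value
            acks += 1
    return acks >= w


def quorum_read(replicas, r):
    """Ask r reachable replicas and return the value with the highest version."""
    answers = [(rep.version, rep.value) for rep in replicas if rep.up][:r]
    if len(answers) < r:
        raise RuntimeError("not enough replicas for a quorum read")
    return max(answers, key=lambda answer: answer[0])[1]


# N = 3, W = 2, R = 2, so R + W > N.
nodes = [Replica() for _ in range(3)]

nodes[2].up = False                        # one replica misses the write
write_ok = quorum_write(nodes, "cart-v1", version=1, w=2)

nodes[2].up = True                         # the stale replica comes back...
nodes[0].up = False                        # ...and an up-to-date one goes away
latest = quorum_read(nodes, r=2)           # still sees the write
print(write_ok, latest)
```

The read has to wait for two answers instead of one, which is the latency price mentioned in the notes.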
#6: Caching
- Cache hit ratio is the most important caching metric; it is affected by 3 factors.
  * Data set size: the more unique your cache keys, the less chance of reusing them. If you cache based on the user's IP address you have up to 4 billion keys, unlike caching based on country.
  * Space: how big the data objects are.
  * TTL (time to live).
- HTTP caches are read-through caches.
  * For these you can use request/response headers or HTML meta tags; try to avoid the latter to prevent confusion.
  * Favor "Expires: A-DATE" over "Cache-Control: max-age=TTL"; the latter is inconsistent and less backwards compatible.
  * Browser cache: exists in most browsers.
  * Caching proxies: usually a local server in your office or at your ISP to reduce internet traffic; less used nowadays because of cheaper internet prices.
  * HTTPS prevents caching on intermediary proxies.
  * Reverse proxy: reduces the load on your web servers and can be installed on the same machine as the load balancer, like Nginx or Varnish.
  * CDNs can act as caches.
  * Most caching systems use the Least Recently Used (LRU) algorithm to reclaim memory space.
- Object caches are cache-aside caches.
  * Client side, like the web storage specification, with limits of 5 to 25 MB.
  * Distributed object caches: ease the load on data stores. Your app checks them before making a request to the db, and you can decide whether to update the cache on read or on write.
  * Cache invalidation is hard: if you were caching each search result on an e-commerce site and then updated one product, there's no easy way of invalidating the cache without running each of the search queries to see if the product is included.
  * The best approach often is to set a short TTL.
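The cache-aside, TTL and LRU notes above fit together in a few lines. This is a minimal single-process sketch, not a replacement for Memcached or Redis; `LruTtlCache`, `get_product` and the dictionary used as a database are all hypothetical names for illustration.

```python
import time
from collections import OrderedDict


class LruTtlCache:
    """Tiny object cache: entries expire after a TTL, and the least
    recently used entry is evicted when the cache is full."""

    def __init__(self, max_items, ttl_seconds, clock=time.monotonic):
        self.max_items = max_items
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = OrderedDict()   # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]      # expired: treat as a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key, value):
        self._items[key] = (self.clock() + self.ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)   # evict least recently used


def get_product(cache, db, product_id):
    """Cache-aside: the application checks the cache first, then the data store."""
    cached = cache.get(product_id)
    if cached is not None:
        return cached
    value = db[product_id]            # cache miss: go to the data store
    cache.set(product_id, value)
    return value


db = {1: "book", 2: "pen", 3: "ink"}
cache = LruTtlCache(max_items=2, ttl_seconds=60)
print(get_product(cache, db, 1))      # miss, loaded from db
print(get_product(cache, db, 1))      # hit, served from cache
```

Note that nothing here invalidates an entry when the underlying row changes; the short TTL is what bounds how stale a cached value can get.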
#7: Asynchronous Processing
- Messages are fire-and-forget requests.
- Message queue: can be as simple as a database table.
- Message Broker | Enterprise Service Bus (ESB) | Message-Oriented Middleware (MOM): an app that handles message queuing, routing and delivery, often optimised for concurrency and throughput.
- Message producers are clients issuing requests (messages).
- Message consumers are your servers which process messages.
  * Cron-like (pull model): connects periodically to the queue to check its status; common among scripting languages without a persistent running application container: PHP, Ruby, Perl.
  * Daemon-like (push model): consumers run in an infinite loop, usually with a permanent connection to the broker; common with languages with persistent application containers: C#, Java, Node.js.
- Messaging protocols.
  * AMQP is recommended: a standardised (implemented the same way among providers), well-defined contract for publishing, consuming and transferring messages; it provides delivery guarantees, transactions, etc.
  * STOMP is simple and text-based like HTTP. It adds little overhead, but advanced features require extensions via custom headers, like prefetch-count, which are non-standard.
  * JMS is only for JVM languages.
- Benefits of message queues:
  * Async processing.
  * Evening out traffic spikes: if you get more requests, the queue just gets longer and messages take longer to get picked up, but consumers will still process the same amount of messages and eventually even out the queue.
  * Isolating failures, self-healing and decoupling.
- Challenges.
  * No message ordering. Group ids can help: some providers guarantee ordered delivery of messages in the same group, but watch out for consumer idle time.
  * Race conditions.
  * Message re-queuing: strive for idempotent consumers.
  * Don't couple producers with consumers.
- Event sourcing: a technique where every change to the application state is persisted in the form of an event, usually tracked in a log file, so at any point in time you can replay the events and get to a specific state. MySQL replication with binary log files is an example.
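The "strive for idempotent consumers" advice above is easiest to see with a redelivered message. A minimal sketch, assuming each message carries a unique `id` field and that remembering processed ids in memory is acceptable; a real consumer would keep them in a durable store. The class and field names are made up for illustration.

```python
from collections import deque


class IdempotentConsumer:
    """Applies each message at most once, even if the broker redelivers it."""

    def __init__(self):
        self.processed_ids = set()
        self.balance = 0

    def handle(self, message):
        if message["id"] in self.processed_ids:
            return False              # duplicate delivery: safely ignored
        self.balance += message["amount"]
        self.processed_ids.add(message["id"])
        return True


# A plain deque stands in for the message queue.
queue = deque([
    {"id": "m1", "amount": 10},
    {"id": "m2", "amount": 5},
    {"id": "m1", "amount": 10},       # the same message, re-queued after a timeout
])

consumer = IdempotentConsumer()
applied = [consumer.handle(queue.popleft()) for _ in range(len(queue))]
print(applied, consumer.balance)
```

Without the id check, the re-queued message would be applied twice and the balance would be wrong.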
#8: Searching for Data
- High-cardinality fields (many unique values) are good index candidates because they narrow down searches.
- Distribution factor also affects the index.
- You can have compound indexes to combine a high-cardinality field and a low-distribution one.
- Inverted index: allows searching for phrases or words in full-text search.
  * These break down text into word tokens and store next to each token the ids of the documents that contain it.
  * When making a search you get the posting lists of each word, merge them and then find the intersections.
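The inverted index described above can be sketched in a few lines. This toy version assumes whitespace tokenization and lowercase matching only; real search engines add stemming, stop words, ranking and much more. The function names and sample documents are made up for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each token to the set of document ids that contain it (its posting list)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Fetch the posting list of every query word and intersect them."""
    postings = [index.get(token, set()) for token in query.lower().split()]
    if not postings:
        return set()
    return set.intersection(*postings)


docs = {
    1: "red running shoes",
    2: "blue running jacket",
    3: "red jacket",
}
index = build_index(docs)
print(search(index, "red"))
print(search(index, "running jacket"))
```

Each lookup touches only the posting lists of the query words, not the documents themselves, which is why this scales better than scanning text.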
#9: Other Dimensions of Scalability
- Automate testing, builds, releases, monitoring.
- Scale yourself.
  * Overtime is not scaling.
  * Recovering from burnout can take months.
- Project management levers: scope, time, cost.
  * When modifying one, the others should recalibrate.
  * Influencing scope can be the easiest way to balance workload.
- "Without data you're just another person with an opinion".
- Stay pragmatic using the 80/20 heuristic rule.
Pretty solid book. There are some surprising omissions in WSSE (such as not discussing the difference between statement- vs. row-based replication), and some of the information is now outdated. But overall, I would say it's a great introduction if you're just trying to get your bearings with system design. I think of it as a more-accessible, less-academic, less-high-quality version of (which you should definitely read as a follow-up).
Excellent book, very well written, with all the concepts clearly explained with diagrams. It's very easy to read as well. The author clearly knows what he is talking about and what it takes to explain the concepts. An excellent book, the best I have read in a long time. Very exhaustive. I will recommend it to others; totally worth buying.
Great overview of scalability for small companies. One of those books that makes you realize how many things you didn't know that you didn't know. Perfect as a starting point. If nothing else it will help you frame your questions better and show you what you need to learn next.
One of the best intros to distributed systems and system design that I have ever read. Explains everything from good software practices, the types of data stores and how to choose, caching, message queues,
This is so far the most interesting book I've ever read in the computer science field. If you are a frontend developer or a junior backend developer and you want to understand what the senior backend people talk about, read this book. Or if you want to build a really scalable web application like the giant websites we know, start with this book.
I was thoroughly impressed by the depth and breadth of knowledge it offered. Despite some of the concepts being relatively simple, they are crucial for building a strong foundation in system design. In addition to the technical content, the book also delves into important topics like scaling operations, the impact of individual developers, and team dynamics.
Overall, I highly recommend this book to any developer just starting their career. It is a valuable resource that will give you the tools and understanding you need to succeed in the fast-paced world of startups.
Excellent book which I regret I didn't read 5-6 years ago.
Startup engineering is still a rare topic when it comes to software engineering (in contrast to product management). Databases, algorithms, cracking the coding interview, frontend frameworks, domain-driven design books: this is what we have. I am happy that this book exists.
Bad things first:
- Some parts (not many) did not age well, like the writing about cloud resources such as EC2. Now we have a range of different cloud services at our disposal, including containerized apps that scale automatically based on some criteria, like Cloud Run or Azure Container Apps. A 2nd edition should fix this!
- There were a lot of trivial sentences in the chapter introductions. Example: "Careful design of the web services layer is critical because if you decide to use web services, this is where most of your business logic will live". The introductions were the most boring parts.
- The new copy I bought from Amazon was just bad quality. I can pull 4-6 pages out of the book. They are not glued (WTF?). Also, some pictures are more gray than black... plus there is a duplicated title on 2 different images. This is not a self-published book; I would expect better quality.
Good parts:
+ CDNs, load balancers, proxies, reverse proxies... finally I can put these things in order in my head.
+ A lot of good references! Even when a topic may be important for a startup in a specific context but is too broad to cover in this book, the author gives good references to books and articles.
+ Feels end-to-end. How to scale the frontend? The backend? Databases? Search? Yourself? The code? It's all here. Caching, async messaging, eventual consistency, NoSQL vs SQL, search engines, even SLAs.
+ Good practices are here! While working for startups I saw a lot of bad code, a lack of tests, bad quality. Here Artur emphasizes the importance of tests and writes about hexagonal architecture, TDD, SOLID, inversion of control... This may be obvious to people who have been in the industry for a long time, but I still felt happy when I saw all this in a book for startup engineers.
+ The chapter about caching is great. You can learn about different caching mechanisms, and it makes great points about HTTP caching and various HTTP headers.
This is an extremely accessible and readable book on designing scalable web application architectures. The sheer breadth it covers is impressive. Written in 2014, it also feels a little dated in some areas. It does not become too technical at any point, though therein lies the downside. Some topics do feel superficial like text-based searches.
The second chapter on design principles feels misplaced in the book, as it is neither comprehensive and detailed nor a good fit for the theme of the book. I feel that if the focus of the book had been just scalable web application architectures, without the startup angle, it would have delivered more value with less text.
With that being said, I think the clarity and organization with which the author has laid out the concepts will certainly help a software engineer. Probably more beneficial for a less tenured engineer to get up to speed from the perspective of approaching system-design problems encountered in their daily job or asked in interviews. However, as I mentioned about the breadth, it might still be beneficial for a more tenured engineer to make sure all bases are covered. This, in particular, is where it helped me. As engineers we rarely get to work on each and every aspect of an application.
Following it up with Designing Data Intensive Applications might be a killer combo but I am yet to go through that one.
Still reading but it does not look good. For now it seems to be too high level and beginner friendly. It's more like an introduction to scaling & system design, but it also involves other topics like SOLID principles on a very very brief level. It also uses the common misinterpretation of the SRP to explain it. I think this just adds more pages to the book and makes it more expensive, both in terms of price and time to read.
Graphic editing is bad - diagrams are almost always placed on the page after the explanation so you have no context and you need to turn pages to find the related figure.
This book is a good introduction to the problems of scalability. It gives you a holistic view of large-scale systems and does so in a very pragmatic and consistent manner. There are lots of links to other materials if you are inclined to dig deeper into a topic, although some of them might be outdated, given that the book was written 8 years ago. Honestly, I wish I had read this book earlier. I personally didn't gather a lot of new insight from it, but it is a good book for junior/middle engineers and can save you a lot of time and effort putting together the different pieces of the puzzle of highly scalable systems.
Some concepts would be discussed but the explanation would happen in a later chapter. Sometimes, the author wrote sentences with incorrect wording or missing words. That said, the book is a great primer on scaling out all aspects of a technology firm: good software architecture and design, caching and CDNs, functional partitioning of software, data replication and sharding, message queues, search engines for database text, automated testing and continuous deployment, automated monitoring and logging, small teams.
This is a good first book on distributed systems and scalability. The author covers all the major topics and challenges of scaling out distributed systems in an approachable way without diving into too much theory. The advice is detailed and practical. Read this before you read DDIA.
I first read it in 2019 for System Design interviews and it helped me understand the topics very well. In 2022, after a lot more experience scaling systems, I still find it super relevant. The tech that's hip now may be slightly different (Envoy > HAProxy!), but the core concepts are mostly unchanged.
Great book for learning the essential technical knowledge on how to scale web applications for startups. It covers the most important aspects of scalability, from NoSQL, database replication and sharding, caching, asynchronous systems and full-text search to automation, team management and collaboration. I highly recommend this book to any web developer, even if you are not working in a startup.
A foundational book you can use to start learning about scalability. Once your application's user base starts to grow, it's a good idea to dig into how to make the application serve a large number of requests, from the front end and back end down to the database. It's also not a bad idea to understand this book when you first start building an application.
A really good book for a junior engineer who wants to become a mid-level or senior engineer. It is easy to learn from, and the many illustrations make it easier to understand. It will help you as an engineer to design a system that scales easily. The book only scratches the surface of the tools for scalability, but it also provides references if you want to learn more about them.
Good to read for exploring the world of trade off :)
I like the contents of the book so much. It is good for mid-level software engineers who want to explore the concerns involved in making an app or website scalable and to understand why no one solution fits all situations. The examples used in the book are very easy to follow and make sense.
Just read this book; I would recommend it to anyone who has just started with system design and architecture. It's a great book that will give you an outline of how to create your system and point out the various components and options available so you can choose the right technology. A great read.
Excellent book for a high level view of software architecture and scalability. It is concise and with just enough detail to understand each topic, how they fit together, and learn more on your own as needed. Should be mandatory reading for every junior engineer.
You can just pick it up and start reading a random chapter. It touches all the aspects of web development. It is pretty technical, so not recommended for the management level.
One of the best books on web development I've read in a long time. Very good balance of storytelling, technical jargon, examples and principles. Highly recommend.
One of the most consistently engaging, clear, and focused programming books I've read in a while. Not "game changing", but it certainly helps you learn what it claims to!