What Is a Distributed System?

A distributed system can be defined in several ways:

  • Tanenbaum and van Steen: “A collection of independent computers that appears to its users as a single coherent system”

  • Coulouris, Dollimore and Kindberg: “One in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages”

  • Lamport: “A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable”

Motivations for Distributed Systems

  1. Geographic Distribution: Resources and users are naturally distributed
    • Example: Banking services accessible from different locations while data is centrally stored
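  2. Fault Tolerance: Failures rarely strike multiple locations at the same time, so spreading components across locations lets the system survive a local fault
    • Example: Database replicas in different rooms keep data available if one machine or room fails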
  3. Performance and Scalability: Combining the resources of many machines provides more capacity than any single machine
    • Examples: High-performance computing clusters, replicated web servers
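
The fault-tolerance argument above can be made concrete with a little arithmetic. The sketch below assumes each replica is up independently with the same probability p, which is an idealization (real failures are often correlated, for example through shared power or software bugs). The function name combined_availability is hypothetical and introduced only for illustration.

```python
def combined_availability(p: float, n: int) -> float:
    """Probability that at least one of n replicas is up.

    Assumes each replica is up independently with probability p.
    This independence assumption is an idealization, not a guarantee.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if n < 1:
        raise ValueError("n must be at least 1")
    # All n replicas are down with probability (1 - p) ** n,
    # so at least one is up with the complementary probability.
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, combined_availability(0.99, n))
```

Under these assumptions, each extra replica multiplies the probability of total outage by (1 - p), which is why placing replicas where failures are unlikely to coincide matters more than simply adding machines in the same room.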

Examples of Distributed Systems

  • Financial trading platforms
  • Web search engines (processing 50+ billion web pages)
  • Social media platforms supporting billions of users
  • Large Language Models (trained across clusters)
  • Scientific research (e.g., CERN, which stores over 1 exabyte of data)
  • Content Delivery Networks (CDNs)
  • Online multiplayer games

Fallacies of Distributed Computing

Eight false assumptions that designers new to distributed systems tend to make, each of which leads to fragile designs (formulated by L. Peter Deutsch and colleagues at Sun Microsystems):

  1. The network is reliable
  2. Latency is zero
  3. Bandwidth is infinite
  4. The network is secure
  5. There is one administrator
  6. Transport cost is zero
  7. The network is homogeneous
  8. Topology doesn’t change
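
As one illustration of designing against the first two fallacies, the following minimal sketch wraps a remote call in bounded retries with exponential backoff and jitter. The helper call_with_retries and its parameters are hypothetical names chosen for this example, not part of any particular library, and the sketch assumes the wrapped operation signals network trouble by raising ConnectionError or TimeoutError.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying when it raises ConnectionError or TimeoutError.

    The sleep parameter is injectable so the helper can be tested
    without actually waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                # Out of attempts: let the caller see the failure.
                raise
            # Exponential backoff with full jitter: wait a random time
            # between 0 and base_delay * 2**attempt before retrying.
            delay = base_delay * (2 ** attempt)
            sleep(random.uniform(0.0, delay))
    raise AssertionError("unreachable")
```

Retrying is only safe when the operation is idempotent, since a request may have been processed even though its reply was lost. The jitter is there so that many clients recovering from the same outage do not all retry at the same instant.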

Key Aspects of Distributed System Design

  • System Function: The intended purpose (features and capabilities)
  • System Behavior: How the system performs its functions
  • Quality Attributes: Core qualities determining success:
    • Performance
    • Cost
    • Security
    • Dependability

Challenges in Distributed Systems

Distributed systems introduce complexity in:

  • Coordination
  • Consistency
  • Fault detection and recovery
  • Security
  • Performance optimization
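
To give a feel for why fault detection is hard, here is a minimal sketch of a timeout-based heartbeat detector. The class HeartbeatDetector and its methods are hypothetical names used only for illustration. It assumes each node periodically reports in, and it treats any node that has been silent for longer than the timeout as suspected rather than confirmed failed.

```python
import time
from typing import Callable, Dict, List


class HeartbeatDetector:
    """Suspects a node when no heartbeat has arrived within timeout seconds."""

    def __init__(self, timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout = timeout
        # The clock is injectable so the detector can be tested deterministically.
        self.clock = clock
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, node: str) -> None:
        """Record that a heartbeat from node arrived just now."""
        self.last_seen[node] = self.clock()

    def suspected(self) -> List[str]:
        """Return the sorted names of nodes that have been silent too long."""
        now = self.clock()
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

A detector like this can only suspect a failure: a slow network or an overloaded node looks the same as a crashed one, so a short timeout produces false alarms while a long one delays recovery.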