Pregel: A System for Large-Scale Graph Processing
Summary of the paper presenting Pregel, Google's system for processing large Graphs using a simple computational model while being designed for efficiency and fault-tolerant execution.
Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks
Summary of the paper presenting Tachyon, the distributed file system that enables reliable data sharing at memory speed.
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing
Summary of the paper published by University of California Berkeley researchers, presenting RDD, a distributed memory abstraction for computations on large clusters.
HDFS scalability: the limits to growth
Summary of Shvachko's article presenting Hadoop distributed file system HDFS and its scalability limitations.
MapReduce: Simplified Data Processing on Large Clusters
Paper from Google engineers presenting MapReduce, a model providing a robust yet simple interface for processing large datasets in distributed environments.