HDFS scalability: the limits to growth
Summary of Shvachko's article presenting the Hadoop Distributed File System (HDFS) and its scalability limitations.
MapReduce: Simplified Data Processing on Large Clusters
Paper from Google engineers presenting MapReduce, a model providing a robust yet simple interface for processing large datasets in distributed environments.
Enterprise Data Analysis and Visualization: An Interview Study
This paper presents an overview of the different competencies of data analysts, discussing the social and organizational context of companies that affects the outcome of analysts' everyday work.
Improving Datacenter Performance and Robustness with Multipath TCP
Paper presenting the MPTCP protocol, which improves performance in datacenters by providing an alternative to single-path transport.
Modeling TCP Throughput: A Simple Model and its Empirical Validation
Summary of the paper published in 1998 describing a TCP throughput prediction model that takes timeouts into account.
TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms
Summary of RFC 2001, which discusses the TCP mechanisms for handling packet transmission during network congestion.
Congestion avoidance and control
Paper presenting an efficient algorithm for network congestion avoidance and control, in response to the first Internet congestion collapses, which occurred in 1986.
Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks
Summary of the paper published in 1989 analyzing different increase and decrease algorithms for congestion avoidance in computer networks.
Efficient virtual network isolation in multi-tenant data centers on commodity ethernet switches
This paper presents LANES, a system that provides network isolation for multi-tenant data center environments.
P4: Programming Protocol-Independent Packet Processors
Paper presenting P4, a programming language designed for protocol and target independence, which allows programmers to define packet processors in SDN environments.