MapReduce: Simplified Data Processing on Large Clusters
Paper from Google engineers presenting MapReduce, a programming model that provides a simple yet robust interface for processing large datasets in distributed environments.
Enterprise Data Analysis and Visualization: An Interview Study
This paper presents an overview of the different competencies of data analysts and discusses how the social and organizational context of companies affects the outcome of analysts' everyday work.
Improving Datacenter Performance and Robustness with Multipath TCP
Paper presenting Multipath TCP (MPTCP), a protocol that improves performance and robustness in datacenters by providing an alternative to single-path transport.
Modeling TCP Throughput: A Simple Model and its Empirical Validation
Summary of the paper published in 1998 describing a TCP throughput prediction model that takes timeouts into account.
TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms
Summary of RFC 2001, discussing the TCP mechanisms for handling packet transmission during network congestion.