Although tremendous success has been achieved for interactive object cutout in still images, accurately extracting dynamic objects in video remains a very challenging problem. Previous video cutout systems present two major limitations: (1) reliance on global statistics, thus lacking the ability to deal with complex and diverse scenes; and (2) treating segmentation as a global optimization, thus lacking a practical workflow that can guarantee the convergence of the systems to the desired results. We present Video SnapCut, a robust video object cutout system that significantly advances the state-of-the-art. In our system segmentation is achieved by the collaboration of a set of local classifiers, each adaptively integrating multiple local image features. We show how this segmentation paradigm naturally supports local user editing and propagates them across time. The object cutout system is completed with a novel coherent video matting technique. A comprehensive evaluation and comparison is presented, demonstrating the effectiveness of the proposed system at achieving high quality results, as well as the robustness of the system against various types of inputs. © 2009 ACM.