© 2015 IEEE. In this work, we present a dictionary-learning-based framework for robust, cross-modality, and pose-invariant facial expression recognition. The proposed framework first learns a dictionary that i) contains both 3D shape and morphological information and 2D texture and geometric information, ii) enforces coherence across the 2D and 3D modalities and across different poses, and iii) is robust in the sense that a learned dictionary can be applied across multiple facial expression datasets. We demonstrate that, by enforcing domain-specific block structures on the dictionary, we can transform a given test expression sample across different domains for tasks such as pose alignment. We validate our approach on the task of pose-invariant facial expression recognition on the standard BU3D-FE and MultiPie datasets, achieving state-of-the-art performance.
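
To illustrate the idea of transforming a sample across domains through a block-structured dictionary, the following is a minimal sketch and not the implementation used in this work. It assumes a generic coupled-dictionary formulation in which each domain (for example, a pose or a modality) has its own block and all blocks share one sparse code. The names `D_a`, `D_b`, `sparse_code_omp`, and `transform_across_domains` are hypothetical, the dictionaries are random toy data rather than learned ones, and the source block is given orthonormal columns only so that the greedy solver recovers the code exactly in this toy case.

```python
import numpy as np


def sparse_code_omp(D, y, k):
    """Greedy orthogonal matching pursuit: approximate y with at most k atoms of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x


def transform_across_domains(D_src, D_dst, y_src, k):
    """Code the sample with the source block, then reconstruct with the target block."""
    x = sparse_code_omp(D_src, y_src, k)
    return D_dst @ x


# Toy data (assumption): two domain blocks that share the same 20 atoms.
rng = np.random.default_rng(0)
n_atoms = 20
D_a, _ = np.linalg.qr(rng.standard_normal((30, n_atoms)))  # source block, orthonormal columns
D_b = rng.standard_normal((40, n_atoms))                   # target block

# A sample in domain A generated from a 3-sparse shared code.
x_true = np.zeros(n_atoms)
x_true[[2, 7, 11]] = [1.5, -2.0, 0.8]
y_a = D_a @ x_true

# Estimate how the same sample would look in domain B.
y_b_est = transform_across_domains(D_a, D_b, y_a, k=3)
```

In this sketch the shared sparse code is what ties the domains together: the block for one domain is used only to infer the code, and the block for the other domain is used only to synthesize the transformed sample.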