Parallel corpora are an essential resource for training state-of-the-art data-driven statistical machine translation systems. Unfortunately, large parallel corpora are available only for a handful of language pairs and for a few genres, such as political documents. We therefore propose to use the much richer body of comparable corpora to address this data sparseness problem. Comparable corpora are documents that contain the same or similar information in different languages. Examples are the multilingual newswire texts produced by news organizations such as Agence France Presse and the BBC. These texts often describe the same event in multiple languages in varying degrees of detail. The proposed project addresses the question of how comparable corpora can be leveraged to improve translation systems. We will extend existing techniques and develop new ones to collect comparable corpora from a variety of data streams available on the Web, including newswires as well as sources such as the online encyclopedia Wikipedia. Cross-lingual information retrieval techniques and classifiers will be used to identify documents containing similar content. We will then develop new word and phrase alignment techniques to extract lexicons and phrase tables from these comparable documents. Our new alignment approaches will be validated by improving existing translation systems, especially for English-Arabic translation. We will also test their effect on low-resource languages.