OpenMaTrEx, a free/open-source marker-driven example-based machine translation system

What is OpenMaTrEx?

OpenMaTrEx is a free/open-source (FOS) example-based machine translation (EBMT) system based on the marker hypothesis. It comprises a marker-driven chunker, a collection of chunk aligners, and two engines: one based on the simple proof-of-concept monotone recombinator (previously released as Marclator, http://www.openmatrex.org/marclator/) and the other a Moses-based decoder (http://www.statmt.org/moses/). OpenMaTrEx is a FOS version of the basic components of MaTrEx, the data-driven machine translation system designed by the Machine Translation group at the School of Computing of Dublin City University (Stroppa and Way 2006, Stroppa et al. 2006). Much of the code in OpenMaTrEx is written in Java, although many important tasks are performed in a variety of scripting languages.
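As a rough illustration of what "marker-driven chunking" means, the following Python sketch splits a sentence at closed-class "marker" words (determiners, prepositions, conjunctions, pronouns), under the usual constraint that every chunk contains at least one non-marker word. It is not taken from the OpenMaTrEx code: the marker set, the function name marker_chunk and the whitespace tokenisation are all assumptions made for the example, and the real marker files shipped with the package are larger and language-specific.

```python
# Illustrative sketch only; this is NOT OpenMaTrEx code.
# MARKERS is a tiny, hypothetical sample of English marker words.
MARKERS = {
    "the", "a", "an",            # determiners
    "in", "on", "of", "to",      # prepositions
    "and", "or", "but",          # conjunctions
    "he", "she", "it", "they",   # pronouns
}


def marker_chunk(sentence, markers=MARKERS):
    """Split a sentence into chunks that begin at marker words.

    A marker word opens a new chunk only if the current chunk already holds
    a non-marker word, so consecutive markers stay with the content word
    that follows them. A trailing run of markers with no content word is
    attached to the previous chunk.
    """
    chunks = []
    current = []
    has_content = False  # does `current` contain a non-marker word yet?
    for token in sentence.split():  # naive whitespace tokenisation
        is_marker = token.lower() in markers
        if is_marker and has_content:
            chunks.append(current)
            current = []
            has_content = False
        current.append(token)
        if not is_marker:
            has_content = True
    if current:
        if has_content or not chunks:
            chunks.append(current)
        else:
            # Only markers left over: merge them into the last chunk.
            chunks[-1].extend(current)
    return [" ".join(chunk) for chunk in chunks]


if __name__ == "__main__":
    for chunk in marker_chunk("the cat sat on the mat in the house"):
        print(chunk)
```

In a marker-based EBMT system, chunks of this kind are produced for both sides of a sentence-aligned corpus and then paired up by the chunk aligners; the sketch covers the chunking step only.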

OpenMaTrEx has been released under the GNU General Public Licence (GPL), version 3.

OpenMaTrEx is (c) 2007-2010 Dublin City University. The original MaTrEx code was developed by, among others, Steve Armstrong, Yvette Graham, Nano Gough, Declan Groves, Yanjun Ma, Nicolas Stroppa, John Tinsley, Andy Way, and Bart Mellebeek. The free/open-source package OpenMaTrEx has been put together by Sandipan Dandapat, Mikel L. Forcada, Declan Groves, Sergio Penkale, John Tinsley, and Andy Way. Pavel Pecina helped with the Czech marker files. Jimmy O'Regan helped with the Irish marker files. The current release also contains code (an old version of args4j) which is (c) 2003 Kohsuke Kawaguchi; it will soon be removed and added to the install requirements.

A more general description of OpenMaTrEx may be found in the ABOUT file of the package.

Downloading OpenMaTrEx

While we set up a proper project site, OpenMaTrEx can be downloaded from here:

Subversion repository

If you are an OpenMaTrEx developer, you can access the repository with the command

svn co http://www.openmatrex.org/svn/OpenMaTrEx/

(contact mfor...@computing.dcu.ie if you want to become a developer).

Anyone can browse the Subversion repository.

Contact

Please send bug reports, comments, etc. to Mikel L. Forcada, mfor...@computing.dcu.ie.

Meet us at our IRC channel (#openmatrex at irc.freenode.net). Use your favourite IRC client or log in here:

Published information

PDF flyer (24/05/2010)

Sandipan Dandapat, Mikel L. Forcada, Declan Groves, Sergio Penkale, John Tinsley, Andy Way: OpenMaTrEx: A Free/Open-Source Marker-Driven Example-Based Machine Translation System, in Loftsson, H., et al., eds., Advances in Natural Language Processing: 7th International Conference on NLP, IceTAL 2010 (Reykjavík, 16-18 Aug. 2010), Col. Lecture Notes in Artificial Intelligence, vol. 6233, pp. 121-126 (Berlin, Heidelberg: Springer).

References

A more complete set of references can be found in the ABOUT file of the package or here.