Mistranslated numbers can cause serious harm, such as financial loss or
medical misinformation. In this work, we develop comprehensive
assessments of the robustness of neural machine translation (NMT) systems to
numerical text via behavioural testing. We explore a variety of numerical
translation capabilities a system is expected to exhibit and design effective
test examples to expose system underperformance. We find that numerical
mistranslation is a general issue: major commercial systems and
state-of-the-art research models fail on many of our test examples, for both
high- and low-resource languages. Our tests reveal novel errors that have not
previously been reported in NMT systems, to the best of our knowledge. Lastly,
we discuss strategies to mitigate numerical mistranslation.
