Sunday, July 29, 2018

Almost-accurate machine translation of human language can be dangerous - its occasional inaccuracy can lead to wrong decisions

The point here is that if machine translation becomes so good that you stop feeling the need to cross-check or double-check the automated output, this can lead to wrong and potentially dangerous actions/decisions [the assumption here is that merely "almost-flawless" automated translation is enough to create this high level of confidence or trust in a human user - a "fully-flawless" level of accuracy isn't required]. Since the translation service is not 100% accurate but only almost-100%, it will inevitably make occasional mistakes. But because the human user has complete faith in the service, he doesn't feel the need to get the automated translation checked [manually - by a human translator], so a mistake can be [silently] accepted as if it were the correct translation [that is, the human won't even realize that there's any flaw in the translation]. This can lead to erroneous actions or decisions. If doctors rely solely on such "almost-perfect" computer translation, serious medical blunders can occur.
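A quick back-of-the-envelope sketch of why "occasional errors" add up when nobody double-checks the output. The per-document accuracy and the daily volume below are made-up numbers purely for illustration, not figures about any real translation service:

```python
# Illustrative only: the accuracy and volume figures are assumptions,
# not measurements of any real translation service.

documents_per_day = 200        # hypothetical volume at, say, a hospital
per_document_accuracy = 0.999  # hypothetical "almost-flawless" rate

# Expected number of silently mistranslated documents per year,
# assuming no human translator ever double-checks the output.
expected_errors_per_year = documents_per_day * 365 * (1 - per_document_accuracy)
print(f"Expected unchecked mistranslations per year: {expected_errors_per_year:.0f}")
# With these assumed numbers, roughly 73 documents a year carry an error
# that nobody catches, because everyone trusts the automated translation.
```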

An example of very good automated translation is below. How can I claim that the translation correctly depicts what was originally written in Ukrainian? I read a few news stories [in English] about this Ukrainian-language webpage. But I cannot be sure whether those news outlets got the page translated by a human, or whether they themselves also relied on Google Translate.

[Screenshot: Google Translate's English rendering of the Ukrainian-language webpage]

Update [7-Oct-18]: Another [representative] scenario where an almost-accurate system can lead to catastrophic outcomes is an imaginary crying-infant detector device which continuously listens to incoming sounds and can identify sounds that resemble a crying infant. It alerts the parents - who are sitting at some distance - when it detects this crying. Suppose it's "so" accurate that the parents start to depend blindly on it. Suppose its real accuracy is 99.5%, whereas the parents consciously or subconsciously start to assume - based on their real-life experience with the device - that it's "virtually" 100% accurate. Now this can lead to fatal mistakes. What if a particular type of low-volume, intermittent crying by an infant falls in that 0.5% category, the device doesn't raise an alarm, the parents falsely assume that all is well, and the infant keeps crying for a long, long time? It's a scary scenario. In any such high-stakes scenario, a machine-based system had better be at least as accurate as a human. Nearly-100% accuracy can produce fatal outcomes because of the blind faith placed in the system by its human users. As it's sometimes said, good enough is not good enough.
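To put rough numbers on this, here is a small sketch using the 99.5% figure from above. The number of crying episodes per day is an assumption made up for illustration, as is the simplification that each episode is detected independently with the same accuracy:

```python
# Sketch using the post's 99.5% figure. The episodes-per-day number is a
# made-up assumption, and independence between episodes is assumed purely
# for illustration.

per_episode_accuracy = 0.995   # device catches a given crying episode 99.5% of the time
episodes_per_day = 8           # hypothetical
days = 365

total_episodes = episodes_per_day * days

# Probability that the device misses at least one episode over the year,
# and the expected number of missed episodes.
p_miss_at_least_once = 1 - per_episode_accuracy ** total_episodes
expected_misses = total_episodes * (1 - per_episode_accuracy)

print(f"Crying episodes in a year: {total_episodes}")
print(f"Probability of missing at least one: {p_miss_at_least_once:.6f}")
print(f"Expected missed episodes per year: {expected_misses:.1f}")
# Under these assumptions the device is virtually certain to fail at least
# once during the year, and misses about 15 episodes on average - while the
# parents treat it as "virtually" 100% reliable.
```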