Opened 2 years ago

Closed 2 years ago

Last modified 2 years ago

#21720 closed defect (fixed)

Delete Vietnamese localization

Reported by: 1ec5 Owned by: team
Priority: normal Milestone: 22.06
Component: Core Version:
Keywords: i18n vietnamese Cc: Don-vip

Description (last modified by 1ec5)

The Vietnamese (vi) localization of JOSM is completely unusable, to the point of insulting any Vietnamese speaker (even a non-native speaker like myself). It hinders Vietnamese speakers from contributing to OSM and hurts their perception of the project. Among its many problems:

  • Most translated strings seem as if they had been translated by machine-translating every other word and concatenating the results, disregarding Vietnamese grammar.
  • Basic terms are mistranslated in a manner that suggests total unfamiliarity with OpenStreetMap or the Vietnamese language.
  • Spaces are frequently missing between words. Words are inconsistently capitalized with no discernible pattern.
  • Most strings containing HTML formatting have broken syntax.
  • JOSM incorrectly considers Vietnamese to lack a plural grammatical form.

I’m only able to contribute to OSM using JOSM because I’m familiar with the English localization. Every time a warning or error appears, I have to play a game of Mad Libs to figure out if my edits would cause problems.

Just a few examples of mistranslated terminology, really the tip of the iceberg:

English | JOSM Vietnamese | Literal meaning | Correct Vietnamese
way | cách | manner | lối
reverse way | cách xếp | sorting method | đảo ngược lối
to join (a way) | tham gia | to participate in (a project) | gắn vào
(imagery) offset | bù đắp | to compensate | độ lệch
lat/lon | lạt lon | bamboo strip of beverage can | vĩ độ/kinh độ
upload to (server) | tải lên để | upload in order to | tải lên
rubber-band | cao su-band | rubber + band | dây chun
delete mode | xóa mode | delete + mode | chế độ xóa
jump there | jump có | jump + there is | nhảy tới đấy
right (side) | quyền | permission | phải
to snap (to a way) | chụp | to snap a photo | dính

JOSM’s Launchpad instance unfairly blames me for the problem. It says my account was responsible for contributing 7,867 strings (61%) out of 12,805 total on 12 May 2015, two days before the localization was committed to the repository in r8352. However, I have no recollection of contributing thousands of strings to a Vietnamese translation of JOSM at that time, when I was too busy to even contribute very much to OSM. Even if I did, I would not have translated a single one of these strings the way Launchpad says I did. That much is clear when comparing JOSM with iD and Vespucci, which I am responsible for translating into Vietnamese. (Launchpad lists a handful of strings of mine from 27 June 2010 that do look like something I would’ve written, but they were all overwritten by the 2015 mistranslations.)

So, as the supposed author of the majority of JOSM’s Vietnamese localization, I kindly ask that it be deleted from the repository until translators have had an opportunity to clean things up. If possible, the offending translations should also be deleted from Launchpad to facilitate the cleanup effort.

Attachments (1)

upload.png (380.7 KB ) - added by 1ec5 2 years ago.
Upload dialog

Change History (37)

by 1ec5, 2 years ago

Attachment: upload.png added

Upload dialog

comment:1 by 1ec5, 2 years ago

Description: modified (diff)

comment:2 by stoecker, 2 years ago

Cc: Don-vip added

Reading your text I have several issues.

  • If Launchpad says translations are attributed to you, then I would rather believe you made a mistake or forgot something from the past than that Launchpad invented something.
  • I will not remove translations for an existing language only because a single somebody opens a ticket and asks to do so. If it's a serious issue for years I'd expect that topic should have come up more than once until now.
  • If the translations are as bad as you say, start an effort to improve them. Vietnamese community should be willing to help to improve situation if it is really as described
  • "JOSM incorrectly considers Vietnamese to lack a plural grammatical form." → JOSM does not define something itself. We take the plural forms from Launchpad and the definition for "vi" is this: https://translations.launchpad.net/+languages/vi which is also reflected in any po-file (Plural-Forms: nplurals=1; plural=0;).

Altogether while I cannot decide myself whether Vietnamese in JOSM is correct or not simply because I don't speak Vietnamese your ticket has a lot of points which indicate that what you describe isn't exact either.

comment:3 by Don-vip, 2 years ago

Keywords: i18n vietnamese added

comment:4 by maxerickson@…, 2 years ago

Does Launchpad have a log of translation files that are uploaded? I don't see one (but I'm not logged in and so on).

https://help.launchpad.net/Translations/YourProject/ImportingTranslations

It does seem unlikely that thousands of strings were translated in a day or two using the web interface.

comment:5 by stoecker, 2 years ago

Not that I know about. We only have a daily backup.

Last edited 2 years ago by stoecker

in reply to:  2 comment:6 by 1ec5, 2 years ago

Replying to stoecker:

  • If Launchpad says translations are attributed to you, then I would rather believe you made a mistake or forgot something from the past than that Launchpad invented something.

As Max has pointed out, it’s extremely unlikely that thousands of translations were contributed manually. Launchpad may have had a bug misattributing the import to me. It’s plausible that I was a preexisting contributor to the localization at the time, having submitted a few (correct) translations manually through the Web interface in 2010. For example, string #153 shows a correct translation by me that was rejected in favor of a humorously incorrect one also under my name.

I suspect that someone did a crude find-and-replace job on the original English .po, manually added the X-Exported-From-Launchpad header so that Launchpad would accept it, and overwrote existing translations by me and a few other users. I insist that I did not do something so unreasonable in a fever dream, but if you don’t want to take my word for it, then there’s little I can do about that.

  • I will not remove translations for an existing language only because a single somebody opens a ticket and asks to do so.

In general, that would be a fine principle, but it leaves me in a catch-22 situation. On the one hand, you’re claiming that the localization is largely authored by me based on Launchpad’s attribution, yet on the other hand, you seem to be discounting me as “a single somebody” asking to delete the work of others. So how do I undo what I supposedly accidentally wrought?

If it's a serious issue for years I'd expect that topic should have come up more than once until now.

No one has said anything because pretty much no one uses the Vietnamese localization. I think you’re assuming that a typical Vietnamese speaker’s first reaction to a poor localization would’ve been to head over to this issue tracker and file a ticket in English. But most Vietnamese mappers would either avoid JOSM or switch JOSM’s interface language to English (for those who can find the option). After all, why would someone bother to engage with a project that uses an offensive caricature of their language?

I downloaded the latest changeset dump from 24 December 2021 and filtered it to changesets starting from 14 May 2015 (the date the Vietnamese localization was committed to the repo) that overlap with the Vietnam bounding box (8.27673°N, 101.86523°E to 23.56399°N, 109.64355°E, excluding the Paracels and Spratlys to avoid including Hong Kong). Here’s the absolute and relative editor localization usage by number of changesets:

Locale | JOSM | iD | Go Map!! 3.1+ | Vespucci 0.9.9+
English | 809213 | 124581 | 185 | 958
Vietnamese | 869 | 35691 | 55 | 360
Others | 14228 | 45534 | 4 | 783
Total | 824310 | 205806 | 244 | 2101
% Vietnamese | 0.1054% | 17.34% | 22.54% | 17.13%

Even though JOSM has much higher absolute usage than iD, Vietnamese usage is proportionally much lower among JOSM changesets than it is among iD, Go Map!!, and Vespucci changesets.
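
A quick way to double-check the percentage row is to recompute each share from the "Vietnamese" and "Total" counts in the table. The sketch below is only an illustration: the function name vi_share is made up for this example and was not part of the original analysis.

```shell
# Illustrative helper (not from the original analysis): print the share of
# Vietnamese-locale changesets as a percentage of an editor's total.
vi_share() {
    # $1 = Vietnamese changesets, $2 = total changesets for that editor
    awk -v vi="$1" -v total="$2" 'BEGIN { printf "%.2f%%\n", vi / total * 100 }'
}

# Counts taken from the "Vietnamese" and "Total" rows of the table.
vi_share 869 824310     # JOSM
vi_share 35691 205806   # iD
vi_share 55 244         # Go Map!!
vi_share 360 2101       # Vespucci
```

Dividing by the column total rather than by the English count keeps the "Others" row in the denominator, which matters most for Vespucci, where other locales make up a large part of the changesets.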

Of the 869 changesets in Vietnam uploaded using JOSM’s Vietnamese localization since 2015, 333 (38%) were uploaded by one user, V U P H A N, who stopped mapping in 2018. Among the 19 mappers who have contributed 10 or more changesets, Hieu Van is the second most prolific mapper and the only one who has ever been active. But they also switched to the English localization back in 2018. I reached out to both mappers for their feedback about the Vietnamese localization.

Here are the commands I used for this analysis:

# Keep only changesets with changes (-c), inside the Vietnam bounding box, closed after 14 May 2015
osmium changeset-filter -c --bbox='101.865234375,8.276727101164047,109.64355468749999,23.563987128451217' -a '2015-05-14T14:41:03Z' --output=vnchangesets.opl.bz2 changesets-211220.osm.bz2
# Decompress so grep can read the OPL text (step implied by the file names below)
bunzip2 -k vnchangesets.opl.bz2
# JOSM: English, Vietnamese, all locales
grep -E 'created_by=JOSM/[^,]*en' vnchangesets.opl | wc -l
grep -E 'created_by=JOSM/[^,]*vi' vnchangesets.opl | wc -l
grep -E 'created_by=JOSM/' vnchangesets.opl | wc -l
# iD: all locales, English, Vietnamese
grep -E 'created_by=iD' vnchangesets.opl | wc -l
grep -E 'created_by=iD' vnchangesets.opl | grep -E 'locale=en' | wc -l
grep -E 'created_by=iD' vnchangesets.opl | grep -E 'locale=vi' | wc -l
# Go Map!!: changesets with a locale tag, English, Vietnamese (OPL escapes a space as %20%)
grep -E 'created_by=Go%20%Map' vnchangesets.opl | grep -E 'locale=' | wc -l
grep -E 'created_by=Go%20%Map' vnchangesets.opl | grep -E 'locale=en' | wc -l
grep -E 'created_by=Go%20%Map' vnchangesets.opl | grep -E 'locale=vi' | wc -l
# Vespucci: all locales, English, Vietnamese
grep -E 'created_by=Vespucci' vnchangesets.opl | wc -l
grep -E 'created_by=Vespucci' vnchangesets.opl | grep -E 'locale=en' | wc -l
grep -E 'created_by=Vespucci' vnchangesets.opl | grep -E 'locale=vi' | wc -l
# Changesets per user among JOSM Vietnamese-locale changesets, most prolific first
grep -E 'created_by=JOSM/[^,]*vi' vnchangesets.opl | sed -E 's/.* u([^ ]+).*/\1/g' | sort | uniq -c | sort -rn

Outside of Vietnam, I assume I’m the main user of JOSM in Vietnamese via my import accounts.

  • If the translations are as bad as you say, start an effort to improve them. Vietnamese community should be willing to help to improve situation if it is really as described

Of course, I’m not asking that JOSM permanently delete the Vietnamese localization. But I hope to convince you that the majority of translated strings are incorrect enough to be removed now. And if these strings are to be removed, then we’re left with not enough translated strings for an official published localization.

I’ve reached out to the other translators whose translations were overwritten by the import to encourage them to take another look at the localization and comment on next steps. Regardless, these 7,000-plus strings will take a very long time to retranslate. The subset of Vietnamese users who contribute to OSM software translation is vanishingly small these days – I’m pretty much the only one who actively translates any of the projects. As things stand, I’m personally not very motivated to clean up this mess by hand compared to translating another editor from scratch.

To my knowledge, there’s no active, open communication channel for Vietnamese mappers. But I posted to the talk-vi mailing list, which has been mostly inactive since 2012, in case anyone on that list is still participating in OSM.

  • "JOSM incorrectly considers Vietnamese to lack a plural grammatical form." → JOSM does not define something itself. We take the plural forms from Launchpad and the definition for "vi" is this: https://translations.launchpad.net/+languages/vi which is also reflected in any po-file (Plural-Forms: nplurals=1; plural=0;).

I’m referring to this line in the JOSM codebase that gives Vietnamese only one grammatical number form. Vietnamese does indeed make a grammatical distinction between singular and plural using extra plural-marking words, even if it doesn’t inflect the noun itself for plural number. You can read about it in this introductory grammar textbook or this academic reference. As it is, we would have to write (các) lối này and (những) người này in parentheses, akin to writing “this/these way(s)” or “this person/these people” in English.

Launchpad probably got the incorrect plural form data from CLDR, which has a known issue in this regard. It’s understandable that JOSM would be unable to fix the plural setting independently of its CLDR-based translation platform. iD is also in the same situation with Transifex. So I’m not requesting that JOSM fix the plural setting just yet, even though it does contribute to the general brokenness of the localization.
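
To make the Plural-Forms header discussed here concrete, the following minimal sketch shows how a gettext-style rule picks the msgstr index for a count n. The first function mirrors the "nplurals=1; plural=0;" rule cited in this ticket. The second uses "nplurals=2; plural=(n != 1);", a common two-form rule shown purely as an assumption; it is not necessarily the rule Launchpad or CLDR would adopt for Vietnamese. Both function names are illustrative.

```shell
# Index chosen under "nplurals=1; plural=0;": always form 0, whatever n is.
plural_index_one_form() {
    echo 0
}

# Index chosen under "nplurals=2; plural=(n != 1);" (assumed two-form rule):
# form 0 for exactly one item, form 1 otherwise.
plural_index_two_forms() {
    echo $(( $1 != 1 ))
}

for n in 1 5; do
    echo "n=$n one-form=$(plural_index_one_form "$n") two-form=$(plural_index_two_forms "$n")"
done
```

With a single form, a translator has only one msgstr slot per pluralized message, which is why the parenthesized (các) and (những) workaround described above becomes necessary.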

Altogether while I cannot decide myself whether Vietnamese in JOSM is correct or not simply because I don't speak Vietnamese your ticket has a lot of points which indicate that what you describe isn't exact either.

I will interpret this sentence as charitably as possible. Since you don’t speak Vietnamese, I’ve provided representative examples, quantitative analysis, and links to reputable supporting material that would hopefully give you some confidence that I’m not just making things up. I understand that deleting a localization wholesale is a somewhat extreme step, but it will motivate the community toward working on a better translation more than what we have now. In the meantime, falling back to another language such as English would set a more accurate expectation with users.

Consider this an undiscussed, botched import that overwrote correct translations and drove away craft-translators and craft-mappers. I don’t fault you for committing vi.lang to the repo, because you didn’t know any better. But now that you’re aware of the problem, the criteria for reverting should be no more stringent than it was for importing it in the first place, right? After all, that’s how it normally works in OSM.

Last edited 2 years ago by 1ec5

comment:7 by Don-vip, 2 years ago

Thank you, Minh, for explaining the issue to us in so much detail.
I agree we should find a way to revert the 2015 change, or if we can't, delete the impacted strings. Then if we have less than 2000 translated core strings, remove the translation until enough strings are translated again. This wouldn't be the first time we remove a translation.

comment:8 by Don-vip, 2 years ago

Milestone: 22.01

comment:9 by nhoccondalonroi@…, 2 years ago

I saw the report from Mr. Minh.
I'm Vietnamese and I agree that many translated strings in this issue are not good.
If possible, please modify them.

in reply to:  2 ; comment:10 by Le Viet Thanh <lethanhx2k@…>, 2 years ago

Replying to stoecker:

Reading your text I have several issues.

  • I will not remove translations for an existing language only because a single somebody opens a ticket and asks to do so. If it's a serious issue for years I'd expect that topic should have come up more than once until now.
  • If the translations are as bad as you say, start an effort to improve them. Vietnamese community should be willing to help to improve situation if it is really as described
  • "JOSM incorrectly considers Vietnamese to lack a plural grammatical form." → JOSM does not define something itself. We take the plural forms from Launchpad and the definition for "vi" is this: https://translations.launchpad.net/+languages/vi which is also reflected in any po-file (Plural-Forms: nplurals=1; plural=0;).

Altogether while I cannot decide myself whether Vietnamese in JOSM is correct or not simply because I don't speak Vietnamese your ticket has a lot of points which indicate that what you describe isn't exact either.

Just to comment on the points that I know of, as one of the Vietnamese mappers and Vi translators for JOSM in ca. 2009-2015:
There were not many active Vi mappers back in that period, nor are there now. Most of the mappers were using the default En interface and then switched to the web version (OSM iD), so no one has been aware of these changes, and I don't think Vietnamese mappers/translators will make the effort to fix the mistranslated words. Looking at some recently translated words, as 1ec5 pointed out, in comparison with the 2015 version, they are complete nonsense in the context of the JOSM interface. I really appreciate 1ec5's enthusiastic contribution to OSM for over a decade and for raising this translation issue. I think the simplest solution would be just to revert the changes back to the 2015 version.

Thanh
Retired OSM mapper (ninomax)

comment:11 by stoecker, 2 years ago

Milestone: 22.01 → 22.02

Milestone renamed

comment:12 by Don-vip, 2 years ago

Milestone: 22.02 → 22.03

comment:13 by stoecker, 2 years ago

Milestone: 22.03 → 22.04

comment:14 by stoecker, 2 years ago

Milestone: 22.04 → 22.05

Milestone renamed

in reply to:  10 ; comment:15 by stoecker, 2 years ago

I think the simplest solution would be just to revert the changes back to the 2015 version.

I don't understand the language. Old texts are available on launchpad. Go ahead with fixing stuff.

in reply to:  15 comment:16 by taylor.smock, 2 years ago

Replying to stoecker:

I think the simplest solution would be just to revert the changes back to the 2015 version.

I don't understand the language. Old texts are available on launchpad. Go ahead with fixing stuff.

I was looking at the API for Launchpad today, with the hope of being able to reset translations done by 1ec5 (Minh Nguyễn) on 2015-05-12. Rather unfortunately, the documentation is lacking, and I haven't figured out how to get to the translations via the API yet.

comment:17 by stoecker, 2 years ago

If native Vietnamese speakers don't improve the translation themselves, it doesn't make much sense to modify it (resetting or whatever). We can only help, but the work must be done by speakers of the relevant language.

Reverting 7 years of changes without knowing the language seems not sensible to me.

What would probably be a sensible approach is somebody knowing the language checking the strings and replacing all unwanted ones with empty string (that's lots less work than translating). Then we could reimport missing 2015 strings.

P.S. Regarding the plural forms somebody should contact Launchpad. I'm sure they can fix it if it is an issue.

comment:18 by stoecker, 2 years ago

NOTE: I'd NOT use the Launchpad interface for such a task, but rather a tool like kbabel.

comment:19 by taylor.smock, 2 years ago

I was looking at writing a script that did the following:

Is the user mxn and is the date 2015-05-12? If so, reset the translation or set the translation to "", depending upon which is possible.

Unfortunately, it doesn't look like they have any API calls to do that (note: I'll be asking them about this on their IRC channel in a bit, so I may have to eat these words).

I wasn't intending to do a blanket reset of the translation.
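
The selection rule described in this comment can be sketched offline, independently of Launchpad. The example below assumes a hypothetical tab-separated export with the columns msgid, translator, date, and msgstr; that format, the function name reset_suspect, and the sample rows are all made up for illustration and do not correspond to any Launchpad API or export.

```shell
# Hypothetical input: tab-separated lines of msgid, translator, date, msgstr.
# This format is an assumption for illustration; it is not a Launchpad export.
reset_suspect() {
    # $1 = translator to match, $2 = date to match (YYYY-MM-DD)
    # Blank the msgstr (4th field) only when translator and date both match;
    # every other line is printed unchanged.
    awk -F '\t' -v OFS='\t' -v user="$1" -v day="$2" \
        '$2 == user && $3 == day { $4 = "" } { print }'
}

# Made-up sample rows: only the first one matches both conditions.
printf 'way\tmxn\t2015-05-12\tcách\nway\tmxn\t2010-06-27\tlối\n' |
    reset_suspect mxn 2015-05-12
```

Matching on both the user and the date is what keeps this from being a blanket reset: the same user's earlier translations pass through untouched.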

comment:20 by taylor.smock, 2 years ago

I've asked a question on launchpad ( https://answers.launchpad.net/launchpad/+question/701972 ) -- IRC #launchpad pointed me there.

@1ec5: You may be asked to comment on the question, just so that they know it isn't some random person taking issue with the translations/translator.

comment:21 by anonymous, 2 years ago

Thanks for any help you can provide here. I don’t really understand the stance that the problem can’t be fixed by non-speakers, seeing as this whole ticket is a plea from translators for help from non-speakers. If it were so easy to fix ourselves, we would’ve done it by now.

If a more nuanced approach turns out to be infeasible, I would be fully supportive of clearing out everything, even the handful of legitimate translations that weren’t overwritten. Prior to this massive upload of incorrect translations, we didn’t have many translations anyways. If we had to start from scratch, I assure you we could easily get to the same point that we were at prior to May 2015 and would quickly make more progress than we have in the past seven years.

Regardless of what happens in Launchpad, deleting the current Vietnamese localization from Subversion would allow the community to stop warning mappers to use an alternative to JOSM.

comment:22 by 1ec5, 2 years ago

(Last comment was from me; forgot to log in.)

comment:23 by stoecker, 2 years ago

I OTOH don't understand why Vietnamese community does nothing to fix the situation. Japanese translation was done by a single person in a few days, so it should be possible for a group of dedicated people to review and to fix translation issues in 5 months.

comment:24 by 1ec5, 2 years ago

It’s much easier to translate from scratch than to wade through over 7,000 strings of gibberish without the ability to track progress. As far as I can tell, Launchpad doesn’t have a way to untranslate strings in bulk. These translations created a stifling atmosphere that dissuaded anyone from touching it. Have you ever looked at a part of OSM where a fly-by-night import ran rampant with so many errors that you just wanted to wipe everything away and start over from scratch?

Honestly, it’s hard to find the motivation to complete an entire localization for JOSM given the incredulous or dismissive initial response. There are plenty of OSM-related projects that work more collaboratively with translators. However, I would be fully supportive of anyone else who wants to take on the challenge of fixing the localization, whether automatically or manually.

It’s also much easier to delete wholesale the localization files from this repository that are known to be gibberish. Apparently there is precedent for deleting localizations. Would you accept a patch to that effect?

comment:25 by stoecker, 2 years ago

Milestone: 22.05

comment:27 by 1ec5, 2 years ago

P.S. Regarding the plural forms somebody should contact Launchpad. I'm sure they can fix it if it is an issue.

Thanks, I didn’t realize Launchpad configures plurals separately from CLDR. I reported the issue to Launchpad.

in reply to:  24 comment:28 by taylor.smock, 2 years ago

Replying to 1ec5:

It’s much easier to translate from scratch than to wade through over 7,000 strings of gibberish without the ability to track progress. As far as I can tell, Launchpad doesn’t have a way to untranslate strings in bulk. These translations created a stifling atmosphere that dissuaded anyone from touching it. Have you ever looked at a part of OSM where a fly-by-night import ran rampant with so many errors that you just wanted to wipe everything away and start over from scratch?

From the link in comment:20 (launchpad 701972), it looks like someone on the launchpad side is looking into the possibility of reverting/untranslating strings in bulk.

As far as tracking progress goes, +filter?person=mxn would probably work (just look for bold text and then 2015-05-12). It's not perfect, so I'm hoping that the people at Launchpad can come up with a better option (AKA removing all translations from mxn on 2015-05-12).

comment:29 by taylor.smock, 2 years ago

Quick update from launchpad 701972:

Another quick update here, this is still being worked on.
New code is now in Production, filed internal RT 151420 to perform a dry run.
Will update here when the actual deletion in Production occurs and translations from this user on that particular date should no longer be visible.

@1ec5: I know this isn't as fast as you would like, but I think this has probably led to a better long-term solution (AKA Launchpad will now have the ability to delete translations from a specific user on a specific date). I don't know if it will be visible in the UI, but it will make it easier to revert accidental/malicious uploads from a user in the future.

comment:30 by 1ec5, 2 years ago

Thank you for the update and for obtaining this upstream fix!

comment:31 by taylor.smock, 2 years ago

Resolution: fixed
Status: new → closed

All translations done by user mxn for vi on 2015-05-12 have been deleted.

Translated: 65 (0.500808999152477%)
Untranslated: 12914 (99.49919100084752%)

This means that we will be removing the vi translation on the next i18n update (see comment:7) unless someone who knows Vietnamese translates enough strings.
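
As a rough illustration of the removal rule, the sketch below recomputes the translated share from the two counts above and applies the 2000-string threshold mentioned in comment:7. It assumes these counts are the ones the threshold applies to (comment:7 speaks of core strings); the function name and the output wording are illustrative and not taken from JOSM's i18n tooling.

```shell
# Sketch of the rule from comment:7; the 2000-string threshold comes from that
# comment, while the function name and output wording are illustrative.
keep_translation() {
    translated=$1
    untranslated=$2
    threshold=${3:-2000}
    # Report the translated share of all strings.
    awk -v t="$translated" -v u="$untranslated" \
        'BEGIN { printf "%d of %d strings translated (%.2f%%)\n", t, t + u, t / (t + u) * 100 }'
    # Succeed only if enough strings are translated to keep the language.
    [ "$translated" -ge "$threshold" ]
}

if keep_translation 65 12914; then
    echo "keep vi"
else
    echo "remove vi until more strings are translated"
fi
```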

comment:32 by 1ec5, 2 years ago

Thank you once again for your help! I’ll remove the warning about JOSM from the OSM Wiki as soon as the next release comes out. I’ve posted a call for translations to the talk-vi mailing list, OSM Vietnam Facebook group, and OSM Asia Telegram chat. Hopefully this will have a happy ending. In the future, if the development team has any questions about how JOSM is being used by Vietnamese speakers or mappers in Vietnam, please don’t hesitate to contact these channels or reach out to me directly.

comment:33 by taylor.smock, 2 years ago

Milestone: 22.06

comment:34 by stoecker, 2 years ago

In 18510/josm:

i18n update, disable vi, fix #21720

comment:35 by stoecker, 2 years ago

In 18511/josm:

remove vi, see #21720

comment:36 by stoecker, 2 years ago

In 35995/osm:

I18n, remove vi, see #21720

comment:37 by stoecker, 2 years ago

In 35996/osm:

remove vi, see #21720
