Merge lp:~mvo/unity-scope-click/get-language-base-exceptions into lp:unity-scope-click/devel

Proposed by Michael Vogt
Status: Rejected
Rejected by: dobey
Proposed branch: lp:~mvo/unity-scope-click/get-language-base-exceptions
Merge into: lp:unity-scope-click/devel
Diff against target: 47 lines (+14/-1)
3 files modified
scope/click/configuration.cpp (+6/-0)
scope/tests/test_configuration.cpp (+7/-0)
scope/tests/test_reviews.cpp (+1/-1)
To merge this branch: bzr merge lp:~mvo/unity-scope-click/get-language-base-exceptions
Reviewer Review Type Date Requested Status
dobey (community) Disapprove
PS Jenkins bot (community) continuous-integration Needs Fixing
Review via email:

Description of the change

This branch ensures that review results for zh_CN/zh_TW and pt/pt_BR are not mixed but instead handled as their own base-languages. This is what we have done in software-center too.

If for some reason this is no longer needed please let me know and just close the MP.

Thanks for your consideration,

To post a comment you must log in.
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
dobey (dobey) wrote :

9 + // some languages are special

I'd like to understand better why some languages are special, while others aren't. I can somewhat understand why zh_CN and zh_TW fit in here, as the same character may be a different word/meaning in the different sets. However, I don't understand why pt_BR should be special here, and why pt_PT, es_*, en_* and other languages which are spoken in a wider variety of countries with different dialects, should not be handled specially.

I wasn't involved in the software-center code that does this; have never seen it; and didn't even realize that it was being done there.

So if we must do this, I'd at least like to better understand why, and also have that rationale written into the comment in the code, as "some languages are special" doesn't really say much.

Also, I think instead of get_language_base() returning the full language string, we should perhaps handle this specifically where we need to handle such languages in a special manner (in get_accept_languages or where we build the JSON when submitting the review, for example).

review: Needs Information
Revision history for this message
dobey (dobey) wrote :

Also, if this is an issue we need to deal with, I think we need a bug report.

Revision history for this message
David Planella (dpm) wrote :

It depends on the teams. Some context for the languages I'm aware of:

- zh_TW/CN - I think this example is clear for everyone, as we're talking of Simplified Chinese vs. Traditional Chinese, each of which has a different character set. There is a translation team for each one of those.
- pt/_BR - Portuguese from Portugal vs. Brazilian Portuguese. Indeed the language can be understood on both countries, but my understanding is that there are enough regional differences in expressions that grant the division in two separate translation teams.
- en_US/_UK - while there is no US translation per se as obviously the original messages are in en_US, there is an active British translation team that translate. I had fun a couple of years back moderating the "Trash" vs. "Rubbish bin" discussion...
- es_* - There are plenty of Spanish dialects, but quite a while ago the Spanish teams (at least in GNOME and Ubuntu) decided to join efforts and ship translations only in the main dialect. That's why there is only one Spanish translation team in Ubuntu and there are virtually no translations in other dialects.
- ca/ca@valencia - done by the same translation team, but the ca@valencia translations are provided as otherwise the Valencian government and institutions would not use the ca locale and would provide a non-standard one instead.

Whenever there is a new team that requests the creation of a _CC (country) locale, we do ask them for legitimate reasons and to talk to any existing ll (language) team that covers the associated locale before creating a new team.

So in summary, it'd be nice to have one team that would cover all dialects, but languages and teams are more organic in that sense, especially when countries and politics are a factor in the creation of the locales and associated translation teams.

Revision history for this message
Michael Vogt (mvo) wrote :

I added bug #1321154 as requested.

Revision history for this message
dobey (dobey) wrote :

David, this isn't about translations of the app in Launchpad exactly, I don't think. The issue here is mainly with how we are submitting reviews to the server, and possibly with how we are fetching those reviews (though currently, the server is not filtering on Accept-Language and simply returns all reviews).

However, returning the full language string from get_language_base() is I think the wrong answer in any case. It should always return the base string for the language. Changing it can break other things (such as get_accept_languages()). I'm disapproving this branch, because I think this is the wrong place/way to do this.

review: Disapprove
Revision history for this message
dobey (dobey) wrote :

I've submitted which implements this in a better way I think.

Unmerged revisions

263. By Michael Vogt

For click reviews get_language_base() is used. But the above languages
are different enough so that they should not get results from the
generic zh/pt language.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'scope/click/configuration.cpp'
2--- scope/click/configuration.cpp 2014-05-01 21:04:23 +0000
3+++ scope/click/configuration.cpp 2014-05-19 13:50:48 +0000
4@@ -90,6 +90,12 @@
5 std::string Configuration::get_language_base()
6 {
7 std::string language = get_language();
9+ // some languages are special
10+ std::vector<std::string> exceptions = {"zh_CN", "zh_TW", "pt_BR"};
11+ if(std::find(exceptions.begin(), exceptions.end(), language) != exceptions.end())
12+ return language;
14 std::vector<std::string> lang_parts;
15 boost::split(lang_parts, language, boost::is_any_of("_"));
16 return lang_parts[0];
18=== modified file 'scope/tests/test_configuration.cpp'
19--- scope/tests/test_configuration.cpp 2014-05-01 21:04:23 +0000
20+++ scope/tests/test_configuration.cpp 2014-05-19 13:50:48 +0000
21@@ -126,6 +126,13 @@
22 ASSERT_EQ(unsetenv(Configuration::LANGUAGE_ENVVAR), 0);
23 }
25+TEST(Configuration, getLanguageBaseCornerCase)
27+ ASSERT_EQ(setenv(Configuration::LANGUAGE_ENVVAR, "zh_CN.UTF-8", 1), 0);
28+ EXPECT_EQ(Configuration().get_language_base(), "zh_CN");
29+ ASSERT_EQ(unsetenv(Configuration::LANGUAGE_ENVVAR), 0);
32 TEST(Configuration, getLanguageBaseUnsetFallback)
33 {
34 ASSERT_EQ(unsetenv(Configuration::LANGUAGE_ENVVAR), 0);
36=== modified file 'scope/tests/test_reviews.cpp'
37--- scope/tests/test_reviews.cpp 2014-04-29 03:17:38 +0000
38+++ scope/tests/test_reviews.cpp 2014-05-19 13:50:48 +0000
39@@ -275,7 +275,7 @@
41 ASSERT_EQ(setenv(click::Configuration::LANGUAGE_ENVVAR,
42 "zh_TW.UTF-8", 1), 0);
43- std::string expected_language = "\"language\":\"zh\"";
44+ std::string expected_language = "\"language\":\"zh_TW\"";
45 EXPECT_CALL(*clientPtr, callImpl(_, "POST", true, _,
46 HasSubstr(expected_language), _))
47 .Times(1)


People subscribed via source and target branches

to all changes: