Merge lp:~pascal-devuyst/apt-ddtp-tools/DebianSync into lp:apt-ddtp-tools

Proposed by Pascal De Vuyst
Status: Needs review
Proposed branch: lp:~pascal-devuyst/apt-ddtp-tools/DebianSync
Merge into: lp:apt-ddtp-tools
Diff against target: 407 lines (+228/-29)
10 files modified
DebianSync (+30/-0)
cmp-translations.py (+4/-4)
po2translation.py (+5/-5)
series.sh (+12/-4)
ubuntu/debian-ddtp-import/ddtp-get.py (+48/-0)
ubuntu/debian-ddtp-import/gen-sync.sh (+6/-6)
ubuntu/debian-ddtp-import/mail-new-ddtp.py (+73/-0)
ubuntu/debian-ddtp-import/nl.avoid (+3/-0)
ubuntu/debian-ddtp-import/sync-missing.py (+36/-0)
ubuntu/debian-ddtp-import/update-from-debian.sh (+11/-10)
To merge this branch: bzr merge lp:~pascal-devuyst/apt-ddtp-tools/DebianSync
Reviewer: Michael Vogt
Review status: Pending
Review via email: mp+162643@code.launchpad.net

Description of the change

Hi Michael,

I have modified your update-from-debian.sh, gen-sync.sh,
po2translation.py and cmp-translations.py scripts in apt-ddtp-tools to
give a list of descriptions missing in Debian:
* Translation-en instead of Packages
* Fixed typos

* update-from-debian.sh: variables for the Debian and Ubuntu mirrors, and
  only get the Translation file for the language set in series.sh
* gen-sync.sh: remove-duplicates.py is missing, so I commented it out
* po2translation.py: mark missing translation chunks as <trans>, like
  ddtp.debian.net does; fixed the extra trailing \n with enumerations;
  Description-en instead of _Description
* cmp-translations.py: make it work with Translation-main-$lang files in
  missing-in-debian and sort the output alphabetically
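
The cmp-translations.py change comes down to which hyphen the language code is taken after. A minimal sketch of the difference, using a hypothetical helper name (`lang_from_filename` is not part of the branch):

```python
import os

def lang_from_filename(path):
    """Return the language suffix of a Translation file name."""
    # rsplit with maxsplit=1 takes everything after the LAST hyphen, so an
    # extra component such as "main" in the middle does not shift the result.
    return os.path.basename(path).rsplit("-", 1)[1]

# Plain Debian file name: both approaches agree.
print(lang_from_filename("sid/main/i18n/Translation-nl"))
# File written to missing-in-debian: rsplit still finds the language.
print(lang_from_filename("missing-in-debian/Translation-main-nl"))
# The old split("-")[1] picks up the component instead of the language.
print(os.path.basename("missing-in-debian/Translation-main-nl").split("-")[1])
```

With the old expression, the sanity check would compare "nl" against "main" and reject the pair as different languages.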

I created ddtp-get.py to fetch a package description from
ddtp.debian.net, mail-new-ddtp.py to send a package description to
Debian using the email interface, and sync-missing.py to send
descriptions missing in Debian back to Debian (as suggested on the
debian-i18n mailing list, debian-i18n@lists.debian.org).
Please read DebianSync.
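
sync-missing.py only mails a translation when the MD5 of the English description returned by DDTP equals the Description-md5 recorded in the Translation file. A minimal sketch of that check, assuming (as the slicing in ddtp-get.py implies) that the digest covers the description text plus one trailing newline. `description_md5` and `matches` are hypothetical helpers, not part of the branch, and the sketch targets Python 3:

```python
import hashlib

def description_md5(description):
    """Hex MD5 of a description, hashed with exactly one trailing newline."""
    # Assumption: the digest covers the text plus a single "\n".
    text = description.rstrip("\n") + "\n"
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def matches(description, recorded_md5):
    """True if the description still corresponds to the recorded digest."""
    return description_md5(description) == recorded_md5
```

A mismatch means the English text in DDTP differs from the one the translation was written for, which is the case the script reports as "md5 doesn't match".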

Please consider including these changes into apt-ddtp-tools and feel
free to improve where necessary.

Thanks in advance,
Pascal

Michael Vogt (mvo) wrote :

Hello Pascal,

thanks a lot for this branch and sorry for my slow reply.

This looks good, but I do have some questions:

=== modified file 'po2translation.py'
--- po2translation.py 2010-12-20 13:22:27 +0000
+++ po2translation.py 2013-05-30 07:21:40 +0000
@@ -41,7 +41,7 @@
                # we have no translation for this chunk
                if only_do_full_translations:
                        return ""
-               short_descr_trans = short_descr
+               short_descr_trans = "<trans>"
        else:
                have_translation = True

Why is this needed? Will this <trans> get filtered out later?
@@ -91,7 +91,7 @@
                        for line in chunk.split("\n"):
                                if line:
                                        new_chunk += " %s\n" % line
-                       chunk = new_chunk
+                       chunk = new_chunk[:-1] # get rid of trailing \n
                # make sure here that chunks are seperated with "\n .\n"
                if long_descr_str == "":
                        long_descr_str = "%s" % chunk

It's probably best to use chunk = new_chunk.rstrip("\n") here, to be robust against chunks that do not end with \n.
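
A small sketch of the difference between the two forms, using made-up chunk strings rather than real package descriptions:

```python
with_newline = " * item one\n * item two\n"
without_newline = " * item one\n * item two"

# When the chunk ends with "\n", both forms drop it.
print(repr(with_newline[:-1]))
print(repr(with_newline.rstrip("\n")))

# When it does not, slicing silently eats the last real character,
# while rstrip leaves the text untouched.
print(repr(without_newline[:-1]))
print(repr(without_newline.rstrip("\n")))
```

One behavioural difference to keep in mind: rstrip("\n") removes every trailing newline, not just one, which is harmless here because the loop never emits empty lines.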

For the sending of mails, I wonder if it wouldn't be easier to just call the sendmail binary (maybe as an optional way) for people who already have an installed and configured sendmail.
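
A rough sketch of what the sendmail route could look like. The path /usr/sbin/sendmail and the helper names are assumptions, and the message is sent as a plain body rather than as the attachment mail-new-ddtp.py builds, which is a simplification:

```python
import subprocess
from email.mime.text import MIMEText

SENDMAIL = "/usr/sbin/sendmail"  # assumed location of the local MTA binary

def build_message(sender, recipient, subject, body):
    """Build a UTF-8 text message with the headers sendmail -t needs."""
    msg = MIMEText(body, "plain", "utf-8")
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    return msg

def send_via_sendmail(msg, sendmail=SENDMAIL):
    """Pipe the message to sendmail and return its exit status."""
    # -t: read recipients from the message headers
    # -oi: do not treat a line containing only "." as end of input
    proc = subprocess.Popen([sendmail, "-t", "-oi"], stdin=subprocess.PIPE)
    proc.communicate(msg.as_string().encode("utf-8"))
    return proc.returncode

if __name__ == "__main__":
    # Nothing is sent here; this only prints the message that would be piped.
    demo = build_message("me@example.org", "pdesc@ddtp.debian.net",
                         "hello nl", "contents of new.ddtp")
    print(demo.as_string())
```

This would remove the need to store SMTP_SERVER, SENDER and PASSWORD in the script for anyone with a working local mail setup.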

It would be nice if the return codes (253, 252, etc.) also had symbolic names somewhere.
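
For illustration, the codes used by ddtp-get.py could be named as follows. The constant and function names are hypothetical; the numeric values and their meanings are taken from the scripts in this branch:

```python
# Exit codes of ddtp-get.py, as used in this branch.
EXIT_USAGE = 1              # package or language argument missing
EXIT_NOT_TRANSLATED = 252   # description exists, no translation yet
EXIT_TRANSLATED = 253       # description already translated in DDTP
EXIT_NOT_IN_DDTP = 254      # package not active or not present in DDTP

def should_skip(returncode):
    """Mirror the check in sync-missing.py: nothing to send in these cases."""
    return returncode in (EXIT_TRANSLATED, EXIT_NOT_IN_DDTP)

print(should_skip(EXIT_TRANSLATED))
print(should_skip(EXIT_NOT_TRANSLATED))
```

Keeping these in one small module that both ddtp-get.py and sync-missing.py import would avoid the bare numbers drifting apart.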

I hope the above makes sense; please let me know if you have any questions. It's a nice touch that you also made the hardcoded Ubuntu mirror configurable. It's only a small part of a large patch, but still a nice touch :)

Out of curiosity, when we send new translations to the email interface, do we have to be careful not to send too much? Will they enter some sort of queue/review process in Debian? Should we notify them before we send the translations back, or have you already started sending them back?

Thanks!
 Michael

Pierre Slamich (pierre-slamich) wrote :

I was also wondering about throttling, to ensure we don't crash upstream's servers.
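
One simple way to throttle would be to enforce a minimum interval between sends in the sync-missing.py loop. A minimal sketch; the `throttled` helper is hypothetical, and any interval chosen would be a guess, since the limits of the DDTP email interface are not stated in this thread:

```python
import time

def throttled(items, min_interval, sleep=time.sleep, clock=time.time):
    """Yield items no faster than one per min_interval seconds."""
    last = None
    for item in items:
        if last is not None:
            # Wait only for whatever part of the interval has not elapsed yet.
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield item

if __name__ == "__main__":
    for pkg in throttled(["pkg-a", "pkg-b", "pkg-c"], 1.0):
        print("would send " + pkg)
```

The sleep and clock parameters exist only so the pacing can be tested without real waiting.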

Unmerged revisions

90. By Pascal De Vuyst

update-from-debian.sh, gen-sync.sh, po2translation.py and cmp-translations.py:
* Translation-en instead of Packages
* Fixed typo's

* update-from-debian.sh: variables for debian and ubuntu mirrors and only get
  Translation for specific language in series.sh
* gen-sync.sh: remove-duplicates.py is missing so commented it out
* po2translation.py: mark missing translation chunks as <trans> like
  ddtp.debian.net does, fixed error of extra trailing \n with enumerations,
  Description-en instead of _Description
* cmp-translations.py: make it work with Translation-main-$lang files in
  missing-in-debian and output alphabetically

I created ddtp-get.py to fetch a package description from ddtp.debian.net,
mail-new-ddtp.py to send a package description to debian using email interface
and sync-missing to send descriptions missing in debian back to debian (as
suggested on <email address hidden>). Please read DebianSync.

Preview Diff

=== added file 'DebianSync'
--- DebianSync 1970-01-01 00:00:00 +0000
+++ DebianSync 2013-05-06 18:35:30 +0000
@@ -0,0 +1,30 @@
+#1) Generate list of translations missing in Debian:
+
+#Preparations:
+sudo apt-get install distro-info lftp
+#Edit apt-ddtp-tools/series.sh and set SERIES, LNG and archive mirrors!
+#Get recent $lang.po from ddtp branch or export them for each $comp by email
+#and link (ln -s) or put them into apt-ddtp-tools/ubuntu/$comp/$lang.po
+
+cd apt-ddtp-tools/ubuntu/debian-ddtp-import
+./update-from-debian.sh
+./gen-sync.sh
+#Translations missing in Debian are listed in
+#./missing-in-debian/Translation-$comp-$lang
+
+#2) Send missing translations to Debian:
+#Preparations: edit ./mail-new-ddtp.py and set SMTP_SERVER, SMTP_PORT, SENDER
+#and PASSWORD
+./sync-missing.py ./missing-in-debian/Translation-$comp-$lang
+
+#3) Get and send a translation directly from/to Debian:
+./ddtp-get.py $package $lang
+#Edit new.ddtp and change the translation:
+editor new.ddtp
+#Send the translation to Debian:
+./mail-new-ddtp.py
+#A file $lang.avoid can be use to avoid words in the translation (frequent
+#errors). The format is word <tab> suggestion. If an avoid word occurs in the
+#translation it will not be send until corrected.
+#./mail-new-ddtp.py -f or ./mail-new-ddtp.py --force can be used to
+#send translations anyway even if avoid words occur.

=== modified file 'cmp-translations.py'
--- cmp-translations.py 2010-12-20 13:25:44 +0000
+++ cmp-translations.py 2013-05-06 18:35:30 +0000
@@ -19,13 +19,13 @@

     if len(sys.argv) < 3:
         print "Need two translation files left-file right-file (missing-from-right)"
-        print "the paramter 'missing-from-right' is option and will create "
+        print "the parameter 'missing-from-right' is option and will create "
         print "a file with the missing Translation stuff from the right hand side"
         sys.exit(1)

     # sanity check
-    lang = os.path.basename(sys.argv[1]).split("-")[1]
-    if lang != os.path.basename(sys.argv[2]).split("-")[1]:
+    lang = os.path.basename(sys.argv[1]).rsplit("-",1)[1]
+    if lang != os.path.basename(sys.argv[2]).rsplit("-",1)[1]:
         print "Different languages, does not make sense to compare them"
         sys.exit(1)

@@ -68,7 +68,7 @@
     # output what is missing in the left
     if len(set(left_hand.keys()).difference(set(right_hand.keys()))) > 0:
         f=open(missing_on_left,"w")
-        for pkg in set(left_hand.keys()).difference(set(right_hand.keys())):
+        for pkg in sorted(set(left_hand.keys()).difference(set(right_hand.keys()))):
             f.write("Package: %s\n" % pkg)
             f.write("Description-md5: %s\n" % left_hand[pkg][0])
             f.write("Description-%s: %s\n\n" % (lang, left_hand[pkg][1]))

=== modified file 'po2translation.py'
--- po2translation.py 2010-12-20 13:22:27 +0000
+++ po2translation.py 2013-05-06 18:35:30 +0000
@@ -41,7 +41,7 @@
                 # we have no translation for this chunk
                 if only_do_full_translations:
                         return ""
-               short_descr_trans = short_descr
+               short_descr_trans = "<trans>"
         else:
                 have_translation = True

@@ -61,7 +61,7 @@
                 if not chunk_translated:
                         if only_do_full_translations:
                                 return ""
-                       chunk_translated = chunk
+                       chunk_translated = "<trans>"
                 else:
                         have_translation=True
                 # strip away any \r and \n\n to avoid anything like bug #195383
@@ -91,7 +91,7 @@
                         for line in chunk.split("\n"):
                                 if line:
                                         new_chunk += " %s\n" % line
-                       chunk = new_chunk
+                       chunk = new_chunk[:-1] # get rid of trailing \n
                 # make sure here that chunks are seperated with "\n .\n"
                 if long_descr_str == "":
                         long_descr_str = "%s" % chunk
@@ -120,7 +120,7 @@
         translated_descriptions = 0
         while parser.step():
                 pkg = parser.section.get("Package")
-               descr = parser.section.get("_Description")
+               descr = parser.section.get("Description-en")
                 # work with a regular packages file too
                 if not descr:
                         descr = parser.section.get("Description")
@@ -136,7 +136,7 @@
                             format="%(asctime)s %(message)s")

         if len(sys.argv) < 4:
-               print "need a Packages file a po file and a language as argument"
+               print "need a Translation-en/Packages file a po file and a language as argument"
                 print "it will search for the mo file in "
                 print "mo/$lang/LC_MESSAGES/packages.mo"
                 sys.exit(1)

=== modified file 'series.sh'
--- series.sh 2013-04-09 18:45:59 +0000
+++ series.sh 2013-05-06 18:35:30 +0000
@@ -1,7 +1,15 @@
-
-# the series to use (default auto-detect)
-#SERIES=$(lsb_release -s -c)
-SERIES="raring"
+#The series to use:
+#SERIES="raring"
+#Current development version short codename (requires distro-info package):
+SERIES=$(distro-info --devel)

 DIST="$SERIES"

+#update-from-debian.sh language:
+LNG=nl
+#All languages:
+#LNG=""
+
+#Archive mirrors to use:
+UBUNTUMIRROR=be.archive.ubuntu.com
+DEBIANMIRROR=ftp.be.debian.org

=== added file 'ubuntu/debian-ddtp-import/ddtp-get.py'
--- ubuntu/debian-ddtp-import/ddtp-get.py 1970-01-01 00:00:00 +0000
+++ ubuntu/debian-ddtp-import/ddtp-get.py 2013-05-06 18:35:30 +0000
@@ -0,0 +1,48 @@
+#!/usr/bin/python
+
+import sys, urllib2, re, hashlib
+
+if len(sys.argv) < 3:
+    print 'please specify package and language'
+    exit(1)
+
+f = urllib2.urlopen('https://ddtp.debian.net/ddt.cgi?package=' + sys.argv[1])
+pkg = f.readlines()
+
+i = None
+for line in pkg:
+    if re.search('active', line):
+        try:
+            i = re.search('id=([0-9]*)',prevline).group(1)
+        except AttributeError:
+            i = None
+        break
+    prevline = line
+
+if i == None:
+    print "package " + sys.argv[1] + " not active or not present in DDTP"
+    exit(254)
+else:
+    print 'Id: ' + i
+
+f = urllib2.urlopen('https://ddtp.debian.net/ddt.cgi?desc_id='+i+'&getuntrans=' + sys.argv[2])
+trans = f.read()
+
+descr = trans[trans.find('Description:')+13:trans.find('Description-')]
+m = hashlib.md5()
+m.update(descr)
+print 'Description-md5: ' + m.hexdigest()
+print trans
+descr2 = ''
+if trans.find('<trans>') == -1:
+    print 'translated to ' + sys.argv[2]
+    descr2 = trans[trans.find('Description-')+14+len(sys.argv[2]):trans.find('\n\n')] + '\n'
+    exitval = 253
+else:
+    print 'not yet translated to ' + sys.argv[2]
+    exitval = 252
+f = open('new.ddtp', 'w')
+message='# Package(s): ' + sys.argv[1] + '\nDescription: ' + descr[:-1] + '\nDescription-' + sys.argv[2] + '.UTF-8: ' + descr2
+f.write(message)
+f.close()
+exit(exitval)

=== modified file 'ubuntu/debian-ddtp-import/gen-sync.sh'
--- ubuntu/debian-ddtp-import/gen-sync.sh 2010-12-20 13:23:42 +0000
+++ ubuntu/debian-ddtp-import/gen-sync.sh 2013-05-06 18:35:30 +0000
@@ -25,17 +25,17 @@
     LANG=${LANG#Translation-}
     LANG=${LANG%.bz2}
     echo "working on $LANG"
-    if ! ../../po2translation sid/main/Packages ../$comp/$LANG.po $LANG --full-translations > tmp/Translation-$LANG; then
+    if ! ../../po2translation sid/main/Translation-en ../$comp/$LANG.po $LANG --full-translations > tmp/Translation-$LANG; then
         continue
     fi
     # now check what translations are there but not in debian and
-    # geneate a file for this
+    # generate a file for this
     ../../cmp-translations.py ${f%.bz2} tmp/Translation-$LANG missing-in-debian/Translation-$comp-$LANG | tee missing-in-debian/Translation-$comp-$LANG.stats

-    # remove duplicates for now (ddtp has a bug)
-    if [ -e missing-in-debian/Translation-$comp-$LANG ]; then
-        ../../remove-duplicates.py sid/main/i18n/Translation-$LANG missing-in-debian/Translation-$comp-$LANG > missing-in-debian/Translation-.no.same.md5.as.in.debian-$comp-${LANG}
-    fi
+    ## remove duplicates for now (ddtp has a bug)
+    #if [ -e missing-in-debian/Translation-$comp-$LANG ]; then
+    #    ../../remove-duplicates.py sid/main/i18n/Translation-$LANG missing-in-debian/Translation-$comp-$LANG > missing-in-debian/Translation-.no.same.md5.as.in.debian-$comp-${LANG}
+    #fi

 done


=== added file 'ubuntu/debian-ddtp-import/mail-new-ddtp.py'
--- ubuntu/debian-ddtp-import/mail-new-ddtp.py 1970-01-01 00:00:00 +0000
+++ ubuntu/debian-ddtp-import/mail-new-ddtp.py 2013-05-06 18:35:30 +0000
@@ -0,0 +1,73 @@
+#!/usr/bin/python
+
+import sys, smtplib
+
+SMTP_SERVER = 'smtp.gmail.com'
+SMTP_PORT = 587
+SENDER = 'xxxx@gmail.com'
+PASSWORD = 'xxxx'
+
+f = open('new.ddtp', 'r')
+s = f.read()
+f.close()
+pkg = s[s.find('Package(s):')+12:s.find('\n')]
+lang = s[s.find('Description-')+12:s.find('.UTF-8')]
+tstart = s.find('Description-')+len(lang)+20
+short = s[tstart:s.find('\n',tstart)]
+trans = s[tstart:s.find('\n\n',tstart)]
+
+if len(short) > 80:
+    print "WARNING: translated short description too long (>80 characters), please edit and resend."
+    exit(1)
+
+try:
+    f = open(lang+'.avoid', 'r')
+    avoid = f.readlines()
+    f.close()
+except IOError:
+    avoid = []
+
+try:
+    if sys.argv[1] == "--force" or sys.argv[1] == "-f":
+        force = True
+except IndexError:
+    force = False
+
+av = False
+for item in avoid:
+    av0 = item.strip('\n').split('\t')[0]
+    av1 = item.strip('\n').split('\t')[1]
+    if av0 in trans:
+        print av0 + " should be translated as " + av1
+        av = True
+        continue
+if av == True and force == False: exit(1)
+
+recipient = 'pdesc@ddtp.debian.net'
+subject = pkg + ' ' + lang
+
+from email.mime.multipart import MIMEMultipart
+msg = MIMEMultipart()
+msg['Subject'] = subject
+msg['To'] = recipient
+msg['From'] = SENDER
+
+from email.MIMEText import MIMEText
+
+filename = "new.ddtp"
+f = open(filename,'r')
+attachment = MIMEText(f.read())
+f.close()
+attachment.add_header('Content-Disposition', 'attachment', filename=filename)
+msg.attach(attachment)
+
+session = smtplib.SMTP(SMTP_SERVER, SMTP_PORT)
+
+session.ehlo()
+session.starttls()
+session.ehlo
+session.login(SENDER, PASSWORD)
+
+session.sendmail(SENDER, recipient, msg.as_string())
+print "sent " + subject
+session.quit()

=== added file 'ubuntu/debian-ddtp-import/nl.avoid'
--- ubuntu/debian-ddtp-import/nl.avoid 1970-01-01 00:00:00 +0000
+++ ubuntu/debian-ddtp-import/nl.avoid 2013-05-06 18:35:30 +0000
@@ -0,0 +1,3 @@
+dummy leeg
+frontend interface
+development ontwikkelings

=== added file 'ubuntu/debian-ddtp-import/sync-missing.py'
--- ubuntu/debian-ddtp-import/sync-missing.py 1970-01-01 00:00:00 +0000
+++ ubuntu/debian-ddtp-import/sync-missing.py 2013-05-06 18:35:30 +0000
@@ -0,0 +1,36 @@
+#!/usr/bin/python
+
+import sys, apt_pkg, subprocess, hashlib
+
+if len(sys.argv) < 2:
+    print 'please specify Translation file'
+    exit()
+
+lang = sys.argv[1].rsplit("-",1)[1]
+
+translation = []
+rt = apt_pkg.ParseTagFile(open(sys.argv[1]))
+while rt.Step():
+    translation.append( (rt.Section.get("Package"),rt.Section.get("Description-md5"),rt.Section.get("Description-"+lang)) )
+
+for item in translation:
+    print "-" * 75
+    r = subprocess.call(['./ddtp-get.py',item[0],lang])
+    if r == 253 or r == 254: #translated | package not active or not present in DDTP
+        continue
+    f = open('new.ddtp', 'r')
+    trans = f.read()
+    f.close()
+    descr = trans[trans.find('Description:')+13:trans.find('Description-')]
+    if r == 252: #not yet translated
+        m = hashlib.md5()
+        m.update(descr)
+        if m.hexdigest() == item[1]:
+            f = open('new.ddtp','a')
+            f.write(item[2]+'\n')
+            f = open('new.ddtp','r')
+            print f.read()
+            f.close()
+            subprocess.call(['./mail-new-ddtp.py','--force'])
+        else:
+            print "md5 doesn't match"

=== modified file 'ubuntu/debian-ddtp-import/update-from-debian.sh'
--- ubuntu/debian-ddtp-import/update-from-debian.sh 2012-04-17 13:22:53 +0000
+++ ubuntu/debian-ddtp-import/update-from-debian.sh 2013-05-06 18:35:30 +0000
@@ -2,25 +2,25 @@

 set -e

-# include series and DIST overide
+# include series and DIST override
 . ../../series.sh

 TRANSLATIONS2PO=`pwd`/../../translation2po

-# get the debian translatins
+# get the debian translations
 #rm -f Translations.tar.gz
 #wget http://ddtp.debian.net/debian/Translations.tar.gz
 #tar xzvf Translations.tar.gz

-# get current packages file
+# get current Translation-en file
 mkdir -p sid/main/ || true
 cd sid/main
-rm -f Packages
-wget -c ftp://ftp.de.debian.org/debian/dists/sid/main/binary-i386/Packages.gz
-gunzip Packages.gz
+rm -f Translation-en
+wget -c "ftp://$DEBIANMIRROR/debian/dists/sid/main/i18n/Translation-en.bz2"
+bunzip2 Translation-en.bz2
 cd ../..

-# *sigh* now that debian does no longer publish the transltions tarball
+# *sigh* now that debian does no longer publish the translations tarball
 # on debian.net we need to get it elsewhere
 mkdir -p sid/main/i18n/ || true
 cd sid/main/i18n ;
@@ -28,7 +28,8 @@
 # mget -c is playing tricks on me
 rm -f *.bz2

-echo "mget -c *.bz2"|lftp 'ftp://ftp.de.debian.org/debian/dists/sid/main/i18n'
+# if $LNG only get Translation for that language
+echo "mget -c *$LNG.bz2"|lftp "ftp://$DEBIANMIRROR/debian/dists/sid/main/i18n"
 for i in *.bz2; do
     bunzip2 -c $i > ${i%.bz2}
 done
@@ -49,12 +50,12 @@
     rm -f $comp/*
     cd $comp
     # the translations file
-    wget http://archive.ubuntu.com/ubuntu/dists/$DIST/$comp/i18n/Translation-en
+    wget "http://$UBUNTUMIRROR/ubuntu/dists/$DIST/$comp/i18n/Translation-en"
     for f in ../sid/main/i18n/Translation-*.bz2; do
         file=${f%.bz2}
         filename=$(basename $file)
         lang=${filename#Translation-}
-        echo "$TRANSLATIONS2PO Packages $file $lang >$lang.po"
+        echo "$TRANSLATIONS2PO Translation-en $file $lang >$lang.po"
         $TRANSLATIONS2PO Translation-en $file $lang >$lang.po
     done
     cd ..
