Merge lp:~miurahr/calibre/recipes into lp:calibre
Proposed by
Hiroshi Miura
Status: | Merged |
---|---|
Merged at revision: | 6971 |
Proposed branch: | lp:~miurahr/calibre/recipes |
Merge into: | lp:calibre |
Diff against target: |
898 lines (+805/-0) 14 files modified
resources/recipes/cnetjapan.recipe (+30/-0)
resources/recipes/endgadget_ja.recipe (+20/-0)
resources/recipes/jijinews.recipe (+24/-0)
resources/recipes/mainichi.recipe (+24/-0)
resources/recipes/mainichi_it_news.recipe (+16/-0)
resources/recipes/msnsankei.recipe (+22/-0)
resources/recipes/nikkei_free.recipe (+58/-0)
resources/recipes/nikkei_sub_economy.recipe (+111/-0)
resources/recipes/nikkei_sub_industry.recipe (+109/-0)
resources/recipes/nikkei_sub_life.recipe (+110/-0)
resources/recipes/nikkei_sub_main.recipe (+103/-0)
resources/recipes/nikkei_sub_sports.recipe (+110/-0)
resources/recipes/reuters_ja.recipe (+37/-0)
resources/recipes/the_h.recipe (+31/-0) |
To merge this branch: | bzr merge lp:~miurahr/calibre/recipes |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Kovid Goyal | | | Pending |
Review via email: mp+41555@code.launchpad.net |
Commit message
Description of the change
Introduced 15 Japanese recipes with some cover images and icons.
- CNET Japan
- Engadget Japan
- MSN Sankei News
- Reuters Japan
- Nikkei(Free)
- Nikkei::Sports
- Nikkei::Industry
- Nikkei::Life
- Nikkei::Economy
- Nikkei::Headline
- Jiji Express
- Mainichi Daily News
- Mainichi Daily News::IT and electronics
Removed nikkei_sub.recipe, which was problematic because it fetched too many feeds at once.
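The split of the single Nikkei subscription recipe into per-section recipes can be pictured with a small sketch. It is plain Python and does not use calibre. `SECTION_HEADS`, `feeds_for` and `BASE_URL` are hypothetical names introduced only for illustration; the `head=` keys and the per-feed limit of 20 are copied from the economy, industry and life recipes in the diff below, and the feed titles here are simply the head keys rather than the Japanese titles the recipes use.

```python
# Illustrative sketch only: not part of the merged branch.
# Groups the zou3.net feed keys by section, as the per-section recipes do.
BASE_URL = 'http://www.zou3.net/php/rss/nikkei2rss.php?head='

# Hypothetical mapping; the keys on the right are taken from the diff.
SECTION_HEADS = {
    'economy': ['seiji', 'zaimu', 'keizai', 'market',
                'koyou', 'kyouiku', 'okuyami', 'zinzi'],
    'industry': ['sangyo', 'newpro', 'internet', 'kaigai', 'kagaku'],
    'life': ['kurashi', 'sports', 'shakai', 'eco',
             'kenkou', 'special', 'ranking'],
}

# Value of max_articles_per_feed in the Nikkei recipes of this diff.
MAX_ARTICLES_PER_FEED = 20


def feeds_for(section):
    """Return (title, url) pairs in the shape a recipe's `feeds` list uses."""
    return [(head, BASE_URL + head) for head in SECTION_HEADS[section]]


if __name__ == '__main__':
    for section in sorted(SECTION_HEADS):
        feeds = feeds_for(section)
        # Upper bound on articles one download of this section can fetch.
        print(section, len(feeds), 'feeds,',
              len(feeds) * MAX_ARTICLES_PER_FEED, 'articles at most')
```

Each section recipe then downloads only its own handful of feeds, so a single run stays far smaller than one recipe that pulls every feed at once.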
Preview Diff
1 | === added file 'resources/images/news/cnetjapan.png' |
2 | Binary files resources/images/news/cnetjapan.png 1970-01-01 00:00:00 +0000 and resources/images/news/cnetjapan.png 2010-11-23 07:00:03 +0000 differ |
3 | === added file 'resources/images/news/endgadget_ja.png' |
4 | Binary files resources/images/news/endgadget_ja.png 1970-01-01 00:00:00 +0000 and resources/images/news/endgadget_ja.png 2010-11-23 07:00:03 +0000 differ |
5 | === added file 'resources/images/news/jijinews.png' |
6 | Binary files resources/images/news/jijinews.png 1970-01-01 00:00:00 +0000 and resources/images/news/jijinews.png 2010-11-23 07:00:03 +0000 differ |
7 | === added file 'resources/images/news/msnsankei.png' |
8 | Binary files resources/images/news/msnsankei.png 1970-01-01 00:00:00 +0000 and resources/images/news/msnsankei.png 2010-11-23 07:00:03 +0000 differ |
9 | === added file 'resources/images/news/nikkei_free.png' |
10 | Binary files resources/images/news/nikkei_free.png 1970-01-01 00:00:00 +0000 and resources/images/news/nikkei_free.png 2010-11-23 07:00:03 +0000 differ |
11 | === added file 'resources/images/news/nikkei_sub_economy.png' |
12 | Binary files resources/images/news/nikkei_sub_economy.png 1970-01-01 00:00:00 +0000 and resources/images/news/nikkei_sub_economy.png 2010-11-23 07:00:03 +0000 differ |
13 | === added file 'resources/images/news/nikkei_sub_industory.png' |
14 | Binary files resources/images/news/nikkei_sub_industory.png 1970-01-01 00:00:00 +0000 and resources/images/news/nikkei_sub_industory.png 2010-11-23 07:00:03 +0000 differ |
15 | === added file 'resources/images/news/nikkei_sub_life.png' |
16 | Binary files resources/images/news/nikkei_sub_life.png 1970-01-01 00:00:00 +0000 and resources/images/news/nikkei_sub_life.png 2010-11-23 07:00:03 +0000 differ |
17 | === added file 'resources/images/news/nikkei_sub_main.png' |
18 | Binary files resources/images/news/nikkei_sub_main.png 1970-01-01 00:00:00 +0000 and resources/images/news/nikkei_sub_main.png 2010-11-23 07:00:03 +0000 differ |
19 | === added file 'resources/images/news/nikkei_sub_sports.png' |
20 | Binary files resources/images/news/nikkei_sub_sports.png 1970-01-01 00:00:00 +0000 and resources/images/news/nikkei_sub_sports.png 2010-11-23 07:00:03 +0000 differ |
21 | === added file 'resources/images/news/reuters.png' |
22 | Binary files resources/images/news/reuters.png 1970-01-01 00:00:00 +0000 and resources/images/news/reuters.png 2010-11-23 07:00:03 +0000 differ |
23 | === added file 'resources/images/news/reuters_ja.png' |
24 | Binary files resources/images/news/reuters_ja.png 1970-01-01 00:00:00 +0000 and resources/images/news/reuters_ja.png 2010-11-23 07:00:03 +0000 differ |
25 | === added file 'resources/recipes/cnetjapan.recipe' |
26 | --- resources/recipes/cnetjapan.recipe 1970-01-01 00:00:00 +0000 |
27 | +++ resources/recipes/cnetjapan.recipe 2010-11-23 07:00:03 +0000 |
28 | @@ -0,0 +1,30 @@ |
29 | +import re; |
30 | + |
31 | +class CNetJapan(BasicNewsRecipe): |
32 | + title = u'CNET Japan' |
33 | + oldest_article = 3 |
34 | + max_articles_per_feed = 30 |
35 | + |
36 | + feeds = [(u'cnet rss', u'http://feeds.japan.cnet.com/cnet/rss')] |
37 | + language = 'ja' |
38 | + encoding = 'Shift_JIS' |
39 | + remove_javascript = True |
40 | + |
41 | + preprocess_regexps = [ |
42 | + (re.compile(ur'<!--\u25B2contents_left END\u25B2-->.*</body>', re.DOTALL|re.IGNORECASE|re.UNICODE), |
43 | + lambda match: '</body>'), |
44 | + (re.compile(r'<!--AD_ELU_HEADER-->.*</body>', re.DOTALL|re.IGNORECASE), |
45 | + lambda match: '</body>'), |
46 | + (re.compile(ur'<!-- \u25B2\u95A2\u9023\u30BF\u30B0\u25B2 -->.*<!-- \u25B2ZDNet\u25B2 -->', re.UNICODE), |
47 | + lambda match: '<!-- removed -->'), |
48 | + ] |
49 | + |
50 | + remove_tags_before = dict(name="h2") |
51 | + remove_tags = [ |
52 | + {'class':"social_bkm_share"}, |
53 | + {'class':"social_bkm_print"}, |
54 | + {'class':"block20 clearfix"}, |
55 | + dict(name="div",attrs={'id':'bookreview'}), |
56 | + ] |
57 | + remove_tags_after = {'class':"block20"} |
58 | + |
59 | |
60 | === added file 'resources/recipes/endgadget_ja.recipe' |
61 | --- resources/recipes/endgadget_ja.recipe 1970-01-01 00:00:00 +0000 |
62 | +++ resources/recipes/endgadget_ja.recipe 2010-11-23 07:00:03 +0000 |
63 | @@ -0,0 +1,20 @@ |
64 | +#!/usr/bin/env python |
65 | + |
66 | +__license__ = 'GPL v3' |
67 | +__copyright__ = '2010, Hiroshi Miura <miurahr@linux.com>' |
68 | +''' |
69 | +japan.engadget.com |
70 | +''' |
71 | + |
72 | +from calibre.web.feeds.news import BasicNewsRecipe |
73 | + |
74 | +class EndgadgetJapan(BasicNewsRecipe): |
75 | + title = u'Endgadget\u65e5\u672c\u7248' |
76 | + cover_url = 'http://skins18.wincustomize.com/1/49/149320/29/7578/preview-29-7578.jpg' |
77 | + masthead_url = 'http://www.blogsmithmedia.com/japanese.engadget.com/media/eng-jp-logo-t.png' |
78 | + oldest_article = 7 |
79 | + max_articles_per_feed = 100 |
80 | + no_stylesheets = True |
81 | + language = 'ja' |
82 | + encoding = 'utf-8' |
83 | + feeds = [(u'engadget', u'http://japanese.engadget.com/rss.xml')] |
84 | |
85 | === added file 'resources/recipes/jijinews.recipe' |
86 | --- resources/recipes/jijinews.recipe 1970-01-01 00:00:00 +0000 |
87 | +++ resources/recipes/jijinews.recipe 2010-11-23 07:00:03 +0000 |
88 | @@ -0,0 +1,24 @@ |
89 | +#!/usr/bin/env python |
90 | + |
91 | +__license__ = 'GPL v3' |
92 | +__copyright__ = '2010, Hiroshi Miura <miurahr@linux.com>' |
93 | +''' |
94 | +www.jiji.com |
95 | +''' |
96 | + |
97 | +class JijiDotCom(BasicNewsRecipe): |
98 | + title = u'\u6642\u4e8b\u901a\u4fe1' |
99 | + __author__ = 'Hiroshi Miura' |
100 | + description = 'World News from Jiji Press' |
101 | + publisher = 'Jiji Press Ltd.' |
102 | + category = 'news' |
103 | + encoding = 'utf-8' |
104 | + oldest_article = 6 |
105 | + max_articles_per_feed = 100 |
106 | + language = 'ja' |
107 | + cover_url = 'http://www.jiji.com/img/top_header_logo2.gif' |
108 | + masthead_url = 'http://jen.jiji.com/images/logo_jijipress.gif' |
109 | + |
110 | + feeds = [(u'\u30cb\u30e5\u30fc\u30b9', u'http://www.jiji.com/rss/ranking.rdf')] |
111 | + remove_tags_after = dict(id="ad_google") |
112 | + |
113 | |
114 | === added file 'resources/recipes/mainichi.recipe' |
115 | --- resources/recipes/mainichi.recipe 1970-01-01 00:00:00 +0000 |
116 | +++ resources/recipes/mainichi.recipe 2010-11-23 07:00:03 +0000 |
117 | @@ -0,0 +1,24 @@ |
118 | +#!/usr/bin/env python |
119 | + |
120 | +__license__ = 'GPL v3' |
121 | +__copyright__ = '2010, Hiroshi Miura <miurahr@linux.com>' |
122 | +''' |
123 | +www.mainichi.jp |
124 | +''' |
125 | + |
126 | +class MainichiDailyNews(BasicNewsRecipe): |
127 | + title = u'\u6bce\u65e5\u65b0\u805e' |
128 | + __author__ = 'Hiroshi Miura' |
129 | + oldest_article = 2 |
130 | + max_articles_per_feed = 20 |
131 | + description = 'Japanese traditional newspaper Mainichi Daily News' |
132 | + publisher = 'Mainichi Daily News' |
133 | + category = 'news, japan' |
134 | + language = 'ja' |
135 | + |
136 | + feeds = [(u'daily news', u'http://mainichi.jp/rss/etc/flash.rss')] |
137 | + |
138 | + remove_tags_before = {'class':"NewsTitle"} |
139 | + remove_tags = [{'class':"RelatedArticle"}] |
140 | + remove_tags_after = {'class':"Credit"} |
141 | + |
142 | |
143 | === added file 'resources/recipes/mainichi_it_news.recipe' |
144 | --- resources/recipes/mainichi_it_news.recipe 1970-01-01 00:00:00 +0000 |
145 | +++ resources/recipes/mainichi_it_news.recipe 2010-11-23 07:00:03 +0000 |
146 | @@ -0,0 +1,16 @@ |
147 | +class MainichiDailyITNews(BasicNewsRecipe): |
148 | + title = u'\u6bce\u65e5\u65b0\u805e(IT&\u5bb6\u96fb)' |
149 | + __author__ = 'Hiroshi Miura' |
150 | + oldest_article = 2 |
151 | + max_articles_per_feed = 100 |
152 | + description = 'Japanese traditional newspaper Mainichi Daily News - IT and electronics' |
153 | + publisher = 'Mainichi Daily News' |
154 | + category = 'news, Japan, IT, Electronics' |
155 | + language = 'ja' |
156 | + |
157 | + feeds = [(u'IT News', u'http://mainichi.pheedo.jp/f/mainichijp_electronics')] |
158 | + |
159 | + remove_tags_before = {'class':"NewsTitle"} |
160 | + remove_tags = [{'class':"RelatedArticle"}] |
161 | + remove_tags_after = {'class':"Credit"} |
162 | + |
163 | |
164 | === added file 'resources/recipes/msnsankei.recipe' |
165 | --- resources/recipes/msnsankei.recipe 1970-01-01 00:00:00 +0000 |
166 | +++ resources/recipes/msnsankei.recipe 2010-11-23 07:00:03 +0000 |
167 | @@ -0,0 +1,22 @@ |
168 | +#!/usr/bin/env python |
169 | + |
170 | +__license__ = 'GPL v3' |
171 | +__copyright__ = '2010, Hiroshi Miura <miurahr@linux.com>' |
172 | +''' |
173 | +sankei.jp.msn.com |
174 | +''' |
175 | + |
176 | +class MSNSankeiNewsProduct(BasicNewsRecipe): |
177 | + title = u'MSN\u7523\u7d4c\u30cb\u30e5\u30fc\u30b9(\u65b0\u5546\u54c1)' |
178 | + __author__ = 'Hiroshi Miura' |
179 | + description = 'Products release from Japan' |
180 | + oldest_article = 7 |
181 | + max_articles_per_feed = 100 |
182 | + encoding = 'Shift_JIS' |
183 | + language = 'ja' |
184 | + |
185 | + feeds = [(u'\u65b0\u5546\u54c1', u'http://sankei.jp.msn.com/rss/news/release.xml')] |
186 | + |
187 | + remove_tags_before = dict(id="__r_article_title__") |
188 | + remove_tags_after = dict(id="ajax_release_news") |
189 | + remove_tags = [{'class':"parent chromeCustom6G"}] |
190 | |
191 | === added file 'resources/recipes/nikkei_free.recipe' |
192 | --- resources/recipes/nikkei_free.recipe 1970-01-01 00:00:00 +0000 |
193 | +++ resources/recipes/nikkei_free.recipe 2010-11-23 07:00:03 +0000 |
194 | @@ -0,0 +1,58 @@ |
195 | +#!/usr/bin/env python |
196 | + |
197 | +__license__ = 'GPL v3' |
198 | +__copyright__ = '2010, Hiroshi Miura <miurahr@linux.com>' |
199 | +''' |
200 | +www.nikkei.com |
201 | +''' |
202 | + |
203 | +class NikkeiNet(BasicNewsRecipe): |
204 | + title = u'\u65e5\u7d4c\u65b0\u805e\u96fb\u5b50\u7248(Free)' |
205 | + __author__ = 'Hiroshi Miura' |
206 | + description = 'News and current market affairs from Japan' |
207 | + cover_url = 'http://parts.nikkei.com/parts/ds/images/common/logo_r1.svg' |
208 | + masthead_url = 'http://parts.nikkei.com/parts/ds/images/common/logo_r1.svg' |
209 | + oldest_article = 2 |
210 | + max_articles_per_feed = 20 |
211 | + language = 'ja' |
212 | + |
213 | + feeds = [ (u'\u65e5\u7d4c\u4f01\u696d', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=sangyo'), |
214 | + (u'\u65e5\u7d4c\u88fd\u54c1', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=newpro'), |
215 | + (u'internet', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=internet'), |
216 | + (u'\u653f\u6cbb', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=seiji'), |
217 | + (u'\u8ca1\u52d9', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=zaimu'), |
218 | + (u'\u7d4c\u6e08', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=keizai'), |
219 | + (u'\u56fd\u969b', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=kaigai'), |
220 | + (u'\u79d1\u5b66', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=kagaku'), |
221 | + (u'\u30de\u30fc\u30b1\u30c3\u30c8', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=market'), |
222 | + (u'\u304f\u3089\u3057', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=kurashi'), |
223 | + (u'\u30b9\u30dd\u30fc\u30c4', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=sports'), |
224 | + (u'\u793e\u4f1a', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=shakai'), |
225 | + (u'\u30a8\u30b3', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=eco'), |
226 | + (u'\u5065\u5eb7', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=kenkou'), |
227 | + (u'\u96c7\u7528', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=koyou'), |
228 | + (u'\u6559\u80b2', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=kyouiku'), |
229 | + (u'\u304a\u304f\u3084\u307f', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=okuyami'), |
230 | + (u'\u4eba\u4e8b', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=zinzi'), |
231 | + (u'\u7279\u96c6', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=special'), |
232 | + (u'\u5730\u57df\u30cb\u30e5\u30fc\u30b9', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=local'), |
233 | + (u'\u7d71\u8a08\u30fb\u767d\u66f8', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=report'), |
234 | + (u'\u30e9\u30f3\u30ad\u30f3\u30b0', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=ranking'), |
235 | + (u'\u4f1a\u898b', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=interview'), |
236 | + (u'\u793e\u8aac\u30fb\u6625\u79cb', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=shasetsu'), |
237 | + (u'\u30b9\u30dd\u30fc\u30c4\uff1a\u30d7\u30ed\u91ce\u7403', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=baseball'), |
238 | + (u'\u30b9\u30dd\u30fc\u30c4\uff1a\u5927\u30ea\u30fc\u30b0', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=mlb'), |
239 | + (u'\u30b9\u30dd\u30fc\u30c4\uff1a\u30b5\u30c3\u30ab\u30fc', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=soccer'), |
240 | + (u'\u30b9\u30dd\u30fc\u30c4\uff1a\u30b4\u30eb\u30d5', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=golf'), |
241 | + (u'\u30b9\u30dd\u30fc\u30c4\uff1a\u76f8\u64b2', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=sumou'), |
242 | + (u'\u30b9\u30dd\u30fc\u30c4\uff1a\u7af6\u99ac', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=keiba'), |
243 | + (u'\u8abf\u67fb\u30fb\u30a2\u30f3\u30b1\u30fc\u30c8', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=research') |
244 | + ] |
245 | + |
246 | + remove_tags_before = dict(id="CONTENTS") |
247 | + remove_tags = [ |
248 | + dict(name="form"), |
249 | + {'class':"cmn-hide"}, |
250 | + ] |
251 | + remove_tags_after = {'class':"cmn-pr_list"} |
252 | + |
253 | |
254 | === added file 'resources/recipes/nikkei_sub_economy.recipe' |
255 | --- resources/recipes/nikkei_sub_economy.recipe 1970-01-01 00:00:00 +0000 |
256 | +++ resources/recipes/nikkei_sub_economy.recipe 2010-11-23 07:00:03 +0000 |
257 | @@ -0,0 +1,111 @@ |
258 | +#!/usr/bin/env python |
259 | + |
260 | +__license__ = 'GPL v3' |
261 | +__copyright__ = '2010, Hiroshi Miura <miurahr@linux.com>' |
262 | +''' |
263 | +www.nikkei.com |
264 | +''' |
265 | + |
266 | +import string, re, sys |
267 | +from calibre import strftime |
268 | +from calibre.web.feeds.recipes import BasicNewsRecipe |
269 | +import mechanize |
270 | +from calibre.ptempfile import PersistentTemporaryFile |
271 | + |
272 | + |
273 | +class NikkeiNet_sub_economy(BasicNewsRecipe): |
274 | + title = u'\u65e5\u7d4c\u65b0\u805e\u96fb\u5b50\u7248(\u7d4c\u6e08)' |
275 | + __author__ = 'Hiroshi Miura' |
276 | + description = 'News and current market affairs from Japan' |
277 | + cover_url = 'http://parts.nikkei.com/parts/ds/images/common/logo_r1.svg' |
278 | + masthead_url = 'http://parts.nikkei.com/parts/ds/images/common/logo_r1.svg' |
279 | + needs_subscription = True |
280 | + oldest_article = 2 |
281 | + max_articles_per_feed = 20 |
282 | + language = 'ja' |
283 | + remove_javascript = False |
284 | + temp_files = [] |
285 | + |
286 | + remove_tags_before = {'class':"cmn-section cmn-indent"} |
287 | + remove_tags = [ |
288 | + {'class':"JSID_basePageMove JSID_baseAsyncSubmit cmn-form_area JSID_optForm_utoken"}, |
289 | + {'class':"cmn-article_keyword cmn-clearfix"}, |
290 | + {'class':"cmn-print_headline cmn-clearfix"}, |
291 | + ] |
292 | + remove_tags_after = {'class':"cmn-pr_list"} |
293 | + |
294 | + feeds = [ (u'\u653f\u6cbb', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=seiji'), |
295 | + (u'\u8ca1\u52d9', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=zaimu'), |
296 | + (u'\u7d4c\u6e08', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=keizai'), |
297 | + (u'\u30de\u30fc\u30b1\u30c3\u30c8', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=market'), |
298 | + (u'\u96c7\u7528', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=koyou'), |
299 | + (u'\u6559\u80b2', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=kyouiku'), |
300 | + (u'\u304a\u304f\u3084\u307f', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=okuyami'), |
301 | + (u'\u4eba\u4e8b', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=zinzi'), |
302 | + ] |
303 | + |
304 | + def get_browser(self): |
305 | + br = BasicNewsRecipe.get_browser() |
306 | + |
307 | + cj = mechanize.LWPCookieJar() |
308 | + br.set_cookiejar(cj) |
309 | + |
310 | + #br.set_debug_http(True) |
311 | + #br.set_debug_redirects(True) |
312 | + #br.set_debug_responses(True) |
313 | + |
314 | + if self.username is not None and self.password is not None: |
315 | + #print "----------------------------get login form--------------------------------------------" |
316 | + # open login form |
317 | + br.open('https://id.nikkei.com/lounge/nl/base/LA0010.seam') |
318 | + response = br.response() |
319 | + #print "----------------------------get login form---------------------------------------------" |
320 | + #print "----------------------------set login form---------------------------------------------" |
321 | + # remove disabled input which brings error on mechanize |
322 | + response.set_data(response.get_data().replace("<input id=\"j_id48\"", "<!-- ")) |
323 | + response.set_data(response.get_data().replace("gm_home_on.gif\" />", " -->")) |
324 | + br.set_response(response) |
325 | + br.select_form(name='LA0010Form01') |
326 | + br['LA0010Form01:LA0010Email'] = self.username |
327 | + br['LA0010Form01:LA0010Password'] = self.password |
328 | + br.form.find_control(id='LA0010Form01:LA0010AutoLoginOn',type="checkbox").get(nr=0).selected = True |
329 | + br.submit() |
330 | + response1 = br.response() |
331 | + #print "----------------------------send login form---------------------------------------------" |
332 | + #print "----------------------------open news main page-----------------------------------------" |
333 | + # open news site |
334 | + br.open('http://www.nikkei.com/') |
335 | + response2 = br.response() |
336 | + #print "----------------------------www.nikkei.com BODY --------------------------------------" |
337 | + #print response2.get_data() |
338 | + #print "-------------------------^^-got auto redirect form----^^--------------------------------" |
339 | + # forced redirect in default |
340 | + br.select_form(nr=0) |
341 | + br.submit() |
342 | + response3 = br.response() |
343 | + # return some cookie which should be set by Javascript |
344 | + #print response3.geturl() |
345 | + raw = response3.get_data() |
346 | + #print "---------------------------response to form --------------------------------------------" |
347 | + # grab cookie from JS and set it |
348 | + redirectflag = re.search(r"var checkValue = '(\d+)';", raw, re.M).group(1) |
349 | + br.select_form(nr=0) |
350 | + |
351 | + self.temp_files.append(PersistentTemporaryFile('_fa.html')) |
352 | + self.temp_files[-1].write("#LWP-Cookies-2.0\n") |
353 | + |
354 | + self.temp_files[-1].write("Set-Cookie3: Cookie-dummy=Cookie-value; domain=\".nikkei.com\"; path=\"/\"; path_spec; secure; expires=\"2029-12-21 05:07:59Z\"; version=0\n") |
355 | + self.temp_files[-1].write("Set-Cookie3: redirectFlag="+redirectflag+"; domain=\".nikkei.com\"; path=\"/\"; path_spec; secure; expires=\"2029-12-21 05:07:59Z\"; version=0\n") |
356 | + self.temp_files[-1].close() |
357 | + cj.load(self.temp_files[-1].name) |
358 | + |
359 | + br.submit() |
360 | + |
361 | + #br.set_debug_http(False) |
362 | + #br.set_debug_redirects(False) |
363 | + #br.set_debug_responses(False) |
364 | + return br |
365 | + |
366 | + |
367 | + |
368 | + |
369 | |
370 | === added file 'resources/recipes/nikkei_sub_industry.recipe' |
371 | --- resources/recipes/nikkei_sub_industry.recipe 1970-01-01 00:00:00 +0000 |
372 | +++ resources/recipes/nikkei_sub_industry.recipe 2010-11-23 07:00:03 +0000 |
373 | @@ -0,0 +1,109 @@ |
374 | +#!/usr/bin/env python |
375 | + |
376 | +__license__ = 'GPL v3' |
377 | +__copyright__ = '2010, Hiroshi Miura <miurahr@linux.com>' |
378 | +''' |
379 | +www.nikkei.com |
380 | +''' |
381 | + |
382 | +import string, re, sys |
383 | +from calibre import strftime |
384 | +from calibre.web.feeds.recipes import BasicNewsRecipe |
385 | +import mechanize |
386 | +from calibre.ptempfile import PersistentTemporaryFile |
387 | + |
388 | + |
389 | +class NikkeiNet_sub_industory(BasicNewsRecipe): |
390 | + title = u'\u65e5\u7d4c\u65b0\u805e\u96fb\u5b50\u7248(\u7523\u696d)' |
391 | + __author__ = 'Hiroshi Miura' |
392 | + description = 'News and current market affairs from Japan' |
393 | + cover_url = 'http://parts.nikkei.com/parts/ds/images/common/logo_r1.svg' |
394 | + masthead_url = 'http://parts.nikkei.com/parts/ds/images/common/logo_r1.svg' |
395 | + needs_subscription = True |
396 | + oldest_article = 2 |
397 | + max_articles_per_feed = 20 |
398 | + language = 'ja' |
399 | + remove_javascript = False |
400 | + temp_files = [] |
401 | + |
402 | + remove_tags_before = {'class':"cmn-section cmn-indent"} |
403 | + remove_tags = [ |
404 | + {'class':"JSID_basePageMove JSID_baseAsyncSubmit cmn-form_area JSID_optForm_utoken"}, |
405 | + {'class':"cmn-article_keyword cmn-clearfix"}, |
406 | + {'class':"cmn-print_headline cmn-clearfix"}, |
407 | + ] |
408 | + remove_tags_after = {'class':"cmn-pr_list"} |
409 | + |
410 | + feeds = [ (u'\u65e5\u7d4c\u4f01\u696d', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=sangyo'), |
411 | + (u'\u65e5\u7d4c\u88fd\u54c1', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=newpro'), |
412 | + (u'internet', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=internet'), |
413 | + (u'\u56fd\u969b', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=kaigai'), |
414 | + (u'\u79d1\u5b66', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=kagaku'), |
415 | + |
416 | + ] |
417 | + |
418 | + def get_browser(self): |
419 | + br = BasicNewsRecipe.get_browser() |
420 | + |
421 | + cj = mechanize.LWPCookieJar() |
422 | + br.set_cookiejar(cj) |
423 | + |
424 | + #br.set_debug_http(True) |
425 | + #br.set_debug_redirects(True) |
426 | + #br.set_debug_responses(True) |
427 | + |
428 | + if self.username is not None and self.password is not None: |
429 | + #print "----------------------------get login form--------------------------------------------" |
430 | + # open login form |
431 | + br.open('https://id.nikkei.com/lounge/nl/base/LA0010.seam') |
432 | + response = br.response() |
433 | + #print "----------------------------get login form---------------------------------------------" |
434 | + #print "----------------------------set login form---------------------------------------------" |
435 | + # remove disabled input which brings error on mechanize |
436 | + response.set_data(response.get_data().replace("<input id=\"j_id48\"", "<!-- ")) |
437 | + response.set_data(response.get_data().replace("gm_home_on.gif\" />", " -->")) |
438 | + br.set_response(response) |
439 | + br.select_form(name='LA0010Form01') |
440 | + br['LA0010Form01:LA0010Email'] = self.username |
441 | + br['LA0010Form01:LA0010Password'] = self.password |
442 | + br.form.find_control(id='LA0010Form01:LA0010AutoLoginOn',type="checkbox").get(nr=0).selected = True |
443 | + br.submit() |
444 | + response1 = br.response() |
445 | + #print "----------------------------send login form---------------------------------------------" |
446 | + #print "----------------------------open news main page-----------------------------------------" |
447 | + # open news site |
448 | + br.open('http://www.nikkei.com/') |
449 | + response2 = br.response() |
450 | + #print "----------------------------www.nikkei.com BODY --------------------------------------" |
451 | + #print response2.get_data() |
452 | + #print "-------------------------^^-got auto redirect form----^^--------------------------------" |
453 | + # forced redirect in default |
454 | + br.select_form(nr=0) |
455 | + br.submit() |
456 | + response3 = br.response() |
457 | + # return some cookie which should be set by Javascript |
458 | + #print response3.geturl() |
459 | + raw = response3.get_data() |
460 | + #print "---------------------------response to form --------------------------------------------" |
461 | + # grab cookie from JS and set it |
462 | + redirectflag = re.search(r"var checkValue = '(\d+)';", raw, re.M).group(1) |
463 | + br.select_form(nr=0) |
464 | + |
465 | + self.temp_files.append(PersistentTemporaryFile('_fa.html')) |
466 | + self.temp_files[-1].write("#LWP-Cookies-2.0\n") |
467 | + |
468 | + self.temp_files[-1].write("Set-Cookie3: Cookie-dummy=Cookie-value; domain=\".nikkei.com\"; path=\"/\"; path_spec; secure; expires=\"2029-12-21 05:07:59Z\"; version=0\n") |
469 | + self.temp_files[-1].write("Set-Cookie3: redirectFlag="+redirectflag+"; domain=\".nikkei.com\"; path=\"/\"; path_spec; secure; expires=\"2029-12-21 05:07:59Z\"; version=0\n") |
470 | + self.temp_files[-1].close() |
471 | + cj.load(self.temp_files[-1].name) |
472 | + |
473 | + br.submit() |
474 | + |
475 | + #br.set_debug_http(False) |
476 | + #br.set_debug_redirects(False) |
477 | + #br.set_debug_responses(False) |
478 | + return br |
479 | + |
480 | + |
481 | + |
482 | + |
483 | |
484 | === added file 'resources/recipes/nikkei_sub_life.recipe' |
485 | --- resources/recipes/nikkei_sub_life.recipe 1970-01-01 00:00:00 +0000 |
486 | +++ resources/recipes/nikkei_sub_life.recipe 2010-11-23 07:00:03 +0000 |
487 | @@ -0,0 +1,110 @@ |
488 | +#!/usr/bin/env python |
489 | + |
490 | +__license__ = 'GPL v3' |
491 | +__copyright__ = '2010, Hiroshi Miura <miurahr@linux.com>' |
492 | +''' |
493 | +www.nikkei.com |
494 | +''' |
495 | + |
496 | +import string, re, sys |
497 | +from calibre import strftime |
498 | +from calibre.web.feeds.recipes import BasicNewsRecipe |
499 | +import mechanize |
500 | +from calibre.ptempfile import PersistentTemporaryFile |
501 | + |
502 | + |
503 | +class NikkeiNet_sub_life(BasicNewsRecipe): |
504 | + title = u'\u65e5\u7d4c\u65b0\u805e\u96fb\u5b50\u7248(\u751f\u6d3b)' |
505 | + __author__ = 'Hiroshi Miura' |
506 | + description = 'News and current market affairs from Japan' |
507 | + cover_url = 'http://parts.nikkei.com/parts/ds/images/common/logo_r1.svg' |
508 | + masthead_url = 'http://parts.nikkei.com/parts/ds/images/common/logo_r1.svg' |
509 | + needs_subscription = True |
510 | + oldest_article = 2 |
511 | + max_articles_per_feed = 20 |
512 | + language = 'ja' |
513 | + remove_javascript = False |
514 | + temp_files = [] |
515 | + |
516 | + remove_tags_before = {'class':"cmn-section cmn-indent"} |
517 | + remove_tags = [ |
518 | + {'class':"JSID_basePageMove JSID_baseAsyncSubmit cmn-form_area JSID_optForm_utoken"}, |
519 | + {'class':"cmn-article_keyword cmn-clearfix"}, |
520 | + {'class':"cmn-print_headline cmn-clearfix"}, |
521 | + ] |
522 | + remove_tags_after = {'class':"cmn-pr_list"} |
523 | + |
524 | + feeds = [ (u'\u304f\u3089\u3057', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=kurashi'), |
525 | + (u'\u30b9\u30dd\u30fc\u30c4', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=sports'), |
526 | + (u'\u793e\u4f1a', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=shakai'), |
527 | + (u'\u30a8\u30b3', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=eco'), |
528 | + (u'\u5065\u5eb7', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=kenkou'), |
529 | + (u'\u7279\u96c6', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=special'), |
530 | + (u'\u30e9\u30f3\u30ad\u30f3\u30b0', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=ranking') |
531 | + ] |
532 | + |
533 | + def get_browser(self): |
534 | + br = BasicNewsRecipe.get_browser() |
535 | + |
536 | + cj = mechanize.LWPCookieJar() |
537 | + br.set_cookiejar(cj) |
538 | + |
539 | + #br.set_debug_http(True) |
540 | + #br.set_debug_redirects(True) |
541 | + #br.set_debug_responses(True) |
542 | + |
543 | + if self.username is not None and self.password is not None: |
544 | + #print "----------------------------get login form--------------------------------------------" |
545 | + # open login form |
546 | + br.open('https://id.nikkei.com/lounge/nl/base/LA0010.seam') |
547 | + response = br.response() |
548 | + #print "----------------------------get login form---------------------------------------------" |
549 | + #print "----------------------------set login form---------------------------------------------" |
550 | + # remove disabled input which brings error on mechanize |
551 | + response.set_data(response.get_data().replace("<input id=\"j_id48\"", "<!-- ")) |
552 | + response.set_data(response.get_data().replace("gm_home_on.gif\" />", " -->")) |
553 | + br.set_response(response) |
554 | + br.select_form(name='LA0010Form01') |
555 | + br['LA0010Form01:LA0010Email'] = self.username |
556 | + br['LA0010Form01:LA0010Password'] = self.password |
557 | + br.form.find_control(id='LA0010Form01:LA0010AutoLoginOn',type="checkbox").get(nr=0).selected = True |
558 | + br.submit() |
559 | + response1 = br.response() |
560 | + #print "----------------------------send login form---------------------------------------------" |
561 | + #print "----------------------------open news main page-----------------------------------------" |
562 | + # open news site |
563 | + br.open('http://www.nikkei.com/') |
564 | + response2 = br.response() |
565 | + #print "----------------------------www.nikkei.com BODY --------------------------------------" |
566 | + #print response2.get_data() |
567 | + #print "-------------------------^^-got auto redirect form----^^--------------------------------" |
568 | + # forced redirect in default |
569 | + br.select_form(nr=0) |
570 | + br.submit() |
571 | + response3 = br.response() |
572 | + # return some cookie which should be set by Javascript |
573 | + #print response3.geturl() |
574 | + raw = response3.get_data() |
575 | + #print "---------------------------response to form --------------------------------------------" |
576 | + # grab cookie from JS and set it |
577 | + redirectflag = re.search(r"var checkValue = '(\d+)';", raw, re.M).group(1) |
578 | + br.select_form(nr=0) |
579 | + |
580 | + self.temp_files.append(PersistentTemporaryFile('_fa.html')) |
581 | + self.temp_files[-1].write("#LWP-Cookies-2.0\n") |
582 | + |
583 | + self.temp_files[-1].write("Set-Cookie3: Cookie-dummy=Cookie-value; domain=\".nikkei.com\"; path=\"/\"; path_spec; secure; expires=\"2029-12-21 05:07:59Z\"; version=0\n") |
584 | + self.temp_files[-1].write("Set-Cookie3: redirectFlag="+redirectflag+"; domain=\".nikkei.com\"; path=\"/\"; path_spec; secure; expires=\"2029-12-21 05:07:59Z\"; version=0\n") |
585 | + self.temp_files[-1].close() |
586 | + cj.load(self.temp_files[-1].name) |
587 | + |
588 | + br.submit() |
589 | + |
590 | + #br.set_debug_http(False) |
591 | + #br.set_debug_redirects(False) |
592 | + #br.set_debug_responses(False) |
593 | + return br |
594 | + |
595 | + |
596 | + |
597 | + |
598 | |
599 | === added file 'resources/recipes/nikkei_sub_main.recipe' |
600 | --- resources/recipes/nikkei_sub_main.recipe 1970-01-01 00:00:00 +0000 |
601 | +++ resources/recipes/nikkei_sub_main.recipe 2010-11-23 07:00:03 +0000 |
602 | @@ -0,0 +1,103 @@ |
603 | +#!/usr/bin/env python |
604 | + |
605 | +__license__ = 'GPL v3' |
606 | +__copyright__ = '2010, Hiroshi Miura <miurahr@linux.com>' |
607 | +''' |
608 | +www.nikkei.com |
609 | +''' |
610 | + |
611 | +import string, re, sys |
612 | +from calibre import strftime |
613 | +from calibre.web.feeds.recipes import BasicNewsRecipe |
614 | +import mechanize |
615 | +from calibre.ptempfile import PersistentTemporaryFile |
616 | + |
617 | + |
618 | +class NikkeiNet_sub_main(BasicNewsRecipe): |
619 | + title = u'\u65e5\u7d4c\u65b0\u805e\u96fb\u5b50\u7248(\u7dcf\u5408)' |
620 | + __author__ = 'Hiroshi Miura' |
621 | + description = 'News and current market affairs from Japan' |
622 | + cover_url = 'http://parts.nikkei.com/parts/ds/images/common/logo_r1.svg' |
623 | + masthead_url = 'http://parts.nikkei.com/parts/ds/images/common/logo_r1.svg' |
624 | + needs_subscription = True |
625 | + oldest_article = 2 |
626 | + max_articles_per_feed = 20 |
627 | + language = 'ja' |
628 | + remove_javascript = False |
629 | + temp_files = [] |
630 | + |
631 | + remove_tags_before = {'class':"cmn-section cmn-indent"} |
632 | + remove_tags = [ |
633 | + {'class':"JSID_basePageMove JSID_baseAsyncSubmit cmn-form_area JSID_optForm_utoken"}, |
634 | + {'class':"cmn-article_keyword cmn-clearfix"}, |
635 | + {'class':"cmn-print_headline cmn-clearfix"}, |
636 | + ] |
637 | + remove_tags_after = {'class':"cmn-pr_list"} |
638 | + |
639 | + feeds = [ (u'NIKKEI', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=main')] |
640 | + |
641 | + def get_browser(self): |
642 | + br = BasicNewsRecipe.get_browser() |
643 | + |
644 | + cj = mechanize.LWPCookieJar() |
645 | + br.set_cookiejar(cj) |
646 | + |
647 | + #br.set_debug_http(True) |
648 | + #br.set_debug_redirects(True) |
649 | + #br.set_debug_responses(True) |
650 | + |
651 | + if self.username is not None and self.password is not None: |
652 | + #print "----------------------------get login form--------------------------------------------" |
653 | + # open login form |
654 | + br.open('https://id.nikkei.com/lounge/nl/base/LA0010.seam') |
655 | + response = br.response() |
656 | + #print "----------------------------get login form---------------------------------------------" |
657 | + #print "----------------------------set login form---------------------------------------------" |
658 | + # comment out the disabled input, which causes an error in mechanize |
659 | + response.set_data(response.get_data().replace("<input id=\"j_id48\"", "<!-- ")) |
660 | + response.set_data(response.get_data().replace("gm_home_on.gif\" />", " -->")) |
661 | + br.set_response(response) |
662 | + br.select_form(name='LA0010Form01') |
663 | + br['LA0010Form01:LA0010Email'] = self.username |
664 | + br['LA0010Form01:LA0010Password'] = self.password |
665 | + br.form.find_control(id='LA0010Form01:LA0010AutoLoginOn',type="checkbox").get(nr=0).selected = True |
666 | + br.submit() |
667 | + response1 = br.response() |
668 | + #print "----------------------------send login form---------------------------------------------" |
669 | + #print "----------------------------open news main page-----------------------------------------" |
670 | + # open news site |
671 | + br.open('http://www.nikkei.com/') |
672 | + response2 = br.response() |
673 | + #print "----------------------------www.nikkei.com BODY --------------------------------------" |
674 | + #print response2.get_data() |
675 | + #print "-------------------------^^-got auto redirect form----^^--------------------------------" |
676 | + # submit the auto-redirect form that the site serves by default |
677 | + br.select_form(nr=0) |
678 | + br.submit() |
679 | + response3 = br.response() |
680 | + # the response carries a cookie value that JavaScript would normally set |
681 | + #print response3.geturl() |
682 | + raw = response3.get_data() |
683 | + #print "---------------------------response to form --------------------------------------------" |
684 | + # grab cookie from JS and set it |
685 | + redirectflag = re.search(r"var checkValue = '(\d+)';", raw, re.M).group(1) |
686 | + br.select_form(nr=0) |
687 | + |
688 | + self.temp_files.append(PersistentTemporaryFile('_fa.html')) |
689 | + self.temp_files[-1].write("#LWP-Cookies-2.0\n") |
690 | + |
691 | + self.temp_files[-1].write("Set-Cookie3: Cookie-dummy=Cookie-value; domain=\".nikkei.com\"; path=\"/\"; path_spec; secure; expires=\"2029-12-21 05:07:59Z\"; version=0\n") |
692 | + self.temp_files[-1].write("Set-Cookie3: redirectFlag="+redirectflag+"; domain=\".nikkei.com\"; path=\"/\"; path_spec; secure; expires=\"2029-12-21 05:07:59Z\"; version=0\n") |
693 | + self.temp_files[-1].close() |
694 | + cj.load(self.temp_files[-1].name) |
695 | + |
696 | + br.submit() |
697 | + |
698 | + #br.set_debug_http(False) |
699 | + #br.set_debug_redirects(False) |
700 | + #br.set_debug_responses(False) |
701 | + return br |
702 | + |
703 | + |
704 | + |
705 | + |
706 | |
707 | === added file 'resources/recipes/nikkei_sub_sports.recipe' |
708 | --- resources/recipes/nikkei_sub_sports.recipe 1970-01-01 00:00:00 +0000 |
709 | +++ resources/recipes/nikkei_sub_sports.recipe 2010-11-23 07:00:03 +0000 |
710 | @@ -0,0 +1,110 @@ |
711 | +#!/usr/bin/env python |
712 | + |
713 | +__license__ = 'GPL v3' |
714 | +__copyright__ = '2010, Hiroshi Miura <miurahr@linux.com>' |
715 | +''' |
716 | +www.nikkei.com |
717 | +''' |
718 | + |
719 | +import string, re, sys |
720 | +from calibre import strftime |
721 | +from calibre.web.feeds.recipes import BasicNewsRecipe |
722 | +import mechanize |
723 | +from calibre.ptempfile import PersistentTemporaryFile |
724 | + |
725 | + |
726 | +class NikkeiNet_sub_sports(BasicNewsRecipe): |
727 | + title = u'\u65e5\u7d4c\u65b0\u805e\u96fb\u5b50\u7248(\u30b9\u30dd\u30fc\u30c4)' |
728 | + __author__ = 'Hiroshi Miura' |
729 | + description = 'News and current market affairs from Japan' |
730 | + cover_url = 'http://parts.nikkei.com/parts/ds/images/common/logo_r1.svg' |
731 | + masthead_url = 'http://parts.nikkei.com/parts/ds/images/common/logo_r1.svg' |
732 | + needs_subscription = True |
733 | + oldest_article = 2 |
734 | + max_articles_per_feed = 20 |
735 | + language = 'ja' |
736 | + remove_javascript = False |
737 | + temp_files = [] |
738 | + |
739 | + remove_tags_before = {'class':"cmn-section cmn-indent"} |
740 | + remove_tags = [ |
741 | + {'class':"JSID_basePageMove JSID_baseAsyncSubmit cmn-form_area JSID_optForm_utoken"}, |
742 | + {'class':"cmn-article_keyword cmn-clearfix"}, |
743 | + {'class':"cmn-print_headline cmn-clearfix"}, |
744 | + ] |
745 | + remove_tags_after = {'class':"cmn-pr_list"} |
746 | + |
747 | + feeds = [ |
748 | + (u'\u30b9\u30dd\u30fc\u30c4\uff1a\u30d7\u30ed\u91ce\u7403', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=baseball'), |
749 | + (u'\u30b9\u30dd\u30fc\u30c4\uff1a\u5927\u30ea\u30fc\u30b0', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=mlb'), |
750 | + (u'\u30b9\u30dd\u30fc\u30c4\uff1a\u30b5\u30c3\u30ab\u30fc', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=soccer'), |
751 | + (u'\u30b9\u30dd\u30fc\u30c4\uff1a\u30b4\u30eb\u30d5', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=golf'), |
752 | + (u'\u30b9\u30dd\u30fc\u30c4\uff1a\u76f8\u64b2', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=sumou'), |
753 | + (u'\u30b9\u30dd\u30fc\u30c4\uff1a\u7af6\u99ac', u'http://www.zou3.net/php/rss/nikkei2rss.php?head=keiba') |
754 | + ] |
755 | + |
756 | + def get_browser(self): |
757 | + br = BasicNewsRecipe.get_browser() |
758 | + |
759 | + cj = mechanize.LWPCookieJar() |
760 | + br.set_cookiejar(cj) |
761 | + |
762 | + #br.set_debug_http(True) |
763 | + #br.set_debug_redirects(True) |
764 | + #br.set_debug_responses(True) |
765 | + |
766 | + if self.username is not None and self.password is not None: |
767 | + #print "----------------------------get login form--------------------------------------------" |
768 | + # open login form |
769 | + br.open('https://id.nikkei.com/lounge/nl/base/LA0010.seam') |
770 | + response = br.response() |
771 | + #print "----------------------------get login form---------------------------------------------" |
772 | + #print "----------------------------set login form---------------------------------------------" |
773 | + # comment out the disabled input, which causes an error in mechanize |
774 | + response.set_data(response.get_data().replace("<input id=\"j_id48\"", "<!-- ")) |
775 | + response.set_data(response.get_data().replace("gm_home_on.gif\" />", " -->")) |
776 | + br.set_response(response) |
777 | + br.select_form(name='LA0010Form01') |
778 | + br['LA0010Form01:LA0010Email'] = self.username |
779 | + br['LA0010Form01:LA0010Password'] = self.password |
780 | + br.form.find_control(id='LA0010Form01:LA0010AutoLoginOn',type="checkbox").get(nr=0).selected = True |
781 | + br.submit() |
782 | + response1 = br.response() |
783 | + #print "----------------------------send login form---------------------------------------------" |
784 | + #print "----------------------------open news main page-----------------------------------------" |
785 | + # open news site |
786 | + br.open('http://www.nikkei.com/') |
787 | + response2 = br.response() |
788 | + #print "----------------------------www.nikkei.com BODY --------------------------------------" |
789 | + #print response2.get_data() |
790 | + #print "-------------------------^^-got auto redirect form----^^--------------------------------" |
791 | + # submit the auto-redirect form that the site serves by default |
792 | + br.select_form(nr=0) |
793 | + br.submit() |
794 | + response3 = br.response() |
795 | + # the response carries a cookie value that JavaScript would normally set |
796 | + #print response3.geturl() |
797 | + raw = response3.get_data() |
798 | + #print "---------------------------response to form --------------------------------------------" |
799 | + # grab cookie from JS and set it |
800 | + redirectflag = re.search(r"var checkValue = '(\d+)';", raw, re.M).group(1) |
801 | + br.select_form(nr=0) |
802 | + |
803 | + self.temp_files.append(PersistentTemporaryFile('_fa.html')) |
804 | + self.temp_files[-1].write("#LWP-Cookies-2.0\n") |
805 | + |
806 | + self.temp_files[-1].write("Set-Cookie3: Cookie-dummy=Cookie-value; domain=\".nikkei.com\"; path=\"/\"; path_spec; secure; expires=\"2029-12-21 05:07:59Z\"; version=0\n") |
807 | + self.temp_files[-1].write("Set-Cookie3: redirectFlag="+redirectflag+"; domain=\".nikkei.com\"; path=\"/\"; path_spec; secure; expires=\"2029-12-21 05:07:59Z\"; version=0\n") |
808 | + self.temp_files[-1].close() |
809 | + cj.load(self.temp_files[-1].name) |
810 | + |
811 | + br.submit() |
812 | + |
813 | + #br.set_debug_http(False) |
814 | + #br.set_debug_redirects(False) |
815 | + #br.set_debug_responses(False) |
816 | + return br |
817 | + |
818 | + |
819 | + |
820 | + |
821 | |
822 | === added file 'resources/recipes/reuters_ja.recipe' |
823 | --- resources/recipes/reuters_ja.recipe 1970-01-01 00:00:00 +0000 |
824 | +++ resources/recipes/reuters_ja.recipe 2010-11-23 07:00:03 +0000 |
825 | @@ -0,0 +1,37 @@ |
826 | +from calibre.web.feeds.news import BasicNewsRecipe |
827 | +import re |
828 | + |
829 | +class ReutersJa(BasicNewsRecipe): |
830 | + |
831 | + title = 'Reuters(Japan)' |
832 | + description = 'Global news in Japanese' |
833 | + __author__ = 'Hiroshi Miura' |
834 | + use_embedded_content = False |
835 | + language = 'ja' |
836 | + max_articles_per_feed = 10 |
837 | + remove_javascript = True |
838 | + |
839 | + feeds = [ ('Top Stories', 'http://feeds.reuters.com/reuters/JPTopNews?format=xml'), |
840 | + ('World News', 'http://feeds.reuters.com/reuters/JPWorldNews?format=xml'), |
841 | + ('Business News', 'http://feeds.reuters.com/reuters/JPBusinessNews?format=xml'), |
842 | + ('Technology News', 'http://feeds.reuters.com/reuters/JPTechnologyNews?format=xml'), |
843 | + ('Oddly Enough News', 'http://feeds.reuters.com/reuters/JPOddlyEnoughNews?format=xml') |
844 | + ] |
845 | + |
846 | + remove_tags_before = {'class':"article primaryContent"} |
847 | + remove_tags = [ dict(id="banner"), |
848 | + dict(id="autilities"), |
849 | + dict(id="textSizer"), |
850 | + dict(id="shareFooter"), |
851 | + dict(id="relatedNews"), |
852 | + dict(id="editorsChoice"), |
853 | + dict(id="ecArticles"), |
854 | + {'class':"secondaryContent"}, |
855 | + {'class':"module"}, |
856 | + ] |
857 | + remove_tags_after = {'class':"assetBuddy"} |
858 | + |
859 | + def print_version(self, url): |
860 | + m = re.search('(.*idJPJAPAN-[0-9]+)', url) |
861 | + return m.group(0)+'?sp=true' if m else url |
862 | + |
863 | |
864 | === added file 'resources/recipes/the_h.recipe' |
865 | --- resources/recipes/the_h.recipe 1970-01-01 00:00:00 +0000 |
866 | +++ resources/recipes/the_h.recipe 2010-11-23 07:00:03 +0000 |
867 | @@ -0,0 +1,31 @@ |
868 | +#!/usr/bin/env python |
869 | + |
870 | +__license__ = 'GPL v3' |
871 | +__copyright__ = '2010, Hiroshi Miura <miurahr@linux.com>' |
872 | +''' |
873 | +www.h-online.com |
874 | +''' |
875 | + |
876 | +class TheHeiseOnline(BasicNewsRecipe): |
877 | + title = u'The H' |
878 | + __author__ = 'Hiroshi Miura' |
879 | + oldest_article = 3 |
880 | + description = 'In association with Heise Online' |
881 | + publisher = 'Heise Media UK Ltd.' |
882 | + category = 'news, technology, security' |
883 | + max_articles_per_feed = 100 |
884 | + language = 'en' |
885 | + encoding = 'utf-8' |
886 | + conversion_options = { |
887 | + 'comment' : description |
888 | + ,'tags' : category |
889 | + ,'publisher': publisher |
890 | + ,'language' : language |
891 | + } |
892 | + feeds = [ |
893 | + (u'The H News Feed', u'http://www.h-online.com/news/atom.xml') |
894 | + ] |
895 | + |
896 | + def print_version(self, url): |
897 | + return url + '?view=print' |
898 | + |
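The cookie step in the Nikkei recipes' get_browser (read checkValue out of the inline JavaScript, then hand it to the cookie jar as redirectFlag through an LWP-format file) can be sketched on its own. This is a minimal illustration and not part of the proposed branch. It assumes Python 3 and uses the standard library's http.cookiejar.LWPCookieJar in place of mechanize.LWPCookieJar. The function names and the sample page are hypothetical. Unlike the recipes, it raises a clear error when the pattern is missing.

```python
import os
import re
import tempfile
from http.cookiejar import LWPCookieJar

# Same pattern as the recipes: the page embeds the value in inline JavaScript.
CHECK_VALUE_RE = re.compile(r"var checkValue = '(\d+)';")


def extract_redirect_flag(html):
    """Return the digits assigned to checkValue, or raise if they are absent."""
    match = CHECK_VALUE_RE.search(html)
    if match is None:
        raise ValueError("checkValue not found in response")
    return match.group(1)


def build_cookie_file(redirect_flag):
    """Build the text of an LWP cookie file holding the redirectFlag cookie."""
    attrs = ('domain=".nikkei.com"; path="/"; path_spec; secure; '
             'expires="2029-12-21 05:07:59Z"; version=0')
    return (
        "#LWP-Cookies-2.0\n"
        "Set-Cookie3: Cookie-dummy=Cookie-value; %s\n"
        "Set-Cookie3: redirectFlag=%s; %s\n" % (attrs, redirect_flag, attrs)
    )


def load_cookies(cookie_text):
    """Write the cookie text to a temporary file and load it into a new jar."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write(cookie_text)
    jar = LWPCookieJar()
    try:
        # ignore_expires keeps the sketch working after the hard-coded date.
        jar.load(handle.name, ignore_discard=True, ignore_expires=True)
    finally:
        os.remove(handle.name)
    return jar


# Hypothetical response body, standing in for response3.get_data().
sample_page = "<script>var checkValue = '20101123';</script>"
jar = load_cookies(build_cookie_file(extract_redirect_flag(sample_page)))
cookies = {cookie.name: cookie.value for cookie in jar}
```

Keeping the extraction, the file text, and the jar loading in separate functions makes each part checkable without logging in to the site. The same get_browser body is repeated in every nikkei_sub_*.recipe, so a shared helper along these lines might also reduce the duplication.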