=== modified file '.pc/applied-patches'
--- .pc/applied-patches	2013-03-06 14:34:02 +0000
+++ .pc/applied-patches	2013-03-19 07:30:29 +0000
@@ -8,3 +8,4 @@
 test-sys-path.diff
 move-data-to-usr-share.diff
 disable_py33_failing_tests.diff
+support-aliases-in-references.diff

=== added directory '.pc/support-aliases-in-references.diff'
=== added directory '.pc/support-aliases-in-references.diff/docs'
=== added directory '.pc/support-aliases-in-references.diff/docs/ref'
=== added directory '.pc/support-aliases-in-references.diff/docs/ref/rst'
=== added file '.pc/support-aliases-in-references.diff/docs/ref/rst/restructuredtext.txt'
--- .pc/support-aliases-in-references.diff/docs/ref/rst/restructuredtext.txt	1970-01-01 00:00:00 +0000
+++ .pc/support-aliases-in-references.diff/docs/ref/rst/restructuredtext.txt	2013-03-19 07:30:29 +0000
@@ -0,0 +1,2979 @@
.. -*- coding: utf-8 -*-

=======================================
 reStructuredText Markup Specification
=======================================

:Author: David Goodger
:Contact: docutils-develop@lists.sourceforge.net
:Revision: $Revision: 7302 $
:Date: $Date: 2012-01-03 20:23:53 +0100 (Di, 03. Jan 2012) $
:Copyright: This document has been placed in the public domain.

.. Note::

   This document is a detailed technical specification; it is not a
   tutorial or a primer.  If this is your first exposure to
   reStructuredText, please read `A ReStructuredText Primer`_ and the
   `Quick reStructuredText`_ user reference first.

.. _A ReStructuredText Primer: ../../user/rst/quickstart.html
.. _Quick reStructuredText: ../../user/rst/quickref.html


reStructuredText_ is plaintext that uses simple and intuitive
constructs to indicate the structure of a document.  These constructs
are equally easy to read in raw and processed forms.  This document is
itself an example of reStructuredText (raw, if you are reading the
text file, or processed, if you are reading an HTML document, for
example).  The reStructuredText parser is a component of Docutils_.

Simple, implicit markup is used to indicate special constructs, such
as section headings, bullet lists, and emphasis.  The markup used is
as minimal and unobtrusive as possible.  Less often-used constructs
and extensions to the basic reStructuredText syntax may have more
elaborate or explicit markup.

reStructuredText is applicable to documents of any length, from the
very small (such as inline program documentation fragments, e.g.
Python docstrings) to the quite large (this document).

The first section gives a quick overview of the syntax of the
reStructuredText markup by example.  A complete specification is given
in the `Syntax Details`_ section.

`Literal blocks`_ (in which no markup processing is done) are used for
examples throughout this document, to illustrate the plaintext markup.


.. contents::


-----------------------
 Quick Syntax Overview
-----------------------

A reStructuredText document is made up of body or block-level
elements, and may be structured into sections.  Sections_ are
indicated through title style (underlines & optional overlines).
Sections contain body elements and/or subsections.  Some body elements
contain further elements, such as lists containing list items, which
in turn may contain paragraphs and other body elements.  Others, such
as paragraphs, contain text and `inline markup`_ elements.

Here are examples of `body elements`_:

- Paragraphs_ (and `inline markup`_)::

      Paragraphs contain text and may contain inline markup:
      *emphasis*, **strong emphasis**, `interpreted text`, ``inline
      literals``, standalone hyperlinks (http://www.python.org),
      external hyperlinks (Python_), internal cross-references
      (example_), footnote references ([1]_), citation references
      ([CIT2002]_), substitution references (|example|), and _`inline
      internal targets`.

      Paragraphs are separated by blank lines and are left-aligned.

- Five types of lists:

  1. `Bullet lists`_::

         - This is a bullet list.

         - Bullets can be "*", "+", or "-".

  2. `Enumerated lists`_::

         1. This is an enumerated list.

         2. Enumerators may be arabic numbers, letters, or roman
            numerals.

  3. `Definition lists`_::

         what
             Definition lists associate a term with a definition.

         how
             The term is a one-line phrase, and the definition is one
             or more paragraphs or body elements, indented relative to
             the term.

  4. `Field lists`_::

         :what: Field lists map field names to field bodies, like
                database records.  They are often part of an extension
                syntax.

         :how: The field marker is a colon, the field name, and a
               colon.

               The field body may contain one or more body elements,
               indented relative to the field marker.

  5. `Option lists`_, for listing command-line options::

         -a            command-line option "a"
         -b file       options can have arguments
                       and long descriptions
         --long        options can be long also
         --input=file  long options can also have
                       arguments
         /V            DOS/VMS-style options too

     There must be at least two spaces between the option and the
     description.

- `Literal blocks`_::

      Literal blocks are either indented or line-prefix-quoted blocks,
      and indicated with a double-colon ("::") at the end of the
      preceding paragraph (right here -->)::

          if literal_block:
              text = 'is left as-is'
              spaces_and_linebreaks = 'are preserved'
              markup_processing = None

- `Block quotes`_::

      Block quotes consist of indented body elements:

          This theory, that is mine, is mine.

          -- Anne Elk (Miss)

- `Doctest blocks`_::

      >>> print 'Python-specific usage examples; begun with ">>>"'
      Python-specific usage examples; begun with ">>>"
      >>> print '(cut and pasted from interactive Python sessions)'
      (cut and pasted from interactive Python sessions)

- Two syntaxes for tables_:

  1. `Grid tables`_; complete, but complex and verbose::

         +------------------------+------------+----------+
         | Header row, column 1   | Header 2   | Header 3 |
         +========================+============+==========+
         | body row 1, column 1   | column 2   | column 3 |
         +------------------------+------------+----------+
         | body row 2             | Cells may span        |
         +------------------------+-----------------------+

  2. `Simple tables`_; easy and compact, but limited::

         ====================  ==========  ==========
         Header row, column 1  Header 2    Header 3
         ====================  ==========  ==========
         body row 1, column 1  column 2    column 3
         body row 2            Cells may span columns
         ====================  ======================

- `Explicit markup blocks`_ all begin with an explicit block marker,
  two periods and a space:

  - Footnotes_::

        .. [1] A footnote contains body elements, consistently
           indented by at least 3 spaces.

  - Citations_::

        .. [CIT2002] Just like a footnote, except the label is
           textual.

  - `Hyperlink targets`_::

        .. _Python: http://www.python.org

        .. _example:

        The "_example" target above points to this paragraph.

  - Directives_::

        .. image:: mylogo.png

  - `Substitution definitions`_::

        .. |symbol here| image:: symbol.png

  - Comments_::

        .. Comments begin with two dots and a space.  Anything may
           follow, except for the syntax of footnotes/citations,
           hyperlink targets, directives, or substitution definitions.


----------------
 Syntax Details
----------------

Descriptions below list "doctree elements" (document tree element
names; XML DTD generic identifiers) corresponding to syntax
constructs.  For details on the hierarchy of elements, please see `The
Docutils Document Tree`_ and the `Docutils Generic DTD`_ XML document
type definition.


Whitespace
==========

Spaces are recommended for indentation_, but tabs may also be used.
Tabs will be converted to spaces.  Tab stops are at every 8th column.

Other whitespace characters (form feeds [chr(12)] and vertical tabs
[chr(11)]) are converted to single spaces before processing.
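
The whitespace rules above can be approximated in a few lines of
Python.  This is only a sketch under stated assumptions, not the
parser's actual code: the function name ``normalize_whitespace`` is
hypothetical, and the order of the two steps (form feeds and vertical
tabs first, tabs second) is an assumption made for the illustration.

```python
def normalize_whitespace(line: str, tab_width: int = 8) -> str:
    """Approximate the whitespace handling described above for one line.

    Hypothetical helper for illustration only.
    """
    # Form feeds (chr(12)) and vertical tabs (chr(11)) become single spaces.
    line = line.replace("\x0c", " ").replace("\x0b", " ")
    # Each tab advances to the next tab stop (every 8th column by default).
    return line.expandtabs(tab_width)


print(repr(normalize_whitespace("\tindented")))
print(repr(normalize_whitespace("ab\tc")))
```

Because tabs advance to the next tab stop rather than expanding to a
fixed number of spaces, a tab after two characters becomes six spaces,
not eight.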


Blank Lines
-----------

Blank lines are used to separate paragraphs and other elements.
Multiple successive blank lines are equivalent to a single blank line,
except within literal blocks (where all whitespace is preserved).
Blank lines may be omitted when the markup makes element separation
unambiguous, in conjunction with indentation.  The first line of a
document is treated as if it is preceded by a blank line, and the last
line of a document is treated as if it is followed by a blank line.


Indentation
-----------

Indentation is used to indicate -- and is only significant in
indicating -- block quotes, definitions (in definition list items),
and local nested content:

- list item content (multi-line contents of list items, and multiple
  body elements within a list item, including nested lists),
- the content of literal blocks, and
- the content of explicit markup blocks.

Any text whose indentation is less than that of the current level
(i.e., unindented text or "dedents") ends the current level of
indentation.

Since all indentation is significant, the level of indentation must be
consistent.  For example, indentation is the sole markup indicator for
`block quotes`_::

    This is a top-level paragraph.

        This paragraph belongs to a first-level block quote.

        Paragraph 2 of the first-level block quote.

Multiple levels of indentation within a block quote will result in
more complex structures::

    This is a top-level paragraph.

        This paragraph belongs to a first-level block quote.

            This paragraph belongs to a second-level block quote.

    Another top-level paragraph.

            This paragraph belongs to a second-level block quote.

        This paragraph belongs to a first-level block quote.  The
        second-level block quote above is inside this first-level
        block quote.

When a paragraph or other construct consists of more than one line of
text, the lines must be left-aligned::

    This is a paragraph.  The lines of
    this paragraph are aligned at the left.

        This paragraph has problems.  The
    lines are not left-aligned.  In addition
      to potential misinterpretation, warning
        and/or error messages will be generated
      by the parser.

Several constructs begin with a marker, and the body of the construct
must be indented relative to the marker.  For constructs using simple
markers (`bullet lists`_, `enumerated lists`_, footnotes_, citations_,
`hyperlink targets`_, directives_, and comments_), the level of
indentation of the body is determined by the position of the first
line of text, which begins on the same line as the marker.  For
example, bullet list bodies must be indented by at least two columns
relative to the left edge of the bullet::

    - This is the first line of a bullet list
      item's paragraph.  All lines must align
      relative to the first line.  [1]_

          This indented paragraph is interpreted
          as a block quote.

    Because it is not sufficiently indented,
    this paragraph does not belong to the list
    item.

    .. [1] Here's a footnote.  The second line is aligned
       with the beginning of the footnote label.  The ".."
       marker is what determines the indentation.

For constructs using complex markers (`field lists`_ and `option
lists`_), where the marker may contain arbitrary text, the indentation
of the first line *after* the marker determines the left edge of the
body.  For example, field lists may have very long markers (containing
the field names)::

    :Hello: This field has a short field name, so aligning the field
            body with the first line is feasible.

    :Number-of-African-swallows-required-to-carry-a-coconut: It would
        be very difficult to align the field body with the left edge
        of the first line.  It may even be preferable not to begin the
        body on the same line as the marker.


Escaping Mechanism
==================

The character set universally available to plaintext documents, 7-bit
ASCII, is limited.  No matter what characters are used for markup,
they will already have multiple meanings in written text.  Therefore
markup characters *will* sometimes appear in text **without being
intended as markup**.  Any serious markup system requires an escaping
mechanism to override the default meaning of the characters used for
the markup.  In reStructuredText we use the backslash, commonly used
as an escaping character in other domains.

A backslash followed by any character (except whitespace characters)
escapes that character.  The escaped character represents the
character itself, and is prevented from playing a role in any markup
interpretation.  The backslash is removed from the output.  A literal
backslash is represented by two backslashes in a row (the first
backslash "escapes" the second, preventing it being interpreted in an
"escaping" role).

Backslash-escaped whitespace characters are removed from the document.
This allows for character-level `inline markup`_.
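
The effect of these rules on the *output text* can be sketched as
follows.  This is a simplified illustration, not the parser's
implementation: the helper name ``unescape`` is hypothetical, and the
sketch only removes backslashes; it does not model how an escaped
character is prevented from taking part in markup recognition.

```python
import re

# A backslash followed by any single character (including a newline).
_ESCAPE = re.compile(r"\\(.)", re.DOTALL)


def unescape(text: str) -> str:
    """Hypothetical helper: apply the backslash rules to plain text."""

    def replace(match):
        char = match.group(1)
        # Escaped whitespace is removed entirely; any other escaped
        # character stands for itself, without the backslash.
        return "" if char.isspace() else char

    return _ESCAPE.sub(replace, text)


print(unescape(r"\*not emphasis\*"))
print(unescape(r"H\ :sub:`2`\ O"))
```

The second call shows the character-level markup case: the escaped
spaces separate the interpreted text from its neighbours in the source
but disappear from the result.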

There are two contexts in which backslashes have no special meaning:
literal blocks and inline literals.  In these contexts, a single
backslash represents a literal backslash, without having to double up.

Please note that the reStructuredText specification and parser do not
address the issue of the representation or extraction of text input
(how and in what form the text actually *reaches* the parser).
Backslashes and other characters may serve a character-escaping
purpose in certain contexts and must be dealt with appropriately.  For
example, Python uses backslashes in strings to escape certain
characters, but not others.  The simplest solution when backslashes
appear in Python docstrings is to use raw docstrings::

    r"""This is a raw docstring.  Backslashes (\) are not touched."""


Reference Names
===============

Simple reference names are single words consisting of alphanumerics
plus isolated (no two adjacent) internal hyphens, underscores,
periods, colons and plus signs; no whitespace or other characters are
allowed.  Footnote labels (Footnotes_ & `Footnote References`_), citation
labels (Citations_ & `Citation References`_), `interpreted text`_ roles,
and some `hyperlink references`_ use the simple reference name syntax.

Reference names using punctuation or whose names are phrases (two or
more space-separated words) are called "phrase-references".
Phrase-references are expressed by enclosing the phrase in backquotes
and treating the backquoted text as a reference name::

    Want to learn about `my favorite programming language`_?

    .. _my favorite programming language: http://www.python.org

Simple reference names may also optionally use backquotes.

Reference names are whitespace-neutral and case-insensitive.  When
resolving reference names internally:

- whitespace is normalized (one or more spaces, horizontal or vertical
  tabs, newlines, carriage returns, or form feeds, are interpreted as
  a single space), and

- case is normalized (all alphabetic characters are converted to
  lowercase).

For example, the following `hyperlink references`_ are equivalent::

    - `A HYPERLINK`_
    - `a    hyperlink`_
    - `A
      Hyperlink`_
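
The two normalization steps above are easy to express in Python.  The
function name ``normalize_name`` is hypothetical and the snippet is
only a sketch of the rule as stated, assuming the name has already been
extracted from its backquotes and trailing underscore.

```python
def normalize_name(name: str) -> str:
    """Hypothetical helper: normalize a reference name for lookup."""
    # str.split() with no argument splits on any run of whitespace
    # (spaces, tabs, vertical tabs, newlines, carriage returns, form
    # feeds), so joining with " " collapses each run to one space.
    return " ".join(name.split()).lower()


names = ["A HYPERLINK", "a    hyperlink", "A\n      Hyperlink"]
print({normalize_name(name) for name in names})
```

All three spellings from the example collapse to the same key, which is
why they resolve to the same target.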

Hyperlinks_, footnotes_, and citations_ all share the same namespace
for reference names.  The labels of citations (simple reference names)
and manually-numbered footnotes (numbers) are entered into the same
database as other hyperlink names.  This means that a footnote
(defined as "``.. [1]``") which can be referred to by a footnote
reference (``[1]_``), can also be referred to by a plain hyperlink
reference (1_).  Of course, each type of reference (hyperlink,
footnote, citation) may be processed and rendered differently.  Some
care should be taken to avoid reference name conflicts.


Document Structure
==================

Document
--------

Doctree element: document.

The top-level element of a parsed reStructuredText document is the
"document" element.  After initial parsing, the document element is a
simple container for a document fragment, consisting of `body
elements`_, transitions_, and sections_, but lacking a document title
or other bibliographic elements.  The code that calls the parser may
choose to run one or more optional post-parse transforms_,
rearranging the document fragment into a complete document with a
title and possibly other metadata elements (author, date, etc.; see
`Bibliographic Fields`_).

Specifically, there is no way to indicate a document title and
subtitle explicitly in reStructuredText.  Instead, a lone top-level
section title (see Sections_ below) can be treated as the document
title.  Similarly, a lone second-level section title immediately after
the "document title" can become the document subtitle.  The rest of
the sections are then lifted up a level or two.  See the `DocTitle
transform`_ for details.


Sections
--------

Doctree elements: section, title.

Sections are identified through their titles, which are marked up with
adornment: "underlines" below the title text, or underlines and
matching "overlines" above the title.  An underline/overline is a
single repeated punctuation character that begins in column 1 and
forms a line extending at least as far as the right edge of the title
text.  Specifically, an underline/overline character may be any
non-alphanumeric printable 7-bit ASCII character [#]_.  When an
overline is used, the length and character used must match the
underline.  Underline-only adornment styles are distinct from
overline-and-underline styles that use the same character.  There may
be any number of levels of section titles, although some output
formats may have limits (HTML has 6 levels).

.. [#] The following are all valid section title adornment
   characters::

       ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~

   Some characters are more suitable than others.  The following are
   recommended::

       = - ` : . ' " ~ ^ _ * + #

Rather than imposing a fixed number and order of section title
adornment styles, the order enforced will be the order as encountered.
The first style encountered will be an outermost title (like HTML H1),
the second style will be a subtitle, the third will be a subsubtitle,
and so on.
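
The adornment rule and the "order as encountered" rule can be sketched
together in Python.  Both function names are hypothetical, and the
sketch makes simplifying assumptions: title width is measured with
``len()`` (which ignores wide characters), and ``assign_levels`` does
not check that a newly seen style appears at a legal nesting depth.

```python
import string

# string.punctuation is the set of non-alphanumeric printable 7-bit
# ASCII characters, i.e. the valid adornment characters listed above.
ADORNMENT_CHARS = set(string.punctuation)


def is_adornment(line: str, title: str) -> bool:
    """Hypothetical check: is `line` a valid underline for `title`?"""
    line = line.rstrip()
    return (
        line != ""
        and line[0] in ADORNMENT_CHARS
        and line == line[0] * len(line)           # one repeated character
        and len(line) >= len(title.rstrip())      # reaches the title's right edge
    )


def assign_levels(styles):
    """Map (character, has_overline) styles to levels in order of first use."""
    seen = {}
    levels = []
    for style in styles:
        if style not in seen:
            seen[style] = len(seen) + 1
        levels.append(seen[style])
    return levels


print(is_adornment("=============", "Section Title"))
print(assign_levels([("=", True), ("-", False), ("=", False), ("-", False)]))
```

An overlined ``=`` style and an underline-only ``=`` style are stored
under different keys, which mirrors the statement that the two are
distinct adornment styles.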

Below are examples of section title styles::

    ===============
     Section Title
    ===============

    ---------------
     Section Title
    ---------------

    Section Title
    =============

    Section Title
    -------------

    Section Title
    `````````````

    Section Title
    '''''''''''''

    Section Title
    .............

    Section Title
    ~~~~~~~~~~~~~

    Section Title
    *************

    Section Title
    +++++++++++++

    Section Title
    ^^^^^^^^^^^^^

When a title has both an underline and an overline, the title text may
be inset, as in the first two examples above.  This is merely
aesthetic and not significant.  Underline-only title text may *not* be
inset.

A blank line after a title is optional.  All text blocks up to the
next title of the same or higher level are included in a section (or
subsection, etc.).

All section title styles need not be used, nor need any specific
section title style be used.  However, a document must be consistent
in its use of section titles: once a hierarchy of title styles is
established, sections must use that hierarchy.

Each section title automatically generates a hyperlink target pointing
to the section.  The text of the hyperlink target (the "reference
name") is the same as that of the section title.  See `Implicit
Hyperlink Targets`_ for a complete description.

Sections may contain `body elements`_, transitions_, and nested
sections.


Transitions
-----------

Doctree element: transition.

    Instead of subheads, extra space or a type ornament between
    paragraphs may be used to mark text divisions or to signal
    changes in subject or emphasis.

    (The Chicago Manual of Style, 14th edition, section 1.80)

Transitions are commonly seen in novels and short fiction, as a gap
spanning one or more lines, with or without a type ornament such as a
row of asterisks.  Transitions separate other body elements.  A
transition should not begin or end a section or document, nor should
two transitions be immediately adjacent.

The syntax for a transition marker is a horizontal line of 4 or more
repeated punctuation characters.  The syntax is the same as section
title underlines without title text.  Transition markers require blank
lines before and after::

    Para.

    ----------

    Para.

Unlike section title underlines, no hierarchy of transition markers is
enforced, nor do differences in transition markers accomplish
anything.  It is recommended that a single consistent style be used.

The processing system is free to render transitions in output in any
way it likes.  For example, horizontal rules (``<hr>``) in HTML output
would be an obvious choice.
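
The marker syntax described in this section (4 or more repetitions of
one punctuation character) can be tested with a single regular
expression.  The helper name ``is_transition_marker`` is hypothetical,
and the sketch checks only the marker line itself; the required blank
lines before and after are assumed to be verified by the caller.

```python
import re
import string

# One punctuation character, then at least three more copies of it,
# optionally followed by trailing spaces.
_TRANSITION = re.compile(r"([%s])\1{3,}[ ]*$" % re.escape(string.punctuation))


def is_transition_marker(line: str) -> bool:
    """Hypothetical check for a transition marker line."""
    return _TRANSITION.match(line) is not None


for candidate in ["----------", "****", "---", "-=-="]:
    print(repr(candidate), is_transition_marker(candidate))
```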


Body Elements
=============

Paragraphs
----------

Doctree element: paragraph.

Paragraphs consist of blocks of left-aligned text with no markup
indicating any other body element.  Blank lines separate paragraphs
from each other and from other body elements.  Paragraphs may contain
`inline markup`_.

Syntax diagram::

    +------------------------------+
    | paragraph                    |
    |                              |
    +------------------------------+

    +------------------------------+
    | paragraph                    |
    |                              |
    +------------------------------+


Bullet Lists
------------

Doctree elements: bullet_list, list_item.

A text block which begins with a "*", "+", "-", "•", "‣", or "⁃",
followed by whitespace, is a bullet list item (a.k.a. "unordered" list
item).  List item bodies must be left-aligned and indented relative to
the bullet; the text immediately after the bullet determines the
indentation.  For example::

    - This is the first bullet list item.  The blank line above the
      first list item is required; blank lines between list items
      (such as below this paragraph) are optional.

    - This is the first paragraph in the second item in the list.

      This is the second paragraph in the second item in the list.
      The blank line above this paragraph is required.  The left edge
      of this paragraph lines up with the paragraph above, both
      indented relative to the bullet.

      - This is a sublist.  The bullet lines up with the left edge of
        the text blocks above.  A sublist is a new list so requires a
        blank line above and below.

    - This is the third item of the main list.

    This paragraph is not part of the list.

Here are examples of **incorrectly** formatted bullet lists::

    - This first line is fine.
    A blank line is required between list items and paragraphs.
    (Warning)

    - The following line appears to be a new sublist, but it is not:
      - This is a paragraph continuation, not a sublist (since there's
        no blank line).  This line is also incorrectly indented.
      - Warnings may be issued by the implementation.

Syntax diagram::

    +------+-----------------------+
    | "- " | list item             |
    +------| (body elements)+      |
           +-----------------------+


Enumerated Lists
----------------

Doctree elements: enumerated_list, list_item.

Enumerated lists (a.k.a. "ordered" lists) are similar to bullet lists,
but use enumerators instead of bullets.  An enumerator consists of an
enumeration sequence member and formatting, followed by whitespace.
The following enumeration sequences are recognized:

- arabic numerals: 1, 2, 3, ... (no upper limit).
- uppercase alphabet characters: A, B, C, ..., Z.
- lower-case alphabet characters: a, b, c, ..., z.
- uppercase Roman numerals: I, II, III, IV, ..., MMMMCMXCIX (4999).
- lowercase Roman numerals: i, ii, iii, iv, ..., mmmmcmxcix (4999).
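
The Roman numeral range quoted above can be checked with a small
converter.  This is a generic sketch, not the conversion routine used
by the parser; the name ``to_roman`` and the decision to raise
``ValueError`` outside 1..4999 are assumptions made for the example.

```python
_ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number: int, lowercase: bool = False) -> str:
    """Hypothetical helper: convert 1..4999 to a Roman numeral."""
    if not 1 <= number <= 4999:
        raise ValueError("Roman numeral enumerators cover 1 to 4999")
    parts = []
    for value, numeral in _ROMAN:
        # Take as many copies of this numeral as fit, keep the remainder.
        count, number = divmod(number, value)
        parts.append(numeral * count)
    result = "".join(parts)
    return result.lower() if lowercase else result


print(to_roman(4), to_roman(4999), to_roman(14, lowercase=True))
```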

In addition, the auto-enumerator, "#", may be used to automatically
enumerate a list.  Auto-enumerated lists may begin with explicit
enumeration, which sets the sequence.  Fully auto-enumerated lists use
arabic numerals and begin with 1.  (Auto-enumerated lists are new in
Docutils 0.3.8.)

The following formatting types are recognized:

- suffixed with a period: "1.", "A.", "a.", "I.", "i.".
- surrounded by parentheses: "(1)", "(A)", "(a)", "(I)", "(i)".
- suffixed with a right-parenthesis: "1)", "A)", "a)", "I)", "i)".
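
The three formatting types can be told apart with one regular
expression.  This is a rough sketch, not the parser's grammar: the
names ``classify_enumerator`` and the format labels are hypothetical,
Roman numerals are matched by character set only (their well-formedness
is not checked), and treating end of line as acceptable whitespace
after the enumerator is an assumption.

```python
import re

_ENUMERATOR = re.compile(
    r"""
    (?P<open>\()?                                      # optional opening parenthesis
    (?P<seq>[0-9]+|[A-Za-z]|[ivxlcdm]+|[IVXLCDM]+|\#)  # sequence member or auto-enumerator
    (?P<close>[.)])                                    # period or closing parenthesis
    (?:[ ]+|$)                                         # whitespace or end of line
    """,
    re.VERBOSE,
)


def classify_enumerator(line: str):
    """Return (format, sequence member) for an enumerator, else None."""
    match = _ENUMERATOR.match(line)
    if match is None:
        return None
    opened, closed = match.group("open"), match.group("close")
    if opened and closed == ")":
        fmt = "parentheses"
    elif not opened and closed == ")":
        fmt = "right-parenthesis"
    elif not opened and closed == ".":
        fmt = "period"
    else:
        return None  # e.g. "(1." mixes two formats
    return fmt, match.group("seq")


for text in ["1. item", "(a) item", "iv) item", "#. item", "Einstein. item"]:
    print(repr(text), classify_enumerator(text))
```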
703
686
704
687
While parsing an enumerated list, a new list will be started whenever:
705
688
706
689
- An enumerator is encountered which does not have the same format and
707
690
  sequence type as the current list (e.g. "1.", "(a)" produces two
708
691
  separate lists).
709
692
710
693
- The enumerators are not in sequence (e.g., "1.", "3." produces two
711
694
  separate lists).
712
695
713
696
It is recommended that the enumerator of the first list item be
714
697
ordinal-1 ("1", "A", "a", "I", or "i").  Although other start-values
715
698
will be recognized, they may not be supported by the output format.  A
716
699
level-1 [info] system message will be generated for any list beginning
717
700
with a non-ordinal-1 enumerator.
718
701
719
702
Lists using Roman numerals must begin with "I"/"i" or a
720
703
multi-character value, such as "II" or "XV".  Any other
721
704
single-character Roman numeral ("V", "X", "L", "C", "D", "M") will be
722
705
interpreted as a letter of the alphabet, not as a Roman numeral.
723
706
Likewise, lists using letters of the alphabet may not begin with
724
707
"I"/"i", since these are recognized as Roman numeral 1.
725
708
726
709
The second line of each enumerated list item is checked for validity.
727
710
This is to prevent ordinary paragraphs from being mistakenly
728
711
interpreted as list items, when they happen to begin with text
729
712
identical to enumerators.  For example, this text is parsed as an
730
713
ordinary paragraph::
731
714
732
715
    A. Einstein was a really
733
716
    smart dude.
734
717
735
718
However, ambiguity cannot be avoided if the paragraph consists of only
736
719
one line.  This text is parsed as an enumerated list item::
737
720
738
721
    A. Einstein was a really smart dude.
739
722
740
723
If a single-line paragraph begins with text identical to an enumerator
741
724
("A.", "1.", "(b)", "I)", etc.), the first character will have to be
742
725
escaped in order to have the line parsed as an ordinary paragraph::
743
726
744
727
    \A. Einstein was a really smart dude.
745
728
746
729
Examples of nested enumerated lists::
747
730
748
731
    1. Item 1 initial text.
749
732
750
733
       a) Item 1a.
751
734
       b) Item 1b.
752
735
753
736
    2. a) Item 2a.
754
737
       b) Item 2b.
755
738
756
739
Example syntax diagram::
757
740
758
741
    +-------+----------------------+
759
742
    | "1. " | list item            |
760
743
    +-------| (body elements)+     |
761
744
            +----------------------+
762
745
763
746
764
747
Definition Lists
765
748
----------------
766
749
767
750
Doctree elements: definition_list, definition_list_item, term,
768
751
classifier, definition.
769
752
770
753
Each definition list item contains a term, optional classifiers, and a
771
754
definition.  A term is a simple one-line word or phrase.  Optional
772
755
classifiers may follow the term on the same line, each after an inline
773
756
" : " (space, colon, space).  A definition is a block indented
774
757
relative to the term, and may contain multiple paragraphs and other
775
758
body elements.  No blank line is allowed between a term line and a
776
759
definition block (this distinguishes definition lists from `block
777
760
quotes`_).  Blank lines are required before the first and after the
778
761
last definition list item, but are optional in-between.  For example::
779
762
780
763
    term 1
781
764
        Definition 1.
782
765
783
766
    term 2
784
767
        Definition 2, paragraph 1.
785
768
786
769
        Definition 2, paragraph 2.
787
770
788
771
    term 3 : classifier
789
772
        Definition 3.
790
773
791
774
    term 4 : classifier one : classifier two
792
775
        Definition 4.
793
776
794
777
Inline markup is parsed in the term line before the classifier
795
778
delimiter (" : ") is recognized.  The delimiter will only be
796
779
recognized if it appears outside of any inline markup.
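
As a rough illustration of the term line layout, the following Python
sketch splits a term line on the " : " delimiter.  The function name
``split_term_line`` is hypothetical, and the sketch assumes the line
contains no inline markup; a real parser must first identify inline
markup so that a delimiter inside it is ignored.

```python
def split_term_line(line):
    """Split a definition list term line into (term, classifiers).

    Assumption: the line contains no inline markup, so every " : "
    really is a classifier delimiter.
    """
    parts = line.split(" : ")
    term = parts[0].strip()
    classifiers = [part.strip() for part in parts[1:]]
    return term, classifiers
```

For example, ``split_term_line("term 4 : classifier one : classifier two")``
gives the term ``"term 4"`` and two classifiers.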
797
780
798
781
A definition list may be used in various ways, including:
799
782
800
783
- As a dictionary or glossary.  The term is the word itself, a
801
784
  classifier may be used to indicate the usage of the term (noun,
802
785
  verb, etc.), and the definition follows.
803
786
804
787
- To describe program variables.  The term is the variable name, a
805
788
  classifier may be used to indicate the type of the variable (string,
806
789
  integer, etc.), and the definition describes the variable's use in
807
790
  the program.  This usage of definition lists supports the classifier
808
791
  syntax of Grouch_, a system for describing and enforcing a Python
809
792
  object schema.
810
793
811
794
Syntax diagram::
812
795
813
796
    +----------------------------+
814
797
    | term [ " : " classifier ]* |
815
798
    +--+-------------------------+--+
816
799
       | definition                 |
817
800
       | (body elements)+           |
818
801
       +----------------------------+
819
802
820
803
821
804
Field Lists
822
805
-----------
823
806
824
807
Doctree elements: field_list, field, field_name, field_body.
825
808
826
809
Field lists are used as part of an extension syntax, such as options
827
810
for directives_, or database-like records meant for further
828
811
processing.  They may also be used for two-column table-like
829
812
structures resembling database records (label & data pairs).
830
813
Applications of reStructuredText may recognize field names and
831
814
transform fields or field bodies in certain contexts.  For examples,
832
815
see `Bibliographic Fields`_ below, or the "image_" and "meta_"
833
816
directives in `reStructuredText Directives`_.
834
817
835
818
Field lists are mappings from field names to field bodies, modeled on
836
819
RFC822_ headers.  A field name may consist of any characters, but
837
820
colons (":") inside of field names must be escaped with a backslash.
838
821
Inline markup is parsed in field names.  Field names are
839
822
case-insensitive when further processed or transformed.  The field
840
823
name, along with a single colon prefix and suffix, together form the
841
824
field marker.  The field marker is followed by whitespace and the
842
825
field body.  The field body may contain multiple body elements,
843
826
indented relative to the field marker.  The first line after the field
844
827
name marker determines the indentation of the field body.  For
845
828
example::
846
829
847
830
    :Date: 2001-08-16
848
831
    :Version: 1
849
832
    :Authors: - Me
850
833
              - Myself
851
834
              - I
852
835
    :Indentation: Since the field marker may be quite long, the second
853
836
       and subsequent lines of the field body do not have to line up
854
837
       with the first line, but they must be indented relative to the
855
838
       field name marker, and they must line up with each other.
856
839
    :Parameter i: integer
857
840
858
841
The interpretation of individual words in a multi-word field name is
859
842
up to the application.  The application may specify a syntax for the
860
843
field name.  For example, second and subsequent words may be treated
861
844
as "arguments", quoted phrases may be treated as a single argument,
862
845
and direct support for the "name=value" syntax may be added.
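
The field marker rules described earlier in this section can be
approximated with a regular expression.  The following Python sketch is
an assumption-laden illustration, not the Docutils parser: the names
``FIELD_MARKER`` and ``parse_field_line`` are hypothetical, only the
first line of a field is examined, and backslash escapes are left in
the returned name unprocessed.

```python
import re

# ":" + field name + ":" followed by whitespace and the body (or by nothing).
# Inside the name, a colon is accepted only when escaped with a backslash.
FIELD_MARKER = re.compile(r":((?:\\.|[^:\\])+):(?:[ \t]+(.*))?$")


def parse_field_line(line):
    """Return (field name, first body line), or None if there is no marker."""
    match = FIELD_MARKER.match(line)
    if match is None:
        return None
    name, body = match.groups()
    return name, body or ""


def split_field_name(name):
    """One possible application-level convention (an assumption): the first
    word is the field type, the remaining words are its arguments."""
    words = name.split()
    return words[0].lower(), words[1:]
```

Here ``parse_field_line(":Parameter i: integer")`` returns
``("Parameter i", "integer")``, and ``split_field_name("Parameter i")``
returns ``("parameter", ["i"])``.  Lower-casing the field type reflects
the case-insensitive handling of field names.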
863
846
864
847
Standard RFC822_ headers cannot be used for this construct because
865
848
they are ambiguous.  A word followed by a colon at the beginning of a
866
849
line is common in written text.  However, in well-defined contexts
867
850
such as when a field list invariably occurs at the beginning of a
868
851
document (PEPs and email messages), standard RFC822 headers could be
869
852
used.
870
853
871
854
Syntax diagram (simplified)::
872
855
873
856
    +--------------------+----------------------+
874
857
    | ":" field name ":" | field body           |
875
858
    +-------+------------+                      |
876
859
            | (body elements)+                  |
877
860
            +-----------------------------------+
878
861
879
862
880
863
Bibliographic Fields
881
864
````````````````````
882
865
883
866
Doctree elements: docinfo, author, authors, organization, contact,
884
867
version, status, date, copyright, field, topic.
885
868
886
869
When a field list is the first non-comment element in a document
887
870
(after the document title, if there is one), it may have its fields
888
871
transformed to document bibliographic data.  This bibliographic data
889
872
corresponds to the front matter of a book, such as the title page and
890
873
copyright page.
891
874
892
875
Certain registered field names (listed below) are recognized and
893
876
transformed to the corresponding doctree elements, most becoming child
894
877
elements of the "docinfo" element.  No ordering is required of these
895
878
fields, although they may be rearranged to fit the document structure,
896
879
as noted.  Unless otherwise indicated below, each of the bibliographic
897
880
elements' field bodies may contain a single paragraph only.  Field
898
881
bodies may be checked for `RCS keywords`_ and cleaned up.  Any
899
882
unrecognized fields will remain as generic fields in the docinfo
900
883
element.
901
884
902
885
The registered bibliographic field names and their corresponding
903
886
doctree elements are as follows:
904
887
905
888
- Field name "Author": author element.
906
889
- "Authors": authors.
907
890
- "Organization": organization.
908
891
- "Contact": contact.
909
892
- "Address": address.
910
893
- "Version": version.
911
894
- "Status": status.
912
895
- "Date": date.
913
896
- "Copyright": copyright.
914
897
- "Dedication": topic.
915
898
- "Abstract": topic.
916
899
917
900
The "Authors" field may contain either: a single paragraph consisting
918
901
of a list of authors, separated by ";" or ","; or a bullet list whose
919
902
elements each contain a single paragraph per author.  ";" is checked
920
903
first, so "Doe, Jane; Doe, John" will work.  In some languages
921
904
(e.g. Swedish), there is no singular/plural distinction between
922
905
"Author" and "Authors", so only an "Authors" field is provided, and a
923
906
single name is interpreted as an "Author".  If a single name contains
924
907
a comma, end it with a semicolon to disambiguate: ":Authors: Doe,
925
908
Jane;".
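
The separator precedence for a single-paragraph "Authors" field can be
sketched as follows.  The function name ``split_authors`` is
hypothetical, and the bullet list form is not covered.

```python
def split_authors(text):
    """Split a single-paragraph "Authors" field body into names.

    ";" is tried before ",", so names containing commas survive when
    semicolons are used as the separator.
    """
    for separator in (";", ","):
        if separator in text:
            names = [name.strip() for name in text.split(separator)]
            # A trailing separator (as in "Doe, Jane;") leaves an empty piece.
            return [name for name in names if name]
    return [text.strip()]
```

With this sketch, ``"Doe, Jane; Doe, John"`` yields two names, and
``"Doe, Jane;"`` yields the single name ``"Doe, Jane"``.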
926
909
927
910
The "Address" field is for a multi-line surface mailing address.
928
911
Newlines and whitespace will be preserved.
929
912
930
913
The "Dedication" and "Abstract" fields may contain arbitrary body
931
914
elements.  Only one of each is allowed.  They become topic elements
932
915
with "Dedication" or "Abstract" titles (or language equivalents)
933
916
immediately following the docinfo element.
934
917
935
918
This field-name-to-element mapping can be replaced for other
936
919
languages.  See the `DocInfo transform`_ implementation documentation
937
920
for details.
938
921
939
922
Unregistered/generic fields may contain one or more paragraphs or
940
923
arbitrary body elements.
941
924
942
925
943
926
RCS Keywords
944
927
````````````
945
928
946
929
`Bibliographic fields`_ recognized by the parser are normally checked
947
930
for RCS [#]_ keywords and cleaned up [#]_.  RCS keywords may be
948
931
entered into source files as "$keyword$", and once stored under RCS or
949
932
CVS [#]_, they are expanded to "$keyword: expansion text $".  For
950
933
example, a "Status" field will be transformed to a "status" element::
951
934
952
935
    :Status: $keyword: expansion text $
953
936
954
937
.. [#] Revision Control System.
955
938
.. [#] RCS keyword processing can be turned off (unimplemented).
956
939
.. [#] Concurrent Versions System.  CVS uses the same keywords as RCS.
957
940
958
941
Processed, the "status" element's text will become simply "expansion
959
942
text".  The dollar sign delimiters and leading RCS keyword name are
960
943
removed.
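
The clean-up step can be illustrated with a short Python sketch.  The
pattern and the name ``clean_rcs_keyword`` are assumptions made for
illustration; the actual transform may treat individual keywords
differently.

```python
import re

# "$keyword: expansion text $" -> captures "expansion text"
RCS_KEYWORD = re.compile(r"^\$\w+: (.+) \$$")


def clean_rcs_keyword(text):
    """Strip the dollar-sign delimiters and keyword name, if present."""
    match = RCS_KEYWORD.match(text)
    if match is None:
        return text  # not an expanded RCS keyword; leave unchanged
    return match.group(1)
```

Text that is not an expanded keyword, such as ``"Draft"``, is returned
unchanged.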
961
944
962
945
RCS keyword processing is applied only when the field list is in a
963
946
bibliographic context (first non-comment construct in the document,
964
947
after a document title if there is one).
965
948
966
949
967
950
Option Lists
968
951
------------
969
952
970
953
Doctree elements: option_list, option_list_item, option_group, option,
971
954
option_string, option_argument, description.
972
955
973
956
Option lists are two-column lists of command-line options and
974
957
descriptions, documenting a program's options.  For example::
975
958
976
959
    -a         Output all.
977
960
    -b         Output both (this description is
978
961
               quite long).
979
962
    -c arg     Output just arg.
980
963
    --long     Output all day long.
981
964
982
965
    -p         This option has two paragraphs in the description.
983
966
               This is the first.
984
967
985
968
               This is the second.  Blank lines may be omitted between
986
969
               options (as above) or left in (as here and below).
987
970
988
971
    --very-long-option  A VMS-style option.  Note the adjustment for
989
972
                        the required two spaces.
990
973
991
974
    --an-even-longer-option
992
975
               The description can also start on the next line.
993
976
994
977
    -2, --two  This option has two variants.
995
978
996
979
    -f FILE, --file=FILE  These two options are synonyms; both have
997
980
                          arguments.
998
981
999
982
    /V         A VMS/DOS-style option.
1000
983
1001
984
There are several types of options recognized by reStructuredText:
1002
985
1003
986
- Short POSIX options consist of one dash and an option letter.
1004
987
- Long POSIX options consist of two dashes and an option word; some
1005
988
  systems use a single dash.
1006
989
- Old GNU-style "plus" options consist of one plus and an option
1007
990
  letter ("plus" options are now deprecated and their use is discouraged).
1008
991
- DOS/VMS options consist of a slash and an option letter or word.
1009
992
1010
993
Please note that both POSIX-style and DOS/VMS-style options may be
1011
994
used by DOS or Windows software.  These and other variations are
1012
995
sometimes used mixed together.  The names above have been chosen for
1013
996
convenience only.
1014
997
1015
998
The syntax for short and long POSIX options is based on the syntax
1016
999
supported by Python's getopt.py_ module, which implements an option
1017
1000
parser similar to the `GNU libc getopt_long()`_ function but with some
1018
1001
restrictions.  There are many variant option systems, and
1019
1002
reStructuredText option lists do not support all of them.
1020
1003
1021
1004
Although long POSIX and DOS/VMS option words may be allowed to be
1022
1005
truncated by the operating system or the application when used on the
1023
1006
command line, reStructuredText option lists do not show or support
1024
1007
this with any special syntax.  The complete option word should be
1025
1008
given, supported by notes about truncation if and when applicable.
1026
1009
1027
1010
Options may be followed by an argument placeholder, whose role and
1028
1011
syntax should be explained in the description text.  Either a space or
1029
1012
an equals sign may be used as a delimiter between options and option
1030
1013
argument placeholders; short options ("-" or "+" prefix only) may omit
1031
1014
the delimiter.  Option arguments may take one of two forms:
1032
1015
1033
1016
- Begins with a letter (``[a-zA-Z]``) and subsequently consists of
1034
1017
  letters, numbers, underscores and hyphens (``[a-zA-Z0-9_-]``).
1035
1018
- Begins with an open-angle-bracket (``<``) and ends with a
1036
1019
  close-angle-bracket (``>``); any characters except angle brackets
1037
1020
  are allowed internally.
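
The two argument forms listed above translate directly into a regular
expression.  The following Python sketch uses hypothetical names
(``OPTION_ARGUMENT``, ``is_option_argument``) and assumes that an
angle-bracketed argument has at least one character between the
brackets.

```python
import re

OPTION_ARGUMENT = re.compile(
    r"[a-zA-Z][a-zA-Z0-9_-]*"   # letter, then letters/digits/"_"/"-"
    r"|<[^<>]+>"                # "<...>" with no angle brackets inside
)


def is_option_argument(text):
    """Return True if text is a valid option argument placeholder."""
    return OPTION_ARGUMENT.fullmatch(text) is not None
```

For example, ``FILE`` and ``<file name>`` are accepted, while
``9lives`` and ``<a<b>`` are rejected.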
1038
1021
1039
1022
Multiple option "synonyms" may be listed, sharing a single
1040
1023
description.  They must be separated by comma-space.
1041
1024
1042
1025
There must be at least two spaces between the option(s) and the
1043
1026
description.  The description may contain multiple body elements.  The
1044
1027
first line after the option marker determines the indentation of the
1045
1028
description.  As with other types of lists, blank lines are required
1046
1029
before the first option list item and after the last, but are optional
1047
1030
between option entries.
1048
1031
1049
1032
Syntax diagram (simplified)::
1050
1033
1051
1034
    +----------------------------+-------------+
1052
1035
    | option [" " argument] "  " | description |
1053
1036
    +-------+--------------------+             |
1054
1037
            | (body elements)+                 |
1055
1038
            +----------------------------------+
1056
1039
1057
1040
1058
1041
Literal Blocks
1059
1042
--------------
1060
1043
1061
1044
Doctree element: literal_block.
1062
1045
1063
1046
A paragraph consisting of two colons ("::") signifies that the
1064
1047
following text block(s) comprise a literal block.  The literal block
1065
1048
must either be indented or quoted (see below).  No markup processing
1066
1049
is done within a literal block.  It is left as-is, and is typically
1067
1050
rendered in a monospaced typeface::
1068
1051
1069
1052
    This is a typical paragraph.  An indented literal block follows.
1070
1053
1071
1054
    ::
1072
1055
1073
1056
        for a in [5,4,3,2,1]:   # this is program code, shown as-is
1074
1057
            print(a)
        print("it's...")
1076
1059
        # a literal block continues until the indentation ends
1077
1060
1078
1061
    This text has returned to the indentation of the first paragraph,
1079
1062
    is outside of the literal block, and is therefore treated as an
1080
1063
    ordinary paragraph.
1081
1064
1082
1065
The paragraph containing only "::" will be completely removed from the
1083
1066
output; no empty paragraph will remain.
1084
1067
1085
1068
As a convenience, the "::" is recognized at the end of any paragraph.
1086
1069
If immediately preceded by whitespace, both colons will be removed
1087
1070
from the output (this is the "partially minimized" form).  When text
1088
1071
immediately precedes the "::", *one* colon will be removed from the
1089
1072
output, leaving only one colon visible (i.e., "::" will be replaced by
1090
1073
":"; this is the "fully minimized" form).
1091
1074
1092
1075
In other words, these are all equivalent (please pay attention to the
1093
1076
colons after "Paragraph"):
1094
1077
1095
1078
1. Expanded form::
1096
1079
1097
1080
      Paragraph:
1098
1081
1099
1082
      ::
1100
1083
1101
1084
          Literal block
1102
1085
1103
1086
2. Partially minimized form::
1104
1087
1105
1088
      Paragraph: ::
1106
1089
1107
1090
          Literal block
1108
1091
1109
1092
3. Fully minimized form::
1110
1093
1111
1094
      Paragraph::
1112
1095
1113
1096
          Literal block
1114
1097
1115
1098
All whitespace (including line breaks, but excluding minimum
1116
1099
indentation for indented literal blocks) is preserved.  Blank lines
1117
1100
are required before and after a literal block, but these blank lines
1118
1101
are not included as part of the literal block.
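
The handling of the "::" marker in the expanded, partially minimized,
and fully minimized forms can be sketched in Python.  The function name
``render_marker_paragraph`` is hypothetical; it takes the text of a
paragraph ending in "::" and returns the text that remains visible in
the output.

```python
def render_marker_paragraph(text):
    """Return the visible text of a paragraph that ends with "::".

    Returns None when the paragraph is removed entirely.
    """
    if not text.endswith("::"):
        raise ValueError("paragraph does not end with '::'")
    before = text[:-2]
    if before == "":
        return None                # expanded form: paragraph is only "::"
    if before[-1].isspace():
        return before.rstrip()     # partially minimized: both colons removed
    return before + ":"            # fully minimized: one colon remains
```

All three forms leave the same visible text, ``Paragraph:``, which is
why they are equivalent.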
1119
1102
1120
1103
1121
1104
Indented Literal Blocks
1122
1105
```````````````````````
1123
1106
1124
1107
Indented literal blocks are indicated by indentation relative to the
1125
1108
surrounding text (leading whitespace on each line).  The minimum
1126
1109
indentation will be removed from each line of an indented literal
1127
1110
block.  The literal block need not be contiguous; blank lines are
1128
1111
allowed between sections of indented text.  The literal block ends
1129
1112
with the end of the indentation.
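
Removing the minimum indentation can be sketched as follows.  The
function name ``strip_minimum_indent`` is hypothetical, and the sketch
assumes indentation consists of spaces only.

```python
def strip_minimum_indent(lines):
    """Remove the smallest indentation shared by all non-blank lines."""
    indents = [len(line) - len(line.lstrip(" "))
               for line in lines if line.strip()]
    minimum = min(indents, default=0)
    # Blank lines are kept (as empty strings) but do not affect the minimum.
    return [line[minimum:] if line.strip() else "" for line in lines]
```

Relative indentation inside the block is preserved; only the common
leading whitespace is dropped.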
1130
1113
1131
1114
Syntax diagram::
1132
1115
1133
1116
    +------------------------------+
1134
1117
    | paragraph                    |
1135
1118
    | (ends with "::")             |
1136
1119
    +------------------------------+
1137
1120
       +---------------------------+
1138
1121
       | indented literal block    |
1139
1122
       +---------------------------+
1140
1123
1141
1124
1142
1125
Quoted Literal Blocks
1143
1126
`````````````````````
1144
1127
1145
1128
Quoted literal blocks are unindented contiguous blocks of text where
1146
1129
each line begins with the same non-alphanumeric printable 7-bit ASCII
1147
1130
character [#]_.  A blank line ends a quoted literal block.  The
1148
1131
quoting characters are preserved in the processed document.
1149
1132
1150
1133
.. [#]
1151
1134
   The following are all valid quoting characters::
1152
1135
1153
1136
       ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
1154
1137
1155
1138
   Note that these are the same characters as are valid for title
1156
1139
   adornment of sections_.
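
A minimal Python sketch of quoted literal block recognition follows.
The names are hypothetical, and raising ``ValueError`` for inconsistent
quoting is an assumption of this sketch rather than documented parser
behavior.  ``string.punctuation`` happens to contain exactly the 32
quoting characters listed above.

```python
import string

QUOTING_CHARACTERS = set(string.punctuation)


def quoted_literal_block(lines):
    """Return the quoted literal block at the start of lines.

    The block runs up to the first blank line; every line must begin
    with the same quoting character as the first line.
    """
    if not lines or lines[0][:1] not in QUOTING_CHARACTERS:
        return []
    quote = lines[0][0]
    block = []
    for line in lines:
        if not line.strip():
            break  # a blank line ends the block
        if not line.startswith(quote):
            raise ValueError("inconsistent quoting: %r" % (line,))
        block.append(line)  # quoting characters are preserved
    return block
```

Applied to the email example, the block consists of the three lines
beginning with ">"; the line "You just did!  ;-)" lies outside it.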
1157
1140
1158
1141
Possible uses include literate programming in Haskell and email
1159
1142
quoting::
1160
1143
1161
1144
    John Doe wrote::
1162
1145
1163
1146
    >> Great idea!
1164
1147
    >
1165
1148
    > Why didn't I think of that?
1166
1149
1167
1150
    You just did!  ;-)
1168
1151
1169
1152
Syntax diagram::
1170
1153
1171
1154
    +------------------------------+
1172
1155
    | paragraph                    |
1173
1156
    | (ends with "::")             |
1174
1157
    +------------------------------+
1175
1158
    +------------------------------+
1176
1159
    | ">" per-line-quoted          |
1177
1160
    | ">" contiguous literal block |
1178
1161
    +------------------------------+
1179
1162
1180
1163
1181
1164
Line Blocks
1182
1165
-----------
1183
1166
1184
1167
Doctree elements: line_block, line.  (New in Docutils 0.3.5.)
1185
1168
1186
1169
Line blocks are useful for address blocks, verse (poetry, song
1187
1170
lyrics), and unadorned lists, where the structure of lines is
1188
1171
significant.  Line blocks are groups of lines beginning with vertical
1189
1172
bar ("|") prefixes.  Each vertical bar prefix indicates a new line, so
1190
1173
line breaks are preserved.  Initial indents are also significant,
1191
1174
resulting in a nested structure.  Inline markup is supported.
1192
1175
Continuation lines are wrapped portions of long lines; they begin with
1193
1176
a space in place of the vertical bar.  The left edge of a continuation
1194
1177
line must be indented, but need not be aligned with the left edge of
1195
1178
the text above it.  A line block ends with a blank line.
1196
1179
1197
1180
This example illustrates continuation lines::
1198
1181
1199
1182
    | Lend us a couple of bob till Thursday.
1200
1183
    | I'm absolutely skint.
1201
1184
    | But I'm expecting a postal order and I can pay you back
1202
1185
      as soon as it comes.
1203
1186
    | Love, Ewan.
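
The joining of continuation lines can be sketched in Python.  The
function name ``parse_line_block`` is hypothetical, the input lines are
assumed to be already dedented to the left edge of the block, and
nesting is ignored.

```python
def parse_line_block(lines):
    """Return the logical lines of a flat (non-nested) line block."""
    result = []
    for raw in lines:
        if raw.startswith("|"):
            # A vertical bar prefix starts a new line.
            result.append(raw[1:].strip())
        elif raw[:1] == " " and result:
            # An indented line without a bar continues the previous line.
            result[-1] += " " + raw.strip()
        else:
            raise ValueError("not part of a line block: %r" % (raw,))
    return result
```

For the example above, this produces four lines; the third is the long
sentence rejoined from its two physical lines.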
1204
1187
1205
1188
This example illustrates the nesting of line blocks, indicated by the
1206
1189
initial indentation of new lines::
1207
1190
1208
1191
    Take it away, Eric the Orchestra Leader!
1209
1192
1210
1193
        | A one, two, a one two three four
1211
1194
        |
1212
1195
        | Half a bee, philosophically,
1213
1196
        |     must, *ipso facto*, half not be.
1214
1197
        | But half the bee has got to be,
1215
1198
        |     *vis a vis* its entity.  D'you see?
1216
1199
        |
1217
1200
        | But can a bee be said to be
1218
1201
        |     or not to be an entire bee,
1219
1202
        |         when half the bee is not a bee,
1220
1203
        |             due to some ancient injury?
1221
1204
        |
1222
1205
        | Singing...
1223
1206
1224
1207
Syntax diagram::
1225
1208
1226
1209
    +------+-----------------------+
1227
1210
    | "| " | line                  |
1228
1211
    +------| continuation line     |
1229
1212
           +-----------------------+
1230
1213
1231
1214
1232
1215
Block Quotes
1233
1216
------------
1234
1217
1235
1218
Doctree elements: block_quote, attribution.
1236
1219
1237
1220
A text block that is indented relative to the preceding text, without
1238
1221
preceding markup indicating it to be a literal block or other content,
1239
1222
is a block quote.  All markup processing (for body elements and inline
1240
1223
markup) continues within the block quote::
1241
1224
1242
1225
    This is an ordinary paragraph, introducing a block quote.
1243
1226
1244
1227
        "It is my business to know things.  That is my trade."
1245
1228
1246
1229
        -- Sherlock Holmes
1247
1230
1248
1231
A block quote may end with an attribution: a text block beginning with
1249
1232
"--", "---", or a true em-dash, flush left within the block quote.  If
1250
1233
the attribution consists of multiple lines, the left edges of the
1251
1234
second and subsequent lines must align.
1252
1235
1253
1236
Multiple block quotes may occur consecutively if terminated with
1254
1237
attributions.
1255
1238
1256
1239
    Unindented paragraph.
1257
1240
1258
1241
        Block quote 1.
1259
1242
1260
1243
        -- Attribution 1
1261
1244
1262
1245
        Block quote 2.
1263
1246
1264
1247
`Empty comments`_ may be used to explicitly terminate preceding
1265
1248
constructs that would otherwise consume a block quote::
1266
1249
1267
1250
    * List item.
1268
1251
1269
1252
    ..
1270
1253
1271
1254
        Block quote 3.
1272
1255
1273
1256
Empty comments may also be used to separate block quotes::
1274
1257
1275
1258
        Block quote 4.
1276
1259
1277
1260
    ..
1278
1261
1279
1262
        Block quote 5.
1280
1263
1281
1264
Blank lines are required before and after a block quote, but these
1282
1265
blank lines are not included as part of the block quote.
1283
1266
1284
1267
Syntax diagram::
1285
1268
1286
1269
    +------------------------------+
1287
1270
    | (current level of            |
1288
1271
    | indentation)                 |
1289
1272
    +------------------------------+
1290
1273
       +---------------------------+
1291
1274
       | block quote               |
1292
1275
       | (body elements)+          |
1293
1276
       |                           |
1294
1277
       | -- attribution text       |
1295
1278
       |    (optional)             |
1296
1279
       +---------------------------+
1297
1280
1298
1281
1299
1282
Doctest Blocks
1300
1283
--------------
1301
1284
1302
1285
Doctree element: doctest_block.
1303
1286
1304
1287
Doctest blocks are interactive Python sessions cut-and-pasted into
1305
1288
docstrings.  They are meant to illustrate usage by example, and
1306
1289
provide an elegant and powerful testing environment via the `doctest
1307
1290
module`_ in the Python standard library.
1308
1291
1309
1292
Doctest blocks are text blocks which begin with ``">>> "``, the Python
1310
1293
interactive interpreter main prompt, and end with a blank line.
1311
1294
Doctest blocks are treated as a special case of literal blocks,
1312
1295
without requiring the literal block syntax.  If both are present, the
1313
1296
literal block syntax takes priority over Doctest block syntax::
1314
1297
1315
1298
    This is an ordinary paragraph.
1316
1299
1317
1300
    >>> print 'this is a Doctest block'
1318
1301
    this is a Doctest block
1319
1302
1320
1303
    The following is a literal block::
1321
1304
1322
1305
        >>> This is not recognized as a doctest block by
1323
1306
        reStructuredText.  It *will* be recognized by the doctest
1324
1307
        module, though!
1325
1308
1326
1309
Indentation is not required for doctest blocks.
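
To see how the standard library's doctest module views such a block,
the following sketch extracts the examples from a piece of text with
``doctest.DocTestParser``.  The sample text is an assumption made for
illustration and uses the Python 3 ``print()`` call.

```python
import doctest

text = """\
This is an ordinary paragraph.

>>> print('this is a Doctest block')
this is a Doctest block
"""

# get_examples() returns one Example per ">>> " prompt found in the text.
examples = doctest.DocTestParser().get_examples(text)
for example in examples:
    print("source:", example.source.rstrip())
    print("want:  ", example.want.rstrip())
```

The parser finds one example here; the ordinary paragraph is ignored.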
1327
1310
1328
1311
1329
1312
Tables
1330
1313
------
1331
1314
1332
1315
Doctree elements: table, tgroup, colspec, thead, tbody, row, entry.
1333
1316
1334
1317
ReStructuredText provides two syntaxes for delineating table cells:
1335
1318
`Grid Tables`_ and `Simple Tables`_.
1336
1319
1337
1320
As with other body elements, blank lines are required before and after
1338
1321
tables.  Tables' left edges should align with the left edge of
1339
1322
preceding text blocks; if indented, the table is considered to be part
1340
1323
of a block quote.
1341
1324
1342
1325
Once isolated, each table cell is treated as a miniature document; the
1343
1326
top and bottom cell boundaries act as delimiting blank lines.  Each
1344
1327
cell contains zero or more body elements.  Cell contents may include
1345
1328
left and/or right margins, which are removed before processing.
1346
1329
1347
1330
1348
1331
Grid Tables
1349
1332
```````````
1350
1333
1351
1334
Grid tables provide a complete table representation via grid-like
1352
1335
"ASCII art".  Grid tables allow arbitrary cell contents (body
1353
1336
elements), and both row and column spans.  However, grid tables can be
1354
1337
cumbersome to produce, especially for simple data sets.  The `Emacs
1355
1338
table mode`_ is a tool that allows easy editing of grid tables, in
1356
1339
Emacs.  See `Simple Tables`_ for a simpler (but limited)
1357
1340
representation.
1358
1341
1359
1342
Grid tables are described with a visual grid made up of the characters
1360
1343
"-", "=", "|", and "+".  The hyphen ("-") is used for horizontal lines
1361
1344
(row separators).  The equals sign ("=") may be used to separate
1362
1345
optional header rows from the table body (not supported by the `Emacs
1363
1346
table mode`_).  The vertical bar ("|") is used for vertical lines
1364
1347
(column separators).  The plus sign ("+") is used for intersections of
1365
1348
horizontal and vertical lines.  Example::
1366
1349
1367
1350
    +------------------------+------------+----------+----------+
1368
1351
    | Header row, column 1   | Header 2   | Header 3 | Header 4 |
1369
1352
    | (header rows optional) |            |          |          |
1370
1353
    +========================+============+==========+==========+
1371
1354
    | body row 1, column 1   | column 2   | column 3 | column 4 |
1372
1355
    +------------------------+------------+----------+----------+
1373
1356
    | body row 2             | Cells may span columns.          |
1374
1357
    +------------------------+------------+---------------------+
1375
1358
    | body row 3             | Cells may  | - Table cells       |
1376
1359
    +------------------------+ span rows. | - contain           |
1377
1360
    | body row 4             |            | - body elements.    |
1378
1361
    +------------------------+------------+---------------------+
1379
1362
1380
1363
Some care must be taken with grid tables to avoid undesired
1381
1364
interactions with cell text in rare cases.  For example, the following
1382
1365
table contains a cell in row 2 spanning from column 2 to column 4::
1383
1366
1384
1367
    +--------------+----------+-----------+-----------+
1385
1368
    | row 1, col 1 | column 2 | column 3  | column 4  |
1386
1369
    +--------------+----------+-----------+-----------+
1387
1370
    | row 2        |                                  |
1388
1371
    +--------------+----------+-----------+-----------+
1389
1372
    | row 3        |          |           |           |
1390
1373
    +--------------+----------+-----------+-----------+
1391
1374
1392
1375
If a vertical bar is used in the text of that cell, it could have
1393
1376
unintended effects if accidentally aligned with column boundaries::
1394
1377
1395
1378
    +--------------+----------+-----------+-----------+
1396
1379
    | row 1, col 1 | column 2 | column 3  | column 4  |
1397
1380
    +--------------+----------+-----------+-----------+
1398
1381
    | row 2        | Use the command ``ls | more``.   |
1399
1382
    +--------------+----------+-----------+-----------+
1400
1383
    | row 3        |          |           |           |
1401
1384
    +--------------+----------+-----------+-----------+
1402
1385
1403
1386
Several solutions are possible.  All that is needed is to break the
1404
1387
continuity of the cell outline rectangle.  One possibility is to shift
the text by adding an extra space before::

    +--------------+----------+-----------+-----------+
    | row 1, col 1 | column 2 | column 3  | column 4  |
    +--------------+----------+-----------+-----------+
    | row 2        |  Use the command ``ls | more``.  |
    +--------------+----------+-----------+-----------+
    | row 3        |          |           |           |
    +--------------+----------+-----------+-----------+

Another possibility is to add an extra line to row 2::

    +--------------+----------+-----------+-----------+
    | row 1, col 1 | column 2 | column 3  | column 4  |
    +--------------+----------+-----------+-----------+
    | row 2        | Use the command ``ls | more``.   |
    |              |                                  |
    +--------------+----------+-----------+-----------+
    | row 3        |          |           |           |
    +--------------+----------+-----------+-----------+


Simple Tables
`````````````

Simple tables provide a compact and easy to type but limited
row-oriented table representation for simple data sets.  Cell contents
are typically single paragraphs, although arbitrary body elements may
be represented in most cells.  Simple tables allow multi-line rows (in
all but the first column) and column spans, but not row spans.  See
`Grid Tables`_ above for a complete table representation.

Simple tables are described with horizontal borders made up of "=" and
"-" characters.  The equals sign ("=") is used for top and bottom
table borders, and to separate optional header rows from the table
body.  The hyphen ("-") is used to indicate column spans in a single
row by underlining the joined columns, and may optionally be used to
explicitly and/or visually separate rows.

A simple table begins with a top border of equals signs with one or
more spaces at each column boundary (two or more spaces recommended).
Regardless of spans, the top border *must* fully describe all table
columns.  There must be at least two columns in the table (to
differentiate it from section headers).  The top border may be
followed by header rows, and the last of the optional header rows is
underlined with '=', again with spaces at column boundaries.  There
may not be a blank line below the header row separator; it would be
interpreted as the bottom border of the table.  The bottom boundary of
the table consists of '=' underlines, also with spaces at column
boundaries.  For example, here is a truth table, a three-column table
with one header row and four body rows::

    =====  =====  =======
      A      B    A and B
    =====  =====  =======
    False  False  False
    True   False  False
    False  True   False
    True   True   True
    =====  =====  =======

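As a rough illustration of how the top border "fully describes" the
columns, the following Python sketch derives column offsets from the
border of the truth table above and slices each text line accordingly.
The function names ``column_spans`` and ``split_row`` are hypothetical
and are not part of Docutils; column spans and margin checks are
deliberately left out.

```python
import re

def column_spans(border):
    """Return (start, end) offsets for each run of '=' in a border line."""
    return [m.span() for m in re.finditer(r"=+", border)]

def split_row(line, spans):
    """Slice one text line into cells; the last column is unbounded."""
    cells = []
    for i, (start, end) in enumerate(spans):
        stop = None if i == len(spans) - 1 else end
        cells.append(line[start:stop].strip())
    return cells

table = """\
=====  =====  =======
  A      B    A and B
=====  =====  =======
False  False  False
True   False  False
False  True   False
True   True   True
=====  =====  =======
""".splitlines()

spans = column_spans(table[0])      # one span per column
header = split_row(table[1], spans)
body = [split_row(line, spans) for line in table[3:-1]]
```

Because the offsets come from the top border alone, every later line
is interpreted against the same column boundaries, which is why the
border must cover all columns.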

Underlines of '-' may be used to indicate column spans by "filling in"
column margins to join adjacent columns.  Column span underlines must
be complete (they must cover all columns) and align with established
column boundaries.  Text lines containing column span underlines may
not contain any other text.  A column span underline applies only to
one row immediately above it.  For example, here is a table with a
column span in the header::

    =====  =====  ======
       Inputs     Output
    ------------  ------
      A      B    A or B
    =====  =====  ======
    False  False  False
    True   False  True
    False  True   True
    True   True   True
    =====  =====  ======

Each line of text must contain spaces at column boundaries, except
where cells have been joined by column spans.  Each line of text
starts a new row, except when there is a blank cell in the first
column.  In that case, that line of text is parsed as a continuation
line.  For this reason, cells in the first column of new rows (*not*
continuation lines) *must* contain some text; blank cells would lead
to a misinterpretation (but see the tip below).  Also, this mechanism
limits cells in the first column to only one line of text.  Use `grid
tables`_ if this limitation is unacceptable.

.. Tip::

   To start a new row in a simple table without text in the first
   column in the processed output, use one of these:

   * an empty comment (".."), which may be omitted from the processed
     output (see Comments_ below)

   * a backslash escape ("``\``") followed by a space (see `Escaping
     Mechanism`_ above)

Underlines of '-' may also be used to visually separate rows, even if
there are no column spans.  This is especially useful in long tables,
where rows are many lines long.

Blank lines are permitted within simple tables.  Their interpretation
depends on the context.  Blank lines *between* rows are ignored.
Blank lines *within* multi-line rows may separate paragraphs or other
body elements within cells.

The rightmost column is unbounded; text may continue past the edge of
the table (as indicated by the table borders).  However, it is
recommended that borders be made long enough to contain the entire
text.

The following example illustrates continuation lines (row 2 consists
of two lines of text, and four lines for row 3), a blank line
separating paragraphs (row 3, column 2), text extending past the right
edge of the table, and a new row which will have no text in the first
column in the processed output (row 4)::

    =====  =====
    col 1  col 2
    =====  =====
    1      Second column of row 1.
    2      Second column of row 2.
           Second line of paragraph.
    3      - Second column of row 3.

           - Second item in bullet
             list (row 3, column 2).
    \      Row 4; column 1 will be empty.
    =====  =====

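The continuation-line rule can be sketched in a few lines of Python.
This is only an approximation under stated assumptions: the helper
``group_rows`` is hypothetical (not a Docutils function), the width of
the first column is passed in by hand, and '-' underlines are not
handled.

```python
def group_rows(lines, first_col_end):
    """Group body lines of a simple table into rows.

    A line starts a new row when its first-column slice holds text;
    otherwise it continues the current row.  Blank lines are kept
    inside a row and dropped when they trail it.
    """
    rows = []
    for line in lines:
        if line[:first_col_end].strip():
            rows.append([line])
        elif rows:
            rows[-1].append(line)
    for row in rows:
        while row and not row[-1].strip():
            row.pop()
    return rows

body = [
    "1      Second column of row 1.",
    "2      Second column of row 2.",
    "       Second line of paragraph.",
    "3      - Second column of row 3.",
    "",
    "       - Second item in bullet",
    "         list (row 3, column 2).",
    "\\      Row 4; column 1 will be empty.",
]
rows = group_rows(body, first_col_end=5)
```

Applied to the body of the example above, the sketch yields four rows
of one, two, four, and one lines; the backslash in the last row is
what makes it a new row rather than a continuation of row 3.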


Explicit Markup Blocks
----------------------

An explicit markup block is a text block:

- whose first line begins with ".." followed by whitespace (the
  "explicit markup start"),
- whose second and subsequent lines (if any) are indented relative to
  the first, and
- which ends before an unindented line.

Explicit markup blocks are analogous to bullet list items, with ".."
as the bullet.  The text on the lines immediately after the explicit
markup start determines the indentation of the block body.  The
maximum common indentation is always removed from the second and
subsequent lines of the block body.  Therefore if the first construct
fits in one line, and the indentation of the first and second
constructs should differ, the first construct should not begin on the
same line as the explicit markup start.

Blank lines are required between explicit markup blocks and other
elements, but are optional between explicit markup blocks where
unambiguous.

The explicit markup syntax is used for footnotes, citations, hyperlink
targets, directives, substitution definitions, and comments.

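To make the six uses of the explicit markup start concrete, here is a
simplified Python sketch that classifies the first line of an explicit
markup block.  The regular expressions are approximations written for
this illustration, not the patterns used by the Docutils parser, and
the ``classify`` function is hypothetical.  The shorthand anonymous
target syntax ("__ ...") does not begin with ".." and is not covered.

```python
import re

# Order matters: the first matching pattern wins.  Anything else that
# begins with an explicit markup start is treated as a comment.
PATTERNS = [
    ("footnote",
     re.compile(r"\.\. \[(\d+|#[A-Za-z0-9][A-Za-z0-9._-]*|#|\*)\](\s|$)")),
    ("citation",
     re.compile(r"\.\. \[[A-Za-z0-9][A-Za-z0-9._-]*\](\s|$)")),
    ("hyperlink target",
     re.compile(r"\.\. _.*?:(\s|$)")),
    ("substitution definition",
     re.compile(r"\.\. \|\S(.*?\S)?\|\s")),
    ("directive",
     re.compile(r"\.\. [A-Za-z0-9]+([-_+:.][A-Za-z0-9]+)*::(\s|$)")),
]

def classify(line):
    """Name the explicit markup construct that `line` starts, or None."""
    if not re.match(r"\.\.(\s|$)", line):
        return None  # no explicit markup start
    for name, pattern in PATTERNS:
        if pattern.match(line):
            return name
    return "comment"

kinds = [classify(line) for line in (
    ".. [1] Body elements go here.",
    ".. [CIT2002] This is the citation.",
    ".. _target:",
    ".. image:: mylogo.jpeg",
    ".. |biohazard| image:: biohazard.png",
    ".. Danger: modify at your own risk!",
)]
```

The fall-through to "comment" mirrors the idea that any explicit
markup block not recognized as another construct is a comment.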


Footnotes
`````````

Doctree elements: footnote, label.

Each footnote consists of an explicit markup start (".. "), a left
square bracket, the footnote label, a right square bracket, and
whitespace, followed by indented body elements.  A footnote label can
be:

- a whole decimal number consisting of one or more digits,

- a single "#" (denoting `auto-numbered footnotes`_),

- a "#" followed by a simple reference name (an `autonumber label`_),
  or

- a single "*" (denoting `auto-symbol footnotes`_).

The footnote content (body elements) must be consistently indented (by
at least 3 spaces) and left-aligned.  The first body element within a
footnote may often begin on the same line as the footnote label.
However, if the first element fits on one line and the indentation of
the remaining elements differs, the first element must begin on the
line after the footnote label.  Otherwise, the difference in
indentation will not be detected.

Footnotes may occur anywhere in the document, not only at the end.
Where and how they appear in the processed output depends on the
processing system.

Here is a manually numbered footnote::

    .. [1] Body elements go here.

Each footnote automatically generates a hyperlink target pointing to
itself.  The text of the hyperlink target name is the same as that of
the footnote label.  `Auto-numbered footnotes`_ generate a number as
their footnote label and reference name.  See `Implicit Hyperlink
Targets`_ for a complete description of the mechanism.

Syntax diagram::

    +-------+-------------------------+
    | ".. " | "[" label "]" footnote  |
    +-------+                         |
            | (body elements)+        |
            +-------------------------+


Auto-Numbered Footnotes
.......................

A number sign ("#") may be used as the first character of a footnote
label to request automatic numbering of the footnote or footnote
reference.

The first footnote to request automatic numbering is assigned the
label "1", the second is assigned the label "2", and so on (assuming
there are no manually numbered footnotes present; see `Mixed Manual
and Auto-Numbered Footnotes`_ below).  A footnote which has
automatically received a label "1" generates an implicit hyperlink
target with name "1", just as if the label was explicitly specified.

.. _autonumber label: `autonumber labels`_

A footnote may specify a label explicitly while at the same time
requesting automatic numbering: ``[#label]``.  These labels are called
_`autonumber labels`.  Autonumber labels do two things:

- On the footnote itself, they generate a hyperlink target whose name
  is the autonumber label (doesn't include the "#").

- They allow an automatically numbered footnote to be referred to more
  than once, as a footnote reference or hyperlink reference.  For
  example::

      If [#note]_ is the first footnote reference, it will show up as
      "[1]".  We can refer to it again as [#note]_ and again see
      "[1]".  We can also refer to it as note_ (an ordinary internal
      hyperlink reference).

      .. [#note] This is the footnote labeled "note".

The numbering is determined by the order of the footnotes, not by the
order of the references.  For footnote references without autonumber
labels (``[#]_``), the footnotes and footnote references must be in
the same relative order but need not alternate in lock-step.  For
example::

    [#]_ is a reference to footnote 1, and [#]_ is a reference to
    footnote 2.

    .. [#] This is footnote 1.
    .. [#] This is footnote 2.
    .. [#] This is footnote 3.

    [#]_ is a reference to footnote 3.

Special care must be taken if footnotes themselves contain
auto-numbered footnote references, or if multiple references are made
in close proximity.  Footnotes and references are noted in the order
they are encountered in the document, which is not necessarily the
same as the order in which a person would read them.


Auto-Symbol Footnotes
.....................

An asterisk ("*") may be used for footnote labels to request automatic
symbol generation for footnotes and footnote references.  The asterisk
may be the only character in the label.  For example::

    Here is a symbolic footnote reference: [*]_.

    .. [*] This is the footnote.

A transform will insert symbols as labels into corresponding footnotes
and footnote references.  The number of references must be equal to
the number of footnotes.  One symbol footnote cannot have multiple
references.

The standard Docutils system uses the following symbols for footnote
marks [#]_:

- asterisk/star ("*")
- dagger (HTML character entity "&dagger;", Unicode U+02020)
- double dagger ("&Dagger;"/U+02021)
- section mark ("&sect;"/U+000A7)
- pilcrow or paragraph mark ("&para;"/U+000B6)
- number sign ("#")
- spade suit ("&spades;"/U+02660)
- heart suit ("&hearts;"/U+02665)
- diamond suit ("&diams;"/U+02666)
- club suit ("&clubs;"/U+02663)

.. [#] This list was inspired by the list of symbols for "Note
   Reference Marks" in The Chicago Manual of Style, 14th edition,
   section 12.51.  "Parallels" ("||") were given in CMoS instead of
   the pilcrow.  The last four symbols (the card suits) were added
   arbitrarily.

If more than ten symbols are required, the same sequence will be
reused, doubled and then tripled, and so on ("**" etc.).

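The reuse rule can be expressed as a short Python sketch.  The symbol
list is copied from the list above; the function name
``symbol_label`` and the zero-based ``index`` argument are assumptions
made for this illustration.

```python
# Symbol order as listed above, written with the code points named there.
SYMBOLS = ["*", "\u2020", "\u2021", "\u00a7", "\u00b6",
           "#", "\u2660", "\u2665", "\u2666", "\u2663"]

def symbol_label(index):
    """Label for the auto-symbol footnote at zero-based position `index`."""
    repeats, position = divmod(index, len(SYMBOLS))
    # First pass: single symbols; second pass: doubled; and so on.
    return SYMBOLS[position] * (repeats + 1)

labels = [symbol_label(i) for i in range(22)]
```

Under this reading, the eleventh auto-symbol footnote is labeled "**"
and the twenty-first "***".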

.. Note:: When using auto-symbol footnotes, the choice of output
   encoding is important.  Many of the symbols used are not encodable
   in certain common text encodings such as Latin-1 (ISO 8859-1).  The
   use of UTF-8 for the output encoding is recommended.  An
   alternative for HTML and XML output is to use the
   "xmlcharrefreplace" `output encoding error handler`__.

__ ../../user/config.html#output-encoding-error-handler


Mixed Manual and Auto-Numbered Footnotes
........................................

Manual and automatic footnote numbering may both be used within a
single document, although the results may be unexpected.  Manual
numbering takes priority.  Only unused footnote numbers are assigned
to auto-numbered footnotes.  The following example should be
illustrative::

    [2]_ will be "2" (manually numbered),
    [#]_ will be "3" (anonymous auto-numbered), and
    [#label]_ will be "1" (labeled auto-numbered).

    .. [2] This footnote is labeled manually, so its number is fixed.

    .. [#label] This autonumber-labeled footnote will be labeled "1".
       It is the first auto-numbered footnote and no other footnote
       with label "1" exists.  The order of the footnotes is used to
       determine numbering, not the order of the footnote references.

    .. [#] This footnote will be labeled "3".  It is the second
       auto-numbered footnote, but footnote label "2" is already used.

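The priority rule can be sketched as follows.  This is an illustrative
reading of the rule rather than the Docutils transform itself: the
function ``assign_numbers`` is hypothetical, it sees only the label
text between the brackets, and auto-symbol ("*") footnotes are assumed
to be handled elsewhere.

```python
def assign_numbers(labels):
    """Map footnote labels, in document order, to their final numbers.

    `labels` holds the text between the brackets: "2", "#", "#label".
    """
    # Manually numbered footnotes reserve their numbers up front.
    used = {int(label) for label in labels if label.isdigit()}
    numbers = []
    candidate = 1
    for label in labels:
        if label.isdigit():
            numbers.append(int(label))
            continue
        # Auto-numbered: take the lowest number not yet in use.
        while candidate in used:
            candidate += 1
        used.add(candidate)
        numbers.append(candidate)
    return numbers

footnotes = ["2", "#label", "#"]   # order of the footnotes above
numbers = assign_numbers(footnotes)
```

For the example above the footnotes receive the numbers 2, 1, and 3,
matching the labels described in the footnote bodies.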


Citations
`````````

Citations are identical to footnotes except that they use only
non-numeric labels such as ``[note]`` or ``[GVR2001]``.  Citation
labels are simple `reference names`_ (case-insensitive single words
consisting of alphanumerics plus internal hyphens, underscores, and
periods; no whitespace).  Citations may be rendered separately and
differently from footnotes.  For example::

    Here is a citation reference: [CIT2002]_.

    .. [CIT2002] This is the citation.  It's just like a footnote,
       except the label is textual.
1761
1744
1762
1745
1763
1746
.. _hyperlinks:
1764
1747
1765
1748
Hyperlink Targets
1766
1749
`````````````````
1767
1750
1768
1751
Doctree element: target.
1769
1752
1770
1753
These are also called _`explicit hyperlink targets`, to differentiate
1771
1754
them from `implicit hyperlink targets`_ defined below.
1772
1755
1773
1756
Hyperlink targets identify a location within or outside of a document,
1774
1757
which may be linked to by `hyperlink references`_.
1775
1758
1776
1759
Hyperlink targets may be named or anonymous.  Named hyperlink targets
1777
1760
consist of an explicit markup start (".. "), an underscore, the
1778
1761
reference name (no trailing underscore), a colon, whitespace, and a
1779
1762
link block::
1780
1763
1781
1764
    .. _hyperlink-name: link-block
1782
1765
1783
1766
Reference names are whitespace-neutral and case-insensitive.  See
1784
1767
`Reference Names`_ for details and examples.
1785
1768
1786
1769
Anonymous hyperlink targets consist of an explicit markup start
1787
1770
(".. "), two underscores, a colon, whitespace, and a link block; there
1788
1771
is no reference name::
1789
1772
1790
1773
    .. __: anonymous-hyperlink-target-link-block
1791
1774
1792
1775
An alternate syntax for anonymous hyperlinks consists of two
1793
1776
underscores, a space, and a link block::
1794
1777
1795
1778
    __ anonymous-hyperlink-target-link-block
1796
1779
1797
1780
See `Anonymous Hyperlinks`_ below.
1798
1781
1799
1782
There are three types of hyperlink targets: internal, external, and
1800
1783
indirect.
1801
1784
1802
1785
1. _`Internal hyperlink targets` have empty link blocks.  They provide
1803
1786
   an end point allowing a hyperlink to connect one place to another
1804
1787
   within a document.  An internal hyperlink target points to the
1805
1788
   element following the target.  For example::
1806
1789
1807
1790
       Clicking on this internal hyperlink will take us to the target_
1808
1791
       below.
1809
1792
1810
1793
       .. _target:
1811
1794
1812
1795
       The hyperlink target above points to this paragraph.
1813
1796
1814
1797
   Internal hyperlink targets may be "chained".  Multiple adjacent
1815
1798
   internal hyperlink targets all point to the same element::
1816
1799
1817
1800
       .. _target1:
1818
1801
       .. _target2:
1819
1802
1820
1803
       The targets "target1" and "target2" are synonyms; they both
1821
1804
       point to this paragraph.
1822
1805
1823
1806
   If the element "pointed to" is an external hyperlink target (with a
1824
1807
   URI in its link block; see #2 below) the URI from the external
1825
1808
   hyperlink target is propagated to the internal hyperlink targets;
1826
1809
   they will all "point to" the same URI.  There is no need to
1827
1810
   duplicate a URI.  For example, all three of the following hyperlink
1828
1811
   targets refer to the same URI::
1829
1812
1830
1813
       .. _Python DOC-SIG mailing list archive:
1831
1814
       .. _archive:
1832
1815
       .. _Doc-SIG: http://mail.python.org/pipermail/doc-sig/
1833
1816
1834
1817
   An inline form of internal hyperlink target is available; see
1835
1818
   `Inline Internal Targets`_.
1836
1819
1837
1820
2. _`External hyperlink targets` have an absolute or relative URI or
1838
1821
   email address in their link blocks.  For example, take the
1839
1822
   following input::
1840
1823
1841
1824
       See the Python_ home page for info.
1842
1825
1843
1826
       `Write to me`_ with your questions.
1844
1827
1845
1828
       .. _Python: http://www.python.org
1846
1829
       .. _Write to me: jdoe@example.com
1847
1830
1848
1831
   After processing into HTML, the hyperlinks might be expressed as::
1849
1832
1850
1833
       See the <a href="http://www.python.org">Python</a> home page
1851
1834
       for info.
1852
1835
1853
1836
       <a href="mailto:jdoe@example.com">Write to me</a> with your
1854
1837
       questions.
1855
1838
1856
1839
   An external hyperlink's URI may begin on the same line as the
1857
1840
   explicit markup start and target name, or it may begin in an
1858
1841
   indented text block immediately following, with no intervening
1859
1842
   blank lines.  If there are multiple lines in the link block, they
1860
1843
   are concatenated.  Any whitespace is removed (whitespace is
1861
1844
   permitted to allow for line wrapping).  The following external
1862
1845
   hyperlink targets are equivalent::
1863
1846
1864
1847
       .. _one-liner: http://docutils.sourceforge.net/rst.html
1865
1848
1866
1849
       .. _starts-on-this-line: http://
1867
1850
          docutils.sourceforge.net/rst.html
1868
1851
1869
1852
       .. _entirely-below:
1870
1853
          http://docutils.
1871
1854
          sourceforge.net/rst.html
1872
1855
1873
1856
   If an external hyperlink target's URI contains an underscore as its
1874
1857
   last character, it must be escaped to avoid being mistaken for an
1875
1858
   indirect hyperlink target::
1876
1859
1877
1860
       This link_ refers to a file called ``underscore_``.
1878
1861
1879
1862
       .. _link: underscore\_
1880
1863
1881
1864
   It is possible (although not generally recommended) to include URIs
1882
1865
   directly within hyperlink references.  See `Embedded URIs`_ below.
1883
1866
1884
1867
3. _`Indirect hyperlink targets` have a hyperlink reference in their
1885
1868
   link blocks.  In the following example, target "one" indirectly
1886
1869
   references whatever target "two" references, and target "two"
1887
1870
   references target "three", an internal hyperlink target.  In
1888
1871
   effect, all three reference the same thing::
1889
1872
1890
1873
       .. _one: two_
1891
1874
       .. _two: three_
1892
1875
       .. _three:
1893
1876
1894
1877
   Just as with `hyperlink references`_ anywhere else in a document,
1895
1878
   if a phrase-reference is used in the link block it must be enclosed
1896
1879
   in backquotes.  As with `external hyperlink targets`_, the link
1897
1880
   block of an indirect hyperlink target may begin on the same line as
1898
1881
   the explicit markup start or the next line.  It may also be split
1899
1882
   over multiple lines, in which case the lines are joined with
1900
1883
   whitespace before being normalized.
1901
1884
1902
1885
   For example, the following indirect hyperlink targets are
1903
1886
   equivalent::
1904
1887
1905
1888
       .. _one-liner: `A HYPERLINK`_
1906
1889
       .. _entirely-below:
1907
1890
          `a    hyperlink`_
1908
1891
       .. _split: `A
1909
1892
          Hyperlink`_
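The two ideas behind indirect targets, name normalization and chain
resolution, can be sketched in Python.  The ``targets`` mapping and
the functions ``normalize`` and ``resolve`` are hypothetical structures
invented for this illustration; they are not how Docutils stores or
resolves targets.

```python
def normalize(name):
    """Reference names are whitespace-neutral and case-insensitive."""
    return " ".join(name.lower().split())

def resolve(name, targets):
    """Follow indirect targets until an internal or external one is reached.

    `targets` maps a normalized name to ("indirect", other_name),
    ("external", uri) or ("internal", None).
    """
    seen = set()
    key = normalize(name)
    while True:
        if key in seen:
            raise ValueError("circular indirect hyperlink target: " + key)
        seen.add(key)
        kind, value = targets[key]
        if kind != "indirect":
            return key, kind, value
        key = normalize(value)

# The "one" -> "two" -> "three" example from item 3 above.
targets = {
    "one": ("indirect", "two"),
    "two": ("indirect", "three"),
    "three": ("internal", None),
}
final = resolve("one", targets)
```

With this normalization, "A HYPERLINK", "a    hyperlink", and a name
split across two lines all reduce to "a hyperlink", which is why the
three indirect targets in the last example are equivalent.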

If the reference name contains any colons, either:

- the phrase must be enclosed in backquotes::

      .. _`FAQTS: Computers: Programming: Languages: Python`:
         http://python.faqts.com/

- or the colon(s) must be backslash-escaped in the link target::

      .. _Chapter One\: "Tadpole Days":

      It's not easy being green...

See `Implicit Hyperlink Targets`_ below for the resolution of
duplicate reference names.

Syntax diagram::

    +-------+----------------------+
    | ".. " | "_" name ":" link    |
    +-------+ block                |
            |                      |
            +----------------------+

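For external hyperlink targets, the rule that link block lines are
concatenated and all whitespace is removed can be sketched in Python
as follows.  The helper ``external_uri`` is hypothetical, and
backslash-escaped whitespace is outside the scope of this sketch.

```python
def external_uri(link_block_lines):
    """Join the lines of an external target's link block into one URI."""
    # str.split() with no argument drops every run of whitespace.
    return "".join("".join(line.split()) for line in link_block_lines)

# The three equivalent link blocks from the external target example.
one_liner = external_uri(["http://docutils.sourceforge.net/rst.html"])
starts_on_this_line = external_uri(
    ["http://", "   docutils.sourceforge.net/rst.html"])
entirely_below = external_uri(
    ["", "   http://docutils.", "   sourceforge.net/rst.html"])
```

All three calls produce the same URI, which is why the targets
"one-liner", "starts-on-this-line", and "entirely-below" are
equivalent.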


Anonymous Hyperlinks
....................

The `World Wide Web Consortium`_ recommends in its `HTML Techniques
for Web Content Accessibility Guidelines`_ that authors should
"clearly identify the target of each link."  Hyperlink references
should be as verbose as possible, but duplicating a verbose hyperlink
name in the target is onerous and error-prone.  Anonymous hyperlinks
are designed to allow convenient verbose hyperlink references, and are
analogous to `Auto-Numbered Footnotes`_.  They are particularly useful
in short or one-off documents.  However, this feature is easily abused
and can result in unreadable plaintext and/or unmaintainable
documents.  Caution is advised.

Anonymous `hyperlink references`_ are specified with two underscores
instead of one::

    See `the web site of my favorite programming language`__.

Anonymous targets begin with ".. __:"; no reference name is required
or allowed::

    .. __: http://www.python.org

As a convenient alternative, anonymous targets may begin with "__"
only::

    __ http://www.python.org

The reference name of the reference is not used to match the reference
to its target.  Instead, the order of anonymous hyperlink references
and targets within the document is significant: the first anonymous
reference will link to the first anonymous target.  The number of
anonymous hyperlink references in a document must match the number of
anonymous targets.  For readability, it is recommended that targets be
kept close to references.  Take care when editing text containing
anonymous references; adding, removing, and rearranging references
require attention to the order of corresponding targets.

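Positional matching of anonymous references and targets can be
sketched in Python.  The function ``pair_anonymous`` and the plain
lists it takes are assumptions made for this illustration; a real
processor would report a system message rather than raise an
exception.

```python
def pair_anonymous(references, targets):
    """Pair anonymous references with anonymous targets by position."""
    if len(references) != len(targets):
        raise ValueError(
            "%d anonymous references but %d anonymous targets"
            % (len(references), len(targets)))
    # The reference text plays no part in the match; only order does.
    return list(zip(references, targets))

references = ["the web site of my favorite programming language"]
targets = ["http://www.python.org"]
links = pair_anonymous(references, targets)
```

Because only position matters, inserting a new anonymous reference
ahead of an existing one silently shifts every later pairing unless
the targets are reordered to match.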
1974
1957
1975
1958
1976
1959
Directives
1977
1960
``````````
1978
1961
1979
1962
Doctree elements: depend on the directive.
1980
1963
1981
1964
Directives are an extension mechanism for reStructuredText, a way of
1982
1965
adding support for new constructs without adding new primary syntax
1983
1966
(directives may support additional syntax locally).  All standard
1984
1967
directives (those implemented and registered in the reference
1985
1968
reStructuredText parser) are described in the `reStructuredText
1986
1969
Directives`_ document, and are always available.  Any other directives
1987
1970
are domain-specific, and may require special action to make them
1988
1971
available when processing the document.
1989
1972
1990
1973
For example, here's how an image_ may be placed::
1991
1974
1992
1975
    .. image:: mylogo.jpeg
1993
1976
1994
1977
A figure_ (a graphic with a caption) may placed like this::
1995
1978
1996
1979
    .. figure:: larch.png
1997
1980
1998
1981
       The larch.
1999
1982
2000
1983
An admonition_ (note, caution, etc.) contains other body elements::
2001
1984
2002
1985
    .. note:: This is a paragraph
2003
1986
2004
1987
       - Here is a bullet list.
2005
1988
2006
1989
Directives are indicated by an explicit markup start (".. ") followed
2007
1990
by the directive type, two colons, and whitespace (together called the
2008
1991
"directive marker").  Directive types are case-insensitive single
2009
1992
words (alphanumerics plus isolated internal hyphens, underscores,
2010
1993
plus signs, colons, and periods; no whitespace).  Two colons are used
2011
1994
after the directive type for these reasons:
2012
1995
2013
1996
- Two colons are distinctive, and unlikely to be used in common text.
2014
1997
2015
1998
- Two colons avoids clashes with common comment text like::
2016
1999
2017
2000
      .. Danger: modify at your own risk!
2018
2001
2019
2002
- If an implementation of reStructuredText does not recognize a
2020
2003
  directive (i.e., the directive-handler is not installed), a level-3
2021
2004
  (error) system message is generated, and the entire directive block
2022
2005
  (including the directive itself) will be included as a literal
2023
2006
  block.  Thus "::" is a natural choice.
2024
2007
2025
2008
The directive block is consists of any text on the first line of the
2026
2009
directive after the directive marker, and any subsequent indented
2027
2010
text.  The interpretation of the directive block is up to the
2028
2011
directive code.  There are three logical parts to the directive block:
2029
2012
2030
2013
1. Directive arguments.
2031
2014
2. Directive options.
2032
2015
3. Directive content.
2033
2016
2034
2017
Individual directives can employ any combination of these parts.
2035
2018
Directive arguments can be filesystem paths, URLs, title text, etc.
2036
2019
Directive options are indicated using `field lists`_; the field names
2037
2020
and contents are directive-specific.  Arguments and options must form
2038
2021
a contiguous block beginning on the first or second line of the
2039
2022
directive; a blank line indicates the beginning of the directive
2040
2023
content block.  If either arguments and/or options are employed by the
2041
2024
directive, a blank line must separate them from the directive content.
2042
2025
The "figure" directive employs all three parts::
2043
2026
2044
2027
    .. figure:: larch.png
2045
2028
       :scale: 50
2046
2029
2047
2030
       The larch.
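The three-part layout can be illustrated with a simplified Python
sketch.  It assumes a directive that, like "figure", takes arguments,
so everything before the first blank line is read as arguments and
options; for other directives (such as "note") the same text may be
content, since the interpretation is up to the directive.  The
``split_directive`` function and its ``MARKER`` pattern are
hypothetical and approximate; they are not the Docutils
implementation.

```python
import re
import textwrap

MARKER = re.compile(
    r"\.\. ([A-Za-z0-9]+(?:[-_+:.][A-Za-z0-9]+)*)::(?:\s+|$)")

def split_directive(block):
    """Split a directive block into (type, arguments, options, content)."""
    lines = block.splitlines()
    match = MARKER.match(lines[0])
    if match is None:
        raise ValueError("not a directive marker: %r" % lines[0])
    directive_type = match.group(1).lower()   # types are case-insensitive
    body = textwrap.dedent("\n".join(lines[1:])).splitlines()
    # Arguments and options run up to the first blank line.
    blank = body.index("") if "" in body else len(body)
    head = [lines[0][match.end():]] + body[:blank]
    content = body[blank + 1:]
    arguments, options = [], {}
    for line in head:
        field = re.match(r":([^:]+):\s*(.*)", line)
        if field:
            options[field.group(1)] = field.group(2)
        elif line:
            arguments.append(line)
    return directive_type, arguments, options, content

example = """\
.. figure:: larch.png
   :scale: 50

   The larch.
"""
parts = split_directive(example)
```

For the "figure" example this yields the argument ``larch.png``, the
option ``scale`` with value ``50``, and one line of content.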

Simple directives may not require any content.  If a directive that
does not employ a content block is followed by indented text anyway,
it is an error.  If a block quote should immediately follow a
directive, use an empty comment in-between (see Comments_ below).

Actions taken in response to directives and the interpretation of text
in the directive content block or subsequent text block(s) are
directive-dependent.  See `reStructuredText Directives`_ for details.

Directives are meant for the arbitrary processing of their contents,
which can be transformed into something possibly unrelated to the
original text.  It may also be possible for directives to be used as
pragmas, to modify the behavior of the parser, such as to experiment
with alternate syntax.  There is no parser support for this
functionality at present; if a reasonable need for pragma directives
is found, they may be supported.

Directives do not generate "directive" elements; they are a *parser
construct* only, and have no intrinsic meaning outside of
reStructuredText.  Instead, the parser will transform recognized
directives into (possibly specialized) document elements.  Unknown
directives will trigger level-3 (error) system messages.

Syntax diagram::

    +-------+-------------------------------+
    | ".. " | directive type "::" directive |
    +-------+ block                         |
            |                               |
            +-------------------------------+


Substitution Definitions
````````````````````````

Doctree element: substitution_definition.

Substitution definitions are indicated by an explicit markup start
(".. ") followed by a vertical bar, the substitution text, another
vertical bar, whitespace, and the definition block.  Substitution text
may not begin or end with whitespace.  A substitution definition block
contains an embedded inline-compatible directive (without the leading
".. "), such as "image_" or "replace_".  For example::

    The |biohazard| symbol must be used on containers used to
    dispose of medical waste.

    .. |biohazard| image:: biohazard.png

It is an error for a substitution definition block to directly or
indirectly contain a circular substitution reference.

`Substitution references`_ are replaced in-line by the processed
contents of the corresponding definition (linked by matching
substitution text).  Matches are case-sensitive but forgiving; if no
exact match is found, a case-insensitive comparison is attempted.

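The "case-sensitive but forgiving" matching can be sketched in Python
as a two-step lookup.  The function ``find_definition`` and the
dictionary of definitions are hypothetical; how a real processor
stores definitions, or handles several names that differ only in case,
is not specified here.

```python
def find_definition(name, definitions):
    """Look up substitution text: exact match first, then ignoring case."""
    if name in definitions:
        return definitions[name]
    lowered = name.lower()
    for key, value in definitions.items():
        if key.lower() == lowered:
            return value
    raise KeyError(name)

definitions = {"biohazard": "image:: biohazard.png"}
exact = find_definition("biohazard", definitions)
forgiving = find_definition("BioHazard", definitions)
```

An exact match always wins, so two definitions whose substitution text
differs only in case can still be referenced individually.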
2105
2088
2106
2089
Substitution definitions allow the power and flexibility of
2107
2090
block-level directives_ to be shared by inline text.  They are a way
2108
2091
to include arbitrarily complex inline structures within text, while
2109
2092
keeping the details out of the flow of text.  They are the equivalent
2110
2093
of SGML/XML's named entities or programming language macros.
2111
2094
2112
2095
Without the substitution mechanism, every time someone wants an
2113
2096
application-specific new inline structure, they would have to petition
2114
2097
for a syntax change.  In combination with existing directive syntax,
2115
2098
any inline structure can be coded without new syntax (except possibly
2116
2099
a new directive).
2117
2100
2118
2101
Syntax diagram::
2119
2102
2120
2103
    +-------+-----------------------------------------------------+
    | ".. " | "|" substitution text "| " directive type "::" data |
    +-------+ directive block                                     |
            |                                                     |
            +-----------------------------------------------------+
2125
2108
2126
2109
Following are some use cases for the substitution mechanism.  Please
2127
2110
note that most of the embedded directives shown are examples only and
2128
2111
have not been implemented.
2129
2112
2130
2113
Objects
2131
2114
    Substitution references may be used to associate ambiguous text
2132
2115
    with a unique object identifier.
2133
2116
2134
2117
    For example, many sites may wish to implement an inline "user"
2135
2118
    directive::
2136
2119
2137
2120
        |Michael| and |Jon| are our widget-wranglers.
2138
2121
2139
2122
        .. |Michael| user:: mjones
2140
2123
        .. |Jon|     user:: jhl
2141
2124
2142
2125
    Depending on the needs of the site, this may be used to index the
2143
2126
    document for later searching, to hyperlink the inline text in
2144
2127
    various ways (mailto, homepage, mouseover Javascript with profile
2145
2128
    and contact information, etc.), or to customize presentation of
2146
2129
    the text (include username in the inline text, include an icon
2147
2130
    image with a link next to the text, make the text bold or a
2148
2131
    different color, etc.).
2149
2132
2150
2133
    The same approach can be used in documents which frequently refer
2151
2134
    to a particular type of object with unique identifiers but
2152
2135
    ambiguous common names.  Movies, albums, books, photos, court
2153
2136
    cases, and laws are possible.  For example::
2154
2137
2155
2138
        |The Transparent Society| offers a fascinating alternate view
2156
2139
        on privacy issues.
2157
2140
2158
2141
        .. |The Transparent Society| book:: isbn=0738201448
2159
2142
2160
2143
    Classes or functions, in contexts where the module or class names
2161
2144
    are unclear and/or interpreted text cannot be used, are another
2162
2145
    possibility::
2163
2146
2164
2147
        4XSLT has the convenience method |runString|, so you don't
2165
2148
        have to mess with DOM objects if all you want is the
2166
2149
        transformed output.
2167
2150
2168
2151
        .. |runString| function:: module=xml.xslt class=Processor
2169
2152
2170
2153
Images
2171
2154
    Images are a common use for substitution references::
2172
2155
2173
2156
        West led the |H| 3, covered by dummy's |H| Q, East's |H| K,
        and trumped in hand with the |S| 2.

        .. |H| image:: /images/heart.png
           :height: 11
           :width: 11
        .. |S| image:: /images/spade.png
           :height: 11
           :width: 11

        * |Red light| means stop.
        * |Green light| means go.
        * |Yellow light| means go really fast.

        .. |Red light|    image:: red_light.png
        .. |Green light|  image:: green_light.png
        .. |Yellow light| image:: yellow_light.png

        |-><-| is the official symbol of POEE_.

        .. |-><-| image:: discord.png
        .. _POEE: http://www.poee.org/
2195
2178
2196
2179
    The "image_" directive has been implemented.
2197
2180
2198
2181
Styles [#]_
2199
2182
    Substitution references may be used to associate inline text with
2200
2183
    an externally defined presentation style::
2201
2184
2202
2185
        Even |the text in Texas| is big.
2203
2186
2204
2187
        .. |the text in Texas| style:: big
2205
2188
2206
2189
    The style name may be meaningful in the context of some particular
2207
2190
    output format (CSS class name for HTML output, LaTeX style name
2208
2191
    for LaTeX, etc.), or may be ignored for other output formats (such
2209
2192
    as plaintext).
2210
2193
2211
2194
    .. @@@ This needs to be rethought & rewritten or removed:
2212
2195
2213
2196
       Interpreted text is unsuitable for this purpose because the set
2214
2197
       of style names cannot be predefined - it is the domain of the
2215
2198
       content author, not the author of the parser and output
2216
2199
       formatter - and there is no way to associate a style name
2217
2200
       argument with an interpreted text style role.  Also, it may be
2218
2201
       desirable to use the same mechanism for styling blocks::
2219
2202
2220
2203
           .. style:: motto
2221
2204
              At Bob's Underwear Shop, we'll do anything to get in
2222
2205
              your pants.
2223
2206
2224
2207
           .. style:: disclaimer
2225
2208
              All rights reversed.  Reprint what you like.
2226
2209
2227
2210
    .. [#] There may be sufficient need for a "style" mechanism to
2228
2211
       warrant simpler syntax such as an extension to the interpreted
2229
2212
       text role syntax.  The substitution mechanism is cumbersome for
2230
2213
       simple text styling.
2231
2214
2232
2215
Templates
2233
2216
    Inline markup may be used for later processing by a template
2234
2217
    engine.  For example, a Zope_ author might write::
2235
2218
2236
2219
        Welcome back, |name|!
2237
2220
2238
2221
        .. |name| tal:: replace user/getUserName
2239
2222
2240
2223
    After processing, this ZPT output would result::
2241
2224
2242
2225
        Welcome back,
2243
2226
        <span tal:replace="user/getUserName">name</span>!
2244
2227
2245
2228
    Zope would then transform this to something like "Welcome back,
2246
2229
    David!" during a session with an actual user.
2247
2230
2248
2231
Replacement text
2249
2232
    The substitution mechanism may be used for simple macro
2250
2233
    substitution.  This may be appropriate when the replacement text
2251
2234
    is repeated many times throughout one or more documents,
2252
2235
    especially if it may need to change later.  A short example is
2253
2236
    unavoidably contrived::
2254
2237
2255
2238
        |RST|_ is a little annoying to type over and over, especially
2256
2239
        when writing about |RST| itself, and spelling out the
2257
2240
        bicapitalized word |RST| every time isn't really necessary for
2258
2241
        |RST| source readability.
2259
2242
2260
2243
        .. |RST| replace:: reStructuredText
2261
2244
        .. _RST: http://docutils.sourceforge.net/rst.html
2262
2245
2263
2246
    Note the trailing underscore in the first use of a substitution
2264
2247
    reference.  This indicates a reference to the corresponding
2265
2248
    hyperlink target.
2266
2249
2267
2250
    Substitution is also appropriate when the replacement text cannot
2268
2251
    be represented using other inline constructs, or is obtrusively
2269
2252
    long::
2270
2253
2271
2254
        But still, that's nothing compared to a name like
2272
2255
        |j2ee-cas|__.
2273
2256
2274
2257
        .. |j2ee-cas| replace::
2275
2258
           the Java `TM`:super: 2 Platform, Enterprise Edition Client
2276
2259
           Access Services
2277
2260
        __ http://developer.java.sun.com/developer/earlyAccess/
2278
2261
           j2eecas/
2279
2262
2280
2263
    The "replace_" directive has been implemented.
2281
2264
2282
2265
2283
2266
Comments
2284
2267
````````
2285
2268
2286
2269
Doctree element: comment.
2287
2270
2288
2271
Arbitrary indented text may follow the explicit markup start and will
2289
2272
be processed as a comment element.  No further processing is done on
2290
2273
the comment block text; a comment contains a single "text blob".
2291
2274
Depending on the output formatter, comments may be removed from the
2292
2275
processed output.  The only restriction on comments is that they not
2293
2276
use the same syntax as any of the other explicit markup constructs:
2294
2277
substitution definitions, directives, footnotes, citations, or
2295
2278
hyperlink targets.  To ensure that none of the other explicit markup
2296
2279
constructs is recognized, leave the ".." on a line by itself::
2297
2280
2298
2281
    .. This is a comment
    ..
       _so: is this!
    ..
       [and] this!
    ..
       this:: too!
    ..
       |even| this:: !
2307
2290
2308
2291
.. _empty comments:
2309
2292
2310
2293
An explicit markup start followed by a blank line and nothing else
2311
2294
(apart from whitespace) is an "_`empty comment`".  It serves to
2312
2295
terminate a preceding construct, and does **not** consume any indented
2313
2296
text following.  To have a block quote follow a list or any indented
2314
2297
construct, insert an unindented empty comment in-between.
2315
2298
2316
2299
Syntax diagram::
2317
2300
2318
2301
    +-------+----------------------+
    | ".. " | comment              |
    +-------+ block                |
            |                      |
            +----------------------+
2323
2306
2324
2307
2325
2308
Implicit Hyperlink Targets
2326
2309
==========================
2327
2310
2328
2311
Implicit hyperlink targets are generated by section titles, footnotes,
2329
2312
and citations, and may also be generated by extension constructs.
2330
2313
Implicit hyperlink targets otherwise behave identically to explicit
2331
2314
`hyperlink targets`_.
2332
2315
2333
2316
Problems of ambiguity due to conflicting duplicate implicit and
2334
2317
explicit reference names are avoided by following this procedure:
2335
2318
2336
2319
1. `Explicit hyperlink targets`_ override any implicit targets having
2337
2320
   the same reference name.  The implicit hyperlink targets are
2338
2321
   removed, and level-1 (info) system messages are inserted.
2339
2322
2340
2323
2. Duplicate implicit hyperlink targets are removed, and level-1
2341
2324
   (info) system messages inserted.  For example, if two or more
2342
2325
   sections have the same title (such as "Introduction" subsections of
2343
2326
   a rigidly-structured document), there will be duplicate implicit
2344
2327
   hyperlink targets.
2345
2328
2346
2329
3. Duplicate explicit hyperlink targets are removed, and level-2
2347
2330
   (warning) system messages are inserted.  Exception: duplicate
2348
2331
   `external hyperlink targets`_ (identical hyperlink names and
2349
2332
   referenced URIs) do not conflict, and are not removed.
2350
2333
2351
2334
System messages are inserted where target links have been removed.
2352
2335
See "Error Handling" in `PEP 258`_.
2353
2336
2354
2337
The parser must return a set of *unique* hyperlink targets.  The
2355
2338
calling software (such as Docutils_) can warn of unresolvable
2356
2339
links, giving reasons for the messages.
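
The three-step procedure above can be sketched as follows.  This is a
minimal illustration, not the reference implementation:
``resolve_targets``, the ``(name, kind, uri)`` tuples, and the choice
to emit one message per removed target are all assumptions made for
the example.

```python
from collections import defaultdict

INFO, WARNING = 1, 2  # system message levels named in the procedure


def resolve_targets(targets):
    """Resolve duplicate reference names.

    `targets` is an iterable of (name, kind, uri) tuples, where kind is
    "implicit" or "explicit" and uri is None for internal targets.
    Returns (resolved, messages): resolved maps each surviving name to
    (kind, uri); messages is a list of (level, name) pairs.
    """
    by_name = defaultdict(lambda: {"implicit": [], "explicit": []})
    for name, kind, uri in targets:
        by_name[name][kind].append(uri)

    resolved, messages = {}, []
    for name, group in by_name.items():
        implicit, explicit = group["implicit"], group["explicit"]
        if explicit:
            # Step 1: explicit targets override implicit ones (info).
            messages.extend((INFO, name) for _ in implicit)
            same_external = explicit[0] is not None and len(set(explicit)) == 1
            if len(explicit) == 1 or same_external:
                # Exception in step 3: identical external targets do not conflict.
                resolved[name] = ("explicit", explicit[0])
            else:
                # Step 3: duplicate explicit targets are removed (warning).
                messages.extend((WARNING, name) for _ in explicit)
        elif len(implicit) == 1:
            resolved[name] = ("implicit", None)
        else:
            # Step 2: duplicate implicit targets are removed (info).
            messages.extend((INFO, name) for _ in implicit)
    return resolved, messages
```

The returned mapping contains only unique names, which matches the
requirement that the parser return a set of unique hyperlink targets.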
2357
2340
2358
2341
2359
2342
Inline Markup
2360
2343
=============
2361
2344
2362
2345
In reStructuredText, inline markup applies to words or phrases within
2363
2346
a text block.  The same whitespace and punctuation that serve to
delimit words in written text are used to delimit the inline markup
2365
2348
syntax constructs.  The text within inline markup may not begin or end
2366
2349
with whitespace.  Arbitrary `character-level inline markup`_ is
2367
2350
supported although not encouraged.  Inline markup cannot be nested.
2368
2351
2369
2352
There are nine inline markup constructs.  Five of the constructs use
2370
2353
identical start-strings and end-strings to indicate the markup:
2371
2354
2372
2355
- emphasis_: "*"
2373
2356
- `strong emphasis`_: "**"
2374
2357
- `interpreted text`_: "`"
2375
2358
- `inline literals`_: "``"
2376
2359
- `substitution references`_: "|"
2377
2360
2378
2361
Three constructs use different start-strings and end-strings:
2379
2362
2380
2363
- `inline internal targets`_: "_`" and "`"
2381
2364
- `footnote references`_: "[" and "]_"
2382
2365
- `hyperlink references`_: "`" and "\`_" (phrases), or just a
2383
2366
  trailing "_" (single words)
2384
2367
2385
2368
`Standalone hyperlinks`_ are recognized implicitly, and use no extra
2386
2369
markup.
2387
2370
2388
2371
Inline markup recognition rules
2389
2372
-------------------------------
2390
2373
2391
2374
Inline markup start-strings and end-strings are only recognized if all of
2392
2375
the following conditions are met:
2393
2376
2394
2377
1. Inline markup start-strings must start a text block or be
2395
2378
   immediately preceded by
2396
2379
2397
2380
   * whitespace,
2398
2381
   * one of the ASCII characters ``- : / ' " < ( [ {`` or
2399
2382
   * a non-ASCII punctuation character with `Unicode category`_
2400
2383
     `Pd` (Dash),
2401
2384
     `Po` (Other),
2402
2385
     `Ps` (Open),
2403
2386
     `Pi` (Initial quote), or
2404
2387
     `Pf` (Final quote) [#PiPf]_.
2405
2388
2406
2389
2. Inline markup start-strings must be immediately followed by
2407
2390
   non-whitespace.
2408
2391
2409
2392
3. Inline markup end-strings must be immediately preceded by
2410
2393
   non-whitespace.
2411
2394
2412
2395
4. Inline markup end-strings must end a text block or be immediately
2413
2396
   followed by
2414
2397
2415
2398
   * whitespace,
2416
2399
   * one of the ASCII characters ``- . , : ; ! ? \ / ' " ) ] } >`` or
2417
2400
   * a non-ASCII punctuation character with `Unicode category`_
2418
2401
     `Pd` (Dash),
2419
2402
     `Po` (Other),
2420
2403
     `Pe` (Close),
2421
2404
     `Pf` (Final quote), or
2422
2405
     `Pi` (Initial quote) [#PiPf]_.
2423
2406
2424
2407
5. If an inline markup start-string is immediately preceded by one of the
2425
2408
   ASCII characters ``' " < ( [ {``, or a character with Unicode character
2426
2409
   category `Ps`, `Pi`, or `Pf`, it must not be followed by the
2427
2410
   corresponding [#corresponding-quotes]_ closing character from
2428
2411
   ``' " ) ] } >`` or the categories `Pe`, `Pf`, or `Pi`.
2429
2412
2430
2413
6. An inline markup end-string must be separated by at least one
2431
2414
   character from the start-string.
2432
2415
2433
2416
7. An unescaped backslash preceding a start-string or end-string will
2434
2417
   disable markup recognition, except for the end-string of `inline
2435
2418
   literals`_.  See `Escaping Mechanism`_ above for details.
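
Rules 1 and 4 lend themselves to a small character test.  The sketch
below uses Python's standard ``unicodedata`` module; the function
names and the convention of passing an empty string for "start or end
of the text block" are assumptions made for illustration, and rules 2,
3, 5, 6, and 7 are not covered.

```python
import unicodedata

# ASCII characters listed in rules 1 and 4.
START_PRECEDERS = set("-:/'\"<([{")
END_FOLLOWERS = set("-.,:;!?\\/'\")]}>")


def may_precede_start(ch):
    """Rule 1: may `ch` come right before a start-string?

    Pass "" when the start-string begins the text block.
    """
    if ch == "" or ch.isspace() or ch in START_PRECEDERS:
        return True
    # Non-ASCII punctuation in the categories named by rule 1.
    return ord(ch) > 127 and unicodedata.category(ch) in {"Pd", "Po", "Ps", "Pi", "Pf"}


def may_follow_end(ch):
    """Rule 4: may `ch` come right after an end-string?

    Pass "" when the end-string ends the text block.
    """
    if ch == "" or ch.isspace() or ch in END_FOLLOWERS:
        return True
    # Non-ASCII punctuation in the categories named by rule 4.
    return ord(ch) > 127 and unicodedata.category(ch) in {"Pd", "Po", "Pe", "Pf", "Pi"}
```

For instance, the digit in "2*x" fails the rule 1 test, which is one
reason the asterisk there is not treated as an emphasis start-string.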
2436
2419
2437
2420
.. [#PiPf] `Pi` (Punctuation, Initial quote) characters are "usually
   opening, sometimes closing". `Pf` (Punctuation, Final quote)
   characters are "usually closing, sometimes opening".
2440
2423
2441
2424
.. [#corresponding-quotes] For quotes, corresponding characters can be
2442
2425
   any of the `quotation marks in international usage`_
2443
2426
2444
2427
.. _Unicode category:
2445
2428
   http://www.unicode.org/Public/5.1.0/ucd/UCD.html#General_Category_Values
2446
2429
2447
2430
.. _quotation marks in international usage:
2448
2431
   http://en.wikipedia.org/wiki/Quotation_mark,_non-English_usage
2449
2432
2450
2433
The inline markup recognition rules were devised to allow 90% of non-markup
2451
2434
uses of "*", "`", "_", and "|" without escaping. For example, none of the
2452
2435
following terms are recognized as containing inline markup strings:
2453
2436
2454
2437
- 2*x a**b O(N**2) e**(x*y) f(x)*f(y) a|b file*.* (breaks 1)
- 2 * x  a ** b  (* BOM32_* ` `` _ __ | (breaks 2)
- "*" '|' (*) [*] {*} <*>
  ‘*’ ‚*‘ ‘*‚ ’*’ ‚*’
  “*” „*“ “*„ ”*” „*”
  »*« ›*‹ «*» »*» ›*› (breaks 5)
- || (breaks 6)
- __init__ __init__()
2462
2445
2463
2446
No escaping is required inside the following inline markup examples:
2464
2447
2465
2448
- *2 * x  *a **b *.txt* (breaks 3)
2466
2449
- *2*x a**b O(N**2) e**(x*y) f(x)*f(y) a*(1+2)* (breaks 4)
2467
2450
2468
2451
It may be desirable to use `inline literals`_ for some of these anyhow,
2469
2452
especially if they represent code snippets.  It's a judgment call.
2470
2453
2471
2454
These cases *do* require either literal-quoting or escaping to avoid
2472
2455
misinterpretation:
2473
2456
2474
2457
    \*4, class\_, \*args, \**kwargs, \`TeX-quoted', \*ML, \*.txt
2475
2458
2476
2459
In most use cases, `inline literals`_ or `literal blocks`_ are the best
2477
2460
choice (by default, this also selects a monospaced font)::
2478
2461
2479
2462
    *4, class_, *args, **kwargs, `TeX-quoted', *ML, *.txt
2480
2463
2481
2464
Recognition order
2482
2465
-----------------
2483
2466
2484
2467
Inline markup delimiter characters are used for multiple constructs,
2485
2468
so to avoid ambiguity there must be a specific recognition order for
2486
2469
each character.  The inline markup recognition order is as follows:
2487
2470
2488
2471
- Asterisks: `Strong emphasis`_ ("**") is recognized before emphasis_
2489
2472
  ("*").
2490
2473
2491
2474
- Backquotes: `Inline literals`_ ("``"), `inline internal targets`_
2492
2475
  (leading "_`", trailing "`"), are mutually independent, and are
2493
2476
  recognized before phrase `hyperlink references`_ (leading "`",
2494
2477
  trailing "\`_") and `interpreted text`_ ("`").
2495
2478
2496
2479
- Trailing underscores: Footnote references ("[" + label + "]_") and
2497
2480
  simple `hyperlink references`_ (name + trailing "_") are mutually
2498
2481
  independent.
2499
2482
2500
2483
- Vertical bars: `Substitution references`_ ("|") are independently
2501
2484
  recognized.
2502
2485
2503
2486
- `Standalone hyperlinks`_ are the last to be recognized.
2504
2487
2505
2488
2506
2489
Character-Level Inline Markup
2507
2490
-----------------------------
2508
2491
2509
2492
It is possible to mark up individual characters within a word with
2510
2493
backslash escapes (see `Escaping Mechanism`_ above).  Backslash
2511
2494
escapes can be used to allow arbitrary text to immediately follow
2512
2495
inline markup::
2513
2496
2514
2497
    Python ``list``\s use square bracket syntax.
2515
2498
2516
2499
The backslash will disappear from the processed document.  The word
2517
2500
"list" will appear as inline literal text, and the letter "s" will
2518
2501
immediately follow it as normal text, with no space in-between.
2519
2502
2520
2503
Arbitrary text may immediately precede inline markup using
2521
2504
backslash-escaped whitespace::
2522
2505
2523
2506
    Possible in *re*\ ``Structured``\ *Text*, though not encouraged.
2524
2507
2525
2508
The backslashes and spaces separating "re", "Structured", and "Text"
2526
2509
above will disappear from the processed document.
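
The effect of the backslash escapes on the text can be sketched in
isolation.  ``remove_escapes`` is a hypothetical helper that ignores
all other markup (in particular, it does not know that backslashes
inside inline literals are kept); it only shows that escaped
whitespace vanishes and that any other escaped character stands for
itself.

```python
def remove_escapes(text):
    """Drop backslash escapes: escaped whitespace disappears entirely,
    any other escaped character is kept without its backslash."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            if not nxt.isspace():
                out.append(nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```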
2527
2510
2528
2511
.. CAUTION::
2529
2512
2530
2513
   The use of backslash-escapes for character-level inline markup is
2531
2514
   not encouraged.  Such use is ugly and detrimental to the
2532
2515
   unprocessed document's readability.  Please use this feature
2533
2516
   sparingly and only where absolutely necessary.
2534
2517
2535
2518
2536
2519
Emphasis
2537
2520
--------
2538
2521
2539
2522
Doctree element: emphasis.
2540
2523
2541
2524
Start-string = end-string = "*".
2542
2525
2543
2526
Text enclosed by single asterisk characters is emphasized::
2544
2527
2545
2528
    This is *emphasized text*.
2546
2529
2547
2530
Emphasized text is typically displayed in italics.
2548
2531
2549
2532
2550
2533
Strong Emphasis
2551
2534
---------------
2552
2535
2553
2536
Doctree element: strong.
2554
2537
2555
2538
Start-string = end-string = "**".
2556
2539
2557
2540
Text enclosed by double-asterisks is emphasized strongly::
2558
2541
2559
2542
    This is **strong text**.
2560
2543
2561
2544
Strongly emphasized text is typically displayed in boldface.
2562
2545
2563
2546
2564
2547
Interpreted Text
2565
2548
----------------
2566
2549
2567
2550
Doctree element: depends on the explicit or implicit role and
2568
2551
processing.
2569
2552
2570
2553
Start-string = end-string = "`".
2571
2554
2572
2555
Interpreted text is text that is meant to be related, indexed, linked,
2573
2556
summarized, or otherwise processed, but the text itself is typically
2574
2557
left alone.  Interpreted text is enclosed by single backquote
2575
2558
characters::
2576
2559
2577
2560
    This is `interpreted text`.
2578
2561
2579
2562
The "role" of the interpreted text determines how the text is
2580
2563
interpreted.  The role may be inferred implicitly (as above; the
2581
2564
"default role" is used) or indicated explicitly, using a role marker.
2582
2565
A role marker consists of a colon, the role name, and another colon.
2583
2566
A role name is a single word consisting of alphanumerics plus isolated
2584
2567
internal hyphens, underscores, plus signs, colons, and periods;
2585
2568
no whitespace or other characters are allowed.  A role marker is
2586
2569
either a prefix or a suffix to the interpreted text, whichever reads
2587
2570
better; it's up to the author::
2588
2571
2589
2572
    :role:`interpreted text`
2590
2573
2591
2574
    `interpreted text`:role:
2592
2575
2593
2576
Interpreted text allows extensions to the available inline descriptive
2594
2577
markup constructs.  To emphasis_, `strong emphasis`_, `inline
2595
2578
literals`_, and `hyperlink references`_, we can add "title reference",
2596
2579
"index entry", "acronym", "class", "red", "blinking" or anything else
2597
2580
we want.  Only pre-determined roles are recognized; unknown roles will
2598
2581
generate errors.  A core set of standard roles is implemented in the
2599
2582
reference parser; see `reStructuredText Interpreted Text Roles`_ for
2600
2583
individual descriptions.  The role_ directive can be used to define
2601
2584
custom interpreted text roles.  In addition, applications may support
2602
2585
specialized roles.
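
The role marker syntax can be approximated with regular expressions.
This is a sketch under assumptions: ``parse_interpreted`` is a
hypothetical helper, "alphanumerics" is narrowed to ASCII letters and
digits, and the input is taken to be exactly one interpreted text
construct with no surrounding text.

```python
import re

# Alphanumerics plus isolated internal "-", "_", "+", ":" and "." characters.
ROLE_NAME = r"[A-Za-z0-9]+(?:[-_+:.][A-Za-z0-9]+)*"
PREFIX = re.compile(r"^:(%s):`(.+)`$" % ROLE_NAME)
SUFFIX = re.compile(r"^`(.+)`:(%s):$" % ROLE_NAME)
PLAIN = re.compile(r"^`(.+)`$")


def parse_interpreted(markup):
    """Return (role, text) for one interpreted text construct.

    role is None when the default role applies; the result is None when
    the markup does not match at all.
    """
    m = PREFIX.match(markup)
    if m:
        return m.group(1), m.group(2)
    m = SUFFIX.match(markup)
    if m:
        return m.group(2), m.group(1)
    m = PLAIN.match(markup)
    if m:
        return None, m.group(1)
    return None
```

Prefix and suffix markers yield the same result, which reflects that
the choice between them is purely a matter of readability.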
2603
2586
2604
2587
2605
2588
Inline Literals
2606
2589
---------------
2607
2590
2608
2591
Doctree element: literal.
2609
2592
2610
2593
Start-string = end-string = "``".
2611
2594
2612
2595
Text enclosed by double-backquotes is treated as inline literals::
2613
2596
2614
2597
    This text is an example of ``inline literals``.
2615
2598
2616
2599
Inline literals may contain any characters except two adjacent
2617
2600
backquotes in an end-string context (according to the recognition
2618
2601
rules above).  No markup interpretation (including backslash-escape
2619
2602
interpretation) is done within inline literals.
2620
2603
2621
2604
Line breaks are *not* preserved in inline literals.  Although a
2622
2605
reStructuredText parser will preserve runs of spaces in its output,
2623
2606
the final representation of the processed document is dependent on the
2624
2607
output formatter, thus the preservation of whitespace cannot be
2625
2608
guaranteed.  If the preservation of line breaks and/or other
2626
2609
whitespace is important, `literal blocks`_ should be used.
2627
2610
2628
2611
Inline literals are useful for short code snippets.  For example::
2629
2612
2630
2613
    The regular expression ``[+-]?(\d+(\.\d*)?|\.\d+)`` matches
2631
2614
    floating-point numbers (without exponents).
2632
2615
2633
2616
2634
2617
Hyperlink References
2635
2618
--------------------
2636
2619
2637
2620
Doctree element: reference.
2638
2621
2639
2622
- Named hyperlink references:
2640
2623
2641
2624
  - Start-string = "" (empty string), end-string = "_".
2642
2625
  - Start-string = "`", end-string = "\`_".  (Phrase references.)
2643
2626
2644
2627
- Anonymous hyperlink references:
2645
2628
2646
2629
  - Start-string = "" (empty string), end-string = "__".
2647
2630
  - Start-string = "`", end-string = "\`__".  (Phrase references.)
2648
2631
2649
2632
Hyperlink references are indicated by a trailing underscore, "_",
2650
2633
except for `standalone hyperlinks`_ which are recognized
2651
2634
independently.  The underscore can be thought of as a right-pointing
2652
2635
arrow.  The trailing underscores point away from hyperlink references,
2653
2636
and the leading underscores point toward `hyperlink targets`_.
2654
2637
2655
2638
Hyperlinks consist of two parts.  In the text body, there is a source
2656
2639
link, a reference name with a trailing underscore (or two underscores
2657
2640
for `anonymous hyperlinks`_)::
2658
2641
2659
2642
    See the Python_ home page for info.
2660
2643
2661
2644
A target link with a matching reference name must exist somewhere else
2662
2645
in the document.  (See `Hyperlink Targets`_ for a full description.)
2663
2646
2664
2647
`Anonymous hyperlinks`_ (which see) do not use reference names to
2665
2648
match references to targets, but otherwise behave similarly to named
2666
2649
hyperlinks.
2667
2650
2668
2651
2669
2652
Embedded URIs
2670
2653
`````````````
2671
2654
2672
2655
A hyperlink reference may directly embed a target URI inline, within
2673
2656
angle brackets ("<...>") as follows::
2674
2657
2675
2658
    See the `Python home page <http://www.python.org>`_ for info.
2676
2659
2677
2660
This is exactly equivalent to::
2678
2661
2679
2662
    See the `Python home page`_ for info.
2680
2663
2681
2664
    .. _Python home page: http://www.python.org
2682
2665
2683
2666
The bracketed URI must be preceded by whitespace and be the last text
2684
2667
before the end string.  With a single trailing underscore, the
2685
2668
reference is named and the same target URI may be referred to again.
2686
2669
2687
2670
With two trailing underscores, the reference and target are both
2688
2671
anonymous, and the target cannot be referred to again.  These are
2689
2672
"one-off" hyperlinks.  For example::
2690
2673
2691
2674
    `RFC 2396 <http://www.rfc-editor.org/rfc/rfc2396.txt>`__ and `RFC
2692
2675
    2732 <http://www.rfc-editor.org/rfc/rfc2732.txt>`__ together
2693
2676
    define the syntax of URIs.
2694
2677
2695
2678
Equivalent to::
2696
2679
2697
2680
    `RFC 2396`__ and `RFC 2732`__ together define the syntax of URIs.
2698
2681
2699
2682
    __ http://www.rfc-editor.org/rfc/rfc2396.txt
2700
2683
    __ http://www.rfc-editor.org/rfc/rfc2732.txt
2701
2684
2702
2685
If reference text happens to end with angle-bracketed text that is
2703
2686
*not* a URI, the open-angle-bracket needs to be backslash-escaped.
2704
2687
For example, here is a reference to a title describing a tag::
2705
2688
2706
2689
    See `HTML Element: \<a>`_ below.
2707
2690
2708
2691
The reference text may also be omitted, in which case the URI will be
2709
2692
duplicated for use as the reference text.  This is useful for relative
2710
2693
URIs where the address or file name is also the desired reference
2711
2694
text::
2712
2695
2713
2696
    See `<a_named_relative_link>`_ or `<an_anonymous_relative_link>`__
2714
2697
    for details.
2715
2698
2716
2699
.. CAUTION::
2717
2700
2718
2701
   This construct offers easy authoring and maintenance of hyperlinks
2719
2702
   at the expense of general readability.  Inline URIs, especially
2720
2703
   long ones, inevitably interrupt the natural flow of text.  For
2721
2704
   documents meant to be read in source form, the use of independent
2722
2705
   block-level `hyperlink targets`_ is **strongly recommended**.  The
2723
2706
   embedded URI construct is most suited to documents intended *only*
2724
2707
   to be read in processed form.
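
Splitting the content of a phrase reference into reference text and
embedded URI might look like the sketch below.  ``split_embedded_uri``
and its regular expression are hypothetical; the input is assumed to
be the text between the backquotes, with backslash escapes still in
place.

```python
import re

# Optional reference text and whitespace, then an unescaped "<uri>" that
# is the last text before the end-string.
EMBEDDED = re.compile(
    r"^(?:(?P<text>.*?)\s+)?(?<!\\)<(?P<uri>[^<>]+)>$", re.DOTALL
)


def split_embedded_uri(content):
    """Return (reference_text, uri); uri is None if nothing is embedded."""
    m = EMBEDDED.match(content)
    if not m:
        return content, None
    uri = m.group("uri")
    # With the reference text omitted, the URI doubles as the text.
    return (m.group("text") or uri), uri
```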
2725
2708
2726
2709
2727
2710
Inline Internal Targets
2728
2711
------------------------
2729
2712
2730
2713
Doctree element: target.
2731
2714
2732
2715
Start-string = "_`", end-string = "`".
2733
2716
2734
2717
Inline internal targets are the equivalent of explicit `internal
2735
2718
hyperlink targets`_, but may appear within running text.  The syntax
2736
2719
begins with an underscore and a backquote, is followed by a hyperlink
2737
2720
name or phrase, and ends with a backquote.  Inline internal targets
2738
2721
may not be anonymous.
2739
2722
2740
2723
For example, the following paragraph contains a hyperlink target named
2741
2724
"Norwegian Blue"::
2742
2725
2743
2726
    Oh yes, the _`Norwegian Blue`.  What's, um, what's wrong with it?
2744
2727
2745
2728
See `Implicit Hyperlink Targets`_ for the resolution of duplicate
2746
2729
reference names.
2747
2730
2748
2731
2749
2732
Footnote References
2750
2733
-------------------
2751
2734
2752
2735
Doctree element: footnote_reference.
2753
2736
2754
2737
Start-string = "[", end-string = "]_".
2755
2738
2756
2739
Each footnote reference consists of a square-bracketed label followed
2757
2740
by a trailing underscore.  Footnote labels are one of:
2758
2741
2759
2742
- one or more digits (i.e., a number),
2760
2743
2761
2744
- a single "#" (denoting `auto-numbered footnotes`_),
2762
2745
2763
2746
- a "#" followed by a simple reference name (an `autonumber label`_),
2764
2747
  or
2765
2748
2766
2749
- a single "*" (denoting `auto-symbol footnotes`_).
2767
2750
2768
2751
For example::
2769
2752
2770
2753
    Please RTFM [1]_.
2771
2754
2772
2755
    .. [1] Read The Fine Manual
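
The four kinds of footnote label can be told apart with a few pattern
checks.  In this sketch, ``classify_footnote_label`` and its return
strings are hypothetical, and the pattern used for a simple reference
name (ASCII alphanumerics with isolated internal hyphens, underscores,
and periods) is an assumption.

```python
import re

# Assumed shape of a simple reference name.
SIMPLE_NAME = r"[A-Za-z0-9]+(?:[-_.][A-Za-z0-9]+)*"


def classify_footnote_label(label):
    """Classify the text between "[" and "]_" of a footnote reference."""
    if re.fullmatch(r"[0-9]+", label):
        return "numbered"
    if label == "#":
        return "auto-numbered"
    if re.fullmatch("#" + SIMPLE_NAME, label):
        return "autonumber-label"
    if label == "*":
        return "auto-symbol"
    return None  # not a footnote label
```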
2773
2756
2774
2757
2775
2758
Citation References
2776
2759
-------------------
2777
2760
2778
2761
Doctree element: citation_reference.
2779
2762
2780
2763
Start-string = "[", end-string = "]_".
2781
2764
2782
2765
Each citation reference consists of a square-bracketed label followed
2783
2766
by a trailing underscore.  Citation labels are simple `reference
2784
2767
names`_ (case-insensitive single words, consisting of alphanumerics
2785
2768
plus internal hyphens, underscores, and periods; no whitespace).
2786
2769
2787
2770
For example::
2788
2771
2789
2772
    Here is a citation reference: [CIT2002]_.
2790
2773
2791
2774
See Citations_ for the citation itself.
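
A citation label check can be sketched the same way.  The helper names
are hypothetical, "alphanumerics" is narrowed to ASCII, and excluding
all-digit labels (which read as footnote labels) is an assumption of
this example.

```python
import re

# Alphanumerics plus internal hyphens, underscores, and periods.
CITATION_LABEL = re.compile(r"[A-Za-z0-9]+(?:[-_.][A-Za-z0-9]+)*")


def is_citation_label(label):
    """True if `label` looks like a citation label rather than a footnote."""
    if label.isascii() and label.isdigit():
        return False  # assumption: all-digit labels are footnote labels
    return CITATION_LABEL.fullmatch(label) is not None


def same_citation(a, b):
    """Citation labels are compared case-insensitively."""
    return a.lower() == b.lower()
```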
2792
2775
2793
2776
2794
2777
Substitution References
2795
2778
-----------------------
2796
2779
2797
2780
Doctree element: substitution_reference, reference.
2798
2781
2799
2782
Start-string = "|", end-string = "|" (optionally followed by "_" or
2800
2783
"__").
2801
2784
2802
2785
Vertical bars are used to bracket the substitution reference text.  A
2803
2786
substitution reference may also be a hyperlink reference by appending
2804
2787
a "_" (named) or "__" (anonymous) suffix; the substitution text is
2805
2788
used for the reference text in the named case.
2806
2789
2807
2790
The processing system replaces substitution references with the
2808
2791
processed contents of the corresponding `substitution definitions`_
2809
2792
(which see for the definition of "correspond").  Substitution
2810
2793
definitions produce inline-compatible elements.
2811
2794
2812
2795
Examples::
2813
2796
2814
2797
    This is a simple |substitution reference|.  It will be replaced by
2815
2798
    the processing system.
2816
2799
2817
2800
    This is a combination |substitution and hyperlink reference|_.  In
2818
2801
    addition to being replaced, the replacement text or element will
2819
2802
    refer to the "substitution and hyperlink reference" target.


Standalone Hyperlinks
---------------------

Doctree element: reference.

Start-string = end-string = "" (empty string).

A URI (absolute URI [#URI]_ or standalone email address) within a text
block is treated as a general external hyperlink with the URI itself
as the link's text.  For example::

    See http://www.python.org for info.

would be marked up in HTML as::

    See <a href="http://www.python.org">http://www.python.org</a> for
    info.

Two forms of URI are recognized:

1. Absolute URIs.  These consist of a scheme, a colon (":"), and a
   scheme-specific part whose interpretation depends on the scheme.

   The scheme is the name of the protocol, such as "http", "ftp",
   "mailto", or "telnet".  The scheme consists of an initial letter,
   followed by letters, numbers, and/or "+", "-", ".".  Recognition is
   limited to known schemes, per the `Official IANA Registry of URI
   Schemes`_ and the W3C's `Retired Index of WWW Addressing Schemes`_.

   The scheme-specific part of the resource identifier may be either
   hierarchical or opaque:

   - Hierarchical identifiers begin with one or two slashes and may
     use slashes to separate hierarchical components of the path.
     Examples are web pages and FTP sites::

         http://www.python.org

         ftp://ftp.python.org/pub/python

   - Opaque identifiers do not begin with slashes.  Examples are
     email addresses and newsgroups::

         mailto:someone@somewhere.com

         news:comp.lang.python

   With queries, fragments, and %-escape sequences, URIs can become
   quite complicated.  A reStructuredText parser must be able to
   recognize any absolute URI, as defined in RFC2396_ and RFC2732_.

2. Standalone email addresses, which are treated as if they were
   absolute URIs with a "mailto:" scheme.  Example::

       someone@somewhere.com

Punctuation at the end of a URI is not considered part of the URI,
unless the URI is terminated by a closing angle bracket (">").
Backslashes may be used in URIs to escape markup characters,
specifically asterisks ("*") and underscores ("_") which are valid URI
characters (see `Escaping Mechanism`_ above).
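
The two recognized forms and the trailing-punctuation rule can be
sketched in Python as follows.  This is only an illustration under
stated assumptions: ``KNOWN_SCHEMES`` is a small hypothetical subset
of the registered schemes, the email pattern is deliberately loose,
and the sketch handles neither the closing angle bracket exception nor
backslash escapes.  It does not reproduce the parser's own patterns.

```python
import re

# Illustrative subset only; the specification defers to the IANA and
# W3C scheme lists.
KNOWN_SCHEMES = {'http', 'https', 'ftp', 'mailto', 'news', 'telnet'}

# Scheme syntax from the text: an initial letter, then letters,
# digits, "+", "-" or ".", followed by a colon.
SCHEME = re.compile(r'^([A-Za-z][A-Za-z0-9+.-]*):(.+)$')

# Loose approximation of a standalone email address.
EMAIL = re.compile(r'^[^@\s:/]+@[^@\s:/]+\.[^@\s:/]+$')

TRAILING_PUNCTUATION = '.,;:!?\'")'


def standalone_uri(token):
    """Return the link target for `token`, or None if it is not a URI."""
    # Punctuation at the end is not considered part of the URI.
    token = token.rstrip(TRAILING_PUNCTUATION)
    match = SCHEME.match(token)
    if match and match.group(1).lower() in KNOWN_SCHEMES:
        return token
    if EMAIL.match(token):
        # Treated as an absolute URI with a "mailto:" scheme.
        return 'mailto:' + token
    return None
```

Restricting recognition to known schemes keeps ordinary text such as
"note:this" from being treated as a hyperlink.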

.. [#URI] Uniform Resource Identifier.  URIs are a general form of
   URLs (Uniform Resource Locators).  For the syntax of URIs see
   RFC2396_ and RFC2732_.


Units
=====

(New in Docutils 0.3.10.)

All measures consist of a positive floating point number in standard
(non-scientific) notation and a unit, possibly separated by one or
more spaces.

Units are only supported where explicitly mentioned in the reference
manuals.


Length Units
------------

The following length units are supported by the reStructuredText
parser:

* em (ems, the height of the element's font)
* ex (x-height, the height of the letter "x")
* px (pixels, relative to the canvas resolution)
* in (inches; 1in=2.54cm)
* cm (centimeters; 1cm=10mm)
* mm (millimeters)
* pt (points; 1pt=1/72in)
* pc (picas; 1pc=12pt)

This set corresponds to the `length units in CSS`_.

(List and explanations taken from
http://www.htmlhelp.com/reference/css/units.html#length.)

The following are all valid length values: "1.5em", "20 mm", ".5in".

Length values without unit are completed with a writer-dependent
default (e.g. px with `html4css1`, pt with `latex2e`). See the writer
specific documentation in the `user doc`__ for details.
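
As a rough illustration of the measure syntax, the following Python
sketch splits a length value into number and unit and completes a
missing unit with a default.  The names ``MEASURE`` and
``parse_length`` and the ``default_unit`` parameter are hypothetical;
this is not the code Docutils uses to validate lengths, and the sketch
does not reject a value of zero.

```python
import re

LENGTH_UNITS = ('em', 'ex', 'px', 'in', 'cm', 'mm', 'pt', 'pc')

# An unsigned number in non-scientific notation, optional spaces,
# then an optional unit.
MEASURE = re.compile(
    r'^(?P<number>[0-9]+\.?[0-9]*|\.[0-9]+) *(?P<unit>%s)?$'
    % '|'.join(LENGTH_UNITS))


def parse_length(value, default_unit='px'):
    """Return (number, unit) for `value`; raise ValueError if invalid."""
    match = MEASURE.match(value.strip())
    if match is None:
        raise ValueError('not a valid length: %r' % value)
    # A missing unit is completed with the (writer-dependent) default.
    return float(match.group('number')), match.group('unit') or default_unit
```

A writer whose default unit is points would call
``parse_length(value, default_unit='pt')`` under these assumptions.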

.. _length units in CSS:
   http://www.w3.org/TR/CSS2/syndata.html#length-units

__ ../../user/

Percentage Units
----------------

Percentage values have a percent sign ("%") as unit.  Percentage
values are relative to other values, depending on the context in which
they occur.


----------------
 Error Handling
----------------

Doctree element: system_message, problematic.

Markup errors are handled according to the specification in `PEP
258`_.


.. _reStructuredText: http://docutils.sourceforge.net/rst.html
.. _Docutils: http://docutils.sourceforge.net/
.. _The Docutils Document Tree: ../doctree.html
.. _Docutils Generic DTD: ../docutils.dtd
.. _transforms:
   http://docutils.sourceforge.net/docutils/transforms/
.. _Grouch: http://www.mems-exchange.org/software/grouch/
.. _RFC822: http://www.rfc-editor.org/rfc/rfc822.txt
.. _DocTitle transform:
.. _DocInfo transform:
   http://docutils.sourceforge.net/docutils/transforms/frontmatter.py
.. _getopt.py:
   http://www.python.org/doc/current/lib/module-getopt.html
.. _GNU libc getopt_long():
   http://www.gnu.org/software/libc/manual/html_node/Getopt-Long-Options.html
.. _doctest module:
   http://www.python.org/doc/current/lib/module-doctest.html
.. _Emacs table mode: http://table.sourceforge.net/
.. _Official IANA Registry of URI Schemes:
   http://www.iana.org/assignments/uri-schemes
.. _Retired Index of WWW Addressing Schemes:
   http://www.w3.org/Addressing/schemes.html
.. _World Wide Web Consortium: http://www.w3.org/
.. _HTML Techniques for Web Content Accessibility Guidelines:
   http://www.w3.org/TR/WCAG10-HTML-TECHS/#link-text
.. _image: directives.html#image
.. _replace: directives.html#replace
.. _meta: directives.html#meta
.. _figure: directives.html#figure
.. _admonition: directives.html#admonitions
.. _role: directives.html#custom-interpreted-text-roles
.. _reStructuredText Directives: directives.html
.. _reStructuredText Interpreted Text Roles: roles.html
.. _RFC2396: http://www.rfc-editor.org/rfc/rfc2396.txt
.. _RFC2732: http://www.rfc-editor.org/rfc/rfc2732.txt
.. _Zope: http://www.zope.com/
.. _PEP 258: ../../peps/pep-0258.html



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:

=== added directory '.pc/support-aliases-in-references.diff/docutils'
=== added directory '.pc/support-aliases-in-references.diff/docutils/parsers'
=== added directory '.pc/support-aliases-in-references.diff/docutils/parsers/rst'
=== added file '.pc/support-aliases-in-references.diff/docutils/parsers/rst/states.py'
--- .pc/support-aliases-in-references.diff/docutils/parsers/rst/states.py	1970-01-01 00:00:00 +0000
+++ .pc/support-aliases-in-references.diff/docutils/parsers/rst/states.py	2013-03-19 07:30:29 +0000
@@ -0,0 +1,3052 @@
# $Id: states.py 7495 2012-08-16 14:50:57Z milde $
# Author: David Goodger <goodger@python.org>
# Copyright: This module has been placed in the public domain.

"""
This is the ``docutils.parsers.rst.states`` module, the core of
the reStructuredText parser.  It defines the following:

:Classes:
    - `RSTStateMachine`: reStructuredText parser's entry point.
    - `NestedStateMachine`: recursive StateMachine.
    - `RSTState`: reStructuredText State superclass.
    - `Inliner`: For parsing inline markup.
    - `Body`: Generic classifier of the first line of a block.
    - `SpecializedBody`: Superclass for compound element members.
    - `BulletList`: Second and subsequent bullet_list list_items
    - `DefinitionList`: Second+ definition_list_items.
    - `EnumeratedList`: Second+ enumerated_list list_items.
    - `FieldList`: Second+ fields.
    - `OptionList`: Second+ option_list_items.
    - `RFC2822List`: Second+ RFC2822-style fields.
    - `ExtensionOptions`: Parses directive option fields.
    - `Explicit`: Second+ explicit markup constructs.
    - `SubstitutionDef`: For embedded directives in substitution definitions.
    - `Text`: Classifier of second line of a text block.
    - `SpecializedText`: Superclass for continuation lines of Text-variants.
    - `Definition`: Second line of potential definition_list_item.
    - `Line`: Second line of overlined section title or transition marker.
    - `Struct`: An auxiliary collection class.

:Exception classes:
    - `MarkupError`
    - `ParserError`
    - `MarkupMismatch`

:Functions:
    - `escape2null()`: Return a string, escape-backslashes converted to nulls.
    - `unescape()`: Return a string, nulls removed or restored to backslashes.

:Attributes:
    - `state_classes`: set of State classes used with `RSTStateMachine`.

Parser Overview
===============

The reStructuredText parser is implemented as a recursive state machine,
examining its input one line at a time.  To understand how the parser works,
please first become familiar with the `docutils.statemachine` module.  In the
description below, references are made to classes defined in this module;
please see the individual classes for details.

Parsing proceeds as follows:

1. The state machine examines each line of input, checking each of the
   transition patterns of the state `Body`, in order, looking for a match.
   The implicit transitions (blank lines and indentation) are checked before
   any others.  The 'text' transition is a catch-all (matches anything).

2. The method associated with the matched transition pattern is called.

   A. Some transition methods are self-contained, appending elements to the
      document tree (`Body.doctest` parses a doctest block).  The parser's
      current line index is advanced to the end of the element, and parsing
      continues with step 1.

   B. Other transition methods trigger the creation of a nested state machine,
      whose job is to parse a compound construct ('indent' does a block quote,
      'bullet' does a bullet list, 'overline' does a section [first checking
      for a valid section header], etc.).

      - In the case of lists and explicit markup, a one-off state machine is
        created and run to parse contents of the first item.

      - A new state machine is created and its initial state is set to the
        appropriate specialized state (`BulletList` in the case of the
        'bullet' transition; see `SpecializedBody` for more detail).  This
        state machine is run to parse the compound element (or series of
        explicit markup elements), and returns as soon as a non-member element
        is encountered.  For example, the `BulletList` state machine ends as
        soon as it encounters an element which is not a list item of that
        bullet list.  The optional omission of inter-element blank lines is
        enabled by this nested state machine.

      - The current line index is advanced to the end of the elements parsed,
        and parsing continues with step 1.

   C. The result of the 'text' transition depends on the next line of text.
      The current state is changed to `Text`, under which the second line is
      examined.  If the second line is:

      - Indented: The element is a definition list item, and parsing proceeds
        similarly to step 2.B, using the `DefinitionList` state.

      - A line of uniform punctuation characters: The element is a section
        header; again, parsing proceeds as in step 2.B, and `Body` is still
        used.

      - Anything else: The element is a paragraph, which is examined for
        inline markup and appended to the parent element.  Processing
        continues with step 1.
"""

__docformat__ = 'reStructuredText'


import sys
import re
try:
    import roman
except ImportError:
    import docutils.utils.roman as roman
from types import FunctionType, MethodType

from docutils import nodes, statemachine, utils
from docutils import ApplicationError, DataError
from docutils.statemachine import StateMachineWS, StateWS
from docutils.nodes import fully_normalize_name as normalize_name
from docutils.nodes import whitespace_normalize_name
import docutils.parsers.rst
from docutils.parsers.rst import directives, languages, tableparser, roles
from docutils.parsers.rst.languages import en as _fallback_language_module
from docutils.utils import escape2null, unescape, column_width
from docutils.utils import punctuation_chars, urischemes

class MarkupError(DataError): pass
class UnknownInterpretedRoleError(DataError): pass
class InterpretedRoleNotImplementedError(DataError): pass
class ParserError(ApplicationError): pass
class MarkupMismatch(Exception): pass


class Struct:

    """Stores data attributes for dotted-attribute access."""

    def __init__(self, **keywordargs):
        self.__dict__.update(keywordargs)


140
class RSTStateMachine(StateMachineWS):
3146
141
3147
142
    """
3148
143
    reStructuredText's master StateMachine.
3149
144
3150
145
    The entry point to reStructuredText parsing is the `run()` method.
3151
146
    """
3152
147
3153
148
    def run(self, input_lines, document, input_offset=0, match_titles=True,
3154
149
            inliner=None):
3155
150
        """
3156
151
        Parse `input_lines` and modify the `document` node in place.
3157
152
3158
153
        Extend `StateMachineWS.run()`: set up parse-global data and
3159
154
        run the StateMachine.
3160
155
        """
3161
156
        self.language = languages.get_language(
3162
157
            document.settings.language_code)
3163
158
        self.match_titles = match_titles
3164
159
        if inliner is None:
3165
160
            inliner = Inliner()
3166
161
        inliner.init_customizations(document.settings)
3167
162
        self.memo = Struct(document=document,
3168
163
                           reporter=document.reporter,
3169
164
                           language=self.language,
3170
165
                           title_styles=[],
3171
166
                           section_level=0,
3172
167
                           section_bubble_up_kludge=False,
3173
168
                           inliner=inliner)
3174
169
        self.document = document
3175
170
        self.attach_observer(document.note_source)
3176
171
        self.reporter = self.memo.reporter
3177
172
        self.node = document
3178
173
        results = StateMachineWS.run(self, input_lines, input_offset,
3179
174
                                     input_source=document['source'])
3180
175
        assert results == [], 'RSTStateMachine.run() results should be empty!'
3181
176
        self.node = self.memo = None    # remove unneeded references
3182
177
3183
178
3184
179
class NestedStateMachine(StateMachineWS):
3185
180
3186
181
    """
3187
182
    StateMachine run from within other StateMachine runs, to parse nested
3188
183
    document structures.
3189
184
    """
3190
185
3191
186
    def run(self, input_lines, input_offset, memo, node, match_titles=True):
3192
187
        """
3193
188
        Parse `input_lines` and populate a `docutils.nodes.document` instance.
3194
189
3195
190
        Extend `StateMachineWS.run()`: set up document-wide data.
3196
191
        """
3197
192
        self.match_titles = match_titles
3198
193
        self.memo = memo
3199
194
        self.document = memo.document
3200
195
        self.attach_observer(self.document.note_source)
3201
196
        self.reporter = memo.reporter
3202
197
        self.language = memo.language
3203
198
        self.node = node
3204
199
        results = StateMachineWS.run(self, input_lines, input_offset)
3205
200
        assert results == [], ('NestedStateMachine.run() results should be '
3206
201
                               'empty!')
3207
202
        return results
3208
203
3209
204
class RSTState(StateWS):

    """
    reStructuredText State superclass.

    Contains methods used by all State subclasses.
    """

    nested_sm = NestedStateMachine
    nested_sm_cache = []

    def __init__(self, state_machine, debug=False):
        self.nested_sm_kwargs = {'state_classes': state_classes,
                                 'initial_state': 'Body'}
        StateWS.__init__(self, state_machine, debug)

    def runtime_init(self):
        StateWS.runtime_init(self)
        memo = self.state_machine.memo
        self.memo = memo
        self.reporter = memo.reporter
        self.inliner = memo.inliner
        self.document = memo.document
        self.parent = self.state_machine.node
        # enable the reporter to determine source and source-line
        if not hasattr(self.reporter, 'get_source_and_line'):
            self.reporter.get_source_and_line = self.state_machine.get_source_and_line
            # print "adding get_source_and_line to reporter", self.state_machine.input_offset


    def goto_line(self, abs_line_offset):
        """
        Jump to input line `abs_line_offset`, ignoring jumps past the end.
        """
        try:
            self.state_machine.goto_line(abs_line_offset)
        except EOFError:
            pass

    def no_match(self, context, transitions):
        """
        Override `StateWS.no_match` to generate a system message.

        This code should never be run.
        """
        self.reporter.severe(
            'Internal error: no transition pattern match.  State: "%s"; '
            'transitions: %s; context: %s; current line: %r.'
            % (self.__class__.__name__, transitions, context,
               self.state_machine.line))
        return context, None, []

    def bof(self, context):
        """Called at beginning of file."""
        return [], []

    def nested_parse(self, block, input_offset, node, match_titles=False,
                     state_machine_class=None, state_machine_kwargs=None):
        """
        Create a new StateMachine rooted at `node` and run it over the input
        `block`.
        """
        use_default = 0
        if state_machine_class is None:
            state_machine_class = self.nested_sm
            use_default += 1
        if state_machine_kwargs is None:
            state_machine_kwargs = self.nested_sm_kwargs
            use_default += 1
        block_length = len(block)

        state_machine = None
        if use_default == 2:
            try:
                state_machine = self.nested_sm_cache.pop()
            except IndexError:
                pass
        if not state_machine:
            state_machine = state_machine_class(debug=self.debug,
                                                **state_machine_kwargs)
        state_machine.run(block, input_offset, memo=self.memo,
                          node=node, match_titles=match_titles)
        if use_default == 2:
            self.nested_sm_cache.append(state_machine)
        else:
            state_machine.unlink()
        new_offset = state_machine.abs_line_offset()
        # No `block.parent` implies disconnected -- lines aren't in sync:
        if block.parent and (len(block) - block_length) != 0:
            # Adjustment for block if modified in nested parse:
            self.state_machine.next_line(len(block) - block_length)
        return new_offset

    def nested_list_parse(self, block, input_offset, node, initial_state,
                          blank_finish,
                          blank_finish_state=None,
                          extra_settings={},
                          match_titles=False,
                          state_machine_class=None,
                          state_machine_kwargs=None):
        """
        Create a new StateMachine rooted at `node` and run it over the input
        `block`. Also keep track of optional intermediate blank lines and the
        required final one.
        """
        if state_machine_class is None:
            state_machine_class = self.nested_sm
        if state_machine_kwargs is None:
            state_machine_kwargs = self.nested_sm_kwargs.copy()
        state_machine_kwargs['initial_state'] = initial_state
        state_machine = state_machine_class(debug=self.debug,
                                            **state_machine_kwargs)
        if blank_finish_state is None:
            blank_finish_state = initial_state
        state_machine.states[blank_finish_state].blank_finish = blank_finish
        for key, value in extra_settings.items():
            setattr(state_machine.states[initial_state], key, value)
        state_machine.run(block, input_offset, memo=self.memo,
                          node=node, match_titles=match_titles)
        blank_finish = state_machine.states[blank_finish_state].blank_finish
        state_machine.unlink()
        return state_machine.abs_line_offset(), blank_finish

    def section(self, title, source, style, lineno, messages):
        """Check for a valid subsection and create one if it checks out."""
        if self.check_subsection(source, style, lineno):
            self.new_subsection(title, lineno, messages)

    def check_subsection(self, source, style, lineno):
        """
        Check for a valid subsection header.  Return 1 (true) or None (false).

        When a new section is reached that isn't a subsection of the current
        section, back up the line count (use ``previous_line(-x)``), then
        ``raise EOFError``.  The current StateMachine will finish, then the
        calling StateMachine can re-examine the title.  This will work its way
        back up the calling chain until the correct section level is reached.

        @@@ Alternative: Evaluate the title, store the title info & level, and
        back up the chain until that level is reached.  Store in memo? Or
        return in results?

        :Exception: `EOFError` when a sibling or supersection encountered.
        """
        memo = self.memo
        title_styles = memo.title_styles
        mylevel = memo.section_level
        try:                            # check for existing title style
            level = title_styles.index(style) + 1
        except ValueError:              # new title style
            if len(title_styles) == memo.section_level: # new subsection
                title_styles.append(style)
                return 1
            else:                       # not at lowest level
                self.parent += self.title_inconsistent(source, lineno)
                return None
        if level <= mylevel:            # sibling or supersection
            memo.section_level = level   # bubble up to parent section
            if len(style) == 2:
                memo.section_bubble_up_kludge = True
            # back up 2 lines for underline title, 3 for overline title
            self.state_machine.previous_line(len(style) + 1)
            raise EOFError              # let parent section re-evaluate
        if level == mylevel + 1:        # immediate subsection
            return 1
        else:                           # invalid subsection
            self.parent += self.title_inconsistent(source, lineno)
            return None

    def title_inconsistent(self, sourcetext, lineno):
        error = self.reporter.severe(
            'Title level inconsistent:', nodes.literal_block('', sourcetext),
            line=lineno)
        return error

    def new_subsection(self, title, lineno, messages):
        """Append new subsection to document tree. On return, check level."""
        memo = self.memo
        mylevel = memo.section_level
        memo.section_level += 1
        section_node = nodes.section()
        self.parent += section_node
        textnodes, title_messages = self.inline_text(title, lineno)
        titlenode = nodes.title(title, '', *textnodes)
        name = normalize_name(titlenode.astext())
        section_node['names'].append(name)
        section_node += titlenode
        section_node += messages
        section_node += title_messages
        self.document.note_implicit_target(section_node, section_node)
        offset = self.state_machine.line_offset + 1
        absoffset = self.state_machine.abs_line_offset() + 1
        newabsoffset = self.nested_parse(
              self.state_machine.input_lines[offset:], input_offset=absoffset,
              node=section_node, match_titles=True)
        self.goto_line(newabsoffset)
        if memo.section_level <= mylevel: # can't handle next section?
            raise EOFError              # bubble up to supersection
        # reset section_level; next pass will detect it properly
        memo.section_level = mylevel

    def paragraph(self, lines, lineno):
        """
        Return a list (paragraph & messages) & a boolean: literal_block next?
        """
        data = '\n'.join(lines).rstrip()
        if re.search(r'(?<!\\)(\\\\)*::$', data):
            if len(data) == 2:
                return [], 1
            elif data[-3] in ' \n':
                text = data[:-3].rstrip()
            else:
                text = data[:-1]
            literalnext = 1
        else:
            text = data
            literalnext = 0
        textnodes, messages = self.inline_text(text, lineno)
        p = nodes.paragraph(data, '', *textnodes)
        p.source, p.line = self.state_machine.get_source_and_line(lineno)
        return [p] + messages, literalnext

    def inline_text(self, text, lineno):
        """
        Return 2 lists: nodes (text and inline elements), and system_messages.
        """
        return self.inliner.parse(text, lineno, self.memo, self.parent)

    def unindent_warning(self, node_name):
        # the actual problem is one line below the current line
        lineno = self.state_machine.abs_line_number()+1
        return self.reporter.warning('%s ends without a blank line; '
                                     'unexpected unindent.' % node_name,
                                     line=lineno)


def build_regexp(definition, compile=True):
    """
    Build, compile and return a regular expression based on `definition`.

    :Parameter: `definition`: a 4-tuple (group name, prefix, suffix, parts),
        where "parts" is a list of regular expressions and/or regular
        expression definitions to be joined into an or-group.
    """
    name, prefix, suffix, parts = definition
    part_strings = []
    for part in parts:
        if type(part) is tuple:
            part_strings.append(build_regexp(part, None))
        else:
            part_strings.append(part)
    or_group = '|'.join(part_strings)
    regexp = '%(prefix)s(?P<%(name)s>%(or_group)s)%(suffix)s' % locals()
    if compile:
        return re.compile(regexp, re.UNICODE)
    else:
        return regexp


3469
464
class Inliner:
3470
465
3471
466
    """
3472
467
    Parse inline markup; call the `parse()` method.
3473
468
    """
3474
469
3475
470
    def __init__(self):
3476
471
        self.implicit_dispatch = [(self.patterns.uri, self.standalone_uri),]
3477
472
        """List of (pattern, bound method) tuples, used by
3478
473
        `self.implicit_inline`."""
3479
474
3480
475
    def init_customizations(self, settings):
3481
476
        """Setting-based customizations; run when parsing begins."""
3482
477
        if settings.pep_references:
3483
478
            self.implicit_dispatch.append((self.patterns.pep,
3484
479
                                           self.pep_reference))
3485
480
        if settings.rfc_references:
3486
481
            self.implicit_dispatch.append((self.patterns.rfc,
3487
482
                                           self.rfc_reference))
3488
483
3489
484
    def parse(self, text, lineno, memo, parent):
3490
485
        # Needs to be refactored for nested inline markup.
3491
486
        # Add nested_parse() method?
3492
487
        """
3493
488
        Return 2 lists: nodes (text and inline elements), and system_messages.
3494
489
3495
490
        Using `self.patterns.initial`, a pattern which matches start-strings
3496
491
        (emphasis, strong, interpreted, phrase reference, literal,
3497
492
        substitution reference, and inline target) and complete constructs
3498
493
        (simple reference, footnote reference), search for a candidate.  When
3499
494
        one is found, check for validity (e.g., not a quoted '*' character).
3500
495
        If valid, search for the corresponding end string if applicable, and
3501
496
        check it for validity.  If not found or invalid, generate a warning
3502
497
        and ignore the start-string.  Implicit inline markup (e.g. standalone
3503
498
        URIs) is found last.
3504
499
        """
3505
500
        self.reporter = memo.reporter
3506
501
        self.document = memo.document
3507
502
        self.language = memo.language
3508
503
        self.parent = parent
3509
504
        pattern_search = self.patterns.initial.search
3510
505
        dispatch = self.dispatch
3511
506
        remaining = escape2null(text)
3512
507
        processed = []
3513
508
        unprocessed = []
3514
509
        messages = []
3515
510
        while remaining:
3516
511
            match = pattern_search(remaining)
3517
512
            if match:
3518
513
                groups = match.groupdict()
3519
514
                method = dispatch[groups['start'] or groups['backquote']
3520
515
                                  or groups['refend'] or groups['fnend']]
3521
516
                before, inlines, remaining, sysmessages = method(self, match,
3522
517
                                                                 lineno)
3523
518
                unprocessed.append(before)
3524
519
                messages += sysmessages
3525
520
                if inlines:
3526
521
                    processed += self.implicit_inline(''.join(unprocessed),
3527
522
                                                      lineno)
3528
523
                    processed += inlines
3529
524
                    unprocessed = []
3530
525
            else:
3531
526
                break
3532
527
        remaining = ''.join(unprocessed) + remaining
3533
528
        if remaining:
3534
529
            processed += self.implicit_inline(remaining, lineno)
3535
530
        return processed, messages
3536
531
3537
532
    # Inline object recognition
3538
533
    # -------------------------
3539
534
    # Lookahead and lookbehind expressions for inline markup rules:
3540
535
    start_string_prefix = (u'(^|(?<=\\s|[%s%s]))' %
3541
536
                           (punctuation_chars.openers,
3542
537
                            punctuation_chars.delimiters))
3543
538
    end_string_suffix = (u'($|(?=\\s|[\x00%s%s%s]))' %
3544
539
                         (punctuation_chars.closing_delimiters,
3545
540
                          punctuation_chars.delimiters,
3546
541
                          punctuation_chars.closers))
3547
542
    # print start_string_prefix.encode('utf8')
3548
543
    # TODO: support non-ASCII whitespace in the following 4 patterns?
3549
544
    non_whitespace_before = r'(?<![ \n])'
3550
545
    non_whitespace_escape_before = r'(?<![ \n\x00])'
3551
546
    non_unescaped_whitespace_escape_before = r'(?<!(?<!\x00)[ \n\x00])'
3552
547
    non_whitespace_after = r'(?![ \n])'
3553
548
    # Alphanumerics with isolated internal [-._+:] chars (i.e. not 2 together):
3554
549
    simplename = r'(?:(?!_)\w)+(?:[-._+:](?:(?!_)\w)+)*'
3555
550
    # Valid URI characters (see RFC 2396 & RFC 2732);
3556
551
    # final \x00 allows backslash escapes in URIs:
3557
552
    uric = r"""[-_.!~*'()[\];/:@&=+$,%a-zA-Z0-9\x00]"""
3558
553
    # Delimiter indicating the end of a URI (not part of the URI):
3559
554
    uri_end_delim = r"""[>]"""
3560
555
    # Last URI character; same as uric but no punctuation:
3561
556
    urilast = r"""[_~*/=+a-zA-Z0-9]"""
3562
557
    # End of a URI (either 'urilast' or 'uric followed by a
3563
558
    # uri_end_delim'):
3564
559
    uri_end = r"""(?:%(urilast)s|%(uric)s(?=%(uri_end_delim)s))""" % locals()
3565
560
    emailc = r"""[-_!~*'{|}/#?^`&=+$%a-zA-Z0-9\x00]"""
3566
561
    email_pattern = r"""
3567
562
          %(emailc)s+(?:\.%(emailc)s+)*   # name
3568
563
          (?<!\x00)@                      # at
3569
564
          %(emailc)s+(?:\.%(emailc)s*)*   # host
3570
565
          %(uri_end)s                     # final URI char
3571
566
          """
3572
567
    parts = ('initial_inline', start_string_prefix, '',
3573
568
             [('start', '', non_whitespace_after,  # simple start-strings
3574
569
               [r'\*\*',                # strong
3575
570
                r'\*(?!\*)',            # emphasis but not strong
3576
571
                r'``',                  # literal
3577
572
                r'_`',                  # inline internal target
3578
573
                r'\|(?!\|)']            # substitution reference
3579
574
               ),
3580
575
              ('whole', '', end_string_suffix, # whole constructs
3581
576
               [# reference name & end-string
3582
577
                r'(?P<refname>%s)(?P<refend>__?)' % simplename,
3583
578
                ('footnotelabel', r'\[', r'(?P<fnend>\]_)',
3584
579
                 [r'[0-9]+',               # manually numbered
3585
580
                  r'\#(%s)?' % simplename, # auto-numbered (w/ label?)
3586
581
                  r'\*',                   # auto-symbol
3587
582
                  r'(?P<citationlabel>%s)' % simplename] # citation reference
3588
583
                 )
3589
584
                ]
3590
585
               ),
3591
586
              ('backquote',             # interpreted text or phrase reference
3592
587
               '(?P<role>(:%s:)?)' % simplename, # optional role
3593
588
               non_whitespace_after,
3594
589
               ['`(?!`)']               # but not literal
3595
590
               )
3596
591
              ]
3597
592
             )
3598
593
    patterns = Struct(
3599
594
          initial=build_regexp(parts),
3600
595
          emphasis=re.compile(non_whitespace_escape_before
3601
596
                              + r'(\*)' + end_string_suffix, re.UNICODE),
3602
597
          strong=re.compile(non_whitespace_escape_before
3603
598
                            + r'(\*\*)' + end_string_suffix, re.UNICODE),
3604
599
          interpreted_or_phrase_ref=re.compile(
3605
600
              r"""
3606
601
              %(non_unescaped_whitespace_escape_before)s
3607
602
              (
3608
603
                `
3609
604
                (?P<suffix>
3610
605
                  (?P<role>:%(simplename)s:)?
3611
606
                  (?P<refend>__?)?
3612
607
                )
3613
608
              )
3614
609
              %(end_string_suffix)s
3615
610
              """ % locals(), re.VERBOSE | re.UNICODE),
3616
611
          embedded_uri=re.compile(
3617
612
              r"""
3618
613
              (
3619
614
                (?:[ \n]+|^)            # spaces or beginning of line/string
3620
615
                <                       # open bracket
3621
616
                %(non_whitespace_after)s
3622
617
                ([^<>\x00]+)            # anything but angle brackets & nulls
3623
618
                %(non_whitespace_before)s
3624
619
                >                       # close bracket w/o whitespace before
3625
620
              )
3626
621
              $                         # end of string
3627
622
              """ % locals(), re.VERBOSE | re.UNICODE),
3628
623
          literal=re.compile(non_whitespace_before + '(``)'
3629
624
                             + end_string_suffix),
3630
625
          target=re.compile(non_whitespace_escape_before
3631
626
                            + r'(`)' + end_string_suffix),
3632
627
          substitution_ref=re.compile(non_whitespace_escape_before
3633
628
                                      + r'(\|_{0,2})'
3634
629
                                      + end_string_suffix),
3635
630
          email=re.compile(email_pattern % locals() + '$',
3636
631
                           re.VERBOSE | re.UNICODE),
3637
632
          uri=re.compile(
3638
633
                (r"""
3639
634
                %(start_string_prefix)s
3640
635
                (?P<whole>
3641
636
                  (?P<absolute>           # absolute URI
3642
637
                    (?P<scheme>             # scheme (http, ftp, mailto)
3643
638
                      [a-zA-Z][a-zA-Z0-9.+-]*
3644
639
                    )
3645
640
                    :
3646
641
                    (
3647
642
                      (                       # either:
3648
643
                        (//?)?                  # hierarchical URI
3649
644
                        %(uric)s*               # URI characters
3650
645
                        %(uri_end)s             # final URI char
3651
646
                      )
3652
647
                      (                       # optional query
3653
648
                        \?%(uric)s*
3654
649
                        %(uri_end)s
3655
650
                      )?
3656
651
                      (                       # optional fragment
3657
652
                        \#%(uric)s*
3658
653
                        %(uri_end)s
3659
654
                      )?
3660
655
                    )
3661
656
                  )
3662
657
                |                       # *OR*
3663
658
                  (?P<email>              # email address
3664
659
                    """ + email_pattern + r"""
3665
660
                  )
3666
661
                )
3667
662
                %(end_string_suffix)s
3668
663
                """) % locals(), re.VERBOSE | re.UNICODE),
3669
664
          pep=re.compile(
3670
665
                r"""
3671
666
                %(start_string_prefix)s
3672
667
                (
3673
668
                  (pep-(?P<pepnum1>\d+)(\.txt)?) # reference to source file
3674
669
                |
3675
670
                  (PEP\s+(?P<pepnum2>\d+))      # reference by name
3676
671
                )
3677
672
                %(end_string_suffix)s""" % locals(), re.VERBOSE | re.UNICODE),
3678
673
          rfc=re.compile(
3679
674
                r"""
3680
675
                %(start_string_prefix)s
3681
676
                (RFC(-|\s+)?(?P<rfcnum>\d+))
3682
677
                %(end_string_suffix)s""" % locals(), re.VERBOSE | re.UNICODE))
3683
678
3684
679
    def quoted_start(self, match):
3685
680
        """Test if inline markup start-string is 'quoted'.
3686
681
3687
682
        'Quoted' in this context means the start-string is enclosed in a pair
3688
683
        of matching opening/closing delimiters (not necessarily quotes)
3689
684
        or is at the end of the text.
3690
685
        """
3691
686
        string = match.string
3692
687
        start = match.start()
3693
688
        if start == 0:                  # start-string at beginning of text
3694
689
            return False
3695
690
        prestart = string[start - 1]
3696
691
        try:
3697
692
            poststart = string[match.end()]
3698
693
        except IndexError:          # start-string at end of text
3699
694
            return True  # not "quoted" but no markup start-string either
3700
695
        return punctuation_chars.match_chars(prestart, poststart)
3701
696
3702
697
    def inline_obj(self, match, lineno, end_pattern, nodeclass,
3703
698
                   restore_backslashes=False):
3704
699
        string = match.string
3705
700
        matchstart = match.start('start')
3706
701
        matchend = match.end('start')
3707
702
        if self.quoted_start(match):
3708
703
            return (string[:matchend], [], string[matchend:], [], '')
3709
704
        endmatch = end_pattern.search(string[matchend:])
3710
705
        if endmatch and endmatch.start(1):  # 1 or more chars
3711
706
            text = unescape(endmatch.string[:endmatch.start(1)],
3712
707
                            restore_backslashes)
3713
708
            textend = matchend + endmatch.end(1)
3714
709
            rawsource = unescape(string[matchstart:textend], 1)
3715
710
            return (string[:matchstart], [nodeclass(rawsource, text)],
3716
711
                    string[textend:], [], endmatch.group(1))
3717
712
        msg = self.reporter.warning(
3718
713
              'Inline %s start-string without end-string.'
3719
714
              % nodeclass.__name__, line=lineno)
3720
715
        text = unescape(string[matchstart:matchend], 1)
3721
716
        rawsource = unescape(string[matchstart:matchend], 1)
3722
717
        prb = self.problematic(text, rawsource, msg)
3723
718
        return string[:matchstart], [prb], string[matchend:], [msg], ''
3724
719
3725
720
    def problematic(self, text, rawsource, message):
3726
721
        msgid = self.document.set_id(message, self.parent)
3727
722
        problematic = nodes.problematic(rawsource, text, refid=msgid)
3728
723
        prbid = self.document.set_id(problematic)
3729
724
        message.add_backref(prbid)
3730
725
        return problematic
3731
726
3732
727
    def emphasis(self, match, lineno):
3733
728
        before, inlines, remaining, sysmessages, endstring = self.inline_obj(
3734
729
              match, lineno, self.patterns.emphasis, nodes.emphasis)
3735
730
        return before, inlines, remaining, sysmessages
3736
731
3737
732
    def strong(self, match, lineno):
3738
733
        before, inlines, remaining, sysmessages, endstring = self.inline_obj(
3739
734
              match, lineno, self.patterns.strong, nodes.strong)
3740
735
        return before, inlines, remaining, sysmessages
3741
736
3742
737
    def interpreted_or_phrase_ref(self, match, lineno):
3743
738
        end_pattern = self.patterns.interpreted_or_phrase_ref
3744
739
        string = match.string
3745
740
        matchstart = match.start('backquote')
3746
741
        matchend = match.end('backquote')
3747
742
        rolestart = match.start('role')
3748
743
        role = match.group('role')
3749
744
        position = ''
3750
745
        if role:
3751
746
            role = role[1:-1]
3752
747
            position = 'prefix'
3753
748
        elif self.quoted_start(match):
3754
749
            return (string[:matchend], [], string[matchend:], [])
3755
750
        endmatch = end_pattern.search(string[matchend:])
3756
751
        if endmatch and endmatch.start(1):  # 1 or more chars
3757
752
            textend = matchend + endmatch.end()
3758
753
            if endmatch.group('role'):
3759
754
                if role:
3760
755
                    msg = self.reporter.warning(
3761
756
                        'Multiple roles in interpreted text (both '
3762
757
                        'prefix and suffix present; only one allowed).',
3763
758
                        line=lineno)
3764
759
                    text = unescape(string[rolestart:textend], 1)
3765
760
                    prb = self.problematic(text, text, msg)
3766
761
                    return string[:rolestart], [prb], string[textend:], [msg]
3767
762
                role = endmatch.group('suffix')[1:-1]
3768
763
                position = 'suffix'
3769
764
            escaped = endmatch.string[:endmatch.start(1)]
3770
765
            rawsource = unescape(string[matchstart:textend], 1)
3771
766
            if rawsource[-1:] == '_':
3772
767
                if role:
3773
768
                    msg = self.reporter.warning(
3774
769
                          'Mismatch: both interpreted text role %s and '
3775
770
                          'reference suffix.' % position, line=lineno)
3776
771
                    text = unescape(string[rolestart:textend], 1)
3777
772
                    prb = self.problematic(text, text, msg)
3778
773
                    return string[:rolestart], [prb], string[textend:], [msg]
3779
774
                return self.phrase_ref(string[:matchstart], string[textend:],
3780
775
                                       rawsource, escaped, unescape(escaped))
3781
776
            else:
3782
777
                rawsource = unescape(string[rolestart:textend], 1)
3783
778
                nodelist, messages = self.interpreted(rawsource, escaped, role,
3784
779
                                                      lineno)
3785
780
                return (string[:rolestart], nodelist,
3786
781
                        string[textend:], messages)
3787
782
        msg = self.reporter.warning(
3788
783
              'Inline interpreted text or phrase reference start-string '
3789
784
              'without end-string.', line=lineno)
3790
785
        text = unescape(string[matchstart:matchend], 1)
3791
786
        prb = self.problematic(text, text, msg)
3792
787
        return string[:matchstart], [prb], string[matchend:], [msg]
3793
788
3794
789
    def phrase_ref(self, before, after, rawsource, escaped, text):
3795
790
        match = self.patterns.embedded_uri.search(escaped)
3796
791
        if match:
3797
792
            text = unescape(escaped[:match.start(0)])
3798
793
            uri_text = match.group(2)
3799
794
            uri = ''.join(uri_text.split())
3800
795
            uri = self.adjust_uri(uri)
3801
796
            if uri:
3802
797
                target = nodes.target(match.group(1), refuri=uri)
3803
798
                target.referenced = 1
3804
799
            else:
3805
800
                raise ApplicationError('problem with URI: %r' % uri_text)
3806
801
            if not text:
3807
802
                text = uri
3808
803
        else:
3809
804
            target = None
3810
805
        refname = normalize_name(text)
3811
806
        reference = nodes.reference(rawsource, text,
3812
807
                                    name=whitespace_normalize_name(text))
3813
808
        node_list = [reference]
3814
809
        if rawsource[-2:] == '__':
3815
810
            if target:
3816
811
                reference['refuri'] = uri
3817
812
            else:
3818
813
                reference['anonymous'] = 1
3819
814
        else:
3820
815
            if target:
3821
816
                reference['refuri'] = uri
3822
817
                target['names'].append(refname)
3823
818
                self.document.note_explicit_target(target, self.parent)
3824
819
                node_list.append(target)
3825
820
            else:
3826
821
                reference['refname'] = refname
3827
822
                self.document.note_refname(reference)
3828
823
        return before, node_list, after, []
3829
824
3830
825
    def adjust_uri(self, uri):
3831
826
        match = self.patterns.email.match(uri)
3832
827
        if match:
3833
828
            return 'mailto:' + uri
3834
829
        else:
3835
830
            return uri
3836
831
3837
832
    def interpreted(self, rawsource, text, role, lineno):
3838
833
        role_fn, messages = roles.role(role, self.language, lineno,
3839
834
                                       self.reporter)
3840
835
        if role_fn:
3841
836
            nodes, messages2 = role_fn(role, rawsource, text, lineno, self)
3842
837
            return nodes, messages + messages2
3843
838
        else:
3844
839
            msg = self.reporter.error(
3845
840
                'Unknown interpreted text role "%s".' % role,
3846
841
                line=lineno)
3847
842
            return ([self.problematic(rawsource, rawsource, msg)],
3848
843
                    messages + [msg])
3849
844
3850
845
    def literal(self, match, lineno):
3851
846
        before, inlines, remaining, sysmessages, endstring = self.inline_obj(
3852
847
              match, lineno, self.patterns.literal, nodes.literal,
3853
848
              restore_backslashes=True)
3854
849
        return before, inlines, remaining, sysmessages
3855
850
3856
851
    def inline_internal_target(self, match, lineno):
3857
852
        before, inlines, remaining, sysmessages, endstring = self.inline_obj(
3858
853
              match, lineno, self.patterns.target, nodes.target)
3859
854
        if inlines and isinstance(inlines[0], nodes.target):
3860
855
            assert len(inlines) == 1
3861
856
            target = inlines[0]
3862
857
            name = normalize_name(target.astext())
3863
858
            target['names'].append(name)
3864
859
            self.document.note_explicit_target(target, self.parent)
3865
860
        return before, inlines, remaining, sysmessages
3866
861
3867
862
    def substitution_reference(self, match, lineno):
3868
863
        before, inlines, remaining, sysmessages, endstring = self.inline_obj(
3869
864
              match, lineno, self.patterns.substitution_ref,
3870
865
              nodes.substitution_reference)
3871
866
        if len(inlines) == 1:
3872
867
            subref_node = inlines[0]
3873
868
            if isinstance(subref_node, nodes.substitution_reference):
3874
869
                subref_text = subref_node.astext()
3875
870
                self.document.note_substitution_ref(subref_node, subref_text)
3876
871
                if endstring[-1:] == '_':
3877
872
                    reference_node = nodes.reference(
3878
873
                        '|%s%s' % (subref_text, endstring), '')
3879
874
                    if endstring[-2:] == '__':
3880
875
                        reference_node['anonymous'] = 1
3881
876
                    else:
3882
877
                        reference_node['refname'] = normalize_name(subref_text)
3883
878
                        self.document.note_refname(reference_node)
3884
879
                    reference_node += subref_node
3885
880
                    inlines = [reference_node]
3886
881
        return before, inlines, remaining, sysmessages
3887
882
3888
883
    def footnote_reference(self, match, lineno):
3889
884
        """
3890
885
        Handles `nodes.footnote_reference` and `nodes.citation_reference`
3891
886
        elements.
3892
887
        """
3893
888
        label = match.group('footnotelabel')
3894
889
        refname = normalize_name(label)
3895
890
        string = match.string
3896
891
        before = string[:match.start('whole')]
3897
892
        remaining = string[match.end('whole'):]
3898
893
        if match.group('citationlabel'):
3899
894
            refnode = nodes.citation_reference('[%s]_' % label,
3900
895
                                               refname=refname)
3901
896
            refnode += nodes.Text(label)
3902
897
            self.document.note_citation_ref(refnode)
3903
898
        else:
3904
899
            refnode = nodes.footnote_reference('[%s]_' % label)
3905
900
            if refname[0] == '#':
3906
901
                refname = refname[1:]
3907
902
                refnode['auto'] = 1
3908
903
                self.document.note_autofootnote_ref(refnode)
3909
904
            elif refname == '*':
3910
905
                refname = ''
3911
906
                refnode['auto'] = '*'
3912
907
                self.document.note_symbol_footnote_ref(
3913
908
                      refnode)
3914
909
            else:
3915
910
                refnode += nodes.Text(label)
3916
911
            if refname:
3917
912
                refnode['refname'] = refname
3918
913
                self.document.note_footnote_ref(refnode)
3919
914
            if utils.get_trim_footnote_ref_space(self.document.settings):
3920
915
                before = before.rstrip()
3921
916
        return (before, [refnode], remaining, [])
3922
917
3923
918
    def reference(self, match, lineno, anonymous=False):
3924
919
        referencename = match.group('refname')
3925
920
        refname = normalize_name(referencename)
3926
921
        referencenode = nodes.reference(
3927
922
            referencename + match.group('refend'), referencename,
3928
923
            name=whitespace_normalize_name(referencename))
3929
924
        if anonymous:
3930
925
            referencenode['anonymous'] = 1
3931
926
        else:
3932
927
            referencenode['refname'] = refname
3933
928
            self.document.note_refname(referencenode)
3934
929
        string = match.string
3935
930
        matchstart = match.start('whole')
3936
931
        matchend = match.end('whole')
3937
932
        return (string[:matchstart], [referencenode], string[matchend:], [])
3938
933
3939
934
    def anonymous_reference(self, match, lineno):
3940
935
        return self.reference(match, lineno, anonymous=True)
3941
936
3942
937
    def standalone_uri(self, match, lineno):
3943
938
        if (not match.group('scheme')
3944
939
                or match.group('scheme').lower() in urischemes.schemes):
3945
940
            if match.group('email'):
3946
941
                addscheme = 'mailto:'
3947
942
            else:
3948
943
                addscheme = ''
3949
944
            text = match.group('whole')
3950
945
            unescaped = unescape(text, 0)
3951
946
            return [nodes.reference(unescape(text, 1), unescaped,
3952
947
                                    refuri=addscheme + unescaped)]
3953
948
        else:                   # not a valid scheme
3954
949
            raise MarkupMismatch
3955
950
3956
951
    def pep_reference(self, match, lineno):
3957
952
        text = match.group(0)
3958
953
        if text.startswith('pep-'):
3959
954
            pepnum = int(match.group('pepnum1'))
3960
955
        elif text.startswith('PEP'):
3961
956
            pepnum = int(match.group('pepnum2'))
3962
957
        else:
3963
958
            raise MarkupMismatch
3964
959
        ref = (self.document.settings.pep_base_url
3965
960
               + self.document.settings.pep_file_url_template % pepnum)
3966
961
        unescaped = unescape(text, 0)
3967
962
        return [nodes.reference(unescape(text, 1), unescaped, refuri=ref)]
3968
963
3969
964
    rfc_url = 'rfc%d.html'
3970
965
3971
966
    def rfc_reference(self, match, lineno):
3972
967
        text = match.group(0)
3973
968
        if text.startswith('RFC'):
3974
969
            rfcnum = int(match.group('rfcnum'))
3975
970
            ref = self.document.settings.rfc_base_url + self.rfc_url % rfcnum
3976
971
        else:
3977
972
            raise MarkupMismatch
3978
973
        unescaped = unescape(text, 0)
3979
974
        return [nodes.reference(unescape(text, 1), unescaped, refuri=ref)]
3980
975
3981
976
    def implicit_inline(self, text, lineno):
3982
977
        """
3983
978
        Check each of the patterns in `self.implicit_dispatch` for a match,
3984
979
        and dispatch to the stored method for the pattern.  Recursively check
3985
980
        the text before and after the match.  Return a list of `nodes.Text`
3986
981
        and inline element nodes.
3987
982
        """
3988
983
        if not text:
3989
984
            return []
3990
985
        for pattern, method in self.implicit_dispatch:
3991
986
            match = pattern.search(text)
3992
987
            if match:
3993
988
                try:
3994
989
                    # Must recurse on strings before *and* after the match;
3995
990
                    # there may be multiple patterns.
3996
991
                    return (self.implicit_inline(text[:match.start()], lineno)
3997
992
                            + method(match, lineno) +
3998
993
                            self.implicit_inline(text[match.end():], lineno))
3999
994
                except MarkupMismatch:
4000
995
                    pass
4001
996
        return [nodes.Text(unescape(text), rawsource=unescape(text, 1))]
4002
997
4003
998
    dispatch = {'*': emphasis,
4004
999
                '**': strong,
4005
1000
                '`': interpreted_or_phrase_ref,
4006
1001
                '``': literal,
4007
1002
                '_`': inline_internal_target,
4008
1003
                ']_': footnote_reference,
4009
1004
                '|': substitution_reference,
4010
1005
                '_': reference,
4011
1006
                '__': anonymous_reference}
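
The scan-and-dispatch loop used by `Inliner.parse()` can be illustrated with a much smaller, standalone sketch. Everything below is a simplified assumption for illustration only: the names `START`, `END`, `DISPATCH`, `inline_obj` and `parse` are hypothetical, only emphasis and strong are handled, the quoting and punctuation rules are reduced to plain whitespace checks, and an unterminated start-string is kept as plain text instead of producing a warning and a `problematic` node.

```python
import re

# Start-strings: '**' (strong) or a single '*' (emphasis), each followed by
# a non-whitespace character.
START = re.compile(r'(?P<start>\*\*|\*(?!\*))(?!\s)')

# End-strings: not preceded by whitespace, followed by whitespace or the end
# of the text (a simplified stand-in for end_string_suffix).
END = {
    '*': re.compile(r'(?<!\s)(\*)(?=\s|$)'),
    '**': re.compile(r'(?<!\s)(\*\*)(?=\s|$)'),
}


def inline_obj(match, kind):
    """Return (before, inlines, remaining) for one start-string match."""
    string = match.string
    matchstart, matchend = match.span('start')
    endmatch = END[match.group('start')].search(string[matchend:])
    if endmatch and endmatch.start(1):  # at least one character of content
        text = endmatch.string[:endmatch.start(1)]
        textend = matchend + endmatch.end(1)
        return string[:matchstart], [(kind, text)], string[textend:]
    # No end-string found: treat the start-string as ordinary text.
    return string[:matchend], [], string[matchend:]


# Map each start-string to its handler, as the class-level dispatch dict does.
DISPATCH = {
    '*': lambda match: inline_obj(match, 'emphasis'),
    '**': lambda match: inline_obj(match, 'strong'),
}


def parse(text):
    """Return a list of (kind, text) pairs for the given line of text."""
    remaining = text
    processed, unprocessed = [], []
    while remaining:
        match = START.search(remaining)
        if not match:
            break
        before, inlines, remaining = DISPATCH[match.group('start')](match)
        unprocessed.append(before)
        if inlines:
            pending = ''.join(unprocessed)
            if pending:
                processed.append(('text', pending))
            processed += inlines
            unprocessed = []
    remaining = ''.join(unprocessed) + remaining
    if remaining:
        processed.append(('text', remaining))
    return processed
```

In this sketch, text that precedes a failed candidate accumulates in `unprocessed` and is only flushed once a real inline element is found or the input is exhausted. This mirrors the way the real method defers implicit markup detection until after explicit markup has been handled.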
4012
1007
4013
1008
4014
1009
def _loweralpha_to_int(s, _zero=(ord('a')-1)):
4015
1010
    return ord(s) - _zero
4016
1011
4017
1012
def _upperalpha_to_int(s, _zero=(ord('A')-1)):
4018
1013
    return ord(s) - _zero
4019
1014
4020
1015
def _lowerroman_to_int(s):
4021
1016
    return roman.fromRoman(s.upper())
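
As a rough illustration of how enumerator text can be mapped to an ordinal, the sketch below tries each sequence pattern in a fixed order and applies the matching converter. It is a standalone approximation under stated assumptions: `roman_to_int` is a hypothetical, non-validating stand-in for the third-party `roman.fromRoman()`, `ordinal` is a hypothetical helper, and the real parser may apply additional rules that are not shown here.

```python
import re

ROMAN_VALUES = {'I': 1, 'V': 5, 'X': 10, 'L': 50,
                'C': 100, 'D': 500, 'M': 1000}


def roman_to_int(s):
    """Convert an uppercase Roman numeral; malformed input is not rejected."""
    total = 0
    for char, following in zip(s, s[1:] + ' '):
        value = ROMAN_VALUES[char]
        # Subtractive notation: a smaller value before a larger one is negative.
        if ROMAN_VALUES.get(following, 0) > value:
            total -= value
        else:
            total += value
    return total


def alpha_to_int(s):
    """Map 'a'/'A' to 1, 'b'/'B' to 2, and so on."""
    return ord(s.lower()) - (ord('a') - 1)


# Order matters: a single letter such as 'c' is tried as alphabetic first.
SEQUENCES = ['arabic', 'loweralpha', 'upperalpha', 'lowerroman', 'upperroman']
SEQUENCE_PATTERNS = {
    'arabic': '[0-9]+',
    'loweralpha': '[a-z]',
    'upperalpha': '[A-Z]',
    'lowerroman': '[ivxlcdm]+',
    'upperroman': '[IVXLCDM]+',
}
CONVERTERS = {
    'arabic': int,
    'loweralpha': alpha_to_int,
    'upperalpha': alpha_to_int,
    'lowerroman': lambda s: roman_to_int(s.upper()),
    'upperroman': roman_to_int,
}


def ordinal(text):
    """Return (sequence name, ordinal) for enumerator text, or None."""
    for sequence in SEQUENCES:
        if re.match(SEQUENCE_PATTERNS[sequence] + '$', text):
            return sequence, CONVERTERS[sequence](text)
    return None
```

Because the patterns are tried in list order, single-letter enumerators resolve to the alphabetic sequences before the Roman ones, which is why the sequence list is ordered.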
4022
1017
4023
1018
4024
1019
class Body(RSTState):
4025
1020
4026
1021
    """
4027
1022
    Generic classifier of the first line of a block.
4028
1023
    """
4029
1024
4030
1025
    double_width_pad_char = tableparser.TableParser.double_width_pad_char
4031
1026
    """Padding character for East Asian double-width text."""
4032
1027
4033
1028
    enum = Struct()
4034
1029
    """Enumerated list parsing information."""
4035
1030
4036
1031
    enum.formatinfo = {
4037
1032
          'parens': Struct(prefix='(', suffix=')', start=1, end=-1),
4038
1033
          'rparen': Struct(prefix='', suffix=')', start=0, end=-1),
4039
1034
          'period': Struct(prefix='', suffix='.', start=0, end=-1)}
4040
1035
    enum.formats = enum.formatinfo.keys()
4041
1036
    enum.sequences = ['arabic', 'loweralpha', 'upperalpha',
4042
1037
                      'lowerroman', 'upperroman'] # ORDERED!
4043
1038
    enum.sequencepats = {'arabic': '[0-9]+',
4044
1039
                         'loweralpha': '[a-z]',
4045
1040
                         'upperalpha': '[A-Z]',
4046
1041
                         'lowerroman': '[ivxlcdm]+',
4047
1042
                         'upperroman': '[IVXLCDM]+',}
4048
1043
    enum.converters = {'arabic': int,
4049
1044
                       'loweralpha': _loweralpha_to_int,
4050
1045
                       'upperalpha': _upperalpha_to_int,
4051
1046
                       'lowerroman': _lowerroman_to_int,
4052
1047
                       'upperroman': roman.fromRoman}
4053
1048
4054
1049
    enum.sequenceregexps = {}
4055
1050
    for sequence in enum.sequences:
4056
1051
        enum.sequenceregexps[sequence] = re.compile(
4057
1052
              enum.sequencepats[sequence] + '$', re.UNICODE)
4058
1053
4059
1054
    grid_table_top_pat = re.compile(r'\+-[-+]+-\+ *$')
4060
1055
    """Matches the top (& bottom) of a full table."""
4061
1056
4062
1057
    simple_table_top_pat = re.compile('=+( +=+)+ *$')
4063
1058
    """Matches the top of a simple table."""
4064
1059
4065
1060
    simple_table_border_pat = re.compile('=+[ =]*$')
4066
1061
    """Matches the bottom & header bottom of a simple table."""
4067
1062
4068
1063
    pats = {}
4069
1064
    """Fragments of patterns used by transitions."""
4070
1065
4071
1066
    pats['nonalphanum7bit'] = '[!-/:-@[-`{-~]'
4072
1067
    pats['alpha'] = '[a-zA-Z]'
4073
1068
    pats['alphanum'] = '[a-zA-Z0-9]'
4074
1069
    pats['alphanumplus'] = '[a-zA-Z0-9_-]'
4075
1070
    pats['enum'] = ('(%(arabic)s|%(loweralpha)s|%(upperalpha)s|%(lowerroman)s'
4076
1071
                    '|%(upperroman)s|#)' % enum.sequencepats)
4077
1072
    pats['optname'] = '%(alphanum)s%(alphanumplus)s*' % pats
4078
1073
    # @@@ Loosen up the pattern?  Allow Unicode?
4079
1074
    pats['optarg'] = '(%(alpha)s%(alphanumplus)s*|<[^<>]+>)' % pats
4080
1075
    pats['shortopt'] = r'(-|\+)%(alphanum)s( ?%(optarg)s)?' % pats
4081
1076
    pats['longopt'] = r'(--|/)%(optname)s([ =]%(optarg)s)?' % pats
4082
1077
    pats['option'] = r'(%(shortopt)s|%(longopt)s)' % pats
4083
1078
4084
1079
    for format in enum.formats:
4085
1080
        pats[format] = '(?P<%s>%s%s%s)' % (
4086
1081
              format, re.escape(enum.formatinfo[format].prefix),
4087
1082
              pats['enum'], re.escape(enum.formatinfo[format].suffix))
4088
1083
4089
1084
    patterns = {
4090
1085
          'bullet': u'[-+*\u2022\u2023\u2043]( +|$)',
4091
1086
          'enumerator': r'(%(parens)s|%(rparen)s|%(period)s)( +|$)' % pats,
4092
1087
          'field_marker': r':(?![: ])([^:\\]|\\.)*(?<! ):( +|$)',
4093
1088
          'option_marker': r'%(option)s(, %(option)s)*(  +| ?$)' % pats,
4094
1089
          'doctest': r'>>>( +|$)',
4095
1090
          'line_block': r'\|( +|$)',
4096
1091
          'grid_table_top': grid_table_top_pat,
4097
1092
          'simple_table_top': simple_table_top_pat,
4098
1093
          'explicit_markup': r'\.\.( +|$)',
4099
1094
          'anonymous': r'__( +|$)',
4100
1095
          'line': r'(%(nonalphanum7bit)s)\1* *$' % pats,
4101
1096
          'text': r''}
4102
1097
    initial_transitions = (
4103
1098
          'bullet',
4104
1099
          'enumerator',
4105
1100
          'field_marker',
4106
1101
          'option_marker',
4107
1102
          'doctest',
4108
1103
          'line_block',
4109
1104
          'grid_table_top',
4110
1105
          'simple_table_top',
4111
1106
          'explicit_markup',
4112
1107
          'anonymous',
4113
1108
          'line',
4114
1109
          'text')
4115
1110
4116
1111
    def indent(self, match, context, next_state):
4117
1112
        """Block quote."""
4118
1113
        indented, indent, line_offset, blank_finish = \
4119
1114
              self.state_machine.get_indented()
4120
1115
        elements = self.block_quote(indented, line_offset)
4121
1116
        self.parent += elements
4122
1117
        if not blank_finish:
4123
1118
            self.parent += self.unindent_warning('Block quote')
4124
1119
        return context, next_state, []
4125
1120
4126
1121
    def block_quote(self, indented, line_offset):
4127
1122
        elements = []
4128
1123
        while indented:
4129
1124
            (blockquote_lines,
4130
1125
             attribution_lines,
4131
1126
             attribution_offset,
4132
1127
             indented,
4133
1128
             new_line_offset) = self.split_attribution(indented, line_offset)
4134
1129
            blockquote = nodes.block_quote()
4135
1130
            self.nested_parse(blockquote_lines, line_offset, blockquote)
4136
1131
            elements.append(blockquote)
4137
1132
            if attribution_lines:
4138
1133
                attribution, messages = self.parse_attribution(
4139
1134
                    attribution_lines, attribution_offset)
4140
1135
                blockquote += attribution
4141
1136
                elements += messages
4142
1137
            line_offset = new_line_offset
4143
1138
            while indented and not indented[0]:
4144
1139
                indented = indented[1:]
4145
1140
                line_offset += 1
4146
1141
        return elements
4147
1142
4148
1143
    # U+2014 is an em-dash:
4149
1144
    attribution_pattern = re.compile(u'(---?(?!-)|\u2014) *(?=[^ \\n])',
4150
1145
                                     re.UNICODE)
4151
1146
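    # A minimal standalone sketch (not part of this module) of how the
    # expression above behaves on candidate attribution lines.  The helper
    # name `strip_attribution_marker` is hypothetical; only the regular
    # expression itself is taken from the code above.

```python
import re

# Same expression as `attribution_pattern` above: "--", "---", or an em-dash
# (U+2014), then optional spaces, then a lookahead for a non-space character.
attribution_pattern = re.compile(u'(---?(?!-)|\u2014) *(?=[^ \\n])',
                                 re.UNICODE)


def strip_attribution_marker(line):
    """Return the text after the dash marker, or None if there is no marker.

    Hypothetical helper for illustration only.
    """
    match = attribution_pattern.match(line)
    if match is None:
        return None
    # match.end() points just past the marker and any spaces following it.
    return line[match.end():]
```

    # Four or more hyphens are rejected by the `(?!-)` lookahead, and a bare
    # marker with no text after it is rejected by the final lookahead.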
4152
1147
    def split_attribution(self, indented, line_offset):
4153
1148
        """
4154
1149
        Check for a block quote attribution and split it off:
4155
1150
4156
1151
        * First line after a blank line must begin with a dash ("--", "---",
4157
1152
          em-dash; matches `self.attribution_pattern`).
4158
1153
        * Every line after that must have consistent indentation.
4159
1154
        * Attributions must be preceded by block quote content.
4160
1155
4161
1156
        Return a tuple of: (block quote content lines, attribution lines,
        attribution offset, remaining indented lines, remaining lines offset).
4163
1158
        """
4164
1159
        blank = None
4165
1160
        nonblank_seen = False
4166
1161
        for i in range(len(indented)):
4167
1162
            line = indented[i].rstrip()
4168
1163
            if line:
4169
1164
                if nonblank_seen and blank == i - 1: # last line blank
4170
1165
                    match = self.attribution_pattern.match(line)
4171
1166
                    if match:
4172
1167
                        attribution_end, indent = self.check_attribution(
4173
1168
                            indented, i)
4174
1169
                        if attribution_end:
4175
1170
                            a_lines = indented[i:attribution_end]
4176
1171
                            a_lines.trim_left(match.end(), end=1)
4177
1172
                            a_lines.trim_left(indent, start=1)
4178
1173
                            return (indented[:i], a_lines,
4179
1174
                                    i, indented[attribution_end:],
4180
1175
                                    line_offset + attribution_end)
4181
1176
                nonblank_seen = True
4182
1177
            else:
4183
1178
                blank = i
4184
1179
        else:
4185
1180
            return (indented, None, None, None, None)
4186
1181
4187
1182
    def check_attribution(self, indented, attribution_start):
4188
1183
        """
4189
1184
        Check attribution shape.
4190
1185
        Return the index past the end of the attribution, and the indent.
4191
1186
        """
4192
1187
        indent = None
4193
1188
        i = attribution_start + 1
4194
1189
        for i in range(attribution_start + 1, len(indented)):
4195
1190
            line = indented[i].rstrip()
4196
1191
            if not line:
4197
1192
                break
4198
1193
            if indent is None:
4199
1194
                indent = len(line) - len(line.lstrip())
4200
1195
            elif len(line) - len(line.lstrip()) != indent:
4201
1196
                return None, None       # bad shape; not an attribution
4202
1197
        else:
4203
1198
            # return index of line after last attribution line:
4204
1199
            i += 1
4205
1200
        return i, (indent or 0)
4206
1201
4207
1202
    def parse_attribution(self, indented, line_offset):
4208
1203
        text = '\n'.join(indented).rstrip()
4209
1204
        lineno = self.state_machine.abs_line_number() + line_offset
4210
1205
        textnodes, messages = self.inline_text(text, lineno)
4211
1206
        node = nodes.attribution(text, '', *textnodes)
4212
1207
        node.source, node.line = self.state_machine.get_source_and_line(lineno)
4213
1208
        return node, messages
4214
1209
4215
1210
    def bullet(self, match, context, next_state):
4216
1211
        """Bullet list item."""
4217
1212
        bulletlist = nodes.bullet_list()
4218
1213
        self.parent += bulletlist
4219
1214
        bulletlist['bullet'] = match.string[0]
4220
1215
        i, blank_finish = self.list_item(match.end())
4221
1216
        bulletlist += i
4222
1217
        offset = self.state_machine.line_offset + 1   # next line
4223
1218
        new_line_offset, blank_finish = self.nested_list_parse(
4224
1219
              self.state_machine.input_lines[offset:],
4225
1220
              input_offset=self.state_machine.abs_line_offset() + 1,
4226
1221
              node=bulletlist, initial_state='BulletList',
4227
1222
              blank_finish=blank_finish)
4228
1223
        self.goto_line(new_line_offset)
4229
1224
        if not blank_finish:
4230
1225
            self.parent += self.unindent_warning('Bullet list')
4231
1226
        return [], next_state, []
4232
1227
4233
1228
    def list_item(self, indent):
4234
1229
        if self.state_machine.line[indent:]:
4235
1230
            indented, line_offset, blank_finish = (
4236
1231
                self.state_machine.get_known_indented(indent))
4237
1232
        else:
4238
1233
            indented, indent, line_offset, blank_finish = (
4239
1234
                self.state_machine.get_first_known_indented(indent))
4240
1235
        listitem = nodes.list_item('\n'.join(indented))
4241
1236
        if indented:
4242
1237
            self.nested_parse(indented, input_offset=line_offset,
4243
1238
                              node=listitem)
4244
1239
        return listitem, blank_finish
4245
1240
4246
1241
    def enumerator(self, match, context, next_state):
4247
1242
        """Enumerated list item."""
4248
1243
        format, sequence, text, ordinal = self.parse_enumerator(match)
4249
1244
        if not self.is_enumerated_list_item(ordinal, sequence, format):
4250
1245
            raise statemachine.TransitionCorrection('text')
4251
1246
        enumlist = nodes.enumerated_list()
4252
1247
        self.parent += enumlist
4253
1248
        if sequence == '#':
4254
1249
            enumlist['enumtype'] = 'arabic'
4255
1250
        else:
4256
1251
            enumlist['enumtype'] = sequence
4257
1252
        enumlist['prefix'] = self.enum.formatinfo[format].prefix
4258
1253
        enumlist['suffix'] = self.enum.formatinfo[format].suffix
4259
1254
        if ordinal != 1:
4260
1255
            enumlist['start'] = ordinal
4261
1256
            msg = self.reporter.info(
4262
1257
                'Enumerated list start value not ordinal-1: "%s" (ordinal %s)'
4263
1258
                % (text, ordinal))
4264
1259
            self.parent += msg
4265
1260
        listitem, blank_finish = self.list_item(match.end())
4266
1261
        enumlist += listitem
4267
1262
        offset = self.state_machine.line_offset + 1   # next line
4268
1263
        newline_offset, blank_finish = self.nested_list_parse(
4269
1264
              self.state_machine.input_lines[offset:],
4270
1265
              input_offset=self.state_machine.abs_line_offset() + 1,
4271
1266
              node=enumlist, initial_state='EnumeratedList',
4272
1267
              blank_finish=blank_finish,
4273
1268
              extra_settings={'lastordinal': ordinal,
4274
1269
                              'format': format,
4275
1270
                              'auto': sequence == '#'})
4276
1271
        self.goto_line(newline_offset)
4277
1272
        if not blank_finish:
4278
1273
            self.parent += self.unindent_warning('Enumerated list')
4279
1274
        return [], next_state, []
4280
1275
4281
1276
    def parse_enumerator(self, match, expected_sequence=None):
4282
1277
        """
4283
1278
        Analyze an enumerator and return the results.
4284
1279
4285
1280
        :Return:
4286
1281
            - the enumerator format ('period', 'parens', or 'rparen'),
4287
1282
            - the sequence used ('arabic', 'loweralpha', 'upperroman', etc.),
4288
1283
            - the text of the enumerator, stripped of formatting, and
4289
1284
            - the ordinal value of the enumerator ('a' -> 1, 'ii' -> 2, etc.;
4290
1285
              ``None`` is returned for invalid enumerator text).
4291
1286
4292
1287
        The enumerator format has already been determined by the regular
4293
1288
        expression match. If `expected_sequence` is given, that sequence is
4294
1289
        tried first. If not, we check for Roman numeral 1. This way,
4295
1290
        single-character Roman numerals (which are also alphabetical) can be
4296
1291
        matched. If no sequence has been matched, all sequences are checked in
4297
1292
        order.
4298
1293
        """
4299
1294
        groupdict = match.groupdict()
4300
1295
        sequence = ''
4301
1296
        for format in self.enum.formats:
4302
1297
            if groupdict[format]:       # was this the format matched?
4303
1298
                break                   # yes; keep `format`
4304
1299
        else:                           # shouldn't happen
4305
1300
            raise ParserError('enumerator format not matched')
4306
1301
        text = groupdict[format][self.enum.formatinfo[format].start
4307
1302
                                 :self.enum.formatinfo[format].end]
4308
1303
        if text == '#':
4309
1304
            sequence = '#'
4310
1305
        elif expected_sequence:
4311
1306
            try:
4312
1307
                if self.enum.sequenceregexps[expected_sequence].match(text):
4313
1308
                    sequence = expected_sequence
4314
1309
            except KeyError:            # shouldn't happen
4315
1310
                raise ParserError('unknown enumerator sequence: %s'
4316
1311
                                  % expected_sequence)
4317
1312
        elif text == 'i':
4318
1313
            sequence = 'lowerroman'
4319
1314
        elif text == 'I':
4320
1315
            sequence = 'upperroman'
4321
1316
        if not sequence:
4322
1317
            for sequence in self.enum.sequences:
4323
1318
                if self.enum.sequenceregexps[sequence].match(text):
4324
1319
                    break
4325
1320
            else:                       # shouldn't happen
4326
1321
                raise ParserError('enumerator sequence not matched')
4327
1322
        if sequence == '#':
4328
1323
            ordinal = 1
4329
1324
        else:
4330
1325
            try:
4331
1326
                ordinal = self.enum.converters[sequence](text)
4332
1327
            except roman.InvalidRomanNumeralError:
4333
1328
                ordinal = None
4334
1329
        return format, sequence, text, ordinal
4335
1330
4336
1331
    def is_enumerated_list_item(self, ordinal, sequence, format):
4337
1332
        """
4338
1333
        Check validity based on the ordinal value and the second line.
4339
1334
4340
1335
        Return true if the ordinal is valid and the second line is blank,
4341
1336
        indented, or starts with the next enumerator or an auto-enumerator.
4342
1337
        """
4343
1338
        if ordinal is None:
4344
1339
            return None
4345
1340
        try:
4346
1341
            next_line = self.state_machine.next_line()
4347
1342
        except EOFError:              # end of input lines
4348
1343
            self.state_machine.previous_line()
4349
1344
            return 1
4350
1345
        else:
4351
1346
            self.state_machine.previous_line()
4352
1347
        if not next_line[:1].strip():   # blank or indented
4353
1348
            return 1
4354
1349
        result = self.make_enumerator(ordinal + 1, sequence, format)
4355
1350
        if result:
4356
1351
            next_enumerator, auto_enumerator = result
4357
1352
            try:
4358
1353
                if ( next_line.startswith(next_enumerator) or
4359
1354
                     next_line.startswith(auto_enumerator) ):
4360
1355
                    return 1
4361
1356
            except TypeError:
4362
1357
                pass
4363
1358
        return None
4364
1359
4365
1360
    def make_enumerator(self, ordinal, sequence, format):
4366
1361
        """
4367
1362
        Construct and return the next enumerated list item marker, and an
4368
1363
        auto-enumerator ("#" instead of the regular enumerator).
4369
1364
4370
1365
        Return ``None`` for invalid (out of range) ordinals.
4371
1366
        """ #"
4372
1367
        if sequence == '#':
4373
1368
            enumerator = '#'
4374
1369
        elif sequence == 'arabic':
4375
1370
            enumerator = str(ordinal)
4376
1371
        else:
4377
1372
            if sequence.endswith('alpha'):
4378
1373
                if ordinal > 26:
4379
1374
                    return None
4380
1375
                enumerator = chr(ordinal + ord('a') - 1)
4381
1376
            elif sequence.endswith('roman'):
4382
1377
                try:
4383
1378
                    enumerator = roman.toRoman(ordinal)
4384
1379
                except roman.RomanError:
4385
1380
                    return None
4386
1381
            else:                       # shouldn't happen
4387
1382
                raise ParserError('unknown enumerator sequence: "%s"'
4388
1383
                                  % sequence)
4389
1384
            if sequence.startswith('lower'):
4390
1385
                enumerator = enumerator.lower()
4391
1386
            elif sequence.startswith('upper'):
4392
1387
                enumerator = enumerator.upper()
4393
1388
            else:                       # shouldn't happen
4394
1389
                raise ParserError('unknown enumerator sequence: "%s"'
4395
1390
                                  % sequence)
4396
1391
        formatinfo = self.enum.formatinfo[format]
4397
1392
        next_enumerator = (formatinfo.prefix + enumerator + formatinfo.suffix
4398
1393
                           + ' ')
4399
1394
        auto_enumerator = formatinfo.prefix + '#' + formatinfo.suffix + ' '
4400
1395
        return next_enumerator, auto_enumerator
4401
1396
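    # A minimal standalone sketch (not part of this module) of the marker
    # construction performed by `make_enumerator` above.  Assumptions: the
    # `FORMATS` table stands in for `self.enum.formatinfo`, and `to_roman`
    # is a hypothetical replacement for `roman.toRoman`, so the sketch runs
    # without the `roman` module.

```python
def to_roman(n):
    """Convert 1..4999 to an upper-case Roman numeral (illustrative helper)."""
    if not 0 < n < 5000:
        raise ValueError('ordinal out of range: %r' % (n,))
    numerals = ((1000, 'M'), (900, 'CM'), (500, 'D'), (400, 'CD'),
                (100, 'C'), (90, 'XC'), (50, 'L'), (40, 'XL'),
                (10, 'X'), (9, 'IX'), (5, 'V'), (4, 'IV'), (1, 'I'))
    result = ''
    for value, symbol in numerals:
        while n >= value:
            result += symbol
            n -= value
    return result


# Assumed (prefix, suffix) pairs for the three enumerator formats.
FORMATS = {'parens': ('(', ')'), 'rparen': ('', ')'), 'period': ('', '.')}


def make_enumerator(ordinal, sequence, format):
    """Return (next marker, auto marker), or None for out-of-range ordinals."""
    if sequence == '#':
        text = '#'
    elif sequence == 'arabic':
        text = str(ordinal)
    elif sequence.endswith('alpha'):
        if not 1 <= ordinal <= 26:
            return None
        text = chr(ordinal + ord('a') - 1)
    elif sequence.endswith('roman'):
        try:
            text = to_roman(ordinal)
        except ValueError:
            return None
    else:
        raise ValueError('unknown enumerator sequence: "%s"' % sequence)
    # Apply the case implied by the sequence name.
    if sequence.startswith('upper'):
        text = text.upper()
    elif sequence.startswith('lower'):
        text = text.lower()
    prefix, suffix = FORMATS[format]
    return prefix + text + suffix + ' ', prefix + '#' + suffix + ' '
```

    # The trailing space in both markers is what lets the caller test the
    # following line with a plain `startswith` check.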
4402
1397
    def field_marker(self, match, context, next_state):
4403
1398
        """Field list item."""
4404
1399
        field_list = nodes.field_list()
4405
1400
        self.parent += field_list
4406
1401
        field, blank_finish = self.field(match)
4407
1402
        field_list += field
4408
1403
        offset = self.state_machine.line_offset + 1   # next line
4409
1404
        newline_offset, blank_finish = self.nested_list_parse(
4410
1405
              self.state_machine.input_lines[offset:],
4411
1406
              input_offset=self.state_machine.abs_line_offset() + 1,
4412
1407
              node=field_list, initial_state='FieldList',
4413
1408
              blank_finish=blank_finish)
4414
1409
        self.goto_line(newline_offset)
4415
1410
        if not blank_finish:
4416
1411
            self.parent += self.unindent_warning('Field list')
4417
1412
        return [], next_state, []
4418
1413
4419
1414
    def field(self, match):
4420
1415
        name = self.parse_field_marker(match)
4421
1416
        src, srcline = self.state_machine.get_source_and_line()
4422
1417
        lineno = self.state_machine.abs_line_number()
4423
1418
        indented, indent, line_offset, blank_finish = \
4424
1419
              self.state_machine.get_first_known_indented(match.end())
4425
1420
        field_node = nodes.field()
4426
1421
        field_node.source = src
4427
1422
        field_node.line = srcline
4428
1423
        name_nodes, name_messages = self.inline_text(name, lineno)
4429
1424
        field_node += nodes.field_name(name, '', *name_nodes)
4430
1425
        field_body = nodes.field_body('\n'.join(indented), *name_messages)
4431
1426
        field_node += field_body
4432
1427
        if indented:
4433
1428
            self.parse_field_body(indented, line_offset, field_body)
4434
1429
        return field_node, blank_finish
4435
1430
4436
1431
    def parse_field_marker(self, match):
4437
1432
        """Extract & return field name from a field marker match."""
4438
1433
        field = match.group()[1:]        # strip off leading ':'
4439
1434
        field = field[:field.rfind(':')] # strip off trailing ':' etc.
4440
1435
        return field
4441
1436
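    # A minimal standalone sketch (not part of this module) combining the
    # 'field_marker' pattern from `patterns` with the slicing done in
    # `parse_field_marker` above.  The function name `field_name` is
    # hypothetical.

```python
import re

# Same expression as the 'field_marker' entry in `patterns`.
field_marker = re.compile(r':(?![: ])([^:\\]|\\.)*(?<! ):( +|$)')


def field_name(line):
    """Return the field name from a field marker line, or None if no match."""
    match = field_marker.match(line)
    if match is None:
        return None
    field = match.group()[1:]           # strip off leading ':'
    return field[:field.rfind(':')]     # strip off trailing ':' and spaces
```

    # A backslash-escaped colon stays inside the name, while a name that
    # starts with a space or colon, or ends with a space, is not a marker.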
4442
1437
    def parse_field_body(self, indented, offset, node):
4443
1438
        self.nested_parse(indented, input_offset=offset, node=node)
4444
1439
4445
1440
    def option_marker(self, match, context, next_state):
4446
1441
        """Option list item."""
4447
1442
        optionlist = nodes.option_list()
4448
1443
        try:
4449
1444
            listitem, blank_finish = self.option_list_item(match)
4450
1445
        except MarkupError, error:
4451
1446
            # This shouldn't happen; pattern won't match.
4452
1447
            msg = self.reporter.error(u'Invalid option list marker: %s' %
4453
1448
                                      error)
4454
1449
            self.parent += msg
4455
1450
            indented, indent, line_offset, blank_finish = \
4456
1451
                  self.state_machine.get_first_known_indented(match.end())
4457
1452
            elements = self.block_quote(indented, line_offset)
4458
1453
            self.parent += elements
4459
1454
            if not blank_finish:
4460
1455
                self.parent += self.unindent_warning('Option list')
4461
1456
            return [], next_state, []
4462
1457
        self.parent += optionlist
4463
1458
        optionlist += listitem
4464
1459
        offset = self.state_machine.line_offset + 1   # next line
4465
1460
        newline_offset, blank_finish = self.nested_list_parse(
4466
1461
              self.state_machine.input_lines[offset:],
4467
1462
              input_offset=self.state_machine.abs_line_offset() + 1,
4468
1463
              node=optionlist, initial_state='OptionList',
4469
1464
              blank_finish=blank_finish)
4470
1465
        self.goto_line(newline_offset)
4471
1466
        if not blank_finish:
4472
1467
            self.parent += self.unindent_warning('Option list')
4473
1468
        return [], next_state, []
4474
1469
4475
1470
    def option_list_item(self, match):
4476
1471
        offset = self.state_machine.abs_line_offset()
4477
1472
        options = self.parse_option_marker(match)
4478
1473
        indented, indent, line_offset, blank_finish = \
4479
1474
              self.state_machine.get_first_known_indented(match.end())
4480
1475
        if not indented:                # not an option list item
4481
1476
            self.goto_line(offset)
4482
1477
            raise statemachine.TransitionCorrection('text')
4483
1478
        option_group = nodes.option_group('', *options)
4484
1479
        description = nodes.description('\n'.join(indented))
4485
1480
        option_list_item = nodes.option_list_item('', option_group,
4486
1481
                                                  description)
4487
1482
        if indented:
4488
1483
            self.nested_parse(indented, input_offset=line_offset,
4489
1484
                              node=description)
4490
1485
        return option_list_item, blank_finish
4491
1486
4492
1487
    def parse_option_marker(self, match):
4493
1488
        """
4494
1489
        Return a list of `nodes.option` and `nodes.option_argument` objects,
4495
1490
        parsed from an option marker match.
4496
1491
4497
1492
        :Exception: `MarkupError` for invalid option markers.
4498
1493
        """
4499
1494
        optlist = []
4500
1495
        optionstrings = match.group().rstrip().split(', ')
4501
1496
        for optionstring in optionstrings:
4502
1497
            tokens = optionstring.split()
4503
1498
            delimiter = ' '
4504
1499
            firstopt = tokens[0].split('=', 1)
4505
1500
            if len(firstopt) > 1:
4506
1501
                # "--opt=value" form
4507
1502
                tokens[:1] = firstopt
4508
1503
                delimiter = '='
4509
1504
            elif (len(tokens[0]) > 2
4510
1505
                  and ((tokens[0].startswith('-')
4511
1506
                        and not tokens[0].startswith('--'))
4512
1507
                       or tokens[0].startswith('+'))):
4513
1508
                # "-ovalue" form
4514
1509
                tokens[:1] = [tokens[0][:2], tokens[0][2:]]
4515
1510
                delimiter = ''
4516
1511
            if len(tokens) > 1 and (tokens[1].startswith('<')
4517
1512
                                    and tokens[-1].endswith('>')):
4518
1513
                # "-o <value1 value2>" form; join all values into one token
4519
1514
                tokens[1:] = [' '.join(tokens[1:])]
4520
1515
            if 0 < len(tokens) <= 2:
4521
1516
                option = nodes.option(optionstring)
4522
1517
                option += nodes.option_string(tokens[0], tokens[0])
4523
1518
                if len(tokens) > 1:
4524
1519
                    option += nodes.option_argument(tokens[1], tokens[1],
4525
1520
                                                    delimiter=delimiter)
4526
1521
                optlist.append(option)
4527
1522
            else:
4528
1523
                raise MarkupError(
4529
1524
                    'wrong number of option tokens (=%s), should be 1 or 2: '
4530
1525
                    '"%s"' % (len(tokens), optionstring))
4531
1526
        return optlist
4532
1527
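    # A minimal standalone sketch (not part of this module) of the token
    # splitting in `parse_option_marker` above.  It returns plain tuples
    # instead of `nodes.option` objects and raises `ValueError` where the
    # original raises `MarkupError`; both function names are hypothetical.

```python
def split_option(optionstring):
    """Return (option, argument, delimiter) for a single option string."""
    tokens = optionstring.split()
    delimiter = ' '
    firstopt = tokens[0].split('=', 1)
    if len(firstopt) > 1:
        # "--opt=value" form
        tokens[:1] = firstopt
        delimiter = '='
    elif (len(tokens[0]) > 2
          and ((tokens[0].startswith('-')
                and not tokens[0].startswith('--'))
               or tokens[0].startswith('+'))):
        # "-ovalue" form
        tokens[:1] = [tokens[0][:2], tokens[0][2:]]
        delimiter = ''
    if len(tokens) > 1 and (tokens[1].startswith('<')
                            and tokens[-1].endswith('>')):
        # "-o <value1 value2>" form; join all values into one token
        tokens[1:] = [' '.join(tokens[1:])]
    if not 0 < len(tokens) <= 2:
        raise ValueError('wrong number of option tokens (=%s), should be '
                         '1 or 2: "%s"' % (len(tokens), optionstring))
    if len(tokens) == 1:
        return tokens[0], None, None
    return tokens[0], tokens[1], delimiter


def split_option_marker(marker):
    """Split a full marker such as "-f FILE, --file=FILE" into tuples."""
    return [split_option(s) for s in marker.rstrip().split(', ')]
```

    # The delimiter records how the argument was attached (' ', '=' or '').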
4533
1528
    def doctest(self, match, context, next_state):
4534
1529
        data = '\n'.join(self.state_machine.get_text_block())
4535
1530
        self.parent += nodes.doctest_block(data, data)
4536
1531
        return [], next_state, []
4537
1532
4538
1533
    def line_block(self, match, context, next_state):
4539
1534
        """First line of a line block."""
4540
1535
        block = nodes.line_block()
4541
1536
        self.parent += block
4542
1537
        lineno = self.state_machine.abs_line_number()
4543
1538
        line, messages, blank_finish = self.line_block_line(match, lineno)
4544
1539
        block += line
4545
1540
        self.parent += messages
4546
1541
        if not blank_finish:
4547
1542
            offset = self.state_machine.line_offset + 1   # next line
4548
1543
            new_line_offset, blank_finish = self.nested_list_parse(
4549
1544
                  self.state_machine.input_lines[offset:],
4550
1545
                  input_offset=self.state_machine.abs_line_offset() + 1,
4551
1546
                  node=block, initial_state='LineBlock',
4552
1547
                  blank_finish=0)
4553
1548
            self.goto_line(new_line_offset)
4554
1549
        if not blank_finish:
4555
1550
            self.parent += self.reporter.warning(
4556
1551
                'Line block ends without a blank line.',
4557
1552
                line=lineno+1)
4558
1553
        if len(block):
4559
1554
            if block[0].indent is None:
4560
1555
                block[0].indent = 0
4561
1556
            self.nest_line_block_lines(block)
4562
1557
        return [], next_state, []
4563
1558
4564
1559
    def line_block_line(self, match, lineno):
4565
1560
        """Return one line element of a line_block."""
4566
1561
        indented, indent, line_offset, blank_finish = \
4567
1562
            self.state_machine.get_first_known_indented(match.end(),
4568
1563
                                                        until_blank=True)
4569
1564
        text = u'\n'.join(indented)
4570
1565
        text_nodes, messages = self.inline_text(text, lineno)
4571
1566
        line = nodes.line(text, '', *text_nodes)
4572
1567
        if match.string.rstrip() != '|': # not empty
4573
1568
            line.indent = len(match.group(1)) - 1
4574
1569
        return line, messages, blank_finish
4575
1570
4576
1571
    def nest_line_block_lines(self, block):
4577
1572
        for index in range(1, len(block)):
4578
1573
            if block[index].indent is None:
4579
1574
                block[index].indent = block[index - 1].indent
4580
1575
        self.nest_line_block_segment(block)
4581
1576
4582
1577
    def nest_line_block_segment(self, block):
4583
1578
        indents = [item.indent for item in block]
4584
1579
        least = min(indents)
4585
1580
        new_items = []
4586
1581
        new_block = nodes.line_block()
4587
1582
        for item in block:
4588
1583
            if item.indent > least:
4589
1584
                new_block.append(item)
4590
1585
            else:
4591
1586
                if len(new_block):
4592
1587
                    self.nest_line_block_segment(new_block)
4593
1588
                    new_items.append(new_block)
4594
1589
                    new_block = nodes.line_block()
4595
1590
                new_items.append(item)
4596
1591
        if len(new_block):
4597
1592
            self.nest_line_block_segment(new_block)
4598
1593
            new_items.append(new_block)
4599
1594
        block[:] = new_items
4600
1595
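    # A minimal standalone sketch (not part of this module) of the recursive
    # grouping in `nest_line_block_segment` above.  It uses plain
    # (indent, text) tuples for lines and Python lists for nested line
    # blocks, and returns a new list instead of modifying the block in
    # place; the name `nest_segment` is hypothetical.

```python
def nest_segment(items):
    """Group runs of more deeply indented items into nested lists."""
    least = min(indent for indent, _text in items)
    result = []
    deeper = []                 # current run of items indented past `least`
    for item in items:
        if item[0] > least:
            deeper.append(item)
        else:
            if deeper:
                # Close the pending run and nest it recursively.
                result.append(nest_segment(deeper))
                deeper = []
            result.append(item)
    if deeper:
        result.append(nest_segment(deeper))
    return result
```

    # Each recursive call only sees items indented past the current minimum,
    # so the recursion ends once a run has a single indentation level.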
4601
1596
    def grid_table_top(self, match, context, next_state):
4602
1597
        """Top border of a full table."""
4603
1598
        return self.table_top(match, context, next_state,
4604
1599
                              self.isolate_grid_table,
4605
1600
                              tableparser.GridTableParser)
4606
1601
4607
1602
    def simple_table_top(self, match, context, next_state):
4608
1603
        """Top border of a simple table."""
4609
1604
        return self.table_top(match, context, next_state,
4610
1605
                              self.isolate_simple_table,
4611
1606
                              tableparser.SimpleTableParser)
4612
1607
4613
1608
    def table_top(self, match, context, next_state,
4614
1609
                  isolate_function, parser_class):
4615
1610
        """Top border of a generic table."""
4616
1611
        nodelist, blank_finish = self.table(isolate_function, parser_class)
4617
1612
        self.parent += nodelist
4618
1613
        if not blank_finish:
4619
1614
            msg = self.reporter.warning(
4620
1615
                'Blank line required after table.',
4621
1616
                line=self.state_machine.abs_line_number()+1)
4622
1617
            self.parent += msg
4623
1618
        return [], next_state, []
4624
1619
4625
1620
    def table(self, isolate_function, parser_class):
4626
1621
        """Parse a table."""
4627
1622
        block, messages, blank_finish = isolate_function()
4628
1623
        if block:
4629
1624
            try:
4630
1625
                parser = parser_class()
4631
1626
                tabledata = parser.parse(block)
4632
1627
                tableline = (self.state_machine.abs_line_number() - len(block)
4633
1628
                             + 1)
4634
1629
                table = self.build_table(tabledata, tableline)
4635
1630
                nodelist = [table] + messages
4636
1631
            except tableparser.TableMarkupError, err:
4637
1632
                nodelist = self.malformed_table(block, ' '.join(err.args),
4638
1633
                                                offset=err.offset) + messages
4639
1634
        else:
4640
1635
            nodelist = messages
4641
1636
        return nodelist, blank_finish
4642
1637
4643
1638
    def isolate_grid_table(self):
4644
1639
        messages = []
4645
1640
        blank_finish = 1
4646
1641
        try:
4647
1642
            block = self.state_machine.get_text_block(flush_left=True)
4648
1643
        except statemachine.UnexpectedIndentationError, err:
4649
1644
            block, src, srcline = err.args
4650
1645
            messages.append(self.reporter.error('Unexpected indentation.',
4651
1646
                                                source=src, line=srcline))
4652
1647
            blank_finish = 0
4653
1648
        block.disconnect()
4654
1649
        # for East Asian chars:
4655
1650
        block.pad_double_width(self.double_width_pad_char)
4656
1651
        width = len(block[0].strip())
4657
1652
        for i in range(len(block)):
4658
1653
            block[i] = block[i].strip()
4659
1654
            if block[i][0] not in '+|': # check left edge
4660
1655
                blank_finish = 0
4661
1656
                self.state_machine.previous_line(len(block) - i)
4662
1657
                del block[i:]
4663
1658
                break
4664
1659
        if not self.grid_table_top_pat.match(block[-1]): # find bottom
4665
1660
            blank_finish = 0
4666
1661
            # from second-last to third line of table:
4667
1662
            for i in range(len(block) - 2, 1, -1):
4668
1663
                if self.grid_table_top_pat.match(block[i]):
4669
1664
                    self.state_machine.previous_line(len(block) - i + 1)
4670
1665
                    del block[i+1:]
4671
1666
                    break
4672
1667
            else:
4673
1668
                messages.extend(self.malformed_table(block))
4674
1669
                return [], messages, blank_finish
4675
1670
        for i in range(len(block)):     # check right edge
4676
1671
            if len(block[i]) != width or block[i][-1] not in '+|':
4677
1672
                messages.extend(self.malformed_table(block))
4678
1673
                return [], messages, blank_finish
4679
1674
        return block, messages, blank_finish
4680
1675
4681
1676
    def isolate_simple_table(self):
        start = self.state_machine.line_offset
        lines = self.state_machine.input_lines
        limit = len(lines) - 1
        toplen = len(lines[start].strip())
        pattern_match = self.simple_table_border_pat.match
        found = 0
        found_at = None
        i = start + 1
        while i <= limit:
            line = lines[i]
            match = pattern_match(line)
            if match:
                if len(line.strip()) != toplen:
                    self.state_machine.next_line(i - start)
                    messages = self.malformed_table(
                        lines[start:i+1], 'Bottom/header table border does '
                        'not match top border.')
                    return [], messages, i == limit or not lines[i+1].strip()
                found += 1
                found_at = i
                if found == 2 or i == limit or not lines[i+1].strip():
                    end = i
                    break
            i += 1
        else:                           # reached end of input_lines
            if found:
                extra = ' or no blank line after table bottom'
                self.state_machine.next_line(found_at - start)
                block = lines[start:found_at+1]
            else:
                extra = ''
                self.state_machine.next_line(i - start - 1)
                block = lines[start:]
            messages = self.malformed_table(
                block, 'No bottom table border found%s.' % extra)
            return [], messages, not extra
        self.state_machine.next_line(end - start)
        block = lines[start:end+1]
        # for East Asian chars:
        block.pad_double_width(self.double_width_pad_char)
        return block, [], end == limit or not lines[end+1].strip()

    def malformed_table(self, block, detail='', offset=0):
        block.replace(self.double_width_pad_char, '')
        data = '\n'.join(block)
        message = 'Malformed table.'
        startline = self.state_machine.abs_line_number() - len(block) + 1
        if detail:
            message += '\n' + detail
        error = self.reporter.error(message, nodes.literal_block(data, data),
                                    line=startline+offset)
        return [error]

    def build_table(self, tabledata, tableline, stub_columns=0):
        colwidths, headrows, bodyrows = tabledata
        table = nodes.table()
        tgroup = nodes.tgroup(cols=len(colwidths))
        table += tgroup
        for colwidth in colwidths:
            colspec = nodes.colspec(colwidth=colwidth)
            if stub_columns:
                colspec.attributes['stub'] = 1
                stub_columns -= 1
            tgroup += colspec
        if headrows:
            thead = nodes.thead()
            tgroup += thead
            for row in headrows:
                thead += self.build_table_row(row, tableline)
        tbody = nodes.tbody()
        tgroup += tbody
        for row in bodyrows:
            tbody += self.build_table_row(row, tableline)
        return table

    def build_table_row(self, rowdata, tableline):
        row = nodes.row()
        for cell in rowdata:
            if cell is None:
                continue
            morerows, morecols, offset, cellblock = cell
            attributes = {}
            if morerows:
                attributes['morerows'] = morerows
            if morecols:
                attributes['morecols'] = morecols
            entry = nodes.entry(**attributes)
            row += entry
            if ''.join(cellblock):
                self.nested_parse(cellblock, input_offset=tableline+offset,
                                  node=entry)
        return row


    explicit = Struct()
    """Patterns and constants used for explicit markup recognition."""

    explicit.patterns = Struct(
          target=re.compile(r"""
                            (
                              _               # anonymous target
                            |               # *OR*
                              (?!_)           # no underscore at the beginning
                              (?P<quote>`?)   # optional open quote
                              (?![ `])        # first char. not space or
                                              # backquote
                              (?P<name>       # reference name
                                .+?
                              )
                              %(non_whitespace_escape_before)s
                              (?P=quote)      # close quote if open quote used
                            )
                            (?<!(?<!\x00):) # no unescaped colon at end
                            %(non_whitespace_escape_before)s
                            [ ]?            # optional space
                            :               # end of reference name
                            ([ ]+|$)        # followed by whitespace
                            """ % vars(Inliner), re.VERBOSE | re.UNICODE),
          reference=re.compile(r"""
                               (
                                 (?P<simple>%(simplename)s)_
                               |                  # *OR*
                                 `                  # open backquote
                                 (?![ ])            # not space
                                 (?P<phrase>.+?)    # hyperlink phrase
                                 %(non_whitespace_escape_before)s
                                 `_                 # close backquote,
                                                    # reference mark
                               )
                               $                  # end of string
                               """ % vars(Inliner), re.VERBOSE | re.UNICODE),
          substitution=re.compile(r"""
                                  (
                                    (?![ ])          # first char. not space
                                    (?P<name>.+?)    # substitution text
                                    %(non_whitespace_escape_before)s
                                    \|               # close delimiter
                                  )
                                  ([ ]+|$)           # followed by whitespace
                                  """ % vars(Inliner),
                                  re.VERBOSE | re.UNICODE),)

    def footnote(self, match):
        src, srcline = self.state_machine.get_source_and_line()
        indented, indent, offset, blank_finish = \
              self.state_machine.get_first_known_indented(match.end())
        label = match.group(1)
        name = normalize_name(label)
        footnote = nodes.footnote('\n'.join(indented))
        footnote.source = src
        footnote.line = srcline
        if name[0] == '#':              # auto-numbered
            name = name[1:]             # autonumber label
            footnote['auto'] = 1
            if name:
                footnote['names'].append(name)
            self.document.note_autofootnote(footnote)
        elif name == '*':               # auto-symbol
            name = ''
            footnote['auto'] = '*'
            self.document.note_symbol_footnote(footnote)
        else:                           # manually numbered
            footnote += nodes.label('', label)
            footnote['names'].append(name)
            self.document.note_footnote(footnote)
        if name:
            self.document.note_explicit_target(footnote, footnote)
        else:
            self.document.set_id(footnote, footnote)
        if indented:
            self.nested_parse(indented, input_offset=offset, node=footnote)
        return [footnote], blank_finish

    def citation(self, match):
        src, srcline = self.state_machine.get_source_and_line()
        indented, indent, offset, blank_finish = \
              self.state_machine.get_first_known_indented(match.end())
        label = match.group(1)
        name = normalize_name(label)
        citation = nodes.citation('\n'.join(indented))
        citation.source = src
        citation.line = srcline
        citation += nodes.label('', label)
        citation['names'].append(name)
        self.document.note_citation(citation)
        self.document.note_explicit_target(citation, citation)
        if indented:
            self.nested_parse(indented, input_offset=offset, node=citation)
        return [citation], blank_finish

    def hyperlink_target(self, match):
        pattern = self.explicit.patterns.target
        lineno = self.state_machine.abs_line_number()
        block, indent, offset, blank_finish = \
              self.state_machine.get_first_known_indented(
              match.end(), until_blank=True, strip_indent=False)
        blocktext = match.string[:match.end()] + '\n'.join(block)
        block = [escape2null(line) for line in block]
        escaped = block[0]
        blockindex = 0
        while True:
            targetmatch = pattern.match(escaped)
            if targetmatch:
                break
            blockindex += 1
            try:
                escaped += block[blockindex]
            except IndexError:
                raise MarkupError('malformed hyperlink target.')
        del block[:blockindex]
        block[0] = (block[0] + ' ')[targetmatch.end()-len(escaped)-1:].strip()
        target = self.make_target(block, blocktext, lineno,
                                  targetmatch.group('name'))
        return [target], blank_finish

    def make_target(self, block, block_text, lineno, target_name):
        target_type, data = self.parse_target(block, block_text, lineno)
        if target_type == 'refname':
            target = nodes.target(block_text, '', refname=normalize_name(data))
            target.indirect_reference_name = data
            self.add_target(target_name, '', target, lineno)
            self.document.note_indirect_target(target)
            return target
        elif target_type == 'refuri':
            target = nodes.target(block_text, '')
            self.add_target(target_name, data, target, lineno)
            return target
        else:
            return data

    def parse_target(self, block, block_text, lineno):
        """
        Determine the type of reference of a target.

        :Return: A 2-tuple, one of:

            - 'refname' and the indirect reference name
            - 'refuri' and the URI
            - 'malformed' and a system_message node
        """
        if block and block[-1].strip()[-1:] == '_': # possible indirect target
            reference = ' '.join([line.strip() for line in block])
            refname = self.is_reference(reference)
            if refname:
                return 'refname', refname
        reference = ''.join([''.join(line.split()) for line in block])
        return 'refuri', unescape(reference)

    def is_reference(self, reference):
        match = self.explicit.patterns.reference.match(
            whitespace_normalize_name(reference))
        if not match:
            return None
        return unescape(match.group('simple') or match.group('phrase'))

    def add_target(self, targetname, refuri, target, lineno):
        target.line = lineno
        if targetname:
            name = normalize_name(unescape(targetname))
            target['names'].append(name)
            if refuri:
                uri = self.inliner.adjust_uri(refuri)
                if uri:
                    target['refuri'] = uri
                else:
                    raise ApplicationError('problem with URI: %r' % refuri)
            self.document.note_explicit_target(target, self.parent)
        else:                       # anonymous target
            if refuri:
                target['refuri'] = refuri
            target['anonymous'] = 1
            self.document.note_anonymous_target(target)

    def substitution_def(self, match):
        pattern = self.explicit.patterns.substitution
        src, srcline = self.state_machine.get_source_and_line()
        block, indent, offset, blank_finish = \
              self.state_machine.get_first_known_indented(match.end(),
                                                          strip_indent=False)
        blocktext = (match.string[:match.end()] + '\n'.join(block))
        block.disconnect()
        escaped = escape2null(block[0].rstrip())
        blockindex = 0
        while True:
            subdefmatch = pattern.match(escaped)
            if subdefmatch:
                break
            blockindex += 1
            try:
                escaped = escaped + ' ' + escape2null(block[blockindex].strip())
            except IndexError:
                raise MarkupError('malformed substitution definition.')
        del block[:blockindex]          # strip out the substitution marker
        block[0] = (block[0].strip() + ' ')[subdefmatch.end()-len(escaped)-1:-1]
        if not block[0]:
            del block[0]
            offset += 1
        while block and not block[-1].strip():
            block.pop()
        subname = subdefmatch.group('name')
        substitution_node = nodes.substitution_definition(blocktext)
        substitution_node.source = src
        substitution_node.line = srcline
        if not block:
            msg = self.reporter.warning(
                'Substitution definition "%s" missing contents.' % subname,
                nodes.literal_block(blocktext, blocktext),
                source=src, line=srcline)
            return [msg], blank_finish
        block[0] = block[0].strip()
        substitution_node['names'].append(
            nodes.whitespace_normalize_name(subname))
        new_abs_offset, blank_finish = self.nested_list_parse(
              block, input_offset=offset, node=substitution_node,
              initial_state='SubstitutionDef', blank_finish=blank_finish)
        i = 0
        for node in substitution_node[:]:
            if not (isinstance(node, nodes.Inline) or
                    isinstance(node, nodes.Text)):
Status:	Merged
Merged at revision:	33
Proposed branch:	lp:~mitya57/ubuntu/raring/python-docutils/updated-aliases-patch
Merge into:	lp:ubuntu/raring/python-docutils
Diff against target:	10178 lines (+9862/-50), 14 files modified
    .pc/applied-patches (+1/-0)
    .pc/support-aliases-in-references.diff/docs/ref/rst/restructuredtext.txt (+2979/-0)
    .pc/support-aliases-in-references.diff/docutils/parsers/rst/states.py (+3052/-0)
    .pc/support-aliases-in-references.diff/docutils/transforms/references.py (+903/-0)
    .pc/support-aliases-in-references.diff/test/test_parsers/test_rst/test_inline_markup.py (+1682/-0)
    .pc/support-aliases-in-references.diff/test/test_transforms/test_hyperlinks.py (+843/-0)
    debian/changelog (+7/-0)
    debian/patches/series (+1/-0)
    debian/patches/support-aliases-in-references.diff (+130/-18)
    docs/ref/rst/restructuredtext.txt (+39/-12)
    docutils/parsers/rst/states.py (+37/-16)
    docutils/transforms/references.py (+6/-4)
    test/test_parsers/test_rst/test_inline_markup.py (+85/-0)
    test/test_transforms/test_hyperlinks.py (+97/-0)
To merge this branch:	bzr merge lp:~mitya57/ubuntu/raring/python-docutils/updated-aliases-patch
Reviewer	Review Type	Date Requested	Status
Ubuntu branches		2013-03-19	Pending
Review via email: mp+154004@code.launchpad.net