Merge lp:~mitya57/ubuntu/raring/python-docutils/updated-aliases-patch into lp:ubuntu/raring/python-docutils

Proposed by Dmitry Shachnev
Status: Merged
Merged at revision: 33
Proposed branch: lp:~mitya57/ubuntu/raring/python-docutils/updated-aliases-patch
Merge into: lp:ubuntu/raring/python-docutils
Diff against target: 10178 lines (+9862/-50)
14 files modified
.pc/applied-patches (+1/-0)
.pc/support-aliases-in-references.diff/docs/ref/rst/restructuredtext.txt (+2979/-0)
.pc/support-aliases-in-references.diff/docutils/parsers/rst/states.py (+3052/-0)
.pc/support-aliases-in-references.diff/docutils/transforms/references.py (+903/-0)
.pc/support-aliases-in-references.diff/test/test_parsers/test_rst/test_inline_markup.py (+1682/-0)
.pc/support-aliases-in-references.diff/test/test_transforms/test_hyperlinks.py (+843/-0)
debian/changelog (+7/-0)
debian/patches/series (+1/-0)
debian/patches/support-aliases-in-references.diff (+130/-18)
docs/ref/rst/restructuredtext.txt (+39/-12)
docutils/parsers/rst/states.py (+37/-16)
docutils/transforms/references.py (+6/-4)
test/test_parsers/test_rst/test_inline_markup.py (+85/-0)
test/test_transforms/test_hyperlinks.py (+97/-0)
To merge this branch: bzr merge lp:~mitya57/ubuntu/raring/python-docutils/updated-aliases-patch
Reviewer: Ubuntu branches
Review status: Pending
Review via email: mp+154004@code.launchpad.net

Description of the change

Note: I don't think this needs an FFe because this patch was present in the previous version of the package before the FF, and disabling it was a quick work-around for the crash (please correct me if I'm wrong).

The patch in question makes it possible to always use links like

    `Example link <1_>`_

    .. _1: http://example.com

which is a useful feature for localized projects like the Ubuntu Packaging Guide (it has been possible to *sometimes* use such links before, but now the support is consistent).

This patch was disabled in 0.10-1ubuntu2 because it caused the python3.3 docs to fail to build. That crash was possible even without this patch, but the patch made it easier to trigger. The crash is now fixed in the new version. In addition, references that start with a URI scheme are now ignored and processed as in previous versions, which ensures we don't break any existing projects.
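
As a rough illustration of the rule above (not the code from the patch itself), the choice between treating the text inside the angle brackets as an alias or as a URI could be sketched as follows. The function name `classify_embedded_target` and the scheme regex are assumptions made for this sketch; the real logic lives in docutils/parsers/rst/states.py and may differ.

```python
import re

# Assumption: a URI scheme is a letter followed by letters, digits,
# "+", "-" or ".", and then a colon (the RFC 3986 scheme syntax).
# The actual patch may use a different check.
URI_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def classify_embedded_target(text):
    """Classify the text found between "<" and ">" in a reference.

    Returns ("alias", name) when the text looks like a reference to
    another target (it ends with "_" and does not start with a URI
    scheme), and ("uri", text) otherwise.
    """
    stripped = text.strip()
    if stripped.endswith("_") and not URI_SCHEME.match(stripped):
        # Drop the trailing underscore to get the referenced name.
        return ("alias", stripped[:-1])
    return ("uri", stripped)


if __name__ == "__main__":
    for sample in ("1_", "http://example.com", "http://example.com/a_"):
        print(sample, "->", classify_embedded_target(sample))
```

Under these assumptions, `<1_>` is treated as an alias for the target named `1`, while a URI that merely happens to end with an underscore keeps its old meaning.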

All changes are covered by regression tests, which are run during build and on jenkins.ubuntu.com. I've verified that both Docutils and Sphinx test suites succeed, and that python3.3 docs build correctly.

Preview Diff

1=== modified file '.pc/applied-patches'
2--- .pc/applied-patches 2013-03-06 14:34:02 +0000
3+++ .pc/applied-patches 2013-03-19 07:30:29 +0000
4@@ -8,3 +8,4 @@
5 test-sys-path.diff
6 move-data-to-usr-share.diff
7 disable_py33_failing_tests.diff
8+support-aliases-in-references.diff
9
10=== added directory '.pc/support-aliases-in-references.diff'
11=== added directory '.pc/support-aliases-in-references.diff/docs'
12=== added directory '.pc/support-aliases-in-references.diff/docs/ref'
13=== added directory '.pc/support-aliases-in-references.diff/docs/ref/rst'
14=== added file '.pc/support-aliases-in-references.diff/docs/ref/rst/restructuredtext.txt'
15--- .pc/support-aliases-in-references.diff/docs/ref/rst/restructuredtext.txt 1970-01-01 00:00:00 +0000
16+++ .pc/support-aliases-in-references.diff/docs/ref/rst/restructuredtext.txt 2013-03-19 07:30:29 +0000
17@@ -0,0 +1,2979 @@
18+.. -*- coding: utf-8 -*-
19+
20+=======================================
21+ reStructuredText Markup Specification
22+=======================================
23+
24+:Author: David Goodger
25+:Contact: docutils-develop@lists.sourceforge.net
26+:Revision: $Revision: 7302 $
27+:Date: $Date: 2012-01-03 20:23:53 +0100 (Di, 03. Jan 2012) $
28+:Copyright: This document has been placed in the public domain.
29+
30+.. Note::
31+
32+ This document is a detailed technical specification; it is not a
33+ tutorial or a primer. If this is your first exposure to
34+ reStructuredText, please read `A ReStructuredText Primer`_ and the
35+ `Quick reStructuredText`_ user reference first.
36+
37+.. _A ReStructuredText Primer: ../../user/rst/quickstart.html
38+.. _Quick reStructuredText: ../../user/rst/quickref.html
39+
40+
41+reStructuredText_ is plaintext that uses simple and intuitive
42+constructs to indicate the structure of a document. These constructs
43+are equally easy to read in raw and processed forms. This document is
44+itself an example of reStructuredText (raw, if you are reading the
45+text file, or processed, if you are reading an HTML document, for
46+example). The reStructuredText parser is a component of Docutils_.
47+
48+Simple, implicit markup is used to indicate special constructs, such
49+as section headings, bullet lists, and emphasis. The markup used is
50+as minimal and unobtrusive as possible. Less often-used constructs
51+and extensions to the basic reStructuredText syntax may have more
52+elaborate or explicit markup.
53+
54+reStructuredText is applicable to documents of any length, from the
55+very small (such as inline program documentation fragments, e.g.
56+Python docstrings) to the quite large (this document).
57+
58+The first section gives a quick overview of the syntax of the
59+reStructuredText markup by example. A complete specification is given
60+in the `Syntax Details`_ section.
61+
62+`Literal blocks`_ (in which no markup processing is done) are used for
63+examples throughout this document, to illustrate the plaintext markup.
64+
65+
66+.. contents::
67+
68+
69+-----------------------
70+ Quick Syntax Overview
71+-----------------------
72+
73+A reStructuredText document is made up of body or block-level
74+elements, and may be structured into sections. Sections_ are
75+indicated through title style (underlines & optional overlines).
76+Sections contain body elements and/or subsections. Some body elements
77+contain further elements, such as lists containing list items, which
78+in turn may contain paragraphs and other body elements. Others, such
79+as paragraphs, contain text and `inline markup`_ elements.
80+
81+Here are examples of `body elements`_:
82+
83+- Paragraphs_ (and `inline markup`_)::
84+
85+ Paragraphs contain text and may contain inline markup:
86+ *emphasis*, **strong emphasis**, `interpreted text`, ``inline
87+ literals``, standalone hyperlinks (http://www.python.org),
88+ external hyperlinks (Python_), internal cross-references
89+ (example_), footnote references ([1]_), citation references
90+ ([CIT2002]_), substitution references (|example|), and _`inline
91+ internal targets`.
92+
93+ Paragraphs are separated by blank lines and are left-aligned.
94+
95+- Five types of lists:
96+
97+ 1. `Bullet lists`_::
98+
99+ - This is a bullet list.
100+
101+ - Bullets can be "*", "+", or "-".
102+
103+ 2. `Enumerated lists`_::
104+
105+ 1. This is an enumerated list.
106+
107+ 2. Enumerators may be arabic numbers, letters, or roman
108+ numerals.
109+
110+ 3. `Definition lists`_::
111+
112+ what
113+ Definition lists associate a term with a definition.
114+
115+ how
116+ The term is a one-line phrase, and the definition is one
117+ or more paragraphs or body elements, indented relative to
118+ the term.
119+
120+ 4. `Field lists`_::
121+
122+ :what: Field lists map field names to field bodies, like
123+ database records. They are often part of an extension
124+ syntax.
125+
126+ :how: The field marker is a colon, the field name, and a
127+ colon.
128+
129+ The field body may contain one or more body elements,
130+ indented relative to the field marker.
131+
132+ 5. `Option lists`_, for listing command-line options::
133+
134+ -a command-line option "a"
135+ -b file options can have arguments
136+ and long descriptions
137+ --long options can be long also
138+ --input=file long options can also have
139+ arguments
140+ /V DOS/VMS-style options too
141+
142+ There must be at least two spaces between the option and the
143+ description.
144+
145+- `Literal blocks`_::
146+
147+ Literal blocks are either indented or line-prefix-quoted blocks,
148+ and indicated with a double-colon ("::") at the end of the
149+ preceding paragraph (right here -->)::
150+
151+ if literal_block:
152+ text = 'is left as-is'
153+ spaces_and_linebreaks = 'are preserved'
154+ markup_processing = None
155+
156+- `Block quotes`_::
157+
158+ Block quotes consist of indented body elements:
159+
160+ This theory, that is mine, is mine.
161+
162+ -- Anne Elk (Miss)
163+
164+- `Doctest blocks`_::
165+
166+ >>> print 'Python-specific usage examples; begun with ">>>"'
167+ Python-specific usage examples; begun with ">>>"
168+ >>> print '(cut and pasted from interactive Python sessions)'
169+ (cut and pasted from interactive Python sessions)
170+
171+- Two syntaxes for tables_:
172+
173+ 1. `Grid tables`_; complete, but complex and verbose::
174+
175+ +------------------------+------------+----------+
176+ | Header row, column 1 | Header 2 | Header 3 |
177+ +========================+============+==========+
178+ | body row 1, column 1 | column 2 | column 3 |
179+ +------------------------+------------+----------+
180+ | body row 2 | Cells may span |
181+ +------------------------+-----------------------+
182+
183+ 2. `Simple tables`_; easy and compact, but limited::
184+
185+ ==================== ========== ==========
186+ Header row, column 1 Header 2 Header 3
187+ ==================== ========== ==========
188+ body row 1, column 1 column 2 column 3
189+ body row 2 Cells may span columns
190+ ==================== ======================
191+
192+- `Explicit markup blocks`_ all begin with an explicit block marker,
193+ two periods and a space:
194+
195+ - Footnotes_::
196+
197+ .. [1] A footnote contains body elements, consistently
198+ indented by at least 3 spaces.
199+
200+ - Citations_::
201+
202+ .. [CIT2002] Just like a footnote, except the label is
203+ textual.
204+
205+ - `Hyperlink targets`_::
206+
207+ .. _Python: http://www.python.org
208+
209+ .. _example:
210+
211+ The "_example" target above points to this paragraph.
212+
213+ - Directives_::
214+
215+ .. image:: mylogo.png
216+
217+ - `Substitution definitions`_::
218+
219+ .. |symbol here| image:: symbol.png
220+
221+ - Comments_::
222+
223+ .. Comments begin with two dots and a space. Anything may
224+ follow, except for the syntax of footnotes/citations,
225+ hyperlink targets, directives, or substitution definitions.
226+
227+
228+----------------
229+ Syntax Details
230+----------------
231+
232+Descriptions below list "doctree elements" (document tree element
233+names; XML DTD generic identifiers) corresponding to syntax
234+constructs. For details on the hierarchy of elements, please see `The
235+Docutils Document Tree`_ and the `Docutils Generic DTD`_ XML document
236+type definition.
237+
238+
239+Whitespace
240+==========
241+
242+Spaces are recommended for indentation_, but tabs may also be used.
243+Tabs will be converted to spaces. Tab stops are at every 8th column.
244+
245+Other whitespace characters (form feeds [chr(12)] and vertical tabs
246+[chr(11)]) are converted to single spaces before processing.
247+
248+
249+Blank Lines
250+-----------
251+
252+Blank lines are used to separate paragraphs and other elements.
253+Multiple successive blank lines are equivalent to a single blank line,
254+except within literal blocks (where all whitespace is preserved).
255+Blank lines may be omitted when the markup makes element separation
256+unambiguous, in conjunction with indentation. The first line of a
257+document is treated as if it is preceded by a blank line, and the last
258+line of a document is treated as if it is followed by a blank line.
259+
260+
261+Indentation
262+-----------
263+
264+Indentation is used to indicate -- and is only significant in
265+indicating -- block quotes, definitions (in definition list items),
266+and local nested content:
267+
268+- list item content (multi-line contents of list items, and multiple
269+ body elements within a list item, including nested lists),
270+- the content of literal blocks, and
271+- the content of explicit markup blocks.
272+
273+Any text whose indentation is less than that of the current level
274+(i.e., unindented text or "dedents") ends the current level of
275+indentation.
276+
277+Since all indentation is significant, the level of indentation must be
278+consistent. For example, indentation is the sole markup indicator for
279+`block quotes`_::
280+
281+ This is a top-level paragraph.
282+
283+ This paragraph belongs to a first-level block quote.
284+
285+ Paragraph 2 of the first-level block quote.
286+
287+Multiple levels of indentation within a block quote will result in
288+more complex structures::
289+
290+ This is a top-level paragraph.
291+
292+ This paragraph belongs to a first-level block quote.
293+
294+ This paragraph belongs to a second-level block quote.
295+
296+ Another top-level paragraph.
297+
298+ This paragraph belongs to a second-level block quote.
299+
300+ This paragraph belongs to a first-level block quote. The
301+ second-level block quote above is inside this first-level
302+ block quote.
303+
304+When a paragraph or other construct consists of more than one line of
305+text, the lines must be left-aligned::
306+
307+ This is a paragraph. The lines of
308+ this paragraph are aligned at the left.
309+
310+ This paragraph has problems. The
311+ lines are not left-aligned. In addition
312+ to potential misinterpretation, warning
313+ and/or error messages will be generated
314+ by the parser.
315+
316+Several constructs begin with a marker, and the body of the construct
317+must be indented relative to the marker. For constructs using simple
318+markers (`bullet lists`_, `enumerated lists`_, footnotes_, citations_,
319+`hyperlink targets`_, directives_, and comments_), the level of
320+indentation of the body is determined by the position of the first
321+line of text, which begins on the same line as the marker. For
322+example, bullet list bodies must be indented by at least two columns
323+relative to the left edge of the bullet::
324+
325+ - This is the first line of a bullet list
326+ item's paragraph. All lines must align
327+ relative to the first line. [1]_
328+
329+ This indented paragraph is interpreted
330+ as a block quote.
331+
332+ Because it is not sufficiently indented,
333+ this paragraph does not belong to the list
334+ item.
335+
336+ .. [1] Here's a footnote. The second line is aligned
337+ with the beginning of the footnote label. The ".."
338+ marker is what determines the indentation.
339+
340+For constructs using complex markers (`field lists`_ and `option
341+lists`_), where the marker may contain arbitrary text, the indentation
342+of the first line *after* the marker determines the left edge of the
343+body. For example, field lists may have very long markers (containing
344+the field names)::
345+
346+ :Hello: This field has a short field name, so aligning the field
347+ body with the first line is feasible.
348+
349+ :Number-of-African-swallows-required-to-carry-a-coconut: It would
350+ be very difficult to align the field body with the left edge
351+ of the first line. It may even be preferable not to begin the
352+ body on the same line as the marker.
353+
354+
355+Escaping Mechanism
356+==================
357+
358+The character set universally available to plaintext documents, 7-bit
359+ASCII, is limited. No matter what characters are used for markup,
360+they will already have multiple meanings in written text. Therefore
361+markup characters *will* sometimes appear in text **without being
362+intended as markup**. Any serious markup system requires an escaping
363+mechanism to override the default meaning of the characters used for
364+the markup. In reStructuredText we use the backslash, commonly used
365+as an escaping character in other domains.
366+
367+A backslash followed by any character (except whitespace characters)
368+escapes that character. The escaped character represents the
369+character itself, and is prevented from playing a role in any markup
370+interpretation. The backslash is removed from the output. A literal
371+backslash is represented by two backslashes in a row (the first
372+backslash "escapes" the second, preventing it being interpreted in an
373+"escaping" role).
374+
375+Backslash-escaped whitespace characters are removed from the document.
376+This allows for character-level `inline markup`_.
377+
378+There are two contexts in which backslashes have no special meaning:
379+literal blocks and inline literals. In these contexts, a single
380+backslash represents a literal backslash, without having to double up.
381+
382+Please note that the reStructuredText specification and parser do not
383+address the issue of the representation or extraction of text input
384+(how and in what form the text actually *reaches* the parser).
385+Backslashes and other characters may serve a character-escaping
386+purpose in certain contexts and must be dealt with appropriately. For
387+example, Python uses backslashes in strings to escape certain
388+characters, but not others. The simplest solution when backslashes
389+appear in Python docstrings is to use raw docstrings::
390+
391+ r"""This is a raw docstring. Backslashes (\) are not touched."""
392+
393+
394+Reference Names
395+===============
396+
397+Simple reference names are single words consisting of alphanumerics
398+plus isolated (no two adjacent) internal hyphens, underscores,
399+periods, colons and plus signs; no whitespace or other characters are
400+allowed. Footnote labels (Footnotes_ & `Footnote References`_), citation
401+labels (Citations_ & `Citation References`_), `interpreted text`_ roles,
402+and some `hyperlink references`_ use the simple reference name syntax.
403+
404+Reference names using punctuation or whose names are phrases (two or
405+more space-separated words) are called "phrase-references".
406+Phrase-references are expressed by enclosing the phrase in backquotes
407+and treating the backquoted text as a reference name::
408+
409+ Want to learn about `my favorite programming language`_?
410+
411+ .. _my favorite programming language: http://www.python.org
412+
413+Simple reference names may also optionally use backquotes.
414+
415+Reference names are whitespace-neutral and case-insensitive. When
416+resolving reference names internally:
417+
418+- whitespace is normalized (one or more spaces, horizontal or vertical
419+ tabs, newlines, carriage returns, or form feeds, are interpreted as
420+ a single space), and
421+
422+- case is normalized (all alphabetic characters are converted to
423+ lowercase).
424+
425+For example, the following `hyperlink references`_ are equivalent::
426+
427+ - `A HYPERLINK`_
428+ - `a hyperlink`_
429+ - `A
430+ Hyperlink`_
431+
432+Hyperlinks_, footnotes_, and citations_ all share the same namespace
433+for reference names. The labels of citations (simple reference names)
434+and manually-numbered footnotes (numbers) are entered into the same
435+database as other hyperlink names. This means that a footnote
436+(defined as "``.. [1]``") which can be referred to by a footnote
437+reference (``[1]_``), can also be referred to by a plain hyperlink
438+reference (1_). Of course, each type of reference (hyperlink,
439+footnote, citation) may be processed and rendered differently. Some
440+care should be taken to avoid reference name conflicts.
441+
442+
443+Document Structure
444+==================
445+
446+Document
447+--------
448+
449+Doctree element: document.
450+
451+The top-level element of a parsed reStructuredText document is the
452+"document" element. After initial parsing, the document element is a
453+simple container for a document fragment, consisting of `body
454+elements`_, transitions_, and sections_, but lacking a document title
455+or other bibliographic elements. The code that calls the parser may
456+choose to run one or more optional post-parse transforms_,
457+rearranging the document fragment into a complete document with a
458+title and possibly other metadata elements (author, date, etc.; see
459+`Bibliographic Fields`_).
460+
461+Specifically, there is no way to indicate a document title and
462+subtitle explicitly in reStructuredText. Instead, a lone top-level
463+section title (see Sections_ below) can be treated as the document
464+title. Similarly, a lone second-level section title immediately after
465+the "document title" can become the document subtitle. The rest of
466+the sections are then lifted up a level or two. See the `DocTitle
467+transform`_ for details.
468+
469+
470+Sections
471+--------
472+
473+Doctree elements: section, title.
474+
475+Sections are identified through their titles, which are marked up with
476+adornment: "underlines" below the title text, or underlines and
477+matching "overlines" above the title. An underline/overline is a
478+single repeated punctuation character that begins in column 1 and
479+forms a line extending at least as far as the right edge of the title
480+text. Specifically, an underline/overline character may be any
481+non-alphanumeric printable 7-bit ASCII character [#]_. When an
482+overline is used, the length and character used must match the
483+underline. Underline-only adornment styles are distinct from
484+overline-and-underline styles that use the same character. There may
485+be any number of levels of section titles, although some output
486+formats may have limits (HTML has 6 levels).
487+
488+.. [#] The following are all valid section title adornment
489+ characters::
490+
491+ ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
492+
493+ Some characters are more suitable than others. The following are
494+ recommended::
495+
496+ = - ` : . ' " ~ ^ _ * + #
497+
498+Rather than imposing a fixed number and order of section title
499+adornment styles, the order enforced will be the order as encountered.
500+The first style encountered will be an outermost title (like HTML H1),
501+the second style will be a subtitle, the third will be a subsubtitle,
502+and so on.
503+
504+Below are examples of section title styles::
505+
506+ ===============
507+ Section Title
508+ ===============
509+
510+ ---------------
511+ Section Title
512+ ---------------
513+
514+ Section Title
515+ =============
516+
517+ Section Title
518+ -------------
519+
520+ Section Title
521+ `````````````
522+
523+ Section Title
524+ '''''''''''''
525+
526+ Section Title
527+ .............
528+
529+ Section Title
530+ ~~~~~~~~~~~~~
531+
532+ Section Title
533+ *************
534+
535+ Section Title
536+ +++++++++++++
537+
538+ Section Title
539+ ^^^^^^^^^^^^^
540+
541+When a title has both an underline and an overline, the title text may
542+be inset, as in the first two examples above. This is merely
543+aesthetic and not significant. Underline-only title text may *not* be
544+inset.
545+
546+A blank line after a title is optional. All text blocks up to the
547+next title of the same or higher level are included in a section (or
548+subsection, etc.).
549+
550+All section title styles need not be used, nor need any specific
551+section title style be used. However, a document must be consistent
552+in its use of section titles: once a hierarchy of title styles is
553+established, sections must use that hierarchy.
554+
555+Each section title automatically generates a hyperlink target pointing
556+to the section. The text of the hyperlink target (the "reference
557+name") is the same as that of the section title. See `Implicit
558+Hyperlink Targets`_ for a complete description.
559+
560+Sections may contain `body elements`_, transitions_, and nested
561+sections.
562+
563+
564+Transitions
565+-----------
566+
567+Doctree element: transition.
568+
569+ Instead of subheads, extra space or a type ornament between
570+ paragraphs may be used to mark text divisions or to signal
571+ changes in subject or emphasis.
572+
573+ (The Chicago Manual of Style, 14th edition, section 1.80)
574+
575+Transitions are commonly seen in novels and short fiction, as a gap
576+spanning one or more lines, with or without a type ornament such as a
577+row of asterisks. Transitions separate other body elements. A
578+transition should not begin or end a section or document, nor should
579+two transitions be immediately adjacent.
580+
581+The syntax for a transition marker is a horizontal line of 4 or more
582+repeated punctuation characters. The syntax is the same as section
583+title underlines without title text. Transition markers require blank
584+lines before and after::
585+
586+ Para.
587+
588+ ----------
589+
590+ Para.
591+
592+Unlike section title underlines, no hierarchy of transition markers is
593+enforced, nor do differences in transition markers accomplish
594+anything. It is recommended that a single consistent style be used.
595+
596+The processing system is free to render transitions in output in any
597+way it likes. For example, horizontal rules (``<hr>``) in HTML output
598+would be an obvious choice.
599+
600+
601+Body Elements
602+=============
603+
604+Paragraphs
605+----------
606+
607+Doctree element: paragraph.
608+
609+Paragraphs consist of blocks of left-aligned text with no markup
610+indicating any other body element. Blank lines separate paragraphs
611+from each other and from other body elements. Paragraphs may contain
612+`inline markup`_.
613+
614+Syntax diagram::
615+
616+ +------------------------------+
617+ | paragraph |
618+ | |
619+ +------------------------------+
620+
621+ +------------------------------+
622+ | paragraph |
623+ | |
624+ +------------------------------+
625+
626+
627+Bullet Lists
628+------------
629+
630+Doctree elements: bullet_list, list_item.
631+
632+A text block which begins with a "*", "+", "-", "•", "‣", or "⁃",
633+followed by whitespace, is a bullet list item (a.k.a. "unordered" list
634+item). List item bodies must be left-aligned and indented relative to
635+the bullet; the text immediately after the bullet determines the
636+indentation. For example::
637+
638+ - This is the first bullet list item. The blank line above the
639+ first list item is required; blank lines between list items
640+ (such as below this paragraph) are optional.
641+
642+ - This is the first paragraph in the second item in the list.
643+
644+ This is the second paragraph in the second item in the list.
645+ The blank line above this paragraph is required. The left edge
646+ of this paragraph lines up with the paragraph above, both
647+ indented relative to the bullet.
648+
649+ - This is a sublist. The bullet lines up with the left edge of
650+ the text blocks above. A sublist is a new list so requires a
651+ blank line above and below.
652+
653+ - This is the third item of the main list.
654+
655+ This paragraph is not part of the list.
656+
657+Here are examples of **incorrectly** formatted bullet lists::
658+
659+ - This first line is fine.
660+ A blank line is required between list items and paragraphs.
661+ (Warning)
662+
663+ - The following line appears to be a new sublist, but it is not:
664+ - This is a paragraph continuation, not a sublist (since there's
665+ no blank line). This line is also incorrectly indented.
666+ - Warnings may be issued by the implementation.
667+
668+Syntax diagram::
669+
670+ +------+-----------------------+
671+ | "- " | list item |
672+ +------| (body elements)+ |
673+ +-----------------------+
674+
675+
676+Enumerated Lists
677+----------------
678+
679+Doctree elements: enumerated_list, list_item.
680+
681+Enumerated lists (a.k.a. "ordered" lists) are similar to bullet lists,
682+but use enumerators instead of bullets. An enumerator consists of an
683+enumeration sequence member and formatting, followed by whitespace.
684+The following enumeration sequences are recognized:
685+
686+- arabic numerals: 1, 2, 3, ... (no upper limit).
687+- uppercase alphabet characters: A, B, C, ..., Z.
688+- lower-case alphabet characters: a, b, c, ..., z.
689+- uppercase Roman numerals: I, II, III, IV, ..., MMMMCMXCIX (4999).
690+- lowercase Roman numerals: i, ii, iii, iv, ..., mmmmcmxcix (4999).
691+
692+In addition, the auto-enumerator, "#", may be used to automatically
693+enumerate a list. Auto-enumerated lists may begin with explicit
694+enumeration, which sets the sequence. Fully auto-enumerated lists use
695+arabic numerals and begin with 1. (Auto-enumerated lists are new in
696+Docutils 0.3.8.)
697+
698+The following formatting types are recognized:
699+
700+- suffixed with a period: "1.", "A.", "a.", "I.", "i.".
701+- surrounded by parentheses: "(1)", "(A)", "(a)", "(I)", "(i)".
702+- suffixed with a right-parenthesis: "1)", "A)", "a)", "I)", "i)".
703+
704+While parsing an enumerated list, a new list will be started whenever:
705+
706+- An enumerator is encountered which does not have the same format and
707+ sequence type as the current list (e.g. "1.", "(a)" produces two
708+ separate lists).
709+
710+- The enumerators are not in sequence (e.g., "1.", "3." produces two
711+ separate lists).
712+
713+It is recommended that the enumerator of the first list item be
714+ordinal-1 ("1", "A", "a", "I", or "i"). Although other start-values
715+will be recognized, they may not be supported by the output format. A
716+level-1 [info] system message will be generated for any list beginning
717+with a non-ordinal-1 enumerator.
718+
719+Lists using Roman numerals must begin with "I"/"i" or a
720+multi-character value, such as "II" or "XV". Any other
721+single-character Roman numeral ("V", "X", "L", "C", "D", "M") will be
722+interpreted as a letter of the alphabet, not as a Roman numeral.
723+Likewise, lists using letters of the alphabet may not begin with
724+"I"/"i", since these are recognized as Roman numeral 1.
725+
726+The second line of each enumerated list item is checked for validity.
727+This is to prevent ordinary paragraphs from being mistakenly
728+interpreted as list items, when they happen to begin with text
729+identical to enumerators. For example, this text is parsed as an
730+ordinary paragraph::
731+
732+ A. Einstein was a really
733+ smart dude.
734+
735+However, ambiguity cannot be avoided if the paragraph consists of only
736+one line. This text is parsed as an enumerated list item::
737+
738+ A. Einstein was a really smart dude.
739+
740+If a single-line paragraph begins with text identical to an enumerator
741+("A.", "1.", "(b)", "I)", etc.), the first character will have to be
742+escaped in order to have the line parsed as an ordinary paragraph::
743+
744+ \A. Einstein was a really smart dude.
745+
746+Examples of nested enumerated lists::
747+
748+ 1. Item 1 initial text.
749+
750+ a) Item 1a.
751+ b) Item 1b.
752+
753+ 2. a) Item 2a.
754+ b) Item 2b.
755+
756+Example syntax diagram::
757+
758+ +-------+----------------------+
759+ | "1. " | list item |
760+ +-------| (body elements)+ |
761+ +----------------------+
762+
763+
764+Definition Lists
765+----------------
766+
767+Doctree elements: definition_list, definition_list_item, term,
768+classifier, definition.
769+
770+Each definition list item contains a term, optional classifiers, and a
771+definition. A term is a simple one-line word or phrase. Optional
772+classifiers may follow the term on the same line, each after an inline
773+" : " (space, colon, space). A definition is a block indented
774+relative to the term, and may contain multiple paragraphs and other
775+body elements. There may be no blank line between a term line and a
776+definition block (this distinguishes definition lists from `block
777+quotes`_). Blank lines are required before the first and after the
778+last definition list item, but are optional in-between. For example::
779+
780+    term 1
781+        Definition 1.
782+
783+    term 2
784+        Definition 2, paragraph 1.
785+
786+        Definition 2, paragraph 2.
787+
788+    term 3 : classifier
789+        Definition 3.
790+
791+    term 4 : classifier one : classifier two
792+        Definition 4.
793+
794+Inline markup is parsed in the term line before the classifier
795+delimiter (" : ") is recognized. The delimiter will only be
796+recognized if it appears outside of any inline markup.
797+
798+A definition list may be used in various ways, including:
799+
800+- As a dictionary or glossary. The term is the word itself, a
801+ classifier may be used to indicate the usage of the term (noun,
802+ verb, etc.), and the definition follows.
803+
804+- To describe program variables. The term is the variable name, a
805+ classifier may be used to indicate the type of the variable (string,
806+ integer, etc.), and the definition describes the variable's use in
807+ the program. This usage of definition lists supports the classifier
808+ syntax of Grouch_, a system for describing and enforcing a Python
809+ object schema.
810+
811+Syntax diagram::
812+
813+    +----------------------------+
814+    | term [ " : " classifier ]* |
815+    +--+-------------------------+--+
816+       | definition                 |
817+       | (body elements)+           |
818+       +----------------------------+
819+
820+
821+Field Lists
822+-----------
823+
824+Doctree elements: field_list, field, field_name, field_body.
825+
826+Field lists are used as part of an extension syntax, such as options
827+for directives_, or database-like records meant for further
828+processing. They may also be used for two-column table-like
829+structures resembling database records (label & data pairs).
830+Applications of reStructuredText may recognize field names and
831+transform fields or field bodies in certain contexts. For examples,
832+see `Bibliographic Fields`_ below, or the "image_" and "meta_"
833+directives in `reStructuredText Directives`_.
834+
835+Field lists are mappings from field names to field bodies, modeled on
836+RFC822_ headers. A field name may consist of any characters, but
837+colons (":") inside of field names must be escaped with a backslash.
838+Inline markup is parsed in field names. Field names are
839+case-insensitive when further processed or transformed. The field
840+name, along with a single colon prefix and suffix, together form the
841+field marker. The field marker is followed by whitespace and the
842+field body. The field body may contain multiple body elements,
843+indented relative to the field marker. The first line after the field
844+name marker determines the indentation of the field body. For
845+example::
846+
847+    :Date: 2001-08-16
848+    :Version: 1
849+    :Authors: - Me
850+              - Myself
851+              - I
852+    :Indentation: Since the field marker may be quite long, the second
853+       and subsequent lines of the field body do not have to line up
854+       with the first line, but they must be indented relative to the
855+       field name marker, and they must line up with each other.
856+    :Parameter i: integer
857+
858+The interpretation of individual words in a multi-word field name is
859+up to the application. The application may specify a syntax for the
860+field name. For example, second and subsequent words may be treated
861+as "arguments", quoted phrases may be treated as a single argument,
862+and direct support for the "name=value" syntax may be added.
863+
864+Standard RFC822_ headers cannot be used for this construct because
865+they are ambiguous. A word followed by a colon at the beginning of a
866+line is common in written text. However, in well-defined contexts
867+such as when a field list invariably occurs at the beginning of a
868+document (PEPs and email messages), standard RFC822 headers could be
869+used.
870+
871+Syntax diagram (simplified)::
872+
873+    +--------------------+----------------------+
874+    | ":" field name ":" | field body           |
875+    +-------+------------+                      |
876+            | (body elements)+                  |
877+            +-----------------------------------+
878+
879+
880+Bibliographic Fields
881+````````````````````
882+
883+Doctree elements: docinfo, author, authors, organization, contact,
884+version, status, date, copyright, field, topic.
885+
886+When a field list is the first non-comment element in a document
887+(after the document title, if there is one), it may have its fields
888+transformed to document bibliographic data. This bibliographic data
889+corresponds to the front matter of a book, such as the title page and
890+copyright page.
891+
892+Certain registered field names (listed below) are recognized and
893+transformed to the corresponding doctree elements, most becoming child
894+elements of the "docinfo" element. No ordering is required of these
895+fields, although they may be rearranged to fit the document structure,
896+as noted. Unless otherwise indicated below, each of the bibliographic
897+elements' field bodies may contain a single paragraph only. Field
898+bodies may be checked for `RCS keywords`_ and cleaned up. Any
899+unrecognized fields will remain as generic fields in the docinfo
900+element.
901+
902+The registered bibliographic field names and their corresponding
903+doctree elements are as follows:
904+
905+- Field name "Author": author element.
906+- "Authors": authors.
907+- "Organization": organization.
908+- "Contact": contact.
909+- "Address": address.
910+- "Version": version.
911+- "Status": status.
912+- "Date": date.
913+- "Copyright": copyright.
914+- "Dedication": topic.
915+- "Abstract": topic.
916+
917+The "Authors" field may contain either: a single paragraph consisting
918+of a list of authors, separated by ";" or ","; or a bullet list whose
919+elements each contain a single paragraph per author. ";" is checked
920+first, so "Doe, Jane; Doe, John" will work. In some languages
921+(e.g. Swedish), there is no singular/plural distinction between
922+"Author" and "Authors", so only an "Authors" field is provided, and a
923+single name is interpreted as an "Author". If a single name contains
924+a comma, end it with a semicolon to disambiguate: ":Authors: Doe,
925+Jane;".
926+
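The separator rule for the "Authors" field can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not the Docutils implementation; the function name `split_authors` is hypothetical.

```python
def split_authors(field_body):
    """Split the text of an "Authors" paragraph into individual names.

    ";" is tried before ",", so names written as "Last, First" survive
    as long as they are separated (or terminated) by semicolons.
    """
    for separator in (";", ","):
        if separator in field_body:
            names = [name.strip() for name in field_body.split(separator)]
            # A trailing separator (as in "Doe, Jane;") leaves an empty
            # string behind; drop it.
            return [name for name in names if name]
    return [field_body.strip()]


print(split_authors("Doe, Jane; Doe, John"))
print(split_authors("Doe, Jane;"))
print(split_authors("Jane Doe, John Doe"))
```

Trying ";" first is what makes the trailing-semicolon trick work: "Doe, Jane;" contains a semicolon, so the comma is never used as a separator.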
927+The "Address" field is for a multi-line surface mailing address.
928+Newlines and whitespace will be preserved.
929+
930+The "Dedication" and "Abstract" fields may contain arbitrary body
931+elements. Only one of each is allowed. They become topic elements
932+with "Dedication" or "Abstract" titles (or language equivalents)
933+immediately following the docinfo element.
934+
935+This field-name-to-element mapping can be replaced for other
936+languages. See the `DocInfo transform`_ implementation documentation
937+for details.
938+
939+Unregistered/generic fields may contain one or more paragraphs or
940+arbitrary body elements.
941+
942+
943+RCS Keywords
944+````````````
945+
946+`Bibliographic fields`_ recognized by the parser are normally checked
947+for RCS [#]_ keywords and cleaned up [#]_. RCS keywords may be
948+entered into source files as "$keyword$", and once stored under RCS or
949+CVS [#]_, they are expanded to "$keyword: expansion text $". For
950+example, a "Status" field will be transformed to a "status" element::
951+
952+ :Status: $keyword: expansion text $
953+
954+.. [#] Revision Control System.
955+.. [#] RCS keyword processing can be turned off (unimplemented).
956+.. [#] Concurrent Versions System. CVS uses the same keywords as RCS.
957+
958+Processed, the "status" element's text will become simply "expansion
959+text". The dollar sign delimiters and leading RCS keyword name are
960+removed.
961+
962+The RCS keyword processing only kicks in when the field list is in
963+bibliographic context (first non-comment construct in the document,
964+after a document title if there is one).
965+
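A rough sketch of the cleanup described in this section, using a regular expression. The pattern and the name `clean_rcs_keyword` are assumptions made for illustration; the actual transform in Docutils may differ in detail.

```python
import re

# "$keyword: expansion text $" -> "expansion text"
# Assumption: keyword names consist of word characters only.
RCS_KEYWORD = re.compile(r"^\$\w+:\s*(.*?)\s*\$$")


def clean_rcs_keyword(field_text):
    """Return the expansion text of an expanded RCS keyword.

    Text that does not look like an expanded keyword is returned unchanged.
    """
    match = RCS_KEYWORD.match(field_text.strip())
    return match.group(1) if match else field_text


print(clean_rcs_keyword("$keyword: expansion text $"))
print(clean_rcs_keyword("Draft"))
```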
966+
967+Option Lists
968+------------
969+
970+Doctree elements: option_list, option_list_item, option_group, option,
971+option_string, option_argument, description.
972+
973+Option lists are two-column lists of command-line options and
974+descriptions, documenting a program's options. For example::
975+
976+    -a         Output all.
977+    -b         Output both (this description is
978+               quite long).
979+    -c arg     Output just arg.
980+    --long     Output all day long.
981+
982+    -p         This option has two paragraphs in the description.
983+               This is the first.
984+
985+               This is the second. Blank lines may be omitted between
986+               options (as above) or left in (as here and below).
987+
988+    --very-long-option  A VMS-style option. Note the adjustment for
989+                        the required two spaces.
990+
991+    --an-even-longer-option
992+               The description can also start on the next line.
993+
994+    -2, --two  This option has two variants.
995+
996+    -f FILE, --file=FILE  These two options are synonyms; both have
997+                          arguments.
998+
999+    /V         A VMS/DOS-style option.
1000+
1001+There are several types of options recognized by reStructuredText:
1002+
1003+- Short POSIX options consist of one dash and an option letter.
1004+- Long POSIX options consist of two dashes and an option word; some
1005+ systems use a single dash.
1006+- Old GNU-style "plus" options consist of one plus and an option
1007+  letter ("plus" options are now deprecated and their use is discouraged).
1008+- DOS/VMS options consist of a slash and an option letter or word.
1009+
1010+Please note that both POSIX-style and DOS/VMS-style options may be
1011+used by DOS or Windows software. These and other variations are
1012+sometimes used mixed together. The names above have been chosen for
1013+convenience only.
1014+
1015+The syntax for short and long POSIX options is based on the syntax
1016+supported by Python's getopt.py_ module, which implements an option
1017+parser similar to the `GNU libc getopt_long()`_ function but with some
1018+restrictions. There are many variant option systems, and
1019+reStructuredText option lists do not support all of them.
1020+
1021+Although long POSIX and DOS/VMS option words may be allowed to be
1022+truncated by the operating system or the application when used on the
1023+command line, reStructuredText option lists do not show or support
1024+this with any special syntax. The complete option word should be
1025+given, supported by notes about truncation if and when applicable.
1026+
1027+Options may be followed by an argument placeholder, whose role and
1028+syntax should be explained in the description text. Either a space or
1029+an equals sign may be used as a delimiter between options and option
1030+argument placeholders; short options ("-" or "+" prefix only) may omit
1031+the delimiter. Option arguments may take one of two forms:
1032+
1033+- Begins with a letter (``[a-zA-Z]``) and subsequently consists of
1034+ letters, numbers, underscores and hyphens (``[a-zA-Z0-9_-]``).
1035+- Begins with an open-angle-bracket (``<``) and ends with a
1036+ close-angle-bracket (``>``); any characters except angle brackets
1037+ are allowed internally.
1038+
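The two argument forms listed above translate directly into a regular expression. The sketch below is illustrative only; the name `is_valid_option_argument` is hypothetical, and requiring at least one character between the angle brackets is an assumption the text above does not state.

```python
import re

OPTION_ARGUMENT = re.compile(
    r"[a-zA-Z][a-zA-Z0-9_-]*"   # form 1: a letter, then letters/digits/_/-
    r"|<[^<>]+>"                # form 2: <...> with no inner angle brackets
)


def is_valid_option_argument(text):
    """Check a candidate option-argument placeholder against both forms."""
    return OPTION_ARGUMENT.fullmatch(text) is not None


for candidate in ("FILE", "out_file-2", "<file name>", "9lives", "<a<b>"):
    print(candidate, is_valid_option_argument(candidate))
```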
1039+Multiple option "synonyms" may be listed, sharing a single
1040+description. They must be separated by comma-space.
1041+
1042+There must be at least two spaces between the option(s) and the
1043+description. The description may contain multiple body elements. The
1044+first line after the option marker determines the indentation of the
1045+description. As with other types of lists, blank lines are required
1046+before the first option list item and after the last, but are optional
1047+between option entries.
1048+
1049+Syntax diagram (simplified)::
1050+
1051+    +----------------------------+-------------+
1052+    | option [" " argument] "  " | description |
1053+    +-------+--------------------+             |
1054+            | (body elements)+                 |
1055+            +----------------------------------+
1056+
1057+
1058+Literal Blocks
1059+--------------
1060+
1061+Doctree element: literal_block.
1062+
1063+A paragraph consisting of two colons ("::") signifies that the
1064+following text block(s) comprise a literal block. The literal block
1065+must either be indented or quoted (see below). No markup processing
1066+is done within a literal block. It is left as-is, and is typically
1067+rendered in a monospaced typeface::
1068+
1069+    This is a typical paragraph. An indented literal block follows.
1070+
1071+    ::
1072+
1073+        for a in [5,4,3,2,1]:   # this is program code, shown as-is
1074+            print a
1075+        print "it's..."
1076+        # a literal block continues until the indentation ends
1077+
1078+    This text has returned to the indentation of the first paragraph,
1079+    is outside of the literal block, and is therefore treated as an
1080+    ordinary paragraph.
1081+
1082+The paragraph containing only "::" will be completely removed from the
1083+output; no empty paragraph will remain.
1084+
1085+As a convenience, the "::" is recognized at the end of any paragraph.
1086+If immediately preceded by whitespace, both colons will be removed
1087+from the output (this is the "partially minimized" form). When text
1088+immediately precedes the "::", *one* colon will be removed from the
1089+output, leaving only one colon visible (i.e., "::" will be replaced by
1090+":"; this is the "fully minimized" form).
1091+
1092+In other words, these are all equivalent (please pay attention to the
1093+colons after "Paragraph"):
1094+
1095+1. Expanded form::
1096+
1097+      Paragraph:
1098+
1099+      ::
1100+
1101+          Literal block
1102+
1103+2. Partially minimized form::
1104+
1105+      Paragraph: ::
1106+
1107+          Literal block
1108+
1109+3. Fully minimized form::
1110+
1111+      Paragraph::
1112+
1113+          Literal block
1114+
1115+All whitespace (including line breaks, but excluding minimum
1116+indentation for indented literal blocks) is preserved. Blank lines
1117+are required before and after a literal block, but these blank lines
1118+are not included as part of the literal block.
1119+
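The handling of the "::" marker in the three forms above can be sketched as follows. This is a minimal illustration of the stated rules, not the parser's actual code; `strip_literal_marker` is a hypothetical name.

```python
def strip_literal_marker(paragraph):
    """Return the paragraph text as it would appear in the output.

    Returns None when the paragraph consists only of "::" (expanded
    form), since such a paragraph is removed entirely.
    """
    text = paragraph.strip()
    if not text.endswith("::"):
        return text                  # no literal block is introduced
    if text == "::":
        return None                  # expanded form: paragraph removed
    before = text[:-2]
    if before[-1].isspace():
        return before.rstrip()       # partially minimized: drop both colons
    return before + ":"              # fully minimized: keep one colon


print(strip_literal_marker("::"))
print(strip_literal_marker("Paragraph: ::"))
print(strip_literal_marker("Paragraph::"))
```

The partially and fully minimized forms both produce the text "Paragraph:", which is why the three forms are equivalent in the output.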
1120+
1121+Indented Literal Blocks
1122+```````````````````````
1123+
1124+Indented literal blocks are indicated by indentation relative to the
1125+surrounding text (leading whitespace on each line). The minimum
1126+indentation will be removed from each line of an indented literal
1127+block. The literal block need not be contiguous; blank lines are
1128+allowed between sections of indented text. The literal block ends
1129+with the end of the indentation.
1130+
1131+Syntax diagram::
1132+
1133+    +------------------------------+
1134+    | paragraph                    |
1135+    | (ends with "::")             |
1136+    +------------------------------+
1137+       +---------------------------+
1138+       | indented literal block    |
1139+       +---------------------------+
1140+
1141+
1142+Quoted Literal Blocks
1143+`````````````````````
1144+
1145+Quoted literal blocks are unindented contiguous blocks of text where
1146+each line begins with the same non-alphanumeric printable 7-bit ASCII
1147+character [#]_. A blank line ends a quoted literal block. The
1148+quoting characters are preserved in the processed document.
1149+
1150+.. [#]
1151+ The following are all valid quoting characters::
1152+
1153+ ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
1154+
1155+ Note that these are the same characters as are valid for title
1156+ adornment of sections_.
1157+
1158+Possible uses include literate programming in Haskell and email
1159+quoting::
1160+
1161+ John Doe wrote::
1162+
1163+ >> Great idea!
1164+ >
1165+ > Why didn't I think of that?
1166+
1167+ You just did! ;-)
1168+
1169+Syntax diagram::
1170+
1171+    +------------------------------+
1172+    | paragraph                    |
1173+    | (ends with "::")             |
1174+    +------------------------------+
1175+    +------------------------------+
1176+    | ">" per-line-quoted          |
1177+    | ">" contiguous literal block |
1178+    +------------------------------+
1179+
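The set of valid quoting characters happens to coincide with Python's `string.punctuation`, which makes the rule easy to check. The sketch below only illustrates the rule for quoted literal blocks; `is_quoted_literal_block` is a hypothetical helper, not part of Docutils.

```python
import string

# The 32 non-alphanumeric printable 7-bit ASCII characters.
QUOTE_CHARS = set(string.punctuation)


def is_quoted_literal_block(lines):
    """Check that every line starts with the same valid quoting character.

    `lines` is the contiguous, unindented block following the "::"
    paragraph; a blank line would already have ended the block.
    """
    if not lines or any(not line for line in lines):
        return False
    quote = lines[0][0]
    return quote in QUOTE_CHARS and all(line[0] == quote for line in lines)


print(is_quoted_literal_block([">> Great idea!", ">", "> Why didn't I think of that?"]))
print(is_quoted_literal_block(["> quoted", "# different character"]))
```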
1180+
1181+Line Blocks
1182+-----------
1183+
1184+Doctree elements: line_block, line. (New in Docutils 0.3.5.)
1185+
1186+Line blocks are useful for address blocks, verse (poetry, song
1187+lyrics), and unadorned lists, where the structure of lines is
1188+significant. Line blocks are groups of lines beginning with vertical
1189+bar ("|") prefixes. Each vertical bar prefix indicates a new line, so
1190+line breaks are preserved. Initial indents are also significant,
1191+resulting in a nested structure. Inline markup is supported.
1192+Continuation lines are wrapped portions of long lines; they begin with
1193+a space in place of the vertical bar. The left edge of a continuation
1194+line must be indented, but need not be aligned with the left edge of
1195+the text above it. A line block ends with a blank line.
1196+
1197+This example illustrates continuation lines::
1198+
1199+    | Lend us a couple of bob till Thursday.
1200+    | I'm absolutely skint.
1201+    | But I'm expecting a postal order and I can pay you back
1202+      as soon as it comes.
1203+    | Love, Ewan.
1204+
1205+This example illustrates the nesting of line blocks, indicated by the
1206+initial indentation of new lines::
1207+
1208+    Take it away, Eric the Orchestra Leader!
1209+
1210+        | A one, two, a one two three four
1211+        |
1212+        | Half a bee, philosophically,
1213+        |     must, *ipso facto*, half not be.
1214+        | But half the bee has got to be,
1215+        |     *vis a vis* its entity. D'you see?
1216+        |
1217+        | But can a bee be said to be
1218+        |     or not to be an entire bee,
1219+        |         when half the bee is not a bee,
1220+        |             due to some ancient injury?
1221+        |
1222+        | Singing...
1223+
1224+Syntax diagram::
1225+
1226+    +------+-----------------------+
1227+    | "| " | line                  |
1228+    +------| continuation line     |
1229+           +-----------------------+
1230+
1231+
1232+Block Quotes
1233+------------
1234+
1235+Doctree elements: block_quote, attribution.
1236+
1237+A text block that is indented relative to the preceding text, without
1238+preceding markup indicating it to be a literal block or other content,
1239+is a block quote. All markup processing (for body elements and inline
1240+markup) continues within the block quote::
1241+
1242+    This is an ordinary paragraph, introducing a block quote.
1243+
1244+        "It is my business to know things. That is my trade."
1245+
1246+        -- Sherlock Holmes
1247+
1248+A block quote may end with an attribution: a text block beginning with
1249+"--", "---", or a true em-dash, flush left within the block quote. If
1250+the attribution consists of multiple lines, the left edges of the
1251+second and subsequent lines must align.
1252+
1253+Multiple block quotes may occur consecutively if terminated with
1254+attributions.
1255+
1256+    Unindented paragraph.
1257+
1258+        Block quote 1.
1259+
1260+        -- Attribution 1
1261+
1262+        Block quote 2.
1263+
1264+`Empty comments`_ may be used to explicitly terminate preceding
1265+constructs that would otherwise consume a block quote::
1266+
1267+    * List item.
1268+
1269+    ..
1270+
1271+        Block quote 3.
1272+
1273+Empty comments may also be used to separate block quotes::
1274+
1275+        Block quote 4.
1276+
1277+    ..
1278+
1279+        Block quote 5.
1280+
1281+Blank lines are required before and after a block quote, but these
1282+blank lines are not included as part of the block quote.
1283+
1284+Syntax diagram::
1285+
1286+    +------------------------------+
1287+    | (current level of            |
1288+    | indentation)                 |
1289+    +------------------------------+
1290+       +---------------------------+
1291+       | block quote               |
1292+       | (body elements)+          |
1293+       |                           |
1294+       | -- attribution text       |
1295+       |    (optional)             |
1296+       +---------------------------+
1297+
1298+
1299+Doctest Blocks
1300+--------------
1301+
1302+Doctree element: doctest_block.
1303+
1304+Doctest blocks are interactive Python sessions cut-and-pasted into
1305+docstrings. They are meant to illustrate usage by example, and
1306+provide an elegant and powerful testing environment via the `doctest
1307+module`_ in the Python standard library.
1308+
1309+Doctest blocks are text blocks which begin with ``">>> "``, the Python
1310+interactive interpreter main prompt, and end with a blank line.
1311+Doctest blocks are treated as a special case of literal blocks,
1312+without requiring the literal block syntax. If both are present, the
1313+literal block syntax takes priority over Doctest block syntax::
1314+
1315+    This is an ordinary paragraph.
1316+
1317+    >>> print 'this is a Doctest block'
1318+    this is a Doctest block
1319+
1320+    The following is a literal block::
1321+
1322+        >>> This is not recognized as a doctest block by
1323+        reStructuredText. It *will* be recognized by the doctest
1324+        module, though!
1325+
1326+Indentation is not required for doctest blocks.
1327+
1328+
1329+Tables
1330+------
1331+
1332+Doctree elements: table, tgroup, colspec, thead, tbody, row, entry.
1333+
1334+ReStructuredText provides two syntaxes for delineating table cells:
1335+`Grid Tables`_ and `Simple Tables`_.
1336+
1337+As with other body elements, blank lines are required before and after
1338+tables. Tables' left edges should align with the left edge of
1339+preceding text blocks; if indented, the table is considered to be part
1340+of a block quote.
1341+
1342+Once isolated, each table cell is treated as a miniature document; the
1343+top and bottom cell boundaries act as delimiting blank lines. Each
1344+cell contains zero or more body elements. Cell contents may include
1345+left and/or right margins, which are removed before processing.
1346+
1347+
1348+Grid Tables
1349+```````````
1350+
1351+Grid tables provide a complete table representation via grid-like
1352+"ASCII art". Grid tables allow arbitrary cell contents (body
1353+elements), and both row and column spans. However, grid tables can be
1354+cumbersome to produce, especially for simple data sets. The `Emacs
1355+table mode`_ is a tool that allows easy editing of grid tables, in
1356+Emacs. See `Simple Tables`_ for a simpler (but limited)
1357+representation.
1358+
1359+Grid tables are described with a visual grid made up of the characters
1360+"-", "=", "|", and "+". The hyphen ("-") is used for horizontal lines
1361+(row separators). The equals sign ("=") may be used to separate
1362+optional header rows from the table body (not supported by the `Emacs
1363+table mode`_). The vertical bar ("|") is used for vertical lines
1364+(column separators). The plus sign ("+") is used for intersections of
1365+horizontal and vertical lines. Example::
1366+
1367+    +------------------------+------------+----------+----------+
1368+    | Header row, column 1   | Header 2   | Header 3 | Header 4 |
1369+    | (header rows optional) |            |          |          |
1370+    +========================+============+==========+==========+
1371+    | body row 1, column 1   | column 2   | column 3 | column 4 |
1372+    +------------------------+------------+----------+----------+
1373+    | body row 2             | Cells may span columns.          |
1374+    +------------------------+------------+---------------------+
1375+    | body row 3             | Cells may  | - Table cells       |
1376+    +------------------------+ span rows. | - contain           |
1377+    | body row 4             |            | - body elements.    |
1378+    +------------------------+------------+---------------------+
1379+
1380+Some care must be taken with grid tables to avoid undesired
1381+interactions with cell text in rare cases. For example, the following
1382+table contains a cell in row 2 spanning from column 2 to column 4::
1383+
1384+    +--------------+----------+-----------+-----------+
1385+    | row 1, col 1 | column 2 | column 3  | column 4  |
1386+    +--------------+----------+-----------+-----------+
1387+    | row 2        |                                  |
1388+    +--------------+----------+-----------+-----------+
1389+    | row 3        |          |           |           |
1390+    +--------------+----------+-----------+-----------+
1391+
1392+If a vertical bar is used in the text of that cell, it could have
1393+unintended effects if accidentally aligned with column boundaries::
1394+
1395+    +--------------+----------+-----------+-----------+
1396+    | row 1, col 1 | column 2 | column 3  | column 4  |
1397+    +--------------+----------+-----------+-----------+
1398+    | row 2        | Use the command ``ls | more``.   |
1399+    +--------------+----------+-----------+-----------+
1400+    | row 3        |          |           |           |
1401+    +--------------+----------+-----------+-----------+
1402+
1403+Several solutions are possible. All that is needed is to break the
1404+continuity of the cell outline rectangle. One possibility is to shift
1405+the text by adding an extra space before::
1406+
1407+    +--------------+----------+-----------+-----------+
1408+    | row 1, col 1 | column 2 | column 3  | column 4  |
1409+    +--------------+----------+-----------+-----------+
1410+    | row 2        |  Use the command ``ls | more``.  |
1411+    +--------------+----------+-----------+-----------+
1412+    | row 3        |          |           |           |
1413+    +--------------+----------+-----------+-----------+
1414+
1415+Another possibility is to add an extra line to row 2::
1416+
1417+    +--------------+----------+-----------+-----------+
1418+    | row 1, col 1 | column 2 | column 3  | column 4  |
1419+    +--------------+----------+-----------+-----------+
1420+    | row 2        | Use the command ``ls | more``.   |
1421+    |              |                                  |
1422+    +--------------+----------+-----------+-----------+
1423+    | row 3        |          |           |           |
1424+    +--------------+----------+-----------+-----------+
1425+
1426+
1427+Simple Tables
1428+`````````````
1429+
1430+Simple tables provide a compact and easy to type but limited
1431+row-oriented table representation for simple data sets. Cell contents
1432+are typically single paragraphs, although arbitrary body elements may
1433+be represented in most cells. Simple tables allow multi-line rows (in
1434+all but the first column) and column spans, but not row spans. See
1435+`Grid Tables`_ above for a complete table representation.
1436+
1437+Simple tables are described with horizontal borders made up of "=" and
1438+"-" characters. The equals sign ("=") is used for top and bottom
1439+table borders, and to separate optional header rows from the table
1440+body. The hyphen ("-") is used to indicate column spans in a single
1441+row by underlining the joined columns, and may optionally be used to
1442+explicitly and/or visually separate rows.
1443+
1444+A simple table begins with a top border of equals signs with one or
1445+more spaces at each column boundary (two or more spaces recommended).
1446+Regardless of spans, the top border *must* fully describe all table
1447+columns. There must be at least two columns in the table (to
1448+differentiate it from section headers). The top border may be
1449+followed by header rows, and the last of the optional header rows is
1450+underlined with '=', again with spaces at column boundaries. There
1451+may not be a blank line below the header row separator; it would be
1452+interpreted as the bottom border of the table. The bottom boundary of
1453+the table consists of '=' underlines, also with spaces at column
1454+boundaries. For example, here is a truth table, a three-column table
1455+with one header row and four body rows::
1456+
1457+    =====  =====  =======
1458+      A      B    A and B
1459+    =====  =====  =======
1460+    False  False  False
1461+    True   False  False
1462+    False  True   False
1463+    True   True   True
1464+    =====  =====  =======
1465+
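Since the top border must describe every column, the column boundaries can be read straight off it. The following sketch shows one way to do that; `column_spans` is a hypothetical helper and not the parser Docutils uses.

```python
import re


def column_spans(top_border):
    """Return (start, end) character offsets for each column.

    Each run of "=" in the top border is one column; the spaces
    between runs are the column boundaries.
    """
    if top_border.strip("= "):
        raise ValueError("top border may contain only '=' and spaces")
    spans = [match.span() for match in re.finditer(r"=+", top_border)]
    if len(spans) < 2:
        # A single run of "=" would look like a section title adornment.
        raise ValueError("a simple table needs at least two columns")
    return spans


print(column_spans("=====  =====  ======="))
```

Every later text line can then be cut at these offsets, which is why text in all but the rightmost column has to stay within the border.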
1466+Underlines of '-' may be used to indicate column spans by "filling in"
1467+column margins to join adjacent columns. Column span underlines must
1468+be complete (they must cover all columns) and align with established
1469+column boundaries. Text lines containing column span underlines may
1470+not contain any other text. A column span underline applies only to
1471+one row immediately above it. For example, here is a table with a
1472+column span in the header::
1473+
1474+    =====  =====  ======
1475+       Inputs     Output
1476+    ------------  ------
1477+      A      B    A or B
1478+    =====  =====  ======
1479+    False  False  False
1480+    True   False  True
1481+    False  True   True
1482+    True   True   True
1483+    =====  =====  ======
1484+
1485+Each line of text must contain spaces at column boundaries, except
1486+where cells have been joined by column spans. Each line of text
1487+starts a new row, except when there is a blank cell in the first
1488+column. In that case, that line of text is parsed as a continuation
1489+line. For this reason, cells in the first column of new rows (*not*
1490+continuation lines) *must* contain some text; blank cells would lead
1491+to a misinterpretation (but see the tip below). Also, this mechanism
1492+limits cells in the first column to only one line of text. Use `grid
1493+tables`_ if this limitation is unacceptable.
1494+
1495+.. Tip::
1496+
1497+ To start a new row in a simple table without text in the first
1498+ column in the processed output, use one of these:
1499+
1500+ * an empty comment (".."), which may be omitted from the processed
1501+ output (see Comments_ below)
1502+
1503+ * a backslash escape ("``\``") followed by a space (see `Escaping
1504+ Mechanism`_ above)
1505+
1506+Underlines of '-' may also be used to visually separate rows, even if
1507+there are no column spans. This is especially useful in long tables,
1508+where rows are many lines long.
1509+
1510+Blank lines are permitted within simple tables. Their interpretation
1511+depends on the context. Blank lines *between* rows are ignored.
1512+Blank lines *within* multi-line rows may separate paragraphs or other
1513+body elements within cells.
1514+
1515+The rightmost column is unbounded; text may continue past the edge of
1516+the table (as indicated by the table borders). However, it is
1517+recommended that borders be made long enough to contain the entire
1518+text.
1519+
1520+The following example illustrates continuation lines (row 2 consists
1521+of two lines of text, and four lines for row 3), a blank line
1522+separating paragraphs (row 3, column 2), text extending past the right
1523+edge of the table, and a new row which will have no text in the first
1524+column in the processed output (row 4)::
1525+
1526+    =====  =====
1527+    col 1  col 2
1528+    =====  =====
1529+    1      Second column of row 1.
1530+    2      Second column of row 2.
1531+           Second line of paragraph.
1532+    3      - Second column of row 3.
1533+
1534+           - Second item in bullet
1535+             list (row 3, column 2).
1536+    \      Row 4; column 1 will be empty.
1537+    =====  =====
1538+
1539+
1540+Explicit Markup Blocks
1541+----------------------
1542+
1543+An explicit markup block is a text block:
1544+
1545+- whose first line begins with ".." followed by whitespace (the
1546+ "explicit markup start"),
1547+- whose second and subsequent lines (if any) are indented relative to
1548+ the first, and
1549+- which ends before an unindented line.
1550+
1551+Explicit markup blocks are analogous to bullet list items, with ".."
1552+as the bullet. The text on the lines immediately after the explicit
1553+markup start determines the indentation of the block body. The
1554+maximum common indentation is always removed from the second and
1555+subsequent lines of the block body. Therefore if the first construct
1556+fits in one line, and the indentation of the first and second
1557+constructs should differ, the first construct should not begin on the
1558+same line as the explicit markup start.
1559+
1560+Blank lines are required between explicit markup blocks and other
1561+elements, but are optional between explicit markup blocks where
1562+unambiguous.
1563+
1564+The explicit markup syntax is used for footnotes, citations, hyperlink
1565+targets, directives, substitution definitions, and comments.
1566+
1567+
1568+Footnotes
1569+`````````
1570+
1571+Doctree elements: footnote, label.
1572+
1573+Each footnote consists of an explicit markup start (".. "), a left
1574+square bracket, the footnote label, a right square bracket, and
1575+whitespace, followed by indented body elements. A footnote label can
1576+be:
1577+
1578+- a whole decimal number consisting of one or more digits,
1579+
1580+- a single "#" (denoting `auto-numbered footnotes`_),
1581+
1582+- a "#" followed by a simple reference name (an `autonumber label`_),
1583+ or
1584+
1585+- a single "*" (denoting `auto-symbol footnotes`_).
1586+
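The four kinds of footnote label listed above can be told apart with a few pattern checks. The sketch below is illustrative only. The `SIMPLE_NAME` pattern is an assumption about what a "simple reference name" is (alphanumerics with isolated internal hyphens, underscores, periods, colons, and plus signs); that definition is not part of this section. `classify_footnote_label` is a hypothetical name.

```python
import re

# Assumed shape of a simple reference name (not defined in this section).
SIMPLE_NAME = r"[a-zA-Z0-9]+(?:[-_.:+][a-zA-Z0-9]+)*"


def classify_footnote_label(label):
    """Return the kind of footnote a label denotes, or None if invalid."""
    if re.fullmatch(r"[0-9]+", label):
        return "manually numbered"
    if label == "#":
        return "auto-numbered"
    if re.fullmatch("#" + SIMPLE_NAME, label):
        return "autonumber label"
    if label == "*":
        return "auto-symbol"
    return None


for label in ("1", "#", "#note", "*", "note"):
    print(repr(label), classify_footnote_label(label))
```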
1587+The footnote content (body elements) must be consistently indented (by
1588+at least 3 spaces) and left-aligned. The first body element within a
1589+footnote may often begin on the same line as the footnote label.
1590+However, if the first element fits on one line and the indentation of
1591+the remaining elements differs, the first element must begin on the
1592+line after the footnote label. Otherwise, the difference in
1593+indentation will not be detected.
1594+
1595+Footnotes may occur anywhere in the document, not only at the end.
1596+Where and how they appear in the processed output depends on the
1597+processing system.
1598+
1599+Here is a manually numbered footnote::
1600+
1601+ .. [1] Body elements go here.
1602+
1603+Each footnote automatically generates a hyperlink target pointing to
1604+itself. The text of the hyperlink target name is the same as that of
1605+the footnote label. `Auto-numbered footnotes`_ generate a number as
1606+their footnote label and reference name. See `Implicit Hyperlink
1607+Targets`_ for a complete description of the mechanism.
1608+
1609+Syntax diagram::
1610+
1611+ +-------+-------------------------+
1612+ | ".. " | "[" label "]" footnote |
1613+ +-------+ |
1614+ | (body elements)+ |
1615+ +-------------------------+
1616+
1617+
1618+Auto-Numbered Footnotes
1619+.......................
1620+
1621+A number sign ("#") may be used as the first character of a footnote
1622+label to request automatic numbering of the footnote or footnote
1623+reference.
1624+
1625+The first footnote to request automatic numbering is assigned the
1626+label "1", the second is assigned the label "2", and so on (assuming
1627+there are no manually numbered footnotes present; see `Mixed Manual
1628+and Auto-Numbered Footnotes`_ below). A footnote which has
1629+automatically received a label "1" generates an implicit hyperlink
1630+target with name "1", just as if the label was explicitly specified.
1631+
1632+.. _autonumber label: `autonumber labels`_
1633+
1634+A footnote may specify a label explicitly while at the same time
1635+requesting automatic numbering: ``[#label]``. These labels are called
1636+_`autonumber labels`. Autonumber labels do two things:
1637+
1638+- On the footnote itself, they generate a hyperlink target whose name
1639+ is the autonumber label (doesn't include the "#").
1640+
1641+- They allow an automatically numbered footnote to be referred to more
1642+ than once, as a footnote reference or hyperlink reference. For
1643+ example::
1644+
1645+ If [#note]_ is the first footnote reference, it will show up as
1646+ "[1]". We can refer to it again as [#note]_ and again see
1647+ "[1]". We can also refer to it as note_ (an ordinary internal
1648+ hyperlink reference).
1649+
1650+ .. [#note] This is the footnote labeled "note".
1651+
1652+The numbering is determined by the order of the footnotes, not by the
1653+order of the references. For footnote references without autonumber
1654+labels (``[#]_``), the footnotes and footnote references must be in
1655+the same relative order but need not alternate in lock-step. For
1656+example::
1657+
1658+ [#]_ is a reference to footnote 1, and [#]_ is a reference to
1659+ footnote 2.
1660+
1661+ .. [#] This is footnote 1.
1662+ .. [#] This is footnote 2.
1663+ .. [#] This is footnote 3.
1664+
1665+ [#]_ is a reference to footnote 3.
1666+
1667+Special care must be taken if footnotes themselves contain
1668+auto-numbered footnote references, or if multiple references are made
1669+in close proximity. Footnotes and references are noted in the order
1670+they are encountered in the document, which is not necessarily the
1671+same as the order in which a person would read them.
1672+
1673+
1674+Auto-Symbol Footnotes
1675+.....................
1676+
1677+An asterisk ("*") may be used for footnote labels to request automatic
1678+symbol generation for footnotes and footnote references. The asterisk
1679+may be the only character in the label. For example::
1680+
1681+ Here is a symbolic footnote reference: [*]_.
1682+
1683+ .. [*] This is the footnote.
1684+
1685+A transform will insert symbols as labels into corresponding footnotes
1686+and footnote references. The number of references must be equal to
1687+the number of footnotes. One symbol footnote cannot have multiple
1688+references.
1689+
1690+The standard Docutils system uses the following symbols for footnote
1691+marks [#]_:
1692+
1693+- asterisk/star ("*")
1694+- dagger (HTML character entity "&dagger;", Unicode U+02020)
1695+- double dagger ("&Dagger;"/U+02021)
1696+- section mark ("&sect;"/U+000A7)
1697+- pilcrow or paragraph mark ("&para;"/U+000B6)
1698+- number sign ("#")
1699+- spade suit ("&spades;"/U+02660)
1700+- heart suit ("&hearts;"/U+02665)
1701+- diamond suit ("&diams;"/U+02666)
1702+- club suit ("&clubs;"/U+02663)
1703+
1704+.. [#] This list was inspired by the list of symbols for "Note
1705+ Reference Marks" in The Chicago Manual of Style, 14th edition,
1706+ section 12.51. "Parallels" ("||") were given in CMoS instead of
1707+ the pilcrow. The last four symbols (the card suits) were added
1708+ arbitrarily.
1709+
1710+If more than ten symbols are required, the same sequence will be
1711+reused, doubled and then tripled, and so on ("**" etc.).
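The reuse rule can be sketched as follows. This is an illustration under stated assumptions rather than the transform's actual code: `SYMBOLS` and `symbol_for` are names chosen here, and the index is the zero-based position of the footnote among the auto-symbol footnotes.

```python
# The ten marks listed above, in order (written as escapes for portability).
SYMBOLS = [
    "*",       # asterisk/star
    "\u2020",  # dagger
    "\u2021",  # double dagger
    "\u00a7",  # section mark
    "\u00b6",  # pilcrow
    "#",       # number sign
    "\u2660",  # spade suit
    "\u2665",  # heart suit
    "\u2666",  # diamond suit
    "\u2663",  # club suit
]


def symbol_for(index):
    """Return the mark for the auto-symbol footnote at zero-based `index`.

    After the tenth footnote the sequence restarts with each mark doubled,
    then tripled, and so on.
    """
    repeats, position = divmod(index, len(SYMBOLS))
    return SYMBOLS[position] * (repeats + 1)
```

Under this sketch the eleventh footnote (index 10) is marked "**".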
1712+
1713+.. Note:: When using auto-symbol footnotes, the choice of output
1714+ encoding is important. Many of the symbols used are not encodable
1715+ in certain common text encodings such as Latin-1 (ISO 8859-1). The
1716+ use of UTF-8 for the output encoding is recommended. An
1717+ alternative for HTML and XML output is to use the
1718+ "xmlcharrefreplace" `output encoding error handler`__.
1719+
1720+__ ../../user/config.html#output-encoding-error-handler
1721+
1722+
1723+Mixed Manual and Auto-Numbered Footnotes
1724+........................................
1725+
1726+Manual and automatic footnote numbering may both be used within a
1727+ single document, although the results may be unexpected. Manual
1728+numbering takes priority. Only unused footnote numbers are assigned
1729+to auto-numbered footnotes. The following example should be
1730+illustrative::
1731+
1732+ [2]_ will be "2" (manually numbered),
1733+ [#]_ will be "3" (anonymous auto-numbered), and
1734+ [#label]_ will be "1" (labeled auto-numbered).
1735+
1736+ .. [2] This footnote is labeled manually, so its number is fixed.
1737+
1738+ .. [#label] This autonumber-labeled footnote will be labeled "1".
1739+ It is the first auto-numbered footnote and no other footnote
1740+ with label "1" exists. The order of the footnotes is used to
1741+ determine numbering, not the order of the footnote references.
1742+
1743+ .. [#] This footnote will be labeled "3". It is the second
1744+ auto-numbered footnote, but footnote label "2" is already used.
1745+
1746+
1747+Citations
1748+`````````
1749+
1750+Citations are identical to footnotes except that they use only
1751+non-numeric labels such as ``[note]`` or ``[GVR2001]``. Citation
1752+labels are simple `reference names`_ (case-insensitive single words
1753+consisting of alphanumerics plus internal hyphens, underscores, and
1754+periods; no whitespace). Citations may be rendered separately and
1755+differently from footnotes. For example::
1756+
1757+ Here is a citation reference: [CIT2002]_.
1758+
1759+ .. [CIT2002] This is the citation. It's just like a footnote,
1760+ except the label is textual.
1761+
1762+
1763+.. _hyperlinks:
1764+
1765+Hyperlink Targets
1766+`````````````````
1767+
1768+Doctree element: target.
1769+
1770+These are also called _`explicit hyperlink targets`, to differentiate
1771+them from `implicit hyperlink targets`_ defined below.
1772+
1773+Hyperlink targets identify a location within or outside of a document,
1774+which may be linked to by `hyperlink references`_.
1775+
1776+Hyperlink targets may be named or anonymous. Named hyperlink targets
1777+consist of an explicit markup start (".. "), an underscore, the
1778+reference name (no trailing underscore), a colon, whitespace, and a
1779+link block::
1780+
1781+ .. _hyperlink-name: link-block
1782+
1783+Reference names are whitespace-neutral and case-insensitive. See
1784+`Reference Names`_ for details and examples.
1785+
1786+Anonymous hyperlink targets consist of an explicit markup start
1787+(".. "), two underscores, a colon, whitespace, and a link block; there
1788+is no reference name::
1789+
1790+ .. __: anonymous-hyperlink-target-link-block
1791+
1792+An alternate syntax for anonymous hyperlinks consists of two
1793+underscores, a space, and a link block::
1794+
1795+ __ anonymous-hyperlink-target-link-block
1796+
1797+See `Anonymous Hyperlinks`_ below.
1798+
1799+There are three types of hyperlink targets: internal, external, and
1800+indirect.
1801+
1802+1. _`Internal hyperlink targets` have empty link blocks. They provide
1803+ an end point allowing a hyperlink to connect one place to another
1804+ within a document. An internal hyperlink target points to the
1805+ element following the target. For example::
1806+
1807+ Clicking on this internal hyperlink will take us to the target_
1808+ below.
1809+
1810+ .. _target:
1811+
1812+ The hyperlink target above points to this paragraph.
1813+
1814+ Internal hyperlink targets may be "chained". Multiple adjacent
1815+ internal hyperlink targets all point to the same element::
1816+
1817+ .. _target1:
1818+ .. _target2:
1819+
1820+ The targets "target1" and "target2" are synonyms; they both
1821+ point to this paragraph.
1822+
1823+ If the element "pointed to" is an external hyperlink target (with a
1824+ URI in its link block; see #2 below) the URI from the external
1825+ hyperlink target is propagated to the internal hyperlink targets;
1826+ they will all "point to" the same URI. There is no need to
1827+ duplicate a URI. For example, all three of the following hyperlink
1828+ targets refer to the same URI::
1829+
1830+ .. _Python DOC-SIG mailing list archive:
1831+ .. _archive:
1832+ .. _Doc-SIG: http://mail.python.org/pipermail/doc-sig/
1833+
1834+ An inline form of internal hyperlink target is available; see
1835+ `Inline Internal Targets`_.
1836+
1837+2. _`External hyperlink targets` have an absolute or relative URI or
1838+ email address in their link blocks. For example, take the
1839+ following input::
1840+
1841+ See the Python_ home page for info.
1842+
1843+ `Write to me`_ with your questions.
1844+
1845+ .. _Python: http://www.python.org
1846+ .. _Write to me: jdoe@example.com
1847+
1848+ After processing into HTML, the hyperlinks might be expressed as::
1849+
1850+ See the <a href="http://www.python.org">Python</a> home page
1851+ for info.
1852+
1853+ <a href="mailto:jdoe@example.com">Write to me</a> with your
1854+ questions.
1855+
1856+ An external hyperlink's URI may begin on the same line as the
1857+ explicit markup start and target name, or it may begin in an
1858+ indented text block immediately following, with no intervening
1859+ blank lines. If there are multiple lines in the link block, they
1860+ are concatenated. Any whitespace is removed (whitespace is
1861+ permitted to allow for line wrapping). The following external
1862+ hyperlink targets are equivalent::
1863+
1864+ .. _one-liner: http://docutils.sourceforge.net/rst.html
1865+
1866+ .. _starts-on-this-line: http://
1867+ docutils.sourceforge.net/rst.html
1868+
1869+ .. _entirely-below:
1870+ http://docutils.
1871+ sourceforge.net/rst.html
1872+
1873+ If an external hyperlink target's URI contains an underscore as its
1874+ last character, it must be escaped to avoid being mistaken for an
1875+ indirect hyperlink target::
1876+
1877+ This link_ refers to a file called ``underscore_``.
1878+
1879+ .. _link: underscore\_
1880+
1881+ It is possible (although not generally recommended) to include URIs
1882+ directly within hyperlink references. See `Embedded URIs`_ below.
1883+
1884+3. _`Indirect hyperlink targets` have a hyperlink reference in their
1885+ link blocks. In the following example, target "one" indirectly
1886+ references whatever target "two" references, and target "two"
1887+ references target "three", an internal hyperlink target. In
1888+ effect, all three reference the same thing::
1889+
1890+ .. _one: two_
1891+ .. _two: three_
1892+ .. _three:
1893+
1894+ Just as with `hyperlink references`_ anywhere else in a document,
1895+ if a phrase-reference is used in the link block it must be enclosed
1896+ in backquotes. As with `external hyperlink targets`_, the link
1897+ block of an indirect hyperlink target may begin on the same line as
1898+ the explicit markup start or the next line. It may also be split
1899+ over multiple lines, in which case the lines are joined with
1900+ whitespace before being normalized.
1901+
1902+ For example, the following indirect hyperlink targets are
1903+ equivalent::
1904+
1905+ .. _one-liner: `A HYPERLINK`_
1906+ .. _entirely-below:
1907+ `a hyperlink`_
1908+ .. _split: `A
1909+ Hyperlink`_
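The two equivalence rules above (whitespace removed from external link blocks, reference names compared whitespace-neutrally and case-insensitively) can be sketched in a few lines. Both function names are assumptions made for this illustration and are not the Docutils API.

```python
def join_uri_lines(lines):
    """Concatenate the lines of an external link block.

    All whitespace is dropped, since it is only permitted to allow
    line wrapping.
    """
    return "".join("".join(lines).split())


def normalize_reference_name(name):
    """Collapse runs of whitespace to single spaces and lowercase the name."""
    return " ".join(name.lower().split())
```

With these helpers the three external targets shown earlier yield the same URI, and `A HYPERLINK`, `a hyperlink`, and the name split over two lines all normalize to `a hyperlink`.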
1910+
1911+If the reference name contains any colons, either:
1912+
1913+- the phrase must be enclosed in backquotes::
1914+
1915+ .. _`FAQTS: Computers: Programming: Languages: Python`:
1916+ http://python.faqts.com/
1917+
1918+- or the colon(s) must be backslash-escaped in the link target::
1919+
1920+ .. _Chapter One\: "Tadpole Days":
1921+
1922+ It's not easy being green...
1923+
1924+See `Implicit Hyperlink Targets`_ below for the resolution of
1925+duplicate reference names.
1926+
1927+Syntax diagram::
1928+
1929+ +-------+----------------------+
1930+ | ".. " | "_" name ":" link |
1931+ +-------+ block |
1932+ | |
1933+ +----------------------+
1934+
1935+
1936+Anonymous Hyperlinks
1937+....................
1938+
1939+The `World Wide Web Consortium`_ recommends in its `HTML Techniques
1940+for Web Content Accessibility Guidelines`_ that authors should
1941+"clearly identify the target of each link." Hyperlink references
1942+should be as verbose as possible, but duplicating a verbose hyperlink
1943+name in the target is onerous and error-prone. Anonymous hyperlinks
1944+are designed to allow convenient verbose hyperlink references, and are
1945+analogous to `Auto-Numbered Footnotes`_. They are particularly useful
1946+in short or one-off documents. However, this feature is easily abused
1947+and can result in unreadable plaintext and/or unmaintainable
1948+documents. Caution is advised.
1949+
1950+Anonymous `hyperlink references`_ are specified with two underscores
1951+instead of one::
1952+
1953+ See `the web site of my favorite programming language`__.
1954+
1955+Anonymous targets begin with ".. __:"; no reference name is required
1956+or allowed::
1957+
1958+ .. __: http://www.python.org
1959+
1960+As a convenient alternative, anonymous targets may begin with "__"
1961+only::
1962+
1963+ __ http://www.python.org
1964+
1965+The reference name of the reference is not used to match the reference
1966+to its target. Instead, the order of anonymous hyperlink references
1967+and targets within the document is significant: the first anonymous
1968+reference will link to the first anonymous target. The number of
1969+anonymous hyperlink references in a document must match the number of
1970+anonymous targets. For readability, it is recommended that targets be
1971+kept close to references. Take care when editing text containing
1972+anonymous references; adding, removing, and rearranging references
1973+require attention to the order of corresponding targets.
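Because matching is purely positional, the pairing step amounts to zipping two lists. The sketch below assumes the references and targets have already been collected in document order; `pair_anonymous` is a hypothetical helper, and raising `ValueError` stands in for whatever system message a real processor would emit.

```python
def pair_anonymous(references, targets):
    """Pair anonymous references with anonymous targets by position.

    Both sequences are in document order. A count mismatch is an error,
    reported here as ValueError for the purposes of the sketch.
    """
    if len(references) != len(targets):
        raise ValueError(
            "%d anonymous references but %d anonymous targets"
            % (len(references), len(targets))
        )
    return list(zip(references, targets))
```

The sketch also shows why edits need care: inserting one reference in the middle shifts every later pairing unless a target is inserted at the matching position.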
1974+
1975+
1976+Directives
1977+``````````
1978+
1979+Doctree elements: depend on the directive.
1980+
1981+Directives are an extension mechanism for reStructuredText, a way of
1982+adding support for new constructs without adding new primary syntax
1983+(directives may support additional syntax locally). All standard
1984+directives (those implemented and registered in the reference
1985+reStructuredText parser) are described in the `reStructuredText
1986+Directives`_ document, and are always available. Any other directives
1987+are domain-specific, and may require special action to make them
1988+available when processing the document.
1989+
1990+For example, here's how an image_ may be placed::
1991+
1992+ .. image:: mylogo.jpeg
1993+
1994+ A figure_ (a graphic with a caption) may be placed like this::
1995+
1996+ .. figure:: larch.png
1997+
1998+ The larch.
1999+
2000+An admonition_ (note, caution, etc.) contains other body elements::
2001+
2002+ .. note:: This is a paragraph
2003+
2004+ - Here is a bullet list.
2005+
2006+Directives are indicated by an explicit markup start (".. ") followed
2007+by the directive type, two colons, and whitespace (together called the
2008+"directive marker"). Directive types are case-insensitive single
2009+words (alphanumerics plus isolated internal hyphens, underscores,
2010+plus signs, colons, and periods; no whitespace). Two colons are used
2011+after the directive type for these reasons:
2012+
2013+- Two colons are distinctive, and unlikely to be used in common text.
2014+
2015+ - Two colons avoid clashes with common comment text like::
2016+
2017+ .. Danger: modify at your own risk!
2018+
2019+- If an implementation of reStructuredText does not recognize a
2020+ directive (i.e., the directive-handler is not installed), a level-3
2021+ (error) system message is generated, and the entire directive block
2022+ (including the directive itself) will be included as a literal
2023+ block. Thus "::" is a natural choice.
2024+
2025+ The directive block consists of any text on the first line of the
2026+directive after the directive marker, and any subsequent indented
2027+text. The interpretation of the directive block is up to the
2028+directive code. There are three logical parts to the directive block:
2029+
2030+1. Directive arguments.
2031+2. Directive options.
2032+3. Directive content.
2033+
2034+Individual directives can employ any combination of these parts.
2035+Directive arguments can be filesystem paths, URLs, title text, etc.
2036+Directive options are indicated using `field lists`_; the field names
2037+and contents are directive-specific. Arguments and options must form
2038+a contiguous block beginning on the first or second line of the
2039+directive; a blank line indicates the beginning of the directive
2040+ content block. If arguments or options (or both) are employed by the
2041+directive, a blank line must separate them from the directive content.
2042+The "figure" directive employs all three parts::
2043+
2044+ .. figure:: larch.png
2045+ :scale: 50
2046+
2047+ The larch.
2048+
2049+Simple directives may not require any content. If a directive that
2050+does not employ a content block is followed by indented text anyway,
2051+it is an error. If a block quote should immediately follow a
2052+directive, use an empty comment in-between (see Comments_ below).
2053+
2054+Actions taken in response to directives and the interpretation of text
2055+in the directive content block or subsequent text block(s) are
2056+directive-dependent. See `reStructuredText Directives`_ for details.
2057+
2058+Directives are meant for the arbitrary processing of their contents,
2059+which can be transformed into something possibly unrelated to the
2060+original text. It may also be possible for directives to be used as
2061+pragmas, to modify the behavior of the parser, such as to experiment
2062+with alternate syntax. There is no parser support for this
2063+functionality at present; if a reasonable need for pragma directives
2064+is found, they may be supported.
2065+
2066+Directives do not generate "directive" elements; they are a *parser
2067+construct* only, and have no intrinsic meaning outside of
2068+reStructuredText. Instead, the parser will transform recognized
2069+directives into (possibly specialized) document elements. Unknown
2070+directives will trigger level-3 (error) system messages.
2071+
2072+Syntax diagram::
2073+
2074+ +-------+-------------------------------+
2075+ | ".. " | directive type "::" directive |
2076+ +-------+ block |
2077+ | |
2078+ +-------------------------------+
2079+
2080+
2081+Substitution Definitions
2082+````````````````````````
2083+
2084+Doctree element: substitution_definition.
2085+
2086+Substitution definitions are indicated by an explicit markup start
2087+(".. ") followed by a vertical bar, the substitution text, another
2088+vertical bar, whitespace, and the definition block. Substitution text
2089+may not begin or end with whitespace. A substitution definition block
2090+contains an embedded inline-compatible directive (without the leading
2091+".. "), such as "image_" or "replace_". For example::
2092+
2093+ The |biohazard| symbol must be used on containers used to
2094+ dispose of medical waste.
2095+
2096+ .. |biohazard| image:: biohazard.png
2097+
2098+It is an error for a substitution definition block to directly or
2099+indirectly contain a circular substitution reference.
2100+
2101+`Substitution references`_ are replaced in-line by the processed
2102+contents of the corresponding definition (linked by matching
2103+substitution text). Matches are case-sensitive but forgiving; if no
2104+exact match is found, a case-insensitive comparison is attempted.
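The "case-sensitive but forgiving" lookup can be sketched as a two-step search. This is an illustration only: it assumes the definitions are held in a plain dictionary keyed by substitution text, and `find_substitution` is a name invented here.

```python
def find_substitution(definitions, text):
    """Look up substitution `text` in `definitions` (a dict).

    An exact, case-sensitive match wins. Only if there is none is a
    case-insensitive comparison attempted. Returns None when nothing
    matches.
    """
    if text in definitions:
        return definitions[text]
    lowered = text.lower()
    for name, value in definitions.items():
        if name.lower() == lowered:
            return value
    return None
```

With definitions for both `RST` and `rst`, each reference resolves to its own definition; with only `RST` defined, a reference written `|rst|` still resolves.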
2105+
2106+Substitution definitions allow the power and flexibility of
2107+block-level directives_ to be shared by inline text. They are a way
2108+to include arbitrarily complex inline structures within text, while
2109+keeping the details out of the flow of text. They are the equivalent
2110+of SGML/XML's named entities or programming language macros.
2111+
2112+Without the substitution mechanism, every time someone wants an
2113+application-specific new inline structure, they would have to petition
2114+for a syntax change. In combination with existing directive syntax,
2115+any inline structure can be coded without new syntax (except possibly
2116+a new directive).
2117+
2118+Syntax diagram::
2119+
2120+ +-------+-----------------------------------------------------+
2121+ | ".. " | "|" substitution text "| " directive type "::" data |
2122+ +-------+ directive block |
2123+ | |
2124+ +-----------------------------------------------------+
2125+
2126+Following are some use cases for the substitution mechanism. Please
2127+note that most of the embedded directives shown are examples only and
2128+have not been implemented.
2129+
2130+Objects
2131+ Substitution references may be used to associate ambiguous text
2132+ with a unique object identifier.
2133+
2134+ For example, many sites may wish to implement an inline "user"
2135+ directive::
2136+
2137+ |Michael| and |Jon| are our widget-wranglers.
2138+
2139+ .. |Michael| user:: mjones
2140+ .. |Jon| user:: jhl
2141+
2142+ Depending on the needs of the site, this may be used to index the
2143+ document for later searching, to hyperlink the inline text in
2144+ various ways (mailto, homepage, mouseover Javascript with profile
2145+ and contact information, etc.), or to customize presentation of
2146+ the text (include username in the inline text, include an icon
2147+ image with a link next to the text, make the text bold or a
2148+ different color, etc.).
2149+
2150+ The same approach can be used in documents which frequently refer
2151+ to a particular type of object with unique identifiers but
2152+ ambiguous common names. Movies, albums, books, photos, court
2153+ cases, and laws are possible. For example::
2154+
2155+ |The Transparent Society| offers a fascinating alternate view
2156+ on privacy issues.
2157+
2158+ .. |The Transparent Society| book:: isbn=0738201448
2159+
2160+ Classes or functions, in contexts where the module or class names
2161+ are unclear and/or interpreted text cannot be used, are another
2162+ possibility::
2163+
2164+ 4XSLT has the convenience method |runString|, so you don't
2165+ have to mess with DOM objects if all you want is the
2166+ transformed output.
2167+
2168+ .. |runString| function:: module=xml.xslt class=Processor
2169+
2170+Images
2171+ Images are a common use for substitution references::
2172+
2173+ West led the |H| 3, covered by dummy's |H| Q, East's |H| K,
2174+ and trumped in hand with the |S| 2.
2175+
2176+ .. |H| image:: /images/heart.png
2177+ :height: 11
2178+ :width: 11
2179+ .. |S| image:: /images/spade.png
2180+ :height: 11
2181+ :width: 11
2182+
2183+ * |Red light| means stop.
2184+ * |Green light| means go.
2185+ * |Yellow light| means go really fast.
2186+
2187+ .. |Red light| image:: red_light.png
2188+ .. |Green light| image:: green_light.png
2189+ .. |Yellow light| image:: yellow_light.png
2190+
2191+ |-><-| is the official symbol of POEE_.
2192+
2193+ .. |-><-| image:: discord.png
2194+ .. _POEE: http://www.poee.org/
2195+
2196+ The "image_" directive has been implemented.
2197+
2198+Styles [#]_
2199+ Substitution references may be used to associate inline text with
2200+ an externally defined presentation style::
2201+
2202+ Even |the text in Texas| is big.
2203+
2204+ .. |the text in Texas| style:: big
2205+
2206+ The style name may be meaningful in the context of some particular
2207+ output format (CSS class name for HTML output, LaTeX style name
2208+ for LaTeX, etc), or may be ignored for other output formats (such
2209+ as plaintext).
2210+
2211+ .. @@@ This needs to be rethought & rewritten or removed:
2212+
2213+ Interpreted text is unsuitable for this purpose because the set
2214+ of style names cannot be predefined - it is the domain of the
2215+ content author, not the author of the parser and output
2216+ formatter - and there is no way to associate a style name
2217+ argument with an interpreted text style role. Also, it may be
2218+ desirable to use the same mechanism for styling blocks::
2219+
2220+ .. style:: motto
2221+ At Bob's Underwear Shop, we'll do anything to get in
2222+ your pants.
2223+
2224+ .. style:: disclaimer
2225+ All rights reversed. Reprint what you like.
2226+
2227+ .. [#] There may be sufficient need for a "style" mechanism to
2228+ warrant simpler syntax such as an extension to the interpreted
2229+ text role syntax. The substitution mechanism is cumbersome for
2230+ simple text styling.
2231+
2232+Templates
2233+ Inline markup may be used for later processing by a template
2234+ engine. For example, a Zope_ author might write::
2235+
2236+ Welcome back, |name|!
2237+
2238+ .. |name| tal:: replace user/getUserName
2239+
2240+ After processing, this ZPT output would result::
2241+
2242+ Welcome back,
2243+ <span tal:replace="user/getUserName">name</span>!
2244+
2245+ Zope would then transform this to something like "Welcome back,
2246+ David!" during a session with an actual user.
2247+
2248+Replacement text
2249+ The substitution mechanism may be used for simple macro
2250+ substitution. This may be appropriate when the replacement text
2251+ is repeated many times throughout one or more documents,
2252+ especially if it may need to change later. A short example is
2253+ unavoidably contrived::
2254+
2255+ |RST|_ is a little annoying to type over and over, especially
2256+ when writing about |RST| itself, and spelling out the
2257+ bicapitalized word |RST| every time isn't really necessary for
2258+ |RST| source readability.
2259+
2260+ .. |RST| replace:: reStructuredText
2261+ .. _RST: http://docutils.sourceforge.net/rst.html
2262+
2263+ Note the trailing underscore in the first use of a substitution
2264+ reference. This indicates a reference to the corresponding
2265+ hyperlink target.
2266+
2267+ Substitution is also appropriate when the replacement text cannot
2268+ be represented using other inline constructs, or is obtrusively
2269+ long::
2270+
2271+ But still, that's nothing compared to a name like
2272+ |j2ee-cas|__.
2273+
2274+ .. |j2ee-cas| replace::
2275+ the Java `TM`:super: 2 Platform, Enterprise Edition Client
2276+ Access Services
2277+ __ http://developer.java.sun.com/developer/earlyAccess/
2278+ j2eecas/
2279+
2280+ The "replace_" directive has been implemented.
2281+
2282+
2283+Comments
2284+````````
2285+
2286+Doctree element: comment.
2287+
2288+Arbitrary indented text may follow the explicit markup start and will
2289+be processed as a comment element. No further processing is done on
2290+the comment block text; a comment contains a single "text blob".
2291+Depending on the output formatter, comments may be removed from the
2292+processed output. The only restriction on comments is that they not
2293+use the same syntax as any of the other explicit markup constructs:
2294+substitution definitions, directives, footnotes, citations, or
2295+hyperlink targets. To ensure that none of the other explicit markup
2296+constructs is recognized, leave the ".." on a line by itself::
2297+
2298+ .. This is a comment
2299+ ..
2300+ _so: is this!
2301+ ..
2302+ [and] this!
2303+ ..
2304+ this:: too!
2305+ ..
2306+ |even| this:: !
2307+
2308+.. _empty comments:
2309+
2310+An explicit markup start followed by a blank line and nothing else
2311+(apart from whitespace) is an "_`empty comment`". It serves to
2312+terminate a preceding construct, and does **not** consume any indented
2313+text following. To have a block quote follow a list or any indented
2314+construct, insert an unindented empty comment in-between.
2315+
2316+Syntax diagram::
2317+
2318+ +-------+----------------------+
2319+ | ".. " | comment |
2320+ +-------+ block |
2321+ | |
2322+ +----------------------+
2323+
2324+
2325+Implicit Hyperlink Targets
2326+==========================
2327+
2328+Implicit hyperlink targets are generated by section titles, footnotes,
2329+and citations, and may also be generated by extension constructs.
2330+Implicit hyperlink targets otherwise behave identically to explicit
2331+`hyperlink targets`_.
2332+
2333+Problems of ambiguity due to conflicting duplicate implicit and
2334+explicit reference names are avoided by following this procedure:
2335+
2336+1. `Explicit hyperlink targets`_ override any implicit targets having
2337+ the same reference name. The implicit hyperlink targets are
2338+ removed, and level-1 (info) system messages are inserted.
2339+
2340+2. Duplicate implicit hyperlink targets are removed, and level-1
2341+ (info) system messages inserted. For example, if two or more
2342+ sections have the same title (such as "Introduction" subsections of
2343+ a rigidly-structured document), there will be duplicate implicit
2344+ hyperlink targets.
2345+
2346+3. Duplicate explicit hyperlink targets are removed, and level-2
2347+ (warning) system messages are inserted. Exception: duplicate
2348+ `external hyperlink targets`_ (identical hyperlink names and
2349+ referenced URIs) do not conflict, and are not removed.
2350+
2351+System messages are inserted where target links have been removed.
2352+See "Error Handling" in `PEP 258`_.
2353+
2354+The parser must return a set of *unique* hyperlink targets. The
2355+ calling software (such as Docutils_) can warn of unresolvable
2356+links, giving reasons for the messages.
2357+
2358+
2359+Inline Markup
2360+=============
2361+
2362+In reStructuredText, inline markup applies to words or phrases within
2363+a text block. The same whitespace and punctuation that serves to
2364+delimit words in written text is used to delimit the inline markup
2365+syntax constructs. The text within inline markup may not begin or end
2366+with whitespace. Arbitrary `character-level inline markup`_ is
2367+supported although not encouraged. Inline markup cannot be nested.
2368+
2369+There are nine inline markup constructs. Five of the constructs use
2370+identical start-strings and end-strings to indicate the markup:
2371+
2372+- emphasis_: "*"
2373+- `strong emphasis`_: "**"
2374+- `interpreted text`_: "`"
2375+- `inline literals`_: "``"
2376+- `substitution references`_: "|"
2377+
2378+Three constructs use different start-strings and end-strings:
2379+
2380+- `inline internal targets`_: "_`" and "`"
2381+- `footnote references`_: "[" and "]_"
2382+- `hyperlink references`_: "`" and "\`_" (phrases), or just a
2383+ trailing "_" (single words)
2384+
2385+`Standalone hyperlinks`_ are recognized implicitly, and use no extra
2386+markup.
2387+
2388+Inline markup recognition rules
2389+-------------------------------
2390+
2391+Inline markup start-strings and end-strings are only recognized if all of
2392+the following conditions are met:
2393+
2394+1. Inline markup start-strings must start a text block or be
2395+ immediately preceded by
2396+
2397+ * whitespace,
2398+ * one of the ASCII characters ``- : / ' " < ( [ {`` or
2399+ * a non-ASCII punctuation character with `Unicode category`_
2400+ `Pd` (Dash),
2401+ `Po` (Other),
2402+ `Ps` (Open),
2403+ `Pi` (Initial quote), or
2404+ `Pf` (Final quote) [#PiPf]_.
2405+
2406+2. Inline markup start-strings must be immediately followed by
2407+ non-whitespace.
2408+
2409+3. Inline markup end-strings must be immediately preceded by
2410+ non-whitespace.
2411+
2412+4. Inline markup end-strings must end a text block or be immediately
2413+ followed by
2414+
2415+ * whitespace,
2416+ * one of the ASCII characters ``- . , : ; ! ? \ / ' " ) ] } >`` or
2417+ * a non-ASCII punctuation character with `Unicode category`_
2418+ `Pd` (Dash),
2419+ `Po` (Other),
2420+ `Pe` (Close),
2421+ `Pf` (Final quote), or
2422+ `Pi` (Initial quote) [#PiPf]_.
2423+
2424+5. If an inline markup start-string is immediately preceded by one of the
2425+ ASCII characters ``' " < ( [ {``, or a character with Unicode character
2426+ category `Ps`, `Pi`, or `Pf`, it must not be followed by the
2427+ corresponding [#corresponding-quotes]_ closing character from
2428+ ``' " ) ] } >`` or the categories `Pe`, `Pf`, or `Pi`.
2429+
2430+6. An inline markup end-string must be separated by at least one
2431+ character from the start-string.
2432+
2433+7. An unescaped backslash preceding a start-string or end-string will
2434+ disable markup recognition, except for the end-string of `inline
2435+ literals`_. See `Escaping Mechanism`_ above for details.
2436+
2437+.. [#PiPf] `Pi` (Punctuation, Initial quote) characters are "usually
2438+ closing, sometimes opening". `Pf` (Punctuation, Final quote)
2439+ characters are "usually closing, sometimes opening".
2440+
2441+.. [#corresponding-quotes] For quotes, corresponding characters can be
2442+ any of the `quotation marks in international usage`_
2443+
2444+.. _Unicode category:
2445+ http://www.unicode.org/Public/5.1.0/ucd/UCD.html#General_Category_Values
2446+
2447+.. _quotation marks in international usage:
2448+ http://en.wikipedia.org/wiki/Quotation_mark,_non-English_usage
2449+
2450+The inline markup recognition rules were devised to allow 90% of non-markup
2451+uses of "*", "`", "_", and "|" without escaping. For example, none of the
2452+following terms are recognized as containing inline markup strings:
2453+
2454+- 2*x a**b O(N**2) e**(x*y) f(x)*f(y) a|b file*.* (breaks 1)
2455+- 2 * x a ** b (* BOM32_* ` `` _ __ | (breaks 2)
2456+- "*" '|' (*) [*] {*} <*>
2457+ ‘*’ ‚*‘ ‘*‚ ’*’ ‚*’
2458+ “*” „*“ “*„ ”*” „*”
2459+ »*« ›*‹ «*» »*» ›*› (breaks 5)
2460+- || (breaks 6)
2461+- __init__ __init__()
2462+
2463+No escaping is required inside the following inline markup examples:
2464+
2465+- *2 * x *a **b *.txt* (breaks 3)
2466+- *2*x a**b O(N**2) e**(x*y) f(x)*f(y) a*(1+2)* (breaks 4)
2467+
2468+It may be desirable to use `inline literals`_ for some of these anyhow,
2469+especially if they represent code snippets. It's a judgment call.
2470+
2471+These cases *do* require either literal-quoting or escaping to avoid
2472+misinterpretation:
2473+
2474+ \*4, class\_, \*args, \**kwargs, \`TeX-quoted', \*ML, \*.txt
2475+
2476+In most use cases, `inline literals`_ or `literal blocks`_ are the best
2477+choice (by default, this also selects a monospaced font)::
2478+
2479+ *4, class_, *args, **kwargs, `TeX-quoted', *ML, *.txt
2480+
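As a rough illustration of rules 1 to 4 above, the following Python sketch checks whether a candidate start-string or end-string sits in a valid context. It is not the Docutils implementation: the helper names (``may_precede_start``, ``may_follow_end``, ``start_string_ok``, ``end_string_ok``) are hypothetical, and rules 5 to 7 (matching quotes, minimum content, backslash escapes) are deliberately left out.

```python
import unicodedata

# ASCII characters allowed immediately before a start-string (rule 1).
OPENERS = set("-:/'\"<([{")
# ASCII characters allowed immediately after an end-string (rule 4).
CLOSERS = set("-.,:;!?\\/'\")]}>")


def may_precede_start(ch):
    """Rule 1: whitespace, an ASCII opener, or non-ASCII Pd/Po/Ps/Pi/Pf."""
    if ch.isspace() or ch in OPENERS:
        return True
    return ord(ch) > 127 and unicodedata.category(ch) in {"Pd", "Po", "Ps", "Pi", "Pf"}


def may_follow_end(ch):
    """Rule 4: whitespace, an ASCII closer, or non-ASCII Pd/Po/Pe/Pf/Pi."""
    if ch.isspace() or ch in CLOSERS:
        return True
    return ord(ch) > 127 and unicodedata.category(ch) in {"Pd", "Po", "Pe", "Pf", "Pi"}


def start_string_ok(text, pos, length=1):
    """Rules 1 and 2 for a start-string of `length` characters at text[pos]."""
    before_ok = pos == 0 or may_precede_start(text[pos - 1])
    after = text[pos + length:pos + length + 1]
    after_ok = bool(after) and not after.isspace()
    return before_ok and after_ok


def end_string_ok(text, pos, length=1):
    """Rules 3 and 4 for an end-string of `length` characters at text[pos]."""
    before_ok = pos > 0 and not text[pos - 1].isspace()
    end = pos + length
    after_ok = end == len(text) or may_follow_end(text[end])
    return before_ok and after_ok
```

With these helpers, the asterisk in ``2*x`` is rejected by rule 1 and the one in ``2 * x`` by rule 2, while both asterisks in ``This is *emphasized text*.`` pass.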
2481+Recognition order
2482+-----------------
2483+
2484+Inline markup delimiter characters are used for multiple constructs,
2485+so to avoid ambiguity there must be a specific recognition order for
2486+each character. The inline markup recognition order is as follows:
2487+
2488+- Asterisks: `Strong emphasis`_ ("**") is recognized before emphasis_
2489+ ("*").
2490+
2491+- Backquotes: `Inline literals`_ ("``"), `inline internal targets`_
2492+ (leading "_`", trailing "`") are mutually independent, and are
2493+ recognized before phrase `hyperlink references`_ (leading "`",
2494+ trailing "\`_") and `interpreted text`_ ("`").
2495+
2496+- Trailing underscores: Footnote references ("[" + label + "]_") and
2497+ simple `hyperlink references`_ (name + trailing "_") are mutually
2498+ independent.
2499+
2500+- Vertical bars: `Substitution references`_ ("|") are independently
2501+ recognized.
2502+
2503+- `Standalone hyperlinks`_ are the last to be recognized.
2504+
2505+
2506+Character-Level Inline Markup
2507+-----------------------------
2508+
2509+It is possible to mark up individual characters within a word with
2510+backslash escapes (see `Escaping Mechanism`_ above). Backslash
2511+escapes can be used to allow arbitrary text to immediately follow
2512+inline markup::
2513+
2514+ Python ``list``\s use square bracket syntax.
2515+
2516+The backslash will disappear from the processed document. The word
2517+"list" will appear as inline literal text, and the letter "s" will
2518+immediately follow it as normal text, with no space in-between.
2519+
2520+Arbitrary text may immediately precede inline markup using
2521+backslash-escaped whitespace::
2522+
2523+ Possible in *re*\ ``Structured``\ *Text*, though not encouraged.
2524+
2525+The backslashes and spaces separating "re", "Structured", and "Text"
2526+above will disappear from the processed document.
2527+
2528+.. CAUTION::
2529+
2530+ The use of backslash-escapes for character-level inline markup is
2531+ not encouraged. Such use is ugly and detrimental to the
2532+ unprocessed document's readability. Please use this feature
2533+ sparingly and only where absolutely necessary.
2534+
2535+
2536+Emphasis
2537+--------
2538+
2539+Doctree element: emphasis.
2540+
2541+Start-string = end-string = "*".
2542+
2543+Text enclosed by single asterisk characters is emphasized::
2544+
2545+ This is *emphasized text*.
2546+
2547+Emphasized text is typically displayed in italics.
2548+
2549+
2550+Strong Emphasis
2551+---------------
2552+
2553+Doctree element: strong.
2554+
2555+Start-string = end-string = "**".
2556+
2557+Text enclosed by double-asterisks is emphasized strongly::
2558+
2559+ This is **strong text**.
2560+
2561+Strongly emphasized text is typically displayed in boldface.
2562+
2563+
2564+Interpreted Text
2565+----------------
2566+
2567+Doctree element: depends on the explicit or implicit role and
2568+processing.
2569+
2570+Start-string = end-string = "`".
2571+
2572+Interpreted text is text that is meant to be related, indexed, linked,
2573+summarized, or otherwise processed, but the text itself is typically
2574+left alone. Interpreted text is enclosed by single backquote
2575+characters::
2576+
2577+ This is `interpreted text`.
2578+
2579+The "role" of the interpreted text determines how the text is
2580+interpreted. The role may be inferred implicitly (as above; the
2581+"default role" is used) or indicated explicitly, using a role marker.
2582+A role marker consists of a colon, the role name, and another colon.
2583+A role name is a single word consisting of alphanumerics plus isolated
2584+internal hyphens, underscores, plus signs, colons, and periods;
2585+no whitespace or other characters are allowed. A role marker is
2586+either a prefix or a suffix to the interpreted text, whichever reads
2587+better; it's up to the author::
2588+
2589+ :role:`interpreted text`
2590+
2591+ `interpreted text`:role:
2592+
2593+Interpreted text allows extensions to the available inline descriptive
2594+markup constructs. To emphasis_, `strong emphasis`_, `inline
2595+literals`_, and `hyperlink references`_, we can add "title reference",
2596+"index entry", "acronym", "class", "red", "blinking" or anything else
2597+we want. Only pre-determined roles are recognized; unknown roles will
2598+generate errors. A core set of standard roles is implemented in the
2599+reference parser; see `reStructuredText Interpreted Text Roles`_ for
2600+individual descriptions. The role_ directive can be used to define
2601+custom interpreted text roles. In addition, applications may support
2602+specialized roles.
2603+
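The role marker syntax described above can be sketched with two regular expressions, one for the prefix form and one for the suffix form. This is only an illustration under simplifying assumptions: ``split_role`` and its ``default_role`` parameter are hypothetical names, and the patterns ignore escaping and the inline markup recognition rules.

```python
import re

# Alphanumerics plus isolated internal hyphens, underscores, plus signs,
# colons, and periods.
ROLE_NAME = r"[A-Za-z0-9]+(?:[-_+:.][A-Za-z0-9]+)*"
PREFIX_FORM = re.compile(r"^:(?P<role>%s):`(?P<text>[^`]+)`$" % ROLE_NAME)
SUFFIX_FORM = re.compile(r"^`(?P<text>[^`]+)`:(?P<role>%s):$" % ROLE_NAME)
IMPLICIT_FORM = re.compile(r"^`(?P<text>[^`]+)`$")


def split_role(markup, default_role="default"):
    """Return (role, text) for one interpreted text construct, or None."""
    for pattern in (PREFIX_FORM, SUFFIX_FORM):
        match = pattern.match(markup)
        if match:
            return match.group("role"), match.group("text")
    match = IMPLICIT_FORM.match(markup)
    if match:
        # No role marker: the role is inferred (the "default role").
        return default_role, match.group("text")
    return None
```

Both ``:role:`interpreted text``` and ```interpreted text`:role:`` yield the same (role, text) pair, which reflects the statement that the choice between prefix and suffix is up to the author.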
2604+
2605+Inline Literals
2606+---------------
2607+
2608+Doctree element: literal.
2609+
2610+Start-string = end-string = "``".
2611+
2612+Text enclosed by double-backquotes is treated as inline literals::
2613+
2614+ This text is an example of ``inline literals``.
2615+
2616+Inline literals may contain any characters except two adjacent
2617+backquotes in an end-string context (according to the recognition
2618+rules above). No markup interpretation (including backslash-escape
2619+interpretation) is done within inline literals.
2620+
2621+Line breaks are *not* preserved in inline literals. Although a
2622+reStructuredText parser will preserve runs of spaces in its output,
2623+the final representation of the processed document is dependent on the
2624+output formatter, thus the preservation of whitespace cannot be
2625+guaranteed. If the preservation of line breaks and/or other
2626+whitespace is important, `literal blocks`_ should be used.
2627+
2628+Inline literals are useful for short code snippets. For example::
2629+
2630+ The regular expression ``[+-]?(\d+(\.\d*)?|\.\d+)`` matches
2631+ floating-point numbers (without exponents).
2632+
2633+
2634+Hyperlink References
2635+--------------------
2636+
2637+Doctree element: reference.
2638+
2639+- Named hyperlink references:
2640+
2641+ - Start-string = "" (empty string), end-string = "_".
2642+ - Start-string = "`", end-string = "\`_". (Phrase references.)
2643+
2644+- Anonymous hyperlink references:
2645+
2646+ - Start-string = "" (empty string), end-string = "__".
2647+ - Start-string = "`", end-string = "\`__". (Phrase references.)
2648+
2649+Hyperlink references are indicated by a trailing underscore, "_",
2650+except for `standalone hyperlinks`_ which are recognized
2651+independently. The underscore can be thought of as a right-pointing
2652+arrow. The trailing underscores point away from hyperlink references,
2653+and the leading underscores point toward `hyperlink targets`_.
2654+
2655+Hyperlinks consist of two parts. In the text body, there is a source
2656+link, a reference name with a trailing underscore (or two underscores
2657+for `anonymous hyperlinks`_)::
2658+
2659+ See the Python_ home page for info.
2660+
2661+A target link with a matching reference name must exist somewhere else
2662+in the document. (See `Hyperlink Targets`_ for a full description.)
2663+
2664+`Anonymous hyperlinks`_ (which see) do not use reference names to
2665+match references to targets, but otherwise behave similarly to named
2666+hyperlinks.
2667+
2668+
2669+Embedded URIs
2670+`````````````
2671+
2672+A hyperlink reference may directly embed a target URI inline, within
2673+angle brackets ("<...>") as follows::
2674+
2675+ See the `Python home page <http://www.python.org>`_ for info.
2676+
2677+This is exactly equivalent to::
2678+
2679+ See the `Python home page`_ for info.
2680+
2681+ .. _Python home page: http://www.python.org
2682+
2683+The bracketed URI must be preceded by whitespace and be the last text
2684+before the end string. With a single trailing underscore, the
2685+reference is named and the same target URI may be referred to again.
2686+
2687+With two trailing underscores, the reference and target are both
2688+anonymous, and the target cannot be referred to again. These are
2689+"one-off" hyperlinks. For example::
2690+
2691+ `RFC 2396 <http://www.rfc-editor.org/rfc/rfc2396.txt>`__ and `RFC
2692+ 2732 <http://www.rfc-editor.org/rfc/rfc2732.txt>`__ together
2693+ define the syntax of URIs.
2694+
2695+Equivalent to::
2696+
2697+ `RFC 2396`__ and `RFC 2732`__ together define the syntax of URIs.
2698+
2699+ __ http://www.rfc-editor.org/rfc/rfc2396.txt
2700+ __ http://www.rfc-editor.org/rfc/rfc2732.txt
2701+
2702+If reference text happens to end with angle-bracketed text that is
2703+*not* a URI, the open-angle-bracket needs to be backslash-escaped.
2704+For example, here is a reference to a title describing a tag::
2705+
2706+ See `HTML Element: \<a>`_ below.
2707+
2708+The reference text may also be omitted, in which case the URI will be
2709+duplicated for use as the reference text. This is useful for relative
2710+URIs where the address or file name is also the desired reference
2711+text::
2712+
2713+ See `<a_named_relative_link>`_ or `<an_anonymous_relative_link>`__
2714+ for details.
2715+
2716+.. CAUTION::
2717+
2718+ This construct offers easy authoring and maintenance of hyperlinks
2719+ at the expense of general readability. Inline URIs, especially
2720+ long ones, inevitably interrupt the natural flow of text. For
2721+ documents meant to be read in source form, the use of independent
2722+ block-level `hyperlink targets`_ is **strongly recommended**. The
2723+ embedded URI construct is most suited to documents intended *only*
2724+ to be read in processed form.
2725+
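To make the embedded URI rules concrete, here is a minimal Python sketch that splits a phrase reference into its reference text, its embedded URI (if any), and whether it is anonymous. The function name ``split_reference`` and both patterns are assumptions made for illustration; they are not taken from the Docutils parser, and they handle only the cases shown in this section.

```python
import re

# A phrase reference: backquoted body followed by one or two underscores.
PHRASE_REF = re.compile(r"^`(?P<body>.+)`(?P<suffix>__?)$", re.DOTALL)
# An angle-bracketed target that ends the body and is either preceded by
# whitespace or is the whole body.  A backslash-escaped "\<" never matches,
# because the character before "<" is then neither whitespace nor the start.
EMBEDDED_URI = re.compile(r"(?:^|\s+)<(?P<uri>[^<>]+)>$")


def split_reference(markup):
    """Return (text, uri, anonymous) for a phrase reference, or None."""
    match = PHRASE_REF.match(markup)
    if match is None:
        return None
    body = match.group("body")
    anonymous = match.group("suffix") == "__"
    embedded = EMBEDDED_URI.search(body)
    if embedded is None:
        # No embedded URI; drop the escape from "\<" (simplified unescaping).
        return body.replace("\\<", "<"), None, anonymous
    uri = embedded.group("uri")
    # When the reference text is omitted, the URI doubles as the text.
    text = body[:embedded.start()] or uri
    return text, uri, anonymous
```

For instance, ```Python home page <http://www.python.org>`_`` gives the text "Python home page" with the URI attached, while ```HTML Element: \<a>`_`` is treated as an ordinary named reference with no URI.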
2726+
2727+Inline Internal Targets
2728+------------------------
2729+
2730+Doctree element: target.
2731+
2732+Start-string = "_`", end-string = "`".
2733+
2734+Inline internal targets are the equivalent of explicit `internal
2735+hyperlink targets`_, but may appear within running text. The syntax
2736+begins with an underscore and a backquote, is followed by a hyperlink
2737+name or phrase, and ends with a backquote. Inline internal targets
2738+may not be anonymous.
2739+
2740+For example, the following paragraph contains a hyperlink target named
2741+"Norwegian Blue"::
2742+
2743+ Oh yes, the _`Norwegian Blue`. What's, um, what's wrong with it?
2744+
2745+See `Implicit Hyperlink Targets`_ for the resolution of duplicate
2746+reference names.
2747+
2748+
2749+Footnote References
2750+-------------------
2751+
2752+Doctree element: footnote_reference.
2753+
2754+Start-string = "[", end-string = "]_".
2755+
2756+Each footnote reference consists of a square-bracketed label followed
2757+by a trailing underscore. Footnote labels are one of:
2758+
2759+- one or more digits (i.e., a number),
2760+
2761+- a single "#" (denoting `auto-numbered footnotes`_),
2762+
2763+- a "#" followed by a simple reference name (an `autonumber label`_),
2764+ or
2765+
2766+- a single "*" (denoting `auto-symbol footnotes`_).
2767+
2768+For example::
2769+
2770+ Please RTFM [1]_.
2771+
2772+ .. [1] Read The Fine Manual
2773+
2774+
2775+Citation References
2776+-------------------
2777+
2778+Doctree element: citation_reference.
2779+
2780+Start-string = "[", end-string = "]_".
2781+
2782+Each citation reference consists of a square-bracketed label followed
2783+by a trailing underscore. Citation labels are simple `reference
2784+names`_ (case-insensitive single words, consisting of alphanumerics
2785+plus internal hyphens, underscores, and periods; no whitespace).
2786+
2787+For example::
2788+
2789+ Here is a citation reference: [CIT2002]_.
2790+
2791+See Citations_ for the citation itself.
2792+
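Footnote and citation references share the same start-string and end-string, so they are told apart by the label alone. The sketch below classifies a label according to the descriptions in the two sections above; ``classify_label`` and the returned strings are hypothetical, and the simple-name pattern is an approximation of `reference names`_.

```python
import re

# Approximation of a simple reference name: alphanumerics plus internal
# hyphens, underscores, and periods; no whitespace.
SIMPLE_NAME = re.compile(r"^[A-Za-z0-9]+(?:[-_.][A-Za-z0-9]+)*$")


def classify_label(label):
    """Classify the text between "[" and "]_"; return None if invalid."""
    if re.fullmatch(r"[0-9]+", label):
        return "numbered footnote"
    if label == "#":
        return "auto-numbered footnote"
    if label.startswith("#") and SIMPLE_NAME.match(label[1:]):
        return "autonumber label"
    if label == "*":
        return "auto-symbol footnote"
    if SIMPLE_NAME.match(label):
        return "citation"
    return None
```

The digit check runs before the citation check, so ``[1]_`` is a footnote reference and ``[CIT2002]_`` is a citation reference.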
2793+
2794+Substitution References
2795+-----------------------
2796+
2797+Doctree element: substitution_reference, reference.
2798+
2799+Start-string = "|", end-string = "|" (optionally followed by "_" or
2800+"__").
2801+
2802+Vertical bars are used to bracket the substitution reference text. A
2803+substitution reference may also be a hyperlink reference by appending
2804+a "_" (named) or "__" (anonymous) suffix; the substitution text is
2805+used for the reference text in the named case.
2806+
2807+The processing system replaces substitution references with the
2808+processed contents of the corresponding `substitution definitions`_
2809+(which see for the definition of "correspond"). Substitution
2810+definitions produce inline-compatible elements.
2811+
2812+Examples::
2813+
2814+ This is a simple |substitution reference|. It will be replaced by
2815+ the processing system.
2816+
2817+ This is a combination |substitution and hyperlink reference|_. In
2818+ addition to being replaced, the replacement text or element will
2819+ refer to the "substitution and hyperlink reference" target.
2820+
2821+
2822+Standalone Hyperlinks
2823+---------------------
2824+
2825+Doctree element: reference.
2826+
2827+Start-string = end-string = "" (empty string).
2828+
2829+A URI (absolute URI [#URI]_ or standalone email address) within a text
2830+block is treated as a general external hyperlink with the URI itself
2831+as the link's text. For example::
2832+
2833+ See http://www.python.org for info.
2834+
2835+would be marked up in HTML as::
2836+
2837+ See <a href="http://www.python.org">http://www.python.org</a> for
2838+ info.
2839+
2840+Two forms of URI are recognized:
2841+
2842+1. Absolute URIs. These consist of a scheme, a colon (":"), and a
2843+ scheme-specific part whose interpretation depends on the scheme.
2844+
2845+ The scheme is the name of the protocol, such as "http", "ftp",
2846+ "mailto", or "telnet". The scheme consists of an initial letter,
2847+ followed by letters, numbers, and/or "+", "-", ".". Recognition is
2848+ limited to known schemes, per the `Official IANA Registry of URI
2849+ Schemes`_ and the W3C's `Retired Index of WWW Addressing Schemes`_.
2850+
2851+ The scheme-specific part of the resource identifier may be either
2852+ hierarchical or opaque:
2853+
2854+ - Hierarchical identifiers begin with one or two slashes and may
2855+ use slashes to separate hierarchical components of the path.
2856+ Examples are web pages and FTP sites::
2857+
2858+ http://www.python.org
2859+
2860+ ftp://ftp.python.org/pub/python
2861+
2862+ - Opaque identifiers do not begin with slashes. Examples are
2863+ email addresses and newsgroups::
2864+
2865+ mailto:someone@somewhere.com
2866+
2867+ news:comp.lang.python
2868+
2869+ With queries, fragments, and %-escape sequences, URIs can become
2870+ quite complicated. A reStructuredText parser must be able to
2871+ recognize any absolute URI, as defined in RFC2396_ and RFC2732_.
2872+
2873+2. Standalone email addresses, which are treated as if they were
2874+ absolute URIs with a "mailto:" scheme. Example::
2875+
2876+ someone@somewhere.com
2877+
2878+Punctuation at the end of a URI is not considered part of the URI,
2879+unless the URI is terminated by a closing angle bracket (">").
2880+Backslashes may be used in URIs to escape markup characters,
2881+specifically asterisks ("*") and underscores ("_") which are valid URI
2882+characters (see `Escaping Mechanism`_ above).
2883+
2884+.. [#URI] Uniform Resource Identifier. URIs are a general form of
2885+ URLs (Uniform Resource Locators). For the syntax of URIs see
2886+ RFC2396_ and RFC2732_.
2887+
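The two recognized forms of URI can be sketched as follows. This is a simplified illustration, not the parser's actual recognition logic: ``KNOWN_SCHEMES`` is a small illustrative subset standing in for the registries cited above, ``classify_uri`` is a hypothetical helper, and the email pattern is far looser than a real address grammar.

```python
import re

# Illustrative subset only; real recognition is limited to registered schemes.
KNOWN_SCHEMES = {"http", "https", "ftp", "mailto", "news", "telnet"}

# Scheme: an initial letter followed by letters, digits, "+", "-", or ".".
ABSOLUTE = re.compile(r"^(?P<scheme>[A-Za-z][A-Za-z0-9+.-]*):(?P<rest>\S+)$")
EMAIL = re.compile(r"^[^\s@:]+@[^\s@:]+\.[^\s@:]+$")


def classify_uri(token):
    """Return (kind, uri) for a recognized standalone hyperlink, else None."""
    match = ABSOLUTE.match(token)
    if match and match.group("scheme").lower() in KNOWN_SCHEMES:
        # Hierarchical identifiers begin with one or two slashes.
        kind = "hierarchical" if match.group("rest").startswith("/") else "opaque"
        return kind, token
    if EMAIL.match(token):
        # Standalone addresses are treated as if they had a "mailto:" scheme.
        return "email", "mailto:" + token
    return None
```

A token with an unknown scheme, such as ``foo:bar``, is left alone, which mirrors the rule that recognition is limited to known schemes.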
2888+
2889+Units
2890+=====
2891+
2892+(New in Docutils 0.3.10.)
2893+
2894+All measures consist of a positive floating point number in standard
2895+(non-scientific) notation and a unit, possibly separated by one or
2896+more spaces.
2897+
2898+Units are only supported where explicitly mentioned in the reference
2899+manuals.
2900+
2901+
2902+Length Units
2903+------------
2904+
2905+The following length units are supported by the reStructuredText
2906+parser:
2907+
2908+* em (ems, the height of the element's font)
2909+* ex (x-height, the height of the letter "x")
2910+* px (pixels, relative to the canvas resolution)
2911+* in (inches; 1in=2.54cm)
2912+* cm (centimeters; 1cm=10mm)
2913+* mm (millimeters)
2914+* pt (points; 1pt=1/72in)
2915+* pc (picas; 1pc=12pt)
2916+
2917+This set corresponds to the `length units in CSS`_.
2918+
2919+(List and explanations taken from
2920+http://www.htmlhelp.com/reference/css/units.html#length.)
2921+
2922+The following are all valid length values: "1.5em", "20 mm", ".5in".
2923+
2924+Length values without unit are completed with a writer-dependent
2925+default (e.g. px with `html4css1`, pt with `latex2e`). See the writer
2926+specific documentation in the `user doc`__ for details.
2927+
2928+.. _length units in CSS:
2929+ http://www.w3.org/TR/CSS2/syndata.html#length-units
2930+
2931+__ ../../user/
2932+
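A length value as described above (a non-scientific number, optional spaces, and an optional unit) can be parsed with one regular expression. The sketch below is an assumption-laden illustration rather than the Docutils code: ``parse_length`` and its ``default_unit`` parameter are hypothetical, with the default standing in for the writer-dependent unit.

```python
import re

LENGTH_UNITS = ("em", "ex", "px", "in", "cm", "mm", "pt", "pc")

# Number in standard (non-scientific) notation, optional spaces, optional unit.
MEASURE = re.compile(
    r"^(?P<value>\d+(?:\.\d*)?|\.\d+)\s*(?P<unit>%s)?$" % "|".join(LENGTH_UNITS)
)


def parse_length(text, default_unit="px"):
    """Return (value, unit); unitless values get `default_unit`."""
    match = MEASURE.match(text.strip())
    if match is None:
        raise ValueError("not a valid length: %r" % text)
    return float(match.group("value")), match.group("unit") or default_unit
```

The three valid examples given above ("1.5em", "20 mm", ".5in") all parse, while a value in scientific notation such as "1e3px" raises ``ValueError``.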
2933+Percentage Units
2934+----------------
2935+
2936+Percentage values have a percent sign ("%") as unit. Percentage
2937+values are relative to other values, depending on the context in which
2938+they occur.
2939+
2940+
2941+----------------
2942+ Error Handling
2943+----------------
2944+
2945+Doctree element: system_message, problematic.
2946+
2947+Markup errors are handled according to the specification in `PEP
2948+258`_.
2949+
2950+
2951+.. _reStructuredText: http://docutils.sourceforge.net/rst.html
2952+.. _Docutils: http://docutils.sourceforge.net/
2953+.. _The Docutils Document Tree: ../doctree.html
2954+.. _Docutils Generic DTD: ../docutils.dtd
2955+.. _transforms:
2956+ http://docutils.sourceforge.net/docutils/transforms/
2957+.. _Grouch: http://www.mems-exchange.org/software/grouch/
2958+.. _RFC822: http://www.rfc-editor.org/rfc/rfc822.txt
2959+.. _DocTitle transform:
2960+.. _DocInfo transform:
2961+ http://docutils.sourceforge.net/docutils/transforms/frontmatter.py
2962+.. _getopt.py:
2963+ http://www.python.org/doc/current/lib/module-getopt.html
2964+.. _GNU libc getopt_long():
2965+ http://www.gnu.org/software/libc/manual/html_node/Getopt-Long-Options.html
2966+.. _doctest module:
2967+ http://www.python.org/doc/current/lib/module-doctest.html
2968+.. _Emacs table mode: http://table.sourceforge.net/
2969+.. _Official IANA Registry of URI Schemes:
2970+ http://www.iana.org/assignments/uri-schemes
2971+.. _Retired Index of WWW Addressing Schemes:
2972+ http://www.w3.org/Addressing/schemes.html
2973+.. _World Wide Web Consortium: http://www.w3.org/
2974+.. _HTML Techniques for Web Content Accessibility Guidelines:
2975+ http://www.w3.org/TR/WCAG10-HTML-TECHS/#link-text
2976+.. _image: directives.html#image
2977+.. _replace: directives.html#replace
2978+.. _meta: directives.html#meta
2979+.. _figure: directives.html#figure
2980+.. _admonition: directives.html#admonitions
2981+.. _role: directives.html#custom-interpreted-text-roles
2982+.. _reStructuredText Directives: directives.html
2983+.. _reStructuredText Interpreted Text Roles: roles.html
2984+.. _RFC2396: http://www.rfc-editor.org/rfc/rfc2396.txt
2985+.. _RFC2732: http://www.rfc-editor.org/rfc/rfc2732.txt
2986+.. _Zope: http://www.zope.com/
2987+.. _PEP 258: ../../peps/pep-0258.html
2988+
2989+
2990
2991+..
2992+ Local Variables:
2993+ mode: indented-text
2994+ indent-tabs-mode: nil
2995+ sentence-end-double-space: t
2996+ fill-column: 70
2997+ End:
2998
2999=== added directory '.pc/support-aliases-in-references.diff/docutils'
3000=== added directory '.pc/support-aliases-in-references.diff/docutils/parsers'
3001=== added directory '.pc/support-aliases-in-references.diff/docutils/parsers/rst'
3002=== added file '.pc/support-aliases-in-references.diff/docutils/parsers/rst/states.py'
3003--- .pc/support-aliases-in-references.diff/docutils/parsers/rst/states.py 1970-01-01 00:00:00 +0000
3004+++ .pc/support-aliases-in-references.diff/docutils/parsers/rst/states.py 2013-03-19 07:30:29 +0000
3005@@ -0,0 +1,3052 @@
3006+# $Id: states.py 7495 2012-08-16 14:50:57Z milde $
3007+# Author: David Goodger <goodger@python.org>
3008+# Copyright: This module has been placed in the public domain.
3009+
3010+"""
3011+This is the ``docutils.parsers.rst.states`` module, the core of
3012+the reStructuredText parser. It defines the following:
3013+
3014+:Classes:
3015+ - `RSTStateMachine`: reStructuredText parser's entry point.
3016+ - `NestedStateMachine`: recursive StateMachine.
3017+ - `RSTState`: reStructuredText State superclass.
3018+ - `Inliner`: For parsing inline markup.
3019+ - `Body`: Generic classifier of the first line of a block.
3020+ - `SpecializedBody`: Superclass for compound element members.
3021+ - `BulletList`: Second and subsequent bullet_list list_items
3022+ - `DefinitionList`: Second+ definition_list_items.
3023+ - `EnumeratedList`: Second+ enumerated_list list_items.
3024+ - `FieldList`: Second+ fields.
3025+ - `OptionList`: Second+ option_list_items.
3026+ - `RFC2822List`: Second+ RFC2822-style fields.
3027+ - `ExtensionOptions`: Parses directive option fields.
3028+ - `Explicit`: Second+ explicit markup constructs.
3029+ - `SubstitutionDef`: For embedded directives in substitution definitions.
3030+ - `Text`: Classifier of second line of a text block.
3031+ - `SpecializedText`: Superclass for continuation lines of Text-variants.
3032+ - `Definition`: Second line of potential definition_list_item.
3033+ - `Line`: Second line of overlined section title or transition marker.
3034+ - `Struct`: An auxiliary collection class.
3035+
3036+:Exception classes:
3037+ - `MarkupError`
3038+ - `ParserError`
3039+ - `MarkupMismatch`
3040+
3041+:Functions:
3042+ - `escape2null()`: Return a string, escape-backslashes converted to nulls.
3043+ - `unescape()`: Return a string, nulls removed or restored to backslashes.
3044+
3045+:Attributes:
3046+ - `state_classes`: set of State classes used with `RSTStateMachine`.
3047+
3048+Parser Overview
3049+===============
3050+
3051+The reStructuredText parser is implemented as a recursive state machine,
3052+examining its input one line at a time. To understand how the parser works,
3053+please first become familiar with the `docutils.statemachine` module. In the
3054+description below, references are made to classes defined in this module;
3055+please see the individual classes for details.
3056+
3057+Parsing proceeds as follows:
3058+
3059+1. The state machine examines each line of input, checking each of the
3060+ transition patterns of the state `Body`, in order, looking for a match.
3061+ The implicit transitions (blank lines and indentation) are checked before
3062+ any others. The 'text' transition is a catch-all (matches anything).
3063+
3064+2. The method associated with the matched transition pattern is called.
3065+
3066+ A. Some transition methods are self-contained, appending elements to the
3067+ document tree (`Body.doctest` parses a doctest block). The parser's
3068+ current line index is advanced to the end of the element, and parsing
3069+ continues with step 1.
3070+
3071+ B. Other transition methods trigger the creation of a nested state machine,
3072+ whose job is to parse a compound construct ('indent' does a block quote,
3073+ 'bullet' does a bullet list, 'overline' does a section [first checking
3074+ for a valid section header], etc.).
3075+
3076+ - In the case of lists and explicit markup, a one-off state machine is
3077+ created and run to parse contents of the first item.
3078+
3079+ - A new state machine is created and its initial state is set to the
3080+ appropriate specialized state (`BulletList` in the case of the
3081+ 'bullet' transition; see `SpecializedBody` for more detail). This
3082+ state machine is run to parse the compound element (or series of
3083+ explicit markup elements), and returns as soon as a non-member element
3084+ is encountered. For example, the `BulletList` state machine ends as
3085+ soon as it encounters an element which is not a list item of that
3086+ bullet list. The optional omission of inter-element blank lines is
3087+ enabled by this nested state machine.
3088+
3089+ - The current line index is advanced to the end of the elements parsed,
3090+ and parsing continues with step 1.
3091+
3092+ C. The result of the 'text' transition depends on the next line of text.
3093+ The current state is changed to `Text`, under which the second line is
3094+ examined. If the second line is:
3095+
3096+ - Indented: The element is a definition list item, and parsing proceeds
3097+ similarly to step 2.B, using the `DefinitionList` state.
3098+
3099+ - A line of uniform punctuation characters: The element is a section
3100+ header; again, parsing proceeds as in step 2.B, and `Body` is still
3101+ used.
3102+
3103+ - Anything else: The element is a paragraph, which is examined for
3104+ inline markup and appended to the parent element. Processing
3105+ continues with step 1.
3106+"""
3107+
3108+__docformat__ = 'reStructuredText'
3109+
3110+
3111+import sys
3112+import re
3113+try:
3114+ import roman
3115+except ImportError:
3116+ import docutils.utils.roman as roman
3117+from types import FunctionType, MethodType
3118+
3119+from docutils import nodes, statemachine, utils
3120+from docutils import ApplicationError, DataError
3121+from docutils.statemachine import StateMachineWS, StateWS
3122+from docutils.nodes import fully_normalize_name as normalize_name
3123+from docutils.nodes import whitespace_normalize_name
3124+import docutils.parsers.rst
3125+from docutils.parsers.rst import directives, languages, tableparser, roles
3126+from docutils.parsers.rst.languages import en as _fallback_language_module
3127+from docutils.utils import escape2null, unescape, column_width
3128+from docutils.utils import punctuation_chars, urischemes
3129+
3130+class MarkupError(DataError): pass
3131+class UnknownInterpretedRoleError(DataError): pass
3132+class InterpretedRoleNotImplementedError(DataError): pass
3133+class ParserError(ApplicationError): pass
3134+class MarkupMismatch(Exception): pass
3135+
3136+
3137+class Struct:
3138+
3139+ """Stores data attributes for dotted-attribute access."""
3140+
3141+ def __init__(self, **keywordargs):
3142+ self.__dict__.update(keywordargs)
3143+
3144+
3145+class RSTStateMachine(StateMachineWS):
3146+
3147+ """
3148+ reStructuredText's master StateMachine.
3149+
3150+ The entry point to reStructuredText parsing is the `run()` method.
3151+ """
3152+
3153+ def run(self, input_lines, document, input_offset=0, match_titles=True,
3154+ inliner=None):
3155+ """
3156+ Parse `input_lines` and modify the `document` node in place.
3157+
3158+ Extend `StateMachineWS.run()`: set up parse-global data and
3159+ run the StateMachine.
3160+ """
3161+ self.language = languages.get_language(
3162+ document.settings.language_code)
3163+ self.match_titles = match_titles
3164+ if inliner is None:
3165+ inliner = Inliner()
3166+ inliner.init_customizations(document.settings)
3167+ self.memo = Struct(document=document,
3168+ reporter=document.reporter,
3169+ language=self.language,
3170+ title_styles=[],
3171+ section_level=0,
3172+ section_bubble_up_kludge=False,
3173+ inliner=inliner)
3174+ self.document = document
3175+ self.attach_observer(document.note_source)
3176+ self.reporter = self.memo.reporter
3177+ self.node = document
3178+ results = StateMachineWS.run(self, input_lines, input_offset,
3179+ input_source=document['source'])
3180+ assert results == [], 'RSTStateMachine.run() results should be empty!'
3181+ self.node = self.memo = None # remove unneeded references
3182+
3183+
3184+class NestedStateMachine(StateMachineWS):
3185+
3186+ """
3187+ StateMachine run from within other StateMachine runs, to parse nested
3188+ document structures.
3189+ """
3190+
3191+ def run(self, input_lines, input_offset, memo, node, match_titles=True):
3192+ """
3193+ Parse `input_lines` and populate a `docutils.nodes.document` instance.
3194+
3195+ Extend `StateMachineWS.run()`: set up document-wide data.
3196+ """
3197+ self.match_titles = match_titles
3198+ self.memo = memo
3199+ self.document = memo.document
3200+ self.attach_observer(self.document.note_source)
3201+ self.reporter = memo.reporter
3202+ self.language = memo.language
3203+ self.node = node
3204+ results = StateMachineWS.run(self, input_lines, input_offset)
3205+ assert results == [], ('NestedStateMachine.run() results should be '
3206+ 'empty!')
3207+ return results
3208+
3209+
3210+class RSTState(StateWS):
3211+
3212+ """
3213+ reStructuredText State superclass.
3214+
3215+ Contains methods used by all State subclasses.
3216+ """
3217+
3218+ nested_sm = NestedStateMachine
3219+ nested_sm_cache = []
3220+
3221+ def __init__(self, state_machine, debug=False):
3222+ self.nested_sm_kwargs = {'state_classes': state_classes,
3223+ 'initial_state': 'Body'}
3224+ StateWS.__init__(self, state_machine, debug)
3225+
3226+ def runtime_init(self):
3227+ StateWS.runtime_init(self)
3228+ memo = self.state_machine.memo
3229+ self.memo = memo
3230+ self.reporter = memo.reporter
3231+ self.inliner = memo.inliner
3232+ self.document = memo.document
3233+ self.parent = self.state_machine.node
3234+ # enable the reporter to determine source and source-line
3235+ if not hasattr(self.reporter, 'get_source_and_line'):
3236+ self.reporter.get_source_and_line = self.state_machine.get_source_and_line
3237+ # print "adding get_source_and_line to reporter", self.state_machine.input_offset
3238+
3239+
3240+ def goto_line(self, abs_line_offset):
3241+ """
3242+ Jump to input line `abs_line_offset`, ignoring jumps past the end.
3243+ """
3244+ try:
3245+ self.state_machine.goto_line(abs_line_offset)
3246+ except EOFError:
3247+ pass
3248+
3249+ def no_match(self, context, transitions):
3250+ """
3251+ Override `StateWS.no_match` to generate a system message.
3252+
3253+ This code should never be run.
3254+ """
3255+ self.reporter.severe(
3256+ 'Internal error: no transition pattern match. State: "%s"; '
3257+ 'transitions: %s; context: %s; current line: %r.'
3258+ % (self.__class__.__name__, transitions, context,
3259+ self.state_machine.line))
3260+ return context, None, []
3261+
3262+ def bof(self, context):
3263+ """Called at beginning of file."""
3264+ return [], []
3265+
3266+ def nested_parse(self, block, input_offset, node, match_titles=False,
3267+ state_machine_class=None, state_machine_kwargs=None):
3268+ """
3269+ Create a new StateMachine rooted at `node` and run it over the input
3270+ `block`.
3271+ """
3272+ use_default = 0
3273+ if state_machine_class is None:
3274+ state_machine_class = self.nested_sm
3275+ use_default += 1
3276+ if state_machine_kwargs is None:
3277+ state_machine_kwargs = self.nested_sm_kwargs
3278+ use_default += 1
3279+ block_length = len(block)
3280+
3281+ state_machine = None
3282+ if use_default == 2:
3283+ try:
3284+ state_machine = self.nested_sm_cache.pop()
3285+ except IndexError:
3286+ pass
3287+ if not state_machine:
3288+ state_machine = state_machine_class(debug=self.debug,
3289+ **state_machine_kwargs)
3290+ state_machine.run(block, input_offset, memo=self.memo,
3291+ node=node, match_titles=match_titles)
3292+ if use_default == 2:
3293+ self.nested_sm_cache.append(state_machine)
3294+ else:
3295+ state_machine.unlink()
3296+ new_offset = state_machine.abs_line_offset()
3297+ # No `block.parent` implies disconnected -- lines aren't in sync:
3298+ if block.parent and (len(block) - block_length) != 0:
3299+ # Adjustment for block if modified in nested parse:
3300+ self.state_machine.next_line(len(block) - block_length)
3301+ return new_offset
3302+
3303+ def nested_list_parse(self, block, input_offset, node, initial_state,
3304+ blank_finish,
3305+ blank_finish_state=None,
3306+ extra_settings={},
3307+ match_titles=False,
3308+ state_machine_class=None,
3309+ state_machine_kwargs=None):
3310+ """
3311+ Create a new StateMachine rooted at `node` and run it over the input
3312+ `block`. Also keep track of optional intermediate blank lines and the
3313+ required final one.
3314+ """
3315+ if state_machine_class is None:
3316+ state_machine_class = self.nested_sm
3317+ if state_machine_kwargs is None:
3318+ state_machine_kwargs = self.nested_sm_kwargs.copy()
3319+ state_machine_kwargs['initial_state'] = initial_state
3320+ state_machine = state_machine_class(debug=self.debug,
3321+ **state_machine_kwargs)
3322+ if blank_finish_state is None:
3323+ blank_finish_state = initial_state
3324+ state_machine.states[blank_finish_state].blank_finish = blank_finish
3325+ for key, value in extra_settings.items():
3326+ setattr(state_machine.states[initial_state], key, value)
3327+ state_machine.run(block, input_offset, memo=self.memo,
3328+ node=node, match_titles=match_titles)
3329+ blank_finish = state_machine.states[blank_finish_state].blank_finish
3330+ state_machine.unlink()
3331+ return state_machine.abs_line_offset(), blank_finish
3332+
3333+ def section(self, title, source, style, lineno, messages):
3334+ """Check for a valid subsection and create one if it checks out."""
3335+ if self.check_subsection(source, style, lineno):
3336+ self.new_subsection(title, lineno, messages)
3337+
3338+ def check_subsection(self, source, style, lineno):
3339+ """
3340+ Check for a valid subsection header. Return 1 (true) or None (false).
3341+
3342+ When a new section is reached that isn't a subsection of the current
3343+ section, back up the line count (use ``previous_line(-x)``), then
3344+ ``raise EOFError``. The current StateMachine will finish, then the
3345+ calling StateMachine can re-examine the title. This will work its way
3346+ back up the calling chain until the correct section level is reached.
3347+
3348+ @@@ Alternative: Evaluate the title, store the title info & level, and
3349+ back up the chain until that level is reached. Store in memo? Or
3350+ return in results?
3351+
3352+ :Exception: `EOFError` when a sibling or supersection encountered.
3353+ """
3354+ memo = self.memo
3355+ title_styles = memo.title_styles
3356+ mylevel = memo.section_level
3357+ try: # check for existing title style
3358+ level = title_styles.index(style) + 1
3359+ except ValueError: # new title style
3360+ if len(title_styles) == memo.section_level: # new subsection
3361+ title_styles.append(style)
3362+ return 1
3363+ else: # not at lowest level
3364+ self.parent += self.title_inconsistent(source, lineno)
3365+ return None
3366+ if level <= mylevel: # sibling or supersection
3367+ memo.section_level = level # bubble up to parent section
3368+ if len(style) == 2:
3369+ memo.section_bubble_up_kludge = True
3370+ # back up 2 lines for underline title, 3 for overline title
3371+ self.state_machine.previous_line(len(style) + 1)
3372+ raise EOFError # let parent section re-evaluate
3373+ if level == mylevel + 1: # immediate subsection
3374+ return 1
3375+ else: # invalid subsection
3376+ self.parent += self.title_inconsistent(source, lineno)
3377+ return None
3378+
3379+ def title_inconsistent(self, sourcetext, lineno):
3380+ error = self.reporter.severe(
3381+ 'Title level inconsistent:', nodes.literal_block('', sourcetext),
3382+ line=lineno)
3383+ return error
3384+
3385+ def new_subsection(self, title, lineno, messages):
3386+ """Append new subsection to document tree. On return, check level."""
3387+ memo = self.memo
3388+ mylevel = memo.section_level
3389+ memo.section_level += 1
3390+ section_node = nodes.section()
3391+ self.parent += section_node
3392+ textnodes, title_messages = self.inline_text(title, lineno)
3393+ titlenode = nodes.title(title, '', *textnodes)
3394+ name = normalize_name(titlenode.astext())
3395+ section_node['names'].append(name)
3396+ section_node += titlenode
3397+ section_node += messages
3398+ section_node += title_messages
3399+ self.document.note_implicit_target(section_node, section_node)
3400+ offset = self.state_machine.line_offset + 1
3401+ absoffset = self.state_machine.abs_line_offset() + 1
3402+ newabsoffset = self.nested_parse(
3403+ self.state_machine.input_lines[offset:], input_offset=absoffset,
3404+ node=section_node, match_titles=True)
3405+ self.goto_line(newabsoffset)
3406+ if memo.section_level <= mylevel: # can't handle next section?
3407+ raise EOFError # bubble up to supersection
3408+ # reset section_level; next pass will detect it properly
3409+ memo.section_level = mylevel
3410+
3411+ def paragraph(self, lines, lineno):
3412+ """
3413+ Return a list (paragraph & messages) & a boolean: literal_block next?
3414+ """
3415+ data = '\n'.join(lines).rstrip()
3416+ if re.search(r'(?<!\\)(\\\\)*::$', data):
3417+ if len(data) == 2:
3418+ return [], 1
3419+ elif data[-3] in ' \n':
3420+ text = data[:-3].rstrip()
3421+ else:
3422+ text = data[:-1]
3423+ literalnext = 1
3424+ else:
3425+ text = data
3426+ literalnext = 0
3427+ textnodes, messages = self.inline_text(text, lineno)
3428+ p = nodes.paragraph(data, '', *textnodes)
3429+ p.source, p.line = self.state_machine.get_source_and_line(lineno)
3430+ return [p] + messages, literalnext
3431+
3432+ def inline_text(self, text, lineno):
3433+ """
3434+ Return 2 lists: nodes (text and inline elements), and system_messages.
3435+ """
3436+ return self.inliner.parse(text, lineno, self.memo, self.parent)
3437+
3438+ def unindent_warning(self, node_name):
3439+ # the actual problem is one line below the current line
3440+ lineno = self.state_machine.abs_line_number()+1
3441+ return self.reporter.warning('%s ends without a blank line; '
3442+ 'unexpected unindent.' % node_name,
3443+ line=lineno)
3444+
3445+
3446+def build_regexp(definition, compile=True):
3447+ """
3448+ Build, compile and return a regular expression based on `definition`.
3449+
3450+ :Parameter: `definition`: a 4-tuple (group name, prefix, suffix, parts),
3451+ where "parts" is a list of regular expressions and/or regular
3452+ expression definitions to be joined into an or-group.
3453+ """
3454+ name, prefix, suffix, parts = definition
3455+ part_strings = []
3456+ for part in parts:
3457+ if type(part) is tuple:
3458+ part_strings.append(build_regexp(part, None))
3459+ else:
3460+ part_strings.append(part)
3461+ or_group = '|'.join(part_strings)
3462+ regexp = '%(prefix)s(?P<%(name)s>%(or_group)s)%(suffix)s' % locals()
3463+ if compile:
3464+ return re.compile(regexp, re.UNICODE)
3465+ else:
3466+ return regexp
3467+
3468+
3469+class Inliner:
3470+
3471+ """
3472+ Parse inline markup; call the `parse()` method.
3473+ """
3474+
3475+ def __init__(self):
3476+ self.implicit_dispatch = [(self.patterns.uri, self.standalone_uri),]
3477+ """List of (pattern, bound method) tuples, used by
3478+ `self.implicit_inline`."""
3479+
3480+ def init_customizations(self, settings):
3481+ """Setting-based customizations; run when parsing begins."""
3482+ if settings.pep_references:
3483+ self.implicit_dispatch.append((self.patterns.pep,
3484+ self.pep_reference))
3485+ if settings.rfc_references:
3486+ self.implicit_dispatch.append((self.patterns.rfc,
3487+ self.rfc_reference))
3488+
3489+ def parse(self, text, lineno, memo, parent):
3490+ # Needs to be refactored for nested inline markup.
3491+ # Add nested_parse() method?
3492+ """
3493+ Return 2 lists: nodes (text and inline elements), and system_messages.
3494+
3495+ Using `self.patterns.initial`, a pattern which matches start-strings
3496+ (emphasis, strong, interpreted, phrase reference, literal,
3497+ substitution reference, and inline target) and complete constructs
3498+ (simple reference, footnote reference), search for a candidate. When
3499+ one is found, check for validity (e.g., not a quoted '*' character).
3500+ If valid, search for the corresponding end string if applicable, and
3501+ check it for validity. If not found or invalid, generate a warning
3502+ and ignore the start-string. Implicit inline markup (e.g. standalone
3503+ URIs) is found last.
3504+ """
3505+ self.reporter = memo.reporter
3506+ self.document = memo.document
3507+ self.language = memo.language
3508+ self.parent = parent
3509+ pattern_search = self.patterns.initial.search
3510+ dispatch = self.dispatch
3511+ remaining = escape2null(text)
3512+ processed = []
3513+ unprocessed = []
3514+ messages = []
3515+ while remaining:
3516+ match = pattern_search(remaining)
3517+ if match:
3518+ groups = match.groupdict()
3519+ method = dispatch[groups['start'] or groups['backquote']
3520+ or groups['refend'] or groups['fnend']]
3521+ before, inlines, remaining, sysmessages = method(self, match,
3522+ lineno)
3523+ unprocessed.append(before)
3524+ messages += sysmessages
3525+ if inlines:
3526+ processed += self.implicit_inline(''.join(unprocessed),
3527+ lineno)
3528+ processed += inlines
3529+ unprocessed = []
3530+ else:
3531+ break
3532+ remaining = ''.join(unprocessed) + remaining
3533+ if remaining:
3534+ processed += self.implicit_inline(remaining, lineno)
3535+ return processed, messages
3536+
3537+ # Inline object recognition
3538+ # -------------------------
3539+ # lookahead and look-behind expressions for inline markup rules
3540+ start_string_prefix = (u'(^|(?<=\\s|[%s%s]))' %
3541+ (punctuation_chars.openers,
3542+ punctuation_chars.delimiters))
3543+ end_string_suffix = (u'($|(?=\\s|[\x00%s%s%s]))' %
3544+ (punctuation_chars.closing_delimiters,
3545+ punctuation_chars.delimiters,
3546+ punctuation_chars.closers))
3547+ # print start_string_prefix.encode('utf8')
3548+ # TODO: support non-ASCII whitespace in the following 4 patterns?
3549+ non_whitespace_before = r'(?<![ \n])'
3550+ non_whitespace_escape_before = r'(?<![ \n\x00])'
3551+ non_unescaped_whitespace_escape_before = r'(?<!(?<!\x00)[ \n\x00])'
3552+ non_whitespace_after = r'(?![ \n])'
3553+ # Alphanumerics with isolated internal [-._+:] chars (i.e. not 2 together):
3554+ simplename = r'(?:(?!_)\w)+(?:[-._+:](?:(?!_)\w)+)*'
3555+ # Valid URI characters (see RFC 2396 & RFC 2732);
3556+ # final \x00 allows backslash escapes in URIs:
3557+ uric = r"""[-_.!~*'()[\];/:@&=+$,%a-zA-Z0-9\x00]"""
3558+ # Delimiter indicating the end of a URI (not part of the URI):
3559+ uri_end_delim = r"""[>]"""
3560+ # Last URI character; same as uric but no punctuation:
3561+ urilast = r"""[_~*/=+a-zA-Z0-9]"""
3562+ # End of a URI (either 'urilast' or 'uric followed by a
3563+ # uri_end_delim'):
3564+ uri_end = r"""(?:%(urilast)s|%(uric)s(?=%(uri_end_delim)s))""" % locals()
3565+ emailc = r"""[-_!~*'{|}/#?^`&=+$%a-zA-Z0-9\x00]"""
3566+ email_pattern = r"""
3567+ %(emailc)s+(?:\.%(emailc)s+)* # name
3568+ (?<!\x00)@ # at
3569+ %(emailc)s+(?:\.%(emailc)s*)* # host
3570+ %(uri_end)s # final URI char
3571+ """
3572+ parts = ('initial_inline', start_string_prefix, '',
3573+ [('start', '', non_whitespace_after, # simple start-strings
3574+ [r'\*\*', # strong
3575+ r'\*(?!\*)', # emphasis but not strong
3576+ r'``', # literal
3577+ r'_`', # inline internal target
3578+ r'\|(?!\|)'] # substitution reference
3579+ ),
3580+ ('whole', '', end_string_suffix, # whole constructs
3581+ [# reference name & end-string
3582+ r'(?P<refname>%s)(?P<refend>__?)' % simplename,
3583+ ('footnotelabel', r'\[', r'(?P<fnend>\]_)',
3584+ [r'[0-9]+', # manually numbered
3585+ r'\#(%s)?' % simplename, # auto-numbered (w/ label?)
3586+ r'\*', # auto-symbol
3587+ r'(?P<citationlabel>%s)' % simplename] # citation reference
3588+ )
3589+ ]
3590+ ),
3591+ ('backquote', # interpreted text or phrase reference
3592+ '(?P<role>(:%s:)?)' % simplename, # optional role
3593+ non_whitespace_after,
3594+ ['`(?!`)'] # but not literal
3595+ )
3596+ ]
3597+ )
3598+ patterns = Struct(
3599+ initial=build_regexp(parts),
3600+ emphasis=re.compile(non_whitespace_escape_before
3601+ + r'(\*)' + end_string_suffix, re.UNICODE),
3602+ strong=re.compile(non_whitespace_escape_before
3603+ + r'(\*\*)' + end_string_suffix, re.UNICODE),
3604+ interpreted_or_phrase_ref=re.compile(
3605+ r"""
3606+ %(non_unescaped_whitespace_escape_before)s
3607+ (
3608+ `
3609+ (?P<suffix>
3610+ (?P<role>:%(simplename)s:)?
3611+ (?P<refend>__?)?
3612+ )
3613+ )
3614+ %(end_string_suffix)s
3615+ """ % locals(), re.VERBOSE | re.UNICODE),
3616+ embedded_uri=re.compile(
3617+ r"""
3618+ (
3619+ (?:[ \n]+|^) # spaces or beginning of line/string
3620+ < # open bracket
3621+ %(non_whitespace_after)s
3622+ ([^<>\x00]+) # anything but angle brackets & nulls
3623+ %(non_whitespace_before)s
3624+ > # close bracket w/o whitespace before
3625+ )
3626+ $ # end of string
3627+ """ % locals(), re.VERBOSE | re.UNICODE),
3628+ literal=re.compile(non_whitespace_before + '(``)'
3629+ + end_string_suffix),
3630+ target=re.compile(non_whitespace_escape_before
3631+ + r'(`)' + end_string_suffix),
3632+ substitution_ref=re.compile(non_whitespace_escape_before
3633+ + r'(\|_{0,2})'
3634+ + end_string_suffix),
3635+ email=re.compile(email_pattern % locals() + '$',
3636+ re.VERBOSE | re.UNICODE),
3637+ uri=re.compile(
3638+ (r"""
3639+ %(start_string_prefix)s
3640+ (?P<whole>
3641+ (?P<absolute> # absolute URI
3642+ (?P<scheme> # scheme (http, ftp, mailto)
3643+ [a-zA-Z][a-zA-Z0-9.+-]*
3644+ )
3645+ :
3646+ (
3647+ ( # either:
3648+ (//?)? # hierarchical URI
3649+ %(uric)s* # URI characters
3650+ %(uri_end)s # final URI char
3651+ )
3652+ ( # optional query
3653+ \?%(uric)s*
3654+ %(uri_end)s
3655+ )?
3656+ ( # optional fragment
3657+ \#%(uric)s*
3658+ %(uri_end)s
3659+ )?
3660+ )
3661+ )
3662+ | # *OR*
3663+ (?P<email> # email address
3664+ """ + email_pattern + r"""
3665+ )
3666+ )
3667+ %(end_string_suffix)s
3668+ """) % locals(), re.VERBOSE | re.UNICODE),
3669+ pep=re.compile(
3670+ r"""
3671+ %(start_string_prefix)s
3672+ (
3673+ (pep-(?P<pepnum1>\d+)(.txt)?) # reference to source file
3674+ |
3675+ (PEP\s+(?P<pepnum2>\d+)) # reference by name
3676+ )
3677+ %(end_string_suffix)s""" % locals(), re.VERBOSE | re.UNICODE),
3678+ rfc=re.compile(
3679+ r"""
3680+ %(start_string_prefix)s
3681+ (RFC(-|\s+)?(?P<rfcnum>\d+))
3682+ %(end_string_suffix)s""" % locals(), re.VERBOSE | re.UNICODE))
3683+
3684+ def quoted_start(self, match):
3685+ """Test if inline markup start-string is 'quoted'.
3686+
3687+ 'Quoted' in this context means the start-string is enclosed in a pair
3688+ of matching opening/closing delimiters (not necessarily quotes)
3689+ or at the end of the match.
3690+ """
3691+ string = match.string
3692+ start = match.start()
3693+ if start == 0: # start-string at beginning of text
3694+ return False
3695+ prestart = string[start - 1]
3696+ try:
3697+ poststart = string[match.end()]
3698+ except IndexError: # start-string at end of text
3699+ return True # not "quoted" but no markup start-string either
3700+ return punctuation_chars.match_chars(prestart, poststart)
3701+
3702+ def inline_obj(self, match, lineno, end_pattern, nodeclass,
3703+ restore_backslashes=False):
3704+ string = match.string
3705+ matchstart = match.start('start')
3706+ matchend = match.end('start')
3707+ if self.quoted_start(match):
3708+ return (string[:matchend], [], string[matchend:], [], '')
3709+ endmatch = end_pattern.search(string[matchend:])
3710+ if endmatch and endmatch.start(1): # 1 or more chars
3711+ text = unescape(endmatch.string[:endmatch.start(1)],
3712+ restore_backslashes)
3713+ textend = matchend + endmatch.end(1)
3714+ rawsource = unescape(string[matchstart:textend], 1)
3715+ return (string[:matchstart], [nodeclass(rawsource, text)],
3716+ string[textend:], [], endmatch.group(1))
3717+ msg = self.reporter.warning(
3718+ 'Inline %s start-string without end-string.'
3719+ % nodeclass.__name__, line=lineno)
3720+ text = unescape(string[matchstart:matchend], 1)
3721+ rawsource = unescape(string[matchstart:matchend], 1)
3722+ prb = self.problematic(text, rawsource, msg)
3723+ return string[:matchstart], [prb], string[matchend:], [msg], ''
3724+
3725+ def problematic(self, text, rawsource, message):
3726+ msgid = self.document.set_id(message, self.parent)
3727+ problematic = nodes.problematic(rawsource, text, refid=msgid)
3728+ prbid = self.document.set_id(problematic)
3729+ message.add_backref(prbid)
3730+ return problematic
3731+
3732+ def emphasis(self, match, lineno):
3733+ before, inlines, remaining, sysmessages, endstring = self.inline_obj(
3734+ match, lineno, self.patterns.emphasis, nodes.emphasis)
3735+ return before, inlines, remaining, sysmessages
3736+
3737+ def strong(self, match, lineno):
3738+ before, inlines, remaining, sysmessages, endstring = self.inline_obj(
3739+ match, lineno, self.patterns.strong, nodes.strong)
3740+ return before, inlines, remaining, sysmessages
3741+
3742+ def interpreted_or_phrase_ref(self, match, lineno):
3743+ end_pattern = self.patterns.interpreted_or_phrase_ref
3744+ string = match.string
3745+ matchstart = match.start('backquote')
3746+ matchend = match.end('backquote')
3747+ rolestart = match.start('role')
3748+ role = match.group('role')
3749+ position = ''
3750+ if role:
3751+ role = role[1:-1]
3752+ position = 'prefix'
3753+ elif self.quoted_start(match):
3754+ return (string[:matchend], [], string[matchend:], [])
3755+ endmatch = end_pattern.search(string[matchend:])
3756+ if endmatch and endmatch.start(1): # 1 or more chars
3757+ textend = matchend + endmatch.end()
3758+ if endmatch.group('role'):
3759+ if role:
3760+ msg = self.reporter.warning(
3761+ 'Multiple roles in interpreted text (both '
3762+ 'prefix and suffix present; only one allowed).',
3763+ line=lineno)
3764+ text = unescape(string[rolestart:textend], 1)
3765+ prb = self.problematic(text, text, msg)
3766+ return string[:rolestart], [prb], string[textend:], [msg]
3767+ role = endmatch.group('suffix')[1:-1]
3768+ position = 'suffix'
3769+ escaped = endmatch.string[:endmatch.start(1)]
3770+ rawsource = unescape(string[matchstart:textend], 1)
3771+ if rawsource[-1:] == '_':
3772+ if role:
3773+ msg = self.reporter.warning(
3774+ 'Mismatch: both interpreted text role %s and '
3775+ 'reference suffix.' % position, line=lineno)
3776+ text = unescape(string[rolestart:textend], 1)
3777+ prb = self.problematic(text, text, msg)
3778+ return string[:rolestart], [prb], string[textend:], [msg]
3779+ return self.phrase_ref(string[:matchstart], string[textend:],
3780+ rawsource, escaped, unescape(escaped))
3781+ else:
3782+ rawsource = unescape(string[rolestart:textend], 1)
3783+ nodelist, messages = self.interpreted(rawsource, escaped, role,
3784+ lineno)
3785+ return (string[:rolestart], nodelist,
3786+ string[textend:], messages)
3787+ msg = self.reporter.warning(
3788+ 'Inline interpreted text or phrase reference start-string '
3789+ 'without end-string.', line=lineno)
3790+ text = unescape(string[matchstart:matchend], 1)
3791+ prb = self.problematic(text, text, msg)
3792+ return string[:matchstart], [prb], string[matchend:], [msg]
3793+
3794+ def phrase_ref(self, before, after, rawsource, escaped, text):
3795+ match = self.patterns.embedded_uri.search(escaped)
3796+ if match:
3797+ text = unescape(escaped[:match.start(0)])
3798+ uri_text = match.group(2)
3799+ uri = ''.join(uri_text.split())
3800+ uri = self.adjust_uri(uri)
3801+ if uri:
3802+ target = nodes.target(match.group(1), refuri=uri)
3803+ target.referenced = 1
3804+ else:
3805+ raise ApplicationError('problem with URI: %r' % uri_text)
3806+ if not text:
3807+ text = uri
3808+ else:
3809+ target = None
3810+ refname = normalize_name(text)
3811+ reference = nodes.reference(rawsource, text,
3812+ name=whitespace_normalize_name(text))
3813+ node_list = [reference]
3814+ if rawsource[-2:] == '__':
3815+ if target:
3816+ reference['refuri'] = uri
3817+ else:
3818+ reference['anonymous'] = 1
3819+ else:
3820+ if target:
3821+ reference['refuri'] = uri
3822+ target['names'].append(refname)
3823+ self.document.note_explicit_target(target, self.parent)
3824+ node_list.append(target)
3825+ else:
3826+ reference['refname'] = refname
3827+ self.document.note_refname(reference)
3828+ return before, node_list, after, []
3829+
3830+ def adjust_uri(self, uri):
3831+ match = self.patterns.email.match(uri)
3832+ if match:
3833+ return 'mailto:' + uri
3834+ else:
3835+ return uri
3836+
3837+ def interpreted(self, rawsource, text, role, lineno):
3838+ role_fn, messages = roles.role(role, self.language, lineno,
3839+ self.reporter)
3840+ if role_fn:
3841+ nodes, messages2 = role_fn(role, rawsource, text, lineno, self)
3842+ return nodes, messages + messages2
3843+ else:
3844+ msg = self.reporter.error(
3845+ 'Unknown interpreted text role "%s".' % role,
3846+ line=lineno)
3847+ return ([self.problematic(rawsource, rawsource, msg)],
3848+ messages + [msg])
3849+
3850+ def literal(self, match, lineno):
3851+ before, inlines, remaining, sysmessages, endstring = self.inline_obj(
3852+ match, lineno, self.patterns.literal, nodes.literal,
3853+ restore_backslashes=True)
3854+ return before, inlines, remaining, sysmessages
3855+
3856+ def inline_internal_target(self, match, lineno):
3857+ before, inlines, remaining, sysmessages, endstring = self.inline_obj(
3858+ match, lineno, self.patterns.target, nodes.target)
3859+ if inlines and isinstance(inlines[0], nodes.target):
3860+ assert len(inlines) == 1
3861+ target = inlines[0]
3862+ name = normalize_name(target.astext())
3863+ target['names'].append(name)
3864+ self.document.note_explicit_target(target, self.parent)
3865+ return before, inlines, remaining, sysmessages
3866+
3867+ def substitution_reference(self, match, lineno):
3868+ before, inlines, remaining, sysmessages, endstring = self.inline_obj(
3869+ match, lineno, self.patterns.substitution_ref,
3870+ nodes.substitution_reference)
3871+ if len(inlines) == 1:
3872+ subref_node = inlines[0]
3873+ if isinstance(subref_node, nodes.substitution_reference):
3874+ subref_text = subref_node.astext()
3875+ self.document.note_substitution_ref(subref_node, subref_text)
3876+ if endstring[-1:] == '_':
3877+ reference_node = nodes.reference(
3878+ '|%s%s' % (subref_text, endstring), '')
3879+ if endstring[-2:] == '__':
3880+ reference_node['anonymous'] = 1
3881+ else:
3882+ reference_node['refname'] = normalize_name(subref_text)
3883+ self.document.note_refname(reference_node)
3884+ reference_node += subref_node
3885+ inlines = [reference_node]
3886+ return before, inlines, remaining, sysmessages
3887+
3888+ def footnote_reference(self, match, lineno):
3889+ """
3890+ Handles `nodes.footnote_reference` and `nodes.citation_reference`
3891+ elements.
3892+ """
3893+ label = match.group('footnotelabel')
3894+ refname = normalize_name(label)
3895+ string = match.string
3896+ before = string[:match.start('whole')]
3897+ remaining = string[match.end('whole'):]
3898+ if match.group('citationlabel'):
3899+ refnode = nodes.citation_reference('[%s]_' % label,
3900+ refname=refname)
3901+ refnode += nodes.Text(label)
3902+ self.document.note_citation_ref(refnode)
3903+ else:
3904+ refnode = nodes.footnote_reference('[%s]_' % label)
3905+ if refname[0] == '#':
3906+ refname = refname[1:]
3907+ refnode['auto'] = 1
3908+ self.document.note_autofootnote_ref(refnode)
3909+ elif refname == '*':
3910+ refname = ''
3911+ refnode['auto'] = '*'
3912+ self.document.note_symbol_footnote_ref(
3913+ refnode)
3914+ else:
3915+ refnode += nodes.Text(label)
3916+ if refname:
3917+ refnode['refname'] = refname
3918+ self.document.note_footnote_ref(refnode)
3919+ if utils.get_trim_footnote_ref_space(self.document.settings):
3920+ before = before.rstrip()
3921+ return (before, [refnode], remaining, [])
3922+
3923+ def reference(self, match, lineno, anonymous=False):
3924+ referencename = match.group('refname')
3925+ refname = normalize_name(referencename)
3926+ referencenode = nodes.reference(
3927+ referencename + match.group('refend'), referencename,
3928+ name=whitespace_normalize_name(referencename))
3929+ if anonymous:
3930+ referencenode['anonymous'] = 1
3931+ else:
3932+ referencenode['refname'] = refname
3933+ self.document.note_refname(referencenode)
3934+ string = match.string
3935+ matchstart = match.start('whole')
3936+ matchend = match.end('whole')
3937+ return (string[:matchstart], [referencenode], string[matchend:], [])
3938+
3939+ def anonymous_reference(self, match, lineno):
3940+ return self.reference(match, lineno, anonymous=1)
3941+
3942+ def standalone_uri(self, match, lineno):
3943+ if (not match.group('scheme')
3944+ or match.group('scheme').lower() in urischemes.schemes):
3945+ if match.group('email'):
3946+ addscheme = 'mailto:'
3947+ else:
3948+ addscheme = ''
3949+ text = match.group('whole')
3950+ unescaped = unescape(text, 0)
3951+ return [nodes.reference(unescape(text, 1), unescaped,
3952+ refuri=addscheme + unescaped)]
3953+ else: # not a valid scheme
3954+ raise MarkupMismatch
3955+
3956+ def pep_reference(self, match, lineno):
3957+ text = match.group(0)
3958+ if text.startswith('pep-'):
3959+ pepnum = int(match.group('pepnum1'))
3960+ elif text.startswith('PEP'):
3961+ pepnum = int(match.group('pepnum2'))
3962+ else:
3963+ raise MarkupMismatch
3964+ ref = (self.document.settings.pep_base_url
3965+ + self.document.settings.pep_file_url_template % pepnum)
3966+ unescaped = unescape(text, 0)
3967+ return [nodes.reference(unescape(text, 1), unescaped, refuri=ref)]
3968+
3969+ rfc_url = 'rfc%d.html'
3970+
3971+ def rfc_reference(self, match, lineno):
3972+ text = match.group(0)
3973+ if text.startswith('RFC'):
3974+ rfcnum = int(match.group('rfcnum'))
3975+ ref = self.document.settings.rfc_base_url + self.rfc_url % rfcnum
3976+ else:
3977+ raise MarkupMismatch
3978+ unescaped = unescape(text, 0)
3979+ return [nodes.reference(unescape(text, 1), unescaped, refuri=ref)]
3980+
3981+ def implicit_inline(self, text, lineno):
3982+ """
3983+ Check each of the patterns in `self.implicit_dispatch` for a match,
3984+ and dispatch to the stored method for the pattern. Recursively check
3985+ the text before and after the match. Return a list of `nodes.Text`
3986+ and inline element nodes.
3987+ """
3988+ if not text:
3989+ return []
3990+ for pattern, method in self.implicit_dispatch:
3991+ match = pattern.search(text)
3992+ if match:
3993+ try:
3994+ # Must recurse on strings before *and* after the match;
3995+ # there may be multiple patterns.
3996+ return (self.implicit_inline(text[:match.start()], lineno)
3997+ + method(match, lineno) +
3998+ self.implicit_inline(text[match.end():], lineno))
3999+ except MarkupMismatch:
4000+ pass
4001+ return [nodes.Text(unescape(text), rawsource=unescape(text, 1))]
4002+
4003+ dispatch = {'*': emphasis,
4004+ '**': strong,
4005+ '`': interpreted_or_phrase_ref,
4006+ '``': literal,
4007+ '_`': inline_internal_target,
4008+ ']_': footnote_reference,
4009+ '|': substitution_reference,
4010+ '_': reference,
4011+ '__': anonymous_reference}
4012+
4013+
4014+def _loweralpha_to_int(s, _zero=(ord('a')-1)):
4015+ return ord(s) - _zero
4016+
4017+def _upperalpha_to_int(s, _zero=(ord('A')-1)):
4018+ return ord(s) - _zero
4019+
4020+def _lowerroman_to_int(s):
4021+ return roman.fromRoman(s.upper())
4022+
4023+
4024+class Body(RSTState):
4025+
4026+ """
4027+ Generic classifier of the first line of a block.
4028+ """
4029+
4030+ double_width_pad_char = tableparser.TableParser.double_width_pad_char
4031+ """Padding character for East Asian double-width text."""
4032+
4033+ enum = Struct()
4034+ """Enumerated list parsing information."""
4035+
4036+ enum.formatinfo = {
4037+ 'parens': Struct(prefix='(', suffix=')', start=1, end=-1),
4038+ 'rparen': Struct(prefix='', suffix=')', start=0, end=-1),
4039+ 'period': Struct(prefix='', suffix='.', start=0, end=-1)}
4040+ enum.formats = enum.formatinfo.keys()
4041+ enum.sequences = ['arabic', 'loweralpha', 'upperalpha',
4042+ 'lowerroman', 'upperroman'] # ORDERED!
4043+ enum.sequencepats = {'arabic': '[0-9]+',
4044+ 'loweralpha': '[a-z]',
4045+ 'upperalpha': '[A-Z]',
4046+ 'lowerroman': '[ivxlcdm]+',
4047+ 'upperroman': '[IVXLCDM]+',}
4048+ enum.converters = {'arabic': int,
4049+ 'loweralpha': _loweralpha_to_int,
4050+ 'upperalpha': _upperalpha_to_int,
4051+ 'lowerroman': _lowerroman_to_int,
4052+ 'upperroman': roman.fromRoman}
4053+
4054+ enum.sequenceregexps = {}
4055+ for sequence in enum.sequences:
4056+ enum.sequenceregexps[sequence] = re.compile(
4057+ enum.sequencepats[sequence] + '$', re.UNICODE)
4058+
4059+ grid_table_top_pat = re.compile(r'\+-[-+]+-\+ *$')
4060+ """Matches the top (& bottom) of a full table."""
4061+
4062+ simple_table_top_pat = re.compile('=+( +=+)+ *$')
4063+ """Matches the top of a simple table."""
4064+
4065+ simple_table_border_pat = re.compile('=+[ =]*$')
4066+ """Matches the bottom & header bottom of a simple table."""
4067+
4068+ pats = {}
4069+ """Fragments of patterns used by transitions."""
4070+
4071+ pats['nonalphanum7bit'] = '[!-/:-@[-`{-~]'
4072+ pats['alpha'] = '[a-zA-Z]'
4073+ pats['alphanum'] = '[a-zA-Z0-9]'
4074+ pats['alphanumplus'] = '[a-zA-Z0-9_-]'
4075+ pats['enum'] = ('(%(arabic)s|%(loweralpha)s|%(upperalpha)s|%(lowerroman)s'
4076+ '|%(upperroman)s|#)' % enum.sequencepats)
4077+ pats['optname'] = '%(alphanum)s%(alphanumplus)s*' % pats
4078+ # @@@ Loosen up the pattern? Allow Unicode?
4079+ pats['optarg'] = '(%(alpha)s%(alphanumplus)s*|<[^<>]+>)' % pats
4080+ pats['shortopt'] = r'(-|\+)%(alphanum)s( ?%(optarg)s)?' % pats
4081+ pats['longopt'] = r'(--|/)%(optname)s([ =]%(optarg)s)?' % pats
4082+ pats['option'] = r'(%(shortopt)s|%(longopt)s)' % pats
4083+
4084+ for format in enum.formats:
4085+ pats[format] = '(?P<%s>%s%s%s)' % (
4086+ format, re.escape(enum.formatinfo[format].prefix),
4087+ pats['enum'], re.escape(enum.formatinfo[format].suffix))
4088+
4089+ patterns = {
4090+ 'bullet': u'[-+*\u2022\u2023\u2043]( +|$)',
4091+ 'enumerator': r'(%(parens)s|%(rparen)s|%(period)s)( +|$)' % pats,
4092+ 'field_marker': r':(?![: ])([^:\\]|\\.)*(?<! ):( +|$)',
4093+ 'option_marker': r'%(option)s(, %(option)s)*( +| ?$)' % pats,
4094+ 'doctest': r'>>>( +|$)',
4095+ 'line_block': r'\|( +|$)',
4096+ 'grid_table_top': grid_table_top_pat,
4097+ 'simple_table_top': simple_table_top_pat,
4098+ 'explicit_markup': r'\.\.( +|$)',
4099+ 'anonymous': r'__( +|$)',
4100+ 'line': r'(%(nonalphanum7bit)s)\1* *$' % pats,
4101+ 'text': r''}
4102+ initial_transitions = (
4103+ 'bullet',
4104+ 'enumerator',
4105+ 'field_marker',
4106+ 'option_marker',
4107+ 'doctest',
4108+ 'line_block',
4109+ 'grid_table_top',
4110+ 'simple_table_top',
4111+ 'explicit_markup',
4112+ 'anonymous',
4113+ 'line',
4114+ 'text')
4115+
4116+ def indent(self, match, context, next_state):
4117+ """Block quote."""
4118+ indented, indent, line_offset, blank_finish = \
4119+ self.state_machine.get_indented()
4120+ elements = self.block_quote(indented, line_offset)
4121+ self.parent += elements
4122+ if not blank_finish:
4123+ self.parent += self.unindent_warning('Block quote')
4124+ return context, next_state, []
4125+
4126+ def block_quote(self, indented, line_offset):
4127+ elements = []
4128+ while indented:
4129+ (blockquote_lines,
4130+ attribution_lines,
4131+ attribution_offset,
4132+ indented,
4133+ new_line_offset) = self.split_attribution(indented, line_offset)
4134+ blockquote = nodes.block_quote()
4135+ self.nested_parse(blockquote_lines, line_offset, blockquote)
4136+ elements.append(blockquote)
4137+ if attribution_lines:
4138+ attribution, messages = self.parse_attribution(
4139+ attribution_lines, attribution_offset)
4140+ blockquote += attribution
4141+ elements += messages
4142+ line_offset = new_line_offset
4143+ while indented and not indented[0]:
4144+ indented = indented[1:]
4145+ line_offset += 1
4146+ return elements
4147+
4148+ # U+2014 is an em-dash:
4149+ attribution_pattern = re.compile(u'(---?(?!-)|\u2014) *(?=[^ \\n])',
4150+ re.UNICODE)
4151+
4152+ def split_attribution(self, indented, line_offset):
4153+ """
4154+ Check for a block quote attribution and split it off:
4155+
4156+ * First line after a blank line must begin with a dash ("--", "---",
4157+ em-dash; matches `self.attribution_pattern`).
4158+ * Every line after that must have consistent indentation.
4159+ * Attributions must be preceded by block quote content.
4160+
4161+ Return a tuple of: (block quote content lines, attribution lines,
4162+ attribution offset, remaining indented lines, remaining lines offset).
4163+ """
4164+ blank = None
4165+ nonblank_seen = False
4166+ for i in range(len(indented)):
4167+ line = indented[i].rstrip()
4168+ if line:
4169+ if nonblank_seen and blank == i - 1: # last line blank
4170+ match = self.attribution_pattern.match(line)
4171+ if match:
4172+ attribution_end, indent = self.check_attribution(
4173+ indented, i)
4174+ if attribution_end:
4175+ a_lines = indented[i:attribution_end]
4176+ a_lines.trim_left(match.end(), end=1)
4177+ a_lines.trim_left(indent, start=1)
4178+ return (indented[:i], a_lines,
4179+ i, indented[attribution_end:],
4180+ line_offset + attribution_end)
4181+ nonblank_seen = True
4182+ else:
4183+ blank = i
4184+ else:
4185+ return (indented, None, None, None, None)
4186+
4187+ def check_attribution(self, indented, attribution_start):
4188+ """
4189+ Check attribution shape.
4190+ Return the index past the end of the attribution, and the indent.
4191+ """
4192+ indent = None
4193+ i = attribution_start + 1
4194+ for i in range(attribution_start + 1, len(indented)):
4195+ line = indented[i].rstrip()
4196+ if not line:
4197+ break
4198+ if indent is None:
4199+ indent = len(line) - len(line.lstrip())
4200+ elif len(line) - len(line.lstrip()) != indent:
4201+ return None, None # bad shape; not an attribution
4202+ else:
4203+ # return index of line after last attribution line:
4204+ i += 1
4205+ return i, (indent or 0)
4206+
4207+ def parse_attribution(self, indented, line_offset):
4208+ text = '\n'.join(indented).rstrip()
4209+ lineno = self.state_machine.abs_line_number() + line_offset
4210+ textnodes, messages = self.inline_text(text, lineno)
4211+ node = nodes.attribution(text, '', *textnodes)
4212+ node.source, node.line = self.state_machine.get_source_and_line(lineno)
4213+ return node, messages
4214+
4215+ def bullet(self, match, context, next_state):
4216+ """Bullet list item."""
4217+ bulletlist = nodes.bullet_list()
4218+ self.parent += bulletlist
4219+ bulletlist['bullet'] = match.string[0]
4220+ i, blank_finish = self.list_item(match.end())
4221+ bulletlist += i
4222+ offset = self.state_machine.line_offset + 1 # next line
4223+ new_line_offset, blank_finish = self.nested_list_parse(
4224+ self.state_machine.input_lines[offset:],
4225+ input_offset=self.state_machine.abs_line_offset() + 1,
4226+ node=bulletlist, initial_state='BulletList',
4227+ blank_finish=blank_finish)
4228+ self.goto_line(new_line_offset)
4229+ if not blank_finish:
4230+ self.parent += self.unindent_warning('Bullet list')
4231+ return [], next_state, []
4232+
4233+ def list_item(self, indent):
4234+ if self.state_machine.line[indent:]:
4235+ indented, line_offset, blank_finish = (
4236+ self.state_machine.get_known_indented(indent))
4237+ else:
4238+ indented, indent, line_offset, blank_finish = (
4239+ self.state_machine.get_first_known_indented(indent))
4240+ listitem = nodes.list_item('\n'.join(indented))
4241+ if indented:
4242+ self.nested_parse(indented, input_offset=line_offset,
4243+ node=listitem)
4244+ return listitem, blank_finish
4245+
4246+ def enumerator(self, match, context, next_state):
4247+ """Enumerated List Item"""
4248+ format, sequence, text, ordinal = self.parse_enumerator(match)
4249+ if not self.is_enumerated_list_item(ordinal, sequence, format):
4250+ raise statemachine.TransitionCorrection('text')
4251+ enumlist = nodes.enumerated_list()
4252+ self.parent += enumlist
4253+ if sequence == '#':
4254+ enumlist['enumtype'] = 'arabic'
4255+ else:
4256+ enumlist['enumtype'] = sequence
4257+ enumlist['prefix'] = self.enum.formatinfo[format].prefix
4258+ enumlist['suffix'] = self.enum.formatinfo[format].suffix
4259+ if ordinal != 1:
4260+ enumlist['start'] = ordinal
4261+ msg = self.reporter.info(
4262+ 'Enumerated list start value not ordinal-1: "%s" (ordinal %s)'
4263+ % (text, ordinal))
4264+ self.parent += msg
4265+ listitem, blank_finish = self.list_item(match.end())
4266+ enumlist += listitem
4267+ offset = self.state_machine.line_offset + 1 # next line
4268+ newline_offset, blank_finish = self.nested_list_parse(
4269+ self.state_machine.input_lines[offset:],
4270+ input_offset=self.state_machine.abs_line_offset() + 1,
4271+ node=enumlist, initial_state='EnumeratedList',
4272+ blank_finish=blank_finish,
4273+ extra_settings={'lastordinal': ordinal,
4274+ 'format': format,
4275+ 'auto': sequence == '#'})
4276+ self.goto_line(newline_offset)
4277+ if not blank_finish:
4278+ self.parent += self.unindent_warning('Enumerated list')
4279+ return [], next_state, []
4280+
4281+ def parse_enumerator(self, match, expected_sequence=None):
4282+ """
4283+ Analyze an enumerator and return the results.
4284+
4285+ :Return:
4286+ - the enumerator format ('period', 'parens', or 'rparen'),
4287+ - the sequence used ('arabic', 'loweralpha', 'upperroman', etc.),
4288+ - the text of the enumerator, stripped of formatting, and
4289+ - the ordinal value of the enumerator ('a' -> 1, 'ii' -> 2, etc.;
4290+ ``None`` is returned for invalid enumerator text).
4291+
4292+ The enumerator format has already been determined by the regular
4293+ expression match. If `expected_sequence` is given, that sequence is
4294+ tried first. If not, we check for Roman numeral 1. This way,
4295+ single-character Roman numerals (which are also alphabetical) can be
4296+ matched. If no sequence has been matched, all sequences are checked in
4297+ order.
4298+ """
4299+ groupdict = match.groupdict()
4300+ sequence = ''
4301+ for format in self.enum.formats:
4302+ if groupdict[format]: # was this the format matched?
4303+ break # yes; keep `format`
4304+ else: # shouldn't happen
4305+ raise ParserError('enumerator format not matched')
4306+ text = groupdict[format][self.enum.formatinfo[format].start
4307+ :self.enum.formatinfo[format].end]
4308+ if text == '#':
4309+ sequence = '#'
4310+ elif expected_sequence:
4311+ try:
4312+ if self.enum.sequenceregexps[expected_sequence].match(text):
4313+ sequence = expected_sequence
4314+ except KeyError: # shouldn't happen
4315+ raise ParserError('unknown enumerator sequence: %s'
4316+ % sequence)
4317+ elif text == 'i':
4318+ sequence = 'lowerroman'
4319+ elif text == 'I':
4320+ sequence = 'upperroman'
4321+ if not sequence:
4322+ for sequence in self.enum.sequences:
4323+ if self.enum.sequenceregexps[sequence].match(text):
4324+ break
4325+ else: # shouldn't happen
4326+ raise ParserError('enumerator sequence not matched')
4327+ if sequence == '#':
4328+ ordinal = 1
4329+ else:
4330+ try:
4331+ ordinal = self.enum.converters[sequence](text)
4332+ except roman.InvalidRomanNumeralError:
4333+ ordinal = None
4334+ return format, sequence, text, ordinal
4335+
4336+ def is_enumerated_list_item(self, ordinal, sequence, format):
4337+ """
4338+ Check validity based on the ordinal value and the second line.
4339+
4340+ Return true if the ordinal is valid and the second line is blank,
4341+ indented, or starts with the next enumerator or an auto-enumerator.
4342+ """
4343+ if ordinal is None:
4344+ return None
4345+ try:
4346+ next_line = self.state_machine.next_line()
4347+ except EOFError: # end of input lines
4348+ self.state_machine.previous_line()
4349+ return 1
4350+ else:
4351+ self.state_machine.previous_line()
4352+ if not next_line[:1].strip(): # blank or indented
4353+ return 1
4354+ result = self.make_enumerator(ordinal + 1, sequence, format)
4355+ if result:
4356+ next_enumerator, auto_enumerator = result
4357+ try:
4358+ if ( next_line.startswith(next_enumerator) or
4359+ next_line.startswith(auto_enumerator) ):
4360+ return 1
4361+ except TypeError:
4362+ pass
4363+ return None
4364+
4365+ def make_enumerator(self, ordinal, sequence, format):
4366+ """
4367+ Construct and return the next enumerated list item marker, and an
4368+ auto-enumerator ("#" instead of the regular enumerator).
4369+
4370+ Return ``None`` for invalid (out of range) ordinals.
4371+ """ #"
4372+ if sequence == '#':
4373+ enumerator = '#'
4374+ elif sequence == 'arabic':
4375+ enumerator = str(ordinal)
4376+ else:
4377+ if sequence.endswith('alpha'):
4378+ if ordinal > 26:
4379+ return None
4380+ enumerator = chr(ordinal + ord('a') - 1)
4381+ elif sequence.endswith('roman'):
4382+ try:
4383+ enumerator = roman.toRoman(ordinal)
4384+ except roman.RomanError:
4385+ return None
4386+ else: # shouldn't happen
4387+ raise ParserError('unknown enumerator sequence: "%s"'
4388+ % sequence)
4389+ if sequence.startswith('lower'):
4390+ enumerator = enumerator.lower()
4391+ elif sequence.startswith('upper'):
4392+ enumerator = enumerator.upper()
4393+ else: # shouldn't happen
4394+ raise ParserError('unknown enumerator sequence: "%s"'
4395+ % sequence)
4396+ formatinfo = self.enum.formatinfo[format]
4397+ next_enumerator = (formatinfo.prefix + enumerator + formatinfo.suffix
4398+ + ' ')
4399+ auto_enumerator = formatinfo.prefix + '#' + formatinfo.suffix + ' '
4400+ return next_enumerator, auto_enumerator
4401+
4402+ def field_marker(self, match, context, next_state):
4403+ """Field list item."""
4404+ field_list = nodes.field_list()
4405+ self.parent += field_list
4406+ field, blank_finish = self.field(match)
4407+ field_list += field
4408+ offset = self.state_machine.line_offset + 1 # next line
4409+ newline_offset, blank_finish = self.nested_list_parse(
4410+ self.state_machine.input_lines[offset:],
4411+ input_offset=self.state_machine.abs_line_offset() + 1,
4412+ node=field_list, initial_state='FieldList',
4413+ blank_finish=blank_finish)
4414+ self.goto_line(newline_offset)
4415+ if not blank_finish:
4416+ self.parent += self.unindent_warning('Field list')
4417+ return [], next_state, []
4418+
4419+ def field(self, match):
4420+ name = self.parse_field_marker(match)
4421+ src, srcline = self.state_machine.get_source_and_line()
4422+ lineno = self.state_machine.abs_line_number()
4423+ indented, indent, line_offset, blank_finish = \
4424+ self.state_machine.get_first_known_indented(match.end())
4425+ field_node = nodes.field()
4426+ field_node.source = src
4427+ field_node.line = srcline
4428+ name_nodes, name_messages = self.inline_text(name, lineno)
4429+ field_node += nodes.field_name(name, '', *name_nodes)
4430+ field_body = nodes.field_body('\n'.join(indented), *name_messages)
4431+ field_node += field_body
4432+ if indented:
4433+ self.parse_field_body(indented, line_offset, field_body)
4434+ return field_node, blank_finish
4435+
4436+ def parse_field_marker(self, match):
4437+ """Extract & return field name from a field marker match."""
4438+ field = match.group()[1:] # strip off leading ':'
4439+ field = field[:field.rfind(':')] # strip off trailing ':' etc.
4440+ return field
4441+
4442+ def parse_field_body(self, indented, offset, node):
4443+ self.nested_parse(indented, input_offset=offset, node=node)
4444+
4445+ def option_marker(self, match, context, next_state):
4446+ """Option list item."""
4447+ optionlist = nodes.option_list()
4448+ try:
4449+ listitem, blank_finish = self.option_list_item(match)
4450+ except MarkupError, error:
4451+ # This shouldn't happen; pattern won't match.
4452+ msg = self.reporter.error(u'Invalid option list marker: %s' %
4453+ error)
4454+ self.parent += msg
4455+ indented, indent, line_offset, blank_finish = \
4456+ self.state_machine.get_first_known_indented(match.end())
4457+ elements = self.block_quote(indented, line_offset)
4458+ self.parent += elements
4459+ if not blank_finish:
4460+ self.parent += self.unindent_warning('Option list')
4461+ return [], next_state, []
4462+ self.parent += optionlist
4463+ optionlist += listitem
4464+ offset = self.state_machine.line_offset + 1 # next line
4465+ newline_offset, blank_finish = self.nested_list_parse(
4466+ self.state_machine.input_lines[offset:],
4467+ input_offset=self.state_machine.abs_line_offset() + 1,
4468+ node=optionlist, initial_state='OptionList',
4469+ blank_finish=blank_finish)
4470+ self.goto_line(newline_offset)
4471+ if not blank_finish:
4472+ self.parent += self.unindent_warning('Option list')
4473+ return [], next_state, []
4474+
4475+ def option_list_item(self, match):
4476+ offset = self.state_machine.abs_line_offset()
4477+ options = self.parse_option_marker(match)
4478+ indented, indent, line_offset, blank_finish = \
4479+ self.state_machine.get_first_known_indented(match.end())
4480+ if not indented: # not an option list item
4481+ self.goto_line(offset)
4482+ raise statemachine.TransitionCorrection('text')
4483+ option_group = nodes.option_group('', *options)
4484+ description = nodes.description('\n'.join(indented))
4485+ option_list_item = nodes.option_list_item('', option_group,
4486+ description)
4487+ if indented:
4488+ self.nested_parse(indented, input_offset=line_offset,
4489+ node=description)
4490+ return option_list_item, blank_finish
4491+
4492+ def parse_option_marker(self, match):
4493+ """
4494+ Return a list of `nodes.option` and `nodes.option_argument` objects,
4495+ parsed from an option marker match.
4496+
4497+ :Exception: `MarkupError` for invalid option markers.
4498+ """
4499+ optlist = []
4500+ optionstrings = match.group().rstrip().split(', ')
4501+ for optionstring in optionstrings:
4502+ tokens = optionstring.split()
4503+ delimiter = ' '
4504+ firstopt = tokens[0].split('=', 1)
4505+ if len(firstopt) > 1:
4506+ # "--opt=value" form
4507+ tokens[:1] = firstopt
4508+ delimiter = '='
4509+ elif (len(tokens[0]) > 2
4510+ and ((tokens[0].startswith('-')
4511+ and not tokens[0].startswith('--'))
4512+ or tokens[0].startswith('+'))):
4513+ # "-ovalue" form
4514+ tokens[:1] = [tokens[0][:2], tokens[0][2:]]
4515+ delimiter = ''
4516+ if len(tokens) > 1 and (tokens[1].startswith('<')
4517+ and tokens[-1].endswith('>')):
4518+ # "-o <value1 value2>" form; join all values into one token
4519+ tokens[1:] = [' '.join(tokens[1:])]
4520+ if 0 < len(tokens) <= 2:
4521+ option = nodes.option(optionstring)
4522+ option += nodes.option_string(tokens[0], tokens[0])
4523+ if len(tokens) > 1:
4524+ option += nodes.option_argument(tokens[1], tokens[1],
4525+ delimiter=delimiter)
4526+ optlist.append(option)
4527+ else:
4528+ raise MarkupError(
4529+ 'wrong number of option tokens (=%s), should be 1 or 2: '
4530+ '"%s"' % (len(tokens), optionstring))
4531+ return optlist
4532+
4533+ def doctest(self, match, context, next_state):
4534+ data = '\n'.join(self.state_machine.get_text_block())
4535+ self.parent += nodes.doctest_block(data, data)
4536+ return [], next_state, []
4537+
4538+ def line_block(self, match, context, next_state):
4539+ """First line of a line block."""
4540+ block = nodes.line_block()
4541+ self.parent += block
4542+ lineno = self.state_machine.abs_line_number()
4543+ line, messages, blank_finish = self.line_block_line(match, lineno)
4544+ block += line
4545+ self.parent += messages
4546+ if not blank_finish:
4547+ offset = self.state_machine.line_offset + 1 # next line
4548+ new_line_offset, blank_finish = self.nested_list_parse(
4549+ self.state_machine.input_lines[offset:],
4550+ input_offset=self.state_machine.abs_line_offset() + 1,
4551+ node=block, initial_state='LineBlock',
4552+ blank_finish=0)
4553+ self.goto_line(new_line_offset)
4554+ if not blank_finish:
4555+ self.parent += self.reporter.warning(
4556+ 'Line block ends without a blank line.',
4557+ line=lineno+1)
4558+ if len(block):
4559+ if block[0].indent is None:
4560+ block[0].indent = 0
4561+ self.nest_line_block_lines(block)
4562+ return [], next_state, []
4563+
4564+ def line_block_line(self, match, lineno):
4565+ """Return one line element of a line_block."""
4566+ indented, indent, line_offset, blank_finish = \
4567+ self.state_machine.get_first_known_indented(match.end(),
4568+ until_blank=True)
4569+ text = u'\n'.join(indented)
4570+ text_nodes, messages = self.inline_text(text, lineno)
4571+ line = nodes.line(text, '', *text_nodes)
4572+ if match.string.rstrip() != '|': # not empty
4573+ line.indent = len(match.group(1)) - 1
4574+ return line, messages, blank_finish
4575+
4576+ def nest_line_block_lines(self, block):
4577+ for index in range(1, len(block)):
4578+ if block[index].indent is None:
4579+ block[index].indent = block[index - 1].indent
4580+ self.nest_line_block_segment(block)
4581+
4582+ def nest_line_block_segment(self, block):
4583+ indents = [item.indent for item in block]
4584+ least = min(indents)
4585+ new_items = []
4586+ new_block = nodes.line_block()
4587+ for item in block:
4588+ if item.indent > least:
4589+ new_block.append(item)
4590+ else:
4591+ if len(new_block):
4592+ self.nest_line_block_segment(new_block)
4593+ new_items.append(new_block)
4594+ new_block = nodes.line_block()
4595+ new_items.append(item)
4596+ if len(new_block):
4597+ self.nest_line_block_segment(new_block)
4598+ new_items.append(new_block)
4599+ block[:] = new_items
4600+
4601+ def grid_table_top(self, match, context, next_state):
4602+ """Top border of a full table."""
4603+ return self.table_top(match, context, next_state,
4604+ self.isolate_grid_table,
4605+ tableparser.GridTableParser)
4606+
4607+ def simple_table_top(self, match, context, next_state):
4608+ """Top border of a simple table."""
4609+ return self.table_top(match, context, next_state,
4610+ self.isolate_simple_table,
4611+ tableparser.SimpleTableParser)
4612+
4613+ def table_top(self, match, context, next_state,
4614+ isolate_function, parser_class):
4615+ """Top border of a generic table."""
4616+ nodelist, blank_finish = self.table(isolate_function, parser_class)
4617+ self.parent += nodelist
4618+ if not blank_finish:
4619+ msg = self.reporter.warning(
4620+ 'Blank line required after table.',
4621+ line=self.state_machine.abs_line_number()+1)
4622+ self.parent += msg
4623+ return [], next_state, []
4624+
4625+ def table(self, isolate_function, parser_class):
4626+ """Parse a table."""
4627+ block, messages, blank_finish = isolate_function()
4628+ if block:
4629+ try:
4630+ parser = parser_class()
4631+ tabledata = parser.parse(block)
4632+ tableline = (self.state_machine.abs_line_number() - len(block)
4633+ + 1)
4634+ table = self.build_table(tabledata, tableline)
4635+ nodelist = [table] + messages
4636+ except tableparser.TableMarkupError, err:
4637+ nodelist = self.malformed_table(block, ' '.join(err.args),
4638+ offset=err.offset) + messages
4639+ else:
4640+ nodelist = messages
4641+ return nodelist, blank_finish
4642+
4643+ def isolate_grid_table(self):
4644+ messages = []
4645+ blank_finish = 1
4646+ try:
4647+ block = self.state_machine.get_text_block(flush_left=True)
4648+ except statemachine.UnexpectedIndentationError, err:
4649+ block, src, srcline = err.args
4650+ messages.append(self.reporter.error('Unexpected indentation.',
4651+ source=src, line=srcline))
4652+ blank_finish = 0
4653+ block.disconnect()
4654+ # for East Asian chars:
4655+ block.pad_double_width(self.double_width_pad_char)
4656+ width = len(block[0].strip())
4657+ for i in range(len(block)):
4658+ block[i] = block[i].strip()
4659+ if block[i][0] not in '+|': # check left edge
4660+ blank_finish = 0
4661+ self.state_machine.previous_line(len(block) - i)
4662+ del block[i:]
4663+ break
4664+ if not self.grid_table_top_pat.match(block[-1]): # find bottom
4665+ blank_finish = 0
4666+ # from second-last to third line of table:
4667+ for i in range(len(block) - 2, 1, -1):
4668+ if self.grid_table_top_pat.match(block[i]):
4669+ self.state_machine.previous_line(len(block) - i + 1)
4670+ del block[i+1:]
4671+ break
4672+ else:
4673+ messages.extend(self.malformed_table(block))
4674+ return [], messages, blank_finish
4675+ for i in range(len(block)): # check right edge
4676+ if len(block[i]) != width or block[i][-1] not in '+|':
4677+ messages.extend(self.malformed_table(block))
4678+ return [], messages, blank_finish
4679+ return block, messages, blank_finish
4680+
4681+ def isolate_simple_table(self):
4682+ start = self.state_machine.line_offset
4683+ lines = self.state_machine.input_lines
4684+ limit = len(lines) - 1
4685+ toplen = len(lines[start].strip())
4686+ pattern_match = self.simple_table_border_pat.match
4687+ found = 0
4688+ found_at = None
4689+ i = start + 1
4690+ while i <= limit:
4691+ line = lines[i]
4692+ match = pattern_match(line)
4693+ if match:
4694+ if len(line.strip()) != toplen:
4695+ self.state_machine.next_line(i - start)
4696+ messages = self.malformed_table(
4697+ lines[start:i+1], 'Bottom/header table border does '
4698+ 'not match top border.')
4699+ return [], messages, i == limit or not lines[i+1].strip()
4700+ found += 1
4701+ found_at = i
4702+ if found == 2 or i == limit or not lines[i+1].strip():
4703+ end = i
4704+ break
4705+ i += 1
4706+ else: # reached end of input_lines
4707+ if found:
4708+ extra = ' or no blank line after table bottom'
4709+ self.state_machine.next_line(found_at - start)
4710+ block = lines[start:found_at+1]
4711+ else:
4712+ extra = ''
4713+ self.state_machine.next_line(i - start - 1)
4714+ block = lines[start:]
4715+ messages = self.malformed_table(
4716+ block, 'No bottom table border found%s.' % extra)
4717+ return [], messages, not extra
4718+ self.state_machine.next_line(end - start)
4719+ block = lines[start:end+1]
4720+ # for East Asian chars:
4721+ block.pad_double_width(self.double_width_pad_char)
4722+ return block, [], end == limit or not lines[end+1].strip()
4723+
4724+ def malformed_table(self, block, detail='', offset=0):
4725+ block.replace(self.double_width_pad_char, '')
4726+ data = '\n'.join(block)
4727+ message = 'Malformed table.'
4728+ startline = self.state_machine.abs_line_number() - len(block) + 1
4729+ if detail:
4730+ message += '\n' + detail
4731+ error = self.reporter.error(message, nodes.literal_block(data, data),
4732+ line=startline+offset)
4733+ return [error]
4734+
4735+ def build_table(self, tabledata, tableline, stub_columns=0):
4736+ colwidths, headrows, bodyrows = tabledata
4737+ table = nodes.table()
4738+ tgroup = nodes.tgroup(cols=len(colwidths))
4739+ table += tgroup
4740+ for colwidth in colwidths:
4741+ colspec = nodes.colspec(colwidth=colwidth)
4742+ if stub_columns:
4743+ colspec.attributes['stub'] = 1
4744+ stub_columns -= 1
4745+ tgroup += colspec
4746+ if headrows:
4747+ thead = nodes.thead()
4748+ tgroup += thead
4749+ for row in headrows:
4750+ thead += self.build_table_row(row, tableline)
4751+ tbody = nodes.tbody()
4752+ tgroup += tbody
4753+ for row in bodyrows:
4754+ tbody += self.build_table_row(row, tableline)
4755+ return table
4756+
4757+ def build_table_row(self, rowdata, tableline):
4758+ row = nodes.row()
4759+ for cell in rowdata:
4760+ if cell is None:
4761+ continue
4762+ morerows, morecols, offset, cellblock = cell
4763+ attributes = {}
4764+ if morerows:
4765+ attributes['morerows'] = morerows
4766+ if morecols:
4767+ attributes['morecols'] = morecols
4768+ entry = nodes.entry(**attributes)
4769+ row += entry
4770+ if ''.join(cellblock):
4771+ self.nested_parse(cellblock, input_offset=tableline+offset,
4772+ node=entry)
4773+ return row
4774+
4775+
4776+ explicit = Struct()
4777+ """Patterns and constants used for explicit markup recognition."""
4778+
4779+ explicit.patterns = Struct(
4780+ target=re.compile(r"""
4781+ (
4782+ _ # anonymous target
4783+ | # *OR*
4784+ (?!_) # no underscore at the beginning
4785+ (?P<quote>`?) # optional open quote
4786+ (?![ `]) # first char. not space or
4787+ # backquote
4788+ (?P<name> # reference name
4789+ .+?
4790+ )
4791+ %(non_whitespace_escape_before)s
4792+ (?P=quote) # close quote if open quote used
4793+ )
4794+ (?<!(?<!\x00):) # no unescaped colon at end
4795+ %(non_whitespace_escape_before)s
4796+ [ ]? # optional space
4797+ : # end of reference name
4798+ ([ ]+|$) # followed by whitespace
4799+ """ % vars(Inliner), re.VERBOSE | re.UNICODE),
4800+ reference=re.compile(r"""
4801+ (
4802+ (?P<simple>%(simplename)s)_
4803+ | # *OR*
4804+ ` # open backquote
4805+ (?![ ]) # not space
4806+ (?P<phrase>.+?) # hyperlink phrase
4807+ %(non_whitespace_escape_before)s
4808+ `_ # close backquote,
4809+ # reference mark
4810+ )
4811+ $ # end of string
4812+ """ % vars(Inliner), re.VERBOSE | re.UNICODE),
4813+ substitution=re.compile(r"""
4814+ (
4815+ (?![ ]) # first char. not space
4816+ (?P<name>.+?) # substitution text
4817+ %(non_whitespace_escape_before)s
4818+ \| # close delimiter
4819+ )
4820+ ([ ]+|$) # followed by whitespace
4821+ """ % vars(Inliner),
4822+ re.VERBOSE | re.UNICODE),)
4823+
4824+ def footnote(self, match):
4825+ src, srcline = self.state_machine.get_source_and_line()
4826+ indented, indent, offset, blank_finish = \
4827+ self.state_machine.get_first_known_indented(match.end())
4828+ label = match.group(1)
4829+ name = normalize_name(label)
4830+ footnote = nodes.footnote('\n'.join(indented))
4831+ footnote.source = src
4832+ footnote.line = srcline
4833+ if name[0] == '#': # auto-numbered
4834+ name = name[1:] # autonumber label
4835+ footnote['auto'] = 1
4836+ if name:
4837+ footnote['names'].append(name)
4838+ self.document.note_autofootnote(footnote)
4839+ elif name == '*': # auto-symbol
4840+ name = ''
4841+ footnote['auto'] = '*'
4842+ self.document.note_symbol_footnote(footnote)
4843+ else: # manually numbered
4844+ footnote += nodes.label('', label)
4845+ footnote['names'].append(name)
4846+ self.document.note_footnote(footnote)
4847+ if name:
4848+ self.document.note_explicit_target(footnote, footnote)
4849+ else:
4850+ self.document.set_id(footnote, footnote)
4851+ if indented:
4852+ self.nested_parse(indented, input_offset=offset, node=footnote)
4853+ return [footnote], blank_finish
4854+
4855+ def citation(self, match):
4856+ src, srcline = self.state_machine.get_source_and_line()
4857+ indented, indent, offset, blank_finish = \
4858+ self.state_machine.get_first_known_indented(match.end())
4859+ label = match.group(1)
4860+ name = normalize_name(label)
4861+ citation = nodes.citation('\n'.join(indented))
4862+ citation.source = src
4863+ citation.line = srcline
4864+ citation += nodes.label('', label)
4865+ citation['names'].append(name)
4866+ self.document.note_citation(citation)
4867+ self.document.note_explicit_target(citation, citation)
4868+ if indented:
4869+ self.nested_parse(indented, input_offset=offset, node=citation)
4870+ return [citation], blank_finish
4871+
4872+ def hyperlink_target(self, match):
4873+ pattern = self.explicit.patterns.target
4874+ lineno = self.state_machine.abs_line_number()
4875+ block, indent, offset, blank_finish = \
4876+ self.state_machine.get_first_known_indented(
4877+ match.end(), until_blank=True, strip_indent=False)
4878+ blocktext = match.string[:match.end()] + '\n'.join(block)
4879+ block = [escape2null(line) for line in block]
4880+ escaped = block[0]
4881+ blockindex = 0
4882+ while True:
4883+ targetmatch = pattern.match(escaped)
4884+ if targetmatch:
4885+ break
4886+ blockindex += 1
4887+ try:
4888+ escaped += block[blockindex]
4889+ except IndexError:
4890+ raise MarkupError('malformed hyperlink target.')
4891+ del block[:blockindex]
4892+ block[0] = (block[0] + ' ')[targetmatch.end()-len(escaped)-1:].strip()
4893+ target = self.make_target(block, blocktext, lineno,
4894+ targetmatch.group('name'))
4895+ return [target], blank_finish
4896+
4897+ def make_target(self, block, block_text, lineno, target_name):
4898+ target_type, data = self.parse_target(block, block_text, lineno)
4899+ if target_type == 'refname':
4900+ target = nodes.target(block_text, '', refname=normalize_name(data))
4901+ target.indirect_reference_name = data
4902+ self.add_target(target_name, '', target, lineno)
4903+ self.document.note_indirect_target(target)
4904+ return target
4905+ elif target_type == 'refuri':
4906+ target = nodes.target(block_text, '')
4907+ self.add_target(target_name, data, target, lineno)
4908+ return target
4909+ else:
4910+ return data
4911+
4912+ def parse_target(self, block, block_text, lineno):
4913+ """
4914+ Determine the type of reference of a target.
4915+
4916+ :Return: A 2-tuple, one of:
4917+
4918+ - 'refname' and the indirect reference name
4919+ - 'refuri' and the URI
4920+ - 'malformed' and a system_message node
4921+ """
4922+ if block and block[-1].strip()[-1:] == '_': # possible indirect target
4923+ reference = ' '.join([line.strip() for line in block])
4924+ refname = self.is_reference(reference)
4925+ if refname:
4926+ return 'refname', refname
4927+ reference = ''.join([''.join(line.split()) for line in block])
4928+ return 'refuri', unescape(reference)
4929+
4930+ def is_reference(self, reference):
4931+ match = self.explicit.patterns.reference.match(
4932+ whitespace_normalize_name(reference))
4933+ if not match:
4934+ return None
4935+ return unescape(match.group('simple') or match.group('phrase'))
4936+
4937+ def add_target(self, targetname, refuri, target, lineno):
4938+ target.line = lineno
4939+ if targetname:
4940+ name = normalize_name(unescape(targetname))
4941+ target['names'].append(name)
4942+ if refuri:
4943+ uri = self.inliner.adjust_uri(refuri)
4944+ if uri:
4945+ target['refuri'] = uri
4946+ else:
4947+ raise ApplicationError('problem with URI: %r' % refuri)
4948+ self.document.note_explicit_target(target, self.parent)
4949+ else: # anonymous target
4950+ if refuri:
4951+ target['refuri'] = refuri
4952+ target['anonymous'] = 1
4953+ self.document.note_anonymous_target(target)
4954+
4955+ def substitution_def(self, match):
4956+ pattern = self.explicit.patterns.substitution
4957+ src, srcline = self.state_machine.get_source_and_line()
4958+ block, indent, offset, blank_finish = \
4959+ self.state_machine.get_first_known_indented(match.end(),
4960+ strip_indent=False)
4961+ blocktext = (match.string[:match.end()] + '\n'.join(block))
4962+ block.disconnect()
4963+ escaped = escape2null(block[0].rstrip())
4964+ blockindex = 0
4965+ while True:
4966+ subdefmatch = pattern.match(escaped)
4967+ if subdefmatch:
4968+ break
4969+ blockindex += 1
4970+ try:
4971+ escaped = escaped + ' ' + escape2null(block[blockindex].strip())
4972+ except IndexError:
4973+ raise MarkupError('malformed substitution definition.')
4974+ del block[:blockindex] # strip out the substitution marker
4975+ block[0] = (block[0].strip() + ' ')[subdefmatch.end()-len(escaped)-1:-1]
4976+ if not block[0]:
4977+ del block[0]
4978+ offset += 1
4979+ while block and not block[-1].strip():
4980+ block.pop()
4981+ subname = subdefmatch.group('name')
4982+ substitution_node = nodes.substitution_definition(blocktext)
4983+ substitution_node.source = src
4984+ substitution_node.line = srcline
4985+ if not block:
4986+ msg = self.reporter.warning(
4987+ 'Substitution definition "%s" missing contents.' % subname,
4988+ nodes.literal_block(blocktext, blocktext),
4989+ source=src, line=srcline)
4990+ return [msg], blank_finish
4991+ block[0] = block[0].strip()
4992+ substitution_node['names'].append(
4993+ nodes.whitespace_normalize_name(subname))
4994+ new_abs_offset, blank_finish = self.nested_list_parse(
4995+ block, input_offset=offset, node=substitution_node,
4996+ initial_state='SubstitutionDef', blank_finish=blank_finish)
4997+ i = 0
4998+ for node in substitution_node[:]:
4999+ if not (isinstance(node, nodes.Inline) or
5000+ isinstance(node, nodes.Text)):
The diff has been truncated for viewing.
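
For readers following the `parse_target` hunk above, here is a minimal standalone sketch of how a target body is classified as an indirect reference (`refname`) or a URI (`refuri`). It is an illustration only: `classify_target` and the `REFERENCE` regex are hypothetical names, and the regex is a simplified stand-in for docutils' real `explicit.patterns.reference` pattern. Unescaping and name normalization are left out.

```python
import re

# Simplified stand-in for docutils' reference pattern (an assumption, not the
# real regex): a simple name followed by "_", or a backquoted phrase
# followed by "_".
REFERENCE = re.compile(
    r"^(?:(?P<simple>[A-Za-z0-9]+(?:[-_.:+][A-Za-z0-9]+)*)_"
    r"|`(?P<phrase>[^`]+)`_)$"
)


def classify_target(block):
    """Return ('refname', name) or ('refuri', uri) for a target's body lines.

    `block` is a list of strings: the text that follows ``.. _name:``.
    """
    # Only a body whose last line ends in "_" can be an indirect target.
    if block and block[-1].strip()[-1:] == "_":
        reference = " ".join(line.strip() for line in block)
        match = REFERENCE.match(reference)
        if match:
            return "refname", match.group("simple") or match.group("phrase")
    # Otherwise treat the body as a URI: join the lines and drop all whitespace.
    uri = "".join("".join(line.split()) for line in block)
    return "refuri", uri


if __name__ == "__main__":
    print(classify_target(["other-target_"]))
    print(classify_target(["`a phrase`_"]))
    print(classify_target(["http://example.com/", "    path"]))
    print(classify_target(["http://example.com/foo_"]))
```

In this sketch a body such as `http://example.com/foo_` ends in an underscore but does not match the reference pattern as a whole, so it falls through to the URI branch. The quoted method takes the same fall-through path when `is_reference` returns `None`.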
