Merge lp:~paul-lucas/zorba/pjl-misc into lp:zorba

Proposed by Paul J. Lucas
Status: Merged
Approved by: Dennis Knochenwefel
Approved revision: 10946
Merged at revision: 10946
Proposed branch: lp:~paul-lucas/zorba/pjl-misc
Merge into: lp:zorba
Diff against target: 116 lines (+65/-34)
1 file modified
src/util/unicode_util.h (+65/-34)
To merge this branch: bzr merge lp:~paul-lucas/zorba/pjl-misc
Reviewer             Review Type  Date Requested  Status
Dennis Knochenwefel                               Approve
Paul J. Lucas                                     Approve
Review via email: mp+115403@code.launchpad.net

Commit message

Added functions to test for and create UTF-16 surrogate pairs.
These will probably be needed by whoever fixes bug #1025622.

Description of the change

Added functions to test for and create UTF-16 surrogate pairs.
These will probably be needed by whoever fixes bug #1025622.
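
As a rough illustration of how the new functions might be combined when encoding to UTF-16, here is a minimal, self-contained sketch. It copies the arithmetic of is_supplementary_plane() and the new convert_surrogate() overload from the diff. The code_point typedef, the fixed std::uint16_t result type, and the append_utf16() helper are assumptions made for this example and are not part of the branch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for Zorba's unicode::code_point (assumption: the real typedef is
// not shown in this diff).
typedef std::uint32_t code_point;

// Same range test as is_supplementary_plane() in the diff.
inline bool is_supplementary_plane( code_point c ) {
  return c >= 0x10000 && c <= 0x10FFFF;
}

// Same arithmetic as the new convert_surrogate() overload; the enable_if
// constraint is dropped and the result type fixed to keep the sketch short.
inline void convert_surrogate( code_point c, std::uint16_t *high,
                               std::uint16_t *low ) {
  code_point const n = c - 0x10000;
  *high = static_cast<std::uint16_t>( 0xD800 + (n >> 10) );
  *low  = static_cast<std::uint16_t>( 0xDC00 + (n & 0x3FF) );
}

// Hypothetical helper (not in the branch): appends the UTF-16 code units
// for c to out, using a surrogate pair only when one is needed.
inline void append_utf16( code_point c, std::vector<std::uint16_t> &out ) {
  assert( c <= 0x10FFFF );  // debug-only check; the originals do no checking
  if ( is_supplementary_plane( c ) ) {
    std::uint16_t high, low;
    convert_surrogate( c, &high, &low );
    out.push_back( high );
    out.push_back( low );
  } else {
    out.push_back( static_cast<std::uint16_t>( c ) );
  }
}
```

For example, U+1F600 lies in the supplementary range, so append_utf16() would emit the pair 0xD83D, 0xDE00, while U+0041 is emitted as the single unit 0x0041.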

Paul J. Lucas (paul-lucas) :
review: Approve
Dennis Knochenwefel (dennis-knochenwefel) wrote :

Looks good. Only a small typo:

%s/covert_surrogate/convert_surrogate/

review: Needs Fixing
lp:~paul-lucas/zorba/pjl-misc updated
10946. By Paul J. Lucas

Fixed typo.

Dennis Knochenwefel (dennis-knochenwefel) :
review: Approve
Zorba Build Bot (zorba-buildbot) wrote :

Validation queue job pjl-misc-2012-07-18T20-52-58.349Z is finished. The final status was:

All tests succeeded!

Preview Diff

=== modified file 'src/util/unicode_util.h'
--- src/util/unicode_util.h 2012-07-16 23:37:51 +0000
+++ src/util/unicode_util.h 2012-07-18 14:14:21 +0000
@@ -136,40 +136,6 @@
 bool is_ucschar( code_point c );

 /**
- * Checks whether the given value is a "high surrogate."
- *
- * @param n The value to check.
- * @return Returns \c true only if \a n is a high surrogate.
- */
-inline bool is_high_surrogate( unsigned long n ) {
-  return n >= 0xD800 && n <= 0xDBFF;
-}
-
-/**
- * Checks whether the given value is a "low surrogate."
- *
- * @param n The value to check.
- * @return Returns \c true only if \a n is a low surrogate.
- */
-inline bool is_low_surrogate( unsigned long n ) {
-  return n >= 0xDC00 && n <= 0xDFFF;
-}
-
-/**
- * Converts the given high and low surrogate values into the code-point they
- * represent. Note that no checking is done on the parameters.
- *
- * @param high The high surrogate value.
- * @param low The low surrogate value.
- * @return Returns the represented code-point.
- * @see is_high_surrogate()
- * @see is_low_surrogate()
- */
-inline code_point convert_surrogate( unsigned high, unsigned low ) {
-  return 0x10000 + (high - 0xD800) * 0x400 + (low - 0xDC00);
-}
-
-/**
  * Checks whether the given code-point is valid.
  *
  * @param c The code-point to check.
@@ -338,6 +304,71 @@
   return to_string( in.data(), static_cast<size_type>( in.size() ), out );
 }

+////////// UTF-16 surrogate pairs /////////////////////////////////////////////
+
+/**
+ * Converts the given high and low surrogate values into the code-point they
+ * represent. Note that no checking is done on the parameters.
+ *
+ * @param high The high surrogate value.
+ * @param low The low surrogate value.
+ * @return Returns the represented code-point.
+ * @see is_high_surrogate()
+ * @see is_low_surrogate()
+ */
+inline code_point convert_surrogate( unsigned high, unsigned low ) {
+  return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
+}
+
+/**
+ * Converts the given code-point into the high and low surrogate values that
+ * represent it. Note that no checking is done on the parameters.
+ *
+ * @tparam ResultType The integer type for the results.
+ * @param c The code-point to convert.
+ * @param high A pointer to where to put the high surrogate.
+ * @param low A pointer to where to put the low surrogate.
+ */
+template<typename ResultType> inline
+typename std::enable_if<ZORBA_TR1_NS::is_integral<ResultType>::value,
+                        void>::type
+convert_surrogate( code_point c, ResultType *high, ResultType *low ) {
+  code_point const n = c - 0x10000;
+  *high = 0xD800 + (static_cast<unsigned>(n) >> 10);
+  *low = 0xDC00 + (n & 0x3FF);
+}
+
+/**
+ * Checks whether the given value is a "high surrogate."
+ *
+ * @param n The value to check.
+ * @return Returns \c true only if \a n is a high surrogate.
+ */
+inline bool is_high_surrogate( unsigned long n ) {
+  return n >= 0xD800 && n <= 0xDBFF;
+}
+
+/**
+ * Checks whether the given value is a "low surrogate."
+ *
+ * @param n The value to check.
+ * @return Returns \c true only if \a n is a low surrogate.
+ */
+inline bool is_low_surrogate( unsigned long n ) {
+  return n >= 0xDC00 && n <= 0xDFFF;
+}
+
+/**
+ * Checks whether the given code-point is in the "supplementary plane" and
+ * therefore would need a surrogate pair to be encoded in UTF-16.
+ *
+ * @param c The code-point to check.
+ * @return Returns \c true only if \a c is within the supplementary plane.
+ */
+inline bool is_supplementary_plane( code_point c ) {
+  return c >= 0x10000 && c <= 0x10FFFF;
+}
+
 ///////////////////////////////////////////////////////////////////////////////

 } // namespace unicode
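
Since both convert_surrogate() overloads are documented as doing no checking, a caller decoding UTF-16 would presumably test each unit with is_high_surrogate() and is_low_surrogate() first. The sketch below shows one way that could look. The three small functions repeat the formulas from the diff (for unsigned operands, the new `<< 10` gives the same result as the old `* 0x400`). The code_point typedef, the assert, the decode_utf16() helper, and its choice to substitute U+FFFD for unpaired surrogates are assumptions for illustration only, not something this branch provides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for Zorba's unicode::code_point (assumption).
typedef std::uint32_t code_point;

// Same range tests as in the diff.
inline bool is_high_surrogate( unsigned long n ) {
  return n >= 0xD800 && n <= 0xDBFF;
}
inline bool is_low_surrogate( unsigned long n ) {
  return n >= 0xDC00 && n <= 0xDFFF;
}

// Same formula as the diff, plus a debug-only precondition check that the
// original deliberately does not perform.
inline code_point convert_surrogate( unsigned high, unsigned low ) {
  assert( is_high_surrogate( high ) && is_low_surrogate( low ) );
  return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
}

// Hypothetical helper (not in the branch): decodes UTF-16 code units into
// code-points, replacing any unpaired surrogate with U+FFFD.
inline std::vector<code_point>
decode_utf16( std::vector<std::uint16_t> const &in ) {
  std::vector<code_point> out;
  for ( std::size_t i = 0; i < in.size(); ++i ) {
    unsigned const u = in[i];
    if ( is_high_surrogate( u ) && i + 1 < in.size() &&
         is_low_surrogate( in[i + 1] ) ) {
      out.push_back( convert_surrogate( u, in[i + 1] ) );
      ++i;                      // the low surrogate has been consumed too
    } else if ( is_high_surrogate( u ) || is_low_surrogate( u ) ) {
      out.push_back( 0xFFFD );  // unpaired surrogate
    } else {
      out.push_back( u );       // ordinary BMP code unit
    }
  }
  return out;
}
```

Whether an unpaired surrogate should be replaced, skipped, or reported as an error is a policy decision for the caller; the replacement character is used here only to keep the example short.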
