Unfortunately this fix isn't in the right place. By changing the CharRef regexp, you've broken some valid XML character references like  . Also, the error message for � is not very good.
Nicolae - can you help point us in the right direction here? There is already code in xquery_scanner.l and jsoniq_scanner.l that verifies character references in string literals:
    {StringLiteral} {
      if (checkXmlRefs(&yylval->err, yytext, yyleng, this, yylloc))
        return token::UNRECOGNIZED;
      TRY_STRING_LITERAL(STRING_LITERAL, yytext, yyleng);
    }
However, the same check also needs to be done anywhere else that may contain character references, including element content and attribute values. But we don't know enough about flex/bison to understand how to get this done.
If possible I'd still like Luis to do the actual change, if only to help spread some knowledge about the lexer/parser to other team members. But if you could briefly describe what kind of change needs to happen, we'd appreciate it!
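For reference, here is a rough sketch of the kind of validity check I have in mind, written independently of the scanner. It assumes the rule we want is the XML 1.0 Char production (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF). The function name isValidXmlCharRef is hypothetical; this is not the existing checkXmlRefs, whose internals I haven't reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (NOT Zorba's checkXmlRefs): returns true if `ref` is a
// well-formed character reference ("&#DDD;" or "&#xHHH;") whose code point
// is allowed by the XML 1.0 Char production.
bool isValidXmlCharRef(const std::string& ref) {
  // Shortest possible reference is "&#d;" (4 characters).
  if (ref.size() < 4 || ref[0] != '&' || ref[1] != '#' || ref.back() != ';')
    return false;

  std::size_t pos = 2;
  std::uint32_t base = 10;
  if (ref[pos] == 'x') {  // hexadecimal form; XML requires a lowercase 'x'
    base = 16;
    ++pos;
  }
  const std::size_t end = ref.size() - 1;  // index of the trailing ';'
  if (pos >= end) return false;            // no digits at all, e.g. "&#x;"

  std::uint32_t cp = 0;
  for (; pos < end; ++pos) {
    const char c = ref[pos];
    std::uint32_t digit;
    if (c >= '0' && c <= '9') digit = c - '0';
    else if (base == 16 && c >= 'a' && c <= 'f') digit = c - 'a' + 10;
    else if (base == 16 && c >= 'A' && c <= 'F') digit = c - 'A' + 10;
    else return false;
    cp = cp * base + digit;
    if (cp > 0x10FFFF) return false;  // out of range; also avoids overflow
  }

  // XML 1.0 Char production.
  return cp == 0x9 || cp == 0xA || cp == 0xD ||
         (cp >= 0x20 && cp <= 0xD7FF) ||
         (cp >= 0xE000 && cp <= 0xFFFD) ||
         (cp >= 0x10000 && cp <= 0x10FFFF);
}
```

With a check like this, a reference such as "&#xA0;" is accepted while "&#0;" and "&#xFFFF;" are rejected, without having to narrow the CharRef regexp itself. The open question is still where in the lexer rules for element content and attribute values the call belongs.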