Merge lp:~adamreichold/qpdfview/extended-text-selection into lp:qpdfview

Proposed by Adam Reichold
Status: Rejected
Rejected by: Adam Reichold
Proposed branch: lp:~adamreichold/qpdfview/extended-text-selection
Merge into: lp:qpdfview
Diff against target: 331 lines (+138/-46)
6 files modified
sources/documentview.cpp (+2/-2)
sources/model.h (+10/-1)
sources/pageitem.cpp (+39/-2)
sources/pageitem.h (+3/-0)
sources/pdfmodel.cpp (+83/-40)
sources/pdfmodel.h (+1/-1)
To merge this branch: bzr merge lp:~adamreichold/qpdfview/extended-text-selection
Reviewer            Review Type    Date Requested    Status
Benjamin Eltzner                                     Pending
Razi Alavizadeh                                      Pending
Review via email: mp+264322@code.launchpad.net

Description of the change

This branch extends the model to allow for extended text selections, i.e. an arbitrary boundary and the text contained within it, and implements this within the PDF model. It also modifies the PageItem so that it makes use of this selection if available, preferring it over the previous method of text extraction and also highlighting the boundary during the selection process.

The extended text extraction itself uses the same Poppler API as the cached text extraction and relies on the fact that Poppler already provides the text boxes in "approximately reading order". This might be much more complicated to implement in the other backends, e.g. DjVuLibre, which is why it should stay optional IMHO.

Some of the points that need to be discussed IMHO are:

* Do we want to always use it if made available by the model, or maybe extend the copy-to-clipboard pop-up menu with additional options to explicitly select the method of text extraction instead?

* Is the performance sufficient or do we need to devise a more incremental computation? It feels alright on my machine but I did come to wrong conclusions based on this in the past. This also affects whether this should be put behind a configuration setting or into a separate rubber-band-mode.

* This currently works on a per-text-box, i.e. approximately per-word, level and could be extended to work on a per-character level but I suspect with significant overhead for either tracking character runs or already aggregating runs into a boundary and contained text.

* Should the boundary be processed further to provide more connected highlighting and if so how, e.g. computing (an approximation of) the convex hull?
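
For the last point, one possible post-processing step is sketched below: collect the corner points of the selected text boxes and compute their convex hull with Andrew's monotone chain algorithm. This is a standalone sketch and not part of the branch. `Point`, `Rect` and `convexHull` are hypothetical names standing in for `QPointF`, `QRectF` and a helper that does not exist yet, and the result would still have to be turned into a `QPainterPath`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for QPointF and QRectF, to keep the sketch free of Qt.
struct Point { double x, y; };
struct Rect { double left, top, right, bottom; };

// Cross product of (a - o) and (b - o); its sign tells the turn direction.
static double cross(const Point& o, const Point& a, const Point& b)
{
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Andrew's monotone chain over the corners of all boxes.
// Returns the hull vertices in order around the hull, without collinear points.
std::vector<Point> convexHull(const std::vector<Rect>& boxes)
{
    std::vector<Point> points;
    for (const Rect& r : boxes) {
        points.push_back({r.left, r.top});
        points.push_back({r.right, r.top});
        points.push_back({r.right, r.bottom});
        points.push_back({r.left, r.bottom});
    }

    // Sort lexicographically and drop duplicate corners.
    std::sort(points.begin(), points.end(), [](const Point& a, const Point& b) {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    });
    points.erase(std::unique(points.begin(), points.end(), [](const Point& a, const Point& b) {
        return a.x == b.x && a.y == b.y;
    }), points.end());

    if (points.size() < 3) {
        return points;
    }

    std::vector<Point> hull(2 * points.size());
    std::size_t k = 0;

    // First half of the hull, walking left to right.
    for (std::size_t i = 0; i < points.size(); ++i) {
        while (k >= 2 && cross(hull[k - 2], hull[k - 1], points[i]) <= 0) --k;
        hull[k++] = points[i];
    }

    // Second half of the hull, walking right to left.
    for (std::size_t i = points.size() - 1, t = k + 1; i > 0; --i) {
        while (k >= t && cross(hull[k - 2], hull[k - 1], points[i - 1]) <= 0) --k;
        hull[k++] = points[i - 1];
    }

    hull.resize(k - 1); // the last point repeats the first one
    return hull;
}
```

A convex hull would over-cover selections that start or end in the middle of a line, so it is only an approximation of the selected region.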

Adam Reichold (adamreichold) wrote :

Also note that this does not make any use of tagged PDF features and hence its reliability w.r.t. complicated layouts is probably questionable...

1935. By Adam Reichold

Since PdfPage::text is now a fallback interface we can remove the separate Page::cachedText entry point which all other plugins forwarded to Page::text anyway.

1936. By Adam Reichold

Some minor cosmetic clean-ups to PDF text extraction and selection.

Razi Alavizadeh (srazi) wrote :

Hello Adam,

> Also note that this does not make any use of tagged PDF features and hence
> its reliability w.r.t. complicated layouts is probably questionable...

Is this related? With this branch, the text of RTL documents is selected in
reversed order.

CORRECT: این عشق خانمانسوز
INCORRECT: زوسنامناخ قشع نیا
I attached a PDF in Persian for testing.

Best regards,
Razi.


--
Alavizadeh, Sayed Razi
My Blog: http://pozh.org
Saaghar (نرم‌افزار شعر): http://saaghar.pozh.org/
Saaghar Fan Page: http://www.facebook.com/saaghar.p
Saaghar Mailing List: http://groups.google.com/group/saaghar

Adam Reichold (adamreichold) wrote :

Hello,

> Is related? With this branch it selects text of RTL documents in reversed
> order.
>
> CORRECT: این عشق خانمانسوز
> INCORRECT: زوسنامناخ قشع نیا
> I attached a PDF in Persian for testing.

I don't think this is related to tagged PDF since the previous method of text extraction did not use it either. But it is obviously incorrect and a result of losing some of the heuristics which were implemented within Poppler::Page::text and Poppler::Page::search.

What I wonder, though, is whether this should not affect the extended search dock in the same way, as it uses the same method of text extraction? At least I don't understand how there could be a difference in behaviour...

From how I understand these things, "fixing" this by just calling QString::reverse if Document::wantsRightToLeft() will not be correct?

Best regards, Adam.

Benjamin Eltzner (b-eltzner) wrote :

Hi Adam,

I noticed the following:

* When selecting several lines of text, line breaks are not preserved
and not replaced by blank spaces, which leads to word concatenations.

* The selection tool is still a "frame", so when the text portion to be
selected starts to the right of the position where it ends one will
still have to copy unwanted text bits. Maybe the selection tool could be
adjusted to select "line wise" from the "button push position" to the
"button release position".

* I did not have any trouble concerning performance.
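
The "line wise" selection suggested in the second point could look roughly like the sketch below: instead of intersecting a frame with the text boxes, it takes the box nearest to the press position and the box nearest to the release position and selects everything between them in reading order. This is only an illustration. `Point`, `Box`, `nearestBox` and `selectLineWise` are hypothetical names, and the sketch assumes the boxes are already in reading order.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for QPointF and a Poppler text box.
struct Point { double x, y; };
struct Box { double left, top, right, bottom; std::string text; };

// Index of the box whose centre is closest to p; boxes must not be empty.
static std::size_t nearestBox(const std::vector<Box>& boxes, const Point& p)
{
    std::size_t best = 0;
    double bestDistance = -1.0;

    for (std::size_t i = 0; i < boxes.size(); ++i) {
        const double cx = (boxes[i].left + boxes[i].right) / 2.0;
        const double cy = (boxes[i].top + boxes[i].bottom) / 2.0;
        const double distance = (cx - p.x) * (cx - p.x) + (cy - p.y) * (cy - p.y);

        if (bestDistance < 0.0 || distance < bestDistance) {
            best = i;
            bestDistance = distance;
        }
    }

    return best;
}

// Joins all boxes between the press and release positions in reading order.
std::string selectLineWise(const std::vector<Box>& boxes, const Point& press, const Point& release)
{
    if (boxes.empty()) {
        return std::string();
    }

    std::size_t first = nearestBox(boxes, press);
    std::size_t last = nearestBox(boxes, release);
    if (first > last) {
        std::swap(first, last); // selection dragged backwards
    }

    std::string text;
    for (std::size_t i = first; i <= last; ++i) {
        if (!text.empty()) {
            text += ' ';
        }
        text += boxes[i].text;
    }

    return text;
}
```

With two lines "one two three" and "four five six", pressing on "three" and releasing on "four" would select only those two words, whereas a frame spanning both positions covers all six boxes.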

I agree that it would be favorable to be able to use the tagged PDF
features, however it seems that the poppler project maintainers are not
particularly interested in these. So I am really glad you propose this
implementation and are willing to take the burden of maintenance.

Cheers, Benjamin

Razi Alavizadeh (srazi) wrote :

Hello Adam,

> What I wonder is though, this should affect the extended search dock in the
> same way as it uses the same method of text extraction? At least I don't
> understand how there could be a difference in behaviour...

Yes, indeed the extended search dock is affected, too. But I didn't report
it, because it is a bug in Poppler, and the issue is bigger when it comes
to searching:

If you want to search for the RTL phrase "ABC" with Poppler, you have to
reverse it, i.e. you have to search for "CBA". If it is a mixed phrase, e.g.
"ABC 123" where "123" is an LTR phrase, you have to search for "CBA 123". See
the long-standing bug [1].
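
To illustrate why reversing the whole string is not enough for mixed content, here is a toy sketch using the same placeholders as above: uppercase ASCII letters stand in for RTL characters, digits for LTR ones. `reverseRtlRuns` and `reverseAll` are hypothetical helpers and this is not the Unicode BiDi algorithm. A real implementation would have to classify characters properly, e.g. via `QChar::direction()`.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Toy classification: uppercase ASCII letters stand in for RTL characters.
static bool isRtl(char c)
{
    return std::isupper(static_cast<unsigned char>(c)) != 0;
}

// Reverses each run of "RTL" characters (spaces inside a run belong to it)
// and leaves everything else, e.g. digits, untouched.
std::string reverseRtlRuns(std::string text)
{
    std::size_t i = 0;

    while (i < text.size()) {
        if (!isRtl(text[i])) {
            ++i;
            continue;
        }

        // Extend the run over RTL characters and spaces.
        std::size_t j = i;
        while (j < text.size() && (isRtl(text[j]) || text[j] == ' ')) {
            ++j;
        }

        // Trailing spaces separate the run from what follows; do not reverse them.
        std::size_t end = j;
        while (end > i && text[end - 1] == ' ') {
            --end;
        }

        std::reverse(text.begin() + static_cast<std::ptrdiff_t>(i),
                     text.begin() + static_cast<std::ptrdiff_t>(end));
        i = j;
    }

    return text;
}

// Naive alternative: reverse the complete string.
std::string reverseAll(std::string text)
{
    std::reverse(text.begin(), text.end());
    return text;
}
```

For the extracted string "CBA 123", `reverseRtlRuns` yields "ABC 123", while `reverseAll` yields "321 ABC" and thereby corrupts the LTR part.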

> From how I understand these things, "fixing" this by just calling
> QString::reverse if Document::wantsRightToLeft() will not be correct?

Well, I have some points and a problem here:
1- QString::reverse() (which we would have to implement ourselves) is a
solution only when the text is purely RTL.

2- What is returned by "Document::wantsRightToLeft()" if there is both RTL
and LTR content?

3- And my problem: I compiled Poppler 0.29 and 0.34, but when linking against
either of them I get "symbol not found" errors :| For both of them it
says "m_document->textDirection()" is not found, and for version 0.34 it
also says "Poppler::Page::SearchFlags" is not found. Maybe it's because I
disabled "libcairo" when compiling? Or do I have to set some flags?

Btw, I wrote a patch that fixes the problem with extracting/searching [2].
Don't you think we can port this patch to Poppler's code as a patch for [1]?

[1] https://bugs.freedesktop.org/show_bug.cgi?id=2981
[2] https://code.launchpad.net/~srazi/qpdfview/extended-text-selection

Best Regards,
Razi.


Adam Reichold (adamreichold) wrote :

Hello Razi,

On 13.07.2015 at 17:33, S. Razi Alavizadeh wrote:
> Yes, indeed the extended search dock is affected, too. But I didn't report
> it, because it is the bug of Poppler, and also the issue is bigger when we
> speak about searching:
>
> If you want search for RTL phrase "ABC" by Poppler then you have to reverse
> it, i.e. you have to search for "CBA" and if it's a mixed phrase i.e "ABC
> 123" that "123" is an LTR phrase then you have to search for "CBA 123". see
> the long standing bug [1].

> Btw, I wrote a patch that fixes problem with extracting/searching [2].
> Don't you think we can port this patch to Poppler's code as a patch for [1]?

I will look into what it would take to get this into Poppler, especially
reading up on the history of [1] seems necessary before submitting
anything. But could you rebase your work on qpdfview's trunk as a
separate merge request? (Because I will reject this merge and abandon
the branch for which I'll give an explanation in a separate message.)

>> From how I understand these things, "fixing" this by just calling
>> QString::reverse if Document::wantsRightToLeft() will not be correct?
>>
> Well I have some problem and points here:
> 1- QString::reverse() (that we should implement it) is a solution when text
> is RTL only.
>
> 2- What is reterned by "Document::wantsRightToLeft()" if there is both RTL
> and LTR content?

This is exactly what I was thinking of. So your bidirectional method
seems like the way to go. Is this a full implementation of the BiDi
algorithm?

> 3 And my problem: I compiled Poppler 0.29 and 0.34 but when linking with
> both of them there is error about symbol not found :| for both of them it
> says "m_document->textDirection()" is not found and for version 0.34 it
> says "
> Poppler::Page::SearchFlags" is not found, maybe its because I disable
> "libcairo" when compiling? or Do I have set some flags?

Have you checked that qpdfview is building against the correct Poppler
include path? This sounds like you are trying to build with "HAS_POPPLER_31"
but without adjusting "INCLUDEPATH"? It should definitely be
independent of whether you enable Cairo integration or not.

Best regards, Adam.

Adam Reichold (adamreichold) wrote :

Hello Benjamin,

On 12.07.2015 at 13:30, Benjamin Eltzner wrote:
> * When selecting several lines of text, line breaks are not preserved
> and not replaced by blank spaces, which leads to word concatenations.

Looking into this, Poppler::Page::textList gives us absolutely no
information other than the text (in whatever granularity) and its
bounding boxes. Everything else has to be synthesized from that. From
looking at how, for example, Okular does that, I definitely do not want to
implement comparable heuristics and will rather try to work on the
tagged PDF support in the Qt frontend of Poppler instead. Hence I will
reject this merge and abandon the related branch.
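
To give an idea of what "synthesized" means here, the following is a deliberately naive sketch of one such heuristic: a line break is emitted whenever the next text box does not overlap the current one vertically. `Box` and `joinWithLineBreaks` are hypothetical names. `Box` only mirrors the text, the vertical extent of the bounding box and the space-after flag of a text box. Multi-column layouts, rotated text or superscripts would already defeat it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical reduction of a text box to what the heuristic needs.
struct Box
{
    std::string text;
    double top;
    double bottom;
    bool hasSpaceAfter;
};

// Joins boxes given in reading order, guessing line breaks from the geometry.
std::string joinWithLineBreaks(const std::vector<Box>& boxes)
{
    std::string out;

    for (std::size_t i = 0; i < boxes.size(); ++i) {
        out += boxes[i].text;

        if (i + 1 == boxes.size()) {
            break; // nothing follows the last box
        }

        const Box& next = boxes[i + 1];

        // Boxes on the same line overlap vertically.
        const bool sameLine = next.top < boxes[i].bottom && next.bottom > boxes[i].top;

        if (!sameLine) {
            out += '\n'; // no overlap: assume a new line starts
        } else if (boxes[i].hasSpaceAfter) {
            out += ' ';
        }
    }

    return out;
}
```

Note that the last word of a line typically has no space after it, which is why appending the text boxes without such a check concatenates words across line ends.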

> * The selection tool is still a "frame", so when the text portion to be
> selected starts to the right of the position where it ends one will
> still have to copy unwanted text bits. Maybe the selection tool could be
> adjusted to select "line wise" from the "button push position" to the
> "button release position".

I agree, but it does not matter IMHO because of the above.

> I agree that it would be favorable to be able to use the tagged PDF
> features, however it seems that the poppler project maintainers are not
> particularly interested in these. So I am really glad you propose this
> implementation and are willing to take the burden of maintenance.

I do not think that favour has anything to do with it, but rather a
simple shortage of contributed labour. Tagged PDF support is a
significant project that puts considerable strain on the resources of
the Poppler project.

Best regards, Adam.

Razi Alavizadeh (srazi) wrote :

Hello Adam,

> I will look into what it would take to get this into Poppler, especially
> reading up on the history of [1] seems necessary before submitting
> anything. But could you rebase your work on qpdfview's trunk as a
> separate merge request? (Because I will reject this merge and abandon
> the branch for which I'll give an explanation in a separate message.)
>
Since you find the existing patch for [1] a good starting point, I won't
send my patch. Feel free to ask for it if you need it.

> This is exactly what I was thinking of. So your bidirectional method
> seems like the way to go. Is this a full implementation of the BiDi
> algorithm?

I think you already know that my patch is not the BiDi algorithm. The BiDi
algorithm is a very complicated thing; you can find its implementation
in QTSRC/gui/text/qtextengine.cpp.

> Have you checked that qpdfview is building against the correct Poppler
> include path? This sounds like you try to build with "HAS_POPPLER_31"
> but without adjustment of "INCLUDEPATH"? It should definitely be
> independent of whether you enable Cairo integration or not.

Interestingly, the error disappeared when compiling against Qt5.

Best Regards,
Razi.

2015-07-13 21:41 GMT+04:30 Adam Reichold <email address hidden>:

> The proposal to merge lp:~adamreichold/qpdfview/extended-text-selection
> into lp:qpdfview has been updated.
>
> Status: Needs review => Rejected
>

Unmerged revisions

1936. By Adam Reichold

Some minor cosmetic clean-ups to PDF text extraction and selection.

1935. By Adam Reichold

Since PdfPage::text is now a fallback interface we can remove the separate Page::cachedText entry point which all other plugins forwarded to Page::text anyway.

1934. By Adam Reichold

At least simplify the boundary and text of extended PDF selections.

1933. By Adam Reichold

Use proper text selection if made available by the model.

1932. By Adam Reichold

Extend model to allow for proper text selections

Preview Diff

=== modified file 'sources/documentview.cpp'
--- sources/documentview.cpp 2015-07-05 14:37:58 +0000
+++ sources/documentview.cpp 2015-07-09 19:23:47 +0000
@@ -936,8 +936,8 @@
 
     const QRectF surroundingRect(x, rect.top(), width, rect.height());
 
-    const QString& matchedText = m_pages.at(page - 1)->cachedText(rect);
-    const QString& surroundingText = m_pages.at(page - 1)->cachedText(surroundingRect);
+    const QString& matchedText = m_pages.at(page - 1)->text(rect);
+    const QString& surroundingText = m_pages.at(page - 1)->text(surroundingRect);
 
     return qMakePair(matchedText, surroundingText);
 }

=== modified file 'sources/model.h'
--- sources/model.h 2015-06-27 06:35:40 +0000
+++ sources/model.h 2015-07-09 19:23:47 +0000
@@ -65,6 +65,15 @@
 
     };
 
+    struct Selection
+    {
+        QPainterPath boundary;
+        QString text;
+
+        Selection() : boundary(), text() {}
+
+    };
+
     class Annotation : public QObject
     {
         Q_OBJECT
@@ -115,7 +124,7 @@
         virtual QList< Link* > links() const { return QList< Link* >(); }
 
         virtual QString text(const QRectF& rect) const { Q_UNUSED(rect); return QString(); }
-        virtual QString cachedText(const QRectF& rect) const { return text(rect); }
+        virtual Selection* selectedText(const QRectF& rect) const { Q_UNUSED(rect); return 0; }
 
         virtual QList< QRectF > search(const QString& text, bool matchCase, bool wholeWords) const { Q_UNUSED(text); Q_UNUSED(matchCase); Q_UNUSED(wholeWords); return QList< QRectF >(); }
 

=== modified file 'sources/pageitem.cpp'
--- sources/pageitem.cpp 2015-06-02 18:15:29 +0000
+++ sources/pageitem.cpp 2015-07-09 19:23:47 +0000
@@ -75,6 +75,7 @@
     m_formFields(),
     m_rubberBandMode(ModifiersMode),
     m_rubberBand(),
+    m_selection(),
     m_annotationOverlay(),
     m_formFieldOverlay(),
     m_renderParam(),
@@ -457,6 +458,12 @@
     {
         m_rubberBand = QRectF(event->pos(), QSizeF());
 
+        if(m_rubberBandMode == CopyToClipboardMode)
+        {
+            const QRectF rect = m_transform.inverted().mapRect(m_rubberBand).normalized();
+            m_selection.reset(m_page->selectedText(rect));
+        }
+
         emit rubberBandStarted();
 
         update();
@@ -561,6 +568,12 @@
     {
         m_rubberBand.setBottomRight(event->pos());
 
+        if(m_rubberBandMode == CopyToClipboardMode)
+        {
+            const QRectF rect = m_transform.inverted().mapRect(m_rubberBand).normalized();
+            m_selection.reset(m_page->selectedText(rect));
+        }
+
         update();
 
         event->accept();
@@ -595,6 +608,8 @@
         m_rubberBandMode = ModifiersMode;
         m_rubberBand = QRectF();
 
+        m_selection.reset();
+
         emit rubberBandFinished();
 
         update();
@@ -730,6 +745,18 @@
 
 void PageItem::copyToClipboard(const QPoint& screenPos)
 {
+    QString text;
+
+    if(m_selection)
+    {
+        text = m_selection->text;
+    }
+    else
+    {
+        const QRectF rect = m_transform.inverted().mapRect(m_rubberBand).normalized();
+        text = m_page->text(rect);
+    }
+
     QMenu menu;
 
     QAction* copyTextAction = menu.addAction(tr("Copy &text"));
@@ -737,8 +764,6 @@
     const QAction* copyImageAction = menu.addAction(tr("Copy &image"));
     const QAction* saveImageToFileAction = menu.addAction(tr("Save image to &file..."));
 
-    const QString text = m_page->text(m_transform.inverted().mapRect(m_rubberBand));
-
     copyTextAction->setVisible(!text.isEmpty());
     selectTextAction->setVisible(!text.isEmpty() && QApplication::clipboard()->supportsSelection());
 
@@ -1271,6 +1296,18 @@
 
         painter->restore();
     }
+
+    if(m_selection)
+    {
+        painter->save();
+
+        painter->setTransform(m_transform, true);
+        painter->setCompositionMode(QPainter::CompositionMode_Multiply);
+
+        painter->fillPath(m_selection->boundary, QBrush(s_settings->pageItem().highlightColor()));
+
+        painter->restore();
+    }
 }
 
 } // qpdfview

=== modified file 'sources/pageitem.h'
--- sources/pageitem.h 2015-06-01 19:28:55 +0000
+++ sources/pageitem.h 2015-07-09 19:23:47 +0000
@@ -37,6 +37,7 @@
 namespace Model
 {
 struct Link;
+struct Selection;
 class Annotation;
 class FormField;
 class Page;
@@ -166,6 +167,8 @@
     RubberBandMode m_rubberBandMode;
     QRectF m_rubberBand;
 
+    QScopedPointer< Model::Selection > m_selection;
+
     void copyToClipboard(const QPoint& screenPos);
     void addAnnotation(const QPoint& screenPos);
 

=== modified file 'sources/pdfmodel.cpp'
--- sources/pdfmodel.cpp 2015-06-27 06:35:40 +0000
+++ sources/pdfmodel.cpp 2015-07-09 19:23:47 +0000
@@ -150,11 +150,59 @@
 typedef QSharedPointer< Poppler::TextBox > TextBox;
 typedef QList< TextBox > TextBoxList;
 
-QCache< const PdfPage*, TextBoxList > textCache(1 << 12);
+QCache< const Poppler::Page*, TextBoxList > textCache(1 << 12);
 QMutex textCacheMutex;
 
 #define LOCK_TEXT_CACHE QMutexLocker mutexLocker(&textCacheMutex);
 
+TextBoxList textBoxes(QMutex* mutex, const Poppler::Page* page)
+{
+    {
+        LOCK_TEXT_CACHE
+
+        if(const TextBoxList* object = textCache.object(page))
+        {
+            return *object;
+        }
+    }
+
+    TextBoxList textBoxes;
+
+    {
+        QMutexLocker mutexLocker(mutex);
+
+        foreach(Poppler::TextBox* textBox, page->textList())
+        {
+            textBoxes.append(TextBox(textBox));
+        }
+    }
+
+    {
+        LOCK_TEXT_CACHE
+
+        textCache.insert(page, new TextBoxList(textBoxes), textBoxes.count());
+    }
+
+    return textBoxes;
+}
+
+inline void extendSelection(Selection* selection, const TextBox& textBox)
+{
+    selection->boundary.addRect(textBox->boundingBox());
+    selection->text.append(textBox->text());
+
+    if(textBox->hasSpaceAfter())
+    {
+        selection->text.append(QLatin1Char(' '));
+    }
+}
+
+inline void simplifySelection(Selection* selection)
+{
+    selection->boundary = selection->boundary.simplified();
+    selection->text = selection->text.simplified();
+}
+
 } // anonymous
 
 namespace qpdfview
@@ -293,7 +341,7 @@
     {
         LOCK_TEXT_CACHE
 
-        textCache.remove(this);
+        textCache.remove(m_page);
     }
 
     delete m_page;
@@ -420,46 +468,9 @@
 
 QString PdfPage::text(const QRectF& rect) const
 {
-    LOCK_PAGE
-
-    return m_page->text(rect).simplified();
-}
-
-QString PdfPage::cachedText(const QRectF& rect) const
-{
-    bool wasCached = false;
-    TextBoxList textBoxes;
-
-    {
-        LOCK_TEXT_CACHE
-
-        if(const TextBoxList* object = textCache.object(this))
-        {
-            wasCached = true;
-
-            textBoxes = *object;
-        }
-    }
-
-    if(!wasCached)
-    {
-        {
-            LOCK_PAGE
-
-            foreach(Poppler::TextBox* textBox, m_page->textList())
-            {
-                textBoxes.append(TextBox(textBox));
-            }
-        }
-
-        LOCK_TEXT_CACHE
-
-        textCache.insert(this, new TextBoxList(textBoxes), textBoxes.count());
-    }
-
     QString text;
 
-    foreach(const TextBox& textBox, textBoxes)
+    foreach(const TextBox& textBox, textBoxes(m_mutex, m_page))
     {
         if(!rect.intersects(textBox->boundingBox()))
         {
@@ -485,6 +496,38 @@
     return text.simplified();
 }
 
+Selection* PdfPage::selectedText(const QRectF& rect) const
+{
+    Selection* selection = new Selection();
+
+    bool firstRun = false;
+    TextBoxList currentRun;
+
+    foreach(const TextBox& textBox, textBoxes(m_mutex, m_page))
+    {
+        if(textBox->boundingBox().intersects(rect))
+        {
+            foreach(const TextBox& textBox, currentRun)
+            {
+                extendSelection(selection, textBox);
+            }
+
+            extendSelection(selection, textBox);
+
+            firstRun = true;
+            currentRun.clear();
+        }
+        else if(firstRun)
+        {
+            currentRun.append(textBox);
+        }
+    }
+
+    simplifySelection(selection);
+
+    return selection;
+}
+
 QList< QRectF > PdfPage::search(const QString& text, bool matchCase, bool wholeWords) const
 {
     LOCK_PAGE

=== modified file 'sources/pdfmodel.h'
--- sources/pdfmodel.h 2015-01-27 20:41:38 +0000
+++ sources/pdfmodel.h 2015-07-09 19:23:47 +0000
@@ -115,7 +115,7 @@
         QList< Link* > links() const;
 
         QString text(const QRectF& rect) const;
-        QString cachedText(const QRectF& rect) const;
+        Selection* selectedText(const QRectF& rect) const;
 
         QList< QRectF > search(const QString& text, bool matchCase, bool wholeWords) const;
 
