[[BackLinksMenu]]

[[TicketQuery(summary=TEXT_PERFORMANCE_R0, format=table, col=summary|owner|status|type|component|priority|effort|importance, rows=description|analysis_owners|analysis_reviewers|analysis_score|design_owners|design_reviewers|design_score|implementation_owners|implementation_reviewers|implementation_score|test_owners|test_reviewers|test_score|)]]

= Analysis =

== Overview ==

Text performance is currently poor. The purpose of this task is to improve it and make Sophie usable with large texts (see the [wiki:TEXT_PERFORMANCE_R0#Comments Comments] section for what "large" means). In this first revision the known bottlenecks should be attacked and fixed, and a way to measure performance should be established. In later revisions, profiling should be performed to find and fix the remaining bottlenecks that slow the text down.

== Task requirements ==

 * The following issues that are known to affect text performance should be fixed:
   * Text processors - they should get a faster implementation. Their current naive implementation processes the whole text after every keystroke, which is very slow.
   * Properties in BaseTextModel - they should be reduced, as their constant re-computation causes slowdowns.
   * Mark and caret positions - when navigating through the text or typing, the mark and caret positions are moved separately, which leads to processing the text twice.
 * Tests should be extended to show the performance of the text processors. In later revisions they should be extended to serve as profilers.

== Task result ==

Source code.

== Implementation idea ==

 * Text processors should not process the whole text, but only the portion of it that has changed.
 * The processedText property in BaseTextModel is computationally heavy and should be changed if possible.
 * The mark and caret positions can be combined into a single property, since they are the same in most cases (except when there is a selection).
 * Text performance is best measured by running automatic tests. A set of predefined, fixed text resources should be created and used in the tests. The performance of each individual operation should be tested - e.g. importing, typing, deleting, changing styles, chaining, wrapping, etc.

== Related ==

TEXT_MODEL_REDESIGN[[BR]]

== How to demo ==

 * Show the better performance by pasting a large text and editing it.
 * Run the performance tests that were written and describe the results.
 * (internal) Show and describe the implementation of one of the processors.
 * (internal) Show and describe the improved BaseTextModel.

= Design =

The test should measure the following:
 * Insert a large text into a frame (as a paste would do);
 * Simulate pressing the left / right arrows (navigation);
 * Simulate typing several characters in the frame;
 * Select some text;
 * Simulate Backspace (deleting the selection);
 * Press "Undo";
 * Apply a link to the text;
 * Write some more characters in the text.

The test will extend SystemTestBase, since that is the only test base that starts the whole application; this way other factors, such as the halo menus, are taken into account too. The test will perform the steps above and print the results in milliseconds. Unnecessary modules (the alternative skin, for example) will not be started. All measurements will be placed in one test case, since starting the application for every case would be very slow. A rough skeleton of such a test is sketched below.
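Roughly, the skeleton of such a test could look like the sketch below. This is only an illustration of the structure; the real test extends SystemTestBase and drives the application through its actual API - the interaction helpers here (insertLargeText(), typeChars() and so on) are hypothetical placeholders for whatever the test base provides.

{{{
#!java
// Sketch only: the real test extends SystemTestBase and uses the application
// API; the interaction helpers below are hypothetical placeholders.
public class TextPerformanceSystemTest /* extends SystemTestBase */ {

    private long startTime;

    // All measurements live in a single test case, since starting the whole
    // application for every case would be too slow.
    public void testTextPerformance() {
        begin();
        insertLargeText();                 // paste-like insertion of a large text
        report("insert large text");

        begin();
        pressArrowKeys(1000);              // left / right navigation
        report("navigate with arrows");

        begin();
        typeChars("some text");            // typing several characters
        report("type characters");

        begin();
        selectText(0, 5000);               // select some text
        pressBackspace();                  // delete the selection
        report("select and delete");

        begin();
        pressUndo();
        report("undo");

        begin();
        applyLink("http://example.com");   // apply a link to the text
        typeChars("more text");            // write some more characters
        report("apply link and type");
    }

    private void begin() {
        startTime = System.nanoTime();
    }

    private void report(String label) {
        long elapsedMs = (System.nanoTime() - startTime) / 1000000;
        System.out.println(label + ": " + elapsedMs + " ms");
    }

    // Hypothetical interaction helpers (the real ones come from the test base):
    private void insertLargeText() { /* ... */ }
    private void pressArrowKeys(int times) { /* ... */ }
    private void typeChars(String chars) { /* ... */ }
    private void selectText(int from, int to) { /* ... */ }
    private void pressBackspace() { /* ... */ }
    private void pressUndo() { /* ... */ }
    private void applyLink(String target) { /* ... */ }
}
}}}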
Create a class SelectionInfo which will hold the mark and caret positions. It will be immutable, with two private final int fields - mark and caret. A constructor taking both values will be provided, as well as getters for the mark and caret indices, generated hashCode() and equals(), and a public static SelectionInfo DEFAULT = new SelectionInfo(-1, -1) constant (the values match the current defaults for the mark and caret). A sketch of the class is given at the end of this section.

In BaseTextModel, delete mark() and caret() and instead create a private RwProp selectionInfo() with a default value of SelectionInfo.DEFAULT. Replace getMark() and getCaret() with getSelectionInfo(), and setMark() and setCaret() with setSelectionInfo(). I think the purpose of these methods is clear, as is their usage.

processedText() is currently an auto property. It depends on the current selection options, the raw text and the lastChange() property. When a symbol is typed, the caret is set (which sets new options), then the mark is set (which also sets new options), then the raw text is changed, and finally lastChange() is set. So the text is processed at least four times and therefore laid out at least four times (it was actually six, but I do not remember why). Replace the auto property processedText() with a private ImmHotText getProcessed() method. Of course, if it is not automatic, it has to be called by other methods. These should be: setCaretInfo(), update(..), and maybe setProcessOptions(..). I say maybe, because setProcessOptions(..) is used for two purposes:
 * A logic wants to change the way the text is processed (for example, a quick search is performed). Then the text must be processed immediately.
 * The selectionOptionsSync() property wants to change both the selection and the caret options. Then the text needs to be processed only once, after the second set.
So the best idea I have is to extract the code of setProcessOptions(..) into a setOptionsInternal(..) method which takes one more argument: ''shouldProcessText''.

Of course, there is a problem with calling getProcessed() manually: it is not clear who should call it when the change comes from outside a client logic (this should be the case with file and server accesses, maybe even with "Undo"). The problem is very similar to the one in TEXT_SERVER_R0 (the selection does not update on file and server accesses), so I propose to resolve them both next week.

The usage of the processors seems to have been somewhat misunderstood by us during the previous task, maybe because of the lack of good documentation (my fault). The point is that the TextEffect returned by the processor's methods should contain the change that transforms the last processed text into the new result (maybe a picture should be drawn for the implementation, showing the meaning of the changes I will make there). I found out that a wrongly returned change caused a large text to trigger an AssertionError in ImmDosTree (don't ask why) and eventually the "GC overhead limit exceeded" error. Just a note: since I am not sure why this happens, I cannot be sure that this is the real reason for the GC error.

CaretProcessor, SelectionProcessor and SearchProcessor do not actually need a lot of optimization - they just apply a style to an interval of the text. I will look at their code again, but I doubt that much can be done there. The real problem is LinksProcessor, which can be made much faster. Maybe the best idea is to use code similar to the one in CaretProcessor, since it is designed to reuse the unaffected intervals of the text. So, nothing special: we take the modified interval, process it again, concatenate it with the rest of the text, and return the result (see the second sketch below).

The performance test is [browser:branches/private/kyli/perf/modules/org.sophie2.dev/src/test/java/org/sophie2/dev/author/TextPerformanceSystemTest.java here].
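To make the SelectionInfo part concrete, here is a rough sketch of the class (an illustration only - the final code may differ in details):

{{{
#!java
// Sketch of the immutable SelectionInfo value class described above.
public final class SelectionInfo {

    // Same values as the current defaults for mark and caret.
    public static final SelectionInfo DEFAULT = new SelectionInfo(-1, -1);

    private final int mark;
    private final int caret;

    public SelectionInfo(int mark, int caret) {
        this.mark = mark;
        this.caret = caret;
    }

    public int getMark() {
        return mark;
    }

    public int getCaret() {
        return caret;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof SelectionInfo)) {
            return false;
        }
        SelectionInfo other = (SelectionInfo) obj;
        return mark == other.mark && caret == other.caret;
    }

    @Override
    public int hashCode() {
        return 31 * mark + caret;
    }
}
}}}

And a very rough illustration of the interval-reuse idea for the processors. This is not the real processor API (ImmHotText and TextEffect are left out); plain strings stand in for the text just to show the shape of the computation - only the changed interval is reprocessed and the untouched parts are reused:

{{{
#!java
// Illustration only: the real processors work with ImmHotText and return a
// TextEffect. Plain strings stand in for the text here.
final class IntervalReuseSketch {

    static String reprocess(String lastProcessed, int changeStart, int changeEnd,
            String newInterval) {
        // Reuse the untouched parts of the previously processed text as-is.
        String before = lastProcessed.substring(0, changeStart);
        String after = lastProcessed.substring(changeEnd);
        // Process only the modified interval and concatenate it with the rest.
        return before + processInterval(newInterval) + after;
    }

    private static String processInterval(String interval) {
        // Stand-in for the actual work (e.g. applying link styles to the interval).
        return interval;
    }
}
}}}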
= Implementation =

Done according to the design, in [browser:branches/private/kyli/perf]. Tests show that the new performance is noticeably better than the trunk's, but of course there is still much to do. For the next revision I propose to run the test with a larger text and to profile. Also, clicking the mouse over the text currently triggers a lot of processing, so that could be investigated as well.

I also fixed some navigation problems:
 * Pressing Home, End, Up or Backspace in an empty text frame caused an exception.
 * Highlighting a quick-search result in a text frame and then deleting the text caused an exception.

Things still broken after this ticket:
 * The drawn text is not recalculated after a synchronization with the server, after loading a saved book, or after performing undo/redo.

= Testing =

Place the testing results here.

= Comments =

"Large text" is currently defined as the text contained in 100 standard-sized frames, each one on its own page.