Ticket #1958 (closed bug: obsolete)

Opened 10 years ago

Last modified 8 years ago

html-import - HTML import throws an exception

Reported by: deyan Owned by: diana
Priority: critical Milestone: M12_RELEASE
Component: uncategorized Version: 2.0
Keywords: Cc: diana
Category: unknown Effort:
Importance: 87 Ticket_group:
Estimated Number of Hours: 0 Add Hours to Ticket: 0
Billable?: yes Total Hours: 0
Analysis_owners: deyan Design_owners: diana
Imp._owners: diana Test_owners:
Analysis_reviewers: Changelog:
Design_reviewers: pap Imp._reviewers: pap, deyan
Test_reviewers: Analysis_score: 0
Design_score: 3.5 Imp._score: 3.5
Test_score: 0

Description (last modified by deyan) (diff)

When I try to import an HTML file (Insert->HTML), an exception is thrown instead of the text from the HTML file being imported into a newly created text frame.

(Use File->Save As in your browser and test with the resulting HTML, not with a simple one.)

Attachments

1958.txt (7.0 KB) - added by deyan 10 years ago.
html.patch (13.6 KB) - added by diana 10 years ago.
entry.txt (1.0 KB) - added by diana 10 years ago.
rtf.patch (1.7 KB) - added by diana 10 years ago.
html.2.patch (4.0 KB) - added by diana 10 years ago.

Change History

comment:1 Changed 10 years ago by deyan

  • Status changed from new to s1b_analysis_finished

comment:2 Changed 10 years ago by deyan

  • Description modified (diff)

comment:3 Changed 10 years ago by deyan

  • Description modified (diff)
  • Summary changed from HTML import doesn't work to HTML import throws an exception

comment:4 Changed 10 years ago by deyan

  • Priority changed from major to critical
  • Description modified (diff)

comment:5 Changed 10 years ago by deyan

  • Importance set to 87
  • Summary changed from HTML import throws an exception to html-import – HTML import throws an exception

Batch update from file report_1.csv

comment:6 Changed 10 years ago by deyan

  • Summary changed from html-import – HTML import throws an exception to html-import – HTML import throws an exception

Batch update from file 0911261.csv

comment:7 Changed 10 years ago by todor

  • Summary changed from html-import – HTML import throws an exception to html-import - HTML import throws an exception

comment:8 Changed 10 years ago by diana

  • Design_owners set to diana
  • Imp._owners set to diana

In the HtmlTextImportManager:getResourceData method, open the HTML file and remove the meta tag that has the http-equiv and content attributes (this tag causes an exception in the javax.swing.text.html.parser.DocumentParser class).
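
The ticket does not name the exception. One way the Swing parser is known to fail on such a tag is by throwing javax.swing.text.ChangedCharSetException when it meets a meta http-equiv="Content-Type" tag that carries a charset; whether that is the exception seen here is an assumption. The sketch below reproduces that behavior in isolation. CharsetDirectiveDemo and its method are hypothetical names and are not part of the attached patches.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.ChangedCharSetException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

/** Hypothetical demo class; not part of the patches attached to this ticket. */
final class CharsetDirectiveDemo {
    private CharsetDirectiveDemo() {}

    /**
     * Parses the HTML with the Swing parser and reports whether parsing
     * was aborted by a ChangedCharSetException.
     */
    static boolean abortsOnCharsetDirective(String html, boolean ignoreCharSet) {
        try {
            // A no-op callback is enough: only the parser's own behavior matters here.
            new ParserDelegator().parse(
                    new StringReader(html), new HTMLEditorKit.ParserCallback(), ignoreCharSet);
            return false;
        } catch (ChangedCharSetException e) {
            // Raised by DocumentParser when it meets a charset directive it is not told to ignore.
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling abortsOnCharsetDirective(html, false) on a page saved from a browser would be expected to return true if this is the failure in question, and false once the meta tag has been removed. Passing true for ignoreCharSet is the parser's own switch for skipping the directive.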

comment:9 Changed 10 years ago by diana

  • Status changed from s1b_analysis_finished to s2a_design_started

comment:10 Changed 10 years ago by diana

  • Status changed from s2a_design_started to s3b_implementation_finished

comment:11 Changed 10 years ago by pap

  • Status changed from s3b_implementation_finished to s2c_design_ok
  • Cc diana added
  • Imp._reviewers set to pap
  • Design_score changed from 0 to 3.5
  • Design_reviewers set to pap
  • Imp._score changed from 0 to 2
  • First, when I apply the patch I get an error that the method createStyled does not exist in ApplyHtmlStylesUtility.
  • Second, you use a StringBuffer, which is thread-safe. That is not necessary here, and a StringBuilder could be used instead.
  • "N" is not a good name for a constant or variable: it means almost nothing.
  • "st" is not a good name either, for the same reason.
  • The same goes for "b", "newBuffer" and even "temp".
  • Also, you could just read the whole file at once.
  • You could use a regular expression to find and remove the "bad" tag.
  • Really, I don't see the point of creating so many objects and doing so many slow things.
  • It would be great if you could improve ApplyHtmlStylesUtility, as it creates lots of new ImmHotText objects, which has been very slow since the last text layout speed improvements. ApplyRtfStylesUtility may be used for reference.
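
The two suggestions above (read the whole file at once, remove the tag with a regular expression) could be sketched roughly as follows. This is only an illustration: MetaTagStripper, its method names and the pattern are hypothetical, and the charset is left to the caller because the ticket does not say how the file encoding is determined.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

/** Hypothetical helper; not the code from html.patch. */
final class MetaTagStripper {
    // Matches one <meta ...> tag that has an http-equiv attribute.
    // [^>]* keeps the match inside a single tag, even across line breaks.
    private static final Pattern HTTP_EQUIV_META =
            Pattern.compile("<meta\\b[^>]*\\bhttp-equiv\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    private MetaTagStripper() {}

    /** Returns the HTML with every http-equiv meta tag removed. */
    static String stripHttpEquivMeta(String html) {
        return HTTP_EQUIV_META.matcher(html).replaceAll("");
    }

    /** Reads the whole file in one call, then strips the tag; no manual buffering. */
    static String readAndStrip(Path file, Charset charset) throws IOException {
        String html = new String(Files.readAllBytes(file), charset);
        return stripHttpEquivMeta(html);
    }
}
```

With this shape there is no loop and no intermediate StringBuffer or StringBuilder at all, which also removes the need for names like "b" or "temp".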

comment:12 Changed 10 years ago by diana

  • Owner set to diana
  • Status changed from s2c_design_ok to s3a_implementation_started

comment:13 Changed 10 years ago by diana

  • Status changed from s3a_implementation_started to s3b_implementation_finished

comment:14 Changed 10 years ago by pap

  • Status changed from s3b_implementation_finished to s3c_implementation_ok
  • Imp._score changed from 2 to 3.5
  • Imp._reviewers changed from pap to pap, deyan
  • Passing, but we had lots of things to fix, as you know. And now it is much better.
  • You moved the ElementEntry class to a very strange place. It belongs to the text func module, that's for sure.
  • "(?i)(?u)<meta(.*?)http-equiv(.*?)>" - the one to rule them all regular expression.
  • Bad indentation and spacing - I am getting tired of fixing this.
  • Importing may be improved regarding code quality and duplication.
  • I don't get the idea of using a View object for the HTML import and a StyledDocument for the RTF one, but that's another story.
  • Committed in [8288].
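
As a side note on the regular expression quoted in this review: because "." does not match line breaks by default and "(.*?)" is free to run past a ">", the pattern can behave unexpectedly on some inputs. The sketch below compares it with a stricter variant. Only TICKET_PATTERN is taken from the review; SINGLE_TAG_PATTERN and the class name are hypothetical and are not what was committed.

```java
import java.util.regex.Pattern;

/** Hypothetical comparison; only TICKET_PATTERN is quoted from the review. */
final class MetaPatternComparison {
    // The expression quoted in the review comment.
    static final Pattern TICKET_PATTERN =
            Pattern.compile("(?i)(?u)<meta(.*?)http-equiv(.*?)>");

    // Assumed stricter variant: [^>]* cannot leave the current tag and does match newlines.
    static final Pattern SINGLE_TAG_PATTERN =
            Pattern.compile("(?i)<meta\\b[^>]*\\bhttp-equiv\\b[^>]*>");

    private MetaPatternComparison() {}

    /** Removes every match of the pattern from the HTML. */
    static String strip(Pattern pattern, String html) {
        return pattern.matcher(html).replaceAll("");
    }
}
```

For the input `<meta name="a"><meta http-equiv="x" content="y"><p>` the quoted pattern also removes the first meta tag, because its match starts at the first `<meta` on the line, while the stricter variant removes only the second. For a tag that is split across lines after `<meta`, the quoted pattern finds nothing and the stricter variant still matches. For typical pages saved from a browser this may not matter.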

comment:15 Changed 8 years ago by meddle

  • Status changed from s3c_implementation_ok to closed
  • Resolution set to obsolete

Closing all the tickets before M Y1
