Modify

Opened 3 months ago

Closed 3 months ago

Last modified 3 months ago

#23468 closed enhancement (fixed)

Performance issue with Validator tree window

Reported by: GerdP Owned by: team
Priority: normal Milestone: 24.02
Component: Core validator Version:
Keywords: performance Cc: simon04

Description

Situation: Have a large dataset downloaded with overpass api and run validator with informational messages enabled. In my case I have > 560.000 entries in the tree and - probably most important - lots of results from TagChecker:

 Presets do not contain property key (529 359)
 Presets do not contain property value (3 081)

I've closed the validation layer as it slows down any zooming or paning drastically and I've zoomed in so that only a few nodes are visible. So far so good. JOSM is usable.
Steps to reproduce the performance issue:

  1. Open new data layer (e.g. to download data from OSM and fix one important error), note that the validation tree windows is emptied (correct)
  2. close the (small) data layer. Note that JOSM doesn't react for quite a while (17 secs in my case) before JOSM shows the validation results for the large dataset. (which didn't change)

I found out that most of the time is spent in OsmValidator.getErrorsBySeverityMessageDescription() which was written for #14704, esp. in AlphanumComparator.compare().

I wonder if there is a good reason to use this complex comparator instead of String.compare() which is much faster. I see difference in the order of messages but I wouldn't say one order is better than the other.

Attachments (2)

23468.patch (3.9 KB ) - added by GerdP 3 months ago.
23468.2.patch (6.4 KB ) - added by taylor.smock 3 months ago.
Alternative to attachment:23468.patch where AlphanumComparator is optimized

Download all attachments as: .zip

Change History (14)

by GerdP, 3 months ago

Attachment: 23468.patch added

comment:1 by GerdP, 3 months ago

Experimental patch to measure elapsed time in OsmValidator.getErrorsBySeverityMessageDescription() and use String.compare() in dialog.
Elapsed time with AlphanumComparator: ~18.5 secs
Elapsed time with String.compare(): ~0.1 secs

This time is also spent when the dialog shows the progress bar with the text "Updating ignored errors".

comment:2 by GerdP, 3 months ago

Cc: simon04 added

If I got that right AlphanumComparator was introduced with r11132 to solve #11072, not sure why.

comment:3 by taylor.smock, 3 months ago

Looking at some profiling data, it looks like most of the time is spent in RuleBasedCollator, and more specifically, in the normalization phase for it in StringBuilder.append.

That normalization phase would be so that the following entries are in the expected order:

  • Doolittle
  • Ḗttiene
  • Ettiene
  • Foobar

Standard String::compareTo ordering:

  • [Doolittle, Ettiene, Foobar, Ḗttiene]

With the Collator ignoring accented letters:

  • [Doolittle, Ettiene, Ḗttiene, Foobar]

We can try to detect ascii only strings and use String::compareTo for those.

I'll upload a sample patch so you can check to see if it will work for your test area.

Notes on the patch:

  • getChunk returns a substring of the string that was passed in -- the method does no transformations, it just appends the character for each position to a StringBuilder.
  • Chunk comparison code was extracted to its own method (for profiling purposes, but it also makes the methods a bit easier to read)
  • New isAscii method to avoid Collator comparison methods wherever possible. Most significant performance gains will be in areas with no accented letters.

EDIT: Profiler comparison for attachment:23468.2.patch using Mesa County, Colorado as the source data:

  • CPU: -86.4% (3.366s to 0.459s) -- I would expect your test area to be ~2.5s, down from 18.5s.
  • Memory: -99.9% (2.37GB to 2.07 MB)
Last edited 3 months ago by taylor.smock (previous) (diff)

by taylor.smock, 3 months ago

Attachment: 23468.2.patch added

Alternative to attachment:23468.patch where AlphanumComparator is optimized

comment:4 by GerdP, 3 months ago

Works fine for me. Time with this patch is 1.0 secs which is still much better than the original 18.5 secs.

comment:5 by taylor.smock, 3 months ago

Better than I expected. There might have been some additional optimizations done by the JVM after some time -- one of my profiler runs appears to have compiled both compareChunk and getChunk to "native" code.

comment:6 by taylor.smock, 3 months ago

Resolution: fixed
Status: newclosed

In 18973/josm:

Fix #23468: Improve performance in the Validator tree window

A large number of entries in the validator tree would cause the UI to lock for
significant periods of time when switching to a layer with many errors. Most of
the time spent was in AlphanumComparator.compare.

We need to use AlphanumComparator since String.compare would improperly sort
strings with accented characters.

The performance optimizations for this patch come from the following locations:

  • Extracting string chunk comparison to its own method (which can be compiled to native code)
  • Using String.substring instead of a StringBuilder when getting a string chunk for comparison

Both of those methods may be compiled to native code, but absent code compilation,
the performance improvements are as follows (as measured using an overpass
download of Mesa County, Colorado):

  • -86.4% CPU usage (3.366s to 0.459s)
  • -99.9% memory allocations (2.37 GB to 2.07 MB)

comment:7 by taylor.smock, 3 months ago

Milestone: 24.02

comment:8 by GerdP, 3 months ago

Summary: Performace issue with Validator tree windowPerformance issue with Validator tree window

comment:9 by taylor.smock, 3 months ago

This caused a test to fail. I may have to back it out if I cannot figure out a way to get ( and _ to have the "right" order.

I really don't want to though. :(

comment:10 by GerdP, 3 months ago

see also #22423

comment:11 by taylor.smock, 3 months ago

In 18977/josm:

See #23468: Improve performance in the validator tree window

This fixes an issue where there was a difference in sorting algorithms between
the faster ASCII only sorting method and the sorting method used by non-ASCII
strings.

comment:12 by taylor.smock, 3 months ago

In 18978/josm:

See #23468: Improve performance in the validator tree window

Add missing - character.

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain team.
as The resolution will be set.
The resolution will be deleted. Next status will be 'reopened'.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.