Modify

Opened 7 years ago

Last modified 7 years ago

#15574 new enhancement

[patch] [experimental] make large jpeg loading work through the use of JNI bridge to turbojpeg system library

Reported by: cmuelle8 Owned by: team
Priority: normal Milestone:
Component: Core image mapping Version: latest
Keywords: jpeg geoimage turbojpeg large images libjpeg-turbo jni Cc:

Description (last modified by cmuelle8)

Image loading in JOSM is currently done with

 Image img = Toolkit.getDefaultToolkit().createImage(file.getPath());
 [..]
 tracker.addImage(img, 1);

This may cause issues when trying to load images with large dimensions, if the heap space cannot hold the buffer of a full decode of the image. JPEG loading however may be done in a way that the native decoding library decodes and scales the image on the fly (before the pixels are being sent to a BufferedImage).

JDK8 and JDK9 already use jpeg or turbojpeg system libraries in the background to do the heavy lifting, but there does not seem to be a proper way in those JDKs to delegate the scaling wish down to the JNI level. The imageio framework has canSetSourceRenderSize() to check if a plugin can handle a custom source render size, but this is currently (JDK7, JDK8, JDK9) not implemented for the JPEG decoder, see


One solution is to employ the JNI bridge to turbojpeg system library.

Problems (caveats) with this approach are

  • libjpeg-turbo on the host system must have been compiled with the --with-java swith to ./configure
    • debian/ubuntu precompiled versions do not have java enabled by default, see debian/rules
  • the library needs to be found and readable by the JVM to load (general JNI problem)
  • as the loaded image will be a scaled version, zooming in 1:1 will not be possible
    • this may be overcome by reloading subregions of the source image from file as needed
    • in principle this should be possible using this decompress variant instead of the convenience function

Test data (a jpeg image with 30.000x21.952 pixels) may be found under

Trying to load this file without the proposed patch results in

  • in java.lang.OutOfMemoryError: Java heap space or
  • if JOSM is started with -Xmx4G (or a variant) to supply more heap space
    • in a terminated JVM and a segfaulting Xorg (libpthread related)
    • serious bug related to OOM conditions in openjdk (?)

The proposed patch makes an effort to determine the presumably available memory before loading an image. It reads width and height of an image without loading it and then checks

  • if the image will presumably fit into half of the avail memory to the JVM loading is defered to the default toolkit as usual
  • otherwise an instance of TJDecoder is used to load a scaled version (that will also not employ more than half of the available JVM heap space memory)

Not taking more than half of the memory available is especially necessary for the bilinear scaling to work as it duplicates the image buffer (worst case) in best fit mode.

In general, because of the deliberate checking for possible OOM conditions, JOSM will be from now on less likely to hit such exceptions if large images are loaded (this applies to all image formats, not just JPEG, but for JPEGs the code will make a second attempt in loading the image, instead of giving up).


Future work:
Considering that bilinear scaling may be turned off in prefs, the constraint of not using more than half the available memory may be loosened. However, a dynamic check depending on the state bilin* options are in, might need more work, i.e. if that option is changed during runtime and an image is loaded already employing more than half of the available memory, it may seem strange to do an expensive reload of the whole image, just because this scaling option changed. So for now, this check is not dependent on the scaling option used for ImageDisplay.

Pledge for further testing:
This patch had limited testing on a single dev machine and a single OS.
It has not been tested, whether libjpeg-turbo is properly loaded under Windows or MacOS, e.g.

Attachments (6)

josm-large-jpeg-scaled-loading-through-jni-turbojpeg-to-workaround-oom-experimental.patch (77.1 KB ) - added by cmuelle8 7 years ago.
josm-large-jpeg-scaled-loading-through-jni-turbojpeg-to-workaround-oom-experimental.patch (update with fix for ticket:15625)
josm-image-loading_check-ram-constraints-to-prevent-oom-exceptions_do-not-use-expensive-scaling-while-dragging.patch (16.2 KB ) - added by cmuelle8 7 years ago.
light version; does ram constraint checking as proposed with the larger patch; adds some javadoc; does not use bilinear scaling when image is dragged; [main diff to larger patch: strips 2nd loading attempt if loading fails due to ram constraints] (update with fix to ticket:15625)
josm-store-width-and-height-info-in-ImageEntry.patch (22.4 KB ) - added by cmuelle8 7 years ago.
patch against r13193; based on the patch in comment:14; additionally refactors ImageDisplay to use ImageEntry instead; stores width and height info while metadata of images are read; might break plugin code
josm-large-jpeg-scaled-loading-through-jni-turbojpeg-to-workaround-oom-experimental__r3.patch (3.7 KB ) - added by cmuelle8 7 years ago.
patch against r13191 + josm-store-width-and-height-info-in-ImageEntry.patch applied; deprecates the two oldest patches; of limited use until (a) distros support jni in turbojpeg lib; atm debian, ubuntu, etc do not and (b) binding classes for turbojpeg support partial image decoding
josm-turbojpeg-decompressor-JNI-glue.patch (57.3 KB ) - added by cmuelle8 7 years ago.
binding classes for turbojpeg, with a custom loader in TJLoader.java and a modified TJDecompressor.java to not depend on the portions that handle intermediate YUV images; lacks partial image decoding as upstream does atm
josm-enable-large-jpeg-loading__if-memory-can-hold-scaled-version-only__by-using-jni-turbojpeg__r4.patch (3.5 KB ) - added by cmuelle8 7 years ago.
use try-with-resources statement, patch against r13271; obsoletes first four patches of this ticket; applies in combination with josm-turbojpeg-decompressor-JNI-glue.patch

Download all attachments as: .zip

Change History (32)

comment:1 by cmuelle8, 7 years ago

Description: modified (diff)

comment:2 by cmuelle8, 7 years ago

Description: modified (diff)

comment:3 by cmuelle8, 7 years ago

Description: modified (diff)

comment:4 by cmuelle8, 7 years ago

Some added files (TJCompressor.java, TJTransform.java, etc.) may not actually be needed, eventually a subset to do the decoding will do (and hence shrink the overall patch size).

Last edited 7 years ago by cmuelle8 (previous) (diff)

comment:5 by cmuelle8, 7 years ago

Description: modified (diff)

clarified situation on partial jpeg decoding with turbojpeg, implemented in C, but not exported to jni atm..

comment:6 by Don-vip, 7 years ago

The patch is quite big, and I don't really like depending on a new library for a feature probably rarely useful. How did you find this, and why would you need it?

comment:7 by cmuelle8, 7 years ago

How did I find this? .. That's a weird question, I'm uncomfortable with it somehow.
Why would that matter? All the information contained herein is present on the net.
If you would have searched the net in that direction, I'm pretty sure
you would have come up with a similar solution.

Why would I/anyone need it? Well, imaging technology is advancing.
A lot of mobile phones around these days produce images between 13 to 23 MP
and there is no reason to believe that technology will stop at this point.
And I have not started to talk about professional digital camera equipment yet.


Another issue is that the current image loading is not RAM constraint safe,
the patch also contains some improvement on that side. So I'd rather say,
if we can support it at low cost, why would you think we should not support it?

Please also keep in mind that people may use JOSM on computers with as little
as half a gig of ram. In these cases scaled loading will also be of great help,
which means the patch does not necessarily target the example given in the
description, but scale down to other use cases and/or environments.


I agree with you however, that the current state is still dissatisfying, in
my mind largely because of missing JNI binding support in the standard
jpeg libs that are packaged for distributions (debian, kubuntu, windows(?))

As for the size of the patch:

  • we do not pull in a library, just a /binding/ to a (very standard) library
  • turbojpeg itself is not exotic at all, its used as the standard jpeg decoding library on lots and lots of boxes
  • we do not need the encoding part, just the decoder, so it will maybe shrink by half of the size, when compiled this won't add much to JOSM
  • actually it's rather small patch compared to others (twelvemonkey lib e.g.)

As I added [experimental] to the title, I do not expect you to apply it
currently. It's here as a work-in-progress and for people needing this
right now (that are advanced enough to pick, apply and compile this them-
selves).

comment:8 by cmuelle8, 7 years ago

stripped some non-needed bindings from patch:
encoder, yuvimage handling parts, deprecated methods

you can further reduce the size by stripping documentation,
but, while you could argue that it is available online or in the
turbojpeg sources, i do not recommend doing so.

btw: this patch does no harm, if turbojpeg is not found or not

bindable by jni, it silently skips the 2nd image loading attempt

comment:9 by cmuelle8, 7 years ago

have ImageDisplay.java first, for if the patch is viewed using trac

comment:10 by cmuelle8, 7 years ago

@Don-vip: I've extracted some useful parts of the larger patch for you to apply independently of the broader topic of this ticket:

  • check presumably available RAM before image loads
    • the ImageObserver is used to obtain width and height of an image, before it is fully loaded with the tracker
    • this might have some issues with exotic image formats that do not provide WIDTH and HEIGHT without a full decode

-> read: does the JDK allow such image formats at all? If yes, does this have some relevance to the
geoimage component? Presumably not, but I've not done an in depth check, maybe you have some more insight.

  • do not use expensive bilinear scaling when the image is dragged
  • gracefully fall-back to default scaler if allocating duplicate buffer for bilinear op fails
  • refactors ImageDisplay.java somewhat, so the turbojpeg fragments (or other lib) can be plugged/patched at a later time
  • and finally, add some javadoc, as requested by you in an aftermath to ticket:15476#comment:22

by cmuelle8, 7 years ago

josm-large-jpeg-scaled-loading-through-jni-turbojpeg-to-workaround-oom-experimental.patch (update with fix for ticket:15625)

by cmuelle8, 7 years ago

light version; does ram constraint checking as proposed with the larger patch; adds some javadoc; does not use bilinear scaling when image is dragged; [main diff to larger patch: strips 2nd loading attempt if loading fails due to ram constraints] (update with fix to ticket:15625)

comment:11 by cmuelle8, 7 years ago

.. patches are based on r13174, but checked to apply cleanly to trunk/src/org/openstreetmap/josm/gui/layer/geoimage/ImageDisplay.java?rev=13186

comment:12 by Don-vip, 7 years ago

In 13191/josm:

fix #15625, see #15574 - geo image loading: do ram constraint checking; add some javadoc; do not use bilinear scaling when image is dragged (patch by cmuelle8, modified)

comment:13 by cmuelle8, 7 years ago

Thanks for converting the Logging.info() call to use the common placeholder format, before applying.

Do you know if DevelopersGuide/StyleGuide is up to date? E.g. do you run your eclipse setup for josm like this? I'd like to pretty format patch submissions in the future, since I've seen you ran through the hazzle of checkstyling each time. Unless you like doing this, that is ;-)

comment:14 by cmuelle8, 7 years ago

@Don-vip: I've noticed on jenkins four warnings classified high priority (BAD STYLE) for ImageDisplay.java - to make findbugs happy, please use this follow-up patch. In theory we should not have more than one ImageDisplay instance, but it may have some relevance if this class is ever used more often than once.

EDIT: added fixes to findbug rants introduced with r13191, see https://josm.openstreetmap.de/jenkins/job/JOSM/3595/findbugsResult/new/. There are also warnings about the usage of System.gc(), do you think its ok to work with @SupressFBWarning .. in these cases to ignore them? They should be called at times, when performance is degraded anyway due to error or caught outOfMem condition.

EDIT2: include proper checks in the event image dimensions cannot be read (because of file format or other condition), this ensures that the thread will not linger around waiting forever. if lots of such images are loaded and switched between these lingering threads may aggregate, so not waiting forever and properly let them finish is somewhat important.

EDIT3: add repaint call if reading width and height timed out and run() method returns early

EDIT4: superceded by attachment:josm-store-width-and-height-info-in-ImageEntry.patch

Last edited 7 years ago by cmuelle8 (previous) (diff)

in reply to:  13 comment:15 by Don-vip, 7 years ago

Replying to cmuelle8:

Do you know if DevelopersGuide/StyleGuide is up to date?

It looks like up to date.

E.g. do you run your eclipse setup for josm like this? I'd like to pretty format patch submissions in the future, since I've seen you ran through the hazzle of checkstyling each time. Unless you like doing this, that is ;-)

Yes. With the Checkstyle plugin it's fast to spot violations and fix them during the code review.

comment:16 by cmuelle8, 7 years ago

The patch in comment:14 should fix most of the findbugs warnings, except those related to System.gc() usage in exceptional events. Its debatable how to deal with them:

  • leave them as is for devs to occasionally stumble upon when checking reports
  • do not use System.gc() even in/after OutOfMemory conditions
  • use SupressFBWarnings

in reply to:  16 ; comment:17 by Don-vip, 7 years ago

Replying to cmuelle8:

  • leave them as is for devs to occasionally stumble upon when checking reports

Not the best choice, but we have a couple of issues like this.

  • do not use System.gc() even in/after OutOfMemory conditions

Is it really needed after all? If we remove it, does your use case still work?

  • use SupressFBWarnings

No, because JOSM would require findbugs to compile.

in reply to:  17 comment:18 by cmuelle8, 7 years ago

Replying to Don-vip:

Replying to cmuelle8:
Is it really needed after all? If we remove it, does your use case still work?

Applies to turbojpeg-enabled code:
Seems to work without. If the user switches images around a lot and the JVM has not had the chance to do GC yet, a reload of a big image will gracefully load at a lower resolution. Which actually is better performance-wise in such conditions. If the user actually would want to wait for a higher resolution to be loadable it is suboptimal, but this seems unlikely in the event images are switched a lot in a small amount of time. Reducing the click frequency will have the JVM garbage collect rather quick, so that higher resolution loading is automagically re-enabled.


Applies to both loaders (default toolkit and turbojpeg):
So I've removed the explicit GC call after all.

To fix the remaining findbugs warnings the updated patch in comment:14 should do. I've not used checkstyle yet, so you might have to still reformat some bits. Also the ImageObserver is used with a timeout now, which is important for images that width/height cannot be read from.

Greetings

Last edited 7 years ago by cmuelle8 (previous) (diff)

by cmuelle8, 7 years ago

patch against r13193; based on the patch in comment:14; additionally refactors ImageDisplay to use ImageEntry instead; stores width and height info while metadata of images are read; might break plugin code

by cmuelle8, 7 years ago

patch against r13191 + josm-store-width-and-height-info-in-ImageEntry.patch applied; deprecates the two oldest patches; of limited use until (a) distros support jni in turbojpeg lib; atm debian, ubuntu, etc do not and (b) binding classes for turbojpeg support partial image decoding

by cmuelle8, 7 years ago

binding classes for turbojpeg, with a custom loader in TJLoader.java and a modified TJDecompressor.java to not depend on the portions that handle intermediate YUV images; lacks partial image decoding as upstream does atm

comment:19 by cmuelle8, 7 years ago

@Don-vip:

  • do you know if there is a way to deprecate attachments in trac tickets? without overwriting them individually?
  • In DevelopersGuide/StyleGuide
    • the instructions for calling checkstyle without ant from CLI do not work, as it depends on build2/org/openstreetmap/josm/TopLevelJavadocCheck.class, working command is (just updated the wiki page regarding this)
      # build2 might not be present until 'ant checkstyle-compile' was run
      svn diff --summarize | awk '{ print $2 }' | grep "^[a-z]" | xargs \
        java -classpath "build2:tools/checkstyle/checkstyle-all.jar" \
          com.puppycrawl.tools.checkstyle.Main -c tools/checkstyle/josm_checks.xml
      
    • "correct indentation" when file is saved is not advisable (example below), if you find the time please update the relevant screenshots
      • src/org/openstreetmap/josm/gui/layer/geoimage/ImageDisplay.java

         
        550552                visibleRect.checkPointInside(p);
        551553                VisRect selectedRect = new VisRect(
        552554                        p.x < mousePointInImg.x ? p.x : mousePointInImg.x,
        553                         p.y < mousePointInImg.y ? p.y : mousePointInImg.y,
        554                         p.x < mousePointInImg.x ? mousePointInImg.x - p.x : p.x - mousePointInImg.x,
        555                         p.y < mousePointInImg.y ? mousePointInImg.y - p.y : p.y - mousePointInImg.y,
        556                         visibleRect);
         555                                p.y < mousePointInImg.y ? p.y : mousePointInImg.y,
         556                                        p.x < mousePointInImg.x ? mousePointInImg.x - p.x : p.x - mousePointInImg.x,
         557                                                p.y < mousePointInImg.y ? mousePointInImg.y - p.y : p.y - mousePointInImg.y,
         558                                                        visibleRect);
        557559                selectedRect.checkRectSize();
        558560                selectedRect.checkRectPos();
        559561                ImageDisplay.this.selectedRect = selectedRect;
Last edited 7 years ago by cmuelle8 (previous) (diff)

in reply to:  19 comment:20 by Don-vip, 7 years ago

Replying to cmuelle8:

@Don-vip:

  • do you know if there is a way to deprecate attachments in trac tickets? without overwriting them individually?

You can't. What we can do:

  • delete them
  • version them manually in the filename (v1, v2, etc.)

Thanks for updating the wiki :)

  • "correct indentation" when file is saved is not advisable (example below), if you find the time please update

Time is a critical resource, I hardly find enough time to review patches :) But at least for this, everyone can help.

comment:21 by cmuelle8, 7 years ago

Here is an ant job for checkstyle running on the files changed that will give clickable output to the eclipse console.
It is the command I've updated the wiki with, appended by a call to sed to produce a format recognized by eclipse console.

You can more easily navigate to the source files from the console output of checkstyle this way.

  • build.xml

     
    808808            encoding="UTF-8" classpath="tools/checkstyle/checkstyle-all.jar">
    809809        </javac>
    810810    </target>
     811    <target name="checkstyle-changed" depends="checkstyle-compile">
     812        <exec append="false" executable="bash" failifexecutionfails="true">
     813            <arg value="-c"/>
     814            <arg value="svn diff --summarize | grep -o '\(src\|test\).*' | xargs java -cp 'tools/checkstyle/checkstyle-all.jar:${checkstyle-build.dir}' com.puppycrawl.tools.checkstyle.Main -c tools/checkstyle/josm_checks.xml | sed -e 's:\([^ ]*\) [^:]*/\([^:/]*.java\:[^:]*\):(\2)\1:'"/>
     815        </exec>
     816    </target>
    811817    <target name="checkstyle" depends="checkstyle-compile">
    812818        <taskdef resource="com/puppycrawl/tools/checkstyle/ant/checkstyle-ant-task.properties"
    813819             classpath="tools/checkstyle/checkstyle-all.jar:${checkstyle-build.dir}"/>

in reply to:  21 ; comment:22 by Don-vip, 7 years ago

Replying to cmuelle8:

Here is an ant job for checkstyle running on the files changed that will give clickable output to the eclipse console.

It works only for Linux. I am converting it to Windows too:

    <target name="checkstyle-changed" depends="checkstyle-compile">
        <exec append="false" osfamily="unix" executable="bash" failifexecutionfails="true">
            <arg value="-c"/>
            <arg value="svn status -q --ignore-externals src test | grep -o '\(src\|test\)/.*' | xargs java -cp '${checkstyle.dir}/checkstyle-all.jar:${checkstyle-build.dir}' com.puppycrawl.tools.checkstyle.Main -c ${checkstyle.dir}/josm_checks.xml | sed -e 's:\([^ ]*\) [^:]*/\([^:/]*.java\:[^:]*\):(\2)\1:'"/>
        </exec>
        <exec append="false" osfamily="windows" executable="powershell" failifexecutionfails="true">
            <arg value="/c"/>
            <arg value="svn status -q --ignore-externals src test | ForEach-Object {java -cp '${checkstyle.dir}/checkstyle-all.jar;${checkstyle-build.dir}' com.puppycrawl.tools.checkstyle.Main -c ${checkstyle.dir}/josm_checks.xml $_.split(' ')[7]}"/>
        </exec>
    </target>

but have some trouble to understand what your sed s:\([^ ]*\) [^:]*/\([^:/]*.java\:[^:]*\):(\2)\1: command does, could you please explain it to me with an example? Thanks.

comment:23 by Don-vip, 7 years ago

In 13220/josm:

see #15574:

  • additionally refactors ImageDisplay to use ImageEntry instead; stores width and height info while metadata of images are read; might break plugin code (patch by cmuelle8, minor changes)
  • remove double semicolon causing https://github.com/pmd/pmd/issues/785
  • enable PMD rule DoNotCallGarbageCollectionExplicitly
  • disable SpotBugs rule ST_WRITE_TO_STATIC_FROM_INSTANCE_METHOD

in reply to:  22 comment:24 by cmuelle8, 7 years ago

Replying to Don-vip:

but have some trouble to understand what your sed s:\([^ ]*\) [^:]*/\([^:/]*.java\:[^:]*\):(\2)\1: command does, could you please explain it to me with an example? Thanks.

sure:

  • a colon is used to delimit search expression from replacment
    • this is configurable in sed: whichever character follows the leading s, is used
    • to use a colon in the search expression, we need to quote it using backslash, i.e. \: is not a delimiter
  • \( \) are used to build capture groups (note the backslashes, without those braces are literals)
  • the first capture group captures words like WARN or INFO (that are printed space delimited before the full file path)
  • the second captures the file name followed by the line number (separated by a colon), [^:]*/ matches the directory, but that is not used in the replacement expression
  • the replacement expresion is (\2)\1, which is a back reference to the second captured group with added, literal braces, followed by the first capture group (to not loose the checkstyle classification in the output (like WARN, INFO, etc.)

comment:25 by cmuelle8, 7 years ago

.. eclipse, in the default configuration, only recognizes expressions of the form (%n:%l), e.g. (Filename.java:15), when translating console output to hyperlinks, which was the reason to run the streaming editor sed on the checkstyle output.

.. another way might be to tinker with eclipse preferences (iirc the search pattern is configurable somewhere), so that it recognizes the native checkstyle output.

.. doing a regex replacement however has the advantage for us that its easier to transport (i.e. we don't have to worry about getting the correct setting pushed via .settings/ folder in the git repo). then, prefs files of eclipse might see changes, which is another point to assume that this would be less future proof, and prefering to use standard shell utilities

Have you checked it works on Mac OS? FreeBSD? ;-)

by cmuelle8, 7 years ago

use try-with-resources statement, patch against r13271; obsoletes first four patches of this ticket; applies in combination with josm-turbojpeg-decompressor-JNI-glue.patch

comment:26 by Don-vip, 7 years ago

In 13303/josm:

see #15574 - new target checkstyle-changed to call checkstyle only for modified files

Modify Ticket

Change Properties
Set your email in Preferences
Action
as new The owner will remain team.
as The resolution will be set. Next status will be 'closed'.
to The owner will be changed from team to the specified user.
Next status will be 'needinfo'. The owner will be changed from team to cmuelle8.
as duplicate The resolution will be set to duplicate. Next status will be 'closed'. The specified ticket will be cross-referenced with this ticket.
The owner will be changed from team to anonymous. Next status will be 'assigned'.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.