Opened 10 years ago

Closed 10 years ago

#11418 closed defect (duplicate)

Unicode chars >16 bit will crash JOSM

Reported by: Hb---
Owned by: team
Priority: major
Milestone:
Component: Core
Version: latest
Keywords: template_report unicode
Cc:

Description

The bike symbol below is the 🚲 unicode character.

What steps will reproduce the problem?

  1. Delete existing preferences.xml and its backup and start JOSM.
  2. Create a new layer and create a new way.
  3. Tag the first node with 'keyA=🚲A'
  4. Tag the second node with 'keyB=🚲B'.
  5. Save the data to a local file like test.osm and close JOSM.
  6. Open JOSM again and load the file test.osm.

-- Notice that the first node's value changed to '🚲🚲A'.

  7. Save the file test.osm again and close JOSM.

-- Notice that in the preferences.xml file the last entry in <list key='properties.recent-tags'> is now '🚲🚲B'.

  8. Open JOSM a third time and load the file test.osm again.

-- Notice that the values in both nodes changed to '🚲🚲🚲A' and '🚲🚲🚲B', respectively.

  9. Save the file test.osm a third time and close JOSM.

-- Notice that the last entry in <list key='properties.recent-tags'> is now '🚲🚲🚲🚲🚲B' (five bikes!).

What is the expected result?

Using Unicode characters that do not fit in 16 bits (supplementary characters) should not lead to an endlessly growing preferences.xml file.

What happens instead?

JOSM no longer started a few days after I tagged a bike symbol. The error shown in the command console reported insufficient Java heap memory. This was caused by an entry in <list key='properties.recent-tags'> that was around 300,000 bytes long. That is far over 9000 bike symbols!

Revision: 8338
Repository Root: http://josm.openstreetmap.de/svn
Relative URL: ^/trunk
Last Changed Author: Don-vip
Last Changed Date: 2015-05-07 01:27:41 +0200 (Thu, 07 May 2015)
Build-Date: 2015-05-07 01:31:04
URL: http://josm.openstreetmap.de/svn/trunk
Repository UUID: 0c6e7542-c601-0410-84e7-c038aed88b3b
Last Changed Rev: 8338

Identification: JOSM/1.5 (8338 de) Windows 7 64-Bit
Memory Usage: 329 MB / 3604 MB (33 MB allocated, but free)
Java version: 1.8.0_45, Oracle Corporation, Java HotSpot(TM) 64-Bit Server VM
Dataset consistency test: No problems found

Attachments (0)

Change History (5)

comment:1 by wiktorn, 10 years ago

As far as my research goes, the problem arises in org.openstreetmap.josm.data.Preferences when calling javax.xml.stream.XMLStreamReader.getAttributeValue(null, "value"). Although there is only one bike character in the file, this method returns two.
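To check that call in isolation, something like the following could be used. It is only a sketch: the class name SupplementaryAttributeCheck, the helper readValueAttribute and the <tag> element are made up for illustration, and it parses from a string rather than going through JOSM's preferences code. Whether it shows the duplication will depend on the JDK in use, so no expected output is given.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class SupplementaryAttributeCheck {
    // U+1F6B2 BICYCLE; built from its code point so the source file stays ASCII.
    static final String BIKE = new String(Character.toChars(0x1F6B2));

    // Hypothetical helper: returns the "value" attribute of the first element in xml.
    static String readValueAttribute(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        // Same call as named in the comment above.
                        return reader.getAttributeValue(null, "value");
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not parse test document", e);
        }
    }

    public static void main(String[] args) {
        String written = BIKE + "A";
        String read = readValueAttribute("<tag key='keyA' value='" + written + "'/>");
        // Compare code points, not chars: one bike is two Java chars (a surrogate pair).
        System.out.println("code points written: " + written.codePointCount(0, written.length()));
        System.out.println("code points read:    " + read.codePointCount(0, read.length()));
        System.out.println("round trip intact:   " + written.equals(read));
    }
}
```

If "round trip intact" prints false, the parser is the component changing the value.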

Additional reading about 32bit unicode support in Java:
http://www.oracle.com/us/technologies/java/supplementary-142654.html

http://stackoverflow.com/questions/7721293/java-reading-in-character-streams-with-supplementary-unicode-characters

Maybe we could try to encode supplementary Unicode characters as XML entities, and the parser would handle them properly, but I haven't tested it yet.
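A rough sketch of that idea, untested against JOSM itself: replace every supplementary code point with a numeric character reference before the value is written. The class and method names are hypothetical, and the method deliberately does not escape &, < or quotes, which a real writer would still have to handle separately.

```java
public class SupplementaryEscaper {
    // Hypothetical helper: turns code points above U+FFFF into &#x...; references.
    static String escapeSupplementary(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (Character.isSupplementaryCodePoint(cp)) {
                sb.append("&#x")
                  .append(Integer.toHexString(cp).toUpperCase())
                  .append(';');
            } else {
                sb.appendCodePoint(cp);
            }
            // Advance by 2 chars for a surrogate pair, 1 otherwise.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String bike = new String(Character.toChars(0x1F6B2));
        System.out.println(escapeSupplementary(bike + "A")); // prints &#x1F6B2;A
    }
}
```

The file would then contain only characters from the Basic Multilingual Plane, and the parser would be responsible for turning the reference back into the bike character.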

comment:2 by Don-vip, 10 years ago

Keywords: unicode added

comment:3 by Don-vip, 10 years ago

See #3290. Can you please try with jdk9?

comment:4 by wiktorn, 10 years ago

I've just tested with JDK9, and it looks like this problem is solved there.

comment:5 by Don-vip, 10 years ago

Resolution: duplicate
Status: new → closed

Closed as duplicate of #3290.
