Opened 8 months ago

Last modified 3 weeks ago

#23738 reopened enhancement

Mass upload: JOSM tries to upload changes even though changeset is already closed

Reported by: mmd
Owned by: GerdP
Priority: normal
Milestone: 25.01
Component: Core
Version:
Keywords:
Cc:

Description

When trying to upload 100,000 changes with a package size of 10,000, JOSM tries to upload changes to an already closed changeset. A typical sequence of events is as follows:

Step 1: Create changeset 1
Step 2: Upload 10k changes to changeset 1 -> HTTP 200
Step 3: Upload 10k changes to changeset 1 -> HTTP 409 (Changeset already closed)
Step 4: Close changeset 1
Step 5: Create changeset 2
Step 6: Upload 10k changes to changeset 2 -> HTTP 200
... etc.

Step 3 fails because step 2 has already uploaded 10k changes, and a changeset can contain at most 10k changes according to /api/capabilities.

It would be good if JOSM could skip step 3 above, since it is clear that the API will reject the request in any case.
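
The requested behaviour can be sketched as a small planning routine. This is only an illustration under stated assumptions, not JOSM's actual code: the function name `plan_uploads` and its parameters are hypothetical, and the limit of 10,000 is the value reported in this ticket.

```python
from typing import List, Tuple


def plan_uploads(total_changes: int, chunk_size: int,
                 max_changeset_size: int = 10_000) -> List[Tuple[int, int]]:
    """Return (changeset_index, chunk_length) pairs for a mass upload.

    Hypothetical helper: a chunk is never planned for a changeset that
    is already full, so the doomed "step 3" request is never sent.
    """
    if chunk_size <= 0 or max_changeset_size <= 0:
        raise ValueError("chunk_size and max_changeset_size must be positive")

    plan: List[Tuple[int, int]] = []
    changeset = 1   # index of the changeset currently being filled
    used = 0        # changes already planned for that changeset
    remaining = total_changes

    while remaining > 0:
        if used >= max_changeset_size:
            # Changeset is full: switch to a new one instead of trying again.
            changeset += 1
            used = 0
        # Never plan more than the free space left in the changeset.
        length = min(chunk_size, remaining, max_changeset_size - used)
        plan.append((changeset, length))
        used += length
        remaining -= length
    return plan
```

With 100,000 changes and a chunk size of 10,000, this plan contains ten uploads, one per changeset, and no upload targets a changeset that is already full.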

Attachments (4)

23738-wip.patch (1.5 KB) - added by GerdP 7 months ago.
This seems to fix the problem but I think something is wrong with the progress monitor (no update after first close)
23738-2.patch (2.3 KB) - added by GerdP 7 months ago.
also fixes problems when chunk size < 10000, but monitor still doesn't update properly
23738-3.patch (3.1 KB) - added by GerdP 7 months ago.
now progress monitor also works
23738-4.patch (8.3 KB) - added by GerdP 7 months ago.
Improve handling of upload into open changeset

Change History (46)

comment:1 by GerdP, 7 months ago

Maybe JOSM should not allow the chunk size to be equal to the server limit? I assume with chunksize = 9999 everything works fine?

comment:2 by mmd, 7 months ago

So I tried this with package size 9999 and posted the results below. It seems that JOSM isn't aware of the current number of changes in a changeset and applies the fixed, user-defined chunk size for each upload.

In this "9999" scenario, the second upload to a changeset would then fail:

2024-06-17 21:16:05.939 INFO: PUT http://localhost:31500/api/0.6/changeset/create (167 B) ...
2024-06-17 21:16:05.949 INFO: PUT http://localhost:31500/api/0.6/changeset/create -> HTTP/1.1 200 (9 ms; 26 B)
2024-06-17 21:16:05.949 INFO: OK
2024-06-17 21:16:06.118 INFO: POST http://localhost:31500/api/0.6/changeset/126407/upload (117 kB) ...
2024-06-17 21:16:06.586 INFO: POST http://localhost:31500/api/0.6/changeset/126407/upload -> HTTP/1.1 200 (467 ms; 50.6 kB)
2024-06-17 21:16:06.586 INFO: OK
2024-06-17 21:16:06.873 INFO: POST http://localhost:31500/api/0.6/changeset/126407/upload (120 kB) ...
2024-06-17 21:16:07.383 INFO: POST http://localhost:31500/api/0.6/changeset/126407/upload -> HTTP/1.1 409 (509 ms; 58 B)
2024-06-17 21:16:07.384 INFO: Conflict
2024-06-17 21:16:07.384 SEVERE: Error header: The changeset 126407 was closed at 2024-06-17 19:16:06 UTC
2024-06-17 21:16:07.486 INFO: PUT http://localhost:31500/api/0.6/changeset/126407/close (22 B) ...
2024-06-17 21:16:07.494 INFO: PUT http://localhost:31500/api/0.6/changeset/126407/close -> HTTP/1.1 200 (7 ms; 20 B)
2024-06-17 21:16:07.494 INFO: OK
2024-06-17 21:16:07.628 INFO: PUT http://localhost:31500/api/0.6/changeset/create (167 B) ...
2024-06-17 21:16:07.640 INFO: PUT http://localhost:31500/api/0.6/changeset/create -> HTTP/1.1 200 (11 ms; 26 B)
2024-06-17 21:16:07.640 INFO: OK
2024-06-17 21:16:07.810 INFO: POST http://localhost:31500/api/0.6/changeset/126408/upload (120 kB) ...
2024-06-17 21:16:08.313 INFO: POST http://localhost:31500/api/0.6/changeset/126408/upload -> HTTP/1.1 200 (502 ms; 50.6 kB)
2024-06-17 21:16:08.313 INFO: OK
2024-06-17 21:16:08.575 INFO: POST http://localhost:31500/api/0.6/changeset/126408/upload (121 kB) ...
2024-06-17 21:16:09.049 INFO: POST http://localhost:31500/api/0.6/changeset/126408/upload -> HTTP/1.1 409 (472 ms; 58 B)
2024-06-17 21:16:09.049 INFO: Conflict
2024-06-17 21:16:09.049 SEVERE: Error header: The changeset 126408 was closed at 2024-06-17 19:16:08 UTC
2024-06-17 21:16:09.153 INFO: PUT http://localhost:31500/api/0.6/changeset/126408/close (22 B) ...
2024-06-17 21:16:09.178 INFO: PUT http://localhost:31500/api/0.6/changeset/126408/close -> HTTP/1.1 200 (23 ms; 20 B)
2024-06-17 21:16:09.178 INFO: OK
...

comment:3 by GerdP, 7 months ago

Hmm, what's the desired behaviour with a chunksize of e.g. 100? Should JOSM create one CS for each chunk? I think so, else I see no reason for this upload strategy (I've never used it).

comment:4 by mmd, 7 months ago

Back in the day, mappers used a chunk size of 100, because the upload was still fairly slow. Instead of uploading everything at once, waiting for 1-60 minutes without a status update, and hoping that the network connection would not break down, mappers used smaller chunk sizes. JOSM then shows a status update every 10 seconds or so, and the upload felt much smoother overall.

Today, that's not so much of a concern anymore: even 10k changes can be uploaded reasonably fast in one go. Still, some mappers use small chunk sizes today, maybe out of habit.

So even with chunk size of 100, JOSM should try to upload as many changes as possible in a single changeset (that's 100 uploads). It could even keep track of all successful uploads, and switch to a new changeset for upload 101.
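
A minimal sketch of the bookkeeping suggested here, assuming a limit of 10,000 changes per changeset. The class `ChangesetTracker` and its methods are hypothetical names for illustration and do not exist in JOSM.

```python
from dataclasses import dataclass


@dataclass
class ChangesetTracker:
    """Hypothetical counter of successfully uploaded changes per changeset."""
    max_changes: int = 10_000
    changeset_index: int = 1   # 1 = first changeset, 2 = second, ...
    uploaded: int = 0          # changes uploaded to the current changeset

    def target_for(self, chunk_len: int) -> int:
        """Return the changeset index the next chunk should be sent to."""
        if chunk_len > self.max_changes:
            raise ValueError("chunk is larger than the changeset limit")
        if self.uploaded + chunk_len > self.max_changes:
            # The chunk would not fit: continue in a fresh changeset.
            self.changeset_index += 1
            self.uploaded = 0
        return self.changeset_index

    def record_success(self, chunk_len: int) -> None:
        """Remember that a chunk was accepted by the server."""
        self.uploaded += chunk_len


# Usage: with chunk size 100, uploads 1..100 share one changeset
# and upload 101 goes to the next one.
tracker = ChangesetTracker()
targets = []
for _ in range(101):
    targets.append(tracker.target_for(100))
    tracker.record_success(100)
```

Counting only successful uploads keeps the tracker consistent with the server, because a rejected upload leaves the changeset unchanged.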

comment:5 by GerdP, 7 months ago

Is there a test server with a (possibly much smaller) limit that can be used to test this? If yes, what do I have to do to use this? Can I simply try to upload some 25000 new objects again and again until I get an idea what to change?

comment:6 by mmd, 7 months ago

You could use the dev instance on https://master.apis.dev.openstreetmap.org with the well known 10k changeset size limit.

In theory you could upload the same 25000 new objects as often as you like. However, since we have a rate limit on changeset uploads in place, it would be best to reach out to tomhughes and ask him to assign your OSM dev user the "importer" role. With this role you can upload up to 1 million changes per hour. Normal "newbee" users have a limit of 1000 changes/hour, which will not work in your case.

in reply to:  6 comment:7 by gaben, 7 months ago

Replying to mmd:

> ask him to assign your OSM dev user the "importer" role

Are these special roles documented somewhere?

comment:8 by mmd, 7 months ago

We currently have administrator, moderator and importer roles. The "importer" role was only recently added and is also mentioned in the Import Guidelines: https://wiki.openstreetmap.org/wiki/Import/Guidelines#Using_a_Dedicated_User_Account_for_Imports

Moderator role is mainly relevant for DWG members, administrator role e.g. for operations team members.

by GerdP, 7 months ago

Attachment: 23738-wip.patch added

This seems to fix the problem but I think something is wrong with the progress monitor (no update after first close)

comment:9 by GerdP, 7 months ago

Milestone: 24.06
Owner: changed from team to GerdP
Status: new → assigned

comment:10 by GerdP, 7 months ago

@mmd OK, I've hit the upload limit again. How exactly do I contact tomhughes?

by GerdP, 7 months ago

Attachment: 23738-2.patch added

also fixes problems when chunk size < 10000, but monitor still doesn't update properly

by GerdP, 7 months ago

Attachment: 23738-3.patch added

now progress monitor also works

comment:11 by GerdP, 7 months ago

I think this works well now. If nobody complains I'll commit the patch tomorrow.

comment:12 by mmd, 7 months ago

Mastodon would do (https://en.osm.town/@tomh). You can also try #osm-dev on IRC, email operations@… (for the whole OWG team), or create an operations issue at https://github.com/openstreetmap/operations/issues

comment:13 by mmd, 7 months ago

Thank you for working on the patch. I also gave it a try on the dev instance with my mmd2mod user (1 million limit/h). Test was performed with your 23738-3.patch, which I applied on top of version 19125.

Overall the upload processing is much smoother now. The second upload attempt after a successful 10k upload no longer occurs.

47 changesets created for user mmd2mod: https://master.apis.dev.openstreetmap.org/user/mmd2mod
JOSM log file: https://gist.github.com/mmd-osm/8a3b39a68d9ae675a97f43e36de94063

comment:14 by GerdP, 7 months ago

Thanks for testing. Still, something must be wrong because when I try to upload exactly 30000 objects only 2 changesets are created and the last 10000 objects are not uploaded.
Edit: This also happens with tested version 19096, so it's a bug at another place.

Last edited 7 months ago by GerdP

comment:15 by GerdP, 7 months ago

Resolution: fixed
Status: assigned → closed

In 19126/josm:

fix #23738: Mass upload: JOSM tries to upload changes even though changeset is already closed

  • while filling the changeset check if the maximum size is reached so that JOSM doesn't attempt to upload more than the maximum allowed objects
  • throw ChangesetClosedException to handle the automatic close of the changeset on server side

comment:16 by GerdP, 7 months ago

For the bug with 30,000 objects see new ticket #23754

comment:17 by GerdP, 7 months ago

Resolution: fixed
Status: closed → reopened

This doesn't yet fix the situation when a user uploads to a still open changeset and the maximum object limit is reached.

comment:18 by GerdP, 7 months ago

@mmd: Maybe I have misunderstood the meaning of the 10,000 limit. I now have an open changeset on https://master.apis.dev.openstreetmap.org/changeset/369897 with 10,000 elements.
I would have expected the server to close it?
Or will this only happen when I try to add more than 10,000 elements?

comment:19 by GerdP, 7 months ago

I think something's wrong on the dev server.
I just tried to add 10,000 objects to an open CS (https://master.apis.dev.openstreetmap.org/changeset/369906), which contains 1 node, using r19096. This version tries to upload 10,000 elements into the changeset:

2024-06-24 15:24:16.062 INFO: Starting upload with tags {created_by=JOSM/1.5 (19125 SVN en);JOSM/1.5 (19096 en), comment=no auto-close4, source=none}
2024-06-24 15:24:16.063 INFO: Strategy: CHUNKED_DATASET_STRATEGY, ChunkSize: 10000, Policy: none, Close after: false
2024-06-24 15:24:16.063 INFO: Changeset 369906: no auto-close4
2024-06-24 15:24:16.098 INFO: Message notifier inactive
2024-06-24 15:24:16.166 INFO: PUT https://master.apis.dev.openstreetmap.org/api/0.6/changeset/369906 (438 B) ...
2024-06-24 15:24:16.322 INFO: PUT https://master.apis.dev.openstreetmap.org/api/0.6/changeset/369906 -> HTTP/1.1 200 (77 ms; 389 B)
2024-06-24 15:24:16.323 INFO: OK
2024-06-24 15:24:16.475 INFO: POST https://master.apis.dev.openstreetmap.org/api/0.6/changeset/369906/upload (824 kB) ...
2024-06-24 15:24:22.475 INFO: POST https://master.apis.dev.openstreetmap.org/api/0.6/changeset/369906/upload -> HTTP/1.1 409 (2.0 s; 58 B)
2024-06-24 15:24:22.477 INFO: Conflict
2024-06-24 15:24:22.477 SEVERE: Error header: The changeset 369906 was closed at 2024-06-24 13:24:20 UTC

In fact the changeset is still open and contains only the one node.

comment:20 by mmd, 7 months ago

An upload is always applied either completely or, in case of an error, not at all. In your example there is already 1 change in the changeset, so you cannot upload another 10,000 into the same changeset, because this would exceed the limit of 10,000. The API rejects the upload altogether, and the changeset remains open as it was before the failed upload attempt. You could either upload fewer changes or start a new changeset.
I think the error message may be a bit confusing; iirc the Rails port shows the same behavior.
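
The all-or-nothing rule described above can be illustrated with a small model. This is a sketch of the behaviour as explained in this comment, not the server's actual implementation; `ChangesetLimitExceeded` and `apply_upload` are hypothetical names.

```python
class ChangesetLimitExceeded(Exception):
    """Hypothetical error for an upload that would exceed the limit."""


def apply_upload(existing: int, incoming: int,
                 max_changes: int = 10_000) -> int:
    """Return the new number of changes in the changeset.

    The upload is applied completely or not at all: if it would push
    the changeset over the limit, nothing is stored and the changeset
    keeps its previous content.
    """
    if existing + incoming > max_changes:
        raise ChangesetLimitExceeded(
            f"{existing} existing + {incoming} new changes "
            f"exceed the limit of {max_changes}"
        )
    return existing + incoming
```

In the case from comment 19, `apply_upload(1, 10_000)` raises the error and the changeset still contains 1 change, whereas `apply_upload(1, 9_999)` succeeds and fills the changeset exactly.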

comment:21 by GerdP, 7 months ago

My understanding was that a changeset with exactly 10000 objects is automatically closed by the server. This also doesn't happen.

comment:22 by mmd, 7 months ago

Regarding your previous question, you could still change the changeset metadata (hashtags) even when you have already uploaded 10,000 changes. What you can't do is upload further changes.

If it helps you can also take a look at the implementation: https://github.com/zerebubuth/openstreetmap-cgimap/blob/master/src/backend/apidb/changeset_upload/changeset_updater.cpp#L54

comment:23 by GerdP, 7 months ago

OK, the link to the source helps. I was completely misled by the error header which claims that the changeset was closed. I'll see if I can also change the code which writes to an open changeset, but I see many old problems in that area.

comment:24 by mmd, 7 months ago

The original Rails code also throws the same error message in case there's a genuine timeout or if there are too many changes in a changeset. It would have been easier to understand if both concepts were not clubbed together in a single open status.

https://github.com/openstreetmap/openstreetmap-website/blob/0c4c3cfcd46ea2c314dabcfe05b3f8a0d3430359/app/models/concerns/consistency_validations.rb#L34

https://github.com/openstreetmap/openstreetmap-website/blob/0c4c3cfcd46ea2c314dabcfe05b3f8a0d3430359/app/models/changeset.rb#L75

comment:25 by GerdP, 7 months ago

The phrase "The changeset x was closed" probably means that the update connection was closed, not the changeset itself.

by GerdP, 7 months ago

Attachment: 23738-4.patch added

Improve handling of upload into open changeset

comment:26 by GerdP, 7 months ago

23738-4.patch separates the case where the server responded with "changeset closed" from the case where JOSM recognizes in advance that this would happen.
It still only handles the upload in chunks.
Several problems remain unsolved, but they are not really related to this ticket:

  • When selecting an open changeset that is too full, JOSM should not offer to "Upload all objects in one request". I think it would be best to disable the option in this case.
  • The number of already existing objects in the open changeset is not taken into account when the number of chunks is calculated.
  • Different code in UploadLayerTask is used when a layer with changes is closed and the user decides to upload. This just shows an error popup and cancels the closing of the layer.
  • Uploading a large number of objects with the strategy "Upload each object individually" takes very long and the "Cancel" button doesn't seem to work.

comment:27 by GerdP, 7 months ago

> When selecting an open changeset that is too full JOSM should not offer to "Upload all objects in one request". I think best would be to disable the option in this case.

Or maybe better: don't allow selecting an open changeset which cannot take all objects. Examples:

  • Have an open CS with 9995 changes and a new upload with up to 5 objects: no problem
  • Have an open CS with 9995 changes and a new upload with more than 5 objects: reject the selection of the open changeset

comment:28 by GerdP, 7 months ago

In 19127/josm:

see #23738 comment:26

  • make Cancel button work when using "Upload each object individually"

in reply to:  27 comment:29 by skyper, 7 months ago

Replying to GerdP:

> When selecting an open changeset that is too full JOSM should not offer to "Upload all objects in one request". I think best would be to disable the option in this case.
>
> Or maybe better: Don't allow to select an open changeset which cannot take all objects. Examples:
>
>   • Have an open CS with 9995 changes and a new upload with up to 5 objects: no problem
>   • Have an open CS with 9995 changes and a new upload with more than 5 objects: reject the selection of the open changeset

I think it should be possible to fill the open CS first. What do you do if you have an open CS with 9995 changes and you want to upload another 10004 objects? Maybe, you do not want to close the last CS after upload but continue editing and then upload to the open CS again.

comment:30 by GerdP, 7 months ago

Are you thinking of another strategy which doesn't ask for the chunk size and just fills the CS and opens a new one if needed?

comment:31 by skyper, 7 months ago

Yes, I think there should be no difference between uploading to an open CS and uploading to a new CS. Once the CS is full, a new CS is created; making assumptions and calculations to work around the limit is probably not ideal. If "Upload all objects in one request" does not work because the open CS's object limit would be reached, simply fill the open CS with a smaller chunk and then continue in a new CS.
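
The "fill the open CS first, then continue in a new CS" idea from comments 29 and 31 could look roughly like this. It is a sketch only: the function name `split_for_open_changeset` is hypothetical, and the limit of 10,000 is the value discussed in this ticket.

```python
from typing import List, Tuple


def split_for_open_changeset(existing: int, new_objects: int,
                             max_changes: int = 10_000) -> Tuple[int, List[int]]:
    """Split an upload between an open changeset and new changesets.

    Returns (into_open, new_sizes): how many objects go into the
    already open changeset, followed by the sizes of the additional
    changesets that have to be created.
    """
    if existing < 0 or new_objects < 0 or max_changes <= 0:
        raise ValueError("invalid arguments")

    free = max(0, max_changes - existing)    # space left in the open CS
    into_open = min(free, new_objects)
    remaining = new_objects - into_open

    new_sizes: List[int] = []
    while remaining > 0:
        size = min(remaining, max_changes)   # fill each new CS completely
        new_sizes.append(size)
        remaining -= size
    return into_open, new_sizes
```

For the example from comment 29, an open CS with 9995 changes and 10004 new objects gives `(5, [9999])`: five objects fill the open CS and the other 9999 go into one new CS, which then stays below the limit and could remain open for further editing.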

comment:32 by GerdP, 7 months ago

Yes, I guess that's what most users would want when they choose an open changeset. I think my last patch is close to this, but the texts/options in the dialog simply don't work well when the number of changes doesn't fit into one CS.
Maybe change "Upload all objects in one request" to "Minimize number of uploads"?

comment:33 by skyper, 7 months ago

The second option, "Upload objects in chunks of size:", has a similar problem. How about: "Upload objects in minimal requests" and "Upload objects in chunks of maximal size:"?

comment:34 by GerdP, 7 months ago

I don't want to change anything which requires new translations now and I'll be on a longer cycle tour soon, so unless someone else wants to continue this probably has to wait a few weeks.

comment:35 by skyper, 7 months ago

As #23754 also still needs to be fixed, I think waiting for better wording should not be a problem.

Enjoy your holidays. I hope you have pleasant weather.

comment:36 by stoecker, 7 months ago

Milestone: 24.06 → 24.07

Milestone closed.

comment:37 by taylor.smock, 6 months ago

Milestone: 24.07 → 24.08

Ticket retargeted after milestone closed

comment:38 by taylor.smock, 5 months ago

Milestone: 24.08 → 24.09

Ticket retargeted after milestone closed

comment:39 by taylor.smock, 4 months ago

Milestone: 24.09 → 24.10

Ticket retargeted after milestone closed

comment:40 by taylor.smock, 3 months ago

Milestone: 24.10 → 24.11

Ticket retargeted after milestone closed

comment:41 by taylor.smock, 8 weeks ago

Milestone: 24.11 → 24.12

Ticket retargeted after milestone closed

comment:42 by stoecker, 3 weeks ago

Milestone: 24.12 → 25.01
