Opened 8 months ago
Last modified 3 weeks ago
#23738 reopened enhancement
Mass upload: JOSM tries to upload changes even though changeset is already closed
Reported by: | mmd | Owned by: | GerdP |
---|---|---|---|
Priority: | normal | Milestone: | 25.01 |
Component: | Core | Version: | |
Keywords: | | Cc: | |
Description
When uploading 100,000 changes with a package size of 10,000, JOSM tries to upload changes to a changeset that is already closed. A typical sequence of events looks like this:
Step 1: Create changeset 1
Step 2: Upload 10k changes to changeset 1 -> HTTP 200
Step 3: Upload 10k changes to changeset 1 -> HTTP 409 (Changeset already closed)
Step 4: Close changeset 1
Step 5: Create changeset 2
Step 6: Upload 10k changes to changeset 2 -> HTTP 200
... etc.
Step 3 fails because we have already uploaded 10k changes in step 2, and a changeset can contain at most 10k changes, according to /api/capabilities.
It would be good if JOSM could somehow skip Step 3 above, since it's clear that the API will reject the request in any case.
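A client could avoid the doomed request by reading the per-changeset element limit once and keeping a running count of objects already sent to the current changeset. The following is only an illustrative sketch (not JOSM code); it assumes the limit is exposed as the maximum_elements attribute of the <changesets> element in /api/capabilities:

```java
import java.net.URL;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Illustrative sketch, not JOSM code: read the per-changeset element limit
// so the client can skip an upload that the API is bound to reject.
public final class ChangesetCapacity {
    public static int maximumElements(String apiBase) throws Exception {
        Element changesets = (Element) DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new URL(apiBase + "/api/capabilities").openStream())
                .getElementsByTagName("changesets").item(0);
        // Assumed response form: <changesets maximum_elements="10000"/>
        return Integer.parseInt(changesets.getAttribute("maximum_elements"));
    }
}
```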
Attachments (4)
Change History (46)
comment:1 by , 7 months ago
comment:2 by , 7 months ago
So I tried this with package size 9999 and posted the results below. It seems that JOSM isn't aware of the current number of changes in a changeset and applies the fixed, user-defined chunk size for each upload.
In this "9999" scenario, the second upload to a changeset would then fail:
2024-06-17 21:16:05.939 INFO: PUT http://localhost:31500/api/0.6/changeset/create (167 B) ...
2024-06-17 21:16:05.949 INFO: PUT http://localhost:31500/api/0.6/changeset/create -> HTTP/1.1 200 (9 ms; 26 B)
2024-06-17 21:16:05.949 INFO: OK
2024-06-17 21:16:06.118 INFO: POST http://localhost:31500/api/0.6/changeset/126407/upload (117 kB) ...
2024-06-17 21:16:06.586 INFO: POST http://localhost:31500/api/0.6/changeset/126407/upload -> HTTP/1.1 200 (467 ms; 50.6 kB)
2024-06-17 21:16:06.586 INFO: OK
2024-06-17 21:16:06.873 INFO: POST http://localhost:31500/api/0.6/changeset/126407/upload (120 kB) ...
2024-06-17 21:16:07.383 INFO: POST http://localhost:31500/api/0.6/changeset/126407/upload -> HTTP/1.1 409 (509 ms; 58 B)
2024-06-17 21:16:07.384 INFO: Conflict
2024-06-17 21:16:07.384 SEVERE: Error header: The changeset 126407 was closed at 2024-06-17 19:16:06 UTC
2024-06-17 21:16:07.486 INFO: PUT http://localhost:31500/api/0.6/changeset/126407/close (22 B) ...
2024-06-17 21:16:07.494 INFO: PUT http://localhost:31500/api/0.6/changeset/126407/close -> HTTP/1.1 200 (7 ms; 20 B)
2024-06-17 21:16:07.494 INFO: OK
2024-06-17 21:16:07.628 INFO: PUT http://localhost:31500/api/0.6/changeset/create (167 B) ...
2024-06-17 21:16:07.640 INFO: PUT http://localhost:31500/api/0.6/changeset/create -> HTTP/1.1 200 (11 ms; 26 B)
2024-06-17 21:16:07.640 INFO: OK
2024-06-17 21:16:07.810 INFO: POST http://localhost:31500/api/0.6/changeset/126408/upload (120 kB) ...
2024-06-17 21:16:08.313 INFO: POST http://localhost:31500/api/0.6/changeset/126408/upload -> HTTP/1.1 200 (502 ms; 50.6 kB)
2024-06-17 21:16:08.313 INFO: OK
2024-06-17 21:16:08.575 INFO: POST http://localhost:31500/api/0.6/changeset/126408/upload (121 kB) ...
2024-06-17 21:16:09.049 INFO: POST http://localhost:31500/api/0.6/changeset/126408/upload -> HTTP/1.1 409 (472 ms; 58 B)
2024-06-17 21:16:09.049 INFO: Conflict
2024-06-17 21:16:09.049 SEVERE: Error header: The changeset 126408 was closed at 2024-06-17 19:16:08 UTC
2024-06-17 21:16:09.153 INFO: PUT http://localhost:31500/api/0.6/changeset/126408/close (22 B) ...
2024-06-17 21:16:09.178 INFO: PUT http://localhost:31500/api/0.6/changeset/126408/close -> HTTP/1.1 200 (23 ms; 20 B)
2024-06-17 21:16:09.178 INFO: OK
...
comment:3 by , 7 months ago
Hmm, what's the desired behaviour with a chunk size of e.g. 100? Should JOSM create one CS for each chunk? I think so; otherwise I see no reason for this upload strategy (I've never used it).
comment:4 by , 7 months ago
Back in the day, mappers used a chunk size of 100 because uploads were still fairly slow. Instead of uploading everything at once, waiting for 1-60 minutes without a status update and hoping that the network connection would not break down, mappers used smaller chunk sizes. JOSM would then show a status update every 10 seconds or so, and the upload felt much smoother overall.
Today that's not so much of a concern anymore: even 10k changes can be uploaded reasonably fast in one go. Still, some mappers use small chunk sizes today, perhaps out of habit.
So even with a chunk size of 100, JOSM should try to upload as many changes as possible in a single changeset (that's 100 uploads). It could even keep track of all successful uploads and switch to a new changeset for upload 101.
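A rough sketch of that behaviour, with hypothetical helper methods (createChangeset, uploadChunk and closeChangeset are placeholders, not JOSM's actual upload classes): keep a running count per changeset and switch to a new changeset as soon as the next chunk would no longer fit.

```java
import java.util.List;

// Sketch only; OsmObject and the helper methods are placeholders, not JOSM API.
abstract class ChunkedUploader<OsmObject> {
    abstract long createChangeset();                              // PUT /api/0.6/changeset/create
    abstract void uploadChunk(long csId, List<OsmObject> chunk);  // POST .../changeset/{id}/upload
    abstract void closeChangeset(long csId);                      // PUT .../changeset/{id}/close

    void upload(List<List<OsmObject>> chunks, int maxChangesetSize) {
        int uploadedInCurrentCs = 0;
        long changesetId = createChangeset();
        for (List<OsmObject> chunk : chunks) {
            // Rotate to a new changeset before the next chunk would exceed the limit.
            if (uploadedInCurrentCs + chunk.size() > maxChangesetSize) {
                closeChangeset(changesetId);
                changesetId = createChangeset();
                uploadedInCurrentCs = 0;
            }
            uploadChunk(changesetId, chunk);
            uploadedInCurrentCs += chunk.size();
        }
        closeChangeset(changesetId);
    }
}
```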
comment:5 by , 7 months ago
Is there a test server with a (possibly much smaller) limit that can be used to test this? If yes, what do I have to do to use this? Can I simply try to upload some 25000 new objects again and again until I get an idea what to change?
follow-up: 7 comment:6 by , 7 months ago
You could use the dev instance on https://master.apis.dev.openstreetmap.org with the well known 10k changeset size limit.
In theory you could upload the same 25000 new objects as often as you like. However, since we have a rate limit on changeset uploads in place, it would be best to reach out to tomhughes and ask him to assign your OSM dev user the "importer" role. With this role you can upload up to 1 million changes per hour. Normal "newbie" users have a limit of 1000 changes/hour, which will not work in your case.
comment:7 by , 7 months ago
Replying to mmd:
ask him to assign your OSM dev user the "importer" role
Are these special roles documented somewhere?
comment:8 by , 7 months ago
We currently have administrator, moderator and importer roles. The "importer" role was only recently added and is also mentioned in the Import Guidelines: https://wiki.openstreetmap.org/wiki/Import/Guidelines#Using_a_Dedicated_User_Account_for_Imports
The moderator role is mainly relevant for DWG members, the administrator role e.g. for operations team members.
by , 7 months ago
Attachment: | 23738-wip.patch added |
---|
This seems to fix the problem but I think something is wrong with the progress monitor (no update after first close)
comment:9 by , 7 months ago
Milestone: | → 24.06 |
---|---|
Owner: | changed from | to
Status: | new → assigned |
comment:10 by , 7 months ago
@mmd OK, I've hit the upload limit again. How exactly do I contact tomhughes?
by , 7 months ago
Attachment: | 23738-2.patch added |
---|
also fixes problems when chunk size < 10000, but monitor still doesn't update properly
comment:11 by , 7 months ago
I think this works well now. If nobody complains I'll commit the patch tomorrow.
comment:12 by , 7 months ago
Mastodon would do (https://en.osm.town/@tomh), you can try #osm-dev on IRC, send an email to operations@… (for the whole OWG team), or create an operations issue: https://github.com/openstreetmap/operations/issues
comment:13 by , 7 months ago
Thank you for working on the patch. I also gave it a try on the dev instance with my mmd2mod user (1 million changes/h limit). The test was performed with your 23738-3.patch, which I applied on top of version 19125.
Overall the upload processing is much smoother now. The second upload attempt after a successful 10k upload no longer occurs.
47 changesets created for user mmd2mod: https://master.apis.dev.openstreetmap.org/user/mmd2mod
JOSM log file: https://gist.github.com/mmd-osm/8a3b39a68d9ae675a97f43e36de94063
comment:14 by , 7 months ago
Thanks for testing. Still, something must be wrong because when I try to upload exactly 30000 objects only 2 changesets are created and the last 10000 objects are not uploaded.
Edit: This also happens with the tested version 19096, so it's a bug somewhere else.
comment:17 by , 7 months ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
This doesn't yet fix the situation where a user uploads to a still-open changeset and the maximum object limit is reached.
comment:18 by , 7 months ago
@mmd: Maybe I have misunderstood the meaning of the 10,000 limit. I now have an open changeset on https://master.apis.dev.openstreetmap.org/changeset/369897 with 10,000 elements.
I would have expected it to be closed by the server?
Or will this only happen when I try to add more than 10,000 elements?
comment:19 by , 7 months ago
I think something's wrong on the dev server.
Using r19096, I just tried to add 10,000 objects to an open changeset https://master.apis.dev.openstreetmap.org/changeset/369906 which contains 1 node. This version tries to upload the 10,000 elements into the changeset:
2024-06-24 15:24:16.062 INFO: Starting upload with tags {created_by=JOSM/1.5 (19125 SVN en);JOSM/1.5 (19096 en), comment=no auto-close4, source=none}
2024-06-24 15:24:16.063 INFO: Strategy: CHUNKED_DATASET_STRATEGY, ChunkSize: 10000, Policy: none, Close after: false
2024-06-24 15:24:16.063 INFO: Changeset 369906: no auto-close4
2024-06-24 15:24:16.098 INFO: Message notifier inactive
2024-06-24 15:24:16.166 INFO: PUT https://master.apis.dev.openstreetmap.org/api/0.6/changeset/369906 (438 B) ...
2024-06-24 15:24:16.322 INFO: PUT https://master.apis.dev.openstreetmap.org/api/0.6/changeset/369906 -> HTTP/1.1 200 (77 ms; 389 B)
2024-06-24 15:24:16.323 INFO: OK
2024-06-24 15:24:16.475 INFO: POST https://master.apis.dev.openstreetmap.org/api/0.6/changeset/369906/upload (824 kB) ...
2024-06-24 15:24:22.475 INFO: POST https://master.apis.dev.openstreetmap.org/api/0.6/changeset/369906/upload -> HTTP/1.1 409 (2.0 s; 58 B)
2024-06-24 15:24:22.477 INFO: Conflict
2024-06-24 15:24:22.477 SEVERE: Error header: The changeset 369906 was closed at 2024-06-24 13:24:20 UTC
In fact the changeset is still open and contains only the one node.
comment:20 by , 7 months ago
An upload is always either applied completely or not at all in case of an error. In your example there is already 1 change in the changeset, so you cannot upload another 10,000 into the same changeset, because this would exceed the 10,000 maximum. The API rejects the upload altogether and the changeset remains open as it was before the failed upload attempt. You could either upload fewer changes or start with a new changeset.
I think the error message may be a bit confusing; IIRC the Rails port shows the same behavior.
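Put differently, a client that wants to reuse an open changeset could first work out how much room is left and size the next chunk accordingly. A minimal sketch, assuming the current change count is available from the changeset metadata and the limit from /api/capabilities (names are illustrative, not JOSM's):

```java
// Sketch only, not JOSM code: how many objects still fit into an open changeset?
// changesCount is assumed to come from the changeset metadata,
// maxChangesetSize from /api/capabilities.
final class OpenChangesetCapacity {
    static int objectsThatStillFit(int maxChangesetSize, int changesCount, int pendingObjects) {
        int remaining = maxChangesetSize - changesCount;
        if (remaining <= 0) {
            return 0;  // full: open a new changeset instead of sending a doomed upload
        }
        return Math.min(pendingObjects, remaining);  // fill the open changeset, continue in a new one
    }
}
```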
comment:21 by , 7 months ago
My understanding was that a changeset with exactly 10,000 objects is automatically closed by the server. This also doesn't happen.
comment:22 by , 7 months ago
Regarding your previous question: you could still change the changeset metadata (hashtags) even when you have already uploaded 10,000 changes. What you can't do is upload further changes now.
If it helps you can also take a look at the implementation: https://github.com/zerebubuth/openstreetmap-cgimap/blob/master/src/backend/apidb/changeset_upload/changeset_updater.cpp#L54
comment:23 by , 7 months ago
OK, the link to the source helps. I was completely misled by the error header, which claims that the changeset was closed. I'll see if I can also change the code which writes to an open changeset, but I see many old problems in that area.
comment:24 by , 7 months ago
The original Rails code also throws the same error message when there is a genuine timeout or when there are too many changes in a changeset. It would have been easier to understand if both concepts were not lumped together in a single open status.
https://github.com/openstreetmap/openstreetmap-website/blob/0c4c3cfcd46ea2c314dabcfe05b3f8a0d3430359/app/models/concerns/consistency_validations.rb#L34
https://github.com/openstreetmap/openstreetmap-website/blob/0c4c3cfcd46ea2c314dabcfe05b3f8a0d3430359/app/models/changeset.rb#L75
comment:25 by , 7 months ago
The phrase "The changeset x was closed" probably means that the update connection was closed, not the changeset itself.
comment:26 by , 7 months ago
23738-4.patch separates the case where the server responded with "changeset closed" from the case where JOSM recognizes that this would happen.
It still handles only the upload in chunks.
Several problems are unsolved, but are not really related to this ticket:
- When selecting an open changeset that is too full JOSM should not offer to "Upload all objects in one request". I think best would be to disable the option in this case.
- The number of objects already in the open changeset is not taken into account when the number of chunks is calculated (see the sketch after this list)
- A different code path in UploadLayerTask is used when a layer with changes is closed and the user decides to upload. This just shows an error popup and cancels the closing of the layer.
- When uploading a large number of objects with the strategy "Upload each object individually" it takes very long and the "Cancel" button doesn't seem to work.
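Regarding the chunk calculation, one possible way to account for objects already in the open changeset is sketched below (illustrative only; variable names are not JOSM's):

```java
// Illustrative chunk-count calculation, not JOSM code: account for objects
// already present in the open changeset that is being reused.
final class ChunkCount {
    static int chunksNeeded(int total, int chunkSize, int maxChangesetSize, int existingObjects) {
        int remainingInOpenCs = Math.max(0, maxChangesetSize - existingObjects);
        int firstChunk = Math.min(total, Math.min(chunkSize, remainingInOpenCs));
        int rest = total - firstChunk;
        return (firstChunk > 0 ? 1 : 0) + (rest + chunkSize - 1) / chunkSize;
    }
}
```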
follow-up: 29 comment:27 by , 7 months ago
When selecting an open changeset that is too full JOSM should not offer to "Upload all objects in one request". I think best would be to disable the option in this case.
Or maybe better: don't allow selecting an open changeset which cannot take all objects. Examples:
- Have an open CS with 9995 changes and a new upload with up to 5 objects: no problem
- Have an open CS with 9995 changes and a new upload with more than 5 objects: reject the selection of the open changeset
comment:29 by , 7 months ago
Replying to GerdP:
When selecting an open changeset that is too full JOSM should not offer to "Upload all objects in one request". I think best would be to disable the option in this case.
Or maybe better: Don't allow to select an open changeset which cannot take all objects. Examples:
- Have an open CS with 9995 changes and a new upload with up to 5 objects: no problem
- Have an open CS with 9995 changes and a new upload with more than 5 objects: reject the selection of the open changeset
I think it should be possible to fill the open CS first. What do you do if you have an open CS with 9995 changes and you want to upload another 10004 objects? Maybe you do not want to close the last CS after upload but continue editing and then upload to the open CS again.
comment:30 by , 7 months ago
Are you thinking about another strategy which doesn't ask for the chunk size and just fills the CS, opening a new one if needed?
comment:31 by , 7 months ago
Yes, I think there should be no difference between uploading to an open CS and uploading to a new CS. Once the CS is full, a new CS is created; making assumptions and calculations to work around the limit is probably not ideal. If "Upload all objects in one request" does not work because the open CS's object limit would be reached, simply fill the open CS with a smaller chunk and then continue in a new CS.
comment:32 by , 7 months ago
Yes, I guess that's what most users would want when they choose an open changeset. I think my last patch is close to this, but the texts/options in the dialog simply don't work well when the number of changes doesn't fit into one CS.
Maybe change "Upload all objects in one request" to "Minimize number of uploads"?
comment:33 by , 7 months ago
The second option "Upload objects in chunks of size:" has a similar problem. How about "Upload objects in minimal requests" and "Upload objects in chunks of maximal size:"?
comment:34 by , 7 months ago
I don't want to change anything which requires new translations now, and I'll be on a longer cycle tour soon, so unless someone else wants to continue, this probably has to wait a few weeks.
comment:35 by , 7 months ago
As #23754 also still needs to be fixed, I think waiting for better wording should not be a problem.
Enjoy your holidays. I hope you have pleasant weather.
comment:42 by , 3 weeks ago
Milestone: | 24.12 → 25.01 |
---|
Maybe JOSM should not allow the chunk size to be equal to the server limit? I assume that with chunk size = 9999 everything works fine?