
3040 limit calls to unavailable server

Guillaume Jacquart requested to merge 3040-limit_calls_to_unavailable_server into main

Description

Notes is flooding the Murena.io servers in a situation that still needs to be clarified and reproduced.

      2 "PUT /index.php/apps/notes/api/v1/notes/292528242? HTTP/1.1" 404 2800 "-" "eOS (Android) Owncloud-android"
     17 "GET /index.php/apps/notes/api/v1/notes?pruneBefore=0 HTTP/1.1" 404 20902 "-" "eOS (Android) Owncloud-android"
     45 "GET /index.php/apps/notes/api/v1/notes?pruneBefore=0 HTTP/1.1" 404 21270 "-" "eOS (Android) Owncloud-android"
     48 "PUT /index.php/apps/notes/api/v1/notes/292528242? HTTP/1.1" 404 3031 "-" "eOS (Android) Owncloud-android"
     62 "PUT /index.php/apps/notes/api/v1/notes/503143317? HTTP/1.1" 404 3031 "-" "eOS (Android) Owncloud-android"
     62 "PUT /index.php/apps/notes/api/v1/notes/504377264? HTTP/1.1" 404 3031 "-" "eOS (Android) Owncloud-android"
     62 "PUT /index.php/apps/notes/api/v1/notes/506244067? HTTP/1.1" 404 3031 "-" "eOS (Android) Owncloud-android"
     62 "PUT /index.php/apps/notes/api/v1/notes/81678268? HTTP/1.1" 404 3031 "-" "eOS (Android) Owncloud-android"
   3212 "POST /index.php/apps/notes/api/v1/notes? HTTP/1.1" 404 3031 "-" "eOS (Android) Owncloud-android"

Looking at the code (it.niedermann.owncloud.notes.persistence.NotesServerSyncTask, pushLocalChanges() and the pull method), the Notes app starts a sync by:

  1. PUT the notes that have ids. There are 5 different notes with an id on this account, and we see 62 attempts.
  2. On a 404 error on the PUT, it tries a POST for the same note (62*5 = 310 POSTs for these).
  3. POST the notes without an id: (3212 - 310)/62 = 46.8. The number doesn't match exactly, but it would make sense if there are around 50 notes in the Notes app.

Then the pushing phase is done, Notes pulls:

17 + 45 = 62 calls to notes?pruneBefore=0.

Notes appears to react to these 404s by marking the notes as errored and continuing with the next note.

As the Notes app has a worker that runs every 15 minutes, and also runs a sync when the app is resumed, 62 syncs is coherent for one day (62/4 ≈ 15: 15 hours with the phone awake).
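The estimates above can be checked with a few lines of arithmetic. This is only a sketch: the counts are copied from the access-log excerpt, and the variable names are made up for illustration.

```python
# Counts copied from the access-log excerpt in this description.
syncs = 17 + 45                 # GET notes?pruneBefore=0, one per sync
notes_with_id = 5               # distinct note ids seen in the PUT lines
total_posts = 3212              # POST /notes

# Each note with an id is PUT once per sync, then re-sent as a POST on 404.
estimated_puts = syncs * notes_with_id
posts_after_failed_put = estimated_puts

# The remaining POSTs, spread over the syncs, are notes without an id.
notes_without_id = (total_posts - posts_after_failed_put) / syncs

# A 15-minute worker runs 4 times per hour.
hours_awake = syncs / 4

# Estimated request total for the day: PUTs + POSTs + GETs.
estimated_total = estimated_puts + total_posts + syncs

# The PUT lines actually logged are slightly below the estimate,
# because one note shows 2 + 48 = 50 hits instead of 62.
logged_puts = 2 + 48 + 4 * 62
logged_total = logged_puts + total_posts + syncs

print(f"syncs: {syncs}")
print(f"POSTs following a failed PUT: {posts_after_failed_put}")
print(f"notes without id: {notes_without_id:.1f}")
print(f"hours awake: {hours_awake}")
print(f"estimated total: {estimated_total}, logged total: {logged_total}")
```

The estimated total (3584) assumes every note with an id was PUT on every sync; the logged lines add up to slightly less.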

Technical details

A 404 error on a POST request is a paradox: the request creates a note, so "not found" cannot refer to the note itself. We can use it as a signal that the server isn't in the state we expect.

Two actions in this MR:

  1. Fail fast when a POST of a note returns a 404: break the synchronization loop. In the scenario above, this would reduce the number of requests from 3584 to 124 (62 failing PUTs, then 62 failing POSTs).
  2. Add an exponential backoff (limited to 24 hours) that blocks synchronization when a 404 on POST is encountered. This would reduce the number of requests from 3584 to 2.
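The two actions can be illustrated with a small simulation. This is a minimal sketch and not the code of this MR: the Notes app itself is written in Java, and every name below (`Backoff`, `push_local_changes`, `sync`, `send`) is hypothetical. The 15-minute base delay is an assumption; only the 24-hour cap comes from the description above.

```python
from dataclasses import dataclass

MAX_BACKOFF_SECONDS = 24 * 60 * 60   # cap stated in this MR
BASE_BACKOFF_SECONDS = 15 * 60       # assumption: one worker period


class ServerUnavailable(Exception):
    """A POST returned 404: the Notes API is not reachable as expected."""


@dataclass
class Backoff:
    failures: int = 0
    blocked_until: float = 0.0

    def delay(self) -> int:
        # Doubles with each consecutive failure, capped at 24 hours.
        exponent = max(self.failures - 1, 0)
        return min(BASE_BACKOFF_SECONDS * 2 ** exponent, MAX_BACKOFF_SECONDS)

    def record_failure(self, now: float) -> None:
        self.failures += 1
        self.blocked_until = now + self.delay()

    def record_success(self) -> None:
        self.failures = 0
        self.blocked_until = 0.0

    def allows_sync(self, now: float) -> bool:
        return now >= self.blocked_until


def push_local_changes(notes, send):
    """Push notes; `send(method, note)` returns an HTTP status code."""
    for note in notes:
        if note["remote_id"] is not None:
            if send("PUT", note) != 404:
                continue
        # No remote id, or the PUT returned 404: try to create the note.
        if send("POST", note) == 404:
            # Action 1: fail fast instead of moving on to the next note.
            raise ServerUnavailable()


def sync(notes, send, backoff, now):
    """Run one sync unless the backoff window is still open."""
    if not backoff.allows_sync(now):
        return False
    try:
        push_local_changes(notes, send)
    except ServerUnavailable:
        # Action 2: block further syncs for an increasing delay.
        backoff.record_failure(now)
        return False
    backoff.record_success()
    return True
```

With a server that answers 404 to everything, the first sync sends one PUT and one POST and then stops; a sync attempted before the delay has elapsed sends nothing.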

Tests

This fix is intended for Notes instances with an account set up prior to the outage. I managed to test using a Murena cloud test account with the Notes app disabled, mocking (in the code) the first getNotesId call (which returns 404 and blocks Notes initialisation). With these prerequisites, the device is in the situation of receiving 404 on POST requests and flooding.

Then it is possible to follow test case #18.

Issues

https://gitlab.e.foundation/e/os/backlog/-/issues/3040

10 commandments of code reviews

👪 ❤️ https://gitlab.e.foundation/internal/wiki/-/wikis/mobile-team/guidelines/Code-review

Edited by Guillaume Jacquart
