fix(activesync): separate PING watermark from SYNC modseq to stop iOS…#41
Conversation
… mail loops x(activesync): stop iOS PING loops with separate SYNC and PING watermarks Email collections used one modseq for both SYNC (CHANGEDSINCE) and PING (IMAP STATUS), and PING treated sync_pending as a change signal. That caused infinite ~3s PING loops when SYNC modseq lagged behind the server, and when a MOREAVAILABLE batch remained in sync_pending. Introduce a three-state model: SYNC watermark ($_status), PING watermark ($_pingStatus / ps in sync_data), and sync_pending (SYNC only). PING now polls IMAP STATUS via pingModseq(), advances the checkpoint on detection, and saves with preservePending so in-flight MOREAVAILABLE batches survive. Add unit tests for checkpoint serialization, PING-vs-SYNC modseq separation, and preservePending saves.
fix(activesync): reject corrupt collection sync_data on load and save
Treat deserialized empty arrays and non-Folder_Base blobs as missing state
instead of using them as folder objects. Refuse save() when _folder is not a
valid Folder_Base instance to prevent writing a:0:{} into horde_activesync_state.
…export fix(activesync): repair initial sync, corrupt state, and nested MIME export Follow-up to the PING/SYNC watermark split. Several bugs still blocked reliable mail delivery on iOS ActiveSync: - Reject empty or non-folder sync_data on load and refuse to persist invalid collection state on save, preventing corrupted rows from breaking PING/SYNC and new-mail push. - Defer marking initial sync complete until exported UIDs are acknowledged; fix unserialize treating haveInitialSync=false as complete when the hi key is present. - Stop the SYNC export loop from treating failed exports as progress and from dropping sync_pending entries on retriable backend errors. - Retry IMAP body fetches without BODY[].SIZE when the server returns no data for deeply nested MIME parts (Dovecot), so signed messages with inline PGP key material export correctly. - Treat application/pgp-keys like S/MIME signatures for attachment detection and log message build failures instead of silently omitting messages from export. Add unit tests for initial-sync acknowledgement, corrupt state handling, preservePending saves, nested MIME body fetch fallback, and PGP-key MIME structures.
…ow locks Parallel SYNC and PING workers could load and save the same sync_key concurrently, overwriting sync_data and sync_pending and causing Invalid sync_data corruption during heavy initial sync. Hold a row lock from state load through save() or updateSyncStamp(): SQL uses SELECT ... FOR UPDATE in a transaction; Mongo uses a sync_lock field with findAndModify(), including stale-lock recovery. Release locks on collection switch, save, stamp-only updates, and __destruct(). Add updateSyncStamp() to State_Mongo for parity with SQL. Add unit tests under StateTest/Sql and StateTest/Mongo (RowLockTest, InitialSyncTest).
8901cd1 to
c7e372f
Compare
ralflang
left a comment
There was a problem hiding this comment.
Please rebase before adding code to this. I will put it back into draft status now.
…ruption Add per-collection locking so concurrent PING/SYNC workers for the same folder cannot interleave writes to horde_activesync_state: - migration 24: horde_activesync_collection_lock (SQL) and HAS_collection_lock (Mongo) - acquire/release collection locks around loadState(), save(), and updateServerIdInState() - keep existing row locks (SELECT FOR UPDATE) held through save() Tighten SQL/Mongo state writes: - scope UPDATE/INSERT to sync_key, sync_folderid, sync_devid, and sync_user - persist sync_pending as TEXT; only sync_data uses Horde_Db_Value_Binary - wrap persist in transactions; retry concurrent INSERT as UPDATE - preserve sync_pending on PING watermark saves (preservePending) Detect and recover corrupt collection state: - normalize invalid sync_data on load (reinitialize folder object) - refuse to persist empty/array-shaped sync_data or non-folder objects (StaleState) Tests: SQL/Mongo collection lock, SQL/Mongo row lock, SQL preservePending saves, corrupt sync_data load/save guards.
Treat clearly invalid collection sync_data (arrays, pending-shaped blobs) as StaleState on load so SYNC/PING reset the collection and return STATUS_KEYMISMATCH instead of silently reinitializing at the same synckey. Missing/unparseable blobs still rebuild folder state in place.
|
@ralflang please review |
|
SYNCSTAMP_UPDATE_THRESHOLD and STATE_ROW_LOCK_STALE_SECONDS defined in Base.php but also Sql.php and Mongo.php - harmless but unnecessary and potential source of future trouble. _acknowledgeExportedChange() only handles CHANGE/DRAFT, not ADD. For initial sync the messages come through as ADDs, but the code path that primes Did you test
To make sure it's a viable roundtrip without data loss? Overall I think it's OK to merge. |
|
horde migrate 24 id used to create my table. downgrade to 23 I did not test, but it is trivial : up adds a table, and down drops it. no modification on existing tables. |
|
thanks for the review, wil ladjust the code slightly to address it. |
…c ACK Move SYNCSTAMP_UPDATE_THRESHOLD and STATE_ROW_LOCK_STALE_SECONDS to State_Base and remove duplicate definitions from State_Sql and State_Mongo. Clarify in _acknowledgeExportedChange() that wire-format SYNC Add maps to CHANGE_TYPE_CHANGE after Exporter_Sync::_getNextChange() normalizes bare UIDs from the driver initial-sync short-circuit. Add InitialSyncTest::testInitialSyncBareUidAcknowledgesExportedMessage() to cover sync_pending entries stored as bare UIDs.
Release version 3.0.0-RC1 Merge pull request #41 from horde/fix/ActiveSync_loops fix(activesync): consolidate state constants and document initial-sync ACK fix(ActiveSync): force KEYMISMATCH when loading corrupt sync_data Fix: Filename typo fix(ActiveSync): harden state persist against races and sync_data corruption docs(ActiveSync): clarify row lock behavior and sendNextChange contract fix(activesync): serialize parallel state access with SQL and Mongo row locks fix(activesync): repair initial sync, corrupt state, and nested MIME export fix(activesync): reject corrupt collection sync_data on load and save fix(activesync): separate PING watermark from SYNC modseq to stop iOS mail loops refactor(ActiveSync): use HordeString instead of Horde_String in METHOD handling Update MeetingRequest.php Simplify method retrieval in fromvEvent Refactor METHOD handling in EasMessageBuilder fix(Iterator): use void return type for next() instead of ReturnTypeWillChange fix(Device): restore isset guard for OS property access Fix php and phpunit deprecations
Fix ActiveSync PING/SYNC loops, state corruption, and initial-sync reliability
Branch:
fix/ActiveSync_loops→FRAMEWORK_6_0Repository: horde/ActiveSync
HEAD:
22a44863(8 commits, 22 files, +3181 / −221 lines)Summary
This branch fixes several interacting ActiveSync 3.0 regressions that caused iOS PING loops,
Invalid sync_data/ KEYMISMATCH cycles, stucksync_pending(MOREAVAILABLE) batches, and missing mail on initial sync. The changes introduce a clearer separation between SYNC and PING watermarks, harden state persistence against concurrent workers, validate collection state on load/save, and improve MIME export for nested/signed messages.Production testing on Horde 6 / activesync 3.0.0-beta3 reports significantly improved stability after these fixes.
Problems addressed
1. iOS PING loops (~3s wake/sleep cycle)
Email collections used a single modseq for both:
CHANGEDSINCE/$_status)STATUSpolling)Additionally, PING treated
sync_pending(in-flight MOREAVAILABLE batches) as a change signal. When SYNC modseq lagged behind the server, or a partial batch remained insync_pending, PING would wake repeatedly without delivering mail.2. State corruption under concurrency
Parallel SYNC and PING workers could load and save the same
sync_keyconcurrently, overwritingsync_dataandsync_pending. This producedInvalid sync_datacorruption, especially during heavy initial sync on large mailboxes.3. Corrupt
sync_datawritten or silently acceptedEmpty arrays (
a:0:{}), non-folder blobs, and FOLDERSYNC-shaped data could be treated as valid collection state, breaking subsequent PING/SYNC and blocking new-mail push.4. Initial sync and MIME export gaps
sync_pendingentries on retriable backend errors.Solution
PING / SYNC watermark separation
Introduce a three-state model for mail collections:
$_statusCHANGEDSINCE)$_pingStatusSTATUSviapingModseq())sync_pendingPING now:
STATUSonly (does not usesync_pendingas a change signal)$_pingStatuson detectionpreservePending => trueso in-flight MOREAVAILABLE batches survive PING savesFiles:
State/Base.php,Folder/Imap.php,Imap/Adapter.phpConcurrency control
Row locks (device + sync_key level):
SELECT … FOR UPDATEheld from load throughsave()/updateSyncStamp()sync_lockfield withfindAndModify(), including stale-lock recoveryCollection locks (per folder):
horde_activesync_collection_lock(SQL) /HAS_collection_lock(Mongo)loadState(),save(), andupdateServerIdInState()save()Files:
State/Sql.php,State/Mongo.php,migrations/24_horde_activesync_addcollectionlock.phpState validation and recovery
_normalizeSyncFolderData()— missing/unparseable blobs are rebuilt; clearly corrupt payloads raiseStaleState_assertValidSyncFolderBeforeSave()/_assertSyncDataBlob()— refuse to persist invalid folder objects or empty/array-shaped blobssync_dataforces a collection resync viaSTATUS_KEYMISMATCHinstead of silently reinitializing at the same synckeyFiles:
State/Base.php,Collections.php,Request/Sync.phpSafer persistence (SQL / Mongo)
UPDATE/INSERTtosync_key,sync_folderid,sync_devid,sync_usersync_pendingas TEXT; onlysync_datausesHorde_Db_Value_BinaryINSERTasUPDATEInitial sync and export reliability
haveInitialSync=falsebeing treated as complete when thehikey is presentBODY[].SIZEwhen nested parts return no data (Dovecot)application/pgp-keyslike S/MIME signatures for attachment detectionFiles:
Connector/Exporter/Sync.php,Folder/Imap.php,Imap/MessageBodyData.php,Mime.php,Mime/Iterator.php,Request/Sync.phpCommits (oldest → newest)
ba01d163preservePendingon PING saved5038717sync_dataon load and saveb1c288f5e480db3ac7e372fbsendNextChangecontract)0dfff8d2161d301e22a44863sync_dataDatabase migration
Run Horde migrations after merge:
Adds collection-lock storage required for per-folder serialization.
Tests
New/expanded unit tests under
test/Horde/ActiveSync/StateTest/:StateSqlPreservePendingTestsync_pending; watermark separationSql/RowLockTest,Mongo/RowLockTestSql/CollectionLockTest,Mongo/CollectionLockTestSql/InitialSyncTestImapFolderTest,ImapAdapterTestMessageBodyDataTest,MimeTestProduction validation
Tested on Horde 6 deployment with:
Invalid sync_dataafter fixes + PHP-FPM reloadRelated work (not in this branch)
PostgreSQL deployments may also need a complementary
horde/dbfix for PDO multi-LOB binding: whensync_dataandsync_pendingare bound as LOBs in a singleexecute(), PostgreSQLbyteacan be corrupted. That amplifies the state bugs fixed here. Track separately in horde/Db.Window size configuration (
maximumwindowsize/maximumrequestwindowsize= 0) is orthogonal but increases batch size andMOREAVAILABLEpressure. Recommend finite values (25–100) in productionconf.phpregardless of this merge.Test plan
StateTestsuite on SQL (PostgreSQL/MySQL) and Mongo backendssync_pendingclears across multiple SYNC rounds; PING does not false-wake on pending batchInvalid sync_datain logssync_datarow → client receivesSTATUS_KEYMISMATCHand resyncs cleanlyDeployment notes
Invalid sync_data,KEYMISMATCH, andCOLLECTIONS: Found changes!without subsequent SYNC