Domino on Linux/Unix, Troubleshooting, Best Practices, Tips and more ...

 

Daniel Nashed

 

Traveler Sync Issue with more than one device

Daniel Nashed  5 February 2019 13:24:24

We ran into a situation where secondary devices that are not used all the time were missing mails, contacts and events.
This was a long-running support ticket, because it was very difficult to provide data from the time the problem initially occurred.

It turned out that this is caused by a bug in the way the cache works. The cache is removed after the device has been inactive (by default for 24 hours), and the next sync after the device comes back is affected by this.

The fix is in Traveler 10.0.0 and higher. Traveler 10 is the next version after 9.0.1.21 and works on a Domino 9.0.1 server with current FPs (I would recommend using the latest IF for FP10).
In contrast to Domino 10, Traveler 10 is an incremental release -- even though it has some new features. So installing the Traveler 10.0.1 release on your Domino 9.0.1 FP10 server is perfectly OK.

For some internal reason the fix was not included in the fixlist at first, but the fixlist was updated at the end of last month.
See description of the fix here --> https://www.ibm.com/support/docview.wss?uid=swg1LO93818

From what we see, this does not only happen if the Traveler server was shut down, but also when all devices for a user are offline.

To figure out if you have the issue, there is a command "DbRecordsCheck" that you can run on your Traveler server. This check takes a while and goes through all sync state entries for all users and devices.
It will tell you which users have missing device records by comparing the table of documents that should be synced with what actually is synced.

You can also take a dump for an individual user and check the dumped data for missing "DB records".

Example:
tell traveler dump daniel nashed

Check the dump for lines that look like this:

  100000000000181001: ApplDMPT12XYZABC DB record was not found for this device.  LGUID: 100000000035031204 Type: 100000000000000401 (Event)
  100000000035510212: 6978dbc6ffab4180a1e1c7f16d42f70e timeSyncInDevice: 1543308447 (11/27/2018 09:47:27) timeSent: 1543308447 (11/27/2018 09:47:27) DeviceRecordId: 100000000035031204 tsTaggedForSlowSync: 0 mChangeData: 0 mChangeMove: 0
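If the dump is large, scanning it by hand gets tedious. The following is a minimal Python sketch that picks out the "DB record was not found" lines. The regular expression is modelled only on the sample line above, so the real dump layout may need adjustments, and the function name find_missing_records is just an illustrative helper, not part of Traveler.

```python
import re

# Pattern derived from the sample dump line above (an assumption, not an
# official format description): record ID, device ID, message, LGUID, type.
MISSING_RE = re.compile(
    r"^\s*(?P<record>\d+):\s+(?P<device>\S+)\s+"
    r"DB record was not found for this device\.\s+"
    r"LGUID:\s+(?P<lguid>\d+)\s+"
    r"Type:\s+(?P<type_id>\d+)\s+\((?P<type_name>[^)]+)\)"
)


def find_missing_records(lines):
    """Return one dict per dump line that reports a missing DB record."""
    hits = []
    for line in lines:
        match = MISSING_RE.match(line)
        if match:
            hits.append(match.groupdict())
    return hits


# Sample lines taken from the dump excerpt above.
sample = [
    "  100000000000181001: ApplDMPT12XYZABC DB record was not found for this device.  "
    "LGUID: 100000000035031204 Type: 100000000000000401 (Event)",
    "  100000000035510212: 6978dbc6ffab4180a1e1c7f16d42f70e timeSyncInDevice: 1543308447 "
    "(11/27/2018 09:47:27) timeSent: 1543308447",
]

for hit in find_missing_records(sample):
    print(hit["device"], hit["lguid"], hit["type_name"])
```

For a real dump you would pass the lines of the dump file instead of the sample list, for example by opening the file and handing the file object to find_missing_records.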

To check all your users at once, however, the DbRecordsCheck command is the right way.

It comes in two different modes:

1. just check the records and show affected users
2. check the records and if missing records are identified reset the device


Our approach was to first run the check for all users. From the resulting list we picked the VIP users and the users we knew had been on the road, and reset them manually.

Example:
tell traveler reset ApplDMPT12XYZABC daniel nashed


To only check the records and show the affected users:

Example:
tell traveler DbRecordsCheck show 2500

Or, if you want to repair directly by resetting the affected users:

Example:
tell traveler DbRecordsCheck repair 2500

The number is the maximum number of users that should be checked/fixed.

See https://www.ibm.com/support/docview.wss?uid=swg1LO87614 for reference.

The result looks like this:

-- snip --

10.12.2018 12:41:41   Traveler: IBM Traveler Database is checking the records for 2202 accounts...
10.12.2018 15:02:38   Traveler: 316 out of 2202 accounts were missing records and may need to be reset.
10.12.2018 15:02:38   Traveler: Command DbRecordsCheck Show complete.

The error for a user looks like this:

10.12.2018 12:44:08   Traveler: CN=xyz.../O=Acme with account ID 100000000001234567 is missing at least one Traveler database record for a device but not all devices.  The first encountered record to be missing has LGUID 200000000012345678 and was not found


-- snip --
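With several hundred affected accounts it helps to pull the user names out of the console log automatically. Here is a small Python sketch that does this. Both regular expressions are based only on the two message formats shown in the excerpt above, and parse_check_output is a hypothetical helper name, so treat it as a starting point rather than a finished tool.

```python
import re

# Both patterns are assumptions derived from the log excerpt above.
USER_RE = re.compile(
    r"Traveler: (?P<user>.+?) with account ID (?P<account>\d+) "
    r"is missing at least one Traveler database record"
)
SUMMARY_RE = re.compile(
    r"Traveler: (?P<affected>\d+) out of (?P<total>\d+) accounts were missing records"
)


def parse_check_output(lines):
    """Collect (user, account ID) pairs and the final (affected, total) summary."""
    users = []
    summary = None
    for line in lines:
        match = USER_RE.search(line)
        if match:
            users.append((match.group("user"), match.group("account")))
            continue
        match = SUMMARY_RE.search(line)
        if match:
            summary = (int(match.group("affected")), int(match.group("total")))
    return users, summary


# Sample lines taken from the console output above.
log_sample = [
    "10.12.2018 12:41:41   Traveler: IBM Traveler Database is checking the records for 2202 accounts...",
    "10.12.2018 12:44:08   Traveler: CN=xyz.../O=Acme with account ID 100000000001234567 "
    "is missing at least one Traveler database record for a device but not all devices.",
    "10.12.2018 15:02:38   Traveler: 316 out of 2202 accounts were missing records and may need to be reset.",
]

affected_users, totals = parse_check_output(log_sample)
for user, account in affected_users:
    print(user, account)
print(totals)
```

The resulting list of users can then be compared against your own list of VIP or travelling users before you decide who to reset manually.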

The command runs for a while (it could be 1 hour or more for 1000 users) and checks one user after another.
So if you are concerned about resetting too many users at once: the resets are spread out over time simply because of the time the analysis takes.
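To get a feeling for the runtime on your own server, you can work it out from the first and last timestamp of a run. The short Python calculation below uses the numbers from the sample output above. It assumes the console timestamps are in day.month.year format; since both timestamps are on the same day, the elapsed time does not depend on that assumption.

```python
from datetime import datetime

# Timestamp layout as seen in the console output above (assumed day.month.year).
FMT = "%d.%m.%Y %H:%M:%S"

start = datetime.strptime("10.12.2018 12:41:41", FMT)  # "is checking the records" line
end = datetime.strptime("10.12.2018 15:02:38", FMT)    # "DbRecordsCheck Show complete" line
accounts = 2202                                        # number of accounts in this run

elapsed_seconds = (end - start).total_seconds()
seconds_per_account = elapsed_seconds / accounts
minutes_per_1000 = seconds_per_account * 1000 / 60

print(f"elapsed: {elapsed_seconds:.0f} s")
print(f"per account: {seconds_per_account:.2f} s")
print(f"per 1000 accounts: {minutes_per_1000:.0f} min")
```

For this run that works out to about 3.8 seconds per account, or roughly 64 minutes per 1000 accounts, which matches the rough estimate of one hour or more for 1000 users. Your numbers will depend on server load and mail file sizes.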


Conclusion/Recommendation:

If you are concerned that you might have this issue, you should run DbRecordsCheck show first.
If you have users facing this issue, you should upgrade to Traveler 10.0.1 first and afterwards run the DbRecordsCheck repair command or reset users/devices individually.

If a user is in a good network location, it will take a couple of seconds to resync a device.
But you should take care when users are on the road with a slow network connection!


