Troubleshooting Lync CMS Replication

Recently, I ended up troubleshooting Lync CMS replication in our internal environment. We tried many different things to resolve the issue, but ultimately it came down to some broken pieces in the management components themselves. Let’s talk about the symptoms and, finally, the resolution.

I’m ashamed to say that I don’t always keep track of my Lync environment. I’m not watching the event logs daily and I’m not often checking the output of “Get-CsManagementStoreReplicationStatus”. I know. I feel the shame. Don’t mock me.

Anyway, about two weeks ago, we had an issue. I needed to make a change, but I couldn’t get the change to take effect. It turned out my Lync Server Replica Replicator Agent was crashing. Every time the topology watcher started, the agent would crash. Hard.

So I did what most engineers do and asked Dr. Google and his cousin Mr. Bing for answers. I found this wonderful article from my Canadian friend, The Hoff.

I followed that article and sure enough, that fixed my Lync Server Replica Replicator Agent.

But it’s never that easy, is it?

That fix above somehow broke my Lync Server Master Replicator Agent. The symptoms were consistent. I would restart the service and within 50 seconds, it would crash again.


The Lync Server Application Logs showed a consistent pattern:

Event 2003 – LS Master Replicator Agent Service – Starting

Event 2004 – LS Master Replicator Agent Service – Started

Pause 20 seconds

Event 2021 – LS Master Replicator Agent Service – Successfully read CMS

Event 2033 – LS Master Replicator Agent Service – Running in Active Mode

Event 2008 – LS Master Replicator Agent Service – Successfully connected to back-end

Pause 20 seconds

Event 2012 – LS Master Replicator Agent Service – Topology Watcher

CRASH – Event 2007 – LS Master Replicator Agent Service – Unhandled Exception – CRASH
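If you would rather see that timing from the shell than click through Event Viewer, a small helper along these lines might do it. This is only a sketch: `Get-EventGap` is a hypothetical function name, and the usage line assumes the events sit in an event log named “Lync Server” under the source name shown in the list above.

```powershell
# Hypothetical helper: sort events by time and report the seconds elapsed
# since the previous event, to make the "pause, pause, crash" pattern visible.
function Get-EventGap {
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        $InputObject
    )
    begin   { $collected = @() }
    process { $collected += $InputObject }
    end {
        $previous = $null
        foreach ($e in ($collected | Sort-Object TimeCreated)) {
            $gap = 0
            if ($null -ne $previous) {
                $gap = [int](($e.TimeCreated - $previous.TimeCreated).TotalSeconds)
            }
            [pscustomobject]@{
                TimeCreated  = $e.TimeCreated
                Id           = $e.Id
                SecondsSince = $gap
            }
            $previous = $e
        }
    }
}

# Usage on the server (log and source names are assumptions, see above):
#   Get-WinEvent -FilterHashtable @{
#       LogName      = 'Lync Server'
#       ProviderName = 'LS Master Replicator Agent Service'
#   } -MaxEvents 20 | Get-EventGap | Format-Table -AutoSize
```

Sorting inside the helper matters because Get-WinEvent returns the newest events first by default.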

Fortunately, at the same time as this Event 2007, the Windows Application Log recorded some useful information:

Event 1026 – .NET Runtime Error with MasterReplicatorAgent.exe

Event 1000 – Application Error – with MasterReplicatorAgent.exe – version 5.0.8308.577
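Those two entries can also be pulled out of the Application log with a filter. The following is a minimal sketch, assuming the crash entries mention the executable name in their message text; `Select-AgentCrash` is a hypothetical helper name, not a Lync cmdlet.

```powershell
# Hypothetical helper: keep only .NET Runtime (1026) and Application Error
# (1000) entries whose message mentions the given executable.
function Select-AgentCrash {
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        $InputObject,

        [string]$ProcessName = 'MasterReplicatorAgent.exe'
    )
    process {
        $isCrashId     = @(1000, 1026) -contains $InputObject.Id
        $mentionsAgent = $InputObject.Message -like "*$ProcessName*"
        if ($isCrashId -and $mentionsAgent) {
            $InputObject
        }
    }
}

# Usage on the server:
#   Get-WinEvent -FilterHashtable @{ LogName = 'Application'; Id = 1000, 1026 } -MaxEvents 200 |
#       Select-AgentCrash |
#       Format-List TimeCreated, Id, ProviderName, Message
```

The Event 1000 message is where the faulting application version shows up, so it is worth reading the full text rather than a table view.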

That last error was curious to me, because only one Lync Server 2013 component carries version 5.0.8308.577.

I uninstalled it and then reran the Lync Server 2013 Deployment Wizard, which reinstalls missing components. I let it reinstall the Core Management Server pieces, which include the replicator agents.

After that I restarted the services.

Then, lo and behold, I got a new error. Luckily it was a useful one, and it pointed at a possible reinstallation. From there I ran “Invoke-CsManagementStoreReplication” and waited a few minutes.

Replication was fixed, hooray! There was great rejoicing in the office.
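To confirm a fix like this from the Lync Server Management Shell, one option is to list only the replicas that have not caught up yet. This is a sketch rather than an exact record of what I ran: `Get-StaleReplica` is a hypothetical helper, and the two-minute wait in the usage comment is an arbitrary assumption.

```powershell
# Hypothetical helper: pass through only replicas that are not up to date.
function Get-StaleReplica {
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        $InputObject
    )
    process {
        if (-not $InputObject.UpToDate) {
            $InputObject
        }
    }
}

# Usage in the Lync Server Management Shell:
#   Invoke-CsManagementStoreReplication
#   Start-Sleep -Seconds 120   # assumed wait; adjust to your environment
#   Get-CsManagementStoreReplicationStatus | Get-StaleReplica |
#       Format-Table ReplicaFqdn, UpToDate, LastStatusReport -AutoSize
```

An empty result means every replica reports UpToDate as True.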

At the end of the day, a simple configuration change led us down the path of broken replication and broken agents, and a subsequent reinstallation of components.

The moral of the story? Keep an eye on your Lync Server and read your logs. They are useful.
