Wednesday, May 8, 2013

Lync Online Federation Issues with Lync 2013 Post-Migration

***UPDATED*** - 6/19/2013


As part of the Lync 2013/W15 TAP program with Microsoft, we were one of the first organizations to deploy Lync 2013 into production.  In doing so, we also had the joy of experiencing the migration "gotchas" before deploying Lync 2013 into customers' environments.  One oddity I ran into when performing the Edge migration from Lync 2010 to Lync 2013 was with our Lync Online federation partnerships.  Post-migration, many of these federations flat-out broke, and others that seemed to work were behaving strangely.  After looking over the errors and symptoms, we got everything sorted out - and the solution was relatively simple.  However, I'm still unsure as to why this issue was allowed to occur in the first place.



Background


With our Lync 2013 migration coming to a close and the Edge migration as one of the last few steps to get the project wrapped up, everything had gone relatively smoothly up to this point.  I had staged the new 2013 Edge pool, tested all of the functionality, and felt confident everything was working as expected.  The time came to re-target the AutoConfig and Federation DNS records to the 2013 Edge pool.  DNS updated pretty quickly, clients were signing in, and I was able to test Federation with a customer.  Success!  Except over the next week or so, I started hearing rumblings of users unable to communicate with Federated organizations.  After further review, all of the problematic relationships involved organizations using Lync Online.

Reviewing our Edge configuration, nothing jumped out immediately.  Previously, we had been setting up Federation with Lync Online customers individually, adding each one as an Allowed Domain with "sipfed.online.lync.com" as its Access Edge FQDN.




Once Lync 2010 Mobility was released, we had configured the new Hosting Provider for Lync Online to support push notifications for Windows Phone 7 and Apple iOS devices.




With Lync 2010, having both the Allowed Domains with "sipfed.online.lync.com" and a corresponding Hosting Provider configured didn't seem to cause any loss of functionality or problems communicating with Lync Online.  However, once the migration to the Lync 2013 Edge was complete, this combination prevented the Lync Online partnerships from working correctly.
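The overlap described above can be spotted from the Lync Server Management Shell. The following is a minimal sketch, assuming the shell is run with rights to read the federation configuration; the variable name is mine and only for illustration. It lists every Allowed Domain whose Access Edge FQDN is also in use by a Hosting Provider.

```powershell
# Collect the Edge FQDNs used by all configured Hosting Providers.
$providerFqdns = Get-CsHostingProvider | ForEach-Object { $_.ProxyFqdn }

# List Allowed Domains that point at one of those same FQDNs.
# -contains is case-insensitive, so differences in casing do not matter.
Get-CsAllowedDomain |
    Where-Object { $_.ProxyFqdn -and ($providerFqdns -contains $_.ProxyFqdn) } |
    Select-Object Identity, ProxyFqdn
```

Any rows returned are candidates for the conflict; an empty result suggests this particular overlap is not present.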


Symptoms


The most glaring symptom of this issue is Event ID 14517, which gets logged on the Edge server.  The details are noted below.





Log Name:      Lync Server
Source:        LS Protocol Stack
Date:          5/7/2013 4:44:16 PM
Event ID:      14517
Task Category: (1001)
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      LyncEdge01.domain.com
Description:
The server configuration validation mechanism detected some serious problems.

1 errors and 0 warnings were detected.

ERRORS:
The server at FQDN [sipfed.online.lync.com] is configured as both type 'allowed partner server' and type 'IM service provider'.

WARNINGS:
No warnings

Cause: The configuration is invalid and the server might not behave as expected.
Resolution:
Review and correct the errors listed above, then restart the service. You may also wish to review any warnings present.
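To check an Edge server for this event without scrolling through Event Viewer, something like the following should work. It is a sketch, assuming it is run locally on the Edge server and that the log name matches the "Lync Server" log shown above.

```powershell
# Pull the most recent occurrences of Event ID 14517 from the Lync Server log.
# Get-WinEvent reports an error when nothing matches, so that case is silenced.
Get-WinEvent -FilterHashtable @{ LogName = 'Lync Server'; Id = 14517 } `
             -MaxEvents 5 -ErrorAction SilentlyContinue |
    Format-List TimeCreated, Id, Message
```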

The other thing I noticed was the inability to add a new Lync Online domain selectively as we did in the past.  The following error was generated when trying to do so.


Solution


To get this sorted out, we simply removed the Access Edge Service FQDN from all of the SIP Federated Domains which were configured for "sipfed.online.lync.com".  The following command did this for us in a matter of seconds:

Get-CsAllowedDomain | Where {$_.ProxyFqdn -eq "sipfed.online.lync.com"} | Set-CsAllowedDomain -ProxyFqdn $null
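If you would rather see what will change before committing to it, the same pipeline can be previewed first. This is a cautious variation on the command above, not a different fix; the $fqdn variable is just a convenience I have added.

```powershell
$fqdn = "sipfed.online.lync.com"

# Show which Allowed Domains currently use the Lync Online Edge FQDN.
Get-CsAllowedDomain |
    Where-Object { $_.ProxyFqdn -eq $fqdn } |
    Select-Object Identity, ProxyFqdn

# Dry run: -WhatIf reports what would be changed without changing anything.
Get-CsAllowedDomain |
    Where-Object { $_.ProxyFqdn -eq $fqdn } |
    Set-CsAllowedDomain -ProxyFqdn $null -WhatIf
```

Re-running the first pipeline after the real change should return nothing.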

Alternatively, if "selective" Federation with Lync Online is desired, we could have removed the hosting provider and left the Allowed Domains configured with "sipfed.online.lync.com" as the Access Edge FQDN.
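For that selective approach, a rough sketch might look like the following. We did not go this route ourselves, so treat it as an untested outline: "partner.example.com" is a placeholder domain, and every command carries -WhatIf so nothing is changed until you remove it. Keep in mind that the Hosting Provider was what we had configured for mobile push notifications, so check what depends on it before removing it.

```powershell
$fqdn = "sipfed.online.lync.com"

# Review the Hosting Providers that use the Lync Online Edge FQDN.
Get-CsHostingProvider |
    Where-Object { $_.ProxyFqdn -eq $fqdn } |
    Select-Object Identity, ProxyFqdn, Enabled

# Dry run of removing them (drop -WhatIf to actually remove).
Get-CsHostingProvider |
    Where-Object { $_.ProxyFqdn -eq $fqdn } |
    Remove-CsHostingProvider -WhatIf

# Dry run of adding one Lync Online partner selectively.
# "partner.example.com" is a placeholder, not a real partner domain.
New-CsAllowedDomain -Identity "partner.example.com" -ProxyFqdn $fqdn -WhatIf
```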

As I said up front, ultimately the solution was pretty straightforward and easy to implement.  Most organizations probably won't even run into this situation because they have configured the Lync Online Hosting Provider to include Federation for all Lync Online organizations.  However, I know other orgs out there don't allow Open Federation and may have selectively added Lync Online domains.  Since Lync 2010 would also allow you to configure a Hosting Provider for Lync Online alongside those domains, there is a possibility of running up against this issue in those situations.

Bottom line - make sure you check these settings prior to completing your Edge migration.

************************UPDATE 6/19/2013**************************

Even after making the changes outlined above, we were still experiencing Federation problems with some Office 365 tenants.  We opened a case with Microsoft PSS both for the on-prem environment (which I could find no other issues with) and the O365 side.  The symptom I was still seeing in the on-prem Edge server logging was a 403 coming from the O365 Federation Edge.

Realizing we also have our own O365 tenant (for Exchange Online only) as <company>.onmicrosoft.com, we started investigating there as well.  Some of the Lync functionality had been configured in the O365 portal, including Federation set to Enabled.  Since we are not sharing a SIP address across both the on-prem and online environments, we turned Federation OFF in our O365 tenant.  This then allowed O365 <-> on-prem Lync federation to begin working again!  We have since turned Federation back ON without any repercussions.  Ultimately, I believe this issue occurred when our O365 tenant was migrated to the 2013 platforms (we are also part of the O365 TAP/RDP).

I see this article getting quite a few hits, so for those of you still facing some challenges, this may be your situation as well.

Related articles:
Configuring Federation Support for a Lync Online Domain

5 comments:

  1. Thanks for posting this. I was running into this issue in Lync 2010 after applying the latest CU patches (not sure why). In the control panel, under external user access on the federated domains tab, removing sipfed.online.lync.com from the federated domain 'push.lync.com' fixed the issue. On the provider tab, I kept sipfed.online.lync.com as an edge server entry for the hosted provider for LyncOnlinePushNotifications for our mobility clients.

    Thanks!

    1. That is how I would have handled it as well. Glad it helped.

  2. Got the same issue between onpremise Lync 2010 infra and Lync online.
    -> Removed the Lync online hosting provider since we allow the federated domains one by one.

    Thanks for the tip !

  3. Hi,

    I have been trying for several days to get the Lync migration working with Office 365, but have had no luck getting past the error 'The Registrar "sipfed.online.lync.com" does not exist'.

    I have posted the issue on the below forum:

    http://community.office365.com/en-us/forums/166/p/184965/544529.aspx

    Thanks

  4. You are my hero!!
    Was searching the web and logs etc for hours and could not find it.
    Until I hit your blog because I found the right error code and googled it.
    I applied KB2809243 this morning and then faced the problem of the Edge Access service not starting, with error 1067.
    In the Lync Server event log I found these errors and followed your steps (plus the extra info in the first comment, by Anonymous). It fixed it right away!!
    I did not see the relationship with the update at first, but now I think I do. Someone else is working on implementing O365 in my domain. He probably made some adjustments that only became a problem when the services restarted for the first time after my update.
