Palo Alto GlobalProtect Issue: Split Tunnel VPN with Skype for Business

There was a weird issue when I first joined my current job: I was told that because of the way Palo Alto GlobalProtect (GP) and Microsoft Skype for Business (SfB) work (or maybe were configured?), I needed to log in to SfB first, then connect to the GP VPN. The rationale was that otherwise SfB wouldn’t connect, or it would take a long time to connect, AND THEN, even after a period of time, SfB would start behaving weirdly and its Exchange connectivity would drop, so SfB wouldn’t get voicemails, missed calls, etc. Just all-out weirdness going on. It’s 2020, so maybe some of this is true to form for the year, but probably not.

Palo Alto GlobalProtect Skype for Business

If you want to skip the context, jump to the “Solution (tl;dr)” section near the end of this post.

Uh…Skype for Business?

Full stop. I’m sure you’re asking yourself right now, “Why not just migrate to Microsoft Teams? Get rid of that whole on-premises stuff.”

Let me answer that in a meme:

Sean Bean Boromir Meme: One does not simply migrate from Skype for Business to Teams

Skype for Business is one of those integrative technologies that spans a lot of technology stacks, so it isn’t exactly easy to just jump ship from, and Teams as a VoIP replacement is arguably not there yet.

Also, have you seen the UI comparisons? Going from a sleek floating window for calling, IM, and conferencing with SfB to the giant-let’s-pack-lots-of-services-into-one-large-window that is Teams is kind of a hard sell on the user training side of things. Maybe I’m biased. Maybe, but I digress.

The Challenge

Ok, back to the GlobalProtect and Skype for Business issues.

I was admittedly puzzled that instructing users to sign in to SfB before they sign in to the VPN was the best solution. It doesn’t seem right from a user experience perspective, and when you toss in the sudden weird issues with Exchange connectivity, I doubted this was the ideal experience. So I brought this up and the team basically said, “we just haven’t had time to troubleshoot it, but if you want to figure it out, go for it.”

You know what that sounds like? An adventure. An itch to scratch. Something to solve. A challenge! There could be only one response:

Challenge Accepted Meme

Why Split Tunnel Skype for Business?

Something you might be asking is “Why configure split tunnel in the first place? Isn’t split tunneling a headache to manage?”

Split tunneling can definitely be a PITA, but like a million IT questions out there, the answer ultimately to this is, “it depends.” From my experience, split tunneling becomes difficult when you have a lot of split tunneling to manage, but if you have one or two services, it’s not that bad.

Skype for Business is one of those technologies that is sensitive to jitter, latency, and packet loss. Why? Because it’s voice traffic, and just like voice traffic on the inside of the network, when there’s jitter, latency, and/or packet loss, users on opposite ends of calls/conferences experience it as delayed audio or as parts of the conversation breaking up, which leads to an overall poor experience.

When you configure split tunneling, particularly for technologies like SfB, you avoid the dual encryption scenario (SfB already encrypts its own signaling and media, so sending it through the tunnel encrypts it a second time), and you allow the technology to use its own optimized methods for connecting voice and application traffic by letting the software connect to services over the internet directly instead of through a tunnel.

Baseline

That said, what’s the baseline here? How is GlobalProtect configured with split tunneling and what issues are there?

For GlobalProtect, split tunneling was set up pretty much as described in this documentation from Palo Alto (using just the application split tunnel, nothing else). It looked like this:

GlobalProtect Split Tunnel Domain and Application Tab Showing Excluded lync.exe

Here are the issues that were encountered in this setup:

  1. Connectivity issues if connecting SfB after GP VPN is connected
  2. Exchange connectivity in the SfB client drops after a duration of time, even if connection is established before VPN connection
  3. Call transfers working inconsistently
  4. Application sharing working inconsistently
  5. Conference meetings working inconsistently

Issues 3-5 really came later because they were hard to pinpoint due to their inconsistency, but issues 1 and 2 brought some fast wins.

Let’s get to some solutions.

Solutions

Solution for 1 and 2: DNS. It’s always DNS.

It’s kind of a joke, but DNS really does cause a lot of problems. In a split tunnel configuration where you’ve split-tunneled the traffic by application, the application is still going to resolve addresses using the DNS servers you specify in the GlobalProtect configuration. So if you haven’t changed DNS records, the application’s traffic will be split-tunneled, but it will still try to connect to internal resources because those are the records it has.

I don’t have a PCAP screenshot for this, but if you pull up Wireshark and capture on your physical (non-GP) network interface, you’ll see attempts to reach internal SfB IP addresses that (typically) don’t exist on your local network, and thus services fail.

The solution is simple: for your VPN clients, answer those A record queries with the external IP addresses. I solved this by setting up dedicated DNS servers for VPN clients, then just creating the zones and root records for each FQDN. I did this for all the Skype for Business external IPs (edge and reverse proxy) and the external Exchange records.

After doing this, problems 1 and 2 went away because hostnames were being resolved correctly.
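
One way to sanity-check the DNS change from a connected VPN client is to resolve each record and confirm nothing comes back from a private range. The following is a minimal sketch, not the method used above: the function names are made up for illustration, and it assumes your internal SfB/Exchange servers live on private (RFC 1918 style) addresses while the external ones are public.

```python
import ipaddress
import socket


def classify(addresses):
    """Split a list of IP address strings into (internal, external) lists.

    Assumption: "internal" means the address is in a private/special-use range.
    """
    internal, external = [], []
    for addr in addresses:
        if ipaddress.ip_address(addr).is_private:
            internal.append(addr)
        else:
            external.append(addr)
    return internal, external


def resolve_ipv4(fqdn):
    """Return the IPv4 addresses the OS resolver currently returns for fqdn."""
    infos = socket.getaddrinfo(fqdn, None, family=socket.AF_INET)
    return sorted({info[4][0] for info in infos})


def check(fqdns):
    """Print a one-line verdict per FQDN; run this while the VPN is connected."""
    for fqdn in fqdns:
        try:
            internal, external = classify(resolve_ipv4(fqdn))
        except socket.gaierror as exc:
            print(f"{fqdn}: lookup failed ({exc})")
            continue
        ok = bool(external) and not internal
        verdict = "OK" if ok else "still resolving to internal addresses"
        print(f"{fqdn}: internal={internal} external={external} -> {verdict}")


# Example call with hypothetical record names (substitute your own):
# check(["sip.example.com", "webext.example.com", "mail.example.com"])
```

Any record that still reports an internal address while on the VPN is a candidate for a missing or incorrect entry on the VPN-facing DNS servers.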

Solutions for 3 through 5: Firewall rules and IP Split Tunneling

Problems 3 through 5 were frustrating like no other because I couldn’t really narrow the problems down exactly. Some people had no problems with call transfers, application sharing, or conferencing, but then sometimes they would. So the thing to do is dig into the logs, and when I did I encountered a lot of this:

ms-diagnostics: 23;source="mediationServer.contoso.com";reason="Call failed to establish due to a media connectivity failure when one endpoint is internal and the other is remote"

Or

ms-diagnostics: 24;source="mediationServer.contoso.com";reason="Call failed to establish due to a media connectivity failure when both endpoints are remote"

Or even better:

ms-client-diagnostics: 52049; reason="Leaving app sharing because re-invite failed";UserType="Callee";MediaType="applicationsharing-video"

These all pointed to firewall issues, and even the ICEWARN messages noted something wrong with STUN, TURN, NAT, etc.
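
With intermittent failures like these, it can help to tally which diagnostic codes show up most often across client logs. Below is a small sketch that parses headers in the shape shown above. The regular expressions are inferred from those three sample lines only, so treat the format handling as an assumption rather than a complete parser for every SfB diagnostic header.

```python
import re
from collections import Counter

# Matches e.g.  ms-diagnostics: 23;source="...";reason="..."
HEADER_RE = re.compile(r'^(ms-(?:client-)?diagnostics):\s*(\d+)\s*;(.*)$', re.IGNORECASE)
# Matches the key="value" pairs that follow the numeric code
FIELD_RE = re.compile(r'([\w-]+)="([^"]*)"')


def parse_diagnostic(line):
    """Parse one diagnostics header into a dict, or return None if it is not one."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    header, code, rest = match.groups()
    record = {"header": header.lower(), "code": int(code)}
    record.update(FIELD_RE.findall(rest))
    return record


def tally(lines):
    """Count how often each (code, reason) pair appears in an iterable of lines."""
    counts = Counter()
    for line in lines:
        record = parse_diagnostic(line)
        if record is not None:
            counts[(record["code"], record.get("reason", ""))] += 1
    return counts


# Example: feed it the lines of an exported client log (the path is hypothetical)
# with open("client.log", encoding="utf-8", errors="replace") as handle:
#     for (code, reason), count in tally(handle).most_common():
#         print(count, code, reason)
```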

So I did some digging and found that firewall rules needed to be in place to prevent VPN clients and internal SfB servers from communicating with one another. So I added some PAN policies, and things got better, but not perfect. I also added the external SfB IP addresses to the split tunnel in Network > GlobalProtect > Gateway > Agent > Client Settings > Client-Config > Split Tunnel > Exclude (which basically just adds static routes in the Windows routing table to send traffic for those IPs out the non-tunneled interface). Still, the occasional error kept creeping up. I could even witness it, but I couldn’t quite nail down the problem.

Finally, I had a thought: why not get rid of the application process split tunnel? I mean, if I have DNS addresses configured and IP split tunneling working, why is the application process split tunnel needed? I removed it from the settings and bam, all the problems went away. Like magic.
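
To make the “static routes” point concrete: with IP-based exclusions, whether traffic bypasses the tunnel comes down to whether the destination falls inside one of the excluded prefixes. Here is a minimal sketch of that check. It assumes everything not excluded goes through the tunnel, and the addresses are documentation-range placeholders, not real SfB IPs.

```python
import ipaddress


def bypasses_tunnel(destination, excluded_routes):
    """Return True if destination falls inside any excluded (non-tunneled) prefix."""
    ip = ipaddress.ip_address(destination)
    return any(ip in ipaddress.ip_network(route) for route in excluded_routes)


def report(destinations, excluded_routes):
    """Print which path each named destination would take."""
    for name, address in destinations.items():
        if bypasses_tunnel(address, excluded_routes):
            path = "direct (split tunnel)"
        else:
            path = "through the GP tunnel"
        print(f"{name:<20} {address:<15} {path}")


# Hypothetical exclusion list and destinations
EXCLUDED = ["203.0.113.10/32", "203.0.113.11/32", "198.51.100.0/28"]
DESTINATIONS = {
    "SfB access edge": "203.0.113.10",
    "SfB A/V edge": "203.0.113.11",
    "Reverse proxy": "198.51.100.5",
    "Internal front end": "10.20.30.40",
}

report(DESTINATIONS, EXCLUDED)
```

On a real client, you would compare against the actual Windows routing table (for example, the output of `route print`) rather than a hand-written list.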

Shia Labeouf Magic

Here’s what the final outcome should look like for a GlobalProtect-Skype for Business-Exchange environment for split tunnel.

Palo Alto GlobalProtect Skype For Business Split Tunnel

Of course, I fully admit this is really more of a legacy design with everything on-premises, but you could just as easily send the Exchange traffic to Office 365 in the split tunnel.

Thoughts on GlobalProtect Application Process Split Tunnel

While I had configured the traditional methods of doing split tunnel configurations (IP split tunnel and DNS servers), I’m still a little puzzled by the fact that the Palo Alto GlobalProtect application process split tunnel seemed to cause issues. My guess is that something in the way the Skype for Business client is designed prevents the process from being completely split tunneled, and I think this has to do with the way Skype for Business operates with Windows.

If you get really bored on a Friday night and have nothing better to do in life, check out some of these deep dives on candidate path selection and other stuff related to media flow. What you’ll see in the SfB client log files is something like this:

Skype for Business candidate selection
Credit

Basically, SfB gets a selection of candidates that it uses from the interfaces on the computer. In a GP split tunnel set up (with or without application process split tunnel configured), you’ll see ALL IP addresses (including the tunnel address) listed as candidates, and my suspicion is that Skype for Business still tries to use a tunnel interface, and sometimes it gets around the Palo Alto GlobalProtect application exclusion, and then that causes calls, application sharing, and even conferences to fail. I can’t show my own logs seeing this for security reasons, so you’ll have to trust me on that one.
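
If you want to check your own logs for this, one approach is to pull the candidate lines out of the SDP and flag any whose address sits in the VPN address pool. The sketch below assumes candidates appear as `a=candidate:` lines in the RFC 5245 layout (foundation, component, transport, priority, address, port, `typ`, type). The sample lines and the tunnel pool are invented for illustration and are not taken from the logs described above.

```python
import ipaddress


def parse_candidate(line):
    """Extract address, port and type from an 'a=candidate:' SDP line, else None."""
    prefix = "a=candidate:"
    if not line.startswith(prefix):
        return None
    parts = line[len(prefix):].split()
    # Expected layout: foundation component transport priority address port typ type
    if len(parts) < 8 or parts[6] != "typ":
        return None
    return {"address": parts[4], "port": int(parts[5]), "type": parts[7]}


def tunnel_candidates(sdp_lines, tunnel_pool):
    """Return the candidates whose address is inside the VPN tunnel address pool."""
    pool = ipaddress.ip_network(tunnel_pool)
    hits = []
    for line in sdp_lines:
        candidate = parse_candidate(line.strip())
        if candidate is None:
            continue
        try:
            address = ipaddress.ip_address(candidate["address"])
        except ValueError:
            continue  # not a literal IP address; skip it
        if address in pool:
            hits.append(candidate)
    return hits


# Illustrative sample only; 10.99.0.0/24 is a hypothetical GP client pool
SAMPLE = [
    "a=candidate:1 1 UDP 2130706431 192.168.1.50 50010 typ host",
    "a=candidate:2 1 UDP 2130705919 10.99.0.12 50014 typ host",
    "a=candidate:3 1 UDP 1694234111 198.51.100.7 50020 typ srflx raddr 192.168.1.50 rport 50010",
]

for hit in tunnel_candidates(SAMPLE, "10.99.0.0/24"):
    print("tunnel address offered as candidate:", hit)
```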

Solution (tl;dr)

Here’s the quick solution for GlobalProtect and Skype for Business Split Tunnel

  1. Create separate DNS servers for VPN clients and create the specific Skype for Business DNS records needed, and configure them for external IP addresses so that Skype for Business resolves external addresses and configures itself appropriately.
  2. Create firewall rules that block traffic to/from the VPN network to internal Skype for Business and Exchange IP addresses. We want the SfB client to determine it can’t go inside for traffic.
  3. In Panorama or PAN-OS, under Network > GlobalProtect > Gateway > Agent > Client Settings > Client-Config > Split Tunnel > Exclude, configure all external SfB addresses so that the GP client doesn’t send traffic for those IPs through the tunnel. Alternatively, under Network > GlobalProtect > Gateway > Agent > Client Settings > Client-Config > Split Tunnel > Domain and Application > Exclude Domain, you could add the SfB external FQDNs (that said, IIRC, the stuff under ‘Domain and Application’ requires the GlobalProtect license…technically).

Links, Further Reading, Credit

Skype for Business: Cleanly Shutting Down Server (Invoke-CSComputerFailover and More)

It’s been over three years since I managed and deployed a Skype for Business/Lync system. At my new job I was hired on as a network engineer, but I noted that in a past life I had earned an MCSE in Skype for Business, so I could definitely be the backup for the primary SME (subject matter expert) in SfB. However, in a strange twist, the primary SME left, and you know what, there just aren’t a lot of Skype for Business/Lync engineers out there, especially in a small labor market, so I stepped up to help the organization because I was the most qualified by a long shot.

So I’m back doing some Skype for Business again.

Captain America Here We Go Again

I actually have always liked voice routing, so it’s fun to be doing some of this stuff again (although SfB is a pretty intense, integrative technology, so it’s not all Pop-Tarts and unicorns).

However, I was trying to get reacquainted with some commands for cleanly shutting down a Skype For Business server, and I just didn’t find a lot of good information out there, so I thought I might write something up “real quick”. This is somewhat basic info for SfB enterprise deployments, but it might be helpful.

Enough of the preamble, let’s get to the first command…

Get-CSWindowsService

I’m starting with this command because it’s the most basic command you should already know, but it plays a role later in this post. Typically you’ll use the command to see how many connections are being used by a service, if a service is running, etc.

The command is straightforward: get all or one of the SfB (or Communications Server, which is where the acronym CS comes from) Windows services on the machine.

Stop-CSWindowsService

The command `Stop-CSWindowsService` is the most basic command you’ll use to stop all or one of the services on a SfB server. The command stops the SfB services in the proper order, including any dependent services.

Typically you’ll be using this command on a ‘Standard’ deployment SfB server, or any non-front end server in an ‘Enterprise’ deployment such as mediation/edge servers (more info: standard vs enterprise deployment). There are probably only rare situations in which you’ll stop just one service, so you’ll likely be stopping all of them.

If you’re doing this outside a maintenance window for some reason, I prefer the following: `Stop-CSWindowsService -Graceful`. The `-Graceful` switch is important here because it puts the services into a paused state, preventing any new connections and waiting for existing connections to disconnect. On mediation servers, whenever I’ve needed to stop a server in a mediation pool, this has been my preferred method so that I wait for the calls to end. However, it won’t stop until the calls are done, so you might be waiting a while.

Invoke-CSComputerFailover

For whatever reason, this command scared me at first, largely because of my ignorance of what it does. The official documentation on the command I don’t think does it justice, so here’s my attempt at it.

The command `Invoke-CSComputerFailover` will basically perform a `Stop-CSWindowsService -Graceful` operation, but it acts slightly differently. The differences:

  1. It’s used on front end servers in an enterprise deployment (or at least I’ve never seen it documented or used on other SfB server pools). The command causes the front end server to be in a ‘failover’ state, making it unavailable to the rest of the front end pool.
  2. The command migrates data, routing groups, and more to the other front end servers.
  3. The command has a wait time of 1 hour per service, after which if the connections haven’t disconnected, it will force a disconnect. This default can be changed with the `-WaitTime` parameter.
  4. This command will make the server unavailable in the front end pool. After a reboot, or if for some reason you run Start-CSWindowsService, the server won’t be available until you run Invoke-CSComputerFailBack.

After working with it and using it several times, it’s not as scary as I thought. Just run it on one machine at a time lest you have some Windows Fabric issue due to quorum loss (or something to that effect).
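
To summarize the decision described above, here is a small sketch that returns the PowerShell commands to run, in order, for one server. It only builds strings and runs nothing. The function name and the maintenance placeholder are made up for illustration, and the `-ComputerName` parameter is used as I understand the cmdlet syntax, so check it against the official cmdlet reference before relying on it.

```python
def shutdown_plan(fqdn, front_end_in_enterprise_pool, wait_time=None):
    """Return the ordered PowerShell commands for taking one SfB server down and back up.

    fqdn: the server's FQDN (only used for the failover/failback commands).
    front_end_in_enterprise_pool: True for a front end server in an Enterprise pool.
    wait_time: optional TimeSpan string such as "0:30:00" to override the default wait.
    """
    maintenance = "# ...patch / reboot / whatever the maintenance is..."
    if front_end_in_enterprise_pool:
        failover = f'Invoke-CsComputerFailOver -ComputerName "{fqdn}"'
        if wait_time:
            failover += f' -WaitTime "{wait_time}"'
        failback = f'Invoke-CsComputerFailBack -ComputerName "{fqdn}"'
        return [failover, maintenance, failback]
    # Standard edition servers and non-front-end roles (mediation, edge, ...)
    return ["Stop-CsWindowsService -Graceful", maintenance, "Start-CsWindowsService"]


# Hypothetical host names
for step in shutdown_plan("fe01.contoso.com", True, wait_time="0:30:00"):
    print(step)
for step in shutdown_plan("med01.contoso.com", False):
    print(step)
```

The point of keeping it per-server is the same as the advice above: fail over one front end at a time, and fail it back before moving on to the next.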

Invoke-CSComputerFailover Hanging or Taking a While

Sometimes when you’re failing over a front end server, you get stuck waiting for some services to stop like this:

Status screen waiting for Invoke-CSComputerFailover to Progress

If you look at `Get-CSWindowsService`, you might actually find something like this:

Get-CSWindowsService Seeing Services With Hanging Connections

If you note the red and blue arrows, the services are left open, likely from a conference that has ended already, but is being left open for whatever reason. To speed up Invoke-CSComputerFailover, just open a separate elevated terminal and stop the services like this:

Stopping the stalled services with Stop-CSWindowsService in separate window

After which, `Invoke-CSComputerFailover` will continue on as expected.

Invoke-CSComputerFailover progressing

I originally tried out the idea on my own, but the following blog entry also helped me and explains it from a different perspective.

Quickie: PXE 0xC0000001 Error in SCCM

When you’re imaging/PXE-booting in SCCM, I think the “0xC0000001” error is one of the strangest errors to troubleshoot, because the source of the error is related to some problem/conflict between TFTP and the network adapter of the machine you’re imaging — but then sometimes it has to do with the network adapter of your distribution point.

Windows Boot Manager Error 0xC0000001

I have encountered multiple solutions to this problem, and here are some of them from other blogs:

There are others.

However for me, I had the following scenario, and my solution turned out to be quite simple.

Context:

  • Trying to image Pentium-based, 4 GB HP ProBook x360 11 G1 EE laptops
  • From a VMware DP, the machines receive the boot image just fine.
  • From a Hyper-V DP, ‘0xC0000001’ error occurs on these laptops.
    • For the sake of curiosity, I did the three items above (change reg key, changed network adapter properties, and even reinstalled WDS). None of these worked.
    • I reverted all those settings and put the DP back in a ‘vanilla’ state

I found this behavior to be really odd. Why would it download the boot image from one DP just fine, but not from the other? Perhaps the boot image needs a network adapter driver?

Well, it turns out the solution for this model laptop was to add the network adapter driver* to the boot image. We generally don’t add network drivers to boot images unless necessary, which in this case it turned out it was.

And that was that. If you don’t know how to add drivers to the boot image, I’m not going to reinvent the wheel, but I will direct you to this website that has a decent how-to.

Happy SCCM-ing!

(Edit 20180710 – clarified the use of a driver)

(Edit 20190702 – Added Windows Updates as a potential cause).

SCCM: Using 7-Zip to Deploy Large Packages/Applications

One PITA problem to deal with in SCCM when deploying large application installs like AutoDesk Inventor or Adobe Creative Cloud is the sheer size of the install, which can be 10-20 GB. Well, after a fellow Twitterer asked about zipping packages and so forth for deploying large packages and applications in SCCM, I responded with something I do at my organization, and thought I’d share the info. This probably isn’t really anything new, but it’s what I’ve done to solve the problem.

Gizmo from Gremlins
I couldn’t think of an image for this post, so you get Gizmo.

Deploying large packages over the network can take a while. One solution is to compress the contents, deploy the compressed archive in the package/application along with the 7-Zip executable (and DLL), then use the 7-Zip .exe to uncompress the contents on the remote device. After the install is complete, just remove the uncompressed contents, and Bob’s your uncle.

This requires the 7z.exe and 7z.dll files, and of course the compressed package in whatever format you want (I prefer .7z).

Below is a batch file I use to perform this task, and I’m using AutoDesk Inventor 2017 as an example. I also have a GitHub page that has the complete install and uninstall for AutoDesk Inventor.

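@echo off
rem Work from the folder this script lives in (the package content folder)
pushd "%~dp0"

rem Extract the compressed installer to a local temp folder
start /wait "" 7z.exe x "%cd%\Inventor2017.7z" -o"c:\InventorTemp" -y

rem ==============================
rem AutoDesk Inventor 2017 Install
rem ==============================

rem The .ini path contains spaces, so it has to be quoted
"C:\InventorTemp\Img\Setup.exe" /W /q /I "C:\InventorTemp\Img\AutoDesk Inventor 2017.ini" /language en-us

rem Remove the extracted contents once the install has finished
rd "c:\InventorTemp" /s /q

popd
exit /b 0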
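
The batch file above covers the client side. The packaging side (creating the .7z in the first place) could be sketched as below. This is an illustration under assumptions, not the author’s process: the source path is hypothetical, and it assumes the folder being archived contains the `Img` directory so that `Img\Setup.exe` ends up at the root of the archive, which is what the batch file expects after extraction.

```python
from pathlib import PureWindowsPath


def compress_command(source_dir, archive, seven_zip="7z.exe"):
    """Build the 7-Zip command that packs the contents of source_dir into a .7z archive.

    The trailing \\* adds the folder's contents without the folder itself,
    -t7z selects the 7z format and -mx=9 asks for maximum compression.
    """
    contents = str(PureWindowsPath(source_dir) / "*")
    return [seven_zip, "a", "-t7z", "-mx=9", str(archive), contents]


def extract_command(archive, target_dir, seven_zip="7z.exe"):
    """Build the matching extraction command (same switches as the batch file)."""
    return [seven_zip, "x", str(archive), f"-o{target_dir}", "-y"]


# Hypothetical source location for the Inventor deployment image
cmd = compress_command(r"D:\Sources\Inventor2017", "Inventor2017.7z")
print(" ".join(cmd))

# To actually run it on a machine with 7-Zip available:
# import subprocess
# subprocess.run(cmd, check=True)
```

Maximum compression trades packaging time for a smaller download. For content that is already compressed (many installers are), a lower `-mx` level may be just as effective and much faster.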