If you’re running a Palo 5200/7000 series firewall, you may have inserted transceiver modules and the only way you really know they’re recognized is if you get them online. While troubleshooting an issue, I found a item from the Palo Alto KB, namely the log file brdagent.log is where you can view the transceiver you inserted; the log itself is for monitoring interfaces and other networking chips being inserted (see book link below).
To view the log, I prefer to use the tail command to view the log, which is a Unix/Linux command to view the last section of the log file, and even to watch the log file as it’s being used.
tail lines 100 follow yes cp-log brdagent.log <bunch of log files> cp brdagent.log 2019-10-03 17:10:40 2019-10-03 17:10:40.001 -0600 Port 25: Down 40Gb/s-full duplex cp brdagent.log 2019-10-03 17:10:40 2019-10-03 17:10:40.001 -0600 PORT25: board_port_autoneg_enabled -> board_port_autoneg, link: 0, mode: 1 cp brdagent.log 2019-10-03 17:11:40 2019-10-03 17:11:40.644 -0600 Error: gryphon_get_port_info(5200/gryphon_ports.c:1231): Remote fault detected on port 25! cp brdagent.log 2019-10-03 17:13:22 2019-10-03 17:13:22.156 -0600 Port 25: sysd_obj_lookup for <QSFP> failed !!! cp brdagent.log 2019-10-03 17:13:22 2019-10-03 17:13:22.156 -0600 Port 25: Empty QSFP+ detected cp brdagent.log 2019-10-03 17:14:07 2019-10-03 17:14:07.441 -0600 QSFP+ detected on port 25 cp brdagent.log 2019-10-03 17:14:07 vendor 'FINISAR CORP. '; part 'FTL4E1QE1C-A5 '; id 'QSFPP'
The command broken down:
tail – the Unix/Linux command that displays the last section of a file
lines 100– displays the last 100 lines (or less) of the file
follow yes – informs the cli to continue printing any additional input to the file
cp-log– this is a directory for control plane logs, and according to Tom Piens with the Mastering Palo Alto Networks book below, there a few different directories for the management plane, data plane, and other logs. The log directories will vary based on models.
brdagent.log– the log filename.
As you see in the log above, you can see the the QSFP transceiver in the log. Using the tail follow command though, you can watch as transceivers are inserted, like this (I inserted a Cisco twinax as an example):
2021-05-25 13:50:34.663 -0600 Port 19: SFP Plus Empty detected 2021-05-25 13:50:51.756 -0600 Port 19 :: sfp_data.transceiver_codes(str) = 10000 2021-05-25 13:50:51.756 -0600 SFP Plus detected on port 19 vendor 'CISCO-TYCO '; part '2053783-3 '; tc '10000' 2021-05-25 13:50:51.757 -0600 Port 19: SFP Plus Fiber detected 2021-05-25 13:50:51.757 -0600 PORT19: board_port_sfp_nopop_0 -> board_port_startup, link: 0, mode: 2 2021-05-25 13:50:51.761 -0600 PORT19: board_port_startup -> board_port_autoneg, link: 0, mode: 2 2021-05-25 13:50:51.764 -0600 PORT19: board_port_autoneg -> board_port_autoneg_linked, link: 1, mode: 2 2021-05-25 13:50:51.764 -0600 Port 19: Up 10Gb/s-full duplex
Yes, you can other vendor/third-party transceivers in Palo Alto firewalls. Palo Alto publishes the details of the transceivers here. I haven’t ever encountered issues with other vendor transceivers, but that said, getting support if you have an issue and the transceiver is potentially the problem may not result in support from customer support.
While troubleshooting an HSCI port, I found that I could test if the optic was functioning correctly from the HSCI connecting the port to a switch and configuring layer 2 connectivity (ethernet). The transceiver responded, which makes sense because the configuration in HA sets the transport for HSCI as ethernet. Even though Palo says, “[t]he traffic carried on the HSCI ports is raw Layer 1 traffic, which is not routable or switchable. Therefore, you must connect the HSCI ports directly to each other (from the HSCI port on the first firewall to the HSCI port on the second firewall),” I found the HSCI at least responds to ethernet. So perhaps they mention this to indicate you shouldn’t bridge the connection (via layer 2 switch).
If you’re not getting complete session sync between a pair of HA firewalls, and even though your HA is showing as up, you should perhaps check your transceivers, particularly if you’re using non-Palo Alto transceivers.
Working in a Juniper SRX (or other model), you might have a need to secure the web interface with a certificate through the CLI. I had this need and ran into a problem where Junos wouldn’t recognize the input I gave in the CLI for the certificate (it was an issue with line breaks). The web management service wasn’t starting and I was receiving errors like the following in the httpd.log file:
Searching online, there isn’t much that talks about this process, so this is an attempt to rectify that. The following is a quick breakdown of how to import a certificate into a Juniper SRX via the CLI. The process should be applicable for all Juniper devices (EX, MX, etc.), but I’ve only tested it on SRXs in 15.x+.
First off, generate a certificate. I’m not going into the details of how to generate a certificate because that’s outside the scope of this post, and your certificate process will vary.
For this process, I generate certificates outside the Juniper device with some scripts, so no need to create a CSR on the Juniper device and so forth. Typically I use a 2048 to 4096-bit key, and I like to create SAN/UCC certs for these device types so that I can include the hostname, FQDN, and IP address.
One important item here to note: your certificate format needs to be in base64/PEM format. In other words, it should be a file that looks something like this:
And your unencrypted private key like this:
Combine Private Key and Certificate Into Single Line
Next, I open up VS Code and concatenate the private key and certificate into one file:
Now we’re going to replace all the line breaks with line breaks that Junos recognizes. Using below for reference, hit Ctrl+F (Find) or Ctrl+H (Replace) to get the find and replace box and click the drop-down arrow. Next switch to using regex (red arrow), then type ‘\n’ in the search field (pink arrow), type ‘\\n’ in the replace field (green arrow), and then click ‘Replace All’ button (blue arrow).
The end result should look something like this:
Configure Certificate in Junos
To import configure the certificate in Junos, just copy the single line you created above then in configuration mode, enter the set command like below (with your single-line in-between the quotes):
set security certificates local <Junos Name for Cert> "<Transformed Private Key and Certificate>"
Commit the config and you’re set.
Optional: Set the Certificate in Web Management
If you haven’t set a certificate yet for the web-management service, the command below will configure it with the certificate name you used:
set system services web-management local-certificate EXAMPLE-CERT-NAME
Optional: Restart Web Management
As a general practice from managing Linux boxes, I like to restart the web management service. In operational mode, run the following to restart the web-management service:
That’s it! You should be set.
Monitor or View Log Files
If you want, you can look at the httpd.log file to watch the process:
There was a weird issue when I first joined my current job: I was told was that because of the way Palo Alto GlobalProtect (GP) and Microsoft Skype for Business (SfB) works (or maybe was configured?), I needed to log-in to SfB first, then connect to the GP VPN. The rationale was that SfB wouldn’t connect, or it would take a long time to connect, AND THEN even after a period of time, SfB would start behaving weird and it’s Exchange connectivity would drop, so SfB wouldn’t get voicemails, missed calls, etc. Just all out weirdness going on. It’s 2020, so maybe some of this true to form for the year, but probably not.
Full stop. I’m sure you’re asking yourself right now, “Why not just migrate to Microsoft Teams? Get rid of that whole on-premises stuff.”
Let me answer that in a meme:
Skype for Business is one of the integrative technologies that spans lots of technology stacks that isn’t exactly easy to just jump ship from, and Teams as a VoIP replacement is arguably not there yet.
Also, have you seen the UI comparisons? Going from a sleek floating window for calling, IM, and conferencing with SfB to the giant-lets-pack-lots-of-services-into-one-large-window that is Teams is kind of a hard sell on the user training side of things. Maybe I’m biased. Maybe, but I digress.
Ok, back to the GlobalProtect and Skype for Business issues.
I was admittedly puzzled that the solution — to instruct users to sign-in to SfB before they sign-in to the VPN — was the best solution; it doesn’t seem right from a user experience perspective, and then when you toss-in the sudden weird issues with Exchange connectivity, none of this seemed right, and I doubt that’s the ideal experience. So I brought this up and the team basically said, “we just haven’t had time to troubleshoot it, but if you want to figure it out, go for it.”
You know what that sounds like? An adventure. An itch to itch. Something to solve. A challenge! There could be only one response:
Why Split Tunnel Skype for Business?
Something you might be asking is “Why configure split tunnel in the first place? Isn’t split tunneling a headache to manage?”
Split tunneling can definitely be a PITA, but like a million IT questions out there, the answer ultimately to this is, “it depends.” From my experience, split tunneling becomes difficult when you have a lot of split tunneling to manage, but if you have one or two services, it’s not that bad.
For Skype for Business, it’s one of those technologies that is sensitive to jitter, latency, and packet loss. Why? It’s because it’s voice traffic, and just like voice traffic on the inside of the network, where there’s jitter, latency, and/or packet loss, users on opposite ends of calls/conferences will experience this as delayed audio or parts of the conversation will just break up and it leads to an overall poor experience.
When you configure split tunneling, particularly for technologies like SfB, you avoid the dual encryption scenarios and you allow the technology to use its own optimized methods for connecting voice and application traffic by letting the software connect to services over the internet directly versus through a tunnel.
That said, what’s the baseline here? How is GlobalProtect configured with split tunneling and what issues are there?
For GlobalProtect, the split tunnel configuration was configured pretty much like this documentation from Palo Alto (using just the application split tunnel, nothing else). It looked like this:
Here are the issues that were encountered in this setup:
Connectivity issues if connecting SfB after GP VPN is connected
Exchange connectivity in the SfB client drops after a duration of time, even if connection is established before VPN connection
Call transfers working inconsistently
Application sharing working inconsistently
Conference meetings working inconsistently
Issues 3-5 really came later because they were hard to pinpoint due to their inconsistency, but issues 1 and 2 brought some fast wins.
Let’s get to some solutions.
Solution for 1 and 2: DNS. It’s always dns.
It’s kind of a joke, but DNS really does cause a lot of problems, and in a split tunnel configuration when you’ve split-tunnel the traffic by application, the application is still going to resolve addresses by the servers you specify in the GlobalProtect configuration. So if you haven’t changed DNS records, the application will split tunnel, but it will still try to connect to internal resources because that’s the records it has.
I don’t have a PCAP screenshot for this, but if you pull up Wireshark and look at the PCAPs for your network interface (non-GP interface), you’ll see attempts to get to SfB internal IP addresses that aren’t (typically) on your network, and thus services fail.
The solution is simple: for your VPN clients, serve the external IP addresses for A records being queried. I solved this by setting up dedicated DNS servers for VPN clients, then just creating the zones and root records for each FQDN. I did this for all the Skype for Business external IPs (edge and reverse proxy) and the external Exchange records.
After doing this, problems 1 and 2 went away because hostnames were being resolved correctly.
Solutions for 3 through 5: Firewall rules and IP Split Tunneling
Problems 3 through 5 were frustrating like no other because I couldn’t really narrow the problems down exactly. Some people had no problems with call transfers, application sharing, or conferencing, but then sometimes they would. So the thing to do is dig into the logs, and when I did I encountered a lot of this:
ms-diagnostics: 23;source="mediationServer.contoso.com";reason="Call failed to establish due to a media connectivity failure when one endpoint is internal and the other is remote"
ms-diagnostics: 24;source="mediationServer.contoso.com";reason="Call failed to establish due to a media connectivity failure when both endpoints are remote"
Or even better:
ms-client-diagnostics: 52049; reason="Leaving app sharing because re-invite failed";UserType="Callee";MediaType="applicationsharing-video"
These all pointed to firewall issues, and even the ICEWARN messages noted something wrong with STUN, TURN, NAT, etc.
So I did some digging and found that firewall rules needed to be in-place to prevent VPN clients and internal SfB servers from communicating with one another. So I added some PAN policies, and things got better, but not perfect. Also, I added the external SfB IP addresses to the split tunnel in Network > GlobalProtect > Gateway > Agent > Client Settings > Client-Config > Split Tunnel > Exclude (which basically just adds static routes in the Windows routing table to send traffic for those IPs out the non-tunneled interface). Still the occasional error creeping up, and I could even witness it, but still can’t quite nail the problem.
Finally, I had a thought: why not get rid of the application process split tunnel? I mean, if I have DNS addreses configured, and IP split tunneling working, why is the application process split tunnel needed? Removed that from the setting and bam — all the problems went away. Like magic.
Here’s what the final outcome should look like for a GlobalProtect-Skype for Business-Exchange environment for split tunnel.
Of course, I fully admit this is really more of a legacy design with everything on-premises, but you could just as easily send the Exchange traffic to Office 365 in the split tunnel.
Thoughts on GlobalProtect Application Process Split Tunnel
While I had configured the traditional methods of doing split tunnel configurations (IP split tunnel and DNS servers), I’m still a little puzzled to the fact that the Palo Alto GlobalProtect application process split tunnel seemed to cause issues. My guess is that something in the way the Skype for Business client is designed prevents the process from being completely split tunneled, and I think this has to do with the way Skype for Business operates with Windows.
Basically, SfB gets a selection of candidates that it uses from the interfaces on the computer. In a GP split tunnel set up (with or without application process split tunnel configured), you’ll see ALL IP addresses (including the tunnel address) listed as candidates, and my suspicion is that Skype for Business still tries to use a tunnel interface, and sometimes it gets around the Palo Alto GlobalProtect application exclusion, and then that causes calls, application sharing, and even conferences to fail. I can’t show my own logs seeing this for security reasons, so you’ll have to trust me on that one.
Here’s the quick solution for GlobalProtect and Skype for Business Split Tunnel
Create separate DNS servers for VPN clients and create the specific Skype for Business DNS records needed, and configure them for external IP addresses so that Skype for Business resolves external addresses and configures itself appropriately.
Create firewall rules that block traffic to/from the VPN network to internal Skype for Business and Exchange IP addresses. We want the SfB client to determine it can’t go inside for traffic.
In Panorama or PANOS, under Network > GlobalProtect > Gateway > Agent > Client Settings > Client-Config > Split Tunnel > Exclude, configure all external SfB addresses so that the GP client doesn’t send traffic for those IPs through the tunnel. Alternatively, under Network > GlobalProtect > Gateway > Agent > Client Settings > Client-Config > Split Tunnel > Domain and Application > Exclude Domain, you could add the SfB external FQDNs (that said, IIRC, the stuff under ‘Domain and Application’ requires the GlobalProtect license…technically).
Part of my recent stumbling into supporting VoIP systems again is supporting and maintaining Ribbon session border controllers (SBCs). SBCs are pretty much gateways/routers and even firewalls for VoIP traffic, which not only routes/limits traffic between disparate voice systems (like ISDN lines to SIP trunks), but can also manipulate the traffic such as adjusting the phone number format, media codec, changing SIP header information, yada yada — you get the point.
Pretty much all SBCs are configured via extensive web interfaces with lots of deep options, so if you have to manage these at scale, it can get a little time consuming — which is what my problem was. I have many SBCs scattered across our state, and I generally loathe having to make mass changes via any GUI, so does Ribbon have some method of configuring SBCs at scale? Yes, through its Ribbon API.
However, I’m not going to sit down and use Postman to configure these (well, maybe initially to test), so I need a scripting framework. I could use PowerShell or curl with bash scripting to do this, but most of my work in networking is done via Python, and thus because I couldn’t find any Python module built for this already (and it seemed like fun to create), I made this simple Python module and it’s on Github.
I wrote this with a VoIP engineer in-mind, and the basic idea is you import the module in a Python script, then interact with source of data to make changes or perform actions. For example, I’ve used it to create a ton of phone number changes with the routing and transformation tables, and I’ve used it to perform configuration backups across all systems.
To do most of this though, you need to get familiar with API and how object resources are designed. Personally I started with Postman and got a sense for how the API works, then I built the class to handle the API. Initially I started small, but then realized it was rather simple to add additional methods to the pyribbon class to perform different tasks. One could easily build additional methods such as reboot, backup, etc., but I just didn’t see the need for that — yet.
One potential cool use is to perhaps use the module to query for call statistics and send that to a database, then maybe graph it out with Grafana; or, if your NMS allows for script pollers, perhaps use that versus digging through SNMP OIDs.
Final thoughts: I know this is probably high on the esoteric use-cases. I mean, how many VoIP engineers are there in the world, then narrow that down to Sonus/Ribbon VoIP engineers, then dwindle that down to those that interact with the API, and then dwindle that down even more that do it via Python. Pretty small, but if you like Python for this kind of work, this should help.
Also, I thought about putting this in PyPi, but I can’t imagine this getting much use or further development on my end (it does what I need it to do well), and thus I’m shying away from that.
Let me know if you find this useful or have questions.