Skype for Business: Cleanly Shutting Down Server (Invoke-CSComputerFailover and More)

It’s been over three years since I managed and deployed a Skype for Business/Lync system, and at my new job I was hired on as a be a network engineer, but I noted in a past life I received a MCSE in Skype for Business, so I could definitely be the backup for the primary SME (subject matter expert) in SfB. However, in a strange twist, the primary SME left — and you know what, there’s just not a lot of Skype for Business/Lync engineers out there, especially in a small labor market, so I stepped up to help the organization because I was the most qualified by a long-shot.

So I’m back doing some Skype for Business again.

Captain America Here We Go Again

I actually have always liked voice routing, so it’s fun to be doing some of this stuff again (although SfB is a pretty intense, integrative technology, so it’s not all Pop-Tarts and unicorns).

However, I was trying to get reacquainted with some commands for cleanly shutting down a Skype For Business server, and I just didn’t find a lot of good information out there, so I thought I might write something up “real quick”. This is somewhat basic info for SfB enterprise deployments, but it might be helpful.

Enough of the pretext, let’s get to the first command…

Get-CSWindowsService

I’m starting with this command because it’s the most basic command you should already know, but plays a role for later in this post. Typically you’ll use the command to see how many connections are being used by a service, if a service is running, etc.

The command is straightforward: Get all or one of the SfB (or Communication Server, which is where the acronym CS comes from) Windows Services on the machine.

Stop-CSWindowsService

The command `Stop-CSWindowsService` is the most basic command you’ll use to stop all or one of the services on a SfB server. The command will execute stopping of services in the proper order of stopping SfB services, including any dependent services.

Typically you’ll be using this command on a ‘Standard’ deployment SfB server, or any non-front end server in an ‘Enterprise’ deployment such as mediation/edge servers (more info: standard vs enterprise deployment). However, there are probably rare situations in which you’ll stop just one service, so you’ll likely be stopping all of them.

If you’re doing this outside a maintenance window for some reason, I prefer to do the following: Stop-CSWindowsService -Graceful. The -Graceful is important here, because what it does is it puts the services into a paused state, preventing any new connections from happening and waiting on existing connections to disconnect. On mediation servers, whenever I’ve need to stop a server in a mediation pool, this is my preferred method so that I wait for the calls to end. However, it won’t stop until the call is done, so you might be waiting awhile.

Invoke-CSComputerFailover

For whatever reason, this command scared me at first, largely because of my ignorance of what it does. The official documentation on the command I don’t think does it justice, so here’s my attempt at it.

The command `Invoke-CSComputerFailover` will basically perform a Stop-CSWindowsService -Graceful operation, but it acts slightly different. The differences:

  1. It’s used on front end servers in an enterprise deployment (or at least I’ve never seen it documented or used on other SfB server pools). The command causes the front end server to be in a ‘failover’ state, making it unavailable to the rest of the front end pool.
  2. The command migrates data, routing groups, and more to the other front end servers.
  3. The command has a wait time of 1 hour per service, after which if the connections haven’t disconnected, it will force a disconnect. This default can be changed with the `-WaitTime` parameter.
  4. This command will make the server unavailable in the front end pool. After a reboot, or if for some reason you run Start-CSWindowsService, the server won’t be available until you run Invoke-CSComputerFailBack.

After working with it and using it several times, it’s not as scary as I thought. Just run it on one machine at a time lest you have some Windows Fabric issue due to quorum loss (or something to that effect).

Invoke-CSComputerFailover Hanging or Taking Awhile

Sometimes when you’re failing over a front end server, you get stuck waiting for some services to stop like this:

Status screen waiting for Invoke-CSComputerFailover to Progress

If you look at `Get-CSWindowsService`, you might actually find something like this:

Get-CSWindowsService Seeing Services With Hanging Connections

If you note the red and blue arrows, the services are left open, likely from a conference that has ended already, but is being left open for whatever reason. To speed up Invoke-CSComputerFailover, just open a separate elevated terminal and stop the services like this:

Stopping the stalled services with Stop-CSWindowsService in separate window

After which, `Invoke-CSComputerFailover` will continue on as expected.

Invoke-CSComputerFailover progressing

I originally tried out the idea on my own, but the following blog entry also helped me and explains it from a different perspective.

Modding EX2200 PoE Switch with Quieter Fans

Since COVID-19 hit, I moved out to my garage so I could work from home, and in doing so I realized how loud my network rack really is. My rack is not the prettiest rack, and to be frank I’ve never really been a fan of the sleek LED lit home lab. My aesthetic is more like something from a Tatooine droid shop: not disorganized, per se, but certainly not pretty.

Watto's Shop

I’ve really been wanting a Juniper switch for home, so I went shopping on ebay to see if I could find anything, and surprisingly, right now you can find EX2200 switches for really cheap — like, $75 shipped. I wondered then if I could modify the fans on the switch, and lo and behold, you can! So I bought a Juniper PoE switch, but the next part was purchasing the fans.

I came across the video below from Christian Scholz showing mostly how to replace the fans. It’s an excellent video, and shows you how to replace the two back chassis fans.

However, it doesn’t show you how to replace the fans for the power supply, largely because the power supply on non-PoE EX2200s doesn’t have fans (see YouTube preview above).

For the fans on a PoE EX2200, there’s actually just one fan, and if you want to replace it with a Noctua fan, it’s going to work, but it’s going to get a bit warm. One solution is to add a fan on the exhaust port that draws the air out, but it’s going to require a little modification for the chassis cover.

I’m going to briefly show what I did, but I didn’t think to take pictures during the process, but here’s a shot of what it looks like overall:

Uncovered EX2200 switch with fans added.

Backstory: I actually screwed up and got the wrong Noctua fan for the power supply fan. I thought it was a PWM for some reason, but it actually needed a FLX fan. However, I was able to still use the PWM fan. Here’s a breakdown of the above:

  • Blue – This is the power supply fan, a Noctua NF-A4x20 FLX fan.
  • Teal – This is a splice of the old cable adapter with the new using the Noctua omnijoin adapter set that comes with the fans (which is stupidly easy to use). Red and black wires matched up, and then I matched the yellow on the Noctua cable to the blue on the old fan plug.
  • Green – This is the Noctua NF-A4x20 PWM fan that mistakenly bought. The fan runs at 100% all the time, but not an issue. While a mistake, the fan came with…
  • Pink – Y-Adapter set that came with the PWM fan. I was able to plug this into the power plug port for the right FLX rear chassis fan (red), then plug that fan into the main y-adapter and then the PWM into the other port.
  • Red – These are FLX fans.
  • Purple – This was a port labeled “J9” that tried to use for power, but didn’t work (hence why I used the y-adapter).

Some notes I learned. For one, you’ll need to lift the power supply up in order to unscrew the screws:

EX2200 power supply lifted out

I also had to drill a fairly large hole in the chassis cover so that I could fit the the cable in (see below). Had I not screwed up and spliced the PWM cable, the hole could have been smaller and I probably could have fit the y-adapter within the chassis.

EX2200 Fan backside

I had to also modify the chassis for the PoE exhaust port by flattening the metal screen (so the attached fan didn’t rub against it) and I had to drill two additional holes so that the silicon screws could hold the fan down.

Finally, here’s a look from the CLI side:

EX2200 show chassis environment - Fans spinning normally

Final Thoughts

  • Overall, it’s working really well. It’s almost completely quiet — my EVE-NG server is actually louder than this thing.
  • It’s not pretty, but the ugliness is hidden in the back.
  • I DO NOT RECOMMEND YOU DO THIS TO PRODUCTION EQUIPMENT. I’m doing this to my home stuff, so I can live with it and the consequences, but I would never do this at work (too much work, TBH; better to buy an EX2300-C).
  • Screwing the chassis cover back on will indeed be a little tighter, but with a little force you can get all the screws on.
  • I don’t recommend using a drill to unscrew these (you can strip the screws pretty easily), but if you do, have a firm downward motion and screw/unscrew in bursts.
  • I used to loathe the EX2200s for how slow they are, but on the 12.3R12.4 software, they seem to work well.

Channelizing Ports: The Case of the Missing 1 Gbps Interfaces on EX4650 Switches

A few weeks ago while helping deploy a demo Juniper EX4650 aggregation/distribution layer switch, we ran into a problem where 1 Gbps interfaces would not function correctly; i.e., 1 Gbps interfaces were missing and wouldn’t appear on the EX4650s. The issue went something like this:

  • Plugged-in 1 Gbps SFP modules
  • Ran show chassis hardware and verified SFPs were installed
  • Ran show interface ge-0/0/4 to check out the interface but received “error: device ge-0/0/4 not found

Steve Brule Looking Confused

Huh? Come again?

Truth be told, we were using Cisco 1 Gbps SFPs on the switch, and not having any Juniper SFPs (we never have), we chalked up the issue to this being a newer switch and Juniper not capable of supporting non-Juniper SFPs. Thus we ordered Juniper SFPs, waited for them to ship to us, and then tried again — same result.

Steve Brule saying 'k'.

Fast forward an evening of troubleshooting and waiting for TAC to get involved, eventually a Juniper engineer gave us the solution: we needed to channelize our ports.

Channelizing Ports on Juniper EX4650s

First off, I recommend reading the documentation from Juniper on channelizing ports on EX4650s. It’s not required for what’s below, but if you want more information about it, I recommend reading up on it.

Trivia time! EX4650-48Y-8C switches are the same hardware as QFX5120-48Y-8C, just a different OS package!

EX4650 switches come with 48 SFP+ ports that are capable of up to 25 Gbps ports, but come configured by default as 10 Gbps ports; they also come with 8 QSFP+ ports that are capable of up to 100 Gbps speeds, but can operate as 40 Gbps, or can be broken up individually into 4 channels of 25 Gbps (100 Gbps to 4-25 Gbps via breakout cables) or 4 channels of 10 Gbps (40 Gbps to 4-10 Gbps via breakout cables). Breaking them up would be done via a cable like this (there are multiple options for break out cables; this is just one) option:

Breakout cable - 1 40 Gbps QSFP+ module to 4 SFP+ 10 Gbps modules

If you’re not familiar with channelizing ports (I wasn’t until this), channelizing is the process of configuring interface ports to operate in different capacities. The most important thing to note about QFX5120 or EX4650 switches is that the QSFP+ uplink ports are the only ports that perform a process called auto-channelization, a process in which if you plug-in a module, the port will automatically switch between 100 Gbps and 40 Gbps (not sure if you plugged-in a 10 Gbps module if it would do this). If you wish to use the 25/10 Gbps break-out cables, you’ll need to disable auto-channelization and manually configure the ports to operate as such (that’s outside my scope here, but read the link above for more info).

Why this is so important for my issue is that on EX4650s, the 48 SFP+ ports do not perform auto-channelization! These ports, by default, come configured as 10 Gbps ports, and if you wish to use 1 or 25 Gbps modules, you have to manually configure the switches to perform this. This is exactly why the 1 Gbps modules were not appearing, because the ports were not configured to operate in 1G mode!

To configure the ports for the speed needed, here is the configuration we needed:

{master:0}[edit chassis]
[email protected]#
fpc 0 {
    pic 0 {
        port 0 {
            speed 1G;
        }
        port 4 {
            speed 1G;
        }
        port 8 {
            speed 25G;
        }
    }
}

Under the chassis > fpc 0 > pic 0 stanzas, we configured speeds for the ports. However, note that the ports configurations above are broken up every four ports; this is because for the 48 SFP+ ports, port speeds are configured in groups of four (quads), and each quad can be 1, 10, or 25 Gbps. Here’s a visual of the quads:

EX4650 Port Quads - every group of four ports are colored and labeled by the first port - Port 0, port 4, etc.
(Click to enlarge)
Each quad is colorized above (port/quad 0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44)

Therefore, in order to configure 1 Gbps interfaces on an EX4650 switch (or a QFX5120 for that matter), you need to manually set the configuration speed in the chassis configuration.

Problem solved. And Bob’s your uncle.

Some additional items that I’ve discovered in this process:

  • Because the ports are configured in groups of 4, all four ports in a quad will be the same speed. You cannot configure port 1, for example, at a different speed. This makes me suspect that on the backplane side of things, each quad is really a 100 Gbps port of some kind, like the uplink ports on the right, but is channelized in some way on the backend. Maybe. Not sure. This post makes question that logic.
  • Like other EX series switches, non-Juniper copper 1 Gbps modules do not work. The copper modules must be Juniper (or Juniper-coded) SFPs.
  • Non-Juniper 1 Gbps optic modules do work correctly.
  • The eight uplink ports on the right are configured individually, not as quads.
  • Unlike older Juniper equipment, a system reboot is not required for changing port speeds.

Cisco Engage Boise 2020: Get Your Programability On

Note: I wrote this the evening of the event. Just getting around to publishing it now.

Another day has gone and another Cisco Engage Boise is behind us. This year at Cisco Engage Boise 2020, there was a very clear theme that I took away: network engineers need to be adding programability to their skill sets.

Cisco Engage

Note that I used the term programability and not automation. I think Cisco did a fantastic job making this distinction because the term automation gets thrown around a lot in engineering circles, often mistakenly conflated with scripting, or perhaps even with programability.

So what is programability and how is it different than automation? In my mind and in the context of network engineering, programability is an approach to network engineering that utilizes APIs and other tools within a software development framework to accomplish network engineering tasks. Automation, on the other hand, utilizes programability, but uses it to repeat tasks on regular intervals or based on events, both without the need of human interaction.

Cisco made it abundantly clear that future of networking involves incorporating programability into your skill sets, and they’re emphasizing this so much that they’re incorporating programability into the new certifications (click here or here for a great explanation). The engineering tracks will have 80% engineering and 20% programability, whereas the new developer-focused certifications will have 80% development, 20% engineering.

Cisco certifications, showing an engineering track and developer track of certifications.

Cisco is also pushing this paradigm shift by offering a ton of resources on it’s learning platform DevNet, which has a lot of cool resources, and was also a topic that received a lot of airplay today.

Personally, I’ve long held the belief that programability is the future for network and system engineers alike, but to be completely honest, I’ve often been guilty of using the term automation to describe something that I didn’t have great word for. It’s great to see Cisco finally start pushing this, but I must say other vendors having been pushing programability and education on this topic for awhile now; why this is more relevant now is because the big player in this industry, Cisco, is pushing it now, so hopefully we’ll get better programability offerings such as APIs, structured data models, etc.

Sessions

Of course, it goes without saying that Cisco Engage Boise 2020 also had sessions, but to be honest, I don’t think there’s much to note that hasn’t already been mentioned on Reddit or discussed by Packet Pushers. There were a few different sessions to select from, but no tracks, so I went to sessions I thought were interesting for my interests right now (largely operational logistics and preventative analytics).

For conferences and sessions, I must say I have a bit of fatigue for security topics right now, especially on the networking front, so I avoided those. I don’t think anyone is really presenting anything new with security, and most of the fashionable topics are largely relevant to the higher tier of the NIST or CIS security frameworks. Most issues are addressed in the lower tiers, and it’s in those tiers where I’m interested.

That said, for the sessions I attended, I have just a few notes worth writing about:

  • Zero Trust –  Cisco’s philosophy to zero trust is like others: assume the traffic malicious until proven otherwise. Cisco basically believes there are three components that need to be addressed in order move towards a zero trust framework, so Cisco has a off-the-shelf framework for you to purchase and implement. The framework:
    • Users need to be interrogated and challenged for access. Cisco product: Duo.
    • Workloads need to be monitored and inspected for malicious activity. Cisco product: Tetration.
    • Workplaces need to be secured and addressed. Cisco products: DNA, SD-Access and/or ISE.
  • Hyperflex – This is Cisco’s hyperconverged infrastructure, and of note Cisco is going to be deploying Cisco Hyperflex Application Flatform (HXAP), which is a hypervisor to compete with the likes of VMware and will include it’s own Kubernetes implementation. Initially HXAP will be focused on containers, but eventually will grow to allow VMs and so forth. They noted that you’ll be able to avoid the ‘VMware tax’, but I’m certain that will just be replaced with the ‘Cisco tax’.
  • DNA – DNA was definitely featured in a few of the sessions I attended, but I don’t care to go into the details of it. DNA basically is a platform to monitor and manage anything and everything about your Cisco network gear (well, only stuff that does 16.9, and basically on the hardware side is limited to 3850s and 9000 series gear). What really stood out to me about watching DNA in action for the first time was why Juniper made it’s Mist acquisition, because Juniper doesn’t really have a complete product to compete with DNA (yet, but they are making great strides). However, having features and having capabilities is one thing, but execution and proof of execution is a whole other thing — not to mention the price tag for the full vertical integration of everything to integrate with DNA.

Final Thoughts and An Evolving Mindset

When I signed up for Cisco Engage, I fully expected product evangelism — and I was not disappointed. I have a bit of an allergy to sales for some reason, but I’ve been challenging myself about this topic recently, and as a result, I found today’s conference quite useful.

That said, what I’m finally coming around to is the notion that I am a network engineer first (although I use engineer quite loosely here), and a user of products second. At a fundamentally high level, I am trying to solve networking problems for the organization I’m a part of, helping us accomplish our goals the best I think we can. It mostly doesn’t matter what products we deploy and use, as long as we can accomplish what the business needs us to.

I realize that’s not too much of a profound thought, but at times I feel like there’s a religious battle that continually takes place between vendors x, y, and z, aa, bb, etc. The reality is that, to paraphrase a podcast I was recently listening to, the technology landscape is much more diverse than just one or two vendors; there is plenty to learn from out there, and I think limiting yourself to one or even two vendors might force you into a paradigm/framework bubble that may not even be the best solution for your problems, maybe even stifling your ability to think more critically and out-of-the-box.

As a result, I’m trying to stay open to more ideas in networking, trying to not see every problem as a nail.