The tldr; is at the bottom, after the story mode writeup. Feel free to CTRL+END or hyper-scroll your way there, or read the machinations of my troubleshooting on your way there.
There comes a time in every SMB when your basic network isn't enough anymore. When all devices are on the same subnet, it's called a "flat" network. A simple, flat network on a typical /24 subnet can accommodate about 253 devices (254 usable addresses, minus one for the gateway). That sounds like a lot if you're thinking about your home, but what if you're a business with >100 workstations, a bunch of printers, about 80 IP phones and some servers? You can run out of address space really fast, and when you don't get an IP address, you don't get network access.
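That capacity math is easy to sanity-check with Python's standard-library `ipaddress` module. The subnet below is a hypothetical example, not any real addressing:

```python
# Sketch: counting usable addresses on a "flat" /24 network.
import ipaddress

lan = ipaddress.ip_network("192.168.1.0/24")  # hypothetical subnet

# .hosts() excludes the network (.0) and broadcast (.255) addresses.
usable = len(list(lan.hosts()))
print(usable)      # 254 usable addresses
print(usable - 1)  # 253 left for devices once the gateway takes one
```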
Before my time at Manufacturing Company 1 (known forever after as MC1) there was a decent Cisco switch installed as the core switch. It was an L3 or "Layer 3" switch, meaning it could route traffic and handle multiple network segments. All of the downstream network switches, however, were much simpler than that: what many of us network nerds call "unmanaged" switches. Switches like these don't route traffic at all; they just pass it upstream to the core switch or firewall, and they can't understand VLANs. So...if most or all of your devices are plugged into the unmanaged switches, they're all competing for the same, limited IP address range.
Enter me, network dude extraordinaire. Most of my time would soon be consumed with learning all about the mainframe there, which ran HP's MPE/iX, but this one-sided conversation is not about that. It's about the network, and how I had to scrap all of it and rebuild it from scratch.
Budget? Minimal
Expertise on the team? Just me.
The need for VLANs? Dire.
On a daily basis we were running out of IP addresses. So the other member of the IT department would routinely assign computers and phones static IP addresses, which really only perpetuated the issue, but without a background in networking it seemed like the straightforward thing to do.
Oddly enough, the wireless access points that were connected directly to the core switch were on a different VLAN, while the others were on the default VLAN. And even though the access points were all Ubiquiti, they weren't all managed by the same controller: the original controller had been running on the other IT employee's old desktop until it fell prey to ransomware, and the rest were managed by a second controller on a server that had been shut down because it was infected by the same ransomware. So, in essence, none of them were managed, some were on the default VLAN and others were on a different VLAN.
Of course, people were allowed to connect their personal devices to the wireless network, so what precious few IP addresses would have been available were not.
So...the core switch was usable, though very old and out of support; the other switches weren't usable in any scalable way; the access points were all okay, but needed to be reset and brought back under management; and the firewall was...okay, but not really configured well.
Around this time, Ubiquiti announced its Gen2 switches and the UDM Pro, all at prices that were very easy to swallow. The standard 48-port PoE switch was <$600 and the UDM Pro was <$400, so I could outright replace the entire stack for ~$4,000. The UDM Pro would host the network controller and handle all the L3 routing, all the switches could handle VLANs, and we'd have 32 PoE ports per switch for the phones (which meant we could remove all the phone power adapters from each desk). It would be a pretty basic setup, but it would work a lot better than the existing one.
So I ordered the gear, and once it arrived I unpacked it, stacked it on my desk and started configuring it. I used the WAN2 port to connect it to the network so that I could get the cloud adoption done and get updates on all the devices. That also allowed me to set the static IP on WAN1 in preparation for the cutover, which I scheduled for a weekend.
I was able to set up several VLANs, though I couldn't configure the default VLAN to match the one currently in use: I was using that subnet for the WAN2 connection, so it would have created an overlap (of course). That would be an easy change to make during the cutover.
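That kind of subnet collision is easy to check for yourself with the same `ipaddress` module. The subnets below are hypothetical examples, not MC1's real addressing:

```python
# Sketch: checking whether two proposed subnets overlap, the same
# collision a controller flags when a VLAN reuses an in-use range.
import ipaddress

default_vlan = ipaddress.ip_network("192.168.1.0/24")  # existing default VLAN
wan2_uplink = ipaddress.ip_network("192.168.1.0/24")   # temporary WAN2 link

print(default_vlan.overlaps(wan2_uplink))  # True - can't use both at once

voice_vlan = ipaddress.ip_network("10.10.20.0/24")     # a new phone VLAN
print(default_vlan.overlaps(voice_vlan))   # False - safe to add
```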
With everything as ready as it could be, cutover weekend arrived and I shut down all the Ubiquiti gear.
Cutover was long. It was tedious. There were a number of limitations on the prep work I could do:
I had no real visibility into which device was plugged into which port, since most of the older switches were unmanaged, so I had to watch what populated on the new switches as I moved connections over and switch each port to the correct VLAN if needed.
For the same reason, I had to move some connections for printers and other devices to ports 33-48, since they didn't need PoE and I wanted to save the PoE ports for phones and wireless access points.
The scissor lift wasn't working, so I couldn't hard reset the wireless access points properly. I was able to SSH into some of them with guessed credentials and reset them, but others I just had to use the old VLAN config on that port until the scissor lift was available again at a later date.
A lot of offices were locked and the key I was provided with didn't work on all of them, so I wasn't able to reboot a lot of the Polycom phones until Monday, meaning a few dozen phones weren't ready before users came into the office. Turns out that there were multiple key sets as the locks had been changed by different groups over the years, so there was no real "master" key.
I was also changing the rack layout at the same time as the previous layout had all the patch panels at the top and all the switches on the bottom with 6-10 foot cables used (which meant tons of cable loops and slack hanging everywhere). I was now staggering the patch panels and network switches and moving to 6 inch cables, which was a much cleaner layout.
When it was done I settled at my desk and started pinging the bejezus out of everything I could to make sure it was all online and ready for Monday. There were a handful of devices that didn't pull a new DHCP lease, but we had changed most of the static-IP'd phones and workstations to DHCP as we moved the connections over.
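Pinging the bejezus out of a subnet can be scripted, too. A minimal sketch of a concurrent ping sweep, assuming a Unix-like system with the `ping` command available and using a hypothetical subnet:

```python
# Sketch: a concurrent ping sweep to confirm hosts came back online
# after a cutover. Assumes Linux-style `ping` flags (-c count, -W timeout).
import ipaddress
import subprocess
from concurrent.futures import ThreadPoolExecutor


def is_alive(ip: str) -> bool:
    """Send one ping with a 1-second timeout; True if the host replies."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def sweep(subnet: str) -> list[str]:
    """Ping every usable address in the subnet and return the responders."""
    hosts = [str(ip) for ip in ipaddress.ip_network(subnet).hosts()]
    with ThreadPoolExecutor(max_workers=64) as pool:
        alive = pool.map(is_alive, hosts)
    return [ip for ip, up in zip(hosts, alive) if up]


if __name__ == "__main__":
    for ip in sweep("192.168.1.0/24"):  # hypothetical subnet
        print(ip, "is up")
```

Nothing fancy, but it beats pinging 254 addresses by hand at the end of a long cutover weekend.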
A couple of hours later, after resolving a few duplicate IPs caused by leftover static assignments or workstations hanging on to their previous addresses because the DHCP lease time was set to 7 days (!!!), it was all done and I went home for the night.
Come Monday there was a lot of running around, taking care of the remaining configuration issues for devices that were locked in offices and the other random, non-emergency things that happen when you do a complete rip and replace, but the project itself was a massive success. There were a few firmware stability issues that only presented under load, but applying older firmware versions and changing the uplink topology a bit resolved those.
The Ubiquiti platform has evolved a lot since then, and for the better. It was a little rough at times with some of their firmware releases, but since then they've gotten much better at it.
And the platform allowed me to easily integrate security cameras during the COVID era. And a door access system after that. And when MC1 expanded to several different facilities, I was able to set a default network stack across the board. I introduced a 48 PRO PoE as the core switch at each site, since adding in all the cameras, plus the sudden surge in Google Meet calls replacing in-person meetings, added network strain that the standard 48 PoE switches had difficulty handling.
tldr; old network stack was severely limited and MC1 was running out of IPs daily. New network stack was much more modern and flexible and took a weekend to cut over.
Total cost: ~$4000 + a lot of coffee.