Storage area network (SAN) administrators know they play a pivotal role in ensuring mission-critical workloads stay up and running. The workloads and applications that run on the infrastructure they manage are key to overall business success for the company.
As with any infrastructure, issues arise from time to time, and the ability to identify transient links or address SAN congestion quickly and efficiently is paramount. Today, SAN administrators typically rely on proprietary tools and software from Fibre Channel (FC) switch vendors to monitor SAN traffic. When SAN performance issues arise, they rely on their years of experience to troubleshoot them.
What creates congestion in a SAN anyway?
Servers and storage are typically refreshed more frequently than the SAN infrastructure. As a result, servers and storage arrays running at different speeds end up connected to the same SAN. Legacy servers and storage arrays may connect at 16GFC while newer servers and storage connect at 32GFC.
Fibre Channel SANs use buffer credits for flow control: a port may send a frame only while it holds a credit from the receiving port. When slower devices are intermixed with faster devices on the SAN, a slow device can be late in returning buffer credits, causing what is called “Slow Drain” congestion. This is a well-known issue in FC SANs that can be time-consuming to troubleshoot, and newer FC-NVMe arrays can magnify the problem. But those days are coming to an end with the introduction of what we can refer to as the self-driving SAN.
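To make the slow drain effect concrete, here is a minimal toy sketch of credit-based flow control on a single link. It is not a model of real Fibre Channel timing: the function name `simulate_link`, its parameters, and the tick-based clock are all assumptions made up for illustration. The sender may transmit one frame per tick, but only while it holds a credit; the receiver hands a credit back each time it frees a buffer.

```python
def simulate_link(ticks, credits, drain_every):
    """Toy model of one link under credit-based flow control (illustrative only).

    ticks:       number of time steps to simulate
    credits:     buffer credits the sender starts with
    drain_every: the receiver frees one buffer, and returns one credit,
                 every `drain_every` ticks
    Returns (frames_sent, stalled_ticks).
    """
    available = credits  # credits the sender currently holds
    in_buffer = 0        # frames waiting in the receiver's buffers
    sent = 0
    stalled = 0
    for tick in range(1, ticks + 1):
        # Receiver side: free a buffer and return a credit at its own pace.
        if in_buffer > 0 and tick % drain_every == 0:
            in_buffer -= 1
            available += 1
        # Sender side: one frame per tick, but only if a credit is available.
        if available > 0:
            available -= 1
            in_buffer += 1
            sent += 1
        else:
            stalled += 1  # out of credits: the sender has to wait
    return sent, stalled


# A receiver that keeps pace with the sender versus one that drains 4x slower.
fast = simulate_link(ticks=100, credits=8, drain_every=1)
slow = simulate_link(ticks=100, credits=8, drain_every=4)
print("fast receiver (sent, stalled):", fast)
print("slow receiver (sent, stalled):", slow)
```

In this sketch the fast receiver never stalls the sender, while the slow receiver exhausts the initial credits after a few ticks and the sender then spends most of its time waiting. In a real fabric, that waiting backs up into the switches and can affect other devices sharing the same links, which is what makes slow drain hard to trace.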