Next Evolution for Storage Networking: Self-driving SANs

By Todd Owens, Field Marketing Director, Marvell

and Jacqueline Nguyen, Marvell Field Marketing Manager

Storage area network (SAN) administrators know they play a pivotal role in ensuring mission-critical workloads stay up and running. The workloads and applications that run on the infrastructure they manage are key to overall business success for the company.

As with any infrastructure, issues do arise from time to time, and the ability to identify transient links or address SAN congestion quickly and efficiently is paramount. Today, SAN administrators typically rely on proprietary tools and software from the Fibre Channel (FC) switch vendors to monitor SAN traffic. When SAN performance issues arise, they rely on their years of experience to troubleshoot them.

What creates congestion in a SAN anyway?

Refresh cycles for servers and storage are typically shorter and more frequent than those for SAN infrastructure. As a result, servers and storage arrays that run at different speeds end up connected to the same SAN. Legacy servers and storage arrays may connect to the SAN at 16GFC bandwidth while newer servers and storage are connected at 32GFC.

Fibre Channel SANs use buffer credits to manage traffic flow in the SAN: a port may send a frame only while it holds a credit, and the receiver returns each credit once it has freed the corresponding buffer. When a slower device intermixes with faster devices on the SAN, there can be situations where the slower device returns buffer credits too slowly, causing what is called “Slow Drain” congestion. This is a well-known issue in FC SANs that can be time-consuming to troubleshoot and, with newer FC-NVMe arrays, the problem can be magnified. But these days are soon coming to an end with the introduction of what we can refer to as the self-driving SAN.
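To make the slow-drain effect concrete, here is a toy model of credit-based flow control on a single link. It is only a sketch: the function name, the tick-based timing and the numbers are all assumptions chosen for illustration, not taken from the FC standards or from any real switch or HBA.

```python
def simulate(ticks, credits, drain_every):
    """Toy model of one link with credit-based flow control.

    The sender tries to transmit one frame per tick and may do so
    only while it holds a credit. The receiver frees one buffer,
    returning one credit, every `drain_every` ticks.
    All names and numbers here are illustrative assumptions.
    """
    available = credits   # credits the sender currently holds
    outstanding = 0       # frames still occupying receiver buffers
    sent = 0              # frames transmitted
    stalled = 0           # ticks where the sender had no credit
    for tick in range(1, ticks + 1):
        # Receiver drains one buffered frame and returns a credit.
        if outstanding > 0 and tick % drain_every == 0:
            outstanding -= 1
            available += 1
        # Sender transmits only if it holds a credit.
        if available > 0:
            available -= 1
            outstanding += 1
            sent += 1
        else:
            stalled += 1
    return sent, stalled


fast = simulate(ticks=100, credits=8, drain_every=1)  # receiver keeps up
slow = simulate(ticks=100, credits=8, drain_every=4)  # receiver drains slowly
print("fast receiver (sent, stalled):", fast)
print("slow receiver (sent, stalled):", slow)
```

With the fast receiver the sender never waits. With the slow receiver the initial credits are soon used up, and from then on the sender can transmit only as fast as credits come back, which is the behavior the article calls slow drain.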

Like a self-driving car, the SAN infrastructure will soon be able to sense what is going on within the SAN and course-correct the SAN traffic around any potential issues. Recent updates to the FC standards have created industry support for Fabric Performance Impact Notifications (FPINs), which are exchanged between the FC HBAs and FC SAN switches in the SAN fabric. FPINs are analogous to road warning signs. Early warning enables the SAN components to take action to address the issue or make course corrections.

The FC industry supports four FPIN types:

  • Link Integrity – the error thresholds for the physical layer are being exceeded; a detour is suggested
  • Congestion – tells the SAN component that it is the source of congestion; throttling I/Os can help mitigate it
  • Peer Congestion – a slow device exists in the surrounding zone; segregating the device to a slow lane can help
  • Delivery – transmissions failed to complete and were not delivered; the exit is closed

Today, these notifications are sent by the FC switches and received by the FC HBAs and are logged in the SAN for monitoring by the SAN administrator. Later in 2022, HBA drivers will be updated to take action based on the notification received. For example, if a Link Integrity notification is received, the HBA can notify the operating system’s multipathing capability to automatically switch the data flow to another path in the SAN. Or if a Congestion notification is received, the HBA driver can implement algorithms to throttle the I/O rate of the specific port in an attempt to clear the congestion. Additional capabilities are being developed.
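The two responses described above can be sketched as a simple dispatch on the notification type. This is a hedged illustration only: `FpinType`, `respond` and the returned messages are hypothetical names invented for this example and do not correspond to any real HBA driver or operating system API.

```python
from enum import Enum


class FpinType(Enum):
    # The four FPIN types named in this article.
    LINK_INTEGRITY = "link integrity"
    CONGESTION = "congestion"
    PEER_CONGESTION = "peer congestion"
    DELIVERY = "delivery"


def respond(fpin, port):
    """Return a description of the action for a received FPIN.

    Hypothetical helper: a real driver would call into the
    multipathing layer or an I/O throttle instead of returning text.
    """
    if fpin is FpinType.LINK_INTEGRITY:
        # Suspect physical link: move traffic to another path.
        return f"{port}: ask multipathing to switch to another path"
    if fpin is FpinType.CONGESTION:
        # This port is the source of congestion: slow it down.
        return f"{port}: throttle the I/O rate on this port"
    # No automated action is described here for the remaining
    # types, so this sketch only records them.
    return f"{port}: log {fpin.value} notification for the administrator"


for kind in FpinType:
    print(respond(kind, "port0"))
```

Keeping the fallback as a plain log entry mirrors what the article says happens today, where notifications are logged for the SAN administrator to review.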

What the future holds

In the future, we will see more collaboration between the switch-centric capabilities and those of the HBA. Prioritization and segregation of the traffic will be enabled with what can be referred to as virtual lanes where HBA drivers can split the traffic on a port into virtual lanes and prioritize individual frames as low, medium or high priority.
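As a rough picture of the virtual lane idea, the sketch below splits the traffic on one port into three queues and always serves the highest-priority lane that has a frame waiting. The class and method names are hypothetical, and strict-priority scheduling is an assumption made for simplicity; the article does not say how lanes will actually be scheduled.

```python
from collections import deque

# Lane names in service order, highest priority first.
LANES = ("high", "medium", "low")


class VirtualLanePort:
    """Hypothetical model of one port split into virtual lanes."""

    def __init__(self):
        # One first-in, first-out queue per lane.
        self.lanes = {name: deque() for name in LANES}

    def enqueue(self, frame, priority):
        # Place the frame in the lane matching its priority.
        self.lanes[priority].append(frame)

    def next_frame(self):
        # Strict priority (an assumption): serve the first
        # non-empty lane, checking high, then medium, then low.
        for name in LANES:
            if self.lanes[name]:
                return self.lanes[name].popleft()
        return None  # nothing queued on any lane


port = VirtualLanePort()
port.enqueue("backup frame", "low")
port.enqueue("database frame", "high")
port.enqueue("report frame", "medium")
print(port.next_frame())  # the high-priority frame is served first
```

A real implementation would likely need some guard against low-priority traffic being starved, which this sketch deliberately leaves out.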

To summarize all of this, the Fibre Channel industry and the server connectivity engineers at Marvell are working to make the Fibre Channel SAN infrastructure smarter and more self-aware. Like the self-driving car, the self-driving SAN is not too far off, and you can be assured the QLogic HBA technology from Marvell will be a key piece of the SAN infrastructure to make this a reality.

Marvell® QLogic® QLE2690 16GFC HBAs and QLogic QLE2740/QLE2770 32GFC HBAs all support FPIN notifications today with the latest firmware and drivers in all tier-1 O/S environments. The feature is referred to as Universal SAN Congestion Mitigation (USCM) and is part of the StorFusion™ technology within QLogic HBAs. Unlike competitor offerings, USCM does not require additional management software to enable support for all four FPIN types. Driver development and interoperability testing are currently underway to implement automated corrective action in response to FPIN notifications. SAN customers can future-proof their environments to be “self-driving” by implementing QLogic 16GFC and 32GFC HBAs in their SAN solutions today. For more information on USCM, access our technical brief here, or for more information about Marvell QLogic HBA technology, click here.
