Ethernet Ecstasy?
Telecommunications carriers and large enterprises continue to deploy Ethernet throughout their networks to take advantage of efficiencies and cost-savings. As they do, there is a strong need for equipment vendors to begin developing standards to provide full OAM (Operations, Administration, Maintenance) visibility across carrier Ethernet implementations.
At this time, carrier Ethernet implementations are being driven in three main target areas:
1. Wireless backhaul, driven by 4G and LTE network technologies.
2. Metro Ethernet connections between large enterprises for day-to-day business transactions.
3. Metro Ethernet connections between carriers for large traffic hand-offs.
In order to speed the adoption of carrier-class Ethernet, service providers and carriers need to overcome a number of market constraints that are hindering widespread deployment. The first is the lack of deployed fiber.
The physical fiber or high-speed Ethernet connectivity is not present across the entire geography of a carrier. Dense urban areas are more likely to have access to fiber, but outfitting every floor of a skyscraper is a costly investment. Likewise, extending the reach of physical connectivity to a rural environment may not be cost-effective.
The second obstacle is related to Service Level Agreements (SLAs). In today’s competitive market, carriers must move beyond the current offerings within SLAs. Metrics of these newer SLAs must include mean time to repair (MTTR), frame loss, frame delay, and frame delay variation, to name a few. This is necessary to monetize the service and to ensure quality across networks carrying critical voice, video, and data services.
Business-class SLAs are especially important to the carriers’ customers, who will demand to know how the traffic across the carrier’s network is performing, not just whether the network is up or down. Mission-critical applications are being distributed between interstate enterprise locations as if they were running between floors of a building, and large amounts of data are being replicated nightly between locations. A disruption in this service or a faulty connection can cause data corruption and loss of revenue for the customer.
Luckily, there is considerable market traction and momentum working to mitigate these limitations. For example, the Metro Ethernet Forum (MEF) has been doing interoperability testing of vendors’ equipment. The MEF has become the clearinghouse of interoperability for in-network devices. If the device or carrier is MEF-compliant, the customer can be sure the service will work.
In addition, a number of large wireless operators (e.g., Verizon Wireless, BT Group, and Sprint/Clearwire) have publicly announced large capital commitments to wireless backhaul of their Ethernet traffic to help support their enhanced wireless services. Wireless carriers are banking on the base station to mobile switching center (MSC) bandwidth provided by Ethernet.
Everybody’s Doing It
Perhaps most importantly, there is a trend towards mass adoption of standards for SLA management. Equipment vendors and carriers alike see the need to develop standards for sharing reliability and SLA information. Unfortunately, there are a number of standards bodies involved, including the MEF, the IEEE, and ITU-T. And as with any standard, there is room for interpretation when implementing it.
Two key standards are involved in measuring Ethernet performance and are driven by the various groups. The standards are:
1. IEEE 802.1ag, Connectivity Fault Management.
2. ITU-T Y.1731, OAM Functions and Mechanisms for Ethernet based networks.
The two standards build on each other and provide the basic mechanisms to measure an Ethernet service. They rely on the in-network devices to actively participate in measuring the service to provide the five key performance indicators (KPIs):
1. Availability
2. Frame Loss
3. Frame Delay
4. Frame Delay Variation
5. Throughput
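As a rough illustration of how these five KPIs relate to raw measurements, the following Python sketch derives them from probe results for one measurement interval. It is a simplified sketch, not the procedure defined in 802.1ag or Y.1731: the function names, the input format, the choice of mean absolute difference between consecutive delays as the delay-variation measure, and the 50 percent loss threshold for availability are all assumptions made for the example.

```python
from statistics import mean


def interval_kpis(frames_sent, delays_ms, bytes_received, interval_s):
    """Derive per-interval KPIs from hypothetical probe results.

    frames_sent    -- probe frames transmitted in the interval (must be > 0)
    delays_ms      -- one-way delay of each frame that arrived, in arrival order
    bytes_received -- payload bytes delivered in the interval
    interval_s     -- length of the interval in seconds
    """
    frames_received = len(delays_ms)
    loss_ratio = (frames_sent - frames_received) / frames_sent
    frame_delay_ms = mean(delays_ms) if delays_ms else None
    # Assumed delay-variation measure: mean absolute difference
    # between the delays of consecutive frames.
    diffs = [abs(b - a) for a, b in zip(delays_ms, delays_ms[1:])]
    delay_variation_ms = mean(diffs) if diffs else 0.0
    throughput_bps = bytes_received * 8 / interval_s
    return {
        "frame_loss_ratio": loss_ratio,
        "frame_delay_ms": frame_delay_ms,
        "frame_delay_variation_ms": delay_variation_ms,
        "throughput_bps": throughput_bps,
    }


def availability(loss_ratios, threshold=0.5):
    """Fraction of intervals whose loss ratio is at or below the
    (assumed) threshold, i.e. intervals counted as 'available'."""
    available = sum(1 for ratio in loss_ratios if ratio <= threshold)
    return available / len(loss_ratios)


kpis = interval_kpis(10, [4.0, 6.0, 5.0, 5.0, 4.0, 6.0, 5.0, 5.0], 12000, 1.0)
service_availability = availability([0.0, 0.2, 1.0, 0.0])
```

In this example two of ten probe frames are missing, so the interval shows a 20 percent frame loss ratio, and one of four intervals exceeds the loss threshold and is counted as unavailable.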
There are technical reasons for these KPIs. Disk backups have strict delay and delay variation requirements. Voice services require consistent delay in both directions to maintain the same user experience as today’s infrastructure. Unfortunately, the standards are still very much in development and not yet near ratification. It is also clear that the original intent of the standards was to measure network performance for network engineering and operations, but not to measure network performance for monetizing the service and paying SLA penalties. Fortunately, the standards bodies recognize that work is still needed, and efforts are being re-focused on their current use case of enforcing SLAs and monetizing Ethernet SLAs.
The current standards work underway in all three standards bodies (MEF, IEEE, and ITU-T) addresses a number of industry challenges.
Challenge #1: Vendor interoperability for Operations, Administration, and Maintenance (OAM) is almost non-existent, yet it is crucial to mass adoption. The statistics are available when only one vendor’s equipment is in the network, but they are nearly impossible to obtain when a different vendor is at each end of a Layer 2 connection. This is due in large part to a lack of incentives for vendors to work with one another. A vendor that is truly interoperable potentially opens up a customer deployment to competition and price erosion.
Challenge #2: Even in a single-vendor environment, producing meaningful results is a challenge. The carrier is trying to monetize an Ethernet service with multiple classes of service, but it can only get KPIs on a per-port or per-VLAN basis. Implementation differences in how metrics are reported, such as per-port vs. per-VLAN reporting, complicate matters further. The lack of a per-class reporting definition is slowing adoption of fine-grained SLAs across carrier networks, leaving the industry no choice but to stick with more basic SLAs based on availability and MTTR.
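To see why per-VLAN reporting is not enough for a multi-class service, consider the following Python sketch. The counters, class names, and helper functions are hypothetical and chosen only to illustrate the point: a VLAN-level aggregate can look healthy while one class of service is badly degraded.

```python
def loss_ratio(sent, received):
    """Fraction of frames sent that did not arrive."""
    return (sent - received) / sent


# Hypothetical frame counters for one VLAN carrying three classes of service.
per_class_counters = {
    "voice": {"sent": 1_000, "received": 950},
    "video": {"sent": 9_000, "received": 9_000},
    "data": {"sent": 90_000, "received": 90_000},
}


def per_vlan_loss(classes):
    """Loss ratio as seen when only VLAN-level totals are reported."""
    sent = sum(c["sent"] for c in classes.values())
    received = sum(c["received"] for c in classes.values())
    return loss_ratio(sent, received)


def per_class_loss(classes):
    """Loss ratio reported separately for each class of service."""
    return {
        name: loss_ratio(c["sent"], c["received"])
        for name, c in classes.items()
    }


vlan_view = per_vlan_loss(per_class_counters)
class_view = per_class_loss(per_class_counters)
```

With these numbers the VLAN as a whole shows 0.05 percent frame loss, while the voice class alone has lost 5 percent of its frames, a difference that only per-class reporting would expose.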
Challenge #3: The calculations in today’s versions of current standards are being challenged. The Y.1731 definition of frame loss requires in-line devices to rely on counters for this KPI. The reality of networking is that a pure count methodology is insufficient and cannot account for the crossing of an uncontrolled network. Any TCP retries, discovery protocols, or network anomalies will disrupt the counters and invalidate the SLA. Frame loss could actually be reported as a positive count.
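The weakness of a pure counter methodology can be shown with a small Python sketch. This is a simplified delta-of-counters calculation in the spirit of counter-based loss measurement, not the exact Y.1731 procedure; the function name and the counter values are assumptions for illustration, and counter wraparound is ignored.

```python
def counter_frame_loss(tx_prev, tx_curr, rx_prev, rx_curr):
    """Frames 'lost' between two readings of a transmit counter at one
    end of the service and a receive counter at the other end:
    frames sent in the period minus frames received in the period."""
    return (tx_curr - tx_prev) - (rx_curr - rx_prev)


# Controlled path: every counted frame is a service frame and none is dropped.
clean = counter_frame_loss(1_000, 2_000, 900, 1_900)

# Five frames are counted at the sender but never carried end to end
# (for example, link-local protocol frames). No service frame was dropped,
# yet the calculation reports a loss.
spurious_loss = counter_frame_loss(1_000, 2_005, 900, 1_900)

# Ten extra frames picked up along an uncontrolled segment are counted
# at the receiver, so the calculation reports a negative loss.
negative_loss = counter_frame_loss(1_000, 2_000, 900, 1_910)
```

In both of the last two cases the reported figure says nothing reliable about how many service frames were actually dropped, which is why counters alone are a weak basis for an SLA penalty.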
Working Toward the Goal
The encouraging news is that new standards are being worked on cooperatively across the industry. The focus of the working groups and standards bodies is to clarify OAM functions and provide KPI measurement methods that can be used to monetize SLAs. Here are some examples:
• MEF 10, Ethernet Services Attributes. The MEF 10.1.1 working group is developing updates to clarify and standardize the metrics included in SLAs around availability, frame loss, frame delay and inter-frame delay variation (jitter).
• Service OAM Fault Management (SOAM FM), Implementers Agreement. The SOAM FM Working Group is developing an agreement that covers how OAM functions designed to support fault management are going to be implemented.
• SOAM PM, Service OAM Performance Management Implementers Agreement. The SOAM PM Working Group is developing an agreement that covers how OAM functions designed to support performance management and used to measure service quality terms in SLAs are going to be implemented.
In addition, ITU-T Y.1563 is a standard being developed in the ITU-T Study Group 12, under Question 17 (Q17/SG12). It standardizes the definition of metrics used in SLAs. The ITU-T description for this effort is summarized in this statement: This recommendation defines parameters that may be used in specifying and assessing the performance of speed, accuracy, dependability, and availability of Ethernet frame transfer of Ethernet communication service.
Y.1563 has a similar focus to MEF 10, but it is too early to tell whether the work done in both groups will translate into the same standards.
Nirvana = Business Class SLA Enforcement
On a positive note, solutions exist today to measure performance in a heterogeneous Layer 2 network. The standards provide a common set of features that allow systems to measure the Ethernet network. Systems that understand SLAs and can provide visibility at the port, VLAN, and class of service level can be deployed to bridge the gap until the next round of standards delivers OAM interoperability across Ethernet devices.
Service providers are deploying solutions today that avoid proprietary implementations of measurement and key performance indicators, and rely on establishing measurement methodologies that will evolve with their installed equipment base. These solutions will act as a bridge while interoperability and standardization of OAM continue.
There is a roadmap for the standards to provide full OAM visibility on carrier Ethernet that meets the needs of customers and service providers. The evolution will continue from today’s environment, where a single-vendor solution is the only guarantee that key performance indicators are presented from in-network devices. Soon, we will see true interoperability across the equipment landscape. In the meantime, concentrated effort on the standards and creative interim solutions are the focus for bridging the gap.
Charlie Baker is Product Manager, EXFO Service Assurance. For more information, visit: www.exfo.com.
More information is available at:
• Institute of Electrical and Electronics Engineers (IEEE): www.ieee.org.
• International Telecommunication Union (ITU): www.itu.int.
• Metro Ethernet Forum (MEF): www.metroethernetforum.org.

