Table Of Contents
Voice Traffic Engineering
    Design Goals
    Trunking Design Methodology
    SS7 Link Sizing
    Link Sizing for the VPN Backup Tunnel
        MGCP Bandwidth Requirements Under Normal BH Load
        Cisco BTS 10200 Restart and Large-Scale Events
        Network Management Bandwidth Requirements for a Remote POP
        RTP Streams to Announcement Server
        VPN Tunnel Bandwidth Requirements
        VPN Tunnel Bandwidth Characteristics
    Recommendations for Future Engineering
        Overcoming Noncoincident Busy Hours
        Call Centers and Telemarketing Applications
        Fax and Dial-Up Modem Traffic
Voice Traffic Engineering
Voice traffic engineering is used to determine how much voice traffic will be offered to the Cisco BLISS for T1 solution, and to determine the proper sizing of traffic-sensitive parts of the solution. Because Cisco BLISS for T1 is an integrated voice/data solution, there is also a need to properly size parts of the solution to accommodate subscriber Internet data flows. Traffic engineering for the voice application and Internet data traffic engineering are related, as they may share certain components.
This chapter describes the traffic engineering needed to support the voice application. Where there is an associated impact from Internet data flows, this is noted. There is a separate section covering Internet data traffic engineering. It is important to note that voice traffic engineering will differ for each customer depending on design criteria, operational methods, and so forth.
This chapter contains the following sections:
• Design Goals
Lists the design objectives for the various traffic-sensitive parts of the Cisco BLISS for T1 solution.
• Trunking Design Methodology
Explains the methodology used to size local and long distance trunk groups for initial deployment, including how initial deployment traffic estimates are formulated from the information in the Customer Requirements Document (CRD). Assumptions used to support the design methodology are noted.
• SS7 Link Sizing
Explains the methodology used to determine how much SS7 link capacity is required to support the traffic offered at initial deployment.
• Link Sizing for the VPN Backup Tunnel
Describes the methodology for sizing the VPN tunnel through the ISP that carries backup signaling traffic between the POP housing the Cisco BTS 10200 and the remote POPs.
• Recommendations for Future Engineering
Discusses how the traffic engineering model will evolve over time with further deployment.
Design Goals
Telephone traffic systems provide a set of network resources that are shared by a large group of subscribers to make phone calls. Certain components in these systems are usually oversubscribed, meaning that the system is not sized to allow every subscriber to simultaneously place calls to off-switch locations (there are trunk restrictions). Building a system that guaranteed every subscriber could simultaneously call all possible destinations would require dedicated resources (such as trunks, bandwidth, and so on) for every originating line, which is economically prohibitive. This tradeoff applies to both traditional TDM telephone switches and to new softswitches, and it is acceptable because historical data shows that only a small fraction of the subscribers on a switch need to make a phone call at any one time.
Typically, the number of users that simultaneously need service varies throughout the day, peaking at a certain hour called the busy hour (BH). The BH can vary based on the day of week (for example, Monday's BH may be higher than all other days), and may vary based on the time of year.
A telephone switch is designed to carry traffic loads offered during a BH with a certain grade of service. The goal of traffic engineering is to size traffic-sensitive components in the switch to provide the desired grade of service. Table 4-1 lists traffic-sensitive components and design goals for the Cisco BLISS for T1 solution.
Trunking Design Methodology
This section describes the methodology used to size the local and long distance trunk groups for initial deployment. The first step is to establish the target engineering forecast timeframe as defined by the customer. Hence, the engineering estimates will size the trunk groups to accommodate traffic for that target timeframe based on customer market forecasts.
The base data for trunk sizing should be provided by the customer and should indicate the service mix across service types (PBX, POTS, and so forth), the average number of lines/trunks, local minutes of use, long distance minutes of use, and the bandwidth consumed per call.
The design may include the following line types:
•POTS—An IAD with standard FXS lines and analog phones
•POTS-Rotary—An IAD with standard FXS lines and analog phones, configured as a hunt group in the Cisco BTS 10200
•PBX-digital PRI—An IAD with a digital PRI interface to a PBX
•PBX-digital CAS—An IAD with a digital CAS interface to a PBX
The design methodology should estimate the initial deployment load by computing the contribution of a single line (or trunk) toward the traffic offered to the local or long distance trunk group during busy hour (BH). The total offered load to the trunk groups is the contribution of an individual line/trunk type at BH multiplied by the total number of lines of that type, summed across all line types.
Note that after the total number of required trunks is computed, you may need to split the required number of trunks into multiple trunk groups (TGs) and hunt through the TGs using a route guide. Having portions of the total egress capacity allocated into smaller trunk groups allows portions of the capacity to be removed from service without completely isolating a POP. This may also prevent "accidents" at the CLI (where someone accidentally forces the wrong TG out of service) from completely isolating a POP.
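To illustrate the methodology, the following sketch (not part of the original engineering data) totals a hypothetical offered load from per-line BH contributions and then applies the Erlang B formula to find the smallest trunk group that meets the 1 percent blocking objective at BH. All line counts and per-line Erlang values shown are placeholder assumptions, not customer data.

# Illustrative sketch, assuming hypothetical per-line BH loads and line counts.
def erlang_b(trunks, offered_erlangs):
    """Blocking probability for an Erlang B (M/M/c/c) system, computed iteratively."""
    b = 1.0
    for n in range(1, trunks + 1):
        b = (offered_erlangs * b) / (n + offered_erlangs * b)
    return b

def trunks_for_blocking(offered_erlangs, target_blocking=0.01):
    """Smallest number of trunks whose Erlang B blocking meets the 1 percent objective."""
    n = 1
    while erlang_b(n, offered_erlangs) > target_blocking:
        n += 1
    return n

# Hypothetical per-line BH contributions (Erlangs) and deployed line counts by type.
bh_erlangs_per_line = {"POTS": 0.10, "PBX-digital PRI": 0.30, "PBX-digital CAS": 0.30}
lines_deployed = {"POTS": 400, "PBX-digital PRI": 96, "PBX-digital CAS": 48}

offered_load = sum(bh_erlangs_per_line[t] * lines_deployed[t] for t in lines_deployed)
print("Total offered load at BH: %.1f Erlangs" % offered_load)
print("Trunks required for 1%% blocking at BH: %d" % trunks_for_blocking(offered_load))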
SS7 Link Sizing
This section describes the methodology used to determine the number of SS7 links required to support Cisco BTS 10200 traffic to/from the PSTN. One key parameter that is required as input to SS7 sizing is an estimate of the total number of calls per second arriving at the Cisco BTS 10200 during BH. Common assumptions that may be used in the calculations are as follows:
•50 percent of the calls are incoming (all using a standard ISUP call setup sequence). 50 percent of the calls are outgoing.
•Every call originated from an analog line or trunk is destined off-net, either to local or long distance trunks. This results in SS7 messages for every call.
•All outgoing local calls are to exchanges that are portable, hence an LNP query is required. The assumption is that very few dialed numbers will be for numbers ported into the Cisco BTS 10200. Calls dialed to a number ported into the Cisco BTS 10200 should not require an LNP query.
•Outgoing long distance calls do not require LNP queries, just standard ISUP SS7 call setups. LNP for long distance calls, when required, is typically performed by the long distance carrier at the switch just prior to delivery of the call to the local exchange carrier (LEC).
•All 800 calls are routed to the long distance carrier, so no 800 query or subsequent LNP query for the translated number is required (these queries are handled by the long distance carrier).
•Calling Name Delivery (CNAM) may be required for some incoming calls. This would require a CNAM database query.
•CLASS services are not offered between carriers. That means that services such as Automatic Recall do not generate TCAP messaging to local or long distance switches.
•Operator Services/Directory Assistance (OS/DA) calls are routed as multifrequency (MF) calls (they do not require SS7 signaling).
During a BH, the traffic reaches equilibrium. That means the rate at which new call attempts are made is equal to the rate at which existing calls are terminated. A new call that is in the setup phase will inject the ISUP Initial Address Message (IAM), Address Complete Message (ACM), and Answer Message (ANM) onto the SS7 links.
A call in the terminating phase contributes REL and RLC messages to the SS7 link. Because there is equilibrium at the BH, the total SS7 messaging contributed by a new call setup is all five messages (IAM, ACM, ANM, REL, and RLC): every new call being launched is matched by a call that is in the teardown phase. Typical sizes for these five message types are listed in Table 4-2.
Table 4-2 Sample SS7 ISUP Message Sizes
Message Type                                           Approx Size
IAM                                                    43 bytes
ACM                                                    17 bytes
ANM                                                    15 bytes
REL                                                    19 bytes
RLC                                                    14 bytes
Total message byte count for a standard call setup     108 bytes
The total average call rate at BH, based on the number of lines deployed, needs to be calculated. Table 4-3 gives an estimate of the total outbound SS7 load for normal calls. The data in the table assumes an average call rate of 2.283 cps, of which 50 percent are outbound calls (~1 cps worth) that trigger an LNP query in the outbound direction of the SS7 link, and 50 percent are incoming calls (~1 cps worth) that trigger a CNAM query in the outbound direction of the SS7 link (assuming all subscribers have CNAM).
Table 4-3 Estimated SS7 Link Usage
Message Type                             Approx Size                        Comment
Standard SS7 call setup                  108 bytes (see Table 4-2)          Incoming and outgoing call rates are assumed equal (1 cps).
                                                                            All outgoing calls are assumed to trigger an LNP query and
                                                                            all incoming calls are assumed to trigger a CNAM query.
LNP query                                93 bytes
CNAM query                               68 bytes
Total per outgoing call                  269 bytes
Total outgoing at 2.283 cps call rate    614.127 bytes/sec (or 4913 bps)
The SS7 link utilization is dominated in the outbound direction.
614.127 bytes/sec x 8 bits/byte = 4913 bps = 0.0877 Erlang (8.77 percent occupancy on a single 56-kbps SS7 link)
Two A-links are required to meet redundancy requirements, resulting in 4.39 percent occupancy on each A-link.
Note The industry standard design goal for an A-link is ≤ 40 percent occupancy. This lower loading point allows for the failure of either A-link, as the remaining A-link would still be able to assume the load of the failed link at 80 percent occupancy.
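The link-occupancy arithmetic above can be reproduced with a short sketch. It simply re-applies the message sizes from Tables 4-2 and 4-3 and the assumed 2.283-cps BH call rate; it is an illustration, not additional engineering data.

# Illustrative sketch of the outbound SS7 load estimate.
isup_setup_bytes = 43 + 17 + 15 + 19 + 14   # IAM + ACM + ANM + REL + RLC = 108 bytes
lnp_query_bytes = 93
cnam_query_bytes = 68
bytes_per_call = isup_setup_bytes + lnp_query_bytes + cnam_query_bytes  # 269 bytes

total_call_rate_cps = 2.283       # total BH call rate (incoming + outgoing)
link_rate_bps = 56000             # single 56-kbps A-link

outbound_bytes_per_sec = bytes_per_call * total_call_rate_cps   # ~614 bytes/sec
outbound_bps = outbound_bytes_per_sec * 8                        # ~4913 bps
occupancy = outbound_bps / link_rate_bps                          # ~0.0877 Erlang
print("Outbound SS7 load: %.0f bps = %.2f%% of one 56-kbps A-link (%.2f%% on each of two A-links)"
      % (outbound_bps, occupancy * 100, occupancy * 100 / 2))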
Link Sizing for the VPN Backup Tunnel
This section describes the methodology for sizing the VPN tunnel through the ISP that will carry backup signaling traffic between the POP housing the Cisco BTS 10200 and the individual remote POPs. In the Cisco BLISS for T1 design, a clear channel DS3 is recommended to provide primary connectivity between the Cisco BTS 10200 POP and the remote POPs.
This DS3 will carry the following traffic between the POPs:
•MGCP signaling traffic between the Cisco BTS 10200 and remote gateways. This includes the MGCP messages associated with call setup over the course of an average hold-time call, as well as the MGCP pings sent to idle gateways (those not making any calls).
•ISDN backhaul signaling. This is Q.931 traffic from digital PRI IADs that is sent to the Cisco BTS 10200 for call processing. This traffic includes a steady stream of keepalive messages, even when no ISDN call setups are taking place.
•Real-time Transport Protocol (RTP) traffic for calls routed to an announcement server. This is expected to be one-way traffic for callers originating in the remote POP that are routed to an announcement (announcement servers are located in the Cisco BTS 10200 POP).
•SNMP and syslog traffic from routers located in the remote POPs.
If the primary DS3 should fail, the intent is to route the above traffic between the Cisco BTS 10200 POP and the remote POP over a VPN tunnel provided by an ISP. This section provides a method for determining the required characteristics of the tunnel, such as latency and bandwidth, and factors that should be included in the design.
The standard MGCP call setup traffic flows identified above are expected under normal BH operating conditions. It is important to note that MGCP signaling may be temporarily much higher under certain failure conditions. For example, if the Cisco BTS 10200 Softswitch was stopped and restarted (both sides of the duplex system) the resulting endpoint capability exchange may consume a substantial amount of bandwidth in a very short period of time. The potential size of a large-scale gateway restart must be considered, and balanced against the normal operating load of MGCP signaling at BH. The VPN tunnel should be sized to accommodate the larger of the two flows.
MGCP Bandwidth Requirements Under Normal BH Load
The following information is used to compute the MGCP signaling bandwidth at BH. This is similar to the analysis performed previously for SS7 link sizing.
•Capture a sample MGCP ping exchange. This message exchange will occur in the background for every gateway (trunk gateways and IADs), even when the gateway is not involved in a call, and will occur at predetermined intervals. The Cisco BTS 10200 scheduler will schedule pings such that every gateway is pinged within a 10-minute interval.
•Capture a sample MGCP call flow in the lab for a call terminating to an FXS port. Total the number of bytes required for all of these messages in the direction from the Cisco BTS 10200 POP toward the remote POP.
•Capture a sample MGCP call flow in the lab for a call terminating to a PBX-digital CAS trunk. Total the number of bytes required for all of these messages in the direction from the Cisco BTS 10200 POP toward the remote POP.
•Capture a sample MGCP call flow in the lab for a call terminating to a PBX-digital PRI trunk. Total the number of bytes required for all of these messages in the direction from the Cisco BTS 10200 POP toward the remote POP.
Note The PBX-digital PRI requires some MGCP messages, but most of the call setup uses ISDN backhaul. The GW will terminate the PBX Q.921 layer, and will backhaul the Q.931 layer of the signaling channel back to the Cisco BTS 10200 using Reliable UDP (RUDP). This is referred to as an "ISDN backhaul."
There are frequent message exchanges (keepalive messages) between the gateway and the Cisco BTS 10200, even when there are no call setups. Therefore, there is "idle" traffic present at all times (similar to MGCP ping). The ISDN messaging is used to perform call setup between the IAD and the interconnected PBX. The MGCP messaging is used to supply the IAD with information about the far-end gateway that it must communicate with over the IP network. The PRI IAD is sent a CRCX containing a Session Description Protocol (SDP) parameter that tells it the far-end IP address and port number with which to set up the RTP stream.
•Capture a sample MGCP call flow in the lab for an incoming call that is forwarded to a PSTN number. Note that extra MDCX messages are expected here. Total the number of bytes required for all of these messages in the direction from the Cisco BTS 10200 POP toward the remote POP.
The dominant direction is from the Cisco BTS 10200 toward the remote POP, as this direction is expected to carry large messages, such as RQNT, CRCX, and MDCX. The gateways in the remote POP typically reply with very small ACK messages; however, some ACK messages are bigger than others. Specifically, ACK messages sent in response to a CRCX can include a gateway-provided SDP parameter, which can make this particular ACK message much larger than the others. This effect is included when totaling the bandwidth for the individual call flows.
NTFY messages sent from the remote POP are also expected to be much smaller.
Table 4-4 shows the size of the MGCP call flow for different call setup scenarios as described in the Comments column. These were captured from various releases of the Cisco BTS 10200 software. While the Cisco BTS 10200 releases are different, the call flow size is sufficiently accurate to determine the approximate bandwidth requirements.
Table 4-5 shows the expected MGCP signaling for each line type in a given population of n lines. The following is computed on this table:
•Based on the number of lines of each type (and the number of lines for a particular IAD type), the number of MGCP GWs can be estimated. For example, if there are 10 POTS lines deployed, and there are 8 lines / POTS IAD, then 1.25 MGCP GWs are used for POTS.
This process continues for all line types to find the total number of MGCP GWs. This method is only used as an approximation, because it is obviously not possible to have fractions of a gateway. This number is used to compute the number of MGCP pings seen in the network over the 10-minute ping interval.
•A similar calculation determines how many PBX-digital PRI GWs are deployed. This is used, in turn, to determine how many ISDN backhaul sessions will be active (even at idle), and the required bandwidth necessary to support the keepalive messages.
•Using the approximated call flow samples, and the per-line contribution to the call rate at BH, the average MGCP signaling for each type of line can be estimated. The MGCP signaling for the different line types is summed to form the Total Estimated Average Signaling.
•A "safety factor" is then multiplied by the Total Estimated Avg Signaling to form the column "Total Estimated Average Signaling (with safety factor)." This safety factor is quite large (x10 is used in this calculation).
This safety factor is large for the following reasons:
–The calls arrive in a Poisson distribution (standard telephony assumption), which means bursts of as many as 6 call arrivals in a single second (a 0.05 percent probability) can occur. During these bursts, the bandwidth required is 6 times higher than the average.
–The sample call flows use an endpoint identifier that is extremely short (from a lab setup). Real endpoint identifiers will be substantially longer.
–The sample call flows frequently used very short transaction identifiers (these are randomly chosen). In a real network these could increase the length of each message by a few bytes.
–The sample call flows are samples used to obtain a baseline number. Other call flows may be found in the real network that are not predicted.
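To make the mechanics behind Table 4-5 concrete, the following sketch shows how per-gateway ping overhead, per-call message sizes, per-line BH call rates, and the x10 safety factor combine into a total estimated average signaling figure. Every numeric value in it (ping size, bytes per call, lines per IAD, call rates, line counts) is a hypothetical placeholder, not a value from the tables.

# Illustrative sketch, assuming placeholder values for the lab-captured call flows.
PING_BYTES = 200            # hypothetical size of one MGCP ping exchange
PING_INTERVAL_SEC = 600     # every gateway is pinged within a 10-minute window
SAFETY_FACTOR = 10          # safety factor applied to the average signaling estimate

line_types = {
    # type: (lines deployed, lines per IAD, bytes per call toward remote POP, BH calls/sec per line)
    "POTS":            (400, 8, 1500, 0.0005),
    "PBX-digital CAS": (48, 24, 1600, 0.0010),
    "PBX-digital PRI": (96, 24, 1200, 0.0010),
}

# Approximate gateway count (fractional gateways are accepted as an approximation).
total_gws = sum(lines / float(per_iad) for lines, per_iad, _, _ in line_types.values())
ping_bps = total_gws * PING_BYTES * 8 / PING_INTERVAL_SEC

# Per-line call contribution at BH, summed across line types.
call_bps = sum(lines * cps_per_line * bytes_per_call * 8
               for lines, _, bytes_per_call, cps_per_line in line_types.values())

avg_signaling_bps = ping_bps + call_bps
print("Total estimated average MGCP signaling: %.0f bps" % avg_signaling_bps)
print("With x%d safety factor: %.0f bps" % (SAFETY_FACTOR, avg_signaling_bps * SAFETY_FACTOR))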
Cisco BTS 10200 Restart and Large-Scale Events
The computation of average MGCP signaling bandwidth is shown as one factor that must be considered in sizing the VPN tunnel. Certain failure events will result in a large number of MGCP messages being exchanged between the Cisco BTS 10200 and GWs. If the Cisco BTS 10200 is restarted (both sides in the duplex system), it will audit every endpoint in the network for its full capabilities. The response from the endpoint is large (978 bytes from a sample call flow in the lab). Because every endpoint is hit with this AUEP message, there can be a large MGCP load generated. The Cisco BTS 10200 will pace this restoration procedure using two parameters in the Call Agent configuration (ca-config) table (mgcp-init-duration and mgcp-init-terms). The Cisco BTS 10200 will attempt to recover the number of terminations specified in mgcp-init-terms during the time interval specified in mgcp-init-duration. The default values will recover 160 endpoints in 1 second. Table 4-5 shows that the bandwidth required to support this recovery activity far exceeds the average bandwidth requirements for normal MGCP signaling at BH.
The computation of this number uses:
Avg_MGCP_recovery_load = 160 x [978 + 64] = 160 x 1042 = 166720 bytes
where:
160 = number of terminations recovered in 1 second
978 = number of bytes in an Audit Endpoint (AUEP) response (full capability)
64 = number of bytes for the ESP header
Avg_MGCP_recovery_bitrate = 166720 bytes/sec x 8 bits/byte = 1333760 bps
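The recovery-load arithmetic can be expressed as a short sketch using only the values cited in the text (978-byte AUEP response, 64-byte ESP header, 160 terminations recovered per second).

# Illustrative sketch of the MGCP load generated during a duplex Cisco BTS 10200 restart.
AUEP_RESPONSE_BYTES = 978     # full-capability Audit Endpoint response (lab sample)
ESP_HEADER_BYTES = 64         # IPSEC ESP overhead on the VPN tunnel
terms_per_second = 160        # mgcp-init-terms recovered per mgcp-init-duration second (defaults)

recovery_bytes_per_sec = terms_per_second * (AUEP_RESPONSE_BYTES + ESP_HEADER_BYTES)
recovery_bps = recovery_bytes_per_sec * 8
print("MGCP recovery load: %d bytes/sec = %d bps (~%.2f Mbps)"
      % (recovery_bytes_per_sec, recovery_bps, recovery_bps / 1e6))
# 160 x 1042 = 166720 bytes/sec -> 1333760 bps, matching the figure in the text.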
Calculation of the MGCP signaling for normal call setup includes the following:
•14-byte Ethernet header (this would not be present on a DS3 facility, but would be replaced by a 6-byte PPP header). For the purpose of estimating bandwidth, this is almost an even exchange.
•The 64-byte ESP header is not included in the MGCP signaling. Because the MGCP recovery dominates the signaling needs, it was not necessary to compute these numbers with an IPSEC header.
Network Management Bandwidth Requirements for a Remote POP
This section describes how to calculate the network management bandwidth requirements for a remote POP.
SNMP Traffic
To monitor the remote POP, SNMP polls and traps are collected from the remote nodes (ESR, MGX, AS5850, and so on). SNMP traps are not typically collected from IADs. The customer premise equipment usually logs messages to the logging buffer, and uses the syslog facility to transmit warnings back to the data center (syslog requirements are discussed in the next section).
The polls are conducted every 60 minutes and take approximately 90 seconds to complete (there are 30 pollable items in the MGX MIB). The polls are issued at a rate of approximately one every 3 seconds. The response to the poll is a protocol data unit (PDU) of 796 bytes (largest PDU for the MGX shelf).
Experience with a similar network suggests a rule of thumb of one trap for every three poll cycles (approximately one trap every 9 seconds). The maximum trap size is also 796 bytes (PDU size). Hence, the average rate at which SNMP data is exchanged is very low.
A 796-byte SNMP PDU plus a 20-byte IP header, an 8-byte UDP header, and a 64-byte IPSEC header totals 888 bytes. There is essentially one 888-byte packet sent every 3 seconds during the 90-second poll interval every 60 minutes. In addition, one 888-byte trap is sent approximately every 9 seconds.
However, sizing the bandwidth for the average SNMP load will result in a serious underestimation when unforeseen events occur. In a critical failure, a large number of SNMP traps could be generated by the MGX shelf in a very short period of time. Many of these messages would be lost if the bandwidth were sized for an average network management scenario (normal polls/traps), so at the time when it is most important to know what is happening in the network, information would be lost.
To avoid this problem, the bandwidth should be engineered to meet the needs of a critical situation. However, it is very difficult to foresee the peak SNMP load under all possible failure conditions. As an estimate, it is assumed that a catastrophic event could occur on one of the remote POP nodes where up to 100 SNMP traps are sent in a 1-second interval. This results in an average bandwidth requirement of snmp_bandwidth = 100 x 888 bytes/msg = 88800 bytes/sec = 710400 bps.
Note The assumption is that a single node at the remote POP causes this high trap activity. Events can occur that affect more than one node at the remote POP; if multiple nodes send SNMP traps at this rate simultaneously, some messages may be lost.
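A minimal sketch of this worst-case SNMP calculation, using the assumptions above (796-byte PDU, IP/UDP/IPSEC overhead, 100 traps in one second):

# Illustrative sketch of the worst-case SNMP burst bandwidth at a remote POP.
PDU_BYTES = 796                                   # largest MGX PDU
OVERHEAD_BYTES = 20 + 8 + 64                      # IP + UDP + IPSEC headers
packet_bytes = PDU_BYTES + OVERHEAD_BYTES         # 888 bytes on the wire

traps_per_second = 100                            # assumed catastrophic-event burst
snmp_bandwidth_bps = traps_per_second * packet_bytes * 8
print("SNMP burst bandwidth: %d bps" % snmp_bandwidth_bps)   # 710400 bps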
Syslog Traffic
IADs are monitored remotely by forwarding warning messages to a syslog server at the data center. The IAD is also configured to write messages of informational level (or above) to the logging buffer; this buffer can be retrieved to help diagnose customer issues. Following the same philosophy as the SNMP sizing, the bandwidth should be sized to accommodate the traffic from an event that affects multiple IADs (for example, it is not uncommon for multiple IADs to be affected by a power outage, resulting in T1 outages and PBX signaling link failures).
The following assumptions are used to determine the syslog load:
•Based on existing traces from such an event, each syslog message is estimated to be ~ 500 characters (total packet size must add the 20-byte IP header, 8-byte UDP header, 64-byte VPN header) = 592 bytes total per syslog message.
•60 GWs are present in a 700-line deployment.
•A failure event affects 25 percent of all GWs.
•Each affected GW generates approximately 10 syslog messages per event.
Based on the above assumption, the required syslog bandwidth is computed as:
syslog_bandwidth = 60 x .25 x 592 bytes/msg x 10 msgs/event x 8 bits/byte = 710400 bps
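The same calculation expressed as a sketch, with the added assumption that the burst of messages arrives within one second:

# Illustrative sketch of the syslog burst bandwidth during a multi-IAD failure event.
MSG_BYTES = 500 + 20 + 8 + 64        # syslog text + IP + UDP + IPSEC headers = 592 bytes
gateways = 60
affected_fraction = 0.25
msgs_per_event = 10                  # per affected gateway

syslog_bandwidth_bps = gateways * affected_fraction * msgs_per_event * MSG_BYTES * 8
print("Syslog burst bandwidth: %d bps" % int(syslog_bandwidth_bps))   # 710400 bps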
Reserved Bandwidth for Telnet Sessions
It is important to be able to log in and remotely control equipment at the remote POP. While equipment such as the MGX and ESR has remote dial-in capabilities, CPE such as the IAD is reachable only via Telnet. Ensure that Telnet access into remote CPE is available in case of an emergency; bandwidth should therefore be allocated for these sessions.
To determine the bandwidth needed to support a Telnet session, the following assumptions were made:
•The goal is to provide an equivalent 9600-bps connection via Telnet (1200 characters/sec)
•From a sample snoop trace of a Telnet session (show run conducted on a router), Telnet packets are 590 bytes (breaking the output of show run into multiple packets). This is a 536-byte payload, with a 14-byte Ethernet header, 20-byte IP header, and a 20-byte TCP header.
•To achieve a throughput of 1200 characters per second:
telnet_packet_throughput = 1200 chars/sec ÷ 536 chars/packet = 2.24 packets per second
telnet_bandwidth_per_session = 2.24 x [590 + 64] bytes = 1464 bytes/sec = 11713 bps
The 590 bytes assumes that the 14-byte Ethernet header and a 6-byte PPP header are approximately a wash. The 64-byte IPSEC header is then added.
•Allocating enough bandwidth to serve four simultaneous Telnet sessions:
telnet_bandwidth = 11713 bps x 4 sessions = 46853 bps
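A sketch of the Telnet reservation, using the assumptions above (536-byte Telnet payload in a 590-byte packet, 64-byte IPSEC header, four simultaneous sessions):

# Illustrative sketch of the bandwidth reserved for emergency Telnet access to remote CPE.
TARGET_CHARS_PER_SEC = 1200        # 9600-bps terminal equivalent
PAYLOAD_BYTES = 536                # Telnet payload per packet (from the snoop trace)
PACKET_BYTES = 590 + 64            # packet on the wire plus IPSEC header
SESSIONS = 4

packets_per_sec = TARGET_CHARS_PER_SEC / float(PAYLOAD_BYTES)     # ~2.24 packets/sec
bps_per_session = packets_per_sec * PACKET_BYTES * 8               # ~11713 bps
print("Per-session: %.0f bps; %d sessions: %.0f bps"
      % (bps_per_session, SESSIONS, bps_per_session * SESSIONS))   # ~46853 bps total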
RTP Streams to Announcement Server
If a call is incoming at the remote POP and must be routed to the announcement server (AS), it is normally carried on the private facility DS3 link between the Cisco BTS 10200 POP and the remote POP. If that link fails, the RTP stream for these announcements will be routed into the VPN tunnel.
Bandwidth must be allocated for these announcements. Using the publicly available bandwidth calculator on CCO, the required bandwidth for a G.726 (32k) call is 80,222 bps. This factors in a 64-byte IPSEC header on PPP links at a packet rate of 50 packets per second.
Table 4-6 shows the expected rate at which callers will route to the announcement server.
Note The example numbers in Table 4-6 are using a failure rate of 4 percent on calls. Because the trunk groups were engineered for 1 percent blocking at BH, it is expected that at least 1 percent of call attempts at BH will need an announcement. The other 3 percent were allocated for miscellaneous other causes.
The required RTP bandwidth for 700 lines (7 simultaneous announcements) is computed as:
AS_bandwidth = 7 x 80222 bps/announcement = 561554 bps
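A sketch of the announcement-server calculation, using the 80,222-bps per-call figure from the CCO bandwidth calculator and 7 simultaneous announcements:

# Illustrative sketch of the announcement-server RTP bandwidth over the VPN tunnel.
G726_32K_CALL_BPS = 80222           # G.726 (32k) call with 64-byte IPSEC header at 50 pps
simultaneous_announcements = 7       # expected for 700 lines at a 4 percent failure rate

as_bandwidth_bps = simultaneous_announcements * G726_32K_CALL_BPS
print("Announcement server RTP bandwidth: %d bps" % as_bandwidth_bps)   # 561554 bps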
VPN Tunnel Bandwidth Requirements
Table 4-7 summarizes the bandwidth requirements described in the preceding sections.
Table 4-7 Estimated Total Average VPN Bandwidth Required
Function               Average Bandwidth Requirements    Comment
MGCP Signaling         1333760 bps                       Dominated by Cisco BTS 10200 endpoint recovery
                                                         (paced at 160 terminations/sec)
SNMP Traffic           710400 bps                        Allows for worst-case loading of 100 SNMP traps/sec
                                                         (796-byte PDU size)
Syslog Traffic         710400 bps                        Assumes 60 GWs, 25% involved in an event,
                                                         10 messages per event, 592 bytes/msg
Telnet Sessions        46853 bps                         Assumes 9600-bps emulation rate, 4 simultaneous sessions
RTP streams for AS     561554 bps                        Assumes 7 simultaneous users of AS, G.726 (32k) compression
Total                  3362967 bps (3.37 Mbps)
Safety factor x2       6725934 bps (6.73 Mbps)           Safety factor covers:
                                                         •Unforeseen events (need to change recovery rates, increase
                                                          AS ports, Telnet sessions, increased SNMP or syslog usage)
                                                         •Uncontrollable burst rates from Cisco BTS 10200 or GW
                                                          (see Instantaneous Bandwidth)
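The totals in Table 4-7 can be checked with a short sketch:

# Illustrative sketch summing the per-function averages and applying the x2 safety factor.
vpn_components_bps = {
    "MGCP signaling (recovery-dominated)": 1333760,
    "SNMP traffic": 710400,
    "Syslog traffic": 710400,
    "Telnet sessions": 46853,
    "RTP streams for AS": 561554,
}
total_bps = sum(vpn_components_bps.values())                   # 3362967 bps (~3.37 Mbps)
print("Total average: %.2f Mbps; with x2 safety factor: %.2f Mbps"
      % (total_bps / 1e6, 2 * total_bps / 1e6))                # ~3.37 Mbps / ~6.73 Mbps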
VPN Tunnel Bandwidth Characteristics
The following bandwidth characteristics are required for the VPN backup tunnel.
Instantaneous Bandwidth
The bandwidth requirements for the functions in Table 4-7 are listed as "average bandwidth." There is a difference between the average bandwidth required and the instantaneous bandwidth required, and the difference is particularly important for the MGCP signaling.
The average bandwidth assumes all packet arrivals are spaced evenly apart, when in reality the packets do not arrive evenly spaced in time. When the Cisco BTS 10200 does a recovery (or sends out a burst of MGCP messages for signaling), it sends at the rate limited by the local media (FastEthernet). The Cisco BTS 10200 or GW will not know there is a bandwidth-constrained tunnel elsewhere in the network. So while the bandwidth used over a longer interval of time conforms to the average, over a short space of time it may far exceed that.
The service provider (SP) providing the tunnel must allow for these bursts without discarding the packets (particularly the signaling). SPs typically police input traffic at the edge; if a flow does not conform to the contract, packets are discarded. Even though average bit rates are specified, it is necessary to burst up to line rate for some period of time (the longer the period allowed for the burst, the better). The x2 safety factor also provides some "slack" to help ensure the bursts do not exceed the policing limit.
This is one difference between a privately owned DS3 and a VPN tunnel. The private DS3 would not be policed for the MGCP flow. If there are no other users (for example, Priority Queue voice packets) on the DS3 competing for the bandwidth (the DS3 is not congested), the MGCP signaling flow can burst at line rate indefinitely.
If the tunnel is policed at a hard rate (based on the average), it is possible that a burst of MGCP packets (sent at FastEthernet rates by the Cisco BTS 10200) will get discarded by the SP. This is an undesirable outcome, as the loss of packets will cause retransmission (generating more packets and increasing the average bandwidth utilization). If enough messages are lost, call setups may fail. It is also possible that lost messages will place trunks or terminations into "hung" states, where manual intervention is necessary to restore them.
It may be possible to traffic shape the MGCP signaling, but there are finite limits to this, because the queue depth of the traffic shaper is finite. A traffic shaper is a queuing mechanism in which the output of the queue is controlled to conform to the contracted rate. Incoming bursts beyond the contract rate are temporarily queued and eventually sent; however, queued packets incur additional delay, and once the input queue fills, messages are discarded. The degree to which the Cisco BTS 10200 will cause bursting (especially in recovery) is unknown, so it may or may not be possible to compensate with traffic shaping.
Other traffic flows such as syslog and SNMP can also have large bursts in excess of their average rates. Traffic shaping/policing may be necessary on the output queue to prevent these applications from consuming bandwidth allocated to MGCP signaling.
Message Latency
Another key tunnel characteristic that must be controlled is message latency. MGCP messages must be acknowledged within 400 ms, or they will be retransmitted. In addition, the VPN tunnel will also carry ISDN backhaul signaling. ISDN Layer 3 traffic from IAD-digital PRI units is encapsulated in Reliable UDP (RUDP) messages and sent to the Call Agent. RUDP messages are retransmitted after 300 ms (and more than two retransmissions will cause the backhaul link to go out of service). ISDN backhaul messages should also be sequenced properly, as RUDP attempts to maintain the message ordering normally provided by the ISDN Q.921 layer. Out-of-sequence RUDP messages can still be acknowledged, but this is undesirable because the extended ACKs (EACKs) are larger and drive up bandwidth utilization.
The processing delays for the Cisco BTS 10200 and GWs are probably very small for a new deployment. However, as traffic increases, the processing delays for both nodes could increase. As an estimate, it is probably desirable to allow for 250 ms of processing between the nodes at each end. This would allow 150 ms (400 ms - 250 ms = 150 ms) for the round-trip delay across the tunnel, which implies the one-way delay should not exceed 75 ms. A shorter delay is more desirable.
Note This latency figure assumes that interactive voice services are not being provided on the VPN tunnel. In this sizing exercise, this figure is chosen based on signaling needs. Interactive voice conversations are highly susceptible to delay. Excessive delays (> 150 ms one way) can result in speakers talking over each other. If future needs change such that the VPN tunnel is intended to carry interactive voice, this figure needs to be revisited. In this case, the figure may be heavily influenced by the need to stay within the 150-ms voice budget. The announcement server traffic is one way (there is no interaction), and less sensitive to the delay, although excessive delay could still exacerbate existing echo issues.
Message Loss Rate
Message loss rates within the tunnel must be extremely low; this assumes that the MGCP packets have already made it past the policer (the policing issue described in Instantaneous Bandwidth above). The exact rate at which the loss becomes unacceptable is not known.
Message Sequencing
The tunnel must support message sequencing to prevent ISDN backhaul signaling from arriving out of sequence.
Recommendations for Future Engineering
Overcoming Noncoincident Busy Hours
The method described in this chapter computes total offered load at BH by multiplying the contribution of each line type by the number of lines (of that type), then summing across all line types. This provides a reasonable initial estimate of offered load. However, it is a rather conservative figure because it assumes that all lines will be active and contributing during the BH.
In reality, if there are several lines at a small business, not all of them may be active during the BH. For example, some personnel may be on vacation, working from home, or perhaps a user comes in on a staggered shift (thereby offsetting his individual BH from the overall BH of the switch). The extent to which this happens may vary depending on the nature of the subscribers, and cannot be quantified. Hence, if the estimated total offered load were 20 Erlangs at BH (computed using proposed methods), the observed offered load at BH may only be 15 Erlangs.
To accommodate the possible "noncoincidence" of the BH for individual lines and other usage characteristics outside the initial assumptions, it is proposed that the future traffic engineering methodology measure the actual offered load for groups of 100 to 200 lines/trunks. Instead of measuring the impact of any single line on the trunk group load, the impact of a group of lines is measured. This observed value would include the effects of noncoincidence in individual BH for lines in that group. The future traffic engineering methodology would be used once real trunk group (or CDR) data became available for analysis. This will yield a more accurate picture of true usage. The trunk groups would then be engineered based on the expected addition of 100 to 200 lines.
Note The effects of noncoincident BH may be small, because the target group of small business users probably has a consistent work schedule. This effect may be more evident as the mix of users grows to include multiple dwelling units (residential subscribers), whose usage patterns may differ from small business.
Call Centers and Telemarketing Applications
The models described in this document are useful for initial engineering for a population of normal business users, particularly for the case where all PSTN flows egress at the POP where a call originates.
In cases where large nationwide business subscribers are placed on the network, it is important to have prior knowledge about major call flows for this subscriber, particularly if the subscriber runs call centers or telemarketing applications. This would require obtaining traffic information from the subscriber before they are placed on the network. Large subscribers running these kinds of applications may contribute significant amounts of traffic onto the network, potentially contributing to loads in excess of planned BH levels, which may disrupt the service quality seen by other subscribers. The potential concerns are listed below:
•Call centers are typically the focal point for large streams of incoming traffic. For example, assume that a customer begins providing service for a chain of department stores. This department store chain runs a credit center that grants credit to customers and maintains a call center where store employees may call to check customer credit or grant new credit on the spot. A call center like this may funnel large loads toward a particular POP where the call center resides. This could affect the amount of bandwidth required between POP sites. During peak seasons, such as Christmas, this call center may see especially heavy loads.
•Telemarketing applications are usually on the line continuously throughout the hour, and may contribute minutes of usage per line well beyond the 20 minutes at BH used in the engineering estimate.
Fax and Dial-Up Modem Traffic
Fax and modem service on Cisco BLISS for T1 is currently provided using modem and fax passthrough. This means that certain frequency tones unique to fax and modem signals are detected by the gateway's DSP and are used to load a G.711 codec. G.711 can transparently digitize fax and modem analog signals for the end user. However, this means that the bandwidth requirements for a fax/modem call (~91 kbps) are greater than those for a normal voice call (~58 kbps).
For the initial deployment, voice traffic is usually not carried inter-core on IP links. All off-net traffic is routed via the PSTN. On-net to on-net calls are routed over LAN facilities (there are no fax/modem or interactive voice calls carried on WAN facilities). At initial deployment, the only WAN facility that will carry fax/modem and voice calls will be the T1 drop to the customer premise. This link will require QoS settings to place fax/modem and voice packets into a Priority Queue (PQ).
PQ provides expedited queuing for these packets to reduce latency and jitter for these call types. The bandwidth requirements are different for these call types. The problem in sizing PQ is that the amount of fax/modem traffic over the T1 link is not known prior to deployment. It is desirable to accommodate the worst case scenario where all calls are fax/modem (a very low probability event), but this implies the PQ must be sized to accommodate the case where all voice channels require the G.711 codec.
For the case of an 8-line analog FXS IAD, this is not an issue. Allocating the full 91 kbps for the worst-case scenario (all calls are fax/modem) would only require 728 kbps. PQ can be easily configured for this with plenty of extra bandwidth. If the calls are all G.726 (58 kbps), only 464 kbps of the allocated PQ bandwidth would be used. The unused portion of the bandwidth could be configured to be dynamically allocated for other purposes (if so desired).
However, in the case of a 24-channel digital PRI or CAS IAD, this would imply 24 x 91 kbps = 2.184 Mbps, which is more bandwidth than is available over the T1. Hence, it is not possible to accommodate the scenario where all channels of a digital PRI/CAS IAD are using fax/modem.
There are several problems to deal with when deploying a digital PRI/CAS IAD (from a bandwidth perspective):
•The requirements if all 24 channels are used for G.726 voice at 58 kbps would be 1.392 Mbps. This equates to ~90 percent of the T1 bandwidth allocated to PQ, and 4 percent allocated to signaling, with the remainder allocated to other users of the link (for example, Internet data). This means that 1.392 Mbps is reserved to handle voice calls. Technically, it is possible to deploy a 24-channel digital PRI/CAS IAD and carry voice on all 24 channels. However, there would be only 92 kbps left to run Internet data.
1.5336 Mbps - 1.392 Mbps (voice) - 61 kbps (signaling) = 92 kbps
In this scenario, the customer's Internet data would be running extremely slow.
•Assuming that only one T1 WAN link is used, the 1.392 Mbps of PQ bandwidth is consumed in different chunks, depending on whether a call is G.726 (~58 kbps) or fax/modem (~91 kbps). Once the bandwidth in PQ is exhausted, a call may proceed with call setup, but packets may be dropped. For example, 22 voice calls + 2 fax/modem calls = [22 x 58 kbps] + [2 x 91 kbps] = 1.458 Mbps. This oversubscribes the amount of reserved bandwidth in PQ (1.392 Mbps), resulting in packet drops. There is no method currently used in existing production deployments to perform per-call admission control at the gateway.
•For scenarios where a full 24 trunks are provided to the customer premises, Multilink PPP (MLPPP) may be required to ensure adequate bandwidth to accommodate a variable number of fax/modem calls. Assuming the worst case (all calls are fax/modem) would require that 2.184 Mbps be set aside in PQ. There is 3.06 Mbps of bandwidth available in an MLPPP configuration, allowing ~822 kbps to be used for Internet data (3.06 Mbps - 2.184 Mbps (voice) - 61 kbps (signaling) = 822 kbps).
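To illustrate the oversubscription example above, the following sketch checks an offered mix of voice and fax/modem calls against the PQ reservation. The per-call rates are the approximate 58-kbps G.726 and 91-kbps G.711-passthrough figures from the text; the helper function and the example call mix are illustrative only.

# Illustrative sketch of Priority Queue loading for a mixed voice/fax-modem call load.
VOICE_KBPS = 58        # G.726 voice call
FAXMODEM_KBPS = 91     # G.711 fax/modem passthrough call

def pq_load_kbps(voice_calls, faxmodem_calls):
    """Aggregate Priority Queue load for a given mix of call types."""
    return voice_calls * VOICE_KBPS + faxmodem_calls * FAXMODEM_KBPS

pq_reserved_kbps = 24 * VOICE_KBPS          # 1392 kbps reserved for 24 G.726 channels
load = pq_load_kbps(voice_calls=22, faxmodem_calls=2)
print("Offered PQ load: %d kbps, reserved: %d kbps, oversubscribed: %s"
      % (load, pq_reserved_kbps, load > pq_reserved_kbps))   # 1458 > 1392 -> True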