Maximum segment size windows
Introduction:
Terms:
MTU — Maximum Transmission Unit.
This is the largest data packet that can be carried in a single physical frame over TCP/IP. Data does not travel between computers on the Internet as a continuous stream, but in these frames — packets of a strictly defined size.
An oversized packet will most likely be fragmented along the way and padded with "ballast", which hurts the efficiency of the connection. For example, if your ISP uses MTU=576 while Windows on your machine is set to MTU=1500, each of your packets will be split into three packets of up to 576 bytes: 576+576+576=1728 — that is, 228 bytes of ballast get added to every packet you send. Even if the ISP also uses MTU=1500, a router with a smaller MTU may well sit somewhere on the path to a remote server, and your packets will again be fragmented, slowing the transfer down.
MSS — Maximum Segment Size — is another TCP parameter: the largest TCP data segment that can be sent at one time. In other words, MTU = MSS + TCP/IP headers. The headers also have a conventional size — 40 bytes (20 bytes IP and 20 bytes TCP) — so normally MSS = MTU − 40. This parameter is not set directly; it is calculated :)
RWIN — Receive Window — is the receive buffer in which the data portions (MSS) of several received packets accumulate before being passed on, for example, to the browser. If this buffer is too small it can overflow, and incoming packets are rejected and lost. RWIN must be a multiple of the MSS, and for a dial-up (modem) connection it is usually recommended to set it to 4–8 MSS. An excessively large buffer is also undesirable, especially on poor lines: if a single packet is lost because of a line error, not just that one packet but everything in the buffer will be requested again, which takes time.
TTL — Time To Live — the number of hops, i.e. intermediate routers, your packet may pass through on the way to its destination. Each router decrements a counter in the packet header, and when the counter reaches zero the packet is considered lost and is discarded. A default TTL of 32 is clearly not enough for today's sprawling Internet — a remote server is often more than 32 hops away — so the TTL should be raised to at least 64.
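As a worked illustration (using the common Ethernet value MTU = 1500; the numbers are an example, not values from the text above): MSS = MTU − 40 = 1500 − 40 = 1460 bytes, and a receive window of 4–8 MSS would be RWIN = 4 × 1460 = 5840 bytes up to 8 × 1460 = 11680 bytes.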
——————————————————
Thus it is reasonable to conclude that larger packets are, in the end, preferable, and if your ISP has configured its servers and routers for large packets, you should take full advantage of that.
So the ISP's MTU has to be determined. Doing it by hand: first set your own MTU to 1500 bytes. This can be done manually in the registry, but it is better (so that the other parameters can be adjusted as well) to use a dedicated utility, for example
MTUspeed.
After that, measure the ISP's MTU.
At the command prompt:
PING -f -l 1500 xxx.xxx.xxx.xxx, where "xxx.xxx.xxx.xxx" is the IP address of the server being tested, "-l" is the letter L (not the digit one) and sets the size of the send buffer (here 1500), and "-f" sets the flag that forbids fragmentation of the packet. You then vary this size to find the largest value that gets through without fragmentation.
For example, here is what I got:
PING -f -l 1472 mail.ru
Pinging mail.ru [194.67.57.51] with 1472 bytes of data:
Reply from 194.67.57.51: bytes=1472 time=3617ms TTL=249
Reply from 194.67.57.51: bytes=1472 time=3315ms TTL=249
Reply from 194.67.57.51: bytes=1472 time=3271ms TTL=249
PING -f -l 1473 mail.ru
Pinging mail.ru [194.67.57.51] with 1473 bytes of data:
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
But that does not mean the ISP's MTU is 1472 bytes: ping adds its own headers to our data — IP (20 bytes) and ICMP (8 bytes). So the ISP's MTU is 1472 + 28 = 1500 bytes.
Even though programs promising to double your connection speed with one mouse click are a dime a dozen, it does not at all follow that MTU=576, which Western programmers and experts recommend everywhere, is also optimal for us in Russia. Our ISPs almost universally choose MTU=1500, and when pinging remote servers we find that, contrary to all those claims, a packet of that size usually gets through unfragmented.
I don't think the remaining parameters need any comment :) — they are not nearly as contentious as MTU.
What is MSS (Maximum Segment Size) and how it is Calculated?
Question & Answer
Question
What is MSS (Maximum Segment Size) and how it is Calculated?
Answer
What is maximum segment size (MSS)?
— The maximum segment size (MSS) is the largest amount of data, specified in bytes, that a communication device can receive in a single, unfragmented packet.
— The MSS announcement is sent in the SYN packet, notifying the remote end: "I can accept TCP segments up to this size in bytes."
— The MSS advertised by each end can be different, depending on their configuration.
— The MSS covers only the data portion of the packet; it does not include the TCP header or the IP header.
MSS = MTU − sizeof(TCP header) − sizeof(IP header) − sizeof(IPsec overhead)*
*if IPsec is enabled
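As a worked example (assuming a standard Ethernet MTU of 1500 bytes and no IPsec): MSS = 1500 − 20 (TCP) − 20 (IP) = 1460 bytes, which matches the mss 1460 advertised in the SYN packets shown below.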
The following flow charts demonstrate how the MSS is determined based on the configuration and the destination IP.
Example of SYN Packets with MSS:
When a connection is established, each end announces the MSS it expects to receive. An MSS option can only appear in the SYN segment from each end.
In Packet Number 1, the host with IP 2.0.0.205 sends a SYN with mss=1460, and in Packet Number 2, the host with IP 2.0.0.239 answers with its own SYN (the SYN-ACK) carrying mss=1460.
Packet Number 1
ETH: ====( 74 bytes transmitted on interface en5 )==== 17:19:48.667571643
ETH: [ 00:02:55:4f:9a:1c -> 00:09:6b:6b:46:5e ] type 800 (IP)
IP:
IP:
IP: ip_v=4, ip_hl=20, ip_tos=16, ip_len=60, ip_id=14994, ip_off=0
IP: ip_ttl=60, ip_sum=3857 (valid), ip_p = 6 (TCP)
TCP:
TCP: th_seq=3797677451, th_ack=0
TCP: th_off=10, flags<SYN>
TCP: th_win=16384, th_sum=8dd7 (valid), th_urp=0
TCP: mss 1460
Packet Number 2
ETH: ====( 60 bytes received on interface en5 )==== 17:19:48.667771864
ETH: [ 00:09:6b:6b:46:5e -> 00:02:55:4f:9a:1c ] type 800 (IP)
IP:
IP:
IP: ip_v=4, ip_hl=20, ip_tos=0, ip_len=44, ip_id=64798, ip_off=0 DF
IP: ip_ttl=60, ip_sum=35ea (valid), ip_p = 6 (TCP)
TCP:
TCP: th_seq=3654996708, th_ack=3797677452
TCP: th_off=6, flags<SYN|ACK>
TCP: th_win=65535, th_sum=873b (valid), th_urp=0
TCP: mss 1460
setting the maximum segment size in the tcp header
I am putting together a port scanner as a learning exercise. My problem is that I'm trying to set the maximum segment size option (MSS) in the TCP header. I had a look at tcp.h, but I'm having trouble figuring out how to set it. I was hoping there would be an option like this:
Something similar to the above was in tcp.h, but not in the right struct. Admittedly, I'm still fairly new to reading struct definitions and I couldn't make much sense of tcp.h, so in the end I tried just tacking the necessary bytes onto the end of the TCP header:
But after I wrote it, it was clear that it wasn't a real solution, plus it still doesn't work 😛 So is there a better way?
1 Answer 1
The struct tcphdr in tcp.h defines the mandatory part of the TCP header. (Look at the TCP header layout and you can match the definitions in struct tcphdr to the actual bits that appear in the header.) Structs in C have a constant size, but TCP allows optional data. The header length field (doff in the structure) is the total length of the header, including options, so you'll need to add one word to account for the MSS option:
Let’s define a structure for the MSS option:
Now you can populate the structure, in the right order:
Let’s go one step further and define a single structure for your packet, to let the compiler help us out:
(You may need to add an end-of-option-list option at the end, and NOP options to pad the option list to a multiple of 4 bytes, since the header length is counted in 32-bit words.)
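The answer's original code listings are not reproduced in this extract. A minimal sketch of the approach it describes — assuming a Linux-style netinet/tcp.h with the bit-field tcphdr names and GCC's packed attribute — might look like this (illustrative, not the answer's actual code):

#include <stdint.h>
#include <string.h>
#include <netinet/tcp.h>   /* struct tcphdr */
#include <arpa/inet.h>     /* htons() */

/* The 4-byte TCP MSS option: kind = 2, length = 4, 16-bit MSS value. */
struct tcp_mss_option {
    uint8_t  kind;   /* TCPOPT_MAXSEG (2) */
    uint8_t  len;    /* option length in bytes, always 4 for MSS */
    uint16_t mss;    /* advertised MSS, in network byte order */
} __attribute__((packed));

/* A SYN header with the MSS option appended after the fixed 20 bytes. */
struct syn_with_mss {
    struct tcphdr         th;
    struct tcp_mss_option mss_opt;
} __attribute__((packed));

static void build_syn(struct syn_with_mss *p, uint16_t mss_value)
{
    memset(p, 0, sizeof(*p));
    p->th.syn  = 1;
    p->th.doff = sizeof(*p) / 4;        /* 20 + 4 = 24 bytes = 6 32-bit words */
    p->mss_opt.kind = 2;                /* TCPOPT_MAXSEG */
    p->mss_opt.len  = 4;
    p->mss_opt.mss  = htons(mss_value); /* e.g. 1460 */
    /* ports, sequence number, window and checksum are filled in elsewhere */
}

Because the MSS option adds exactly one 32-bit word, doff goes from 5 to 6 and no extra padding is needed in this particular case.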
Maximum Segment Size
Related terms:
Nokia Network Voyager
Tuning the TCP/IP Stack
Established TCP connections have a maximum segment size (MSS) at each end. The MSS setting is the value presented by your system. You can change this value to tune TCP performance by allowing your system to receive the largest possible segments without fragmentation.
The MSS configuration:
Is only applicable to TCP.
Sets the TCP MSS for packets generated and received by the system. If a remote terminating node has a higher MSS than the MSS configured on your system, your system sends packets with the segment size configured with this feature. For example, if you set the MSS to 512, but the remote system has 1024, then your system sends packets with a TCP segment size of 512.
Is only relevant to Check Point security servers or similar products requiring the Nokia appliance to terminate the connection.
Only the remote terminating node responds to the MSS value you set. In other words, intermediate nodes do not respond. Typically, however, intermediate nodes can handle 1500-byte MTUs.
Your system presents the MSS value that you set, and the remote terminating nodes respond by sending segments in packets that do not exceed your set value. The MSS presented by your system should be 40 bytes less than the smallest MTU between your system and the outgoing interface. The 40-byte difference allows for a 20-byte TCP header and a 20-byte IP header, which are included in the MTU.
Figure 4.6 shows the Advanced System Tuning page.
Figure 4.6. The Advanced System Tuning Page
To set the TCP MSS, complete the following steps:
In the system tree, click Configuration | System Configuration | Advanced System Tuning.
In the TCP Maximum Segment Size (MSS) field, type your MSS value.
The range is 512 to 1500 and the default value is 1024. If you enter a value out of the range, you will receive an out-of-range error.
Click Apply and then, to permanently save the value, click Save.
Networking
Support for MSS Replacement
PPPoE implementations may have optional support for MSS (Maximum Segment Size) replacement. The PPPoE header adds 8 bytes of data to each Ethernet packet. As a result, the effective MTU becomes 1492 instead of the normal Ethernet MTU of 1500.
When PPPoE is used in a gateway, clients on the network have no knowledge of this fact. When establishing TCP connections, hosts will advertise an MSS derived from the normal Ethernet MTU of 1500 rather than from the reduced PPPoE MTU of 1492. This can result in oversized segments being dropped by a PPPoE-enabled gateway. The problem is solved by dynamically replacing the MSS in TCP packets with the correct value.
IPSO Command Interface Line Shell (CLISH)
Applying Security Tuning
The outlined configurations are for specific tuning purposes. In most circumstances, you should not change any default settings.
Controlling Sequence Validation
Use the following command to enable and disable sequence validation:
set advanced-tuning tcp-options sequence-validation
Use the following command to view whether sequence validation is enabled or disabled:
show advanced-tuning tcp-options sequence-validation
NokiaIP130:55> show advanced-tuning tcp-options sequence-validation
Sequence Validation Off
Tuning the TCP/IP Stack
Use the following command to set the TCP maximum segment size (MSS) for segments received by your local system:
set advanced-tuning tcp-ip tcp-mss
The default value is 1024.
Use the following command to view the configured TCP MSS value:
show advanced-tuning tcp-ip tcp-mss
NokiaIP130:56> show advanced-tuning tcp-ip tcp-mss
Using the Router Alert IP Option
Use the following command to specify if IPSO should strip the router alert IP option before passing packets to the firewall:
set advanced-tuning ip-options stripra
The router alert IP option is commonly enabled in IGMP packets.
Use the following command to view the configured setting:
show advanced-tuning ip-options stripra
Optimizing IP1260 Ports
You can use the following command to optimize the performance of the interfaces of two-port Gigabit Ethernet NICs in IP1260 platforms when the interfaces forward unidirectional UDP traffic:
set advanced-tuning ethernet-options
Enabling this option does not optimize throughput for other types of traffic or other interfaces. This command is not available on the IP1220.
Do not enable this option if more than two Gigabit Ethernet interfaces are installed in the system. Doing so can impair system performance.
IP Traffic Engineering
Deep Medhi, Karthik Ramasamy, in Network Routing (Second Edition), 2018
7.2.1 TCP Throughput and Possible Bottlenecks
It has been noted that TCP throughput depends primarily on three factors: the maximum segment size (S), the round-trip time (RTT), and the average packet loss probability (q). A key result [274], [275], and [542] on TCP throughput is the following:
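The equation itself has not survived in this extract. In the form commonly attributed to Mathis et al., with S, RTT, and q as defined above and a constant C on the order of one (about 1.22 for standard TCP Reno), the result reads roughly:

\[
\text{TCP throughput} \;\approx\; \frac{S}{RTT}\cdot\frac{C}{\sqrt{q}} \qquad (7.2.1)
\]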
An important question arises from the traffic engineering perspective: where and how does an IP network fit in the three factors and the relation shown in Eq. (7.2.1) ? First, we see that the segment size should be as large as possible. However, note that the maximum segment size is not entirely within the control of the network since it is negotiated by the end hosts; at the same time, this tells us that the network link should be set for the maximum transmission unit possible so that the network link itself does not become the bottleneck in reducing the TCP throughput of end applications. Since end hosts are connected to Ethernet where the maximum transmission unit that can be handled is 1500 bytes, it is imperative that the core network links have the ability to handle packets of at least this size to avoid any fragmentation of packets into multiple smaller packets.
The second factor that affects TCP throughput is the round-trip time. From Eq. (7.2.1), we see that the round-trip time should be minimized, which means that one-way delay must be minimized. While many factors, including processing at the end hosts, can impact delay, from the point of view of the network it is important that the delay on a network link be minimized. Since numerous TCP sessions traverse a network between different source–destination pairs, delay minimization in an IP network is an important goal. Recall our discussion earlier about the direct relationship between delay and utilization, which tells us that utilization should be kept below a desirable value in lieu of considering delay.
The third factor is the average packet loss probability. The average packet loss can depend on many points along a TCP connection; the end hosts may drop a packet, the edge network may drop a packet, there may be a bit error, and so on. A core network can minimize its contribution to the packet loss probability by ensuring that the bit error is not a dominant factor, which is a fair assumption in fiber-based transmission networks now commonly deployed in core networks. However, there is another factor that can contribute to the increase in packet loss probability—that is, if the buffer size at a router is not sized properly. Since packets arrive at random times, it is quite possible that a queue builds up at a router. If there is not enough buffer space, a router is forced to drop packets. If this happens, the affected TCP sessions are forced to reduce the data rate since a dropped packet is commonly understood by a TCP session to be an indication of congestion. That is, even if a network link has enough bandwidth, it is quite possible that if a router buffer is not sized properly, it may appear as congestion to TCP sessions; in other words, the router buffer size has the potential to be another bottleneck in reducing TCP throughput. Thus, the router buffer should be sized properly for the benefit of traffic engineering of a network. How do we estimate router buffer size? To determine this, it is helpful to consider the bandwidth-delay product.
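As a back-of-the-envelope illustration of that rule of thumb (numbers chosen for illustration only, not taken from the text): sizing the buffer at roughly the bandwidth-delay product, B ≈ RTT × link capacity, a 10 Gbps link with a 100 ms round-trip time suggests B ≈ 0.1 s × 10 Gbps = 1 Gbit ≈ 125 MB of buffering on that link.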
Mobile Network and Transport Layer
14.5 Transmission Control Protocol
The TCP [31, 32] is the connection-oriented transport layer protocol designed to operate on top of the datagram network layer IP. The two widely used protocols are known under the collective name TCP/IP. TCP provides a reliable end-to-end byte stream transport; fragmentation and reassembly of oversized datagrams are handled by IP, while TCP itself segments the application byte stream into MSS-sized segments.
TCP uses the selective repeat protocol (SRP) with positive acknowledgments and time-outs. Each byte sent is numbered and must be acknowledged. A number of bytes can be sent in the same packet, and the acknowledgment (ACK) then indicates the sequence number of the next byte expected by the receiver. An ACK carrying sequence number m acknowledges all bytes up to, and including, sequence number m − 1. If a packet is lost, the receiver sends a duplicate ACK for each subsequent correctly received packet.
The TCP header is at least 20 bytes and carries a 16-bit checksum covering the data and the header. The checksum is the 16-bit one's complement of the one's-complement sum of the 16-bit words that make up the data and the header. The amount of data that can be sent before being acknowledged is the window size (Wmax), which can be adjusted by either the sender or the receiver to control the flow based on the available buffers and the congestion. Initial sequence numbers are negotiated by means of a three-way handshake at the outset of the connection. Connections are released by means of a three-way handshake.
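A minimal sketch of that 16-bit one's-complement checksum in C (illustrative only; the pseudo-header that the real TCP checksum also covers is assumed to be handled by the caller):

#include <stdint.h>
#include <stddef.h>

/* One's-complement sum of 16-bit words, then complemented:
 * the Internet checksum used over the TCP header and data. */
static uint16_t internet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += (uint32_t)data[0] << 8 | data[1];
        data += 2;
        len  -= 2;
    }
    if (len)                        /* odd trailing byte, padded with zero */
        sum += (uint32_t)data[0] << 8;

    while (sum >> 16)               /* fold carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}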
The TCP transmitter (Tx) uses an adaptive window-based transmit strategy. Tx never allows more than Wmax unacknowledged packets to be outstanding at any given time. With the congestion window lower limit at time t equal to X(t), packets up to X(t) − 1 have been transmitted and acknowledged, and Tx can send starting from X(t). X(t) has a nondecreasing sample path. The congestion window width at time t, W(t), is the amount of data Tx is allowed to send starting with X(t). W(t) can increase or decrease (because of window adaptation) but never exceeds Wmax. Transitions in X(t) and W(t) are triggered by the receipt of ACKs. On receiving an ACK, Tx increases X(t) by an amount equal to the amount of data acknowledged. Changes in W(t), however, depend on the version of TCP and the congestion control process.
Tx starts a timer each time a new packet is sent. If the timer reaches the retransmission time-out (RTO) value before the packet is acknowledged, a time-out occurs. Retransmission is initiated on time-out. The RTO value is derived from a round-trip time estimation procedure and is set only in multiples of a timer granularity.
The window adaptation procedure is as follows:
Slow start phase. At the beginning of the TCP connection, the sender enters the slow start phase, in which the window size is increased by 1 maximum segment size (MSS) for every ACK received; thus the TCP sender window grows exponentially with each round-trip time.
Congestion avoidance phase. When the window size reaches Wth, the TCP sender enters the congestion avoidance phase. TCP uses a sliding window-based flow control mechanism allowing the sender to advance the transmission window linearly by one segment upon reception of an ACK, which indicates that the last in-order packet was received successfully by the receiver.
Upon time-out. When packet loss occurs at a congested link due to buffer overflow at the intermediate router, either the sender receives duplicate ACKs, or the sender’s RTO timer expires. These events activate TCP’s fast retransmit and recovery, by which the sender reduces the size of the congestion window to half and linearly increases the congestion window as in the congestion avoidance phase, resulting in a lower transmission rate to relieve the link congestion.
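A minimal sketch of the three phases just listed, in C and in units of whole segments (the structure and function names here are illustrative, not from any particular stack; real implementations work in bytes and add the fast retransmit/recovery behavior described above):

#include <stdint.h>

struct cwnd_state {
    uint32_t cwnd;          /* congestion window W(t), in segments */
    uint32_t ssthresh;      /* slow-start threshold Wth, in segments */
    uint32_t acks_this_rtt; /* ACKs counted toward the next linear increase */
};

static void on_ack(struct cwnd_state *s)
{
    if (s->cwnd < s->ssthresh) {
        s->cwnd += 1;                     /* slow start: +1 segment per ACK */
    } else if (++s->acks_this_rtt >= s->cwnd) {
        s->cwnd += 1;                     /* congestion avoidance: about +1 segment per RTT */
        s->acks_this_rtt = 0;
    }
}

static void on_timeout(struct cwnd_state *s)
{
    s->ssthresh = s->cwnd > 2 ? s->cwnd / 2 : 1;  /* halve the threshold */
    s->cwnd = 1;                                  /* restart in slow start */
    s->acks_this_rtt = 0;
}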
Assuming long-running connections and large enough window sizes, the upper bound on the throughput, R, of a TCP connection is given by:
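The equation is not reproduced in this extract; a commonly used form of this bound, with a constant of roughly 1.22 for TCP Reno acknowledging every segment, is:

\[
R \;\lesssim\; \frac{MSS}{RTT}\cdot\frac{1.22}{\sqrt{p}} \qquad (14.1)
\]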
MSS = maximum segment size
RTT = average end-to-end round trip time of the TCP connection
p = packet loss probability for the path.
Equation 14.1 neglects retransmissions due to errors. If the error rate is more than 1%, these retransmissions have to be considered. This leads to the following formula:
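The formula itself is missing from this extract; the approximation commonly used at this point (due to Padhye et al., assuming one segment acknowledged per ACK) is:

\[
R \;\approx\; \frac{MSS}{RTT\sqrt{2p/3} \;+\; RTO\cdot\min\!\bigl(1,\,3\sqrt{3p/8}\bigr)\,p\,(1+32p^{2})}
\]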
RTO = retransmission time-out ∼ 5RTT
For a given object, the latency is defined as the time from when the client initiates a TCP connection until the time at which the client receives the requested object in its entirety. Using the following assumptions, we provide expressions for latency with a static and dynamic congestion window.
The network is not congested.
The amount of data that can be transmitted is dependent on the sender’s congestion window size.
Packets are neither lost nor corrupted.
All protocol header overheads are negligible and ignored.
The object to be transferred consists of an integer number of segments of size MSS.
The only packets with non-negligible transmission times are packets that carry maximum-size TCP segments. Request messages, acknowledgments, and TCP connection establishment segments are small and have negligible transmission times.
The initial threshold in the TCP congestion-control mechanism is a large value that is never attained by the congestion window.
Static Congestion Window
L = latency of the connection
K = O/(WS), rounded up to the nearest integer
W = congestion window size
O = size of the object to be transmitted
R = transmission rate of the link from the server to the client
S = maximum segment size (MSS)
RTT = round-trip time
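The latency expression itself is not reproduced above. With the symbols as defined, the standard static-window result (as given, for example, by Kurose and Ross) is:

\[
L \;=\; 2\,RTT + \frac{O}{R} + (K-1)\left[\frac{S}{R} + RTT - \frac{WS}{R}\right]^{+}
\]

where [x]^+ = max(x, 0); if the window is large enough that WS/R ≥ RTT + S/R, the bracketed term vanishes and the latency reduces to 2 RTT + O/R.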
Dynamic Congestion Window
ISO defined five classes (0 to 4) of connection-oriented transport services (ISO 8073). We briefly describe class 4, which transmits packets with error recovery and in the correct order. This protocol is known as Transport Protocol Class 4 (TP4) and is designed for unreliable networks. The basic steps in the TP4 connection are given below:
Connection establishment: This is performed by means of a three-way handshake to agree on connection parameters, such as a credit value that specifies how many packets can be sent initially until the next credit arrives, connection number, the transport source and destination access points, and a maximum time-out before ACK.
Data transfer: The data packets are numbered sequentially. This allows resequencing. ACKs may be done for blocks of packets. There is a provision for expedited data transport in which the data packets are sent and acknowledged one at a time. Expedited packets jump to the head of the queues. Flow is controlled by windows or by credits.
Clear connection: Connections are released by an expedited packet indicating the connection termination. The buffers are then flushed out of the data packets corresponding to that connection.
In practice, TCP has been tuned for a traditional network consisting of wired links and stationary hosts. TCP assumes that congestion in the network is the primary cause of packet losses and unusual delay. TCP performs well over wired networks by adapting to end-to-end delays and congestion losses. TCP reacts to packet losses by dropping its transmission (congestion) window size before retransmitting packets, initiating congestion control or avoidance mechanisms. These measures result in a reduction in the load on the intermediate links, thereby controlling the congestion in the network. While slow start is one of the most useful mechanisms in wireline networks, it significantly reduces the efficiency of TCP when used together with mobile receivers or senders.
Transmission Control Protocol
Walter Goralski, in The Illustrated Network, 2009
CONNECTIONS AND THE THREE-WAY HANDSHAKE
TCP establishes end-to-end connections over the unreliable, best-effort IP packet service using a special sequence of three TCP segments sent from client to server and back, called a three-way handshake. Why three ways? Because the packet containing the TCP segment that asks a server to accept another connection, and the server's response, might be lost on the IP router network, leaving the hosts unsure of exactly what is going on.
Once the three segments are exchanged, data transfer can take place from host to host in either direction. Connections can be dropped by either host with a simple exchange of segments (four in total), although the other host can delay the dropping until final data are sent, a feature rarely used.
TCP uses unique terminology for the connection process. A single bit called the SYN (synchronization) bit is used to indicate a connection request. This single bit is still embedded in a complete 20-byte (usually) TCP header, and other information, such as the initial sequence number (ISN) used to track segments, is sent to the other host. Connections and data segments are acknowledged with the ACK bit, and a request to terminate a connection is made with the FIN (final) bit.
The entire TCP connection procedure, from three-way handshake to data transfer to disconnect, is shown in Figure 11.3. TCP also allows for the case where two hosts perform an active open at the same time, but this is unlikely.
FIGURE 11.3. Client–server interaction with TCP, showing the three connection phases of setup, data transfer, and release (disconnect).
This example shows a small file transfer to a server (with the server sending 1000 bytes back to the client) using 1000-byte segments, but only to make the sequence numbers and acknowledgments easier to follow. The whole file is smaller than the server host’s receive window and nothing goes wrong (but things often go wrong in the real world).
Note that to send even one exchange of a request–response pair inside segments, TCP has to generate seven additional packets. This is a lot of packet overhead, and the whole process is just slow over high latency (delay) links. This is one reason that UDP is becoming more popular as networks themselves become more reliable.
Connection Establishment
Let’s look at the normal TCP connection establishment’s three-way handshake in some detail. The three messages establish three important pieces of information that both sides of the connection need to know.
The ISNs to use for outgoing data (in order to deter hackers, these should not be predictable).
The buffer space (window) available locally for data, in bytes.
The Maximum Segment Size (MSS) is a TCP Option and sets the largest segment that the local host will accept. The MSS is usually the link MTU size minus the 40 bytes of the TCP and IP headers, but many implementations use segments of 512 or 536 bytes (it’s a maximum, not a demand).
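For reference, applications on a typical POSIX/Linux stack can inspect (and cap) the MSS in effect on a socket through the TCP_MAXSEG socket option; a small, hedged sketch (the function names are my own):

#include <stdio.h>
#include <netinet/in.h>
#include <netinet/tcp.h>   /* TCP_MAXSEG */
#include <sys/socket.h>

/* Query the MSS currently in effect on a connected TCP socket. */
static int print_mss(int sock)
{
    int mss = 0;
    socklen_t len = sizeof(mss);

    if (getsockopt(sock, IPPROTO_TCP, TCP_MAXSEG, &mss, &len) < 0) {
        perror("getsockopt(TCP_MAXSEG)");
        return -1;
    }
    printf("MSS in effect: %d bytes\n", mss);
    return 0;
}

/* Cap the MSS used on a socket; must be done before the connection
 * is established for it to affect the value advertised in the SYN. */
static int clamp_mss(int sock, int mss)
{
    return setsockopt(sock, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof(mss));
}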
A server issues a passive open and waits for a client's active open SYN, which in this case has an ISN of 2000, a window of 5840 bytes, and an MSS of 1460 (common because most hosts are on Ethernet LANs). The window is almost always a multiple of the MSS (1460 × 4 = 5840 bytes). The server responds with a SYN and declares the connection open, setting its own ISN to 4000 and "acknowledging" sequence number 2001 (it really means "the next byte I get from you in a segment should be numbered 2001"). The server also establishes a window of 8760 bytes and an MSS of 1460 (1460 × 6 = 8760 bytes).
Finally, the client declares the connection open and returns an ACK (a segment with the ACK bit set in the header) with the sequence number expected (2001) and the acknowledgment field set to 4001 (which the server expects). TCP sequence numbers count every byte on the data stream, and the 32-bit sequence field allows more than 4 billion bytes to be outstanding (nevertheless, high-speed transports such as Gigabit Ethernet roll this field over too quickly for comfort, so special “scaling” mechanisms are available for these link speeds).
TCP’s three-way handshake has two important functions. It makes sure that both sides know that they are ready to transfer data and it also allows both sides to agree on the initial sequence numbers, which are sent and acknowledged (so there is no mistake about them) during the handshake. Why are the initial sequence numbers so important? If the sequence numbers are not randomized and set properly, it is possible for malicious users to hijack the TCP session (which can be reliable connections to a bank, a store, or some other commercial entity).
Each device chooses a random initial sequence number to begin counting every byte in the stream sent. How can the two devices agree on both sequence number values in only about three messages? Each segment contains a separate sequence number field and acknowledgment field. In Figure 11.3, the client chooses an initial sequence number (ISN) in the first SYN sent to the server. The server ACKs the ISN by adding one to the proposed ISN (ACKs always inform the sender of the next byte expected) and sending it in the SYN sent to the client to propose its own ISN. The client's ISN could be rejected if, for example, the number is the same as that used for the previous connection, but that is not considered here. Usually, the ACK from the client acknowledges the ISN from the server (with the server's ISN + 1 in the acknowledgment field), and the connection is established with both sides agreeing on the ISNs. Note that no user data are sent during the three-way handshake; data should be held until the connection is established.
This three-way handshake is the universal mechanism for opening a TCP connection. Oddly, the RFC does not insist that connections begin this way, especially with regard to setting other control bits in the TCP header (there are three others in addition to SYN and ACK and FIN). Because TCP really expects some control bits to be used during connection establishment and release, and others only during data transfer, hackers can cause a lot of damage simply by messing around with wild combinations of the six control bits, especially SYN/ACK/FIN, which asks for, uses, and releases a connection all at the same time. For example, forging a SYN within the window of an existing SYN would cause a reset. For this reason, developers have become more rigorous in their interpretation of RFC 793.
Data Transfer
Sending data in the SYN segment is allowed in transaction TCP, but this is not typical. Any data included are accepted, but are not processed until after the three-way handshake completes. SYN data are used for round-trip time measurement (an important part of TCP flow control) and network intrusion detection (NID) evasion and insertion attacks (an important part of the hacker arsenal).
The simplest transfer scenario is one in which nothing goes wrong (which, fortunately, happens a lot of the time). Figure 11.4 shows how the interplay between TCP sequence numbers (which allow TCP to properly sequence segments that pop out of the network in the wrong order) and acknowledgments allow both sides to detect missing segments.
FIGURE 11.4. How TCP handles lost segments. The key here is that although the client might continue to send data, the server will not acknowledge all of it until the missing segment shows up.
The client does not need to receive an ACK for each segment. As long as the established receive window is not full, the sender can keep sending. A single ACK covers a whole sequence of segments, as long as the ACK number is correct.
Ideally, an ACK for a full receive window’s worth of data will arrive at the sender just as the window is filled, allowing the sender to continue to send at a steady rate. This timing requires some knowledge of the round-trip time (RTT) to the partner host and some adjustment of the segment-sending rate based on the RTT. Fortunately, both of these mechanisms are available in TCP implementations.
What happens when a segment is “lost” on the underlying “best-effort” IP router network? There are two possible scenarios, both of which are shown in Figure 11.4 .
In the first case, a 1000-byte data segment from the client to the server fails to arrive at the server. Why? It could be that the network is congested, and packets are being dropped by overstressed routers. Public data networks such as frame relay and ATM (Asynchronous Transfer Mode) routinely discard their frames and cells under certain conditions, leading to lost packets that form the payload of these data units.
If a segment is lost, the sender will not receive an ACK from the receiving host. After a timeout period, which is adjusted periodically, the sender resends the last unacknowledged segment. The receiver then can send a single ACK for the entire sequence, covering received segments beyond the missing one.
But what if the network is not congested and the lost packet resulted from a simple intermittent failure of a link between two routers? Today, most network errors are caused by faulty connectors that exhibit specific intermittent failure patterns that steadily worsen until they become permanent. Until then, the symptom is sporadic lost packets on the link at random intervals. (Predictable intervals are the signature of some outside agent at work.)
Waiting is just a waste of time if the network is not congested and the lost packet was the result of a brief network “hiccup.” So TCP hosts are allowed to perform a “fast recovery” with duplicate ACKs, which is also shown in Figure 11.4 .
The server cannot ACK the received segments 11,001 and subsequent ones because the missing segment 10,001 prevents it. (An ACK says that all data bytes up to the ACK have been received.) So every time a segment arrives beyond the lost segment, the host only ACKs the missing segment. This basically tells the other host "I'm still waiting for the missing 10,001 segment." After several of these are received (the usual number is three), the other host figures out that the missing segment is lost and not merely delayed and resends the missing segment. The host (the server in this case) will then ACK all of the received data.
The sender will still slow down the segment sending rate temporarily, but only in case the missing segment was the result of network congestion.
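A minimal sketch of the sender-side rule just described — the third duplicate ACK triggers an immediate retransmission — ignoring sequence-number wraparound and the accompanying rate reduction (the names are illustrative):

#include <stdint.h>

struct fast_retx_state {
    uint32_t last_ack;   /* highest cumulative ACK seen so far */
    int      dup_acks;   /* consecutive duplicate ACKs for last_ack */
};

/* Returns 1 when the segment starting at last_ack should be retransmitted now. */
static int on_ack_received(struct fast_retx_state *s, uint32_t ack)
{
    if (ack > s->last_ack) {          /* new data acknowledged */
        s->last_ack = ack;
        s->dup_acks = 0;
        return 0;
    }
    if (ack == s->last_ack && ++s->dup_acks == 3)
        return 1;                     /* fast retransmit the missing segment */
    return 0;
}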
Closing the Connection
Either side can close the TCP connection, but it’s common for the server to decide just when to stop. The server usually knows when the file transfer is complete, or when the user has typed logout and takes it from there. Unless the client still has more data to send (not a rare occurrence with applications using persistent connections), the hosts exchange four more segments to release the connection.
In the example, the server sends a segment with the FIN (final) bit set, a sequence number (whatever the incremented value should be), and acknowledges the last data received at the server. The client responds with an ACK of the FIN and appropriate sequence and acknowledgment numbers (no data were sent, so the sequence number does not increment).
The client TCP then releases the connection and sends its own FIN to the server with the same sequence and acknowledgment numbers. The server sends an ACK to the FIN and increments the acknowledgment field but not the sequence number. The connection is down.
But not really. The "best-effort" nature of the IP network means that delayed duplicates could pop out of a router at any time and show up at either host. Routers don't do this just to be nasty, of course. Typically, a router that hangs or has a failed link rights itself, finds packets in a buffer (which is just memory), and, trying to be helpful, sends them out. Sometimes routing loops cause the same problem.
In any case, late duplicates must be detected and disposed of (which is one reason the ISN space is 32 bits—about 4 billion—wide). The time to wait is supposed to be twice as long as it could take a packet to have its TTL go to zero, but in practice this is set to 4 minutes (making the packet transit time of the Internet 2 minutes, an incredibly high value today, even for Cisco routers, which are fond of sending packets with the TTL set to 255).
The wait time can be as high as 30 minutes, depending on TCP/IP implementation, and resets itself if a delayed FIN pops out of the network. Because a server cannot accept other connections from this client until the wait timer has expired, this often led to “server paralysis” at early Web sites.
Today, many TCP implementations use an abrupt close to escape the wait-time requirement. The server usually sends a FIN to the client, which first ACKs and then sends a RST (reset) segment to the server to release the connection immediately and bypass the wait-time state.
Adaptive Bandwidth Sharing for Elastic Traffic
Anurag Kumar, …, Joy Kuri, in Communication Networking, 2004
7.6.2 Receiver Buffer and Cumulative ACKs
For each direction of data transfer in a TCP connection, there is a transmitter–receiver pair (see Figure 7.13). During connection setup, several parameters are negotiated between the transmitter and the receiver in each direction; for example, the maximum segment size (MSS; i.e., the maximum packet length excluding the TCP/IP header). Another connection parameter that is negotiated is Wmax, the maximum transmission window. A typical value of Wmax is 32 KB, which, for a TCP packet length of 1500 B, is about 21 packets. A larger maximum window can be negotiated when the BDP is large.
Figure 7.13. A TCP connection is full duplex and involves a transmitter and a receiver in each direction. Although the data is reliably transferred, the acknowledgments can be lost.
One of the constraints on Wmax is the amount of buffer space in the TCP receiver to store out-of-order packets. Consider the example shown in Figure 7.14. By using a resequencing buffer at the receiver, TCP converts the nonsequential packet delivery service provided by the IP layer to a sequential delivery service. Such a buffer is shown in Figure 7.14. Packets 1, 2, 3, and 4 are received in sequence and are passed to the application. At this point the receiver sends back an ACK(5), which tells the transmitter that all packets up to and including packet 4 have been received, and the next packet required has the number 5. Note that, whereas the actual implementation deals with bytes (i.e., byte counts are acknowledged, and the window is also kept in bytes), for simplicity we work with packets. Thus the TCP receiver's acknowledgments are cumulative ACKs; each ACK acknowledges everything in the past. This is how TCP deals with an unreliable acknowledgment channel.
Figure 7.14. The resequencing buffer at a TCP receiver. Out-of-order packets are accepted but are delivered to the application only in order.
Along with ACK(5), the receiver also sends back a window advertisement of Wmax. In the example, we have taken Wmax = 8. From these two pieces of information the transmitter infers that the receiver is willing to accept packet numbers <5, 6, …, 12> (i.e., Wmax packets starting from packet number 5).
Let us now continue following the example in Figure 7.14 . For some reason packet 5 does not show up, but packets 6, 7, and 8 do. These are accepted and stored after space is reserved in the buffer for packet 5. For each of these packets the receiver sends back an ACK(5), which tells the transmitter that (1) a packet was received, but (2) only data up to, and including, packet number 4 have successfully arrived in order. Furthermore, the window advertisement is kept at Wmax. Note that by this point in time, the transmitter will have received the first ACK(5), and its window’s left edge is at packet number 5. Packet 9 also does not show up, but packets 10, 11, and 12 do. With these packets the receiver buffer is now full (having left space for the missing packets). The transmitter cannot send any more packets because the window is closed. Packet 5 then arrives (perhaps it took a circuitous route) and is placed in its position in the buffer. The receiver now returns ACK(9).
The receiver is now ready to pass packets 5, 6, 7, and 8 to the application. However, the application is not ready (e.g., it is a slow line printer with a small internal buffer and has not yet finished printing the previous data). An interlayer flow control will therefore operate and prevent the TCP receiver from sending the packets to the application. Hence the TCP receiver sends back a window advertisement of 4 with the ACK(9), thus telling the transmitter that it can accept only packets 9, 10, 11, and 12. In this situation, at the transmitter, the net effect of receiving this ACK will be to shift the left edge of the transmission window to 9, drop the window to 4, and stay stalled. If packet 9 is really lost, then loss recovery mechanisms (discussed next) will eventually come into play. We thus see that the window advertisement mechanism implements the function of sender–receiver flow control; that is, it prevents a fast transmitter from flooding a slow receiver that has limited storage.
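A compact sketch of the receiver bookkeeping described above, working in packet units as the text does (the names and the fixed-size window are illustrative simplifications; a real receiver tracks byte ranges):

#include <stdint.h>

#define WMAX 8                  /* maximum window advertised, in packets */

struct rx_state {
    uint32_t next_expected;     /* lowest packet number not yet received in order */
    uint32_t app_consumed;      /* packets already handed to the application */
    uint8_t  present[WMAX];     /* out-of-order packets buffered, indexed from next_expected */
};

/* Accept packet 'num' if it fits the window; return the cumulative ACK to send. */
static uint32_t on_packet(struct rx_state *r, uint32_t num)
{
    if (num >= r->next_expected && num < r->next_expected + WMAX)
        r->present[num - r->next_expected] = 1;

    /* slide the in-order point forward over contiguously received packets */
    while (r->present[0]) {
        for (int i = 1; i < WMAX; i++)
            r->present[i - 1] = r->present[i];
        r->present[WMAX - 1] = 0;
        r->next_expected++;
    }
    return r->next_expected;    /* cumulative ACK: "next packet I need" */
}

/* Window advertisement: how many more packets the receiver can accept,
 * limited by what the application has not yet consumed (interlayer flow control). */
static uint32_t window_advertisement(const struct rx_state *r)
{
    uint32_t unread = r->next_expected - r->app_consumed;
    return WMAX > unread ? WMAX - unread : 0;
}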
Transport Protocols
Jean-Philippe Vasseur, Adam Dunkels, in Interconnecting Smart Objects with IP, 2010
6.2.3 TCP Options
TCP options provide additional control information. They reside between the TCP header and the data of a segment. Since the original specification of TCP [204], a number of additions have been defined as TCP options. These include the TCP selective acknowledgment (SACK) [170] and the TCP extensions for high-speed networks [136] that define TCP time stamps and window scaling options.
For smart objects, the arguably most important TCP option is the maximum segment size (MSS) option. The TCP MSS option specifies the largest TCP segment size that a TCP end point is able to accept. The MSS option is sent by both parties during the opening of a connection. The MSS option effectively limits the amount of data in each TCP segment. This is important for smart object networks, which typically can carry only small packets.
When opening a TCP connection, both the TCP sender and the TCP receiver indicate the MSS they can accept by placing the TCP MSS option in the SYN and the SYNACK segments. When receiving a TCP MSS option, a TCP end point must reduce the size of the segments it sends accordingly. This is useful for TCP end points with small amounts of memory, because it allows the end point to set a limit on the size of the packets it will receive.
Network Layer – Building on IP
Implementing connection establishment (receive)
This module will emulate the server side's 'listen' function. A listen function is a blocking function that waits for incoming segments, discarding all except ones that establish a new connection. It assumes that no connection is currently in progress and that the receiver is simply waiting for a new connection to be opened; that is, the other end is just about to initiate a connection. The software holds a global state variable 'state', initialized to a constant value that we associate with the LISTEN state.
In the LISTEN state: Read the next incoming TCP segment. If this has the SYN flag set, continue to the next step, if not, go back and wait for the next segment.
Store the incoming segment's 32-bit sequence number in a local variable 'rxSEQ'. In Figure 7-15, this number is shown, for example, as 100. The received SYN segment will most likely have an optional header, which will include an MSS field; store this value in a local variable 'rxMSS'. The MSS is the largest amount of contiguous data the other end will accept from us. We can send smaller data payloads, but not larger than the MSS they have specified. We only need to store this value if we think there may be a clash with large data blocks, which is not usually the case in small systems. The default value is usually 1024 bytes. We also create a local sequence number stored in 'mySEQ' and initialise this to a random 32-bit number (in the figure we have initialised it to 200). We shall be using this number as our incremental sequence number when we transmit data to the remote.
Assemble a reply segment with the following fields: SYN = on, ACK = on, seq (sequence number) = mySEQ, ack (acknowledgment number) = rxSEQ + 1, [MSS option] = 800. Set the state table value from LISTEN to SYN_RCVD, and start a 10 s time-out counter. Transmit the segment to the remote, and proceed to the next stage (do not forget to 'swap' the source–destination pairs of ports and IP addresses when creating the reply). The ack number sent is just the received seq plus one, as the SYN flag is deemed to be equivalent to one byte of data. We also need to add an options header to our TCP segment containing an MSS field. The MSS field we shall send is the maximum amount of data we are prepared to take from the remote, and will depend on how much RAM our system has available. We have set this to 800 bytes as an example.
SYN-RCVD state: The receiver now waits for a reply from the remote. At the same time, the countdown timer is ticking down, to ensure the state can be exited if the remote does not respond in time. The next reply from the remote should have ACK = on and its ack field should be set to ‘mySEQ + 1’, which was our last sent seq number plus one. If this condition is met, we proceed to the next stage. Note that this ack segment may contain actual data, so provisions should ideally be made to accept this. If the countdown timer times out (no segment received), or if the received ACK was in error or with the wrong sequence number, we drop the transaction (send a FIN, or a RST segment) and return to 1 above. If we receive a RST segment we just drop the transaction, not forgetting to set the state variable to CLOSE, or LISTEN as required.
ESTABLISHED STATE: A connection has now been established and communications can take place. This will continue until a segment with a FIN or RST is received, or the connection is lost.
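A condensed C-style sketch of the three steps above (the segment structure and the read_segment()/send_segment() helpers are hypothetical placeholders for whatever the small system's IP layer provides, and the 10 s timer is reduced to a simple tick budget):

#include <stdint.h>
#include <stdlib.h>

#define TIMEOUT_TICKS 100000           /* stand-in for the 10 s budget */

/* Hypothetical segment representation and I/O helpers. */
struct tcp_seg {
    uint32_t seq, ack;
    uint16_t mss;                      /* 0 if no MSS option present */
    uint8_t  syn, ack_flag, rst, fin;
};
int  read_segment(struct tcp_seg *s);  /* receive with a short timeout; 0 on success */
void send_segment(const struct tcp_seg *s);  /* caller swaps ports/IPs for the reply */

enum { LISTEN, SYN_RCVD, ESTABLISHED, CLOSED };
static int state = LISTEN;

static uint32_t rxSEQ, mySEQ;
static uint16_t rxMSS = 1024;          /* default if the peer sends no MSS option */

void tcp_listen(void)
{
    struct tcp_seg in, out = {0};

    /* LISTEN: wait for a SYN, ignore everything else */
    while (state == LISTEN) {
        if (read_segment(&in) == 0 && in.syn) {
            rxSEQ = in.seq;
            if (in.mss) rxMSS = in.mss;
            mySEQ = (uint32_t)rand();  /* random ISN; a real stack uses a better source */

            out.syn = 1; out.ack_flag = 1;
            out.seq = mySEQ;
            out.ack = rxSEQ + 1;       /* the SYN counts as one byte */
            out.mss = 800;             /* what we are willing to receive */
            send_segment(&out);
            state = SYN_RCVD;
        }
    }

    /* SYN_RCVD: wait (with a time budget) for the ACK of our SYN */
    for (int ticks = 0; state == SYN_RCVD; ticks++) {
        if (ticks > TIMEOUT_TICKS) { state = LISTEN; return; }   /* timed out */
        if (read_segment(&in) != 0) continue;
        if (in.rst) { state = LISTEN; return; }   /* a real stack might send a RST/FIN here */
        if (in.ack_flag && in.ack == mySEQ + 1)
            state = ESTABLISHED;       /* connection open; data transfer handled elsewhere */
    }
}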
Basics of communications
2.6.6 Transport Layer
One of the major tasks of the transport layer is congestion control. In the Transmission Control Protocol (TCP), which is widely used on the Internet, a congestion window is used to control the data traffic volume. The sender can send out all the packets within the congestion window. At the beginning of the connection, the window size is set to the maximum segment size allowed in the connection. Then, upon receiving the ACKs from the receiver, which indicate that the corresponding packets have been correctly received, the transmitter doubles its congestion window once per round-trip time. When the congestion window is larger than a certain threshold, the increase of the window size becomes linear. If an ACK has not been received before the expiration of the timer, the transmitter assumes that congestion has occurred in the network (thus causing the loss of a packet), decreases the congestion window to its minimum size, and reduces the threshold to half. This procedure is illustrated in Fig. 2.11.
Fig. 2.11. Illustration of congestion control.