Understanding Linux Network Internals
Post on 27-Dec-2016
Other Linux resources from O'Reilly
Related titles
Linux in a Nutshell
Linux Network Administrator's Guide
Linux Device Drivers
Understanding the Linux Kernel
Building Secure Servers with Linux
LPI Linux Certification in a Nutshell
Learning Red Hat Linux
Linux Server Hacks™
Linux Security Cookbook
Managing RAID on Linux
Linux Web Server CD Bookshelf
Building Embedded Linux Systems
Linux Books Resource Center
linux.oreilly.com is a complete catalog of O'Reilly's books on Linux and Unix and related technologies, including sample chapters and code examples.
ONLamp.com is the premier site for the open source web platform: Linux, Apache, MySQL, and either Perl, Python, or PHP.
Conferences
O'Reilly brings diverse innovators together to nurture the ideas that spark revolutionary industries. We specialize in documenting the latest tools and systems, translating the innovator's knowledge into useful skills for those in the trenches. Visit conferences.oreilly.com for our upcoming events.
Safari Bookshelf (safari.oreilly.com) is the premier online reference library for programmers and IT professionals. Conduct searches across more than 1,000 books. Subscribers can zero in on answers to time-critical questions in a matter of seconds. Read the books on your Bookshelf from cover to cover or simply flip to the page you need. Try it today with a free trial.
Beijing • Cambridge • Farnham • Köln • Paris • Sebastopol • Taipei • Tokyo
Understanding Linux Network Internals
by Christian Benvenuti
Copyright © 2006 O'Reilly Media, Inc. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (safari.oreilly.com). For more information, contact our corporate/institutional sales department: (800) 998-9938 or corporate@oreilly.com.
Editor: Andy Oram
Production Editor: Philip Dangler
Cover Designer: Karen Montgomery
Interior Designer: David Futato
December 2005: First Edition.
Nutshell Handbook, the Nutshell Handbook logo, and the O'Reilly logo are registered trademarks of O'Reilly Media, Inc. The Linux series designations, Understanding Linux Network Internals, images of the American West, and related trade dress are trademarks of O'Reilly Media, Inc.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O'Reilly Media, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps.
While every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.
ISBN: 978-0-596-00255-8 [5/08]
Table of Contents
Part I. General Background
1. Introduction
    Basic Terminology
    Common Coding Patterns
    User-Space Tools
    Browsing the Source Code
    When a Feature Is Offered as a Patch
2. Critical Data Structures
    The Socket Buffer: sk_buff Structure
    net_device Structure
    Files Mentioned in This Chapter
3. User-Space-to-Kernel Interface
    Overview
    procfs Versus sysctl
    ioctl
    Netlink
    Serializing Configuration Changes
Part II. System Initialization
4. Notification Chains
    Reasons for Notification Chains
    Overview
    Defining a Chain
    Registering with a Chain
    Notifying Events on a Chain
    Notification Chains for the Networking Subsystems
    Tuning via /proc Filesystem
    Functions and Variables Featured in This Chapter
    Files and Directories Featured in This Chapter
5. Network Device Initialization
    System Initialization Overview
    Device Registration and Initialization
    Basic Goals of NIC Initialization
    Interaction Between Devices and Kernel
    Initialization Options
    Module Options
    Initializing the Device Handling Layer: net_dev_init
    User-Space Helpers
    Virtual Devices
    Tuning via /proc Filesystem
    Functions and Variables Featured in This Chapter
    Files and Directories Featured in This Chapter
6. The PCI Layer and Network Interface Cards
    Data Structures Featured in This Chapter
    Registering a PCI NIC Device Driver
    Power Management and Wake-on-LAN
    Example of PCI NIC Driver Registration
    The Big Picture
    Tuning via /proc Filesystem
    Functions and Variables Featured in This Chapter
    Files and Directories Featured in This Chapter
7. Kernel Infrastructure for Component Initialization
    Boot-Time Kernel Options
    Module Initialization Code
    Optimized Macro-Based Tagging
    Boot-Time Initialization Routines
    Memory Optimizations
    Tuning via /proc Filesystem
    Functions and Variables Featured in This Chapter
    Files and Directories Featured in This Chapter
8. Device Registration and Initialization
    When a Device Is Registered
    When a Device Is Unregistered
    Allocating net_device Structures
    Skeleton of NIC Registration and Unregistration
    Device Initialization
    Organization of net_device Structures
    Device State
    Registering and Unregistering Devices
    Device Registration
    Device Unregistration
    Enabling and Disabling a Network Device
    Updating the Device Queuing Discipline State
    Configuring Device-Related Information from User Space
    Virtual Devices
    Locking
    Tuning via /proc Filesystem
    Functions and Variables Featured in This Chapter
    Files and Directories Featured in This Chapter
Part III. Transmission and Reception
9. Interrupts and Network Drivers
    Decisions and Traffic Direction
    Notifying Drivers When Frames Are Received
    Interrupt Handlers
    softnet_data Structure
10. Frame Reception
    Interactions with Other Features
    Enabling and Disabling a Device
    Queues
    Notifying the Kernel of Frame Reception: NAPI and netif_rx
    Old Interface Between Device Drivers and Kernel: First Part of netif_rx
    Congestion Management
    Processing the NET_RX_SOFTIRQ: net_rx_action
11. Frame Transmission
    Enabling and Disabling Transmissions
12. General and Reference Material About Interrupts
    Statistics
    Tuning via /proc and sysfs Filesystems
    Functions and Variables Featured in This Part of the Book
    Files and Directories Featured in This Part of the Book
13. Protocol Handlers
    Overview of Network Stack
    Executing the Right Protocol Handler
    Protocol Handler Organization
    Protocol Handler Registration
    Ethernet Versus IEEE 802.3 Frames
    Tuning via /proc Filesystem
    Functions and Variables Featured in This Chapter
    Files and Directories Featured in This Chapter
Part IV. Bridging
14. Bridging: Concepts
    Repeaters, Bridges, and Routers
    Bridges Versus Switches
    Hosts
    Merging LANs with Bridges
    Bridging Different LAN Technologies
    Address Learning
    Multiple Bridges
15. Bridging: The Spanning Tree Protocol
    Basic Terminology
    Example of Hierarchical Switched L2 Topology
    Basic Elements of the Spanning Tree Protocol
    Bridge and Port IDs
    Bridge Protocol Data Units (BPDUs)
    Defining the Active Topology
    Timers
    Topology Changes
    BPDU Encapsulation
    Transmitting Configuration BPDUs
    Processing Ingress Frames
    Convergence Time
    Overview of Newer Spanning Tree Protocols
16. Bridging: Linux Implementation
    Bridge Device Abstraction
    Important Data Structures
    Initialization of Bridging Code
    Creating Bridge Devices and Bridge Ports
    Creating a New Bridge Device
    Bridge Device Setup Routine
    Deleting a Bridge
    Adding Ports to a Bridge
    Enabling and Di