Winsock Networking Tutorial (C++)

Networking introduction

First I will give you an introduction to basic networking principles and terms. Anyone with internet

access will have some knowledge about networks, servers, clients, but to ensure you know enough

to program with it I've included this chapter. You won't need all the details mentioned here when

programming winsock, but it's good to know something about the underlying techniques.

1. Networks and protocols

You probably already know what a network is: a collection of computers connected to each

other so they can exchange data. There are several types of networks, such as LANs (Local Area

Network), WANs (Wide Area Network) and of course the internet. To ensure that all traffic is going

smoothly, networks rely on protocols:

Protocol

A protocol is a set of rules describing the format in which data is transmitted over a network.

As stated in the information box above, a protocol describes how to communicate over a network.

It can be compared with a human language: at the lowest level nearly everyone can make and

hear sounds (compare: electronic signals) but people won't understand each other unless they

speak according to a specific language they both understand (compare: protocol).

2. Ethernet

Networks rely on several protocol layers, each one having its own task in the communication

process. A very commonly used configuration is the ethernet LAN with TCP/IP. In ethernet LANs,

computers can be connected using coaxial, twisted pair (UTP) or optic fiber cables. Nowadays, for

most networks, UTP cables are used. WANs and the internet (partly a combination of many WANs)

use many of the techniques used in ethernet LANs, so I will discuss ethernet LAN technology first.

MAC

The lowest layer of ethernet is the hardware level, called the Media Access Control layer, or MAC for

short. This layer can be a network card, for example, which contains the serial network interface

and controller that take care of converting the raw data into electronic signals and sending it to

the right place.

Packets that are sent over a network of course need to reach their destination. So there has to

be some kind of addressing. Various levels of the ethernet interface have different addressing

methods, as you will see later. At the lowest MAC level, addressing is done with MAC numbers.

MAC number

48-bit identifier that is hardcoded into each network interface unit. The allocation of these numbers is done by the IEEE Registration Authority so each ethernet chip has a worldwide unique number (that is, if the manufacturer didn't mess up :). MAC numbers are often noted as colon-separated hex numbers: 14:74:A0:17:95:D7.

To send a packet to another network interface, the packet needs to include its MAC number. LANs

use a very simple method to send the packets to the right interface: broadcasting. This means

that your network card just shouts the packet to every other interface it can reach. Each

Winsock Networking Tutorial (C++) 2010 by Thomas Bleeker (MadWizard)

receiving interface looks at the destination MAC number of the packet, and only buffers it if it

matches its own MAC number. While this method is easy to implement and quite effective on LANs,

bigger networks (WANs, internet) don't use this method for obvious reasons; you wouldn't want

everyone on the internet to send packets to everyone else on the internet. WANs use better

routing mechanisms, which I won't discuss here. Just remember that at the lowest level, addressing

is done with MAC numbers. Ethernet packets also include a CRC for error detection.
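The receive-side filtering described above can be sketched in plain C++. This is only an illustration, not how a network card is actually programmed: the names parseMac, acceptsFrame and kBroadcastMac are made up for this example, and the all-ones broadcast address (FF:FF:FF:FF:FF:FF) is a detail added here that the text above does not mention.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Broadcast address: all 48 bits set (FF:FF:FF:FF:FF:FF). Added for illustration.
const std::uint64_t kBroadcastMac = 0xFFFFFFFFFFFFULL;

// Parse "14:74:A0:17:95:D7" into a 48-bit value held in a 64-bit integer.
// Returns false if the text is not exactly six colon-separated hex bytes.
bool parseMac(const std::string& text, std::uint64_t& mac)
{
    unsigned int b[6];
    char extra; // only filled if there is trailing garbage
    if (std::sscanf(text.c_str(), "%2x:%2x:%2x:%2x:%2x:%2x%c",
                    &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &extra) != 6)
        return false;

    std::uint64_t value = 0;
    for (int i = 0; i < 6; ++i)
        value = (value << 8) | b[i];
    mac = value;
    return true;
}

// An interface keeps a frame only if it is addressed to it (or to everyone).
bool acceptsFrame(std::uint64_t ownMac, std::uint64_t destMac)
{
    return destMac == ownMac || destMac == kBroadcastMac;
}
```

Every interface on the LAN would run the equivalent of acceptsFrame on each packet it hears, which is why broadcasting works without any routing logic.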

IP

Just above the hardware level is the IP level. IP simply stands for Internet Protocol. Just like the

MAC layer, IP too has its own way of addressing:

IP number

The numbers used to address at the IP level of the network interface. IPv4, the version most widely used, uses 32-bit values, noted in the well-known dotted format: 209.217.52.4. Unlike MAC numbers, IP numbers are not hardcoded into the hardware; they are assigned to it at software level.
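To make the "32-bit value in dotted format" idea concrete, here is a small sketch that converts between the two forms. The helper names parseIPv4 and formatIPv4 are hypothetical and not part of winsock; the first part of the address is placed in the most significant byte.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Pack "a.b.c.d" into one 32-bit value, first part in the most significant byte.
// Returns false if the text is not four dot-separated numbers in the range 0..255.
bool parseIPv4(const std::string& text, std::uint32_t& ip)
{
    unsigned int p[4];
    char extra; // only filled if there is trailing garbage
    if (std::sscanf(text.c_str(), "%3u.%3u.%3u.%3u%c",
                    &p[0], &p[1], &p[2], &p[3], &extra) != 4)
        return false;

    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        if (p[i] > 255)
            return false;
        value = (value << 8) | p[i];
    }
    ip = value;
    return true;
}

// Turn the 32-bit value back into dotted notation.
std::string formatIPv4(std::uint32_t ip)
{
    char buf[16]; // "255.255.255.255" plus terminating zero
    std::snprintf(buf, sizeof(buf), "%u.%u.%u.%u",
                  static_cast<unsigned>((ip >> 24) & 0xFF),
                  static_cast<unsigned>((ip >> 16) & 0xFF),
                  static_cast<unsigned>((ip >> 8) & 0xFF),
                  static_cast<unsigned>(ip & 0xFF));
    return buf;
}
```

With these helpers, 209.217.52.4 becomes the single value 0xD1D93404.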

IP numbers shouldn't be something strange to you. The internet uses them to uniquely identify a

specific computer. IP addresses can be assigned to a network interface using software. Doing this

associates the IP number with the MAC address of the network interface. To address using IP

numbers, the associated MAC number needs to be resolved. This is done with the ARP (Address

Resolution Protocol). Each host maintains a list with pairs of IP and MAC numbers. If an IP is used

without a matching MAC number, the host sends out a query packet to the rest of the LAN. If one

of the other computers in the LAN recognizes its own IP number, it sends back the corresponding MAC

number. If the destination IP lies outside the local network, the packet is instead sent to the gateway,

a device that forwards packets to external networks. The IP to MAC conversion is actually done at the

data link layer (MAC layer).

The IP protocol adds the source and destination address (IP numbers) to the packet, as well as

some other packet properties such as the TTL (time to live, counted in hops), the protocol version

used, header checksum, sequence count and some more fields. They are not important to us so I

won't explain them in detail.

TCP

The next layer is the TCP layer (or alternatively, the UDP layer). This layer is very close to the

network application and deals with many things. As final addition to the addressing, TCP adds a

port number to the packet:

Port number

While IP numbers are used to address a specific computer or network device, port numbers are used to identify which process running on that device should receive the packet. Port numbers are 16-bit, and thus limited to 65536 numbers. A process can register to receive packets sent to a specific port number ('listening'). A notation often used when addressing a port number on a device is 'IP:portnumber', e.g. 209.217.52.4:80. Both sides of a connection use a port number, but not necessarily the same.

Many port numbers are WKP (Well Known Ports), that is they are commonly associated with a

specific service. For example, the WWW uses port 80 by default, FTP uses port 21, e-mail uses 25

(SMTP) and 110 (POP). Although these are the ports usually used for those services, nobody

prevents you from using different ports. However, it's good practice to use port numbers of 1024 or

higher for other, custom services.
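As a small illustration of the 'IP:portnumber' notation and the port ranges, the sketch below splits such a string and checks whether the port falls in the well-known range. Endpoint, parseEndpoint and isWellKnownPort are names invented for this example, and treating ports 0 to 1023 as 'well known' is the usual convention rather than something winsock enforces.

```cpp
#include <cstdint>
#include <string>

// Hypothetical holder for the two halves of "IP:portnumber".
struct Endpoint {
    std::string host;
    std::uint16_t port;
};

// Split "209.217.52.4:80" into host and port.
// Returns false if the colon or the port is missing, or the port exceeds 16 bits.
bool parseEndpoint(const std::string& text, Endpoint& out)
{
    const std::string::size_type colon = text.rfind(':');
    if (colon == std::string::npos || colon == 0 || colon + 1 == text.size())
        return false;

    const std::string portText = text.substr(colon + 1);
    if (portText.size() > 5)
        return false; // more digits than any 16-bit number has

    unsigned long value = 0;
    for (char c : portText) {
        if (c < '0' || c > '9')
            return false;
        value = value * 10 + static_cast<unsigned long>(c - '0');
    }
    if (value > 65535)
        return false;

    out.host = text.substr(0, colon);
    out.port = static_cast<std::uint16_t>(value);
    return true;
}

// Ports below 1024 are conventionally reserved for well-known services.
bool isWellKnownPort(std::uint16_t port)
{
    return port < 1024;
}
```

A custom service would pick a port for which isWellKnownPort returns false, so it does not collide with services like HTTP or FTP.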

While the IP layer doesn't care about the success of transmissions, TCP does. The TCP layer

ensures data does arrive, and correctly. It also lets the receiver control the data flow, i.e. the

receiver can decide when to receive data. If a packet is lost on the way to its destination,

TCP resends the packet. TCP also reorders the packets if they arrive in an order different from

the original order. This makes the programmer's life easy, as they can safely assume the data that is

sent is received and in the right order. UDP, an alternative to TCP, does not have these features

and cannot guarantee the arrival of packets. TCP is connection-oriented, and the best choice for

continuous data streams. UDP on the other hand is connectionless, and packet oriented. I won't

deal with UDP in this tutorial.
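The reordering that TCP does for you can be pictured with a toy model. This is not how a real TCP stack is written; it only shows the idea that data is handed to the application strictly in stream order, however the pieces arrive. The class name Reassembler and its members are hypothetical, and the sketch assumes pieces never partially overlap.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Toy model of in-order delivery. Each piece carries the offset of its first
// byte in the stream (comparable to a sequence number).
class Reassembler {
public:
    // Accept a piece that starts at byte position `offset` of the stream.
    void receive(std::size_t offset, const std::string& data)
    {
        // A piece that lies entirely in already delivered data is a duplicate.
        if (offset + data.size() <= delivered_.size())
            return;

        pending_[offset] = data;

        // Deliver every waiting piece that continues the stream without a gap.
        auto it = pending_.find(delivered_.size());
        while (it != pending_.end()) {
            delivered_ += it->second;
            pending_.erase(it);
            it = pending_.find(delivered_.size());
        }
    }

    // Data the application may read, always in the original order.
    const std::string& delivered() const { return delivered_; }

    // True while pieces are waiting for an earlier, still missing piece.
    bool hasGap() const { return !pending_.empty(); }

private:
    std::string delivered_;
    std::map<std::size_t, std::string> pending_;
};
```

If the piece at offset 5 arrives before the piece at offset 0, nothing is delivered until the missing start shows up; after that both are released at once, in order.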

Software

Finally, above the TCP layer is the network software. In Windows, your application does not

directly access the TCP layer but uses the WinSock API. The software layer provides a very

convenient way of dealing with networking. Thanks to all the underlying layers, you don't need to

worry about packets, packet size, data corruption, resending of lost packets etc.

3. The ethernet interface stack

The image above shows the encapsulation of each protocol in the ethernet interface stack. It

all starts with the software layer, which has a piece of data that it wants to send over the

network. Even this data usually has a format (eg. HTTP, FTP protocols), although not shown in the

image. The user data first gets a TCP header including the source and destination port number.

Then the IP header is added, containing the source and destination IP address. Finally the data link

layer adds the ethernet header, which specifies the MAC numbers of the source and destination.

This is the data that is actually sent over the wires. As you can see there's a lot of overhead in a

TCP/IP packet. The overhead can be minimized by choosing a large enough data size for the

packet. Luckily winsock will arrange this for you.
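To get a feeling for the overhead, here is a rough calculation. The header sizes are assumptions added for this example: the minimum sizes without any IP or TCP options, and without counting the ethernet preamble or trailing CRC. The function names are made up as well.

```cpp
#include <cstddef>

// Minimum header sizes in bytes (assumed: no IP options, no TCP options).
const std::size_t kEthernetHeader = 14; // destination MAC, source MAC, type
const std::size_t kIpHeader       = 20;
const std::size_t kTcpHeader      = 20;

// Total bytes on the wire for one packet carrying `payloadBytes` of user data.
std::size_t frameSize(std::size_t payloadBytes)
{
    return kEthernetHeader + kIpHeader + kTcpHeader + payloadBytes;
}

// Share of the packet that is headers rather than user data (0.0 to 1.0).
double overheadFraction(std::size_t payloadBytes)
{
    const std::size_t headers = kEthernetHeader + kIpHeader + kTcpHeader;
    return static_cast<double>(headers) /
           static_cast<double>(headers + payloadBytes);
}
```

Under these assumptions, sending a single byte costs 55 bytes on the wire, almost all of it headers, while a packet with 1460 bytes of data spends well under 4% on headers.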



Networking continued

Now that you know the basic layers of the network interface, I will continue with some other

principles concerning hostnames, connections and software level protocols.

1. DNS

DNS stands for Domain Name System, which accounts for the conversion of hostnames to and

from IP numbers. Because IP numbers are not easy to remember (well not many at

least), another more convenient naming system was created. Now, instead of an IP number, you

can use a hostname. Examples of hostnames are: madwizard.org,

somepc.someuniversity.edu, www.google.com, etc. Anyone browsing the internet has used them.

When connecting to a website, its IP is needed. So if you enter a hostname like

www.google.com, your PC first needs to look up the corresponding IP number of google. This is where

DNS comes in. Your PC sends out a hostname lookup request to the DNS server your provider has set up

in its network. If the DNS server can resolve the hostname, it sends back the corresponding IP to you.

DNS servers are organized in a hierarchical way, forwarding unresolvable hostnames to a DNS server at a

higher level, until the hostname is resolved.

2. Connections

TCP/IP is a connection-oriented protocol. The connection is always between two devices, and

each side uses its own IP and port number. Usually, one side is called the client, the other side

the server.

The client is the one that requests something, the server responds accordingly. For example,

when opening a website, the browser is the client, the webserver is the server. The browser

initiates the connection with the server and requests a specific resource. The server then sends

back a response and the data requested.

The server is continually waiting for incoming connections. This is called listening, which is

always done on a certain IP and port number. The client is only active when necessary, as the

client is always the initiator of a connection and the one that requests information. To create a

connection, the client needs to know both the IP and port number the server is listening on. A

connection is made to that server and hopefully accepted by the server. While communication

over a TCP/IP connection is two-way, many protocols (HTTP, FTP, etc) let the client and server

interact in turn.



Both the server and client side use an IP and port number, but the IP and port number of the

server are usually fixed. The standard port for the WWW is 80 (using HTTP). Google for example,

is a webserver that runs on port 80 and IP 216.239.39.101 (at the moment of writing). Each

client (read: anyone google-ing :) connects to this IP and port. So the webserver can have many

connections on the same port. This is no problem, since all traffic on that port is for the same

process. On the client side, the port number doesn't matter. Any port can be used. Some people

think that the port number used in a connection needs to be the same on both sides. This is not

true. Just open a website and quickly run 'netstat -an' in a command line. You might see a line

like this:

TCP xxx.xxx.xxx.xxx:2894 216.239.39.101:80 ESTABLISHED

xxx.xxx.xxx.xxx was my IP, 216.239.39.101 is google's IP. The number after the colon is the port

number. As you can see, the server side uses port 80, while the client uses a random (read: some

free) port number like 2894. Each client connection needs a different port number on the client

side, since every connection is associated with a different client.
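One way to picture why a server can hold many connections on the same port is to identify each connection by four values: the IP and port on both sides. The sketch below is only a model of that idea with made-up names (Connection, ConnectionTable) and made-up client addresses; it does not use winsock.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <tuple>

// A connection identified by both endpoints (hypothetical model).
struct Connection {
    std::string   localIp;
    std::uint16_t localPort;
    std::string   remoteIp;
    std::uint16_t remotePort;
};

// Order connections by all four values so they can be stored in a std::set.
bool operator<(const Connection& a, const Connection& b)
{
    return std::tie(a.localIp, a.localPort, a.remoteIp, a.remotePort)
         < std::tie(b.localIp, b.localPort, b.remoteIp, b.remotePort);
}

// Connection list as seen from the server side.
class ConnectionTable {
public:
    // Returns false if exactly this combination is already in use.
    bool add(const Connection& c) { return table_.insert(c).second; }
    std::size_t size() const { return table_.size(); }

private:
    std::set<Connection> table_;
};
```

All entries can share the server's IP and port 80; they stay distinct as long as the client IP or the client port differs.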

Client

The program that initiates the connection, and requests information.

Server

The program that listens for incoming connections, accepts them and responds according to the received requests. The IP and port number of the server need to be known by the client to connect to it.

3. Protocols again

In the previous chapters I have shown several protocols at the different levels of a network

interface. The protocols I didn't discuss yet are the protocols that work at software level.

Examples of these are HTTP, FTP, POP3, SMTP. Most of them work in a client-server way, i.e. the

client makes requests, the server responds. The exact format of the requests and responses is

described in these protocols. I won't discuss them further right now, but I will later when you

know the winsock basics to actually implement them.

Sockets and winsock

Winsock ('Windows Sockets') is the Windows API that deals with networking. Many functions are

implemented in the same way as the Berkeley socket functions used in BSD Unix.

1. Sockets

So what's a socket?

Socket

As explained in the previous chapter, you will work with two-way connections. The endpoints of this connection are the sockets. Both the client and the server have a socket. A socket is associated with a certain IP and port number.

Almost all winsock functions operate on a socket, as it's your handle to the connection. Both

sides of the connection use a socket, and they are not platform-specific (i.e. a Windows and a Unix

machine can talk to each other using sockets). Sockets are also two-way: data can be both sent

and received on a socket.

There are two common types for a socket, one is a streaming socket (SOCK_STREAM), the other

is a datagram socket (SOCK_DGRAM). The streaming variant is designed for applications that

need a reliable connection, often using continuous streams of data. The protocol used for this

type of socket is TCP. I will only use this type in my tutorial as it's most commonly used for the

well known protocols like HTTP, FTP, SMTP, POP3 etc.

Datagram sockets use UDP as underlying protocol, are connectionless, and have a maximum

buffer size. They are intended for applications that send data in small packages and that do not

require perfect reliability. Unlike streaming sockets, datagram sockets do not guarantee data will

reach its destination nor that it comes in the right order. Datagram sockets can be slightly faster

and useful for applications like streaming audio or video, where reliability is not as high on the

priority list as speed and latency. Where reliability is required, streaming sockets are used.

2. Binding sockets

Binding a socket means associating a specific address (IP & port number) with a given socket.

This can be done manually using the bind function, but in some cases winsock will automatically

bind the socket. This will become clear in the next paragraphs.

3. Connecting

The way you use a socket depends on whether you are on the client side or the server side. The

client side initiates a connection by creating a socket, and calling the connect function with the

specified address information. Before the socket is connected, it is not bound yet to an IP or port

number. Because the client side can use any IP and port number for the connection with the

server (provided that the network the IP number is part of can reach the network of the destination

IP), often many useable combinations are possible.

When connect is called, winsock will choose the IP and port number to use for the connection

and bind the socket to it before actually connecting it. The port number can be anything that is

free at the moment, the IP number needs a bit more care. PCs may have more than one IP. For

example, a PC connected to both the internet and a local network has at least three IPs (the

external IP for use with the internet, the local network IP (192.168.x.x, 10.0.x.x etc.) and the

loop back address (127.0.0.1)). Here, it does matter to which IP the socket is bound as it also

determines the network you are using for the connection. If you want to connect to the local PC

192.168.0.4, you cannot do that using the network of your internet provider, as that IP is never

used on the internet and will not be found. So you would have to bind the socket to your IP in the

same network (192.168.0.1 for example). Similarly, when you bind the socket to the local loop

back address (127.0.0.1), you can only connect to that same address, as no other address exists

in that 'network'.

Fortunately, winsock will choose a local IP it can use for the IP you want to connect to

automatically. Nothing stops you from binding the socket yourself, but remember that you need

to take the situations above into consideration.

Note that the bind function gives the user the option to set the IP or port number to zero. In this

case, zero means 'let winsock choose something for me'. This is useful when you do want to

connect using a specific IP on the client side, but do not care about the port number used.

4. Listening

Things are different on the server side. A server has to wait for incoming connections and clients

will need to know both the IP and port number of the server to be able to connect to it. To make

things easy, servers almost always use a fixed port number (often the default port number for the

used protocol).

Waiting for incoming connections on a specified address is called listening:

ListeningA socket is listening when it is in a state where it will 'listen' for incomingconnections. Usually, this is done on a socket bound to a specific addressknown to the client.

As you can see from the definition above, a socket is often bound to an address before it is put

in the listening state. When the port number of this address is set to a fixed number, the server

will listen for incoming connections on that port number specifically. For example, port 80 (the

default for HTTP) is listened on by most web servers. The socket can be bound to a specific IP as

well, but when zero is chosen it will listen on all available addresses, effectively allowing

connections from all networks. It may be set to a fixed IP, for example the IP of the local network

interface, so computers from the local network can connect to the server but not the ones

connected via the internet.

When a client requests a connection to a listening server, the server will accept it (or not) and

spawn another socket which will be the endpoint of the connection. This way the listening socket

is not used for any data transfer on the connection and can continue listening for more

incoming connections.

5. Connections: an example

Here's a graphical example of a webserver that can handle multiple connections.

1. The server socket is created

The server creates a new socket. When it's just created it is not yet bound to an IP or port

number.

2. The server socket is bound



Because the server is a webserver, it will be bound to port number 80, the default for HTTP.

However, the IP number is set to zero, indicating the server is willing to receive incoming

connections from all IPs available for the machine it runs on. In this example, we assume the

server has three IPs, one external (216.239.39.101), one internal (192.168.0.8) and of course the

loop back address (127.0.0.1).

3. The server is listening

After the socket is bound, it is put into the listening state, waiting for incoming connections on

port 80.

4. A client creates a socket

Assume a client in the same local network as the server (192.168.x.x) wants to request a

webpage from the server. To do the data transfer it needs a socket so it creates one.

5. The client socket tries to connect

The client socket is left unbound and tries to connect to the webserver.

6. The server accepts the request

The listening socket sees some client wants to make a connection. It accepts it by creating a

new socket (on the bottom right) bound to the one of its own IPs that can be reached by

the client (i.e. they are in the same network, being 192.168.x.x) and the server port (80). From

this point, the client socket and the server connection socket just created will do the data

transfers, while the listening socket will keep listening for other connections. Note that the client

socket is now bound to an IP and port since it's connected. The dotted gray line shows the

separation of the client and server side.

7. Another client connects

If another client (from the external network) connects, the server will again create a new socket

to deal with the second connection. Note that the IP the socket on the server side is bound to is

different than the one from the first connection. This is possible because the listening server

socket was not bound to any IP. If it had been bound to 192.168.0.8, the second connection

would not be possible.

6. Blocking

The original functions in the Berkeley unix implementation of sockets were blocking functions. This

means that they will just wait when the operation requested cannot be completed immediately.

For example, when connecting to a server using the connect function, it did not return until the

connection had been made (or failed), thus making the program hang for a while. This is not really

a problem when dealing with a single connection using a console mode application but in the

Windows environment, this behavior is rarely acceptable. Any program with a window has a

window procedure that has to be kept running. Stalling it would delay user input, window

painting, notifications, and any other messages resulting in an application that seems to be

hanging while it's using socket functions.

To deal with this problem, winsock can set sockets into blocking or non-blocking mode. The

former (blocking mode) is the original way of using sockets, ie. not returning from the API before

the operation has finished (it will literally block the application). The latter (non-blocking mode) is

the mode you usually use when dealing with a real windows application (ie. not a console

application). When calling a function on a socket that is in non-blocking mode, the function will

always return as soon as possible, even when the operation to be performed could not be

completed immediately. Instead, a notification of some sort will be sent to the program when the

operation is finished, allowing the program to execute in the normal manner while the operation is

unfinished.

Winsock provides several methods of notification for non-blocking sockets, including window

messages and event objects. These methods will be discussed in detail later, for now just

remember the difference between blocking and non-blocking.

7. Winsock versions

The most commonly used winsock version is version 2.x, usually just called winsock 2 as there are

only minor differences. The latest version before version 2 was version 1.1. Some people say you

should use this version for compatibility reasons, as Windows 95 and NT 3 only ship version 1.1.

However, all later windows versions (98, ME, NT4, 2000 and XP) have version 2 by default and for

Windows 95 an update is available. So I recommend you just start with winsock 2, it adds a lot of

nice features and windows machines without winsock 2 are getting rare.



The two major versions of winsock reside in two different DLLs, wsock32.dll and ws2_32.dll,

being version 1.1 and version 2.x respectively. The libraries to use

are wsock32.lib and ws2_32.lib. The MASM32 package has most winsock constants in its

windows.inc. For C++ programs, including windows.h suffices: it will include the winsock 2

definitions if the _WIN32_WINNT constant is at least 0x400 (NT version 4). The winsock 2 API

includes the full 1.1 API (with some minor changes); wsock32.dll is even just a wrapper for the

actual winsock ws2_32.dll.

This tutorial will assume you are using winsock 2.

8. Winsock architecture

Winsock provides two interfaces, the Application Programming Interface (API) and the Service

Provider Interface (SPI). This tutorial is about the API, which contains all the functions you need to

communicate using the well-known protocols. The SPI is an interface to add Data Transport

Providers (like TCP/IP or IPX/SPX) or Name Space Service Providers (like DNS). These extensions

are transparent to the user of the API.

Basic winsock functions

In this chapter of the winsock tutorial, I will show you the basic winsock functions that operate

on sockets. It is important to remember that this chapter is only an introduction to the socket

functions, so you will be able to follow the next tutorials. Do not start coding immediately after

you've read this chapter, the next chapters are just as important.

The basic functionality of each function is relatively simple, but things like the blocking mode

make it more complicated than it looks at first sight. The next chapters will cover the details, but

first you need to be familiar with the functions.

This chapter is quite long and you might not remember everything but that's okay. Just read it

carefully so you know what I'm talking about in the next chapters, you can always look back here

and use it as a quick reference.

1. WSAStartup & WSACleanup

int WSAStartup(WORD wVersionRequested, LPWSADATA lpWSAData);

int WSACleanup();

Before calling any winsock function, you need to initialize the winsock library. This is done with

WSAStartup. It takes two parameters:

wVersionRequested

Highest version of Windows Sockets support that the caller can use. The high-order byte specifies the minor version (revision) number; the low-order byte specifies the major version number.

lpWSAData

Pointer to the WSADATA data structure that is to receive details of the Windows Sockets implementation.



As explained in the introduction, I will use winsock 2. This means you need to set the low byte of

wVersionRequested to 2, the high byte can be zero (the revision number is not important). The

WSADATA structure specified with the lpWSAData parameter will receive some information about

the winsock version installed.

The function returns zero if it succeeded; otherwise the return value itself is the error code.

Other winsock functions report errors through WSAGetLastError, the winsock equivalent of the

Win32 API's GetLastError, which retrieves the code of the last error that occurred.

It is important to note that you might not get the version you requested in

the wVersionRequested parameter. This parameter specifies the highest winsock version your

application *supports*, not 'requires'. Winsock will try hard to give you the version you requested

but if that is not possible, it uses a lower version. This version is available after the call, in

the wVersion member of the WSADATA structure. You should check this version after the call to

see if you really got the winsock version you wanted. There is also a member

called wHighVersion that gives the highest winsock version supported by the system. In short:

wVersionRequested parameter: The highest winsock version your application supports.

wHighVersion in WSADATA: The highest winsock version the system supports.

wVersion in WSADATA: min(wVersionRequested, wHighVersion).
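The version packing and the min() rule from the summary above can be tried out without winsock. In this sketch makeWord, lowByte and highByte are stand-ins I wrote to mimic the MAKEWORD, LOBYTE and HIBYTE macros, and negotiatedVersion only models the rule as summarized above; it is not what winsock runs internally.

```cpp
#include <cstdint>

// Stand-ins for the MAKEWORD, LOBYTE and HIBYTE macros.
inline std::uint16_t makeWord(std::uint8_t low, std::uint8_t high)
{
    return static_cast<std::uint16_t>(low | (high << 8));
}
inline std::uint8_t lowByte(std::uint16_t w)  { return static_cast<std::uint8_t>(w & 0xFF); }
inline std::uint8_t highByte(std::uint16_t w) { return static_cast<std::uint8_t>(w >> 8); }

// The major version sits in the LOW byte, so raw words cannot be compared
// directly. Swap the bytes to get a value that orders by major, then minor.
inline std::uint16_t sortable(std::uint16_t version)
{
    return static_cast<std::uint16_t>((lowByte(version) << 8) | highByte(version));
}

// Model of: wVersion = min(wVersionRequested, wHighVersion).
std::uint16_t negotiatedVersion(std::uint16_t requested, std::uint16_t systemHighest)
{
    return sortable(requested) <= sortable(systemHighest) ? requested : systemHighest;
}
```

Requesting version 2.0 is written as makeWord(2, 0). On a system whose highest version is 2.2 the model yields 2.0, while a system that only has 1.1 yields 1.1, which is why the version has to be checked after the call.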

Each call to WSAStartup has to match a call to WSACleanup, which cleans up the winsock library.

Although useless, WSAStartup may be called more than once, as long as WSACleanup is called

the same number of times.

An example of initializing and cleaning up winsock:

const int iReqWinsockVer = 2;   // Minimum winsock version required

WSADATA wsaData;

if (WSAStartup(MAKEWORD(iReqWinsockVer,0), &wsaData)==0)
{
    // Check if major version is at least iReqWinsockVer
    if (LOBYTE(wsaData.wVersion) >= iReqWinsockVer)
    {
        /* ------- Call winsock functions here ------- */
    }
    else
    {
        // Required version not available
    }

    // Cleanup winsock
    if (WSACleanup()!=0)
    {
        // cleanup failed
    }
}
else
{
    // startup failed
}

2. socket

SOCKET socket(int af, int type, int protocol);

The socket function creates a new socket and returns a handle to it. The handle is of type

SOCKET and is used by all functions that operate on the socket. The only invalid socket handle

value is INVALID_SOCKET (defined as ~0), all other values are legal (this includes the value

zero!). Its parameters are:

af

The address family to use. Use AF_INET to use the address family of TCP & UDP.

type

The type of socket to create. Use SOCK_STREAM to create a streaming socket (using TCP), or SOCK_DGRAM to create a datagram socket (using UDP). For more information on socket types, see the previous chapter.

protocol

The protocol to be used; this value depends on the address family. You can specify IPPROTO_TCP here to create a TCP socket.

The return value is a handle to the new socket, or INVALID_SOCKET if something went wrong.

The socket function can be used like this:

SOCKET hSocket;

hSocket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
if (hSocket==INVALID_SOCKET)
{
    // error handling code
}

3. closesocket

int closesocket(SOCKET s);

Closesocket closes a socket. It returns zero if no error occurs, SOCKET_ERROR otherwise. Each

socket you created with socket has to be closed with an appropriate closesocket call.

s

Handle to the socket to be closed. Do not use this socket handle after you have called this function.

The use of closesocket is pretty straightforward:

closesocket(hSocket);

However, in real situations some more operations are necessary to close the socket properly. This

will be discussed later in the tutorial.

4. sockaddr and byte ordering

Because winsock was made to be compatible with several protocols including ones that might be

added later (using the SPI) a general way of addressing has to be used. TCP/IP uses an IP and

port number to specify an address, but other protocols might do it differently. If winsock forced a

certain way of addressing, adding other protocols may not have been possible. The first version

of winsock solved this with the sockaddr structure:

struct sockaddr
{
    u_short sa_family;
    char    sa_data[14];
};



In this structure, the first member (sa_family) specifies the address family the address is for. The

data stored in the sa_data member can vary among different address families. We will only use

the internet address family (TCP/IP) in this tutorial. Winsock has defined a

structure sockaddr_in that is the TCP/IP version of the sockaddr structure. They are

essentially the same structure, but the second is obviously easier to manipulate.

struct sockaddr_in
{
    short          sin_family;
    u_short        sin_port;
    struct in_addr sin_addr;
    char           sin_zero[8];
};

The last 8 bytes of the structure are not used but are padded (with sin_zero) to give the

structure the right size (the same size as sockaddr).
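The size argument can be checked with a pair of stand-in structures. GenericAddr and Ipv4Addr below are hypothetical mirrors of sockaddr and sockaddr_in that I built from fixed-width types so the sketch compiles without the winsock headers; the single 32-bit address member replaces the in_addr union.

```cpp
#include <cstddef>
#include <cstdint>

// Mirror of sockaddr: a family tag followed by 14 opaque bytes.
struct GenericAddr {
    std::uint16_t family;
    char          data[14];
};

// Mirror of sockaddr_in: the same 16 bytes, with the TCP/IP fields named.
struct Ipv4Addr {
    std::int16_t  family;
    std::uint16_t port;     // 2 bytes
    std::uint32_t address;  // 4 bytes
    char          zero[8];  // padding up to the size of GenericAddr
};

// Both layouts occupy 16 bytes, so one can stand in for the other.
static_assert(sizeof(GenericAddr) == 16, "family (2) + data (14)");
static_assert(sizeof(Ipv4Addr) == sizeof(GenericAddr),
              "the padding makes both structures the same size");
static_assert(offsetof(Ipv4Addr, port) == offsetof(GenericAddr, data),
              "the TCP/IP fields start where the opaque data starts");
```

Port and address together use 6 of the 14 data bytes; the remaining 8 are exactly what the padding member fills.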

Before proceeding, it is important to know about the network byte order. In case you don't know,

byte ordering is the order in which values that span multiple bytes are stored. For example, a

32-bit integer value like 0x12345678 spans four 8-bit bytes. Intel x86 machines use the 'little-endian'

order, which means the least significant byte is stored first. So the value 0x12345678 would be

stored as the byte sequence 0x78, 0x56, 0x34, 0x12. Most machines that don't use little-endian

use big-endian, which is exactly the opposite: the most significant byte is stored first. The same

value would then be stored as 0x12, 0x34, 0x56, 0x78. Because protocol data can be transferred

between machines with different byte ordering, a standard is needed to prevent the machines

from interpreting the data the wrong way.

Network byte ordering

Because protocols like TCP/IP have to work between different types of systems with different types of byte ordering, the standard is that values are stored in big-endian format, also called network byte order. For example, a port number (which is a 16-bit number) like 12345 (0x3039) is stored with its most significant byte first (i.e. first 0x30, then 0x39). A 32-bit IP address is stored in the same way: each part of the IP number is stored in one byte, and the first part is stored in the first byte. For example, 216.239.51.100 is stored as the byte sequence '216,239,51,100', in that order.

Apart from the sin_family value of sockaddr and sockaddr_in, which is not part of the protocol

but tells winsock which address family to use, all the values in both structures have to be in

network byte order. Winsock provides several functions to deal with the conversion between the

byte order of the local host and the network byte order:

// Convert a u_short from host to TCP/IP network byte order.
u_short htons(u_short hostshort);

// Convert a u_long from host to TCP/IP network byte order.
u_long htonl(u_long hostlong);

// Convert a u_short from TCP/IP network order to host byte order.
u_short ntohs(u_short netshort);

// Convert a u_long from TCP/IP network order to host byte order.
u_long ntohl(u_long netlong);

You might question why we should need four API functions for such simple operations as

swapping the bytes of a short or long (as that's enough to convert from little-endian (intel) to

big-endian (network)). This is because these APIs will work even if you are running your program

on a machine with other byte ordering than an intel machine (that is, the APIs are platform

independent), like Windows CE on a handheld using a big-endian processor. Whether you use



these APIs or your own macros/functions is up to you. Just know that the API way is guaranteed

to work on all systems.
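If you are curious what such a function of your own could look like, the following is a minimal sketch of a 16-bit conversion that checks the host byte order at run time instead of assuming an intel machine. The names hostIsLittleEndian and myHtons are invented for this example; in real code htons is the safer choice.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns true when the host stores the least significant byte first.
bool hostIsLittleEndian()
{
    const std::uint16_t probe = 1;
    unsigned char firstByte = 0;
    std::memcpy(&firstByte, &probe, 1);   // look at the first byte in memory
    return firstByte == 1;
}

// Hypothetical stand-in for htons: host order -> network (big-endian) order.
std::uint16_t myHtons(std::uint16_t hostValue)
{
    if (!hostIsLittleEndian())
        return hostValue;                 // big-endian host: nothing to do
    // little-endian host: swap the two bytes
    return static_cast<std::uint16_t>((hostValue << 8) | (hostValue >> 8));
}
```

Whatever the host byte order, the two bytes of myHtons(0x3039) end up in memory as 0x30, 0x39.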

Back to the sockaddr_in structure: as said above, all members except for sin_family have to be

in network byte order. For sin_family use AF_INET. sin_port is the port number of the address

(16-bit) and sin_addr is the IP address (32-bit), declared as a union so you can manipulate the full

32-bit word, the two 16-bit parts or each byte separately. sin_zero is not used.

Here are several examples of initializing sockaddr_in structures:

sockaddr_in sockAddr1, sockAddr2;

// Set address family
sockAddr1.sin_family = AF_INET;

/* Convert port number 80 to network byte order and assign it to
   the right structure member. */
sockAddr1.sin_port = htons(80);

/* inet_addr converts a string with an IP address in dotted format to
   a long value which is the IP in network byte order.
   sin_addr.S_un.S_addr specifies the long value in the address union */
sockAddr1.sin_addr.S_un.S_addr = inet_addr("127.0.0.1");

// Set address of sockAddr2 by setting the 4 byte parts:
sockAddr2.sin_addr.S_un.S_un_b.s_b1 = 127;
sockAddr2.sin_addr.S_un.S_un_b.s_b2 = 0;
sockAddr2.sin_addr.S_un.S_un_b.s_b3 = 0;
sockAddr2.sin_addr.S_un.S_un_b.s_b4 = 1;

The inet_addr function in the example above can convert an IP address in dotted string format

to the appropriate 32-bit value in network byte order. There is also a function called inet_ntoa,

which does exactly the opposite.
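As a quick illustration of the two functions working together, this sketch converts a dotted string to its 32-bit value and back again. The helper name roundTripAddress is made up; the non-Windows branch is only there so the sketch can be tried on systems that offer the same functions through Berkeley sockets. Recent Windows SDKs flag inet_addr and inet_ntoa as deprecated, which the define at the top is assumed to silence.

```cpp
#ifdef _WIN32
#define _WINSOCK_DEPRECATED_NO_WARNINGS   // newer SDKs warn about inet_addr/inet_ntoa
#include <winsock2.h>                     // link with ws2_32.lib
#else
#include <arpa/inet.h>                    // same functions in Berkeley sockets
#include <netinet/in.h>
#endif
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: dotted string -> network order value -> dotted string.
// Returns an empty string when inet_addr rejects the input.
std::string roundTripAddress(const char* dotted)
{
    in_addr address;
    address.s_addr = inet_addr(dotted);   // s_addr is winsock shorthand for S_un.S_addr
    if (address.s_addr == INADDR_NONE)    // also returned for "255.255.255.255"
        return std::string();
    return std::string(inet_ntoa(address));
}
```

Note that inet_ntoa returns a pointer to a buffer owned by the library, which is why the sketch copies the result into a std::string right away.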

As a side note, winsock 2 does not require that the structure used to address a socket is the

same size as sockaddr, only that the first short is the address family and that the right structure

size is passed to the functions using it. This allows new protocols to use larger structures. The

sockaddr structure is provided for backwards compatibility. However, since we will only use

TCP/IP in this tutorial, the sockaddr_in structure is all we need.

5. connect

int connect(SOCKET s, const struct sockaddr *name, int namelen);

The connect function connects a socket with a remote socket. This function is used on the client

side of a connection, as you are the one initiating it. A short description of its parameters:

s

The unconnected socket you want to connect.

name

Pointer to a sockaddr structure that contains the name (address) of the remote socket to connect to.

namelen

Size of the structure pointed to by name.

The first parameter s is the client socket used for the connection. For example, a socket you've

just created with the socket function. The other two parameters, name and namelen are used



to address the remote socket (the server socket that is listening for incoming connections). This

is done by using a sockaddr structure (or sockaddr_in for TCP/IP), as described in the previous

section.

A possible use of this function is connecting to a webserver to request a page. To address the

server, you can use a sockaddr_in structure and fill it with the server's IP and port number. You

might wonder how you get the IP of a hostname like www.madwizard.org; I will show you how to

do that later. For now, just assume you know the server's IP number.

Assuming a webserver is running on a local network PC with IP number 192.168.0.5, using the

default HTTP port 80, this would be the code to connect to the server:

/* This code assumes a socket has been created and its handle is stored in a variable called hSocket */

sockaddr_in sockAddr;

sockAddr.sin_family = AF_INET;
sockAddr.sin_port = htons(80);
sockAddr.sin_addr.S_un.S_addr = inet_addr("192.168.0.5");

// Connect to the server
if (connect(hSocket, (sockaddr*)(&sockAddr), sizeof(sockAddr))!=0)
{
    // error handling code
}

/* Note: the (sockaddr*) cast is necessary because connect requires a pointer to a sockaddr
   and the sockAddr variable is of the sockaddr_in type. It is safe to cast it since they
   have the same structure, but the compiler naturally sees them as different types. */

6. bind

int bind(SOCKET s, const struct sockaddr *name, int namelen);

Binding a socket has been explained in the previous chapter. By binding a socket you assign an

address to a socket. Bind's parameters are:

s

The unbound socket you want to bind.

name

Pointer to a sockaddr structure that contains the address to assign to the socket.

namelen

Size of the structure pointed to by name.

For TCP/IP, the sockaddr_in structure can be used as usual. Let's look at an example first:

sockaddr_in sockAddr;

sockAddr.sin_family = AF_INET;
sockAddr.sin_port = htons(80);
sockAddr.sin_addr.S_un.S_addr = INADDR_ANY; // use default

// Bind socket to port 80
if (bind(hSocket, (sockaddr*)(&sockAddr), sizeof(sockAddr))!=0)
{
    // error handling code
}

As you can see, a sockaddr_in structure is filled with the necessary information. The address

family is AF_INET for TCP/IP. In the example, we bind the socket to port number 80, but not to



an IP number. If you specify INADDR_ANY as the IP address, winsock will choose an

address for you. This can be very useful for PCs with multiple network adapters (and thus multiple

IPs). If you do want to bind to a specific IP, just convert the IP to a DWORD in network byte

order and put it in the structure. Something similar is possible with the port number: when you

specify 0 as the port number, winsock will assign a unique port, with a value between 1024 and

5000 on older Windows versions (Windows Vista and later use the range 49152 to 65535).

However, most of the time you want to bind to a specific port number.

Binding is usually done before putting the socket in a listening state, to make the socket listen on

the right port number (and optionally an IP number). Although you can also bind a socket before

connecting it, this is not commonly done because the address of the socket on the client side is

not important most of the time.

7. listen

int listen(SOCKET s, int backlog);

The listen function puts a socket in the listening state, that is, it will listen for incoming

connections. It has two parameters:

s

The bound, unconnected socket you want to set into the listening state.

backlog

Maximum length of the queue of pending connections.

The backlog parameter can be set to specify the length of the queue of pending connections that

have not yet been accepted. Usually, you can use the default value SOMAXCONN, allowing the

underlying service provider to choose a reasonable value.

Before listen is called, the socket must have been bound to an address, as shown in the previous

section. For example, if you bind a socket to port 80 and then call listen on the socket, all

incoming connections on port 80 will be routed to your application. To actually accept the

connection, another function called accept is available; it is explained in the next section.

The following code snippet shows how to call the listen function on a socket that has been bound

already:

/* This code assumes the socket specified by hSocket is bound with the bind function */

if (listen(hSocket, SOMAXCONN)!=0)
{
    // error handling code
}

8. accept

SOCKET accept(SOCKET s, struct sockaddr *addr, int *addrlen);

When the socket is in the listening state and an incoming connection arrives, you can accept it

with the accept function.

s

The socket that has been placed in a listening state with the listen function.



addr

Optional pointer to a buffer that receives the address of the remote socket. This parameter is a pointer to a sockaddr structure, but its exact structure is determined by the address family.

addrlen

Optional pointer to an integer that contains the length of addr. Before calling the function, the value should be the size of the buffer pointed to by addr. On return, the value is the size of the data returned in the buffer.

As you know, when a connection is accepted a new socket is created on the server side. This

new socket is connected to the client socket, and all operations on that connection are done with

that socket. The original listening socket is not connected, but instead listens for more incoming

connections.

sockaddr_in remoteAddr;
int iRemoteAddrLen;
SOCKET hRemoteSocket;

iRemoteAddrLen = sizeof(remoteAddr);
hRemoteSocket = accept(hSocket, (sockaddr*)&remoteAddr, &iRemoteAddrLen);
if (hRemoteSocket==INVALID_SOCKET)
{
    // error handling code
}

If accept succeeds, a connection is established and the return value is a new socket handle that

is the server side of the new connection. Optionally, you can set the addr and addrlen

parameters that will receive a sockaddr structure containing the remote address information (IP &

port number).

9. send and recv

int send(SOCKET s, const char *buf, int len, int flags);

s

The connected socket to send data on.

buf

Pointer to a buffer containing the data to send

len

Length of the data pointed to by buf.

flags

Specifies the way in which the call is made.

int recv(SOCKET s, char *buf, int len, int flags);

s

The connected socket to receive data from.

buf

Pointer to a buffer that will receive the data.

len

Length of the buffer pointed to by buf.

flags

Specifies the way in which the call is made.



To transfer data on a connection, you use the send and recv functions. Send sends the data in

the buffer on the socket and returns the number of bytes sent. Recv receives the data that is

currently available at the socket and stores it in the buffer. The flags parameter can usually be set

to zero for both recv and send.

In blocking mode, send will block until all data has been sent (or an error occurred) and recv will

return as much information as is currently available, up to the size of the buffer specified.

Although these functions may seem simple at first, they become more complicated in non-blocking

mode. When a socket is in non-blocking mode, these functions cannot block until the operation is

finished so they may not perform the operation fully (ie. not all data is sent), or not at all. The

next chapter will explain these issues in great detail; I won't discuss them here since this is only a

function overview.

This example of recv and send on a connected socket in blocking mode will just send back all data

it receives.

char buffer[128];

while(true)
{
    // Receive data
    int bytesReceived = recv(hRemoteSocket, buffer, sizeof(buffer), 0);

    if (bytesReceived==0) // connection closed
    {
        break;
    }
    else if (bytesReceived==SOCKET_ERROR)
    {
        // error handling code
        break; // do not try to send after a failed recv
    }

    // Send received data back
    if (send(hRemoteSocket, buffer, bytesReceived, 0)==SOCKET_ERROR)
    {
        // error handling code
        break;
    }
}
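Because recv only returns what is currently available, a message of a known length may arrive in several pieces. A common way to deal with that is to keep calling recv until the wanted number of bytes is in. The sketch below shows the idea; receiveExactly and FakeConnection are hypothetical names, and the receive function is passed in as a parameter so the loop can be tried without a real socket.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>

// Hypothetical helper. receiveFn must behave like recv: it takes (char* buf, int len)
// and returns the number of bytes stored, 0 when the connection was closed,
// or a negative value on error (SOCKET_ERROR is -1).
template <typename ReceiveFn>
bool receiveExactly(ReceiveFn receiveFn, char* buffer, int wanted)
{
    int total = 0;
    while (total < wanted)
    {
        int received = receiveFn(buffer + total, wanted - total);
        if (received <= 0)
            return false;       // closed or failed before everything arrived
        total += received;
    }
    return true;
}

// Stand-in for a connection that delivers its data at most 3 bytes per call.
struct FakeConnection
{
    const char* data;
    int remaining;
    int calls;

    int receive(char* buf, int len)
    {
        int chunk = std::min(std::min(len, remaining), 3);
        std::memcpy(buf, data, chunk);
        data += chunk;
        remaining -= chunk;
        ++calls;
        return chunk;           // 0 once all data is gone, like a closed connection
    }
};
```

With a real blocking socket you would pass something like a lambda that calls recv(hRemoteSocket, buf, len, 0) as receiveFn.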

10. Usage

As stated in this chapter's introduction, this was only an overview of the main winsock functions.

Just knowing how the functions work is not enough to program correctly with winsock. The next

chapters will tell you how to use them correctly, which I/O strategies exist and how blocking and

non-blocking mode works.

I/O models

In chapter 3 I briefly touched on blocking and non-blocking sockets, which play a role in the

available winsock I/O models. An I/O model is the method you use to control the program flow of

the code that deals with the network input and output. Winsock provides several functions to

design an I/O strategy; I will discuss them all here in short to give an overview. Later in the tutorial

I will deal with most models separately and show some examples of them.

1. The need for an I/O model

So why do you need an I/O model? We don't have infinite network speed, so when you send or

receive data the operation you asked for may not be completed immediately, especially since

networks are slow compared to 'normal', local operations. How do you handle this? You

could choose to do other things while you're waiting to try again, or let your program wait until the

operation is done, etc. The best choice depends on the structure and requirements of your

program.

Originally, Berkeley sockets used the blocking I/O model. This means that all socket functions

operate synchronously, ie. they will not return before the operation is finished. This kind of

behavior is often undesirable in the Windows environment, because often user input and output

should still be processed even while network operations might occur (I explained this earlier in

chapter 3). To solve this problem, non-blocking sockets were introduced.

2. Non-blocking model

A socket can be set into non-blocking mode using ioctlsocket (with FIONBIO as its cmd

parameter). Some functions used in I/O models implicitly set the socket in non-blocking mode (more

on this later). When a socket is in non-blocking mode, winsock functions that operate on it will

never block but always return immediately, possibly failing because there simply wasn't any time to

perform the operation. Non-blocking sockets introduce a new winsock error code which - unlike

other errors - is not exceptional. For now, keep the following in mind:

WSAEWOULDBLOCK

This constant is the error code a winsock function sets when it cannot immediately perform an operation on a non-blocking socket. You get this error code when you call WSAGetLastError after a winsock function failed. Its name literally says 'error, would block', meaning that the function would have to block to complete. Since a non-blocking socket should not block, the function cannot do what you ask at that moment.

Note that this isn't really an error. It can occur all the time when using non-blocking sockets. It just says: I can't do that right now, try again later. The I/O model usually provides a way to determine what's the best time to try again.
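In code, this usually comes down to separating 'try again later' from real failures right after a call returns. The following sketch shows one possible way to do that; CallOutcome and classifyCall are invented names, and the two constants are copied from winsock2.h only so that the sketch compiles without it. In a real program, use SOCKET_ERROR and WSAEWOULDBLOCK from the header.

```cpp
#include <cassert>

// Values mirrored from winsock2.h for this sketch only.
const int kSocketError = -1;       // SOCKET_ERROR
const int kWouldBlock  = 10035;    // WSAEWOULDBLOCK

enum class CallOutcome { Completed, TryAgainLater, Failed };

// result:    the return value of a call such as send or recv
// lastError: what WSAGetLastError() reported right after that call
CallOutcome classifyCall(int result, int lastError)
{
    if (result != kSocketError)
        return CallOutcome::Completed;       // the call did its work
    if (lastError == kWouldBlock)
        return CallOutcome::TryAgainLater;   // not exceptional on non-blocking sockets
    return CallOutcome::Failed;              // a real error
}
```

A typical call site would look like classifyCall(send(hSocket, buffer, length, 0), WSAGetLastError()), with the caveat that WSAGetLastError is only meaningful when the call actually failed.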

3. I/O models

I've made several attempts to find a categorical description of the several I/O models but I haven't

really found a good one, mainly because the models' properties overlap and terms like

(a)synchronous have slightly different meanings or apply to different things for each model. So I

decided to just create a table with all the models to show the differences and explain the details

later.

Model                                 Blocking mode   Notification type   Notification method
Blocking sockets                      blocking        none                -
Polling                               non-blocking    none                -
Select                                both            on network event    blocking select
WSAAsyncSelect                        non-blocking    on network event    window message
WSAEventSelect                        non-blocking    on network event    event objects
Overlapped I/O: blocking              N/A             on completion       blocking call
Overlapped I/O: polling               N/A             none                -
Overlapped I/O: completion routines   N/A             on completion       callback function
Overlapped I/O: completion ports      N/A             on completion       completion port

I/O models - MadWizard.org


The first five models are commonly used and fairly easy to use. The last four actually use the same

model (overlapped I/O), but use different implementation methods. Actually, you don't really need

overlapped I/O unless you're writing network programs that should be able to handle thousands of

connections. Most people won't write such programs but I included these models because good information

and tutorials about overlapped I/O are not easy to find on the web. If you're not

interested in overlapped I/O you can safely skip the later chapters about it.

One way to divide the I/O models is based on the blocking mode they use. The blocking sockets

model naturally uses blocking mode, while the others use non-blocking mode (select may be used

for both). The blocking mode is not applicable to overlapped I/O because these operations always

operate asynchronously (the blocking mode cannot affect this nor the other way around).

Another way to divide them is using their differences in the notification method used (if any).

There are three subtypes:

None

There is no notification of anything, an operation simply fails or succeeds (optionally blocking).

On network event

A notification is sent on a specific network event (data available, ready to send, incoming

connection waiting to be accepted, etc.). Operations fail if they cannot complete immediately,

the network event notification can be used to determine the best time to try again.

On completion

A notification is sent when a pending network operation has been completed. Operations either

succeed immediately, or fail with an 'I/O pending' error code (assuming nothing else went wrong).

You will be notified when the operation does complete, eliminating the need to try the operation

again.

Blocking mode doesn't use any notifications; the call will just block until the operation has finished.

WSAAsyncSelect is an example of a network event notification model, as you will be notified by a

window message when a specific network event occurred. The completion notification method is

solely used by overlapped I/O, and is far more efficient. Completion notifications are bound directly to the operations;

the big difference between the network event and completion notification is that a completion

notification will be about a specific operation you requested, while a network event can happen

for any reason. Also, overlapped I/O operations can - as the name says - overlap. That

means multiple I/O requests can be queued.

In the next section I will show you the details of each model separately. To give you a more

intuitive view of the models, I've created timeline images and used a conversation between the

program and winsock as an analogy to how the model works.

Note

In many of these timelines I've assumed the winsock operation fails (in a WSAEWOULDBLOCK way)

because that is the interesting case. The function might as well succeed and return immediately if

the operation has been done already. I've left this case out in most of the timelines in favor of

clarity.



4. Blocking sockets

Blocking sockets are the easiest to use; they were already used in the first socket

implementations. When an operation performed on a blocking socket cannot complete immediately,

the socket will block (ie. halt execution) until it is completed. This implies that when you call a

winsock function like send or recv, it might take quite a while (compared to other API calls) before

it returns.

This is the timeline for a blocking socket:

As you can see, as soon as the main thread calls a winsock function that couldn't be completed

immediately, the function will not return until it is completed. Naturally this keeps the program flow

simple, since the operations can be sequenced easily.

By default, a socket is in blocking mode and behaves as shown above. As I said earlier, I will also

show each I/O model in the form of a conversation between the program and winsock. For blocking

sockets, it's very simple:

program: send this data

winsock: okay, but it might take some time.........done!

5. Polling

Polling is actually a very bad I/O model in windows, but for completeness' sake I will describe it.

Polling is an I/O model for non-blocking sockets, so the socket first has to be put into non-blocking

mode. This can be done with ioctlsocket. Polling in general is repeating something until its status

is the desired one, in this case repeating a winsock function until it returns successfully:

Because the socket is non-blocking, the function will not block until the operation is finished. If it

cannot perform the operation it has to fail (with WSAEWOULDBLOCK as error code). The polling I/O

model just keeps calling the function in a loop until it succeeds:

program: send this data

winsock: sorry can't do that right now, I would block

program: send this data

winsock: done!

As I said, this is a really bad method because its effect is the same as a blocking function, except

that you have some control inside the loop so you could stop waiting when some variable is set, for

example. This style of synchronization is called 'busy waiting', which means the program is

continuously busy with waiting, wasting precious CPU time. Blocking sockets are far more efficient

since they use an efficient wait state that requires nearly no CPU time until the operation

completes.
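To make the busy waiting visible, here is a sketch of what such a polling loop boils down to. The names pollUntilDone, tryOperation and keepWaiting are made up for this illustration; tryOperation stands for a winsock call on a non-blocking socket that reports whether it went through or failed with WSAEWOULDBLOCK.

```cpp
#include <cassert>

// tryOperation: returns true when the operation went through,
//               false when it would have blocked.
// keepWaiting:  lets the caller stop the loop, for example when the user cancels.
// Returns the number of attempts needed, or -1 when the loop was abandoned.
template <typename Operation, typename KeepWaiting>
long pollUntilDone(Operation tryOperation, KeepWaiting keepWaiting)
{
    long attempts = 0;
    while (keepWaiting())
    {
        ++attempts;
        if (tryOperation())
            return attempts;
        // There is no wait state here: the loop spins at full speed,
        // burning CPU time on every failed attempt.
    }
    return -1;
}
```

Every pass through the loop that ends in 'would block' is work the CPU did for nothing, which is exactly the cost a blocking call avoids.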

Now you know how the polling I/O model works, forget about it immediately and avoid it by all

means :)

6. Select

Select provides you a more controlled way of blocking. Although it can be used on blocking sockets

too, I will only focus on the non-blocking socket usage. This is the timeline for select:


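And the corresponding conversation:

program: send this data

winsock: sorry can't do that right now, I would block

program: okay, tell me when's the best time to try again (the select call)

winsock: sure, hang on a minute......try again now!

program: send this data

winsock: done!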

You might have noticed that the select call looks suspiciously similar to the blocking socket

timeline. This is because the select function does block. The first call tries to perform the winsock

operation. In this case, the operation would block but the function can't so it returns. Then at one

point, select is called. Select will wait until the best time to retry the winsock operation. So

instead of blocking winsock functions, we now have only one function that blocks, select.

If select blocks, why use it for non-blocking sockets then? Select is more powerful than blocking

sockets because it can wait on multiple events. This is the prototype of select:

int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
           const struct timeval *timeout);

Select determines the status of one or more sockets, performing synchronous I/O if necessary. The

nfds parameter is ignored; it is only included for compatibility with the original Berkeley sockets

select function. The timeout parameter can be used to specify an optional timeout for the function.

The other three parameters all specify a set of sockets.

readfds is a set of sockets that will be checked for readability

writefds is a set of sockets that will be checked for writability

exceptfds is a set of sockets that will be checked for errors

Readability means that data has arrived on a socket and that a call to recv after select is likely to

receive data. Writability means it's a good time to send data since the receiver is probably ready to

receive it. Exceptfds is used to catch errors from a non-blocking connect call as well as

out-of-band data (which is not discussed in this tutorial).

So while select may block you have more control over it since you can specify more than one

socket to wait on for a specific event, and multiple types of events (data waiting, ready to send or

some error that has occurred). Select will be explained in more detail in later chapters.
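As a first taste, the sketch below uses select to wait until a single socket becomes readable, with a timeout. The helper name waitUntilReadable is made up, and the non-Windows branch is only included so the sketch can be tried with Berkeley sockets as well; on Windows it assumes WSAStartup has already been called.

```cpp
#ifdef _WIN32
#include <winsock2.h>          // link with ws2_32.lib
#else
#include <sys/select.h>        // Berkeley sockets equivalent
#include <sys/socket.h>
typedef int SOCKET;
#endif
#include <cassert>

// Hypothetical helper: wait until 'sock' is readable or the timeout expires.
// Returns 1 when readable, 0 on timeout and a negative value (SOCKET_ERROR) on failure.
int waitUntilReadable(SOCKET sock, long timeoutMilliseconds)
{
    fd_set readSet;
    FD_ZERO(&readSet);          // start with an empty set
    FD_SET(sock, &readSet);     // add the one socket we care about

    timeval timeout;
    timeout.tv_sec = timeoutMilliseconds / 1000;
    timeout.tv_usec = (timeoutMilliseconds % 1000) * 1000;

    // winsock ignores the first argument; Berkeley sockets expect
    // the highest descriptor value plus one.
    return select(static_cast<int>(sock) + 1, &readSet, nullptr, nullptr, &timeout);
}
```

Adding more sockets to readSet, or passing sets for writability and errors as well, is what makes select more flexible than a plain blocking call.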

7. Windows messages (WSAAsyncSelect)

Many windows programs have some kind of window to get input from and give information to the

user. Winsock provides a way to integrate the network event notification with a window's

message handling. The WSAAsyncSelect function will register notification for the specified

network events in the form of a custom window message.



int WSAAsyncSelect(SOCKET s, HWND hWnd, u_int wMsg, long lEvent);

This function requires a custom message (wMsg) that the user chooses and the window procedure

should handle. lEvent is a bit mask that selects the events to be notified about. The timeline is as

follows:

Let's say the first message wants to write some data to the socket using send. Because the

socket is non-blocking, send will return immediately. The call might succeed immediately, but here

it didn't (it would need to block). Assuming WSAAsyncSelect was set up to notify you about the

FD_WRITE event, you will eventually get a message from winsock telling you a network event has

happened. In this case it's the FD_WRITE event which means something like: "I'm ready again, try

resending your data now". So in the handler of that message, the program tries to send the data

again, and this is likely to succeed.

The conversation between the program and winsock is much like the one with select, the

difference is in the method of notification: a window message instead of a synchronous select call.

While select blocks waiting until an event happens, a program using WSAAsyncSelect can continue

to process windows messages as long as no events happen.

program registers for network event notification via window messages

program: send this data

winsock: sorry can't do that right now, I would block

program handles some message

program handles some other message

program gets a notification window message from winsock

program: send this data

winsock: done!

WSAAsyncSelect provides a more 'Windows natural' way of event notification and is fairly easy to

use. For low traffic servers (ie. < 1000 connections) it is efficient enough as well. The drawback is

that window messages aren't really fast and that you'll need a window in order to use it.



8. Event objects (WSAEventSelect)

WSAAsyncSelect's brother is WSAEventSelect, which works in a very similar way but uses event

objects instead of windows messages. This has some advantages, including a better separation of

the network code and normal program flow and better efficiency (event objects work faster than

window messages).

Have a good look at the timeline and conversation, it looks a bit complicated but it really isn't:

program registers for network event notification via event objects

program: send this data

winsock: sorry can't do that right now, I would block

program waits for the event object to signal

program: send this data

winsock: done!

It's hard to draw a timeline for this function since event objects are a very powerful mechanism

that can be used in many ways. I chose a simple example here as this I/O model will be

explained in great detail later in this tutorial.

At first, this model seems a lot like blocking: you wait for an event object to be signaled. This is

true, but you can also wait for multiple events at the same time and create your own event

objects. Event objects are part of the windows API, winsock uses the same objects. Winsock does

have special functions to create the event objects but they are just wrappers around the usual

functions.

All that winsock does with this model is signal an event object when a winsock event happens.

How you use this notification method is up to you. That makes it a very flexible model.

The function used to register for network events is WSAEventSelect. It is much like

WSAAsyncSelect:

int WSAEventSelect(SOCKET s, WSAEVENT hEventObject, long lNetworkEvents);

WSAAsyncSelect will send you a custom message with the network event that happened

(FD_READ, FD_WRITE, etc.). Unlike WSAAsyncSelect, WSAEventSelect has only one way of

notification: signaling the event object. When the object is signaled, one or more events may

have happened. Which events exactly can be found out with WSAEnumNetworkEvents.

9. Use with threads

Before starting with the overlapped I/O models I first want to explain some things about the use of

threads. Some of the models explained can show different behavior when threads come into play.

For example, blocking sockets in a single threaded application will block the whole application. But

when the blocking sockets are used in a separate thread, the main thread continues to run while

the helper thread blocks. For low traffic servers (let's say 10 connections or so), an easy to

implement method is to use the select model with one thread per client. Each running thread is

bound to a specific connection, handling requests and responses for that particular connection.

Other ways of using threads are possible too, like handling multiple connections per thread to limit

the number of threads (this is useful for servers with many connections), or just one main thread

to handle the user input/GUI and one worker thread that deals with all the socket I/O.

The same thing holds for the other models, although some combine better with threads than

others. For example, WSAAsyncSelect uses window messages. You could use threads but you

somehow have to pass the received messages to the worker threads. Easier to use is

WSAEventSelect, since threads can wait on events (even multiple) so notifications can be directly

acted on in the thread. Pure blocking sockets can be used as well, but it's hard to get some

control over a thread that is blocked on a winsock function (select has the same problem). With

events, you can create a custom event (not winsock related) and use that to notify the thread

about something that has nothing to do with socket I/O, like shutting down the server.

As you can see, threads can be very powerful and change the abilities of an I/O model radically.

Many servers need to handle multiple requests at the same time so that's why threads are a logical

choice to implement this; threads all run at the same time. In later chapters I will discuss the use

of threads, for now it's enough to know you can use them.

10. Introduction to Overlapped I/O

Overlapped I/O is very efficient and when implemented well also very scalable (allowing many,

many connections to be handled). This is especially true for overlapped I/O in combination with

completion ports. I said before that for most uses overlapped I/O is a bit of overkill, but I will explain

it anyway.

The asynchronous models discussed so far all send some kind of notification on the occurrence of a

network event like 'data available' or 'ready to send again'. The overlapped I/O models also notify

you, but about completion instead of a network event. When requesting a winsock operation, it

might either complete immediately or fail with WSA_IO_PENDING as the winsock error code. In the

latter case, you will be notified when the operation is finished. This means you don't have to try

again like with the other models, you just wait until you're told it's done.

The price to pay for this efficient model is that overlapped I/O is a bit tricky to implement. Usually

one of the other models can stand up to the task as well, prefer those if you don't need really high

performance and scalability. Also, the windows 9x/ME series do not fully support all overlapped I/O



models. While NT4/2K/XP has full kernel support for overlapped I/O, win9x/ME has none. However

for some devices (including sockets), overlapped I/O is emulated by the windows API in win9x/ME.

This means you can use overlapped I/O with winsock on win9x/ME, but NT+ has much better

support for it and provides more functionality. For example, I/O completion ports are not available

at all on win9x systems. Besides, if you're writing high-performance applications that require

overlapped I/O I strongly recommend running it on an NT+ system.

As with the network event notification models, overlapped I/O can be implemented in different

ways too. They differ in the method of notification: blocking, polling, completion routines and

completion ports.

11. Overlapped I/O: blocking on event

The first overlapped I/O model I'm going to explain is using an event object to signal completion.

This is much like WSAEventSelect, except that the object is set into the signaled state on

completion of an operation, not on some network event. Here's the timeline:

As with WSAEventSelect, there are many ways to use the event object. You could just wait for it,

you could wait for multiple objects, etc. In the timeline above a blocking wait is used, matching

this simple conversation:


winsock: okay, but I couldn't send it right now

program waits for the event object to signal, indicating completion of the operation

As you can see, the winsock operation is actually performed at the same time as the main thread is

running (or waiting in this case). When the event is signaled, the operation is complete and the

main thread can perform the next I/O operation. With network event notification models, you

probably had to retry the operation. This is not necessary here.



12. Overlapped I/O: polling

Just like the polling model mentioned earlier, the status of an overlapped I/O operation can be

polled too. The WSAGetOverlappedResult function can be used to determine the status of a

pending operation. The timeline and conversation are pretty much the same as the other polling

model, except that the operation happens at the same time as the polling, and that the status

is the completion of the operation, not whether the operation succeeded immediately or would

have blocked.



program: are you done yet?

winsock: no

program: are you done yet?

winsock: no

program: are you done yet?

winsock: no

program: are you done yet?

winsock: no

program: are you done yet?

winsock: yes!

Again, polling isn't very good as it puts too much stress on the CPU. Continuously asking whether an

operation has completed is less efficient than just waiting for it in a wait state that consumes

little CPU. So I don't consider this a very good I/O model either. This doesn't render

WSAGetOverlappedResult useless though; it has more uses, which I will show when the tutorial

comes to the chapters about overlapped I/O.

13. Overlapped I/O: completion routines

Completion routines are callback routines that get called when an operation (which you associated

with the routine) completes. This looks quite simple but there is a tricky part: the callback routine

is called in the context of the thread that initiated the operation. What does that mean? Imagine a

thread just asked for an overlapped write operation. Winsock will perform this operation while your

main thread continues to run. So winsock has its own thread for the operation. When the operation

finishes, winsock will have to call the callback routine. If it just called it, the routine would be

run in the context of the winsock thread. This means the calling thread (the thread that asked for

the operation) would be running at the same time as the callback routine. The problem with that is

that you don't have synchronization with the calling thread: it doesn't know the operation

completed unless the callback tells it somehow.

To prevent this from happening, winsock makes sure the callback is run in the context of the

calling thread by using the APC (Asynchronous Procedure Call) mechanism included in windows. You

can look at this as 'injecting' a routine into a thread's program flow so it will run the routine and

then continue with what it was doing. Of course the system can't just say to a thread: "Stop doing

whatever you were doing, and run this routine first!". A thread can't just be interrupted at any

point.

In order to deal with this, the APC mechanism requires the thread to be in a so-called alertable

wait state. Each thread has its own APC queue where APCs are waiting to be called. When the

thread enters an alertable wait state it indicates that it's willing to run an APC. The function that

puts the thread in this wait state (for example SleepEx, WaitForMultipleObjectsEx and more) either

returns on the normal events for that function (timeout, triggered event etc.) or when an APC was

executed.

Overlapped I/O with completion routines uses the APC mechanism (though slightly wrapped) to

notify you about completion of an operation. The timeline and conversation are:



program enters an alertable wait state

the operation completes

winsock: system, queue this completion routine for that thread

the wait state the program is in is alerted

the wait function executes the queued completion routine and returns to the program



APCs can be a bit hard to understand but don't worry, this is just an introduction. Usually a thread

is in the alertable wait state until the callback is called, which handles the event and returns to the

thread. The thread then does some operations if necessary and finally loops back to the wait state

again.

14. Overlapped I/O: completion ports

We've finally come to the last and probably most efficient winsock I/O model: overlapped I/O with

completion ports. A completion port is a mechanism available in NT kernels (win9x/ME has no

support for it) to allow efficient management of threads. Unlike the other models discussed so far,

completion ports have their own thread management. I didn't draw a timeline or write a

conversation for this model, as it probably wouldn't make things clearer. I did draw an image of the

mechanism itself; have a good look at it first:

The idea behind completion ports is the following. After creating the completion port, multiple

sockets (or files) can be associated with it. At that point, when an overlapped I/O operation

completes, a completion packet is sent to the completion port. The completion port has a pool of

similar worker threads, each of which is blocking on the completion port. On arrival of a

completion packet, the port takes one of the inactive queued threads and activates it. The

activated thread handles the completion event and then blocks again on the port.

The management of threads is done by the completion port. There are a certain number of threads

running (waiting on the completion port actually), but usually not all of them are active at the

same time. When creating the completion port you can specify how many threads may be active at the

same time. If you pass 0, this value defaults to the number of CPUs in the system.

Completion ports are a bit counterintuitive. There is no relation between a thread and a

connection or operation. Each thread has to be able to act on any completion event that

happened on the completion port. I/O completion ports (IOCP) are not easy to implement but

provide very good scalability. You will be able to handle thousands of connections with IOCP.

15. Conclusion

I hope you now have a global view of all the I/O models available. Don't worry if you don't fully

understand them; the next chapters will explain them in more detail, one at a time.



Blocking sockets: client

The first I/O model I'm going to explain to you is the simplest one: blocking sockets. Winsock

functions operating on blocking sockets will not return until the requested operation has completed

or an error has occurred. This behavior allows a pretty linear program flow so it's easy to use them.

In chapter 4, you've seen the basic winsock functions. These are pretty much all the functions you need

to program blocking sockets, although in this chapter I will show you some additional functions that

may be useful.

You might not be very interested in blocking sockets if you plan to use an I/O model that uses

non-blocking sockets. Nonetheless, I strongly recommend reading the chapters about blocking sockets

too, since they cover the socket programming basics and other useful winsock features I will assume

you remember in the next chapters.

1. A simple client

The first example is a simple client program that connects to a website and makes a request. It will

be a console application as they work well with blocking sockets. I won't assume you have deep

knowledge of HTTP (the protocol used for the web); this is what happens in short:

The client connects to the server (on port 80 by default)

The server accepts the connection and just waits

The client sends its HTTP request as an HTTP request message

The server responds to the HTTP request with an HTTP response message

The server closes the connection*

*) This depends on the value of the connection HTTP header, but to keep things simple, we assume

the connection will always be closed.

HTTP follows the typical client-server model: the client and server talk to each other in turns. The

client initiates the requests; the server reacts with a response.

An HTTP request includes a request method, of which the three most used

are GET, POST and HEAD. GET is used to get a resource from the web (webpage, image, etc.).

POST sends data to the server first (like form data filled in by the user), then receives the server's

response. Finally, HEAD is the same as GET, except that the actual data is not sent by the

server, only the HTTP response message. HEAD is used as a fast way to see if a page has been

modified without having to download the full page data. In the example program I will use HEAD since

GET can return quite some data while HEAD will only return a response code and a set of headers, so

the program's output is easier to read.

A typical HTTP request with the HEAD request method looks like this:

HEAD / HTTP/1.1

Host: www.google.com

User-agent: HeadReqSample

Connection: close

The first / in the first line is the requested page, in this case the server's root (default page).

HTTP/1.1 indicates version 1.1 of the HTTP protocol is used. After this first special line that contains

the command follows a set of headers in the form "header-name: value", terminated by a blank line.

As line terminators, a combination of carriage return (CR, 0x0D) and line feed (LF, 0x0A) is used.

That last blank line indicates the end of the client's request. As soon as the server detects this, it

will send back a response in this form:

HTTP/1.1 Response-code Response-message

header-name: value

header-name: value

header-name: value

As you can see the response format is much like that of a request. Response-code is a 3-digit code

that indicates the success or failure of the request. Typical response codes are 200 (everything

OK), 404 (page not found, you probably knew this one :) and 302 (found but located elsewhere,

redirect). Response-message is a human-readable version of the response code and can be anything

the server likes. The set of headers includes information about the requested resource. A HEAD

request will result in the above response. If the request method had been GET, the actual

page data would be sent back by the server after this response message.
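The request and response formats described above are simple enough to handle with a few lines of string code. The sketch below is not part of the HeadReq program; BuildHeadRequest and ParseStatusLine are hypothetical helpers I made up to show the CR LF line endings, the terminating blank line and the layout of the first response line.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper: build a HEAD request for the root page of 'host'.
// Every line ends with CR LF; the empty last line ends the request.
std::string BuildHeadRequest(const std::string &host)
{
    return "HEAD / HTTP/1.1\r\n"
           "Host: " + host + "\r\n"
           "User-agent: HeadReqSample\r\n"
           "Connection: close\r\n"
           "\r\n";
}

// Hypothetical helper: split a line like "HTTP/1.1 200 OK" into the 3-digit
// response code and the response message. Returns false if the line doesn't
// look like the first line of an HTTP response.
bool ParseStatusLine(std::string line, int &code, std::string &message)
{
    // Drop the CR LF line terminator if it is still attached
    while (!line.empty() &&
           (line[line.size() - 1] == '\r' || line[line.size() - 1] == '\n'))
        line.erase(line.size() - 1);

    if (line.compare(0, 5, "HTTP/") != 0)
        return false;

    const std::string::size_type space = line.find(' ');
    if (space == std::string::npos || line.size() < space + 4)
        return false;

    // The response code must be exactly three digits...
    for (std::string::size_type i = space + 1; i <= space + 3; ++i)
        if (!std::isdigit(static_cast<unsigned char>(line[i])))
            return false;

    // ...followed by a space (or nothing at all)
    if (line.size() > space + 4 && line[space + 4] != ' ')
        return false;

    code = std::atoi(line.substr(space + 1, 3).c_str());
    message = (line.size() > space + 5) ? line.substr(space + 5) : std::string();
    return true;
}
```

For example, BuildHeadRequest("www.google.com") gives exactly the request shown earlier, and ParseStatusLine on "HTTP/1.1 404 Not Found" yields code 404 and the message "Not Found".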

So much for the crash course in HTTP; it's not really necessary to understand it all to read the examples

about blocking sockets, but now you have some background information too. If you want to read

more about HTTP, find the RFC for it (www.rfc-editor.org) or google for HTTP. Another great

introduction to HTTP is HTTP made really easy.

2. Program example

A possible output of the example program called HeadReq is shown here:

X:\>headreq www.microsoft.com

Initializing winsock... initialized.

Looking up hostname www.microsoft.com... found.

Creating socket... created.

Attempting to connect to 207.46.134.190:80... connected.

Sending request... request sent.

Dumping received data...

HTTP/1.1 200 OK

Connection: close

Date: Mon, 17 Mar 2003 20:14:03 GMT

Server: Microsoft-IIS/6.0

P3P: CP='ALL IND DSP COR ADM CONo CUR CUSo IVAo IVDo PSA PSD TAI TELo OUR SAMo CNT COM INT NAV ONL PHY PRE PUR UNI'

Content-Length: 31102

Content-Type: text/html

Expires: Mon, 17 Mar 2003 20:14:03 GMT

Cache-control: private



Cleaning up winsock... done.

If the program's parameter (www.microsoft.com) is omitted, www.google.com is used.

3. Hostnames

So what do we need for the client? I'm assuming you have the address of the webpage

(www.google.com for example) and you want to get the default webpage for it, like the page you

get when entering www.google.com in your web browser (in order to keep things simple we will only

receive the server's response headers, not the actual page).

As you know from chapter 4, you can connect a socket to a server with the connect function, but

this function requires a sockaddr structure (or sockaddr_in in the case of TCP/IP). How do we build

up this structure? Sockaddr_in needs an address family, an IP number and a port number. The

address family is simply AF_INET. The port number is also easy; the default for the HTTP protocol is

port 80. What about the IP? We only have a hostname. As you may remember from chapter 2, there's a DNS

server that knows which IPs correspond to which hostnames. To look this up, winsock has a

function called gethostbyname:

hostent * gethostbyname(const char *name);

You simply provide this function a hostname as a string (e.g. "www.google.com") and it will return a

pointer to a hostent structure. This hostent structure contains a list of addresses (IPs) that are

valid for the given hostname. One of these IPs can then be put into the sockaddr_in structure and

we're done.
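As a rough sketch of that last step, the helper below copies the first address from a hostent structure into a sockaddr_in. The name FillSockAddr is hypothetical and the example program may organize this differently. The non-Windows includes are only there so you can try the snippet on another system; in a winsock program only winsock2.h is needed.

```cpp
#ifdef _WIN32
#include <winsock2.h>       // hostent, sockaddr_in, htons (link with ws2_32.lib)
#else
#include <netdb.h>          // hostent (assumption: only for trying this elsewhere)
#include <netinet/in.h>     // sockaddr_in, in_addr
#include <arpa/inet.h>      // htons
#endif
#include <cstring>

// Hypothetical helper: fill a sockaddr_in with the address family, the given
// port and the first IPv4 address found in a hostent structure.
bool FillSockAddr(const hostent *pHostent, unsigned short port, sockaddr_in &addr)
{
    // gethostbyname returns NULL on failure, so check the pointer first
    if (pHostent == NULL ||
        pHostent->h_addrtype != AF_INET ||
        pHostent->h_length != static_cast<int>(sizeof(in_addr)) ||
        pHostent->h_addr_list == NULL ||
        pHostent->h_addr_list[0] == NULL)
        return false;

    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;          // address family
    addr.sin_port = htons(port);        // port number in network byte order
    // The addresses in the list are already in network byte order
    std::memcpy(&addr.sin_addr, pHostent->h_addr_list[0], sizeof(in_addr));
    return true;
}
```

You would call it with the result of gethostbyname and the port to connect to, for example FillSockAddr(gethostbyname("www.google.com"), 80, addr), and then pass the filled structure to connect.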

4. Framework

The program we're going to write will connect to a web server, send a HEAD HTTP request and dump

all output. An optional parameter specifies the server name to connect to; if no name is given it

defaults to www.google.com.

First of all, we define the framework for the application:

    #include <iostream>

    #define WIN32_LEAN_AND_MEAN
    #include <winsock2.h>
    #include <windows.h>

    using namespace std;

    class HRException
    {
    public:
        HRException() : m_pMessage("") {}
        virtual ~HRException() {}
        HRException(const char *pMessage) : m_pMessage(pMessage) {}
        const char * what() { return m_pMessage; }
    private:
        const char *m_pMessage;
    };

    int main(int argc, char* argv[])
    {
        // main program
    }



The winsock headers are already included by windows.h, but because we use some winsock 2

specific things we also need to include winsock2.h. Include this file before windows.h to prevent it

from including an older winsock version first. We will also need the STL's iostream classes, so we

included those too. Don't forget to link to ws2_32.lib, or you'll get a bunch of unresolved symbol

errors.

The HRException class is a simple exception class used to throw errors that occur. One of its

constructors takes a const char * with an error message that can be retrieved with the what()

method.

5. Constants and global data

The program will need some constants and global data, which we define in the following code

snippet:

    const int  REQ_WINSOCK_VER   = 2;   // Minimum winsock version required
    const char DEF_SERVER_NAME[] = "www.google.com";
    const int  SERVER_PORT       = 80;
    const int  TEMP_BUFFER_SIZE  = 128;

    const char HEAD_REQUEST_PART1[] =
    {
        "HEAD / HTTP/1.1\r\n"               // Get root index from server
        "Host: "                            // Specify host name used
    };

    const char HEAD_REQUEST_PART2[] =
    {
        "\r\n"                              // End hostname header from part1
        "User-agent: HeadReqSample\r\n"     // Specify user agent
        "Connection: close\r\n"             // Close connection after response
        "\r\n"                              // Empty line indicating end of request
    };

    // IP number typedef for IPv4
    typedef unsigned long IPNumber;

These constants and data define the default hostname (www.google.com), server port (80 for

HTTP), receive buffer size, and the minimum (major) winsock version required (2 or higher in our

case). Furthermore, the full HTTP request is put in two variables. The request is split up because the

hostname of the server needs to be inserted as the host header (see the HTTP message examples

above). While all strings in C automatically get a 0 byte at the end to terminate them, we don't actually

treat them as null-terminated strings. Only the text itself will be sent, without the null terminator.

Finally, unsigned long is typedef'ed to IPNumber to make the
