SOCKET PROGRAMMING ON UNIX - INTRODUCTION TO SOCKETS (PART 1/3)

2020-06-03
In this series of tutorials, I will try to explain how socket programming works under UNIX operating systems (focusing on Linux) and, eventually, how to use sockets to create network-based programs such as a port scanner. In fact, building a (SYN) port scanner is the final goal of this guide. To fully follow this tutorial, you should have a good knowledge of system programming in C, some familiarity with GNU/Linux systems (or any UNIX-like OS of your choice) and some knowledge of the TCP/IP stack.

What is a socket?

Let's begin by explaining what exactly a socket is and what its role is on a UNIX operating system. Usually, two processes on the same host can communicate with each other through pipes or shared memory. But what happens if the two processes are on different hosts? For instance, what exactly happens when you use your browser to connect to a server on the other side of the world? This is where sockets come into play.
>
A socket is a logical endpoint (situated on the transport layer) between two processes running on two different hosts. A socket can be seen as a tuple of one IP address and a port.
-- Wikipedia
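To make the "IP address plus port" idea concrete, here is a minimal sketch that stores such a pair in a struct sockaddr_in and renders it back as text. The helper name format_endpoint is hypothetical and only serves this illustration; the calls it relies on (inet_pton, inet_ntop, htons, ntohs) are the standard ones.

```c
#include <arpa/inet.h>  // inet_pton, inet_ntop, htons, ntohs
#include <netinet/in.h> // struct sockaddr_in, INET_ADDRSTRLEN
#include <stdio.h>      // snprintf
#include <string.h>     // memset

// Hypothetical helper: stores an IPv4 address and a port in a sockaddr_in,
// then renders the pair back as "address:port".
// Returns 0 on success, -1 on error.
int format_endpoint(const char *ip, unsigned short port, char *out, size_t out_sz) {
     struct sockaddr_in endpoint;
     char ip_str[INET_ADDRSTRLEN];

     memset(&endpoint, 0, sizeof(endpoint));
     endpoint.sin_family = AF_INET;
     endpoint.sin_port = htons(port); // ports are stored in network byte order

     // inet_pton returns 1 only if ip is a valid dotted-decimal IPv4 address
     if(inet_pton(AF_INET, ip, &endpoint.sin_addr) != 1)
          return -1;

     // Convert the binary address back to its textual form
     if(inet_ntop(AF_INET, &endpoint.sin_addr, ip_str, sizeof(ip_str)) == NULL)
          return -1;

     int written = snprintf(out, out_sz, "%s:%d", ip_str, ntohs(endpoint.sin_port));
     return (written < 0 || (size_t)written >= out_sz) ? -1 : 0;
}
```

Calling format_endpoint("127.0.0.1", 3000, out, sizeof(out)) should leave the string 127.0.0.1:3000 in out, while an invalid address makes the helper return -1.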
Note that what we have just defined are TCP/IP sockets. Do not confuse them with UNIX sockets. Put simply, a UNIX socket (or UNIX domain socket) is a communication mechanism that allows data exchange between processes running on the same host.
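As a rough illustration of the difference, the sketch below creates a pair of connected UNIX domain sockets with socketpair and pushes a message through them; no IP address or port is involved. For brevity both ends live in the same process, whereas in practice they would usually belong to two different processes (for example a parent and its forked child). The helper name unix_socket_roundtrip is hypothetical.

```c
#include <string.h>     // strlen
#include <sys/socket.h> // socketpair, AF_UNIX, SOCK_STREAM
#include <sys/types.h>  // ssize_t
#include <unistd.h>     // read, write, close

// Hypothetical helper: writes msg to one end of a UNIX domain socket pair
// and reads it back from the other end.
// Returns the number of bytes read, or -1 on error.
// msg is assumed to be non-empty, otherwise read() would block forever.
ssize_t unix_socket_roundtrip(const char *msg, char *out, size_t out_sz) {
     int sv[2];
     ssize_t n = -1;

     if(out_sz == 0)
          return -1;

     // Create two connected AF_UNIX sockets (same host only)
     if(socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
          return -1;

     if(write(sv[0], msg, strlen(msg)) != -1) {
          // Leave one byte for the terminating NUL
          n = read(sv[1], out, out_sz - 1);
          if(n >= 0)
               out[n] = '\0';
     }

     close(sv[0]);
     close(sv[1]);
     return n;
}
```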

Client/Server communication

We already know that the client/server model lies at the core of the TCP/IP stack: one designated entity (the client) sends requests to the other (the server), which evaluates them and sends back a response. Now we shall see how to implement this mechanism using the notion of socket we have learned so far. Before looking at the actual code, let's summarize the steps needed to establish a connection between a client and a server:

[Figure: socket diagram]

Note that the concept of a "connection" makes sense only when we are dealing with connection-oriented sockets (i.e., TCP sockets); there is nothing like that with UDP/ICMP/ARP sockets.
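To see what the absence of a connection looks like in practice, here is a minimal sketch that exchanges a single UDP datagram over the loopback interface. Notice that connect, listen and accept never appear: the destination address travels with each sendto call. The helper name udp_loopback_roundtrip is hypothetical, and using port 0 to let the kernel pick a free port is just a convenience for this example.

```c
#include <arpa/inet.h>  // htons, htonl
#include <netinet/in.h> // struct sockaddr_in, INADDR_LOOPBACK
#include <string.h>     // memset, strlen
#include <sys/socket.h> // socket, bind, getsockname, sendto, recvfrom
#include <sys/types.h>  // ssize_t
#include <unistd.h>     // close

// Hypothetical helper: sends msg as one UDP datagram over loopback and
// receives it on a second socket.
// Returns the number of bytes received, or -1 on error.
ssize_t udp_loopback_roundtrip(const char *msg, char *out, size_t out_sz) {
     struct sockaddr_in addr;
     socklen_t addr_len = sizeof(addr);
     ssize_t n = -1;

     if(out_sz == 0)
          return -1;

     int rx = socket(AF_INET, SOCK_DGRAM, 0);
     int tx = socket(AF_INET, SOCK_DGRAM, 0);
     if(rx == -1 || tx == -1)
          goto done;

     // Bind the receiver to 127.0.0.1; port 0 lets the kernel pick a free port
     memset(&addr, 0, sizeof(addr));
     addr.sin_family = AF_INET;
     addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
     addr.sin_port = htons(0);
     if(bind(rx, (struct sockaddr*)&addr, sizeof(addr)) == -1)
          goto done;

     // Ask the kernel which port it actually assigned
     if(getsockname(rx, (struct sockaddr*)&addr, &addr_len) == -1)
          goto done;

     // Each datagram carries its destination explicitly: no handshake needed
     if(sendto(tx, msg, strlen(msg), 0, (struct sockaddr*)&addr, sizeof(addr)) == -1)
          goto done;

     // Leave one byte for the terminating NUL
     n = recvfrom(rx, out, out_sz - 1, 0, NULL, NULL);
     if(n >= 0)
          out[n] = '\0';

done:
     if(rx != -1) close(rx);
     if(tx != -1) close(tx);
     return n;
}
```

Keep in mind that UDP gives no delivery guarantee in general; the sketch relies on the loopback interface, where a datagram this small is normally delivered.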

Server code


/*
* Server code
* Compile it with: gcc -Wall -Wextra -Werror server.c -o server
*/
#include <stdio.h> // printf, puts, snprintf
#include <stdlib.h> // atoi
#include <string.h> // memset, strlen
#include <unistd.h> // read, write, close
#include <sys/socket.h>
#include <arpa/inet.h>

#define BUF_SIZE 1024

int main(int argc, char **argv) {
     int server_fd = 0, client_fd = 0, read_sz, count = 0;
     struct sockaddr_in server_sock, client_sock;
     char buf[BUF_SIZE];

     // Check for cli arguments
     if(argc != 2) {
          printf("Usage: %s <SERVER_PORT>\n", argv[0]);
          return 1;
     }

     // 1. Create socket
     server_fd = socket(AF_INET, SOCK_STREAM, 0);
     if(server_fd == -1) {
           puts("Unable to create socket");
           return 1;
     }

     // Setup socket address (zeroed first) and initialize buffer
     memset(&server_sock, 0, sizeof(server_sock));
     server_sock.sin_addr.s_addr = inet_addr("127.0.0.1");
     server_sock.sin_port = htons(atoi(argv[1]));
     server_sock.sin_family = AF_INET;
     memset(buf, 0, sizeof(buf));

     // 2. Bind socket to address and port
     int ret = bind(server_fd, (struct sockaddr*)&server_sock, sizeof(server_sock));
     if(ret < 0) {
           puts("Unable to bind TCP socket");
           return 1;
     }

     // 3. Listen for connections
     // 2nd parameter is the backlog, i.e., the maximum length to which the queue
     // of pending connections may grow
     if(listen(server_fd, 128) < 0) {
           puts("Unable to listen on TCP socket");
           return 1;
     }
     socklen_t len = sizeof(struct sockaddr_in);

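     for(;;) {
          // 4. Accept incoming connections
          // len is a value-result argument, so reset it before every call
          len = sizeof(struct sockaddr_in);
          client_fd = accept(server_fd,
                             (struct sockaddr*)&client_sock,
                             (socklen_t*)&len);
          if(client_fd < 0) {
               puts("Unable to accept connections");
               return 1;
          }

          // Print client IP address and port
          char *client_addr = inet_ntoa(client_sock.sin_addr);
          int client_port = ntohs(client_sock.sin_port);
          printf("New connection from %s:%d\n", client_addr, client_port);

          // 5. Read the client's message (blocks until data arrives or the
          // client closes the connection); its content is ignored here
          read_sz = read(client_fd, buf, sizeof(buf));
          if(read_sz == -1) {
               puts("Error reading from client");
               close(client_fd);
               return 1;
          }

          // 6. Send data back to the client, then release its descriptor
          snprintf(buf,
                   sizeof(buf),
                   "server> You've reached me %d time(s)\n",
                   ++count);
          if(write(client_fd, buf, strlen(buf)) < 0)
               puts("Unable to send data to client");
          close(client_fd);
     }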
     
     return 0;
}

The code should be self-explanatory: it simply implements the steps summarized above. To test it, we can use the old telnet utility:

$> ./server 3000 &
[1] 356203
$> telnet localhost 3000
Trying ::1...
Connection failed: Connection refused # Ignore this, we do not support IPv6
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Hi
server> You've reached me 1 time(s)
^]
telnet> q
Connection closed.
[1]  + 356203 done       ./server 3000

Client code

Let's now implement a simple client by following the steps listed in the previous section.

/*
* Client code
* Compile it with: gcc -Wall -Wextra -Werror client.c -o client
* by Marco Cetica <ceticamarco@gmail.com> 2021
*/

#include <stdio.h> // printf, puts, snprintf
#include <string.h> // memset, strlen
#include <stdlib.h> // atoi
#include <unistd.h> // close
#include <sys/socket.h>
#include <arpa/inet.h>

#define BUF_SIZE 1024

int main(int argc, char **argv) {
     int server_fd = 0, ret = 0;
     struct sockaddr_in server_sock;
     char msg[BUF_SIZE], server_msg[BUF_SIZE];

     if(argc != 3) {
          printf("Usage: %s <IP_ADDRESS> <PORT>\n", argv[0]);
          return 1;
     }

     // 1. Create socket
     server_fd = socket(AF_INET, SOCK_STREAM, 0);
     if(server_fd  == -1) {
          puts("Unable to create socket");
          return 1;
     }

     // Setup socket address (zeroed first) and buffers
     memset(&server_sock, 0, sizeof(server_sock));
     server_sock.sin_addr.s_addr = inet_addr(argv[1]);
     server_sock.sin_port = htons(atoi(argv[2]));
     server_sock.sin_family = AF_INET;
     memset(server_msg, 0, sizeof(server_msg));
     snprintf(msg, sizeof(msg), "Hello World");


     // 2. Connect to server
     ret = connect(server_fd, (struct sockaddr*)&server_sock, sizeof(server_sock));
     if(ret < 0) {
          puts("Unable to connect to remote host");
          return 1;
     }

     // 3. Send data to server
     ret = send(server_fd, msg, strlen(msg), 0);
     if(ret < 0) {
          puts("Unable to send data to remote server");
          return 1;
     }

     // Read server's response (leave room for the terminating NUL byte)
     ret = recv(server_fd, server_msg, sizeof(server_msg) - 1, 0);
     if(ret < 0) {
          puts("Unable to read data from remote server");
          return 1;
     }

     printf("%s", server_msg);


     // Finally, close socket
     close(server_fd);

     return 0;
}

If we run both the server and the client, we should see the same behavior as with telnet:

$> ./server 5000 &
[1] 362784
$> for i in {1..5}; do ./client 127.0.0.1 5000; done
New connection from 127.0.0.1:58706
server> You've reached me 1 time(s)
New connection from 127.0.0.1:58708
server> You've reached me 2 time(s)
New connection from 127.0.0.1:58710
server> You've reached me 3 time(s)
New connection from 127.0.0.1:58712
server> You've reached me 4 time(s)
New connection from 127.0.0.1:58714
server> You've reached me 5 time(s)
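One detail the client above glosses over is that send is allowed to transmit fewer bytes than requested. For a message as short as "Hello World" this is unlikely to matter, but for larger payloads a loop is usually needed. The following is a minimal sketch of such a loop; send_all is a hypothetical helper, not part of the original client, and it deliberately does not retry when a call is interrupted by a signal (EINTR).

```c
#include <stddef.h>     // size_t
#include <sys/socket.h> // send
#include <sys/types.h>  // ssize_t

// Hypothetical helper: keeps calling send() until all len bytes of data
// have been handed to the kernel.
// Returns 0 on success, -1 on the first error.
int send_all(int fd, const char *data, size_t len) {
     size_t sent = 0;

     while(sent < len) {
          // send() returns how many bytes it accepted, possibly fewer than asked
          ssize_t n = send(fd, data + sent, len - sent, 0);
          if(n == -1)
               return -1;
          sent += (size_t)n;
     }

     return 0;
}
```

In the client, the call send(server_fd, msg, strlen(msg), 0) could then be replaced by send_all(server_fd, msg, strlen(msg)).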

Conclusions

In this first part of the guide, we learned what TCP/IP sockets are, how they work and how we can use them to build network applications. Even if our client/server example is pretty basic (what would happen if two or more clients tried to connect to the server simultaneously?), it's a great starting point. In the next part of this guide we will extend our knowledge of sockets by introducing a new kind of socket, completely unrelated to both stream and datagram sockets: the raw socket.