Python Socket Module Part #5

Dolyetyus · 19 Dec 2020

Greetings, dear Turk Hack Team members. In this part we'll wrap up the tutorial series on the Python socket module.

Errors

The following is from Python's socket module documentation:

All errors raise exceptions. The normal exceptions for invalid argument types and out-of-memory conditions can be raised; starting from Python 3.3, errors related to socket or address semantics raise OSError or one of its subclasses.

Here are some common errors you'll probably encounter when working with sockets:

Exception | errno Constant | Description

BlockingIOError | EWOULDBLOCK | Resource temporarily unavailable. For example, in non-blocking mode, when calling send() and the peer is busy and not reading, the send queue (network buffer) is full. Or there are issues with the network. Hopefully this is a temporary condition.

OSError | EADDRINUSE | Address already in use. Make sure there's not another process running that's using the same port number and that your server is setting the socket option SO_REUSEADDR: socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1).

ConnectionResetError | ECONNRESET | Connection reset by peer. The remote process crashed or did not close its socket properly (unclean shutdown). Or there's a firewall or other device in the network path that's missing rules or misbehaving.

TimeoutError | ETIMEDOUT | Operation timed out. No response from peer.

ConnectionRefusedError | ECONNREFUSED | Connection refused. No application listening on specified port.
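As a rough illustration of how these exceptions surface in code, the sketch below provokes "Address already in use" on purpose by binding two sockets to the same loopback port. The function name second_bind_errno is made up for this example, and it assumes the loopback address 127.0.0.1 is available.

```python
import errno
import socket

def second_bind_errno(host="127.0.0.1"):
    """Bind two sockets to the same port and return the errno of the failure.

    Hypothetical helper for illustration only. Returns None if the second
    bind unexpectedly succeeds.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as first:
        first.bind((host, 0))            # port 0 lets the OS pick a free port
        first.listen()
        port = first.getsockname()[1]
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as second:
            try:
                second.bind((host, port))    # same address, no SO_REUSEADDR
            except OSError as exc:
                return exc.errno
    return None

code = second_bind_errno()
# errno.errorcode maps numeric codes to their symbolic names.
print(errno.errorcode.get(code, code))
```

The more specific exceptions in the table (BlockingIOError, ConnectionResetError, TimeoutError, ConnectionRefusedError) are subclasses of OSError, so a single `except OSError` clause catches all of them when you don't need to tell them apart.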

Socket Address Families

socket.AF_INET and socket.AF_INET6 represent the address and protocol families used for the first argument to socket.socket(). APIs that use an address expect it to be in a certain format, depending on whether the socket was created with socket.AF_INET or socket.AF_INET6.

Address Family | Protocol | Address Tuple | Description

socket.AF_INET | IPv4 | (host, port) | host is a string with a hostname like 'www.example.com' or an IPv4 address like '10.1.2.3'. port is an integer.

socket.AF_INET6 | IPv6 | (host, port, flowinfo, scopeid) | host is a string with a hostname like 'www.example.com' or an IPv6 address like 'fe80::6203:7ab:fe88:9c23'. port is an integer. flowinfo and scopeid represent the sin6_flowinfo and sin6_scope_id members in the C struct sockaddr_in6.

Note the excerpt below from Python's socket module documentation regarding the host value of the address tuple:

For IPv4 addresses, two special forms are accepted instead of a host address: the empty string represents INADDR_ANY, and the string '<broadcast>' represents INADDR_BROADCAST. This behavior is not compatible with IPv6, therefore, you may want to avoid these if you intend to support IPv6 with your Python programs.

See Python's Socket families documentation for more information.

I've used IPv4 sockets in this tutorial, but if your network supports it, try testing and using IPv6 if possible. One way to support this easily is by using the function socket.getaddrinfo(). It translates the host and port arguments into a sequence of 5-tuples that contains all of the necessary arguments for creating a socket connected to that service. socket.getaddrinfo() will understand and interpret passed-in IPv6 addresses and hostnames that resolve to IPv6 addresses, in addition to IPv4.

The following example returns address information for a TCP connection to example.org on port 80:

Code:

>>> import socket
>>> socket.getaddrinfo("example.org", 80, proto=socket.IPPROTO_TCP)
[(<AddressFamily.AF_INET6: 10>, <SocketType.SOCK_STREAM: 1>,
 6, '', ('2606:2800:220:1:248:1893:25c8:1946', 80, 0, 0)),
 (<AddressFamily.AF_INET: 2>, <SocketType.SOCK_STREAM: 1>,
 6, '', ('93.184.216.34', 80))]

Results may differ on your system if IPv6 isn't enabled. The values returned above can be used by passing them to socket.socket() and socket.connect(). There's a client and server example in the Example section of Python's socket module documentation.
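To make the "pass them to socket.socket() and connect()" step concrete, here is a minimal sketch that walks the getaddrinfo() results and returns the first socket that connects. The name connect_any and the 5-second default timeout are assumptions for this example. The standard library's socket.create_connection() already does something very similar, so treat this as an illustration rather than something you need to write yourself.

```python
import socket

def connect_any(host, port, timeout=5.0):
    """Try each address from getaddrinfo() until one connects.

    Hypothetical helper: returns a connected socket, or raises the last
    OSError seen if every candidate address fails.
    """
    last_error = None
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError as exc:          # e.g. address family not supported here
            last_error = exc
            continue
        try:
            sock.settimeout(timeout)
            sock.connect(sockaddr)      # sockaddr already has the right tuple shape
        except OSError as exc:
            last_error = exc
            sock.close()
            continue
        return sock
    raise last_error or OSError("getaddrinfo() returned no addresses")
```

Because the address family and the address tuple both come from the same getaddrinfo() entry, the same code path works for IPv4 and IPv6 without any special-casing.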

Using Hostnames

For context, this section applies mostly to using hostnames with bind() and connect(), or connect_ex(), when you intend to use the loopback interface, localhost. However, it applies any time you're using a hostname and there's an expectation of it resolving to a certain address and having a special meaning to your application that affects its behavior or assumptions. This is in contrast to the typical scenario of a client using a hostname to connect to a server that's resolved by DNS, like www.example.com.

The following is from Python's socket module documentation:

If you use a hostname in the host portion of IPv4/v6 socket address, the program may show a non-deterministic behavior, as Python uses the first address returned from the DNS resolution. The socket address will be resolved differently into an actual IPv4/v6 address, depending on the results from DNS resolution and/or the host configuration. For deterministic behavior use a numeric address in host portion.

The standard convention for the name localhost is for it to resolve to 127.0.0.1 or ::1, the loopback interface. This will more than likely be the case for you on your system, but maybe not. It depends on how your system is configured for name resolution. As with all things IT, there are always exceptions, and there are no guarantees that using the name localhost will connect to the loopback interface.

For example, on Linux, see man nsswitch.conf, the Name Service Switch configuration file. Another place to check on macOS and Linux is the file /etc/hosts. On Windows, see C:\Windows\System32\drivers\etc\hosts. The hosts file contains a static table of name to address mappings in a simple text format. DNS is another piece of the puzzle altogether.

Interestingly enough, as of this writing (June 2018), there's an RFC draft, "Let 'localhost' be localhost", that discusses the conventions, assumptions and security around using the name localhost.

What's important to understand is that when you use hostnames in your application, the returned address(es) could literally be anything. Don't make assumptions regarding a name if you have a security-sensitive application. Depending on your application and environment, this may or may not be a concern for you.
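If you want to see what a name actually resolves to on your own machine, a small sketch like the following can help. The helper name resolve_all is an assumption for this example. It simply collects the distinct addresses that socket.getaddrinfo() reports for a host.

```python
import socket

def resolve_all(host, port=0):
    """Return the distinct addresses a host name resolves to on this system.

    Hypothetical helper for inspection only. The result depends entirely on
    the local resolver configuration (hosts file, DNS, and so on).
    """
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry ends with the address tuple; its first item is the host address.
    return sorted({sockaddr[0] for *_rest, sockaddr in infos})
```

Calling resolve_all("localhost") will usually return 127.0.0.1, ::1, or both, but as discussed above, that depends on how name resolution is configured. A numeric address such as "127.0.0.1" always comes back unchanged.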

Note: Security precautions and best practices still apply, even if your application isn't security-sensitive. If your application accesses the network, it should be secured and maintained. This means, at a minimum:

- System software updates and security patches are applied regularly, including Python. Are you using any third party libraries? If so, make sure those are checked and updated too.

- If possible, use a dedicated or host-based firewall to restrict connections to trusted systems only.

- What DNS servers are configured? Do you trust them and their administrators?

- Make sure that request data is sanitized and validated as much as possible prior to calling other code that processes it. Use (fuzz) tests for this and run them regularly.

Regardless of whether or not you're using hostnames, if your application needs to support secure connections (encryption and authentication), you'll probably want to look into using TLS. This is its own separate topic and beyond the scope of this tutorial. See Python's ssl module documentation to get started. This is the same protocol that your web browser uses to connect securely to web sites.

With interfaces, IP addresses, and name resolution to consider, there are many variables. What should you do? Here are some recommendations that you can use if you don't have a network application review process:

Application | Usage | Recommendation

Server | loopback interface | Use an IP address, for example, 127.0.0.1 or ::1.
Server | ethernet interface | Use an IP address, for example, 10.1.2.3. To support more than one interface, use an empty string for all interfaces/addresses. See the security note above.
Client | loopback interface | Use an IP address, for example, 127.0.0.1 or ::1.
Client | ethernet interface | Use an IP address for consistency and non-reliance on name resolution. For the typical case, use a hostname. See the security note above.
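For the server rows, the recommendation might look like the sketch below: bind to a numeric address instead of a name. The function name make_listener is hypothetical, and it assumes an IPv4 TCP server. Passing an empty string as the host would bind to all IPv4 interfaces instead.

```python
import socket

def make_listener(host="127.0.0.1", port=0):
    """Create a TCP listening socket bound to a numeric IPv4 address.

    Hypothetical helper. port=0 asks the OS for any free port, which is
    handy for experiments; a real server would pass a fixed port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen()
    except OSError:
        sock.close()        # don't leak the socket if setup fails
        raise
    return sock

loopback_only = make_listener("127.0.0.1")   # reachable from this machine only
bound_host, bound_port = loopback_only.getsockname()
loopback_only.close()
```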

For clients or servers, if you need to authenticate the host you're connecting to, look into using TLS.

Blocking Calls

A socket function or method that temporarily suspends your application is a blocking call. For example, accept(), connect(), send(), and recv() block. They don't return immediately. Blocking calls have to wait on system calls (I/O) to complete before they can return a value. So you, the caller, are blocked until they're done or a timeout or other error occurs.

Blocking socket calls can be set to non-blocking mode so they return immediately. If you do this, you'll need to at least refactor or redesign your application to handle the socket operation when it's ready.

Since the call returns immediately, data may not be ready. The callee is waiting on the network and hasn't had time to complete its work. If this is the case, the call raises BlockingIOError with the errno value EWOULDBLOCK (available as errno.EWOULDBLOCK). Non-blocking mode is enabled with setblocking(False).

By default, sockets are always created in blocking mode. See Notes on socket timeouts for a description of the three modes.
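Here is a small sketch of what non-blocking mode looks like in practice. It uses socket.socketpair() to get two connected sockets inside one process, so no server is needed. The helper name recv_nonblocking is made up for this example, and the 5-second select() timeout is an arbitrary choice.

```python
import select
import socket

def recv_nonblocking(sock, bufsize=1024):
    """Return received bytes, or None if no data is ready yet."""
    try:
        return sock.recv(bufsize)
    except BlockingIOError:
        # Nothing to read right now; the caller should try again later.
        return None

left, right = socket.socketpair()
left.setblocking(False)                 # recv() on left now returns immediately

first_try = recv_nonblocking(left)      # nothing has been sent yet
right.sendall(b"ping")
select.select([left], [], [], 5.0)      # wait (up to 5 s) until left is readable
second_try = recv_nonblocking(left)     # the data is available now

left.close()
right.close()
```

The first call finds no data and yields None instead of blocking. After the peer sends and select() reports the socket readable, the second call returns the bytes.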

Closing Connections

An interesting thing to note with TCP is that it's completely legal for the client or server to close their side of the connection while the other side remains open. This is referred to as a half-closed connection (sometimes loosely called half-open). It's the application's decision whether or not this is desirable. In general, it's not. In this state, the side that's closed their end of the connection can no longer send data. They can only receive it.
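The state described above can be reproduced with shutdown(). The sketch below is only an approximation: it uses socket.socketpair(), which on most Unix systems gives a pair of Unix domain stream sockets rather than a real TCP connection, on the assumption that the shutdown behavior is the same for this purpose. The variable names are arbitrary.

```python
import socket

closer, peer = socket.socketpair()

closer.sendall(b"last words")
closer.shutdown(socket.SHUT_WR)     # closer is done sending, but can still receive

data = peer.recv(1024)              # the bytes sent before the shutdown
eof = peer.recv(1024)               # empty bytes: the other side will send no more

peer.sendall(b"reply")              # the open direction still works
reply = closer.recv(1024)

closer.close()
peer.close()
```

The empty bytes object returned by recv() is how the peer learns that the other side has finished sending.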

I'm not advocating that you take this approach, but as an example, HTTP uses a header named Connection that's used to standardize how applications should close or persist open connections. For details, see section 6.3 in RFC 7230, Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing.

When designing and writing your application and its application-layer protocol, it's a good idea to go ahead and work out how you expect connections to be closed. Sometimes this is obvious and simple, or it's something that can take some initial prototyping and testing. It depends on the application and how the message loop is processed with its expected data. Just make sure that sockets are always closed in a timely manner after they complete their work.

Byte Endianness

See Wikipedia's article on endianness for details on how different CPUs store byte orderings in memory. When interpreting individual bytes, this isn't a problem. However, when handling multiple bytes that are read and processed as a single value, for example a 4-byte integer, the byte order needs to be reversed if you're communicating with a machine that uses a different endianness.

Byte order is also important for text strings that are represented as multi-byte sequences, like Unicode. Unless you're always using true, strict ASCII and control the client and server implementations, you're probably better off using Unicode with an encoding like UTF-8 or one that supports a byte order mark (BOM).

It's important to explicitly define the encoding used in your application-layer protocol. You can do this by mandating that all text is UTF-8 or using a content-encoding header that specifies the encoding. This prevents your application from having to detect the encoding, which you should avoid if possible.

This becomes problematic when there is data involved that's stored in files or a database and there's no metadata available that specifies its encoding. When the data is transferred to another endpoint, it will have to try to detect the encoding. For a discussion, see Wikipedia's Unicode article that references RFC 3629: UTF-8, a transformation format of ISO 10646:

However RFC 3629, the UTF-8 standard, recommends that byte order marks be forbidden in protocols using UTF-8, but discusses the cases where this may not be possible. In addition, the large restriction on possible patterns in UTF-8 (for instance there cannot be any lone bytes with the high bit set) means that it should be possible to distinguish UTF-8 from other character encodings without relying on the BOM.

The takeaway from this is to always store the encoding used for data that's handled by your application if it can vary. In other words, try to somehow store the encoding as metadata if it's not always UTF-8 or some other encoding with a BOM. Then you can send that encoding in a header along with the data to tell the receiver what it is.
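One way this idea could look in code is sketched below. The wire format here (a single ASCII line naming the encoding, followed by the encoded text) and the names pack_text and unpack_text are invented for this example. They are not the header format used by the message class from the earlier parts.

```python
def pack_text(text, encoding="utf-8"):
    """Prefix encoded text with one ASCII header line naming its encoding.

    Hypothetical wire format: b"<encoding>\\n" followed by the encoded body.
    """
    header = encoding.encode("ascii") + b"\n"
    return header + text.encode(encoding)

def unpack_text(message):
    """Read the encoding from the header line and decode the body with it."""
    header, _sep, body = message.partition(b"\n")   # split at the first newline only
    return body.decode(header.decode("ascii"))

wire = pack_text("naïve café", "utf-16")
assert unpack_text(wire) == "naïve café"
```

Because the receiver reads the encoding from the header, it never has to guess, which is the point made above.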

The byte ordering used in TCP/IP is big-endian and is referred to as network order. Network order is used to represent integers in lower layers of the protocol stack, like IP addresses and port numbers. Python's socket module includes functions that convert integers to and from network and host byte order:

Function | Description

socket.ntohl(x) | Convert 32-bit positive integers from network to host byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 4-byte swap operation.
socket.ntohs(x) | Convert 16-bit positive integers from network to host byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 2-byte swap operation.
socket.htonl(x) | Convert 32-bit positive integers from host to network byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 4-byte swap operation.
socket.htons(x) | Convert 16-bit positive integers from host to network byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 2-byte swap operation.
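A quick way to see these functions in action is to convert a value and look at what comes back. The value 0x1234 is an arbitrary choice for this sketch.

```python
import socket
import sys

value = 0x1234

# Host -> network order. On a little-endian host the two bytes are swapped;
# on a big-endian host this is a no-op.
in_network_order = socket.htons(value)

# Converting back always recovers the original value.
round_trip = socket.ntohs(in_network_order)

print(sys.byteorder, hex(in_network_order), hex(round_trip))
```

On the common little-endian machines (x86, most ARM setups) the middle value prints as 0x3412. On a big-endian machine it stays 0x1234.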

You can also use the struct module to pack and unpack binary data using format strings:

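Code:

import struct

# '>H' means big-endian (network order), unsigned 16-bit integer.
network_byteorder_int = struct.pack('>H', 256)   # b'\x01\x00'
# unpack() always returns a tuple, so take the first element.
python_int = struct.unpack('>H', network_byteorder_int)[0]   # 256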
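The same struct approach is handy for framing messages. The sketch below prefixes a payload with its length as a 4-byte unsigned integer in network byte order. The names frame and unframe and the 4-byte length field are choices made for this example, not something prescribed by the socket module.

```python
import struct

LENGTH_FORMAT = "!I"                                # '!' = network order, 'I' = unsigned 32-bit
LENGTH_SIZE = struct.calcsize(LENGTH_FORMAT)        # 4 bytes

def frame(payload):
    """Prefix payload with its length in network byte order."""
    return struct.pack(LENGTH_FORMAT, len(payload)) + payload

def unframe(message):
    """Read the length prefix and return exactly that many payload bytes."""
    (length,) = struct.unpack(LENGTH_FORMAT, message[:LENGTH_SIZE])
    return message[LENGTH_SIZE:LENGTH_SIZE + length]

wire = frame(b"hello")
assert wire == b"\x00\x00\x00\x05hello"
assert unframe(wire) == b"hello"
```

Since both sides agree on big-endian for the length field, the receiver reads the same number regardless of its own CPU's byte order.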

Conclusion

We covered a lot of ground in this tutorial. Networking and sockets are large subjects. If you're new to networking or sockets, don't be discouraged by all of the terms and acronyms.

There are a lot of pieces to become familiar with in order to understand how everything works together. However, just like Python, it will start to make more sense as you get to know the individual pieces and spend more time with them.

We looked at the low-level socket API in Python's socket module and saw how it can be used to create client-server applications. We also created our own custom class and used it as an application-layer protocol to exchange messages and data between endpoints. You can use this class and build upon it to learn and help make creating your own socket applications easier and faster.

You can find the source code on GitHub.

Congratulations on making it to the end! You are now well on your way to using sockets in your own applications.

I hope this tutorial has given you the information, examples, and inspiration needed to start you on your sockets development journey.

I hope you enjoyed this series and learned some new things. I wish you a good day.

//Quoted
