Session objects in python APIs

By Brian Fitzgerald

Introduction

Database administrators understand “session” to mean an established database connection. It may come as a surprise that “session”, in several database-related libraries, refers to a data structure in the absence of any established connection.

Database session

A database session is an instance memory structure that holds an established connection’s state. The session contains facts such as a unique session id and the user name. The database session exists only after the transport (network connection) is established, the user has authenticated, and the memory structures have been allocated. The client application uses an API which holds a handle to the server side session. Application code uses the database API handle to issue commands and to receive responses and query results. The API is layered on top of a database driver, which associates the API handle with an operating system or network connection. In this way, client application to a server session via an API and a database driver. In databases, then, a client “handle” connects to a server “session”.

Other meanings of “session”

DBAs work with software APIs, such as python libraries, that do not establish a connection while allocating what they call a “session”. I will mention three APIs: requests, sqlalchemy, and boto3, in which “session” refers to a data structure that is not always associated with a connection.

Requests

Requests is a python HTTP library. HTTP is a connectionless protocol. Each request to the HTTP server can be implemented by a separate, new network connection that is disconnected after each call. In the following example, the call makes a TCP connection to the server, sends a request, receives a response, and closes the connection.

>>> import requests
>>> from json import loads
>>> r = requests.get('https://httpbin.org/get?q=Hello')
>>> type(r)
<class 'requests.models.Response'>

Let’s verify that no connection remains. Suspend your python repl by pressing ctrl-Z.

>>>
[1]+ Stopped python3
$

Is requests still connected?

$ lsof -n -p $(pgrep python3) | grep https

No output here means that requests is no longer connected.

Requests offers an advanced option that preserves state and a connection across calls.

>>> from requests import Session
>>> ses = Session()
>>> type(ses)
<class 'requests.sessions.Session'>
>>> ses.get('https://httpbin.org/get?q=Hello')
<Response [200]>

Is ses still connected? Let’s see.

$ lsof -n -p $(pgrep python3) | grep https
python3 3387 bf2190 3u IPv4 125425570 0t0 TCP 10.130.99.84:36878->52.86.136.68:https (ESTABLISHED)

Yes. The Session is connected to the https server. If you make another call within a short time, requests will make that call on the same socket. If the connection is idle for approximately 2.5 minutes, then requests will close the connection. If you make another call after that, requests will open a new socket.

To summarize, a requests Session is a client side handle that is not connected when it is created, but gets connected on first use. The session remains connected during use, and disconnects after an idle timeout.

SQL Alchemy

SQL Alchemy is a python SQL toolkit and object relational mapper. Two objects used for establishing a connection are the Engine and the Session. The Engine contains information about the driver and the connection string.

from sqlalchemy import create_engine

eng = create_engine(
    connection_string,
    max_identifier_length=128
)
print(type(eng))
<class 'sqlalchemy.engine.base.Engine'>

The Session is an object that will eventually hold an established connection.

from sqlalchemy.orm import sessionmaker
db_sess = sessionmaker(bind=eng)()
print(type(db_sess))
<class 'sqlalchemy.orm.session.Session'>

So far, the Session is not connected, but will be after a call to connection()

conn = db_sess.connection()
print(type(conn))
<class 'sqlalchemy.engine.base.Connection'>

conn holds an established connection. The other way to connect is to simply use the session.

for row in db_sess.execute(text('select dummy from dual')):
    print('%s' % row['dummy'])
X

Now db_sess holds an established connection. In conclusion, the SQLAlchemy Session is an object that is not initially connected. The Session connects by a call to connection() or on first use.

boto3

boto3 is the AWS python library. It is used to manage AWS services. One such service is Relational Database Service (RDS). In boto3, you make API calls to manage your environment. For example, you can list all RDS db instance details by calling describe_db_instances. To make a call, you must first allocate a boto3 client. The constructor for the RDS service is client(‘rds’). You can call the constructor and run describe directly like so:

>>> from boto3 import client
>>> len(client('rds').describe_db_instances()['DBInstances'])
30

The output from describe_db_instances is verbose, but the results are not needed for this example, so len is used here to summarize the result (30 db instances). Because describe was called directly from the constructor, the client object goes out of scope immediately, and the connection is dropped:

>>> 
[1]+ Stopped python3 
$ lsof -n -p $(pgrep python3) | grep https 
$

In the next example, the boto3 client is saved to variable rds

>>> from boto3 import client
>>> rds = client('rds')
>>> type(rds)
<class 'botocore.client.RDS'>
>>> len(rds.describe_db_instances()['DBInstances'])
30
>>>
[1]+ Stopped python3
$ lsof -n -p $(pgrep python3) | grep https
python3 7128 bf2190 3u IPv4 125436458 0t0 TCP 10.130.99.84:60276->52.46.159.85:https (ESTABLISHED)

In this case, the client remains connected after the results are returned. After approximately 2 minutes of idle time, boto3 closes the connection. If you make another call after that, boto3 opens a new connection. In the next case, the client is within the scope of a function. When the function returns, the client does out of scope, and the connection gets dropped.

>>> from boto3 import client
>>>
>>> def desc():
... rds = client('rds')
... rds.describe_db_instances()['DBInstances']
...
>>> desc()
>>>
[1]+ Stopped python3
$ lsof -n -p $(pgrep python3) | grep https
$
In the final example, client rds is set outside the scope. When the function returns, the client remains in scope and connected to the AWS server.
>>> from boto3 import client
>>>
>>> rds = client('rds')
>>> def desc():
... rds.describe_db_instances()['DBInstances']
...
>>> desc()
>>>
[1]+ Stopped python3
$ lsof -n -p $(pgrep python3) | grep https
python3 7631 bf2190 3u IPv4 125437676 0t0 TCP 10.130.99.84:42598->54.239.31.3:https (ESTABLISHED)

The boto3 session

The boto3 Session is not a connection at all. It is an object that holds access credentials. The Session creates objects that connect to AWS, particularly the Client. You can use the session to manage across multiple regions in an account, multiple accounts, multiple users, multiple roles, or across tokens acquired at different times. You can use the session to override the access keys in your profile, in your environment, or in your service account.
In this example, we loop over two regions and count the number of option groups. We allocate a list of sessions
>>> from boto3.session import Session
>>>
>>> AWS_ACCESS_KEY_ID="ASIAW3NNBKAMBK2CUGNE"
>>> AWS_SECRET_ACCESS_KEY="8V7Qaama1Yvl6ADRfrFI0vNTegx+V56S3sY5/uX0"
>>> AWS_SESSION_TOKEN="IQoJb3JpZ2luX2VjENz//////////wEaCXVzLWVhc3QtMSJIMEYCIQDOnKeV4UGWCVxuSvmD+O1FPUJHbxZOCPa/njhGlxBo1gIhAIRky8kx0jUNl40dg0NKGHjn1aOjYKe9Hdhs6msAlmq7KokDCFQQARoMNDcxMTk4NjgzMTYwIgzXqUo5F7yN89PdSxkq5gL1HhFXOOjVa8Irz8fVwUazE4VAKVtYthND8bXkqTLm1BE3x0hbD54rzpae80UxmcCnPv2Ec3WH8L6N2a7AvqSfhAJa34pzqJ+AcLy4blcdlbwE8I7d8ds/rvoNFiowEf5EhOaP41ZOdtngzdIUN1byE01Vsij8l9Ai74iE5DUnFqmIajtbr2T0Ah9a5ovQmBBB5nyjhY0DBcV9Bj0B6R8swoDnd6NBDcdcxQNvPqzJfwHsqrOYNJVzn1/FQLj86sgwTumHa1WyEugKWO3tlgEuOb+4Y3SqaMQ/vgC7OV+cdKfRzCrK/H/XYkWliVt7Ct1fQ8sGkEyxBlxyemgfvXybJnrXmEx5YisIO4LFMAkpzqsKYNTNybEwDKZmOAN3geP2c2QxoGNGHZ0NcT9nXDmtwHFuO0x2OlPf/qYmdZmlFKIRqBbGNCOPaJ5YgwjuGtx0TqULwKpNlYuF9WkZz5VGvFo6C1MSMLnpupEGOqUBmaqzHmvTykPbaH03SR5J1pKIvHI/XVMcvaThT2UEl7xQhwY0OE/AwAgP0+dtm0G6Mt7I6F2oZ5x6S2gnFHlxb5ctrgC8CRavVkks+ehHM8iAeBIm8CMk4Cb9jFBWj6jXldhHZDOCnOaqVAFSOlTaqXD3ZPlSRW1iyN1vOrmHIMJsGKeb6PiaGz0a9omCE/HCewQkcqYJXBhaTHSRUF1oJL/+GCvw"
>>>
>>>
>>> sess = []
>>> for region in 'us-east-1 us-west-1'.split():
... ses = Session(
... aws_access_key_id=AWS_ACCESS_KEY_ID,
... aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
... aws_session_token=AWS_SESSION_TOKEN,
... region_name=region
... )
... print(type(ses))
... sess.append(ses)
...
<class 'boto3.session.Session'>
<class 'boto3.session.Session'>

next we loop over the sessions, and run describe in a function.

>>> for ses in sess:
... res = ses.client('rds').describe_option_groups()['OptionGroupsList']
... print(len(res)).
File "<stdin>", line 3
print(len(res)).
^
SyntaxError: invalid syntax
>>> for ses in sess:
... res = ses.client('rds').describe_option_groups()['OptionGroupsList']
... print(len(res))
...
4
0
>>>
[1]+ Stopped python3
$ cat ../pgrep.bash
#!/bin/bash
lsof -n -p $(pgrep python3) | grep https
$ ../pgrep.bash
$

There are 4 option groups in us-east-1 and 0 in us-west-1. The describe ran directly off the client constructor. The client went out of scope once the describe finished. No connection remains connected. Instead, let’s allocate a list of clients.

>>> from boto3.session import Session
>>>
>>> AWS_ACCESS_KEY_ID="ASIAW3NNBKAMBK2CUGNE"
>>> AWS_SECRET_ACCESS_KEY="8V7Qaama1Yvl6ADRfrFI0vNTegx+V56S3sY5/uX0"
>>> AWS_SESSION_TOKEN="IQoJb3JpZ2luX2VjENz//////////wEaCXVzLWVhc3QtMSJIMEYCIQDOnKeV4UGWCVxuSvmD+O1FPUJHbxZOCPa/njhGlxBo1gIhAIRky8kx0jUNl40dg0NKGHjn1aOjYKe9Hdhs6msAlmq7KokDCFQQARoMNDcxMTk4NjgzMTYwIgzXqUo5F7yN89PdSxkq5gL1HhFXOOjVa8Irz8fVwUazE4VAKVtYthND8bXkqTLm1BE3x0hbD54rzpae80UxmcCnPv2Ec3WH8L6N2a7AvqSfhAJa34pzqJ+AcLy4blcdlbwE8I7d8ds/rvoNFiowEf5EhOaP41ZOdtngzdIUN1byE01Vsij8l9Ai74iE5DUnFqmIajtbr2T0Ah9a5ovQmBBB5nyjhY0DBcV9Bj0B6R8swoDnd6NBDcdcxQNvPqzJfwHsqrOYNJVzn1/FQLj86sgwTumHa1WyEugKWO3tlgEuOb+4Y3SqaMQ/vgC7OV+cdKfRzCrK/H/XYkWliVt7Ct1fQ8sGkEyxBlxyemgfvXybJnrXmEx5YisIO4LFMAkpzqsKYNTNybEwDKZmOAN3geP2c2QxoGNGHZ0NcT9nXDmtwHFuO0x2OlPf/qYmdZmlFKIRqBbGNCOPaJ5YgwjuGtx0TqULwKpNlYuF9WkZz5VGvFo6C1MSMLnpupEGOqUBmaqzHmvTykPbaH03SR5J1pKIvHI/XVMcvaThT2UEl7xQhwY0OE/AwAgP0+dtm0G6Mt7I6F2oZ5x6S2gnFHlxb5ctrgC8CRavVkks+ehHM8iAeBIm8CMk4Cb9jFBWj6jXldhHZDOCnOaqVAFSOlTaqXD3ZPlSRW1iyN1vOrmHIMJsGKeb6PiaGz0a9omCE/HCewQkcqYJXBhaTHSRUF1oJL/+GCvw"
>>>
>>> rdss = []
>>> for region in 'us-east-1 us-west-1'.split():
... rds = Session(
... aws_access_key_id=AWS_ACCESS_KEY_ID,
... aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
... aws_session_token=AWS_SESSION_TOKEN,
... region_name=region
... ).client('rds')
... rdss.append(rds)
...

Now loop over the clients.

>>> for rds in rdss:
... res = rds.describe_option_groups()['OptionGroupsList']
... print(len(res))
...
4
0
>>>
[1]+ Stopped python3
$ ../pgrep.bash
python3 6587 ec2-user 3u IPv4 26002 0t0 TCP 10.0.16.52:40794->52.119.197.147:https (ESTABLISHED)
python3 6587 ec2-user 4u IPv4 26004 0t0 TCP 10.0.16.52:39852->176.32.118.192:https (ESTABLISHED)
Because the list is in the outer scope, the clients remain in place after the describe finishes and the clients remain connected to the AWS server.
One session maps to one account and region, but that Session can create multiple Clients, where each Client can connect to AWS. In conclusion, the boto3 Session is an object containing tokens that are needed create session objects that can connect to AWS. The boto3 Session itself does not connect to the AWS server.

Conclusion

When I started applying python to my database administration job, I ran across various APIs that used the a “session” that was not a connection to the server. In the three APIs mentioned here, requests, SQL Alchemy, and boto3, the Session is not itself a connection, but it can connect and remain connected for a time. The SQL Alchemy session can be used create a connection to a database. The boto3 Session can create a client that connects to the AWS server.

Leave a Reply