Dealing with a field data type change in PynamoDB, an ORM for DynamoDB

PynamoDB is a great Python ORM for AWS DynamoDB. DynamoDB is a scalable, secure NoSQL database with an awesome API and SDKs. PynamoDB wraps the underlying DynamoDB APIs and gives you a Pythonic way to use DynamoDB.

In the real world, our DB fields are not constant. They can change as the project requirements are updated. The hard part is when you have to support the old and the new format at the same time. For example, a field's data type can change from String to Binary. If an API (e.g. a versioned REST API) works with DynamoDB, you have to accept both the old data type (String) and the new one (Binary). We are going to work out a solution for such a case.

Setting the goal

Let’s start by setting a goal. Assume we have a field with the DynamoDB data type List, which later changes to Map. So we will:

  • Accept a Map object and save it to DynamoDB
  • Accept a List object, convert it to a Map as the new requirement demands, and save it to DynamoDB as a Map
  • Always return a Map object, as the new requirement demands

Setting the project

Let’s add a couple of files to do our task.

# Add project folder
mkdir change-field-ddb
cd change-field-ddb
# Add Pipfile to manage requirements with pipenv
touch Pipfile
# For custom PynamoDB attributes
touch attributes.py
# Add models.py for the PynamoDB model
touch models.py
# To test everything
touch test.py

Add python requirements

Add the following requirements to the Pipfile

[dev-packages]
pipenv = "==2020.8.13"
moto = "==1.3.14"

[packages]
pynamodb = "==4.3.2"

[requires]
python_version = "3.8"

Now initiate pipenv with

pipenv install

Now the project will look like

tree
.
├── attributes.py
├── models.py
├── Pipfile
├── Pipfile.lock
└── test.py

The changed datatype

Let’s assume the previous data type (List) of the field result looks like

[3.75, 3.17, 3.90, 3.67, .......]

Here result stores the CGPA of a student from the 1st semester to the final semester, e.g. result[0] is the 1st semester, result[1] is the 2nd semester, and so on. In the updated version we make it more explicit with a Map

{
    'semester 1': 3.75,
    'semester 2': 3.17,
    'semester 3': 3.90,
    'semester 4': 3.67,
    ..................
}
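The conversion between the two shapes is purely positional: index 0 becomes 'semester 1', index 1 becomes 'semester 2', and so on. As a minimal sketch in plain Python, independent of PynamoDB (the function name list_to_semester_map is hypothetical and only used for illustration):

```python
def list_to_semester_map(values):
    """Map a positional list of results to 'semester N' keys (1-based)."""
    return {f'semester {idx}': val for idx, val in enumerate(values, start=1)}


old_result = [3.75, 3.17, 3.90, 3.67]
new_result = list_to_semester_map(old_result)
print(new_result)
```

The custom attribute later in this post applies the same idea, once on the Python value before saving and once on the raw DynamoDB value when reading.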

PynamoDB attributes

Add a custom PynamoDB attribute that accepts both the old (List) and the new (Map) format but always saves a Map. On read it also returns a Map, whether the stored data is an old List or a new Map. We are also going to use a custom UUIDAttribute as the hash key.

# attributes.py
import uuid

import pynamodb.attributes


# UUID attribute, used as the DynamoDB hash key
class UUIDAttribute(pynamodb.attributes.UnicodeAttribute):

    def serialize(self, value):
        return super().serialize(str(value))

    def deserialize(self, value):
        return uuid.UUID(super().deserialize(value))


# Add custom attribute to serialize and deserialize data
class ResultAttribute(pynamodb.attributes.MapAttribute):

    @classmethod
    def is_raw(cls):
        # Treat this subclass as a raw MapAttribute so it accepts arbitrary keys
        # https://pynamodb.readthedocs.io/en/latest/api.html#pynamodb.attributes.MapAttribute
        return True

    @staticmethod
    def _parse_value(values):
        return {
            f'semester {idx+1}': val for idx, val in enumerate(values)
        }

    def serialize(self, values):
        # Convert a Python list or tuple to the semester map before serializing
        if isinstance(values, (list, tuple)):
            values = self._parse_value(values)
        return super().serialize(values)

    def get_value(self, value):
        try:
            # Convert from
            # {'L': [{'N': '3.75'}, {'N': '3.17'}]}
            # to
            # {'M': {'semester 1': {'N': '3.75'}, 'semester 2': {'N': '3.17'}}}
            value = {'M': self._parse_value(value['L'])}
        except (KeyError, TypeError):
            pass
        return super().get_value(value)
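To see what get_value does to an old item without needing a table, the same rewrite can be tried on raw DynamoDB values in plain Python. This is only a sketch of the idea: the helper name list_item_to_map_item is hypothetical, and it uses an explicit shape check instead of the try/except in the attribute above.

```python
def list_item_to_map_item(raw):
    """Rewrite a raw DynamoDB list value as a raw map value.

    Anything that is not shaped like {'L': [...]} is returned unchanged.
    """
    if isinstance(raw, dict) and isinstance(raw.get('L'), list):
        return {
            'M': {f'semester {i}': v for i, v in enumerate(raw['L'], start=1)}
        }
    return raw


stored_as_list = {'L': [{'N': '3.75'}, {'N': '3.17'}]}
stored_as_map = {'M': {'semester 1': {'N': '3.75'}}}

print(list_item_to_map_item(stored_as_list))
print(list_item_to_map_item(stored_as_map))
```

Old items come out in the map shape, while items that are already maps pass through untouched, so both can be deserialized the same way afterwards.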

The PynamoDB Model

We are going to start with a simple PynamoDB model with only two fields

# models.py
import uuid

import pynamodb.models
import pynamodb.attributes

from attributes import UUIDAttribute, ResultAttribute


class ResultModel(pynamodb.models.Model):
    id = UUIDAttribute(hash_key=True, default=uuid.uuid4)
    result = ResultAttribute()

    class Meta:
        table_name = "test-ddb-table"

Time to Test

Add some tests in test.py

from decimal import Decimal

import boto3
import moto

from models import ResultModel

with moto.mock_dynamodb2():
    region = 'eu-west-1'
    ResultModel.Meta.region = region
    ResultModel.create_table(wait=True)

    # Data
    result_map = {
        'semester 1': 3.75,
        'semester 2': 3.17,
        'semester 3': 3.90,
        'semester 4': 3.67
    }
    result_list = [3.75, 3.17, 3.90, 3.67]

    # Insert as Map with PynamoDB
    result1 = ResultModel(result=result_map)
    result1.save()
    assert ResultModel.count() == 1
    result1_id = result1.id
    print(result1_id)

    # Retrieve the data that was inserted as a map
    result1_retr = ResultModel.get(result1_id)
    assert result1_retr.result.attribute_values == result_map

    # Insert as a list, so it is converted and saved as a map
    result2 = ResultModel(result=result_list)
    result2.save()
    assert ResultModel.count() == 2
    result2_id = result2.id
    print(result2_id)

    # Retrieve the data that was inserted as a list
    result2_retr = ResultModel.get(result2_id)
    assert result2_retr.result.attribute_values == result_map

    # Insert list value in result with boto3
    dynamodb = boto3.resource('dynamodb', region)
    table = dynamodb.Table(ResultModel.Meta.table_name)

    # boto3 does not support float but does support Decimal, so convert to Decimal
    item = [Decimal(str(v)) for v in result_list]
    # Update the existing item instead of creating one, because result is a MapAttribute by default
    table.update_item(
        Key={'id': str(result1_id)},
        AttributeUpdates={
            'result': {'Value': item, 'Action': 'PUT'}
        }
    )

    assert table.get_item(Key={'id': str(result1_id)})['Item']['result'] == item

    # Retrieve the data that is a list in dynamodb
    result1_retr = ResultModel.get(result1_id)
    assert result1_retr.result.attribute_values == result_map
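The test converts each float with Decimal(str(v)) rather than Decimal(v). A small standalone sketch of why going through str() matters, using only the standard library and the same sample values:

```python
from decimal import Decimal

result_list = [3.75, 3.17, 3.90, 3.67]

# Going through str() keeps the short decimal form of each value
via_str = [Decimal(str(v)) for v in result_list]

# Building a Decimal directly from a float keeps the float's exact
# binary approximation instead, which is not equal to Decimal('3.17')
direct = Decimal(3.17)

print(via_str)
print(direct == Decimal('3.17'))
```

With the str() route, the values written through boto3 compare equal to the numbers as they were typed in the list.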

Run tests with

pipenv shell
python3 test.py

Full code can be found here:

https://github.com/melon-ruet/change-field-ddb

Mahabubur Rahaman Melon