Allocation

In a service lifecycle, allocation is the lifecycle stage where identifiers are allocated for use by a specific service instance.

For example, a customer orders a virtual wire between two ports on two routers. The customer specifies the router, port and VLAN for both the A and Z side of the wire. In the network, this virtual wire is implemented as a VXLAN tunnel, tied to both endpoints. Each such tunnel requires a “VXLAN Network Identifier (VNI)” that uniquely identifies the tunnel. In the allocation phase, the orchestrator selects a VNI and ensures no other customer is assigned the same VNI.

Correct allocation is crucial for the correct functioning of automated services. However, when serving multiple customers at once or when mediating between multiple inventories, correct allocation can be challenging, due to concurrency and distribution effects.

LSM offers a framework to perform allocation correctly and efficiently. The remainder of this document will explain how.

Types of Allocation

We distinguish several types of allocation. The next sections will explain each type, from simplest to most advanced. After the basic explanation, a more in-depth explanation is given for the different types. When first learning about LSM allocation (or allocation in general), it is important to have a basic understanding of the different types, before diving into the details.

LSM internal allocation

The easiest form of allocation is when no external inventory is involved. A range of available identifiers is assigned to LSM to distribute as it sees fit. For example, VNI range 50000-70000 is reserved for this service and can be used by LSM freely. This requires no coordination with external systems and is supported out-of-the-box.

Using the range from the VNI example, internal allocation looks like this (the model below stores the allocated identifier in a vlan_id attribute):

main.cf
import lsm
import lsm::fsm

entity VlanAssignment extends lsm::ServiceEntity:
    string name

    int? vlan_id
    lsm::attribute_modifier vlan_id__modifier="r"
end

implement VlanAssignment using parents, do_deploy

binding = lsm::ServiceEntityBinding(
    service_entity="__config__::VlanAssignment",
    lifecycle=lsm::fsm::simple,
    service_entity_name="vlan-assignment",
    allocation_spec="allocate_vlan",
)

for assignment in lsm::all(binding):
    VlanAssignment(
        instance_id=assignment["id"],
        entity_binding=binding,
        **assignment["attributes"]
    )
end

The main changes in the model are:

  1. The attributes that have to be allocated are added to the service definition as r (read-only) attributes.

  2. The service binding refers to an allocation spec (defined in Python code).

plugins/__init__.py
"""
Inmanta LSM

:copyright: 2020 Inmanta
:contact: code@inmanta.com
:license: Inmanta EULA
"""

import inmanta_plugins.lsm.allocation as lsm

lsm.AllocationSpec(
    "allocate_vlan",
    lsm.LSM_Allocator(
        attribute="vlan_id", strategy=lsm.AnyUniqueInt(lower=50000, upper=70000)
    ),
)

The allocation spec specifies how to allocate the attribute:

  1. Use the pure LSM internal allocation mechanism for vlan_id

  2. To select a new value, use the AnyUniqueInt strategy, which selects a random number in the specified range

Internally, this works by storing allocations in read-only attributes on the instance. The lsm::all function ensures that if a value is already in the attribute, that value is used. Otherwise, the allocator obtains a new value that doesn’t collide with any value in any attribute set of any other service instance.

In practice, this means that a value is allocated as long as it’s in the active, candidate or rollback attribute sets of any non-terminated service instance. When a service instance is terminated, or clears one of its attribute sets, all identifiers are automatically deallocated.
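The collision rule described above can be illustrated with a small, self-contained sketch. It is not the LSM implementation: the instance dictionaries, the "state" key, the "terminated" state name, the function names and the inclusive bounds are all assumptions made for illustration.

```python
import random
from typing import Any, Optional

# Attribute sets that keep an identifier allocated, as described above.
ATTRIBUTE_SETS = ("active_attributes", "candidate_attributes", "rollback_attributes")


def values_in_use(instances: list[dict[str, Any]], attribute: str) -> set[int]:
    """Collect every value of `attribute` held by a non-terminated instance."""
    used: set[int] = set()
    for instance in instances:
        if instance.get("state") == "terminated":  # hypothetical state name
            continue  # terminated instances release all their identifiers
        for set_name in ATTRIBUTE_SETS:
            attributes = instance.get(set_name) or {}  # a set may be empty/None
            value = attributes.get(attribute)
            if value is not None:
                used.add(value)
    return used


def pick_unique_int(
    instances: list[dict[str, Any]],
    attribute: str,
    lower: int,
    upper: int,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick a random value in [lower, upper] (assumed inclusive) that is not in use."""
    rng = rng or random.Random()
    used = values_in_use(instances, attribute)
    free = [value for value in range(lower, upper + 1) if value not in used]
    if not free:
        raise RuntimeError(f"No free value left for {attribute}")
    return rng.choice(free)
```

A value held only by a terminated instance, or only in an attribute set that has since been cleared, is no longer returned by values_in_use and therefore becomes available again without any explicit de-allocation step.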

Important note when designing custom lifecycles: allocation only happens during validating, and the result of the allocation is always written to the candidate attributes.

External lookup

Often, values received via the NorthBound API are not directly usable. For example, a router can be identified in the API by its name, but what is required is its management IP. The management IP can be obtained based on the name, through lookup in an inventory.

While lookup is not strictly allocation, it is in many ways similar.

The basic mechanism for external lookup is similar to internal allocation: the resolved value is stored in a read-only attribute. This ensures that LSM remains stable, even if the inventory is down or corrupted. It also implies that if the inventory wants to change the value (e.g. the router management IP suddenly changes), it should notify LSM, because LSM will not pick up inventory changes by itself. This notification mechanism is not yet supported.

An example with router management IP looks like this:

main.cf
import lsm
import lsm::fsm

entity VirtualWire extends lsm::ServiceEntity:
    string router_a
    int port_a
    int vlan_a
    string router_z
    int port_z
    int vlan_z
    int? vni
    std::ipv4_address? router_a_mgmt_ip
    std::ipv4_address? router_z_mgmt_ip
    # Allocated or looked-up values are read-only for the API user
    lsm::attribute_modifier vni__modifier="r"
    lsm::attribute_modifier router_a_mgmt_ip__modifier="r"
    lsm::attribute_modifier router_z_mgmt_ip__modifier="r"
    # The router names can be updated after creation
    lsm::attribute_modifier router_a__modifier="rw+"
    lsm::attribute_modifier router_z__modifier="rw+"
end

implement VirtualWire using parents, do_deploy

for assignment in lsm::all(binding):
    VirtualWire(
        instance_id=assignment["id"],
        router_a=assignment["attributes"]["router_a"],
        port_a=assignment["attributes"]["port_a"],
        vlan_a=assignment["attributes"]["vlan_a"],
        router_z=assignment["attributes"]["router_z"],
        port_z=assignment["attributes"]["port_z"],
        vlan_z=assignment["attributes"]["vlan_z"],
        vni=assignment["attributes"]["vni"],
        router_a_mgmt_ip=assignment["attributes"]["router_a_mgmt_ip"],
        router_z_mgmt_ip=assignment["attributes"]["router_z_mgmt_ip"],
        entity_binding=binding,
    )
end

binding = lsm::ServiceEntityBinding(
    service_entity="__config__::VirtualWire",
    lifecycle=lsm::fsm::simple,
    service_entity_name="virtualwire",
    allocation_spec="allocate_for_virtualwire",
)

The allocation implementation could look like this:

plugins/__init__.py
"""
Inmanta LSM

:copyright: 2020 Inmanta
:contact: code@inmanta.com
:license: Inmanta EULA
"""

import os
from typing import Any, Optional

import psycopg2
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT

import inmanta_plugins.lsm.allocation as lsm
from inmanta_plugins.lsm.allocation import (
    AllocationContext,
    ExternalAttributeAllocator,
    T,
)


class PGRouterResolver(ExternalAttributeAllocator[T]):
    def __init__(self, attribute: str, id_attribute: str) -> None:
        super().__init__(attribute, id_attribute)
        self.conn = None
        self.database = None

    def pre_allocate(self):
        """Connect to postgresql"""
        host = os.environ.get("db_host", "localhost")
        port = os.environ.get("db_port")
        user = os.environ.get("db_user")
        self.database = os.environ.get("db_name", "allocation_db")
        self.conn = psycopg2.connect(
            host=host, port=port, user=user, dbname=self.database
        )
        # Lookups only read from the inventory, so autocommit is sufficient
        self.conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)

    def post_allocate(self) -> None:
        """Close connection"""
        self.conn.close()

    def needs_allocation(
        self, ctx: AllocationContext, instance: dict[str, Any]
    ) -> bool:
        # Resolve again when the router name was updated, not only on first use
        attribute_not_yet_allocated = super().needs_allocation(ctx, instance)
        id_attribute_changed = self._id_attribute_changed(instance)
        return attribute_not_yet_allocated or id_attribute_changed

    def _id_attribute_changed(self, instance: dict[str, Any]) -> bool:
        if instance["candidate_attributes"] and instance["active_attributes"]:
            return instance["candidate_attributes"].get(self.id_attribute) != instance[
                "active_attributes"
            ].get(self.id_attribute)
        return False

    def _get_value_from_result(self, result: Optional[tuple[T]]) -> Optional[T]:
        if result and result[0]:
            return result[0]
        return None

    def allocate_for_attribute(self, id_attribute_value: Any) -> T:
        with self.conn.cursor() as cursor:
            cursor.execute(
                "SELECT mgmt_ip FROM routers WHERE name=%s", (id_attribute_value,)
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return allocated_value
            # Format the message explicitly: Exception does not apply %-formatting
            raise Exception(f"No ip address found for {id_attribute_value}")


lsm.AllocationSpec(
    "allocate_for_virtualwire",
    PGRouterResolver(id_attribute="router_a", attribute="router_a_mgmt_ip"),
    PGRouterResolver(id_attribute="router_z", attribute="router_z_mgmt_ip"),
    lsm.LSM_Allocator(
        attribute="vni", strategy=lsm.AnyUniqueInt(lower=50000, upper=70000)
    ),
)
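The lookup behaviour can be exercised without LSM or PostgreSQL. The sketch below is a simplified stand-in, not the library code: DictInventory, needs_lookup and resolve are hypothetical names, and the assumption that a value counts as resolved once it is present in the candidate attributes is ours.

```python
from typing import Any, Optional


class DictInventory:
    """Hypothetical in-memory inventory that counts how often it is queried."""

    def __init__(self, mgmt_ips: dict[str, str]) -> None:
        self.mgmt_ips = mgmt_ips
        self.lookups = 0

    def lookup(self, router_name: str) -> Optional[str]:
        self.lookups += 1
        return self.mgmt_ips.get(router_name)


def needs_lookup(instance: dict[str, Any], attribute: str, id_attribute: str) -> bool:
    """Decide whether the inventory has to be consulted for this instance."""
    candidate = instance.get("candidate_attributes") or {}
    active = instance.get("active_attributes") or {}
    if candidate.get(attribute) is None:
        return True  # nothing has been resolved yet
    if candidate and active:
        # Resolve again only when the identifying attribute was updated.
        return candidate.get(id_attribute) != active.get(id_attribute)
    return False


def resolve(
    instance: dict[str, Any],
    attribute: str,
    id_attribute: str,
    inventory: DictInventory,
) -> str:
    """Return the stored value when possible, otherwise query the inventory."""
    candidate = instance["candidate_attributes"]
    if not needs_lookup(instance, attribute, id_attribute):
        return candidate[attribute]  # the inventory is not contacted at all
    value = inventory.lookup(candidate[id_attribute])
    if value is None:
        raise LookupError(f"No ip address found for {candidate[id_attribute]}")
    return value
```

Once a management IP is stored, later changes in the inventory are not observed, which is the stability property described earlier. Only a change of the router name triggers a new lookup.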

External inventory owns allocation

When allocation is owned externally, synchronization between LSM and the external inventory is crucial. If either LSM or the inventory fails, this should not lead to inconsistencies. In other words, LSM doesn’t only have to maintain consistency between different service instances, but also between itself and the inventory.

The basic mechanism for external allocation is similar to external lookup. One important difference is that we also write our allocation to the inventory.

For example, consider an external PostgreSQL database that contains the allocation table. The model looks exactly the same as in the case of internal allocation. The code looks as follows:

plugins/__init__.py
"""
Inmanta LSM

:copyright: 2020 Inmanta
:contact: code@inmanta.com
:license: Inmanta EULA
"""

import os
from typing import Optional
from uuid import UUID

import psycopg2
from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE

import inmanta_plugins.lsm.allocation as lsm
from inmanta_plugins.lsm.allocation import ExternalServiceIdAllocator, T


class PGServiceIdAllocator(ExternalServiceIdAllocator[T]):
    def __init__(self, attribute: str) -> None:
        super().__init__(attribute)
        self.conn = None
        self.database = None

    def pre_allocate(self):
        """Connect to postgresql"""
        host = os.environ.get("db_host", "localhost")
        port = os.environ.get("db_port")
        user = os.environ.get("db_user")
        self.database = os.environ.get("db_name", "allocation_db")
        self.conn = psycopg2.connect(
            host=host, port=port, user=user, dbname=self.database
        )
        self.conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)

    def post_allocate(self) -> None:
        """Close connection"""
        self.conn.close()

    def _get_value_from_result(self, result: Optional[tuple[T]]) -> Optional[T]:
        if result and result[0]:
            return result[0]
        return None

    def allocate_for_id(self, serviceid: UUID) -> T:
        """Allocate in transaction"""
        with self.conn.cursor() as cursor:
            cursor.execute(
                "SELECT allocated_value FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, serviceid),
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return allocated_value
            cursor.execute(
                "SELECT max(allocated_value) FROM allocation where attribute=%s",
                (self.attribute,),
            )
            result = cursor.fetchone()
            current_max_value = self._get_value_from_result(result)
            allocated_value = current_max_value + 1 if current_max_value else 1
            cursor.execute(
                "INSERT INTO allocation (attribute, owner, allocated_value) VALUES (%s, %s, %s)",
                (self.attribute, serviceid, allocated_value),
            )
            self.conn.commit()
            return allocated_value


lsm.AllocationSpec(
    "allocate_vlan",
    PGServiceIdAllocator(
        attribute="vlan_id",
    ),
)

What is important to notice is that the code first checks whether an allocation has already happened. This matters in case there was a failure before LSM could commit the allocation. In general, LSM must be able to identify what has been allocated to it, in order to recover aborted operations. This is done either by attaching an identifier when performing the allocation, or by knowing up front where the value will be stored in the inventory (e.g. if the inventory contains a service model as well, LSM can find the VNI for a service by requesting the VNI for that service directly).
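The check-before-allocate pattern can be tried out without a PostgreSQL server. The sketch below swaps in Python's built-in sqlite3 module for psycopg2. The table and column names mirror the example, while the UNIQUE constraints and the function names open_inventory and allocate_for_owner are assumptions made for illustration.

```python
import sqlite3


def open_inventory() -> sqlite3.Connection:
    """Create an in-memory stand-in for the external allocation table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE allocation ("
        " attribute TEXT NOT NULL,"
        " owner TEXT NOT NULL,"
        " allocated_value INTEGER NOT NULL,"
        " UNIQUE (attribute, owner),"  # one value per owner
        " UNIQUE (attribute, allocated_value)"  # one owner per value
        ")"
    )
    return conn


def allocate_for_owner(conn: sqlite3.Connection, attribute: str, owner: str) -> int:
    """Return the value already recorded for `owner`, or record a new one."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT allocated_value FROM allocation WHERE attribute=? AND owner=?",
            (attribute, owner),
        ).fetchone()
        if row is not None:
            return row[0]  # recover an allocation made by an earlier attempt
        row = conn.execute(
            "SELECT max(allocated_value) FROM allocation WHERE attribute=?",
            (attribute,),
        ).fetchone()
        value = (row[0] or 0) + 1  # max() yields NULL on an empty table
        conn.execute(
            "INSERT INTO allocation (attribute, owner, allocated_value)"
            " VALUES (?, ?, ?)",
            (attribute, owner, value),
        )
        return value
```

Calling allocate_for_owner twice with the same owner returns the same value, so an operation that was aborted after the inventory committed can simply be repeated.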

In the PGServiceIdAllocator example, the identifier is the same as the service instance id that LSM uses internally to identify an instance. An attribute of the instance can also be used to identify it in the external inventory, such as the name attribute in the example below.

plugins/__init__.py
"""
Inmanta LSM

:copyright: 2020 Inmanta
:contact: code@inmanta.com
:license: Inmanta EULA
"""

import os
from typing import Any, Optional

import psycopg2
from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE

import inmanta_plugins.lsm.allocation as lsm
from inmanta_plugins.lsm.allocation import ExternalAttributeAllocator, T


class PGAttributeAllocator(ExternalAttributeAllocator[T]):
    def __init__(self, attribute: str, id_attribute: str) -> None:
        super().__init__(attribute, id_attribute)
        self.conn = None
        self.database = None

    def pre_allocate(self):
        """Connect to postgresql"""
        host = os.environ.get("db_host", "localhost")
        port = os.environ.get("db_port")
        user = os.environ.get("db_user")
        self.database = os.environ.get("db_name", "allocation_db")
        self.conn = psycopg2.connect(
            host=host, port=port, user=user, dbname=self.database
        )
        self.conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)

    def post_allocate(self) -> None:
        """Close connection"""
        self.conn.close()

    def _get_value_from_result(self, result: Optional[tuple[T]]) -> Optional[T]:
        if result and result[0]:
            return result[0]
        return None

    def allocate_for_attribute(self, id_attribute_value: Any) -> T:
        """Allocate in transaction"""
        with self.conn.cursor() as cursor:
            cursor.execute(
                "SELECT allocated_value FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, id_attribute_value),
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return allocated_value
            cursor.execute(
                "SELECT max(allocated_value) FROM allocation where attribute=%s",
                (self.attribute,),
            )
            result = cursor.fetchone()
            current_max_value = self._get_value_from_result(result)
            allocated_value = current_max_value + 1 if current_max_value else 1
            cursor.execute(
                "INSERT INTO allocation (attribute, owner, allocated_value) VALUES (%s, %s, %s)",
                (self.attribute, id_attribute_value, allocated_value),
            )
            self.conn.commit()
            return allocated_value


lsm.AllocationSpec(
    "allocate_vlan",
    PGAttributeAllocator(attribute="vlan_id", id_attribute="name"),
)

In addition, the inventory must offer a procedure to safely obtain ownership of an identifier. There must be some way for LSM to determine definitively whether it has correctly obtained an identifier. In the example, the database transaction ensures this. Many other mechanisms exist, but the inventory has to support at least one. Examples of possible transaction coordination mechanisms are:

  1. An API endpoint that performs allocation atomically and consistently.

  2. A database transaction.

  3. A compare-and-set style API (when updating a value, the old value is also passed along, ensuring no concurrent updates are possible).

  4. An API with a version argument (like the LSM API itself: when updating a value, the version prior to the update has to be passed along, preventing concurrent updates).

  5. Locks and/or leases (a value or part of the inventory can be locked, or leased, i.e. locked for a limited time, prior to allocation; the lock ensures no concurrent modifications).

This scenario performs no de-allocation.
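As an illustration of the version-argument mechanism from the list above, the following sketch models an inventory that rejects any write made against a stale version. VersionedInventory, StaleVersion and allocate are hypothetical names. A real inventory would perform the version check atomically on its own side, which this single-process sketch only imitates.

```python
from dataclasses import dataclass, field


class StaleVersion(Exception):
    """Raised when a write is based on an outdated view of the inventory."""


@dataclass
class VersionedInventory:
    owners: dict[int, str] = field(default_factory=dict)  # identifier -> owner
    version: int = 0

    def read(self) -> tuple[int, dict[int, str]]:
        """Return the current version together with a copy of the allocations."""
        return self.version, dict(self.owners)

    def claim(self, identifier: int, owner: str, expected_version: int) -> int:
        """Record an allocation, but only if nothing changed since the read."""
        if expected_version != self.version:
            raise StaleVersion(f"expected {expected_version}, found {self.version}")
        if self.owners.get(identifier, owner) != owner:
            raise ValueError(f"{identifier} is already owned by someone else")
        self.owners[identifier] = owner
        self.version += 1
        return self.version


def allocate(
    inventory: VersionedInventory,
    owner: str,
    lower: int,
    upper: int,
    max_attempts: int = 5,
) -> int:
    """Allocate an identifier for `owner`, retrying when a concurrent update wins."""
    for _ in range(max_attempts):
        version, owners = inventory.read()
        for identifier, current_owner in owners.items():
            if current_owner == owner:
                return identifier  # already allocated: recover it
        free = [value for value in range(lower, upper + 1) if value not in owners]
        if not free:
            raise RuntimeError("identifier range exhausted")
        try:
            inventory.claim(free[0], owner, expected_version=version)
            return free[0]
        except StaleVersion:
            continue  # someone else wrote first: read again and retry
    raise RuntimeError("gave up after repeated concurrent updates")
```

Because every successful write bumps the version, two clients that read the same state cannot both succeed: the second claim fails with StaleVersion and has to start over from a fresh read.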

External inventory with deallocation

To ensure de-allocation on an external inventory is properly executed, it is not executed during compilation, but by a handler. This ensures that de-allocation is retried until it completes successfully.

The example below shows how allocation and de-allocation of a VLAN ID can be done using an external inventory. The handler of the PGAllocation entity performs the de-allocation. An instance of this entity is only constructed when the service instance is in the deallocating state.

vlan_assignment/model/_init.cf
import lsm
import lsm::fsm

entity VlanAssignment extends lsm::ServiceEntity:
    string name

    int? vlan_id
    lsm::attribute_modifier vlan_id__modifier="r"
end

implement VlanAssignment using parents, do_deploy
implement VlanAssignment using de_allocation when lsm::has_current_state(self, "deallocating")

entity PGAllocation extends std::PurgeableResource:
    """
        This entity ensures that an identifier allocated in PostgreSQL
        gets de-allocated when the service instance is removed.
    """
    string attribute
    std::uuid service_id
    string agent
end

implement PGAllocation using std::none

implementation de_allocation for VlanAssignment:
    """
        De-allocate the vlan_id identifier.
    """
    self.resources += PGAllocation(
        attribute="vlan_id",
        service_id=instance_id,
        purged=true,
        send_event=true,
        agent="internal",
        requires=self.requires,
        provides=self.provides,
    )
end

binding = lsm::ServiceEntityBinding(
    service_entity="vlan_assignment::VlanAssignment",
    lifecycle=lsm::fsm::simple_with_deallocation,
    service_entity_name="vlan-assignment",
    allocation_spec="allocate_vlan",
)

for assignment in lsm::all(binding):
    VlanAssignment(
        instance_id=assignment["id"],
        entity_binding=binding,
        **assignment["attributes"],
    )
end

The handler associated with the PGAllocation entity is shown in the code snippet below. Note that the handler doesn’t implement the create_resource() and update_resource() methods, since they can never be called. The only possible operation is a delete operation.

vlan_assignment/plugins/__init__.py
"""
Inmanta LSM

:copyright: 2020 Inmanta
:contact: code@inmanta.com
:license: Inmanta EULA
"""

import os
from typing import Optional
from uuid import UUID

import psycopg2
from inmanta.agent import handler
from inmanta.agent.handler import CRUDHandlerGeneric as CRUDHandler
from inmanta.agent.handler import ResourcePurged, provider
from inmanta.resources import PurgeableResource, resource
from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE

from inmanta_plugins.lsm.allocation import AllocationSpec, ExternalServiceIdAllocator


class PGServiceIdAllocator(ExternalServiceIdAllocator[int]):
    def __init__(self, attribute: str) -> None:
        super().__init__(attribute)
        self.conn = None
        self.database = None

    def pre_allocate(self) -> None:
        """Connect to postgresql"""
        host = os.environ.get("db_host", "localhost")
        port = os.environ.get("db_port")
        user = os.environ.get("db_user")
        self.database = os.environ.get("db_name", "allocation_db")
        self.conn = psycopg2.connect(
            host=host, port=port, user=user, dbname=self.database
        )
        self.conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)

    def post_allocate(self) -> None:
        """Close connection"""
        self.conn.close()

    def _get_value_from_result(self, result: Optional[tuple[int]]) -> Optional[int]:
        if result and result[0]:
            return result[0]
        return None

    def allocate_for_id(self, serviceid: UUID) -> int:
        """Allocate in transaction"""
        with self.conn.cursor() as cursor:
            cursor.execute(
                "SELECT allocated_value FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, serviceid),
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return allocated_value
            cursor.execute(
                "SELECT max(allocated_value) FROM allocation where attribute=%s",
                (self.attribute,),
            )
            result = cursor.fetchone()
            current_max_value = self._get_value_from_result(result)
            allocated_value = current_max_value + 1 if current_max_value else 1
            cursor.execute(
                "INSERT INTO allocation (attribute, owner, allocated_value) VALUES (%s, %s, %s)",
                (self.attribute, serviceid, allocated_value),
            )
            self.conn.commit()
            return allocated_value

    def has_allocation_in_inventory(self, serviceid: UUID) -> bool:
        """
        Check whether a VLAN ID is allocated by the service instance with the given id.
        """
        with self.conn.cursor() as cursor:
            cursor.execute(
                "SELECT allocated_value FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, serviceid),
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return True
            return False

    def de_allocate(self, serviceid: UUID) -> None:
        """
        De-allocate the VLAN ID allocated by the service instance with the given id.
        """
        with self.conn.cursor() as cursor:
            cursor.execute(
                "DELETE FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, serviceid),
            )
            self.conn.commit()


@resource("vlan_assignment::PGAllocation", agent="agent", id_attribute="service_id")
class PGAllocationResource(PurgeableResource):
    fields = ("attribute", "service_id")


@provider("vlan_assignment::PGAllocation", name="pgallocation")
class PGAllocation(CRUDHandler[PGAllocationResource]):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._allocator = PGServiceIdAllocator(attribute="vlan_id")

    def pre(self, ctx: handler.HandlerContext, resource: PGAllocationResource) -> None:
        self._allocator.pre_allocate()

    def post(self, ctx: handler.HandlerContext, resource: PGAllocationResource) -> None:
        self._allocator.post_allocate()

    def read_resource(
        self, ctx: handler.HandlerContext, resource: PGAllocationResource
    ) -> None:
        if not self._allocator.has_allocation_in_inventory(resource.service_id):
            raise ResourcePurged()

    def delete_resource(
        self, ctx: handler.HandlerContext, resource: PGAllocationResource
    ) -> None:
        self._allocator.de_allocate(resource.service_id)


AllocationSpec("allocate_vlan", PGServiceIdAllocator(attribute="vlan_id"))
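The read-then-delete flow of the handler can be tried out without an agent or a PostgreSQL server. The sketch below is a stand-in under stated assumptions: sqlite3 replaces psycopg2, the ResourcePurged class is a local substitute for the one in inmanta.agent.handler, and the purge function is a hypothetical approximation of what the agent does when it drives read_resource() and delete_resource().

```python
import sqlite3


class ResourcePurged(Exception):
    """Local stand-in for inmanta.agent.handler.ResourcePurged."""


def open_inventory() -> sqlite3.Connection:
    """Create an in-memory stand-in for the external allocation table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE allocation ("
        " attribute TEXT NOT NULL,"
        " owner TEXT NOT NULL,"
        " allocated_value INTEGER NOT NULL,"
        " UNIQUE (attribute, owner)"
        ")"
    )
    return conn


def read_allocation(conn: sqlite3.Connection, attribute: str, owner: str) -> int:
    """Return the recorded value, or signal that there is nothing to purge."""
    row = conn.execute(
        "SELECT allocated_value FROM allocation WHERE attribute=? AND owner=?",
        (attribute, owner),
    ).fetchone()
    if row is None:
        raise ResourcePurged()
    return row[0]


def de_allocate(conn: sqlite3.Connection, attribute: str, owner: str) -> None:
    """Delete the allocation; deleting a missing row is harmless."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "DELETE FROM allocation WHERE attribute=? AND owner=?",
            (attribute, owner),
        )


def purge(conn: sqlite3.Connection, attribute: str, owner: str) -> bool:
    """Return True if an allocation was removed, False if none was left."""
    try:
        read_allocation(conn, attribute, owner)
    except ResourcePurged:
        return False  # already gone: the desired state is reached
    de_allocate(conn, attribute, owner)
    return True
```

Because a second purge finds nothing and reports success without touching the inventory, the operation can be retried safely after a failure, which is what makes a handler a good place for de-allocation.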