Allocation

In a service lifecycle, allocation is the lifecycle stage where identifiers are allocated for use by a specific service instance.

For example, a customer orders a virtual wire between two ports on two routers. The customer specifies the router, port and VLAN for both the A and Z side of the wire. In the network, this virtual wire is implemented as a VXLAN tunnel, tied to both endpoints. Each such tunnel requires a “VXLAN Network Identifier (VNI)” that uniquely identifies the tunnel. In the allocation phase, the orchestrator selects a VNI and ensures no other customer is assigned the same VNI.

Correct allocation is crucial for the correct functioning of automated services. However, when serving multiple customers at once or when mediating between multiple inventories, correct allocation can be challenging, due to concurrency and distribution effects.

LSM offers a framework to perform allocation correctly and efficiently. The remainder of this document will explain how.

Types of Allocation

We distinguish several types of allocation. The next sections explain each type, from simplest to most advanced. After the basic explanation, a more in-depth explanation is given for the different types. When first learning about LSM allocation (or allocation in general), it is important to have a basic understanding of the different types before diving into the details.

LSM internal allocation

The easiest form of allocation is when no external inventory is involved. A range of available identifiers is assigned to LSM to distribute as it sees fit. For example, VNI range 50000-70000 is reserved for this service and can be used by LSM freely. This requires no coordination with external systems and is supported out-of-the-box.

With such a range, allocation would look like this (the example service below stores the allocated value in a vlan_id attribute):

main.cf
import lsm
import lsm::fsm

entity VlanAssignment extends lsm::ServiceEntity:
    string name

    int? vlan_id
    lsm::attribute_modifier vlan_id__modifier="r"
end

implement VlanAssignment using parents, do_deploy

binding = lsm::ServiceEntityBinding(
    service_entity="__config__::VlanAssignment",
    lifecycle=lsm::fsm::simple,
    service_entity_name="vlan-assignment",
    allocation_spec="allocate_vlan",
)

for assignment in lsm::all(binding):
    VlanAssignment(
        instance_id=assignment["id"],
        entity_binding=binding,
        **assignment["attributes"]
    )
end

The main changes in the model are:

  1. The attributes that have to be allocated are added to the service definition as r (read-only) attributes.

  2. The service binding refers to an allocation spec (defined in Python code).

plugins/__init__.py
"""
    Inmanta LSM

    :copyright: 2020 Inmanta
    :contact: code@inmanta.com
    :license: Inmanta EULA
"""
import inmanta_plugins.lsm.allocation as lsm

lsm.AllocationSpec(
    "allocate_vlan",
    lsm.LSM_Allocator(
        attribute="vlan_id", strategy=lsm.AnyUniqueInt(lower=50000, upper=70000)
    ),
)

The allocation spec specifies how to allocate the attribute:

  1. Use the pure LSM internal allocation mechanism for vlan_id

  2. To select a new value, use the AnyUniqueInt strategy, which selects a random number in the specified range

Internally, this works by storing allocations in read-only attributes on the instance. The lsm::all function ensures that if a value is already in the attribute, that value is used. Otherwise, the allocator gets an appropriate new value that doesn’t collide with any value in any attribute set of any other service instance.

In practice, this means that a value is allocated as long as it’s in the active, candidate or rollback attribute sets of any non-terminated service instance. When a service instance is terminated, or clears one of its attribute sets, all identifiers are automatically deallocated.
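The rule above can be made concrete with a small, self-contained sketch. It is not the LSM implementation: the instance dictionaries, the terminated flag and the helper names values_in_use and allocate are assumptions made for illustration, and the range is treated as inclusive. It only shows the idea that a value counts as taken while it appears in any attribute set of a non-terminated instance, and that a value that is already stored is reused.

```python
import random
from typing import Any, Optional

# Hypothetical, simplified view of a service instance: a "terminated" flag plus
# up to three attribute sets. This is not the LSM data model or API.
ATTRIBUTE_SETS = ("active_attributes", "candidate_attributes", "rollback_attributes")


def values_in_use(instances: list[dict[str, Any]], attribute: str) -> set[int]:
    """Collect every value held in any attribute set of a non-terminated instance."""
    used: set[int] = set()
    for instance in instances:
        if instance.get("terminated"):
            continue  # a terminated instance releases all of its identifiers
        for set_name in ATTRIBUTE_SETS:
            attributes = instance.get(set_name) or {}  # a cleared set holds nothing
            value = attributes.get(attribute)
            if value is not None:
                used.add(value)
    return used


def allocate(
    instance: dict[str, Any],
    instances: list[dict[str, Any]],
    attribute: str,
    lower: int,
    upper: int,
    rng: Optional[random.Random] = None,
) -> int:
    """Reuse the stored value if there is one, else pick a free value in [lower, upper]."""
    candidate = instance.get("candidate_attributes") or {}
    existing = candidate.get(attribute)
    if existing is not None:
        return existing
    free = sorted(set(range(lower, upper + 1)) - values_in_use(instances, attribute))
    if not free:
        raise LookupError(f"no free value left for {attribute}")
    return (rng or random.Random()).choice(free)
```

In this sketch, clearing an attribute set or marking an instance as terminated is all it takes to free a value: the next call to values_in_use simply no longer reports it.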

Important note when designing custom lifecycles: allocation only happens during validating, and the result of the allocation is always written to the candidate attributes.

External lookup

Often, values received via the NorthBound API are not directly usable. For example, a router can be identified in the API by its name, but what is required is its management IP. The management IP can be obtained based on the name, through lookup in an inventory.

While lookup is not strictly allocation, it is in many ways similar.

The basic mechanism for external lookup is similar to internal allocation: the resolved value is stored in a read-only attribute. This is done to ensure that LSM remains stable, even if the inventory is down or corrupted. This also implies that if the inventory wants to change the value (e.g. the router management IP is suddenly changed), it should notify LSM. LSM will not pick up inventory changes by itself. This notification mechanism is not yet supported.
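The stability argument can be illustrated with a minimal sketch that is independent of LSM. The function names needs_lookup and resolve, the attribute dictionary and the lookup callable are all hypothetical. The point is only that once the resolved value is stored, the inventory is no longer consulted, so an outage or a silent change in the inventory does not affect the instance.

```python
from typing import Any, Callable


def needs_lookup(attributes: dict[str, Any], attribute: str) -> bool:
    """A lookup is only needed while the read-only attribute is still empty."""
    return attributes.get(attribute) is None


def resolve(
    attributes: dict[str, Any],
    id_attribute: str,
    attribute: str,
    lookup: Callable[[Any], Any],
) -> dict[str, Any]:
    """Return a copy of the attributes with the looked-up value filled in once."""
    resolved = dict(attributes)
    if needs_lookup(attributes, attribute):
        # Only here is the (hypothetical) inventory contacted.
        resolved[attribute] = lookup(attributes[id_attribute])
    return resolved


if __name__ == "__main__":
    inventory = {"router-a": "10.0.0.1"}
    first = resolve(
        {"router_a": "router-a", "router_a_mgmt_ip": None},
        "router_a",
        "router_a_mgmt_ip",
        inventory.__getitem__,
    )
    inventory["router-a"] = "10.0.0.99"  # the inventory changes without notifying
    second = resolve(first, "router_a", "router_a_mgmt_ip", inventory.__getitem__)
    print(first["router_a_mgmt_ip"], second["router_a_mgmt_ip"])
```

The second call keeps the originally resolved address, which is exactly why a change in the inventory needs an explicit notification to take effect.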

An example with router management IP looks like this:

main.cf
import lsm
import lsm::fsm

entity VirtualWire extends lsm::ServiceEntity:
    string router_a
    int port_a
    int vlan_a
    string router_z
    int port_z
    int vlan_z
    int? vni
    std::ipv4_address?  router_a_mgmt_ip
    std::ipv4_address?  router_z_mgmt_ip
    lsm::attribute_modifier vni__modifier="r"
    lsm::attribute_modifier router_a_mgmt_ip__modifier="r"
    lsm::attribute_modifier router_z_mgmt_ip__modifier="r"
    lsm::attribute_modifier router_a__modifier="rw+"
    lsm::attribute_modifier router_z__modifier="rw+"
end

implement VirtualWire using parents, do_deploy

for assignment in lsm::all(binding):
  VirtualWire(
      instance_id=assignment["id"],
      router_a = assignment["attributes"]["router_a"],
      port_a = assignment["attributes"]["port_a"],
      vlan_a = assignment["attributes"]["vlan_a"],
      router_z = assignment["attributes"]["router_z"],
      port_z = assignment["attributes"]["port_z"],
      vlan_z = assignment["attributes"]["vlan_z"],
      vni=assignment["attributes"]["vni"],
      router_a_mgmt_ip=assignment["attributes"]["router_a_mgmt_ip"],
      router_z_mgmt_ip=assignment["attributes"]["router_z_mgmt_ip"],
      entity_binding=binding,
  )
end

binding = lsm::ServiceEntityBinding(
    service_entity="__config__::VirtualWire",
    lifecycle=lsm::fsm::simple,
    service_entity_name="virtualwire",
    allocation_spec="allocate_for_virtualwire",
)

The allocation implementation could look like the following:

plugins/__init__.py
"""
    Inmanta LSM

    :copyright: 2020 Inmanta
    :contact: code@inmanta.com
    :license: Inmanta EULA
"""
import os
from typing import Any, Optional

import inmanta_plugins.lsm.allocation as lsm
import psycopg2
from inmanta_plugins.lsm.allocation import (
    AllocationContext,
    ExternalAttributeAllocator,
    T,
)
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT


class PGRouterResolver(ExternalAttributeAllocator[T]):
    def __init__(self, attribute: str, id_attribute: str) -> None:
        super().__init__(attribute, id_attribute)
        self.conn = None
        self.database = None

    def pre_allocate(self) -> None:
        """Connect to postgresql"""
        host = os.environ.get("db_host", "localhost")
        port = os.environ.get("db_port")
        user = os.environ.get("db_user")
        self.database = os.environ.get("db_name", "allocation_db")
        self.conn = psycopg2.connect(
            host=host, port=port, user=user, dbname=self.database
        )
        self.conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)

    def post_allocate(self) -> None:
        """Close connection"""
        self.conn.close()

    def needs_allocation(
        self, ctx: AllocationContext, instance: dict[str, Any]
    ) -> bool:
        # Look up again when the value is missing or when the identifying
        # attribute (the router name) differs between candidate and active.
        attribute_not_yet_allocated = super().needs_allocation(ctx, instance)
        id_attribute_changed = self._id_attribute_changed(instance)
        return attribute_not_yet_allocated or id_attribute_changed

    def _id_attribute_changed(self, instance: dict[str, Any]) -> bool:
        if instance["candidate_attributes"] and instance["active_attributes"]:
            return instance["candidate_attributes"].get(self.id_attribute) != instance[
                "active_attributes"
            ].get(self.id_attribute)
        return False

    def _get_value_from_result(self, result: Optional[tuple[T]]) -> Optional[T]:
        if result and result[0]:
            return result[0]
        return None

    def allocate_for_attribute(self, id_attribute_value: Any) -> T:
        # Resolve the router name to its management IP in the inventory.
        with self.conn.cursor() as cursor:
            cursor.execute(
                "SELECT mgmt_ip FROM routers WHERE name=%s", (id_attribute_value,)
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return allocated_value
            raise Exception(f"No ip address found for {id_attribute_value}")


lsm.AllocationSpec(
    "allocate_for_virtualwire",
    PGRouterResolver(id_attribute="router_a", attribute="router_a_mgmt_ip"),
    PGRouterResolver(id_attribute="router_z", attribute="router_z_mgmt_ip"),
    lsm.LSM_Allocator(
        attribute="vni", strategy=lsm.AnyUniqueInt(lower=50000, upper=70000)
    ),
)

External inventory owns allocation

When allocation is owned externally, synchronization between LSM and the external inventory is crucial. If either LSM or the inventory fails, this should not lead to inconsistencies. In other words, LSM doesn’t only have to maintain consistency between different service instances, but also between itself and the inventory.

The basic mechanism for external allocation is similar to external lookup. One important difference is that we also write our allocation to the inventory.

For example, consider an external PostgreSQL database that contains the allocation table. In the model, this looks exactly the same as in the case of internal allocation. In the code, it looks as follows:

plugins/__init__.py
"""
    Inmanta LSM

    :copyright: 2020 Inmanta
    :contact: code@inmanta.com
    :license: Inmanta EULA
"""
import os
from typing import Optional
from uuid import UUID

import inmanta_plugins.lsm.allocation as lsm
import psycopg2
from inmanta_plugins.lsm.allocation import ExternalServiceIdAllocator, T
from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE


class PGServiceIdAllocator(ExternalServiceIdAllocator[T]):
    def __init__(self, attribute: str) -> None:
        super().__init__(attribute)
        self.conn = None
        self.database = None

    def pre_allocate(self) -> None:
        """Connect to postgresql"""
        host = os.environ.get("db_host", "localhost")
        port = os.environ.get("db_port")
        user = os.environ.get("db_user")
        self.database = os.environ.get("db_name", "allocation_db")
        self.conn = psycopg2.connect(
            host=host, port=port, user=user, dbname=self.database
        )
        self.conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)

    def post_allocate(self) -> None:
        """Close connection"""
        self.conn.close()

    def _get_value_from_result(self, result: Optional[tuple[T]]) -> Optional[T]:
        if result and result[0]:
            return result[0]
        return None

    def allocate_for_id(self, serviceid: UUID) -> T:
        """Allocate in transaction"""
        with self.conn.cursor() as cursor:
            # Reuse an allocation that was already made for this service instance.
            cursor.execute(
                "SELECT allocated_value FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, serviceid),
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return allocated_value
            # Otherwise take the next value and record it in the inventory.
            cursor.execute(
                "SELECT max(allocated_value) FROM allocation where attribute=%s",
                (self.attribute,),
            )
            result = cursor.fetchone()
            current_max_value = self._get_value_from_result(result)
            allocated_value = current_max_value + 1 if current_max_value else 1
            cursor.execute(
                "INSERT INTO allocation (attribute, owner, allocated_value) VALUES (%s, %s, %s)",
                (self.attribute, serviceid, allocated_value),
            )
            self.conn.commit()
            return allocated_value


lsm.AllocationSpec(
    "allocate_vlan",
    PGServiceIdAllocator(
        attribute="vlan_id",
    ),
)

What is important to notice is that the code first checks whether an allocation has already happened. This is important in case there was a failure before LSM could commit the allocation. In general, LSM must be able to identify what has been allocated to it, in order to recover aborted operations. This is done either by attaching an identifier when performing allocation, or by knowing up front where the value will be stored in the inventory (e.g. if the inventory contains a service model as well, LSM can find the VNI for a service by requesting the VNI for that service directly).

In the above example, the identifier is the same as the service instance id that LSM uses internally to identify an instance. An attribute of the instance can also be used to identify it in the external inventory, such as the name attribute in the example below.

plugins/__init__.py
"""
    Inmanta LSM

    :copyright: 2020 Inmanta
    :contact: code@inmanta.com
    :license: Inmanta EULA
"""
import os
from typing import Any, Optional

import inmanta_plugins.lsm.allocation as lsm
import psycopg2
from inmanta_plugins.lsm.allocation import ExternalAttributeAllocator, T
from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE


class PGAttributeAllocator(ExternalAttributeAllocator[T]):
    def __init__(self, attribute: str, id_attribute: str) -> None:
        super().__init__(attribute, id_attribute)
        self.conn = None
        self.database = None

    def pre_allocate(self) -> None:
        """Connect to postgresql"""
        host = os.environ.get("db_host", "localhost")
        port = os.environ.get("db_port")
        user = os.environ.get("db_user")
        self.database = os.environ.get("db_name", "allocation_db")
        self.conn = psycopg2.connect(
            host=host, port=port, user=user, dbname=self.database
        )
        self.conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)

    def post_allocate(self) -> None:
        """Close connection"""
        self.conn.close()

    def _get_value_from_result(self, result: Optional[tuple[T]]) -> Optional[T]:
        if result and result[0]:
            return result[0]
        return None

    def allocate_for_attribute(self, id_attribute_value: Any) -> T:
        """Allocate in transaction"""
        with self.conn.cursor() as cursor:
            # Reuse an allocation that was already made for this owner.
            cursor.execute(
                "SELECT allocated_value FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, id_attribute_value),
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return allocated_value
            # Otherwise take the next value and record it in the inventory.
            cursor.execute(
                "SELECT max(allocated_value) FROM allocation where attribute=%s",
                (self.attribute,),
            )
            result = cursor.fetchone()
            current_max_value = self._get_value_from_result(result)
            allocated_value = current_max_value + 1 if current_max_value else 1
            cursor.execute(
                "INSERT INTO allocation (attribute, owner, allocated_value) VALUES (%s, %s, %s)",
                (self.attribute, id_attribute_value, allocated_value),
            )
            self.conn.commit()
            return allocated_value


lsm.AllocationSpec(
    "allocate_vlan",
    PGAttributeAllocator(attribute="vlan_id", id_attribute="name"),
)
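The check-before-allocate pattern used by both allocators can be tried out in isolation. The sketch below swaps PostgreSQL for an in-memory SQLite database from the Python standard library so that it runs anywhere. The table layout (including the two uniqueness constraints) and the function names create_inventory and allocate_for_owner are assumptions, since the schema of the allocation table is not given here.

```python
import sqlite3


def create_inventory() -> sqlite3.Connection:
    """Create an in-memory stand-in for the external inventory (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE allocation ("
        " attribute TEXT NOT NULL,"
        " owner TEXT NOT NULL,"
        " allocated_value INTEGER NOT NULL,"
        " UNIQUE (attribute, owner),"
        " UNIQUE (attribute, allocated_value))"
    )
    conn.commit()
    return conn


def allocate_for_owner(conn: sqlite3.Connection, attribute: str, owner: str) -> int:
    """Return the owner's existing value, or allocate the next one in a transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT allocated_value FROM allocation WHERE attribute=? AND owner=?",
            (attribute, owner),
        ).fetchone()
        if row is not None:
            return row[0]  # recovery path: the value was allocated earlier
        row = conn.execute(
            "SELECT max(allocated_value) FROM allocation WHERE attribute=?",
            (attribute,),
        ).fetchone()
        next_value = (row[0] or 0) + 1  # max() yields NULL on an empty table
        conn.execute(
            "INSERT INTO allocation (attribute, owner, allocated_value) VALUES (?, ?, ?)",
            (attribute, owner, next_value),
        )
        return next_value


if __name__ == "__main__":
    inventory = create_inventory()
    print(allocate_for_owner(inventory, "vlan_id", "service-1"))
    print(allocate_for_owner(inventory, "vlan_id", "service-1"))
```

Calling the function twice for the same owner yields the same value and leaves a single row behind, which is what makes it safe to repeat an allocation after a failure.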

In addition, the inventory must have a procedure to safely obtain ownership of an identifier. There must be some way for LSM to determine definitively whether it has correctly obtained an identifier. In the example, the database transaction ensures this. Many other mechanisms exist, but the inventory has to support at least one. Examples of possible transaction coordination mechanisms are:

  1. An API endpoint that atomically and consistently performs allocation.

  2. A database transaction.

  3. A compare-and-set style API (when updating a value, the old value is also passed along, ensuring no concurrent updates are possible).

  4. An API with a version argument (like the LSM API itself: when updating a value, the version prior to the update has to be passed along, preventing concurrent updates).

  5. Locks and/or leases (a value or part of the inventory can be locked, or leased (locked for some time), prior to allocation; the lock ensures no concurrent modifications).
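As an illustration of the version-argument mechanism, here is a minimal in-process sketch. The VersionedInventory class, its claim method and the VersionConflict exception are hypothetical names, not part of LSM or of any particular inventory. In a real inventory the version check and the update would have to be performed atomically on the server side; the sketch only shows the contract seen by the caller.

```python
from dataclasses import dataclass, field


class VersionConflict(Exception):
    """Raised when the caller's view of the inventory is out of date."""


@dataclass
class VersionedInventory:
    """Hypothetical inventory that only accepts updates made against its latest version."""

    version: int = 0
    owners: dict[int, str] = field(default_factory=dict)  # identifier -> owner

    def claim(self, identifier: int, owner: str, expected_version: int) -> int:
        """Assign identifier to owner if expected_version is current; return the new version."""
        if expected_version != self.version:
            # Someone else updated the inventory since the caller last read it.
            raise VersionConflict(
                f"expected version {expected_version}, inventory is at {self.version}"
            )
        current_owner = self.owners.get(identifier)
        if current_owner is not None and current_owner != owner:
            raise ValueError(f"{identifier} is already owned by {current_owner}")
        self.owners[identifier] = owner
        self.version += 1
        return self.version
```

A caller that receives VersionConflict knows for certain that it did not obtain the identifier. It can then re-read the inventory, pick a value that is still free, and try again with the new version.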

This scenario performs no de-allocation.

External inventory with deallocation

To ensure de-allocation on an external inventory is properly executed, it is not executed during compilation, but by a handler. This ensures that de-allocation is retried until it completes successfully.

The example below shows how allocation and de-allocation of a VLAN ID can be done using an external inventory. The handler of the PGAllocation entity performs the de-allocation. An instance of this entity is only constructed when the service instance is in the deallocating state.

vlan_assignment/model/_init.cf
import lsm
import lsm::fsm

entity VlanAssignment extends lsm::ServiceEntity:
    string name

    int? vlan_id
    lsm::attribute_modifier vlan_id__modifier="r"
end

implement VlanAssignment using parents, do_deploy
implement VlanAssignment using de_allocation when lsm::has_current_state(self, "deallocating")

entity PGAllocation extends std::PurgeableResource:
    """
        This entity ensures that an identifier allocated in PostgreSQL
        gets de-allocated when the service instance is removed.
    """
    string attribute
    std::uuid service_id
    string agent
end

implement PGAllocation using std::none

implementation de_allocation for VlanAssignment:
    """
        De-allocate the vlan_id identifier.
    """
    self.resources += PGAllocation(
        attribute="vlan_id",
        service_id=instance_id,
        purged=true,
        send_event=true,
        agent="internal",
        requires=self.requires,
        provides=self.provides,
    )
end

binding = lsm::ServiceEntityBinding(
    service_entity="vlan_assignment::VlanAssignment",
    lifecycle=lsm::fsm::simple_with_deallocation,
    service_entity_name="vlan-assignment",
    allocation_spec="allocate_vlan",
)

for assignment in lsm::all(binding):
    VlanAssignment(
        instance_id=assignment["id"],
        entity_binding=binding,
        **assignment["attributes"],
    )
end

The handler associated with the PGAllocation entity is shown in the code snippet below. Note that the handler doesn’t implement the create_resource() and update_resource() methods, since they can never be called. The only possible operation is a delete operation.

vlan_assignment/plugins/__init__.py
"""
    Inmanta LSM

    :copyright: 2020 Inmanta
    :contact: code@inmanta.com
    :license: Inmanta EULA
"""
import os
from typing import Optional
from uuid import UUID

import psycopg2
from inmanta.agent import handler
from inmanta.agent.handler import CRUDHandlerGeneric as CRUDHandler
from inmanta.agent.handler import ResourcePurged, provider
from inmanta.resources import PurgeableResource, resource
from inmanta_plugins.lsm.allocation import AllocationSpec, ExternalServiceIdAllocator
from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE


class PGServiceIdAllocator(ExternalServiceIdAllocator[int]):
    def __init__(self, attribute: str) -> None:
        super().__init__(attribute)
        self.conn = None
        self.database = None

    def pre_allocate(self) -> None:
        """Connect to postgresql"""
        host = os.environ.get("db_host", "localhost")
        port = os.environ.get("db_port")
        user = os.environ.get("db_user")
        self.database = os.environ.get("db_name", "allocation_db")
        self.conn = psycopg2.connect(
            host=host, port=port, user=user, dbname=self.database
        )
        self.conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)

    def post_allocate(self) -> None:
        """Close connection"""
        self.conn.close()

    def _get_value_from_result(self, result: Optional[tuple[int]]) -> Optional[int]:
        if result and result[0]:
            return result[0]
        return None

    def allocate_for_id(self, serviceid: UUID) -> int:
        """Allocate in transaction"""
        with self.conn.cursor() as cursor:
            # Reuse an allocation that was already made for this service instance.
            cursor.execute(
                "SELECT allocated_value FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, serviceid),
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return allocated_value
            # Otherwise take the next value and record it in the inventory.
            cursor.execute(
                "SELECT max(allocated_value) FROM allocation where attribute=%s",
                (self.attribute,),
            )
            result = cursor.fetchone()
            current_max_value = self._get_value_from_result(result)
            allocated_value = current_max_value + 1 if current_max_value else 1
            cursor.execute(
                "INSERT INTO allocation (attribute, owner, allocated_value) VALUES (%s, %s, %s)",
                (self.attribute, serviceid, allocated_value),
            )
            self.conn.commit()
            return allocated_value

    def has_allocation_in_inventory(self, serviceid: UUID) -> bool:
        """
        Check whether a VLAN ID is allocated by the service instance with the given id.
        """
        with self.conn.cursor() as cursor:
            cursor.execute(
                "SELECT allocated_value FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, serviceid),
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return True
            return False

    def de_allocate(self, serviceid: UUID) -> None:
        """
        De-allocate the VLAN ID allocated by the service instance with the given id.
        """
        with self.conn.cursor() as cursor:
            cursor.execute(
                "DELETE FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, serviceid),
            )
            self.conn.commit()


@resource("vlan_assignment::PGAllocation", agent="agent", id_attribute="service_id")
class PGAllocationResource(PurgeableResource):
    fields = ("attribute", "service_id")


@provider("vlan_assignment::PGAllocation", name="pgallocation")
class PGAllocation(CRUDHandler[PGAllocationResource]):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._allocator = PGServiceIdAllocator(attribute="vlan_id")

    def pre(self, ctx: handler.HandlerContext, resource: PGAllocationResource) -> None:
        self._allocator.pre_allocate()

    def post(self, ctx: handler.HandlerContext, resource: PGAllocationResource) -> None:
        self._allocator.post_allocate()

    def read_resource(
        self, ctx: handler.HandlerContext, resource: PGAllocationResource
    ) -> None:
        # Nothing left in the inventory means the resource is already purged.
        if not self._allocator.has_allocation_in_inventory(resource.service_id):
            raise ResourcePurged()

    def delete_resource(
        self, ctx: handler.HandlerContext, resource: PGAllocationResource
    ) -> None:
        self._allocator.de_allocate(resource.service_id)


AllocationSpec("allocate_vlan", PGServiceIdAllocator(attribute="vlan_id"))
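Why de-allocation can safely be retried can be seen in a small sketch outside the handler framework. It again uses an in-memory SQLite database in place of PostgreSQL. The function names has_allocation, de_allocate and purge_once, as well as the table layout, are assumptions made for illustration. The sketch mirrors the read-then-delete sequence: if nothing is found, there is nothing to do, and deleting a row that is already gone is harmless.

```python
import sqlite3


def has_allocation(conn: sqlite3.Connection, attribute: str, owner: str) -> bool:
    """Read step: is there still a row for this owner in the inventory?"""
    row = conn.execute(
        "SELECT 1 FROM allocation WHERE attribute=? AND owner=?",
        (attribute, owner),
    ).fetchone()
    return row is not None


def de_allocate(conn: sqlite3.Connection, attribute: str, owner: str) -> None:
    """Delete step: removing a missing row changes nothing, so a retry is safe."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "DELETE FROM allocation WHERE attribute=? AND owner=?",
            (attribute, owner),
        )


def purge_once(conn: sqlite3.Connection, attribute: str, owner: str) -> bool:
    """One attempt: returns True if a delete was performed, False if already purged."""
    if not has_allocation(conn, attribute, owner):
        return False
    de_allocate(conn, attribute, owner)
    return True


if __name__ == "__main__":
    inventory = sqlite3.connect(":memory:")
    inventory.execute(
        "CREATE TABLE allocation (attribute TEXT, owner TEXT, allocated_value INTEGER)"
    )
    inventory.execute("INSERT INTO allocation VALUES ('vlan_id', 'service-1', 1)")
    inventory.commit()
    print(purge_once(inventory, "vlan_id", "service-1"))
    print(purge_once(inventory, "vlan_id", "service-1"))
```

The first attempt removes the row and the second finds nothing to delete, so repeating the operation after a partial failure converges on the same end state.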