Allocation

In a service lifecycle, allocation is the lifecycle stage where identifiers are allocated for use by a specific service instance.

For example, a customer orders a virtual wire between two ports on two routers. The customer specifies the router, port and VLAN for both the A and Z side of the wire. In the network, this virtual wire is implemented as a VXLAN tunnel, tied to both endpoints. Each such tunnel requires a “VXLAN Network Identifier (VNI)” that uniquely identifies the tunnel. In the allocation phase, the orchestrator selects a VNI and ensures no other customer is assigned the same VNI.

Correct allocation is crucial for the correct functioning of automated services. However, when serving multiple customers at once or when mediating between multiple inventories, correct allocation can be challenging, due to concurrency and distribution effects.

LSM offers a framework to perform allocation correctly and efficiently. The remainder of this document will explain how.

Types of Allocation

We distinguish several types of allocation. The next sections will explain each type, from simplest to most advanced. After the basic explanation, a more in-depth explanation is given for the different types. When first learning about LSM allocation (or allocation in general), it is important to have a basic understanding of the different types, before diving into the details.

LSM internal allocation

The easiest form of allocation is when no external inventory is involved. A range of available identifiers is assigned to LSM to distribute as it sees fit. For example, VNI range 50000-70000 is reserved for this service and can be used by LSM freely. This requires no coordination with external systems and is supported out-of-the-box.

For a service that has a single identifier (here vlan_id) allocated internally, the model would look like this:

main.cf
import lsm
import lsm::fsm

entity VlanAssignment extends lsm::ServiceEntity:
    string name

    int? vlan_id
    lsm::attribute_modifier vlan_id__modifier="r"
end

implement VlanAssignment using parents, do_deploy

binding = lsm::ServiceEntityBinding(
    service_entity="__config__::VlanAssignment",
    lifecycle=lsm::fsm::simple,
    service_entity_name="vlan-assignment",
    allocation_spec="allocate_vlan",
)

for assignment in lsm::all(binding):
    VlanAssignment(
        instance_id=assignment["id"],
        entity_binding=binding,
        **assignment["attributes"]
    )
end

The main changes in the model are:

  1. the attributes that have to be allocated are added to the service definition as r (read only) attributes.

  2. the service binding refers to an allocation spec (defined in python code)

plugins/__init__.py
"""
    Inmanta LSM

    :copyright: 2020 Inmanta
    :contact: code@inmanta.com
    :license: Inmanta EULA
"""

import inmanta_plugins.lsm.allocation as lsm

lsm.AllocationSpec(
    "allocate_vlan",
    lsm.LSM_Allocator(
        attribute="vlan_id", strategy=lsm.AnyUniqueInt(lower=50000, upper=70000)
    ),
)

The allocation spec specifies how to allocate the attribute:

  1. Use the pure LSM internal allocation mechanism for vlan_id

  2. To select a new value, use the AnyUniqueInt strategy, which selects a random number in the specified range

Internally, this works by storing allocations in read-only attributes on the instance. The lsm::all function ensures that if a value is already in the attribute, that value is used. Otherwise, the allocator gets an appropriate, new value, that doesn’t collide with any value in any attribute-set of any other service instance.

In practice, this means that a value is allocated as long as it’s in the active, candidate or rollback attribute sets of any non-terminated service instance. When a service instance is terminated, or clears one of its attribute sets, all identifiers are automatically deallocated.
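The rule above can be illustrated with a minimal, self-contained sketch. It is not the LSM implementation: the helper names `used_values` and `pick_unique_int`, the instance dictionary layout and the inclusive treatment of both range bounds are assumptions made purely for illustration.

```python
import random
from typing import Any, Optional

# Assumed names for the three attribute sets of a service instance.
ATTRIBUTE_SETS = ("active_attributes", "candidate_attributes", "rollback_attributes")


def used_values(instances: list[dict[str, Any]], attribute: str) -> set[int]:
    """Collect every value held in any attribute set of a non-terminated instance."""
    used: set[int] = set()
    for instance in instances:
        if instance.get("state") == "terminated":
            continue  # a terminated instance releases all its identifiers
        for set_name in ATTRIBUTE_SETS:
            attributes: Optional[dict[str, Any]] = instance.get(set_name)
            if attributes and attributes.get(attribute) is not None:
                used.add(attributes[attribute])
    return used


def pick_unique_int(
    instances: list[dict[str, Any]], attribute: str, lower: int, upper: int
) -> int:
    """Pick a random value in [lower, upper] that no live instance holds."""
    taken = used_values(instances, attribute)
    free = [value for value in range(lower, upper + 1) if value not in taken]
    if not free:
        raise LookupError(f"no free value for {attribute} in [{lower}, {upper}]")
    return random.choice(free)
```

In this sketch, a value that only appears in the rollback set of a live instance still counts as taken, while a value held by a terminated instance becomes available again.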

Important note when designing custom lifecycles: allocation only happens during validating, and the result of the allocation is always written to the candidate attributes.

External lookup

Often, values received via the NorthBound API are not directly usable. For example, a router can be identified in the API by its name, but what is required is its management IP. The management IP can be obtained based on the name, through lookup in an inventory.

While lookup is not strictly allocation, it is in many ways similar.

The basic mechanism for external lookup is similar to internal allocation: the resolved value is stored in a read-only attribute. This is done to ensure that LSM remains stable, even if the inventory is down or corrupted. This also implies that if the inventory wants to change the value (e.g. the router management IP is suddenly changed), it should notify LSM. LSM will not pick up inventory changes by itself. This notification mechanism is not supported yet.

An example with router management IP looks like this:

main.cf
import lsm
import lsm::fsm

entity VirtualWire extends lsm::ServiceEntity:
    string router_a
    int port_a
    int vlan_a
    string router_z
    int port_z
    int vlan_z
    int? vni
    std::ipv4_address? router_a_mgmt_ip
    std::ipv4_address? router_z_mgmt_ip
    lsm::attribute_modifier vni__modifier="r"
    lsm::attribute_modifier router_a_mgmt_ip__modifier="r"
    lsm::attribute_modifier router_z_mgmt_ip__modifier="r"
    lsm::attribute_modifier router_a__modifier="rw+"
    lsm::attribute_modifier router_z__modifier="rw+"
end

implement VirtualWire using parents, do_deploy

for assignment in lsm::all(binding):
    VirtualWire(
        instance_id=assignment["id"],
        router_a=assignment["attributes"]["router_a"],
        port_a=assignment["attributes"]["port_a"],
        vlan_a=assignment["attributes"]["vlan_a"],
        router_z=assignment["attributes"]["router_z"],
        port_z=assignment["attributes"]["port_z"],
        vlan_z=assignment["attributes"]["vlan_z"],
        vni=assignment["attributes"]["vni"],
        router_a_mgmt_ip=assignment["attributes"]["router_a_mgmt_ip"],
        router_z_mgmt_ip=assignment["attributes"]["router_z_mgmt_ip"],
        entity_binding=binding,
    )
end

binding = lsm::ServiceEntityBinding(
    service_entity="__config__::VirtualWire",
    lifecycle=lsm::fsm::simple,
    service_entity_name="virtualwire",
    allocation_spec="allocate_for_virtualwire",
)

The allocation implementation could look like the following:

plugins/__init__.py
"""
    Inmanta LSM

    :copyright: 2020 Inmanta
    :contact: code@inmanta.com
    :license: Inmanta EULA
"""

import os
from typing import Any, Optional

import inmanta_plugins.lsm.allocation as lsm
import psycopg2
from inmanta_plugins.lsm.allocation import (
    AllocationContext,
    ExternalAttributeAllocator,
    T,
)
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT


class PGRouterResolver(ExternalAttributeAllocator[T]):
    def __init__(self, attribute: str, id_attribute: str) -> None:
        super().__init__(attribute, id_attribute)
        self.conn = None
        self.database = None

    def pre_allocate(self) -> None:
        """Connect to postgresql"""
        host = os.environ.get("db_host", "localhost")
        port = os.environ.get("db_port")
        user = os.environ.get("db_user")
        self.database = os.environ.get("db_name", "allocation_db")
        self.conn = psycopg2.connect(
            host=host, port=port, user=user, dbname=self.database
        )
        self.conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)

    def post_allocate(self) -> None:
        """Close connection"""
        self.conn.close()

    def needs_allocation(
        self, ctx: AllocationContext, instance: dict[str, Any]
    ) -> bool:
        # Resolve again when the router name changed, not only on first allocation
        attribute_not_yet_allocated = super().needs_allocation(ctx, instance)
        id_attribute_changed = self._id_attribute_changed(instance)
        return attribute_not_yet_allocated or id_attribute_changed

    def _id_attribute_changed(self, instance: dict[str, Any]) -> bool:
        if instance["candidate_attributes"] and instance["active_attributes"]:
            return instance["candidate_attributes"].get(self.id_attribute) != instance[
                "active_attributes"
            ].get(self.id_attribute)
        return False

    def _get_value_from_result(self, result: Optional[tuple[T]]) -> Optional[T]:
        if result and result[0]:
            return result[0]
        return None

    def allocate_for_attribute(self, id_attribute_value: Any) -> T:
        with self.conn.cursor() as cursor:
            cursor.execute(
                "SELECT mgmt_ip FROM routers WHERE name=%s", (id_attribute_value,)
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return allocated_value
            # Format the message explicitly: Exception does not apply %-formatting
            raise Exception(f"No ip address found for {id_attribute_value}")


lsm.AllocationSpec(
    "allocate_for_virtualwire",
    PGRouterResolver(id_attribute="router_a", attribute="router_a_mgmt_ip"),
    PGRouterResolver(id_attribute="router_z", attribute="router_z_mgmt_ip"),
    lsm.LSM_Allocator(
        attribute="vni", strategy=lsm.AnyUniqueInt(lower=50000, upper=70000)
    ),
)
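The decision made by the needs_allocation override can be sketched in isolation: a lookup is repeated when the resolved attribute is still empty, or when the identifying attribute differs between the candidate and active attribute sets. The function names and the plain dictionaries below are hypothetical, and the "not yet resolved" check is only an assumption about what the base class does.

```python
from typing import Any, Optional


def id_attribute_changed(instance: dict[str, Any], id_attribute: str) -> bool:
    """True when the identifying attribute differs between candidate and active sets."""
    candidate: Optional[dict[str, Any]] = instance.get("candidate_attributes")
    active: Optional[dict[str, Any]] = instance.get("active_attributes")
    if candidate and active:
        return candidate.get(id_attribute) != active.get(id_attribute)
    return False  # nothing to compare against, e.g. a brand new instance


def needs_lookup(instance: dict[str, Any], attribute: str, id_attribute: str) -> bool:
    """Decide whether the resolved attribute has to be looked up (again)."""
    candidate = instance.get("candidate_attributes") or {}
    # Assumption: an attribute counts as unresolved while its candidate value is None.
    not_yet_resolved = candidate.get(attribute) is None
    return not_yet_resolved or id_attribute_changed(instance, id_attribute)
```

With this logic, renaming router_a in an update triggers a new lookup even though router_a_mgmt_ip already holds a value, which is why the example marks router_a as rw+.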

External inventory owns allocation

When allocation is owned externally, synchronization between LSM and the external inventory is crucial. If either LSM or the inventory fails, this should not lead to inconsistencies. In other words, LSM doesn’t only have to maintain consistency between different service instances, but also between itself and the inventory.

The basic mechanism for external allocation is similar to external lookup. One important difference is that we also write our allocation to the inventory.

For example, consider an external PostgreSQL database that contains the allocation table. The model looks exactly the same as in the case of internal allocation. The code looks as follows:

plugins/__init__.py
"""
    Inmanta LSM

    :copyright: 2020 Inmanta
    :contact: code@inmanta.com
    :license: Inmanta EULA
"""

import os
from typing import Optional
from uuid import UUID

import inmanta_plugins.lsm.allocation as lsm
import psycopg2
from inmanta_plugins.lsm.allocation import ExternalServiceIdAllocator, T
from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE


class PGServiceIdAllocator(ExternalServiceIdAllocator[T]):
    def __init__(self, attribute: str) -> None:
        super().__init__(attribute)
        self.conn = None
        self.database = None

    def pre_allocate(self):
        """Connect to postgresql"""
        host = os.environ.get("db_host", "localhost")
        port = os.environ.get("db_port")
        user = os.environ.get("db_user")
        self.database = os.environ.get("db_name", "allocation_db")
        self.conn = psycopg2.connect(
            host=host, port=port, user=user, dbname=self.database
        )
        self.conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)

    def post_allocate(self) -> None:
        """Close connection"""
        self.conn.close()

    def _get_value_from_result(self, result: Optional[tuple[T]]) -> Optional[T]:
        if result and result[0]:
            return result[0]
        return None

    def allocate_for_id(self, serviceid: UUID) -> T:
        """Allocate in transaction"""
        with self.conn.cursor() as cursor:
            cursor.execute(
                "SELECT allocated_value FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, serviceid),
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return allocated_value
            cursor.execute(
                "SELECT max(allocated_value) FROM allocation where attribute=%s",
                (self.attribute,),
            )
            result = cursor.fetchone()
            current_max_value = self._get_value_from_result(result)
            allocated_value = current_max_value + 1 if current_max_value else 1
            cursor.execute(
                "INSERT INTO allocation (attribute, owner, allocated_value) VALUES (%s, %s, %s)",
                (self.attribute, serviceid, allocated_value),
            )
            self.conn.commit()
            return allocated_value


lsm.AllocationSpec(
    "allocate_vlan",
    PGServiceIdAllocator(
        attribute="vlan_id",
    ),
)

What is important to notice is that the code first checks whether an allocation has already happened. This is important in case there was a failure before LSM could commit the allocation. In general, LSM must be able to identify what has been allocated to it, in order to recover aborted operations. This is done either by attaching an identifier when performing allocation, or by knowing up front where the value will be stored in the inventory (e.g. if the inventory contains a service model as well, LSM can find the VNI for a service by requesting the VNI for that service directly).
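The check-before-allocate pattern can be tried out without a PostgreSQL server. The sketch below uses Python's built-in sqlite3 module as a stand-in inventory. The table layout mirrors the allocation table used in the example, but the function names, the text-typed owner column and the uniqueness constraints are assumptions made for illustration.

```python
import sqlite3


def make_inventory() -> sqlite3.Connection:
    """Create an in-memory stand-in for the external allocation table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE allocation ("
        " attribute TEXT NOT NULL,"
        " owner TEXT NOT NULL,"
        " allocated_value INTEGER NOT NULL,"
        " PRIMARY KEY (attribute, owner),"
        " UNIQUE (attribute, allocated_value))"
    )
    return conn


def allocate_for_owner(conn: sqlite3.Connection, attribute: str, owner: str) -> int:
    """Return the value already held by owner, or allocate the next free one."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT allocated_value FROM allocation WHERE attribute=? AND owner=?",
            (attribute, owner),
        ).fetchone()
        if row is not None:
            return row[0]  # recover an allocation made by an earlier, aborted run
        row = conn.execute(
            "SELECT max(allocated_value) FROM allocation WHERE attribute=?",
            (attribute,),
        ).fetchone()
        value = (row[0] or 0) + 1  # max() yields NULL on an empty table
        conn.execute(
            "INSERT INTO allocation (attribute, owner, allocated_value) VALUES (?, ?, ?)",
            (attribute, owner, value),
        )
        return value
```

Calling allocate_for_owner twice with the same owner returns the same value and leaves a single row behind, which is the property that makes a retried compile safe.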

In the above example, the identifier is the same as the service instance id that LSM uses internally to identify an instance. An attribute of the instance can also be used to identify it in the external inventory, such as the name attribute in the example below.

plugins/__init__.py
"""
    Inmanta LSM

    :copyright: 2020 Inmanta
    :contact: code@inmanta.com
    :license: Inmanta EULA
"""

import os
from typing import Any, Optional

import inmanta_plugins.lsm.allocation as lsm
import psycopg2
from inmanta_plugins.lsm.allocation import ExternalAttributeAllocator, T
from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE


class PGAttributeAllocator(ExternalAttributeAllocator[T]):
    def __init__(self, attribute: str, id_attribute: str) -> None:
        super().__init__(attribute, id_attribute)
        self.conn = None
        self.database = None

    def pre_allocate(self):
        """Connect to postgresql"""
        host = os.environ.get("db_host", "localhost")
        port = os.environ.get("db_port")
        user = os.environ.get("db_user")
        self.database = os.environ.get("db_name", "allocation_db")
        self.conn = psycopg2.connect(
            host=host, port=port, user=user, dbname=self.database
        )
        self.conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)

    def post_allocate(self) -> None:
        """Close connection"""
        self.conn.close()

    def _get_value_from_result(self, result: Optional[tuple[T]]) -> Optional[T]:
        if result and result[0]:
            return result[0]
        return None

    def allocate_for_attribute(self, id_attribute_value: Any) -> T:
        """Allocate in transaction"""
        with self.conn.cursor() as cursor:
            cursor.execute(
                "SELECT allocated_value FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, id_attribute_value),
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return allocated_value
            cursor.execute(
                "SELECT max(allocated_value) FROM allocation where attribute=%s",
                (self.attribute,),
            )
            result = cursor.fetchone()
            current_max_value = self._get_value_from_result(result)
            allocated_value = current_max_value + 1 if current_max_value else 1
            cursor.execute(
                "INSERT INTO allocation (attribute, owner, allocated_value) VALUES (%s, %s, %s)",
                (self.attribute, id_attribute_value, allocated_value),
            )
            self.conn.commit()
            return allocated_value


lsm.AllocationSpec(
    "allocate_vlan",
    PGAttributeAllocator(attribute="vlan_id", id_attribute="name"),
)

Second, the inventory must offer a procedure to safely obtain ownership of an identifier. There must be some way for LSM to determine definitively whether it has correctly obtained an identifier. In the example, the database transaction ensures this. Many other mechanisms exist, but the inventory has to support at least one. Examples of possible transaction coordination mechanisms are:

  1. an API endpoint that atomically and consistently performs allocation

  2. a database transaction

  3. a compare-and-set style API (when updating a value, the old value is also passed along, ensuring no concurrent updates are possible)

  4. an API with a version argument (like the LSM API itself: when updating a value, the version prior to the update has to be passed along, preventing concurrent updates)

  5. locks and/or leases (a value or part of the inventory can be locked, or leased (locked for some time), prior to allocation; the lock ensures no concurrent modifications)
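As an illustration of the compare-and-set style, the sketch below models a hypothetical in-memory inventory. The class VersionedInventory and the function claim are invented for this example and do not correspond to any real inventory API. A claim only succeeds if the identifier still has the owner the caller expects.

```python
import threading
from collections.abc import Iterable
from typing import Optional


class VersionedInventory:
    """Hypothetical inventory that maps identifiers to their owner."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._owners: dict[int, Optional[str]] = {}

    def owner_of(self, value: int) -> Optional[str]:
        with self._lock:
            return self._owners.get(value)

    def compare_and_set(
        self, value: int, expected: Optional[str], new: Optional[str]
    ) -> bool:
        """Set the owner only if the current owner equals expected."""
        with self._lock:
            if self._owners.get(value) != expected:
                return False  # somebody else changed it in the meantime
            self._owners[value] = new
            return True


def claim(inventory: VersionedInventory, candidates: Iterable[int], owner: str) -> int:
    """Return an identifier owned by owner, claiming a free one if needed."""
    values = list(candidates)
    for value in values:
        if inventory.owner_of(value) == owner:
            return value  # recover an earlier claim instead of taking a second one
    for value in values:
        if inventory.compare_and_set(value, expected=None, new=owner):
            return value
    raise RuntimeError("no free identifier left")
```

Because the final decision is taken inside compare_and_set, two callers racing for the same free identifier cannot both succeed; the loser simply moves on to the next candidate.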

This scenario performs no de-allocation.

External inventory with deallocation

To ensure de-allocation on an external inventory is properly executed, it is not executed during compilation, but by a handler. This ensures that de-allocation is retried until it completes successfully.

The example below shows how allocation and de-allocation of a VLAN ID can be done using an external inventory. The handler of the PGAllocation entity performs the de-allocation. An instance of this entity is only constructed when the service instance is in the deallocating state.

vlan_assignment/model/_init.cf
import lsm
import lsm::fsm

entity VlanAssignment extends lsm::ServiceEntity:
    string name

    int? vlan_id
    lsm::attribute_modifier vlan_id__modifier="r"
end

implement VlanAssignment using parents, do_deploy
implement VlanAssignment using de_allocation when lsm::has_current_state(self, "deallocating")

entity PGAllocation extends std::PurgeableResource:
    """
        This entity ensures that an identifier allocated in PostgreSQL
        gets de-allocated when the service instance is removed.
    """
    string attribute
    std::uuid service_id
    string agent
end

implement PGAllocation using std::none

implementation de_allocation for VlanAssignment:
    """
        De-allocate the vlan_id identifier.
    """
    self.resources += PGAllocation(
        attribute="vlan_id",
        service_id=instance_id,
        purged=true,
        send_event=true,
        agent="internal",
        requires=self.requires,
        provides=self.provides,
    )
end

binding = lsm::ServiceEntityBinding(
    service_entity="vlan_assignment::VlanAssignment",
    lifecycle=lsm::fsm::simple_with_deallocation,
    service_entity_name="vlan-assignment",
    allocation_spec="allocate_vlan",
)

for assignment in lsm::all(binding):
    VlanAssignment(
        instance_id=assignment["id"],
        entity_binding=binding,
        **assignment["attributes"],
    )
end

The handler associated with the PGAllocation entity is shown in the code snippet below. Note that the handler doesn’t implement the create_resource() and update_resource() methods, since they can never be called. The only possible operation is a delete operation.

vlan_assignment/plugins/__init__.py
"""
    Inmanta LSM

    :copyright: 2020 Inmanta
    :contact: code@inmanta.com
    :license: Inmanta EULA
"""

import os
from typing import Optional
from uuid import UUID

import psycopg2
from inmanta.agent import handler
from inmanta.agent.handler import CRUDHandlerGeneric as CRUDHandler
from inmanta.agent.handler import ResourcePurged, provider
from inmanta.resources import PurgeableResource, resource
from inmanta_plugins.lsm.allocation import AllocationSpec, ExternalServiceIdAllocator
from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE


class PGServiceIdAllocator(ExternalServiceIdAllocator[int]):
    def __init__(self, attribute: str) -> None:
        super().__init__(attribute)
        self.conn = None
        self.database = None

    def pre_allocate(self) -> None:
        """Connect to postgresql"""
        host = os.environ.get("db_host", "localhost")
        port = os.environ.get("db_port")
        user = os.environ.get("db_user")
        self.database = os.environ.get("db_name", "allocation_db")
        self.conn = psycopg2.connect(
            host=host, port=port, user=user, dbname=self.database
        )
        self.conn.set_isolation_level(ISOLATION_LEVEL_SERIALIZABLE)

    def post_allocate(self) -> None:
        """Close connection"""
        self.conn.close()

    def _get_value_from_result(self, result: Optional[tuple[int]]) -> Optional[int]:
        if result and result[0]:
            return result[0]
        return None

    def allocate_for_id(self, serviceid: UUID) -> int:
        """Allocate in transaction"""
        with self.conn.cursor() as cursor:
            cursor.execute(
                "SELECT allocated_value FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, serviceid),
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return allocated_value
            cursor.execute(
                "SELECT max(allocated_value) FROM allocation where attribute=%s",
                (self.attribute,),
            )
            result = cursor.fetchone()
            current_max_value = self._get_value_from_result(result)
            allocated_value = current_max_value + 1 if current_max_value else 1
            cursor.execute(
                "INSERT INTO allocation (attribute, owner, allocated_value) VALUES (%s, %s, %s)",
                (self.attribute, serviceid, allocated_value),
            )
            self.conn.commit()
            return allocated_value

    def has_allocation_in_inventory(self, serviceid: UUID) -> bool:
        """
        Check whether a VLAN ID is allocated by the service instance with the given id.
        """
        with self.conn.cursor() as cursor:
            cursor.execute(
                "SELECT allocated_value FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, serviceid),
            )
            result = cursor.fetchone()
            allocated_value = self._get_value_from_result(result)
            if allocated_value:
                return True
            return False

    def de_allocate(self, serviceid: UUID) -> None:
        """
        De-allocate the VLAN ID allocated by the service instance with the given id.
        """
        with self.conn.cursor() as cursor:
            cursor.execute(
                "DELETE FROM allocation WHERE attribute=%s AND owner=%s",
                (self.attribute, serviceid),
            )
            self.conn.commit()


@resource("vlan_assignment::PGAllocation", agent="agent", id_attribute="service_id")
class PGAllocationResource(PurgeableResource):
    fields = ("attribute", "service_id")


@provider("vlan_assignment::PGAllocation", name="pgallocation")
class PGAllocation(CRUDHandler[PGAllocationResource]):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._allocator = PGServiceIdAllocator(attribute="vlan_id")

    def pre(self, ctx: handler.HandlerContext, resource: PGAllocationResource) -> None:
        self._allocator.pre_allocate()

    def post(self, ctx: handler.HandlerContext, resource: PGAllocationResource) -> None:
        self._allocator.post_allocate()

    def read_resource(
        self, ctx: handler.HandlerContext, resource: PGAllocationResource
    ) -> None:
        if not self._allocator.has_allocation_in_inventory(resource.service_id):
            raise ResourcePurged()

    def delete_resource(
        self, ctx: handler.HandlerContext, resource: PGAllocationResource
    ) -> None:
        self._allocator.de_allocate(resource.service_id)


AllocationSpec("allocate_vlan", PGServiceIdAllocator(attribute="vlan_id"))
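
Retrying de-allocation is only safe if the operation is idempotent. The read-then-delete flow of the handler can be sketched with Python's built-in sqlite3 module standing in for the inventory. The function names below are hypothetical, and run_purge merely imitates the order in which a read and a delete would be attempted; it is not agent code.

```python
import sqlite3


def has_allocation(conn: sqlite3.Connection, attribute: str, owner: str) -> bool:
    """Check whether owner still holds a value for attribute."""
    row = conn.execute(
        "SELECT 1 FROM allocation WHERE attribute=? AND owner=?",
        (attribute, owner),
    ).fetchone()
    return row is not None


def de_allocate(conn: sqlite3.Connection, attribute: str, owner: str) -> None:
    """Release the value; deleting a missing row is a harmless no-op."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "DELETE FROM allocation WHERE attribute=? AND owner=?",
            (attribute, owner),
        )


def run_purge(conn: sqlite3.Connection, attribute: str, owner: str) -> bool:
    """Imitate one handler run; return True if a delete was performed."""
    if not has_allocation(conn, attribute, owner):
        return False  # already purged, nothing left to do
    de_allocate(conn, attribute, owner)
    return True
```

A second run for the same owner finds nothing to delete and reports the resource as already purged, so repeating the operation after a partial failure does not disturb other allocations.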