Partial Compiles

Partial compilation is an approach to speed up compilation when the Service Inventory contains many instances.

Ordinarily, LSM re-compiles all instances on every update, so compiles become slower as the inventory grows. Partial compiles let LSM re-compile only those instances that are relevant to the service instance being updated, so compile times no longer grow with the size of the inventory.

Implementation guidelines

  1. for every lsm::ServiceEntity,

    1. make sure to collect all resources it contains in the relation owned_resources

    2. make sure to always select the parent implementations (implement … using parents)

  2. for every Inter Service Relation,

    1. indicate if this is the relation to the owner by setting lsm::ServiceEntityBinding.relation_to_owner and lsm::ServiceEntityBinding.owner.

  3. to further improve performance, it is possible to batch partial compiles together by enabling enable_batched_partial_compiles. We advise turning this on only after the model has been tested sufficiently.

Supported scenarios

Partial compiles are possible when

  1. Service Instances are unrelated: service instances don’t share any resources and don’t depend on each other in any way. This only requires correctly setting owned_resources.

  2. Services form groups under a common owner.

    • Instances within the group can freely depend on each other and share resources, but nothing is shared across groups.

    • One specific instance is designated as the common owner of the group.

    • Instances cannot be moved to another group. The model should prevent this type of update.

    • This additionally requires indicating what the owner of any service is, by setting lsm::ServiceEntityBinding.owner and lsm::ServiceEntityBinding.relation_to_owner. This does not have to be the root owner directly; the ownership hierarchy is allowed to form a tree, with intermediate owners below the root owner.

  3. Service instances and groups can depend on shared resources, as long as these resources are identical for all service instances and groups.

  4. Any combination of the above.

For any other scenario, custom partitioning logic is required (see Custom selectors below).

How it works for unrelated services

For unrelated services, LSM builds on the normal resource-set-based partial compiles by automatically creating a single resource set for each service instance.

To add resources to the instance’s resource set, add them to its lsm::ServiceBase.owned_resources relation and make sure to select the parents implementation for your service entities. LSM then populates the resource set and correctly triggers the related compiles and exports.

Example with Inter Service Relations

As an example, consider the following model for managing ports and routers. Both are independent services, but a port can only be managed in combination with its router and all of its sibling ports. (This is not true in general: ports are often managed without managing the entire router, but it serves well as an example.)

This model is not much different from normal Inter Service Relations, except for lines 29, 38, 58-59.

main.cf
 1  import lsm
 2  import lsm::fsm
 3  import std::testing
 4
 5  entity Router extends lsm::ServiceEntity:
 6      """
 7          A service for managing routers
 8      """
 9      string mgmt_ip
10  end
11
12  index Router(instance_id)
13
14  entity Port extends lsm::ServiceEntity:
15      """
16          A service for managing ports on routers
17      """
18      string name
19  end
20
21  index Port(instance_id)
22
23  Port.router [1] lsm::__service__, lsm::__rwplus__ Router
24  """ An Inter Service Relation between Router and Port"""
25
26  implementation router_config for Router:
27      """ Add a dummy resource to the router to represent actual configuration """
28      self.resources += std::testing::NullResource(name=self.mgmt_ip)
29      self.owned_resources += self.resources # We own all our resources and nothing else
30  end
31
32  implementation port_config for Port:
33      """ Add a dummy resource to the Port to represent actual configuration """
34      self.resources += std::testing::NullResource(
35          name="{{self.router.mgmt_ip}}-{{self.name}}",
36          requires = self.router.resources
37      )
38      self.owned_resources += self.resources # We own all our resources and nothing else
39  end
40
41  implement Router using router_config, parents
42  implement Port using port_config, parents
43
44  # Service binding for Router
45  binding_router = lsm::ServiceEntityBinding(
46      service_entity="__config__::Router",
47      lifecycle=lsm::fsm::simple_with_delete_validate,
48      service_entity_name="router",
49      service_identity="mgmt_ip",
50  )
51
52  # Service binding for Port
53  binding_port = lsm::ServiceEntityBinding(
54      service_entity="__config__::Port",
55      lifecycle=lsm::fsm::simple_with_delete_validate,
56      service_entity_name="port",
57      service_identity="name",
58      relation_to_owner="router", # required for Partial Compile
59      owner=binding_router, # required for Partial Compile
60  )
61
62  # Normal Service unrolling
63  for instance in lsm::all(binding_router):
64      Router(
65          instance_id = instance["id"],
66          entity_binding = binding_router,
67          **instance["attributes"],
68      )
69  end
70
71  for instance in lsm::all(binding_port):
72      Port(
73          instance_id = instance["id"],
74          entity_binding = binding_port,
75          name = instance["attributes"]["name"],
76          router = Router[instance_id=instance["attributes"]["router"]]
77      )
78  end

How it works

To better understand how this works, there are two things to consider:

  1. how to divide the resources into resource sets

  2. how to get the correct instances into the model

Resource sets

The key mechanism behind partial compiles is ResourceSets: all resources in the desired state are divided into groups. When building a new desired state, instead of replacing the entire desired state, we only replace a specific ResourceSet. Resources in a ResourceSet cannot depend on Resources in other ResourceSets.
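
To make the replacement semantics concrete, the following Python sketch (purely illustrative, not the actual server logic) models a desired state as a mapping from resource-set name to resources and merges in a partial export:

def merge_partial(
    old_state: dict[str, list[str]],
    partial_export: dict[str, list[str]],
) -> dict[str, list[str]]:
    # Only the resource sets present in the partial export are replaced;
    # every other set is carried over from the previous desired state.
    new_state = dict(old_state)
    new_state.update(partial_export)
    return new_state


old = {
    "set-router-r1": ["NullResource[r1]"],
    "set-router-r2": ["NullResource[r2]", "NullResource[r2-eth0]"],
}
# A partial compile for r1 only replaces r1's set; r2's set is untouched
new = merge_partial(old, {"set-router-r1": ["NullResource[r1]", "NullResource[r1-eth0]"]})
assert new["set-router-r2"] == old["set-router-r2"]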

To make this work, we have to assign every Service Instance to a ResourceSet, such that the set has no relations to any other ResourceSet.

In practice, we do this by putting all Resources in the ResourceSet of the owning entity.

[Figure: two green ResourceSets, one per router. The set for Router r1 contains NullResource(name=r1), NullResource(name=r1-eth0) and their LifecycleTransfer resources; the set for Router r2 is analogous. All requires edges stay within a single set.]

Resource sets for the Router example with two routers, each with one port. Arrows represent the requires relation.

In addition to the ResourceSets used by individual services, there are also Resources that are not in any set. These Resources can be shared by multiple services, with the limitation that every compile that produces them has to produce them exactly the same. For more information, see Partial Compiles.

Service Instance Selection

To gain efficiency when recompiling, it is important to build the model for only those Service Instances that are in the ResourceSet we want to update, and nothing else.

This selection is done automatically within lsm::all, based on the relations set between the service bindings as explained above.

The underlying mechanism is that when we recompile for a state change on any Service Instance, we first find its owner by traversing lsm::ServiceEntityBinding.relation_to_owner until we reach a single root owner. Then we traverse back down the same relation until we have collected all of its children. lsm::all will return only these children and nothing else.
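
The following self-contained Python sketch illustrates this walk on a toy data structure. The Instance class and its attributes are invented for illustration only; the real traversal happens inside LSM.

from dataclasses import dataclass, field


@dataclass(eq=False)
class Instance:
    """Toy stand-in for a service instance and its ownership relations."""
    name: str
    owner: "Instance | None" = None               # target of relation_to_owner
    children: "list[Instance]" = field(default_factory=list)


def instances_to_compile(changed: Instance) -> list[Instance]:
    # Step 1: traverse relation_to_owner upwards until the single root owner
    root = changed
    while root.owner is not None:
        root = root.owner
    # Step 2: traverse back down, collecting the root and all its children
    selected, todo = [], [root]
    while todo:
        instance = todo.pop()
        selected.append(instance)
        todo.extend(instance.children)
    return selected


# A router owning two ports: a change to one port selects all three instances
r1 = Instance("router-r1")
ports = [Instance("port-eth0", owner=r1), Instance("port-eth1", owner=r1)]
r1.children = ports
assert {i.name for i in instances_to_compile(ports[0])} == {
    "router-r1", "port-eth0", "port-eth1"
}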

Custom selectors

Complex topologies (with multiple parents or cross-relations) require a custom selector. The selector is a piece of Python code that determines which instances are returned by lsm::all.

The API for the selector is inmanta_plugins.lsm.partial.SelectorAPI. To register the selector, use inmanta_plugins.lsm.global_cache.set_selector_factory.

For example, consider a tunnel service that connects two ports. To compile the tunnel, we also need to load the ports. This is not a tree: each tunnel has not one but two parents. We therefore build a custom selector.

main.cf
import lsm
import lsm::fsm
import std::testing

entity Tunnel extends lsm::ServiceEntity:
    """
        A service for managing tunnels
    """
    string tunnel_id
end

entity Port extends lsm::ServiceEntity:
    """
        A service for managing ports in a network
    """
    string name
end

index Port(instance_id)

Tunnel.ports [2:2] lsm::__service__, lsm::__rwplus__ Port
""" An Inter Service Relation between Tunnel and Port"""

implementation tunnel_config for Tunnel:
    self.owned_resources += self.resources
    for port in self.ports:
        # dummy resource to represent a tunnel endpoint config on the port
        self.resources += std::testing::NullResource(name=f"{self.tunnel_id} {port.name}")
        self.owned_resources += self.resources
    end
end

implement Tunnel using tunnel_config, parents
implement Port using parents
# Port has no actual config in itself

# Service binding for Tunnel
binding_tunnel = lsm::ServiceEntityBinding(
    service_entity="__config__::Tunnel",
    lifecycle=lsm::fsm::simple_with_delete_validate,
    service_entity_name="tunnel",
    service_identity="tunnel_id",
)

# Service binding for Port
binding_port = lsm::ServiceEntityBinding(
    service_entity="__config__::Port",
    lifecycle=lsm::fsm::simple_with_delete_validate,
    service_entity_name="port",
    service_identity="name",
)

# Normal Service unrolling
for instance in lsm::all(binding_tunnel):
    Tunnel(
        instance_id = instance["id"],
        entity_binding = binding_tunnel,
        tunnel_id = instance["attributes"]["tunnel_id"],
        ports = [Port[instance_id=port_id] for port_id in instance["attributes"]["ports"]],
    )
end

for instance in lsm::all(binding_port):
    Port(
        instance_id = instance["id"],
        entity_binding = binding_port,
        name = instance["attributes"]["name"],
    )
end
__init__.py
from inmanta_plugins import lsm
from inmanta_plugins.lsm import partial


class TunnelSelector(partial.AbstractSelector):

    def select_all(self) -> dict[str, list[dict]]:
        # Collect all port ids we need
        port_ids = set()
        # Collect all tunnel ids we need
        tunnel_ids = set()
        # Go over all instances requested
        for current_instance_id in self.requested_instances:
            # Find the actual instance to find its type
            service_instance = lsm.global_cache.get_instance(
                env=self.env,
                service_entity_name=None,  # We don't know yet
                instance_id=current_instance_id,
                include_terminated=True,
            )
            if service_instance is None:
                raise RuntimeError(
                    f"Cannot find any instance with id {current_instance_id} in "
                    f"environment {self.env}"
                )

            # Now we know which service it is
            service_entity_name = service_instance["service_entity"]

            # Make sure our instance is cached
            lsm.global_cache.get_all_instances(
                self.env,
                service_entity_name=service_entity_name,
            )

            if service_entity_name == "tunnel":
                # Get all ports we need now
                for port in self._get_attribute(service_instance, "ports"):
                    port_ids.add(port)
                tunnel_ids.add(current_instance_id)
            elif service_entity_name == "port":
                port_ids.add(current_instance_id)
            else:
                raise Exception(
                    f"This selector is only intended to handle ports and tunnels, but got: {service_entity_name}"
                )
        # Convert ids to instances
        all_selected = {}
        all_selected["port"] = [
            lsm.global_cache.get_instance(
                env=self.env,
                service_entity_name="port",
                instance_id=port_id,
            )
            for port_id in port_ids
        ]
        all_selected["tunnel"] = [
            lsm.global_cache.get_instance(
                env=self.env,
                service_entity_name="tunnel",
                instance_id=tunnel_id,
            )
            for tunnel_id in tunnel_ids
        ]

        return all_selected


lsm.global_cache.set_selector_factory(TunnelSelector)

When designing a custom selector, keep in mind the intrinsic limitations of Partial Compiles.

class inmanta_plugins.lsm.partial.SelectorAPI(env: str)

The Selector is responsible for determining which instances are returned by lsm::all

A specific selector class can be registered using inmanta_plugins.lsm.global_cache.set_selector_factory

A selector is used in 4 phases:

  1. the Selector is constructed (empty)

  2. it is fed all relevant bindings to analyze and cache.

  3. it is fed all instances requested via the environment variable inmanta_instance_id as used for partial compile

  4. it returns the instances selected

All methods can be called multiple times, but once a method from the next phase is called, methods from the previous phase should no longer be called (for the same binding).
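
As an illustration of these phases, here is a minimal, hypothetical selector that simply returns the requested instances. The method signatures follow the API documented below; the internal bookkeeping attributes (_env, _instance_ids) are our own invention.

from collections.abc import Sequence

from inmanta_plugins import lsm
from inmanta_plugins.lsm import partial


class PassthroughSelector(partial.SelectorAPI):
    def __init__(self, env: str) -> None:  # phase 1: constructed (empty)
        super().__init__(env)
        self._env = env
        self._instance_ids: set[str] = set()

    def reload_bindings(self) -> None:  # phase 2: analyze bindings
        pass  # only needed for very advanced selectors, see below

    def register_instances(self, instance_ids: Sequence[str]) -> None:  # phase 3
        self._instance_ids.update(instance_ids)

    def select(self, requested_service_type: str) -> list[dict[str, object]]:  # phase 4
        # Warm the cache: instances returned here must also be in the global cache
        lsm.global_cache.get_all_instances(
            self._env, service_entity_name=requested_service_type
        )
        instances = (
            lsm.global_cache.get_instance(
                env=self._env,
                service_entity_name=requested_service_type,
                instance_id=instance_id,
            )
            for instance_id in self._instance_ids
        )
        # get_instance returns None for ids of another type; drop those
        return [i for i in instances if i is not None]


lsm.global_cache.set_selector_factory(PassthroughSelector)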

abstract register_instances(instance_ids: Sequence[str]) → None

Register explicitly requested instances (phase 3). Can be called multiple times.

abstract reload_bindings() → None

Register a new binding (phase 2):

This method is only required for very advanced selectors that need to inspect the binding structure.

This method checks the binding cache for new bindings and registers them, e.g.:

def reload_bindings(self) -> None:
    for (
        name,
        binding,
    ) in dict(inmanta_plugins.lsm.global_cache.get_all_bindings()).items():
        if name not in self.root_for:
            self.register_binding(binding)

Implementors, keep in mind that:

  1. the method can be re-executed because of unset exceptions

  2. any binding additionally required MUST be registered in the global cache (inmanta_plugins.lsm.global_cache)

abstract select(requested_service_type: str) → list[dict[str, object]]

Return all instances for a specific type (phase 4)

Can be called multiple times.

All instances must also be cached in the global cache

The selector is expected to closely integrate with the inmanta_plugins.lsm.CacheManager.
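
For example, a selector typically interacts with the cache roughly as follows. This is a sketch using only the methods documented below; the environment id is a made-up placeholder, and the "id" key is assumed to be how the server dict exposes the instance id.

from inmanta_plugins import lsm

env = "00000000-0000-0000-0000-000000000000"  # placeholder environment id

# Warm the cache for a service entity; subsequent calls hit the cache
ports = lsm.global_cache.get_all_instances(env, service_entity_name="port")

# Fetch a single instance (assumes at least one port exists);
# force=True refreshes it from the server instead of using the cache
port = lsm.global_cache.get_instance(
    env=env,
    service_entity_name="port",
    instance_id=ports[0]["id"],
    force=True,
)

# State/version lookup works only for instances fetched via get_all_instances
state, version = lsm.global_cache.get_instance_state(ports[0]["id"])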

class inmanta_plugins.lsm.CacheManager

Entry point for all internal caches in LSM, accessed via global_cache.

Also caches a connection to the server.

convert_instance(instance: dict, validation: bool, instance_id: str | None) → dict | None

Convert an instance from the API form to the return format of lsm::all

Parameters:
  • instance – The instance dict as returned by the api

  • validation – Whether this is a validation compile

  • instance_id – The id of the instance being validated (if this is a validation compile)

get_all_bindings() → dict[str, lsm::ServiceEntityBinding]

Return all bindings that have been registered

get_all_instances(env: str, service_entity_name: str, force: bool = False) → list[dict]

Get all (non-terminal) instances from the server for a specific environment and service_entity_name. The result is cached and any subsequent call uses the cache.

Parameters:
  • env – the environment to use.

  • service_entity_name – the name of specific service entity for which to fetch the instances.

  • force – when true, the cache is refreshed from the server

Returns:

all instances, as dicts, in the format returned by the server.

get_instance(env: str, service_entity_name: str | None, instance_id: str | UUID, force: bool = False, include_terminated: bool = False) → dict | None

Return the service instance with the given environment, service_entity_name and instance_id or None if no such instance exists.

Parameters:
  • force – when true, the cache is refreshed from the server

  • include_terminated – when trying to pull a specific instance that is not in the cache, try to get it from the API directly, so that it is returned even when terminated.

get_instance_state(instance_id: str) → Tuple[str, int]

Get the current state and version for a specific instance. Can only be called for instances retrieved via get_all_instances

Parameters:

instance_id – the uuid for the service instance

Returns:

current state and version for the specific instance

register_binding(binding) → dict[str, object]

Register a lifecycle binding and return its lifecycle

reset() → None

Reset all state: drop all caches and renew the connection

set_selector_factory(selector_factory: Callable[[str], inmanta_plugins.lsm.partial.SelectorApi] | None) → None

Set the selector factory that will produce a selector for partial compile

For testing: the factory is not reset when the module is reloaded. To reset, set it to None.
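
For example, a test teardown could reset it explicitly (a minimal sketch):

from inmanta_plugins import lsm

# Drop the custom selector; subsequent compiles use the default selection
lsm.global_cache.set_selector_factory(None)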

Limitations

  1. When doing normal compiles, the model can very effectively find conflicts between services (e.g. using indexes), because it has an overview of all instances.

    When using partial compiles, conflicts between groups cannot be detected, because the compiler never sees them together. This means that the model must be designed to be conflict-free, or rely on an (external) inventory to avoid conflicts. This is why we always advise running models in full compile mode until performance becomes an issue: it gives the model time to mature and to surface subtle conflicts.

For more details, see the limitations section in the core documentation.

Further Reading