Autonomous Agents
Autonomous Agents allow you to create agents that can run on any node continuously without the need to initiate actions via RPC calls.
This feature has been available since Choria Server 0.11.0.
Use Cases
These agents are designed to run continuously for the life of the Choria Server where they manage the node or environment the node runs in.
They are not designed to replace entire systems like Puppet. Instead, they are designed to integrate with any other system via a simple shell and exit code interface. Some typical use cases are listed below.
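To illustrate the exit code interface mentioned above, here is a minimal sketch of how a supervising process might run an external command and treat exit code 0 as success and anything else as failure. The `run_check` helper is hypothetical and is not part of Choria; the commands are stand-ins for scripts you would supply.

```python
import subprocess
import sys


def run_check(command):
    """Run an external command; exit code 0 means success, anything else failure.

    Hypothetical helper for illustration only, not a Choria API.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0, result.stdout.strip()


# Stand-in commands: a real integration would call your own scripts here.
ok, output = run_check([sys.executable, "-c", "print('healthy')"])
failed_ok, _ = run_check([sys.executable, "-c", "import sys; sys.exit(2)"])
```

Because the contract is only "run a command, inspect the exit code", the integrated tool can be written in any language.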
HVAC
You could have systems that monitor the air quality of a room and an air conditioner that improves it. You do not wish to run these continuously; perhaps you built them using a few Raspberry Pis.
You’d have a sensor or set of sensors that monitor the air quality and a network-enabled switch to start and stop your HVAC.
An Autonomous Agent could be created to continuously query the air quality and turn the HVAC on and off on demand. Integration with the Raspberry Pi systems would be via scripts you supply.
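The on and off decision described above can be sketched as a small function. This is an illustration only: the threshold values, the `decide_hvac` name and the use of two separate thresholds (so the HVAC does not toggle rapidly around a single value) are assumptions, not something Choria prescribes.

```python
def decide_hvac(air_quality_index, hvac_on, turn_on_above=100, turn_off_below=50):
    """Return the desired HVAC state for a new sensor reading.

    Thresholds are hypothetical. Two thresholds are used so that readings
    between them leave the current state unchanged.
    """
    if not hvac_on and air_quality_index > turn_on_above:
        return True
    if hvac_on and air_quality_index < turn_off_below:
        return False
    return hvac_on


# Simulate a series of sensor readings, starting with the HVAC off.
state = False
history = []
for reading in [40, 120, 80, 60, 45, 70]:
    state = decide_hvac(reading, state)
    history.append(state)
```

In a real deployment the reading would come from a sensor script, and the returned state would be applied by a script that drives the network-enabled switch.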
Cron with event input
You have to gather data about the state of the machine. In general you wish to gather the data once every hour, but should a file or set of files get updated you want to run an immediate gather.
An Autonomous Agent could watch these files and execute the command every hour, or immediately when one of them changes.
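The combination of a schedule and a file trigger can be sketched as a single decision function. This is a rough illustration, assuming polling of file modification times; `should_gather` and its parameters are hypothetical names, and Choria's own watchers may work differently.

```python
import os
import tempfile


def should_gather(now, last_run, interval, watched, seen_mtimes):
    """Return True if the interval has elapsed or any watched file changed.

    seen_mtimes is updated in place with the latest modification times.
    A file seen for the first time counts as a change.
    """
    changed = False
    for path in watched:
        mtime = os.stat(path).st_mtime_ns
        if seen_mtimes.get(path) != mtime:
            seen_mtimes[path] = mtime
            changed = True
    return changed or (now - last_run) >= interval


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "input.conf")
    with open(path, "w") as f:
        f.write("v1")
    seen = {}
    hour = 3600  # the hourly schedule, in seconds
    first = should_gather(0, 0, hour, [path], seen)        # first sighting
    idle = should_gather(60, 0, hour, [path], seen)        # nothing to do
    os.utime(path, ns=(10**9, 10**9))                      # simulate an update
    on_change = should_gather(120, 0, hour, [path], seen)  # file changed
    on_timer = should_gather(3600, 0, hour, [path], seen)  # interval elapsed
```

Times are passed in as plain numbers so the logic can be exercised without waiting for a real clock.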
Container Management
You have a manifest that describes a container and its desired tag:
{
  "image": "acme/sample",
  "tag": "123"
}
You want the system to continuously monitor this file and:
- Watch for changes to the file
- Trigger a deploy of the container at this version if that is not what’s running
- Continuously monitor the health of the container
- Remediate the container by restarting it should the health check fail
- Should the manifest change at any time, redeploy the container to the new desired version
An Autonomous Agent could describe these interactions, and the integration with Docker could be shell scripts, Ruby scripts or anything else, such as Ansible.
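The steps listed above amount to a small reconciliation loop. The following sketch shows only the decision step, assuming the manifest format shown earlier; `next_action` and the action names are hypothetical, and the actual deploy and restart work would be done by scripts you supply.

```python
import json


def next_action(manifest_text, running_tag, healthy):
    """Decide what to do from the manifest, the running tag and the last health check.

    Returns a (action, image_reference) tuple. Action names are illustrative.
    """
    desired = json.loads(manifest_text)
    reference = desired["image"] + ":" + desired["tag"]
    if running_tag != desired["tag"]:
        return ("deploy", reference)   # wrong or missing version
    if not healthy:
        return ("restart", reference)  # right version, failing health check
    return ("none", None)              # nothing to do


manifest = '{"image": "acme/sample", "tag": "123"}'
updated = '{"image": "acme/sample", "tag": "124"}'

initial = next_action(manifest, None, True)
unhealthy = next_action(manifest, "123", False)
steady = next_action(manifest, "123", True)
redeploy = next_action(updated, "123", True)
```

Checking the version before the health status means a manifest change always wins over a restart, so an unhealthy container on an outdated tag is redeployed rather than restarted.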
Cluster Management
A manager host in a small cluster could health check the entire cluster and perhaps even own scaling the cluster to desired states. The various health checks, upgrade and downgrade flows, and remediation could all be implemented as Ansible Playbooks or Bolt Plans, using the Autonomous Agent simply as a scheduler and coordinator between these tools.
Monitoring
Using the included nagios watcher one can run Nagios compatible plugins:
- Run them on a schedule
- Perform auto remediation when in specific states
- Consume cloudevents about check states
- Expose monitoring state to Prometheus via the included node_exporter integration for alerting and graphing
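Nagios compatible plugins report their result through the exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. As a minimal sketch of how such codes could drive remediation in specific states, the hypothetical helpers below map an exit code to a state and decide whether to remediate. The choice to remediate only on CRITICAL is an assumption for illustration, not Choria's default.

```python
# Standard Nagios plugin exit codes.
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def classify(exit_code):
    """Map a plugin exit code to a state name; unexpected codes are UNKNOWN."""
    return NAGIOS_STATES.get(exit_code, "UNKNOWN")


def needs_remediation(state, remediate_on=("CRITICAL",)):
    """Return True when the state is one that should trigger remediation.

    The default set of states is a hypothetical policy.
    """
    return state in remediate_on


states = [classify(code) for code in (0, 1, 2, 3, 127)]
actions = [needs_remediation(s) for s in states]
```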