Centralised AAA

Choria is a very loosely coupled system with no central controller and, in fact, no shared infrastructure other than a middleware that is completely “dumb”. This means there is no central per-request processing beyond shifting packets: no inventory databases, user databases, or other shared infrastructure to scale or maintain - though several integration options exist should you choose to use them.

There are many reasons for this - in a large scale environment there are always things broken, and automation systems should do their best to keep working even in the face of uncertainty. This design extends from the servers and middleware all the way to the client code. The loosely coupled design ensures that what can be managed will be managed.

This is generally fine and works within my design parameters and goals. For the client in enterprise environments, though, this is problematic:

  • Enterprises are heavily invested in SSO and entitlement-based flows for permissions
  • Enterprises and regulated environments have strong requirements for auditing to centralized systems
  • Certificate management for individual users is a nearly impossible hurdle to scale

So today I would like to present a new extension point that allows you to fully centralize AAA for the Choria CLI.

[Read More]

Limiting Clients to IP Ranges

The upcoming set of releases has a strong focus on security. We will introduce a whole new way to build centralized AAA, should a site desire it, along with a few smaller enhancements. One of these is the ability to limit where clients can be used on your network.

Today the security model allows anyone with correctly issued and signed certificates to make client requests from anywhere on your network. This is generally fine as the certificates are not to be shared; however, there may be rogue clients on your network, perhaps outside of your update strategy or acting as a form of shadow orchestration system. You might also be concerned that, with just a server certificate, one can read every reply the network sends - replies that might contain sensitive information.

If this concerns you, the upcoming version 0.10.0 of the Choria Broker will include the ability to limit what networks clients can connect from.
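The excerpt does not show the configuration syntax, but the check underneath such a feature is a plain CIDR match. Here is a minimal sketch in Go using only the standard library - the function name and rule format are my own for illustration, not Choria’s actual code or configuration:

```go
package main

import (
	"fmt"
	"net"
)

// clientAllowed reports whether a client IP falls inside any of the
// allowed CIDR ranges, e.g. the management networks clients may use.
func clientAllowed(clientIP string, allowedCIDRs []string) bool {
	ip := net.ParseIP(clientIP)
	if ip == nil {
		return false
	}

	for _, cidr := range allowedCIDRs {
		_, network, err := net.ParseCIDR(cidr)
		if err != nil {
			continue // skip malformed rules
		}
		if network.Contains(ip) {
			return true
		}
	}

	return false
}

func main() {
	allowed := []string{"10.1.0.0/16", "192.168.88.0/24"}

	fmt.Println(clientAllowed("10.1.4.9", allowed))   // true
	fmt.Println(clientAllowed("172.16.0.1", allowed)) // false
}
```

A broker would apply a check like this to the remote address of each connecting client before completing the TLS handshake and session setup.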

[Read More]

Choria Lifecycle Events

Events are small JSON documents that describe something happening in a system. Events come in many forms but usually they indicate things like startup, shutdown, aliveness, problems or major completed tasks. They tend to be informational and so should be considered lossy - in other words, do not expect to get a shutdown event for every shutdown that happens, as some kinds of shutdown can prevent the event from reaching you. Likewise, startup events can be lost when the middleware connection is flaky.

These events come in many flavours and there are not really many standards around them. One effort, CloudEvents from the CNCF, looks to be on a good path, and once it matures we’ll look to adopt it as the underlying format for our lifecycle messages too.

In Choria we call these Lifecycle Events. I recently released initial version 1.0.0 of the package that manages them; this post introduces what we have today and what we use them for.

These kinds of events allow other tools to react to what Choria components are doing. Some uses:

  • Create a dashboard of active versions of a component by passively observing the network - use startup, shutdown and alive events.
  • React to nodes starting up by activating other orchestration systems like continuous delivery systems
  • React to a specific component starting up and provision it asap

There are many other cases where an event flow is useful and in time we will add richer event types.

Today Choria Server, Choria Backplane and Choria Provisioner produce events, while Choria Provisioner also consumes them. We are a bit conservative about when and where we emit events: the clusters we support can be in the 50k node range, so we need to consider each type of event, and the need for it, carefully.

Read on for full details.

[Read More]

CfgMgmtCamp 2019

I will be giving a talk at the 2019 installment of CfgMgmtCamp in Ghent, held 4 to 6 February 2019.

The talk will be focused on Choria Data Adapters, NATS Streaming and metadata, and will discuss the design of the Choria Stream Replicator.

I’ll hopefully also show off something new I’ve been hacking on, on and off!

The CFP submission can be seen below the fold, I hope to see many Choria users there!

[Read More]

Choria Server 0.9.0

Today I released version 0.9.0 of the Choria Server along with an update to the Ruby plugin for MCollective.

This is a significant milestone release that gives us full support for custom Certificate Authorities, including chains of intermediates. The Choria Provisioner supports requesting CSRs from nodes and supplying those nodes with signed certificates, and you can integrate it with any CA of your choosing that exposes an API.
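As a sketch of what “chains of intermediates” means in practice, here is a self-contained Go program - standard library only, not Choria’s actual code - that builds a root CA, an intermediate CA and a leaf certificate, then verifies the leaf through the full chain:

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// sign creates a certificate from tmpl, signed by parent using parentKey.
func sign(tmpl, parent *x509.Certificate, pub *ecdsa.PublicKey, parentKey *ecdsa.PrivateKey) (*x509.Certificate, error) {
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, pub, parentKey)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// buildAndVerifyChain creates root -> intermediate -> leaf and verifies
// the leaf against the root with the intermediate in the chain.
func buildAndVerifyChain() error {
	now := time.Now()
	caTmpl := func(serial int64, cn string) *x509.Certificate {
		return &x509.Certificate{
			SerialNumber:          big.NewInt(serial),
			Subject:               pkix.Name{CommonName: cn},
			NotBefore:             now,
			NotAfter:              now.Add(time.Hour),
			IsCA:                  true,
			BasicConstraintsValid: true,
			KeyUsage:              x509.KeyUsageCertSign,
		}
	}

	rootKey, _ := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	rootTmpl := caTmpl(1, "root-ca")
	root, err := sign(rootTmpl, rootTmpl, &rootKey.PublicKey, rootKey) // self-signed
	if err != nil {
		return err
	}

	interKey, _ := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	inter, err := sign(caTmpl(2, "intermediate-ca"), root, &interKey.PublicKey, rootKey)
	if err != nil {
		return err
	}

	leafKey, _ := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	leaf, err := sign(&x509.Certificate{
		SerialNumber: big.NewInt(3),
		Subject:      pkix.Name{CommonName: "node1.example.net"},
		NotBefore:    now,
		NotAfter:     now.Add(time.Hour),
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	}, inter, &leafKey.PublicKey, interKey)
	if err != nil {
		return err
	}

	roots, inters := x509.NewCertPool(), x509.NewCertPool()
	roots.AddCert(root)
	inters.AddCert(inter)

	_, err = leaf.Verify(x509.VerifyOptions{
		Roots:         roots,
		Intermediates: inters,
		KeyUsages:     []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	})
	return err
}

func main() {
	if err := buildAndVerifyChain(); err != nil {
		fmt.Println("verification failed:", err)
		return
	}
	fmt.Println("leaf verified via intermediate to root")
}
```

The key point is the `Intermediates` pool in `x509.VerifyOptions`: a node presenting only its leaf certificate cannot be verified against the root unless the verifier also knows the intermediates in between.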

We’ve also fixed some bugs, tweaked some things and generally iterated ever forward.

[Read More]

Puppet 6 Support

Back in July 2018 Puppet Inc officially announced that The Marionette Collective was being deprecated and would not be included in future Puppet Agent releases.

This presented a problem for us as we relied on that packaging to install MCollective, its services and its libraries. We would now have to do all of this ourselves.

At the same time I was working on the Choria Server and on giving it backward-compatibility capabilities (still in progress toward 100%), so we could not support Puppet 6 on its release day.

Today we published a number of releases, and as of version 0.12.0 of the choria/choria module we support Puppet 6 out of the box.

[Read More]

Mass Provisioning Choria Servers

The Choria Server is the agent component of the Choria Orchestrator system; it runs on every node and maintains a connection to the middleware.

Traditionally we’ve configured it using Puppet along with its mcollective compatibility layer, and we intend to keep this model for the foreseeable future. Choria Server, though, has many more uses - it’s embeddable, so it can be used in IoT, in tools like our go-backplane, as sidecars in Kubernetes, and more. In these and other cases the Puppet model does not work:

  • You do not have CM at all
  • You do not own the machines where Choria runs; you provide an orchestration service to other teams
  • You are embedding the Choria Server in your own code, perhaps in an IoT device where Puppet does not make sense
  • Your scale makes using Puppet not an option
  • You wish to have very dynamic decision making about node placement
  • You wish to integrate Choria into your own Certificate Authority system

In all these cases there are real, complex problems to solve in configuring Choria Server. We’ve built a system that can help solve them; it’s called the Choria Server Provisioner and this post introduces it.

[Read More]

50 000 node network

I’ve been saying for a while now that my aim with Choria is that someone can get a 50 000 node Choria network that just works without tuning - by default, that should be the minimum scale it supports.

I started working on a set of emulators to let you confirm this yourself - and for me to use during development to ensure I do not break this promise - though that got a bit sidetracked as I wanted to do less emulation and more running of 50 000 instances of actual Choria; more on that in a future post.

Today I want to talk a bit about an actual deployment of 50 000 real nodes and how I got there - the good news is that it’s terribly boring since, as promised, it just works.

[Read More]