Table of Contents

Introduction

CloudBees Jenkins Platform Private SaaS Edition (Private SaaS Edition) provides CJOC and CJE instances as a service on most cloud providers, both public, such as Amazon Web Services (AWS), and private, such as OpenStack.

Key features include:

Scalability
  • Easily create a Private SaaS Edition cluster to manage builds across an organization

  • Create new masters at the click of button

  • Efficiently utilize infrastructure capacity using ephemeral agents (formerly called slaves)

  • Add and remove build capacity over time as needed

High Availability
  • Automatic fault detection and recovery

  • Off-cluster storage backup and recovery

  • Monitoring and alerting

Managed Operations
  • Efficient setup for a complete Private SaaS Edition cluster in under an hour

  • Controlled upgrades and cluster operations

  • Automatic cluster monitoring and healing

Customizable
  • Fully configurable masters

  • Customize Jenkins master and agent images using Docker images

Terms and Definitions

Jenkins (also referenced as Jenkins OSS in CloudBees documentation)

Jenkins is an open-source automation server. It is developed by an independent open-source community, to which CloudBees actively contributes. You can find more information about Jenkins OSS and CloudBees contributions on the CloudBees site.

CJE

CloudBees Jenkins Enterprise - a commercial version of Jenkins based on Jenkins OSS Long-Term Support (LTS) releases with frequent patches by CloudBees. CJE also provides a number of plugins that help organizations address the main needs of enterprise installations: security, high availability, continuous delivery, and so on.

CJOC

CloudBees Jenkins Operations Center - an operations console for Jenkins that allows you to manage multiple Jenkins masters from a single place. See details on the CloudBees site.

CJA

CloudBees Jenkins Analytics - provides insight into your usage of Jenkins and the health of your Jenkins cluster by reporting events and metrics to CloudBees Jenkins Operations Center.

PSE

CloudBees Jenkins Platform Private SaaS Edition - leverages your cloud infrastructure to provision instances of CJE for your teams with a one-click interface in CJOC.

PSE Cluster

A set of VMs and their associated software that constitutes PSE.

ELB

Elastic Load Balancer. A load balancer service provided by Amazon in AWS.

Architectural Overview

Components

architecture

Server roles

There are two kinds of servers in a Private SaaS Edition cluster: controllers and workers. Each server of either kind is a virtual machine in the cloud environment on which the cluster is installed.

Controller

Within the cluster, controllers are the brains. They decide what tasks need to be executed, and where those tasks should run. They also answer external requests, and route traffic within the cluster.

Note
For a High Availability (HA) cluster setup, you should have at least 3 controllers.

Using multiple controllers enables you to avoid the risks associated with what could otherwise be a single point of failure. In environments that support it, a load balancer is set up to distribute traffic across the controller instances.

A controller instance has the following responsibilities:

  • Manages cluster resources by tracking used and available CPU and memory across the cluster and offering them to execute tasks.

  • Schedules tasks, checking service availability and making sure the number of service instances defined on the cluster is actually running.

  • Routes web traffic, forwarding incoming traffic to the right service depending on the requested service and its current location within the cluster. It also provides SSL termination, if configured.

Worker

Workers provide computing capacity within the cluster, which is used to host and run end-user applications including CJOC, Jenkins masters and agents (formerly called slaves). You can increase cluster performance and resiliency by adding more workers and configuring workers with more memory and CPU.

Workers can be added and removed over time to adjust cluster capacity for the sake of meeting demand efficiently.

Some workers are dedicated to builds, while other workers are dedicated to other components. Elasticsearch may also have dedicated workers.

Additional worker capacity enables Private SaaS Edition to run more builds in parallel, improving the performance of the cluster. Adding workers also increases cluster resiliency, because spare workers are available in the event of a worker outage. This minimizes Jenkins master and agent downtime in the event of a server failure.

Load Balancing

On Amazon Web Services (AWS), Elastic Load Balancers (ELBs) are set up to target all cluster controllers. Depending on the configuration, at least two ELBs are created: one to route internal cluster traffic and one to route traffic from the outside.

On OpenStack, PSE creates and manages its own reverse proxy for load distribution.

High Availability

On a Private SaaS Edition cluster, service availability is provided through automatic failover within the cluster. Healthchecks are performed on all Jenkins masters in the cluster. If a healthcheck determines that a master is down, the cluster reschedules the master, potentially on a different node. The master then goes through its starting sequence and becomes available again.

How persistence is handled

A Jenkins master needs a persistent filesystem to store configuration, jobs, and build information. Within a Private SaaS Edition cluster, a Jenkins master may be run on any worker. If a worker fails, the Jenkins master will be rescheduled on a different worker. How then does it find its persisted data?

Each worker runs a Volume Service that provides this capability. Different backend implementations are available depending on the infrastructure: NFS, EBS (AWS only), and rsync.

When a Jenkins master is scheduled on a worker, it requests its workspace from the Volume Service. The Volume Service provides it, and the master can then start. During its execution, the master regularly asks the Volume Service to back up its workspace.

If the master later restarts elsewhere, the Volume Service will provide the last backup it has completed.

Additional Services

Elasticsearch

A Private SaaS Edition cluster comes pre-installed with an Elasticsearch cluster.

Elasticsearch is the persistence layer for CloudBees Jenkins Analytics, and Private SaaS Edition also uses it to store cluster logs.

A minimum of 3 workers is recommended for this functionality to perform well.

System Requirements

Controller Workstation

The Private SaaS Edition installer is available for the following platforms:

  • Mac OS X (release 10.12.x, also known as "Sierra," is recommended)

  • Linux

The machine on which the installation script runs must have the following:

  • Python 2.7

  • Network access to the future Private SaaS Edition cluster VMs

  • Disk Backup for post-installation configuration payload

Python 2.7

You can verify your version of Python by running:

$ python -V
Python 2.7.12

In addition, the following infrastructure-related pieces have to be in place before the installation starts, if you are planning to use them:

Domain Name

Work with your IT/Operations department to create a new domain name for the Private SaaS Edition Cluster.

NFS Server

If you are planning to use NFS as common storage, work with your IT/Operations department to create an NFS server and a mount point.
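
As a rough sketch (the export path and client network are examples only; exact options depend on your environment and security requirements), an export on the NFS server might look like this:

# /etc/exports on the NFS server
/srv/nfs4/pse    10.16.0.0/16(rw,sync,no_root_squash)

$ sudo exportfs -ra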

Depending on your cloud environment, refer to either the OpenStack or the AWS environment section for additional specific requirements.

Cluster Environment: OpenStack

On OpenStack, you will need to provide the following information:

  • OpenStack credentials — user name, password, tenant name and authentication URL

Note
The user must have rights to create server instances, security groups, keypairs, and floating IPs.
  • OpenStack network and floating IP network to assign IPs to the created instances.

  • If you are planning to use floating IPs, ensure that there are floating IP addresses available for use by the Private SaaS Edition Cluster.

  • CJP Private SaaS Edition image, or alternatively an Ubuntu 14.04 server image that can be downloaded from Ubuntu cloud images (typically trusty-server-cloudimg-amd64-disk1.img) and installed using the Horizon web UI or glance. Using the CJP Private SaaS Edition image is recommended; it greatly reduces installation time.

glance image-create \
  --copy-from https://cloudbees-pse-images.s3.amazonaws.com/cloudbees-pse-ubuntu-1.5.3.qcow2 \
  --disk-format qcow2 \
  --container-format bare \
  --name 'cloudbees-pse-ubuntu-1.5.3'
Note
If the glance client does not support --copy-from, use
curl https://cloudbees-pse-images.s3.amazonaws.com/cloudbees-pse-ubuntu-1.5.3.qcow2 | \
glance image-create \
  --disk-format qcow2 \
  --container-format bare \
  --name 'cloudbees-pse-ubuntu-1.5.3'

Cluster Environment: Amazon Web Services

CJP Private SaaS Edition works with all of the Amazon Elastic Compute Cloud (Amazon EC2) instance types, although some of them only work as designed in conjunction with a virtual private cloud.

AWS Credentials

There are two ways to configure an AWS environment: with or without support for EBS Storage. Regardless of which way you choose, AWS configuration requires two different sets of credentials.

First, you will need a set of credentials that will be used to set up the Private SaaS Edition environment. We recommend creating an IAM user with a dedicated policy that allows the actions needed to create the environment.

An example IAM policy you may use for this user is shown below:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "NotAction": [
        "iam:*",
        "sts:*"
      ],
      "Resource": "*"
    }
  ]
}

Second, you will need a set of credentials that will be used for the worker process that is responsible for volume and snapshot management. There are two ways to furnish these credentials.

By default, the same set of credentials used above will also be used for this service. These credentials will be passed into the container at install time and used for the lifetime of the service. No extra configuration needs to be performed in order to use this configuration.

Using IAM Roles

Another way to achieve this is to specify an instance profile to be used by the worker instances. This is specified in the $PROJECT/cluster-init.config file, in the [aws] section, alongside the above credentials. The variable name is worker_instance_profile.
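
For illustration, the relevant part of cluster-init.config might look like the following (the profile name is a placeholder for the instance profile you create in IAM):

[aws]
region = us-east-1
worker_instance_profile = <name-of-your-worker-instance-profile>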

In order to use this instance profile, you must have created a role in the IAM settings. The role must have a trust policy defined that allows it to be launched via the EC2 service. For example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

In addition, because the credentialed user above will start the worker instance with this role, it must explicitly have the iam:PassRole ability, which makes our initial IAM policy a bit longer:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "NotAction": [
                "iam:*",
                "sts:*"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::123412341234:role/yourrole"
        }
    ]
}
Note
If you choose to use IAM Roles, you will need to add an entry in your .aws/config; see the next section on AWS Credential Storage for more information.

Now the worker process will not have hard-coded credentials, but instead will use the EC2 metadata service to get a temporary set of credentials.

These credentials will be used to manage:

  • the lifecycle of EBS volumes and snapshots

  • data on S3 buckets

Last, this worker role needs to have a policy stating what kinds of AWS operations it can perform. A minimum set of AWS operations in policy form you can apply is:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "ec2:CreateTags",
                "ec2:CreateSnapshot",
                "ec2:CreateVolume",
                "ec2:AttachVolume",
                "ec2:DeleteVolume",
                "ec2:DetachVolume",
                "ec2:DeleteSnapshot",
                "ec2:DescribeSnapshots",
                "ec2:DescribeVolumes",
                "ec2:DescribeInstances",
                "ecr:GetAuthorizationToken",
                "ecr:BatchCheckLayerAvailability",
                "ecr:GetDownloadUrlForLayer",
                "ecr:GetRepositoryPolicy",
                "ecr:DescribeRepositories",
                "ecr:ListImages",
                "ecr:BatchGetImage"
            ],
            "Resource": [
                "*"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "s3:GetObject",
                "s3:GetObjectVersion",
                "s3:PutObject",
                "s3:GetObjectAcl",
                "s3:GetObjectVersionAcl",
                "s3:PutObjectAcl",
                "s3:PutObjectVersionAcl",
                "s3:DeleteObject",
                "s3:DeleteObjectVersion",
                "s3:ListBucket",
                "s3:ListMultipartUploadParts",
                "s3:AbortMultipartUpload",
                "s3:RestoreObject"
            ],
            "Resource": [
                "pse-*"
            ],
            "Effect": "Allow"
        }
    ]
}

The pse-* prefix needs to be adjusted to match <cluster-name>-*.

Note
The ecr actions are only needed if an EC2 Container Registry is used.

For help managing AWS permissions refer to AWS - Manage Permissions and Policies.

AWS Credential Storage

Prior to version 1.5.0, AWS credentials were stored in cluster-init.secrets.

Credentials may continue to be stored in and used from this file. However, CJP Private SaaS Edition now supports using credentials stored in the conventional AWS tool credential locations. By default this file is ~/.aws/credentials, but it can be overridden with the AWS_CREDENTIAL_FILE environment variable.

In order to use credentials stored in the shared credentials file, you must specify the credential_profile in the cluster-init.config.

If you are using AWS IAM Roles

Private SaaS Edition allows you to use IAM Roles to control access to AWS. By configuring role assumption parameters and specifying the role profile name in the credential_profile config entry, Private SaaS Edition will attempt to assume the role before performing AWS operations. This allows you to provide a shared role for PSE administration access while handing out more limited credentials directly to individual PSE administrators.

When using the credential_profile option instead of cluster-init.secrets, the worker_instance_profile must be specified and used; system credentials will not be copied onto worker instances.
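
As a sketch, the corresponding cluster-init.config entries might look like this, assuming the profile names used in the example below (both values are placeholders for your own IAM setup):

[aws]
region = us-east-1
credential_profile = my-developer-profile
worker_instance_profile = <name-of-your-worker-instance-profile>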

You must also add a configuration for your profile to your .aws/config file. For example, here is an .aws/config file that specifies a profile named "my-developer-profile". Make sure the information here matches what you set up in AWS IAM for your role and profile.

[default]
region = us-east-1

[profile my-developer-profile]
region=us-east-1
role_arn = arn:aws:iam::9999998999:role/developer
source_profile = my-company-iam
mfa_serial = arn:aws:iam::974808099065:mfa/joesmith
role_session_name = joesmith-developer
Note
The PSE bees-pse tool uses the Boto3 Python Library to access and control AWS. For more information on Boto’s support for AWS IAM Roles see: Boto docs: AWS Assume Role Provider. For more information on how AWS credentials must be configured for Boto see: Boto docs: Shared Credentials.

EBS Usage

If you are planning to use EBS (Elastic Block Store) as storage, the AWS credentials used for installation must also be able to create EBS resources.

AWS Components

CJP Private SaaS Edition in an AWS environment uses the following Amazon Web Services:

  • EC2 (Elastic Compute Cloud) to provide resizable compute capacity

  • S3 (Simple Storage Services) to store data objects in resources called "buckets"

  • ELB (Elastic Load Balancing) to redirect data traffic as needed

  • EBS (Elastic Block Store) to provide persistent block-level storage volumes for use with Amazon EC2 instances in the AWS Cloud

  • Route 53 (domain name system) for translating Web URL names into numeric IP addresses

  • VPC (virtual private cloud) to enable provisioning of a logically isolated section of AWS

Note
Although some clusters do not use Route 53 or VPC, we recommend that your configuration support those components anyway.

The recommended configuration in an AWS environment enables three different types of workers (virtual machines) as described in the following table:

General Purpose

CJOC and Jenkins Masters run as containerized processes on general-purpose workers.

Recommended instance type: M4 2XL (32 GiB)

Dedicated ES

Elasticsearch runs on dedicated ES workers.

Recommended instance type: R3 XL (30.5 GiB)

Executors

Jenkins agents (formerly called slaves) run on executors.

Recommended instance type: varies by intended workload

Note
The I/O performance of individual instance types differs between OpenStack and AWS.

Recommended Executor instance types are as follows:

  • M4.2xlarge is best for basic web applications and generic workloads

  • C3.xlarge is best for C and C++ codebases and complex builds

  • i2.xlarge is best for I/O-intensive builds such as those associated with building large indexes

Deployments can be thought of as small, medium, or large depending on the number of Jenkins masters that are running (up to 3, up to 10, and up to 30, respectively).

Guidelines for resource allocation across deployments of different sizes are as follows (the recommended instance type for all controllers is M4.large; controllers route requests and run Apache Mesos to support resource provisioning):

Small

3 controllers and 3 General Purpose workers

Medium

3 controllers, 3 ES workers, 5 General Purpose workers

Large

3 controllers, 5 ES workers, 5 General Purpose workers, 8 Executors

Installation

See the System Requirements section before starting an installation.

Note
Lines that begin with $ are command-line examples. Enter them as shell commands without the leading $.

Install Private SaaS Edition

You can install Private SaaS Edition into a public (AWS) or private (OpenStack) cloud environment from either a Mac OS X or Linux system. If you are installing into a private cloud environment, you may need to install the local binaries on a machine that has full access to the network of the private cloud environment. See more details here.

If you are upgrading Private SaaS Edition, see this section.

1. Download

Download the appropriate package for your platform, then untar the tar.gz package. You can download Private SaaS Edition from the link provided by CloudBees.

2. (Mac OS X only) Install Local Binaries

$ cd ~
$ tar zxvf pse_1.5.3_darwin_amd64.tar.gz

2. (Linux only) Install Local Binaries

$ cd ~
$ tar zxvf pse_1.5.3_linux_amd64.tar.gz

3. Update your PATH variable

Add the bin subdirectory to your PATH.

$ export PATH=~/pse_1.5.3/bin:$PATH
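
To keep this setting across shell sessions, you can append the same line to your shell startup file (the file name depends on your shell; ~/.bash_profile is assumed here):

$ echo 'export PATH=~/pse_1.5.3/bin:$PATH' >> ~/.bash_profile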

4. Verify Installation

Verify that bees-pse is working by running:

$ bees-pse version

You should see output with version reports similar to this:

PSE Release: 10
CloudBees PSE: 1.5.3
Tiger: 1.5.3
Tiger Storage: 1.5.3
Tiger CJOC: 1.5.3
Tiger CJE: 1.5.3
Tiger Search: 1.5.3
Tiger Router: 1.5.3
Tiger Logstash: 1.5.3
PSE SSH Gateway: 1.5.3
Tiger Jenkins Framework: 1.5.3
Mesos: 0.28.2*
Marathon: 0.15.3*
Docker: 1.12.1*
Topbeat: 1.1.0*
Pip     : 1.5.4*
Terraform: 0.7.3

If you don’t see this output, check your PATH setting. Another possible cause is a failure while unpacking the Private SaaS Edition binaries in step 2.

5. Create a Private SaaS Edition Project

A Private SaaS Edition project is a single directory that contains your configuration and other files needed to maintain the cluster. For the next steps, we’ll use the shell variable PROJECT to refer to this directory.

You can create a Private SaaS Edition project at any path you like. For these steps, we’ll use the location ~/bees-pse-project:

$ PROJECT=~/bees-pse-project

Initialize the Private SaaS Edition project by running the init-project command as follows:

$ bees-pse init-project $PROJECT [aws|openstack]

Use aws or openstack depending on the cloud environment to which you are installing.

This step creates a new Private SaaS Edition project in $PROJECT and populates this directory for either AWS or OpenStack according to the value you provided.

6. Change to the Private SaaS Edition working directory

You should run all other bees-pse commands in the $PROJECT directory:

$ cd $PROJECT

7. Initialize the cluster

To start the first initialization step, execute the following command:

$ cd $PROJECT
$ bees-pse prepare cluster-init

The prepare command creates two configuration templates (cluster-init.config and cluster-init.secrets). These two files contain parameters for starting your Private SaaS Edition cluster. These files are a starting point to tailor the configuration of your Private SaaS Edition cluster.

Edit these files using your favorite plain text editor such as vim or emacs. Within them is detailed guidance on what options are available. The next few sections highlight necessary items you must set.

The file cluster-init.secrets contains the credentials required to interact with AWS or your OpenStack system.

The file cluster-init.config contains the settings needed to create your cluster.

The specific changes you make depend in part on your intended platform (AWS or OpenStack).

The next steps illustrate the minimum set of required customizations to start your cluster. For a complete description of all customization attributes, refer to the Configuration Reference.

Note
If you are installing only to evaluate Private SaaS Edition, existing default values are sufficient.

8. Set Required Parameters for cluster-init.secrets

Edit the cluster-init.secrets and then select the appropriate next step to set the credentials required to create your cluster.

(AWS only) Configure credentials

Fill in the fields aws_access_key_id and aws_secret_access_key. Instructions for getting or creating credentials are here.

Note
If you are using AWS Identity and Access Management (IAM), these credentials are not required.

(OpenStack only) Configure credentials

Fill in the fields openstack_user_name and openstack_password. Obtain these from your OpenStack system administrator.

9. Configure Common Required Settings in cluster-init.config

Most parameters have usable, secure values pre-set in cluster-init.config. You must set values for the parameters described in this section to successfully launch your Private SaaS Edition cluster.

This section consists of three parts: common, AWS-only, and OpenStack-only settings. Note the indicator following the step number to determine whether an item applies only to AWS or only to OpenStack.

Configure network access

The simplest option for configuring network access is to use this setting:

tiger_admin_port_access = 0.0.0.0/0

This parameter allows access from anywhere but is not as secure as other options shown in networking customization. Change this value later with:

$ bees-pse access-port-update

Cluster Name

cluster_name = <unique cluster name>

Select your cluster name. This value prefixes resource names and other tags when setting up servers and other infrastructure. Some of these resources use global identifiers (e.g., S3 buckets in AWS), so this name must be unique. Choose a value that describes the Private SaaS Edition cluster and includes a unique identifier such as your company name, domain name, or other similar value.

This value must contain only alphanumeric characters, dots ('.'), dashes, and underscores, and must be no longer than 20 characters.
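
For example, a hypothetical value that satisfies these constraints:

cluster_name = acme-ci-pse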

Select instance types

You must select the instance types to use for the two types of VM instances that Private SaaS Edition creates. The first is the controller instance type, used to host the Private SaaS Edition controllers.

The second is the worker instance type, used to host Jenkins masters, Jenkins agents, and the build operations themselves. This is the default type used for worker instances, but you can specify alternatives when adding additional worker instances.

For AWS, you must select an available AWS instance type supported by Amazon.

For OpenStack, you must use an instance type that your OpenStack installation allows. Contact your OpenStack administrator for appropriate choices.

controller_instance_type = m4.large
worker_instance_type = m4.xlarge

10. Configure Required Platform Settings in cluster-init.config

(AWS Only) Select AWS Region

You must select an AWS region to host the cluster. The cluster-init.config comments list valid values.

region = <aws region>

(OpenStack Only) OpenStack identity service endpoint

Enter the OpenStack identity service endpoint. Private SaaS Edition requires this endpoint to access OpenStack services.

auth_url = < http://CONTROLLER_HOST:5000/v2 >

(OpenStack Only) Configure External IP address usage (for OpenStack)

OpenStack cloud environments sometimes restrict the number of available external IP addresses. In most private cloud OpenStack deployments, this is not a constraint. If it is a limitation for you, Private SaaS Edition can be configured to use external IP addresses only for Controller nodes. The rest of the Private SaaS Edition VMs only get internal, private IP addresses, rendering them inaccessible from outside of the network.

Setting the floatip_workers parameter to "yes" allows Private SaaS Edition to assign external IP addresses to workers; setting it to "no" prevents it from doing so.

floatip_workers = <"yes" or "no" depending on your OpenStack configuration>

If floatip_workers is set to "no", your installation machine must be connected to the OpenStack private network to perform the Private SaaS Edition installation. Work with your IT/Operations department to secure a VM inside your OpenStack deployment, and install the local binaries on that machine to start the installation.

(OpenStack Only) Configure external network UUID (for OpenStack)

Neutron provides OpenStack with an abstraction of network management services. It interfaces with a wide variety of network technologies. For Private SaaS Edition to interact effectively with external systems (such as your corporate network, or even the internet itself), it needs the UUID of the OpenStack interface which provides that access.

Work with your Operations department to get the correct UUID.

external_network_uuid = <your external network UUID>
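
If you have the OpenStack command-line clients available, a command such as the following may help you identify external networks and their UUIDs (exact commands vary by client version):

$ neutron net-external-list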

See more OpenStack networking details here.

11. Verify Configuration

An error in either cluster-init.config or cluster-init.secrets can disrupt the Private SaaS Edition initialization process. You can identify many issues in these files before attempting initialization with the command:

$ bees-pse verify

If the verification succeeds, you are ready to start the initialization. At this point, the next command creates VM instances in AWS or your OpenStack system and installs software for each instance.

12. Create and Initialize Cluster

Start the initialization by executing the following command:

$ bees-pse apply
Tip
bees-pse apply can be re-run as many times as needed if an error stops the installation prematurely. After fixing the error, re-execute bees-pse apply to continue the installation from where it left off.
Tip
If an error message appears, refer to Troubleshooting for help.

After starting initialization, Private SaaS Edition outputs progress information. This step can take up to 30 minutes to complete. If it succeeds, bees-pse prints a message with links to CloudBees Jenkins Operations Center (CJOC) and the CJP Private SaaS Edition resource management tools. See the output examples that follow.

AWS Cluster Creation Output

On Amazon Web Services (AWS), the end of the output of $ bees-pse apply looks similar to this:

Controllers: ec2-52-91-242-61.compute-1.amazonaws.com,ec2-52-90-204-3.compute-1.amazonaws.com,ec2-54-173-172-182.compute-1.amazonaws.com
Workers    : ec2-52-90-206-206.compute-1.amazonaws.com,ec2-54-88-252-108.compute-1.amazonaws.com

CJOC    : http://pse-controller-1009100199.us-east-1.elb.amazonaws.com.elb.cloudbees.net/cjoc/
Mesos   : http://mesos.pse-controller-1009100199.us-east-1.elb.amazonaws.com.elb.cloudbees.net
Marathon: http://marathon.pse-controller-1009100199.us-east-1.elb.amazonaws.com.elb.cloudbees.net
cluster-init successfully applied

The name of the ELB is the section before .elb.cloudbees.net (example: pse-controller-1009100199.us-east-1.elb.amazonaws.com). You should map your DNS CNAME records to this location.
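
As an illustration only (the domain and ELB names below are placeholders), BIND-style CNAME records for this mapping might look like the following; depending on how per-master subdomains are routed, a wildcard record may also be needed:

ci.example.com.     IN  CNAME  pse-controller-1009100199.us-east-1.elb.amazonaws.com.
*.ci.example.com.   IN  CNAME  pse-controller-1009100199.us-east-1.elb.amazonaws.com.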

OpenStack Cluster Creation Output

On OpenStack, the end of the output of $ bees-pse apply looks similar to this:

Controllers: 192.0.2.109,192.0.2.110,192.0.2.111
Workers    : 192.168.2.44,192.168.2.42

CJOC    : http://192.0.2.109.nip.io/cjoc/
Mesos   : http://mesos.192.0.2.109.nip.io
Marathon: http://marathon.192.0.2.109.nip.io
cluster-init successfully applied

13. Retrieve Cluster Information

To retrieve these endpoints in the future, use this command:

$ bees-pse run display-outputs
Note
If your cluster is already up, check the Changing Domain Name section to apply the changes.

At this point the Private SaaS Edition cluster can be accessed. We recommend that you follow the next step to set up your domain name and set up HTTPS access.

Your $PROJECT directory stores state and configuration information for your cluster.

You should make a backup copy of the entire directory, or store it in a revision control system. As a convenience, the installation process generates a .gitignore file that ignores unencrypted secrets files.
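
One minimal way to put the project directory under revision control is with Git (a sketch; any backup or version-control mechanism works, and the generated .gitignore keeps unencrypted secrets out of the repository):

$ cd $PROJECT
$ git init
$ git add -A
$ git commit -m "Initial Private SaaS Edition project state"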

14. Logging in to CJOC

You can retrieve the default CJOC administrator user credentials using the following commands:

bees-pse run echo-secrets cjoc_username
bees-pse run echo-secrets cjoc_password

15. Setting up domain name and HTTP/SSL Termination

This step sets up a custom domain name and the HTTPS/SSL termination. This step is optional but recommended.

When you execute the domain-name-change operation, the new domain name must be operational and served by the local DNS server. Work with your Operations department to create a new domain.

If you enable SSL, the certificates must be valid for the machine running the installation.

The DNS domain can be changed with the operation: domain-name-change

$ bees-pse prepare domain-name-change
domain-name-change is staged - review domain-name-change.config and edit as needed - then run 'bees-pse apply' to perform the operation.

Edit the domain-name-change.config file to specify the new domain name and related parameters. For more information about the configuration parameters see Domain Options and SSL Configuration sections. Then apply the operation with bees-pse apply.

Note
The DNS records need to be set up before executing bees-pse apply.
Warning
This operation requires downtime because several applications are reconfigured and restarted.

See Changing Domain Name and Enabling SSL for more information.

16. Opening CloudBees Jenkins Operations Center (CJOC)

Open the link provided in the success message from bees-pse apply (see 12. Create and Initialize Cluster), for example,

http://pse-NAME-controller-NNNNNNNNN.AWS-REGION.elb.amazonaws.com.elb.cloudbees.net/cjoc/

You are directed to the CJOC registration page:

cjoc registration

Fill in your name, email address, organization, then approve the license agreement and click Next.

Once this completes, the CJOC dashboard displays in the browser window. Here you can create Jenkins masters and run jobs.

You can retrieve the credentials to log in using bees-pse run echo-secrets cjoc_username and bees-pse run echo-secrets cjoc_password.

Licensing

The license you will need to operate Private SaaS Edition is based on the total number of cores available in the Mesos cluster running Private SaaS Edition.

How to determine the number of cores in the Private SaaS Edition cluster

This is the sum of all cores from all workers in your cluster.

To retrieve this information, you need to be logged in to CJOC as an administrator, or as a user with the Metrics/View permission. Then browse to cjoc_url/metrics/currentUser/metrics?pretty=true and look for mesos.master.cpus_total.
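
For example, assuming CJOC is reachable at https://cjoc.ci.example.com/cjoc and you have an administrator username and API token (all values below are placeholders), you can fetch the metric from the command line and inspect the lines around the match for the value:

$ curl -s -u admin:<api-token> \
    "https://cjoc.ci.example.com/cjoc/metrics/currentUser/metrics?pretty=true" \
  | grep -A 2 "mesos.master.cpus_total"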

17. Securing your Cluster

Initially, the Private SaaS Edition cluster configuration allows access only to the administrator who created it. A typical setup involves setting up a security realm to allow other people in the organization to log in.

The Private SaaS Edition setup command line requires administrative access to the CJOC instance. Therefore, after setting up a security realm, update the local configuration with a valid username and an API token, so that it can proceed with updates if needed.

Obtain the API token for a connected user by browsing to http://your.jenkins.server/me/configure, then clicking the Show API Token button.

Then edit the file cluster-init.secrets, setting cjoc_username to the username to log in as, and cjoc_password to the API token you obtained in the previous step.
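
After these changes, the relevant entries in cluster-init.secrets would look something like this (values are placeholders):

cjoc_username = <username-to-log-in-as>
cjoc_password = <API-token-shown-by-Show-API-Token>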

Tip
To access the controller resources, the user login is admin. The password is stored in the variable router_password, which is located in the file secrets.

Appendix: Sample configurations

Amazon Web Services Sample Configuration

Without EBS Access

If your AWS account does not have access to EBS, or if you do not know whether the AWS account has access, use the attribute values in this template.

cluster-init.config
[aws]
region = [ us-east-1 | us-west-2 | ... ]
tiger_admin_port_access = <external-IP-address-of-machine-installing-PSE>

[storage]
type = builtin-rsync
volume_user = jenkins
volume_group = jenkins
Tip
'rsync' is used if your AWS account does not have access to EBS. In this case, the PSE Cluster provisions a dedicated VM as a storage server.

With EBS Access

If your AWS account has access to EBS, then you can use this:

cluster-init.config
[aws]
region = [ us-east-1 | us-west-2 | ... ]
tiger_admin_port_access = <external-IP-address-of-machine-installing-PSE>

[storage]
type = ebs

Enter your credential strings in the file cluster-init.secrets.

cluster-init.secrets
aws_access_key_id = <access-key-str>
aws_secret_access_key = <secret-key-str>

OpenStack Sample Configuration

Note
To install Private SaaS Edition in an OpenStack environment, work closely with your OpenStack administrator to get the necessary credentials and information.

As you can see in the sample configuration below, installing Private SaaS Edition in an OpenStack environment requires more information than AWS installations do.

cluster-init.config
[openstack]
tenant_name = <name-of-OpenStack-tenant>
auth_url = <URL-of-OpenStack-nova-API>
image_name = <name-of-image-available-for-tenant>
network_name = <name-of-internal-network>
floatip_network_name = <name-of-external-network-if-install-machine-cannot-access-internal-network>
tiger_admin_port_access = <internal-and-external-IP-address-of-machine-installing-PSE>

[storage]
type = builtin-rsync
volume_user = jenkins
volume_group = jenkins
cluster-init.secrets
openstack_user_name = <name-of-account-that-can-access-tenant>
openstack_password = <password-of-same-account>

All state and configuration information is stored in your $PROJECT directory. Of particular interest is the secrets file, where all credentials to the various subsystems are documented. Be sure to set appropriate permissions on that file and make a backup copy of it.

Advanced Configuration Customization

Persistent Storage

Private SaaS Edition needs persistent storage to store Jenkins master and job configuration information. The default storage backend is builtin-rsync. If you choose this option, the installation process provisions a separate VM to serve as the rsync storage server.

builtin-rsync is also the default option for the OpenStack environment. For the AWS environment, if the AWS account being used to install Private SaaS Edition does not have access to EBS, then choose this option.

Refer to the Configuration Appendix for more information on the attributes used in the following sections.

AWS Elastic Block Storage

By default, AWS uses EBS volumes created from an empty XFS snapshot. Private SaaS Edition backs up these volumes to snapshots periodically. The source snapshot for new EBS volumes is defined by default for several regions but can be customized. Private SaaS Edition sets the user and group ownership in the volume to the parameters volume_user and volume_group.

The EBS backend is enabled by adding the following line to the config file:

[storage]
type = ebs
ebs_snapshot = [custom_snapshot_id]
# volume_user = jenkins
# volume_group = jenkins

For example:

[storage]
type = ebs
ebs_snapshot = snap-d9c1edb1

Snapshots are taken every 2 minutes provided there are changes in the server data, and CJOC cleans them up every 30 minutes, leaving just the last completed one.

Volumes and snapshots are tagged with the following values to identify which cluster and CJE Jenkins Master they belong to:

Tag Description

cluster

Private SaaS Edition cluster name as set in the configuration

account

Name of the Private SaaS Edition Jenkins master, or cjoc for CJOC data

NFS

The NFS storage backend mounts Jenkins home folders from a remote NFS server. Configure the host and path of the NFS server, plus the username or numeric uid to use when mounting the NFS filesystem. The user must have permissions for the remote NFS server to mount and create files inside the NFS path used for persistence.

Configure an NFS storage backend by adding the following lines to the config file:

[storage]
type = nfs
nfs_server = <NFS_SERVER_HOST>
# the exported directory on the NFS server to use for storage
nfs_export_dir = /path
# Specify the user that owns the backed-up workspace files on the volume
volume_user = 1000

For example:

[storage]
type = nfs
nfs_server = 10.16.2.1
nfs_export_dir = /srv/nfs4/pse
volume_user = 1000

Networking

Admin Access

Infrastructure VMs in the Private SaaS Edition cluster, such as the 'controller' and 'worker' nodes, are not accessible by default. However, you can grant access to specific IPs and networks by customizing the attribute tiger_admin_port_access.

Ensure that this property includes both the 'public' and 'private' IP addresses of the installation machine. Failing to do so results in a failed installation, where error messages indicate that you cannot SSH to infrastructure VMs.

In AWS or public clouds, use your public-facing IP. To find your IP information, you can enter "what is my IP?" into a search engine like Google or Bing.
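
From a terminal, one way to look up your public-facing IP is to query an external service (assuming outbound internet access; the service shown is just one example):

$ curl -s https://checkip.amazonaws.com
203.0.113.25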

OpenStack Network Configuration

In OpenStack or private clouds, use the IP of the machine performing the installation, as seen by the VMs started in the cloud.

Use 0.0.0.0/0 to allow administrative connections from any IP:

# allow access from anywhere
tiger_admin_port_access = 0.0.0.0/0

Managed Masters

Introduction

A key feature of CloudBees Jenkins Platform Private SaaS Edition is the capability to communicate with your infrastructure to create new masters on demand.

Creating a New Managed Master

In CJP Private SaaS Edition, a Jenkins master is referred to as a managed master. You may create as many masters as you like — within the constraints of your allocated resources (see CJP Private SaaS Edition worker count setting).

To create a CJP Private SaaS Edition managed master, do the following:

  1. Navigate to the CJOC dashboard, click New Item, and provide the following information:

    • Item name: the name of the new master.

    • Select Managed Master.

      create new master
  2. Click OK.

    • You are redirected to a configuration page where you can specify additional settings for the new master:

      • hardware settings: number of CPUs and amount of memory to allocate.

      • the Docker image you want to use in case you set up a customized Docker image.

    • These settings are populated with default values that should satisfy a new project. Come back to this configuration screen to update values later, if needed.

      master resources configuration
  3. Make any change you want and click Save. In a few minutes the master will be allocated on the cluster and connected to CJOC.

  4. Click Back to Dashboard to return to the CJOC dashboard. You see the new master in a pending state (thunder cloud status) until it’s ready. When ready, the icon turns to a cloud that looks like this:

master ready

This indicates that your new Jenkins master is ready to use.

Upgrading a Managed Master

On creation, a Managed Master will be configured to use a specific Docker image. By default, it will use the first available Docker image in the global configuration.

master configuration overview 1

The Docker image provides the version of Jenkins to use, as well as a default bundle of plugins. When updating the Docker image, the Jenkins version will be updated, as well as the bundled plugins. If additional plugins have been installed on the Managed Master, they will be left as-is as part of the update process.

As part of the Private SaaS Edition upgrade process, a new image definition is created. To upgrade masters, the Docker image in their configuration needs to be updated to the new value.

Once the Docker image definition is updated, the CJP Private SaaS Edition administrator needs to restart the instance. Because this operation causes an outage of the affected Managed Master, it should be done at a time that minimizes impact on users.

action stop
action stop confirm
action start

Once the master is restarted, it will use the new Docker image definition.

master configuration overview 2

Bulk-upgrade managed masters

Once a few masters have been upgraded successfully, and if there are many masters, it is more efficient to upgrade the rest several at a time. This can be achieved by defining a cluster operation on CJOC.

  • Create the operation

    • Click on New Item,

    • Give it a name (e.g. upgrade masters)

    • Select Cluster Operations, then create the item.

  • Edit the configuration.

    • Select Managed Masters then the appropriate option for your case.

    • Add a filter Uses Docker Image to select masters with a given source Docker Image.

    • Under steps, select Update Docker Image and select the new Docker Image.

    • Then add another step, Reprovision, to restart the impacted masters.

  • Save the cluster operation.

The cluster operation is now ready to be launched.

Managing Docker images for Masters

The list of images that can be used to provision masters can be customized through Manage Jenkins >> Configure System. Look for the Mesos Master Provisioning section, then click on Show docker images… to see all entries.

The first entry will be used as the default when creating a new master.

manage docker images

Starting Managed Masters

CloudBees Jenkins Operations Center enables you to create managed masters in the same way regardless of whether your system runs on AWS or OpenStack.

Masters are most efficiently monitored and modified through the CJOC dashboard. Communication between CJOC and managed masters is established automatically. Although you can manually start or stop a managed master (which implies a connection between CJOC and that master), the notion of "connecting" mainly applies in other contexts, such as when you connect an existing external master (one not created in CJP Private SaaS Edition) to your CJOC instance.

  1. Navigate to the CJOC dashboard and select New Item.

  2. Enter a name in the Item Name field and select Managed Master, then OK.

    CJOC partial
  3. If it is not already checked, click the checkbox for Provision and start on save at the bottom of the configuration UI.

    CJOC provision checkbox
  4. Click Save. Your new master is allocated on the cluster and connected to CJOC. This process takes a minute or two, depending on the storage backend that you chose when creating the Private SaaS Edition cluster.

To start an existing managed master, use CJOC to browse to that master on the network and select Start from the sidebar. If the selected master is already running, then the CJOC sidebar displays a Stop (rather than Start) option.

Setting Up Connections to Client Masters

In addition to supporting managed masters whose entire lifecycle (create to run to stop to destroy) is governed by CloudBees Jenkins Operations Center, CloudBees Jenkins Platform Private SaaS Edition also supports multiple types of client masters that you manage outside of CJOC.

The procedure for connecting a client master to CJOC varies depending on whether the client master already has a valid license for CloudBees Jenkins Enterprise.

Setting up connections between a client master and CJOC is a task that involves multiple steps:

  1. Navigate to the CJOC dashboard and select New Item.

  2. Enter a name in the Item Name field. This name can be different from the hostname of the server on which the client master runs.

  3. Select Client Master as the item type.

  4. Click OK. You are directed to the configuration screen for the newly-created client master. There you set the number of on-master executors, enter email addresses for yourself or colleagues as master owners, and select a licensing option from among those available in the Licensing property.

  5. Verify that CJOC and the client master can communicate with each other over both HTTP and JNLP ports. Inbound HTTP(S) traffic goes through a reverse proxy on a controller node. The host and port to use for JNLP are advertised through HTTP headers by each Jenkins master.

You can connect an existing client master to CJOC by giving that client master the TLS certificate for CJOC, typically through the Configure Global Security page in CJOC. For more information, see How to programmatically connect a Client Master to CJOC.

If you are connecting multiple client masters to your cluster, it is a good idea to partially automate that task using shared configuration.

Accessing Marathon/Mesos Cluster Information

From the Data Center link on CJOC you see a link to Marathon. If you access this URL, you will be asked for credentials.

You can retrieve the Marathon and Mesos credentials using the following commands:

bees-pse run echo-secrets mesos_web_ui_username
bees-pse run echo-secrets mesos_web_ui_password
bees-pse run echo-secrets marathon_username
bees-pse run echo-secrets marathon_password

Using Jenkins SSH Feature on Managed Masters

Some Jenkins plugins (such as the Validated Merge plugin and Pipeline Global Library) require SSH access to Jenkins.

In the Private SaaS Edition environment, masters are provisioned on a dynamic host and port within the cluster. Private SaaS Edition provides an http router in order to access managed masters using a fixed name. To access the SSH port of each master, you must configure your ssh client to use a proxy.

SSH Proxy Configuration for OpenSSH

Assuming your domain is ci.example.com, add the following snippet to ~/.ssh/config.

Host=*.ci.example.com
ProxyCommand=ssh -q -p 2222 ci.example.com tunnel %h

OpenStack instructions

Beginning with version 1.1.0, the load balancer component requires a different floating IP for each exposed port. Therefore, for SSH access you will have to use a different IP than the one mapped to your default domain. To retrieve the IP to use for SSH access, run bees-pse run resolve-attr ssh_ip and use that IP instead of ci.example.com in the ProxyCommand above.

You should end up with

Host=*.ci.example.com
ProxyCommand=ssh -q -p 2222 <ssh_ip> tunnel %h

Connecting to a Master Through SSH (OpenSSH)

Provided you added your public SSH key to your user profile in the master you want to connect to, you can then use SSH commands. Please note that you will need to accept host keys for both the proxy and the master on the first connection attempt.

$ ssh admin@cjoc.ci.example.com who-am-i
Authenticated as: admin
Authorities:
  authenticated

SSH Proxy Configuration for Putty

You will need to add the proxy configuration to your connection details.

Go to connection > proxy.

  • Proxy type: local

  • Proxy hostname: ci.example.com

  • Proxy port: 2222

  • Proxy command: plink ssh@%proxyhost -P %proxyport tunnel %host

Save the connection as pse (as an example).

Connecting to a Master Through SSH (Putty)

Provided you added your public SSH key to your user profile in the master you want to connect to, you can then use SSH commands. Please note that you will need to accept host keys for both the proxy and the master on the first connection attempt.

$ plink -load pse admin@cjoc.ci.example.com who-am-i
Authenticated as: admin
Authorities:
  authenticated

Connecting External Agents

JNLP agents that are set up as described in this section can run on Windows, Linux, or Mac OS X operating systems.

Connecting SSH agents

Refer to the documentation of NIO SSH Agents plugin to connect agents (formerly called "slaves") using SSH. Connecting SSH agents within Private SaaS Edition is no different than in any other environment because outgoing connections are not restricted within Private SaaS Edition clusters.

Connecting JNLP agents

In the PSE architecture, all inbound http(s) traffic goes through a reverse proxy on one of the controller nodes and is forwarded to the right instance.

Host and port to use for JNLP is advertised through http headers by each Jenkins master, and a direct connection is established between the agent and the master.

This method may have some limitations depending on how your cloud provider is set up.

On AWS

Each master exposes a unique JNLP port on the worker on which it runs. AWS gives public IPs to the worker machines, and the Private SaaS Edition security groups are designed to enable masters to receive direct connections through their JNLP port.

We recommend that you connect your JNLP agents to CJOC as shared agents. You can refer to the corresponding Java Web Start documentation.

On OpenStack

To connect JNLP agents to a PSE cluster installed on OpenStack, the agents must be attached to the same network as the cluster servers (network_name in cluster-init.config).

We recommend that you connect your JNLP agents to CJOC as shared agents. You can refer to the corresponding Java Web Start documentation.
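
For reference, a typical Java Web Start/JNLP launch command for such an agent looks like the following; the exact URL, agent name, and secret come from the agent's page in CJOC, and the values below are placeholders:

$ java -jar slave.jar \
    -jnlpUrl http://cjoc.ci.example.com/cjoc/computer/shared-agent-1/slave-agent.jnlp \
    -secret <secret-from-the-agent-page>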

Building on CJP Private SaaS Edition

CJP Private SaaS Edition comes out of the box with a set of agent images for running builds on the underlying cluster infrastructure. This is managed by Palace, a scheduler that runs on the Private SaaS Edition infrastructure.

These agent images are available as items in CJOC and are pushed to client masters located in the same context. You can also create agent images directly on a client master.

Palace, a scalable agent scheduling framework for Mesos

Why Palace?

Private SaaS Edition used to ship with a shared cloud that used the Mesos plugin to provision Docker-based agents. While doing scalability tests on Private SaaS Edition, we realized this plugin could not scale as much as we wanted, because each Jenkins master would register one Mesos framework, and Mesos cannot handle thousands of frameworks.

We therefore built Palace, which acts as an intermediary to provision Jenkins agents on Mesos. From Mesos's point of view, it is a single framework. From the Jenkins masters' point of view, it is accessed through a REST API, which removes the need to bundle Mesos binaries within masters.

Comparing architectures

Mesos plugin
mesos plugin architecture diagram
Architecture Diagram: Mesos plugin

When using the Mesos plugin, each Jenkins master hosts a Mesos framework, which requires the mesos package to be installed in the container running Jenkins. The different masters compete when interacting with Mesos: a faulty master could hold on to Mesos offers and prevent other masters from creating tasks. Recently registered masters are given priority over those that have been registered for a long time. When many frameworks are connected, it takes longer for a given framework to get the offers it is interested in, because they need to be processed and declined by the other frameworks first.

Palace
palace architecture diagram
Architecture Diagram: Palace

Palace is a scheduled application, just like the Jenkins masters. In case of failure, it is automatically restarted and reconciles its state with the currently running Mesos tasks.

Palace works in exclusive mode: since Private SaaS Edition uses workers dedicated to the Palace workload and Palace is a single Mesos framework, it holds on to the offers for these workers and can respond instantly whenever a Jenkins agent request is made by a master. Overall, this reduces the time needed to allocate agents.

Having a central point of decision for provisioning agents also helps apply more elaborate task-placement optimization strategies, e.g.

  • bin-packing (pack tasks into as few hosts as possible)

  • load spread (maintain a similar load on all available hosts) (not yet implemented in Private SaaS Edition)

  • implement resources quotas per master (not yet implemented in Private SaaS Edition)

Jenkins masters connect to Palace using a REST API, which decouples them from Mesos itself: we no longer need to ship Mesos binaries in the Docker container, and the Mesos native library is no longer called directly from Jenkins, which limits the impact of a crash.

Defining a new agent image

To define a new agent image, just create a new item and select Docker Slave Template.

new item

The name of the item will be the label to use when defining jobs meant to run using this image.

Then you need to configure the agent image to match your requirements.

configuration
Field / Default Value / Description

Use when no label is specified

  • ❏  

Check this if you want your agent image to be used for unassigned jobs. If several agent templates have this option selected, unassigned jobs will run on whichever of them is available first.

Enable this slave template

  • ✓  

Uncheck this to disable a template you don’t want to use (but that you still want to keep around)

CPU shares

0.1

Refer to the corresponding Docker option. We recommend leaving it at its default value unless you know exactly what you are doing.

Memory

256

The amount of memory to allocate for the Docker container. If the value is set too low, the container may be killed by the Linux OOM killer. If it is too high, fewer tasks can be launched on the infrastructure and you may experience delays when launching jobs.

JVM Memory

128

The maximum heap size given to the Jenkins agent. You need to account for this value when setting Memory above.

JVM Arguments

(empty)

Specifies additional JVM arguments to apply to the JVM running the Jenkins agent.

JNLP Arguments

(empty)

Specifies JNLP arguments to use when launching the Jenkins agent.

Remote FS Root

/jenkins

The working directory of the Jenkins agent inside the container. It must be a path on which the user running the container has full permissions; otherwise the agent will fail to launch.

Image

java

The Docker tag of the image to pull. The image needs to have a JDK installed, as this is a prerequisite for launching the Jenkins agent.

Additional Options

Additional URI

Fetches a file from a URI and puts it in the Mesos sandbox. This is useful for pulling an image from a private Docker registry.

Environment Variable

Add an environment variable to the Docker container.

Force pull image

Force update of an image if it already exists locally. This option is useful if the specified image targets a mutable Docker tag (by convention) such as latest.

Launch in privileged mode

Launches the Docker container in privileged mode.

Parameter

Pass a parameter to the Docker container.

Port Mapping

Maps a port from the host to the Docker container.

Use a custom Docker shell command

Overrides the default /bin/sh shell command used to launch the Jenkins agent inside the container. This option can be used to execute commands before the Jenkins agent process is started.

Volume

Maps a volume inside the Docker container.

Sharing agent images

You can share agent images by creating them on CJOC. Agent images created in CJOC will be made available to client masters located in the same folder hierarchy (same folder or a subfolder below it).

Agent images created by the installer are located in the root of CJOC. They are therefore available to any client master.

Shared agent images can be browsed from any client master within the special folder Operations Center Shared Templates.

operations center shared templates

Example of restricted agent image

Consider a folder named Organization 1 containing an agent template named agent1 and a master named Master 1.

organization1

If we browse the Shared Templates within Master 1, we will see the template agent1 as well as the other templates defined in CJOC.

master1 shared templates

However, masters in the CJOC root folder or in other folders will not see the template agent1.

Migrating from the Mesos shared cloud to Palace

Private SaaS Edition used to ship with a shared cloud using the Mesos plugin to provision docker-based agents.

Since 1.3.0, Private SaaS Edition ships with a new shared cloud called Palace. It effectively deprecates the Mesos shared cloud by providing a better scalability model as well as a more flexible way to manage docker agent images across your cluster.

Templates based on the Mesos plugin are not migrated automatically, because 100% compatibility with the Mesos plugin configuration is not guaranteed.

Migrating slave information

In particular, only Mesos slave information entries using the Docker containerizer are supported by Palace.

Each Mesos slave information entry corresponds to a Docker slave template item when using Palace.

Field in Mesos plugin Field in Palace

Label string

This is the item name

Jenkins Slave CPUs

CPU Shares = (Jenkins Slave CPUs) + (Number of executors)*(Jenkins Executor CPUs)

Jenkins Slave Memory in MB

JVM Memory

Minimum number of Executors per Slave

Not Applicable (always 1)

Maximum number of Executors per Slave

Not Applicable (always 1)

Jenkins Executor CPUs

See Jenkins Slave CPUs

Jenkins Executor Memory in MB

Memory = (Jenkins Slave Memory) + (Number of executors) * (Jenkins Executor Memory) (a worked example follows this table)

Remote FS Root

Remote FS Root

Idle Termination Minutes

Not Applicable. Agents are removed as soon as the build is done.

Mesos Offer Selection Attributes

Not Applicable

Additional Jenkins Slave JVM arguments

JVM Arguments

Additional Jenkins Slave Agent JNLP arguments

JNLP Arguments

Docker Image

Image

Docker Privileged Mode

Options > Launch in privileged mode

Docker Force Pull Image

Options > Force pull image

Docker Image Can Be Customized

Not supported

Custom docker command shell

Use a custom docker shell command

Networking options

Not supported

Volumes

Options > Volume

Parameters

Options > Parameter

Additional URIs

Options > Additional URI
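As a purely illustrative example of the two mapping formulas above: a Mesos slave info configured with Jenkins Slave CPUs = 0.1, Jenkins Executor CPUs = 0.2, 2 executors, Jenkins Slave Memory = 512 and Jenkins Executor Memory = 256 would translate into a Docker slave template with:

CPU Shares = 0.1 + 2 * 0.2 = 0.5
Memory     = 512 + 2 * 256 = 1024
JVM Memory = 512 (the former Jenkins Slave Memory)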

Migrating workers

Palace uses a different strategy for handling Mesos offers than the Mesos plugin does. This allows faster agent provisioning as well as better control over where tasks are provisioned.

For this reason and to avoid disrupting your workload when upgrading, we have decided to require specific workers for Palace.

When listing workers using bees-pse run list-workers, you will notice that your existing build workers are now tagged with mp-build. This means these workers are assigned to the workload generated by the Mesos plugin.

worker-1 (master: m4.xlarge) ec2-aa-bb-cc-d1.compute-1.amazonaws.com > ACTIVE
worker-3 (mp-build: m4.xlarge) ec2-aa-bb-cc-d2.compute-1.amazonaws.com > ACTIVE
worker-2 (master: m4.xlarge) ec2-aa-bb-cc-d3.compute-1.amazonaws.com > ACTIVE

To add new workers that can be used by Palace, run the worker-add operation and choose workload_type = build. Once you are done with your migration you will be able to remove the old mp-build workers using the worker-remove operation.
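As a minimal sketch of that migration sequence (worker-3 stands in for whatever mp-build worker your own list-workers output shows):

$ bees-pse prepare worker-add
# edit worker-add.config: set workload_type = build and the desired worker count
$ bees-pse apply

# once builds have moved over to Palace, retire the old mesos-plugin workers
$ bees-pse prepare worker-remove
# edit worker-remove.config: set name = worker-3
$ bees-pse apply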

Creating Credential Package

To use a private registry, CJP Private SaaS Edition needs credentials to access it. Create a docker.tar.gz file containing the .docker folder and its .docker/config.json as follows:

  • Log in to the private registry manually. Logging in creates a .docker folder and a .docker/config.json in the user’s home directory.

$ docker login some.docker.bar.com
  Username: foo
  Password:
  Email: foo@bar.com
  • Tar this folder and its contents.

$ cd ~
$ tar czf docker.tar.gz .docker
  • Check that both entries are present in the tar.

$ tar -tvf ~/docker.tar.gz

  drwx------ root/root         0 2015-07-28 02:54 .docker/
  -rw------- root/root       114 2015-07-28 01:31 .docker/config.json
  • Put the gzipped file in a location that the worker VMs can retrieve it from. The URI must be accessible by all workers. Approaches include storing it on a shared network drive or, for example, in Amazon S3 (see the sketch below). It is worth considering the security implications of your chosen approach.
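For example, a minimal sketch using Amazon S3 (the bucket name is hypothetical; make sure its access policy matches your security requirements):

$ aws s3 cp ~/docker.tar.gz s3://my-pse-artifacts/docker.tar.gz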

Attaching Credential Package to Custom Build Environment

Then, in the agent image, add an Additional URI entry. In Value, provide the URI to download the file, then check the Extract option.

additional uri

Use Docker Commands in Builds

Two Docker-related build tasks are available.

build tasks

In order to be able to use these Docker-related tasks in a build environment, a Palace build agent definition with the Docker client is required. The Docker image cloudbees/java-with-docker-client can be used for this purpose. It has both Java (for the Jenkins agent process) and Docker client pre-installed. The Docker socket of the enclosing Private SaaS Edition worker also needs to be mounted through a volume inside the container.

To do so, in the agent image configuration, add a Volume entry, with Host Path and Container Path set to /var/run/docker.sock.

providing docker vols
Tip
Configuration changes to agent images are only applied to new containers, not to existing ones.

Use Docker Pipeline

In order to use Docker pipeline syntax like docker.image.inside, besides the requirements in previous section, the workspace needs to be shared across Jenkins nodes. This can be done by setting the agent filesystem remote root directory to /mnt/mesos/sandbox/jenkins, a Docker volume that is automatically injected into all agents.

docker filesystem root directory
Warning
Although the data in /mnt/mesos/sandbox/jenkins is garbage collected, take care to clean up after jobs to ensure the space is reclaimed as soon as possible.

One-Shot provisioner API

Since 1.5.1, Palace uses the One-Shot executor API by default instead of the Cloud API. This API is designed to handle ephemeral agents running in Docker containers, whereas the Cloud API was designed at a time when virtual machine provisioning was the norm. As a result, agent provisioning is faster, and agents are guaranteed to be used by a single build.

If this new API causes incompatibilities with your installation, it can be disabled through the Global settings of a Jenkins instance. In that case, Palace reverts to the Cloud API to provision agents.

one shot toggle

Cluster Operations

Upgrading CJP Private SaaS Edition

Upgrading CJP Private SaaS Edition consists of the following steps:

  1. Download a new Private SaaS Edition release

  2. Install the new release locally

  3. Upgrade the Private SaaS Edition project

    • run bees-pse upgrade-project within the Private SaaS Edition project to upgrade the project

  4. Upgrade Private SaaS Edition components (see notice below)

    • run bees-pse upgrade (see the combined sketch after this list)
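Put together, a minimal sketch of the upgrade, assuming the new release has already been downloaded and installed locally and $PROJECT is your project directory:

$ cd $PROJECT
$ bees-pse upgrade-project   # upgrade the Private SaaS Edition project
$ bees-pse upgrade           # upgrade the Private SaaS Edition components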

Upgrade Private SaaS Edition Components Notice

During the upgrade of components, CJOC might become unavailable for a few minutes. Several internal services will be upgraded and restarted. Services such as the routing service might make some components of the cluster inaccessible for a few seconds.

You should perform the upgrade at the appropriate time in order to minimize the inconvenience to users.

Upgrading a Managed Master

See Upgrading a Managed Master in the Managed Masters section.

Managing Workers

List Workers

To list all workers use the command: bees-pse run list-workers.

$ bees-pse run list-workers
worker-1 ( master ) ec2-155-88-200-66.compute-1.amazonaws.com > ACTIVE
worker-2 ( master ) ec2-152-77-190-55.compute-1.amazonaws.com > ACTIVE
worker-3 ( build ) ec2-254-88-160-11.compute-1.amazonaws.com > ACTIVE

Status of Workers

You can get the current status of the cluster with the command: bees-pse status.

$ bees-pse status
CloudBees Jenkins Operation Center
  URL: https://yourdomain.com/cjoc/
  cjoc: OK
Server Status
  worker-1: OK
  worker-2: OK
  controller-1: OK

You can get the current status of a specific worker with the command: bees-pse status <WORKER-NAME>. Note that the worker name syntax is always worker-ID.

For example, to get the status of worker-1

$ bees-pse status worker-1
Worker worker-1:
hostname:        ec2-01-01-01-01.compute-1.amazonaws.com
instance_type:   m4.xlarge

Connectivity to worker-1: OK

Applications running on worker-1:
jce_cjoc
jce_castle
masters_eval-master

Management Operations

Managing workers is done with the Private SaaS Edition CLI "operations" commands. You can get a list of operations by executing the following command:

$ bees-pse list-operations
usage: tiger prepare OPERATION

  OPERATION
    access-port-update             Update admin and user access ports.
    castle-logging-level-update    Specify logging level for Volume Managers
    cjoc-update                    Update CJOC settings such as memory, disk and evaluation mode.
    cluster-destroy                Destroys a PSE cluster.
    cluster-init                   Initialize the PSE cluster using the provided configuration.
    cluster-recover                Recover a PSE cluster.
    controller-restart             Restart a controller. This operation will destroy and recreate the controller.
    domain-name-change             Change domain settings such domain name and SSL settings.
    elasticsearch-backup           Backup Elasticsearch indices.
    elasticsearch-indices-delete   Delete indices older than specified days.
    elasticsearch-restore          Restore Elasticsearch latest backup.
    elasticsearch-update           Update Elasticsearch settings such as memory, sharding, instance count, ...
    pse-support                    Generates support bundle containing information to help debugging.
    worker-add                     Add one or more workers to the cluster.
    worker-disable                 Disable a worker. A disabled worker will no longer be available for further deployment but current deployments will remain running.
    worker-enable                  Enable a previously disabled worker.
    worker-remove                  Delete a specific worker.
    worker-restart                 Restart one or more workers. This operation will destroy and recreate the worker(s).

CJP Private SaaS Edition "operations" commands have a lifecycle. In order to execute an operation, you first need to stage/prepare the command. Next the operation parameters can be edited, reviewed and verified. Then the operation can be applied or cancelled.

The various operations commands are:

list-operations   List supported operations
prepare           Prepare/stage an operation
verify            Verify a staged operation
apply             Apply a staged operation
cancel            Cancel a staged operation
Adding Workers

The 'worker-add' operation is used during the initialization process to set up dedicated workers. Also, if you run low on capacity for Jenkins masters or agents, you can add workers with this operation.

$ bees-pse prepare worker-add
worker-add is staged - review worker-add.config and edit as needed - then run 'bees-pse apply' to perform the operation.

worker-add.config lets you define the number of workers to add and the workload type of the worker. It also allows you to set a specific size if the default size is not adequate.

Note
On AWS, worker_volume_size defaults to 50 GB. For instances with dedicated disk space, set the proper value to use the entire disk allocated to the instance.

Workload type

## Type of workload that the worker will handle
# master          : (default) master/main workload: Jenkins masters, CJOC and PSE components
# build           : dedicated to Jenkins build agents
# elasticsearch   : dedicated to Elasticsearch nodes
# workload_type =

Workers created with the 'master' workload type handle the workload of the Jenkins masters plus CJOC and PSE components such as Elasticsearch, if the latter is not deployed on a dedicated 'elasticsearch' worker node.
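For example, a sketch of a worker-add.config that adds two dedicated build workers (assuming workload_type lives in the [worker] section alongside the worker count):

worker-add.config excerpt
[worker]

## Number of workers to add
count = 2

## Type of workload that the worker will handle
workload_type = build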

Apply the operation with bees-pse apply

Removing Workers

You can remove/delete a specific worker with the operation: worker-remove.

You can get the list of workers with the bees-pse status command. Note that worker names are always in the format worker-ID.

For example to remove worker-2

$ bees-pse prepare worker-remove
worker-remove is staged - review worker-remove.config and edit as needed - then run 'bees-pse apply' to perform the operation.

You must specify the worker name in the 'worker-remove.config' file before executing the operation.

[server]

# The name of the server to remove (required)
#
name = worker-2

Then apply the operation with bees-pse apply

Restarting Workers

You can restart a specific worker when unhealthy or unresponsive with the operation: worker-restart

For example to restart worker-2

$ bees-pse prepare worker-restart
worker-restart is staged - review worker-restart.config and edit as needed - then run 'bees-pse apply' to perform the operation.

You must specify the worker name in the worker-restart.config file before executing the operation. Then apply the operation with bees-pse apply

This operation will in essence replace a worker with a new one. All applications running on the restarted worker will be restarted on other workers of the cluster.

Changing Domain Name and Enabling SSL

Initial Domain Names or IPs After Installation

After the CJP Private SaaS Edition cluster is running, take note of the generated domain names or IP addresses, depending on the infrastructure where it is running.

AWS

On Amazon AWS it will look like this:

Controllers: ec2-52-91-242-61.compute-1.amazonaws.com,ec2-52-90-204-3.compute-1.amazonaws.com,ec2-54-173-172-182.compute-1.amazonaws.com
Workers    : ec2-52-90-206-206.compute-1.amazonaws.com,ec2-54-88-252-108.compute-1.amazonaws.com

CJOC    : http://pse-controller-1009100199.us-east-1.elb.amazonaws.com.elb.cloudbees.net/cjoc/
Mesos   : http://mesos.pse-controller-1009100199.us-east-1.elb.amazonaws.com.elb.cloudbees.net
Marathon: http://marathon.pse-controller-1009100199.us-east-1.elb.amazonaws.com.elb.cloudbees.net
...

The name of the ELB is the host portion of the CJOC URL after http:// (example: pse-controller-1009100199.us-east-1.elb.amazonaws.com.elb.cloudbees.net), and that is the name the DNS records have to point to.

OpenStack

In OpenStack, you can take the IP addresses for the controllers and create one or more A records for them instead of CNAME records.

Controllers: 192.0.2.109,192.0.2.110,192.0.2.111
Workers    : 192.168.2.44,192.168.2.42

CJOC    : http://192.0.2.109.nip.io/cjoc/
Mesos   : http://mesos.192.0.2.109.nip.io
Marathon: http://marathon.192.0.2.109.nip.io

The controller IPs in this example would be 192.0.2.109, 192.0.2.110 and 192.0.2.111

Change Domain Name and/or Enable SSL

When the domain-name-change operation is carried out, it is assumed that the new domain name is operational and is being served by the local DNS server. Please work with your Operations department to create a new domain.

If enabling SSL, it is assumed that the certificates are valid for the machine running the installation.

The DNS domain can be changed with the operation: domain-name-change

$ bees-pse prepare domain-name-change
domain-name-change is staged - review domain-name-change.config and edit as needed - then run 'bees-pse apply' to perform the operation.

Edit the domain-name-change.config file to specify the new domain name and related parameters. For more information about the configuration parameters, see the Domain options guide and SSL Configuration sections. Then apply the operation with bees-pse apply.
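As a sketch, and assuming domain-name-change.config reuses the parameter names shown in the Domain options guide below, a wildcard-DNS setup for ci.example.com could look like:

domain-name-change.config excerpt
[tiger]
...
domain_name = ci.example.com
domain_separator = .
path_mode = no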

Note
The DNS records need to be setup before executing bees-pse apply.
Warning
This operation requires down-time because several applications will be re-configured and restarted.
Domain options guide

These options apply when you want to run your CJP Private SaaS Edition service under your own domain. CJP Private SaaS Edition uses URL names both for user-facing Jenkins instances and CJOC and for internal use; in this scenario you will end up with names such as team1.ci.example.com.

To do this, you need to have access to your own domain name and access to the DNS settings.

The following options allow you to customize the URLs that will be used to access your cluster services. Depending on what your network administrator lets you do on your network, you may be in one of the following situations.

The target for the DNS records can be:

  • a domain name, in providers such as AWS where an Elastic Load Balancer (ELB) is created, for example, pse-controller-1009100199.us-east-1.elb.amazonaws.com.elb.cloudbees.net. In this case a CNAME record needs to be created.

  • one or more IPs, in other providers such as OpenStack, for example, 192.0.2.109,192.0.2.110,192.0.2.111. In this case one or more A records need to be created.

(Recommended) You can register a wildcard DNS entry

All the cluster services will be exposed as subdomains of the domain name you provided.

Note
These snippets need to be tuned to fit your own domain name.

For domain pse.example.com, you would use

cluster-init.config excerpt
[tiger]
...
domain_name = pse.example.com
domain_separator = .
path_mode = no
DNS record
*.pse  IN CNAME  <name of the ELB>.
# or
*.pse  IN A <IP of the controller>.

Then your cluster will be available at the following URLs:

A master named master-1 will be available as http://master-1.pse.example.com.

You cannot register a wildcard DNS entry

Some infrastructure services must be registered with their own domain name, but all Jenkins masters (CJOC, CJE instances) will be exposed under the domain name you provide.

You can create subdomains beyond 1 level

For some installations, it may not be possible to set up a wildcard DNS record. In those cases, using path_mode = yes will expose all the Jenkins masters under paths of the same domain. Infrastructure services will be registered as subdomains of the domain name you provide.

Note
These snippets need to be tuned to fit your own domain name.

For domain pse.example.com, you would use

cluster-init.config excerpt
[tiger]
...
domain_name = pse.example.com
domain_separator = .
path_mode = yes
DNS records
mesos.pse         IN CNAME  <name of the ELB>.
marathon.pse      IN CNAME  <name of the ELB>.
pse               IN CNAME  <name of the ELB>.
# or
mesos.pse         IN A <IP of the controller>.
marathon.pse      IN A <IP of the controller>.
pse               IN A <IP of the controller>.

Then your cluster will be available with the following URLs:

A master named master-1 will be available as http://pse.example.com/master-1.

You cannot create subdomains beyond 1 level

Infrastructure services will be registered using the provided domain as a suffix.

Note
These snippets need to be tuned to fit your own domain name.

For domain pse.example.com, you would use

cluster-init.config excerpt
[tiger]
...
domain_name = pse.example.com
domain_separator = -
path_mode = yes
DNS records
mesos-pse         IN CNAME  <name of the ELB>.
marathon-pse      IN CNAME  <name of the ELB>.
pse               IN CNAME  <name of the ELB>.
# or
mesos-pse         IN A <IP of the controller>.
marathon-pse      IN A <IP of the controller>.
pse               IN A <IP of the controller>.

Then your cluster will be available with the following URLs:

A master named master-1 will be available as http://pse.example.com/master-1.

SSL Configuration

SSL termination can be configured at controller level by configuring the Nginx proxy server with the SSL certificates or in AWS at the ELB level.

It is configured by setting protocol = https and one of the following options.

Controller Termination

Set router_ssl = yes and provide key and certificate files as nginx.key and nginx.cert respectively in the project directory.
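A minimal sketch, assuming these options sit in the [tiger] section alongside the domain options shown above:

cluster-init.config excerpt
[tiger]
...
protocol = https
router_ssl = yes
# nginx.key and nginx.cert placed in the project directory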

AWS ELB Termination

SSL certificates will need to be configured in EC2 and provided via ssl_certificate_id using Amazon Resource Name (ARN) syntax.

AWS IAM certificate example
ssl_certificate_id = arn:aws:iam::123456789012:certificate/some-certificate-name
AWS ACM certificate example
ssl_certificate_id = arn:aws:acm:us-east-1:123456789012:certificate/12345678-aaaa-bbbb-cccc-012345678901

When using path_mode = yes, two more ELBs are created for the mesos and marathon subdomains, so a wildcard SSL certificate is not required. To configure those certificates, set the ssl_certificate_id_mesos and ssl_certificate_id_marathon options using the AWS ARN syntax.

SSL certificate example in path mode
ssl_certificate_id = arn:aws:acm:us-east-1:123456789012:certificate/12345678-aaaa-bbbb-cccc-012345678901
ssl_certificate_id_mesos = arn:aws:acm:us-east-1:123456789012:certificate/12345678-aaaa-bbbb-cccc-012345678902
ssl_certificate_id_marathon = arn:aws:acm:us-east-1:123456789012:certificate/12345678-aaaa-bbbb-cccc-012345678903
Restart Controller

In the event of a controller failure, the controller can be replaced with the 'controller-restart' operation.

$ bees-pse prepare controller-restart
controller-restart is staged - review controller-restart.config and edit as needed - then run 'bees-pse apply' to perform the operation.

Edit the controller-restart.config file and enter the controller name as the [server] name.
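A sketch of the file, modeled on the worker-remove.config shown earlier (controller-1 is an example name taken from the bees-pse status output above):

[server]

# The name of the controller to restart (required)
#
name = controller-1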

Then carry out the operation with bees-pse apply.

Note
This operation will terminate the specified controller and restart a new one. To avoid loss of data, perform this operation only on a multi-controller setup.

Restore a Private SaaS Edition Cluster

If an entire cluster fails or you need to re-create a destroyed cluster, you can use the 'cluster-recover' operation to recover the cluster as long as you still have the PROJECT directory. See the Backup a Private SaaS Edition Cluster Project section for how to back up your PROJECT directory.

Note
If the cluster fails and needs recovery, it is recommended to first attempt to destroy the cluster (see Destroy a Private SaaS Edition Cluster) in order to clean up and reclaim resources, then recover it as a destroyed cluster.
Note
If you do NOT have the PROJECT directory anymore, you will have to create a new cluster following the standard initialization steps. If using EBS storage on AWS, use the same cluster_name to recover CJOC and master data.
$ bees-pse prepare cluster-recover
cluster-recover is staged - review cluster-recover.config and edit as needed - then run 'bees-pse apply' to perform the operation.

Edit the cluster-recover.config file before executing the operation to specify the configuration directory path to recover.

[pse]

## Cluster configuration directory path to recover
# path relative to the PROJECT directory
dna_path=.dna
...
  • In the case of a cluster failure, the configuration directory path (dna_path) will be the default .dna hidden path.

  • In the case of a recovery of a destroyed cluster, specify the destroyed dna path. The destroyed path is usually a hidden path like .dna-destroyed-DATESTAMP. You can list hidden paths with ls -alt

  • If the recovery is done from a different machine or bastion host, access to the administrative ports needs to be updated via the tiger_admin_port_access parameter.

Then apply the operation with bees-pse apply

Migrate an Entire EC2 Cluster

When you use the aws/ec2 version of CJP Private SaaS Edition, you should use the EBS service for persistence:

[castle]
storage_server = ebs://

This means that all long-term state information is stored in EBS volumes and snapshots. When you run bees-pse destroy, these volumes and snapshots are left in place.

You can use the EBS persistence feature to migrate between Amazon regions. You can run bees-pse destroy in one region, and then tell Amazon (through the console or CLI) to copy all snapshots to your new region. You can then run the bees-pse cluster-init operation with cluster-init.config changed to reflect the new region you want to run your cluster in (note that you will have to follow the initialization steps to set up DNS records).
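For example, a single snapshot could be copied to the target region with the AWS CLI as follows (regions and the snapshot ID are placeholders); repeat this for every snapshot belonging to the cluster:

$ aws ec2 copy-snapshot \
    --source-region us-east-1 \
    --source-snapshot-id snap-0123456789abcdef0 \
    --region us-west-2 \
    --description "PSE migration copy"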

Generate a Private SaaS Edition Support Bundle

A CloudBees Support Engineer may ask you to generate a Private SaaS Edition Support Bundle. This is a common step in triaging problems with your Private SaaS Edition instance. The bundle includes information that is helpful in troubleshooting problems. To generate it, run the following command:

$ bees-pse prepare pse-support
pse-support is staged - review pse-support.config and edit as needed - then
run `bees-pse apply` to perform the operation.

Edit the pse-support.config file before executing the operation to choose what packages to include in the bundle.

Note
By default, this operation will generate a bundle with all of the options included. If you wish to exclude a package from the bundle, indicate 'no' instead of 'yes'.
[tiger]

## Include the Support Core Plugin Bundle.
cjoc_support_bundle = yes

## Include logs of this cluster's controllers and workers.
cluster_task_logs = yes

## Include PSE Workspace Specific Files
pse_workspace = yes

Then apply the operation with bees-pse apply.

The Support Bundle archive will be saved in the current working directory as a time-stamped file called cloudbees-pse-support-YYYY-MM-DD-HH-MM-SS.tgz.

You can then submit this Support Bundle archive file to a CloudBees Support Engineer.

Note
If errors occur in generating the Support Bundle, the pse-support operation will either write a message to stdout indicating the error or indicate inside the support bundle itself if a resource could not be reached.
Note
The Private SaaS Edition Support Bundle is a simple tgz file containing mostly plaintext files. We prefer that you do not modify this archive, but, if necessary, you may redact any sensitive data in this archive.

Destroy a Private SaaS Edition Cluster

When you are finished evaluating Private SaaS Edition or you want to terminate the resources used by a Private SaaS Edition cluster, run the following command:

$ bees-pse prepare cluster-destroy
cluster-destroy is staged - review cluster-destroy.config and edit as needed - then
run `bees-pse apply` to perform the operation.
Note
If bees-pse apply fails, ensure that the machine has valid name servers configured. Running bees-pse apply without a valid name server configuration, or without the host(1) utility installed, will cause the script to fail when resolving addresses.

Edit the cluster-destroy.config file before executing the operation.

Because this is a destructive operation, you must explicitly enter the cluster name in the configuration file. This requirement helps prevent accidental cluster manipulation or destruction.

In the cluster-destroy.config file, the available options (operations on the cluster and its storage buckets, plus, in releases prior to 1.6.0, its Elasticsearch snapshots) are commented out by default.

To invoke any of these destructive parameters (destroying the cluster, its storage bucket, or its Elasticsearch snapshots), you must first uncomment that parameter in the configuration file by removing the pound sign (#) prefix, which by default neutralizes these destructive operations by treating them as comments.

By uncommenting cluster_name = [your cluster name], destroy_storage_bucket = yes, or destroy_elasticsearch_snapshots = yes, you explicitly permit those operations to be invoked.

cluster-destroy.config sample excerpt:
[pse]
...
## Cluster name
# Uncomment the next line after verifying that this is really the cluster you want to destroy.
# cluster_name = cobra

(AWS Only) You can also decide whether storage buckets should be deleted. The data stored in the storage buckets can be used to recover a cluster after it has been destroyed. However, leaving the data in place will prevent creating a new cluster with the same name.

Warning
Destroying a CJP Private SaaS Edition cluster deletes all the resources associated with that cluster. Depending on the data storage backend used, you may not be able to retrieve data after these resources are deleted.

Destroying all data

Depending on the persistent storage used during installation, data may be left over after destroying the cluster.

rsync

The rsync server and all of its data are deleted after the cluster is destroyed.

NFS

The NFS directories created for CJOC and CJE Jenkins masters are not deleted.

AWS Elastic Block Storage

Volumes, snapshots and S3 buckets with CJOC and CJE Jenkins masters data are not deleted after destroy.

To do a full deletion, the following script can be used, provided the AWS CLI is already configured for the same account used for Private SaaS Edition:

# Deletes all EBS volumes, snapshots and S3 buckets tagged for the given PSE cluster
CLUSTER_NAME=myclustername
AWS_REGION=us-east-1

ec2-snapshots-for-cluster() {
    aws ec2 describe-snapshots \
        --region $AWS_REGION \
        --owner-ids self \
        --filters Name=tag:cluster,Values=$CLUSTER_NAME \
        --query 'Snapshots[*].{ID:SnapshotId}' \
        --output text
}

ec2-volumes-for-cluster() {
    aws ec2 describe-volumes \
        --region $AWS_REGION \
        --filters Name=tag:cluster,Values=$CLUSTER_NAME \
        --query 'Volumes[*].{ID:VolumeId}' \
        --output text
}

s3-buckets-for-cluster() {
    for id in $(aws s3api list-buckets --query 'Buckets[].Name' --output text); do
        if [ -n "$CLUSTER_NAME" ] && [ "$CLUSTER_NAME" == "$(aws s3api get-bucket-tagging --bucket $id --query 'TagSet[?Key==`cloudbees:pse:cluster`].Value[]' --output text 2> /dev/null)" ]; then
            echo $id
        fi
    done
}

ec2-volumes-for-cluster | xargs -P 20 -I {} bash -c "echo {}; aws ec2 delete-volume --region $AWS_REGION --volume-id {}"
ec2-snapshots-for-cluster | xargs -P 20 -I {} bash -c "echo {}; aws ec2 delete-snapshot --region $AWS_REGION --snapshot-id {}"
s3-buckets-for-cluster | xargs -P 20 -I {} bash -c "echo {}; aws s3 rb s3://{} --force"
Note
Some snapshots and volumes may still be created for a while after the cluster is destroyed, so the commands may need to be run again.

Recreating a Destroyed CJP Private SaaS Edition Cluster

The CJP Private SaaS Edition configuration created when you set up the project is not deleted when you run bees-pse destroy. You can re-initialize by going through the installation process again.

Removing CJP Private SaaS Edition

To remove all traces of CJP Private SaaS Edition:

  1. Destroy the CJP Private SaaS Edition cluster, if it’s still running

  2. Delete stored data.

  3. Delete $PROJECT (created by bees-pse init-project …​)

  4. Delete the Private SaaS Edition CLI

Updating access from selected IPs

Both the admin_port_access parameter, controlling the admin access from selected IPs, and the user_port_access parameter, controlling the user access from selected IPs, can be updated with the 'access-port-update' operation.

$ bees-pse prepare access-port-update
access-port-update is staged - review access-port-update.config and edit as needed - then run 'bees-pse apply' to perform the operation.

Both access port parameters can contain one or more IP address ranges in CIDR notation separated by commas (for example, 192.0.2.0/24,198.51.100.1/32).

Use 0.0.0.0/0 to allow connections from any IP, or IP/32 (for example 198.51.100.1/32) to allow access from a single IP (for example: 198.51.100.1). Other CIDR network masks can be used to control wider ranges of IPs.
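A sketch of the resulting access-port-update.config, assuming it uses the same section layout as the other operation config files and reusing the example ranges above:

access-port-update.config excerpt
[tiger]

## Admin access from selected IPs (CIDR ranges, comma-separated)
admin_port_access = 198.51.100.1/32

## User access from selected IPs (CIDR ranges, comma-separated)
user_port_access = 192.0.2.0/24,198.51.100.1/32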

Then carry out the operation with bees-pse apply.

Note
This operation on access port parameters only applies if your product is not already using network information that you supplied during initial configuration.

Updating CJOC parameters

To update CJOC parameters, use the 'cjoc-update' operation.

$ bees-pse prepare cjoc-update
cjoc-update is staged - review cjoc-update.config and edit as needed - then run 'bees-pse apply' to perform the operation.

Edit the cjoc-update.config file to define the parameters you want to change. Only uncomment the parameters you want to change.

This operation allows you to (see the cjoc-update.config file for a complete list of parameters; a sketch follows this list):

  • Enable/disable the evaluation mode

  • Set the CJOC container memory

  • Set the CJOC application JVM options

  • Set the CJOC workspace disk size

  • Set a custom CJOC docker image
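A sketch only, assuming cjoc-update.config uses the same attribute names as the [cjoc] section described in the Configuration Reference (the values are illustrative):

cjoc-update.config excerpt
[cjoc]

## The CJOC container memory limit
memory = 2GB

## The CJOC application JVM options
jvm_options = -Xmx1536m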

Enabling/updating EC2 Container Registry (ECR) configuration

Note
If using an AWS IAM profile, first make sure that the proper rights are set. See Worker role policy for more information.

To update the ECR configuration, use the 'ecr-update' operation.

$ bees-pse prepare ecr-update
ecr-update is staged - review ecr-update.config and edit as needed - then run 'bees-pse apply' to perform the operation.

Edit the ecr-update.config file to define the parameters you want to change.

This operation allows you to (see the ecr-update.config file for a complete list of parameters):

  • Enable usage of the default AWS EC2 Container Registry

  • Enable AWS EC2 Container Registry for specific accounts

Scripting cluster operations

CJP Private SaaS Edition "operations" commands have a life-cycle. In order to execute an operation, you first need to stage/prepare the command. This stage lays down a configuration file that contains the input parameters of the operation.

By default, the config file is edited by the admin user. To facilitate scripting of CLI operations, there are two ways to specify operation values via the bees-pse prepare command:

  1. Config/secrets file arguments

  2. Operation parameter arguments

Note that both types of arguments can be used together if necessary. In that case, the parameter arguments overwrite the values specified in the config file. Also, only config file parameters can be specified as arguments. If the OPERATION requires secrets, a secrets file can be specified with the --secrets-file SECRETS-FILE argument.

Config/Secrets file arguments

To use a specific config file and/or secrets file for the operation, use the following options of the bees-pse prepare command. See bees-pse prepare -h for all options

  • --config-file CONFIG_FILE

    • bees-pse prepare --config-file CONFIG_FILE OPERATION

  • --secrets-file SECRETS-FILE for operations that require secrets inputs

    • bees-pse prepare --secrets-file SECRETS-FILE OPERATION

Operation parameter arguments

Operation parameters can also be specified as arguments to the bees-pse prepare command. Use bees-pse prepare OPERATION -h to get the list of parameters available for the specified OPERATION.

For example for the worker-add (Add worker(s) to the cluster) operation, the arguments available are:

$ bees-pse prepare worker-add -h
usage: tiger prepare [-h] [-p DIR] [--config-file CONFIG-FILE]
                     [--secrets-file SECRETS-FILE]
                     [--aws.worker_instance_type VAL]
                     [--aws.worker_volume_size VAL]
                     [--worker.count VAL]
                     worker-add

Prepares an operation.

Prepared operations must be configured and then applied using the apply command.

  worker-add            Prepare worker-add.

optional arguments:
  -h, --help            show this help message and exit
  -p DIR, --project DIR
                        Directory containing the PSE project files
  --config-file CONFIG-FILE
                        Use the specified config file
  --secrets-file SECRETS-FILE
                        Use the specified secrets file
  --aws.worker_instance_type VAL
                        The instance type of the worker to create.
  --aws.worker_volume_size VAL
                        The instance root volume size
  --worker.count VAL    Number of workers to add

For example, to add 2 workers using a m4.xlarge instance type on AWS, you can specify the worker count and the instance type as arguments:

$ bees-pse prepare worker-add --worker.count 2 --aws.worker_instance_type m4.xlarge
worker-add is staged - review worker-add.config and edit as needed - then run 'bees-pse apply' to perform the operation.
Note
If bees-pse apply fails, ensure that the machine has valid name servers configured. Running bees-pse apply without a valid name server configuration, or without the host(1) utility installed, will cause the script to fail when resolving addresses.

The worker-add.config file would be pre-populated with the values specified as arguments and ready for 'bees-pse apply'.

worker-add.config file content after the prepare with arguments command:

[worker]

## Number of workers to add
count = 2

[aws]

## The instance type of the worker to create.
# Leave empty to use default value
#
worker_instance_type = m4.xlarge

## The instance root volume size
# worker_volume_size =

Backup a Private SaaS Edition Cluster Project

A Private SaaS Edition Cluster project contains all configuration parameters and state information of the Private SaaS Edition Cluster. This information is needed to administer a cluster. It is recommended to back up this information to prevent loss of information and of administrative access to the cluster.

The project information contains sensitive information such as credentials to manage the underlying instances of the clusters (cloud API credentials, ssh keys) and credentials to access the various Private SaaS Edition components (mesos, marathon, Elasticsearch, CJOC, etc…​). This information should be backed up with care.

Setting up a Private SaaS Edition Project for backup

The Private SaaS Edition CLI contains a set of commands to lock a project by encrypting sensitive information and to unlock a previously locked project by decrypting it. These operations rely on GPG private/public keys to encrypt information and on the presence of an SCM ignore file (currently only a Git ignore file is provided) that prevents sensitive information from being accidentally saved into an SCM repository.

As a prerequisite to using the 'bees-pse lock-project' and 'bees-pse unlock-project' commands:

  1. Configure a GPG private/public key on the system used to manage the Private SaaS Edition Cluster

  2. Create an ACL file referencing the GPG key to use for encryption.

ACL file

The ACL file 'secrets.acl' needs to be placed under the project directory and contain a reference to the GPG key UID. For example, if the key user name is 'John Doe' and the user's email address is 'john.doe@acme.com', the secrets.acl content will be:

John Doe <john.doe@acme.com>

You can add multiple keys to the ACL file to share the project with others. See Sharing a Private SaaS Edition Project section.

Sharing a Private SaaS Edition Project

In order to share the operation of a Private SaaS Edition cluster, the project needs to be made available to the respective operators. This can be accomplished by locking the project and sharing the saved project with all operators.

Rather than sharing your GPG private key with all operators so they can unlock the project, it is preferable to lock the project using each operator’s GPG public key. To do this, import all GPG public keys into your GPG keychain (see the GPG keys how-to for more information on GPG key import) and add all key UIDs to 'secrets.acl' (one per line) prior to locking the project. Note that all GPG public keys need to be trusted.

Lock a Private SaaS Edition Project

In order to save a project, first lock the project with the 'bees-pse lock-project' command. This will encrypt the sensitive information of the project and place the project into a locked state. The locked state prevents further management commands from executing.

Once the project is locked, you can save/backup your project with your favorite tool.

Note that if you use Git as your backup repository, a '.gitignore' file is provided to prevent sensitive information from being accidentally committed.
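Put together, a minimal sketch of a lock-and-backup cycle using Git (the remote URL is hypothetical):

$ cd $PROJECT
$ bees-pse lock-project
$ git init                              # first time only; the provided .gitignore excludes sensitive files
$ git add -A
$ git commit -m "PSE project backup"
$ git push git@git.example.com:ops/pse-project.git master
$ bees-pse unlock-project               # resume cluster management locally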

Unlock a Private SaaS Edition Project

To unlock a locked project, use the 'bees-pse unlock-project' command. This decrypts the previously encrypted information and unlocks the project state so management commands can execute again.

Elasticsearch

The Elasticsearch sub-system provides a repository for various types of Private SaaS Edition data, such as raw metrics, job-related information, logs, etc…​ This section provides information about the Elasticsearch sub-system as it is being used in Private SaaS Edition.

Data Retention

The CJOC documentation on Elasticsearch gives some information on how much data CJOC retains. Incidentally, the page also provides additional useful information on how Elasticsearch is being used by CJOC. It should be required reading for Private SaaS Edition admins.

The following table details Private SaaS Edition-specific data retention information.

Name Pattern Retention Period Contents Disk usage expectation

topbeat-*

3 days

Performance data for the controller/worker nodes

70MB/node/day

logstash-*

7 days

syslog entries for PSE infrastructure components

800MB/day for the following system: three-node controller, three-node Elasticsearch, one node for master, one node for builds, 2000 jobs/day.

Default Configuration

Configuration Attribute Default Value Description

memory

2048

Total amount of memory, in MB, allocated to the Elasticsearch container. Elasticsearch itself gets 1024MB of heap space

max_instance_count

1

Number of Elasticsearch nodes

replicas

1

Number of replica shards to keep

Recommended Configuration for Medium Size Deployment

The following recommendation is for a PSE deployment that runs about 2000 jobs a day, with three Jenkins masters and five build worker nodes. Elasticsearch is to run on three dedicated worker nodes. For AWS, it is of instance type "m4.xlarge", which has 16GB of RAM, 4 vCPUs and 200G of disk space. If OpenStack is the platform, then use the equivalent flavor.

Configuration Attribute Default Value Description

memory

14000

Total amount of memory allocated to the Elasticsearch container (14000MB). Elasticsearch itself gets 7000MB of heap space

max_instance_count

3

Number of Elasticsearch nodes

replicas

2

Number of shard replicas to keep

shards

5

Number of shards per index

Changing Elasticsearch Configuration

This is typically done when the user wants to re-configure Elasticsearch, for example, changing from the out-of-the-box configuration to one that can handle the medium size deployment described above.

Please use the following sequence of actions:

  1. Add three dedicated Elasticsearch nodes by running the worker-add operation

  2. Back up Elasticsearch data by running the elasticsearch-backup operation

  3. Ensure that the backup has completed by querying Elasticsearch using the following URL http://<PSE-url>/elasticsearch/_snapshot/_status.

  4. Execute the elasticsearch-update operation. For the medium size deployment recommendation, the contents of the elasticsearch-update.config file look as follows:

elasticsearch-update.config
[elasticsearch]
max_instance_count = 3
memory = 14000
shards = 5
replicas = 2
use_dedicated_worker = true

Private SaaS Edition Elasticsearch Operations

Please refer to the Management Operations section for details on operations in general. The following sections are specific to those operations that affect the Elasticsearch sub-system.

Backing Up Elasticsearch

elasticsearch-backup

This operation initiates a back-up action for all Elasticsearch indices into a repository called "tiger-backup". This operation is invoked asynchronously, i.e., it returns immediately, before the back-up action has finished.

Tip
To check the back-up action status, reference the following Elasticsearch API end-point http://<PSE-url>/elasticsearch/_snapshot/tiger-backup/_status.
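For example, from a machine that can reach the cluster (authentication may be required depending on your setup):

$ curl -s http://<PSE-url>/elasticsearch/_snapshot/tiger-backup/_status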

Restoring Elasticsearch

elasticsearch-restore

This operation is the reverse of the elasticsearch-backup operation. It is also an asynchronous operation and the same API end-point can be used to check for its completion status.

Deleting Some Elasticsearch Indices

elasticsearch-indices-delete

This operation removes older logstash-* and topbeat-* indices. By default, it removes data older than three days. However, you can customize the cutoff point by modifying the elasticsearch-indices-delete.config file.

[elasticsearch]

## The number of days worth of elasticsearch indices to keep.
#days_to_keep = 3

Changing Elasticsearch Configuration

elasticsearch-update

This operation changes the configuration of the Elasticsearch sub-system. It has already been described in more detail in the Changing Elasticsearch Configuration section above.

Appendix: Security

CloudBees Jenkins Operations Center

Please refer to CJOC Security chapter in the CloudBees Jenkins Operations Center documentation.

Appendix: Troubleshooting

Accessing Underlying Services

CJP Private SaaS Edition makes use of several underlying services to provide cluster operations, and while these services are hidden during daily operations, there may be times when they need to be accessed to debug or fix issues, as described in the following sections.

Accessing Marathon

The Marathon URL is printed at installation time, and credentials are stored in .dna/secrets in the Private SaaS Edition project directory, under marathon_username and marathon_password.

Waiting for Marathon at http://marathon.example.com
...
Marathon: http://marathon.example.com
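For example, assuming the secrets are stored as plain text, the credentials can be looked up from the project directory with:

$ grep -r marathon_ .dna/secrets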

Accessing Apache Mesos

The Apache Mesos URL is printed at installation time, and credentials are stored in .dna/secrets in the Private SaaS Edition project directory, under router_username and router_password.

Note
Use router_username and router_password, not mesos_username and mesos_password
Mesos   : http://mesos.example.com

Private SaaS Edition Out-of-the-box Alerts

Private SaaS Edition automatically monitors its infrastructure elements and all of the Jenkins masters that it manages.

As of PSE 1.3.1, infrastructure monitoring covers all Controller, Worker and (where appropriate) Storage nodes, as well as Elasticsearch. Private SaaS Edition monitors the health status of Elasticsearch. For the various infrastructure nodes, it monitors the following metrics:

  • Available disk space

  • CPU utilization for the most recent 5 minutes

  • RAM utilization for the most recent 5 minutes

If any of these metrics reaches 90% or more, Private SaaS Edition creates an appropriate alert. For example:

Health checks failing: [worker-14: Disk util at 95%, worker-9: Worker down]
Note
It is not possible to change the metric alert thresholds at this time.

The following table shows the possible failure messages and their descriptions.

Table 1. Table Possible Failure Messages
Messages Descriptions

Disk util at <number>%

Disk utilization reaches 90% or higher

RAM util at <number>%

RAM utilization reaches 90% or higher for five or more minutes

CPU util at <number>%

Total CPU utilization reaches 90% or higher for five or more minutes. The percent utilization is normalized to 100% across all CPUs on the node.

Worker down

A worker node is either not reachable or its Mesos service is not operational

Controller down

A controller node is either not reachable or one or more of its core services are not operational

Mesos down

Mesos service is not operational on either a controller or worker node

Marathon down

Marathon service is not operational on a controller node

Zookeeper down

Zookeeper service is not operational on a controller node

Elasticsearch unreacheable
Timed out trying to reach Elasticsearch

Elasticsearch is experiencing connectivity difficulty

Cluster is in red status
Cluster is in yellow status

Elasticsearch is not functioning normally. Additional investigation is required

CJOC fails to start during installation

Installation fails with an error message [cjoc] Failed in the last attempt and output similar to

11:23:26 [cjoc] curl: (22) The requested URL returned error: 503 Service Unavailable
11:23:26 [cjoc] 11:23:16 Failure (22) Retrying in 10..
11:23:46 [cjoc] curl: (22) The requested URL returned error: 503 Service Unavailable
11:23:46 [cjoc] 11:23:26 Failure (22) Retrying in 10..
11:23:46 [cjoc] curl: (22) The requested URL returned error: 503 Service Unavailable
11:23:46 [cjoc] 11:23:36 Failure (22) Retrying in 10..
11:23:46 [cjoc] 11:23:46 Failed in the last attempt (curl -k -fsSL http://cjoc..../health/check)
11:23:46 An error occurred during cjoc initialization (22) - see .../.dna/logs/20160229T104328Z-cluster-init/cjoc

CJOC may take some time to start up, which is why multiple The requested URL returned error: 503 Service Unavailable messages are normal; the installation keeps trying to connect to CJOC for a period of time until it gives up with the Failed in the last attempt error. This failure typically means that there is some misconfiguration in the Private SaaS Edition config file.

To debug the issue, access the logs through the Mesos web interface, where one or more cjoc FAILED tasks will be found under Completed Tasks.

mesos cjoc failed

Logs can be accessed by clicking on Sandbox and downloading the stderr and stdout files.

Because this issue is typically caused by storage backend problems, those logs need to be fetched too. To do so, note the Host where the cjoc task failed (ec2-54-152-202-60.compute-1.amazonaws.com in the previous screenshot) and, under Active Tasks, find the task named castle.jce that runs on the same host. Click on that task’s Sandbox and again download the stderr and stdout files.

mesos castle task

See Destroy a Private SaaS Edition Cluster for information on how to start the cluster again from scratch.

Accessing CJOC or Master Filesystems

In some cases it may be necessary to access the filesystem of the running CJOC or Master servers.

In the following examples, replace TENANT_ID with the ID of the master, or with cjoc to access CJOC.

Private SaaS Edition 1.2.0 and later

Run bees-pse run ssh-into-tenant TENANT_ID to get a shell into the CJOC or master container.
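For example, to open a shell inside the CJOC container:

$ bees-pse run ssh-into-tenant cjoc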

Private SaaS Edition <1.2.0

First, run bees-pse run list-applications to find out in which worker host the container is running. In the following example it would be worker-2 for CJOC:

$ bees-pse run list-applications
castle.jce : worker-2
elasticsearch.jce : worker-2
castle.jce : worker-3
cjoc.jce : worker-2
castle.jce : worker-1

Then ssh into the worker with

dna connect worker-2

At this point we can get a shell into the container.

sudo docker exec -ti $(sudo docker ps -f label=com.cloudbees.pse.tenant=TENANT_ID -q) bash

Getting Files

Sometimes it may also be necessary to copy files to the local computer (or jumpbox) that has the Private SaaS Edition project configuration.

Run bees-pse run find-worker-for TENANT_ID to find out in which worker the container is running, and then get the ID of the running container as follows:

dna connect worker-2
sudo docker ps -f label=com.cloudbees.pse.tenant=TENANT_ID -q --no-trunc
exit

This will print a long ID, for example 7cc975f4da476f43602a18c60b3bcbb451b5914e61077a5e578fde26326ebf62

Next, let’s find the address of the worker with

$ bees-pse run list-workers
worker-1 (master: m4.xlarge) ec2-54-242-118-115.compute-1.amazonaws.com > ACTIVE
worker-3 (build: m4.xlarge) ec2-54-167-45-25.compute-1.amazonaws.com > ACTIVE
worker-2 (master: m4.xlarge) ec2-52-3-254-90.compute-1.amazonaws.com > ACTIVE

Now we can download any file from the JENKINS_HOME (/mnt/TENANT_ID/CONTAINER_ID/ in the worker filesystem) using scp. For example:

scp -i .dna/identities/default ubuntu@ec2-52-3-254-90.compute-1.amazonaws.com:/mnt/cjoc/7cc975f4da476f43602a18c60b3bcbb451b5914e61077a5e578fde26326ebf62/support/support_*.zip .

Appendix: Recovery

Recovering from Worker Node Failures

Worker nodes constitute the engine that drives PSE. For a variety of reasons, these worker nodes can occasionally become unproductive or unresponsive. Here are some strategies to restore a worker node to full capacity.

Restarting a Worker Node

Before doing anything drastic to an existing worker node, it is recommended that you restart it. This approach takes the least amount of time and, in many cases, can fix temporary blockages.

Please follow the instructions in the Restarting Workers section.

Replacing a Worker Node

If the worker node becomes completely unresponsive, it is better to replace it.

This is a two-step process: first remove the worker, then add a new one.

PSE Cluster Disaster Recovery

Even after a catastrophic failure in which the entire Private SaaS Edition cluster is lost, it is possible to recover it. This happens automatically upon re-provisioning the cluster, thanks to the PSE storage mechanism.

Disaster-Recovery with PSE Project Directory

If you still have the PSE Project directory, which was created during the initial preparatory steps when you installed PSE, simply initialize Private SaaS Edition again. Specifically, on the host machine that has the PSE directory, do the following:

$ cd $PROJECT
$ bees-pse init

Tuning Elasticsearch Backup Settings

Private SaaS Edition includes automatic backups of Elasticsearch. This is configurable in CJOC and it is documented in the Analytics section of the CJOC User Guide.

Disaster-Recovery without PSE Project Files

If you no longer have the Private SaaS Edition Project directory or its backup, you can still recover the PSE cluster. However, you will need to go through the installation process again. You also need to ensure that the new cluster name is identical to that of the Private SaaS Edition cluster you are trying to recover.

For example, if the lost Private SaaS Edition cluster was named 'tigress', then while going through the configuration process, you will need to ensure that the new cluster is also named 'tigress'.

Post Recovery Actions

Once the recovered cluster becomes active, you will have restored all of the Jenkins masters. They will, however, be in non-running states and will not be connected to CJOC. You will need to invoke the Start operation on all of these restored masters.

You can do this manually if you only have a few masters. If you have more than a few masters, starting all masters is a good candidate for a cluster operation.

If you do not see any restored Jenkins masters, invoke the "Reload Configuration from Disk" action under "Manage Jenkins" and then start the restored Jenkins masters as outlined above.

Appendix: CLI Reference

bees-pse

Most of the Private SaaS Edition Cluster initialization and maintenance actions can be accomplished by executing various sub-commands of the bees-pse CLI. Here is a top-level summary of the available sub-commands.

$ bees-pse -h
usage: bees-pse [-h] COMMAND ...

optional arguments:
  -h, --help       show this help message and exit

  COMMAND
    init-project     Initializes a Private SaaS Edition project skeleton
    init             Initialize a Private SaaS Edition environment
    check            Run tests against a Private SaaS Edition environment
    status           Show Private SaaS Edition cluster status
    list-operations  List supported operations
    prepare          Prepare an operation
    verify           Verify a staged operation
    apply            Apply a staged operation
    cancel           Cancel a staged operation
    upgrade-project  Upgrade Private SaaS Edition project to the current version
    upgrade          Upgrade Private SaaS Edition to the latest version
    reinit           Reinitialize a Private SaaS Edition resource
    run              Run a Private SaaS Edition command
    destroy          Destroy a Private SaaS Edition cluster
    version          Print Private SaaS Edition version
    lock-project     Lock a Private SaaS Edition project
    unlock-project   Unlock a Private SaaS Edition project

For more information about a particular command, use:

$ bees-pse COMMAND -h

where COMMAND is one of the commands above.

Appendix: Configuration Reference

Overview

PSE Cluster configuration is achieved by customizing two text files: cluster-init.config and cluster-init.secrets. These are text files where customization values are stored in section-based name-value pairs. For example,

  1. cluster-init.config

[tiger]
cluster_name = pse
controller_count = 3
...

Most of the attributes are common to all supported environments (AWS, OpenStack, etc.). However, some of the attributes are environment-specific. While many of these attributes have reasonable default values, some need explicit initialization. Use cases such as upgrading and managing workers may require further customization of these files. Please refer to the following sections for more details.
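
As a fuller illustration, here is a hedged sketch of a [tiger] section using only the common attributes documented in the next section; all values are illustrative placeholders rather than recommendations:

[tiger]
cluster_name = pse
controller_count = 3
master_worker_count = 2
build_worker_count = 1
docker_registry = public
## Uncomment if an HTTP proxy is required to reach the Internet
# http_proxy = http://proxy.example.com:3128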

Common Configuration Attributes

Table 2. [tiger] section
[tiger] Attribute Description

cluster_name

Name of the cluster (Required, default "pse")

This value is used to prefix resource names and tags when setting up servers and other infrastructure.

This value must contain only alphanumeric characters, dots, dashes, and underscores.

controller_count

Number of controllers (default 3)

Controllers manage coordination of the Private SaaS Edition resources. Include multiple controllers to ensure availability of the Private SaaS Edition cluster. When using multiple controllers, always use an odd number of controllers.

master_worker_count

Number of 'master' workers (default 2)

Private SaaS Edition workload, for masters, CJOC and PSE components, is handled by 'master' workers. To increase the initial capacity of your Private SaaS Edition cluster, increase this number. You can add more workers later with the 'worker-add' operation. We recommend a minimum of two 'master' workers.

build_worker_count

Number of 'build' workers (default 1)

Private SaaS Edition workload for builds is handled by 'build' workers. To increase the initial build capacity of your Private SaaS Edition cluster, increase this number. You can add more workers later with the 'worker-add' operation.

docker_registry

Docker registry (default public)

Specify the registry used to obtain Private SaaS Edition images. To use the public Docker registry, use 'public'.

http_proxy

HTTP Proxy (optional)

If servers in the PSE cluster require an HTTP proxy to access the Internet, specify the full URL to the proxy server including the applicable port.

If servers in the PSE cluster do not have any public IP, they must be accessed through an SSH proxy. The following optional section allows you to configure access to an SSH proxy.

Table 3. [sshproxy] section
[sshproxy] Attribute Description

user

SSH Proxy user name (optional)

Username to log in as on the ssh proxy.

host

SSH Proxy host name (optional)

Host of the ssh proxy.

port

SSH Proxy port (optional)

Port of the ssh proxy. Defaults to 22.

identity_file

SSH Proxy identity file (optional)

Path to a private key to be used to connect to the SSH proxy.

netcat_cmd

SSH Proxy Netcat command (optional)

GNU Netcat must be installed on the machine used as the SSH proxy. This attribute allows you to provide a custom path to the netcat executable. Defaults to 'nc'.
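
A hedged sketch of an [sshproxy] section built from the attributes above; the user, host, and key path are illustrative placeholders:

[sshproxy]
user = ubuntu
host = bastion.example.com
port = 22
identity_file = ~/.ssh/bastion.pem
netcat_cmd = nc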

CloudBees Jenkins Operations Center Configuration Attributes

Table 4. [cjoc] section
[cjoc] Attribute Description

memory

The container memory limit (default 1.5GB)

This value specifies the amount of memory allocated to the CJOC container.

jvm_options

The CJOC application JVM options (default '-Xmx1024m')

This value specifies the options used by the Java Virtual Machine.

Note that if this value specifies the maximum heap size of the JVM, it should be lower than the container memory limit.
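
A hedged sketch of a [cjoc] section; the values simply mirror the documented defaults, and the point to note is that the JVM maximum heap (-Xmx) stays below the container memory limit. Check the exact value format accepted for memory against your generated cluster-init.config template:

[cjoc]
memory = 1.5GB
jvm_options = -Xmx1024m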

Docker Configuration Attributes

Table 5. [docker] section
[docker] Attribute Description

options

The Docker daemon options (default --log-driver=syslog)

This value sets the Docker daemon options. --log-driver=syslog should always be included to ensure logs are collected.
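
For example, a minimal [docker] section that keeps the required syslog log driver:

[docker]
options = --log-driver=syslog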

Common Troubleshooting Attributes

Note
The following debug settings are for internal use only.
Table 6. [debug] section
[debug] Attribute Description

disable_castle

Disable castle (storage agent) (default no)

disable_elasticsearch

Disable elasticsearch (default no)

disable_cjoc

Disable cjoc (CloudBees Jenkins Operations Center) (default no)

enable_logstash

Enable logstash (default no)

Logstash may optionally be used to capture syslog events within the Private SaaS Edition cluster.

force_pull_image

Force docker image pulls for Marathon apps (default false)

Use this value to set forcePullImage for docker containers in the Marathon applications that support this setting.

enable_hello_app

Enable the hello app (default no)

The hello app is a trivial application that tests basic app deployment to Private SaaS Edition. When testing support for applications, you can set this to yes, which will generate the hello app and start it as part of the Marathon application init. You can also manually start and stop the hello application from dna.

AWS-Specific Configuration Attributes

Table 7. [aws] section
[aws] Attribute Description

region

AWS region (required)

The region in which Private SaaS Edition will run. This must correspond to the AWS credentials specified above.

Use one of these values:

us-east-1
us-west-1
us-west-2
eu-west-1
eu-central-1
sa-east-1
ap-south-1
ap-southeast-1
ap-southeast-2
ap-northeast-1
ap-northeast-2

availability_zone

AWS availability zone.

The availability zone(s) where PSE will run, separated by commas.

The number of availability zones must be either 1, or 3 for fault tolerance. If not set, a random availability zone is picked.

default_ami

Server AMI base image. Defaults to the AMI for the Private SaaS Edition version being installed

controller_instance_type

EC2 Instance type to use for controller servers (defaults to m4.large)

worker_instance_type

EC2 Instance type to use for worker servers (defaults to m4.xlarge)

storage_ami

Storage server AMI. Same as above but for the storage server.

storage_instance_type

EC2 Instance type to use for the storage server (defaults to m3.medium)

worker_instance_profile

Instance profile of the worker (optional)

If set, launches the worker instance(s) with an IAM instance profile that will be used to obtain AWS credentials for worker operations.

Cluster access restrictions

tiger_admin_port_access

Admin access from selected IPs (required)

Administrative ports are not accessible by default, but access can be granted to specific IPs and networks by using tiger_admin_port_access.

Ensure this property includes the installation machine's IP; failing to do so will result in a failed installation.

In AWS or other public clouds, that would be your public-facing IP. You can find your public-facing IP by searching for "what is my ip" on Google.

In OpenStack or other private clouds, it would be the IP of the machine running the installation as seen by the VMs started in the cloud.

The property can contain one or more IP address ranges in CIDR notation separated by commas that are granted access to the administrative ports (for example: 192.0.2.0/24,198.51.100.1/32).

Use 0.0.0.0/0 to allow administrative connections from any IP, or 198.51.100.1/32 to allow access from only one IP (198.51.100.1).

Ports open for admin access include the ones for user access plus:

  • 22 SSH access to controllers and workers

tiger_user_port_access

User access from selected IPs (optional)

By default, applications are open to any IP (0.0.0.0/0).

By providing a value, you can restrict access to a particular set of IPs.

The property can contain one or more IP address ranges in CIDR notation separated by commas that are granted access to the user ports (for example: 192.0.2.0/24,198.51.100.1/32).

Ports open for user access include:

  • 80 HTTP access

  • 443 HTTPS access (if enabled)

  • 2222 Jenkins SSH

VPC options

vpc_id

User-provided VPC ID (optional)

If set, PSE will create all resources within the given VPC. It is assumed that the given VPC ID is valid within the given account and region.

If unset, PSE will create a new VPC and create all resources inside it.

vpc_subnet_id

User-provided subnet ID (required if vpc_id is set)

If vpc_id above is set, you must also provide the ID of a subnet defined inside the user-provided VPC, along with the availability zone it corresponds to (using the availability_zone parameter).

The number of subnets must be either 1, or 3 (separated by comma) for fault tolerance.

Instances created by PSE will be launched on the provided subnet.

vpc_subnet

VPC subnet (default 10.16.0.0/16). Unused if vpc_id is set.

VPC and subnet CIDR block.

internal

Whether the cluster is internal only (defaults to no)

Allows you to set up a PSE cluster with no public access (no public ELB, no public IPs). This can be enabled only if the VPC is user-provided (vpc_id and vpc_subnet_id provided above).
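
A hedged sketch of an [aws] section combining attributes documented above; the region, zone, CIDR ranges, and IDs are illustrative placeholders:

[aws]
region = us-east-1
availability_zone = us-east-1a
controller_instance_type = m4.large
worker_instance_type = m4.xlarge
tiger_admin_port_access = 198.51.100.1/32
tiger_user_port_access = 0.0.0.0/0
## Optional: install into an existing VPC
# vpc_id = vpc-0123456789abcdef0
# vpc_subnet_id = subnet-0123456789abcdef0
# internal = no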

Table 8. [storage] section
[storage] Attribute Description

type

Storage type (default ebs if permissions allow, otherwise builtin-rsync)

Private SaaS Edition requires a means of backing up and restoring workspace state. The storage system can be configured with various back-end types.

Types supported:

builtin-rsync

Creates a storage server within the Private SaaS Edition cluster that is used to back up and restore workspace files using rsync.

nfs

Uses a mounted NFS volume to store workspace files

ebs

Uses AWS elastic block store (EBS) to store and snapshot workspace files

nfs_server

NFS server (only if type = nfs)

If using the nfs storage type, you must specify a server to use.

Specify the hostname or IP address of an NFS server that is accessible to the Private SaaS Edition cluster.

nfs_export_dir

NFS export directory

If using the nfs storage type, specify the exported directory on the NFS server (see nfs_server above) to use for storage.

Ensure that the directory is specified exactly as defined on the NFS server. The directory should start with a forward-slash (e.g. /var/myexport).

ebs_snapshot

EBS snapshot (only if type = ebs)

If using the ebs storage type, you must specify a snapshot to be used for creating a new EBS volume.

The snapshot must be of an XFS formatted volume and available in the cluster region (see region setting above).

volume_user

Volume user (default jenkins)

Specify the user that will own the backed-up workspace files on the volume.

volume_group

Volume group (default jenkins)

Specify the group that will own the backed-up workspace files on the volume.
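
A hedged example of a [storage] section using the ebs back end; the snapshot ID is an illustrative placeholder and must point to an XFS-formatted snapshot available in the cluster region:

[storage]
type = ebs
ebs_snapshot = snap-0123456789abcdef0
volume_user = jenkins
volume_group = jenkins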

Table 9. cluster-init.secrets.in template for AWS
Attribute Description

aws_access_key_id

AWS access key ID (required)

This is the access key ID from your AWS credentials. It must correspond to the value for aws_secret_access_key below.

aws_secret_access_key

AWS secret access key (required)

This is the secret access key from your AWS credentials. It must correspond to the value for aws_access_key_id (above).

aws_session_token

AWS session token (optional)

If using ephemeral credentials from Amazon STS, the session token can be supplied via this configuration option.

docker_registry_auth

Docker authorization (optional)

If using a Docker registry that requires authentication, specify the base64-encoded auth string used by Docker here.

cjoc_username

CJOC username (optional - defaults to admin)

cjoc_password

CJOC password (optional - defaults to a generated random string)
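
A hedged sketch of the AWS secrets values; the keys shown are Amazon's documented example credentials, not real ones, and the exact file layout (including any section header) should follow the cluster-init.secrets template generated for your project:

aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
## Only needed when using ephemeral credentials from Amazon STS
# aws_session_token =
cjoc_username = admin
## Leave cjoc_password unset to get a generated random password
# cjoc_password =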

OpenStack-Specific Attributes

Table 10. [openstack] section
[openstack] Attribute Description

tenant_name

OpenStack tenant name (required)

Example: tiger

auth_url

OpenStack identity service endpoint (required)

image_name

Default image name for Private SaaS Edition servers (required)

Example: image_name = cloudbees-tiger-ubuntu-1.0

storage_image_name

Image name override for Storage node (optional)

If not specified, then the value specified in the attribute "openstack.image_name" is used

storage_instance_type

Flavor of the storage instance (default m1.small)

If you intend to use builtin-rsync, change this value to an appropriately larger flavor.

controller_image_name

Image name override for Controller nodes (optional)

If not specified, then the value specified in the attribute "openstack.image_name" is used

controller_instance_type

Flavor of the controller instances (default m1.large)

worker_image_name

Image name override for Worker nodes (optional)

If not specified, then the value specified in the attribute "openstack.image_name" is used

worker_instance_type

Flavor of the worker instances (default m1.large)

network_name

Network name within which Tiger servers are created (required)

floatip_network_name

Name of network that houses floating IPs to be assigned to the controller instances so they are accessible from an external network, e.g., the Internet (optional)

This network name will be used to generate one or more floating IPs to expose VMs to the outside world. If not set, only private IPs from "network_name" will be assigned and PSE may not be accessible from outside the cluster, depending on your network topology.

floatip_workers

Whether to give a floating IP to workers (default no)

By default, workers are not exposed to the outside world. Set this to 'yes' to attach workers to the floating IP network, just like controllers. This allows external JNLP agents to connect to the masters within the cluster.

tiger_admin_port_access

Admin access from selected IPs (required)

Administrative ports are not accessible by default, but access can be granted to specific IPs and networks by using tiger_admin_port_access.

Ensure this property includes the installation machine's IP; failing to do so will result in a failed installation.

In AWS or other public clouds, that would be your public-facing IP. You can find your public-facing IP by searching for "what is my ip" on Google.

In OpenStack or other private clouds, it would be the IP of the machine running the installation as seen by the VMs started in the cloud.

The property can contain one or more IP address ranges in CIDR notation separated by commas that are granted access to the administrative ports (for example, 192.0.2.0/24,198.51.100.1/32).

Use 0.0.0.0/0 to allow administrative connections from any IP, or 198.51.100.1/32 to allow access from only one IP (198.51.100.1).
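
A hedged sketch of an [openstack] section built from the attributes above; the auth URL, network names, and CIDR range are illustrative placeholders:

[openstack]
tenant_name = tiger
auth_url = https://keystone.example.com:5000/v2.0
image_name = cloudbees-tiger-ubuntu-1.0
network_name = private-net
floatip_network_name = public-net
controller_instance_type = m1.large
worker_instance_type = m1.large
tiger_admin_port_access = 198.51.100.1/32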

Table 11. [storage] section
[storage] Attribute Description

type

Storage type (default builtin-rsync)

Private SaaS Edition requires a means of backing up and restoring workspace state. The storage system can be configured with various back-end types.

Types supported:

builtin-rsync

Creates a storage server within the Private SaaS Edition cluster that is used to back up and restore workspace files using rsync.

nfs

Uses a mounted NFS volume to store workspace files

count

Number of storage servers (default 1)

When using rsync with a built-in storage server, set count=1. When using another back end or a provided storage server, set count=0.

nfs_server

NFS server (only if type = nfs)

If using the nfs storage type, you must specify a server to use.

Specify the hostname or IP address of an NFS server that is accessible to the Private SaaS Edition cluster.

nfs_export_dir

NFS export directory (only if type = nfs)

If using the nfs storage type, specify the exported directory on the NFS server (see nfs_server above) to use for storage.

Ensure that the directory is specified exactly as defined on the NFS server. The directory should start with a forward-slash (e.g. /var/myexport).

volume_user

Volume user (default jenkins)

Specify the user that will own the backed-up workspace files on the volume.

volume_group

Volume group (default jenkins)

Specify the group that will own the backed-up workspace files on the volume.
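
For example, a hedged [storage] section for OpenStack using the built-in rsync storage server:

[storage]
type = builtin-rsync
count = 1
volume_user = jenkins
volume_group = jenkins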

Table 12. cluster-init.secrets.in template for OpenStack
Attribute Description

openstack_user_name

OpenStack username (required)

openstack_password

OpenStack password (required)

docker_registry_auth

Docker authorization (optional)

If using a Docker registry that requires authentication, specify the base64-encoded auth string used by Docker here.

cjoc_username

CJOC username (optional - defaults to admin)

cjoc_password

CJOC password (optional - defaults to a generated random string)
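
A hedged sketch of the OpenStack secrets values; both credential values are placeholders, and the exact layout should follow the cluster-init.secrets template generated for your project:

openstack_user_name = pse-installer
openstack_password = replace-with-real-password
cjoc_username = admin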

Appendix: AWS Running in an internal-only VPC

By default, Private SaaS Edition installs by setting up public resources, such as ELBs and public IPs. However, it can be installed in an internal-only VPC if required.

Prerequisites

  • You need to have an existing VPC (or to set one up) and a subnet with outbound connectivity.

  • The workstation used for Private SaaS Edition installation needs to have access to the VPC private network.

Reference architecture

This is the reference architecture we have been using to test this feature. Instances of the PSE cluster are created using vpc-1 and subnet-1. The PSE workstation was connected via VPN to another VPC, which is peered to vpc-1 through a VPC peering connection (pcx-1). An alternative would have been to create the PSE workstation directly in the VPC, on the public subnet (subnet-2).

Table 13. AWS resources
Resource Type Elements Attributes

VPC

vpc-1

CIDR: 172.18.128.0/17

Subnet

subnet-1 (private)

CIDR: 172.18.128.0/24

Auto-assign Public IP: no

Route Table: rt-1

subnet-2 (public)

CIDR: 172.18.130.0/24

Auto-assign Public IP: yes

Route Table: rt-2

Route table

rt-1 (private)

172.18.128.0/17 → local

0.0.0.0/0 → nat-1

172.18.64.0/18 → pcx-1

rt-2 (public)

172.18.128.0/17 → local

0.0.0.0/0 → igw-1

172.18.64.0/18 → pcx-1

VPC Peering

pcx-1

Peered VPC CIDR: 172.18.64.0/18

NAT Gateway

nat-1

Attached to public subnet

Internet Gateway

igw-1

Installation

In the cluster-init.config, you will need to provide the following (see the snippet after this list):

  • vpc_id : ID of the existing VPC

  • vpc_subnet_id : ID of the existing subnet

  • internal : must be set to yes
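
A hedged snippet of the relevant part of cluster-init.config; the VPC and subnet IDs are illustrative placeholders:

[aws]
vpc_id = vpc-0123456789abcdef0
vpc_subnet_id = subnet-0123456789abcdef0
internal = yes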

Appendix: AWS Enhanced Networking

Users of AWS can take advantage of its enhanced networking capabilities, which provide higher networking performance on instances that meet the requirements for enhanced networking. In particular, the requirements are (refer to the AWS Enabling Enhanced Networking on Linux Guide for more details):

Enhanced Networking Requirements
  1. Instances must use an HVM AMI

  2. Instances must be in a VPC

  3. Instances must have SriovNetSupport

  4. Instances must use the appropriate network driver

  5. Instances must be of the correct instance type

Private SaaS Edition provisions all of its infrastructure nodes with most of these requirements already in place, with the exception of the correct instance type, since the instance types associated with enhanced networking consume many more resources and are correspondingly more costly. However, if your use case requires enhanced networking, the Private SaaS Edition admin can selectively provision only those nodes that need faster networking throughput.

Please refer to the OpenSource EC2 Instance Types Guide to get an idea of the instance types that support enhanced networking.

IMPORTANT:

AWS instance types that support enhanced networking are much larger, typically offering large amounts of RAM, disk space, and cores. The consequence of having more RAM is that an instance can host more Jenkins masters, which can become a problem if the RAM amount is large enough.

AWS instances can attach only a fixed number of drives (about 21). Each Jenkins master uses one drive. Thus, if there are more than 21 Jenkins masters on a node, subsequent Jenkins masters will not be able to launch.

How to Provision a Worker Node with Enhanced Networking

Follow the instructions to add a new worker. Edit the file worker-add.config to use the appropriate instance type. For example,

[worker]

## Number of workers to add
count = 1

[aws]

## The instance type of the worker to create.
# Leave empty to use default value
#
worker_instance_type = c4.8xlarge

## The instance root volume size
# worker_volume_size =

[mesos]

## Set specific attributes to segregate workload
# <experimental feature>
# slave_attributes =

The instance type c4.8xlarge has a 10 Gigabit network speed.

Appendix: AWS WAF integration

Private SaaS Edition can be integrated with AWS WAF, but it needs additional steps.

Let’s assume you are running Private SaaS Edition using the domain name pse.infra.example.com and the ELB instance created by PSE is named pse-123456789.us-east-1.elb.amazonaws.com.

The DNS entry you created in your registrar should look like:

pse.infra.example.com          IN CNAME pse-123456789.us-east-1.elb.amazonaws.com.

AWS WAF integrates with AWS CloudFront or with AWS ALB. As Private SaaS Edition uses ELBs, you will first need to set up AWS CloudFront; then you will be able to add WAF rules.

Prerequisites

  • The user access ports (tiger_user_port_access) must allow access to CloudFront IPs. Due to the very large set of IP ranges used by CloudFront, it may be impossible to restrict access when using CloudFront.

  • Your cluster must be configured with path mode enabled (CJOC is accessible using https://pse.infra.example.com/cjoc/), because CloudFront is not compatible with the virtual host routing used by Private SaaS Edition.

Architecture

cloudfront diagram

This diagram illustrates a setup where HTTPS is enabled. Communication between CloudFront and the ELB also uses HTTPS, so it is important to set up the appropriate SSL certificates.

Set up an alternative domain name

elb.infra.example.com IN CNAME pse-123456789.us-east-1.elb.amazonaws.com.

This new domain will be used when accessing the cluster through CloudFront.

You should have an HTTPS certificate valid for both pse.infra.example.com and elb.infra.example.com.

Set up CloudFront

Disclaimer: these instructions will allow you to set up a working CloudFront distribution in front of your cluster. It may not satisfy all your requirements. Please read the relevant documentation for more information about each option of this service.

Using the AWS Web Console, go to the CloudFront service.

Create a new distribution, select Web.

Origin Settings section

  • Origin domain name: elb.infra.example.com

  • Origin Protocol Policy: Match Viewer

Default Cache Behaviour Settings

  • Allowed HTTP Methods: GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE

  • Forward Headers: All

  • Forward Cookies: All

  • Query String Forwarding and Caching: Forward All, cache based on all

Distribution Settings

  • Price Class: Pick the relevant value for you

  • Alternate Domain Names (CNAMEs): pse.infra.example.com

  • SSL Certificate: Custom SSL Certificate. Pick a certificate valid for this domain (it may be the same one as for the ELB).

Click on Create Distribution.

A domain name will be allocated for the new distribution, such as abcdefghijklm.cloudfront.net

Wait for the status to be Deployed.

Check that CloudFront is working

curl -I -H "Host: pse.infra.example.com" https://abcdefghijklm.cloudfront.net

This should return something like:

HTTP/1.1 403 Forbidden
Content-Length: 837
Content-Type: text/html;charset=UTF-8
Date: Fri, 16 Dec 2016 09:27:11 GMT
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Server: nginx/1.4.6 (Ubuntu)
X-Content-Type-Options: nosniff
X-Hudson: 1.395
X-Hudson-CLI-Port: 31807
X-Jenkins: 2.19.4.2-rolling
X-Jenkins-CLI-Host: ec2-12-123-123-123.compute-1.amazonaws.com
X-Jenkins-CLI-Port: 31807
X-Jenkins-CLI2-Port: 31807
X-Jenkins-Session: 12345678
X-Permission-Implied-By: hudson.security.Permission.GenericRead
X-Permission-Implied-By: hudson.model.Hudson.Administer
X-Required-Permission: hudson.model.Hudson.Read
X-You-Are-Authenticated-As: anonymous
X-You-Are-In-Group:
Connection: keep-alive

This shows that CloudFront has successfully proxied the request to CJOC.

In case you obtain a CloudFront error, check that:

  • The ELB security group is open to CloudFront IPs

  • The domain is declared in Alternate Domain Names (CNAMEs) in the CloudFront configuration

Update DNS entry

You may now update your DNS to target CloudFront instead of ELB.

pse.infra.example.com IN CNAME abcdefghijklm.cloudfront.net

Create a WAF ACL

Disclaimer: these instructions set up only the basics of AWS WAF on your cluster. For more information about AWS WAF features, please read the relevant documentation.

Using the AWS Web Console, go to the WAF service.

Filter: Global (CloudFront)

Create web ACL

  • Web ACL name: Choose a relevant name (for example the name of your PSE cluster).

  • CloudWatch metric name: same as above.

  • Region: Global (Cloudfront)

  • AWS resource to associate: select the CloudFront Distribution you created above.

Start creating conditions you want to apply.