Monitoring Plugin

Introduction

The open source Metrics plugin for Jenkins collects metrics about how Jenkins is performing. The CloudBees Monitoring plugin adds alerts that fire when those metrics move outside user-defined ranges.

The Monitoring plugin was introduced in CloudBees Core 14.05.

As of version 2.0, the Monitoring plugin no longer supports metrics-based views; use CloudBees Jenkins Analytics instead.

Metrics-based alerts

This feature allows you to define different metrics-based alerts and have Jenkins send emails when the alerts start and finish.

When the feature is enabled, it adds an Alerts action to the top-level Jenkins actions. The Alerts action shows the status of all the defined alerts and lets you silence specific alerts.

Note

For the alerting via email to function, Jenkins must be configured to be able to send emails.

Creating some basic alerts

The following instructions will create four basic alerts:

  • An alert that triggers if any of the health reports are failing

  • An alert that triggers if the file descriptor usage on the master goes above 80%

  • An alert that triggers if the JVM heap memory usage is over 80% for more than a minute

  • An alert that triggers if the 5 minute average of HTTP/400 (bad request) responses goes above 10 per minute for more than five minutes

These instructions assume you have configured Jenkins with the SMTP settings required for sending emails.

  1. Log in as an administrator and navigate to the main Jenkins configuration screen.

  2. Scroll down to the Alerts section.

  3. Click the Add button that corresponds to Conditions.

  4. Select the Health check score option

  5. Specify Health checks as the Alert title. Leave the Alert after at 5 seconds. If you want additional recipients for this health check only, add them here. Emails are sent to the Global Recipients as well as to any alert-specific Recipients.

  6. Click the Add button that corresponds to Conditions.

  7. Select the Local metric gauge within range option

  8. Specify vm.file.descriptor.ratio as the Gauge. Specify 0.8 as Alert if above. Specify File descriptor usage below 80% as the Alert title. Leave the Alert after at 5 seconds.

  9. Click the Add button that corresponds to Conditions.

  10. Select the Local metric gauge within range option

  11. Specify vm.memory.heap.usage as the Gauge. Specify 0.8 as Alert if above. Specify JVM heap memory usage below 80% as the Alert title. Specify the Alert after as 60 seconds.

  12. Click the Add button that corresponds to Conditions.

  13. Select the Local metric meter within range option

  14. Specify http.responseCodes.badRequest as the Meter. Specify 5 minute average as the Value. Specify 0.16666666 as Alert if above; meter rates are all reported in events per second, so 10 per minute is entered as 10/60. Specify Less than 10 bad requests per minute as the Alert title. Specify the Alert after as 300 seconds.
  15. Click the Add button that corresponds to Global Recipients.

  16. Select the Email notifications option

  17. Specify the alert email recipients as a whitespace or comma separated list in the Email addresses text box.

  18. Save the configuration.

  19. The main Jenkins root page should now have an Alerts action. Click this action to view the alerts.
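Because meter rates are reported in events per second, a threshold that you think of as a per-minute figure has to be divided by 60 before it is entered as Alert if above. The following is a minimal sketch of that arithmetic; the function name is illustrative and is not part of the plugin.

```python
def per_minute_to_per_second(events_per_minute: float) -> float:
    """Convert a per-minute rate into the per-second unit that meters report."""
    return events_per_minute / 60.0


# 10 bad requests per minute, the limit used for the fourth alert above.
threshold = per_minute_to_per_second(10)
print(f"{threshold:.8f}")
```

The same conversion applies to any other meter-based alert, for example 30 events per minute becomes 0.5 events per second.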

Managing alerts

Each alert can be in one of four states:

Table 1. Alert states

State        When
Failing      The alert condition is met for less than the Alert after duration
Failed       The alert condition has been met for at least the Alert after duration
Recovering   The alert condition is not met for less than the Alert after duration
Recovered    The alert condition is not met for at least the Alert after duration

Notification emails will be sent for any alerts that are not silenced on either of the following transitions:

  • Failing to Failed

  • Recovering to Recovered

The alerts are checked every 5 seconds. The Alerts page displays the current value of each alert condition. If the condition has changed in between these alert checks, the UI may show the alert in a mixed state, as in Figure 1.

Figure 1. An alert where the condition has changed prior to the periodic checks running

However, once the periodic check runs, the condition will enter either the Failing or Recovering state.

Figure 2. An alert having entered the Failing state

If the condition changes before the condition’s Alert after time expires then no notifications will be sent.

Figure 3. An alert having entered the Recovering state

On the other hand, if the condition stays constant for the entire Alert after time then a notification will be sent.

Figure 4. An alert having entered the Failed state
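The four states and two notifying transitions described above can be modelled as a small state machine. This is only an illustrative sketch and not the plugin's implementation: the class, the message strings, the initial Recovered state, and the handling of a condition that flips while the alert is still Failing or Recovering are all assumptions.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    FAILING = "Failing"
    FAILED = "Failed"
    RECOVERING = "Recovering"
    RECOVERED = "Recovered"


class Alert:
    def __init__(self, alert_after: float) -> None:
        self.alert_after = alert_after   # seconds the condition must hold
        self.state = State.RECOVERED     # assumption: alerts start healthy
        self._since = 0.0                # when the transitional state began

    def check(self, condition_met: bool, now: float) -> Optional[str]:
        """Run one periodic check; return a notification text or None."""
        if condition_met:
            if self.state in (State.RECOVERED, State.RECOVERING):
                # Assumption: a flip restarts the Alert after timer.
                self.state, self._since = State.FAILING, now
            if self.state is State.FAILING and now - self._since >= self.alert_after:
                self.state = State.FAILED
                return "alert started"      # Failing to Failed
        else:
            if self.state in (State.FAILED, State.FAILING):
                self.state, self._since = State.RECOVERING, now
            if self.state is State.RECOVERING and now - self._since >= self.alert_after:
                self.state = State.RECOVERED
                return "alert finished"     # Recovering to Recovered
        return None


alert = Alert(alert_after=60)
for t in range(0, 65, 5):                   # checks run every 5 seconds
    message = alert.check(condition_met=True, now=t)
print(alert.state.value, message)
```

A silenced alert or an active maintenance window would simply discard the returned message rather than change the state logic.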

The Silence button can be used to suppress the sending of notifications for specific alerts. The alerts are re-enabled using the Enable button.

Figure 5. Some alerts having been silenced

Maintenance windows

The administrator of a Jenkins instance can use Jenkins CLI commands to schedule maintenance windows for that instance. During a maintenance window, all alerts effectively behave as if they were muted; that is, they do not send any notifications.

If all of the following are true:

  • an alert is transitioning to a different state before a maintenance window starts

  • the alert state transition completes during the maintenance window

  • the alert is still in the new state when the maintenance window ends

  • the Jenkins instance was not restarted during the maintenance window

then the notification of that state transition is processed after the maintenance window ends.

There are five Jenkins CLI commands available for managing scheduled maintenance windows:

schedule-maintenance-window

Schedules a maintenance window. This command takes three parameters in order:

  1. The start time. This is parsed using a natural language parser which accepts both dates and relative time descriptions such as: now, tomorrow 5pm, sunday 6 in the morning, afternoon, fourteenth of june 2017 at eleven o’clock in the evening and midnight.

    Note
    The parser is based on Ruby’s chronic date parsing library.
  2. The duration. This is a number followed by the time units, e.g. 1h, 30m or 2d.

  3. The reason, to display in the user interface, for the maintenance window.

clear-maintenance-windows

Removes all scheduled maintenance windows.

complete-maintenance-windows

Marks all currently active maintenance windows as completed. This command is typically used to mark a maintenance window as having completed early. If there are multiple overlapping maintenance windows currently active, this command will mark all of them as completed.

cancel-maintenance-window

Cancels the next maintenance window.

list-maintenance-windows

Lists the maintenance windows. This command takes an optional --output option to specify the format to use when listing the maintenance windows. The supported formats are json (the default) and xml.

Maintenance Window Tutorial
Tip

This tutorial assumes that:

  • You have set the JENKINS_URL environment variable to the URL of the Jenkins instance.

  • You have configured your SSH public key in the Jenkins instance’s user details.

  • You have downloaded the Jenkins CLI jar file into the current working directory.

When the above assumptions are true, then Jenkins CLI commands can be invoked with java -jar jenkins-cli.jar. As an alternative to the above, the Jenkins instance can be specified using the -s option and the login CLI command can be used to authenticate.

First, we check which maintenance windows are already scheduled using the list-maintenance-windows CLI command:

$ java -jar jenkins-cli.jar list-maintenance-windows

In this case there are no scheduled maintenance windows.

We will now schedule a maintenance window for 1 day on Sunday to perform a system upgrade by using the schedule-maintenance-window CLI command:

$ java -jar jenkins-cli.jar schedule-maintenance-window sunday 1d "system upgrade"

The maintenance windows can alternatively be listed in JSON or XML format:

$ java -jar jenkins-cli.jar list-maintenance-windows --output xml
<list>
  <window>
    <start>1468753200000</start>
    <end>1468839600000</end>
    <ownerId>admin</ownerId>
    <reason>system upgrade</reason>
  </window>
</list>
$ java -jar jenkins-cli.jar list-maintenance-windows --output json
[{"start":1468753200000,"end":1468839600000,"reason":"system upgrade","ownerDisplayName":"admin"}]
$
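The JSON form is convenient for scripting. The sketch below parses the sample output shown above and prints each window with readable times. It assumes that the start and end values are milliseconds since the Unix epoch, which the 13-digit values and the one-day difference suggest but which this guide does not state; the helper name to_utc is illustrative.

```python
import json
from datetime import datetime, timezone

# Sample output copied from the JSON listing above.
output = (
    '[{"start":1468753200000,"end":1468839600000,'
    '"reason":"system upgrade","ownerDisplayName":"admin"}]'
)


def to_utc(epoch_millis: int) -> datetime:
    """Interpret a value as milliseconds since the Unix epoch (assumption)."""
    return datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc)


for window in json.loads(output):
    start, end = to_utc(window["start"]), to_utc(window["end"])
    # Print the reason, the start time in ISO 8601 form, and the length.
    print(window["reason"], start.isoformat(), end - start)
```

Under that assumption the sample window lasts exactly one day, which matches the 1d duration passed to schedule-maintenance-window.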

When there is at least one scheduled maintenance window, the Jenkins > Alerts screen will provide details of the next scheduled maintenance window.

Figure 6. A scheduled maintenance window

The schedule-maintenance-window command uses a natural language parser (based on Ruby’s chronic date parser) to parse the start date. This means that if we wanted to schedule, say, a reboot of the build agents at 5pm today, we could use a command like:

$ java -jar jenkins-cli.jar schedule-maintenance-window "today 5pm" 30min "build agent reboot"
Note

The third parameter of that command is the reason. It is free-form text that informs other administrators and users about the purpose of the maintenance window.

Note

The start time had to be quoted as it contained whitespace.
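When the command is assembled by a script rather than typed, the quoting can be left to the tooling. The sketch below builds the argument list for the command above and prints it with shell quoting applied. The function name schedule_command is illustrative, and the jar is assumed to be in the current working directory as in the tutorial.

```python
import shlex
from typing import List


def schedule_command(start: str, duration: str, reason: str) -> List[str]:
    """Build the argument list for schedule-maintenance-window."""
    return [
        "java", "-jar", "jenkins-cli.jar",
        "schedule-maintenance-window", start, duration, reason,
    ]


argv = schedule_command("today 5pm", "30min", "build agent reboot")
# shlex.join quotes only the arguments that contain whitespace.
print(shlex.join(argv))
```

Passing argv as a list to subprocess.run would avoid shell quoting altogether, because each element reaches the CLI as a single argument.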

We can again confirm this maintenance window with the list-maintenance-windows CLI command:

$ java -jar jenkins-cli.jar list-maintenance-windows
[{"start":1468753200000,"end":1468839600000,"reason":"build agent reboot","ownerDisplayName":"admin"},{"start":15653200000,"end":15667200000,"reason":"system upgrade","ownerDisplayName":"admin"}]

If we need to start an unplanned maintenance window, we can just use the start time of now:

$ java -jar jenkins-cli.jar schedule-maintenance-window now 1h "emergency plugin upgrade"
$ java -jar jenkins-cli.jar list-maintenance-windows
[{"start":1468653200000,"end":1468739600000,"reason":"emergency plugin upgrade","ownerDisplayName":"admin"},{"start":1468753200000,"end":1468839600000,"reason":"build agent reboot","ownerDisplayName":"admin"},{"start":15653200000,"end":15667200000,"reason":"system upgrade","ownerDisplayName":"admin"}]

When there is a maintenance window active, the Jenkins > Alerts screen will include a message detailing the currently active maintenance window.

Figure 7. An active maintenance window
Note

When more than one maintenance window is active at the same time, only the first window to expire will be displayed on the Jenkins > Alerts screen.

If the maintenance tasks are finished early, we can tell Jenkins to mark all currently active maintenance windows as complete using the complete-maintenance-windows CLI command:

$ java -jar jenkins-cli.jar complete-maintenance-windows
$ java -jar jenkins-cli.jar list-maintenance-windows
[{"start":1468753200000,"end":1468839600000,"reason":"build agent reboot","ownerDisplayName":"admin"},{"start":15653200000,"end":15667200000,"reason":"system upgrade","ownerDisplayName":"admin"}]

We can cancel the next maintenance window using the cancel-maintenance-window CLI command:

$ java -jar jenkins-cli.jar cancel-maintenance-window
$ java -jar jenkins-cli.jar list-maintenance-windows
[{"start":15653200000,"end":15667200000,"reason":"system upgrade","ownerDisplayName":"admin"}]

Finally, we can remove all scheduled maintenance windows using the clear-maintenance-windows CLI command:

$ java -jar jenkins-cli.jar clear-maintenance-windows
$ java -jar jenkins-cli.jar list-maintenance-windows
[]

Reference

The open source Jenkins Metrics Plugin defines an API for integrating the Dropwizard Metrics API within Jenkins. It also defines a number of standard metrics and provides some basic health checks. This section details the standard metrics and basic health checks available in version 3.0.3 of the Metrics Plugin.

Standard metrics

There are five types of metric defined in the Dropwizard Metrics API:

  • A gauge is an instantaneous measurement of a value

  • A counter is a gauge that tracks the count of something

  • A meter measures the rate of events over time. Meters provide five metrics:

    • the number of observed events

    • the average rate of all observed events

    • the average rate of observed events in the past minute

    • the average rate of observed events in the past five minutes

    • the average rate of observed events in the past fifteen minutes

  • A histogram measures the statistical distribution of values in a stream of data. Histograms provide the following metrics:

    • the number of observed values

    • the average of all observed values

    • the standard deviation of observed values

    • the minimum observed value

    • the maximum observed value

    • the 50th percentile observed value

    • the 75th percentile observed value

    • the 95th percentile observed value

    • the 98th percentile observed value

    • the 99th percentile observed value

    • the 99.9th percentile observed value

      Histograms also maintain a reservoir sample of the stream data. In the Jenkins Metrics Plugin the standard metric histograms use exponentially decaying reservoirs based on a forward-decaying priority reservoir with an exponential weighting towards newer data. Unlike some other exponentially decaying reservoirs this strategy has the advantage of maintaining a statistically representative sampling reservoir.

  • A timer is basically a histogram of the duration of events coupled with a meter of the rate of the event occurrence. Timers provide the following metrics:

    • the number of observed events

    • the average rate of all observed events

    • the average rate of observed events in the past minute

    • the average rate of observed events in the past five minutes

    • the average rate of observed events in the past fifteen minutes

    • the average duration of all observed events

    • the standard deviation of observed event durations

    • the minimum observed event duration

    • the maximum observed event duration

    • the 50th percentile observed event duration

    • the 75th percentile observed event duration

    • the 95th percentile observed event duration

    • the 98th percentile observed event duration

    • the 99th percentile observed event duration

    • the 99.9th percentile observed event duration

      Timers also maintain an exponentially decaying reservoir sample of the event duration data. These exponentially decaying reservoirs use a forward-decaying priority reservoir with an exponential weighting towards newer data. Unlike some other exponentially decaying reservoirs this strategy has the advantage of maintaining a statistically representative sampling reservoir.
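The one, five and fifteen minute rates of a meter are exponentially weighted moving averages in Dropwizard Metrics, updated on a fixed tick. The sketch below illustrates that idea in simplified form; it is not the library's code, and the class name, method names and 5-second tick default are choices made for this example.

```python
import math
from typing import Optional


class Ewma:
    """Exponentially weighted moving average of a rate in events/second."""

    def __init__(self, window_minutes: float, tick_seconds: float = 5.0) -> None:
        self.tick_seconds = tick_seconds
        # Smoothing factor: longer windows react more slowly to new data.
        self.alpha = 1.0 - math.exp(-tick_seconds / 60.0 / window_minutes)
        self.rate: Optional[float] = None   # undefined until the first tick
        self.uncounted = 0                  # events seen since the last tick

    def mark(self, n: int = 1) -> None:
        """Record n events."""
        self.uncounted += n

    def tick(self) -> None:
        """Fold the events seen during the last tick into the average."""
        instant = self.uncounted / self.tick_seconds
        self.uncounted = 0
        if self.rate is None:
            self.rate = instant
        else:
            self.rate += self.alpha * (instant - self.rate)


five_minute = Ewma(window_minutes=5)
for _ in range(120):        # ten minutes of 5-second ticks
    five_minute.mark(1)     # one event per tick is 0.2 events per second
    five_minute.tick()
print(round(five_minute.rate, 4))
```

With a steady input the average settles on the input rate; when events stop, each tick pulls the average towards zero by the factor alpha, which is why the fifteen minute rate lags behind the one minute rate.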

System and Java Virtual Machine metrics
system.cpu.load (gauge)

The system load on the Jenkins master as reported by the JVM’s Operating System JMX bean. The calculation of system load is operating system dependent. Typically this is the sum of the number of processes that are currently running plus the number that are waiting to run. This is typically comparable against the number of CPU cores.

vm.blocked.count (gauge)

The number of threads in the Jenkins master JVM that are currently blocked waiting for a monitor lock.

vm.count (gauge)

The total number of threads in the Jenkins master JVM. This is the sum of: vm.blocked.count, vm.new.count, vm.runnable.count, vm.terminated.count, vm.timed_waiting.count and vm.waiting.count

vm.cpu.load (gauge)

The rate of CPU time usage by the JVM per unit time on the Jenkins master. This is equivalent to the number of CPU cores being used by the Jenkins master JVM.

vm.daemon.count (gauge)

The number of threads in the Jenkins master JVM that are marked as Daemon threads.

vm.deadlocks (gauge)

The number of threads that have a currently detected deadlock with at least one other thread.

vm.file.descriptor.ratio (gauge)

The ratio of used to total file descriptors. (This is a value between 0 and 1 inclusive)

vm.gc.X.count (gauge)

The number of times the garbage collector named X has run. The names are supplied by and dependent on the JVM. There will be one metric for each of the garbage collectors reported by the JVM.

vm.gc.X.time (gauge)

The amount of time spent in the garbage collector named X. The names are supplied by and dependent on the JVM. There will be one metric for each of the garbage collectors reported by the JVM.

vm.memory.heap.committed (gauge)

The amount of memory, in the heap that is used for object allocation, that is guaranteed by the operating system as available for use by the Jenkins master JVM. (Units of measurement: bytes)

vm.memory.heap.init (gauge)

The amount of memory, in the heap that is used for object allocation, that the Jenkins master JVM initially requested from the operating system. (Units of measurement: bytes)

vm.memory.heap.max (gauge)

The maximum amount of memory, in the heap that is used for object allocation, that the Jenkins master JVM is allowed to request from the operating system. This amount of memory is not guaranteed to be available for memory management if it is greater than the amount of committed memory. The JVM may fail to allocate memory even if the amount of used memory does not exceed this maximum size. (Units of measurement: bytes)

vm.memory.heap.usage (gauge)

The ratio of vm.memory.heap.used to vm.memory.heap.max. (This is a value between 0 and 1 inclusive)

vm.memory.heap.used (gauge)

The amount of memory, in the heap that is used for object allocation, that the Jenkins master JVM is currently using. (Units of measurement: bytes)

vm.memory.non-heap.committed (gauge)

The amount of memory, outside the heap that is used for object allocation, that is guaranteed by the operating system as available for use by the Jenkins master JVM. (Units of measurement: bytes)

vm.memory.non-heap.init (gauge)

The amount of memory, outside the heap that is used for object allocation, that the Jenkins master JVM initially requested from the operating system. (Units of measurement: bytes)

vm.memory.non-heap.max (gauge)

The maximum amount of memory, outside the heap that is used for object allocation, that the Jenkins master JVM is allowed to request from the operating system. This amount of memory is not guaranteed to be available for memory management if it is greater than the amount of committed memory. The JVM may fail to allocate memory even if the amount of used memory does not exceed this maximum size. (Units of measurement: bytes)

vm.memory.non-heap.usage (gauge)

The ratio of vm.memory.non-heap.used to vm.memory.non-heap.max. (This is a value between 0 and 1 inclusive)

vm.memory.non-heap.used (gauge)

The amount of memory, outside the heap that is used for object allocation, that the Jenkins master JVM is currently using. (Units of measurement: bytes)

vm.memory.pools.X.usage (gauge)

The usage level of the memory pool named X, where a value of 0 represents an unused pool while a value of 1 represents a pool that is at capacity. The names are supplied by and dependent on the JVM. There will be one metric for each of the memory pools reported by the JVM.

vm.memory.total.committed (gauge)

The total amount of memory that is guaranteed by the operating system as available for use by the Jenkins master JVM. (Units of measurement: bytes)

vm.memory.total.init (gauge)

The total amount of memory that the Jenkins master JVM initially requested from the operating system. (Units of measurement: bytes)

vm.memory.total.max (gauge)

The maximum amount of memory that the Jenkins master JVM is allowed to request from the operating system. This amount of memory is not guaranteed to be available for memory management if it is greater than the amount of committed memory. The JVM may fail to allocate memory even if the amount of used memory does not exceed this maximum size. (Units of measurement: bytes)

vm.memory.total.used (gauge)

The total amount of memory that the Jenkins master JVM is currently using. (Units of measurement: bytes)

vm.new.count (gauge)

The number of threads in the Jenkins master JVM that have not yet started execution.

vm.runnable.count (gauge)

The number of threads in the Jenkins master JVM that are currently executing in the JVM. Some of these threads may be waiting for other resources from the operating system such as the processor.

vm.terminated.count (gauge)

The number of threads in the Jenkins master JVM that have completed execution.

vm.timed_waiting.count (gauge)

The number of threads in the Jenkins master JVM that have suspended execution for a defined period of time.

vm.uptime.milliseconds (gauge)

The number of milliseconds since the Jenkins master JVM started.

vm.waiting.count (gauge)

The number of threads in the Jenkins master JVM that are currently waiting on another thread to perform a particular action.

Web UI metrics
http.activeRequests (counter)

The number of currently active requests against the Jenkins master Web UI.

http.responseCodes.badRequest (meter)

The rate at which the Jenkins master Web UI is responding to requests with an HTTP/400 status code.

http.responseCodes.created (meter)

The rate at which the Jenkins master Web UI is responding to requests with an HTTP/201 status code.

http.responseCodes.forbidden (meter)

The rate at which the Jenkins master Web UI is responding to requests with an HTTP/403 status code.

http.responseCodes.noContent (meter)

The rate at which the Jenkins master Web UI is responding to requests with an HTTP/204 status code.

http.responseCodes.notFound (meter)

The rate at which the Jenkins master Web UI is responding to requests with an HTTP/404 status code.

http.responseCodes.notModified (meter)

The rate at which the Jenkins master Web UI is responding to requests with an HTTP/304 status code.

http.responseCodes.ok (meter)

The rate at which the Jenkins master Web UI is responding to requests with an HTTP/200 status code.

http.responseCodes.other (meter)

The rate at which the Jenkins master Web UI is responding to requests with a non-informational status code that is not in the list: HTTP/200, HTTP/201, HTTP/204, HTTP/304, HTTP/400, HTTP/403, HTTP/404, HTTP/500, or HTTP/503.

http.responseCodes.serverError (meter)

The rate at which the Jenkins master Web UI is responding to requests with an HTTP/500 status code.

http.responseCodes.serviceUnavailable (meter)

The rate at which the Jenkins master Web UI is responding to requests with an HTTP/503 status code.
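The way status codes map onto the response code meters can be summarised in a few lines. This is a sketch of the mapping as described in this section, not the plugin's code. The function name is illustrative, and treating informational (1xx) codes as uncounted is an assumption drawn from the wording of the description of the other meter.

```python
from typing import Optional

# Status codes that have a dedicated meter, as listed in this section.
METER_BY_STATUS = {
    200: "ok",
    201: "created",
    204: "noContent",
    304: "notModified",
    400: "badRequest",
    403: "forbidden",
    404: "notFound",
    500: "serverError",
    503: "serviceUnavailable",
}


def meter_for(status: int) -> Optional[str]:
    """Return the name of the meter that would count this status code."""
    if 100 <= status < 200:
        return None  # assumption: informational responses are not counted
    return "http.responseCodes." + METER_BY_STATUS.get(status, "other")


for code in (200, 302, 404):
    print(code, meter_for(code))
```

A redirect such as HTTP/302 therefore lands in http.responseCodes.other, which is worth remembering when reading that meter on an instance behind a login page.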

http.requests (timer)

The rate at which the Jenkins master Web UI is receiving requests and the time spent generating the corresponding responses.

Jenkins specific metrics
jenkins.executor.count.value (gauge)

The number of executors available to Jenkins. This corresponds to the sum of all the executors of all the on-line nodes.

jenkins.executor.count.history (histogram)

The historical statistics of jenkins.executor.count.value.

jenkins.executor.free.value (gauge)

The number of executors available to Jenkins that are not currently in use.

jenkins.executor.free.history (histogram)

The historical statistics of jenkins.executor.free.value.

jenkins.executor.in-use.value (gauge)

The number of executors available to Jenkins that are currently in use.

jenkins.executor.in-use.history (histogram)

The historical statistics of jenkins.executor.in-use.value.

jenkins.health-check.count (gauge)

The number of health checks associated with the HealthCheckRegistry defined within the Jenkins Metrics Plugin.

jenkins.health-check.duration (timer)

The rate at which the health checks are being run and the duration of each health check run.

The Jenkins Metrics Plugin, by default, will run the health checks once per minute. The frequency can be controlled by the jenkins.metrics.api.Metrics.HEALTH_CHECK_INTERVAL_MINS system property. In addition, the Metrics Plugin’s Operational Servlet can be used to request the health checks be run on demand.

jenkins.health-check.inverse-score (gauge)

The ratio of health checks reporting failure to the total number of health checks. Larger values indicate decreasing health as measured by the health checks. (This is a value between 0 and 1 inclusive)

jenkins.health-check.score (gauge)

The ratio of health checks reporting success to the total number of health checks. Larger values indicate increasing health as measured by the health checks. (This is a value between 0 and 1 inclusive)
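The two ratios are complementary, as a short calculation shows. The sketch below uses the names of the standard health checks from this guide with made-up results; the function name and the decision to reject an empty set of checks are assumptions for the example.

```python
from typing import Dict, Tuple


def health_scores(results: Dict[str, bool]) -> Tuple[float, float]:
    """Return (score, inverse_score) for a set of pass/fail results."""
    total = len(results)
    if total == 0:
        # Assumption: the ratios are undefined without any health checks.
        raise ValueError("at least one health check is required")
    passing = sum(1 for ok in results.values() if ok)
    failing = total - passing
    return passing / total, failing / total


# Hypothetical results: one of the four standard checks is failing.
results = {
    "disk-space": True,
    "plugins": True,
    "temporary-space": False,
    "thread-deadlock": True,
}
score, inverse = health_scores(results)
print(score, inverse)
```

Because the two values always add up to 1, an alert on either gauge expresses the same condition; which one to use depends on whether the alert is phrased as "above" or "below" a limit.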

jenkins.job.blocked.duration (timer)

The rate at which jobs in the build queue enter the blocked state and the amount of time they spend in that state.

jenkins.job.buildable.duration (timer)

The rate at which jobs in the build queue enter the buildable state and the amount of time they spend in that state.

jenkins.job.building.duration (timer)

The rate at which jobs are built and the time they spend building.

jenkins.job.queuing.duration (timer)

The rate at which jobs are queued and the total time they spend in the build queue.

jenkins.job.total.duration (timer)

The rate at which jobs are queued and the total time they spend from entering the build queue to completing building.

jenkins.job.waiting.duration (timer)

The rate at which jobs enter the quiet period and the total amount of time that jobs spend in their quiet period.

Jenkins allows configuring a quiet period for most job types. While in the quiet period multiple identical requests for building the job will be coalesced. Traditionally this was used with source control systems that do not provide an atomic commit facility - such as CVS - in order to ensure that all the files in a large commit were picked up as a single build.

With more modern source control systems the quiet period can still be useful, for example to ensure that push notifications of the same commit via redundant parallel notification paths are coalesced.

jenkins.job.count.value (gauge)

The number of jobs in Jenkins.

jenkins.job.count.history (histogram)

The historical statistics of jenkins.job.count.value.

jenkins.job.scheduled (meter)

The rate at which jobs are scheduled. If a job is already in the queue and an identical request for scheduling the job is received then Jenkins will coalesce the two requests.

This metric gives a reasonably pure measure of the load requirements of the Jenkins master as it is unaffected by the number of executors available to the system.

Multiplying this metric by jenkins.job.building.duration gives an approximate measure of the number of executors required in order to ensure that every build request results in a build.

A more accurate measure can be obtained from a job-by-job summation of the scheduling rate for that job and the average build duration of that job.

The most accurate measure would require maintaining separate sums partitioned by the labels that each job can run against in order to determine the number of each type of executor required.

Such calculations assume that: every build node is equivalent and/or the build times are comparable across all build nodes; and build times are unaffected by other jobs running in parallel on other executors on the same node.

However, in most cases even the basic result from multiplying jenkins.job.scheduled by jenkins.job.building.duration is reasonable. Where that result is larger than jenkins.executor.count.value by more than 10-15%, the Jenkins build queue is typically observed to grow rapidly until most jobs have at least one build request sitting in the build queue. Where it is less than jenkins.executor.count.value by at least 20-25%, the build queue tends to remain small, except where a large number of build jobs are competing for a small number of executors on nodes with specific labels.
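The basic estimate is a single multiplication, followed by a comparison with the number of executors. The sketch below assumes that the scheduling rate is in jobs per second and that the mean build duration is in seconds. The function names and the exact cut-offs (10% above and 20% below, taken from the lower ends of the ranges quoted above) are choices made for this example.

```python
def required_executors(schedule_rate_per_second: float,
                       mean_build_seconds: float) -> float:
    """Average number of builds in progress: arrival rate times duration."""
    return schedule_rate_per_second * mean_build_seconds


def queue_outlook(required: float, available: int) -> str:
    """Compare the estimate with the executor count (illustrative cut-offs)."""
    ratio = required / available
    if ratio > 1.10:
        return "likely to grow"
    if ratio < 0.80:
        return "likely to stay small"
    return "borderline"


# Hypothetical instance: 3 jobs scheduled per minute, 4 minute average build.
needed = required_executors(3 / 60, 240)
print(round(needed, 2), queue_outlook(needed, available=10))
```

In this hypothetical case about 12 executors are needed against 10 available, so the queue would be expected to grow. The per-label refinement described above would repeat the same calculation for each label.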

jenkins.node.count.value (gauge)

The number of build nodes available to Jenkins, both on-line and off-line.

jenkins.node.count.history (histogram)

The historical statistics of jenkins.node.count.value.

jenkins.node.XXX.builds (timer)

The rate of builds starting on the XXX node and the amount of time they spend building.

There will be one metric for each XXX named node. The metric is lazily created after the JVM starts up when the first build starts on that node.

jenkins.node.offline.value (gauge)

The number of build nodes available to Jenkins but currently off-line.

jenkins.node.offline.history (histogram)

The historical statistics of jenkins.node.offline.value.

jenkins.node.online.value (gauge)

The number of build nodes available to Jenkins and currently on-line.

jenkins.node.online.history (histogram)

The historical statistics of jenkins.node.online.value.

jenkins.plugins.active (gauge)

The number of plugins in the Jenkins instance that started successfully.

jenkins.plugins.failed (gauge)

The number of plugins in the Jenkins instance that failed to start. A value other than 0 is typically indicative of a potential issue within the Jenkins installation that will either be solved by explicitly disabling the plugin(s) or by resolving the plugin dependency issues.

jenkins.plugins.inactive (gauge)

The number of plugins in the Jenkins instance that are not currently enabled.

jenkins.plugins.withUpdate (gauge)

The number of plugins in the Jenkins instance that have a newer version reported as available in the current Jenkins update center metadata held by Jenkins. This value is not indicative of an issue with Jenkins, but high values can be used as a trigger to review the plugins with updates, with a view to seeing whether those updates potentially contain fixes for issues that could be affecting your Jenkins instance.

jenkins.queue.blocked.value (gauge)

The number of jobs that are in the Jenkins build queue and currently in the blocked state.

jenkins.queue.blocked.history (histogram)

The historical statistics of jenkins.queue.blocked.value.

jenkins.queue.buildable.value (gauge)

The number of jobs that are in the Jenkins build queue and currently in the buildable state.

jenkins.queue.buildable.history (histogram)

The historical statistics of jenkins.queue.buildable.value.

jenkins.queue.pending.value (gauge)

The number of jobs that are in the Jenkins build queue and currently in the pending state.

jenkins.queue.pending.history (histogram)

The historical statistics of jenkins.queue.pending.value.

jenkins.queue.size.value (gauge)

The number of jobs that are in the Jenkins build queue.

jenkins.queue.size.history (histogram)

The historical statistics of jenkins.queue.size.value.

jenkins.queue.stuck.value (gauge)

The number of jobs that are in the Jenkins build queue and currently considered stuck.

jenkins.queue.stuck.history (histogram)

The historical statistics of jenkins.queue.stuck.value.

Standard health checks

The Dropwizard Metrics API includes a contract for health checks. Health checks return a simple PASS/FAIL status and can include an optional message.

disk-space

Returns FAIL if any of the Jenkins disk space monitors are reporting the disk space as less than the configured threshold. The message will reference the first node which fails this check. There may be other nodes that fail the check, but this health check is designed to fail fast.
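The fail-fast behaviour can be pictured as a loop that stops at the first node below the threshold. This is a sketch of that pattern only, not the plugin's code; the Result type, the function name, the message text and the node names are hypothetical.

```python
from typing import Dict, NamedTuple, Optional


class Result(NamedTuple):
    healthy: bool                   # True for PASS, False for FAIL
    message: Optional[str] = None   # optional detail, as in the contract


def disk_space_check(free_bytes_by_node: Dict[str, int],
                     threshold_bytes: int) -> Result:
    """Fail on the first node found below the threshold, then stop."""
    for node, free in free_bytes_by_node.items():
        if free < threshold_bytes:
            return Result(False, f"{node} is below the disk space threshold")
    return Result(True)


GIB = 1024 ** 3
# Hypothetical agents: two are low on space, only the first is reported.
nodes = {"agent-1": 5 * GIB, "agent-2": GIB // 2, "agent-3": GIB // 10}
print(disk_space_check(nodes, threshold_bytes=GIB))
```

Because only the first failing node is named, fixing that node and rerunning the check may reveal the next one.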

plugins

Returns FAIL if any of the Jenkins plugins failed to start. A failure is typically indicative of a potential issue within the Jenkins installation that will either be solved by explicitly disabling the failing plugin(s) or by resolving the corresponding plugin dependency issues.

temporary-space

Returns FAIL if any of the Jenkins temporary space monitors are reporting the temporary space as less than the configured threshold. The message will reference the first node which fails this check. There may be other nodes that fail the check, but this health check is designed to fail fast.

thread-deadlock

Returns FAIL if there are any deadlocked threads in the Jenkins master JVM.