If IdleCount is set to a value greater than 0, idle VMs are created in the background. The runner acquires an existing idle VM before asking for a new job.
If the job is assigned to the runner, that job is sent to the previously acquired VM.
If the job is not assigned to the runner, the lock on the idle VM is released and the VM is returned to the pool.
Limit the number of VMs created by the Docker Machine executor
One runner process can create four different runner workers using different execution environments.
The concurrent value is set to 100, so this one runner executes a maximum of 100 concurrent GitLab CI/CD jobs.
Only the second runner worker is configured to use the Docker Machine executor and can therefore automatically create VMs.
The limit setting of 30 means that the second runner worker can execute a maximum of 30 CI/CD jobs on autoscaled VMs at any point in time.
While concurrent defines the global concurrency limit across multiple [[runners]] workers, limit defines the maximum concurrency for a single [[runners]] worker.
In this example, the runner process handles:
Across all [[runners]] workers, up to 100 concurrent jobs.
For the first worker, no more than 40 jobs, which are executed with the shell executor.
For the second worker, no more than 30 jobs, which are executed with the docker+machine executor. Additionally, GitLab Runner maintains VMs based on the autoscaling configuration in [runners.machine], but no more than 30 VMs in all states (idle, in-use, in-creation, in-removal).
For the third worker, no more than 10 jobs, executed with the ssh executor.
For the fourth worker, no more than 20 jobs, executed with the virtualbox executor.
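The limits listed above could come from a config.toml along the following lines. This is a minimal sketch: the executor and limit values are taken from the description, while the worker names are illustrative assumptions and required keys such as url and token are left out.

```toml
# Global cap: at most 100 jobs across all [[runners]] workers.
concurrent = 100

[[runners]]
  name = "first"                # hypothetical name
  executor = "shell"
  limit = 40                    # at most 40 jobs on this worker

[[runners]]
  name = "second"               # hypothetical name
  executor = "docker+machine"
  limit = 30                    # at most 30 jobs, and at most 30 VMs in all states
  [runners.machine]
    # Autoscaling settings (IdleCount, IdleTime, ...) would go here.

[[runners]]
  name = "third"                # hypothetical name
  executor = "ssh"
  limit = 10

[[runners]]
  name = "fourth"               # hypothetical name
  executor = "virtualbox"
  limit = 20
```

The per-worker limits add up to exactly 100 here, so the global concurrent value and the sum of the worker limits coincide.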
In this second example, there are two [[runners]] workers configured to use the docker+machine executor. With this configuration, each runner worker manages a separate pool of VMs that is constrained by the value of the limit parameter.
The runner processes no more than 100 jobs (the value of concurrent).
The runner process executes jobs in two [[runners]] workers, each of which uses the docker+machine executor.
The first runner worker can create a maximum of 80 VMs. Therefore, this worker can execute a maximum of 80 jobs at any point in time.
The second runner worker can create a maximum of 50 VMs. Therefore, this worker can execute a maximum of 50 jobs at any point in time.
Even though the sum of the limit values is 130 (80 + 50 = 130), the concurrent value of 100 at the global level means that this runner process can execute a maximum of 100 jobs concurrently.
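A configuration matching this second example might look like the sketch below. The limit and concurrent values come from the text; the worker names are assumptions, and url, token, and the [runners.machine] details are omitted.

```toml
# Global cap wins over the sum of the per-worker limits (80 + 50 = 130).
concurrent = 100

[[runners]]
  name = "first"                # hypothetical name
  executor = "docker+machine"
  limit = 80                    # separate pool of at most 80 VMs

[[runners]]
  name = "second"               # hypothetical name
  executor = "docker+machine"
  limit = 50                    # separate pool of at most 50 VMs
```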
If there is an insufficient number of Idle machines, GitLab Runner starts provisioning new machines, subject to the MaxGrowthRate limit. Requests for machines above the MaxGrowthRate value are put on hold until the number of machines being created falls below MaxGrowthRate.
At the same time, GitLab Runner checks the duration of the Idle state of each machine. If the time exceeds the IdleTime value, the machine is automatically removed.
Example:
Suppose that we have configured GitLab Runner with the following autoscale parameters:
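A minimal sketch of such parameters, consistent with the values used in this example (IdleCount = 2, IdleTime = 1800). MaxGrowthRate = 1 is an assumption inferred from the statement that machines are provisioned sequentially, and limit = 10 is an illustrative value that is not stated in the text.

```toml
[[runners]]
  executor = "docker+machine"
  limit = 10                    # assumption: upper bound on created machines
  [runners.machine]
    MaxGrowthRate = 1           # assumption: provision one machine at a time
    IdleCount = 2               # keep two machines waiting in Idle state
    IdleTime = 1800             # remove a machine after 30 minutes in Idle state
```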
At the beginning, when no jobs are queued, GitLab Runner starts two machines (IdleCount = 2) and sets them in Idle state. Notice that we have also set IdleTime to 30 minutes (IdleTime = 1800).
Now, let’s assume that 5 jobs are queued in GitLab CI. The first 2 jobs are sent to the two Idle machines. GitLab Runner now notices that the number of Idle machines is less than IdleCount (0 < 2), so it starts new machines. These machines are provisioned sequentially, to prevent exceeding the MaxGrowthRate.
The remaining 3 jobs are assigned to the first machine that is ready. As an
optimization, this can be a machine that was busy, but has now completed its job,
or it can be a newly provisioned machine. For the sake of this example, let us
assume that provisioning is fast, and the provisioning of new machines completed
before any of the earlier jobs completed.
We now have 1 Idle machine, so GitLab Runner starts one more machine to satisfy IdleCount. Because there are no new jobs in the queue, those two machines stay in Idle state and GitLab Runner is satisfied.
This is what happened:
We had 2 machines waiting in Idle state for new jobs. After the 5 jobs were queued, new machines were created, so in total we had 7 machines. Five of them were running jobs, and 2 were in Idle state, waiting for the next jobs.
The algorithm still works the same way; GitLab Runner creates a new Idle machine for each machine used for the job execution until IdleCount is satisfied. Those machines are created up to the number defined by the limit parameter. If GitLab Runner notices that the total number of created machines has reached limit, it stops autoscaling, and new jobs must wait in the job queue until machines start returning to Idle state.
In the above example we always have two idle machines. IdleTime applies only when we are over IdleCount. Then we try to reduce the number of machines to IdleCount.
Scaling down:
After the job is finished, the machine is set to Idle state and waits for the next jobs to be executed. Let’s suppose that we have no new jobs in the queue. After the time designated by IdleTime passes, the Idle machines are removed. In our example, after 30 minutes, all machines are removed (each machine 30 minutes after its last job execution ended) and GitLab Runner starts to keep an IdleCount of Idle machines running, just like at the beginning of the example.
So, to sum up:
We start GitLab Runner.
GitLab Runner creates 2 idle machines.
GitLab Runner picks one job.
GitLab Runner creates one more machine to fulfill the strong requirement of always having two idle machines.
The job finishes, and we have 3 idle machines.
When one of the three idle machines exceeds IdleTime, measured from the last time it picked up a job, it is removed.
GitLab Runner always has at least 2 idle machines waiting to pick up jobs quickly.
Below you can see a comparison chart of jobs statuses and machines statuses
in time:
How concurrent, limit and IdleCount generate the upper limit of running machines
The IdleCount parameter defines a static number of Idle machines that the runner should sustain. The value you assign depends on your use case.
You can start by assigning a reasonably small number of machines in the Idle state, and have it automatically adjust to a bigger number, depending on the current usage. To do that, use the experimental IdleScaleFactor setting.
IdleScaleFactor is internally a float64 value and requires the float format to be used, for example 0.0, 1.0, or 1.5. If an integer format is used (for example, IdleScaleFactor = 1), the Runner process fails with the error:
FATAL: Service run failed error=toml: cannot load TOML value of type int64 into a Go float
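The difference between the accepted and rejected formats can be sketched as follows. The IdleCount and IdleCountMin values are illustrative assumptions; only the notation of IdleScaleFactor matters here.

```toml
[[runners]]
  executor = "docker+machine"
  [runners.machine]
    IdleCount = 10              # illustrative value
    IdleCountMin = 1            # illustrative value
    IdleScaleFactor = 1.0       # float notation: accepted
    # IdleScaleFactor = 1       # integer notation: triggers the FATAL error quoted above
```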
When you use this setting, GitLab Runner tries to sustain a defined number of machines in the Idle state. However, this number is no longer static. Instead of using IdleCount, GitLab Runner checks how many machines are currently in use and defines the desired Idle capacity as a factor of that number.
If there are no machines currently in use, IdleScaleFactor evaluates to no Idle machines to maintain. Because of how the autoscaling algorithm works, if IdleCount is greater than 0 (and only then is IdleScaleFactor applicable), Runner does not ask for jobs if there are no Idle machines that can handle them. Without new jobs, the number of used machines would not rise, so IdleScaleFactor would constantly evaluate to 0. This would block the Runner in an unusable state.
Therefore, a second setting was introduced: IdleCountMin. It defines the minimum number of Idle machines that must be sustained no matter what IdleScaleFactor evaluates to. The setting can’t be set to less than 1 if IdleScaleFactor is used. If it is, Runner automatically sets it to 1.
You can also use IdleCountMin to define the minimum number of Idle machines that should always be available. This allows new jobs entering the queue to start quickly. As with IdleCount, the value you assign depends on your use case.
For example, assume IdleCount = 100, IdleCountMin = 10, and IdleScaleFactor = 1.1. When Runner approaches the decision point, it checks how many machines are currently in use. Let’s say we currently have 5 Idle machines and 10 machines in use. Multiplying the number of machines in use by the IdleScaleFactor (10 * 1.1 = 11), Runner decides that it should have 11 Idle machines. So 6 more are created.
If you have 90 Idle machines and 100 machines in use, based on the IdleScaleFactor, GitLab Runner sees that it should have 100 * 1.1 = 110 Idle machines. So it again starts creating new ones. However, when it reaches 100 Idle machines, it recognizes that this is the upper limit defined by IdleCount, and no more Idle machines are created.
If the number of machines in use goes down from 100 to 20, the desired number of Idle machines is 20 * 1.1 = 22, and GitLab Runner starts slowly terminating the machines. As described above, GitLab Runner removes only the machines that weren’t used for the IdleTime. Therefore, the removal of surplus Idle VMs is not done too aggressively.
If the number of machines in use goes down to 0, the desired number of Idle machines is 0 * 1.1 = 0. This, however, is less than the defined IdleCountMin setting, so Runner slowly removes the Idle VMs until 10 remain. After that point, scaling down stops and Runner keeps 10 machines in Idle state.
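Expressed as a config.toml fragment, the values used in this walkthrough could look like the sketch below. IdleCount, IdleCountMin, and IdleScaleFactor follow from the numbers in the text; concurrent and limit are assumptions, chosen so that 100 machines in use plus 100 Idle machines fit under the cap.

```toml
concurrent = 200                # assumption: room for 100 in-use + 100 Idle machines

[[runners]]
  executor = "docker+machine"
  limit = 200                   # assumption: same reasoning as for concurrent
  [runners.machine]
    IdleCount = 100             # upper limit for Idle machines
    IdleCountMin = 10           # lower limit for Idle machines
    IdleScaleFactor = 1.1       # desired Idle machines = 1.1 * machines in use

# Desired Idle capacity at the decision points described above:
#    10 in use ->  10 * 1.1 =  11
#   100 in use -> 100 * 1.1 = 110, capped at IdleCount = 100
#    20 in use ->  20 * 1.1 =  22
#     0 in use ->   0 * 1.1 =   0, raised to IdleCountMin = 10
```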
Autoscaling can be configured to have different values depending on the time period. Organizations might have regular times when spikes of jobs are being executed, and other times with few to no jobs. For example, most commercial companies work from Monday to Friday in fixed hours, like 10am to 6pm. On nights during the week, and on weekends, no pipelines are started.
These periods can be configured with the help of [[runners.machine.autoscaling]] sections. Each of them supports setting IdleCount and IdleTime based on a set of Periods.
How autoscaling periods work
In the [runners.machine] settings, you can add multiple [[runners.machine.autoscaling]] sections, each one with its own IdleCount, IdleTime, Periods and Timezone properties. A section should be defined for each configuration, proceeding in order from the most general scenario to the most specific scenario.
All sections are parsed. The last one to match the current time is active. If none match, the values from the root of [runners.machine] are used.
In this configuration, every weekday between 9 and 16:59 UTC, machines are overprovisioned to handle the large traffic during operating hours. On the weekend, IdleCount drops to 5 to account for the drop in traffic. The rest of the time, the values are taken from the defaults in the root: IdleCount = 10 and IdleTime = 1800.
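A configuration that behaves as described might look like the following sketch. The root values and the weekend IdleCount come from the text. The weekday IdleCount and both period IdleTime values are assumptions borrowed from the complete example later on this page, and the 9-16 hour range is chosen to match "between 9 and 16:59".

```toml
[[runners]]
  executor = "docker+machine"
  [runners.machine]
    IdleCount = 10              # default outside of any matching period
    IdleTime = 1800
    [[runners.machine.autoscaling]]
      Periods = ["* * 9-16 * * mon-fri *"]   # weekdays, 9:00 to 16:59
      IdleCount = 50            # assumption: overprovisioned for working hours
      IdleTime = 3600           # assumption
      Timezone = "UTC"
    [[runners.machine.autoscaling]]
      Periods = ["* * * * * sat,sun *"]      # weekends
      IdleCount = 5
      IdleTime = 60             # assumption
      Timezone = "UTC"
```

Because the last matching section wins, a more specific period should be placed after a more general one.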
The 59th second of the last minute in any period that you specify is not considered part of the period. For more information, see issue #2170.
You can specify the Timezone of a period, for example "Australia/Sydney". If you don’t, the system setting of the host machine of every runner is used. This default can be stated explicitly as Timezone = "Local".
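As a small sketch, a period evaluated in a specific time zone could be written as follows. The Periods, IdleCount, and IdleTime values are illustrative assumptions; only the Timezone value is taken from the text.

```toml
[[runners]]
  executor = "docker+machine"
  [runners.machine]
    [[runners.machine.autoscaling]]
      Periods = ["* * 9-16 * * mon-fri *"]   # illustrative period
      IdleCount = 20                         # illustrative value
      IdleTime = 1800                        # illustrative value
      Timezone = "Australia/Sydney"          # omit, or use "Local", for the host setting
```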
To speed up your jobs, GitLab Runner provides a cache mechanism where selected directories and/or files are saved and shared between subsequent jobs.
This works fine when jobs run on the same host, but when you start using the GitLab Runner autoscale feature, most of your jobs run on a new (or almost new) host, which executes each job in a new Docker container. In that case, you can’t take advantage of the cache feature.
To overcome this issue, together with the autoscale feature, the distributed runners cache feature was introduced.
This feature uses a configured object storage server to share the cache between the Docker hosts in use. GitLab Runner queries the server and downloads the archive to restore the cache, or uploads it to archive the cache.
For an S3 cache, the URLs follow the structure http(s)://<ServerAddress>/<BucketName>/<Path>/runner/<runner-id>/project/<id>/<cache-key>.
To share the cache between two or more runners, set the Shared flag to true. This flag removes the runner token from the URL (runner/<runner-id>) and all configured runners share the same cache. You can also set Path to separate caches between runners when cache sharing is enabled.
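A distributed cache section that produces URLs of this shape could look like the sketch below. The key names match those used in the complete example on this page; the server address, credentials, and Path value are placeholders, not real values.

```toml
[[runners]]
  executor = "docker+machine"
  [runners.cache]
    Type = "s3"
    Path = "path/to/prefix"     # placeholder: becomes <Path> in the URL
    Shared = false              # set to true to drop runner/<runner-id> from the URL
    [runners.cache.s3]
      ServerAddress = "s3.example.com"   # placeholder: becomes <ServerAddress>
      AccessKey = "access-key"           # placeholder
      SecretKey = "secret-key"           # placeholder
      BucketName = "runner"              # becomes <BucketName>
      Insecure = false                   # false: use https
```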
Here, 10.11.12.13:12345 is the IP address and port where your registry mirror listens for connections from the Docker service. It must be accessible from each host created by Docker Machine.
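The mirror address is passed to Docker Machine through the engine-registry-mirror option in MachineOptions. A minimal sketch, assuming the mirror is served over plain http at the address above and omitting all driver-specific options:

```toml
[[runners]]
  executor = "docker+machine"
  [runners.machine]
    MachineOptions = [
      # Driver-specific options would go here.
      "engine-registry-mirror=http://10.11.12.13:12345"   # assumption: http scheme
    ]
```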
concurrent = 50   # All registered runners can run up to 50 concurrent jobs

[[runners]]
  url = "https://gitlab.com"
  token = "RUNNER_TOKEN"             # Note this is different from the registration token used by `gitlab-runner register`
  name = "autoscale-runner"
  executor = "docker+machine"        # This runner is using the 'docker+machine' executor
  limit = 10                         # This runner can execute up to 10 jobs (created machines)
  [runners.docker]
    image = "ruby:2.7"               # The default image used for jobs is 'ruby:2.7'
  [runners.machine]
    IdleCount = 5                    # There must be 5 machines in Idle state - when Off Peak time mode is off
    IdleTime = 600                   # Each machine can be in Idle state up to 600 seconds (after this it will be removed) - when Off Peak time mode is off
    MaxBuilds = 100                  # Each machine can handle up to 100 jobs in a row (after this it will be removed)
    MachineName = "auto-scale-%s"    # Each machine will have a unique name ('%s' is required)
    MachineDriver = "google"         # Refer to Docker Machine docs on how to authenticate: https://docs.docker.com/machine/drivers/gce/#credentials
    MachineOptions = [
      "google-project=GOOGLE-PROJECT-ID",
      "google-zone=GOOGLE-ZONE",                   # e.g. 'us-central1-a'
      "google-machine-type=GOOGLE-MACHINE-TYPE",   # e.g. 'n1-standard-8'
      "google-machine-image=ubuntu-os-cloud/global/images/family/ubuntu-1804-lts",
      "google-username=root",
      "google-use-internal-ip",
      "engine-registry-mirror=https://mirror.gcr.io"
    ]
    [[runners.machine.autoscaling]]  # Define periods with different settings
      Periods = ["* * 9-17 * * mon-fri *"]  # Every workday between 9 and 17 UTC
      IdleCount = 50
      IdleCountMin = 5
      IdleScaleFactor = 1.5          # Means that current number of Idle machines will be 1.5*in-use machines,
                                     # no more than 50 (the value of IdleCount) and no less than 5 (the value of IdleCountMin)
      IdleTime = 3600
      Timezone = "UTC"
    [[runners.machine.autoscaling]]
      Periods = ["* * * * * sat,sun *"]     # During the weekends
      IdleCount = 5
      IdleTime = 60
      Timezone = "UTC"
  [runners.cache]
    Type = "s3"
    [runners.cache.s3]
      ServerAddress = "s3.eu-west-1.amazonaws.com"
      AccessKey = "AMAZON_S3_ACCESS_KEY"
      SecretKey = "AMAZON_S3_SECRET_KEY"
      BucketName = "runner"
      Insecure = false
Note that the MachineOptions parameter contains options for the google driver, which is used by Docker Machine to spawn machines hosted on Google Compute Engine, and one option for Docker Machine itself (engine-registry-mirror).