sync code with last improvements from OpenBSD

purplerain 2023-08-28 05:57:34 +00:00
commit 88965415ff
Signed by: purplerain
GPG key ID: F42C07F07E2E35B7
26235 changed files with 29195616 additions and 0 deletions

lib/mesa/docs/ci/LAVA.rst
@@ -0,0 +1,82 @@
LAVA CI
=======
`LAVA <https://lavasoftware.org/>`_ is a system for functional testing
of boards including deploying custom bootloaders and kernels. This is
particularly relevant to testing Mesa because we often need to change
kernels for UAPI changes (and this lets us do full testing of a new
kernel during development), and our workloads can easily take down
boards when mistakes are made (kernel oopses, OOMs that take out
critical system services).
Mesa-LAVA software architecture
-------------------------------
The gitlab-runner will run on some host that has access to the LAVA
lab, with tags like "mesa-ci-x86-64-lava-$DEVICE_TYPE" so that it only
takes jobs for the hardware that the LAVA lab contains. The
gitlab-runner spawns a Docker container with lavacli in it, and
connects to the LAVA lab using a predefined token to submit jobs under
a specific device type.
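For reference, the lavacli side of that interaction boils down to something
like the following sketch (the job definition filename is illustrative, and
the identity comes from the ``lavacli.yaml`` described below):

.. code-block:: console

   # submit a generated job definition and follow its output
   lavacli jobs submit lava-job-definition.yaml
   lavacli jobs logs <job id>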
The LAVA instance manages scheduling those jobs to the boards present.
For a job, it will deploy the kernel, device tree, and the ramdisk
containing the CTS.
Deploying a new Mesa-LAVA lab
-----------------------------
You'll want to start with setting up your LAVA instance and getting
some boards booting using test jobs. Start with the stock QEMU
examples to make sure your instance works at all. Then, you'll need
to define your actual boards.
The device type in lava-gitlab-ci.yml is the device type you create in
your LAVA instance, which doesn't have to match the board's name in
``/etc/lava-dispatcher/device-types``. You create your boards under
that device type and the Mesa jobs will be scheduled to any of them.
Instantiate your boards by creating them in the UI or at the command
line attached to that device type, then populate their dictionary
(using an "extends" line probably referencing the board's template in
``/etc/lava-dispatcher/device-types``). Now, go find a relevant
health check job for your board as a test job definition, or cobble
something together from a board that boots using the same boot_method
and some public images, and figure out how to get your boards booting.
Once you can boot your board using a custom job definition, it's time
to connect Mesa CI to it. Install gitlab-runner and register as a
shared runner (you'll need a GitLab admin for help with this). The
runner *must* have a tag (like "mesa-ci-x86-64-lava-rk3399-gru-kevin")
to restrict the jobs it takes, or it will grab random jobs from projects
all across ``gitlab.freedesktop.org``, and your runner isn't ready for
that.
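As a sketch, the registration might look like this (the flags mirror the
fastboot example in the bare-metal documentation; the token comes from a
GitLab admin):

.. code-block:: console

   sudo gitlab-runner register \
        --url https://gitlab.freedesktop.org \
        --registration-token <token from the admin> \
        --name mesa-ci-lava-rk3399-gru-kevin \
        --tag-list mesa-ci-x86-64-lava-rk3399-gru-kevin \
        --executor docker \
        --docker-image "alpine:latest" \
        --non-interactive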
The Docker image will need access to the LAVA instance. If it's on a
public network it should be fine. If you're running the LAVA instance
on localhost, you'll need to set ``network_mode="host"`` in
``/etc/gitlab-runner/config.toml`` so it can access localhost. Create a
gitlab-runner user in your LAVA instance, log in under that user on
the web interface, and create an API token. Copy that into a
``lavacli.yaml``:
.. code-block:: yaml

   default:
      token: <token contents>
      uri: <URL to the instance>
      username: gitlab-runner
Add a volume mount of that ``lavacli.yaml`` to
``/etc/gitlab-runner/config.toml`` so that the Docker container can
access it. You probably have a ``volumes = ["/cache"]`` already, so now it would be::
volumes = ["/home/anholt/lava-config/lavacli.yaml:/root/.config/lavacli.yaml", "/cache"]
Note that this token is visible to anybody that can submit MRs to
Mesa! It is not an actual secret. We could just bake it into the
GitLab CI YAML, but this way the current method of connecting to the
LAVA instance is separated from the Mesa branches (particularly
relevant as we have many stable branches all using CI).
Now it's time to define your test jobs in the driver-specific
gitlab-ci.yml file, using the device-specific tags.

lib/mesa/docs/ci/bare-metal.rst
@@ -0,0 +1,231 @@
Bare-metal CI
=============
The bare-metal scripts run on a system with gitlab-runner and Docker,
connected to potentially multiple bare-metal boards that run tests of
Mesa. Currently "fastboot", "ChromeOS Servo", and POE-powered devices are
supported.
In comparison with LAVA, this doesn't involve maintaining a separate
web service with its own job scheduler and replicating jobs between the
two. It also places more of the board support in Git, instead of
web service configuration. On the other hand, the serial interactions
and bootloader support are more primitive.
Requirements (fastboot)
-----------------------
This testing requires power control of the DUTs by the gitlab-runner
machine, since this is what we use to reset the system and get back to
a pristine state at the start of testing.
We require access to the console output from the gitlab-runner system,
since that is how we get the final results back from the tests. You
should probably have the console on a serial connection, so that you
can see bootloader progress.
The boards need to be able to have a kernel/initramfs supplied by the
gitlab-runner system, since Mesa often needs to update the kernel either for new
DRM functionality, or to fix kernel bugs.
The boards must have networking, so that we can extract the dEQP XML results to
artifacts on GitLab, and so that we can download traces (too large for an
initramfs) for trace replay testing. Given that we need networking already, and
our dEQP/Piglit/etc. payload is large, we use NFS from the x86 runner system
rather than initramfs.
See `src/freedreno/ci/gitlab-ci.yml` for an example of fastboot on DB410c and
DB820c (freedreno-a306 and freedreno-a530).
Requirements (Servo)
--------------------
For Servo-connected boards, we can use the EC connection for power
control to reboot the board. However, loading a kernel is not as easy
as fastboot, so we assume your bootloader can do TFTP, and that your
gitlab-runner mounts the runner's tftp directory specific to the board
at /tftp in the container.
Since we're going the TFTP route, we also use NFS root. This avoids
packing the rootfs and sending it to the board as a ramdisk, which
means we can support larger rootfses (for Piglit testing), at the cost
of needing more storage on the runner.
Telling the board where its TFTP and NFS should come from is
done using dnsmasq on the runner host. For example, here is a snippet from
the dnsmasq.conf.d in the Google farm, for the gitlab-runner host we
call "servo"::
dhcp-host=1c:69:7a:0d:a3:d3,10.42.0.10,set:servo
# Fixed dhcp addresses for my sanity, and setting a tag for
# specializing other DHCP options
dhcp-host=a0:ce:c8:c8:d9:5d,10.42.0.11,set:cheza1
dhcp-host=a0:ce:c8:c8:d8:81,10.42.0.12,set:cheza2
# Specify the next server, watch out for the double ',,'. The
# filename didn't seem to get picked up by the bootloader, so we use
# tftp-unique-root and mount directories like
# /srv/tftp/10.42.0.11/jwerner/cheza as /tftp in the job containers.
tftp-unique-root
dhcp-boot=tag:cheza1,cheza1/vmlinuz,,10.42.0.10
dhcp-boot=tag:cheza2,cheza2/vmlinuz,,10.42.0.10
dhcp-option=tag:cheza1,option:root-path,/srv/nfs/cheza1
dhcp-option=tag:cheza2,option:root-path,/srv/nfs/cheza2
See `src/freedreno/ci/gitlab-ci.yml` for an example of Servo on cheza. Note
that other Servo boards in CI are managed using LAVA.
Requirements (POE)
------------------
For boards with 30W or less power consumption, POE can be used for the power
control. The parts list ends up looking something like this:
- x86-64 gitlab-runner machine with a mid-range CPU, and 3+ GB of SSD storage
per board. This can host at least 15 boards in our experience.
- Cisco 2960S gigabit ethernet switch with POE. (Cisco 3750G, 3560G, or 2960G
were also recommended as reasonably priced HW, but make sure the name ends in
G, X, or S)
- POE splitters to power the boards (you can find ones that go to micro USB,
USBC, and 5V barrel jacks at least)
- USB serial cables (Adafruit sells pretty reliable ones)
- A large powered USB hub for all the serial cables
- A pile of ethernet cables
You'll talk to the Cisco for configuration using its USB port, which provides a
serial terminal at 9600 baud. You need to enable SNMP control, with a "mesaci"
community name that the gitlab-runner host will use (with no password) as its
authentication to configure port power. To talk SNMP to the switch, you need to
put an IP address on the default VLAN (VLAN 1).
Setting that up looks something like:

.. code-block:: console

   Switch>
   Password:
   Switch#configure terminal
   Switch(config)#interface Vlan 1
   Switch(config-if)#ip address 10.42.0.2 255.255.0.0
   Switch(config-if)#exit
   Switch(config)#snmp-server community mesaci RW
   Switch(config)#end
   Switch#copy running-config startup-config
With that set up, you should be able to power on/off a port with something like:
.. code-block:: console

   % snmpset -v2c -r 3 -t 30 -cmesaci 10.42.0.2 1.3.6.1.4.1.9.9.402.1.2.1.1.1.1 i 1
   % snmpset -v2c -r 3 -t 30 -cmesaci 10.42.0.2 1.3.6.1.4.1.9.9.402.1.2.1.1.1.1 i 4
Note that the "1.3.6..." SNMP OID changes between switches. The last digit
above is the interface id (port number). You can probably find the right OID by
google, that was easier than figuring it out from finding the switch's MIB
database. You can query the POE status from the switch serial using the `show
power inline` command.
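If you don't want to dig through vendor documentation, one hedged approach is
to walk the POE MIB subtree used above and look for your ports:

.. code-block:: console

   % snmpwalk -v2c -r 3 -t 30 -cmesaci 10.42.0.2 1.3.6.1.4.1.9.9.402.1.2.1.1.1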
Other than that, the dnsmasq/TFTP/NFS setup for your boards is the same as in
the "servo" example above.
See `src/broadcom/ci/gitlab-ci.yml` and `src/nouveau/ci/gitlab-ci.yml` for
examples of POE for Raspberry Pi 3/4 and Jetson Nano.
Setup
-----
Each board will be registered in freedesktop.org GitLab. You'll want
something like this to register a fastboot board:
.. code-block:: console
sudo gitlab-runner register \
--url https://gitlab.freedesktop.org \
--registration-token $1 \
--name MY_BOARD_NAME \
--tag-list MY_BOARD_TAG \
--executor docker \
--docker-image "alpine:latest" \
--docker-volumes "/dev:/dev" \
--docker-network-mode "host" \
--docker-privileged \
--non-interactive
For a Servo board, you'll need to also volume mount the board's NFS
root dir at /nfs and TFTP kernel directory at /tftp.
The registration token has to come from a freedesktop.org GitLab admin
going to https://gitlab.freedesktop.org/admin/runners
The name scheme for Google's lab is google-freedreno-boardname-n, and
our tag is something like google-freedreno-db410c. The tag is what
identifies a board type so that board-specific jobs can be dispatched
into that pool.
We need privileged mode and the /dev bind mount in order to get at the
serial console and fastboot USB devices (--device arguments don't
apply to devices that show up after container start, which is the case
with fastboot, and the Servo serial devices are actually links to
/dev/pts). We use host network mode so that we can spin up an nginx
server to collect XML results for fastboot.
Once you've added your boards, you're going to need to add a little
more customization in ``/etc/gitlab-runner/config.toml``. First, add
``concurrent = <number of boards>`` at the top ("we should have up to
this many jobs running managed by this gitlab-runner"). Then for each
board's runner, set ``limit = 1`` ("only 1 job served by this board at a
time"). Finally, add the board-specific environment variables
required by your bare-metal script, something like::
[[runners]]
name = "google-freedreno-db410c-1"
environment = ["BM_SERIAL=/dev/ttyDB410c8", "BM_POWERUP=google-power-up.sh 8", "BM_FASTBOOT_SERIAL=15e9e390", "FDO_CI_CONCURRENT=4"]
The ``FDO_CI_CONCURRENT`` variable should be set to the number of CPU threads on
the board, which is used for auto-tuning of job parallelism.
Once you've updated your runners' configs, restart gitlab-runner with ``sudo
service gitlab-runner restart``.
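For example:

.. code-block:: console

   sudo service gitlab-runner restart
   # sanity-check that the registered runners are still present
   sudo gitlab-runner list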
Caching downloads
-----------------
To improve the runtime for downloading traces during traces job runs, you will
want a pass-through HTTP cache. On your runner box, install nginx:
.. code-block:: console
sudo apt install nginx libnginx-mod-http-lua
Add the server setup files:
.. literalinclude:: fdo-cache
:name: /etc/nginx/sites-available/fdo-cache
:caption: /etc/nginx/sites-available/fdo-cache
.. literalinclude:: uri-caching.conf
:name: /etc/nginx/snippets/uri-caching.conf
:caption: /etc/nginx/snippets/uri-caching.conf
Edit the listener addresses in fdo-cache to suit the ethernet interface that
your devices are on.
Enable the site and restart nginx:
.. code-block:: console
sudo ln -s /etc/nginx/sites-available/fdo-cache /etc/nginx/sites-enabled/fdo-cache
sudo service nginx restart
# First download will hit the internet
wget http://localhost/cache/?uri=https://s3.freedesktop.org/mesa-tracie-public/itoral-gl-terrain-demo/demo.trace
# Second download should be cached.
wget http://localhost/cache/?uri=https://s3.freedesktop.org/mesa-tracie-public/itoral-gl-terrain-demo/demo.trace
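To confirm that the second request really came from the cache, you can check the
``X-GG-Cache-Status`` header that the fdo-cache configuration adds (a quick
sketch; a ``HIT`` or ``REVALIDATED`` status means the cache served it):

.. code-block:: console

   wget -S 'http://localhost/cache/?uri=https://s3.freedesktop.org/mesa-tracie-public/itoral-gl-terrain-demo/demo.trace' 2>&1 | grep -i x-gg-cache-status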
Now, set ``download-url`` in your ``traces-*.yml`` entry to something like
``http://10.42.0.1:8888/cache/?uri=https://s3.freedesktop.org/mesa-tracie-public``
and you should have cached downloads for traces. Add it to
``FDO_HTTP_CACHE_URI=`` in your ``config.toml`` runner environment lines and you
can use it for cached artifact downloads instead of going all the way to
freedesktop.org on each job.

lib/mesa/docs/ci/docker.rst
@@ -0,0 +1,74 @@
Docker CI
=========
For LLVMpipe and Softpipe CI, we run tests in a container containing
VK-GL-CTS, on the shared GitLab runners provided by `freedesktop
<http://freedesktop.org>`_.
Software architecture
---------------------
The Docker containers are rebuilt using the shell scripts under
.gitlab-ci/container/ when the FDO\_DISTRIBUTION\_TAG changes in
.gitlab-ci.yml. The resulting images are around 1 GB, and are
expected to change approximately weekly (though an individual
developer working on them may produce many more images while trying to
come up with a working MR!).
gitlab-runner is a client that polls gitlab.freedesktop.org for
available jobs, with no inbound networking requirements. Jobs can
have tags, so we can have DUT-specific jobs that only run on runners
with that tag marked in the GitLab UI.
Since dEQP takes a long time to run, we mark the job as "parallel" at
some level, which spawns multiple jobs from one definition, and then
deqp-runner.sh takes the corresponding fraction of the test list for
that job.
To reduce dEQP runtime (or avoid tests with unreliable results), a
deqp-runner.sh invocation can provide a list of tests to skip. If
your driver is not yet conformant, you can pass a list of expected
failures, and the job will only fail on tests that aren't listed (look
at the job's log for which specific tests failed).
DUT requirements
----------------
In addition to the general :ref:`CI-farm-expectations`, using
Docker requires:
* DUTs must have a stable kernel and GPU reset (if applicable).
If the system goes down during a test run, that job will eventually
time out and fail (default 1 hour). However, if the kernel can't
reliably reset the GPU on failure, bugs in one MR may leak into
spurious failures in another MR. This would be an unacceptable impact
on Mesa developers working on other drivers.
* DUTs must be able to run Docker
The Mesa gitlab-runner based test architecture is built around Docker,
so that we can cache the Debian package installation and CTS build
step across multiple test runs. Since the images are large and change
approximately weekly, the DUTs also need to be running some script to
prune stale Docker images periodically in order to not run out of disk
space as we rev those containers (perhaps `this script
<https://gitlab.com/gitlab-org/gitlab-runner/issues/2980#note_169233611>`_).
Note that Docker doesn't allow containers to be stored on NFS, and
doesn't allow multiple Docker daemons to interact with the same
network block device, so you will probably need some sort of physical
storage on your DUTs.
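A minimal periodic prune, assuming a roughly weekly image cadence and run from
cron or a systemd timer, could be as simple as:

.. code-block:: console

   # remove unreferenced images created more than two weeks ago (adjust to taste)
   docker image prune --all --force --filter "until=336h"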
* DUTs must be public
By including your device in .gitlab-ci.yml, you're effectively letting
anyone on the internet run code on your device. Docker containers may
provide some limited protection, but how much you trust that and what
you do to mitigate hostile access is up to you.
* DUTs must expose the DRI device nodes to the containers.
Obviously, to get access to the HW, we need to pass the render node
through. This is done by adding ``devices = ["/dev/dri"]`` to the
``runners.docker`` section of /etc/gitlab-runner/config.toml.

lib/mesa/docs/ci/fdo-cache
@@ -0,0 +1,44 @@
proxy_cache_path /var/cache/nginx/ levels=1:2 keys_zone=my_cache:10m max_size=24g inactive=48h use_temp_path=off;
server {
listen 10.42.0.1:8888 default_server;
listen 127.0.0.1:8888 default_server;
listen [::]:8888 default_server;
resolver 8.8.8.8;
root /var/www/html;
# Add index.php to the list if you are using PHP
index index.html index.htm index.nginx-debian.html;
server_name _;
add_header X-GG-Cache-Status $upstream_cache_status;
proxy_cache my_cache;
location /cache_gitlab_artifacts {
internal;
# GitLab's HTTP server marks everything as no-cache even though
# the artifact URLs don't change. So enforce a long validity
# time and ignore the headers that defeat caching
proxy_cache_valid 200 48h;
proxy_ignore_headers Cache-Control Set-Cookie;
include snippets/uri-caching.conf;
}
location /cache {
# special case gitlab artifacts
if ($arg_uri ~* /.*gitlab.*artifacts(\/|%2F)raw/ ) {
rewrite ^ /cache_gitlab_artifacts;
}
# Set a really low validity together with cache revalidation; our goal
# for caching isn't to lower the number of HTTP requests but to
# lower the amount of data transfer. Also, for some test
# scenarios (typically manual tests) the file at a given URL
# might get modified, so avoid confusion by ensuring
# revalidation happens often.
proxy_cache_valid 200 10s;
proxy_cache_revalidate on;
include snippets/uri-caching.conf;
}
}

lib/mesa/docs/ci/index.rst
@@ -0,0 +1,273 @@
Continuous Integration
======================
GitLab CI
---------
GitLab provides a convenient framework for running commands in response to Git pushes.
We use it to test merge requests (MRs) before merging them (pre-merge testing),
as well as post-merge testing, for everything that hits ``main``
(this is necessary because we still allow commits to be pushed outside of MRs,
and even then the MR CI runs in the forked repository, which might have been
modified and thus is unreliable).
The CI runs a number of tests, from trivial build-testing to complex GPU rendering:
- Build testing for a number of build systems, configurations and platforms
- Sanity checks (``meson test``)
- Some drivers (Softpipe, LLVMpipe, Freedreno and Panfrost) are also tested
using `VK-GL-CTS <https://github.com/KhronosGroup/VK-GL-CTS>`__
- Replay of application traces
A typical run takes between 20 and 30 minutes, although it can go up very quickly
if the GitLab runners are overwhelmed, which happens sometimes. When it does happen,
not much can be done besides waiting it out, or cancelling it.
Due to limited resources, we currently do not run the CI automatically
on every push; instead, we only run it automatically once the MR has
been assigned to ``Marge``, our merge bot.
If you're interested in the details, the main configuration file is ``.gitlab-ci.yml``,
and it references a number of other files in ``.gitlab-ci/``.
If the GitLab CI doesn't seem to be running on your fork (or MRs, as they run
in the context of your fork), you should check the "Settings" of your fork.
Under "CI / CD" → "General pipelines", make sure "Custom CI config path" is
empty (or set to the default ``.gitlab-ci.yml``), and that the
"Public pipelines" box is checked.
If you're having issues with the GitLab CI, your best bet is to ask
about it on ``#freedesktop`` on OFTC and tag `Daniel Stone
<https://gitlab.freedesktop.org/daniels>`__ (``daniels`` on IRC) or
`Emma Anholt <https://gitlab.freedesktop.org/anholt>`__ (``anholt`` on
IRC).
The three GitLab CI systems currently integrated are:
.. toctree::
:maxdepth: 1
bare-metal
LAVA
docker
Application traces replay
-------------------------
The CI replays application traces with various drivers in two different jobs. The first
job replays traces listed in ``src/<driver>/ci/traces-<driver>.yml`` files and if any
of those traces fail the pipeline fails as well. The second job replays traces listed in
``src/<driver>/ci/restricted-traces-<driver>.yml`` and it is allowed to fail. This second
job is only created when the pipeline is triggered by `marge-bot` or any other user that
has been granted access to these traces.
A traces YAML file also includes a ``download-url`` pointing to a MinIO
instance from which to download the traces. While the first job should always work with
publicly accessible traces, the second job could point to a URL with restricted access.
Restricted traces are those that have been made available to Mesa developers without a
license to redistribute at will, and thus should not be exposed to the public. Failing to
access that URL would not prevent the pipeline from passing; therefore, forks made by
contributors without permission to download non-redistributable traces can be merged
without friction.
As an aside, only maintainers of such non-redistributable traces are responsible for
ensuring that replays are successful, since other contributors would not be able to
download and test them by themselves.
Those Mesa contributors that believe they could have permission to access such
non-redistributable traces can request permission from Daniel Stone <daniels@collabora.com>.
gitlab.freedesktop.org accounts that are to be granted access to these traces will be
added to the OPA policy for the MinIO repository as per
https://gitlab.freedesktop.org/freedesktop/helm-gitlab-config/-/commit/a3cd632743019f68ac8a829267deb262d9670958 .
So that the jobs are created in personal repositories, the name of the user's account needs
to be added to the ``rules`` attribute of the GitLab CI job that accesses the restricted
traces.
.. toctree::
:maxdepth: 1
local-traces
Intel CI
--------
The Intel CI is not yet integrated into the GitLab CI.
For now, special access must be manually given (file an issue in
`the Intel CI configuration repo <https://gitlab.freedesktop.org/Mesa_CI/mesa_jenkins>`__
if you think you or Mesa would benefit from you having access to the Intel CI).
Results can be seen on `mesa-ci.01.org <https://mesa-ci.01.org>`__
if you are *not* an Intel employee, but if you are you
can access a better interface on
`mesa-ci-results.jf.intel.com <http://mesa-ci-results.jf.intel.com>`__.
The Intel CI runs a much larger array of tests, on a number of generations
of Intel hardware and on multiple platforms (X11, Wayland, DRM & Android),
with the purpose of detecting regressions.
Tests include
`Crucible <https://gitlab.freedesktop.org/mesa/crucible>`__,
`VK-GL-CTS <https://github.com/KhronosGroup/VK-GL-CTS>`__,
`dEQP <https://android.googlesource.com/platform/external/deqp>`__,
`Piglit <https://gitlab.freedesktop.org/mesa/piglit>`__,
`Skia <https://skia.googlesource.com/skia>`__,
`VkRunner <https://github.com/Igalia/vkrunner>`__,
`WebGL <https://github.com/KhronosGroup/WebGL>`__,
and a few other tools.
A typical run takes between 30 minutes and an hour.
If you're having issues with the Intel CI, your best bet is to ask about
it on ``#dri-devel`` on OFTC and tag `Nico Cortes
<https://gitlab.freedesktop.org/ngcortes>`__ (``ngcortes`` on IRC).
.. _CI-farm-expectations:
CI farm expectations
--------------------
To make sure that testing of one vendor's drivers doesn't block
unrelated work by other vendors, we require that a given driver's test
farm produces a spurious failure no more than once a week. If every
driver had CI and failed once a week, we would be seeing someone's
code getting blocked on a spurious failure daily, which is an
unacceptable cost to the project.
Additionally, the test farm needs to be able to provide a short enough
turnaround time that we can get our MRs through marge-bot without the
pipeline backing up. As a result, we require that the test farm be
able to handle a whole pipeline's worth of jobs in less than 15 minutes
(to compare, the build stage is about 10 minutes).
If a test farm doesn't have enough HW to provide these guarantees, consider dropping
tests to reduce runtime. dEQP job logs print the slowest tests at the end of
the run, and Piglit logs the runtime of tests in the results.json.bz2 in the
artifacts. Or, you can add the following to your job to run only some fraction
(in this case, 1/10th) of the dEQP test list:

.. code-block:: yaml

   variables:
      DEQP_FRACTION: 10
If a HW CI farm goes offline (network dies and all CI pipelines end up
stalled) or its runners are consistently spuriously failing (disk
full?), and the maintainer is not immediately available to fix the
issue, please push through an MR disabling that farm's jobs by adding
'.' to the front of the job names until the maintainer can bring
things back up. If this happens, the farm maintainer should provide a
report to mesa-dev@lists.freedesktop.org after the fact explaining
what happened and what the mitigation plan is for that failure next
time.
Personal runners
----------------
Mesa's CI is currently run primarily on packet.net's m1xlarge nodes
(2.2 GHz Sandy Bridge), with each job getting 8 cores allocated. You
can speed up your personal CI builds (and marge-bot merges) by using a
faster personal machine as a runner. You can find the gitlab-runner
package in Debian, or use GitLab's own builds.
To do so, follow `GitLab's instructions
<https://docs.gitlab.com/ce/ci/runners/#create-a-specific-runner>`__ to
register your personal GitLab runner in your Mesa fork. Then, tell
Mesa how many jobs it should serve (``concurrent=``) and how many
cores those jobs should use (``FDO_CI_CONCURRENT=``) by editing these
lines in ``/etc/gitlab-runner/config.toml``, for example::
concurrent = 2
[[runners]]
environment = ["FDO_CI_CONCURRENT=16"]
Docker caching
--------------
The CI system uses Docker images extensively to cache
infrequently-updated build content like the CTS. The `freedesktop.org
CI templates
<https://gitlab.freedesktop.org/freedesktop/ci-templates/>`_ help us
manage the building of the images to reduce how frequently rebuilds
happen, and trim down the images (stripping out manpages, cleaning the
apt cache, and other such common pitfalls of building Docker images).
When running a container job, the templates will look for an existing
build of that image in the container registry under
``MESA_IMAGE_TAG``. If it's found it will be reused, and if
not, the associated ``.gitlab-ci/container/<jobname>.sh`` will be run
to build it. So, when developing any change to container build
scripts, you need to update the associated ``MESA_IMAGE_TAG`` to
a new unique string. We recommend using the current date plus some
string related to your branch (so that if you rebase on someone else's
container update from the same day, you will get a Git conflict
instead of silently reusing their container).
When developing a given change to your Docker image, you would have to
bump the tag on each ``git commit --amend`` to your development
branch, which can get tedious. Instead, you can navigate to the
`container registry
<https://gitlab.freedesktop.org/mesa/mesa/container_registry>`_ for
your repository and delete the tag to force a rebuild. When your code
is eventually merged to main, a full image rebuild will occur again
(forks inherit images from the main repo, but MRs don't propagate
images from the fork into the main repo's registry).
Building locally using CI docker images
---------------------------------------
It can be frustrating to debug build failures on an environment you
don't personally have. If you're experiencing this with the CI
builds, you can use Docker to use their build environment locally. Go
to your job log, and at the top you'll see a line like::
Pulling docker image registry.freedesktop.org/anholt/mesa/debian/android_build:2020-09-11
We'll use a volume mount to make our current Mesa tree be what the
Docker container uses, so they'll share everything (their build will
go in _build, according to ``meson-build.sh``). We're going to be
using the image non-interactively so we use ``run --rm $IMAGE
command`` instead of ``run -it $IMAGE bash`` (which you may also find
useful for debug). Extract your build setup variables from
.gitlab-ci.yml and run the CI meson build script:
.. code-block:: console
IMAGE=registry.freedesktop.org/anholt/mesa/debian/android_build:2020-09-11
sudo docker pull $IMAGE
sudo docker run --rm -v `pwd`:/mesa -w /mesa $IMAGE env PKG_CONFIG_PATH=/usr/local/lib/aarch64-linux-android/pkgconfig/:/android-ndk-r21d/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/lib/aarch64-linux-android/pkgconfig/ GALLIUM_DRIVERS=freedreno UNWIND=disabled EXTRA_OPTION="-D android-stub=true -D llvm=disabled" DRI_LOADERS="-D glx=disabled -D gbm=disabled -D egl=enabled -D platforms=android" CROSS=aarch64-linux-android ./.gitlab-ci/meson-build.sh
All you have left over from the build is its output, and a _build
directory. You can hack on mesa and iterate testing the build with:
.. code-block:: console
sudo docker run --rm -v `pwd`:/mesa $IMAGE ninja -C /mesa/_build
Conformance Tests
-----------------
Some conformance tests require special treatment to be maintained on GitLab CI.
This section lists their documentation pages.
.. toctree::
:maxdepth: 1
skqp
Updating GitLab CI Linux Kernel
-------------------------------
GitLab CI usually runs a bleeding-edge kernel. The following documentation has
instructions on how to uprev the Linux kernel in the GitLab CI ecosystem.
.. toctree::
:maxdepth: 1
kernel

lib/mesa/docs/ci/kernel.rst
@@ -0,0 +1,121 @@
Upreving Linux Kernel
=====================
Occasionally, the GitLab CI needs a Linux kernel update to enable new kernel
features, device drivers, and bug fixes in CI jobs.
Kernel uprevs in GitLab CI are relatively simple, but prone to lots of
side effects, since many devices from different platforms are involved in the
pipeline.
Kernel repository
-----------------
The Linux Kernel used in the GitLab CI is stored at the following repository:
https://gitlab.freedesktop.org/gfx-ci/linux
It is common for the Mesa kernel to carry some patches that were not merged into
Linux mainline; that is why Mesa has its own kernel version, which should be used
as the base for newer kernels.
So, one should base the kernel uprev on the last tag used in the Mesa CI;
please refer to the `KERNEL_URL` variable in `.gitlab-ci/container/gitlab-ci.yml`.
Every tag has a standard naming: `vX.YZ-for-mesa-ci-<commit_short_SHA>`, which
can be created via the command:
:code:`git tag vX.YZ-for-mesa-ci-$(git rev-parse --short HEAD)`
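and then pushed (assuming your remote for the CI kernel repository is called
``origin``):

:code:`git push origin vX.YZ-for-mesa-ci-$(git rev-parse --short HEAD)`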
Building Kernel
---------------
When Mesa CI generates a new rootfs image, the Linux Kernel is built based on
the script located at `.gitlab-ci/container/build-kernel.sh`.
Updating Kconfigs
^^^^^^^^^^^^^^^^^
When a kernel uprev happens, it is worth compiling and cross-compiling the
kernel locally, in order to update the Kconfigs accordingly. Remember that the
resulting Kconfig is a merge between the *Mesa CI Kconfig* and the *Linux tree
defconfig*, made via the `merge_config.sh` script located in the Linux kernel tree.
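For instance, a local merge for arm64 might look like the following sketch (run
from the Linux kernel tree; the relative path to the Mesa checkout is an
assumption):

.. code-block:: console

   ./scripts/kconfig/merge_config.sh -m arch/arm64/configs/defconfig \
        ../mesa/.gitlab-ci/container/arm64.config
   make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- olddefconfig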
Kconfigs location
"""""""""""""""""
+------------+--------------------------------------------+-------------------------------------+
| Platform | Mesa CI Kconfig location | Linux tree defconfig |
+============+============================================+=====================================+
| arm | .gitlab-ci/container/arm.config | arch/arm/configs/multi_v7_defconfig |
+------------+--------------------------------------------+-------------------------------------+
| arm64 | .gitlab-ci/container/arm64.config | arch/arm64/configs/defconfig |
+------------+--------------------------------------------+-------------------------------------+
| x86-64 | .gitlab-ci/container/x86_64.config | arch/x86/configs/x86_64_defconfig |
+------------+--------------------------------------------+-------------------------------------+
Updating image tags
-------------------
Every kernel uprev should update 3 image tags, located in two files.
:code:`.gitlab-ci/container/gitlab-ci.yml` tag
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- **KERNEL_URL** for the location of the new kernel
:code:`.gitlab-ci/image-tags.yml` tags
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- **KERNEL_ROOTFS_TAG** to rebuild rootfs with the new kernel
- **DEBIAN_X86_TEST_GL_TAG** to ensure that the new rootfs is being used by the GitLab x86 jobs
Development routine
-------------------
1. Compile the newer kernel locally for each platform.
2. Compile device trees for ARM platforms
3. Update Kconfigs. Are new Kconfigs necessary? Is CONFIG_XYZ_BLA deprecated? Does the `merge_config.sh` override an important config?
4. Push a new development branch to `Kernel repository`_ based on the latest kernel tag used in GitLab CI
5. Hack `build-kernel.sh` script to clone kernel from your development branch
6. Update image tags. See `Updating image tags`_
7. Run the entire CI pipeline, all the automatic jobs should be green. If some job is red or taking too long, you will need to investigate it and probably ask for help.
When the Kernel uprev is stable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1. Push a new tag to the Mesa CI `Kernel repository`_
2. Update KERNEL_URL in the `debian/x86_test-gl` job definition
3. Open a merge request, if one is not open yet
Tips and Tricks
---------------
Compare pipelines
^^^^^^^^^^^^^^^^^
To have the most confidence that a kernel uprev does not break anything in Mesa,
it is suggested that one runs the entire CI pipeline to check if the update affected the manual CI jobs.
Step-by-step
""""""""""""
1. Create a local branch at the same Git ref (it should be the main branch) that the kernel uprev branch is based on.
2. Push this test branch
3. Run the entire pipeline against the test branch, even the manual jobs
4. Now do the same for the kernel uprev branch
5. Compare the job results. If a CI job turned red on your uprev branch, it means that the kernel update broke the test. Otherwise, it should be fine.
Bare-metal custom kernels
^^^^^^^^^^^^^^^^^^^^^^^^^
Some CI jobs support plugging in a custom kernel by simply changing a variable.
This is great, since rebuilding the kernel and rootfs may take dozens of minutes.
For example, the Freedreno jobs' `gitlab-ci.yml` manifest supports a variable named
`BM_KERNEL`. If one puts a gz-compressed kernel URL there, the job will use that
kernel to boot the Freedreno bare-metal devices. The same works for `BM_DTB` in
the case of device tree binaries.
Careful reading of the job logs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Sometimes a job may turn red for reasons unrelated to the kernel update, e.g.
a LAVA `tftp` timeout, problems with the freedesktop.org servers, etc.
So it is important to check why the job turned red, and retry it if an
infrastructure error has happened.

lib/mesa/docs/ci/local-traces.rst
@@ -0,0 +1,45 @@
Running traces on a local machine
=================================
Prerequisites
-------------
- Install `Apitrace <https://apitrace.github.io/>`_
- Install `Renderdoc <https://renderdoc.org/>`_ (only needed for some traces)
- Download and compile `Piglit <https://gitlab.freedesktop.org/mesa/piglit>`_ and install its `dependencies <https://gitlab.freedesktop.org/mesa/piglit#2-setup>`_
- Download traces you want to replay from `traces-db <https://gitlab.freedesktop.org/gfx-ci/tracie/traces-db/>`_
Running a single trace
----------------------
A simple run to see the output of the trace can be done with:
.. code-block:: console
apitrace replay -w name_of_trace.trace
For more information, look into the `Apitrace documentation <https://github.com/apitrace/apitrace/blob/master/docs/USAGE.markdown>`_.
For comparing checksums use:
.. code-block:: console
cd piglit/replayer
export PIGLIT_SOURCE_DIR="../"
./replayer.py compare trace -d test path/name_of_trace.trace 0 # replace with expected checksum
Simulating CI trace job
-----------------------
Sometimes it's useful to be able to test traces on your local machine instead of the Mesa CI runner, simulating the CI environment as closely as possible.
Download the YAML file from your driver's `ci/` directory, and then change the path in the YAML file from the local proxy or MinIO to the local directory (in URL-like format, ``file://``).
.. code-block:: console
# The PIGLIT_REPLAY_DEVICE_NAME has to match name in the YAML file.
export PIGLIT_REPLAY_DEVICE_NAME='your_device_name'
export PIGLIT_REPLAY_DESCRIPTION_FILE='path_to_mesa_traces_file.yml'
./piglit run -l verbose --timeout 300 -j10 replay ~/results/
Note: For replaying traces, you may need to allow higher GL and GLSL versions. You can achieve that by setting ``MESA_GLSL_VERSION_OVERRIDE`` and ``MESA_GL_VERSION_OVERRIDE``.
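For example (the exact values needed depend on the trace):

.. code-block:: console

   export MESA_GL_VERSION_OVERRIDE=4.6
   export MESA_GLSL_VERSION_OVERRIDE=460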

lib/mesa/docs/ci/skqp.rst
@@ -0,0 +1,101 @@
SkQP
====
`SkQP <https://skia.org/docs/dev/testing/skqp/>`_ stands for SKIA Quality
Program conformance tests. Basically, it has sets of rendering tests and unit
tests to ensure that `SKIA <https://skia.org/>`_ is meeting its design specifications on a specific
device.
The rendering tests support the GL, GLES and Vulkan backends and cover some
rendering scenarios, while the unit tests check the GPU behavior without
rendering images.
Tests
-----
Render tests design
^^^^^^^^^^^^^^^^^^^
It is worth noting that `rendertests.txt` can carry some detail about each test's
expectations; each test can have a max pixel error count, telling SkQP that it
is OK for that test to have at most that number of errors. See also:
https://github.com/google/skia/blob/c29454d1c9ebed4758a54a69798869fa2e7a36e0/tools/skqp/README_ALGORITHM.md
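As a purely illustrative sketch (the test names and thresholds here are
hypothetical; the exact format is described in the README linked above), the
entries pair a test name with its allowed error count:

.. code-block:: console

   # <render test name>,<max pixel error count>
   aarectmodes,0
   blurroundrect,4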
.. _test-location:
Location
^^^^^^^^
Each `rendertests.txt` and `unittests.txt` file must be located inside a specific
subdirectory of the SkQP assets directory.
+--------------+--------------------------------------------+
| Test type | Location |
+==============+============================================+
| Render tests | `${SKQP_ASSETS_DIR}/skqp/rendertests.txt` |
+--------------+--------------------------------------------+
| Unit tests | `${SKQP_ASSETS_DIR}/skqp/unittests.txt` |
+--------------+--------------------------------------------+
The `skqp-runner.sh` script will make the necessary modifications to separate
`rendertests.txt` for each backend-driver combination, as long as the test files are located in the expected places:
+--------------+----------------------------------------------------------------------------------------------+
| Test type | Location |
+==============+==============================================================================================+
| Render tests | `${MESA_REPOSITORY_DIR}/src/${GPU_DRIVER}/ci/${GPU_VERSION}-${SKQP_BACKEND}_rendertests.txt` |
+--------------+----------------------------------------------------------------------------------------------+
| Unit tests | `${MESA_REPOSITORY_DIR}/src/${GPU_DRIVER}/ci/${GPU_VERSION}_unittests.txt` |
+--------------+----------------------------------------------------------------------------------------------+
Where `SKQP_BACKEND` can be:
- gl: for GL backend
- gles: for GLES backend
- vk: for Vulkan backend
Example file
""""""""""""
.. code-block:: console
src/freedreno/ci/freedreno-a630-skqp-gl_rendertests.txt
- GPU_DRIVER: `freedreno`
- GPU_VERSION: `freedreno-a630`
- SKQP_BACKEND: `gl`
.. _rendertests-design:
SkQP reports
------------
SkQP generates reports after finishing its execution; they are located in the job
artifacts' results directory and are divided into subdirectories by rendering test
backend and unit tests. The job log has links to every generated report in order
to facilitate SkQP debugging.
Maintaining SkQP on Mesa CI
---------------------------
SkQP is built alongside another binary, namely `list_gpu_unit_tests`, which is
located in the same folder as the `skqp` binary.
This binary will generate the expected `unittests.txt` for the target GPU, so
ideally it should be executed on every SkQP update and when a new device
receives SkQP CI jobs.
1. Generate target unit tests for the current GPU with :code:`./list_gpu_unit_tests > unittests.txt`
2. Run SkQP job
3. If there is a failing or crashing unit test, remove it from the corresponding `unittests.txt`
4. If there is a crashing render test, remove it from the corresponding `rendertests.txt`
5. If there is a failing render test, visually inspect the result from the HTML report

   - If the render result is OK, update the max error count for that test
   - Otherwise, put `-1` in the same threshold, as seen in :ref:`rendertests-design`

6. Remember to put the new test files in the locations cited in :ref:`test-location`

lib/mesa/docs/ci/uri-caching.conf
@@ -0,0 +1,37 @@
set $proxy_authorization '';
set_by_lua $proxyuri '
unescaped = ngx.unescape_uri(ngx.var.arg_uri);
it, err = ngx.re.match(unescaped, "(https?://)(.*@)?([^/]*)(/.*)?");
if not it then
-- Hack to cause nginx to return 404
return "http://localhost/404"
end
scheme = it[1];
authstring = it[2];
host = it[3];
query = it[4];
if ngx.var.http_authorization and ngx.var.http_authorization ~= "" then
ngx.var.proxy_authorization = ngx.var.http_authorization;
elseif authstring then
auth = string.sub(authstring, 0, -2);
auth64 = ngx.encode_base64(auth);
ngx.var.proxy_authorization = "Basic " .. auth64;
end
-- Default to / if none is set to avoid using the request_uri query
if not query then
query = "/";
end
return scheme .. host .. query;
';
add_header X-GG-Cache-Status $upstream_cache_status;
proxy_set_header Authorization $proxy_authorization;
proxy_pass $proxyuri;
# Redirect back to ourselves on 301 replies
proxy_redirect ~^(.*)$ /cache/?uri=$1;