[Discourse.ros.org] [Buildfarm] ROS Buildfarm October 2017 Guide to new changes

Charles Fox via ros-users

Open Robotics hosts the primary ROS buildfarm at http://build.ros.org, which recently migrated to new hosts running Ubuntu Xenial. In the process we substantially overhauled the configuration management and made some desirable terminology changes, which require intervention when updating the ros_buildfarm Python module and scripts.

### Target audience

If you or your organization run a Jenkins instance or cluster that uses the Python libraries and scripts found in the Python package ros-buildfarm (also at https://github.com/ros-infrastructure/ros_buildfarm), the upcoming changes affect you.

Additionally, if your buildfarm machines were originally provisioned with the configuration management tools in https://github.com/ros-infrastructure/buildfarm_deployment and https://github.com/ros-infrastructure/buildfarm_deployment_config, there are significant changes that matter most if you kept the default autoreconfigure settings. Action is required to ensure successful operation even if you are not ready to update at this time.

### Timeline

Xenial support will not be merged into master branches earlier than 19 October. Between now and then, I hope you'll determine which path you'll take, review the migration guide, and raise any questions or concerns you have before that date.

The ros_buildfarm release needed to perform either upgrade method is still pending. The [`xenial` branch](https://github.com/ros-infrastructure/ros_buildfarm/tree/xenial) of [ros-infrastructure/ros_buildfarm](https://github.com/ros-infrastructure/ros_buildfarm) contains the bulk of the likely changes. As soon as is feasible we will create the release and update this guide.

### Warnings

- Do not attempt to re-run `reconfigure.bash` from buildfarm_deployment on the master host. It will clobber many Jenkins configuration details.

- Running the xenial branches for buildfarm_deployment and buildfarm_deployment_config on a trusty-based host has never been tried and is not supported. Some modules assume systemd is the service supervision provider and definitely will not work.

## CHANGELOG (Abridged)

#### Updated system software:
- Ubuntu 16.04 LTS
- Jenkins LTS 2.60.3
- Docker CE 17.05
- Java 8
- Puppet 3.8

#### [ros_buildfarm](https://github.com/ros-infrastructure/ros_buildfarm)
- Update Jenkins terminology in job names, scripts, system directories, and docs.
- Add config generation for upload jobs (only for build.ros.org)

#### [buildfarm_deployment](https://github.com/ros-infrastructure/buildfarm_deployment)
- Refactor of puppet modules
- Modules attempt to follow the Roles and Profiles pattern and are factored into reusable components.
- Although this is untested, it is more possible than before to incorporate profile modules into a separate puppet infrastructure.
- Updated and pinned to current puppetforge releases for upstream puppet modules.
- Switched to the puppet future parser (puppet 4.x compatible parser)
- Retire vendored upstart module in favor of systemd service provider on Xenial.
- Add script to build reprepro 5.1.1 from backported sourcedeb.
- Add script to fetch Jenkins plugin versions from build.ros.org and generate a puppet module installing those plugin versions.

#### [buildfarm_deployment_config](https://github.com/ros-infrastructure/buildfarm_deployment_config)
- Unified installed puppet modules across roles.
- Refactored hiera config to share common data and provide role-specific configuration separately.
- `install_prerequisites.bash` now uses system packages for puppet and librarian.
- `reconfigure.bash` stores the configured role to prevent accidents when reconfiguring.
- Update user account and hiera key names for current Jenkins terminology where possible.

## Updating ros_buildfarm on existing (Trusty) hosts

It's possible to update your running hosts with limited configuration changes that will allow them to benefit from changes in subsequent releases. _Note that you will not be able to successfully run builds during the migration process._

1. Rename the local user account from `jenkins-slave` to `jenkins-agent`.
  - This is a somewhat system-dependent operation. On a Trusty system, the procedure below should cover most installs.
  - Gracefully stop services running as the jenkins-slave user.
  - Check for running processes with `ps -u jenkins-slave`.
  - If there are remaining non-critical processes, stop them with `pkill -u jenkins-slave`; otherwise wait for them to shut down gracefully.
  - `usermod -l jenkins-agent jenkins-slave`
  - `groupmod -n jenkins-agent jenkins-slave`
  - `mv /home/jenkins-slave /home/jenkins-agent`
  - `usermod -d /home/jenkins-agent jenkins-agent`

2. Change the path to the Jenkins slave jar in /etc/default/jenkins-slave to use the new home directory.

3. Check the crontab for entries with hard-coded paths to the old home directory.

4. Apply the label `buildagent` to all executor nodes with the previous label `buildslave`. Making this change persist between restarts usually requires changes to /etc/default/jenkins-slave.

5. Rename the `check_slaves` job to `check_agents` via the Jenkins web UI.

6. Update the ros_buildfarm tools on your buildfarm
  - Using the ros_buildfarm scripts version TBD or greater, run `generate_all_jobs.py YOUR_BUILDFARM_CONFIG_URL`
  - Review the diff output for potential issues.
  - Commit the changes with `generate_all_jobs.py YOUR_BUILDFARM_CONFIG_URL --commit`
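
The account rename in step 1 and the path checks in steps 2 and 3 can be sketched as a pair of shell helpers. This is a minimal dry-run sketch, not a tested migration script: the function names `rename_account_commands` and `find_old_home_references` are hypothetical, and the first one only prints the commands listed above so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Dry-run sketch: nothing here modifies the system.

# Print the rename commands from step 1 for the given old and new account names.
rename_account_commands() {
    old="$1"
    new="$2"
    echo "usermod -l $new $old"
    echo "groupmod -n $new $old"
    echo "mv /home/$old /home/$new"
    echo "usermod -d /home/$new $new"
}

# List lines in the given files that still mention the old home directory.
# Exits non-zero when no reference is found (standard grep behavior).
find_old_home_references() {
    old_home="$1"
    shift
    # /dev/null makes grep print file names even when only one file is given.
    grep -n -F -- "$old_home" "$@" /dev/null
}
```

For example, `rename_account_commands jenkins-slave jenkins-agent` prints the four commands in order, and `find_old_home_references /home/jenkins-slave /etc/default/jenkins-slave` shows any lines that still need editing. The same helper can be pointed at a saved crontab dump to cover step 3.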

## Changing configuration to avoid breaking changes

With some changes to your buildfarm's configuration you can continue to use the current (Trusty) configuration management infrastructure and buildfarm scripts until you are ready to perform the upgrade. You could potentially continue to use the Trusty configuration indefinitely, but you will be unable to use newer versions of the ros_buildfarm tools.
1. Update the auto-reconfiguring host configuration.
  - If your configuration is set up to use the ros-infrastructure/buildfarm_deployment repository directly, you will need to make sure that any hosts with the `autoreconfigure: true` setting have their autoreconfigure_command updated to use the `trusty` branch rather than `master`.

2. Set ros_buildfarm to use the last release before the xenial-related changes.
  - In order to preserve the current behavior until you're ready to upgrade, make sure you're using version 1.4.1 or earlier.
  - build.ros.org is designed to track the master branch of the ros_buildfarm scripts. Using the master branch or subsequent ros_buildfarm releases will use updated terminology that may cause errors or unexpected behavior unless you follow the section *"Updating ros_buildfarm on existing (Trusty) hosts"*.
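
A quick guard like the following can catch an accidental upgrade past the last pre-Xenial release. It is only a sketch: `version_at_most` and `check_ros_buildfarm_version` are hypothetical helpers, the comparison relies on the `-V` (version sort) option of GNU `sort`, and how you obtain the installed ros_buildfarm version depends on how the package was installed on your farm.

```shell
# Succeeds when version $1 is less than or equal to version $2.
version_at_most() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Report whether the given ros_buildfarm version is still 1.4.1 or earlier.
check_ros_buildfarm_version() {
    if version_at_most "$1" "1.4.1"; then
        echo "ok: $1 is at or below 1.4.1"
    else
        echo "warning: $1 is newer than 1.4.1 and expects the new terminology"
        return 1
    fi
}
```

Calling `check_ros_buildfarm_version` with the version you have deployed, for instance from a provisioning script, returns a non-zero status when the pinned release has been exceeded.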

## Migrating to Ubuntu Xenial

We've done no testing to support upgrading buildfarm hosts to Ubuntu Xenial in place. The migration to Xenial for Open Robotics was performed by provisioning new hosts running Ubuntu Xenial, running the updated configuration management, and migrating the Jenkins and repository data to the new hosts. While all buildfarm deployments, particularly large instances, would benefit from the improvements in the updated buildfarm stack, it is _not currently necessary to upgrade to Ubuntu Xenial_ in order to use newer versions of the ros_buildfarm scripts. You can instead follow the section marked *"Updating ros_buildfarm on existing (Trusty) hosts"*. Our migration followed the basic procedure below:

1. Provision new Xenial hosts: master, repo, and agent.

2. Run the Xenial configuration management scripts from https://github.com/ros-infrastructure/buildfarm_deployment_config on the new hosts.

3. Put Jenkins into Shutdown mode and stop any remaining builds (or let them finish)

4. Use rsync to copy packages from the existing repo host to the new xenial repo host.

5. Stop trusty jenkins master, and jenkins agents on all trusty machines.
  - Archive /var/lib/jenkins. Expect 10-40MB/s depending on compressibility and IO availability on an AWS machine.

6. Stop `jenkins-slave` and `jenkins` on Xenial hosts if they were running.

7. Transfer the archive to the new master (about 10 minutes over the AWS internal network).

8. Move existing /var/lib/jenkins into /tmp (it should not contain anything worth preserving)

9. Extract Jenkins archive into /var/lib on the Xenial host

10. Bring new jenkins master online with migrated config

11. Run `generate_all_jobs.py` using version TBD of ros_buildfarm

12. Start jenkins agents on xenial hosts
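
The archive, transfer, and extract steps (5 and 7 through 9) might look like the sketch below. It is a dry run that only prints commands; the function name `jenkins_migration_commands`, the default archive path, and the `/tmp/jenkins.default` destination are hypothetical, and the use of `tar` and `rsync` for the Jenkins archive is an assumption rather than a record of the exact commands Open Robotics ran.

```shell
# Print the commands for moving /var/lib/jenkins to a new master.
# $1: ssh destination of the new master; $2: archive path (optional).
jenkins_migration_commands() {
    new_master="$1"
    archive="${2:-/tmp/jenkins-home.tar.gz}"
    echo "# On the old master, with Jenkins stopped:"
    echo "tar -czf $archive -C /var/lib jenkins"
    echo "rsync -a $archive $new_master:$archive"
    echo "# On the new master, with Jenkins stopped:"
    echo "mv /var/lib/jenkins /tmp/jenkins.default"
    echo "tar -xzf $archive -C /var/lib"
}
```

Running `jenkins_migration_commands` with your new master's address prints the sequence for review. After extracting for real, it is worth confirming that the files under /var/lib/jenkins are owned by the Jenkins service account on the new host before bringing the master online.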

[Visit Topic](https://discourse.ros.org/t/ros-buildfarm-october-2017-guide-to-new-changes/2840/1) or reply to this email to respond.
