RobH, Jan 5 2023, 10:56 PM

Description

At the request of serviceops, all racking tasks for new hardware also have a sub-task for tracking the serviceops implementation.

Once parent task T326366 is resolved, the serviceops team can proceed with this task.


topic branch with related patches: https://gerrit.wikimedia.org/r/q/topic:gerrit-bullseye


1# schedule and announce downtime
2# shortly before the scheduled downtime: initial sync (steps 3-7; repeated as steps 16-20 during the downtime)
3# on gerrit1001, as root, in a screen: rsync -avp --delete --bwlimit=100m /var/lib/gerrit2/review_site/ rsync://gerrit1003.wikimedia.org/gerrit-var-lib/
4# on gerrit1001, as root, in a screen: rsync -avp --delete --bwlimit=100m /srv/gerrit/ rsync://gerrit1003.wikimedia.org/gerrit-data/
5# on gerrit1003: rsync -avp /srv/gerrit/plugins/lfs/ /srv/gerrit/data/lfs/
6# on gerrit1003: chown -R gerrit2:gerrit2 /var/lib/gerrit2
7# on gerrit1003: chown -R gerrit2:gerrit2 /srv/gerrit
8# scheduled downtime begins / IRC announcement
9# on cumin1001: sudo cookbook sre.hosts.downtime -r 'maintenance' -D 30 gerrit1001.wikimedia.org
10# on cumin1001: sudo cookbook sre.hosts.downtime -r 'maintenance' -H 1 gerrit1003.wikimedia.org
11# on icinga.wikimedia.org - manually schedule downtime for the checks connected to virtual server "gerrit.wikimedia.org". The cookbook does not find this virtual host.
12# on gerrit1003: disable puppet; stop gerrit? (sudo disable-puppet 'gerrit maintenance'; systemctl stop gerrit)
13# merge DNS change that removes gerrit-new and switches IP of gerrit.wikimedia.org - in web UI of gerrit(-old)
14# run authdns-update on ns0.wikimedia.org, see the diff but do NOT commit yet
15# on gerrit1001: disable puppet; stop gerrit! (sudo disable-puppet 'gerrit maintenance'; systemctl stop gerrit)
16# on gerrit1001, as root, in a screen: rsync -avp --delete --bwlimit=100m /var/lib/gerrit2/review_site/ rsync://gerrit1003.wikimedia.org/gerrit-var-lib/
17# on gerrit1001, as root, in a screen: rsync -avp --delete --bwlimit=100m /srv/gerrit/ rsync://gerrit1003.wikimedia.org/gerrit-data/
18# on gerrit1003: rsync -avp /srv/gerrit/plugins/lfs/ /srv/gerrit/data/lfs/
19# on gerrit1003: chown -R gerrit2:gerrit2 /var/lib/gerrit2
20# on gerrit1003: chown -R gerrit2:gerrit2 /srv/gerrit
21# on gerrit1003: start gerrit
22# say "yes" to authdns-update and actually merge DNS change that removes gerrit-new and switches IP of gerrit.wikimedia.org
23# wait 5 minutes
24# ..test https (https://gerrit.wikimedia.org in browser)
25# ..test ssh (e.g. ssh [email protected] -p 29418)
26# announce downtime is over
27# ensure gerrit1001 has puppet disabled and/or services are masked
28# grace period (how long?)
29# decom old host -> https://phabricator.wikimedia.org/T336427
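
The sync commands appear twice in the list above (steps 3-7 and again as steps 16-20). A minimal sketch of wrapping them in two shell functions so both passes run exactly the same commands is shown here. The function names, the `run` helper, and the `DRY_RUN` and `DEST_HOST` variables are assumptions made for illustration and are not part of the original task. The paths, rsync modules, and flags are copied from the steps. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: wraps the sync steps from the list above.
# DRY_RUN and DEST_HOST are hypothetical knobs, not part of the original task.
set -eu

DEST_HOST="${DEST_HOST:-gerrit1003.wikimedia.org}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Steps 3-4 and 16-17: run on gerrit1001, as root, in a screen.
sync_from_old_host() {
    run rsync -avp --delete --bwlimit=100m /var/lib/gerrit2/review_site/ "rsync://${DEST_HOST}/gerrit-var-lib/"
    run rsync -avp --delete --bwlimit=100m /srv/gerrit/ "rsync://${DEST_HOST}/gerrit-data/"
}

# Steps 5-7 and 18-20: run on gerrit1003.
fix_up_on_new_host() {
    run rsync -avp /srv/gerrit/plugins/lfs/ /srv/gerrit/data/lfs/
    run chown -R gerrit2:gerrit2 /var/lib/gerrit2
    run chown -R gerrit2:gerrit2 /srv/gerrit
}

# Dispatch on the first argument; with no argument only the functions are defined.
case "${1:-}" in
    old) sync_from_old_host ;;
    new) fix_up_on_new_host ;;
    "")  ;;
    *)   echo "usage: $0 [old|new]" >&2; exit 2 ;;
esac
```

Saved under a hypothetical name such as `gerrit-sync.sh`, `sh gerrit-sync.sh old` on gerrit1001 prints the two rsync commands and `sh gerrit-sync.sh new` on gerrit1003 prints the local rsync and chown commands. Setting `DRY_RUN=0` would execute them instead of printing.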


master +1 -28
jjb: Update maven-java8-based jobs to images with new gerrit IP
operations/puppet production +1 -1
gerrit: allow masking the service and do so on gerrit1001
operations/puppet production +4 -1
gerrit: add host-based Hiera keys for gerrit1003
operations/homer/public production +2 -2
gerrit: switch service IP, turn new into current and current into old
operations/dns production +1 -0
site: add gerrit prod role to gerrit1003
operations/puppet production +2 -0
gerrit: add gerrit1003 to rsync dest hosts when using prod role
operations/puppet production +6 -0

Related Objects