2019-02-19

503's.. the cliffnotes version

So I spent some time this weekend to try and show where 5 12 hour days went in analyzing data. This first graph shows the number of successful mirror requests and breaks down the 503 error's per server.
All mirror requests since 2018-01
The two drops are where logs failed to be copied to the central system versus a problem with the mirrors. You can see that Monday through Friday, Fedora sees a lot of requests and then Saturday and Sunday we see a dip. You can also see that we have gotten an increase in usage since November. The tiny area at the bottom is the number of 503's.. which kind of makes it look unimportant. [Unless you are doing a lot of builds and keep running into it.]

Just the 503's sir.
The above is just a graph of the 503's broken down by each server which sees them. We see that in January, we see a large increase of 503's on 2 servers. The first one is proxy11 which is in Europe and the server may be underpowered for what we are needing it to do. The second was on proxy01 which a lot of sites have hard-coded. If the above graph was done by the minute, you would see it as many many tiny spikes at :02 -> :10 minutes after the hour and most of the day empty. 

The graphs go to 2019-02-15 and the last 4 days have shown a decrease in 503's but proxy01 and proxy11 are still having several thousand a day. I am still looking at other fixes we can do to make this a less painful experience for people when the top of the hour occurs.

2019-02-16

Fedora Infrastructure Detective Work: Mirrorlist 503's

A Mysterious Problem

Recently there has been a large increase in failed yum/dnf updates for users with consumer's getting 503 errors when trying to update their system. This has caused problems in both the Fedora COPR system and with normal users. Trying to find why this problem is occurring was some interesting detective work that took me most of February 8th to February 14th and is still ongoing.

A Scandal in Fedoria

A history of the Fedora Mirrormanager software

The Fedora Project Mirrorlist system has evolved multiple times in the last 10 years. Originally written by Matt Domsch it underwent an update and rewrite by Adrian Reber, et al a couple of years ago. For many years Fedora used a server layout where the front end web servers would proxy the data over VPN to dedicated mirrorlist servers. While this made sense when systems were a bit slower compared to VPN latency, it had become more troublesome over the last couple of years.

Simplification of the older mirrorlist design

Originally most of the Fedora proxy servers were donated systems from various ISP's which made them network fast but not always CPU fast. This meant the system was designed to make the proxies mostly serve static content and relay anything computational to servers with more cpu cycles. As systems improved it made more sense to move the mirrormanager software closer to the proxies, however it wasn't until recently that using Moby and then Podman containers could be put into the mix.

Simplified version of new mirrorlist design
If you will notice neither of the above images mentions a database. There is one but it is a different system which various mirror managers will insert their systems and it will regularly trawl them to make sure they are up to date. It then creates a python pkl with all the network data of the updated ones. This is then pushed to each proxy which will feed it into new pods which are cycled once an hour doing a complicated dance.

  1. New pods are created from base container+new pkl and config data
  2. At 15 minutes after the hour, the first pod is told to drain out its old users.
  3. When it is drained, it is cycled with the new data.
  4. Repeat with the second pod and complete the dance.

The Final 503

Last spring/summer, we started getting reports on various mailing lists about users getting 503's causing dnf to fail. At first blush the number of failures didn't look too large as less than 0.2% of all requests resulted in 503's [We serve on average 20,000,000 requests and at that time 45,000 were 503's which was similar to what we had at times with the old VPN infrastructure.] However on further looking at the logs, it was clear these 45,000 connections were happening at a specific time frame... 15 minutes after the hour.

Doing some more work, it looked like the timeouts we had for waiting for the pods to drain of active connections was not long enough. Adding longer timeouts before killing the container brought the number of 503's down dramatically to an average of  450 per 20 million requests (or down to 0.002%). This seemed to fix things until December.

The Adventure of the Empty 503

In July, all our proxy servers were running Fedora 27 and able to get security updates like all good systems. In late November, our proxy servers were still running Fedora 27 and no longer able to get security updates. Kevin Fenzi put in a lot of hours and got both the containers and the proxy servers redeployed with Fedora 29. This allowed for us to move to newer versions of podman and other tools.

All seemed well until late January when a string reports of 503's started coming up again. At the time we had a couple of proxies who had stuck pod's for various reasons and I figured the two were related. However, the reports still happened after those problems had been fixed, and looking at the logs it was clear that instead of a 'normal' of 6000 503's a day, we were seeing peaks of 400,000 503's a day.

In looking at the combined log data, the first thing that stood out was that the problems were not happening at 15 minutes after the hour. Instead they were mostly happening at the time frame of :00 -> :05 after the hour. They also had peaks of occurring at 00:00, 04:00, 08:00, 12:00, 16:00, and 20:00. These times make a sort of sense in that they are commonly chosen to run daily jobs and at the top of the hour there are usually 3-5x more requests than 10 minutes before.

I then looked to see if the problem happened with a particular client (dnf vs PackageKit vs yum) or versions of those clients but they all happened across the board. [The only issue is that yum seems to retry if it gets a 503 while dnf gives a hard stop. ] At this point, Michal Novotny asked if I was just looking at combined logs.. and I had an 'Aha!' moment. I was looking at the combined logs, and had no idea if this was on one server or not. After looking at the original logs it was clear that proxy01.fedoraproject.org was getting the vast majority of the problems (the other proxies would generate a ~2000/server while proxy01 would generate 50,000 during a day). This again makes sense as both the COPR build system and several other systems seem to hard-code this server because it is in the main Fedora co-location.

At this point I went to see what logs we had. This took some work because of how we had setup the system, but in the end this popped up.

[Wed Feb 13 21:02:36.774319 2019] [wsgi:error] [pid 26286:tid 140136494905088] (11)Resource temporarily unavailable: [client 10.88.0.1:58258] mod_wsgi (pid=26286): Unable to connect to WSGI daemon process 'mirrorlist' on '/run/httpd/wsgi.9.0.1.sock' after multiple attempts as listener backlog limit was exceeded or the socket does not exist.
[Wed Feb 13 21:02:36.774350 2019] [wsgi:error] [pid 26286:tid 140136520083200] (11)Resource temporarily unavailable: [client 10.88.0.1:58248] mod_wsgi (pid=26286): Unable to connect to WSGI daemon process 'mirrorlist' on '/run/httpd/wsgi.9.0.1.sock' after multiple attempts as listener backlog limit was exceeded or the socket does not exist.
[Wed Feb 13 21:02:36.774443 2019] [wsgi:error] [pid 26286:tid 140136125822720] (11)Resource temporarily unavailable: [client 10.88.0.1:58250] mod_wsgi (pid=26286): Unable to connect to WSGI daemon process 'mirrorlist' on '/run/httpd/wsgi.9.0.1.sock' after multiple attempts as listener backlog limit was exceeded or the socket does not exist.
[Wed Feb 13 21:02:36.774228 2019] [wsgi:error] [pid 26286:tid 140136058681088] (11)Resource temporarily unavailable: [client 10.88.0.1:58228] mod_wsgi (pid=26286): Unable to connect to WSGI daemon process 'mirrorlist' on '/run/httpd/wsgi.9.0.1.sock' after multiple attempts as listener backlog limit was exceeded or the socket does not exist

The socket definitely existed so I went to look at backlog limits. Reading through various bug reports and log pages, I figured some options to try: graceful-timeout=30 request-timeout=30 listen-backlog=1000 queue-timeout=30. Kevin added them and rebuilt the images and I rolled it out to the proxies. The amount of failures went down dramatically and I figured it was due to allowing a larger backlog. Michal pointed out that was impossible because the kernel has a backlog size of 128 and the wsgi will just default to that no matter how much larger I made it. Reading through the man pages again I realized I had some cargo-cult going on. Michal then pointed out the change needed and I rolled this out to proxy01 to see if it would help.

The Last 503?

Currently proxy01 is still seeing 503 failures but at a lower rate than before. Before the changes it was averaging 50,000 503's a day and since the change it looks to be at 8,000.  I need to do more research to see what options will help. The increasing of timeouts may have helped but it may only be masking the problem elsewhere. We will need to look at increasing the number of pods that are available though that will increase memory usage and may mean some proxies are not usable for mirrormanager anymore. We also need to find out if other wsgi, haproxy or podman options would help. 

In any case it has been a very interesting week in detective work. I hope we can make the usage of our mirrors more reliable, and this has been useful to read. [I also expect several updates and fixes to this article as time goes on.]

2019-02-15

Proposed Change To EPEL Policies: Release based package lifetimes

https://fedoraproject.org/wiki/EPEL/Changes/Release_based_package_lifetimes

Summary

The change moves expected package lifetime to that of a Red Hat Enterprise Linux minor release.

Owner

  • Name: Stephen Smoogen

Detailed Description

Extra Packages for Enterprise Linux is a sub-project of Fedora which recompiles various Fedora packages against various Red Hat Enterprise Linux releases. When EPEL was started, RHEL lifetimes were 5 years and it was thought that the repository could have similarly long lifetimes. Major rebasing of versions were not to be allowed, and fast moving software was frowned on.

Over time, the lifetime of a RHEL release grew, and the way RHEL releases rebased themselved overtime also changed. This has meant that packagers in EPEL were bound to support software longer than RHEL did and unable to rebase like RHEL could.

The current proposal is closely linked to release based composes but does not depend on it. The changes are as follows:

Packagers commit to maintaining software in EPEL for at least one RHEL minor release or 13 months, whichever is shorter. If a packager needs to stop maintaining the software, they should do the following:
  • announce on epel-devel list that they are no longer able to maintain the package and give the reasons. If someone can take it over they can do so here.
  • release one final version with an additional file named README-RETIRED.txt' in the %docs section which says this software is retired.
  • wait until that package arrives in updates.
  • let the EPEL release manager know that the package needs to be retired for the next release.

Otherwise the latest update of the package will be retagged for the next compose of EPEL and will appear without problems.

If the situation requires a major ABI/API change (security, a new minor compose occurs, a new LTS package is released) a packager should announce to epel-devel and put in a ticket in the epel pagure instance to track. Updates can then be made and will show up in either /updates/ or the next major.minor compose.

Benefit to Fedora

  • Packagers will feel more likely to make their packages available in EPEL.
  • Users of EPEL will have more software and know it is being kept up.

Proposed Change to EPEL Policies: Minor Release Based Composes

I proposed this on the EPEL list earlier this week, but realized it needed more views so am putting a version on the blog. The canonical version is at https://fedoraproject.org/wiki/EPEL/Changes/Minor_release_based_composes

Summary

The change moves EPEL composes to biannual based composes and adds an updates tree for consumers. Package trees will have a naming structure similar to Fedora release names, and will be regularly archived off to /pub/archives after the next minor release.

Package lifetimes will be similarly affected with the expected minimum 'support' lifetime of any package to be that of a minor release.

Owner

  • Name: Stephen Smoogen
  • Email: smooge@fedoraproject.org

Detailed Description


Extra Packages for Enterprise Linux is a sub-project of Fedora which
recompiles various Fedora packages against various Red Hat Enterprise Linux releases. Currently these packages are composed by the Fedora release engineering tools similar to Fedora Rawhide where only the latest packages tagged for a particular EPEL release (example epel-7) appear. When a package is updated or retired, the older version falls out of the EPEL repositories and users can not downgrade to older versions.

The tree structure would look like the following:

/pub/epel/releases/Major.Minor.YYYYMM/{Stuff,Modular}/{x86_64,ppc64le,aarch64,s390x,source}/{Packages,repodata}
/pub/epel/updates/Major.Minor.YYYYMM/{Stuff,Modular}/{x86_64,ppc64le,aarch64,s390x,source}/{Packages,repodata}

with 2 sets of symlinks pointing to the latest supported tree:

/pub/epel/releases/7.06.201903/Stuff/x86_64/Packages/
/pub/epel/releases/8.00.2019MM/Modular/source/Packages/

/pub/epel/updates/7.06.201903/Stuff/aarch64/Packages/
/pub/epel/updates/8.00.2019MM/Modular/s390x/Packages/

/pub/epel/releases/7 -> /pub/epel/updates/7.06.201903
/pub/epel/updates/7  -> /pub/epel/updates/7.06.201903

The proposed change will move EPEL minor releases to match with Red Hat minor releases.  For example, on 2019-02-10, Red Hat has Red Hat Enterprise Linux 7.6 in Full Support/Maintenance mode and has released a beta for RHEL-8. With the change, EPEL would compose a special set of trees for March 2019 and then all updates for those packages would go into the appropriate /updates/ sub-tree.

When the next minor release is released from Red Hat, a new compose tag will be made in koji. Packages which are not retired will then be pulled into the tag, and proven packagers can stage any mass rebuilds needed for rebases. When testing shows that this tree is ready, the symlinks will then be moved to the new tree, and the old tree will be prepared to move over to /pub/archives
Packages which are added to EPEL after a major.minor compose will only show up in the /updates/ tree until the next major.minor compose. This is the same as packages in Fedora.

Due to the fact that RHEL-6 has only a year and a half of Maintenance mode, we are not looking at making changes unless it turns out that we have to.

Benefit to Fedora

  • Users are able to downgrade to an earlier version if an update has issues for them
  • Packages will less likely break for Scientific Linux and CentOS users during minor updates.
  • Packages will be regularly archived off to /pub/archives/epel/ and consumers will less likely pull large number of old packages out of koji looking for older software.

Scope

  • Proposal Owners:
    • Add koji tags for date releases
    • Write scripts for releng to make new releases
  • Other developers:
  • Policies and guidelines:
  • Trademark approval: N/A

Known Change Impacts


Users who have hard-coded repositories or mirror specific directory changes may have a 'broken' experience due to symlink changes.
Archives will grow every 6 months as packages are put there regularly.

How To Test


We will stage this set of changes in /pub/alt/epel before rolling out to the main EPEL. Users wishing to test this can set up systems that will point to this tree using a temporary .repo file to be provided for testing.

User Experience


User's should be able to do the following:
  • yum install package-name
  • yum upgrade package-name
  • yum downgrade package-name

Contingency Plan


  • Contingency mechanism: (What to do?  Who will do it?) If testing shows this proposal to fail we will not release.
  • Contingency deadline: 2019-04-01
  • Blocks release? No (we don't do releases yet).
  • Blocks product? Yes


Release Notes


Discussed on epel-devel@lists.fedoraproject.org list with updates taken into concern.