2019-10-07

Happy Halloween (Packages Not In EPEL-8 yet)

It is October, and in the US it means that all the decorations for Halloween are going up. This is a time of year I love because you get to dress up in a costume and give gifts to people. In the spirit of Halloween, I am going to make various packages available in a COPR to add onto the EPEL-8 repositories.

There are a lot of packages which are in EPEL-6 or EPEL-7 but are not in EPEL-8 yet. Some of these may not be possible to build due to missing -devel packages; others may just need someone interested in maintaining a branch for EPEL-8. To try and get a push on this, I wanted to see which packages could be built and made ready at some point. I also wanted to make it possible that, if you really need one of these packages, it could be available.

Important notes:

  1. These packages will not be getting updates.
  2. These packages will not be something you can file a bugzilla on if they don't work.
  3. If they turn out to be filled with goblins, you have been warned.
  4. If your system starts glowing green and moaning about Elder Gods.. I take no responsibility.

That said, this is how I am building these packages in case you want to do this also:


# Look up package src git repository
$ kinit
$ fedpkg clone {src name}
$ cd {src name}
$ fedpkg srpm
$ mock --chain -r epel-8-x86_64 --localrepo=/home/smooge/not-yet-in-epel8/ {src.rpm}
# See what packages were needed to build, or fix any spec file changes
# If other packages are needed, start a chain of packages and fixes
$ copr-cli build not-yet-in-epel8 {src.rpm 1} {src.rpm 2}
# See what failures occurred.. see if they are fixable.

Currently the packages which are not fixable are anything using perl-generators. There are two perl-generators packages in RHEL/CentOS-8, one a fully built module and one a pseudo-module. The mock configs use best=1, which causes the fully built module perl-generators to be pulled in.. which is the wrong package, as it is built against perl-5.24 while the main perl is perl-5.26. Other packages which will not build are ones which need items that RHEL/CentOS-8 did not ship at all (various -devel packages and such).
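
If you run into the best=1 problem locally, one possible workaround (an untested sketch; the file name is whatever you pick) is to copy the shipped mock config and flip best=1 to best=0 so dnf is allowed to fall back to the compatible package:

$ cp /etc/mock/epel-8-x86_64.cfg ./epel-8-x86_64-nobest.cfg
$ sed -i 's/best=1/best=0/' ./epel-8-x86_64-nobest.cfg
$ mock -r ./epel-8-x86_64-nobest.cfg --rebuild {src.rpm}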

In any case, the copr is at https://copr.fedorainfracloud.org/coprs/smooge/not-yet-in-epel8/ . I will maintain this repository until next March, when, like all Halloween themed candy, it is not even sold at the dollar store anymore.

2019-10-01

Attention: Removal of python36 from EPEL-7 on 2019-10-03

With the release of RHEL-7.7, many of the python36 packages in EPEL were replicated in the release as python3-3.6 packages. The normal pattern when this happens is to remove the packages from EPEL so that they do not cause problems. However, doing so immediately would have caused problems for users of CentOS-7, who did not yet have access to the newer packages. Two weeks ago, CentOS-7.7.1908 was released and should have flowed out to users as needed.

So it is time to remove the following src.rpm packages from EPEL:

python36-3.6.8-1.el7.src.rpm
python3-setuptools-39.2.0-3.el7.src.rpm


As they are duplicated by:
python3-3.6.8-10.el7.src.rpm
python3-setuptools-39.2.0-10.el7.src.rpm

We will be removing the python packages on 2019-10-03, so they should disappear during the repository compose on 2019-10-04. EPEL is a rolling release locked against the latest state of the Red Hat Enterprise Linux repositories. If you are using an older snapshot of RHEL or CentOS, you should sync down a copy of the repository and lock particular versions for your use, as sketched below.
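
For example, on an EL7 host with yum-utils installed, a snapshot could be made along these lines (paths are illustrative):

$ reposync --repoid=epel --newest-only --download_path=/srv/repos/epel-7-20191001
$ createrepo /srv/repos/epel-7-20191001/epel

Point your epel.repo baseurl at the snapshot directory and your hosts will keep the python36 packages until you choose to move.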

2019-09-19

Attention: Fedora Yahoo Email Users

In a blast from the past, we are currently going through another period where Yahoo is not accepting many emails from fedoraproject.org addresses or from our mail routers. It would seem that the way to get Yahoo to blacklist a domain is to get subscribed to mailing lists and then report the lists as SPAM. With enough accounts (or maybe if one person does it enough times).. Yahoo will helpfully blacklist the domain completely. [It is then usually a multi-month process of people explaining that no, Fedora is not a spam site, has not been taken over by a spam site, or any of a bunch of other things which do happen and which any mail admin is going to be wary of.]

The funny thing is that their blockage doesn't work 100%, so some people seem to still get emails delivered even while most of our logs show Yahoo telling our servers various SMTP errors of GO AWAY.

At this point, if you are a packager with a yahoo.com email address, you probably have not gotten email from any lists, or possibly bugzilla, for a bit. Trying to email you directly from our site to tell you this isn't going to work.. so we are going back to blogs in the hope that someone still reads them.


2019-09-16

EPEL Bug: Bash errors on recent EL-8 systems.

Last week, I got asked about a problem with using EPEL-8 on Oracle Enterprise Linux 8 where trying to install packages failed due to a bad GPG key path. I duplicated the problem on RHEL-8, where it had not happened before some recent updates.

[smooge@localhost ~]$ repoquery
bash: repoquery: command not found...
Failed to search for file: Failed to download gpg key for repo 'epel': Curl error (37): Couldn't read a file:// file for file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-8.0 [Couldn't open file /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-8.0]

The problem seems to be that the EPEL release package uses the variable $releasever in various shortcut strings, including the gpgkey path. Take for example:

[epel-playground]
name=Extra Packages for Enterprise Linux $releasever - Playground - $basearch
#baseurl=https://download.fedoraproject.org/pub/epel/playground/$releasever/Everything/$basearch/os
metalink=https://mirrors.fedoraproject.org/metalink?repo=playground-epel$releasever&arch=$basearch&infra=$infra&content=$contentdir
failovermethod=priority
enabled=0
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-$releasever

The problem is that when I wrote new versions of the EPEL-8 repo file, I replaced the old key phrase gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7 with gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-$releasever. When I tested things with the dnf command it worked fine, but I didn't check where things like bash completion would show up. As the error above shows, some tools expand $releasever to the full 8.0, and no /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-8.0 file exists.

Moving back to the format that EPEL-6 and EPEL-7 used fixes the problem, so I will be pushing an updated release file out this week.  My apologies for people seeing the errors.
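
For reference, the change being reverted amounts to this one line in each repo stanza:

# broken on tools which expand $releasever to the full 8.0:
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-$releasever
# fixed, back to the EPEL-6/EPEL-7 convention:
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-8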

2019-08-14

Announcing EPEL-8.0 Official Release

EPEL-8.0 released
The EPEL Steering Committee is pleased to announce that the initial EPEL-8 is ready for release. We would like to thank everyone in the community for helping us get the initial set of builds out to mirrors and to consumers worldwide. Special thanks go to Patrick Uiterwijk, Jeroen van Meeuwen, Robert Scheck, and many others in the community who helped in the last 6 months to get this release done. 

EPEL-8.0 has packages for the x86_64, ppc64le, aarch64, and now the s390x platforms.

https://download.fedoraproject.org/pub/epel/8/Everything/ is the link for seeing what packages are available. [Edited 2019-08-14 19:59 UTC add in link for people to follow]

What is EPEL?


EPEL stands for Extra Packages for Enterprise Linux, and it is a subcommunity of the Fedora and CentOS projects aimed at bringing a subset of packages out of Fedora releases, ready to be used and installed on the various Red Hat Enterprise Linux (RHEL) releases. It is not a complete rebuild of Fedora or even of previous EPEL releases. EPEL is also a community and not a product. As such, we need community members to help get packages into the repository even more than is done in Fedora.

If you are interested in getting a package into EPEL, contact the package maintainer through bugzilla. This way the request can be tracked, and if the primary maintainer is not interested in branching to EPEL, others can step in and do so. Optionally you can send a request to the epel-devel@lists.fedoraproject.org mailing list. If you do so, please include why the package is needed, to help other volunteers decide whether they can support it.

What is new?

Playground for Rawhide-like things

We have added an additional set of channels for EPEL-8 called playground. It is similar to Fedora Rawhide so packagers can work on versions of software that are too fast moving or will have large API changes compared to versions in the regular channel.

To make this purpose transparent, when a package is built for epel8, it will normally also be built for epel8-playground. This is done via a packages.cfg file which lists the targets for fedpkg to build against (a sketch of the file is below). A successful package build will then go through two different paths:
  • The epel8 package will go into bodhi to be put into epel8-testing.
  • The epel8-playground package will bypass bodhi and go directly into epel8-playground with the next compose.
If a packager needs to focus only on epel8 or epel8-playground, they can edit packages.cfg to change target=epel8 epel8-playground to target=epel8.
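
A sketch of what that packages.cfg looks like in dist-git (the exact syntax here is from memory, so treat it as an approximation):

[koji]
targets = epel8 epel8-playground

Removing epel8-playground from the targets line stops the automatic playground builds for that package.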

Packages in epel8-playground are intended to be used in the following manner:
  • To test out a new version of the package that might not be stable yet.
  • To test out new packaging of the package.
  • To test a major version change of the package intended for the next EPEL-8 minor release.
  • To build a package that will never be stable enough for EPEL-8, but still could be useful to some.
  • At minor RHEL releases (i.e., 8.1, 8.2), people can pull in big changes from playground to the main EPEL-8 packages. Since people will be upgrading and paying more attention than usual anyhow at those points, it’s a great chance to make that change, and you can test beforehand in the playground to make sure these changes work.
Consumers should be aware that packages in epel8-playground come without any Service Level Expectations. You may want to only cherry-pick packages from the playground as needed.
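
For example, since the playground repository ships disabled (enabled=0 in the repo file), a one-off install can pull a single package without turning the repository on permanently:

$ dnf --enablerepo=epel-playground install {package}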

New Architecture: s390x

We have added the s390x platform to builds. Some consumers have wanted this platform for many years but we did not have the time to integrate necessary changes. We have done this with EPEL-8, and hope to be able to do so for EPEL-7 if there are continued requests for it. 

What is next?

The goal for EPEL-8.1 will be implementing modules into the repository, which allows builds for packages that depend on non-shipped devel packages. It also allows maintainers to supplement and replace other packages they could not under standard EPEL rules. 

Known Issues:

  1. EPEL-8.0 does not come with modules. Packages built for perl, python and other modules are only built against “default” modules. For example installing a perl library from EPEL will work with the perl-5.26 but not with the perl-5.24 module.
  2. RHEL-8.0 and RHEL-8.1 beta do not come with the same packages in all architectures. There are 720 ‘desktop’ packages which were only shipped for x86_64 and ppc64le. Packagers looking to deliver GNOME, KDE, or other desktop platforms will need to exclude s390x and aarch64 at this time (see the spec file sketch after this list).
  3. The dnf in RHEL-8.1 beta does not work with the EPEL repository due to zchunk code. This has been opened as an upstream bug as https://bugzilla.redhat.com/show_bug.cgi?id=1719830 
  4. Until modularity and module builds are implemented in EPEL, there will be many packages which can not be built for EPEL. This is mainly due to RHEL-8 not shipping many -devel packages and the need for us to rebuild those packages in a module to make those -devel packages available to build against. When running into this, please open a ticket at https://pagure.io/epel/new_issue for us to put in a request for it to be added to Red Hat’s Code Ready Builder. Please list the package(s) which are blocked from being built because of the absence. We will collate these items into bugzilla tickets which will be reviewed by the Red Hat product groups to see if they will be added in future Code Ready Builder releases. Doing this will ensure that we do not have 70 requests for foo-devel but can have one with all the packages needing it.
  5. /usr/bin/python does not exist. Developers should aim towards /usr/bin/python3 or /usr/bin/python2 and patch appropriately. Python2 packages are discouraged. RHEL-8 will contain python2.7 until probably the end of life of RHEL-7; however, upstream support will only be minimal. When modularity arrives, we suggest that you make whatever python2 packages you have into modules which can be pulled out when RHEL-8.N no longer has python2.
  6. python2-sphinx is not shipped. Most packages should work with python3-sphinx, and if yours doesn’t, please open a bug. The python team has been good about making fixes for this.
  7. When branching python packages, be aware that python in EL-8 is python36 and not the version currently in rawhide. This has come up with a couple of test packages where they assumed python37 or later.
  8. While EL-8 comes with platform-python, it should NOT be used in Requires: unless absolutely necessary. python3 should be used instead. (Exceptions can be made but will be rare and need justification.)  [Accepted exception: Use python3.6dist(coverage) instead of python3-coverage. This package is not shipped but is needed in %check code.]
  9. Sometimes RHEL8 only has a python3 package for a dependency you need for your build. (Example: python-bleach requires python2-html5lib, but RHEL8 provides only python3-html5lib). For EPEL-8.0 we strongly recommend focusing only on python3 subpackages.
  10. RHEL-8 was built with packages which were not shipped. In general it is OK to branch these packages and build them in EPEL.
  11. systemd-rpm-macros is not a separate package. If needed, use BuildRequires: systemd instead.
  12. You will need to make sure you have a version of fedpkg greater than fedpkg-1.37-4 to work with both `epel8` and `epel8-playground`. Versions before that should work with just `epel8`.
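
For known issue 2, the exclusion is the standard RPM spec file tag; a minimal sketch for a package needing the desktop stack:

# GNOME/KDE dependencies are not shipped on these architectures in RHEL-8.0
ExcludeArch: s390x aarch64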

Developer requests for multiple branches

Branching is handled the same way as for other releases: by requesting a branch with fedpkg request-branch. A maintainer can request an epel8 branch using fedpkg request-branch epel8, which will create a ticket in https://pagure.io/releng/fedora-scm-requests/issues that Release Engineering will process.
To branch multiple packages, please use this or a variant of this script:
#!/usr/bin/bash
# Reminder to get an updated pagure token for releng tickets
# Usage: epel-8.sh package1 package2 package3 package4
if [ $# -lt 1 ]
then
    echo "At least one package name should be provided"
else
    TMPDIR=`mktemp -d /tmp/epel8.XXXXXX`
    pushd "$TMPDIR"
    for pkg in "$@"
    do
        fedpkg clone "$pkg"
        pushd "$pkg"
        fedpkg request-branch epel8
        fedpkg request-branch epel8-playground
        popd
    done
    # Leave the temporary directory before deleting it
    popd
    rm -rfv "$TMPDIR"
fi

Releng will then work through the tickets in the system, adding branches to the PDC and src.fedoraproject.org.

Known RHEL-8 packages missing -devel

  1. libbluray-devel
  2. liba52-devel
  3. libXvMC-devel
  4. libdvdnav-devel
  5. gfbgraph-devel
  6. libuv-devel
  7. rest-devel
  8. qgpgme-devel

Definitions

  1. Package maintainer. Person who has accepted responsibility to package and maintain software in the Fedora Project ecosystem. The main packager is usually someone focused on Fedora Linux, and secondary packagers may be focused on particular use cases like EPEL.
  2. Consumer. A person who has subscribed to EPEL for packages but is not a maintainer.
  3. PDC. Product Definition Center. A tool to help list the lifetime and permissions that a product has so that branching and updates can be better managed.

2019-07-09

EPEL-8 Production Layout

TL; DR:

  1. EPEL-8 will have a multi-phase roll-out into production.
  2. EPEL-8.0 will build using existing grobisplitter in order to use a ‘flattened’ build system without modules.
  3. EPEL-8.1 will start in staging without grobisplitter and using default modules via mock.
  4. The staging work will allow for continual development changes in koji, ‘ursa-prime’, and MBS functionality to work without breaking Fedora 31 or initial EPEL-8.0 builds.
  5. EPEL-8.1 looks to be ready by November 2019, after Fedora 31, around the time that RHEL-8.1 may release (if it uses a 6 month cadence).

Multi-phase roll-out

As documented elsewhere, EPEL-8 has been slowly rolling out due to the many changes in RHEL and in the Fedora build system since EPEL-7 was initiated in 2014. Trying to roll out an EPEL-8 which was ‘final’, and thus the way it always would be, was too prone to failure, as we found we had to constantly change plans to match reality.
We will be rolling out EPEL-8 in a multi-phase release cycle. Each cycle will hopefully allow greater functionality for developers and consumers. On the flip side, we will find that we have to change expectations of what can and can not be delivered inside of EPEL over that time.
Phases:
  1. 8.0 will be a ‘minimal viability’. Due to un-shipped development libraries and the lack of building replacement modules, not all packages will be able to build. Instead only non-modular RPMs which can rely on only ‘default’ modules will work. Packages must also only rely on what is shipped in RHEL-8.0 BaseOS/AppStream/CodeReadyBuilder channels versus any ‘unshipped -devel’ packages.
  2. 8.1 will add on ‘minimal modularity’. Instead of using a flattened build system, we will look at updating koji to have basic knowledge of modularity, use a tool to tag in packages from modules as needed, and possibly add in the Module Build System (MBS) in order to ship modules.
  3. 8.2 will finish adding in the Module Build System and will enable gating and CI in the workflow so that packages can be tested faster.
Because the phases will change how EPEL is produced, there may need to be mass rebuilds between each one. There will also be changes in policies about what packages are allowed to be in EPEL and how they would be allowed.

Problems with koji, modules and mock

If you want to build packages in mock, you can set up a lot of controls in /etc/mock/foo.cfg which will turn modules on and off as needed, so that you can enable the javapackages-tools or virt-devel module and packages like libssh2-devel or javapackages-local become available (a sketch of such a config follows this paragraph). However, koji does not allow this control per channel, because it is meant to completely control what packages are brought into a buildroot. Every build records what packages were used to build an artifact, and koji will create a special mock config file to pull in those items. This allows for a high level of auditability and confirmation that the package stored is the package built, and that what was built used certain things.
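
For example, a local mock config enabling extra module streams might look like this (a sketch assuming a mock release with the module_enable option; the chroot name is made up):

include('/etc/mock/epel-8-x86_64.cfg')
config_opts['root'] = 'epel-8-x86_64-modules'
# enable non-default module streams so their packages land in the buildroot
config_opts['module_enable'] = ['javapackages-tools:201801', 'virt-devel:rhel']
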
For building an operating system like Fedora or Red Hat Enterprise Linux (RHEL), this works great because you can show how things were done 2-3 years later when trying to debug something else. However when koji does not ‘own’ the life-cycle of packages this becomes problematic. In building EPEL, the RHEL packages are given to the buildroot via external repositories. This means that koji does not fully know the life-cycle of the packages it ‘pulls’ in to the buildroot. In a basic mode it will choose packages it has built/knows about first, then packages from the buildroot, and if there is a conflict from external packages will try to choose the one with the highest epoch-version-release-timestamp so that only the newest version is in. (If the timestamp is the same, it tends to refuse to use both packages).
An improvement to this was adding code to mergerepo which allows dnf to make the choice of which packages to use between repositories. This allows mock’s dnf to pull in modules without the repositories having been mangled or ‘flattened’ as with grobisplitter. However, it is not a complete story. For dnf to know which modules to pull in, it needs a platform identifier (for Fedora releases it is something like f30 and for RHEL it is el8). Koji doesn’t know how to do this, so the solution would be to set it in the build system's /etc/mock/site-defaults.cfg, but that would affect all builds and would cause problems for building Fedora on the same build system.
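
The setting in question is dnf's module_platform_id option (the value can also come from PLATFORM_ID in /etc/os-release); in a dnf.conf it looks like:

[main]
module_platform_id=platform:el8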

Grobisplitter

A second initiative to deal with building with modules was to try and take modules out of the equation completely. Since a module is a virtual repository embedded in a real one, you should be able to pull them apart and make new ones. Grobisplitter was designed to do this to help get CentOS-8 ready and also to allow EPEL to bootstrap using a minimal buildset. While working on this, we found that we also needed parts of the ‘--bare’ koji work, because certain python packages have the same src.rpm name-version but different releases, which koji would kick out.
Currently grobisplitter does not put in any information about the module it ‘spat’ out. This will affect building when dnf starts seeing metadata in individual rpms which says ‘this is part of a module and needs to be installed as such’.

Production plans

We are trying to determine which tool will work better long term in order to make EPEL-8.0 and EPEL-8.1 work.

EPEL-8.0

Start Date | End Date   | Work Planned            | Party Involved
2019-07-01 | 2019-07-05 | Lessons Learned         | Smoogen, Mohan
2019-07-01 | 2019-07-05 | Documentation           | Smoogen
2019-07-08 | 2019-07-12 | Release Build work      | Mohan, Fenzi
2019-07-08 | 2019-07-12 | Call for packages       | Smoogen
2019-07-15 | 2019-07-19 | Initial branching       | Mohan, Dawson
2019-07-22 | 2019-07-31 | First branch/test       | Dawson, et al
2019-08-01 | 2019-08-01 | EPEL-8.0 GA             | EPEL Steering Committee
2019-08-01 | 2019-08-08 | Lessons Learned         | Smoogen, et al
2019-08-01 | 2019-08-08 | Revise documentation    | Smoogen, et al
2019-09-01 | 2019-09-01 | Bodhi gating turned on  | Mohan

EPEL-8.0 Production Breakout

  1. Lessons Learned. Document the steps and lessons learned from the previous time frame. Because previous EPEL spin-ups have been done multiple years apart, what was known is forgotten and has to be relearned. By capturing it, we hope that EPEL-9 does not take as long.
  2. Documentation. Write documents on what was done to set up the environment and what is expected in the next section (how to branch to EPEL-8, how to build with EPEL-8, dealing with unshipped packages, updated FAQ).
  3. Call for Packages. This will be going over the steps that packagers need to follow to get packages branched to EPEL-8.
  4. Release Build Work. This is setting up the builders and environment in production. Most of the steps should be repeats of what was done in staging, with additional work done in bodhi to have signing and composes work.
  5. Initial Branching. This is where the first set of packages needed for EPEL-8 are branched and built: epel-release, epel-rpm-macros, fedpkg-minimal, fedpkg (and all the things needed for it).
  6. First Branch. Going over the various tickets for EPEL-8 packages, a reasonable sample will be branched. Work will be done with the packagers on problems they find. This will continue as needed.
  7. EPEL-8.0 GA. Branching can follow normal processes from here.
  8. Lessons Learned. Go over problems and feed them into other groups' backlogs.
  9. Documentation. Update previous documents and add any that were found to be needed.

EPEL-8.1

Start Date | End Date   | Work Planned                          | Party Involved
2019-07-01 | 2019-07-05 | Lessons Learned                       | Fenzi, Contyk, et al
2019-07    | ???        | Groom Koji changes needed             | ???
2019-07    | ???        | Write/Test Koji changes needed        | ???
2019-07    | ???        | Non-modular RPM in staging            | ???
2019-07    | ???        | MBS in staging                        | ???
2019-08?   | ???        | Implement Koji changes?               | ???
2019-08?   | ???        | Implement bodhi compose in staging?   | ???
2019-09?   | ???        | Close off 8.1 beta                    | ???
2019-09?   | ???        | Lessons learned                       | ???
2019-09?   | ???        | Begin changes in prod?                | ???
2019-10?   | ???        | Open module builds in EPEL            | ???
2019-11?   | ???        | EPEL-8.1 GA                           | EPEL Steering Committee
2019-11?   | ???        | Lessons Learned                       | ???
2019-11?   | ???        | Revise documentation                  | ???

EPEL-8.1 Production Breakout

This follows the staging and production of 8.0, with additional work to make modules usable in builds. Most of these dates and layers need to be filled out in future meetings. The main work will be adding in a program code-named ‘Ursa-Prime’ to help build non-modular rpms using modules as dependencies. This will allow grobisplitter to be replaced with a program that has long term maintenance.

2019-06-28

Update on EPEL-8 Status

Where is EPEL-8? (tl;dr:)

  1. Getting koji to work smoothly with modules has been hard. A multi-level fix had to be worked out to get it working in staging.
  • Needed a way to split out default modules to deal with koji merge options. Grobisplitter was written to do this.
  • Koji needed further patching to deal with src.rpms with the same NVR but different targets (some python2 and python3 packages come from the same src.rpm but were built at different times).
  • DNF reposync from RHEL-7 would delete the wrong files if you tried --newest (fixed).
  • DNF does not know how to reposync modules if it is not the local arch.
  2. Code Ready Builder is not always in sync with packages in the main trees. If you need a -devel and it isn’t in CRB, then you have to wait until it is there to build something.
  3. As a couple of fixes landed in mergerepo and koji, we are re-evaluating how we do builds in the next stage of building.

Introduction

In May of 2019, Red Hat released their 8.0 release of Red Hat Enterprise Linux (RHEL). Usually, the Extra Packages for Enterprise Linux (EPEL) group would have a beta available at that time or sooner. With RHEL-8, it has taken a lot longer to get things rolling.

Repository Changes

EPEL packages are built inside the Fedora Project’s build infrastructure. This is done by downloading the packages from Red Hat’s public Content Delivery Network (CDN), and then having the Fedora artifact build system (koji) use the release as an external build channel. Koji looks at packages in a different way than other build commands like ‘mock’ do. Where mock is meant to just build packages, koji is designed around auditing the entire lifecycle of a package. In other words, if you want to know how a package in Fedora 12 was built and how all its children interacted over time in the buildroots… you can do that with enough work and the koji databases. With mock you have a couple of log files which tell you what was pulled into a buildroot, but finding how those packages were built would require you to find their log files, and so on. A developer can also download those packages and look at them to see what was in them and how they were built.

The strength of koji is that you have a credible chain of builds to know where things came from. However, this doesn’t work too well for building EPEL packages, where koji doesn’t know where the RHEL kernel came from. Koji uses mergerepo to look at the external packages provided, determine the src.rpm they would come from, and determine the latest version it would use from each. From this it creates a ‘buildroot’ which it will use to build packages. This has worked pretty well for building packages against RHEL-5, 6, and 7. The major downside has been when someone built a package with the same src.rpm name, which koji then treats as the master no matter if a newer version shows up in RHEL.

This all changed with modularity. Koji really only has a rudimentary idea of rpms and repositories… it has zero idea about modules, and the rules it has used to determine what an external package is are thrown out with modules.
  1. Packages with different names may come from the same src.rpm. In RHEL-8 many python27 and python36 packages have the same parent src.rpm but were built at different times. Koji’s standard repo comparison mode will choose one or the other.
  2. Packages may have the same name-version-release but were built in different module streams (say perl-5.26 and perl-5.24). Koji would then choose a package depending on whichever had the largest src.rpm, which meant it could try to build a buildroot with perl-5.24 perl modules but perl-5.26 as the master perl.
If a developer uses mock to build a package with default repositories, mock calls dnf, which knows about modules and does the right thing. In the case where you want it to do the ‘wrong’ thing, you can also override mock to do that. With koji, further tools are needed to make this work. If you are building a new module, then the Module Build System (MBS) sits on top of koji and tells koji what to do. It will look at the module yaml file and turn various modules on and off so that it can build in what is needed. To build non-modular packages, other fixes are needed. One of these is called Ursa-Major, a set of scripts to pull in needed data from a third database and pull things in as needed. However, this was not adopted in Fedora for general use, so the EPEL group looked for something different.

The temporary solution written by Patrick Uiterwijk is called grobisplitter (https://github.com/puiterwijk/grobisplitter) which relies on the fact that modules are virtual repositories embedded in a master repository. Grobisplitter takes this fact, and uses it to break out ‘real’ repositories for each module. So the RHEL-8 repository will look like:

ant:1.10:820181213135032:5ea3b708:x86_64
container-tools:rhel8:8000020190416221845:2ffa3d27:x86_64
container-tools:rhel8:820190211172150:20125149:x86_64
freeradius:3.0:8000020190425181943:75ec4169:x86_64
freeradius:3.0:820190131191847:fbe42456:x86_64
gimp:2.8:820181213135540:77fc8825:x86_64
go-toolset:rhel8:820190208025401:b754926a:x86_64
httpd:2.4:8000020190405071959:55190bc5:x86_64
httpd:2.4:820190206142837:9edba152:x86_64
idm:client:820190227213458:49cc9d1b:x86_64
inkscape:0.92.3:820181213140018:77fc8825:x86_64
javapackages-runtime:201801:820181213140046:302ab70f:x86_64
javapackages-tools:201801:820181217165704:dca7b4a4:x86_64
llvm-toolset:rhel8:820190207221833:9edba152:x86_64
mailman:2.1:820181213140247:77fc8825:x86_64
mariadb:10.3:820190206164045:9edba152:x86_64
mariadb:10.3:820190314153642:9edba152:x86_64
maven:3.5:820181213140354:5ea3b708:x86_64
mercurial:4.8:820190108205035:77fc8825:x86_64
merged_repo
mysql:8.0:820190104140943:9edba152:x86_64
nginx:1.14:820181214004940:9edba152:x86_64
nodejs:10:820190108092226:9edba152:x86_64
non_modular
perl-App-cpanminus:1.7044:820181214184336:e5ce1481:x86_64
perl-DBD-MySQL:4.046:820181214121012:6bc6cad6:x86_64
perl-DBD-Pg:3.7:820181214121102:6fcea174:x86_64
perl-DBD-SQLite:1.58:820181214121133:6bc6cad6:x86_64
perl-DBI:1.641:820190116185335:fbe42456:x86_64
perl-FCGI:0.78:820181214153815:fbe42456:x86_64
perl-YAML:1.24:820181214175558:8652dbeb:x86_64
perl:5.26:820181219174508:9edba152:x86_64
php:7.2:820181215112050:76554e01:x86_64
postgresql:10:820190104140132:9edba152:x86_64
python27:2.7:8000020190410132513:c0efe978:x86_64
python27:2.7:820190212161047:43711c95:x86_64
python36:3.6:8000020190410133122:593c47b3:x86_64
python36:3.6:820190123171828:17efdbc7:x86_64
redis:5:820181217094919:9edba152:x86_64
rhn-tools:1.0:8000020190425124933:6ec19280:x86_64
rhn-tools:1.0:820190321094720:e122ddfa:x86_64
ruby:2.5:820190111110530:9edba152:x86_64
rust-toolset:rhel8:820181214214108:b09eea91:x86_64
satellite-5-client:1.0:820190204085912:9edba152:x86_64
scala:2.10:820181213143541:2b79a98f:x86_64
squid:4:820181213143653:9edba152:x86_64
subversion:1.10:820181215112250:a51370e3:x86_64
swig:3.0:820181213143944:9edba152:x86_64
varnish:6:820181213144015:9edba152:x86_64
virt-devel:rhel:820190226174025:9edba152:x86_64
virt:rhel:8000020190510171727:55190bc5:x86_64
virt:rhel:8000020190516125745:55190bc5:x86_64
virt:rhel:820190226174025:9edba152:x86_64

In the above, each of those names is a module name, and grobisplitter puts the appropriate files in each sub-repository. The problem with this version is that we end up with multiple repositories, some of them being ‘non-default’ modules. Building against a non-default module causes problems for someone trying to install the resulting package: it would pull in packages from a different module than the one wanted. Changes to grobisplitter were made at https://github.com/smooge/grobisplitter to allow only default modules to be published.

From this we were able to start deploying a devolved tree in the Fedora staging koji (https://koji.stg.fedoraproject.org/). The first set of fixes needed was to make it so koji could work with multiple artifacts coming from the same src.rpm. Instead of using the standard mode for resolving differences, we import the RHEL-8 repositories in a bare mode, which is supposed to use external repository data to determine what should be pulled in. However, we found that koji still gets confused if multiple versions of a package are in the repo data. Say your repository contains both glibc-*-2.1-2 and glibc-*-2.2-2. Koji would pull in glibc-devel-2.1-2 and try to match it against glibc-2.2-2. This of course caused builds to fail.

At first the fix looked to be having the reposync from the CDN pull only the latest data. However we ran into problems with either the RHEL-7 or RHEL-8 reposync deleting data we wanted to keep depending on the options used. Part of this was due to module data and part of it was due to some bugs in dnf’s reposync with other architectures. At this point, it looked like one of two things needed to be done.

One, grobisplitter needs to learn about package order and pull just the latest package into a non-modular repo. Two, another layer of indirection is needed: after we split out all the repositories, we use reposync again to pull from the grobisplit repositories, in this case with -n so we only have the latest packages. The second option seemed easier to pursue, as most of the grobisplitter toolkit should become irrelevant when the next generation of Ursa-Major comes out.

Code Ready Problems

We ran into our next major problem with RHEL-8 repositories when we found that -devel and -lib rpms in Code Ready Builder were not always in sync with their parent packages in BaseOS/AppStream. This means that if your build is wanting kernel-devel and the BaseOS is 4.9-11 but the CRB version is 4.9-10 then koji has no way to supply the dependency for you. The major culprit currently is that the virt module has had multiple updates but the virt-devel module has not had any updates.

Build Overview

  1. RHEL-8 packages are reposync'd from the CDN onto an infrastructure.fedoraproject.org NFS directory.
  2. grobisplitter runs on grobisplitter01.phx2.fedoraproject.org to break out each module into repositories in a $date/$arch/$repos layout.
  3. createrepo is run on $date/$arch.
  4. a symbolic link is set from $date to staged.
  5. reposync -n -d is run against staged/$arch into latest/$arch.
  6. createrepo is run on latest/$arch (steps 2 through 6 are sketched in shell after this list).
  7. koji points to latest/$arch.
  8. packages can be built.
  9. packages can be signed.
  10. bodhi and other items do their parts.
  11. we compose.
  12. profit?
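
A rough shell sketch of steps 2 through 6 (paths, repo ids, and loops are illustrative assumptions, not the production scripts; the grobisplitter invocation is omitted since I am not showing its flags here):

#!/usr/bin/bash
set -e
date=$(date +%Y%m%d)
for arch in x86_64 aarch64 ppc64le s390x; do
    # step 2: grobisplitter breaks the synced tree into ${date}/${arch}/${repo} here
    createrepo "${date}/${arch}"                      # step 3
done
ln -sfn "${date}" staged                              # step 4: repoint the staged link
for arch in x86_64 aarch64 ppc64le s390x; do
    # step 5: re-sync only the newest packages, deleting stale ones
    dnf reposync --newest-only --delete \
        --repofrompath="staged,staged/${arch}/merged_repo" \
        --repoid=staged --download-path="latest/${arch}"
    createrepo "latest/${arch}"                       # step 6
done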

What Are The Next Steps?

Currently we are looking to have our internal beta done by July 1st. At that point, we will work on documenting what we have done and re-implementing the tool changes in production. Developers will then be able to make branch requests to releng to make packages available, and builds should start flowing. From that we will probably find new things which will need fixes in either spec files or build infrastructure.

A Gantt chart of our current production plan is provided below.

2019-05-30

EPEL Proposal: Steve Gallagher's EPEL 8 Branch Strategy

Stephen Gallagher's Better Proposal for EPEL branching

So earlier this week I wrote up a proposal for EPEL-rawhide, which went over various ideas the EPEL steering committee has been kicking around for a bit. This was to try and work out how to branch for EPEL-8 and also how to deal with https://fedoraproject.org/wiki/EPEL/Changes/Minor_release_based_composes and https://fedoraproject.org/wiki/EPEL/Changes/Release_based_package_lifetimes . During the meeting it was clear that my strawman didn't have much in it and needed more thinking. Thankfully Stephen Gallagher looked in on the meeting and came up with some ideas that he wrote up and proposed to the list.. I recommend that you read the document and updates if you are interested in how branching in EPEL could work with EL7 and EL8.

2019-05-28

EPEL Proposal: EPEL Master branch AKA Rawhide

EPEL-rawhide

Update:

This proposal has been superseded by Stephen Gallagher's excellent wagontrain post. I will put it as a separate post next.

tl; dr:

In order to allow for faster availability of packages, add rawhide branches for EPEL-7 and EPEL-8. These branches would let developers build new packages they aren't sure are ready for either EPEL-N or EPEL-N-testing, and would allow faster rebuilds of newer features when RHEL has a large feature change.

The Longer Story

In the past 6 months, EPEL has had to make two major changes in its builds which were made harder by the way EPEL is currently built. The first was with changes in RHEL-7.6, which dropped some packages and changed some other packages' ABIs. This required a rebuild of a lot of packages, but there was no way we could do a find-and-fix before we did a 'flag-week' of rebuilds, with Troy Dawson and others doing lots of Proven Packager fixes and rebuilds.

The second was the python36 move, which also took a large amount of time and still has little problems showing up here and there. In a similar fashion, updates-testing had to be used as a rawhide for packages, which made building and testing hard for things not doing this change.

A third problem showed up when Troy was cleaning out packages in the EPEL-6 and 7 testing repos which had been there for years. The packagers were using these repos for things they felt were too unstable for EPEL due to unstable APIs, so they could either iterate quicker or not break existing users. The problem is that these packages might accidentally get promoted by someone seeing that a package was tested but never pushed. Having a separate tree for these unstable packages needed different thinking.

While doing a review of these two exercises, the EPEL steering committee came up with various ideas.. and I believe Kevin Fenzi brought up adding a rawhide as an easier fix than some of my more convoluted branch-every-release schemes (aka epel-7.6, epel-7.7, epel-7.8). In this new scheme, we would have the following branches: el6, epel7, epel7-master, epel8, epel8-master.

A possible workflow could be the following:
  1. Packages branched for EL7 or EL8 would get branched into the epel-M-master tree, where builds could be made against the latest RHEL.
  2. When Red Hat released a new beta (RHEL-M.N-beta), Fedora Infrastructure would download it and set it up so koji could find it as EPEL-M-master (or properly bikeshedded name). A mass update and rebuild would then be done against all packages in EPEL-M-master. Breakfixes and testing could be done.
  3. When the General Availability of RHEL-M.N occurs, EPEL will make a copy of EPEL-M.(N-1), EPEL-M.(N-1)-updates and EPEL-M.(N-1)-updates-testing in /pub/archive/epel/M/M.(N-1)/.
  4. And after Red Hat releases the General Availability of the RHEL-M.N release,
    1. if the version in master is newer than the version in branch, the master version will be checked into the branch. (This step is probably the most problematic and needs more work and thinking by people).
    2. packages which meet certain criteria will then be promoted to EPEL-M with a new compose of EPEL-M.N and an empty EPEL-M.N/updates and EPEL-M.N/updates-testing.
  5. The packager can do updates and fixes to packages in the EPEL-M branch.
  6. The finishing up of cleaning the archives can occur.
This is a preliminary proposal which needs a lot more work and resource commitments in changes to tooling and documentation. I am bringing this up as something I would like to get done as part of revamping EPEL this summer, but I also need feedback and help.

EPEL Proposal: Removal of PPC64 (Not PPC64le) on 2019-06-01

TL; DR:

EPEL is looking to put its EL6 and EL7 branches of PPC64 into archives by 2019-06-01. This is due to the fact that Fedora no longer builds for the PPC64 big-endian architecture.

The long story

As of the EOL of Fedora 28, the Fedora Project no longer supports or builds packages for the big endian Power64 (or ppc64) architecture. Kevin Fenzi went over this in his blog article, but I wanted to go over it again. I realize this is short notice so extra steps need to be done.

The Fedora Project uses Fedora Linux on its builders, which is useful for bringing on new architectures and for getting new features which RHEL does not have yet. However, it means that when an OS is End of Lifed, it no longer gets security updates, software improvements, or similar fixes. We could try and stand up an EL7 builder, but it would require reworking both tools and scripts that are expecting an F28 world (python3, various newer libraries and scripts, different APIs, etc). That would take a while to rework, and then continual work of keeping this builder in line with whatever EL8/F30+ world we move to in the coming months. Secondly, this would cut out a limited resource. We only have so many POWER8 systems which we can run PPC64 virtual machines on. The virtual machines can either build an EPEL package or a Fedora <29 package, but we would just be limiting this down to EPEL.
In the end, the number of PPC64 users is not that great. We have an average of 90 systems per day checking in, with many more PPC64LE systems. I think most of the PPC64 users would be able to get stuff from the archives just as well.

How do I get my stuff

The builds for EL6.10 and EL7.6 will be archived to /pub/archives/epel/7/7.6 and /pub/archives/epel/6/6.10 this week. We may need to roll out an updated epel-release which will point this architecture to that tree (a sketch of what such a stanza might look like is below). We will then remove the builders from Fedora and stop building for the architecture. In early July I will remove the remaining trees from /pub/epel and put in redirects to the archives.
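
Such a stanza might look roughly like this (the archive host and exact path are assumptions; check the archives before relying on it):

[epel]
name=Extra Packages for Enterprise Linux 7 - ppc64 (archived)
baseurl=https://archives.fedoraproject.org/pub/archive/epel/7/7.6/$basearch/
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7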

2019-03-14

Final 503 addendum

mirrorlist 503's for 2019
This is a graph of the number of 503's we have had in 2019. The earlier large growth in January/February has dropped down to just one web server, which is probably underpowered to run containers. We will look at taking it out of circulation in the coming weeks.

2019-03-13

EPEL: Python34->Python36 Move Happening (Currently in EPEL-testing)

Over the last 5 days, Troy Dawson, Jeroen van Meeuwen, Carl W George, and several helpers have gotten nearly all of the python34 packages moved over to python36 in EPEL-7. They are being included in 6 Bodhi pushes because of a limitation in Bodhi on the text size of the package list in an update.

The current date for these package groups to move into regular EPEL is April 2nd. We would like to have all the problems we find in the next week or so also addressed, so that the updates can occur in one large group without too much breakage.


https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2019-f2d195dada
https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2019-9e9f81e581
https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2019-0d62608bce
https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2019-5be892b745
https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2019-0f4cca7837
https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2019-ed3564d906

Please heavily test them by doing the following:

Stage 1 Testing

  1. Install RHEL, CentOS, or Scientific Linux 7 onto a TEST system.
  2. Install or enable the EPEL repository for this system
  3. Install various packages you would normally use
  4. yum --enablerepo=epel-testing update
  5. Report problems to epel-devel@lists.fedoraproject.org

Stage 2 Testing

  1. Check for any updated testing instructions on this blog or EPEL-devel list.
  2. Install RHEL, CentOS, or Scientific Linux 7 onto a TEST system.
  3. Install or enable the EPEL repository for this system
  4. yum install python34
  5. yum --enablerepo=epel-testing update
  6. Report problems to epel-devel@lists.fedoraproject.org

Stage 3 Testing

  1. Check for any updated testing instructions on this blog or EPEL-devel list.
  2. Install RHEL, CentOS, or Scientific Linux 7 onto a TEST system.
  3. Install or enable the EPEL repository for this system
  4. yum install python36
  5. yum --enablerepo=epel-testing update
  6. Report problems to epel-devel@lists.fedoraproject.org
This should cover the three most common scenarios. Other scenarios exist and will require some sort of intervention to work around. We will outline them as they come up.

Many Many Thanks go to Troy, Jeroen, Carl, and the many people on the python team who made a copr and did many of the initial patches to make this possible.

2019-02-19

503's.. the cliffnotes version

So I spent some time this weekend trying to show where five 12-hour days went in analyzing data. This first graph shows the number of successful mirror requests and breaks down the 503 errors per server.
All mirror requests since 2018-01
The two drops are where logs failed to be copied to the central system, rather than a problem with the mirrors. You can see that Monday through Friday, Fedora sees a lot of requests, and then Saturday and Sunday we see a dip. You can also see that we have gotten an increase in usage since November. The tiny area at the bottom is the number of 503's.. which kind of makes it look unimportant. [Unless you are doing a lot of builds and keep running into it.]

Just the 503's sir.
The above is just a graph of the 503's broken down by each server which sees them. We see that in January there was a large increase of 503's on 2 servers. The first is proxy11, which is in Europe, and the server may be underpowered for what we need it to do. The second is proxy01, which a lot of sites have hard-coded. If the above graph was done by the minute, you would see it as many many tiny spikes at :02 -> :10 minutes after the hour, with most of the day empty.

The graphs go to 2019-02-15 and the last 4 days have shown a decrease in 503's but proxy01 and proxy11 are still having several thousand a day. I am still looking at other fixes we can do to make this a less painful experience for people when the top of the hour occurs.

2019-02-16

Fedora Infrastructure Detective Work: Mirrorlist 503's

A Mysterious Problem

Recently there has been a large increase in failed yum/dnf updates, with consumers getting 503 errors when trying to update their systems. This has caused problems in both the Fedora COPR system and for normal users. Trying to find why this problem is occurring was some interesting detective work that took me most of February 8th to February 14th and is still ongoing.

A Scandal in Fedoria

A history of the Fedora Mirrormanager software

The Fedora Project mirrorlist system has evolved multiple times in the last 10 years. Originally written by Matt Domsch, it underwent an update and rewrite by Adrian Reber, et al, a couple of years ago. For many years Fedora used a server layout where the front-end web servers would proxy the data over VPN to dedicated mirrorlist servers. While this made sense when systems were a bit slower compared to VPN latency, it had become more troublesome over the last couple of years.

Simplification of the older mirrorlist design

Originally most of the Fedora proxy servers were donated systems from various ISPs, which made them network fast but not always CPU fast. This meant the system was designed to make the proxies mostly serve static content and relay anything computational to servers with more cpu cycles. As systems improved, it made more sense to move the mirrormanager software closer to the proxies; however, it wasn't until recently that Moby and then Podman containers could be put into the mix.

Simplified version of new mirrorlist design
If you will notice, neither of the above images mentions a database. There is one, but it is a separate system into which various mirror admins insert their mirrors; it regularly trawls them to make sure they are up to date. It then creates a python pkl with all the network data of the updated ones. This is then pushed to each proxy, which will feed it into new pods which are cycled once an hour doing a complicated dance.

  1. New pods are created from base container+new pkl and config data
  2. At 15 minutes after the hour, the first pod is told to drain out its old users.
  3. When it is drained, it is cycled with the new data.
  4. Repeat with the second pod and complete the dance.

The Final 503

Last spring/summer, we started getting reports on various mailing lists about users getting 503's causing dnf to fail. At first blush the number of failures didn't look too large, as roughly 0.2% of all requests resulted in 503's. [We serve on average 20,000,000 requests and at that time 45,000 were 503's, which was similar to what we had at times with the old VPN infrastructure.] However, on further looking at the logs, it was clear these 45,000 connections were happening in a specific time frame... 15 minutes after the hour.

Doing some more work, it looked like the timeouts we had for waiting for the pods to drain of active connections were not long enough. Adding longer timeouts before killing the container brought the number of 503's down dramatically, to an average of 450 per 20 million requests (or down to about 0.002%). This seemed to fix things until December.

The Adventure of the Empty 503

In July, all our proxy servers were running Fedora 27 and able to get security updates like all good systems. In late November, our proxy servers were still running Fedora 27 and no longer able to get security updates. Kevin Fenzi put in a lot of hours and got both the containers and the proxy servers redeployed with Fedora 29. This allowed us to move to newer versions of podman and other tools.

All seemed well until late January, when a string of reports of 503's started coming up again. At the time we had a couple of proxies that had stuck pods for various reasons, and I figured the two were related. However, the reports still happened after those problems had been fixed, and looking at the logs it was clear that instead of a 'normal' of 6,000 503's a day, we were seeing peaks of 400,000 503's a day.

In looking at the combined log data, the first thing that stood out was that the problems were not happening at 15 minutes after the hour. Instead they were mostly happening at the time frame of :00 -> :05 after the hour. They also had peaks of occurring at 00:00, 04:00, 08:00, 12:00, 16:00, and 20:00. These times make a sort of sense in that they are commonly chosen to run daily jobs and at the top of the hour there are usually 3-5x more requests than 10 minutes before.

I then looked to see if the problem happened with a particular client (dnf vs PackageKit vs yum) or particular versions of those clients, but it happened across the board. [The only difference is that yum seems to retry if it gets a 503 while dnf gives a hard stop.] At this point, Michal Novotny asked if I was just looking at combined logs.. and I had an 'Aha!' moment. I was looking at the combined logs and had no idea if this was on one server or not. After looking at the original logs, it was clear that proxy01.fedoraproject.org was getting the vast majority of the problems (the other proxies would generate ~2,000 each while proxy01 would generate 50,000 during a day). This again makes sense, as both the COPR build system and several other systems seem to hard-code this server because it is in the main Fedora co-location.

At this point I went to see what logs we had. This took some work because of how we had set up the system, but in the end this popped up.

[Wed Feb 13 21:02:36.774319 2019] [wsgi:error] [pid 26286:tid 140136494905088] (11)Resource temporarily unavailable: [client 10.88.0.1:58258] mod_wsgi (pid=26286): Unable to connect to WSGI daemon process 'mirrorlist' on '/run/httpd/wsgi.9.0.1.sock' after multiple attempts as listener backlog limit was exceeded or the socket does not exist.
[Wed Feb 13 21:02:36.774350 2019] [wsgi:error] [pid 26286:tid 140136520083200] (11)Resource temporarily unavailable: [client 10.88.0.1:58248] mod_wsgi (pid=26286): Unable to connect to WSGI daemon process 'mirrorlist' on '/run/httpd/wsgi.9.0.1.sock' after multiple attempts as listener backlog limit was exceeded or the socket does not exist.
[Wed Feb 13 21:02:36.774443 2019] [wsgi:error] [pid 26286:tid 140136125822720] (11)Resource temporarily unavailable: [client 10.88.0.1:58250] mod_wsgi (pid=26286): Unable to connect to WSGI daemon process 'mirrorlist' on '/run/httpd/wsgi.9.0.1.sock' after multiple attempts as listener backlog limit was exceeded or the socket does not exist.
[Wed Feb 13 21:02:36.774228 2019] [wsgi:error] [pid 26286:tid 140136058681088] (11)Resource temporarily unavailable: [client 10.88.0.1:58228] mod_wsgi (pid=26286): Unable to connect to WSGI daemon process 'mirrorlist' on '/run/httpd/wsgi.9.0.1.sock' after multiple attempts as listener backlog limit was exceeded or the socket does not exist

The socket definitely existed, so I went to look at backlog limits. Reading through various bug reports and man pages, I figured out some options to try: graceful-timeout=30 request-timeout=30 listen-backlog=1000 queue-timeout=30. Kevin added them and rebuilt the images, and I rolled them out to the proxies. The amount of failures went down dramatically, and I figured it was due to allowing a larger backlog. Michal pointed out that was impossible, because the kernel has a default backlog size of 128 and the wsgi listener will be capped at that no matter how much larger I made the setting. Reading through the man pages again, I realized I had some cargo-culting going on. Michal then pointed out the change needed, and I rolled it out to proxy01 to see if it would help.
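
The options in question hang off mod_wsgi's WSGIDaemonProcess directive; a sketch of the relevant httpd config fragment (the process and thread counts here are invented):

WSGIDaemonProcess mirrorlist processes=2 threads=25 \
    graceful-timeout=30 request-timeout=30 listen-backlog=1000 queue-timeout=30

As Michal noted, listen-backlog stays capped by the kernel's net.core.somaxconn (default 128) unless that sysctl is raised as well.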

The Last 503?

Currently proxy01 is still seeing 503 failures, but at a lower rate than before. Before the changes it was averaging 50,000 503's a day, and since the change it looks to be at 8,000. I need to do more research to see what options will help. The increasing of timeouts may have helped, but it may only be masking the problem elsewhere. We will need to look at increasing the number of pods that are available, though that will increase memory usage and may mean some proxies are not usable for mirrormanager anymore. We also need to find out if other wsgi, haproxy or podman options would help.

In any case it has been a very interesting week of detective work. I hope we can make the usage of our mirrors more reliable, and that this has been useful to read. [I also expect several updates and fixes to this article as time goes on.]