2024-05-29

Where did 5 Million EPEL-7 systems come from starting in March?

ADDENDUM (2024-05-30T01:08+00:00): Multiple Amazon engineers reached out after I posted this and there is work on identifying what is causing this issue. Thank you to all the people who are burning the midnight oil on this.

ORIGINAL ARTICLE:

So starting earlier this year, the Fedora mirror manager mailing list started getting reports about heavier usage from various mirrors. Looking at the traffic reported, it seemed to be a large number of EL-7 systems starting to check in a lot more than in the past. At first I thought it was because various domains were beginning to move their operating systems from EL-7 to a newer release using one of the transition tools like Red Hat LEAPP or Alma ELevate. However, the surge didn't seem to die down, and in fact the Fedora Project mirrors have had regular problems with load due to this surge. 

A couple of weeks ago, I finally had time to look at some graphs I had set up when I was in Fedora Infrastructure and saw this:

Cumulative EPEL releases since 2019

 

Basically the load is an additional 5 million systems starting to query both the Fedora webproxies for mirror data, and then mirrors around the world to get further information. Going through the logs, there seems to be a 'gradual' shift of additional servers starting to ask for content when they had not before. In looking at the logs, it is hard to see what the systems asking for this data are. EL-7 uses yum which doesn't report any user data beyond:

urlgrabber/3.10 yum/3.4.

That could mean the system is Red Hat Enterprise Linux 7, CentOS Linux 7, or even Amazon Linux 2 (which is sort of based on CentOS 7, but with various changes that using EPEL is probably not advised).

Because there wasn't a countme or any identifiers in the yum code, the older data-analysis program does a 'if I see an ip address 1 time a day, I count it once.. if I see it 1000 times, I count it once.' This had a problem of undercounting for various cloud and other networks behind a NAT router.. so normally maybe only 1 ip address would show up in a class C (/24) network space. What seemed to change is where we might only count one ip address in various networks, we were now seeing every ip address showing up in a Class C network. 

Doing some backtracking of the ip addresses to ASN numbers, I was able to show that the 'top 10' ASNs changed dramatically in March

January 27, 2024
Total  ASN 
1347016 16509_AMAZON-02,
219728 14618_AMAZON-AES,
53500 396982_GOOGLE-CLOUD-PLATFORM,
11205 8560_IONOS-AS
10403 8987_AMAZON
8463 32244_LIQUIDWEB,
8019 54641_IMH-IAD,
7965 8075_MICROSOFT-CORP-MSN-AS-BLOCK,
7889 398101_GO-DADDY-COM-LLC,
7234 394303_BIGSCOOTS,

February 27, 2024
1871463 16509_AMAZON-02,
219545 14618_AMAZON-AES,
51511 396982_GOOGLE-CLOUD-PLATFORM,
11021 8560_IONOS-AS
9016 8987_AMAZON
8208 32244_LIQUIDWEB,
7885 54641_IMH-IAD,
7768 8075_MICROSOFT-CORP-MSN-AS-BLOCK,
7618 398101_GO-DADDY-COM-LLC,
7383 394303_BIGSCOOTS,

March 27, 2024
2604768 16509_AMAZON-02,
276737 14618_AMAZON-AES,
34674 396982_GOOGLE-CLOUD-PLATFORM,
10211 8560_IONOS-AS
9560 135629_WESTCLOUDDATA
8134 8987_AMAZON
7952 54641_IMH-IAD,
7677 32244_LIQUIDWEB,
7445 394303_BIGSCOOTS,
7250 398101_GO-DADDY-COM-LLC,

April 27, 2024
4247068 16509_AMAZON-02,
1807803 14618_AMAZON-AES,
65274 8987_AMAZON
51668 135629_WESTCLOUDDATA
41190 55960_BJ-GUANGHUAN-AP
9799 396982_GOOGLE-CLOUD-PLATFORM,
7662 54641_IMH-IAD,
7561 394303_BIGSCOOTS,
6613 32244_LIQUIDWEB,
6425 8560_IONOS-AS

May 27, 2024
4186230 16509_AMAZON-02,
1775898 14618_AMAZON-AES,
62698 8987_AMAZON
50895 135629_WESTCLOUDDATA
38521 55960_BJ-GUANGHUAN-AP
9059 396982_GOOGLE-CLOUD-PLATFORM,
7613 394303_BIGSCOOTS,
7531 54641_IMH-IAD,
6307 398101_GO-DADDY-COM-LLC,
6222 32244_LIQUIDWEB,

I am not sure what changed in Amazon in March, but it has had a tremendous impact on parts of Fedora Infrastructure and the volunteer mirror systems which use it.

2024-05-14

CentOS Linux 7: End of Life 2024-06-30

CentOS 7 EOL

If this looks a lot like the CentOS Stream 8 EOL content.. well it is because they aren't too different. It is just that instead of doing this once every 4 years, we get to do this twice in one year.
 
The last CentOS Linux release, CentOS Linux 7, will reach its end of life on June 30th, 2024. When that happens, the CentOS Infrastructure will follow the standard practice it has been doing since the early days of CentOS Linux:
  1. Move all content to https://vault.centos.org/
  2. Stop the mirroring software from responding to EL-7 queries.  
  3. Additional software for EL-7 may also be removed from other locations.

The first change usually causes any mirror doing an rsync with delete options to remove the contents from their mirror. The second change will cause the yum commands to break with errors. The vault system is also a single server with few active mirrors of its contents. As such, it is likely to be overloaded and very slow to respond as additional requests are made of it. 

 At the same time, the EPEL software on https://dl.fedoraproject.org/pub/epel/7 will be moved to /pub/archive/epel/7 and the mirrormanager for that will be updated appropriately.

In total, if you are using CentOS Linux 7 in either production or a CI/CD system, you can expect a lot of errors on July 1st or shortly afterwords.

What you must do!

There are several steps you can do to get ahead of the tsunami of problems:
  1. You can convert to Red Hat Enterprise Linux 7 and see about getting a Extended LifeCycle Support contract with the system. This is a stop gap measure for you to move toward a newer release of a Linux operating system.
  2. You can move your system to a replacement Enterprise Linux distribution. The Alma Project, Rocky Linux, Oracle and Red Hat all offer tools which can transition an EL7 system to a version of the operating system they will support for several years.
  3. If you are not able to move your systems in the next 45 days, you should look at mirroring the CentOS Linux 7 operating system to a more local location and move your update configs to use your mirror. With the large size of systems which could potentially try to use the vault system, I would not expect this to be very useful. As you will probably need to reinstall, add software or do continuous CI/CD in other areas.. you should keep a copy of the operating system local to your networks.

 References:

2024-05-13

CentOS Stream 8 END OF LIFE : 2024-05-31

CentOS Stream 8 EOL

CentOS Stream 8 will reach its end of life on May 31st, 2024. When that happens, the CentOS Infrastructure will follow the standard practice it has been doing since the early days of CentOS Linux:
  1. Move all content to https://vault.centos.org/
  2. Stop the mirroring software from responding to EL-8 queries. 

The first change usually causes any mirror doing an rsync with delete options to remove the contents from their mirror. The second change will cause the dnf or yum commands to break with errors. The vault system is also a single server with few active mirrors of its contents. As such, it is likely to be overloaded and very slow to respond as additional requests are made of it.

In total, if you are using CentOS Stream 8 in either production or a CI/CD system, you can expect a lot of errors on June 1st or shortly afterwords. 

What you can do!

There are several steps you can do to get ahead of a possible tsunami of problems:
  1. You can look to moving to a newer release of CentOS Stream before the end of the month. This usually will require deployment of new images or installs versus straight updates. 
  2. You can see if any of the 'move my Enterprise Linux' tools have added support for moving from CentOS Stream 8 to their EL8.10. For releases before 8.10, this was very hard because CentOS Stream 8 was usually the next release, but 8.10 is at a point where Alma, Oracle, Red Hat Enterprise Linux, or Rocky are at the same revisions or newer.
  3. You can start mirroring the CentOS Stream 8 content into your infrastructure and point any CI/CD or other systems to that mirror which will allow you to continue to function.

Mirroring CentOS Stream

Of the three options, I recommend the third. However in working out what is needed to mirror CentOS Stream, I realized I needed newer documentation and it would probably be a long post in itself. For a shorter version for self-starters, I recommend the documentation on the CentOS wiki,  https://wiki.centos.org/HowTos(2f)CreateLocalMirror.html  While the information was written for CentOS Linux 6 which was end-of-lifed in 2020, it covers most of the instructions needed. The parts which may need updating is the amount of disk space required for CentOS Stream which seems to be about 280 GB for everything and maybe around 120GB for any one architecture. 

 References:

 

Recent EPEL dnf problems with some EL8 systems

Last week there were several reports about various systems having problems with EPEL-8 repositories. The problems started shortly after various systems in Fedora had been updated from Fedora Linux 38 to Fedora 40, but the problems were not happening to all EL-8 systems. The problems were root-caused to the following:

  1. Fedora 40 had createrepo-1.0 installed which defaults to using zstd compression for various repositories.
  2. The EL8 systems which were working had some version of libsolv-0.7.20 installed which links against libzstd and so works.
  3. The EL8 systems which did not work either had versions of libsolv before 0.7.20 (and were any EL before 8.4) OR they had versions of libsolv-0.7.22

The newer version of libsolv was traced down to coming from repositories of either Red Hat Satellite or a related tool pulp, and was a rebuild of the EL9 package. However it wasn't a complete rebuild as the EL9 version has libzstd support, but the EL8 rebuild did not. 

A band-aid fix was made by Fedora Release Engineering to have EPEL repositories use the xz compression method for various files in /repodata/ versus libzstd. This is a slower compression method, but is possible to be used with all of the versions of libsolv reported in the various bugs and issue trackers.

This is a band-aid fix because users will run into this with any other repositories using the newer createrepo. A fuller fix will require affected systems to do one of the following:

  1. If the system is either Red Hat Enterprise Linux, Rocky Linux, Alma Linux, or Oracle Enterprise Linux and not running EL-8.9 or later, they need to upgrade to that OR point to an older version of a repository in https://dl.fedoraproject.org/pub/archive/epel/
  2. If the system has versions of libsolv-0.7.22 without libzstd support, they should contact the repository to see if a rebuild with libzstd support can be made. 
  3. If the system is CentOS Stream 8, then you should be making plans to upgrade to a different operating system as that version will be EOL on May 31st 2024.

External References

 

2023-04-21

Note To Future Self: Lenovo Laptop USB-C Mini Dock Reset

Hello Future Self,

Past Self here leaving you a note since I forgot to do so last time.

The Problem

When running Linux on a Lenovo, there are times where a firmware update will cause problems with the USB-C Mini Dock afterwards. In the previous 2 cases, the USB-C's RTL network will no longer show up as a seen device. External monitors plugged into the dock may also not function correctly, but it only happened once so I am not sure about that.

Diagnosis of the problem is that the system will complain of no internet connection, and commands will show something like the following (output altered):


ssmoogen@ssmoogen-rh:~$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s31f6: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    link/ether ab:cd:12:34:56:78 brd ff:ff:ff:ff:ff:ff
3: wlp82s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 9a:bc:de:f1:23:45 brd ff:ff:ff:ff:ff:ff permaddr aa:aa:bb:bb:cc:cc

The part that has confused me at least once before is that the enp0s31f6 says NO-CARRIER which made me believe that the switch had problems. Looking at the USB-C showed that the link light was on and there was traffic. This led me to try various solutions which were wrong.

Attempted Solutions

In trying to diagnosis this in the past, I tried backing out all the firmware updates to see if they would untrigger the bricking of the connection. The fwupdtool worked great to do this, and I was able to back down through 8 firmwares without a hitch. However the network still said it was offline.

Next I went through older kernels and tried booting with a USB stick. All of them continued to show the e1000e as disconnected. Finally I went through the journalctl command to look for previous boots and what networks were shown up.


ssmoogen@ssmoogen-rh:~$ journalctl | grep eth0 | tail -n 100
....
Apr 20 16:54:37 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
Apr 20 16:54:37 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF
Apr 20 16:54:37 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 enp0s31f6: renamed from eth0
Apr 20 16:54:38 ssmoogen-rh.localdomain kernel: r8152 4-2.1.2:1.0 eth0: v1.12.13
Apr 20 16:54:38 ssmoogen-rh.localdomain kernel: r8152 4-2.1.2:1.0 enp9s0u2u1u2: renamed from eth0
Apr 20 17:09:24 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) aa:aa:aa:bb:bb:bb
Apr 20 17:09:24 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
Apr 20 17:09:24 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF
Apr 20 17:09:24 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 enp0s31f6: renamed from eth0
Apr 20 17:13:56 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) aa:aa:aa:bb:bb:bb
Apr 20 17:13:56 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
Apr 20 17:13:56 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF
Apr 20 17:13:56 ssmoogen-rh.localdomain kernel: e1000e 0000:00:1f.6 enp0s31f6: renamed from eth0
...

The Solution

Matching up the timestamps I see that there was an additional device see before the updates occurred. This was using the r8152 driver but after the firmware updates it no longer showed up. This finally triggered a memory of an email I had seen where someone else had a similar problem. Going through my email archives, I found that the solution they had found was to unplug the USB-C Dock for 1 minute and then plug everything back in. Sure enough, doing this restored the RTL driver and my network was restored. The e1000 was a red herring as it is somewhere internal to the laptop and probably available through a dongle which I forgot about as there doesn't seem to be a RJ-45 jack I could find on the exterior of the laptop.

Anyway, when this happens again, please remember this letter and save yourself 2 hours of firmware resets and kernel reboots. Sometimes completely turning off the hardware (remove all power from the Dock including the laptop) and turning it back on WILL fix the problem.

2023-04-19

~1 year to end of EPEL-7

Extra Packages for Enterprise Linux 7

Extra Packages for Enterprise Linux (EPEL) are packages based off of various Fedora releases but built for the various distributions based off of Red Hat Enterprise Linux. In June of 2014, Red Hat Enterprise Linux 7 (EL-7) was released and over the next several months, a focus was made to make the release of EPEL possible. Much of the work was done by Kevin Fenzi and Dennis Gilmore with some additional work by anyone else who had spare time. The initial goal was to make it that core packages needed for Fedora Infrastructure to move its core servers to EL-7 were built. That had been what had been done for the initial releases of EPEL-5 and EPEL-6, and would allow for enough base 'packages' to be built for additional packages to be added by other maintainers.

In comparison to trying to get EPEL-5 working, building for EPEL-7 was fairly easy. The initial distribution came with a large set of shipped development libraries and tooling versus getting added later. Over the years, the EL-7 distribution also gained various newer gcc toolkits via software collections which also helped EPEL maintainers to keep updating software for much of the 10 year lifecycle of EL-7. However, this maintenance has been getting harder over the last 2 years, as more and more software required either newer kernels, glibc, or other libraries that aren't available for an older operating system. [This is similar to what happened with EPEL-5 and EPEL-6 where the last year or so of the repository was more and more packages being removed due to maintenance concerns.]

This ties in with the general end of support for Red Hat Enterprise Linux 7 on June 30, 2024. While final plans for how EPEL-7 will be end of lifed, this is a general outline from how EPEL-5 and EPEL-6 were similarly ended.

  1. There will be regular reminders on mailing lists that the project will no longer be supporting EL-7 after a specified date. 
  2. On that date, the following will happen:
    1. The Fedora build system will not allow any more EPEL-7 builds
    2. A final push of all updates will happen to /pub/epel/7/
    3. The current items in /pub/epel/7/ will be archived over to /pub/archive/epel/7.final/
    4. Symbolic links will be made to point /pub/archive/epel/7 to the 7.final
    5. The mirrormanager program which is what yum uses to look for updates will change where it points to to /pub/archive/epel/7/
    6. After a week to allow mirrors to catch up, /pub/epel/7/ will be removed and a line telling people where to find the archived content.
  3. Updates to lists and such explaining what happened will occur.

Why A Year Plus Notice?

EPEL-7 is the largest release that the Fedora project supports. There are about 400,000 Fedora systems seen by countme, and somewhere between 3.4 million and 6.7 million EPEL-7 users (depending on how looks at mirrormanager statistics). Going from the long tail turn off of EPEL-5 and EPEL-6 systems over the years, many of those EPEL-7 systems will take years to move to later releases. Going from past reports, many of the system administrators are not the original admins who set up the machine, and don't even know the OS or its auxiliary repositories like EPEL are no longer updated. Putting up blog posts like this can help:

  • Give admins notice and a case to their management to do updates BEFORE the end of life date.
  • A heads up on why scripts that mirrored content from /pub/epel/7/ will no longer work.
  • Time to mirror the content locally for the inevitable reinstalls because management don't think an update to a newer release is needed.  

Whatever the case, good luck to you fellow system administrators.

2022-04-01

Compiling openldap for CentOS 8 Stream

Compiling OpenLDAP for EL8 systems

Steps to compile openldap-server for CentOS 8 Stream

The EL8 release did not ship an openldap-server like it did in previous releases. Instead only the client tools and some libraries are included for existing applications. Instead the focus from the upstream provider has been on other LDAP solutions.

This leaves a problem for various sites who have their data in an OpenLDAP system and do not have the time, energy, resources for moving to something else. There are several possible solutions to this:

  1. Continue to use EL5/EL6 even though it is at end of open maintenance.
  2. Continue to use EL7 until it is end of open maintenance around 2024-06-30.
  3. Move to a different distribution which does have working openldap
  4. Compile replacement tools using the Fedora src.rpm which may be closer to the ‘upstream’.
  5. Compile replacement tools using the upstream source.
  6. Compile using the upstream source from https://git.centos.org
  7. [Added after initial post] You can download them from https://koji.mbox.centos.org/koji/

In this tutorial we will work with number 5. At the end we will cover number 6.

Setting up a build environment.

For simplicity sake, we will assume you have a working but minimally installed Fedora 35 or EL8 system (Alma, Oracle, Rocky, etc) which you can do compiles in. If we are using an EL8 system are going to need to get mock and git installed.

$ sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm

For Fedora and EL8 systems the following should work the same:

$ sudo dnf install git mock rpm-build
$ sudo usermod -a -G mock $USERNAME
$ newgrp mock

Answer yes to the questions about adding new keys and the packages should be installed to allow for a build to occur. We now need to set up a minimal .rpmmacros file for the next steps:

# uncomment if you want to build in standard homedirectory
#%_topdir %(echo $HOME)/rpmbuild
# comment if want to use standard home directory
%_topdir        %{getenv:PWD}
%_sourcedir     %{_topdir}/SOURCES
#%_sourcedir     %{_topdir}/SOURCES/%{name}-%{version}
%_specdir       %{_topdir}/SPECS
%_srcrpmdir     %{_topdir}/SRPMS
%_builddir      %{_topdir}/BUILD

%__arch_install_post \
    [ "%{buildarch}" = "noarch" ] || QA_CHECK_RPATHS=1 ; \
    case "${QA_CHECK_RPATHS:-}" in [1yY]*) /usr/lib/rpm/check-rpaths ;; esac \
    /usr/lib/rpm/check-buildroot

Once we have that in place, the following will get an openldap build going:

$ mkdir -vp ~/EL8-sources/ ~/output-packages/
$ cd ~/EL8-sources/
$ git clone https://git.centos.org/rpms/openldap.git
$ git clone https://git.centos.org/centos-git-common.git
$ cd openldap
$ ../centos-git-common/get_sources.sh
$ rpmbuild -bs SPECS/openldap.spec

Now depending on the host OS you are doing this on, you should see a file like SRPMS/openldap-2.4.46-18.fc35.src.rpm or SRPMS/openldap-2.4.46-18.el8.src.rpm having been created.

$ mock -r centos-stream+epel-next-8-x86_64 --chain --localrepo \
~/output-packages/ SRPMS/openldap-2.4.46-18.fc35.src.rpm

should then attempt to build the packages and will end up with a fully usable repo in ${HOMEDIR}/output-packages/results/centos-stream+epel-next-8-x86_64

If not, then there are probably some steps or problems I missed in this howto :(. At this point you can determine what to do with installing this -server package on the server needing it.

Downloading direct from CentOS.

This is the ‘feed the fisherman versus teaching how to fish’ part of the document.

If you are using CentOS Stream 8, you can download the build packages from the project koji. I expect similar steps can be done for other rebuilds.

  1. dnf list openldap to get which package you are looking for.
  2. Open a window to https://koji.mbox.centos.org/koji/
  3. Type in openldap in the Search box.
  4. Click on the build you would have installed. For this example, we will choose https://koji.mbox.centos.org/koji/buildinfo?buildID=18688 and then scroll down to the architecture you are using.
  5. Right click on the download button for openldap-servers like:https://koji.mbox.centos.org/pkgs/packages/openldap/2.4.46/18.el8/x86_64/openldap-servers-2.4.46-18.el8.x86_64.rpm
  6. Install this package in the package place you want.
  7. When dnf breaks because it can’t upgrade the package due to the upstream updating, go follow step 0 again.