2017-11-21

Presents for System Administrators: Updated nagios/nagios-plugins/nrpe

It is the traditional US Thanksgiving time period, and I am thankful for the patience many sysadmins have had with me. After some delays, updated packages for nagios, nagios-plugins, and nrpe have been built for EPEL-6, EPEL-7, F25, F26, F27 and rawhide.


  1. nagios is updated to 4.3.4 in all channels. I have also fixed some issues in EL-6.
  2. nagios-plugins is built out of git with what I thought was going to be 2.2.2 this summer. I have cleaned out some old patches and added updates so that it compiles on rawhide. [I will have to update rawhide in December when I figure out the right mariadb-connector-c fix.]
  3. nrpe has been updated to 3.2.1 which was released in September.

Currently one problem I have to deal with is that the entry for the nagios status file was moved at some point in the past. It was in /var/spool/nagios/ and looks to be in /var/log/nagios now, which various configs might not have been changed to match.
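
For anyone wanting to check their own configs for the old location, here is a minimal sketch. It assumes the path shows up literally in a key=value line (such as a status_file setting); the function name and the sample fragment are made up for illustration and are not taken from the shipped packages.

```python
OLD_DIR = "/var/spool/nagios/"


def find_old_status_paths(config_text, old_dir=OLD_DIR):
    """Return (line_number, line) pairs that still mention the old directory.

    Blank lines and comment lines (starting with '#') are skipped.
    """
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if old_dir in stripped:
            hits.append((number, stripped))
    return hits


# Hypothetical fragment; the file names are examples only.
sample = """\
# example fragment, not a real shipped config
log_file=/var/log/nagios/nagios.log
status_file=/var/spool/nagios/status.dat
"""
print(find_old_status_paths(sample))
```

Any line it reports is a candidate for pointing at the new directory instead.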

Otherwise please look for the updates in the various updates channels and give feedback on what you find right or wrong.

2017-11-06

EPEL mirror file layout changes


As several people have noted, the file directory structure of EPEL has changed recently. This layout may require changes in both (1) scripts written with hard-coded locations, and (2) mirrors which were unable to get daily updates from the main mirrors. While the changes were communicated in meetings, I did not comprehend their effects well enough to let mirrors and EPEL users know about them. As a result, this announcement was delayed by over two weeks.

What Happened


The updates in the build system were to add new features and make the release engineering code more manageable. The old release style used by EPEL in EL-6 and EL-7 was different from how all other releases were done and caused several problems for the release code and mirrors.


  1. Because all the files of the release were in one directory, any code which needed to stat(2) the directory caused the server to go over thousands of files before returning. With EPEL accounting for a large share of downloads, this negatively impacted systems. Servers mirroring the data could see long delays in rsyncing the data down.
  2. The code that generated this was a 'special' case in the Fedora releng release process which was fragile and tended to cause problems for updates and releases in both EPEL and the normal release.
  3. The layouts were different from the current Fedora release so that people grabbing software from multiple places also had to special case their scripts.


During the updates to the release system with a new version of pungi, it was decided to remove this special case and have all software Fedora created laid out in the same structure by the build tools. This would hopefully make things much more maintainable and improve performance.

In order to safely transition, there would be a time where the old files would remain on the server in the old trees and also be hardlinked to their new location. This was intended to allow for mirrors to get the files with the minimum amount of bandwidth. However there were some problems which showed up.


  1. As I said before, I didn't grasp that the change was going to affect EPEL and didn't communicate this to the lists. 
  2. The transition time for removing the hardlinks was in days versus weeks. While most mirrors do daily updates, some only do weekly or monthly rsyncs. They missed the hardlinks completely and had to download data twice.
  3. In the usual rule of three, various top level mirrors (mirrors.kernel.org and some others) had unrelated mirroring problems at the end of October. When these servers caught up with the new layout, the hardlinked files were gone. This meant that mirrors taking data from a couple of tier1 sites had large uploads.
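
To show why the hardlinks mattered, here is a small sketch of the idea. The directory names and file name are invented, and this is not the release tooling itself: a file that is hardlinked into a new tree is the same data on disk under two names, so a mirror that syncs while both names exist and preserves hardlinks (rsync's -H option does this) only has to fetch the data once.

```python
import os
import tempfile


def hardlink_into_new_tree(old_path, new_path):
    """Make new_path a hardlink to old_path, creating parent directories."""
    os.makedirs(os.path.dirname(new_path), exist_ok=True)
    os.link(old_path, new_path)


def same_file(path_a, path_b):
    """True when both paths point at the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical old (flat) and new (split) locations for one package.
    old = os.path.join(root, "old", "foo-1.0.rpm")
    new = os.path.join(root, "new", "Packages", "f", "foo-1.0.rpm")
    os.makedirs(os.path.dirname(old))
    with open(old, "wb") as handle:
        handle.write(b"not a real rpm")
    hardlink_into_new_tree(old, new)
    linked = same_file(old, new)
    link_count = os.stat(old).st_nlink
    print(linked, link_count)
```

Once the old name is removed, a mirror that never saw both names at the same time has no way to know the "new" file is one it already has, which is why the late mirrors downloaded everything again.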

How to deal with current things

The current layout structure should be 'solid' for the next couple of years. With the breakdown of packages into alphabetical subtrees, the 'load' per server should not require a re-ordering in the near future.
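
As a rough illustration of what an alphabetical subtree means for a script, the sketch below files a package under a directory named after its first character. The 'Packages' directory name, the lowercase rule, and the sample file name are assumptions for the example; check the repodata or a mirror listing for the real locations.

```python
import posixpath


def alphabetical_subdir(filename, top="Packages"):
    """Return a relative path that files the package under its first letter.

    The 'Packages' directory name and the lowercase-first-character rule are
    assumptions made for illustration only.
    """
    if not filename:
        raise ValueError("empty filename")
    return posixpath.join(top, filename[0].lower(), filename)


# Made-up file name, used only to show the shape of the result.
print(alphabetical_subdir("nagios-4.3.4-1.el7.x86_64.rpm"))
```

Splitting one huge directory into a couple of dozen smaller ones is what keeps any single directory listing short.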

If you have written scripts which download a specific file from the mirrors (e.g. http://dl.fedoraproject.org/pub/archive/epel/5/i386/epel-release-5-4.noarch.rpm or some similar link), you should instead use a stable linked package like http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm. The epel-release packages get updated regularly to get new macros or other changes, so linking to a specific file is very error prone.

Otherwise one should use yum/dnf related commands to get the files from the mirrors. This is useful for mirror sites which may alter the directory structure themselves and thus only the repodata is 'safe' to figure out what to download.
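
For scripts that must fetch the release package by URL, a small helper keeps the versioned file name out of the script. This is only a sketch: the function name is made up, and it simply follows the pattern of the EPEL-7 link above, so other release numbers are an assumption until checked against the server.

```python
BASE = "http://dl.fedoraproject.org/pub/epel"


def epel_release_latest_url(major, base=BASE):
    """Build the stable 'latest' link for a given EPEL major release.

    Only the EPEL-7 form is taken from the post; other values of `major`
    follow the same pattern by assumption.
    """
    return f"{base}/epel-release-latest-{int(major)}.noarch.rpm"


print(epel_release_latest_url(7))
```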

2017-10-03

Ansible RPMS are no longer in EPEL-7

Ansible packages are no longer shipped in EPEL-7 as they have been included in Red Hat Enterprise Linux Extras (and similarly in CentOS-7 and hopefully Scientific Linux 7.4).

Systems which are either using Amazon Linux or Red Hat Enterprise Linux EUS release of 7.2/7.3 will need to get packages from Ansible directly using

http://releases.ansible.com/ansible/rpm/

My thanks to the Ansible Maintainer Kevin Fenzi for having the package inside of EPEL for the last several years.

2017-10-02

Nagios being updated to 4.3.4 in EPEL and Fedora

It took me longer than I wanted, but I have gotten a testing candidate for nagios-4.3.4 in EL6, EL7, F25, F26, F27 and rawhide. This will fix the security problem seen in CVE-2017-14312.

I have made a couple of changes in the RPM also as rpmlint pointed out that the libnagios.a should be in the -devel package and that various contrib items needed to be packaged up in a similar package. I expect I missed something so please test and let me know so I can get this published as soon as possible.


2017-09-06

Flock 2017 : Summary

The trip to FLOCK 2017 in Cape Cod was a nice excursion where I learned a lot of things. I had not been able to go to the two previous Flocks in Rochester NY or Poland, so had not been up to date with many things. It was very nice to see many people who I had not seen in 2 years and to catch up with many projects which I had heard of, and even installed servers for, but did not have much knowledge of the details.

The days were mostly a blur of going to a couple of talks per day, a lot of hallway track items, and dealing with a couple of outages which I needed to help with. So the following is a shortened summary:

Monday: Day 0

I posted on this earlier. The day was a pretty good one and I got to let someone else drive through Massachusetts traffic.

Tuesday: Day 1

I wanted to make sure I did not sleep through the opening day talks (something I have been known to do), so I got up extra early, had a big breakfast with some guests from Europe, and made it to sit up front. Matthew Miller gave a nice talk on the status of Fedora and was able to show some pretty pictures from data I helped collect. After trying to advertise the EPEL state of the union talk, I then went to do some hallway meetings and talked with kernel, FESCO and various developers about x86_32 support in Fedora. This was so I could report to the x86 committee at a meeting on 2017-09-06.

Later in the day, I went to see Tom Callaway give a talk on licenses and the importance of a strong liver when dealing with them. It was interesting to see how far we have come in so many years. I had hoped to then go to the Fedora on Windows subsystem talk, as I have been using Cygwin on Windows for years and wanted to see how this worked also. However, a work item came up and I was pretty much booked until later in the evening.

Wednesday: Day 2

Today was the EPEL state of the Union talk. I spent the morning working on a blog post about everything I was going to say.. only to do a Ctrl-A backspace at the wrong moment. Goodbye writing. I am going to go over the particulars in a different post. The two talks went pretty well but I need to go over the videos to see what I actually said versus what I think I said. After the talks, I got to ride in a Tesla and also play various boardwalk games at a nice retro playplace. I finally went back and crashed for a bit, but woke up with insomnia until 4am.

Thursday: Day 3

This day got off to a rough start. I was really really tired and almost fell asleep at the Fedora Infrastructure State of the Union talk. I went back to the room at 1300 for a power nap and woke up after 1700. Went to see if anything was still active and had some more hallway talks about EPEL and other architectures. Finally went back to bed at 2200 and slept soundly.

Friday: Day 4

Had a nice breakfast with most of the Fedora Infrastructure team, and then did a fast jog to catch my bus to the airport. The bus ride was supposed to be 90 minutes, which would allow me 2 hours to get through security. Sadly, a Friday before Labour Day weekend.. does not lead to a 90 minute bus ride. At 3 hours and somewhat, I got to the airport with just enough time for a very last minute dash through security and everything else. I got onto the plane before the doors closed, and was able to fly home to be greeted by the last remnants of Hurricane Harvey. We only had 40 minutes of rain from it, but even as a smidgen of what eastern Texas got it was incredibly heavy rain and hail. Got home and crashed.

Fedora Project Outage RCA :: DNS Outage 2017-09-06


Early on 2017-09-06, many people attempting to reach fedoraproject.org
found that it had disappeared from the internet. People attempting to
do 'yum/dnf install', browse the website, or other Internet related
activities were getting various error messages that the sites no
longer existed in DNS. Some people had no difficulty and were not
able to duplicate the problem, but anyone who was using a DNS server
that had DNSSEC checking turned on was unable to get any IP address
lookups related to the site.

The problem was due to a misconfigured record in the registrar's data
about DNS. The previous week, multiple records had been added by the
registrar to the DNS data in the .org. DNS table. The records were the
DNSSEC records for fedorapeople.org, fedorahosted.org, and
fedoraproject.org, and the registrar had added all of them to
fedoraproject.org. rather than each to its correct zone. On seeing
this, I asked for two of the records to be removed, and somehow
confused which one was to stay. This meant that the key meant for
fedorahosted.org. was left for fedoraproject.org. and the
fedoraproject.org and fedorapeople.org keys were removed.

When the registrar updated its .org. data early UTC on 2017-09-06, DNS
servers like Google's 8.8.8.8 no longer would show any addresses
inside of Fedora's DNS tables. Other DNS servers also were no longer
working, and people from the IETF who work on DNSSEC came in to help in
case there was some other problem going on.

After diagnosing the problem, Fedora IT contacted the registrar and
got the correct DNSSEC keys added to the master table. This cleaned up
the problems with many DNS servers, but some will cache the broken data
for up to the TTL of 24 hours, so users were still having problems as
of 2200 UTC 2017-09-06. A temporary fix is to hard code the main proxy
IP address into /etc/hosts; however, this can cause problems later if
it is not removed and the main proxy is down for maintenance.

I would like to thank the members of the IETF DNSSEC group who took
the time out to help us through this problem. I would also like to
apologize to everyone who had disruption due to this.

2017-08-28

Flock 2017: Day 0

Today (2017-08-28) was the day before the official beginning of Flock 2017, which is being held in Cape Cod, Massachusetts. This is the first Flock I have been able to go to in 2 years, so it has been a lot of catching up with old friends.

The day started off pretty well with only the usual planes, trains and automobiles problems. The airport kept having dyslexia problems with sending people to gate C-12 for a flight at C-15, and C-15 for a flight at C-12. The attendants could not correct the problem because the airport runs the consoles. After an hour of calls and people running back and forth, the signs were finally updated 5 minutes before the flight was ready to board. Which then led to the next fun problem for the poor attendant. The plane we were supposed to fly had mechanical difficulties, and the airline had to do a last minute replacement with a slightly smaller plane. This meant that all the seats had to be moved around and new tickets issued for everyone.

The plane flight was pretty uneventful, and when I arrived I ran into Zonker Harris who was headed to Walden Pond for a bit of sunbathing. This solved the trains and automobile problems and we took the Interstate and other roads to Cape Cod. The drive was uneventful though it did remind me that Massachusetts is the one state that makes turn signals optional car equipment and car horns extra loud.

Flock is being held at a nice Golf resort in Hyannis. The room I have is on the second floor and it was nice to hear Seagulls in the distance. For dinner I had a cod dinner at the inhouse bar, and tonight I am working on getting my Wednesday Flock presentations better pictures. 

This evening, I am listening to Lynyrd Skynyrd who are playing next door. I expect FreeBird will be the closing song.

2017-06-22

Problems with EPEL and Fedora mirroring: Many Root Cause Analysis

There was a problem with EPEL and Fedora mirrors for the last 24 hours where people getting updates would get various errors like:

Updateinfo file is not valid XML:

The problem was caused by a problem in the compose which output the XML file not as XML but as SQLite. The problem was fixed within a couple of hours on the Fedora side, but it has taken a lot longer to fix further downstream.
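
One way to picture the kind of test that catches this is to sniff what a metadata file really is before trusting its name. The sketch below is an illustration only, not the check Fedora put in place: it assumes the payload has already been decompressed, and relies on the fact that SQLite database files start with the 16 bytes "SQLite format 3" followed by a NUL.

```python
import xml.etree.ElementTree as ET

# Every SQLite 3 database file begins with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def classify_metadata(payload):
    """Classify already-decompressed bytes as 'sqlite', 'xml' or 'unknown'."""
    if payload.startswith(SQLITE_MAGIC):
        return "sqlite"
    try:
        ET.fromstring(payload)
    except ET.ParseError:
        return "unknown"
    return "xml"


print(classify_metadata(b"<updates></updates>"))
print(classify_metadata(SQLITE_MAGIC + b"\x00" * 84))
```

A compose check along these lines would refuse to publish a file named as XML whose contents classify as anything else.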

  • Some of the Fedora mirror containers were not updating correctly. We use a docker container on each proxy to keep the data fresh. 4? of the 14 proxies said they were updating but seemed not to do so. These servers were our main IPv6 servers, so people getting updates from these were more affected than other users.
  • Some mirrors only update 1 or 2 times a day (or even slower). This means that your favourite mirror may keep the data for 12 to 48 hours.
  • Some client plugins like to peg to a quickest mirror to try and keep downloads fast. While we may tell you that there are 20 mirrors up to date, the plugin will use the one it got stuff fastest from in the past. This means you can end up going to a 'broken' mirror for a lot longer.
  • Some yum/dnf systems seem to have other options set to keep the bad xml file until it 'ages' out. This means that while an updated xml is there, some systems are still complaining because their box already has the old one.

The fixes on the Fedora side are to put in better tests to try and see that this does not happen again. The client side fixes are currently to do either one of the following:

  • yum clean all
  • yum clean metadata

Thank you all for your patience on this problem.

2017-06-07

Call for Papers: Flock to Fedora 2017

In summer, an old engineer's fancy turns to writing paper proposals. For it is time for people to submit papers to https://flocktofedora.org/. This year, Flock is being held in Cape Cod, Massachusetts from August 29 to September 1. Flock is also focusing on being a 'get-er-done' conference, where workshops on getting software problems worked on by many people will have focus. So do you have something you have wanted to get done in Fedora that needs a bunch of people from around the US and Europe to focus on? Put together a short proposal and submit it to https://register.flocktofedora.org/ [Oh and make sure that the people who you need to work with know about it.. and agree that they want to do it also. Surprise is the opposite of consensus.]

The CFP ends on July 15th 2017. Good luck. I am putting in a proposal for a fast moving EPEL workshop. For a more complete post on FLOCK talk/workshop requirements please see http://blog.linuxgrrl.com/2017/06/08/propose-a-talk-for-flock/

2017-05-30

The steam roller of life

Some days it really feels like you are the last man standing as the zombie horde rolls in, and sometimes it feels like people just seem to scream stop at every little thing. However, a lot of times it just looks like this to everyone else:


The security guard is doing his job and is the hero of his own story (in fact has an extra on DVD about his family.) He is trying to get the 'villains' to stop. Austin Powers is the hero in his story because he is just trying to get to the other side of the room to stop Doctor Evil. The vast gulf between the two is just how far apart they are and how little danger there really is. It is also a story about how avoidable the inevitable crunch at the end is.

  1. The guard could have stood to the left or right and let the steamroller go by. [The guard could have also shot Austin or something else.]
  2. Austin could have 'swerved to the left or right' just a little and missed the guard. [Or he could have gotten out and gotten there faster.]

OK so you are thinking "Yes Captain Obvious that is exactly the humour being shown here.. thank you for breaking it down for us..." The point I am looking at is how often this mirrors our online community problems. Someone is trying to accomplish something, and someone for whatever reason yells stop. (Or someone is meant to keep something stable, and someone is ramming through a new paradigm). Those of us in the moment get caught up in all the energy, and we forget that all most people outside see is how avoidable the whole confrontation was.

Sometimes we feel that it is better to get run over by the steamroller than take a step left or right. Sometimes we feel that putting the pedal to the metal on the steamroller is going to make this so much faster, and we can't move it to the right or left for a small change.

2017-05-24

Canaries in a coal mine (apropos nothing)


[This post is brought to you by Matthew Inman. Reading http://theoatmeal.com/comics/believe made me realize I don't listen enough and Veritasium's https://www.youtube.com/watch?v=UBVV8pch1dM made me realize why thinking is hard. I am writing this to remind myself when I forget and jump on some phrase.]

Various generations ago, part of my family was coal miners and some of their lore was still passed down many many years later. One of those was about the proverbial canary. A lot of people like to think that they are being a canary when they bring up a problem that they believe will cause great harm.. singing louder because they have run out of air.

That isn't what a canary does. The birds in the mines go silent when the air runs out. They may have died or are on the verge of being dead. They got quieter and quieter and what the miners listened for was the lack of noise from birds versus more noise. Of course it is very very hard to hear the birds in the first place in a mine because they aren't quiet places. There is hammering, and shoveling and footsteps echoing down long tubes.. so you might think.. bring more birds.. that just added more distractions and miners would get into fights because the damn birds never shut up. So the birds were few and far between and people would have to check up on the birds every now and then to see if they were still kicking. Safer mines would have some old fellow stay near the bird and if it died/passed out they would begin ringing a bell which could be heard down the hole.

So if analogies were 1:1, the time to worry is not when people are complaining a lot on a mailing list about some change. In fact if everyone complains, then you could interpret that you have too many birds and not enough miners so go ahead. The time to worry would be when things have changed but no one complains. Then you probably really need to look at getting out of the mine (or most likely you will find it is too late).

However analogies are rarely 1:1 or even 1:20. People are not birds, and you should pay attention to when changes cause a lot of consternation. Listen to why the change is causing problems or pain. Take some time to process it, and see what can be done to either alter the change or find a way for the person who is in pain to get out of pain.

2017-04-11

Moving EPEL-4 and EPEL-5 to archives

Today we say goodbye to the last parts of EPEL-5 (and also EPEL-4). The top level files in /pub/epel/4 and /pub/epel/5 were moved to /pub/archive/epel so that people who are still needing packages can get them from the archives. People using yum should not see any change in updates because mirrormanager had the changes to point to archives a couple of days previously.

For any kickstarts or scripts that used the main download servers all that needs to be done is change:


http://dl.fedoraproject.org/pub/epel/5/

to

http://dl.fedoraproject.org/pub/archive/epel/5/

and you can have your kickstart scripts grab the epel rpm from

http://dl.fedoraproject.org/pub/archive/epel/epel-release-latest-5.noarch.rpm
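
If there are many kickstarts or scripts to fix, the change can be applied mechanically. The following is a minimal sketch with a made-up function name; it only rewrites links on dl.fedoraproject.org for EPEL 4 and 5 and leaves everything else alone.

```python
import re

# Matches the old EPEL-4/EPEL-5 locations on the main download server.
_OLD = re.compile(r"(https?://dl\.fedoraproject\.org/pub/)epel/([45])/")


def point_at_archive(text):
    """Rewrite old /pub/epel/4/ and /pub/epel/5/ links to /pub/archive/epel/."""
    return _OLD.sub(r"\1archive/epel/\2/", text)


print(point_at_archive("url --url=http://dl.fedoraproject.org/pub/epel/5/i386/"))
```

Links that already point at the archive, or at newer EPEL releases, do not match the pattern and come back unchanged.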

Thanks again to everyone who has helped with EPEL-5 over the years. It was a good crazy ride.

2017-03-17

EPEL-5 article appearing on FedoraMagazine.org

So I thought I was not writing anything more about the EOL of EPEL-5, but I got asked by several people why no one had written anything about it 😐. The ability of my posts to reach the world was much smaller than I realized. In order to rectify that a bit, here is another article on the EOL of EPEL-5 this time at Fedora Magazine.

2017-02-15

IMPORTANT REMINDER: EL 5 is EOL on March 31, 2017

This is probably my final reminder on this before April 3rd 2017. As listed at https://access.redhat.com/support/policy/updates/errata and https://en.wikipedia.org/wiki/Red_Hat_Enterprise_Linux#Product_life_cycle Red Hat Enterprise Linux will be exiting "Production Phase 3", and CentOS will be archiving off old EL-5 releases.

At that point, all remaining EPEL-5 packages will be archived to /pub/archive/epel/5 for systems to get data from. No new updates or packages will be done after that.

2017-02-14

Trying to get an idea about what packages are used

Background

One of the questions I get asked a lot is "You provide various statistics for Fedora, can you show which packages are installed the most?"

To head off a lot of future requests, the answer is no, no I can't. We do not have any sort of popcorn database which shows what packages are popular. When a user requests the OS to install a package, there is no "Hey I am asking for Bob if I can install libfoobar" that gets sent to the Fedora servers. What yum, dnf, PackageKit, or Salt does is request the repo data, look to see if there is a way to figure out what is wanted, and then ask for any packages that it needs to get.

It is from this data that I can sort of glean some idea of the most installed packages.. but I feel it is way past "Lies", "Damn Lies", and "Statistics" into regions like "Political Promises" or "Half Life 3 confirmed". Looking over an entire month of requests, sorting the data, and ranking the requests, I find that a bunch of packages show up a lot while others fall off in a long tail. Things that make this data dirty are the fact that if 200 people ask for wordpress, 150 for mediawiki and 90 for nagios.. I will see various PHP trunk packages that all three want as a higher number. I can't simply tell if the person wanted that PHP package by itself or wanted wordpress. [I could possibly try and work out a transaction of requested packages and figure out what nodes and leaves there might be.. but I found that the tools don't always request everything they want from download.fedoraproject.org, because they possibly already 'know' where something is.]
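
The counting itself is simple; a sketch of the idea follows. It is not the script used for the lists in this post: it assumes the requested paths have already been pulled out of the logs, that file names follow the usual name-version-release.arch.rpm shape, and the sample paths are invented.

```python
from collections import Counter
import posixpath


def package_key(path):
    """Reduce '/x/y/name-version-release.arch.rpm' to 'name-major', or None."""
    filename = posixpath.basename(path)
    if not filename.endswith(".rpm"):
        return None
    stem = filename[: -len(".rpm")]
    parts = stem.rsplit("-", 2)
    if len(parts) != 3:
        return None
    name, version, _release_arch = parts
    return f"{name}-{version.split('.')[0]}"


def rank_requests(paths, top=20):
    """Count requests per 'name-major' key and return the most common ones."""
    counts = Counter(key for key in map(package_key, paths) if key)
    return counts.most_common(top)


# Invented request paths, only to show the shape of the input.
sample_paths = [
    "/pub/epel/7/x86_64/h/htop-2.0.2-1.el7.x86_64.rpm",
    "/pub/epel/7/x86_64/c/cowsay-3.04-4.el7.noarch.rpm",
    "/pub/epel/7/x86_64/h/htop-2.0.2-1.el7.x86_64.rpm",
    "/pub/epel/7/x86_64/repodata/repomd.xml",
]
print(rank_requests(sample_paths))
```

Note that this counts every file fetched, so it does nothing about the dependency problem described above: a library pulled in by three popular applications still outranks each of them.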

In any case, here are the most requested packages to the download website for January.

EPEL-7

  1. epel-release-7-9
  2. python2-pip-8
  3. python2-boto-2
  4. openvpn-2
  5. php-tcpdf-6
  6. php-tcpdf-dejavu-sans-fonts-6
  7. pdc-updater-0
  8. duplicity-0
  9. nagios-plugins-2 *lots of plugins show up here*
  10. ansible-2
  11. libopendkim-2
  12. opendkim-2
  13. cowsay-3
  14. python2-wikitcms-2
  15. pkcs11-helper-1
  16. fedmsg-0
  17. htop-2
  18. munin *lots of munin packages here*
  19. awscli-1
  20. hdf5-1

EPEL-6

  1. nagios-plugins-2 *lots of other nagios removed*
  2. libmcrypt-2
  3. nodejs-0 *lots of other nodejs removed*
  4. python2-boto-2
  5. GeoIP-1 *other GeoIP removed*
  6. geoipupdate-2
  7. nrpe-2
  8. libnet-1
  9. denyhosts-2
  10. eventlog-0
  11. syslog-ng-3
  12. epel-release-6-8
  13. php-pear-Auth-SASL-1
  14. php-pear-Net-SMTP-1
  15. php-pear-Net-Socket-1
  16. perl-Net-IDN-Encode-2
  17. perl-Net-Whois-Raw-2
  18. perl-Regexp-IPv6-0
  19. pwhois-2
  20. v8

EPEL-6 is our most popular distribution, with a ratio of about 12 EPEL-6 : 7 EPEL-7 : 1.5 Fedora 25 : 1 EPEL-5 requests over the month of January.
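
To read that ratio as shares of the requests across just these four, a quick worked calculation helps. The numbers are the rounded ratio from the sentence above, so the percentages are equally approximate.

```python
def shares(ratio):
    """Turn a ratio like {'a': 12, 'b': 7} into percentages of the total."""
    total = sum(ratio.values())
    return {name: round(100 * part / total, 1) for name, part in ratio.items()}


january = {"EPEL-6": 12, "EPEL-7": 7, "Fedora 25": 1.5, "EPEL-5": 1}
print(shares(january))
# {'EPEL-6': 55.8, 'EPEL-7': 32.6, 'Fedora 25': 7.0, 'EPEL-5': 4.7}
```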

EPEL-5

  1. R-core-3 *lots of other R packages removed*
  2. globus-gssapi-gsi-devel-12 *lots of other globus removed*
  3. nordugrid-arc-5
  4. xrootd-client-libs-4 *lots of other xrootd removed*
  5. pcp-libs-devel-3
  6. nordugrid-arc-devel-5
  7. libopendkim-2
  8. libopendmarc-1
  9. pcp-libs-3
  10. nordugrid-arc-plugins-globus-5
  11. libopendkim-devel-2
  12. libopendmarc-1
  13. ebtree-6
  14. myproxy-libs-6
  15. mosh-1
  16. lua-cyrussasl-1
  17. drupal7
  18. rear-2
  19. clustershell-1
  20. rsnapshot-1

I found it interesting that R was getting pulled in by a lot of computers on EPEL-5. This OS is almost end of lifed, but it looks like systems are still getting provisioned with it.

Fedora 25

  1. java-1
  2. vim-minimal-8
  3. kernel-core-4
  4. libX11-1
  5. perl-libs-5
  6. perl-5
  7. perl-IO-1
  8. perl-macros-5
  9. perl-Errno-1
  10. nss-3
  11. gdk-pixbuf2-2
  12. gtk3-3
  13. audit-libs-2
  14. nss-softokn-freebl-3
  15. libX11-common-1
  16. gdk-pixbuf2-modules-2
  17. libnl3-3
  18. gnutls-3
  19. pcre-8
  20. gtk-update-icon-cache-3

As can be seen from the Fedora 25 list, there is another problem with my trying to get an idea of packages.. a package that is installed on a lot of boxes will also show up whenever it gets updated.

Conclusions

I really don't think any 'real' conclusions can come out of this other than people really want vim on their Fedora 25 desktops (emacs was way down the list). 😑 I also want to say that we should get an opt-in popcorn for Fedora :).

[Edited: I forgot this part]

This list of agents which get used to pull down packages for EPEL and Fedora was rather interesting. I combined all the yum together as the many different versions kind of polluted the numbers but here are the top agents:


  1. yum
  2. Salt
  3. dnf
  4. Artifactory
  5. python-requests
  6. Debian Apt-Cacher-NG
  7. PackageKit-hawkey
  8. Axel 2.4 (Linux)
  9. Wget
  10. libdnf
  11. curl
  12. urlgrabber

The Salt requests seem to come from a large number of Amazon systems which are installing either epel-release-6 (80% of the time) or epel-release-7 (20% of the time). Nothing else seemed to be 'pulled' from download.fedoraproject.org, so it is probably just a config artifact on bootup.
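
Combining the many yum versions into one line comes down to mapping each user agent string to a family before counting. Here is a minimal sketch of that step; the family list, the matching rule, and most of the sample strings are assumptions for illustration, not the exact procedure behind the list above.

```python
from collections import Counter

# Checked in order; the first family whose token appears in the string wins.
# 'libdnf' is listed before 'dnf' so it is not swallowed by the shorter name.
FAMILIES = ["yum", "libdnf", "dnf", "Salt", "Wget", "curl"]


def agent_family(user_agent):
    """Map a raw user agent string to a coarse family name."""
    lowered = user_agent.lower()
    for family in FAMILIES:
        if family.lower() in lowered:
            return family
    # Unknown agents keep the part before the first '/', if any.
    return user_agent.split("/", 1)[0].strip() or "unknown"


# Hypothetical log values; versions are made up.
sample_agents = [
    "urlgrabber/3.10 yum/3.4.3",
    "urlgrabber/3.9 yum/3.2.29",
    "dnf/1.1.10",
    "libdnf/0.7.0",
    "Wget/1.14 (linux-gnu)",
    "Axel 2.4 (Linux)",
]
counts = Counter(agent_family(agent) for agent in sample_agents)
print(counts.most_common())
```

Substring matching is crude (any agent that merely mentions "curl" would be folded into curl), so a real run would want a stricter rule per family.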

2017-02-07

Major update to Nagios in Fedora Rawhide and EPEL-7 [moving to 4.2.4]

After a couple months of work, I have put together an updated package for Fedora Rawhide and EPEL-7 today. I expect it will have some 'problems' and so have moved the needed karma to 4 and am looking for people to test and give it negative karma with feedback for items broken.

I will work on getting those done this week so we can try and have working versions of Nagios for Fedora Server 26 and EPEL. Currently I expect it to need changes to the selinux policies for both and may need some additional work there. I am working through the processes for getting those done.

EPEL-6 will need some more work because the rpmbuild is complaining that it can't make /var/run/nagios for some reason.

Creating a new update for  nagios-4.2.4-2.el7 
================================================================================
     nagios-4.2.4-2.el7
================================================================================
  Update ID: FEDORA-EPEL-2017-0f3297a19b
    Release: Fedora EPEL 7
     Status: pending
       Type: security
      Karma: 0
    Request: testing
       Bugs: 1288989 - None
           : 1289710 - None
           : 1299166 - None
           : 1322666 - None
           : 1329857 - None
           : 1330627 - None
           : 1341683 - None
           : 1405365 - None
           : 1411399 - None
      Notes: Major Update. Fixes various CVE and other issues.
  Submitter: smooge
  Submitted: 2017-02-07 23:46:16
   Comments: bodhi - 2017-02-07 23:46:17 (karma 0)
             This update has been submitted for testing by smooge.

  https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2017-0f3297a19b


Major update to Fedora/EPEL moving to nrpe-3.0.1

The version of nrpe in Fedora has been 2.15 for a very long time while the upstream Nagios group moved to a 3.0 series. With some work and a lot of help from my friends, I have put an updated nrpe into EPEL-testing for EPEL-6 and EPEL-7 and in Fedora Rawhide.

EPEL

I have set the EPEL update karma for this to 4 versus 3, as I would like some more testing done by people before it goes out. If it gets a lot of negative karma I will pull it and work with upstream to get a working version into EPEL.

Rawhide

This version of nrpe was the 'fun' one. This is due to the fact that the newer OpenSSL does not allow for introspection of various structures which it used to. Working with Tomas Mraz and Patrick Uiterwijk, I believe I have a semi-working version. [A secondary problem was that I had to pull some sslv2 code out because we do not ship with those libraries anymore. I am hoping upstream will come up with a better fix than my hacksaw method.]

Fedora 25

I have not put in an update to Fedora 25 because it is a major update and was not listed as a change request. I am looking through what needs to be done for this, and when I have gotten any approvals needed will publish it to Fedora 25 testing.

2017-02-01

Reminder to self: Using Dennett's principles

These last couple of weeks have been a complete emotional and intellectual mess for me. People are arguing to the right of me, to the left of me and everyone seems to have pulled out their favorite purity test to try and prove if someone is good enough to be in their camp.

In trying to come up with clearer axioms to gauge all the various poop-storms without getting into emotives or purity tests, I ran into this article about Daniel Dennett's tools for trying to be a critical thinker. I will repeat the paraphrase below:

  1. Accept you make mistakes, and then use them to be a better person.
  2. Respect your opponent. 
  3. Beware of "surely" as it is overused as a rhetorical device to avoid critical thinking by assuming something is sure.
  4. Answer rhetorical questions. As with 'surely', using a rhetorical question is a way to avoid thinking about something by being facetious.
  5. Employ Occam's Razor
  6. Employ Sturgeon's Law. 90% of everything is rubbish... [which cuts both ways.] Don't waste your time defending rubbish and don't waste your time attacking it. Work on the 10-20% which isn't.
  7. Avoid deepities... things which are deep and profound but not well defined. [Another way of looking at it is "If it sounds too good to be true, it probably is".] 

Anyway thanks to Jonathan Corbet of lwn.net for reminding me of the Sturgeon's law wikipedia article which led me to that piece.

2017-01-17

Mea Culpa: Fedora Elections

As announced here, here , and here, the Fedora Election cycle for the start of the 25 release has been done. Congratulations on the winners. Now if you notice there were less than 250 voters for any of the elections out of multiple thousand of eligible voters.. I am not one of them.

It is not like the elections were not announced before, at the start, and right before they ended. Yet somehow.. I missed every one of these emails. I caught various emails on NFS changing configurations, proposed changes to Fedora 26 and 27, or various retired packages.. but I completely spaced the elections. I was actually sending an email asking when they would be held when someone congratulated Kevin Fenzi on IRC about winning.

So to the winners of this cycle of elections. Congratulations. To all the people who put in the hard work of running elections (and having run several it is a LOT of hard work), my sincere apologies for somehow missing it.

2017-01-07

Fedora/EPEL Mirrormanager problems in Asia Pacific countries.

We have been getting a lot of reports of people unable to get updates for EPEL or Fedora at various times. What people are seeing is that they will do a 'yum update' and it will give a long list of failures and quit. At this moment we seem to have pinpointed that most of the people having this problem are in various Asia Pacific nations (primarily Australia and Japan). The problem for both of these seems to be a lack of cross connects between networks.

In the US, if you are on Comcast in say New Mexico and going to a server on Time Warner in North Carolina, your route is usually pretty direct. You will go from one network to various third party providers who will then send the packets along the quickest path to the eventual server. If you plot the hops with a visual route grapher, you even find that the route usually follows a roughly straight line. [You might end up going to say California or Seattle first but that is only when Texas and Colorado cross connects are full.] In most European countries you see similar routing.

In the various Pacific and Indian Ocean countries, you do not see similar interconnects. You can watch a system in Sydney on one network send packets all the way to San Francisco and back again to a server in Sydney because the two telecoms do not 'talk' with each other. This seems to happen also in Japan for a couple of telecom networks. The result of this is that it is much more expensive to mirror data in those countries than you would think. For users it might be faster to get data from mainland China or the United States than it is to get it from a server only hundreds of miles away.

The problem is that mirrormanager is currently not coded to deal with that. It makes the optimistic assumption that if you are in Adelaide and the nearest server is in Sydney, you should go to that server. The mirror in Sydney, though, may still be catching up on data it pulls from the mainland US (or, if the mirror admin assumed that an Asia Pacific mirror is the one to sync from, it may be pulling data from a server 20 miles away physically and several tens of thousands of miles away by network). The mirrormanager developers are trying to figure out ways to deal with this without making servers and clients send each other network maps with throughput charts to figure things out. [And no, the fastest mirror yum plugin doesn't fix this for all/most people. It uses a very simple 'works for me' test to figure out which mirrors might be a good match at one point in time. You could end up using a mirror that is poor 90% of the time because it looked good the one time the plugin set itself up. It also just uses the "hey, ping is fast" heuristic, which breaks for people on various networks. Improving the fastest mirror plugin would be useful if someone did it.]

So what to do? For EPEL, the current fix is to edit your /etc/yum.repos.d/epel.repo files by adding '&country=global' to the end of the mirrorlist line:



[epel]
name=Extra Packages for Enterprise Linux 7 - $basearch
#baseurl=http://download.fedoraproject.org/pub/epel/7/$basearch
mirrorlist=https://mirrors.fedoraproject.org/metalink?repo=epel-7&arch=$basearch&country=global
failovermethod=priority
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7


This will cause yum to ask for the global mirror list instead of the 'local' one, so you will get all the mirrors. This will usually give a few servers which are in sync even if they are not local.
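
If you have more than a couple of machines to fix, the edit can be scripted. What follows is only a rough sketch in Python, not an official tool: the function name add_global_country is made up for illustration, and treating 'metalink=' lines the same as 'mirrorlist=' lines is my assumption, since some repo files use that key instead. It works on the text of a repo file, skips commented-out lines, and leaves alone any line that already carries a country parameter, so running it twice is harmless.

```python
def add_global_country(repo_text: str) -> str:
    """Append country=global to active mirrorlist/metalink lines.

    Hypothetical helper for illustration; plain string handling is used
    so that yum variables such as $basearch are left untouched.
    """
    out = []
    for line in repo_text.splitlines():
        stripped = line.strip()
        # Commented-out lines start with '#', so they never match here.
        is_mirror_line = stripped.startswith(("mirrorlist=", "metalink="))
        if is_mirror_line and "country=" not in stripped:
            # Use '&' when the URL already has a query string, '?' otherwise.
            sep = "&" if "?" in stripped else "?"
            line = line.rstrip() + sep + "country=global"
        out.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if repo_text.endswith("\n") else "")
```

To use it, read /etc/yum.repos.d/epel.repo into a string, pass it through the function, and write the result back (keep a backup copy first). Run 'yum clean metadata' afterwards so yum fetches a fresh mirror list.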