2017-09-06

Flock 2017 : Summary

The trip to Flock 2017 in Cape Cod was a nice excursion where I learned a lot. I had not been able to go to the two previous Flocks in Rochester, NY or Poland, so I was not up to date on many things. It was very nice to see many people whom I had not seen in 2 years and to catch up with many projects which I had heard of, and even installed servers for, but did not know in much detail.

The days were mostly a blur of going to a couple of talks per day, a lot of hallway-track conversations, and helping with a couple of outages that were happening at the time. So the following is a shortened summary:

Monday: Day 0

I posted on this earlier. The day was a pretty good one and I got to let someone else drive through Massachusetts traffic.

Tuesday: Day 1

I wanted to make sure I did not sleep through the opening day talks (something I have been known to do), so I got up extra early, had a big breakfast with some guests from Europe, and made it in time to sit up front. Matthew Miller gave a nice talk on the status of Fedora and was able to show some pretty pictures from data I helped collect. After trying to advertise the EPEL State of the Union talk, I then went to do some hallway meetings and talked with kernel, FESCo and various developers about x86_32 support in Fedora. This was so I could report to the x86 committee at a meeting on 2017-09-06.
Later in the day, I went to see Tom Callaway give a talk on licenses and the importance of a strong liver when dealing with them. It was interesting to see how far we have come in so many years. I had hoped to then go to the Fedora on Windows Subsystem talk, as I have been using Cygwin on Windows for years and wanted to see how this worked as well. However, a work item came up and I was pretty much booked until later in the evening.

Wednesday: Day 2

Today was the EPEL State of the Union talk. I spent the morning working on a blog post about everything I was going to say, only to hit Ctrl-A and backspace at the wrong moment. Goodbye writing. I am going to go over the particulars in a different post. The two talks went pretty well, but I need to go over the videos to see what I actually said versus what I think I said. After the talks, I got to ride in a Tesla and also play various boardwalk games at a nice retro playplace. I finally went back and crashed for a bit, but woke up with insomnia until 4am.

Thursday: Day 3

This day got off to a rough start. I was really, really tired and almost fell asleep at the Fedora Infrastructure State of the Union talk. I went back to the room at 1300 for a power nap and woke up after 1700. Went to see if anything was still active and had some more hallway talks about EPEL and other architectures. Finally went back to bed at 2200 and slept soundly.

Friday: Day 4

Had a nice breakfast with most of the Fedora Infrastructure team, and then did a fast jog to catch my bus to the airport. The bus ride was supposed to be 90 minutes, which would allow me 2 hours to get through security. Sadly, the Friday before Labor Day weekend does not lead to a 90 minute bus ride. At 3 hours and somewhat, I got to the airport with just enough time for a very last-minute dash through security and everything else. I got onto the plane before the doors closed, and was able to fly home to be greeted by the last remnants of Hurricane Harvey. We only had 40 minutes of rain from it, but even as a smidgen of what eastern Texas got, it was incredibly heavy rain and hail. Got home and crashed.

Fedora Project Outage RCA :: DNS Outage 2017-09-06


Early on 2017-09-06, many people attempting to reach fedoraproject.org
found that it had disappeared from the internet. People attempting to
do 'yum/dnf install', browse the website, or carry out other
Internet-related activities were getting various error messages that
the sites no longer existed in DNS. Some people had no difficulty and
were not able to duplicate the problem, but anyone who was using a DNS
server that had DNSSEC checking turned on was unable to get any IP
address lookups related to the site.

The problem was due to a misconfigured record in the registrar's data
about DNS. The previous week, multiple records had been added by the
registrar to the DNS data in the .org. DNS table. The records were the
DNSSEC records for fedorapeople.org, fedorahosted.org, and
fedoraproject.org, and the registrar had added all of them to
fedoraproject.org. rather than each to its correct zone. On seeing
this, I asked for two of the records to be removed, but confused which
one was to stay. This meant that the key meant for fedorahosted.org.
was left on fedoraproject.org. and the keys for fedoraproject.org. and
fedorapeople.org. were removed.
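
As a rough illustration of why leaving the wrong key in place breaks
lookups, here is a minimal Python sketch. It is a toy model and not how
any real resolver is implemented: the key tag numbers, the zone_keys
table, and the function names are all made up for this example. The
only idea it takes from DNSSEC is that a validating resolver needs at
least one record published in the parent (.org.) to match a key the
zone actually signs with.

```python
# Toy model of the DNSSEC parent/child check. All key tags are invented.

# Keys each zone really signs with (hypothetical key tags).
zone_keys = {
    "fedoraproject.org.": {11111},
    "fedorapeople.org.": {22222},
    "fedorahosted.org.": {33333},
}

def chain_is_valid(parent_ds, zone):
    """True if the parent publishes no records for the zone (treated as an
    unsigned delegation) or at least one record matches a key of the zone."""
    ds_tags = parent_ds.get(zone, set())
    if not ds_tags:
        return True
    return bool(ds_tags & zone_keys[zone])

def lookup(parent_ds, zone, validating):
    """Return 'answer' or 'SERVFAIL' as a resolver would in this model."""
    if validating and not chain_is_valid(parent_ds, zone):
        return "SERVFAIL"
    return "answer"

# Step 1: all three records were attached to fedoraproject.org.
after_registrar_change = {"fedoraproject.org.": {11111, 22222, 33333}}

# Step 2: two records were removed, but the wrong one was kept.
after_mistaken_cleanup = {"fedoraproject.org.": {33333}}

# Step 3: the correct record is put in place.
after_fix = {"fedoraproject.org.": {11111}}

if __name__ == "__main__":
    states = [
        ("after registrar change", after_registrar_change),
        ("after mistaken cleanup", after_mistaken_cleanup),
        ("after fix", after_fix),
    ]
    for label, state in states:
        for validating in (True, False):
            result = lookup(state, "fedoraproject.org.", validating)
            print(label, "validating" if validating else "non-validating", result)
```

In this model the zone still validates after step 1 because one of the
three records matches, and only step 2 breaks it. Resolvers that do not
validate return an answer in every state, which matches some people
being unable to reproduce the problem.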

When the registrar updated its .org. data early UTC on 2017-09-06, DNS
servers like Google's 8.8.8.8 would no longer show any addresses
inside of Fedora's DNS tables. Other DNS servers also stopped working,
and people from the IETF who work on DNSSEC came in to help in case
there was some other problem going on.

After diagnosing the problem, Fedora IT contacted the registrar and
got the correct DNSSEC keys added to the master table. This cleaned up
the problems with many DNS servers, but some will cache the broken data
for up to the TTL of 24 hours, so users were still having problems as
of 2200 UTC 2017-09-06. A temporary fix is to hard code the main proxy
IP address into /etc/hosts; however, this can cause problems later if
the entry is not removed and the main proxy is down for maintenance.
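
Since a forgotten /etc/hosts entry is the risk with that workaround, a
small script can help find entries to clean up later. The following
Python sketch is only an illustration: the function name
find_pinned_entries is made up, and 192.0.2.10 is a documentation
address standing in for the real proxy IP, which is not given here.

```python
def find_pinned_entries(hosts_text, domain):
    """Return (line_number, ip, hostname) for each hosts entry whose
    hostname is the domain itself or a subdomain of it."""
    domain = domain.rstrip(".").lower()
    found = []
    for number, line in enumerate(hosts_text.splitlines(), start=1):
        # Drop comments and surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        fields = entry.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            name = name.rstrip(".").lower()
            if name == domain or name.endswith("." + domain):
                found.append((number, ip, name))
    return found

# Example hosts content; 192.0.2.10 is a placeholder address.
sample = """127.0.0.1 localhost
# temporary workaround for the DNS outage
192.0.2.10 fedoraproject.org mirrors.fedoraproject.org
"""

if __name__ == "__main__":
    for number, ip, name in find_pinned_entries(sample, "fedoraproject.org"):
        print(f"line {number}: {name} pinned to {ip}")
```

To check a real machine, pass in the contents of /etc/hosts instead of
the sample text.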

I would like to thank the members of the IETF DNSSEC group who took
the time to help us through this problem. I would also like to
apologize to everyone who had disruption due to this.