How to clone (a lot of) package spec files from CentOS 8 git.centos.org

Recently I have had to try and work out all the dependencies of a set of packages. I am writing this up as a blog post because I needed to recreate work that I had done several times in the past, but which past Smoogen had not fully documented. [I went looking and found I had 3 different copies of trees of packages from Fedora and CentOS but absolutely no notes, and my bash_history had cycled over the 10k lines I normally keep. Past Smoogen was a BAD, BAD sysadmin.]

For general requires, I would do this by using dnf:

$ dnf repoquery --requires bash
Last metadata expiration check: 2:44:24 ago on Fri 03 Sep 2021 10:14:22 EDT.
filesystem >= 3

However, in this case I also needed to work out all the buildrequires of the packages, and then the requires and buildrequires of those tools. Basically it is sort of building a buildroot for a set of leaf packages, which usually means I need to get the spec files and parse them with something like spectool.
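As a rough illustration of the "parse the spec files" step, here is a minimal sketch that pulls the package names off the BuildRequires: lines of a single spec file. The function name list_buildrequires is made up for this example, and it is only a plain text scan: macros and %if conditionals are not evaluated, so treat the result as a first approximation rather than what rpm itself would compute.

```shell
# list_buildrequires SPECFILE
# Print the package names found on the BuildRequires: lines of a spec file,
# one per line, sorted and de-duplicated. Version constraints such as
# ">= 3" are dropped.
# ASSUMPTION: plain text scan only; macros and %if blocks are not expanded.
list_buildrequires() {
    sed -n -e 's/^[Bb]uild[Rr]equires:[[:space:]]*//p' "$1" \
        | tr ',' ' ' \
        | awk '{
            for (i = 1; i <= NF; i++) {
                # skip a comparison operator and the version after it
                if ($i ~ /^[<>=]+$/) { i++; continue }
                print $i
            }
        }' \
        | sort -u
}
```

Calling list_buildrequires on a spec file path prints one build dependency per line. For anything beyond a rough pass, a macro-aware query such as rpmspec -q --buildrequires is probably the better tool.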

If I were working with Fedora, I would take the shallow git clone of src.fedoraproject.org, which can be found at https://src.fedoraproject.org/lookaside/. Then I would start going down my list of 'known' software I need to work with and clone out the usual 'buildroot' that a fedpkg mockbuild of the package would give. However, I am working with CentOS Stream 8, which has a slightly different repository layout and does not have a prebuilt shallow clone.

Thankfully, the CentOS repository has a very useful sub-repository called https://git.centos.org/centos-git-common.git which contains all the tools to fetch the appropriate code and tools from the CentOS src repository. The first one I need to work with is centos.git.repolist.py, which queries the pagure API and gets a list of packages. I then need to clean up that list a bit because it contains some forks, but the following will get me a complete list of the packages I want to parse:

[centos-git-common (master)]$ ./centos.git.repolist.py | grep -v '/forks/' > repolist-2021-09-03
[centos-git-common (master)]$ ./centos.git.repolist.py --namespace modules | grep -v '/forks/' >> repolist-2021-09-03
[centos-git-common (master)]$ sort -o repolist-2021-09-03 -u repolist-2021-09-03
[centos-git-common (master)]$ wc -l repolist-2021-09-03
8769 repolist-2021-09-03

That is a lot more packages than CentOS ships with as there are just over 2600 src.rpm packages in vault.centos.org for AppStream, BaseOS, and PowerTools. What is going on here?

It turns out there are two different events happening:

  1. Buildroot packages
  2. SIG packages

Buildroot packages are the ones which are not shipped with EL8 but are needed to build EL8. [EL-8 was not meant to be build-complete, but just the packages that would be supportable by Red Hat.] SIG packages are various ones which CentOS SIGs are supporting for projects like virtualization, hyperscale, and automotive. I may need to trace through all of these for my own package set, so I decided to clone them all first and then try to figure out what is needed afterwards.
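One way to see which repositories fall outside the shipped set is to compare the repo list against a list of shipped source package names. The sketch below assumes such a list exists as a plain file with one package name per line (I am not showing how to build it here), and that each line of the repo list is a URL whose last path component is the repository name. The function name repos_not_in is made up for this example.

```shell
# repos_not_in REPOLIST NAMES
# Print the repository names from REPOLIST (one URL per line) that do not
# appear in NAMES (one package name per line).
# ASSUMPTION: the repository name is the last path component of each URL;
# an optional trailing ".git" is stripped.
repos_not_in() {
    tmp_repos=$(mktemp)
    tmp_names=$(mktemp)
    sed -e 's|.*/||' -e 's|\.git$||' "$1" | sort -u > "$tmp_repos"
    sort -u "$2" > "$tmp_names"
    # comm -23 keeps the lines that are only in the first file
    comm -23 "$tmp_repos" "$tmp_names"
    rm -f "$tmp_repos" "$tmp_names"
}
```

Note that this merges the rpms and modules namespaces, so a name that exists in both is only reported once.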

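#!/bin/bash
# A silly script to clone all the repositories from git.centos.org in
# order to work out things like buildorder and other tasks.
# Set up our local repo place. I have lots of space on /srv.
# CLONE_DIR (where to clone) and CLONE_LIST (the repolist file name inside
# CLONE_DIR) must be set in the environment before running this.
: "${CLONE_DIR:?set CLONE_DIR to the directory to clone into}"
: "${CLONE_LIST:?set CLONE_LIST to the repolist file name in CLONE_DIR}"

# loop over the namespaces we want to clone. It would have been nicer
# if there was a third namespace for sigs, but I don't think
# namespaces really happened when centos-7 was setting up git.centos.org.
for namespace in rpms modules; do
    mkdir -vp "${CLONE_DIR}/${namespace}"
    cd "${CLONE_DIR}/${namespace}" || exit 1
    for repo in $( grep "/${namespace}/" "${CLONE_DIR}/${CLONE_LIST}" ); do
        repodir=$( basename "${repo}" )
        git clone "${repo}"
        if [[ -d ${repodir} ]]; then
            pushd "${repodir}" &> /dev/null
            X=$( git branch --show-current )
            if [[ ${X} =~ 'c8s' ]]; then
                # already on a c8s branch
                echo "${repodir} ${X}"
            else
                # not on c8s: flag the repo if a c8s branch exists at all
                if git branch -a | grep -q c8s; then
                    echo "${repodir} ${X} xxx"
                fi
            fi
            popd &> /dev/null
        fi
        # pause between clones to go easy on the server
        sleep 1
    done
done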

Running this takes a couple of hours, with a lot of errors about empty directories (for git repos which seem to have been created but never 'filled') and for non-existent branches (my script just looks for a c8s branch, but some branches were named slightly differently). In either case, I end up with the git repos I was trying to remember how I got earlier.
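For the branches that are named slightly differently, a small filter over the output of git branch -a can pick the first branch whose name contains c8s. This is only a sketch: the function name pick_c8s_branch is made up, it assumes the remote is called origin, and taking the first match in sort order is an arbitrary choice on my part.

```shell
# pick_c8s_branch
# Read `git branch -a` output on stdin and print the first branch (in sort
# order) whose name contains "c8s", with the remotes/origin/ prefix removed.
# Prints nothing if no such branch exists.
# ASSUMPTION: the remote is named "origin".
pick_c8s_branch() {
    sed -n -e '/->/d' \
           -e 's/^[* ]*//' \
           -e 's|^remotes/origin/||' \
           -e '/c8s/p' \
        | sort -u \
        | head -n 1
}
```

Inside a cloned repository this could be used as git branch -a | pick_c8s_branch, with the result passed to git checkout when it is not empty.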
