# Mirror Checker
This mirror status checker determines whether the CSC mirror is up to date with upstream.
## How To Run
A configuration file may be provided through standard input. Without a configuration file, run `python main.py`; by default, all available distributions are checked. With a configuration file, run `python main.py < name_of_config_file.in`, for example `python main.py < example.in`; in this case, only the distributions listed in the configuration file are checked.
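
As a rough illustration of the selection logic described above, the sketch below reads project names from a stream. The file format (one name per line), the function name `select_projects`, and the `AVAILABLE` list are assumptions made for this example and are not taken from `main.py`.

```python
import io

# Hypothetical list of known checkers; the real project keeps its own list.
AVAILABLE = ["debian", "ubuntu", "archlinux"]


def select_projects(stream, available):
    """Return the projects to check, given a configuration stream.

    Assumed format: one project name per line; blank lines are ignored.
    An empty stream selects every available project.
    """
    requested = [line.strip() for line in stream if line.strip()]
    if not requested:
        return list(available)
    unknown = [name for name in requested if name not in available]
    if unknown:
        raise ValueError(f"unknown projects: {', '.join(unknown)}")
    return requested


# Simulate `python main.py < example.in` with an in-memory stream.
print(select_projects(io.StringIO("debian\nubuntu\n"), AVAILABLE))
# Simulate a run with no configuration at all.
print(select_projects(io.StringIO(""), AVAILABLE))
```

A real entry point would pass `sys.stdin` instead of a `StringIO`, probably after checking `sys.stdin.isatty()` so that an interactive run does not wait for input.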
## Dev Notes
How the program works: `project.py` defines a general mirror checker class that checks whether the timestamp in the mirror's directory is in sync with upstream. Each CSC mirror then gets its own class, which inherits from the general class but often overrides the original check function with one specific to that mirror. A few recurring approaches: some checkers read a mirror status tracker provided by the mirrored project; others check the Release files for every version of a distro. Website information that all the mirror checker classes need is stored in `data.json`.
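
The structure described above could be sketched roughly as follows. This is not the project's actual code: the injected `fetch` callable, the `TrackerProject` class, the JSON keys (`csc`, `upstream`, `file`, `tracker`, `mirror_host`), and the example URLs are all hypothetical, chosen so the sketch runs without network access. It only shows the pattern of a general timestamp check plus a subclass that overrides `check` to use a status tracker.

```python
import json


class Project:
    """Generic checker: compare a timestamp file on the mirror with upstream."""

    def __init__(self, fetch):
        # `fetch` maps a URL to the text found there (hypothetical helper,
        # injected so the sketch can run offline).
        self.fetch = fetch

    def check(self, data, project):
        site = data[project]
        mirror_ts = self.fetch(site["csc"] + site["file"]).strip()
        upstream_ts = self.fetch(site["upstream"] + site["file"]).strip()
        return mirror_ts == upstream_ts


class TrackerProject(Project):
    """Mirror-specific checker that reads an upstream status tracker page."""

    def check(self, data, project):
        site = data[project]
        page = self.fetch(site["tracker"])
        # Hypothetical rule: the tracker names every in-sync mirror host.
        return site["mirror_host"] in page


# Stand-in for data.json; the real file's layout may differ.
DATA = json.loads("""
{
  "exampledistro": {
    "csc": "http://mirror.example/exampledistro/",
    "upstream": "http://upstream.example/exampledistro/",
    "file": "lastsync"
  },
  "trackeddistro": {
    "tracker": "http://upstream.example/mirrors/status",
    "mirror_host": "mirror.example"
  }
}
""")

# Canned responses that replace real HTTP requests.
PAGES = {
    "http://mirror.example/exampledistro/lastsync": "1633772096\n",
    "http://upstream.example/exampledistro/lastsync": "1633772096\n",
    "http://upstream.example/mirrors/status": "in sync: mirror.example",
}
fetch = PAGES.__getitem__

print(Project(fetch).check(DATA, "exampledistro"))
print(TrackerProject(fetch).check(DATA, "trackeddistro"))
```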
Future notes: Many of the mirror checkers are built very specifically for one mirror, so a slight change in how a project manages its mirror-related websites, public repos, etc. can drastically affect whether its checker still works correctly. These problems are unfortunately also very hard to detect, so it is important that CSC actively maintains the mirror checker so that it keeps working as intended in the long term.
Extra notes: A test client for individual mirror checker classes is provided as `test.py`. To use it, change all occurrences of the imported project class to the class you want to test.
## Resources
- [CSC Mirror](http://mirror.csclub.uwaterloo.ca/)
- [Debian Mirror Status Checker](https://mirror-master.debian.org/status/mirror-status.html)
- [Debian Mirror Status Checker Code](https://salsa.debian.org/mirror-team/mirror/status)

If a project's repo can be viewed online, we only need to remember the link to that repo and then check the latest timestamp there the same way we check ours. Even if the date depends on a specific file in their repo, we can still find the right link for it. To find the repos of mirrored projects, search for "projectName mirrors".
## Checker Information
- almalinux
- alpine
- apache
- archlinux
- centos
- ceph
- CPAN
- CRAN: https://cran.r-project.org/mirmon_report.html has a mirror tracker
- csclub: for now, this is the upstream itself, so it does not need to be checked
- CTAN: https://www.ctan.org/mirrors/mirmon has a mirror tracker
- Cygwin
- damnsmalllinux: http://distro.ibiblio.org/damnsmall/ not checked, since it is abandoned
- debian
- debian-backports: legacy, no longer needs to be checked
- debian-cd
- debian-multimedia
- debian-ports
- debian-security
- debian-volatile: legacy, no longer needs to be checked
- eclipse
- emacsconf: for now, this is the upstream itself, so it does not need to be checked
- fedora
- FreeBSD
- gentoo-distfiles
- gentoo-portage
- gnome
- GNU
- gutenberg
- ipfire
- kde
- kde-applicationdata
- kernel
- linuxmint: https://mirrors.edge.kernel.org/linuxmint/ candidate for brute force looping
- linuxmint-packages: https://mirrors.edge.kernel.org/linuxmint-packages/ Checking the timestamp of either the Release file or the Packages file should suffice.
- MacPorts: only distfiles has a public repo; it has no timestamp and is too large to loop through, so we compare ports.tar.gz in distfiles
- manjaro
- mxlinux
- mxlinux-iso: this one appears to have been out of sync on the official tracker for 134 days, which is odd
- mysql: http://mirrors.sunsite.dk/mysql/
- NetBSD: http://ftp.netbsd.org/pub/NetBSD/ checking timestamps of change files in different versions, and SHA512, MD5 files in the isos of different versions
- nongnu: http://download.savannah.nongnu.org/releases/ https://savannah.gnu.org/maintenance/Mirmon/ http://download.savannah.gnu.org/mirmon/savannah/
- openbsd
- opensuse: http://download.opensuse.org/ check the Update.repo files in the folders inside the update folder; tumbleweed-non-oss/ and tumbleweed/ are temporarily not checked
- parabola: https://repo.parabola.nu/ https://www.parabola.nu/mirrors/status/
- pkgsrc
- puppylinux: https://distro.ibiblio.org/puppylinux/ check the ISO files or htm files in the folders starting with puppy
- qtproject: https://download.qt.io/
- racket: https://mirror.racket-lang.org/installers/ make sure that we have the latest version number under racket-installers
- raspberry pi: https://archive.raspberrypi.org/ Checking the timestamp of either the Release file or the Packages file should suffice.
- raspbian: http://archive.raspbian.org/ snapshotindex.txt is most likely a timestamp, though this is not confirmed. Our mirror also appears to be completely outdated; it is not listed on the official mirror list
- sagemath: check that we have the same source tarballs as upstream (the sage-*.tar.gz files under 'Source Code')
- salt stack: checking the "Latest release" text under the 'About' header
- scientific: https://scientificlinux.org/downloads/sl-mirrors/ not checked, since it is abandoned
- slackware: https://mirrors.slackware.com/slackware/ check whether we have each release and whether the timestamp of CHECKSUMS.md5 in each release is the same; for slackware-iso, just make sure that our list of directories is the same
- tdf: https://download.documentfoundation.org/
- trisquel: http://archive.trisquel.info/trisquel/ checking Release file for all versions in packages/dist and md5sum.txt in iso/ with two other mirrors
- ubuntu: https://launchpad.net/ubuntu/+mirror/mirror.csclub.uwaterloo.ca-archive
- ubuntu-ports: http://ports.ubuntu.com/ubuntu-ports/ checking the Release files in dists
- ubuntu-ports-releases: https://cdimage.ubuntu.com/releases/ has a public repo but no timestamp and no status tracker, so it is brute-force looped
- ubuntu-releases: https://releases.ubuntu.com/
- vlc: http://download.videolan.org/pub/videolan/
- x.org: https://www.x.org/releases/ check all of the files under each directory under /x.org/individual/ and make sure that we have every file upstream has, ignoring the xcb folder
- Xiph: https://ftp.osuosl.org/pub/xiph/releases/ loop through each directory in xiph/releases/ and compare the timestamps of the checksum files
- xubuntu-releases: https://cdimage.ubuntu.com/xubuntu/releases/ candidate for brute force looping since it has few folders
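
Several entries above rely on Release files. Debian-style Release files carry a `Date:` field, so one possible way to compare them is to parse that field on both sides. This is a sketch under that assumption and not necessarily what the checker classes do (they may compare HTTP timestamps instead). The function names and the sample Release excerpts are made up for illustration.

```python
from email.utils import parsedate_to_datetime


def release_date(release_text):
    """Return the Date field of a Debian-style Release file as a datetime."""
    for line in release_text.splitlines():
        if line.startswith("Date:"):
            return parsedate_to_datetime(line[len("Date:"):].strip())
    raise ValueError("Release file has no Date field")


def in_sync(mirror_release, upstream_release):
    """Treat the mirror as current when both Date fields agree."""
    return release_date(mirror_release) == release_date(upstream_release)


# Made-up Release excerpts standing in for files fetched over HTTP.
UPSTREAM = "Origin: Example\nSuite: stable\nDate: Sat, 09 Oct 2021 09:34:56 UTC\n"
MIRROR_OLD = "Origin: Example\nSuite: stable\nDate: Sat, 02 Oct 2021 09:34:56 UTC\n"

print(in_sync(UPSTREAM, UPSTREAM))
print(in_sync(MIRROR_OLD, UPSTREAM))
```

Comparing parsed dates rather than raw strings also makes it possible to report how far behind a mirror is, by subtracting the two datetimes.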