My personal and professional life

2020-02-29

Migrations to Git and GitHub continue

Last year I started migrating from CVS to Git and GitHub (see First migrations from CVS to Git), something I had wanted to start back in 2018 (see Migration to Git and GitHub) and had been planning for even longer. It has proved to be a tough task: I'm still migrating, because many of my CVS repositories required corrections before their history could be transferred properly to Git. During the migration I was surprised at how negligent I had been with my sources: I found uncommitted changes from as far back as 10 years ago, corrupted revision control files and various history inconsistencies. After migrating some projects, I immediately started working on updates and fixes, so my work already continues on GitHub. I also enabled continuous integration for some projects using Travis CI and GitHub Actions.

I'm currently migrating my Slackware package build scripts, a collection of over 300 shell scripts and related files organized as separate repositories in dedicated directories, but under the same root. In CVS this worked perfectly well, but it does not produce good history in Git, because the version tags and branches (e.g. FFmpeg-3_4_7 or MySQL-5_5) are specific to one build script and are used only by the files in its directory. I therefore decided to migrate these separately and then combine them in another repository using submodules. There were many problems with this migration as well.
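
Combining the separately migrated repositories could look roughly like the sketch below. It is only an illustration under assumptions: the function name, the base URL and the repository names are hypothetical, and it presumes each build script already lives in its own Git repository.

```shell
#!/bin/sh
# Sketch: register each separately migrated build script repository as a
# submodule of one combined repository. Run it inside the combined repository.
# The function name, base URL and repository names are hypothetical.
add_build_script_submodules() {
    base_url=$1
    shift
    for name in "$@"; do
        # Each build script keeps its own history, tags and branches;
        # the combined repository only records which commit is checked out.
        git submodule add "$base_url/$name.git" "$name" || return 1
    done
}

# Example (not run here):
#   add_build_script_submodules https://github.com/example-user ffmpeg mysql
#   git commit -m "Add build scripts as submodules"
```

This way version tags such as FFmpeg-3_4_7 stay local to the repository of the build script they belong to.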

These were some of the problems:
  • tags or branches with similar names (e.g. TEST-123 and TEST_123). These are easy to correct: I just delete the wrong tag or branch;
  • misplaced tags or branches. In CVS I tagged and branched as necessary, so some tags and branches were applied to different files at different times. In the Git history this resulted in commits with the message "This commit was manufactured by cvs2git to create tag 'TEST-123'" or "This commit was manufactured by cvs2git to create branch 'TEST-123'", in which files are added, deleted or modified to adjust the history for the tag or branch respectively. I'm fixing such problems by reordering the problematic commits, which is easy to do by changing the time in the revision control file in the repository, but reviewing each case and understanding its cause takes time;
  • unused files not deleted in the version history. This also produced commits with the messages mentioned in the previous bullet point. I'm fixing these by deleting the unused files with a date in the past, so that such files are not picked up by subsequent commits;
  • files belonging to a branch, but committed to the main trunk instead. This is similar to the above. In some cases there are files that should exist only on a branch (e.g. patches to fix issues with specific software versions), but were committed to the main trunk. As there is no big difference between tags and branches in CVS, it was enough just to tag the changes for them to check out properly. I'm moving these from the main trunk to a branch, which means deleting the revisions from the main trunk and moving them onto the branch;
  • spelling errors. Since I mostly committed to CVS on the command line and did not use a spell checker, many commit messages contain spelling errors that irritate my eyes. I'm correcting them after the initial migration to Git, so that I can check all messages at once.
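
Two of the checks from the list above can be sketched in shell. This is a rough illustration, not the procedure actually used: both function names are hypothetical, and treating '-' and '_' as equivalent is an assumption about what counts as a "similar" name.

```shell
#!/bin/sh
# Hypothetical helper: reads one tag or branch name per line on stdin and
# prints, one group per line, the names that differ only by '-' versus '_'.
find_similar_names() {
    awk '
    {
        key = $0
        gsub(/[-_]/, "_", key)          # assumption: "-" and "_" are equivalent
        if (key in seen) {
            seen[key] = seen[key] " " $0
            dup[key] = 1
        } else {
            seen[key] = $0
            order[++n] = key            # remember first-seen order of keys
        }
    }
    END {
        for (i = 1; i <= n; i++)
            if (order[i] in dup)
                print seen[order[i]]
    }'
}

# Hypothetical helper: list the commits cvs2git had to manufacture, which
# point at misplaced tags or branches and at undeleted unused files.
list_manufactured_commits() {
    git log --all --oneline --grep='manufactured by cvs2git'
}

# Example (not run here), inside a converted repository:
#   git tag --list | find_similar_names
#   list_manufactured_commits
```
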
Two utilities proved to be of great help after migration: licensee and github-linguist. I use the first to check whether the license is detected properly. I had some problems with some of my projects (see issues 361 and 392), so I now check before pushing to GitHub. The second is useful for fine-tuning language detection (e.g. I'd like to see just SQL and not PLSQL, PLpgSQL, SQLPL or TSQL, which linguist would detect although all my published SQL sources are for MySQL. And for me SQL is code, not data).
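
The language fine-tuning can be expressed with linguist overrides in a .gitattributes file. The sketch below is one possible way to do it, assuming all SQL sources use the .sql extension; the function name is hypothetical and the exact overrides used in the migrated projects may differ.

```shell
#!/bin/sh
# Sketch: append linguist overrides to .gitattributes in the given directory
# (default: current directory). The function name is hypothetical.
write_sql_overrides() {
    dir=${1:-.}
    cat >> "$dir/.gitattributes" <<'EOF'
# Report every .sql file as plain SQL and count it in the language statistics
*.sql linguist-language=SQL linguist-detectable
EOF
}

# Example (not run here), before pushing to GitHub:
#   write_sql_overrides .
#   git add .gitattributes && git commit -m "Add linguist overrides"
#   github-linguist --breakdown    # check the detected languages
#   licensee detect .              # check the detected license
```

Linguist reads its overrides from the committed .gitattributes, so the file has to be committed before the language breakdown is checked.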

Anyway, this post is to note that my migrations to Git continue. I made great progress last year by migrating 28 repositories, and so far this year I have migrated 114 more. So hopefully by the end of the year I'll be free from CVS (or at least use it only by exception). Initially I planned to migrate everything, but now I'm considering skipping some low-value projects and examples, which I could migrate later if necessary.
