Cleaning up Links in your WordPress Site After Switching Domains or Permalinks

Posted: September 04, 2018

I recently moved this content from Monday By Noon, where it sat for over a decade, to its new home here on this domain. I learned a lot during that time, and as I iterated on the site I changed the permalink structure a number of times.

There was a time when I subscribed to the WordPress convention of including the year/month/day in post permalinks, so I did that for a while. I also recall a time when, for one reason or another, I disliked the nesting that resulted, despite it working well with WordPress’ archive implementation, so I moved to a strange YYYYMMDD format for a stint as well. 🤷‍♂️

As one does, I used internal links from time to time throughout these periods, and along the way set up some .htaccess rules to handle the changes. I just don’t like having redirect rules in the mix: they tend to pile up over time and introduce black-box behavior that lives outside the site itself, so I’ve made an effort to clean a few things up. Further, there was a time when I fully hand-coded my posts and ended up using absolute URLs with one permalink structure or the other.

Because the redirect rules are no longer in play, I had a bunch of now-dead URLs sprawled across the whole site. I used WP Migrate DB Pro to move the site, which took care of the domain change, but I still had these oddly specific changes to make within text columns.

Fixing internal links with WP-CLI

Admittedly I don’t use WP-CLI all that often, but when I do it totally saves the day. Today was one of those days. You can use WP-CLI to perform a search-replace on your database.

Before you do that, I would strongly suggest running the following, regardless of environment. It generates an export of your database as-is, before you make any changes:

wp db export

Second, I would suggest trying this on a staging environment before running it in production.

WP-CLI search-replace with regex

WP-CLI’s search-replace command supports a --regex option, and that’s just what we need here. Note that the docs warn that search-replace takes about 15-20x longer with --regex. That’s to be expected and a slight bummer, but very much worth it.

Another favorite option of mine for WP-CLI is --dry-run, which lets you observe what a search-replace is going to do before you actually run it. This has saved me from trouble a number of times. Even on staging environments, a fumbled command can eat up time getting the staging site put back together.

I put together a regular expression for the first permalink structure I wanted to update, and the command looked like this:

wp search-replace 'jonchristopher\.us\/[0-9]{4}\/[0-9]{2}\/[0-9]{2}\/([a-zA-Z0-9_-]+)\/' 'jonchristopher.us/blog/\1/' --regex --regex-flags='mi' wp_posts --include-columns=post_content --dry-run

The command does what it says on the tin: it limits the search-replace to wp_posts.post_content (a choice influenced by the note that regex runs take 15-20x longer than usual) and performs a dry run of the operation. The m and i flags make the match multiline and case-insensitive.

While I consider myself somewhat proficient with regular expressions, I always feel like I’m missing an edge case or two. I love that WP-CLI displays how many replacements are going to be made, but I wanted to know specifically which edits were going to be made so I could be sure it would do what I wanted.
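One way to hunt for missed edge cases before touching a database is to run the same pattern over a few made-up strings. The sketch below does that in Python. The sample fragments are invented, the replacement assumes the trailing slash should be kept, and Python's re module is only assumed to behave like WP-CLI's PHP regex engine for a pattern this simple.

```python
import re

# The pattern from the WP-CLI command, written for Python's re module.
# Assumption: Python and PHP (PCRE) agree on a pattern this simple.
PATTERN = re.compile(
    r"jonchristopher\.us/[0-9]{4}/[0-9]{2}/[0-9]{2}/([a-zA-Z0-9_-]+)/",
    re.MULTILINE | re.IGNORECASE,  # mirrors --regex-flags='mi'
)
# Assumption: the new permalink keeps its trailing slash.
REPLACEMENT = r"jonchristopher.us/blog/\1/"

# Invented post_content fragments, including some that should be left alone.
SAMPLES = [
    '<a href="https://jonchristopher.us/2010/03/15/my-post/">dated link</a>',
    '<a href="https://JonChristopher.us/2012/11/02/other_post/">mixed case</a>',
    '<a href="https://jonchristopher.us/blog/already-fixed/">new style</a>',
    '<a href="https://jonchristopher.us/2010/03/15/no-trailing-slash">edge</a>',
]


def preview(text):
    """Return (matched_text, replacement_text) pairs for every match in text."""
    return [(m.group(0), m.expand(REPLACEMENT)) for m in PATTERN.finditer(text)]


for sample in SAMPLES:
    changes = preview(sample)
    if not changes:
        print(f"unchanged: {sample}")
    for before, after in changes:
        print(f"- {before}")
        print(f"+ {after}")
```

The last sample is the kind of case that is easy to overlook: a dated link without a trailing slash does not match this pattern, so it would survive the search-replace untouched.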

I could have run the command and examined what happened by viewing a test post on the site, but if it was wrong I’d need to restore from an export and repeat the process. I was hoping for something a bit more useful.

How to preview WP-CLI’s search-replace in a dry run

I re-read the docs in hopes of finding some sort of ‘preview’ mode to run alongside --dry-run but came up short, so I put out a question on Post Status which quickly got me a helpful answer. Darren Ethier clued me in to the fact that WP-CLI’s --log option will dump out what can be described as a diff for search-replace.

Screenshot: a terminal showing WP-CLI’s --log option when using search-replace

Using --log I was able to further refine my command and get the search-replace done in just a couple of passes for the various permalink structures that littered my post_content column. Done and done!
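The idea of one pass per permalink structure can be sketched in Python as a list of patterns applied in order, with a count per pass. This is an illustration only: the YYYYMMDD pattern is a guess at what that format looked like, the rule names and sample HTML are invented, and none of it is taken from the actual commands that were run.

```python
import re

# Hypothetical rules, one per old permalink structure.
# The YYYYMMDD pattern is a guess at that format, not the real site's rule.
RULES = [
    ("year/month/day", r"jonchristopher\.us/[0-9]{4}/[0-9]{2}/[0-9]{2}/([a-zA-Z0-9_-]+)/"),
    ("YYYYMMDD", r"jonchristopher\.us/[0-9]{8}/([a-zA-Z0-9_-]+)/"),
]
REPLACEMENT = r"jonchristopher.us/blog/\1/"


def run_passes(content):
    """Apply each rule in turn; return the new content and a count per rule."""
    counts = {}
    for name, pattern in RULES:
        content, counts[name] = re.subn(
            pattern, REPLACEMENT, content, flags=re.IGNORECASE | re.MULTILINE
        )
    return content, counts


# Invented sample content with one link in each old structure and one new link.
sample = (
    '<a href="https://jonchristopher.us/2009/06/01/first/">one</a> '
    '<a href="https://jonchristopher.us/20110704/second/">two</a> '
    '<a href="https://jonchristopher.us/blog/third/">three</a>'
)

fixed, counts = run_passes(sample)
print(counts)
print(fixed)
```

Running the passes a second time on the result should report zero replacements for every rule, which is a cheap check that no pass produces text another pass would rewrite.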