The Fall of PolyWogg (part 1 / 4)
I want to talk some more about my website, and if last September was version 5.0, then the version I had as of February of this year was probably about version 5.2. I had added a few extra features, added some functionality, and expanded the base. I had made extensive progress with my photo gallery, and overall, I hate to say it, but I was feeling pretty good about it. From January 2020 to February 2021, I had drastically upgraded a TON of stuff on the site, and I felt like it was generally under control.
Jinxing myself
If feeling like things are going well is tempting fate, then February was probably a karmic risk. I was spending a little extra time on my photo galleries, and I was starting to feel like I had almost reached critical mass. I had 2005, 2006, and 2007 on the site, plus “regular 2008” and most of “wedding 2008”. I hadn’t quite finished the wedding albums, but most things were working so it was only a matter of time until I did. I was about halfway through the wedding albums, with my eye starting to turn towards the honeymoon photos, and beyond.
Except I ran into a small problem with the galleries. When I went to upload photos, I basically would open an upload window in the NextGen Gallery plugin, upload all the photos, make a number of tweaks, upload the videos, make some more tweaks, save it all, and then share it to FB.
But the upload was throwing errors for some reason. It took a while to narrow down even what the error was about, but essentially it would start uploading, and regardless of the progress on any file in there, if any one file took more than 30 seconds OR the whole process ran more than 2 minutes, then the upload would time out. Now you would think it would be easy to figure out where the problem was: obviously there was a server setting somewhere that was timing out on me. Yet when I worked with the Level 1 and 2 support people at the hosting provider, we could not adjust the settings to prevent it.
Of course, I had to work through standard initial responses. “It’s a plugin conflict” … nope, I already tried deactivating EVERYTHING else. “It’s a theme conflict” … nope, tried that too. “It’s a problem with the plugin itself” … except it was the same version as the month before, no updates, AND I have the premium version. No change in the plugin, but suddenly my server wasn’t letting me upload consistently. We tried modifying a bunch of different variables, mostly to make it more efficient so it would complete before it timed out, but we couldn’t seem to get around the 30s and 2m limits. The plugin developer gave me a work-around so that it would do one file at a time, no concurrent loads (the default is 6-8 at a time in segments), and it worked most of the time, right up until a file reached 30s, and then it would time out. The 2m limit was still active too.
Now, the simple solution normally would be to do much smaller batches OR upload using FTP, but both added a bunch of extra steps to the process, so I was still trying to find WHERE in the server I was timing out.
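For anyone stuck with the same limits, the "smaller batches" option can at least be planned rather than guessed at. Here is a minimal sketch in Python, assuming files are uploaded one at a time (as in the plugin developer's work-around) and that you have measured your own rough upload speed. The function name, the `safety` margin, and the sample numbers are all made up for illustration; only the 30-second and 2-minute limits come from what I saw.

```python
from dataclasses import dataclass

PER_FILE_LIMIT_S = 30   # per-file timeout observed on my host
BATCH_LIMIT_S = 120     # whole-upload timeout observed on my host


@dataclass
class Plan:
    batches: list   # each batch is a list of (name, size_mb) tuples
    too_slow: list  # files that would hit the per-file limit on their own


def plan_batches(files, mb_per_second, safety=0.8):
    """Group (name, size_mb) pairs into batches that fit the time limits.

    mb_per_second is an assumed upload speed you measured yourself.
    safety shrinks both limits to leave some headroom (hypothetical knob).
    """
    per_file = PER_FILE_LIMIT_S * safety
    per_batch = BATCH_LIMIT_S * safety
    plan = Plan(batches=[], too_slow=[])
    current, current_s = [], 0.0
    for name, size_mb in files:
        seconds = size_mb / mb_per_second
        if seconds > per_file:
            # No batch size can save this one; it needs FTP or similar.
            plan.too_slow.append((name, size_mb))
            continue
        if current and current_s + seconds > per_batch:
            # Adding this file would push the batch past the limit.
            plan.batches.append(current)
            current, current_s = [], 0.0
        current.append((name, size_mb))
        current_s += seconds
    if current:
        plan.batches.append(current)
    return plan
```

Anything that lands in `too_slow` would still have to go up by FTP, which is exactly the extra step I was trying to avoid.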
Enter the dragon
I don’t want to appear overly dramatic, but I don’t know how to avoid it. I want to be fair, but at the same time, someone screwed the pooch. With the multiple attempts to “fix” things, someone in Level 2 support got the bright idea that there was something wrong with the configuration of my WordPress install. That it was taking too much overhead, and that was why things were timing out. That’s completely unrelated, but whatever, let’s not quibble.
Anyway, they sent me an email on a Thursday morning that said basically, “Okay, I’ve run an optimization on your WordPress databases and activated compression. I know that you said you didn’t want compression on your site, but it will speed things up, and the uploads should be able to complete.” He included pics / screengrabs of the front end of my site to show that it was all working.
I got the email on my personal account while I was working, and my immediate thought was, “Oh, crap.”
First and foremost, I’ve run compression on my site before and it screwed up a BUNCH of things. It took some effort to undo them. Now, I ran it at the plugin level, and he was running it at the server level, but it scared the crap out of me. If he had asked me, I would have initially said no, not a chance, but I might have been talked into it. Except compression on the site wouldn’t affect my uploads. It would make my site render faster on the front end, but it wouldn’t improve my uploads, would it? (Answer: No).
Secondly though, I was wondering about the optimization process. When you run optimization, it often asks you what you want to do as part of your optimization:
- Simple optimization, looking for deleted entries that need to be cleaned up, etc.;
- Deleting revision histories, i.e., if you made a change, the previous “x” number of versions of that post are still available if you have to revert to an earlier version;
- Deleting auto-save / auto-drafts, i.e. if you leave the post open in edit mode for awhile, while you’re doing something else, and don’t close properly, your last “auto-save” is sitting there;
- Emptying the trash;
- Removing old comments that are left unapproved;
- Removing old transients, i.e., unattached orphan bits of info;
- Removing ping-backs and trackbacks to other websites;
- Removing orphaned meta data for posts and comments; and,
- Removing orphaned relationship info.
Generally speaking, what you keep or delete can be “everything” or “nothing”, and every point in between. The first one (simple optimization) is generally done by everyone without too many problems, as is removing orphaned meta data, orphaned relationship info, and old transients, and emptying the trash. But the rest? That’s highly personalized. I tend to keep up to 4 revisions in my history on current files, just in case I screw something up and want to go back in time without having to do a restore from backup. Older stuff? Sure, no problem, but recent revisions? Some plugins set the cut-off at two weeks, which is reasonable, but I had no idea what the support guy had run. Auto-saves? Old comments? I didn’t have anything in the pipeline at the time, but it wouldn’t be the first time in my website history that I had examples of both that I didn’t want to lose.
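To make the revision preference concrete, here is a rough sketch in Python of the rule I would want an optimizer to follow: keep the newest 4 revisions of a post no matter what, and only prune the rest once they are older than a two-week cut-off. This is not how any particular plugin or the host's tool works; the function name and the shape of the data are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def revisions_to_delete(revisions, now, keep_latest=4, max_age_days=14):
    """Return ids of revisions that are safe to prune for ONE post.

    revisions: list of (revision_id, saved_at) tuples (hypothetical shape).
    A revision is pruned only if it is BOTH outside the newest
    `keep_latest` AND older than `max_age_days`.
    """
    cutoff = now - timedelta(days=max_age_days)
    newest_first = sorted(revisions, key=lambda r: r[1], reverse=True)
    candidates = newest_first[keep_latest:]  # everything past the newest N
    return [rev_id for rev_id, saved_at in candidates if saved_at < cutoff]
```

The point of combining both conditions is that a post I am actively editing never loses its recent history, even if I save it a dozen times in one evening.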
So I wasn’t exactly thrilled that they had run both compression and optimization without checking with me first. But, the front end was working still, so should be okay, right?
The best-laid backup plans
Later that night, I tried to login to my site and it wouldn’t let me in. I had to actually go in through the server settings, reset a plugin, and then login manually a different way. Odd. But I figured maybe the optimization had messed something up. No worries, all in, reset, all good. Except it wasn’t.
I noticed one of my posts on the front end looked fine on the blog home page, but when you actually clicked on it to go to the full post, it threw errors. In fact, ALL of my posts were throwing an error on the full page for the post. Umm… Then a few other glitches cropped up. Okay, something’s not right. Time to go to the backup and roll back the current version.
As an aside, there are generally 4 types of backups for self-hosted WordPress sites. First, you can make a manual copy of the entire site and download it. It’s incredibly time-consuming on a large site, but you can do it. Second, you can run a plugin that will make a backup of the site and store a copy either on the server or send it to some sort of off-site cloud storage. Third, you can run external software that will back up from the server to a third-party site. Fourth, you can use server software to back up everything.
I had some older versions of the site fully downloaded, so that was always an option, just old. In addition, I ran a backup plugin that had some versions saved on the server. And the big option, the server software option, had daily full and incremental backups. I had run into glitches previously where I had to use the server backup offered by my hoster, and they have always worked well. Even though they screwed up my site Thursday morning, there was a full backup run and available as of 8:00 a.m. that morning. Perfect, I restored from there.
Except the restore only partially worked. It gave me a restore…from almost 14 months before. None of my work from the last year had restored, nor any of the content. Huh? Okaaaaaay, how about Wednesday? Tuesday? Monday? Nope, none of them would restore properly. Nor would the full-download version of the backup or the plugin ones stored on the site. None of the backups would complete. F***.
I ended up dealing with a CSR who was actually decent, and he figured out that for some reason, the caching software that all of their servers were running, Litespeed, was interfering with the restore. In essence, it was telling the restore that the files were already there, so it wasn’t restoring all of them. Crap. That caching software can’t be disabled. It is at the full server level.
Between the CSR and myself, we managed to run a series of “manual” restores on the Thursday morning version and in the end, we got all of the data back. It wasn’t accomplished by the rules according to Hoyle, but it was done. Whew.
Enter the gremlins
The backup completed, and I seemingly had everything back, but then I started to notice some gremlins. I’d go to edit an old post, and the photos I had linked to wouldn’t show in the editor. If I did a preview, some of them showed, I would refresh, and everything would be fine. Okay, looks like a simple caching problem. Then I would come back to the page, and something else wouldn’t work. I tried editing again, and a paragraph would be “missing”.
It looked at first like it was just “gone”, but then I would notice that the block was still there, it was just NOT SHOWING the text. So I would click on it, switch to the HTML mode, and it would give me a really weird set of codes. Blocks that were the simplest blocks of all would suddenly have almost CSS-like styling codes embedded with them. Huh? Where did THEY come from? How are they merged with my HTML content? WTF?
For my photo galleries, I had been embedding them on pages (rather than as posts), so I was running a plugin that displayed the pages as “nested” trees. It makes it way easier to manage them than as part of the standard WP page interface, yet with the gremlins, the tree wasn’t working consistently. I couldn’t move pages up or down, or I could move one and then have to totally reload the page before trying to move another. WTF x 2?
And then I started noticing some other gremlins. A couple of key plugins that I use were not working / loading at all. I started to get errors in the admin screen, which I could dismiss, and then 20 minutes later, I’d get a similar but just slightly different one.
Finally, I ran into a problem where none of my reusable blocks were loading. Okay, that’s a problem. I tried disabling a few plugins and reinstalling them, I even tried reinstalling the core WP files. Nada. The gremlins remained. It almost looked like I had malware or a virus working its way through the install, but I don’t think so.
I think it was just that the restore had not properly restored everything.
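One way to test that suspicion, at least for the files on disk, would be to compare an unpacked copy of the backup against the restored site and list anything missing or different. A minimal sketch in Python follows, assuming you have both folders somewhere you can read them; the function names are mine, and this says nothing about the database, which would need its own check.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file, read in chunks so large media files are fine."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(backup_dir, restored_dir):
    """Report files missing from, or different in, the restored tree."""
    backup_dir, restored_dir = Path(backup_dir), Path(restored_dir)
    missing, changed = [], []
    for src in sorted(backup_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(backup_dir)
        dst = restored_dir / rel
        if not dst.is_file():
            missing.append(rel.as_posix())   # never made it back
        elif file_digest(src) != file_digest(dst):
            changed.append(rel.as_posix())   # restored copy differs
    return missing, changed
```

A restore that skips files because something claims they are "already there" would show up here as a pile of `changed` entries: the file exists, but it is not the one from the backup.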
Which left me with a choice
I tried a bunch of things, no luck. No new options from the tech support people, although I was a little bit gun-shy about their help anyway. They were the ones who nuked things in the first place.
So I gradually came to a sad and sobering realization. I had 24 years’ worth of data, and while I had it back, the website that I had crafted over the last 17 years was basically now unreliable. The foundation was shot.
I knew I could rebuild, with a massive amount of work. Did I want to?
Continue reading at My existential angst as a dead blogger (part 2 / 4).