Case Study: One Site’s Recovery from an Ugly SEO Mess

Posted by AlanBleiweiss

This past March, I was contacted by a prospective client:

My site has been up since 2004. I had good traffic growth up to 2012 (doubling each year to around a million page views a month), then suffered a 40% drop in mid Feb 2012. I’ve been working on everything that I can think of since, but the traffic has never recovered.

Since my primary business is performing strategic site audits, this is something I hear often. Site appears to be doing quite well, then gets slammed. Site owner struggles for years to fix it, but repeatedly comes up empty.

It can be devastating when that happens. And now more than ever, site owners need real solutions to serious problems.

SEO Traffic Loss

As this chart shows, when separating out the “expected” roller coaster effect, Google organic traffic took a nose-dive in early February of 2012.

First step: check and correlate with known updates

When this happens, the first thing I do is jump to Moz’s Google Algorithm Change History charts to see if I can pinpoint a known Google update that correlates to a drop.

Except in this case, there was no direct “same day” update listed.

A week before, there’s an entry that references integrating Panda more fully into the main index; however, the discussion around it suggests the change actually happened sometime in January. So maybe it’s Panda, maybe it’s not.

Expand your timeline: look for other hits

At this point, I expanded the timeline view to see if I could spot other specific drops and possibly associate them to known updates. I did this because some sites that get hit once, get hit again and again.

Google Organic Historic Visit View

Well now we have a complete mess.

At this point, if you’re up for the challenge, you can take the time to carefully review all the ups and downs manually, comparing drops to Moz’s change history.

Personally, when I see something this ugly, I prefer to use the Panguin Tool. It allows you to see this timeline with a “known update” overlay for various Google updates. Saves a lot of time. So that’s what I did.

Panguin Tool Historic View

Well okay this is an ugly mess as well. If you think you can pin enough of the drops on specific factors, that’s great.

What I like about the Panguin Tool is you can “turn off” or “hide” different update types to try and look for a consistent issue type. Alternately, you can zoom in to look at individual updates and see if they align with a specific, clear drop in traffic.

Zoomed In Panguin Evaluation

Looking at this chart, it’s pretty clear the site saw a second dropoff beginning with Panda 3.3. The next dropoff aligned with updates appears to be Panda 3.4, however it was already in a slide after Panda 3.3 so we can’t be certain of this one.

Multiple other updates took place after that where there may or may not have been some impact, followed by further cascading downward.

Then, in the midst of THAT dropoff, we see a Penguin update that also MAY or MAY NOT have played into the problem.
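As a rough illustration of lining up drops against known updates (this is not how the Panguin Tool works internally, and not the method used in this audit), the sketch below compares average daily visits in the week before and after each update date. The function name, the seven-day window, the update labels, and all the numbers are made-up assumptions.

```python
from datetime import date, timedelta
from statistics import mean


def change_around(visits, event_day, window=7):
    """Percent change in mean daily visits: `window` days from event_day
    onward versus the `window` days before it.

    `visits` maps date -> visit count. Returns None if either side has no data.
    """
    before = [visits[d] for d in
              (event_day - timedelta(days=i) for i in range(1, window + 1))
              if d in visits]
    after = [visits[d] for d in
             (event_day + timedelta(days=i) for i in range(window))
             if d in visits]
    if not before or not after or mean(before) == 0:
        return None
    return (mean(after) - mean(before)) * 100 / mean(before)


# Synthetic data: 1,000 visits a day, falling to 600 from 15 January onward.
start = date(2020, 1, 1)
visits = {start + timedelta(days=i): (1000 if i < 14 else 600) for i in range(28)}

# Hypothetical update dates, for illustration only.
updates = {"hypothetical update A": date(2020, 1, 15),
           "hypothetical update B": date(2020, 1, 8)}

for name, day in updates.items():
    print(name, change_around(visits, day))
```

In this synthetic series, update A lines up with the 40% fall and update B shows no change. Real traffic is far noisier, and because updates can take days or weeks to affect a site, a clean before-and-after match is the exception rather than the rule.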

The ambiguous reality

This is a great time to bring up the fact that one of the biggest challenges we face in dealing with SEO is the ambiguous nature of what takes place. We can’t always, with true certainty, know whether a site has been hit by any single algorithm change.

In between all the known updates, Google is constantly making adjustments.

The other factor here is that when a given update takes place, it doesn’t always roll out instantly, nor is every site reprocessed against that latest change right away.

The cascading impact effect

Here’s where evaluating things becomes even more of a mess.

When an algorithm update takes place, it may be days or even weeks before a site sees the impact of that, if at all. And once it does, whatever changes to the overall status of a site come along due to any single algorithm shift, other algorithms are sometimes going to then base formulaic decisions on that new status of a site.

So if a site becomes weaker due to a given algorithm change, even if the drop is minimal or not readily observable, it can still suffer further losses due to that weakened state.

I refer to this as the “cascading impact” effect.

The right solution to cascading impact losses

Okay, so let’s say you’re dealing with a site that appears to have been hit by multiple algorithm updates. Maybe some of them are Panda, maybe others aren’t Panda.

The only correct approach in this scenario is to step back and understand that for maximum sustainable improvement, you need to consider every aspect of SEO. Heck, even if a site was ONLY hit by Panda, or Penguin, or the “Above the Fold” algorithm, I always approach my audits with this mindset. It’s the only way to ensure that a site becomes more resilient to future updates of any type.

And when you approach it this way, because you’re looking at the “across-the-board” considerations, you’re much more likely to address the actual issues that you can associate with any single algorithm.

The QUART mindset

It was at this point where I began to do my work.

A couple years ago, I coined the acronym QUART—what I call the five super-signals of SEO:

  • Quality
  • Uniqueness
  • Authority
  • Relevance
  • Trust

With every single factor across the full spectrum of signals in SEO, I apply the QUART* test. Any single signal needs to score high in at least three of the five super-signals.

Whether it’s a speed issue, a crawl efficiency issue, topical focus, supporting signals on-site or off-site, whatever it is, if that signal does not score well with quality, uniqueness or relevance, it leaves that page, that section of a site, or that site as a whole vulnerable to algorithmic hits.

If you get those three strong enough, that signal will, over time, earn authority and trust score value as well.

If you are really strong with relevance with any single page or section of the site, but weak in quality or uniqueness, you can still do well in SEO if the overall site is over-the-top with authority and trust.
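The "at least three of five" rule can be written down as a tiny check. This is only a sketch: the 0 to 1 scoring scale, the 0.7 cut-off for "high," and the function name are assumptions made for illustration, since the QUART test as described is a judgment call rather than a formula.

```python
SUPER_SIGNALS = ("quality", "uniqueness", "authority", "relevance", "trust")


def passes_quart(scores, high=0.7, needed=3):
    """True when at least `needed` super-signals score `high` or better.

    `scores` maps super-signal name -> score on an assumed 0-1 scale;
    any super-signal missing from `scores` counts as 0.
    """
    strong = [name for name in SUPER_SIGNALS if scores.get(name, 0) >= high]
    return len(strong) >= needed


# A new site: no authority or trust yet, but strong on the other three.
new_site_page = {"quality": 0.9, "uniqueness": 0.8, "relevance": 0.9}

# A weak page that is relevant but thin and not unique.
thin_page = {"relevance": 0.9, "quality": 0.3, "uniqueness": 0.2}

# The same weak page sitting on a site with very high authority and trust.
thin_page_strong_site = dict(thin_page, authority=0.95, trust=0.9)

print(passes_quart(new_site_page))          # True
print(passes_quart(thin_page))              # False
print(passes_quart(thin_page_strong_site))  # True
```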

*When I first came up with this acronym, I had the sequence of letters as QURTA, since quality, uniqueness, and relevance are, in my opinion, the true ideal target above all else. New sites don’t have authority or trust, yet they can be perfectly good, valuable sites if they hit those three. Except Jen Lopez suggested that if I shift the letters for the acronym, it would make it a much easier concept for people to remember. Thanks Jen!

A frustratingly true example

Let’s say you have one single page of content. It’s only “okay,” or maybe even “dismal,” in regard to quality and uniqueness. Even so, if the site’s overall authority and trust are strong enough, that page can outrank an entire site devoted to that specific topic.

This happens all the time with sites like Wikipedia, or Yahoo Answers.

Don’t you hate that? Yeah, I know—Yahoo Answers? Trust? Ha!

Sadly, some sites have, over time, built so much visibility, brand recognition, and trust for enough of their content, that they can seemingly get away with SEO murder.

It’s frustrating to see. Yet the foundational concept as to WHY that happens is understandable if you apply the QUART test.

Spot mobile site issues

One challenge this site has is that there’s also a separate mobile subdomain. Looking at the Google traffic for that shows similar problems, beginning back in February of 2012.

Mobile Site Historic Google Traffic

Note that for the most part, the mobile site suffered from that same major initial hit and subsequent downslide. The one big exception was a technical issue unique to the mobile site at the end of 2012 / beginning of 2013.

Identify and address priority issues

Understanding the QUART concept, and having been doing this work for years, I dove head-first into the audit.

Page processing and crawl efficiency

Before and after Page Speeds
NOTE: This is an educational site – so all “educational page” labels refer to different primary pages on the site.

For my audits, I rely upon Google Analytics Page Timings data, URIValet.com 1.5 mbps data, and also WebPageTest.org (testing from different server locations and at different speeds including DSL, Cable and Mobile).

Speed improvement goals

Whenever I present audit findings to a client, I explain “Here’s the ideal goal for this issue, yet I don’t expect you to hit the ideal goal, only that you do your best to make improvements without becoming bogged down in this one issue.”

For this site, since not every single page had crisis speed problems, I was looking to have the site owner at least get to a point of better, more consistent stability. So while there’s still room for vast improvement, the work performed went quite far in the right direction.

Speed issues addressed: domain and process calls

The first issue tackled was the fact that at the template level, the various pages on the site were calling several different processes across several different domains.

A great resource I rely upon for listing the third party processes individual pages use is a report in the WebPageTest.org results. It lists every domain called for the page tested, along with the total processes called from each, and gives separate data on the total file sizes across each.

Reducing the number of times a page has to call a third party domain, and the number of times an individual process needs to be run is often a way to help speed up functionality.
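If you already have the list of request URLs for a page (exported from a waterfall report, for example), a similar per-domain tally can be sketched in a few lines. The function name and the sample URLs below are hypothetical, and the check is deliberately crude: any hostname other than the page's own counts as third party, including your own subdomains.

```python
from collections import Counter
from urllib.parse import urlsplit


def third_party_requests(page_url, request_urls):
    """Count requests per hostname that differs from the page's own hostname."""
    own_host = urlsplit(page_url).hostname
    hosts = (urlsplit(url).hostname for url in request_urls)
    return Counter(host for host in hosts if host and host != own_host)


# Hypothetical requests made while loading one page.
requests_seen = [
    "https://www.example.com/app.js",
    "https://fonts.example.net/a.woff",
    "https://fonts.example.net/b.woff",
    "https://ads.example.org/tag.js",
]

counts = third_party_requests("https://www.example.com/lesson", requests_seen)
for host, total in counts.most_common():
    print(host, total)
```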

Improving third party domain requests

In the case of this site, several processes were eliminated:

  • Google APIs 
  • Google Themes 
  • Google User Content 
  • Clicktale 
  • Gstatic 
  • RackCDN

Eliminating functionality that was dependent upon third party servers meant fewer DNS lookups, and less dependence upon connections to other servers somewhere else on the web.

Typical service drains can often come from ad blocks (serving too many ads from too many different ad networks is a frequent speed drain culprit), social sharing widgets, third party font generation, and countless other shiny object services.

Clean code

Yeah, I know—you don’t have to have 100% validated code for SEO. Except what I’ve found through years of this work, is that the more errors you have in your markup, the more likely there will be potential for processing delays, and beyond that, the more likely search algorithms will become confused.

And even if you can’t prove in a given site that cleaner code is a significant speed improvement point, it’s still a best practice, which is what I live for. So it’s always included in my audit evaluation process.

HTML Markup Improvements

Improve process efficiency

Next up on the list was the range of issues all too many sites have these days regarding efficiency within a site’s own content. Tools to help here include Google Webmaster Tools, Google Page Speed Insights, and again WebPageTest.org among others.

Issues I’m talking about here include above-the-fold render-blocking JavaScript and CSS, lack of browser caching, lack of compression of static content, server response times, a host of code-bloat considerations, too-big image sizes, and the list goes on…
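Two of those items, compression and browser caching, show up directly in HTTP response headers, so they are easy to spot-check. The sketch below is an assumption-laden illustration rather than a substitute for the tools above: it takes the headers of one static asset's response as a plain dict (however you captured them) and flags what looks missing. The function name and the issue labels are invented for this example.

```python
def static_asset_issues(headers):
    """Return likely speed issues suggested by one response's headers.

    `headers` maps response header name -> value. The compression check only
    means something if the request advertised Accept-Encoding.
    """
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    issues = []

    encoding = lowered.get("content-encoding", "")
    if not any(token in encoding for token in ("gzip", "br", "deflate")):
        issues.append("no compression")

    cache_control = lowered.get("cache-control", "")
    if "max-age" not in cache_control and "expires" not in lowered:
        issues.append("no browser caching")

    return issues


print(static_asset_issues({"Content-Type": "text/css"}))
print(static_asset_issues({"Content-Encoding": "gzip",
                           "Cache-Control": "public, max-age=86400"}))
```

The first call reports both issues and the second reports none. Render-blocking scripts, server response times, and image weight still need the fuller tools.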

Google Page Speed Insights Grades
NOTE: This is an educational site – so all “educational page” labels refer to different primary pages on the site.

Note: Google Page Speed Insights recommendations and WebPageTest.org’s grade reports only offer partial insight. What they do offer however, can help you go a long way to making speed improvements.

Also, other speed reporting tools abound, to differing degrees of value, accuracy and help. The most important factor to me is to not rely on any single resource, and do your own extensive testing. Ultimately, enough effort in research and testing needs to be performed with followup checking to ensure you address the real issues on a big enough scale to make a difference. Just glossing over things or only hitting the most obvious problems is not always going to get you real long-term sustainable results…

Correct crawl inefficiency

Another common problem I find is that as a site evolves over time, many of the URLs change. When this happens, site owners don’t properly clean up their own internal links to those pages. The end result is a weakening of crawl efficiency, and then of user experience quality and trust signals.

Remember that Google and Bing are, in fact, users of your site, whether you want to admit it or not. So if they’re crawling the site and run into too many internal redirects (or heaven forbid redirect loops), or dead ends, that’s going to make their systems wary of continuing the crawl. And abandoning the crawl because of that is not helpful by any stretch of the imagination.

It also confuses algorithms.

To that end, I like to crawl a sampling of a site’s total internal links using Screaming Frog. That tool gives me many different insights, only one of which happens to be internal link problems. Yet it’s invaluable to know. And if I find that a large enough percentage of the URLs in that sample crawl are redirecting or dead ends, that needs to get fixed.

Eliminate crawl inefficiency

Note: for reference sake, the total number of pages on the entire site is less than 5,000. So that’s a lot of internal inefficiency for that size site…
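To put a number on "enough of a percentage," a sample crawl can be boiled down to the share of internal URLs that redirect or dead-end. This is a minimal sketch that assumes you have exported each crawled URL with its HTTP status code. The function name is hypothetical, and what counts as too high a share is left to judgment.

```python
REDIRECT_CODES = {301, 302, 303, 307, 308}


def crawl_inefficiency(status_by_url):
    """Share of crawled internal URLs that redirect or dead-end.

    `status_by_url` maps each crawled URL to the HTTP status it returned.
    """
    total = len(status_by_url)
    if total == 0:
        return {"redirect_share": 0.0, "dead_end_share": 0.0}
    redirects = sum(1 for code in status_by_url.values() if code in REDIRECT_CODES)
    dead_ends = sum(1 for code in status_by_url.values() if code >= 400)
    return {"redirect_share": redirects / total,
            "dead_end_share": dead_ends / total}


# Hypothetical sample of four internal links.
sample = {"/lessons": 200, "/old-lessons": 301, "/missing": 404, "/about": 200}
print(crawl_inefficiency(sample))
```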

Don’t ignore external links

While having link redirects and dead ends pointing to outside sources isn’t ideal, it’s less harmful most of the time than internal redirects and dead ends. Except when it’s not.

In this case, the site had previously been under a different domain name prior to a rebranding effort. And after the migration, it resulted in some ugly redirect loops involving the old domain!

Redirect Loop
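Loops like this are straightforward to detect once the redirects are written down as a simple "this URL sends you to that URL" map. The sketch below follows a chain and reports whether it ends cleanly, loops back on itself, or runs too long. The URLs, the function name, and the ten-hop limit are illustrative assumptions.

```python
def follow_redirects(start, redirects, max_hops=10):
    """Follow `start` through a URL -> target map.

    Returns (chain, status), where status is "ok", "loop", or "too long".
    """
    chain = [start]
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        if current in seen:
            # We are back at a URL already visited: a redirect loop.
            return chain + [current], "loop"
        chain.append(current)
        seen.add(current)
        if len(chain) - 1 > max_hops:
            return chain, "too long"
    return chain, "ok"


# Hypothetical post-migration redirects between an old and a new domain.
redirects = {
    "http://old-brand.example/page": "http://new-brand.example/page",
    "http://new-brand.example/page": "http://old-brand.example/page",
}
print(follow_redirects("http://old-brand.example/page", redirects))
```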

Topical focus evaluation

At this point, the audit moved from the truly technical issues to the truly content related issues. Of course, since it’s algorithms that do the work to “figure it all out,” even content issues are “technical” in nature. Yet that’s a completely different rant. So let’s just move on to the list of issues identified that we can associate with content evaluation.

Improve H1 and H2 headline tags

Yeah, I know—some of you think these are irrelevant. They’re really not. They are one more contributing factor when search engines look to multiple signals for understanding the unique topic of a given page.

Noindex large volume of “thin” content pages

Typical scenario here: a lot of pages that have very little to no actual “unique” content—at least not enough crawlable content to justify their earning high rankings for their unique focus. Be aware—this doesn’t just include the content within the main “content” area of a page. If it’s surrounded (as was the case on this site) by blocks of content common to other pages, or if the main navigation or footer navigation are bloated with too many links (and surrounding words in the code), and if you offer too many shiny object widgets (as this site had), that “unique” content evaluation is going to become strained (as it did for this site).
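One crude way to see how strained that "unique" content is: split each page into its text blocks (navigation, widgets, body, footer) and measure what share of the words sit in blocks that appear on no other page. Search engines do not publish how they evaluate this, so treat the sketch below purely as a rough proxy. The function name and the sample pages are invented.

```python
from collections import Counter


def unique_content_share(pages):
    """For each page, the share of its words found in blocks no other page uses.

    `pages` maps URL -> list of text blocks extracted from that page.
    """
    # Number of distinct pages on which each block appears.
    pages_per_block = Counter(
        block for blocks in pages.values() for block in set(blocks)
    )
    shares = {}
    for url, blocks in pages.items():
        total = sum(len(block.split()) for block in blocks)
        unique = sum(len(block.split()) for block in blocks
                     if pages_per_block[block] == 1)
        shares[url] = unique / total if total else 0.0
    return shares


# Hypothetical pages sharing one navigation block.
pages = {
    "/fractions": ["home about contact lessons", "a short lesson on fractions"],
    "/decimals": ["home about contact lessons", "decimals"],
}
print(unique_content_share(pages))
```

Pages where the share is very low are candidates for either more substantive content or a noindex.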

Add robust crawlable relevant content to video pages

You can have the greatest videos on the planet on your site. And yet, if you’re not a CNN, or some other truly well established high authority site, you are almost always going to need to add high quality, truly relevant content to pages that have those videos. So that was done here.

And I’m not just talking about “filler” content. In this case (as it always should be) the new content was well written and supportive of the content in the videos.

Eliminate “shiny object” “generic” content that was causing duplicate content / topical dilution confusion across the site

On pages that were worth salvaging but where there was thin content, I never recommend throwing those out. Instead, take the time to add more value content, yes. But also, consider eliminating some of those shiny objects. For this site, the reduction of those vastly improved the uniqueness of those pages.

Improve hierarchical URL content funnels reducing the “flat” nature of content

Flat architecture is an SEO myth. Want to know how I know this? I read it on the Internet, that’s how!

The Flat Architecture Myth

Oh wait. That was ME who said it.

Seriously though. If all your content looks like www.domain.com/category and www.domain.com/category and www.domain.com/category that’s flat architecture.

It claims “every one of these pages is as important as every other page on my site.”

And that’s a fantasy.

It also severely harms your need to communicate “here’s all the content specific to this category, or this sub-category”. And THAT harms your need to say “hey, this site is robust with content about this broad topic”.

So please. Stop with the flat architecture.

And no, this is NOT just for search engines. Users who see proper URL funnels can rapidly get a cue as to where they are on the site (or as they look at that in the search results, more confidence about trust factors).

So for this site, reorganization of content was called for and implemented.
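A quick way to see how flat a site is: count the path segments in each URL and look at how many pages sit directly under the root. This is only a sketch with hypothetical function names, and it assumes every URL includes its scheme (without "http://" the hostname would be parsed as part of the path).

```python
from urllib.parse import urlsplit


def path_depth(url):
    """Number of non-empty path segments in a URL (scheme required)."""
    return len([segment for segment in urlsplit(url).path.split("/") if segment])


def flat_share(urls):
    """Share of URLs at depth 0 or 1, i.e. the root or directly under it."""
    if not urls:
        return 0.0
    return sum(1 for url in urls if path_depth(url) <= 1) / len(urls)


print(path_depth("http://www.domain.com/category"))                    # 1
print(path_depth("http://www.domain.com/category/sub-category/page"))  # 3
print(flat_share(["http://www.domain.com/category",
                  "http://www.domain.com/category/sub-category/page"]))  # 0.5
```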

Add site-wide breadcrumb navigation

Yes, breadcrumbs are helpful, because they reinforce topical focus and content organization, and they improve user experience.

So these were added.

Noindex,nofollowed over 1,300 “orphaned” pages

Pop-up windows. They’re great for sharing additional information to site visitors. Except when you allow those to become indexable by search engines. Then all of a sudden, you’ve got countless random pages that, on their own, have no meaning, no usability, and no way to communicate “this is how this page relates to all these other pages over there”. They’re an SEO signal killer. So we lopped them out of the mix with a machete.

Sometimes you may want to keep them indexable. If you do, they need full site navigation and branding, and proper URL hierarchical designations. So pay attention to whether it’s worth it to do that or not.
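After a cleanup like this, it is worth verifying that the pages you meant to exclude really carry the directive. The sketch below uses Python's standard html.parser to pull the directives out of a page's robots meta tag. The class and function names are hypothetical, and it only inspects the HTML, so an X-Robots-Tag response header or a robots.txt rule would not be seen.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of robots meta directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


popup = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(robots_directives(popup))
```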

Remove UX confusing widget functionality

One particular widget on the site was confusing from a UX perspective. This particular issue had as much to do with site trust and overall usability as anything, and less to do with pure SEO. Except it caused some speed delays, needless site-jumping, repetition of effort and a serious weakening of brand trust. And those definitely impact SEO, so it was eliminated.

Noindex internal site “search” results pages

Duplicate content. Eliminated. ’nuff said?

Eliminate multiple category assignments for blog articles

More duplicate content issues. Sometimes you can keep these, however if multiple category assignments get out of hand, it really IS a duplicate content problem. So in this case, we resolved that.

Unify brand identity across pages from old branding that had been migrated

Old brand, new brand—both were intermingled after the site migration I previously described. Some of it was a direct SEO issue (old brand name in many page titles, in various on-site links and content) and some was purely a UX trust factor.

Unify main navigation across pages from old branded version that had been migrated

Again, this was a migration issue gone wrong. Half the site had consistent top navigation based on the new design, and half had imported the old main navigation. An ugly UX, crawl and topical understanding nightmare.

Add missing meta descriptions

Some of the bizarre auto-generated meta descriptions Google had been presenting on various searches were downright ugly. Killed that click-block dead by adding meta descriptions to over 1,000 pages.

Remove extremely over-optimized meta keywords tag

Not a problem you say? Ask Duane Forrester. He’ll confirm—it’s one of many signal points they use to seek out potential over-optimization. So why risk leaving them there?

About inbound links

While I found some toxic inbound links in the profile, there weren’t many on this site. Most of those actually disappeared on their own thanks to all the other wars that continue to rage in the penguin arena. So for this site, no major effort has yet gone into cleaning up the small number that remain.

Results

Okay so what did all of this do in regard to the recovery I mention in the title? You tell me.

Google Panda Site Recovery

And here’s just a small fraction of the top phrase ranking changes:

Ranking Improvements After Audit

Next steps

While the above charts show quite serious improvements since the implementation was started, there’s more work that remains.

Google Ad Scripts continue to be a big problem. Errors at the code level and processing delays abound. It’s an ongoing issue many site owners struggle with. Heck, just eliminating Google’s own ad server tracking code has given some of my clients as much as one to three seconds overall page processing improvement, depending on the number of ad blocks as well as intermittent problems on Google’s ad server network.

Except at a certain point, ads are the life-blood of site owners. So that’s a pain-point we may not be able to resolve.

Other third party processes come with similar problems. Sometimes third party “solution” providers are willing to improve their offerings; however, the typical answer to “your widget is killing my site” is “blah blah blah not our fault blah blah blah” when I know for a fact, from countless tests, that it is.

So in this case, the client is doing what they can elsewhere for now. And ultimately, if need be, will abandon at least some of those third parties entirely if they can get a quality replacement.

And content improvements—there’s always more to do on that issue.

Bottom line

This is just one site, in one niche market. The work has been and continues to be extensive. It is, however, quite typical of many sites that suffer from a range of issues, not all of which can be pinned to Panda. Yet ignoring issues you THINK might not be Panda specific is a dangerous game, especially now in 2015, where it’s only going to get uglier out there…

So do the right thing for the site owner / your employer / your client / your affiliate site revenue…
