The Latest Weapon In The SEO Arsenal: DeepCrawl
We were recently contacted by Edy from DeepCrawl to trial their product. At first I was skeptical that an online crawler could justify its price when compared to cheaper software such as Screaming Frog, but in the end it turns out that DeepCrawl pays dividends in the amount of insight it can surface in a single crawl. So without further ado, let me show you some of the results from the crawl we ran on www.davidnaylor.co.uk.
This Site Has Issues
The first thing that you notice when looking at the report is the list of issues:

Some of these were a bit of a surprise. In the report you can click on each one to get a detailed breakdown of the problem and the list of affected URLs, which helped us identify a number of issues.
Issue 1: Duplicate Pages
Funnily enough, this was something we had addressed (or at least thought we had addressed) on an old design of davidnaylor.co.uk:
- http://www.davidnaylor.co.uk/pages/consultation/get-in-touch.html
- http://www.davidnaylor.co.uk/pages/search-engine-marketing/get-in-touch.html
- http://www.davidnaylor.co.uk/pages/search-engine-optimisation/get-in-touch.html
- http://www.davidnaylor.co.uk/pages/get-in-touch.html?link=main-menu
- http://www.davidnaylor.co.uk/pages/get-in-touch.html?link=footer-menu
These are all essentially the same page; realistically only one of them should resolve, with a canonical tag to account for the query strings.
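The query-string duplicates above are easy to spot programmatically: stripping the query (and fragment) from each URL collapses them to one canonical address. A minimal sketch in Python, using only the standard library (the two example URLs are taken from the list above):

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_url(url):
    """Strip the query string and fragment so tracking
    parameters like ?link=main-menu collapse to one URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

urls = [
    "http://www.davidnaylor.co.uk/pages/get-in-touch.html?link=main-menu",
    "http://www.davidnaylor.co.uk/pages/get-in-touch.html?link=footer-menu",
]
# Both variants reduce to the same canonical URL, so the set has one entry.
print({canonical_url(u) for u in urls})
```

That canonical URL is what you would put in the page's `rel="canonical"` tag so search engines treat the variants as one page.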
Issue 2: Max Description Length (551 errors)!
We’re not entirely sure when this started (again, it may have come in with the new design), but it appears our meta descriptions were being populated with part of the blog post content:
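A check like the one DeepCrawl runs here is simple to sketch yourself. The snippet below flags meta descriptions over a length limit using Python's built-in HTML parser; the 155-character threshold is an assumption based on common SEO guidance, not DeepCrawl's exact default (which, as noted later, is configurable):

```python
from html.parser import HTMLParser

MAX_DESCRIPTION = 155  # assumed guideline figure; DeepCrawl lets you configure this

class DescriptionChecker(HTMLParser):
    """Collects the lengths of over-long meta description tags."""
    def __init__(self):
        super().__init__()
        self.too_long = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") == "description":
            content = attrs.get("content", "")
            if len(content) > MAX_DESCRIPTION:
                self.too_long.append(len(content))

# Simulate a page whose description was stuffed with 300 characters of post content.
page = '<meta name="description" content="' + "x" * 300 + '">'
checker = DescriptionChecker()
checker.feed(page)
print(checker.too_long)
```

Run over every crawled page, a counter like this is how you would end up with a figure like the 551 errors reported above.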

Issue 3: Image attachment URLs
Interestingly, we thought all of these had been sorted as well. Obviously not; for example:
- http://www.davidnaylor.co.uk/yahoo-data-in-bing-webmaster-tools.html/daves-impressions
- http://www.davidnaylor.co.uk/the-big-3-search-engines-gain-market-share-in-2010.html/market-share-over-2009
DeepCrawl found over 200 errors of this type.
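These attachment URLs share a telltale shape: an extra path segment after the `.html` suffix. A quick regex sketch to flag that pattern in a list of crawled URLs (the pattern is my own assumption, inferred from the two examples above, not DeepCrawl's actual rule):

```python
import re

# Matches URLs where a further path segment follows ".html" --
# the shape of the image attachment URLs flagged above.
ATTACHMENT_RE = re.compile(r"\.html/.+$")

urls = [
    "http://www.davidnaylor.co.uk/yahoo-data-in-bing-webmaster-tools.html/daves-impressions",
    "http://www.davidnaylor.co.uk/pages/get-in-touch.html",
]
flagged = [u for u in urls if ATTACHMENT_RE.search(u)]
print(flagged)
```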
Record, Prioritise & Assign Issues
After I’d identified all the problems, I used the built-in “issue” functionality. This lets you log an issue, assign it to someone and prioritise it by importance, and it stores the report you were looking at. Once you’re done, you can share any reports or issues with external users via an encrypted URL.

Re-run Crawls
The other cool functionality is what happens when DeepCrawl re-crawls a website: it shows you what has changed since the last run. This time round I got all the critical issues fixed and then re-crawled the site. It gives you a really useful summary of the changes:

As you go into each report where there was an issue, it prompts you with a popup; just click “mark as fixed”, save, and hey presto. As you can see, the critical issues have now been marked as fixed (and, as you may also notice, I need to pull my finger out!):

Other Features
There is quite a lot of functionality we haven’t talked about or used yet. One big positive, I think, is the ability to schedule recurring crawls so that you have a historic record of your website from week to week or month to month. On top of that, it has white-label options for agencies, and you can export a PDF report such as this one.
You can also customise what it classifies as:
- Max description tag length
- Max title tag length
- Max HTML size
- Max number of links/external links
- Min/max content size
- Max load time
- Max URL length
- Minimum content to HTML ratio %
- Max number of redirections
- Default language
No doubt they will add more options as time goes on and more people use it. One thing we discussed with them is the ability to download and track your links from Google Webmaster Tools, which would be a very useful feature indeed. They already have a lot planned, so it’s definitely a product worth checking out.
7 Comments
DaveNaylor - from twitter
The Latest Weapon In The SEO Arsenal: DeepCrawl http://t.co/7Pv0W0u3VU
Panagiotis Kontogiannis - http://www.blog7.org
SEOspyder and Screaming Frog do the same as DeepCrawl, and without a tool like this no SEO professional can do a serious analysis of a website.
http://www.mobiliodevelopment.com/seospyder/
http://www.screamingfrog.co.uk/seo-spider/
David Naylor
Panagiotis, that’s not quite true. The big difference for me was giving Deep Crawl 6 million URLs and being happy in the knowledge that it will return the results and not run out of memory (the cloud is awesome sauce).
Tristram
Deep Crawl’s been my tool of choice for some time now. Could not live without it. The only problem is that it makes it very easy to see just how much work you need to do on your sites!
I didn’t see you mention Deep Crawl’s ability to crawl staging websites and compare them to the last crawl of the production environment. This is SO valuable especially for larger, less agile companies; it’s so important to get it right first time.
Deep Crawl can also be used by QA teams, and the regex function is a life saver. We use the regex function to ensure that our schema tags, analytics script and social media tags are present on the pages we want them to be.
David Whitehouse - http://www.davidwhitehouse.co.uk/
Hi Tristram,
Thanks for your comment.
There’s a ton more features, but we’re still quite new to it. I do like the addition of regex, very handy 🙂
Ameli Rowland - http://seouk.co.uk
Crawl! I just love the term “crawl” for checking site status. An extensive diagnosis is always a smart idea to find issues that could be affecting your SEO efforts. That’s why I am always in search of the best analysis tools like this, which give me the opportunity to monitor my sites for broken links or server errors over time.