There are three types of crawling, all of which provide useful data. Internet-wide crawlers are for large-scale link indexing. Crawling at that scale is a complicated and often expensive process but, as with social listening, the goal is for SEO experts, business analysts, and entrepreneurs to be able to map how websites link to one another and extrapolate larger SEO trends and growth opportunities. Crawling tools generally do this with automated bots that continuously scan the web. As is the case with most of these SEO tools, many businesses use internal reporting features in tandem with integrated business intelligence (BI) tools to identify even deeper data insights. Ahrefs and Majestic are the two clear leaders in this type of crawling. They have invested more than a decade's worth of time and resources in compiling and indexing millions of crawled domains and billions of crawled pages.
Structured data is code that you can add to your site's pages to describe your content to search engines, so they can better understand what's on your pages. Search engines can use this understanding to display your content in useful (and eye-catching!) ways in search results. That, in turn, can help you attract just the right kind of customers for your business.
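As a rough illustration of what such code can look like, the sketch below builds a JSON-LD block using the schema.org vocabulary, which is one common way to express structured data. The business name, URL, phone number, and the helper name `build_local_business_jsonld` are all made up for the example; which properties a search engine actually uses depends on its own documentation.

```python
import json


def build_local_business_jsonld(name, url, telephone):
    """Return a <script> tag holding a schema.org LocalBusiness description.

    Hypothetical helper for illustration; the property set is a minimal
    assumption, not a complete or required list.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "url": url,
        "telephone": telephone,
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values; paste the resulting tag into the page's HTML.
snippet = build_local_business_jsonld(
    "Example Bakery", "https://example.com/", "+1-555-0100"
)
print(snippet)
```

The output is meant to be embedded in the page's HTML; it changes nothing about what visitors see.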
For example, let's say the keyword difficulty scores for the top five spots on a particular search results page are in the 80s and 90s. Then, in positions 6-9, the difficulty scores drop down into the 50s and 60s. Using those scores, a business can begin targeting that range of spots and running competitive analysis on the pages that hold them to see which ones its website could knock out of their spot.
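The filtering step in that example can be sketched in a few lines. The scores below are invented to match the scenario (80s and 90s up top, 50s and 60s in positions 6-9), and the cutoff of 69 is an arbitrary assumption; real tools compute and expose difficulty in their own ways.

```python
def find_target_positions(difficulty_by_position, max_difficulty):
    """Return the SERP positions whose difficulty score is at or below the cutoff."""
    return [
        position
        for position, score in sorted(difficulty_by_position.items())
        if score <= max_difficulty
    ]


# Invented scores for one search results page: position -> difficulty.
serp = {1: 92, 2: 88, 3: 85, 4: 83, 5: 81, 6: 62, 7: 58, 8: 55, 9: 51}

targets = find_target_positions(serp, max_difficulty=69)
print(targets)  # [6, 7, 8, 9]
```

The pages holding those positions are the ones worth a closer competitive look.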
Many blogging software packages automatically nofollow user comments, but those that don't can most likely be manually edited to do this. This advice also goes for other areas of your site that may involve user-generated content, such as guest books, forums, shout-boards, referrer listings, etc. If you're willing to vouch for links added by third parties (for example, if a commenter is trusted on your site), then there's no need to use nofollow on links; however, linking to sites that Google considers spammy can affect the reputation of your own site. The Webmaster Help Center has more tips on avoiding comment spam, for example by using CAPTCHAs and turning on comment moderation.
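If your software doesn't do this for you, the idea can be sketched as a small filter that adds rel="nofollow" to every link in a user-submitted comment before it is displayed. This is only an illustration built on Python's standard html.parser module; the class and function names are hypothetical, and a production site would more likely rely on its platform's own setting or a dedicated HTML sanitizer.

```python
from html import escape
from html.parser import HTMLParser


class NofollowRewriter(HTMLParser):
    """Re-emit an HTML fragment, adding rel="nofollow" to every <a> tag.

    Hypothetical helper; HTML comments and doctype declarations are dropped,
    which is acceptable for short user-comment fragments.
    """

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.parts = []

    def _emit_tag(self, tag, attrs, closing=""):
        attrs = list(attrs)
        if tag == "a":
            rel = next((v for k, v in attrs if k == "rel"), None) or ""
            tokens = rel.split()
            if "nofollow" not in tokens:
                tokens.append("nofollow")
            # Replace any existing rel attribute with the merged value.
            attrs = [(k, v) for k, v in attrs if k != "rel"]
            attrs.append(("rel", " ".join(tokens)))
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in attrs
        )
        self.parts.append(f"<{tag}{rendered}{closing}>")

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, closing=" /")

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def add_nofollow(comment_html):
    """Return comment_html with rel="nofollow" on all of its links."""
    rewriter = NofollowRewriter()
    rewriter.feed(comment_html)
    rewriter.close()
    return "".join(rewriter.parts)


print(add_nofollow('<p>Nice post! <a href="http://spam.example/">cheap pills</a></p>'))
```

Links from commenters you trust could simply bypass this filter.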

When it comes to finally choosing the SEO tools that suit your organization's needs, the decision comes back to that concept of gaining tangible ground. It's about discerning which tools provide the most effective combination of keyword-driven SEO investigation capabilities, and then on top of that, the added keyword organization, analysis, recommendations, and other useful functionality to take action on the SEO insights you uncover. If a product is telling you what optimizations need to be made to your website, does it then provide technology to help you make those improvements?


In the enterprise space, one major trend we're seeing lately is data import across the big players. Much of SEO involves working with the data Google gives you and then filling in all of the gaps. Google Search Console (formerly Webmaster Tools) only gives you a 90-day window of data, so enterprise vendors, such as Conductor and Screaming Frog, are continually adding and importing data sources from other crawling databases (like DeepCrawl's). They're combining that with Google Search Console data for more accurate, ongoing Search Engine Results Page (SERP) monitoring and position tracking on specific keywords. SEMrush and Searchmetrics (in its enterprise Suite packages) offer this level of enterprise SERP monitoring as well, which can give your business a higher-level view of how you're doing against competitors.
The third type of crawling tool that we touched upon during testing is backlink tracking. Backlinks are one of the building blocks of good SEO. Analyzing the quality of your website's inbound backlinks and how they're feeding into your domain architecture can give your SEO team insight into everything from your website's strongest and weakest pages to search visibility on particular keywords against competing brands.
While most of the links to your site will be added gradually, as people discover your content through search or other ways and link to it, Google understands that you'd like to let others know about the hard work you've put into your content. Effectively promoting your new content will lead to faster discovery by those who are interested in the same subject. As with most points covered in this document, taking these recommendations to an extreme could actually harm the reputation of your site.
Search engines may penalize sites they discover using black or grey hat methods, either by reducing their rankings or eliminating their listings from their databases altogether. Such penalties can be applied either automatically by the search engines' algorithms, or by a manual site review. One example was the February 2006 Google removal of both BMW Germany and Ricoh Germany for use of deceptive practices.[54] Both companies, however, quickly apologized, fixed the offending pages, and were restored to Google's search engine results page.[55]
The Small SEO Tools Plagiarism Checker is also available as a WordPress plugin. With it, you don't need to waste precious time copying and pasting the whole content of your post. Simply install the plugin, and whenever you are working on a new post or page, click the “Check Plagiarism” button and the plugin will automatically start checking the full content, sentence by sentence. You can also compare plagiarized content within the plugin by clicking on individual sentences. With this plugin, you don't have to worry about your content being stolen or the search engines penalizing your site for content duplication.
Black hat SEO attempts to improve rankings in ways that are disapproved of by the search engines, or involve deception. One black hat technique uses hidden text, either as text colored similar to the background, in an invisible div, or positioned off screen. Another method gives a different page depending on whether the page is being requested by a human visitor or a search engine, a technique known as cloaking. Another category sometimes used is grey hat SEO. This is in between black hat and white hat approaches, where the methods employed avoid the site being penalized but do not act in producing the best content for users. Grey hat SEO is entirely focused on improving search engine rankings.

Another reason is that if you're using an image as a link, the alt text for that image will be treated similarly to the anchor text of a text link. However, we don't recommend using too many images for links in your site's navigation when text links could serve the same purpose. Lastly, optimizing your image filenames and alt text makes it easier for image search projects like Google Image Search to better understand your images.
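Because the alt text of a linked image stands in for anchor text, it can be worth checking a page for linked images that lack it. The sketch below does this with Python's standard html.parser module; the class name and the sample markup are hypothetical, and it only inspects the HTML you hand it.

```python
from html.parser import HTMLParser


class LinkedImageAltChecker(HTMLParser):
    """Collect the src of every <img> inside an <a> that has no usable alt text."""

    def __init__(self):
        super().__init__()
        self.link_depth = 0
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1
        elif tag == "img" and self.link_depth > 0:
            attr_map = dict(attrs)
            # Treat a missing, empty, or whitespace-only alt as missing.
            if not (attr_map.get("alt") or "").strip():
                self.missing_alt.append(attr_map.get("src"))

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1


# Made-up markup: one good linked image, one without alt, one unlinked image.
page = (
    '<a href="/puppies"><img src="puppy.jpg" alt="Dalmatian puppy playing fetch"></a>'
    '<a href="/kittens"><img src="IMG0023.jpg"></a>'
    '<img src="decor.png">'
)

checker = LinkedImageAltChecker()
checker.feed(page)
checker.close()
print(checker.missing_alt)  # ['IMG0023.jpg']
```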
The caveat in all of this is that, in one way or another, most of the data and the rules governing what ranks and what doesn't (often on a week-to-week basis) come from Google. If you know where to find and how to use the free and freemium tools Google provides under the surface (AdWords, Google Analytics, and Google Search Console being the big three), you can do all of this manually. Much of the data that the ongoing position monitoring, keyword research, and crawler tools provide is extracted in one form or another from Google itself. Doing it yourself is a disjointed, meticulous process, but you can piece together all the SEO data you need to come up with an optimization strategy should you be so inclined.
Depending on your topic or vertical and your geographic location, the search engines may have vastly different search volumes. The tool can only offer approximations. Exact search volumes are hard to find due to vanity searches, click bots, rank checkers, and other forms of automated traffic. Exceptionally valuable search terms may show far greater volume than they actually have because competing commercial interests inflate them with automated search traffic.
Provide full functionality on all devices. Mobile users expect the same functionality - such as commenting and check-out - and content on mobile as well as on all other devices that your website supports. In addition to textual content, make sure that all important images and videos are embedded and accessible on mobile devices. For search engines, provide all structured data and other metadata - such as titles, descriptions, link-elements, and other meta-tags - on all versions of the pages.
Tablet - We consider tablets as devices in their own class, so when we speak of mobile devices, we generally do not include tablets in the definition. Tablets tend to have larger screens, which means that, unless you offer tablet-optimized content, you can assume that users expect to see your site as it would look on a desktop browser rather than on a smartphone browser.
At Yoast, we practice what we call ‘holistic SEO’. This means that your primary goal should be to build and maintain the best possible website. Don’t try to fool Google, but use a sustainable long-term strategy. Ranking will come automatically if your website is of extremely high quality. Google wants to get its users to the right place, as its mission is to index all the world’s online information and make it universally accessible and useful.

This helpful tool scans your backlink profile and turns up a list of contact information for the links and domains you'll need to reach out to for removal. Alternatively, you can export the list if you would rather disavow those links using Google's disavow tool. (Essentially, the disavow tool tells Google not to take these links into account when assessing your site.)
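For the disavow route, the exported list ends up as a plain text file with one entry per line: either a full URL or a whole domain written with a "domain:" prefix, with "#" marking comment lines. The sketch below assembles such a file from a list of bad links; the function name and the sample URLs are made up, and it is worth checking Google's current documentation for file limits before uploading anything.

```python
from urllib.parse import urlsplit


def build_disavow_file(bad_urls, whole_domains=()):
    """Return disavow-file text: domain entries first, then individual URLs.

    Hypothetical helper; URLs on an already-listed domain are skipped
    because the domain entry covers them.
    """
    blocked = set(whole_domains)
    lines = ["# Disavow list generated from backlink audit"]
    for domain in sorted(blocked):
        lines.append(f"domain:{domain}")
    for url in sorted(set(bad_urls)):
        if urlsplit(url).hostname not in blocked:
            lines.append(url)
    return "\n".join(lines) + "\n"


# Made-up audit results.
text = build_disavow_file(
    ["http://spam.example/page1", "http://links.example/a"],
    whole_domains=["spam.example"],
)
print(text)
```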


Getting the most out of your optimization efforts means understanding the data you’re collecting, from analytics implementation to report setup to analysis techniques. In this session, Krista walks you through several tips for using analytics data to empower your optimization efforts, and then takes it further to show you how to level-up your efforts to take advantage of personalization from mass scale all the way down to individual user actions.
The ranking of your website is partly decided by on-page factors. On-page SEO factors are all those things you can influence from within your actual website. These factors include technical aspects (e.g. the quality of your code and site speed) and content-related aspects, like the structure of your website or the quality of the copy on your website. These are all crucial on-page SEO factors.
When referring to the homepage, a trailing slash after the hostname is optional since it leads to the same content ("https://example.com/" is the same as "https://example.com"). For the path and filename, a trailing slash would be seen as a different URL (signaling either a file or a directory), for example, "https://example.com/fish" is not the same as "https://example.com/fish/".
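The rule can be sketched as a small comparison function: an empty path is treated as "/", while any other path is compared exactly as written. The function name is hypothetical, and this is a simplification that ignores other normalization questions such as default ports or percent-encoding.

```python
from urllib.parse import urlsplit


def same_url_for_crawling(url_a, url_b):
    """Compare two URLs, treating a bare hostname and hostname + "/" as equal."""

    def key(url):
        parts = urlsplit(url)
        # Only the homepage gets the implicit slash; other paths stay as-is.
        path = parts.path or "/"
        return (parts.scheme.lower(), parts.netloc.lower(), path, parts.query)

    return key(url_a) == key(url_b)


print(same_url_for_crawling("https://example.com/", "https://example.com"))  # True
print(same_url_for_crawling("https://example.com/fish", "https://example.com/fish/"))  # False
```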
Early versions of search algorithms relied on webmaster-provided information such as the keyword meta tag or index files in engines like ALIWEB. Meta tags provide a guide to each page's content. Using metadata to index pages was found to be less than reliable, however, because the webmaster's choice of keywords in the meta tag could potentially be an inaccurate representation of the site's actual content. Inaccurate, incomplete, and inconsistent data in meta tags could and did cause pages to rank for irrelevant searches.[10] Web content providers also manipulated some attributes within the HTML source of a page in an attempt to rank well in search engines.[11] By 1997, search engine designers recognized that webmasters were making efforts to rank well in their search engines, and that some webmasters were even manipulating their rankings in search results by stuffing pages with excessive or irrelevant keywords. Early search engines, such as AltaVista and Infoseek, adjusted their algorithms to prevent webmasters from manipulating rankings.[12]