Guest post, by Mark Sharron, SussexSEO.
Cast your minds back four or five years: the only real way of doing on-page SEO properly was to insert your primary keywords into the title, headings and body text of the page you were trying to rank, sometimes forcibly. Pretty much the same way that Yoast SEO and the majority of bloggers instruct users today. If your target was to rank highly for a particular term, you were looking at around 5-10% keyword density, and it was of paramount importance to put your term in the following spots in your HTML:
<title>, <h1>, <h2>, <h3>, <b> or <strong>, <i> or <em>, <img alt="" />, <meta name="description" content="">, <meta name="keywords" content="">
…that was generally the long and short of it. My aim here isn't to discredit this particular procedure, as it can still be useful in certain circumstances, although it can only do so much before you over-optimise your site and a particular black-and-white creature (Penguin or Panda) causes you a fair amount of problems. Times have changed, yet SEO remains little more than a labelling exercise; the procedure is the same, although its rules have been updated, and they're much more convoluted.
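The 5-10% density rule of thumb mentioned above is easy to check mechanically. Here is a minimal sketch; the helper name and sample text are my own inventions, not taken from any particular SEO tool:

```python
# Hypothetical helper illustrating the keyword-density rule of thumb:
# density = occurrences of the keyword / total word count.
def keyword_density(text, keyword):
    words = text.lower().split()
    return words.count(keyword.lower()) / len(words)

# 3 occurrences of "cheap" in 10 words -> 30% density (well past "stuffed")
sample = "cheap flights london book cheap flights and more cheap flights"
print(round(keyword_density(sample, "cheap"), 2))  # prints 0.3
```

Note this only counts single-word exact matches; it is meant to show how crude the old density metric really was.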
It is safe to say that forcing your primary keyword into your text and automatically ranking is a thing of the past. The Google of today is more focused on algorithmically comprehending the theme of your site (via classifiers), scrutinising the language used on the page with latent semantic indexing and latent semantic analysis techniques, and examining site structure (siloing).
Once upon a time, Yahoo was undoubtedly the trailblazer in this particular task. Its PPC division was run by a company called Overture, which used technology licensed from a third party (Applied Semantics) to suggest semantic variations of keywords for Overture clients to bid on. With ears pricked and attention caught, Google acquired Applied Semantics in 2003 with the aim of co-opting its technology and patents to develop both its paid advertising offerings and organic search by putting latent semantic indexing into action. As a direct consequence, Yahoo and Overture were left gutted, their market share taking a comprehensive and irreparable plunge.
Defining Latent Semantic Indexing:
"Latent semantic indexing (LSI) is an indexing and retrieval method that uses a mathematical technique called singular value decomposition (SVD) to identify patterns in the relationships between the terms and concepts contained in an unstructured collection of text." (Wikipedia)
Put simply, latent semantic indexing allows the algorithm to distinguish between the possible meanings of the language used within a document.
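To make the idea concrete, here is a toy sketch of latent semantic analysis using plain NumPy: a tiny term-document matrix is decomposed with SVD, truncated to a low rank, and documents are then compared in the reduced "concept" space. The corpus and vocabulary are invented for illustration; a real engine would work at vastly larger scale with weighting schemes on top.

```python
import numpy as np

# Four toy documents: two "weather" senses of cloud, two "Apple" senses
docs = [
    "rain cloud weather grey",
    "cloud storage ipad mac",
    "flooding rain weather storm",
    "icloud cloud ipad apple",
]
vocab = sorted({w for d in docs for w in d.split()})

# Term-document matrix: rows = terms, columns = documents
A = np.array([[d.split().count(t) for d in docs] for t in vocab], dtype=float)

# Singular value decomposition, truncated to rank k (the "latent" step)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # each row: a document in latent space

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The two weather documents (0 and 2) should sit closer together in the
# reduced space than the weather document and the Apple document (0 and 1),
# even though 0 and 2 share only two surface terms.
print(cos(doc_vecs[0], doc_vecs[2]) > cos(doc_vecs[0], doc_vecs[1]))
```

The truncation is the whole trick: discarding the smaller singular values collapses terms that co-occur into shared dimensions, so documents can be similar without sharing exact keywords.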
When identifying LSI keywords, it is paramount to think of terms related to your main keyword that will, in turn, enable engines to understand the specific context of a piece of work.
Essentially, for a primary keyword such as “cloud”, LSI keywords would be “rain”, “grey”, “weather”, “flooding”, “I wandered lonely as a”, etc.
Furthermore, if terms like iPad, iPod and Mac were inserted into a piece of work around the primary keyword “cloud”, an LSI algorithm would be able to recognise that the word “cloud” referred to the Apple iCloud, or the general digital term, and not the meteorological phenomenon.
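A crude way to picture this disambiguation is simple word overlap against sense-specific context sets. The sense labels and word sets below are invented for illustration, and a real LSI system works in the SVD-reduced space rather than on raw overlap, but the intuition is the same:

```python
# Hypothetical context sets for the two senses of "cloud" discussed above
senses = {
    "weather": {"rain", "grey", "flooding", "storm"},
    "apple": {"ipad", "ipod", "mac", "icloud"},
}

def disambiguate(text):
    """Pick the sense whose context set overlaps the text the most."""
    words = set(text.lower().split())
    return max(senses, key=lambda s: len(senses[s] & words))

print(disambiguate("the cloud sync on my ipad and mac"))  # prints "apple"
```

The surrounding LSI keywords, not the word “cloud” itself, are what decide the classification.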
It is fair, and probably smart, to assume that peppering a document with LSI keywords enables the search engine to pin down its theme and intent without you having to worry about over-optimisation. In short, LSI is more about the construction of context, a relatable shell of words around the keywords, than about the terms themselves.
One of the more pressing inaccuracies doing the rounds is that LSI keywords are necessarily synonyms. This isn't entirely accurate: leaning on synonyms alone can strip the contextual signal from the document; related keywords, entities and concepts are needed throughout the text.
Enter The Hummingbird
Google wheeled out the Hummingbird Algorithm on August 30, 2013; it was the direct result of different semantic search puzzle pieces fitting together, and as per Google search chief Amit Singhal, was the “first major update of its type since 2001.”
Hummingbird takes each word into consideration: its relationship to the other words in the query, and how, as a whole, they convey the meaning of the complete sentence or conversation, rather than weighing individual words in isolation. The aim is that pages matching the meaning do better, rather than pages matching just a couple of terms.
“Simply put, it’s not just about keywords nowadays but since the hummingbird update Google is focusing more on meaningful signals (semantics).”
As well as indexation, the algorithm allowed for significant developments in conversational search, leveraging natural language processing to enhance the way search queries are handled and supporting the growing number of users interacting with search engines through mobile devices, voice commands and digital assistants.
A New Epoch of SEO Copy Writing:
Admittedly, "determining context and relationships between terms, phrases and collections of the latter" is quite a tricky notion, not easily understood immediately. Finding usable information is made more difficult by the SEO industry's failure to understand it, as well as by the archetypal travelling potion peddler jumping onto the bandwagon, using LSI as a buzz phrase and attempting to sell the uneducated layman so-called LSI keyword tools that don't perform the functions expected of them.
It's acutely important to be aware of LSI and of the value of writing natural, free-flowing copy for the web. Spending time and effort to create the right content will pay dividends in the push for higher rankings.
This is a new epoch in SEO writing; an epoch of singular value decomposition and probability-based models. An era wherein each piece of work is a probability distribution over different topics and contexts, each topic is a probability distribution over different terms, and you, the scribe, research conceptual ideas such as "n-grams" to ensure that you are up to scratch.
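The "document as a distribution over topics, topic as a distribution over terms" picture can be written down directly. The numbers below are invented purely for illustration; in a real topic model they would be learned from a corpus:

```python
# Toy generative picture: P(term | doc) is a mixture over topics,
# P(term | doc) = sum over topics of P(topic | doc) * P(term | topic).
doc_topics = {"weather": 0.7, "apple": 0.3}           # P(topic | doc), made up
topic_terms = {
    "weather": {"cloud": 0.4, "rain": 0.6},
    "apple":   {"cloud": 0.5, "ipad": 0.5},
}                                                      # P(term | topic), made up

def p_term(term):
    return sum(p * topic_terms[t].get(term, 0.0) for t, p in doc_topics.items())

print(round(p_term("cloud"), 2))  # 0.7*0.4 + 0.3*0.5 = 0.43
```

Under this view, a document that is 70% "weather" naturally produces words like "rain" more often, which is why sprinkling related terms shifts how the document is classified.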
No Need to Worry:
The algorithm is not an AI; it only understands and processes numbers through a data matrix (singular value decomposition), not semantics. If you want to rank for a competitive term, you're still going to have to resort to some level of keyword targeting.
As previously stated, the game remains unchanged; on-site SEO continues to be a labelling exercise. The main objective now is implementing as many LSI keywords as you can to ensure Google classifies your site within an appropriate category. There are certain quality checks to adhere to, e.g. spelling and reading level, if the aptly named Phantom update of a few months ago did what was promised of it, but the general idea of the task at hand hasn't developed as much as it might seem; the techniques employed 10 years ago to rank a website for a desired keyword are very similar.
Finding LSI Keywords:
Many apparent LSI keyword tools work by scouring Google's related searches or autocomplete database, prefixing or suffixing your seed word with different alphabetic characters. A lot can be said for this method; however, it will not produce LSI keywords, only slight deviations and nuances of the seed term.
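For reference, the prefix/suffix trick those tools rely on can be sketched in a few lines. The seed word is arbitrary; a real tool would then submit each variant to the autocomplete endpoint and harvest the suggestions, which is exactly why the results are surface variations rather than genuine LSI keywords:

```python
import string

seed = "cloud"
# One candidate query per letter, appended and prepended to the seed,
# mimicking how the tools described above enumerate autocomplete lookups.
variants = [f"{seed} {c}" for c in string.ascii_lowercase] \
         + [f"{c} {seed}" for c in string.ascii_lowercase]
print(len(variants))  # prints 52, e.g. "cloud a" ... "z cloud"
```

Every output still contains the literal seed word, so the harvest can only ever be lexical variants, never contextually related terms.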
Firing off an organic search MAY score you a few LSI keywords in the related searches output. The simplest and clearest method for finding LSI keywords, though, is to use Google's own PPC keyword research tool: place a short-tail variant of a keyword into the tool, then add that same term to the negative filter. This will give you a comprehensive list of related terms, which you can then implement in your website copy.