Have you ever wondered why some listings on Google appear to have more information about the page than others? Mostly, this is because of semantic markup and microformats.
In its early days, the web offered very little way to express what page content meant. People could understand the content on web pages, but automated scripts and computers couldn't easily understand its meaning. The best that computers had to go on was mostly meta tags, headings and a few other hints. To get meaning from web pages, search engines had to infer it from a whole host of different factors, which is less accurate than being told directly what a page means. Semantic markup gives content authors the ability to declare that meaning on their web pages.
Semantic markup is embedded into your HTML; the advantage of doing this is that scripts can then understand the meaning of certain aspects of your page, from user ratings to concerts. By explaining your content in set formats, search engines and many other content providers are able to understand much more easily what your content contains.
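As a rough illustration of what "embedded into your HTML" means, here is the same sentence about a concert written twice: once as plain HTML, and once annotated. This is only a sketch. It assumes the schema.org vocabulary and the Microdata format, both of which are described later in this article, and it borrows the event details from the Events example further down.

```html
<!-- Plain HTML: a person can read this, but a script sees only text -->
<p>The Lancashire Hotpots play The Rescue Rooms on 18 October 2014.</p>

<!-- The same sentence with semantic markup (Microdata + schema.org) -->
<p itemscope itemtype="https://schema.org/Event">
  <span itemprop="name">The Lancashire Hotpots</span> play
  <span itemprop="location">The Rescue Rooms</span> on
  <!-- the machine-readable date is taken from the datetime attribute -->
  <time itemprop="startDate" datetime="2014-10-18T20:00">18 October 2014</time>.
</p>
```

Both versions look identical to a visitor; only the second tells a script which part is the event name, which is the venue and which is the start date.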
Vocabularies
A vocabulary is a collection of schemas used to describe data such as events, reviews, people and products.
Many different vocabularies exist to express data on the web. Rather than cover them all, it is easier to simply list the ones I am not using: microformats (https://microformats.org/), FOAF (https://www.foaf-project.org/), GoodRelations (https://www.heppnetz.de/projects/goodrelations/) and SIOC (https://www.sioc-project.org/).
Instead I am using schema.org (https://schema.org/), which has a lot of definitions for different types of data on web pages. What makes schema.org the most significant vocabulary for SEO is that it is a collective agreement between the big search engines: Bing, Google, Yahoo! and Yandex. See all the schemas here: https://schema.org/docs/full.html.
Formats
A format is the syntax in which the data is expressed, in much the same way that XML, HTML and XHTML are competing syntaxes for marking up documents. Two common formats are now standardised on the web for use in HTML: RDFa in HTML, and Microdata.
For historical reasons I would suggest using the Microdata format with schema.org, as its adoption by the search engines is higher. For a casual user there isn't much difference between RDFa and Microdata.
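To show how small the difference is for a casual user, here is the same rating described in both formats. This is a sketch rather than markup taken from a real page: the numbers mirror the review example later in this article, and the RDFa version uses the simpler RDFa Lite attributes (vocab, typeof, property).

```html
<!-- Microdata: itemscope / itemtype / itemprop -->
<div itemscope itemtype="https://schema.org/AggregateRating">
  <span itemprop="ratingValue">4.7</span> out of
  <span itemprop="bestRating">5</span>, based on
  <span itemprop="reviewCount">376</span> reviews
</div>

<!-- RDFa Lite: vocab / typeof / property -->
<div vocab="https://schema.org/" typeof="AggregateRating">
  <span property="ratingValue">4.7</span> out of
  <span property="bestRating">5</span>, based on
  <span property="reviewCount">376</span> reviews
</div>
```

The vocabulary (schema.org) and the property names are the same in both; only the attribute names that carry them change.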
Describing your data
As I mentioned earlier, there are many different schemas that schema.org provides. As a company, you can use these schemas to describe your data. I'm going to run you through two examples which will give you a snapshot of how schemas can be used.
Product reviews
Product reviews can be seen on Google, and they give users a handy indication of a product's quality. As an example, here is how a Dyson vacuum cleaner sold by Argos looks on Google:

Here is how that page appears in Google’s rich snippet checker.
As you can see, Google has been able to extract a lot of information from this single page, and it can use this information in how it presents your page to people who are searching.
Argos is using a schema that does not come from schema.org, called h-review: https://microformats.org/wiki/h-review (there is also a newer version of h-review: https://microformats.org/wiki/microformats2#h-review).
However, Google and all the other big search engines now recommend schema.org's AggregateRating for the same rolled-up reviews: https://schema.org/AggregateRating
<div itemprop="aggregateRating" itemscope itemtype="https://schema.org/AggregateRating">
  <span itemprop="ratingValue">4.7</span>
  <!-- property names are case-sensitive: bestRating, not bestrating -->
  <span itemprop="bestRating">5</span>
  <span itemprop="reviewCount">376</span>
</div>
What you will also notice from looking at the above screenshot from Google's structured data checker is that the review is actually embedded within the product data. This is how semantic markup works: you can embed one type of content inside another. In any case, Google will most likely only want to display reviews that it can associate with a particular tangible thing or service. You can read more about product markup here: https://schema.org/Product
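A minimal sketch of that embedding might look like the following. The product name is a made-up placeholder, not the real Argos listing, and a real product page would normally carry more properties (image, offers and so on).

```html
<div itemscope itemtype="https://schema.org/Product">
  <!-- hypothetical product name, for illustration only -->
  <h1 itemprop="name">Example vacuum cleaner</h1>

  <!-- the rating is a property of the product, so it sits inside it -->
  <div itemprop="aggregateRating" itemscope itemtype="https://schema.org/AggregateRating">
    Rated <span itemprop="ratingValue">4.7</span>
    out of <span itemprop="bestRating">5</span>
    from <span itemprop="reviewCount">376</span> reviews
  </div>
</div>
```

Because the AggregateRating item is attached through the product's aggregateRating property, a script reading the page knows what the 4.7 refers to.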
Events
Events can also be seen on Google, giving users quick snippets of information about where and when, with links for more detail.

Checking with Google's rich snippet checker, we can see the site is using another structured data format, hCalendar, whereas the search engines are peddling another schema for this.
Here is how one of the events may look in schema.org format:
<div itemscope itemtype="https://schema.org/Event">
  <a itemprop="url" href="https://www.ents24.com/nottingham-events/the-rescue-rooms/the-lancashire-hotpots/3872801">
    <span itemprop="name">Golden Crates Tour: The Lancashire Hotpots, The Bar-Steward Sons Of Val Doonican</span>
  </a>
  <!-- machine-readable dates go in meta tags; the visible text stays human-friendly -->
  <meta itemprop="startDate" content="2014-10-18T20:00" />
  Sat, 18/10/2014
  8:00 p.m.
  <meta itemprop="endDate" content="2014-10-19T00:00" />
  <div itemprop="location" itemscope itemtype="https://schema.org/Place">
    <span itemprop="name">The Rescue Rooms</span>
    <div itemprop="address" itemscope itemtype="https://schema.org/PostalAddress">
      <span itemprop="streetAddress">25 Goldsmith Street</span>
      <span itemprop="addressLocality">Nottingham</span>,
      <span itemprop="addressRegion">England</span>
      <span itemprop="postalCode">NG1 5LB</span>
    </div>
    <div itemprop="geo" itemscope itemtype="https://schema.org/GeoCoordinates">
      <meta itemprop="latitude" content="52.956802" />
      <meta itemprop="longitude" content="-1.155241" />
    </div>
  </div>
  <img itemprop="image" src="https://media.ents24network.com/image/000/090/521/d7867d511725a1f0ec60c7c2def99de088ca3516.jpg" />
  <div itemprop="offers" itemscope itemtype="https://schema.org/AggregateOffer">
    <!-- lowPrice should be a plain number; the currency is declared separately -->
    <meta itemprop="priceCurrency" content="GBP" />
    Priced from: £<span itemprop="lowPrice">13.44</span>
    <span itemprop="offerCount">1938</span> tickets left
  </div>
</div>
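Again, you will notice that several different schemas are embedded within each other: the Event contains a Place (which in turn contains a PostalAddress and GeoCoordinates) and an AggregateOffer.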

This building of data trees also helps search engines work out more about the relationships between data, in a similar way to how databases can have many-to-many relationships. For example, search engines would be able to infer that one event takes place in multiple places, or that the same place has many events or multiple offers.
This layering of data is also one of the many tools that Google uses to build its Knowledge Graph: https://www.google.co.uk/insidesearch/features/search/knowledge.html. It gives Google another way to map understanding from sites and direct traffic based upon completely different search vectors. Giving users the ability to explore the data on your site via a search engine can lead to many new customers discovering your site, which is ultimately the goal of SEO.
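As a sketch of the "same place has many events" case, the tree can also be turned around so that the place is the outer item and the events hang off it through schema.org's event property. The second event below is entirely hypothetical and is there only to show the repetition. Whether a given search engine makes use of this particular nesting is an assumption on my part.

```html
<div itemscope itemtype="https://schema.org/Place">
  <span itemprop="name">The Rescue Rooms</span>

  <!-- first event at this place -->
  <div itemprop="event" itemscope itemtype="https://schema.org/Event">
    <span itemprop="name">Golden Crates Tour: The Lancashire Hotpots</span>
    <meta itemprop="startDate" content="2014-10-18T20:00" />
  </div>

  <!-- second event at the same place (hypothetical, for illustration) -->
  <div itemprop="event" itemscope itemtype="https://schema.org/Event">
    <span itemprop="name">Hypothetical second event</span>
    <meta itemprop="startDate" content="2014-10-25T19:30" />
  </div>
</div>
```

Repeating the same itemprop on sibling items is how Microdata expresses one-to-many: one Place, several Event values for its event property.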
Further reading
- Google has more explanation about their use of structured data: https://support.google.com/webmasters/answer/99170?hl=en&ref_topic=4600154
- Making search more visible: https://www.opensearch.org/
- A similar explanation of why semantics is important in SEO: https://blog.hittail.com/2013/08/what-the-heck-is-semantic-seo-and-should-you-care-at-all.html
- A guide to improving the meaning using HTML5 tags: https://diveintohtml5.info/semantics.html
- Technical differences between Microdata and RDFa: https://manu.sporny.org/2012/mythical-differences/
Are you using (or planning to use) semantic markup on your website? Let us know in the comments!


