“A gap to bridge… What gap, what bridge?”
Relax; give me a moment. I have not said a word yet.
Let me first clarify the entities this article revolves around so that you understand who and what I am actually talking about:
- Google Search: Just plain old Google Search as we have known it since 1998.
- Google Shopping: A Google service that allows you to search, view, and compare products since 2010.
- Google Merchant Center (GMC): A platform that, since 2010, allows you to manage how your products appear in some of the different services Google provides. These include organic results in Google Search, Google Shopping, Google Maps, YouTube, Google Images, and Google Lens, and Google Ads channels like Google Search, Google Shopping, Google Maps, YouTube, Google Images, and Google Discover.
As for the “gap” you asked about – this has to do with the complex relationship between Google Search, Google Shopping, and their intermediary, Google Merchant Center, and the fact that there is still a technical gap to close for that relationship to start working efficiently.
They have been working together on closing this gap over the last three to four years, yet it still needs to be fully bridged.
It’s a gap that shows itself in the way we (a) can use GMC to get product variants into Google Shopping and update them vs. (b) cannot do so efficiently through structured data markup and Google Search.
This, in practice, still makes product feeds the preferred and most solid solution for any merchant out there.
Conflicting Structured Data Markup Requirements
For a long time, this gap also extended into some of the structured data markup requirements both parties upheld.
These discrepancies would often lead to the problem that, even though our Product and Offer markup for Google Search was in order, Google Merchant Center would not be satisfied with it – or vice versa.
Often, we could only resolve this situation by duplicating some of our markup and modeling it ever-so-slightly differently.
This forced us to publish superfluous information because different sides of Google were not taking each other’s requirements into account (or at least not enough) – leaving it up to us to make sense of and deal with that mess.
Different Parts Of Google Finally Began Building A Bridge
Eventually, the different Google parties involved started aligning their structured data markup requirements.
This was even before it was announced that Google Shopping would start including organic results, which was a very important step forward, as it meant products could now finally get into Google Shopping for free.
Hooray for us, the folks who need to produce the structured data they require, yet often get lost in the how-to-do-it – as well as those who always wanted to be included in Google Shopping yet could not afford to.
This signaled a shift in stance, the effects of which we could already see over at schema.org as early as mid-2019, when the first requested changes to the vocabulary started coming in and getting released.
To illustrate that point, here’s a list of types, properties, and enumerations that have been added and/or modified in the first year following that moment:
- 2019-07-01, schema.org release 3.8: ProductReturnPolicy, gtin.
- 2020-01-21, schema.org release 6.0: MerchantReturnPolicy.
- 2020-05-01, schema.org release 8.0: OfferShippingDetails, deliveryTime, doesNotShip, shippingDestination, shippingLabel, shippingRate, shippingSettingsLink, transitTimeLabel, ShippingDeliveryTime, DefinedRegion, PostalCodeRangeSpecification, ShippingRateSettings.
- 2020-07-21, schema.org release 9.0: size, pattern, ProductGroup, variesBy, hasVariant, productGroupID, isVariantOf, ProductCollection.
- 2020-09-07, schema.org release 10.0: hasEnergyConsumptionDetails, EnergyConsumptionDetails, hasEnergyEfficiencyCategory, EnergyEfficiencyEnumeration, EUEnergyEfficiencyEnumeration, EnergyStarEnergyEfficiencyEnumeration, energyEfficiencyScaleMax, energyEfficiencyScaleMin.
- See schema.org’s releases page for a full overview of all its updates (it is now at version 23.0).
Obstacles Being Replaced By Semantic Building Blocks
Since then, many of GMC’s product attributes have been translated into new schema.org types, properties, and data shapes.
This is a serious attempt at allowing us to express everything we can express in GMC product feeds through schema.org annotations – thus reducing Google’s reliance on feeds while offering us a new way to achieve the same thing.
It’s a process that is far from done yet, though by now, it is no longer just theory.
We have all seen the changes happening over the last couple of years, including changes in Google Search and Google Shopping results, Google Merchant Center and Google’s Structured Data documentation, and Google Search Console reports.
Google Shopping Shows Product Variants In A Single Result, Google Search – Does Not (Yet?)
Grouping product variants into a single result is a Google Shopping ad type but also a feature for organic product variant listings.
In GMC feeds, that grouping relies on the item_group_id attribute, which, until not so long ago, did not have a schema.org equivalent and, therefore, could not be expressed in any form of structured data outside GMC product feeds – thus preventing us from expressing product variant relations in structured data markup and keeping us dependent on GMC.
Introducing A (Somewhat New) Fundamental Piece Of The Puzzle
That piece is inProductGroupWithID: a translation of a GMC product attribute into schema.org lingo that was added to schema.org in July 2020 but did not appear in GMC’s documentation or Google’s structured data feature guide until September 2022.
This is a property that, at first glance, might give the impression that the problem has been solved, as it allows us to create Product markup that describes the same relations our GMC XML feeds do.
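As a minimal sketch of what such Product markup could look like, the snippet below builds the JSON-LD for two variants in Python. The product names, SKUs, prices, query-string URLs, and the group ID "summer-t-shirt" are made-up placeholders, and the helper variant_markup is hypothetical; only the schema.org terms themselves (Product, Offer, inProductGroupWithID, and so on) come from the vocabulary.

```python
import json

GROUP_ID = "summer-t-shirt"  # assumed placeholder; plays the role item_group_id has in a feed


def variant_markup(sku: str, color: str, size: str, price: str) -> dict:
    """Build JSON-LD for one product variant page (all values are placeholders)."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": f"Summer T-shirt ({color}, {size})",
        "sku": sku,
        "color": color,
        "size": size,
        # The same value on every variant page ties the variants together.
        "inProductGroupWithID": GROUP_ID,
        "offers": {
            "@type": "Offer",
            "url": f"https://www.example.com/products/summer-t-shirt?sku={sku}",
            "price": price,
            "priceCurrency": "EUR",
            "availability": "https://schema.org/InStock",
        },
    }


variants = [
    variant_markup("TS-RED-M", "Red", "M", "19.95"),
    variant_markup("TS-BLUE-L", "Blue", "L", "19.95"),
]

# Each dict would be embedded in a <script type="application/ld+json"> tag
# on the matching product variant detail page.
print(json.dumps(variants[0], indent=2))
```

The only thing connecting the two variants here is the shared inProductGroupWithID value, which is why every variant page has to carry it.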
Note: You need to add the inProductGroupWithID property to the markup of all the product variant detail pages that are part of a group – although this is also exactly where the next issue arises.
XML Feeds Vs. Semantic Annotations – A Battle Of Efficiency
Many do not spend much time thinking about it, but the reality is that the gathering of structured data markup via web pages – at a web scale – is quite a complex and resource-heavy task.
This means that Google has to put in quite the effort (and money) to accumulate all that information. And it does not end there.
After collecting all that information, Google still needs to combine all that product variant information and figure out how all those different products fit together.
Oh, and then there is also the fact it has to deal with the enormous amount of markup mistakes out there – which makes it even more difficult to process such information.
It’s a process that never stops because Google also needs to stay up to date in regard to values like availability, price, shipping costs, etc. – information that does not tend to be very static.
And this is something Google must do for every single product detail page on your site, over and over again.
Canonical URLs Are Counterproductive For Keeping Product Variant Info Up To Date
You might not be aware that both Google Search and GMC have some guidelines in regard to product variants and how to canonicalize them.
However, if you have ever dipped your toes in technical SEO, this should not be news to you. If that is indeed the case, then you might also be aware that Google bots tend to crawl canonicalized URLs substantially less than the URLs the <link rel="canonical"> refers to.
That crawl rate tends to drop even further the longer a canonical referral stays in place.
The reason is that it is a clear signal to Google Search, one that indicates (near) duplicate content as well as a location of the most important version of that (near) duplicate content.
This is highly efficient for Google Search but not for you. Take, for example, those of your products that were out of stock for some time and are now available again, yet do not sell because the Google bots are taking their sweet time to re-crawl your product pages. In the meantime, your product search results still indicate they are out of stock, on page 3 or even deeper down in limbo.
Oh, and how about your Google Shopping out-of-stock organic product listings? How are those working out for you?
Of course, you can change the <lastmod> of those URLs in your sitemap XML file, which works great if you have fewer than 500 products, but it does not help much when you are talking about tens or hundreds of thousands (or even more) of product URLs.
Then those <lastmod> updates represent raindrops in an ocean of changes, and guess what? In those cases, canonicalized URLs get crawled even less.
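For the small-catalog case, updating <lastmod> could be sketched as follows. This assumes a hypothetical list of (URL, last-changed timestamp) pairs pulled from your own product data; the function name build_sitemap and the example URL are illustrative only.

```python
from datetime import datetime, timezone
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(products: list[tuple[str, datetime]]) -> str:
    """Return sitemap XML with one <url> entry, including <lastmod>, per product URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, changed_at in products:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # ISO 8601 with a UTC offset, which fits the W3C datetime format
        # the sitemap protocol asks for.
        ET.SubElement(entry, "lastmod").text = changed_at.isoformat(timespec="seconds")
    return ET.tostring(urlset, encoding="unicode")


restocked = [
    (
        "https://www.example.com/products/summer-t-shirt?color=red",
        datetime(2024, 5, 1, 9, 30, tzinfo=timezone.utc),
    ),
]
print(build_sitemap(restocked))
```

With a handful of URLs, a changed <lastmod> stands out. With hundreds of thousands, each updated entry is one signal among many, which is the limitation described above.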
GMC Feeds – Still The Most Effective Way To Inform Google About Product Updates
Google only has to fetch a limited number of product feeds. Because of that, feeds offer Google a much easier and more cost-effective method for getting its hands on complete and up-to-date product datasets.
I am not saying that the data contained in feeds is always perfect. (Surprise! They often also look like an exploded piñata.) But at least Google does not have to crawl entire domains upfront so as to get the complete picture.
This situation brings us to the next piece of the puzzle: ProductGroup, a new Class that helps us bundle some of that information at more centralized locations on our sites – probably the same locations your product variant <link rel="canonical"> URLs refer to.
This makes it easier for Google to get to some important pieces of the puzzle prior to having crawled every single product, as all of them will be able to refer to that more centralized version of your structured data markup.
And because your canonical target URLs get crawled at a much higher frequency, your product search results will also update at a higher frequency, thus closing much of the efficiency gap between product feeds and structured data markup.
Be aware that I’m not going to repeat everything schema.org’s description of ProductGroup already expresses (Tip: Always read the manual!).
But First, A Word Of Caution
I emphasized “likely” because, until the moment Google finally makes its announcement(s), nobody will know exactly how it will implement and make use of this new type – nor how other consumers of the same data (e.g., Bing, Facebook, Pinterest, etc.) will respond to it, or not.
This implies that, until Google makes its announcements, you should be very wary of anybody claiming to know how this will pan out and develop over time. Even the folks over at schema.org and Google do not know for sure, so do not let anybody else tell you they do know.
One thing is sure, though: a lot of the future developments will greatly depend on customer and peer feedback, meaning it will likely also undergo a very agile evolution process.
This is illustrated by the fact that Google has been at it for years now and still has not finished every single detail – to leave some room to wiggle.
What To Prepare For Before ProductGroup Is Announced
Given that I am one of the people who have been involved in the discussions surrounding this topic over at schema.org for a very long time, and that some Googlers have been playing a very active part in those discussions as well, I feel confident enough to at least try to provide you with some insights into the why and how behind some of the things heading our way in the (near) future.
By doing so, I hope to give you ample time to start looking into this before Google makes any announcements – because chances are high that implementing this type of markup will come with some technical challenges.
I expect you will need some time to familiarize yourself with everything it entails so you can figure out what needs to be done before it becomes a fact of life. For example:
- Is the data you will need available to you?
- Can your systems produce all the parts of the markup at the right moment?
- Are you able to provide different markup on the canonical URL than you do on the other product variant URLs?
- In case you have feeds, how will you ensure the data in your markup and your feed(s) stays in sync?
And if you do not prepare yourself for this, chances are high you will be caught off guard when Google finally does make its announcements and end up in a rat race, desperately trying to catch up with your competitors.
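The sync question from the list above could be approached with a periodic comparison between feed and markup. The sketch below assumes both sources have already been loaded into dictionaries keyed by SKU and normalized to the same value formats; the field names price and availability and the function find_mismatches are illustrative assumptions, not a prescribed format.

```python
def find_mismatches(
    feed: dict[str, dict],
    markup: dict[str, dict],
    fields: tuple[str, ...] = ("price", "availability"),
) -> list[str]:
    """List human-readable differences between feed rows and page markup."""
    problems = []
    for sku, feed_row in feed.items():
        page = markup.get(sku)
        if page is None:
            problems.append(f"{sku}: in feed but no markup found")
            continue
        for field in fields:
            if feed_row.get(field) != page.get(field):
                problems.append(
                    f"{sku}: {field} differs "
                    f"(feed={feed_row.get(field)!r}, markup={page.get(field)!r})"
                )
    return problems


feed = {
    "TS-RED-M": {"price": "19.95", "availability": "in_stock"},
    "TS-BLUE-L": {"price": "19.95", "availability": "in_stock"},
}
markup = {
    "TS-RED-M": {"price": "19.95", "availability": "out_of_stock"},
}
for problem in find_mismatches(feed, markup):
    print(problem)
```

Running a check like this before each feed upload is one way to catch a restocked product whose page markup still says it is unavailable.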
You should be careful in taking my views on this as facts. The things I will describe next are biased by the intentions and use cases I had in mind when starting the discussion over at schema.org.
And since I am “just Jarno,” a single person, this does not imply that others (a.k.a. the consumers of this ‘stuff’) will use it in exactly the same manner as I would like them to, nor that they have exactly the same use cases in mind as I had.
Nevertheless, I think I am safe lifting some of the curtain for you so that you can form your own ideas and opinions about where this is heading and how you might use it.
Informing Google Where ProductGroup Information Can Be Found
As we continue the markup example used earlier, you will see it is quite easy to inform consumers about the existence of pages containing ProductGroup information.
It only requires adding a few new lines to the markup on all product variant detail pages that are part of a ProductGroup.
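A minimal sketch of those extra lines, again built in Python: the helper with_group_reference, the SKU, and the group ID are hypothetical placeholders, and using "@id" to point at the ProductGroup page is one plausible modeling choice rather than a documented requirement.

```python
import json


def with_group_reference(product: dict, group_url: str, group_id: str) -> dict:
    """Return a copy of variant markup that points at its ProductGroup page."""
    return {
        **product,
        # The new lines: a reference to where the ProductGroup markup lives.
        "isVariantOf": {
            "@type": "ProductGroup",
            "@id": group_url,
            "productGroupID": group_id,
        },
    }


variant = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Summer T-shirt (Red, M)",
    "sku": "TS-RED-M",
    "inProductGroupWithID": "summer-t-shirt",
}

linked = with_group_reference(
    variant,
    "https://www.example.com/products/summer-t-shirt",
    "summer-t-shirt",
)
print(json.dumps(linked, indent=2))
```

The variant page itself carries no group details beyond the reference; everything else is expected to be found at the referenced URL.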
During the discussions over at schema.org, though, it was suggested that ‘consumers of such data’ should allow for an even more compact form of annotating isVariantOf information, namely:
- "isVariantOf": "https://www.example.com/products/summer-t-shirt"
The reason is that from a publisher’s perspective, this is much easier to produce while still leading to the same end result: a referral to an entry point where the actual ProductGroup information can be found without adding a bunch of markup fluff to the majority of your pages.
Here is to hoping that Google will listen to that suggestion. (Note: It does a better job at listening to suggestions than many give it credit for).
As for where to publish ProductGroup information – as I expressed earlier, I expect that Google will tell us to publish the markup on the canonical URL of the product variants that are part of a group so as to stay compliant with its current guidelines on how to deal with product variant URLs.
Some Hopeful Speculation
Maybe someday, this will lead to us not always having to choose a canonical URL for every single product variant there is – something that is quite a cumbersome and technically complex task to achieve.
As if that was not enough of a task already, there is also the fact that many ecommerce platforms do not offer any such out-of-the-box functionality.
Instead, many platforms come with a ProductGroup-like page of sorts, in the form of a product detail page with no variable properties set (awaiting user interaction).
Wouldn’t it be really neat if you could simply publish ProductGroup markup on that single URL available to you (per ProductGroup) and have it include all the Product variant information as well?
Imagine if we could do that without losing any Rich Results – a loss that sort of feels like being punished for trying to be transparent and telling the full (data) story of your web pages!
Who knows; maybe one day we will get there. With ProductGroup, there now at least is a schema.org Class, which, in theory, offers a working solution.
I guess time will tell if Google will also see the value and need for it.
“Enough with all the blah blah already, Jarno! Give me the good stuff. Show me what I can do with ProductGroup already.”
…OK, OK, OK, I hear ya. Hold onto your hats, boys and girls, here we go.
Getting Into The Nitty-gritty Of ProductGroup: A Blueprint For Its Members
Before diving into an illustrative markup example (which probably does not cover all you have to say about your products), let me first explain the idea behind ProductGroup’s second function:
- Being a prototype/template/blueprint for the Product variants that are part of a ProductGroup.
What this means is that almost all properties (and their values) you add to a ProductGroup will be inherited by its Product variants, while the variesBy property lets you express by which properties those variants differ.
This is quite ingenious (if it gets used by consumers as intended) as it prevents us from having to add loads of repetitive information.
For example, because the ProductGroup in my example has the property-value pair "material": "cotton", all the Product variants in the ProductGroup automatically inherit that same property-value pair.
This means that the main material of all the Product variants will be cotton without having to add that information to each and every single one of the Product variants.
If you think this will not make much of a difference, think again; the amount of Product information that repeats itself adds up really quickly.
Just in my example alone, I have easily saved up to 30-40 lines of code, roughly 20-25% of the total markup, because I defined the properties material, brand, aggregateRating and offers under ProductGroup.
And that is while making use of referencing via identifiers, which already condenses the markup a lot. Without that, writing everything out in full would easily add another 100-150 lines of markup.
Lastly, I should mention that each Product still has its own Offer markup, but the repetitive parts are placed under ProductGroup.offers.Offer.
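To make the blueprint idea concrete, here is a sketch of how a consumer might expand a ProductGroup into full Product variants. It is one possible interpretation only: the GROUP_ONLY set, the merge rule for nested nodes such as offers, and all product values are assumptions made for illustration, and nothing here describes how Google or any other consumer actually resolves inheritance.

```python
import copy

product_group = {
    "@type": "ProductGroup",
    "@id": "https://www.example.com/products/summer-t-shirt",
    "name": "Summer T-shirt",
    "productGroupID": "summer-t-shirt",
    "material": "cotton",
    "brand": {"@type": "Brand", "name": "ExampleBrand"},
    "variesBy": ["color", "size"],
    # Repetitive Offer parts live on the group ...
    "offers": {"@type": "Offer", "priceCurrency": "EUR"},
    "hasVariant": [
        # ... while each variant keeps its own Offer specifics.
        {"@type": "Product", "sku": "TS-RED-M", "color": "Red", "size": "M",
         "offers": {"@type": "Offer", "price": "19.95"}},
        {"@type": "Product", "sku": "TS-BLUE-L", "color": "Blue", "size": "L",
         "offers": {"@type": "Offer", "price": "21.95"}},
    ],
}

# Assumed: properties that describe the group itself and are not copied down.
GROUP_ONLY = {"@type", "@id", "hasVariant", "variesBy", "productGroupID"}


def resolve_variants(group: dict) -> list[dict]:
    """Expand each variant with the properties it inherits from its group."""
    inherited = {k: v for k, v in group.items() if k not in GROUP_ONLY}
    resolved = []
    for variant in group.get("hasVariant", []):
        merged = copy.deepcopy(inherited)
        for key, value in variant.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # Nested nodes: shared values stay, variant values win.
                merged[key] = {**merged[key], **value}
            else:
                merged[key] = value
        merged["inProductGroupWithID"] = group["productGroupID"]
        resolved.append(merged)
    return resolved


for product in resolve_variants(product_group):
    print(product["sku"], product["material"], product["offers"])
```

In this sketch, material, brand, and the shared Offer parts are written once on the group and still end up on every resolved variant, which is where the savings in repeated markup would come from.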
“Great example Jarno – not! You should test your stuff before publishing, it produces errors and warnings and the products do not inherit a single property!”
Yep, I am well aware of that fact. But that is not because I lack the skills or awareness to do this properly. Support for this stuff is on the horizon, but it is not here yet.
This is still reflected by how the different validators/testing tools out there react to this type of markup.
We Will Not See ProductGroup Before Google Is Confident It Is Ready (Enough)
Because of the number of possible markup combinations, as well as the order in which this sort of markup can be expressed and how it all can be screwed up, it needs to be looked at and tested from many different angles prior to releasing it to the world.
Diligence is of the utmost importance, as things could easily go horribly wrong for both publishers and consumers alike.
There are plenty of tiny and large details left to discuss over at schema.org, but for now, it is in the hands of the Googlers involved in this (and their colleagues).
They need time to figure out how they will make this work within their systems and within which constraints at the start, followed by actually making it work and writing the proper documentation for it so that we can learn how to use it the way they would like us to.
This also implies that my markup examples should, by no means, be considered as “the way it will be done” regarding the order in which I have written it, the information it contains, or the number of properties the inheritance will work for.
My examples are based on the discussions we had over at schema.org and are meant to help you make a start with your own investigations.
Leaving You On A Cliffhanger
All of us are waiting for the next episode – produced by Google. I hope we don’t have to wait too long for this one, as I really want to discover how this ends – or rather, starts – for real.
And while we wait, you should start looking into this from as many angles as possible to figure out the potential implications for you, your clients, or your employer before you get caught off guard.
A Final Note
Schema.org is all about creating a workable public vocabulary that offers a “language” we can use for publishing and consuming semantic annotations.
Schema.org does not define how publishers or consumers of its types and properties are supposed to use them – pretty much the same way dictionaries do not tell us how to write.
The people involved in schema.org have absolutely no say in how consumers of the vocabulary use its language, nor any say in the requirements parties like Google, Bing, and others document and publish.