Shopify URL Structure: /products/, /collections/, and the ?variant= Problem
Shopify organizes storefronts around two primary URL types:
/products/product-handle — the canonical page for a single product
/collections/collection-handle — the canonical page for a product collection
Both are clean, descriptive URLs that Google can crawl and index without issues. The complications start when Shopify generates derivative URLs from these bases.
Variant URLs are the main culprit. When a product has multiple options (size, color, material), Shopify appends a ?variant= parameter to the URL each time a customer selects a combination. A single product with four sizes and three colors produces 12 distinct ?variant= URLs. For a store with 500 products and an average of eight variants each, that's 4,000 additional crawlable URLs pointing at content that is, for indexing purposes, identical to the canonical product page.
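The arithmetic above can be sketched in a few lines of Python. The function names are illustrative, and the catalog figure is only an estimate based on an average variant count.

```python
from math import prod

def variant_url_count(option_sizes):
    """One ?variant= URL per option combination for a single product."""
    return prod(option_sizes)

def catalog_variant_urls(product_count, avg_variants):
    """Rough catalog-wide estimate from an average variant count."""
    return product_count * avg_variants

print(variant_url_count([4, 3]))     # four sizes x three colors -> 12
print(catalog_variant_urls(500, 8))  # 500 products, eight variants each -> 4000
```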
Shopify handles this with a canonical tag: every ?variant= URL includes a <link rel="canonical"> pointing back to the base /products/ URL. Google is supposed to respect this and consolidate all variant signals onto the canonical. In practice, Googlebot still crawls variant URLs — consuming crawl budget — before following the canonical. The fix isn't removing the canonicals; it's preventing Googlebot from spending crawl budget on the variants in the first place.
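To confirm what a variant page declares as its canonical, you can read the tag out of the page's HTML. The following is a minimal sketch using Python's standard library; the sample markup and the example.com URL are made up for illustration, and fetching the real page is left out.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")

def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

# Hypothetical markup, as it might appear on a ?variant= page.
sample = (
    '<html><head>'
    '<link rel="canonical" href="https://example.com/products/linen-shirt">'
    '</head><body></body></html>'
)
print(find_canonical(sample))
```

If the value returned for a ?variant= URL is anything other than the base /products/ URL, the theme or an app has probably altered the default canonical.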
The same logic applies to tag-filtered collection views. /collections/t-shirts/blue is a separate crawlable URL from /collections/t-shirts, and for most stores, it contains a subset of the same products with no additional content. Google sees thin, near-duplicate collection pages and often chooses not to index them.
For deeper context on how URL structure interacts with indexation, see Google Indexing: What It Is and Why It Fails.
Variant Duplicates and Canonical URLs
Shopify's automatic canonical implementation is correct in principle but has three practical limits:
1. Canonicals are hints, not directives. Google may choose to index a ?variant= URL if it perceives it as distinct enough, particularly if users are landing on it via links or if internal linking points to variant-specific URLs. Check the Page indexing report in Google Search Console (formerly the Coverage report) for ?variant= URLs showing as indexed. If they appear, the canonical signal isn't winning.
2. Crawl budget gets consumed regardless. Even when Google respects the canonical and indexes only the parent URL, the crawler still fetches every ?variant= URL it encounters. On a 2,000-SKU store, this means a significant portion of your daily crawl budget goes to URLs that contribute nothing to the index.
3. Parameter handling in GSC. Google retired Search Console's URL Parameters tool in 2022, so there is no longer a GSC setting that tells Googlebot how to treat ?variant=. If an older property was configured with parameter rules, don't assume they still apply: Shopify's canonical tags and your internal links are the signals Google reads now.
The practical fix: when submitting URLs for active indexing, submit only the canonical /products/ URLs. Don't include ?variant= strings in your submission list. FastIndexing lets you paste a URL list or upload a CSV — clean your list to canonical product and collection URLs before uploading.
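Cleaning the list is easy to script. This sketch assumes a plain list of URLs and simply drops the whole query string, which removes ?variant= along with any other parameters. The example.com URLs are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_url(url):
    """Drop the query string and fragment, leaving the base URL."""
    parts = urlsplit(url.strip())
    return urlunsplit((parts.scheme, parts.netloc, parts.path.rstrip("/"), "", ""))

def clean_submission_list(urls):
    """Unique canonical /products/ and /collections/ URLs, in first-seen order."""
    seen, cleaned = set(), []
    for url in urls:
        base = canonical_url(url)
        path = urlsplit(base).path
        if not path.startswith(("/products/", "/collections/")):
            continue  # skip /cart, /pages, and other non-catalog paths
        if base not in seen:
            seen.add(base)
            cleaned.append(base)
    return cleaned

raw = [
    "https://example.com/products/linen-shirt?variant=111",
    "https://example.com/products/linen-shirt?variant=222",
    "https://example.com/collections/t-shirts",
    "https://example.com/cart",
]
print(clean_submission_list(raw))
```

Note that this does not filter tag-filtered views such as /collections/t-shirts/blue; those would need a separate rule if you want them excluded.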
For e-commerce indexing at scale, including non-Shopify platforms, the E-commerce Indexing guide covers the broader indexing model.
Thin Collection Pages
Collections are the second major indexing problem in Shopify stores. A collection becomes thin when it contains too few products to constitute a meaningfully distinct page — or when it closely mirrors another collection with only minor product set differences.
Common thin-collection scenarios:
- Automated collections with 1–3 products. A collection created by a tag or vendor condition that only matches a handful of SKUs. The resulting page has little content beyond a product grid, no original copy, and no reason for Google to index it separately.
- Tag-filtered subcollections. /collections/shoes/running may be a legitimate page for a large shoe retailer. For a smaller store where "running" matches four products from the same collection, it's thin.
- Seasonal or promotional collections created for a campaign and left live after the promotion ends. Empty or near-empty collections are actively harmful: Googlebot crawls them, finds minimal content, and leaves them in the "Crawled - currently not indexed" status. Repeated at scale, that pattern suggests to Google that your site contains a lot of low-value content.
Recommended approach for thin collections:
- Audit your /collections/ pages. Any collection with fewer than 8–10 products and no unique editorial content is a thin-collection candidate.
- Add a noindex meta tag to thin collections, or consolidate them into broader categories.
- If a collection genuinely warrants its own page, add unique introductory copy — even two or three sentences specific to that collection — before submitting it for indexing.
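The audit rule above can be expressed as a simple filter. This is a sketch only: the Collection record and its fields are hypothetical, and you would fill them from your own export or a manual review of each collection's copy.

```python
from dataclasses import dataclass

@dataclass
class Collection:
    handle: str            # e.g. "t-shirts" for /collections/t-shirts
    product_count: int
    has_unique_copy: bool  # True if the page has its own introductory text

def thin_candidates(collections, min_products=8):
    """Handles of collections below the product threshold with no unique copy."""
    return [
        c.handle
        for c in collections
        if c.product_count < min_products and not c.has_unique_copy
    ]

# Made-up sample data.
store = [
    Collection("t-shirts", 42, False),
    Collection("vendor-acme", 2, False),
    Collection("gift-sets", 5, True),
]
print(thin_candidates(store))
```

The default threshold of 8 is the low end of the 8–10 range suggested above; raise it if you want a stricter audit.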
For a broader view of how crawl waste affects indexation, see Bulk Indexing: Submit and Check Hundreds of URLs.
Shopify Sitemap and robots.txt
Shopify automatically generates a sitemap at /sitemap.xml. It's a sitemap index file that references child sitemaps:
/sitemap_products_1.xml — all published product pages
/sitemap_collections_1.xml — all collection pages
/sitemap_pages_1.xml — standard CMS pages
/sitemap_blogs_1.xml — blog posts (if applicable)
Each child sitemap lists canonical URLs only — not ?variant= strings. This is Shopify doing the right thing automatically.
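If you want to pull URLs out of the sitemap yourself, the standard sitemap XML format is easy to parse. This sketch reads the loc entries from a sitemap index or a child sitemap; the sample XML is a trimmed, hypothetical example, and downloading the file is left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def loc_values(xml_text):
    """Return every <loc> value from a sitemap index or a URL sitemap."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iterfind(".//sm:loc", NS) if el.text]

SAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap_products_1.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap_collections_1.xml</loc></sitemap>
</sitemapindex>"""

for child in loc_values(SAMPLE_INDEX):
    print(child)
```

Running the same function on a child sitemap returns the page URLs it lists, which gives you a ready-made canonical URL list.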
Submitting the sitemap to Google Search Console:
- Go to GSC → Sitemaps.
- Enter sitemap.xml (the index file, not individual child sitemaps).
- Click Submit.
GSC will crawl the index file and queue all referenced URLs. This is a passive signal — Google works through the sitemap on its own schedule, which for large stores can mean weeks before all product pages are evaluated.
Important note on sitemap pinging: Google retired its sitemap ping endpoint in late 2023. You can no longer ping https://www.google.com/ping?sitemap= and expect a response. Bing's equivalent, https://www.bing.com/ping?sitemap=, has also been deprecated (Bing announced the end of anonymous sitemap submission in 2022), so don't rely on it either. Submit your sitemap through GSC for Google and through Bing Webmaster Tools for Bing.
Shopify's robots.txt is not editable through the standard theme editor. Since 2021 you can customize it with a robots.txt.liquid theme template, but changes require development access. The default Shopify robots.txt disallows /admin, /cart, /orders, and similar non-content paths, which is standard behavior that doesn't cause indexing problems for most stores. Where it can create issues: if a previous developer or app added custom disallow rules that block product or collection URLs.
Always verify your robots.txt at yourdomain.com/robots.txt before submitting URLs for indexing. A disallowed URL can't be crawled, so its content won't be indexed regardless of how many channels you notify.
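A quick pre-submission check can be scripted with Python's built-in robots.txt parser. In this sketch the robots.txt text and URLs are invented samples; in practice you would paste in the contents of your own yourdomain.com/robots.txt.

```python
from urllib.robotparser import RobotFileParser

def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]

# Hypothetical robots.txt with a leftover custom rule blocking a collection.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /collections/sale
"""

candidates = [
    "https://example.com/products/linen-shirt",
    "https://example.com/collections/sale",
]
print(blocked_urls(SAMPLE_ROBOTS, candidates))
```

Note that the standard-library parser uses prefix matching, so rules that rely on * or $ wildcards may not be evaluated the way Googlebot evaluates them. Treat the result as a first pass, not a final verdict.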
Bulk Indexing at High SKU Counts
The economics of indexing change significantly once a store passes a few hundred SKUs. At 500 products, the gap between "published" and "indexed" is annoying. At 5,000 products, it's a business problem — especially in competitive categories where being indexed a month after a competitor means lost traffic and lost revenue.
Why Google doesn't automatically index every Shopify product:
Google allocates a crawl budget to each domain based on its authority and performance signals. On a new or low-authority Shopify store, that budget is limited. A store with 4,000 products and 30,000 variant URLs is asking Google to fetch tens of thousands of URLs from a domain it has little reason to prioritize yet. Google won't do that. It crawls a sample, evaluates quality, and allocates more budget over time if the signals are good.
The result: many products in a large catalog may go weeks or months without being crawled, let alone indexed.
The bulk indexing approach for Shopify:
- Build a clean URL list. Export your canonical product URLs from Shopify's admin (Products → Export) or from your /sitemap_products_1.xml file. Strip any ?variant= strings.
- Prioritize high-value pages. Not all products are equal. Prioritize by margin, search volume, or competitive importance. Submit high-priority pages first.
- Submit in segments. For stores with 1,000+ products, submit in batches of 200–500 URLs. Run an index check at the 7-day mark using the index checker. Identify which products indexed and which didn't. For non-indexed pages, investigate content quality before resubmitting.
- Address thin or duplicate content before resubmission. If a product page failed to index after two submission cycles, the issue is rarely the submission — it's the page. Very short descriptions, no original content, or excessive similarity to another indexed product page are the usual culprits.
- Monitor ongoing publication. For stores that regularly add new products, set up a repeatable submission workflow: new SKUs go into the next submission batch within 48 hours of publication rather than waiting for Googlebot to discover them.
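The prioritize-then-segment steps above can be sketched as a small helper. The priority score is an assumption: it stands in for whatever you rank by (margin, search volume, competitive importance), and the URLs are placeholders.

```python
def prioritized_batches(pages, batch_size=200):
    """Split (url, priority_score) pairs into batches, highest score first."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    ordered = [url for url, _ in sorted(pages, key=lambda p: p[1], reverse=True)]
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]

# Made-up catalog of five products with illustrative scores.
catalog = [
    ("https://example.com/products/a", 10),
    ("https://example.com/products/b", 90),
    ("https://example.com/products/c", 50),
    ("https://example.com/products/d", 70),
    ("https://example.com/products/e", 30),
]
for number, batch in enumerate(prioritized_batches(catalog, batch_size=2), start=1):
    print(number, batch)
```

With real catalogs, a batch_size between 200 and 500 matches the segment sizes suggested above.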
At €0.13/URL (down to €0.11 with volume), submitting a 1,000-product catalog costs around €110–130. For most stores, the revenue from even a handful of newly indexed product pages returning organic traffic makes the arithmetic simple.
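To run the same cost estimate for your own catalog size, a sketch like the following works. The two per-URL prices come from the figures above; the point at which the volume price applies is not stated here, so the code only reports the range.

```python
from decimal import Decimal

def submission_cost(url_count, price_per_url):
    """Total cost in euros for submitting url_count URLs at a flat per-URL price."""
    return url_count * Decimal(price_per_url)

low = submission_cost(1000, "0.11")   # volume price
high = submission_cost(1000, "0.13")  # standard price
print(f"EUR {low} to EUR {high}")
```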
From the Field
Dmytro Puhach, Founder · 15+ years in SEO
The Shopify stores that struggle most with indexing aren't doing anything wrong on the surface. They've built clean collections, written product descriptions, set up the sitemap. But they haven't looked at their crawl budget from Google's perspective.
I audited a store last year: around 3,000 SKUs, 12 collections, and a couple of apps that had added filter parameters to every collection URL. Googlebot was spending most of its crawl budget on paginated, filtered collection views that had zero distinct content. On top of that, a previous developer had set noindex tags on roughly a third of the actual product pages. The products Google could have indexed, it never reached.
The fix was straightforward: clean up the noindex tags, consolidate the filtered views, export canonical product URLs from the sitemap, and submit in batches. Within three weeks, indexed product count doubled. No new content, no link building — just making sure Googlebot's time on that domain was spent on the right URLs.
That's the thing about Shopify indexing. The platform handles a lot correctly by default. The problems are almost always about what's getting in the way, not what's missing.
FAQ
Why aren't my Shopify products showing up in Google?
The most common reasons: a noindex meta tag on product or collection templates (sometimes added by SEO apps or a previous developer), crawl budget consumed by ?variant= parameter URLs before Google reaches canonical product pages, thin or near-duplicate content that Google crawls but chooses not to index, or simply slow discovery on a newer store with limited inbound links. Check the Page indexing report in Google Search Console (formerly Coverage) first. It shows the status of URLs Google has encountered, including the reason each one wasn't indexed. If products aren't appearing in the report at all, the discovery bottleneck (not enough crawl budget reaching them) is the likely issue.
Does Shopify create duplicate pages from product variants?
Yes, with a caveat. Shopify generates ?variant= query strings for each product option combination, which creates technically distinct URLs for every variant. Shopify automatically adds a canonical tag to each variant URL pointing back to the base product URL (/products/product-handle). When Google respects the canonical, only the base product URL gets indexed. The problem is that Googlebot still has to crawl a variant URL, consuming crawl budget, before it can read the canonical tag; a canonical is a hint in the page's HTML, not a redirect. On stores with high variant counts, this can significantly slow indexation of new products. The fix is to exclude ?variant= URLs from any active indexing submission and rely on Shopify's canonical tags to consolidate the signals.
How do I submit the Shopify sitemap to Google?
Go to Google Search Console, select your property, and navigate to Sitemaps. Enter sitemap.xml (Shopify's sitemap index file is at yourdomain.com/sitemap.xml). Submit it and GSC will crawl the index and queue all child sitemaps: products, collections, pages, blogs. Submitting the sitemap is a passive signal: Google works through it on its own schedule. For stores with large catalogs or time-sensitive product launches, submitting canonical product URLs through an active indexing service in parallel gives faster results. Also note: Google's sitemap ping endpoint was retired in late 2023, and Bing deprecated its anonymous sitemap ping in 2022, so submit through Search Console and Bing Webmaster Tools instead.
Can I index thousands of Shopify products at once?
Yes — with the right process. Export your canonical /products/ URLs from Shopify's sitemap (sitemap_products_1.xml) or the admin export tool, excluding any ?variant= strings. Upload the list to FastIndexing.io, which submits each URL across eight indexing channels simultaneously. For very large catalogs (1,000+ products), submit in batches of 200–500 and run an index check at the 7-day mark to see what landed before committing the next batch. Not every product will index right away — Google's final decision depends on content quality, domain authority, and crawl budget — but active submission compresses the discovery window from weeks to days for the pages that pass Google's quality evaluation.
What's the difference between Shopify crawling and Shopify indexing?
Crawling means Googlebot visited your product or collection page. Indexing means Google stored it and it can appear in search results. Shopify stores often have strong crawling signals — a clean sitemap, canonical URLs, good internal linking via collections — but still struggle with indexation because Google crawls a URL and decides not to store it (thin content, near-duplicate, or simply lower priority given limited crawl budget). The index checker shows current index status without requiring GSC access, so you can quickly identify which products are indexed and which are crawled but not stored.
Does the Shopify sitemap cover collections and blog posts too?
Yes. Shopify's sitemap at /sitemap.xml is an index file pointing to four child sitemaps: products, collections, pages, and blogs. All four are auto-generated and updated when you add or remove content. Submitting the root sitemap to GSC covers all four automatically. One watch point: filtered collection views (e.g., /collections/t-shirts/blue) are not included in Shopify's generated sitemap — Shopify correctly excludes them. If you've created custom collection filter pages and want them indexed, you'd need to add them manually or through a third-party sitemap app.