SEO Optimization
Great Docs includes comprehensive search engine optimization (SEO) features to help your documentation rank well in search results. These features are enabled by default and work automatically when you build your site.
What's Included
Great Docs generates and injects SEO-related files and metadata automatically:
- sitemap.xml: helps search engines discover all your pages
- robots.txt: guides crawler behavior and references your sitemap
- Canonical URLs: prevent duplicate content issues
- Meta descriptions: provide search result snippets
- JSON-LD structured data: enables rich search results
- Page title templates: consistent Page Title | Site Name format
Auditing SEO Health
After building your site, run the SEO audit command to check for issues:
Terminal
great-docs seo
This produces a report like this:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
SEO Audit Results
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✅ sitemap.xml: 65 URLs indexed
✅ robots.txt: includes sitemap reference
✅ robots.txt: has user-agent rules
✅ Analyzed 65 HTML pages
✅ All pages have canonical URLs
✅ All pages have meta descriptions
✅ 10 pages have JSON-LD structured data
✅ All images have alt text
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✅ All SEO checks passed!
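To get a feel for what the per-page checks in this report involve, here is a minimal sketch using only the Python standard library. It is not Great Docs' implementation: `PageAudit` and `audit_html` are hypothetical names, and the sketch only looks for a canonical link, a meta description, and image alt text in a single HTML string.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collect the per-page signals that the audit report mentions."""

    def __init__(self):
        super().__init__()
        self.has_canonical = False
        self.has_description = False
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.has_canonical = self.has_canonical or bool(attr.get("href"))
        elif tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.has_description = self.has_description or bool(attr.get("content"))
        elif tag == "img" and not attr.get("alt"):
            # Simplification: an empty alt="" (valid for decorative images)
            # is also counted as missing here.
            self.images_missing_alt += 1


def audit_html(html: str) -> list[str]:
    """Return a list of human-readable issues (empty means the page passed)."""
    parser = PageAudit()
    parser.feed(html)
    issues = []
    if not parser.has_canonical:
        issues.append("missing canonical URL")
    if not parser.has_description:
        issues.append("missing meta description")
    if parser.images_missing_alt:
        issues.append(f"{parser.images_missing_alt} image(s) without alt text")
    return issues
```

Running `audit_html` over each built HTML file and collecting the non-empty results gives a rough equivalent of the page-level lines in the report.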
Fixing Issues
Use --fix to automatically generate missing SEO files:
Terminal
great-docs seo --fix
This will create sitemap.xml and robots.txt if they're missing.
CI Integration
For continuous integration, use --json for machine-readable output:
Terminal
great-docs seo --json
{
  "status": "pass",
  "pages_checked": 65,
  "issues": [],
  "warnings": [],
  "info": ["✅ sitemap.xml: 65 URLs indexed", "..."]
}
The command exits with code 1 if critical issues are found, making it easy to fail CI builds on SEO problems.
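If you want stricter rules than the built-in exit code, a CI step could parse the JSON report itself. The sketch below assumes the report has the fields shown in the sample above (`status`, `issues`, `warnings`). The function name `seo_exit_code` and the `fail_on_warnings` option are hypothetical, and any `status` value other than `"pass"` is treated as a failure because the other possible values are not documented here.

```python
import json


def seo_exit_code(report_json: str, fail_on_warnings: bool = False) -> int:
    """Map a `great-docs seo --json` report to a process exit code."""
    report = json.loads(report_json)
    # Fail on a non-"pass" status or on any listed issue.
    failed = report.get("status") != "pass" or bool(report.get("issues"))
    # Optionally treat warnings as failures too (an assumption, not a Great Docs flag).
    if fail_on_warnings and report.get("warnings"):
        failed = True
    return 1 if failed else 0
```

In a pipeline, the report text would come from the command's standard output, for example captured with `subprocess.run(["great-docs", "seo", "--json"], capture_output=True, text=True)`.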
Configuration
All SEO settings live under the seo key in great-docs.yml. Here's the full configuration with defaults:
great-docs.yml
seo:
  enabled: true              # Master switch for all SEO features
  sitemap:
    enabled: true
    changefreq:
      homepage: weekly
      reference: monthly
      user_guide: monthly
      changelog: weekly
      default: monthly
    priority:
      homepage: 1.0
      reference: 0.8
      user_guide: 0.9
      changelog: 0.6
      default: 0.5
  robots:
    enabled: true
    allow_all: true
    disallow: []
    crawl_delay: null
    extra_rules: []
  canonical:
    enabled: true
    base_url: null           # Auto-detected from GitHub Pages
  title_template: "{page_title} | {site_name}"
  structured_data:
    enabled: true
    type: SoftwareSourceCode
  default_description: null  # Falls back to package description
Most users won't need to change these defaults, as they're optimized for typical Python documentation sites.
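The partial examples later in this guide set only a few keys, which suggests that your settings are overlaid on these defaults key by key. That behavior is an assumption here, not something this page states outright. Under that assumption, the overlay can be sketched as a recursive dictionary merge; `DEFAULTS` below is a trimmed copy of the defaults and `merge_config` is a hypothetical helper, not a Great Docs function.

```python
import copy

# Trimmed copy of the documented defaults, as plain Python data.
DEFAULTS = {
    "sitemap": {
        "priority": {
            "homepage": 1.0,
            "reference": 0.8,
            "user_guide": 0.9,
            "changelog": 0.6,
            "default": 0.5,
        },
    },
    "title_template": "{page_title} | {site_name}",
}


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay user overrides on defaults without mutating either."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


user_settings = {"sitemap": {"priority": {"reference": 0.7}}}
effective = merge_config(DEFAULTS, user_settings)
print(effective["sitemap"]["priority"])
```

With this kind of merge, overriding `reference` leaves `homepage`, `user_guide`, `changelog`, and `default` at their documented values.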
Sitemap Configuration
The sitemap tells search engines about all your pages and how often they change. Great Docs generates different priorities and change frequencies based on page type.
Page Types
Pages are automatically categorized:
| Type | Example Paths | Default Priority | Default Changefreq |
|---|---|---|---|
| homepage | index.html | 1.0 | weekly |
| user_guide | user-guide/*.html, recipes/*.html | 0.9 | monthly |
| reference | reference/*.html | 0.8 | monthly |
| changelog | changelog.html | 0.6 | weekly |
| default | Everything else | 0.5 | monthly |
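As an illustration of how the table maps a page path to sitemap values, the following sketch classifies paths with glob patterns and renders one `<url>` entry. The patterns and numbers are copied from the table. The matching order, the function names `classify` and `sitemap_entry`, and the use of `fnmatchcase` are assumptions and may differ from what Great Docs does internally.

```python
from fnmatch import fnmatchcase
from xml.etree import ElementTree as ET

# (page type, glob patterns), checked in order; values copied from the table.
PAGE_TYPES = [
    ("homepage", ["index.html"]),
    ("user_guide", ["user-guide/*.html", "recipes/*.html"]),
    ("reference", ["reference/*.html"]),
    ("changelog", ["changelog.html"]),
]
PRIORITY = {"homepage": 1.0, "user_guide": 0.9, "reference": 0.8,
            "changelog": 0.6, "default": 0.5}
CHANGEFREQ = {"homepage": "weekly", "user_guide": "monthly", "reference": "monthly",
              "changelog": "weekly", "default": "monthly"}


def classify(path: str) -> str:
    """Return the page type for a site-relative path such as 'reference/api.html'."""
    for page_type, patterns in PAGE_TYPES:
        # Note: in fnmatch, '*' also matches '/', so nested pages match too.
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            return page_type
    return "default"


def sitemap_entry(base_url: str, path: str) -> str:
    """Render a single sitemap <url> element for the given page."""
    page_type = classify(path)
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = base_url + path
    ET.SubElement(url, "changefreq").text = CHANGEFREQ[page_type]
    ET.SubElement(url, "priority").text = str(PRIORITY[page_type])
    return ET.tostring(url, encoding="unicode")


print(sitemap_entry("https://username.github.io/repo/", "reference/index.html"))
```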
Customizing Priorities
Adjust priorities based on what's most important for your site:
great-docs.yml
seo:
  sitemap:
    priority:
      homepage: 1.0
      user_guide: 0.9  # Your tutorials are most valuable
      reference: 0.7   # API docs are secondary
Customizing Change Frequencies
If your reference documentation changes frequently:
great-docs.yml
seo:
  sitemap:
    changefreq:
      reference: weekly  # API changes often
      changelog: daily   # Frequent releases
Robots.txt Configuration
The robots.txt file tells search engine crawlers which paths they may crawl and where to find your sitemap.
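To see how a crawler reads such a file, you can feed rules to Python's standard `urllib.robotparser`. The robots.txt text below is a hand-written example for a hypothetical custom-domain site, combining the kinds of rules covered in this section. It is not captured Great Docs output, and the order in which Great Docs writes its lines may differ.

```python
from urllib.robotparser import RobotFileParser

# Hand-written example rules. Python's parser applies the first matching
# rule within a group, so the Disallow lines are listed before "Allow: /".
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
Disallow: /_internal/
Allow: /
Crawl-delay: 10

User-agent: GPTBot
Disallow: /

Sitemap: https://docs.myproject.com/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)
print(rules.can_fetch("Googlebot", "https://docs.myproject.com/index.html"))       # True
print(rules.can_fetch("Googlebot", "https://docs.myproject.com/drafts/new.html"))  # False
print(rules.can_fetch("GPTBot", "https://docs.myproject.com/index.html"))          # False
print(rules.crawl_delay("Googlebot"))                                              # 10
print(rules.site_maps())  # ['https://docs.myproject.com/sitemap.xml']
```

Rules are matched against the URL path from the host root, so on a GitHub Pages project site the paths crawlers see start with the repository name.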
Default Behavior
By default, Great Docs generates a permissive robots.txt:
robots.txt
# Robots.txt generated by Great Docs
User-agent: *
Allow: /
Sitemap: https://username.github.io/repo/sitemap.xml
Blocking Paths
To keep crawlers out of specific paths (e.g., draft pages):
great-docs.yml
seo:
  robots:
    disallow:
      - /drafts/
      - /_internal/
Blocking AI Crawlers
Some projects prefer to block AI training crawlers:
great-docs.yml
seo:
  robots:
    extra_rules:
      - "User-agent: GPTBot"
      - "Disallow: /"
      - "User-agent: CCBot"
      - "Disallow: /"
Setting Crawl Delay
For sites with limited bandwidth:
great-docs.yml
seo:
  robots:
    crawl_delay: 10  # Seconds between requests
Canonical URLs
Canonical URLs tell search engines which version of a page is the "official" one, preventing duplicate content issues when your site is accessible via multiple URLs.
Auto-Detection
Great Docs automatically generates canonical URLs based on your GitHub repository. If your repo is github.com/username/repo, the canonical base URL will be:
https://username.github.io/repo/
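The mapping from repository to base URL can be sketched as follows. This is an illustrative helper (`github_pages_base_url` is a hypothetical name), not the detection code Great Docs uses. It ignores SSH-style remotes and the special case of user sites (repositories named `username.github.io`), which are served from the domain root.

```python
from urllib.parse import urlparse


def github_pages_base_url(repo_url: str) -> str:
    """Derive the GitHub Pages project-site URL for a GitHub repository URL."""
    # Accept both "github.com/user/repo" and "https://github.com/user/repo.git".
    if "://" not in repo_url:
        repo_url = "https://" + repo_url
    parsed = urlparse(repo_url)
    if parsed.netloc.lower() != "github.com":
        raise ValueError(f"not a GitHub repository URL: {repo_url}")
    parts = [part for part in parsed.path.split("/") if part]
    if len(parts) < 2:
        raise ValueError(f"expected an owner and a repository name: {repo_url}")
    owner, repo = parts[0], parts[1].removesuffix(".git")
    return f"https://{owner}.github.io/{repo}/"


print(github_pages_base_url("github.com/username/repo"))
# https://username.github.io/repo/
```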
Manual Configuration
For custom domains or non-GitHub hosting, set the base URL explicitly:
great-docs.yml
seo:
  canonical:
    base_url: https://docs.myproject.com/
The trailing slash is important (Great Docs will add it if missing).
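One reason the trailing slash matters is how standard URL joining works: without it, the last path segment of the base is replaced rather than extended. The sketch below shows this with Python's `urllib.parse.urljoin`. `canonical_url` is a hypothetical helper that mirrors the normalization described above, and whether Great Docs builds its links this way is an assumption.

```python
from urllib.parse import urljoin


def canonical_url(base_url: str, page_path: str) -> str:
    """Join a site base URL and a site-relative page path."""
    if not base_url.endswith("/"):
        base_url += "/"  # add the missing trailing slash before joining
    return urljoin(base_url, page_path.lstrip("/"))


# Without the trailing slash, urljoin drops the "repo" segment:
print(urljoin("https://username.github.io/repo", "user-guide/install.html"))
# https://username.github.io/user-guide/install.html

# With the slash normalized, the page lands under the project path:
print(canonical_url("https://username.github.io/repo", "user-guide/install.html"))
# https://username.github.io/repo/user-guide/install.html
```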
Page Titles
Great Docs applies a consistent title template to all pages, improving brand recognition in search results.
Default Template
The default template is {page_title} | {site_name}, which produces titles like:
Installation | My Package
GreatDocs.build | My Package
Configuration | My Package
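The placeholder syntax matches Python's `str.format` fields, so the substitution can be sketched in one line. Whether Great Docs uses `str.format` internally is an assumption, and `render_title` is a hypothetical helper.

```python
def render_title(template: str, page_title: str, site_name: str) -> str:
    """Fill the {page_title} and {site_name} placeholders in a title template."""
    # Unused keyword arguments are ignored, so a template may omit {site_name}.
    return template.format(page_title=page_title, site_name=site_name)


print(render_title("{page_title} | {site_name}", "Installation", "My Package"))
# Installation | My Package
print(render_title("{page_title}", "Installation", "My Package"))
# Installation
```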
Custom Templates
Change the separator or format:
great-docs.yml
seo:
  title_template: "{page_title} - {site_name}"
Or remove the site name entirely:
great-docs.yml
seo:
  title_template: "{page_title}"
Structured Data (JSON-LD)
Great Docs injects JSON-LD structured data into your pages, enabling rich search results with additional context about your software.
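In practice, JSON-LD is a JSON object placed inside a `<script type="application/ld+json">` tag in the page. A minimal sketch of building such a tag is shown here. The field names follow the payload documented in this section, while `json_ld_script` and its parameters are hypothetical and not part of Great Docs.

```python
import json


def json_ld_script(name: str, description: str, repo_url: str,
                   schema_type: str = "SoftwareSourceCode",
                   language: str = "Python") -> str:
    """Build a JSON-LD <script> tag describing a software project."""
    data = {
        "@context": "https://schema.org",
        "@type": schema_type,
        "name": name,
        "description": description,
        "codeRepository": repo_url,
        "programmingLanguage": {"@type": "ComputerLanguage", "name": language},
    }
    # Escape "</" so text such as "</script>" inside a description cannot
    # close the tag early; "\/" is a valid JSON escape for "/".
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


print(json_ld_script("My Package", "A Python package for...",
                     "https://github.com/username/repo"))
```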
What Gets Injected
On the homepage and reference pages, Great Docs adds:
{
  "@context": "https://schema.org",
  "@type": "SoftwareSourceCode",
  "name": "My Package",
  "description": "A Python package for...",
  "codeRepository": "https://github.com/username/repo",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "Python"
  }
}
Customizing the Schema Type
For different types of documentation:
great-docs.yml
seo:
  structured_data:
    type: WebSite  # Or: SoftwareApplication, APIReference, etc.
Disabling Structured Data
If you prefer not to include JSON-LD:
great-docs.yml
seo:
  structured_data:
    enabled: false
Meta Descriptions
Meta descriptions appear in search result snippets. Great Docs automatically generates them from your page content.
Auto-Generation
For each page, Great Docs chooses a description as follows:
- it uses the first meaningful paragraph in the page content
- it falls back to the default description if no suitable content is found
Setting a Default Description
Configure a fallback description for pages without extractable content:
great-docs.yml
seo:
  default_description: "Documentation for My Package, a Python library for..."
If not set, the package description from pyproject.toml is used.
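The fallback chain can be sketched as a simple "first non-empty candidate wins" function. Several details are assumptions made for illustration: that a page-level description from the frontmatter takes precedence, that "meaningful" means a paragraph of at least `MIN_LENGTH` characters, and the name `choose_description` itself. Great Docs may apply different rules.

```python
from typing import Optional, Sequence

MIN_LENGTH = 40  # assumed threshold for a "meaningful" paragraph


def choose_description(paragraphs: Sequence[str],
                       page_description: Optional[str] = None,
                       default_description: Optional[str] = None,
                       package_description: Optional[str] = None) -> Optional[str]:
    """Pick a meta description using the documented fallback order."""
    # First paragraph that is long enough to serve as a snippet, if any.
    first_meaningful = next(
        (p.strip() for p in paragraphs if len(p.strip()) >= MIN_LENGTH), None
    )
    candidates = [
        page_description,     # frontmatter `description:` (assumed to win)
        first_meaningful,     # extracted from the page content
        default_description,  # seo.default_description
        package_description,  # description from pyproject.toml
    ]
    for candidate in candidates:
        if candidate:
            return candidate
    return None


print(choose_description(["Note.", "Install My Package with pip, conda, or from source."]))
# Install My Package with pip, conda, or from source.
```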
Page-Level Descriptions
For individual pages, add a description field in the YAML frontmatter:
user_guide/01-installation.qmd
---
title: "Installation"
description: "How to install My Package using pip, conda, or from source."
---
Noindex for Internal Pages
Some pages shouldn't appear in search results. Great Docs automatically adds noindex directives to internal pages like the Skills page.
Manual Noindex
To prevent a specific page from being indexed, add to its frontmatter:
drafts/experimental-feature.qmd
---
title: "Experimental Feature"
robots: "noindex, nofollow"
---
Disabling SEO Features
To completely disable SEO generation:
great-docs.yml
seo:
  enabled: false
Or disable specific features:
great-docs.yml
seo:
  sitemap:
    enabled: false
  robots:
    enabled: false
  canonical:
    enabled: false
  structured_data:
    enabled: false
Best Practices
For Maximum SEO Effectiveness
- set a base URL, either through a GitHub repo or explicit configuration
- write good descriptions: add a description: field to key pages' frontmatter
- use descriptive titles: clear, concise page titles improve click-through rates
- add alt text to images: Great Docs audits this and will fix any missing alt text
- run great-docs seo to audit before deployment and catch issues early
For GitHub Pages
If you're deploying to GitHub Pages, SEO works out of the box:
- your canonical base URL is auto-detected from the repository
- the sitemap is automatically referenced in robots.txt
- all pages get proper canonical links
For Custom Domains
When using a custom domain, this configuration is useful:
great-docs.yml
seo:
  canonical:
    base_url: https://docs.myproject.com/
And ensure your DNS is configured correctly (see Adding a Custom Domain).