SEO Optimization
Documentation is only useful if people can find it. When someone searches for how to install your package or call a specific function, your docs should appear near the top of the results. Good SEO ensures that search engines can crawl, index, and accurately represent your pages.
Great Docs includes comprehensive SEO features that are enabled by default and work automatically when you build your site. Sitemaps, canonical URLs, meta descriptions, structured data, and robots directives are all generated without any configuration.
What's Included
Great Docs generates and injects SEO-related files and metadata automatically:
- sitemap.xml: helps search engines discover all your pages
- robots.txt: guides crawler behavior and references your sitemap
- Canonical URLs: prevent duplicate content issues
- Meta descriptions: provide search result snippets
- JSON-LD structured data: enables rich search results
- Page title templates: apply a consistent Page Title | Site Name format
All of these are generated at build time. You can customize any of them or disable features you don't need.
Auditing SEO Health
After building your site, you can audit the generated output to verify that all SEO features are in place. Run the SEO audit command to check for issues:
Terminal
great-docs seo
This produces a report like this:
────────────────────────────────────────────────────────────
SEO Audit Results
────────────────────────────────────────────────────────────
✅ sitemap.xml: 65 URLs indexed
✅ robots.txt: includes sitemap reference
✅ robots.txt: has user-agent rules
✅ Analyzed 65 HTML pages
✅ All pages have canonical URLs
✅ All pages have meta descriptions
✅ 10 pages have JSON-LD structured data
✅ All images have alt text
────────────────────────────────────────────────────────────
✅ All SEO checks passed!
The audit checks every HTML page in the built site and reports missing or malformed SEO elements.
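To make the per-page checks concrete, here is a minimal sketch of the kind of inspection such an audit might perform, using only Python's standard library. It is not Great Docs' implementation; the names `SeoTagCollector` and `audit_page` are hypothetical, and the three checks simply mirror items from the report above.

```python
from html.parser import HTMLParser


class SeoTagCollector(HTMLParser):
    """Hypothetical helper: records SEO-related tags seen in one HTML page."""

    def __init__(self):
        super().__init__()
        self.has_canonical = False
        self.has_description = False
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)  # attribute values may be None for bare attributes
        if tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.has_canonical = self.has_canonical or bool(attr.get("href"))
        elif tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.has_description = self.has_description or bool(attr.get("content"))
        elif tag == "img" and not attr.get("alt"):
            # Note: this also counts alt="" (decorative images) as missing.
            self.images_missing_alt += 1


def audit_page(html):
    """Return a list of human-readable issues for a single page."""
    collector = SeoTagCollector()
    collector.feed(html)
    issues = []
    if not collector.has_canonical:
        issues.append("missing canonical URL")
    if not collector.has_description:
        issues.append("missing meta description")
    if collector.images_missing_alt:
        issues.append(f"{collector.images_missing_alt} image(s) missing alt text")
    return issues
```

Calling `audit_page` on each built HTML file and collecting the non-empty results would give a rough approximation of the report, although the real command may check more than this sketch does.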
Fixing Issues
Use --fix to automatically generate missing SEO files:
Terminal
great-docs seo --fix
This creates sitemap.xml and robots.txt if they're missing and patches any fixable issues in the built output.
CI Integration
For continuous integration, use --json for machine-readable output:
Terminal
great-docs seo --json
{
  "status": "pass",
  "pages_checked": 65,
  "issues": [],
  "warnings": [],
  "info": ["✅ sitemap.xml: 65 URLs indexed", "..."]
}
The command exits with code 1 if critical issues are found, making it easy to fail CI builds on SEO problems. Pair this with great-docs lint and great-docs links for a comprehensive pre-deployment check.
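If a CI job needs more than the exit code, the JSON report can be post-processed. The sketch below assumes the report has the shape shown above; the function name `summarize_report` is hypothetical, and treating any status other than "pass" as a failure is an assumption, as is the idea that entries in `issues` and `warnings` are plain strings.

```python
import json


def summarize_report(raw_json):
    """Parse the audit JSON and return (exit_code, lines_to_print)."""
    report = json.loads(raw_json)
    status = report.get("status", "unknown")
    pages = report.get("pages_checked", 0)
    lines = [f"SEO audit: {status} ({pages} pages checked)"]
    # Assumption: issues and warnings are lists of strings.
    lines += [f"issue: {item}" for item in report.get("issues", [])]
    lines += [f"warning: {item}" for item in report.get("warnings", [])]
    # Assumption: anything other than "pass", or any listed issue, is a failure.
    failed = status != "pass" or bool(report.get("issues"))
    return (1 if failed else 0), lines
```

A CI step could capture the output of `great-docs seo --json`, pass it to this function, print the returned lines, and exit with the returned code.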
Configuration
All SEO settings live under the seo key in great-docs.yml. Here's the full configuration with defaults:
great-docs.yml
seo:
  enabled: true # Master switch for all SEO features
  sitemap:
    enabled: true
    changefreq:
      homepage: weekly
      reference: monthly
      user_guide: monthly
      changelog: weekly
      default: monthly
    priority:
      homepage: 1.0
      reference: 0.8
      user_guide: 0.9
      changelog: 0.6
      default: 0.5
  robots:
    enabled: true
    allow_all: true
    disallow: []
    crawl_delay: null
    extra_rules: []
  canonical:
    enabled: true
    base_url: null # Auto-detected from GitHub Pages
  title_template: "{page_title} | {site_name}"
  structured_data:
    enabled: true
    type: SoftwareSourceCode
  default_description: null # Falls back to package description
Most users won't need to change these defaults. They are optimized for typical Python documentation sites, and every feature can be toggled independently.
Sitemap Configuration
The sitemap tells search engines about all your pages and how often they change. Great Docs generates different priorities and change frequencies based on page type, so your most important content gets crawled first.
Page Types
Pages are automatically categorized based on their path in the built site:
| Type | Example Paths | Default Priority | Default Changefreq |
|---|---|---|---|
| homepage | index.html | 1.0 | weekly |
| user_guide | user-guide/*.html, recipes/*.html | 0.9 | monthly |
| reference | reference/*.html | 0.8 | monthly |
| changelog | changelog.html | 0.6 | weekly |
| default | Everything else | 0.5 | monthly |
These defaults work well for most projects. The subsections below show how to override them if your site has different needs.
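The path-based categorization can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's actual logic: `classify_page` and `PAGE_TYPES` are hypothetical names, the patterns and defaults are copied from the table above, and glob-style matching via `fnmatchcase` (where `*` also matches `/`) is an assumption about how paths are compared.

```python
from fnmatch import fnmatchcase

# Defaults copied from the page-type table: (patterns, priority, changefreq).
PAGE_TYPES = {
    "homepage": (["index.html"], 1.0, "weekly"),
    "user_guide": (["user-guide/*.html", "recipes/*.html"], 0.9, "monthly"),
    "reference": (["reference/*.html"], 0.8, "monthly"),
    "changelog": (["changelog.html"], 0.6, "weekly"),
}
DEFAULT = ("default", 0.5, "monthly")


def classify_page(path):
    """Map a path relative to the site root to (type, priority, changefreq)."""
    for page_type, (patterns, priority, changefreq) in PAGE_TYPES.items():
        # The whole path must match, so "reference/index.html" is not a homepage.
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            return page_type, priority, changefreq
    return DEFAULT
```

For example, `classify_page("recipes/caching.html")` falls into the user_guide type, while a path such as `about.html` matches no pattern and receives the default values.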
Customizing Priorities
Adjust priorities based on what's most important for your site:
great-docs.yml
seo:
  sitemap:
    priority:
      homepage: 1.0
      user_guide: 0.9 # Your tutorials are most valuable
      reference: 0.7 # API docs are secondary
Customizing Change Frequencies
If your reference documentation changes frequently:
great-docs.yml
seo:
  sitemap:
    changefreq:
      reference: weekly # API changes often
      changelog: daily # Frequent releases
Robots.txt Configuration
The robots.txt file tells search engine crawlers which paths they may crawl and where to find your sitemap. Great Docs generates one automatically, but you can customize it for more control over crawler behavior.
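One way to see how a crawler would interpret a given robots.txt is Python's standard `urllib.robotparser` module. The sketch below parses a sample file and asks which URLs may be fetched. The sample rules, the `docs.myproject.com` domain, and the helper name `load_rules` are illustrative assumptions rather than output copied from a real build.

```python
from urllib.robotparser import RobotFileParser

# Illustrative sample: a permissive default group, one blocked path,
# and one crawler that is blocked entirely.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /drafts/
Allow: /

Sitemap: https://docs.myproject.com/sitemap.xml
"""


def load_rules(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(SAMPLE_ROBOTS)
# Ordinary crawlers may fetch regular pages but not the blocked path.
print(rules.can_fetch("*", "https://docs.myproject.com/index.html"))
print(rules.can_fetch("*", "https://docs.myproject.com/drafts/wip.html"))
# The named crawler is blocked everywhere.
print(rules.can_fetch("GPTBot", "https://docs.myproject.com/index.html"))
# Sitemap references are exposed as a list (Python 3.8+).
print(rules.site_maps())
```

Running a check like this against the robots.txt in your built site is a quick way to confirm that custom rules behave as intended before deploying.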
Default Behavior
By default, Great Docs generates a permissive robots.txt:
robots.txt
# Robots.txt generated by Great Docs
User-agent: *
Allow: /
Sitemap: https://username.github.io/repo/sitemap.xml
Blocking Paths
To keep crawlers out of specific paths (for example, draft pages):
great-docs.yml
seo:
  robots:
    disallow:
      - /drafts/
      - /_internal/
Blocking AI Crawlers
Some projects prefer to block AI training crawlers:
great-docs.yml
seo:
  robots:
    extra_rules:
      - "User-agent: GPTBot"
      - "Disallow: /"
      - "User-agent: CCBot"
      - "Disallow: /"
Setting Crawl Delay
For sites with limited bandwidth:
great-docs.yml
seo:
  robots:
    crawl_delay: 10 # Seconds between requests
Canonical URLs
Canonical URLs tell search engines which version of a page is the "official" one. This prevents duplicate content issues when your site is accessible via multiple URLs (for example, with and without a trailing slash, or through both a custom domain and github.io).
Auto-Detection
Great Docs automatically generates canonical URLs based on your GitHub repository. If your repo is github.com/username/repo, the canonical base URL will be:
https://username.github.io/repo/
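The mapping from repository to base URL can be expressed as a small function. This is a sketch of the rule stated above, not the tool's detection code: `github_pages_base_url` is a hypothetical name, and special cases such as user sites (repositories named `username.github.io`) are deliberately not handled.

```python
from urllib.parse import urlparse


def github_pages_base_url(repo_url):
    """Derive the GitHub Pages project URL from a GitHub repository URL."""
    # Accept input with or without a scheme, e.g. "github.com/username/repo".
    if "://" not in repo_url:
        repo_url = "https://" + repo_url
    parsed = urlparse(repo_url)
    if parsed.netloc.lower() != "github.com":
        raise ValueError(f"not a GitHub repository URL: {repo_url}")
    parts = [part for part in parsed.path.split("/") if part]
    if len(parts) < 2:
        raise ValueError(f"expected owner and repository in: {repo_url}")
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return f"https://{owner}.github.io/{repo}/"
```

With the example from this section, `github_pages_base_url("github.com/username/repo")` yields `https://username.github.io/repo/`.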
Manual Configuration
For custom domains or non-GitHub hosting, set the base URL explicitly:
great-docs.yml
seo:
  canonical:
    base_url: https://docs.myproject.com/
The trailing slash is important (Great Docs will add it if missing). Every page in the built site receives a <link rel="canonical"> tag pointing to its full URL.
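To illustrate why the trailing slash matters, the following sketch builds a canonical tag from a base URL and a page path. The function name `canonical_link` is hypothetical and the exact markup Great Docs emits may differ. The point is that standard URL joining drops the last path segment of a base that lacks a trailing slash.

```python
from html import escape
from urllib.parse import urljoin


def canonical_link(base_url, page_path):
    """Build a canonical <link> tag for a page path relative to the site root."""
    # Without the trailing slash, urljoin("https://username.github.io/repo",
    # "index.html") would resolve to "https://username.github.io/index.html".
    if not base_url.endswith("/"):
        base_url += "/"
    href = urljoin(base_url, page_path.lstrip("/"))
    return f'<link rel="canonical" href="{escape(href, quote=True)}">'
```

For example, `canonical_link("https://docs.myproject.com", "user-guide/installation.html")` produces a tag whose `href` is `https://docs.myproject.com/user-guide/installation.html`.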
Page Titles
Great Docs applies a consistent title template to all pages, improving brand recognition in search results.
Default Template
The default template is {page_title} | {site_name}, which produces titles like:
- Installation | My Package
- GreatDocs.build | My Package
- Configuration | My Package
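The placeholders in the template have the same shape as Python format fields, so the substitution can be pictured as follows. That Great Docs uses `str.format` internally is an assumption, and `render_title` is a hypothetical name used only for illustration.

```python
def render_title(template, page_title, site_name):
    """Fill a title template; placeholders the template omits are ignored."""
    return template.format(page_title=page_title, site_name=site_name)


default_title = render_title("{page_title} | {site_name}", "Installation", "My Package")
page_only_title = render_title("{page_title}", "Installation", "My Package")
```

Here `default_title` is "Installation | My Package", and `page_only_title` is just "Installation" because the template leaves out the site name.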
Custom Templates
Change the separator or format:
great-docs.yml
seo:
  title_template: "{page_title} - {site_name}"
Or remove the site name entirely:
great-docs.yml
seo:
  title_template: "{page_title}"
Structured Data (JSON-LD)
Structured data helps search engines understand what your site is about beyond plain text. Great Docs injects JSON-LD structured data into your pages, enabling rich search results with additional context about your software.
What Gets Injected
On the homepage and reference pages, Great Docs adds a schema block like this:
{
  "@context": "https://schema.org",
  "@type": "SoftwareSourceCode",
  "name": "My Package",
  "description": "A Python package for...",
  "codeRepository": "https://github.com/username/repo",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "Python"
  }
}
Customizing the Schema Type
For different types of documentation:
great-docs.yml
seo:
  structured_data:
    type: WebSite # Or: SoftwareApplication, APIReference, etc.
Disabling Structured Data
If you prefer not to include JSON-LD:
great-docs.yml
seo:
  structured_data:
    enabled: false
Meta Descriptions
Meta descriptions are the short summaries that appear below page titles in search results. A good description improves click-through rates because it tells searchers exactly what they'll find on the page. Great Docs generates these automatically from your content.
Auto-Generation
For each page, Great Docs chooses a description as follows:
- it uses the first meaningful paragraph in the page content
- it falls back to the default description if no suitable content is found
Setting a Default Description
Configure a fallback description for pages without extractable content:
great-docs.yml
seo:
  default_description: "Documentation for My Package, a Python library for..."
If not set, the package description from pyproject.toml is used as a fallback.
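The fallback order described in this section can be sketched as a short function. What counts as a "meaningful" paragraph is not specified here, so the minimum-length threshold below is purely an assumption, and `pick_description` is a hypothetical name.

```python
def pick_description(paragraphs, default_description=None,
                     package_description=None, min_length=40):
    """Choose a meta description using the fallback order described above."""
    for text in paragraphs:
        cleaned = " ".join(text.split())  # collapse whitespace and newlines
        # Assumption: a paragraph is "meaningful" if it is long enough.
        if len(cleaned) >= min_length:
            return cleaned
    # Fall back to the configured default, then to the package description.
    return default_description or package_description or ""
```

A page whose paragraphs are all too short would therefore receive `default_description` if it is set, and the package description from pyproject.toml otherwise.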
Page-Level Descriptions
For individual pages, add a description field in the YAML frontmatter:
user_guide/01-installation.qmd
---
title: "Installation"
description: "How to install My Package using pip, conda, or from source."
---
Noindex for Internal Pages
Not every page belongs in search results. Internal pages, drafts, or experimental features can clutter search results and confuse users. Great Docs automatically adds noindex directives to internal pages like the Skills page.
Manual Noindex
To prevent a specific page from being indexed, add to its frontmatter:
drafts/experimental-feature.qmd
---
title: "Experimental Feature"
robots: "noindex, nofollow"
---
Disabling SEO Features
If you handle SEO through other means (for example, a hosting platform that generates sitemaps for you), you can disable individual features or all of them at once.
To completely disable SEO generation:
great-docs.yml
seo:
  enabled: false
Or disable specific features:
great-docs.yml
seo:
  sitemap:
    enabled: false
  robots:
    enabled: false
  canonical:
    enabled: false
  structured_data:
    enabled: false
Best Practices
The automatic features cover the technical foundations. These recommendations help you get the most out of them.
For Maximum SEO Effectiveness
- set a base URL, either through a GitHub repo or explicit configuration
- write good descriptions: add description: to key pages' frontmatter
- use clear, descriptive page titles (concise titles improve click-through rates)
- add alt text to images: Great Docs audits this and will fix any missing alt text
- run great-docs seo to audit before deployment and catch issues early
For GitHub Pages
If you're deploying to GitHub Pages, SEO works out of the box:
- your canonical base URL is auto-detected from the repository
- the sitemap is automatically referenced in robots.txt
- all pages get proper canonical links
For Custom Domains
When using a custom domain, set the canonical base URL explicitly:
great-docs.yml
seo:
  canonical:
    base_url: https://docs.myproject.com/
Also make sure your DNS is configured correctly (see Adding a Custom Domain).
Next Steps
Good SEO makes your documentation discoverable. Great Docs handles the technical foundations (sitemaps, canonical URLs, meta tags, structured data) automatically, so you can focus on writing clear titles and descriptions that represent your content well.
- Social Cards controls how links appear when shared on social platforms
- Deployment covers publishing to GitHub Pages with SEO settings applied
- Linting catches documentation quality issues that can also affect search relevance
- Configuration covers all great-docs.yml options including SEO settings