Documentation Linting

Master this essential documentation concept

Quick Definition

An automated process that scans written documentation for style inconsistencies, formatting errors, broken links, or guideline violations, similar to how code linters check source code quality.

How Documentation Linting Works

```mermaid
graph TD
    A["📄 Raw Documentation Files<br/>.md / .rst / .adoc"] --> B["Doc Linter Engine<br/>Vale / markdownlint / textlint"]
    B --> C{"Rule Violations<br/>Detected?"}
    C -->|Style Issues| D["🎨 Style Checker<br/>Passive voice, word choice, tone inconsistency"]
    C -->|Format Errors| E["📐 Format Validator<br/>Heading hierarchy, code block syntax"]
    C -->|Broken References| F["🔗 Link Checker<br/>Dead URLs, missing cross-references"]
    C -->|No Issues| G["✅ Docs Pass Linting<br/>Ready for merge/publish"]
    D --> H["📋 Lint Report<br/>Line numbers + fix suggestions"]
    E --> H
    F --> H
    H --> I{"Auto-fixable?"}
    I -->|Yes| J["🔧 Auto-formatter<br/>Prettier / markdownlint --fix"]
    I -->|No| K["👤 Writer Review<br/>Manual correction required"]
    J --> A
    K --> A
```

Understanding Documentation Linting

Documentation linting applies the same automated quality gates to prose that code linters apply to source code. A linter such as Vale, markdownlint, or textlint parses Markdown, reStructuredText, or AsciiDoc files against a configurable ruleset covering style, formatting, terminology, and link validity, then reports each violation with a line number and, where possible, a suggested fix. Run locally or in CI, it catches mechanical problems before a human reviewer ever sees the page.

Key Features

  • Automated style and terminology checks against a shared ruleset
  • Format validation for heading hierarchy and code block syntax
  • Detection of broken links and missing cross-references
  • Actionable reports with line numbers and fix suggestions

Benefits for Documentation Teams

  • Reduces manual copyediting during review cycles
  • Enforces consistent style and terminology across contributors
  • Catches broken links and formatting errors before publication
  • Frees reviewers to focus on substance instead of mechanics

Running Documentation Linting Checks on Knowledge That Lives in Video

Many teams introduce documentation linting standards during onboarding sessions, tool walkthroughs, or recorded team meetings — a senior writer demonstrates the linter configuration, walks through flagged violations, and explains the reasoning behind each rule. The knowledge is there, but it's locked inside a recording that nobody searches when they're staring at a failed lint check at 4pm on a Friday.

This creates a real gap in how documentation linting gets practiced day-to-day. Your team can't grep a video. When a new contributor asks why the linter flags passive voice in procedure steps, or how to suppress a specific rule for legacy content, there's no quick reference — just a 47-minute onboarding recording they'd have to scrub through manually.

Converting those recordings into structured, searchable documentation changes that dynamic. A video walkthrough of your linting configuration becomes a scannable reference page where contributors can jump directly to the rule they're troubleshooting. Style decisions that were explained verbally get captured as written rationale, making it easier to enforce documentation linting consistently across distributed teams without scheduling another live session every time someone needs context.

If your team's linting standards and style guidance are scattered across recordings, see how converting video content into searchable documentation can help →

Real-World Documentation Use Cases

Enforcing Microsoft Writing Style Guide Across a 500-Page API Reference

Problem

A developer relations team maintaining a large REST API reference has 12 contributors who each write in different styles — some use second person ('you should call'), others use passive voice ('the endpoint is called'), and abbreviations like 'info', 'repo', and 'config' appear inconsistently. Editors spend 3–4 hours per release cycle manually correcting tone and terminology.

Solution

Documentation linting with Vale configured against the Microsoft Writing Style Guide automatically flags passive voice constructions, banned abbreviations, and incorrect product name capitalization on every pull request before human review begins.

Implementation

  1. Install Vale and create a .vale.ini config file pointing to the microsoft/microsoft-writing-style-guide style package from the Vale package registry.
  2. Add a custom vocab file listing approved product names (e.g., 'API Gateway' not 'api gateway') and banned terms (e.g., flag 'info' → suggest 'information').
  3. Integrate Vale into the GitHub Actions CI pipeline so it runs on every PR targeting the docs/ directory and posts inline comments on offending lines.
  4. Set the pipeline to 'warning' mode for style issues and 'error' mode for banned terms, blocking merges only on critical violations.
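The setup above might translate into a configuration along these lines. This is a sketch, not a drop-in file: the package name, styles path, and the Microsoft.Terms rule override are assumptions based on Vale's package registry conventions, so verify them against your Vale version.

```ini
; .vale.ini (project root)
StylesPath = docs/styles
MinAlertLevel = suggestion

; `vale sync` downloads this package into StylesPath
Packages = Microsoft

[docs/**/*.md]
BasedOnStyles = Vale, Microsoft

; Escalate banned-term checks to errors so CI can block merges on them,
; while passive-voice and similar style rules stay at their default levels
Microsoft.Terms = error
```

Running `vale sync` once fetches the package; after that, `vale docs/` reproduces locally exactly what CI will flag.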

Expected Outcome

Editor review time drops from 3–4 hours to under 30 minutes per release cycle, with style consistency scores improving measurably across the API reference within two sprint cycles.

Detecting Broken Internal Cross-References After a Documentation Site Migration

Problem

After migrating 800 pages from Confluence to a MkDocs-based static site, hundreds of internal links that used Confluence page IDs now resolve to 404 errors. The team has no systematic way to identify which pages contain broken links without manually clicking through the entire site.

Solution

Documentation linting with a link-checking tool scans all Markdown files at build time, identifies anchor links, relative paths, and external URLs that return non-200 HTTP responses, and generates a prioritized report grouped by broken destination.

Implementation

  1. Integrate lychee (a fast link checker) into the MkDocs build pipeline via a pre-build hook that scans all .md files in the docs/ directory.
  2. Configure lychee to ignore known external URLs that block bots (e.g., LinkedIn) using an .lycheeignore file, reducing false positives.
  3. Run the linter in the CI/CD pipeline and export results as a JUnit XML report so broken links appear as failed test cases in the build dashboard.
  4. Triage the report by grouping broken links by destination URL — fixing one source page often resolves dozens of link errors pointing to the same renamed page.
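A minimal CI wiring for this could look like the workflow below, using the lycheeverse/lychee-action wrapper. Treat it as a sketch: input names and defaults vary between action versions, so check the action's README before adopting it.

```yaml
# .github/workflows/link-check.yml (sketch)
name: Link Check
on:
  pull_request:
    paths: ["docs/**"]

jobs:
  lychee:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Check links in Markdown sources
        uses: lycheeverse/lychee-action@v1
        with:
          # lychee picks up .lycheeignore from the repository root automatically
          args: --no-progress "docs/**/*.md"
          fail: true
```

Scoping the trigger to docs/** keeps the check from running on unrelated code changes.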

Expected Outcome

All 847 broken internal links are identified within 4 minutes of pipeline execution, and the team resolves 90% of them within one sprint by updating a centralized redirects file.

Standardizing Code Block Language Tags in Multi-Language SDK Documentation

Problem

An SDK documentation site for Python, Go, and JavaScript has hundreds of fenced code blocks where contributors frequently omit language identifiers (using bare ``` instead of ```python), causing syntax highlighting to fail and making copy-paste examples harder to read. Some blocks also use inconsistent tags like 'js' instead of 'javascript'.

Solution

markdownlint with a custom rule enforces that every fenced code block includes an explicit, approved language identifier from a whitelist, and flags non-canonical aliases like 'js', 'py', or 'golang'.

Implementation

  1. Configure markdownlint with the MD040 rule (fenced-code-language) set to error severity so all bare code fences are flagged immediately.
  2. Write a custom markdownlint rule in JavaScript that checks fenced code language tags against an approved list: ['python', 'go', 'javascript', 'bash', 'json', 'yaml', 'text'].
  3. Add a pre-commit hook using husky so writers receive immediate feedback locally before pushing, reducing CI pipeline failures.
  4. Generate a one-time bulk fix script using sed to replace common aliases ('js' → 'javascript', 'py' → 'python') across the entire docs corpus.
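The custom rule from step 2 might look roughly like this. It is a minimal sketch assuming markdownlint's custom-rule shape (names, description, tags, function) and markdown-it-style fence tokens; newer markdownlint versions expose tokens via params.parsers rather than params.tokens, so adjust the access path for your version.

```javascript
// approved-code-language.js: sketch of a custom markdownlint rule.
// Assumes markdown-it-style fence tokens: { type: "fence", info: "<lang>", lineNumber }.
const APPROVED = new Set(["python", "go", "javascript", "bash", "json", "yaml", "text"]);

const approvedLanguageRule = {
  names: ["custom/approved-code-language"],
  description: "Fenced code blocks must carry an approved language tag",
  tags: ["code"],
  function: (params, onError) => {
    (params.tokens || [])
      .filter((token) => token.type === "fence")
      .forEach((token) => {
        // token.info holds the text after the opening fence, e.g. "python"
        const lang = (token.info || "").trim().split(/\s+/)[0];
        if (!APPROVED.has(lang)) {
          onError({
            lineNumber: token.lineNumber,
            detail: lang
              ? `"${lang}" is not an approved language tag`
              : "missing language tag",
          });
        }
      });
  },
};

module.exports = approvedLanguageRule;
```

Because non-canonical aliases like 'js' and 'py' are absent from the approved set, they fail the check too, nudging writers toward canonical tags before the bulk-fix script in step 4 ever needs to run.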

Expected Outcome

Syntax highlighting works correctly on 100% of code examples after the bulk fix, and zero bare code fences appear in new contributions within the first month of enforcement.

Preventing Heading Hierarchy Violations in a Docs-as-Code Workflow

Problem

Technical writers contributing to a Docusaurus site frequently skip heading levels — jumping from H2 directly to H4 — which breaks screen reader navigation, violates WCAG 2.1 accessibility guidelines, and causes the auto-generated page table of contents to render incorrectly with missing entries.

Solution

markdownlint's MD001 rule (heading-increment) is enforced in CI to flag any document where heading levels skip more than one level at a time, with error messages that cite the specific line number and the expected vs. actual heading level.

Implementation

["Enable markdownlint rule MD001 in the .markdownlint.json configuration file and set it to 'error' severity in the CI pipeline configuration.", 'Add a GitHub Actions workflow step that runs markdownlint on changed files only (using git diff) to keep CI feedback fast for large documentation repositories.', "Configure the linter output to use the 'compact' formatter so error messages are parseable by GitHub's problem matcher, displaying them as inline PR annotations.", "Document the heading hierarchy rules in the team's CONTRIBUTING.md with a visual example showing correct (H1→H2→H3) vs. incorrect (H1→H2→H4) structure."]

Expected Outcome

Heading hierarchy violations drop to zero within two weeks of enforcement, and an accessibility audit confirms the documentation site passes WCAG 2.1 Level AA heading structure requirements.

Best Practices

Start with a Minimal Ruleset and Expand Incrementally

Enabling all available linting rules on an existing documentation corpus immediately produces thousands of violations, overwhelming writers and causing teams to disable the linter entirely. Begin with 5–8 high-impact rules (e.g., broken links, missing code block language tags, heading hierarchy) and add rules in batches after each is fully resolved. This creates a culture of incremental quality improvement rather than an adversarial relationship with the tooling.

✓ Do: Enable MD001 (heading hierarchy), MD040 (code block language), and link checking as your first three rules, resolve all existing violations, then add MD013 (line length) in the next iteration.
✗ Don't: Enable the full Vale microsoft or google style packages on a legacy documentation set without first running a dry-run audit — you will generate 10,000+ warnings that paralyze the team.

Separate Blocking Errors from Advisory Warnings in CI

Not all linting violations are equally critical — a broken external link blocks a reader immediately, while a passive voice sentence is a quality issue. Configure your linting pipeline with two severity tiers: 'error' rules that block pull request merges, and 'warning' rules that post advisory comments without blocking. This preserves developer velocity while ensuring critical issues are never shipped.

✓ Do: Set broken links, missing alt text on images, and bare code fences as 'error' severity that fails the CI check; set passive voice, sentence length, and word choice suggestions as 'warning' severity.
✗ Don't: Mark every style preference as a blocking error — if writers cannot merge documentation fixes because of passive voice warnings, they will bypass the linter entirely by adding skip comments.

Provide Fix Suggestions Alongside Every Linting Error

A linting error that says 'passive voice detected on line 47' is far less actionable than one that says 'passive voice detected: consider changing "the function is called" to "call the function"'. Configure linters like Vale with suggestion rules that output the specific recommended replacement, and use auto-fixable rules wherever possible with markdownlint --fix. Writers adopt linting tools faster when the tool teaches them rather than just penalizing them.

✓ Do: Use Vale's substitution and suggestion rule types to output specific replacement text, and document common violations with before/after examples in your team's style guide.
✗ Don't: Rely solely on rule names like 'Microsoft.Passive' as error output — writers unfamiliar with the style guide won't know what change is expected without seeing a concrete suggestion.
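A Vale substitution rule that outputs a concrete replacement might be sketched like this. The style directory name and the word pairs are illustrative assumptions; in Vale's substitution rules the first %s in the message is the suggested term and the second is the matched term.

```yaml
# docs/styles/Internal/Substitutions.yml (sketch of a Vale substitution rule)
extends: substitution
message: "Use '%s' instead of '%s'."
level: warning
ignorecase: true
swap:
  info: information
  repo: repository
  config: configuration
```

With this rule, a flagged line reads "Use 'information' instead of 'info'." rather than a bare rule name, which is exactly the kind of self-teaching output writers adopt fastest.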

Maintain a Project-Specific Vocabulary File to Avoid False Positives

Documentation linters frequently flag legitimate technical terms — API names, product-specific jargon, and proper nouns — as spelling errors or style violations. A custom vocabulary or accept list file (e.g., Vale's accept.txt or a markdownlint custom words list) prevents legitimate terms from generating noise that writers learn to ignore. False positives erode trust in the linting tool and cause teams to dismiss real violations.

✓ Do: Maintain a docs/styles/Vocab/Tech/accept.txt file in Vale containing approved terms like 'Kubernetes', 'OAuth2', 'kubectl', and your product names, and require PRs that add new technical terms to update this file.
✗ Don't: Disable spelling or terminology rules entirely to avoid false positives — instead, invest 30 minutes in building an accept list that makes the rule accurate for your specific domain.

Run Documentation Linting Locally via Pre-Commit Hooks, Not Just in CI

Relying solely on CI pipeline linting means writers only discover violations after pushing commits, creating a slow feedback loop that discourages fixing issues. Pre-commit hooks built with tools like the pre-commit framework, using markdownlint and Vale adapters, catch violations in under 2 seconds at the moment of commit, before changes ever leave the writer's machine. Fast local feedback is the single biggest factor in whether writers internalize style rules versus treating them as CI obstacles.

✓ Do: Add a .pre-commit-config.yaml file to your documentation repository that runs markdownlint and Vale on staged .md files only, keeping hook execution under 3 seconds even in large repos.
✗ Don't: Configure pre-commit hooks to lint the entire documentation corpus on every commit — scanning 800 files takes 45+ seconds and writers will use --no-verify to skip the hook entirely.
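A staged-files-only hook configuration might look like the sketch below, using the commonly published markdownlint-cli pre-commit hook. The repo URL and hook id reflect that project's published hook; replace the rev pin with a current release tag, and note that the pre-commit framework passes only the staged files matching the filter to each hook, which is what keeps execution fast.

```yaml
# .pre-commit-config.yaml (sketch)
repos:
  - repo: https://github.com/igorshubovych/markdownlint-cli
    rev: v0.41.0  # replace with the current release tag
    hooks:
      - id: markdownlint
        files: \.md$
```

Running `pre-commit install` once per clone activates the hook; afterwards every commit lints only the Markdown files it actually touches.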

How Docsie Helps with Documentation Linting

Build Better Documentation with Docsie

Join thousands of teams creating outstanding documentation

Start Free Trial