| title | description |
|---|---|
| Architecture | Internal design of core/lint -- types, data flow, and extension points |
# Architecture
This document explains how core/lint works internally. It covers the core library (pkg/lint), the PHP quality pipeline (pkg/php), and the QA command layer (cmd/qa).
## Overview
The system is organised into three layers:
```
cmd/core-lint    CLI entry point (lint check, lint catalog)
cmd/qa           QA workflow commands (watch, review, health, issues, PHP tools)
        |
pkg/lint         Core library: rules, catalog, matcher, scanner, reporting
pkg/php          PHP tool wrappers: format, analyse, audit, security, test
pkg/detect       Project type detection
        |
catalog/*.yaml   Embedded rule definitions
```
The root lint.go file ties the catalog layer to the library:
```go
//go:embed catalog/*.yaml
var catalogFS embed.FS

func LoadEmbeddedCatalog() (*lintpkg.Catalog, error) {
	return lintpkg.LoadFS(catalogFS, "catalog")
}
```
This means all YAML rules are baked into the binary at compile time. There are no runtime file lookups.
## Core Types (pkg/lint)

### Rule
A `Rule` represents a single lint check loaded from YAML. Key fields:

```go
type Rule struct {
	ID             string   `yaml:"id"`
	Title          string   `yaml:"title"`
	Severity       string   `yaml:"severity"`        // info, low, medium, high, critical
	Languages      []string `yaml:"languages"`       // e.g. ["go"], ["go", "php"]
	Tags           []string `yaml:"tags"`            // e.g. ["security", "injection"]
	Pattern        string   `yaml:"pattern"`         // Regex pattern to match
	ExcludePattern string   `yaml:"exclude_pattern"` // Regex to suppress false positives
	Fix            string   `yaml:"fix"`             // Human-readable remediation
	Detection      string   `yaml:"detection"`       // "regex" (extensible to other types)
	AutoFixable    bool     `yaml:"auto_fixable"`
	ExampleBad     string   `yaml:"example_bad"`
	ExampleGood    string   `yaml:"example_good"`
	FoundIn        []string `yaml:"found_in"`        // Repos where pattern was observed
	FirstSeen      string   `yaml:"first_seen"`
}
```
Each rule validates itself via `Validate()`, which checks required fields and compiles regex patterns. Severity is constrained to five levels: `info`, `low`, `medium`, `high`, `critical`.
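The validation step described above can be sketched as follows. This is an illustrative standalone version, not the actual `pkg/lint` code; the trimmed-down struct and the `validate` helper are assumptions for demonstration:

```go
package main

import (
	"fmt"
	"regexp"
)

// severities mirrors the five levels the document lists.
var severities = map[string]bool{
	"info": true, "low": true, "medium": true, "high": true, "critical": true,
}

// Rule is a reduced stand-in for the real struct shown above.
type Rule struct {
	ID       string
	Severity string
	Pattern  string
}

// validate checks required fields, constrains severity, and confirms
// that the regex pattern compiles.
func validate(r Rule) error {
	if r.ID == "" || r.Pattern == "" {
		return fmt.Errorf("rule %q: missing required field", r.ID)
	}
	if !severities[r.Severity] {
		return fmt.Errorf("rule %s: invalid severity %q", r.ID, r.Severity)
	}
	if _, err := regexp.Compile(r.Pattern); err != nil {
		return fmt.Errorf("rule %s: bad pattern: %w", r.ID, err)
	}
	return nil
}

func main() {
	fmt.Println(validate(Rule{ID: "go-sql-001", Severity: "high", Pattern: `Exec\(`}))
	fmt.Println(validate(Rule{ID: "go-sql-002", Severity: "urgent", Pattern: `Exec\(`}))
}
```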
### Catalog
A Catalog is a flat collection of rules with query methods:
- `ForLanguage(lang)` -- returns rules targeting a specific language
- `AtSeverity(threshold)` -- returns rules at or above a severity level
- `ByID(id)` -- looks up a single rule
Loading is done via `LoadDir(dir)` for filesystem paths or `LoadFS(fsys, dir)` for embedded filesystems. Both read all `.yaml` files in the directory and parse them into `[]Rule`.
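The two query methods over a catalog can be sketched like this. The struct shapes and the severity ranking are assumptions made for the example, not the real implementation:

```go
package main

import "fmt"

// Rule is a reduced stand-in carrying only the fields the queries need.
type Rule struct {
	ID        string
	Severity  string
	Languages []string
}

// Catalog is a flat collection of rules, as described above.
type Catalog struct{ Rules []Rule }

// rank orders the five severity levels for threshold comparison.
var rank = map[string]int{"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}

// ForLanguage returns rules that target the given language.
func (c Catalog) ForLanguage(lang string) []Rule {
	var out []Rule
	for _, r := range c.Rules {
		for _, l := range r.Languages {
			if l == lang {
				out = append(out, r)
				break
			}
		}
	}
	return out
}

// AtSeverity returns rules at or above the threshold.
func (c Catalog) AtSeverity(threshold string) []Rule {
	var out []Rule
	for _, r := range c.Rules {
		if rank[r.Severity] >= rank[threshold] {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	cat := Catalog{Rules: []Rule{
		{ID: "a", Severity: "low", Languages: []string{"go"}},
		{ID: "b", Severity: "critical", Languages: []string{"php"}},
	}}
	fmt.Println(len(cat.ForLanguage("go")), len(cat.AtSeverity("high")))
}
```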
### Matcher

The `Matcher` is the regex execution engine. It pre-compiles all regex-detection rules into `compiledRule` structs:

```go
type compiledRule struct {
	rule    Rule
	pattern *regexp.Regexp
	exclude *regexp.Regexp
}
```
`NewMatcher(rules)` compiles patterns once. `Match(filename, content)` then scans line by line:
- For each compiled rule, check if the filename itself matches the exclude pattern (e.g., skip `_test.go` files).
- For each line, test against the rule's pattern.
- If the line matches, check the exclude pattern to suppress false positives.
- Emit a `Finding` with file, line number, matched text, and remediation advice.
Non-regex detection types are silently skipped, allowing the catalog schema to support future detection mechanisms (AST, semantic) without breaking the matcher.
### Scanner

The `Scanner` orchestrates directory walking and language-aware matching:

- Walk the directory tree, skipping excluded directories (`vendor`, `node_modules`, `.git`, `testdata`, `.core`).
- For each file, detect its language from the file extension using `DetectLanguage()`.
- Filter the rule set to only rules targeting that language.
- Build a language-scoped `Matcher` and run it against the file content.
Supported language extensions:
| Extension | Language |
|---|---|
| `.go` | `go` |
| `.php` | `php` |
| `.ts`, `.tsx` | `ts` |
| `.js`, `.jsx` | `js` |
| `.cpp`, `.cc`, `.c`, `.h` | `cpp` |
| `.py` | `py` |
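The extension table above translates directly to a lookup map. A plausible sketch of `DetectLanguage()` (the real function may differ in detail):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// extToLang maps file extensions to the language identifiers
// from the table above.
var extToLang = map[string]string{
	".go": "go", ".php": "php",
	".ts": "ts", ".tsx": "ts",
	".js": "js", ".jsx": "js",
	".cpp": "cpp", ".cc": "cpp", ".c": "cpp", ".h": "cpp",
	".py": "py",
}

// detectLanguage returns "" for unknown extensions, which lets the
// scanner skip files it has no rules for.
func detectLanguage(path string) string {
	return extToLang[filepath.Ext(path)]
}

func main() {
	fmt.Println(detectLanguage("pkg/lint/matcher.go"), detectLanguage("app/Models/User.php"))
}
```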
### Finding

A `Finding` is the output of a match:

```go
type Finding struct {
	RuleID   string `json:"rule_id"`
	Title    string `json:"title"`
	Severity string `json:"severity"`
	File     string `json:"file"`
	Line     int    `json:"line"`
	Match    string `json:"match"`
	Fix      string `json:"fix"`
	Repo     string `json:"repo,omitempty"`
}
```
### Report

The report.go file provides three output formats:

- `WriteText(w, findings)` -- human-readable: `file:line [severity] title (rule-id)`
- `WriteJSON(w, findings)` -- pretty-printed JSON array
- `WriteJSONL(w, findings)` -- newline-delimited JSON (one object per line)

`Summarise(findings)` aggregates counts by severity.
## Data Flow

A typical scan follows this path:

```
YAML files ──> LoadFS() ──> Catalog{Rules}
                                 |
                  ForLanguage() / AtSeverity()
                                 |
                         []Rule (filtered)
                                 |
                        NewScanner(rules)
                                 |
                 ScanDir(root) / ScanFile(path)
                                 |
                 ┌───────────────┼───────────────┐
                 │ Walk tree     │ Detect lang   │
                 │ Skip dirs     │ Filter rules  │
                 │               │ NewMatcher()  │
                 │               │ Match()       │
                 └───────────────┴───────────────┘
                                 |
                            []Finding
                                 |
              WriteText() / WriteJSON() / WriteJSONL()
```
## Cyclomatic Complexity Analysis (pkg/lint/complexity.go)

The module includes a native Go AST-based cyclomatic complexity analyser. It uses `go/parser` and `go/ast` -- no external tools required.

```go
results, err := lint.AnalyseComplexity(lint.ComplexityConfig{
	Threshold: 15,
	Path:      "./pkg/...",
})
```
Complexity is calculated by starting at 1 and incrementing for each branching construct:
- `if`, `for`, `range`, `case` (non-default), `comm` (non-default)
- `&&`, `||` binary expressions
- `type switch`, `select`
There is also `AnalyseComplexitySource(src, filename, threshold)` for testing without file I/O.
## Coverage Tracking (pkg/lint/coverage.go)
The coverage subsystem supports:
- Parsing Go coverage output (`ParseCoverProfile` for `-coverprofile` format, `ParseCoverOutput` for `-cover` output)
- Snapshotting via `CoverageSnapshot` (timestamp, per-package percentages, metadata)
- Persistence via `CoverageStore` (JSON file-backed append-only store)
- Regression detection via `CompareCoverage(previous, current)`, which returns a `CoverageComparison` with regressions, improvements, new packages, and removed packages
## Vulnerability Checking (pkg/lint/vulncheck.go)

`VulnCheck` wraps `govulncheck -json` and parses its newline-delimited JSON output into structured `VulnFinding` objects. The parser handles three message types from govulncheck's wire format:

- `config` -- extracts the module path
- `osv` -- stores vulnerability metadata (ID, aliases, summary, affected ranges)
- `finding` -- maps OSV IDs to call traces and affected packages
## Toolkit (pkg/lint/tools.go)

The `Toolkit` struct wraps common developer commands into structured Go APIs. It executes subprocesses and parses their output:
| Method | Wraps | Returns |
|---|---|---|
| `FindTODOs(dir)` | `git grep` | `[]TODO` |
| `Lint(pkg)` | `go vet` | `[]ToolFinding` |
| `Coverage(pkg)` | `go test -cover` | `[]CoverageReport` |
| `RaceDetect(pkg)` | `go test -race` | `[]RaceCondition` |
| `AuditDeps()` | `govulncheck` (text) | `[]Vulnerability` |
| `ScanSecrets(dir)` | `gitleaks` | `[]SecretLeak` |
| `GocycloComplexity(threshold)` | `gocyclo` | `[]ComplexFunc` |
| `DepGraph(pkg)` | `go mod graph` | `*Graph` |
| `GitLog(n)` | `git log` | `[]Commit` |
| `DiffStat()` | `git diff --stat` | `DiffSummary` |
| `UncommittedFiles()` | `git status` | `[]string` |
| `Build(targets...)` | `go build` | `[]BuildResult` |
| `TestCount(pkg)` | `go test -list` | `int` |
| `CheckPerms(dir)` | `filepath.Walk` | `[]PermIssue` |
| `ModTidy()` | `go mod tidy` | `error` |
All methods use the `Run(name, args...)` helper, which captures stdout, stderr, and exit code.
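A subprocess helper of that shape can be sketched with `os/exec`. This is an illustrative version, not the real `Run`; note that it treats a non-zero exit as data rather than an error, since tools like `go vet` exit non-zero when they find problems. The demo assumes a POSIX `sh` on the PATH:

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// Result captures everything the Toolkit methods need to parse.
type Result struct {
	Stdout, Stderr string
	ExitCode       int
}

// run executes name with args, capturing both output streams and
// the exit code.
func run(name string, args ...string) (Result, error) {
	cmd := exec.Command(name, args...)
	var out, errBuf bytes.Buffer
	cmd.Stdout, cmd.Stderr = &out, &errBuf
	err := cmd.Run()
	code := -1
	if cmd.ProcessState != nil {
		code = cmd.ProcessState.ExitCode()
	}
	// A non-zero exit is reported via ExitCode, not as an error;
	// only failures to start the process propagate.
	if _, ok := err.(*exec.ExitError); ok {
		err = nil
	}
	return Result{Stdout: out.String(), Stderr: errBuf.String(), ExitCode: code}, err
}

func main() {
	res, err := run("sh", "-c", "echo out; echo err >&2; exit 3")
	if err != nil {
		fmt.Println("could not run:", err)
		return
	}
	fmt.Printf("exit=%d stdout=%q stderr=%q\n", res.ExitCode, res.Stdout, res.Stderr)
}
```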
## PHP Quality Pipeline (pkg/php)

The `pkg/php` package provides structured wrappers around PHP ecosystem tools. Each tool has:

- Detection -- checks for config files and vendor binaries (e.g., `DetectAnalyser`, `DetectPsalm`, `DetectRector`)
- Options struct -- configures the tool run
- Execution function -- builds the command, runs it, and returns structured results
### Supported Tools

| Function | Tool | Purpose |
|---|---|---|
| `Format()` | Laravel Pint | Code style formatting |
| `Analyse()` | PHPStan / Larastan | Static analysis |
| `RunPsalm()` | Psalm | Type-level static analysis |
| `RunAudit()` | Composer audit + npm audit | Dependency vulnerability scanning |
| `RunSecurityChecks()` | Built-in checks | .env exposure, debug mode, filesystem security |
| `RunRector()` | Rector | Automated code refactoring |
| `RunInfection()` | Infection | Mutation testing |
| `RunTests()` | Pest / PHPUnit | Test execution |
### QA Pipeline
The pipeline system (pipeline.go + runner.go) organises checks into three stages:
- Quick -- audit, fmt, stan (fast, run on every push)
- Standard -- psalm (if available), test
- Full -- rector, infection (slow, run in full QA)
The `QARunner` builds `process.RunSpec` objects with dependency ordering (e.g., stan runs after fmt, test runs after stan). This allows future parallelisation while respecting ordering constraints.
## Project Detection (pkg/detect)

The detect package identifies project types by checking for marker files:

- `go.mod` present => Go project
- `composer.json` present => PHP project
`DetectAll(dir)` returns all detected types, enabling polyglot project support.
## QA Command Layer (cmd/qa)
The cmd/qa package provides workflow-level commands that integrate with GitHub via the gh CLI:
- watch -- polls GitHub Actions for a specific commit, shows real-time status, drills into failure details (failed job, step, error line from logs)
- review -- fetches open PRs, analyses CI status, review decisions, and merge readiness, suggests next actions
- health -- scans all repos in a `repos.yaml` registry, reports aggregate CI health with pass rates
- issues -- fetches issues across repos, categorises them (needs response, ready, blocked, triage), prioritises by labels and activity
- docblock -- parses Go source with `go/ast`, counts exported symbols with and without doc comments, enforces a coverage threshold
Commands register themselves via `cli.RegisterCommands` in an `init()` function, making them available when the package is imported.
## Extension Points

### Adding New Rules
Create a new YAML file in catalog/ following the schema:
```yaml
- id: go-xxx-001
  title: "Description of the issue"
  severity: medium        # info, low, medium, high, critical
  languages: [go]
  tags: [security]
  pattern: 'regex-pattern'
  exclude_pattern: 'false-positive-filter'
  fix: "How to fix the issue"
  detection: regex
  auto_fixable: false
  example_bad: 'problematic code'
  example_good: 'corrected code'
```
The file will be embedded automatically on the next build.
### Adding New Detection Types

The `Detection` field on `Rule` currently supports `"regex"`. The `Matcher` skips non-regex rules, so adding a new detection type (e.g., `"ast"` for Go AST patterns) requires:

- Adding the new type to the `Validate()` method
- Creating a new matcher implementation
- Integrating it into `Scanner.ScanDir()`
### Loading External Catalogs

Use `LoadDir(path)` to load rules from a directory on disk rather than the embedded catalog:

```go
cat, err := lintpkg.LoadDir("/path/to/custom/rules")
```
This allows organisations to maintain private rule sets alongside the built-in catalog.