This commit introduces a new audit document, `AUDIT-ERROR-HANDLING.md`, which provides a comprehensive review of the project's error handling and logging practices. The audit covers: - **Error Handling:** Analyzes the inconsistency between the well-structured API error responses and the simpler, unstructured error handling at the application's entry points. - **Logging:** Details the existing custom logger, its lack of JSON output, and its inconsistent use across the codebase. - **Recommendations:** Provides actionable steps for improvement, including adopting structured JSON logging, centralizing logger configuration, and standardizing on the global logger. Additionally, this commit includes minor, unrelated fixes to address pre-existing build failures: - Adds a missing package declaration and imports in `pkg/node/dispatcher.go`. - Removes an unused import in `pkg/ueps/packet.go`. Co-authored-by: Snider <631881+Snider@users.noreply.github.com>
5.6 KiB
Error Handling and Logging Audit
1. Error Handling
Exception Handling & Error Recovery
- Graceful Degradation: The application demonstrates graceful degradation in
pkg/mining/service.go, where theNewServicefunction continues to operate with a minimal in-memory profile manager if the primary one fails to initialize. This ensures core functionality remains available. - Inconsistent Top-Level Handling: Error handling at the application's entry points is inconsistent.
- In
cmd/desktop/mining-desktop/main.go, errors fromfs.Subandapp.Runare handled withlog.Fatal, which abruptly terminates the application without using the project's structured logger. - In
cmd/mining/main.go, errors fromcmd.Executeare printed tostderrwithfmt.Fprintf, and the application exits with a status code of 1. This is a standard CLI pattern but bypasses the custom logging framework.
- In
- No Retry or Circuit Breaker Patterns: The codebase does not currently implement explicit retry logic with backoff or circuit breaker patterns for handling failures in external dependencies or services. However, the API error response includes a
Retryablefield, which correctly signals to clients when a retry is appropriate (e.g., for503 Service Unavailable).
User-Facing & API Errors
- Standard API Error Response: The API service (
pkg/mining/service.go) excels at providing consistent, user-friendly error responses.- It uses a well-defined
APIErrorstruct that includes a machine-readablecode, a human-readablemessage, and an optionalsuggestionto guide the user. - The
respondWithErrorandrespondWithMiningErrorfunctions centralize error response logic, ensuring all API errors follow a consistent format.
- It uses a well-defined
- Appropriate HTTP Status Codes: The API correctly maps application errors to standard HTTP status codes (e.g.,
404 Not Foundfor missing miners,400 Bad Requestfor invalid input,500 Internal Server Errorfor server-side issues). - Controlled Information Leakage: The
sanitizeErrorDetailsfunction prevents the leakage of sensitive internal error details in production environments (GIN_MODE=release), enhancing security. Debug information is only exposed whenDEBUG_ERRorsis enabled.
2. Logging
Log Content and Quality
- Custom Structured Logger: The project includes a custom logger in
pkg/logging/logger.gothat supports standard log levels (Debug, Info, Warn, Error) and allows for structured logging by attaching key-value fields. - No JSON Output: The logger's output is a custom string format (
timestamp [LEVEL] [component] message | key=value), not structured JSON. This makes logs less machine-readable and harder to parse, filter, and analyze with modern log management tools. - Good Context in Error Logs: The existing usage of
logging.Errorthroughout thepkg/miningmodule is effective, consistently including relevant context (e.g.,miner,panic,error) as structured fields. - Request Correlation: The API service (
pkg/mining/service.go) implements arequestIDMiddlewarethat assigns a uniqueX-Request-IDto each request, which is then included in logs. This is excellent practice for tracing and debugging.
What is Not Logged
- No Sensitive Data: Based on a review of
logging.Errorusage, the application appears to correctly avoid logging sensitive information such as passwords, tokens, or personally identifiable information (PII).
Inconsistencies
- Inconsistent Adoption: The custom logger is not used consistently across the project. The
mainpackages for both the desktop and CLI applications (cmd/desktop/mining-desktop/main.goandcmd/mining/main.go) use the standardlogandfmtpackages for error handling, bypassing the structured logger. - No Centralized Configuration: There is no centralized logger initialization in
mainorroot.go. The global logger is used with its default configuration (Info level, stderr output), and there is no clear mechanism for configuring the log level or output via command-line flags or a configuration file.
3. Recommendations
-
Adopt Structured JSON Logging: Modify the logger in
pkg/logging/logger.goto output logs in JSON format. This will significantly improve the logs' utility by making them machine-readable and compatible with log analysis platforms like Splunk, Datadog, or the ELK stack. -
Centralize Logger Configuration:
- In
cmd/mining/cmd/root.go, add persistent flags for--log-leveland--log-format(e.g.,text,json). - In an
initfunction, parse these flags and configure the globallogging.Loggerinstance accordingly. - Do the same for the desktop application in
cmd/desktop/mining-desktop/main.go, potentially reading from a configuration file or environment variables.
- In
-
Standardize on the Global Logger:
- Replace all instances of
log.Fatalincmd/desktop/mining-desktop/main.gowithlogging.Errorfollowed byos.Exit(1). - Replace
fmt.Fprintf(os.Stderr, ...)incmd/mining/main.gowith a call tologging.Error.
- Replace all instances of
-
Enrich API Error Logs: In
pkg/mining/service.go, enhance therespondWithErrorfunction to log every API error it handles using the structured logger. This will ensure that all error conditions, including client-side errors like bad requests, are recorded for monitoring and analysis. Include therequest_idin every log entry. -
Review Log Levels: Conduct a codebase-wide review of log levels. Ensure that
Debugis used for verbose, development-time information,Infofor significant operational events,Warnfor recoverable issues, andErrorfor critical, action-required failures.