The List
Vendor | Gating Mechanism | Log Issues |
---|---|---|
1Password | Higher Tier (Business+) | |
Atlassian Cloud | Higher Tier (Standard+) | |
AWS CloudTrail | Usage Based Billing | |
Datadog | Usage Based Billing | |
DocuSign | Additional SKU | |
GitHub | Tiered quality | |
GitLab | Higher Tier (Premium+) | |
Terraform Cloud | Higher Tier (Plus/Enterprise only) | |
Hugging Face | Higher Tier (Enterprise Hub only) | |
Jamf Pro | Higher Tier (Enterprise+) & Additional SKUs (Compliance Reporter) | |
LastPass | Higher Tier (Business+) | |
Notion | Higher Tier (Enterprise only) | |
Salesforce | Additional SKU | |
Slack | Higher Tier (Enterprise Grid only) | |
Stripe | No documentation | |
Tines | Higher Tier (Paid+) | |
Zendesk | Higher Tier (Enterprise only) | |
What is an audit log?
Datadog provides good context on what distinguishes an audit log from a system log:
The difference between audit logs and system logs (e.g., error logs, operational logs, etc.) is the information they contain, their purpose, and their immutability. Whereas system logs are designed to help developers troubleshoot errors, audit logs help organizations document a historical record of activity for compliance purposes and other business policy enforcement.
Why does this exist?
Audit logs are a core resource for any defensive security team that needs to detect threats, investigate incidents, or find behavioral patterns in its organization. These audit logs are most commonly generated by external parties, such as SaaS vendors, hardware vendors, or on-premises solution providers. As noted by CISA’s 2024 “Secure by Design” pledge, it’s imperative for these organizations to provide audit logs in the base tier of their products, with sane storage limits for that telemetry, at no extra cost.
Audit logs designed with security professionals in mind are critical to threat detection engineering and incident response. Considering their significance, vendors should be held to core audit log practices.
The practices might fall into the buckets of:
- Data collection & its technical implementation
- Log quality & consistency
- Log content & documentation
Let’s try to answer two questions: how does a security engineer judge what qualifies as a “good” audit log, and what do we need as professionals to use these logs effectively?
Here is an attempt to define an answer.
What can be added to this list?
Any vendor can be added to the list if they charge a premium for audit log data.
The purpose is to push those building solutions to think of the security engineering customer and give us the information to better protect our organizations.
For example, added sources may provide audit logs only to high-paying enterprise tiers (Zendesk), provide higher-quality logs at higher pricing tiers (GitHub), or sell audit logs as a separate package even from enterprise tiers (Salesforce).
Audit Log Quality
Improving security engineers’ experience with audit logs spans three areas: the log content, how engineers can collect the logs, and the quality and consistency of the logs over time.
Log Content
- Event types cover all actions taken in the system and include critical fields, such as the source IP address.
- Audit logs have external-facing documentation on event types.
- Logs contain enough information to attribute activity to a user within the platform.
- The ability to get detailed audit logs is part of the core product or reasonably priced.
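As a rough illustration of the content criteria above, the sketch below checks whether an event carries enough fields to attribute an action to a user and a source. The field names (`event_id`, `event_type`, `timestamp`, `actor_id`, `source_ip`) are assumptions chosen for this example, not any vendor's actual schema.

```python
from typing import Any

# Hypothetical minimum field set; real vendors use different names.
REQUIRED_FIELDS = ("event_id", "event_type", "timestamp", "actor_id", "source_ip")


def missing_fields(event: dict[str, Any]) -> list[str]:
    """Return the required fields that are absent or empty in an event."""
    return [name for name in REQUIRED_FIELDS if event.get(name) in (None, "")]


# An event that can be attributed to a user and a source address.
good_event = {
    "event_id": "evt_0001",
    "event_type": "user.login",
    "timestamp": "2024-05-01T12:00:00+00:00",
    "actor_id": "user_42",
    "source_ip": "203.0.113.7",
}

# An event that records what happened but not who did it or from where.
bad_event = {
    "event_id": "evt_0002",
    "event_type": "user.login",
    "timestamp": "2024-05-01T12:01:00+00:00",
}

print(missing_fields(good_event))  # []
print(missing_fields(bad_event))   # ['actor_id', 'source_ip']
```

A check like this could run against a sample of each event type when evaluating a vendor, to see which event types cannot be attributed.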
Log Collection
- The ability exists to stream logs to a cloud storage or SIEM provider (such as logpush to S3). Otherwise, the API to self-retrieve logs is straightforward and documented, and lets engineers easily retrieve their logs.
- Event IDs are sortable and straightforward, so the collection pipeline neither misses log events nor ingests duplicates.
- There is good log formatting and data structure choice, making it easy to parse logs once they are retrieved.
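To show why sortable event IDs matter for collection, here is a minimal sketch of a checkpointed pull. It assumes a hypothetical integer `event_id` that increases monotonically across pages; `collect_new_events` and the page layout are made up for illustration and do not reflect a specific vendor API.

```python
from typing import Iterable, Iterator


def collect_new_events(pages: Iterable[list[dict]], last_seen_id: int) -> Iterator[dict]:
    """Yield events newer than the checkpoint, skipping anything already seen."""
    checkpoint = last_seen_id
    for page in pages:
        # Sort within the page so the checkpoint only ever moves forward.
        for event in sorted(page, key=lambda e: e["event_id"]):
            if event["event_id"] <= checkpoint:
                continue  # duplicate or already-ingested event
            checkpoint = event["event_id"]
            yield event


# Two overlapping pages, as can happen when a pull is retried.
pages = [
    [{"event_id": 101}, {"event_id": 102}],
    [{"event_id": 102}, {"event_id": 103}],
]

collected_ids = [e["event_id"] for e in collect_new_events(pages, last_seen_id=100)]
print(collected_ids)  # [101, 102, 103]
```

Without a sortable ID, the consumer has to fall back on timestamps or content hashes to deduplicate, which is more fragile.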
Quality & Consistency
- Logs are consistent across product versions and operating systems, including in how they are pulled. Backwards compatibility is introduced when needed.
- There is a low rate of incidents related to log quality. Logs are reliable and can be taken as a source of truth.
- There is limited latency between when an action occurs and when the log event is available.
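Latency is one of the easier qualities to measure. The sketch below assumes each event carries a hypothetical `occurred_at` timestamp from the vendor and that the pipeline stamps its own `received_at` on arrival, both as ISO 8601 strings with explicit offsets.

```python
from datetime import datetime


def delivery_lag_seconds(event: dict) -> float:
    """Return the seconds between an action occurring and its log arriving."""
    occurred = datetime.fromisoformat(event["occurred_at"])
    received = datetime.fromisoformat(event["received_at"])
    return (received - occurred).total_seconds()


# Hypothetical event that took three and a half minutes to become available.
event = {
    "occurred_at": "2024-05-01T12:00:00+00:00",
    "received_at": "2024-05-01T12:03:30+00:00",
}

print(delivery_lag_seconds(event))  # 210.0
```

Tracking this value per vendor over time gives a concrete number to point to when the delay is too long for detection use cases.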
What are some examples of qualities that make a log “bad”?
- Lack of information to attribute activity to a user or IP address.
- Poor formatting and structure that makes it difficult to access required information.
  - Ex: Unordered arrays inside of nested JSON
  - Ex: Random dicts inside of variable length arrays so the same event may have mixed lengths.
- Different formats depending on which internal team built each event type.
- Inconsistent event type definitions based on how a user is accessing the system.
- Inconsistent formats and naming conventions that differ if you pull it via API or view it in the UI.
- Lack of correlation indicators between two related log events.
- Inconsistent naming throughout logs (for example, mixing ip_address and source_ip instead of choosing one).
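Problems like these push cleanup work onto every consumer. As a sketch of that cost, the snippet below maps inconsistent field names onto one canonical name before events reach detection rules. The alias table is hypothetical; in practice it has to be built and maintained per vendor.

```python
# Hypothetical aliases seen across event types, mapped to one canonical name.
FIELD_ALIASES = {
    "ip_address": "source_ip",
    "ip": "source_ip",
    "user": "actor_id",
    "userId": "actor_id",
}


def normalize(event: dict) -> dict:
    """Return a copy of the event with aliased field names made canonical."""
    return {FIELD_ALIASES.get(key, key): value for key, value in event.items()}


# Two events describing the same kind of activity with different field names.
from_api = {"event_type": "user.login", "ip_address": "203.0.113.7", "user": "user_42"}
from_ui_export = {"event_type": "user.login", "source_ip": "203.0.113.7", "userId": "user_42"}

print(normalize(from_api) == normalize(from_ui_export))  # True
```

When a vendor picks one name and keeps it everywhere, this mapping layer is not needed at all.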
So how do you write good logs?
This is a question I pose to the greater security community. This framework is open to suggestions and edits via the GitHub repository.