Log Analysis for Threat Detection: Closing the Gap Between Data and Discovery

By IPThreat Team, May 12, 2026

The Threat Landscape That Makes Log Analysis Non-Negotiable

Threat actors have grown methodical about avoiding detection. The 0ktapus campaign, which compromised over 130 firms through credential harvesting and phishing, succeeded in part because the lateral movement and authentication abuse it relied on left traces in logs that were never correlated in time to stop the damage. The Canvas breach that disrupted schools and colleges nationwide followed a similar pattern: the indicators were present in authentication logs, access logs, and network telemetry before the breach became visible at the application layer.

OceanLotus, suspected of using PyPI to deliver ZiChatBot malware, demonstrates another dimension of the problem. Supply-chain attacks introduce malicious behavior through trusted channels, which means traditional perimeter detections miss the initial access entirely. From that point forward, the only reliable source of truth is log data: what processes spawned, what network connections were made, what files were created, and what authentication events followed.

Log analysis is not a passive activity reserved for incident response post-mortems. When done correctly and continuously, it functions as a real-time detection surface that covers ground no endpoint agent, firewall rule, or IDS signature can cover on its own. The SANS Internet Storm Center (ISC) community has consistently highlighted the importance of data sources beyond the endpoint, and that framing matters: logs from authentication systems, cloud control planes, DNS resolvers, and web proxies collectively describe attacker behavior in ways that no single telemetry source can replicate.

Why Log Coverage Determines Detection Capability

Before any analysis can happen, teams need to understand what they are actually collecting. Most organizations have gaps they have not mapped. Active Directory Certificate Services (AD CS) escalation techniques, which have been documented in detail by researchers unpacking advanced misuse tools, produce distinctive log events in the Windows Security Event Log, specifically around certificate template modifications, certificate enrollment requests, and changes to CA permissions. These events exist in the log stream. The question is whether anyone has enabled the right audit policies to capture them and whether the SIEM is ingesting them.

The same applies to cloud infrastructure. When ScarCruft compromised a gaming platform through a supply-chain attack, the attack chain included package repository abuse, which leaves artifacts in build pipeline logs, package manager logs, and CI/CD audit trails. Organizations that only monitor endpoint telemetry see the payload but miss the delivery mechanism entirely, which means they cannot scope the breach or identify other affected systems.

Log coverage should be treated as a structured inventory, not an assumption. The following source categories represent the minimum viable detection surface for a mid-size enterprise:

  • Authentication logs: Active Directory event logs (Event IDs 4624, 4625, 4648, 4768, 4769, 4771), Microsoft Entra ID (formerly Azure AD) sign-in logs, Okta system logs, and RADIUS/VPN authentication records.
  • Network telemetry: DNS query logs, proxy access logs, NetFlow or IPFIX records, firewall session logs, and east-west traffic from internal network sensors.
  • Endpoint and process telemetry: Sysmon logs (Event IDs 1, 3, 7, 8, 10, 11, 22), Windows PowerShell Script Block Logging, Linux auditd records, and EDR process trees.
  • Cloud control plane: AWS CloudTrail, Azure Activity Log, GCP Cloud Audit Logs, and Kubernetes audit logs for containerized workloads.
  • Application and web logs: Web server access logs, API gateway logs, application error logs, and database query logs.
  • Email and collaboration: Mail transfer agent logs, Microsoft 365 Unified Audit Log, and Slack or Teams audit exports where available.

Gaps in any of these categories translate directly into blind spots. A team that ingests endpoint telemetry but lacks DNS query logs, for example, cannot detect many command-and-control communication patterns or data exfiltration over DNS tunneling.
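Treating coverage as a structured inventory means the gap check can be run mechanically. The following is a minimal sketch, assuming a hypothetical `REQUIRED_SOURCES` matrix and a set of source names observed at the central collection platform; the category and source names are illustrative and not tied to any particular SIEM.

```python
# Hypothetical coverage matrix: category -> log sources expected in that category.
REQUIRED_SOURCES = {
    "authentication": {"ad_security", "idp_signin", "vpn_auth"},
    "network": {"dns_query", "proxy_access", "netflow", "firewall_session"},
    "endpoint": {"sysmon", "powershell_scriptblock", "auditd"},
    "cloud_control_plane": {"cloudtrail", "k8s_audit"},
}


def coverage_gaps(required, observed):
    """Return {category: sorted list of missing sources} for categories with gaps."""
    gaps = {}
    for category, sources in required.items():
        missing = sources - observed
        if missing:
            gaps[category] = sorted(missing)
    return gaps


# Assumed input: source names actually seen arriving at the central platform.
observed_sources = {
    "ad_security", "idp_signin", "vpn_auth", "sysmon",
    "proxy_access", "firewall_session", "cloudtrail",
}

for category, missing in coverage_gaps(REQUIRED_SOURCES, observed_sources).items():
    print(f"{category}: missing {', '.join(missing)}")
```

In this example the network category reports DNS query logs as missing, which is exactly the blind spot described above.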

Structuring Log Analysis Around Attacker Behavior

A shortage of raw log data is rarely the problem. Most organizations generate more log data than they can store, let alone analyze. The challenge is structuring analysis around the behaviors that matter rather than the events that are loudest.

The MITRE ATT&CK framework provides a practical organizing principle. Each tactic corresponds to a phase of attacker behavior, and each technique maps to specific log events that indicate that behavior occurred. For initial access, teams should monitor for unusual authentication patterns, new device registrations, and first-seen source IP addresses authenticating to sensitive systems. For privilege escalation and the credential access that often enables it, the relevant signals include AD CS enrollment events, Sysmon Event ID 10 records showing process access to LSASS, and Group Policy modification events.

Lateral movement is where log analysis often makes the difference. Techniques like pass-the-hash, pass-the-ticket, and WMI-based remote execution all produce specific authentication and process creation events. A Kerberos TGS request for a sensitive service from a workstation that has never made that request before is a meaningful signal. An NTLM authentication from a server to a workstation is anomalous in most environments and worth investigating.
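The "never made that request before" signal is a first-seen check. A minimal sketch follows, assuming Kerberos service ticket events (such as Event ID 4769) have already been normalized into dictionaries with hypothetical `host` and `service` keys, and that a history of previously observed pairs is available.

```python
def first_seen_requests(history, new_events, sensitive_services):
    """Return events where a host requests a sensitive service for the first time.

    history: iterable of (host, service) pairs already observed.
    new_events: iterable of dicts with hypothetical keys 'host' and 'service'.
    sensitive_services: set of service names worth alerting on.
    """
    seen = set(history)
    findings = []
    for event in new_events:
        pair = (event["host"], event["service"])
        if pair not in seen and event["service"] in sensitive_services:
            findings.append(event)
        seen.add(pair)  # later repeats of the same pair are no longer first-seen
    return findings


history = {("ws-0142", "cifs/fileserver01"), ("ws-0142", "http/intranet")}
events = [
    {"host": "ws-0142", "service": "cifs/fileserver01"},
    {"host": "ws-0142", "service": "mssql/findb01"},
    {"host": "ws-0142", "service": "mssql/findb01"},
]
for finding in first_seen_requests(history, events, {"mssql/findb01"}):
    print("first-seen sensitive request:", finding)
```

How long the history window should be, and which services count as sensitive, are environment-specific choices that this sketch leaves open.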

Temporal correlation is equally important. A single failed login is noise. Fifty failed logins across twenty accounts from the same source IP in two minutes most likely indicate a credential stuffing or password spraying attempt. A successful login from a new geographic location followed within thirty seconds by an email forwarding rule creation is a business email compromise indicator. These patterns require correlation logic, not just individual event alerting.
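The failed-login pattern can be expressed as a sliding-window correlation. This is a sketch under stated assumptions: failed-login events arrive as `(timestamp, source_ip, account)` tuples sorted by time, and the thresholds mirror the illustrative numbers in the paragraph above rather than tuned production values.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def detect_failed_login_bursts(events, window=timedelta(minutes=2),
                               min_failures=50, min_accounts=20):
    """Return (source_ip, timestamp) for each source that crosses both thresholds.

    events: iterable of (timestamp, source_ip, account) tuples for failed
    logins, sorted by timestamp. Each source IP is reported at most once.
    """
    recent = defaultdict(deque)  # source_ip -> deque of (timestamp, account)
    alerted = set()
    alerts = []
    for ts, ip, account in events:
        queue = recent[ip]
        queue.append((ts, account))
        # Drop failures that have aged out of the window.
        while queue and ts - queue[0][0] > window:
            queue.popleft()
        distinct_accounts = {acct for _, acct in queue}
        if (ip not in alerted
                and len(queue) >= min_failures
                and len(distinct_accounts) >= min_accounts):
            alerted.add(ip)
            alerts.append((ip, ts))
    return alerts


base = datetime(2026, 5, 1, 9, 0, 0)
noisy = [(base + timedelta(seconds=i), "203.0.113.5", f"user{i % 20}") for i in range(50)]
quiet = [(base + timedelta(seconds=10 * i), "198.51.100.7", "alice") for i in range(5)]
print(detect_failed_login_bursts(sorted(noisy + quiet)))
```

Requiring both a failure count and a distinct-account count keeps a single user mistyping a password repeatedly from triggering the rule.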

Detection Engineering: From Log Events to Actionable Alerts

Detection engineering is the discipline of translating threat intelligence and attacker behavior knowledge into specific detection logic applied to log data. It sits between raw log collection and analyst response, and the quality of this layer determines how much signal reaches the team and how much noise they must filter.

YARA-X 1.16.0, a recent release of VirusTotal's Rust rewrite of YARA, reinforces the ongoing investment in pattern-matching for threat detection. While YARA is traditionally applied to file and memory scanning, the underlying principle applies directly to log analysis: define precise patterns that characterize known-bad behavior, and apply them consistently across the data. In log analysis, this means writing detection rules that look for specific sequences of events, specific field values, or specific combinations of attributes that appear together in attacker activity.

A concrete example: detecting AD CS abuse requires matching several conditions simultaneously. The audit log must show a certificate enrollment request (Event ID 4886 or 4887), the certificate template involved must have client authentication extended key usage, the requesting account must not be a service account provisioned for certificate operations, and the enrollment must occur outside of standard provisioning windows. No single event triggers this detection. The correlation of multiple events across a defined time window does.
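The conditions above can be combined into a single predicate. The sketch below assumes enrollment events have been normalized into dictionaries with hypothetical keys (`event_id`, `template`, `requester`, `timestamp`), and that template EKUs come from a separately maintained inventory (`TEMPLATE_EKUS`), since this example does not assume the event itself carries them. The account names and the provisioning window are placeholders.

```python
from datetime import datetime, time

CLIENT_AUTH_EKU = "1.3.6.1.5.5.7.3.2"  # OID for Client Authentication
ENROLLMENT_EVENT_IDS = {4886, 4887}

# Assumptions for illustration: template inventory, allowlist, and window.
TEMPLATE_EKUS = {
    "UserAuth": {CLIENT_AUTH_EKU},
    "WebServer": {"1.3.6.1.5.5.7.3.1"},  # Server Authentication only
}
PROVISIONING_ACCOUNTS = {"svc-certprov"}
PROVISIONING_WINDOW = (time(1, 0), time(3, 0))  # 01:00 to 03:00


def is_suspicious_enrollment(event):
    """True when an enrollment event meets every condition for AD CS abuse review."""
    if event["event_id"] not in ENROLLMENT_EVENT_IDS:
        return False
    if CLIENT_AUTH_EKU not in TEMPLATE_EKUS.get(event["template"], set()):
        return False
    if event["requester"] in PROVISIONING_ACCOUNTS:
        return False
    start, end = PROVISIONING_WINDOW
    in_window = start <= event["timestamp"].time() < end
    return not in_window


event = {
    "event_id": 4887,
    "template": "UserAuth",
    "requester": "jdoe",
    "timestamp": datetime(2026, 5, 1, 14, 30),
}
print(is_suspicious_enrollment(event))
```

Each early return corresponds to one condition, which makes it straightforward to test the rule against known-good enrollments before deployment.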

Detection rules should be versioned and tested against known-good traffic before deployment. A rule that fires on every AD CS enrollment in an environment with automated certificate provisioning is worse than no rule, because it trains analysts to ignore the alert category. Precision matters as much as recall.

Log Analysis Checklist for Threat Detection Teams

The following checklist is intended for cybersecurity teams establishing or auditing their log analysis capability. Each item represents a concrete action rather than a general recommendation.

  1. Audit log sources against a coverage matrix. Map every system type in the environment to the log sources it should produce. Identify systems that are not sending logs to the central collection platform. Prioritize gaps by asset criticality.
  2. Verify log completeness and fidelity. Check that logs include the fields required for detection, not just the fields enabled by default. Windows Security audit policy often requires explicit configuration to enable object access auditing, process creation logging, and privilege use logging. Confirm that timestamps are synchronized via NTP and that timezone handling is consistent across sources.
  3. Define retention requirements by log type. Authentication logs should be retained for a minimum of 90 days for most environments, with 365 days recommended for environments subject to regulatory requirements. DNS logs and proxy logs support investigation of long-dwell-time intrusions and should be retained accordingly.
  4. Implement baseline behavioral profiling. Establish normal patterns for authentication times, source IP ranges, accessed resources, and data transfer volumes for key user and system accounts. Deviations from baseline are more actionable than threshold alerts.
  5. Write and test detection rules for priority threat techniques. Focus initial detection engineering efforts on the techniques most relevant to the organization's threat profile. For organizations that run primarily on cloud infrastructure, privilege escalation via IAM policy modification is typically a higher priority than AD CS abuse. Verify that rules produce alerts against synthetic test data before relying on them in production.
  6. Establish alert triage workflows. Define what happens when a detection fires: who receives the alert, what initial investigation steps are expected, what information must be collected before escalation, and what escalation path exists for confirmed incidents. Unrouted alerts are equivalent to no alerts.
  7. Review high-volume log sources for suppressed signals. DNS logs, proxy logs, and firewall session logs generate enormous volumes. Check that parsing errors, field truncation, or ingestion rate limits are not silently dropping events. A SIEM that ingests 80% of a log source may miss exactly the events that matter.
  8. Conduct periodic threat hunts against log archives. Scheduled hunts using hypotheses derived from recent threat intelligence force proactive examination of log data rather than waiting for alerts. A hunt for signs of PyPI-delivered malware command-and-control traffic, for example, can be completed against DNS and proxy logs using known indicator patterns from threat intelligence feeds.
  9. Integrate external threat intelligence with log analysis. IP addresses, domains, file hashes, and certificates associated with known threat actors can be matched against log data to surface connections that rule-based detection would miss. Automate this matching where possible using SIEM lookup tables or threat intelligence platform integrations.
  10. Document and communicate log analysis findings. Produce regular reports on detection coverage, alert volumes, triage outcomes, and identified gaps. This documentation creates accountability and supports resource requests for expanding coverage or detection engineering capacity.
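Item 9 of the checklist, matching external indicators against log data, can be sketched in a few lines. The example assumes DNS and proxy records normalized into dictionaries with hypothetical `dest_ip` and `query` keys, and indicator sets exported from a threat intelligence feed. The addresses and domains shown are reserved documentation values, not real indicators.

```python
import ipaddress


def match_indicators(records, bad_ips, bad_domains):
    """Return (record, reason) pairs for records that touch a known indicator.

    Domain matching also checks parent domains, so 'a.b.bad.example'
    matches an indicator for 'bad.example'.
    """
    ip_set = {ipaddress.ip_address(ip) for ip in bad_ips}
    domain_set = {d.lower().rstrip(".") for d in bad_domains}
    hits = []
    for record in records:
        dest_ip = record.get("dest_ip")
        if dest_ip and ipaddress.ip_address(dest_ip) in ip_set:
            hits.append((record, f"ip:{dest_ip}"))
            continue
        domain = (record.get("query") or "").lower().rstrip(".")
        labels = domain.split(".")
        for i in range(len(labels)):
            candidate = ".".join(labels[i:])
            if candidate in domain_set:
                hits.append((record, f"domain:{candidate}"))
                break
    return hits


records = [
    {"query": "cdn.updates.bad.example.", "dest_ip": None},
    {"query": "www.example.org", "dest_ip": "192.0.2.10"},
    {"query": None, "dest_ip": "203.0.113.9"},
]
for record, reason in match_indicators(records, {"203.0.113.9"}, {"bad.example"}):
    print(reason, record)
```

The same function can serve a scheduled hunt over archived logs or a streaming lookup, depending on how the records are fed in.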

Handling Log Data from Cloud and Hybrid Environments

Cloud environments produce log types that differ structurally from on-premises logs, and many teams have not fully adapted their analysis pipelines to handle them. AWS CloudTrail, for example, records API calls made to AWS services, including who made the call, from what IP address, using what credentials, and with what parameters. A call to CreateUser followed by AttachUserPolicy with an administrator policy attached is a privilege escalation sequence that should trigger an alert regardless of whether the calling identity is legitimate. This pattern appears in cloud breaches regularly, including cases where compromised CI/CD pipeline credentials were used to create persistent admin accounts.
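The CreateUser followed by AttachUserPolicy sequence can be detected with a simple stateful pass over CloudTrail records. This is a minimal sketch, assuming records are already parsed into dictionaries with the standard `eventTime`, `eventName`, `userIdentity`, and `requestParameters` fields; the one-hour window and the function name are assumptions for illustration.

```python
from datetime import datetime, timedelta

ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"


def parse_time(value):
    """Parse a CloudTrail eventTime such as '2026-05-01T12:00:00Z'."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def find_admin_user_creation(records, window=timedelta(hours=1)):
    """Find users that were created and then given AdministratorAccess within the window."""
    created = {}  # userName -> creation time
    findings = []
    for record in sorted(records, key=lambda r: parse_time(r["eventTime"])):
        params = record.get("requestParameters") or {}
        user = params.get("userName")
        when = parse_time(record["eventTime"])
        if record["eventName"] == "CreateUser" and user:
            created[user] = when
        elif (record["eventName"] == "AttachUserPolicy"
              and params.get("policyArn") == ADMIN_POLICY_ARN
              and user in created
              and when - created[user] <= window):
            findings.append({
                "user": user,
                "actor": record["userIdentity"]["arn"],
                "attached_at": record["eventTime"],
            })
    return findings


actor = {"arn": "arn:aws:iam::111122223333:user/ci-deploy"}
trail = [
    {"eventTime": "2026-05-01T12:00:00Z", "eventName": "CreateUser",
     "userIdentity": actor, "requestParameters": {"userName": "backup-admin"}},
    {"eventTime": "2026-05-01T12:03:00Z", "eventName": "AttachUserPolicy",
     "userIdentity": actor,
     "requestParameters": {"userName": "backup-admin", "policyArn": ADMIN_POLICY_ARN}},
    {"eventTime": "2026-05-01T12:05:00Z", "eventName": "AttachUserPolicy",
     "userIdentity": actor,
     "requestParameters": {"userName": "existing-user", "policyArn": ADMIN_POLICY_ARN}},
]
print(find_admin_user_creation(trail))
```

The sketch deliberately ignores whether the actor is trusted, in line with the point that the sequence should alert regardless of who performs it. Attaching the administrator policy to a pre-existing user is a separate pattern that would need its own rule.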

Azure Activity Log captures control-plane operations including role assignments, resource deletions, and policy modifications. Monitoring for new Owner role assignments, especially to identities that have not previously held privileged roles, surfaces both insider threats and external attacker persistence. GCP Cloud Audit Logs follow a similar structure and include Admin Activity logs, Data Access logs, and System Event logs, each covering different attacker behaviors. Admin Activity and System Event logs are written by default, while Data Access logs are disabled by default for most services and must be enabled explicitly.

Kubernetes audit logs deserve specific attention as container adoption grows. When audit logging is enabled, the audit policy determines which API server requests are recorded and at what level of detail; a suitable policy captures pod creation, service account token requests, and RBAC modifications. Attackers who gain access to a Kubernetes cluster frequently create privileged pods that mount host file systems, a technique that appears clearly in audit logs as a pod spec including hostPath volume mounts or privileged: true security context settings, provided the policy records request bodies for pod creation.
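A check for risky pod specs in audit events might look like the following sketch. It assumes each audit event has been parsed from JSON into a dictionary and that the audit policy records pod creation at a level that includes `requestObject`; without the request body, the pod spec is not available to inspect. The function name is hypothetical.

```python
def risky_pod_reasons(audit_event):
    """Return the reasons a pod-creation audit event looks risky (empty list if none)."""
    ref = audit_event.get("objectRef") or {}
    if audit_event.get("verb") != "create" or ref.get("resource") != "pods":
        return []
    if ref.get("subresource"):
        return []  # for example pods/exec, which is not pod creation
    spec = (audit_event.get("requestObject") or {}).get("spec") or {}
    reasons = []
    for volume in spec.get("volumes") or []:
        if "hostPath" in volume:
            reasons.append(f"hostPath volume: {volume['hostPath'].get('path')}")
    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    for container in containers:
        if (container.get("securityContext") or {}).get("privileged") is True:
            reasons.append(f"privileged container: {container.get('name')}")
    return reasons


event = {
    "verb": "create",
    "objectRef": {"resource": "pods", "namespace": "default", "name": "debug"},
    "requestObject": {
        "spec": {
            "volumes": [{"name": "host", "hostPath": {"path": "/"}}],
            "containers": [
                {"name": "shell", "image": "busybox",
                 "securityContext": {"privileged": True}},
            ],
        }
    },
}
print(risky_pod_reasons(event))
```

Legitimate workloads such as node agents often use hostPath mounts, so an allowlist by namespace or service account would usually sit in front of a rule like this.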

Common Implementation Pitfalls

Log analysis programs fail in predictable ways. Understanding these failure modes helps teams avoid building infrastructure that looks complete but produces limited detection value.

The first pitfall is treating log ingestion as equivalent to log analysis. Sending logs to a SIEM and defining no detection logic is infrastructure without capability. Many organizations have years of log data stored in a SIEM with alert coverage for fewer than a dozen event types. The data exists; the analysis does not.

The second pitfall is over-relying on out-of-the-box detection rules from SIEM vendors or security platforms without tuning them to the environment. Default rules are calibrated for a generic environment. An alert that fires on any NTLM authentication will generate thousands of events per day in an environment where legacy applications use NTLM extensively, burying the anomalous events that indicate an attack. Every rule deployed to production should be reviewed for false-positive rate in that specific environment before analysts are expected to act on it.

The third pitfall is ignoring log quality problems. Logs that arrive with incorrect timestamps, missing fields, inconsistent formatting, or parsing errors produce unreliable detection. A detection rule that relies on a field that is frequently empty or malformed will miss events silently. Log quality monitoring, which checks field population rates, timestamp accuracy, and ingestion completeness, should be part of the operational baseline.
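Field population rates are simple to measure once records are parsed. The sketch below assumes a sample of normalized records as dictionaries; the field names and the 95% minimum are placeholders rather than recommended values.

```python
def field_population_rates(records, required_fields):
    """Return {field: fraction of records where the field is present and non-empty}."""
    total = len(records)
    rates = {}
    for field in required_fields:
        populated = sum(1 for r in records if r.get(field) not in (None, ""))
        rates[field] = populated / total if total else 0.0
    return rates


def underpopulated_fields(records, required_fields, minimum=0.95):
    """Return the fields whose population rate falls below the minimum."""
    rates = field_population_rates(records, required_fields)
    return sorted(field for field, rate in rates.items() if rate < minimum)


sample = [
    {"src_ip": "192.0.2.1", "user": "alice"},
    {"src_ip": "192.0.2.2", "user": ""},
    {"src_ip": "192.0.2.3"},
    {"src_ip": "192.0.2.4", "user": "bob"},
]
print(field_population_rates(sample, ["src_ip", "user"]))
print(underpopulated_fields(sample, ["src_ip", "user"]))
```

Running a check like this per log source on a schedule turns silent rule failures into a visible data quality alert.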

The fourth pitfall is building detection only for known threats and not for anomalous behavior. Signature-based detection misses novel attack techniques by definition. Behavioral baselines and anomaly detection fill this gap, but they require investment in establishing what normal looks like before they can identify what abnormal looks like. Teams that skip baselining end up with anomaly detection models that fire constantly or not at all.
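As a deliberately simple illustration of baselining, the sketch below flags a value that deviates from an account's history by more than a chosen number of standard deviations. The z-score approach and the threshold of 3 are assumptions made for the example; the article does not prescribe a specific model, and real baselines usually need to account for seasonality and sparse data.

```python
from statistics import mean, stdev


def is_anomalous(history, value, threshold=3.0):
    """True when value lies more than `threshold` standard deviations from the history mean.

    With fewer than two observations there is no baseline, so nothing is flagged.
    """
    if len(history) < 2:
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical daily outbound transfer volumes in MB for one service account.
daily_mb = [100, 110, 95, 105, 90, 100, 108]
print(is_anomalous(daily_mb, 400))
print(is_anomalous(daily_mb, 104))
```

The early return for short histories reflects the point above: without an established baseline, an anomaly model has nothing meaningful to compare against.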

The fifth pitfall is siloing log analysis from threat intelligence. The OceanLotus PyPI campaign and the 0ktapus credential harvesting operation both produced indicators that were publicly available in threat intelligence feeds before many organizations had checked their logs for matching activity. Closing the loop between threat intelligence and log analysis requires operational processes, not just tool integrations.

The sixth pitfall is failing to account for log source availability during an incident. Attackers who have gained sufficient access frequently attempt to clear event logs, disable logging services, or delete cloud audit trails. Organizations that store logs only on the systems that generate them lose visibility the moment an attacker achieves local administrator access. Central, tamper-evident log storage with strict access controls is a prerequisite for reliable incident investigation.
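One common way to make stored logs tamper-evident is a hash chain, in which each record's digest also covers the digest before it. The sketch below illustrates the idea only; it is not a technique the article prescribes, and the function names are hypothetical. In practice the most recent digest would also need to be kept somewhere an attacker cannot rewrite, since anyone able to rewrite the whole store could recompute the chain.

```python
import hashlib

GENESIS = "0" * 64  # starting value for the chain


def chain_records(lines):
    """Return [(line, digest)] where each digest covers the line and the previous digest."""
    previous = GENESIS
    chained = []
    for line in lines:
        digest = hashlib.sha256((previous + line).encode("utf-8")).hexdigest()
        chained.append((line, digest))
        previous = digest
    return chained


def first_broken_index(chained):
    """Return the index of the first record that fails verification, or None if intact."""
    previous = GENESIS
    for index, (line, digest) in enumerate(chained):
        expected = hashlib.sha256((previous + line).encode("utf-8")).hexdigest()
        if digest != expected:
            return index
        previous = digest
    return None


chained = chain_records(["login ok alice", "role change bob", "logout alice"])
print(first_broken_index(chained))
```

Editing or deleting any record changes the expected digest for that position, so verification fails from that point onward.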

Making Log Analysis Sustainable at Scale

Log analysis is not a project with a completion date. The threat landscape evolves, environments change, and detection logic that was effective against last year's techniques may be blind to this year's. Sustainability requires a combination of automation, human review, and continuous improvement processes.

Automation handles high-volume, low-complexity detection tasks: matching logs against threat intelligence indicators, correlating authentication events for obvious credential stuffing patterns, and alerting on clearly defined policy violations. Human review handles the cases that require context, judgment, and knowledge of the specific environment. The division of labor between automation and human analysts should be explicit and regularly revisited as detection capability matures.

Detection rule reviews should be scheduled, not reactive. Every rule in production should be evaluated quarterly for false-positive rate, coverage relevance, and alignment with current threat intelligence. Rules that fire frequently without producing confirmed incidents should be tuned or retired. Rules that cover techniques actively used by threat actors targeting the organization's sector should be prioritized for development.
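A quarterly review can start from triage outcomes. The sketch below assumes each closed alert is available as a `(rule_id, confirmed)` pair, where `confirmed` marks a confirmed incident; the alert-count and precision thresholds are placeholders to be set per environment.

```python
from collections import Counter


def rule_precision(triage_outcomes):
    """Return {rule_id: confirmed alerts / total alerts} from (rule_id, confirmed) pairs."""
    fired = Counter()
    confirmed = Counter()
    for rule_id, is_confirmed in triage_outcomes:
        fired[rule_id] += 1
        if is_confirmed:
            confirmed[rule_id] += 1
    return {rule: confirmed[rule] / count for rule, count in fired.items()}


def rules_needing_review(triage_outcomes, min_alerts=20, max_precision=0.05):
    """Return rules that fire often but rarely produce confirmed incidents."""
    outcomes = list(triage_outcomes)
    counts = Counter(rule_id for rule_id, _ in outcomes)
    precision = rule_precision(outcomes)
    return sorted(
        rule for rule, value in precision.items()
        if counts[rule] >= min_alerts and value <= max_precision
    )


outcomes = (
    [("ntlm-any", False)] * 40
    + [("adcs-enroll", True)] * 3
    + [("adcs-enroll", False)] * 2
)
print(rule_precision(outcomes))
print(rules_needing_review(outcomes))
```

The minimum alert count keeps rarely firing rules from being retired on the basis of too few samples.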

Threat hunting activity should feed back into detection engineering. When a hunt surfaces an attacker technique that was not covered by existing rules, the outcome should include a new or modified detection rule so that future instances of the same technique alert automatically. This loop between proactive hunting and automated detection is how detection capability grows over time rather than remaining static.

Log analysis will not prevent every attack. Supply-chain compromises like the ScarCruft gaming platform case and credential-based attacks like 0ktapus demonstrate that attackers exploit trust relationships and legitimate access in ways that are difficult to block at the perimeter. What log analysis provides is visibility into what happened, when it happened, and what was affected, which is the foundation for effective response, accurate scoping, and meaningful containment. For organizations willing to invest in coverage, detection engineering, and operational process, log analysis remains one of the most reliable detection surfaces available.
