The Breach That Started With a Familiar Signature
A mid-sized financial services firm running a standard SIEM stack received an alert late on a Thursday evening. The alert fired from their web application firewall, flagged a suspicious outbound connection, and was automatically suppressed because the destination IP had appeared on an internal allow-list added six months earlier during a vendor integration project. By Sunday morning, the security team was dealing with a full ransomware deployment across 14 servers.
The logs had recorded everything. HTTPS beaconing at irregular intervals, a spike in LSASS memory access events on two domain controllers, lateral movement via Windows RPC calls matching the PhantomRPC privilege escalation technique disclosed earlier this year. All of it was there. The problem was not a gap in logging coverage. The problem was a gap in how the logs were being read, correlated, and acted upon.
This scenario has become increasingly common. Modified versions of known tooling, including adapted variants of the CIA's Hive C2 framework that have reportedly entered criminal markets, are designed to blend into environments where defenders have tuned their detection around legacy signatures. The implants generate log noise that looks benign in isolation and only becomes readable as an attack when correlated across multiple sources.
What Log Analysis for Threat Detection Actually Involves
Log analysis in a modern environment is not a matter of reviewing records after an incident. It is an active process that requires defenders to understand what normal behavior looks like across their specific infrastructure, what deviations from that baseline mean, and how to connect events from disparate sources into a coherent narrative.
For most organizations, logs arrive from at least five categories of sources: endpoint detection and response (EDR) tools, network devices such as firewalls and routers, authentication systems including Active Directory and LDAP, application servers and web proxies, and cloud control plane logs such as AWS CloudTrail or Azure Monitor. Each source tells a partial story. The complete picture requires joining them.
A single failed authentication attempt in an Active Directory log means almost nothing on its own. That same event, occurring alongside a new process creation on the same endpoint two minutes later, an outbound connection to an IP flagged in threat intelligence feeds, and a scheduled task modification logged by Sysmon, becomes a credible indicator of compromise. Log analysis is fundamentally about building these chains, not reviewing individual events.
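As a sketch of what such a chain looks like in a query, the following Splunk search groups the events described above by host within a fifteen-minute window, omitting the threat-intelligence match for brevity. The index names and the use of the transaction command are illustrative assumptions; transaction is expensive at scale, and production correlation usually runs in a SIEM's native rule engine:

(index=wineventlog (EventCode=4625 OR EventCode=4698)) OR (index=sysmon (EventCode=1 OR EventCode=3))
| transaction host maxspan=15m
| search EventCode=4625 EventCode=1 EventCode=3 EventCode=4698
| table _time, host, eventcount, duration

A match here means a single host produced a failed logon, a new process, a network connection, and a scheduled task change inside one window, which is exactly the kind of chain worth an analyst's attention.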
Structuring Your Log Pipeline for Detection
Centralize Before You Analyze
Detection at scale requires all relevant logs flowing into a central location where correlation is possible. Whether that is a commercial SIEM, an open-source stack built on Elasticsearch or OpenSearch, or a cloud-native solution like Microsoft Sentinel or Google Chronicle, centralization is the foundational requirement. Organizations that still review logs on individual systems or rely on manual SSH sessions to pull records will consistently miss multi-stage attacks.
When configuring log ingestion, prioritize the following sources in order of detection value for modern threat scenarios:
- Windows Event Logs: Event IDs 4624, 4625, 4648, 4688, 4698, 4720, 4769, and 7045 cover authentication, process creation, scheduled tasks, account changes, and service installations. Enable process command-line logging via Group Policy to populate the command-line field in Event ID 4688.
- Sysmon: Deploy Sysmon with a maintained configuration such as the SwiftOnSecurity or Olaf Hartong modular configs. Sysmon Event IDs 1, 3, 7, 8, 10, 11, 12, 13, and 22 provide process creation with hashes, network connections, image loads, CreateRemoteThread activity, process access events (useful for detecting LSASS dumping), file creation, registry changes, and DNS queries.
- Network Flow Data: NetFlow or IPFIX records from core routers and switches capture communication patterns even when payload inspection is unavailable. Unusual internal-to-internal traffic volumes and unexpected external destinations appear clearly in flow data.
- DNS Query Logs: A large share of malware relies on DNS at some stage of C2 communication. High-frequency queries to newly registered domains, algorithmically generated domain names, and queries for domains with no prior history in your environment are reliable early indicators (a sample query follows this list).
- Authentication Logs from Identity Providers: Azure AD sign-in logs, Okta system logs, and on-premises Active Directory logs reveal credential-based attacks, impossible travel scenarios, and token abuse.
- Cloud Control Plane Logs: AWS CloudTrail, Azure Activity Logs, and GCP Cloud Audit Logs record API calls that create, modify, or delete resources. Attackers who gain cloud access frequently create new IAM users, modify security group rules, or spin up compute resources for cryptomining or lateral movement.
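To make the DNS bullet concrete, here is a first-seen-domain hunt in the same Splunk style as the examples later in this article. The index name and the domain_baseline lookup (assumed to hold fields query and first_seen, populated from your own 90-day query history) are placeholders for your environment:

index=dns earliest=-1h
| stats count by src_ip, query
| lookup domain_baseline query OUTPUT first_seen
| where isnull(first_seen) AND count > 20
| sort - count

The count threshold is arbitrary; tune it against your normal query volume before alerting on it.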
Normalize and Enrich Early
Raw logs from different sources use inconsistent field names, timestamp formats, and data types. Before logs reach your detection layer, normalize them to a common schema. OCSF (Open Cybersecurity Schema Framework) and Elastic Common Schema (ECS) are two widely adopted options. Normalization makes it possible to write detection rules that work across source types without maintaining source-specific logic everywhere.
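As a minimal illustration of the idea, this search-time sketch renames raw Windows fields to their ECS equivalents. In a real pipeline the mapping happens once at ingestion rather than per query, and the source field names vary with forwarder configuration, so treat them as assumptions:

index=wineventlog
| rename EventCode as event.code, Computer as host.name, TargetUserName as user.name, IpAddress as source.ip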
Enrichment adds context that raw logs lack. Automatically appending geolocation data, ASN information, threat intelligence scores, and asset ownership information to log records at ingestion time means analysts have that context available the moment they open an alert. Services like AbuseIPDB, Shodan, and commercial threat intelligence platforms can be integrated into log pipelines via API lookups during the enrichment phase.
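In Splunk terms, that enrichment often materializes as a lookup applied automatically at search or ingestion time. The sketch below assumes a threat_intel_ip lookup with fields ip, abuse_score, and last_reported, refreshed on a schedule from AbuseIPDB or a commercial feed; the names and threshold are placeholders:

index=proxy_logs
| lookup threat_intel_ip ip as dest_ip OUTPUT abuse_score, last_reported
| where abuse_score >= 80
| table _time, src_ip, dest_ip, abuse_score, last_reported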
Detection Logic That Reflects Current Threat Behavior
Recognizing Hive-Style C2 Communication
As noted in the opening scenario, adapted variants of the CIA's Hive implant framework have reportedly entered criminal markets. Hive-style implants use HTTPS for C2 communication over non-standard ports, present valid TLS certificates, and mimic legitimate web traffic, which makes payload inspection unreliable. Log-based detection therefore focuses on behavioral patterns rather than signatures.
In your proxy and firewall logs, look for the following patterns associated with Hive-style communication and similar implants:
- Periodic outbound HTTPS connections at consistent intervals (typically between 30 and 300 seconds) to the same destination, varying slightly to avoid exact-interval detection
- Low data transfer volumes per session, typically under 10 KB for beacon check-ins, with larger transfers occurring at irregular intervals when tasking is delivered
- Connections to IP addresses with no associated domain name (direct IP HTTPS) or to domains registered within the last 90 days
- TLS connections where the certificate subject does not match the claimed service or uses a free certificate authority with no organizational validation
A detection rule targeting these patterns in Splunk might look like this:
index=proxy_logs (dest_port=443 OR dest_port=8443)
| bin _time span=1h
| stats count, sum(bytes_out) as total_bytes by src_ip, dest_ip, _time
| where count > 10 AND total_bytes < 100000
| join dest_ip [search index=threat_intel category=c2]
| table _time, src_ip, dest_ip, count, total_bytes

This is a simplified example, but the underlying logic is sound: high connection frequency, low data volume, and known malicious infrastructure are individually weak signals that become significant together.
Detecting PhantomRPC-Style Privilege Escalation
The PhantomRPC technique exploits Windows RPC interfaces to achieve privilege escalation. In log terms, this surfaces as unusual RPC calls between processes with mismatched privilege levels and process creation events where a low-privilege parent process spawns a child running under SYSTEM context.
Sysmon Event ID 1 (process creation) combined with Event ID 10 (process access) provides the raw material for detecting this. Specifically, look for:
- A process running under a standard user account accessing the LSASS process (Event ID 10 with TargetImage ending in lsass.exe)
- Child processes of svchost.exe, rpcss.exe, or dllhost.exe that spawn cmd.exe, powershell.exe, or wscript.exe with elevated privileges
- Windows Event ID 4673 (a privileged service was called) fired in close temporal proximity to new process creation events from unexpected parent processes
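A sketch of the first two patterns as a single Splunk search over Sysmon data follows. The index name is an assumption, the SourceUser field on Event ID 10 requires a recent Sysmon version, and because PhantomRPC tooling varies, the image paths and filters are starting points rather than a finished rule:

index=sysmon ((EventCode=10 TargetImage="*\\lsass.exe" NOT SourceUser="NT AUTHORITY\\SYSTEM") OR (EventCode=1 (ParentImage="*\\svchost.exe" OR ParentImage="*\\rpcss.exe" OR ParentImage="*\\dllhost.exe") (Image="*\\cmd.exe" OR Image="*\\powershell.exe" OR Image="*\\wscript.exe") User="NT AUTHORITY\\SYSTEM"))
| table _time, host, EventCode, User, SourceUser, ParentImage, Image, TargetImage, CommandLine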
Correlating Ransomware Precursor Activity
Ransomware attacks continue to increase in frequency and sophistication. The log trail before ransomware deployment is typically three to five days long and consistent across many incident reports. The pattern follows a recognizable sequence: initial access via phishing, credential theft or vulnerability exploitation, persistence establishment, lateral movement, data exfiltration, and finally encryption.
For detection purposes, the most valuable correlation targets are the middle phases. By the time encryption begins, response options are severely limited. The precursor activities that appear in logs include:
- Volume Shadow Copy deletion: Windows Event ID 524 or Sysmon Event ID 1 with command line containing vssadmin delete shadows or wbadmin delete catalog. This occurs in virtually every ransomware deployment.
- Backup software process termination: Sysmon Event ID 1 capturing taskkill /IM or net stop commands targeting Veeam, Acronis, Windows Backup, or similar services.
- Rapid file enumeration: File system activity logs showing a single process accessing hundreds or thousands of files across multiple directories within a short window, particularly if the process has no prior history of such activity.
- Credential harvesting tool execution: Sysmon Event ID 7 (image load) capturing DLLs commonly abused for credential dumping, such as comsvcs.dll loading into an unexpected process, combined with Event ID 10 targeting LSASS.
- Remote execution tool activity: Presence of PsExec, Cobalt Strike artifacts, or RMM tools that were not previously inventoried, appearing in process creation logs across multiple endpoints in a short timeframe.
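A sketch covering the first two precursors in this list, assuming Sysmon process creation events in index=sysmon; the command-line patterns are deliberately loose and will need tuning before production use:

index=sysmon EventCode=1 (CommandLine="*vssadmin*delete*shadows*" OR CommandLine="*wbadmin*delete*catalog*" OR ((CommandLine="*taskkill*" OR CommandLine="*net stop*") (CommandLine="*veeam*" OR CommandLine="*acronis*" OR CommandLine="*backup*")))
| table _time, host, User, ParentImage, CommandLine

Shadow copy deletion in particular is a near-unambiguous signal; many teams treat it as an automatic high-severity alert regardless of other context.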
Hunting Compromised Surveillance Camera Infrastructure
Reports of cybercriminals selling access to compromised surveillance camera networks represent a different category of threat relevant to organizations that operate physical security infrastructure. IP cameras, NVRs, and DVRs frequently run embedded Linux with default credentials and exposed management interfaces. When compromised, they appear in network logs as sources of unusual outbound traffic.
In your network flow data, IoT and OT devices should have highly predictable traffic profiles. A surveillance camera should communicate with a specific NVR or cloud recording service and nothing else. Detection rules that alert when any device in your IP camera subnet initiates connections to external IPs outside an approved list, or begins communicating on unexpected ports, will surface compromised devices quickly. These devices also appear in threat intelligence feeds, and enriching your network logs with abuse reports from sources like AbuseIPDB will help identify cameras that have been flagged as participating in attacks against other organizations.
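As a flow-data sketch of that rule, the search below assumes NetFlow records in index=netflow, a camera subnet of 10.50.0.0/24, and an approved_camera_destinations lookup listing permitted external endpoints; all three are placeholders for your environment:

index=netflow
| where cidrmatch("10.50.0.0/24", src_ip)
| lookup approved_camera_destinations dest_ip OUTPUT approved
| where isnull(approved)
| stats count, sum(bytes) as total_bytes by src_ip, dest_ip, dest_port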
Building Correlation Rules That Catch Chained Attacks
Single-event detection rules generate alert fatigue. Correlation rules that chain multiple weak signals into a high-confidence detection are operationally sustainable. The following correlation approach works across most SIEM platforms.
Define a base event as the first indicator, such as a failed authentication attempt. Assign it a risk score, perhaps 10 out of 100. Define subsequent related events that elevate the risk: a successful authentication from the same source IP within 30 minutes adds 25 points; a new process creation on the authenticated endpoint within 10 minutes of the successful login adds 30 points; an outbound connection from that endpoint to a new external destination within 5 minutes of the process creation adds 35 points. When the cumulative score crosses a threshold, fire an alert.
This risk-based correlation model reduces false positives because any single event stays below the alert threshold, and it captures sophisticated attacks because the chaining logic reflects how real attacks unfold over time. Microsoft Sentinel's entity behavior analytics and Splunk's risk-based alerting framework both support this model natively. It can be implemented manually in Elasticsearch using scripted fields and alerting conditions.
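A compact sketch of the scoring logic in SPL, assuming events already normalized with an event_type field; this version sums scores per source over an hour and ignores the ordering and time-gap constraints described above, which a production rule would enforce with streamstats or a native risk framework:

index=security earliest=-1h
| eval risk_score=case(event_type="failed_auth", 10, event_type="successful_auth", 25, event_type="process_creation", 30, event_type="outbound_connection", 35, true(), 0)
| stats sum(risk_score) as total_risk, values(event_type) as observed_events by src_ip
| where total_risk >= 70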
Operationalizing Detection: From Log to Response
Alert Triage Workflow
When a correlated alert fires, the triage process should follow a consistent structure. Start with the enriched data already attached to the alert: asset owner, IP reputation, geolocation, and any matching threat intelligence indicators. Confirm whether the source and destination assets are expected to communicate. Check the timeline of events to identify the first indicator and estimate how long the activity has been occurring.
If the initial triage suggests active compromise rather than a false positive, escalate immediately rather than completing a full investigation before acting. Isolation of affected endpoints, credential rotation for suspected accounts, and blocking of identified C2 infrastructure should happen in parallel with the investigation, not after it concludes.
Threat Hunting Between Alerts
Reactive detection through alerting catches known patterns. Threat hunting catches attackers who have evaded those patterns. Log data is the primary material for hunting. Effective hunting starts with a hypothesis derived from current threat intelligence, such as the hypothesis that a modified Hive implant is present given recent reporting about its availability in criminal markets.
From that hypothesis, derive the observable artifacts the implant would produce in your log data. Hive uses HTTPS C2, so query your proxy logs for connections matching the behavioral profile described earlier. Hive establishes persistence, so query Sysmon logs for scheduled task creation or service installation events from the relevant timeframe. Work backward from the artifact to the activity, and then forward from the activity to its extent across the environment.
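For the persistence leg of that hunt, a starting-point query might look like the following; the index names are assumptions, and Event ID 7045 (service installation) is recorded in the Windows System log:

(index=sysmon EventCode=1 Image="*\\schtasks.exe" CommandLine="*/create*") OR (index=wineventlog EventCode=7045)
| stats count, values(CommandLine) as command_lines, earliest(_time) as first_seen by host
| convert ctime(first_seen)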
Document hunting results regardless of whether they find active compromise. Hunts that find nothing establish a baseline that makes future hunts more efficient. Hunts that find something provide the starting point for incident response.
Reducing Alert Fatigue Through Tuning
Detection rules require ongoing maintenance. A rule that was effective six months ago may generate hundreds of false positives today because the environment has changed. Schedule regular tuning reviews, at least monthly, where analysts review the most prolific alert sources and assess their true positive rate. Rules with a true positive rate below five percent should be adjusted, suppressed for known-good sources, or replaced with more specific logic.
Maintain a suppression log that records every suppression decision, the rationale, and an expiration date. The financial services firm mentioned at the opening of this article had added an IP to an allow-list with no expiration and no documented rationale. When that infrastructure was later used by an attacker, the suppression prevented detection for three days. Every suppression represents a detection gap and should be treated as one.
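If the suppression log lives in a Splunk lookup, the audit itself can be a short scheduled search. The sketch assumes fields ip, rationale, added_by, and expires (epoch time); it surfaces suppressions that have expired or were added without a documented rationale:

| inputlookup suppression_list
| where expires < now() OR isnull(rationale)
| convert ctime(expires)
| table ip, rationale, added_by, expires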
Log Retention and Legal Considerations
Detection is only possible against logs that exist. Log retention policies must balance storage costs against the realistic timeline of attack detection. Sophisticated attackers frequently maintain persistence for weeks or months before triggering obvious activity. A 90-day retention window for security-relevant logs is a reasonable minimum. Critical authentication, privilege use, and network flow logs warrant 12 months of retention; PCI DSS explicitly requires a year of audit log history, and 12 months is a common expectation in HIPAA and SOC 2 audits as well.
Compressed archival storage in object stores like AWS S3 or Azure Blob Storage is cost-effective for older logs. Ensure archived logs are indexed or at minimum organized in a way that makes them searchable during incident response. Investigators who need to query six-month-old logs during an active incident should not spend hours reconstructing access to the data.
Practical Takeaways
Log analysis for threat detection is a discipline that requires investment in infrastructure, tooling, and analyst skill. The following actions have a measurable impact on detection capability:
- Deploy Sysmon on all Windows endpoints with a community-maintained configuration and ensure those logs flow to your central SIEM within five minutes of generation.
- Enable command-line logging for Windows process creation events via Group Policy.
- Implement risk-based alerting rather than single-event rules to reduce false positives while capturing multi-stage attacks.
- Enrich logs with threat intelligence at ingestion time so analysts have context without performing manual lookups during triage.
- Audit all existing suppression rules quarterly and enforce expiration dates on every new suppression added.
- Hunt proactively against current threat intelligence at least twice per month, even when no alerts have fired.
- Retain security-relevant logs for a minimum of 90 days in a searchable format, with archival storage extending to 12 months.
- Test detection rules against known attack simulations using tools like Atomic Red Team or CALDERA to verify they fire as expected before relying on them in production.
Logs contain the full story of most breaches. The challenge is building the systems, processes, and skills to read that story before it ends badly.