Cloud-Native Threat Detection: Deploying YARA for Scalable Malware Detection in EKS
YARA is a powerful tool used for pattern matching in files, primarily for malware detection and digital forensics. It enables users to create rules that define the characteristics of malicious files, making it a crucial component in cybersecurity. In this post, we’ll explore how YARA works, provide practical examples, and demonstrate how to deploy it as a service in a Kubernetes (K8s) cluster.
Understanding YARA Rules
YARA rules are defined using a simple but flexible syntax. Each rule consists of a name, optional metadata, string patterns, and a condition to match files. Here’s a basic example:
rule ExampleRule {
    meta:
        description = "Detects suspicious files based on string patterns"
        author = "Your Name"
    strings:
        $malicious_string1 = "malware_signature"
        $malicious_string2 = { 6A 40 68 00 30 00 00 }
    condition:
        any of them
}
This rule looks for the presence of the string malware_signature or a specific byte sequence in a file.
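To make the matching semantics concrete, here is a minimal Python sketch of what the condition "any of them" checks for ExampleRule. It is not how YARA is implemented, and the function name matches_example_rule is hypothetical; it only mirrors the rule's logic on raw bytes.

```python
# Illustrative only: mirrors ExampleRule's logic, not YARA's actual engine.
TEXT_PATTERN = b"malware_signature"                  # $malicious_string1
HEX_PATTERN = bytes.fromhex("6A 40 68 00 30 00 00")  # $malicious_string2


def matches_example_rule(data: bytes) -> bool:
    """Return True if either pattern occurs anywhere in data ("any of them")."""
    return TEXT_PATTERN in data or HEX_PATTERN in data


if __name__ == "__main__":
    print(matches_example_rule(b"header malware_signature trailer"))
    print(matches_example_rule(b"\x00\x6a\x40\x68\x00\x30\x00\x00\x00"))
    print(matches_example_rule(b"harmless content"))
```

Either pattern alone is enough for a match, which is why a rule like this is usually tightened with more specific strings or a stricter condition before real use.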
Hackers, penetration testers, and antivirus companies examine quarantined executables and known payloads (for example, payloads that target serializer/deserializer libraries or the OS) to find patterns common to such binaries. A pattern can be, for instance, a sequence of bytes with particular values at a specific position.
Detecting Go and Java Binaries
YARA can also be used to identify binaries compiled in specific programming languages. Here are rules to detect Go and Java binaries:
Detecting Go Binaries
Go binaries often contain specific symbols and structures. The following rule helps identify them:
rule DetectGoBinary {
    meta:
        description = "Detects Go-compiled binaries"
        author = "Your Name"
    strings:
        $go_symbol = "go.buildid"
        $go_runtime = "runtime.main"
    condition:
        any of them
}
Detecting Java Binaries
Java binaries typically contain .class file headers and specific byte sequences. In every Java version, source compiled into a .class bytecode file always starts with the hex value 0xCAFEBABE, known as the magic number.
Here is an interesting fact, and a guess, about why:
The magic number is 3405691582 (0xCAFEBABE). The guess is that 32-bit magic numbers are easier to handle and more likely to be unique, and that the Java team wanted something with the Java-coffee metaphor; since there’s no ‘J’ or ‘V’ in hexadecimal, they settled for something with CAFE in it. They figured “CAFE BABE” was sexier than something like “A FAB CAFE” or “CAFE FACE”, and definitely didn’t like the implications of “CAFE A FAD” (or worse, “A BAD CAFE”).
To detect Java classes, we can write the following rule, which looks for these bytes:
rule DetectJavaBinary {
    meta:
        description = "Detects Java-compiled binaries"
        author = "Your Name"
    strings:
        $java_class_header = { CA FE BA BE }
        $java_string = "java/lang/Object"
    condition:
        any of them
}
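As a complement to the rule, the following Python sketch reads the start of a class file directly. It relies on the class-file layout, in which the 4-byte magic number is followed by two big-endian 16-bit fields, minor_version and major_version. The function name read_class_header is hypothetical, and unlike the rule above it checks the magic number only at offset 0.

```python
import struct

JAVA_MAGIC = 0xCAFEBABE


def read_class_header(data: bytes):
    """Return (minor, major) class-file version, or None if data is not a class file.

    A class file starts with a 4-byte magic number followed by two
    big-endian 16-bit fields: minor_version and major_version.
    """
    if len(data) < 8:
        return None
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != JAVA_MAGIC:
        return None
    return minor, major


if __name__ == "__main__":
    # First 8 bytes of a class file compiled for Java 8 (major version 52 = 0x34).
    sample = bytes.fromhex("CAFEBABE 0000 0034")
    print(read_class_header(sample))
    print(read_class_header(b"%PDF-1.7"))
```

Anchoring the check to the start of the file reduces false positives compared with searching for the four bytes anywhere in the content.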
Detecting PDF Files
In the same way, PDF files carry an identifier in their binary content:
rule pdf_checker
{
    meta:
        author = "N3NU"
        description = "Checks whether or not a file is truly a PDF."
    strings:
        $start = "%PDF"
        $end = "%%EOF"
    condition:
        $start at 0 and $end
}
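The condition in pdf_checker combines a positional test ("$start at 0") with a plain presence test ("$end"). A minimal Python sketch of the same logic, with the hypothetical function name looks_like_pdf, could look like this:

```python
def looks_like_pdf(data: bytes) -> bool:
    """Mirror `$start at 0 and $end`: %PDF at offset 0, %%EOF anywhere."""
    return data.startswith(b"%PDF") and b"%%EOF" in data


if __name__ == "__main__":
    print(looks_like_pdf(b"%PDF-1.7\n...objects...\n%%EOF\n"))
    print(looks_like_pdf(b"junk %PDF-1.7 %%EOF"))  # marker is not at offset 0
```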
These rules help identify file formats and compiled binaries, which is useful when triaging and analyzing samples in malware research.
Open-Source Shared YARA Rules
Once researchers have found the injected portions of malware in a binary, they can identify a unique signature (byte sequences, characters, positions, sizes) and create a shared rule that tells YARA which malware, and which version of it, a file contains.
Acknowledged vulnerabilities are shared with communities and companies so they can update antivirus databases, IDS (intrusion detection systems), and so on.
Anyone can act as such a researcher and identify and report vulnerabilities.
There are multiple open-source shared rule sets, e.g. https://github.com/Yara-Rules/rules, where you can find rules for particular types of vulnerabilities.
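When a shared rule set is spread over many files, one common approach is a single index file that pulls the others in with YARA's include directive. The Python sketch below generates the text of such an index for a local checkout; the function name build_index and the choice of .yar/.yara suffixes are assumptions, and absolute paths are used so the index works regardless of where it is saved.

```python
from pathlib import Path


def build_index(rules_dir: str) -> str:
    """Return the text of an index rule file that includes every .yar/.yara
    file found under rules_dir (searched recursively), in sorted order."""
    root = Path(rules_dir).resolve()
    files = sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in {".yar", ".yara"}
    )
    # YARA's include directive takes a quoted path; emit forward slashes.
    return "".join(f'include "{p.as_posix()}"\n' for p in files)
```

Writing the returned text to a file and passing that file to yara loads all the included rules at once. Rule names must be unique across the included files, so a large community set may need pruning first.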
Installing YARA
To use YARA locally, install it using:
On Linux:
sudo apt update && sudo apt install yara -y
On macOS:
brew install yara
Scanning Files with YARA
Once installed, you can scan files with:
yara example_rule.yar suspicious_file.exe
If a rule matches, YARA prints a line containing the rule name and the path of the scanned file:
$ yara rules/pdf_file.yar /path_to/PowerlineLN.pdf
pdf_checker /path_to/PowerlineLN.pdf
To scan an entire directory:
yara -r rules/example_rule.yar /path/to/directory
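For automation, the same commands can be driven from a script. The following Python sketch builds the command lines shown above and parses YARA's default output, one "rule name, space, path" pair per line. The helper names build_command, parse_matches, and scan are hypothetical, and scan assumes the yara binary is on PATH.

```python
import subprocess


def build_command(rules_file: str, target: str, recursive: bool = False) -> list:
    """Build the yara command line used in the examples above."""
    cmd = ["yara"]
    if recursive:
        cmd.append("-r")  # recurse into sub-directories of target
    cmd += [rules_file, target]
    return cmd


def parse_matches(output: str) -> list:
    """Turn yara's default output ("<rule name> <path>" per line) into tuples."""
    matches = []
    for line in output.splitlines():
        if not line.strip():
            continue
        rule, _, path = line.partition(" ")
        matches.append((rule, path))
    return matches


def scan(rules_file: str, target: str, recursive: bool = False) -> list:
    """Run yara (must be on PATH) and return a list of (rule, path) matches."""
    result = subprocess.run(
        build_command(rules_file, target, recursive),
        capture_output=True, text=True, check=True,
    )
    return parse_matches(result.stdout)
```

An empty list means no rule matched; a non-zero exit status from yara (for example, a syntax error in the rules) surfaces as subprocess.CalledProcessError.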
Deploying YARA as a Kubernetes Service in EKS
To make YARA accessible as a scalable service, we can deploy it in a Kubernetes cluster using a containerized YARA scanner.
Step 1: Create a Docker Image for YARA
Create a Dockerfile with the following contents:
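FROM ubuntu:latest
# Install the YARA CLI (apt-get is the scripting-friendly front end)
RUN apt-get update && apt-get install -y yara && rm -rf /var/lib/apt/lists/*
# Bundle the rule files into the image
COPY rules/ /rules/
# yara expects a rules file (not a directory) followed by the target to scan;
# the target path is supplied as the container argument
ENTRYPOINT ["yara", "-r", "/rules/example_rule.yar"]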
Build and push the image to a registry:
docker build -t your-dockerhub-username/yara-scanner .
docker push your-dockerhub-username/yara-scanner
Step 2: Create a Kubernetes Deployment
Create a yara-deployment.yaml file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yara-scanner
spec:
  replicas: 2
  selector:
    matchLabels:
      app: yara
  template:
    metadata:
      labels:
        app: yara
    spec:
      containers:
        - name: yara-scanner
          image: your-dockerhub-username/yara-scanner:latest
          ports:
            - containerPort: 80
Deploy it using:
kubectl apply -f yara-deployment.yaml
Step 3: Expose YARA as a Service
Create a yara-service.yaml file:
apiVersion: v1
kind: Service
metadata:
  name: yara-service
spec:
  selector:
    app: yara
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: LoadBalancer
Apply the service configuration:
kubectl apply -f yara-service.yaml
Step 4: Access the Service
Find the external address of the YARA service (on EKS this is typically a load balancer hostname shown in the EXTERNAL-IP column) using:
kubectl get services yara-service
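You can now send files for scanning through a simple API, provided you have extended the container with a REST service that listens on port 80; the image built in Step 1 only wraps the yara CLI and does not serve HTTP by itself. Alternatively, use kubectl exec to scan files directly inside a running pod.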
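As a rough idea of what such a REST extension could look like, here is a minimal sketch using only Python's standard library. Everything specific in it is an assumption rather than part of the setup above: the /scan endpoint, the RULES_FILE path, and the names scan_bytes, ScanHandler, and serve are hypothetical, and the image would additionally need Python installed and this script as its entrypoint.

```python
import json
import subprocess
import sys
import tempfile
from http.server import BaseHTTPRequestHandler, HTTPServer

RULES_FILE = "/rules/example_rule.yar"  # assumption: a rules file baked into the image


def scan_bytes(payload: bytes, rules_file: str = RULES_FILE) -> list:
    """Write payload to a temporary file, scan it with the yara CLI,
    and return the names of the rules that matched."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(payload)
        tmp.flush()
        result = subprocess.run(
            ["yara", rules_file, tmp.name],
            capture_output=True, text=True, check=True,
        )
    # Each output line is "<rule name> <path>"; keep the rule name only.
    return [line.split(" ", 1)[0] for line in result.stdout.splitlines() if line]


class ScanHandler(BaseHTTPRequestHandler):
    scanner = staticmethod(scan_bytes)  # replaceable, e.g. in tests

    def do_POST(self):
        if self.path != "/scan":  # hypothetical endpoint
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = self.rfile.read(length)
        try:
            body, status = {"matches": self.scanner(payload)}, 200
        except (OSError, subprocess.CalledProcessError) as exc:
            body, status = {"error": str(exc)}, 500
        data = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 80) -> None:
    """Serve forever; port 80 matches containerPort/targetPort in the manifests."""
    HTTPServer(("0.0.0.0", port), ScanHandler).serve_forever()


if __name__ == "__main__" and sys.argv[1:] == ["serve"]:
    serve()
```

Started with the argument serve, the script accepts a raw file body via POST on /scan and answers with a JSON list of matching rule names. A production service would also need request size limits, authentication, and timeouts, which this sketch omits.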
C++ and Go Interop
If C++ or Go libraries are used to interact with YARA, the YARA development package (yara-dev, or libyara-dev on Debian/Ubuntu) must also be installed, and Go applications must be built with CGO_ENABLED=1.
Conclusion
By deploying YARA in Kubernetes, you enable scalable malware detection across multiple nodes. This setup can be extended with additional automation, such as integrating with CI/CD pipelines or security monitoring tools.
The same approach can also be adapted to scan not only files but any payload submitted by a user, whether it is form-based text or an uploaded file (images, binaries, XML, text, etc.).
Would you like to see an API wrapper for YARA in Kubernetes? Let me know in the comments! 🚀