diff --git a/source/_posts/splunk-lookup-malware-filter.md b/source/_posts/splunk-lookup-malware-filter.md
new file mode 100644
index 0000000..6dddabc
--- /dev/null
+++ b/source/_posts/splunk-lookup-malware-filter.md
@@ -0,0 +1,480 @@
+---
+title: Malicious website detection on Splunk using malware-filter
+excerpt: A guide on using malware-filter lookups
+date: 2023-04-16
+tags:
+  - splunk
+---
+
+[Splunk Add-on for malware-filter](https://gitlab.com/malware-filter/splunk-malware-filter) includes the following CSV files:
+
+- botnet-filter-splunk.csv
+- botnet_ip.csv
+- opendbl_ip.csv
+- phishing-filter-splunk.csv
+- pup-filter-splunk.csv
+- urlhaus-filter-splunk-online.csv
+- vn-badsite-filter-splunk.csv
+
+These CSV files can be used as [lookups](https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Aboutlookupsandfieldactions) to find potentially malicious traffic. They contain lists of bad IPs/domains/URLs, and we are going to look for those values in the [events](https://docs.splunk.com/Splexicon:Event).
+
+We can view the content of a lookup file by using [`inputlookup`](https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Inputlookup). When it is the first command in a search, it must be preceded by a pipe character "|" because it is an [event-generating](https://docs.splunk.com/Splexicon:Generatingcommand) command.
+
+## Lookup file locations
+
+A lookup file can be uploaded via [Splunk Web](https://docs.splunk.com/Documentation/Splunk/latest/Knowledge/Usefieldlookupstoaddinformationtoyourevents#Upload_the_lookup_table_file) or created directly in one of the following locations:
+
+- `$SPLUNK_HOME/etc/users///lookups/`
+- `$SPLUNK_HOME/etc/apps//lookups/`
+- `$SPLUNK_HOME/etc/system/lookups/`
+
+In Splunk Web, setting the permission to app-sharing or global-sharing automatically moves the file to the second or third location respectively. An uploaded lookup file can be used straight away, without reloading the app or restarting Splunk, regardless of how it was created.
+
+## inputlookup basics
+
+```spl
+| inputlookup botnet_ip.csv
+```
+
+> `_time` field is omitted for brevity.
+
+| first_seen_utc      | dst_ip  | dst_port | c2_status | last_online | malware | updated              |
+| ------------------- | ------- | -------- | --------- | ----------- | ------- | -------------------- |
+| 2021-05-16 19:49:33 | 1.2.3.4 | 1234     | online    | 2023-03-05  | Lorem   | 2023-03-04T16:41:17Z |
+
+The output is no different from any other event: we can specify which fields to display and then rename them.
+
+```spl
+| inputlookup botnet_ip.csv | fields dst_ip | rename dst_ip AS dst
+```
+
+| dst     |
+| ------- |
+| 1.2.3.4 |
+
+## Search for specific events
+
+Example firewall events:
+
+```spl
+index=firewall
+```
+
+| src         | src_port | dst     | action  |
+| ----------- | -------- | ------- | ------- |
+| 192.168.1.5 | 45454    | 1.2.3.4 | allowed |
+| 192.168.1.3 | 45452    | 7.6.5.4 | allowed |
+| 192.168.1.4 | 45457    | 4.3.2.1 | allowed |
+| 192.168.1.6 | 45451    | 7.7.5.5 | allowed |
+
+Notice that the first row's `dst` value (1.2.3.4) matches the `dst_ip` value of the example lookup table shown in the [previous section](#inputlookup-basics).
+
+To match the `dst` value of the firewall events against the `dst_ip` value of the lookup file, use a [subsearch](https://docs.splunk.com/Documentation/SplunkCloud/latest/SearchTutorial/Useasubsearch) with `inputlookup`. In this example, the subsearch extracts only the `dst_ip` field and renames it to `dst` in order to match the same field in the firewall events.
+
+```spl
+index=firewall [| inputlookup botnet_ip.csv | fields dst_ip | rename dst_ip AS dst]
+```
+
+| src         | src_port | dst     | action  |
+| ----------- | -------- | ------- | ------- |
+| 192.168.1.5 | 45454    | 1.2.3.4 | allowed |
+
+To display events in table format, append `| table *`.
+
+## Wildcard
+
+An asterisk character (`*`) in the lookup file works as a [wildcard](https://docs.splunk.com/Documentation/SCS/current/Search/Wildcards).
+
+```spl
+index=proxy
+```
+
+| src         | url           | dst_port |
+| ----------- | ------------- | -------- |
+| 192.168.1.5 | foo.com/path1 | 443      |
+| 192.168.1.3 | foo.com/path2 | 443      |
+| 192.168.1.4 | bar.com/path3 | 443      |
+
+The lookup files do not include a wildcard affix.
+
+```spl
+| inputlookup urlhaus-filter-splunk-online.csv
+```
+
+| host    | path | message                                   | updated              |
+| ------- | ---- | ----------------------------------------- | -------------------- |
+| foo.com |      | urlhaus-filter malicious website detected | 2023-03-13T00:11:20Z |
+
+The add-on includes the [`geturlhausfilter`](https://gitlab.com/malware-filter/splunk-malware-filter#geturlhausfilter) command, along with other commands, to update their respective lookup files. Those commands have a `wildcard_suffix` argument that appends a wildcard to the values of the given field.
+
+```spl
+| geturlhausfilter wildcard_suffix=host
+| outputlookup override_if_empty=false urlhaus-filter-splunk-online.csv
+```
+
+| host    | path | message                                   | updated              | host_wildcard_suffix |
+| ------- | ---- | ----------------------------------------- | -------------------- | -------------------- |
+| foo.com |      | urlhaus-filter malicious website detected | 2023-03-13T00:11:20Z | foo.com\*            |
+
+```spl
+index=proxy [| inputlookup urlhaus-filter-splunk-online.csv | fields host_wildcard_suffix | rename host_wildcard_suffix AS url ]
+```
+
+| src         | url           | dst_port |
+| ----------- | ------------- | -------- |
+| 192.168.1.5 | foo.com/path1 | 443      |
+| 192.168.1.3 | foo.com/path2 | 443      |
+
+### Wildcard prefix
+
+The previous section showed an example using a wildcard suffix ("foo.com\*"). A wildcard also works as a prefix ("\*foo.com") or even in the middle ("f\*o.com"), though these are [discouraged](https://docs.splunk.com/Documentation/SCS/current/Search/Wildcards#When_to_avoid_wildcard_characters).
+
+```spl
+index=proxy
+```
+
+| src         | domain        | dst_port |
+| ----------- | ------------- | -------- |
+| 192.168.1.5 | foo.com       | 443      |
+| 192.168.1.3 | lorem.foo.com | 443      |
+| 192.168.1.4 | bar.com       | 443      |
+
+```spl
+| geturlhausfilter wildcard_prefix=host
+| outputlookup override_if_empty=false urlhaus-filter-splunk-online.csv
+```
+
+| host    | path | message                                   | updated              | host_wildcard_prefix |
+| ------- | ---- | ----------------------------------------- | -------------------- | -------------------- |
+| foo.com |      | urlhaus-filter malicious website detected | 2023-03-13T00:11:20Z | \*foo.com            |
+
+```spl
+index=proxy [| inputlookup urlhaus-filter-splunk-online.csv | fields host_wildcard_prefix | rename host_wildcard_prefix AS domain ]
+```
+
+| src         | domain        | dst_port |
+| ----------- | ------------- | -------- |
+| 192.168.1.5 | foo.com       | 443      |
+| 192.168.1.3 | lorem.foo.com | 443      |
+
+## Matching multiple fields
+
+File hosting services like Google Docs and Dropbox are commonly abused to host phishing websites. For those sites, the lookup should match both domain and path. When more than one field is specified in the `fields` command, all fields are matched using an AND condition.
+
+```spl
+index=proxy
+```
+
+| src         | domain  | path           |
+| ----------- | ------- | -------------- |
+| 192.168.1.5 | foo.com | document1.html |
+| 192.168.1.3 | foo.com | document2.html |
+| 192.168.1.4 | foo.com | document3.html |
+
+```spl
+| inputlookup urlhaus-filter-splunk-online.csv
+```
+
+| host    | path           | message                                   | updated              |
+| ------- | -------------- | ----------------------------------------- | -------------------- |
+| foo.com | document1.html | urlhaus-filter malicious website detected | 2023-03-13T00:11:20Z |
+
+```spl
+index=proxy [| inputlookup urlhaus-filter-splunk-online.csv | fields host, path | rename host AS domain ]
+```
+
+| src         | domain  | path           |
+| ----------- | ------- | -------------- |
+| 192.168.1.5 | foo.com | document1.html |
+
+### Matching individual and multiple fields
+
+A lookup file may have rows with an empty `path` to denote that a `domain` should be blocked regardless of path, while also having rows with both `domain` and `path` to denote that only a specific URL should be blocked. The syntax is the same as what was shown in the [previous section](#matching-multiple-fields) because Splunk only matches **non-empty** values; empty values are ignored.
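+
+A quick way to see this behaviour is to run the subsearch on its own and append `format`, which renders the rows as the expression that would be handed to the outer search. The following is only a sketch: it assumes the lookup file has the `host` and `path` columns shown in this post, and the exact output may vary between Splunk versions.
+
+```spl
+| inputlookup urlhaus-filter-splunk-online.csv
+| fields host, path
+| rename host AS domain
+| format
+```
+
+The resulting `search` field should join rows with OR and the fields within a row with AND, and a row with an empty `path` is expected to contribute only its `domain` term.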
+
+```spl
+index=proxy
+```
+
+| src         | domain          | path             |
+| ----------- | --------------- | ---------------- |
+| 192.168.1.5 | bad-domain.com  | lorem-ipsum.html |
+| 192.168.1.3 | bad-domain.com  | foo-bar.html     |
+| 192.168.1.4 | docs.google.com | malware.exe      |
+| 192.168.1.4 | docs.google.com | safe.doc         |
+
+```spl
+| inputlookup urlhaus-filter-splunk-online.csv
+```
+
+| host            | path        | message                                   | updated              |
+| --------------- | ----------- | ----------------------------------------- | -------------------- |
+| bad-domain.com  |             | urlhaus-filter malicious website detected | 2023-03-13T00:11:20Z |
+| docs.google.com | malware.exe | urlhaus-filter malicious website detected | 2023-03-13T00:11:20Z |
+
+```spl
+index=proxy [| inputlookup urlhaus-filter-splunk-online.csv | fields host, path | rename host AS domain ]
+```
+
+| src         | domain          | path             |
+| ----------- | --------------- | ---------------- |
+| 192.168.1.5 | bad-domain.com  | lorem-ipsum.html |
+| 192.168.1.3 | bad-domain.com  | foo-bar.html     |
+| 192.168.1.4 | docs.google.com | malware.exe      |
+
+## Case-insensitive
+
+Matching with an `inputlookup` subsearch is case-insensitive. If case-sensitive matching is required, use `lookup` with a lookup definition.
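+
+The case-insensitivity comes from the `search` command, which is what the subsearch results are ultimately matched with. A minimal sketch using `makeresults` to create a single made-up event, so it does not depend on any index or lookup file:
+
+```spl
+| makeresults
+| eval domain="loremipsum.com"
+| search domain="lOrEmIpSuM.com"
+```
+
+This should still return the event, whereas replacing the last line with `| where domain="lOrEmIpSuM.com"` should return nothing because `where` comparisons are case-sensitive.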
+
+```spl
+index=proxy
+```
+
+| src         | domain         |
+| ----------- | -------------- |
+| 192.168.1.5 | loremipsum.com |
+
+```spl
+| inputlookup urlhaus-filter-splunk-online.csv
+```
+
+| host            | path        | message                                   | updated              |
+| --------------- | ----------- | ----------------------------------------- | -------------------- |
+| lOrEmIpSuM.com  |             | urlhaus-filter malicious website detected | 2023-03-13T00:11:20Z |
+| docs.google.com | malware.exe | urlhaus-filter malicious website detected | 2023-03-13T00:11:20Z |
+
+```spl
+index=proxy [| inputlookup urlhaus-filter-splunk-online.csv | fields host, path | rename host AS domain ]
+```
+
+| src         | domain         |
+| ----------- | -------------- |
+| 192.168.1.5 | loremipsum.com |
+
+## CIDR matching
+
+Splunk automatically detects CIDR-like values in a lookup file and performs CIDR matching accordingly. However, this behaviour is on a best-effort basis and may not work as intended. To explicitly use lookup fields for CIDR matching, use `lookup` with a lookup definition.
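+
+For a quick check of whether an address falls within a range, independent of any lookup file, the `cidrmatch` eval function can be used. A minimal sketch with a made-up event created by `makeresults`:
+
+```spl
+| makeresults
+| eval dst="89.248.163.100"
+| where cidrmatch("89.248.163.0/24", dst)
+```
+
+The event is kept because 89.248.163.100 is within 89.248.163.0/24; an address outside the range, such as 7.6.5.4, would be filtered out.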
+
+```spl
+index=firewall
+```
+
+| src         | src_port | dst             | action  |
+| ----------- | -------- | --------------- | ------- |
+| 192.168.1.5 | 45454    | 187.190.252.167 | allowed |
+| 192.168.1.3 | 45452    | 7.6.5.4         | allowed |
+| 192.168.1.4 | 45457    | 4.3.2.1         | allowed |
+| 192.168.1.6 | 45451    | 89.248.163.100  | allowed |
+
+```spl
+| inputlookup opendbl_ip.csv
+```
+
+| start           | end             | netmask | cidr_range         | name                                      | updated              |
+| --------------- | --------------- | ------- | ------------------ | ----------------------------------------- | -------------------- |
+| 187.190.252.167 | 187.190.252.167 | 32      | 187.190.252.167/32 | Emerging Threats: Known Compromised Hosts | 2023-01-30T08:03:00Z |
+| 89.248.163.0    | 89.248.163.255  | 24      | 89.248.163.0/24    | Dshield                                   | 2023-01-30T08:01:00Z |
+
+```spl
+index=firewall [| inputlookup opendbl_ip.csv | fields cidr_range | rename cidr_range AS dst ]
+```
+
+| src         | src_port | dst             | action  |
+| ----------- | -------- | --------------- | ------- |
+| 192.168.1.5 | 45454    | 187.190.252.167 | allowed |
+| 192.168.1.6 | 45451    | 89.248.163.100  | allowed |
+
+## inputlookup + lookup
+
+When used as a subsearch, `inputlookup` filters the event data and only outputs events whose values match the specified field(s). `lookup` enriches the event data by appending new fields to the events with matching field values. Another way to understand the difference is that `inputlookup` performs an inner join while `lookup` performs a left outer join, where the event data is the left table and the lookup file is the right table.
+
+Despite their differences, it can be useful to use both at the same time to enrich filtered event data, even when using the same lookup file.
+
+```spl
+| inputlookup botnet_ip.csv
+```
+
+> `_time` field is omitted for brevity.
+
+| first_seen_utc      | dst_ip  | dst_port | c2_status | last_online | malware | updated              |
+| ------------------- | ------- | -------- | --------- | ----------- | ------- | -------------------- |
+| 2021-05-16 19:49:33 | 1.2.3.4 | 1234     | online    | 2023-03-05  | Lorem   | 2023-03-04T16:41:17Z |
+| 2021-05-16 19:49:33 | 4.3.2.1 | 1234     | online    | 2023-03-05  | Ipsum   | 2023-03-04T16:41:17Z |
+
+```spl
+index=firewall
+```
+
+| src         | src_port | dst     | action  |
+| ----------- | -------- | ------- | ------- |
+| 192.168.1.5 | 45454    | 1.2.3.4 | allowed |
+| 192.168.1.3 | 45452    | 7.6.5.4 | allowed |
+| 192.168.1.4 | 45457    | 4.3.2.1 | allowed |
+| 192.168.1.6 | 45451    | 7.7.5.5 | allowed |
+
+```spl
+index=firewall [| inputlookup botnet_ip.csv | fields dst_ip | rename dst_ip AS dst]
+```
+
+| src         | src_port | dst     | action  |
+| ----------- | -------- | ------- | ------- |
+| 192.168.1.5 | 45454    | 1.2.3.4 | allowed |
+| 192.168.1.4 | 45457    | 4.3.2.1 | allowed |
+
+```spl
+index=firewall [| inputlookup botnet_ip.csv | fields dst_ip | rename dst_ip AS dst]
+| lookup botnet_ip.csv dst_ip AS dst OUTPUT c2_status, malware
+```
+
+| src         | src_port | dst     | action  | c2_status | malware |
+| ----------- | -------- | ------- | ------- | --------- | ------- |
+| 192.168.1.5 | 45454    | 1.2.3.4 | allowed | online    | Lorem   |
+| 192.168.1.4 | 45457    | 4.3.2.1 | allowed | online    | Ipsum   |
+
+It is also possible to rename lookup destination fields.
+
+```spl
+index=firewall [| inputlookup botnet_ip.csv | fields dst_ip | rename dst_ip AS dst]
+| lookup botnet_ip.csv dst_ip AS dst OUTPUT c2_status AS "C2 Server Status", malware AS "Malware Family"
+```
+
+| src         | src_port | dst     | action  | C2 Server Status | Malware Family |
+| ----------- | -------- | ------- | ------- | ---------------- | -------------- |
+| 192.168.1.5 | 45454    | 1.2.3.4 | allowed | online           | Lorem          |
+| 192.168.1.4 | 45457    | 4.3.2.1 | allowed | online           | Ipsum          |
+
+## Lookup definition
+
+A lookup definition provides matching rules for a lookup file. It can be configured for case-sensitivity, wildcard, CIDR matching and others through [transforms.conf](https://docs.splunk.com/Documentation/Splunk/latest/Admin/Transformsconf). It can also be configured via Splunk Web: Settings -> Lookups -> Lookup definitions.
+
+A bare-minimum lookup definition looks like this:
+
+```conf transforms.conf
+[lookup-definition-name]
+filename = lookup-filename.csv
+```
+
+transforms.conf can be saved in the following directories in [order of priority](https://docs.splunk.com/Documentation/Splunk/latest/Admin/Wheretofindtheconfigurationfiles) (highest to lowest):
+
+- `$SPLUNK_HOME/etc/users///local/`
+- `$SPLUNK_HOME/etc/apps//local/`
+- `$SPLUNK_HOME/etc/system/local/`
+
+My naming convention for a lookup definition is simply to remove the `.csv` extension, e.g. "example.csv" (lookup file), "example" (lookup definition). While it is possible to name a lookup definition with a file extension ("example.csv"), I discourage it to avoid confusion.
+
+It is important to note that a lookup definition's matching rules only apply to the `lookup` search command and do _not_ apply to `inputlookup`. Although `inputlookup` accepts a lookup definition as a lookup table (in addition to a lookup file), the definition's matching rules are ignored.
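+
+To illustrate, the two searches below are a sketch that assumes a hypothetical lookup definition named `example` (backed by "example.csv" with `host` and `message` columns) and proxy events with a `domain` field. The first search goes through `lookup`, so the matching rules of the definition apply:
+
+```spl
+index=proxy
+| lookup example host AS domain OUTPUT message
+```
+
+The second search reads the same definition through `inputlookup`, so only the rows are used and the matching rules are ignored:
+
+```spl
+index=proxy [| inputlookup example | fields host | rename host AS domain ]
+```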
+
+### Case-sensitive
+
+```conf transforms.conf
+[urlhaus-filter-splunk-online]
+filename = urlhaus-filter-splunk-online.csv
+# applies to all fields
+case_sensitive_match = 1
+```
+
+```spl
+index=proxy
+```
+
+| src         | domain         | path             |
+| ----------- | -------------- | ---------------- |
+| 192.168.1.5 | bad-domain.com | lorem-ipsum.html |
+| 192.168.1.3 | bad-domain.com | lOrEm-iPsUm.hTmL |
+
+```spl
+| inputlookup urlhaus-filter-splunk-online
+```
+
+| host           | path             | message                                   | updated              |
+| -------------- | ---------------- | ----------------------------------------- | -------------------- |
+| bad-domain.com | lorem-ipsum.html | urlhaus-filter malicious website detected | 2023-03-13T00:11:20Z |
+
+```spl
+index=proxy
+| lookup urlhaus-filter-splunk-online host AS domain, path OUTPUT message
+```
+
+| src         | domain         | path             | message                                   |
+| ----------- | -------------- | ---------------- | ----------------------------------------- |
+| 192.168.1.5 | bad-domain.com | lorem-ipsum.html | urlhaus-filter malicious website detected |
+| 192.168.1.3 | bad-domain.com | lOrEm-iPsUm.hTmL |                                           |
+
+### Wildcard (lookup)
+
+```conf transforms.conf
+[urlhaus-filter-splunk-online]
+filename = urlhaus-filter-splunk-online.csv
+match_type = WILDCARD(host_wildcard_suffix)
+```
+
+```spl
+index=proxy
+```
+
+| src         | url           | dst_port |
+| ----------- | ------------- | -------- |
+| 192.168.1.5 | foo.com/path1 | 443      |
+| 192.168.1.3 | foo.com/path2 | 443      |
+| 192.168.1.4 | bar.com/path3 | 443      |
+
+The lookup files do not include a wildcard affix.
+
+```spl
+| inputlookup urlhaus-filter-splunk-online
+```
+
+| host    | path | message                                   | updated              | host_wildcard_suffix |
+| ------- | ---- | ----------------------------------------- | -------------------- | -------------------- |
+| foo.com |      | urlhaus-filter malicious website detected | 2023-03-13T00:11:20Z | foo.com\*            |
+
+```spl
+index=proxy
+| lookup urlhaus-filter-splunk-online host_wildcard_suffix AS url OUTPUT message
+```
+
+| src         | url           | dst_port | message                                   |
+| ----------- | ------------- | -------- | ----------------------------------------- |
+| 192.168.1.5 | foo.com/path1 | 443      | urlhaus-filter malicious website detected |
+| 192.168.1.3 | foo.com/path2 | 443      | urlhaus-filter malicious website detected |
+
+### CIDR-matching (lookup)
+
+```conf transforms.conf
+[opendbl_ip]
+filename = opendbl_ip.csv
+match_type = CIDR(cidr_range)
+```
+
+```spl
+index=firewall
+```
+
+| src         | src_port | dst             | action  |
+| ----------- | -------- | --------------- | ------- |
+| 192.168.1.5 | 45454    | 187.190.252.167 | allowed |
+| 192.168.1.3 | 45452    | 7.6.5.4         | allowed |
+| 192.168.1.4 | 45457    | 4.3.2.1         | allowed |
+| 192.168.1.6 | 45451    | 89.248.163.100  | allowed |
+
+```spl
+| inputlookup opendbl_ip
+```
+
+| start           | end             | netmask | cidr_range         | name                                      | updated              |
+| --------------- | --------------- | ------- | ------------------ | ----------------------------------------- | -------------------- |
+| 187.190.252.167 | 187.190.252.167 | 32      | 187.190.252.167/32 | Emerging Threats: Known Compromised Hosts | 2023-01-30T08:03:00Z |
+| 89.248.163.0    | 89.248.163.255  | 24      | 89.248.163.0/24    | Dshield                                   | 2023-01-30T08:01:00Z |
+
+```spl
+index=firewall
+| lookup opendbl_ip cidr_range AS dst OUTPUT name AS threat
+```
+
+| src         | src_port | dst             | action  | threat                                    |
+| ----------- | -------- | --------------- | ------- | ----------------------------------------- |
+| 192.168.1.5 | 45454    | 187.190.252.167 | allowed | Emerging Threats: Known Compromised Hosts |
+| 192.168.1.3 | 45452    | 7.6.5.4         | allowed |                                           |
+| 192.168.1.4 | 45457    | 4.3.2.1         | allowed |                                           |
+| 192.168.1.6 | 45451    | 89.248.163.100  | allowed | Dshield                                   |
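+
+Unlike the `inputlookup` subsearch, `lookup` keeps events that have no match and leaves `threat` empty. To keep only the matched events, a filter can be appended. A minimal sketch assuming the `opendbl_ip` lookup definition above:
+
+```spl
+index=firewall
+| lookup opendbl_ip cidr_range AS dst OUTPUT name AS threat
+| where isnotnull(threat)
+```
+
+With the example events above, only the rows for 187.190.252.167 and 89.248.163.100 would remain.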