Commit Graph

97 Commits

Author SHA1 Message Date
MDLeom 693d996267
fix: pipe extracted stdout to parsing 2025-03-19 11:13:16 +00:00
MDLeom 89e8f56702
style: remove unnecessary global flag in sed
not required when matching once per line
2025-03-19 11:10:11 +00:00
MDLeom f4377f1fe6
fix: add address separator to adblock filters
handle url that ends with $
$ is not percent-encoded by browsers
2025-03-19 10:42:23 +00:00
MDLeom 9d4668bcbd
fix: match top domains to input hostname
instead of url.
to minimise entries such as "bad.com/interactivelogin?continue=https://accounts.google.com"
however, subdomains of top domains will no longer match
2025-03-18 10:32:50 +00:00
MDLeom 58a15ee1df
feat: more robust url parsing
better handling of edge cases
add IPv6 support
increase nodejs requirement to v20 due to URL.canParse()
  https://developer.mozilla.org/en-US/docs/Web/API/URL/canParse_static#browser_compatibility
2025-03-18 10:32:00 +00:00
MDLeom 45783a46b3
perf: rewrite IDS rule creation in javascript
"while do" can be inefficient
what previously took >5 minutes now takes less than 1 second
2025-03-17 11:51:53 +00:00
MDLeom ec9288267c
fix: match safelink domains
avoid matching path
2025-03-17 10:37:53 +00:00
MDLeom ab5dca49b4
refactor: handle url-without-path & safelinks without "while read"
"while read" can be inefficient
2025-03-16 12:37:56 +00:00
MDLeom 6e359f9a79
fix: remove trailing slash from domain
to replace previous workaround 0578e6c16a
2025-03-16 10:05:02 +00:00
MDLeom 993bb958f5
fix: skip phishtank if download fails 2025-03-16 07:37:17 +00:00
MDLeom 56d67d2a41
Revert "feat: remove phishtank source"
This reverts commit b3f6e90b9a.
https://gitlab.com/malware-filter/phishing-filter/-/issues/40#note_1849507513
2025-03-16 06:56:41 +00:00
MDLeom be1b6c05d7
fix: remove credential from domain/IP
fixes #91
2025-03-11 07:23:02 +00:00
MDLeom e1b051b2fc
fix: remove response header
showing it hides the download progress/size
2025-03-08 00:07:04 +00:00
MDLeom a500fca678
fix: use redirected tranco link 2025-03-07 23:55:14 +00:00
MDLeom c5fd7f7d34
fix: output response header to stdout
https://codeahoy.com/general/curl-display-request-response-headers
2025-03-07 23:53:52 +00:00
MDLeom b94d832896
fix: skip tranco if download fails 2025-03-07 23:42:22 +00:00
MDLeom 7e8139510d
style(rpz): generic syntax 2025-02-16 00:44:21 +00:00
MDLeom 3529e93ba3
feat: wildcard asterisk 2025-02-16 00:23:22 +00:00
MDLeom 8506f18029
chore: remove unused oisd exclusion 2025-02-15 01:12:03 +00:00
MDLeom 7f90191c49
feat: add ipthreat.net source 2025-02-08 06:24:03 +00:00
MDLeom 8702981a79
fix: unzip alternatives 2024-07-15 09:43:11 +00:00
MDLeom f07ad2ce4e
refactor: set pipefail conditionally 2024-07-15 08:02:25 +00:00
MDLeom 827342f3e9
fix: expand alias in bash 2024-06-03 08:21:56 +00:00
MDLeom 358003b782
fix: subdomains may be completely excluded 2024-05-03 11:16:01 +00:00
MDLeom 2ee0b2d661
feat(source): disable mitchellkrogza/Phishing.Database
source does not offer online-only links
closes #86
2024-05-02 12:00:37 +00:00
MDLeom 607208c171
fix: check file exists and not zero size 2024-03-10 07:49:19 +00:00
MDLeom a1548a5e1c
fix: may not necessarily contain ipv4 entries 2024-03-10 03:06:51 +00:00
MDLeom 5c7b1f4645
feat(source): add mitchellkrogza/Phishing.Database
ref #40
revert e68268f506
2024-03-09 04:06:37 +00:00
MDLeom 1b2312f492
fix: "phishing-subdomains.txt" may be empty 2024-03-08 07:54:33 +00:00
MDLeom 93b85b00f9
chore: remove remaining phishunt
no longer used since #43 #45
2024-03-07 10:14:08 +00:00
MDLeom b3f6e90b9a
feat: remove phishtank source
frequent interference from cloudflare captcha
2024-03-07 10:09:32 +00:00
MDLeom 07ca1adfd1
refactor: lazy load os-release 2023-05-20 11:23:07 +00:00
MDLeom 667fad0b6f
style: remove debug message 2023-05-20 11:15:29 +00:00
MDLeom 13289d3365
fix: dash does not support pipefail 2023-05-20 10:38:47 +00:00
MDLeom eac902123e
fix: check installed grep is GNU variant 2023-05-20 09:51:12 +00:00
MDLeom eebf51ac47
fix: check existence of busybox
if dos2unix is not installed
2023-05-20 09:44:54 +00:00
MDLeom ca23363ef4
fix: reprocess decoded safelink
- extend 1ea3ce51f5
- also include scope of 0578e6c16a
2023-05-20 08:20:22 +00:00
MDLeom 0578e6c16a
fix: handle URL of top domains without path
- ref #62, #43, #44
- 745c81b134, c623542b9a, 8923941376
were not effective previously
2023-05-19 10:34:04 +00:00
MDLeom 7dbdc85163
fix: sed syntax to recognise newline
https://gitlab.com/malware-filter/urlhaus-filter/-/issues/79
2023-04-29 04:11:14 +00:00
MDLeom 8aa4d2334c
fix: cloudflare radar dataset is now in csv format
instead of zip
2023-01-16 07:09:35 +00:00
MDLeom b5048417b0
style(sed): avoid backslash in insert option
- simpler and more readable
- https://unix.stackexchange.com/a/99351
2022-12-17 00:19:11 +00:00
MDLeom 97cec9d0e8
feat: add csv file for Splunk lookup
- https://docs.splunk.com/Documentation/Splunk/9.0.2/Knowledge/Aboutlookupsandfieldactions
2022-12-17 00:06:59 +00:00
MDLeom 53c62b74c3
docs(header): switch date format from RFC 5322 to ISO 8601
- universally readable
2022-12-16 08:18:00 +00:00
MDLeom 1ea3ce51f5
feat: decode O365 safelink
- https://support.microsoft.com/en-us/office/advanced-outlook-com-security-for-microsoft-365-subscribers-882d2243-eab9-4545-a58a-b36fee4a46e2
2022-12-04 03:53:09 +00:00
MDLeom 5a4a8bb9bc
refactor: xmlstarlet -> html-xml-utils 2022-12-01 10:00:32 +00:00
MDLeom e653ba90c6
fix: remove extra curl option 2022-11-26 01:31:21 +00:00
MDLeom 4bf534bdbc
feat: add Cloudflare Radar top 1m domains dataset 2022-11-25 07:19:20 +00:00
MDLeom c376e2a08f
feat: fallback to busybox dos2unix 2022-11-03 08:48:16 +00:00
MDLeom e51886ff44
feat: fallback to busybox dos2unix 2022-11-03 08:46:39 +00:00
MDLeom a50b2be515
fix: disable phishunt
- closes #43
- closes #45
2022-11-03 08:41:25 +00:00