# URLhaus Malicious URL Blocklist
This [uBO](https://github.com/gorhill/uBlock/)-compatible filter list is based on the database dump (CSV) of Abuse.ch [URLhaus](https://urlhaus.abuse.ch/).
## Subscribe
The filter is updated twice a day.

Import the following URL into uBO to subscribe:

https://gitlab.com/curben/urlhaus/raw/master/urlhaus-filter.txt
## Description
The following URL categories are removed from the database dump:
- Offline URLs
- Well-known domains ([top-1m.txt](src/top-1m.txt)), taken from the [Umbrella Popularity List](https://s3-us-west-1.amazonaws.com/umbrella-static/index.html)
- False positives ([exclude.txt](src/exclude.txt))

The database dump is saved as [URLhaus.csv](src/URLhaus.csv), processed by [script.sh](utils/script.sh), and output as [urlhaus-filter.txt](urlhaus-filter.txt).
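
The three removal steps above can be sketched roughly as follows. This is an illustration in Python, not the actual logic of `script.sh`: the function name `filter_urls`, the row keys `url` and `url_status`, and the choice to match well-known domains and false positives by hostname are all assumptions made for the example.

```python
from urllib.parse import urlsplit


def filter_urls(rows, top_domains, false_positives):
    """Return the URLs that survive the three removal steps.

    rows            -- iterable of dicts with (assumed) keys "url" and "url_status"
    top_domains     -- set of well-known hostnames to drop (stand-in for top-1m.txt)
    false_positives -- set of hostnames to drop (stand-in for exclude.txt)
    """
    kept = []
    for row in rows:
        # Step 1: drop URLs that are no longer online.
        if row["url_status"] != "online":
            continue
        host = urlsplit(row["url"]).hostname or ""
        # Steps 2 and 3: drop well-known domains and known false positives.
        if host in top_domains or host in false_positives:
            continue
        kept.append(row["url"])
    return kept


# Hypothetical sample data; the real dump has more columns.
rows = [
    {"url": "http://malware.example/payload.exe", "url_status": "online"},
    {"url": "http://gone.example/old.exe", "url_status": "offline"},
    {"url": "http://popular.example/file.doc", "url_status": "online"},
    {"url": "http://benign.example/tool.zip", "url_status": "online"},
]

print(filter_urls(rows, {"popular.example"}, {"benign.example"}))
```

Only the first sample row is kept: the second is offline, the third is on a well-known domain, and the fourth is listed as a false positive.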
## Note
Please report any false positives.

This filter **only** accepts malware URLs from [URLhaus](https://urlhaus.abuse.ch/).
Please report malware URLs to the upstream maintainer through https://urlhaus.abuse.ch/api/#submit.

This repo is not endorsed by Abuse.ch.
## FAQ
- Can you add this *very-bad-url.com* to the filter?
+ No, please report it to the [upstream](https://urlhaus.abuse.ch/api/#submit).
- Why don't you use the URLhaus "Plain-Text URL List"?
+ It doesn't show the status (online/offline) of a URL.
- Why don't you `wget top-1m.csv.zip` and output to stdout?
+ If wget fails, [top-1m.txt](src/top-1m.txt) will be empty. Saving the download to a file first avoids that.
- Why do you need to clone the repo again in your CI? Doesn't CI already fetch the repo by default?
+ GitLab Runner clones/fetches the repo over HTTPS by default ([log](https://gitlab.com/curben/urlhaus/-/jobs/105979394)). This method requires a deploy *token*, which is *read-only* (cannot push).
+ A deploy *key* has write access but cannot be used with the HTTPS method, hence the workaround of cloning over SSH.
+ See issues [#20567](https://gitlab.com/gitlab-org/gitlab-ce/issues/20567) and [#20845](https://gitlab.com/gitlab-org/gitlab-ce/issues/20845).
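
The answer to the `wget` question above rests on a general pattern: fetch into a separate file first, and only overwrite the existing output once the fetch has succeeded. The sketch below shows that pattern in Python rather than in the repo's shell script. The function name `refresh_file` and the injected `fetch` callable are hypothetical and exist only for this illustration.

```python
import os
import tempfile
from typing import Callable


def refresh_file(dest: str, fetch: Callable[[], bytes]) -> bool:
    """Overwrite dest with fetched data only if the fetch succeeds.

    Returns True when dest was replaced, False when it was left untouched.
    """
    try:
        data = fetch()
    except OSError:
        # A failed download must not clobber the existing file.
        return False
    if not data:
        # Treat an empty response as a failure too.
        return False
    directory = os.path.dirname(os.path.abspath(dest))
    # Write to a temporary file in the same directory, then swap it in.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    os.replace(tmp_path, dest)
    return True


def failing_fetch() -> bytes:
    # Stand-in for a download that fails.
    raise OSError("network unreachable")


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "top-1m.txt")
    with open(target, "wb") as f:
        f.write(b"previous contents\n")

    print(refresh_file(target, failing_fetch))            # existing file is kept
    print(refresh_file(target, lambda: b"example.com\n"))  # file is replaced
```

By contrast, piping a failed download straight into the output file truncates it before any data arrives, which is the situation the FAQ answer describes.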