docs: mention lite ruleset

curben 2019-06-13 16:51:42 +09:30
parent 016fae9f4a
commit e14f1a88b4
1 changed files with 1 additions and 3 deletions

FAQ.md

@@ -4,6 +4,7 @@
3. Exclude popular domains ([Umbrella Popularity List](https://s3-us-west-1.amazonaws.com/umbrella-static/index.html)) and some well-known domains (if not listed by Umbrella, see [exclude.txt](https://gitlab.com/curben/urlhaus-filter/blob/master/src/exclude.txt)).
4. Extract the URLs (from step 1) that include popular domains (Umbrella and exclude.txt).
5. Merge the files from step 3 and 4.
6. The lite version uses the **Database dump (CSV)**, which is saved as [URLhaus.csv](https://gitlab.com/curben/urlhaus-filter/blob/master/src/URLhaus.csv). Only online URLs are extracted from that database.
- Why is there an issue running the scripts locally?
+ Install **dos2unix**, or use `busybox dos2unix` if BusyBox is already installed (as on Ubuntu).
@@ -11,9 +12,6 @@
- Can you add this *very-bad-url.com* to the filter?
+ No, please report to the [upstream](https://urlhaus.abuse.ch/api/#submit).
- Why don't you use the URLhaus "Plain-Text URL List"?
+ It doesn't show the status (online/offline) of a URL.
- Why don't you `wget top-1m.csv.zip` and output to stdout?
+ If wget fails, top-1m.txt will be empty. Saving the download to a file first avoids that.
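
The last answer above can be illustrated with a minimal sketch: download to a temporary file, and only replace the existing file when the download succeeded and produced data. The helper name `safe_fetch` and the example URL are hypothetical and are not taken from the project's scripts.

```shell
#!/bin/sh
# Hypothetical helper (not from the project's scripts).
# Usage: safe_fetch OUTPUT_FILE DOWNLOAD_COMMAND [ARGS...]
# The temporary path is appended as the last argument of the download command.
safe_fetch() {
  out="$1"
  shift
  tmp="${out}.tmp"
  # Keep the result only if the command succeeded AND wrote a non-empty file.
  if "$@" "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$out"
  else
    # A failed download never touches the existing output file.
    rm -f "$tmp"
    echo "download failed; keeping existing $out" >&2
    return 1
  fi
}

# Example call (placeholder URL); expands to: wget -q URL -O top-1m.csv.zip.tmp
# safe_fetch top-1m.csv.zip wget -q "https://example.com/top-1m.csv.zip" -O
```

By contrast, piping the download to stdout and redirecting into the final file would truncate that file as soon as the pipeline starts, so a failed download would leave it empty.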