diff --git a/FAQ.md b/FAQ.md
index 8cba6c6..c1731d2 100644
--- a/FAQ.md
+++ b/FAQ.md
@@ -1,10 +1,10 @@
 - How is the filter created?
-  1. Grab the URLhaus **Plain-Text URL List** and save it to [URLhaus.txt](https://gitlab.com/curben/urlhaus-filter/blob/master/src/URLhaus.txt).
+  1. Grab the URLhaus **Database dump (CSV)** and save it to [URLhaus.csv](https://gitlab.com/curben/urlhaus-filter/blob/master/src/URLhaus.csv).
   2. Extract the domains.
   3. Exclude popular domains ([Umbrella Popularity List](https://s3-us-west-1.amazonaws.com/umbrella-static/index.html)) and some well-known domains (if not listed by Umbrella, see [exclude.txt](https://gitlab.com/curben/urlhaus-filter/blob/master/src/exclude.txt)).
   4. Extract the URLs (from step 1) that include popular domains (Umbrella and exclude.txt).
   5. Merge the files from step 3 and 4.
-  6. Lite version uses **Database dump (CSV)** that is saved as [URLhaus.csv](https://gitlab.com/curben/urlhaus-filter/blob/master/src/URLhaus.csv). Only online urls are extracted from that database.
+  6. Lite version only parses online urls from that database.
 - Why there is an issue running the scripts locally?
+  Install **dos2unix** or use `busybox dos2unix` if BusyBox is already installed (like Ubuntu).
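
Note on steps 2 to 6 of the FAQ: the pipeline they describe could be sketched roughly as below. This is a minimal illustration, not the repository's actual script. The names `domain_of` and `build_filter`, the `(url, status)` input shape, the `"online"` status value, and the stripping of the URL scheme are all assumptions made for the example.

```python
from urllib.parse import urlsplit


def domain_of(url):
    """Return the lowercase host part of a URL (empty string if none)."""
    return (urlsplit(url).hostname or "").lower()


def build_filter(entries, popular, online_only=False):
    """Hypothetical helper, not the project's real script.

    entries: iterable of (url, status) pairs taken from the database dump.
    popular: set of popular or well-known domains (Umbrella plus exclude.txt).
    """
    entries = list(entries)
    if online_only:
        # Step 6 (lite version): keep only URLs marked online.
        # The "online" status string is an assumption.
        entries = [(u, s) for u, s in entries if s == "online"]
    urls = [u for u, _ in entries]

    # Steps 2-3: extract the domains, then drop the popular ones.
    domains = {domain_of(u) for u in urls} - set(popular) - {""}

    # Step 4: keep full URLs hosted on popular domains, so the whole
    # domain is not blocked. Dropping the scheme is an assumption.
    popular_urls = {
        u.split("://", 1)[-1] for u in urls if domain_of(u) in popular
    }

    # Step 5: merge both sets into one sorted list.
    return sorted(domains | popular_urls)
```

For example, calling `build_filter` with a popular file-hosting domain in `popular` would list the malicious URL on that host in full, while every other host is listed by domain only.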