Ads blocking with OpenBSD unbound(8)
The Web is full of Ads and Trackers. One way to avoid those is to
simply never reach the smelly servers. This can be partially done using a
local DNS resolver.
This article is a reboot of both the 2019
Blocking Ads using unbound on OpenBSD
and
Storing unbound logs into InfluxDB
posts; hopefully improved.
DNS Ads blocking is pretty simple: when you were supposed to make an
Internet request to some servers known to host Ads and Trackers, then
you simply don't!
This requires you to set up and maintain a smart DNS server. You also
have to tell your devices (smartphones, tablets, computers …) to use
it. Under the hood, the DNS server tells your devices that the domain
names they are looking for don't exist.
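For reference, "telling the devices the name doesn't exist" boils down to unbound(8) local-zone entries like this one (doubleclick.net is only an illustrative name here; the real entries are generated further down):
local-zone: "doubleclick.net" always_nxdomain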
There are ready-to-use solutions available.
Pi-hole
and AdGuard
Home
are some
well-known ones. uBlock Origin
works in another way but uses the same kind of algorithm to protect your
privacy: it detects bad resources and doesn't let you go there.
Here, the bad domains are grabbed using some of the same sources also
used by those projects.
Ingredients needed for this recipe (a rough data-flow sketch follows the list):
- Grafana to render the statistics;
- InfluxDB to store the data;
- syslogd(8) and awk(1) to turn DNS queries into statistics;
- collectd(1) and shell script to store unbound statistics and logs;
- unbound(8) and shell script to get and block DNS queries.
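Roughly, the data flows like this (my own summary of the setup, not an official diagram):
unbound(8) logs  -> syslogd(8) -> awk(1) + curl(1)        -> InfluxDB -> Grafana
unbound(8) stats -> unbound-control(8) -> collectd(1) exec -> InfluxDB -> Grafana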
Looks like unbound(8) came in with OpenBSD 5.2.
Anyway, v1.15.0 is now available stock in OpenBSD 7.1/amd64.
Sourcing Ads and Trackers lists
I'm using a combination of sources that are also used by Pi-hole, AdGuard
Home and uBlock. I wrote a simple shell script that parses the lists and
turns them into a format that unbound(8) will understand:
# cat /home/scripts/unbound-adhosts
#!/bin/sh
PATH="/bin:/sbin:/usr/bin:/usr/sbin"
_tmp="$(mktemp)" # Temp file to use while parsing
_out="/var/unbound/etc/unbound-adhosts.conf" # Unbound formatted zone file
# AdGuard Home
function adguardhome {
	# AdGuard DNS filter
	_src="https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt"
	ftp -MVo - "$_src" |
		sed -nre 's/^\|\|([a-zA-Z0-9._-]+)\^$/local-zone: "\1" always_nxdomain/p'
	# AdAway default blocklist
	_src="https://adaway.org/hosts.txt"
	ftp -MVo - "$_src" |
		awk '/^127\.0\.0\.1 / { print "local-zone: \"" $2 "\" always_nxdomain" }'
}
# From Pi-hole
function stevenblack {
	_src="https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts"
	ftp -MVo - "$_src" |
		awk '/^0\.0\.0\.0 / { print "local-zone: \"" $2 "\" always_nxdomain" }'
}
# StopForumSpam, toxic domains
function stopforumspam {
	_src="https://www.stopforumspam.com/downloads/toxic_domains_whole.txt"
	ftp -MVo - "$_src" |
		awk '{ print "local-zone: \"" $1 "\" always_nxdomain" }'
}
# uBlock Origin
function ublockorigin {
	# Malicious Domains Unbound Blocklist
	_src="https://malware-filter.gitlab.io/malware-filter/urlhaus-filter-unbound.conf"
	ftp -MVo - "$_src" | grep '^local-zone: '
	# Peter Lowe's Ad and tracking server list
	_src="https://pgl.yoyo.org/adservers/serverlist.php?showintro=0;hostformat=hosts"
	ftp -MVo - "$_src" |
		awk '/^127\.0\.0\.1 / { print "local-zone: \"" $2 "\" always_nxdomain" }'
	# AdGuard Français
	_src="https://raw.githubusercontent.com/AdguardTeam/AdguardFilters/master/FrenchFilter/sections/adservers.txt"
	ftp -MVo - "$_src" |
		sed -nre 's/^\|\|([a-zA-Z0-9._-]+)\^.*$/local-zone: "\1" always_nxdomain/p'
}
# Grab and format the data
adguardhome   >> "$_tmp"
stevenblack   >> "$_tmp"
stopforumspam >> "$_tmp"
ublockorigin  >> "$_tmp"
# Clean entries
sed -re 's/\." always/" always/' "$_tmp" |
	egrep -v "\"(t\.co)\"" |
	sort -u -o "$_tmp"
chmod 0644 "$_tmp"
# Take action if required
diff -q "$_out" "$_tmp" 1>/dev/null
case $? in
	0) rm "$_tmp" && exit 0;;
	1)
		mv "$_tmp" "$_out" &&
			doas -u _unbound unbound-checkconf 1>/dev/null &&
			exec doas -u _unbound unbound-control reload 1>/dev/null
		;;
	*) echo "$0: something bad happened!"; exit 1;;
esac
exit 0
#EOF
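The script can be run by hand once before handing it over to cron, to check that the zone file gets generated. Note that the very first run needs the output file to exist, or the diff(1) call exits with a code greater than 1 and hits the "something bad happened" branch:
# touch /var/unbound/etc/unbound-adhosts.conf
# /home/scripts/unbound-adhosts
# wc -l /var/unbound/etc/unbound-adhosts.conf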
Cron regularly synchronizes the lists content with a dedicated unbound(8) zone
file:
# crontab -l
(...)
# Update DNS block list
0~5 */6 * * * -s /home/scripts/unbound-adhosts
(...)
The zone file content can now be used by unbound(8).
Configuration
Enable statistics, configure logging and include the Ads/Trackers FQDN zone
file:
# cat /var/unbound/etc/unbound.conf
(...)
	statistics-cumulative: yes
	extended-statistics: yes
(...)
	use-syslog: yes
	log-queries: no
	log-replies: yes
	log-local-actions: yes
(...)
include: /var/unbound/etc/unbound-adhosts.conf
(...)
Then apply the new unbound(8) configuration:
# rcctl restart unbound
From now on, whenever a client requests DNS resolution for a bad domain,
it will get an NXDOMAIN and the query will not be processed.
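A quick way to check this from a client is to query a name present in the generated zone file; s.youtube.com shows up as blocked in the logs further down, and 192.168.0.1 stands for the resolver's address:
$ dig @192.168.0.1 s.youtube.com A +noall +comments
The status field of the reply should read NXDOMAIN.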
The logs and metrics end up in InfluxDB so that I can render a pretty
dashboard. There's nothing special to do on the InfluxDB side. Simply
create the database(s) and send data to it/them.
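With InfluxDB 1.x, that boils down to a few InfluxQL statements from the influx shell (db_name, db_user and db_pass are the same placeholders used in the curl configuration further down):
> CREATE DATABASE db_name
> CREATE USER db_user WITH PASSWORD 'db_pass'
> GRANT ALL ON db_name TO db_user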
Collect the metrics
A shell script parses unbound statistics and writes them into a
dedicated InfluxDB measurement:
# cat /home/scripts/collectd-unbound
#!/bin/sh
#
# CollectD Exec unbound(8) stats
# Configure "extended-statistics: yes"
#
PATH="/bin:/sbin:/usr/bin:/usr/sbin"
HOSTNAME="${COLLECTD_HOSTNAME:-$(hostname -s)}"
INTERVAL="${COLLECTD_INTERVAL:-10}"
while sleep "$INTERVAL"; do
	# Turn each "key=value" stat into a collectd PUTVAL line
	doas -u _unbound unbound-control stats_noreset |
		egrep -v "^(histogram\.|time\.now|time\.elapsed)" |
		sed -re "s;^([^=]+)=([0-9.]+);PUTVAL $HOSTNAME/exec-unbound/gauge-\1 interval=$INTERVAL N:\2;"
	# Also report the number of blocked zones currently loaded
	awk -v h=$HOSTNAME -v i=$INTERVAL \
		'END { print "PUTVAL " h "/exec-unbound/gauge-num.adhosts interval=" i " N:" FNR }' \
		/var/unbound/etc/unbound-adhosts.conf
done
exit 0
#EOF
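For reference, the sed(1) expression turns each key=value line printed by unbound-control(8) into a collectd plain-text protocol PUTVAL line; e.g. total.num.queries=1234 (made-up value, hostname assumed to be "openbsd") becomes:
PUTVAL openbsd/exec-unbound/gauge-total.num.queries interval=10 N:1234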
# cat /etc/doas.conf
(...)
permit nopass _collectd as _unbound cmd unbound-control
(...)
# cat /etc/collectd.conf
(...)
<Plugin exec>
  Exec _collectd "/home/scripts/collectd-unbound"
</Plugin>
(...)
# rcctl restart collectd
In InfluxDB, the data will look like this:
> SELECT * FROM "exec_value" WHERE "instance"='unbound' ORDER BY DESC LIMIT 10
name: exec_value
time                           host    instance type  type_instance                 value
----                           ----    -------- ----  -------------                 -----
2022-10-02T17:03:01.66013246Z  openbsd unbound  gauge num.query.authzone.down       0
2022-10-02T17:03:01.660101373Z openbsd unbound  gauge num.query.authzone.up         0
2022-10-02T17:03:01.660069948Z openbsd unbound  gauge key.cache.count               4030
2022-10-02T17:03:01.660033432Z openbsd unbound  gauge infra.cache.count             491
2022-10-02T17:03:01.659930095Z openbsd unbound  gauge rrset.cache.count             37499
2022-10-02T17:03:01.659893329Z openbsd unbound  gauge msg.cache.count               108713
2022-10-02T17:03:01.659857007Z openbsd unbound  gauge unwanted.replies              9
2022-10-02T17:03:01.659820476Z openbsd unbound  gauge unwanted.queries              0
2022-10-02T17:03:01.659784111Z openbsd unbound  gauge num.query.aggressive.NXDOMAIN 882
2022-10-02T17:03:01.659747595Z openbsd unbound  gauge num.query.aggressive.NOERROR  256
Parse the logs
OpenBSD syslogd(8) has a feature that allows sending some logs to an
external program. I decided I'd write an awk(1) script that gets
the logs from syslogd, parses and formats them into a proper InfluxDB
dataset and uses curl(1) to actually store the data.
Authentication is configured on my InfluxDB instance. So curl(1) has to
use a login/password to be able to store the data. But I noticed that if you
use the "--user" flag, then one can see the credentials using ps(1).
So I'm using an extra credentials file for curl(1).
# cat /home/scripts/unbound-logs2influxdb
#!/usr/bin/awk -f
BEGIN {
	# Build an associative array (_ptr[ip]=hostname) of known DNS clients.
	_fs = FS; FS = "[\" ]+" # Dirty hack to parse unbound logs.
	_ptr["127.0.0.1"] = "localhost"
	_zone = "/var/unbound/etc/unbound-tumfatig.conf"
	while ((getline < _zone) > 0) {
		if ($0 ~ /^local-data-ptr:/) { # only parse PTR records.
			split($3, _fqdn, "."); _ptr[$2] = _fqdn[1]
		}
	}
	close(_zone)
	FS = _fs # Roll back the dirty hack.
}
$3 == "unbound:" && $5 == "information:" { # Solely parse unbound information logs.
if($7 == "static") { # Native zone: authoritative DNS.
break up($8, _client, "@") # Shopper format is IP@PORT
if (_ptr[_client[1]] == "") { _host = "<unknown>" } # If no PTR.
else { _host = _ptr[_client[1]] }
_rec = "unbound_static,host=" $2 ",title=" $9
_rec = _rec ",kind=" $10 ",class=" $11
_rec = _rec ",clientip=" _client[1]
_rec = _rec ",consumer=" _host " matched=1i"
} else if($7 == "always_nxdomain") { # Native zone: AD blocks.
break up($8, _client, "@") # Shopper format is IP@PORT
if (_ptr[_client[1]] == "") { _host = "<unknown>" } # If no PTR.
else { _host = _ptr[_client[1]] }
_rec = "unbound_adblock,host=" $2 ",title=" $6
_rec = _rec ",kind=" $10",class=" $11
_rec = _rec ",clientip=" _client[1]
_rec = _rec ",consumer=" _host " matched=1i"
} else if(NF == 13) { # DNS queries have 13 fields.
if (_ptr[$6] == "") {
_host = "<unknown>" # Set hostname to '<unknown>'
} else { _host = _ptr[$6] } # if no PTR exists in zone file.
_rec = "unbound_queries,host=" $2 ",title=" $7 ",clientip=" $6
_rec = _rec ",consumer=" _host ",kind=" $8 ",class=" $9
_rec = _rec ",return_code=" $10 ",from_cache=" $12
_rec = _rec " time_to_resolve=" $11 ",response_size=" $13 "i"
}
	# Build the InfluxDB line protocol insert using curl
	_cmd = "/usr/local/bin/curl -s -XPOST "
	_cmd = _cmd "-K /home/scripts/unbound-logs2influxdb.conf "
	_cmd = _cmd "--data-binary \"" _rec "\""
	# Run the curl command = insert the data into InfluxDB
	system(_cmd)
}
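Put together, a blocked query ends up being POSTed to InfluxDB as a line-protocol record roughly like this one (values borrowed from the query result shown further down):
unbound_adblock,host=unbound,name=s.youtube.com.,type=A,class=IN,clientip=192.0.0.16,client=ThinkPad-de-Joel matched=1i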
# cat /home/scripts/unbound-logs2influxdb.conf
# InfluxDB credentials
url = "https://influxdb_host:8086/write?db=db_name&precision=s"
user = "db_user:db_pass"
The script is run by syslogd(8) and the configuration file contains
credentials. So both files require special care regarding permissions
and ownership:
# ls -alh /home/scripts/unbound-logs2influxdb*
-rwxr-x---  1 root  _syslogd   1.9K Oct  2 16:04 /home/scripts/unbound-logs2influxdb*
-rw-r-----  1 root  _syslogd   505B Sep 29 00:51 /home/scripts/unbound-logs2influxdb.conf
syslogd(8) has a dedicated configuration so that unbound(8) logs, and only
those, are sent to and parsed by the script:
# cat /etc/syslog.conf
(...)
!!unbound
*.*	|/home/scripts/unbound-logs2influxdb
!*
(...)
# rcctl restart syslogd
The parsed logs can now be queried from InfluxDB:
> SELECT * FROM "unbound_adblock" ORDER BY DESC LIMIT 5
name: unbound_adblock
time                 class client           clientip   host    matched name                      type
----                 ----- ------           --------   ----    ------- ----                      ----
2022-10-02T22:14:24Z IN    ThinkPad-de-Joel 192.0.0.16 unbound 1       s.youtube.com.            A
2022-10-02T22:13:35Z IN    -                192.0.0.12 unbound 1       www.googleadservices.com. HTTPS
2022-10-02T22:13:35Z IN    -                192.0.0.12 unbound 1       www.googleadservices.com. A
2022-10-02T22:13:34Z IN    -                192.0.0.12 unbound 1       s.youtube.com.            HTTPS
2022-10-02T22:13:34Z IN    -                192.0.0.12 unbound 1       s.youtube.com.            A
Doing things is nice but checking what you're doing is better. You could
regularly run InfluxDB commands, or even parse results and send emails.
But you can also set up a beautiful Web page with Grafana.
For the most impatient and/or curious, it is possible to benchmark
unbound(8) using commonly used domains. Grab and parse the Top 10
million domains (based on Open PageRank data)
so that they can be used by
dnsperf(1).
# pkg_add dnsperf
# ftp https://www.domcop.com/files/top/top10milliondomains.csv.zip
Trying 94.130.193.220...
Requesting https://www.domcop.com/files/top/top10milliondomains.csv.zip
100% |************************************************************|   112 MB    00:09
117800727 bytes received in 9.77 seconds (11.49 MB/s)
# unzip top10milliondomains.csv.zip
# awk -F '[",]' '{ if($5 != "Domain") { print $5 " A" };
    if($5 ~ /^[a-k]/) { print $5 " MX" }; if(FNR == 100000) exit }' \
    top10milliondomains.csv > top100k.txt
# dnsperf -s 192.168.0.1 -c 5 -d top100k.txt
Statistics:
  Queries sent:         145937
  Queries completed:    145821 (99.92%)
  Queries lost:         116 (0.08%)
  Response codes:       NOERROR 137226 (94.11%), SERVFAIL 278 (0.19%), NXDOMAIN 8317 (5.70%)
  Average packet size:  request 33, response 79
  Run time (s):         236.612169
  Queries per second:   616.286984
  Average Latency (s):  0.155683 (min 0.000112, max 4.990909)
  Latency StdDev (s):   0.300890
You can see that unbound(8) replies but is a bit out of breath: not all
queries were served. And collectd seemed to have difficulty getting some
of the stats during such a load.
Looking at the logs, warnings popped out:
warning: cannot increase max open fds from 512 to 4152
warning: continuing with less udp ports: 460
warning: increase ulimit or decrease threads, ports in config to remove this warning
This means my unbound configuration is not tuned properly for such a
load. In real conditions, I'm way below 8 req/s. So it'll be okay for me.
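For the record, if one did want to handle that kind of load, the warning points at the file-descriptor limit. An untested sketch on OpenBSD would be a dedicated login.conf(5) class for the daemon, which rc.d(8) picks up when the class name matches the rc script, possibly combined with lowering outgoing-range in unbound.conf(5); example values only:
# cat /etc/login.conf
(...)
unbound:\
	:openfiles=4096:\
	:tc=daemon:
(...)
# cap_mkdb /etc/login.conf   # only if /etc/login.conf.db is in use
# rcctl restart unbound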
And that’s all for now!