Ad blocking with OpenBSD unbound(8)

2022-10-07 10:10:54

 

  

The Internet is full of ads and trackers. One way to avoid them is
simply not to reach the offending servers. This can be partially done
using a local DNS resolver.

This article is a reboot of both the 2019 posts "Blocking Ads using
unbound on OpenBSD" and "Storing unbound logs into InfluxDB", hopefully
improved.

DNS ad blocking is fairly simple: when you are supposed to make an
Internet request to servers known to host ads and trackers, you simply
don't!

This requires you to set up and maintain a smart DNS server. You also
have to tell your devices (smartphones, tablets, computers …) to use
it. Under the hood, the DNS server tells your devices that the domain
names they are looking for don't exist.

Ready-to-use solutions are available. Pi-hole and AdGuard Home are some
well-known ones. uBlock Origin works in another way but uses the same
kind of algorithm to protect your privacy: it detects bad resources and
doesn't let you go there.

Here, the bad domains are grabbed from some of the same sources used by
those projects.

Ingredients needed for this recipe:

  • Grafana to render the statistics;
  • InfluxDB to store the data;
  • syslogd(8) and awk(1) to turn DNS queries into statistics;
  • collectd(1) and a shell script to store unbound statistics and logs;
  • unbound(8) and a shell script to get and block DNS queries.

It looks like unbound(8) came in with OpenBSD 5.2.

Anyway, v1.15.0 is now available stock in OpenBSD 7.1/amd64.

Sourcing ads and trackers lists

I'm using a combination of sources that are used by Pi-hole, AdGuard
Home and uBlock. I wrote a simple shell script that parses the lists and
turns them into a format that unbound(8) will understand:

# cat /home/scripts/unbound-adhosts
#!/bin/sh

PATH="/bin:/sbin:/usr/bin:/usr/sbin"

_tmp="$(mktemp)"        # Temp file to use while parsing
_out="/var/unbound/etc/unbound-adhosts.conf"  # Unbound formatted zone file

# AdGuard Home
function adguardhome {
  # AdGuard DNS filter
  _src="https://adguardteam.github.io/AdGuardSDNSFilter/Filters/filter.txt"
  ftp -MVo - "$_src" | \
  sed -nre 's/^\|\|([a-zA-Z0-9_.-]+)\^$/local-zone: "\1" always_nxdomain/p'

  # AdAway default blocklist
  _src="https://adaway.org/hosts.txt"
  ftp -MVo - "$_src" | \
  awk '/^127\.0\.0\.1 / { print "local-zone: \"" $2 "\" always_nxdomain" }'
}

# From Pi-hole
function stevenblack {
  _src="https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts"
  ftp -MVo - "$_src" | \
  awk '/^0\.0\.0\.0 / { print "local-zone: \"" $2 "\" always_nxdomain" }'
}

# StopForumSpam, toxic domains
function stopforumspam {
  _src="https://www.stopforumspam.com/downloads/toxic_domains_whole.txt"
  ftp -MVo - "$_src" | \
  awk '{ print "local-zone: \"" $1 "\" always_nxdomain" }'
}

# uBlock Origin
function ublockorigin {
  # Malicious Domains Unbound Blocklist
  _src="https://malware-filter.gitlab.io/malware-filter/urlhaus-filter-unbound.conf"
  ftp -MVo - "$_src" | grep '^local-zone: '

  # Peter Lowe's Ad and tracking server list
  _src="https://pgl.yoyo.org/adservers/serverlist.php?showintro=0;hostformat=hosts"
  ftp -MVo - "$_src" | \
  awk '/^127\.0\.0\.1 / { print "local-zone: \"" $2 "\" always_nxdomain" }'

  # AdGuard Français
  _src="https://raw.githubusercontent.com/AdguardTeam/AdguardFilters/master/FrenchFilter/sections/adservers.txt"
  ftp -MVo - "$_src" | \
  sed -nre 's/^\|\|([a-zA-Z0-9_.-]+)\^.*$/local-zone: "\1" always_nxdomain/p'
}

# Grab and format the data
adguardhome >> "$_tmp"
stevenblack >> "$_tmp"
stopforumspam >> "$_tmp"
ublockorigin >> "$_tmp"

# Clean entries: drop trailing dots, skip allowed domains, deduplicate
sed -re 's/\." always/" always/' "$_tmp" | \
egrep -v "\"(t\.co)\"" | \
sort -u -o "$_tmp"
chmod 0644 "$_tmp"

# Take action if required
diff -q "$_out" "$_tmp" 1>/dev/null
case $? in
  0)  rm "$_tmp" && exit 0;;
  1)
    mv "$_tmp" "$_out" && \
    doas -u _unbound unbound-checkconf 1>/dev/null && \
    exec doas -u _unbound unbound-control reload 1>/dev/null
    ;;
  *)  echo "$0: something bad happened!"; exit 1;;
esac

exit 0
#EOF
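
The two kinds of filters used by the script can be tried offline before
pointing them at the real lists. The following is only a sketch: the
function names and the sample entries are made up for illustration, and
sed -E is used as the portable spelling of the -r flag.

```shell
#!/bin/sh
# Offline check of the blocklist filters; the sample entries are made up.

# Adblock-style rules ("||domain^") to unbound local-zone lines.
adblock_to_unbound() {
  sed -nE 's/^\|\|([a-zA-Z0-9_.-]+)\^$/local-zone: "\1" always_nxdomain/p'
}

# hosts-style entries ("0.0.0.0 domain") to unbound local-zone lines.
hosts_to_unbound() {
  awk '/^0\.0\.0\.0 / { print "local-zone: \"" $2 "\" always_nxdomain" }'
}

# Comments and rules carrying options are ignored by the strict pattern.
printf '%s\n' '! a comment' '||ads.example.com^' \
  '||tracker.example.net^$third-party' | adblock_to_unbound

# Only lines starting with 0.0.0.0 are converted.
printf '%s\n' '# a comment' '0.0.0.0 ads.example.org' \
  '127.0.0.1 localhost' | hosts_to_unbound
```

Each call should print a single local-zone line, for ads.example.com
and ads.example.org respectively.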

Cron regularly synchronizes the list content with a dedicated unbound(8)
zone file:

# crontab -l
(...)
# Update DNS block list
0~5 */6 * * * -s /home/scripts/unbound-adhosts
(...)

The zone file content can now be used by unbound(8).

Configuration

Enable statistics, configure logs, include the ads/trackers FQDN zone
file:

# cat /var/unbound/etc/unbound.conf
(...)
statistics-cumulative: yes
extended-statistics: yes
(...)
use-syslog: yes
log-queries: no
log-replies: yes
log-local-actions: yes
(...)
include: /var/unbound/etc/unbound-adhosts.conf
(...)

Then apply the new unbound(8) configuration:

# rcctl restart unbound

From now on, each time a client requests DNS resolution for a bad domain,
it will get an NXDOMAIN and the query will not be processed.
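
As far as I understand unbound(8) local zones, an always_nxdomain entry
covers the zone name and everything below it. The sketch below mimics
that lookup against a zone file; is_blocked is a hypothetical helper
and the sample zone is made up.

```shell
#!/bin/sh
# Hypothetical helper: tell whether a name falls under a blocked zone.

# is_blocked NAME ZONEFILE: succeed if NAME or one of its parent domains
# is declared as an always_nxdomain local-zone in ZONEFILE.
is_blocked() {
  _name="${1%.}"                      # drop a trailing dot, if any
  while [ -n "$_name" ]; do
    if grep -qxF "local-zone: \"$_name\" always_nxdomain" "$2"; then
      return 0
    fi
    case "$_name" in
      *.*) _name="${_name#*.}" ;;     # strip the leftmost label
      *)   return 1 ;;
    esac
  done
  return 1
}

_zones="$(mktemp)"
printf '%s\n' 'local-zone: "ads.example.com" always_nxdomain' > "$_zones"

for _q in ads.example.com img.ads.example.com example.com; do
  if is_blocked "$_q" "$_zones"; then
    echo "$_q: blocked"
  else
    echo "$_q: resolved normally"
  fi
done
rm -f "$_zones"
```

On a live resolver, the same check is a matter of querying the name
with a DNS client and looking for the NXDOMAIN status.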

The logs and metrics end up in InfluxDB so that I can render a pretty
dashboard. There's nothing special to do on the InfluxDB side. Simply
create the database(s) and send data to it/them.

Gather the metrics

A shell script parses unbound statistics and writes them into a
specific InfluxDB measurement:

# cat /home/scripts/collectd-unbound
#!/bin/sh
#
# CollectD Exec unbound(8) stats
# Configure "extended-statistics: yes"
#

PATH="/bin:/sbin:/usr/bin:/usr/sbin"

HOSTNAME="${COLLECTD_HOSTNAME:-$(hostname -s)}"
INTERVAL="${COLLECTD_INTERVAL:-10}"

while sleep "$INTERVAL"; do
  # One PUTVAL line per statistic; histogram and time entries are skipped
  doas -u _unbound unbound-control stats_noreset | \
  egrep -v "^(histogram\.|time\.now|time\.elapsed)" | \
  sed -re "s;^([^=]+)=([0-9.]+);PUTVAL $HOSTNAME/exec-unbound/gauge-\1 interval=$INTERVAL N:\2;"

  # Number of blocked zones = number of lines in the zone file
  awk -v h="$HOSTNAME" -v i="$INTERVAL" \
  'END { print "PUTVAL " h "/exec-unbound/gauge-num.adhosts interval=" i " N:" FNR }' \
  /var/unbound/etc/unbound-adhosts.conf
done

exit 0
#EOF

# cat /etc/doas.conf
(...)
permit nopass _collectd as _unbound cmd unbound-control
(...)

# cat /etc/collectd.conf
(...)
<Plugin exec>
  Exec _collectd "/home/scripts/collectd-unbound"
</Plugin>
</Plugin>
(...)

# rcctl restart collectd
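
To see what collectd receives, the filter chain can be run on a few
made-up statistics lines instead of live unbound-control output. This
is a sketch only: stats_to_putval, the host name and the sample values
are assumptions, and grep -E / sed -E stand in for egrep / sed -r.

```shell
#!/bin/sh
# Dry run of the stats filter on made-up unbound-control output.

HOSTNAME="openbsd"   # hypothetical host name
INTERVAL="10"

# Same filtering as the collectd script: drop histogram and time entries,
# then turn each "name=value" pair into a collectd PUTVAL line.
stats_to_putval() {
  grep -Ev '^(histogram\.|time\.now|time\.elapsed)' |
  sed -E "s;^([^=]+)=([0-9.]+);PUTVAL $HOSTNAME/exec-unbound/gauge-\1 interval=$INTERVAL N:\2;"
}

stats_to_putval <<EOF
total.num.queries=1234
time.now=1664730181.660132
histogram.000000.000000.to.000000.000001=0
msg.cache.count=108713
EOF
```

Only the two counters should come out, as PUTVAL lines for
gauge-total.num.queries and gauge-msg.cache.count.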

In InfluxDB, the data will look like this:

> SELECT * FROM "exec_value" WHERE "instance"='unbound' ORDER BY DESC LIMIT 10
name: exec_value
time                           host    instance type  type_instance                 value
----                           ----    -------- ----  -------------                 -----
2022-10-02T17:03:01.66013246Z  openbsd unbound  gauge num.query.authzone.down       0
2022-10-02T17:03:01.660101373Z openbsd unbound  gauge num.query.authzone.up         0
2022-10-02T17:03:01.660069948Z openbsd unbound  gauge key.cache.count               4030
2022-10-02T17:03:01.660033432Z openbsd unbound  gauge infra.cache.count             491
2022-10-02T17:03:01.659930095Z openbsd unbound  gauge rrset.cache.count             37499
2022-10-02T17:03:01.659893329Z openbsd unbound  gauge msg.cache.count               108713
2022-10-02T17:03:01.659857007Z openbsd unbound  gauge unwanted.replies              9
2022-10-02T17:03:01.659820476Z openbsd unbound  gauge unwanted.queries              0
2022-10-02T17:03:01.659784111Z openbsd unbound  gauge num.query.aggressive.NXDOMAIN 882
2022-10-02T17:03:01.659747595Z openbsd unbound  gauge num.query.aggressive.NOERROR  256

Parse the logs

OpenBSD syslogd(8) has a feature that allows sending some logs to an
external program. I decided to write an awk(1) script that gets the
logs from syslogd, parses and formats them into a proper InfluxDB
dataset, and uses curl(1) to actually save the data.

Authentication is configured on my InfluxDB instance, so curl(1) has to
use a login/password to be able to store the data. But I noticed that if
you use the "--user" flag, then one can see the credentials using ps(1).
So I'm using an extra credential file for curl(1).

# cat /home/scripts/unbound-logs2influxdb
#!/usr/bin/awk -f
BEGIN {
  # Build an associative array (_ptr[ip]=hostname) of known DNS clients.
  _fs = FS; FS = "[\" \t]+"         # Dirty hack to parse unbound logs.
  _zone = "/var/unbound/etc/unbound-tumfatig.conf"

  _ptr["127.0.0.1"] = "localhost"
  while ((getline < _zone) > 0) {
    if ($0 ~ /^local-data-ptr:/) {                     # only parse PTR.
      split($3, _fqdn, "."); _ptr[$2] = _fqdn[1]
    }
  }
  close(_zone)

  FS = _fs        # Rollback dirty hack.
}
$3 == "unbound:" && $5 == "info:" {     # Only parse unbound info logs.
  if($7 == "static") {                  # Local zone: authoritative DNS.
    split($8, _client, "@")                     # Client format is IP@PORT
    if (_ptr[_client[1]] == "") { _host = "<unknown>" }     # If no PTR.
    else { _host = _ptr[_client[1]] }

    _rec = "unbound_static,host=" $2 ",name=" $9
    _rec = _rec ",type=" $10 ",class=" $11
    _rec = _rec ",clientip=" _client[1]
    _rec = _rec ",client=" _host " matched=1i"
  } else if($7 == "always_nxdomain") {          # Local zone: AD blocks.
    split($8, _client, "@")                     # Client format is IP@PORT
    if (_ptr[_client[1]] == "") { _host = "<unknown>" }     # If no PTR.
    else { _host = _ptr[_client[1]] }

    _rec = "unbound_adblock,host=" $2 ",name=" $6
    _rec = _rec ",type=" $10 ",class=" $11
    _rec = _rec ",clientip=" _client[1]
    _rec = _rec ",client=" _host " matched=1i"
  } else if(NF == 13) {                    # DNS queries have 13 fields.
    if (_ptr[$6] == "") {
      _host = "<unknown>"                  # Set hostname to '<unknown>'
    } else { _host = _ptr[$6] }         # if no PTR exists in zone file.

    _rec = "unbound_queries,host=" $2 ",name=" $7 ",clientip=" $6
    _rec = _rec ",client=" _host ",type=" $8 ",class=" $9
    _rec = _rec ",return_code=" $10 ",from_cache=" $12
    _rec = _rec " time_to_resolve=" $11 ",response_size=" $13 "i"
  } else { next }                 # Skip lines that match none of the above.

  # Build the InfluxDB line protocol request using curl
  _cmd = "/usr/local/bin/curl -s -XPOST "
  _cmd = _cmd "-K /home/scripts/unbound-logs2influxdb.conf "
  _cmd = _cmd "--data-binary \"" _rec "\""

  # Run the curl command = Insert data in InfluxDB
  system(_cmd)
}
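
Before wiring the parser into syslogd(8), the field mapping can be
checked by printing the records instead of posting them. The sketch
below is a reduced version without the PTR lookup; logs_to_lines is a
hypothetical name, and the sample log lines are made up, their layout
(timestamp, host, "unbound:", [pid:thread], "info:") being assumed from
the field numbers the script relies on.

```shell
#!/bin/sh
# Dry run: print InfluxDB line protocol records instead of posting them.
# The sample log layout is an assumption derived from the field numbers.

logs_to_lines() {
  awk '
  $3 == "unbound:" && $5 == "info:" {
    if ($7 == "always_nxdomain") {          # blocked by the ad zone file
      split($8, _client, "@")               # client format is IP@PORT
      _rec = "unbound_adblock,host=" $2 ",name=" $6
      _rec = _rec ",type=" $10 ",class=" $11
      _rec = _rec ",clientip=" _client[1] " matched=1i"
    } else if (NF == 13) {                  # reply logs have 13 fields
      _rec = "unbound_queries,host=" $2 ",name=" $7 ",clientip=" $6
      _rec = _rec ",type=" $8 ",class=" $9
      _rec = _rec ",return_code=" $10 ",from_cache=" $12
      _rec = _rec " time_to_resolve=" $11 ",response_size=" $13 "i"
    } else { next }                         # ignore anything else
    print _rec
  }'
}

logs_to_lines <<'EOF'
2022-10-02T22:14:24Z openbsd unbound: [12345:0] info: s.youtube.com. always_nxdomain 192.0.0.16@51234 s.youtube.com. A IN
2022-10-02T22:14:25Z openbsd unbound: [12345:0] info: 192.0.0.16 www.openbsd.org. A IN NOERROR 0.012345 0 49
2022-10-02T22:14:26Z openbsd unbound: [12345:0] info: service stopped
EOF
```

The first two lines should each yield one record (tags before the
space, fields after it); the third one is skipped.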

# cat /home/scripts/unbound-logs2influxdb.conf
# InfluxDB credentials
url = "https://influxdb_host:8086/write?db=db_name&precision=s"
user = "db_user:db_pass"

The script is run by syslogd(8) and the configuration file contains
credentials. So both files require special care regarding permissions
and ownership:

# ls -alh /home/scripts/unbound-logs2influxdb*
-rwxr-x---  1 root  _syslogd   1.9K Oct  2 16:04 /home/scripts/unbound-logs2influxdb*
-rw-r-----  1 root  _syslogd   505B Sep 29 00:51 /home/scripts/unbound-logs2influxdb.conf

syslogd(8) has a specific configuration so that unbound(8) logs, and
only those, are sent to and parsed by the script:

# cat /etc/syslog.conf
(...)
!!unbound
*.* |/home/scripts/unbound-logs2influxdb
!*
(...)

# rcctl restart syslogd

The parsed logs can now be queried from InfluxDB:

> SELECT * FROM "unbound_adblock" ORDER BY DESC LIMIT 5
name: unbound_adblock
time                 class client           clientip   host    matched name                      type
----                 ----- ------           --------   ----    ------- ----                      ----
2022-10-02T22:14:24Z IN    ThinkPad-de-Joel 192.0.0.16 unbound 1       s.youtube.com.            A
2022-10-02T22:13:35Z IN    -                192.0.0.12 unbound 1       www.googleadservices.com. HTTPS
2022-10-02T22:13:35Z IN    -                192.0.0.12 unbound 1       www.googleadservices.com. A
2022-10-02T22:13:34Z IN    -                192.0.0.12 unbound 1       s.youtube.com.            HTTPS
2022-10-02T22:13:34Z IN    -                192.0.0.12 unbound 1       s.youtube.com.            A

Doing things is nice but checking what you're doing is better. You could
regularly run InfluxDB commands, or even parse the results and send
emails. But you can also set up a moootiful Web page with Grafana.

For the most impatient and/or curious, it's possible to benchmark
unbound(8) using commonly used domains. Grab and parse the Top 10
million domains (based on Open PageRank data) so that they can be used
by dnsperf(1).

# pkg_add dnsperf

# ftp https://www.domcop.com/files/top/top10milliondomains.csv.zip
Trying 94.130.193.220...
Requesting https://www.domcop.com/files/top/top10milliondomains.csv.zip
100% |************************************************************|  112 MB  00:09
117800727 bytes received in 9.77 seconds (11.49 MB/s)

# unzip top10milliondomains.csv.zip

# awk -F '[",]' '{ if($5 != "Domain") { print $5 " A" };
  if($5 ~/^[a-k]/) { print $5 " MX" }; if(FNR == 100000) exit }' \
  top10milliondomains.csv > top100k.txt
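
The field numbering is not obvious: with '[",]' as separator, every
quote and comma splits the line, so the domain lands in the fifth
field. A small sketch on a made-up CSV extract shows it; the header and
rows are assumed to follow the "Rank","Domain","Open Page Rank" layout,
and csv_to_dnsperf is a hypothetical name.

```shell
#!/bin/sh
# Dry run of the dnsperf input generator on a made-up CSV extract.

# Print "<domain> A" for every row but the header, plus "<domain> MX"
# for domains starting with a letter from a to k.
csv_to_dnsperf() {
  LC_ALL=C awk -F '[",]' '{
    if ($5 != "Domain") { print $5 " A" }
    if ($5 ~ /^[a-k]/)  { print $5 " MX" }
  }'
}

csv_to_dnsperf <<'EOF'
"Rank","Domain","Open Page Rank"
"1","facebook.com","10.00"
"2","twitter.com","10.00"
EOF
```

With this sample, facebook.com gets both an A and an MX query line
while twitter.com only gets an A one, which is how the real run ends up
with more queries than domains.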

# dnsperf -s 192.168.0.1 -c 5 -d top100k.txt
Statistics:

  Queries sent:         145937
  Queries completed:    145821 (99.92%)
  Queries lost:         116 (0.08%)

  Response codes:       NOERROR 137226 (94.11%), SERVFAIL 278 (0.19%), NXDOMAIN 8317 (5.70%)
  Average packet size:  request 33, response 79
  Run time (s):         236.612169
  Queries per second:   616.286984

  Average Latency (s):  0.155683 (min 0.000112, max 4.990909)
  Latency StdDev (s):   0.300890

You can see that unbound(8) replies but is a bit out of breath. Not all
queries were served. And collectd seemed to have difficulty getting some
of the stats during such load.

Looking at the logs, warnings popped up:

warning: cannot increase max open fds from 512 to 4152
warning: continuing with less udp ports: 460
warning: increase ulimit or decrease threads, ports in config to remove this warning

This means my unbound configuration is not tuned properly for such a
load. In real conditions, I'm way below 8 req/s. So it'll be okay for me.

And that’s all for now!
