Intercepting t.co links using DNS rewrites
When somebody links to something on Twitter, either by embedding something or just pasting a URL, Twitter will front it with its own t.co link. This means that you cannot verify what the URL is until you click it and your browser reaches the end result via t.co. I only really noticed this properly when my DNS sinkholing server (AdGuard Home) started blocking t.co links and I was getting an error when, say, clicking a linked news article.
The obvious fix for this would be to just add t.co to my DNS allow list so these requests can go through. However, the fact that you cannot see the URL until you’ve already navigated through to it irks me a lot. I’d rather verify that the link I’m navigating to is something I want to visit.
There are browser extensions that solve this problem by modifying the DOM to uncloak the links (e.g. twitter-link-deobfuscator), which work quite well, but this solution is limited to the browser and doesn’t work in the Twitter app on Android. Other options are to copy the t.co link into a link uncloaking website, which is fiddly and annoying, or to install an app on your phone.
So I was looking for a more universal solution that works across devices. What if there was a way of “intercepting” a click on a t.co link, unwrapping where it eventually leads and presenting the user with an interstitial page detailing this, with an option to continue forward?
Enter the unwrapper, a small service I wrote this weekend that does exactly this, but it abuses a lot of the safeguards we have in place for the web, so it comes at a high price.
The Unwrapper
The service is just a Go server that, when receiving a request for t.co, makes a HEAD request to the shortened link and extracts the Location header before the redirect is followed. If the header is found, its value is returned and rendered on a simple interstitial page.
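The lookup step can be sketched in Go roughly as follows. This is a minimal illustration rather than the service's actual code: the names unwrapOnce and newFakeShortener are made up for the example, and a local test server stands in for t.co so the sketch runs without network access.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// unwrapOnce sends a HEAD request to shortURL and returns the Location
// header of the response without following the redirect.
func unwrapOnce(shortURL string) (string, error) {
	client := &http.Client{
		Timeout: 5 * time.Second,
		// Returning ErrUseLastResponse makes the client hand back the
		// redirect response itself instead of following it.
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}

	resp, err := client.Head(shortURL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	loc := resp.Header.Get("Location")
	if loc == "" {
		return "", fmt.Errorf("no Location header (status %d)", resp.StatusCode)
	}
	return loc, nil
}

// newFakeShortener is a hypothetical stand-in for a link shortener: it
// redirects every request to target.
func newFakeShortener(target string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, target, http.StatusMovedPermanently)
	}))
}

func main() {
	srv := newFakeShortener("https://example.com/article")
	defer srv.Close()

	target, err := unwrapOnce(srv.URL + "/abc123")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("destination:", target)
}
```

The important part is the CheckRedirect hook: without it, Go's HTTP client would silently follow the redirect and the intermediate Location value would never be seen.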
But how do you intercept the calls to t.co? The magic comes from using (abusing?) the DNS rewriting feature in AdGuard Home to intercept DNS requests for t.co and return the IP of my reverse proxy, and then adding a proxy rule to forward all requests for the host t.co to the Go service.
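The proxy rule amounts to routing on the request's Host header. The post doesn't say which reverse proxy is in use, so the following is only a sketch of the idea in Go, with hypothetical names (backendFor, newRouter) and an assumed backend address of 127.0.0.1:8080. The DNS rewrite itself is configured in AdGuard Home and is not shown here.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

// backendFor looks up the backend URL for a request's Host header value.
func backendFor(host string, routes map[string]string) (string, bool) {
	// Strip an optional port, e.g. "t.co:443" becomes "t.co".
	if h, _, err := net.SplitHostPort(host); err == nil {
		host = h
	}
	backend, ok := routes[strings.ToLower(host)]
	return backend, ok
}

// newRouter returns a handler that proxies each request to the backend
// registered for its Host header.
func newRouter(routes map[string]string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		backend, ok := backendFor(r.Host, routes)
		if !ok {
			http.Error(w, "unknown host", http.StatusBadGateway)
			return
		}
		target, err := url.Parse(backend)
		if err != nil {
			http.Error(w, "bad backend URL", http.StatusInternalServerError)
			return
		}
		httputil.NewSingleHostReverseProxy(target).ServeHTTP(w, r)
	})
}

func main() {
	// Assumed address of the unwrapper service; adjust to taste.
	routes := map[string]string{"t.co": "http://127.0.0.1:8080"}

	backend, ok := backendFor("t.co:443", routes)
	fmt.Println(backend, ok)

	// In a real setup this handler would be served over TLS.
	_ = newRouter(routes)
}
```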
Surprisingly, this worked quite well, though with a huge caveat: this is your run-of-the-mill Man in the Middle (MitM) attack. Browsers will, quite rightly, complain that the certificate presented by my reverse proxy isn’t valid for t.co, so the user should not proceed, or should proceed with extreme caution.
To mitigate this I did something bad. Well, I’m already doing something bad: I run my own self-signed Certificate Authority (CA) and my reverse proxy uses certs signed by this CA. This root CA is trusted on my devices so I can access various internal services that run on my network when I’m connected to it, or via VPN when I’m remote.
With the badness in place, I figured why not add to it by adding t.co to the Subject Alternative Name on the cert for my reverse proxy? Now the browser has no problem and doesn’t complain anymore.
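To see why the extra Subject Alternative Name silences the warning, here is a small Go sketch that builds a throwaway CA, signs a leaf certificate whose SAN list includes t.co, and then verifies it the way a client trusting that CA would. Everything here is an assumption for illustration (the names issueProxyCert and validFor, the hostname proxy.home.example, the key type and lifetimes); the post doesn't describe how the real certificates are issued.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// issueProxyCert creates a throwaway root CA and a leaf certificate signed
// by it whose Subject Alternative Name lists dnsNames. It returns the leaf
// and a pool that trusts only that CA.
func issueProxyCert(dnsNames []string) (*x509.Certificate, *x509.CertPool, error) {
	caKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	caTmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "Example Home CA"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	// Self-signed: the template is its own parent.
	caDER, err := x509.CreateCertificate(rand.Reader, caTmpl, caTmpl, &caKey.PublicKey, caKey)
	if err != nil {
		return nil, nil, err
	}
	caCert, err := x509.ParseCertificate(caDER)
	if err != nil {
		return nil, nil, err
	}

	leafKey, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	leafTmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: "reverse-proxy"},
		DNSNames:     dnsNames, // the SAN entries, including the hijacked name
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	leafDER, err := x509.CreateCertificate(rand.Reader, leafTmpl, caCert, &leafKey.PublicKey, caKey)
	if err != nil {
		return nil, nil, err
	}
	leaf, err := x509.ParseCertificate(leafDER)
	if err != nil {
		return nil, nil, err
	}

	roots := x509.NewCertPool()
	roots.AddCert(caCert)
	return leaf, roots, nil
}

// validFor reports whether a client trusting roots would accept leaf for host.
func validFor(leaf *x509.Certificate, roots *x509.CertPool, host string) error {
	_, err := leaf.Verify(x509.VerifyOptions{DNSName: host, Roots: roots})
	return err
}

func main() {
	leaf, roots, err := issueProxyCert([]string{"proxy.home.example", "t.co"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("t.co:", validFor(leaf, roots, "t.co"))
	fmt.Println("bit.ly:", validFor(leaf, roots, "bit.ly"))
}
```

Any device that trusts the CA will accept the certificate for every name in the SAN list, which is exactly what makes the trick work and what makes it dangerous.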
This is obviously a terrible solution and not recommended, but it works, and it works on all my things, including the Twitter app on my phone. It’s bad though. I’m probably overlooking a lot of other security issues that I’ve just opened myself up to.
During my testing, another interesting issue quickly arose: a lot of the links cloaked under t.co tend to be links to other link shortening services like bit.ly, buff.ly and trib.al (as it turns out, the list is endless), meaning the obfuscated link issue still remains.
Going deeper
So to get around this I extended the service to work with a number of “known” URL shortening services and “follow” the trail until it reaches the end. Most of them work the same way, a simple redirect and a Location header, so following the trail is really just finding where the chain stops, and throwing an error if a cycle is detected.
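The trail-following logic might look something like the Go sketch below. It is an illustration under stated assumptions, not the service's real code: followTrail, resolveFunc, the hop limit and the list of known hosts are all hypothetical, and the lookup of a single hop is injected as a function so the example runs without network access.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

var (
	ErrCycle       = errors.New("redirect cycle detected")
	ErrTooManyHops = errors.New("too many hops")
)

// resolveFunc returns the Location a shortened URL points at, or "" if the
// URL does not redirect anywhere.
type resolveFunc func(u string) (string, error)

// followTrail follows redirects from start for as long as the current URL
// belongs to a known shortener. It returns every URL visited, in order.
func followTrail(start string, known map[string]bool, resolve resolveFunc, maxHops int) ([]string, error) {
	trail := []string{start}
	seen := map[string]bool{start: true}
	current := start

	for {
		u, err := url.Parse(current)
		if err != nil {
			return trail, err
		}
		if !known[u.Hostname()] {
			return trail, nil // not a shortener: the chain stops here
		}
		if len(trail)-1 >= maxHops {
			return trail, ErrTooManyHops
		}

		next, err := resolve(current)
		if err != nil {
			return trail, err
		}
		if next == "" {
			return trail, nil // no redirect: treat as the end of the chain
		}
		if seen[next] {
			return trail, ErrCycle
		}
		seen[next] = true
		trail = append(trail, next)
		current = next
	}
}

func main() {
	// Canned redirects standing in for real HEAD requests.
	hops := map[string]string{
		"https://t.co/a":   "https://bit.ly/b",
		"https://bit.ly/b": "https://example.com/story",
	}
	resolve := func(u string) (string, error) { return hops[u], nil }
	known := map[string]bool{"t.co": true, "bit.ly": true}

	trail, err := followTrail("https://t.co/a", known, resolve, 10)
	fmt.Println(trail, err)
}
```

The hop limit is an extra guard that the post doesn't mention; cycle detection alone catches loops, but a cap also bounds very long chains.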
It’s actually quite funny to see how many hops some of these links can take you on. The deepest I’ve seen is a link from NYTimes taking you on a nine-hop journey around various link shortening services. I’m guessing people are pasting short URLs to other short URLs into social media distribution platforms, which just adds to the chain.
Bonus feature
With the “real” link URL now available, it’s often polluted with a lot of query parameters used for tracking or affiliate descriptors, so the service also strips these out and presents a “cleaned” version of the link alongside the original, which I can then choose to click on or copy to send to others.
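A cleaning step along these lines can be sketched in Go as follows. The post doesn't list which parameters the service removes, so the utm_ prefix rule and the trackingParams list here are assumptions chosen for illustration, as is the function name cleanURL.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// trackingParams is an illustrative, non-exhaustive list of query
// parameters commonly used for tracking.
var trackingParams = map[string]bool{
	"fbclid": true,
	"gclid":  true,
	"mc_cid": true,
	"mc_eid": true,
}

// cleanURL returns raw with known tracking parameters removed from its
// query string. Remaining parameters come back sorted by key.
func cleanURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}

	q := u.Query()
	for key := range q {
		lower := strings.ToLower(key)
		if strings.HasPrefix(lower, "utm_") || trackingParams[lower] {
			q.Del(key)
		}
	}
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	original := "https://example.com/story?id=42&utm_source=twitter&utm_medium=social&fbclid=abc"
	cleaned, err := cleanURL(original)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("original:", original)
	fmt.Println("cleaned: ", cleaned)
}
```

A deny list like this keeps parameters the destination site may actually need (such as id above), at the cost of missing trackers that aren't on the list.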
Lessons learned
I’m not sure whether I’m going to run this system full time, as the obvious security safeguards put in place have just been overridden with hacky self-signed certificates, DNS rewriting and other bad things, but it’s been a useful exercise in exploring what’s possible, even if it is utterly awful. I should probably just unblock t.co on my adblocker and put the cowboy hacks aside for the time being.