FrogFind in trouble!

I saw this post from Action Retro over on Twitter, and it just reminds me why we can’t have nice things on the internet. Some jerk is always going to destroy stuff, simply because it takes effort to create, while tearing things down only shows impotent rage.


FROGFIND UPDATE: Someone has been hammering the site with a scraper using rotating proxies that I can’t block. It’s pegging my server CPU at 100%. This has caused a service interruption. Working on solutions :'(

So annoying!

However, the source to FrogFind is available on GitHub, so why not self-host a copy?

Since I’m using WordPress (not even touching that crazy WP drama!) I’ve already got a Linux machine running PHP/Apache (good olde LAMP!), so I thought this would be simple: make a new virtual server, git clone the repo, add a new entry to the olde DNS & HAProxy servers, and I’d be ready to go live!

[Tue Nov 19 11:35:50.601570 2024] [php:warn] [pid 25997:tid 25997] [client 127.0.0.1:49854] PHP Warning:  require_once(vendor/autoload.php): Failed to open stream: No such file or directory in /srv/www/FrogFind/index.php on line 2

Well, that was disappointing.

Looking at the repo there is this composer.json, so obviously it’s got something to do with that.

Looking around there is an incredible amount of information for installing & using it in your projects, but I don’t know why I struggled to find out how to actually use/deploy it. But it’s super simple, and of course it just needs you to curl & run stuff, which of course will never be used as a vector to spread malware. Never.

Installing Composer into the local directory is as simple as:

curl -sS https://getcomposer.org/installer | php

And next, just run it with the following flags:

./composer.phar update --no-dev

And that’s it! It’ll get everything needed, and now hitting the site:

Just like that!
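If you would rather catch this before Apache starts complaining, a small pre-flight check is easy to sketch. This is only an illustration built from the warning above: the function name composer_deps_missing is made up here and is not part of FrogFind or Composer.

```python
import tempfile
from pathlib import Path


def composer_deps_missing(site_root: str) -> bool:
    """Return True if the site declares Composer dependencies that are not installed.

    Mirrors the warning above: index.php requires vendor/autoload.php,
    which only exists after Composer has been run.
    """
    root = Path(site_root)
    has_manifest = (root / "composer.json").is_file()
    has_autoloader = (root / "vendor" / "autoload.php").is_file()
    return has_manifest and not has_autoloader


if __name__ == "__main__":
    # Demo against a throwaway directory standing in for a fresh git clone.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "composer.json").write_text("{}")
        print(composer_deps_missing(tmp))
```

Pointing it at the real checkout (for me that would be /srv/www/FrogFind) tells you whether Composer still needs to be run.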

Since mine sits behind Cloudflare, it locks out old browsers. I wonder about modifying some proxy to speak two flavours of SSL: ancient SSL towards old browsers, and ‘modern’ crypto towards the upstream. It sure wouldn’t load new pages, but it could act as a gateway to services like a FrogFind hosted behind something like Cloudflare.

I’d encourage people to run a locally hosted copy of FrogFind on their own LANs. At least that way you won’t be interrupted by jerks.

Getting Graylog onto the internet

A while back I made a small post about getting Graylog running on Windows. It was fun, as it’s just Java so you know it should be portable, and other than some weird disk access thing it does seem to run fine.

Of course, the next step is to create a dashboard to replace what I used on wp-statistics, as it was crashing, taking up 100% of my CPU, and exceeding PHP’s 12GB of RAM per process limit. You know things are messed up when I’m replacing you with not one, but 2 Java apps! (Graylog & OpenSearch).

magical dashboard

It’s by no means perfect, but the guide, How to set-up graylog geoip configuration, is all around great to have. The rest of it is me learning how to do aggregate searches and simple lists, to see the latest hits and 404’s, count the pages, and build a graph.

Again, this is all good, now for the real question, how to get this onto the Internet?!

The first thing to do is enable CORS. It’s for being on the internet!

http_enable_cors = true

Next enable the external URI name

http_external_uri = https://dash.board.com/

And now the changes I had to make in my haproxy config

frontend http-in
        acl host_graylog hdr(host) -i dash.board.com
        http-request set-uri %[path,regsub(/api/api,/api)] if host_graylog
        use_backend graylog if host_graylog

backend graylog
        option forwardfor
        http-request add-header X-Forwarded-Host %[req.hdr(host)]

        acl bad_ct path_end .js
        http-response set-header Content-Type application/javascript if bad_ct

        http-request set-header X-Graylog-Server-URL http://192.168.33.5:9000
        server graylog 192.168.23.33:9000 maxconn 20 check

I kind of wish I saved the logs while going crazy but YES for some reason it’ll try to reference itself as /api/api. I don’t know why, so I had to do some uri regex to fix that. Neat!

Next for some reason Graylog responds that all .js (javascript) files are actually text. Chrome doesn’t allow that to work, so yes you need to set the content type header to “application/javascript” for Chrome to be happy.
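To make the two workarounds concrete, here is a rough Python model of what those HAProxy rules do to a request path. It is not HAProxy, just the same logic in miniature. The function names and sample paths are invented for illustration, and it assumes regsub without flags replaces only the first match.

```python
import re


def collapse_doubled_api(path: str) -> str:
    """Model of regsub(/api/api,/api): replace the first /api/api with /api."""
    return re.sub(r"/api/api", "/api", path, count=1)


def content_type_override(path: str):
    """Model of the bad_ct rule: force a JavaScript type on paths ending in .js."""
    if path.endswith(".js"):
        return "application/javascript"
    return None  # leave whatever the backend sent


if __name__ == "__main__":
    # Hypothetical request paths, only for illustration.
    for p in ("/api/api/system", "/api/system", "/assets/app.js"):
        print(p, "->", collapse_doubled_api(p), content_type_override(p))
```

Paths that already have a single /api pass through untouched, so the rewrite is safe to leave in place even if Graylog stops doubling the prefix.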

I had wasted over an hour with this and couldn’t get it working. So, I walked away for a few hours, and it suddenly was working. I think Cloudflare was doing some caching against it.

This is probably too terse to be really useful, and I lost all the pages I was reading about setting stuff in haproxy as I was doing that incognito. Oops. I pieced this config together out of fragments from five other people’s stuff. There are other considerations for hosting it on a subdirectory of a public site, but I just wanted to K.I.S.S.

Web Rendering Proxy (WRP) 4.5.2

(This is a guest post by Antoni Sawicki aka Tenox)

Pleased to announce WRP version 4.5.2. This is just a bug fix release; however, it also contains two frequently requested features:

UI customization via HTML template file. This has been requested by many users and it makes total sense. To use it, download wrp.html from GitHub, place it in the same directory as the wrp binary and edit to your liking. WRP will load the built-in version if the file is not present.

This should enable easy development of a more modern UI for newer browsers. Potentially with JS and CSS. Please send a PR if you make something!

Second most frequently asked feature – re-capture (retake?) of a screenshot without page reload. For example if the page did not capture correctly or if something is changing on the page.

I have also updated Docker Hub and gcr.io repos.

Sun IPX using WRP at VCF West

As usual please test and report bugs!

The next update will focus on issues with page size, viewport and rendering full length pages (h=0) which is currently very broken.

Web Rendering Proxy – Full Page Scrolling

(This is a guest post by Antoni Sawicki aka Tenox)

Due to popular demand I have added an option to generate a full page height screenshot and let the client browser do the scrolling.

https://youtu.be/lDqrPxkOFlI

This makes the browsing experience much smoother, if you have the resources for it. Beware, a full page screenshot can be several MB in size encoded as gif/png and much more as a decoded raw bitmap on the client. I managed to crash Mosaic and OmniWeb a few times. Fortunately a typical Wikipedia page is under 1 MB so for the most part it should be fine. To activate, just put 0 in page Height.
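To get a feel for the "much more as a decoded raw bitmap" part, here is a back-of-the-envelope calculation. The page dimensions are hypothetical, and the formula ignores row padding and any per-image overhead the client adds.

```python
def raw_bitmap_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of an uncompressed bitmap, ignoring row padding and headers."""
    return width * height * bits_per_pixel // 8


if __name__ == "__main__":
    # Hypothetical page: 1152 pixels wide, 10000 pixels tall.
    for bpp in (8, 24):
        size = raw_bitmap_bytes(1152, 10000, bpp)
        print(f"{bpp} bpp: {size / 1_000_000:.1f} MB")
```

For that made-up page the decoded image comes to roughly 11.5 MB at 8 bits per pixel and 34.6 MB at 24, which is a lot to ask of a vintage machine.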

I have drafted a pre-release on GitHub for testing. Please let me know any feedback. I’m also thinking about whether to enable this by default, or not.

WRP 4.0 Preview

(This is a guest post from Antoni Sawicki aka Tenox)

Welcome a completely new and absolutely insane mode of Web Rendering Proxy. ISMAP on steroids!

While v3.0 was largely just a port from Python/Webkit to GoLang/Chromedp, the new version is a whole new game. Previously WRP worked by walking the DOM and making a clickable imagemap out of <A HREF> nodes. Version 4.0 works by using x,y coordinates obtained from ISMAP to perform a simulated mouse click in Chrome browser. This way you can click on any element of the page. From annoying cookie warnings, to various drop down menus and even play some online games. Also pagination has been replaced with a clickable scroll bar.
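For anyone unfamiliar with ISMAP: when an image inside a link carries the ISMAP attribute, the browser appends the click position to the link URL as ?x,y. The sketch below only shows how such a query string could be decoded. WRP itself is written in Go, so this Python function (and its name) is purely illustrative and not WRP's code.

```python
def parse_ismap_query(query: str):
    """Parse the 'x,y' that a browser appends after '?' for an ISMAP click."""
    x_text, y_text = query.split(",", 1)
    return int(x_text), int(y_text)


if __name__ == "__main__":
    # A click 120 px from the left and 45 px from the top of the image
    # arrives as a request ending in '?120,45'.
    print(parse_ismap_query("120,45"))
```

Once the server has the pair of integers, it can replay the click at the same position in the real browser.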

Enough talking, you can watch this video:

Or download the new version and try it yourself!

Please report bugs on github.com. Thank you!

WRP 3.0 Beta ready for testing

(This is a guest post from Antoni Sawicki aka Tenox)

I have released WRP 3.0 for testing. It’s currently a browser-in-browser server rather than a true proxy, but that’s in the works. Please try it out and let me know. Usage instructions are on the main github project page.

Today using trickery I was able to login to my reddit account from Mosaic:

Update: just added the missing image quantizer so that the color number input box actually does something useful. Now you can browse porn even with 16 colors:

WRP Runs on Windows

(This is a guest post by Antoni Sawicki aka Tenox)

That’s right, the new beta version of Web Rendering Proxy runs natively on Windows. Single EXE, no libraries or dependencies required. Only Chrome Browser.

I took Internet Explorer 1.5 for a spin today while WRP was running on my Windows 10 PC. Worked just fine.

I have added Prev/Next buttons so that you can easily “scroll” through long pages.

ISMAP support has been added, proof:

You can download a preview build on github.

Web Rendering Proxy – Overdue Status Update

(This is a guest post from Antoni Sawicki aka Tenox)

There hasn’t been a major update to WRP (Web Rendering Proxy) in 5 years or so. Some new features have been added thanks to the efforts of Claunia, but the whole project was mostly impeded by the mass migration of the whole Internet to SSL/TLS/https. It does semi work somehow thanks to sslstrip, but the whole stack is an unmaintainable pile of crap which I’m not going to update any more.

A new rewrite from scratch is well under way. This time written in GoLang and using Chrome DevTools Protocol. Things should be much more stable and future proof.

Far from complete but I have a fully functional prototype now working in just under 100 lines of code:

UPDATE 1: You can play with it if you want. Please do not submit any bug reports just yet, as this is just a development version. Note that WRP is currently not a true HTTP proxy but rather browser-in-browser. Proxy may be supported later.

UPDATE 2: As of today online setting of size, scaling and scrolling is supported. I’m specifically happy about the scrolling feature, although it probably needs better user input, like prev/next page.

Windows version still doesn’t work due to an upstream bug, which is probably easy to fix.

ISMAP is currently in development.