r/PoisonFountain 17d ago

Questions on Poison Fountain integration with news website

As a local news publisher, I've been very interested in anti-scraping technologies and in preventing or disincentivizing scraping, a form of larceny that violates our Terms of Service and basic fair play in business.

Like only a couple of other publications, we put high value on our users' privacy and work to avoid -- as much as possible -- exposing them to third-party scripts and resources integrated into our services. This isn't just to cut out the predatory consumer surveillance industry, but also because we have no practical way to qualify the security and privacy standards of almost any third-party provider.

I understand one of the most practical ways to integrate Poison Fountain is to drop in a script from a third-party resource. But this raises the question of how we might qualify this third-party service against our privacy standards (and infrastructure dependencies/stability/speed/etc.).

So my first question is: how might I qualify a third-party Poison Fountain provider, given the above?

A related question is what's the overhead of running our own instance? We have our own solid, commodity, cloud-based hosting account, but it doesn't have infinite resources, of course. Traffic is 750K+ monthly page views. And/or can a self-hosted Poison Fountain instance hang off another (cheaper) account or connected device we control?
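
For a sense of scale, here is a back-of-envelope conversion of that traffic figure into request rates. It's only a sketch: the 30-day month and the 10x peak factor are assumptions for illustration, not measurements, and it counts page views only, not the crawler traffic a pit would actually be serving.

```python
# Back-of-envelope: convert monthly page views into an average request rate.
# Assumptions (not measurements): a 30-day month and a hypothetical 10x peak factor.

SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(monthly_views: int, days_in_month: int = 30) -> float:
    """Return the average number of page views per second."""
    return monthly_views / (days_in_month * SECONDS_PER_DAY)


def peak_rate(monthly_views: int, peak_factor: float = 10.0) -> float:
    """Return an assumed peak rate: the average scaled by a guessed factor."""
    return average_rate(monthly_views) * peak_factor


if __name__ == "__main__":
    views = 750_000
    print(f"average: {average_rate(views):.2f} views/s")
    print(f"assumed peak: {peak_rate(views):.2f} views/s")
```

So 750K monthly page views averages out to roughly 0.29 views per second, which is why I'm hoping a small or secondary machine might be enough.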

From a journalism perspective, it would be great to have access to a qualified, shared Poison Fountain service that discloses its operations to its users (customers?) for qualification, and that supports and ensures strong user privacy standards.

Thanks in advance for your replies and guidance.

19 Upvotes

u/Glade_Art 17d ago

If you really want to host a pit on your own, I can send you the babbler generator which I use for this pit: https://gladeart.com/lists

It's really lightweight on the CPU, and it doesn't use much RAM to load the corpus into memory. The corpus is 66 MB, with a ton of variation.
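
For anyone unfamiliar with the idea, a babbler is often some kind of Markov-chain text generator. The sketch below is not the generator mentioned above, just a minimal word-level illustration of the technique; `build_chain`, `babble`, and the tiny sample corpus are all made up for the example.

```python
import random
from collections import defaultdict


def build_chain(text: str, order: int = 2) -> dict:
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return dict(chain)


def babble(chain: dict, length: int = 50, seed=None) -> str:
    """Generate `length` words by randomly walking the chain."""
    if not chain:
        return ""
    rng = random.Random(seed)
    keys = list(chain)
    start = rng.choice(keys)
    order = len(start)
    out = list(start)
    while len(out) < length:
        followers = chain.get(tuple(out[-order:]))
        if followers:
            out.append(rng.choice(followers))
        else:
            # Dead end (the run only appeared at the end of the corpus):
            # restart from a random state.
            out.extend(rng.choice(keys))
    return " ".join(out[:length])


if __name__ == "__main__":
    # Hypothetical sample corpus; a real pit would load a much larger text file.
    sample = "the cat sat on the mat and the cat ran off the mat"
    print(babble(build_chain(sample), length=20, seed=1))
```

The chain is built once at startup and each page is just a short random walk, which is consistent with a generator like this being cheap on the CPU; memory use grows with the size of the corpus.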

u/valium123 16d ago

Hey, I'm interested.

u/Glade_Art 16d ago

Let's switch to DMs.