Due to European legal concerns (GDPR), some of our customers are wary of Webflow hosting. We wanted a solution that lets us keep designing and developing in Webflow without being bound to its hosting.
What was our idea?
Download all the content from Webflow and serve it from a web server located in Europe.
Using the Node library website-scraper (website-scraper - npm), we download the site whenever it is published. After the download succeeds, we use GitHub Actions to upload all the content to the target server.
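A minimal sketch of that download step, assuming the npm package website-scraper; the site URL, output directory, and option values here are placeholders, not the POC's actual configuration:

```javascript
// npm install website-scraper
// Builds the options object passed to website-scraper; the field names
// (urls, directory, recursive) are part of the library's documented API.
function buildScraperOptions(siteUrl, outDir) {
  return {
    urls: [siteUrl],      // the published Webflow site to mirror
    directory: outDir,    // local target; website-scraper expects it to not exist yet
    recursive: true,      // follow internal links so the whole site is downloaded
    maxRecursiveDepth: 3, // keep the crawl bounded
  };
}

// Usage (needs network access and the npm package, so left commented out):
// const scrape = require('website-scraper');
// scrape(buildScraperOptions('https://your-site.webflow.io', './dist'))
//   .then(() => console.log('download complete'));
```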
This is what we did, and we can present our proof of concept here:
The project's README explains in detail which steps are needed to make it work.
How does it work?
Webflow offers the option to integrate a webhook on publish. We wrote a small PHP script that authenticates the user on GitHub and runs the GitHub workflow. The workflow then runs the scraper and uploads the changes to your server.
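The trigger step could look roughly like this in JavaScript (the original uses a small PHP script; the owner, repo, workflow file, and token handling here are placeholder assumptions). It calls GitHub's documented `workflow_dispatch` REST endpoint:

```javascript
// Builds the GitHub REST endpoint for manually dispatching a workflow.
function dispatchUrl(owner, repo, workflowFile) {
  return `https://api.github.com/repos/${owner}/${repo}/actions/workflows/${workflowFile}/dispatches`;
}

// Fires the workflow; GitHub answers "204 No Content" on success.
// Requires a token with permission to run workflows in the repo.
async function triggerWorkflow(owner, repo, workflowFile, token) {
  const res = await fetch(dispatchUrl(owner, repo, workflowFile), {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${token}`,
      Accept: 'application/vnd.github+json',
    },
    body: JSON.stringify({ ref: 'main' }), // branch the workflow runs on
  });
  return res.status === 204;
}
```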
Time-based publishing of CMS content won't trigger the publish webhook, so we added a scheduler to the GitHub workflow that runs every night.
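Combined, the two workflow triggers could look like this (a sketch; the workflow name, step order, and script paths are hypothetical, not taken from the POC):

```yaml
name: mirror-webflow-site
on:
  workflow_dispatch:      # fired by the publish-webhook handler
  schedule:
    - cron: '0 2 * * *'   # nightly run to catch time-based CMS publishing
jobs:
  mirror:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: node scrape.js   # hypothetical scraper entry point
      - run: ./deploy.sh      # hypothetical upload step to your server
```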
We know this is not an easy topic. It will also increase the cost of your website hosting, since you not only need a paid plan on Webflow but also a web hosting package. Additionally, you have to set up this script for every site, which adds effort. We just wanted to do some R&D on possible solutions, and this is what we came up with.
Soooo… what is your opinion on this solution? I am looking forward to your feedback!
Disclaimer: I am a colleague of @konnisoelch, so I know the underlying thoughts very well.
@flashsites: This looks extremely promising, actually exactly the solution we were looking for and what we had in mind with the proof of concept that @konnisoelch published. We will definitely test it right away.
We did not know about your solution until now. We researched in this direction a few weeks ago (probably before your announcement) and found nothing, which is why we sat down and built this POC. But I think it's great that we are apparently not the only ones thinking along these lines, so the approach seems to be confirmed.
We answered it like this: to use CMS items or Editors (we need them in almost every project), you need at least the CMS plan anyway. So it's more of an add-on, and you don't take anything away from Webflow?
I am definitely not a legal expert, but I guess this is simply a gray area. You probably use a headless Chrome (Puppeteer) or similar, and then it is still a scraper or bot in my opinion. In the end, it's probably just a matter of interpretation.
For us, this is not about saving costs. Having a paid site plan for each website is completely OK; it's really all about solving the GDPR issue.
In light of the recent banning of Stacket, I asked Webflow support about the method mentioned above:
Any service that uses a “page scraping” technology is against our Terms of Service. I would not recommend using any service that is “page-scraping” or sometimes called “website-scraping”.
Also…these services are probably not using official Webflow APIs to transfer your data. As such, we cannot vouch for your data’s security/privacy/authenticity and your site may be exploited or compromised. Further, we cannot guarantee that their service will continue to operate without interruption.