Review workflow to bulk compress 600+ images in collection list and set alt-texts

Hi all,

Last year I built a site for a client, a classic car company, and transferred all the previously sold classics from the old site to the new one, around 600 items.

Each classic has about 40 images and they weren’t optimized, with file sizes of 500kb per image, some more. Since the launch, new classics that are for sale are added in .WebP with a file size of about 150kb.

Since the audience is expanding, bandwidth usage has grown, which has led to much higher hosting costs. I started optimizing the sold classics manually, by removing 32 of the images per classic, leaving just the best 8, and converting them to WebP with an alt text, so also adding SEO value as a bonus.

But my question is whether there’s a better way: I watched this tutorial on Sygnal by @memetican and am on the verge of trying the batch optimization tool on the full collection, but I just want to check if this is the right workflow, since it’s a critical step:

  • Manual back-up of Webflow
  • Remove 32 images of sold classics
  • Batch optimize to .AVIF, so approximately 570 * 40 = 22.800 images…
  • Add alt-texts to the remaining images manually, since I didn’t find a way to let AI do this for a multi-image field

Can someone review this workflow? It’s such a large operation that it can really mess up a site if something goes wrong. For instance, the tutorial says that .AVIF output takes a very long time, so maybe the scope of this project is way too big?

I also found this tutorial by thelazygod on YouTube, where he uses Make to compress images and set alt texts using AI, but he doesn’t mention whether it’s possible for the multi-image field. Since I’ve never used Make or anything like it, I wonder if someone can shine a light on this.

So very curious to hear how you would set this up. Thanks in advance!


Here is my public share link: LINK

Hi Tookster, a few thoughts.

Due to the size of your optimization effort, I’d probably do it outside of Webflow. The reason is that your images are large and I’d expect the AVIF conversion to trip up in large batches of large files. Webflow doesn’t have any reporting system for its optimizer so there’s no way for you to know where it had problems and what needs attention.

Also, Webflow’s CSV import/export does not support alt text on multi-image fields.
However, the API does. For example, here’s an update operation setting alt text on an existing multi-image field.

  "fieldData": {
    "images": [
      {
        "alt": "Serene mountain landscape with dramatic clouds at sunset",
        "url": "https://cdn.prod.website-files.com/67b584978f140ef6402c77a0/682c316848481c69d31fcc72_600.jpeg",
        "fileId": "682c316848481c69d31fcc72"
      }
    ]
  }
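In practice that payload can be sent with a plain HTTP PATCH to the v2 Data API. A minimal sketch using only the Python standard library; the token, collection ID, and item ID are placeholders, and the `images` field slug is an assumption (use whatever slug your multi-image field actually has):

```python
# Sketch: updating alt text on a multi-image field via the Webflow Data API v2.
# Token, collection ID and item ID are placeholders; the "images" slug is an
# assumption - match it to your own field's slug.
import json
import urllib.request

def build_alt_payload(images):
    """Build the fieldData body for a multi-image alt-text update.
    `images` is a list of (file_id, url, alt) tuples."""
    return {
        "fieldData": {
            "images": [
                {"fileId": fid, "url": url, "alt": alt}
                for fid, url, alt in images
            ]
        }
    }

def update_item_images(token, collection_id, item_id, images):
    """PATCH the item with the new image alt texts and return the response."""
    req = urllib.request.Request(
        f"https://api.webflow.com/v2/collections/{collection_id}/items/{item_id}",
        data=json.dumps(build_alt_payload(images)).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="PATCH",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Note that the existing `url` and `fileId` are echoed back unchanged; only the `alt` value is new.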

What I’d do is write a program that walks your collection, pulls your multi-image fields, downloads the media, and converts/re-uploads as AVIFs. As part of that process you can create/update your alt text or even use AI to do it.

The basic setup is straightforward, but you’d need a bunch of pieces;

  • A working list, probably a local CSV file export of the table
  • The AVIF converter; I like XnConvert and it has command-line support
  • Possibly a temporary host for your AVIF images so that they can be imported from a public URL
  • Possibly an LLM API service that can do image URL to alt text generation
  • The controller, probably python scripts, handles
    • Parsing and walking the CSV worklist, and removing items it completes
    • Downloading the images for the current item
    • AVIF conversion using a tool
    • Upload of the compressed image. I think this would be to temporary public storage like an S3 bucket, so that Webflow can access and download it via a public URL
    • Possible LLM API integration to describe the image
    • API call to update the record ( one call per item )
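The worklist plumbing for such a controller can stay very simple. A sketch, assuming a CSV export with an `Item ID` column (match whatever column your export actually has); `process_item` is a stub standing in for the download/convert/upload/update steps:

```python
# Sketch of the CSV worklist handling for such a controller. The "Item ID"
# column name is an assumption; process_item() is a stub for the real
# download -> AVIF convert -> upload -> API update pipeline.
import csv

def load_worklist(path):
    """Read the remaining item IDs from the exported CSV."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row["Item ID"] for row in csv.DictReader(f)]

def remaining_after(worklist, completed):
    """Items still to do, so a rerun skips work already finished."""
    done = set(completed)
    return [item for item in worklist if item not in done]

def run(worklist, process_item):
    """Process each item; return the IDs that failed so they can be retried."""
    failed = []
    for item_id in worklist:
        try:
            process_item(item_id)  # download, convert, upload, API update
        except Exception:
            failed.append(item_id)
    return failed
```

Collecting failures instead of aborting is what gives you the reporting that the built-in optimizer lacks: at the end of a run you hold an exact list of items that still need attention.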

You could have it ignore any existing AVIFs and existing alt text, so that if you ever needed to rerun the whole thing you don’t need to worry about redoing existing work.

For future additions/updates, I’d probably just make certain your processes already do a good job of optimizing and adding the alts before you add new records, to keep life simple.

Hey, thanks for your reply. Watching your tutorial, I was already guessing you would not advise compressing all the images inside of Webflow.
I do have to say your solution is way over my head in technical terms; I don’t have a coding background, so I’ll have to dive into this. Would this also be possible from a graphical interface, for instance via Make? These kinds of environments are easier for me, I guess.

For now thanks a lot, I’m going to discuss this with my client.

Hi memetican,

A few months later, and I’ve finally gathered the skills, knowledge and courage to start optimizing. Your suggestions are still a bridge too far for me in terms of my technical abilities, but I’ve found another way that does the same but is a bit less technical, working with Make.com and Bunny.net. (I’m also writing this down in case others need to optimize assets in collection lists or, more generally, minimize bandwidth usage.)

First of all, I was able to host the hero background video at Bunny.net, saving 20-25 GB per month. This was the largest file in the usage table, so that was a great optimization. An extra bonus is the improved resolution and quality: the native Webflow module scales video down to, I think, 720p (not sure about the precise number), and now I could fiddle with the length and resolution until I got it to about 12MB with a smooth 1920x1080 resolution. I used this tutorial, in case someone is interested.

Then I set up a scenario in Make.com, where the Webflow List Items module retrieves the data of the 650 collectibles. Via the HTTP Request module the ShortPixel API is called, which returns the optimized URL of the spotlight image; that URL can then be pushed back into the Webflow item using the same ID. As a bonus I was able to set an alt text for each image using the brand, year and color of each car, helping SEO a bit too.

ShortPixel is, by the way, cheaper than TinyPNG, which also requires multiple API calls if you want to convert the image to AVIF and resize it as well. I also couldn’t get that to work by sequencing the HTTP modules, so I decided to stick with ShortPixel. The only thing I’m wrestling with is that ShortPixel needs you to run the scenario twice, since the optimization process takes time and on the first run it gives a “code 1 - image is scheduled for processing”. I’m now waiting a minute and have set a wait time of 15 seconds per item, which gives decent results. Not 100% of the images are converted, I don’t know exactly why, but for this purpose it’s fine; once I’m a bit more skilled in Make I’ll retrieve the list of images that haven’t been optimized and convert them afterwards.

One of the hardest parts of learning automation is getting exactly the data you need with functions and regex. I’m still learning, but I can already tell it’s super powerful in combination with Webflow. At the moment I’m optimizing 50 or 100 spotlight images per batch; it’s great to have this control, which is exactly what I was missing with the Webflow option to optimize all images at once.
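For anyone reproducing this outside Make, the two-pass retry behaviour can be sketched in a few lines. The response shape here (a `Status.Code` of 1 meaning “scheduled for processing”, 2 meaning done, and an `OriginalURL` field) is an assumption based on ShortPixel’s reducer API responses; verify it against their documentation before relying on it:

```python
# Sketch of the ShortPixel retry logic described above. The response shape
# (Status.Code: 1 = scheduled for processing, 2 = done; OriginalURL field)
# is an assumption based on ShortPixel's reducer API - check their docs.

def needs_retry(result):
    """True if the image is still queued and the call must be repeated."""
    return str(result.get("Status", {}).get("Code")) == "1"

def is_done(result):
    """True once ShortPixel reports the optimized file is ready."""
    return str(result.get("Status", {}).get("Code")) == "2"

def collect_pending(results):
    """URLs not finished in this pass, to rerun in a follow-up batch."""
    return [r["OriginalURL"] for r in results if not is_done(r)]
```

The `collect_pending` helper is the scripted equivalent of the “retrieve the list of images that haven’t been optimized and convert them afterwards” step.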

When this is done I will try to do the same with the multi-image fields. Here, retrieving the URLs of the images and putting them in an array so each of them can be converted and optimized is the most difficult part, but I’m getting there.

So all in all making progress. Two questions, if someone can answer these:

  1. the images that have used the most bandwidth have been loaded over 10.000 or 20.000 times (see first image below). The actual clicks to the collectibles’ pages are far, far fewer, looking at Google Search Console (second image below). What’s causing all the image loads? Google search results? Sorry for this noob question, still learning…
  2. Would it be possible to host all the images of, for instance, the multi-image fields on Bunny.net?

Webflow usage table

Google Search Console table

All in all getting there. I’m still super mad my client had to upgrade to a new bandwidth plan, which is super expensive. It was my rookie mistake not to have optimized the old images from the previous website, but I think they also increased the price dramatically last year. My client and I are still debating what to do next year; migrating to another platform is a serious consideration. Happy if the above steps help us come down a few steps in the bandwidth site plan, though…

** Adding this out of order, for others who read this thread later, as it hadn’t posted from my edit window

You can definitely try it; the built-in tools are useful, just expect a lot of waiting, retrying, and problems with larger files that “break”. It’s the lack of feedback I really struggle with: you don’t know if it’s running, or stuck on a file, or what it skipped, etc.

I prefer a lot more control and visibility.

Clone your site first so that you have a full copy with all of your uncompressed originals.

It will just be time-consuming to continually rerun it and try to determine what it has succeeded at.

Very nice! From my perspective you actually took a more complex route by using Make and a 3rd party compressor, but those are hugely valuable skills. Approaching it with an automated setup could also be useful for the site owner, and also for optimizing any automated content ingress feeds. Really great stuff!

:clap: Bunny is great, excellent choice. For vids there is also Bunnystream which supports streaming vid options for quick playback on larger vids, if you should ever need it.

Thanks for sharing shortpixel as well, very good to know.

Depends entirely on the image. Typically it would be;

  • Image is used in multiple places, on multiple pages ( header, footer, cta, components ) and it just adds up
  • Scripts are causing it to reload, e.g. a large slider could potentially re-retrieve images depending on its configuration and the browser cache settings. Look especially for scripts that keep running while the page is open in a tab.
  • RSS feeds; if you have readers continually scanning an RSS feed, some may keep re-downloading assets and in RSS the assets shared are the original uncompressed images- likely to support older reader tech that can’t do AVIFs.
  • If you’re using SEO optimization tools to help you craft content, they can also recurringly grab content and I’ve seen some do complete image downloads as well [not sure why, as they rarely analyze images- but perhaps AI alt tag generation is becoming more standard]

Then there are secondary possibilities like leeching- e.g. someone else puts a URL to your image or vid in their site, or a web forum, etc, and it just gets hit recurringly.

Unfortunately it’s difficult to trace because it’s not possible to put a reverse proxy directly in front of Webflow’s asset CDN, since you don’t own that domain.

Not easily, for a lot of reasons.

Your first problem would be synchronizing your assets over to bunny which means quite a bit more automation work. Then suppose you have two consistent URLs, something like;

Webflow CDN- https://cdn.prod.website-files.com/672312f4fccc306fc10c4893/6723132c2028179b53b055ca_somefile.png

Bunny CDN- https://cdn.bunny.net/6723132c2028179b53b055ca_somefile.png

If you’re consistent it’s easy to guess the Bunny URL by taking the Webflow asset URL, swapping the origin, and removing the site ID from the path.
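That guess is a tiny string transform. A sketch in Python; the `cdn.bunny.net` hostname is just the placeholder from the example above (you’d use your own pull-zone hostname):

```python
# Sketch of the URL guess described above: swap the origin and drop the
# site-ID path segment. cdn.bunny.net is the placeholder hostname from
# this thread; substitute your own Bunny pull-zone hostname.

WEBFLOW_ORIGIN = "https://cdn.prod.website-files.com/"
BUNNY_ORIGIN = "https://cdn.bunny.net/"

def bunny_url(webflow_url):
    """Map a Webflow asset URL to its guessed Bunny equivalent."""
    path = webflow_url.removeprefix(WEBFLOW_ORIGIN)
    site_id, _, filename = path.partition("/")  # drop the site ID segment
    return BUNNY_ORIGIN + filename

print(bunny_url(
    "https://cdn.prod.website-files.com/672312f4fccc306fc10c4893/"
    "6723132c2028179b53b055ca_somefile.png"
))
# -> https://cdn.bunny.net/6723132c2028179b53b055ca_somefile.png
```

Because Webflow asset filenames already start with a unique file ID, dropping the site ID still leaves collision-safe names on the Bunny side.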

Then you need to reform all of your multi-image-linked image elements so that they try to load Bunny’s first, and fall back to Webflow’s.

It’s a bit messy to do that transform and you’d need to R&D it a bit; however, you could try adding an attribute like preferred-origin = https://cdn.bunny.net ( or whatever ), make the image loading lazy, and then at page load find those img[preferred-origin] elements and restructure them something like this;

<img
  src="https://example.com/preferred.jpg"
  data-fallback="https://example.com/fallback.jpg"
  onerror="if(!this.dataset.swapped){this.dataset.swapped=1;this.src=this.dataset.fallback}"
  alt=""
>

The idea would be to make the src point to Bunny, but to handle image-load errors and fallback to Webflow’s CDN.

A bit gnarly.

I much prefer to use a reverse proxy for this type of work but that ups the infrastructure complexity level significantly.

Hi, thanks for your extensive reply.

Very nice! From my perspective you actually took a more complex route by using Make and a 3rd party compressor, but those are hugely valuable skills.

Thanks for your encouraging words. Yes, I can imagine there’s a more direct dev route towards this, but somehow this flowchart UI, in combination with the Make Academy, was the first time this part of the development process ‘clicked’ in my brain. It took me some time, but I can now say working with APIs isn’t intimidating anymore and is actually a lot of fun :grin: . At some point I’ll probably start working with these things in a development/coding environment.

Good to hear you like Bunny, and glad I could point you to a new service in naming ShortPixel. I had some contact with them to help set up the API; super friendly people. I still have to find out why some images aren’t converting when processing a larger batch; it probably has something to do with the wait time, but I’m still investigating.

  • RSS feeds; if you have readers continually scanning an RSS feed, some may keep re-downloading assets and in RSS the assets shared are the original uncompressed images- likely to support older reader tech that can’t do AVIFs.

This might actually be the culprit! The client is a seller of classic cars, and the classics are offered on several platforms. I just talked to my client and he insists that these platforms work independently of the ‘native’ page on our site where the classic is offered, but it may well be that other automotive platforms are continually scanning the web, retrieving data without us knowing. Add to this that search engines display the spotlight image of the classic, so this might add up, right? I mean, 48 actual clicks on the page compared to 16.629 loads, as GSC shows for the Land Rover Defender 90 Custom; such a big difference has to come from something like this, right?

Thanks for shining your light on hosting all images on Bunny. That sounds like a bridge too far for me, but I get the concept. I think converting all Multi Image Field images to AVIF will be sufficient, so we won’t have to start thinking about migrating to another platform (I actually think optimizing the spotlight images alone will be enough, because of the above-mentioned phenomenon of the many calls from outside the site that only download this image).

All in all thanks so much for helping me in this process, so great to get your insights!

ps. For the sake of completeness of this forum topic, I’m just adding a Youtube tutorial here of someone optimizing images via Make.com and Tinify.com, which helped me a lot in understanding the workflow: here’s the link.


GSC won’t show site traffic- only keyword traffic. But if you’re using an analytics package like GA4 and (a) it’s showing anything close to 48 clicks on that page and (b) that’s the only page on your site which references that image ( no headers, footers, components, pop-ups, CMS RSS feeds ), then that strongly suggests leeching.

It’s quite possible your client’s resellers, or various ad publications are linking directly to your own hosted copy of that image and generating that bandwidth traffic. While it does probably indirectly benefit your client ( unless they’re marketing other listings with that image ), it’s still an unwarranted expense.

Keep going with your optimization work; but a few things-

  1. Check any Webflow RSS feeds you’re publishing and promoting, and see if you can find those high-traffic image URLs with them.
  2. After optimization, watch your bandwidth reports ( filter to today only ) to see if those larger JPEGs disappear. If not, something’s still accessing them and it’s not your own HTML.

If you continue to run into problems here, look me up. Optimization is crucial, but in some cases it’s not enough if you’re also getting traffic from off-site sources. I set up web application firewalls that can let you see where your HTML traffic is originating, and block it. Assets are trickier, but if you’re using RSS we can protect your bandwidth by proxying it, monitoring traffic, modifying the image URLs and caching the assets.

Coincidentally, I know of another auto-industry database that had to completely block AI bots like Claude because it was generating far too much traffic and actually affecting their server performance and bandwidth costs ( not on Webflow ). So maybe bots are extra aggressive on certain site types.

Thanks again!

I will show this to my client. We recently started working with an online marketing agency doing SEO; they’ve started installing analytics tools and have some good developers who have already given some good insights and tweaks. I’ll ask them if they can see anything in the data (I can access it but don’t know how to interpret it) and I might come back to you after talking to them, thanks for the offer. It’s indeed possible that a lot of bots are at work in automotive (it’s a very competitive scene…).

I just checked the bandwidth usage and the curve is flattening a lot, so that’s good news. Still a couple of large JPEGs loaded today, but I have to check whether they may have skipped the optimization, since there were a few dropouts; nothing serious. So I’m getting more optimistic, apart from the possible leeching.
