I have a collection with 1,000+ items. Is there a way to check a Collection for duplicate values in a field? For example, given a list of URLs, is it possible to see if the same URL is listed twice?
Or… Do I need to:
Download my data
Use Excel to search for the duplicates
Delete ALL the data in my collection (100 records at a time because there is no way that I have found, other than completely deleting the collection, to delete all records in one shot.)
Import my data to build the collection all over again?
I apologize if I sound grumpy. But when you start to find the limitations of Webflow, those limitations start to get very frustrating after all the hours of development that you have put in to understand and build your site.
Webflow’s API has next to no support for querying specific content, so a task like finding duplicate values in a field is an adventure in creative problem solving.
Good for one-time, occasional checks:
OPTION 1 - Use a Spreadsheet or Database tool
Download the CSV, load it into a spreadsheet, and use its own tools to find duplicates. This still has some hurdles: most spreadsheet solutions do this with conditional formatting, which means you still have to read through the whole sheet to find the highlighted duplicate rows.
Loading into a tool like Access or Airtable gives you better tools for finding the duplicates, but the load process is generally a bit more work, as you need to specify field types.
OPTION 2 - Use Python or Awk
Download the CSV and use a command-line tool like awk, or load it into a pandas DataFrame in Python.
Requires light programming knowledge and the necessary tools.
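As a rough illustration of Option 2, here is a minimal sketch that uses only Python's standard library. The column names "Slug" and "URL" are assumptions about the exported CSV, and the inline sample data is made up; swap in the column names from your own export. The light normalization (trimming whitespace, lowercasing, dropping a trailing slash) is also just one possible choice.

```python
import csv
import io
from collections import defaultdict


def find_duplicates(csv_file, field, id_field="Slug"):
    """Return {value: [ids]} for every value of `field` that occurs more than once."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Normalize lightly so "https://example.com/one/" matches "https://example.com/one".
        key = row[field].strip().lower().rstrip("/")
        if key:  # skip items where the field is empty
            groups[key].append(row[id_field])
    return {value: ids for value, ids in groups.items() if len(ids) > 1}


# Hypothetical sample standing in for the CSV exported from the collection.
sample = io.StringIO(
    "Name,Slug,URL\n"
    "A,a,https://example.com/one\n"
    "B,b,https://example.com/two\n"
    "C,c,https://example.com/one/\n"
)

duplicates = find_duplicates(sample, "URL")
for value, slugs in duplicates.items():
    print(value, "->", ", ".join(slugs))
```

For a real export, open the file with `open("export.csv", newline="", encoding="utf-8")` (the filename is a placeholder) and pass the file object in place of `sample`. If you prefer pandas, `df[df.duplicated("URL", keep=False)]` should give a similar listing of every row whose URL appears more than once.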
Good for regular or automated checks:
OPTION 3 - Build a duplicate-check page in your Webflow site
Build a special page in your Webflow site. Have it load all 1,000+ items, showing just the slug field and the field(s) you need to monitor for duplicates. Sort on the monitored field, then have a script iterate from the end of the list and remove any non-duplicates from the page, so that all you have left is a list of duplicates. Visit the page any time you need to check. There are some challenges here: getting all of the data in requires a tool like Finsweet’s CMS Load More, and you have to wait for the data to finish loading before you do your sort & delete.
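The sort-and-filter logic for that page can be sketched as follows. This is only the logic, written in Python to keep the examples in one language; on the actual page it would have to be ported to browser JavaScript and run against the rendered list once loading has finished. The `slug` and `url` keys and the sample items are hypothetical.

```python
def keep_only_duplicates(items, field):
    """Sort items by `field` and keep only those whose value matches a neighbor."""
    ordered = sorted(items, key=lambda item: item[field])
    values = [item[field] for item in ordered]
    last = len(values) - 1
    return [
        item
        for i, item in enumerate(ordered)
        # After sorting, equal values sit next to each other, so an item is a
        # duplicate exactly when it equals the item before or after it.
        if (i > 0 and values[i] == values[i - 1])
        or (i < last and values[i] == values[i + 1])
    ]


# Hypothetical items standing in for the list rendered on the page.
items = [
    {"slug": "a", "url": "example.com/1"},
    {"slug": "b", "url": "example.com/2"},
    {"slug": "c", "url": "example.com/1"},
    {"slug": "d", "url": "example.com/3"},
]

remaining = keep_only_duplicates(items, "url")
print([item["slug"] for item in remaining])
```

This version builds a new list rather than deleting in place, which avoids index bookkeeping; walking backwards and removing elements, as described above, gives the same result.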
OPTION 4 - Sync the CMS to Airtable and automate the check
If you need a real-time / automated solution for monitoring duplicates, you can use Whalesync or Powerimporter Pro to sync your CMS tables with Airtable, and have an automated process there check for dupes and alert you to any it finds.
A bit more here, and some Python / Awk examples to work with;