Client product feeds and their associated product images are the fuel for the Slyce Visual Search platform. The responses we return include everything in the matching product feed records. There may be instances where you need more information than your product feed contains to build out your product listing page (PLP). In these instances, our clients typically make a secondary, internal API call to gather the information not found in the feed.
We ingest a wide variety of feeds for our clients today. The majority are tab-delimited text files, but we also handle standard CSV and XML. Some feeds are much easier to ingest and get running than others. We'd like to speed up the feed ingestion process, and the following guidelines help us get the data we need quickly.
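As a rough illustration of the tab-delimited and CSV cases, the sketch below parses a small feed into one dictionary per product using Python's standard `csv` module. The column names (`product_id`, `name`, `image_url`, `product_url`), the `read_feed` helper, and the example.com URLs are assumptions made up for this example; they are not a Slyce requirement and this is not Slyce's actual ingestion code.

```python
import csv
import io


def read_feed(text, delimiter=None):
    """Parse a delimited feed into a list of dicts keyed by the header row."""
    if delimiter is None:
        # Guess the delimiter from the header line: a tab wins over a comma.
        lines = text.splitlines()
        header = lines[0] if lines else ""
        delimiter = "\t" if "\t" in header else ","
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=delimiter)
    return [dict(row) for row in reader]


# Hypothetical two-line feed: one header row and one product record.
sample = (
    "product_id\tname\timage_url\tproduct_url\n"
    "SKU-1\tBlue Tee\thttps://example.com/img/1.jpg\thttps://example.com/p/1\n"
)
records = read_feed(sample)
print(records[0]["name"])  # Blue Tee
```

Because every value stays keyed by whatever header the file uses, a reader like this works regardless of how a client names its columns.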
Let's start by breaking down what Slyce needs in a feed for a good refinement result. Please note that our ingestion process is schemaless, so there are no required field naming conventions.
Product Image URL (required)
- The best case is a clear product image against a white or otherwise consistent background.
- Results are best when all products are photographed consistently.
- Multiple images can help identify items from multiple angles.
- Image size is important. We need a minimum size of 500px by 500px for effective image processing. Smaller images can be included in the feed to be passed back as thumbnails.
- Adequately sized images help our deep feature and color extraction processes be more accurate.
Product Name (required)
- This should be an easily searchable string
Product URL (required)
- Used to link to the product. This is almost always a link to the product detail page.
Product ID (required)
- A unique product identifier
Product Category
- Ideally there are a couple of levels of category, with one more generic than the other, e.g. Shirts > Tee
- Used to narrow down the results to filter on
- This type of metadata is extremely helpful in building relevant results
Gender
- Used to send back gender-relevant results
- Used to make products searchable, e.g. Mens Shirt
- Only required for fashion retailers
Product Description
- This is a short text summary of the product
Color
- A primary color name that is easy to search
- We perform a color extraction step on incoming images, but it is good to also have a value here from the retailer when available.
Brand
- We use this for searchability and relevance
- Sometimes we use this to keep our extracted keywords from matching brand names, which improves results.
Product Barcode (UPC Codes)
- Used for server-side, and soon client-side, automatic product detection
- Here are the barcode types we are currently able to read:
- 1D barcodes: EAN-13, EAN-8, UPC-A, UPC-E, Code-39, Code-93, Code-128, ITF, Codabar
- 2D barcodes: QR Code, Data Matrix, PDF-417, Aztec
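Before sending a feed, it can help to run a quick self-check against the fields listed above. The following is a minimal sketch, assuming a hypothetical client whose columns are named `image_link`, `title`, `link`, `id`, and `upc`. Since ingestion is schemaless, `FIELD_MAP` is only a stand-in for your own column names, and `validate_record` is an illustrative helper, not part of any Slyce tooling. The check-digit test uses the standard GS1 modulo-10 rule and only applies to 12-digit UPC-A and 13-digit EAN-13 values.

```python
# Hypothetical mapping from the required fields to one client's column names.
FIELD_MAP = {
    "image_url": "image_link",
    "name": "title",
    "product_url": "link",
    "product_id": "id",
}


def gtin_check_digit_ok(code):
    """Return True if a UPC-A (12-digit) or EAN-13 code has a valid check digit."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (12, 13):
        return False
    digits = [int(c) for c in code]
    body, check = digits[:-1], digits[-1]
    # Weights alternate 3, 1, 3, ... starting at the digit next to the check digit.
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


def validate_record(record, field_map=FIELD_MAP, barcode_field="upc"):
    """Return a list of problems found in one feed record (empty if none)."""
    problems = []
    for logical_name, column in field_map.items():
        if not (record.get(column) or "").strip():
            problems.append(f"missing {logical_name}")
    barcode = (record.get(barcode_field) or "").strip()
    # Only 12- and 13-digit numeric codes are checked; other types pass through.
    if barcode.isdigit() and len(barcode) in (12, 13) and not gtin_check_digit_ok(barcode):
        problems.append("invalid barcode check digit")
    return problems


record = {
    "id": "SKU-1",
    "title": "Blue Tee",
    "link": "https://example.com/p/1",
    "image_link": "",
    "upc": "036000291453",
}
print(validate_record(record))  # ['missing image_url', 'invalid barcode check digit']
```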
Now let's break down the things that make it difficult to build feed scripts or to generate good refinement results. These are the things we want to avoid.
A flat file with all data for a product in one record is ideal. Having to merge multiple data sources can cause problems. In some cases, you may have to provide multiple files. We'll work with you to see if we can accommodate this.
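If your data lives in more than one source, flattening it before export could look roughly like the sketch below. It assumes every source shares a `product_id` key. The `flatten_sources` helper and the sample rows are hypothetical, and the rule that the primary source wins on conflicting fields is just one reasonable choice.

```python
def flatten_sources(primary, *extras, key="product_id"):
    """Merge rows from several sources into one flat record per product ID."""
    merged = {row[key]: dict(row) for row in primary}
    for source in extras:
        for row in source:
            target = merged.get(row[key])
            if target is None:
                continue  # skip rows that have no matching primary record
            for field, value in row.items():
                # Values already present (from the primary source) are kept.
                target.setdefault(field, value)
    return list(merged.values())


# Hypothetical sources: a catalog export and a separate image export.
catalog = [
    {"product_id": "SKU-1", "name": "Blue Tee"},
    {"product_id": "SKU-2", "name": "Red Hat"},
]
images = [
    {"product_id": "SKU-1", "image_url": "https://example.com/img/1.jpg"},
    {"product_id": "SKU-9", "image_url": "https://example.com/img/9.jpg"},
]
flat = flatten_sources(catalog, images)
print(flat[0])
```

In this example SKU-2 ends up without an `image_url`, which is exactly the kind of gap that is easier to spot in a single flat file than across several.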
Image Processing Issues
We have noticed that clients sometimes block our servers. This can happen because of how often we load their images in our search interface, but it is more likely due to our color and feature extraction processes. In some instances, we may need you to whitelist our GPU server IPs before we begin ingesting your feeds.
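How whitelisting is configured depends entirely on your own infrastructure (firewall, CDN, or application-level rate limiter). Purely as an illustration of the application-level case, the sketch below uses Python's standard `ipaddress` module to decide whether a request should bypass rate limiting. The networks shown are placeholders from the reserved documentation ranges, not real Slyce addresses, and `is_allowlisted` is a hypothetical helper. The actual IPs would need to come from Slyce.

```python
import ipaddress

# Placeholder networks from reserved documentation ranges, NOT real Slyce IPs.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/28"),
    ipaddress.ip_network("198.51.100.17/32"),
]


def is_allowlisted(client_ip, networks=ALLOWED_NETWORKS):
    """Return True if client_ip falls inside any allowlisted network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)


print(is_allowlisted("192.0.2.10"))   # True
print(is_allowlisted("203.0.113.5"))  # False
```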
Marketing Text on Product Images
Occasionally, we see product images coming in with marketing text included, such as stylized text like “NEW!” or “HOT!” over the product imagery. This can have adverse effects on image ingestion and cause false colors and categories to be returned. It can also lead to inconsistent result sets.
Sample Feed Links
Here is a small snippet of a feed that has what we need.