Introduction
This document describes the steps to bulk download all images requested by VOR. The script takes a CSV as input, downloads one image per row, and groups the images into folders by brand name.
Folder structure:
/downloads
/downloads/brandA/image1.jpg
/downloads/brandA/image2.jpg
/downloads/brandB/image1.jpg
/downloads/brandB/image2.jpg
CSV Format
The script requires two columns: "Brand" and "Image URL". You can add any number of other columns.
Example CSV:
Brand,Part Name,Part Number,Image URL,Image Order
DV8 Offroad,Front bumper,FBCS1-01,https://cdn.shopify.com/s/files/1/1118/3812/products/FBCS1-01_Chevrolet_Silverado_1500_front_bumper_DV8_1024x1024.jpg,1
DV8 Offroad,Front bumper,FBCS1-01,https://cdn.shopify.com/s/files/1/1118/3812/products/DSC_0292b_Medium_1024x1024.jpg,3
DV8 Offroad,Front bumper,FBCS1-02,https://cdn.shopify.com/s/files/1/1118/3812/products/FBCS1-02_1024x1024.jpg,1
DV8 Offroad,Rear bumper,RBCS1-01,https://cdn.shopify.com/s/files/1/1118/3812/products/DSC_0298b_Medium_1024x1024.jpg,2
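Before uploading, it can help to confirm that the header row contains both required columns. The following is a minimal sketch, not part of the download script: check_required_columns is a hypothetical helper, and it assumes the header is on the first line, is comma-separated, and does not wrap column names in quotes.

```shell
#!/usr/bin/env bash
# Sketch: check that a CSV header contains the two required columns.
# check_required_columns is a hypothetical helper, not part of the download script.
# Assumes an unquoted, comma-separated header on the first line.
check_required_columns() {
  local csv_file="$1"
  local header
  local required
  # Read only the first line and strip carriage returns (common in Windows exports).
  header=$(head -n 1 "$csv_file" | tr -d '\r')
  for required in "Brand" "Image URL"; do
    # Wrap the header in commas so each name must match a whole field.
    case ",$header," in
      *",$required,"*) ;;
      *) echo "Missing required column: $required" >&2; return 1 ;;
    esac
  done
  return 0
}
```

For example, check_required_columns /path/to/request.csv returns 0 when both columns are present and prints the first missing column name otherwise.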
Steps to download images
1. Upload CSV to Server
Open your terminal and upload the CSV file to the server:
> scp /path/to/request.csv root@tools.netfluence.io:
2. Connect to the server, create a folder to collect all files, and run the script
> ssh root@tools.netfluence.io
> mkdir -p /root/image_crawler/$(date +%Y%m%d)
> cd /root/vor
> time php public/index.php /download-images GET "input=/root/dv8.csv&folder=/root/image_crawler/$(date +%Y%m%d)"
Replace /root/dv8.csv with the path of the CSV you uploaded in step 1.
3. Compile all files and folders into a single archive
Collect all files and folders and archive them into a single file for easier download.
> cd /root/image_crawler/
> tar -zcvf $(date +%Y%m%d).tgz $(date +%Y%m%d)
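The archive step can also be wrapped so that it reports how many files ended up in the archive. The following is a rough sketch under stated assumptions: archive_run is a hypothetical helper, and base_dir and run_date are its assumed arguments, not names used by the download script.

```shell
#!/usr/bin/env bash
# Sketch: archive one dated folder and report how many files the archive holds.
# archive_run is a hypothetical helper; base_dir and run_date are assumed arguments.
archive_run() {
  local base_dir="$1"
  local run_date="$2"
  # -C switches into base_dir so paths inside the archive start at the dated folder.
  tar -zcf "$base_dir/$run_date.tgz" -C "$base_dir" "$run_date" || return 1
  # List the archive and count entries that are not directories
  # (tar lists directory names with a trailing slash).
  tar -tzf "$base_dir/$run_date.tgz" | grep -vc '/$'
}
```

A possible usage is RUN_DATE=$(date +%Y%m%d) followed by archive_run /root/image_crawler "$RUN_DATE". Capturing the date once in a variable keeps the folder and archive names consistent if the steps run across midnight.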
4. Prepare for Download
This step moves the archive into the web server's public download folder and prints the URL where it can be downloaded.
> mv /root/image_crawler/$(date +%Y%m%d).tgz /var/www/tools/public/download/
> echo "Download available at https://tools.netfluence.io/download/$(date +%Y%m%d).tgz"
Downloading the failed downloads file:
In some cases an image URL cannot be downloaded, for example when the URL returns a 404 or the server refuses the connection. After execution, the script outputs a CSV containing all rows that were not downloaded.
To download the failed downloads file, follow the steps below:
> mv /file/path/of/faileddownload.csv /var/www/tools/public/download/
Then you can download the file by visiting this URL:
https://tools.netfluence.io/download/faileddownload.csv
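To see how many rows failed before sharing the file, a quick count can help. This is only a sketch: count_failed_rows is a hypothetical helper, and it assumes the failed downloads CSV starts with one header row, which this document does not confirm.

```shell
#!/usr/bin/env bash
# Sketch: count data rows in the failed downloads CSV.
# count_failed_rows is a hypothetical helper.
# Assumes the file starts with one header row (not confirmed by the script's docs).
count_failed_rows() {
  local csv_file="$1"
  # Skip the first line, ignore blank lines, and count what remains.
  tail -n +2 "$csv_file" | grep -c -v '^[[:space:]]*$'
}
```

For example, count_failed_rows /file/path/of/faileddownload.csv prints the number of rows that were not downloaded. If the file has no header row, the result is one lower than the real count.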