Netfluence Corporation

Knowledge Base

Bulk Image Downloads for VOR

Published: Nov 21, 2019 3:12:51 AM

 

Introduction

This document describes the steps to bulk download all images requested by VOR. The script takes a CSV file as input, downloads one image per row, and groups the downloaded files into folders by brand name.


Folder structure:

/downloads
/downloads/brandA/image1.jpg
/downloads/brandA/image2.jpg
/downloads/brandB/image1.jpg
/downloads/brandB/image2.jpg

 

CSV Format

The script requires two columns, "Brand" and "Image URL". Any number of additional columns may be included.

Example CSV:

Brand,Part Name,Part Number,Image URL,Image Order
DV8 Offroad,Front bumper,FBCS1-01,https://cdn.shopify.com/s/files/1/1118/3812/products/FBCS1-01_Chevrolet_Silverado_1500_front_bumper_DV8_1024x1024.jpg,1
DV8 Offroad,Front bumper,FBCS1-01,https://cdn.shopify.com/s/files/1/1118/3812/products/DSC_0292b_Medium_1024x1024.jpg,3
DV8 Offroad,Front bumper,FBCS1-02,https://cdn.shopify.com/s/files/1/1118/3812/products/FBCS1-02_1024x1024.jpg,1
DV8 Offroad,Rear bumper,RBCS1-01,https://cdn.shopify.com/s/files/1/1118/3812/products/DSC_0298b_Medium_1024x1024.jpg,2
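
Before uploading, it may help to confirm that the header row contains both required columns. The following is a minimal sketch and not part of the VOR script: the function name has_required_columns is hypothetical, and it assumes an exact, case-sensitive match on a plain comma-separated header with no quoted fields.

```shell
# Sketch: check that a CSV header contains the required columns.
# Assumption: column names match exactly and the header has no quoted commas.
has_required_columns() {
    # $1: path to the CSV file
    header=$(head -n 1 "$1" | tr -d '\r')
    for col in "Brand" "Image URL"; do
        # Wrap both sides in commas so only whole column names match.
        case ",$header," in
            *",$col,"*) ;;
            *) echo "missing column: $col" >&2; return 1 ;;
        esac
    done
    return 0
}
```

For example:

> has_required_columns /path/to/request.csv && echo "CSV looks OK"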

 

Steps to download images

1. Upload CSV to Server

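Open your terminal and upload the CSV file to the server. With no path after the colon, scp places the file in the remote user's home directory (/root here).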
> scp /path/to/request.csv root@tools.netfluence.io:

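2. Connect to the server, create an output folder, and run the script

The input parameter must point to the CSV uploaded in step 1 (/root/dv8.csv in this example; use the name of your own file). The folder parameter is the directory where the images are collected.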

> ssh root@tools.netfluence.io
> mkdir -p /root/image_crawler/$(date +%Y%m%d)
> cd /root/vor
> time php public/index.php /download-images GET "input=/root/dv8.csv&folder=/root/image_crawler/$(date +%Y%m%d)"

3. Compile all files and folders into a single archive

Archive the output folder and everything in it into a single file for easier download

> cd /root/image_crawler/
> tar -zcvf $(date +%Y%m%d).tgz $(date +%Y%m%d)

4. Prepare for Download

This step moves the archive into the public download directory and prints the URL from which it can be downloaded

> mv /root/image_crawler/$(date +%Y%m%d).tgz /var/www/tools/public/download/
> echo "Download available at https://tools.netfluence.io/download/$(date +%Y%m%d).tgz"
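
Each $(date +%Y%m%d) in the commands above is evaluated at the moment the command runs, so a download that finishes after midnight would produce a folder name and an archive name that no longer match. A minimal sketch of one way around this is to capture the date once and reuse it. The helper names archive_folder and download_url are hypothetical and not part of the VOR script.

```shell
# Sketch: helpers that take the date stamp as an argument instead of
# recomputing it with $(date ...) on every command.

# Create <base>/<stamp>.tgz from the folder <base>/<stamp>.
archive_folder() {
    # $1: base directory (for example /root/image_crawler)
    # $2: folder name, here the date stamp
    tar -zcf "$1/$2.tgz" -C "$1" "$2"
}

# Print the public URL of an archive placed in the download directory.
download_url() {
    # $1: date stamp used as the archive name
    echo "https://tools.netfluence.io/download/$1.tgz"
}
```

With these defined, steps 3 and 4 could be run as:

> stamp=$(date +%Y%m%d)
> archive_folder /root/image_crawler "$stamp"
> mv "/root/image_crawler/$stamp.tgz" /var/www/tools/public/download/
> echo "Download available at $(download_url "$stamp")"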

 

Downloading the list of failed downloads

Some image URLs cannot be downloaded, for example when the URL returns a 404 or the remote server refuses the connection. After execution, the script outputs a CSV containing every row that was not downloaded.

To download the failed-downloads file, move it into the public download directory, the same directory used for the archive in step 4:

> mv /file/path/of/faileddownload.csv /var/www/tools/public/download/

Then download the file by visiting this URL:


https://tools.netfluence.io/download/faileddownload.csv
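
Once the cause of the failures is understood, the failed URLs can be retried by hand. The sketch below only extracts the URLs. It assumes the failed-downloads CSV keeps the same header row as the input, including an "Image URL" column, and that no field contains a quoted comma. The function name image_urls is hypothetical.

```shell
# Sketch: print the "Image URL" value of every data row in a CSV.
# Assumption: plain comma-separated fields and a header row naming the column.
image_urls() {
    # $1: path to the CSV file
    awk -F',' '
        NR == 1 {
            # Find the position of the "Image URL" column in the header.
            for (i = 1; i <= NF; i++) {
                sub(/\r$/, "", $i)
                if ($i == "Image URL") col = i
            }
            next
        }
        # Print that column for each data row, dropping any trailing CR.
        col { v = $col; sub(/\r$/, "", v); print v }
    ' "$1"
}
```

A retry loop built on it could look like this, with curl saving each image under its remote file name in the current directory:

> image_urls /file/path/of/faileddownload.csv | while read -r url; do curl -fsSL -O "$url" || echo "still failing: $url"; done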