Product Images Download Script - Summary

✅ Script Status: READY TO USE

The script is complete and ready to use!

📁 Files Created

/scripts/download-product-images.js    # Main script
/scripts/README-product-images.md       # Documentation
/package.json                           # Dependencies
/temp-images/                           # Output directory (created)

🚀 Quick Start

1. Install Dependencies (Already done!)

npm install

2. Run Script

# Download all images from default CSV
npm run download:images

# Or directly with node
node scripts/download-product-images.js

# With custom CSV file
node scripts/download-product-images.js /path/to/custom.csv

📊 CSV Analysis Results

Analysis of bruno_vassari_products.csv:
  • Total products: 48
  • With image URLs: 8
  • Without image URLs: 40
Note: Only 8 products have image URLs in the CSV; the remaining 40 have none.

🎯 Script Features

✅ Implemented Features

  1. ✅ Parse CSV with regex (handles malformed CSV with unquoted commas in text)
  2. ✅ Download images with retry logic (3 retries)
  3. ✅ Organized folder structure (/temp-images/bruno-vassari/)
  4. ✅ Graceful error handling
  5. ✅ Generate mapping file (images-mapping.json)
  6. ✅ Support resume (skip already downloaded)
  7. ✅ Progress reporting with percentage
  8. ✅ Concurrent downloads (5 at a time)
  9. ✅ Timeout handling (30s)
  10. ✅ HTTP error handling (404, 500, timeout)

🎨 Output Format

File naming: product-{No}-{sanitized-filename}.jpg
/temp-images/bruno-vassari/
  product-1-kem-duong-phuc-hoi.jpg
  product-2-kem-chong-kich-ung.jpg
  ...
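
The naming pattern above could be produced by a helper along these lines. This is a minimal sketch, not the script's actual code: `sanitizeFilename` and `buildImageFilename` are hypothetical names, and it assumes the sanitized part is derived from a Vietnamese product name by stripping diacritics and replacing everything else with hyphens.

```javascript
// Hypothetical helper: the real script's sanitizer may differ.
function sanitizeFilename(name) {
  return name
    .normalize('NFD')                 // split letters from their combining accents
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining accents
    .replace(/đ/g, 'd')               // "đ" has no decomposition, map it by hand
    .replace(/Đ/g, 'D')
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')      // collapse any other run of characters into one hyphen
    .replace(/^-+|-+$/g, '');         // trim leading and trailing hyphens
}

// Assumption: the {No} part is the product number from the CSV.
function buildImageFilename(productNo, productName) {
  return `product-${productNo}-${sanitizeFilename(productName)}.jpg`;
}

console.log(buildImageFilename(1, 'Kem Dưỡng Phục Hồi'));
```

Normalizing to NFD first means accented and unaccented spellings of the same name map to the same file, which keeps the resume check stable between runs.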
Mapping file (images-mapping.json):
{
  "1": {
    "csv_number": "1",
    "product_name": "Bruno Vassari Skin Comfort Gentle Night Cream",
    "original_url": "https://sieuthilamdep.com/images/...",
    "local_path": "/temp-images/bruno-vassari/product-1.jpg",
    "downloaded": true,
    "skipped": false,
    "file_size": 14273
  }
}

🧪 Testing

Test 1: Single Image Download ✅

node scripts/test-single-download.js
Result: ✅ Success (13.93 KB)

Test 2: CSV Parsing ✅

node scripts/test-regex-parse.js
Result: ✅ Parsed 48 products, 8 with URLs

Test 3: Dry Run ✅

node scripts/test-download-script.js
Result: ✅ CSV analysis completed

📈 Expected Runtime

With 8 images to download:
  • Estimated time: ~30-60 seconds
  • Concurrent downloads: 5 at a time
  • Timeout per image: 30 seconds
  • Retry attempts: 3 per failed download
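
The limit of 5 concurrent downloads could be enforced with a small worker pool. The sketch below is one possible approach, not necessarily what the script does; `runWithConcurrency` and the `worker` callback are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: runs `worker` over `items` with at most `limit` tasks in flight.
// Failures are captured per item so one bad URL does not abort the whole batch.
async function runWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  async function lane() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await worker(items[index], index) };
      } catch (error) {
        results[index] = { ok: false, error };
      }
    }
  }

  // Start up to `limit` lanes; each lane pulls items until none are left.
  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => lane());
  await Promise.all(lanes);
  return results;
}
```

With 8 images and a limit of 5, five downloads start immediately and the remaining three begin as earlier ones finish, so a run with no failures needs roughly two download "rounds".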

🔧 Configuration

Edit the config at the top of the script if needed:
const CONFIG = {
  csvFile: './docs/bruno_vassari_products.csv',
  outputDir: './temp-images',
  brandFolder: 'bruno-vassari',
  mappingFile: './temp-images/images-mapping.json',
  timeout: 30000,      // 30 seconds
  maxRetries: 3,       // 3 retries
  retryDelay: 1000,    // 1 second
  concurrency: 5       // 5 concurrent
};

🐛 Troubleshooting

"No image URL" for many products?

  • Expected: the CSV has image URLs for only 8 of 48 products
  • Check: the CSV columns; an "Image URL" column containing URLs is required

Downloads fail?

  • Check: Internet connection
  • Check: URLs are accessible
  • Solution: the script automatically retries 3 times
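
The automatic retry could look roughly like the sketch below. It assumes "3 times" means three attempts in total with a fixed delay between them, mirroring the `maxRetries` and `retryDelay` config values; `withRetry` and `sleep` are hypothetical names, and the script's real retry loop may count attempts differently.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: calls `task` until it succeeds or `maxRetries` attempts are used up.
async function withRetry(task, { maxRetries = 3, retryDelay = 1000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxRetries) await sleep(retryDelay);
    }
  }
  throw lastError; // every attempt failed; surface the last error to the caller
}
```

A download function that rejects on HTTP errors or timeouts can be wrapped as `withRetry(() => downloadOnce(url))`, where `downloadOnce` stands in for whatever single-attempt function the script uses.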

"Permission denied"?

chmod +x scripts/download-product-images.js

📝 Next Steps

  1. Run script: npm run download:images
  2. Check output: ls -la temp-images/bruno-vassari/
  3. Review mapping: cat temp-images/images-mapping.json
  4. Handle failures: check the statistics report for failed downloads

📚 Additional Notes

CSV Format Handling

The script uses a regex-based parser to handle malformed CSV:
  • Commas inside text are not quoted (e.g. "Kem Giảm Kích Ứng, Mẩn Đỏ Da")
  • A regex extracts URLs directly from the raw line
  • This works around the limitations of standard CSV parsers
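
The approach described above can be sketched as follows. The patterns and the `parseProductLine` name are illustrative assumptions, not the script's actual code: the sketch assumes each data row starts with the product number and that image URLs end in a common image extension.

```javascript
// Assumed shape of an image URL: http(s), no whitespace/commas/quotes, image extension.
const IMAGE_URL_PATTERN = /https?:\/\/[^\s,"']+\.(?:jpe?g|png|gif|webp)/i;
// Assumed row prefix: the product number as the first column, optionally quoted.
const LEADING_NUMBER_PATTERN = /^\s*"?(\d+)"?\s*,/;

// Hypothetical helper: reads one raw CSV line without splitting it on commas,
// so unquoted commas inside product text cannot shift the columns.
function parseProductLine(line) {
  const numberMatch = line.match(LEADING_NUMBER_PATTERN);
  if (!numberMatch) return null; // header row or unrecognized line
  const urlMatch = line.match(IMAGE_URL_PATTERN);
  return {
    csvNumber: numberMatch[1],
    imageUrl: urlMatch ? urlMatch[0] : null,
  };
}
```

Because the line is never split into columns, a row such as `1,Some Cream,Kem Giảm Kích Ứng, Mẩn Đỏ Da,https://example.com/a.jpg` still yields the right URL. The trade-off is that URLs with query strings or unusual extensions would need a broader pattern.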

Resume Feature

Re-running the script skips images that were already downloaded:
# Run 1: Download 8 images
npm run download:images

# Run 2: Skip 8 downloaded, retry failed
npm run download:images
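
One way to implement the skip check is to test whether the target file already exists and is non-empty. This is a sketch under that assumption; `isAlreadyDownloaded` is a hypothetical name, and the actual script may instead consult the mapping file.

```javascript
const fs = require('node:fs');

// Hypothetical helper: treats a non-empty regular file as a completed download.
// An empty file is treated as a failed or interrupted download and is retried.
function isAlreadyDownloaded(filePath) {
  try {
    const stats = fs.statSync(filePath);
    return stats.isFile() && stats.size > 0;
  } catch (error) {
    if (error.code === 'ENOENT') return false; // file does not exist yet
    throw error; // permission problems and similar should not be hidden
  }
}
```

Checking the size as well as existence guards against zero-byte files left behind when an earlier run was interrupted mid-download.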

Error Logging

Failed URLs logged in:
  1. Console output
  2. Statistics report
  3. Mapping file with error details

Script Location: /Users/tannguyen/Documents/fullstack/triseo.drmanhlinhmd.com/scripts/download-product-images.js
Last Updated: 2025-01-10
Status: ✅ Ready for production use