Download Images From Instagram Using NodeJS and Puppeteer
22 March 2024
This article explains how to use Google's Puppeteer library with Node.js to download images from Instagram.
Downloading images from Instagram using Node.js and Puppeteer involves automating the process of navigating to Instagram, accessing the desired images, and saving them to your local machine. Here's a basic example of how you can achieve this.
As a running example, we will download images from Kim Kardashian's Instagram account (@kimkardashian).
What is Puppeteer?
Puppeteer is a Node.js library developed by Google that provides a high-level API over the Chrome DevTools Protocol. It allows you to control and automate Chromium or Chrome browser instances, enabling tasks such as web scraping, automated testing, taking screenshots, generating PDFs, and more.
Puppeteer provides a powerful set of features for interacting with web pages programmatically.
Setup Application
Step 1: Install Dependencies
First, create a Puppeteer configuration file and install the library.
Create the file .puppeteerrc.cjs:
```javascript
const { join } = require('path');

/**
 * @type {import("puppeteer").Configuration}
 */
module.exports = {
  // Changes the cache location for Puppeteer.
  cacheDirectory: join(__dirname, '.cache', 'puppeteer'),
};
```
Now run:
```shell
npm install puppeteer
```
Add to your package.json file:
```json
"type": "module"
```
Step 2: Test Puppeteer
We will attempt to create a screenshot using Puppeteer of a random post by Kim Kardashian (https://www.instagram.com/kimkardashian/p/C4lwwOYSpW-/?hl=en&img_index=1).
Create a JavaScript file, for example downloadInstagramImages.js, and write a short script to check that Puppeteer is working properly:
```javascript
import puppeteer from 'puppeteer';

async function run() {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  await page.goto('https://www.instagram.com/kimkardashian/p/C4lwwOYSpW-/?hl=en&img_index=1');
  await page.waitForSelector('section');
  await page.setViewport({ width: 1080, height: 1024 });
  await page.screenshot({ path: 'screen.png', fullPage: true });
  await browser.close();
}

run();
```
Now run the code:
```shell
node downloadInstagramImages.js
```
The script saves the screenshot to screen.png.
Step 3: Create Helper Functions
We need to create two functions: one to download an image from a source link and another to check if our destination folder already exists.
Function to check whether the destination folder exists
```javascript
const checkIfDirExists = (directory) => {
  return new Promise((resolve, reject) => {
    fs.access(directory, fs.constants.F_OK, (err) => {
      if (err) {
        // Directory doesn't exist, create it
        fs.mkdir(directory, { recursive: true }, (mkdirErr) => {
          if (mkdirErr) {
            console.error('Error creating directory:', mkdirErr);
            reject(mkdirErr);
          } else {
            console.log('Directory created successfully');
            resolve();
          }
        });
      } else {
        console.log('Directory already exists');
        resolve();
      }
      // No resolve() here: the promise must stay pending until mkdir finishes
    });
  });
};
```
You can also use another method to make sure the directory exists.
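One such alternative, sketched here as an assumption rather than as the article's own method, is to call mkdir from fs/promises with recursive: true. With that option, mkdir creates any missing parent folders and does not fail when the directory already exists, so no separate existence check is needed. The helper name ensureDir is hypothetical.

```javascript
import { mkdir } from 'node:fs/promises';

// Hypothetical helper (not part of the original script): make sure
// `directory` exists before anything is written into it.
// With { recursive: true }, mkdir creates missing parent folders and
// resolves without an error when the directory is already there.
async function ensureDir(directory) {
  await mkdir(directory, { recursive: true });
}
```

Inside the download function, `await ensureDir('images')` could then take the place of `checkIfDirExists('images')`.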
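Download function
```javascript
const download = (url, destination) => {
  return new Promise((resolve, reject) => {
    checkIfDirExists('images')
      .then(() => {
        const file = fs.createWriteStream(destination);
        // Report write errors (for example, a bad destination path)
        file.on('error', reject);

        https
          .get(url, (response) => {
            response.pipe(file);

            // Resolve only after the file has been flushed and closed
            file.on('finish', () => {
              file.close(() => resolve(true));
            });
          })
          .on('error', (error) => {
            // Remove the partially written file; unlink errors are ignored
            fs.unlink(destination, () => {});

            reject(error);
          });
      })
      // Propagate a failure to create the destination folder
      .catch(reject);
  });
};
```
Add new imports at the top of the file:
```javascript
import fs from 'fs';
import https from 'https';
```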
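As a possible variation on the download helper, the step that writes the response to disk can be expressed with pipeline from stream/promises, which resolves when the file is fully written and rejects if either stream fails. This is a minimal sketch under that assumption; saveStream is a hypothetical name, and `source` stands for any readable stream, such as the response object passed to the https.get callback.

```javascript
import { createWriteStream } from 'node:fs';
import { pipeline } from 'node:stream/promises';

// Hypothetical helper (not part of the original script): write a readable
// stream to `destination` and return the path once the write has finished.
// pipeline() rejects if the source or the file stream reports an error.
async function saveStream(source, destination) {
  await pipeline(source, createWriteStream(destination));
  return destination;
}
```

Because the helper returns a promise, a caller can use `await saveStream(response, 'images/photo.jpg')` inside a try/catch block instead of wiring up finish and error events by hand.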
Step 4: Write the Run Function
```javascript
async function run() {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  await page.goto('https://www.instagram.com/kimkardashian/p/C4lwwOYSpW-/?hl=en&img_index=1');
  await page.waitForSelector('section');
  await page.setViewport({ width: 1080, height: 1024 });
  await page.screenshot({ path: 'screen.png', fullPage: true });
  // Links found inside the post (collected for inspection, not used below)
  const links = await page.evaluate(() =>
    Array.from(document.querySelectorAll('article a'), (el) => el.href),
  );
  // Image URL, alt text and file name for every image in the post.
  // The _aagv class comes from Instagram's markup and may change over time.
  const images = await page.evaluate(() =>
    Array.from(document.querySelectorAll('article div[role=button] div._aagv img'), (img) => {
      return {
        imgUrl: img.src,
        alt: img.alt,
        slug: img.src.slice(img.src.lastIndexOf('/') + 1, img.src.lastIndexOf('.jpg') + 4),
      };
    }),
  );

  await browser.close();
  // Wait for every download to finish before run() resolves
  await Promise.all(images.map((img) => download(img.imgUrl, 'images/' + img.slug)));
}

run().catch((error) => console.error(error));
```
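The slug above is cut out of the image URL by searching for '.jpg', which yields an empty file name when an image uses a different extension. A more defensive variant, sketched here as an assumption and not as the article's method, parses the URL and takes the last path segment. The helper name fileNameFromUrl, the fallback name and the example.com URL are hypothetical.

```javascript
// Hypothetical helper (not part of the original script): derive a local
// file name from an image URL, whatever its extension.
function fileNameFromUrl(imageUrl, fallback = 'image.jpg') {
  // URL parsing separates the query string, so "?stp=..." never ends up in the name
  const { pathname } = new URL(imageUrl);
  const name = pathname.slice(pathname.lastIndexOf('/') + 1);
  // Use the fallback when the path ends with "/" and no name is left
  return name || fallback;
}

console.log(fileNameFromUrl('https://example.com/v/t51/12345_n.jpg?stp=dst-jpg&_nc_cat=1'));
// prints "12345_n.jpg"
```

In the script, this function would run in Node.js on the `imgUrl` values returned by page.evaluate, for example `'images/' + fileNameFromUrl(img.imgUrl)`.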
Step 5: Run the Script
Run the script using Node.js:
```shell
node downloadInstagramImages.js
```
Here is a complete example of the script:
```javascript
import fs from 'fs';
import https from 'https';
import puppeteer from 'puppeteer';

const checkIfDirExists = (directory) => {
  return new Promise((resolve, reject) => {
    fs.access(directory, fs.constants.F_OK, (err) => {
      if (err) {
        // Directory doesn't exist, create it
        fs.mkdir(directory, { recursive: true }, (mkdirErr) => {
          if (mkdirErr) {
            console.error('Error creating directory:', mkdirErr);
            reject(mkdirErr);
          } else {
            console.log('Directory created successfully');
            resolve();
          }
        });
      } else {
        console.log('Directory already exists');
        resolve();
      }
      // No resolve() here: the promise must stay pending until mkdir finishes
    });
  });
};

const download = (url, destination) => {
  return new Promise((resolve, reject) => {
    checkIfDirExists('images')
      .then(() => {
        const file = fs.createWriteStream(destination);
        // Report write errors (for example, a bad destination path)
        file.on('error', reject);

        https
          .get(url, (response) => {
            response.pipe(file);

            // Resolve only after the file has been flushed and closed
            file.on('finish', () => {
              file.close(() => resolve(true));
            });
          })
          .on('error', (error) => {
            // Remove the partially written file; unlink errors are ignored
            fs.unlink(destination, () => {});

            reject(error);
          });
      })
      // Propagate a failure to create the destination folder
      .catch(reject);
  });
};

async function run() {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  await page.goto('https://www.instagram.com/kimkardashian/p/C4lwwOYSpW-/?hl=en&img_index=1');
  await page.waitForSelector('section');
  await page.setViewport({ width: 1080, height: 1024 });
  await page.screenshot({ path: 'screen.png', fullPage: true });
  // Links found inside the post (collected for inspection, not used below)
  const links = await page.evaluate(() =>
    Array.from(document.querySelectorAll('article a'), (el) => el.href),
  );
  // Image URL, alt text and file name for every image in the post.
  // The _aagv class comes from Instagram's markup and may change over time.
  const images = await page.evaluate(() =>
    Array.from(document.querySelectorAll('article div[role=button] div._aagv img'), (img) => {
      return {
        imgUrl: img.src,
        alt: img.alt,
        slug: img.src.slice(img.src.lastIndexOf('/') + 1, img.src.lastIndexOf('.jpg') + 4),
      };
    }),
  );

  await browser.close();
  // Wait for every download to finish before run() resolves
  await Promise.all(images.map((img) => download(img.imgUrl, 'images/' + img.slug)));
}

run().catch((error) => console.error(error));
```
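When several images are downloaded in parallel, a single failed request should not hide the results of the others. A minimal sketch of one way to handle this, assuming a download function with the same (url, destination) signature as the one above, uses Promise.allSettled to collect every outcome and report the failures. The helper name downloadAll and the shape of its return value are hypothetical.

```javascript
// Hypothetical helper (not part of the original script): start one download
// per image and summarize how many succeeded and which ones failed.
// `downloadFn` stands in for the script's download(url, destination) function.
async function downloadAll(images, downloadFn) {
  // allSettled waits for every promise, whether it fulfills or rejects
  const results = await Promise.allSettled(
    images.map((img) => downloadFn(img.imgUrl, 'images/' + img.slug)),
  );
  const failed = results
    .map((result, index) => ({ result, img: images[index] }))
    .filter(({ result }) => result.status === 'rejected')
    .map(({ result, img }) => ({ url: img.imgUrl, reason: String(result.reason) }));
  return { saved: results.length - failed.length, failed };
}
```

In the run function, `const summary = await downloadAll(images, download)` could then be followed by a log of `summary.saved` and `summary.failed`.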
Conclusion:
Using Puppeteer, you can automate the process of downloading images from Instagram. However, keep in mind the legal and ethical considerations involved when accessing and downloading content from websites.