Extract Text From Images With N8N (No Code)

Introduction

In this article, we will explore a straightforward method to extract text from images using N8N, a no-code automation tool. This method leverages an API from a service called LlamaIndex to efficiently process image files and retrieve text content. Let's walk through the workflow step-by-step.

Demo Overview

To illustrate the text extraction process, I will first demonstrate the method using a receipt from my Google Drive.

Select an Image: I have uploaded a receipt image to my Google Drive that includes details like the date, item names, and prices.
View the Results: Upon executing the workflow, the tool will extract key information from the image, and the output will be displayed.

Workflow Breakdown

Here’s how the workflow is structured to accomplish the task:

1. Downloading the Image

First, we need to download the file from Google Drive. In the Google Drive node, authenticate with your Google account and specify the name of the file to download.

2. Extracting Text

To extract text, we utilize LlamaIndex's API. You will need to create an account at cloud.llamaindex.com, where you receive 1,000 free pages for processing daily. Generally, each image extraction consumes about 10 to 15 tokens.
After setting up your account, generate your API key. This key is needed for authentication in all subsequent API calls.

3. Creating the HTTP Request

In N8N, we will perform a POST request to send the image to the API. Here’s what you need to include:
- URL: Utilize the API endpoint specified in the LlamaIndex documentation.
- Authorization: You should include a header named “Authorization” with the value formatted as Bearer YOUR_API_TOKEN.
- Headers: In addition to the authorization header, include an “Accept” header set to application/json.
- Body:
  - The first parameter, named file, carries the binary data of the image.
  - The second parameter, named premium_mode, enables “premium mode” for better results.
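
For readers who want to see what the HTTP Request node sends, here is a minimal Python sketch of the same POST request. It only builds the request and does not send it. The endpoint URL is a placeholder (use the one from the LlamaIndex documentation), and the value "true" for premium_mode, the function name, and the generic file content type are assumptions for illustration.

```python
import urllib.request
import uuid


def build_upload_request(endpoint: str, api_token: str,
                         file_name: str, file_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) a multipart POST carrying the image."""
    boundary = uuid.uuid4().hex  # random separator between form fields

    # Form field "premium_mode"; the value "true" is an assumption.
    premium_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="premium_mode"\r\n\r\n'
        "true\r\n"
    ).encode()

    # Form field "file" holding the raw image bytes.
    file_part = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + file_bytes + b"\r\n"

    body = premium_part + file_part + f"--{boundary}--\r\n".encode()

    headers = {
        "Authorization": f"Bearer {api_token}",
        "Accept": "application/json",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")
```

Passing the returned object to urllib.request.urlopen would send it. In N8N, the HTTP Request node assembles the same headers and form fields for you.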

4. Checking Status

The workflow includes steps to check the status of the image processing, which may take some time depending on the file size and text density.
We use another HTTP request to check whether the parsing is complete. The status can be either “pending” or “successful.”
If the status remains pending, the workflow waits for a specified duration and then rechecks.
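
The wait-and-recheck logic can be sketched in Python as a small polling loop. This is an illustration under assumptions, not the workflow itself: get_status stands in for the status HTTP request, the status strings follow the wording used in this article, and the interval and maximum number of checks are arbitrary defaults.

```python
import time
from typing import Callable


def wait_until_successful(get_status: Callable[[], str],
                          interval_seconds: float = 5.0,
                          max_checks: int = 20,
                          sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll get_status() until it reports success or the checks run out."""
    for attempt in range(max_checks):
        status = get_status().strip().lower()
        if status == "successful":
            return True
        if status != "pending":
            # Anything other than the two expected states is treated as an error.
            raise RuntimeError(f"unexpected status: {status!r}")
        if attempt < max_checks - 1:
            sleep(interval_seconds)  # wait before rechecking
    return False  # still pending after max_checks attempts
```

Capping the number of checks keeps the loop from running forever if a job never finishes. In N8N, the same pattern is a Wait node that loops back to the status request.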

5. Retrieving Results

Once the processing is marked “successful,” the final step involves extracting the parsed text from the image through another API request. Using the job ID, you can get the processed text in a structured format.
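
As a rough sketch of this last step, the Python below builds the result request from a job ID and pulls the text out of a JSON response. The URL template and the "text" response field are hypothetical placeholders. Take the real result endpoint and response shape from the LlamaIndex documentation.

```python
import json
import urllib.request


def build_result_request(url_template: str, job_id: str,
                         api_token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for a finished job."""
    url = url_template.format(job_id=job_id)  # template is caller-supplied
    headers = {
        "Authorization": f"Bearer {api_token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def parse_result(body: bytes, text_key: str = "text") -> str:
    """Return the parsed text from a JSON body; the key name is an assumption."""
    payload = json.loads(body)
    return payload[text_key]
```

For example, build_result_request("https://example.invalid/jobs/{job_id}/result", "abc123", "YOUR_API_TOKEN") produces a GET request for job abc123 that carries the same Authorization header as the upload.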

Conclusion

Combining N8N with the LlamaIndex API simplifies extracting text from images and lets you automate the whole workflow without writing a line of code. If you have further questions or need any clarification, feel free to leave a comment below.

Keywords

N8N
Text Extraction
LlamaIndex
No Code
Google Drive
API
HTTP Request
Workflow

FAQ

Q1: What is N8N?
A1: N8N is a no-code automation tool that allows users to create workflows by connecting different applications and services without coding.

Q2: How does LlamaIndex work with N8N?
A2: LlamaIndex provides an API that allows N8N users to send image files for text extraction in a streamlined and automated process.

Q3: Do I need any programming skills to use this method?
A3: No, this method is designed for users with no coding experience, offering a user-friendly interface for building workflows.

Q4: What type of images can I extract text from?
A4: You can extract text from various types of images, such as receipts, documents, or any image containing legible text.

Q5: Is there a limit on how many images I can process daily?
A5: Yes, LlamaIndex provides 1,000 free pages per day for processing, depending on your subscription.

Q6: What should I do if I encounter problems using the workflow?
A6: Check the setup steps for any errors, consult the LlamaIndex documentation, or leave a comment for assistance.