Scrape Data from Amazon Using Octoparse

Post on 11-Apr-2017

Category:

Data & Analytics

Collect Data from Amazon

www.octoparse.com

Click “Start” to build a new task, or hit the “Quick Start” button in the Navigation Panel to create a new task. (Here we use Advanced Mode.)

Step 3. Complete the basic information ➜ click “Next”.

Step 4. Design the workflow to configure the extraction rule. If something goes wrong, you can check your configured rule in the Workflow Designer.

Create a list of links to all the subcategories. Wait until the page has loaded, then click the first subcategory ➜ choose “Create a list of items”.

Select “Add current item to the list” ➜ “Continue to edit the list” ➜ click the second subcategory.

Select “Add current item to the list” again.

When you have all the subcategory links, click “Finish Creating List” ➜ select “Loop” to process the list.
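Picking a first and a second item is enough for a tool to infer the rest of the list. The sketch below shows one plausible way such a generalization could work; it is not Octoparse's actual algorithm. The function name `generalize_xpath` and the two example paths are made up for illustration.

```python
import re


def generalize_xpath(first: str, second: str) -> str:
    """Merge two example XPaths into one pattern covering their siblings.

    Identical steps are kept as they are. Steps that differ only in their
    positional index (li[1] vs. li[2]) lose the index, so the result
    matches every sibling at that level.
    """
    a_steps = first.strip("/").split("/")
    b_steps = second.strip("/").split("/")
    if len(a_steps) != len(b_steps):
        raise ValueError("both examples must sit at the same depth")

    merged = []
    for a, b in zip(a_steps, b_steps):
        if a == b:
            merged.append(a)
            continue
        # Drop a trailing [n] and make sure the tag names agree.
        a_tag = re.sub(r"\[\d+\]$", "", a)
        b_tag = re.sub(r"\[\d+\]$", "", b)
        if a_tag != b_tag:
            raise ValueError(f"steps {a!r} and {b!r} are not siblings")
        merged.append(a_tag)
    return "/" + "/".join(merged)


# Hypothetical paths to the first and second subcategory links.
first = "/html/body/div[2]/ul/li[1]/a"
second = "/html/body/div[2]/ul/li[2]/a"
print(generalize_xpath(first, second))  # /html/body/div[2]/ul/li/a
```

The generalized path selects every link in the list, which is why two clicks are usually enough to capture all the subcategories.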

Now you can see that it automatically enters the first category page.

Click “Next Page” ➜ “Loop click next page” to create a loop action that processes all the pages. The pagination action has now been added to the extraction rule.
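Conceptually, a "loop click next page" action keeps following the Next Page link until there is none left. Here is a minimal sketch of that idea in Python, assuming a made-up in-memory `PAGES` table in place of a real website; it only models the behavior and does not call Octoparse.

```python
# Hypothetical stand-in for a paginated category: every page lists its
# items and names the page its "Next Page" link points to (None = last).
PAGES = {
    "page-1": {"items": ["A", "B"], "next": "page-2"},
    "page-2": {"items": ["C"], "next": "page-3"},
    "page-3": {"items": ["D", "E"], "next": None},
}


def walk_pages(start):
    """Yield each page key, following "next" links until none remains."""
    current = start
    seen = set()  # guards against a page linking back to an earlier one
    while current is not None and current not in seen:
        seen.add(current)
        yield current
        current = PAGES[current]["next"]


visited = list(walk_pages("page-1"))
print(visited)  # ['page-1', 'page-2', 'page-3']
```

Anything placed inside this loop runs once per page, which is what lets a single rule cover the whole category.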

Then go back to the first product section. To capture the information inside the product section, you have to click the detail link to open the detail page. Choose the detail link: click the first product title ➜ “Create a list of items”.

Click “Add current item to the list” ➜ “Continue to edit the list”.

Then click the second product title ➜ click “Add current item to the list” ➜ “Finish Creating List”.

As you can see, all the detail links on the first page are now in the list. Click “Loop” to process the list.

Now you're on the detail page, where you can extract any information you need. Click on the product title to extract it.

Click “Extract Text”.

Click on the price ➜ “Extract Text”. The product title and price now appear in the Customize Current Action box.
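For readers curious what "Extract Text" amounts to, the sketch below pulls the visible text out of two elements using Python's standard library. The markup and the ids `productTitle` and `price` are assumptions made up for this example, and real pages are rarely well-formed XML, so treat it as an illustration of the idea rather than a working scraper.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragment standing in for a detail page.
DETAIL_PAGE = """
<html><body>
  <span id="productTitle">  Example Widget, <b>Blue</b>  </span>
  <span id="price">$19.99</span>
</body></html>
""".strip()


def extract_text(root, element_id):
    """Return the element's text with nested tags flattened, or None."""
    node = root.find(f".//*[@id='{element_id}']")
    if node is None:
        return None
    # itertext() walks the element and its children; split/join collapses
    # runs of whitespace the way a browser would render them.
    return " ".join("".join(node.itertext()).split())


root = ET.fromstring(DETAIL_PAGE)
record = {
    "title": extract_text(root, "productTitle"),
    "price": extract_text(root, "price"),
}
print(record)  # {'title': 'Example Widget, Blue', 'price': '$19.99'}
```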

Drag the second “Loop Item” so that it sits before the “Click to paginate” action.
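The order matters because the product loop should finish on the current page before the next page is opened. The nesting can be pictured as three loops, sketched here with a made-up `SITE` structure (subcategory, then pages, then product links); the names are illustrative only.

```python
# Hypothetical site: each subcategory has pages, each page has product links.
SITE = {
    "Subcategory 1": [["p1", "p2"], ["p3"]],
    "Subcategory 2": [["p4"]],
}


def run_workflow(site):
    """Visit every product in the order the configured rule would."""
    visits = []
    for subcategory, pages in site.items():              # first Loop Item
        for page_number, links in enumerate(pages, 1):   # pagination loop
            for link in links:                           # second Loop Item
                # This is where title and price would be extracted.
                visits.append((subcategory, page_number, link))
            # "Click to paginate" fires here, after all products are done.
    return visits


for visit in run_workflow(SITE):
    print(visit)
```

If pagination came first, the rule would move to the next page before the products on the current page had been visited.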

Now we are done configuring the extraction rule! Click “Next” to proceed with the configured rule. If images are not needed, you can choose not to load them to speed up the extraction.

Now the task is complete! Choose “Local Extraction” to run the task on your computer.

The extracted data will be shown in the “Data Extracted” pane. Click the “Export” button to export the results to an Excel file, a database or another format, and save the file to your computer.
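As a rough picture of what an exported file contains, here is a small sketch that writes title and price rows as CSV with Python's standard library. The rows and column names are invented for the example, and CSV is only assumed here as one of the other formats; the real export is handled by the “Export” button.

```python
import csv
import io

# Made-up rows standing in for extracted titles and prices.
rows = [
    {"title": "Example Widget, Blue", "price": "$19.99"},
    {"title": "Example Gadget", "price": "$5.00"},
]


def to_csv(records, fieldnames):
    """Serialize records to CSV text, quoting fields that contain commas."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv(rows, ["title", "price"])
print(csv_text)
```

Note that the first title contains a comma, so the writer wraps it in quotes to keep the columns aligned.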

Happy Data Hunting

www.octoparse.com
