We're really happy that you would like to submit a project on Apify Marketplace!
To help developers send you an accurate price estimate for your project, we've created this guide. It covers the most important points to include in the assignment and shows a filled-in template as an example.
How do we reach the items you need?
Try to describe whether you are only interested in a specific category/subcategory of the website or whether you need the whole shop/site. In general, describe how to get detail URLs for the items that you are interested in. Optionally, we can also search for keywords in the shop and store only those results that come up. The best way to tell us which URLs you are interested in is to describe step-by-step what we should do to get all the URLs you need.
What information do you need to save for results?
The most important thing is to tell us what information needs to be extracted and stored for each record that we get for you. Do you want to store title, price, stock count and shipping costs for each item, or is there any other information that you would like to store, such as breadcrumbs? If you are familiar with data types, you can specify the type of each field. If not, don't worry about this and we'll suggest some.
Name of your project:
Notino.co.uk pagination scraper
List of URLs of the web pages to extract the data from:
List of data attributes to extract from each web page:
{
  "itemId": 15843458, // int
  "itemUrl": "https://www.notino.co.uk/armani/code-absolu-eau-de-parfum-for-men/", // string
  "itemName": "Armani Code Absolu Eau de Parfum for Men 110 ml", // string
  "discounted": false, // boolean
  "currentPrice": 2500, // float
  "originalPrice": null, // float
  "category": ["Brands", "Armani", "Code"] // array of breadcrumbs
}
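If it helps to pin down the data types, the attribute list can also be written as a typed record. The following is only a minimal sketch in Python: the class name ItemRecord and the helper to_json are hypothetical names chosen for illustration, and the field names and sample values are copied from the example record.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class ItemRecord:
    """One scraped product, mirroring the attribute list in the template."""
    itemId: int
    itemUrl: str
    itemName: str
    discounted: bool
    currentPrice: float
    originalPrice: Optional[float]  # None when the item is not discounted
    category: List[str]             # breadcrumbs, outermost first


def to_json(record: ItemRecord) -> str:
    """Serialize a record to the JSON shape shown in the template."""
    return json.dumps(asdict(record))


# Sample values taken from the example record.
record = ItemRecord(
    itemId=15843458,
    itemUrl="https://www.notino.co.uk/armani/code-absolu-eau-de-parfum-for-men/",
    itemName="Armani Code Absolu Eau de Parfum for Men 110 ml",
    discounted=False,
    currentPrice=2500,
    originalPrice=None,
    category=["Brands", "Armani", "Code"],
)
```

Writing the fields down this way makes it clear which values may be empty (here only originalPrice) before a developer starts estimating the work.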
How many web pages do you need to crawl?
How often do you need to get fresh data?
Please provide a detailed description of how to extract the data. Imagine you're explaining the steps to a person who will perform them manually:
Prepare a solution that scrapes all products from the website. It should start at https://www.notino.co.uk/, collect all menu categories, go through every category and all of its pagination pages, and collect the URLs of all item detail pages from there.
Then visit each detail page and store the data in the format of the example record.
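The crawl order described here (categories, then pagination, then detail URLs) could be sketched roughly as follows. This is an illustration of the flow only, not a working scraper for the site: the function collect_detail_urls and its three callback parameters are hypothetical, and how categories, next-page links and detail links are actually found in the page is left to the developer.

```python
from typing import Callable, Iterable, List, Optional, Set


def collect_detail_urls(
    start_url: str,
    get_category_urls: Callable[[str], Iterable[str]],
    get_next_page_url: Callable[[str], Optional[str]],
    get_detail_urls: Callable[[str], Iterable[str]],
) -> List[str]:
    """Walk every category and every pagination page, returning unique detail URLs.

    The three callbacks are assumptions: each takes a page URL and returns the
    links found on that page (or None when there is no next page).
    """
    detail_urls: List[str] = []
    seen_details: Set[str] = set()
    seen_pages: Set[str] = set()

    for category_url in get_category_urls(start_url):
        page_url: Optional[str] = category_url
        # Follow "next page" links until they run out or loop back.
        while page_url is not None and page_url not in seen_pages:
            seen_pages.add(page_url)
            for url in get_detail_urls(page_url):
                if url not in seen_details:  # an item can appear in several categories
                    seen_details.add(url)
                    detail_urls.append(url)
            page_url = get_next_page_url(page_url)

    return detail_urls
```

Deduplicating the detail URLs matters for the estimate, because the same product may be listed under more than one category.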
For every item, we need to save the unique identifier of the product, which is also available on the product's detail page. For notino.co.uk, this is the ID number in the data-product-code attribute in the HTML, e.g. 15843458 on this URL.
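To show what reading this identifier might look like, here is a small sketch using only Python's standard library. It assumes the value sits in a data-product-code attribute on some element of the page; which element carries it is not stated here, so the sketch accepts any tag. The names ProductCodeParser and extract_item_id are hypothetical.

```python
from html.parser import HTMLParser
from typing import Optional


class ProductCodeParser(HTMLParser):
    """Remembers the first data-product-code attribute value seen on any tag."""

    def __init__(self) -> None:
        super().__init__()
        self.product_code: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.product_code is not None:
            return  # keep only the first match
        for name, value in attrs:
            if name == "data-product-code" and value:
                self.product_code = value
                return


def extract_item_id(html: str) -> Optional[int]:
    """Return the product code as an int, or None if it is missing or not numeric."""
    parser = ProductCodeParser()
    parser.feed(html)
    code = (parser.product_code or "").strip()
    return int(code) if code.isdigit() else None
```

For example, extract_item_id('<div data-product-code="15843458"></div>') returns 15843458, and a page without the attribute gives None so the missing ID can be reported instead of silently stored.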
If the item is discounted, set discounted to true; currentPrice is then the discounted price and originalPrice is the price before the discount.
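The discount rule can be expressed as a small helper. This is a sketch under two assumptions that are not stated in the assignment: the function name price_fields is hypothetical, and a "before discount" price that is missing or not higher than the current price is treated as no discount, with originalPrice left empty as in the example record.

```python
from typing import Optional


def price_fields(current_price: float,
                 price_before_discount: Optional[float] = None) -> dict:
    """Build the discounted / currentPrice / originalPrice fields for one item."""
    # Assumption: only a strictly higher earlier price counts as a discount.
    discounted = (price_before_discount is not None
                  and price_before_discount > current_price)
    return {
        "discounted": discounted,
        "currentPrice": current_price,
        "originalPrice": price_before_discount if discounted else None,
    }
```

Calling price_fields(2500) reproduces the price fields of the example record, while price_fields(2000, 2500) marks the item as discounted with an originalPrice of 2500.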
If the price contains the word "from" (see the screenshot attached), then we need to visit the detail page and grab variants, so we don't miss any items.
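A check for this case might look like the sketch below. It assumes the price is available as a text string taken from the listing page; the function name needs_variant_lookup and the sample price strings in the usage note are hypothetical.

```python
import re

# Match "from" as a whole word, in any letter case.
_FROM_WORD = re.compile(r"\bfrom\b", re.IGNORECASE)


def needs_variant_lookup(price_text: str) -> bool:
    """True when the listed price is a "from" price, so variants must be fetched."""
    return _FROM_WORD.search(price_text) is not None
```

With this helper, a price text such as "from £25.00" would trigger a visit to the detail page to collect the variants, while a plain "£25.00" would not.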
Upload any relevant files, e.g. screenshots: