At Apify, we specialize in web scraping and automation. Using our actors, you can automate almost anything you can do in a web browser. In addition, the Apify SDK, our JavaScript scraping and automation library, enables you to create powerful tools efficiently and without hard-to-solve bugs. Our three generic scrapers provide easy setup for new users via their friendly interface.

That's a lot of features, and we know how overwhelming it can feel for new users. You, as a user, care much more about extracting high-quality data that can power your business than about our platform's technical details. This article provides a short summary of each of our tools to help you choose the right one.

TL;DR

  • Apify Store - ready-made solutions for popular sites. The quickest way to get your data or automate your process.

  • Marketplace - get a custom solution from an Apify-approved developer. For more specific use cases.

  • Enterprise - premium solution at any scale from a dedicated data expert here at Apify.

  • Our generic scrapers - customize our tools to your needs with a bit of JavaScript and setup.

  • Apify SDK - create your own solution from scratch using our JavaScript library. Involves more coding.

Apify Store

Apify Store is the quickest path to the data you need. It's full of open-source actors (Google Search, Booking, Instagram, etc.) that you can use right away without any upfront commitment. Just sign up for a free account, set your input, press Run, and wait for the data to flow in. 

If your use case is more specific, you may need a more tailored solution. In this case, you can either develop your tools yourself (or with your team) or let our experts (or partners) prepare a custom solution.

Apify Marketplace

On Apify Marketplace, we connect our customers with talented developers who have completed our internal training. You can think of it as a more specialized and more managed Freelancer.com. We guarantee that the developers know our platform well and have plenty of expertise using it.

All you need to do is provide a specification. The developers will bid their price and you can choose the offer that fits you best. The developer will then build the actor and add it to your account, along with an initial dataset. You own the code, plus the solution includes a guarantee period during which Apify will fix any problems that occur in the actor for free.

Apify for Enterprise

If you don't want to spend time managing multiple projects on Apify Marketplace, we also offer a premium service called Apify for Enterprise. As our enterprise customer, you will be assigned a dedicated data expert who will manage the whole project, send you periodic reports, and communicate with you via whatever tools you prefer (email, Slack, UberConference, etc.). We will sign a contract describing a specific SLA, integrate into your platform, and maintain the project long-term (if needed).

If you aren't sure whether a Marketplace or Enterprise solution is best for you, you can compare them here.

Develop your own solution

The do-it-yourself path is the most interesting one, but also the most difficult. You will need to write at least a bit of JavaScript, but you definitely don't need to be a master developer. Just don't give up too easily!

Fortunately, Apify provides you with some of the best development tools and documentation on the market. In the rest of this article, we will discuss these tools and their particular pros and cons.

Scrapers vs. SDK

We provide three generic scrapers: Web Scraper (apify/web-scraper), Cheerio Scraper (apify/cheerio-scraper), and Puppeteer Scraper (apify/puppeteer-scraper). Their main goal is to provide an intuitive UI plus configuration that will help you start extracting data as quickly as possible. Usually, you just provide a simple JavaScript function, set up one or two parameters, and you're good to go.

Since scraping and automation come in various forms, we decided to build not just one, but three scrapers. This way, you can always pick the right tool for the job.

For more technical development, there is the Apify SDK, a JavaScript library. With it, you develop a standard Node.js application, so be prepared to write a little more code. The reward for that is almost infinite flexibility. There aren't many problems that you cannot solve with the whole JavaScript ecosystem at your disposal (and the Apify SDK to guide you through it).

Let's discuss each particular tool and its advantages and disadvantages.

Web Scraper

Web Scraper (apify/web-scraper) crawls a website's pages using a headless Chrome browser and executes some code (we call it the pageFunction) on each page.

+ simple
+ fully JavaScript-rendered pages

- can only execute client-side (inside the browser) JavaScript
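
As a rough illustration (a sketch, not an official template), a pageFunction for Web Scraper could look like the one below. It assumes the function receives a context object with request and log properties, and that the target page has an h1 element; the field names in the returned object are made up for this example.

```javascript
// Sketch of a Web Scraper pageFunction. It runs inside the browser,
// so browser globals such as `document` are available.
async function pageFunction(context) {
    const { request, log } = context;

    // Read data straight from the rendered DOM.
    const title = document.title;
    const heading = document.querySelector('h1'); // assumes the page has an <h1>

    log.info(`Scraped ${request.url}`);

    // The returned object becomes one result record.
    // The field names here are illustrative only.
    return {
        url: request.url,
        title,
        heading: heading ? heading.textContent.trim() : null,
    };
}
```

Because everything happens in the browser, Node.js modules are not available inside this function.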

Cheerio Scraper

The UI of Cheerio Scraper (apify/cheerio-scraper) is almost the same as that of Web Scraper. What changes is the underlying technology. Instead of using a browser to crawl the website, it fires a series of simple HTTP requests to get the page's HTML. Your code then executes on top of that HTML with help from the Cheerio parser, which is essentially jQuery for servers. The main goal here is speed (and therefore cost-effectiveness).

+ simple
+ fast

- some pages may not be fully rendered (lacking JavaScript rendering)
- can execute only server-side (in Node.js) JavaScript
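
A comparable pageFunction for Cheerio Scraper might look like this sketch. It assumes the context object exposes the parsed page as $ (the Cheerio handle) together with request; the returned field names are again only illustrative.

```javascript
// Sketch of a Cheerio Scraper pageFunction. It runs in Node.js, so there is
// no browser DOM; `$` is a Cheerio handle on the downloaded HTML.
async function pageFunction(context) {
    const { $, request } = context;

    // Cheerio supports jQuery-like selectors on the static HTML.
    const title = $('title').text().trim();
    const linkCount = $('a[href]').length;

    // Content that the site renders with client-side JavaScript
    // will not be present here.
    return { url: request.url, title, linkCount };
}
```

The selectors look the same as in a browser, but they only see what the server sent in the initial HTML response.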

Puppeteer Scraper

Unlike the two previous scrapers, Puppeteer Scraper (apify/puppeteer-scraper) doesn't focus primarily on simplicity, but provides a wider variety of powerful features. You have the whole Puppeteer library for managing headless Chrome at your disposal.

+ powerful Puppeteer functions (methods on the page object)
+ can execute both server-side and client-side JavaScript

- more complex
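
To illustrate the mix of server-side and client-side code, here is a hedged sketch of a Puppeteer Scraper pageFunction. It assumes the context object carries the Puppeteer page object and request; page.title() and page.$$eval() are standard Puppeteer methods, while the returned field names are invented for the example.

```javascript
// Sketch of a Puppeteer Scraper pageFunction. The function itself runs in
// Node.js and controls the browser through the Puppeteer `page` object.
async function pageFunction(context) {
    const { page, request } = context;

    // Server-side: a Puppeteer API call made from Node.js.
    const title = await page.title();

    // Client-side: this callback is sent to the browser and executed there
    // against the elements matching the selector.
    const linkCount = await page.$$eval('a[href]', (links) => links.length);

    return { url: request.url, title, linkCount };
}
```

The same page object can also click, type, or wait for elements, which is where the extra power (and complexity) comes from.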

Apify SDK

If you love JavaScript and Node.js as much as we do, definitely go for our SDK. Apify SDK is fully open source on GitHub and many developers actively contribute to its continuous growth. It is meant mainly as a tool for developers who know at least the basics of Node.js to get up to speed quickly and build powerful applications. The SDK takes the hard problems (like managing concurrency, auto-scaling, request queues, etc.) off the developer's shoulders so you can just focus on the task you want to complete.

+ complete freedom, maximum flexibility
+ full power of JavaScript ecosystem (npm) and Apify SDK

- requires that you write some boilerplate code
- more complex, higher chance of making mistakes (the flip side of freedom)
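
For comparison with the scrapers, a minimal SDK crawler could be sketched as follows. This assumes the apify npm package is installed in a 1.x or 2.x release, where Apify.main, Apify.openRequestQueue, Apify.CheerioCrawler (with handlePageFunction) and Apify.pushData are available; other releases may use different names. The START_URL environment variable and the handlePage helper are hypothetical and exist only in this sketch.

```javascript
// Extracts data from one downloaded page. Kept as a separate function
// so it can be reused or tested without starting a crawl.
async function handlePage({ request, $ }) {
    return { url: request.url, title: $('title').text().trim() };
}

function run(startUrl) {
    // Assumption: `npm install apify` (SDK 1.x or 2.x). Loaded lazily so the
    // helper above works even when the package is not installed.
    const Apify = require('apify');

    Apify.main(async () => {
        // The request queue holds the URLs that still need to be crawled.
        const requestQueue = await Apify.openRequestQueue();
        await requestQueue.addRequest({ url: startUrl });

        // The crawler manages concurrency and retries for us.
        const crawler = new Apify.CheerioCrawler({
            requestQueue,
            maxConcurrency: 10,
            handlePageFunction: async (context) => {
                // Store one record per page in the default dataset.
                await Apify.pushData(await handlePage(context));
            },
        });

        await crawler.run();
    });
}

// START_URL is a hypothetical variable used only by this sketch.
if (process.env.START_URL) run(process.env.START_URL);
```

Note how much of this is boilerplate compared with a scraper's pageFunction; in exchange, every part of the crawl is under your control.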

Summary

Whatever tool you choose, don't feel trapped by your initial choice. Play and experiment: you can easily transfer large parts of the code between all of our tools. Over time you'll get a feeling for when to use what and you may also find your personal favorite.

If you feel like our scrapers aren't doing what they should do, you can always report an issue on their public GitHub page. 

Just remember not to report cases where your own code fails. If that happens and you cannot make progress despite your best efforts, it's best to post the problem on Stack Overflow with the apify tag. Our experts go through these posts periodically and will leave a reply that anybody can learn from. Or you can always just contact our support team!

Happy scraping!
