At Apify, we specialize in web scraping and automation. Using our Actors, you can do almost anything a web browser can. Crawlee together with the Apify SDK, our JavaScript scraping and automation libraries, enables you to create powerful tools efficiently and without hard-to-solve bugs. Our universal scrapers give new users an easy setup via a friendly interface. And you can also create Actors in other programming languages, with first-class support for Python.
That's a lot of features, and we know how overwhelming it can feel for new users. You, as a user, care much more about extracting high-quality data that can power your business than about our platform's technical details. This article provides a short summary of each of our tools and services, so you can choose the right one for you.
TL;DR
With our range of services and solutions, you have three basic options: use it, build it, or buy it.
♨️ Ready-made web scrapers - use it
Apify Store - ready-made solutions for popular sites, both simple ones like Threads and complex ones like Google Maps. The quickest way to get your data or automate your process.
🛠 Code-it-yourself solutions - build it
Our universal scrapers - customize our boilerplate tools to your needs with a bit of JavaScript and setup.
Our code templates for web scraping projects - for a quick project setup to save you development time (JavaScript, TypeScript, and Python templates).
JavaScript SDK & Crawlee - create your own solution from scratch using our JavaScript libraries. Involves more coding but offers infinite flexibility.
Python SDK - create your own solution from scratch using our Python libraries. Involves more coding but offers infinite flexibility.
Apify Store - mentioning Store once again because this is where you can publish the web scrapers you’ve built, find users, and get paid.
👔 Web scrapers for business - buy it
Professional Services - this option will get you a premium solution at any scale from a dedicated data expert here at Apify.
Apify freelancers - get a custom solution from a developer from the Apify community. Good for more specific and/or small use cases.
Apify Store
Apify Store is the quickest path to the data you need. It's full of open-source Actors (Google Search, Booking, Instagram, etc.) that you can use right away without any upfront commitment. Just sign up for a free account, set your input, press Start, and wait for the data to flow in.
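If you'd rather skip the UI, every Actor in the Store can also be started programmatically. Here's a minimal sketch using our apify-client package for Node.js; the Actor ID and input fields are illustrative, so check the Actor's page for its actual input schema.

```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the Actor and wait for the run to finish
// (the input is illustrative - see the Actor's input schema).
const run = await client.actor('apify/google-search-scraper').call({
    queries: 'web scraping',
});

// Read the scraped results from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
```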
If your use case is more specific, you may need a more tailored solution. In this case, you can either develop the tool yourself (🛠 using our libraries, code templates, or universal scrapers) or let our experts prepare a custom solution (👔 Professional Services and freelancers).
If you choose the DIY path ("develop-it-yourself" in our case), you can keep the Actor private or add it among the other tools in the Store for anyone to try (see How to publish your Actor ↗️).
Develop your own solution
The do-it-yourself path is the most interesting but also the most demanding. You will need to write at least a bit of JavaScript or Python, but you definitely don't need to be a master developer. Just don't give up too easily!
Fortunately, Apify provides you with some of the best development tools and documentation on the market. In the rest of this article, we will discuss these tools and their particular pros and cons.
1️⃣ Universal scrapers
We provide several universal scrapers; the best known are Web Scraper (apify/web-scraper), Cheerio Scraper (apify/cheerio-scraper), Puppeteer Scraper (apify/puppeteer-scraper), Playwright Scraper (apify/playwright-scraper), JSDOM Scraper (apify/jsdom-scraper), and BeautifulSoup Scraper (apify/beautifulsoup-scraper). Their main goal is to provide an intuitive UI plus configuration that will help you start extracting data as quickly as possible. Usually, you just provide a simple JavaScript function, set one or two parameters, and you're good to go.
Since scraping and automation come in various forms, we decided to build not just one, but six scrapers. This way, you can always pick the right tool for the job. Let's discuss each particular tool and its advantages and disadvantages.
Web Scraper
Web Scraper (apify/web-scraper) crawls a website's pages using a headless Chrome browser and executes some code (we call it the pageFunction) on each page.
+ simple
+ fully JavaScript-rendered pages
- can only execute client-side (inside the browser) JavaScript
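To give you an idea, here's a minimal pageFunction sketch for Web Scraper. It assumes the "Inject jQuery" option is enabled, so context.jQuery is available inside the browser:

```javascript
// Runs in the browser on every crawled page.
async function pageFunction(context) {
    const { request, jQuery: $ } = context;
    return {
        url: request.url,
        title: $('title').text(),
    };
}
```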
Cheerio Scraper
The UI of Cheerio Scraper (apify/cheerio-scraper) is almost the same as that of Web Scraper. What changes is the underlying technology. Instead of using a browser to crawl the website, it fires a series of plain HTTP requests to get each page's HTML. Your code then executes on top of that HTML with help from the Cheerio parser, which is essentially jQuery for servers. The main goal here is speed (and therefore cost-effectiveness).
+ simple
+ fast
- some pages may not be fully rendered (lacking JavaScript rendering)
- can execute only server-side (in Node.js) JavaScript
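The pageFunction looks almost identical, except that it runs in Node.js and receives the parsed HTML as a Cheerio object. A minimal sketch:

```javascript
// Runs in Node.js; $ is the page's HTML parsed by Cheerio.
async function pageFunction(context) {
    const { request, $ } = context;
    return {
        url: request.url,
        title: $('title').text(),
    };
}
```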
Puppeteer Scraper & Playwright Scraper
Unlike the two previous scrapers, Puppeteer Scraper (apify/puppeteer-scraper) doesn't focus primarily on simplicity but provides a wider variety of powerful features. Playwright Scraper (apify/playwright-scraper) takes this one step further by adding cross-browser support. You have the whole Puppeteer / Playwright library for managing a headless browser at your disposal.
+ powerful Puppeteer / Playwright functions (methods on the page object)
+ can execute both server-side and client-side JavaScript
- more complex
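For illustration, here's a minimal pageFunction sketch for Puppeteer Scraper or Playwright Scraper (the selector is illustrative). The page object lets you wait for elements and run code inside the browser:

```javascript
async function pageFunction(context) {
    const { request, page } = context;

    // Server-side code: wait until client-side rendering finishes.
    await page.waitForSelector('h1');

    // Client-side code: evaluated inside the browser.
    const heading = await page.evaluate(
        () => document.querySelector('h1').textContent,
    );

    return { url: request.url, heading };
}
```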
JSDOM Scraper
JSDOM Scraper (apify/jsdom-scraper) employs the JSDOM library to parse HTML, offering a browser-like DOM API, including the window object. It efficiently processes client-side JavaScript without the need for a real browser, positioning it between the speed of Cheerio Scraper and the capabilities of the browser-based scrapers.
+ capable of handling client-side JavaScript
+ ideal for pages with light client-side scripting
+ can outperform full-browser solutions like Puppeteer by up to 20 times in speed
- not suitable for websites relying heavily on dynamic client-side JavaScript for rendering content
- limited to executing server-side code in Node.js within the Page function
- dependency on NPM modules already installed in the Actor for the Page and Prepare request functions
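A minimal pageFunction sketch, assuming the context exposes the JSDOM window object as described above:

```javascript
async function pageFunction(context) {
    const { request, window } = context;
    // window behaves like a browser's window, but no real browser is running.
    return {
        url: request.url,
        title: window.document.title,
    };
}
```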
BeautifulSoup Scraper
BeautifulSoup Scraper (apify/beautifulsoup-scraper) is designed for crawling websites using raw HTTP requests and parsing HTML with the BeautifulSoup library. It offers a Python-based solution for extracting data from web pages using custom Python code. This Actor supports both recursive crawling and processing lists of URLs, serving as a Python counterpart to Cheerio Scraper.
+ Python-based web scraping
+ supports recursive crawling and URL lists
- lacks a full-featured web browser like Chromium or Firefox, making it unsuitable for web pages that render content dynamically using client-side JavaScript. For such scenarios, consider using Web Scraper (apify/web-scraper) with full browser capabilities.
- in the Page function, you can only use Python modules that are already pre-installed in this Actor; for additional modules, you'll need to develop a new Actor or engage with the project through issues or pull requests on GitHub
▶ For this last scraper, we’ve also prepared a quick video tutorial to get you started.
2️⃣ JavaScript SDK
If you love JavaScript and Node.js as much as we do, definitely go for Crawlee with the Apify SDK – our web scraping libraries. They are fully open source on GitHub, and many developers actively contribute to their continuous growth. They are meant mainly as a tool for developers who know at least the basics of Node.js to get up to speed quickly and build powerful applications.
There aren't many problems that you cannot solve with the whole JavaScript ecosystem at your disposal (and the Crawlee documentation to guide you through it). Crawlee offloads the hard problems (managing concurrency, auto-scaling, request queues, etc.) from the developer to the library, so you can just focus on the task you want to complete.
+ complete freedom, maximum flexibility
+ full power of JavaScript ecosystem (npm)
- requires that you write some boilerplate code
- more complex, higher chance of making mistakes (the flip side of freedom)
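To show what that looks like in practice, here's a minimal Crawlee sketch: a CheerioCrawler that saves each page's title and keeps following same-domain links, while the library takes care of the request queue, concurrency, and auto-scaling.

```javascript
import { CheerioCrawler } from 'crawlee';

const crawler = new CheerioCrawler({
    async requestHandler({ request, $, enqueueLinks, pushData }) {
        // Save the extracted data to the default dataset.
        await pushData({ url: request.url, title: $('title').text() });
        // Enqueue links found on the page (same hostname by default).
        await enqueueLinks();
    },
});

await crawler.run(['https://crawlee.dev']);
```

If you need a real browser instead of plain HTTP requests, Crawlee offers PlaywrightCrawler and PuppeteerCrawler with the same crawler interface.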
▶ Here’s a short video tutorial to introduce Crawlee to you.
JavaScript code templates
These web scraping templates are essentially built on top of Crawlee but spare you the setup part. They include support for major libraries like Playwright, Selenium, and Cheerio, along with Cypress and LangChain. These templates are available in both JavaScript and TypeScript, providing just enough flexibility for your web scraping needs.
+ suitable for creating scrapers, automation, and testing tools
- requires more coding than using a universal scraper
- might be too restrictive if you need full freedom; in that case, you'd be better off using one of our JS libraries
▶ Here’s a short video tutorial showing how to use any of these web scraper templates.
3️⃣ Python SDK
The Python SDK provides full support for integrating your favorite framework into Apify. You can use standard libraries like Beautiful Soup or Scrapy. Whether you have a simple scraper using BeautifulSoup, a powerful web spider written with Scrapy, or a project tapping into Selenium or Playwright for browser automation, the Apify SDK for Python will help you run it in the cloud, regardless of scale.
+ complete freedom, maximum flexibility
+ full power of Python ecosystem
- requires that you write some boilerplate code
- more complex, higher chance of making mistakes
▶ Here’s a short video tutorial to give you an idea of what it’s like to develop using our Python SDK.
Python code templates
These are built on top of the Python SDK but give you a bit of a head start. In our web scraping templates for Python, we include support for major libraries like Requests, BeautifulSoup, Scrapy, Selenium, and Playwright.
+ suitable for creating scrapers, automation, and testing tools
- requires more coding than using a universal scraper
- might be too restrictive if you need full freedom; in that case, you'd be better off using the Python SDK itself
▶ Here’s a short video tutorial on how to start out.
Universal scrapers vs. libraries
Basically, the choice here depends on how much flexibility you need and how much coding you're willing to do. More flexibility → more coding.
Universal scrapers are simple to set up but are less flexible and configurable. Our libraries, on the other hand, enable the development of a standard Node.js or Python application, so be prepared to write a little more code. The reward for that is almost infinite flexibility.
Code templates are sort of a middle ground between scrapers and libraries. But since they are built on the libraries, they still lean toward the more-coding side: they only give you starter code to begin with. So please take this into account when choosing how to build your scraper, and if in doubt, just ask us and we'll help you out.
Apify Professional Services
If you want a more organized approach, we also offer premium Apify Professional Services. As an enterprise customer, you will be assigned a dedicated data expert who will manage the whole project, send you periodic reports, and communicate with you via whatever tools you prefer (email, Slack, UberConference, etc.). We will sign a contract with a specific SLA, build an integration into your platform, and maintain the project long-term (if needed).
If you aren't sure whether an enterprise solution is the right fit for you, don't worry, just submit our form on the Apify Professional Services page and we will figure it out together.
→ We also recommend reading our extensive guide on the topic (written after some trial and error with our customers): 6 things you should know before buying or building a web scraper
Apify freelancers
Apify freelancers are members of the Apify community who can provide custom web scraping and automation solutions at a fair price. You can think of it as a more specialized Upwork, Fiverr, or Freelancer.com. These developers know our platform well and have plenty of experience with it.
All you need to do is provide a specification. Developers will bid their price and you can choose the offer that fits you best. Just head over to the Apify Discord and post your project.
Summary
Whatever tool you choose, don't feel trapped by your initial choice. Play and experiment: you can easily transfer large parts of the code between all of our tools. Over time, you'll get a feel for when to use what, and you may also find your personal favorite. If you feel like our scrapers aren't doing what they should, you can always report an issue on their Actor page.
If you decide to give developing your own Actors a try, start with Apify Academy and follow up with your questions on Crawlee & Apify Discord server.
Happy scraping!
Video resources
Improved way to build your scrapers from a Git repo
How to create Apify Actors with web scraping code templates
Introduction to Apify's Python SDK
Introduction to Crawlee
Webinar on how to use Crawlee
How to use Web Scraper to scrape any website
How to programmatically retrieve data with the Apify API