Keboola integration

How to easily push data from Apify Actors and crawlers to Keboola

Written by Jakub Drobník
Updated over a week ago

Keboola is a cloud-based service that enables companies to integrate data from several sources into a single cloud storage. Keboola then makes it possible to import and transform that data into several databases and platforms, such as GoodData, Tableau, or Google Sheets.
We want to offer all these Keboola features to our customers, so we integrated Apify into the Keboola platform. In this article, I'll show you how to set up the integration for an existing Actor.

Getting started with Keboola

First, sign in to the Keboola Connection portal. If you don't have an account yet, you can create one from this link. Next, go to the Components -> Extractors section in the left-hand menu and click the New Extractor button.

After that, find the Apify extractor and click on it.

On the Apify extractor page, you can see all your configurations. To create a new one, click the New Configuration button.

In the next step, enter a name and description for your extractor configuration and click the Create Configuration button.

Configure Apify extractor

With the created configuration, we can configure the extractor to get the data that we want. We can do that by clicking on the Configure Crawler button.

Choose action

In the next step, we can choose an action. All possible actions are described below.

Run Crawler - This action is deprecated and will be removed soon.
Run Actor - This action runs the selected Actor, waits until the Actor finishes, then pushes all items from the default dataset to Keboola Storage.
Retrieve results from Crawler run - This action is deprecated and will be removed soon.
Retrieve items from Dataset - This action takes a dataset ID or dataset name and retrieves all items from that dataset.
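
To make the two non-deprecated actions more concrete, here is a minimal Python sketch of roughly equivalent requests against the public Apify API. It is not the extractor's actual implementation. The endpoint paths are assumed from the Apify API v2, and the helper names (run_actor_url, dataset_items_url, fetch_json) are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Optional

API_BASE = "https://api.apify.com/v2"  # assumption: public Apify API v2 base URL


def _api_id(resource: str) -> str:
    """Write "username/name" identifiers with "~", as API paths expect."""
    return urllib.parse.quote(resource.replace("/", "~"), safe="~")


def run_actor_url(actor_id: str, token: str) -> str:
    """URL for the "Run Actor" idea: run, wait, return default dataset items."""
    query = urllib.parse.urlencode({"token": token})
    return f"{API_BASE}/acts/{_api_id(actor_id)}/run-sync-get-dataset-items?{query}"


def dataset_items_url(dataset: str, token: str, limit: Optional[int] = None) -> str:
    """URL for the "Retrieve items from Dataset" idea (dataset ID or name)."""
    params = {"token": token, "format": "json"}
    if limit is not None:
        params["limit"] = str(limit)
    return f"{API_BASE}/datasets/{_api_id(dataset)}/items?{urllib.parse.urlencode(params)}"


def fetch_json(url: str, method: str = "GET", payload: Optional[dict] = None) -> Any:
    """Send the request and decode the JSON response (performs a network call)."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, method=method, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With real credentials, fetch_json(run_actor_url("username/actor-name", token), method="POST", payload={}) would start a run and return its items, and fetch_json(dataset_items_url(dataset_id, token)) would read an existing dataset. The identifiers in these calls are placeholders.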

Authentication

After hitting "Next", you have to set up your Apify API credentials. You can go to your Apify account page, where you can copy & paste your credentials into the form.

Specifications

In the next step, you can set up the options for a specific run:

Actor - You can choose which Actor you want to run. All Actors from your account will be loaded in the selected box.
Input Table - You can choose a table from the Keboola platform to be sent to the Actor. Data from the input table is pushed to the Actor, where you can access it through the key-value store. The ID of the key-value store and the key of the record are saved to the Actor input in the inputTableRecord attribute.
Memory & build - You can usually leave these unchanged. Increase the memory for faster runs.
Actor input - You can pass any JSON data here to the Actor.
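
As an illustration of the Input Table option, the sketch below shows how an Actor written in Python might locate and parse the table from its input. Only the inputTableRecord attribute name comes from this article. The field names storeId and key, the CSV format of the record, and the helper names are assumptions, so check the input your Actor actually receives before relying on them.

```python
import csv
import io
from typing import Optional
from urllib.parse import quote

API_BASE = "https://api.apify.com/v2"  # assumption: public Apify API v2 base URL


def input_table_record_url(actor_input: dict) -> Optional[str]:
    """Build the key-value store record URL from the Actor input, if present."""
    record = actor_input.get("inputTableRecord")
    if not record:
        return None  # the configuration did not select an input table
    store_id = record["storeId"]  # assumed field name
    key = record["key"]  # assumed field name
    return (
        f"{API_BASE}/key-value-stores/{quote(store_id, safe='')}"
        f"/records/{quote(key, safe='')}"
    )


def parse_table(csv_text: str) -> list:
    """Parse the table into a list of dicts, assuming it is stored as CSV."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Hypothetical Actor input: custom JSON from the "Actor input" field
# plus the record reference added for the input table.
example_input = {
    "maxPages": 5,
    "inputTableRecord": {"storeId": "store123", "key": "input-table"},
}
print(input_table_record_url(example_input))
```

Downloading the record may also require your API token, depending on how the store is accessed. The parsed rows can then be used like any other Actor input, for example as a list of URLs to crawl.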

After you fill in all the options, save them using the Save button.

Run configured extractor

Once your extractor is configured, you can run it with the Run button in the upper right corner of your configuration.

After you run the extractor, you can open the job detail, which you can find in the list in the right-hand column.

After the run finishes, you can find the results on the job detail page under the link in the Storage Stats section.

Next steps

As I said at the beginning, you can integrate your results with the dozens of other services that Keboola supports. Check out the full list here. You can set up a writer for a selected service using Keboola Writer. You can also set up orchestrations, where you can transform, merge, or split the data from your results.

And that's it, thanks for reading! We’d love to hear from you if you’ve found a great use case for the Keboola <> Apify integration. Just let us know at info@apify.com or contact us through chat.
