When using public scrapers to extract data from websites, it's common to encounter a wide range of attributes and fields that may not be necessary for your use case. Luckily, you can clean your dataset before downloading it and get only the data you need right away.
This article explains how to selectively export data using the Selected fields and Omit fields options in the Export dataset section.
Step 1: Navigate to your dataset
To begin, navigate to your dataset via Actor Run > Storage > Datasets. You can also access all your datasets via the sidebar Storages > Dataset in the Apify Console.
Scroll down to the section labeled Export dataset and select the format in which you want to download the data.
Step 2: Take a look at the available fields
Now let’s take a look at the options within the Export dataset section: Selected fields and Omit fields. These options allow you to choose which data attributes you want to include or exclude from the dataset you want to download.
Step 3: Check the dropdown menu
Open the dropdown menu under Selected fields or Omit fields. Both menus contain the same list of attributes. Note that only the first 2000 attributes (including nested fields) are shown due to storage limitations.
If the attributes you need are available in the dropdown menu, select them and continue. An attribute selected in Selected fields is kept in the final dataset. An attribute selected in Omit fields is left out of it. You can select multiple attributes by holding down the appropriate key (Ctrl on Windows or Command on Mac) while clicking on the attributes.
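To make the difference between the two options concrete, here is a minimal Python sketch of the keep-versus-drop logic applied to a list of items. The function names and sample data are hypothetical, and the sketch covers top-level attributes only. It illustrates the idea and is not how the Console implements it.

```python
from typing import Any

Item = dict[str, Any]


def select_fields(items: list[Item], fields: list[str]) -> list[Item]:
    """Keep only the listed top-level attributes (like Selected fields)."""
    return [{key: item[key] for key in fields if key in item} for item in items]


def omit_fields(items: list[Item], fields: list[str]) -> list[Item]:
    """Drop the listed top-level attributes (like Omit fields)."""
    drop = set(fields)
    return [{k: v for k, v in item.items() if k not in drop} for item in items]


# Hypothetical sample item, made up for illustration.
items = [
    {"title": "Example page", "url": "https://example.com", "html": "<html>...</html>"},
]

kept = select_fields(items, ["title", "url"])
trimmed = omit_fields(items, ["html"])
```

With this sample, both calls leave the same two attributes, title and url. Which option is more convenient depends on whether the list of attributes to keep or the list to drop is shorter.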
Don’t worry if the attributes you need are not present in the dropdown menu: you can easily create them. All you need to do is type the data attribute you need and click Create.
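If you are unsure which attribute names to type, one option is to list them from a few sample items you already have. The sketch below is a hypothetical helper, not part of Apify. It assumes nested attributes are written in dot notation (for example price.amount), which this article does not confirm, so compare the result with the names shown in the dropdown menu.

```python
from typing import Any, Iterator


def iter_field_paths(value: Any, prefix: str = "") -> Iterator[str]:
    """Yield attribute names of a nested dict, parents before children."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield path
            yield from iter_field_paths(child, path)


def list_field_paths(items: list[dict[str, Any]], limit: int = 2000) -> list[str]:
    """Collect unique attribute names in first-seen order, capped at `limit`."""
    seen: dict[str, None] = {}
    for item in items:
        for path in iter_field_paths(item):
            seen.setdefault(path, None)
    return list(seen)[:limit]


# Hypothetical sample items, made up for illustration.
sample_items = [
    {"title": "Example", "price": {"amount": 10, "currency": "USD"}},
    {"title": "Other", "rating": 4.5},
]

paths = list_field_paths(sample_items)
```

The default limit of 2000 simply mirrors the cap mentioned above. Raise it if you want to see names that the dropdown menu would not show.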
Step 4: Export the dataset
After selecting the fields, review your choices to make sure you have included every attribute you need and omitted the ones you don't. Click the Preview button to see your cleaned dataset. Once you are satisfied, proceed with the dataset export. The system will generate a file in your preferred format containing only the attributes you’ve selected.
Following the instructions above, you can easily download only the data you need 😎