---

title: How to Create a Web Scraper Using the Apify SDK for Python

---

# How to Create a Web Scraper Using the Apify SDK for Python

Building a web scraper often involves setting up a development environment and installing various dependencies before you can even get started. This can be time-consuming, which hurts your overall productivity. With Apify, you can streamline this process: the platform lets you efficiently set up, develop, run, deploy, and maintain your web scrapers in one place.

## Creating an actor from the Python Scrapy template

To get started, sign up for Apify. Once you're logged in to the Console, navigate to the **Actors** page and click the **Create an Actor** button in the top-right corner of the page. You'll then be offered two ways to develop an actor - the first is using existing source code and the second is using a code template. In this tutorial, we'll focus on the latter, which is quicker and easier to use. While the platform offers templates in three languages - Python, JavaScript, and TypeScript - we'll use the Scrapy template from the Python section.

Once you have selected ``Scrapy``, you'll be taken to the template's detail page, where you can find more information about the template, including a code preview. Scrapy, unlike Beautiful Soup, is an extensible web scraping framework ideal for large-scale projects. To edit the code, click the **Use this Template** button, which will redirect you to the built-in web IDE. After editing, click the **Build** button to build and deploy your actor on the platform.
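To get a feel for the kind of extraction such a template performs, here's a minimal, standard-library-only sketch of title scraping (this is illustrative, not the template's actual spider code, which uses Scrapy selectors):

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collects the text inside the <title> tag of an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title = ''

    def handle_starttag(self, tag, attrs):
        if tag == 'title':
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == 'title':
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


parser = TitleExtractor()
parser.feed('<html><head><title>Apify</title></head><body></body></html>')
print(parser.title)  # Apify
```

Scrapy wraps this kind of extraction in a much more convenient API (CSS and XPath selectors on the response object), plus request scheduling, retries, and pipelines - which is why it scales better than hand-rolled parsing for larger projects.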

## Deploying a web scraper from a local machine

Alternatively, you can edit the code locally on your machine by clicking the **Use locally** button instead. For this, you'll need the Apify CLI, which you can install by running the `brew install apify/tap/apify-cli` command in the terminal on macOS/Linux. You can also install it via npm by running `npm install -g apify-cli`. Once it's installed, simply run `apify create my-actor` and choose the Python, Scrapy template.

Essentially, these commands create a folder named `my-actor`, install all the required dependencies, and generate the boilerplate code for you. You can go ahead and edit the code by moving into the folder - `cd my-actor` - and opening it in your preferred code editor. This is the `src/main.py` file, where you can make all your edits:

```python
from __future__ import annotations

from scrapy.crawler import CrawlerProcess

from apify import Actor
from apify.scrapy.utils import apply_apify_settings

# Import your Scrapy spider here
from .spiders.title import TitleSpider as Spider

# Default input values for local execution using `apify run`
LOCAL_DEFAULT_START_URLS = [{'url': 'https://apify.com'}]


async def main() -> None:
    """
    Apify Actor main coroutine for executing the Scrapy spider.
    """
    async with Actor:
        Actor.log.info('Actor is being executed...')

        # Process Actor input
        actor_input = await Actor.get_input() or {}
        start_urls = actor_input.get('startUrls', LOCAL_DEFAULT_START_URLS)
        proxy_config = actor_input.get('proxyConfiguration')

        # Add start URLs to the request queue
        rq = await Actor.open_request_queue()
        for start_url in start_urls:
            url = start_url.get('url')
            await rq.add_request(request={'url': url, 'method': 'GET'})

        # Apply Apify settings; these will override the Scrapy project settings
        settings = apply_apify_settings(proxy_config=proxy_config)

        # Execute the spider using Scrapy CrawlerProcess
        process = CrawlerProcess(settings, install_root_handler=False)
        process.crawl(Spider)
        process.start()
```
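When you run the actor locally, its input is read from the default key-value store (with the Apify CLI, that is `storage/key_value_stores/default/INPUT.json` - path is the CLI's default layout). An input matching what the code above reads via `Actor.get_input()` might look like this (field names taken from the code; the `useApifyProxy` flag is one common proxy option):

```json
{
    "startUrls": [{ "url": "https://apify.com" }],
    "proxyConfiguration": { "useApifyProxy": true }
}
```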

You can test that it's working by running the `apify run` command - the scraped results will be stored in the ``storage/datasets`` directory. To deploy the actor, run the `apify login` command, which will prompt you for your Apify API token. You can find this token in the Apify Console under **Settings** > **Integrations**. Finally, run `apify push` to build and deploy your actor on the Apify platform. You can view your actor in the Console under the **Actors** tab.
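Locally, each scraped item typically lands in its own JSON file under the default dataset directory. The sketch below simulates that layout in a temporary folder (the `storage/datasets/default` path and numbered file names mirror the CLI's defaults - adjust if your setup differs) and shows how to collect the results into one list:

```python
import json
import tempfile
from pathlib import Path

# Simulate the local dataset layout: one JSON file per scraped item.
storage = Path(tempfile.mkdtemp()) / 'storage' / 'datasets' / 'default'
storage.mkdir(parents=True)
(storage / '000000001.json').write_text(
    json.dumps({'url': 'https://apify.com', 'title': 'Apify'}))
(storage / '000000002.json').write_text(
    json.dumps({'url': 'https://apify.com/store', 'title': 'Store'}))

# Collect every dataset item into a single list, in file order.
items = [json.loads(p.read_text()) for p in sorted(storage.glob('*.json'))]
print(len(items), items[0]['title'])  # 2 Apify
```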

The build process usually takes 5-10 seconds to complete. Once it's done, provide a URL to scrape data from and click the **Start** button at the bottom. The scraped data will be stored in **Storage** under the **Dataset** tab, from where you can export it in various formats, including JSON, CSV, XML, Excel, HTML Table, RSS, and JSONL.