Set up your environment
Before getting started, ensure you have the following installed on your
system:
- Python 3.8 or higher
- Flask
- Requests
- Beautiful Soup
Install the necessary packages using pip:
pip install flask requests beautifulsoup4
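To confirm the installation worked before moving on, a short check along these lines can help. It is a minimal sketch: the helper name `check_environment` is made up for this example, and it only tests that each module can be found, not which version is installed. Note that the `beautifulsoup4` package is imported under the name `bs4`.

```python
import sys
from importlib.util import find_spec

# Import names, not pip package names: beautifulsoup4 is imported as "bs4".
REQUIRED_MODULES = ["flask", "requests", "bs4"]


def check_environment(modules=REQUIRED_MODULES, min_python=(3, 8)):
    """Return a dict mapping each prerequisite to True (present) or False."""
    report = {"python": sys.version_info >= min_python}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        report[name] = find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'MISSING'}")
```

Any entry reported as `MISSING` means the corresponding package still needs to be installed in the active Python environment.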
Creating a basic Flask app
Start by creating a folder for your project, then create a new file named
`app.py` inside it. In `app.py`, write a basic Flask app:
    from flask import Flask, render_template

    app = Flask(__name__)


    @app.route("/")
    def home():
        return render_template("index.html")


    if __name__ == "__main__":
        app.run(debug=True)
Design the scraping function
In this step, create a scraping function to fetch data from the example news
portal, `https://www.example-news.com`. We’ll use the Requests library to
send HTTP requests and Beautiful Soup for parsing the HTML.
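    import requests
    from bs4 import BeautifulSoup


    def scrape_news():
        url = "https://www.example-news.com"
        # A timeout keeps the request from hanging indefinitely.
        response = requests.get(url, timeout=10)
        # Raise an exception on 4xx/5xx responses instead of parsing an error page.
        response.raise_for_status()
        soup = BeautifulSoup(response.content, "html.parser")
        headlines = []
        # Collect the text of every <h3> element with the "headline" class.
        for headline in soup.find_all("h3", class_="headline"):
            headlines.append(headline.get_text(strip=True))
        return headlines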
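To see what the `find_all("h3", class_="headline")` lookup actually matches, it can help to run the parsing step against a small HTML string instead of a live site. The sketch below assumes made-up markup (`SAMPLE_HTML`) and a hypothetical helper named `parse_headlines`. A real news page will likely use different tags and class names, so inspect its HTML and adjust the selector accordingly.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; the actual site
# may use different tags or class names.
SAMPLE_HTML = """
<html><body>
  <h3 class="headline"> Markets rally </h3>
  <h3 class="headline featured">Storm warning issued</h3>
  <h3 class="byline">By a staff writer</h3>
  <h2 class="headline">Not an h3</h2>
</body></html>
"""


def parse_headlines(html):
    """Extract headline text from an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    # class_="headline" matches any <h3> whose class list contains "headline",
    # so the "headline featured" element is included as well.
    tags = soup.find_all("h3", class_="headline")
    # get_text(strip=True) removes surrounding whitespace from each headline.
    return [tag.get_text(strip=True) for tag in tags]


print(parse_headlines(SAMPLE_HTML))
# prints ['Markets rally', 'Storm warning issued']
```

Keeping the parsing in its own function like this also makes it possible to test the selector without sending any HTTP requests.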
Integrating scraping function with Flask
Add the `scrape_news` function to `app.py` (or import it there), then update
the existing `home` route to call it and pass the headlines to the
`index.html` template:
    @app.route("/")
    def home():
        headlines = scrape_news()
        return render_template("index.html", headlines=headlines)
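Because the `home` route calls `scrape_news` on every page load, each visit triggers a new request to the news site. One possible refinement, not part of the original walkthrough, is to reuse the last result for a few minutes. The sketch below uses only the standard library; the name `make_cached` and the 300-second default are assumptions chosen for illustration.

```python
import time


def make_cached(fetch, ttl_seconds=300, clock=time.monotonic):
    """Wrap fetch() so its result is reused for ttl_seconds."""
    state = {"value": None, "expires": None}

    def cached():
        now = clock()
        # Call fetch() on first use, or once the stored result has expired.
        if state["expires"] is None or now >= state["expires"]:
            state["value"] = fetch()
            state["expires"] = now + ttl_seconds
        return state["value"]

    return cached


# Possible usage in app.py (scrape_news is the function defined earlier):
#     get_headlines = make_cached(scrape_news)
#     ...
#     headlines = get_headlines()
```

This simple version does not lock around the refresh, so two requests arriving at the same moment may both call `fetch`. For a small demo app that is usually acceptable.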
Creating the HTML template
Create a new folder named `templates` inside your project folder, and create
a file named `index.html` inside it. In this file, display the scraped
headlines using Jinja2 templating:
    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>Headlines</title>
    </head>
    <body>
        <h1>Today's Headlines</h1>
        <ul>
            {% for headline in headlines %}
            <li>{{ headline }}</li>
            {% endfor %}
        </ul>
    </body>
    </html>
Testing the web scraper
Finally, test your web scraper by running the Flask application with the
command `python app.py`. It will start the development server, and you can
visit http://localhost:5000 in your browser to see the latest news headlines
displayed in your app.
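Besides checking the page in a browser, the route can also be exercised from code with Flask's built-in test client. The following is a self-contained sketch rather than the app from this article: it swaps `scrape_news` for a stub that returns made-up headlines and uses an inline template string in place of `templates/index.html`, so it runs without network access or extra files. The helper name `check_home_page` is hypothetical.

```python
from flask import Flask, render_template_string

app = Flask(__name__)

# Inline stand-in for templates/index.html so the sketch runs on its own.
TEMPLATE = (
    "<ul>{% for headline in headlines %}"
    "<li>{{ headline }}</li>"
    "{% endfor %}</ul>"
)


def scrape_news():
    # Stub replacing the real network call; returns fixed, made-up headlines.
    return ["First headline", "Second headline"]


@app.route("/")
def home():
    return render_template_string(TEMPLATE, headlines=scrape_news())


def check_home_page():
    """Request "/" through Flask's test client and return (status, body)."""
    client = app.test_client()
    response = client.get("/")
    return response.status_code, response.get_data(as_text=True)


if __name__ == "__main__":
    status, body = check_home_page()
    print(status, "First headline" in body)
```

The same pattern applies to the real app: import `app` from `app.py`, replace `scrape_news` with a stub during the test, and assert on the status code and the rendered HTML.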
Conclusion
In this article, we used Flask to build a web app that displays headlines
collected by a web scraper written in Python. This is just a starting point.
With this foundation, you can build more complex applications, explore
different sources of data, and customize your web app’s appearance and
functionality. I hope it helps developers who want to get started with web
scraping.