Extracting URLs (faster) with Python

The recommended approach to HTML parsing with Python is BeautifulSoup. It's a great library and easy to use, but it is a bit slow when processing a lot of documents. In this blog post, I would like to highlight some alternative ways to extract URLs from HTML documents without using BeautifulSoup. I added a performance test at the end to compare the alternatives. Beautif…

Read more
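
To give a flavour of what extracting URLs without BeautifulSoup can look like, here is a minimal sketch that uses only the standard library's html.parser module. It is not necessarily one of the alternatives compared in the full post, and it makes no claim about speed; the names LinkExtractor and extract_urls and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.urls.append(value)


def extract_urls(html):
    """Return all non-empty href values from <a> tags, in document order."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.urls


# Illustrative input only: two real links and one anchor without an href.
sample = (
    '<p><a href="https://example.com/a">A</a> '
    '<a name="x">no link</a> '
    '<a href="/b">B</a></p>'
)
print(extract_urls(sample))
```

The sketch returns relative links such as /b unchanged; resolving them against a base URL (for example with urllib.parse.urljoin) would be a separate step.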

How to create a thumbnail API service in 5 minutes

In this blog article, I would like to show you how to develop an API service for creating thumbnails with AWS Lambda in less than 5 minutes. The service will accept pictures over a REST API and return thumbnails generated with ImageMagick. In a second step, we are going to store the thumbnails directly in S3 and return their publicly accessible links. We will use Chalice, a new framework for Python from Am…

Read more