I needed to get notified of a new job opening at a faculty, but their system did not have a subscription feature. I have been using Scrapyd for a couple of years for work reasons, but I wanted something super-easy that did not need any framework.
Problem
My problem was super clear (and easy). I wanted to receive, together with another person, an email whenever a new item was added to a list on a Web page.
The check would run daily, but the page itself is only updated weekly or monthly, so I would get roughly one email per week.
You can find all the code in the link at the end of the article.
Subproblems
There were a few open questions:
- Which email service to use? I needed a free one, and a limited tier was fine for my use case.
- Should I use a framework?
- How to schedule it?
Email Service
I had actually heard good things about MailChimp and wanted to give it a try. Not the tool for the job. MailChimp per se gives you the SMTP parameters you can use to set up a tiny notification system. I was hoping to avoid storing the SMTP configuration somewhere and just use an API key to send an email.
If you look at Mandrill, a MailChimp add-on, it looks good but is still not the tool for the job. Look at the limitations:
Limitations
The Mailchimp Transactional demo includes the following limitations:
You can send up to 500 transactional emails to any email address on your verified domain. Read more about domain verification and authentication here.
Unfortunately, I did not see that at the beginning. So I went all the way to configure the DNS records to prove ownership of rafspiny.eu before I realized this tool was no use to me.
Basically, useless in my use case. So I started googling and found Mailjet. Just perfect for my needs. A very limited number of emails, but to any domain. Just about right.
You can easily create an API key and use it with the official Mailjet Python API wrapper.
Did I mention that it also comes with a simple but effective dashboard?
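As a rough sketch of what sending the notification through the mailjet-rest wrapper might look like: the helper names, addresses and subject below are placeholders of mine, not taken from the repo. The Client(auth=..., version="v3.1") and send.create(data=...) calls follow the mailjet-rest README, and the payload uses the Send API v3.1 message format.

```python
def build_payload(sender, recipients, subject, text):
    """Build a Mailjet Send API v3.1 payload for one plain-text message.

    All arguments are illustrative; the real project may structure this differently.
    """
    return {
        "Messages": [
            {
                "From": {"Email": sender},
                "To": [{"Email": address} for address in recipients],
                "Subject": subject,
                "TextPart": text,
            }
        ]
    }


def send_notification(api_key, api_secret, payload):
    """Send the payload through Mailjet and return the HTTP status code."""
    # Imported here so the payload can be built and inspected
    # even where mailjet-rest is not installed.
    from mailjet_rest import Client

    client = Client(auth=(api_key, api_secret), version="v3.1")
    response = client.send.create(data=payload)
    return response.status_code
```

Keeping the payload construction separate from the network call makes it easy to check what would be sent before spending any of the free-tier quota.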
Framework
The piece of software I needed solves a very simple problem. So I opted for very simple tools.
- lxml to parse the page with XPath
- mailjet-rest to send the notification
- Poetry to manage the dependencies and the venv
What I like about Poetry is how easy it makes the following things:
- Managing updates for your dependencies
- Handling production and dev dependencies
- Coping with different versions of Python
- Seamlessly switching between virtual envs
Although it is a bit overkill for a problem like mine, I like to show how useful Poetry can be.
Structure
I gave the code a bit of structure by organizing it as follows:
.
├── business
│ ├── email_provider.py
│ ├── __init__.py
│ └── logic.py
├── conf
│ ├── constants.py
│ ├── email_config.py
│ ├── __init__.py
│ └── secrets.py
├── data_layer
│ ├── data_storage.py
│ └── __init__.py
├── existing_vacancies.json
├── main.py
├── model
│ ├── __init__.py
│ └── models.py
├── poetry.lock
├── pyproject.toml
└── README.md
Models
It is self-explanatory. And simple, I hope.
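For illustration only, a model for one entry of the monitored list could be as small as the dataclass below. The class name and its two fields are my assumptions; the actual models.py in the repo may look different.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Vacancy:
    """One entry of the monitored list (hypothetical fields)."""

    title: str
    url: str

    def to_dict(self):
        """Return a plain dict that can be written to a JSON file."""
        return asdict(self)

    @classmethod
    def from_dict(cls, data):
        """Rebuild a Vacancy from a dict produced by to_dict()."""
        return cls(title=data["title"], url=data["url"])
```

Making the dataclass frozen also makes it hashable, so instances can be put in a set when comparing the current page against what was seen before.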
Constants
Just gathered all the standard configuration that I needed
Business
The basic operation I needed to execute
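A minimal sketch of that operation, assuming the page lists its items as elements reachable with a single XPath expression: parse the page with lxml, extract the item titles, and keep only those not seen before. The function names and the XPath in the usage note are hypothetical, and the expression must select elements rather than text nodes.

```python
from lxml import html


def extract_titles(page_source, xpath):
    """Return the stripped text of every element matched by the XPath expression."""
    tree = html.fromstring(page_source)
    return [node.text_content().strip() for node in tree.xpath(xpath)]


def find_new(current, known):
    """Return the items of `current` that are not in `known`, in page order."""
    seen = set(known)
    return [item for item in current if item not in seen]
```

With a made-up page where the vacancies are links inside a list, extract_titles(page, "//ul[@class='vacancies']/li/a") would yield the titles, and find_new(titles, stored_titles) would give the ones worth an email.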
Data layer
Access to the storage
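Since the project tree contains an existing_vacancies.json file, the storage is presumably a plain JSON file. A minimal sketch under that assumption, with hypothetical function names and a flat list of items as the file format:

```python
import json
from pathlib import Path


def load_known(path):
    """Return the stored list of items, or an empty list on the first run."""
    file = Path(path)
    if not file.exists():
        return []
    return json.loads(file.read_text(encoding="utf-8"))


def save_known(path, items):
    """Overwrite the storage file with the given list of items."""
    Path(path).write_text(json.dumps(items, indent=2), encoding="utf-8")
```

Each daily run would then load the known items, compare them with the page, send an email if anything is new, and save the updated list.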
Schedule
For the schedule, you can easily set up a cron job. That's what I always do. With Poetry it's even easier.
When you fire up your favourite cron file editor (crontab -e in my case), you should add something like this:
40 07 * * * cd $HOME/work/LeidenMonitor && $HOME/.local/bin/poetry run python main.py >/tmp/cronlog
I hope this can be useful to anyone who wants to do something similar.
If you want to look at the whole code, you can go to the GitHub repo.