Uploaded by Johnathan Pruitt

full stack python 2020

advertisement
Full Stack Python: 2020 Supporter's Edition
Table of Contents
1. Introduction........................................ 8
1.1 Learning Programming............................ 8
The Python Language............................ 12
Why Use Python?................................ 16
Python 2 or 3?................................. 19
Enterprise Python.............................. 23
1.2 Python Community............................... 27
Companies Using Python......................... 30
Best Python Resources.......................... 34
Must-watch Python Videos....................... 38
Podcasts....................................... 42
2. Development Environments........................... 47
2.1 Text Editors and IDEs.......................... 50
Vim............................................ 54
Emacs.......................................... 61
Sublime Text................................... 64
PyCharm........................................ 67
Jupyter Notebook............................... 69
2.2 Shells......................................... 74
Bourne-again shell (Bash)...................... 75
Zsh............................................ 77
PowerShell..................................... 78
2.3 Terminal multiplexers.......................... 80
tmux........................................... 82
Screen......................................... 83
2.4 Environment configuration...................... 83
Application dependencies....................... 85
virtual environments (virtualenvs)............. 91
Localhost tunnels.............................. 92
2.5 Source Control................................. 92
Git............................................ 98
Mercurial..................................... 104
3. Data.............................................. 107
3.1 Relational databases.......................... 111
2
Full Stack Python: 2020 Supporter's Edition
3.2
3.3
3.4
3.5
3.6
4. Web
4.1
4.2
PostgreSQL.................................... 116
MySQL......................................... 123
SQLite........................................ 127
Object-relational mappers..................... 131
SQLAlchemy.................................... 138
Peewee........................................ 143
Django ORM.................................... 146
Pony ORM...................................... 150
NoSQL......................................... 151
Redis......................................... 155
MongoDB....................................... 159
Apache Cassandra.............................. 162
Neo4j......................................... 165
Data analysis................................. 166
pandas........................................ 169
SciPy & NumPy................................. 172
Data visualization............................ 175
Bokeh......................................... 178
d3.js......................................... 181
Matplotlib.................................... 184
Markup Languages.............................. 185
Markdown...................................... 185
reStructuredText.............................. 188
Development................................... 190
Web Frameworks................................ 193
Django........................................ 198
Flask......................................... 206
Bottle........................................ 212
Pyramid....................................... 215
Falcon........................................ 221
Morepath...................................... 221
Sanic......................................... 223
Other web frameworks.......................... 225
Template Engines.............................. 227
Jinja2........................................ 231
Mako.......................................... 234
3
Full Stack Python: 2020 Supporter's Edition
Django Templates.............................. 235
4.3 Web design.................................... 236
HTML.......................................... 240
CSS........................................... 242
Responsive Design............................. 248
Minification.................................. 250
4.4 CSS Frameworks................................ 251
Bootstrap..................................... 251
Foundation.................................... 253
4.5 JavaScript.................................... 253
React......................................... 257
Vue.js........................................ 258
Angular....................................... 261
4.6 Task queues................................... 261
Celery........................................ 266
Redis Queue (RQ).............................. 270
Dramatiq...................................... 272
4.7 Static site generators........................ 272
Pelican....................................... 278
Lektor........................................ 280
MkDocs........................................ 282
4.8 Testing....................................... 283
Unit testing.................................. 287
Integration testing........................... 290
Debugging..................................... 291
Code Metrics.................................. 294
4.9 Networking.................................... 297
HTTPS......................................... 298
WebSockets.................................... 300
WebRTC........................................ 307
4.10 Web APIs..................................... 309
Microservices................................ 312
Webhooks..................................... 314
Bots......................................... 315
4.11 API creation................................. 318
API Frameworks............................... 323
4
Full Stack Python: 2020 Supporter's Edition
Django REST Framework........................ 324
4.12 API integration.............................. 325
Twilio....................................... 328
Stripe....................................... 330
Slack........................................ 331
Okta......................................... 333
4.13 Web application security..................... 334
SQL injection................................ 338
Cross Site Request Forgery................... 339
5. Web App Deployment................................ 341
5.1 Hosting....................................... 346
Servers....................................... 346
Static content................................ 350
Content Delivery Networks..................... 351
5.2 Virtual Private Servers (VPS)................. 352
Linode........................................ 353
DigitalOcean.................................. 353
Lightsail..................................... 354
5.3 Platform-as-a-Service......................... 354
Heroku........................................ 359
PythonAnywhere................................ 360
AWS Codestar.................................. 360
5.4 Operating systems............................. 361
Ubuntu Linux.................................. 365
macOS......................................... 367
FreeBSD....................................... 369
Windows....................................... 367
5.5 Web servers................................... 369
Apache HTTP Server............................ 373
Nginx......................................... 374
Caddy......................................... 378
5.6 WSGI servers.................................. 379
Green Unicorn................................. 384
uWSGI......................................... 387
mod_wsgi...................................... 387
5.7 Continuous integration........................ 388
5
Full Stack Python: 2020 Supporter's Edition
Jenkins....................................... 392
GoCD.......................................... 394
5.8 Configuration management...................... 395
Ansible....................................... 397
Salt.......................................... 399
5.9 Containers.................................... 400
Docker........................................ 403
Kubernetes.................................... 406
5.10 Serverless Architectures..................... 410
AWS Lambda................................... 414
Azure Functions.............................. 418
Google Cloud Functions....................... 419
6. DevOps............................................ 421
6.1 Monitoring.................................... 423
Datadog....................................... 427
Prometheus.................................... 428
Rollbar....................................... 429
Sentry........................................ 429
6.2 Web App Performance........................... 430
Logging....................................... 433
Caching....................................... 432
Web Analytics................................. 437
7. Meta
About the author.................................. 442
What "full stack" means........................... 442
6
Full Stack Python: 2020 Supporter's Edition
Introduction
You're knee deep in learning Python programming. The syntax is starting to
make sense. The first few "ahh-ha!" moments hit you as you learn conditional
statements, for loops and classes while playing around with the open source
libraries that make Python such an amazing programming ecosystem.
Now you want to take your initial Python knowledge and make something
real, like a web application to show off or sell as a service to customers. That's
where Full Stack Python comes in. You've come to the right place to learn
everything you need to know to create, deploy and operate your Pythonpowered projects.
Learning Programming
Learning to program is about understanding how to translate thoughts into
source code that can be executed on computers to achieve one or more goals.
There are many steps in learning how to program, including
1. setting up a development environment
2. selecting a programming language, of which Python is just one of many
amazing ecosystems that you can decide to use
3. understanding the syntax and commands for the language
4. writing code in the language, often using pre-existing code libraries
and frameworks
5. executing the program
6. debugging errors and testing for unexpected results
7. deploying an application so it can run for intended users
How should I learn programming?
There are several schools of thought on how a person should start learning to
program. One school of thought is that a lower-level programming language
such as Assembly or C are the most appropriate languages to start with
because they force new developers to write their own data structures, learn
about pointers and generally work their way through the hard problems in
computer science.
8
Full Stack Python: 2020 Supporter's Edition
There's certainly wisdom in this "low-level first" philosophy because it forces
a beginner to gain a strong foundation before moving on to higher level topics
such as web and mobile application development. This philosophy is the one
most commonly used in university computer science programs.
The atomic units of progress in the "low-level first" method of learning are
1. aspects of programming language understood (type systems, syntax)
2. number of data structures coded and able to be used (stacks, queues)
3. algorithms in a developer's toolbelt (quicksort, binary search)
Another school of thought is that new developers should bootstrap themselves
through working on projects in whatever programming language interests
them enough to keep working through the frustrations that will undoubtably
occur.
In this "project-based" line of thinking, the number of projects completed that
expand a programmer's abilities are the units of progress. Extra value is
placed on making the projects open source and working with experienced
mentors to learn what he or she can improve on in their programs.
Another way to learn that combines the project-based learning with defined
objectives is to play a computer game that will guide you through the learning
process. For example, TwilioQuest teaches the basics of Python in one of its
missions and then has a ton of free content for studying intermediate and
advanced topics.
Should I learn Python first?
Python is good choice in the project-based approach because of the extensive
availability of free and low cost introductory resources, many of which provide
example projects to build upon.
Note that this question of whether or not Python is a good first language for
an aspiring programmer is highly subjective and these approaches are not
mutually exclusive. Python is also widely taught in universities to explain the
fundamental concepts in computer science, which is in line with the "low-level
first" philosophy than the projects-first method.
In a nutshell, whether Python is the right first programming language to learn
is up to your own learning style and what feels right. If Ruby or Java seem like
9
Full Stack Python: 2020 Supporter's Edition
they are easier to learn than Python, go for those languages. Programming
languages, and the ecosystems around them, are human-made constructs.
Find one that appears to match your personal style and give it a try, knowing
that whatever you choose you'll need to put in many long days and nights to
really get comfortable as a software developer.
Practice problems
Working on practice programming challenges and studying their solutions in
Python or another language is a great way to learn whether you are just
starting or an experienced developer. Here are numerous open source
repositories and sites with practice problems and solutions:
• Pytudes are an awesome collection of Python programs to practice and
demonstrate skills. These problems go above and beyond the common
data structures and algorithm questions often found in other practice
problem sets.
• Interactive Python coding interview challenges is an awesome Jupyter
Notebook to learn and test your data structures and algorithms
knowledge in Python.
• Kindling project provides a wonderful list of resources that challenge
beginners with programming problems that beginners can solve to
grow their skills.
• Build your own "x" does not contain practice problems but instead
provides tutorials for how to build your own programming languages,
blockchain, bots, databases, frameworks and more awesome projects.
• Python Programming Exercises is a free short PDF book with exercises
across many standard Python language features such as dictionaries,
classes and functions.
• Code problems provides common algorithm and data structures
challenges with solutions in several programming languages including
Python.
• Python basics contains materials and exercises to learn basic Python 3
syntax such as variables, functions and lists.
• TeachCraft combines Minecraft with Python to learn coding.
10
Full Stack Python: 2020 Supporter's Edition
• 500 Data Structures and Algorithms practice problems and their
solutions covers a large swath of the computer science space. It is not
important to know all of these algorithms and data structures but
experience with many of them will be greatly beneficial in becoming a
better developer.
First-hand advice
These articles are written by programmers who explain how they learned to
code. They should not be taken as "this is how you must learn" but instead
give example paths you can think about taking as a beginner:
• Learning to program is a long read but goes through Dan's experience
in math and engineering before fully committing to software
development.
• Developing as a developer gives general advice on qualities necessary
to become a programmer, including persistence, respecting others and
considering ideas that are outside your comfort zone.
• Mastering programming by Kent Beck contains patterns and
observations for how experienced programmers he has worked with in
the past became great software developers.
• This Picture Will Change the Way You Learn to Code covers a well done
graphics of many up-to-date concepts and tools that developers use.
The post reminds you that you will not and should not learn everything
but that you should pick tools you want to gain experience in while
generally knowing what else is out there.
Teaching perspectives
Are you an experienced programmer working with new and junior
programmers? These articles give some insight into how you may want to
structure your teaching experience:
• Five Principles For Programming Languages For Learners is a
perspective on teaching children to program but is good advice for an
audience of any age.
11
Full Stack Python: 2020 Supporter's Edition
• Teach tech with cartoons is an awesome resource that explains how you
can use simple but fun drawings to teach otherwise difficult technical
concepts to students.
• Teaching programming to working professionals is a video podcast
with Trey Hunter about his experience teaching Python to experienced
professionals.
• Teach Yourself Computer Science is intended as a self-teaching tool
with many resources that are classic computer science textbooks. There
are also nice explanations for why each resource is useful in your
learning and teaching journey.
Python Programming Language
The Python programming language is an open source, widely-used tool for
creating software applications.
What is Python used for?
Python is often used to build and deploy web applications and web APIs.
Python can also analyze and visualize data and test software, even if the
software being tested was not written in Python.
Language concepts
Python has several useful programming language concepts that are less
frequently found in other languages. These concepts include:
• generators
• comprehensions
• application dependency handling via the built-in venv (as of Python
3.3) and pip (as of Python 3.4) commands
Generators
Generators are a Python core language construct that allow a function's return
value to behave as an iterator. A generator can allow more efficient memory
usage by allocating and deallocating memory during the context of a large
number of iterations. Generators are defined in PEP255 and included in the
language as of Python 2.2 in 2001.
12
Full Stack Python: 2020 Supporter's Edition
Comprehensions
Comprehensions are a Python language construct for concisely creating data
in lists, dictionaries and sets. List comprehensions are included in Python 2
while dictionary and set comprehensions were introduced to the language in
Python 3.
Why are comprehensions important?
Comprehensions are a more clear syntax for populating conditional data in
the core Python data structures. Creating data without comprehensions often
involves nested loops with conditionals that can be difficult for code readers
to properly evaluate.
Example comprehensions code
List comprehension:
>>> double_digit_evens = [e*2 for e in range(5, 50)]
>>> double_digit_evens
[10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 5
8, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98]
Set comprehension:
>>> double_digit_odds = {e*2+1 for e in range(5, 50)}
{11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 5
9, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99}
Dictionary comprehension:
>>> {e: e*10 for e in range(1, 11)}
{1: 10, 2: 20, 3: 30, 4: 40, 5: 50, 6: 60, 7: 70, 8: 80, 9: 90, 10: 100}
General Python language resources
• The online Python tutor visually walks through code and shows how it
executes on the Python interpreter.
• Python Module of the Week is a tour through the Python standard
library.
• A Python interpreter written in Python is incredibly meta but really
useful for wrapping your head around some of the lower level stuff
going on in the language.
13
Full Stack Python: 2020 Supporter's Edition
• A few things to remember while coding in Python is a nice collection of
good practices to use while building programs with the language.
• Python internals: adding a new statement to Python
• Python tricks that you can't live without is a slideshow by Audrey Roy
that goes over code readability, linting, dependency isolation, and other
good Python practices.
• Python innards introduction explains how some of Python's internal
execution happens.
• What is a metaclass in Python is one of the best Stack Overflow
answers about Python.
• Armin Roacher presented things you didn't know about Python at
PyCon South Africa in 2012.
• Writing idiomatic Python is a guide for writing Pythonic code.
Python ecosystem resources
There's an entire page on best Python resources with links but the following
resources are a better fit for when you're past the very beginner topics.
• The Python Subreddit rolls up great Python links and has an active
community ready to answer questions from beginners and advanced
Python developers alike.
• The blog Free Python Tips provides posts on Python topics as well as
news for the Python ecosystem.
• Python Books is a collection of freely available books on Python,
Django, and data analysis.
• Python IAQ: Infrequently Asked Questions is a list of quirky queries on
rare Python features and why certain syntax was or was not built into
the language.
• A practical introduction to Functional Programming for Python coders
is a good starter for developers looking to learn the functional
programming paradigm side of the language.
14
Full Stack Python: 2020 Supporter's Edition
• Getting Started with the Python Internals takes a slice of the huge
CPython codebase and deconstructs some of it to see what we can learn
about how Python itself is built.
Comprehension resources
• Comprehending Python’s Comprehensions is an awesome post by Dan
Bader with a slew of examples that explain how list, dictionary and set
comprehensions should be used.
• Python List Comprehensions: Explained Visually explains how the
common idiom for iteration became syntactic sugar in the language
itself and how you can use it in your own programs.
• The Python 3 Patterns and Idioms site has an overview of
comprehensions including code examples and diagrams to explain how
they work.
• Comprehensions in Python the Jedi way shows off comprehensions
with a Star Wars theme to walk through the nuts and bolts. All
examples use Python 3.5.
• Idiomatic Python: Comprehensions explains how Python's
comprehensions were inspired by Haskell's list comprehensions. It also
provides clear examples that show how comprehensions are shorthand
for common iteration code, such as copying one list into another while
performing some operation on the contained elements.
• List comprehensions in Python covers what the code for list
comprehensions looks like and gives some example code to show how
they work.
• An Introduction to Python Lists is a solid overview of Python lists in
general and tangentially covers list comprehensions.
Python generator resources
• This blog post entitled Python Generators specifically focuses on
generating dictionaries. It provides a good introduction for those new
to Python.
15
Full Stack Python: 2020 Supporter's Edition
• Generator Expressions in Python: An Introduction is the best allaround introduction to how to use generators and provides numerous
code examples to learn from.
• Python 201: An Intro to Generators is another short but informative
read with example generators code.
• Iterators & Generators provides code examples for these two constructs
and some simple explanations for each one.
• Python: Generators - How to use them and the benefits you receive is a
screencast with code that walks through generators in Python.
• The question to Understanding Generators in Python? on Stack
Overflow has an impressive answer that clearly lays out the code and
concepts involved with Python generators.
• Generator Tricks for Systems Programmers provides code examples for
using generators. The material was originally presented in a PyCon
workshop for systems programmers but is relevant to all Python
developers working to understand appropriate ways to use generators.
Why Use Python?
Python's expansive library of open source data analysis tools, web
frameworks, and testing instruments make its ecosystem one of the largest
out of any programming community.
Python is an accessible language for new programmers because the
community provides many introductory resources. The language is also
widely taught in universities and used for working with beginner-friendly
devices such as the Raspberry Pi.
16
Full Stack Python: 2020 Supporter's Edition
Python's programming language popularity
Several programming language popularity rankings exist. While it's possible
to criticize that these guides are not exact, every ranking shows Python as a
top programming language within the top ten, if not the top five of all
languages.
The IEEE ranked Python as the #1 programming language in 2019, which
continued its hot streak after ranking it #1 in 2018, #1 in 2017 and #3 top
programming language in 2016. RedMonk's June 2019 ranking had Python at
#3, which held consistent from previous years' rankings in 2018 and 2017.
Stack Overflow's community-created question and answer data confirms the
incredible growth of the Python ecosystem and tries to determine why it
growing so quickly with their own analysis. In the 2020 Stack Overflow
developer survey the data indicated that Python was the fastest growing major
programming language and that there is a close alignment between the
languages and tools that developers choose to learn and the usage in
developers' professional work.
The TIOBE Index a long-running language ranking, has Python moving up
the charts to #3, climbing from #8 just a few years ago.
The PopularitY of Programming Language (PYPL), based on leading
indicators from Google Trends search keyword analysis, shows Python at #1.
GitHut, a visualization of GitHub language popularity, pegs Python at #3
overall.
These rankings provide a rough measure for language popularity. They are
not intended as a precise measurement tool to determine exactly how many
developers are using a language. However, the aggregate view shows that
Python remains a stable programming language with a growing ecosystem.
Why does the choice of programming language matter?
Programming languages have unique ecosystems, cultures and philosophies
built around them. You will find friction with a community and difficulty in
learning if your approach to programming varies from the philosophy of the
programming language you've selected.
17
Full Stack Python: 2020 Supporter's Edition
Python's culture values open source software, community involvement with
local, national and international events and teaching to new programmers. If
those values are also important to you and/or your organization then Python
may be a good fit.
The philosophy for Python is so strongly held that it's even embedded in the
language as shown when the interpreter executes "import this" and displays
The Zen of Python.
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
More perspectives on using Python
Programming language rankings and the philosophy behind a language
provide solid initial data points for why you should learn Python. These
resources also give perspectives on why people switched from other
programming communities and why they advocate for Python as a primary
language.
• How to argue for Python’s use explains that choosing a programming
language can be complicated but that Python is a very good option for
many use cases.
• Why I Push for Python gives one professor's rationale for promoting
Python to teach programming to undergraduates.
18
Full Stack Python: 2020 Supporter's Edition
• If you're wondering about the differences in Python's dynamically
typed system versus statically typed languages, be sure to read this
thorough explanation of the topic.
• Why I swapped C#.NET for Python as my default language and
platform (and won’t be going back) provides a viewpoint from someone
who is not a professional developer but uses coding to hack out some
projects. He found Microsoft's .NET ecosystem lacking when it came to
satisfying his needs and Python filled the gap for him with its wide
array of open source code libraries, package management and ability to
work well on platforms other than Windows.
• Python, Machine Learning, and Language Wars compares Python with
R, MATLAB and Julia for data science work. While Python is great for
deployment automation and web development, many non-developers
are first introduced to the language and ecosystem while getting data
analysis work done.
• Evangelizing Python for Business contains helpful hints if you are
trying to convince your company to use Python, particularly for web
development.
• Python: Beyond Just Web Apps supplies non-web development project
examples that use Python. The article also does a solid job comparing
and contrasting Python to other common programming languages such
as Java, Ruby and JavaScript.
• Python or Ruby for web development gives reasons for one language
compared to the other just focused on the web development space.
• If you are worried about the community split between Python 2 and
Python 3, you should go with Python 3. To read more about how
companies handled upgrading, check out Instagram's smooth move to
Python 3, practical steps for moving to Python 3 and lessons learned
from migrating to Python 3.
Python 2 or 3?
The Python programming language is almost finished with a long-term
transition from version 2 to version 3. New programmers often have
questions about which version they should learn. It can be confusing to hear
19
Full Stack Python: 2020 Supporter's Edition
that Python 3, which was originally released in 2008, is still not the default
installation on some operating systems such as macOS. However, that
situation is rapidly changing as the final version 2 release, Python 2.7, is
approaching its end-of-life that is definitively scheduled for January 1, 2020.
The simple answer right now is: learn Python 3, specifically the latest version
which as of October 2019 is Python 3.7. If for some reason you absolutely have
to learn Python 2, for example because your employer is working on a bunch
of legacy code, you will be able to transfer the majority of your knowledge
from Python 2 right into Python 3. Likewise, you will still be able to read and
write Python 2 code if you start with Python 3.
There are enough great resources out there that will teach you to code in
version 3 without any prior version 2 experience. Python 3 is the future and
you will not regret starting with the latest version of the programming
language.
There is one small caveat to the recommendation to go full-on Python 3. You
may infrequently come across lesser-used open source code libraries that
were originally written in Python 2 that do not completely support Python 3.
That was the case before 2019 with DevOps configuration management tools
such as Fabric or Ansible. However, those libraries now support Python 3 and
the usage problems that were frequent in years past are now typically not a
concern. Knowing how to upgrade Python 2 libraries to 3.x is still a useful skill
to apply at the edges of the Python open source community.
Visualizations and Projects
Since upgrading from Python 2 to 3 has been such a huge undertaking within
the community, many projects have sprung up to make the transition easier.
• six is a 2/3 compatibility library that is a dependency for many popular
Python projects to make it easier to support both Python 2 and 3 at the
same time.
• Python 3 Readiness is a visualization of which most popular 360
libraries (by downloads) are ready to be used with Python 3.
• The Python clock counts down the time until Python 2.x is no longer
maintained. While Python 2's retirement may still seem a long time
20
Full Stack Python: 2020 Supporter's Edition
away, it can take a lot of time and effort to migrate existing application
to the modified syntax in 3.x.
Porting to Python 3 resources
Moving an existing codebase to Python 3 from 2 can be a daunting task, These
resources were created by fellow developers who've previously gone through
the process and have advice for making it less painful.
• Python 3 Porting is an entire book with details for how to upgrade your
existing projects and libraries to Python 3.x.
• Moving from Python 2 to Python 3 is a PDF cheatsheet for porting your
Python code.
• The official porting code to Python 3 page links to resources on porting
Python code as well as underlying C implementations. There is also a
quick reference for writting code with Python 2 and 3 compatibility.
• Upgrading to Python 3 with Zero Downtime supplies advice on
transitioning a large existing Python 2 web application to Python 3.
Their process involved upgrading dependencies, testing and deploying
the new version before going back to clean up unnecessary code created
by the transition.
• Migrating to Python 3 with pleasure is a porting guide that focuses on
code that data scientists would typically use in their programs.
• Instagram Makes a Smooth Move to Python 3 explains their upgrade
process for getting all of their code over to Python 3 compatibility over
a period of about a year.
• Practical steps for moving to Python 3 is a podcast episode that goes
over migrating a large existing application's codebase to Python 3 from
Python 2.
• Lessons learned from migrating to Python 3 covers how a development
team with a large e-commerce site built on Django was able to upgrade
their project.
21
Full Stack Python: 2020 Supporter's Edition
Python 2 to 3 resources
The following resources will give you more context on how the community
feels the transition from Python 2 to 3 is going, as well as why you should
upgrade as soon as possible.
• Why should I use Python 3? is a detailed FAQ on important topics such
as unicode support, iteration improvements and async upgrades
provided by 3.x. There is also a great follow up post by the author titled
A Rebuttal For Python 3 that counters some arguments made by other
community members who are unhappy about various features in
Python 3.
• Want to know all of the advantages and what's changed in Python 3
compared to Python 2? There's an official guide to Python 3 changes
you'll want to read.
• Python 3 is winning presents data and graphs from PyPI to show that
at the current rate, by mid-2016 overall Python 3 library support will
overtake Python 2 support.
• Python 3 Retrospective from the Benevolent Dictator for Life is a talk
by Guido van Rossum on what is working, not working and still needs
to be done before the changover can be considered complete.
• The stages of the Python 3 transition provides perspective from a core
Python developer on how the transition from Python 2 to 3 is going as
of the end of 2015.
• How Dropbox rolled out one of the largest Python 3 migrations ever
explains how their transition began in 2015 and was successfully
completed in 2018. There is also a follow up post on incrementally
migrating over one million lines of code from Python 2 to Python 3 that
has more details on how hack weeks were able to help make enough
progress so the engineers could better estimate the scope of work when
the transition from 2 to 3 became critical to their development
toolchain.
• Zato: A successful Python 3 migration story examines the background,
preparation, execution and testing of moving an existing Python 2 code
base over to Python 3.
22
Full Stack Python: 2020 Supporter's Edition
• Why Python 3? randomly outputs valid reasons to use Python 3 over
2.x.
• Rules for Radicals: Changing the Culture of Python at Facebook is a
fascinating look at how Facebook moved from primarily Python 2 up to
Python 3 due to the efforts of a small passionate group of developers
within the company. Definitely worth watching to understand how to
shift a large organization with an established codebase.
• Porting to Python 3 is like eating your vegetables explains that there
are treats in Python 3 that are worth porting for and has some tips on
making the transition easier.
• Scrapy on the road to Python 3 support explains from the perspective
of a widely used Python project what their plan is for supporting
Python 3 and why it has taken so long to make it happen.
• All major scientific Python libraries have pledged to drop Python 2
support no later than 2020, when Python 2's maintenance life is over.
The pledge strongly encourages Python 3 adoption by publicly stating
their intentions.
• 10 awesome features of Python that you can't use because you refuse to
upgrade to Python 3 is a great slideshow with code snippets that show
useful new features of Python 3 that are not available in 2.x such as
keyword-only arguments, chained exceptions and the yield from
keyword.
Enterprise Python
One of the misconceptions around Python and other dynamically-typed
languages is that they cannot be reliably used to build enterprise-grade
software. However, almost all commercial and government enterprises
already use Python in some capacity, either as glue code between disparate
applications or to build the applications themselves.
What is enterprise software?
Enterprise software is built for the requirements of an organization rather
than the needs of an individual. Software written for enterprises often needs
to integrate with legacy systems, such as existing databases and non-web
23
Full Stack Python: 2020 Supporter's Edition
applications. There are often requirements to integrate with authentication
systems such as the Lightweight Directory Access Protocol (LDAP) and Active
Directory (AD).
Organizations develop enterprise software with numerous custom
requirements to fit the specific needs of their operating model. Therefore the
software development process often becomes far more complicated due to
disparate factions within an organization vying for the software to handle
their needs at the expense of other factions.
The complexity due to the many stakeholders involved in the building of
enterprise software leads to large budgets and extreme scrutiny by nontechnical members of an organization. Typically those non-technical people
place irrational emphasis on the choice of programming language and
frameworks when otherwise they should not make technical design decisions.
Why are there misconceptions about Python in enterprise
environments?
Traditionally large organizations building enterprise software have used
statically typed languages such as C++, .NET and Java. Throughout the 1980s
and 1990s large companies such as Microsoft, Sun Microsystems and Oracle
marketed these languages as "enterprise grade". The inherent snub to other
languages was that they were not appropriate for CIOs' difficult technical
environments. Languages other than Java, C++ and .NET were seen as risky
and therefore not worthy of investment.
In addition, "scripting languages" such as Python, Perl and Ruby were not yet
robust enough in the 1990s because their core standard libraries were still
being developed. Frameworks such as Django, Flask and Rails (for Ruby) did
not yet exist. The Web was just beginning and most enterprise applications
were desktop apps built for Windows. Python simply wasn't made for such
environments.
Why is Python now appropriate for building enterprise
software?
From the early 2000s through today the languages and ecosystems for many
dynamically typed languages have greatly improved and often surpassed some
24
Full Stack Python: 2020 Supporter's Edition
aspects of other ecosystems. Python, Ruby and other previously derided
languages now have vast, well-maintained open source ecosystems backed by
both independent developers and large companies including Microsoft, IBM,
Google, Facebook, Dropbox, Twilio and many, many others.
Python's open source libraries, especially for web development and data
analysis, are some of the best maintained and fully featured pieces of code for
any language.
Meanwhile, some of the traditional enterprise software development
languages such as Java have languished due to underinvestment by their
major corporate backers. When Oracle purchased Sun Microsystems in 2009
there was a long lag time before Java was enhanced with new language
features in Java 7. Oracle also bundles unwanted adware with the Java
installation, whereas the Python community would never put up with such a
situation because the language is open source and does not have a single
corporate controller.
Other ecosystems, such as the .NET platform by Microsoft have fared much
better. Microsoft continued to invest in moving the .NET platform along
throughout the early part of the new millennium.
However, Microsoft's enterprise products often have expensive licensing fees
for their application servers and associated software. In addition, Microsoft is
also a major backer of open source, especially Python, and their Python tools
for Visual Studio provide a top-notch development environment.
The end result is that enterprise software development has changed
dramatically over the past couple of decades. CIOs and technical executives
can no longer ignore the progress of Python and the great open source
community in the enterprise software development landscape if they want to
continue delivering business value to their business side customers.
Open source enterprise Python projects
Python is widely used across large enterprise organizations but the code is
often not put out as open source. If you come across projects that are
appropriate for this list, contact me to let me know:
• Collab by the U.S. government's Consumer Financial Protection Bureau
(CFPB) agency is a corporate intranet and collaboration platform for
25
Full Stack Python: 2020 Supporter's Edition
large organizations. The project is currently running and in-use by
thousands of CFPB employees.
• Pants is a build system for software projects with many distinct parts
and built with many different programming languages as is often the
case in large organizations.
Enterprise Python software development resources
The following articles cover topics in enterprise development that are often
not discussed when dealing with standard Python development.
• Talk Python to Me's fourth episode interviewed PayPal's lead developer
on Enterprise Python and Large-Scale Projects. They rebuke many of
the myths around Python for large scale projects including the variable
typing system and scalability.
• Building and deploying an Enterprise Django Web App in 16 hours
covers one developer's experience researching, building and deploying
an enterprise application in Python and Django.
• Mozilla's Enterprise InfoSec resources are programming language
agnostic but very useful to developers trying to understand all the
jargon that goes along with the enterprise security domain.
• The end of enterprise IT is a fascinating essay that actually does not
talk about Python in particular but shows how large enterprise IT
departments such as the one at ING Bank have to evolve their
structure, processes and tools to successfully ship software.
Programming languages such as Python are more likely to be used in
these updated polyglot and DevOps-driven environments.
• There are a couple of solid demystifying articles in CIO magazine
including this broad overview of Python in enterprises and this article
on why dynamic languages are gaining share for enterprise
development.
• JavaWorld wrote an interesting article about Python's inroads into
enterprise software development.
26
Full Stack Python: 2020 Supporter's Edition
• I gave a talk at DjangoCon 2014 on How to Solve the Top 5 Headaches
with Django in the Enterprise which covered both Python and Django
in large organizations.
• This StackExchange answer contains a solid explanation what
differentiates enterprise software from traditional software.
• There was a Python subreddit thread about Python in the enterprise
that's worth a look for broader opinions on Python compared to Java
and .NET in enterprise environments.
• Why are enterprises so slow? is not specific to Python but is a fantastic
article on the regulatory, cultural and financial reasons why large
companies often move so slowly.
Python Community
The Python programming language has a global community with millions of
software developers who interact online and offline in thousands of virtual
and physical locations.
Who drives the Python community?
There are tens of thousands of Python developers who help steer the
community with local, regional and global events. Most, if not all of them, are
members of the Python Software Foundation (PSF). The PSF is a 501(c)(3)
non-profit with the mission to "promote, protect, and advance the Python
programming language, and to support and facilitate the growth of a diverse
and international community of Python programmers." Anyone that uses
Python and has an active interest in the Python community can join the PSF
as a member. There are five classes of PSF members:
27
Full Stack Python: 2020 Supporter's Edition
1.
2.
3.
4.
5.
Basic Members
Supporting Members
Sponsor Members
Managing Members
Contributing Members and Fellows
Starting out by joining as a basic PSF member is a great way to show your
support and begin your own journey in working with the larger open source
ecosystem.
What is a PEP and why do they matter?
Python Enhancement Proposals (PEPs) are design documents that serve to
drive Python's continued evolution. There are three kinds of PEPs that serve
disinct purposes:
1. Standards track: enhances the Python language with new features
2. Informational: provides information to the Python community
3. Process: modifies or makes improvements on topics relevant to the
community but outside the Python language itself
PEP 1 defines what a PEP is and its purpose. In true computer science
fashion, PEP 0 also exists and it is an index of all the PEPs that have been
created.
The PEPs are important because they drive a transparent process to evolve
the language and broader ecosystems. Some other programming communities
are opaquely pushed by a single person or small cliques who refuse to
understand outside perspectives. The insular nature of some groups typically
causes decline over time as the original community members move on to new
projects and no new members take their place.
Conferences and events
The Python online community has fantastic resources for learning but talking
to your fellow developers in-person at conferences, meetups and hackathons
is a crucial way to discover new tools and coding approaches. The following
resources provide perspective on offline events like PyCon US.
• How to have a great first PyCon gives a slew of advice on how to get the
most out of your busy, and often overwhelming, PyCon experience.
28
Full Stack Python: 2020 Supporter's Edition
• Clinton Dreisbach wrote two fantastic retrospectives for PyCon US
2015 (his first PyCon) and PyCon US 2016. There are many other
retrospectives for other community-led conferences such as
EuroPython. These summaries can be a great way to get a slice of the
experience before purchasing a ticket and booking a trip.
• One reason among many that Python conferences are spectacular
experiences is the transparency about the how and why decisions are
made for event attendees. For example, PyCon is transparent about the
locations it selects for the conference as well as why the community has
a long-standing tradition of having speakers pay their own way just as
any conference attendee would. From the outside it may appear these
decisions are minor but they instill confidence in the stewardship of the
Python ecosystem by current community leadership.
• How I Learned to Stop Worrying and Love PyCon goes over one firsttime PyCon attendee's nervousness about attending the conference and
how she got over it while having a great time in the process.
• Don't Submit A Talk About Your Great Hack: Jesse's Seven Tips To Get
Into PyCon provides wonderful advice on how to approach the Call for
Proposals (CFP) when you want to speak at a Python conference. The
notes on how to balance specifics with vagarity in your title is
particularly useful.
Python community resources
The Python community exists both online and offline. The following resources
will help you get connected with other Python developers in both forms. They
also provide context on how some of the major decisions are made within the
community.
• The community page on Python.org provides a starter page with links
to community-run newsletters, resources and conferences.
• There are many large active online communities on Reddit and IRC
channels such as #python, #python-dev and #distutils.
• The Python community has a concept known as "Benevolent Dictator
For Life" that may appear odd to newcomers. Essentially, Guido Van
Rossum created the language and still has the ability to decide
29
Full Stack Python: 2020 Supporter's Edition
community arguments one way or the other. This post on the origin of
BDFL has more context about Guido's role.
• Python Community and Python at Dropbox is an interview with Jessica
McKellar, one of the most visible Python core committers and
organizers for her fantastic coding and community work. She explains
what it means to be a member and leader in the larger Python
community.
• The history behind the decision to move Python to GitHub is a
transparent and personal story by one of the Python Core Team
members, Brett Cannon, on why the main Python projects including
the language itself are now hosted for development on GitHub. The
post is a wonderful read about the history of where Python
development was centralized and how it moved from SourceForge to
svg.python.org and then over to GitHub.
• Python community trends in 2017/2018 by Ewa Jodlowska, who is
heavily involved in PyCon and the overall Python ecosystem, provides a
ton of data and metrics on areas of community interest such as Python
developers' locations, Python 3 adoption and years of experience as
professional developers.
• There are some amazing communities within the overall Python
ecosystem, such as PyLadies, which encourages women to be strong,
proactive community members.
Companies using Python
The Python programming language is widely used by companies around the
world to build web apps, analyze data, automate operations via DevOps and
create reliable, scalable enterprise applications.
There is also a fantastic list of organizations using Python on the Python.org
wiki as well as a detailed write-up of several top Python-powered companies
on Real Python's blog.
Many companies do not even realize they are using Python across their
organizations. For example, if a company is a "Java-only shop" but they use
IBM WebSphere as a web application server then they have to use Python to
30
Full Stack Python: 2020 Supporter's Edition
script the server's configuration! Python has a habit of getting in everywhere
regardless of whether the usage is intentional.
Financial institutions
Python is widely-used across financial institutions, whether they are hedge
funds, large banks or regulators (see "Government Agencies" section below).
• Goldman Sachs uses Python and often asks candidates about their
experience with the language during the interview process.
• You can see publicly what companies are using internally by looking at
job descriptions on sites like Glassdoor with "Python Goldman Sachs"
keywords and Indeed for JP Morgan Chase. Salaries and
responsibilities vary widely based on the role and whether Python is
used for data analysis, web application development or DevOps.
• PayPal uses Python across their entire infrastructure and often writes
great technical blog posts on packaging, optimization using C and
configuring DNS.
Large tech companies
Large technology companies tend to be polyglot (use many programming
languages rather than standardizing on one), with Python either as a primary
language or the "glue" that helps all the other languages fit together. The
following articles explain how these leading large companies like Uber, Twilio,
Netflix and Facebook uses Python in their development stacks.
• Uber's tech stack contains a significant amount of Python, which they
documented in a series of engineering posts. Part one describes the
lower backend levels, which are written in Python, with Node.js, Go
and Java mixed in. Part two explains the higher levels of the
marketplace and user interfaces.
• Twilio uses Python with Django and the Wagtail content management
system to power the amazing Twilio documentation as well as
TwilioQuest. They wrote a post about how TwilioQuest was built that
goes into detail on the code including the usage of the front-end Vue.js
framework. Twilio also uses Flask to run the REST API endpoints and
31
Full Stack Python: 2020 Supporter's Edition
open sourced the Flask-RESTful framework so other developers could
cut down the boilerplate in their web APIs.
• Netflix uses Python throughout their organization to run chaos
engineering tests and generally glue together the code from their highfunctioning polyglot teams. Netflix also wrote a 2019 update for PyCon
US to give more detail on what teams and projects work in Python.
• Python 3 at Mozilla explains how their "build system, CI configuration,
test harnesses, command line tooling and countless other scripts, tools
or Github projects are all handled by Python". So just about everything
a developer touches every day to build anything else needs Python to
hook into the larger organization!
• Google uses Python extensively and officially supports it internally as
one of their three core languages, the other two being Java and Golang.
While Google likely has every programming language running
somewhere in their infrastructure, Python receives priority support due
to its core language status.
• Dropbox is well-known for using Python across their application
development, infrastructure and operations. They also did a good job of
cornering the market on hiring well-known Python core contributors
for a period of time, such as Guido van Rossum and Jessica McKellar
(although Jessica is now at a new company that she co-founded).
• Facebook and Instagram use Python 3 at scale. They've been very vocal
about successfully making the migration from the Python 2 world into
Python 3.
• A significant portion of Reddit is built in Python and it is one of the
largest sites at scale to use the programming language.
• Increment covers usage of Python (and other programming languages)
at Lyft, Digital Ocean, Sauce Labs, Slack and Fastly in this awesome
overview post titled "What its like to be a developer at...".
Government agencies
Python usage in government agencies is widespread despite the reputation of
agencies as stodgy late technology adopters. Organizations range from
32
Full Stack Python: 2020 Supporter's Edition
financial industry regulators like the SEC and CFPB, to intelligence agencies
like the CIA, FBI and NSA.
• The Consumer Financial Protection Bureau (CFPB) not only uses
Python for running most of their applications but also open sources
many of those Python projects for other agencies (or any organization)
to use. For example, collab is a Django project that provides an
enterprise application for storing and looking up information on
employees and contractors.
• NASA uses Python extensively and open sources much of their
software.
• The United States Central Intelligence Agency (CIA) appears to be a
huge fan of using Python in its state sponsored hacking tools. They
even published their own Python code conventions documentation due
to how many developers at the agency are using the language.
• The SEC uses Python and proposes organizations use Python to comply
with regulations.
• A quick search for government jobs that require or recommend Python
via USAJobs turns up numerous listings at organizations such as the
Smithsonian Institution, Department of Education, Department of the
Navy and National Institute of Standards and Technology (NIST).
Industry-specific Python guides
Python is so widely used across various industries that developers have
written guides specific to their occupations for how to use Python. The
following resources are guides for using Python in astronomy, social sciences
and other fields rather than specific companies.
• Practical Business Python covers business-related topics such as how to
automate generating large Excel spreadsheets or perform analysis
when your data is locked in Microsoft Office files.
• Practical Python for Astronomers gives open source workshop
materials to teach astronomy students how to use Python for data
analysis.
33
Full Stack Python: 2020 Supporter's Edition
• Python for the Humanities is a textbook on the basics of text processing
in Python. The material ramps up quickly after the first chapter so you
will likely want to combine this walkthrough with other great Python
learning resources.
• Python for Social Scientists and Real Python's Python for Social
Scientists walkthroughs are specific to fields that work with a lot of
data gathered from studies such as psychology, sociology and
economics.
Best Python Resources
The Python community is amazing at sharing detailed resources and helping
beginners learn to program with the language. There are so many resources
out there though that it can be difficult to know how to find them.
This page aggregates the best general Python resources with descriptions of
what they provide to readers.
New to programming
If you're learning your first programming language these books were written
with you in mind. Developers learning Python as a second or later language
should skip down to the next section for "experienced developers".
• TwilioQuest is an free and incredible 16-bit adventure game that
teaches programming in the Python basics mission. It's an absolutely
way to stay engaged with the core concepts that can often be otherwise
dry to learn. There are also more advanced missions to learn about web
APIs, and new learning missions are added regularly on new
programming topics.
• Automate the Boring Stuff with Python is an incredible book for both
non-developers and professional developers alike. Each chapter walks
through a situation that can be automated using Python such as
manipulating images, organizing your files and programmatically
controlling your mouse and keyboard to handle any sort of tasks.
• CS for All is an open book by professors at Harvey Mudd College which
teaches the fundamentals of computer science using Python. It's an
accessible read and perfect for programming beginners.
34
Full Stack Python: 2020 Supporter's Edition
• This short 5 minute video explains why it's better to think of projects
you'd like to build and problems you want to solve with programming.
Start working on those projects and problems rather than jumping into
a specific language that's recommended to you by a friend.
• A Python Crash Course gives an awesome overview of the history of
Python, what drives the programming community and dives into
example code. You will likely need to read this in combination with
other resources to really let the syntax sink in, but it's a great article to
read several times over as you continue to learn.
• The Python projects tag on the Twilio blog is constantly updated with
fun tutorials you can build to learn Python, such as the International
Space Station Tracker with Flask and Redis-Queue, Choose Your Own
Adventures Presentations using Flask and WebSockets and Martianify
Photos with OpenCV.
• A Byte of Python is a beginner's tutorial for the Python language.
• Google put together a great compilation of materials and subjects you
should read and learn from if you want to be a professional
programmer. Those resources are useful not only for Python beginners
but any developer who wants to have a strong professional career in
software.
• The O'Reilly book Think Python: How to Think Like a Computer
Scientist is available in HTML form for free on the web.
• Looking for ideas about what projects to use to learn to code? Check
out this list of 5 programming projects for Python beginners.
• Python for you and me is an approachable book with sections for
Python syntax and the major language constructs. The book also
contains a short guide at the end to get programmers to write their first
Flask web application.
Python for specific occupations
Python is powerful for many professions. If you're seeking to use Python in a
specific field, one of these guides may be the most appropriate for you.
35
Full Stack Python: 2020 Supporter's Edition
• Python for Social Scientists contains a textbook, course outline and
slides for a college course that taught social scientists to use Python for
their profession.
• Practical Business Python is a blog that covers topics such as how to
automate generating large Excel spreadsheets or perform analysis
when your data is locked in Microsoft Office files.
• Python for the Humanities is a textbook and course on the basics of
Python and text processing. Note if you've never worked with Python
before the material ramps up quickly after the first chapter so you will
likely want to combine it with some other introduction to Python
resources.
• Practical Python for Astronomers provides open source workshop
materials for teaching students studying astronomy to use Python for
data analysis.
Experienced developers new to Python
If you can already program in another language, these resources are better for
getting up to speed because they are more concise when explaining
introductory topics.
• Learn Python in y minutes provides a whirlwind tour of the Python
language. The guide is especially useful if you're coming in with
previous software development experience and want to quickly grasp
how the language is structured.
• Developing a Real-Time Taxi App with Django Channels and Angular is
a great tutorial for jumping into a real project instead of a simple
starter application while learning common Python concepts and tools
such as Django, Angular, WebSockets and Redis.
• Developers familiar with other languages often have difficulty adapting
to accepted Python code style. Make sure to read the PEP8 code style
guidelines as well as The Elements of Python Style to know the Python
community standards.
36
Full Stack Python: 2020 Supporter's Edition
• Authentication with Flask, React, and Docker is another detailed
course that shows how to combine Flask, React, Docker and Heroku to
build a solid intermediate-to-advanced web application and deploy it.
• Essential Reads for Any Python Programmer is a great collection of
advice for developers coming to Python from another programming
language ecosystem such as Java.
• How to Develop Quality Python Code is a good read to begin learning
about development environments, application dependencies and
project structure.
• The Python module of the week chapters are a good way to get up to
speed with the standard library. Doug Hellmann is also now updating
the list for changes brought about from the upgrade to Python 3 from
2.x.
• Composing Programs shows how to build compilers with Python 3,
which is a good undertaking if you're looking to learn both more about
the Python language and how compiles work.
• free-for-dev is not specific to Python but it's a fantastic list of free tier
resources for experienced developers. The list is especially handy if you
want to try building a Python project and need new third party services
to kick around while experimenting.
Videos, screencasts and presentations
Videos from conferences and meetups along with screencasts are listed on the
best Python videos page.
Curated Python packages lists
• awesome-python is an incredible list of Python frameworks, libraries
and software. I wish I had this page when I was just getting started.
• easy-python is like awesome-python although instead of just a Git
repository this site is in the Read the Docs format.
• Hacker News Tools of the Trade is not specific to Python but almost all
of the tools and services are useful to building software projects.
37
Full Stack Python: 2020 Supporter's Edition
Podcasts
Take a look at the best Python podcasts section for a curated list of both
Python-specific and general software development podcasts.
Newsletters
Python's active community constantly publishes new tutorials and
walkthroughs. It is easier to keep up if you follow along by subscribing to
several email newsletters that round up and curate the best new resources. I
subscribe to all of the following newsletters and find that each one has its own
unique take on what resources are most important to send out to the
community.
• Python Weekly is a free weekly roundup of the latest Python articles,
videos, projects and upcoming events.
• PyCoder's Weekly is another great free weekly email newsletter similar
to Python Weekly. The best resources are generally covered in both
newsletters but they often cover different articles and projects from
around the web.
• Talk Python's Friends of the Show always highlights new episodes of
this wonderful podcast as well as new useful tools to add to your
developer toolbelt.
• Awesome Python Newsletter provides another solid selection of new
and existing tutorials along with an extensive issues archive with
previous links to resources.
Best Python Videos
If you prefer to learn Python programming by watching videos then this is the
resource for you. I've watched hundreds of live technical talks and combed
through videos to pick out the ones with great speakers who'll teach you the
most about the language and ecosystem.
This page links to the best free videos as well as other video lists so you can do
your own searching through the huge backlog of conference and meetup talks
from the past several years.
38
Full Stack Python: 2020 Supporter's Edition
Web development videos
The following web development videos cover the broad topics of using web
frameworks like Django, Flask, Pyramid and other frameworks, as well as web
design and deployments.
• My EuroPython 2014 "Full Stack Python" talk goes over each topic
from this guide and provides context for how the pieces fit together.
The talk slides are also available. Even though the talk is from 2014,
almost all of the general web development principles remain consistent
in 2018.
• Kate Heddleston gave a talk at PyCon 2014 called "Full-stack Python
Web Applications" with clear visuals for how numerous layers of the
Python web stack fit together. There are also slides available from the
talk with all the diagrams.
• Taking Django Async is a great overview by Andrew Godwin, who
created South (now Django Migrations as part of the core framework)
and Django Channels. He discusses the synchronous blocking worker
design of WSGI and why it is incompatible with asynchronous
protocols like WebSockets. A potential solution could be a new protocol
like Asynchronous Server Gateway Interface (ASGI), but how far would
the integration into Django need to go and would it be worth the pain?
Andrew does a great job of mixing the philosophical questions with
technical implementation details throughout the talk.
• Design 101 for Developers covers a difficult topic for many analyticminded developers to learn: design. The talk is not specific to web
development or web design but instead explains general design
principles such as spacing, consistency and making interactions
obvious for a user.
• How Netflix does failovers in 7 minutes flat by Amjith Ramanujam at
PyCon US was an excellent talk on handling network outages and
predicting the reliability of failover routes. This talk is worth watching
even for the data visualizations alone!
• Kate Heddleston and I gave a talk at DjangoCon 2014 called Choose
Your Own Django Deployment Adventure which walked through many
of the scenarios you'd face when deploying your first Django website.
39
Full Stack Python: 2020 Supporter's Edition
• The Discover Flask series is a detailed Flask tutorial on video with
corresponding code examples on GitHub.
• Designing Django's Migrations covers Django 1.7's new migrations
from the main programmer of South and now Django's built-in
migrations, Andrew Godwin.
• DjangoCon US videos from 2019, 2018, 2017, 2016, 2015, 2014, are all
available free of charge.
• DjangoCon EU videos are also available from 2017, 2016, and 2015.
• GoDjango screencasts and tutorials are free short videos for learning
how to build Django applications.
• The videos and slides from Django: Under the Hood 2015 are from
Django core committers and provide insight into the ORM,
internationalization, templates and other important web framework
topics.
Core Python language videos
The core Python programming language has many new features now that
almost all community resources are working on Python 3 instead of split
across legacy 2.x branches. The following videos cover topics within the core
Python language primarily relevant to Python 3 features although some can
be used with Python 2 as well.
• Jessica McKellar's Building and breaking a Python sandbox is a
fascinating walk through the lower layers of the Python interpreter.
• Brandon Rhodes' All Your Ducks In A Row: Data Structures in the Std
Lib and Beyond goes through how data structures are implemented,
how to select a data structure appropriate to your application and how
the list and dictionary can be used in many situations.
• Guido van Rossum's Python Language keynote talk from PyCon 2016
reinforced that there would be no Python version 2.8 and that
development on backported security releases into the Python 2 branch
would end by January 1, 2020. Guido also covered many topics
important to the Python language community like expanding the
number and backgrounds of core committers.
40
Full Stack Python: 2020 Supporter's Edition
• The talk Python Descriptors by Simeon Franklin explains the what and
why of this core Python language feature.
• David Beazley gives an amazing live coded performance to show
Python concurrency using threads, event loops and coroutines. David
makes the live coding look easy but a whole lot of work must've gone
into that talk.
• What is a Python Core Developer? explains the responsibilities,
projects, repositories and expectations of core Python committers as
well as how to become one.
• Google's Python Class contains lecture videos and exercises for
learning Python.
Video compilations
All major Python conferences, as well as most regional ones, release technical
talk videos for free. These sites either aggregate the thousands of videos that
have been released or are lists from specific conferences like PyCon US and
EuroPython.
• PyVideo organizes and indexes thousands of Python videos from both
major conferences and meetups.
• Incredible Technical Speakers is a repository I put together that
features software developer speakers talking about programming
language agnostic topics. The list is intended to emphasize professional
software developers who also have the ability to engage an audience of
peers with an exciting talk. These talks are relevant to all software
developers even though not every talk is specific to the Python
language.
• PyCon US videos from 2019, 2018, 2017, 2016, 2015 and 2014 are all
available online for free.
• All of the talk videos are available on YouTube for EuroPython 2019,
EuroPython 2018, EuroPython 2017, EuroPython 2016, EuroPython
2015, EuroPython 2014 and earlier years.
41
Full Stack Python: 2020 Supporter's Edition
Best Python Podcasts
The Python community has an embarrassment of riches when it comes to free
and low cost resources for both new and experienced software developers.
These great resources include several Python podcasts that are released on
regular schedules.
This page contains a list of active Python-specific and software engineering
high-quality podcasts.
Python-specific podcasts
These podcasts are run and recorded by Python developers who focus on
topics that are important to our field. Each podcast series has a long list of
previously-recorded episodes that are still relevant even if they are several
years old so you will always have some great material to listen and learn.
• Talk Python to Me focuses on the people and organizations coding on
Python. Each episode features a different guest interviewee to talk
about his or her work.
• Podcast.__init__ is another regular podcast that presents stories about
Python and interviews "with the people who make it great".
• Python Bytes is a new podcast from the creators of the above
mentioned "Talk Python to Me" and "Test and Code Podcast".
• Test and Code Podcast focuses on testing and related topics such as
mocking and code metrics.
• Django Riffs is a podcast for learning web application development in
Python using the Django web framework.
• Teaching Python is a podcast by two teachers about their adventures
teaching middle school computer science, problem solving, handling
failure, frustration, and success with teaching Python programming.
• Professor Philip Guo has a video podcast called PG Podcast, which
typically covers Python subjects.
• The Real Python Podcast is a weekly podcast with interviews, coding
tips, and conversation with guests from the Python community.
42
Full Stack Python: 2020 Supporter's Edition
Favorite podcast episodes
Here are a list of my favorite episodes from various Python podcasts before we
dive into entire podcast recommendations in the next section. Dig into these
episodes to get a feel for various personalities and styles in podcast
presentation.
• SQLAlchemy and data access in Python brought me some muchneeded context and understanding for how the object-relational
mapping library SQLAlchemy evolved. This episode interviews
SQLAlchemy's creator. The show host, Michael Kennedy, asks great
questions based on his in-depth research and prior usage of
SQLAlchemy.
• Python past, present, and future with Guido van Rossum covers the
history of Python, Guido's motivations in creating the language and
shepherding it through almost thirty years of releases. Fun fact: the
question about whether Python being open source contributed to its
success was my question when the podcast host Michael Kennedy
asked me what topics they should talk about.
• Deploying Python Web Applications. Spoiler alert: this is the episode I
was on Talk Python to Me explaining how Python web application
deployments work.
• Python Bytes talked extensively about object-relational mappers
(ORMs) on episode #39 where a lot of the discussion was based on the
Full Stack Python ORMs page. Thanks guys, it was great to hear the
feedback about what worked and what didn't work for you about the
explanations!
• The Python at Netflix episode of Talk Python to Me provided an
awesome view inside the largest internet site by bandwidth and how
Python fits into their polyglot organization.
• Another great Talk Python to Me episode, Python in Finance, explains
how Python is used in the broad finance industry for stock trading,
quantitative analysis and data analytics. Check this one out if you have
wondered how opaque private companies like hedge funds use Python
to make (a lot of) money.
43
Full Stack Python: 2020 Supporter's Edition
General software development podcasts
These podcasts are not specific to Python but cover relevant software
engineering topics and often have episodes on Python topics. At the very least
you will become a better general software developer by listening and learning
from them.
• Software Engineering Daily incredibly has a podcast every day with a
different developer on a huge range of relevant development subjects.
• All things Git talks to developers using, building and working with Git
with new episodes every two weeks.
• CodeNewbie interviews developers who are early in their software
development journey to find out their stories of why they are
programming and what they are working on. There are also interviews
with experienced developers building well-known projects.
• Developer on Fire interviews programmers, architects and testers with
their stories of success, failure and excellence.
• Command_line Heroes covers operating system-level topics as well as
DevOps.
• Embedded.fm goes into embedded systems and hardware hacking.
• The Changelog is a weekly podcast on general software development
matters.
• Full Stack Radio. No relation to Full Stack Python but another one to
check out!
• Exponent is not a software development podcast but it covers the
intersection of corporate strategy and technology in an in-depth way
that allows me to better understand the decisions businesses make
when building and releasing software. I listen to every episode (at 1.5x
speed) and it's well worth the 45 to 60 minutes spent listening to Ben
Thompson and James Allworth go deep on a weekly topic.
• Test Talks examines a software testing topic each week, usually with a
special guest focused on the area being examined.
• The Cloudcast focuses on cloud computing and DevOps-related topics.
44
Full Stack Python: 2020 Supporter's Edition
Data science and analysis podcasts
The data science community does not only use Python as its core
programming language but it plays a big role in almost every organization
performing data analysis. The following podcasts cover data science broadly
and often get specific into Python ecosystem tools.
• DataFramed is a data science podcast that often covers Python libraries
and other areas of interest to people using Python to analyze data.
• Data Skeptic covers data science, statistics, machine learning, artificial
intelligence and "scientific skepticism".
• Data stories is a podcast on data visualization.
• Partially Derivative was a podcast for machine learning, artificial
intelligence and the data community. They finished their last episode in
late 2017 but the episodes list contains a slew of existing content.
45
Full Stack Python: 2020 Supporter's Edition
Development Environments
A development environment is a combination of a text editor and a Python
runtime implementation. The text editor allows you to write code for your
applications. The runtime implementation, such as CPython or PyPy, provides
the method for executing your code.
A text editor can be as simple as Notepad running on Windows or a more
complicated integrated development environment (IDE) with syntax
checking, integrated test runner and code highlighting. A couple of common
IDEs for Python development are PyCharm and VSCode, both of which runs
on any major operating system.
Why is a development environment necessary?
Python code needs to be written, executed and tested to build applications.
The text editor provides a way to write the code. The interpreter allows it to be
executed. Testing to see if the code does what you want can either be done
manually or by unit and functional tests.
An example development environment
Here's what I (the author of Full Stack Python, Matt Makai) use to develop
most of my Python applications. I have a Macbook Pro with Mac OS X as its
base operating system. Ubuntu 18.04 LTS is virtualized on top with Parallels.
My code is written in vim and executed with the Python 3.6 release via the
command line. I use virtualenv to create separate Python interpreters with
47
Full Stack Python: 2020 Supporter's Edition
their own isolated application dependencies and virtualenvwrapper to quickly
switch between the interpreters created by virtualenv.
That's a common set up but you can certainly write great code with a much
less expensive set up or a cloud-based development environment.
Other developers' environments
Often the best way to figure out how to get comfortable in your own
development environment is to see examples of how other experienced
developers have set up their configurations. The following posts contain the
tools, editors and workflows that developers have taken the time to publicly
document.
• My Python development environment has a setup with Sublime Text,
Anaconda, PyCharm and the author's workflow for how to use the
different editors for different purposes.
• My Python Development Environment, 2018 Edition explains Jacob
Kaplan-Moss' (one of the original creators of the Django web
framework) local setup.
• The definitive guide to setup my Python workspace is geared towards
using Python for data science but the guide remains useful for
configuring your system for any type of Python work. There is some
solid advice in the post about not adulterating your global Python
installation as well as how to split out many virtual environments for
Python 2 & 3.
Cloud hosted dev environments
Several cloud-based development environments have popped up over the past
several years. These hosted environments can work well when you are
learning or stuck on a machine with a web browser but otherwise no
administrative privileges to install your own software. Most of these have free
tiers for getting started and then require payment as you scale up your
application.
• CodeAnywhere is a cloud IDE that can be used in the web browser or
on an iOS or Android device.
48
Full Stack Python: 2020 Supporter's Edition
• Cloud9 began as an independent company and is now owned by
Amazon as part of Amazon Web Services.
• code.xyz is an online text editor built by Stdlib that can integrate with
external web APIs.
• GitLab Web IDE is integrated into the GitLab web application for
modifying your Git repository files directly in your browser.
General dev environment resources
Development environments are unique to each programmer because Python
is used for many different purposes. The following guides range from web
development to DevOps and from getting started to data science. Even though
your environment requirements are unique, you should be able to find
someone who has set up something similar to what you need. Use that
configuration as a starting point and customize it from there.
• The Python subreddit had a nice thread with developers giving the
specifications to their Python development environments in this post
on What is in your Python Development Environment?.
• Real Python has an awesome, detailed post on setting up your Sublime
Text 3 environment as a full-fledged IDE.
• How to bootstrap a Python project covers using a virtualenv, where to
store your files, which version of Python to use and adding code
metrics libraries for checking syntax.
• Three Ways to Install Python on your Windows Computer provides
multiple avenues for Windows users to get Python on their machine
before setting up the rest of their development environment. Unlike
macOS and Linux, the Windows operating system does not include
Python with its default installation.
• PyCharm: The Good Parts shows you how to be more efficient and
productive with that IDE if it's your choice for writing Python code.
• JetBrains' PyCharm Blog is required reading if you're using the IDE or
considering trying it. One of the core developers also has an interview
on the Talk Python to Me podcast that's worth listening to.
49
Full Stack Python: 2020 Supporter's Edition
• The Joy of Linux Desktop Environments talks about desktop
environments, not specifically development environments, but
provides an explanation for why the core Linux operating system is
awesome for being unbundled from the desktop environment itself.
You can change your desktop environment from just a command line
without a windowing system to a full windowed system provided by
Gnome, KDE or Unity for using the system and getting your
programming work done.
• The Hitchhiker's Guide to Python has a page dedicated to development
environments.
• Setting up a Python Development Environment with and without
Docker explains the reasoning behind why and when to use various
tools in your local environment.
• Epic Development Environment using Windows Subsystem for Linux
is geared towards JavaScript developers but contains a slew of good
advice for developers trying to configure Windows.
Text Editors and IDEs
Text editors and integrated development environments (IDEs) are
applications for writing code. These applications are the primary user
interface for developers to create their own programs.
50
Full Stack Python: 2020 Supporter's Edition
Vim is an example of a text editor implementation that can be expanded into a
full Python IDE using configuration files and plugins.
Why is a text editor or IDE necessary?
Where will you write your code if you do not have a text editor? Your
development environment must include a text editor so you can enter, edit
and delete characters to create Python applications.
Preferrably your editor will have a monospace font. It will also get out of your
way, so no "smart" correction or automatic letter capitalization. The more
comfortable you become in your editor of choice the faster you can figure out
how to implement that next feature in your application or squash that pesky
bug that you just found.
What's the difference between editors and IDEs?
IDEs contain text editors but many text editors, for example Notepad
included with Windows, do not include IDE features. Many text editors such
as Vim or Emacs have IDE features by default but then can be further
customized to add file trees, syntax highlighting, line numbers and syntax
checking that is commonly found in full-featured IDEs.
Open source text editors
Open source provides an embarrassment of riches when its come to stable,
extendable text editors. Some version of these editors, such as the original vi
version of Vim, have been used for over 40 years! You can't go wrong with
using one of the editors as your development environment foundation.
The following text editor implementations can be upgraded with
configurations and plugins to become full-fledged IDEs when a developer
wants that kind of functionality.
• Vim is my editor of choice and installed by default on most *nix
systems.
• Emacs is another editor often used on *nix.
• Mu (source code) is an open source editor intended for Python
beginners.
51
Full Stack Python: 2020 Supporter's Edition
• Atom is an open source editor built by the GitHub team.
• Visual Studio Code by Microsoft provides spectacular Python support.
Python-specific IDEs
Editors built from the foundation up are not necessarily better than generalpurpose text editors and IDEs like Vim and Emacs but they are typically much
easier to configure for gathering code metrics, running unit tests and
debugging.
• PyCharm is a Python-specific IDE built on JetBrains' platform. There
are free editions for students and open source projects.
• Thonny is an open source Python IDE for new programmers. The tool
bakes in syntax highlighting, code completion, a simple debugger, a
beginner-friendly shell and in situ documentation to assist new
developers who are just starting to code.
• Wing IDE is a paid development environment with integrated
debugging and code completion.
• PyDev is a Python IDE plug in for Eclipse.
Proprietary (closed source) editors
There are some editors that are closed source that developers are very happy
using.
• Sublime Text versions 2 and 3 (currently in beta) are popular text
editors that can be extended with code completion, linting, syntax
highlighting and other features using plugins. If you are considering
using Sublime Text for Python development, check out this 2016 in
review - likes and dislikes about Sublime Text post that summarizes
many of the positives and negatives of using the editor.
• Komodo is a cross-platform text editor and IDE for major languages
including Python, Ruby, JavaScript, Go and more.
52
Full Stack Python: 2020 Supporter's Edition
Building your own text editor
One great way to learn more about how text editors work is by building your
own, even if it turns out to be a hacked-together proof-of-concept. These
resources give walkthroughs on building an editor, or explain how existing
editors work by digging into their source code.
• Build your own text editor provides an awesome tutorial for creating a
basic editor in the C programming language. This walkthrough is useful
to break open the black box of how a tool you use every day as a
programmer works under the covers.
• A brief glance at how various text editors manage their textual data is a
fun dive into the source code of vi, GNU Moe, Emacs, Scintilla and
GNOME GtkTextView/GtkTextBuffer.
• Xi: an editor for the next 20 years is an awesome technical talk about
designing a text editor with the current (2018) set of tools available to a
developer.
• Building a Text Editor for a Digital-First Newsroom gives some
wonderful insight into the New York Times' homegrown legacy text
editor and why they started building a new text editor named Oak that
is customized to the newsroom's workflow.
General text editor & IDE resources
These resources provide comparisons of various editors and give some deeper
insight into the IDE vs plain text editor debate.
• EditorConfig (source code) is an open source tool for keeping many
text editors and IDEs on the same code styles and configurations.
• PyCharm vs Sublime Text has a comparison of several features between
the two editors.
• What is the best IDE for Python tries to answer a loaded question and
gives some rationale behind choosing one application or another.
• Why I Deleted My IDE; and How It Changed My Life For the Better
contains some hyperbole but still has some solid reasoning why
integrated environments are not necessarily for everyone depending on
a developer's chosen workflow.
53
Full Stack Python: 2020 Supporter's Edition
• Text editing techniques every Front-End developer should know
contains tricks that experienced developers use in their editor of
choice. The author uses Sublime Text to demonstrate how the methods
work but they can be used in just about any editor.
Vim
Vim (source code), short for Vi IMproved, is a configurable text editor often
used as a Python development environment. Vim proponents commonly cite
the numerous plugins, Vimscript and logical command language as major
Vim strengths.
Why is Vim a good Python development environment?
Vim's philosophy is that developers are more productive when they avoid
taking their hands off the keyboard. Code should flow naturally from the
developer's thoughts through the keyboard and onto the screen. Using a
mouse or other peripheral is a detriment to the rate at which a developer's
thoughts become code. This "efficiency by keyboard" keeps Vim as one of the
most popular text editors despite having been around for decades. Few
programming tools have that kind of staying power.
Vim has a logical, structured command language. When a beginner is learning
the editor she may feel like it is impossible to understand all the key
commands. However, the commands stack together in a logical way so that
over time the editor becomes predictable.
Configuring Vim with a Vimrc file
The Vimrc file is used to configure the Vim editor. A Vimrc file can range from
nothing in it to very complicated with hundreds or thousands of lines of
configuration commands.
54
Full Stack Python: 2020 Supporter's Edition
Here's a short, commented example .vimrc file I use for Python development
to get a feel for some of the configuration statements:
" enable syntax highlighting
syntax enable
" show line numbers
set number
" set tabs to have 4 spaces
set ts=4
" indent when moving to the next line while writing code
set autoindent
" expand tabs into spaces
set expandtab
" when using the >> or << commands, shift lines by 4 spaces
set shiftwidth=4
" show a visual line under the cursor's current line
set cursorline
" show the matching part of the pair for [] {} and ()
set showmatch
" enable all Python syntax highlighting features
let python_highlight_all = 1
Here is how these configuration options look with a dark background on Mac
OS X while editing the markdown for this webpage (how meta!).
Take a look at another example using these configuration options, this time
with a light background and editing Python code from my Choose Your Own
Adventures Presentations project.
55
Full Stack Python: 2020 Supporter's Edition
The Vimrc file lives under the home directory of the user account running
Vim. For example, when my user account is 'matt', on Mac OS X my Vimrc file
is found at /Users/matt/.vimrc . On Ubuntu Linux my .vimrc file can be
found within the /home/matt/ directory.
If a Vimrc file does not already exist, just create it within the user's home
directory and it will be picked up by Vim the next time you open the editor.
Vim tutorials
Vim has a reputation for a difficult learning curve, but it's much easier to get
started with these tutorials.
• Learn Vim Progressively is a wonderful tutorial that follows the path I
took when learning Vim: learn just enough to survive with it as your
day-to-day editor then begin adding more advanced commands on top.
• A vim Tutorial and Primer is an incredibly deep study in how to go
from beginner to knowledgeable in Vim.
• Why Atom Can't Replace Vim discusses one of Vim's core principles:
command composability. Vim has a language where simple commands
are combined to execute more advanced operations. For example, in
command mode, $ moves to the end of a line. When $ is preceded by
d then everything to the end of the line is deleted. Over time the
simple commands become intuitive and the combinations become
56
Full Stack Python: 2020 Supporter's Edition
more powerful than having distinct commands such as a drop-down
menu with a specific option to delete all text until the end of the line.
• Vim as a Language explains the language syntax and how you can build
up over time to master the editor.
• How to install and use Vim on a cloud server along with How to use
Vim for advanced editing of code on a VPS are two detailed Digital
Ocean guides for getting up and running with Vim, regardless of
whether you're using it locally or on a cloud server.
• PacVim: a commandline game to learn Vim commands takes the
PacMan theme and teaches you how to use Vim by forcing you to move
around and use Vim commands while gaming.
• Ten years of Vim provides an insightful retrospective on one
experienced developer's journey with using Vim as a primary text
editor and development environment. I found the part about going
overboard with plugins before switching back to a simpler
configuration fascinating because it is the same path I've found myself
taking as I approach my own ten year mark with Vim.
• At least one Vim trick you might not know about is a collection of nonobvious keyboard shortcuts, many of which are infrequently used but
still useful.
• Vim Adventures is a cute, fun browser-based game that helps you learn
Vim commands by playing through the adventure.
• In Vim: revisited the author explains his on-again off-again
relationship with using Vim. He then shows how he configures and
uses the editor so it sticks as his primary code editing tool.
• Things About Vim I Wish I Knew Earlier explores the lessons one
developer learned while exclusively using Vim for several years. The
author includes using relative instead of absolute line numbering,
setting numerous configuration options and fuzzy finding to quickly
open files in other directories rather than expanding the whole path.
• Seven habits of effective text editing explains moving around
efficiently, fixing errors quickly and forming good habits.
57
Full Stack Python: 2020 Supporter's Edition
Vimrc resources
These are a few resources for learning how to structure a .vimrc file. I
recommend adding configuration options one at a time to test them
individually instead of going whole hog with a Vimrc you are unfamiliar with.
• A Good Vimrc is a fantastic, detailed overview and opinionated guide to
configuring Vim. Highly recommended for new and experienced Vim
users.
• Vim and Python shows and explains many Python-specific .vimrc
options.
• Vim as a Python IDE shows a slew of plugins and configuration options
for coding with Python in Vim.
• This repository's folder with Vimrc files has example configurations
that are well commented and easy to learn from.
• For people who are having trouble getting started with Vim, check out
this blog post on the two simple steps that helped this author learn
Vim.
Vim installation guides
These installation guides will help you get Vim up and running on Mac OS X,
Linux and Windows.
• Upgrading Vim on OS X explains why to upgrade from Vim 7.2 to 7.3+
and how to do it using Homebrew.
• The easiest way to install Vim on Windows 7+ is to download and run
the gvim74.exe file.
• On Linux make sure to install the vim package with
sudo apt-get install vim .
• If you're using PyCharm as your IDE you won't need to install Vim as a
separate text editor - instead use the IdeaVim PyCharm plugin to get
Vim keybindings, visual/insert mode, configuration with ~/.ideavimrc
and other Vim emulation features.
58
Full Stack Python: 2020 Supporter's Edition
Using Vim as a Python IDE
Once you get comfortable with Vim as an editor, there are several
configuration options and plugins you can use to enhance your Python
productivity. These are the resources and tutorials to read when you're ready
to take that step.
• VIM and Python - a Match Made in Heaven details how to set up a
powerful VIM environment geared towards wrangling Python day in
and day out.
• The python-mode project is a Vim plugin with syntax highlighting,
breakpoints, PEP8 linting, code completion and many other features
you'd expect from an integrated development environment.
• Vim as Your IDE discusses how to set up Vim for greater productivity
once you learn the initial Vim language for using the editor.
• Setting up Vim for Python has a well written answer on Stack Overflow
for getting started with Vim.
• If you're writing your documentation in Markdown using Vim, be sure
to read this insightful post on a Vim setup for Markdown.
Vim Plugin resources
• 5 Essential VIM Plugins That Greatly Increase my Productivity covers
the author's experience with the Vundle, NERDTree, ctrlp, Syntastic
and EasyMotion Vim plugins.
• Getting more from Vim with plugins provides a list of plugins with a
description for each one on its usefulness. The comments at the bottom
are also interesting as people have suggested alternatives to some of
the plugins mentioned in the post.
• Powerline is a popular statusline plugin for Vim that works with both
Python 2 and 3.
• VimAwesome is a directory of Vim plugins sourced from Vim.org,
GitHub and user submissions.
• Command-T is a Vim plugin for fast fuzzy searching files.
59
Full Stack Python: 2020 Supporter's Edition
• YouCompleteMe (source code) is a code-completion engine and plugin
that works for Python.
Vim Plugin Managers
If you use many Vim plugins together it is really handy to have a plugin
managers to sort out all of the dependencies. The following plugin managers
are the most commonly-used ones in the Vim ecosystem.
• Vundle comes highly recommended as a plugin manager for Vim.
• Pathogen is a widely used plugin manager.
• Vim-plug bills itself as a minimalistic Vim plugin manager.
Niche tutorials
After you have been using Vim for awhile there will be features you bump into
without realizing they were ever there. The following tutorials show how to
use some specific niche features. You may already know about these if you
have been using Vim for awhile but everyone's learning path is different so it's
useful to do a quick scan to make sure you are not missing anything.
• Vim’s absolute, relative and hybrid line numbers shows how to change
the line numbering scheme. There was a period of time I used relative
line numbers although I eventually switched back to absolute numbers.
The usefulness of these schemes is often dependent on what language
you are working in.
• A simpler Vim statusline explains how to customize your bottom screen
statusline without using plugins such as vim-powerline or vim-airline.
• The vim-clutch is a really cool project and walkthrough that shows how
you can create a foot pedal to switch between Normal and Insert modes
instead of using the typical ESC key (or a remapped key).
• How I'm able to take notes in mathematics lectures using LaTeX and
Vim explains how the author is able to keep up with mathematics
lectures by using Vim and LaTeX which produces gorgeous notes that
can be used to study.
60
Full Stack Python: 2020 Supporter's Edition
Emacs
Emacs (source code) is an extensible text editor that can be customized by
writing Emacs Lisp (Elisp) code.
Why is Emacs a good choice for coding Python?
Emacs is designed to be customized via the built-in Lisp interpreter and
package manager. The package manager, named package.el, has menus for
handling installation. The largest Lisp Package Archive is Melpa, which
provides automatic updates from upstream sources.
Macros are useful for performing repetitive actions in Emacs. A macro is just
a recording of a previous set of keystrokes that can be replayed to perform
future actions.
Hooks, which are Lisp variables that hold lists of functions to call, provide an
extension mechanism for Emacs. For example, kill-emacs-hook runs
before exiting Emacs so functions can be loaded into that hook to perform
necessary actions before the exiting completes.
Python plus Emacs resources
Emacs is programming language agnostic by design so it takes some effort to
customize the editor as a Python-specific development environment. The
following resources will walk you through the setups that other developers
have created for working with Python.
• Emacs - the Best Python Editor? continues the excellent Real Python
series showing how to get started with editors. In addition to this
61
Full Stack Python: 2020 Supporter's Edition
Emacs post, there are also posts on Vim and Sublime Text 3 specifically
for Python development.
• Python developers often use reStructuredText (RST) to document their
projects. This guide on Emacs Support for ReStructuredText is handy
for properly configuring your environment to work with RST files.
• Emacs as a Python IDE is a video of a technical talk where the speaker
sets up code completion, documentation lookup and the jedi-starter kit
on his Emacs environment.
• How do you create a robust Python IDE with Emacs (as the Text editor)
is a quality Stack Exchange thread that offers opinions about how to
best go about setting up Emacs for efficient Python development.
• Tricked out emacs for python coding is a short guide for handling
ReStructuredText documents and adding Python code metrics tools to
your Emacs environment.
General Emacs resources
Emacs, like any powerful tool, takes significant intentional practice to use
properly. These resources provide instructions for becoming comfortable with
the editor itself rather than specific Python environment configuration advice.
• GNU Emacs Manual provides an official in-depth review for how to use
Emacs.
• Emacs is sexy! provides a whole site with installation instructions,
cheat sheets and other materials for learning Emacs.
• Emacs Redux is a blog with tips and tricks for how to use Emacs
effectively.
• Emacs Rocks is a video tutorial series for Emacs.
• The Absolute Beginner's Guide to Emacs and Emacs as a Python IDE
are a couple of awesome detailed walkthroughs by Jessica Hamrick for
setting up Emacs for general development as well as Python coding.
• Compiling and running scripts in Emacs explains one workflow option
you can use with Emacs to run code from within the editor rather than
bouncing out to a shell.
62
Full Stack Python: 2020 Supporter's Edition
• What the .emacs.d?! provides a bunch of tiny optimizations for Emacs'
workflow.
• The Using Emacs Series is a set of videos along with an open source Git
repository of the configuration for using and gaining experience with
Emacs.
Programming Emacs with Elisp
Emacs can be completely customized and rewritten by using the Emacsspecific Lisp programming language named Emacs Lisp (Elisp). The ability to
completely modify the editor is part of what led to the old joke "a great
operating system, lacking only a decent editor". Nevertheless, Elisp is what
gives Emacs its text editing power despite the perception that the editor is
overkill for working with text.
These tutorials will help you learn the Elisp language and use it to modify
Emacs for your own purposes.
• An Introduction to Programming in Emacs Lisp is the "official"
introduction intended for a beginner programming audience.
• Emacs Lisp Guide is for developers that have been using Emacs for a
while and want to start making extensions to get more out of the editor.
• The Emacs Wiki contains advice and resources for getting oriented.
• Practical Emacs Lisp contains a bunch of useful code and focuses on
examples throughout the tutorial.
Notable Elisp Packages
Elisp is the LISP programming language dialect that Emacs using for adding
and customizing functionality in the editor. The following Elisp packages are
existing Elisp libraries that many developers using Emacs incorporate into
their environment.
• Magit allows the user to inspect and modify Git repositories from
within Emacs.
• company-mode creates a modular in-buffer completion framework.
• Flycheck provides syntax checking.
63
Full Stack Python: 2020 Supporter's Edition
• anaconda-mode is specific to Python development and allows code
navigation, documentation lookup and code completion.
• web-mode.el is a package for editing web templates like Jinja. Many
Python template engines are supported including Django templates,
Mako and Cheetah as well as JavaScript front-end frameworks.
Popular user configurations
Numerous custom Emacs user configurations exist that bundle together
custom Elisp packages and libraries to handle creating a powerful integrated
development environment. I recommend trying to configure Emacs yourself
before you dive into any of these configurations so it is easier to learn base
Emacs rather than get distracted by the customizations.
• Prelude is an enhanced Emacs version 24 distribution.
• A reasonable Emacs config shows a batteries-includes Emacs
configuration bundle.
• Emacs settings is a repository of configurations used in the Emacs
Rocks screencasts.
• Spacemacs mashes together Emacs' extensibility and Vim's ergonomic
text editing features.
Sublime Text
Sublime Text is a commonly-used text editor used to write Python code.
Sublime Text's slick user interface along with its numerous extensions for
syntax highlighting, source file finding and analyzing code metrics make the
editor more accessible to new programmers than some other applications like
Vim and Emacs.
64
Full Stack Python: 2020 Supporter's Edition
What makes Sublime Text awesome?
Sublime Text is often the first editor that newer programmers pick up because
it works on all operating systems and it is far more approachable than Emacs,
Vim or even PyCharm.
It is easy to get started in Sublime because the menus and options are
accessible by using a mouse. There are no different modes to learn like Vim's
normal and insert modes. The keyboard shortcuts can be learned over time
rather than all at once in the case of Vim or Emacs.
Sublime Text works well for beginners as soon as they install it and then can
be extended with many of the features provided by an IDE like PyCharm as a
developer's skill level ramps up.
An additional bonus of using Sublime Text as a Python developer is that
plugins are written in Python. Python developers can extend Sublime Text
with their own programming language rather than learn a new language like
Emacs' Elisp or Vim's Vimscript.
Why use any other editor if Sublime is so great?
Picking a text editor or IDE to use tends to be a weirdly personal decision for
each developer. Yet it makes sense when you realize that you are going to
spend hours upon hours every day in your chosen environment so why not
make sure it is one that is enjoyable and highly productive?
For some folks they prefer Vim's keyboard-driven style, PyCharm's Swiss
Army Knife set of Python tools or one of the many other editors with its own
strengths and weaknesses.
The only "best" editor choice is to pick one that works really well for you and
stick to it. Master your tool so it gets out of your way and enables as much
time in programming flow as possible.
Python-specific Sublime Text resources
There are many Python-specific Sublime Text tutorials and resources because
the editor is so frequently used to create Python applications. The following
links should get your editor customized with linters, code metrics, syntax
checking and many other integrated development environment features.
65
Full Stack Python: 2020 Supporter's Edition
• Setting Up Sublime Text 3 for Full Stack Python Development is a
spectacular tutorial that covers installing Sublime Text and configuring
a multitude of helpful Python programming plugins.
• Sublime Text 3 Heaven is a quick overview of the extensions, packages
and bonus toys that one developer uses for his own Sublime Text
development setup.
• Sublime Tutor is an interactive in-editor keyboard shortcuts tutorial
that plugs into Sublime so you can learn and become more productive
as you use the editor.
• Using Generators for Fun and Profit - Utility for developers is not
about setting up your Sublime Text environment but instead how to
create your own plugins using Python. The tutorial is written by the
author of a Sublime Text plugin who uses generators to implement
features with Sublime's API.
• Turning Sublime Text Into a Lightweight Python IDE shows the basic
settings and configuration specific to using Sublime with Python as
more than just a text editor.
• Setting up Sublime Text 3 for Python Type Checking shows one way of
setting up support for Python 3.6 static type checking in Sublime.
• Three steps to lint Python 3.6 in Sublime Text walks through setting up
Flake8 to enforce code style guidelines and show you the errors and
warnings in Sublime as you are working.
• Text editing techniques every front-end developer should know gives
examples in Sublime Text of time-saving text manipulation you may
not have known existed such as line bubbling, ragged line selection,
AceJump and transpose. While the techniques can be used in most
editors the provided video clips show how to perform each of these
shortcuts in Sublime.
General Sublime Text resources
Sublime Text can be used for much more than Python development and there
are many useful tutorials that are not targeted at a specific programming
language which are still useful.
66
Full Stack Python: 2020 Supporter's Edition
• Super charge your Sublime Text 3 to increase your productivity
provides many shortcuts and tricks for using the editor.
• Disassembling Sublime Text uses a binary disassembler to dive into the
reverse engineered source code of Sublime Text because it is not open
source software.
• Sync your sublime text 3 configurations safely and easy explains how to
mitigate configuration conflicts that can arise when trying to use copied
files from one computer to another.
• 7 shortcuts of a highly effective Sublime Text user shows keyboard
shortcuts for opening any file, going to any specific block of text,
handling multiple cursors and more.
Sublime Plugin resources
Sublime Text plugins are written in Python which makes it convenient for our
ecosystem to customize the editor. The following resources provide
information on writing your own plugins as well as great community plugins
you will want to take a look at adding to your installation.
• Sublime's documentation covers plugin basics, the API for plugins and
gives a "Hello, world!"-level example that you can extend.
• Sublime Text plugin development basics has some good advice and
further resources.
• The 25 Best Sublime Text Plugins for Front End Developers is not
specific to Python development but there is a bunch of overlap between
plugins useful for general front-end development and any Python web
development project.
• 5 Awesome Sublime Plugins you Won’t Find in Top Plugin Posts covers
some lesser-known plugins and how you can find your own via Package
Control's trending plugins section.
PyCharm
PyCharm is a text editor and integrated development environment specifically
designed for writing Python code.
67
Full Stack Python: 2020 Supporter's Edition
PyCharm resources
• JetBrains provides courses for the PyCharm Educational Edition that
can be used to learn any edition of the IDE.
• The Mastering PyCharm Talk Python to Me course is awesome when
you want to invest in your skills for using the IDE well.
• Worth the switch to Pycharm? is a solid discussion thread with
different developers' perspectives on using PyCharm for coding their
applications.
• How to Get Started with PyCharm and Have a Productive Python IDE
covers the basics of configuring PyCharm for running code within
virtualenvs, macros, using the console and code completion.
• Using PyCharm with Pyramid is specific to developing and debugging
with the Pyramid web framework.
• Just switched from Atom to PyCharm is a Reddit /r/learnpython
thread with a positive experience of switching to PyCharm along with
some comments and feedback from other developers.
• PyCharm Vs Visual Studio Code For Python Development compares
the editors on performance, extensions and resource consumption for
Python development.
• PyCharm has excellent first-party official documentation for getting
started and configuring the IDE. The advantagee to using the official
docs is that they tend to be more up-to-date than community blog posts
that were not published recently because PyCharm has new major
releases twice per year.
• Getting Started with PyCharm is another JetBrains-produced tutorial
that is exceptionally in-depth and just right for beginners who need
68
Full Stack Python: 2020 Supporter's Edition
help with every step of setting up and creating their first Python
project.
• Setting Up a Python Development Environment with PyCharm is
focused on the EV3 Lego Mindstorm Linux distribution but has some
good information and steps for anyone setting up PyCharm for the first
time.
• flask-pycharm-templates are snippets that are commonly used when
building Python, Jinja2 and Flask applications that you can import into
your environment to use during development.
• Unit Tests and Doctests in PyCharm is a beginner's guide on why you
would want to use tests and doctests in your Python code and run them
with PyCharm.
Jupyter Notebook
Jupyter Notebook (open source code), which began as the iPython Notebook
project, is a development environment for writing and executing Python code.
Jupyter Notebook is often used for exploratory data analysis and
visualization.
Project Jupyter is the top-level project name for all of the subprojects under
development, which includes Jupyter Notebook. Jupyter Notebooks can also
run code for other programming languages such as Julia and R.
How does Jupyter Notebook work?
The key piece of Jupyter Notebook infrastructure is a web application that
runs locally for creating and sharing documents that contain embedded code
and execution results.
69
Full Stack Python: 2020 Supporter's Edition
How are IPython Notebook and Jupyter Notebook related?
IPython Notebook was the original project that proved that there was great
demand among data scientists and programmers for an interactive,
repeatable development environment. Jupyter Notebook became the new
official name for the overall project during The Big Split after the IPython
Notebook project matured into distinct submodules such as the interactive
shell, notebook document format and user interface widgets tools. However,
the IPython Notebook name sticks around as the Python backend for Jupyter
Notebook which is seriously confusing if you are searching the internet and
come across both current and old articles that use all of these names
interchangeably.
Jupyter Notebook beginner tutorials
Jupyter Notebook's powerful analysis and visualization environment can be
intimidating even for experienced developers that are new to the tool. The
following tutorials will explain the basics so you can quickly figure out your
own productive workflow.
• Jupyter Notebook for Beginners: A Tutorial is a great place to start if
you have never before used the tool. The guide covers installation,
terminology, the user interface and how to publish your notebooks to
the web. Screenshots walk you through some of the more confusing bits
as you are getting up and running.
70
Full Stack Python: 2020 Supporter's Edition
• First Python Notebook is a free guide on analyzing data with Python
and Jupyter Notebook. It covers many "Hello, World!"-style examples
in both data analysis topics and more general software development
areas like Git, GitHub and Markdown.
• IPython Or Jupyter? covers the evolution of the Notebook concept
from its origins in the IPython Notebook implementation through the
IPython and Jupyter split that happened in 2015 that separated
IPython Notebook into logical subprojects. The post kicks off with
some fun lesser-known historical context on other data science
notebook projects such as MATLAB and Mathematica to set the stage
for IPython and Jupyter's creation.
• How to use Jupyter Notebooks in 2020 (Part 1: The data science
landscape) is a high-level overview post that starts a series on Jupyter
Notebooks. This first post covers why a tool like Jupyter Notebook is
needed in the broader landscape of data science. Part 2 examines the
growth of the Jupyter ecosystem and the jump from exploratory
analysis notebooks to production notebooks.
• Jupyter Notebook Best Practices contains tips for beginners such as
learning shortcuts and properly documenting the analysis you work on.
• How to Version Control Jupyter Notebooks explains how Jupyter
Notebooks are stored in JSON, the issues with that format for source
control and how to get around the problem.
Example Notebooks
Example Notebooks are easy to fire up and see how other people are working.
These resources are highly recommended after you read a couple of tutorials
and play around with the tool.
• Peter Norvig's collection of Jupyter Notebooks is a an incredible
resource for example projects.
• Building and Exploring a Map of Reddit with Python is a detailed
notebook that digs into public Reddit data while explaining the "what"
and "why" along the way.
71
Full Stack Python: 2020 Supporter's Edition
• This gallery of interesting Jupyter Notebooks provides many great
examples across numerous programming languages.
• jupyter-samples contains an extensive set of notebooks along with
public data sets that can be used for analysis.
• learn-python3 is geared towards Python beginners who want to learn
basic syntax and standard library features such as about string
manipulation, functions and iteration.
Intermediate to advanced Jupyter Notebook tutorials
Once you get the hang of the basics there are a slew of ways to connect your
notebooks to third party APIs and use more advanced Python libraries with
your code. These walkthroughs cover a range of topics from niche tricks to
common but advanced situations like advanced interactive visualizations.
• Advanced Jupyter Notebook Tricks — Part I and Building Interactive
Dashboards with Jupyter (Part 2) have a ton more details on ways to
set up Jupyter Notebooks as dashboards and export results to other
formats.
• Creating Interactive Dashboards from Jupyter Notebooks shows how
to use public Reddit data for a data analysis project as an example to
display in dashboards running in a Jupyter Notebook.
• mapboxgl-jupyter library along with the quickstart show you how to
visualize geospatial data within your notebooks.
• Reproducible Data Analysis in Jupyter is a fantastic series of videos by
Jake Vanderplas that shows how to move your code from the
interactive Jupyter environment into packaged, tested Python code
that is suitable for deployment to a production environment.
• 28 Jupyter Notebook tips, tricks and shortcuts explains many of the
lesser-known keyboard shortcuts and mechanisms to output settings.
• Boost Your Jupyter Notebook Productivity covers hotkeys, data
plotting, shell commands, timing and other topics you will eventually
want to handle within your notebooks as you get comfortable in the
environment.
72
Full Stack Python: 2020 Supporter's Edition
• Making Publication Ready Python Notebooks explores the plugins that
the author uses when creating and exporting reports from Jupyter.
• PyData has an extensive list of Jupyter Notebook talks from past
events. JupyterCon has a similarly extensive talks list that is also worth
watching.
• Integrate Google Sheets and Jupyter Notebooks answers the common
question of how to extract data directly from a Google Sheet and start
working with it in your Jupyter Notebook. The screenshots help a lot to
make sure you avoid getting lost in the sea of menus along the way.
• Analyzing Cryptocurrency Markets using Python uses a freely-available
Bitcoin API as a source data set for a data analysis and visualization
project in Jupyter.
• nbdev: use Jupyter Notebooks for everything shows how to use the
nbdev tool to create a literate programming environment within a
Jupyter Notebook so that you can do all of your debugging and
refactoring there rather than switching between a more traditional IDE
and Jupyter.
• Running Jupyter Notebooks on GPU on AWS: a starter guide explains
how to run notebooks on Amazon Web Services using a graphicsprocessing unit (video card), which for some machine learning
situations can result in significantly faster execution times.
• Python & Big Data: Airflow & Jupyter Notebook with Hadoop 3, Spark
& Presto walks through a data pipeline that combines several
commonly-used data analysis tools with a Jupyter Notebook.
• Ansible-jupyter-kernel is a kernel that allows you to run Ansible tasks
and playbooks from within your Jupyter environment.
• Jupyter Notebooks in the IDE explains how to use Jupyter files in
Visual Studio Code or PyCharm with Jupytext, which defines the
pairing information and notebook kernel.
• The Notebook Wars is not a tutorial but instead points to the
weaknesses that become apparent when using Jupyter and the current
generation of notebook projects. The article raises many good points
about barriers to entry although you could also argue some of these
73
Full Stack Python: 2020 Supporter's Edition
issues have been mitigated by Jupyter, just not as much as some people
would like to see. Overall there is a lot to enjoy reading here and reflect
on so that the community can continue making Jupyter a fantastic
environment for development.
• Creating Presentations with Jupyter Notebook shows how to make
slides out of your cells so you can present your work like a traditional
presentation.
• The number of open source Jupyter Notebooks on GitHub is exploding
and this post attempts to estimate the growth using Python, pandas
and a scraped data set.
• Reproducible Jupyter Notebooks with Docker explains when to use
Docker in combination with Jupyter Notebooks as well as the
instructions for creating a dockerfile to build your images.
• JupyterLab GPU Dashboards contains a Bokeh server and TypeScript
code for displaying GPU utilization charts.
Shells
Shells are computer user interfaces that typically refer to a text-only or
primarily text-based command prompt.
The above screenshot shows the bash shell with an active Python virtual
environment named fullstackpython within the macOS Terminal
application.
Shell resources
• cmd is the Pythonic standard library module that can be used for
building your own shells. The Python CmdModule wiki page has a great
overview of the module and its capabilities.
74
Full Stack Python: 2020 Supporter's Edition
• Give your Python program a shell with the cmd module shows a short
code example of how to use cmd to build a simple shell.
• Super Charge Your Shell For Python Development covers aliases,
environment variables via Autoenv and some basic shell commands
often used during development.
• Terminal latency quantifies the impact of lag in your keystrokes
appearing on the screen. It's a fascinating look at how a small
difference of tens of milliseconds causes some shells and editors to feel
slow while others are snappy.
• Why Create a New Unix Shell? is a post by the creator of Oil shell that
goes into the rationale for building a new shell even though so many
others such as Bash, zsh, PowerShell and KornShell already exist.
• explainshell (source code) is a wonderful little tool that shows how
input and arguments in the shell break down and are interpreted by
commands. The data is pulled from the Ubuntu man pages.
• Shell productivity tips and tricks covers how to increase your
effectiveness on the shell across topics such as navigating history,
autocompletion, and pattern matching.
Bash shell
The Bourne-Again SHell (source code), almost always referred to simply as
"Bash", interprets and executes input entered from a source such as the user
or a program. Bash is an implementation of the shell concept and is often
used during Python software development as part of a programmer's
development environment.
75
Full Stack Python: 2020 Supporter's Edition
Bash resources
• Bash Guide for beginners is an entire book for those new to working
with commandlines. It covers commands, paths, Bash shell scripting,
variables and many other critical topics that are necessary to move
from beginner to advanced Bash user.
• Advancing in the Bash shell covers important concepts such as bang
syntax, movement commands, tab completion and aliases.
• Mastering Bash and Terminal shows methods for repeating commands,
changing directories and handling background processes.
• Ten Things I Wish I’d Known About Bash covers some edge cases that
are very useful to know about such as proper exit code usage and
configuration options through the set command. There is also a great
follow up post called Ten MORE Things I Wish I'd Known About Bash
that covers new topics such as on-the-fly command re-execution using
the carrot character. The Seven Surprising Bash Variables post
continues the series by examining built-in variables such as
PROMPT_COMMAND , CDPATH and REPLY which can simplify your scripts
by using values that Bash already has stored for you.
• Google's Shell Style Guide covers how to write consistent, maintainable
shell scripts, which is particularly important if you have ever tried to
debug a hacky shell script that was never meant to be used by anyone
other than the original author.
• 101 Bash Commands and Tips for Beginners to Experts is a well-done
laundry list of tricks to explore.
• Bash scripting quirks & safety tips explains Bash basic programming
constructs like for loops and variable assignment then goes into ways
to avoid weird issues in your code.
• Safe ways to do things in bash shows you how to not shoot yourself in
the foot by using safe coding practices with your shell scripts.
• The Bash Infinity Framework source code provides boilerplate and a
standard library for Bash projects so they are easier to read and
maintain. If you have ever tried to read someone else's Bash scripts or
even your own after setting them aside for a couple of months, you
76
Full Stack Python: 2020 Supporter's Edition
know that anything which makes readability better is a major step up
from vanilla Bash.
• Static status is a Bash application that generates a hostable,
customizable status page for your services.
• Replacing Bash scripts with Python is a guide on using using Python
for administrative scripting, including what to do about replacing
invaluable command line tools such as awk , sed and grep .
• Using Aliases to Speed Up Your Git Workflow has a bunch of shell
aliases that make it easier for you to execute complicated or uncommon
Git commands.
• Creating a bash completion script is a great tutorial that walks you
through a reasonably complex Bash script for completing syntax in
other Bash shell scripts.
• 6 Tips Before You Write Your Next Bash Cronjob covers starting your
scripts with shebang, redirecting output, timeouts and sudo privileges.
• Better Bash history shows how to make your Bash history more useful
by having it store more previous commands (which takes up more
persistent storage but is not a huge deal in 2019) and add timestamps
to the history command.
• 9 Evil Bash Commands Explained presents a list of commands you
should never run, but can learn about their destructive abilities by
reading through the descriptions provided by the author.
• Bash Quick References is a cheat sheet for common operators and
signals that come up when working with scripts.
• Anybody can write good bash (with a little effort) covers the basics of
shell scripting and provides some recommendations for creating more
maintainable scripts such as using linters and formatters.
Zsh shell
Zsh interprets and executes input entered from text sources such as user input
or from another application. Zsh is an implementation of the shell concept
77
Full Stack Python: 2020 Supporter's Edition
that is frequently used during Python software development as part of a
programmer's development environment.
Zsh beginner's resources
• Getting Started with Zshell has a short video showing off Zsh features
and then shows how to handle aliases, globbing and parameter
expansion.
• Switching to Zsh goes on a bit about how Zsh is better than Bash in his
opinion, then shows how to use it as your default shell with a custom
theme.
• Become A Command-Line Power User With Oh My ZSH And Z
provides a long in-depth tutorial on a slew of Zsh features and how to
use them if you have never previously used Zsh.
• No, Really. Use Zsh. goes through the laundry list of advantages
provided by Zsh compared to other shells.
Zsh configuration
• Oh My Zsh is a configuration manager for Zsh.
• Zsh Configuration From the Ground Up is a wonderfully-written
detailed post on the setup that the author uses along with why he
enjoys using Zsh for development.
PowerShell
PowerShell is a commandline user interface for Windows that is often used as
part of a Python programmer's development environment.
78
Full Stack Python: 2020 Supporter's Edition
PowerShell resources
• A Python Developer's Guide to Powershell explains the PowerShell
scripting language then shows how to combine a Python script and a
PowerShell script to automate web scrapining downloads.
• ChatOps with PowerShell covers how to use the Python-based chatbot
named ErrBot. It also presents example code to connect ErrBot to
applications you are running.
• PowerShell in Azure Functions shows how to use PowerShell code in
Azure serverless Functions.
• Getting Started with Windows PowerShell is a guide for your first steps
with PowerShell.
• Windows Command-Line: Backgrounder and Windows CommandLine: The Evolution of the Windows Command-Line give historical
perspective on how the Windows shell has evolved from MS-DOS days
into the current Windows 10 world.
• Learning a New REST API with PowerShell covers how to create the
common GET, POST, PUT and DELETE requests that are used to work
with RESTful APIs. The the tutorial also contains some tips and tricks
for reading API documentation and how to use PowerShell more
effectively in these situations.
• PowerShell in Azure Functions shows you how to use PowerShell
scripting language code in your Azure Functions. The language is only
in experimental mode on Azure Functions but could be useful if you
have a bunch of existing scripts that you want to use on the serverless
platform.
79
Full Stack Python: 2020 Supporter's Edition
• PowerShell Core support in AWS Lambda is an announcement post
that PowerShell can be used in AWS Lambda Functions along with how
to get started.
• PowerShellBuild provides common reusable tasks for building
PowerShell modules.
• Invoke-Build is a build automation tool. There is also extensive
documentation on their wiki.
• Two months with Powershell on a UNIX examines one developer's
experience using PowerShell not on Windows but on a *nix operating
system. The author covers the advantages and disadvantages he found
during his experience and some of the bugs he hopes are fixed in
PowerShell 7.
Terminal Multiplexers
A terminal multiplexer provides separation between where a shell is running
and where the shell is accessed. Each shell can be running on a different
computer, but to the developer it does not matter where each shell is being
executed.
For example, a developer could have many shells running within a terminal,
like the following screenshot.
The above terminal window is using the tmux terminal multiplexer
implementation with two windows and three panes.
80
Full Stack Python: 2020 Supporter's Edition
Why are terminal multiplexers awesome?
Developers gain greater control over the usage of their shells by working with
a terminal multiplexer.
Shells are typically executed locally on a computer but terminal multiplexers
allow one or more virtual shells to be run within a single terminal. Shells can
also be left running within the multiplexer and attached to again from a
different machine.
Terminal multiplexers are used by developers to run many virtual shells
within a single terminal. These shells can be run via a mix of local, remote,
containerized and virtualized resources. The shells can also be persisted and
moved while running from one computer to another.
Terminal multiplexer implementations
Many terminal multiplexer implementations exist, including:
• tmux
• screen
• byobu
• Pymux (source code) is a terminal multiplexer implementation written
in Python that clones the functionality of tmux. Like tmux and Screen,
Pymux makes it easier for programmers to use many shells within a
single terminal window during development.
Terminal multiplexer resources
• Terminal multiplexers provides a wonderful overview of the subject,
including the history of various implementations and why you would
want to use one for development.
• Terminal multiplexer commands is a comparison of equivalent key
command in the two most popular implementations, tmux and screen.
• Byobu vs. GNU Screen vs. tmux — usefulness and transferability of
skills gives solid answers on this (now closed) question of the
usefulness of the major terminal multiplexer implementations.
• Pymux discussion on Hacker News
81
Full Stack Python: 2020 Supporter's Edition
tmux
tmux (source code) is a terminal multiplexer implementation often used
during Python development on Linux and macOS. tmux grants greater control
over a programmers's development environment by making it easier to use
many shells at once and attaching to both local and remote shell sessions.
tmux resources
• Tmux and Vim - even better together explains how using tmux with the
Vim editor provide better multi-window navigation using only
keystrokes and other benefits. As an avid tmux+Vim user myself, I can
attest to how great these two tools complement each other.
• Making tmux Pretty and Usable - A Guide to Customizing your
tmux.conf
• Tmux Pairing Anywhere: On Your Box
• Using tmux Properly
• Differences between tmux vs screen
• The Power Of tmux Hooks
• There are a slew of "cheat sheets" for tmux out there, here are a few
good ones:
◦
◦
◦
◦
tmux cheat sheet
shortcuts & cheatsheet
tmux & screen cheat-sheet
Multiplexers: tmux and screen
82
Full Stack Python: 2020 Supporter's Edition
Screen
Screen (source code) is a terminal multiplexer implementation often used
during Python development on Linux and macOS operating systems. Screen
makes it easier for programmers to use many shells within a single terminal
window while developing their applications.
Screen resources
• Learn to use screen, a terminal multiplexer
• SCREEN - the terminal multiplexer
• Using GNU Screen to Manage Persistent Terminal Sessions
Environment configuration
Properly configuring a development, test or production environment is
important for the successful execution of your Python application.
There are many de facto names for environments:
1.
2.
3.
4.
5.
Local development
Development / integration
Test
Staging
Production
83
Full Stack Python: 2020 Supporter's Edition
The above list provides the common environment names but there can be a
limitless number and each organization has their own configuration.
Environment variables
Environment variables are modifiable system values that can be read by
Python applications to affect a program's execution.
One answer I found very useful when learning about getting environment
variables in Python code is knowing the difference between os.getenv and
os.environ.get. Either one can be used in your applications but there are slight
differences that can make one better than the other in various situations.
Other useful environment variables resources: * The Twelve-Factor App
describes a method for securing environment data for your applications. The
twelve factors are commonly referenced across many programming
ecosystems, not just Python, so it's worthwhile to familiarize yourself with
how to use this method to configure your applications.
• Everything You Need to Know About the Twelve-Factor App breaks
down the original twelve-factor app source material and provides solid
additional advice and context.
• Environment variables on Ubuntu Linux
• Why you shouldn't use ENV variables for secret data
• Environment variables in Windows
• Security of infrastructure secrets elaborates on techniques to protect
your secret tokens such as API keys as well as the threats that are out
there which put your secrets at risk.
Environment configuration resources
• Staging Servers, Source Control & Deploy Workflows, And Other Stuff
Nobody Teaches You
• Deployments best practices explains the differences between various
environments and why you need each one.
• Best practices for staging environments
• Staging environment vs Production environment
84
Full Stack Python: 2020 Supporter's Edition
• A good QA team needs a proper software staging environment for
testing
Application Dependencies
Application dependencies are the libraries other than your project code that
are required to create and run your application.
Why are application dependencies important?
Python web applications are built upon the work done by thousands of open
source programmers. Application dependencies include not only web
frameworks but also libraries for scraping, parsing, processing, analyzing,
visualizing, and many other tasks. Python's ecosystem facilitates discovery,
retrieval and installation so applications are easier for developers to create.
Finding libraries
Python libraries are stored in a central location known as the Python Package
Index. PyPi contains search functionality with results weighted by usage and
relevance based on keyword terms.
Besides PyPi there are numerous resources that list common or "must-have"
libraries. Ultimately the decision for which application dependencies are
necessary for your project is up to you and the functionality you're looking to
build. However, it's useful to browse through these lists in case you come
across a library to solve a problem by reusing the code instead of writing it all
yourself. A few of the best collections of Python libraries are
• Python.org's useful modules which groups modules into categories.
• GitHub Explore Trending repositories shows the open source Python
projects trending today, this week, and this month.
• This list of 20 Python libraries you can’t live without is a wide-ranging
collection from data analysis to testing tools.
• Wikipedia actually has an extensive page dedicated to Python libraries
grouped by categories.
85
Full Stack Python: 2020 Supporter's Edition
Isolating dependencies
Dependencies are installed separately from system-level packages to prevent
library version conflicts. The most common isolation method is virtualenv.
Each virtualenv is its own copy of the Python interpreter and dependencies in
the site-packages directory. To use a virtualenv it must first be created with
the virtualenv command and then activated.
The virtualenv stores dependencies in an isolated environment. The web
application then relies only on that virtualenv instance which has a separate
copy of the Python interpreter and site-packages directory. A high level of how
a server configured with virtualenv can look is shown in the picture below.
Installing Python dependencies
The recommended way to install Python library dependencies is with the pip
command when a virtualenv is activated.
86
Full Stack Python: 2020 Supporter's Edition
Pip and virtualenv work together and have complementary responsibilities.
Pip downloads and installs application dependencies from the central PyPi
repository.
requirements.txt
The pip convention for specifying application dependencies is with a
requirements.txt file. When you build a Python web application you
should include requirements.txt in the base directory of your project.
Python projects' dependencies for a web application should be specified with
pegged dependencies like the following:
django==1.11.0
bpython==0.12
django-braces==0.2.1
django-model-utils==1.1.0
logutils==0.3.3
South==0.7.6
requests==1.2.0
stripe==1.9.1
dj-database-url==0.2.1
django-oauth2-provider==0.2.4
djangorestframework==2.3.1
Pegged dependencies with precise version numbers or Git tags are important
because otherwise the latest version of a dependency will be used. While it
may sound good to stay up to date, there's no telling if your application
actually works with the latest versions of all dependencies. Developers should
deliberately upgrade and test to make sure there were no backwardsincompatible modifications in newer dependency library versions.
setup.py
There is another type of dependency specification for Python libraries known
as setup.py. Setup.py is a standard for distributing and installing Python
libraries. If you're building a Python library, such as twilio or underwear you
must include setup.py so a dependency manager can correctly install both the
library as well as additional dependencies for the library. There's still quite a
bit of confusion in the Python community over the difference between
requirements.txt and setup.py, so read this well written post for further
clarification.
87
Full Stack Python: 2020 Supporter's Edition
Open source app dependency projects
pip and venv are part of Python 3's standard library as of version 3.3.
However, there are numerous other open source libraries that can be helpful
when managing application dependencies in your projects, as listed below.
• Autoenv is a tool for activating environment variables stored in a .env
file in your projects' home directories. Environment variables aren't
managed by virtualenv and although virtualenvwrapper has some
hooks for handling them, it's often easiest to use a shell script or .env
file to set them in a development environment.
• Pipenv is a newer Python packaging and dependency management
library that has seen some adoption in place of the standard pip
library.
• Pipreqs searches through a project for dependencies based on imports.
It then generates a requirements.txt file based on the libraries
necessary to run those dependencies. Note though that while this could
come in handy with a legacy project, the version numbers for those
libraries will not be generated with the output.
• pip-check presents a nicely-formatted list of all your installed
dependencies and the status of whether or not updates are available for
each of them.
• pip-name is a straightforward library that looks up package names on
PyPI and tells you whether or not the library name is already taken.
Code library packaging guides
There are many steps in creating and distributing packages on PyPI and your
own hosted application dependency servers. Many of these steps involve
writing configuration files that are not as well documented as some other
areas of Python development. These resources are the best ones I have found
so far to get up to speed on building and releasing your own packages.
• Python Packaging User Guide provides a collection of resources to
understand how to package and distribute Python code libraries.
• Alice in Python projectland is an amazing post that takes the reader
from simple Python script into a complete Python package.
88
Full Stack Python: 2020 Supporter's Edition
• Perils of packaging covers several edge cases that come up when trying
to put together pieces like Travis CI, PyPI and conda. The post walks
through the errors and how to get around them until they are smoothed
out by updates to the tools.
• How to Publish Your Package on PyPI is for developers who have
created a code library they would like to share and make installable for
other developers.
• How to Submit a Package to PyPI presents the basic steps like signing
up for a PyPI account and other accounts that go along with the
tutorial. It then walks through the configuration code for setting up
continuous integration and deploying your package.
Application dependency resources
The following links provide advice on how to use Python packages as well as
package your own dependencies for projects or consumption by other
developers.
• Python's New Package Landscape covers the history of Python
packaging tools and examines the problems with dependency isolation
and the dependency graphs that newer tools such as Pipenv, Poetry,
Hatch and pipsi aim to solve.
• Python Packaging Is Good Now is a wonderfully written blog post. It
provides historical context on why Python's code library packaging was
painful for a long time, and what's been fixed to make building and
installing application dependencies so much better.
• A non-magical introduction to virtualenv and pip breaks down what
problems these tools solve and how to use them.
• There’s no magic: virtualenv edition breaks open the virtual
environment "black box" to show you what the tool is doing when you
use its commands.
• Using pyenv to manage your Python interpreters explains how the
pyenv tool can make it easier to switch between different versions of
Python for each project and gives a brief review of the important things
to know when using the tool such as local versus global scope.
89
Full Stack Python: 2020 Supporter's Edition
• Testing & Packaging examines a configuration for ensuring tests run
against your code and how to properly package your project.
• 12 Alternatives for Distributing Python Applications in 2020 covers
packaging with Docker, Vagrant, PyInstaller, Briefcase, virtual
environments, Pipx and several more options for bundling and running
Python code.
• Python Application Dependency Management in 2018 presents some
critical analysis and critique oof the existing Python dependency
management tools including newer ones such as pipenv and Poetry.
• Occasionally arguments about using Python's dependency manager
versus one of Linux's dependency managers comes up. This provides
one perspective on that debate.
• Open source trust scaling is a good piece for the Python community
(and other programming communities) that is based on the left-pad
NPM situation that broke many dependent packages in the Node.js
community.
• Major speed improvements were made in pip 7 over previous versions.
Read this article about the differences and be sure to upgrade.
• Typosquatting programming language package managers shows how
many packages on centralized dependency servers for Python, Node.js
and Ruby can be vulnerable to "typosquatting" where a developer
either confuses a fake package for the correct one or simply makes a
typo when specifying her dependency list.
• The Many Layers of Packaging goes up and down the packaging stack
and even covers bits about virtual environments and security. It's well
worth investing some time to read this post to get an overview of the
many layers involved in dependency packaging.
Application dependencies learning checklist
1. Ensure the libraries your web application depends on are all captured
in a requirement.txt file with pegged versions.
2. An easy way to capture currently installed dependencies is with the
pip freeze command.
90
Full Stack Python: 2020 Supporter's Edition
3. Create a fresh virtualenv and install the dependencies from your
requirements.txt file by using the
pip install -r requirements.txt command.
4. Check that your application runs properly with the fresh virtualenv and
only the installed dependencies from the requirements.txt file.
Virtual environments (virtualenvs)
Virtual environments, implemented by the library virtualenv and venv (added
to Python standard library in Python 3.3 via PEP 405), separate project
dependencies, such as the Django library code, from your code projects. For
example, if you have three projects, one that uses Django 1.7, another that
uses Django 2.0 and another project that does not use Django at all, you will
have three virtualenvs that each contain those dependencies separated from
each other.
How do virtualenvs work?
Virtualenv provides dependency isolation for Python projects. A virtualenv
creates a separate copy of the Python installation that is clean of existing code
libraries and provides a directory for new application dependencies on a perproject basis. A programmer can technically use a virtualenv for many
projects at once but that is not consider to be a good practice.
Virtual environment resources
• Package management in Python 2 or 3 (Linux and Mac) with virtualenv
or venv
• There’s no magic: virtualenv edition
• Virtual environments dymystified
• What is the relationship between virtualenv and pyenv?
• Setting up Python on a Unix machine (with pyenv and direnv)
• venv — Create Virtual Environments
• Python development environment, 2018 edition
91
Full Stack Python: 2020 Supporter's Edition
Localhost tunnels
A localhost tunnel establishes a connection between your local machine and a
remote connection. The connection is intended to proxy traffic from a
publicly-addressable IP address and URL to your local machine. Localhost
tunnels are most useful for allowing a tester to connect to a server running on
your local development system so they can try out an in-development
application you are building but have not yet deployed.
Localhost tunnel services
There are numerous localhost tunnel services that have similar features. The
following services are listed in order from ones I have had the most
experience with to the ones I have not used.
• ngrok is the service I use most often. It is easy and worth the small fee
to upgrade your account with a few extra features such as fixed,
customizable subdomains. There is also a Python wrapper for ngrok
called pyngrok that makes it easy to programmatically access the ngrok
client from Python applications.
• Localtunnel is a localhost tunnel written in Node.js.
• Burrow provides another service, albeit one that I have not used
myself.
Source Control
Source control, also known as version control, stores software code files with a
detailed history of every modification made to those files.
Why is source control necessary?
Version control systems allow developers to modify code without worrying
about permanently screwing something up. Unwanted changes can be easily
rolled back to previous working versions of the code.
Source control also makes team software development easier. One developer
can combine her code modifications with other developers' code through diff
views that show line-by-line changes then merge the appropriate code into the
main code branch.
92
Full Stack Python: 2020 Supporter's Edition
Version control is a necessity on all software projects regardless of
development time, codebase size or the programming language used. Every
project should immediately begin by using a version control system such as
Git or Mercurial.
Monorepo vs Multirepo
There is a spectrum of philosophies for how to store projects within source
code repositories.
On one extreme end of the spectrum, every line of code for every project
within an organization is stored in a single repository
repository. That approach is called
monorepo and it is used by companies like Google. On the other end of the
spectrum, there are potentially tens of thousands or more repositories that
store parts of projects. That approach is known as multirepo or manyrepo.
For example, in a microservices architecture, there could be thousands of
microservices and each one is stored within its own repository. No one
repository contains the code for the entire application created by the
interaction of the microservices.
There are many hybrid strategies for how to store source code that fall
between these opposite approaches. What to choose will depend on your
organization's needs, resources and culture.
Source control during deployment
Pulling code during a deployment is a potential way source control systems fit
into the deployment process.
93
Full Stack Python: 2020 Supporter's Edition
Note that some developers recommend deployment pipelines package the
source code to deploy it and never have a production environment touch a
source control system directly. However, for small scale deployments it's often
easiest to pull from source code when you're getting started instead of figuring
out how to wrap the Python code in a system installation package.
Source control projects
Numerous source control systems have been created over the past several
decades. In the past, proprietary source control software offered features
tailored to large development teams and specific project workflows. However,
open source systems are now used for version control on the largest and most
complicated software projects in existence. There's no reason why your
project should use anything other than an open source version control system
in today's Python development world. The two primary choices are:
• Git is a free and open source distributed version control system.
• Mercurial is similar to Git, also a free and open source distributed
version control system.
• Subversion is a centralized system where developers must check files in
and out of the hosted repository to minimize merge conflicts.
94
Full Stack Python: 2020 Supporter's Edition
Hosted version control services
Git and Mercurial can be downloaded and run on your own server. However,
it's easy and cheap to get started with a hosted version control service. You
can transition away from the service at a later time by moving your
repositories if your needs change. A couple of recommended hosted version
control services are:
• GitLab has both a self-hosted version of its open source software as
well as their hosted version with pricing for businesses that need
additional hosting support.
• GitHub is a software-as-a-service platform that provides a user
interface, tools and backup for developers to use with their Git
repositories. Accounts are free for public open source development and
private Git repositories can also be hosted for $7 per month.
• BitBucket is Atlassian's software-as-a-service tool that with a user
interface, comparison tools and backup for Git projects. There are
many features in BitBucket focused on making it easier for groups of
developers to work on projects together. BitBucket also has private
repositories for up to five users. Users pay for hosting private
repositories with more than five users.
General source control resources
• Staging Servers, Source Control & Deploy Workflows, And Other Stuff
Nobody Teaches You is a comprehensive overview by Patrick McKenzie
of why you need source control.
• Version control best practices is a good write up of how to work with
version control systems. The post is part of an ongoing deployment
guide written by the folks at Rainforest.
• A visual guide to version control is a detailed article with real-life
examples for why version control is necessary in software development.
• An introduction to version control shows the basic concepts behind
version control systems.
• What Is Version Control? Why Is It Important For Due Diligence?
explains the benefits and necessity of version control systems.
95
Full Stack Python: 2020 Supporter's Edition
• Version control before Git with CVS goes into the history of version
control systems and defines three generations, of which CVS and SVN
were part of the second generation while Git and Mercurial are thirdgeneration version control systems.
• About version control reviews the basics of distributed version control
systems.
• Why not Git? covers SQLite's development workflow and why they do
not use Git as their version control system.
Monorepo vs multirepo resources
Monorepo versus multirepo version control strategies are a weirdly
contentious topic in software development, likely because once a policy is set
for an organization it is exceptionally difficult to change your approach. The
following resources give more insight into the debate on how to structure your
repositories.
• Monorepo, Manyrepo, Metarepo is an awesome guide to varying ways
of structuring your source repositories that contain more than one
project. The guide covers advantages and disadvantages of common
approaches used in both small and large organizations.
• Repo Style Wars: Mono vs Multi goes into the implications of using one
side or the other and why it is unlikely you can create a combination
solution that will give you the advantages of both without the
disadvantages.
• Why Google Stores Billions of Lines of Code in a Single Repository
covers the history and background of Google's source control
monorepo, which is one of if not the largest monorepo for an
organization in the world.
• Advantages of monorepos goes into the advantages of using a
monorepo and does not discuss the downsides but admits there are
many so the decision is not clear-cut on using either strategy.
• Monorepos and the Fallacy of Scale argues that having all of an
organization's code in a single repository encourages code sharing. The
author considers the concerns often raised about tight coupling
96
Full Stack Python: 2020 Supporter's Edition
between components in a monorepo code base but says that the
advantages outweigh the disadvantages overall.
Git distributed source control system
Git is the most widely-used source control system currently in use. Its
distributed design eliminates the need to check files in and out of a
centralized repository, which is a problem when using Subversion without a
network connection. There is a full page on Git with further details and
resources.
Subversion resources
Apache Subversion (source code), often just called "Subversion" or "SVN", is a
source control system implementation.
• The SVN book is the free online version of the O'Reilly Version Control
with Subversion book.
• How to use Subversion (SVN) lays out the basic concepts and provides
the first few steps for getting started tracking files.
• 10 Most Used SVN Commands with Examples is a good refresher list if
you've used SVN in the past but it has been awhile since you worked
with all the commands.
Source control learning checklist
1. Pick a version control system. Git is recommended because on the web
there are a significant number of tutorials to help both new and
advanced users.
2. Learn basic use cases for version control such as committing changes,
rolling back to earlier file versions and searching for when lines of code
were modified during development history.
3. Ensure your source code is backed up in a central repository. A central
repository is critical not only if your local development version is
corrupted but also for the deployment process.
4. Integrate source control into your deployment process in three ways.
First, pull the project source code from version control during
97
Full Stack Python: 2020 Supporter's Edition
deployments. Second, kick off deployments when code is modified by
using webhooks or polling on the repository. Third, ensure you can roll
back to a previous version if a code deployment goes wrong.
Git
Git is a distributed open source source control (also referred to as "version
control") system commonly used to track and manage file changes. Git is
frequently used as the version control system for Python projects.
Why is Git widely-used by developers?
Git is a distributed version control system (DVCS) compared to the
centralized models previously provided by Subversion and CVS. Files would
need to be "checked out" over the network by a single person at a time while
she was working. The network transfer speed as well as the blocking check out
model became a significant bottleneck, especially for large development
teams.
Git clones a full repository and its entire history instead of just the current
state of a file. Developers only require a network connection when pulling
updates and pushing changes to a backup repository. The commit log and file
histories are stored and transmitted far more efficiently than prior version
control systems to maximize the effectiveness of the distributed version
control design.
Another issue with traditional VCS was that it was difficult to create branches.
Take a look at this tutorial on managing a CVS repository as an example of the
98
Full Stack Python: 2020 Supporter's Edition
confusion the existing non-distributed models could cause. Git simplified the
branching process with simplified commands such as git checkout -b and
faster branch merging and clean up. In contrast to earlier version control
systems, Git encourages developers to create local branches and experiment
in them without impacting a stable master branch.
GitHub also helped to drive Git as the overwhelming version control favorite
by providing the open source community with free open remote Git
repositories. GitHub's web application user interface, issue tracking and pull
request features for maintainers and consumers also encouraged more
collaboration than Git alone. Recently, GitHub's third-party marketplace has
begun to add more features by integrating continuous integration servers like
as Jenkins and code metrics services.
Beginner Git tutorials
Git can take awhile to wrap your head around, even for experienced software
developers. The following tutorials can quickly get you up to speed.
• The official Pro Git book is available online for free. It is awesome both
as a step-by-step walkthrough and as a bookmarked reference on
specific topics.
• Git from the inside out provides a spectacular walkthrough for
developers who have used Git before but want to go deeper in
understanding what each command does under the covers instead of
simply using the tool as a black box.
• Think like a Git is another introduction that focuses more on the graph
theory and conceptual ideas behind Git to help the reader understand
what's happening as they use Git commands.
• Git and GitHub in plain English is a high-level overview of both Git and
GitHub. This guide is intended for both non-programmers and junior
developers who want to learn everything from terminology to
workflow.
• A Hacker's Guide to Git is a free ebook written for experienced
developers that contains both the syntax and the conceptual ideas
behind how Git works.
99
Full Stack Python: 2020 Supporter's Edition
• A Designer's Guide to Git gives a beginner's Git overview for nonprogrammers. The tutorial also covers using Git clients such as the
GitHub desktop application.
• Git in Six Hundred Words is a concise essay explaining what happens
when you add and commit files in a Git repository.
• 19 Tips For Everyday Git Use is a laundry list of helpful Git tips on
commands such as git bisect , git stash and git difftool .
• git ready presents beginner, intermediate and advanced tips for how to
use Git. The example commands and their results are great for learning
Git piece-by-piece.
Advanced Git tutorials and resources
You won't learn Git in an afternoon or even a few months of usage. After sixplus years of working with Git I still get tripped up and have a lot to learn.
These tutorials have taught me some of the beyond-the-basics edge cases.
• Flight rules for git contains common commands that answer specific
desired tasks such as "I want to discard specific unstaged files"
( git checkout filename ) and "I want to rename a branch"
( git branch -m newname ).
• Shadows Of The Past: Analysis Of Git Repositories explains how you
can extract some surprising data from Git repositories' commit history,
such as which developers are domain experts in certain tools, potential
hot spots in the code and coupling between source code files. This is a
great read once you get past the basics of using Git.
• Write yourself a Git! is a tutorial for building your own version of Git
from scratch with 503 lines of Python code. The result is obviously not
as full-featured as the real Git implementation but this program is
awesome for understanding how Git's internals work.
• Phil Nash shows how to use the git reflog command in Git back to
the future.
• On undoing, fixing, or removing commits in git is a fantastic overview
of how to unscrew a whole slew of bad situations you may find yourself
in if you use Git for long enough.
100
Full Stack Python: 2020 Supporter's Edition
• High-level Problems with Git and How to Fix Them is a long-form
article on how to fork properly (and how not to use them) and how to
not go crazy using branches and remote repositories.
Specific Git resources
Large tutorials are great for getting started with Git. However, sometimes you
need tactical support or want to learn new tricks to add to your workflow.
These resources will come in handy for specific Git subjects.
• How to Write a Git Commit Message provides strong advice that will
help you write consistent, concise and contextual messages on your
commits. Commit messages are especially important when working
with others on a long-lasting project where you dive through the
commit history via git log and related commands.
• How to squash Git commits explains how to use the git rebase
command in interactive mode to consolidated the number of commits
in your history. This technique is useful when a group of commits are
related and it's easier to understand them as a single commit rather
than a collection of smaller commits.
• Oh shit, Git! is a profanity-filled description of tips to get you out of
binds you may find yourself in when you get too tricky with Git
commands.
• Tips for a disciplined git workflow is less about workflow and more
about how to write self-explaining commit messages, self-containing
each commit and modifying branch history when you muff up before it
is merged into master.
• Another Git catastrophe cleaned up goes through a difficult merge
scenario that required deep Git understanding to properly fix.
• Erlang's source code provides a concise explanation on writing good
commit messages that any programming ecosystem can learn from.
• Git internals is a presentation that covers how Git stores data, how to
work with the Git history, and good practices for using Git based on the
knowledge of how it works internally.
101
Full Stack Python: 2020 Supporter's Edition
• Chasing a bad commit examines the git bisect command and how
it can be used in either interactive mode or on its own with
git bisect run to find the problematic code commit that needs to
be fixed.
• How Microsoft uses Git gives a high-level overview of their repository
structure and hosting at the extremely large scale organization.
• GitTips is a list of pro tips to clean up common issues and how to dive
through Git history to find specific text.
• Git allows command aliasing, which allowed one developer to create
his own list of lesser known Git commands that alias more complicated
Git lines.
• Little things I like to do with Git has some nice tips such as easily
viewing branches you recently worked on and generating a changelog
from your commits.
• Git from the inside out demonstrates how Git's graph-based data
structure produces certain behavior through example Git commands.
This is a highly recommended read after you've grasped the basics and
are looking to go deeper with Git.
• How I configure my git in a new computer shows how to handle a
.gitconfig file, with an example Gist that the author uses for his own
environment.
• How to Quickly and Correctly Generate a Git Log in HTML is an
interesting look at how string processing on *nix systems works by
generating an HTML page from a Git log. If you need to output your Git
commits somewhere and are having trouble writing your own script
you should check out some of the interesting solutions the author
presents.
• Better Git configuration explains global config options, revisions and
merging along with several other commands that can be customized to
your taste.
• Why does Git use a cryptographic hash function? explains that the
SHA-1 hash isn't used for security on Git, it's a consistency check.
SHA-1 has been broken in practice so Git needs to transition to a
102
Full Stack Python: 2020 Supporter's Edition
stronger hash without proven collisions but it's not quite as big of a
concern compared to security-related projects that use SHA-1.
• The anatomy of a git commit digs into the tree and commit objects that
underpin the Git source control system. This is an awesome read to get
a view on how Git works under the commands you're using to
manipulate these objects.
Git Workflows
Teams of developers can use Git in varying workflows because of Git's
distributed model and lightweight branching. There is no "right way" to use
Git, especially because development teams can range in size from a single
developer up to entire companies with thousands of developers in a
repository. The only correct answer is to let the developers decide on a
workflow that maximizes their ability to frequently commit code and
minimize merge conflicts.
• git-flow shows one possible way for small teams to use Git branches.
GitHub Flow explains why at GitHub they do not use the git-flow
model and provides an alternative that solves some of the issues they
found with git-flow.
• Git Workflows That Work is a helpful post with diagrams to show how
teams can create a Git workflow that will help their development
process.
• Comparing workflows provides a slew of examples for how developers
on a team can handle merge conflicts and other situations that
commonly arise when using Git.
GitHub resources
GitHub is a software-as-a-service application owned by Microsoft that makes
it easier to collaborate with other developers on centralized Git repositories.
The site also provides a remote backup location for repositories as well as
secure, private repository storage. The following tutorials show how to get
started using Git on GitHub.
103
Full Stack Python: 2020 Supporter's Edition
• Introduction to Git and GitHub for Python Developers covers basic
usage for Git and working with repositories locally and on the GitHub
service.
• Hello World: GitHub edition and Git and GitHub learning resources
are GitHub's official guide and learning resources.
• A Beginner’s Git and GitHub Tutorial shows how to perform your first
commit and back it up on GitHub.
Mercurial
Mercurial is a distributed open source source control (also known as "version
control") system written in Python for tracking and handling file
modifications. Mercurial can be used as the version control system for Python
projects.
Mercurial tutorials
• The official Mercurial tutorial goes through the basics. It has great
examples for syntax and expected output.
• Getting started with Mercurial for version control seems mistitled
because it is actually a short tutorial on how to build Mercurial
extensions in Python. With a few lines of Python code it shows how to
create your first extension and test it.
• Mercurial: The Definitive Guide is a free online version of the O'Reilly
book.
104
Full Stack Python: 2020 Supporter's Edition
• Monoroke is a Mercurial server written in Rust designed to be used for
very large monorepos that have thousands of commits affecting
millions of files per hour.
105
Full Stack Python: 2020 Supporter's Edition
Data
Data is an incredibly broad topic but it can be broken down into many
subsections, including (in no particular order):
•
•
•
•
•
•
•
•
•
data processing / wrangling
machine learning
data analysis
visualization
geospatial mapping
persistence via relational databases and NoSQL data stores
object-relational mappers
natural language processing (NLP)
indexing, search and retrieval
The Python community has built and continues to create open source libraries
and tutorials for all of the above topics.
Why is Python a great language choice for data tasks?
Python has a wide array of open source code libraries available and a diverse
community of people with different backgrounds who contribute to make
those libraries better each day.
In addition, Python data manipulation code can be combined with web
frameworks and web APIs to build software that would be difficult to create
with a single other language. For example, Ruby is a fantastic language for
building web applications but its data analysis and visualization libraries are
very limited compared to what is currently available in the Python ecosystem.
How did Python become so widely used for working with
data?
Python is a general purpose programming language and can be applied to
many problem areas. Over the past couple of decades, Python has become
increasingly popular in the scientific and financial communities. Projects such
as pandas grew out of a hedge-fund while NumPy and SciPy were created in
academic environments then improved by the broader open source
community.
107
Full Stack Python: 2020 Supporter's Edition
The question is: why Python was used to created these projects? The answer is
a mix of luck, the growth of the open source community as Python was
maturing and wide adoption by people not formally trained as computer
scientists. The pragmatic syntax and explicit style helped very intelligent
people without programming backgrounds to pick up the language and get
their work done with less fuss than other programming languages. Over time
the code used in the financial world and scientific community was shared at
the same time global open source communities were developing, further
spreading their usage among a broader base of software developers.
There's no doubt some of the momentum behind Python's wide adoption for
all types of data manipulation was that it happened to be the right language in
the right place at the right time. Nevertheless, it was ultimately the hard work
of a massive number of engineers and scientists around the world who created
the incredible mix of data code libraries available today.
Data inspiration
Sometimes you just need to see it to understand how data analysis,
visualization and storytelling can intersect in a meaningful way. The following
resources do a great job of telling stories with data. There are more links to
stories listed on the data analysis and data visualization pages.
• Data — from objects to assets covers the history of data collection and
usage, from 150 years ago to today. The article covers how initial steps
by individual scientists sponsored by wealthy patrons in the 1800s gave
way to systematic collection by governments and businesses in the
20th century. A significant amount of personal data is now held by a
few dozen large corporations worldwide such as Google, Amazon and
Facebook. The article covers some of the implications of data as a
valuable asset and in general is a great read as a high-level overview of
on this topic.
• Metadata Investigation : Inside Hacking Team presents what metadata
is and how it can be used to track people even though it is often
thought of as less of a problem than typical stored data.
• A visual introduction to machine learning is a spectacular example of
data visualization to explain what a machine learning model does on a
San Francisco and New York housing data set.
108
Full Stack Python: 2020 Supporter's Edition
• Earthquake recurrence and survival analysis: How long should we wait
for an overdue earthquake? combines earthquake data with questions
around earthquake recurrence probabilities to tell its story.
• Data Science Project: Profitable App Profiles for App Store and Google
Play is a tutorial that shows you how to use iOS and Android app store
data for business analysis. This post is part of a larger series on how to
get your first job as a data scientist which is all worth your time reading
to understand the intersection of working with data to figure out its
value to companies sand organizations.
Example data sets
Looking for freely-available data to use in your projects but aren't sure where
to get it? The following links have large free, open data sets.
• Check out the awesome public datasets project repository for data in
many different categories ranging from finance to museums.
• Kickstarter datasets are scraped JSON and CSV structured monthly
data from Kickstarter projects.
• Data is Plural is a weekly newsletter that highlights open data that you
can use for your projects. I have been a subscriber to the newsletter for
a couple of years now and love seeing the wide variety of data sources
that are freely available.
• Data analysis and machine learning projects provides more than just
the data, it also includes instructions and code for working with the
data in your own development environment.
• Discovering millions of datasets on the web introduces Google's dataset
search and explains what they learned from iterating on earlier
versions of it before they released this one.
General Python data resources
• PyData is a community for developer and users of Python data tools.
They put on fantastic conferences around the world and fund the
continued development of open source data-related libraries.
109
Full Stack Python: 2020 Supporter's Edition
• Anaconda is one of the leading Python companies that pours a
tremendous amount of time and funding into the data community.
• A crash course in Python for scientists provides an overview of the
Python language with iPython Notebook for those in scientific fields.
• The videos of Travis Oliphant on Python's Role in Big Data Analytics:
Past, Present, and Future and Building the PyData Community give
historical perspective on how the Python data tools have evolved over
the past 20ish years based on his first-hand experience as a leader and
member in that community.
• The Open Source Data Science Masters is a well-crafted free
curriculum and set of resources for students who want to learn both the
theory and technologies for working with data.
• Reproducible research: Stripe’s approach to data science goes through
the workflow and tools such as Jupyter Notebook that Stripe for their
data analysis across the company.
• The Definitive Data Scientist Environment Setup explains how to set
up both a hardware and software configuration that is conducive to
data science research and analysis.
Databases
A database is an abstraction over an operating system's file system that makes
it easier for developers to build applications that create, read, update and
delete persistent data.
Why are databases necessary?
At a high level web applications store data and present it to users in a useful
way. For example, Google stores data about roads and provides directions to
110
Full Stack Python: 2020 Supporter's Edition
get from one location to another by driving through the Maps application.
Driving directions are possible because the data is stored in a structured
format.
Databases make structured storage reliable and fast. They also give you a
mental framework for how the data should be saved and retrieved instead of
having to figure out what to do with the data every time you build a new
application.
Relational databases
The database storage abstraction most commonly used in Python web
development is sets of relational tables. Alternative storage abstractions are
explained on the NoSQL page.
Relational databases store data in a series of tables. Interconnections between
the tables are specified as foreign keys. A foreign key is a unique reference
from one row in a relational table to another row in a table, which can be the
same table but is most commonly a different table.
Databases storage implementations vary in complexity. SQLite, a database
included with Python, creates a single file for all data per database. Other
databases such as PostgreSQL, MySQL, Oracle and Microsoft SQL Server have
more complicated persistence schemes while offering additional advanced
features that are useful for web application data storage. These advanced
features include but are not limited to:
1. data replication between a master database and one or more read-only
slave instances
2. advanced column types that can efficiently store semi-structured data
such as JavaScript Object Notation (JSON)
3. sharding, which allows horizontal scaling of multiple databases that
each serve as read-write instances at the cost of latency in data
consistency
4. monitoring, statistics and other useful runtime information for
database schemas and tables
Typically web applications start with a single database instance such as
PostgreSQL with a straightforward schema. Over time the database schema
evolves to a more complex structure using schema migrations and advanced
111
Full Stack Python: 2020 Supporter's Edition
features such as replication, sharding and monitoring become more useful as
database utilization increases based on the application users' needs.
Most common databases for Python web apps
PostgreSQL and MySQL are two of the most common open source databases
for storing Python web applications' data.
SQLite is a database that is stored in a single file on disk. SQLite is built into
Python but is only built for access by a single connection at a time. Therefore
is highly recommended to not run a production web application with SQLite.
PostgreSQL database
PostgreSQL is the recommended relational database for working with Python
web applications. PostgreSQL's feature set, active development and stability
contribute to its usage as the backend for millions of applications live on the
Web today.
Learn more about using PostgreSQL with Python on the PostgreSQL page.
MySQL database
MySQL is another viable open source database implementation for Python
applications. MySQL has a slightly easier initial learning curve than
PostgreSQL but is not as feature rich.
Find out about Python applications with a MySQL backed on the dedicated
MySQL page.
Connecting to a database with Python
To work with a relational database using Python, you need to use a code
library. The most common libraries for relational databases are:
• psycopg2 (source code) for PostgreSQL.
• MySQLdb (source code) for MySQL. Note that this driver's
development is mostly frozen so evaluating alternative drivers is wise if
you are using MySQL as a backend.
112
Full Stack Python: 2020 Supporter's Edition
• cx_Oracle for Oracle Database (source code). Oracle moved their open
source driver code from SourceForge to GitHub in 2017.
SQLite support is built into Python 2.7+ and therefore a separate library is not
necessary. Simply "import sqlite3" to begin interfacing with the single filebased database.
Object-relational Mapping
Object-relational mappers (ORMs) allow developers to access data from a
backend by writing Python code instead of SQL queries. Each web application
framework handles integrating ORMs differently. There's an entire page on
object-relational mapping (ORMs) that you should read to get a handle on
this subject.
Database third-party services
Numerous companies run scalable database servers as a hosted service.
Hosted databases can often provide automated backups and recovery,
tightened security configurations and easy vertical scaling, depending on the
provider.
• Amazon Relational Database Service (RDS) provides pre-configured
MySQL and PostgreSQL instances. The instances can be scaled to
larger or smaller configurations based on storage and performance
needs.
• Google Cloud SQL is a service with managed, backed up, replicated,
and auto-patched MySQL instances. Cloud SQL integrates with Google
App Engine but can be used independently as well.
• BitCan provides both MySQL and MongoDB hosted databases with
extensive backup services.
• ElephantSQL is a software-as-a-service company that hosts
PostgreSQL databases and handles the server configuration, backups
and data connections on top of Amazon Web Services instances.
113
Full Stack Python: 2020 Supporter's Edition
SQL resources
You may plan to use an object-relational mapper (ORM) as your main way of
interacting with a database, but you should still learn the basics of SQL to
create schemas and understand the SQL code generated by the ORM. The
following resources can help you get up to speed on SQL if you have never
previously used it.
• Select Star SQL is an interactive book for learning SQL. Highly
recommended even if you feel you will only be working with an objectrelational mapper on your projects because you never know when you
will need to drop into SQL to improve a generated query's slow
performance.
• A beginners guide to SQL does a good job explaining the main
keywords used in SQL statements such as SELECT , WHERE , FROM ,
UPDATE and DELETE .
• SQL Tutorial teaches the SQL basics that can be used in all major
relational database implementations.
• Life of a SQL query explains what happens both conceptually and
technically within a database when a SQL query is run. The author uses
PostgreSQL as the example database and SQL syntax throughout the
post.
• A Probably Incomplete, Comprehensive Guide to the Many Different
Ways to JOIN Tables in SQL elaborates on one of the trickiest parts of
writing SQL statements that bridge one or more tables: the JOIN .
• Writing better SQL is a short code styling guide to make your queries
easier to read.
• SQL Intermediate is a beyond-the-basics tutorial that uses open data
from the US Consumer Financial Protection Bureau as examples for
counting, querying and using views in PostgreSQL.
General database resources
• How does a relational database work? is a detailed longform post on
the sorting, searching, merging and other operations we often take for
114
Full Stack Python: 2020 Supporter's Edition
granted when using an established relational database such as
PostgreSQL.
• Databases 101 gives a great overview of the main relational database
concepts that is relevant to even non-developers as an introduction.
• Five Mistakes Beginners Make When Working With Databases explains
why you should not store images in databases as well as why to be
cautious with how you normalize your schema.
• DB-Engines ranks the most popular database management systems.
• DB Weekly is a weekly roundup of general database articles and
resources.
• Designing Highly Scalable Database Architectures covers horizontal
and vertical scaling, replication and caching in relational database
architectures.
• Online migrations at scale is a great read on breaking down the
complexity of a database schema migration for an operational
database. The approach the author's team used was a 4-step dual
writing pattern to carefully evolved the way data for subscriptions were
stored so they could move to a new, more efficient storage model.
• A one size fits all database doesn't fit anyone explains Amazon Web
Services' specific rationale for having so many types of relational and
non-relational databases on its platform but the article is also a good
overview of various database models and their use cases.
• SQL is 43 years old - here’s 8 reasons we still use it today lists why SQL
is commonly used by almost all developers even as the language
approaches its fiftieth anniversary.
• SQL keys in depth provides a great explanation for what primary keys
are and how you should use them.
• Exploring a data set in SQL is a good example of how SQL alone can be
used for data analysis. This tutorial uses Spotify data to show how to
extract what you are looking to learn from a data set.
• Databases integration testing strategies covers a difficult topic that
comes up on every real world project.
115
Full Stack Python: 2020 Supporter's Edition
• GitLab provided their postmortem of a database outage on January 31
as a way to be transparent to customers and help other development
teams learn how they screwed up their database systems then found a
way to recover.
• Asynchronous Python and Databases is an in-depth article covering
why many Python database drivers cannot be used without
modification due to the differences in blocking versus asychronous
event models. Definitely worth a read if you are using WebSockets via
Tornado or gevent.
• PostgreSQL vs. MS SQL Server is one perspective on the differences
between the two database servers from a data analyst.
Databases learning checklist
1. Install PostgreSQL on your server. Assuming you went with Ubuntu
run sudo apt-get install postgresql .
2. Make sure the psycopg2 library is in your application's dependencies.
3. Configure your web application to connect to the PostgreSQL instance.
4. Create models in your ORM, either with Django's built-in ORM or
SQLAlchemy with Flask.
5. Build your database tables or sync the ORM models with the
PostgreSQL instance, if you're using an ORM.
6. Start creating, reading, updating and deleting data in the database
from your web application.
PostgreSQL
PostgreSQL, often written as "Postgres" and pronounced "Poss-gres", is an
open source relational database implementation frequently used by Python
applications as a backend for data storage and retrieval.
116
Full Stack Python: 2020 Supporter's Edition
How does PostgreSQL fit within the Python stack?
PostgreSQL is the default database choice for many Python developers,
including the Django team when testing the Django ORM. PostgreSQL is
often viewed as more feature robust and stable when compared to MySQL,
SQLServer and Oracle. All of those databases are reasonable choices.
However, because PostgreSQL tends to be used by Python developers the
drivers and example code for using the database tend to be better documented
and contain fewer bugs for typical usage scenarios. If you try to use an Oracle
database with Django, you'll see there is far less example code for that setup
compared to PostgreSQL backend setups.
Why is PostgreSQL a good database choice?
PostgreSQL's open source license allows developers to operate one or more
databases without licensing cost in their applications. The open source license
operating model is much less expensive compared to Oracle or other
proprietary databases, especially as replication and sharding become
necessary at large scale. In addition, because so many people ranging from
independent developers to multinational organizations use PostgreSQL, it's
often easier to find developers with PostgreSQL experience than other
relational databases. There is also ancedotal evidence that PostgreSQL fixes
bugs faster than MySQL, although to be fair there has not been a
comprehensive study comparing how the two projects handle defect
resolution.
The PostgreSQL core team also releases frequent updates that greatly enhance
the database's capabilities. For example, in the PostgreSQL 9.4 release the
117
Full Stack Python: 2020 Supporter's Edition
jsonb type was added to enhance JavaScript Object Notation (JSON) storage
capabilities so that in many cases a separate NoSQL database is not required
in an application's architecture.
Connecting to PostgreSQL with Python
To work with relational databases in Python you need to use a database
driver, which is also referred to as a database connector. The most common
driver library for working with PostgreSQL is psycopg2. There is a list of all
drivers on the PostgreSQL wiki, including several libraries that are no longer
maintained. If you're working with the asyncio Python stdlib module you
should also take a look at the aiopg library which wraps psycopg2's
asychronouos features together.
To abstract the connection between tables and objects, many Python
developers use an object-relational mapper (ORM) with to turn relational
data from PostgreSQL into objects that can be used in their Python
application. For example, while PostgreSQL provides a relational database
and psycopg is the common database connector, there are many ORMs that
can be used with varying web frameworks, as shown in the table below.
Learn more about Python ORMs on that dedicated topic page.
PostgreSQL data safety
If you're on Linux it's easy to get PostgreSQL installed using a package
manager. However, once the database is installed and running your
118
Full Stack Python: 2020 Supporter's Edition
responsibility is just beginning. Before you go live with a production
application, make sure to:
1. Lock down access with a whitelist in the pg_hba.conf file
2. Enable replication to another database that's preferrably on different
infrastructure in a separate location
3. Perform regular backups and test the restoration process
4. Ensure your application prevents SQL injection attacks
When possible have someone qualified do a PostgreSQL security audit to
identify the biggest risks to your database. Small applications and
bootstrapped companies often cannot afford a full audit in the beginning but
as an application grows over time it becomes a bigger target.
The data stored in your database is the lifeblood of your application. If you
have ever accidentally dropped a production database or been the victim of
malicious activity such as SQL injection attacks, you'll know it's far easier to
recover when a bit of work has been performed beforehand on backups,
replication and security measures.
Python-specific PostgreSQL resources
Many quickstarts and tutorials exist specifically for Django, Flask and other
web application frameworks. The ones below are some of the best
walkthroughs I've read.
• Setting up PostgreSQL with Python 3 and psycopg on Ubuntu 16.04
provides instructions for getting a fresh Ubuntu install working with
PostgreSQL and Python 3.
• This post on using PostgreSQL with Django or Flask is a great
quickstart guide for either framework.
• This article explains how and why PostgreSQL can handle full text
searching for many use cases. If you're going down this route, read this
blog post that explains how one developer implemented PostgreSQL
full text search with SQLAlchemy.
• django-postgres-copy is a tool for bulk loading data into a PostgreSQL
database based on Django models. Say hello to our new open-source
119
Full Stack Python: 2020 Supporter's Edition
software for loading bulk data into PostgreSQL is an introduction to
using the tool in your own projects.
• How to speed up tests in Django and PostgreSQL explains some hacks
for making your schema migration-backed run quicker.
• Thinking psycopg3 is written by a developer who has worked on this
critical Python library for interacting with PostgreSQL since 2005. The
author writes up thoughts on what should change if backwardsincompatible changes are ever introduced in a new hypothetical future
version.
• Records is a wrapper around the psycopg2 driver that allows easy
access to direct SQL access. It's worth a look if you prefer writing SQL
over using an ORM like SQLAlchemy.
o Postgres Joins and Django Querysets is a well done post with a specific
example of how a standard Django ORM query can lead to degraded
performance due when obtaining data from many related tables. The
prefetch_related command and database performance monitoring tools
can help analyze and alleviate some of the issues in these unoptimized
queries.
• Loading Google Analytics data to PostgreSQL using Python is a quality
tutorial that combines API calls with psycopg and PostgreSQL to take
data from Google Analytics and save it in a PostgreSQL database.
• 1M rows/s from Postgres to Python shows some benchmarks for the
performance of the asyncpg Python database client and why you may
want to consider using it for data transfers.
General PostgreSQL resources
PostgreSQL tutorials not specific to Python are also really helpful for properly
handling your data.
• Why PostgreSQL? (5 years later) covers the improvements that have
been made to PostgreSQL over the past five years. It's amazing to see
how far this project has come and how it continues to evolve.
120
Full Stack Python: 2020 Supporter's Edition
• The Internals of PostgreSQL is a book that goes into how PostgreSQL
works, including core topics such as query processing, concurrency
control and the layout of heap table files.
• PostgreSQL Weekly is a weekly newsletter of PostgreSQL content from
around the web.
• My Favorite PostgreSQL Extensions - Part One and part two are
roundups of useful PostgreSQL extensions that augment the standard
PostgreSQL functionality.
• An introduction to PostgreSQL physical storage provides a solid
walkthrough of where PostgreSQL files are located on disk, how the
files store your data and what mappings are important for the
underlying database structure. This post is an easy read and well worth
your time.
• Braintree wrote about their experiences scaling PostgreSQL. The post
is an inside look at the evolution of Braintree's usage of the database.
• There is no such thing as total security but this IBM article covers
hardening a PostgreSQL database.
• Handling growth with Postgres provides 5 specific tips from
Instagram's engineering team on how to scale the design of your
PostgreSQL database.
• Inserting And Using A New Record In Postgres shows some SQL
equivalents to what many developers just do in their ORM of choice.
• Following a Select Statement Through Postgres Internals provides a
fascinating look into the internal workings of PostgreSQL during a
query.
• Locating the recovery point just before a dropped table and logging
transactions that dropped tables are two posts that show you how to
recover from an accidentally dropped table. In the first post the author
shows how recovery is possible with recovery points while the second
post shows how to put logging in place to assist in future recoveries.
• awesome-postgres is a list of code libraries, tutorials and newsletters
focused specifically on PostgreSQL.
121
Full Stack Python: 2020 Supporter's Edition
• While you can use a graphical interface for working with PostgreSQL,
it's best to spend some time getting comfortable with the commandline interface.
• Backing up databases is important because data loss can and does
happen. This article explains how to back up a PostgreSQL database
hosted on an Amazon Web Services EC2 instance if managing your own
database on a cloud server is your preferred setup.
• Is bi-directional replication (BDR) in PostgreSQL transactional?
explores a relatively obscure topic with the final result that BDR is
similar to data stores with eventual consistency rather than consistency
as a requirement.
• PostgreSQL-metrics is a tool built by Spotify's engineers that extracts
and outputs metrics from an existing PostgreSQL database. There's
also a way to extend the tools to pull custom metrics as well.
• Creating a Document-Store Hybrid in Postgres 9.5 explains how to
store and query JSON data, similar to how NoSQL data stores operate.
• This slideshow on high availability for web applications has a good
overview of various database setups common in production web
applications.
• The JSONB data type was introduced in PostgreSQL 9.4 to make it
easier to store semi-structured data that previously NoSQL databases
such as MongoDB covered. However, there are times when using
JSONB isn't a good idea and this blog post covers when to avoid the
column type.
PostgreSQL monitoring and performance
Monitoring one or more PostgreSQL instances and trying to performance
tune them is a rare skillset. Here are some resources to get you started if you
have to handle these issues in your applications.
• This guide to PostgreSQL monitoring is handy for knowing what to
measure and how to do it.
• Craig Kerstiens wrote a detailed post about understanding PostgreSQL
performance.
122
Full Stack Python: 2020 Supporter's Edition
• The Practical Guide to PostgreSQL Optimizations covers using cache
sizes, restore configurations and shared buffers to improve database
performance.
• This article on performance tuning PostgreSQL shows how to find slow
queries, tune indexes and modify your queries to run faster.
• What PostgreSQL tells you about its performance explains how to
gather general performance metrics and provides the exact queries you
should run to get them. The article also covers performance monitoring
and how to analyze trigger functions.
• PostgreSQL monitoring queries is a simple GitHub repository of SQL
queries that can be run against a PostgreSQL instance to determine
usage, caching and bloat.
• PgSQL Indexes and "LIKE" examines why LIKE queries do not take
advantage of PostgreSQL indexes when the locale is set to something
other than the default "C", which is for the North American UNIX
default. The gist is that you need to build a special index to support
LIKE whenever you use a locale other than "C".
• The PostgreSQL page on PopSQL has a ton of useful syntax snippets
categorized by type of action you want to perform using SQL.
MySQL
MySQL is an open source relational database implementation for storing and
retrieving data.
123
Full Stack Python: 2020 Supporter's Edition
MySQL or PostgreSQL?
MySQL is a viable open source database implementation for Python web
applications. MySQL has a slightly easier initial learning curve than
PostgreSQL. However, PostgreSQL's design is often preferred by Python web
developers, especially when data migrations are run as an application evolves.
Python Drivers for MySQL
Accessing MySQL from a Python application requires a database driver (also
called a "connector"). While it is possible to write a driver as part of your
application, in practice most developers use an existing open source driver.
There was a major issue with MySQL drivers since the introduction of Python
3. One of the most popular libraries called MySQLdb did not work in its
existing form with Python 3 and there were no plans to update it. Therefore a
fork of MySQLdb named mysqlclient added Python 3 compatibility.
The mysqlclient fork was good in that existing MySQLdb users could drop
mysqlclient into existing projects that were upgrading to Python 3. However,
the fork often causes confusion when searching for which Python driver to use
with MySQL. Many developer simply decide to use PostgreSQL because there
is better support for Python drivers in the PostgreSQL community.
124
Full Stack Python: 2020 Supporter's Edition
With that driver support context in mind, it's absolutely possible to build a
Python 3 web application with MySQL as a backend. Here is a list of drivers
along with whether it supports Python 2, 3 or both.
• mysqlclient is a fork of MySQLdb that supports Python 2 and 3.
• MySQL Connector is Oracle's "official" (Oracle currently owns MySQL)
Python connector. The driver supports Python 2 and 3, just make sure
to check the version guide for what releases work with which Python
versions.
• MySQLdb supports Python 2 and was frequently used by Python web
applications before the mass migration to Python 3 began.
• PyMySQL is a pure Python (no C low-level code) implementation that
attempts to be a drop-in replacement for MySQLdb. However, some
MySQL APIs are not supported by the driver so whether or not your
application can use this connector will depend on what you're building.
What organizations use MySQL?
The database is deployed in production at some of the highest trafficked sites
such as Uber, Twitter, Facebook and many others major organizations.
However, since MySQL AB, the company that developed MySQL, was
purchased by Sun Microsystems (which was in turn purchased by Oracle),
there have been major defections away from the database by Wikipedia and
Google. MySQL remains a viable database option but I always recommend
new Python developers learn PostgreSQL if they do not already know MySQL.
Python-specific MySQL resources
The following resources show you how to work with MySQL in your Python
code either directly through SQL queries or less directly with an objectrelational mapper (ORM) like SQLAlchemy or the Django ORM.
• Python MySQL tutorial uses the MySQL Connector Python library to
demonstrate how to run queries and stored procedures in your Python
applications.
125
Full Stack Python: 2020 Supporter's Edition
• Python 3.4.0 with MySQL database and Python 3 and MySQL provide
context for the commonly asked question about which database
MySQL driver to use with Python 3.
• Terrible Choices: MySQL is a blog post about specific deficiencies in
MySQL's implementation that hinder its usage with Django's ORM.
• MySQL Python tutorial uses the MySQLdb driver to connect to a
MySQL server instance and shows some examples for inserting and
querying data.
General MySQL resources
There are many programming language agnostic tutorials for MySQL. A
handful of the best of these tutorials are listed below.
• How to Install and Use MySQL on Ubuntu 16.04 is a quick tutorial for
getting up and running on Ubuntu Linux.
• 28 Beginner's Tutorials for Learning about MySQL Databases is a
curated collection on various introductory MySQL topics.
• A Basic MySQL Tutorial doesn't have the most original title but it's a
good walkthrough of your first few steps in MySQL for creating users
and working with tables.
• mycli is a command line interface for MySQL that includes command
completion and other super handy features.
• Bye Bye MySQL & MongoDB, Guten Tag PostgreSQL goes into details
for why the company Userlike migrated from their MySQL database
setup to PostgreSQL.
• MySQL sharding at Quora provides details behind Quora's at-scale
infrastructure and how their MySQL sharding evolved over time.
• Growing up with MySQL is a story about how one company went
through dramatic growth and had to keep up with it by quickly scaling
their MySQL database.
• Monitoring MySQL metrics is the first of a three part series, with the
other parts on collecting metrics and monitoring & collecting
specifically with the DataDog tool. The series explains what metrics you
126
Full Stack Python: 2020 Supporter's Edition
should be collecting and monitoring in your production database along
with the purpose for why those metrics are important.
• gh-ost (source code) is a schema migration tool built by GitHub and
open sourced to the development community. The advantages of gh-ost
are sustainable workloads on the master node to allow it to keep
serving inbound query requests and the ability to pause the migration.
The post on how to use gh-ost pairs nicely with GitHub's detailed writeup on how they perform backups, failover and schema migrations in
MySQL infrastructure testing automation at GitHub.
• The unofficial MySQL optimizers guide is intended for experienced
developers who need to get better performance out of MySQL for their
specific use cases.
• The Ultimate Postgres vs MySQL Blog Post provides comparisons of
data types, default values, arrays, joins and many other differences
between MySQL and PostgreSQL.
SQLite
SQLite is an open source relational database included with the Python
standard library as of Python 2.5. The pysqlite database driver is also included
with the standard library so that no further external dependencies are
required to access a SQLite database from within Python applications.
127
Full Stack Python: 2020 Supporter's Edition
Useful SQLite tools and code
SQLite is used in such a wide variety of industries that there are open source
tools and example code for all kinds of edge case uses. Here are several tools
and bits of code I have found useful while coding my applications:
• sqlitebiter (source code) is a command-line tool for converting various
data formats such as comma-separated values (CSV), HTML,
Markdown and JSON (among others) into a SQLite database file.
• Scout (source code) is a Flask-powered search server for SQLite
backends. The introductory post is really handy for getting started with
Scout.
• Datasette makes it easy to expose JSON APIs from your SQLite
database without coding up a custom web application. Make sure to
check out the Datasette getting started guide as well.
• SQLite Browser is an open source graphical user interface for working
with SQLite.
• The Membership SQLite SQL scripts provide example code for storing
user accounts, roles and authentication tokens in web applications.
• ExtendsClass is an online SQLite browser.
SQLite tutorials
It's a good idea to brush up on the basics for using SQLite before you use the
database in your project through SQL scripts or via an object-relational
mapper. These tutorials will help you get started.
• A simple step-by-step SQLite tutorial walks through creating databases
as well as inserting, updating, querying and deleting data.
• A Minimalist Guide to SQLite shows how to install SQLite, load data
and work with the data stored in a new SQLite database.
• Python SQLite3 Basics covers how to connect to a SQLite database in
Python, executing statements, committing and retrieving saved values.
• sqlite3 - embedded relational database is an extensive tutorial showing
many of the common create, read, update and delete operations a
developer would want to do with SQLite.
128
Full Stack Python: 2020 Supporter's Edition
• The official sqlite3 module in the Python stdlib docs contains a bunch
of scenarios with code for how to use the database from a Python
application.
• Finding bugs in SQLite, the easy way explains how a bug was found and quickly fixed - in the SQLite codebase. It's a great short read which
shows that the code is well-tested and maintained.
• Data Analysis of 8.2 Million Rows with Python and SQLite explains
how you can load a large dataset in to SQLite and visualize it using the
Plotly service.
• SQLite: The art of keep it simple uses C code examples from SQLite's
codebase to show how its design has been kept consistent and tight
throughout 15+ years of active development. There's also a great design
document on the SQLite site that covers many of these principles.
• My list of SQLite resources is a nice roundup of useful tools to use with
SQLite and tutorials for learning more about the database.
• Python SQLite3 tutorial provides another beginner's tutorial using the
built-in sqlite3 Python standard library module.
• A SQLite tutorial with Python covers both SQL and Python code to
interact with SQLite.
Specific SQLite scenarios
These are solid resources if you are looking to solve a particular problem you
are having with SQLite rather than going through a general tutorial.
• Let's Build a Simple Database is an awesome read where the author recreates a SQLite-type database for learning purposes.
• We are pretty happy with SQLite & not urgently interested in a fancier
DBMS gives the rationale behind one development teams' decision to
stick to SQLite instead of porting to another relational database such as
MySQL or PostgreSQL.
• This overview of SQLite as part of the Databaseology Lectures is
amazing because they are given by the creator and he shines a ton of
light on how SQLite is built and why.
129
Full Stack Python: 2020 Supporter's Edition
• How SQLite is tested digs into the nitty-gritty behind the quality
assurance practices for testing potential SQLite releases.
• Using the SQLite JSON1 and FTS5 Extensions with Python shows how
to compile SQLite 3.9.0+ with json1 and fts5 (full-text search) support
to use these new features.
• SQLite with a fine-toothed comb digs into the internals of SQLite and
shows some bugs found (and since fixed) while the author was
researching the SQLite source code.
• Going Fast with SQLite and Python shares essential knowledge for
working effectively with SQLite in Python, particularly when it comes
to transactions, concurrency and commits.
• Extending SQLite with Python uses the Peewee object-relational
mapper (ORM) to implement virtual tables and aggregates on top of
SQLite.
• Use SQLite with Django On AWS Lambda with Zappa provides an
example dev_settings.py file for locally testing a Django application
intended for AWS Lambda.
• SQLite Database Authorization and Access Control with Python covers
how to control access to the SQLite database connection and file even
though SQLite normally allows unauthorized access by design.
• Can I read and write to a SQLite database concurrently from multiple
connections? answers one of the concerns that was an issue in earlier
versions of SQLite that could have issues if more than one connection
was writing to the database at one time.
• Appropriate uses for SQLite is an official documentation page that
explains what types of applications are designed to work well with
SQLite as the backend.
• How to corrupt a SQLite file explains how the database file could
potentially get corrupted if you really work at screwing it up.
130
Full Stack Python: 2020 Supporter's Edition
Object-relational Mappers (ORMs)
An object-relational mapper (ORM) is a code library that automates the
transfer of data stored in relational databases tables into objects that are more
commonly used in application code.
Why are ORMs useful?
ORMs provide a high-level abstraction upon a relational database that allows
a developer to write Python code instead of SQL to create, read, update and
delete data and schemas in their database. Developers can use the
programming language they are comfortable with to work with a database
instead of writing SQL statements or stored procedures.
For example, without an ORM a developer would write the following SQL
statement to retrieve every row in the USERS table where the zip_code
column is 94107:
SELECT * FROM USERS WHERE zip_code=94107;
The equivalent Django ORM query would instead look like the following
Python code:
# obtain everyone in the 94107 zip code and assign to users variable
users = Users.objects.filter(zip_code=94107)
The ability to write Python code instead of SQL can speed up web application
development, especially at the beginning of a project. The potential
development speed boost comes from not having to switch from Python code
into writing declarative paradigm SQL statements. While some software
developers may not mind switching back and forth between languages, it's
131
Full Stack Python: 2020 Supporter's Edition
typically easier to knock out a prototype or start a web application using a
single programming language.
ORMs also make it theoretically possible to switch an application between
various relational databases. For example, a developer could use SQLite for
local development and MySQL in production. A production application could
be switched from MySQL to PostgreSQL with minimal code modifications.
In practice however, it's best to use the same database for local development
as is used in production. Otherwise unexpected errors could hit in production
that were not seen in a local development environment. Also, it's rare that a
project would switch from one database in production to another one unless
there was a pressing reason.
Do I have to use an ORM for my web application?
Python ORM libraries are not required for accessing relational databases. In
fact, the low-level access is typically provided by another library called a
database connector, such as psycopg (for PostgreSQL) or MySQL-python (for
MySQL). Take a look at the table below which shows how ORMs can work
with different web frameworks and connectors and relational databases.
The above table shows for example that SQLAlchemy can work with varying
web frameworks and database connectors. Developers can also use ORMs
without a web framework, such as when creating a data analysis tool or a
batch script without a user interface.
132
Full Stack Python: 2020 Supporter's Edition
What are the downsides of using an ORM?
There are numerous downsides of ORMs, including
1. Impedance mismatch
2. Potential for reduced performance
3. Shifting complexity from the database into the application code
Impedance mismatch
The phrase "impedance mismatch" is commonly used in conjunction with
ORMs. Impedance mismatch is a catch-all term for the difficulties that occur
when moving data between relational tables and application objects. The gist
is that the way a developer uses objects is different from how data is stored
and joined in relational tables.
This article on ORM impedance mismatch does a solid job of explaing what
the concept is at a high level and provides diagrams to visualize why the
problem occurs.
Potential for reduced performance
One of the concerns that's associated with any higher-level abstraction or
framework is potential for reduced performance. With ORMs, the
performance hit comes from the translation of application code into a
corresponding SQL statement which may not be tuned properly.
ORMs are also often easy to try but difficult to master. For example, a
beginner using Django might not know about the select_related()
function and how it can improve some queries' foreign key relationship
performance. There are dozens of performance tips and tricks for every ORM.
It's possible that investing time in learning those quirks may be better spent
just learning SQL and how to write stored procedures.
There's a lot of hand-waving "may or may not" and "potential for" in this
section. In large projects ORMs are good enough for roughly 80-90% of use
cases but in 10-20% of a project's database interactions there can be major
performance improvements by having a knowledgeable database
administrator write tuned SQL statements to replace the ORM's generated
SQL code.
133
Full Stack Python: 2020 Supporter's Edition
Shifting complexity from the database into the app code
The code for working with an application's data has to live somewhere. Before
ORMs were common, database stored procedures were used to encapsulate
the database logic. With an ORM, the data manipulation code instead lives
within the application's Python codebase. The addition of data handling logic
in the codebase generally isn't an issue with a sound application design, but it
does increase the total amount of Python code instead of splitting code
between the application and the database stored procedures.
Python ORM Implementations
There are numerous ORM implementations written in Python, including
1.
2.
3.
4.
5.
6.
SQLAlchemy
Peewee
The Django ORM
PonyORM
SQLObject
Tortoise ORM (source code)
There are other ORMs, such as Canonical's Storm, but most of them do not
appear to currently be under active development. Learn more about the major
active ORMs below.
Django's ORM
The Django web framework comes with its own built-in object-relational
mapping module, generally referred to as "the Django ORM" or "Django's
ORM".
Django's ORM works well for simple and medium-complexity database
operations. However, there are often complaints that the ORM makes
complex queries much more complicated than writing straight SQL or using
SQLAlchemy.
It is technically possible to drop down to SQL but it ties the queries to a
specific database implementation. The ORM is coupled closely with Django so
replacing the default ORM with SQLAlchemy is currently a hack workaround.
Note though it is possible that swappable ORM backends will be possible in
134
Full Stack Python: 2020 Supporter's Edition
the future as it is now possible to change the template engine for rendering
output in Django.
Since the majority of Django projects are tied to the default ORM, it is best to
read up on advanced use cases and tools for doing your best work within the
existing framework.
SQLAlchemy ORM
SQLAlchemy is a well-regarded Python ORM because it gets the abstraction
level "just right" and seems to make complex database queries easier to write
than the Django ORM in most cases. There is an entire page on SQLAlchemy
that you should read if you want to learn more about using the library.
Peewee ORM
Peewee is a Python ORM implementation that is written to be "simpler,
smaller and more hackable" than SQLAlchemy. Read the full Peewee page for
more information on the Python ORM implementation.
Pony
Pony ORM is another Python ORM available as open source, under the
Apache 2.0 license.
SQLObject ORM
SQLObject is an ORM that has been under active open source development
for over 14 years, since before 2003.
Schema migrations
Schema migrations, for example when you need to add a new column to an
existing table in your database, are not technically part of ORMs. However,
since ORMs typically lead to a hands-off approach to the database (at the
developers peril in many cases), libraries to perform schema migrations often
go hand-in-hand with Python ORM usage on web application projects.
Database schema migrations are a complex topic and deserve their own page.
For now, we'll lump schema migration resources under ORM links below.
135
Full Stack Python: 2020 Supporter's Edition
General ORM resources
• This detailed overview of ORMs is a generic description of how ORMs
work and how to use them.
• This example GitHub project implements the same Flask application
with several different ORMs: SQLAlchemy, Peewee, MongoEngine,
stdnet and PonyORM.
• Martin Fowler addresses the ORM hate in an essay about how ORMs
are often misused but that they do provide benefits to developers.
• The Rise and Fall of Object Relational Mapping is a talk on the history
of ORMs that doesn't shy away from some controversy. Overall I found
the critique of conceptual ideas worth the time it took to read the
presentation slides and companion text.
• If you're confused about the difference between a connector, such as
MySQL-python and an ORM like SQLAlchemy, read this
StackOverflow answer on the topic.
• What ORMs have taught me: just learn SQL is another angle in the
ORM versus embedded SQL / stored procedures debate. The author's
conclusion is that while working with ORMs such as SQLAlchemy and
Hibernate (a Java-based ORM) can save time up front there are issues
as a project evolves such as partial objects and schema redundancies. I
think the author makes some valid points that some ORMs can be a
shaky foundation for extremely complicated database-backed
applications. However, I disagree with the overriding conclusion to
eschew ORMs in favor of stored procedures. Stored procedures have
their own issues and there are no perfect solutions, but I personally
prefer using an ORM at the start of almost every project even if it later
needs to be replaced with direct SQL queries.
• The Vietnam of Computer Science provides the perspective from Ted
Neward, the originator of the phrase "Object/relational mapping is the
Vietnam of Computer Science" that he first spoke about in 2004. The
gist of the argument against ORMs is captured in Ted's quote that an
ORM "represents a quagmire which starts well, gets more complicated
as time passes, and before long entraps its users in a commitment that
has no clear demarcation point, no clear win conditions, and no clear
136
Full Stack Python: 2020 Supporter's Edition
exit strategy." There are follow up posts on Coding Horror and another
one from Ted entitled thoughts on Vietnam commentary.
• Turning the Tables: How to Get Along with your Object-Relational
Mapper coins the funny but insightful phrase "database denial" to
describe how some ORMs provide a usage model that can cause more
issues than they solve over straight SQL queries. The post then goes
into much more detail about the problems that can arise and how to
mitigate or avoid them.
SQLAlchemy and Peewee resources
A comprehensive list of SQLAlchemy and Peewee ORM resources can be
found on their respective pages.
Django ORM links
A curated list of resources can be found on the dedicated Django ORM
resources page.
Pony ORM resources
All Pony ORM resources are listed on the dedicated Pony ORM page.
SQLObject resources
SQLObject has been around for a long time as an open source project but
unfortunately there are not that many tutorials for it. The following talks and
posts will get you started. If you take an interest in the project and write
additional resources, file an issue ticket so we can get them added to this list.
• This post on Object-Relational Mapping with SQLObject explains the
concept behind ORMs and shows the Python code for how they can be
used.
• Ian Bicking presented on SQLObject back in 2004 with a talk on
SQLObject and Database Programming in Python.
• Connecting databases to Python with SQLObject is an older post but
still relevant with getting started basics.
137
Full Stack Python: 2020 Supporter's Edition
SQLAlchemy
SQLAlchemy (source code) is a well-regarded database toolkit and objectrelational mapper (ORM) implementation written in Python. SQLAlchemy
provides a generalized interface for creating and executing database-agnostic
code without needing to write SQL statements.
Why is SQLAlchemy a good ORM choice?
SQLAlchemy isn't just an ORM- it also provides SQLAlchemy Core for
performing database work that is abstracted from the implementation
differences between PostgreSQL, SQLite, etc. In some ways, the ORM is a
bonus to Core that automates commonly-required create, read, update and
delete operations.
SQLAlchemy can be used with or without the ORM features. Any given project
can choose to just use SQLAlchemy Core or both Core and the ORM. The
following diagram shows a few example configurations with various
application software stacks and backend databases. Any of these
configurations can be a valid option depending on what type of application
you are coding.
138
Full Stack Python: 2020 Supporter's Edition
A benefit many developers enjoy with SQLAlchemy is that it allows them to
write Python code in their project to map from the database schema to the
applications' Python objects. No SQL is required to create, maintain and
query the database. The mapping allows SQLAlchemy to handle the
underlying database so developers can work with their Python objects instead
of writing bridge code to get data in and out of relational tables.
How does SQLAlchemy code compare to raw SQL?
Below is an example of a SQLAlchemy model definition from the open source
compare-python-web-frameworks project that uses SQLAlchemy with Flask
and Flask-SQLAlchemy.
class Contact(db.Model):
__tablename__ = 'contacts'
id = db.Column(db.Integer, primary_key=True)
first_name = db.Column(db.String(100))
last_name = db.Column(db.String(100))
phone_number = db.Column(db.String(32))
def __repr__(self):
return '<Contact {0} {1}: {2}>'.format(self.first_name,
self.last_name,
self.phone_number)
SQLAlchemy handles the table creation that otherwise we would have had to
write a create table statement like this one to do the work:
CREATE TABLE CONTACTS(
ID INT PRIMARY KEY
FIRST_NAME
CHAR(100)
NOT NULL,
NOT NULL,
139
Full Stack Python: 2020 Supporter's Edition
LAST_NAME
PHONE_NUMBER
CHAR(100)
CHAR(32)
NOT NULL,
NOT NULL,
);
By using SQLAlchemy in our Python code, all records can be obtained with a
line like contacts = Contact.query.all() instead of a plain SQL such as
SELECT * FROM contacts . That may not look like much of a difference in
syntax but writing the queries in Python is often faster and easier for many
Python developers once multiple tables and specific filtering on fields for
queries have to be written. In addition, SQLAlchemy abstracts away
idiosyncratic differences between database implementations in SQLite,
MySQL and PostgreSQL.
SQLAlchemy Extensions, Plug-ins and Related Libraries
Take a look at the SQLAlchemy extensions, plug-ins and related libraries page
for a curated list of useful code libraries to use with SQLAlchemy.
Using SQLAlchemy with Web Frameworks
There is no reason why you cannot use the SQLAlchemy library in any
application that requires a database backend. However, if you are building a
web app with Flask, Bottle or another web framework then take a look at the
following extensions. They provide some glue code along with helper
functions that can reduce the boilerplate code needed to connect your
application's code with the SQLAlchemy library.
• SQLAlchemy is typically used with Flask as the database ORM via the
Flask-SQLAlchemy extension.
• The bottle-sqlalchemy extension for Bottle provides a bridge between
the standard SQLAlchemy library and Bottle. However, from my
experience using the library it does not have quite as many helper
functions as Flask-SQLAlchemy.
• Pyramid uses the alchemy scaffold to make it easy to add SQLAlchemy
to a Pyramid web app.
• While Django does not yet support easy swapping of the default Django
backend ORM with SQLAlchemy (like it does for template engines),
there are hacks for using SQLAlchemy within Django projects.
140
Full Stack Python: 2020 Supporter's Edition
• Morepath has easy-to-use support for SQLAlchemy via its
more.transaction module. There is a morepath-sqlalchemy demo that
serves as a working example.
• Merging Django ORM with SQLAlchemy for Easier Data Analysis has
details on why, how and when you may want to use SQLAlchemy to
augment the Django ORM.
• Building a Simple Birthday App with Flask-SQLAlchemy combines
SQLAlchemy with Flask to create a birthday reminder application.
SQLAlchemy resources
The best way to get comfortable with SQLAlchemy is to dig in and write a
database-driven application. The following resources can be helpful if you are
having trouble getting started or are starting to run into some edge cases.
• There is an entire chapter in the Architecture of Open Source
Applications book on SQLAlchemy. The content is detailed and well
worth reading to understand what is executing under the covers.
• The SQLAlchemy cheatsheet has many examples for querying,
generating database metadata and many other common (and not so
common) operations when working with Core and the ORM.
• 10 reasons to love SQLAlchemy is a bit of a non-critical lovefest for the
code library. However, the post makes some good points about the
quality of SQLAlchemy's documentation and what a pleasure it can be
to use it in a Python project.
• SQLAlchemy and Django explains how one development team uses the
Django ORM for most of their standard queries but relies on
SQLAlchemy for really advanced queries.
• This SQLAlchemy tutorial provides a slew of code examples that cover
the basics for working with SQLAlchemy.
• Implementing User Comments with SQLAlchemy gives a wonderful
walkthrough of how to build your own online commenting system in
Python using SQLAlchemy.
141
Full Stack Python: 2020 Supporter's Edition
• Master SQLAlchemy Relationships in a Performance Friendly Way
dives into code that shows how to improve performance when setting
and accessing relationship-based data in your models.
• SQLAlchemy and data access in Python is a podcast interview with the
creator of SQLAlchemy that covers the project's history and how it has
evolved over the past decade.
• Most Flask developers use SQLAlchemy as an ORM to relational
databases. If you're unfamiliar with SQLAlchemy questions will often
come up such as what's the difference between flush and commit? that
are important to understand as you build out your app.
• SQLAlchemy in batches shows the code that a popular iOS application
runs in background batch scripts which uses SQLAlchemy to generate
playlists. They provide some context and advice for using SQLAlchemy
in batch scripts.
• Getting PostgreSQL transactions under control with SQLAlchemy
provides a quick introduction to the tool Chryso that they are working
on to provide better transaction management in SQLAlchemy
connections.
SQLAlchemy compared to other ORMs
SQLAlchemy is one of many Python object-relational mapper (ORM)
implementations. Several open source projects and articles are listed here to
make it a bit easier to understand the differences between these
implementations.
• Introduction to SQLAlchemy ORM for Django Developers is written by
a developer who typically used the Django ORM at work and then had a
chance to try SQLAlchemy for one project. He covers differences in
how each one handles transactions, models and queries.
• SQLAlchemy vs Other ORMs provides a detailed comparison of
SQLAlchemy against alternatives.
• If you're interested in the differences between SQLAlchemy and the
Django ORM I recommend reading SQLAlchemy and You by Armin
Ronacher.
142
Full Stack Python: 2020 Supporter's Edition
• This GitHub project named PythonORMSleepy implements the same
Flask application with several different ORMs: SQLAlchemy, Peewee,
MongoEngine, stdnet and PonyORM. Looking through the code is
helpful for understanding the varying approaches each library takes to
accomplish a similar objective.
• Quora has several answers to the question of which is better and why:
Django ORM or SQLALchemy based on various developers'
experiences.
Open source code for learning SQLAlchemy
Many open source projects rely on SQLAlchemy. A great way to learn how to
properly work with this tool is to read the code that shows how those projects
use SQLAlchemy. This section alphabetically lists these code examples by
class and function in the SQLAlchemy code base.
Peewee
Peewee (source code) is a object-relational mapper (ORM) implementation
for bridging data stored in relational database tables with Python objects.
What makes Peewee a useful ORM?
Peewee can be an easier library to wrap your head around than SQLAlchemy
and other ORMs. It is designed to be easier to hack on and understand,
similar to how Bottle is a smaller, one-file web framework compared to the
comprehensive Django framework. If you are just getting started with web
development, it may be worth using Peewee for your database mapping and
operations, especially if you use a microframework such as Flask or Bottle.
143
Full Stack Python: 2020 Supporter's Edition
Peewee can be used with pretty much any web framework (although using it
with Django would currently be complicated due to its tight built-in ORM
coupling) or without a web framework. In the latter case Peewee is good for
pulling data out of a relational database in a script or Jupyter notebook.
Any of the common relational database backends such as PostgreSQL, MySQL
or SQLite are supported, although a database driver is still required. The chart
below shows a few example configurations that could use Peewee as an ORM.
How does Peewee compare to other Python ORMs?
The analogy used by the core Peewee author is that Peewee is to SQLAlchemy
as SQLite is to PostgreSQL. An ORM does not have to work for every
exhaustive use case in order to be useful.
Peewee resources
Peewee is a much newer library than several other Python ORMs. For
example, Peewee's first public commit was in 2010, compared to 2005 for
SQLAlchemy. The project is still over five years old though and matured
substantially in development during that time. However, there are typically
less resources and examples available to demonstrate how to use Peewee in
your projects than some other ORMs that have been around for a longer
period of time.
Many of the best resources come from the project's author, Charles Leifer, on
his blog and on the official site. There are also hundreds of questions
144
Full Stack Python: 2020 Supporter's Edition
answered on the Stack Overflow peewee tag, so as usual that can be a rich
source of examples for your Peewee-powered Python applications.
• An encrypted command-line diary with Python is an awesome
walkthrough explaining how to use SQLite, SQLCipher and Peewee to
create an encrypted file with your contents, diary or otherwise.
• An Intro to Peewee – Another Python ORM is a short tutorial that
walks through creating a database model mapping, adding data,
deleting records and querying fields.
• Introduction to peewee uses an example public dataset, loads it into a
SQLite database and shows how to query it using Peewee.
• Shortcomings in the Django ORM and a look at Peewee from the
author of the Peewee ORM explains how some of the design decisions
made in Peewee were in reaction to parts of the Django ORM that
didn't work so well in practice.
• The official Peewee quickstart documentation along with the example
Twitter clone app will walk you through the ins and outs of your first
couple Peewee-powered projects.
• Flask and Peewee 101 has some basic code for querying with Peewee
and populating a drop-down in a Jinja2 template. Note that the Flaskpeewee extension is no longer maintained, although you do not need to
use it to work with both Flask and Peewee in an application.
• How to make a Flask blog in one hour or less is a well written tutorial
that uses the Peewee ORM instead of SQLAlchemy for the blog back
end.
• These posts on querying the top item by group with Peewee ORM and
querying the top N objects per group with Peewee ORM provide
working examples on how to properly query your data via Peewee.
• There was a good discussion in a Python subreddit thread about the
differences between Peewee and SQLAlchemy. Charles Leifer even
chimed in to add his own fair assessment of the differences in the
ORMs.
145
Full Stack Python: 2020 Supporter's Edition
• peewee-async (source code) is an alpha library for using Python 3's
asyncio standard library with Peewee. This library is worth watching if
you use an async web framework and want to have Peewee serve as
your application's ORM.
• Accessing remote MySQL database with peewee debugs a question
where the original author had issues accessing a remote MySQL
database because they did not properly include the Model class from
peewee.py when instantiating a mapper class.
Django ORM
The Django web framework includes a default object-relational mapping layer
(ORM) that can be used to interact with application data from various
relational databases such as SQLite, PostgreSQL and MySQL.
Django ORM resources
The Django ORM has evolved over the past dozen years since it was created
make sure to not only read up on the latest tutorials but also learn about
newer optimizations, such as prefetch_related and select_related, that have
been added throughout the project's history.
• Django models, encapsulation and data integrity is a detailed article by
Tom Christie on encapsulating Django models for data integrity.
• Django Debug Toolbar is a powerful Django ORM database query
inspection tool. Highly recommended during development to ensure
you're writing reasonable query code. Django Silk is another inspection
tool and has capabilities to do more than just SQL inspection.
146
Full Stack Python: 2020 Supporter's Edition
• Django QuerySet Examples (with SQL code included) teaches how
QuerySets work and shows the corresponding SQL code behind the
Python code you write to use the ORM.
• Making a specific Django app faster is a Django performance blog post
with some tips on measuring performance and optimizing based on the
measured results.
• Why I Hate the Django ORM is Alex Gaynor's overview of the bad
designs decisions, some of which he made, while building the Django
ORM.
• Going Beyond Django ORM with Postgres is specific to using
PostgreSQL with Django.
• How to view Django ORM SQL queries along with django-sql-explorer
allow you to better understand the SQL code that is generated from the
Django ORM.
• Migrating a Django app from MySQL to PostgreSQL is a quick look at
how to move from MySQL to PostgreSQL. However, my guess is that
any Django app that's been running for awhile on one relational
database will require a lot more work to port over to another backend
even with the power of the ORM.
• Adding basic search to your Django site shows how to write generic
queries that'll allow you to provide site search via the Django ORM
without relying on another tool like ElasticSearch. This is great for
small sites before you scale them up with a more robust search engine.
• The Django ORM Cookbook provides code recipes for various ways to
use the Django ORM to insert and query data.
• How to use Django's Proxy Models is a solid post on a Django ORM
concept that doesn't frequently get a lot of love or explanation.
• Tightening Django Admin Logins shows you how to log authentication
failures, create an IP addresses white list and combine fail2ban with
the authentication failures list.
• How to Turn Django Admin Into a Lightweight Dashboard and How to
Use Grouping Sets in Django are two great posts on how to add custom
147
Full Stack Python: 2020 Supporter's Edition
features to the Django Admin as well as optimize with more advanced
SQL when the first attempt at the queries get slow due to larger
amounts of data.
• Sorting querysets with NULLs in Django shows what to do if you're
struggling with the common issue of sorting columns that contain
NULL values.
• Best Practices working with Django models in Python has a ton of great
advice on proper model naming conventions, quirks to avoid with
ForeignKey field relationships, handling IDs and many other edge
cases that come up when frequently working with Django's ORM.
• Merging Django ORM with SQLAlchemy for Easier Data Analysis
provides rationale for using the SQLAlchemy ORM instead of Django's
default ORM in some situations.
• Working with huge data sets in Django explains how to slice the data
you retrieve by query into pages and then use prefetch_related on a
subset of the data rather than your whole data set.
• Solving performance problems in the Django ORM gives a slew of great
code snippets to use with django.db.connection so you can discover
issues such as unexpected extra queries and problematic key
relationships.
• Full-text search in Django with PostgreSQL is a very detailed example
that shows how to work specifically with a PostgreSQL backend.
• Django Anti-Patterns: Signals explains why you should avoid using
Django ORM's signals feature in your applications if you want to make
them easier to maintain.
• Django ORM optimization story on selecting the least possible goes
through one developer's Django ORM code refactoring to optimize the
performance and results of a single query.
• Fixing your Django async job - database integration is a great article on
how to properly integrate the RQ task queue with a Django backend.
148
Full Stack Python: 2020 Supporter's Edition
Django migrations resources
Django migrations were added in version 1.7. Django projects prior to 1.7 used
the South project, which is now deprecated and merged into Django.
Migrations can be tricky to wrap your head around as you're getting started
with the overall framework but the following resources should get you past
the initial hurdles.
• Django Migrations - a Primer takes you through the new migrations
system integrated in the Django core as of Django 1.7, looking
specifically at a solid workflow that you can use for creating and
applying migrations.
• Django 1.7: Database Migrations Done Right explains why South was
not directly integrated into Django, how migrations are built and shows
how backwards migrations work.
• Executing custom SQL in Django migrations examines how you can
hook in straight SQL that will run during a Django migration.
• Squashing and optimizing migrations in Django shows a simple
example with code for how to use the migrations integrated into
Django 1.7.
• Supporting both Django 1.7 and South explains the difficulty of
supporting Django 1.7 and maintaining South migrations for Django
1.6 then goes into how it can be done.
• Writing unit tests for Django migrations contains a ton of awesome
code examples for testing your migrations to ensure data migrations
work well throughout the lifecycle of your Django project.
• Strategies for reducing memory usage in Django migrations shows the
large memory usage problem that often occurs with Django migrations
at scale and what you can do to mitigate the issue.
• How to Create Django Data Migrations has a straightforward blog
ORM modeling example to show how to perform data migration.
• Keeping data integrity with Django migrations shows two table
modification scenarios, one where a column needs to be added to an
existing table, and another where a Many-to-Many field needs to be
149
Full Stack Python: 2020 Supporter's Edition
converted to a standard ForeignKey column while retaining all of the
data.
• Double-checked locking with Django ORM shows how you can
implement a double-checking locking pattern in the Django ORM with
PostgreSQL, which is useful when you want to prevent multiple
processes from accessing the same data at the same time.
• Using Django Check Constraints for the Sum of Percentage Fields
shows how you can combine several PositiveIntegerField model
fields with a checking constraint and a web form that ensures all of the
fields sum up to a precise amount, such as 100%.
Pony ORM
Pony (source code) is a Python object-relational mapper (ORM) library
(database module source code). Pony can be used to interact and manipulate
data in relational databases, including PostgreSQL, SQLite and MySQL.
Pony resources
• Why you should give Pony ORM a chance explains some of the benefits
of Pony ORM that make it worth trying out.
• An intro to Pony ORM shows the basics of how to use the library, such
as creating databases and manipulating data.
• The Pony ORM author explains on a Stack Overflow answer how Pony
ORM works behind the scenes. Worth a read whether or not you're
using the ORM just to find out how some of the magic coding works.
150
Full Stack Python: 2020 Supporter's Edition
NoSQL Data Stores
Relational databases store the vast majority of web application persistent
data. However, there are several alternative classifications of storage
representations.
1.
2.
3.
4.
Key-value pair
Document-oriented
Column-family table
Graph
These persistent data storage representations are commonly used to augment,
rather than completely replace, relational databases. The underlying
persistence type used by the NoSQL database often gives it different
performance characteristics than a relational database, with better results on
some types of read/writes and worse performance on others.
Key-value Pair
Key-value pair data stores are based on hash map data structures.
Key-value pair data stores
• Redis is an open source in-memory key-value pair data store. Redis is
often called "the Swiss Army Knife of web application development." It
can be used for caching, queuing, and storing session data for faster
access than a traditional relational database, among many other use
cases. Learn more on the Redis page.
• Memcached is another widely used in-memory key-value pair storage
system.
Key-value pair resources
• What is a key-value store database? is a Stack Overflow Q&A that
straight on answers this subject.
Redis resources
• How to Use Redis with Python 3 and redis-py on Ubuntu 16.04
contains detailed steps to install and start using Redis in Python.
151
Full Stack Python: 2020 Supporter's Edition
• "How To Install and Use Redis" is a guide for getting up with the
extremely useful in-memory data store.
• This video on Scaling Redis at Twitter is a detailed look behind the
scenes with a massive Redis deployment.
• Walrus is a higher-level Python wrapper for Redis with some caching,
querying and data structure components build into the library.
• Real World Redis Tips provides some guidance from Heroku's
engineers from deploying Redis at scale. The tips include setting an
explicit idle connection timeout, using a connection pooler and
avoiding using KEYS in favor of SCAN .
• Writing Redis in Python with Asyncio shows a detailed example for
how to use the new Asyncio standard library in Python 3.4+ for
working with Redis.
• How to collect Redis metrics shows how to use the Redis CLI client to
grab key metrics on latency.
• You should revise your Redis max connections setting is a retrospective
from a hard web application failure due to Redis connections maxing
out on Heroku, and how to avoid this in your own applications by
modifying your redis.conf settings.
Document-oriented
A document-oriented database provides a semi-structured representation for
nested data.
Document-oriented data stores
• MongoDB is an open source document-oriented data store with a
Binary Object Notation (BSON) storage format that is JSON-style and
familiar to web developers. PyMongo is a commonly used client for
interfacing with one or more MongoDB instances through Python code.
MongoEngine is a Python ORM specifically written for MongoDB that
is built on top of PyMongo.
• Riak is an open source distributed data store focused on availability,
fault tolerance and large scale deployments.
152
Full Stack Python: 2020 Supporter's Edition
• Apache CouchDB is also an open source project where the focus is on
embracing RESTful-style HTTP access for working with stored JSON
data.
Document-oriented data store resources
• The creator and maintainers of PyMongo review four decisions they
regret from building the widely-used Python MongoDB driver.
1. start_request
2. use_greenlets
3. copy_database
4. MongoReplicaSetClient
• The Python and MongoDB Talk Python to Me podcast has a great
interview with the maintainer of the Python driver for MongoDB.
• MongoDB queries don’t always return all matching documents! is a
walkthrough of discovering how MongoDB queries actually work, and
shows some potential pitfalls of relying on technologies where you do
not fully understand how they operate.
• Introduction to MongoDB and Python shows how to use Python to
interface with MongoDB via PyMongo and MongoEngine.
Column-family table
A column-family table class of NoSQL data stores builds on the key-value pair
type. Each key-value pair is considered a row in the store while the column
family is similar to a table in the relational database model.
Column-family table data stores
• Apache Cassandra
• Apache HBase
Graph
A graph database represents and stores data in three aspects: nodes, edges
and properties.
A node is an entity, such as a person or business.
153
Full Stack Python: 2020 Supporter's Edition
An edge is the relationship between two entities. For example, an edge could
represent that a node for a person entity is an employee of a business entity.
A property represents information about nodes. For example, an entity
representing a person could have a property of "female" or "male".
Graph data stores
• Neo4j is one of the most widely used graph databases and runs on the
Java Virtual Machine stack.
• Cayley is an open source graph data store written by Google primarily
written in Go.
• Titan is a distributed graph database built for multi-node clusters.
Graph data store resources
• Introduction to Graph Databases covers trends in NoSQL data stores
and compares graph databases to other data store types.
• Graph search algorithm basics explains the methods for searching
nodes for data in a graph database.
NoSQL third-party services
• Compose provides MongoDB as a service. It's easy to set up with either
a standard LAMP stack or on Heroku.
NoSQL data store resources
• NoSQL databases: an overview explains what NoSQL means, how data
is stored differently than in relational systems and what the
Consistency, Availability and Partition-Tolerance (CAP) Theorem
means.
• NoSQL Explained is a good high-level overview of considerations and
features when choosing a type of NoSQL database compared to a
relational database.
• CAP Theorem overview presents the basic constraints all databases
must trade off in operation.
154
Full Stack Python: 2020 Supporter's Edition
• This post on What is a NoSQL database? Learn By Writing One in
Python is a detailed article that breaks the mystique behind what some
forms of NoSQL databases are doing under the covers.
• The CAP Theorem series explains concepts related to NoSQL such as
what is ACID compared to CAP, CP versus CA and high availability in
large scale deployments.
• NoSQL Weekly is a free curated email newsletter that aggregates
articles, tutorials, and videos about non-relational data stores.
• NoSQL comparison is a large list of popular, BigTable-based, special
purpose, and other datastores with attributes and the best use cases for
each one.
• Relational databases such as MySQL and PostgreSQL have added
features in more recent versions that mimic some of the capabilities of
NoSQL data stores. For example, check out this blog post on storing
JSON data in PostgreSQL.
NoSQL data stores learning checklist
1. Understand why NoSQL data stores are better for some use cases than
relational databases. In general these benefits are only seen at large
scale so they may not be applicable to your web application.
2. Integrate Redis into your project for a speed boost over slower
persistent storage. Storing session data in memory is generally much
faster than saving that data in a traditional relational database that
uses persistent storage. Note that when memory is flushed the data
goes away so anything that needs to be persistent must still be backed
up to disk on a regular basis.
3. Evaluate other use cases such as storing transient logs in a documentoriented data store such as MongoDB.
Redis
Redis is an in-memory key-value pair database typically classified as a NoSQL
database. Redis is commonly used for caching, transient data storage and as a
holding area for data during analysis in Python applications.
155
Full Stack Python: 2020 Supporter's Edition
Redis tutorials
Redis is easy to install and start using compared to most other persistent
backends, but it's useful to follow a walkthrough if you have never previously
used Redis or any NoSQL data store.
• How to Use Redis with Python 3 and redis-py on Ubuntu 16.04
contains detailed steps to install and start using Redis in Python.
• How To Install and Use Redis is a straightforward starter guide that
includes installation instructions.
Redis with Python
Redis is easier to use with Python if you have a code library client that bridges
from your code to your Redis instace. The following libraries and resources
provide more information on handling data in a Redis instance with your
Python code.
• Redis-py is a solid Python client to use with Redis.
• Walrus is a higher-level Python wrapper for Redis with some caching,
querying and data structure components build into the library.
• Writing Redis in Python with Asyncio shows a detailed example for
how to use the new Asyncio standard library in Python 3.4+ for
working with Redis. There is also a EuroPython video of the talk that
goes along with the code.
• Cache_deco is a generic Python caching decorator library.
• Write your own miniature Redis with Python doesn't actually use Redis
but shows how you can write a simplified version of Redis' in-memory
156
Full Stack Python: 2020 Supporter's Edition
data store with Python. It's a good article to understand more about
how NoSQL data stores can work under the covers.
• Introduction to Redis streams with Python shows how to use the new
(as of Redis version 5.0) append-only streams functionality via Python
code.
Redis tools and examples
Redis' wide applicability can be a downside if you don't know what to start
using it for in your application. The following code and posts provide common
use cases for Redis.
• redis-labs-use-cases has a couple of examples of using Redis to analyze
geospatial data and tweets.
• redis-migrate-tool is a library to make it easier to move data between
redis clusters and groups.
• redis-rdb-tools parses the Redis' database storage files and can dump
the contents to JSON files.
Redis Security
Redis should be customized out of its default configuration to secure it against
unauthorized and unauthenticated users. These resources provide some
advice on Reids security and guarding against data breaches.
• Pentesting Redis servers shows that security is important not only on
your application but also the databases you're using as well.
• Redis, just as with any relational or NoSQL database, needs to be
secured based on security guidelines. There is also a post where the
main author of Redis cracks its security to show the tradeoffs purposely
made between ease of use and security in the default settings.
• For God’s sake, secure your Mongo/Redis/etc! digs into the
unfortunate default security settings that come with many NoSQL
databases which can be used to compromise your systems. Make sure
to not only install your dependencies such as Redis, but automate
modifying default settings to lock them down against attackers.
157
Full Stack Python: 2020 Supporter's Edition
Specific Redis topics
Once you have configured Redis, become comfortable using it and locked it
down against malicious actors, you will want to learn more about operating,
scaling and collecting metrics. The following resources should help you get
started in those areas.
• A Key Expired In Redis, You Won't Believe What Happened Next is a
well-written post on issues one team found with their configuration
when they were using Redis for a lot of data caching.
• Redis-playbook is an Ansible playbook for installing, configuring and
securing a Redis instance.
• Monitoring Redis shows common commands for accessing meta data
about your Redis databases, such as info and slowlog .
• GitHub wrote a retrospective on moving persistent data out of Redis
and into MySQL that is worth a read as you scale up your Redis usage.
• Learn Redis the hard way (in production) investigates problems found
with a development team's Redis infrastructure, how they went about
debugging them and ultimately fixing some of the issues, while being
aware of limitations that could cause them issues in the future.
• This video on Scaling Redis at Twitter is a detailed look behind the
scenes with a massive Redis deployment.
• Real World Redis Tips provides some guidance from Heroku's
engineers from deploying Redis at scale. The tips include setting an
explicit idle connection timeout, using a connection pooler and
avoiding using KEYS in favor of SCAN .
• How to collect Redis metrics shows how to use the Redis CLI client to
grab key metrics on latency.
• You should revise your Redis max connections setting is a retrospective
from a hard web application failure due to Redis connections maxing
out on Heroku, and how to avoid this in your own applications by
modifying your redis.conf settings.
• A Speed Guide To Redis Lua Scripting shows how to use the Lua
programming language to create extensions for Redis.
158
Full Stack Python: 2020 Supporter's Edition
• Better tests for Redis integrations with redislite shows how to mock out
a Redis instance using the redislite library and clean up existing hacks
you may be using to test your Redis usage.
• Our journey from Redis 2 to Redis 3 while not taking the site down
explains their infrastructure and uptime demands in the gambling
industry, and how they were able to roll their upgraded Redis versions.
MongoDB
MongoDB is a document-oriented NoSQL database that is often used for
storing, querying and analyzing persistence data in Python applications.
General MongoDB tutorials
It is worth taking some time to learn the ins and outs of MongoDB before
connecting it to your Python application. The following tutorials are not
specific to Python and will have you work directly with the MongoDB
command line and query language.
• Getting Started with MongoDB - Part 1 and Part 2 are programming
language agnostic tutorials that show how to interact via querying and
various operators such as $in , $lte and $gte .
• An Introduction to MongoDB examines common commands frequently
used to perform data operations.
• MongoDB In 30 Minutes goes over the basics for creating, querying,
updating and deleting data in MongoDB.
• MongoDB queries do not always return all matching documents! walks
through discovering that potential pitfalls on how MongoDB queries
159
Full Stack Python: 2020 Supporter's Edition
operate that were non-intuitive to developers who are new to using this
database.
• On MongoDB is not a tutorial but instead discusses the culture and
adoption patterns around how MongoDB became such a common
NoSQL database. This is a must read to understand both the strengths
and many weaknesses Mongo has despite what you may read in other
introductory tutorials.
• This 3-part series on monitoring MongoDB with WiredTiger MMAP
and Datadog explains how to install and configure agents and gather
metrics out of your MongoDB instances.
• How to Investigate MongoDB Query Performance shows how to work
with the MongoDB profiler, use the explain method and check
execution plans.
• How to Optimize Performance of MongoDB covers schema design,
replication lag, resource provisioning and query efficiency.
• Getting Started With Google Cloud Functions and MongoDB shows
how to connect Mongo with a Google Cloud Function to store
persistent data while running on a serverless platform.
MongoDB security
NoSQL databases can be a weak spot in a production deployment
environment, especially when default settings are built for ease of
development instead of proper access control. MongoDB is no exception with
its loose default security controls so make sure to lock down your instances.
• For God's sake, secure your Mongo/Redis/etc! explains the weak
default security settings provided by many NoSQL databases, including
MongoDB. Make sure to automate locking down your NoSQL
databases just as you would any other component in your stack.
• Before deploying a MongoDB instance to production, be sure to go
through each of the items on the official MongoDB security checklist.
• The definitive guide to MongoDB security is a high-level overview for
the multitude of tasks you must perform to lock down your MongoDB
160
Full Stack Python: 2020 Supporter's Edition
instances, such as appropriately using SSL certificates and accesscontrol lists.
• MongoDB Security Basics For Your Deployments in AWS is primarily a
guide on AWS security from the perspective of using installing and
using MongoDB on your own instance. The post covers authentication,
SSL and firewalls.
• Securing MongoDB using Let's Encrypt certificate gives a configuration
that encrypts that traffic coming from and going to your MongoDB
instances using free Let's Encrypt certificates.
• This 4 post securing MongoDB series covers Data Security
Requirements for Regulatory Compliance, Database Access Control,
Database Auditing and Encryption and Environmental Control &
Database Management.
• Lightweight Directory Access Protocol (LDAP) is common in many
established company environments for security. This post on How to
Configure LDAP Authentication for MongoDB goes over how to
authenticate users via LDAP who are using MongoDB.
Python with MongoDB resources
MongoDB is straightforward to use in a Python application when a driver
such as PyMongo is installed. The following tutorials show how to install,
configure and start using MongoDB with Python.
• Introduction to MongoDB and Python shows how to use Python to
interface with MongoDB via PyMongo and MongoEngine.
• How To Set Up Flask with MongoDB and Docker combines the Flask
web framework with MongoDB then containerizes the example
application with Docker.
• The PyMongo project creators wrote a retrospective focusing on four
decisions they would have done differently with the benefit of
hindsight:
1. start_request
2. use_greenlets
3. copy_database
161
Full Stack Python: 2020 Supporter's Edition
4. MongoReplicaSetClient
• Python and MongoDB on the Talk Python to Me podcast has a great
interview with the MongoDB Python driver maintainer.
• PyMongo Monday: Setting Up Your PyMongo Environment is an
introduction to using MongoDB with Python code. This first part of the
series shows how to set up the development environment required for
working with Mongo.
• Testing MongoDB Failover in Your Python App shows show to switch
to a MongoDB replica in production failure scenarios using the
PyMongo library.
Apache Cassandra
Apache Cassandra is a column-family NoSQL data store designed for writeheavy persistent storage in Python web applications and data projects.
Python with Cassandra resources
Cassandra is commonly used with Python for write-heavy application
demands. The following tutorials walk through several of the helper libraries
that can be used to interact with Cassandra, with and without web
frameworks such as Django.
• DataStax's Python Cassandra driver can be installed as an application
dependency to make it easier to access and work with Cassandra in
your Python applications.
• Async Python and Cassandra with Gevent explains how you
monkeypatch gevent into a Python 2.7 application and work with
Cassandra using gevent's coroutines. Note that this post could have
instead been written with asycnio if it were coded with Python 3.
162
Full Stack Python: 2020 Supporter's Edition
• How to Install and Use Cassandra on Django instructs how to use
Cassandra with Django 1.8 but it should still be relevant for newer
Django versions as well.
• Using Cassandra with Python and uWSGI gives some short example
code for connecting to a Cassandra cluster outside the HTTP requestresponse cycle to prevent timeouts and blocking issues with WSGI
servers.
• The Stack Overflow thread asking about the best Cassandra library/
driver for Python? has a good answer on why to use the datastax/
python-driver project due to its CQL support and active development.
• Cassandra performance in Python: Avoid namedtuple covers the
performance penalty of using the namedtuple type with the DataStax
Cassandra Python driver and how you can work around it.
How Companies Use Cassandra
These resources are written by engineering teams at organizations that have
large scale Cassandra deployments. The posts cover topics such as
monitoring, scaling and usage with billions of records.
• How Discord Stores Billions of Messages talks about the evolution of
Discord's very large scale message store system from a MongoDB
instance to Cassandra for storing messages in a distributed, replicated
cluster.
• Monitoring Cassandra at Scale explains how the Yelp engineering team
uses Cassandra to complement their MySQL and ElasticSearch
instances. The post does a nice job of enumerating the warning signs to
monitor and provides a short example of an issue with replication that
could be caught by their approach.
• How Uber Manages A Million Writes Per Second Using Mesos And
Cassandra Across Multiple Datacenters shows why Uber needs
accurate real-time data at large scale to make their driver and
passenger operations run properly. The post goes into the overall
architecture they use including cluster size, tolerable latency and other
libraries in their stack.
163
Full Stack Python: 2020 Supporter's Edition
General Cassandra resources
Apache Cassandra can be used independently of Python applications for data
storage and querying. The learning curve for getting started is similar to other
NoSQL data stores but scaling, performance and monitoring can be
challenging. The following resources focus on addressing those issues based
on teams that have felt the pain and often released their resulting tools as
open source projects.
• The official getting started documentation for Cassandra provides
installation, configuration, and basic querying information.
• How Not To Use Cassandra Like An RDBMS (and what will happen if
you do) gives examples in Cassandra's query language CQL of
operations that are typical with relational databases but go terribly
wrong with Cassandra, due to its NoSQL architecture that is optimized
for other types of operations.
• Cassandra Query Language (CQL) Tutorial explains the concepts and
syntax behind the data management language that is Cassandra's
equivalent to relational database SQL.
• Backup and Recovery for Apache Cassandra and Scale-Out Databases
covers issues encountered when trying to take snapshot backups of
Cassandra due to partitions and consistency lag time that occur with
just about every Cassandra setup.
• Getting the Most Out of Cassandra is a video for on data modeling and
application development for developers new to Cassandra.
• The Total Newbie’s Guide to Cassandra compares Cassandra to
traditional relational databases.
• On Cassandra Collections, Updates, and Tombstones and Undetectable
tombstones in Apache Cassandra present how developers often use
Cassandra collections incorrectly when they are not experienced with
how the data store operates.
• When to use Cassandra and when to steer clear explains the advantages
Cassandra provides such as high throughput on writes (versus reads)
and availability. The disadvantages are also given such as strong
consistency, typical relational database-style (ACID) transactions and
164
Full Stack Python: 2020 Supporter's Edition
reads without knowing the primary key of the record you want to
access. These are common database tradeoffs you need to understand
based on your workload and decide upon before you build out your
whole data architecture!
• Analyzing Cassandra Performance with Flame Graphs and Garbage
Collection Tuning for Apache Cassandra are two posts in a series on
how to debug issues in operational Cassandra deployments using
appropriate data visualization, especially when the issue is due to the
Java Virtual Machine (JVM)'s garbage collection methods.
Neo4j
Neo4j (source code) is a NoSQL graph database that can be used to persist
data in Python web applications and data projects. Neo4j has both a
commercial version and a community version of the database.
Comparing Neo4j with relational databases
• RDBMS & Graphs: Relational vs. Graph Data Modeling
• Building social network with Neo4j and Python explores varying
approaches for performing social data analysis in relational databases
and graph databases.
• The Newest RDBMS-to-Neo4j ETL Tool explains the differences
between traditional relational database models and the graph-based
structure Neo4j provides. The article also covers how to use an Extract,
Transform and Load (ETL) tool to move your data from one database
such as MySQL into Neo4j.
165
Full Stack Python: 2020 Supporter's Edition
Neo4j resources
• Building a Recommendation Engine with Neo4j and Python shows how
to use Neo4j's Cypher query language to retrieve and process data.
• Using Neo4j from Python is the official page with Python-based
database drivers.
• Getting started with Neo4j and Python is a short tutorial for installing
Neo4j and running your first query.
• impfuzzy for Neo4j is a Python script that uses Neo4j as a backend to
analyze malware.
• Neo4j runs an online monthly developer "meetup" and records the
talks. Here are a few that stand out:
◦ Natural Language Understanding with Python and Neo4j
◦ Analysing football transfers with Neo4j
◦ Efficient Graph Algorithms in Neo4j
◦ An introduction to Neo4j Bolt Drivers
• A Pythonic Tour of Neo4j and the Cypher Query Language is a PyData
conference talk that gives a Python view of Neo4j's query language.
• Analyzing the Paradise Papers with Neo4j: A Closer Look at Queries,
Data Models & More uses the Neo4j Cypher Query Language to
perform analysis on the leaked Paradise Papers data.
• How to Import the Bitcoin Blockchain into Neo4j shows how to use an
existing cryptocurrency data set within Neo4j to perform analysis on
the graph structure.
• Getting Started with Data Analysis using Neo4j is a programming
language agnostic tutorial that explains how to do analysis directly in
the Neo4j Cypher Query Language.
Data analysis
Data analysis involves a broad set of activities to clean, process and transform
a data collection to learn from it. Python is commonly used as a programming
language to perform data analysis because many tools, such as Jupyter
Notebook, pandas and Bokeh, are written in Python and can be quickly
applied rather than coding your own data analysis libraries from scratch.
166
Full Stack Python: 2020 Supporter's Edition
Data analysis resources
• The following series on data exploration uses Python as the
implementation language while walking through various stages of how
to analyze a data set.
◦ Part 1 gives insight into how you should think about data and
clarify what you are looking to learn.
◦ Part 2 explains categorization and transforming a data set into
one that is easier to analyze.
◦ Part 3 shows how to visualize the results of your data
exploration.
• PyData 101 presents the slides for one of the leading developers in the
Python ecosystem on how to orient yourself if you are new to data
science.
• The Python Data Science Handbook is available to read for free online,
although I also recommend buying the book as it is a great resource for
learning the topic.
• PyData TV contains all the videos from the PyData conference series.
The conference talks are often given by professional data scientists and
the developers who write these analysis libraries, so there is a wealth of
information not necessarily captured anywhere else.
• Python Plotting for Exploratory Data Analysis is a great tutorial on how
to use simple data visualizations to bootstrap your understanding of a
data set. The walkthrough covers histograms, time series analysis,
scatter plots and various forms of bar charts.
• This series entitled "Agile Analytics" has three parts that cover how to
work in a data science team and how to operate one if you are a
manager:
◦ Part 1: The Good Stuff
◦ Part 2: The Bad Stuff
◦ Part 3: The Adjustments
• Learning Seattle's Work Habits from Bicycle Counts provides a great
example of using open data, in this case from the city of Seattle,
messing with it using Python and pandas, then charting it using skikit-
167
Full Stack Python: 2020 Supporter's Edition
learn. You can do this type of analysis on almost any data set to find out
its patterns.
• Exploring the shapes of stories using Python and sentiment APIs is a
wonderful read with context for the problem being solved, plenty of
insight into how to reproduce the results with your own code and a
good number of charts that show how sentiment analysis can extract
information from blocks of text.
• How to automate creating high end virtual machines on AWS for data
science projects walks through setting up a development environment
on Amazon Web Services so that you can perform data analysis without
owning a high-end computer. Also check out the Introduction to AWS
for Data Scientists for another tutorial that shows you how to set up
additional commonly-used data science tools on AWS.
• Analyzing bugs.python.org uses extracted data from CPython
development to show the most-commented issues and issues by
version number throughout the project's history.
• Divergent and Convergent Phases of Data Analysis examines the flow
most people doing data science and analysis projects go through during
the exploration, synthesis, modeling and narration phases.
• Forget privacy: you're terrible at targeting anyway is a different type of
article. It is a strong piece of commentary rather than a tutorial on a
specific data analysis topic. The author argues that collecting data is
typically easy but doing the dirty analysis work often yields little in the
way of definitive, actionable insight. Overall it's a well-written thought
piece that will make you at least stop and ask yourself, "do we really
need to collect this user data?"
• Gender Distribution in North Korean Posters with Convolutional
Neural Networks is a fascinating post that uses convolutional neural
networks as a mechanism to identify gender by faces in North Korean
posters. The article's analysis on this messy data set and the results it
produces using some Python glue code with various open source
libraries is a great example of how data analysis can answer questions
that would be very time consuming for a person to figure out without a
computer.
168
Full Stack Python: 2020 Supporter's Edition
• Time Series Analysis in Python: An Introduction shows how to use the
open source Prophet library to perform time series analysis on a data
set.
• Python Data Wrangling Tutorial: Cryptocurrency Edition uses the
pandas library to clean up a messy cryptocurrency data set and shift the
data into a structure that is useful for analysis the author wantds to
perform.
• Handy Python Libraries for Formatting and Cleaning Data provides a
short overview of the libraries such as Arrow and Dora that make it
easier to wrangle your data before doing analysis.
• Analyzing one million robots.txt files explains what a robots.txt file
is, why it matters, how to download a bunch of them and then perform
some analysis with NumPy.
• Safely Analyzing Popular Licenses on GitHub Projects uses a Google
BigQuery Python helper library to work with a massive 3 terabyte data
set provided by GitHub.
• Cleaning and Preparing Data in Python shows how to uses pandas to
do the "boring" part of a data analysis job and convert dirty data into a
more consistent, structured format.
• 9 obscure Python libraries for data science presents several lesserknown but still very useful libraries for performing data analysis such
as fuzzywuzzy and gym.
• Nvidia's series on defining data analysis, machine learning and deep
learning are worth reading for the background and how they break
down the problem domains:
◦ What’s the Difference Between Artificial Intelligence, Machine
Learning, and Deep Learning?
◦ Deep Learning in a Nutshell: History and Training
◦ Deep Learning in a Nutshell: Core Concepts
pandas
The Python Data Analysis Library (pandas) is a data structures and analysis
library.
169
Full Stack Python: 2020 Supporter's Edition
pandas resources
• Intro to pandas data structures, working with pandas data frames and
Using pandas on the MovieLens dataset is a well-written three-part
introduction to pandas blog series that builds on itself as the reader
works from the first through the third post.
• pandas exercises is a GitHub repository with Jupyter Notebooks that
let you practice sorting, filtering, visualizing, grouping, merging and
more with pandas.
• A simple way to anonymize data with Python and Pandas is a good
tutorial on removing sensitive data from your unfiltered data sets.
• Learn a new pandas trick every day! is a running list of great pandas
tips that the author originally posted on Twitter and then aggregated
onto a single webpage.
• Time Series Analysis with Pandas show you how to combine Python
3.6, pandas, matplotlib and seaborn to analyze and visualize open data
from Germany's power grid. This is a great tutorial to learn these tools
with a realistic data set.
• Analyzing a photographer's flickr stream using pandas explains how
the author grabbed a bunch of Flickr data using the flickr-api library
then analyzed the EXIF data in the photos using pandas.
• Pandas Crosstab Explained shows how to use the crosstab function
in pandas so you can summarize and group data.
• Calculating streaks in pandas shows how to measure and report on
streaks in data, which is where several events happen in a row
consecutively.
170
Full Stack Python: 2020 Supporter's Edition
• How to Convert a Python Dictionary to a Pandas DataFrame is a
straightforward tutorial with example code for loading and adding data
stored in a typical Python dictionary into a DataFrame.
• This two-part series on loading data into a pandas DataFrame presents
what to do when CSV files do not match your expectations and how to
handle missing values so you can start performing your analysis rather
than getting frustrated with common issues at the beginning of your
workflow.
• Building a financial model with pandas explains how to create an
amortization schedule with corresponding table and charts that show
the pay off period broken down by interest and principal.
• tabula-py: Extract table from PDF into Python DataFrame presents
how to use the Python wrapper for the Tabula library that makes it
easier to extract table data from PDF files.
• Analyzing Browser History Using Python and Pandas shows how to
take data from Google Chrome and start to visualize it with pandas and
matplotlib.
• Time Series Forecast Case Study with Python: Monthly Armed
Robberies in Boston walks through the data wrangling, analysis and
visualization steps with a public data set of murders in Boston from
1966 to 1975. This particular data problem may not be your thing but
by going through the process you can learn a lot that can be applied to
any data set.
• A Gentle Visual Intro to Data Analysis in Python Using Pandas
presents spreadsheet-like pictures to show conceptually what pandas is
doing with your data as you apply various functions like groupby and
loc .
• Data Manipulation with Pandas: A Brief Tutorial uses some example
data sets to show how the most commonly-used functions in pandas
work.
• Analyzing Pronto CycleShare Data with Python and Pandas uses Seattle
bikeshare data as a source for wrangling, analysis and visualization.
171
Full Stack Python: 2020 Supporter's Edition
• Stylin' with pandas shows how to add colors and sparklines to your
output when using pandas for data visualization.
• Python and JSON: Working with large datasets using Pandas is a welldone detailed tutorial that shows how to mung and analyze JSON data.
• Fun with NFL Stats, Bokeh, and Pandas uses National (American)
Football League data as a source for wrangling and visualization.
• Analyzing my Spotify Music Library With Jupyter And a Bit of Pandas
shows how to grab all of your user data from the Spotify API then
analyze it using pandas in Jupyter Notebook.
• Scalable Python Code with Pandas UDFs explains that pandas
operations can often be parallelized for better performance using the
Pandas UDFs feature in PySpark version 2.3 or greater.
• How to use Pandas read_html to Scrape Data from HTML Tables has a
bunch of great code examples that show how to load data from HTML
directly into your DataFrames.
• How to download fundamentals data with Python shows how to obtain
and use financial data, such as balance sheets, stock prices, and various
ratios to perform your own analysis on.
• How to convert JSON to Excel with Python and pandas provides
instructions for creating a spreadsheet out of JSON file.
SciPy and NumPy
SciPy is a collection of open source code libraries for math, science and
engineering. NumPy, Matplotlib and pandas are libraries that fall under the
SciPy project umbrella.
172
Full Stack Python: 2020 Supporter's Edition
NumPy (source code) is a Python code library that adds scientific computing
capabilities such as N-dimensional array objects, FORTRAN and C++ code
integration, linear algebra and Fourier transformations. NumPy serves as a
required dependency for many other scientific computing packages such as
pandas.
Blaze is a similar, but separate, ecosystem with additional tools for wrangling,
cleaning, processing and analyzing data.
SciPy resources
Take a look at the pages on Matplotlib and pandas for tutorials specific to
those projects. The following resources are broader walkthroughs for the
SciPy ecosystem:
• SciPy Lecture notes goes into the overall Python scientific computing
ecosystem and how to use it.
173
Full Stack Python: 2020 Supporter's Edition
• The SciPy Cookbook contains instructions for various SciPy packages
that were previously hosted on the SciPy wiki.
• Robots and Generative Art and Python, oh my! uses Scipy, Numpy, and
Matplotlib to generate some nice looking art that can even be written to
paper using a plotter. This is a very cool example project that ties
together the scientific world and the art world.
• Lectures in Quantitative Economics: SciPy provides a good overview of
SciPy compared to the specific NumPy project, as well as explanations
for the wrappers SciPy provides over lower-level FORTRAN libraries.
• A plea for stability in the SciPy ecosystem presents concerns from one
scientist's perspective about how fast the Python programming
ecosystem changes and that code can become backwards incompatible
in only a few years. The issue is that many science projects last decades
and therefore cannot follow the rate of change as easily as typical
software development projects.
NumPy resources
• From Python to NumPy is an awesome resource that shows how to use
your basic Python knowledge to learn how to do vectorization with
NumPy.
• The ultimate beginner's guide to NumPy explains how to install and
import NumPy, then digs into using arrays for computation and how to
perform operations that get the results you need for your data analysis.
• Math to Code provides an interactive tutorial to learn how to
implement math in NumPy.
• 101 NumPy Exercises for Data Analysis
• NumPy: creating and manipulating numerical data
• Python NumPy Array Tutorial is a starter tutorial specifically focused
on using and working with NumPy's powerful arrays.
• Beyond Numpy Arrays in Python is a predecessor to a Numpy
Enhancement Proposal that recommends how to prepare the scientific
computing ecosystme for GPU, distributed and sparse arrays.
174
Full Stack Python: 2020 Supporter's Edition
• Probability distribution explorer contains graphs for understanding
how different probabilities look when plotted. There is also code for
implementing the visuals in NumPy and SciPy.
Example NumPy code
• SmoothLife is an implementation of Conway's Game of Life using
NumPy. The project uses a continuous space rather than the traditional
discrete board.
• Advanced Numpy Techniques is a Jupyter Notebook with code on
beyond-the-basics NumPy features.
Data Visualization
Data visualizations transform raw numbers into graphic formats that make it
easier for humans to see patterns, trends and other useful information.
Python data visualization tools
• Bokeh
• HoloViews
• Matplotlib
• Chartify (source code)
• Graphviz
Python-specific data viz resources
• Python Data Visualization 2018: Why So Many Libraries? is an indepth article on the Python data visualization tools landscape. A mustread whether you are new to the space or have been using one or more
of these libraries for awhile.
• The Python Graph Gallery has a slew of visualizations created with
Python and includes the code used to produced each one.
• Python & OpenGL for Scientific Visualization is a free book that shows
how to combine open source tools such as PyOpenGL with Python data
analysis libraries to generate interactive scientific data visualizations.
175
Full Stack Python: 2020 Supporter's Edition
• 10 Useful Python Data Visualization Libraries for Any Discipline is a
straightforward overview of Python packages that create Python
visualizations.
• Introduction to Data Visualization with Altair is a starter post for the
wonderful Altair visualization tool written in Python.
• The Next Level of Data Visualization in Python uses the Plotly graphing
library to draw more complex visualizations.
• An introduction to Altair provides another wonderful tutorial on this
data visualization tool.
• A Dramatic Tour through Python’s Data Visualization Landscape
provides examples with the ggplot and Altair libraries. The questionand-answer format for what you can do with the data is a really good
model that keeps your attention throughout the post.
• How to Generate FiveThirtyEight Graphs in Python gives a great
tutorial on generating a specific style graph with pandas and Matplotlib
that is similar to FiveThirtyEight's plots.
• Intro to pdvega - Plotting for Pandas using Vega-Lite shows how to
generate plots from your pandas-structured data using pdvega.
• Sorting Algorithms Visualized in Python uses Python, numpy and
scikit-image to animate how sorting algorithms work.
• How to Build a Reporting Dashboard using Dash and Plotly explains
how to use the Dash library to take a bunch of data and turn it into a
nice-looking dashboard.
• The reason I am using Altair for most of my visualization in Python
explains why this wrapper for Vega-lite is awesome and that the author
uses Altair because Matplotlib can be very complicated for whipping up
quick visualizations.
Beautiful example visualizations
Sometimes you need inspiration from other sources to figure out what you
want to build. The following links have made me excited about data
visualization and gave me ideas for what to build.
176
Full Stack Python: 2020 Supporter's Edition
• Roads to Rome is a beautiful visualization showing the data behind the
expression "all roads lead to Rome" and whether or not there is a
"Rome" central city in every country.
• Monarchs is a wonderful 1,000 year history visual of European rulers.
The developer also wrote an in-depth article on how Monarchs was
created using d3.js.
• Star Wars: The Force Accounted is Bloomberg's way of breaking down
on-screen action between light and dark sides, the main characters,
various bits about the Force and other data extracted from the movies.
• Big League Graphs presents a bunch of creative ways to view data for
sports such as basketball, baseball and hockey.
• What do numbers look like? is a Python 3 dimensional visualization of
millions of integers, colored by special factors such as prime and
Fibonacci numbers.
• Bay Area Housing Marketing Analysis: Part 1 and Part 2 are a
combination of inspiration and tutorial. These posts contain a ton of
data analysis and graphing and show numerous ways to slice and
present information.
• How We Animated Trillions of Tons of Flowing Ice breaks down the
process that the NY Times data team used to create the beautiful
Antarctic Dispatches articles that show how glaciers and ice are
moving.
• Who knew good old histograms could be so fascinating? Check out this
post titled What's so hard about histograms? and scroll through to
learn a ton about the details you can think about when creating these
types of data visuals.
• Optimized Brewery Road Trip, With Genetic Algorithm shows how a
heuristic solving approach such as a genetic algorithm can be used to
handle a version of the Traveling Salesman Problem (TSP), but with
the more fun Top 100 American brewery locations.
177
Full Stack Python: 2020 Supporter's Edition
Data visualization resources
• Guides for visualizing data provides the thought processes that you can
use when you are showing uncertainty, incompletet data, differences,
outliers, and other common scenarios that occur when your data is
messy - and it usually is!
• Data visualization, from 1987 to today is a wonderful reference about
the pre-computer age era of visualization which was a combination of
cartography, art and statistics rather than any cohesive field as it is
often seen today. The images showing how people worked with paper
to build their visuals add fantastic context to the story.
• Xenographics presents uncommon and unusual visualization formats
such as the Manhattan Plot and Time Curve.
• Engineering Intelligence Through Data Visualization at Uber explains
how Uber's data visualization team grew from 1 person to 15 and the
output they created along the way, including the open source tool
react-vis.
• The Practitioner's Guide to System Dashboard Design series covers a
lot of ground for what you should consider when building one form of
visualization, the data dashboard:
◦ Part 1: Structure and Layout
◦ Part 2: Presentation and Accessibility
◦ Part 3: What Charts to Use
◦ Part 4: Context Improvement
• Truncating the Y-Axis: Threat or Menace? is a great in-indepth
explanation of why something that seems as straightforward as laying
out the Y-Axis requires thought and care otherwise the visualization
could end up being misleading.
Bokeh
Bokeh is a data visualization library that allows a developer to code in Python
and output JavaScript charts and visuals in web browsers.
178
Full Stack Python: 2020 Supporter's Edition
Why is Bokeh a useful library?
Web browsers are ideal clients for consuming interactive visualizations.
However, libraries such as d3.js can be difficult to learn and time consuming
to connect to your Python backend web app. Bokeh instead generates the
JavaScript for your application while you write all your code in Python. The
removal of context switching between the two programming languages can
make it easier and faster to create charts and visualizations.
What do Bokeh visualizations look like?
Bokeh can create any type of custom graph or visualization. For example, here
is a screenshot of a bar chart created with the figure plot:
For more references, including interactive live demonstrations, check out
these sites:
• The official Bokeh gallery has many example Bokeh visual formats.
• Bokeh Applications hosts numerous data visualizations built with
Bokeh.
179
Full Stack Python: 2020 Supporter's Edition
Bokeh resources
Bokeh is under heavy development ahead of the upcoming 1.0 release. Note
that while all of the following tutorials are useful, it is possible some of the
basic syntax will change as the library's API is not yet stable.
• Integrating Bokeh Visualisations Into Django Projects does a nice job
of walking through how to use Bokeh to render visualizations in Django
projects.
• Responsive Bar Charts with Bokeh, Flask and Python 3 is my
recommended tutorial for those new to Bokeh who want to try out the
library and get an example project running quickly with Flask.
• Fun with NFL Stats, Bokeh, and Pandas takes an NFL play-by-play
data set, shows how to wrangle the data into an appropriate format
then explains the code that uses Bokeh to visualize it.
• Data is beautiful: Visualizing Roman imperial dynasties provides a
walkthrough for creating a gorgeous visualization based on historical
Roman data. The post is about more than just the visual, it also goes
into the ideation, data wrangling and analysis phases that came before
using Bokeh to show the results.
• Visualizing with Bokeh gives a detailed explanation with the code for
number Bokeh visuals you can output while working with a pandas
data set.
• Interactive Data Visualization in Python With Bokeh is a great
beginners tutorial that shows you how to structure your data, draw
your first figures and add interactivity to the visualizations.
• Creating Bar Chart Visuals with Bokeh, Bottle and Python 3 is a tutorial
that combines the Bottle web framework
• Building Bullet Graphs and Waterfall Charts with Bokeh covers
buildings two types of useful visualizations into your applications using
Bokeh.
• Interactive Visualization of Australian Wine Ratings builds a nontrivial visualization with a nice sample set of data based on wine
ratings.
180
Full Stack Python: 2020 Supporter's Edition
• Visualization with Bokeh
• Drawing a Brain with Bokeh is a fun example of a chord diagram that
represents neural connections in the brain.
• Bryan Van de Ven on Bokeh is a podcast episode by one of the main
Bokeh maintainers.
• The Python Visualization Landscape by Jake VanderPlas at PyCon 2017
covers many Python data visualization tools, including Bokeh.
• This flask-bokeh-example project has the code to create a simple chart
with Bokeh and Flask.
• Bokeh vs Dash — Which is the Best Dashboard Framework for Python?
contains a single project that was written in both Dash and Bokeh. The
author gives his subjective view on the implementation difficulty
although the web application only contained a single type of data
visualization so it is hard to drawn any real conclusions from his
opinion.
• Realtime Flight Tracking with Pandas and Bokeh provides a great
example of combining pandas for structuring data with Bokeh for
visualization.
• How to Create an Interactive Geographic Map Using Python and Bokeh
shows how to use a GeoJSONDataSource as input for Bokeh and draw
a map with the data.
d3.js
Data-Driven Documents (d3.js) is a JavaScript visualization library used to
create interactive visuals for web browsers.
d3.js tutorials
d3.js has a steep learning curve so it is a good idea to read several tutorials
before diving in and trying to create your own visualization from scratch.
181
Full Stack Python: 2020 Supporter's Edition
• The Hitchhiker’s Guide to d3.js is a wonderfully-written resource that
explains the context for how d3.js works and how all the pieces can be
used to create your desired visualizations.
• d3.js first steps contains the code and markup for building your first
d3.js visual.
• Let's Make a D3 Plugin shows you how to create your own reusable
plugin that you can use across multiple visualizations as a separate
JavaScript library. d3.js version 4 is now widely used so don't worry
about the note at the start of the article.
• Reusable and extendable d3 charts is a natural extension of the d3
plugin post. It shows how to reuse visualization code between multiple
visuals.
• Visualizing Movement Data - Part I provides a detailed example of how
to draw a complex visualization.
• This Fantasy Map Generator is such a cool example of what d3.js can
procedurally generate based on a set of inputs.
• Building dashboards with Django and D3
• Argyle in d3? Oh yes, the library can do that, and here is the code to
prove it.
• How and why to use D3 with React is an awesome overview of the D3.js
plugin ecosystem and how to use the tool with a React-based
JavaScript front end.
• D3 is not a data visualization library breaks down the parts to D3 and
why it's not directly comparable to a typical charting library.
Charts with d3.js
• Responsive D3js Charts shows how to take a static line chart and make
it responsive when the browser size changes.
• Resize to Scale with d3.js gives code for a render function that adjusts
the size of the viewing window based on the parent element for the
visualization.
182
Full Stack Python: 2020 Supporter's Edition
• Responsive Data Visualization provides another approach for making
responsive D3.js charts.
• Make great-looking d3.js charts in Python without coding a line of
JavaScript combines a Python backend with the python-nvd3 library to
generate d3.js charts without having to hand-write the JavaScript code.
If you are interested in a solution like this for your own visualizations
then you should also check out Bokeh.
• How to make a modern dashboard with NVD3.js uses the NVD3.js
library that works as an abstraction on top of d3.js to create charts. The
post puts together several charts to show how to build a dashboard
based on public JSON data.
• d3-regression is a module for calculating statistical regressions from
two-dimensionala data.
D3 ecosystem
• The trouble with D3 is not a tutorial but it's an important read because
it discusses why D3 can be very difficult to learn: the learning curve
depends on your background. If you are a front-end developer you will
likely have the easiest time if you already understand JavaScript, SVG
and the browser Document Object Model (DOM). Non-technical
designers and analysts typically have the hardest time with using D3
because the not only have to learn the tool itself but all the concepts
and web browser technologies that it is built upon.
• This visualization of d3.js modules and resources is a wonderful map
for understanding how various parts of the library and the ecosystem
fit together. The visual provides a useful map for where to concentrate
your learning depending on what type of visualization you are working
to build.
• D3.js in Action, Second Edition is partially an announcement for the
authors book but also contains good context for who uses D3 and why
its usage continues to grow.
183
Full Stack Python: 2020 Supporter's Edition
Matplotlib
Matplotlib is a data visualization plotting library where a developer can code
visuals in Python and output them as part of Jupyter Notebooks, web
applications and graphical user interface (GUI) toolkits.
Matplotlib resources
• Effectively using Matplotlib is an awesome getting started tutorial that
breaks through the confusing beginner steps so you can quick start
using the plotting library.
• Web Scraping With Python: Scrapy, SQL, Matplotlib To Gain Web Data
Insights is a long and comprehensive tutorial that walks through
obtaining, cleaning and visualizing data.
• Matplotlib Cheat Sheet: Plotting in Python contains some handy
snippets of code to perform common plotting operations in Matplotlib.
• 5 Quick and Easy Data Visualizations in Python shows several code
examples with explanations for performing exploratory data analysis
using Matplotlib.
• Introduction to Matplotlib — Data Visualization in Python explains how
to install and start using Matplotlib. The post has a ton of detail on
customizing your plots and graphs after creating the initial visuals.
• Visualize World Trends using Seaborn in Python shows world life
expectancy in plots generated by Matplotlib and Seaborn.
• Pandas & Seaborn - A guide to handle & visualize data in Python builds
a visualization by starting with pandas for data wrangling then outputs
charts with Matplotlib and Seaborn.
• Matplotlib Tips and Demos is a long Jupyter Notebook with a ton of
example code that shows how to use Matplotlib in many ways.
184
Full Stack Python: 2020 Supporter's Edition
• Animation with Matplotlib explains the animation base class and the
main interfaces for creating animations in your visualizations.
• Matplotlib: Creating Plots is a video tutorial series where each 15-ish
minute episode covers one important topic such as shading areas on
line plots, drawing pie charts or plotting a stream of updating data.
Markup Languages
Markup languages provide pre-defined, parsable syntax within text
documents that is used for annotations. For example, within a Markdown
document, the syntax
[this is a link to Full Stack Python](https://www.fullstackpython.com)
indicates the text "this is a link to Full Stack Python" should be annotated
with a link to "https://www.fullstackpython.com/" when run through a
Markdown parser and then transformed into HTML output.
Markup language resources
• Reach for Markdown, not LaTeX argues for the simplicity of Markdown
versus the steep learning curve and less easily adopted LaTeX for
creating documents.
• Yet Another Markup LOL? explains the virtues and the significant
downsides of tooling for the YAML markup language. Mistakes in
configuration files that use YAML or any markup language often fly
past testing and continuous integration services that catch errors in
regular code. The post also introduces a couple of tools that can help
with specific YAML issues, especially when using Kubernetes.
Markdown
Markdown is a common markup language frequently used by developers to
write Python project documention.
185
Full Stack Python: 2020 Supporter's Edition
Markdown's origin
Markdown was originally developed by John Gruber in 2004. The markup
language's lightweight design helped it gain rapid adoption by software
developers and designers. The format's simplicity also makes it easier to write
parsers to convert the structured syntax into other formats such as HTML and
JSON.
Markdown resources
Markdown does not have an extensive set of strict rules like some other text
formats so you should be able to read up on the basics with these articles then
write a few practice documents to be comfortable with it. The following
resources are really helpful when you are getting started or need a quick
reference on a less commonly-used feature such as tables or block quotes.
• Say yes to Markdown, no to MS Word provides a really awesome
overview of why Markdown is a more usable file format than Microsoft
Word and similar proprietary file types. The article also has a good list
of useful Markdown-related tools such as a Markdown-to-PDF
converter (a NodeJS package but easy enough to use with a basic
development environment).
186
Full Stack Python: 2020 Supporter's Edition
• Markdown syntax is the defacto standard and wonderful reading for
both initial learning and random reference.
• Markdown cheatsheet is a quick reference that is a shortened version of
the above Markdown syntax page.
• Markdown parsers in Python reviews many of the most common
Python Markdown parser implementations to give insight into the
advantages and disadvantages of each one.
• reStructuredText vs Markdown for documentation brings up some
really good points about the downsides to Markdown's simplicity. First,
a lot of documentation needs more complex output that is not possible
with vanilla Markdown so you need to drop into plain old HTML,
which defeats the purpose of using a markup language. Second, some
of the syntax around inserting blank lines by adding spaces at the end
of lines is confusing if someone is using a text editor or development
environment that is not configured to show blank spaces. Worse yet, if
your editor is set to remove blank spaces at the end of lines, which is
fairly common among developers, then you can mistakenly break the
formatting intended by the original author. Overall this is a good piece
to read for a balanced view of Markdown and the reasons it provides
are one reason why I use both Markdown and reStructuredText
depending on the project.
• The Python Package Index (PyPI) supports Markdown as of 2018
although there are still some tweaks being made to the flavors that can
be used such as GitHub-flavored Markdown.
• PowerShell and Markdown shows how to work with Markdown in
PowerShell including customizing colors and listing some quirks you
may need to get around.
• reStructuredText vs. Markdown for technical documentation compares
Markdown and reStructuredText specifically for documenting software
and explains where each one has advantages.
• Reach for Markdown, not LaTeX examines the virtues of using straight
Markdown along with tools such as pandoc to convert from one file
format to another, including how to use Markdown for presentations
and not just regular documentation.
187
Full Stack Python: 2020 Supporter's Edition
• Markdown page is a JavaScript file that makes it easy to render plain
old Markdown as a webpage.
reStructuredText
reStructuredText, sometimes abbreviated as "RST" or "reST", is a markup
language implementation that is often used to document Python projects.
reStructuredText resources
• A brief tutorial on parsing reStructuredText (reST)
• reStructuredText vs. Markdown for technical documentation
• RestructuredText (reST) and Sphinx CheatSheet
• reStructuredText cheat sheet
188
Full Stack Python: 2020 Supporter's Edition
Web Development
Web development is the umbrella term for conceptualizing, creating,
deploying and operating web applications and application programming
interfaces for the Web.
Why is web development important?
The Web has grown a mindboggling amount in the number of sites, users and
implementation capabilities since the first website went live in 1989. Web
development is the concept that encompasses all the activities involved with
websites and web applications.
How does Python fit into web development?
Python can be used to build server-side web applications. While a web
framework is not required to build web apps, it's rare that developers would
not use existing open source libraries to speed up their progress in getting
their application working.
Python is not used in a web browser. The language executed in browsers such
as Chrome, Firefox and Internet Explorer is JavaScript. Projects such as pyjs
can compile from Python to JavaScript. However, most Python developers
write their web applications using a combination of Python and JavaScript.
Python is executed on the server side while JavaScript is downloaded to the
client and run by the web browser.
Web development resources
To become an experienced web developer you need to know the foundation
principles that the web is built with, such as HTTP requests and responses,
client (typically web browsers) and server (web servers such as Nginx and
Apache architectures, HTML, CSS and JavaScript, among many other topics.
The following resources provide a range of perspectives and when combined
together should get you oriented in the web development world.
• How the Internet works is a must-read to get a quick overview of all the
pieces that go into a network connection from one machine to another.
The example explains how an email is sent and the story is just as
190
Full Stack Python: 2020 Supporter's Edition
useful for learning about other connections such as downloading a
webpage.
• If you want to be a web developer it's important to know the
foundational tools used to build websites and web applications. It is
also important to understand that the core concepts such as HTTP,
URLs and HTML were all there at the beginning and then were
expanded with new specifications over time. This article on the History
of the Web succinctly explains the origins of the web starting from Tim
Berners-Lee's origin vision and release at CERN.
• Web Architecture 101 is a great high-level overview of the technologies
that run the modern web, such as DNS, load balancers, web application
servers (for Python that equates to WSGI servers), data bases, task
queues, caching and several other critical concepts.
• The Evolution of the Web visualizes how web browsers and related
technologies have changed over time as well as the overall growth of
the Internet in the amount of data transferred. Note that the
visualization unfortunately stops around the beginning of 2013 but it's
a good way to explore what happened in the first 24 years.
• What happens when? is an incredibly detailed answer to the questions
"What happens when you type google.com into your browser's address
box and press enter?" that seems straightforward on the surface until
you really dig in.
• How browsers work provides an overview with solid detail on how
browsers take the HTML, CSS, JavaScript, images and other files as
input and render webpages as output. It is well worth your time to
know this stuff as a web developer.
• The history of the URL explains how the growth of ARPANET to
hundreds of nodes eventually led to the creation of the URL. This is a
great read that provides historical context for why things are the way
they are with the web.
• Web app checklist presents good practices that developers building and
deploying web applications should follow. Don't worry about having
every single one of these recommendations implemented before getting
your site live, but it is worthwhile to review the list to make sure there
191
Full Stack Python: 2020 Supporter's Edition
is not something obvious you can handle in a few minutes that will
improve your site's security, performance or usability.
• Web application development is different and better provides some
context for how web development has evolved from writing static
HTML files into the complex JavaScript client-side applications
produced today.
• The Browser Hacker's Guide to Instantly Loading Everything is a
spectacular technical talk given by Addy Osmani at JSConf EU 2017
that has great bits of developer knowledge for both beginner and
experienced web developers alike.
• Build a web application from scratch and its follow on posts for request
handling middleware explores the fundamentals of web development.
Learning these foundational concepts is critical for a web developer
even though you should still plan to use an established web framework
such as Django or Flask to build real-world applications. The open
source code for these posts is available on GitHub.
• While not Python-specific, Mozilla put together a Learning the Web
tutorial for beginners and intermediate web users who want to build
websites. It's worth a look for general web development learning.
• Web development involves HTTP communication between the server,
hosting a website or web application, and the client, a web browser.
Knowing how web browsers works is important as a developer, so take
a look at this article on what's in a web browser.
• Ping at the speed of light dives into the computer networking weeds
with how fast packets travel through the internet plumbing. The author
created a Python script that scrapes network speeds from disparate
locations to see what the network speed is in fiber optic cables as a
percentage of the speed of light.
• The critical path: optimizing load times with the Chrome DevTools
provides a well-written explanation about using Chrome's developer
features to improve the performance of your websites and web
applications.
192
Full Stack Python: 2020 Supporter's Edition
• Three takeaways for web developers after two weeks of painfully slow
Internet is a must-read for every web developer. Not everyone has fast
Internet service, whether because they are in a remote part of the world
or they're just in a subway tunnel. Optimizing sites so they work in
those situations is important for keeping your users happy.
• The History of the URL: Path, Fragment, Query, and Auth gives a
comprenhensive historical perspective on the fundamental way to link
to resources on the web. This post should be required reading for web
developers.
• Quantum Up Close: What is a browser engine? explains how a browser
takes in HTML, JavaScript, CSS, images and any other data and files to
produce a webpage as output.
• How to understand performance tests is an important topic because
many websites are slow and bloated. Learning about improving the
performance of your site is one of the best ways to become a better web
developer. Another great article on website performance is The average
web page is 3MB. How much should we care?. The visuals alone tell a
compelling story about how large webpage sizes have grown in recent
years.
Web Frameworks
A web framework is a code library that makes web development faster and
easier by providing common patterns for building reliable, scalable and
maintainable web applications. After the early 2000s, professional web
development projects always use an existing web framework except in very
unusual situations.
193
Full Stack Python: 2020 Supporter's Edition
Why are web frameworks useful?
Web frameworks encapsulate what developers have learned over the past
twenty years while programming sites and applications for the web.
Frameworks make it easier to reuse code for common HTTP operations and to
structure projects so other developers with knowledge of the framework can
quickly build and maintain the application.
Common web framework functionality
Frameworks provide functionality in their code or through extensions to
perform common operations required to run web applications. These
common operations include:
1. URL routing
2. Input form handling and validation
3. HTML, XML, JSON, and other output formats with a templating
engine
4. Database connection configuration and persistent data manipulation
through an object-relational mapper (ORM)
5. Web security against Cross-site request forgery (CSRF), SQL Injection,
Cross-site Scripting (XSS) and other common malicious attacks
6. Session storage and retrieval
Not all web frameworks include code for all of the above functionality.
Frameworks fall on the spectrum from executing a single use case to
providing every known web framework feature to every developer. Some
frameworks take the "batteries-included" approach where everything possible
comes bundled with the framework while others have a minimal core package
that is amenable to extensions provided by other packages.
For example, the Django web application framework includes the Django
ORM layer that allows a developer to write relational database read, write,
query, and delete operations in Python code rather than SQL. However,
Django's ORM cannot work without significant modification on nonrelational (NoSQL) databases such as MongoDB or Cassandra.
Some other web frameworks such as Flask and Pyramid are easier to use with
non-relational databases by incorporating external Python libraries. There is a
spectrum between minimal functionality with easy extensibility on one end
194
Full Stack Python: 2020 Supporter's Edition
and including everything in the framework with tight integration on the other
end.
Do I have to use a web framework?
Whether or not you use a web framework in your project depends on your
experience with web development and what you're trying to accomplish. If
you are a beginner programmer and just want to work on a web application as
a learning project then a framework can help you understand the concepts
listed above, such as URL routing, data manipulation and authentication that
are common to the majority of web applications.
On the other hand if you're an experienced programmer with significant web
development experience you may feel like the existing frameworks do not
match your project's requirements. In that case, you can mix and match open
source libraries such as Werkzeug for WSGI plumbing with your own code to
create your own framework. There's still plenty of room in the Python
ecosystem for new frameworks to satisfy the needs of web developers that are
unmet by Django, Flask, Pyramid, Bottle and many others.
In short, whether or not you need to use a web framework to build a web
application depends on your experience and what you're trying to accomplish.
Using a web framework to build a web application certainly isn't required, but
it'll make most developers' lives easier in many cases.
Comparing web frameworks
Are you curious about how the code in a Django project is structured
compared with Flask? Check out this Django web application tutorial and
then view the same application built with Flask.
Talk Python to Me had a podcast episode with a detailed comparison of the
Django, Flask, Tornado and Pyramid frameworks.
There is also a repository called compare-python-web-frameworks where the
same web application is being coded with varying Python web frameworks,
templating engines and object-relational mappers.
195
Full Stack Python: 2020 Supporter's Edition
Web framework resources
• Building Your Own Python Web Framework is an awesome way to
learn how the WSGI works and the many other pieces that combine to
make web frameworks useful to web developers.
• When you are learning how to use one or more web frameworks it's
helpful to have an idea of what the code under the covers is doing. This
post on building a simple Python framework from scratch shows how
HTTP connections, routing, and requests can work in just 320 lines of
code. This post is awesome even though the resulting framework is a
simplification of what frameworks such as Django, Flask and Pyramid
allow developers to accomplish.
• There is also another, more recent multi-part tutorial about building
your own web framework in Python. This series is based on the alcazar
project the author is coding for learning purposes:
◦ Part 1: Handling requests
◦ Part 2: Routes, Class-Based Handlers and Unit Testing
◦ Part 3: Test Client and Templating Support
◦ Part 4: Exception Handling, Static Files and Middleware
• Check out the answer to the "What is a web framework and how does it
compare to LAMP?" question on Stack Overflow.
• Another great series that digs behind the web framework magic is
"Web Application from Scratch". The four parts are:
◦ Part 1: handling HTTP requests and responses
◦ Part 2: abstracting Requests, Responses and Servers
◦ Part 3: request handlers and middleware
◦ Part 4: abstracting applications
• Frameworks is a really well done short video that explains how to
choose between web frameworks. The author has some particular
opinions about what should be in a framework. For the most part I
agree although I've found sessions and database ORMs to be a helpful
part of a framework when done well.
• "What is a web framework?" is an in-depth explanation of what web
frameworks are and their relation to web servers.
196
Full Stack Python: 2020 Supporter's Edition
• Django vs Flask vs Pyramid: Choosing a Python Web Framework
contains background information and code comparisons for similar
web applications built in these three big Python frameworks.
• This fascinating blog post takes a look at the code complexity of several
Python web frameworks by providing visualizations based on their
code bases.
• Python's web frameworks benchmarks is a test of the responsiveness of
a framework with encoding an object to JSON and returning it as a
response as well as retrieving data from the database and rendering it
in a template. There were no conclusive results but the output is fun to
read about nonetheless.
• What web frameworks do you use and why are they awesome? is a
language agnostic Reddit discussion on web frameworks. It's
interesting to see what programmers in other languages like and dislike
about their suite of web frameworks compared to the main Python
frameworks.
• This user-voted question & answer site asked "What are the best
general purpose Python web frameworks usable in production?". The
votes aren't as important as the list of the many frameworks that are
available to Python developers.
• Django vs. Flask in 2019: Which Framework to Choose looks at the best
use cases for Django and Flask along with what makes them unique,
from an educational and development standpoint.
• 11 new Python web frameworks has a quick blurb on several newer
frameworks that are still emerging, such as Sanic, Masonite and
Molten.
Web frameworks learning checklist
1. Choose a major Python web framework (Django or Flask are
recommended) and stick with it. When you're just starting it's best to
learn one framework first instead of bouncing around trying to
understand every framework.
197
Full Stack Python: 2020 Supporter's Edition
2. Work through a detailed tutorial found within the resources links on
the framework's page.
3. Study open source examples built with your framework of choice so you
can take parts of those projects and reuse the code in your application.
4. Build the first simple iteration of your web application then go to the
deployment section to make it accessible on the web.
Django
Django is a widely-used Python web application framework with a "batteriesincluded" philosophy. The principle behind batteries-included is that the
common functionality for building web applications should come with the
framework instead of as separate libraries.
For example, authentication, URL routing, a template engine, an objectrelational mapper (ORM), and database schema migrations are all included
with the Django framework. Compare that included functionality to the Flask
framework which requires a separate library such as Flask-Login to perform
user authentication.
The batteries-included and extensibility philosophies are simply two different
ways to tackle framework building. Neither philosophy is inherently better
than the other one.
Why is Django a good web framework choice?
The Django project's stability, performance and community have grown
tremendously over the past decade since the framework's creation. Detailed
tutorials and good practices are readily available on the web and in books. The
198
Full Stack Python: 2020 Supporter's Edition
framework continues to add significant new functionality such as database
migrations with each release.
I highly recommend the Django framework as a starting place for new Python
web developers because the official documentation and tutorials are some of
the best anywhere in software development. Many cities also have Djangospecific groups such as Django District, Django Boston and San Francisco
Django so new developers can get help when they are stuck.
There's some debate on whether learning Python by using Django is a bad
idea. However, that criticism is invalid if you take the time to learn the Python
syntax and language semantics first before diving into web development.
Django books and tutorials
There are a slew of free or low cost resources out there for Django. Make sure
to check the version numbers used in each post you read because Django was
released over 10 years ago and has had a huge number of updates since then.
These resources are geared towards beginners. If you are already experienced
with Django you should take a look at the next section of resources for more
advanced tutorials.
• Tango with Django is an extensive set of free introductions to using the
most popular Python web framework. Several current developers said
this book really helped them get over the initial framework learning
curve.
• The Django Girls Tutorial is a great tutorial that doesn't assume any
prior knowledge of Python or Django while helping you build your first
web application.
• A Complete Beginner's Guide to Django is a wonderful seven-part
series that incrementally builds out a Django project and handles
deploying the app in the final post. The seven parts are:
◦
◦
◦
◦
◦
◦
Getting Started
Fundamentals
Advanced Concepts
Authentication
Django ORM
Class-Based Views
199
Full Stack Python: 2020 Supporter's Edition
◦ Deployment
• Test-Driven Development with Python focuses on web development
using Django and JavaScript. This book uses the development of a
website using the Django web framework as a real world example of
how to perform test-driven development (TDD). There is also coverage
of NoSQL, WebSockets and asynchronous responses. The book can be
read online for free or purchased in hard copy via O'Reilly.
• Django OverIQ is a project-based tutorial for beginners that covers the
required features such as the Django ORM and Django Templates.
• The Django subreddit often has links to the latest resources for
learning Django and is also a good spot to ask questions about it.
• This Django tutorial shows how to build a project from scratch using
Twitter Bootstrap, Bower, Requests and the Github API.
• The recommended Django project layout is helpful for developers new
to Django to understand how to structure the directories and files
within apps for projects.
• Django for Beginners: Build websites with Python and Django by
William S. Vincent is perfect if you are just getting started with Django
and web development, taking you from total beginner to confident web
developer with Django and Python.
Django videos
Are you looking for Django videos in addition to articles? There is a special
section for Django and web development on the best Python videos page.
Intermediate and advanced Django topics
These books and tutorials assume that you know the basics of building Django
and want to go further to become much more knowledgeable about the
framework.
• 2 Scoops of Django by Daniel Greenfeld and Audrey Roy is well worth
the price of admission if you're serious about learning how to correctly
develop Django websites.
200
Full Stack Python: 2020 Supporter's Edition
• The Test-Driven Development with Django, Django REST Framework,
and Docker course details how to set up a development environment
with Docker in order to build and deploy a RESTful API powered by
Python, Django, and Django REST Framework.
• User Interaction With Forms explains general web form input, how
Django handles forms via POST requests, different types of input such
as CharFields, DateFields and EmailFields, and validating that input.
• This 3-part Django project optimization guide covers a wide range of
advanced topics such as Profiling and Django settings, working with
databases and caching.
• Caching in Django is a detailed look at the configuration required for
caching and how to measure the performance improvements once you
have it in place.
• Mental Models for Class Based Views provides some comparison points
between class based views (CBVs) and function based views and the
author's opinions for how you can better understand CBVs.
• Working with time zones is necessary for every web application. This
blog post on pytz and Django is a great start for figuring out what you
need to know.
• A Guide to ASGI in Django 3.0 and its Performance covers the new
Asynchronous Server Gateway Interface (ASGI) that was introduced in
Django 3.0 and explains some of the nuances and gotchas that you
should consider if you decide to use it for your web apps.
• REST APIs with Django: Build powerful web APIs with Python and
Django by William S. Vincent is the book for you if you are just moving
beyond the basics of Django and looking to get up speed with Django
REST Framework (DRF) and service-oriented architecture (SOA). It
also dives into more advanced topics like token-based authentication
and permissions.
• Django Stripe Tutorial details how to quickly add Stripe to accept
payments in a Django web app.
• This Python Social Auth for Django tutorial will show you how to
integrate social media sign in buttons into your Django application.
201
Full Stack Python: 2020 Supporter's Edition
• Upgrading Django provides a version-by-version guide for updating
your Django projects' code.
• The Django Admin Cookbook and Building Multi Tenant Applications
with Django are two solid "code recipe-style" free open source books
that will teach you more about the admin interface as well as building
projects that will be used by more than a single customer so their data
needs to be properly separated.
• How to Create Custom Django Management Commands explains how
to expand the default manage.py commands list with your own
custom commands in your projects. The tutorial has a bunch of great
examples with expected output to make it easy to follow along and
learn while you work through the post.
• Luke Plant writes about his approach to class based views (CBVs),
which often provoke heated debate in the Django community for
whether they are a time saver or "too much magic" for the framework.
• Django Apps Checklist gives some good practices rules for building
reusable Django apps.
Django migrations
• Paul Hallett wrote a detailed Django 1.7 app upgrade guide on the
Twilio blog from his experience working with the django-twilio
package.
• Real Python's migrations primer explores the difference between
South's migrations and the built-in Django 1.7 migrations as well as
how you use them.
• Andrew Pinkham's "Upgrading to Django 1.7" series is great learning
material for understanding what's changed in this major release and
how to adapt your Django project. Part 1, part 2 and part 3 and part 4
are now all available to read.
• Django migrations without downtimes shows one potential way of
performing on-line schema migrations with Django.
202
Full Stack Python: 2020 Supporter's Edition
• How to Extend Django User Model presents four main ways to expand
upon the built-in User model that is packaged with Django. This
scenario is very common for all but the simplest Django projects.
• Creating a Custom User Model in Django looks at how to create a
custom User model in Django so that an email address can be used as
the primary user identifier instead of a username for authentication.
Django Channels
Channels are a new mechanism in Django 1.9 provided as a standalone app.
They may be incorporated into the core framework in 2.0+. Channels provide
"real-time" full-duplex communication between the browser and the server
based on WebSockets.
• This tutorial shows how to get started with Django Channels in your
project.
• The channels examples repository contains a couple of good starter
projects such as a live blog and a chat application to use as base code.
• The Developing a Real-Time Taxi App with Django Channels and
Angular course details how to create a ride-sharing app with Django
Channels, Angular, and Docker. Along the way, you'll learn how to
manage client/server communication with Django Channels, control
flow and routing with Angular, and build a RESTful API with Django
REST Framework.
Django testing
• Integrating Front End Tools with Django is a good post to read for
figuring out how to use Gulp for handling front end tools in
development and production Django sites.
• Django Testing Cheat Sheet covers many common scenarios for Django
applications such as testing POST requests, request headers,
authentication, and large numbers of model fields in the Django ORM.
• Getting Started with Django Testing will help you stop procrastinating
on testing your Django projects if you're uncertain where to begin.
203
Full Stack Python: 2020 Supporter's Edition
• Testing in Django provides numerous examples and explanations for
how to test your Django project's code.
• Django views automated testing with Selenium gives some example
code to get up and running with Selenium browser-based tests.
Django with JavaScript MVC frameworks
There are resources for JavaScript MVC frameworks such as Angular, React
and Vue.js on their respective pages.
Django ORM tutorials
Django comes with its own custom object-relational mapper (ORM) typically
referred to as "the Django ORM". Learn more about the Django ORM on the
its own page and more broadly about ORMs on the Python object-relational
mappers page.
Static and media files
Deploying and handling static and media files can be confusing for new
Django developers. These resources along with the static content page are
useful for figuring out how to handle these files properly.
• Using Amazon S3 to Store your Django Site's Static and Media Files is a
well written guide to a question commonly asked about static and
media file serving.
• Loading Django FileField and ImageFields from the file system shows
how to load a model field with a file from the file system.
• Storing Django Static and Media Files on Amazon S3 shows how to
configure Django to load and serve up static and media files, public and
private, via an Amazon S3 bucket.
Django project templates
Project templates, not to be confused with a template engine, generate
boilerplate code for a base Django project plus optional libraries that are often
used when developing web applications.
• Caktus Group's Django project template is Django 2.2+ ready.
204
Full Stack Python: 2020 Supporter's Edition
• Cookiecutter Django is a project template from Daniel Greenfeld, for
use with Audrey Roy's Cookiecutter. The template results are Heroku
deployment-ready.
• Two Scoops Django project template is also from the PyDanny and
Audrey Roy. This one provides a quick scaffold described in the Two
Scoops of Django book.
• Sugardough is a Django project template from Mozilla that is
compatible with cookiecutter.
Open source Django example projects
Reading open source code can be useful when you are trying to figure out how
to build your own projects. This is a short list of some real-world example
applications, and many more can be found on the Django example projects
and code page.
• Browser calls with Django and Twilio shows how to build a web app
with Django and Twilio Client to turn a user's web browser into a fullfledged phone. Pretty awesome!
• Openduty is a website status checking and alert system similar to
PagerDuty.
• Courtside is a pick up sports web application written and maintained
by the author of PyCoder's Weekly.
• These two Django Interactive Voice Response (IVR) system web
application repositories part 1 and part 2 show you how to build a really
cool Django application. There's also an accompanying blog post with
detailed explanations of each step.
• Taiga is a project management tool built with Django as the backend
and AngularJS as the front end.
Open source code to learn Django
There are many open source projects that rely on Django. One of the best
ways to learn how to use this framework is to read how other projects use it in
real-world code. This section lists these code examples by class and method in
Django's code base.
205
Full Stack Python: 2020 Supporter's Edition
Flask
Flask (source code) is a Python web framework built with a small core and
easy-to-extend philosophy.
Why is Flask a good web framework choice?
Flask is considered more Pythonic than the Django web framework because in
common situations the equivalent Flask web application is more explicit.
Flask is also easy to get started with as a beginner because there is little
boilerplate code for getting a simple app up and running.
For example, here is a valid "Hello, world!" web application with Flask:
from flask import Flask
app = Flask(__name__)
@app.route('/')
def hello_world():
return 'Hello, World!'
if __name__ == '__main__':
app.run()
The above code shows "Hello, World!" on localhost port 5000 in a web
browser when run with the python app.py command and the Flask library
installed.
The equivalent "Hello, World!" web application using the Django web
framework would involve significantly more boilerplate code.
Flask was also written several years after Django and therefore learned from
the Python community's reactions as the framework evolved. Jökull Sólberg
206
Full Stack Python: 2020 Supporter's Edition
wrote a great piece articulating to this effect in his experience switching
between Flask and Django.
How does Flask relate to the Pallets Projects?
Flask was originally designed and developed by Armin Ronacher as an April
Fool's Day joke in 2010. Despite the origin as a joke, the Flask framework
became wildly popular as an alternative to Django projects with their
monolithic structure and dependencies.
Flask's success created a lot of additional work in issue tickets and pull
requests. Armin eventually created The Pallets Projects collection of open
source code libraries after he had been managing Flask under his own GitHub
account for several years. The Pallets Project now serves as the communitydriven organization that handles Flask and other related Python libraries such
as Lektor, Jinja and several others.
Flask tutorials
The "Hello, World!" code for Flask is just seven lines of code but learning how
to build full-featured web applications with any framework takes a lot of work.
These resources listed below are the best up-to-date tutorials and references
for getting started.
• The Flask mega tutorial by Miguel Grinberg is a perfect starting
resource for using this web framework. Each post focuses on a single
topic and builds on previous posts. The series includes 18 parts: #1
Hello World, #2 Templates, #3 Web Forms, #4 Database, #5 User
Logins, #6 Profile Page and Avatars, #7 Unit Testing, #8 Followers,
Contacts, and Friends, #9 Pagination, #10 Full Text Search, #11 Email
Support, #12 Facelift, #13 Dates and Times, #14 I18n and L10n, #15
Ajax, #16 Debugging, Testing and Profiling, #17 Deployment on Linux
and #18 Deployment on the Heroku Cloud. Miguel also wrote and
recorded numerous Flask Web Development content including a great
book and video book that are excellent resources worth the price,
especially to support his continuous revisions to the content.
• Armin Ronacher, the creator of Flask, presented the technical talk
Flask for Fun and Profit at PyBay 2016 where he discusses using the
framework to build web apps and APIs.
207
Full Stack Python: 2020 Supporter's Edition
• Explore Flask is a public domain book that was previously backed on
Kickstarter and cost money for about a year before being open sourced.
The book explains best practices and patterns for building Flask apps.
• Learn to Build Web Applications with Flask and Docker is a video
course by Nick Janetakis that shows how to build a Software-as-aService (SaaS) application that he open sourced which uses Flask for
the web framework and Docker for the local development environment.
• Flask by Example: Part 1 shows the basic first steps for setting up a
Flask project. Part 2 explains how to use PostgreSQL, SQLAlchemy and
Alembic. Part 3 describes text processing with BeautifulSoup and
NLTK. Part 4 shows how to build a task queue with Flask and Redis.
• The blog post series "Things which aren't magic" covers how Flask's
ubiquitous @app.route decorator works under the covers. There are
two parts in the series, part 1 and part 2.
• How to Structure Large Flask Applications covers a subject that comes
up quickly once you begin adding significant functionality to your Flask
application.
• Flask Blueprint templates shows a way of structuring your
__init__.py file with blueprints for expanding projects into many
files and modules.
• If you're not sure why DEBUG should be set to False in a production
deployment, be sure to read this article on how Patreon got hacked.
• Developing a Single Page App with Flask and Vue.js step-by-step
walkthrough of how to set up a basic CRUD app with Vue and Flask.
Intermediate to advanced Flask resources
Once you move past the beginner tutorials and have created a few Flask
projects you will want to learn how to use Flask extensions, deploy your code
and integrate web APIs to build more extensive functionality. The following
tutorials will guide you through more advanced topics and provide solid
learning materials, especially when combined with the example real-world
projects listed in the next section.
208
Full Stack Python: 2020 Supporter's Edition
• Microservices with Docker, Flask, and React is an awesome course for
beyond-the-basics work with Flask. There are a couple of free chapters
and the rest of the course is well worth paying for to learn a bunch of
valuable tools such as Docker, React and microservices architectures.
• Visualize your trip with Flask and Mapbox along with the open source
flask_mapbox GitHub repository provides a fantastic example
visualization of a trip to Iceland with Flask as the backend web
framework.
• Microservices with Flask, Docker, and React teaches how to spin up a
reproducible Flask development environment with Docker. It shows
how to deploy it to an Amazon EC2 instance then scale the services on
Amazon EC2 Container Service (ECS).
• Build a Video Chat Application with Python, JavaScript and Twilio
Programmable Video shows how to use Twilio Programmable Video to
build cross-platform (web, iOS and Android) video into Flask
applications.
• Why and how to handle exceptions in Python Flask has some great
example code and reasons why you should code defensively by
anticipating and handling the unhappy path exceptions in your Flask
applications. The examples are relevant to any web framework you will
use and are easy to copy and paste to test in your own applications.
• The Flask Extensions Registry is a curated list of the best packages that
extend Flask. It's the first location to look through when you're
wondering how to do something that's not in the core framework.
• How I Structure My Flask Application walks through how this
developer organizes the components and architecture for his Flask
applications.
• Adding phone calling to your web application is a killer Flask tutorial
with all the code needed to create a web app that can dial phones and
receive inbound calls.
• Jeff Knupp provides some solid advice on how to productionize a Flask
app.
209
Full Stack Python: 2020 Supporter's Edition
• If you're looking for a fun tutorial with Flask and WebSockets, check
out my blog post on creating Choose Your Own Adventure
Presentations with Reveal.js, Python and WebSockets. Follow up that
tutorial by building an admin interface in part 1, part 2 and part 3
that'll show you how to use forms and SQLAlchemy. There is also a
companion open source GitHub repository for the app with tags for
each step in the blog posts.
• One line of code cut our Flask page load times by 60% is an important
note about optimizing Flask template cache size to dramatically
increase performance in some cases.
• Unit Testing Your Twilio App Using Python’s Flask and Nose covers
integrating the Twilio API into a Flask application and how to test that
functionality with nose.
• The Flask documentation has some quick examples for how to deploy
Flask with standalone WSGI containers.
• Serverless Python Web Applications With AWS Lambda and Flask is a
spectacular post that walks through how to run Flask applications on
AWS Lambda's serverless offering. The tutorial has instructions on
how to include application dependencies and handle your deployment
workflow.
• Visualize your trip with Flask and Mapbox uses geographic GeoJSON
data and presents it in a Flask application that uses Mapbox.
• Handling Email Confirmation in Flask is a great walkthrough for a
common use case of ensuring an email address matches with the user's
login information.
• Static websites with Flask shows how to use Flask with Frozen-Flask to
generate a static website from a backend data source.
• Running Flask on Docker Swarm details how to run a Flask app on
Docker Swarm.
• Running Flask on Kubernetes step-by-step walkthrough of how to
deploy a Flask-based microservice (along with Postgres and Vue.js) to a
Kubernetes cluster.
210
Full Stack Python: 2020 Supporter's Edition
• Dynamic Secret Generation with Vault and Flask looks at how to use
Hashicorp's Vault and Consul to create dynamic Postgres credentials
for a Flask web app.
Open source Flask example projects
Flask's lack of standard boilerplate via a commandline interface for setting up
your project structure is a double edged sword. When you get started with
Flask you will have to figure out how to scale the files and modules for the
code in your application. The following open source projects range from
simple to complex and can give you ideas about how to working on your
codebase.
• Skylines is an open source flight tracking web application built with
Flask. You can check out a running version of the application.
• Flask JSONDash is a Flask blueprint that creates JavaScript Object
Notiation (JSON) APIs for data dashboards.
• Microblog is the companion open source project that goes along with
Miguel Grinberg's O'Reilly Flask book.
• Flaskr TDD takes the official Flask tutorial and adds test driven
development and JQuery to the project.
• Charles Leifer (author of Peewee and Pony ORM) built a note-taking
app along with the source code in Gists.
• Reddit Job Search uses the Reddit API for a jobs data set and presents
them via a Flask web app.
• Bean Counter is an open source Flask app for tracking coffee.
• FlaskBB is a Flask app for a discussion forum.
• psdash is an app built with Flask and psutils to display information
about the computer it is running on.
Flask project templates
Flask's wide array of extension libraries comes at the cost of having a more
complicated project setup. The following project templates provide a starter
211
Full Stack Python: 2020 Supporter's Edition
base that you can either use for your own applications or just learn various
ways to structure your code.
• Use the Flask App Engine Template for getting set up on Google App
Engine with Flask.
• Flask Foundation is a starting point for new Flask projects. There's also
a companion website for the project that explains what extensions the
base project includes.
• Cookiecutter Flask is a project template for use with Cookiecutter.
• Flask-Boilerplate provides another starting project with sign up, log in
and password reset.
• The company Sunscrapers provides this Flask boilerplate project with
SQLAlchemy, py.test and Celery baked into the Flask project structure.
• flask-webpack-cookiecutter combines a Flask framework project
structure with Webpack, a module bundler frequently used in the
JavaScript world.
Open source code for learning Flask
There are many open source projects that rely on Flask to operate. One of the
best ways to learn how to use this framework is to read how other projects use
it in real-world code. This section lists these code examples by class and
method in Flask.
Bottle
Bottle (source code) is a WSGI-compliant single source file web framework
with no external dependencies other than the Python standard library (stdlib).
212
Full Stack Python: 2020 Supporter's Edition
Should I use Bottle for web development?
Bottle is awesome for a few web development situations:
1. Prototyping ideas
2. Learning how web frameworks are built
3. Building and running simple personal web applications
Prototyping
Prototyping simple ideas is often easier with Bottle than a more opinionated
web framework like Django because Django projects start with a significant
amount of boilerplate code. The Model-View-Template structure for Django
apps within projects makes maintaining projects easier, but it can be
cumbersome on starter projects where you're just playing with random ideas
so you aren't worried about your application's long-term code structure.
Learning about frameworks
Bottle is contained within a single large source file named bottle.py so it
provides great reading when learning how WSGI web frameworks work.
Everything you need to learn about how your web application's code connects
with the Bottle framework is contained within that single source code.
Personal projects
Personal projects can be deployed with Bottle as the only dependency. If
you've never performed a Python web app deployment before, the number of
concepts and steps can be daunting. By packaging bottle.py with your
app's source code, you can skip some of the steps to more easily get your web
application up and running.
Bottle resources
• Configuring Python 3, Bottle and Gunicorn for Development on
Ubuntu 16.04 LTS is a quick tutorial for getting an out-of-the-box
default Ubuntu 16.04 image ready for Bottle development with Green
Unicorn as the WSGI server.
• Check out these Full Stack Python Bottle tutorials that'll teach you how
to write a few small but very useful Bottle web apps:
◦ Creating Bar Chart Visuals with Bokeh, Bottle and Python 3
◦ Dialing Outbound Phone Calls with a Bottle Web App
213
Full Stack Python: 2020 Supporter's Edition
◦ How to Monitor Python / Bottle Web Apps
◦ Replying to SMS Text Messages with Python and Bottle
• Digital Ocean provides an extensive introductory post on Bottle.
• First Steps with Python and Bottle is a quick 4 minute introduction that
I created for developers so they can get the simplest possible Bottle web
app running. There is also a companion blog post with the code.
• Getting Started with Python, Bottle and Twilio SMS / MMS shows how
to build a simple Bottle web application that can send and receive text
and picture messages. How to Make and Receive Phone Calls with
Python, Bottle and Twilio Voice is a similar beginner's tutorial for
handling phone calls with a Bottle app using Twilio.
• Developing With Bottle details how to create a basic application with
Bottle.
• The official Bottle tutorial provides a thorough view of basic concepts
and features for the framework.
• Running a Bottle app with Gunicorn shows how to execute a simple
Bottle web app with Green Unicorn.
• Here's a short code snippet for creating a RESTful API with Bottle and
MongoDB.
• Bottle, full stack without Django does a nice job of connecting
SQLAlchemy with Bottle and building an example application using the
framework.
• Using bottle.py in Production has some good tips on deploying a Bottle
app to a production environment.
• Jinja2 Templates and Bottle shows how to use Jinja instead of the
built-in templating engine for Bottle page rendering.
• Python patterns contains a setup that combines Bottle, Celery and
Peewee as the developer's choice for backend web development. The
developer also uses Vim as the primary editor for working with Bottle.
Open source Bottle example projects
• Pattle is a pastebin clone built with Bottle.
214
Full Stack Python: 2020 Supporter's Edition
• Decanter is a library for structuring Bottle projects.
• compare-python-web-frameworks provides an example application
using Bottle as one of the implementations.
Bottle framework learning checklist
1. Download Bottle or install via pip with pip install bottle on your
local development machine.
2. Work through the official Bottle tutorial.
3. Start coding your Bottle app based on what you learned in the official
tutorial plus reading open source example applications found above.
4. Move on to the deployment section to get your initial Bottle application
on the web.
Pyramid
Pyramid is an open source WSGI web framework based on the Model-ViewController (MVC) architectural pattern.
Open source Pyramid example apps
These projects provide solid starting code to learn from as you are building
your own applications.
• pyramid_blogr is an example project that shows how to build a blog
with Pyramid modeled on the Flaskr tutorial.
215
Full Stack Python: 2020 Supporter's Edition
• pyramid-blogr-cf is another Pyramid web app that has a similar title to
the one above but this one is intended for teaching web development to
new developers.
• pyramid_appengine provides a project skeleton for running Pyramid
on Google App Engine.
Pyramid-specific packages
The following packages are designed to make Pyramid play nicely with
existing open source libraries by reducing the boilerplate you need to add to
your project.
• pyramid_celery and pyramid_rq make it easier to use the Celery task
queue in your Pyramid applications for handling asynchronous work.
• pyramid_zipkin provides distributed tracing via the Zipkin library.
• Ramses (source code) is a RESTful web API generation framework,
similar in concept (but not in implementation details) to how Django
REST Framework works with Django.
Pyramid resources
Pyramid has fantastic official project documentation on its site. Other
resources are harder to come by compared to other established web
frameworks such as Django and Flask, but there are enough tutorials out
there for you to learn Pyramid if you choose to build your web applications
with it.
• Try Pyramid is the official marketing website for Pyramid, with
resources for extending your Pyramid apps. It also provides some
sample "hello world!" code.
• An introduction to the Pyramid web framework for Python provides a
detailed configuration and more than trivial code to build a "to do"
application. This post is one in a four part series that compares
Pyramid to Flask, Django and Tornado so there is some commentary
on how this framework compares to the others.
• The first Pyramid app is a good place to start getting your hands dirty
with an example project.
216
Full Stack Python: 2020 Supporter's Edition
• Six Feet Up explains why Pyramid is their choice for rapid development
projects in that blog post.
• Build a chat app with Pyramid, SQLDB, and Bluemix is a Pyramid
application walkthrough specific to IBM's Bluemix platform.
• Developing Web Apps Using the Python Pyramid Framework is a video
from San Francisco Python with an overview of how to install, get
started and build a web app with the Pyramid framework.
• This podcast interview with the primary author of the Pyramid
framework explains how Pyramid sprang from Pylons and how
Pyramid compares to other modern frameworks.
• Anyone using the Pyramid framework? is a good /r/Python thread with
responses by current users as well as frustrations by developers who
tried and decided against sticking with the framework.
TurboGears
TurboGears, born as a full stack layer on top of Pylons, is now a standalone
WSGI web framework that can act both as a full stack framework (like
Django) or as a micro framework (like Flask)
Originally inspired by RubyOnRails it's based on MVC where the controller
dispatches the request to a set of actions exposed from the controller itself.
TurboGears, in its full stack mode, provides all the features you would require
during development of a web application:
•
•
•
•
Identification and Authentication
Authorization
Autogenerated Admin and CRUD
Sessions
217
Full Stack Python: 2020 Supporter's Edition
•
•
•
•
•
•
•
Caching
Schema Migrations
Master/Slave Database Queries Balancing
Request Bound Transactions
Interactive Debugger
Builtin Profiling
Pluggable Applications
It's also one of the few web frameworks officially supporting MongoDB as one
of the primary storage backends, including support into the TurboGears
Admin to autogenerate CRUDs from MongoDB models.
While TurboGears has always been a full stack framework with same scope of
projects like Django, it differentiates from other frameworks due the its
philosophy on two major parts of a web framework: Templating and Routing
Templating
While TurboGears provides support for multiple template engines, the
primary one has always been a fully validated XML template engine.
Currently TurboGears ships with the Kajiki template engine, which was
developed within the project itself, but in the past it relied on the Genshi and
Kid template engines which were mostly syntax compatible with Kajiki.
Historically validated xml template engines has always been slower than text
template engines, but the Kajiki project was able to create a very fast template
engine that usually renders faster than Mako or Django Template while still
retaining all the expected features.
The fact that it relies on a validated XML engine provides some benefits
compared to plain text engines like Django Template, Jinja2 and Mako:
Automatic Escaping
It automatically escapes content rendered into the template, thus making
easier to avoid XSS and injection security issues:
<div>${value}</div>
with value='<script>alert("hello")</script>' will render as
218
Full Stack Python: 2020 Supporter's Edition
<div><script>alert("hello")</script></div>;
thus preventing any form of injection from user provided content.
Automatic Internationalization
The template engine parses the provided template document and recognises
the nodes that contain static text.
As the engine is able to distinguish text from markup it's able to flag the text
for translation.
Content like <div>Hello World</div> would get automatically translated if
a translation for "Hello World" is provided, without having to wrap text in
gettext calls.
Compatibility with WYSIWYG Editors
As the template engine syntax is purely valid XHTML the template itself can
be opened with WYSIWYG editors and as far as they don't strip unknown
attributes the template can be edited and saved back from those editors.
Routing
Most web frameworks have been relying on regular expressions to declare
routing, through decorators or through a routing map.
TurboGears supports regular expressions through the tgext.routes
extension, but the preferred way of routing is through the Object Dispatch
system.
In Object Dispatch a root controller object is traversed while resolving the
URL. Each part of the url path is mapped to a property of the controller
(Which might point to a sub controller) until a final callable action is
encountered.
This leads to a very natural mapping between URLs and the code serving
them, allowing people with minimal knowledge of a project to jump in and
quickly find actions in charge of serving a specific page.
In Object Dispatch an URL like /users/new?name=MyName would be served
by a hierarchy of objects like:
219
Full Stack Python: 2020 Supporter's Edition
class UsersController(TGController):
@expose()
def new(self, name=None):
return 'Hi, %s' % name
class RootController(TGController):
users = UsersController()
It's easy to see how /users/new actually resolves to
RootController.users.new and all options provided to the URL are passed
to the action serving the response as arguments.
TurboGears Resources
• TurboGears Introduction Video An overview of TurboGears2 features
presented at the PyConWeb
• TurboGears Documentation The official TurboGears documentation
• Microframework Mode Tutorial The official tutorial that focuses on
starting TurboGears in microframework mode and leads to
developement of a single file web application
• FullStack Tutorial The Wiki in 20 minutes tutorial that showcases how
to create a fully functional wiki application with TurboGears in full
stack mode.
• The CogBin The CogBin is a list of the most common pluggable
applications for TurboGears, it enlists ready made pieces you can plug
into your web application to provide features like Facebook Login,
Comments, Registration and so on...
• React in Pure Python An article showcasing how to create web
applications relying on React without the need to have NodeJS
installed at all. The article uses TurboGears as the web framework to
develop the example application.
• Turbogears and the future of Python web frameworks is a Talk Python
to Me podcast episode featuring an interview with one of the
TurboGears core developers.
220
Full Stack Python: 2020 Supporter's Edition
Falcon
Falcon is a WSGI-compliant web framework designed to build RESTful APIs
without requiring external code library dependencies.
Falcon resources
• Building Very Fast App Backends with Falcon Web Framework on
PyPy provides a walkthrough of a web API in Falcon that runs with
PyPy and Nginx.
• Asynchronous Tasks with Falcon and Celery shows how to configure
Celery with the framework.
• The official Falcon tutorial has a meaty guide for building and
deploying your first Falcon web application.
• Building Scalable RESTful APIs with Falcon and PyPy shows a to-do
list example with Falcon running on PyPy.
Morepath
Morepath is a micro web framework with a model-driven approach to creating
web applications and web APIs.
221
Full Stack Python: 2020 Supporter's Edition
Morepath's framework philosophy is that the data models should drive the
creation via the web framework. By default the framework routes URLs
directly to model code, unlike for example Django which requires explicit URL
routing by the developer.
Why is Morepath an interesting web framework?
Simple CRUD web applications and APIs can be tedious to build when they
are driven straight from data models without much logic between the model
and the view. Learn more about how Morepath compares with other web
frameworks from the creator.
With the rise of front end JavaScript frameworks, many Python web
frameworks are first being used to build RESTful APIs that return JSON
instead rendering HTML via a templating system. Morepath appears to have
been created with the RESTful API model approach in mind and cuts out the
assumption that templates will drive the user interface.
Morepath resources
Morepath, with its first commit using the Morepath name in 2013, is a much
newer web framework than Django, Flask or Pyramid, which results in fewer
tutorials. There is also a lot of opportunity for newer Python developers to fill
the gaps with their own Morepath tutorials. However, these resources below
are a good place to get started.
• On the Morepath is a blog post by Startifact on how they use Morepath
and some of the features of the framework.
• Build a better batching UI with Morepath and Jinja2 is an introductory
post on building a simple web application with the framework. The
code for the application is also open source and available on GitHub.
• podcast.__init__ interviewed Martijn Faassen about Morepath and he
described what makes the framework different from other existing web
frameworks as well as why someone should be convinced to switch for
a new project.
• Morepath's creator gave a great talk on the motivation and structure
for the new framework at EuroPython 2014.
222
Full Stack Python: 2020 Supporter's Edition
• Is Morepath Fast Yet? includes some benchmarks and discusses
performance implications of using a Python-based web framework for
your web application.
• The Morepath-cookiecutter project handles project creation templates
using
Cookiecutter and recommended file structure from the Morepath
documentation.
Sanic
Sanic is a Python web framework built on uvloop and designed for fast HTTP
responses via asynchronous request handling.
What are the tradeoffs of using Sanic?
Sanic cannot be developed or deployed on Windows due to its necessary
uvloop dependency.
There was an excellent discussion on the /r/python subreddit about using one
of the newer async frameworks such as Sanic or Japronto compared with a
traditional web framework like Django. One of the major tradeoff of adopting
a newer framework is simply that the code library ecosystem has not, and may
never, grow up around that framework. You have to accept the risk that you
will need to build a significant amount of the plumbing yourself rather than
pip installing existing, well-tested libraries.
223
Full Stack Python: 2020 Supporter's Edition
Sanic tutorials
Sanic is under very active development and is still in its infancy as a web
framework. The following tutorials will get you started but there is a chance
you will have to work through errors as Sanic is regularly updated.
• Getting started with Sanic: the asynchronous, uvloop based web
framework for Python 3.5+ is a "Hello, World!" style post for the
framework and also shows how to respond to SMS text messages using
Twilio.
• Fixing bugs and handling 186k requests/second using Python is a fun
benchmarking exercise that a developer ran when testing out Sanic on
a Digital Ocean droplet.
• Exploring Asyncio - uvloop, sanic and motor explains why asyncio is
important to the Python community and how uvloop & sanic fit into
the bigger picture.
• Python Sanic Tutorial is a video tutorial on how to write your first
Sanic web apps.
• A Guide to Instrumenting Sanic Applications, Part 1 shows how to add
Prometheus-based monitoring to Sanic applications.
Sanic open source projects and examples
There are not many example applications and extensions for Sanic compared
to Flask, Django or other web frameworks because Sanic is still so new.
However, there are some initial projects that are useful for figuring out how to
build your first applications with this framework.
• Gutenberg-HTTP is a web application and API built with Sanic. It's a
solid clean example of how to build a decent-sized project with Sanic.
There is even a demo that was deployed to Azure to show how it works.
• Practical Log Viewers with Sanic and Elasticsearch - Designing CI/CD
Systems shows how to build a log viewer using Sanic that collects data
from various Docker containers being created through a build system.
• Sanic comes with a slew of examples in the official repository.
224
Full Stack Python: 2020 Supporter's Edition
• Sanic starter bundles Sanic with SQLAlchemy and Alembic (for data
migrations) as a starter project.
• Sanic-limiter is an extension for rate-limiting the number of requests
from a single user on Sanic APIs.
• Sanic-GraphQL adds GraphQL support to a Sanic web application.
• Sanic OpenAPI provides a user interface for Sanic APIs.
• This Sanic & Nginx & docker-compose example has boilerplate code for
setting up a Sanic project using Docker and Nginx.
• Sanic JWT (source code) adds support for authentication via JSON
Web Tokens (JWT).
Other Web Frameworks
Python has a significant number of newer and less frequently-used web
frameworks that are still worth your time to investigate. The list on this page
does not include the following web frameworks that have their own dedicated
pages:
• Django
• Flask
• Pyramid
• TurboGears
• Bottle
• Falcon
• Morepath
• Sanic
web.py
web.py is a Python web framework designed for simplicity in building web
applications.
• See this Reddit discussion on reasons why to not use web.py for some
insight into the state of the project.
225
Full Stack Python: 2020 Supporter's Edition
web2py
Web2py is a batteries-included philosophy framework with project structure
based on model-view-controller patterns.
CherryPy
CherryPy is billed as a minimalist web framework, from the perspective of the
amount of code needed to write a web application using the framework. The
project has a long history and made a major transition between the second
and third release.
Muffin
Muffin is a web framework built on top of the asyncio module in the Python
3.4+ standard library. Muffin takes inspiration from Flask with URL routes
defined as decorators upon view functions. The Peewee ORM is used instead
of the more common SQLAlchemy ORM.
Ray
Ray is a framework for building RESTful APIs, similar to Falcon. The
introductory post provides some initial code to get started with creating
endpoints, adding authentication and protecting against malicious clients.
Vibora
Vibora is an asynchronous model framework similar to Sanic that was
inspired by Flask's syntax. However, the framework's author rewrote many
parts like the template engine to maximize performance.
Pecan
Pecan is inspired by CherryPy and TurboGears. It purely focuses on HTTP
requests and responses via Python objects and does not integrate session
handling or database access.
Masonite
Masonite is a modern, developer centric, batteries-included Python web
framework. It uses the MVC (Model-View-Controller) architecture pattern
226
Full Stack Python: 2020 Supporter's Edition
and comes with a lot of functionality out of the box with an extremely
extendable architecture.
Check out the following resources to lean more:
1. 5 reasons why people are choosing Masonite over Django
2. MasoniteCasts
3. Dockerizing Masonite with Postgres, Gunicorn, and Nginx
Other web framework resources
• This roundup of 14 minimal Python frameworks contains both familiar
and less known Python libraries.
• The web micro-framework battle presentation goes over Bottle, Flask,
and many other lesser known Python web frameworks.
• A Python newcomer asked the Python Subreddit to explain the
differences between numerous Python web frameworks and received
some interesting responses from other users.
Other frameworks learning checklist
1. Read through the web frameworks listed above and check out their
project websites.
2. It's useful to know what other web frameworks exist besides Django
and Flask. However, when you're just starting to learn to program
there are significantly more tutorials and resources for Django and
Flask on the web. My recommendation is to start with one of those two
frameworks then expand your knowledge from there.
Template Engines
Template engines take in tokenized strings and produce rendered strings with
values in place of the tokens as output. Templates are typically used as an
intermediate format written by developers to programmatically produce one
or more desired output formats, commonly HTML, XML or PDF.
227
Full Stack Python: 2020 Supporter's Edition
Why are template engines important?
Template engines allow developers to generate desired content types, such as
HTML, while using some of the data and programming constructs such as
conditionals and for loops to manipulate the output. Template files that are
created by developers and then processed by the template engine consist of
prewritten markup and template tag blocks where data is inserted.
For example, look at the first ten source lines of HTML of this webpage:
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="author" content="Matt Makai">
<meta name="description" content="Template engines provide programmatic output of formatted conte
nt such as HTML, XML or PDF.">
<link rel="shortcut icon" href="//static.fullstackpython.com/fsp-fav.png">
Every one of the HTML lines above is standard for each page on Full Stack
Python, with the exception of the <meta name="description"... line
which provides a unique short description of what the individual page
contains.
The base.html Jinja template used to generate Full Stack Python allows every
page on the site to have consistent HTML but dynamically generate the pieces
that need to change between pages when the static site generator executes.
The below code from the base.html template shows that the meta
description is up to child templates to generate.
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta name="author" content="Matt Makai">
{% block meta_header %}{% endblock %}
<link rel="shortcut icon" href="//static.fullstackpython.com/fsp-fav.png">
In a typical WSGI application, the template engine would generate the HTML
output response when an HTTP request comes in for a particular URL.
228
Full Stack Python: 2020 Supporter's Edition
Python template engines
There are several popular Python template engines. A template engine
implementation will fall somewhere on the spectrum between allowing
arbitrary code execution and granting only a limited set of capabilities via
template tags. A rough visual of the code in template spectrum can be seen
below for four of the major Python template engines.
Jinja (Jinja2)
Jinja, also known and referred to as "Jinja2", is a popular Python template
engine written as a self-contained open source project. Some template
engines, such as Django templates are provided as part of a larger web
framework, which can make them difficult to reuse in projects outside their
coupled library.
Major Python open source applications such as the configuration
management tools Ansible and SaltStack as well as the static site generator
Pelican use the Jinja template engine by default for generating output files.
There is a whole lot more to learn about Jinja on the Jinja2 page.
Django templating
Django comes with its own template engine in addition to supporting (as of
Django 1.9) drop-in replacement with other template engines such as Jinja.
Mako template engine
Mako was the default templating engine for the Pylons web framework and is
one of many template engines supported by Pyramid. Mako has wide support
as a replacement template engine for many other web frameworks as well.
229
Full Stack Python: 2020 Supporter's Edition
Other Python template engine implementations
There are numerous Python template engine implementations that range
from weekend hacks to actively developed mature libraries. These template
engines are listed alphabetically:
• Chameleon is an HTML and XML template engine that supports both
Python 2 and 3.
• Cheetah
• Diazo
• evoque
• Genshi
• Juno
• Myghty
• pyratemp
• pystache
Template engine implementation comparisons
There are many Python template engine implementations in addition to the
ones listed above. These resources can help you select a Python template
engine implementation that works well for your project.
• This template engines site contains a range of information from what
templates engines are to listing more esoteric Python template engines.
• Python Template Engine Comparison is an older but still relevant post
by the creator of Jinja that explains why and how he switches between
Mako, Jinja and Genshi for various projects he works on.
• Python Web Frameworks: What are the advantages and disadvantages
of using Mako vs. Jinja2? has some good answers from developers on
Quora about using Mako compared with Jinja2.
Template engine resources
Template engines are often used with web frameworks a black box where
input goes in, and rendered text magically appears out the other side.
230
Full Stack Python: 2020 Supporter's Edition
However, when something unexpected returns from a template engine it is
useful to know how they work to aid your debugging. The following resources
examine existing template engine design as well as how to build your own
engine when that's necessary for your projects.
• How a template engine works uses the template module in Tornado as
an example to step through how a template engine produces output,
from parsing the incoming string to rendering the final output.
• A template engine in 500 lines or less is an article by Ned Batchelder
provides a
template engine in 252 lines of Python that can be used to understand
how template engines work under the cover.
• The world's simplest Python template engine shows how the
format() function can implement a simple template engine with
conditionals, loops and method invocations.
• When to use a Templating Engine (in Python)? is a Stack Overflow
question with a useful answer on why and when to use an existing
template engine.
• Template Engines uses Jinja as an implementation example to explain
the tasks that template engines can be used to perform.
• Approach: Building a toy template engine in Python walks through how
to create your own simple template engine in Python to understand the
basics of how most template engines work.
Jinja2
Jinja, also commonly referred to as "Jinja2" to specify the newest release
version, is a Python template engine used to create HTML, XML or other
markup formats that are returned to the user via an HTTP response.
231
Full Stack Python: 2020 Supporter's Edition
Why is Jinja2 useful?
Jinja2 is useful because it has consistent template tag syntax and the project is
cleanly extracted as an independent open source project so it can be used as a
dependency by other code libraries.
Jinja2 strikes a thoughtful balance on the template engine spectrum where on
one end you can embed arbitrary code in the templates and the other end a
developer can code whatever she wants.
Jinja2 origin and development
The first recorded public released of Jinja2 was in 2008 with 2.0rc1. Since
then the engine has seen numerous updates and remains in active
development.
Jinja2 engine certainly wasn't the first template engine. In fact, Jinja2's
syntax is inspired by Django's built-in template engine, which was released
several years earlier. There were many template systems, such as JavaServer
Pages (JSPs), that originated almost a decade before Jinja2. Jinja2 built upon
the concepts of other template engines and today is widely used by the Python
community.
What projects depend on Jinja2?
Jinja2 is a commonly-used templating engine for web frameworks such as
Flask, Bottle, Morepath and, as of its 1.8 update, optionally Django as well.
Jinja2 is also used as a template language by configuration management tool
Ansible and the static site generator Pelican, among many other similar tools.
232
Full Stack Python: 2020 Supporter's Edition
The idea is that if a developer already knows Jinja2 from working with one
project then the exact same syntax and style can be used in another project
that requires templating. The re-use reduces the learning curve and saves the
open source project author from having to reinvent a new templating style.
Jinja2 resources
• Real Python has a nice Jinja2 primer with many code examples to show
how to use the template engine.
• The second part of the Flask mega tutorial is all about Jinja2 templates.
It walks through control flow, template inheritance and other standard
features of the engine.
• Upgrading to Jinja2 Templates in Django 1.8 With Admin shows how
to fix an issue that can occur with Django 1.8 and using Jinja2 as the
template engine.
• The official Jinja2 template designer documentation is exceptionally
useful both as a reference as well as a full read-through to understand
how to properly work with template tags.
• When you want to use Jinja2 outside of a web framework or other
existing tool, here's a handy quick load function snippet so the
template engine can be easily used from a script or the REPL.
• When working with Jinja2 in combination with LaTeX, some of
Jinja2's blocks can conflict with LaTeX commands. Check out this post
on LaTeX templates with Python and Jinja2 to generate PDFs to
resolve those issues.
• When you use Jinja2 for long enough, eventually you'll want to escape
large blocks of Jinja2-like text in your templates. To do so, you'll need
the "raw" template tag.
• Python Templating Performance Showdown: Django vs Jinja puts
together some benchmarks for how Django templates compare with
Jinja templates. The usual benchmarking caveats apply here but there
are some interesting tests that examine how the template engines
handle large numbers of variables and other factors.
233
Full Stack Python: 2020 Supporter's Edition
• Universal Jinja presents a high-level overview of what you could do
using the Jinja-like Nunchuks library to perform server-side template
rendering for Django applications.
Mako
Mako is a template engine built in Python that is used to generate output
HTML, XML and similar formats.
Mako resources
• Exploring Mako explains a bit about the template engines Myghty and
Mason, which influenced Mako's design. The post then shows a few
basic examples for how to use Mako.
• Configuration Templates with Python and Mako shows some basic
situations for how to use Mako in an example project.
• Flask-Mako is a Flask extension that makes it easier to use Mako as the
template engine in your Flask web app projects.
• The Stack Overflow question on What is the fastest template system for
Python? provides some basic benchmarks comparing Mako, Jinja and
other template engines. Any benchmark should be taken as a data point
rather than a rule on which engine is actually the fastest in real world
scenarios. In addition, if you are using Mako or any other template
engine as part of a static website generator then it will not really matter
which one is the fastest because the output is created before the
234
Full Stack Python: 2020 Supporter's Edition
website is deployed rather than during the web server's HTTP requestresponse cycle.
Django Templates
The Django web framework contains its own template engine for generating
HTML, XML and other output formats.
What's the difference between a "project template" and
Django templates?
A project template contains the files and code to start a new web application.
For example, when you run django-admin.py startproject abc , the
Django admin script creates a new abc directory along with several Python
configuration so the web app can be run by a WSGI server.
Django templates are different from a project template because they live
within a project and are written by the developer to generate output, most
commonly HTML.
Django template resources
• Make ALL Your Django Forms Better presents some tricks for
customizing Django templates to handle the widgets on your site.
• Python Templating Performance Showdown: Django vs Jinja provides
some benchmarks for how Django templates compare with Jinja
templates. Note that as with any benchmarks these only provide a few
data points that can be useful rather than a definitive statement that
one tool is always faster than the other.
235
Full Stack Python: 2020 Supporter's Edition
• Reconciling Backend Templates with Frontend Components explains
how to use React components with the traditional server-side Django
templates despite some mismatch in how each tool approaches the end
goal of rendering a webpage.
• When and how to use Django TemplateView is not specifically about
using the Django template engine, but instead how to use the
TemplateView in your views which lead directly into rendering a
template.
Web Design
Web design is the creation of a web application's style and user interaction
using CSS and JavaScript.
Why is web design important?
You wouldn’t use a web application that looked like the following screenshot,
would you?
Creating web pages with their own style and interactivity so users can easily
accomplish their tasks is a major part of building modern web applications.
236
Full Stack Python: 2020 Supporter's Edition
Getting started if you have no "eye" for design
Design can feel like something "creative" people understand intuitively, but
like all skills design is something that can be learned. Some people are faster
learners in design just like some folks are quicker in picking up programming.
But anyone can learn how to be a better designer by learning the basic
principles and practicing them.
One of the best mental models for basic design is C.R.A.P., which helped me
grasp why some designs look good while others do not. CRAP is an acronym
for:
*
*
*
*
Contrast: noticeable differences from one element to another
Repetition: elements' consistency
Alignment: order among all elements
Proximity: placement between elements and how they are organized
These basic principles all you to start breaking down the problem into
digestible pieces that you can work on rather than feeling like you "just don't
have an eye for design".
Designing for various screen sizes
Separating the content from the rules for how to display the content allows
devices to render the output differently based on factors such as screen size
and device type. Displaying content differently based on varying screen
attributes is often called responsive design. The responsiveness is
accomplished by implementing media queries in the CSS.
For example, a mobile device does not have as much space to display a
navigation bar on the side of a page so it is often pushed down below the main
content. The Bootstrap Blog example shows that navigation bar relocation
scenario when you resize the browser width.
Fantastic design resources
There are way too many design resources on the web, so I picked out this
short list as my absolute favorites that help developers become (hopefully
much) better with design.
• Clean up your mess: A Guide to Visual Design for Everyone walks
through the basic principles for clean and effective design. You can
237
Full Stack Python: 2020 Supporter's Edition
make a website go from terrible to well-designed often by following a
few principles on spacing, alignment, contrast and repetition of page
elements.
• Resilient web design is an incredible online book that teaches how to
create websites that are accessible to every reader and look great while
doing it.
• Design 101 for Developers gives away the "secrets" to good design that
designers follow but that can be similarly accessible to developers who
understand what they want their design to accomplish.
• Laws of UX provides a beautiful overview of design principles for
building user experiences. Highly recommended even if just to see how
the information is presented.
• How I Work with Color is a fantastic article from a professional
designer on how he thinks about color and uses it for certain effects in
his designs.
• Building dark mode on Stack Overflow explains the thought process
and work that the team at Stack Overflow had to do with colors, styling
and the code implementation so they could offer a dark mode design to
their users.
• A short history of body copy sizes on the Web provides a useful
examination of how originally the web's typography mirrored what was
in traditional print. Eventually, the designs shifted upwards in font size
because screen resolutions changed. However, even in 2020 there is no
consensus for what font size to use in various situations so designers
simply have to do what they've always done and try their sites on
various devices and screen sizes.
• Web bloat is the story of traveling and using the web with low
bandwidth, high latency internet connections that often drop packets.
The author explains how many websites are barely usable and that if
you truly want your site to work well you need to ensure it works for
connections much worse than the fiber connection you may have at the
home or office.
238
Full Stack Python: 2020 Supporter's Edition
• Fundamental design principles for non-designers summarizes
numerous design tips into four principles that those without prior
design knowledge can follow. The author gives a bunch of great
examples and further details for the four principles, which are contrast,
consistency, Occam's Razor and space.
• Gallery of web design history is a collection of websites from between
1991 and 2006 that show the evolution of what the web looked like
before the modern CSS era. This is a great resource to see how websites
evolved, such as Microsoft in 1996 and YouTube in 2005.
• The Average Web Page (Data from Analyzing 8 Million Websites)
shows the most frequently used HTML elements, metadata, text
content and other statistics from a large scale analysis of the web.
• How to design delightful dark themes explains some subtle tactics to
make dark themes work well for users.
• Setting height and width on images is important again describes how
browser layout and resizing settings could affect your images so
manually setting the size is more useful than it has been in the past few
years.
• Kuler is a complementary color picker by Adobe that helps choose
colors for your designs.
• If you want to learn more about how browsers work behind the scenes
to render a webpage's design, here is a blog post series on building a
browser engine that will show you how to build a simple rendering
engine.
• How to Use C.R.A.P. Design Principles for Better UX has a good
summary of what contrast, repetition, alignment and proximity means
for designing user interfaces.
• Defining Colors in CSS presents how to define color in your Cascading
Style Sheets (CSS) and breaks down the differences between specifying
predefined color values, hexadecimal values, red-green-blue (RGB) and
Hue-Saturation-Lightness (HSL).
239
Full Stack Python: 2020 Supporter's Edition
• Easy to Remember Color Guide for Non-Designers gives guidance to
less aesthetically-inclined folks like myself who need rules for picking
groups of colors to use together in your designs.
• Styling HTML checkboxes is hard - here's why explains why form
elements like checkboxes are more difficult to style than other HTML
elements, The post shows how to perform the styling in various ways
such as CSS-only and then with JavaScript plus CSS.
• 13 Terrible Web Trends From the 90s, and How to Recreate Them
revisits a simpler and perhaps more... ugly time on the web where
designs were a bit out of control. Learn more about the history of web
design and styling techniques with this hilarious but useful blog post.
Checklists and design guidelines
• Frontend Guidelines is an amazing write up of good practices for
HTML, CSS and JS.
• Learn Design Principles is a well thought out clear explanation for how
to think about design according to specific rules such as axis,
symmetry, hierarchy and rhythm.
• Front-end performance checklist is a comprehensive checklist useful
when you are implementing a web application's client side code.
• Front-end Developer Handbook 2018 provides a high-level overview of
many of the tools developers use in the browser and some other useful
information on average salaries and where to search for front-end
development jobs.
• Who Can Use shows how color contrast can affect different people with
visual impairments and can help you improve your site's accessibility to
these groups.
Hypertext Markup Language (HTML)
Hypertext Markup Language (HTML) is the standard markup language that is
downloaded from a web server to a web browser and is used to display
website and web application content.
240
Full Stack Python: 2020 Supporter's Edition
HTML resources
• Mozilla's HTML page and introduction to HTML break down HTML by
elements, metadata and forms before diving into more advanced web
development topics.
• A few HTML tips is written for beginners who are learning HTML. The
article gives guidance on common mistakes to avoid and what to do
instead to write proper HTML.
• CodeAcademy's HTML basics course provides an interactive
environment for learning the gist of the markup language.
• A list of everything that could go in the head of your document provides
a comprehensive reference for elements that are required or optional in
the <head> element of your webpage.
• (Why) Some HTML is "optional" gives historical context for the <p>
paragraph element as an example to explain how HTML was originally
designed. The backwards-compatibility remains because there is not
enough optimization juice to squeeze from changing the
implementation compared to the backwards-breaking changes in
rendering existing sites.
• A Complete Guide to Links and Buttons extensively covers what might
be thought of as a simple topic: the a and button elements in HTML,
along with their many attributes and quirks.
• HTML: the inaccessible parts explains how even basic HTML elements
can cause accessibility issues for screen readers and other devices that
help people with impairments to use the web.
241
Full Stack Python: 2020 Supporter's Edition
Cascading Style Sheets (CSS)
Cascading Style Sheet (CSS) files contain rules for how to display and lay out
the HTML content when it is rendered by a web browser.
Why is CSS necessary?
CSS separates the content contained in HTML files from how the content
should be displayed. It is important to separate the content from the rules for
how it should be rendered primarily because it is easier to reuse those rules
across many pages. CSS files are also much easier to maintain on large
projects than styles embedded within the HTML files.
How is CSS retrieved from a web server?
The HTML file sent by the web server contains a reference to the CSS file(s)
needed to render the content. The web browser requests the CSS file after the
HTML file as shown below in a screenshot captured of the Chrome Web
Developer Tools network traffic.
242
Full Stack Python: 2020 Supporter's Edition
That request for the fsp.css file is made because the HTML file for Full Stack
Python contains a reference to theme/css/fsp.css which is shown in the
view source screenshot below.
CSS preprocessors
A CSS preprocessor compiles a processed language into plain CSS code. CSS
preprocessing languages add syntax such as variables, mixins and functions to
reduce code duplication. The additional syntax also makes it possible for
243
Full Stack Python: 2020 Supporter's Edition
designers to use these basic programming constructs to write maintainable
front end code.
• Sass is currently the favored preprocessor in the design community.
Sass is considered the most powerful CSS preprocessor in terms of
advanced language features.
• LESS is right up there with Sass and has an ace up its sleeve in that the
Bootstrap Framework is written in LESS which brings up its
popularity.
• Stylus is often cited as the third most popular CSS preprocessing
language.
CSS preprocessor resources
• The Advanced Guide to HTML and CSS book has a well-written
chapter on preprocessors.
• Sass vs LESS provides a short answer on which framework to use then
a longer more detailed response for those interested in understanding
the details.
• How to choose the right CSS preprocessor has a comparison of Sass,
LESS and Stylus.
• Musings on CSS preprocessors contains helpful advice ranging from
how to work with preprocessors in a team environment to what apps
you can use to aid your workflow.
CSS libraries and frameworks
CSS frameworks provide structure and a boilerplate base for building a web
application's design.
• Bootstrap
• Foundation
• Gumby
• Compass
• Profound Grid
244
Full Stack Python: 2020 Supporter's Edition
• Skeleton
• HTML5 Boilerplate
• Spectre
CSS resources
• The languages which were almost CSS contains the history of what
might have been if other styling proposals were adopted instead of CSS,
such as RRP, PWP, FOSI, DSSSL, PSL96 and CHSS. Many of those
proposals came before CSS was first published as a specification in
1996 so the article is a wonderful view into the Web in its infancy.
• Frontend Development Bookmarks is one of the largest collections of
valuable resources for frontend learning both in CSS as well as
JavaScript.
• This series on how CSS works including How CSS works: Parsing &
painting CSS in the critical rendering path and How CSS works:
Understanding the cascade examines the rendering methods browsers
use to display web pages along with details of the algorithms they use
to cascade style rules.
• CSS Reference provides much-needed visual examples for every CSS
property to show you what they are actually going to look like on your
pagee when you use them.
• CSS coding techniques provides advice on how to write simpler, easierto-maintain CSS code to reduce your need to rely on CSS preprocessors
and build pipelines.
• CSS refresher notes is incredibly helpful if you've learned CSS in bits
and pieces along the way and you now want to fill in the gaps in your
knowledge.
• Mozilla Developer Network's CSS page contains an extensive set of
resources, tutorials and demos for learning CSS.
• CSS Positioning 101 is a detailed guide for learning how to do element
positioning correctly with CSS.
245
Full Stack Python: 2020 Supporter's Edition
• Did CSS get more complicated since the late nineties? is a solid look
back at how CSS evolved and where it has ended up today compared to
its origins.
• Using feature queries in CSS covers the @supports rule and how to
use it in your stylesheets.
• Learn CSS layout is a simple guide that breaks CSS layout topics into
chapters so you can learn each part one at a time.
• How well do you know CSS display? zooms into a single CSS property,
display , to teach the reader about it in-depth.
• Google's Web Fundamentals class shows how to create responsive
designs and performant websites.
• Tailoring CSS for performance is an interesting read since many
developers do not consider the implications of CSS complexity in
browser rendering time.
• Can I Use... is a compatibility table that shows which versions of
browsers implement specific CSS features.
• How do you remove unused CSS from a site? covers tools for
identifying unnecessary CSS and the process for eliminating rules that
are overwritten by other rules and therefore do not need to be sent to
the client.
• The Web Design Museum is an amazing look back at how web design
has evolved over the past 25+ years. Some of the designs can still be
seen in their current site's presentation such as the top navigation of
Apple's 2001 site.
• The invisible parts of CSS asks the question "can you describe what the
display:block property and value do? Most developers would have
some sense of what it is for but could not explain it to someone else
beyond that. The article helps fix this situation with display as well
as other less visible properties such as floats and auto width.
• A brief history of CSS until 2016 explains how CSS originated at CERN
in 1994 as a solution to the problem of HTML not having reasonable
styling features (in-line style attributes on elements don't count).
246
Full Stack Python: 2020 Supporter's Edition
• Old CSS, New CSS tells the wonderful and painful story of web design
in the early days of the Web, when inline styling was required, HTML
CAPS were mandatory, and most websites wished they had design
styles like the amazing Space Jam pages. Oh also, lots and lots of talk
about tables, because that was the only way to position anything back
in the day.
• 30 seconds of CSS provides short useful code snippets for you to learn
from and use for building your own web applications.
• CSS: The bad bits examines global scope, implicit percentage styling
rules and the z-index which can be difficult to use and require some
restraint to ensure they do not cause issues for the rest of your
stylesheet rules as you create and maintain your frontend.
• Improving Your CSS with Parker shows how to use the static CSS
analysis tool Parker to improve your stylesheets.
• CSS and network performance analyzes how splitting your CSS can
affect browser render times and how you can improve your site loading
performance by changing how you structure your CSS files. My
recommendation: there's a lot you can do with these techniques but it
is probably a better idea to make your CSS simpler and cut down the
massive bloat that can accumulate as you build your site as a first step
to improving your performance.
• A Guide To CSS Support In Browsers analyzes features versus bugs in
different browser versions and how to test for support so that issues are
less likely to hit your web app.
• That Time I Tried Browsing the Web Without CSS is an enlightening
look at how crucial CSS is to the modern web. You can view examples
of what it is like to use Amazon, DuckDuckGo, GitHub and several
other sites.
• Third party CSS is not safe is a good reminder that any code you did
not write yourself, especially code served through 3rd party sources not
under your control can contain potentially malicious applications, such
as the experimental CSS keylogger hack that made the rounds in early
2018.
247
Full Stack Python: 2020 Supporter's Edition
• Understanding specificity in CSS provides some wonderful detail on
the various ways to select elements via CSS selectors, including type
selectors, pseudo-elements, class selectors, attribute selectors and ID
selectors.
• Realistic Water Effect without JavaScript - HTML/CSS Only is one of
the coolest tutorials I have seen that uses CSS to create a water effect
over an image. This video provides an example of how there are so
many incredible ways to use CSS and web development
technologies.
CSS learning checklist
1. Create a simple HTML file with basic elements in it. Use the
python -m SimpleHTTPServer command to serve it up. Create a
<style></style> element within the <head> section in the HTML
markup. Play with CSS within that style element to change the look and
feel of the page.
2. Check out front end frameworks such as Bootstrap and Foundation and
integrate one of those into the HTML page.
3. Work through the framework's grid system, styling options and
customization so you get comfortable with how to use the framework.
4. Apply the framework to your web application and tweak the design
until you have something that looks much better than generic HTML.
Responsive Design
Responsive design is a method of using media queries in Cascading Style
Sheets (CSS) to ensure a site is usable on various devices with different screen
sizes, from very small to very large.
248
Full Stack Python: 2020 Supporter's Edition
Full Stack Python uses responsive design to improve readability across a
broad range of devices that people use to read this site.
Responsive design resources
• The Responsive Design podcast and accompanying email newsletter
interview web designers who've dealt with creating responsive designs
from both a blank slate and retrofitting into an existing site.
• This site is entirely on responsive design and has many articles on how
to achieve readability across devices.
• Using Media Queries For Responsive Design In 2018 contains a bunch
of great examples for how to use media queries to achieve a responsive
design.
• Making Tables Responsive With Minimal CSS steps through a couple of
ways to responsively handle tables.
• Smashing Magazine's article from 2011 on responsive design - what it is
and how to use it remains a definitive source for understanding why
sites should be responsive.
• Fundamentals of responsive images shows the code behind several
ways to make your web images resize according to browser window
size.
• ResponsiveViewer is a Chrome plugin in that allows you to view
multiple sizes and devices in the same browser window. This is a handy
249
Full Stack Python: 2020 Supporter's Edition
tool to avoid constantly flipping between tabs or resizing the same tab
to see the changes you are making to a design.
Minification
Minification is the process of reducing the size of static content assets that
need to be transferred from a web server to the web browser client. The goal
of minification is to make webpages and web applications load faster, without
changing how the assets are ultimately used after being downloaded.
Cascading Style Sheet (CSS) files, JavaScript and even HTML can be minified.
The techniques to minify an HTML file differ from a CSS file because the file's
contents and syntax are different.
CSS Minification example
Let's say your web application has a CSS file with the following four lines to
add some padding around every element with the pad-me class:
.pad-me { padding-top: 10px;
padding-right: 5px;
padding-left: 5px;
padding-bottom: 10px; }
That example has 122 characters. A CSS minifier would take the above four
lines and convert them to the following single line:
.pad-me{padding:10px 5px 5px 10px}
The result would have only a single line with 35 characters, that's 87 less
characters that would need to be sent from the web server to the web browser!
The minification process reduced the single CSS class by 71% but kept the
exact same result when the web browser applies pad-me to all elements with
that class.
The file size savings can be a major benefit when applied across hundreds or
thousands of lines in a typical CSS file. When you also multiply that savings
across every client that downloads the CSS from your web server or CDN it
becomes clear that minification is a powerful way to improve the efficiency of
your production web application.
250
Full Stack Python: 2020 Supporter's Edition
CSS minification resources
• CSS minifier is an online form to copy and paste CSS then retrieve the
minified results on the same page.
JavaScript minification resources
• JavaScript minifier is similar to the above CSS minifier but works with
JavaScript syntax.
CSS Frameworks
A Cascading Style Sheet (CSS) framework is a code library that abstracts
common web designs and makes the designs easier for a developer to
implement in their web apps.
CSS framework implementations
• Bootstrap
• Foundation
• Screen
• Materialize
• Concise
CSS framework resources
• Which Responsive Design Framework Is Best? Of Course, It Depends.
• awesome-css-frameworks
• Bulma: CSS framework you should consider in 2018
• An Introduction to Tailwind CSS explains the basics of getting started
and using this newer CSS framework.
Bootstrap
Bootstrap is a CSS framework that makes it easier to create website and web
application user interfaces. Bootstrap is especially useful as a base layer of
CSS to build sites with responsive web design.
251
Full Stack Python: 2020 Supporter's Edition
Should I use Bootstrap 3 or 4?
Bootstrap 3 was out for almost five years before Bootstrap 4 was finally
released at the start of 2018. When learning Bootstrap you will likely come
across many tutorials that cover Bootstrap 3 but not version 4.
If you are completely new to Bootstrap and CSS frameworks in general then I
recommend learning Bootstrap 4. If you have already been working with
Bootstrap 3 then there is no major rush to upgrade to the latest version.
Bootstrap 4 is more complicated than version 3 because it has a lot more
features so the learning curve is a bit steeper.
Full Stack Python is actually built with an early version of Bootstrap 3.
However, this site is so heavily customized with my own CSS that I likely will
never upgrade to Bootstrap 4 because there are no new features that I feel will
be useful in my specific situation.
Bootstrap resources
• Getting Started with Bootstrap provides a walkthrough of a sample
starter template so you can understand the CSS classes that Bootstrap
uses. There is also a follow up post that shows how to create an
effective sales page with Bootstrap 3.
• What is Bootstrap and how do I use it? is an awesome tutorial that goes
through many of the main Bootstrap elements such as icons, navigation
bars and "jumbotron"-style webpages.
• Bootstrap 5 tutorial covers the alpha version of the fifth major revision
for Bootstrap.
252
Full Stack Python: 2020 Supporter's Edition
• Bootstrap 4: What's New? shows a slew of comparisons for how user
interface elements have changed design over the past several Bootstrap
major version iterations.
• Tabler is a free, open source admin theme for Bootstrap 4. You can also
check out a demo of the theme.
• How to use Bootstrap 4 forms with Django shows how to combine the
popular django-crispy-forms library with Django to create and obtain
user data through web forms that are styled with Bootstrap 4 CSS.
• React Bootstrap (source code replaces the existing Bootstrap
JavaScript with React components that do not rely on jQuery.
Foundation
Foundation is a Cascading Style Sheet (CSS) framework that makes it easier to
create website and web application user interfaces. Bootstrap is especially
useful as a base layer of CSS to build sites with responsive web design.
JavaScript
JavaScript is a scripting programming language interpretted by web browsers
that enables dynamic content and interactions in web applications.
Why is JavaScript necessary?
JavaScript executes in the client and enables dynamic content and interaction
that is not possible with HTML and CSS alone. Every modern Python web
application uses JavaScript on the front end.
253
Full Stack Python: 2020 Supporter's Edition
Front end frameworks
Front end JavaScript frameworks move the rendering for most of a web
application to the client side. Often these applications are informally referred
to as "one page apps" because the webpage is not reloaded upon every click to
a new URL. Instead, partial HTML pages are loaded into the document object
model or data is retrieved through an API call then displayed on the existing
page.
Examples of these front end frameworks include:
• React
• Angular.js
• Vue.js
• Backbone.js
• Ember.js
Front end frameworks are rapidly evolving. Over the next several years
consensus about good practices for using the frameworks will emerge.
How did JavaScript originate?
JavaScript is an implementation of the ECMAScript specification which is
defined by the Ecma International Standards Body. Read this paper on the
first 20 years of JavaScript by Brendan Eich, the original creator of the
programming language, and Allen Wirfs-Brock for more context on the
language's evolution.
JavaScript resources
• How Browsers Work is a great overview of both JavaScript and CSS as
well as how pages are rendered in a browser.
• This step-by-step tutorial to build a modern JavaScript stack is useful
to understanding how front-end JS frameworks such as Angular and
Ember.js work.
• A re-introduction to JavaScript by Mozilla walks through the basic
syntax and operators.
254
Full Stack Python: 2020 Supporter's Edition
• The Cost of Javascript Frameworks presents a balanced view of how
JavaScript libraries and frameworks impact web performance on both
desktop and mobile. The author acknowledges that these libraries are
useful for developers but in extreme cases there are significant
downsides to including so much JavaScript in pages.
• The State of JavaScript report contains a wealth of data on what
libraries developers are using in the JavaScript ecosystem. There are
also reports from previous years which show how the community's
preferences have changed over time.
• Coding tools and JavaScript libraries is a huge list by Smashing
Magazine with explanations for each tool and library for working with
JavaScript.
• Superhero.js is an incredibly well designed list of resources for how to
test, organize, understand and generally work with JavaScript.
• Unheap is an amazing collection of reusable JQuery plugins for
everything from navigation to displaying media.
• The Modern JavaScript Developer's Toolbox provides a high-level
overview of tools frequently used on the client and server side for
developers using JavaScript in their web applications.
• Front-end Walkthrough: Designing a Single Page Application
Architecture covers what a single page app (SPA) architecture looks
like, what the tools are that you can use and some comparisons when
deciding between Angular and React.
• Developing a Single Page App with Flask and Vue.js is a step-by-step
walkthrough of how to set up a basic CRUD app with Vue.js and Flask.
• A Guide to Console Commands shows off what JavaScript commands
you can use in your browser's console, which is a typical debugging
pattern for JavaScript development.
• is-website-vulnerable is an open source tool that identifies security
vulnerabilities based on the front end JavaScript code a web
application runs.
255
Full Stack Python: 2020 Supporter's Edition
• SPAs are harder, and always will be talks about the inherent complexity
in building client-side user interfaces with JavaScript.
• The Deep Roots of Javascript Fatigue starts by covering the non-stop
library churn in the JavaScript ecosystem and then relates JavaScript
evolution since the mid-90s to explain the history of the problem.
• How JavaScript works: the rendering engine and tips to optimize its
performance is one particularly relevant post in a multi-part series that
explains how you can optimize slower JavaScript code to better suit the
JS engines in common web browsers.
• How to reduce the impact of JavaScript on your page load time
provides insight into minimizing the size and improving the download
and execution speed for JavaScript libraries, especially ones that are
used at scale by many websites.
• Learn JS data teaches how to manipulate data using JavaScript in a
browser or on the server using Node.js.
• A Beginner's Guide to JavaScript's Prototype explains the
fundamentals of JavaScript's object system, which is a prototype-based
model and different from many other common programming
languages' object models.
• Understanding Data Types in JavaScript examines JavaScript's
dynamic data type model and how it manifests in the way numbers,
string, Booleans and arrays are used.
JavaScript learning checklist
1. Create a simple HTML file with basic elements in it. Use the
python -m SimpleHTTPServer command to serve it up. Create a
<script type="text/javascript"></script> element at the end
of the <body> section in the HTML page. Play with JavaScript within
that element to learn the basic syntax.
2. Download JQuery and add it to the page above your JavaScript
element. Start working with JQuery and learning how it makes basic
JavaScript easier.
256
Full Stack Python: 2020 Supporter's Edition
3. Work with JavaScript on the page. Incorporate examples from open
source projects listed below as well as JQuery plugins. Check out
Unheap to find a large collection of categorized JQuery plugins.
4. Check out the JavaScript resources below to learn more about
advanced concepts and open source libraries.
5. Integrate JavaScript into your web application and check the static
content section for how to host the JavaScript files.
React
React is a JavaScript web application framework for building rich user
interfaces that run in web browsers.
Beginner React tutorials
Generally before you start working with React you will want to learn how to
build your Python backend with a web framework such as Django, Flask or
Pyramid. Once you get comfortable with the web development basics with one
of those frameworks as well as JavaScript then it will be much easier to tack
on React to build your client-side user interfaces.
• JavaScript fundamentals before learning React provides a gut check
that you have the prerequisite knowledge to avoid getting impossibly
stuck while trying to learn React.
• The official React tutorial is one of the best ways to start using React
because many other tutorials quickly fall out of date while this one
tends to stick to the basics that are relevant to beginners.
257
Full Stack Python: 2020 Supporter's Edition
• This Modern Django 4-part tutorial series is well-done, has freely
available source code and includes:
1. Setting up Django and React
2. Redux and React Router setup
3. Creating an API and integrating with React
4. Adding authentication to React SPA using DRF
• Django REST with React (Django 2.0 and a sprinkle of testing)
combines a Django plus Django REST Framework (DRF) backend with
React on the front end and shows how to stich it all together.
• Build a Simple CRUD App with Python, Flask, and React shows how to
combine a Flask backend with React.
• Learn React app is a Git repository with a code tutorial and instructions
for how to follow along, as well as exercises to ensure you are tested as
you go.
• 9 things every React.js beginner should know is not a tutorial but
instead the author gives some strong opinions for what beginners
should know as they start learning React.
• React Bootstrap (source code replaces the existing Bootstrap
JavaScript with React components that do not rely on jQuery.
Other React resources
• React interview questions is a good quiz to see what you know or still
need to learn about the fundamentals of using React.
• Under-the-hood-ReactJS examines the React codebase itself rather
than teaching you how to use it.
Vue.js
Vue.js (source code) is a JavaScript web application framework for building
rich apps that run in web browsers.
258
Full Stack Python: 2020 Supporter's Edition
Vue.js-related projects
• flask-vue-spa is an example project with a Flask API on the backend
and Vue on the front.
• django-vue-template contains example code for a Django backend.
There is also another project named django-vue-template by a different
developer so take a peek at that one as well if the first one does not suit
your needs.
• vuepress (source code) is a static site generator that uses Vue and
Markdown to create pre-rendered static HTML pages.
Vue.js resources
• A friendly introduction to Vue.js contains the code and brief
explanations of what it's doing so you can learn to create your first Vue
app.
• Developing a Single Page App with Flask and Vue.js walks through all
of the environment configuration, project setup and coding you need to
do to build a legitimate Vue.js application that uses a Flask API on the
backend.
• Building a SaaS using Django and Vue.js shows how to create a
software-as-a-service application with a Django back end and Vue on
the front end via a series of videos that are each between 10-15 minutes
in length.
259
Full Stack Python: 2020 Supporter's Edition
• Learn Vue by Building and Deploying a CRUD App is an awesome
course that walks through the Vue basics and building a whole
application and deploying it to Netlify.
• The official Vue.js getting started documentation is fantastic and highly
recommended as a top starting resource.
• Replacing jQuery With Vue.js: No Build Step Necessary explains why
you may want to replace existing jQuery code with Vue and how to do
so with minimal steps, depending on the complexity of your
applicationi code, of course!
• Why we chose Vue.js covers GitLab's reasons for using this JavaScript
framework over other options.
• The Vue.js publishes their own documentation page on how Vue
compares with other frameworks. It is refreshing to see a
straightforward technical analysis without the posturing that often
comes from authors of one project discussing other work in the same
space.
• Build a Crud application using Vue and Django combines Django and
Django REST Framework with Vue.js in this tutorial that shows how to
build a simple web app for managing subscriptions.
• Building Modern Applications with Django and Vue.js combines
Django, Django REST Framework, Vue.js and Auth0 (an
authentication web API) in an introductory web application.
• Vue.js And SEO: How To Optimize Reactive Websites For Search
Engines And Bots shows how pre-rendering and various other
attributes of JavaScript MVC frameworks like Vue.js can negatively
impact your SEO. It then walks through the most important items to
address to mitigate these problems on your own Vue-based sites.
• Exploring Data with Serverless and Vue: Automatically Update GitHub
Files With Serverless Functions combines a serverless backend with
Vue.js on the front.
260
Full Stack Python: 2020 Supporter's Edition
Angular
Angular is a JavaScript web application framework for building rich apps that
run in web browsers.
Angular resources
• This end to end web app with Django-Rest-Framework & AngularJS
part 1 tutorial along with part 2, part 3 and part 4 creates an example
blog application with Djangular. There is also a corresponding GitHub
repo for the project code.
• Django-angular is a code library that aims to make it easier to pair
Django with AngularJS on the front end.
• Developing a Real-Time Taxi App with Django Channels and Angular
this course teaches you how to build and test a real-time ride-sharing
app with Django Channels and Angular.
Task Queues
Task queues manage background work that must be executed outside the
usual HTTP request-response cycle.
Why are task queues necessary?
Tasks are handled asynchronously either because they are not initiated by an
HTTP request or because they are long-running jobs that would dramatically
reduce the performance of an HTTP response.
For example, a web application could poll the GitHub API every 10 minutes to
collect the names of the top 100 starred repositories. A task queue would
handle invoking code to call the GitHub API, process the results and store
them in a persistent database for later use.
261
Full Stack Python: 2020 Supporter's Edition
Another example is when a database query would take too long during the
HTTP request-response cycle. The query could be performed in the
background on a fixed interval with the results stored in the database. When
an HTTP request comes in that needs those results a query would simply fetch
the precalculated result instead of re-executing the longer query. This
precalculation scenario is a form of caching enabled by task queues.
Other types of jobs for task queues include
• spreading out large numbers of independent database inserts over time
instead of inserting everything at once
• aggregating collected data values on a fixed interval, such as every 15
minutes
• scheduling periodic jobs such as batch processes
Task queue projects
The defacto standard Python task queue is Celery. The other task queue
projects that arise tend to come from the perspective that Celery is overly
complicated for simple use cases. My recommendation is to put the effort into
Celery's reasonable learning curve as it is worth the time it takes to
understand how to use the project.
• The Celery distributed task queue is the most commonly used Python
library for handling asynchronous tasks and scheduling.
• The RQ (Redis Queue) is a simple Python library for queueing jobs and
processing them in the background with workers. RQ is backed by
Redis and is designed to have a low barrier to entry.
• Taskmaster is a lightweight simple distributed queue for handling large
volumes of one-off tasks.
• Huey is a Redis-based task queue that aims to provide a simple, yet
flexible framework for executing tasks. Huey supports task scheduling,
crontab-like repeating tasks, result storage and automatic retry in the
event of failure.
262
Full Stack Python: 2020 Supporter's Edition
• Kuyruk is simple and easy to use task queue system built on top of
RabbitMQ. Although feature set is small, new features can be added by
extensions.
• Dramatiq is a fast and reliable alternative to Celery. It supports
RabbitMQ and Redis as message brokers.
• django-carrot is a simple task queue specifically for Django that can
serve when Celery is overkill.
• tasq is a brokerless task queue for simple use cases. It is not
recommended for production unless further testing and development is
done.
Hosted message and task queue services
Task queue third party services aim to solve the complexity issues that arise
when scaling out a large deployment of distributed task queues.
• Iron.io is a distributed messaging service platform that works with
many types of task queues such as Celery. It also is built to work with
other IaaS and PaaS environments such as Amazon Web Services and
Heroku.
• Amazon Simple Queue Service (SQS) is a set of five APIs for creating,
sending, receiving, modifying and deleting messages.
• CloudAMQP is at its core managed servers with RabbitMQ installed
and configured. This service is an option if you are using RabbitMQ
and do not want to maintain RabbitMQ installations on your own
servers.
Open source examples that use task queues
• Take a look at the code in this open source Flask application and this
Django application for examples of how to use and deploy Celery with a
Redis broker to send text messages with these frameworks.
• flask-celery-example is a simple Flask application with Celery as a task
queue and Redis as the broker.
263
Full Stack Python: 2020 Supporter's Edition
• django_dramatiq_example and flask_dramatiq_example are simple
apps that demo how you can use Dramatiq with Django and Flask,
respectively.
Task queue resources
• International Space Station notifications with Python and Redis Queue
(RQ) shows how to combine the RQ task queue library with Flask to
send text message notifications every time a condition is met - in this
blog post's case that the ISS is currently flying over your location on
Earth.
• Evaluating persistent, replicated message queues is a detailed
comparison of Amazon SQS, MongoDB, RabbitMQ, HornetQ and
Kafka's designs and performance.
• Queues.io is a collection of task queue systems with short summaries
for each one. The task queues are not all compatible with Python but
ones that work with it are tagged with the "Python" keyword.
• Why Task Queues is a presentation for what task queues are and why
they are needed.
• Asynchronous Processing in Web Applications Part One and Part Two
are great reads for understanding the difference between a task queue
and why you shouldn't use your database as one.
• Flask by Example Implementing a Redis Task Queue provides a
detailed walkthrough of setting up workers to use RQ with Redis.
• Heroku has a clear walkthrough for using RQ for background tasks.
• How to use Celery with RabbitMQ is a detailed walkthrough for using
these tools on an Ubuntu VPS.
• Celery - Best Practices explains things you should not do with Celery
and shows some underused features for making task queues easier to
work with.
• Celery in Production on the Caktus Group blog contains good practices
from their experience using Celery with RabbitMQ, monitoring tools
and other aspects not often discussed in existing documentation.
264
Full Stack Python: 2020 Supporter's Edition
• A 4 Minute Intro to Celery is a short introductory task queue
screencast.
• This Celery tasks checklist has some nice tips and resources for using
Celery in your applications.
• Heroku wrote about how to secure Celery when tasks are otherwise
sent over unencrypted networks.
• Miguel Grinberg wrote a nice post on using the task queue Celery with
Flask. He gives an overview of Celery followed by specific code to set up
the task queue and integrate it with Flask.
• Ditching the Task Queue for Gevent explains how in some cases you
can replace the complexity of a task queue with concurrency. For
example, you can remove Celery in favor of gevent.
• 3 Gotchas for Working with Celery are things to keep in mind when
you're new to the Celery task queue implementation.
• Setting up an asynchronous task queue for Django using Celery and
Redis is a straightforward tutorial for setting up the Celery task queue
for Django web applications using the Redis broker on the back end.
• Three quick tips from two years with Celery provides some solid advice
on retry delays, the -Ofair flag and global task timeouts for Celery.
• Asynchronous Tasks with Flask and Redis Queue looks at how to
configure Redis Queue to handle long-running tasks in a Flask app.
• Developing an Asynchronous Task Queue in Python looks at how to
implement several asynchronous task queues using Python's
multiprocessing library and Redis.
Task queue learning checklist
1. Pick a slow function in your project that is called during an HTTP
request.
2. Determine if you can precompute the results on a fixed interval instead
of during the HTTP request. If so, create a separate function you can
call from elsewhere then store the precomputed value in the database.
265
Full Stack Python: 2020 Supporter's Edition
3. Read the Celery documentation and the links in the resources section
below to understand how the project works.
4. Install a message broker such as RabbitMQ or Redis and then add
Celery to your project. Configure Celery to work with the installed
message broker.
5. Use Celery to invoke the function from step one on a regular basis.
6. Have the HTTP request function use the precomputed value instead of
the slow running code it originally relied upon.
Celery
Celery is a task queue implementation for Python web applications used to
asynchronously execute work outside the HTTP request-response cycle.
Why is Celery useful?
Task queues and the Celery implementation in particular are one of the
trickier parts of a Python web application stack to understand.
If you are a junior developer it can be unclear why moving work outside the
HTTP request-response cycle is important. In short, you want your WSGI
server to respond to incoming requests as quickly as possible because each
request ties up a worker process until the response is finished. Moving work
off those workers by spinning up asynchronous jobs as tasks in a queue is a
straightforward way to improve WSGI server response times.
What's the difference between Celeryd and Celerybeat?
Celery can be used to run batch jobs in the background on a regular schedule.
A key concept in Celery is the difference between the Celery daemon (celeryd),
266
Full Stack Python: 2020 Supporter's Edition
which executes tasks, Celerybeat, which is a scheduler. Think of Celeryd as a
tunnel-vision set of one or more workers that handle whatever tasks you put
in front of them. Each worker will perform a task and when the task is
completed will pick up the next one. The cycle will repeat continously, only
waiting idly when there are no more tasks to put in front of them.
Celerybeat on the other hand is like a boss who keeps track of when tasks
should be executed. Your application can tell Celerybeat to execute a task at
time intervals, such as every 5 seconds or once a week. Celerybeat can also be
instructed to run tasks on a specific date or time, such as 5:03pm every
Sunday. When the interval or specific time is hit, Celerybeat will hand the job
over to Celeryd to execute on the next available worker.
Celery tutorials and advice
Celery is a powerful tool that can be difficult to wrap your mind around at
first. Be sure to read up on task queue concepts then dive into these specific
Celery tutorials.
• A 4 Minute Intro to Celery is a short introductory task queue
screencast.
• This blog post series on Celery's architecture, Celery in the wild: tips
and tricks to run async tasks in the real world and dealing with
resource-consuming tasks on Celery provide great context for how
Celery works and how to handle some of the trickier bits to working
with the task queue.
• How to use Celery with RabbitMQ is a detailed walkthrough for using
these tools on an Ubuntu VPS.
• Celery - Best Practices explains things you should not do with Celery
and shows some underused features for making task queues easier to
work with.
• Celery Best Practices is a different author's follow up to the above best
practices post that builds upon some of his own learnings from 3+
years using Celery.
267
Full Stack Python: 2020 Supporter's Edition
• Common Issues Using Celery (And Other Task Queues) contains good
advice about mistakes to avoid in your task configurations, such as
database transaction usage and retrying failed tasks.
• Asynchronous Processing in Web Applications Part One and Part Two
are great reads for understanding the difference between a task queue
and why you shouldn't use your database as one.
• My Experiences With A Long-Running Celery-Based Microprocess
gives some good tips and advice based on experience with Celery
workers that take a long time to complete their jobs.
• Checklist to build great Celery async tasks is a site specifically designed
to give you a list of good practices to follow as you design your task
queue configuration and deploy to development, staging and
production environments.
• Heroku wrote about how to secure Celery when tasks are otherwise
sent over unencrypted networks.
• Unit testing Celery tasks explains three strategies for testing code
within functions that Celery executes. The post concludes that calling
Celery tasks synchronously to test them is the best strategy without any
downsides. However, keep in mind that any testing method that is not
the same as how the function will execute in a production environment
can potentially lead to overlooked bugs. There is also an open source
Git repository with all of the source code from the post.
• Rollbar monitoring of Celery in a Django app explains how to use
Rollbar to monitor tasks. Super useful when workers invariably die for
no apparent reason.
• 3 Gotchas for Working with Celery are things to keep in mind when
you're new to the Celery task queue implementation.
• Dask and Celery compares Dask.distributed with Celery for Python
projects. The post gives code examples to show how to execute tasks
with either task queue.
• Python+Celery: Chaining jobs? explains that Celery tasks should be
dependent upon each other using Celery chains, not direct
dependencies between tasks.
268
Full Stack Python: 2020 Supporter's Edition
Celery with web frameworks
Celery is typically used with a web framework such as Django, Flask or
Pyramid. These resources show you how to integrate the Celery task queue
with the web framework of your choice.
• How to Use Celery and RabbitMQ with Django is a great tutorial that
shows how to both install and set up a basic task with Django.
• Miguel Grinberg wrote a nice post on using the task queue Celery with
Flask. He gives an overview of Celery followed by specific code to set up
the task queue and integrate it with Flask.
• Setting up an asynchronous task queue for Django using Celery and
Redis is a straightforward tutorial for setting up the Celery task queue
for Django web applications using the Redis broker on the back end.
• A Guide to Sending Scheduled Reports Via Email Using Django And
Celery shows you how to use django-celery in your application. Note
however there are other ways of integrating Celery with Django that do
not require the django-celery dependency.
• Flask asynchronous background tasks with Celery and Redis combines
Celery with Redis as the broker and Flask for the example application's
framework.
• Celery and Django and Docker: Oh My! shows how to create Celery
tasks for Django within a Docker container. It also provides some
• Asynchronous Tasks With Django and Celery shows how to integrate
Celery with Django and create Periodic Tasks.
• Getting Started Scheduling Tasks with Celery is a detailed walkthrough
for setting up Celery with Django (although Celery can also be used
without a problem with other frameworks).
• Introducing Celery for Python+Django provides an introduction to the
Celery task queue with Django as the intended framework for building
a web application.
• Asynchronous Tasks with Falcon and Celery configures Celery with the
Falcon framework, which is less commonly-used in web tutorials.
269
Full Stack Python: 2020 Supporter's Edition
• Custom Celery task states is an advanced post on creating custom
states, which is especially useful for transient states in your application
that are not covered by the default Celery configuration.
• Asynchronous Tasks with Django and Celery looks at how to configure
Celery to handle long-running tasks in a Django app.
Celery deployment resources
Celery and its broker run separately from your web and WSGI servers so it
adds some additional complexity to your deployments. The following
resources walk you through how to handle deployments and get the right
configuration settings in place.
• The "Django in Production" series by Rob Golding contains a post
specifically on Background Tasks.
• How to run celery as a daemon? is a short post with the minimal code
for running the Celery daemon and Celerybeat as system services on
Linux.
• Celery in Production on the Caktus Group blog contains good practices
from their experience using Celery with RabbitMQ, monitoring tools
and other aspects not often discussed in existing documentation.
• Three quick tips from two years with Celery provides some solid advice
on retry delays, the -Ofair flag and global task timeouts for Celery.
Redis Queue (RQ)
Redis Queue (RQ) is a Python task queue implementation that uses Redis to
keep track of tasks in the queue that need to be executed.
270
Full Stack Python: 2020 Supporter's Edition
RQ resources
• Asynchronous Tasks in Python with Redis Queue is a quickstart-style
tutorial that shows how to use RQ to fetch data from the Mars Rover
web API and process URLs for each of the photos taken by NASA's
Mars rover. There is also a follow-up post on Scheduling Tasks in
Python with Redis Queue and RQ Scheduler that shows how to
schedule tasks in advance, which is a common way of working with task
queues.
• The RQ intro post contains information on design decisions and how to
use RQ in your projects.
• International Space Station notifications with Python and Redis Queue
(RQ) shows how to combine the RQ task queue library with Flask to
send text message notifications every time a condition is met - in this
blog post's case that the ISS is currently flying over your location on
Earth.
• Asynchronous Tasks with Flask and Redis Queue looks at how to
configure RQ to handle long-running tasks in a Flask app.
• How We Spotted and Fixed a Performance Degradation in Our Python
Code is a quick story about how an engineering team moving from
Celery to RQ fixed some deficiencies in their RQ performance as they
started to understand the difference between how the two tools execute
workers.
• Flask by Example - Implementing a Redis Task Queue explains how to
install and use RQ in a Flask application.
• rq-dashboard is an awesome Flask-based dashboard for viewing
queues, workers and other critical information when using RQ.
• Sending Confirmation Emails with Flask, Redis Queue, and Amazon
SES shows how RQ fits into a real-world application that uses many
libraries and third party APIs.
• Background Tasks in Python using Redis Queue gives a code example
for web scraping data from the Goodreads website. Note that the first
sentence in the post is not accurate: it's not the Python language that is
linear, but instead the way workers in WSGI servers handle a single
271
Full Stack Python: 2020 Supporter's Edition
request at a time by blocking. Nevertheless, the example is a good one
for understanding how RQ can execute.
Dramatiq
Dramatiq (source code) is a Python-based task queue created as an alernative
to the ubiquitous Celery project Dramatiq supports RabbitMQ and Redis as
message brokers.
Dramatiq resources
• django_dramatiq is an app for integrating Dramatiq with Django.
• django_dramatiq_example is a simple Django project demoing how
you can combine Django and Dramatiq.
• flask_dramatiq_example is an example Flask app that shows how to
use Dramatiq.
Static Site Generators
A static website generator combines a markup language, such as Markdown
or reStructuredText, with a templating engine such as Jinja, to produce
HTML files. The HTML files can be hosted and served by a web server or
content delivery network (CDN) without any additional dependencies such as
a WSGI server.
Why are static site generators useful?
Static content files such as HTML, CSS and JavaScript can be served from a
content delivery network (CDN) for high scale and low cost. If a statically
generated website is hit by high concurrent traffic it will be easily served by
the CDN without dropped connections.
For example, when Full Stack Python was on the top of Hacker News for a
weekend, GitHub Pages was used as a CDN to serve the site and didn't have
any issues even with close to 400 concurrent connections at a time, as shown
in the following Google Analytics screenshot captured during that traffic
burst.
272
Full Stack Python: 2020 Supporter's Edition
How do static website generators work?
Static site generators allow a user to create HTML files by writing in a markup
language and coding template files. The static site generator then combines
the markup language and templates to produce HTML. The output HTML
does not need to be maintained by hand because it is regenerated every time
the markup or templates are modified.
For example, as shown in the diagram below, the Pelican static site generator
can take in reStructuredText files and Jinja2 template files as input then
combine them to output a set of static HTML files.
273
Full Stack Python: 2020 Supporter's Edition
What's the downside of using static site generators?
The major downside is that code cannot be executed after a site is created.
You are stuck with the output files so if you're used to building web
applications with a traditional web framework you'll have to change your
expectations.
Content that is typically powered by a database, such as comments, sessions
and user data can only be handled through third party services. For example,
if you want to have comments on a static website you'd need to embed
Disqus's form and be completely reliant upon their service.
Many web applications simply cannot be built with only a static site
generator. However, a static website generator can create part of a site that
will be served up by a web server while other pages are handled by the WSGI
server. If done right, those web applications have the potential to scale better
than if every page is rendered by the WSGI server. The complexity may or
may not be worth it for your specific application.
274
Full Stack Python: 2020 Supporter's Edition
Python implementations
Numerous static website generators exist in many different languages. The
ones listed here are primarily coded in Python.
• Pelican is a commonly used Python static website generator which is
used to create Full Stack Python. The primary templating engine is
Jinja and Markdown, reStructuredText and AsciiDoc are supported
with the default configuration.
• Lektor is a static content management system and site generator that
can deploy its output to any webserver. It uses Jinja2 as its template
engine.
• MkDocs uses a YAML configuration file to take Markdown files and an
optional theme to output a documentation site. The templating engine
is Jinja, but a user doesn't have to create her own templates unless a
custom site is desired at which point it might make more sense to use a
different static site generator instead.
• mynt (source code) is built to create blogs and uses Jinja to generate
HTML pages.
• Nikola (source code) takes in reStructuredText, Markdown or Jupyter
(IPython) Notebooks and combines the files with Mako or Jinja2
templates to output static sites. It is compatible with both Python 2.7
and 3.3+. Python 2.7 will be dropped in early 2016 while Python 3.3+
will continue to be supported.
• Acrylamid (source code) uses incremental builds to generate static sites
faster than recreating every page after each change is made to the input
files.
• Hyde (source code) started out as a Python rewrite of the popular
Ruby-based Jekyll static site generator. Today the project has moved
past those "clone Jekyll" origins. Hyde supports Jinja as well as other
templating languages and places more emphasis on metadata within
the markup files to instruct the generator how to produce the output
files. Check out the Hyde-powered websites page to see live examples
created with Hyde.
275
Full Stack Python: 2020 Supporter's Edition
• Grow SDK (source code) uses projects, known as pods, which contain a
specific file and directory structure so the site can be generated. The
project remains in the "experimental" phase.
• Complexity (source code) is a site generator for users who like to work
in HTML. It uses HTML for templating but has some functionality
from Jinja for inheritance. Works with Python 2.6+, 3.3+ and PyPy.
• Cactus (source code) uses the Django templating engine that was
originally built with front-end designers in mind. It works with both
Python 2.x and 3.x.
Open source Python static site generator examples
• This site is all open source in its own GitHub repository under the MIT
license. Fork away!
• Django REST Framework uses MkDocs to create its documentation
site. Be sure to take a look at the mkdocs.yml file to see how large, wellwritten docs are structured for that project.
• Practicing web development uses Acrylamid to create its site. The code
is open source on GitHub.
• Linux Open Admin Days (Loadsys) has their site open source and
available for viewing.
• The Pythonic Perambulations blog has a fairly standard theme but is
also open source on GitHub.
Static site generator resources
Static site generators can be implemented in any programming language. The
following resources either are general to any programming ecosystem or
provide a unique angle on how to use a static site generator.
• Static vs Dynamic Websites does an excellent job of showing the
differences between a dynamic website that uses a database backend to
produce content in response to a request compared with static sites
that are pregenerated. There is also a second part in the series where
generic static site generator concepts are explained.
276
Full Stack Python: 2020 Supporter's Edition
• Staticgen lists static website generators of all programming languages
sorted by various attributes such as the number of GitHub stars, forks
and issues.
• The title is a big grandiose, but there's some solid detail in this article
on why static website generators are the next big thing. I'd argue static
website generators have been big for a long time now.
• Static site generators can be used for a range of websites from side
projects up to big sites. This blog post by WeWork on why they use a
static site generator explains it from the perspective of a large business.
• Ditching Wordpress and becoming one of the cool kids is one
developer's experience moving away from Wordpress and onto Pelican
with reStructuredText for his personal blog.
• Static websites with Flask explains how to use Flask-Frozen to generate
a static site based on content from the web framework and a data
source backend. This approach is an alternative to using a purposebuilt static website generator such as Pelican, Lektor or MkDocs.
• Building A Serverless Contact Form For Your Static Site shows how to
use HTML and JavaScript deployed to AWS Lambda to collect input
with a form on a static site.
• 5 ways to handle forms on your static site gives a good overview of
options like Google Forms for when you absolutely must get input from
users on a static site.
Static site deployment resources
Deploying a static site is far less complicated than a traditional web
application deployment, but you still need to host the files somewhere
accessible. You'll also to set up DNS to point a domain name to your site as
well as provide HTTPS support. These guides walk through various ways of
handling the static site deployment.
• Static site hosting with S3 and Cloudflare shows how to set up an S3
bucket with Cloudflare in front as a CDN that serves the content with
HTTPS. You should be able to accomplish roughly the same situation
277
Full Stack Python: 2020 Supporter's Edition
with Amazon Cloudfront, but as a Cloudflare user I like their service for
these static site configurations.
• Google Cloud provides a tutorial on how to use them to host your static
site. Note that you cannot currently use HTTPS on Google Storage
servers, which is a major downside.
• scar is an open source tool for making static site deployments and
redeployments to Amazon Web Services easier.
• Deploying a Static Blog with Continuous Integration uses a Hugo (a
Golang-based static site generator) generated site as an example but
the instructions can easily be used to deploy a Python-based static site
generator output as well.
• How to Make an AWS S3 Static Website With SSL explains the
configuration required to use SSL for HTTPS on an AWS-hosted static
site.
• Hosting your static site with AWS S3, Route 53, and CloudFront is
another solid tutorial that uses the AWS stack to deploy a globallyhosted site.
• Why your static website needs HTTPS shows all of the malicious
activity that bad actors can cause if you do not use HTTPS as part of
your static site deployment. There are few excuses for having an
insecure site without required HTTPS in today's world of free Let's
Encrypt certificates.
• How to Build a Low-tech Website? takes static site deployments to the
extreme. The site served with solar power and customized hardware
setup. This is a great read even though it will not be remotely practical
for most organizations.
Pelican
Pelican is a static site generator implemented in Python that combines Jinja
templates with content written in Markdown or reStructuredText to produce
websites.
Pelican's source code is available on GitHub under the AGPL 3 license.
278
Full Stack Python: 2020 Supporter's Edition
Why is Pelican a useful tool?
Static websites are easier to deploy than full web applications built with a web
framework that rely on a persistent database backend. In addition, static sites
are typically much faster to load because there are no database queries or
middleware code to execute during the HTTP request-response cycle.
A web server that hosts a static website simply responds to inbound HTTP
requests with the file being requests - no dynamic data is populated on the
server during the response.
Pelican resources
Static site generators like Pelican are a simple compared to web frameworks
so most tutorials focus on creating simple sites that you can style yourself, as
well as deploying to hosting services such as Amazon S3 and GitHub Pages.
• How to Create Your First Static Site with Pelican and Jinja2 walks
through installing, generating the boilerplate and customizing your
first static site using Pelican.
• How I built this website, using Pelican walks through getting your first
Pelican site generated and running.
• A Pelican Tutorial to Build A Static, Python-Powered Blog with Search
& Comments provides a walkthrough for how to build a great
combination of useful features into your static site such as search and
comments with the Staticman library. Bonus points at the end for
showing how to deploy to Netlify as an alternative to GitHub Pages or
S3.
• Creating your own blog with Pelican covers the decision-making
process with building a static versus dynamic website. The post then
279
Full Stack Python: 2020 Supporter's Edition
dives into using Pelican as a static site generator with a blog structure
and basic theme.
• Using Pelican to generate and manage static websites explains how to
use the pelican-quickstart command to generate an initial site
then adds the Tipue Search plugin to provide content search despite
the static site limitations.
• Pelican Folder Structure explains how the pages and posts structure
under content works when using Pelican.
• Pelican's official Plugin creation documentation gives a great starting
point for building your own plugins that can take in new input markup
formats, modify the generator process and add handy features such as
a custom table of contents.
• Pelican Sitemap and Pagination explains how to generate a
sitemap.xml file for your static site that includes all pages instead of
just auto-included top-level pages.
• Getting started with Pelican and GitHub pages is a tutorial I wrote to
use the Full Stack Python source code to create and deploy your first
static site.
• Moving blogs to Pelican talks about one developer's transition from
Jekyll to Pelican for his own sites.
• Using Travis & GitHub to deploy static sites shows how to automate
deployments of a Pelican-based static site using Travis CI.
Lektor
Lektor is a static website generator with content management system (CMS)
and web framework features for creating websites.
Lektor's source code is available on GitHub under the BSD 3-clause license.
280
Full Stack Python: 2020 Supporter's Edition
How is Lektor different from other static site generators?
Most static site generators such as Pelican are built with programmers as the
primary user. Lektor tries to be more accessible to non-programmers by
providing an admin panel for creating and updating site content similar to
Django or Wordpress.
Lektor example projects
• PyCon Colombia's 2018 site was built with Lektor and is freely
available on GitHub.
• freedombox.org is also available for reference.
Lektor resources
Lektor is a young project and therefore has a nascent community compared
with Pelican and Jekyll (the most popular Ruby-based static site generator).
However, the official documentation and initial quickstarts for the project are
wonderful so getting a basic site up and running is painless.
• Introducing Lektor is the background story for what motivated Armin
Ronacher to start hacking on his own static site generator project after
jumping around from Django to Wordpress for hosting content. The
post also includes details on the differences in the project compared to
other static site generators.
• Hello, Lektor is a wonderful getting started and overview post. The post
walks through the files Lektor generates, the admin content
management system and pulling data into pages from the Meetup API.
281
Full Stack Python: 2020 Supporter's Edition
• The official Lektor quickstart contains the first commands to use to
generate a new project scaffold. There is also a getting started
screencast that walks through installing and initial steps for getting set
up with the project.
• Lektor Static CMS, put the fun back into Content Management is a
short overview as the first part in what aims to be a continuing series
on how to use Lektor as a content management system.
• In Experiences Migrating to Lektor the author gives his impression of
Lektor after moving his 400+ articles over from a home-grown
blogging engine. He talks a bit about how he went from deploying on
GitHub Pages to surge.sh and finally over to Netlify.
• Automating deployment of Lektor blog sites covers using OpenShift to
deploy a static site. Seems like a lot of work when AWS S3 deployments
are a lot easier but OpenShift has its own ecosystem to keep you away
from AWS world if that's your thing.
MkDocs
MkDocs is a Python-based static site generator that combines Markdown
content with Jinja2 templates to produce websites. MkDocs can be
pronounced "McDocs" or "M-K Docs", although the core committers do not
have a strong preference one way or the other on the name's pronunciation.
MkDocs' source code is available on GitHub under the BSD 2-clause license.
What's great about MkDocs?
MkDocs uses a YAML configuration file and can optionally use themes to easy
change the look-and-feel of the documentation output.
282
Full Stack Python: 2020 Supporter's Edition
In addition to the easy configuration via a YAML file and the drop-in themes,
MkDocs also has a wonderful search feature. Search is often not possible out
of the box in other static site generators. With MkDocs search can easily be
added without plugins or code changes to the static site generator.
MkDocs resources
• The official Getting Started with MkDocs is likely the best place to go
when you are just getting set up with your first site that uses this
project.
• Mkdocs documentation is a quick tutorial to get MkDocs installed and
modify the initial mkdocs.yml file.
• MkDocs making strides is a post from one of the project's core
commiters on some changes that greatly improved the project such as
site regeneration during development when a file is modified, search,
the command-line client and packageable theming.
Testing
Testing determines whether software runs correctly based on specific inputs
and identifies defects that need to be fixed.
Why is testing important?
As software scales in codebase size, it's impossible for a person or even a large
team to keep up with all of the changes and the interactions between the
changes. Automated testing is the only proven method for building reliable
software once they grow past the point of a simple prototype. Many major
software program development failures can be traced back to inadequate or a
complete lack of testing.
It's impossible to know whether software works properly unless it is tested.
While testing can be done manually, by a user clicking buttons or typing in
input, it should be performed automatically by writing software programs that
test the application under test.
There are many forms of testing and they should all be used together. When a
single function of a program is isolated for testing, that is called unit testing.
Testing more than a single function in an application at the same time is
283
Full Stack Python: 2020 Supporter's Edition
known as integration testing. User interface testing ensures the correctness of
how a user would interact with the software. There are even more forms of
testing that large programs need, such as load testing, database testing, and
browser testing (for web applications).
Testing in Python
Python software development culture is heavy on software testing. Because
Python is a dynamically-typed language as opposed to a statically-typed
language, testing takes on even greater importance for ensuring program
correctness.
Python testing tools
The Python ecosystem has a wealth of tools to make it easier to run your tests
and interpret the results. The following tools encompass test runners,
coverage reports and related libraries.
• green is a test runner that has pretty printing on output to make the
results easier to read and understand.
• requestium merges the Requests library with Selenium to make it
easier to run automated browser tests.
• coverage.py is a tool for measuring code coverage when running tests.
Testing resources
• The Minimum Viable Test Suite shows how to set unit tests and
integration tests for a Flask example application.
• BDD Testing a Restful Web Application in Python is an introduction to
behavior-driven development (BDD) and uses a Flask web application
as an example project for learning.
• Testing, for people who hate testing provides examples for how to
improve your testing environment such as using a new test harness and
getting your test suite to run fast.
• Getting started with Pytest goes over some code challenges as examples
for how to use Pytest in your own projects.
284
Full Stack Python: 2020 Supporter's Edition
• Good test, bad test explains the difference between a "good" test case
and one that is not as useful. Along the way the post breaks down some
myths about common testing subjects such as code coverage, assertions
and mocking.
• Python Testing is a site devoted to testing in - you guessed it - the
Python programming language.
• In praise of property-based testing is an article by the author of the
Hypothesis testing tool written in Python. The article explains the
shortcomings of example-based tests and how property-based testing
can augment or replace those example-based tests to find more defects
in software. There is also a great introductory post on Hypothesis that
goes into further detail about getting your first Hypothesis tests up and
running.
• Google has a testing blog where they write about various aspects of
testing software at scale.
• A beginner's guide to Python testing covers test-driven development
for unit, integration and smoke tests.
• Testing a Twilio Interactive Voice Response (IVR) System With Python
and pytest is an incredibly in-depth tutorial on testing a Pythonpowered IVR using pytest. There is also a tutorial on how to build the
IVR before this testing tutorial, although you can just clone it to jump
into the testing walkthrough.
• Still confused about the difference between unit, functional and
integration tests? Check out this top answer on Stack Overflow to that
very question.
• Testing Python applications with Pytest walks through the basics of
using Pytest and some more advanced ways to use it such as
continuous testing through Semiphore CI.
• The cleaning hand of Pytest provides a couple of case studies for how
companies set up their testing systems. It then gives the boilerplate
code the author uses for Pytest and goes through a bunch of code
samples for common situations.
285
Full Stack Python: 2020 Supporter's Edition
• Using pytest with Django shows how to get a basic pytest test case
running for a Django project and explains why the author prefers
pytest over standard unittest testing.
• Distributed Testing with Selenium Grid and Docker shows how to
distribute automated, browser tests with Selenium Grid and Docker
Swarm. It also looks at how to run tests against a number of browsers
and automate the provisioning and deprovisioning of machines to keep
costs down.
• Principles of Automated Testing explains how to prioritize what to test,
goes through some levels of testing from unit to integration and
examines when to use example and bulk tests.
• How to run tests continuously while coding contains a Python script
that uses the watchdog to check for changes to source code files in your
project directory. If changes are detected then your tests will be run to
check that everything is still working as intended.
• 8 great pytest plugins is a list of plugins that go with PyTest and help
with tasks like reducing the friction around testing Django, as well as
running tests in parallel.
• This test-driven development series shows you how to write an
interpreter in Python and contains a ton of great code samples to learn
from:
◦ A game of tokens: write an interpreter in Python with TDD Part 1
◦ A game of tokens - part 2
◦ A game of tokens - part 3
◦ A game of tokens - part 4
• Pytest leaking examines situations where tests leak memory and can
cause abnormal results if they are not fixed.
Mocking resources
Mocking allows you to isolate parts of your code under test and avoid areas
that are not critical to the tests you are writing. For example, you may want to
test how you handle text message responses but do not want to actually
receive text messages in your application. You can mock a part of the code
286
Full Stack Python: 2020 Supporter's Edition
that would provide a happy-path scenario so you can run your tests properly.
The following resources show you how to use mocks in your test cases.
• Getting Started with Mocking in Python provides a whole code example
based on a blog project that shows how to use mock when testing.
• Python Mocking 101: Fake It Before You Make It explains what
mocking is and is not, and shows how to use the patch function to
accomplish it in your project. Revisiting Unit Testing and Mocking in
Python is a follow-up post that expands upon using the patch
function along with dependency injection.
• Mocks and Monkeypatching in Python uses the mock and
monkeypatch tools to show how to mock your test cases.
• Mock yourself, not your tests examines when mocks are necessary and
when they are not as useful so you can avoid them in your test cases.
• Mocking Redis & Expiration in Python is a specific scenario where you
would want to test your Redis-dependent code but prefer to mock it
rather than ensure an installation and connection are present
whenever you run your tests.
• Better tests for Redis integrations with redislite is a great example of
how using the right mocking library can clean up existing hacky testing
code and make it more straightforward for any developer that happens
upon the tests in the future.
Unit Testing
Unit testing is a method of determining the correctness of a single function
isolated from a larger codebase. The idea is that if all the atomic units of an
application work as intended in isolation, then integrating them together as
intended is much easier.
Why is unit testing important?
Unit testing is just one form of testing that works in combination with other
testing approaches to wring out the bugs from a piece of software being
developed. When several functions and classes are put together it's often
difficult to determine the source of a problem if several bugs are occurring at
287
Full Stack Python: 2020 Supporter's Edition
the same time. Unit testing helps eliminate as many of the individual bugs as
possible so when the application comes together as a whole the separate parts
work as correct as possible. Then when issues arise they can often be tracked
down as unintended consequences of the disparate pieces not fitting together
properly.
Unit testing tools
There are many tools for creating tests in Python. Some of these tools, such as
pytest, replace the built-in unittest framework. Other tools, such as nose, are
extensions that ease test case creation. Note that many of these tools are also
used for integration testing by writing the test cases to exercise multiple parts
of code at once.
• unittest is the built-in standard library tool for testing Python code.
• pytest is a complete testing tool that emphasizes backwardscompatibility and minimizing boilerplate code.
• nose is an extension to unittest that makes it easier to create and
execute test cases.
• Hypothesis is a unit test-generation tool that assists developers in
creating tests that exercise edge cases in code blocks. The best way to
get started using Hypothesis is by going through the well-written
quickstart.
• mimesis generates synthetic test data which is useful to apply in your
tests.
• testify was a testing framework meant to replace the common
unittest+nose combination. However, the team behind testify is
transitioning to pytest so it's recommended you do not use testify for
new projects.
Unit testing resources
Unit tests are useful in every project regardless of programming language.
The following resources provide a good overview of unit testing from several
viewpoints and follow up with additional depth in testing Python-specific
applications.
288
Full Stack Python: 2020 Supporter's Edition
• Introduction to Unit Testing provides a broad introduction to unit
testing, its importance and how to get started in your projects.
• Unit Testing Your Twilio App Using Python’s Flask and Nose is a
detailed tutorial for using the nose test runner for ensuring a Flask
application is working properly.
• Understanding unit testing explains why testing is important and
shows how to do it effectively in your applications.
• Unit testing with Python provides a high-level overview of testing and
has diagrams to demonstrate what's going on in the testing cycle.
• The Python wiki has a page with a list of Python testing tools and
extensions.
• An Extended Introduction to the nose Unit Testing Framework shows
how this test runner can be used to write basic test suites. While the
article is from 2006, it remains relevant today for learning how to use
nose with your projects.
• Revisiting Unit Testing and Mocking in Python is a wonderful post with
many code examples showing how and why to use dependency
injection and @property to mock unit tests.
• Unit Testing Doesn’t Affect Codebases the Way You Would Think
presents research on how unit testing impacts project code and ways
that it does not. It is only one research report but the findings on more
unit tests leading to higher Cyclomatic Complexity per method are
interesting. Perhaps more tests are needed to keep a project running
due to the increased complexity.
• Python unittest with Robert Collins is the transcript of an interview
with Robert Collins who is a core committer of unittest .
• Precise Unit Tests with PyHamcrest is a short tutorial on using the
PyHamcrest assertion tool for unit testing.
• Why most unit testing is a waste discusses how low risk unit tests
rarely fail even as the code changes and why they do not matter as
much to a project's health as many developers are led to believe based
on the test-driven development dogma.
289
Full Stack Python: 2020 Supporter's Edition
• Writing Unit Tests for Django Migrations digs into code examples for
testing Django data migrations.
• The business case for unit testing makes arguments for management
should buy into unit testing, including risk management and faster
development timelines.
• Unit testing often requires a bunch of fake data to use as inputs that
otherwise are generated by users or oother parts of a system not under
test. Fake data generation tools can help create data instead of having
too write it all yourself. This post on generating fake data with Faker
shows how to use the Faker tool with various seeds and outputs into
your tests.
• Unit Testing Applications that use Flask-Login and Flask-SocketIO
explains how to set up a WebSockets unit test with a couple of common
Flask libraries.
Integration Testing
Integration testing exercises two or more parts of an application at once,
including the interactions between the parts, to determine if they function as
intended. This type of testing identifies defects in the interfaces between
disparate parts of a codebase as they invoke each other and pass data between
themselves.
How is integration testing different from unit testing?
While unit testing is used to find bugs in individual functions, integration
testing tests the system as a whole. These two approaches should be used
together, instead of doing just one approach over the other. When a system is
comprehensively unit tested, it makes integration testing far easier because
many of the bugs in the individual components will have already been found
and fixed.
As a codebase scales up, both unit and integration testing allow developers to
quickly identify breaking changes in their code. Many times these breaking
changes are unintended and wouldn't be known about until later in the
development cycle, potentially when an end user discovers the issue while
using the software. Automated unit and integration tests greatly increase the
290
Full Stack Python: 2020 Supporter's Edition
likelihood that bugs will be found as soon as possible during development so
they can be addressed immediately.
Integration testing resources
• Integration testing with Context Managers gives an example of a
system that needs integration tests and shows how context managers
can be used to address the problem.
• Pytest has a page on integration good practices that you'll likely want to
follow when testing your application.
• Integration testing, or how to sleep well at night explains what
integration tests are and gives an example. The example is coded in
Java but still relevant when you're learning about integration testing.
• What is an integration test exactly? is an awesome Stack Exchange
thread that defines the differences in testing approaches like unit tests
versus integration and other tests. There is also some practical advice
like "It’s not important what you call it, but what it does" which as a
pragmatic programmer I am keen to agree on.
• Consistent Selenium Testing in Python gives a spectacular code-driven
walkthrough for setting up Selenium along with SauceLabs for
continuous browser-based testing.
• Where do our flaky tests come from? presents Google's data on where
their integration tests fail and how the tools you use can sometimes
lead to higher incidents of failed tests than other testing tools.
• Unleash the test army covers the author's first impressions of using
Hypothesis for testing the properties of a system under test.
Debugging
Developers often find themselves in situations where the code they've written
is not working quite right. When that happens, a developer debugs their code
by instrumenting, executing and inspecting the code to determine what state
of the application does not match the assumptions of how the code should be
correctly running.
291
Full Stack Python: 2020 Supporter's Edition
Why is debugging important?
There are bugs in every modest sized or larger application. Every developer
has to learn how to debug code in order to write programs that work as
correctly as time and budget allow.
Debugging tools
There are many debugging tools, some of which are built into IDEs like
PyCharm and others that are standalone applications. The following list
contains mostly standalone tools that you can use in any development
environment.
• pdb is a debugger built into the Python standard library and is the one
most developers come across first when trying to debug their
programs.
• Web-PDB provides a web-based user interface for pdb to make it easier
to understand what is happening while inspecting your running code.
• wdb uses WebSockets to allow you to debug running Python code from
a web browser.
• Pyflame (source code) is a profiling tool that generates flame graphs for
executing Python program code.
• objgraph (source code) uses graphviz to draw the connections between
Python objects running in an application
pdb tutorials
pdb is the most commonly-used debugger for Python because it is built into
the standard library. The following walkthroughs will show you how to use
pdb while fixing your own code.
• How to Use Pdb to Debug Your Code is a wonderful code-first tutorial
on getting started with pdb.
• pdb - Interactive Debugger is featured on the Python Module of the
Week blog and has some great detail on using the program effectively.
• pdb: Using the Python debugger in Django is a tutorial specific to
working with pdb in Django projects.
292
Full Stack Python: 2020 Supporter's Edition
• Debugging your Python code walks through a scenario where pdb can
be used to find a defect in a block of Python code.
• pdb Tutorial is a code-heavy beginners tutorial for pdb.
• Debugging in Python elaborates on what pdb does and how it can be
used.
Python-specific debugging tutorials
The Python ecosystem has a range of tools to help with debugging your code.
These tutorials show you how to either use a tool other than pdb or provide an
overview of the debugging ecosystem for Python.
• Python debugging tools provides a list of tools such as pdb and its
derivatives ipdb, pudb and pdb++ along with how they can be used in
the hunt for defects.
• Profiling Python web applications with visual tools details a
configuration for visualizing code execution using KCachegrind.
• My Startling Encounter With Python Debuggers along with the followup second post are a fantastic couple of posts that walk through a
specific scenario of how a well-tested distributed web crawler failed
and how tools like gdb, top and Winpdb were used to debug a
multithreaded application.
• Debugging Python like a boss covers several Python debuggers such as
pudb, pydbgr and ipdb.
• The case of the mysterious Python crash explains the symptoms that
happened during a crash and what steps the author took to figure out
what was going on.
General debugging resources
The following resources are not specific to Python development but give solid
programming language-agnostic debugging advice.
• The art of debugging provides a whirlwind overview for how to fix
issues in your code.
293
Full Stack Python: 2020 Supporter's Edition
• How to debug JavaScript errors introduces some key debugging tools
such as source maps that make identifying errors on the client side
much easier during development.
• Linux debugging tools you'll love is an awesome comic that covers the
Linux ecosystem for debugging.
Code Metrics
Code metrics can be produced by static code analysis tools to determine
complexity and non-standard practices.
Why are code metrics important?
Code metrics allow developers to find problematic codebase areas that may
need refactoring. In addition, some metrics such as technical debt assist
developers in communicating to non-technical audiences why issues with a
system are occurring.
Open source code metrics projects
• Radon is a tool for obtaining raw metrics on line counts, Cyclomatic
Complexity, Halstead metrics and maintainability metrics.
• Pylint contains checkers for PEP8 code style compliance, design,
exceptions and many other source code analysis tools.
• PyFlakes parses source files for errors and reports on them.
• Pyntch is a static code analyzer that attempts to detect runtime errors.
It does not perform code style checking.
• Prospector inspects Python source code files to give data on type and
location of classes, methods and other related source information.
• Flake8 is a code format style guideline enforcer. Its goal is not to gather
metrics but ensure a consistent style in all of your Python programs for
maximum readability. The rules for Flask8 are all defined in this list,
which rolls up the Flake8 dependencies of pycodestyle, pyflakes and
McCabe.
294
Full Stack Python: 2020 Supporter's Edition
• Black is a Python code formatter with strong, uncompromising
assumptions about how your code must be formatted.
• dlint is a linter for secure coding practices.
• pylintdb puts pylint results into a SQLite database for programmatic
access and searching. Ned Batchelder coded it and wrote about how he
uses the program in this bite-sized command line tools: pylintdb post.
• Flask8-eradicate (source code) is a Flask8 plugin for identifying dead
code.
Hosted code metrics services
The following tools are ready to use by going to the service, punching in the
URL to your site, letting them perform their analysis and then reading the
results.
• Coveralls shows code coverage from test suites and other metrics to
help developers improve the quality of their code.
• webhint, formerly Sonarwhal scans your website for accessibility, speed
and security. There is both an online version that you can point at an
arbitrary URL as well as a command-line version for running yourself.
• Codecov hooks into GitHub, BitBucket or GitLab and reports code
coverage on your code repositories.
Code metrics resources
Code metrics are really useful when you have a team working on a project for
awhile and want to keep the code quality from degrading. However, you can
easily go overboard instrumenting everything and overanalyzing the results.
The following resources will introduce code metrics topics to you as well as
give perspective when metrics are useful to the point of diminishing returns.
• How do Ruby & Python profilers work? explains the difference between
sampling and tracing profilers then digs into how they work and their
advantages and disadvantages.
• Moving Fast With High Code Quality provides Quora's code quality
goals and how they handle code reviews with their internal tools.
295
Full Stack Python: 2020 Supporter's Edition
• Lessons from Building Static Analysis Tools at Google is a fantastic indepth read that explains why workflow integration and adapting to
developer feedback are critical for static analysis tools adoption in
development environments.
• Static Code Analizers for Python is an older article but goes over the
basics of what Python static code analyzers do.
• Getting Started with Pylint goes over setting up Pylint, generating the
.pylintrc file and what's in the configuration.
• Linting as Lightweight Defect Detection for Python presents a
straightforward code example of how linters can detect certain classes
of errors in code that especially in dynamically-typed languages are not
caught at compile time. The post then shows how to use Flake8 in your
own code reviews.
• Static Analysis at Scale explains how Instagram's extremely hightrafficked Python-powered site is enabled by linting, codemods, and
type-checking with Pyre.
• This /r/Python poll on what linters the community uses provides some
input on using PyCharm just for its linting features as well as some
other approaches.
• Automatically PEP8 & Format Your Python Code shows how to use the
autopep8 library, including an Emacs plugin that lints your code for
PEP8 compliance as you work.
• How to use Flask8 explains what Flask8 is, its usage and expected
output.
• Pylint false positives is a walkthrough of issues that Pylint detects in an
example project, which ones cannot be fixed and the ones where the
tool was incorrect. The author concludes that with all of the false
positives that were found the signal to noise ratio was not useful
enough to use the tool on a typical project. However, on a brand new
project without many dependencies it might be helpful to keep your
code in a pristine state before the code base grows beyond the nosiness
false positives threshold.
296
Full Stack Python: 2020 Supporter's Edition
• What is Flake8 and why we should use it? covers why using a linting
tool like Flake8 can improve the quality of your Python code and how
to install and configure it for your environment.
• Which Python static analysis tools should I use? covers Pylint, PyFlakes
and mypy with a short description of the advantages and disadvantage
for each one.
• This Stack Overflow question on Python static code analysis tools
contains comparison discussions of PyLint, PyChecker and PyFlakes.
• Consistent Python code with Black covers how to use Black and add it
as a pre-commit hook in Git to ensure consistency in repository
updates.
• Format Python however you like with Black is a super short
introduction on what Black does to keep your code formatting
consistent.
• Intro to Black – The Uncompromising Python Code Formatter contains
some introductory examples for using Black and shows how to use it on
your source files.
• Python static analysis tools is a quick overview of the features for Black,
Pylint, Pyflakes, Mypy, Bandit and Prospector.
Networking
Computing networking is critical to building reliable, performant Python web
applications.
Resources about networking
• Monitoring and Tuning the Linux Networking Stack: Receiving Data
along with Monitoring and Tuning the Linux Networking Stack:
Sending Data are incredibly detailed technical posts on the networking
layer within Linux operating systems.
• Computer networking is a free book that explains how networking
between computer systems works. There are also exercises for testing
what you learned along the way.
297
Full Stack Python: 2020 Supporter's Edition
• What's the history behind 192.168.1.1? Why not 192.169.1.1 or any
other IP address? When did it start being used? Who started it? Why?
Why not 1.1.1.1? What is the relation to 127.0.0.1? What about 10.0.0.1
(Apple)? is a nice answer on the history of IPv4 addressing and why
various IP addresses such as 192.168.1.1 became standards for localhost
or other local networking.
• We built network isolation for 1,500 services to make Monzo more
secure provides the thinking, processes and data analysis behind how
one team took a complex environment, separated the dependencies
and was able to improve their network. It's a great case study-style
article that has more detail than a lot of similar operations posts.
• Dropbox traffic infrastructure: Edge network explains how Dropbox
uses edge-of-the-network resources closer to the end user to optimize
performance of their service.
• On the shoulders of giants: recent changes in Internet traffic examines
how the COVID-19 pandemic of 2020 changed the times during the day
and locations of how the majority of internet traffic was routed and
consumed.
HTTPS
The HTTP Secure (HTTPS) protocol encyrpts data between a web server and
the client web browser.
HTTPS tutorials
• How To Secure Nginx with Let's Encrypt on Ubuntu 18.04 walks
through how to configure your Nginx server with HTTPS using a free
Let's Encrypt domain certificate.
• Switching Your Site to HTTPS on a Shoestring Budget shows the steps
for moving a GitHub Pages-based site or any site that can be behind
Cloudflare's free tier to HTTPS.
298
Full Stack Python: 2020 Supporter's Edition
Additional HTTPS resources
• The 6-Step "Happy Path" to HTTPS covers how to obtain a free SSL
certificate, permanently redirect HTTP to HTTPS and fix insecure
references to non-HTTPS resources.
• HTTPS on Stack Overflow: the end of a long road is a wonderfully indepth post on transitioning from HTTP to HTTPS, including
redirecting all HTTP requests to HTTPS, for a very high trafficked
website.
• TLS stats from 1.6 billion connections to mozilla.org provides realworld data for which ciphersuites to use based on mozilla.org
connections.
• HTTPS in the real world is both an accesible read and gives the nitty
gritty details beyond how the HTTPS handshake and transmission
work, along the assumptions it breaks down about where the most
likely attack vectors really come from.
• How Let's Encrypt Works is a primer on how the free and now widelyused certificate service grants and revokes domain certificates.
• Performing & Preventing SSL Stripping: A Plain-English Primer
explains the KRACK Attack for stealing data despite an HTTPS
connection and how your site needs to use HSTS to prevent the attack.
• HTTPS adoption has reached the tipping point shows data about the
growth of HTTPS and how most sites now serve more than half their
traffic via secure connections.
• Google's HTTPS transparency report card shows the growth of HTTPS
connections across Google's properties. There is a clear upward trend
but even some of Google's properties still are not 100% using HTTPS
over unencrypted HTTP connections.
• An overview of TLS 1.3 and TLS 1.3 - enhanced performance, hardened
security cover the high-level information on the latest approved version
of Transport Security Layer (TLS) 1.3.
• How https works is a fun cartoon illustration that demonstrates the
basic concepts of a secure HTTP connection.
299
Full Stack Python: 2020 Supporter's Edition
WebSockets
A WebSocket is a standard protocol for two-way data transfer between a client
and server. The WebSockets protocol does not run over HTTP, instead it is a
separate implementation on top of TCP.
Why use WebSockets?
A WebSocket connection allows full-duplex communication between a client
and server so that either side can push data to the other through an
established connection. The reason why WebSockets, along with the related
technologies of Server-sent Events (SSE) and WebRTC data channels, are
important is that HTTP is not meant for keeping open a connection for the
server to frequently push data to a web browser. Previously, most web
applications would implement long polling via frequent Asynchronous
JavaScript and XML (AJAX) requests as shown in the below diagram.
Server push is more efficient and scalable than long polling because the web
browser does not have to constantly ask for updates through a stream of
AJAX requests.
300
Full Stack Python: 2020 Supporter's Edition
While the above diagram shows a server pushing data to the client,
WebSockets is a full-duplex connection so the client can also push data to the
server as shown in the diagram below.
The WebSockets approach for server- and client-pushed updates works well
for certain categories of web applications such as chat room, which is why
that's often an example application for a WebSocket library.
301
Full Stack Python: 2020 Supporter's Edition
Implementing WebSockets
Both the web browser and the server must implement the WebSockets
protocol to establish and maintain the connection. There are important
implications for servers since WebSockets connections are long lived, unlike
typical HTTP connections.
A multi-threaded or multi-process based server cannot scale appropriately for
WebSockets because it is designed to open a connection, handle a request as
quickly as possible and then close the connection. An asynchronous server
such as Tornado or Green Unicorn monkey patched with gevent is necessary
for any practical WebSockets server-side implementation.
On the client side, it is not necessary to use a JavaScript library for
WebSockets. Web browsers that implement WebSockets will expose all
necessary client-side functionality through the WebSockets object.
However, a JavaScript wrapper library can make a developer's life easier by
implementing graceful degradation (often falling back to long-polling when
WebSockets are not supported) and by providing a wrapper around browserspecific WebSocket quirks. Examples of JavaScript client libraries and Python
implementations are shown in a section below.
Nginx WebSocket proxying
Nginx officially supports WebSocket proxying as of version 1.3. However, you
have to configure the Upgrade and Connection headers to ensure requests are
passed through Nginx to your WSGI server. It can be tricky to set this up the
first time.
Here are the configuration settings I use in my Nginx file as part of my
WebSockets proxy.
# this is where my WSGI server sits answering only on localhost
# usually this is Gunicorn monkey patched with gevent
upstream app_server_wsgiapp {
server localhost:5000 fail_timeout=0;
}
server {
# typical web server configuration goes here
# this section is specific to the WebSockets proxying
location /socket.io {
proxy_pass http://app_server_wsgiapp/socket.io;
302
Full Stack Python: 2020 Supporter's Edition
proxy_redirect off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
proxy_read_timeout 600;
}
}
Note if you run into any issues with the above example configuration you'll
want to scope out the official HTTP proxy module documentation.
The following resources are also helpful for setting up the configuration
properly.
• Nginx has an official page for WebSocket proxying.
• Proxying WebSockets with Nginx shows how to proxy with Socket.io.
Python WebSockets implementations
The following projects either implement WebSockets in Python or provide
example code you can follow to use WebSockets in your own projects.
• Autobahn uses Twisted and asyncio to create the server-side
WebSockets component while AutobahnJS assists on the client web
browser side.
• Django Channels is built on top of WebSockets and is easy to integrate
with existing or new Django projects.
• Flask-SocketIO is a Flask extension that relies upon eventlet or gevent
to create server-side WebSockets connections.
• websocket-client provides low level APIs for WebSockets and works
with both Python 2 and 3.
• Crossbar.io builds upon Autobahn and includes a separate server for
handling the WebSockets connections if desired by the web app
developer.
303
Full Stack Python: 2020 Supporter's Edition
JavaScript client libraries
JavaScript is used to create the client side of the WebSocket connection
because the client is typically a web browser. While you do not need one of
these client-side libraries to use WebSockets, they are useful for minimizing
the amount of boilerplate code to establish and handle your connections.
• Socket.io's client side JavaScript library can be used to connect to a
server side WebSockets implementation.
• web-socket-js is a Flash-based client-side WebSockets implementation.
• AutobahnJS assists on the client web browser side.
Example code for WebSockets in Python
There are numerous Python-based implementations for WebSockets so
sometimes it's just easiest to examine an example so you can build something
for your own project.
• The Flask-SocketIO project has a chat web application that demos
sending server generated events as well as input from users via a text
box input on a form.
• The realtime codenames game source code is a full-featured example
for using WebSockets via Flask-SocketIO. There is also a multi-part
tutorial that walks through the code.
• The python-websockets-example contains code to create a simple web
application that provides WebSockets using Flask, Flask-SocketIO and
gevent.
Python-specific WebSockets resources
• The "Async Python Web Apps with WebSockets & gevent" talk I gave at
San Francisco Python in January 2015 is a live-coded example Flask
web app implementation that allows the audience to interact with
WebSockets as I built out the application.
• Creating a Real-time Web-based Application using Flask, Vue, and
Socket.io: part 1, part 2 and part 3 are a complete front-to-backend
WebSockets, Python and JavaScript front end framework example with
open source code.
304
Full Stack Python: 2020 Supporter's Edition
• Real-time in Python provides Python-specific context for how the
server push updates were implemented in the past and how Python's
tools have evolved to perform server side updates.
• websockets is a WebSockets implementation for Python 3.3+ written
with the asyncio module (or with Tulip if you're working with Python
3.3).
• Speeding up Websockets 60X is a cool experiment in coding loops
different ways to eek out more performance from WebSockets
connections. It is unclear how generalizable the results in the blog post
are to other programs but it is a good example of how tweaking and
tuning can produce outsized returns in some applications.
• The Choose Your Own Adventure Presentations tutorial uses
WebSockets via gevent on the server and socketio.js for pushing vote
count updates from the server to the client.
• Adding Real Time to Django Applications shows how to use Django
and Crossbar.io to implement a publish/subscribe feature in the
application.
• Async with Bottle shows how to use greenlets to support WebSockets
with the Bottle web framework.
• If you're deploying to Heroku, there is a specific WebSockets guide for
getting your Python application up and running.
• The Reddit thread for this page has some interesting comments on
what's missing from the above content that I'm working to address.
• Creating Websockets Chat with Python shows code for a Twisted server
that handles WebSockets connections on the server side along with the
JavaScript code for the client side.
• Synchronize clients of a Flask application with WebSockets is a quick
tutorial showing how to use Flask, the Flask-SocketIO extension and
Socket.IO to update values between web browser clients when changes
occur.
305
Full Stack Python: 2020 Supporter's Edition
General WebSockets resources
WebSockets have wide browser support and therefore many web frameworks
across all major programming languages have libraries to make creating
WebSockets connections easier. The following resources are general guides
and tutorials that provide context for the protocol without getting into the
weeds of how to use WebSockets in Python.
• The official W3C candidate draft for WebSockets API and the working
draft for WebSockets are good reference material but can be tough for
those new to the WebSockets concepts. I recommend reading the
working draft after looking through some of the more beginner-friendly
resources list below.
• WebSockets 101 by Armin Ronacher provides a detailed assessment of
the subpar state of HTTP proxying in regards to WebSockets. He also
discusses the complexities of the WebSockets protocol including the
packet implementation.
• The "Can I Use?" website has a handy WebSockets reference chart for
which web browsers and specific versions support WebSockets.
• WebSockets for fun and profit has a nice concise overview of
WebSockets alternatives like long polling and Server-Sent Events (SSE)
before it goes into a WebSockets example that includes JavaScript code
for the client-side implementation.
• Mozilla's Developer Resources for WebSockets is a good place to find
documentation and tools for developing with WebSockets.
• WebSockets from Scratch gives a nice overview of the protocol then
shows how the lower-level pieces work with WebSockets, which are
often a black box to developers who only use libraries like Socket.IO.
• websocketd is a WebSockets server aiming to be the "CGI of
WebSockets". Worth a look.
• How WebSockets Work – With Socket.io Demo walks through the
HTTP-to-WebSocket upgrade handshake and explains a bit about how
WebSockets work. The provided code is NodeJS on the backend but the
SocketIO client side JavaScript is the same as you would implement in
a Python-backed web application.
306
Full Stack Python: 2020 Supporter's Edition
• Can WebSockets and HTTP/2 Co-exist? compares and contrasts the
two protocols and shows how they have differences which will likely
lead to WebSockets sticking around for awhile longer.
• A Brief Introduction to WebSockets and Socket.io by Saleh Hamadeh is
a video on WebSockets basics and using the Socket.io JavaScript
library to wrap WebSockets functionality in web browsers.
• Benchmarking and Scaling WebSockets: Handling 60000 concurrent
connections is a detailed examination of how WebSockets connections
can scale to tens of thousands of users.
• Writing WebSocket servers gets into the nitty-gritty of how
WebSockets work. Well worth reading to get a deep understanding of
WebSockets connections.
WebRTC
Web Real-Time Communications (WebRTC) is a specification for a protocol
implementation that enables web apps to transmit video, audio and data
streams between client (typically a web browser) and server (usually a web
server).
WebRTC tutorials
• How to Get Started Learning WebRTC Development explains what you
do and do not need to know as prerequisites for building with WebRTC
along with some sources for learning.
• This post titled WebRTC: a working example and the companion open
source repository provides a simple working example of WebRTC
technology, without any 3rd party dependencies. It allows 2 web
browsers to exchange audio and video streams by using the aiohttp
and python-socketio modules.
• A real world guide to WebRTC goes through WebRTC fundamentals
such as data channels, audio and video, screen sharing and file
transfers with the JavaScript code provided for each concept.
307
Full Stack Python: 2020 Supporter's Edition
• The Introduction to WebRTC video series (part 2 and part 3) can be a
bit dry at points but overall has a ton of good information that gives a
solid overview of the technology.
• Building a Snapchat-like app with WebRTC in the browser walks
through the front end JavaScript for building a photo filter application
using the WebRTC browser APIs.
• WebRTC issues and how to debug them explains the various ways that
implementations can go wrong and where to start looking when you
run into errors.
Other WebRTC resources
• A Study of WebRTC Security gives a great overview of WebRTC and the
new security concerns it can bring as it is integrated into more web
applications.
• How Discord Handles Two and Half Million Concurrent Voice Users
using WebRTC provides detailed insight into the what and why of the
highly scalable Discord technical architecture that relies upon WebRTC
for communication. There are a bunch of great examples here for why
some of the service must be centralized (to prevent client IP addresses
from leaking to other clients) while others are decentralized to assist
with scaling the number of possible connections.
• Architectures for a kickass WebRTC application is a video of a technical
talk that covers some of the tools and protocols that can be used to
create your WebRTC projects and why you would choose one tool other
another.
• WebRTC connection times and the power of playing around with data
provides data on connection times and potential reasons for WebRTC
connection quality suffers in some cases.
• A closer look into WebRTC covers the Safari WebRTC implementation
in WebKit and explains some of the nuances for that specific web
browser's implementation.
• STUN/TURN servers are used to relay data to a non-public IP address
in a WebRTC application. This blog post on Do you still need TURN if
308
Full Stack Python: 2020 Supporter's Edition
your media server has a public IP address? answers some frequently
asked questions about when a TURN server is truly required.
• An Intro to WebRTC and Accessing a User’s Media Devices goes into
the JavaScript needed to use a computer's media devices such as the
microphone and video camera through the web browser's APIs.
• AIORTC: An Asynchronous WebRTC Framework is an interview with
the developer of an async WebRTC framework that is built upon
asyncio.
Application Programming Interfaces
Application programming interfaces (APIs) provide machine-readable data
transfer and signaling between applications.
Why are APIs important?
HTML, CSS and JavaScript create human-readable webpages. However, those
webpages are not easily consumable by other machines.
Numerous scraping programs and libraries exist to rip data out of HTML but
it's simpler to consume data through APIs. For example, if you want the
content of a news article it's easier to get the content through an API than to
scrap the text out of the HTML.
Key API concepts
There are several key concepts that get thrown around in the APIs world. It's
best to understand these ideas first before diving into the API literature.
• Representation State Transfer (REST)
• Webhooks
• JavaScript Object Notation (JSON) and Extensible Markup Language
(XML)
• Endpoints
309
Full Stack Python: 2020 Supporter's Edition
What are Webhooks?
A webhook is a user-defined HTTP callback to a URL that executes when a
system condition is met. The call alerts the second system via a POST or GET
request and often passes data as well.
Webhooks are important because they enable two-way communication
initiation for APIs. Webhook flexibility comes in from their definition by the
API user instead of the API itself.
For example, in the Twilio API when a text message is sent to a Twilio phone
number Twilio sends an HTTP POST request webhook to the URL specified
by the user. The URL is defined in a text box on the number's page on Twilio
as shown below.
API open source projects
• Swagger is an open source project written in Scala that defines a
standard interface for RESTful APIs.
API resources
• Zapier has an APIs 101 free guide for what APIs are, why they are
valuable and how to use them properly.
310
Full Stack Python: 2020 Supporter's Edition
• What is REST? is a well-written overview of the REpresentational State
Transfer (REST) architecture proposed by Roy Fielding in his
dissertation.
• The list of public APIs in this Git repository is incredible and worth
examining if you are looking to find data sources for your projects.
• GET PUT POST is a newsletter just about APIs. Past issues have
included interviews with the developers behind Stripe, Dropbox and
Coinbase.
• Designing robust and predictable APIs with idempotency discusses
designing APIs for idempotency, which means guaranteeing that side
effects only occur once. This topic is especially important with web
APIs because network connections are and will always be unreliable so
you need to build knowing network problems will happen.
• What RESTful actually means does a fantastic job of laying out the
REST principles in plain language terms while giving some history on
how they came to be.
• What is a webhook? by Nick Quinlan is a plain English explanation for
what webhooks are and why they are necessary in the API world.
• Simplicity and Utility, or, Why SOAP Lost provides context for why
JSON-based web services are more common today than SOAP which
was popular in the early 2000s.
• API tools for every occasion provides a list of 10 tools that are really
helpful when working with APIs that are new in 2015.
APIs learning checklist
1. Learn the API concepts of machine-to-machine communication with
JSON and XML, endpoints and webhooks.
2. Integrate an API such as Twilio or Stripe into your web application.
Read the API integration section for more information.
3. Use a framework to create an API for your own application. Learn
about web API frameworks on the API creation page.
311
Full Stack Python: 2020 Supporter's Edition
4. Expose your web application's API so other applications can consume
data you want to share.
Microservices
Microservices are an application architecture style where independent, selfcontained programs with a single purpose each can communicate with each
other over a network. Typically, these microservices are able to be deployed
independently because they have strong separation of responsibilities via a
well-defined specification with significant backwards compatibility to avoid
sudden dependency breakage.
Why are microservices getting so much buzz?
Microservices follow in a long trend of software architecture patterns that
become all the rage. Previously, CORBA and (mostly XML-based) serviceoriented architectures (SOA) were the hip buzzword among ivory tower
architects.
However, microservices have more substance because they are typically based
on RESTful APIs that are far easier for actual software developers to use
compared with the previous complicated XML-based schemas thrown around
by enterprise software companies. In addition, successful applications begin
with a monolith-first approach using a single, shared application codebase
and deployment. Only after the application proves its usefulness is it then
broken down into microservice components to ease further development and
deployment. This approach is called the "monolith-first" or "MonolithFirst"
pattern.
Microservice resources
• Martin Fowler's microservices article is one of the best in-depth
explanations for what microservices are and why to consider them as
an architectural pattern.
• Why microservices? presents some of the advantages, such as the
dramatically increased number of deployments per day, that a welldone microservices architecture can provide in the right situation.
Many organizational environments won't allow this level of flexibility
but if yours is one that will, it's worth considering these points.
312
Full Stack Python: 2020 Supporter's Edition
• On monoliths and microservices provides some advice on using
microservices in a fairly early stage of a software project's lifecycle.
• Why Microservices? presents advantages microservices can bring to an
existing monolithic application where it is clear what needs to be
broken down into smaller components to make it easier to iterate and
maintain.
• Developing a RESTful microservice in Python is a good story of how an
aging Java project was replaced with a microservice built with Python
and Flask.
• Microservices: The essential practices first goes over what a monolith
application looks like then dives into what operations you need to
support potential microservices. For example, you really need to have
continuous integration and deployment already set up. This is a good
high-level overview of the topics many developers aren't aware of when
they embark on converting a monolith to microservices.
• Using Nginx to Load Balance Microservices explains how an Nginx
instance can use configuration values from etcd updated by confd as
the values are modified. This setup can be useful for load balancing
microservices as the backend services are brought up and taken down.
• How Microservices have changed and why they matter is a high level
overview of the topic with some quotes from various developers around
the industry.
• The State of Microservices Today provides some general trends and
broad data showing the increasing popularity of microservices heading
into 2016. This is more of an overview of the term than a tutorial but
useful context for both developers and non-developers.
• bla bla microservices bla bla is a transcript for a killer talk on
microservices that breaks down the important first principles of
distributed systems, including asynchronous communication, isolation,
autonomicity, single responsibility, exclusive state, and mobility. The
slides along with the accompanying text go into how reality gets messy
and how to embrace the constraints inherent in distributed systems.
313
Full Stack Python: 2020 Supporter's Edition
• In the Microservices with Docker, Flask, and React course bundle, you
will learn how to quickly spin up a reproducible development
environment with Docker to manage a number of microservices. Once
the app is up and running locally, you'll learn how to deploy it to an
Amazon EC2 instance. Finally, we'll look at scaling the services on
Amazon EC2 Container Service (ECS).
• Should I use microservices? contains a high-level perspective on why or
why not use microservices as an architectural choice.
• Zuul is open source proxy for combining multiple microservices into a
unified API call. Check out this post on Using Netflix Zuul to Proxy
your Microservices to learn more and get started using it.
• The Majestic Monolith explains the advantages of a monolithic
architecture and how it's worked amazingly well for the Basecamp
small development team.
• Developing a RESTful micro service in Python goes into detail on how
one development team rebuilt an existing Java application as a
microservice in Python with Flask.
• Documenting microservices has some good thoughts on how to explain
your microservice API to other developers such as clearly showing all of
the endpoints as well as the intersection of multiple endpoints.
• Best practices for building a microservice is an exhaustive (and
somewhat exhausting to read!) list with what you should think about as
you build your microservice.
• The Hardest Part About Microservices: Your Data presents a datacentric view on how to structure and transport data in a microservices
architecture.
• Deleting data distributed throughout your microservices architecture
examines how Twitter handles issues with discoverability, access and
erasure in their microservices-heavy production environment.
Webhooks
Webhooks are user-defined HTTP callbacks that allow interactions between
otherwise independent web applications.
314
Full Stack Python: 2020 Supporter's Edition
Webhooks projects
• Thorn is a Python framework for building webhooks and event-driven
applications. There is also an introduction post with more information
on using Thorn.
• PyWebhooks is a proof-of-concept library for building webhooks-based
services.
• Webhook Tester
Webhook resources
• What's a webhook? is a high-level explanation of this concept that also
contains some basic security considerations when using them.
• How to Listen for Webhooks with Python has code examples in both
Flask and Django for how to receive an HTTP POST webhook request,
as well as how to test it locally with Ngrok.
• Should you build a webhooks API?
• Webhooks do’s and dont’s: what we learned after integrating +100
APIs
• Why Are Webhooks Better Than Serverless Extensibility?
• Webhooks Provide an Efficient Alternative to API Polling is a high-level
overview of the advantages of webhooks over alternatives such as
constant polling for updates.
• Webhooks: the devil in the details
• How to use webhooks for recovering customers
Bots
Bots are software programs that combine requests, which are typically
provided as text, with contextual data, such as geolocation and payment
information, to appropriately handle the request and respond. Bots are often
also called "chatbots", "assistants" or "agents."
315
Full Stack Python: 2020 Supporter's Edition
Open source bot examples
• Limbo is an awesome Slack chatbot that provides a base for Python
code that otherwise would require boilerplate to handle the Slack API
events firehose.
• python-rtmbot is the bot framework for building Slack bots with the
Real Time Messaging (RTM) API over WebSockets.
• The GitHub bots search results and the bots GitHub topic contain tens
of thousands of example bots you take analyze to see how they are
built.
• Errbot can work with multiple backends such as Hipchat, Discord,
Slack and Telegram. It's designed to be deployed "as is" except for your
credentials but the Python source code can also be customized.
Python-specific bot resources
• How to Buid an SMS Slack Bot is a tutorial on using SMS text messages
to communicate with a Slack bot that can post and receive messages.
The bot is a good base for a more complicated Slack bot that could use
natural language processing or other more advanced parsing
techniques. Either Python 2 or 3 can be used with the code which is
also available on GitHub.
• Dropbox open sourced their security Slack bot, which is built in
Python. The bot converses with a user when backend systems detect
strange behavior on one of their accounts to check if there has been a
security breach.
• Making a Reddit + Facebook Messenger Bot builds a bot for two
platforms and shows how to deploy it to Heroku.
• Build a Slack Bot that Mimics Your Colleagues with Python is a
humorous post that uses the markovify Markov Chains library to
generate responses that are similar to ones other Slack users have said.
• A Slack bot with Python's asyncio shows how to connect a bot to Slack
via the web API using the Python 3 asyncio standard library.
• Facebook-Message-Bot is an open source Facebook Messenger bot
written in Python.
316
Full Stack Python: 2020 Supporter's Edition
• Build a Reddit bot is a four part tutorial series that starts with reading
posts, continues with replying to posts, automating the bot and finally
adding behavior and a personality to the bot.
Additional Bots resources
• Bots: An introduction for developers explains the technical details of
how to create Telegram bots.
• Building better bots with AWS Lex: Part 1 and part 2 show how to use
Amazon's service offering for better natural language processing in
your bots.
• Slack bot token leakage exposing business critical information is a
detailed look at a search on GitHub for Slack tokens that are used
mostly for bots but must be kept secret. Otherwise those tokens expose
the entire Slack team's messaging to outside parties.
• The Economist wrote a general piece on why bots look like they'll gain
adoption in various market segments. The piece doesn't have much
technical depth but it's a good overview of how some businesses are
looking at the opportunity.
• Three challenges you’re going to face when building a chatbot provides
insightful thoughts on problems to anticipate based on the author's
experience building, deploying and scaling chatbots.
• Bots won't replace apps is a fantastic piece by WeChat's product
manager on how text-based bots alone typically do not provide a good
user experience. Instead, chat apps with automated responses, user
data and basic web browser functionality are what has allowed bot
concepts to bloom in Asian markets. There's a lot of good information
in this post to unpack.
• Principles of bot design contains some general, common-sense ideas to
keep in mind when building bots such as do not pretend to be a human
(because it will be quickly discovered that your bot is not a human) and
keep it as simple as possible so people can actually use the damn thing.
• 6 things I learned creating my own Messenger chatbot contains some
solid general advice for building your custom bots.
317
Full Stack Python: 2020 Supporter's Edition
API Creation
Creating and exposing APIs allows your web application to interact with other
applications through machine-to-machine communication.
API creation frameworks
• Django REST framework and Tastypie are the two most widely used
API frameworks to use with Django. The edge currently goes to Django
REST framework based on rough community sentiment. Django REST
framework continues to knock out great releases after the 3.0 release
mark when Tom Christie ran a successful Kickstarter campaign.
• Flask-RESTful is widely used for creating web APIs with Flask. It was
originally open sourced by Twilio then moved into its own GitHub
organization so engineers from outside the company could be core
contributors.
• Flask API is another common library for exposing APIs from Flask web
applications.
• Sandman is a widely used tool to automatically generate a RESTful API
service from a legacy database without writing a line of code (though
it's easily extensible through code).
• Cornice is a REST framework for Pyramid.
• Restless is a lightweight API framework that aims to be framework
agnostic. The general concept is that you can use the same API code for
Django, Flask, Bottle, Pyramid or any other WSGI framework with
minimal porting effort.
• Eve is a Python REST framework built with Flask, MongoDB and
Redis. The framework's primary author Nicola Iarocci gave a great talk
at EuroPython 2014 that introduced the main features of the
framework.
• Falcon is a fast and lightweight framework well suited to create
RESTful APIs.
• Hug built on-top of Falcon and Python3 with an aim to make
developing Python driven APIs as simple as possible, but no simpler.
318
Full Stack Python: 2020 Supporter's Edition
Hug leverages Python3 annotations to automatically validate and
convert incoming and outgoing API parameters.
• Pycnic is a JSON-API-only framework designed with REST in mind.
API testing projects
Building, running and maintaining APIs requires as much effort as building,
running and maintaining a web application. API testing frameworks are the
equivalent of browser testing in the web application world.
• zato-apitest invokes HTTP APIs and provides hooks for running
through other testing frameworks.
• Tavern is a pytest plugin for automated API testing.
Hosted API testing services
• Runscope is an API testing SaaS application that can test both your
own APIs and external APIs that your application relies upon.
• API Science is focused on deep API testing, including multi-step API
calls and monitoring of external APIs.
API creation resources
• An API is only as good as its documentation is a strongly held mantra
in the web API world because so many APIs have poor documentation
that prevents ease-of-use. If an API is not well documented then
developers who have options to use something else will just skip it.
• 8 Open-Source Frameworks for Building APIs in Python presents a
high-level overview of the options for building APIs in Python.
• Adventures in running a free, public API is a quick story of a
developer's geolocation API being abused and his lack of resources for
preventing further abuse. Eventually he had to shut down the free plan
and only provide a paid plan in addition to allowing others to host the
open source code. Fraud and malware prevention are difficult
problems so keep an eye on server utilization and endpoint calls growth
to separate legitimate from illegitimate traffic.
319
Full Stack Python: 2020 Supporter's Edition
• API versioning is a wonderful article on the tradeoffs between
including an API version in the URL compared to other common ways
to version APIs.
• API Doc JS allows a developer to embed markup in their
documentation that will generate a site based on the endpoints
available in the API.
• 10 Reasons Why Developers Hate Your API (And what to do about it)
goes through the top difficulties and annoyances developers face when
working with APIs and how you can avoid your API falling into the
same traps.
• Versioning of RESTful APIs is a difficult and contentious topic in the
web API community. This two-part series covers various ways to
version your API and how to architect a version-less API.
• NARWHL is a practical API design site for developers confused about
what is appropriate for RESTful APIs.
• 18F's API standards explains the details behind their design decisions
on creating modern RESTful APIs.
• Design a beautiful REST API reviews common design decisions
regarding endpoints, versioning, errors and pagination. There is also a
source material YouTube video where this blog post derives its
recommendations from.
• Move Fast, Don't Break Your API are slides and a detailed blog post
from Amber Feng at Stripe about building an API, separating layers of
responsibility, hiding backwards compatibility and a whole slew of
other great advice for developers and API designers.
• Self-descriptive, isn't. Don't assume anything. is an appeal that
metadata makes a difference in whether APIs are descriptive or not.
• Designing the Artsy API has their recommendations list for building an
API based on their recent experiences.
• Hacker News had a discussion on what's the best way to write an API
spec? that provides a few different viewpoints on this topic.
320
Full Stack Python: 2020 Supporter's Edition
• Apigee's Web API Design ebook is free and contains a wealth of
practical advice for what design decisions to make for your web API.
• Documenting APIs: A guide for technical writers and engineers is a
guide that covers good practices for thinking like a developer who will
use your API, as well as what the documentation for endpoints and
other important pieces should look like.
• 1-to-1 Relationships and Subresources in REST APIs tells the story of
design decisions that were made during an API's creation and why
those choices were made.
• How many status codes does your API need? gives an answer from a
Dropbox API developer as to their decision making process.
• These two Stack Overflow questions and answers on Is it better to place
a REST API on a subdomain or in a subfolder? and subdomain vs.
subdirectory in web programming provide reasons and opinions on the
debate around using a subdomain, for example
api.fullstackpython.com versus www.fullstackpython.com/api/. There
is also a nice summary of endpoint configurations in this article from
ProgrammableWeb.
• This API Design Guide is based on Heroku's best practices for the
platform's API.
Python-specific API creation resources
• Deploying a Machine Learning Model as a REST API
• Choosing an API framework for Django by PyDanny contains questions
and insight into what makes a good API framework and which one you
should currently choose for Django.
• Creating Web APIs with Python and Flask is a free book on building
APIs with Flask as the core web framework.
• RESTful web services with Python is an interesting overview of the
Python API frameworks space.
• Implementing a RESTful Web API with Python & Flask is a good
walkthrough for coding a Flask app that provides standard web API
321
Full Stack Python: 2020 Supporter's Edition
functionality such as proper HTTP responses, authentication and
logging.
• REST Hooks is an open source Python project that makes it easier to
implement subscription-based "REST hooks". These REST hooks are
similar to webhooks, but provide a different mechanism for subscribing
to updates via a REST interface. Both REST hooks and webhooks are
far more efficient than polling for updates and notifications.
• Rate limiters provides a great overview of how limiting access in both
the number of requests per second as well as the number of concurrent
open connections can help keep your API alive during times of heavy
traffic.
• Writing HTTP files to test HTTP APIs shows how to perform
automated testing of APIs using HTTP file formats provided by VS
Code REST Client and the JetBrains HTTP Client Editor.
• Serialization is common for transforming objects into web API JSON
results. One company found the serialization performance of Django
REST framework was lacking so they created Serpy and wrote a blog
post with the results of its performance.
• Designing a Web API gives a detailed walkthrough of concepts and
design decisions you need to make when building an API.
• Microsoft's REST API Guidelines are a detailed set of considerations
for when you are building your own APIs that you want to be easilyconsumable by other developers.
• Designing Good Static REST API Documentation is about
documentation not APIs themselves, but it covers a critical topic if you
want your API to succeed: how to use the damn thing.
• Building better API docs shows how Square used Swagger with React to
create more helpful docs.
• Best Practices For Creating Useful API Documentation covers standard
but important topics such as knowing your audience, ensuring your
documentation covers the error codes, and providing a changelog as
well as terms of service.
322
Full Stack Python: 2020 Supporter's Edition
Django REST Framework resources
• This multi-part series on getting started with Django REST framework
and AngularJS (part 1) along with its second part do a good job of
showing how a RESTful API can serve as the backend for a client front
end built with a JavaScript MVC framework.
• If you're looking for a working example of a Django REST framework
project, check out the PokeAPI, open sourced under the BSD license.
API creation learning checklist
1. Pick an API framework appropriate for your web framework. For
Django I recommend Django REST framework and for Flask I
recommend Flask-RESTful.
2. Begin by building out a simple use case for the API. Generally the use
case will either involve data that users want in a machine-readable
format or a backend for alternative clients such as an iOS or Android
mobile app.
3. Add an authentication mechanism through OAuth or a token scheme.
4. Add rate limiting to the API if data usage volume could be a
performance issue. Also add basic metrics so you can determine how
often the API is being accessed and whether it is performing properly.
5. Provide ample documentation and a walkthrough for how the API can
be accessed and used.
6. Figure out other use cases and expand based on what you learned with
the initial API use case.
API Frameworks
API frameworks are code libraries that provide commonly-used functionality
when building your own web application programming interfaces (APIs).
Python API Frameworks
• Django REST Framework
• API Star
323
Full Stack Python: 2020 Supporter's Edition
• Starlette
• Flask RESTful
Django REST Framework
Django REST Framework (source code), typically abbreviated "DRF", is a
Python library for building web application programming interfaces (APIs).
Django REST Framework resources
• How to Developer APIs with Django REST Framework covers the steps
for creating a development environment for your Django+DRF project
then creating API endpoints with the test-driven development (TDD)
approach.
• The official DRF tutorial is one of the best first-party pieces of
documentation on any open source project. The rest of the DRF docs
are spectacular as well.
• Django: Building REST APIs is the first part in an excellent multi-part
series on DRF:
◦
◦
◦
◦
◦
Getting started with DRF
Serializers
ModelSerializer and Generic Views
ViewSet, ModelViewSet and Router
Authentication and Permissions
324
Full Stack Python: 2020 Supporter's Edition
◦ JSON Web Tokens (JWT)
• How to optimize your Django REST Viewsets provides a good step-bystep example about using select_related and prefetch_related
in the Django ORM layer to avoid large numbers of unnecessary
queries in your views. Also, props to the author for wearing a UVA tshirt in his picture when his blog says he works as a developer in
Blacksburg, Virginia (where Virginia Tech is located).
• How to Save Extra Data to a Django REST Framework Serializer is a
concise, handy tutorial for combining additional data with the alreadydefined DRF serializer fields before saving everything to a database or
similar action.
• Django polls api using Django REST Framework gives a great
walkthrough for creating a question polling application backend with
code and the explanation as you build it.
• Django REST Framework Permissions in Depth has code examples and
explains permission classes versus authentication classes in DRF.
• Optimizing slow Django REST Framework performance
• TLT: Serializing Authenticated User Data With Django REST
Framework
• Building an API with Django REST Framework and Class-Based Views
• Simple Nested API Using Django REST Framework
• Building APIs with Django and Django Rest Framework
API Integration
The majority of production Python web applications rely on several externally
hosted application programming interfaces (APIs). APIs are also commonly
referred to as third party services or external platforms. Examples include
Twilio for messaging and voice services, Stripe for payment processing and
Disqus for embedded webpage comments.
There are many articles about proper API design but best practices for
integrating APIs is less commonly written about. However, this subject
325
Full Stack Python: 2020 Supporter's Edition
continuously grows in importance because APIs provide critical functionality
across many implementation areas.
API Integration Resources
• APIs for Beginners is an awesome free video course that shows how to
use APIs and add them to applications with a bunch of useful code
examples.
• Some developers prefer to use Requests instead of an API's helper
library. In that case check out this tutorial on using requests to access
web APIs.
• How to Debug Common API Errors talks about the 1xx through 5xx
status codes you will receive when working with HTTP requests and
what to do when you get some of the more common ones such as 403
Forbidden and 502 Bad Gateway.
• Product Hunt lists many commonly used commercial and free web
APIs to show "there's an API for everything".
• There's a list of all government web APIs at 18F's API-All-the-X list.
The list is updated whenever a new API comes online.
• The Only Type of API Services I'll Use explains how alignment between
usage and pricing is crucial to a solid, long-lasting API experience.
Other models such as tiered usage and enterprise upsells typically lead
to terrible developer experiences and should generally be avoided when
building applications with APIs.
• John Sheehan's "Zen and the Art of API Maintenance" slides are
relevant for API integration.
• This post on "API Driven Development" by Randall Degges explains
how using APIs in your application cuts down on the amount of code
you have to write and maintain so you can launch your application
faster.
• Retries in Requests is a nice tutorial for easily re-executing failed HTTP
requests with the Requests library.
326
Full Stack Python: 2020 Supporter's Edition
• My DjangoCon 2013 talk dove into "Making Django Play Nice With
Third Party Services."
• If you're looking for a fun project that uses two web APIs within a
Django application, try out this tutorial to Build your own Pokédex
with Django, MMS and PokéAPI.
• vcr.py is a way to capture and replay HTTP requests with mocks. It's
extremely useful for testing API integrations.
• Caching external API requests is a good post on how to potentially limit
the number of HTTP calls required when accessing an external web
API via the Requests library.
• Working with APIs the Pythonic way does a nice job walking through
creating a naive web API client, building on it with better error
handling and then explaining how to test it by mocking out the service.
API integration learning checklist
1. Pick an API known for top notch documentation. Here's a list of ten
APIs that are a good starting point for beginners.
2. Read the API documentation for your chosen API. Figure out a simple
use case for how your application could be improved by using that API.
3. Before you start writing any code, play around with the API through
the commandline with cURL or in the browser with Postman. This
exercise will help you get a better understanding of API authentication
and the data required for requests and responses.
4. Evaluate whether to use a helper library or work with Requests. Helper
libraries are usually easier to get started with while Requests gives you
more control over the HTTP calls.
5. Move your API calls into a task queue so they do not block the HTTP
request-response cycle for your web application.
327
Full Stack Python: 2020 Supporter's Edition
Twilio
Twilio is a web application programming interface (API) that software
developers can use to add communications such as phone calling, messaging,
video and two-factor authentication into their Python applications.
Why is Twilio a good API choice?
Interacting with the standard telephone networks to send and receive phone
calls and text messages without Twilio is extremely difficult if you do not
know the unique telecommunications protocols such as Session Initiation
Protocol (SIP). Twilio's API abstracts the telecommunications pieces so as a
developer you can simply use your favorite programming languages and
frameworks in your application. For example, here's how you can send an
outbound SMS using a few lines of Python code:
# import the Twilio helper library (installed with pip install twilio)
from twilio.rest import TwilioRestClient
# replace the placeholder strings in the following code line with
# your Twilio Account SID and Auth Token from the Twilio Console
client = TwilioRestClient("ACxxxxxxxxxxxxxx", "zzzzzzzzzzzzz")
# change the "from_" number to your Twilio number and the "to" number
# to any phone number you want to send the message to
client.messages.create(to="+19732644152", from_="+12023358536",
body="Hello from Python!")
Learn more about the above code in the How to Send SMS Text Messages with
Python tutorial.
328
Full Stack Python: 2020 Supporter's Edition
How is Twilio's documentation for Python developers?
Twilio is a developer-focused company, rather than a traditional "enterprise
company", so their tutorials and documentation are written by developers for
fellow developers.
More Twilio resources
• Most Twilio Tutorials have idiomatic Python versions with entire open
source applications in Django and Flask.
• Automate the Boring Stuff with Python includes a chapter on sending
text messages that uses Twilio to get the job done.
• Build a Simpsons Quote-Bot with Twilio MMS, Frinkiac, and Python
combines picture messages sent via Twilio MMS with Frinkiac to create
and sent Simpsons cartoon quotes to any phone number.
• Finding free food with Python is a fun web scraping tutorial that uses
Beautiful Soup 4 to obtain some data from websites then uses the
Twilio SMS API via some Python code to send a text message with the
results.
• IBM's Bluemix blog contains a nice tutorial on building an IoT Python
app with a Raspberry Pi and Bluemix that uses Twilio to interact with
the Raspberry Pi.
• The Python tag on the Twilio blog provides walkthroughs for Django,
Flask and Bottle apps to learn from while building your own projects.
• Serverless Phone Number Validation with AWS Lambda, Python and
Twilio walks through using the Python runtime on AWS Lambda and
Twilio's APIs for validating phone number information.
• Receive Flask Error Alerts by Email with Twilio SendGrid shows how to
use Twilio to add email error reports to Flask web applications.
• Google Cloud recommends developers use Twilio for communications
in their apps and provides a short walkthrough for Python developers.
• SMS Tracking Notifications is a fun tutorial that combines two APIs
together - Twilio and Easypost to track packages sent through the
329
Full Stack Python: 2020 Supporter's Edition
Easypost service. There is also another tutorial on shipment
notifications specifically for Flask.
• Build a Video Chat Application with Python, JavaScript and Twilio
Programmable Video shows how to use Flask and the Twilio
Programmable Video API to build cross-platform video into new and
existing applications.
• This video titled "We're No Strangers to VoIP: Building the National
Rick Astley Hotline)" presents a hilarious hack that uses Python and
AWS Lambda as a serverless backend to power thousands of Rick
Rolling calls in several countries with Twilio.
• Stripe SMS Notifications via Twilio, Heroku, and Python is a quick
tutorial that demonstrates how to combine multiple services to receive
helpful notifications via SMS.
Disclaimer
I currently work at Twilio on the Developer Network and run the Developer
Voices team.
Stripe
Stripe is a web application programming interface (API) for processing
payments.
330
Full Stack Python: 2020 Supporter's Edition
Stripe tutorials
• How to Create a Subscription SaaS Application with Django and Stripe
shows how to build a Django application with models for the
subscription data in the Django ORM and create a pricing page.
• Switching from Braintree to Stripe covers one development team's
experience with moving payment providers.
• Dirt Cheap Recurring Payments with Stripe and AWS Lambda explains
how to use the Stripe API with AWS Lambda to handle recurring
payments instead of using a more expensive service like Chargify or
Recurly if you only have minimal requirements.
• Django Stripe Tutorial looks at how to quickly add Stripe to accept
payments on a Django/Python website.
• Setting up Stripe Connect with Django this tutorial looks at how to
integrate Stripe Connect into a Django application.
• Adding a Custom Stripe Checkout to a Flask App looks at how to add a
custom Stripe checkout to a Flask application for processing payments.
Resources about Stripe
• How Stripe Designs Beautiful Websites explains the process for how
Stripe creates their gorgeous design that makes people want to use the
service and explore what else they can build with it.
• Creating a Culture of Observability is a technical talk about monitoring
systems at scale. The presenter works at Stripe so much of his
• Implementing API Billing with Stripe covers billing and invoicing
requirements for a video calling API product. They explain how they
matched their requirements to what Stripe offers then what they had to
build themselves to get everything working the way they intended.
Slack
Slack provides a web application programming interface (API) for
programmatically interacting with its messaging service.
331
Full Stack Python: 2020 Supporter's Edition
Slack resources
• How to Build Your First Slack Bot with Python contains all the code for
getting a Slack bot up and running with Python even if you have not
previously worked with their API or built other bots.
• Use a Slack bot to deploy your app gives the sample code to a simplified
bot that you can engage with in your chat channels to perform
application deployments.
• How I built a Slack bot to help me find an apartment in San Francisco
is a story about how the author had issues finding an apartment while
moving from Boston to San Francisco. He started scraping Craigslist to
gather apartment data and built a Slack bot to message him as soon as
something that matched his criteria became available so he could take a
look at it.
• Slack on an SNES is not a Python tutorial but it provides a crazy hack
for communicating with Slack using a Super Nintendo.
• Hacking Slack accounts: As easy as searching GitHub explains how
secret Slack API keys are often committed to public GitHub
repositories which allows malicious actors to easily break into an
organization's messaging systems. Secret credentials in public
repositories is a problem for any API and it is a particular problem for
ones that are critical to a business' private communications.
• Serverless Slash Commands with Python shows how to build a
serverless Flask plus Zappa framework web app that is hosted on AWS
Lambda and can use the Slack API.
332
Full Stack Python: 2020 Supporter's Edition
• Hacking Slack using postMessage and WebSocket-reconnect to steal
your precious token examines a bug that the author found in Slack's
WebSockets reconnection operation that he reported to Slack. Slack
fixed the issue and paid him a bug bounty for his work.
• Posting messages to Slack using incoming webhooks and Python3
Requests API is a short script that uses the Requests library instead of
the Slack-provided Python helper libraries to interact with the API.
• Build a Google Analytics Slack Bot with Python walks through creating
a bot that posts Google Analytics data into Slack channels by
combining the Slack and Google APIs.
Example Slack bots
• python-rtmbot is the Slack-provided library for working with the Slack
API and WebSockets connection.
• slackify is a lightweight framework for building bots and the quickstart
walks you through how to create a simple example bot.
• slack-starterbot
• slack-api-python-examples contains the example code from several
Slack bot blog posts.
• slackbot another popular Slack bot implementation.
Okta
Okta is an application programming interface (API) for authentication and
identity management in web applications.
333
Full Stack Python: 2020 Supporter's Edition
Okta resources
• How to Add User Authentication to Flask Apps with Okta covers using
OpenID Connect with the Okta API for handling user authentication in
Flask applications.
Web Application Security
Website security must be thought about while building every level of the web
stack. However, this section includes topics that deserve particular treatment,
such as cross-site scripting (XSS), SQL injection, cross-site request forgery
and usage of public-private keypairs.
Security tools
• Bro is a network security and traffic monitor.
• lynis (source code) is a security audit tool that can run as a shell script
on a Linux system to find out its vulnerabilities so that you can fix them
instead of allowing them to be exploited by malicious actors.
• Charles is an HTTP proxy for inspecting headers, requests and
responses for all traffic that flows through it.
• TLS Observatory provides a suite of security tools for analyzing and
inspecting Transport Layer Security (TLS) services. There is also a
hosted version you can use at observatory.mozilla.org.
• WIG contains tools for gathering wireless data via Wifi protocols.
• HTTP Evader is an automated testing tool for checking firewalls to
ensure they are protecting the appropriate ports and payloads.
• Security monkey monitors for changes to AWS, Google Cloud, GitHub
and other infrastructure systems.
Specific vulnerabilities
• httpoxy is a set of vulnerabilities that can affect Python web application
servers via HTTP requests.
• Heartbleed is a vulnerability in OpenSSL implementations that must be
patched for any systems you run otherwise you are at serious risk for
data leakage.
334
Full Stack Python: 2020 Supporter's Edition
• Meltdown and Spectre are x86 architecture problems caused by
exploiting CPU branch-prediction implementations.
HTTPS resources
SSL over HTTP (HTTPS) is mandatory for securing web data traffic in transit.
There is a page dedicated to HTTPS and the following resources can also give
you a good overview of how HTTPS works.
• How does HTTPS actually work? is a well-written overview of the
protocol including certificates, signatures, signing and related topics.
• These introduction to HTTPS videos explain what HTTPS is and how to
implement it.
• This question asking what is the difference between TLS and SSL?
explains that TLS is a newer version of SSL and should be used because
SSL through version 3.0 is insecure.
• If you have wondered what all the SSL/TLS acronyms and settings
mean, read the Security/Server Side TLS guide which Mozilla uses to
operationalize its servers.
• If you're having users submit sensitive information to your site you
need to use SSL/TLS. Anything before TLS is now insecure. Check out
this handy guide that goes over some of the nuances of the subject.
• The Sorry State of SSL details the history and evolution of SSL/TLS.
There are important differences between the versions and Hynek
explains why TLS should always be used. The talk prompted work to
improve Python's SSL in 2.7.9 based on the upgrades in Python 3
outlined in The not-so-sorry state of SSL in Python.
• How HTTPS Secures Connections is a guide for what HTTPS does and
does not secure against.
• The first few milliseconds of an HTTPS connection provides a detailed
look at the SSL handshake process that is implemented by browsers
based on the RFC 2818 specification.
• Qualy SSL Server Test can be used to determine what's in place and
what is missing for your server's HTTPS connection. Once you run the
335
Full Stack Python: 2020 Supporter's Edition
test read this article on Getting an A+ on Qualy's SSL Labs Tester to
improve your situation.
General security resources
• The Open Web Application Security Project (OWASP) has cheat sheets
for security topics.
• Stanford's CS253 class is available for free online, including lecture
slides, videos and course materials to learn about web browser
internals, session attacks, fingerprinting, HTTPS and many other
fundamental topics.
• The SaaS CTO Security Checklist is an awesome list of steps for
securing your infrastructure and employees as well as what stage and
size company it is recommended that you put those procedures in
place.
• Reckon you've seen some stupid security things? Here, hold my beer...
provides hilarious, and terribly sad, security vulnerabilities and
weaknesses around encryption and password storage.
• This page contains a fantastic curated list of security reading material
from beginning to advanced topics.
• How to protect your infrastructure against the basic attacker presents a
good overview of what you need to think about when hardening your
system against reasonablely competent malicious attackers.
• The /r/netsec subreddit is one place to go to learn more about network
and application security.
• Hacking Tools Repository is a great list of password cracking, scanning,
sniffing and other security penetration testing tools.
• The EFF has a well written overview on what makes a good security
audit. It's broad but contains some of their behind the scenes thinking
on important considerations with security audits.
• Ubuntu system hardening guide provides step-by-step instructions for
hardening the most recent three Ubuntu LTS releases.
336
Full Stack Python: 2020 Supporter's Edition
• Ars Technica wrote posts on securing your website along with how to
set up a safe and secure web server: part 1 and part 2 to explain HTTPS
and SSL without much required pre-existing knowledge.
• Crypto 101 is an introductory course on cryptography for programmers.
• The first answer to the question "Why are salted hashes more secure
for password storage?" on Stack Overflow gives a wonderful
explanation for why this is an important technique to use to keep your
database passwords and other secrets more secure if the hashed strings
are leaked.
• An in-depth analysis of SSH attacks on Amazon EC2 shows how
important it is to secure your web servers, especially when they are
hosted in IP address ranges that are commonly scanned by malicious
actors.
• Cloud Security Auditing: Challenges and Emerging Approaches is a
high-level overview of some of security auditing problems that come
with cloud deployments.
• Wondering how the common buffer overflow attack works? Check out
this article on buffer overflows that explains the attack in layman's
terms.
• 7 Security Measures to Protect Your Servers provides a good overview
of the fundamentals for how servers should be configured for baseline
security.
• As you're developing on Linux, you'll want to read and follow this
Linux workstation security document to make sure your code and
environment are not compromised. If you're on Mac OS X, check out
this securing Yosemite guide which covers that environment.
• Timing attacks are one form of vulnerability that can be used to defeat
HTTPS in certain configurations. Understanding how those attacks
work is important in keeping your users' connections secure.
• Let's Encrypt at Scale shows an implementation for securing thousands
of sites with SSL certificates to support HTTPS everywhere.
337
Full Stack Python: 2020 Supporter's Edition
Web security learning checklist
1. Read and understand the major web application security flaws that are
commonly exploited by malicious actors. These include cross-site
request forgery (CSRF), cross-site scripting (XSS), SQL injection and
session hijacking. The OWASP top 10 web application vulnerabilities
list is a great place to get an overview of these topics.
2. Determine how the framework you've chosen mitigates these
vulnerabilities.
3. Ensure your code implements the mitigation techniques for your
framework.
4. Think like an attacker and actively work to break into your own system.
If you do not have enough experience to confidently break the security
consider hiring a known white hat attacker. Have her break the
application's security, report the easiest vulnerabilities to exploit in
your app and help implement protections against those weaknesses.
5. Recognize that no system is ever totally secure. However, the more
popular an application becomes the more attractive a target it is to
attackers. Reevaluate your web application security on a frequent basis.
SQL Injection
SQL injections are a category of web application security vulnerabilities that
can affect both relational databases and NoSQL data stores.
SQL Injection resources
• How security flaws work: SQL injection is an approachable primer on
the history and danger of how unsanitized inputs to a database work.
• Preventing SQL injections provides a PostgreSQL and psycopg2
example for how to avoid getting bit by a SQL injection vulnerability.
• Securing your site like it's 1999 covers a bunch of common web
application vulnerabilities including SQL injection.
338
Full Stack Python: 2020 Supporter's Edition
Cross-Site Request Forgery (CSRF)
Cross-Site Request Forgery is a type of web app vulnerability that forces users
to execute unwanted actions when authenticated to an application.
Cross-Site Request Forgery (CSRF) resources
• Preventing cross-site attacks using same-site cookies explains how
Dropbox's engineering team rolled out their same-site cookie defense
that augments other CSRF protections for users.
• Securing your site like it's 1999 covers many common web application
vulnerabilities including Cross-Site Request Forgery issues.
339
Full Stack Python: 2020 Supporter's Edition
Deployment
Deployment involves packaging up your web application and putting it in a
production environment that can run the app.
Why is deployment necessary?
Your web application must live somewhere other than your own desktop or
laptop. A production environment is the canonical version of your current
application and its associated data.
Deployment topics map
Python web application deployments are comprised of many pieces that need
to be individually configured. Here is a map that visually depicts how each
deployment topic relates to each other. Click the image to pull up a PDF
version.
341
Full Stack Python: 2020 Supporter's Edition
342
Full Stack Python: 2020 Supporter's Edition
Deployment hosting options
There are four options for deploying and hosting a web application:
1. "Bare metal" servers
2. Virtualized servers
3. Infrastructure-as-a-service
4. Platform-as-a-service
The first three options are similar. The deployer needs to provision one or
more servers with a Linux distribution. System packages, a web server, WSGI
server, database and the Python environment are then installed. Finally the
application can be pulled from source and installed in the environment.
Note that there are other ways of installing a Python web application through
system-specific package management systems. We won't cover those in this
guide as they are considered advanced deployment techniques.
Deployment tools
• teletraan is the deploy system used by the development teams at
Pinterest, a huge Python shop!
• pants is a build system originally created at Twitter and now split out
as its own sustainable open source project.
• Screwdriver is an open source build system originally developed at
Yahoo! that is now open source. Learn more about it in the
introduction post that contains the rationale for its creation.
Deployment resources
• Hello, production lays out the powerful philosophy of putting a project
into production as soon as possible in a project's lifecycle to establish
the pipeline, identify issues and bottlenecks, and build the foundation
for continuous delivery. The post also covers common objections and
provides some arguments to help you convince others that this strategy
is the right way to go on all projects.
343
Full Stack Python: 2020 Supporter's Edition
• Automated Continuous Deployment at Heroku explains Heroku's
deployment system, the checks they use to ensure code quality and
what they have learned from building the pipeline and process.
• Deploying Python web applications is an episode of the great Talk
Python to Me podcast series where I discuss deploying web
applications based on a fairly traditional virtual private server, Nginx
and Green Unicorn stack.
• Thoughts on web application deployment walks through stages of
deployment with source control, planning, continuous deployment and
monitoring the results.
• Deploying Software is a long must-read for understanding how to
deploy software properly.
• The evolution of code deploys at Reddit teaches the history, including
the mistakes, that Reddit's development teams learned as they scaled
up the development team and the traffic on one of the most-visited
websites in the world.
• Deployment strategies defined explains various ways that development
teams deploy applications, ranging from reckless to versioned.
• How we release so frequently provides a high-level overview of tactics
for how teams at large scale can deploy changes several times per day
or more with confidence the systems will not completely fail. There will
be bugs, but that does not mean the entire operation will stop.
• Hands-off deployment with Canary explains how SoundCloud
automates their deployment process and uses canary builds to identify
and roll back issues to mitigate reliability issues that can occur with
shipping software at scale.
• Practical continuous deployment defines delivery versus deployment
and walks through a continuous deployment workflow.
• 5 ways to deploy your Python application in 2017 is a talk from PyCon
US 2017 where Andrew Baker deploys the getting started Flask app
using Ngrok, Heroku, Zappa on the serverless AWS Lambda platform,
a virtual machine on Google Cloud and Docker.
344
Full Stack Python: 2020 Supporter's Edition
• Continuous deployment at Instagram is the story of how their
deployment process evolved over time from a large Fabric script to
continuous deployments. Along the way they encountered issues with
code reviews, test failures, canary builds and rollbacks. It's a great read
that sheds some light on how Python deployments can be done well at
large scale.
• Stack Overflow's guide on how they do deployment is an awesome indepth read covering topics ranging from git branching to database
migrations.
• In this free video by Neal Ford, he talks about engineering practices for
continuous delivery. He explains the difference between continuous
integration, continuous deployment and continuous delivery. Highly
recommended for an overview of deployment concepts and as an
introduction to the other videos on those subjects in that series.
• TestDriven.io shows how to deploy a microservices architecture that
uses Docker, Flask, and React with container orchestration on Amazon
ECS.
Deployment learning checklist
1. If you're tight on time look at the platform-as-a-service (PaaS) options.
You can deploy a low traffic project web app for free or low cost. You
won't have to worry about setting up the operating system and web
server compared to going the traditional server route. In theory you
should be able to get your application live on the web sooner with PaaS
hosting.
2. Traditional server options are your best bet for learning how the entire
Python web stack works. You'll often save money with a virtual private
server instead of a platform-as-a-service as you scale up.
3. Read about servers, operating systems, web servers and WSGI servers
to get a broad picture of what components need to be set up to run a
Python web application.
345
Full Stack Python: 2020 Supporter's Edition
Hosting
A Python application must be deployed and hosted on one or more servers, or
a platform-as-a-service, to run.
Hosting resources
• An engineer’s guide to cloud capacity planning is a wonderful article
that discusses on-demand provisioning, horizontal and vertical scaling
and how to estimate performance of your infrastructure.
• Selecting a cloud provider reviews Etsy's decisionmaking around selfhosted infrastructure versus cloud hosting. They put together an
architectural model and ultimately decided to start migrating over to
Google Cloud Platform.
• Going Multi-Cloud with AWS and GCP: Lessons Learned at Scale
covers the compute, networking, persistent storage, billing and failover
aspects of using more than one infrastructure provider.
• Auth0 Architecture: Running In Multiple Cloud Providers And Regions
explains their multi-cloud architecture and how it has evolved over the
past several years.
• Choose A Cloud has a few posts with charts for easy cross-cloud
comparisons pricing on features such as persistent storage.
• How Netlify migrated to a fully multi-cloud infrastructure is another
story post about developing a multi-cloud architecture and the
considerations for disaster recovery, databases and testing.
Servers
Servers are the physical infrastructure to run all the layers of software so your
web application can respond to requests from clients such as web browsers.
Why are servers necessary?
Your web application must live somewhere other than your own desktop or
laptop. Servers should ideally be accessible 24 hours a day, 7 days a week,
with no unplanned downtime. The servers that host your web application for
actual users (as opposed to test users) are known as production servers.
346
Full Stack Python: 2020 Supporter's Edition
Production servers hold real data (again as opposed to test data) and must be
secure against unauthorized access.
Bare metal servers
The term bare metal refers to purchasing the actual hardware and hooking it
up to the Internet either through a business-class internet service provider
(ISP) or co-locating the server with other servers. A "business-class" ISP is
necessary because most residential Internet service agreements explicitly
prohibit running web servers on their networks. You may be able to get away
with low traffic volume but if your site serves a lot of traffic it will alert an
ISP's filters.
The bare metal option offers the most control over the server configuration,
usually has the highest performance for the price, but also is the most
expensive upfront option and the highest ongoing maintenance. With bare
metal servers the ongoing operating cost is the electricity the server(s) use as
well as handling repairs when server components malfunction. You're taking
on manual labor working with hardware as well as the rest of the software
stack.
Buy actual hardware from a vendor either pre-built or as a collection of
components that you assemble yourself. You can also buy pre-configured
servers from Dell or HP. Those servers tend to be in smaller case form factors
(called "blades") but are correspondingly more expensive than putting off-theshelf components together yourself in a standard computer case.
Virtualized servers
Virtual private servers (VPSs) are slices of hardware on top of a larger bare
metal server. Virtualization software such as Xen and VMWare allow
providers such as Linode and prgmr (as well as a many others) to provide
fractions of a full server that appear as their own instances. For example, a
server with an 8-core Xeon processor and 16 gigabytes of memory can be
sliced into 8 pieces with the equivalent of 1-core and 2 gigabytes of memory.
The primary disadvantage of virtualized servers is that there is resource
overhead in the virtualization process. In addition, physical constraints such
as heavy I/O operations by a single virtualized instance on persistent storage
347
Full Stack Python: 2020 Supporter's Edition
can cause performance bottlenecks for other virtualized instances on the
shared server. Choosing virtualized server hosting should be based on your
needs for urgency of service ticket requests and the frequency you require for
ongoing maintenance such as persistent storage backups.
Virtualized servers resources
• Choosing a low cost VPS reviews the factors that you should weigh
when deciding on hosting providers.
• How to set up your Linode for maximum awesomeness shows how to
work with a VPS once you've got the server up and running.
• CPU Load Averages explains how to measure CPU load and what to do
about it.
• Which cloud hosting company to choose in 2017? compares
DigitalOcean, Linode, Vultr, OVH and Scaleway in various benchmarks
such as CPUs, memory, disk space, network performance, traffic
capacity and cost. At the end of the article the author also provides
some qualitative feedback on the strengths and weaknesses of each
services' offerings.
Infrastructure-as-a-service
Infrastructure-as-a-service (IaaS) overlaps with virtualized servers because
the resources are often presented in the same way. The difference between
virtualized servers and IaaS is the granularity of the billing cycle. IaaS
generally encourages a finer granularity based on minutes or hours of server
usage instead of on monthly billing cycles.
IaaS can be used in combination with virtualized servers to provide dynamic
upscaling for heavy traffic. When traffic is low then virtualized servers can
solely be used. This combination of resources reduces cost at the expense of
greater complexity in the dynamically scaled infrastructure.
The most common IaaS platforms are Amazon Web Services and Rackspace
Cloud.
The disadvantage to IaaS platforms is the lock-in if you have to write custom
code to deploy, dynamically scale, and generally understand your
348
Full Stack Python: 2020 Supporter's Edition
infrastructure. Every platform has its quirks. For example, Amazon's standard
Elastic Block Store storage infrastructure has at least an order of magnitude
worse I/O throughput than working with your local disk. Your application's
database queries may work great locally but then when you deploy the
performance is inadequate. Amazon has higher throughput EBS instances but
you will pay correspondingly more for them. EBS throughput is just one of
many quirks you need to understand before committing to an IaaS platform.
Infrastructure-as-a-service resources
• The cloud versus dedicated servers
• 5 common server setups for your web application is a great
introduction to how hosting can be arranged.
• Apache Libcloud is a Python library that provides a unified API for
many cloud service providers.
• Amazon Web Services has official documentation for running Python
web applications.
• boto is an extensive and well-tested Python library for working with
Amazon Web Services.
• Poseidon is a Python commandline interface for managing Digital
Ocean droplets (servers).
Servers learning checklist
1. Sign up for a hosting provider. I recommend getting a Linode VPS to
set up your initial infrastructure and deploy your web application there.
Digital Ocean and prgrmr are other VPS options. You can change
hosting providers later after the deployment process is automated.
2. Provision your first server. It will be ready but in a shutdown state
while awaiting your instructions.
3. Move to the operating systems section to learn how to load Ubuntu
14.04 LTS as a base OS for Python web applications.
349
Full Stack Python: 2020 Supporter's Edition
Static Content
Some content on a website does not change and therefore should be served up
either directly through the web server or a content delivery network (CDN).
Examples include JavaScript, image, and CSS files.
Types of static content
Static content can be either assets created as part of your development
process such as images on your landing page or user-generated content. The
Django framework calls these two categories assets and media.
Content delivery networks
A content delivery network (CDN) is a third party that stores and serves static
files. Amazon CloudFront, Akamai, and Rackspace Cloud Files are examples
of CDNs. The purpose of a CDN is to remove the load of static file requests
from web servers that are handling dynamic web content. For example, if you
have an nginx server that handles both static files and acts as a front for a
Green Unicorn WSGI server on a 512 megabyte virtual private server, the
nginx server will run into resource constraints under heavy traffic. A CDN can
remove the need to serve static assets from that nginx server so it can purely
act as a pass through for requests to the Green Unicorn WSGI server.
CDNs send content responses from data centers with the closest proximity to
the requester.
Static Content Resources
• Crushing, caching and CDN deployment in Django shows how to use
django-compressor and a CDN to scale static and media file serving.
• Using Amazon S3 to host your Django static files
• CDNs fail, but your scripts don't have to
• django-storages is a Django library for managing static and media files
on services such as Amazon S3 and other content delivery networks.
• RevSys has a nice article on a range of important static file
optimizations such as setting cache headers, optimizing JavaScript and
reducing the size of images.
350
Full Stack Python: 2020 Supporter's Edition
• Twelve folks with significant experience working on and with CDNs
provide their perspectives in this piece: CDN experts on CDNs.
• Serving Static Files from Flask with WhiteNoise and Amazon
CloudFront looks at how to manage static files with Flask, WhiteNoise,
and Amazon CloudFront.
Static content learning checklist
1. Identify a content delivery network to offload serving static content
files from your local web server. I recommend using Amazon S3 with
CloudFront as it's easy to set up and will scale to high bandwidth
demands.
2. Update your web application deployment process so updated static files
are uploaded to the CDN.
3. Move static content serving from the www subdomain to a static (or
similarly named) subdomain so browsers will load static content in
parallel to www HTTP requests.
Content Delivery Networks (CDNs)
Content delivery networks (CDNs) serve static assets via globally distributed
servers to improve web app loading speed.
CDN resources
• Mastering HTTP Caching is a fantastic post that goes into great
technical detail on how CDNs and caching work.
• MaxCDN vs CloudFlare vs Amazon CloudFront vs Akamai Edge vs
Fastly compares and contrasts the most popular CDN services based on
features, performance and pricing. Note that Full Stack Python uses
Cloudflare to serve all content.
• Crushing, caching and CDN deployment in Django explains how to use
the django-compressor package with the django-storages library to
deploy static assets for a Django application to a CDN.
351
Full Stack Python: 2020 Supporter's Edition
• Building your own CDN for Fun and Profit is a great high-level
overview of how CDNs work and shows you how to create your own,
albeit simplified CDN.
• Do not let your CDN betray you: Use Subresource Integrity describes
the security implications for CDNs with unexpectedly modified content
and how Subresource Integrity in modern web browsers can mitigate
this vulnerability if used properly.
Virtual Private Servers (VPS)
Virtual private servers (VPSs) are sandboxed slices of hardware run with a
hypervisor running on top of a physical server. Virtualization software such as
Xen and VMWare allow a providers' customers to use fractions of a full server
that appear as their own independent instances. For example, a server with an
8-core processor and 16 gigabytes of memory can be roughly virtualized into 8
pieces with the equivalent of 1-core and 2 gigabytes of memory.
The primary disadvantage of virtualized servers is that there is resource
overhead in the virtualization process. But for our web application
deployment, a single well-configured virtual private server provides more
than enough performance and represents a huge cost savings over purchasing
dedicated hardware.
VPS providers
There are many VPS providers and their cost ranges dramatically based on
reliability, support, security and uptime. Make sure to choose a provider that
has a solid reputation unless you are willing to rebuild your server on another
provider whenever issues hit your service.
A few providers I currently use to host my Python web applications:
• Linode
• Digital Ocean
• Amazon Web Services' Lightsail
352
Full Stack Python: 2020 Supporter's Edition
VPS comparisons
• Ready. Steady. Go! The speed of VM creation and SSH access on AWS,
DigitalOcean, Linode, Vexxhost, Google Cloud, Rackspace and
Microsoft Azure and Comparing the speed of VM creation and SSH
access of cloud providers are one way to measure some of the
infrastructure speed provided by several cloud vendors. The virtual
machine and SSH access data points are taken in multiple regions. It's
unclear how these metrics would change over time based on backend
tweaks made by each provider.
• The State of Cloud Instance Provisioning explains the tools and
operations behind how AWS, DigitalOcean, Google Cloud and
Microsoft Azure stand up virtual machine instances for you to use.
• VPS $5 Showdown - October 2018 - DigitalOcean vs. Lightsail vs.
Linode vs. Vultr compares and contrasts the cheapest options for four
popular virtual private server providers.
• VPS comparisons uses Ansible to get some data around provisioning
speed and system performance. The whole README file in that
repository has a ton of useful information and summaries of the tested
providers.
Linode
Linode is a virtual private server service provider that is frequently used for
deploying production Python web applications.
Linode resources
• How I survived going viral on a $5 Linode shows the traffic behind a
high-trafficked site that runs on an inexpensive Linode virtual private
server. The author explains that because the site was a single-page
application with minimal JSON server requests it was able to easily
withstand the load from hundreds of concurrent connections.
DigitalOcean
DigitalOcean is a virtual private server provider and deployment platform that
can be used for running Python applications.
353
Full Stack Python: 2020 Supporter's Edition
DigitalOcean resources
• Creating a Kubernetes Cluster on DigitalOcean with Python and Fabric
shows how to configure a three node Kubernetes cluster using Ubuntu
16.04 LTS.
• Setting up a Digital Ocean server for Selenium, Chrome, and Python
gives the code and instructions for setting up a testing server that uses
Selenium for executing user interface tests.
• The digitalocean tag on GitHub has a slew of open source projects such
as digitalocean-developer-firewall to make it easier to configure
firewalls and other services on your droplets.
Useful tools for working with DigitalOcean
• python-digitalocean is a helper library for interacting with
DigitalOcean's APIs so you can, for example, spin up and shut down
your servers.
• vagrant-digitalocean is a Vagrant provider plugin for managing
DigitalOcean infrastructure.
Lightsail
Lightsail is Amazon Web Services' virtual private server service that can be
used for running production Python applications.
Platform-as-a-service
A platform-as-a-service (PaaS) provides infrastructure and a software layer on
which a web application is deployed. Running your web application from a
PaaS removes the need to know as much about the underlying servers,
operating system, web server, and often the WSGI server.
354
Full Stack Python: 2020 Supporter's Edition
Note: If you are not interested in deploying to a PaaS you can move ahead to
the WSGI servers section.
The PaaS layer defines how the application accesses resources such as
computing time, files, and external services. The PaaS provides a higher-level
abstraction for working with computing resources than deploying an
application to a server or IaaS.
A PaaS makes deployment and operations easier because it forces the
developer to conform applications to the PaaS architecture. For example,
Heroku looks for Python's requirements.txt file in the base directory of the
repository during deployment because that is the file's de facto community
standard location.
If you go the PaaS route, you can skip configuring an operating system and
web server prebaked into PaaS offerings. PaaS offerings generally start at the
WSGI server layer.
355
Full Stack Python: 2020 Supporter's Edition
Platform-as-a-service responsibilities
Although PaaS offerings simplify setting up and maintaining the servers,
operating system, and web server, developers still have responsibilities for
other layers of their web stack.
While it's useful to know the operating system that underpins your PaaS, for
example Heroku uses Ubuntu 10.04, you will not have to know as much about
securing the operating system and server level. However, web applications
deployed to a PaaS are just as vulnerable to security breaches at the
application level as a standard LAMP stack. It's still your responsibility to
ensure the web application framework and your app itself is up to date and
secured. See the security section for further information.
Platforms-as-a-service that support Python
• Heroku
• Google App Engine
• PythonAnywhere
• OpenShift
• AWS Elastic Beanstalk and AWS CodeStar are Amazon Web Services'
PaaS offerings. CodeStar is the newer service and recommended for
new projects.
Platform-as-a-service open source projects
The following open source projects allow you to host your own version of a
platform-as-a-service. Running one of these gives you the advantage of
controlling and modifying the project for your own applications, but prevents
you from offloading the responsibility of keeping servers running to someone
else.
• Kel uses Kubernetes as a foundation for a custom self-hosted PaaS.
Note that it was created by Eldarion, which had one of the first Pythonspecific PaaS offerings on the market around the time that Heroku was
launched.
• Dokku builds on Docker and has hooks for plugins to extend the small
core of the project and customize deployments for your applications.
356
Full Stack Python: 2020 Supporter's Edition
• Convox Rack is open source PaaS designed to run on top of AWS
services.
Platform-as-a-service resources
• The differences between IaaS, PaaS and SaaS explains the abstract
layer differences among "X-as-a-service" offering types and when to
consider using each one.
• PaaS bakeoff: Comparing Stackato, OpenShift, Dotcloud and Heroku
for Django hosting and deployment by Nate Aune.
• Deploying Django by Randall Degges is another great free resource
about Heroku.
• AWS in Plain English shows what current Amazon Web Services
individual services are currently called and what they could've been
called to be more clear to users.
• PAAS comparison - Dokku vs Flynn vs Deis vs Kubernetes vs Docker
Swarm in 2017 covers high-level advantages and disadvantages of
several self-hosted PaaS projects.
• 5 AWS mistakes you should avoid explains how common beginner
practices such as manually managing infrastructure, not using scaling
groups and underutilizing instances can create problems you'd be
better off avoiding altogether.
• Heroku's Python deployment documentation provides clear examples
for how to work with virtualenv, pip and requirements.txt to get a
applications deployed to their platform.
• Miguel Grinberg's Flask tutorial contains an entire post on deploying
Flask applications to Heroku.
• This series on DevOps Django by Randall Degges is great reading for
using the Heroku service:
◦
◦
◦
◦
Part One: Goals
Part Two: The Pain of Deployment
Part Three: The Heroku Way
Part Four: Choosing Heroku
357
Full Stack Python: 2020 Supporter's Edition
• Deploying a Django App to AWS Elastic Beanstalk is a fantastic post
that shows how to deploy to Amazon Web Service's own PaaS.
• Deploy your hack in 3 steps: Intro to AWS and Elastic Beanstalk shows
how to deploy a simple Ruby Sinatra app, but the steps are generally
applicable to Python web apps as well.
• Are you wondering what it will cost to deploy a reasonable sized
production app on a platform-as-a-service like Heroku? Check out
Cushion's transparent costs list where they include their expenses from
using a PaaS as well as other services.
• The beginner's guide to scaling to 11 million users on AWS is a useful
list of services you'll need to look at as you grow an application from 10
to 100 to 1000 to 500,000 and beyond to millions of users.
• How to Separate Your AWS Production and Development Accounts is a
basic post on keeping developer sandbox accounts separate from
production AWS environments.
• Deploying a Django app on Amazon EC2 instance is a detailed
walkthrough for deploying an example Django app to Amazon Web
Services.
• How much is Spotify Paying Google Cloud? provides some insight into
how Spotify runs all of their infrastructure on Google Cloud and posits
what they may be paying to run their service.
• PaaS (false) economics gives some quick back-of-the-envelope
calculations on why running your applications on a PaaS is obviously
going to appear more expensive if you do not take the cost of your own
software engineers into the equation.
• Two blog posts on using AWS Autoscaling in Automatic replacement of
Autoscaling nodes with equivalent spot instances and Autoscaling
nodes: seeing it in action provide a potential approach for making AWS
cheaper via autoscaling. While these posts may look a bit more
dfifficult than the Heroku dyno slider bar, if you're already using AWS
this should prove fairly easy to configure.
358
Full Stack Python: 2020 Supporter's Edition
Platform-as-a-service learning checklist
1. Review the potential Python platform-as-a-service options listed above.
2. Sign up for a PaaS account at the provider that appears to best fit your
application needs. Heroku is the PaaS option recommended for starters
due to their detailed documentation and walkthroughs available on the
web. However, the other options are also viable since their purpose is
to make deploying applications as easy as possible.
3. Check if there are any PaaS-specific configuration files needed for your
app to run properly on the PaaS after it is deployed.
4. Deploy your app to the PaaS.
5. Sync your application's configuration with the database.
6. Set up a content delivery network for your application's static content
unless your PaaS provider already handles this deployment step for
you.
7. Check if the application's functionality is working and tweak as
necessary.
Heroku
Heroku is an implementation of the platform-as-a-service (PaaS) concept that
can be used to more easily deploy Python applications.
359
Full Stack Python: 2020 Supporter's Edition
Heroku resources
• Migrating your Django Project to Heroku is a full tutorial on using
Heroku to run Django web applications. It includes instructions on
converting from MySQL to PostgreSQL if necessary as well as how to
properly handle your settings files.
• Heroku's official Python documentation is fantastic and walks through
deploying WSGI applications in short order.
• Production Django Deployments on Heroku aims to simplify the
process of deploying, maintaining, and scaling production-grade
Django apps on Heroku.
• Deploying Django to Heroku With Docker looks at how to deploy a
Django app to Heroku with Docker via the Heroku Container Runtime.
• Heroku Chatbot with Celery, WebSockets, and Redis is a walkthrough
with available source code to build a Django and Redis-based chat bot
that can be easily deployed to Heroku.
PythonAnywhere
PythonAnywhere is an implementation of the platform-as-a-service (PaaS)
concept and can be used to deploy and operate Python applications.
PythonAnywhere resources
• Turning a Python script into a website shows how to take a simple
script and deploy it to PythonAnywhere so you can have it running
somewhere other than your local machine.
AWS CodeStar
AWS CodeStar is a platform-as-a-service implementation and continuous
delivery deployment pipeline for running Python applications.
360
Full Stack Python: 2020 Supporter's Edition
Operating Systems
An operating system runs on the server or virtual server and controls access to
computing resources. The operating system also includes a way to install
programs necessary for running your Python web application.
Why are operating systems necessary?
An operating system makes many of the computing tasks we take for granted
easy. For example, the operating system enables writing to files,
communicating over a network and running multiple programs at once.
Otherwise you'd need to control the CPU, memory, network, graphics card,
and many other components with your own low-level implementation.
Without using an existing operating system like Linux, Mac OS X or
Windows, you'd be forced to write a new operating system as part of your web
application. It would be impossible to write features for your Python web
application because you'd be too busy hunting down a memory leak in your
assembly code, if you even were able to get that far.
Fortunately, the open source community provides Linux to the Python world
as a rock solid free operating system for running our applications.
Recommended operating systems
The only recommended operating systems for production Python web stack
deployments are Linux and FreeBSD. There are several Linux distributions
commonly used for running production servers. Ubuntu Long Term Support
(LTS) releases, Red Hat Enterprise Linux, and CentOS are all viable options.
361
Full Stack Python: 2020 Supporter's Edition
Mac OS X is fine for development activities. Windows and Mac OS X are not
appropriate for production deployments unless there is a major reason why
you must use them in lieu of Linux.
Canonical's Ubuntu Linux
Ubuntu is a Linux distribution packaged by the Canonical Ltd company.
Ubuntu uses the Debian distribution as a base for packages, including the
aptitude package manager. For desktop versions of Ubuntu, GNOME (until
the 11.04 release, then again in 18.04) or Unity (11.10 until 17.10) is bundled
with the distribution to provide a user interface.
Ubuntu Long Term Support (LTS) releases are the recommended versions to
use for deployments. LTS versions receive five years of post-release updates
from Canonical. Every two years, Canonical creates a new LTS release, which
allows for an easy upgrade path as well as flexibility in skipping every other
LTS release if necessary. As of May 2018, 18.04 Bionic Beaver is the latest
Ubuntu LTS release. Xenial Xerus includes Python 3.6 as its default Python
version, which is a major update compared with 2.7 in Ubuntu 14.04 LTS and
a solid improvement over Python 3.5 included in Ubuntu 16.04 LTS.
Red Hat and CentOS
Red Hat Enterprise Linux (RHEL) and Community ENTerprise Operating
System (CentOS) are the same distribution. The primary difference between
the two is that CentOS is an open source, liberally licensed free derivative of
RHEL.
RHEL and CentOS use a different package manager and command-line
interface from Debian-based Linux distributions: RPM Package Manager
(RPM) and the Yellowdog Updater, Modified (YUM). RPM has a specific .rpm
file format to handle the packaging and installation of libraries and
applications. YUM provides a command-line interface for interacting with the
RPM system.
Learning how operating systems work
• Linux Performance is an incredible site that links to a number of
performance-focused materials that are useful when developing on or
deploying to any Linux distribution.
362
Full Stack Python: 2020 Supporter's Edition
• Linux Journey is a really well designed curriculum for learning Linux
basics such as the command line, package management, text handling.
There are also courses for more advanced topics such as how the kernel
works, setting up logging and device management.
• The Ops School curriculum is a comprehensive resource for learning
about Linux fundamentals and how to perform the work that system
administrators typically handle.
• Since Linux is your go-to production operating system, it's important to
get comfortable with the Unix/Linux commands and philosophy. Study
up on this introduction to Unix tutorial to become more familiar with
the operating system.
• First 5 Minutes on a Server shows the first several security steps that
should be done manually or automatically on any server you stand up.
• How to Use the Command Line for Apple macOS and Linux is useful
for learning the shell and is even helpful for Windows now that the
Windows Subsystem for Linux (WSL) allows you to work with
Widnows as if it is a *nix operating system.
• Linux System Mining with Python shows how to gather system
information using the platform module and some of your own
Python code.
• Digital Ocean has a detailed walkthrough for setting up Python web
applications on Ubuntu.
• linux-internals is a series of posts about how Linux works under the
covers, starting from the low level booting process.
• While not quite necessary to run your Python application, if you want
to dig into how operating systems are built, check out this free book
How to Make a Computer Operating System, which was originally
written by a high school student and later updated as he became a
professional software developer.
• ops-class.org provides an online lecture videos, slides and sample
exams for learning how operating systems are built.
363
Full Stack Python: 2020 Supporter's Edition
• Operating Systems: Three Easy Pieces is a free book by University of
Wisconsin Computer Science professors that teaches how operating
systems are built. Although you do not know exactly how to build your
own OS to use one, understanding the foundation for how software
works is incredibly helpful in unexpected ways while developing and
operating your applications.
• Operating systems: From 0 to 1 is a self-learner resource for writing
your own operating system from scratch.
Choosing an OS resources
macOS and Linux are generally preferred by Python developers over Windows
because many Python packages like gevent simply do not work on Windows.
Others such as Ansible cannot be used as intended on Windows without
major hacks.
The following operating system resources cover perspectives on why
developers chose one operating system over others.
• Finding an alternative to Mac OS X: Part 1, part 2 and part 3: being
productive on Linux explain what alternative applications are available
for common functionality such as the Gnome windowing system, email
and terminal. There are a ton of tips and tricks in there for getting
comfortable with Linux as well as a lot of thought put into what and
why the developer wants his environment set up in a particular way.
• Why I switched from OS X to GNU/Linux explains the rationale for
switching from the Apple-based operating system to Linux along with
what applications the author now uses.
• Ultimate Linux on the Desktop explains one experienced developer's
Linux desktop development environment for getting coding work done.
• Lifehacker's guide to choosing a Linux distro.
• Distro chooser walks you through a set of sixteen questions to
determine which Linux distribution could fit your personal needs.
364
Full Stack Python: 2020 Supporter's Edition
Operating system learning checklist
1. Choose either a Debian-based Linux distribution such as Ubuntu or a
Fedora-based distribution like CentOS.
2. Harden the security through a few basic steps. Install basic security
packages such as fail2ban and unattended-upgrades. Create a new user
account with sudo privileges and disable root logins. Disable passwordonly logins and use a public-private keypair instead. Read more about
hardening systems in the resources listed below.
3. Install Python-specific packages to prepare the environment for
running a Python application. Which packages you'll need to install
depends on the distribution you've selected.
4. Read up on web servers as installing one will be the next step in the
deployment process.
Ubuntu
Ubuntu is a Debian Linux-based operating system distribution often used for
Python development and web application deployment.
Why is Ubuntu important for Python?
Ubuntu is one of the most commonly used Linux distributions for both local
development and server deployments. Some platforms-as-a-service such as
Heroku run Ubuntu as the base operating system, so as a Python developer
you'll often have to work with Ubuntu or a similar Debian-based Linux
operating system.
365
Full Stack Python: 2020 Supporter's Edition
What does "LTS" mean for Ubuntu?
Every two years Ubuntu releases a Long-Term Support (LTS) version that
receives five years of updates instead of only two years for non-LTS releases.
However, there are some issues with the current LTS model, in that you must
only use packages from the main repository unless you're going to manually
handle security updates for non-main repository system packages.
Additional Ubuntu resources
• Get your Python development environment set up with one of these
quick tutorials for Ubuntu 16.04 LTS:
◦ How to set up Python 3, Flask and Green Unicorn on Ubuntu
16.04 LTS
◦ Configuring Python 3, Bottle and Gunicorn for Development on
Ubuntu 16.04 LTS
◦ Setting up Python 3, Django and Gunicorn on Ubuntu 16.04 LTS
• There are also walkthroughs for configuring relational databases and
Redis on Ubuntu:
◦ Setting up PostgreSQL with Python 3 and psycopg on Ubuntu
16.04
◦ How to Install and Use MySQL on Ubuntu 16.04
◦ How to Use Redis with Python 3 and redis-py on Ubuntu 16.04
• Configuring Ubuntu for deep learning with Python is a great tutorial on
which packages you should install and why to use Python 3, OpenCV
and Keras on Ubuntu Linux.
• How to Use the Command Line for Apple macOS and Linux is a
fantastic guide relevant to Ubuntu users who should be able to use the
terminal to accomplish their tasks.
• Linux System Mining with Python shows how to use Python libraries to
gather Linux system information and work with it programmatically in
your applications.
• Canonical, the organization that produces Ubuntu, typically pushes the
boundaries on non-LTS releases, but occasionally rocks the boat with
major changes for an LTS release. 16.04 LTS was one such version,
366
Full Stack Python: 2020 Supporter's Edition
which is described in this article about how Ubuntu 16.04 proves even
an LTS release can live at Linux's bleeding edge.
• What I learned while securing Ubuntu explains how difficult it can be
just to find correct information on how to secure an operating system.
In this case, the author goes over how he went about securing package
management, security standards and file integrity on Ubuntu 14.04
LTS.
• In Beaver We Trust: A Lengthy, Pedantic Review of Ubuntu 18.04 LTS
examines the latest Ubuntu Long Term Support desktop release in
detail.
macOS
macOS is an operating system within the Unix family tree that is developed by
Apple and is often used for developing Python applications.
macOS Python resources
• Using Python on a Macintosh explains how Python 2.7 is installed by
default on Macs. You should start setting up your environment by
installing Python 3 and then follow the instructions in the post to get a
development environment configured.
• Python Development Environment on macOS High Sierra takes the
defualt macOS environment and gives you the step to configure
everything for Python development.
• How to Set Up Your Python Environment is a data science-flavored
configuration guide.
• How to Use the Command Line for Apple macOS and Linux explains
the Terminal. If this guide is useful to you then you should also check
out the shells and Bash pages.
Windows
Windows is a closed-source, proprietary operating system created by
Microsoft that is often used to develop Python applications.
367
Full Stack Python: 2020 Supporter's Edition
Useful resources for Python on Windows
• Windows for Linux nerds gives context for how Windows works
compared to Linux and digs into the Windows Subsystem for Linux
(WSL) which allows the operating system to handle Linux calls.
• How to set up the perfect modern dev environment on Windows
provides a good guide that includes configuring Vagrant, setting up the
Windows toolchain and updating your permissions to handle
development.
• Setting Up Python for Machine Learning on Windows presents detailed
steps for installing Anaconda and using it to install dependencies on
your local machine.
• PyInstaller packages a Python application with its associated
dependencies into a Windows executable file so it can be more easily
distributed and run on other computers.
• Cmder is a beautiful console emulator designed for Windows that can
be useful for when you need to get work done on the commandline.
• Setting up Python on Windows 10 is a short guide to making sure you
have Python 3 installed correctly on your Windows computer.
• Using both Python 2 and 3 in Windows explains how to configure
Windows so you can use both Python 2 and 3 if you still need to dig
into older projects that have not yet been updated.
• Epic Development Environment using Windows Subsystem for Linux
shows how to get your Windows development environment configured.
It is geared towards JavaScript/Node developers but most of the steps
will also be useful to Python developers.
• Windows dev box setup scripts are Powershell programs for setting up
a Windows machine for development.
368
Full Stack Python: 2020 Supporter's Edition
FreeBSD
FreeBSD is an operating system within the Unix family tree that can be used
used for developing Python applications.
FreeBSD Python resources
• How to Get Started with FreeBSD covers the first steps of logging into a
FreeBSD server with SSH, updating the root password and setting
your default shell. There is also a follow-up post for setting up your
SSH keys on FreeBSD.
• The FreeBSD Developer's Handbook shows the extensive
documentation available to you as you work with Python and other
programming languages on this operating system.
• Recommended Steps For New FreeBSD 12.0 Servers explains setting
time zones, setting up the IPFW Firewall and configuring NTP for
accurate time.
Web Servers
Web servers respond to Hypertext Transfer Protocol (HTTP) requests from
clients and send back a response containing a status code and often content
such as HTML, XML or JSON as well.
Why are web servers necessary?
Web servers are the ying to the web client's yang. The server and client speak
the standardized language of the World Wide Web. This standard language is
369
Full Stack Python: 2020 Supporter's Edition
why an old Mozilla Netscape browser can still talk to a modern Apache or
Nginx web server, even if it cannot properly render the page design like a
modern web browser can.
The basic language of the Web with the request and response cycle from client
to server then server back to client remains the same as it was when the Web
was invented by Tim Berners-Lee at CERN in 1989. Modern browsers and
web servers have simply extended the language of the Web to incorporate new
standards.
Web server implementations
The conceptual web server idea can be implemented in various ways. The
following web server implementations each have varying features, extensions
and configurations.
• The Apache HTTP Server has been the most commonly deployed web
server on the Internet for 20+ years.
• Nginx is the second most commonly used server for the top 100,000
websites and often serves as a reverse proxy for Python WSGI servers.
• Caddy is a newcomer to the web server scene and is focused on serving
the HTTP/2 protocol with HTTPS.
• rwasa is a newer web server written in Assembly with no external
dependencies that tuned to be faster than Nginx. The benchmarks are
worth taking a look at to see if this server could fit your needs if you
need the fastest performance trading off for as of yet untested web
server.
Client requests
A client that sends a request to a web server is usually a browser such as
Internet Explorer, Firefox, or Chrome, but it can also be a
•
•
•
•
headless browser, commonly use for testing, such as phantomjs
commandline utility, for example wget and cURL
text-based web browser such as Lynx
web crawler.
370
Full Stack Python: 2020 Supporter's Edition
Web servers process requests from the above clients. The result of the web
server's processing is a response code and commonly a content response.
Some status codes, such as 204 (No content) and 403 (Forbidden), do not
have content responses.
In a simple case, the client will request a static asset such as a picture or
JavaScript file. The file sits on the file system in a location the web server is
authorized to access and the web server sends the file to the client with a 200
status code. If the client already requested the file and the file has not
changed, the web server will pass back a 304 "Not modified" response
indicating the client already has the latest version of that file.
A web server sends files to a web browser based on the web browser's request.
In the first request, the browser accessed the "www.fullstackpython.com"
address and the server responded with the index.html HTML-formatted file.
That HTML file contained references to other files, such as style.css and
script.js that the browser then requested from the server.
Sending static assets (such as CSS and JavaScript files) can eat up a large
amount of bandwidth which is why using a Content Delivery Network (CDN)
to serve static assets is important when possible.
Building web servers
• A Simple Web Server in less than 500 lines of code from the
Architecture of Open Source book provides a great example with
Python as the implementation language..
371
Full Stack Python: 2020 Supporter's Edition
• If you're looking to learn about web servers by building one, here's part
one, part two and part three of a great tutorial that shows how to code a
web server in Python.
• Building a basic HTTP Server from scratch in Python (source code
builds a very simple but insecure web server to show you how HTTP
works.
Web server references
• HTTP/1.1 and HTTP/2 specifications are the source for how web
servers implement the modern web.
• A reference with the full list of HTTP status codes is provided by W3C.
• Usage of web servers broken down by ranking shows how popular
Apache, Nginx and other websites are among the top million through
the top 1,000 sites in the world.
• Apache vs Nginx: practical considerations gives an an overview of each
project, explains how each one handles connections, how it is
configured and the differences between the two web servers in how it
can use custom modules.
• Optimizing web servers for high throughput and low latency is a
wonderful read that shows how detailed knowledge at every layer of the
stack is necessary to optimize web server connections at scale.
• Implementing a tiny web server in a single printf call is an absurd C
language hack that you would never want to use in any real project but
still an amazing little application and wonderful explanation that you
can learn a bit more about web servers by reading.
• Top 5 open source web servers is a short overview of Apache, Nginx,
Lighttpd and two programming language specific servers, Node.js for
JavaScript and Tomcat for Java.
Web servers learning checklist
1. Choose a web server. Nginx is often recommended although Apache is
also a great choice.
372
Full Stack Python: 2020 Supporter's Edition
2. Create an SSL certificate via Let's Encrypt. You will need SSL for
serving HTTPS traffic and preventing myriad security issues that occur
with unencrypted user input.
3. Configure the web server to serve up static files such as CSS, JavaScript
and images.
4. Once you set up the WSGI server you'll need to configure the web
server as a pass through for dynamic content.
Apache HTTP Server
The Apache HTTP Server is a widely deployed web server that can be used in
combination with a WSGI module, such as mod_wsgi or a stand-alone WSGI
server to run Python web applications.
Why is the Apache HTTP Server important?
Apache remains the most commonly deployed web server with a reign of 20+
years. Its wide usage contributes to the large number of tutorials and open
source modules created by developers for anyone to use.
Apache's development began in mid-1994 as a fork of the NCSA HTTP Server
project. By early 1996, Apache overtook the previously dominant but suddenly
stagnant NCSA server as NCSA's progress stalled due to signficantly reduced
development attention.
373
Full Stack Python: 2020 Supporter's Edition
Apache HTTP Server resources
• The official project documentation page contains a section with HowTos and Tutorials to handle authentication, security and dynamic
content.
• Reverse proxies shows how to set up Apache as a reverse proxy using
mod_proxy .
• Deploy Django on Apache with Virtualenv and mod_wsgi provides
instructions for what packages to install to get Apache up and running
with mod_wsgi on Ubuntu.
• Detecting Bots in Apache & Nginx Logs is a great tutorial for filtering
out the significant traffic generated by web crawlers and bots when
using Apache HTTP Server logs for traffic analytics.
• Apache vs Nginx: Practical Considerations is a good comparison post
that covers the differences between Apache and Nginx such as how
they handle connections and serve content.
• Monitoring Apache web server performance gives a really nice
overview of metrics to watch when you are using Apache as your web
server.
• Web Performance 101: HTTP Headers covers the gamut of HTTP
headers and shows how they can impact performance based on your
configuration.
• Apache Web Server on Ubuntu 14.04 LTS explains how to install
Apache on Ubuntu 14.04, which is still a supported release. Note
however, do not install mod_python because it is now insecure and
made obsolete by mod_wsgi and WSGI servers.
Nginx
Nginx, pronounced "engine-X", is the second most common web server
among the top 100,000 websites. Nginx also functions well as a reverse proxy
to handle requests and pass back responses for Python WSGI servers or even
other web servers such as Apache.
374
Full Stack Python: 2020 Supporter's Edition
How is Nginx used in a Python web app deployment?
Nginx is commonly used as a web server to serve static assets such as images,
CSS and JavaScript to web browser clients.
Nginx is also typically configured as a reverse proxy, which passes appropriate
incoming HTTP requests to a WSGI server. The WSGI server produces
dynamic content by running Python code. When the WSGI server passes its
response, which is often in the HTML, JSON or XML format, the reverse
proxy then responds to the client with that result.
The request and response cycle with a reverse proxy server and the WSGI
server can be seen in the following diagram.
Typically the client will not know or need to know that a Python web
application generated the result. The result could have instead been generated
by one or more backend systems written in any programming language, not
just Python.
375
Full Stack Python: 2020 Supporter's Edition
Should I use Nginx or the Apache HTTP Server?
Let's be clear about these two "competing" servers: they are both fantastic
open source projects and either will serve your web application deployment
well. In fact, many of the top global web applications use both servers in their
deployments to function in many steps throughout the HTTP requestresponse cycle.
I personally use Nginx more frequently than Apache because Nginx's
configuration feel easier to write, with less boilerplate than alternatives.
There's also a bit of laziness in the usage: Nginx works well, it never causes me
problems. So I stick with my battle-tested Ansible configuration management
files that set up Nginx with HTTPS and SSL/TLS certificates
Securing Nginx
Nginx's default configuration after a standard installation through a system
package manager or compiling from source is a good base for security.
However, setting up ciphers and redirects can be confusing the first few times
you try it. It's a really good idea to read some of these tutorials to make sure
you are avoiding the most common security errors that plague HTTP(S)
configurations.
• HTTPS with Let's Encrypt and nginx walks throough installing a free
SSL certificate from Let's Encrypt to secure HTTP connects to your
nginx server via HTTPS.
• The Nginx Config tool can generate strong encryption configurations
and ciphers for Nginx.
• Gixy is a static analyzer for your Nginx configuration and can tell you
issues with how you are setup.
• Strong SSL Security on Nginx shows how to mitigate high profile SSL
attacks like Logjam, Heartbleed and FREAK.
Specific Nginx resources
Nginx can be used without Python so there are a massive number of fantastic
resources available for installing, configuring and optimizing this web server
implementation. The following resources are ones that I collected during my
376
Full Stack Python: 2020 Supporter's Edition
own struggle while learning how to use Nginx after I had used Apache HTTP
Server for several years.
• The Nginx chapter in the Architecture of Open Source Applications
book has a great chapter devoted to why Nginx is built to scale a certain
way and lessons learned along the development journey.
• nginx-quick-reference provides fantastic tactical advice for improving
Nginx performance, handling security and many other critical aspects.
• Inside Nginx: How we designed for performance and scale is a blog
post from the developers behind Nginx on why they believe their
architecture model is more performant and scalable than other
approaches used to build web servers.
• Test-driving web server configuration tells a good story for how to
iteratively apply configuration changes, such as routing traffic to Piwik
for web analytics, reverse proxying to backend application servers and
terminately TLS connections appropriately. It is impressive to read a
well-written softare development article like this from a government
agency, although UK's Government Digital Service as well as USA's 18F
and US Digital Service foster a far more credible culture than most
typical agencies.
• Hacker News broke our site – how Nginx and PageSpeed fixed the
problem is primarily about optimizing Nginx's configuration for more
efficient SSL connections. The post also covers configuration
management with Ansible as well as the Pagespeed module that Google
released for both Nginx and the Apache HTTP Server.
• Nginx for Developers: An Introduction provides the first steps to
getting an initial Nginx configuration up and running.
• A faster Web server: ripping out Apache for Nginx explains how Nginx
can be used instead of Apache in some cases for better performance.
• Nginx vs Apache: Our view is a first-party perspective written by the
developers behind Nginx as to the differences between the web servers.
• Rate Limiting with Nginx covers how to mitigate against brute force
password guessing attempts using Nginx rate limits.
377
Full Stack Python: 2020 Supporter's Edition
• Nginx with dynamic upstreams is an important note for setting up your
upstream WSGI server(s) if you're using Nginx as a reverse proxy with
hostnames that change.
• Nginx Caching shows how to set up Nginx for caching HTTP requests,
which is often done by Varnish but can also be handled by Nginx with
the proxy_cache and related directives.
• Dynamic log formats in nginx explains how to use the
HttpSetMiscModule module to transform variables in Nginx and map
input to controlled output in the logs. The author uses this technique
for pixel tracking but there are other purposes this method could be
used for such as advanced debugging.
• Detecting Bots in Apache & Nginx Logs is an awesome tutorial that
shows how to filter web crawlers and bots from your traffic logs when
using them for web traffic analytics.
Caddy
Caddy is a relatively new HTTP server created in 2015 and written in Go. The
server's philosophy and design emphasize HTTPS-everywhere along with the
HTTP/2 protocol.
How can Caddy be used with Python deployments?
Caddy can be used both for testing during local development or as part of a
production deployment as an HTTP server and a reverse proxy with the proxy
directive.
General Caddy resources
• A look inside Caddy shows and explains some of the Go code written to
build the server.
• The official Caddy server docs are the spot to look for what directives
can be placed into a Caddy configuration file
• Caddy a modern web server supporting HTTP/2 is a quick synopsis on
installing Caddy along with a short example configuration file.
378
Full Stack Python: 2020 Supporter's Edition
• HTTP 2.0 on localhost with Caddy shows how to use a self-signed
certificate with Caddy to do local development with an HTTP/2 web
server.
WSGI Servers
A Web Server Gateway Interface (WSGI) server implements the web server
side of the WSGI interface for running Python web applications.
Why is WSGI necessary?
A traditional web server does not understand or have any way to run Python
applications. In the late 1990s, a developer named Grisha Trubetskoy came up
with an Apache module called mod_python to execute arbitrary Python code.
For several years in the late 1990s and early 2000s, Apache configured with
mod_python ran most Python web applications.
However, mod_python wasn't a standard specification. It was just an
implementation that allowed Python code to run on a server. As
mod_python's development stalled and security vulnerabilities were
discovered there was recognition by the community that a consistent way to
execute Python code for web applications was needed.
Therefore the Python community came up with WSGI as a standard interface
that modules and containers could implement. WSGI is now the accepted
approach for running Python web applications.
As shown in the above diagram, a WSGI server simply invokes a callable
object on the WSGI application as defined by the PEP 3333 standard.
379
Full Stack Python: 2020 Supporter's Edition
WSGI's Purpose
Why use WSGI and not just point a web server directly at an application?
• WSGI gives you flexibility
flexibility. Application developers can swap out web
stack components for others. For example, a developer can switch from
Green Unicorn to uWSGI without modifying the application or
framework that implements WSGI. From PEP 3333:
The availability and widespread use of such an API in web servers for
Python [...] would separate choice of framework from choice of web
server, freeing users to choose a pairing that suits them, while freeing
framework and server developers to focus on their preferred area of
specialization.
• WSGI servers promote scaling
scaling. Serving thousands of requests for
dynamic content at once is the domain of WSGI servers, not
frameworks. WSGI servers handle processing requests from the web
server and deciding how to communicate those requests to an
application framework's process. The segregation of responsibilities is
important for efficiently scaling web traffic.
WSGI is by design a simple standard interface for running Python code. As a
web developer you won't need to know much more than
• what WSGI stands for (Web Server Gateway Inteface)
• that a WSGI container is a separate running process that runs on a
different port than your web server
• your web server is configured to pass requests to the WSGI container
which runs your web application, then pass the response (in the form
of HTML) back to the requester
380
Full Stack Python: 2020 Supporter's Edition
If you're using a standard web framework such as Django, Flask, or Bottle, or
almost any other current Python framework, you don't need to worry about
how frameworks implement the application side of the WSGI standard.
Likewise, if you're using a standard WSGI container such as Green Unicorn,
uWSGI, mod_wsgi, or gevent, you can get them running without worrying
about how they implement the WSGI standard.
However, knowing the WSGI standard and how these frameworks and
containers implement WSGI should be on your learning checklist though as
you become a more experienced Python web developer.
Official WSGI specifications
The WSGI standard v1.0 is specified in PEP 0333. As of September 2010,
WSGI v1.0 is superseded by PEP 3333, which defines the v1.0.1 WSGI
standard. If you're working with Python 2.x and you're compliant with PEP
0333, then you're also compliant with 3333. The newer version is simply an
update for Python 3 and has instructions for how unicode should be handled.
wsgiref in Python 2.x and wsgiref in Python 3.x are the reference
implementations of the WSGI specification built into Python's standard
library so it can be used to build WSGI servers and applications.
Example web server configuration
A web server's configuration specifies what requests should be passed to the
WSGI server to process. Once a request is processed and generated by the
WSGI server, the response is passed back through the web server and onto the
browser.
For example, this Nginx web server's configuration specifies that Nginx
should handle static assets (such as images, JavaScript, and CSS files) under
the /static directory and pass all other requests to the WSGI server running
on port 8000:
# this specifies that there is a WSGI server running on port 8000
upstream app_server_djangoapp {
server localhost:8000 fail_timeout=0;
}
# Nginx is set up to run on the standard HTTP port and listen for requests
server {
listen 80;
381
Full Stack Python: 2020 Supporter's Edition
# nginx should serve up static files and never send to the WSGI server
location /static {
autoindex on;
alias /srv/www/assets;
}
# requests that do not fall under /static are passed on to the WSGI
# server that was specified above running on port 8000
location / {
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Host $http_host;
proxy_redirect off;
if (!-f $request_filename) {
proxy_pass http://app_server_djangoapp;
break;
}
}
}
Note that the above code is a simplified version of a production-ready Nginx
configuration. For real SSL and non-SSL templates, take a look at the
Underwear web server templates on GitHub.
WSGI server implementations
There is a comprehensive list of WSGI servers on the WSGI Read the Docs
page. The following are WSGI servers based on community
recommendations.
• Green Unicorn is a pre-fork worker model based server ported from the
Ruby Unicorn project.
• uWSGI is gaining steam as a highly-performant WSGI server
implementation.
• mod_wsgi is an Apache module implementing the WSGI specification.
• CherryPy is a pure Python web server that also functions as a WSGI
server.
WSGI resources
• PEP 0333 WSGI v1.0 and PEP 3333 WSGI v1.0.1 specifications.
• This basics of WSGI post contains a simple example of how a WSGIcompatible application works.
382
Full Stack Python: 2020 Supporter's Edition
• A comparison of web servers for Python web apps is a good read to
understand basic information about various WSGI server
implementations.
• A thorough and informative post for LAMP-stack hosting choices is
presented in the "complete single server Django stack tutorial."
• The Python community made a long effort to transition from
mod_python to the WSGI standard. That transition period is now
complete and an implementation of WSGI should always be used
instead mod_python.
• How to Deploy Python WSGI Applications with CherryPy answers why
CherryPy is a simple combination web and WSGI server along with
how to use it.
• Another Digital Ocean walkthrough goes into How to Deploy Python
WSGI Apps Using Gunicorn HTTP Server Behind Nginx.
• The uWSGI Swiss Army Knife shows how uWSGI can potentially be
used for more than just running the Python web application - it can
also serve static files and handle caching in a deployment.
WSGI servers learning checklist
1. Understand that WSGI is a standard Python specification for
applications and servers to implement.
2. Pick a WSGI server based on available documentation and tutorials.
Green Unicorn is a good one to start with since it's been around for
awhile.
3. Add the WSGI server to your server deployment.
4. Configure the web server to pass requests to the WSGI server for
appropriate URL patterns.
5. Test that the WSGI server responds to local requests but not direct
requests outside your infrastructure. The web server should be the pass
through for requests to and responses from the WSGI server.
383
Full Stack Python: 2020 Supporter's Edition
Green Unicorn (Gunicorn)
Green Unicorn, commonly shortened to "Gunicorn", is a Web Server Gateway
Interface (WSGI) server implementation that is commonly used to run Python
web applications.
Why is Gunicorn important?
Gunicorn is one of many WSGI server implementations, but it's particularly
important because it is a stable, commonly-used part of web app deployments
that's powered some of the largest Python-powered web applications in the
world, such as Instagram.
Gunicorn implements the PEP3333 WSGI server standard specification so
that it can run Python web applications that implement the application
interface. For example, if you write a web application with a web framework
such as Django, Flask or Bottle, then your application implements the WSGI
specification.
How does Gunicorn know how to run my web app?
Gunicorn knows how to run a web application based on the hook between the
WSGI server and the WSGI-compliant web app.
Here is an example of a typical Django web application and how it is run by
Gunicorn. We'll use the django_defaults as an example Django project.
Within the django_defaults project subdirectory, there is a short wsgi.py file
with the following contents:
"""
WSGI config for django_defaults project.
It exposes the WSGI callable as a module-level variable named ``application``.
For more information on this file, see
https://docs.djangoproject.com/en/1.8/howto/deployment/wsgi/
384
Full Stack Python: 2020 Supporter's Edition
"""
import os
from django.core.wsgi import get_wsgi_application
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "django_defaults.settings")
application = get_wsgi_application()
That wsgi.py file was generated by the django-admin.py startproject
command when the Django project was first created. Django exposes an
application variable via that wsgi.py file so that a WSGI server can use
application as a hook for running the web app. Here's how the situation
looks visually:
What's a "pre-fork" worker model?
Gunicorn is based on a pre-fork worker model, compared to a worker model
architecture. The pre-work worker model means that a master thread spins up
workers to handle requests but otherwise does not control how those workers
perform the request handling. Each worker is independent of the controller.
Gunicorn resources
• There are three framework-specific posts on the [Full Stack Python
blog](https://www.fullstackpython.com/blog.html for configuring
Gunicorn for development on Ubuntu:
1. Setting up Python 3, Django and Gunicorn on Ubuntu 16.04 LTS
2. How to set up Python 3, Flask and Green Unicorn on Ubuntu
16.04 LTS
3. Configuring Python 3, Bottle and Gunicorn for Development on
Ubuntu 16.04 LTS
385
Full Stack Python: 2020 Supporter's Edition
• gunicorn as your Django development server is a short post with a few
good tips on using Gunicorn for local application development.
• The Full Stack Python Guide to Deployments provides detailed step-bystep instructions for deploying Gunicorn as part of an entire Python
web application deployment.
• Flask on Nginx and Gunicorn combines the Nginx web server with
Gunicorn in a deployment to serve up a Flask application.
• The answers to the question "what's the best practice for running
Django with Gunicorn?" provide some nuance for how Gunicorn
should be invokin the callable application variable provided by Django
within a deployment.
• How to Install Django with Gunicorn and Nginx on FreeBSD 10.2 is a
tutorial for FreeBSD, which is not often used in walkthroughs
compared to the frequency that Ubuntu and CentOS tutorials appear.
• How to make a Scalable Python Web App using Flask, Gunicorn,
NGINX on Ubuntu 14.04 and Deploy a Flask App on Ubuntu both
provide steps for setting up a Flask web app using Gunicorn. There
isn't much explanation provided with each tutorial but they can still be
good concise references in case you're having issues with Ubuntu.
• Deploying a Flask Site Using Nginx, Gunicorn, Supervisor and
Virtualenv on Ubuntu is a similar tutorial to the previous two links. It
provides some good screenshots along the way with what to expect
while you are configuring the deployment server.
• The Django and Flask documentation each contain instructions for
deploying the respective frameworks with Gunicorn.
• Set up Django, Nginx and Gunicorn in a Virtualenv controled by
Supervisor is a GitHub Gist with some great explanations for why we're
setting up virtualenv and what to watch out for while you're doing the
deployment.
• The Gunicorn design document is worth reading because it describes
the server model and gives some context on how to choose the number
of workers for your execution environment.
386
Full Stack Python: 2020 Supporter's Edition
• Configuring Gunicorn for containers explains how to avoid excessive
slowness in your Docker containers that run Gunicorn workers, as well
as some tips on proper logging.
mod_wsgi
mod_wsgi (source code) is an implementation of the Web Server Gateway
Interface specification for running Python code in the Apache HTTP Server.
mod_wsgi is implemented as an Apache HTTP Server module and it supports
both Python 2 and Python 3 code.
mod_wsgi resources
• The official documentation is wonderful for getting started as well as
digging into how to configure your server for better security.
uWSGI
uWSGI (source code), pronounced "mu wiz gee", is a Web Server Gateway
Interface (WSGI) server implementation that is typically used to run Python
web applications.
uWSGI resources
• Configuring uWSGI for Production Deployment explains how
Bloomberg uses uWSGI as a production WSGI server for some of their
Python projects and how to set it up for your own applications.
387
Full Stack Python: 2020 Supporter's Edition
• The official Django framework docs on how to use Django with uWSGI
along with the corresponding official uWSGI Django docs are great
places to start when you are trying to deploy your Django project.
• The uWSGI Swiss Army Knife examines a few lesser-known uWSGI
features that support serving static files, working with SSL, caching and
asynchronous task execution.
• How To Serve Django Applications with uWSGI and Nginx on Debian 8
shows how to set up a Django web app on Debian Linux that uses
Nginx as a web server and reverse proxy for the uWSGI server.
• The official uWSGI quickstart is awesome because it shows you how to
code a quick WSGI application without using a framework then builds
up an example with deploying a traditional Django web app.
• Deployment Notes for Pylons, Nginx, and uWSGI gives the code and
instructions for setting up a Pylons application with uWSGI.
Continuous Integration
Continuous integration automates the building, testing and deploying of
applications. Software projects, whether created by a single individual or
entire teams, typically use continuous integration as a hub to ensure
important steps such as unit testing are automated rather than manual
processes.
Why is continuous integration important?
When continuous integration (CI) is established as a step in a software
project's development process it can dramatically reduce deployment times by
minimizing steps that require human intervention. The only minor downside
to using CI is that it takes some initial time by a developer to set up and then
there is some ongoing maintainence if a project is broken into multiple parts,
such as going from a monolith architecture to microservices.
Automated testing
Another major advantage with CI is that testing can be an automated step in
the deployment process. Broken deployments can be prevented by running a
comprehensive test suite of unit and integration tests when developers check
388
Full Stack Python: 2020 Supporter's Edition
in code to a source code repository. Any bugs accidentally introduced during a
check-in that are caught by the test suite are reported and prevent the
deployment from proceeding.
The automated testing on checked in source code can be thought of like the
bumper guards in bowling that prevent code quality from going too far off
track. CI combined with unit and integration tests check that any code
modifications do not break existing tests to ensure the software works as
intended.
Continuous integration example
The following picture represents a high level perspective on how continuous
integration and deployment can work.
In the above diagram, when new code is committed to a source repository
there is a hook that notifies the continuous integration server that new code
needs to be built (the continuous integration server could also poll the source
code repository if a notification is not possible).
The continuous integration server pulls the code to build and test it. If all tests
pass, the continuous integration server begins the deployment process. The
new code is pulled down to the server where the deployment is taking place.
389
Full Stack Python: 2020 Supporter's Edition
Finally the deployment process is completed via restarting services and
related deployment activities.
There are many other ways a continuous integration server and its
deployments can be structured. The above was just one example of a relatively
simple setup.
Open source CI projects
There are a variety of free and open source continuous integration servers that
are configurable based on a project's needs.
Note that many of these servers are not written in Python but work just fine
for Python applications. Polyglot organizations (ones that use more than a
single language and ecosystem) often use a single CI server for all of their
projects regardless of the programming language the application was written
in.
• Jenkins is a common CI server for building and deploying to test and
production servers. Jenkins source code is on GitHub.
• Go CD is a CI server by ThoughtWorks that was designed with best
practices for the build and test & release cycles in mind. Go CD source
code is on GitHub.
• Bazel is a build tool that works with CI tools to organize large code
bases and provide consistency with a well-defined, automated build
process.
• BuildBot is a continuous integration framework with a set of
components for creating your own CI server. It's written in Python and
intended for development teams that want more control over their
build and deployment pipeline. BuildBot source code is on GitHub.
• TeamCity is JetBrains' closed source CI server that requires a license to
use.
Jenkins CI resources
Jenkins is commonly used as a continuous integration server implementation
for Python projects because it is open source and programming language
390
Full Stack Python: 2020 Supporter's Edition
agnostic. Learn more via the following resources or on the dedicated Jenkins
page.
• My book on deploying Python web applications walks through every
step of setting up a Jenkins project with a WSGI application to enable
continuous delivery. Take a look if you're not grokking all of the steps
provided in these other blog posts.
• Assembling a continuous integration service for a Django project on
Jenkins shows how to set up a Ubuntu instance with a Jenkins server
that'll build a Django project.
• Setting up Jenkins as a continuous integration server for Django is
another solid tutorial that also shows how to send email notifications
as part of the build process.
General continuous integration resources
• What is continuous integration? is a classic detailed article by Martin
Fowler on the concepts behind CI and how to implement it.
• Continuous Deployment For Practical People is not specific to Python
but a great read on what it entails.
• Continuous Integration & Delivery - Illustrated uses well done
drawings to show how continuous integration and delivery works for
testing and managing data.
• The real difference between CI and CD explains what advantages CI
provides, what constraints it operates under (such as total build time)
to work well, and how that is different from the related but distinct
concept of continuous delivery.
• 6 top continuous integration tools gives a high level overview of six CI
tools from a programming language agnostic perspective.
• Updating the GOV.UK Continuous Integration environment explains
the UK's Government Digital Service continuous integration
configuration that relies on Jenkins.
• StackShare's Continuous Integration tag lists a slew of hosted CI
services roughly ranked by user upvotes.
391
Full Stack Python: 2020 Supporter's Edition
• Good practices for continuous integration includes advice on checking
in code, commit tests and reverting to previous revisions.
• Scoring Continuous Integration gives an interesting perspective on
ways to rank the effectiveness of how teams use their CI tooling.
• Why Continuous Integration Is Important is a high-level overview of
how CI can build trust both among developers and between developers
and non-technical people in an organization. The post also discusses
tasks related to setting up reliable CI such as test environments,
integration testing and visibility into the CI results.
• Continuous Intrusion: Why CI tools are a hacker's best friend (PDF)
strongly advises securing your continuous integration server just as you
would every other part of your production application, unless you want
your environment to be vulnerable to malicious actors.
• Measuring and Improving your CI/CD Pipelines provides metrics for
what you should measure with your CI/CD setup to improve the
process for helping your development teams ship code.
• Six rules for setting up continuous integration systems has some solid
general advice for culling problematic tests, ensuring the integration
speed supports the development culture you are building and keeping
all code in source control instead of having complicated logic
configured within the CI server.
• How to Identify Major Blockers in a CI/CD Pipeline gives a high level
overview of concepts such as shipping velocity, test execution and
environment provisioning with regards to CI configurations.
Jenkins
Jenkins is a continuous integration (CI) server often used to automate
building, testing and deploying Python applications.
392
Full Stack Python: 2020 Supporter's Edition
Jenkins resources
• Assembling a continuous integration service for a Django project on
Jenkins shows how to set up a Ubuntu instance with a Jenkins server
that'll build a Django project.
• My book on deploying Python web applications walks through every
step of setting up a Jenkins project with a WSGI application to enable
continuous delivery. Take a look if you're not grokking all of the steps
provided in these other blog posts.
• Revisiting Docker and Jenkins is a fantastic series of posts that
explains how Riot Games combines Jenkins and Docker to test their
back end services.
• Setting up Jenkins as a continuous integration server for Django is
another solid tutorial that also shows how to send email notifications
as part of the build process.
• If you're running into difficulty adding an SSH key to your Jenkins
system account so you can connect to another server or Git repository
this blog post on connecting Jenkins with Git to get the steps to solve
that problem.
• Running Jenkins in Docker Containers is a short tutorial showing how
to use the official Jenkins container on the Docker hub.
• Securing Jenkins is the landing page for Jenkins security. If you're
deploying your own instance, you'll need to lock it down against
unauthorized users.
393
Full Stack Python: 2020 Supporter's Edition
• Updating the GOV.UK Continuous Integration environment describes
the configuration improvements one infrastructure team made to
Jenkins, where they enabled jenkinsfiles to store CI data within project
repositories instead of having to handle the setup through the Jenkins
user interface. The team also published their Puppet files for building
Jenkins infrastructure.
• Jenkins configuration as code details the launch of a new Jenkins tool
for programmatically configuring Jenkins so you can automate the
setup of this part of your deployment infrastructure. The post goes into
the motivations behind creating another tool for code configuration
when other similar libraries already exist.
• Automated servers and deployments with Ansible & Jenkins covers
using Ansible for automating deployments and handling the
coordination via Jenkins builds.
• Building GitHub Pull Requests using Jenkins Pipelines explains how to
use Jenkins 2.0 with Pipelines to create builds that run in Docker
containers off of new GitHub pull requests.
• Automated API testing with Jenkins walks through how to use Jenkins
to tests your API upon each deployment.
• Continuous Delivery with Jenkins and Rollbar is a tutorial on using
Jenkins for continuous integration paired with Rollbar for tracking
deployments and errors.
GoCD
GoCD is a continuous integration (CI) server often used to automatically
build, test and deploy Python applications.
394
Full Stack Python: 2020 Supporter's Edition
GoCD resources
• The official getting started with GoCD guide is a solid place to begin
configuring the CI server and learning about its basic concepts.
• py-gocd is a code library for programmatically interacting with GoCD
from your Python code.
Configuration Management
Configuration management involves modifying servers from an existing state
to a desired state and automating how an application is deployed.
Configuration management tools
Numerous tools exist to modify server state in a controlled way, including
Puppet, Chef, SaltStack, and Ansible. Puppet and Chef are written in Ruby,
while SaltStack and Ansible are written in Python.
Ad hoc tasks
Configuration management tools such as Chef, Puppet, Ansible, and SaltStack
are not useful for performing ad hoc tasks that require interactive responses.
Fabric and Invoke are used for interactive operations, such as querying the
database from the Django manage.py shell.
395
Full Stack Python: 2020 Supporter's Edition
Configuration management tool comparisons
• Moving away from Puppet: SaltStack or Ansible? is an openly biased
but detailed post on why to choose SaltStack over Ansible in certain
situations.
• Ansible vs. Shell Scripts provides some perspective on why using a
configuration management tool is a better choice than venerable but
brittle shell scripts.
• Ansible vs. Chef is a comparsion of Ansible with the Chef configuration
management tool.
• This post on Ansible and Salt: A Detailed Comparison shows the
differences between these two Python-powered tools.
Configuration management resources
• A quick guide to choosing infrastructure tools presents a high-level
overview of the categories of tools you will need to perform
infrastructure-as-code in an organization.
Ansible configuration management
Ansible is an open source configuration management and application
deployment tool built in Python.
Ansible Resources
• An Ansible Tutorial is a fantastically detailed introduction on using
Ansible to set up servers.
• Ansible Text Message Notifications with Twilio SMS is my blog post
with a detailed example for using the Twilio module in core Ansible
1.6+.
• Python for Configuration Management with Ansible slides from PyCon
UK 2013
• First Steps with Ansible
• Red Badger on Ansible
• Getting Started with Ansible
396
Full Stack Python: 2020 Supporter's Edition
• An introduction to Ansible is a tutorial on the basics of getting started
with the tool.
• Ansible and Linode
• Multi-factor SSH authentication with Ansible and Duo Security
• Introducing Ansible into Legacy Projects
• Post-install steps with Ansible
• 6 practices for super smooth Ansible experience
• Create a Couchbase Cluster with Ansible
• Idempotence, convergence, and other silly fancy words we often use
• Testing with Jenkins, Docker and Ansible
Application dependencies learning checklist
1. Learn about configuration management in the context of deployment
automation and infrastructure-as-code.
2. Pick a configuration management tool and stick with it. My
recommendation is Ansible because it is by far the easiest tool to learn
and use.
3. Read your configuration management tool's documentation and, when
necessary, the source code.
4. Automate the configuration management and deployment for your
project. Note that this is by far the most time consuming step in this
checklist but will pay dividends every time you deploy your project.
5. Hook the automated deployment tool into your existing deployment
process.
Ansible
Ansible is a configuration management tool used for application deployment
and environment setup.
397
Full Stack Python: 2020 Supporter's Edition
Example Ansible playbooks
Ansible is far easier to learn when you can read how more full-featured
playbooks are built using many tasks. An interesting note from my own
experience is that when you get more experienced using Ansible there are
many shortcuts in the task syntax so you can often make playbooks that have
fewer lines of code than when you were less experienced yet the readability
does not suffer.
Check out some of these example playbooks to learn more about how you may
be able to structure your playbooks:
• The prod directory under the Full Stack Python Deployment Guide
open source project code contains a full playbook for deploying a
standard Nginx, Gunicorn and PostgreSQL stack.
• mac-dev-playbook configures macOS with various applications and
developer tools such as Docker, Homebrew and Sublime Text.
• ansible-nginx-haproxy-elasticsearch sets up a server with Nginx,
HAProxy and ElasticSearch.
Specific Ansible topics
• An Ansible2 Tutorial is an incredibly detailed look at how one
developer installs and run Ansible.
• This retrospective from a developer on lessons from using Ansible
exclusively for 2 years explains his rationale for choosing Ansible over
Puppet and Chef, then goes through several use cases and best
practices learned over time with the tool.
398
Full Stack Python: 2020 Supporter's Edition
• Using Ansible for deploying serverless applications provides a short
overview with an example playbook how Ansible can also be useful for
configuring serverless applications.
• Painless Immutable Infrastructure with Ansible and AWS covers the
steps needed for the unique authentication complexities that arise from
using Amazon Web Services for your infrastructure.
• DevOps from Scratch, Part 1: Vagrant & Ansible
• Ansible: Post-Install Setup
• How To Use Vault to Protect Sensitive Ansible Data on Ubuntu 16.04
• How to use Ansible Variables and Vaults
• CI for Ansible playbooks which require Ansible Vault protected
variables
• How to use Ansible to manage PostgreSQL
• Deploy A Replicated MongoDB instance on AWS with Terraform and
Ansible
Salt
Salt (source code) is a configuration management tool used for application
deployment and setting up development environments.
Salt resources
• What's new in Salt 3000 Neon covers the latest release and the
significant number of new features and fixes contained within it.
• Introduction to Salt gives a 30 second summary of what the tool can do
for you then provides a collection of links to other resources that plug
399
Full Stack Python: 2020 Supporter's Edition
you into the Salt community, such as the salt-users mailing list and the
Salt Stack company blog.
• Linode has two great beginner guides to Salt, the first one on Getting
Started with Salt - Basic Installation and Setup and the other titled A
Beginner's Guide to Salt.
Containers
Containers are an operating system-level isolation mechanism for running
processes and other system resources from other containers and the base
system.
Are containers new?
Containers are not conceptually new, dating back to around the 1970s, but
they gained rapid adoption starting in 2012 when several Linux distributions
began integrating containers tools generally made it more practical to use
them. Previously, to use containers a developer would need to use a less
common operating system or distribution customized with some sort of
virtual machine feature. Using containers was basically not supported in
typical deployment scenarios.
Containers history and introduction
The following resources do a great job of explaining where the containers
concept came from, how they differ from virtual machines and why they are
useful.
• The Missing Introduction To Containerization truly grants what its title
sets out to do: give a wide-ranging overview and history of container
concepts and tools. The post is dense but well worth the read to start
learning about chroot , Solaris zones, LXC, Docker and how they've
influenced each other throughout the past 40 years.
• A brief history of containers has some solid context for why containers
have taken off in the last several years, including the integration of
operating system container virtualization in most distributions as well
as the creation of management tools such as Docker, Kubernetes,
Docker Swarm and Mesosphere.
400
Full Stack Python: 2020 Supporter's Edition
• Setting the Record Straight: containers vs. Zones vs. Jails vs. VMs
compares and contrasts the designs of Linux containers, zones, jails
and virtual machines. Containers typically take advantage of primitives
but are more complicated because they have more individual parts put
together while zones and jails are designed as top-level operating
system components. There are advantages and disadvantages of these
approaches that you should understand as you use each one.
• Containers and Distributed Systems: Where They Came From and
Where They’re Going is an interview that digs into the past, present
and future of containers based on the experience of Chuck McManis
who has worked on building jails and other process isolation
abstractions into operating systems.
• Linux containers in a few lines of code shows how containers work by
providing some code to run a busybox Docker image but without using
docker. It then explains what's happening under the hood as you run
basic commands such as /bin/sh .
• A Practical Introduction to Container Terminology has both some solid
introductory information on containers as well as a good description of
terms such as container host, registry server, image layer, orchestration
and many others that come up frequently when using containers.
• Containers from scratch explains how Linux features such as cgroups ,
chroot and namespaces are used by container implementations.
• Running containers without Docker reviews a migration path for an
organization that already has a bunch of infrastructure but sees
advantages in using containers. However, the author explains why you
can use containers without Docker even if you eventually plan to use
Docker, Kubernetes or other container tools and orchestration layer.
• mocker is a Docker imitation open source project written in all Python
which is intended for learning purposes.
Working with containers
You can get started using containers once you understand some of the
terminology and work through a couple of introductory tutorials like the ones
401
Full Stack Python: 2020 Supporter's Edition
listed above. Check out the below resources when you want to do more
advanced configurations and dig deeper into how containers work.
• Linux containers in 500 lines of code is a bonkers in-depth post about
building your own simplified, but not simple version of Docker to learn
how it works.
• A Comparison of Linux Container Images presents data on many of the
frequently-used base container images.
• 7 best practices for building containers provides Google's
recommendations for creating containers such as include only a single
application per container, make sure to use descriptive tags and build
the smallest image size possible.
• Building healthier containers examines how Docker containers are
different from virtual machines and digs into dependencies that can be
included in your container image if you do not know how to properly
build them.
• Containers patterns covers common usage patterns that have
developed now that containers have been in development workflows
for a few years.
Container security resources
Container security is a hot topic because there are so many ways of screwing it
up, just like any infrastructure that runs your applications. These resources
explain security considerations specific to containers.
• Building Container Images Securely on Kubernetes discusses some of
the issues with building containers and why the author created img as a
tool to help solve the problems she was seeing.
• Making security invisible is a great presentation that covers sandboxes,
Seccomp and other concepts for isolating potentially unsafe code to
limit attack scope.
• 10 layers of Linux container security explains many of the attack
vectors you need to be aware of when you are working with containers.
402
Full Stack Python: 2020 Supporter's Edition
Docker
Docker (source code for core Docker project) is an infrastructure management
platform for running and deploying software. The Docker platform is evolving
so an exact definition is currently a moving target, but the core idea behind
Docker is that operating system-level containers are used as an abstraction
layer on top of regular servers for deployment and application operations.
Why is Docker important?
Docker can package up applications along with their necessary operating
system dependencies for easier deployment across environments. In the long
run it has the potential to be the abstraction layer that easily manages
containers running on top of any type of server, regardless of whether that
server is on Amazon Web Services, Google Compute Engine, Linode,
Rackspace or elsewhere.
Python projects within Docker images
• This Docker image contains a Flask application configured to run with
uWSGI and Nginx. You can also see the image on Docker hub.
• minimal-docker-python-setup contains an image with Nginx, uWSGI,
Redis and Flask.
Docker resources
• Beginners guide to Docker explains what it is, the difference between
containers and virtual machines, and then provides a hands-on
walkthrough command-driven tutorial.
403
Full Stack Python: 2020 Supporter's Edition
• What is Docker and When to Use It clearly delineates what Docker is
and what it isn't. This is a good article for when you're first wrapping
your head around Docker conceptually.
• rubber-docker is an open source repository and tutorial that shows you
how to recreate a simplified version of Docker to better understand
what it's doing under the hood.
• Andrew Baker presented a fantastic tutorial at PyOhio on beginner and
advanced Docker usage. Andrew also wrote the article what containers
can do for you.
• Docker curriculum is a detailed tutorial created by a developer to show
the exact steps for deploying an application that relies on Elasticsearch.
• How To Install and Use Docker on Ubuntu 16.04 provides a
walkthrough for Ubuntu 16.04 for installing and beginning to use
Docker for development.
• It Really is the Future discusses Docker and containers in the context of
whether it's all just a bunch of hype or if this is a real trend for
infrastructure automation. This is a great read to set the context for
why these tools are important.
• Docker Jumpstart is a comprehensive introduction to what Docker is
and how to get started with using the tool.
• If you want to quickly use Docker on Mac OS X, check out these concise
instructions for setting up your Docker workflow on OS X in 60
seconds.
• What the Bleep is Docker? is a plain English explanation with examples
of what Docker provides and what it can be used for in your
environment.
• Docker in Practice - A Guide for Engineers is an explanation of the
concepts and philosophy by the authors of the new Manning Docker
book in early access format.
• Building Docker containers from scratch is a short tutorial for creating
a Docker container with a specific configuration.
404
Full Stack Python: 2020 Supporter's Edition
• 10 things to avoid in Docker containers provides a lot of "don'ts" that
you'll want to consider before bumping up against the limitations of
how containers should be used.
• Docker Internals presents Linux containers and how Docker uses them
as its base for how the project works. This article is a great way to
bridge what you know about Docker with a more traditional Linux
operating system architecture understanding.
• Improve your Dockerfile, best practices covers image size, layers,
starting scripts and LABEL.
• This post gives an overview and comparison of Docker GUIs which can
be handy for monitoring your Docker containers.
Python-specific Docker resources
• Dockering Django, uWSGI and PostgreSQL the serious way walks
through both the code and the error messages that will likely crop up as
you attempt to container-ize a Django project that uses a PostgreSQL
database on the backend.
• How to deploy Django using Docker assumes you already have the
basic grasp of working with Docker and jumps right into a Django
deployment. The post shows you how to set up your Dockerfile and
explains that GitLab CI can be used to to build this Docker image.
• Hosting Python WSGI applications using Docker shows how to use
Docker in WSGI application deployments specifically using mod_wsgi.
• How to Containerize Python Web Applications is an extensive tutorial
that uses a Flask application and deploys it using a Docker container.
• How to use Django, PostgreSQL, and Docker shows how to get a
Django project that uses PostgreSQL as its back end running in Docker.
• Docker in Action - Fitter, Happier, More Productive is a killer tutorial
that shows how to combine Docker with CircleCI to continuously
deploy a Flask application.
405
Full Stack Python: 2020 Supporter's Edition
• Building smaller Python Docker images examines how to inspect layers
in Dockerfiles and minimize the overhead of what images contain for
better performance.
• The Flask Mega-Tutorial Part XIX: Deployment on Docker Containers
is one post in Miguel Grinberg's absolutely spectacular Flask
application series.
• Deploying Django Applications in Docker explains some of the
concepts behind using Docker for Python deployments and shows how
to specifically use it for deploying Django.
• Using Docker and Docker Compose to replace virtualenv is a tutorial
for using Docker instead of virtualenv for dependency isolation.
• Lincoln Loop wrote up a closer look at Docker from the perspective of
Python developers handling deployments.
• Curious how pip and Docker can be used together? Read this article on
Efficient management Python projects dependencies with Docker.
• Python virtual environments and Docker goes into detail on whether
virtual environments should be used with Docker and how system
packages can generally be a safer route to go.
• Dockerizing Django with Postgres, Gunicorn, and Nginx details how to
configure Django to run on Docker along with Postgres, Nginx and
Gunicorn.
• Dockerizing a Python Django Web Application is another in-depth
tutorial on combining Docker with Django.
• Dockerizing Flask with Postgres, Gunicorn, and Nginx looks at how to
configure Flask to run on Docker along with Postgres, Nginx, and
Gunicorn.
Kubernetes
Kubernetes (source code) is a container orchestration system for deploying,
scaling and operating applications.
406
Full Stack Python: 2020 Supporter's Edition
Kubernetes tools
• Helm (source code) is a package manager for Kubernetes charts, which
are the way to define common types of Kubernetes cluster
arrangements, like MySQL, Cassandra or Jenkins.
• Gitkube (source code) makes it possible to deploy an application to
Kubernetes using git push , similar to how Heroku popularized
making platform-as-a-service deployments easy.
• Kompose (source code) translate Docker Compose files into
Kubernetes configuration resources.
• skaffold. Using Kubernetes for local development is a good starting
place for more information on getting started with Skaffold.
• kubethanos is a tool to kill half of your Kubernetes pods at random, to
test the resilience of your infrastructure under highly chaotic scenarios.
Kubernetes background and retrospectives
• Borg, Omega and Kubernetes goes into the history of Borg and Omega,
projects that preceded Kubernetes' creation. There are a ton of great
notes on why they developed the project in certain ways and what they
knew to avoid based on the prior work on Borg and Omega.
• Kubernetes at GitHub provides a retrospective on transitioning
GitHub's infrastructure from a traditional Ruby on Rails deployment
architecture to a more scalable container-based Kubernetes system.
There are some great details on the steps in the transition and ramping
up capacity until it was the full system for github.com and other critical
services.
407
Full Stack Python: 2020 Supporter's Edition
• Reasons Kubernetes is cool breaks past the "why would I ever need
this?" initial developer reaction and gives solid reasons such as better
visibility into all of the services running on your Kubernetes cluster and
potentially much faster deployment after appropriate configuration.
• How we designed our Kubernetes infrastructure on AWS explains how
the Kubernetes Infrastructure Technology Team (yes, that abbreviates
to KITT in honor of the 1980s Knight Rider TV show) at Atlassian
starting using the tool and how they have built infrastructure around it
for the company to operate their container-ized applications.
• 10 Most Common Reasons Kubernetes Deployments Fail goes over
many of the top technical reasons why issues come up with Kubernetes
and what you need to do to avoid or work through them.
• Draft vs Gitkube vs Helm vs Ksonnet vs Metaparticle vs Skaffold gives a
great overview of the most popular tools that make it easier to use
Kubernetes.
• Architecting applications for Kubernetes is stuffed full of great design
advice that is now available as people having been using Kubernetes for
a couple of years.
• "Let’s use Kubernetes!" Now you have 8 problems is a counterargument for why you should be cautious about introducing the
significant complexity overhead of Kubernetes (or any related tools)
into your environment unless you really need the advantages that they
can provide. Each developer, team and organization should perform an
explicit cost-benefit analysis to make sure the tool's scability, reliability
and related functionality will outweigh the downsides.
• How Zolando manages 140+ Kubernetes clusters covers the
architecture, monitoring and workflow of a team that has to run a
decent number of clusters for their development teams.
Kubernetes tutorials
• Kubernetes The Hard Way is a tutorial that walks you through
manually setting up a Kubernetes cluster. The purpose is to teach you
what is happening at each step instead of performing everything
408
Full Stack Python: 2020 Supporter's Edition
through automation like you normally would after you understand how
to use the tool.
• Kubernetes Any% Speedrun hilariously presents the pain of using
Kubernetes and gives the basic steps for getting a deployment up and
running.
• A Gentle introduction to Kubernetes with more than just the basics is a
Git README tutorial with clear steps for how to get started running a
Kubernetes cluster.
• Anatomy of my Kubernetes Cluster shows how one developer created
their own Raspberry Pi cluster that could run Kubernetes to learn more
about how it works.
• The cult of Kubernetes is a hilarious rant that also manages to teach the
reader a lot about how to avoid some big issues the author ran into
while working with Kubernetes for simple starter projects.
• Kubernetes by Example provides the commands and code for you to
get started with the core Kubernetes concepts.
• Your instant Kubernetes cluster provide a concise set of instructions for
setting up a cluster.
• A tutorial introduction to Kubernetes covers a bunch of introductory
steps using an example Python application.
• An Example Of Real Kubernetes: Bitnami gives instructions for what to
do after you have finished creating a Kubernetes cluster and learned
the "Hello, World!"-style example.
• Kubernetes Production Patterns is a tutorial with good and bad
practices so you can learn what to do and what to avoid in your
Kubernetes infrastructure.
• Django Production Deployment on GCP with Kubernetes uses Helm to
make it easier to deploy the example Django web app with a
PostgreSQL backend.
• K8s YAML Alternative: Python shows how you can use Python scripts
instead of YAML to configure your Kubernetes clusters.
409
Full Stack Python: 2020 Supporter's Edition
Serverless
Serverless is a deployment architecture where servers are not explicitly
provisioned by the deployer. Code is instead executed based on developerdefined events that are triggered, for example when an HTTP POST request is
sent to an API a new line written to a file.
How can code be executed "without" servers?
Servers still exist to execute the code but they are abstracted away from the
developer and handled by a compute platform such as Amazon Web Services
Lambda or Google Cloud Functions.
Think about deploying code as a spectrum, where on one side you build your
own server from components, hook it up to the internet with a static IP
address, connect the IP address to DNS and start serving requests. The
hardware, operating system, web server, WSGI server, etc are all completely
controlled by you. On the opposite side of the spectrum are serverless
compute platforms that take Python code and execute it without you ever
touching hardware or even knowing what operating system it runs on.
In between those extremes are levels that remove the need to worry about
hardware (virtual private servers), up through removing concerns about web
servers (platforms-as-a-service). Where you fall on the spectrum for your
deployment will depend on your own situation. Serverless is simply the
newest and most extreme of these deployment models so it is up to you as to
how much complexity you want to take on with the deployment versus your
control over each aspect of the hardware and software.
410
Full Stack Python: 2020 Supporter's Edition
Serverless implementations
Each major cloud vendor has a serverless compute implementation. These
implementations are under significant active development and not all of them
have Python support.
• AWS Lambda is the current leader among serverless compute
implementations. It has support for both Python 2.7 and Python 3.6/
3.7.
• Azure Functions has second-class citizen support for Python. It's
supposed to be possible but kind of hacky at the moment. Polyglot
support should be quickly coming to Azure to better compete with AWS
Lambda.
• IBM Bluemix OpenWhisk is based on the Apache OpenWhisk open
source project.
• Google Cloud Functions currently only supports JavaScript code
execution.
• Webtask.io also only supports JavaScript but there is a cool prototype
project named webtask-pytask to run Python code in the browser via
webtask. This demo is definitely not for production code use but
awesome to see what the programming community can put together
using existing code and services.
Serverless frameworks
Serverless libraries and frameworks aim to provide reusable code that handles
common or tedious tasks, similar to how web frameworks deal with common
web development tasks. Some of these frameworks are built for a single
service like AWS Lambda, while others attempt to make cross-serverless
operations more palatable.
Frameworks for building Python-based applications on serverless services
include:
• Serverless (source code), which is a useful but generically-named
library that focuses on deployment and operations for serverless
applications.
411
Full Stack Python: 2020 Supporter's Edition
• Zappa (source code) provides code and tools to make it much easier to
build on AWS Lambda and AWS API Gateway than rolling your own on
the bare services.
• Chalice (source code) is built by the AWS team specifically for Python
applications.
• Apex (source code)
General serverless resources
Serverless concepts and implementations are still in their early iterations so
there are many ideas and good practices yet to be discovered. These resources
are the first attempts at figuring out how to structure and operate serverless
applications.
• Serverless software covers a range of topics under serverless and how
deployments have changed as new options such as PaaS have become
widespread.
• Lessons Learned — A Year Of Going “Fully Serverless” In Production is
a retrospective from a small development team that combines a static
site with serverless backend code to easily scale their site without an
operations staff. They discuss the good and the bad of working in this
fashion while generally coming away with a positive experience.
• From bare metal to Serverless gives some historical detail and
background context for how various execution architectures have
evolved, from the invention of the web to software-as-a-service,
infrastructure-as-a-service to today's newer serverless platforms.
• Have you shipped anything serious with a “serverless” architecture?
provides some great answers by Hacker News developers who are using
serverless for large production applications and how they deal with the
limitations of the platforms.
• Serverless cold start war compares startup times of serverless function
instances across Google Cloud, AWS and Azure. This is only one way to
measure the results the author did a great job presenting the data and
elaborating on potential reasons why the results appeared as shown.
412
Full Stack Python: 2020 Supporter's Edition
• Serverless Deployments of Python APIs is a wonderful Python-specific
article on how to use AWS Lambda, API Gateway and DynamoDB to
create and deploy a Python API.
• What's this serverless thing, anyway?
• Serverless architectures - let's ditch the servers?
• Serverless architectures provides a fantastic overview of the subject
with a balanced approach that includes the drawbacks seen in current
serverless platforms.
• Why the fuss about serverless? is a wide-ranging post about the history
of application development and infrastructure. The timeline is a bit
hard to follow but otherwise it's a unique look at why software
deployments are moving to serverless-based architectures and the
advantages that can provide.
• Serverless architectures in short lays out some of the initial thoughts
behind what the advantages and disadvantages of serverless may be.
However, it's early days for serverless so these strengths and
weaknesses may change as the architectures and good practices evolve.
• Building A Serverless Contact Form For Your Static Site uses AWS
Lambda, some HTML and JavaScript to add an input form to a static
website created by a static site generator.
• Serverless architectures, five design patterns goes over the four main
principles of serverless infrastructure and the five major usage patterns
the AWS Lambda team is seeing from initial serverless deployments.
• Serverless Cost Calculator estimates the amount each serverless
platform would charge based on executions, average execution time
and memory needed per execution. AWS Lambda, Google Cloud
Functions, Azure Functions and IBM OpenWhisk are all included in
the results.
Serverless environment comparsions
The "big 3" serverless platforms, AWS Lambda, Azure Functions and Google
Cloud Functions have varying degrees of support for Python. AWS Lambda
has production-ready support for Python 2 and 3.7, while Azure and Google
413
Full Stack Python: 2020 Supporter's Edition
Cloud have "beta" support with unclear production-worthiness. The following
resources are some comparison articles to help you in your decision-making
process for which platform to learn.
• Serverless at scale compares the "Big 3" AWS, Azure and Google Cloud
in serverless performance. The author provides some nice data around
average response times and outliers.
• Serverless hosting comparison is a broad overview of documentation,
community, pricing and other notes for the major platforms as well as
IBM OpenWhisk and the Fission.io project.
• Microsoft Azure Functions vs. Google Cloud Functions vs. AWS
Lambda presents an overview of Azure Functions and how they
compare to Google Cloud Functions and AWS Lambda.
Serverless vendor lock-in?
There is some concern by organizations and developers about vendor lock-in
on serverless platforms. It is unclear if portability is worse for serverless than
other infrastructure-as-a-service pieces, but still worth thinking about ahead
of time. These resources provide additional perspectives on lock-in and using
multiple cloud providers.
• On Serverless, Multi-Cloud, and Vendor Lock In is an opinion piece
that for most cases the additional work of going multi-cloud is not
worth the tradeoffs, therefore at this time it's better to go for a single
vendor such as AWS or Azure and optimize on that platform.
• Why vendor lock-in with serverless isn’t what you think it is
recommends using a single vendor for now and stop worrying about
hedging your bets because it typically makes your infrastructure
significantly more complex.
AWS Lambda
Amazon Web Services (AWS) Lambda is a compute service that executes
arbitrary Python code in response to developer-defined AWS events, such as
inbound API calls or file uploads to AWS' Simple Storage Service (S3).
414
Full Stack Python: 2020 Supporter's Edition
Why is Lambda useful?
Lambda is often used as a "serverless" compute architecture, which allows
developers to upload their Python code instead of spinning and configuring
servers, deploying their code and scaling based on traffic.
Python on AWS Lambda
Lambda only had support for JavaScript, specifically Node.JS, when it was
first released in late 2014. Python 2 developers were welcomed to the platform
less than a year after its release, in October 2015. Lambda now has support for
both Python 2.7, 3.6 and 3.7.
Python-specific AWS Lambda resources
• Serverless Slash Commands with Python shows how to use the Slack
API to build slash commands that run with an AWS Lambda backend.
• Zappa is a serverless framework for deploying Python web
applications. It's a really slick project and used even by internal AWS
developers for their own application deployments. Be sure to read the
Zappa blog as well for walkthroughs and new feature announcements.
• How to Setup a Serverless URL Shortener With API Gateway Lambda
and DynamoDB on AWS builds a non-trivial URL shortener
application as an example Python application that runs on Lambda.
415
Full Stack Python: 2020 Supporter's Edition
• How we built Hamiltix.net for less than $1 a month on AWS walks
through setting up a full website that runs on AWS and scales with the
Lambda free tier to minimize spend despite large traffic spikes.
• Deploying a serverless Flask app to AWS Lambda using Zappa provides
a screen capture of one developer deploying their application to
Lambda.
• Automated SQL Injection Testing of Serverless Functions On a
Shoestring Budget (and Some Good Music) is an awesome operational
security post that uses Python to test for SQL injection vulnerabilities
in serverless functions on AWS Lambda.
• Building Scikit-Learn For AWS Lambda follows up on the Using ScikitLearn In AWS Lambda post which shows how to perform scientific
computing with Python packages on AWS Lambda.
• Creating Serverless Functions with Python and AWS Lambda explains
how to use the Serverless framework to build Python applications that
can be deployed to AWS Lambda.
• Code Evaluation With AWS Lambda and API Gateway shows how to
develop a code evaluation API, to execute arbitrary code, with AWS
Lambda and API Gateway.
• Crawling thousands of products using AWS Lambda gives a real-world
example of where using Python, Selenium and headless Chrome on
AWS Lambda could crawl thousands of pages to collect data with each
crawler running within its own Lambda Function.
• Going Serverless with AWS Lambda and API Gateway
General AWS Lambda resources
• Getting started with serverless on AWS is a wonderful tutorials,
example projects and additional resources guide created by a developer
who used all of these bits to learn AWS services herself.
• AWS Lambda Serverless Reference Architectures provides blueprints
with diagrams of common architecture patterns that developers use for
their mobile backend, file processing, stream processing and web
application projects.
416
Full Stack Python: 2020 Supporter's Edition
• Security Overview of AWS Lambda (PDF file) covers their "Shared
Responsibility Model" for security and compliance. Although the paper
bills itself as an in-depth look at AWS Lambda security it is really more
of a high-level overview, but still worth the read.
• Reverse engineering AWS Lambda is an incredible, in-depth analysis of
the author's work investigating the black box of how Lambda works
and what he learned from it.
• The AWS Lambda tag on the official AWS blog contains all the related
first-party tutorials
• Serverless Cost Calculator estimates the amount that AWS would
charge based on Lambda exeuctions, average execution time and
memory needed per execution.
• Serverless at Nordstrom is an awesome real-world story with the
architecture behind a serverless AWS Lambda application deployment
at Nordstrom.
• How was your experience with AWS Lambda in production? has a good
discussion of some of the benefits and issues that developers had as of
mid-2017 with using Lambda for production applications.
• Passwordless database authentication for AWS Lambda shows how to
use a MySQL backend from your Lambda functions.
• How does language, memory and package size affect cold starts of AWS
Lambda? investigates the performance implications of various Lambda
settings.
• Best Practices for AWS Lambda Timeouts explains some of the current
hard upper limits on AWS timeouts, such as 5 minutes for Lambdas,
when explicitly set that high, as well as 29 seconds for API Gatway
requests. There is also good advice on how the circuit breaker pattern
should be applied to your Lambdas and ultimately why low time outs
are likely the best way to go to prevent your application from becoming
entirely unresponsive.
• X-rays for Flask and Django Serverless Applications is an
instrumentation, monitoring and debugging service built into AWS
417
Full Stack Python: 2020 Supporter's Edition
Lambda specifically for Python web frameworks running on the
service.
• Cutting Through the Layers: AWS Lambda Layers Explained explains
how AWS Lambda now offer a "Bring Your Own Runtime" by exposing
the layers that were previously controlled exclusively by Amazon. There
is an overview of the layers and why they matter for customizing your
functions.
Azure Functions
Azure Functions is a compute service created by Microsoft that can execute
Python code in response to pre-defined events, such as API calls or database
transactions in other Azure services.
Azure Functions resources
• An introduction to Azure Functions is the official quickstart guide by
Microsoft and has some good high-level information on their
platform's services as well.
• Deploy Python to Azure Functions provides the step-by-step
instructions needed to get Python code running on Azure Functions.
• Azure Functions vs AWS Lambda – Scaling Face Off contains metrics
from comparing AWS Lambda with Azure Functions in response time,
user load, requests per second and error rate over various periods of
time.
• How to build a serverless report server with Azure Functions and
SendGrid combines the Sendgrid email API with some configuration
code to have Azure Functions kick off email jobs.
418
Full Stack Python: 2020 Supporter's Edition
• azure-cli are the command line tools for using all of Azure, not just
Functions.
• My (Rough) Start with Azure Functions painstakingly details signing
up for Azure, accessing Functions and finally coding a Function. The
author has some really great points on what is confusing to newcomers
that hopefully will be addressed as Microsoft continues to work on
their Azure platform.
• Azure in Plain English covers all of the Azure services and explains
them because their default names are often too vague to understand
their purpose.
Google Cloud Functions
Google Cloud Functions is a serverless compute service that executes arbitrary
code. However, at this time Google Cloud Functions is a beta service that does
not work with Python.
Amazon Web Services (AWS) Lambda is currently a good alternative until
Google Cloud Functions supports Python.
Google Cloud Functions resources
• Serverless Cost Calculator estimates the amount that AWS would
charge based on Lambda exeuctions, average execution time and
memory needed per execution.
• Microsoft Azure Functions vs. Google Cloud Functions vs. AWS
Lambda gives an overview of these three serverless cloud platforms and
compares their current capabilities.
• Getting Started With Google Cloud Functions and MongoDB explains
how to connect a Google Cloud Function to a persistent MongoDB data
store.
• Running End to End tests as Google Cloud Functions provides the code
and a walkthrough for running a headless Chrome instance in a
serverless Cloud Function to perform browser-based tests.
419
Full Stack Python: 2020 Supporter's Edition
DevOps
DevOps is the combination of application development and operations, which
minimizes or eliminates the disconnect between software developers who
build applications and systems administrators who keep infrastructure
running.
Why is DevOps important?
When the Agile methodology is properly used to develop software, a new
bottleneck often appears during the frequent deployment and operations
phases. New updates and fixes are produced so fast in each sprint that
infrastructure teams can be overwhelmed with deployments and push back on
the pace of delivery. To alleviate some of these issues, application developers
are asked to work closely with operations folks to automate the delivery of
code from development to production.
DevOps tooling resources
DevOps cannot be performed with tools alone, but having the right tools to
augment the culture and processes is important to successful software
delivery. The following resources discuss both Python-specific and general
tools and services for DevOps environments.
• DevOps: Python tools to get started is a presentation slideshow that
explains that while DevOps is a culture, it can be supported by tools
such as Fabric, Jenkins, BuildBot and Git which when used properly
can enable continuous software delivery.
• For an Atlassian-centric perspective on tooling, take a look at this post
on how to choose the right DevOps tools which is biased towards their
tools but still has some good insight such as using automated testing to
provide immediate awareness of defects that require fixing.
General DevOps resources
The following resources give advice and approaches for building the right
teams, culture, processes and tools into software development organizations.
421
Full Stack Python: 2020 Supporter's Edition
• DevOps vs. Platform Engineering considers DevOps an ad hoc
approach to developing software while building a platform is a strict
contract. I see this as "DevOps is a process", while a "platform is code".
Running code is better than any organizational process.
• The open source PagerDuty Incident Response guide is the amazing
result from their company taking the practices they use to keep their
services running and putting them out for other developers to
consume. Highly recommended.
• Operations for software developers for beginners gives advice to
developers who have never done operations work and been on call for
outages before in their career. The advantage of DevOps is greater
ownership for developers who built the applications running in
production. The disadvantage of course is the greater ownership also
leads to much greater responsibility when something breaks!
• Google's Site Reliability Engineering (SRE) book is free online and
required reading for understanding the practices and principles behind
keeping the largest websites alive. Note though that some of the advice
in the book will be considered controversial at more stodgy traditional
organizations that have done operations differently for a long time.
There is also a wonderful interview with Ben Treynor, one of the
authors of the book, that contains additional information.
• The Increment, Stripe's fantastic digital and print magazine, has an
issue dedicated to being on-call which discusses many DevOps-related
topics such as what happens when your pager goes off, ownership and
how startups can be different from large companies with their incident
responses.
• Bing: Continuous Delivery is an impressive visual story that explains
the practices for how their team delivers updates to the search engine.
• Why are we racing to DevOps? is a very high level summary of the
benefits of DevOps to IT organizations. It's not specific to Python and
doesn't dive into the details, but it's a decent start for figuring out why
IT organizations consider DevOps the hot new topic after adopting an
Agile development methodology.
422
Full Stack Python: 2020 Supporter's Edition
• SRE vs. DevOps: competing standards or close friends? covers Google's
take on how Site Reliability Engineering (SRE) fits with the DevOps
world. Roughly speaking, SRE is more closely aligned with metrics and
how to operate infrastructure and applications, rather than the broader
principles embodied by the DevOps philosophy.
Monitoring
Monitoring tools capture, analyze and display information for a web
application's execution. Every application has issues arise throughout all
levels of the web stack. Monitoring tools provide transparency so developers
and operations teams can respond and fix problems.
Why is monitoring necessary?
Capturing and analyzing data about your production environment is critical to
proactively deal with stability, performance, and errors in a web application.
Difference between monitoring and logging
Monitoring and logging are very similar in their purpose of helping to
diagnose issues with an application and aid the debugging process. One way
to think about the difference is that logging happens based on explicit events
while monitoring is a passive background collection of data.
For example, when an error occurs, that event is explicitly logged through
code in an exception handler. Meanwhile, a monitoring agent instruments the
code and gathers data not only about the logged exception but also the
performance of the functions.
This distinction between logging and monitoring is vague and not necessarily
the only way to look at it. Pragmatically, both are useful for maintaining a
production web application.
Monitoring layers
There are several important resources to monitor on the operating system and
network level of a web stack.
1. CPU utilization
2. Memory utilization
423
Full Stack Python: 2020 Supporter's Edition
3. Persistence storage consumed versus free
4. Network bandwidth and latency
Application level monitoring encompasses several aspects. The amount of
time and resources dedicated to each aspect will vary based on whether an
application is read-heavy, write-heavy, or subject to rapid swings in traffic.
1.
2.
3.
4.
5.
Application warnings and errors (500-level HTTP errors)
Application code performance
Template rendering time
Browser rendering time for the application
Database querying performance
Open source monitoring projects
• Sentry started life as a Python-only monitoring project but can now be
used for any programming language.
• Service Canary
• ping.gg (source code)
• glances (source code)
• statsd is a node.js network daemon that listens for metrics and
aggregates them for transfer into another service such as Graphite.
• Graphite stores time-series data and displays them in graphs through a
Django web application.
• Sensu is an open source monitoring framework written in Ruby but
applicable to any programming language web application.
• Graph Explorer by Vimeo is a Graphite-based dashboard with added
features and a slick design.
• Munin is a client plugin-based monitoring system that sends
monitoring traffic to the Munin node where the data can be analyzed
and visualized. Note this project is written in Perl so Perl 5 must be
installed on the node collecting the data.
• Bucky measures the performance of a web application from end user's
browsers and sends that data back to the server for collection.
424
Full Stack Python: 2020 Supporter's Edition
Hosted monitoring services
Hosted monitoring software takes away the burden of deploying and
operating the software yourself. However, hosted monitoring costs (often a
significant amount of) money and take your application's data out of your
hands so these services are not the right fit for every project.
Error Tracking
• Rollbar instruments both the server side and client side to capture and
report exceptions. The pyrollbar code library provides quick integration
for Python web applications. There are also specific instructions for
common web frameworks such as Django and Pyramid.
• Sentry is the hosted version of the open source tool that is used to
monetize and support further development.
Application Performance Monitoring (APM)
• New Relic provides application and database monitoring as well as
plug ins for capturing and analyzing data about other devleoper tools in
your stack, such as Twilio.
• Opbeat Built for django. Opbeat combines performance metrics,
release tracking, and error logging into a single simple service.
• Scout monitors the performance of Django and Flask apps, autoinstrumenting views, SQL queries, templates, and more.
Status Pages
• Status.io focuses on uptime and response metrics transparency for web
applications.
• StatusPage.io (yes, there's both a Status and StatusPage.io) provides
easy set up status pages for monitoring application up time.
Incident Management
• PagerDuty alerts a designated person or group if there are stability,
performance, or uptime issues with an application.
Monitoring resources
• How to Add Hosted Monitoring to Flask Web Applications and How to
Monitor Bottle Web Applications are a couple of posts in a series
425
Full Stack Python: 2020 Supporter's Edition
showing how to add hosted monitoring to Python web apps built with
any of the major Python web frameworks.
• Stack Overflow: How We Do Monitoring - 2018 Edition is a detailed,
long read about how Stack Overflow handles their monitoring, health
checks, alerting and dashboarding of their infrastructure.
• The Virtues of Monitoring
• Effortless Monitoring with collectd, Graphite, and Docker
• Practical Guide to StatsD/Graphite Monitoring is a detailed guide with
code examples for monitoring infrastructure.
• Bit.ly describes the "10 Things They Forgot to Monitor" beyond the
standard metrics such as disk & memory usage.
• The videos from Monitorama, a conference that's all about monitoring
and observability, are recordings of fantastic technical talks from their
events.
• Four Linux server monitoring tools
• How to design useful monitoring and graphing visualizations
• 5 years of metrics and monitoring is a great presentation highlighting
that visualization so humans can understand measurements is a hard
problem. Line graphs are often not the best solution and they are
overused.
• Keeping an eye on our network explains how GitHub uses tagged
metrics to keep better tabs on its infrastructure and network
connections.
• 10 monitoring talks every developer should watch contains a great
collection of relevant monitoring presentations.
• The Collector Highlight Series has an article on StatsD that explains
how to install it and how it works.
• OpenCensuss, OpenTracing and OpenMetrics are three projects that
are trying to create standards for application monitoring metrics. This
article on the 3 projects is a helpful overview to understand what each
one is trying to do and how they compare to each other.
426
Full Stack Python: 2020 Supporter's Edition
Monitoring learning checklist
1. Review the software-as-a-service and open source monitoring tools
below. Third party services tend to be easier to set up and host the data
for you. Open source projects give you more control but you'll need to
have additional servers ready for the monitoring.
2. My recommendation is to install New Relic's free option with the trial
period to see how it works with your app. It'll give you a good idea of
the capabilities for application-level monitoring tools.
3. As your app scales take a look at setting up one of the the open source
monitoring projects such as StatsD with Graphite. The combination of
those two projects will give you fine-grained control over the system
metrics you're collecting and visualizing.
Datadog
Datadog is a hosted monitoring service that can be used with Python web
applications to catch and report errors.
Datadog resources
• Monitoring Django performance with Datadog is a fantastic, detailed
walk through for integrating this service with Django web applications.
• Python log collection provides a framework-agnostic explanation for
how to use the service with Python code.
• Monitoring Flask apps with Datadog walks through adding Datadog to
a Flask application.
427
Full Stack Python: 2020 Supporter's Edition
Prometheus
Prometheus (source code) is an open source monitoring tool that can be used
to instrument and report on Python web applications.
Prometheus resources
• Prometheus-Basics is a newbie's introduction to this tool. It covers
what Prometheus is, the tool's architecture, types of metrics and
contains a walkthrough of how to get it configured.
• This primer on Prometheus walks through installation, configuration
and metrics collection.
• Monitoring synchronous Python web apps with Prometheus and its
asynchronous monitoring counterpart are two tutorials that show how
to add middleware to your web apps that allows Prometheus metrics
collection.
• Monitoring with Prometheus provides an overview of the tool and
explains how to combine it with Grafana to visualize the metrics that
are collected.
• Monitor your applications with Prometheus is a getting started guide
with a walkthrough of how to instrument a simple Golang application.
• Custom Application Metrics with Django, Prometheus, and Kubernetes
shows how to handle the initial configuration with
django-prometheus , deploys the Django web app using Helm and
configures Prometheus to scrape metrics from the application running
on Kubernetes.
• A gentle introduction to the wonderful world of metrics has a quick
summary that compares Prometheus with Nagios, then digs into the
logging format and what you can visualize with this tool.
428
Full Stack Python: 2020 Supporter's Edition
• From Graphite to Prometheus explains some of the differences
between using a StatsD / Graphite monitoring stack and Prometheus,
such as how Prometheus scrapes data instead of the applications
pushing data to a metrics aggregator, and the query languages for each
tool.
Rollbar
Rollbar is a hosted monitoring service that can be used with Python web
applications to catch and report errors.
Rollbar resources
• Rollbar monitoring of Celery in a Django app covers the Celery task
queue and how to discover issues that happen in asynchronous worker
processes.
• How to Add Hosted Monitoring to Flask Web Applications adds
Rollbar's error tracking to a Flask web application.
• How to Monitor Python Web Applications uses the Bottle web
application framework to show how hosted monitoring can be added to
a Python web application.
• The official Rollbar Python docs have tons of great integration
examples for various Python web frameworks.
Sentry
Sentry is an open source monitoring project as well as a service that can be
used to report errors in Python web apps.
Sentry is a monitoring service that you can set up to host yourself or used as a
service to catch and report errors in your Python web applications.
429
Full Stack Python: 2020 Supporter's Edition
Sentry resources
• Sentry's official Python docs explain how to integrate the SDK into an
application to send events to either your own server or the hosted
service.
• Sentry For Data: Error Monitoring with PySpark shows an integration
between the PySpark data analysis tool and Sentry for handling any
issues.
Web App Performance
Web application performance is affected by network latency, bandwidth,
database queries, page size and many other factors.
Performance and load testing tools
• Falco
Load testing resources
• Load Testing with Locust.io & Docker Swarm shows you how to set up
load tests using Docker containers along with AWS for scaling up the
tests.
• HTTP Load Testing with Vegeta (and a dash of Python) covers getting
started with the Vegeta load tester and uses Python to analyze the tool's
results.
• Four reasons developers should write their own load tests and four load
testing mistakes developers love to make are opinionated pieces on
430
Full Stack Python: 2020 Supporter's Edition
how developer should use load testing to ensure their applications
work properly under heavy usage.
• Building a PostgreSQL load tester explains how the pgreplay-go tool
works and how to obtain performance metrics for a PostgreSQL
database.
Web app performance resources
• Web Performance 101 introduces web application loading performance.
There is a ton of great information on JavaScript, CSS and HTTP
optimization as well as tools to use.
• Idle until urgent explains an issue the author found when measuring
First Input Delay (FID) on his site and what techniques he used to fix
the problem.
• How to measure web app performance is a 20 minute code-first demo
that shows how to get a realistic estimate for how many requests per
second your web application will be able to handle.
• How to Interpret Site Performance Tests covers the difference between
client, page and connection speed tests as well as a bit on caching
performance.
• Practical scaling techniques for websites examines how to improve
your website performance with asynchronous task queues, database
optimization and caching.
• The [Performance Testing Guidance for Web
Applications](https://docs.microsoft.com/en-us/previous-versions/
msp-n-p/bb924375(v%3dpandp.10) book from Microsoft is a gem.
There are chapters on foundations of performance testing, modeling
application usage and many other topics that are critical to working on
web app performance.
• awesome-scalability provides a list with a crazy number of scaling and
performance optimization resources and tools by category.
• Every Web Performance Test Tool provides a nice list of tools and
provides short summaries of what each one can help with in identifying
performance problems.
431
Full Stack Python: 2020 Supporter's Edition
• The Infrastructure Behind Twitter: Scale examines the evolution from
having to buy your own hardware from vendors to run a service to the
current days of being able to rely on cloud providers for some or all
workloads regardless of scale.
• Scaling to 100k users covers the architecture scaling techniques
commonly used to move up in serving users by orders of magnitude,
for example from 100 to 1000.
• Web Performance Recipes with Puppeteer digs into tracing through
page rendering to measure performance and how to extract
performance metrics from the Lighthouse tool for further analysis.
Caching
Caching can reduce the load on servers by storing the results of common
operations and serving the precomputed answers to clients.
For example, instead of retrieving data from database tables that rarely
change, you can store the values in-memory. Retrieving values from an inmemory location is far faster than retrieving them from a database (which
stores them on a persistent disk like a hard drive.) When the cached values
change the system can invalidate the cache and re-retrieve the updated values
for future requests.
A cache can be created for multiple layers of the stack.
Caching backends
• memcached is a common in-memory caching system.
• Redis is a key-value in-memory data store that can easily be configured
for caching with libraries such as django-redis-cache and the similarlynamed, but separate project django-redis.
Caching resources
• Caching at Reddit is a wonderful in-depth post that goes into detail on
how they handle caching their Python web app for billions of pageviews
each month.
432
Full Stack Python: 2020 Supporter's Edition
• "Caching: Varnish or Nginx?" reviews some considerations such as SSL
and SPDY support when choosing reverse proxy Nginx or Varnish.
• Caching is Hard, Draw me a Picture has diagrams of how web request
caching layers work. The post is relevant reading even though the
author is describing his Microsoft code as the impetus for writing the
content.
• While caching is a useful technique in many situations, it's important
to also note that there are downsides to caching that many developers
fail to take into consideration.
• Caching at Reddit covers monitoring, tuning and scaling for the very
high scale Reddit.com website.
• Mastering HTTP caching provides more advanced advice on caching
dynamic as well as static content via CDNs and other configurations.
Caching learning checklist
1. Analyze your web application for the slowest parts. It's likely there are
complex database queries that can be precomputed and stored in an inmemory data store.
2. Leverage your existing in-memory data store already used for session
data to cache the results of those complex database queries. A task
queue can often be used to precompute the results on a regular basis
and save them in the data store.
3. Incorporate a cache invalidation scheme so the precomputed results
remain accurate when served up to the user.
Logging
Logging saves output such as errors, warnings and event information to
persistent storage for debugging purposes.
Why is logging important?
Runtime exceptions that prevent code from running are important to log to
investigate and fix the source of the problems. Informational and debugging
433
Full Stack Python: 2020 Supporter's Edition
logging also helps to understand how the application is performing even if
code is working as intended.
Logging levels
Logging is often grouped into several categories:
1.
2.
3.
4.
Information
Debug
Warning
Error
Logging errors that occur while a web framework is running is crucial to
understanding how your application is performing.
Logging aggregators
When you are running your application on several servers, it is helpful to have
a monitoring tool called a "logging aggregator". You can configure your
application to forward your system and application logs to one location that
provides tools for viewing, searching, and monitoring logging events across
your cluster.
Another advantage of log aggregation tools is they allow you to set up custom
alerts and alarms so you can get notified when error rates breach a certain
threshold.
Open source log aggregators
• Sentry started as a Django-only exception handling service but now has
separate logging clients to cover almost all major languages and
frameworks. It still works really well for Python-powered web
applications and is often used in conjunction with other monitoring
tools. Raven is open source Python client for Sentry.
• Graylog2 provides a central server for log aggregation as well as a GUI
for browsing and searching through log events. There are libraries for
most major languages, including python. Saves data in Elasticache.
• Logstash Similar to Graylog2, logstash offers features to
programmatically configure log data workflows.
434
Full Stack Python: 2020 Supporter's Edition
• Scribe A project written by Facebook to aggregate logs. It's designed to
run on multiple servers and scale with the rest of your cluster. Uses the
Thrift messaging format so it can be used with any language.
Hosted logging services
• Loggly is a third party cloud based application that aggregates logs.
They have instructions for every major language, including python. It
includes email alerting on custom searches.
• Splunk offers third party cloud and self hosted solutions for event
aggregation. It excels at searching and data mining any text based data.
• Papertrail is similar to both Loggly and Splunk and provides
integration with S3 for long term storage.
• Raygun logs errors and provides immediate notification when issues
arise.
• Scalyr provides log aggregation, dashboards, alerts and search in a user
interface on top of standard logs.
• There is a hosted version of Sentry in case you do not have the time to
set up the open source project yourself.
Logging resources
• This intro to logging presents the Python logging module and how to
use it.
• Understanding Python's logging module clears up some
misconceptions about how pattern matching with logging hierarchies
works and provides a few clear diagrams to visually explain logging
handlers.
• Logging Cookbook contains useful code snippets to easily add logging
to your own applications.
• Logging in Python explains the logging module in the Python standard
library, configuration settings, handlers and how to log data.
• A Brief Digression About Logging is a short post that gets Python
logging up and running quickly.
435
Full Stack Python: 2020 Supporter's Edition
• Taking the pain out of Python logging provides a logging set up with
uWSGI.
• Good logging practice in Python shows how to use the standard library
to log data from your application. Definitely worth a read as most
applications do not log nearly enough output to help debuggin when
things go wrong, or to determine if something is going wrong.
• Structured Logging: The Best Friend You’ll Want When Things Go
Wrong explains how Grab (a large ridesharing platform in Asia) uses
structured logging to reduce Mean Time To Resolve (MTTR) and how
they arranged their log levels to support their scale.
• How to collect, customize, and centralize Python logs covers the
standard library logging module and how to configure it for various
ways you are likely to use it with one or more Python applications.
• A guide to logging in Python has some clear, simple diagrams to
explain how logging works in Python applications.
• Django Logging, the Right Way covers a few Python logging techniques
and then goes into how you use them with your Django projects.
• Django's 1.3 release brought unified logging into project configurations.
This post shows how to set up logging in a project's settings.py file.
Caktus Group also has a nice tutorial on central logging with graypy
and Graylog2.
• Django Logging Configuration: How the Default Settings Interfere with
Yours explains a problem with the default Django logging configuration
and what to do about in your project.
• The Pythonic Guide To Logging provides a quick introduction to log
levels in Python code.
• Exceptional Logging of Exceptions in Python shows how to log errors
more accurately to pinpoint the problem instead of receiving generic
exceptions in your logs.
• Python and Django logging in plain English is a wonderful overview of
how to configure logging in Django projects.
436
Full Stack Python: 2020 Supporter's Edition
Logging learning checklist
1. Read how to integrate logging into your web application framework.
2. Ensure errors and anomalous results are logged. While these logs can
be stored in monitoring solutions, it's best to have your own log storage
location to debug issues as they arise to complement other monitoring
systems.
3. Integrate logging for system events you may need to use for debugging
purposes later. For example, you may want to know the return values
on functions when they are above a certain threshold.
Web Analytics
Web analytics involves collecting, processing, visualizing web data to enable
critical thinking about how users interact with a web application.
Why is web analytics important?
User clients, especially web browsers, generate significant data while users
read and interact with webpages. The data provides insight into how visitors
use the site and why they stay or leave. The key concept to analytics is
learning about your users so you can improve your web application to better
suit their needs.
Web analytics concepts
It's easy to get overwhelmed at both the number of analytics services and the
numerous types of data points collected. Focus on just a handful of metrics
when you're just starting out. As your application scales and you understand
more about your users add additional analytics services to gain further insight
into their behavior with advanced visualizations such as heatmaps and action
funnels.
User funnels
If your application is selling a product or service you can ultimately build a
user funnel (often called "sales funnel" prior to a user becoming a customer)
to better understand why people buy or don't buy what you're selling. With a
437
Full Stack Python: 2020 Supporter's Edition
funnel you can visualize drop-off points where visitors leave your application
before taking some action, such as purchasing your service.
Open source web analytics projects
• Piwik is a web analytics platform you can host yourself. Piwik is a solid
choice if you cannot use Google Analytics or want to customize your
own web analytics platform.
• Shynet is a lightweight, privacy-friendly cookie-free web analytics
application written in Python.
• Open Web Analytics is another self-hosted platform that integrates
through a JavaScript snippet that tracks users' interactions with the
webpage.
Hosted web analytics services
• Google Analytics is a widely used free analytics tool for website traffic.
• Clicky provides real-time analytics comparable to Google Analytics'
real-time dashboard.
• MixPanel's analytics platform focuses on mobile and sales funnel
metrics. A developer builds what data points need to be collected into
the server side or client side code. MixPanel captures that data and
provides metrics and visualizations based on the data.
• Heap is a recently founded analytics service with a free introductory
tier to get started.
• CrazyEgg is tool for understanding a user's focus while using a website
based on heatmaps generated from mouse movements.
Python-specific web analytics resources
• Building an Analytics App with Flask is a detailed walkthrough for
collecting and analyzing webpage analytics with your own Flask app.
• Build a Google Analytics Slack Bot with Python explains how to
connect the Google Analytics API to a Slack bot, with all the code in
438
Full Stack Python: 2020 Supporter's Edition
Python, so you can query for Google Analytics data from your Slack
channels.
• Automating web analytics through Python is a tutorial for interacting
with your Google Analytics data using pandas and related data analysis
tools.
• The official Google Analytics Python quickstart isn't really the easiest
tutorial to follow due to all of the configuration required to make your
first API call, but it is still the right place to go to get started.
• How Accurately Can Prophet Project Website Traffic? uses the data
forecasting tool Prophet to see if it is possible to predict future trends
in website traffic based on historical data.
General web analytics resources
• The Google Analytics Setup I Use on Every Site I Build is a tutorial
written for developers to better understand the scope of what Google
Analytics can tell you about your site and how to configure it for better
output.
• Roll your own analytics shows you how to use AWS Lambda and some
custom JavaScript to create your own replacement for Google
Analytics. This route is not for everyone but it is really useful if you
want to avoid the Google data trap.
• This beginner's guide to math and stats behind web analytics provides
some context for understanding and reasoning about web traffic.
• An Analytics Primer for Developers by Mozilla explains what to track,
choosing an analytics platform and how to serve up the analytics
JavaScript asynchronously.
• Options for Hosting Your Own Non-JavaScript-Based Analytics has a
few non-Google Analytics web analytics tools that mostly rely on
server-side rather than client-side tracking.
• This post provides context for determining if a given metric is "vanity"
or actionable.
439
Full Stack Python: 2020 Supporter's Edition
• This series on measuring your technical content has a bunch of advice
for figuring out why you want to gather metrics, how to do the
instrumentation and determining your success factors.
◦ Part 1 covers "why"
◦ Part 2 examines success factors
◦ Part 3 digs further into measurement
• awesome-analytics aggregates analytics tools for both web and mobile
applications.
• 10 red flags signaling your analytics program will fail is a more
business-focused piece but it has sosme good information and
visualization on broader themes that developers who work in larger
organizations should think about when it comes to analytics.
Web analytics learning checklist
1. Add Google Analytics or Piwik to your application. Both are free and
while Piwik is not as powerful as Google Analytics you can self-host the
application which is the only option in many environments.
2. Think critically about the factors that will make your application
successful. These factors will vary based on whether it's an internal
enterprise app, an e-commerce site or an information-based
application.
3. Add metrics generated from your web traffic based on the factors that
drive your application's success. You can add these metrics with either
some custom code or with a hosted web analytics service.
4. Continuously reevaluate whether the metrics you've chosen are still the
appropriate ones defining your application's success. Improve and
refine the metrics generated by the web analytics as necessary.
440
Full Stack Python: 2020 Supporter's Edition
What "Full Stack" Means
The terms "full stack" and "Full Stack Python" are ambiguous but I am using a
specific definition here on this site. These term can be defined for a web stack
either to mean
1. Every layer, from the machine code up to the browser, are written in
Python
2. Python code interacts with code written in other languages such as C
and JavaScript to provide a complete web stack
I named this site specifically for the second definition: Python is one
programming language among many that are used to build your application
stack.
Some folks took the title of the site to mean Python runs everything from the
web browser on down. That's simply not practical or possible. While Python is
an amazing programming language, there are many tasks it does not do well.
Python is just one language among many that allows our computers to execute
software and communicate with each other.
For beginners, learning the syntax and libraries in Python necessary to build a
web application or web API is a major undertaking. Even intermediate and
advanced Python software developers need to constantly program and learn
to keep up with our ever evolving ecosystem. I created Full Stack Python to be
just one of many resources that help Python developers build and maintain
their programming skills.
About the Author
Full Stack Python is coded and written by Matt Makai. Matt currently works
in Washington, D.C. for the Twilio Developer Network as the Director of
Developer Content.
Other projects by Matt include The Full Stack Python Guide to Deployments,
Python for Entrepreneurs, Introduction to Ansible and Coding Across
America.
442
Full Stack Python: 2020 Supporter's Edition
You can reach him by email at matthew.makai@gmail.com. Matt can't
respond to every email, but he will do his best to reply when possible.
443
Download