Python
Python is an interpreted language that looks a lot like pseudocode.
Syntax
The formal definition, as a BNF grammar taken from the official
documentation, is the following:
compound_stmt ::= if_stmt
                  | while_stmt
                  | for_stmt
                  | try_stmt
                  | with_stmt
                  | match_stmt
                  | funcdef
                  | classdef
                  | async_with_stmt
                  | async_for_stmt
                  | async_funcdef
suite ::= stmt_list NEWLINE | NEWLINE INDENT statement+ DEDENT
statement ::= stmt_list NEWLINE | compound_stmt
stmt_list ::= simple_stmt (";" simple_stmt)* [";"]
In practice, a compound statement consists of a header that starts with a specific keyword and ends with a colon, followed either by simple statements on the same line, separated by semicolons, or by a list of statements indented with respect to that header. Both of these forms are a suite.
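As a small illustration of the two forms of suite described above, here is a minimal sketch; the function names are made up for this example.

```python
def sign_inline(x):
    # header followed, on the same line, by simple statements separated by ";"
    if x < 0: label = "negative"; value = -1
    else: label = "non-negative"; value = 1
    return label, value


def sign_indented(x):
    # header followed by NEWLINE INDENT statement+ DEDENT
    if x < 0:
        label = "negative"
        value = -1
    else:
        label = "non-negative"
        value = 1
    return label, value
```

Both functions behave the same; the indented form is the one normally used, since the one-line form gets hard to read quickly.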
Data Types
Name | Description |
---|---|
None | particular type used as a null; None is the only value of its type |
numbers.Number | representation of numerical entities |
See https://docs.python.org/3/library/stdtypes.html for the built-in types.
Container here means that the type is not limited to a single data type but can hold mixed types together. The opposite of container is flat.
Hashable: an object that has a hash value which never changes during its lifetime.
Name | Description | Mutable | Container |
---|---|---|---|
int, float and complex | numerical types (int has unlimited precision) | ❌ | ❌ |
range | sequence of numbers | ❌ | ❌ |
tuple | built-in sequence | ❌ | ✅ |
list | built-in sequence | ✅ | ✅ |
str | text sequence | ❌ | ❌ |
bytes | bytes sequence | ❌ | ❌ |
bytearray | bytes sequence | ✅ | ❌ |
set | unordered collection of hashable objects | ✅ | ✅ |
frozenset | unordered collection of hashable objects that is itself hashable | ❌ | ✅ |
dict | mapping from hashable objects to arbitrary objects | ✅ | ✅ |
memoryview | view to internal data of objects | ❓ | ❌ |
From a practical point of view, the time complexity of the operations on these data types is listed at https://wiki.python.org/moin/TimeComplexity.
Sequences and containers can support iteration through the iterator
protocol: in practice you tell the external world that your object supports
iteration via the __iter__() method, which returns the actual iterator.
The iterator must implement the __next__() method, which returns the next
element in the sequence. When the sequence has no more elements, this method
must raise StopIteration.
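The protocol described above can be sketched as follows. Countdown and CountdownIterator are hypothetical names used only for this example.

```python
class Countdown:
    """Iterable that counts down from `start` to 1 (hypothetical example)."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # hand out a fresh iterator each time iteration starts
        return CountdownIterator(self.start)


class CountdownIterator:
    def __init__(self, current):
        self.current = current

    def __iter__(self):
        # iterators are themselves iterable and return themselves
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


numbers = list(Countdown(3))
```

Keeping the iterable and the iterator in separate classes lets the same Countdown object be iterated more than once.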
Related to this is the generator type: roughly speaking, a function that
uses the yield keyword to build an iterator. Keep in mind that a generator
has other methods besides the ones from the iterator protocol, like send(),
throw() and close().
For example, adapted from the documentation
>>> def echo(value=None):
...     print("Execution starts when 'next()' is called for the first time.")
...     try:
...         while True:
...             try:
...                 value = (yield value)
...             except Exception as e:
...                 value = e
...     finally:
...         print("Don't forget to clean up when 'close()' is called.")
...
>>> generator = echo(1)
>>> print(next(generator))
Execution starts when 'next()' is called for the first time.
1
>>> print(next(generator))
None
>>> print(generator.send(2))
2
>>> generator.throw(TypeError("spam"))
TypeError('spam')
>>> generator.close()
Don't forget to clean up when 'close()' is called.
Generator expressions also exist
>>> sum(i*i for i in range(10)) # sum of squares 0, 1, 4, ... 81
285
Dictionary
One thing to keep in mind when working with dictionaries is that the objects returned by dict.keys(), dict.values() and dict.items() are view objects. They provide a dynamic view on the dictionary’s entries, which means that when the dictionary changes, the view reflects these changes.
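A minimal sketch of this behaviour; the dictionary content is made up for the example.

```python
inventory = {"apples": 3}
keys = inventory.keys()      # a view, not a copy
before = list(keys)          # snapshot of what the view shows now

inventory["pears"] = 5       # mutate the dictionary afterwards
after = list(keys)           # the same view object now sees the new key
```

If a stable snapshot is needed, copy the view explicitly, for example with list(inventory.keys()).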
Objects
Classes are not "subclasses" of type but instances of it.
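The difference can be checked directly in a short sketch; Point and Dynamic are hypothetical names.

```python
class Point:
    pass


is_instance = isinstance(Point, type)   # the class object is an instance of type
is_subclass = issubclass(Point, type)   # but Point does not inherit from type
metaclass = type(Point)                 # the class of a class is its metaclass

# type can also build a class dynamically: type(name, bases, namespace)
Dynamic = type("Dynamic", (Point,), {"answer": 42})
```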
Name | Description |
---|---|
__new__() | |
__init__() | |
__del__() | |
__hash__() | |
__str__() | |
__repr__() | |
__bytes__() | Called by bytes to compute a byte-string representation of an object. This should return a bytes object. |
__format__() | Called by the format() built-in function, and by extension, evaluation of formatted string literals and the str.format() method, to produce a “formatted” string representation of an object. |
__weakref__ | |
__slots__ | |
__copy__() | Used to define the implementation of a copy used by the copy module |
Keep in mind that there are two conventions for internal attributes of an object
- if the name starts with _ it is considered "internal"
- if the name starts with __ it is considered "private", and the interpreter also mangles the name so that __<name> becomes _<class name>__<name>
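A short sketch of both conventions; the Account class is hypothetical.

```python
class Account:
    def __init__(self):
        self._internal = "by convention only"   # nothing is enforced
        self.__secret = "mangled"               # stored as _Account__secret


account = Account()
attributes = sorted(vars(account))
```

The mangled attribute is still reachable as account._Account__secret, so mangling mainly avoids name clashes in subclasses rather than providing real privacy.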
Structural pattern matching
Introduced in Python 3.10 via PEP 634 (specification), PEP 635 (motivation and rationale) and PEP 636 (tutorial)
match <expression>:
    case <pattern> [guard]:
        <block>
where <expression> is any Python expression whose value is matched against
the <pattern> and, optionally, must also "pass" the guard expression.
The simplest match is "literal" matching, where you try to match a constant value; a more complex pattern is one that causes name bindings. When you use an identifier as a pattern, on a successful match the value is bound to that name, which can then be used in the subsequent block (match does not create a new scope, so the name remains bound afterwards).
Note: if you want to use a value coming from an attribute, to avoid the name binding you need to use a qualified name (an unqualified name is a name without dots).
Note: there is a difference between (<pattern>) and [<pattern>] or
(<pattern>,). The first is a group pattern, the others are sequence
patterns.
Here are some practical examples
match op.opcode, *args:
    case Operatore.Store, Constant() as offset, Click(offset=(x, y)):
        ...
Operatore.Store is matched as a value (a qualified name is compared by
equality), Constant() as offset matches an instance of type Constant and
binds it to the name offset; the last one looks for an element of type Click
that has a sequence of two elements associated with the attribute offset and
binds these two elements to the names x and y.
Coroutines
Introduced with PEP 492; the syntax, as given in the official documentation, is
async_funcdef ::= [decorators] "async" "def" funcname "(" [parameter_list] ")"
                  ["->" expression] ":" suite
async_for_stmt ::= "async" for_stmt
async_with_stmt ::= "async" with_stmt
In the following code
async def read_data(db):
    data = await db.fetch('SELECT ...')
await suspends execution like yield from; it accepts only an "awaitable"
(it raises a TypeError otherwise).
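A runnable sketch of the same idea, using asyncio from the standard library. FakeDB is a hypothetical stand-in for a real database client, and the query string is arbitrary.

```python
import asyncio


class FakeDB:
    """Hypothetical stand-in for a database client."""

    async def fetch(self, query):
        await asyncio.sleep(0)          # yield control to the event loop
        return [("row", query)]


async def read_data(db):
    data = await db.fetch("SELECT 1")   # suspends until fetch() completes
    return data


async def await_non_awaitable():
    try:
        await 42                        # an int is not awaitable
    except TypeError as exc:
        return type(exc).__name__


result = asyncio.run(read_data(FakeDB()))
error_name = asyncio.run(await_non_awaitable())
```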
Links
- http://mirnazim.org/writings/python-ecosystem-introduction/
- WTF Python Exploring and understanding Python through surprising snippets
- Code Like a Pythonista: Idiomatic Python
- http://effbot.org/zone/python-with-statement.htm
- http://www.python.org/dev/peps/
- http://www.mindviewinc.com/Books/Python3Patterns/Index.php
- http://stackoverflow.com/questions/986006/python-how-do-i-pass-a-variable-by-reference
- http://agiliq.com/blog/2012/06/understanding-args-and-kwargs/
- http://pythonbooks.revolunet.com/
- http://farmdev.com/talks/unicode/
# -*- coding: utf-8 -*-
- What's main.py
- http://blog.amir.rachum.com/post/39501813266/python-the-dictionary-playbook
- https://developers.google.com/edu/python/
- Vulnerability in session cookie http://vudang.com/2013/01/python-web-framework-from-lfr-to-rce/
- http://blog.amir.rachum.com/post/54770419679/python-common-newbie-mistakes-part-1
- http://blog.amir.rachum.com/post/55024295793/python-common-newbie-mistakes-part-2
- http://pypix.com/tools-and-tips/advanced-regular-expression-tips-techniques/
- http://sebastianraschka.com/Articles/2014_python_performance_tweaks.html
- Parallelism in one line
- http://www.fullstackpython.com/
- Writing Python 2-3 compatible code
- https://github.com/faif/python-patterns
- Decorator cheat sheet
- https://wiki.python.org/moin/TimeComplexity
- Writing sustainable Python scripts
- Why print became a function in python3
- Python 101: iterators, generators, coroutines
- How to Bootstrap a Python Project
- A Practical Guide to Using Setup.py
- Massive memory overhead: Numbers in Python and how NumPy helps
- Reproducible Python Bytecode
- pythontutor
- Less copies in Python with the buffer protocol and memoryviews
- Advanced Python: Achieving High Performance with Code Generation
Packaging
- python-packaging.readthedocs.io
- http://www.scotttorborg.com/python-packaging/index.html
- http://nvie.com/posts/pin-your-packages/
- http://tech.marksblogg.com/better-python-package-management.html
- Value error Attempted relative import in non-package
Typing
For Python 3.7+, you can indicate that a method takes or returns an instance of the enclosing class
from __future__ import annotations
class Position:
    def __add__(self, other: Position) -> Position:
        ...
- typing documentation
- PEP 483 - The Theory of Type Hints
- PEP 585 – Type Hinting Generics In Standard Collections
- PEP 612 – Parameter Specification Variables
- mypy is an optional static type checker for Python that aims to combine the benefits of dynamic (or "duck") typing and static typing
- python/mypy Optional static typing for Python 3 and 2 (PEP 484)
- Python: better typed than you think mypy assisted error handling, exception mechanisms in other languages, fun with pattern matching and type variance
- Python Typing: Resisting the Any type
- Professional-grade mypy configuration
- Using Mypy in production at Spring
Internals
- http://www.jeffknupp.com/blog/2013/02/14/drastically-improve-your-python-understanding-pythons-execution-model/
- Escaping a sandbox using magic of python
- http://stackoverflow.com/questions/878943/why-return-notimplmented-instead-of-raising-notimplementederror
- Using Cython to speed up
- http://late.am/post/2012/03/26/exploring-python-code-objects
- We are all consenting adult here
- Understanding internals of Python classes
- Python Descriptors Demystified
- Python Attribute Access and the Descriptor Protocol
- How python implements long integers?
- open and CPython
- Python behind the scenes #7: how Python attributes work
- When Python can’t thread: a deep-dive into the GIL’s impact
Metaclasses and introspection
- What's a metaclasse by stackoverflow
- http://www.slideshare.net/hychen/what-can-meta-class-do-for-you-pycon-taiwan-2012
- http://www.slideshare.net/gwiener/metaclasses-in-python
- Python’s “Disappointing” Superpowers
TESTS
- My Python testing style guide
- https://www.youtube.com/watch?v=wWu_iRuBjKs
- https://www.integralist.co.uk/posts/toxini/
- moto, a library that allows you to easily mock out tests based on AWS infrastructure.
- Testing Python Applications with Pytest
- Assertion rewriting in Pytest part 4: The implementation
pytest
import sys

import pytest

# XBinarySearchTree is the class under test, defined elsewhere
@pytest.mark.parametrize('count', [
    0, 1, 6, 17,
])
def test_tree42(count):
    values = list(range(count))
    bt = XBinarySearchTree.from_array(values)
    assert list(bt.inorder_traversal()) == values

def test_myoutput(capsys):  # or use "capfd" for fd-level
    print("hello")
    sys.stderr.write("world\n")
    captured = capsys.readouterr()
    assert captured.out == "hello\n"
    assert captured.err == "world\n"
    print("next")
    captured = capsys.readouterr()
    assert captured.out == "next\n"

@pytest.mark.skip(reason="no way of currently testing this")
def test_the_unknown():
    ...
BEST PRACTICES
- PEP8: Style Guide for Python Code
- Design pattern in python
- dict() vs {} (hint: {} is better)
- http://excess.org/article/2011/12/unfortunate-python/
- http://www.canonical.org/~kragen/isinstance/
- http://www.artima.com/weblogs/viewpost.jsp?thread=236278
- http://satyajit.ranjeev.in/2012/05/17/python-a-few-things-to-remember.html
- http://net.tutsplus.com/tutorials/python-tutorials/behavior-driven-development-in-python/
- Things you didn't know about Python: interesting presentation about Python internal and stuff.
- Copying list, the right way
- Make one archive python executable
- HOWTO Create Python GUIs using HTML
- Slides about functional versus imperative programming
- MRO: from official documentation and a post about multiple inheritance (look at also the comments)
- http://ozkatz.github.com/improving-your-python-productivity.html
- http://ozkatz.github.com/better-python-apis.html
- Lazy evaluation
- https://speakerdeck.com/rwarren/a-brief-intro-to-profiling-in-python
- http://pyvideo.org/video/1674/getting-started-with-automated-testing
- http://hynek.me/talks/python-deployments/
- http://pyrandom.blogspot.nl/2013/04/super-wrong.html
- Python’s super() considered super!
- http://www.huyng.com/posts/python-performance-analysis/
- https://tommikaikkonen.github.io/timezones/
- format()
- pyformat.info/
Multithreading&Multiprocessing
Exceptions
From the official documentation
try_stmt ::= try1_stmt | try2_stmt | try3_stmt
try1_stmt ::= "try" ":" suite
              ("except" [expression ["as" identifier]] ":" suite)+
              ["else" ":" suite]
              ["finally" ":" suite]
try2_stmt ::= "try" ":" suite
              ("except" "*" expression ["as" identifier] ":" suite)+
              ["else" ":" suite]
              ["finally" ":" suite]
try3_stmt ::= "try" ":" suite
              "finally" ":" suite
The optional else clause is executed if the control flow leaves the try
suite, no exception was raised, and no return, continue, or break statement
was executed. Exceptions in the else clause are not handled by the preceding
except clauses.
The finally clause is always executed, even when the try suite contains a
return, break or continue; and since the last return executed is the one that
counts in a function, a return in the finally clause supersedes any
previously encountered one.
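Both rules can be observed with a small sketch; the function names are made up for the example.

```python
def finally_wins():
    try:
        return "from try"
    finally:
        # runs even though the try suite already hit a return,
        # and this return replaces the pending one
        return "from finally"


def clauses_executed(value):
    steps = []
    try:
        int(value)
    except ValueError:
        steps.append("except")
    else:
        steps.append("else")      # only when no exception was raised
    finally:
        steps.append("finally")   # always
    return steps
```

Because a return inside finally silently discards the earlier return value (and any pending exception), it is usually better avoided outside of demonstrations like this one.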
- Exception in python make code clearer, see also this.
- https://julien.danjou.info/blog/2015/python-retrying
- Reraising exception
LIBRARIES
- https://github.com/kennethreitz/envoy
- https://github.com/kennethreitz/requests
- http://www.nicosphere.net/clint-command-line-library-for-python/
- Docopts command line arguments parser for Human Beings.
- Get started with the Natural Language Toolkit
- pdb++, a drop-in replacement for pdb (the Python debugger)
- napari/napari a fast, interactive, multi-dimensional image viewer for python
- pydantic Data validation and settings management using python type annotations.
- pySDR
Scientific
Numpy
Matplotlib
Scipy
Pandas
Interesting Stuffs
- https://jordan-wright.github.io/blog/2014/10/06/creating-tor-hidden-services-with-python/
SANDBOX
- http://wiki.python.org/moin/Asking%20for%20Help/How%20can%20I%20run%20an%20untrusted%20Python%20script%20safely%20%28i.e.%20Sandbox%29
- Example of pypy-c-sandbox for launching random scripts
- http://stackoverflow.com/questions/6655258/using-the-socket-module-in-sandboxed-pypy
- http://pypy.readthedocs.org/en/latest/sandbox.html
- http://blog.delroth.net/2013/03/escaping-a-python-sandbox-ndh-2013-quals-writeup/
- Python "sandbox" escape
Instructions for pypy-2.1
$ cd pypy/goal
$ python ../../rpython/bin/rpython -O2 --sandbox targetpypystandalone.py
$ PYTHONPATH=$PYTHONPATH:$PWD/../../ ../..//pypy/sandbox/pypy_interact.py pypy-c
DEBUG&Profiling
- Performance analysis
- CProfile
- https://stripe.com/blog/exploring-python-using-gdb
- scalene is a high-performance CPU and memory profiler for Python that does a number of things that other Python profilers do not and cannot do
IDE
- http://blog.dispatched.ch/2009/05/24/vim-as-python-ide/
$ python -m shlex
kdkd
Token: 'kdkd'
34 5455
Token: '34'
Token: '5455'
$edx=34
Token: '$'
Token: 'edx'
Token: '='
Token: '34'
Time
- http://www.saltycrane.com/blog/2008/11/python-datetime-time-conversions/
- http://stackoverflow.com/questions/2775864/python-datetime-to-unix-timestamp
COOKBOOK
>>> a = [1,4,-1,0,13]
>>> a.sort()
>>> a
[-1, 0, 1, 4, 13]
>>> import operator
>>> x = {1: 2, 3: 4, 4:3, 2:1, 0:0}
>>> sorted_x = sorted(x.items(), key=operator.itemgetter(1))
Two's complement
>>> import ctypes
>>> value = 0xb59395a9
>>> f"{ctypes.c_uint32(value).value:032b}"
'10110101100100111001010110101001'
>>> f"{ctypes.c_uint32(~value).value:032b}"
'01001010011011000110101001010110'
Getopt
import getopt, sys
def main():
    try:
        opts, args = getopt.getopt(sys.argv[1:], "ho:v", ["help", "output="])
    except getopt.GetoptError as err:
        # print help information and exit:
        print(err)  # will print something like "option -a not recognized"
        usage()
        sys.exit(2)
    output = None
    verbose = False
    for o, a in opts:
        if o == "-v":
            verbose = True
        elif o in ("-h", "--help"):
            usage()
            sys.exit()
        elif o in ("-o", "--output"):
            output = a
        else:
            assert False, "unhandled option"
    # ...
if __name__ == "__main__":
    main()
argparse
import argparse
import functools

def argparse_vendor_product(value):
    # expects "vendor:product", both in hexadecimal
    vendor, product = value.split(":")
    return int(vendor, 16), int(product, 16)

def parse_args():
    args = argparse.ArgumentParser(description='upload and run some code')
    args.add_argument(
        '--device',
        type=argparse_vendor_product,
        required=True,
        help="vendor:product of the device you want to interact with")
    args.add_argument('--binary', required=True)
    # base=0 lets int() accept decimal as well as 0x..., 0o... and 0b... literals
    args.add_argument('--address', type=functools.partial(int, base=0))
    return args.parse_args()
PySerial
import serial
ser = serial.Serial('/dev/ttyUSB0') # open serial port
print(ser.name) # check which port was really used
ser.write(b'hello') # write a string
ser.close()
Decorator
def trace(f):
    def _inner(*args, **kwargs):
        print(' # ', f.__name__)
        return f(*args, **kwargs)
    return _inner

def challenge(count):
    def _challenge(x):
        def _inner(*args, **kwargs):
            print('[+] challenge %d' % count)
            return x(*args, **kwargs)
        return _inner
    return _challenge
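As a usage sketch for a parametrized decorator like challenge: the version below adds functools.wraps, which is not in the original, so that the decorated function keeps its name and docstring. The solve function is hypothetical.

```python
import functools


def challenge(count):
    def _challenge(func):
        @functools.wraps(func)          # keep func.__name__ and its docstring
        def _inner(*args, **kwargs):
            print('[+] challenge %d' % count)
            return func(*args, **kwargs)
        return _inner
    return _challenge


@challenge(3)
def solve(x):
    """Hypothetical function used only for this example."""
    return x * 2


answer = solve(21)      # prints the banner, then calls the original function
```

Writing @challenge(3) is equivalent to solve = challenge(3)(solve): the outer call receives the parameter, the inner one receives the function.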
DOCTESTS
def XOR(a, b):
    # assumed helper, not shown in the original: element-wise XOR of two bit lists
    return [x ^ y for x, y in zip(a, b)]

def decript(cipher, key):
    """
    >>> a = [0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1]
    >>> b = [1, 1, 1, 1, 1, 1, 1]
    >>> decript(a, b) #doctest: +NORMALIZE_WHITESPACE
    [[1, 0, 1, 0, 1, 0, 1],
     [0, 0, 0, 0, 0, 0, 0]]
    """
    r = []
    for i in range(0, len(cipher) - len(key) + 1, 7):
        r.append(XOR(cipher[i:i + len(key)], key))
    return r
$ python -m doctest c1.py
SPHINX
It's possible to write the documentation along with the code.
http://sphinx.pocoo.org/markup/toctree.html#toctree-directive
Maximum float
>>> infinity = float("inf")
>>> infinity
inf
>>> infinity / 10000
inf
Print out some docstring for documentation purpose
python -c 'from macro import matrixify;print(matrixify.__doc__.replace("\n ", "\n"))' | rst2html
Logging
- Documentation
- http://victorlin.me/2012/08/good-logging-practice-in-python/
- http://hynek.me/articles/taking-some-pain-out-of-python-logging/
- Multi line formatting
- http://victorlin.me/posts/2012/08/26/good-logging-practice-in-python
Remember that logging.basicConfig() attaches a stream handler to the root
logger by default; if you want to fine-tune the logging you have to configure
the handlers yourself.
import logging
import os
logging.basicConfig()
logger = logging.getLogger(__name__)
logger.setLevel(os.getenv("LOG") or "INFO")
It's possible to define a custom level like SUBDEBUG
(http://stackoverflow.com/a/16955098/1935366)
import logging
SUBDEBUG = 5
logging.addLevelName(SUBDEBUG, 'SUBDEBUG')
def subdebug(self, message, *args, **kws):
    self.log(SUBDEBUG, message, *args, **kws)
logging.Logger.subdebug = subdebug
logging.basicConfig()
l = logging.getLogger()
l.setLevel(SUBDEBUG)
l.subdebug('test')
l.setLevel(logging.DEBUG)
l.subdebug('test')
stream = logging.StreamHandler()
formatter = logging.Formatter('%(levelname)s - %(filename)s:%(lineno)d - %(message)s')
logger = logging.getLogger(__file__)
logger.setLevel(logging.DEBUG)
logger.addHandler(stream)
stream.setFormatter(formatter)
If you don't want your logging strings to impact performance when the level
is not enabled, you should let the logger itself do the formatting: the
various logging functions accept a format string in the % style and a list of
positional arguments, like
logger.debug("this is a string: '%s'", string_to_log)
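The effect of deferring the formatting can be made visible with a small sketch. Expensive is a hypothetical class that records whether its __str__() was ever called, and the logger name is arbitrary.

```python
import logging


class Expensive:
    """Hypothetical object that records whether it was ever formatted."""

    def __init__(self):
        self.formatted = False

    def __str__(self):
        self.formatted = True
        return "expensive representation"


logger = logging.getLogger("lazy-demo")
logger.setLevel(logging.INFO)           # DEBUG records are discarded

lazy = Expensive()
logger.debug("value: %s", lazy)         # formatting is deferred and never happens

eager = Expensive()
logger.debug("value: %s" % eager)       # string built before debug() is even called
```

After running this, lazy.formatted is still False while eager.formatted is True, even though neither message was emitted.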
Flatten list
>>> import itertools
>>> chain = itertools.chain.from_iterable([[1,2],[3],[5,89],[],[6]])
>>> print(list(chain))
[1, 2, 3, 5, 89, 6]
# instead of looping explicitly
for x in s:
    if x:
        return True
return False
# use the built-in any()
return any(s)
Traceback
import sys
import traceback

try:
    _manage_object(pk, *args, **kwargs)
except:
    obj = Object.objects.get(pk=pk)
    # get the exception context to reuse later
    exc_info = sys.exc_info()
    # print_tb() writes the traceback itself (to stderr by default) and returns None
    traceback.print_tb(exc_info[2])
Read/write UTF8 files
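In Python 2 the built-in open() has no encoding parameter, so codecs.open()
is used to read and write UTF-8 files (in Python 3 the built-in open()
accepts an encoding argument directly)
import codecs
def create_post(filepath, content):
    with codecs.open(filepath, 'w+', encoding='utf-8') as f:
        f.write(content)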
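A minimal sketch of the same helper, assuming Python 3, where the built-in open() accepts an encoding argument. read_post and the temporary file name are additions made only for this example.

```python
import os
import tempfile


def create_post(filepath, content):
    # Python 3: the built-in open() takes the encoding directly
    with open(filepath, 'w', encoding='utf-8') as f:
        f.write(content)


def read_post(filepath):
    with open(filepath, encoding='utf-8') as f:
        return f.read()


# round trip through a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'post.txt')
    create_post(path, 'caffè ☕')
    round_trip = read_post(path)
```

Passing the encoding explicitly avoids depending on the platform default.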
Get the n-th item of each element of a nested list
>>> from operator import itemgetter
>>> rows = [(1, 2), (3, 4), (5, 6)]
>>> list(map(itemgetter(1), rows))
[2, 4, 6]
Extract URL from string
import re
myString = "This is my tweet check it out http://tinyurl.com/blah"
print(re.search(r"(?P<url>https?://[^\s]+)", myString).group("url"))
Routing from REGEXs
In [1]: import re
In [2]: c = re.compile(r'^w::(?P<type>\w+)::(?P<id>\d*)::')
In [3]: s = 'w::w::1::'
In [5]: m = c.match(s)
In [6]: m.groupdict()
Out[6]: {'id': '1', 'type': 'w'}
Add file into a tarfile from a string
import io
import tarfile

def elaborate_archive(filepath, **kwargs):
    # tarfile needs a binary file object and the size in bytes
    data = kwargs['version'].encode('utf-8')
    version_tarinfo = tarfile.TarInfo('VERSION')
    version_tarinfo.size = len(data)
    with tarfile.open(filepath, mode='a') as tar_src:
        tar_src.addfile(version_tarinfo, io.BytesIO(data))
pandas
$ pip install pandas
import pandas as pd
You can read data from a CSV
df = pd.read_csv("/path/to/data")
or create one manually
df = pd.DataFrame({
    "column 1": [data1, data2, ..., dataN],
    "column 2": [...],
    ...
})
To have general information about the DataFrame
df.info()
A nice feature is the filtering
df[(df.duration >= 200) & (df.genre == "Drama")]
It's possible to plot directly
df.plot(x='GE', y=['TOTALE_19', 'TOTALE_20'], figsize=(20, 10))