`print` and logging in your test

pytest captures stdout, stderr, and log output by default. To show logs, you need to pass `-o log_cli=true`, like below.

Source: stackoverflow
```bash
PYTHONPATH=. pytest -o log_cli=true test/files/test_files.py::test_passwd_to_dict
```
`assert`

Using list comprehensions in assertions. Reference: https://edricteo.com/list-comprehension-addiction/
```python
import pytest


# create_values is the function under test (defined elsewhere)
@pytest.mark.parametrize(
    "inputs, expected",
    [
        (
            ("Chocolate", "Vanilla", "Strawberry"),
            ["Chocolate", "Vanilla", "Strawberry"],
        )
    ],
)
def test_create_scoops_with_different_iterables(inputs, expected):
    assert all([value in expected for value in create_values(inputs)])
```
Example of how to mock a file.
```python
# assumes `from unittest import mock` (or `import mock`) and `from io import StringIO`
def test_final_line():
    mock_open = mock.mock_open(read_data='a\nab\nabc\nabcd')
    with mock.patch("builtins.open", mock_open) as m:
        result = final_line('file_path')
        assert result == 'abcd'


# or
def test_multi_columns_multi_rows():
    fake_tsv = StringIO('1\n'
                        '1\t2\n'
                        '1\t2\t3\n'
                        '1\t2\t3\t4')
    with mock.patch("builtins.open", return_value=fake_tsv):
        assert sum_multi_columns('file_path') == 32
```
You can also use `StringIO`:

```python
def test_passwd_to_dict():
    fake_passwd = StringIO(
        '###############\n'
        '# User Database\n'
        '###############\n'
        ' \n'
        'nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false\n'
        'root:*:0:0:System Administrator:/var/root:/bin/sh\n'
        'funnyhaha.org\n'
        'daemon:*:1:1:System Services:/var/root:/usr/bin/false\n')
    with mock.patch("builtins.open", return_value=fake_passwd):
        assert passwd_to_dict('file_path') == {'nobody': '-2', 'root': '0',
                                               'daemon': '1'}
```
Reference: https://stackoverflow.com/a/55657594/12207563
```python
import mock
import LogFile


def test_logfile():
    open_mock = mock.mock_open()
    with mock.patch("builtins.open", open_mock, create=True):
        lf = LogFile('dummy.log')
        lf.write('foobarbaz')

    open_mock.assert_called_with("dummy.log", "w")
    open_mock.return_value.write.assert_called_once_with("foobarbaz")
```
- Using `with` when opening a file for writing ensures that the file will be flushed and closed.
- The `csv` module makes it easy to read from and write to CSV files.
- The `json` module's `dump` and `load` functions allow us to move between Python data structures and JSON (`dumps` and `loads` are the string-based equivalents).

Working with inner functions and closures can be quite surprising and confusing at first. That's particularly true because our instinct is to believe that when a function returns, its local variables and state all go away. Indeed, that's normally true–but remember that in Python, an object isn't released and garbage-collected if there's at least one reference to it. And if the inner function is still referring to the stack frame in which it was defined, then the outer function will stick around as long as the inner function exists.
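As a small illustration of that last point, here's a sketch (the `make_counter` function is hypothetical, not from the notes): the inner function keeps a reference to the outer function's variable, so it survives after `make_counter` returns.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count  # refer to the enclosing function's variable
        count += 1
        return count

    return increment


counter = make_counter()      # make_counter has returned...
print(counter(), counter())   # ...but `count` lives on in the closure: prints 1 2
```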
On using comprehensions versus `for` loops:

> When you want to transform an iterable into a list, you should use a comprehension. But if you just want to execute something for each element of an iterable, then a traditional for loop is better.

tl;dr - use a comprehension for transforming values:

> taking values in a list, string, dict, or other iterable and producing a new list based on it–are common in programming. You might need to transform filenames into file objects, or words into their lengths, or usernames into user IDs. In all of these cases, a comprehension is the most Pythonic solution.
Consider what your goal is, and whether you're better served with a comprehension or a for loop. For example:

- Getting the `ord` values for each character of a string should be a list comprehension, because you're creating a list based on a string, which is iterable.
- Printing (or otherwise acting on) each element should be a `for` loop, because you're interested in the side effects, not the return value.

Generator expressions look like a list comprehension, but use parentheses rather than square brackets. We can use a generator expression in a call to `str.join`, just as we could put in a list comprehension, saving memory in the process.
```python
# List Comprehension
", ".join([str(num + 1) for num in nums])

# Remove square brackets, becomes generator expression
", ".join(str(num + 1) for num in nums)
```
`map` versus comprehensions

`map` pros: `map` can take multiple iterables as its input and then apply a function that will work with each of them.
```python
import operator

letters = 'abcd'
numbers = range(1, 5)

x = map(operator.mul, letters, numbers)
print(' '.join(x))
```
This can be done with a comprehension, but it's a bit more complex, as we need to use `zip` to iterate through the two iterables.
```python
import operator

letters = 'abcd'
numbers = range(1, 5)

print(' '.join(operator.mul(one_letter, one_number)
               for one_letter, one_number in zip(letters, numbers)))
```
Say you have `first.py`, `second.py`, and `third.py`, and want to keep them together. You can put them all into a directory, `mypackage`, and then import from it:

```python
from mypackage import first
```
`__init__.py`

If you import a package wholesale, like `import mypackage`, and the package directory contains `__init__.py`, then importing `mypackage` effectively means that `__init__.py` is loaded, and thus executed. You can, inside of that file, import one or more of the modules within the package.
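For example, a minimal `__init__.py` (a sketch, reusing the module names from above) might import the package's modules so that `import mypackage` makes them available:

```python
# mypackage/__init__.py
# Executed when someone runs `import mypackage`
from mypackage import first, second, third
```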
A distribution package is a wrapper around a Python package containing information about the author, compatible versions, and licensing, as well as automated tests, dependencies, and installation instructions.
When you create a distribution package for `mypackage`, you'll have an outer directory called `mypackage`. Inside of it is another directory, also called `mypackage`, which is where the Python package itself goes. Creating a distribution package means creating a file called `setup.py`. Here is a tutorial from the Python docs on how to create a package.
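A sketch of the resulting layout, assuming the `first.py`/`second.py`/`third.py` modules from earlier:

```
mypackage/            # distribution package directory
├── setup.py
└── mypackage/        # the Python package itself
    ├── __init__.py
    ├── first.py
    ├── second.py
    └── third.py
```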
A Python class attribute is an attribute of the class, rather than an attribute of an instance of a class. Python doesn’t have constants, but we can simulate them with class attributes.
```python
class MyClass(object):
    class_var = 1  # class attribute

    def __init__(self, i_var):
        self.i_var = i_var  # instance attribute
```
Note that all instances of the class have access to `class_var`, and that it can also be accessed as a property of the class itself:
```python
foo = MyClass(2)
bar = MyClass(3)

foo.class_var, foo.i_var
## 1, 2
bar.class_var, bar.i_var
## 1, 3

MyClass.class_var  ## <— This is key
## 1
```
Reference: Python Class Attributes: An Overly Thorough Guide
Example:

```python
class Person():
    def __init__(self, name):
        self.name = name

    def greet(self):
        return f'Hello, {self.name}'


class Employee(Person):
    def __init__(self, name, id_number):
        self.name = name
        self.id_number = id_number
```
What does `self` do? `self` is referring to the current object; it's passed automatically as the first argument to `__init__` (and to other methods).

What does `__init__` do? `__init__` simply adds new attributes to the object. It's `__new__`, the constructor, that actually creates a new object; after creating a new instance, Python looks for and invokes the `__init__` method.
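A tiny sketch (the `Demo` class is hypothetical) showing that `__new__` creates the instance and `__init__` then initializes it:

```python
class Demo:
    def __new__(cls, *args, **kwargs):
        print('__new__: creating the instance')
        return super().__new__(cls)

    def __init__(self, value):
        print('__init__: adding attributes to the instance')
        self.value = value


d = Demo(5)
# __new__: creating the instance
# __init__: adding attributes to the instance
```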
`super`

> There's one weird thing about my implementation of `Employee`, namely that I set `self.name` in `__init__`. If you're coming from a language like Java, you might be wondering why I have to set it at all, since `Person.__init__` already sets it. But that's just the thing: in Python, `__init__` really needs to execute for it to set the attribute. If we were to remove the setting of `self.name` from `Employee.__init__`, the attribute would never be set. By the ICPO rule, only one method would ever be called, and it would be the one that's closest to the instance. Since `Employee.__init__` is closer to the instance than `Person.__init__`, the latter is never called.
Solution: the `super` built-in allows us to invoke a method on a parent object without explicitly naming that parent.

```python
class Employee(Person):
    def __init__(self, name, id_number):
        super().__init__(name)
        self.id_number = id_number
```
Abstract base classes are classes that are never instantiated on their own, but that various subclasses inherit from. They can be created using `ABCMeta` from the `abc` module.

If a subclass has a default attribute that is not going to change, consider just setting it as a class attribute on the subclass rather than initializing it in the parent class's `__init__` method.
```python
# Parent class
class Animal:
    def __init__(self, color):
        # Turn the current class object into a string
        self.species: str = self.__class__.__name__
        self.color: str = color

    def __repr__(self):
        return f"{self.color} {self.species}, {self.number_of_legs} legs".lower()


# Child class - just sets number_of_legs as a class attribute rather than
# setting it as an __init__ attribute in the parent.
class Wolf(Animal):
    number_of_legs = 4

    def __init__(self, color):
        super().__init__(color)
```
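Going back to abstract base classes, here is a minimal sketch using `ABCMeta` (the `make_sound` method is hypothetical; the names simply mirror the example above):

```python
from abc import ABCMeta, abstractmethod


class Animal(metaclass=ABCMeta):
    @abstractmethod
    def make_sound(self):
        """Every concrete subclass must implement this."""


class Wolf(Animal):
    def make_sound(self):
        return 'howl'


# Animal() raises TypeError; Wolf().make_sound() returns 'howl'
```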
Whether this is right or wrong, directly accessing data in other objects is fairly common in the Python world. Because all data is public (i.e., there's no private or protected), it's considered a good and reasonable thing to just scoop the data out of objects. That said, this also means that whoever writes a class has a responsibility to document it, and to keep the API alive–or to document elements that may be deprecated or removed in the future.
(Unlike Python) In many languages, object-oriented programming is forced on you, such that you’re constantly trying to fit your programming into its syntax and structure.
There are at least three different ways to create an iterator: implement the iterator protocol (`__iter__` and `__next__`) on a class, write a generator function, or use a generator expression.

The iterator protocol is both common and useful in Python. By now, it's a bit of a chicken-and-egg situation–is it worth adding the iterator protocol to your objects because so many programs expect objects to support it? Or do programs use the iterator protocol because so many objects support it? The answer might not be clear, but the implications are. If you have a collection of data, or something that can be interpreted as a collection, then it's worth adding the appropriate methods to your class. And if you're not creating a new class, you can still take advantage of iterables with generator functions and expressions.
The book covers the following:

- `__iter__` returns an iterator.
- `__next__` must be defined on the iterator.
- `StopIteration` is the exception which the iterator raises to signal the end of the iterations.
- How a `for` loop actually works: the loop calls the `iter` built-in, and `iter` invokes the `__iter__` method on the target object. The `for` loop then invokes the `next` built-in on the iterator, which invokes `__next__` on the iterator. When `__next__` raises a `StopIteration` exception, the loop exits.
- To make a class iterable, give it an `__iter__` method that takes only `self` as an arg and returns `self`, and a `__next__` method that takes only `self` as an arg and either returns a value or raises `StopIteration` when it runs out of values.

Example:

```python
class LoudIterator():
    def __init__(self, data):
        print('\tNow in __init__')
        self.data = data
        self.index = 0

    def __iter__(self):
        print('\tNow in __iter__')
        return self

    def __next__(self):
        print('\tNow in __next__')
        if self.index >= len(self.data):
            print(f'\tself.index ({self.index}) is too big; exiting')
            raise StopIteration

        value = self.data[self.index]
        self.index += 1
        print(f'\tGot value {value}, incremented index to {self.index}')
        return value


for one_item in LoudIterator('abc'):
    print(one_item)

# prints
"""
Now in __init__
Now in __iter__
Now in __next__
Got value a, incremented index to 1
a
Now in __next__
Got value b, incremented index to 2
b
Now in __next__
Got value c, incremented index to 3
c
Now in __next__
self.index (3) is too big; exiting
"""
Generators look like functions but, when executed, act like iterators. The function below, when called, doesn't execute its body but rather returns a generator object.
```python
def foo():
    yield 1
    yield 2
    yield 3
```
This can be saved as a variable and put in a `for` loop. With each iteration, the function executes through the next `yield` statement, returns the value from `yield`, and then waits for the next iteration. When the generator function exits, it automatically raises `StopIteration` to close the loop.
```python
g = foo()
for i in g:
    print(i)
```
- Iterables have an `__iter__` method, which returns an iterator.
- Iterators have a `__next__` method.

Term | What is it? | Example | To learn more |
---|---|---|---|
iter | A built-in function that returns an object's iterator | iter('abcd') | http://mng.bz/jgja |
next | A built-in function that requests the next object from an iterator | next(i) | http://mng.bz/WPBg |
StopIteration | An exception raised to indicate the end of a loop | raise StopIteration | http://mng.bz/8p0K |
enumerate | Helps us to number elements of iterables | for i, c in enumerate('ab'): print(f'{i}: {c}') | http://mng.bz/qM1K |
Iterables | A category of data in Python | Iterables can be put in for loops or passed to many functions. | http://mng.bz/EdDq |
itertools | A module with many classes for implementing iterables | import itertools | http://mng.bz/NK4E |
range | Returns an iterable sequence of integers | range(10, 50, 3) # every 3rd integer, from 10 to (not including) 50 | http://mng.bz/B2DJ |
os.listdir | Returns a list of files in a directory | os.listdir('/etc/') | http://mng.bz/YreB |
os.walk | Iterates over the files in a directory | os.walk('/etc/') | http://mng.bz/D2Ky |
yield | Returns control to the loop temporarily, optionally returning a value | yield 5 | http://mng.bz/lG9j |
os.path.join | Returns a string based on the path components | os.path.join('etc', 'passwd') | http://mng.bz/oPPM |
time.perf_counter | Returns the number of elapsed seconds (as a float) since the program was started | time.perf_counter() | http://mng.bz/B21v |
zip | Takes n iterables as arguments and returns an iterator of tuples of length n | zip('abc', [10, 20, 30]) # returns [('a', 10), ('b', 20), ('c', 30)] | http://mng.bz/Jyzv |
`__iter__` in multi-class cases

Problem: the code below will print nothing for B, because the same (already exhausted) iterator object is being used.
```python
e = MyEnumerate('abc')

print('** A **')
for index, one_item in e:
    print(f'{index}: {one_item}')

print('** B **')
for index, one_item in e:
    print(f'{index}: {one_item}')
```
Solution: implement `__iter__` on the main class, but its job is to return a new instance of the helper class.
```python
# in MyEnumerate
def __iter__(self):
    return MyEnumerateIterator(self.data)
```
Then we define `MyEnumerateIterator`, a new and separate class, whose `__init__` looks much like the one we already defined for `MyIterator` and whose `__next__` is taken directly from `MyIterator`.
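A minimal sketch of the two classes described above (assuming `MyEnumerate` yields `(index, item)` pairs, like the built-in `enumerate`):

```python
class MyEnumerate():
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        # Every for loop gets its own, fresh iterator object
        return MyEnumerateIterator(self.data)


class MyEnumerateIterator():
    def __init__(self, data):
        self.data = data
        self.index = 0

    def __next__(self):
        if self.index >= len(self.data):
            raise StopIteration
        value = (self.index, self.data[self.index])
        self.index += 1
        return value
```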
Advantages to this design: the state of the iteration (and `__next__`) lives in a separate class, so each loop over the original object gets its own fresh iterator.

- `yield` indicates that you want to keep the generator going and return a value for the current iteration, while `return` indicates that you want to exit completely.
- A generator signals that it's done by raising `StopIteration`. That happens automatically when the generator reaches the end of the function, or when it hits a `return` statement.
- Using `return` with a value (e.g., `return 5`) from a generator function won't throw an error, but the value will be ignored.
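A small sketch of the `yield`-versus-`return` distinction (the `stop_early` generator is hypothetical):

```python
def stop_early(data):
    for item in data:
        if item is None:
            return          # exit the generator completely (raises StopIteration internally)
        yield item          # hand back one value and keep going


print(list(stop_early([1, 2, None, 3])))   # [1, 2]
```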
`itertools`

Python comes with the `itertools` module, which makes it easy to create many types of iterators.
The `chain` tool allows you to chain together various types of iterables.
```python
from itertools import chain

print([i for i in chain('abc', [1, 2, 3], {'a': 1, 'b': 2})])
# ['a', 'b', 'c', 1, 2, 3, 'a', 'b']
```
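As another example (not from the notes above), `itertools.count` together with `itertools.islice` builds a lazy stream of integers:

```python
from itertools import count, islice

# count(10, 2) lazily yields 10, 12, 14, ...; islice takes the first five
print(list(islice(count(10, 2), 5)))   # [10, 12, 14, 16, 18]
```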