How to write a SaaS B2B product description

Writing a product description is a task every product manager will likely encounter. Here is something to help you get started, in the form of questions a product description should, or at least could, try to answer.

First, there are questions every SaaS product description should answer:

  • What does SaaS mean?
  • What is the name of the product?
  • What (kind of) organizations is the product for?
  • Who are the intended users within organizations?
  • What is expected or required of them to acquire access to the product?
  • What is required to actually use the product?
  • What are the outcomes the product produces or delivers when used, for its users and customer organizations?
  • What are some actual example outcomes customers have achieved by using the product?
  • What are the key features of the product that users take advantage of to get those outcomes?
  • What, if anything, is unusually great about the product?

Consider also the following, should they apply to your product:

  • What is the pricing model
  • Where to get more information
  • What is the document version, and when was it last updated
  • What are the available product performance SLAs
  • How is end user privacy safeguarded
  • What third party data processors are used
  • What are the technical domicile(s) and jurisdiction of the service
  • Product compliance statement
  • What are the available language versions

You might also ask and answer the following and tune the description accordingly:

  • What is/are the audience(s) that the description needs to serve?
  • What are the questions you yourself would ask about the product? What would you like to know about it, were it offered to you?

Icinga2 for distributed system monitoring

This is a short introduction to distributed system monitoring using Icinga2, an open source monitoring solution. Besides Linux, it also runs on Windows, although Windows support is somewhat limited.

Icinga typically monitors things using so-called monitoring plugins. These are just executable programs that return an exit code and some output on stdout, wrapped in some Icinga-specific configuration. Yes, every check results in a command invocation that starts a process. Worse, in many cases there is even further overhead: what's run is actually a shell script (or a bat file), which in turn runs the executable. Heavy and arcane as this may sound nowadays, it is usually not a problem, assuming the commands don't hang for too long; so timeouts can be important. There are also so-called passive checks, where instead of Icinga running a check, an outside system submits the result of some check to Icinga.
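
To make this concrete, here is a sketch of what a minimal monitoring plugin can look like in Python. The check itself and its thresholds are made up for illustration; real plugins usually also emit performance data and accept warning/critical thresholds as arguments, and the plugin still needs to be wrapped in an Icinga CheckCommand definition to be used.

#!/usr/bin/env python
"""A minimal, hypothetical monitoring plugin: check the 1-minute load average.

Monitoring plugins print one line of status text to stdout and exit with
a conventional code: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
"""
import os
import sys

WARN, CRIT = 2.0, 4.0   # made-up thresholds, for illustration only

try:
   load1 = os.getloadavg()[0]
except OSError:
   print("LOAD UNKNOWN - could not read the load average")
   sys.exit(3)

if load1 >= CRIT:
   print("LOAD CRITICAL - load1=%.2f" % load1)
   sys.exit(2)
if load1 >= WARN:
   print("LOAD WARNING - load1=%.2f" % load1)
   sys.exit(1)
print("LOAD OK - load1=%.2f" % load1)
sys.exit(0)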

There are lots of ready-made monitoring plugins available. If you're nevertheless sure you need to write your own from scratch, see the monitoring-plugins docs for guidance (the old Icinga1 docs provide a shorter explanation). There is also at least one much-needed check command missing: a built-in HTTP check for use on the Microsoft Windows platform. Finding and implementing one will be the topic of a future post.

Icinga2 can be deployed in a distributed manner, for example so that there are two differently configured Icinga2 instances, a master and a slave, that connect over a network. In such a case, the master always has the monitoring configuration, i.e. the definitions of the hosts and services to monitor, how to monitor them, and what to do depending on the outcome. Icinga has its own, rather extensive configuration language for defining the monitoring configuration.

There are two alternative options for a master-slave deployment:

  1. The master schedules the checks, but does not run them. Instead, each time there is a scheduled check coming up, it sends a command to the slave telling it to perform the check and pass back the results.
  2. The master distributes the monitoring configuration to the slave, which handles the scheduling and execution of checks on its own, while passing the results back to the master.

Icinga provides built-in support for the two instances to connect securely. Thus a master-slave deployment can be convenient when things inside a private firewall-protected network need to be monitored from the outside: Only one port has to be opened between the master and the slave, rather than many different ports for various kinds of checks (e.g. ping, HTTP etc).

The distributed configuration can also provide some tolerance of disconnects: if the second option above is used and the network connection between the master and the slave is lost, the slave keeps monitoring things; after all, it has received all the configuration it needs from the master. Once the connection comes back up, the slave submits a so-called replay log to the master, which the master uses to update itself, i.e. to add the check results it missed while the two were disconnected.

Distributed monitoring with Icinga2 is a large and complex topic; for more information, it's best to read the official Icinga docs and then search the forums and the web for specific questions. While the Icinga2 docs are extensive, their style is that of a reference, and good tutorials can be hard to find on some topics. So getting things going can be daunting, especially in larger or otherwise more complex scenarios. Simple things are fairly easy to configure, but the configuration language can also be arduous and it can be difficult to get things right; thankfully, Icinga nowadays provides fairly adequate and understandable error messages. The forums are helpful for some things, but if your question shows you haven't carefully read and tried to understand the docs before asking, be prepared to be scolded by the main developer and politely instructed to go RTFM and come back after that.


Python IoT temperature measurement

This is the second article in a ‘Python IoT’ series. In the previous article, I described WiThumb, a USB-connected IoT device from a KickStarter project, and how to run Python on it; specifically, a Python implementation for resource-constrained, connected devices called MicroPython, or μPy for short.

In practice, on the WiThumb, μPy runs on an ESP8266 WiFi-enabled microcontroller chip embedded on the PCB. The PCB also has a single-chip temperature sensor called the MCP9808. The ESP8266 and the sensor are connected by what is called an I2C bus, which can be accessed using μPy.

To read temperature information from the sensor using I2C, we need to know:

  • what the I2C device address of the on-board MCP9808 is
  • what the temperature register number is, and how much data can be read from it
  • what the structure of the data read from the register is

When dealing with IoT devices, you may have to check actual hardware data sheets to get all this information. Or ask a lot of questions of people who have it figured out. Or just read a nice, detailed blog article about it by someone who has done those things.

So, to determine the I2C address of the MCP9808 sensor, we’d need to check the pertinent parts of the MCP9808 data sheet:

3.5 Address Pins (A0, A1, A2)

These pins are device address input pins.

The address pins correspond to the Least Significant bits (LSbs) of the address; the Most Significant bits (MSbs) A6, A5, A4 and A3 are fixed. This is illustrated in Table 3-2.

Moving on to said table, abbreviated here for readability:

TABLE 3-2: MCP9808 ADDRESS BYTE

  Device  | Address Code (A6 A5 A4 A3) | Slave Address (A2 A1 A0)
  --------+----------------------------+-------------------------
  MCP9808 | 0  0  1  1                 | x(1)  x  x

Note 1: User-selectable address is shown by ‘x’. A2, A1 and A0 must match the corresponding device pin configuration.

Ok, let's see. It's a byte with bits A6-A3 already set for us, whereas A2-A0 can change. How do we know what they are? Well, it depends. The I2C bus can have many slave devices such as sensors connected to it, each with a unique address, so in each hardware design (such as that of the WiThumb) the bits might be set differently. Still, the data sheet suggests 0x18 as a default address, so we could reasonably try that.

But let's nevertheless check the WiThumb hardware schematics – the actual blueprint of how the device is physically soldered together. From the schematics we see that pins A2, A1 and A0 are all connected to a `VDD` pin, i.e. the `3V3` line. That tells us that the pin configuration for A2-A0 is actually `111` (for any pins left unconnected, the address bits would be zeroes). So the full address is `0011111`, which is `0x1F` in hex, not `0x18`: for some reason the designer of WiThumb has chosen to deviate from the default address.
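
The bit arithmetic is easy to double-check in Python:

# fixed upper bits 0011, followed by the three user-selectable bits A2-A0
hex(0b0011 << 3 | 0b111)   # '0x1f' - A2-A0 wired to VDD, as on the WiThumb
hex(0b0011 << 3 | 0b000)   # '0x18' - A2-A0 left unconnected, the default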

What about the temperature register, then? Back to the MCP9808 data sheet:

TABLE 5-1: BIT ASSIGNMENT SUMMARY FOR ALL REGISTERS (abbreviated to the ambient temperature register)

  Register Pointer (Hex) | MSB/LSB | Bit 7      | Bit 6       | Bit 5       | Bit 4  | Bit 3   | Bit 2   | Bit 1   | Bit 0
  -----------------------+---------+------------+-------------+-------------+--------+---------+---------+---------+--------
  0x05                   | MSB     | TA ≥ TCRIT | TA > TUPPER | TA < TLOWER | SIGN   | 2^7 °C  | 2^6 °C  | 2^5 °C  | 2^4 °C
                         | LSB     | 2^3 °C     | 2^2 °C      | 2^1 °C      | 2^0 °C | 2^-1 °C | 2^-2 °C | 2^-3 °C | 2^-4 °C

On the left we see that the register pointer is 0x05. So now we have enough information that we can try to fetch the temperature data from the MCP9808! Let’s try that using MicroPython:

from machine import Pin, I2C
i2c = I2C(scl=Pin(5, Pin.IN, Pin.PULL_UP), sda=Pin(4, Pin.IN, Pin.PULL_UP), freq=10000)
i2c.scan()
[31, 104]

So apparently the I2C bus of the WiThumb indeed has two devices connected to it, one at 0x1F and the other at 0x68.

Note that it is absolutely necessary to set the `PULL_UP` mode when constructing the I2C instance. That enables the ESP8266's internal pull-up resistors to ‘pull up’ both the I2C data and clock lines. Using the internal resistors for I2C is generally not recommended, but in the case of the WiThumb they seem to be sufficient.

Next, let’s read ambient temperature data from the register:

from micropython import const

R_A_TEMP = const(5)
raw = i2c.readfrom_mem(0x1F, R_A_TEMP, 2)

Here, we read two raw data bytes from the register. Note the `const()` function: while Python has no real constant type, MicroPython's `const()` can be used to optimize the handling of values that are known not to change.

Finally, let's convert the raw data to a floating-point number, using the information from Table 5-1 above. All the temperature values in the data sheet are in degrees Celsius, so we know that's what the bytes represent. To convert the bit string into a floating-point number, Python bitwise operators such as ‘bitwise left shift’ and ‘bitwise and’ are useful:

u = (raw[0] & 0x0f) << 4       # integer bits 2^7..2^4 from the upper byte
l = raw[1] / 16                # bits 2^3..2^-4 from the lower byte
if (raw[0] & 0x10) == 0x10:    # SIGN bit set: temperature below 0 degrees C
   temp = 256 - (u + l)
else:
   temp = u + l

(algorithm courtesy of Thomas Lee, designer of the WiThumb)
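
Putting it all together, here is a small helper function of my own (a sketch, not part of the original WiThumb code), using the 0x1F address, register number and conversion discussed above:

from machine import Pin, I2C
from micropython import const

MCP9808_ADDR = const(0x1F)   # WiThumb-specific; 0x18 is the datasheet default
R_A_TEMP = const(5)          # ambient temperature register

def read_temperature(i2c, addr=MCP9808_ADDR):
   "Convert the MCP9808 ambient temperature register to degrees Celsius (same conversion as above)."
   raw = i2c.readfrom_mem(addr, R_A_TEMP, 2)
   u = (raw[0] & 0x0f) << 4     # integer bits 2^7..2^4 from the upper byte
   l = raw[1] / 16              # bits 2^3..2^-4 from the lower byte
   if raw[0] & 0x10:            # SIGN bit set: below 0 degrees C
      return 256 - (u + l)
   return u + l

i2c = I2C(scl=Pin(5, Pin.IN, Pin.PULL_UP), sda=Pin(4, Pin.IN, Pin.PULL_UP), freq=10000)
print(read_temperature(i2c))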

That’s it.

MicroPython on the WiThumb with OSX

The WiThumb is a small, USB-powered, Arduino-compatible, ESP8266-based WiFi IoT development board with a 6-DOF IMU and a precision temperature sensor. A lot of stuff in a small package.

WiThumb image by Thomas Lee

It is the result of a successful KickStarter campaign by Thomas Lee of “Funnyvale”, a group of San Jose area hackers and makers from California. Curiously, when I received my WiThumb and saw the sender's address, I realized I used to live nearby back in 1995–96, when I was a young trainee with NCD (Network Computing Devices) of Mountain View. A small world.

But let's get back to the device: it uses an ESP8266 chip that enjoys huge popularity among Python IoT developers, after Damien George's successful project to bring MicroPython to it. MicroPython is also used in the OpenMV Python-programmable machine vision camera, another favourite device of mine.

When I saw that ESP8266 was used in WiThumb, I asked Thomas whether MicroPython was or would be supported. Apparently other people asked too, and he was able to make it work, using an unofficial MicroPython build for the ESP8266 by Stefan Wendler.

The instructions for flashing the WiThumb with MicroPython firmware and connecting to it were for Windows, however. While I have a Windows VM on my laptop, I wanted to see if I could make things work on OSX as well. It was a surprisingly smooth experience. Here's how:

  1. Figure out what the device file is after plugging in the WiThumb. There are other ways, but I like doing things in Python, so I created a small withumb Python package that uses PySerial to help identify the WiThumb easily (see the sketch after this list). For example, my WiThumb appeared as /dev/tty.SLAB_USBtoUART.
  2. Install esptool, the official Python package to work with the ESP8266, and erase the flash:
    esptool.py --port /dev/tty.SLAB_USBtoUART erase_flash
  3. Download the unofficial micropython nightly (see link earlier) and flash the WiThumb with it:
    esptool.py --port /dev/tty.SLAB_USBtoUART --baud 460800 write_flash --flash_size=8m 0 mp-esp8266-firmware-latest.bin
  4. Remove and plug in the WiThumb
  5. Connect to it using the screen command at 115200 baud speed
    screen /dev/tty.SLAB_USBtoUART 115200
  6. Have fun with MicroPython on the WiThumb!
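
For step 1, here is a rough idea of what such PySerial-based detection can look like. This is a sketch only, not the actual withumb package; matching on the CP210x/“SLAB” USB-to-UART bridge name is an assumption that happens to fit the device name seen above.

# pip install pyserial
from serial.tools import list_ports

def find_withumb():
   "Return the device path of the first CP210x/SLAB-looking serial port, or None."
   for port in list_ports.comports():
      text = "%s %s" % (port.device, port.description or "")
      if "SLAB" in text or "CP210" in text:
         return port.device
   return None

print(find_withumb())   # e.g. /dev/tty.SLAB_USBtoUART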

Note: to move files to the WiThumb, your best bet is probably adafruit-ampy. I first tried mipy and mpy-utils but got errors with them, whereas ampy just worked right out of the box.

Practical digital business contracts

When entering into a contract with a customer, it is no longer necessary (at least in Finland) to physically exchange signed paper copies.

While convenient, this raises the question of how to make sure a digital copy is not later modified by either party to the agreement, given how easy digital images are to counterfeit. It would be easy for, say, a buyer to replace the price in the contract with a lower figure by a simple Photoshop operation.

Of course, a prudent organization keeps copies of all digital contracts, so any changes made later can be detected by comparing the copies. However, such checks can be time-consuming; a large contract can run to dozens of pages. At first, checking that the file sizes match sounds like a good idea, too. However, it is also easy to add (or remove) data from a file so that the sizes match while the content differs.

So, how to reliably compare an original and a copy that is received afterwards? There is a solution: one can generate a so-called cryptographic hash string that serves as a fingerprint of the original file. The way these hashes are generated ensures that any change to the file will result in a different hash. One can then later easily check that the copy is faithful to the original by checking whether their hashes match.
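
For example, with Python's standard hashlib module (a minimal sketch; the file name is made up):

import hashlib

def file_fingerprint(path):
   "Return the SHA-256 hash of a file as a hex string."
   sha = hashlib.sha256()
   with open(path, "rb") as f:
      for chunk in iter(lambda: f.read(65536), b""):
         sha.update(chunk)
   return sha.hexdigest()

# The copy is faithful if and only if its fingerprint matches the original's:
print(file_fingerprint("contract.pdf"))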

An even better solution is to use digital signatures, enabled by public key cryptography: first, a key pair is created, consisting of a private key that is kept secret and a public key that can be distributed or published freely. The two keys are created from a common “origin” in a way that makes it possible to use them together in a very smart way.

When a digital signature is created, the secret private key is used to “stamp” the hash. The public key can then be used later to verify that the stamp was indeed created with the private key – in other words, by the party that gave us the digital signature.

So, by using digital signatures, we can make sure both that the contract copies the two parties possess are the same, and that the copy was indeed sent to us by the other contractual party.

Here’s a simple Python module providing a command-line utility to create and verify digital keys and signatures.

"""digicheck - create and verify signatures for files

Usage:
  digicheck keys
  digicheck public <keyfilename>
  digicheck sign <filename> <keyfilename>
  digicheck check <filename> <keyfilename> <signaturefilename>
  digicheck (-h | --help)
  digicheck --version

Use the command-line to first create a key pair, then a
signature for a file, and finally when you need to make
sure file has not been tampered with in the meantime,
check that the signatures are still equal.

Options:
  -h --help     Show this screen.
  --version     Show version.
"""
import sys

import docopt
from Crypto.Hash import SHA256
from Crypto.PublicKey import RSA
from Crypto import Random


def generate_keys():
    random_generator = Random.new().read
    key = RSA.generate(2048, random_generator)
    return (key.exportKey(), key.publickey().exportKey())


def generate_hash(data):
    return SHA256.new(data).digest()


def generate_signature(hash, key):
    return key.sign(hash, '')


def verify_signature(hash, public_key, signature):
    return public_key.verify(hash, signature)


if __name__ == "__main__":
    args = docopt.docopt(__doc__)
    if args["keys"]:
        private, public = generate_keys()
        keys = private + "\n\n" + public
        print(keys.strip())
    elif args["public"]:
        with open(args["<keyfilename>"], "r") as keyfile:
            public_key = keyfile.read().split("\n\n")[1]
        print(public_key.strip())
    elif args["sign"]:
        with open(args["<filename>"], "rb") as signedfile:
            hash = generate_hash(signedfile.read())
        with open(args["<keyfilename>"], "r") as keyfile:
            private_key = RSA.importKey(keyfile.read().split("\n\n")[0].strip())
        print(generate_signature(hash, private_key)[0])
    elif args["check"]:
        with open(args["<filename>"], "rb") as signedfile:
            hash = generate_hash(signedfile.read())
        with open(args["<keyfilename>"], "r") as keyfile:
            public_key = RSA.importKey(keyfile.read().split("\n\n")[1].strip())
        with open(args["<signaturefilename>"], "r") as signaturefile:
            signature = long(signaturefile.read())
        if verify_signature(hash, public_key, (signature,)):
            sys.exit("valid signature :)")
        else:
            sys.exit("invalid signature! :(")

For more details, see for example the blog post by Laurent Luce, describing the use of PyCrypto.

Simple & smart annotation storage for Plone forms

Introduction

In Plone 4 & 5, a package called z3c.form is used for forms. While it has quite extensive documentation, beginner-level tutorials for accomplishing specific (presumably simple) small tasks are hard to come by.

One such task might be using annotations for storing form data. Here, I describe how to accomplish just that with minimum effort while sticking to ‘the Zope/Plone way’.

Custom data manager registration

First, you have to register the built-in z3c.form.datamanager.DictionaryField for use with PersistentMapping (for more on what z3c.form data managers are and how they work, see for example my earlier post). This can be done by adding the following (multi-) adapter ZCML registration to the configure.zcml file of your Plone add-on package.

<adapter
   for="persistent.mapping.PersistentMapping zope.schema.interfaces.IField"
   provides="z3c.form.interfaces.IDataManager"
   factory="z3c.form.datamanager.DictionaryField"
/>

This basically tells Plone to use the DictionaryField data manager for loading and storing form data whenever the form's content is a PersistentMapping.

Using the built-in persistent.mapping.PersistentMapping directly, instead of subclassing it, avoids a problem: were you ever to refactor your (sub)class, for example move or rename it, existing annotations using that (sub)class would break. Sticking to PersistentMapping (or some other built-in persistent.* data structure) avoids such problems.

Form customization

For the DictionaryField data manager to do its job, the form has to provide it with the object that the form will save submitted form data to and load editable data from: in our case, that is a PersistentMapping instance. For that purpose, z3c.form forms have a getContent method that we customize:

   def getContent(self):
      annotations =  IAnnotations(self.context)
      return annotations["myform_mapping_storage_key"]

This assumes the context object, be it a Page, a Folder or your custom type, is attribute-annotatable and that its annotations have been initialized with a PersistentMapping keyed at “myform_mapping_storage_key”. Doing that is left as an exercise to the reader (see annotations). For a more standalone form, add a try/except block to the getContent method that checks for the existence of “myform_mapping_storage_key” and, upon failure, initializes it according to your needs.
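
As a sketch (assuming IAnnotations from zope.annotation.interfaces and PersistentMapping from persistent.mapping are imported), such a more standalone getContent could look roughly like this:

   def getContent(self):
      annotations = IAnnotations(self.context)
      try:
         return annotations["myform_mapping_storage_key"]
      except KeyError:
         # first use: initialize the annotation storage for this form
         annotations["myform_mapping_storage_key"] = PersistentMapping()
         return annotations["myform_mapping_storage_key"]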

So, with just those small pieces of custom code (adapter ZCML registration & a custom form getContent method), your form can use annotations for storage.

Use with custom z3c.form validators

If you don’t need to register custom validators for your form, you’re all set. But if you do, there are some further details to be aware of.

To start with, there are some conveniences to simplify the use of z3c.form validators, but for anything beyond the simplest validators, we'd want to create a custom validator class based on z3c.form.validator.SimpleFieldValidator. A validator typically just has to implement a custom validate() method that raises zope.interface.Invalid on failure; we won't go into more detail here.

Such a validator needs first to be registered as an adapter and then configured for use by setting so-called validator discriminators, by calling the validator.WidgetValidatorDiscriminators() function.

Among others, it expects a context parameter (discriminator) that tells the validator machinery which form storage to use the validator for.

Given we were using plain PersistentMapping, this poses a problem: if there are other forms using the same storage class (and same field), the validator will be applied to them as well. What if we want to avoid that; perhaps the other form needs another validator, or no validator?

We could subclass PersistentMapping, but as we saw earlier, that is not necessarily a good idea. Instead, we can declare that our particular PersistentMapping instance provides an interface, and then pass that interface as the context (discriminator) to WidgetValidatorDiscriminators().

Defining such an interface is as easy as:

from zope.interface import Interface, directlyProvides

class IMappingAnnotationFormStorage(Interface):
   "automatically applied 'marker' interface"

Then update the getContent method introduced earlier to tag the returned PersistentMapping with the interface thus:

   def getContent(self):
      annotations = IAnnotations(self.context)
      formstorage = annotations["myform_mapping_storage_key"]
      directlyProvides(formstorage, IMappingAnnotationFormStorage)
      return formstorage

With that in place, you can register different validators for forms that all use nothing but plain PersistentMapping for storage.
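
For example, a validator registration restricted to our marker interface could look roughly like this (a sketch only; the IMyFormSchema schema, its field and the validation rule are hypothetical):

import zope.component
import zope.interface
from z3c.form import validator

class MyFieldValidator(validator.SimpleFieldValidator):
   "only applied to forms whose storage provides IMappingAnnotationFormStorage"

   def validate(self, value):
      super(MyFieldValidator, self).validate(value)
      if value and u"spam" in value:
         raise zope.interface.Invalid(u"No spam, please")

validator.WidgetValidatorDiscriminators(
   MyFieldValidator,
   context=IMappingAnnotationFormStorage,
   field=IMyFormSchema["myfield"],   # hypothetical schema/field
)
zope.component.provideAdapter(MyFieldValidator)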

Finally, here’s a custom mixin for your forms that has some built-in conveniences for annotation storage use:

class AnnotationStorageMixin(object):
   "mixin for z3c.forms for using annotations as form data storage"

   ANNOTATION_STORAGE_KEY = "form_annotation_storage"
   ASSIGN_INTERFACES = None

   def getContent(self):
      annotations =  IAnnotations(self.context)
      mapping = annotations[self.ANNOTATION_STORAGE_KEY]
      if self.ASSIGN_INTERFACES:
         for iface in self.ASSIGN_INTERFACES:
            directlyProvides(mapping, iface)
      return mapping

Note the two class variables ANNOTATION_STORAGE_KEY & ASSIGN_INTERFACES. Override them in your own form class to configure the mixin functionality for your specific case. If you use this mixin class for multiple different forms, you very likely want to at least have a unique storage key for each form.

Unexpectedly morphing z3c.form validation contexts considered harmful

TL;DR

The z3c.form validator context is not the context content object; rather, it is whatever getContent returns

Background

Normally, z3c.form stores form data as content object attributes. More specifically, the default AttributeField data manager directly gets and sets the instance attributes of the context content object.

To register a SimpleFieldValidator-based validator adapter for the form, the class of that same context object (or an interface provided by it) should be passed to validator.WidgetValidatorDiscriminators. The same context object is then assigned to the validator's context attribute when it runs.

That’s all fine so far.

Use a custom data manager

Sometimes you may want to, for example, store form data in annotations rather than in object attributes. In such a case, you can use a custom z3c.form data manager that basically just changes which object the form load and save operations are performed on.

All that’s required (for that particular case) is:

  • register the built-in (not used by default) DictionaryField as an adapter for PersistentMapping:
<adapter
   for="persistent.mapping.PersistentMapping zope.schema.interfaces.IField"
   provides="z3c.form.interfaces.IDataManager"
   factory="z3c.form.datamanager.DictionaryField"
/>
  • add to the form a custom getContent method that returns your PersistentMapping instance

Resulting change in validation behavior

It appears that when using a custom data manager, z3c.form enforces its own idea of what the context is considered to be (for purposes of z3c.form validation, anyway).

What does this mean? During form validation by a subclass of the z3c.form base SimpleFieldValidator, contrary to the default behavior described at the beginning of this article, the context is no longer the context content object (say, a Page or a Folder). Instead, it is what the getContent method returned. In our case, that would be the PersistentMapping (or zope.interface.common.mapping.IMapping, if we passed an interface rather than a class).

So if you were still expecting the context to refer to the content object, you'd be in for a surprise: first, when passing a context discriminator to validator.WidgetValidatorDiscriminators to validate the form, you now have to pass PersistentMapping (or zope.interface.common.mapping.IMapping) instead of the content object (e.g. Page or Folder). Second, when the validator thus registered runs, its context attribute will also be that PersistentMapping (or the interface).

Why is this bad

While this change in behavior may be by design, the changed semantics are confusing and, at the very least, not obvious.

It is not intuitive that what could reasonably be considered z3c.form’s internals (how the actual form storage is determined) would propagate to a change in semantics of validation context in this way. Especially when ‘context’ in the Zope/Plone world is pervasively used to refer to the context content object (acquisition-wrapped, but regardless).

Also, when a custom data manager is used, the actual content object is not directly accessible as the validator's context (thankfully, the validator can still reach it via the view, i.e. self.view.context).
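
In validator code, the workaround looks roughly like this (a sketch):

   def validate(self, value):
      storage = self.context          # the PersistentMapping returned by getContent
      content = self.view.context     # the actual content object (e.g. Page or Folder)
      # ... validate value against whichever of the two you actually need ...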

How to maintain persistent connections in Plone

Disclaimer: the real substance of this article is a verbatim copy of “How This Package Maintains Persistent Connections” from the README of the alm.solrindex package (see e.g. https://pypi.python.org/pypi/alm.solrindex). I take no credit for any of it. A big thanks to authors of alm.solrindex for documenting their approach.

Why re-publish it here? Well, the original documentation does not necessarily come up top when one googles the topic, and I wanted to make sure I can later get back to it even if I forget where I read about it. Perhaps this will help some others find out about the mechanism a bit more easily as well. After all, the need to make external connections from Plone will likely only increase in our ever more networked world. And who knows, maybe some improvements or suggestions will come up that will help anyone struggling with the topic (myself included).

So, when one needs to make external network connections from Plone that have to be persistent (kept open), how does one maintain them, with all the issues related to threading and the like?

The alm.solrindex approach, re-documented here:

This package uses a new method of maintaining an external database connection from a ZODB object. Previous approaches included storing _v_ (volatile) attributes, keeping connections in a thread local variable, and reusing the multi-database support inside ZODB, but those approaches each have significant drawbacks.

The new method is to add a dictionary called foreign_connections to the ZODB Connection object (the _p_jar attribute of any persisted object). Each key in the dictionary is the OID of the object that needs to maintain a persistent connection. Each value is an implementation-dependent database connection or connection wrapper. If it is possible to write to the external database, the database connection or connection wrapper should implement the IDataManager interface so that it can be included in transaction commit or abort.

When a SolrIndex needs a connection to Solr, it first looks in the foreign_connections dictionary to see if a connection has already been made. If no connection has been made, the SolrIndex makes the connection immediately. Each ZODB connection has its own foreign_connections attribute, so database connections are not shared by concurrent threads, making this a thread safe solution.
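
A rough sketch of the pattern in Python (the helper name and the connect callable are my own; a real implementation such as alm.solrindex also takes care of transaction integration and of newly created, not-yet-stored objects):

def get_foreign_connection(persistent_obj, connect):
   """Return a per-ZODB-connection external connection for persistent_obj.

   connect is a callable that opens the external connection; the result is
   cached in the foreign_connections dict on the ZODB Connection (_p_jar),
   keyed by the object's OID, so concurrent threads never share it.
   """
   jar = persistent_obj._p_jar                    # the ZODB Connection
   cache = getattr(jar, 'foreign_connections', None)
   if cache is None:
      cache = jar.foreign_connections = {}
   oid = persistent_obj._p_oid
   if oid not in cache:
      cache[oid] = connect()                      # e.g. open a Solr connection
   return cache[oid]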

This solution is better than _v_ attributes because connections will not be dropped due to ordinary object deactivation. This solution is better than thread local variables because it allows the object database to hold any number of external connections and it does not break when you pass control between threads. This solution is better than using multi-database support because participants in a multi-database are required to fulfill a complex contract that is irrelevant to databases other than ZODB.

Other packages that maintain an external database connection should try out this scheme to see if it improves reliability or readability. Other packages should use the same ZODB Connection attribute name, foreign_connections, which should not cause any clashes, since OIDs can not be shared.

An implementation note: when ZODB objects are first created, they are not stored in any database, so there is no simple way for the object to get a foreign_connections dictionary. During that time, one way to hold a database connection is to temporarily fall back to the volatile attribute solution. That is what SolrIndex does (see the _v_temp_cm attribute).

py.test subprocess fixture callables for testing Twisted apps

TL;DR: fork and os._exit(0) the fixture callable

Getting py.test to test Twisted apps is supported to some extent, albeit somewhat briefly documented; there is also a py.test plugin to help with testing Twisted apps.

Tests usually require fixtures to be set up. Let's assume your tests require running something in a separate process, for example a server such as MySQL. What to do? You can use subprocess.Popen, or Twisted's spawnProcess, to spin up the database. Note that you should probably not use multiprocessing: it uses its own loop, for which there is no support in Twisted.

But what if it's Python code you want to run? Yes, you can put it into a module and run that module using the above methods. However, if you want to use a Python callable defined in your test module, you're out of luck: neither subprocess.Popen nor spawnProcess can run a Python callable in a subprocess.

In that case, you need os.fork. Simply run the callable in the child and, depending on your use case, either wait for it to complete in the parent or kill it at the end of the test. There is one gotcha, though, at least when using py.test: since you're forking a running test, py.test will now report two tests running and completing. The solution is to exit the child abnormally; a simple sys.exit() will raise an exception, but os._exit(0) does not.

Here's example code that spins up a simple test HTTP server for one request, and checks that the content fetched by the HTTP client matches what the HTTP server serves:

import os
import signal
import time
import BaseHTTPServer  # Python 2 standard library

import pytest
import treq
from turq import TurqHandler, parse_rules  # parse_rules is assumed to come from turq, as used below

def serve_request(host, port, rulecode):
   TurqHandler.rules = parse_rules(rulecode)
   server = BaseHTTPServer.HTTPServer((host,port), TurqHandler)
   server.handle_request()  # nothing more is needed for this one test

@pytest.inlineCallbacks
def test_something():
   pid = os.fork()

   # set up fixture in child
   if pid == 0:
      serve_request("127.0.0.1", 8080, "path('*').text('Hello')")
      os._exit(0)

   # proceed in parent (test), wait a bit first for the server fixture to come up
   time.sleep(0.5) 

   # make a request
   r = treq.get("http://127.0.0.1:8080")
 
   # kill server in child if we cannot connect
   try:
      response = yield r
   except Exception as exc:
      os.kill(pid, signal.SIGKILL)
      raise

   responsetext = yield treq.content(response)
   assert responsetext == "Hello"

I don’t know whether the same technique works with nose and/or Twisted Trial – let me know if you find out!

Comparison of py.test and nose for Python testing

I happened upon this useful comparison of py.test and nose on the testing-in-python mailing list, posted by Kenny (theotherwhitemeat at gmail), who spent some time evaluating testing tools for Python with a focus on py.test and nose. This article is a reformat of his mailing list post; I take no credit for the content. The list of references [1] … [13] is at the end of the article.

py.test

  • parallelizable: threading + SMP support [3] [4]
  • better documentation: [1] [3]
  • can generate script that does all py.test functions, obviating the need to distribute py.test [1][10]
  • integrate tests into your distribution (py.test --genscript=runtests.py), to create a standalone version of py.test [10]
  • can run nose, unittest, doctest style tests [1] [2]
  • test detection via globs, configurable [3]
  • test failure output more discernible than nose [3] [9]
  • easier, more flexible assertions than nose [8]
  • setup speed is sub-second different from nose, also test speeds can be managed via distribution (threads + SMP via xdist) [9] [11]
  • provides test isolation, if needed [9]
  • dependency injection via funcargs [10] [12] [13]
  • coverage plugin [11]

nose

  • documentation concerns, this may be outdated [3]
  • parallelization issues [3] [8]
  • slightly faster than py.test [4] [11]
  • test detection via regex (setup in cmdline or config file) [3]
  • can run unittest, doctest style tests [1] [2]
  • cannot run py.test style tests [1]

Conclusions

  • test formats are so similar, that nose or py.test can be used without much consequence until you’re writing more exotic tests (you could swap with little consequence)
  • nose is sub-second faster than py.test in execution time; this is typically unimportant
  • community seems to slightly favor py.test

References

  1. http://mail.scipy.org/pipermail/astropy/2011-July/001673.html
  2. http://pytest.org/latest/nose.html
  3. http://fedoraproject.org/wiki/User:Tflink/AutoQA_nose_pytest_comparison
  4. http://www.libcrack.so/2012/01/09/a-brief-analysis-of-python-testing-software/
  5. http://pythontesting.net/framework/nose/nose-introduction/
  6. http://wiki.python.org/moin/PythonTestingToolsTaxonomy
  7. http://docs.python-guide.org/en/latest/writing/tests.html#tools
  8. http://stackoverflow.com/questions/191673/preferred-python-unit-testing-framework
  9. http://thread.gmane.org/gmane.comp.python.testing.general/3748
  10. http://article.gmane.org/gmane.comp.python.testing.general/3752
  11. http://article.gmane.org/gmane.comp.python.testing.general/3765
  12. http://pytest.org/latest/funcargs.html
  13. http://holgerkrekel.net/2009/05/13/parametrizing-python-tests-generalized/