Bonobo Documentation Release 0.4.3 Romain Dorgueil Sep 27, 2017
Contents

1 Installation
    1.1 Create an ETL project
    1.2 Other installation options
    1.3 Windows support

2 First steps
    2.1 What is Bonobo?
    2.2 Tutorial
    2.3 What’s next?

3 Guides
    3.1 Concepts and best practices
    3.2 Third party integrations

4 References
    4.1 Bonobo API
    4.2 Config API
    4.3 Command-line
    4.4 Settings & Environment
    4.5 Examples

5 F.A.Q.
    5.1 Too long; didn’t read.
    5.2 What versions of python does bonobo support? Why not more?
    5.3 Can a graph contain another graph?
    5.4 How would one access contextual data from a transformation? Are there parameter injections like pytest’s fixtures?
    5.5 What is a plugin? Do I need to write one?
    5.6 Is there a difference between a transformation node and a regular python function or generator?
    5.7 Why did you include the word «marketing» in a commit message? Why is there a marketing-automation tag on the project? Isn’t marketing evil?
    5.8 Why not use <some library> instead?
    5.9 All those references to monkeys hurt my head. Bonobos are not monkeys.
    5.10 Who is behind this?
    5.11 Documentation seriously lacks X, there is a problem in Y.

6 Contributing
    6.1 tl;dr
    6.2 Code-related contributions (including tests and examples)
    6.3 Tools
    6.4 Guidelines
    6.5 License
    6.6 License for non lawyers

Python Module Index

CHAPTER 1

Installation

1.1 Create an ETL project

Creating a project and starting to write code should take less than a minute:

$ pip install --upgrade bonobo cookiecutter
$ bonobo init my-etl-project
$ bonobo run my-etl-project

Once you have bootstrapped a project, you can start editing the default example transformation in my-etl-project/main.py. Now, you can head to First steps.

1.2 Other installation options

1.2.1 Install from PyPI

You can install it directly from the Python Package Index (like we did above).

$ pip install bonobo

1.2.2 Install from source

If you want to install an unreleased version, you can use git urls with pip. This is useful when using bonobo as a dependency of your code and you want to try a forked version of bonobo with your software. You can use a git+http string in your requirements.txt file. However, the best option for development on bonobo is an editable install (see below).

$ pip install git+https://github.com/python-bonobo/bonobo.git@develop#egg=bonobo

1.2.3 Editable install

If you plan on making patches to Bonobo, you should install it as an “editable” package, which is a really great pip feature. Pip will clone your repository in a source directory and create a symlink for it in the site-package directory of your python interpreter.

$ pip install --editable git+https://github.com/python-bonobo/bonobo.git@master#egg=bonobo

Note: You can also use the -e flag instead of the long version.

If you can’t find the “source” directory, try running this:

$ python -c "import bonobo; print(bonobo.__path__)"

Another option is to have a “local” editable install, which means you create the clone by yourself and make an editable install from the local clone.

$ git clone git@github.com:python-bonobo/bonobo.git
$ cd bonobo
$ pip install --editable .

You can develop on this clone, but you probably want to add your own repository if you want to push code back and make pull requests. I usually name the git remote for the main bonobo repository “upstream”, and my own repository “origin”.

$ git remote rename origin upstream
$ git remote add origin git@github.com:hartym/bonobo.git
$ git fetch --all

Of course, replace my github username with the one you used to fork bonobo. You should be good to go!

1.3 Windows support

There are minor issues on the Windows platform, mostly due to the fact that bonobo was not developed by experienced Windows users.

We’re trying to look into that, but the energy available to provide serious support on Windows is very limited.

If you have experience in this domain and you’re willing to help, you’re more than welcome!

CHAPTER 2

First steps

2.1 What is Bonobo?

Bonobo is an ETL (Extract-Transform-Load) framework for python 3.5+. The goal is to define data transformations, with python code in charge of handling similarly shaped, independent lines of data.

Bonobo is not a statistical or data-science tool. If you’re looking for a data-analysis tool in python, use Pandas.

Bonobo is a lean manufacturing assembly line for data that lets you focus on the actual work instead of the plumbing (execution contexts, parallelism, error handling, console output, logging, ...).

Bonobo uses simple python and should be quick and easy to learn.

2.2 Tutorial

Note: Good documentation is not easy to write. We do our best to make it better and better.

Although all content here should be accurate, you may feel a lack of completeness, for which we plead guilty and apologize.

If you’re stuck, please come and ask on our slack channel, we’ll figure something out.

If you’re not stuck but had trouble understanding something, please consider contributing to the docs (via github pull requests).

2.2.1 Let’s get started!

To begin with Bonobo, you need to install it in a working python 3.5+ environment, and you’ll also need cookiecutter to bootstrap your project.

$ pip install bonobo cookiecutter

See Installation for more options.

Create an empty project

Your ETL code will live in ETL projects, which are basically a bunch of files, including python code, that bonobo can run.

$ bonobo init tutorial

This will create a tutorial directory (content description here).

To run this project, use:

$ bonobo run tutorial

Write a first transformation

Open tutorial/main.py, and delete all the code here.

A transformation can be whatever python can call. The simplest transformations are functions and generators.

Let’s write one:

def transform(x):
    return x.upper()

Easy.

Note: This function is very similar to str.upper(), which you can use directly.

Let’s write two more transformations for the “extract” and “load” steps. In this example, we’ll generate the data from scratch, and we’ll use stdout to “simulate” data-persistence.

def extract():
    yield 'foo'
    yield 'bar'
    yield 'baz'


def load(x):
    print(x)

Bonobo makes no difference between generators (yielding functions) and regular functions. It will, in all cases, iterate on things returned, and a normal function will just be seen as a generator that yields only once.
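
The idea can be illustrated without Bonobo at all. In the sketch below, iter_outputs is a hypothetical helper (not part of Bonobo's API, and not how Bonobo is implemented); it only shows what it means to treat a plain return value as a generator that yields once.

```python
import inspect


def iter_outputs(transformation, *args):
    """Call a transformation and always hand back an iterator of outputs.

    Hypothetical helper for illustration only; not Bonobo's implementation.
    """
    result = transformation(*args)
    if inspect.isgenerator(result):
        # Generator function: pass every yielded value along.
        yield from result
    else:
        # Regular function: treat the return value as a single output.
        yield result


def extract():
    yield 'foo'
    yield 'bar'


def transform(x):
    return x.upper()


outputs = [
    out
    for line in iter_outputs(extract)
    for out in iter_outputs(transform, line)
]
print(outputs)
```

Both kinds of callables end up consumed through the same loop, which is why the calling code never needs to know which style a transformation uses.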

Note: Once again, you should use the builtin print() directly instead of this load() function.

Create a transformation graph

Amongst other features, Bonobo will mostly help you there with the following:

• Execute the transformations in independent threads

• Pass the outputs of one thread to the inputs of other thread(s).

To do this, it needs to know what data-flow you want to achieve, and you’ll use a bonobo.Graph to describe it.

import bonobo

graph = bonobo.Graph(extract, transform, load)

if __name__ == '__main__':
    bonobo.run(graph)

[Figure: graph extract -> transform -> load]

Note: The if __name__ == '__main__': section is not required, unless you want to run it directly using the python interpreter.

Execute the job

Save tutorial/main.py and execute your transformation again:

$ bonobo run tutorial

This example is available in bonobo.examples.tutorials.tut01e01, and you can also run it as a module:

$ bonobo run -m bonobo.examples.tutorials.tut01e01

Rewrite it using builtins

There is a much simpler way to describe an equivalent graph:

import bonobo

graph = bonobo.Graph(
    [
        'foo',
        'bar',
        'baz',
    ],
    str.upper,
    print,
)

if __name__ == '__main__':
    bonobo.run(graph)

The extract() generator has been replaced by a list, as Bonobo will interpret non-callable iterables as a no-input generator.
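
As a rough illustration of that rule, the sketch below wraps a non-callable iterable into a zero-argument generator function. The as_source name is hypothetical and the code is not taken from Bonobo; it only mirrors the behavior described above.

```python
def as_source(node):
    """Return a zero-argument callable that produces the node's output lines.

    Hypothetical helper for illustration only; not Bonobo's implementation.
    """
    if callable(node):
        return node

    # Non-callable iterable (list, tuple, ...): wrap it in a generator function.
    def source():
        yield from node

    return source


source = as_source(['foo', 'bar', 'baz'])
lines = [str.upper(line) for line in source()]
print(lines)
```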

This example is also available in bonobo.examples.tutorials.tut01e02, and you can also run it as a module:

$ bonobo run -m bonobo.examples.tutorials.tut01e02

You can now jump to the next part (Working with files), or read a small summary of concepts and definitions introduced here below.

Takeaways

The bonobo.Graph class is used to represent a data-processing pipeline.

It can represent simple list-like linear graphs, like here, but it can also represent much more complex graphs, with forks and joins.

This is what the graph we defined looks like:

[Figure: graph iter(['foo', 'bar', 'baz']) -> str.upper -> print]

Transformations are simple python callables. Whatever can be called can be used as a transformation. Callables can either return or yield data to send it to the next step. Regular functions (using return) should be preferred if each call is guaranteed to return exactly one result, while generators (using yield) should be preferred if the number of output lines for a given input varies.

The Graph instance, or transformation graph, is executed using an ExecutionStrategy. You won’t use it directly, but bonobo.run() creates an instance of bonobo.ThreadPoolExecutorStrategy under the hood (the default strategy). The actual behavior of an execution depends on the strategy chosen, but the default should be fine for most cases.

Before actually executing the transformations, the ExecutorStrategy instance will wrap each component in an execution context, whose responsibility is to hold the state of the transformation. This keeps the transformations stateless, while allowing external state to be added if required. We’ll expand on this later.
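
To make the idea of a wrapper holding state more concrete, here is a toy sketch in plain python. NodeContext and number_lines are hypothetical names invented for this example; Bonobo's real execution contexts are richer and are not reproduced here.

```python
class NodeContext:
    """Toy wrapper that owns the state a transformation needs.

    Hypothetical class for illustration only; not Bonobo's execution context.
    """

    def __init__(self, transformation, **state):
        self.transformation = transformation
        self.state = state

    def handle(self, row):
        # The state is passed in as extra call parameters, so the
        # transformation itself keeps nothing between calls.
        return self.transformation(row, **self.state)


def number_lines(row, counter):
    counter['value'] += 1
    return '{}: {}'.format(counter['value'], row)


context = NodeContext(number_lines, counter={'value': 0})
numbered = [context.handle(row) for row in ['foo', 'bar']]
print(numbered)
```

The function stays a plain callable that can be tested on its own by passing any counter, while the wrapper decides where that counter lives.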

Concepts and definitions

• Transformation: a callable that takes input (as call parameters) and returns output(s), either as its return value or by yielding values (a.k.a returning a generator).

• Transformation graph (or Graph): a set of transformations tied together in a bonobo.Graph instance, which is a directed acyclic graph (or DAG).

• Node: a graph element, most probably a transformation in a graph.

• Execution strategy (or strategy): a way to run a transformation graph. Its responsibility is mainly to parallelize (or not) the transformations, on one or more processes and/or computers, and to set up the right queuing mechanism for transformations’ inputs and outputs.

• Execution context (or context): a wrapper around a node that holds the state for it. If the node needs state, there are tools available in bonobo to feed it to the transformation using additional call parameters, keeping transformations stateless.

Next

Time to jump to the second part: Working with files.

2.2.2 Working with files

Bonobo would be pointless if the aim was just to uppercase small lists of strings.

In fact, Bonobo should not be used if you don’t expect any gain from parallelization/distribution of tasks.

Some background. . .

Let’s take the following graph:

[Figure: graph in which A feeds B, and B feeds both C and D]

When run, the execution strategy wraps every component in a thread (assuming you’re using the default bonobo.strategies.ThreadPoolExecutorStrategy).

Bonobo will send each line of data in the input node’s thread (here, A). Now, each time A yields or returns something, it will be pushed on B’s input queue.Queue, and will be consumed by B’s thread. Meanwhile, A will continue to run, if it’s not done.

When there is more than one node linked as the output of a node (for example, with B, C, and D), the same thing happens, except that each result coming out of B will be sent to both C’s and D’s input queue.Queue.

One thing to keep in mind here is that as the objects are passed from thread to thread, you need to write “pure” transformations (see Pure transformations).

You generally don’t have to think about it. Just be aware that your nodes will run in parallel, and don’t worry too much about nodes running blocking operations, as they will run in parallel. As soon as a line of output is ready, the next nodes will start consuming it.
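
The mechanics described above can be sketched with nothing but the standard library. This is a simplified model, not Bonobo's code: run_node, drain and the END sentinel are names made up for the example, and the three functions standing in for B, C and D are arbitrary.

```python
import queue
import threading

END = object()  # sentinel marking the end of a stream


def run_node(func, inbox, outboxes):
    """Consume lines from inbox, apply func, push each result to every outbox."""
    while True:
        item = inbox.get()
        if item is END:
            for outbox in outboxes:
                outbox.put(END)
            return
        result = func(item)
        for outbox in outboxes:
            outbox.put(result)


def drain(q):
    """Collect everything from a queue until the END sentinel."""
    items = []
    while True:
        item = q.get()
        if item is END:
            return items
        items.append(item)


b_in, c_in, d_in = queue.Queue(), queue.Queue(), queue.Queue()
c_out, d_out = queue.Queue(), queue.Queue()

threads = [
    threading.Thread(target=run_node, args=(str.upper, b_in, [c_in, d_in])),  # B
    threading.Thread(target=run_node, args=(len, c_in, [c_out])),  # C
    threading.Thread(target=run_node, args=(lambda s: s[::-1], d_in, [d_out])),  # D
]
for thread in threads:
    thread.start()

# A: the input node feeds B's queue, then signals that it is done.
for line in ['foo', 'bar']:
    b_in.put(line)
b_in.put(END)

c_results, d_results = drain(c_out), drain(d_out)
for thread in threads:
    thread.join()

print(c_results, d_results)
```

Each result coming out of B is put on both C's and D's queues, so both branches see the full stream and process it independently.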

That being said, let’s manipulate some files.

Reading a file

There are a few component builders available in Bonobo that let you read from (or write to) files.

All readers work the same way. They need a filesystem to work with, and open a “path” they will read from.

• bonobo.CsvReader

• bonobo.FileReader

• bonobo.JsonReader

• bonobo.PickleReader

We’ll use a text file that was generated using Bonobo from the “liste-des-cafes-a-un-euro” dataset made available by Mairie de Paris under the Open Database License (ODbL). You can explore the original dataset.

You’ll need the “coffeeshops.txt” example dataset, available in Bonobo’s repository:

$ curl https://raw.githubusercontent.com/python-bonobo/bonobo/master/bonobo/examples/datasets/coffeeshops.txt > `python3 -c 'import bonobo; print(bonobo.get_examples_path("datasets/coffeeshops.txt"))'`

Note: The “example dataset download” step will be easier in the future.

https://github.com/python-bonobo/bonobo/issues/134

import bonobo

graph = bonobo.Graph(
    bonobo.FileReader('coffeeshops.txt'),
    print,
)


def get_services():
    return {'fs': bonobo.open_examples_fs('datasets')}


if __name__ == '__main__':
    bonobo.run(graph, services=get_services())

You can also run this example as a module (but you’ll still need the dataset. . . ):

$ bonobo run -m bonobo.examples.tutorials.tut02e01_read

Note: Don’t focus too much on the get_services() function for now. It is required, with this exact name, but we’ll get into that in a few minutes.

Writing to files

Let’s split each line of this file on the first comma and store a json file mapping coffee shop names to their addresses.

As with the readers, here are the classes available to write files:

• bonobo.CsvWriter

• bonobo.FileWriter

• bonobo.JsonWriter

• bonobo.PickleWriter

Let’s write a first implementation:

import bonobo


def split_one(line):
    return line.split(', ', 1)


graph = bonobo.Graph(
    bonobo.FileReader('coffeeshops.txt'),
    split_one,
    bonobo.JsonWriter(
        'coffeeshops.json', fs='fs.output', ioformat='arg0'
    ),
)


def get_services():
    return {
        'fs': bonobo.open_examples_fs('datasets'),
        'fs.output': bonobo.open_fs(),
    }


if __name__ == '__main__':
    bonobo.run(graph, services=get_services())

(run it with bonobo run -m bonobo.examples.tutorials.tut02e02_write or bonobo run myfile.py)

If you read the output file, you’ll see it misses the “map” part of the problem.
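
To see the gap in plain python, compare serializing the split rows as pairs with serializing them as a mapping. The sample lines below are made up for the example, and the exact layout JsonWriter produces is not reproduced here; this is only a sketch of the difference between the two shapes.

```python
import json

lines = [
    'Cafe A, 1 rue Exemple, 75001 Paris',  # made-up sample lines
    'Cafe B, 2 rue Exemple, 75002 Paris',
]

# maxsplit=1 splits on the first ', ' only, so commas in the address survive.
pairs = [line.split(', ', 1) for line in lines]
print(json.dumps(pairs))

# What we actually want: one JSON object mapping names to addresses.
mapping = {name: address for name, address in pairs}
print(json.dumps(mapping))
```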

Let’s extend bonobo.io.JsonWriter to finish the job:

import json

import bonobo


def split_one_to_map(line):
    k, v = line.split(', ', 1)
    return {k: v}


class MyJsonWriter(bonobo.JsonWriter):
    prefix, suffix = '{', '}'

    def write(self, fs, file, lineno, row):
        return bonobo.FileWriter.write(
            self, fs, file, lineno, json.dumps(row)[1:-1]
        )


graph = bonobo.Graph(
    bonobo.FileReader('coffeeshops.txt'),
    split_one_to_map,
    MyJsonWriter('coffeeshops.json', fs='fs.output', ioformat='arg0'),
)


def get_services():
    return {
        'fs': bonobo.open_examples_fs('datasets'),
        'fs.output': bonobo.open_fs(),
    }


if __name__ == '__main__':
    bonobo.run(graph, services=get_services())

(run it with bonobo run -m bonobo.examples.tutorials.tut02e03_writeasmap or bonobo run myfile.py)

It should produce a nice map.

We favored a somewhat hackish solution here, instead of constructing a map in python and then passing the whole to json.dumps(), because we want to work with streams. If you have to construct the whole data structure in python, you’ll lose a lot of bonobo’s benefits.
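
The streaming trick can be tried on its own with the standard library. In the sketch below, write_map_stream is a hypothetical function, and the way it separates rows with commas is an assumption made for the example rather than a description of how Bonobo's writers work.

```python
import io
import json


def write_map_stream(rows, out):
    """Write an iterable of one-key dicts as a single JSON object, row by row."""
    out.write('{')
    for lineno, row in enumerate(rows):
        if lineno:
            out.write(',\n')
        # json.dumps({'k': 'v'}) gives '{"k": "v"}'; dropping the first and
        # last characters leaves '"k": "v"', a fragment of the outer object.
        out.write(json.dumps(row)[1:-1])
    out.write('}')


def rows():
    # A generator: rows are produced one at a time, never held in a list.
    yield {'Cafe A': '1 rue Exemple'}
    yield {'Cafe B': '2 rue Exemple'}


buffer = io.StringIO()
write_map_stream(rows(), buffer)
print(buffer.getvalue())
```

Only one row is in memory at any time, yet the result parses as a single JSON object.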

Next

Time to write some more advanced transformations, with service dependencies: Configurables and Services.

2.2.3 Configurables and Services

Note: This section lacks completeness, sorry for that (but you can still read it!).

In the last section, we used a few new tools.

Class-based transformations and configurables

Bonobo is a bit dumb. If something is callable, it considers it can be used as a transformation, and it’s up to the user to provide callables that logically fit in a graph.

You can use plain python objects with a __call__() method, and it will just work.
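
As a minimal sketch of such an object (the Prefix class is made up for this example and does not use anything from Bonobo):

```python
class Prefix:
    """A plain Python object usable wherever a callable is expected."""

    def __init__(self, prefix):
        self.prefix = prefix

    def __call__(self, row):
        return self.prefix + ' ' + row


prefixer = Prefix('>>>')
print(callable(prefixer), prefixer('foo'))
```

An instance like prefixer could then be placed in a graph like any function, since all that matters is that it can be called.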

As a lot of transformations need common machinery, there are a few tools to quickly build transformations, most of them requiring your class to subclass bonobo.config.Configurable.

Configurables allow you to use the following features:

• You can add Options (using the bonobo.config.Option descriptor). Options can be positional or keyword-based, can have a default value, and will be consumed from the constructor arguments.

from bonobo.config import Configurable, Option


class PrefixIt(Configurable):
    prefix = Option(str, positional=True, default='>>>')

    def call(self, row):
        return self.prefix + ' ' + row


prefixer = PrefixIt('$')

• You can add Services (using the bonobo.config.Service descriptor). Services are a subclass of bonobo.config.Option, sharing the same basics, but specialized in the definition of “named services” that will be resolved at runtime (a.k.a for which we will provide an implementation at runtime). We’ll dive more into that in the next section.

from bonobo.config import Configurable, Option, Service


class HttpGet(Configurable):
    url = Option(default='https://jsonplaceholder.typicode.com/users')
    http = Service('http.client')

    def call(self, http):
        resp = http.get(self.url)

        for row in resp.json():
            yield row


http_get = HttpGet()

• You can add Methods (using the bonobo.config.Method descriptor). bonobo.config.Method is a subclass of bonobo.config.Option that lets you pass callable parameters, either to the class constructor, or using the class as a decorator.

from bonobo.config import Configurable, Method


class Applier(Configurable):
    apply = Method()

    def call(self, row):
        return self.apply(row)


@Applier
def Prefixer(self, row):
    return 'Hello, ' + row


prefixer = Prefixer()

• You can add ContextProcessors, which are an advanced feature we won’t introduce here. If you’re familiar with pytest, you can think of them as pytest fixtures, execution wise.

Services

The motivation behind services is mostly separation of concerns, testability and deployability.

Usually, your transformations will depend on services (like a filesystem, an http client, a database, a rest api, ...). Those services can very well be hardcoded in the transformations, but there are two main drawbacks:

• You won’t be able to change the implementation depending on the current environment (development laptop versus production servers, bug-hunting session versus execution, etc.)

• You won’t be able to test your transformations without testing the associated services.

To overcome those caveats of hardcoding things, we define Services in the configurable, which are basically string options of the service names, and we provide an implementation at the last moment possible.

There are two ways of providing implementations:

• Either file-wide, by providing a get_services() function that returns a dict of named implementations (we did so with filesystems in the previous step, Working with files)

• Or directory-wide, by providing a get_services() function in a specially named _services.py file.

The first is simpler if you only have one transformation graph in one file; the second lets you group coherent transformations together in a directory and share the implementations.

Let’s see how to use it, starting from the previous service example:

from bonobo.config import Configurable, Option, Service


class HttpGet(Configurable):
    url = Option(default='https://jsonplaceholder.typicode.com/users')
    http = Service('http.client')

    def call(self, http):
        resp = http.get(self.url)

        for row in resp.json():
            yield row

We defined an “http.client” service, that obviously should have a get() method, returning responses that have a json() method.

Let’s provide two implementations for that. The first one will be using requests, which coincidentally satisfies the described interface:

import bonobo
import requests


def get_services():
    return {
        'http.client': requests
    }


graph = bonobo.Graph(
    HttpGet(),
    print,
)

If you run this code, you should see some mock data returned by the webservice we called (assuming it’s up and you can reach it).

Now, the second implementation will replace that with a mock, used for testing purposes:

class HttpResponseStub:
    def json(self):
        return [
            {'id': 1, 'name': 'Leanne Graham', 'username': 'Bret', 'email': '[email protected]',
             'address': {'street': 'Kulas Light', 'suite': 'Apt. 556', 'city': 'Gwenborough',
                         'zipcode': '92998-3874', 'geo': {'lat': '-37.3159', 'lng': '81.1496'}},
             'phone': '1-770-736-8031 x56442', 'website': 'hildegard.org',
             'company': {'name': 'Romaguera-Crona',
                         'catchPhrase': 'Multi-layered client-server neural-net',
                         'bs': 'harness real-time e-markets'}},
            {'id': 2, 'name': 'Ervin Howell', 'username': 'Antonette', 'email': '[email protected]',
             'address': {'street': 'Victor Plains', 'suite': 'Suite 879', 'city': 'Wisokyburgh',
                         'zipcode': '90566-7771', 'geo': {'lat': '-43.9509', 'lng': '-34.4618'}},
             'phone': '010-692-6593 x09125', 'website': 'anastasia.net',
             'company': {'name': 'Deckow-Crist',
                         'catchPhrase': 'Proactive didactic contingency',
                         'bs': 'synergize scalable supply-chains'}},
        ]


class HttpStub:
    def get(self, url):
        return HttpResponseStub()


def get_services():
    return {
        'http.client': HttpStub()
    }


graph = bonobo.Graph(
    HttpGet(),
    print,
)

As the Graph definition stays exactly the same, you can easily substitute the _services.py file depending on your environment (how you do this is out of bonobo’s scope and heavily depends on your usual way of managing configuration files on different platforms).
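
The testability argument can be shown without Bonobo. The sketch below uses a plain function version of the HttpGet logic together with a fake client; FakeHttp, FakeResponse and the services dict are hypothetical stand-ins written for this example, not Bonobo objects.

```python
def http_get(http, url='https://jsonplaceholder.typicode.com/users'):
    """Function form of the HttpGet logic: the client is passed in, not imported."""
    resp = http.get(url)
    for row in resp.json():
        yield row


class FakeResponse:
    def json(self):
        return [{'id': 1, 'name': 'Leanne Graham'}, {'id': 2, 'name': 'Ervin Howell'}]


class FakeHttp:
    def __init__(self):
        self.requested = []

    def get(self, url):
        self.requested.append(url)  # record the call instead of hitting the network
        return FakeResponse()


# Hypothetical stand-in for what get_services() returns: names mapped to implementations.
services = {'http.client': FakeHttp()}

rows = list(http_get(services['http.client']))
print(rows)
```

Because the transformation only relies on get() and json(), the test needs no network access and can also check which URL was requested.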

Starting with bonobo 0.5 (not yet released), you will be able to use service injections with function-based transformations too, using the bonobo.config.requires decorator to mark a dependency.

from bonobo.config import requires


@requires('http.client')
def http_get(http):
    resp = http.get('https://jsonplaceholder.typicode.com/users')

    for row in resp.json():
        yield row

Read more

• Services and dependencies

• Config API

Next

Working with databases.

2.2.4 Working with databases

Databases (and especially SQL databases here) are not the focus of Bonobo, thus support for it is not (and will never be) included in the main package. Instead, working with databases is done using third party, well maintained and specialized packages, like SQLAlchemy, or other database access libraries from the python cheese shop.

Note: SQLAlchemy extension is not yet complete. Things may be not optimal, and some APIs will change. You can still try, of course.

Consider the following document as a “preview” (yes, it should work, yes it may break in the future).

Also, note that for early development stages, we explicitly support only PostgreSQL, although it may work well with any other database supported by SQLAlchemy.

First, read https://www.bonobo-project.org/with/sqlalchemy for instructions on how to install. You do need the bleeding edge version of bonobo and bonobo-sqlalchemy to make this work.

Requirements

Once you installed bonobo_sqlalchemy (read https://www.bonobo-project.org/with/sqlalchemy to use bleeding edge version), install the following additional packages:

$ pip install -U python-dotenv psycopg2 awesome-slugify

Those packages are not required by the extension, but python-dotenv will help us configure the database DSN, and psycopg2 is required by SQLAlchemy to connect to PostgreSQL databases. Also, we’ll use a slugifier to create unique identifiers for the database (maybe not what you’d do in the real world, but very much sufficient for example purpose).

Configure a database engine

Open your _services.py file and replace the code:

import bonobo, dotenv, logging, os
from bonobo_sqlalchemy.util import create_postgresql_engine

dotenv.load_dotenv(dotenv.find_dotenv())
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)


def get_services():
    return {
        'fs': bonobo.open_examples_fs('datasets'),
        'fs.output': bonobo.open_fs(),
        'sqlalchemy.engine': create_postgresql_engine(**{
            'name': 'tutorial',
            'user': 'tutorial',
            'pass': 'tutorial',
        })
    }

The create_postgresql_engine function is a tiny helper that builds the DSN from reasonable defaults, which you can override either by providing kwargs or with system environment variables. If you want to override something, open the .env file and add values for one or more of POSTGRES_NAME, POSTGRES_USER, POSTGRES_PASS, POSTGRES_HOST, POSTGRES_PORT. Please note that kwargs always take precedence over the environment, but you should prefer environment variables for anything that is not immutable from one platform to another.
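
The precedence order (defaults, then environment, then kwargs) can be sketched as follows. build_dsn is a hypothetical helper written for this example; its default values and DSN layout are assumptions, not the ones create_postgresql_engine actually uses.

```python
import os


def build_dsn(environ=None, **kwargs):
    """Assemble a PostgreSQL DSN from defaults, environment, then kwargs.

    Hypothetical helper: the default values below are assumptions made for
    this sketch, not the ones create_postgresql_engine actually uses.
    """
    environ = os.environ if environ is None else environ
    defaults = {'name': 'postgres', 'user': 'postgres', 'pass': '',
                'host': 'localhost', 'port': '5432'}
    config = {}
    for key, default in defaults.items():
        # Environment overrides the default, e.g. POSTGRES_HOST for 'host'.
        config[key] = environ.get('POSTGRES_' + key.upper(), default)
    # Keyword arguments override the environment.
    config.update(kwargs)
    return 'postgresql://{user}:{pass}@{host}:{port}/{name}'.format(**config)


dsn = build_dsn(
    {'POSTGRES_HOST': 'db.example.com', 'POSTGRES_NAME': 'ignored'},
    name='tutorial', user='tutorial', **{'pass': 'tutorial'}
)
print(dsn)
```

Here the host comes from the (simulated) environment, the port from the defaults, and the database name from kwargs even though the environment also sets one.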

Add database operation to the graph

Let’s create a tutorial/pgdb.py job:

import bonobo
import bonobo_sqlalchemy

from bonobo.examples.tutorials.tut02e03_writeasmap import graph, split_one_to_map

graph = graph.copy()
graph.add_chain(
    bonobo_sqlalchemy.InsertOrUpdate('coffeeshops'),
    _input=split_one_to_map
)

Notes here:

• We use the code from Working with files, which is bundled with bonobo in the bonobo.examples.tutorials package.

• We “fork” the graph, by creating a copy and appending a new “chain”, starting at a point that exists in the other graph.

• We use bonobo_sqlalchemy.InsertOrUpdate, whose role, in case it is not obvious, is to create database rows if they do not exist yet, or update the existing row, based on a “discriminant” criterion (by default, “id”).
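
The insert-or-update behavior itself can be sketched with the standard library. This example deliberately uses SQLite through the sqlite3 module instead of PostgreSQL and SQLAlchemy, and insert_or_update is a hypothetical function; it illustrates the semantics described in the last note, not the bonobo_sqlalchemy implementation.

```python
import sqlite3


def insert_or_update(conn, table, row, discriminant='id'):
    """Update the row matching the discriminant, or insert it if missing.

    Illustration only: table and column names are interpolated directly,
    which is acceptable for a sketch but not for untrusted input.
    """
    columns = list(row)
    found = conn.execute(
        'SELECT 1 FROM {} WHERE {} = ?'.format(table, discriminant),
        (row[discriminant],),
    ).fetchone()
    if found:
        assignments = ', '.join('{} = ?'.format(c) for c in columns)
        conn.execute(
            'UPDATE {} SET {} WHERE {} = ?'.format(table, assignments, discriminant),
            [row[c] for c in columns] + [row[discriminant]],
        )
    else:
        placeholders = ', '.join('?' for _ in columns)
        conn.execute(
            'INSERT INTO {} ({}) VALUES ({})'.format(table, ', '.join(columns), placeholders),
            [row[c] for c in columns],
        )


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE coffeeshops (id TEXT PRIMARY KEY, address TEXT)')
insert_or_update(conn, 'coffeeshops', {'id': 'cafe-a', 'address': '1 rue Exemple'})
insert_or_update(conn, 'coffeeshops', {'id': 'cafe-a', 'address': '2 rue Exemple'})
result = conn.execute('SELECT id, address FROM coffeeshops').fetchall()
print(result)
```

Running the same row through twice leaves a single record, which is what makes a job like this safe to re-run.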

If we run this transformation (with bonobo run tutorial/pgdb.py), we should get an error:

| File ".../lib/python3.6/site-packages/psycopg2/__init__.py", line 130, in connect
| conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
| sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL: database "tutorial" does not exist
|
|
| The above exception was the direct cause of the following exception:
|
| Traceback (most recent call last):
| File ".../bonobo-devkit/bonobo/bonobo/strategies/executor.py", line 45, in _runner
| node_context.start()
| File ".../bonobo-devkit/bonobo/bonobo/execution/base.py", line 75, in start
| self._stack.setup(self)
| File ".../bonobo-devkit/bonobo/bonobo/config/processors.py", line 94, in setup
| _append_to_context = next(_processed)
| File ".../bonobo-devkit/bonobo-sqlalchemy/bonobo_sqlalchemy/writers.py", line 43, in create_connection
| raise UnrecoverableError('Could not create SQLAlchemy connection: {}.'.format(str(exc).replace('\n', ''))) from exc
| bonobo.errors.UnrecoverableError: Could not create SQLAlchemy connection: (psycopg2.OperationalError) FATAL: database "tutorial" does not exist.

The database we requested does not exist. It is not the role of bonobo to do database administration, so there is no tool here to create either the database or the tables we want to use.

Create database and table

There are, however, tools in sqlalchemy to manage tables, so we’ll create the database ourselves and ask sqlalchemy to create the table:

$ psql -U postgres -h localhost

psql (9.6.1, server 9.6.3)

Type "help" for help.

postgres=# CREATE ROLE tutorial WITH LOGIN PASSWORD 'tutorial';
CREATE ROLE
postgres=# CREATE DATABASE tutorial WITH OWNER=tutorial TEMPLATE=template0 ENCODING='utf-8';
CREATE DATABASE

Now, let’s use a little trick and add this section to pgdb.py:

import sys
from sqlalchemy import Table, Column, String, Integer, MetaData

def main():
    from bonobo.commands.run import get_default_services
    services = get_default_services(__file__)
    if len(sys.argv) == 1:
        return bonobo.run(graph, services=services)
    elif len(sys.argv) == 2 and sys.argv[1] == 'reset':
        engine = services.get('sqlalchemy.engine')
        metadata = MetaData()

        coffee_table = Table(
            'coffeeshops',
            metadata,
            Column('id', String(255), primary_key=True),
            Column('name', String(255)),
            Column('address', String(255)),
        )

        metadata.drop_all(engine)
        metadata.create_all(engine)
    else:
        raise NotImplementedError('I do not understand.')

if __name__ == '__main__':
    main()

Note: We’re using a private API of bonobo here, which is unsatisfactory, discouraged and may change. Some way to get the service dictionary will be added to the public API in a future release of bonobo.

Now run:

$ python tutorial/pgdb.py reset

Database and table should now exist.

Format the data

Let’s prepare our data for the database, and change the .add_chain(...) call to do it prior to InsertOrUpdate(...):

from slugify import slugify_url

def format_for_db(row):
    name, address = list(row.items())[0]
    return {
        'id': slugify_url(name),
        'name': name,
        'address': address,
    }

# ...

graph = graph.copy()
graph.add_chain(
    format_for_db,
    bonobo_sqlalchemy.InsertOrUpdate('coffeeshops'),
    _input=split_one_to_map
)

Run!

You can now run the script (either with bonobo run tutorial/pgdb.py or directly with the python interpreter, as we added a “main” section) and the dataset should be inserted in your database. If you run it again, no new rows are created.

Note that as we forked the graph from Working with files, the transformation also writes the data to coffeeshops.json, as before.

2.3 What’s next?

2.3.1 Read a few examples

• Examples

2.3.2 Read about best development practices

• Guides

• Pure transformations

2.3.3 Read about integrating external tools with bonobo

• Bonobo with Docker: run transformation graphs in isolated containers.

• Bonobo with Jupyter: run transformations within jupyter notebooks.

• Bonobo with Selenium: crawl the web using a real browser and work with the gathered data.

• Bonobo with SQLAlchemy: everything you need to interact with SQL databases.

CHAPTER 3

Guides

3.1 Concepts and best practices

There are a few things that you should know while writing transformation graphs with bonobo.

3.1.1 Pure transformations

The nature of components, and how data flows from one to another, can be a bit tricky. Fortunately, they are easy to write with a few hints.

The major problem is that one message (underlying implementation: bonobo.structs.bags.Bag) can go through more than one component, and at the same time. To be safe, you would tend to copy.copy() everything between two calls to two different components, but that’s very expensive.

Instead, we chose the opposite: copies are never made, and you should not modify the inputs of your component in place before yielding them. In practice, this mostly means that you want to recreate dicts and lists before yielding (or returning) them. Numeric values, strings and tuples being immutable in python, modifying a variable of one of those types already produces a different instance.

Examples will be shown with return statements, of course you can do the same with yield statements in generators.

Numbers

In python, numbers are immutable. So you can’t be wrong with numbers. All of the following are correct.

def do_your_number_thing(n: int) -> int:
    return n

def do_your_number_thing(n: int) -> int:
    return n + 1

def do_your_number_thing(n: int) -> int:
    # correct, but bad style
    n += 1
    return n

The same is true with other numeric types, so don’t be shy.

Tuples

Tuples are immutable, so you risk nothing.

def do_your_tuple_thing(t: tuple) -> tuple:
    return ('foo', ) + t

def do_your_tuple_thing(t: tuple) -> tuple:
    return t + ('bar', )

def do_your_tuple_thing(t: tuple) -> tuple:
    # correct, but bad style
    t += ('baaaz', )
    return t

Strings

You know the drill, strings are immutable.

def do_your_str_thing(t: str) -> str:
    return 'foo ' + t + ' bar'

def do_your_str_thing(t: str) -> str:
    return ' '.join(('foo', t, 'bar', ))

def do_your_str_thing(t: str) -> str:
    return 'foo {} bar'.format(t)

If you’re using python 3.6+ you can use f-strings, but the core bonobo libraries won’t use them, to stay 3.5 compatible.

Dicts

So, now it gets interesting. Dicts are mutable. It means that you can mess things up if you’re not cautious.

For example, doing the following may cause unexpected problems:

def mutate_my_dict_like_crazy(d: dict) -> dict:
    # Bad! Don't do that!
    d.update({
        'foo': compute_something()
    })
    # Still bad! Don't mutate the dict!
    d['bar'] = compute_anotherthing()
    return d

The problem is easy to understand: as Bonobo won’t make copies of your dict, the same dict will be passed along the transformation graph, and mutations will be seen by components downstream of the output (and also upstream). Let’s see a more obvious example of something you should not do:

def mutate_my_dict_and_yield() -> dict:
    d = {}
    for i in range(100):
        # Bad! Don't do that!
        d['index'] = i
        yield d

Here, the same dict is yielded in each iteration, and its state when the next component in the chain is called is undetermined (how many mutations happened since the yield? Hard to tell...).
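The effect can be observed without bonobo at all. In the sketch below, list() stands in for a downstream component that only looks at the rows after the generator has moved on; the function names are made up for the demonstration.

```python
def mutate_and_yield():
    d = {}
    for i in range(3):
        d['index'] = i  # mutates the single shared dict
        yield d


def fresh_and_yield():
    for i in range(3):
        yield {'index': i}  # a new dict for every row


# A consumer that reads the rows late sees only the final state of the shared dict.
shared = list(mutate_and_yield())
fresh = list(fresh_and_yield())

print(shared)  # three references to the very same dict
print(fresh)   # three distinct dicts
```

With threads involved, the moment at which a consumer reads the shared dict is not even predictable, which is why the second form is the safe one.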

Now let’s see how to do it correctly:

def new_dicts_like_crazy(d: dict) -> dict:
    # Creating a new dict is correct.
    return {
        **d,
        'foo': compute_something(),
        'bar': compute_anotherthing(),
    }

def new_dict_and_yield() -> dict:
    d = {}
    for i in range(100):
        # Different dict each time.
        yield {
            'index': i
        }

I hear you think «Yeah, but if I create like millions of dicts...».

Let’s say we chose the opposite way and copied the dict outside the transformation (in fact, it’s what we did in bonobo’s ancestor). This means you would also create the same number of dicts; the difference is that you wouldn’t even notice it. Also, if you want to yield the same dict 1 million times, going “pure” makes it efficient (you’ll just yield the same object 1 million times) while going “copy crazy” will create 1 million objects.

Using dicts like this will create a lot of dicts, but also free them as soon as all the future components that take this dict as input are done. Also, one important thing to note is that most primitive data structures in python are immutable, so creating a new dict will of course create a new envelope, but the unchanged objects inside won’t be duplicated.

Last thing, copies made in the “pure” approach are explicit, and usually, explicit is better than implicit.
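The “new envelope, same contents” point made above can be checked directly. The sketch below uses a made-up add_total transformation; only the outer dict is recreated, while the values it holds are shared with the input.

```python
def add_total(row: dict) -> dict:
    # Build a new dict; the values of the input are reused, not copied.
    return {**row, 'total': sum(row['prices'])}


original = {'name': 'espresso', 'prices': (2, 3)}
result = add_total(original)

print(result is original)                      # a different envelope
print(result['prices'] is original['prices'])  # the same inner object
print('total' in original)                     # the input was left untouched
```

Because the copy is shallow, this stays cheap; it is only safe as long as the shared inner values are themselves immutable, or are never mutated in place.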

3.1.2 Transformations

Here are some guidelines on how to write transformations, to avoid the convention-jungle that could happen without a few rules.

Naming conventions

The naming convention used is the following.

If you’re naming something which is an actual transformation, that can be used directly as a graph node, then use underscores and lowercase names:

# instance of a class based transformation
filter = Filter(...)

# function based transformation
def uppercase(s: str) -> str:
    return s.upper()

If you’re naming something which is configurable, that will need to be instantiated or called to obtain something that can be used as a graph node, then use camelcase names:

import functools

from bonobo.config import Configurable, Option

# configurable
class ChangeCase(Configurable):
    modifier = Option(default='upper')

    def call(self, s: str) -> str:
        return getattr(s, self.modifier)()

# transformation factory
def Apply(method):
    @functools.wraps(method)
    def apply(s: str) -> str:
        return method(s)
    return apply

# result is a graph node candidate
upper = Apply(str.upper)

Function based transformations

The most basic transformations are function-based, which means that you define a function, and it will be used directly in a graph.

def get_representation(row):
    return repr(row)

graph = bonobo.Graph(
    [...],
    get_representation,
)

It does not allow any configuration, but if it’s an option, prefer it as it’s simpler to write.

Class based transformations

A lot of logic is a bit more complex, and you’ll want to use classes to define some of your transformations.

The bonobo.config.Configurable class gives you a few toys to write configurable transformations.

Options

class bonobo.config.Option(type=None, *, required=False, positional=False, default=None)
An Option is a descriptor for Configurable’s parameters.

type
Option type allows to provide a callable used to cast, clean or validate the option value. If not provided, or None, the option’s value will be the exact value user provided.

(default: None)

required
If an option is required, an error will be raised if no value is provided (at runtime). If it is not, option will have the default value if user does not override it at runtime.

(default: False)

positional
If this is true, it’ll be possible to provide the option value as a positional argument. Otherwise, it must be provided as a keyword argument.

(default: False)

default
Default value for non-required options.

(default: None)

Example:

from bonobo.config import Configurable, Option

class Example(Configurable):
    title = Option(str, required=True, positional=True)
    keyword = Option(str, default='foo')

    def call(self, s):
        return self.title + ': ' + s + ' (' + self.keyword + ')'

example = Example('hello', keyword='bar')
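To give an intuition of what “descriptor” means here, the following sketch reproduces the type, required and default behaviours with a tiny stand-in. MiniOption and MiniConfigurable are hypothetical names; this is not bonobo’s implementation, positional options are left out, and it needs python 3.6+ for __set_name__.

```python
class MiniOption:
    """Hypothetical stand-in for an option descriptor (not bonobo's code)."""

    def __init__(self, type=None, *, required=False, default=None):
        self.type = type
        self.required = required
        self.default = default
        self.name = None

    def __set_name__(self, owner, name):
        # Called by python when the class body is evaluated.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.name, self.default)

    def __set__(self, instance, value):
        # The "type" callable casts or validates the provided value.
        instance.__dict__[self.name] = self.type(value) if self.type else value


class MiniConfigurable:
    def __init__(self, **kwargs):
        # Only options declared on the class itself are considered in this sketch.
        for name, option in vars(type(self)).items():
            if not isinstance(option, MiniOption):
                continue
            if name in kwargs:
                setattr(self, name, kwargs[name])
            elif option.required:
                raise TypeError('missing required option: ' + name)


class Example(MiniConfigurable):
    title = MiniOption(str, required=True)
    repeat = MiniOption(int, default=1)

    def call(self, s):
        return (self.title + ': ' + s) * self.repeat
```

Here Example(title='hello', repeat='2') stores repeat as the integer 2 because the value goes through int, and Example() raises a TypeError because title is required.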

Services

class bonobo.config.Service(name)
A Service is a special kind of option defining a dependency to something that will be resolved at runtime, using an identifier. For example, you can create a Configurable that has a “database” Service in its attributes, meaning that you’ll define which database to use, by name, when creating the instance of this class, then provide an implementation when running the graph using a strategy.

Example:

import bonobo

class QueryExtractor(bonobo.Configurable):
    database = bonobo.Service(default='sqlalchemy.engine.default')

graph = bonobo.Graph(
    QueryExtractor(database='sqlalchemy.engine.secondary'),
    *more_transformations,
)

if __name__ == '__main__':
    engine = create_engine('... dsn ...')
    bonobo.run(graph, services={
        'sqlalchemy.engine.secondary': engine
    })

The main goal is not to tie transformations to actual dependencies, so the same can be run in different contexts (stages like preprod, prod, or tenants like client1, client2, or anything you want).

name
Service name will be used to retrieve the implementation at runtime.

Methods

class bonobo.config.Method
A Method is a special callable-valued option, that can be used in three different ways (but for the same purpose).

• Like a normal option, the value can be provided to the Configurable constructor.

>>> from bonobo.config import Configurable, Method

>>> class MethodExample(Configurable):
...     handler = Method()

>>> example1 = MethodExample(handler=str.upper)

• It can be used by a child class that overrides the Method with a normal method.

>>> class ChildMethodExample(MethodExample):
...     def handler(self, s: str):
...         return s.upper()

>>> example2 = ChildMethodExample()

• Finally, it also enables the class to be used as a decorator, to generate a subclass providing the Method avalue.

>>> @MethodExample
... def OtherChildMethodExample(s):
...     return s.upper()

>>> example3 = OtherChildMethodExample()

ContextProcessors

class bonobo.config.ContextProcessor(func)
A ContextProcessor is a kind of transformation decorator that can set up and tear down a transformation and runtime related dependencies, at the execution level.

It works like a yielding context manager, and is the recommended way to set up and tear down objects you’ll need in the context of one execution. It’s the way to overcome the stateless nature of transformations.

The yielded values will be passed as positional arguments to the next context processors (order does matter), and finally to the __call__ method of the transformation.

Warning: this may change for a similar but simpler implementation, don’t rely too much on it (yet).

Example:

>>> from bonobo.config import Configurable, ContextProcessor
>>> from bonobo.util.objects import ValueHolder

>>> class Counter(Configurable):
...     @ContextProcessor
...     def counter(self, context):
...         yield ValueHolder(0)
...
...     def __call__(self, counter, *args, **kwargs):
...         counter += 1
...         yield counter.get()
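The “yielding context manager” idea can be illustrated with a plain generator. The sketch below is only an analogy under stated assumptions: run_with_processor is a hypothetical driver, not part of bonobo, and it shows how code before the yield acts as setup, the yielded value is handed to the transformation, and code after the yield acts as teardown.

```python
def run_with_processor(processor, transformation, rows):
    """Drive a generator-based processor around a transformation (hypothetical)."""
    gen = processor()
    resource = next(gen)  # setup: run the processor up to its yield
    try:
        return [transformation(resource, row) for row in rows]
    finally:
        for _ in gen:     # teardown: resume the processor after its yield
            pass


events = []


def with_counter():
    events.append('setup')
    state = {'count': 0}
    yield state
    events.append('teardown')


def count_rows(state, row):
    # The state lives in the execution context, not in the transformation.
    state['count'] += 1
    return (state['count'], row)


results = run_with_processor(with_counter, count_rows, ['a', 'b'])
```

The transformation itself keeps no state between calls; everything it needs for one execution is created by the processor and discarded when the execution ends.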

3.1.3 Services and dependencies

Last-Modified 20 may 2017

You’ll probably want to use external systems within your transformations. Those systems may include databases, APIs (using http, for example), filesystems, etc.

You can start by hardcoding those services. That does the job, at first.

If you’re going a little further than that, you’ll feel limited, for a few reasons:

• Hardcoded and tightly linked dependencies make your transformations hard to test, and hard to reuse.

• Processing data on your laptop is great, but being able to do it on different target systems (or stages), in different environments, is more realistic. You’ll want to configure a different database on a staging environment, preprod environment or production system. Maybe you have similar systems for different clients and want to select the system at runtime. Etc.

Service injection

To solve this problem, we introduce a light dependency injection system. It allows you to define named dependencies in your transformations, and provide an implementation at runtime.

Class-based transformations

To define a service dependency in a class-based transformation, use bonobo.config.Service, a special descriptor (and subclass of bonobo.config.Option) that will hold the service names and act as a marker for runtime resolution of service instances.

Let’s define such a transformation:

from bonobo.config import Configurable, Service

class JoinDatabaseCategories(Configurable):
    database = Service('primary_sql_database')

    def __call__(self, database, row):
        return {
            **row,
            'category': database.get_category_name_for_sku(row['sku'])
        }

This piece of code tells bonobo that your transformation expects a service called “primary_sql_database”, that will be injected into your calls under the parameter name “database”.

Function-based transformations

No implementation yet, but expect something similar to the CBT API, maybe using a @Service(...) decorator. See issue #70.

Provide implementation at run time

Let’s see how to execute it:

import bonobo

graph = bonobo.Graph(
    *before,
    JoinDatabaseCategories(),
    *after,
)

if __name__ == '__main__':
    bonobo.run(
        graph,
        services={
            'primary_sql_database': my_database_service,
        }
    )

A dictionary, or dictionary-like, “services” named argument can be passed to the bonobo.run() helper. The “dictionary-like” part is the real keyword here. Bonobo is not a DIC library, and won’t become one. So the implementation provided is pretty basic, and feature-less. But you can use much more evolved libraries instead of the provided stub, and as long as it works the same (i.e. implements a dictionary-like interface), the system will use it.
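As an illustration of what a “dictionary-like” container could look like, here is a sketch of a mapping that builds each service on first access. LazyServices and make_database are hypothetical names, and which mapping methods bonobo actually calls on the object is an assumption that has not been checked against the library.

```python
from collections.abc import Mapping


class LazyServices(Mapping):
    """Read-only mapping that instantiates each service on first access."""

    def __init__(self, factories):
        self._factories = dict(factories)
        self._instances = {}

    def __getitem__(self, name):
        if name not in self._instances:
            # Raises KeyError for unknown names, as a dict would.
            self._instances[name] = self._factories[name]()
        return self._instances[name]

    def __iter__(self):
        return iter(self._factories)

    def __len__(self):
        return len(self._factories)


created = []


def make_database():
    created.append('db')
    return {'sku-1': 'coffee'}  # stand-in for a real database client


services = LazyServices({'primary_sql_database': make_database})
```

Inheriting from Mapping provides get(), keys(), items() and the in operator for free, so the object behaves like a read-only dict while deferring the creation of expensive resources.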

Solving concurrency problems

If a service cannot be used by more than one thread at a time, either because it’s just not threadsafe, or because it requires careful ordering of the calls made (APIs that include nonces, or that work on results returned by previous calls, are usually good candidates), you can use the bonobo.config.Exclusive context processor to lock the use of a dependency for a time period.

from bonobo.config import Exclusive

def t1(api):
    with Exclusive(api):
        api.first_call()
        api.second_call()
        # ... etc
        api.last_call()
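The locking idea can be sketched with the standard threading module. The exclusive helper below is a hypothetical stand-in that keeps one lock per service object; it is not the implementation of bonobo.config.Exclusive, and the Api class is made up for the demonstration.

```python
import threading
from contextlib import contextmanager

_locks = {}
_locks_guard = threading.Lock()


@contextmanager
def exclusive(service):
    """Hold a per-object lock while the block runs (hypothetical helper).

    Locks are keyed by id(service), which is only safe while the service
    object stays alive.
    """
    with _locks_guard:
        lock = _locks.setdefault(id(service), threading.RLock())
    with lock:
        yield service


class Api:
    def __init__(self):
        self.calls = []

    def first_call(self, who):
        self.calls.append((who, 'first'))

    def last_call(self, who):
        self.calls.append((who, 'last'))


def worker(api, who):
    with exclusive(api):
        api.first_call(who)
        api.last_call(who)


api = Api()
threads = [threading.Thread(target=worker, args=(api, n)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whatever order the threads run in, each “first” call is immediately followed by the “last” call of the same worker, because no other thread can enter the block in between.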

Service configuration (to be decided and implemented)

• There should be a way to configure default service implementation for a python file, a directory, a project . . .

• There should be a way to override services when running a transformation.

• There should be a way to use environment for service configuration.

Future and proposals

This is the first proposed implementation and it will evolve, but it looks a lot like how we used bonobo’s ancestor in production.

May or may not happen, depending on discussions.

• Singleton or prototype based injection (to use spring terminology, see https://www.tutorialspoint.com/spring/spring_bean_scopes.htm), allowing smart factory usage and efficient sharing of resources.

• Lazily resolved parameters, possibly overridden by command line or environment, so you can for example override the database DSN or target filesystem on command line (or with shell environment).

• Pool based locks that ensure that only one (or n) transformations are using a given service at the same time.

• Simple config implementation, using a python file for config (ex: bonobo run ... --services=services_prod.py).

• Default configuration for services, using an optional callable (def get_services(args): ...). Maybe tie default configuration to graph, but not really a fan because this is unrelated to graph logic.

• Default implementation for a service in a transformation or in the descriptor. Maybe not a good idea, because it tends to push forward multiple instances of the same thing, but we maybe...

A few ideas on how it can be implemented, from the user perspective.

# using call
http = Service('http.client')(requests)

# using more explicit call
http = Service('http.client').set_default_impl(requests)

# using a decorator
@Service('http.client')
def http(self, services):
    import requests
    return requests

# as a default in a subclass of Service
class HttpService(Service):
    def get_default_impl(self, services):
        import requests
        return requests

# ... then use it as another service
http = HttpService('http.client')

This is under development, let us know what you think (slack may be a good place for this). The basics already work, and you can try it.

Read more

• See https://github.com/hartym/bonobo-sqlalchemy/blob/work-in-progress/bonobo_sqlalchemy/writers.py#L19 for example usage (work in progress).

3.2 Third party integrations

There are a few bonobo extensions that ease the use of the library with third party tools. Each integration is available as an optional extra dependency, and the maturity stage of each extension varies.

3.2.1 Bonobo with Docker

Todo: The bonobo-docker package is at a very alpha stage, and things will change. This section is here to give a brief overview but is neither complete nor definitive.

Read the introduction: https://www.bonobo-project.org/with/docker

3.2.2 Bonobo with Jupyter

There is a builtin plugin that integrates (kind of minimalistically, for now) bonobo within jupyter notebooks, so you can read the execution status of a graph within a nice (ok not so nice) html/javascript widget.

See https://github.com/jupyter-widgets/widget-cookiecutter for the base template used.

Installation

Install bonobo with the jupyter extra:

pip install bonobo[jupyter]

Install the jupyter extension:

jupyter nbextension enable --py --sys-prefix widgetsnbextension
jupyter nbextension enable --py --sys-prefix bonobo.ext.jupyter

Development

You should favor yarn over npm to install node packages. If you prefer to use npm, it’s up to you to adapt the code.

To install the widget for development, make sure you’re using an editable install of bonobo (see install document):

jupyter nbextension install --py --symlink --sys-prefix bonobo.ext.jupyter
jupyter nbextension enable --py --sys-prefix bonobo.ext.jupyter

If you want to change the javascript, you should run webpack in watch mode in some terminal:

cd bonobo/ext/jupyter/js
yarn install
./node_modules/.bin/webpack --watch

To compile the widget into a distributable version (which gets packaged on PyPI when a release is made), just run webpack:

./node_modules/.bin/webpack

3.2.3 Bonobo with Selenium

Todo: The bonobo-selenium package is at a very alpha stage, and things will change. This section is here to give a brief overview but is neither complete nor definitive.

Writing web crawlers with Bonobo and Selenium is easy.

First, install bonobo-selenium:

$ pip install bonobo-selenium

The idea is to have one callable crawl one thing and delegate drill downs to callables further away in the chain.

An example chain could be:

login paginate list details ExcelWriter(...)

Where each step would do the following:

• login() is in charge of opening an authenticated session in the browser.

• paginate() opens each page of a fictitious list and passes it on to the next step.

• list() takes every list item and yields it.

• details() extracts the data you’re interested in.

• . . . and the writer saves it somewhere.

Installation

Overview

Details

3.2.4 Bonobo with SQLAlchemy

Todo: The bonobo-sqlalchemy package is at a very alpha stage, and things will change. This section is here to give a brief overview but is neither complete nor definitive.

Read the introduction: https://www.bonobo-project.org/with/sqlalchemy

Installation

Overview

Details

CHAPTER 4

References

Reference documents of all stable APIs and modules. If something is not here, please be careful about using it as it means that the API is not yet 1.0-proof.

4.1 Bonobo API

The Bonobo API, available directly under the bonobo package, contains all the tools you need to get started with bonobo.

4.1.1 The bonobo package

Bonobo data-processing toolkit main module.

bonobo.run(graph, strategy=None, plugins=None, services=None)
Main entry point of bonobo. It takes a graph and creates all the necessary plumbing around to execute it.

The only necessary argument is a Graph instance, containing the logic you actually want to execute.

By default, this graph will be executed using the “threadpool” strategy: each graph node will be wrapped in a thread, and executed in a loop until there is no more input to this node.

You can provide plugin factory objects in the plugins list; this function will add the necessary plugins for interactive console execution and jupyter notebook execution if it detects correctly that it runs in this context.

You’ll probably want to provide a services dictionary mapping service names to service instances.

Parameters

• graph (Graph) – The Graph to execute.

• strategy (str) – The bonobo.strategies.base.Strategy to use.

• plugins (list) – The list of plugins to enhance execution.

• services (dict) – The implementations of services this graph will use.

Return bonobo.execution.graph.GraphExecutionContext
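The “one thread per node, looping until there is no more input” description of the default strategy can be pictured with queues from the standard library. This is a simplified sketch for a linear chain only; run_node, run_chain and the END sentinel are hypothetical and do not reflect how bonobo’s executor is written.

```python
import queue
import threading

END = object()  # sentinel telling a node that its input is exhausted


def run_node(func, inbox, outbox):
    """Loop on the node's input until END, forwarding everything it yields."""
    while True:
        item = inbox.get()
        if item is END:
            outbox.put(END)
            return
        for result in func(item):
            outbox.put(result)


def run_chain(source, *nodes):
    """Run each node in its own thread, linked by FIFO queues."""
    queues = [queue.Queue() for _ in range(len(nodes) + 1)]
    threads = [
        threading.Thread(target=run_node, args=(node, queues[i], queues[i + 1]))
        for i, node in enumerate(nodes)
    ]
    for t in threads:
        t.start()
    for item in source:
        queues[0].put(item)
    queues[0].put(END)
    for t in threads:
        t.join()
    out = []
    while True:
        item = queues[-1].get()
        if item is END:
            return out
        out.append(item)


def double(n):
    yield n * 2


def only_multiples_of_four(n):
    if n % 4 == 0:
        yield n


results = run_chain(range(5), double, only_multiples_of_four)
```

Since every node runs concurrently with its neighbours, a value handed to the next node may still be referenced by the previous one, which is one more reason to avoid mutating rows in place.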

class bonobo.Bag(*args, _flags=None, _parent=None, **kwargs)
Bases: object

Bags are simple data structures that hold arguments and keyword arguments together, that may be applied to a callable.

Example:

>>> from bonobo import Bag
>>> def myfunc(foo, *, bar):
...     print(foo, bar)
...
>>> bag = Bag('foo', bar='baz')
>>> bag.apply(myfunc)
foo baz

A bag can inherit another bag, allowing to override only a few arguments without touching the parent.

Example:

>>> bag2 = Bag(bar='notbaz', _parent=bag)
>>> bag2.apply(myfunc)
foo notbaz

apply(func_or_iter, *args, **kwargs)

args

extend(*args, **kwargs)

flags

get()
Get a 2 element tuple of this bag’s args and kwargs.

Returns tuple

classmethod inherit(*args, **kwargs)

kwargs

set_parent(parent)

class bonobo.Graph(*chain)
Bases: object

Represents a directed graph of nodes.

add_chain(*nodes, _input=<Begin>, _output=None, _name=None)
Add a chain in this graph.

add_node(c)
Add a node without connections in this graph and returns its index.

outputs_of(idx, create=False)
Get a set of the outputs for a given node index.

topologically_sorted_indexes
Iterate in topological order, based on networkx’s topological_sort() function.

class bonobo.Token(name)
Bases: object

Factory for signal oriented queue messages or other token types.

bonobo.create_strategy(name=None)
Create a strategy, or just returns it if it’s already one.

Parameters name –

Returns Strategy

bonobo.open_fs(fs_url=None, *args, **kwargs)
Wraps fs.open_fs() function with a few candies.

Parameters

• fs_url (str) – A filesystem URL

• parse_result (ParseResult) – A parsed filesystem URL.

• writeable (bool) – True if the filesystem must be writeable.

• create (bool) – True if the filesystem should be created if it does not exist.

• cwd (str) – The current working directory (generally only relevant for OS filesystems).

• default_protocol (str) – The protocol to use if one is not supplied in the FS URL (defaults to "osfs").

Returns FS object

class bonobo.CsvReader(*args, **kwargs)
Bases: bonobo.nodes.io.base.IOFormatEnabled, bonobo.nodes.io.file.FileReader, bonobo.nodes.io.csv.CsvHandler

Reads a CSV and yields the values as dicts.

skip
The amount of lines to skip before it actually yields output.

csv_headers

read(fs, file, headers)

class bonobo.CsvWriter(*args, **kwargs)
Bases: bonobo.nodes.io.base.IOFormatEnabled, bonobo.nodes.io.file.FileWriter, bonobo.nodes.io.csv.CsvHandler

write(fs, file, lineno, writer, headers, *args, **kwargs)

writer
A ContextProcessor; see bonobo.config.ContextProcessor (section 4.2) for the full description and example.

class bonobo.FileReader(*args, **kwargs)
Bases: bonobo.nodes.io.base.Reader, bonobo.nodes.io.base.FileHandler

Component factory for file-like readers.

On its own, it can be used to read a file and yield one row per line, trimming the “eol” character at the end if present. Extending it is usually the right way to create more specific file readers (like json, csv, etc.).

mode
An Option; see bonobo.config.Option (section 4.2) for the available parameters.

read(fs, file)
Read the given file and yield one row per line, with the trailing “eol” character removed if present.
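The line-per-row behaviour described for FileReader can be pictured with a few lines of plain Python. This is only a sketch of the idea; `read_lines` is a hypothetical helper, not the method Bonobo exposes.

```python
import io


def read_lines(file, eol="\n"):
    """Yield one row per line, trimming a trailing eol if present."""
    for line in file:
        if eol and line.endswith(eol):
            yield line[:-len(eol)]
        else:
            # The last line of a file may lack the end-of-line marker.
            yield line


lines = list(read_lines(io.StringIO("alpha\nbeta\ngamma")))
```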

class bonobo.FileWriter(*args, **kwargs)
Bases: bonobo.nodes.io.base.Writer, bonobo.nodes.io.base.FileHandler

Component factory for file or file-like writers.

On its own, it can be used to write to a file, one line per row that comes into this component. Extending it is usually the right way to create more specific file writers (like json, csv, etc.).

lineno
A ContextProcessor; see bonobo.config.ContextProcessor (section 4.2) for the full description and example.

mode
An Option; see bonobo.config.Option (section 4.2) for the available parameters.

write(fs, file, lineno, line)
Write a row on the next line of the file opened in context.
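One plausible reading of the `lineno` argument is that the writer tracks how many lines it has written so that it can emit the end-of-line string as a prefix before every row except the first. The sketch below works under that assumption; `write_rows` is a hypothetical helper and is not Bonobo's implementation.

```python
import io


def write_rows(file, rows, eol="\n"):
    """Write one line per row, prefixing every row but the first with eol."""
    lineno = 0  # plays the role of the per-execution line counter
    for row in rows:
        file.write((eol if lineno else "") + row)
        lineno += 1
    return lineno


buffer = io.StringIO()
written = write_rows(buffer, ["alpha", "beta", "gamma"])
```

Writing the separator as a prefix avoids leaving a trailing empty line at the end of the output.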

class bonobo.Filter(*args, **kwargs)
Bases: bonobo.config.configurables.Configurable

Filters hashes out of the stream depending on the return value of the filter callable, which is called with the current hash as parameter.

Can be used as a decorator on a filter callable.

filter
A callable used to filter lines.

If the callable returns a true-ish value, the input will be passed unmodified to the next items.

Otherwise, it is discarded.

call(*args, **kwargs)
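The filtering rule above (pass the input through unmodified when the callable returns a true-ish value, drop it otherwise) can be sketched as a plain generator. `filter_rows` is a hypothetical name used for illustration only; it is not part of Bonobo.

```python
def filter_rows(predicate, rows):
    """Pass through rows for which predicate(row) is truthy; drop the rest."""
    for row in rows:
        if predicate(row):
            yield row  # passed on unmodified


kept = list(filter_rows(lambda n: n % 2 == 0, range(6)))
```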

filter
A Method; see bonobo.config.Method (section 4.2) for the three ways to provide it.

class bonobo.JsonReader(*args, **kwargs)
Bases: bonobo.nodes.io.base.IOFormatEnabled, bonobo.nodes.io.file.FileReader, bonobo.nodes.io.json.JsonHandler

static loader(fp, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)

Deserialize fp (a .read()-supporting file-like object containing a JSON document) to a Python object.

object_hook is an optional function that will be called with the result of any object literal decode (a dict). The return value of object_hook will be used instead of the dict. This feature can be used to implement custom decoders (e.g. JSON-RPC class hinting).

object_pairs_hook is an optional function that will be called with the result of any object literal decoded with an ordered list of pairs. The return value of object_pairs_hook will be used instead of the dict. This feature can be used to implement custom decoders that rely on the order that the key and value pairs are decoded (for example, collections.OrderedDict will remember the order of insertion). If object_hook is also defined, the object_pairs_hook takes priority.

To use a custom JSONDecoder subclass, specify it with the cls kwarg; otherwise JSONDecoder is used.

read(fs, file)
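The signature and description of `loader` match the standard library's `json.load`, so the effect of `object_pairs_hook` can be shown with the standard library alone. The sample document below is made up for the illustration.

```python
import io
import json
from collections import OrderedDict

# A file-like object holding a JSON document (made-up sample data).
document = io.StringIO('{"b": 1, "a": {"d": 2, "c": 3}}')

# Every JSON object, including nested ones, is built through the hook,
# so the key order of the document is kept in OrderedDict instances.
data = json.load(document, object_pairs_hook=OrderedDict)
keys = list(data.keys())
```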

class bonobo.JsonWriter(*args, **kwargs)
Bases: bonobo.nodes.io.base.IOFormatEnabled, bonobo.nodes.io.file.FileWriter, bonobo.nodes.io.json.JsonHandler

envelope
A ContextProcessor; see bonobo.config.ContextProcessor (section 4.2) for the full description and example.

write(fs, file, lineno, *args, **kwargs)
Write a JSON row on the next line of the file pointed to by ctx.file.

Parameters

• ctx –

• row –


class bonobo.Limit(*args, **kwargs)
Bases: bonobo.config.configurables.Configurable

Creates a Limit() node that only lets the first n rows (as defined by the limit option) go through, unmodified.

limit
Number of rows to let go through.

call(counter, *args, **kwargs)
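The behaviour of a limit node can be pictured as a counter that lets rows through until the limit is reached. The generator below is a simplified stand-in: in Bonobo the counter is supplied per execution (see the `counter` attribute below), whereas this sketch keeps it in a local variable. `limit_rows` is a hypothetical name.

```python
def limit_rows(rows, limit=10):
    """Let the first `limit` rows through unmodified, then stop."""
    counter = 0  # stands in for the per-execution counter
    for row in rows:
        if counter >= limit:
            break
        counter += 1
        yield row


first = list(limit_rows(iter(range(100)), limit=3))
```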

counter
A ContextProcessor; see bonobo.config.ContextProcessor (section 4.2) for the full description and example.

class bonobo.PrettyPrinter(*args, **kwargs)
Bases: bonobo.config.configurables.Configurable

call(*args, **kwargs)

class bonobo.PickleReader(*args, **kwargs)
Bases: bonobo.nodes.io.base.IOFormatEnabled, bonobo.nodes.io.file.FileReader, bonobo.nodes.io.pickle.PickleHandler

Reads a Python pickle object and yields the items in dicts.

mode
An Option; see bonobo.config.Option (section 4.2) for the available parameters.

pickle_headers
A ContextProcessor; see bonobo.config.ContextProcessor (section 4.2) for the full description and example.

read(fs, file, pickle_headers)

class bonobo.PickleWriter(*args, **kwargs)
Bases: bonobo.nodes.io.base.IOFormatEnabled, bonobo.nodes.io.file.FileWriter, bonobo.nodes.io.pickle.PickleHandler

mode
An Option; see bonobo.config.Option (section 4.2) for the available parameters.

write(fs, file, lineno, item)
Write a pickled item to the opened file.

bonobo.Tee(f)

bonobo.count(counter, *args, **kwargs)

bonobo.identity(x)

bonobo.noop(*args, **kwargs)

bonobo.get_examples_path(*pathsegments)

bonobo.open_examples_fs(*pathsegments)

4.2 Config API

The Config API, located under the bonobo.config namespace, contains all the tools you need to create configurable transformations, either class-based or function-based.

class bonobo.config.Configurable(*args, **kwargs)
Bases: object

Generic class for configurable objects. Configurable objects have a dictionary of “options” descriptors that defines the configuration schema of the type.

call(*args, **kwargs)

class bonobo.config.Container
Bases: dict

args_for(mixed)

get(name, default=None)

class bonobo.config.ContextProcessor(func)
Bases: bonobo.config.options.Option

A ContextProcessor is a kind of transformation decorator that can set up and tear down a transformation and its runtime-related dependencies, at the execution level.

It works like a yielding context manager, and is the recommended way to set up and tear down objects you’ll need in the context of one execution. It’s the way to overcome the stateless nature of transformations.

The yielded values will be passed as positional arguments to the next context processors (order does matter), and finally to the __call__ method of the transformation.

Warning: this may change for a similar but simpler implementation, don’t rely too much on it (yet).

Example:

>>> from bonobo.config import Configurable, ContextProcessor
>>> from bonobo.util.objects import ValueHolder

>>> class Counter(Configurable):
...     @ContextProcessor
...     def counter(self, context):
...         yield ValueHolder(0)
...
...     def __call__(self, counter, *args, **kwargs):
...         counter += 1
...         yield counter.get()

classmethod decorate(cls_or_func)
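The "yielding context manager" idea can be illustrated with the standard library's `contextlib.contextmanager`: state is created once per execution, handed to the transformation on every call, and torn down at the end. This is an analogy under stated assumptions, not Bonobo's mechanism; `counter_state`, `count_rows` and `execute` are hypothetical names.

```python
from contextlib import contextmanager


@contextmanager
def counter_state(log):
    """Set up per-execution state, yield it, then tear it down."""
    log.append("setup")
    state = {"count": 0}  # stands in for ValueHolder(0)
    try:
        yield state
    finally:
        log.append("teardown")


def count_rows(state, row):
    """A stateless callable: all mutable state arrives as an argument."""
    state["count"] += 1
    return state["count"], row


def execute(rows):
    """Run one 'execution': set up once, call per row, tear down once."""
    log = []
    with counter_state(log) as state:
        results = [count_rows(state, row) for row in rows]
    return results, log


results, log = execute(["a", "b", "c"])
```

Because the state lives in the execution rather than in the callable, the same callable can be reused across executions without leaking counts from one run to the next.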

class bonobo.config.Exclusive(wrapped)
Bases: contextlib.ContextDecorator

Decorator and context manager used to require exclusive usage of an object, most probably a service. It’s useful, for example, if call order matters on a service implementation (think of an HTTP API that requires a nonce or version parameter...).

Usage:

>>> def handler(some_service):
...     with Exclusive(some_service):
...         some_service.call_1()
...         some_service.call_2()
...         some_service.call_3()

This ensures that nobody else is using the same service while in the “with” block, by means of a lock primitive.

get_lock()
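To make the locking idea concrete, here is a minimal standard-library sketch of a per-object exclusive context manager. It is not Bonobo's implementation: `ExclusiveSketch`, the module-level lock registry and the choice to key locks by `id()` are all assumptions made for the illustration.

```python
import threading
from contextlib import ContextDecorator

_locks = {}  # one lock per wrapped object, keyed by id() (a simplification)
_locks_guard = threading.Lock()


class ExclusiveSketch(ContextDecorator):
    """Hold a per-object lock for the duration of a with block."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def get_lock(self):
        with _locks_guard:
            return _locks.setdefault(id(self._wrapped), threading.RLock())

    def __enter__(self):
        self.get_lock().acquire()
        return self._wrapped

    def __exit__(self, *exc_info):
        self.get_lock().release()
        return False  # never swallow exceptions


class Service:
    def __init__(self):
        self.calls = []

    def call(self, tag):
        self.calls.append(tag)


def handler(service, name):
    with ExclusiveSketch(service):
        service.call((name, 1))
        service.call((name, 2))


service = Service()
threads = [threading.Thread(target=handler, args=(service, n)) for n in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

With the lock held, the two calls made by one handler always appear back to back in `service.calls`, whatever the thread scheduling.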

class bonobo.config.Method
Bases: bonobo.config.options.Option

A Method is a special callable-valued option that can be used in three different ways (all for the same purpose).

• Like a normal option, the value can be provided to the Configurable constructor.

>>> from bonobo.config import Configurable, Method

>>> class MethodExample(Configurable):
...     handler = Method()

>>> example1 = MethodExample(handler=str.upper)

• It can be used by a child class that overrides the Method with a normal method.

>>> class ChildMethodExample(MethodExample):
...     def handler(self, s: str):
...         return s.upper()

>>> example2 = ChildMethodExample()

• Finally, it also enables the class to be used as a decorator, to generate a subclass that provides a value for the Method.

>>> @MethodExample
... def OtherChildMethodExample(s):
...     return s.upper()

>>> example3 = OtherChildMethodExample()

clean(value)

class bonobo.config.Option(type=None, *, required=False, positional=False, default=None)
Bases: object

An Option is a descriptor for Configurable’s parameters.

type
The option type lets you provide a callable used to cast, clean or validate the option value. If not provided, or None, the option’s value will be exactly the value the user provided.

(default: None)

required
If an option is required, an error will be raised if no value is provided (at runtime). If it is not, the option will have its default value unless the user overrides it at runtime.

(default: False)

positional
If this is true, the option value can be provided as a positional argument. Otherwise, it must be provided as a keyword argument.

(default: False)

default
Default value for non-required options.

(default: None)

Example:

    from bonobo.config import Configurable, Option

    class Example(Configurable):
        title = Option(str, required=True, positional=True)
        keyword = Option(str, default='foo')

        def call(self, s):
            return self.title + ': ' + s + ' (' + self.keyword + ')'

    example = Example('hello', keyword='bar')

clean(value)

get_default()
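For readers unfamiliar with descriptors, the following standard-library sketch mimics three of the documented parameters (`type`, `required`, `default`). It is not Bonobo's code: `OptionSketch` is a hypothetical class, it leaves out `positional`, it relies on `__set_name__` (Python 3.6+), and it raises on attribute access rather than at whatever point Bonobo performs its check.

```python
class OptionSketch:
    """Descriptor mimicking the documented parameters (illustrative only)."""

    def __init__(self, type=None, *, required=False, default=None):
        self.type = type
        self.required = required
        self.default = default
        self.name = None

    def __set_name__(self, owner, name):
        self.name = name

    def clean(self, value):
        # Cast or validate through the `type` callable, if there is one.
        return self.type(value) if self.type else value

    def __get__(self, inst, owner=None):
        if inst is None:
            return self
        if self.name in inst.__dict__:
            return inst.__dict__[self.name]
        if self.required:
            raise ValueError("missing required option: " + self.name)
        return self.default

    def __set__(self, inst, value):
        inst.__dict__[self.name] = self.clean(value)


class Example:
    title = OptionSketch(str, required=True)
    retries = OptionSketch(int, default=3)

    def __init__(self, **options):
        for name, value in options.items():
            setattr(self, name, value)


example = Example(title="hello", retries="5")
```

Here `retries="5"` is cast to the integer 5 by the `type` callable, and an instance created without `retries` falls back to the default of 3.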

class bonobo.config.Service(name)
Bases: bonobo.config.options.Option

A Service is a special kind of option defining a dependency on something that will be resolved at runtime, using an identifier. For example, you can create a Configurable that has a “database” Service among its attributes, meaning that you define which database to use, by name, when creating the instance of this class, and then provide an implementation when running the graph using a strategy.

Example:

    import bonobo
    from sqlalchemy import create_engine  # needed for create_engine() below

    class QueryExtractor(bonobo.Configurable):
        database = bonobo.Service(default='sqlalchemy.engine.default')

    graph = bonobo.Graph(
        QueryExtractor(database='sqlalchemy.engine.secondary'),
        *more_transformations,
    )

    if __name__ == '__main__':
        engine = create_engine('... dsn ...')
        bonobo.run(graph, services={
            'sqlalchemy.engine.secondary': engine
        })

The main goal is to avoid tying transformations to actual dependencies, so that the same transformations can be run in different contexts (stages like preprod and prod, tenants like client1 and client2, or anything you want).

name
The service name is used to retrieve the implementation at runtime.

resolve(inst, services)

bonobo.config.requires(*service_names)
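The name-based resolution described for Service boils down to looking up an identifier in a mapping that is supplied at run time. The sketch below shows that idea with plain Python; `ServiceSketch` and the string "engines" are hypothetical stand-ins, not Bonobo objects.

```python
class ServiceSketch:
    """Refer to a dependency by name; resolve it against a mapping later."""

    def __init__(self, name):
        self.name = name

    def resolve(self, services):
        try:
            return services[self.name]
        except KeyError:
            raise KeyError(
                "no implementation registered for service " + repr(self.name)
            ) from None


database = ServiceSketch("sqlalchemy.engine.secondary")

# The same declaration resolves to different implementations per context.
preprod = {"sqlalchemy.engine.secondary": "preprod-engine"}
prod = {"sqlalchemy.engine.secondary": "prod-engine"}
```

The transformation only ever mentions the name, so switching from preprod to prod means passing a different mapping, not editing the transformation.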

4.3 Command-line

4.3.1 Bonobo Init

Create an empty project, ready to use bonobo.

Syntax: bonobo init

Requires edgy.project.

4.3.2 Bonobo Run

Run a transformation graph.

Syntax: bonobo run [-c cmd | -m mod | file | -] [arg]

Todo: implement -m, check if -c is of any use and if yes, implement it too. Implement args, too.

4.3.3 Bonobo RunC

Run a transformation graph in a docker container.

Syntax: bonobo runc [-c cmd | -m mod | file | -] [arg]

Todo: implement -m, check if -c is of any use and if yes, implement it too. Implement args, too.


Requires bonobo-docker, install with docker extra: pip install bonobo[docker].

4.4 Settings & Environment

All the settings below can be found in the bonobo.settings module.

4.4.1 Debug

Purpose: Sets the debug mode, which is more verbose. The log level is lowered to DEBUG instead of INFO.

Environment: DEBUG

Setting: bonobo.settings.DEBUG

Default: False

4.4.2 Profile

Purpose: Sets profiling, which adds memory/CPU usage output. Not yet fully implemented. Setting this to true is expected to have a non-negligible performance impact.

Environment: PROFILE

Setting: bonobo.settings.PROFILE

Default: False

4.4.3 Quiet

Purpose: Sets the quiet mode, which asks for all output to be computer-parsable. Formatting is removed, which makes it possible to use unix pipes, etc. Not yet fully implemented; only a few transformations use it so far. It should probably be the default on non-interactive terminals.

Environment: QUIET

Setting: bonobo.settings.QUIET

Default: False

4.4.4 Logging Level

Purpose: Sets the minimum Python logging level.

Environment: LOGGING_LEVEL

Setting: bonobo.settings.LOGGING_LEVEL

Default: DEBUG if DEBUG is True, otherwise INFO

Values: CRITICAL, FATAL, ERROR, WARNING, INFO, DEBUG, NOTSET
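The interplay between the DEBUG and LOGGING_LEVEL settings can be sketched with the standard library, following the Debug section's statement that debug mode lowers the level to DEBUG instead of INFO. This is not the code of bonobo.settings: `load_settings`, `to_bool` and the set of strings accepted as true are assumptions.

```python
import logging

LEVELS = ("CRITICAL", "FATAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET")


def to_bool(value):
    """Interpret common textual spellings of 'true' (an assumed convention)."""
    return str(value).strip().lower() in ("1", "true", "yes", "on")


def load_settings(environ):
    """Derive DEBUG and LOGGING_LEVEL from an environment-like mapping."""
    debug = to_bool(environ.get("DEBUG", "false"))
    default_level = "DEBUG" if debug else "INFO"
    level_name = environ.get("LOGGING_LEVEL", default_level).upper()
    if level_name not in LEVELS:
        raise ValueError("unknown logging level: " + level_name)
    return {"DEBUG": debug, "LOGGING_LEVEL": getattr(logging, level_name)}


settings = load_settings({"DEBUG": "1"})
```

Passing `os.environ` as the mapping would read the real environment; a plain dict keeps the example deterministic.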


4.4.5 I/O Format

Purpose: Sets the default input/output format for builtin transformations. It can be overridden on each node. The kwargs value means that each node will try to read its input from keyword arguments (and write similarly formatted output), while arg0 means it will try to read its input from the first positional argument (and write similarly formatted output).

Environment: IOFORMAT

Setting: bonobo.settings.IOFORMAT

Default: kwargs

Values: kwargs, arg0
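The difference between the two formats is easiest to see as two calling conventions for the same row. The dispatcher below is a hypothetical illustration (`call_node` is not a Bonobo function) of what "read its input from keyword arguments" versus "from the first positional argument" means.

```python
def call_node(func, row, ioformat="kwargs"):
    """Call func with a dict row, using one of the two documented formats."""
    if ioformat == "kwargs":
        return func(**row)  # fields arrive as keyword arguments
    if ioformat == "arg0":
        return func(row)  # the whole row arrives as the first argument
    raise ValueError("unsupported IOFORMAT: " + repr(ioformat))


def greet_kwargs(name, city):
    return name + " from " + city


def greet_arg0(row):
    return row["name"] + " from " + row["city"]


row = {"name": "Ada", "city": "Paris"}
as_kwargs = call_node(greet_kwargs, row)
as_arg0 = call_node(greet_arg0, row, ioformat="arg0")
```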

4.5 Examples

There are a few examples bundled with bonobo.

You’ll find them under the bonobo.examples package, and you can run them directly as modules:

$ bonobo run -m bonobo.examples. . . module

4.5.1 Examples from the tutorial

Examples from Let’s get started!

Example 1

bonobo.examples.tutorials.tut01e01.extract()

bonobo.examples.tutorials.tut01e01.load(x)

bonobo.examples.tutorials.tut01e01.transform(x)

Example 2

Examples from Working with files

Example 1: Read

bonobo.examples.tutorials.tut02e01_read.get_services()

Example 2: Write

bonobo.examples.tutorials.tut02e02_write.get_services()

bonobo.examples.tutorials.tut02e02_write.split_one(line)


Example 3: Write as map

4.5.2 Datasets

The bonobo.examples.datasets package contains examples that generate datasets locally for other examples to use. As of today, we commit the content of those datasets to git, even if that may be a bad idea, so that all the examples are easily runnable. Later, we’ll see if we favor a “missing dependency exception” approach.

Coffeeshops

Extracts a list of Parisian bars where you can buy a coffee for a reasonable price, and stores them in a flat text file.

Graph: ODS() -> transform -> FileWriter()

Fablabs

4.5.3 Types

Strings

Example of how to use simple python strings to communicate between transformations.

Graph: extract() -> transform(s: str) -> load(s: str)

bonobo.examples.types.strings.extract()

bonobo.examples.types.strings.transform(s: str)

bonobo.examples.types.strings.load(s: str)

Dicts

Example of how to use simple python dictionaries to communicate between transformations.

Graph: extract() -> transform(row: dict) -> load(row: dict)

bonobo.examples.types.dicts.extract()

bonobo.examples.types.dicts.transform(row: dict)

bonobo.examples.types.dicts.load(row: dict)

Bags

Example of how to use bonobo.Bag instances to pass flexible args/kwargs to the next callable.

Graph: extract() -> transform(...) -> load(...)

bonobo.examples.types.bags.extract()

bonobo.examples.types.bags.transform(topic: str)

bonobo.examples.types.bags.load(topic: str, title: str, rand: int)

4.5.4 Utils

Count

Simple example of bonobo.count() usage.

Graph: range() -> count -> print

CHAPTER 5

F.A.Q.

List of questions that came up about the project, in no particular order.

5.1 Too long; didn’t read.

Bonobo is an extract-transform-load toolkit for python 3.5+ that uses regular python functions, generators and iterators as input.

By default, it uses a thread pool to execute all functions in parallel, and handles the movement of data rows in the directed graph using simple fifo queues.

It allows the user to focus on the content of the transformations, rather than on optimizing blocking or long operations, or thinking about threads or subprocesses.
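The execution model described above (functions running in parallel in a thread pool, with rows moving between them through fifo queues) can be sketched with the standard library. This is a toy model for a linear chain only, not Bonobo's executor: `run_node`, `run_chain` and the `END` sentinel are hypothetical, and there is no error recovery (a failing node would leave downstream workers waiting).

```python
import queue
from concurrent.futures import ThreadPoolExecutor

END = object()  # sentinel marking the end of a stream


def run_node(func, inbox, outbox):
    """Consume rows from inbox and push whatever func yields to outbox."""
    while True:
        row = inbox.get()
        if row is END:
            outbox.put(END)  # propagate end-of-stream downstream
            return
        for result in func(row):
            outbox.put(result)


def run_chain(source, *nodes):
    """Run a linear chain of generator functions, one worker per node."""
    queues = [queue.Queue() for _ in range(len(nodes) + 1)]
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        futures = [
            pool.submit(run_node, node, queues[i], queues[i + 1])
            for i, node in enumerate(nodes)
        ]
        for row in source:
            queues[0].put(row)
        queues[0].put(END)
        for future in futures:
            future.result()  # re-raise any exception from a worker
    # All workers are done; drain the last queue up to the sentinel.
    return list(iter(queues[-1].get, END))


def double(x):
    yield x * 2


def keep_multiples_of_four(x):
    if x % 4 == 0:
        yield x


results = run_chain(range(5), double, keep_multiples_of_four)
```

The node functions contain no threading code at all, which is the point made above: the user writes the transformations and the runner takes care of concurrency.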

It’s lean manufacturing for data.

Note: This is NOT a «big data» tool, nor a «data analysis» tool. We process around 5 million database lines in around 1 hour with rdc.etl, bonobo’s ancestor (the algorithms are the same, though we still need to run some benchmarks).

5.2 What versions of python does bonobo support? Why not more?

Bonobo is battle-tested against the latest python 3.5 and python 3.6. It may work well using other patch releases of those versions, but we cannot guarantee it.

The main reasons for requiring 3.5+:

• Creating a tool that works well under both python 2 and 3 is a lot more work.

• Python 3 is nearly 10 years old. Consider moving on.

• Python 3.5 contains syntactic sugar that makes working with data a lot more convenient.

5.3 Can a graph contain another graph?

No, not for now. There are no tools today in bonobo to insert a graph as a subgraph.

It would be great to allow it, but there are a few design questions behind this, like which nodes to use as input and output of the subgraph, etc.

On the other hand, if you consider a graph not as a container but as the nodes and edges it contains, it’s pretty easy to add a set of nodes and edges to a graph, and thus simulate a subgraph. But there will be more threads and more copies of the same nodes, so it’s not really an acceptable answer for big graphs. If it were possible to use a Graph as a node, the problem would be correctly solved.

It is something to be seriously considered post 1.0 (probably way post 1.0).

5.4 How would one access contextual data from a transformation? Are there parameter injections like pytest’s fixtures?

There are indeed parameter injections that work much like pytest’s fixtures, and they are the way to go for transformation context.

The API may evolve a bit though, because I feel it’s a bit hackish as it is. The concept will stay the same, but we need to find a better way to apply it.

To understand how it works today, look at https://github.com/python-bonobo/bonobo/blob/master/bonobo/nodes/io/csv.py#L31 and class hierarchy.

5.5 What is a plugin? Do I need to write one?

Plugins are special classes added to an execution context, used to enhance or change the actual behavior of an execution in a generic way. You don’t need to write plugins to code transformation graphs.

5.6 Is there a difference between a transformation node and a regular python function or generator?

Short answer: no.

Transformation callables are just regular callables, and nothing differentiates them from regular Python callables. You can even use some callables both in an imperative programming context and in a transformation graph, no problem.

Longer answer: yes, sometimes, but you should not care. The function-based transformations are plain old Python callables. The class-based transformations can be plain old Python objects, but can also subclass Configurable, which brings a lot of fancy features, like options, service injections, class factories as decorators...
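A short sketch of the “short answer”: the generator and function below are ordinary Python and can be called directly, and the same objects can be handed to a graph. The names `extract`, `shout` and `build_graph` are made up for the example, and the graph part assumes bonobo is installed.

```python
def extract():
    """A plain generator: yields the rows to process."""
    yield "hello"
    yield "world"


def shout(value):
    """A plain function: transforms one row."""
    return value.upper()


# Imperative use: no graph involved, just regular Python calls.
shouted = [shout(value) for value in extract()]


def build_graph():
    """Use the very same callables as nodes of a transformation graph."""
    import bonobo  # only needed for the graph-based use

    return bonobo.Graph(extract, shout, print)
```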


5.7 Why did you include the word «marketing» in a commit message? Why is there a marketing-automation tag on the project? Isn’t marketing evil?

I do use bonobo for marketing automation tasks. Also, half the job of coding something is explaining to the world what you’re actually doing, how to get more information, and how to use it, and that’s what I call “marketing” in some commits. Even documentation is marketing in a way, because it allows a market of potential users to actually understand your product. Whether the product is open-source software, a box of chips or complex commercial software does not change a thing.

Marketing may be good or evil; honestly, that is outside this project’s scope and I don’t care. What I care about is that there are marketing tasks to automate, and some of those cases I can solve with bonobo.

5.8 Why not use <some library> instead?

I did not find the tasks I had easy to do with the libraries I tried. That may or may not apply to your cases, and it may or may not reflect some lack of knowledge on my part about those libraries. There is a plan to include comparisons with major libraries in this documentation, and help from experts in other libraries (Python or not) would be very welcome.

See https://github.com/python-bonobo/bonobo/issues/1

Bonobo is not a replacement for pandas, dask, luigi, airflow... It may be a replacement for Pentaho, Talend or other data integration suites, but it targets people more comfortable with code as an interface.

5.9 All those references to monkeys hurt my head. Bonobos are not monkeys.

Sorry, my bad. I’ll work on this point in the near future. As an apology, I can only say that French has a single word that means both «ape» and «monkey», and I never realised that there was an actual difference. As one question out of two I get about the project is somehow related to primate taxonomy, I’ll make a special effort on this topic as soon as I can.

Or maybe I can use one of the comments from reddit as an answer: «Python not only has duck typing; it has the little known primate typing feature.»

See https://github.com/python-bonobo/bonobo/issues/24

5.10 Who is behind this?

Me (as an individual), and a few great people who helped me along the way. The project is not commercially endorsed or supported.

The code, documentation, and surrounding material are created in spare time and may lack a bit of velocity. Feel free to jump in so we can go faster!

5.11 Documentation seriously lacks X, there is a problem in Y...

Yes, and sorry about that. An amazing way to make it better would be to submit a pull request about it. You can read a bit about how to contribute on the Contributing page.


CHAPTER 6

Contributing

There are many different ways you can contribute, and not all of them include coding. Do not think that codeless contributions have less value: all contributions are very important.

• You can contribute to the documentation.

• You can help reproduce errors and add more information to the issues.

• You can open issues with problems you’re facing.

• You can help create better presentation material.

• You can talk about it in your local Python user group.

• You can enhance examples.

• You can enhance tests.

• etc.

6.1 tl;dr

1. Fork the GitHub repository

$ git clone https://github.com/python-bonobo/bonobo.git  # change this to use your fork.
$ cd bonobo
$ git remote add upstream https://github.com/python-bonobo/bonobo.git
$ git fetch upstream
$ git checkout upstream/develop -b feature/my_awesome_feature
$ # code, code, code, test, doc, code, test ...
$ git commit -m '[topic] .... blaaaah ....'
$ git push origin feature/my_awesome_feature

2. Open a pull request

3. Rinse, repeat


6.2 Code-related contributions (including tests and examples)

Contributing to bonobo is usually done this way:

• Discuss ideas in the issue tracker or on Slack.

• Fork the repository.

• Think about what happens for existing userland code if your patch is applied.

• Open a pull request early with your code to continue the discussion as you’re writing code.

• Try to write simple tests, and a few lines of documentation.

Although we don’t have a complete guide on this topic for now, the best way is to fork the GitHub repository and send pull requests.

6.3 Tools

Issues: https://github.com/python-bonobo/bonobo/issues

Roadmap: https://www.bonobo-project.org/roadmap

Slack: https://bonobo-slack.herokuapp.com/

6.4 Guidelines

• We tend to use semantic versioning. This should be 100% true once we reach 1.0, but until then we will fail and learn. Anyway, the user effort for each BC-break is a real pain, and we want to keep that in mind.

• The 1.0 milestone has one goal: create a solid foundation we can rely on, in terms of API. To reach that, we want to keep it as minimalist as possible, considering only a few userland tools as the public API.

• Put more simply, the core should stay as light as possible.

• Let’s not fight over coding standards. We enforce them using yapf, and a make format call should reformat the whole codebase for you. We encourage you to run it before making a pull request, and it will be run before each release anyway, so we can focus on things that have value instead of details.

• Tests are important. One obvious reason is that we want to have a stable and working system, but one less obvious reason is that it forces better design, making sure responsibilities are well separated and the scope of each function is clear. More often than not, the “one and only obvious way to do it” will be obvious once you write the tests.

• Documentation is important. It’s the only way people can actually understand what the system does, and userless software is pointless. One book I read a long time ago said that half the energy spent building something should be devoted to explaining what you’re doing and why, and that’s probably one of the best pieces of advice I have read (although, like every good piece of advice, it’s easier to repeat than to apply).
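On the testing guideline above: as an illustration of how small such tests can be, the sketch below checks a transformation by calling it as the plain generator it is. Both `split_fields` and its test are hypothetical examples written in pytest style, not code taken from the Bonobo test suite.

```python
def split_fields(line, separator=";"):
    """Transformation: split a raw line into stripped, non-empty fields."""
    for field in line.split(separator):
        field = field.strip()
        if field:
            yield field


def test_split_fields_skips_blanks():
    # A transformation is a regular callable, so no graph is needed to test it.
    assert list(split_fields(" a; ;b ;")) == ["a", "b"]
```

Writing the test first makes the function’s single responsibility (splitting and cleaning one line) explicit before any graph is wired together.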

6.5 License

Bonobo is released under the Apache license.


6.6 License for non lawyers

Use it, change it, hack it, brew it, eat it.

For pleasure, non-profit, profit or basically anything else, except stealing credit.

Provided without any warranty.


Python Module Index

b
bonobo
bonobo.config
bonobo.examples.datasets
bonobo.examples.datasets.coffeeshops
bonobo.examples.nodes.count
bonobo.examples.tutorials.tut01e01
bonobo.examples.tutorials.tut01e02
bonobo.examples.tutorials.tut02e01_read
bonobo.examples.tutorials.tut02e02_write
bonobo.examples.types.bags
bonobo.examples.types.dicts
bonobo.examples.types.strings
bonobo.settings


Index

A
add_chain() (bonobo.Graph method)
add_node() (bonobo.Graph method)
apply() (bonobo.Bag method)
args (bonobo.Bag attribute)
args_for() (bonobo.config.Container method)

B
Bag (class in bonobo)
bonobo (module)
bonobo.config (module)
bonobo.examples.datasets (module)
bonobo.examples.datasets.coffeeshops (module)
bonobo.examples.nodes.count (module)
bonobo.examples.tutorials.tut01e01 (module)
bonobo.examples.tutorials.tut01e02 (module)
bonobo.examples.tutorials.tut02e01_read (module)
bonobo.examples.tutorials.tut02e02_write (module)
bonobo.examples.types.bags (module)
bonobo.examples.types.dicts (module)
bonobo.examples.types.strings (module)
bonobo.settings (module)

C
call() (bonobo.config.Configurable method)
call() (bonobo.Filter method)
call() (bonobo.Limit method)
call() (bonobo.PrettyPrinter method)
clean() (bonobo.config.Method method)
clean() (bonobo.config.Option method)
Configurable (class in bonobo.config)
Container (class in bonobo.config)
ContextProcessor (class in bonobo.config)
count() (in module bonobo)
counter (bonobo.Limit attribute)
create_strategy() (in module bonobo)
csv_headers (bonobo.CsvReader attribute)
CsvReader (class in bonobo)
CsvWriter (class in bonobo)

D
decorate() (bonobo.config.ContextProcessor class method)
default (bonobo.config.Option attribute)
default (bonobo.CsvReader attribute)
default (bonobo.FileReader attribute)
default (bonobo.FileWriter attribute)
default (bonobo.Limit attribute)
default (bonobo.PickleReader attribute)
default (bonobo.PickleWriter attribute)
default (Option attribute)

E
envelope (bonobo.JsonWriter attribute)
Exclusive (class in bonobo.config)
extend() (bonobo.Bag method)
extract() (in module bonobo.examples.tutorials.tut01e01)
extract() (in module bonobo.examples.types.bags)
extract() (in module bonobo.examples.types.dicts)
extract() (in module bonobo.examples.types.strings)

F
FileReader (class in bonobo)
FileWriter (class in bonobo)
filter (bonobo.Filter attribute)
Filter (class in bonobo)
flags (bonobo.Bag attribute)

G
get() (bonobo.Bag method)
get() (bonobo.config.Container method)
get_default() (bonobo.config.Option method)
get_examples_path() (in module bonobo)
get_lock() (bonobo.config.Exclusive method)
get_services() (in module bonobo.examples.tutorials.tut02e01_read)


get_services() (in module bonobo.examples.tutorials.tut02e02_write)
Graph (class in bonobo)

I
identity() (in module bonobo)
inherit() (bonobo.Bag class method)

J
JsonReader (class in bonobo)
JsonWriter (class in bonobo)

K
kwargs (bonobo.Bag attribute)

L
limit (bonobo.Limit attribute)
Limit (class in bonobo)
lineno (bonobo.FileWriter attribute)
load() (in module bonobo.examples.tutorials.tut01e01)
load() (in module bonobo.examples.types.bags)
load() (in module bonobo.examples.types.dicts)
load() (in module bonobo.examples.types.strings)
loader() (bonobo.JsonReader static method)

M
Method (class in bonobo.config)
mode (bonobo.FileReader attribute)
mode (bonobo.FileWriter attribute)
mode (bonobo.PickleReader attribute)
mode (bonobo.PickleWriter attribute)

N
name (bonobo.config.Service attribute)
name (Service attribute)
noop() (in module bonobo)

O
open_examples_fs() (in module bonobo)
open_fs() (in module bonobo)
Option (class in bonobo.config)
outputs_of() (bonobo.Graph method)

P
pickle_headers (bonobo.PickleReader attribute)
PickleReader (class in bonobo)
PickleWriter (class in bonobo)
positional (bonobo.config.Option attribute)
positional (bonobo.CsvReader attribute)
positional (bonobo.FileReader attribute)
positional (bonobo.FileWriter attribute)
positional (bonobo.Limit attribute)
positional (bonobo.PickleReader attribute)
positional (bonobo.PickleWriter attribute)
positional (Option attribute)
PrettyPrinter (class in bonobo)

R
read() (bonobo.CsvReader method)
read() (bonobo.FileReader method)
read() (bonobo.JsonReader method)
read() (bonobo.PickleReader method)
required (bonobo.config.Option attribute)
required (bonobo.CsvReader attribute)
required (bonobo.FileReader attribute)
required (bonobo.FileWriter attribute)
required (bonobo.Limit attribute)
required (bonobo.PickleReader attribute)
required (bonobo.PickleWriter attribute)
required (Option attribute)
requires() (in module bonobo.config)
resolve() (bonobo.config.Service method)
run() (in module bonobo)

S
Service (class in bonobo.config)
set_parent() (bonobo.Bag method)
skip (bonobo.CsvReader attribute)
split_one() (in module bonobo.examples.tutorials.tut02e02_write)

T
Tee() (in module bonobo)
Token (class in bonobo)
topologically_sorted_indexes (bonobo.Graph attribute)
transform() (in module bonobo.examples.tutorials.tut01e01)
transform() (in module bonobo.examples.types.bags)
transform() (in module bonobo.examples.types.dicts)
transform() (in module bonobo.examples.types.strings)
type (bonobo.config.Option attribute)
type (bonobo.CsvReader attribute)
type (bonobo.FileReader attribute)
type (bonobo.FileWriter attribute)
type (bonobo.Limit attribute)
type (bonobo.PickleReader attribute)
type (bonobo.PickleWriter attribute)
type (Option attribute)

W
write() (bonobo.CsvWriter method)
write() (bonobo.FileWriter method)


write() (bonobo.JsonWriter method)
write() (bonobo.PickleWriter method)
writer (bonobo.CsvWriter attribute)
