How to Package TensorFlow Code

By Matthew Scarpino

You can launch a training operation with the command gcloud ml-engine jobs submit training. When you execute this, you can identify your source code with the --package-path and --module-name flags. The --package-path flag identifies the directory that contains your code, and this directory must meet the following requirements:

  • The directory must contain the module identified by --module-name.
  • The parent directory must have a file named setup.py.
  • Every directory under the parent directory must have a file named __init__.py. This file is usually empty.
  • The development system must have setuptools installed.

This last point is important. Before uploading a package, the ML Engine uses setuptools on your development system to bundle the parent directory into a *.tar.gz archive. If you’ve installed pip, you can install setuptools with pip install setuptools.
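The directory requirements above can be checked before you submit a job. The following is a minimal sketch, not part of the ML Engine tooling: check_package_layout is a hypothetical helper, and the module name trainer.task and the file task.py are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path


def check_package_layout(package_path, module_name):
    """Return a list of problems; an empty list means the layout looks fine."""
    package_dir = Path(package_path)
    problems = []

    # --module-name is dotted (for example "trainer.task");
    # the last part names the file inside the package directory.
    module_file = package_dir / (module_name.split(".")[-1] + ".py")
    if not module_file.is_file():
        problems.append(f"missing module file: {module_file.name}")

    # setup.py lives one level above the package directory.
    if not (package_dir.parent / "setup.py").is_file():
        problems.append("missing setup.py in parent directory")

    # The package directory and each of its subdirectories need __init__.py.
    subdirs = [p for p in package_dir.rglob("*")
               if p.is_dir() and p.name != "__pycache__"]
    for directory in [package_dir] + subdirs:
        if not (directory / "__init__.py").is_file():
            problems.append(f"missing __init__.py in {directory.name}")

    return problems


# Build a throwaway project to show the check before and after fixing it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "trainer").mkdir()
    (root / "trainer" / "task.py").write_text("print('training')\n")
    before = check_package_layout(root / "trainer", "trainer.task")

    (root / "setup.py").write_text("")
    (root / "trainer" / "__init__.py").write_text("")
    after = check_package_layout(root / "trainer", "trainer.task")

print(before)
print(after)
```

The first list reports the missing setup.py and __init__.py files, and the second list is empty once both files exist. The check doesn't verify that setuptools is installed.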

In a Python package, setup.py contains instructions for building and installing the package. If you want the ML Engine to install your package, setup.py must perform two operations:

  • Import setuptools.setup.
  • Call the setup function of the setuptools module.

The setup function accepts a great deal of information about the package, including its name, version, and dependencies. The table lists nine of the parameters that you can set.

Parameters of the setup Function

Parameter           Description
name                Package name
version             Release version
packages            Python packages (directories) to include in the distribution
install_requires    Packages that need to be installed when the package is installed
author              Name of the package’s author
author_email        Author’s email address
url                 Package’s home page
description         Short description of the package
license             The package’s license

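Rather than list your package’s directories by hand in the packages field, you can call the find_packages function provided by setuptools, which finds every directory that contains an __init__.py file. This listing presents the content of the setup.py file: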

Setup Script for a Machine Learning Package

from setuptools import find_packages
from setuptools import setup

# Packages that must be installed along with this one
REQUIRED_PACKAGES = ['tensorflow>=1.3']

setup(
    name='trainer',
    version='0.1',
    install_requires=REQUIRED_PACKAGES,
    packages=find_packages(),
    include_package_data=True,
    author='Matthew Scarpino',
    description='Running MNIST classification in the cloud'
)
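To see what find_packages contributes to the packages field, you can run it against a small directory tree. This is a rough sketch that assumes setuptools is installed. The directory names trainer, trainer/utils, and data are hypothetical.

```python
import tempfile
from pathlib import Path

from setuptools import find_packages

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # trainer/ and trainer/utils/ each get an __init__.py, so both are packages.
    for pkg in ("trainer", "trainer/utils"):
        (root / pkg).mkdir(parents=True)
        (root / pkg / "__init__.py").write_text("")

    # data/ has no __init__.py, so find_packages skips it.
    (root / "data").mkdir()

    found = sorted(find_packages(where=str(root)))

print(found)  # ['trainer', 'trainer.utils']
```

Only directories that contain __init__.py are reported, which is why every package directory needs that file.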

Sadly, the ML Engine doesn’t always have the latest versions of the packages installed. At the time of this writing, the current TensorFlow version is 1.4, but the default version supported by the ML Engine is 1.2.

You can request a specific version of a package by setting the install_requires field. In the listing, this field requests a version of TensorFlow greater than or equal to 1.3. For more information on supported versions, see the runtime version list in the ML Engine documentation.
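The effect of the tensorflow>=1.3 requirement can be illustrated with a simplified comparison. This sketch is not how pip resolves versions, because real tools follow the full version-specifier rules. The helpers parse_version and meets_minimum are hypothetical, and they handle only plain dotted numbers and the >= operator.

```python
def parse_version(text):
    """Turn a dotted version such as '1.3' into a tuple of ints: (1, 3)."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, requirement):
    """Check an installed version against a 'name>=version' requirement."""
    _name, minimum = requirement.split(">=")
    return parse_version(installed) >= parse_version(minimum)


requirement = "tensorflow>=1.3"

# Compare the versions mentioned in the text against the requirement.
results = {v: meets_minimum(v, requirement) for v in ("1.2", "1.3", "1.4")}
print(results)  # {'1.2': False, '1.3': True, '1.4': True}
```

Version 1.2, the default mentioned above, fails the check, so the installer has to fetch a newer release to satisfy the requirement.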