How To Package A Vocabulary File For Cloud ML Engine
Solution 1:
You have multiple options. I think the most straightforward is to store labels.txt in a GCS location and read it from there.
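As a minimal sketch of the GCS option: the helper below reads one label per line through whatever file opener you pass in. The function name load_vocab and the bucket path in the comment are hypothetical. The assumption is that on Cloud ML Engine you would pass TensorFlow's tf.io.gfile.GFile (tf.gfile.GFile in older 1.x releases), which understands gs:// paths; locally the built-in open is enough.

```python
def load_vocab(path, open_fn=open):
    """Read one label per line from `path`, skipping blank lines.

    `open_fn` is any callable with an open(path, mode) signature.
    For gs:// paths, pass tf.io.gfile.GFile (assumes TensorFlow is installed).
    """
    with open_fn(path, "r") as f:
        return [line.strip() for line in f if line.strip()]


# Hypothetical usage on Cloud ML Engine (bucket name is a placeholder):
#   import tensorflow as tf
#   labels = load_vocab("gs://my-bucket/labels.txt", open_fn=tf.io.gfile.GFile)
```

Keeping the opener as a parameter means the same function can be exercised against a local copy of labels.txt without touching GCS.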
However, if you prefer, you can also package the file with your code via setup.py. There are multiple ways to do this, so I'll refer you to the official setuptools documentation.
Let me walk through a quick example:
Create a setup.py in the directory that contains your training package (often called trainer in CloudML Engine's samples, so I will proceed as if your code is structured the same way as the samples, including using trainer as the package name). The following is based on the docs you referenced, with one important change: it uses the package_data argument instead of include_package_data:
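from setuptools import find_packages
from setuptools import setup

# Placeholder: list your own PyPI dependencies here.
REQUIRED_PACKAGES = []

setup(
    name='my_model',
    version='0.1',
    install_requires=REQUIRED_PACKAGES,
    packages=find_packages(),
    # Ship trainer/labels.txt alongside the trainer package.
    package_data={'trainer': ['labels.txt']},
    description='My trainer application package.'
)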
If you run python setup.py sdist, you can see that trainer/labels.txt was copied into the tarball.
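One way to check the tarball without unpacking it by hand is sketched below. The helper name sdist_members is hypothetical, and the tarball path in the comment is only a guess at what sdist produces for the setup.py above; the exact file name under dist/ may differ with your setuptools version.

```python
import tarfile


def sdist_members(tarball_path, suffix):
    """Return the archive member names that end with `suffix`."""
    # "r:*" lets tarfile detect the compression (sdist defaults to gzip).
    with tarfile.open(tarball_path, "r:*") as tar:
        return [name for name in tar.getnames() if name.endswith(suffix)]


# Hypothetical usage after running `python setup.py sdist`:
#   print(sdist_members("dist/my_model-0.1.tar.gz", "trainer/labels.txt"))
```

An empty list means the file was not packaged, which usually points at a package_data entry that does not match the package name or file path.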
Then in your code, you can access the file like this:
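from pkg_resources import Requirement, resource_filename
# Requirement.parse takes the distribution name from setup.py ('my_model');
# the resource path is then relative to the distribution root.
labels_path = resource_filename(Requirement.parse('my_model'), 'trainer/labels.txt')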
Note that to run this code locally, you will have to install your package first: python setup.py install [--user].
And that's the primary reason I think storing the file on GCS might be easier.
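If you do go the packaging route, a possible alternative for reading the file is the standard library's pkgutil.get_data, sketched below. It looks the file up relative to an importable package, so it should work whenever trainer can be imported, including when running from the project root. The helper name load_packaged_labels is hypothetical, and this is a different lookup than the pkg_resources call above, not a drop-in equivalent.

```python
import pkgutil


def load_packaged_labels(package="trainer", resource="labels.txt"):
    """Load labels (one per line) from a data file shipped inside `package`."""
    data = pkgutil.get_data(package, resource)
    if data is None:
        # get_data returns None when the package cannot be located
        # or its loader cannot read data files.
        raise FileNotFoundError(
            "could not load %r from package %r" % (resource, package))
    return [line.strip() for line in data.decode("utf-8").splitlines()
            if line.strip()]
```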