A Private PyPI Server with AWS CodeArtifact
June 22, 2024
When you develop reusable Python packages or apps, you soon run into the limits of git dependencies, and you would rather not release company- or project-specific packages to the public PyPI. Setting up your own private PyPI server can be very time-consuming, and it needs maintenance and backups. Can we use AWS CodeArtifact to create our own package index instead?
What is AWS CodeArtifact
AWS CodeArtifact is a hosted AWS service that can store packages in several formats, such as npm, PyPI, Maven and NuGet, as well as generic packages. The PyPI feature is interesting as it allows us to package and distribute reusable Python applications properly. These packages (wheels) are installable with pip or poetry (or any other Python package manager that supports wheels and/or a PyPI server).
When to use a private PyPI server
There are several reasons why distributing packages through a (private) PyPI server can be beneficial:
- You can "compile" front-end artifacts (images, JavaScript, CSS, etc.) into (binary and/or compressed) assets and have them properly distributed inside a wheel (which is a Python package). Otherwise, you'll need to rebuild your assets when you install dependencies from source distributions (via git dependency links).
- The same goes for other artifacts, like translation files (no more binary .mo files in your repository!)
- You wrote Python extensions in another language, like C(++) or Rust, and you want to compile these only when you update the extensions.
- It speeds up your CI / CD pipeline: Reusable packages need to be built and released only once, instead of each time you must build your project.
- It forces you to separate concerns: Reusable apps can also be tested in isolated sandboxes, making the code less dependent on your project requirements. A "package-oriented mindset" is a sustainable way of managing software in the long term.
And there are many more advantages, like forcing you to implement (proper) versioning, implement (auto-)update strategies (Dependabot) or even decide to open-source one of your private packages.
Create a new repository
Creating a new AWS CodeArtifact repository is fairly simple from the AWS console: CodeArtifact -> Create Repository.

Or if you are using Terraform, the most basic configuration would be:
resource "aws_codeartifact_domain" "mydomain" {
domain = "mydomain"
}
resource "aws_codeartifact_repository" "myrepo" {
repository = "myrepo"
domain = aws_codeartifact_domain.mydomain.domain
}
data "aws_codeartifact_repository_endpoint" "pypi_endpoint" {
domain = aws_codeartifact_domain.mydomain.domain
repository = aws_codeartifact_repository.myrepo.repository
format = "pypi"
}
Access
When an AWS CodeArtifact repository is created, the URL for accessing the (PyPI) repository will be in the following format:
https://<domain>-<account>.d.codeartifact.<aws-region>.amazonaws.com/pypi/<repository>
An example using the Terraform configuration above in the eu-west-1 region:
https://mydomain-111122223333.d.codeartifact.eu-west-1.amazonaws.com/pypi/myrepo
To query or download packages from the index server, you'll need to append /simple to this URL:
https://mydomain...amazonaws.com/pypi/myrepo/ # <-- for publishing
https://mydomain...amazonaws.com/pypi/myrepo/simple # <-- for querying / downloading
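The URL scheme above can be sketched as a small helper. This is only an illustration: the function name `codeartifact_pypi_urls` is hypothetical, and the values are the example domain, account and region used in this post.

```python
def codeartifact_pypi_urls(domain: str, account: str, region: str, repository: str) -> dict:
    """Build the publish and index URLs for a CodeArtifact PyPI repository.

    Hypothetical helper: it only formats strings following the URL scheme
    described above and does not talk to AWS.
    """
    base = (
        f"https://{domain}-{account}.d.codeartifact."
        f"{region}.amazonaws.com/pypi/{repository}/"
    )
    return {
        "publish": base,            # used by poetry publish / twine upload
        "index": base + "simple/",  # used by pip / poetry to query and download
    }


urls = codeartifact_pypi_urls("mydomain", "111122223333", "eu-west-1", "myrepo")
print(urls["publish"])
print(urls["index"])
```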
Authentication
AWS CodeArtifact uses JWT tokens for authentication. These tokens are valid for a maximum of 12 hours, but expiration times can be shorter too (suitable for CI environments). You need to set up the following permissions for the AWS user / IAM role to be able to query the endpoint and download the packages:
- codeartifact:GetAuthorizationToken
- codeartifact:ReadFromRepository
- sts:GetServiceBearerToken
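As a rough sketch, these permissions could be expressed as an IAM policy document like the one generated below. The helper name `build_policy` is hypothetical, and `"Resource": "*"` is an assumption made for brevity; in practice you would probably narrow it down to your own domain and repository ARNs. The condition on `sts:GetServiceBearerToken` limits that permission to CodeArtifact.

```python
import json

# The CodeArtifact actions needed to request a token and download packages.
READ_ACTIONS = [
    "codeartifact:GetAuthorizationToken",
    "codeartifact:ReadFromRepository",
]


def build_policy(codeartifact_actions):
    """Return an IAM policy document (as a dict) for the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(codeartifact_actions),
                "Resource": "*",  # assumption: restrict this to your own ARNs
            },
            {
                "Effect": "Allow",
                "Action": "sts:GetServiceBearerToken",
                "Resource": "*",
                # Only allow bearer tokens that are issued for CodeArtifact.
                "Condition": {
                    "StringEquals": {
                        "sts:AWSServiceName": "codeartifact.amazonaws.com"
                    }
                },
            },
        ],
    }


policy_json = json.dumps(build_policy(READ_ACTIONS), indent=2)
print(policy_json)
```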
When an AWS IAM user / role has these permissions, you can request a token with the AWS CLI. Let's export the whole command as an environment variable. Note the single quotes: they store the command itself rather than its output, so a fresh token is requested every time the command is evaluated. You could add this to your .bashrc or .zshrc:
$ export AWS_CODEARTIFACT_TOKEN_COMMAND='aws codeartifact get-authorization-token --domain mydomain --domain-owner 111122223333 --query authorizationToken --output text'
Now you can use this token as a password with aws as the username.
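The same idea can be scripted in Python, for example in a build script. The sketch below assumes the AWS CLI is installed and configured; the function names `fetch_token` and `with_credentials` are hypothetical. It shells out to the CLI command shown above and embeds the token into an index URL as the password for the user aws.

```python
import subprocess
from urllib.parse import quote, urlsplit, urlunsplit


def fetch_token(domain, domain_owner):
    """Ask the AWS CLI for a CodeArtifact token (needs a configured AWS CLI)."""
    result = subprocess.run(
        [
            "aws", "codeartifact", "get-authorization-token",
            "--domain", domain,
            "--domain-owner", domain_owner,
            "--query", "authorizationToken",
            "--output", "text",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def with_credentials(url, token, username="aws"):
    """Return the URL with username and token embedded as basic-auth credentials."""
    parts = urlsplit(url)
    # Percent-encode the credentials so special characters cannot break the URL.
    netloc = f"{quote(username, safe='')}:{quote(token, safe='')}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Example usage (performs a real AWS call, so it is left commented out):
# token = fetch_token("mydomain", "111122223333")
# print(with_credentials("https://.../pypi/myrepo/simple/", token))
```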
With poetry
$ poetry source add --priority=supplemental aws-codeartifact-myrepo https://mydomain-111122223333.d.codeartifact.eu-west-1.amazonaws.com/pypi/myrepo/simple
$ poetry config http-basic.aws-codeartifact-myrepo aws $(eval $AWS_CODEARTIFACT_TOKEN_COMMAND)
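In CI it can be convenient to pass the credentials through environment variables instead of writing them to the poetry configuration. Poetry reads `POETRY_HTTP_BASIC_<NAME>_USERNAME` and `POETRY_HTTP_BASIC_<NAME>_PASSWORD`, where the source name is uppercased and dots and dashes are replaced by underscores. The helper below is a hypothetical sketch that derives those variable names for the source configured above.

```python
import re


def poetry_http_basic_env(source_name, username, password):
    """Return the environment variables poetry reads for HTTP basic auth."""
    # Poetry expects the source name in uppercase, with "." and "-" as "_".
    key = re.sub(r"[.-]", "_", source_name).upper()
    return {
        f"POETRY_HTTP_BASIC_{key}_USERNAME": username,
        f"POETRY_HTTP_BASIC_{key}_PASSWORD": password,
    }


# "<token>" is a placeholder for the output of the token command.
env = poetry_http_basic_env("aws-codeartifact-myrepo", "aws", "<token>")
for name in env:
    print(name)
```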
With pip
$ pip install -i https://aws:$(eval $AWS_CODEARTIFACT_TOKEN_COMMAND)@mydomain-111122223333.d.codeartifact.eu-west-1.amazonaws.com/pypi/myrepo/simple <my-private-package>
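Or you could store the index URL, including the credentials, in your pip configuration. Keep in mind that the token is written to the configuration file at that moment, so you need to run the command again once the token has expired:
$ pip config set global.index-url https://aws:$(eval $AWS_CODEARTIFACT_TOKEN_COMMAND)@mydomain-111122223333.d.codeartifact.eu-west-1.amazonaws.com/pypi/myrepo/simple/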
NetRC
Pip and Poetry also work with netrc. I wrote a simple update-netrc CLI to set the credentials for a specific host:
$ update-netrc update https://mydomain-111122223333.d.codeartifact.eu-west-1.amazonaws.com/pypi/myrepo/simple --login aws --password $(eval $AWS_CODEARTIFACT_TOKEN_COMMAND)
This command can be easily integrated into CI systems like GitHub Actions or GitLab CI with a token that is valid for a limited time.
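To illustrate what such a netrc entry looks like, here is a minimal sketch using only the Python standard library. It is not the update-netrc tool mentioned above: the function names are hypothetical, and `write_netrc` overwrites the target file instead of merging with existing entries, so the demo writes to a temporary file rather than to ~/.netrc.

```python
import netrc
import os
import tempfile
from urllib.parse import urlsplit


def netrc_entry(url, login, password):
    """Render a netrc entry for the host of the given URL."""
    host = urlsplit(url).hostname
    return f"machine {host}\nlogin {login}\npassword {password}\n"


def write_netrc(path, entry):
    """Write the entry to `path`, readable by the owner only (overwrites!)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(entry)


# Demo against a temporary file; "example-token" is a placeholder value.
url = "https://mydomain-111122223333.d.codeartifact.eu-west-1.amazonaws.com/pypi/myrepo/simple"
demo_path = os.path.join(tempfile.mkdtemp(), "netrc")
write_netrc(demo_path, netrc_entry(url, "aws", "example-token"))

# Read the file back with the standard library parser to check the entry.
login, _, password = netrc.netrc(demo_path).authenticators(urlsplit(url).hostname)
print(login, password)
```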
Publishing
Publishing packages is fairly easy, as CodeArtifact supports the standard PyPI upload API that tools like poetry and twine use. Your AWS account / IAM role needs the following permissions to allow uploading packages:
- codeartifact:GetAuthorizationToken
- codeartifact:GetRepositoryEndpoint
- codeartifact:PublishPackageVersion
- codeartifact:PutPackageMetadata
- sts:GetServiceBearerToken
First, we export the "publishable" repository as an environment variable:
$ export AWS_CODEARTIFACT_PYPI_REPOSITORY_URL=https://mydomain-111122223333.d.codeartifact.eu-west-1.amazonaws.com/pypi/myrepo
With poetry
With poetry, you first register a publishing repository in the poetry configuration, and then you can use the poetry CLI to publish the package:
# Register the repository and configure the token
$ poetry config repositories.aws-codeartifact-myrepo-publish $AWS_CODEARTIFACT_PYPI_REPOSITORY_URL
$ poetry config http-basic.aws-codeartifact-myrepo-publish aws $(eval $AWS_CODEARTIFACT_TOKEN_COMMAND)
Now we can publish the package after building it:
$ poetry build
$ poetry publish --repository aws-codeartifact-myrepo-publish
With Twine
With Twine you can upload your package with the CLI in a single line:
$ twine upload --repository-url $AWS_CODEARTIFACT_PYPI_REPOSITORY_URL --username aws --password $(eval $AWS_CODEARTIFACT_TOKEN_COMMAND) mypackage.whl
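When the upload is scripted, the credentials can also be handed to Twine through its `TWINE_REPOSITORY_URL`, `TWINE_USERNAME` and `TWINE_PASSWORD` environment variables, which keeps the token out of the command line. The sketch below assumes Twine is installed; the function names `twine_invocation` and `upload` are hypothetical.

```python
import os
import subprocess


def twine_invocation(repository_url, token, files):
    """Return the twine command and the extra environment variables for an upload."""
    argv = ["twine", "upload", "--non-interactive", *files]
    extra_env = {
        "TWINE_REPOSITORY_URL": repository_url,
        "TWINE_USERNAME": "aws",
        "TWINE_PASSWORD": token,  # passed via the environment, not via argv
    }
    return argv, extra_env


def upload(repository_url, token, files):
    """Run twine with the credentials injected into its environment."""
    argv, extra_env = twine_invocation(repository_url, token, files)
    subprocess.run(argv, env={**os.environ, **extra_env}, check=True)


# Example usage (performs a real upload, so it is left commented out):
# upload(os.environ["AWS_CODEARTIFACT_PYPI_REPOSITORY_URL"], token, ["mypackage.whl"])
```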
Useful links
- 🔗 Private packages with CodeArtifact and Poetry, an excellent tutorial with poetry
- 🔗 Pip and CodeArtifact, how to configure pip with AWS CodeArtifact