
Title: Mirroring infrastructure for PyPI
Author: Tarek Ziadé <tarek at ziade.org>, Martin v. Löwis <martin at v.loewis.de>
Type: Standards Track


  • Mirror listing and registering
  • Special pages a mirror needs to provide
  • How a mirror should synchronize with PyPI

This PEP describes a mirroring infrastructure for PyPI.

The main PyPI web service was moved behind the Fastly caching CDN in May 2013: https://mail.python.org/pipermail/distutils-sig/2013-May/020848.html

Subsequently, this arrangement was formalised as an in-kind sponsorship with the PSF, and the PSF has also taken on the task of risk management in the event that the sponsorship arrangement were ever to cease.

The download statistics that were previously provided directly on PyPI are now published indirectly via Google BigQuery: https://packaging.python.org/guides/analyzing-pypi-package-downloads/

Accordingly, the mirroring proposal described in this PEP is no longer required, and has been marked as Withdrawn.

PyPI is hosting over 6000 projects and is used on a daily basis by people to build applications. Systems like easy_install and zc.buildout in particular make intensive use of PyPI.

For people making intensive use of PyPI, it can act as a single point of failure. People have started to set up some mirrors, both private and public. Those mirrors are active mirrors, which means that they browse PyPI to stay in sync.

In order to make the system more reliable, this PEP describes:

  • the mirror listing and registering at PyPI
  • the pages a public mirror should maintain. These pages will be used by PyPI, in order to get hit counts and the last modified date.
  • how a mirror should synchronize with PyPI
  • how a client can implement a fail-over mechanism

People who want to mirror PyPI make a proposal on catalog-SIG. When a mirror is proposed on the mailing list, it is manually added to a mirror list in the PyPI application after it has been checked to be compliant with the mirroring rules.

The mirror list is provided as a list of host names of the form

    X.pypi.python.org

The values of X are the sequence a, b, c, ..., aa, ab, and so on. The host a.pypi.python.org is the master server; the mirrors start with b. A CNAME record last.pypi.python.org points to the last host name. Mirror operators should use a static address, and report planned changes to that address in advance to distutils-sig.

The new mirror also appears at http://pypi.python.org/mirrors which is a human-readable page that gives the list of mirrors. This page also explains how to register a new mirror.

Statistics page

PyPI provides statistics on downloads at /stats. This page is calculated daily by PyPI, by reading all mirrors' local stats and summing them.

The stats are presented in daily or monthly files, under /stats/days and /stats/months. Each file is a bzip2 file with these formats:

  • YYYY-MM-DD.bz2 for daily files
  • YYYY-MM.bz2 for monthly files


  • /stats/days/2008-11-06.bz2
  • /stats/days/2008-11-07.bz2
  • /stats/days/2008-11-08.bz2
  • /stats/months/2008-11.bz2
  • /stats/months/2008-10.bz2

With a distributed mirroring system, clients may want to verify that the mirrored copies are authentic. There are multiple threats to consider:

  1. the central index may get compromised
  2. the central index is assumed to be trusted, but the mirrors might be tampered with.
  3. a man in the middle between the central index and the end user, or between a mirror and the end user, might tamper with datagrams.

This specification only deals with the second threat. Some provisions are made to detect man-in-the-middle attacks. To detect the first attack, package authors need to sign their packages using PGP keys, so that users can verify that the package comes from the author they trust.

The central index provides a DSA key at the URL /serverkey, in the PEM format as generated by 'openssl dsa -pubout' (i.e. an RFC 3280 SubjectPublicKeyInfo). This URL must not be mirrored, and clients must fetch the official serverkey from PyPI directly, or use the copy that came with the PyPI client software. Mirrors should still download the key, to detect a key rollover.

For each package, a mirrored signature is provided at /serversig/<package>. This is the DSA signature of the parallel URL /simple/<package>, in DER form, using SHA-1 with DSA (i.e. as an RFC 3279 Dsa-Sig-Value, created by algorithm 1.2.840.10040.4.3).

Clients using a mirror need to perform the following steps to verify a package:

  1. download the /simple page, and compute its SHA-1 hash
  2. compute the DSA signature of that hash
  3. download the corresponding /serversig, and compare it (byte-for-byte) with the value computed in step 2.
  4. compute and verify (against the /simple page) the MD5 hashes of all files they download from the mirror.
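
Step 4 above can be sketched with the Python standard library alone. This is a minimal sketch, not the code from verify.py: it assumes that each file link on the /simple page carries its digest as an "#md5=<hex>" URL fragment, and the names expected_md5s and file_matches are hypothetical helpers introduced for illustration.

```python
import hashlib
import posixpath
from html.parser import HTMLParser
from urllib.parse import urldefrag, urlparse


class _LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag on a /simple page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def expected_md5s(simple_page_html):
    """Map file name -> expected MD5 hex digest.

    Assumption: digests are published as '#md5=<hex>' URL fragments.
    """
    collector = _LinkCollector()
    collector.feed(simple_page_html)
    digests = {}
    for href in collector.hrefs:
        url, fragment = urldefrag(href)
        if fragment.startswith("md5="):
            filename = posixpath.basename(urlparse(url).path)
            digests[filename] = fragment[len("md5="):].lower()
    return digests


def file_matches(simple_page_html, filename, file_bytes):
    """Return True if file_bytes hashes to the digest listed for filename."""
    expected = expected_md5s(simple_page_html).get(filename)
    if expected is None:
        return False  # not listed on the page, so treat it as unverified
    return hashlib.md5(file_bytes).hexdigest() == expected
```

A client would call file_matches() with the already verified /simple page, the name of the downloaded file and its raw bytes, and discard the file when the result is False.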

An implementation of the verification algorithm is available from https://svn.python.org/packages/trunk/pypi/tools/verify.py

Verification is not needed when downloading from the central index, and should be avoided to reduce the computation overhead.

About once a year, the key will be replaced with a new one. Mirrors will have to re-fetch all /serversig pages. Clients using mirrors need to find a trusted copy of the new server key. One way to obtain one is to download it from https://pypi.python.org/serverkey. To detect man-in-the-middle attacks, clients need to verify the SSL server certificate, which will be signed by the CACert authority.

A mirror is a subset copy of PyPI, so it provides the same structure by copying it.

  • simple: REST version of the package index
  • packages: packages, stored by Python version and letters
  • serversig: signatures for the simple pages

It also needs to provide two specific elements:

  • last-modified
  • local-stats

Last modified date

CPAN uses a freshness date system where the mirror's last synchronisation date is made available.

For PyPI, each mirror needs to maintain a URL with simple text content that represents the last synchronisation date the mirror maintains.

The date is provided in GMT time, using the ISO 8601 format [3]. Each mirror is responsible for maintaining its last modified date.

This page must be located at /last-modified and must be a text/plain page.
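
As an illustration, a mirror could produce and read the body of /last-modified as follows. This is a minimal sketch: the PEP only requires ISO 8601 in GMT, so the exact profile used here (YYYY-MM-DDTHH:MM:SS with no offset suffix) is an assumption, and both function names are hypothetical.

```python
from datetime import datetime, timezone

# Assumed ISO 8601 profile; the PEP does not pin down the precision.
_FORMAT = "%Y-%m-%dT%H:%M:%S"


def format_last_modified(moment):
    """Render an aware datetime as a GMT timestamp for /last-modified."""
    return moment.astimezone(timezone.utc).strftime(_FORMAT)


def parse_last_modified(text):
    """Parse the text/plain body of /last-modified into an aware datetime."""
    naive = datetime.strptime(text.strip(), _FORMAT)
    return naive.replace(tzinfo=timezone.utc)
```

A client could compare the parsed value with the current time to decide whether a mirror is too stale to use.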

Local statistics

Each mirror is responsible for counting all the downloads that were made through it. PyPI uses these counts to sum up all downloads and display the grand total.

These statistics are in CSV-like form, with a header in the first line. The file needs to obey PEP 305 [1]. Basically, it should be readable by Python's csv module.

The fields in this file are:

  • package: the distutils id of the package.
  • filename: the filename that has been downloaded.
  • useragent: the User-Agent of the client that has downloaded the package.
  • count: the number of downloads.

The content will look like this:


The counting starts the day the mirror is launched, and there is one file per day, compressed using the bzip2 format. Each file is named after the day. For example, 2008-11-06.bz2 is the file for the 6th of November 2008.

They are then provided in a folder called days. For example:

  • /local-stats/days/2008-11-06.bz2
  • /local-stats/days/2008-11-07.bz2
  • /local-stats/days/2008-11-08.bz2

This page must be located at /local-stats.
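
The daily file described above can be written and read back with the csv and bz2 modules. This is a minimal sketch under stated assumptions: the header row reuses the four field names listed earlier, the text is encoded as UTF-8 (the PEP does not name an encoding), and the helper names are hypothetical.

```python
import bz2
import csv
import io

# Field names taken from the list above, used as the CSV header row.
FIELDS = ["package", "filename", "useragent", "count"]


def stats_path(day):
    """Return the URL path of the daily file for a datetime.date."""
    return "/local-stats/days/%s.bz2" % day.isoformat()


def encode_daily_stats(rows):
    """Serialise (package, filename, useragent, count) rows to bzip2 CSV."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(FIELDS)
    writer.writerows(rows)
    # UTF-8 is an assumption; the PEP does not specify an encoding.
    return bz2.compress(buffer.getvalue().encode("utf-8"))


def decode_daily_stats(blob):
    """Read a daily file back into a list of dicts keyed by field name."""
    text = bz2.decompress(blob).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text, newline="")))
```

Note that csv.DictReader returns every value as a string, so the count has to be converted with int() before summing.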

A mirroring protocol called Simple Index was described and implemented by Martin v. Loewis and Jim Fulton, based on how easy_install works. This section synthesizes it and gives a few relevant links, plus a small part about User-Agent.

The mirroring protocol

Mirrors must reduce the amount of data transferred between the central server and the mirror. To achieve that, they MUST use the changelog() PyPI XML-RPC call, and only refetch the packages that have been changed since the last time. For each package P, they MUST copy documents /simple/P/ and /serversig/P. If a package is deleted on the central server, they MUST delete the package and all associated files. To detect modification of package files, they MAY cache the file's ETag, and MAY request skipping it using the If-None-Match header.

Each mirroring tool MUST identify itself using a descriptive User-agent header.

The pep381client package [2] provides an application that respects this protocol to browse PyPI.
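
The synchronization rules above can be sketched without touching the network. This is a minimal sketch, not the pep381client implementation: it assumes each changelog() entry is a (name, version, timestamp, action) tuple, the sample User-Agent string is made up, and all three function names are hypothetical.

```python
import urllib.request


def packages_to_refetch(changelog_entries):
    """Return the sorted names of packages touched since the last sync.

    Assumption: each entry is (name, version, timestamp, action).
    """
    return sorted({entry[0] for entry in changelog_entries})


def documents_for(package):
    """Return the two documents a mirror MUST copy for one package."""
    return ["/simple/%s/" % package, "/serversig/%s" % package]


def conditional_request(url, cached_etag=None,
                        user_agent="example-mirror/0.1"):
    """Build a request that identifies the tool and reuses a cached ETag."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    if cached_etag:
        # Lets the server answer "not modified" instead of resending the file.
        request.add_header("If-None-Match", cached_etag)
    return request
```

In a real mirror the entries would come from the XML-RPC call (for example through xmlrpc.client.ServerProxy), and packages that were removed would additionally be deleted locally.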

User-agent request header

In order to be able to differentiate actions taken by clients over PyPI, a specific user agent name should be provided by all mirroring software.

This is also true for all clients like:

  • zc.buildout [4].
  • setuptools [5].
  • pip [6].

XXX user agent registering mechanism at PyPI ?

How a client can use PyPI and its mirrors

Clients that are browsing PyPI should be able to use alternative mirrors, by getting the list of the mirrors using last.pypi.python.org.

Code example:
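
A minimal sketch of how a client might derive the mirror host names from the name that last.pypi.python.org resolves to. The canonical name could be obtained with socket.gethostbyname_ex("last.pypi.python.org")[0]; the sketch takes it as a parameter so that it does not depend on DNS. The function names are hypothetical and not part of the PEP.

```python
import string


def label_to_index(label):
    """Map a mirror label to its position: 'a' -> 0, 'z' -> 25, 'aa' -> 26."""
    index = 0
    for char in label:
        index = index * 26 + (ord(char) - ord("a") + 1)
    return index - 1


def index_to_label(index):
    """Inverse of label_to_index: 0 -> 'a', 25 -> 'z', 26 -> 'aa'."""
    index += 1
    chars = []
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        chars.append(string.ascii_lowercase[remainder])
    return "".join(reversed(chars))


def mirror_hosts(last_host, domain="pypi.python.org"):
    """List every mirror from b up to the label found in last_host.

    The label 'a' is skipped because it names the master server.
    """
    last_label = last_host.split(".", 1)[0]
    last_index = label_to_index(last_label)
    return ["%s.%s" % (index_to_label(i), domain)
            for i in range(1, last_index + 1)]
```

For example, if the CNAME points to d.pypi.python.org, mirror_hosts() yields the b, c and d hosts.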

The clients so far that could use this mechanism:

  • setuptools
  • zc.buildout (through setuptools)
  • pip

Fail-over mechanism

Clients that are browsing PyPI should be able to use a fail-over mechanism when PyPI or the used mirror is not responding.

It is up to the client to decide which mirror should be used, maybe by looking at its geographical location and its responsiveness.

This PEP does not describe how this fail-over mechanism should work, but it is strongly encouraged that the clients try to use the nearest mirror.

The clients so far that could use this mechanism:

  • setuptools
  • zc.buildout (through setuptools)
  • pip
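
Since the PEP leaves the fail-over strategy to the client, the following is only one possible approach, given as a minimal sketch: try the hosts in the order the client prefers and return the first successful response. The fetch callable and the function name are hypothetical; fetch is assumed to raise OSError (which urllib.error.URLError subclasses) when a host does not respond.

```python
def fetch_with_failover(path, hosts, fetch):
    """Try each host in order and return (host, result) for the first success.

    fetch is a callable taking (host, path); it is assumed to raise
    OSError when the host is unreachable or not responding.
    """
    errors = []
    for host in hosts:
        try:
            return host, fetch(host, path)
        except OSError as exc:
            # Remember the failure and move on to the next candidate.
            errors.append((host, exc))
    raise ConnectionError("all hosts failed for %s: %r" % (path, errors))
```

Ordering the hosts by measured latency or geographical proximity before calling the function would follow the recommendation to prefer the nearest mirror.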

Extra package indexes

Some packages will obviously not be uploaded to PyPI, either because they are private or because the project maintainer runs a separate server from which people can get the project package. However, it is strongly encouraged that a public package index follows PyPI and Distutils protocols.

In other words, the register and upload commands should be compatible with any package index server out there.

Software that is compatible with PyPI and Distutils so far:

  • PloneSoftwareCenter [7] which is used to run plone.org products section.
  • EggBasket [8].

An extra package index is not a mirror of PyPI, but can have some mirrors itself.

Merging several indexes

When a client needs to get some packages from several distinct indexes, it should be able to use each one of them as a potential source of packages. Different indexes should be defined as a sorted list for the client to look for a package.

Each independent index can of course provide a list of its mirrors.

XXX define how to get the hostname for the mirrors of an arbitrary index.

That permits all combinations at client level, for a reliable packaging system with all levels of privacy.

It is up to the client to deal with the merging.
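
One way a client could consult a sorted list of indexes is sketched below. This is a minimal sketch of a first-match policy, which is an assumption since the PEP leaves the merging to the client; the lookup callables and the function name are hypothetical.

```python
def find_package(name, indexes):
    """Return (index_name, files) from the first index that has the package.

    indexes is an ordered list of (index_name, lookup) pairs, where
    lookup(name) returns a list of file URLs (empty if unknown).
    """
    for index_name, lookup in indexes:
        files = lookup(name)
        if files:
            return index_name, files
    return None  # no index in the list provides the package
```

Placing a private index before PyPI in the list makes private packages take precedence over public ones with the same name.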


This document has been placed in the public domain.

Source: https://github.com/python/peps/blob/master/pep-0381.txt
Author: Eric Gazoni, Charlie Clark
Source code: https://foss.heptapod.net/openpyxl/openpyxl
Generated: Mar 09, 2021


openpyxl is a Python library to read/write Excel 2010 xlsx/xlsm/xltx/xltm files.

It was born from the lack of an existing library to read and write the Office Open XML format natively from Python.

All kudos to the PHPExcel team as openpyxl was initially based on PHPExcel.


By default openpyxl does not guard against quadratic blowup or billion laughs XML attacks. To guard against these attacks, install defusedxml.

Mailing List

The user list can be found on http://groups.google.com/group/openpyxl-users

Sample code:


The documentation is at: https://openpyxl.readthedocs.io

  • installation methods
  • code examples
  • instructions for contributing

Release notes: https://openpyxl.readthedocs.io/en/stable/changes.html


This is an open source project, maintained by volunteers in their spare time. This may well mean that particular features or functions that you would like are missing. But things don’t have to stay that way. You can contribute to the project's development yourself or contract a developer for particular features.

Professional support for openpyxl is available from Clark Consulting & Research and Adimian. Donations to the project to support further development and maintenance are welcome.

Bug reports and feature requests should be submitted using the issue tracker. Please provide a full traceback of any error you see and, if possible, a sample file. If for reasons of confidentiality you are unable to make a file publicly available, then contact one of the developers.

The repository is being provided by Octobus and Clever Cloud.

How to Contribute

Any help will be greatly appreciated, just follow these steps:

1. Please join the group and create a branch (https://foss.heptapod.net/openpyxl/openpyxl/) for each independent feature, following the Merge Request Start Guide. Don’t try to fix all problems at the same time; it’s easier for those who will review and merge your changes ;-)

2. Hack hack hack

3. Don’t forget to add unit tests for your changes! (YES, even if it’s a one-liner, changes without tests will not be accepted.) There are plenty of examples in the source if you lack know-how or inspiration.

4. If you added a whole new feature, or just improved something, you can be proud of it, so add yourself to the AUTHORS file :-)

5. Let people know about the shiny thing you just implemented, update the docs!

6. When it’s done, just issue a pull request (click on the large “pull request” button on your repository) and wait for your code to be reviewed, and, if you followed all these steps, merged into the main repository.

For further information see Development

Other ways to help

There are several ways to contribute, even if you can’t code (or can’t code well):

  • triaging bugs on the bug tracker: closing bugs that have already been closed, are not relevant, cannot be reproduced, …
  • updating documentation in virtually every area: many large features have been added (mainly about charts and images at the moment) but without any documentation, it’s pretty hard to do anything with it
  • proposing compatibility fixes for different versions of Python: we support 3.6, 3.7, 3.8 and 3.9.


Install openpyxl using pip (pip install openpyxl). It is advisable to do this in a Python virtualenv without system packages.

There is support for the popular lxml library, which will be used if it is installed. This is particularly useful when creating large files.

To be able to include images (jpeg, png, bmp, …) in an openpyxl file, you will also need the “pillow” library, which can be installed with pip (pip install pillow); or browse https://pypi.python.org/pypi/Pillow/, pick the latest version and head to the bottom of the page for Windows binaries.

Working with a checkout

Sometimes you might want to work with the checkout of a particular version. This may be the case if bugs have been fixed but a release has not yet been made.

Usage examples


  • Tutorial
    • Playing with data
    • Data storage


  • Simple usage


  • Performance
    • Benchmarks

Other topics

Information for Developers

API Documentation

Key Classes

Full API
