Installing and exploring Pandas
The minimal set of dependencies for Pandas is as follows:
- NumPy: This is the fundamental numerical array package that we installed and covered extensively in the preceding chapters
- python-dateutil: This is a date handling library
- pytz: This handles time zone definitions
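Before installing Pandas, it can be useful to check which of these dependencies are already present. The following is a minimal sketch, not part of the original program set; the helper name check_dependencies and the REQUIRED list are illustrative assumptions. Note that python-dateutil is imported under the name dateutil.

```python
import importlib


def check_dependencies(module_names):
    """Map each module name to its version string, or None if it is missing."""
    found = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None  # not installed in this environment
        else:
            # Not every module exposes __version__, so fall back to a marker.
            found[name] = getattr(module, "__version__", "unknown")
    return found


# Import names of the three required packages (hypothetical list for this sketch);
# python-dateutil is imported as "dateutil".
REQUIRED = ["numpy", "dateutil", "pytz"]

for name, version in check_dependencies(REQUIRED).items():
    print(name, version if version is not None else "MISSING")
```

Any package reported as MISSING is normally pulled in automatically when Pandas is installed with pip, so this check is mainly a diagnostic aid.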
This list is the bare minimum; a longer list of optional dependencies can be located at http://pandas.pydata.org/pandas-docs/stable/install.html. We can install Pandas via PyPI with pip or easy_install, using a binary installer, with the aid of our operating system package manager, or from the source by checking out the code. The binary installers can be downloaded from http://pandas.pydata.org/getpandas.html.
The command to install Pandas with pip is as follows:
$ pip3 install pandas rpy2
rpy2 is an interface to R; we install it here because the pandas.rpy module is being deprecated in favor of using rpy2 directly. You may have to prepend the preceding command with sudo if your user account doesn't have sufficient rights.
As we saw in IPython Notebook in Chapter 1, Getting Started with Python Libraries, we can print the version and subpackages of Pandas. The program printed the following output for Pandas:
pandas version 0.19.0
pandas.api
pandas.compat DESCRIPTION compat Cross-compatible functions for Python 2 and 3. Key items to import for 2/3 compatible code: * iterators: range(), map(),
pandas.computation
pandas.core
pandas.formats
pandas.indexes
pandas.io
pandas.msgpack DESCRIPTION # coding: utf-8 # flake8: noqa PACKAGE CONTENTS _packer _unpacker _version exceptions CLASSES ExtType(builtins.tuple) ExtType cl
pandas.rpy DESCRIPTION # GH9602 # deprecate rpy to instead directly use rpy2 PACKAGE CONTENTS base common mass vars FILE /usr/local/lib/python3.5/site-
pandas.sparse
pandas.stats
pandas.tests
pandas.tools
pandas.tseries
pandas.types
pandas.util
Unfortunately, the documentation of the Pandas subpackages lacks informative descriptions; however, the subpackage names are descriptive enough for us to get an idea of what they are about.
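A listing of this kind can be approximated with the standard library. The following is a minimal sketch under stated assumptions, not the program from Chapter 1: the function name list_subpackages is hypothetical, and it only reports the version and the names of direct subpackages, without the documentation excerpts shown in the output above.

```python
import importlib
import pkgutil


def list_subpackages(package_name):
    """Return (version, sorted dotted names of the direct subpackages)."""
    package = importlib.import_module(package_name)
    version = getattr(package, "__version__", "unknown")
    names = sorted(
        "{}.{}".format(package_name, name)
        for _, name, is_pkg in pkgutil.iter_modules(package.__path__)
        if is_pkg  # keep subpackages, skip plain modules
    )
    return version, names


try:
    version, names = list_subpackages("pandas")
except ImportError:
    print("pandas is not installed")
else:
    print("pandas version", version)
    for name in names:
        print(name)
```

The set of subpackages depends on the installed Pandas version, so the names printed on a newer release will differ from the 0.19.0 listing.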