Python Excel

From SerialHobbyists Wiki
Jump to: navigation, search

You can create and read Microsoft Excel formatted spreadsheets in Python using the xlwt and xlrd modules of the Python Excel project. These packages are available in the official repositories of recent versions of Ubuntu. This page is not a full guide to using the modules, but is rather a cookbook of common functions one might want to perform. A full tutorial is available as a PDF on the project site.

All examples assume that the xlwt module is imported. Some may use objects that were created in previous examples.

Creating a New Spreadsheet

New Workbook

A workbook is one entire Excel file, which can contain many worksheets.

wb = xlwt.Workbook()

New Worksheet

A worksheet is a single spreadsheet page within a workbook.

ws = wb.add_sheet('My Sheet Name', cell_overwrite_ok=True)

The cell_overwrite_ok argument allows you to write data to cells multiple times if it is True. The last value written to a cell will be written to the sheet. If the argument is False or absent, a write to a cell that already contains data will raise an exception.

Saving a Workbook

wb.save('myfile.xls')

The save method can alternatively be passed a file-like object.

Rows, Columns, and Cells

Rows

A row represents the formatting and contents of all the cells in one horizontal row of the worksheet. Note that the row numbering in an Excel spreadsheet starts with row 1, while in Python it follows the array indexing standard of starting with row 0. In other words, Row 1 in an Excel spreadsheet is Row 0 in a Python Excel worksheet.

row1 = ws.row(0)

Rows, by default, will remain in memory once created. When working with large amounts of data, the rows can be flushed from memory to disk.

ws.flush_row_data()

Columns

A column represents the formatting of a vertical column of cells. It does not contain cell data. Columns in an Excel spreadsheet are represented by letters. Like rows, in Python they are numbered, starting with 0.

colA = ws.col(0)

Cells

A cell is a single data entry in a worksheet, including its formatting. Cell objects are usually not manipulated directly, but can be manipulated through their Row or Worksheet object. More on this below.

Writing to Cells

Writing Data

Cells can be written in a couple of different ways. The simplest is to use the write method of the worksheet:

# Worksheet.write(row, column, data)
ws.write(2, 5, 'This is cell F3')

If working in rows, they can also be written directly from the row object:

# Row.write(column, data)
row1.write(4, 'This is cell E1')

The type of data to write to the worksheet will be automatically determined from the value passed. If working with large quantities of data, there are functions specific to each type of data that have a speed advantage.

# Text: string, unicode
row1.set_cell_text(0, 'Text cell')
# Number: float, int, long, decimal.Decimal
row1.set_cell_number(1, 71)
# Date: datetime.datetime, datetime.date, datetime.time
row1.set_cell_date(2, datetime.date(year=2013, month=5, day=25))
# Boolean: bool
row1.set_cell_boolean(3, True)
# Blank: for setting formatting on a blank cell
row1.set_cell_blank(4)

Writing Formulae

Adding a formula to a cell is slightly more complicated. xlwt provides a Formula class for the job. You can create a Formula object by passing in a string representing the formula as it would be entered into Excel, but without the leading "=".

ws.write(5, 5, Formula('SUM(A6:E6)'))
row1.set_cell_formula(5, Formula('SQRT(B1)')

The xlwt.Utils module provides a number of convenience functions that can be used to convert between Python indices and row/column/cell labels. For full details, see the tutorial document.

# xlwt.Utils.cell_to_rowcol(cellstring) -> (row_number, col_number, row_is_absolute, col_is_absolute)
>>> xlwt.Utils.cell_to_rowcol('C6')
(5, 2, False, False)
>>> xlwt.Utils.cell_to_rowcol('C$6')
(5, 2, True, False)
>>> xlwt.Utils.cell_to_rowcol('$C$6')
(5, 2, True, True)
# xlwt.Utils.rowcol_to_cell(row_number, col_number) -> cellstring
>>> xlwt.Utils.rowcol_to_cell(7,4)
'E8'
# xlwt.Utils.cellrange_to_rowcol_pair(rangestring) -> (row1_number, col1_number, row2_number, col2_number)
>>> xlwt.Utils.cellrange_to_rowcol_pair('B3:D7')
(2, 1, 6, 3)
# xlwt.Utils.rowcol_pair_to_cellrange(row1_number, col1_number, row2_number, col2_number) -> rangestring
>>> xlwt.Utils.rowcol_pair_to_cellrange(2, 1, 6, 3)
'B3:D7'

Formatting

The formatting of a cell can be described using a list of elements, each of which has one or more attribute-value pairs. The most common attributes of each element are described below. See the tutorial for a full list of attributes.

Font Element

The font element describes the font used in the cell.

Attributes:

  • name: Font name (e.g. Arial, Times New Roman, etc.)
  • height: Size of font; multiply desired point size by 10
  • bold: True or False
  • italic: True or False
  • outline: True or False
  • shadow: True or False
  • struck_out: True or False
  • underline: True or False
  • color: text color, see below for possible values
  • escapement: none, superscript, or subscript

Alignment Element

The alignment element describes the alignment of text in the cell.

Attributes:

  • horizontal: general, left, center, right, filled, justified, center_across_selection, or distributed
  • vertical: top, center, bottom, justified, or distributed
  • rotation: none, stacked, or an integer between -90 and 90 (degrees)
  • shrink_to_fit: True or False
  • wrap: True or False; useful for making multi-line text cells

Borders Element

The borders element describes the type and color of the cell's borders.

Attributes:

  • left, right, top, bottom: no_line, thin, medium, dashed, dotted, thick, double, hair, medium_dashed, thin_dash_dotted, medium_dash_dotted, thin_dash_dot_dotted, medium_dash_dot_dotted, or slanted_medium_dash_dotted
  • left_color, right_color, top_color, bottom_color, diag_color: border color, see below for possible values

Pattern Element

The pattern element describes the background fill of the cell.

Attributes:

  • pattern: no_fill, none, solid, solid_fill, solid_pattern, fine_dots, alt_bars, sparse_dots, thick_horz_bands, thick_vert_bands, thick_backward_diag, thick_forward_diag, big_spots, bricks, thin_horz_bands, thin_vert_bands, thin_backward_diag, thin_forward_diag, squares, or diamonds
  • fore_color, back_color: Foreground and background color, see below for possible values

Protection Element

The protection element describes cell protection features.

Attributes:

  • cell_locked: True or False
  • formula_hidden: True or False

Colors

The list of colors available are the same as the standard palette within the Excel application:

aqua dark_green gray40 light_orange periwinkle teal
black dark_green_ega gray50 light_turquoise pink teal_ega
blue dark_purple gray80 light_yellow plum turquoise
blue_gray dark_red green lime purple_ega violet
bright_green dark_red_ega ice_blue magenta_ega red white
brown dark_teal indigo ocean_blue rose yellow
coral dark_yellow ivory olive_ega sea_green
cyan_ega gold lavender olive_green silver_ega
dark_blue gray_ega light_blue orange sky_blue
dark_blue_ega gray25 light_green pale_blue tan

Defining Styles with the easyxf Class

While there are separate classes available for each element, the easiest way to set up a format is to define an easyxf object. Create a new instance by passing it a single string consisting of a semicolon-delimited list of elements, each in the following format:

element: attribute1 value1, attribute2 value2, attribute3 value3, ...

Example:

style = xlwt.easyxf('font: color blue, bold True; alignment: horizontal center; borders: top medium, bottom medium, left hair, right hair')

An easyxf object can be passed as an additional argument to the write method of a worksheet or row, or to the specific write methods. It can also be used for an entire row or column with the set_style method of a Row or Column instance.

ws.write(2, 3, 'This is cell D3', xlwt.easyxf('font: color red'))
row1.set_style(xlwt.easyxf('pattern: pattern diamonds, color yellow'))

Miscellaneous Features

Merging Cells

Cells can be merged (horizontally, vertically, or both) using the write_merge method of a worksheet.

# Worksheet.write_merge(row1, row2, col1, col2, data)
ws.write_merge(2, 3, 2, 4, 'Hello world')
# Merges 6 cells (C3 to D5), and fills the merged cell with text