Class in Python to Read Xls File

Watch Now: This tutorial has a related video course created by the Real Python team. Watch it together with the written tutorial to deepen your understanding: Editing Excel Spreadsheets in Python With openpyxl

Excel spreadsheets are one of those things you might have to deal with at some point. Whether it's because your boss loves them or because marketing needs them, you might have to learn how to work with spreadsheets, and that's when knowing openpyxl comes in handy!

Spreadsheets are a very intuitive and user-friendly way to manipulate large datasets without any prior technical background. That's why they're still so commonly used today.

In this article, you'll learn how to use openpyxl to:

Manipulate Excel spreadsheets with confidence
Extract information from spreadsheets
Create simple or more complex spreadsheets, including adding styles, charts, and so on

This article is written for intermediate developers who have a pretty good knowledge of Python data structures, such as dicts and lists, and who also feel comfortable with OOP and other intermediate-level topics.

Before You Begin

If you ever get asked to extract some data from a database or log file into an Excel spreadsheet, or if you often have to convert an Excel spreadsheet into some more usable programmatic form, then this tutorial is perfect for you. Let's jump into the openpyxl caravan!

Practical Use Cases

First things first, when would you need to use a package like openpyxl in a real-world scenario? You'll see a few examples below, but really, there are hundreds of possible scenarios where this knowledge could come in handy.

Importing New Products Into a Database

You are responsible for tech in an online store company, and your boss doesn't want to pay for a cool and expensive CMS system.

Every time they want to add new products to the online store, they come to you with an Excel spreadsheet with a few hundred rows and, for each of them, you have the product name, description, price, and so forth.

Now, to import the data, you'll have to iterate over each spreadsheet row and add each product to the online store.

Exporting Database Data Into a Spreadsheet

Say you have a database table where you record all your users' information, including name, phone number, email address, and so forth.

Now, the Marketing team wants to contact all users to give them some discounted offer or promotion. However, they don't have access to the database, or they don't know how to use SQL to extract that information easily.

What can you do to help? Well, you can make a quick script using openpyxl that iterates over every single User record and puts all the essential information into an Excel spreadsheet.

That's gonna earn you an extra slice of cake at your company's next birthday party!

Appending Data to an Existing Spreadsheet

You may also have to open a spreadsheet, read the information in it and, according to some business logic, append more data to it.

For example, using the online store scenario again, say you get an Excel spreadsheet with a list of users and you need to append to each row the total amount they've spent in your store.

This data is in the database and, in order to do this, you have to read the spreadsheet, iterate through each row, fetch the total amount spent from the database, and then write back to the spreadsheet.

Not a problem for openpyxl!

Learning Some Basic Excel Terminology

Here's a quick list of basic terms you'll see when you're working with Excel spreadsheets:

Term	Explanation
Spreadsheet or Workbook	A Spreadsheet is the main file you are creating or working with.
Worksheet or Sheet	A Sheet is used to split different kinds of content within the same spreadsheet. A Spreadsheet can have one or more Sheets.
Column	A Column is a vertical line, and it's represented by an uppercase letter: A.
Row	A Row is a horizontal line, and it's represented by a number: 1.
Cell	A Cell is a combination of Column and Row, represented by both an uppercase letter and a number: A1.
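To make the column and cell notation concrete, here is a small sketch that converts between column letters and column numbers. It assumes the helper functions get_column_letter() and column_index_from_string() from openpyxl.utils, and the column and row numbers are arbitrary examples rather than anything taken from the tutorial's dataset.

```python
from openpyxl.utils import column_index_from_string, get_column_letter

# Column letters and 1-based column numbers are two views of the same thing.
col_letter = get_column_letter(6)            # letter of the 6th column
col_number = column_index_from_string("F")   # number of column F

# A cell reference is the column letter followed by the row number.
cell_ref = f"{col_letter}{10}"

print(col_letter, col_number, cell_ref)
```

This is handy later on, when some openpyxl calls take a reference such as "F10" and others take numeric row and column arguments.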

Getting Started With openpyxl

Now that you're aware of the benefits of a tool like openpyxl, let's get down to it and start by installing the package. For this tutorial, you should use Python 3.7 and openpyxl 2.6.2. To install the package, you can do the following:

    $ pip install openpyxl

After you install the package, you should be able to create a super simple spreadsheet with the following code:

    from openpyxl import Workbook

    workbook = Workbook()
    sheet = workbook.active

    # Write a value into each of the first two cells of row 1
    sheet["A1"] = "hello"
    sheet["B1"] = "world!"

    workbook.save(filename="hello_world.xlsx")

The code above should create a file called hello_world.xlsx in the folder you are using to run the code. If you open that file with Excel you should see something like this:

Woohoo, your first spreadsheet created!
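If Excel isn't at hand, one way to check the result is to load the file back with openpyxl and look at the first row. The sketch below writes into a temporary folder so it doesn't leave files behind; the temporary folder is an assumption of this example, not part of the tutorial's setup.

```python
import tempfile
from pathlib import Path

from openpyxl import Workbook, load_workbook

with tempfile.TemporaryDirectory() as tmp_dir:
    path = str(Path(tmp_dir) / "hello_world.xlsx")

    # Create and save the same two-cell spreadsheet as above.
    workbook = Workbook()
    sheet = workbook.active
    sheet["A1"] = "hello"
    sheet["B1"] = "world!"
    workbook.save(filename=path)

    # Load it back and collect the values found in row 1.
    reloaded = load_workbook(filename=path)
    first_row = [cell.value for cell in reloaded.active[1]]

print(first_row)
```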

Reading Excel Spreadsheets With openpyxl

Let's start with the most essential thing one can do with a spreadsheet: read it.

You'll go from a straightforward approach to reading a spreadsheet to more complex examples where you read the data and convert it into more useful Python structures.

Dataset for This Tutorial

Before you dive deep into some code examples, you should download this sample dataset and store it somewhere as sample.xlsx:

This is one of the datasets you'll be using throughout this tutorial, and it's a spreadsheet with a sample of real data from Amazon's online product reviews. This dataset is only a tiny fraction of what Amazon provides, but for testing purposes, it's more than enough.

A Simple Approach to Reading an Excel Spreadsheet

Finally, let's start reading some spreadsheets! To begin with, open up our sample spreadsheet:

    >>> from openpyxl import load_workbook
    >>> workbook = load_workbook(filename="sample.xlsx")
    >>> workbook.sheetnames
    ['Sheet 1']

    >>> sheet = workbook.active
    >>> sheet
    <Worksheet "Sheet 1">

    >>> sheet.title
    'Sheet 1'

In the code above, you first open the spreadsheet sample.xlsx using load_workbook(), and then you can use workbook.sheetnames to see all the sheets you have available to work with. After that, workbook.active selects the workbook's active sheet, which in this case is the first and only one, Sheet 1. Using these methods is the default way of opening a spreadsheet, and you'll see it many times during this tutorial.

Now, after opening a spreadsheet, you can easily retrieve data from it like this:

    >>> sheet["A1"]
    <Cell 'Sheet 1'.A1>

    >>> sheet["A1"].value
    'marketplace'

    >>> sheet["F10"].value
    "G-Shock Men's Grey Sport Watch"

To return the actual value of a cell, you need to use .value. Otherwise, you'll get the main Cell object. You can also use the method .cell() to retrieve a cell using index notation. Remember to add .value to get the actual value and not a Cell object:

    >>> sheet.cell(row=10, column=6)
    <Cell 'Sheet 1'.F10>

    >>> sheet.cell(row=10, column=6).value
    "G-Shock Men's Grey Sport Watch"

You can see that the results returned are the same, no matter which way you decide to go with. However, in this tutorial, you'll be mostly using the first approach: ["A1"].
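The equivalence of the two access styles can be checked without the sample file. The sketch below builds a throwaway in-memory workbook and stores a made-up value in F10, then reads it back both ways. Note that .cell() uses 1-based row and column numbers.

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
sheet["F10"] = "example value"  # made-up data, not from sample.xlsx

# Two ways of addressing the same cell: by coordinate and by index.
by_coordinate = sheet["F10"]
by_index = sheet.cell(row=10, column=6)

print(by_coordinate.coordinate, by_index.coordinate)
print(by_coordinate.value == by_index.value)
```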

The above shows you the quickest way to open a spreadsheet. However, you can pass additional parameters to change the way a spreadsheet is loaded.

Additional Reading Options

There are a few arguments you can pass to load_workbook() that change the way a spreadsheet is loaded. The most important ones are the following two Booleans:

read_only loads a spreadsheet in read-only mode allowing you to open up very large Excel files.
data_only ignores loading formulas and instead loads only the resulting values.
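As a rough illustration of both flags, the sketch below writes a tiny workbook containing a formula and loads it three ways. The file name and cell contents are made up for this example. One caveat worth stating as an assumption: openpyxl does not calculate formulas itself, so with data_only=True a file that has never been opened and saved by a spreadsheet application typically has no stored result to return.

```python
import tempfile
from pathlib import Path

from openpyxl import Workbook, load_workbook

with tempfile.TemporaryDirectory() as tmp_dir:
    path = str(Path(tmp_dir) / "options_demo.xlsx")  # hypothetical file name

    workbook = Workbook()
    sheet = workbook.active
    sheet["A1"] = 2
    sheet["A2"] = 3
    sheet["A3"] = "=SUM(A1:A2)"
    workbook.save(filename=path)

    # Default load: formula cells come back as the formula text.
    formula_view = load_workbook(filename=path).active["A3"].value

    # data_only=True: formula cells come back as the result stored in the file.
    cached_view = load_workbook(filename=path, data_only=True).active["A3"].value

    # read_only=True: rows are read lazily; close the workbook when done.
    read_only_workbook = load_workbook(filename=path, read_only=True)
    streamed = list(read_only_workbook.active.iter_rows(values_only=True))
    read_only_workbook.close()

print(formula_view)
print(cached_view)
print(streamed[:2])
```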

Importing Data From a Spreadsheet

Now that you've learned the basics about loading a spreadsheet, it's about time you get to the fun part: the iteration and actual usage of the values within the spreadsheet.

This section is where you'll learn all the different ways you can iterate through the data, but also how to convert that data into something usable and, more importantly, how to do it in a Pythonic way.

Iterating Through the Data

There are a few different ways you can iterate through the data depending on your needs.

You can slice the data with a combination of columns and rows:

    >>> sheet["A1:C2"]
    ((<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>),
     (<Cell 'Sheet 1'.A2>, <Cell 'Sheet 1'.B2>, <Cell 'Sheet 1'.C2>))

You can get ranges of rows or columns:

    >>> # Get all cells from column A
    >>> sheet["A"]
    (<Cell 'Sheet 1'.A1>,
     <Cell 'Sheet 1'.A2>,
     ...
     <Cell 'Sheet 1'.A99>,
     <Cell 'Sheet 1'.A100>)

    >>> # Get all cells for a range of columns
    >>> sheet["A:B"]
    ((<Cell 'Sheet 1'.A1>,
      <Cell 'Sheet 1'.A2>,
      ...
      <Cell 'Sheet 1'.A99>,
      <Cell 'Sheet 1'.A100>),
     (<Cell 'Sheet 1'.B1>,
      <Cell 'Sheet 1'.B2>,
      ...
      <Cell 'Sheet 1'.B99>,
      <Cell 'Sheet 1'.B100>))

    >>> # Get all cells from row 5
    >>> sheet[5]
    (<Cell 'Sheet 1'.A5>,
     <Cell 'Sheet 1'.B5>,
     ...
     <Cell 'Sheet 1'.N5>,
     <Cell 'Sheet 1'.O5>)

    >>> # Get all cells for a range of rows
    >>> sheet[5:6]
    ((<Cell 'Sheet 1'.A5>,
      <Cell 'Sheet 1'.B5>,
      ...
      <Cell 'Sheet 1'.N5>,
      <Cell 'Sheet 1'.O5>),
     (<Cell 'Sheet 1'.A6>,
      <Cell 'Sheet 1'.B6>,
      ...
      <Cell 'Sheet 1'.N6>,
      <Cell 'Sheet 1'.O6>))

You'll notice that all of the above examples return a tuple. If you want to refresh your memory on how to handle tuples in Python, check out the article on Lists and Tuples in Python.

There are also multiple ways of using normal Python generators to go through the data. The main methods you can use to achieve this are:

.iter_rows()
.iter_cols()

Both methods can receive the following arguments:

min_row
max_row
min_col
max_col

These arguments are used to set boundaries for the iteration:

    >>> for row in sheet.iter_rows(min_row=1,
    ...                            max_row=2,
    ...                            min_col=1,
    ...                            max_col=3):
    ...     print(row)
    (<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>)
    (<Cell 'Sheet 1'.A2>, <Cell 'Sheet 1'.B2>, <Cell 'Sheet 1'.C2>)

    >>> for column in sheet.iter_cols(min_row=1,
    ...                               max_row=2,
    ...                               min_col=1,
    ...                               max_col=3):
    ...     print(column)
    (<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.A2>)
    (<Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.B2>)
    (<Cell 'Sheet 1'.C1>, <Cell 'Sheet 1'.C2>)

You'll notice that in the first example, when iterating through the rows using .iter_rows(), you get one tuple per row selected, whereas when iterating through the columns using .iter_cols(), you get one tuple per column instead.

One additional argument you can pass to both methods is the Boolean values_only. When it's set to True, the values of the cells are returned, instead of the Cell objects:

    >>> for value in sheet.iter_rows(min_row=1,
    ...                              max_row=2,
    ...                              min_col=1,
    ...                              max_col=3,
    ...                              values_only=True):
    ...     print(value)
    ('marketplace', 'customer_id', 'review_id')
    ('US', 3653882, 'R3O9SGZBVQBV76')
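To try row-wise and column-wise iteration without the sample file, the following sketch rebuilds the two rows shown above in an in-memory workbook. Using .append() to add rows is a choice made for this example, since the tutorial itself loads the data from sample.xlsx.

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
sheet.append(["marketplace", "customer_id", "review_id"])
sheet.append(["US", 3653882, "R3O9SGZBVQBV76"])

# One tuple of values per row...
rows = list(sheet.iter_rows(values_only=True))

# ...and one tuple of values per column.
columns = list(sheet.iter_cols(values_only=True))

print(rows)
print(columns)
```

The column view is simply the transpose of the row view, so list(zip(*rows)) gives the same tuples as columns here.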

If you want to iterate through the whole dataset, then you can also use the attributes .rows or .columns directly, which are shortcuts to using .iter_rows() and .iter_cols() without any arguments:

    >>> for row in sheet.rows:
    ...     print(row)
    (<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>
    ...
    <Cell 'Sheet 1'.M100>, <Cell 'Sheet 1'.N100>, <Cell 'Sheet 1'.O100>)

These shortcuts are very useful when you're iterating through the whole dataset.
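A quick way to convince yourself that the shortcuts and the methods cover the same cells is to compare the coordinates they produce. This sketch uses a made-up two-by-two sheet.

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
sheet.append(["a", "b"])  # made-up values
sheet.append(["c", "d"])

via_rows = [[cell.coordinate for cell in row] for row in sheet.rows]
via_iter_rows = [[cell.coordinate for cell in row] for row in sheet.iter_rows()]
via_columns = [[cell.coordinate for cell in col] for col in sheet.columns]

print(via_rows)
print(via_columns)
```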

Manipulate Data Using Python's Default Data Structures

Now that you know the basics of iterating through the data in a workbook, let's look at smart ways of converting that data into Python structures.

As you saw earlier, the result from all iterations comes in the form of tuples. However, since a tuple is nothing more than an immutable list, you can easily access its data and transform it into other structures.
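As a minimal sketch of what that means in practice, the snippet below takes one row tuple (the values are copied from the output shown earlier) and turns it into a list and a dictionary. The variable names are only illustrative.

```python
headers = ("marketplace", "customer_id", "review_id")
row = ("US", 3653882, "R3O9SGZBVQBV76")

# Tuples can be indexed and unpacked, but not modified in place.
try:
    row[0] = "UK"
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False

# Converting to a list gives you an editable copy.
editable_row = list(row)
editable_row[0] = "UK"

# Pairing the header names with the values gives you a dictionary.
record = dict(zip(headers, row))

print(tuple_is_mutable, editable_row, record)
```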

For example, say you want to extract product information from the sample.xlsx spreadsheet and into a dictionary where each key is a product ID.

A straightforward way to do this is to iterate over all the rows, pick the columns you know are related to product information, and then store that in a dictionary. Let's code this out!

First of all, have a look at the headers and see what information you care most about:

    >>> for value in sheet.iter_rows(min_row=1,
    ...                              max_row=1,
    ...                              values_only=True):
    ...     print(value)
    ('marketplace', 'customer_id', 'review_id', 'product_id', ...)

This code prints a tuple of all the column names you have in the spreadsheet. To start, grab the columns with names:

product_id
product_parent
product_title
product_category

Lucky for you, the columns you need are all next to each other, so you can use min_col and max_col to easily get the data you want:

    >>> for value in sheet.iter_rows(min_row=2,
    ...                              min_col=4,
    ...                              max_col=7,
    ...                              values_only=True):
    ...     print(value)
    ('B00FALQ1ZC', 937001370, 'Invicta Women\'s 15150 "Angel" 18k Yellow...)
    ('B00D3RGO20', 484010722, "Kenneth Cole New York Women's KC4944...)
    ...

Nice! Now that you know how to get all the important product data you need, let's put that data into a dictionary:

    import json
    from openpyxl import load_workbook

    workbook = load_workbook(filename="sample.xlsx")
    sheet = workbook.active

    products = {}

    # Using the values_only because you want to return the cells' values
    for row in sheet.iter_rows(min_row=2,
                               min_col=4,
                               max_col=7,
                               values_only=True):
        product_id = row[0]
        product = {
            "parent": row[1],
            "title": row[2],
            "category": row[3]
        }
        products[product_id] = product

    # Using json here to be able to format the output for displaying later
    print(json.dumps(products))

The code above returns a JSON similar to this:

    {
        "B00FALQ1ZC": {
            "parent": 937001370,
            "title": "Invicta Women's 15150 ...",
            "category": "Watches"
        },
        "B00D3RGO20": {
            "parent": 484010722,
            "title": "Kenneth Cole New York ...",
            "category": "Watches"
        }
    }

Here you can see that the output is trimmed to 2 products only, but if you run the script as it is, then you should get 98 products.
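One detail worth knowing: json.dumps() without extra arguments produces a single line, so to get an indented layout like the one shown you would pass indent. The sketch below uses one product from the trimmed output as stand-in data instead of reading sample.xlsx.

```python
import json

# Stand-in for the dictionary built from the spreadsheet.
products = {
    "B00FALQ1ZC": {
        "parent": 937001370,
        "title": "Invicta Women's 15150 ...",
        "category": "Watches",
    },
}

compact = json.dumps(products)           # everything on one line
pretty = json.dumps(products, indent=4)  # one key per line, indented

print(pretty)
```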

Convert Data Into Python Classes

To finalize the reading section of this tutorial, let's dive into Python classes and see how you could improve on the example above and better structure the data.

For this, you'll be using the new Python Data Classes that are available from Python 3.7. If you're using an older version of Python, then you can use the default Classes instead.

So, first things first, let's look at the data you have and decide what you want to store and how you want to store it.

As you saw right at the start, this data comes from Amazon, and it's a list of product reviews. You can check the list of all the columns and their meaning on Amazon.

There are two significant elements you can extract from the data available:

Products
Reviews

A Product has:

ID
Title
Parent
Category

The Review has a few more fields:

ID
Customer ID
Stars
Headline
Body
Date

You can ignore a few of the review fields to make things a bit simpler.

So, a straightforward implementation of these two classes could be written in a separate file classes.py:

    import datetime
    from dataclasses import dataclass

    @dataclass
    class Product:
        id: str
        parent: str
        title: str
        category: str

    @dataclass
    class Review:
        id: str
        customer_id: str
        stars: int
        headline: str
        body: str
        date: datetime.datetime
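If data classes aren't available in your Python version, a plain class can stand in for them. The following is only a sketch of what a hand-written Product might look like, with __repr__() and __eq__() added to approximate what @dataclass generates. It is not code from the tutorial.

```python
class Product:
    """Plain-class stand-in for the Product data class."""

    def __init__(self, id, parent, title, category):
        self.id = id
        self.parent = parent
        self.title = title
        self.category = category

    def __repr__(self):
        return (
            "Product(id={0.id!r}, parent={0.parent!r}, "
            "title={0.title!r}, category={0.category!r})".format(self)
        )

    def __eq__(self, other):
        if not isinstance(other, Product):
            return NotImplemented
        return (self.id, self.parent, self.title, self.category) == (
            other.id, other.parent, other.title, other.category
        )


# Values taken from the trimmed sample output shown earlier.
product = Product("B00FALQ1ZC", 937001370, "Invicta Women's 15150 ...", "Watches")
print(product)
```

The data class version gives you the same behavior with far less boilerplate, which is why the tutorial prefers it.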

After defining your data classes, you need to convert the data from the spreadsheet into these new structures.

Before doing the conversion, it's worth looking at our header again and creating a mapping between columns and the fields you need:

    >>> for value in sheet.iter_rows(min_row=1,
    ...                              max_row=1,
    ...                              values_only=True):
    ...     print(value)
    ('marketplace', 'customer_id', 'review_id', 'product_id', ...)

    >>> # Or an alternative
    >>> for cell in sheet[1]:
    ...     print(cell.value)
    marketplace
    customer_id
    review_id
    product_id
    product_parent
    ...

Let's create a file mapping.py where you have a list of all the field names and their column location (zero-indexed) on the spreadsheet:

    # Product fields
    PRODUCT_ID = 3
    PRODUCT_PARENT = 4
    PRODUCT_TITLE = 5
    PRODUCT_CATEGORY = 6

    # Review fields
    REVIEW_ID = 2
    REVIEW_CUSTOMER = 1
    REVIEW_STARS = 7
    REVIEW_HEADLINE = 12
    REVIEW_BODY = 13
    REVIEW_DATE = 14

You don't necessarily have to do the mapping above. It's more for readability when parsing the row data, so you don't end up with a lot of magic numbers lying around.
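An alternative to hard-coding the positions, sketched here as an assumption and not as the tutorial's approach, is to derive them from the header row itself. The header tuple below lists only the first seven column names that appear in this tutorial. In a real script you would take the full header from the first row of the sheet.

```python
# First seven column names as they appear in the tutorial's header output.
header = (
    "marketplace",
    "customer_id",
    "review_id",
    "product_id",
    "product_parent",
    "product_title",
    "product_category",
)

# Map each column name to its zero-indexed position.
column_index = {name: position for position, name in enumerate(header)}

PRODUCT_ID = column_index["product_id"]
PRODUCT_PARENT = column_index["product_parent"]
PRODUCT_TITLE = column_index["product_title"]
PRODUCT_CATEGORY = column_index["product_category"]

print(PRODUCT_ID, PRODUCT_PARENT, PRODUCT_TITLE, PRODUCT_CATEGORY)
```

The trade-off is that a renamed column then fails loudly with a KeyError, while a hard-coded index would silently read the wrong data.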

Finally, allow'south look at the code needed to parse the spreadsheet data into a list of product and review objects:

    from datetime import datetime
    from openpyxl import load_workbook
    from classes import Product, Review
    from mapping import PRODUCT_ID, PRODUCT_PARENT, PRODUCT_TITLE, \
        PRODUCT_CATEGORY, REVIEW_DATE, REVIEW_ID, REVIEW_CUSTOMER, \
        REVIEW_STARS, REVIEW_HEADLINE, REVIEW_BODY

    # Using the read_only method since you're not gonna be editing the spreadsheet
    workbook = load_workbook(filename="sample.xlsx", read_only=True)
    sheet = workbook.active

    products = []
    reviews = []

    # Using the values_only because you just want to return the cell value
    for row in sheet.iter_rows(min_row=2, values_only=True):
        product = Product(id=row[PRODUCT_ID],
                          parent=row[PRODUCT_PARENT],
                          title=row[PRODUCT_TITLE],
                          category=row[PRODUCT_CATEGORY])
        products.append(product)

        # You need to parse the date from the spreadsheet into a datetime format
        spread_date = row[REVIEW_DATE]
        parsed_date = datetime.strptime(spread_date, "%Y-%m-%d")

        review = Review(id=row[REVIEW_ID],
                        customer_id=row[REVIEW_CUSTOMER],
                        stars=row[REVIEW_STARS],
                        headline=row[REVIEW_HEADLINE],
                        body=row[REVIEW_BODY],
                        date=parsed_date)
        reviews.append(review)

    print(products[0])
    print(reviews[0])

After you run the code above, you should get some output like this:

    Product(id='B00FALQ1ZC', parent=937001370, ...)
    Review(id='R3O9SGZBVQBV76', customer_id=3653882, ...)

That's it! Now you should have the data in a very simple and digestible class format, and you can start thinking of storing this in a database or any other type of data storage you like.

Using this kind of OOP strategy to parse spreadsheets makes handling the data much simpler later on.
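One detail worth guarding against when parsing dates: openpyxl can hand back a `datetime` object instead of a string when the cell is stored as a real date in Excel. The sketch below is only an illustration under that assumption. The helper name `parse_review_date` is hypothetical and not part of the tutorial's code; it accepts either form and always returns a `datetime`:

```python
from datetime import date, datetime


def parse_review_date(value, fmt="%Y-%m-%d"):
    """Return a datetime for a spreadsheet date value.

    Hypothetical helper: accepts a datetime or date that is already
    typed, or a string in the given format.
    """
    # datetime is a subclass of date, so it has to be checked first
    if isinstance(value, datetime):
        return value
    if isinstance(value, date):
        return datetime(value.year, value.month, value.day)
    if isinstance(value, str):
        return datetime.strptime(value.strip(), fmt)
    raise TypeError(f"Unsupported date value: {value!r}")


parsed_from_text = parse_review_date("2015-08-31")
parsed_from_cell = parse_review_date(datetime(2015, 8, 31))
```

With a helper like this, the parsing loop could pass `row[REVIEW_DATE]` through one function regardless of how the date was stored.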

Appending New Data

Before you start creating very complex spreadsheets, have a quick look at an example of how to append data to an existing spreadsheet.

Go back to the first example spreadsheet you created (hello_world.xlsx) and try opening it and appending some data to it, like this:

    from openpyxl import load_workbook

    # Start by opening the spreadsheet and selecting the main sheet
    workbook = load_workbook(filename="hello_world.xlsx")
    sheet = workbook.active

    # Write what you want into a specific cell
    sheet["C1"] = "writing ;)"

    # Save the spreadsheet
    workbook.save(filename="hello_world_append.xlsx")

Et voilà, if you open the new hello_world_append.xlsx spreadsheet, you'll notice the additional writing ;) in cell C1.
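Writing to a named cell is one way to add data. When you want to add a whole row after the existing data, the worksheet's `append()` method is another option. Here's a minimal sketch that uses an in-memory workbook as a stand-in for hello_world.xlsx, so nothing is read from or written to disk:

```python
from openpyxl import Workbook

# An in-memory workbook stands in for hello_world.xlsx here
workbook = Workbook()
sheet = workbook.active
sheet["A1"] = "hello"
sheet["B1"] = "world!"

# append() writes one value per column into the row after the last used one
sheet.append(["another", "row"])

rows = list(sheet.iter_rows(values_only=True))
print(rows)
```

The new values land in row 2 because row 1 is the last row that already holds data.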

Writing Excel Spreadsheets With openpyxl

There are a lot of different things you can write to a spreadsheet, from simple text or number values to complex formulas, charts, or even images.

Let's start creating some spreadsheets!

Creating a Simple Spreadsheet

Previously, you saw a very quick example of how to write "Hello world!" into a spreadsheet, so you can start with that:

     1 from openpyxl import Workbook
     2
     3 filename = "hello_world.xlsx"
     4
     5 workbook = Workbook()
     6 sheet = workbook.active
     7
     8 sheet["A1"] = "hello"
     9 sheet["B1"] = "world!"
    10
    11 workbook.save(filename=filename)

Lines 5, 8, 9, and 11 in the code above are the most important ones for writing. In the code, you can see that:

Line 5 shows you how to create a new empty workbook.
Lines 8 and 9 show you how to add data to specific cells.
Line 11 shows you how to save the spreadsheet when you're done.

Even though these lines are straightforward, it's still good to know them well for when things get a bit more complicated.

One thing you can do to help with the coming code examples is add the following function to your Python file or console:

    >>> def print_rows():
    ...     for row in sheet.iter_rows(values_only=True):
    ...         print(row)

It makes it easier to print all of your spreadsheet values by just calling print_rows().

Basic Spreadsheet Operations

Before you get into the more advanced topics, it's good to know how to manage the most simple elements of a spreadsheet.

Adding and Updating Cell Values

You already learned how to add values to a spreadsheet like this:

    >>> sheet["A1"] = "value"

There's another way you can do this, by first selecting a cell and then changing its value:

    >>> cell = sheet["A1"]
    >>> cell
    <Cell 'Sheet'.A1>

    >>> cell.value
    'hello'

    >>> cell.value = "hey"
    >>> cell.value
    'hey'

The new value is only stored into the spreadsheet once you call workbook.save().
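To see the difference between the in-memory workbook and what has actually been saved, here's a small sketch. It saves to an in-memory `BytesIO` buffer instead of a file on disk, which is a choice made for this illustration only:

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook

workbook = Workbook()
sheet = workbook.active
sheet["A1"] = "hello"

# Save to an in-memory buffer instead of a file on disk
buffer = BytesIO()
workbook.save(buffer)

# This change only exists in the in-memory workbook object
sheet["A1"].value = "hey"

# Reloading the saved bytes still shows the old value
saved_copy = load_workbook(BytesIO(buffer.getvalue()))
saved_value = saved_copy.active["A1"].value
in_memory_value = sheet["A1"].value
print(saved_value, in_memory_value)
```

Until you save again, the saved copy keeps the old value while the workbook object holds the new one.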

openpyxl creates a cell when adding a value, if that cell didn't exist before:

    >>> # Before, our spreadsheet has only 1 row
    >>> print_rows()
    ('hello', 'world!')

    >>> # Try adding a value to row 10
    >>> sheet["B10"] = "test"
    >>> print_rows()
    ('hello', 'world!')
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, 'test')

As you can see, when trying to add a value to cell B10, you end up with 10 rows, each printed as a tuple, just so the sheet can hold that test value.
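If you'd rather address a cell by row and column numbers than by a string like "B10", the worksheet's `cell()` method does the same job. The sketch below rebuilds the hello world sheet in memory and then inspects how far the used range has grown:

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
sheet["A1"] = "hello"
sheet["B1"] = "world!"

# cell() takes 1-based row and column numbers; column 2 is "B"
sheet.cell(row=10, column=2, value="test")

used_range = sheet.dimensions
row_count = sheet.max_row
rows = list(sheet.iter_rows(values_only=True))
print(used_range, row_count)
```

Here `dimensions` and `max_row` confirm that the sheet now spans ten rows, even though only two of them contain data.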

Managing Rows and Columns

One of the most common things you have to do when manipulating spreadsheets is adding or removing rows and columns. The openpyxl package allows you to do that in a very straightforward way by using the methods:

.insert_rows()
.delete_rows()
.insert_cols()
.delete_cols()

Every single one of those methods can receive two arguments:

idx
amount

Using our basic hello_world.xlsx example again, let's see how these methods work:

    >>> print_rows()
    ('hello', 'world!')

    >>> # Insert a column before the existing column 1 ("A")
    >>> sheet.insert_cols(idx=1)
    >>> print_rows()
    (None, 'hello', 'world!')

    >>> # Insert 5 columns between column 2 ("B") and 3 ("C")
    >>> sheet.insert_cols(idx=3, amount=5)
    >>> print_rows()
    (None, 'hello', None, None, None, None, None, 'world!')

    >>> # Delete the created columns
    >>> sheet.delete_cols(idx=3, amount=5)
    >>> sheet.delete_cols(idx=1)
    >>> print_rows()
    ('hello', 'world!')

    >>> # Insert a new row in the beginning
    >>> sheet.insert_rows(idx=1)
    >>> print_rows()
    (None, None)
    ('hello', 'world!')

    >>> # Insert 3 new rows in the beginning
    >>> sheet.insert_rows(idx=1, amount=3)
    >>> print_rows()
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    ('hello', 'world!')

    >>> # Delete the first 4 rows
    >>> sheet.delete_rows(idx=1, amount=4)
    >>> print_rows()
    ('hello', 'world!')

The only thing you need to remember is that when inserting new data (rows or columns), the insertion happens before the idx parameter.

So, if you do insert_rows(1), it inserts a new row before the existing first row.

It's the same for columns: when you call insert_cols(2), it inserts a new column right before the already existing second column (B).

However, when deleting rows or columns, .delete_... deletes data starting from the index passed as an argument.

For example, when doing delete_rows(2) it deletes row 2, and when doing delete_cols(3) it deletes the third column (C).
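The two rules, inserting before idx and deleting from idx onward, can be checked with a small in-memory sketch. The helper `column_a()` is a hypothetical convenience for this example and not part of openpyxl:

```python
from openpyxl import Workbook


def column_a(sheet):
    """Return the values of column A, top to bottom (hypothetical helper)."""
    return [row[0] for row in sheet.iter_rows(max_col=1, values_only=True)]


workbook = Workbook()
sheet = workbook.active
for label in ("first", "second", "third"):
    sheet.append([label])

# Insert one empty row before the existing row 2
sheet.insert_rows(idx=2)
after_insert = column_a(sheet)

# Delete two rows starting at row 2: the empty row and "second"
sheet.delete_rows(idx=2, amount=2)
after_delete = column_a(sheet)

print(after_insert)
print(after_delete)
```

After the insert, the empty row sits at position 2 and the old rows have shifted down. After the delete, only the first and third labels remain.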

Managing Sheets

Sheet management is also one of those things you might need to know, even though it might be something that you don't use that often.

If you look back at the code examples from this tutorial, you'll notice the following recurring piece of code:

    sheet = workbook.active

This is the way to select the default sheet from a spreadsheet. However, if you're opening a spreadsheet with multiple sheets, then you can always select a specific one like this:

    >>> # Let's say you have two sheets: "Products" and "Company Sales"
    >>> workbook.sheetnames
    ['Products', 'Company Sales']

    >>> # You can select a sheet using its title
    >>> products_sheet = workbook["Products"]
    >>> sales_sheet = workbook["Company Sales"]

You can also change a sheet title very easily:

    >>> workbook.sheetnames
    ['Products', 'Company Sales']

    >>> products_sheet = workbook["Products"]
    >>> products_sheet.title = "New Products"

    >>> workbook.sheetnames
    ['New Products', 'Company Sales']

If you want to create or delete sheets, then you can also do that with .create_sheet() and .remove():

    >>> workbook.sheetnames
    ['Products', 'Company Sales']

    >>> operations_sheet = workbook.create_sheet("Operations")
    >>> workbook.sheetnames
    ['Products', 'Company Sales', 'Operations']

    >>> # You can also define the position to create the sheet at
    >>> hr_sheet = workbook.create_sheet("HR", 0)
    >>> workbook.sheetnames
    ['HR', 'Products', 'Company Sales', 'Operations']

    >>> # To remove them, just pass the sheet as an argument to the .remove()
    >>> workbook.remove(operations_sheet)
    >>> workbook.sheetnames
    ['HR', 'Products', 'Company Sales']

    >>> workbook.remove(hr_sheet)
    >>> workbook.sheetnames
    ['Products', 'Company Sales']

One other thing you can do is make duplicates of a sheet using copy_worksheet():

    >>> workbook.sheetnames
    ['Products', 'Company Sales']

    >>> products_sheet = workbook["Products"]
    >>> workbook.copy_worksheet(products_sheet)
    <Worksheet "Products Copy">

    >>> workbook.sheetnames
    ['Products', 'Company Sales', 'Products Copy']

If you open your spreadsheet after saving the above code, you'll notice that the sheet Products Copy is a duplicate of the sheet Products.
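Because copy_worksheet() returns the new sheet, you can rename or edit the copy right away. The following sketch builds a tiny workbook in memory; the title "Products Backup" and the cell contents are made up for illustration:

```python
from openpyxl import Workbook

workbook = Workbook()

# A new workbook starts with one sheet, so rename it rather than add another
products_sheet = workbook.active
products_sheet.title = "Products"
products_sheet.append(["id", "title"])

# copy_worksheet() returns the new sheet, so you can rename it right away
backup_sheet = workbook.copy_worksheet(products_sheet)
backup_sheet.title = "Products Backup"

# The copy is independent: editing it leaves the original untouched
backup_sheet["A1"] = "changed"

sheet_names = workbook.sheetnames
original_value = products_sheet["A1"].value
print(sheet_names, original_value)
```

This can be handy when you want a scratch copy of a sheet to experiment on before touching the original data.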

Freezing Rows and Columns

Something that you might want to do when working with big spreadsheets is to freeze a few rows or columns, so they remain visible when you scroll right or down.

Freezing data allows you to keep an eye on important rows or columns, regardless of where you scroll in the spreadsheet.

Again, openpyxl also has a way to accomplish this by using the worksheet freeze_panes attribute. For this example, go back to our sample.xlsx spreadsheet and try doing the following:

    >>> workbook = load_workbook(filename="sample.xlsx")
    >>> sheet = workbook.active
    >>> sheet.freeze_panes = "C2"
    >>> workbook.save("sample_frozen.xlsx")

If you open the sample_frozen.xlsx spreadsheet in your favorite spreadsheet editor, you'll notice that row 1 and columns A and B are frozen and are always visible no matter where you navigate within the spreadsheet.

This feature is handy, for example, to keep headers within sight, so you always know what each column represents.

Even when you scroll to the end of the spreadsheet in the editor, you can still see both row 1 and columns A and B.
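The cell you assign to freeze_panes marks the first cell that is not frozen: everything above it and to its left stays in place. A short in-memory sketch of a few common choices, with made-up header values, might look like this:

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
sheet.append(["id", "title", "stars"])

# Freeze only the header row: everything above row 2 stays visible
sheet.freeze_panes = "A2"
header_only = sheet.freeze_panes

# Freeze only the first column: everything left of column B stays visible
sheet.freeze_panes = "B1"
first_column_only = sheet.freeze_panes

# Assigning None removes the frozen panes again
sheet.freeze_panes = None
unfrozen = sheet.freeze_panes

print(header_only, first_column_only, unfrozen)
```

This is consistent with the "C2" example above, where row 1 and columns A and B end up frozen.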

Adding Filters

You can use openpyxl to add filters and sorts to your spreadsheet. However, when you open the spreadsheet, the data won't be rearranged according to these sorts and filters.

At first, this might seem like a pretty useless feature, but when you're programmatically creating a spreadsheet that is going to be sent and used by somebody else, it's still nice to at least create the filters and allow people to use them afterward.

The code below is an example of how you would add some filters to our existing sample.xlsx spreadsheet:

    >>> # Check the used spreadsheet space using the attribute "dimensions"
    >>> sheet.dimensions
    'A1:O100'

    >>> sheet.auto_filter.ref = "A1:O100"
    >>> workbook.save(filename="sample_with_filters.xlsx")

You should now see the filters created when opening the spreadsheet in your editor.

You don't have to use sheet.dimensions if you know precisely which part of the spreadsheet you want to apply filters to.
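As a rough sketch of how this fits together, the example below sets the filter range from sheet.dimensions and then records one filter and one sort condition with the auto filter's `add_filter_column()` and `add_sort_condition()` methods. The product names and ratings are invented for the example:

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
sheet.append(["product", "stars"])
sheet.append(["Watch A", 5])
sheet.append(["Watch B", 3])

# Let the filter cover exactly the used range, whatever its size
sheet.auto_filter.ref = sheet.dimensions

# Record a filter on the first column (index 0) and a sort on the stars.
# These only store the settings in the file; the rows are not rearranged.
sheet.auto_filter.add_filter_column(0, ["Watch A"])
sheet.auto_filter.add_sort_condition("B2:B3")

filter_range = sheet.auto_filter.ref
rows = list(sheet.iter_rows(values_only=True))
print(filter_range)
```

The rows come back in their original order, which matches the point above: openpyxl writes the filter and sort settings but doesn't apply them to the data.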

Adding Formulas

Formulas (or formulae) are one of the most powerful features of spreadsheets.

They give you the power to apply specific mathematical equations to a range of cells. Using formulas with openpyxl is as simple as editing the value of a cell.

You can see the list of formulas supported by openpyxl:

    >>> from openpyxl.utils import FORMULAE
    >>> FORMULAE
    frozenset({'ABS',
               'ACCRINT',
               'ACCRINTM',
               'ACOS',
               'ACOSH',
               'AMORDEGRC',
               'AMORLINC',
               'AND',
               ...
               'YEARFRAC',
               'YIELD',
               'YIELDDISC',
               'YIELDMAT',
               'ZTEST'})

Let's add some formulas to our sample.xlsx spreadsheet.

Starting with something easy, let's check the average star rating for the 99 reviews within the spreadsheet:

    >>> # Star rating is column "H"
    >>> sheet["P2"] = "=AVERAGE(H2:H100)"
    >>> workbook.save(filename="sample_formulas.xlsx")

If you open the spreadsheet now and go to cell P2, you should see that its value is: 4.18181818181818.

You can use the same methodology to add any formulas to your spreadsheet. For example, let's count the number of reviews that had helpful votes:

    >>> # The helpful votes are counted on column "I"
    >>> sheet["P3"] = '=COUNTIF(I2:I100, ">0")'
    >>> workbook.save(filename="sample_formulas.xlsx")

You should get the number 21 in cell P3 of your spreadsheet.

You'll have to make sure that the strings within a formula are always in double quotes, so you either have to use single quotes around the formula like in the example above or you'll have to escape the double quotes inside the formula: "=COUNTIF(I2:I100, \">0\")".

There are a ton of other formulas you can add to your spreadsheet using the same procedure you tried above. Give it a go yourself!
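One thing to keep in mind is that openpyxl stores the formula text and leaves the calculation to the spreadsheet editor. The sketch below builds a formula from the sheet's actual size instead of a hard-coded row number. The helper `column_formula()` is hypothetical and not part of openpyxl, and the ratings are made up:

```python
from openpyxl import Workbook
from openpyxl.utils import FORMULAE


def column_formula(function, column, first_row, last_row):
    """Build a formula such as =AVERAGE(H2:H100) for a single column.

    Hypothetical helper, not part of openpyxl.
    """
    if function not in FORMULAE:
        raise ValueError(f"{function} is not a formula name openpyxl knows")
    return f"={function}({column}{first_row}:{column}{last_row})"


workbook = Workbook()
sheet = workbook.active
sheet.append(["stars"])
for stars in (5, 4, 3):
    sheet.append([stars])

# Use max_row so the range follows the data instead of a fixed row number
formula = column_formula("AVERAGE", "A", 2, sheet.max_row)
sheet["B2"] = formula

stored_value = sheet["B2"].value
stored_type = sheet["B2"].data_type
print(stored_value, stored_type)
```

Reading the cell back gives you the formula string, with a data type of "f", rather than a computed number.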

Adding Styles

Even though styling a spreadsheet might not be something you would do every day, it's still good to know how to do it.

Using openpyxl, you can apply multiple styling options to your spreadsheet, including fonts, borders, colors, and so on. Have a look at the openpyxl documentation to learn more.

You can also choose to either apply a style directly to a cell or create a template and reuse it to apply styles to multiple cells.

Let's start by having a look at simple cell styling, using our sample.xlsx again as the base spreadsheet:

    >>> # Import necessary style classes
    >>> from openpyxl.styles import Font, Color, Alignment, Border, Side

    >>> # Create a few styles
    >>> bold_font = Font(bold=True)
    >>> big_red_text = Font(color="00FF0000", size=20)
    >>> center_aligned_text = Alignment(horizontal="center")
    >>> double_border_side = Side(border_style="double")
    >>> square_border = Border(top=double_border_side,
    ...                        right=double_border_side,
    ...                        bottom=double_border_side,
    ...                        left=double_border_side)

    >>> # Style some cells!
    >>> sheet["A2"].font = bold_font
    >>> sheet["A3"].font = big_red_text
    >>> sheet["A4"].alignment = center_aligned_text
    >>> sheet["A5"].border = square_border
    >>> workbook.save(filename="sample_styles.xlsx")

If you open your spreadsheet now, you should see quite a few different styles on the first 5 cells of column A:

There you go. You got:

A2 with the text in bold
A3 with the text in red and a bigger font size
A4 with the text centered
A5 with a square border around the text

You can also combine styles by simply adding them to the cell at the same time:

>>> # Reusing the same styles from the example above
>>> sheet["A6"].alignment = center_aligned_text
>>> sheet["A6"].font = big_red_text
>>> sheet["A6"].border = square_border
>>> workbook.save(filename="sample_styles.xlsx")

Have a look at cell A6 here:

When you want to apply multiple styles to one or several cells, you can use a NamedStyle class instead, which is like a style template that you can use over and over again. Have a look at the example below:

>>> from openpyxl.styles import NamedStyle

>>> # Let's create a style template for the header row
>>> header = NamedStyle(name="header")
>>> header.font = Font(bold=True)
>>> header.border = Border(bottom=Side(border_style="thin"))
>>> header.alignment = Alignment(horizontal="center", vertical="center")

>>> # Now let's apply this to all first row (header) cells
>>> header_row = sheet[1]
>>> for cell in header_row:
...     cell.style = header
...
>>> workbook.save(filename="sample_styles.xlsx")

If you open the spreadsheet now, you should see that its first row is bold, the text is aligned to the center, and there's a small bottom border! Have a look below:

As you saw above, there are many options when it comes to styling, and the right ones depend on the use case, so feel free to check the openpyxl documentation and see what other things you can do.
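As a small follow-up to the NamedStyle template described above, here is a minimal sketch of registering a style once and then referring to it by name. The workbook contents (a product header row and one data row) are made up for illustration, and the sketch builds its own in-memory workbook rather than reusing sample_styles.xlsx:

```python
from openpyxl import Workbook
from openpyxl.styles import Alignment, Border, Font, NamedStyle, Side

# A throwaway in-memory workbook, used here only for illustration
workbook = Workbook()
sheet = workbook.active
sheet.append(["Product", "Price", "Stock"])  # hypothetical header row
sheet.append(["Widget", 9.99, 12])           # hypothetical data row

# Build the style template once
header = NamedStyle(name="header")
header.font = Font(bold=True)
header.border = Border(bottom=Side(border_style="thin"))
header.alignment = Alignment(horizontal="center", vertical="center")

# Register it with the workbook so it can be looked up by name
workbook.add_named_style(header)

# Apply the registered style to every cell in the first row
for cell in sheet[1]:
    cell.style = "header"

# Reading cell.style gives back the name of the style in use
styled_names = [cell.style for cell in sheet[1]]
```

Referring to the style by its name can be convenient when the code that creates the template and the code that applies it live in different functions.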

Conditional Formatting

This feature is one of my personal favorites when it comes to adding styles to a spreadsheet.

It's a much more powerful approach to styling because it dynamically applies styles according to how the data in the spreadsheet changes.

In a nutshell, conditional formatting allows you to specify a list of styles to apply to a cell (or cell range) according to specific conditions.

For example, a widespread use case is to have a balance sheet where all the negative totals are in red, and the positive ones are in green. This formatting makes it much more efficient to spot good vs bad periods.

Without further ado, let's pick our favorite spreadsheet—sample.xlsx—and add some conditional formatting.

You can start by adding a simple one that adds a red background to all reviews with fewer than 3 stars:

>>> from openpyxl.styles import PatternFill
>>> from openpyxl.styles.differential import DifferentialStyle
>>> from openpyxl.formatting.rule import Rule

>>> red_background = PatternFill(fgColor="00FF0000")
>>> diff_style = DifferentialStyle(fill=red_background)
>>> rule = Rule(type="expression", dxf=diff_style)
>>> rule.formula = ["$H1<3"]
>>> sheet.conditional_formatting.add("A1:O100", rule)
>>> workbook.save("sample_conditional_formatting.xlsx")

Now you'll see all the reviews with a star rating below 3 marked with a red background:

Code-wise, the only things that are new here are the objects DifferentialStyle and Rule:

DifferentialStyle is quite similar to NamedStyle, which you already saw above, and it's used to aggregate multiple styles such as fonts, borders, alignment, and so forth.
Rule is responsible for selecting the cells and applying the styles if the cells match the rule's logic.
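To make the division of labor between the two objects a bit more concrete, here is a hedged sketch that uses FormulaRule, a helper in openpyxl.formatting.rule that builds an expression Rule and its DifferentialStyle in one call. The bold font is an addition made up for this sketch, and the workbook is an empty in-memory one rather than sample.xlsx:

```python
from openpyxl import Workbook
from openpyxl.formatting.rule import FormulaRule
from openpyxl.styles import Font, PatternFill

# Empty in-memory workbook, used only to have a sheet to attach the rule to
workbook = Workbook()
sheet = workbook.active

red_background = PatternFill(fgColor="00FF0000")
bold_font = Font(bold=True)  # extra style, added here for illustration

# FormulaRule wraps the font and fill in a DifferentialStyle and
# returns a Rule whose type is "expression"
rule = FormulaRule(formula=["$H1<3"], font=bold_font, fill=red_background)
sheet.conditional_formatting.add("A1:O100", rule)

# The aggregated styles live on rule.dxf, the selection logic on rule.formula
rule_type = rule.type
rule_formula = list(rule.formula)
```

The result should behave like the explicit Rule and DifferentialStyle pair from the example above, just with fewer lines.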

Using a Rule object, you can create numerous conditional formatting scenarios.

However, for simplicity's sake, the openpyxl package offers 3 built-in formats that make it easier to create a few common conditional formatting patterns. These built-ins are:

ColorScale
IconSet
DataBar

The ColorScale gives you the ability to create color gradients:

>>> from openpyxl.formatting.rule import ColorScaleRule
>>> color_scale_rule = ColorScaleRule(start_type="min",
...                                   start_color="00FF0000",  # Red
...                                   end_type="max",
...                                   end_color="0000FF00")  # Green

>>> # Again, let's add this gradient to the star ratings, column "H"
>>> sheet.conditional_formatting.add("H2:H100", color_scale_rule)
>>> workbook.save(filename="sample_conditional_formatting_color_scale.xlsx")

Now you should see a color gradient on column H, from red to green, according to the star rating:

You can also add a third color and make two gradients instead:

>>> from openpyxl.formatting.rule import ColorScaleRule
>>> color_scale_rule = ColorScaleRule(start_type="num",
...                                   start_value=1,
...                                   start_color="00FF0000",  # Red
...                                   mid_type="num",
...                                   mid_value=3,
...                                   mid_color="00FFFF00",  # Yellow
...                                   end_type="num",
...                                   end_value=5,
...                                   end_color="0000FF00")  # Green

>>> # Again, let's add this gradient to the star ratings, column "H"
>>> sheet.conditional_formatting.add("H2:H100", color_scale_rule)
>>> workbook.save(filename="sample_conditional_formatting_color_scale_3.xlsx")

This time, you'll notice that star ratings between 1 and 3 have a gradient from red to yellow, and star ratings between 3 and 5 have a gradient from yellow to green:
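To get an intuition for what a three-color scale does with a given rating, the sketch below blends the two neighboring colors linearly. This is only an illustration in plain Python: the function names are made up, and the assumption that the spreadsheet application interpolates linearly in RGB is exactly that, an assumption, so rendered colors may differ slightly.

```python
def hex_to_rgb(color):
    """Convert an aRGB string such as "00FF0000" to an (r, g, b) tuple."""
    rgb = color[-6:]  # drop the leading alpha byte
    return tuple(int(rgb[i:i + 2], 16) for i in (0, 2, 4))


def blend(start, end, fraction):
    """Linearly interpolate between two (r, g, b) tuples."""
    return tuple(round(a + (b - a) * fraction) for a, b in zip(start, end))


def three_color_scale(value, low=1, mid=3, high=5,
                      low_color="00FF0000",    # Red
                      mid_color="00FFFF00",    # Yellow
                      high_color="0000FF00"):  # Green
    """Return the RRGGBB color a value would get on a 3-point scale."""
    value = max(low, min(high, value))  # clamp to the ends of the scale
    if value <= mid:
        fraction = (value - low) / (mid - low)
        rgb = blend(hex_to_rgb(low_color), hex_to_rgb(mid_color), fraction)
    else:
        fraction = (value - mid) / (high - mid)
        rgb = blend(hex_to_rgb(mid_color), hex_to_rgb(high_color), fraction)
    return "{:02X}{:02X}{:02X}".format(*rgb)


# Colors for each whole-star rating under the assumptions above
rating_colors = {rating: three_color_scale(rating) for rating in range(1, 6)}
```

With the defaults, a rating of 1 maps to pure red, 3 to pure yellow, and 5 to pure green, while 2 and 4 land on the orange and lime shades in between.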

The IconSet allows you to add an icon to the cell according to its value:

>>> from openpyxl.formatting.rule import IconSetRule

>>> icon_set_rule = IconSetRule("5Arrows", "num", [1, 2, 3, 4, 5])
>>> sheet.conditional_formatting.add("H2:H100", icon_set_rule)
>>> workbook.save("sample_conditional_formatting_icon_set.xlsx")

You'll see a colored arrow next to the star rating. This arrow is red and points down when the value of the cell is 1 and, as the rating gets better, the arrow starts pointing up and becomes green:

The openpyxl package has a full list of other icons you can use, besides the arrow.
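As one example of a different icon set, the sketch below swaps the five arrows for three traffic lights. The ratings written to column H and the thresholds 1, 3, and 4 are made up for illustration, so adjust them to your own data:

```python
from openpyxl import Workbook
from openpyxl.formatting.rule import IconSetRule

workbook = Workbook()
sheet = workbook.active

# Hypothetical star ratings in column H, with a header in H1
sheet["H1"] = "star_rating"
for row, rating in enumerate([1, 2, 3, 4, 5], start=2):
    sheet.cell(row=row, column=8, value=rating)

# A three-icon set needs three threshold values (illustrative ones here)
traffic_lights = IconSetRule("3TrafficLights1", "num", [1, 3, 4])
sheet.conditional_formatting.add("H2:H100", traffic_lights)

icon_style = traffic_lights.iconSet.iconSet
```

Note that the number of values has to match the number of icons in the chosen set, which is why this rule takes three values where 5Arrows took five.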

Finally, the DataBar allows you to create progress bars:

>>> from openpyxl.formatting.rule import DataBarRule

>>> data_bar_rule = DataBarRule(start_type="num",
...                             start_value=1,
...                             end_type="num",
...                             end_value=5,
...                             color="0000FF00")  # Green
>>> sheet.conditional_formatting.add("H2:H100", data_bar_rule)
>>> workbook.save("sample_conditional_formatting_data_bar.xlsx")

You'll now see a green progress bar that gets fuller the closer the star rating is to the number five:

As you can see, there are a lot of cool things you can do with conditional formatting.

Here, you saw only a few examples of what you can achieve with it, but check the openpyxl documentation to see a bunch of other options.
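As one more option, the balance sheet use case mentioned at the start of this section (negative totals in red, positive ones in green) could be sketched with CellIsRule, another helper from openpyxl.formatting.rule. The month names and totals below are invented for the sketch:

```python
from openpyxl import Workbook
from openpyxl.formatting.rule import CellIsRule
from openpyxl.styles import Font

workbook = Workbook()
sheet = workbook.active

# Hypothetical monthly totals, made up for this sketch
sheet.append(["Month", "Total"])
for month, total in [("January", 1200), ("February", -350), ("March", 80)]:
    sheet.append([month, total])

red_text = Font(color="00FF0000")
green_text = Font(color="0000FF00")

# One rule per condition: below zero in red, above zero in green
negative_rule = CellIsRule(operator="lessThan", formula=["0"], font=red_text)
positive_rule = CellIsRule(operator="greaterThan", formula=["0"], font=green_text)

# Cover the totals column from row 2 down to the last row with data
totals_range = "B2:B{}".format(sheet.max_row)
sheet.conditional_formatting.add(totals_range, negative_rule)
sheet.conditional_formatting.add(totals_range, positive_rule)
```

Because the formatting is conditional, changing a total from positive to negative in the spreadsheet later should flip its color without touching the code again.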

Adding Images

Even though images are not something that you'll often see in a spreadsheet, it's quite cool to be able to add them. Maybe you can use them for branding purposes or to make spreadsheets more personal.

To be able to load images to a spreadsheet using openpyxl, you'll have to install Pillow (for example, with pip install Pillow).

Apart from that, you'll also need an image. For this example, you can grab the Real Python logo below and convert it from .webp to .png using an online converter such as cloudconvert.com, save the final file as logo.png, and copy it to the root folder where you're running your examples:

Afterward, this is the code you need to import that image into the hello_world.xlsx spreadsheet:

from openpyxl import load_workbook
from openpyxl.drawing.image import Image

# Let's use the hello_world spreadsheet since it has less data
workbook = load_workbook(filename="hello_world.xlsx")
sheet = workbook.active

logo = Image("logo.png")

# A bit of resizing to not fill the whole spreadsheet with the logo
logo.height = 150
logo.width = 150

sheet.add_image(logo, "A3")
workbook.save(filename="hello_world_logo.xlsx")

You have an image on your spreadsheet! Here it is:

The image's top left corner is on the cell you chose, in this case, A3.

Adding Pretty Charts

Another powerful thing you can do with spreadsheets is create an incredible variety of charts.

Charts are a great way to visualize and understand loads of data quickly. There are a lot of different chart types: bar chart, pie chart, line chart, and so on. openpyxl has support for a lot of them.

Here, you'll see only a couple of examples of charts because the theory behind it is the same for every single chart type:

For any chart you want to build, you'll need to define the chart type (BarChart, LineChart, and so forth), plus the data to be used for the chart, which is called Reference.

Before you can build your chart, you need to define what data you want to see represented in it. Sometimes, you can use the dataset as is, but other times you need to massage the data a bit to get additional information.

Let's start by building a new workbook with some sample data:

from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

workbook = Workbook()
sheet = workbook.active

# Let's create some sample sales data
rows = [
    ["Product", "Online", "Store"],
    [1, 30, 45],
    [2, 40, 30],
    [3, 40, 25],
    [4, 50, 30],
    [5, 30, 25],
    [6, 25, 35],
    [7, 20, 40],
]

for row in rows:
    sheet.append(row)

Now you're going to start by creating a bar chart that displays the total number of sales per product:

chart = BarChart()
data = Reference(worksheet=sheet,
                 min_row=1,
                 max_row=8,
                 min_col=2,
                 max_col=3)

chart.add_data(data, titles_from_data=True)
sheet.add_chart(chart, "E2")

workbook.save("chart.xlsx")

There you have it. Below, you can see a very straightforward bar chart showing the difference between online product sales and in-store product sales:

Like with images, the top left corner of the chart is on the cell you added the chart to. In your case, it was on cell E2.

Try creating a line chart instead, changing the data a bit:

import random
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

# Let's create some sample sales data
rows = [
    ["", "January", "February", "March", "April",
    "May", "June", "July", "August", "September",
     "October", "November", "December"],
    [1, ],
    [2, ],
    [3, ],
]

for row in rows:
    sheet.append(row)

# Fill rows 2-4, columns B-M, with random monthly sales figures
for row in sheet.iter_rows(min_row=2,
                           max_row=4,
                           min_col=2,
                           max_col=13):
    for cell in row:
        cell.value = random.randrange(5, 100)

With the above code, you'll be able to generate some random data regarding the sales of 3 different products across a whole year.

Once that's done, you can very easily create a line chart with the following code:

chart = LineChart()
data = Reference(worksheet=sheet,
                 min_row=2,
                 max_row=4,
                 min_col=1,
                 max_col=13)

chart.add_data(data, from_rows=True, titles_from_data=True)
sheet.add_chart(chart, "C6")

workbook.save("line_chart.xlsx")

Here's the outcome of the above piece of code:

One thing to keep in mind here is the fact that you're using from_rows=True when adding the data. This argument makes the chart plot row by row instead of column by column.

In your sample data, you see that each product has a row with 12 values (1 column per month). That's why you use from_rows. If you don't pass that argument, by default, the chart tries to plot by column, and you'll get a month-by-month comparison of sales.

Another difference that has to do with the above argument change is the fact that our Reference now starts from the first column, min_col=1, instead of the second one. This change is needed because the chart now expects the first column to have the titles.

There are a couple of other things you can also change regarding the style of the chart. For example, you can add specific categories to the chart:
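To see the effect of from_rows and the matching change in the Reference, the following sketch builds both variants side by side and counts the resulting series. It uses deterministic, made-up sales figures instead of random ones so the result is reproducible, and it doesn't save anything to disk:

```python
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

months = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
sheet.append([""] + months)
for product in (1, 2, 3):
    # Made-up, deterministic sales figures (12 values per product)
    sheet.append([product] + [10 * product + month for month in range(12)])

# Row by row: column A holds the titles, so the Reference starts at min_col=1
by_row = LineChart()
row_data = Reference(worksheet=sheet,
                     min_row=2, max_row=4, min_col=1, max_col=13)
by_row.add_data(row_data, from_rows=True, titles_from_data=True)

# Column by column (the default): row 1 holds the titles instead,
# so the Reference starts at min_row=1 and skips column A
by_column = LineChart()
column_data = Reference(worksheet=sheet,
                        min_row=1, max_row=4, min_col=2, max_col=13)
by_column.add_data(column_data, titles_from_data=True)

series_counts = (len(by_row.series), len(by_column.series))
```

Here series_counts comes out as (3, 12): one line per product when plotting by row, and one line per month when plotting by column.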

cats = Reference(worksheet=sheet,
                 min_row=1,
                 max_row=1,
                 min_col=2,
                 max_col=13)
chart.set_categories(cats)

Add this piece of code before saving the workbook, and you should see the month names appearing instead of numbers:

Code-wise, this is a minimal change. But in terms of the readability of the spreadsheet, this makes it much easier for someone to open the spreadsheet and understand the chart straight away.

Another thing you can do to improve the chart readability is to add axis titles. You can do it using the attributes x_axis and y_axis:

chart.x_axis.title = "Months"
chart.y_axis.title = "Sales (per unit)"

This will generate a spreadsheet like the one below:

As you can see, small changes like the above make reading your chart a much easier and quicker task.

There is also a way to style your chart by using Excel's default ChartStyle property. In this case, you have to choose a number between 1 and 48. Depending on your choice, the colors of your chart change as well:

# You can play with this by choosing any number between 1 and 48
chart.style = 24

With the style selected above, all lines have some shade of orange:

There is no clear documentation on what each style number looks like, but this spreadsheet has a few examples of the styles available.

Here's the full code used to generate the line chart with categories, axis titles, and style:
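If you'd rather look at the styles yourself, one possible approach is to generate a gallery workbook with one small chart per style number. The sketch below does that under a few assumptions: the quarterly sample data, the grid layout (4 charts per row), and the chart_styles.xlsx file name are all made up, and the file is written to the system's temporary directory:

```python
import os
import tempfile

from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference
from openpyxl.utils import get_column_letter

workbook = Workbook()
sheet = workbook.active

# Made-up sample data: one header row plus three short series
sheet.append(["", "Q1", "Q2", "Q3", "Q4"])
for product in (1, 2, 3):
    sheet.append([product] + [product * 10 + quarter * 5 for quarter in range(4)])

data = Reference(worksheet=sheet, min_row=2, max_row=4, min_col=1, max_col=5)

charts = []
for index, style in enumerate(range(1, 49)):
    chart = LineChart()
    chart.title = "Style {}".format(style)
    chart.style = style
    chart.add_data(data, from_rows=True, titles_from_data=True)

    # Place the charts on a grid: 4 per row, spaced so they don't overlap
    column = get_column_letter(7 + (index % 4) * 10)
    row = 1 + (index // 4) * 16
    sheet.add_chart(chart, "{}{}".format(column, row))
    charts.append(chart)

# Hypothetical output location, chosen for this sketch
output_path = os.path.join(tempfile.gettempdir(), "chart_styles.xlsx")
workbook.save(output_path)
```

Opening the resulting file in a spreadsheet application should show all 48 variations next to each other, each labeled with its style number.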

import random
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

# Let's create some sample sales data
rows = [
    ["", "January", "February", "March", "April",
    "May", "June", "July", "August", "September",
     "October", "November", "December"],
    [1, ],
    [2, ],
    [3, ],
]

for row in rows:
    sheet.append(row)

for row in sheet.iter_rows(min_row=2,
                           max_row=4,
                           min_col=2,
                           max_col=13):
    for cell in row:
        cell.value = random.randrange(5, 100)

# Create a LineChart and add the main data
chart = LineChart()
data = Reference(worksheet=sheet,
                 min_row=2,
                 max_row=4,
                 min_col=1,
                 max_col=13)
chart.add_data(data, titles_from_data=True, from_rows=True)

# Add categories to the chart
cats = Reference(worksheet=sheet,
                 min_row=1,
                 max_row=1,
                 min_col=2,
                 max_col=13)
chart.set_categories(cats)

# Rename the X and Y Axis
chart.x_axis.title = "Months"
chart.y_axis.title = "Sales (per unit)"

# Apply a specific Style
chart.style = 24

# Save!
sheet.add_chart(chart, "C6")
workbook.save("line_chart.xlsx")

There are a lot more chart types and customizations you can apply, so be sure to check out the package documentation on this if you need some specific formatting.
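As one illustration of another chart type, here is a minimal sketch that swaps LineChart for BarChart while keeping the same Reference pattern. The sample data, the chart title, and the "F2" anchor are illustrative assumptions, not part of the original example.

```python
import random
from io import BytesIO

from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

workbook = Workbook()
sheet = workbook.active

# Hypothetical sample data: a header row plus three products
sheet.append(["", "January", "February", "March"])
for product_id in (1, 2, 3):
    sheet.append([product_id] + [random.randrange(5, 100) for _ in range(3)])

# Same Reference pattern as the line chart, only the chart class changes
chart = BarChart()
chart.type = "col"  # vertical bars; "bar" would draw horizontal ones
chart.title = "Sales by month"

data = Reference(worksheet=sheet, min_row=2, max_row=4, min_col=1, max_col=4)
chart.add_data(data, titles_from_data=True, from_rows=True)

cats = Reference(worksheet=sheet, min_row=1, max_row=1, min_col=2, max_col=4)
chart.set_categories(cats)

sheet.add_chart(chart, "F2")

# Saving to an in-memory buffer keeps the sketch free of side effects;
# pass a filename such as "bar_chart.xlsx" to write a real file instead
buffer = BytesIO()
workbook.save(buffer)
```

Only the chart class and its type-specific attributes change; the data and category references are built exactly as before.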

Convert Python Classes to Excel Spreadsheet

You already saw how to convert an Excel spreadsheet's data into Python classes, but now let's do the opposite.

Let's imagine you have a database and are using some Object-Relational Mapping (ORM) to map DB objects into Python classes. Now, you want to export those same objects into a spreadsheet.

Let's assume the following data classes to represent the data coming from your database regarding product sales:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Sale:
        quantity: int

    @dataclass
    class Product:
        id: str
        name: str
        sales: List[Sale]

Now, let's generate some random data, assuming the above classes are stored in a db_classes.py file:

    import random

    # Ignore these for now. You'll use them in a sec ;)
    from openpyxl import Workbook
    from openpyxl.chart import LineChart, Reference

    from db_classes import Product, Sale

    products = []

    # Let's create 5 products
    for idx in range(1, 6):
        sales = []

        # Create 5 months of sales
        for _ in range(5):
            sale = Sale(quantity=random.randrange(5, 100))
            sales.append(sale)

        product = Product(id=str(idx),
                          name="Product %s" % idx,
                          sales=sales)
        products.append(product)

By running this piece of code, you should get 5 products with 5 months of sales with a random quantity of sales for each month.
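If you want the same random data on every run, for example while debugging the export, one option is to seed a private random generator. The following is a minimal, standard-library-only sketch; the make_products() helper and its parameters are hypothetical names introduced here, and the data classes are repeated so the snippet runs on its own.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Sale:
    quantity: int


@dataclass
class Product:
    id: str
    name: str
    sales: List[Sale]


def make_products(count=5, months=5, seed=None):
    """Build `count` products, each with `months` random sales figures."""
    rng = random.Random(seed)  # private generator, global random state is untouched
    products = []
    for idx in range(1, count + 1):
        sales = [Sale(quantity=rng.randrange(5, 100)) for _ in range(months)]
        products.append(Product(id=str(idx), name=f"Product {idx}", sales=sales))
    return products


products = make_products(seed=42)
print(len(products), len(products[0].sales))  # prints "5 5"
```

Calling make_products() twice with the same seed yields equal lists, because dataclasses compare by field values.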

Now, to convert this into a spreadsheet, you need to iterate over the data and append it to the spreadsheet:

    workbook = Workbook()
    sheet = workbook.active

    # Append column names first
    sheet.append(["Product ID", "Product Name", "Month 1",
                  "Month 2", "Month 3", "Month 4", "Month 5"])

    # Append the data
    for product in products:
        data = [product.id, product.name]
        for sale in product.sales:
            data.append(sale.quantity)
        sheet.append(data)

That's it. That should allow you to create a spreadsheet with some data coming from your database.
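The same steps can be wrapped in a reusable function. This is only a sketch under a few assumptions: products_to_workbook() is a hypothetical helper name, the month headers are derived from the longest sales list instead of being hard-coded, and the data classes are repeated so the snippet is self-contained.

```python
from dataclasses import dataclass
from typing import List

from openpyxl import Workbook


@dataclass
class Sale:
    quantity: int


@dataclass
class Product:
    id: str
    name: str
    sales: List[Sale]


def products_to_workbook(products):
    """Return a Workbook with one header row and one row per product."""
    workbook = Workbook()
    sheet = workbook.active

    # Derive the month columns from the longest sales history
    months = max((len(p.sales) for p in products), default=0)
    header = ["Product ID", "Product Name"]
    header += [f"Month {n}" for n in range(1, months + 1)]
    sheet.append(header)

    for product in products:
        sheet.append([product.id, product.name,
                      *(sale.quantity for sale in product.sales)])

    return workbook


# Hypothetical demo data
demo = [
    Product(id="1", name="Product 1", sales=[Sale(10), Sale(20)]),
    Product(id="2", name="Product 2", sales=[Sale(5), Sale(35)]),
]
workbook = products_to_workbook(demo)
sheet = workbook.active
```

Returning the workbook instead of saving it inside the function leaves the caller free to add charts or styles before calling workbook.save().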

However, why not use some of that cool knowledge you gained recently to add a chart as well, to display that data more visually?

All right, then you could probably do something like this:

    chart = LineChart()
    data = Reference(worksheet=sheet,
                     min_row=2,
                     max_row=6,
                     min_col=2,
                     max_col=7)

    chart.add_data(data, titles_from_data=True, from_rows=True)
    sheet.add_chart(chart, "B8")

    cats = Reference(worksheet=sheet,
                     min_row=1,
                     max_row=1,
                     min_col=3,
                     max_col=7)
    chart.set_categories(cats)

    chart.x_axis.title = "Months"
    chart.y_axis.title = "Sales (per unit)"

    workbook.save(filename="oop_sample.xlsx")

Now we're talking! Here's a spreadsheet generated from database objects and with a chart and everything:

That's a great way for you to wrap up your new knowledge of charts!
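The chart code above hard-codes the row and column bounds for 5 products and 5 months. If the number of products or months can vary, one possible approach is to read the bounds from the sheet itself. The sketch below assumes the same column layout (ID, name, then monthly sales) and uses made-up data.

```python
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

# Hypothetical data in the same layout as above: ID, name, then monthly sales
sheet.append(["Product ID", "Product Name", "Month 1", "Month 2", "Month 3"])
sheet.append(["1", "Product 1", 10, 20, 15])
sheet.append(["2", "Product 2", 5, 35, 40])

last_row = sheet.max_row      # 3 for this data
last_col = sheet.max_column   # 5 for this data

chart = LineChart()

# Column 2 holds the product names, which become the series titles
data = Reference(worksheet=sheet, min_row=2, max_row=last_row,
                 min_col=2, max_col=last_col)
chart.add_data(data, titles_from_data=True, from_rows=True)

# The month labels start in column 3 of the header row
cats = Reference(worksheet=sheet, min_row=1, max_row=1,
                 min_col=3, max_col=last_col)
chart.set_categories(cats)

# Anchor the chart two rows below the data
sheet.add_chart(chart, f"B{last_row + 2}")
```

With 5 products, last_row would be 6 and the chart would land on "B8", matching the hard-coded values in the example above.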

Bonus: Working With Pandas

Even though you can use Pandas to handle Excel files, there are a few things that you either can't accomplish with Pandas or that you'd be better off just using openpyxl directly.

For example, some of the advantages of using openpyxl are the ability to easily customize your spreadsheet with styles, conditional formatting, and such.

But guess what, you don't have to worry about picking. In fact, openpyxl has support for both converting data from a Pandas DataFrame into a workbook or the opposite, converting an openpyxl workbook into a Pandas DataFrame.

First things first, remember to install the pandas package (for example, with pip install pandas).

Then, let's create a sample DataFrame:

    import pandas as pd

    data = {
        "Product Name": ["Product 1", "Product 2"],
        "Sales Month 1": [10, 20],
        "Sales Month 2": [5, 35],
    }
    df = pd.DataFrame(data)

Now that you have some data, you can use dataframe_to_rows() to convert it from a DataFrame into a worksheet:

    from openpyxl import Workbook
    from openpyxl.utils.dataframe import dataframe_to_rows

    workbook = Workbook()
    sheet = workbook.active

    for row in dataframe_to_rows(df, index=False, header=True):
        sheet.append(row)

    workbook.save("pandas.xlsx")

You should see a spreadsheet that looks like this:

If you want to add the DataFrame's index, you can change to index=True, and it adds each row's index into your spreadsheet.
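Because the rows end up in a regular openpyxl worksheet, the styling tools mentioned earlier as an openpyxl advantage remain available. As a small sketch, assuming the same sample DataFrame, the header row written by header=True could be made bold like this:

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.styles import Font
from openpyxl.utils.dataframe import dataframe_to_rows

df = pd.DataFrame({
    "Product Name": ["Product 1", "Product 2"],
    "Sales Month 1": [10, 20],
    "Sales Month 2": [5, 35],
})

workbook = Workbook()
sheet = workbook.active

for row in dataframe_to_rows(df, index=False, header=True):
    sheet.append(row)

# Row 1 holds the column names written by header=True
for cell in sheet[1]:
    cell.font = Font(bold=True)
```

The workbook can then be saved with workbook.save() as in the previous example.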

On the other hand, if you want to convert a spreadsheet into a DataFrame, you can also do it in a very straightforward way like so:

    import pandas as pd
    from openpyxl import load_workbook

    workbook = load_workbook(filename="sample.xlsx")
    sheet = workbook.active

    values = sheet.values
    df = pd.DataFrame(values)

Alternatively, if you want to add the correct headers and use the review ID as the index, for example, then you can also do it like this instead:

    import pandas as pd
    from openpyxl import load_workbook
    from mapping import REVIEW_ID

    workbook = load_workbook(filename="sample.xlsx")
    sheet = workbook.active

    data = sheet.values

    # Set the first row as the columns for the DataFrame
    cols = next(data)
    data = list(data)

    # Set the field "review_id" as the indexes for each row
    idx = [row[REVIEW_ID] for row in data]

    df = pd.DataFrame(data, index=idx, columns=cols)

Using indexes and columns allows you to access data from your DataFrame easily:

    >>> df.columns
    Index(['marketplace', 'customer_id', 'review_id', 'product_id',
           'product_parent', 'product_title', 'product_category', 'star_rating',
           'helpful_votes', 'total_votes', 'vine', 'verified_purchase',
           'review_headline', 'review_body', 'review_date'],
          dtype='object')

    >>> # Get first 10 reviews' star rating
    >>> df["star_rating"][:10]
    R3O9SGZBVQBV76    5
    RKH8BNC3L5DLF     5
    R2HLE8WKZSU3NL    2
    R31U3UH5AZ42LL    5
    R2SV659OUJ945Y    4
    RA51CP8TR5A2L     5
    RB2Q7DLDN6TH6     5
    R2RHFJV0UYBK3Y    1
    R2Z6JOQ94LFHEP    5
    RX27XIIWY5JPB     4
    Name: star_rating, dtype: int64

    >>> # Grab review with id "R2EQL1V1L6E0C9", using the index
    >>> df.loc["R2EQL1V1L6E0C9"]
    marketplace               US
    customer_id         15305006
    review_id     R2EQL1V1L6E0C9
    product_id        B004LURNO6
    product_parent     892860326
    review_headline   5 Stars
    review_body          Love it
    review_date       2015-08-31
    Name: R2EQL1V1L6E0C9, dtype: object

There you go, whether you want to use openpyxl to prettify your Pandas dataset or use Pandas to do some hardcore algebra, you now know how to switch between both packages.
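To see both directions together without needing sample.xlsx, here is a hedged round-trip sketch. The review IDs and ratings are made up, and the workbook is saved to an in-memory buffer rather than a file; otherwise it follows the same dataframe_to_rows() and sheet.values pattern shown above.

```python
from io import BytesIO

import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

original = pd.DataFrame({
    "review_id": ["R1", "R2", "R3"],  # hypothetical IDs
    "star_rating": [5, 2, 4],
})

# DataFrame -> workbook, saved to an in-memory buffer
workbook = Workbook()
sheet = workbook.active
for row in dataframe_to_rows(original, index=False, header=True):
    sheet.append(row)
buffer = BytesIO()
workbook.save(buffer)
buffer.seek(0)

# Workbook -> DataFrame, using the first row as column names
loaded = load_workbook(buffer)
rows = loaded.active.values
cols = next(rows)
restored = pd.DataFrame(list(rows), columns=list(cols)).set_index("review_id")

print(restored.loc["R2", "star_rating"])
```

Here set_index() plays the role of the explicit idx list in the earlier example: it promotes the review_id column to the DataFrame's index after loading.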

Conclusion

Phew, after that long read, you now know how to work with spreadsheets in Python! You can rely on openpyxl, your trustworthy companion, to:

Extract valuable data from spreadsheets in a Pythonic manner
Create your own spreadsheets, no matter the complexity level
Add cool features such as conditional formatting or charts to your spreadsheets

There are a few other things you can do with openpyxl that might not have been covered in this tutorial, but you can always check the package's official documentation website to learn more about it. You can even venture into checking its source code and improving the package further.

Feel free to leave any comments below if you have any questions, or if there's any section you'd love to hear more about.

Source: https://realpython.com/openpyxl-excel-spreadsheets-python/
