Automation to Read From Website and Write to Spreadsheet

A Simple Guide to Automate Your Excel Reporting with Python

Use openpyxl to automate your Excel reporting with Python

Frank Andrade

Let's face it; no matter what our job is, sooner or later, we will have to deal with repetitive tasks like updating a daily report in Excel. Things could become worse if you work for a company that doesn't work with Python because you wouldn't be able to solve this problem by using only Python.

But don't worry, you still can use your Python skills to automate your Excel reporting without having to convince your boss to migrate to Python! You simply have to use the Python module openpyxl to tell Excel what you want to do through Python. Unlike a previous article I wrote that encourages you to move from Excel to Python, with openpyxl you would be able to stick to Excel while creating your reports with Python.

Table of Contents
1. The Dataset
2. Make a Pivot Table with Pandas
- Importing libraries
- Reading the Excel file
- Making a pivot table
- Exporting pivot table to Excel file
3. Make The Report with Openpyxl
- Creating row and column reference
- Adding Excel charts through Python
- Applying Excel formulas through Python
- Formatting the report sheet
4. Automating the Report with a Python Function (Full code)
- Applying the function to a single Excel file
- Applying the function to multiple Excel files
5. Schedule the Python Script to Run Monthly, Weekly, or Daily

The Dataset

In this guide, we'll use an Excel file with sales data that is similar to those files you have as inputs to make reports at work. You can download this file on Kaggle; however, it comes in .csv format, so you should convert it to .xlsx (renaming the extension alone does not convert the file; open it in Excel and save it as an Excel workbook) or just download it from this Google Drive link (I also changed the file name to supermarket_sales.xlsx).

Before writing any code, take a look at the file on Google Drive and familiarize yourself with it. That file is going to be the input to create the following report through Python.

Image by author

Now let's make that report and automate it with Python!

Make a Pivot Table with Pandas

Importing libraries

Now that you downloaded the Excel file, let's import the libraries we'll use in this guide.

    import pandas as pd
    import openpyxl
    from openpyxl import load_workbook
    from openpyxl.styles import Font
    from openpyxl.chart import BarChart, Reference
    import string

We'll use Pandas to read the Excel file, create a pivot table, and export it to Excel. Then we'll use the Openpyxl library to write Excel formulas, make charts and format the spreadsheet through Python. Finally, we'll create a Python function to automate this process.

Note: If you don't have those libraries installed in Python, you can easily install them by writing pip install pandas and pip install openpyxl on your terminal or command prompt.

Reading the Excel file

Before we read the Excel file, make sure the file is in the same place where your Python script is located. Then, read the Excel file with pd.read_excel() like in the following code.

    excel_file = pd.read_excel('supermarket_sales.xlsx')
    excel_file[['Gender', 'Product line', 'Total']]

The file has many columns but we'll only use the Gender, Product line, and Total columns for the report we're going to create. To show you what they look like, I selected them using double brackets. If we print this on Jupyter Notebooks, you'll see the following dataframe that looks like an Excel spreadsheet.

Image by author

Making a pivot table

We can easily create a pivot table from the excel_file dataframe previously created. We just need to use the .pivot_table() method. Let's say we want to create a pivot table that shows the total money spent by males and females on the different product lines. To do so, we write the following code.

    report_table = excel_file.pivot_table(index='Gender',
                                          columns='Product line',
                                          values='Total',
                                          aggfunc='sum').round(0)

The report_table should look something like this.
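
To see what .pivot_table() does without opening the full dataset, here is a minimal sketch on a made-up five-row dataframe. The values and the sample/sample_table names are invented for illustration and are not taken from supermarket_sales.xlsx; only the three column names match the ones used in this guide.

```python
import pandas as pd

# Made-up sample with the same three columns used in this guide
sample = pd.DataFrame({
    'Gender': ['Female', 'Male', 'Female', 'Male', 'Female'],
    'Product line': ['Sports and travel', 'Sports and travel',
                     'Health and beauty', 'Health and beauty',
                     'Sports and travel'],
    'Total': [100.4, 50.2, 30.0, 70.6, 20.3],
})

# One row per gender, one column per product line, totals summed and rounded
sample_table = sample.pivot_table(index='Gender',
                                  columns='Product line',
                                  values='Total',
                                  aggfunc='sum').round(0)
print(sample_table)
```

Each cell holds the sum of Total for one gender and one product line, which is the same shape the real report_table has (two rows, one column per product line).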

Exporting pivot table to Excel file

To export the previous pivot table created we use the .to_excel() method. Inside parentheses, we have to write the name of the output Excel file. In this case, I'll name this file report_2021.xlsx

We can also specify the name of the sheet we want to create and in which cell the pivot table should be located.

    report_table.to_excel('report_2021.xlsx',
                          sheet_name='Report',
                          startrow=4)

Now the Excel file is exported to the same folder where your Python script is located.

Make The Report with Openpyxl

Every time we want to access a workbook we'll use the load_workbook function imported from openpyxl and then save it with the .save() method. In the following sections, I'll be loading and saving the workbook every time we modify the workbook; however, you only need to do this once (like in the full code shown at the end of this guide).

Creating row and column reference

To automate the report, we need to take the minimum and maximum active column/row, so the code we're going to write keeps working even if we add more data.

To obtain the references in the workbook, we first load the workbook with load_workbook() and locate the sheet we want to work with using wb['name_of_sheet']. Then we access the active cells with .active

    wb = load_workbook('report_2021.xlsx')
    sheet = wb['Report']
    # cell references (original spreadsheet)
    min_column = wb.active.min_column
    max_column = wb.active.max_column
    min_row = wb.active.min_row
    max_row = wb.active.max_row

You can print the variables created to get an idea of what they mean. For this case, we obtain these numbers.

    Min Columns: 1
    Max Columns: 7
    Min Rows: 5
    Max Rows: 7

Open the report_2021.xlsx we exported before to verify this.

Image by author

As you can see in the picture above, the minimum row is 5 and the maximum row is 7. Also, the minimum column is A (1) and the maximum column is G (7). These references will be extremely useful for the following sections.
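
If you want to check these references without the exported file, the following sketch builds a small workbook in memory with the same layout (header on row 5, two data rows, seven columns). The cell values and the 'Line 1' to 'Line 6' headers are placeholders I made up. It also reads the references from the sheet variable directly instead of wb.active, which gives the same result here because there is only one sheet, and is a bit safer if the workbook ever has more than one.

```python
from openpyxl import Workbook

wb = Workbook()
sheet = wb.active
sheet.title = 'Report'

# Placeholder data laid out like a pivot table exported with startrow=4
rows = [
    ['Gender', 'Line 1', 'Line 2', 'Line 3', 'Line 4', 'Line 5', 'Line 6'],
    ['Female', 10, 20, 30, 40, 50, 60],
    ['Male', 15, 25, 35, 45, 55, 65],
]
for r, row in enumerate(rows, start=5):
    for c, value in enumerate(row, start=1):
        sheet.cell(row=r, column=c, value=value)

# Same four references used in this guide, read from the sheet itself
min_column, max_column = sheet.min_column, sheet.max_column
min_row, max_row = sheet.min_row, sheet.max_row
print('Min Columns:', min_column)
print('Max Columns:', max_column)
print('Min Rows:', min_row)
print('Max Rows:', max_row)
```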

Adding Excel charts through Python

To create an Excel chart from the pivot table we created we need to use the BarChart class we imported earlier. To identify the position of the data and category values, we use the Reference class from openpyxl (we imported Reference in the beginning of this article)

    wb = load_workbook('report_2021.xlsx')
    sheet = wb['Report']
    # barchart
    barchart = BarChart()
    # locate data and categories
    data = Reference(sheet,
                     min_col=min_column+1,
                     max_col=max_column,
                     min_row=min_row,
                     max_row=max_row)  # including headers
    categories = Reference(sheet,
                           min_col=min_column,
                           max_col=min_column,
                           min_row=min_row+1,
                           max_row=max_row)  # not including headers
    # adding data and categories
    barchart.add_data(data, titles_from_data=True)
    barchart.set_categories(categories)
    # location chart
    sheet.add_chart(barchart, "B12")
    barchart.title = 'Sales by Product line'
    barchart.style = 5  # choose the chart style
    wb.save('report_2021.xlsx')

After writing that code, the report_2021.xlsx file should look like this.

Image by author

Breaking down the code:

  • barchart = BarChart() initializes a barchart variable from the BarChart class
  • data and categories are variables that represent where that data is located. We're using the column and row references we defined above to automate this. Also, keep in mind that I'm including the headers in data but not in categories
  • We use add_data and set_categories to add the necessary data to the barchart. Inside add_data I'm adding titles_from_data=True because I included the headers for data
  • We use sheet.add_chart to specify what we want to add to the "Report" sheet and in which cell we want to add it
  • We can modify the default title and chart style using barchart.title and barchart.style
  • We save all the changes with wb.save()

Applying Excel formulas through Python

You can write Excel formulas through Python the same way you'd write them in an Excel sheet. For example, let's say we wish to sum the data in cells B5 and B6 and show it on cell B7 with the currency style.

    sheet['B7'] = '=SUM(B5:B6)'
    sheet['B7'].style = 'Currency'

That's pretty simple, right? We can repeat that from column B to G or use a for loop to automate it. But first, we need to get the alphabet to have it as a reference for the names that columns have in Excel (A, B, C, …) To do so, we use the string library and write the following code.

    import string
    alphabet = list(string.ascii_uppercase)
    excel_alphabet = alphabet[0:max_column]
    print(excel_alphabet)

If we print this we'll obtain a list from A to G.

This happens because first, we created an alphabet list from A to Z, but then we took a slice [0:max_column] to match the length of this list (7) with the first seven letters of the alphabet (A-G).

Note: Python lists start on 0, so A=0, B=1, C=2, and so on. Also, the [a:b] slice notation takes b-a elements (starting with "a" and ending with "b-1")
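
One limitation worth knowing: the alphabet slice only covers columns A to Z, so it stops working once a sheet has more than 26 columns. The sketch below uses a hypothetical helper, column_letter, that I wrote for illustration (it is not part of the original code) to turn any 1-based column number into its Excel name. openpyxl also ships a ready-made equivalent, openpyxl.utils.get_column_letter, if you prefer not to write your own.

```python
import string


def column_letter(index):
    """Convert a 1-based column number to Excel letters (1 -> 'A', 27 -> 'AA')."""
    letters = ''
    while index > 0:
        # Excel columns are a base-26 system without a zero digit
        index, remainder = divmod(index - 1, 26)
        letters = string.ascii_uppercase[remainder] + letters
    return letters


max_column = 7  # the value obtained earlier in this guide
excel_alphabet = [column_letter(n) for n in range(1, max_column + 1)]
print(excel_alphabet)
```

For the seven columns of this report the result is the same list from A to G as the slice, so the rest of the guide works unchanged with either approach.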

After this, we can make a loop through the columns and apply the sum formula only now with column references, so instead of writing this,

    sheet['B7'] = '=SUM(B5:B6)'
    sheet['B7'].style = 'Currency'

now we include the references and put it inside a for loop.

    wb = load_workbook('report_2021.xlsx')
    sheet = wb['Report']
    # sum in columns B-G
    for i in excel_alphabet:
        if i != 'A':
            sheet[f'{i}{max_row+1}'] = f'=SUM({i}{min_row+1}:{i}{max_row})'
            sheet[f'{i}{max_row+1}'].style = 'Currency'
    # adding total label
    sheet[f'{excel_alphabet[0]}{max_row+1}'] = 'Total'
    wb.save('report_2021.xlsx')

After running the code, we get the =SUM formula in the "Total" row for columns B to G.

Image by author

Breaking down the code:

  • for i in excel_alphabet loops through all the active columns, but then we excluded the A column with if i!='A' because the A column doesn't contain numeric data
  • sheet[f'{i}{max_row+1}'] = f'=SUM({i}{min_row+1}:{i}{max_row})' follows the same pattern as writing sheet['B7'] = '=SUM(B5:B6)' but now we do that for columns B to G
  • sheet[f'{i}{max_row+1}'].style = 'Currency' gives the currency style to cells below the maximum row.
  • We add the 'Total' label to the A column below the maximum row with sheet[f'{excel_alphabet[0]}{max_row+1}'] = 'Total'
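
The loop can be tried on an in-memory workbook, as in the sketch below. The data is made up, and I use get_column_letter from openpyxl.utils instead of the alphabet list, which is my substitution rather than the approach used above. Keep in mind that openpyxl only stores the formula as text; Excel calculates the result when the file is opened.

```python
from openpyxl import Workbook
from openpyxl.utils import get_column_letter

wb = Workbook()
sheet = wb.active

# Made-up data laid out like the report: header on row 5, data on rows 6-7
rows = [['Gender', 'Line 1', 'Line 2'],
        ['Female', 10, 20],
        ['Male', 15, 25]]
for r, row in enumerate(rows, start=5):
    for c, value in enumerate(row, start=1):
        sheet.cell(row=r, column=c, value=value)

# Capture the references before writing the totals row
min_row, max_row = sheet.min_row, sheet.max_row
min_column, max_column = sheet.min_column, sheet.max_column

# Start one column after the first so the label column (A) is skipped
for col in range(min_column + 1, max_column + 1):
    letter = get_column_letter(col)
    sheet[f'{letter}{max_row + 1}'] = f'=SUM({letter}{min_row + 1}:{letter}{max_row})'
    sheet[f'{letter}{max_row + 1}'].style = 'Currency'
sheet[f'{get_column_letter(min_column)}{max_row + 1}'] = 'Total'

print(sheet['B8'].value)
```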

Formatting the report sheet

To finish the report, we can add a title, subtitle and also customize their font.

    wb = load_workbook('report_2021.xlsx')
    sheet = wb['Report']
    sheet['A1'] = 'Sales Report'
    sheet['A2'] = '2021'
    sheet['A1'].font = Font('Arial', bold=True, size=20)
    sheet['A2'].font = Font('Arial', bold=True, size=10)
    wb.save('report_2021.xlsx')

You can add other parameters inside Font(). On this website, you can find a list of styles available.
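
As a small illustration of those other parameters, the sketch below sets italic, underline and color on a title cell in an in-memory workbook. The particular combination (red, underlined, italic) is just an example I chose, not a style used in the report.

```python
from openpyxl import Workbook
from openpyxl.styles import Font

wb = Workbook()
sheet = wb.active
sheet['A1'] = 'Sales Report'

# Keyword arguments make it explicit which font property each value sets
sheet['A1'].font = Font(name='Arial',
                        size=20,
                        bold=True,
                        italic=True,
                        underline='single',
                        color='FF0000')  # hex RGB, here red

print(sheet['A1'].font.name, sheet['A1'].font.size)
```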

The final report should look like the following picture.

Image by author

Automating the Report with a Python Function

Now that the report is ready, we can put all the code we've written so far inside a function that automates the report, so the next time we want to make this report we only have to introduce the file name and run it.

Notes: For this function to work, the file name should have the structure "sales_month.xlsx" Also, I added a few lines of code that use the name of the month/year of the sales file as a variable, so we can reuse it in the output file and subtitle of the report.

The code below might look intimidating, but it's just what we've written so far plus the new variables file_name, month_name, and month_and_extension.
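
The full function is not reproduced in this copy of the article, so what follows is a minimal sketch assembled from the steps shown earlier, not the author's exact code. The way file_name is split into month_and_extension and month_name, the use of get_column_letter instead of the alphabet list, and the returned output name are my assumptions.

```python
import pandas as pd
from openpyxl import load_workbook
from openpyxl.chart import BarChart, Reference
from openpyxl.styles import Font
from openpyxl.utils import get_column_letter


def automate_excel(file_name):
    """Build report_<month>.xlsx from sales_<month>.xlsx (sketch, not the author's code)."""
    # 'sales_2021.xlsx' -> month_and_extension = '2021.xlsx', month_name = '2021'
    # Assumes a bare file name of the form sales_<month>.xlsx in the current folder
    month_and_extension = file_name.split('_')[1]
    month_name = month_and_extension.split('.')[0]
    output_name = f'report_{month_and_extension}'

    # Pivot table with pandas, exported starting on Excel row 5
    excel_file = pd.read_excel(file_name)
    report_table = excel_file.pivot_table(index='Gender',
                                          columns='Product line',
                                          values='Total',
                                          aggfunc='sum').round(0)
    report_table.to_excel(output_name, sheet_name='Report', startrow=4)

    # Load the workbook once, apply every change, save once at the end
    wb = load_workbook(output_name)
    sheet = wb['Report']
    min_column, max_column = sheet.min_column, sheet.max_column
    min_row, max_row = sheet.min_row, sheet.max_row

    # Bar chart: data includes the header row, categories do not
    barchart = BarChart()
    data = Reference(sheet, min_col=min_column + 1, max_col=max_column,
                     min_row=min_row, max_row=max_row)
    categories = Reference(sheet, min_col=min_column, max_col=min_column,
                           min_row=min_row + 1, max_row=max_row)
    barchart.add_data(data, titles_from_data=True)
    barchart.set_categories(categories)
    barchart.title = 'Sales by Product line'
    barchart.style = 5
    sheet.add_chart(barchart, 'B12')

    # Totals row below the data, skipping the label column
    for col in range(min_column + 1, max_column + 1):
        letter = get_column_letter(col)
        sheet[f'{letter}{max_row + 1}'] = f'=SUM({letter}{min_row + 1}:{letter}{max_row})'
        sheet[f'{letter}{max_row + 1}'].style = 'Currency'
    sheet[f'{get_column_letter(min_column)}{max_row + 1}'] = 'Total'

    # Title and subtitle, reusing the month/year taken from the file name
    sheet['A1'] = 'Sales Report'
    sheet['A2'] = month_name.title()
    sheet['A1'].font = Font('Arial', bold=True, size=20)
    sheet['A2'].font = Font('Arial', bold=True, size=10)
    wb.save(output_name)
    return output_name
```

The block only defines the function; calling it with a file such as 'sales_2021.xlsx' writes 'report_2021.xlsx' next to the script.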

Applying the function to a single Excel file

Let's imagine the original file we downloaded has the name "sales_2021.xlsx" instead of "supermarket_sales.xlsx". With this we can apply the function to the report by writing the following

    automate_excel('sales_2021.xlsx')

After running this code, you'll see an Excel file named "report_2021.xlsx" in the same folder where your Python script is located.

Applying the function to multiple Excel files

Let's imagine now we have only monthly Excel files "sales_january.xlsx", "sales_february.xlsx" and "sales_march.xlsx" (You can find those files on my Github to test them)

You can either apply the function one by one to get 3 reports

    automate_excel('sales_january.xlsx')
    automate_excel('sales_february.xlsx')
    automate_excel('sales_march.xlsx')

or you could concatenate them first using pd.concat() and then apply the function only once.

    # read excel files
    excel_file_1 = pd.read_excel('sales_january.xlsx')
    excel_file_2 = pd.read_excel('sales_february.xlsx')
    excel_file_3 = pd.read_excel('sales_march.xlsx')
    # concatenate files
    new_file = pd.concat([excel_file_1,
                          excel_file_2,
                          excel_file_3], ignore_index=True)
    # export file
    new_file.to_excel('sales_2021.xlsx')
    # apply function
    automate_excel('sales_2021.xlsx')
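
If the number of monthly files grows, the one-by-one calls could be replaced by a loop over every file that follows the "sales_month.xlsx" naming structure. The sketch below only covers finding the files and deriving the report names; find_sales_files and report_name are hypothetical helpers I introduce for illustration, and the call to automate_excel is left as a comment because the function is defined elsewhere in this guide.

```python
from pathlib import Path


def find_sales_files(folder='.'):
    """Return the names of sales_*.xlsx files in `folder`, sorted alphabetically."""
    return sorted(p.name for p in Path(folder).glob('sales_*.xlsx'))


def report_name(file_name):
    """Map 'sales_january.xlsx' to 'report_january.xlsx'."""
    return 'report_' + file_name.split('_', 1)[1]


for name in find_sales_files():
    print(name, '->', report_name(name))
    # automate_excel(name)  # uncomment once automate_excel is defined
```

Note that the pattern also matches a combined file such as sales_2021.xlsx if it sits in the same folder, so keep the monthly inputs in their own folder if you want to avoid that.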

Schedule the Python Script to Run Monthly, Weekly, or Daily

You can schedule the Python script we've written in this guide to run whenever you want on your computer. You just need to use the task scheduler or crontab on Windows and Mac respectively.

If you don't know how to schedule a task, click on the guide below to learn how to do it.


Source: https://towardsdatascience.com/a-simple-guide-to-automate-your-excel-reporting-with-python-9d35f143ef7
