Web development.

Technical Writing.

Simple System to Process Invoices Using Docparser, Google Sheets, and Google Script Editor

By Jarrett Retz -June 10th, 2019

In this post the reader will find information on how I use(d) Docparser, Google Sheets, Formulas, and simple Google Scripts to create a functional software system for an accounts payable process.

I learned how to use the Docparser software, which lead me into learning how to use Google Sheets more effectively, and—even cooler—Google Script Editor. These are the three tools that will be discussed. However, the story didn’t start with this list of software, originally.

When I was first working for the company, I was tasked with centralizing, entering, tracking, and facilitating payments of the incoming invoices. The role dealt heavily with the accounts payable process. I was an assistant golf pro trying to use seemingly complicated software to achieve a much simpler task. I spent hours reading the software manual, entering invoices manually, and trying to get a mental map for the user-interface. I started to understand the interface, but couldn’t truly grasp it. It was a foreign layout—but almost similar—in the way that Spanish is relatable to English. Although, anyone who has failed to learn Spanish can attest to the hidden frustrations.

These frustrations were typical of the other workers in the golf shop. Adding to the issue, time and money are constant constraints, so the operation was running on lean resources. The accounts payable process was hanging on to the edge of a raft—but, to it’s credit, the raft was still floating down the river. Therefore, it wasn’t time to full-on-panic.

There are three important prerequisites that set the stage for the new system;

  • There isn’t a software system; or the software is overkill for the level of accounting that can be afforded and is needed.
  • The overwhelming majority of invoices should be coming to the business as email attachments in the .PDF format.
  • The users are college educated, but have little accounting/software specific knowledge.

The second number on the list is very important. I had noticed, the previous season, that most invoices were sent as .PDF attachments. I still had to download and manually enter the data in the other software. This was not the most effective use of my time. I remembered reading that .PDFs are difficult to parse, but I had a hunch that someone, somewhere, may have developed a solution. After a quick search I had found Docparser.

Docparser

docparser dashboard

This software alone solved a huge part of the problem, and there was something special about Docparser: it worked. The user interface seemed logical, and as I got better at using the layout, rules, integrations, etc, it met my intuitions about where (and how) I should be able to create my system. Docparser offers a variety of invoice layout options, and I was able to extract the data using the parsing rules.

"Docparser boils down business documents to the essentials and moves the extracted data to where it belongs. Say good-bye to manual data entry and automate your business."

It’s great software, so I was happy to pay the price for it. It can be billed monthly, or annually, and the pricing can be changed in accordance with changing levels of invoices received.

Automating the incoming invoice data was a BIG to-do, and after testing a few layouts I knew that it could continue to get the job done. Next, came the integrations that Docparser offers. There is an option to have the .PDF attachments forwarded to a specified Docparser account through Gmail. Perfect. The company had a central Gmail account. I’m not going to get into the details of Gmail filtering, but there are many resources to help someone out (including Google’s how-to). Docparser has good documentation and their own how-to on forwarding messages with Gmail.

I set up a few filters, and tested the incoming invoices. At this point, I had a way to get the invoices out of the inbox, and extract the data I needed, but it needed to go somewhere useful. Docparser’s integrations—predictably—go a step a further. I set up an integration that allowed the extracted data to be placed into a specified Google Spreadsheet, and into specific columns. This completed the automation trifecta;

  • .PDF invoices are received from companies and are filtered/forwarded to Docparser
  • Docparser uses the layouts I created to extract data and;
  • Sends that data to a Google Spreadsheet

Pause here, because getting the data to Google Sheets was a big deal. Why? The over-arching issue with the older accounts payable software was that it was too difficult to use and too time consuming to learn. Now, the data is in a spreadsheet, and the employees in the golf shop are way more comfortable using a spreadsheet. This is most likely because they have more knowledge using spreadsheets in Excel, which was more common in their schooling.

Marketing Aside: One of the things I remembered learning in college, as a marketing major, was that people like things that are similar to what they already have, but vary in a slight—and hopefully more attractive—way. Furthermore, the cost of switching grocery stores, insurance providers, etc. depends heavily on the financial, emotional, time-based, and physical costs of the switch. It’s not solely related to the logical, or rational, reasons to change. Take a look at some behavioral economics research and someone can find that actions are more reflective of perceived risk. Perceived risk is not perfect in people, and can cause irrational behavioral. Perceived risk is not that much different than perceived level of effort, or perceived financial cost.

I also had a huge advantage at the marketing research level because I was a user of the older system. I intimately understood the problems. The users of the new system I was setting up were comfortable using Excel, and Google Sheets is basically a clone. Therefore, it had a much better chance of surviving.

The data was now in a spreadsheet, but the spreadsheet was boring, so I needed to improve it.

Google Sheets

spreadsheet with column labels for manito invoice data

Invoice data being inserted into a spreadsheet was great, but that is just the first step in the invoice life-cycle. The invoice would need to be checked off by accounts receivable (billed), and then it would need to be paid. This could be done on the same worksheet (i.e Sheet1), but that could get messy. Instead, I elected to create a number of sheets;

  • Main - this is where the invoices would arrive and stay in unpaid status
  • Paid – this is where invoices from ‘Main’ would be moved once the payment was sent
  • Manual Add – there was a small percentage of paper invoices still coming in and this sheet allowed those to be entered in a more segregated manner
  • Report – this sheet provided the invoice totals per company, as well as other important information (total payable, total past due, total becoming payable, etc.)
  • Purchase Orders – this sheet did not take off but was designed to be the landing place for a Purchase Order form that was created with Google Forms. It was to serve as a reference for whether orders were placed, and placed correctly.

A reader may be wondering how the individual invoice data (represented by individual rows, columns B-L) are transferred from sheet-to-sheet. Looking at the spreadsheet, there are cells that are filled with a light grey color. These represent input cells and the inputs are;

  • The user enters “1” into the Paid (column on the ‘Main’ sheet) to transfer the invoice from ‘Main’ to ‘Paid’.
  • They could alternatively enter “0” into the Paid column, on the ‘Paid’ sheet, to transfer the invoice back to the ‘Main’ sheet.
  • On the ‘Manual Add’ sheet there is an ‘add’ column. The user can enter ‘add’ into that column to trigger the movement of the row they wish to move to the ‘Main’ sheet.
  • The Billed? column does not need specific input, but when it does receive input the Billing Timestamp column records the edit. This notifies the person paying the invoices that this invoice is ready to be paid.

I had originally thought that I would drive the movement of the rows using external Python scripts. I’ll save the reader from the boring story, but it can be inferred that it didn’t work as I had envisioned. I took to the internet and quickly discovered Google Script Editor. The aim of the scripts were simple, so the code was easy to find online.

Before I get into the scripts I am going to talk about some of the Google Sheets formulas. If the reader wishes to skip to the scripts, go right ahead.

Formulas

screenshot of report columns with sample data values

The spreadsheet uses a number of formulas (and a little conditional formatting) that are most notable on the ‘Reports’ sheet. I’ve become quite fond of the SUMIF formula, which has a syntax of;

=SUMIF(range, criterion, [sum_range])

Here is an example of how it’s used in the ‘Report’ sheet;

=SUMIF(Main!B2:B1037,"Camelbak",Main!G2:G1037)

The formula locks on to the Company column in the ‘Main’ sheet, it then looks for the value “Camelbak”, and if it finds the matching value, it takes the value in column G (Totals) and sets the cell to the combined values.

The SUMIF formula worked for this column, however, I also needed to calculate the past due amount for each company. This job required the SUMIFS function.

=SUMIFS(sum_range, criteria_range1, criterion1, [criteria_range2, criterion2, ...])

An example from one of the cells in ‘Report’ sheet is;

=SUMIFS(Main!G2:G1037,Main!B2:B1037,"Camelbak", Main!E2:E1037,"<"&TODAY())

Here, the sum_range comes first, so column G leads the way. Next, there are options to insert criterion. The first criteria_range was the name of the company (Camelback). The second criteria was the Due Date column, and the criterion for that used the TODAY() function to determine if the Due Date value was less than the TODAY() value. If it was, then the total was added to that cell.

Google Script Editor

google scripts dashboard with top part of script

In Google Sheets, someone can find the Google Script Editor by going to the navigation bar and selecting Tools > Script Editor. Here, I found a different world. For someone who is curious about the world of Google Scripts, Extending Google Sheets would be a good place to start. If someone was not inclined to learn extensively about Google Scripts, they could do what I did, which was find specific scripts online—personalizing them—to achieve simple tasks. Below are six different scripts (some with multiple functions) accompanied by an explanation of what they do, the source, and the actual code.

Triggers

Before I explain the scripts it’s paramount that I introduce triggers. Simple triggers, in Google’s words, “let Apps Script run a function automatically when a certain event, like opening a document, occurs.” In the following sections I will mention how functions are triggered (i.e when edits occur, time-of-day, opening the spreadsheet). They are important to pay attention to because they are relatable to the brain sending a signal to a body part to move. Triggers control the system, so the system can either move smoothly or in a clumsy fashion.

OnEdit.gs

This function facilitates the movement of rows between sheets, as well as other actions that need to be triggered when the spreadsheet is edited.

I did not write this script, and I tried to go back and find the link to the person who did. I believe it was on Stack Overflow, and the guy had written the script to help his wife’s small business.

Regardless, I changed some of the variable values, and placed four functions inside of one larger function, because the function triggers on the edit event. This is a common trigger, so this script is a bit cluttered.

The first three functions (mainToPaid, paidToMain, manualAdd) are basically the same functions, but the billingTimestamp was added later. I think I got the billingTimestamp function on stack overflow from an answer by Serge insas.

OnOpen.gs

The function triggers when the spreadsheet is opened to delete duplicates. It does this by determining if data in the Company column and Invoice Number column in one row are identical to the Company column and Invoice Number column in a different row.

I found this script on the Google Developers webpage, and it was written by Romain Vialard. There’s a great breakdown of the script in the link above.

Net30.gs

This script adds thirty days onto the Due Date value for specified companies.

I set up this script based on the script for removing duplicates. Some incoming invoices do not have Due Date values displayed and only display “Terms: Net 30”. This script updates the Due Date column by adding thirty days to the invoice date.

Sort.gs

This script automatically sorts the ‘Main’ sheet by the Due Date column values every night based on a trigger set up between 12:00 a.m. – 1:00 a.m.

The author is Mike Branski (@mikebranksi on github).

BackUpMain.gs

This script copies the data in the ‘Main’ sheet, incase data is disrupted by a user, and pastes the contents into a different hidden—and protected—sheet.

TotalPaidbyMonth.gs

I’m not super proud of this script, but it computes the amount paid based on the changes in cell F3. It then puts that amount on the ‘Report’ sheet. I was having trouble calculating how much was paid based on the month and year. I created the ‘ReportCalc’ page as ‘scratch paper’ that is hidden and protected for generating some of the numbers I needed for the ‘Report’ page.

Anyway, I wrote this script, and it’s a simple way to calculate the difference paid (it uses a time-based trigger), but I’m afraid it’s not very robust.

This combination of tools (for now) does the job as an ad-hoc accounts payable system. It has been an evolving process and will continue to change.

I didn’t cover every single task, but the main functionality has been explained. There is still a lot of room for improvement. I believe that it could be an answer for a small business that finds themselves in a similar situation.

Thanks for being a reader! I hope there was something interesting or helpful. Feel free to leave comments and questions.