AI Driven Accounts Payable Transformation
Tarun Tater1*, Neelamadhav Gantayat1*, Sampath Dechu1, Hussain Jagirdar1, Harshit Rawat2,
Meena Guptha2, Surbhi Gupta2, Lukasz Strak2, Shashi Kiran2, Sivakumar Narayanan2
1IBM Research
2IBM Services
{ttater24, neelamadhav, sampath.dechu, hussain.jagirdar1, harsrawa, meenamga, surbhgup}@in.ibm.com,
lukasz.strak@pl.ibm.com, {shkiran6, sivakumar.narayanan}@in.ibm.com
Abstract
Accounts Payable (AP) is a resource-intensive business
process in large enterprises for paying vendors within
contractual payment deadlines for goods and services
procured from them. There are multiple verifications before
payment to the supplier/vendor. After the validations, the
invoice flows through several steps such as vendor
identification, line-item matching for Purchase Order (PO)
based invoices, accounting code identification for
Non-Purchase Order (Non-PO) based invoices, tax code
identification, etc. Currently, each of these steps is mostly
manual and cumbersome, making the process labor-intensive
and error-prone and requiring constant training of agents.
Automatically processing these invoices for payment without
any manual intervention is quite difficult. To tackle this
challenge, we have developed an automated end-to-end invoice
processing system using AI-based modules for multiple steps
of the invoice processing pipeline. It can be configured to
an individual client’s requirements with minimal effort.
Currently, the system is deployed in production for two
clients. It has successfully processed around 80k invoices,
of which 76% were automatically processed with low or no
manual intervention.
Introduction
The finance function is no longer a cost arbitrage for a client.
They are now looking for innovative solutions that not only
deliver cost benefits but also deliver value. For organizations
that are looking at cutting costs, automation holds the key
(Furth 2005; Bohn 2010). The question is, how and what to
automate? Traditionally, the trend has been to identify repet-
itive tasks that can be automated using automation scripts
and robotics. Slowly, businesses are moving towards looking
at automation opportunities involving high cognitive load.
One of the major tools to achieve this has been machine
learning (ML) components that can integrate with automation
scripts to deliver performance on par with or better than
human level while being more efficient. One such area where
ML is making inroads is accounts payable, thus turning it
into a profit center from a cost center. While technology
surely holds the key, as rightly pointed out by Bohn (2010),
the dilemma industry watchers face is that there’s no “one
size fits all” solution, which leaves individual companies
and departments to cobble together tools and solutions that
might improve speed and accuracy.
*These authors contributed equally.
Copyright © 2022, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
Accounts payable is one of the finance functions which is
extremely labor-intensive (Schaeffer 2002; Cosgrove 2013).
The process deals with procuring raw materials, services,
and goods from listed vendors and paying invoices submitted
by the vendors within contractual payment deadlines. Steps
for managing the accounts payable process involve receiving
invoices from vendors, matching vendor details from the
invoice to the business records using a set of dynamic
business rules, and posting the invoices in the Enterprise
Resource Planning (ERP) system for payment. This process may
also involve some conditional steps depending on the client
and type of invoices. Such steps include performing business
rule checks for invoice compliance; invoice line-item
matching, i.e., matching the line item description on the
invoice against that on the Purchase Order (PO) and the
goods receipt to perform a three-way match; and applying a
tax code to each invoice item. Managing accounts payable is
also a very important and daunting task since the invoices
need to be paid accurately and on time to maintain a
long-term relationship with the vendors (Schaeffer 2002)
while avoiding any penalties. The challenges are amplified
since it also needs to be ensured that duplicate, incorrect,
and fraudulent invoices are not paid, as that would result
in losses, and retrieving such payments would bring in more
costs.
The industry as a whole is moving towards the infusion of
Artificial Intelligence (AI) and machine learning models in
different business processes. However, machine learning
models come with a caveat: even a highly accurate model does
not always predict correctly. Thus, explanations
corresponding to the predictions are important. Also, an
automation system might look at some thresholds which, if
crossed, would lead to the automation of a step in a
process. Furthermore, the machine learning models need to
adapt to changing rules for some processes. For instance, a
tax code might change depending on the policies. Hence,
there is an increasing need for the models to learn and
re-evaluate the parameters and/or thresholds that they
consider both for a particular prediction and for automating
a process.
In this research, we aim to automate the complex process of
invoice processing and auto-post as many invoices as
possible with no or low manual intervention using AI-enabled
micro-services. This results in better efficiency and cost
savings, and in turn better client and vendor satisfaction
and engagement.
PRELIMINARY PREPRINT VERSION: DO NOT CITE
The AAAI Digital Library will contain the published
version some time after the conference.
Account Payable Process
In the current business-as-usual, the accounts payable
process is largely manual. Since no single platform is
available to enable touch-less invoice processing from
ingestion of invoices to payment, the majority of the steps
in the invoice processing pipeline are done by agents. When
a new invoice is received, the accounts payable agent or
accountant manually reads the invoice and validates whether
all required information/fields are available on it. Then,
the agent checks whether the invoice is Purchase Order (PO)
based or Non-Purchase Order (Non-PO) based. A PO-based
invoice is an invoice for goods where a purchase order was
placed before procurement, whereas a Non-PO based invoice is
one where the services were used and then the organization
was billed for them. If the invoice is PO-based, the agent
then validates whether the PO number mentioned in the
invoice is present in the PO data. Other such validations
include whether the invoice is billed to the receiving
company and whether the billing address matches. There can
be more validations around invoice amount, invoice dates,
etc. depending on the requirements. After these sanity
checks, the agent identifies the vendor by matching the
vendor name, address, and banking details from the invoice
to the vendor data. Then, the agent enters all the details
into the system manually. After that, for a PO invoice, the
agent matches the invoice line items to PO line items and
the goods receipt considering line item descriptions,
quantity, price, etc. In the case of a Non-PO invoice, the
agent assigns accounting codes to process the invoice in the
system, along with determining the tax code. The invoice is
then posted to ERP for payment. Also, many of these
documents and data may need to be retrieved from different
applications and data stores. Some of these checks have been
automated by Robotic Process Automation (RPA) depending on
the client and complexity of invoices.
Currently, it is difficult to have a system that can be eas-
ily used by different clients for varying invoice processing
setups. Along with this, the steps involved in invoice pro-
cessing are manually intensive, cumbersome, and pose chal-
lenges because of the following issues:
Catalog Mismatch : Buyers and Sellers maintain their
catalogs which results in different terminology for the
same item. This creates a problem while processing in-
voices since the line item descriptions in the invoice may
not match with the description in PO for the same item.
On the other hand, dissimilar items may be categorized
as the same items and wrong payments may be made.
Abbreviations : Shortened forms of names (e.g., Butter vs.
Bttr) introduced by either the buyer or the seller prevent
traditional automation techniques such as RPA from
performing line-item matching (fuzzy match).
Duplicate Invoices : Some invoices may be resubmitted
by vendors because of delay in payment or other issues.
This causes the same invoice to be in the system multiple
times which might result in the organization paying the
invoice amount multiple times.
Tax code and Regulations : The tax code for an item(s)
can change over time with changes in regulations. This
creates an issue as the previous data would then become
obsolete.
Generalizable Solution across clients : Different clients /
industries have different invoice compliance rules. Some
companies might also need only a subset of tasks due to
the nature of invoices. For example, companies that deal
with products follow only PO-based invoices where they
need not follow steps such as Accounting Codes predic-
tion. This makes it hard to find a generalizable approach
for different accounts.
Knowledge Retention : Processors and/or Associates are
heavily dependent on Subject Matter Experts (SME) or
clients for solving queries. Since there is no system to
capture this user information and feedback, it can result
in knowledge erosion on account of attrition.
To the best of our knowledge, there is no cost-effective
platform for enabling touch-less accounts payable that is
generalizable and easily configurable across different
setups for invoice processing.
Contribution
In this paper, we present a configurable end-to-end system
for automating the Accounts Payable process using AI with
the following key aspects:
1. We propose a novel way to orchestrate the invoice payment
process using a flow authoring tool with a drag-and-drop UI,
so that flows can be easily configured to meet the different
design requirements of various clients.
2. An AI-enabled semantic similarity module for matching
invoice line items to PO and goods receipt, enabling
three-way line-item matching, and a hybrid classifier
combining an information retrieval (IR) and a rule-based
solution to predict the tax and accounting ledger
information.
3. Automatic tuning of confidence thresholds based on
agents’ feedback at different confidence levels. The
feedback mechanism can be incorporated with any of the
individual AI-enabled modules; it helps to incorporate
inter-agent agreement and to fine-tune the model in
accordance with the received feedback.
Our system has been successfully deployed for a
multinational electronics distributor in 3 of its markets
for more than a year and has already processed more than 70k
invoices. It was recently also deployed for a multinational
retail organization and has processed more than 10k
invoices, which highlights the usability of the proposed
system.
System Architecture
Figure 1 describes the different components that are
available and a possible flow that can be orchestrated in
the system. We discuss each component in detail in this
section.
Figure 1: System architecture. We have a multi-stage architecture for our platform. The order of the different modules within
the processing system (Risk Management, Business Rules Check, etc.) can be customized according to business requirements.
Figure 2: Orchestrator flow : One of the workflows used in a deployed system
Orchestrator
The Orchestrator is mainly used for the deployment and
management of the various services in our system. It
provides a way to select or deselect a particular service
and also provides options to set default parameters for a
particular service. Users can manage the default thresholds
for all the AI components and can also define business rules
such as the allowed deviation in price. These settings can
be changed at any time, enabling clients to adjust the
thresholds as the system processes more and more invoices
over time.
We used Node-RED¹, an open-source browser-based flow editor,
for this purpose. Our implementation contains different
Node-RED nodes pertaining to each service. The nodes contain
default values for the services which can be configured at
the time of deployment. Figure 2 shows an actual flow
orchestrated using Node-RED for a client. This design also
helps to integrate any new module as required with minimal
effort.
¹ https://nodered.org/
Risk Management
Duplicate invoice payments are a common phenomenon in
accounts payable. On average, companies make anywhere
between 0.1 and 0.5 percent duplicate payments (Wagner
2018). For a large company with an annual turnover of $500
million, this translates to a whopping $500,000 to $2.5
million. With limited manpower and resources, it is very
difficult to validate and verify all the invoices.
Even though some companies are Sarbanes-Oxley Act (SOX)²
compliant and use a leading ERP system for invoice
processing, they still make duplicate payments. ERP systems
can detect exact duplicates but they cannot detect a
tinkered duplicate. Most organizations validate invoices
based on the invoice number, invoice date, and amount. It is
easy to bypass these built-in controls, either accidentally
or intentionally. Following are some of the scenarios where
one might end up making a duplicate payment:
Typographical errors in the invoice number.
A vendor might send another invoice with a different date
or a perturbed invoice amount, no matter how small.
Duplicate invoices may be submitted via different channels
and locations.
Utility payments (recurring payments) may arrive as
a statement and not as an invoice. General AP practice
is to enter them as account#+month+year, e.g.,
12345Jan2021. This increases the likelihood of potential
duplicates if the AP accountant enters it as 12345Jan2021
vs 12345January2021.
² https://www.dnsstuff.com/what-is-sox-compliance
It is extremely resource intensive to manually search for
potentially duplicate invoices. We propose an automated risk
detection solution to monitor transactions continually for
duplicate and fraudulent invoices. Our system looks for du-
plicate invoices from the historical invoices on four param-
eters namely - invoice number, vendor name, amount, and
invoice date. Following is the implementation of the system
for identifying the individual parameter score:
Invoice Number: Identify invoices with invoice numbers that
closely match. Since an invoice number is an alphanumeric
string, our fuzzy match between two invoice numbers treats
the character sequence and the numeric field separately. For
instance, INVNO12345678 and INV12345678 are an example of a
duplicate invoice number, as INVNO and INV could be
abbreviations for invoice number and invoice respectively.
Fuzzy matching is more powerful for detecting duplicates
than exact matching alone. Similarity analysis is applied to
invoice numbers to identify typos and transpositions. We
used string similarity based on the Levenshtein distance
metric (Yujian and Bo 2007), implemented in Python³.
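The invoice-number comparison can be sketched as follows. This is a minimal illustration rather than the deployed implementation: the edit distance is re-implemented with the standard library instead of calling the python-Levenshtein package, and the function names and the `digit_weight` parameter are hypothetical choices for combining the two parts.

```python
import re
from typing import Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalize the edit distance to a 0..1 similarity (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def split_invoice_number(value: str) -> Tuple[str, str]:
    """Separate the letters from the digits of an alphanumeric invoice number."""
    cleaned = value.upper()
    letters = "".join(re.findall(r"[A-Z]", cleaned))
    digits = "".join(re.findall(r"[0-9]", cleaned))
    return letters, digits


def invoice_number_score(a: str, b: str, digit_weight: float = 0.8) -> float:
    """Weighted similarity; digit_weight is an assumed tuning parameter."""
    letters_a, digits_a = split_invoice_number(a)
    letters_b, digits_b = split_invoice_number(b)
    return (digit_weight * similarity(digits_a, digits_b)
            + (1.0 - digit_weight) * similarity(letters_a, letters_b))
```

With these assumed weights, INVNO12345678 and INV12345678 score close to 1 because the numeric parts are identical and only the prefix differs.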
Vendor Name: The next validation is to detect duplicate
vendors based on names, addresses, tax IDs, bank accounts,
and other attributes. We indexed the vendor names along with
their attributes and used a fuzzy search technique to
identify a closely related vendor. For the fuzzy search, we
used FuzzyWuzzy (SeatGeek 2011), a Python-based module, to
search for a given string.
Invoice Amount: For pre-processing, we remove all alphabetic
and special characters. Our algorithm gives a high score if
the amount matches completely or if the deviation percentage
lies below a certain threshold. We also index the amount as
a string for a fuzzy match with a tight edit distance, for
cases where typos could occur, e.g., 1908 vs 1980.
Invoice Date: After normalizing the string, we index the
invoice date both as a string and as a date object. The risk
score combines the fuzzy match between the strings and the
exact match between the dates.
We compute individual scores for these parameters and
combine them to arrive at an overall score. We flag all
invoices whose overall score exceeds a certain threshold as
suspicious and route them to a practitioner for further
validation.
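A minimal sketch of how the per-parameter scores might be combined is shown below. The text does not specify the combination rule, so the weighted sum, the weights, the review threshold, and the 1% amount tolerance are all assumptions for illustration; the invoice-number and vendor scores are taken as inputs, and the fuzzy string comparison of dates is omitted.

```python
from datetime import date
from typing import Dict


def amount_score(a: float, b: float, max_deviation_pct: float = 1.0) -> float:
    """1.0 when the amounts are equal or within the allowed deviation, else 0.0."""
    if a == b:
        return 1.0
    largest = max(abs(a), abs(b))
    deviation_pct = abs(a - b) / largest * 100.0
    return 1.0 if deviation_pct <= max_deviation_pct else 0.0


def date_score(a: date, b: date) -> float:
    """Exact match on parsed dates only (fuzzy string match left out here)."""
    return 1.0 if a == b else 0.0


# Hypothetical weights; they sum to 1 so the overall score stays in [0, 1].
WEIGHTS = {"number": 0.4, "vendor": 0.3, "amount": 0.2, "date": 0.1}


def overall_risk_score(scores: Dict[str, float]) -> float:
    """Weighted sum of the four per-parameter similarity scores."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def needs_review(scores: Dict[str, float], threshold: float = 0.85) -> bool:
    """Route to a practitioner when the overall score exceeds the threshold."""
    return overall_risk_score(scores) > threshold
```

In practice the weights and the threshold would be set per client, in the same way other thresholds are configured in the orchestrator.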
Better visibility into duplicate invoices and payments can
help companies detect cash leakage, fraud, and misuse so
that they can recover funds. Furthermore, the right tool will
enable them to prevent duplicate payments in the first place,
while combining and analyzing data from multiple payment
platforms and providing workbench tools to build a strong
audit trail. Further validations to determine the authenticity
of invoices are done in the next step.
³ https://github.com/ztane/python-Levenshtein
Business Rule Check: Despite some suppliers’ adoption
of digital/electronic invoices, many Accounts Payable (AP)
processing organizations are impeded by paper-driven in-
voice processing. For an invoice to be processed accurately,
it is required to extract the information from the invoice
precisely. Accuracy of current Optical Character Readers
(OCR) is good in the case of digital files whereas it is moder-
ate to low in the case of scanned/manually written invoices,
because of the quality of scan and the document.
PO Number Validation: This step is more relevant for
PO-based invoices where the need is to check the authentic-
ity of the PO number. This is done by referencing the PO
number from the header or sub-header of an Invoice. In our
system, this validation is done by referring the PO number
back to the ERP’s PO table or PO system to determine its
validity. In the event of a match between the invoice and ERP
system, invoice processing continues. However, if there is a
mismatch, it is generally up to the AP staff (along with Pro-
curement) to determine if the invoice submitted reflects the
transaction details of a different PO or if it is entirely errant.
Tax, Freight, & Total Amount Validation: These fields
are the most error-prone fields in a given invoice, so the sys-
tem includes various validations to filter out counterfeit in-
voices. Following are some important rules:
Filter out invoices with total amount zero.
Compare the total amount of invoice against the sum of
all the line item amounts.
Review freight charges and tax charges. The sum of the
tax amount, freight amount, and total amount should be
equal to the invoice amount.
Business rules also include any other checks mandated by
the client.
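The three amount rules listed above can be sketched as simple checks. The field names, the rounding tolerance, and the reading of "total amount" as the line-item subtotal and "invoice amount" as the final payable amount are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Invoice:
    line_amounts: List[float]
    total_amount: float     # assumed: subtotal of the line items
    tax_amount: float
    freight_amount: float
    invoice_amount: float   # assumed: final payable amount


def check_amount_rules(inv: Invoice, tolerance: float = 0.01) -> List[str]:
    """Return the list of violated rules; an empty list means all checks pass."""
    failures = []
    # Rule 1: filter out invoices whose total amount is zero.
    if abs(inv.total_amount) <= tolerance:
        failures.append("total amount is zero")
    # Rule 2: the total amount must equal the sum of the line-item amounts.
    if abs(sum(inv.line_amounts) - inv.total_amount) > tolerance:
        failures.append("line items do not sum to total amount")
    # Rule 3: tax + freight + total must equal the invoice amount.
    expected = inv.total_amount + inv.tax_amount + inv.freight_amount
    if abs(expected - inv.invoice_amount) > tolerance:
        failures.append("tax, freight, and total do not add up to invoice amount")
    return failures
```

Client-specific checks could be appended to the same list so that an agent sees every failed rule at once.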
PO Invoice VS. Non-PO Invoice Categorization
This is an essential step for vendors dealing with both PO-
based and Non-PO-based invoices. This step determines
how the invoice needs to be processed. PO-based and Non-
PO based invoices go through different steps before pay-
ment. For PO-based invoices, steps might include 2/3 way
matching, and for Non-PO based invoices, steps may include
Accounting codes classification & Approval routing. The in-
clusion of a PO Number on the Invoice is a typical way to
distinguish between PO Vs. Non-PO invoices, though this is
not always the case. Many clients follow a “No PO, No Pay”
policy where only an approved list of vendors or invoices
below a fixed threshold are allowed under the Non-PO
category. Hence, in some cases, determination may need to be
made on other criteria like vendor details, shipping details,
etc. to correctly assign it to the right process flow.
Vendor Matching
Most organizations require a vendor validation step in their
AP process to ensure that the vendor submitting the invoice
has an Approved Vendor status. This is to thwart fraud at-
tempts and eliminate non-authorized spending. This step is
done by a manual look-up by the AP staff into the Vendor
Master File.
Figure 3: Sample purchase order and its corresponding invoice
The steps to validate a vendor differ for PO vs Non-PO based
invoices. In the proposed system, for PO-based invoices,
vendor matching is fairly simple. As the vendor ID is
present in the PO database, vendor details can be retrieved
by a lookup in the vendor master record, and attributes such
as the bank account number and VAT ID of the vendor on the
invoice are compared with the master record. However, for
Non-PO based invoices, vendor identification is fairly
complicated as the vendor information extracted from the
invoice may contain some peculiarities in the text. This
makes it difficult to directly search for the vendor name in
the vendor master record. Most of the current ERP systems
don’t support fuzzy matching for text. So, we implemented a
synthesized search by indexing vendor details. In our system
we consider three fields: vendor details, vendor bank
account number, and the vendor’s Value Added Tax
identification number (VAT ID). Vendor details is a
composite field consisting of vendor name, city, and
address. The system performs a fuzzy match on the vendor
details and an exact match on bank account number and VAT
ID. For fuzzy matching we used a custom script utilizing
FuzzyWuzzy (SeatGeek 2011), which uses the Levenshtein
distance (also known as edit distance) to compare two
strings.
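The combination of exact and fuzzy matching for Non-PO vendors can be sketched as below. This is an illustration under assumptions: `difflib.SequenceMatcher` from the standard library stands in for FuzzyWuzzy, the record layout and the `min_ratio` cut-off are hypothetical, and treating the exact fields as hard filters before fuzzy ranking is one possible way to combine them.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Tuple


def fuzzy_ratio(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; a stand-in for a FuzzyWuzzy ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_vendor(invoice_vendor: Dict[str, str],
                 vendor_master: List[Dict[str, str]],
                 min_ratio: float = 0.8) -> Optional[Tuple[str, float]]:
    """Return (vendor_id, ratio) of the best candidate, or None if none qualifies."""
    best = None
    for record in vendor_master:
        # Exact match on bank account number and VAT ID.
        if record["bank_account"] != invoice_vendor["bank_account"]:
            continue
        if record["vat_id"] != invoice_vendor["vat_id"]:
            continue
        # Fuzzy match on the composite "details" field (name, city, address).
        ratio = fuzzy_ratio(invoice_vendor["details"], record["details"])
        if ratio >= min_ratio and (best is None or ratio > best[1]):
            best = (record["vendor_id"], ratio)
    return best
```

A `None` result would correspond to routing the invoice to an agent for manual vendor identification.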
Line Item Matching
Line-item matching is one of the most onerous components of
processing a PO invoice. It involves verifying that the
items being billed were ordered (two-way match) and,
potentially, that they were received (three-way match). In
two-way matching, the line items on the invoice are
validated against those of the PO, whereas in three-way
matching the line items on the invoice are matched against
both the purchase order and the goods receipt. This includes
using different parameters such as description, quantity,
price, material/part number, etc. Each vendor has its own
set of weights for these parameters. For some vendors, the
description may be of the highest importance whereas others
might match line items based on material/part number. Some
other vendors may give precedence to quantity, price, etc.
Another general practice followed by vendors includes having
a threshold for price. This is to accommodate various cases
such as tax value discrepancies, handling charges, or price
changes. Another challenge here is that data needs to be
referenced from multiple sources: invoices, procurement
records (POs), and receiving documents (Bills of Lading,
Packing Slips, etc.). For some organizations, this is tiring
due to lengthy supplier invoices involving pages of
line-item detail.
So the line item matching algorithm should consider all
these parameters while matching between different line
items. Another limitation in the development of this
algorithm is the lack of training data, which limits us to
unsupervised techniques. The key component in our algorithm
is semantic similarity, in addition to numeric similarity.
The algorithm is divided into two phases as described below.
We applied lexical normalization as given in (Han and
Baldwin 2011) to the invoice line item string. Lexical
normalization is the process of detecting ‘ill-formed Out of
Vocabulary (OOV)’ words and mapping them to their standard
forms. Examples include but are not limited to misspellings,
abbreviations, and common names. For example, people often
refer to ‘butter’ as ‘bttr’. Without such lexical
normalization, existing corpus and knowledge-based
approaches will not be able to detect proper nouns. Our
system filters out words that are not present in a standard
dictionary. We then calculate the fuzzy similarity score
between the invoice line-item string and the line-items from
the PO. To calculate the fuzzy match, each string is
represented as a TF-IDF vector of bi-grams and tri-grams,
and then the cosine similarity between each pair of vectors
is calculated.
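The TF-IDF and cosine similarity step can be sketched with the standard library alone. The text does not say whether the bi-grams and tri-grams are over characters or words; this sketch assumes character n-grams, uses a smoothed IDF of log((1 + n) / (1 + df)) + 1, and uses hypothetical function names.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def char_ngrams(text: str, sizes: Tuple[int, ...] = (2, 3)) -> List[str]:
    """Character bi-grams and tri-grams of the lower-cased, space-normalized text."""
    normalized = " ".join(text.lower().split())
    return [normalized[i:i + n]
            for n in sizes
            for i in range(len(normalized) - n + 1)]


def tfidf_vectors(documents: List[str]) -> List[Dict[str, float]]:
    """One sparse TF-IDF vector per document, fitted on the given documents."""
    counts = [Counter(char_ngrams(doc)) for doc in documents]
    doc_freq = Counter(gram for c in counts for gram in c)
    total = len(documents)
    return [{gram: tf * (math.log((1 + total) / (1 + doc_freq[gram])) + 1.0)
             for gram, tf in c.items()}
            for c in counts]


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(gram, 0.0) for gram, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_po_lines(invoice_line: str, po_lines: List[str]) -> List[Tuple[float, str]]:
    """Score every PO line against the invoice line, best match first."""
    vectors = tfidf_vectors([invoice_line] + po_lines)
    query = vectors[0]
    scored = [(cosine(query, vec), line) for vec, line in zip(vectors[1:], po_lines)]
    return sorted(scored, reverse=True)
```

Character n-grams are a convenient choice here because an abbreviation such as ‘Bttr’ still shares several n-grams with ‘Butter’, whereas word-level n-grams would treat the two as unrelated tokens.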
Using fuzzy matching to compute a similarity score for the
strings ‘Glycerine white distilled 12%’ and ‘Vinegar white
distilled 12%’ returns a high score (>0.7) because of the
high number of matching bi-grams. Hence, we need to remove
such string pairs from further comparison. We use the Python
spaCy⁴ API for noun phrase chunking. If the noun phrase(s)
in the query string and in a string from the pool do not
match, that string is not kept in the pool for comparison.
Figure 4: Line Item Matching Example with confidence score
After identifying the pool of products that match with
the current line-item, the next step is to identify the overall
score by considering the price, quantity, and other parameter
matches. This is done by comparing the price and quantity of
invoices with that of the purchase order. There is an allowed
deviation threshold which is defined by the accounting ex-
pert. If the values are within the range then those products
are given a higher score. An example is shown in Figure 4.
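The second phase can be sketched as a score that rewards price and quantity values falling within the allowed deviation. The weighted-sum form, the weights, and the default deviation percentages are assumptions for illustration; the text only states that in-range values receive a higher score.

```python
from typing import Tuple


def within_deviation(invoice_value: float, po_value: float, allowed_pct: float) -> bool:
    """True when the invoice value is within allowed_pct percent of the PO value."""
    if po_value == 0:
        return invoice_value == 0
    return abs(invoice_value - po_value) / abs(po_value) * 100.0 <= allowed_pct


def line_item_score(description_score: float,
                    invoice_price: float, po_price: float,
                    invoice_qty: float, po_qty: float,
                    price_pct: float = 2.0, qty_pct: float = 0.0,
                    weights: Tuple[float, float, float] = (0.6, 0.2, 0.2)) -> float:
    """Combine description similarity with price and quantity agreement."""
    w_desc, w_price, w_qty = weights
    price_ok = within_deviation(invoice_price, po_price, price_pct)
    qty_ok = within_deviation(invoice_qty, po_qty, qty_pct)
    return (w_desc * description_score
            + w_price * float(price_ok)
            + w_qty * float(qty_ok))
```

The deviation percentages play the role of the threshold defined by the accounting expert, and the weights could differ per vendor as described earlier.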
Accounting Codes Prediction
Accounting codes like general ledger account (GL account),
profit center, and cost center are essential for every
purchase of service to accurately account for the expenses
in the right category. For a PO-based invoice, the
accounting codes and approvals happen at the PO creation
stage. Hence, the accounts payable (AP) team is responsible
for matching the invoice against the purchase order and/or
goods receipt. However, for a Non-PO invoice, the AP team
has to update the accounting codes along with obtaining
review/approvals from the business. The accounting details
are updated based on the type of expense mentioned on the
invoice. Since this is a judgment-based process, the team
often makes mistakes and is highly dependent on business
teams for input. Hence, automating this process reduces the
dependency on the business and helps populate correct
accounting codes. Accounting code identification depends on
the line-item description along with shipping details. It is
a challenge because the description mentioned on the invoice
may not be the same as the description of the item ordered.
Especially for a Non-PO based invoice, the line-item
description may be very generic. For example, “Broadband
bill” might mean “Phone bill”, “internet bill”, or any other
telecom-related service. Hence, we try to identify invoices
with similar descriptions and similar shipping details from
the recent past.
For both accounting codes prediction and Coder/Reviewer
prediction, the approach is similar to the tax code
identification described in our previous work (Tater et al.
2021). We experimented with various algorithms including
random forest, logistic regression, SVM, etc., for these
predictions but found that a hybrid classifier, which is a
combination of a semantic similarity engine and a rule-based
system, performed better in terms of accuracy along with
high precision and recall. The Semantic Similarity Engine
extracts similar item descriptions from the historical data.
This is currently done using an Information Retrieval system
because of the nature of line-item descriptions but can be
replaced by any other semantic similarity engine. These
candidate matching descriptions are then searched in
historical data to gather all items with these descriptions.
Exactly matching descriptions are given a higher score than
similar matches. This is followed by the Rule Based Engine
that filters the retrieved similar line-items based on
shipping addresses (ship to and ship from) and company code
details. The results are then sorted based on time to give
more importance to the most recent similar line items.
⁴ https://spacy.io/
The accounting code is then predicted by majority voting of
the top-N results, where N is a configurable parameter based
on subject matter expert (SME) knowledge and
experimentation. The final confidence score ($C_p$) is
calculated by:
\[ C_p = D_s W_d + M_s W_m \]
where $D_s$ denotes the score for description similarity,
with $0 < D_s \le 1$ for an exactly matching description; by
design, $D_s$ for fuzzy matches is a lower value than for
exactly matching descriptions in a particular configuration.
Here, $W_d$ is the configurable weight depicting the
importance of line-item descriptions, and $W_m$ is the
configurable weight given to the majority score.
The majority voting score ($M_s$) is calculated as:
\[ M_s = \frac{N_m}{N} \, \frac{N_m}{N_T} \]
$N$ refers to the number of shortlisted data samples which
are being considered for classification. $N_m$ describes the
number of items with the majority accounting code out of the
shortlisted candidate samples. $N_T$ describes the number of
distinct accounting codes in the $N$ data samples.
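The voting step can be sketched as follows: candidates are sorted by recency, the top-N are kept, and the majority code is returned together with the counts N, N_m, and N_T defined above. The data layout and names are hypothetical, and the majority score M_s is passed into the confidence function as an input rather than recomputed here.

```python
from collections import Counter
from datetime import date
from typing import List, NamedTuple, Tuple


class VoteResult(NamedTuple):
    code: str   # majority accounting code
    n: int      # N: number of shortlisted samples considered
    n_m: int    # N_m: samples carrying the majority code
    n_t: int    # N_T: distinct codes among the N samples


def majority_vote(candidates: List[Tuple[date, str]], top_n: int) -> VoteResult:
    """Keep the top_n most recent (posting_date, code) pairs and vote on the code."""
    if not candidates:
        raise ValueError("no candidate line items to vote on")
    recent = sorted(candidates, key=lambda item: item[0], reverse=True)[:top_n]
    counts = Counter(code for _, code in recent)
    code, n_m = counts.most_common(1)[0]
    return VoteResult(code=code, n=len(recent), n_m=n_m, n_t=len(counts))


def confidence(d_s: float, w_d: float, m_s: float, w_m: float) -> float:
    """C_p = D_s * W_d + M_s * W_m, with the majority score m_s supplied by the caller."""
    return d_s * w_d + m_s * w_m
```

Sorting before truncating to N is what lets recent postings outweigh older ones when historical codes disagree.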
Coder/Reviewer Prediction
For a PO-based invoice, we have the goods receipt step
where the receiver who receives the goods or services con-
firms receipt of the goods or services. However, for a Non-
PO invoice, this is not possible. So, it becomes essential to
send the invoice to the person/department for confirmation
of receipt of goods or services. More often than not, the
name/email ID of the person or department, or the department
code or cost center, is provided on the invoice to identify
who placed the order. Based on such information, the AP team
needs to identify whom to send the invoice to for review and
approval. However, some invoices don’t have this
information, and even when they do, the information provided
by vendors may not be consistent and accurate. The team may
have to refer to multiple records or databases for
identification. We tackle this problem with multiple
approaches: (i) If identifying information is available, the
system uses fuzzy search to find the person/department most
similar to the information provided. (ii) If the identifying
information is missing, we use an approach similar to
Accounting Codes prediction, using an IR-based system on the
line-item description to identify people/departments who
have received the goods/services in the recent past.
Tax Code Prediction
This step involves assigning the correct tax code to each
item in the invoice, which can depend on one or
more attributes, including (i) the item description, (ii) where the
item was shipped from, (iii) where the item is shipped to,
(iv) vendor details, and (v) time of purchase. For tax code
identification, a semantic similarity engine searches historical
data for line-item descriptions similar to the one on the invoice.
These candidate descriptions are then looked up in
historical data to gather all line-items that carry them.
The line-items are then filtered on matching shipping
addresses and company codes, and matched on the vendor if
present. The results are then sorted by time to handle
the case where the tax code for the given details has changed
over time, thus tackling any concept drift. The tax
code is then predicted by majority voting over the top-N
results. The approach is very similar to the tax code identification
described in our previous work (Tater et al. 2021). Similar
to accounting code prediction, the predicted confidence ($C_p$)
here is given by:
$$C_p = D_s W_d + M_s W_m + V_s W_v + T_m W_t$$
where the weights of the different components sum to 1, i.e.,
$W_d + W_m + W_v + W_t = 1$. Here, $W_t$ is the weight (importance)
assigned to the tax percentage listed on the invoice,
if present. Note that the same tax percentage can
have multiple tax codes associated with it. $T_m$ is 1 if the
predicted tax code is one of the tax codes associated with the
tax percentage and -1 if not. $V_s$ denotes the similarity score
for vendor details, and $W_v$ is the configurable weight given
to vendor details.
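The filtering, voting, and confidence steps described above can be sketched as follows. This is an illustrative reading, not the deployed implementation: the `HistoricalLine` schema and the sample data are hypothetical, the sort is assumed to be most recent first, vendor matching is omitted for brevity, and `d_s` and `m_s` are treated as opaque scores in [0, 1] whose definitions come from the accounting code section.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class HistoricalLine:
    """A historical line-item whose description matched the query (hypothetical schema)."""
    ship_from: str
    ship_to: str
    company_code: str
    posted_on: date
    tax_code: str


def predict_tax_code(candidates: List[HistoricalLine], ship_from: str, ship_to: str,
                     company_code: str, top_n: int = 5) -> Optional[str]:
    """Filter on addresses and company code, sort by recency, majority-vote the top N."""
    kept = [c for c in candidates
            if c.ship_from == ship_from
            and c.ship_to == ship_to
            and c.company_code == company_code]
    if not kept:
        return None
    # Assumption: most recent first, so newer tax codes dominate the vote.
    kept.sort(key=lambda c: c.posted_on, reverse=True)
    votes = Counter(c.tax_code for c in kept[:top_n])
    return votes.most_common(1)[0][0]


def confidence(d_s: float, m_s: float, v_s: float, t_m: int,
               w_d: float, w_m: float, w_v: float, w_t: float) -> float:
    """Compute C_p = D_s*W_d + M_s*W_m + V_s*W_v + T_m*W_t with weights summing to 1."""
    if abs((w_d + w_m + w_v + w_t) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    if t_m not in (1, -1):
        raise ValueError("t_m must be 1 or -1")
    return d_s * w_d + m_s * w_m + v_s * w_v + t_m * w_t


# Hypothetical history: the tax code for this route changed from V1 to V2 in 2021.
HISTORY = [
    HistoricalLine("DE", "FR", "C100", date(2018, 6, 1), "V1"),
    HistoricalLine("DE", "FR", "C100", date(2019, 6, 1), "V1"),
    HistoricalLine("DE", "FR", "C100", date(2020, 1, 5), "V1"),
    HistoricalLine("DE", "FR", "C100", date(2021, 3, 1), "V2"),
    HistoricalLine("DE", "FR", "C100", date(2021, 6, 1), "V2"),
    HistoricalLine("US", "FR", "C100", date(2021, 7, 1), "V9"),
]

recent_vote = predict_tax_code(HISTORY, "DE", "FR", "C100", top_n=3)
wide_vote = predict_tax_code(HISTORY, "DE", "FR", "C100", top_n=5)
```

With a small `top_n` the vote follows the newer code, while a larger window lets the older code win, which shows why the choice of N interacts with concept drift.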
Learning From Feedback
The AI modules described above, namely line-item matching,
coder prediction, accounting codes prediction, and
tax code prediction, also involve two confidence thresholds
that decide the flow of each item: (i) minimum
confidence ($C_{min}$) and (ii) maximum confidence ($C_{max}$). If
the predicted confidence ($C_p$) exceeds $C_{max}$, the prediction
is considered correct and the step is passed without any manual
intervention. However, if the predicted confidence is between
$C_{min}$ and $C_{max}$, that is, $C_{min} < C_p < C_{max}$, the line-item
for that module is listed for feedback/review by a human
agent. The agent can then either give an upvote, which
indicates that the prediction is correct, or give a downvote
and correct the prediction. This feedback is then used
to improve the underlying model. If the predicted confidence
is lower than the minimum threshold, i.e.,
$C_p < C_{min}$, the complete invoice is returned to the workflow
for manual processing.
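The three-way routing rule above can be written as a small function. This is a sketch with hypothetical names; the text does not say what happens when the confidence equals a threshold exactly, so the code assumes the more conservative outcome in both boundary cases.

```python
from enum import Enum


class Route(Enum):
    AUTO_POST = "auto_post"        # passed without manual intervention
    AGENT_REVIEW = "agent_review"  # listed for upvote/downvote by an agent
    MANUAL = "manual"              # invoice returned for manual processing


def route(c_p: float, c_min: float, c_max: float) -> Route:
    """Decide the flow of one prediction from its confidence and the two thresholds."""
    if not c_min < c_max:
        raise ValueError("c_min must be strictly below c_max")
    if c_p > c_max:
        return Route.AUTO_POST
    # Assumption: c_p == c_max still goes to review, c_p == c_min goes to manual.
    if c_p > c_min:
        return Route.AGENT_REVIEW
    return Route.MANUAL
```

For example, with thresholds 0.3 and 0.9, a confidence of 0.95 auto-posts, 0.5 goes to an agent, and 0.1 falls back to manual processing.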
When the system is deployed for a new client, the confidence
thresholds ($C_{max}$ and $C_{min}$) are set conservatively:
$C_{max}$ is set to a very high value and $C_{min}$ to a
very low value. This is termed the hyper-care period, during
which the agents validate all the predictions for a few weeks.
Afterwards, based on an analysis of the agents' feedback, these
thresholds are adjusted to maintain a high accuracy rate while
reducing the manual effort. Lowering $C_{max}$ results in
more invoices being auto-posted.
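The paper does not specify how the thresholds are recalculated after hyper-care. One possible procedure, shown here purely as an assumption, is to scan the agents' feedback from the highest confidence downwards and pick the lowest observed confidence at or above which the upvote rate still meets a target accuracy. The function name, the feedback format, and the target value are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def lowest_safe_cmax(feedback: Iterable[Tuple[float, bool]],
                     target_accuracy: float = 0.99) -> Optional[float]:
    """Return the lowest observed confidence t such that predictions with
    confidence >= t were upvoted at a rate of at least target_accuracy.

    feedback holds (predicted_confidence, agent_upvoted) pairs collected
    during hyper-care. Returns None if no cut-off meets the target.
    """
    ordered = sorted(feedback, key=lambda item: item[0], reverse=True)
    best = None
    correct = 0
    for seen, (conf, upvoted) in enumerate(ordered, start=1):
        correct += int(upvoted)
        # Only evaluate a cut-off once all predictions tied at this confidence are counted.
        last_of_tie = seen == len(ordered) or ordered[seen][0] != conf
        if last_of_tie and correct / seen >= target_accuracy:
            best = conf
    return best


# Hypothetical hyper-care feedback: one downvote at confidence 0.80.
FEEDBACK = [(0.95, True), (0.90, True), (0.85, True), (0.80, False), (0.70, True)]

strict_cut = lowest_safe_cmax(FEEDBACK, target_accuracy=1.0)
relaxed_cut = lowest_safe_cmax(FEEDBACK, target_accuracy=0.8)
```

A stricter accuracy target yields a higher cut-off and therefore fewer auto-posted invoices, which matches the trade-off described above.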
For the line-item matching algorithm, we have two
hyper-parameters that update the overall confidence score
based on user feedback. The hyper-parameters are tuned so
that the learning rate is very low for positive feedback
and very high for a negative vote.
These hyper-parameters ensure that our algorithm correctly
identifies matches once it has received enough positive/negative
feedback. Similarly, for tax code or accounting codes identification,
each upvote (agreement) or downvote (disagreement) for a prediction
changes the confidence of the predicted class. Each vote is also
treated as a new data point and added to the learning
module. This way, the model also takes into account the
inter-agent agreement on the feedback. In essence, the feedback
mechanism gives the system a self-learning
capability, combined with knowledge retention, and lets it tackle
any data drift or concept drift.
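The exact update rule is not given in the text. As a minimal sketch of the asymmetry it describes, the following assumes a simple rule in which an upvote moves the confidence a small step towards 1 and a downvote moves it a large step towards 0. The two learning rates stand in for the two hyper-parameters, and their values are hypothetical.

```python
def update_confidence(confidence: float, upvote: bool,
                      lr_positive: float = 0.05, lr_negative: float = 0.5) -> float:
    """Return the confidence after one agent vote.

    Assumed rule: slow movement towards 1 on an upvote, fast movement
    towards 0 on a downvote. Both rates are illustrative values.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if upvote:
        return confidence + lr_positive * (1.0 - confidence)
    return confidence - lr_negative * confidence


# One downvote halves the confidence; a single upvote recovers only a little.
after_downvote = update_confidence(0.8, upvote=False)
after_upvote = update_confidence(after_downvote, upvote=True)
```

Under this rule a match needs many agreeing votes to regain the confidence lost to one disagreement, which keeps wrong matches from auto-posting.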
Business Impact
The proposed system is deployed for two accounts, a major
multinational electronics distributor and a multinational retail
organization, to process their PO-based invoices with a
combined annual volume of 800k invoices. Currently, the
system is deployed for these clients in Europe and North
America, and the plan is to roll it out to more markets in
Europe, Latin America, and Asia. At present, for the electronics
distributor client, it has processed 70k invoices,
delivering an efficiency of 76% (transactions that required
either no or minimal human intervention). The number
of invoices requiring complete manual processing is only
7555 (<11%), which includes 6631 duplicate invoices. This
means 62154 (>88%) invoices were processed through our
system, as detailed in Table 1. For the other client, the system
has processed 10k invoices, but it is still in hyper-care mode,
where each invoice is also reviewed by an agent. Building on
these successful deployments, the system is being expanded
to 37 more markets globally for these 2 clients.

| Total Invoices Inflow | Ready for Posting | Waiting for Feedback | Potential Duplicates | Confirmed Duplicates | Require Manual Intervention |
|---|---|---|---|---|---|
| 70442 | 62154 | 569 | 164 | 6631 | 7555 |

Table 1: Current dashboard statistics for invoice inflow, posting, manual intervention, and duplicates

For the markets where the system is already live, the
expected annual volume is 88k invoices, where each invoice can
have multiple items that need to be validated, matched, and
processed, and to which a tax code needs to be applied. On average,
it takes 10 minutes for a human agent to manually process
one invoice, from indexing to posting it to the ERP. Even with
a very conservative estimate of 3 minutes saved per invoice,
this has already amounted to savings of 4000 work-hours,
considering that 80k invoices have already been processed. Once
the system goes live in the other markets where it is in the
deployment phase, taking into consideration the annual volume
of 800k invoices, the projected savings would be 40k work-hours
annually. This reduction in time is also expected to be
much larger once clients become more comfortable with, and
place more trust in, the AI skills.
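The savings figures follow from simple arithmetic, reproduced here as a short check. The function name is hypothetical; the inputs are the 3 minutes saved per invoice and the invoice volumes quoted in the text.

```python
def work_hours_saved(invoices: int, minutes_saved_per_invoice: float = 3) -> float:
    """Convert a per-invoice saving in minutes into total work-hours."""
    return invoices * minutes_saved_per_invoice / 60


processed_so_far = work_hours_saved(80_000)    # invoices already processed
projected_annual = work_hours_saved(800_000)   # combined annual volume
```

These evaluate to 4000 and 40000 work-hours respectively, matching the estimates quoted above.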
Another major portion of the time consumed by invoice processing
arises because it involves touchpoints with various
agents, and passing invoices between these agents involves
significant wait time. With this system, since the process becomes
automated and touchless, we have achieved a 90%
compression of the end-to-end cycle time from invoice
receipt to invoice posting. Our system also identifies
duplicate invoices. To date, the system has
detected more than $25 million worth of duplicate invoices.
Another advantage of our system is that the AI modules
are expected to learn and improve with agent feedback as
more and more invoices flow through. This would enable the
confidence thresholds to be adjusted accordingly, so that more
invoices auto-post without any manual intervention
because of higher-confidence predictions. Another major
challenge, as previously pointed out, is the generalizability,
reusability, and scalability of the system to other clients. The
drag-and-drop integration of individual modules, i.e., the modular
and configurable design, enables us to quickly deploy the
system for another client with minimal changes and effort.
It is also in early deployment for three more clients, now that
it has proven its edge over the manual methodology.
Deployment and Maintenance
We have a development team of around 30 people who manage
this product following an agile development methodology.
There are monthly releases consisting of new enhancements
and bug fixes, following our own DevOps pipeline to
build, test, and deploy the solutions to production. The system
is currently deployed as a single-tenant service with 3
separate Docker images, each handling different stages and
functionalities. The first Docker image caters to flow authoring
via Node-RED, which helps in orchestrating the invoice
processing pipeline written in NodeJS. An example
workflow is depicted in Figure 2. The second Docker image
caters to the front-end interactions for agents and admins.
It was developed using AngularJS and NodeJS. The third
Docker image contains all the micro-services for each step in the
invoice processing pipeline. It was developed in Python,
and the APIs were created using Flask. We also have
a separate Docker image for each micro-service, in case a client
wants to deploy only specific micro-services. This decoupling
helps to ship different parts of the system separately.
The machine learning models were trained using the open-source
scikit-learn⁵ library (Pedregosa et al. 2011). The IR
system works on top of the open-source PyLucene⁶ library. The
database used for storing historical and processing data is
MongoDB⁷. There is also a governance dashboard that
keeps track of the status of invoices, similar to Table 1.
The maintenance of the system includes adding new
classes (new tax codes, GL codes, coder names, etc.) when
necessary, adjusting confidence thresholds as the system improves
with feedback, and building new rules when
business requirements change. Maintenance also entails
incorporating corrections for any predictions flagged in
audits. With respect to deployment for a new client, very few
changes are required because of the modular and configurable
design: (i) The individual models need to be trained
on client-specific data. (ii) The modules need to be arranged
through a drag-and-drop configuration as per the client's
requirements for their invoice processing pipeline. (iii) The
weights and confidence thresholds (25 parameters) of the
different microservices need a one-time configuration.
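As a sketch of what item (iii) might look like in practice, the following shows a per-module configuration object that validates its weights and thresholds when it is created. The class, field names, and values are hypothetical and are not taken from the deployed system.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ModuleConfig:
    """Weights and confidence thresholds for one micro-service (hypothetical layout)."""
    name: str
    weights: Dict[str, float]  # component name -> weight, expected to sum to 1
    c_min: float
    c_max: float

    def __post_init__(self) -> None:
        if abs(sum(self.weights.values()) - 1.0) > 1e-9:
            raise ValueError(f"{self.name}: weights must sum to 1")
        if not self.c_min < self.c_max:
            raise ValueError(f"{self.name}: c_min must be below c_max")


# Illustrative hyper-care settings: a very high c_max and a very low c_min.
TAX_CODE_CONFIG = ModuleConfig(
    name="tax_code_prediction",
    weights={"W_d": 0.4, "W_m": 0.3, "W_v": 0.2, "W_t": 0.1},
    c_min=0.05,
    c_max=0.99,
)
```

Validating at construction time means a misconfigured client deployment fails at start-up rather than producing skewed confidence scores.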
Related Work
Hedberg (2020) analyses the use of ML and
decision support systems for invoice processing and highlights
the associated benefits and challenges. The work also
shows that there is high variety in how cost centers and accounts
are perceived in different organizations, and that the complexity
of invoices varies with different parameters. Doshi,
Kotak, and Sahitya (2020) and Desai et al. (2021) highlight
the importance of automation and Robotic Process Automation
(RPA) in invoice processing. Smirnov et al. (2016),
Hu (2015), and Tater et al. (2018)
discuss various ways of identifying invoice payment
status using machine learning algorithms. Some work
is available on individual invoice processing steps, such as the
similarity between line items in purchase orders and invoices
discussed in (Maurya et al. 2020). Our earlier work (Tater
et al. 2021) discusses tax code determination following an
approach where we consider historical data for tax code
classification. Data drift and concept drift have also been
considered in other works (Quiñonero-Candela et al. 2009;
Widmer and Kubat 1996; Gama et al. 2004).
Other companies are using a variety of solutions, such
as robotic process automation (RPA), rule-based solutions,
and electronic invoicing platforms. The major challenge
with these solutions is that they primarily work with structured
data or for repetitive tasks. RPA is a good solution for repetitive
tasks with low or zero cognitive load. Similarly, rule-based
solutions are applicable only for a fixed set of
rules. Also, not every organisation handles a large enough
volume of invoices to have its own electronic invoicing
platform, which entails building web applications for vendors
to input their invoices in a specified format and then
processing them. In contrast, our system is designed to
handle unstructured data such as line-item descriptions and
uses ML algorithms to tackle complex tasks as well.
⁵ https://scikit-learn.org/stable/
⁶ https://lucene.apache.org/pylucene/
⁷ https://www.mongodb.com/
Conclusion and Future Work
We have developed a configurable system for automatic
accounts payable invoice processing where each independent
module can be used or dropped depending on clients'
requirements. Some modules, like line-item matching, accounting
codes prediction, coder name prediction, and tax
code prediction, use supervised and unsupervised algorithms,
while others are rule-based. Some of these modules
also use feedback from agents to improve their performance.
This helps in tackling any changes or drifts in the data, along
with addressing the problem of unseen data. The agent
feedback also helps in fine-tuning the confidence thresholds,
thus pushing more invoices directly to posting over time
without any manual intervention. Also, owing to the modular
design of the system, each module can be separately
changed or improved based on data characteristics or client
requirements, or replaced with a better-performing algorithm. We
have successfully deployed the system for 2 clients in multiple
markets and are in the process of deploying it for more
markets and clients with minimal changes. This highlights
the generalizability and scalability of the proposed system.
Aided by extensive logging and the data collected by
the system, clients can better analyze vendor relationships.
As part of future work, we need to come up with ways in
which new rules or modules can be added directly by clients
themselves if a particular need arises. Also, the industry as a
whole needs to look at more explainable use of AI modules,
which would make new clients more comfortable adopting such
automation systems, since some new clients are wary of
using such systems because of audits and regulations.
References
Bohn, T. 2010. Cost-cutting with accounts payable automa-
tion. Financial executive, 26(6): 65–67.
Cosgrove, C. 2013. Invoice Exceptions Should
Be Like Caution Signs For Your AP Process.
https://www.cloudxdpo.com/blog/bid/220099/Accounts-Payable-Process-Automating-Manual-Validation-Steps.
Accessed: 2021-12-27.
Desai, D.; Jain, A.; Naik, D.; Panchal, N.; and Sawant, D.
2021. Invoice Processing using RPA & AI. Available at
SSRN 3852575.
Doshi, P.; Kotak, Y.; and Sahitya, A. 2020. Automated
Invoice Processing System Along with Chatbot. International
Journal of Research in Engineering, Science and
Management, 3(5): 29–31.
Furth, D. 2005. Accounts payable automation pays divi-
dends. The CPA Journal, 75(7): 16.
Gama, J.; Medas, P.; Castillo, G.; and Rodrigues, P. 2004.
Learning with drift detection. In Brazilian symposium on
artificial intelligence, 286–295. Springer.
Han, B.; and Baldwin, T. 2011. Lexical normalisation of
short text messages: Makn sens a #twitter. In Proceedings
of the 49th Annual Meeting of the Association for Computational
Linguistics: Human Language Technologies-Volume
1, 368–378. Association for Computational Linguistics.
Hedberg, N. 2020. Automated invoice processing with
machine learning: Benefits, risks and technical feasibility.
Ph.D. thesis, KTH, School of Industrial Engineering and
Management.
Hu, P. 2015. Predicting and improving invoice-to-cash
collection through machine learning. Ph.D. thesis, Mas-
sachusetts Institute of Technology.
Maurya, C. K.; Gantayat, N.; Dechu, S.; and Horvath, T.
2020. Online similarity learning with feedback for invoice
line item matching. arXiv preprint arXiv:2001.00288.
Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.;
Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss,
R.; Dubourg, V.; et al. 2011. Scikit-learn: Machine learning
in Python. Journal of Machine Learning Research, 12:
2825–2830.
Quiñonero-Candela, J.; Sugiyama, M.; Schwaighofer, A.; and
Lawrence, N. D. 2009. Dataset shift in machine learning.
The MIT Press.
Schaeffer, M. S. 2002. Essentials of accounts payable. John
Wiley & Sons.
SeatGeek. 2011. Fuzzy String Matching in Python.
https://github.com/seatgeek/fuzzywuzzy. Accessed: 2021-12-27.
Smirnov, J.; et al. 2016. Modelling late invoice payment
times using survival analysis and random forests techniques.
Ph.D. thesis, Universitas Tartuensis.
Tater, T.; Dechu, S.; Gantayat, N.; Guptha, M.; and
Narayanan, S. 2021. Tool for Automated Tax Coding of
Invoices. Proceedings of the AAAI Conference on Artificial
Intelligence, 35(17): 15185–15194.
Tater, T.; Dechu, S.; Mani, S.; and Maurya, C. 2018. Predic-
tion of invoice payment status in account payable business
process. In International Conference on Service-Oriented
Computing, 165–180. Springer.
Wagner, J. 2018. Duplicate Invoice Payments:
How to Avoid Cash Leakage in Accounts Payable.
https://www.oversight.com/blog/duplicate-invoice-payments-avoid-cash-leakage-accounts-payable.
Accessed: 2021-12-27.
Widmer, G.; and Kubat, M. 1996. Learning in the presence
of concept drift and hidden contexts. Machine learning,
23(1): 69–101.
Yujian, L.; and Bo, L. 2007. A normalized Levenshtein dis-
tance metric. IEEE transactions on pattern analysis and ma-
chine intelligence, 29(6): 1091–1095.
... P2P includes activities of making purchases, receiving goods and services, processing invoices, credit notes and creating/distributing payments through to maintaining the purchase ledger. Figure 5 shows the positioning of purchase invoice processing in P2P lifecycle (Neuvonen, 2015;Doxey, 2021;Tater et al., 2022). Neuvonen (2015) defines the process by naming six separate phases that are associated with accounts payable processing: 1. Receiving of an invoice 2. Invoice posting and sending the invoice for approval 3. Reviewing and approving the invoice 4. Payment processing 5. Reconciliations and accruals 6. Archiving ...
... Prior to modelling the manual process of invoice processing, three different sources (Neuvonen, 2015;Doxey, 2021;Tater et al., 2022) were reviewed for the detailed description of steps conducted in the process. The description has then been summarized based on similarities and extended with differences so a comprehensive description of the process could be achieved. ...
... Once the pre-processing steps for paper invoice have concluded, then the activities of capturing the data by converting the paper invoices (and attachments) into digital images and index data follows. For paper invoices, this includes the steps of scanning, image enhancement, indexing, validation, and data extraction (Neuvonen, 2015;Doxey, 2021;Tater et al., 2022). ...
Thesis
Full-text available
Automation is revolutionizing business processes. The usage of rule-based technologies with the combination of artificial intelligence transforms operating efficiency, productivity and creates value for corporations. A new technological trend called “Hyperautomation” is depicted as the next evolutionary step in process automation. Despite leading Gartner´s top 10 technological trends for the past 3 years, there is only a limited number of scientific research conducted on it and even less practical research can be found on the subject. To fill that research gap, this paper attempts - with the primary scientific method of prototyping - to achieve a proof of concept for hyperautomation. In the developed prototype the business process of invoice processing was hyperautomated with only the use of open-source software. Developing the prototype entailed the selection and subsequently the analysis of a business process for automation, definition of software requirements, designing the prototype’s architecture, modeling the automated process, as well as the selection, installation and configuration of open-source software which was used in the final implementation. The prototype proved that the concept of hyperautomation can be realized with open-source software. The prototype was then evaluated on its performance and data quality by comparing it to manual invoice processing. The evaluation showed that the prototype outperformed the manual process in terms of performance and data quality. The depicted development procedure enables the ability to hyperautomate different business processes and enhance them, based on the developed architecture, with additional automation technology. Based on the developed prototype, the best-of-breed approach for hyperautomating business process can be utilized as a starting point for further research.
... As a matter of fact, the role of accountants is widely acknowledged as being among the most susceptible to computerization, as substantiated by a research paper [1]. In this landscape of evolving technology, our study delves into a critical facet of corporate accounting: the classification of invoice entries [2,3]. By employing innovative machine learning techniques, we aim to alleviate the burden of manual work, streamline processes, and improve overall efficiency in the realm of electronic invoicing. ...
... However, it is important to note that even in this electronic landscape, significant manual work persists [2]. This persistence is attributed to several key challenges: ...
Article
Full-text available
This paper addresses the time-intensive task of assigning accurate account labels to invoice entries within corporate bookkeeping. Despite the advent of electronic invoicing, many software solutions still rely on rule-based approaches that fail to address the multifaceted nature of this challenge. While machine learning holds promise for such repetitive tasks, the presence of low-quality training data often poses a hurdle. Frequently, labels pertain to invoice rows at a group level rather than an individual level, leading to the exclusion of numerous records during preprocessing. To enhance the efficiency of an invoice entry classifier within a semi-supervised context, this study proposes an innovative approach that combines the classifier with the A* graph search algorithm. Through experimentation across various classifiers, the results consistently demonstrated a noteworthy increase in accuracy, ranging between 1% and 4%. This improvement is primarily attributed to a marked reduction in the discard rate of data, which decreased from 39% to 14%. This paper contributes to the literature by presenting a method that leverages the synergy of a classifier and A* graph search to overcome challenges posed by limited and group-level label information in the realm of electronic invoicing classification.
... Digitalization is likely to refer to those settings where some process moves from manual to a digital and the particular process is reengineered. As an example, a range of technologies applied to, say, orders, with the portfolio of technologies changing over time (e.g., Tater et al. 2022). ...
... Then, using the digital version, the system was designed to find ontology terms in a contract, while an HITL confirms which of the identified uses was appropriate, and then the system captured the location of the terms in a spreadsheet. As another example of using AI for digitalization of a process, Tater et al. (2022) developed an AI-based "end-to-end" invoice processing system using multiple steps to process invoices. That system can be configured to individual clients' requirements. ...
Article
This paper provides some basic definitions associated with digital transformation in organizations and applies those definitions to accounting, electronic commerce, and supply chains. I also drill down on the dimensions associated with digital transformation, including digital everywhere, integration (across applications and with customers and partners), and the need to reengineer processes. I examine several examples of processes ranging from digitization to digital transformation. I also examine the role of people in digitally transformed organizations and some technologies that are important to continued evolution of digitally transformed organizations. Further, we explore a number of scenarios of digital transformation. Finally, these investigations result in the determination of a number of emerging research issues.
... In terms of accounting and reporting, ML may be used to combine fragmented information across companies, sectors and jurisdictions and in multiple languages to generate an accounting topology or chart-of-account (Jørgensen and Igel, 2021;Lesner et al., 2020;Munoz et al., 2022;Zhang and Liang, 2023), and to also aid detection of journal entry anomalies (Zupan et al., 2020). More broadly, AI may be used to support a wide range of salient accounting-related business applications, including accounts payable (Tater et al., 2022), costing (Lee and Leung, 2012), order management (Khataie et al., 2011), resource management (Hilmola and Gupta, 2015;Hsu, 2008), capital structuring (Östermark, 2015), corporate valuation , inventory classification (Kaabi, 2022), expense management (Lecue and Wu, 2017), project management (Al-Tabtabai et al., 1997) and process engineering (Arif et al., 2020). Prominent application domains include construction, project and engineering contexts (Al-Tabtabai et al., 1997;Baalousha and Çelik, 2011), advertising (Fan and Delage, 2022) insurance (Zuin et al., 2023), banking (González-Carrasco et al., 2019), and audit (Nado et al., 1996). ...
Article
Historically, literature suggests that a variety of accounting roles will be replaced by Artificial Intelligence (AI) and related technologies; however, in recent years there is a growing recognition that accounting can in fact harness AI’s potential to add value to organisations. Commentators have highlighted the need for increased research exploring accounting and AI and for accounting scholars to consider multi-disciplinary research in this area. This study uses a form of topic modelling to analyse literature exploring AI and related techniques in an accounting context. Latent Dirichlet Allocation (LDA) has been used to enable probabilistic, machine-based interrogation of large volumes of literature. This study applies LDA to the abstracts of 930 peer-reviewed academic publications from a variety of disciplines to identify the most significant accounting and AI topics discussed in the literature during the period 1990 to 2023. Our findings suggest that prior literature reviews based on more traditional methodologies do not capture a comprehensive picture of accounting and AI research. Eleven topic clusters are identified which provide a comprehensive topology of the extant literature discussing accounting and AI and set out an agenda for future research designed to foster academic progress in the area. It also represents one of the first applications of probabilistic topic modelling to accounting literature.
... Machine learning algorithms automate the business processes for accounts payable and other current liabilities by applying the organization's policies for invoice validation, vendor identification, line-item matching for purchase orderbased invoices, general ledger code for non-purchase order-related invoices, approval workflows and payment scheduling. These algorithms ensure invoice accuracy, detect discrepancies and flag exceptions for human review, improving processing efficiency and reducing errors and fraud risks (Tater et al. 2022). This approach contrasts with traditional accounts payable management, which relies on manual data entry and is labor-intensive and error prone. ...
Article
Full-text available
This study delves into how artificial intelligence (AI) transforms working capital management by addressing the limitations of traditional methods. The focus is to critically review research publications, case studies and industry reports using qualitative research methodology to examine how AI improves operational efficiency and decision-making in this area. The study demonstrates the practical application of advanced machine learning algorithms and big data analytics in optimizing inventory management, enhancing demand forecasting and improving cash flow predictions. A thorough review of recent research and case studies reveals additional benefits, including automated reconciliations, debtor risk analysis, accelerated cash inflows, invoice processing and proactive working capital management. Despite challenges in integrating AI with legacy systems, the potential for substantial improvements in financial health and operational efficiency is significant. The study also suggests future research directions, such as developing comprehensive AI-driven applications for broader working capital considerations, creating empirical validation frameworks for model performance and addressing ethical considerations to fully harness AI's potential in optimizing working capital management.
... through the triangulation between different documents (Datasnipper, 2023, Tater et al., 2022. A pioneering study applying ML for IE was presented by Palm et al. (2017). ...
Article
The automation of incoming invoices processing promises to yield vast efficiency improvements in accounting. Until a universal adoption of fully electronic invoice exchange formats has been achieved, machine learning can help bridge the adoption gaps in electronic invoicing by extracting structured information from unstructured invoice formats. Machine learning especially helps the processing of invoices of suppliers who only send invoices infrequently, as the models are able to capture the semantic and visual cues of invoices and generalize them to previously unknown invoice layouts. Since the population of invoices in many companies is skewed toward a few frequent suppliers and their layouts, this research examines the effects of training data taken from such populations on the predictive quality of different machine-learning approaches for the extraction of information from invoices. Comparing the different approaches, we find that they are affected to varying degrees by skewed layout populations: The accuracy gap between in-sample and out-of-sample layouts is much higher in the Chargrid and random forest models than in the LayoutLM transformer model, which also exhibits the best overall predictive quality. To arrive at this finding, we designed and implemented a research pipeline that pays special attention to the distribution of layouts in the splitting of data and the evaluation of the models.
... This result accords with previously established findings [42]. Tater et al., [45,46] study developed an AI-driven system that processed~80 k invoices; out of that, 76% of invoices were handled and processed automatically with no human interventions. The above research has also successfully detected the double payments error that often comes with the man-made handling of invoices. ...
Article
Full-text available
Accounts Payable (AP) is a time-consuming and labor-intensive process used by large corporations to compensate vendors on time for goods and services received. A comprehensive verification procedure is executed before disbursing funds to a supplier or vendor. After the successful conclusion of these validations, the invoice undergoes further processing by traversing multiple stages, including vendor identification; line-item matching; accounting code identification; tax code identification, ensuring proper calculation and remittance of taxes, verifying payment terms, approval routing, and compliance with internal control policies and procedures, for a comprehensive approach to invoice processing. At the moment, each of these processes is almost entirely manual and laborious, which makes the process time-consuming and prone to mistakes in the ongoing education of agents. It is difficult to accomplish the task of automatically processing these invoices for payment without any human involvement. To provide a solution, we implemented an automated invoicing system with modules based on artificial intelligence. This system processes invoices from beginning to finish. It takes very little work to configure it to meet the specific needs of each unique customer. Currently, the system has been put into production use for two customers. It has handled roughly 80 thousand invoices, of which 76 percent were automatically processed with little or no human interaction.
... Notably, the receipts generated have no features to verify fake receipts and are prone to forgery, hence the loss of revenue [7]. Handling receipts through manual checks is cumbersome and prone to errors [8]. As was already noted, the issue might have severe repercussions and cause a significant loss of revenue collections [9]. ...
Article
Full-text available
Point of Sale terminals play a significant role in revenue collection and have become rampant to Tanzanian Local Government Authorities. Point of Sale systems monitors cash flow, transactions, and price control while reducing human error and managing staff, customers, and inventory. However, Point of Sale systems are vulnerable to fake receipts, thus reducing revenue collections among Local Government Authorities. The cross-sectional Design was used to facilitate knowledge for the subsequent data collection. Data were collected from 300 respondents in Mbeya and Songwe regions using purposive and simple random sampling. 70% of respondents reported that fake receipt is the major factor affecting revenue collection, followed by lack of training (20%) and security (8%). In this study, we propose a mobile-based solution to enhance revenue collection in Local Government Authorities by addressing major factors affecting revenue collection. The developed mobile application was evaluated and validated; Whereby the results confirm that the designed tool is effective against money fraud, transaction errors, human errors, and defaulters with minimal resource usage. Hence, the designed mobile application can be applied as an auditing tool to reduce money fraud and increase revenue collection for the Local Government Authorities
Chapter
The rapid advancements in artificial intelligence (AI) technologies have significantly impacted various industries, including the accounting profession. This paper examines the adoption of AI in the accounting profession using the Technology Acceptance Model (TAM) as a framework. The TAM provides a theoretical foundation to understand the factors influencing the acceptance and adoption of AI in accounting, including perceived usefulness, perceived ease of use, attitudes towards AI, and external factors. The paper also discusses the implications of AI adoption for accountants and the challenges associated with integrating AI into accounting practices. Finally, recommendations are provided to facilitate successful AI adoption in the accounting profession.
Chapter
Predicting invoice payment dates with machine learning is an important factor in B2B marketing: it affects business dealings between companies and can change the direction of the business. If the seller finds that the buyer's predicted payment date falls well past the due date, the seller may decline to sell to that buyer. In this way, payment date prediction plays a very important role in B2B marketing. In this study, we explore how machine learning (ML) can be used to develop models for predicting whether newly created bills will be paid, enabling customized collection activities specific to each invoice or customer. Our models can accurately forecast whether or not a bill will be paid on time and also give estimates of how long the delay will be. Our methods are demonstrated using real-world transaction data from several firms. Finally, simulation results are compared with other state-of-the-art approaches.
Keywords: B2B marketing; Invoice; Payment; Machine learning; Business intelligence
Conference Paper
Most work in machine learning assumes that examples are generated at random according to some stationary probability distribution. In this work we study the problem of learning when the distribution that generates the examples changes over time. We present a method for detecting changes in the probability distribution of examples. The idea behind the drift detection method is to control the online error rate of the algorithm. The training examples are presented in sequence. When a new training example is available, it is classified using the current model. Statistical theory guarantees that while the distribution is stationary, the error will decrease. When the distribution changes, the error will increase. The method controls the trace of the online error of the algorithm. For the current context we define a warning level and a drift level. A new context is declared if, in a sequence of examples, the error increases, reaching the warning level at example k_w and the drift level at example k_d. This is an indication of a change in the distribution of the examples. The algorithm learns a new model using only the examples since k_w. The method was tested with a set of eight artificial datasets and a real-world dataset. We used three learning algorithms: a perceptron, a neural network, and a decision tree. The experimental results show good performance in detecting drift and in learning the new concept. We also observe that the method is independent of the learning algorithm.
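The warning-level and drift-level scheme described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name `DriftDetector`, the helper `scan`, the 2-sigma and 3-sigma thresholds, and the 30-example warm-up are all assumptions chosen for the sketch, since the abstract does not state them.

```python
import math


class DriftDetector:
    """Tracks the online error rate and flags warning/drift levels.

    Hypothetical sketch: thresholds and warm-up length are assumptions.
    """

    def __init__(self, warning_factor=2.0, drift_factor=3.0, min_samples=30):
        self.warning_factor = warning_factor
        self.drift_factor = drift_factor
        self.min_samples = min_samples
        self.reset()

    def reset(self):
        """Forget all statistics, e.g. after a drift has been declared."""
        self.n = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, is_error):
        """Feed one prediction outcome; return 'stable', 'warning' or 'drift'."""
        self.n += 1
        self.errors += int(is_error)
        p = self.errors / self.n                # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)   # its standard deviation
        if self.n < self.min_samples:
            return "stable"
        if p + s < self.p_min + self.s_min:     # remember the best point so far
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + self.drift_factor * self.s_min:
            self.reset()
            return "drift"
        if p + s > self.p_min + self.warning_factor * self.s_min:
            return "warning"
        return "stable"


def scan(error_stream):
    """Return (k_w, k_d): indices of the warning and drift levels, or None."""
    detector = DriftDetector()
    k_w = None
    for i, is_error in enumerate(error_stream):
        state = detector.update(is_error)
        if state == "drift":
            return k_w, i
        if state == "warning" and k_w is None:
            k_w = i
        elif state == "stable":
            k_w = None                          # warning was a false alarm
    return k_w, None
```

A caller would retrain the model on the examples from `k_w` onward once `scan` (or a streaming equivalent) reports a drift index `k_d`.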
Article
On-line learning in domains where the target concept depends on some hidden context poses serious problems. Context shifts can induce changes in the target concepts, producing what is known as concept drift. We describe a family of learning algorithms that flexibly react to concept drift and can take advantage of situations where contexts reappear. The general approach underlying all these algorithms consists of (1) keeping only a window of currently trusted examples and hypotheses; (2) storing concept descriptions and re-using them if a previous context re-appears; and (3) controlling both of these functions by a heuristic that constantly monitors the system's behavior. The paper reports on experiments that test the systems' performance under various levels of noise and different extents and speeds of concept drift.
Key words: incremental concept learning, on-line learning, context dependence, concept drift, forgetting
Article
Delinquent invoice payments can be a source of financial instability if they are poorly managed. Research in supply chain finance shows that effective invoice collection is positively correlated with the overall financial performance of companies. In this thesis I address the problem of predicting delinquent invoice payments in advance by applying machine learning to historical invoice data. Specifically, this thesis demonstrates how supervised learning models can be used to detect the invoices that would have delayed payments, as well as the problematic customers, which enables customized collection actions from the firm. The model from this thesis can predict with high accuracy whether an invoice will be paid on time and can also estimate the magnitude of the delay. This thesis builds and trains its invoice delinquency prediction capability on real-world invoice data from a Fortune 500 company.
Article
Dataset shift is a common problem in predictive modeling that occurs when the joint distribution of inputs and outputs differs between training and test stages. Covariate shift, a particular case of dataset shift, occurs when only the input distribution changes. Dataset shift is present in most practical applications, for reasons ranging from the bias introduced by experimental design to the irreproducibility of the testing conditions at training time. (An example is email spam filtering, which may fail to recognize spam that differs in form from the spam the automatic filter has been built on.) Despite this, and despite the attention given to the apparently similar problems of semi-supervised learning and active learning, dataset shift has received relatively little attention in the machine learning community until recently. This volume offers an overview of current efforts to deal with dataset and covariate shift. The chapters offer a mathematical and philosophical introduction to the problem, place dataset shift in relationship to transfer learning, transduction, local learning, active learning, and semi-supervised learning, provide theoretical views of dataset and covariate shift (including decision theoretic and Bayesian perspectives), and present algorithms for covariate shift. Contributors: Shai Ben-David, Steffen Bickel, Karsten Borgwardt, Michael Brückner, David Corfield, Amir Globerson, Arthur Gretton, Lars Kai Hansen, Matthias Hein, Jiayuan Huang, Takafumi Kanamori, Klaus-Robert Müller, Sam Roweis, Neil Rubens, Tobias Scheffer, Marcel Schmittfull, Bernhard Schölkopf, Hidetoshi Shimodaira, Alex Smola, Amos Storkey, Masashi Sugiyama, Choon Hui Teo. Neural Information Processing series.
Article
Although a number of normalized edit distances presented so far may offer good performance in some applications, none of them can be regarded as a genuine metric between strings because they do not satisfy the triangle inequality. Given two strings X and Y over a finite alphabet, this paper defines a new normalized edit distance between X and Y as a simple function of their lengths (|X| and |Y|) and the Generalized Levenshtein Distance (GLD) between them. The new distance can be easily computed through GLD with a complexity of O(|X|.|Y|) and it is a metric valued in [0, 1] under the condition that the weight function is a metric over the set of elementary edit operations with all costs of insertions/deletions having the same weight. Experiments using the AESA algorithm in handwritten digit recognition show that the new distance can generally provide similar results to some other normalized edit distances and may perform slightly better if the triangle inequality is violated in a particular data set.
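As a rough illustration of a normalized edit distance built from the two string lengths and the Levenshtein distance, the sketch below uses the form 2·GLD / (α·(|X| + |Y|) + GLD) with unit edit costs (α = 1). The abstract above does not state the formula, so this form, along with the function names `levenshtein` and `normalized_distance`, should be read as an assumption rather than as a quotation of the paper.

```python
def levenshtein(x, y):
    """Unit-cost Levenshtein distance via dynamic programming, O(|x|*|y|)."""
    prev = list(range(len(y) + 1))           # distances from "" to prefixes of y
    for i, cx in enumerate(x, start=1):
        curr = [i]                            # distance from x[:i] to ""
        for j, cy in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,                  # delete cx
                curr[j - 1] + 1,              # insert cy
                prev[j - 1] + (cx != cy),     # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def normalized_distance(x, y):
    """Assumed normalized form with unit costs: 2*d / (|x| + |y| + d)."""
    if not x and not y:
        return 0.0                            # two empty strings are identical
    d = levenshtein(x, y)
    return 2.0 * d / (len(x) + len(y) + d)


# Example: "kitten" vs "sitting" has Levenshtein distance 3,
# so the normalized value is 2*3 / (6 + 7 + 3) = 0.375.
print(normalized_distance("kitten", "sitting"))
```

Under unit costs the value stays in [0, 1]: it is 0 for identical strings and 1 when one string is empty and the other is not.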
Bohn, T. 2010. Cost-cutting with accounts payable automation. Financial executive, 26(6): 65-67.
Cosgrove, C. 2013. Invoice Exceptions Should Be Like Caution Signs For Your AP Process. https://www.cloudxdpo.com/blog/bid/220099/Accounts-Payable-Process-Automating-Manual-Validation-Steps. Accessed: 2021-12-27.
Doshi, P.; Kotak, Y.; and Sahitya, A. 2020. Automated Invoice Processing System Along with Chatbot. International Journal of Research in Engineering, Science and Management, 3(5): 29-31.
Furth, D. 2005. Accounts payable automation pays dividends. The CPA Journal, 75(7): 16.