Automating the cumbersome Tax Reconciliation process at large scale

We at TaxReco focus on automating the cumbersome Tax Reconciliation process at large scale through simple, unified and standard workflows, and on improving the productivity of our customers.

Tax Reconciliation is a complex technical problem to solve, especially at large scale.

At a high level, much of the complexity lies in the following areas.

Volume – The number of financial transactions (or line items) on our customers’ ledgers can vary from a few tens of thousands to tens of millions. Our technology platform should adapt to such a scale.
Variety – Reconciliation (TDS Reconciliation, for example) is typically a process of matching the financial transactions in one data set against those in another, subject to various checks and balances. For TDS Reconciliation, you’d typically have 26AS (the statement available in the Government Portal) in a standard text format and the sales ledgers (which may be in customer-specific formats). Our technology should be flexible enough to support multiple types of sales ledgers, and customers should be able to describe the schema on the fly.
Velocity – As a SaaS solution, the platform should be able to cope with data arriving at different velocities. For example, “Customer A” may be ingesting data into the platform at a rate of 100 transactions per second, whereas “Customer B” may ingest at only 5 transactions per second. The performance a customer can expect depends mainly on the velocity at which the data has to be processed and on the technical resources that this requires.
Polymorphic Data Representations – The tax reconciliation platform has many components, each built for a specific need. Each component requires the data to be stored in a format optimised for its purpose. For example, the input data for reconciliation is best stored as a “table” in an RDBMS (as the customer’s input data is primarily CSVs and Excel sheets), whereas the output data is best stored in a NoSQL store for horizontal scaling and analytics-style queries (e.g. aggregations).
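
To make the “Variety” point concrete, here is a minimal Python sketch of how a customer-described schema could map a ledger onto common fields before matching it against 26AS entries. This is only an illustration, not how the TaxReco platform is implemented: the column names, the `LEDGER_SCHEMA` mapping, the `normalise` and `reconcile` helpers, and the matching rule (same TAN, TDS amount within a small tolerance) are all assumptions made for the example.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical customer-supplied schema: canonical field -> column name in their ledger.
LEDGER_SCHEMA = {"tan": "Deductor TAN", "amount": "Invoice Amt", "tds": "TDS Deducted"}


def normalise(row, schema):
    """Map one raw ledger row (keyed by customer column names) to canonical fields."""
    return {
        "tan": row[schema["tan"]].strip().upper(),
        "amount": Decimal(row[schema["amount"]]),
        "tds": Decimal(row[schema["tds"]]),
    }


def reconcile(ledger_rows, form26as_rows, tolerance=Decimal("1.00")):
    """Pair each ledger row with an unused 26AS entry for the same TAN whose
    TDS amount differs by at most `tolerance` (an assumed matching rule)."""
    by_tan = defaultdict(list)
    for entry in form26as_rows:
        by_tan[entry["tan"]].append(entry)

    matched, unmatched = [], []
    for row in ledger_rows:
        candidates = by_tan[row["tan"]]
        hit = next(
            (c for c in candidates if abs(c["tds"] - row["tds"]) <= tolerance),
            None,
        )
        if hit is None:
            unmatched.append(row)
        else:
            candidates.remove(hit)  # each 26AS entry is matched at most once
            matched.append((row, hit))
    return matched, unmatched
```

Supporting a different ledger layout then only means supplying a different mapping; the matching logic works on the canonical fields and does not change.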

To address all of the above challenges and to stay future-proof, our platform is modelled as a “big data platform”. In simple terms, a platform is fit to handle big data if it can scale horizontally as the load increases, and perform better as it does. The TaxReco 2.0 platform is precisely that.

Conceptually, the pipeline of various components of the platform looks like this.
