Search
  • John Coogan

Real Time Processing for Medical Data Warehousing

Updated: Jan 28

Real Time” processing is a well-known phrase that is also commonly used in the medical data warehousing industry. But in the infamous words of Inigo Montoya … “I don’t think it means what you think it means”.


In the wild the typical expectation for Real Time processing is that the user gets immediate results. But Real Time processing really refers more to a method of data processing in which there are no breaks or pauses in the chain of processing events. For small data like a bank ATM transaction, the user will get a fast result. But in the Medical Data Warehousing industry some files can be upwards of 100,000 claims. For even the most sophisticated Data Warehouse that has migrated away from scheduled batch processing towards a Real Time approach, the users with the largest files will still have to wait at least a few minutes. If you had to feed 100,000 one-dollar bills into your bank ATM, you would know that the deposit slip is not going to be immediate.


Starting a data collection system can be a complicated undertaking for a state agency that has been tasked with this responsibility. They must collect, then clean, and finally publish medical data to all the facilities in their state. Easy submission procedures and fast turnaround times are common requests here at System13 as state agencies look to provide low impact reporting solutions to their facilities.



So, How Fast Does Your Medical Data Warehouse System Need to Be?


To help state agencies arrive at a more clear and realistic data warehouse processing expectation, let’s talk about processing speed in terms of claims per second. We can gain a great behind-the-scenes look into Data Warehouse processing if we ask the opposite question: “What is the absolute slowest speed a claim Data Warehouse can run at and still be able to complete processing for all the claims in your state?” And of course: “How big is your state?


For states across the US, inpatient claims will be in the range of 100,000 - 900,000 claims per quarter, depending on the state. For outpatient, the range will be 500,000 – to 7,000,000 claims per quarter. For our discussion, we will assume your state wants to collect and process a total of 4,000,000 inpatient and outpatient claims per quarter.

We can determine the practical minimum speed for this processing with a little industry understanding and some basic math.


Reporting is typically managed quarterly, or 90 days. But we can’t just divide four million claims by 90 days. From decades of processing experience, System13 can say that 50% of your facilities are going to send all of their data in the last 14 days prior to submission deadlines. This is when your Data Warehouse will be taxed the heaviest. So, let’s compute the minimum speed needed based on this 14-day crunch time.



Basic Math:


50% of 4,000,000 claims = 2,000,000 claims submitted within the 14 days just before quarter deadline.

2,000,000 claims divided by 14 days = 142,857 claims/day.

142,857 divided by 24 hours = 5952 claims per hour. 5952 divided by 60 minutes = 100 claims per minute. 100 divided by 60 seconds = 1.7 claims per second.


Let’s double the speed of 1.7 claims per second to 3.4 claims per second to give the Data Warehouse some breathing room to handle any issues or reprocessing needs that may arise. So, in one second, the Data Warehouse needs to be able to process (at a minimum) around 4-ish claims per second, through the following:


  • 200 separate validation rules per claim

  • Recompute and compare charge totals

  • Loop large numbers of charge records through multiple validations per record

  • Validate physician entries against hundreds of thousand of NPI/license lookups

  • Validate diagnosis codes against thousands of lookups

  • Thousands of procedure/HCPCS codes

  • Validate payer codes, id’s

  • Multiple additional lookups for values, conditions, occurrences, states, zip codes

  • Revenue code validation

  • Sophisticated groupers that assign DRG's, MDC's, and Severity of Illness scores

  • Validate procedure codes, modifiers, times, dates

  • Flag any required data elements that are missing

  • Any added state logic (assign ER Department indicators, Length of stay, etc)

  • And more …


So, if the Data Warehouse processes continually with no issues or interruptions, a speed of 4 claims per second can get the work done on time and within your deadlines. But that speed can be a pretty big ouch to any facility that submits a large file for processing.



  • 4 minutes to process 1,000 claims.

  • 20 minutes to process 5,000 claims.

  • 40 minutes to process 10,000 claims.

  • 3.5 hours to process 50,000 claims.

  • 7 hours for 100,000 claims.



Any Data Warehouse worth its salt should be able to do multi-threaded processing so any other files submitted do not need to ‘wait in line’ behind a monster file. But how do you increase the processing speed of those large files? To be competitive in this industry, 4 claims per second is simply unacceptable as the user wait times will become a vocalized frustration. The alternative solution of breaking files into pieces to process them in parallel has turned out to be cost prohibitive every time we have looked into it. The costs are high due to the added complexity required to build accounting and failure recovery procedures when you introduce pieces of files into the equation. If there’s any failure, network issue, code update bug … how do you put humpty-dumpty back together again? It’s expensive! The simple bottom line for every warehouse out there is: You have to increase your single threaded claims per second speed. It’s the only cost effective way to embrace this work.



The System13 Approach


We’re at the end of the article. Before we leave, just a quick note about System13 as we address the speed of processing for large files. We are of course exponentially faster than 4 claims per second! System13 is committed to the standards of “Real Time” processing. We utilize a multi-threaded architecture approach, with multiple queues and multi-core hardware that can dedicate one core per queue.

Our cache setup is large enough to contain our entire database, per client (including lookup tables, which are all local), and we use solid-state server and desktop systems. We also came to the conclusion decades ago that nothing beats a custom written audit engine. Third-party claim auditors/gems/widgets are simply not fast enough because they add overhead, and they never anticipate unique business rules. Running the claim-level audit engine is the biggest thing we do. You have to write this engine from scratch in order to obtain full control over order-of-operations, indexed lookup optimizations, repetition enhancements, code variables flexibility, memory allocation, and more. At this writing System13 runs at a peak claim audit speed of over 47 claims per second (per system), meaning we can validate and process over 4 million claims, per system … in just one day. System13 is proud to be the fastest claim audit processor in this industry!



From single-site processing/conversion solutions to large state-wide processing systems, it would be our privilege to discuss your Medical Data needs.



John Coogan is a Senior Data Analyst and 23 year veteran of System13


39 views0 comments

Recent Posts

See All