For software vendors, innovation is key, but it brings challenges, including new software defects of the kind that naturally occur in any complex software development project.
The client is a data management system vendor. These systems form the backbone of businesses worldwide, so reliability and stability are core requirements. Customers buying a data management system will not tolerate data corruption, slow response times, or downtime. Failure to deploy a stable, reliable system carries significant risks, including loss of renewals and reputational damage that can be very difficult to reverse.
The client was committed to delivering high-quality software at speed, but its engineering teams were experiencing sporadic test failures that were challenging to diagnose and were slowing them down. The engineers had created a randomised workload test system to try to reproduce issues seen by some of their customers in the field. The test system succeeded in this, but because of the large codebase, the complex control flow, and the intermittent nature of the failures, the engineers were still unable to address some of the failure modes at the pace they needed.
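As a rough illustration of what a randomised workload test can look like, the Python sketch below drives a toy key-value store with a seeded random sequence of operations and checks it against a simple in-memory model. Everything in it (ToyStore, run_workload, the operation mix) is hypothetical and is not the client's actual test system. Seeded replay of this kind only reproduces failures that depend on the operation sequence; failures that depend on thread timing or the environment may not recur on a rerun, which is the gap that recording a failing run is meant to close.

```python
import random


class ToyStore:
    """Hypothetical stand-in for the system under test."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

    def get(self, key):
        return self._data.get(key)


def run_workload(seed, steps=200, store_factory=ToyStore):
    """Apply a seeded random operation sequence and compare against a model.

    Returns None if every read matched the model, otherwise a dict
    describing the first mismatch (including the seed needed to replay it).
    """
    rng = random.Random(seed)  # private generator, so the run depends only on the seed
    store = store_factory()
    model = {}  # reference behaviour: a plain dict
    for step in range(steps):
        op = rng.choice(["put", "delete", "get"])
        key = rng.randrange(8)  # small key space so operations collide often
        if op == "put":
            value = rng.randrange(1000)
            store.put(key, value)
            model[key] = value
        elif op == "delete":
            store.delete(key)
            model.pop(key, None)
        else:
            if store.get(key) != model.get(key):
                return {"seed": seed, "step": step, "key": key}
    return None
```

A harness along these lines would call run_workload for many seeds and log the returned seed for any failure, so that the same operation sequence can be replayed later.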
Attempts to resolve these issues were consuming large quantities of the best software engineers’ time, and therefore costing the business money.
The solution was software that accelerates defect resolution across all phases of the software development lifecycle by making failures fully reproducible.
The company partnered with the client to make the technology work in the client's randomised workload test system. One of the reasons clients choose this company over the competition is its customer support: the company works to make customers successful, so that they can deliver software at velocity by reducing the time it takes to resolve defects.
Secondly, the innovative software flight recording technology is the only way engineers can get full visibility into what their software did and why. By recording failed processes in test, the product, LiveRecorder, can capture software bugs "in the act", providing engineers with a 100% reproducible test case. There is no other supported technology like this available on the market.
Previously, the engineering team would spend weeks trying to reproduce failing tests.
Now, the test system produces a guaranteed reproducer; engineers can get straight to debugging the recording file.
“With Live Recorder, we were able to dramatically cut down the analysis time that is required to understand the root cause of very complex software defects.”
Chief Development Architect, data management system vendor
The client operates through a distributed engineering team spread across continents. Not all engineers were aware that the technology had been integrated into their systems, so usage was mostly confined to engineers at the location where the integration work took place. Some additional integration work was also needed to make the technology easier to use by meshing more closely with the developers' existing workflows.
The company partnered with the client in a Centre of Excellence services engagement aimed at maximising the value of the technology across the engineering team at large. The company provided on-site advanced training, collated user feedback, and improved the product integration. The two teams also co-produced a series of how-to videos, shared internally to encourage usage.
The engineering team tried analyzing logs and other sophisticated diagnostics from failed runs. Logs helped get a partial picture of what was going on, but did not capture enough of the right information for the engineers to be able to diagnose the root cause of the defects.
The engineering team also tried reproducing failures on live systems. That approach was time-consuming and unproductive.
The client was committed to delivering a reliable data management system their customers can trust; and they were keen to deliver their latest innovation to customers faster.
But that is tough to do if the sophisticated testing systems in place result in a growing backlog of failing tests. Those failures had to be addressed, which risked delaying delivery schedules and costing the business money. High-priority defects can easily cost up to $50K per defect if they are not resolved fast.
The engineering team wanted to stop wasting time on trying to reproduce and debug sporadic issues. The client needed a way to accelerate software defect resolution so that they could release their software changes faster and enhance customer satisfaction.
The engineering team was also keen to find a solution that would help them diagnose the most challenging defects they could not diagnose any other way....
The client was able to significantly accelerate software defect resolution in development and test by eliminating the guesswork in software failure diagnosis. In addition, engineers were able to capture and fix 7 high-priority defects that had been plaguing the system for months.
First, the company confirmed that the technology would work in principle in the client's technical environment. Next, the technology was integrated into the client's basic systems, and the company demonstrated that it could record failing tests and replay the recording files for debugging. Finally, the teams proceeded to a full integration, onboarding, and training phase to ensure that all the engineers working on the data management system in question could use the technology.
The software flight recording technology works on any application built in C/C++, Go, or Rust, on Linux.
In 2020, it will also support Java applications built on Linux.
Integrations & APIs
LiveRecorder integrates either via the command line or by linking with a library and using a simple C API. This gives customers the flexibility they need to integrate it into any test system.
The company deploys its own Professional Services Engineers to make sure clients are successful with the technology; this expertise is not outsourced.