Published: December 10, 2018
Author(s)
Huadong Feng (UTA), Jaganmohan Chandrasekaran (UTA), Yu Lei (UTA), Raghu Kacker (NIST), Richard Kuhn (NIST)
Conference
Name: 2018 IEEE International Conference on Big Data (Big Data)
Dates: December 10-13, 2018
Location: Seattle, Washington, United States
Citation: Proceedings. 2018 IEEE International Conference on Big Data, pp. 221-230
When a failure occurs in a big data application, debugging with the original dataset can be difficult due to the large amount of data being processed. This paper introduces a framework for effectively generating method-level tests to facilitate debugging of big data applications. This is achieved by running a big data application with the original dataset and recording the inputs to a small number of method executions, which we refer to as method-level tests, that preserve certain code coverage, e.g., edge coverage. The inputs of these method-level tests are further reduced if needed, while maintaining code coverage. When debugging, a developer can inspect the execution of these method-level tests instead of the entire program execution with the original dataset, which could be time-consuming. We implemented the framework and applied it to seven algorithms in the WEKA tool. The initial results show that a small number of method-level tests is sufficient to preserve code coverage. Furthermore, these tests could kill between 57.58% and 91.43% of the mutants generated using a mutation testing tool. This suggests that the framework could significantly reduce the effort required for debugging big data applications.
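The coverage-preserving test selection described in the abstract can be illustrated with a small sketch. The paper does not publish its reduction algorithm here, so the following is only a hypothetical greedy selection over recorded method-level tests: each test is modeled as a set of covered edges, and tests are picked until the union of the original coverage is preserved. The `tests` dictionary, edge identifiers, and function name are illustrative assumptions, not the authors' implementation.

```python
def reduce_tests(tests):
    """Greedily select a subset of recorded method-level tests that
    preserves the combined edge coverage of the full set.

    tests: dict mapping a test id to the set of edges it covers
           (hypothetical representation of recorded method executions).
    Returns a list of selected test ids.
    """
    # The coverage target is everything the original test set covers.
    required = set().union(*tests.values()) if tests else set()
    selected = []
    covered = set()
    # Repeatedly pick the test that adds the most not-yet-covered edges.
    while covered != required:
        best = max(tests, key=lambda t: len(tests[t] - covered))
        selected.append(best)
        covered |= tests[best]
    return selected


# Example: three recorded tests; one test already covers all edges,
# so the greedy pass keeps just that single test.
recorded = {"t1": {1, 2}, "t2": {2, 3}, "t3": {1, 2, 3}}
print(reduce_tests(recorded))  # -> ['t3']
```

A greedy set-cover heuristic like this is a common choice for coverage-based test-suite reduction; the paper's actual framework may use a different strategy.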
Keywords
testing; unit testing; big data application testing; test generation; test reduction; debugging; mutation testing