CSIT314 Group Project – Part II

This assignment involves developing an automated testing tool in C++ and is shared here as a C++ assignment sample.

Project title: Developing an automated testing tool

Follow the style of the above paper to develop a software testing tool for any domain (it does not have to be the machine learning domain; for example, it could be the testing of booking.com, a search engine, or a compiler) of your choice using any programming language that is available in the University lab (e.g., C, C++, Java, etc.).

3. Key requirements



You must follow the Test Driven Development methodology.

Your tool must support automated test case generation (at least randomly).

Your tool must support automated execution of the software under test.

Your tool must support automated result checking & test report generation.
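The three tool requirements above (random test case generation, automated execution, and automated checking with a report) can be sketched in a few functions. This is only a minimal illustration under stated assumptions: the software under test is a hypothetical sorting function, and every name below (`generate_test_case`, `sort_under_test`, `buggy_sort`, `oracle`, `run_tests`, `format_report`) is invented for this sketch rather than taken from the assignment.

```python
import random
from collections import Counter


def generate_test_case(rng, max_len=8, lo=-20, hi=20):
    """Randomly generate one test input: a list of integers (possibly empty)."""
    n = rng.randint(0, max_len)
    return [rng.randint(lo, hi) for _ in range(n)]


def sort_under_test(xs):
    """Hypothetical software under test: returns xs in ascending order."""
    return sorted(xs)


def buggy_sort(xs):
    """A seeded fault for demonstration: silently drops duplicate values."""
    return sorted(set(xs))


def oracle(test_input, output):
    """Pass only if output is ordered and is a permutation of the input."""
    ordered = all(a <= b for a, b in zip(output, output[1:]))
    same_items = Counter(test_input) == Counter(output)
    return ordered and same_items


def run_tests(sut, num_tests=100, seed=0):
    """Generate, execute, and check num_tests random test cases."""
    rng = random.Random(seed)
    failures = []
    for index in range(num_tests):
        test_input = generate_test_case(rng)
        try:
            output = sut(list(test_input))
            passed = oracle(test_input, output)
        except Exception as exc:  # a crash counts as a failed test
            output, passed = repr(exc), False
        if not passed:
            failures.append((index, test_input, output))
    return {"total": num_tests, "failures": failures}


def format_report(result):
    """Build a plain-text test report from the result dictionary."""
    failed = len(result["failures"])
    lines = [
        f"Tests run: {result['total']}",
        f"Passed: {result['total'] - failed}",
        f"Failed: {failed}",
    ]
    for index, test_input, output in result["failures"]:
        lines.append(f"  test #{index}: input={test_input} output={output}")
    return "\n".join(lines)
```

Calling `print(format_report(run_tests(buggy_sort)))` would list the inputs on which the seeded fault was caught. Fixing the random seed keeps every run reproducible, which helps when the same suite is re-executed across TDD iterations.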

Marking criteria

Have you correctly followed the Test Driven Development (TDD) methodology? Please show the test data / test suite you designed and executed for each iteration of your TDD process. The quality of test data and appropriate refactoring are important marking criteria.

To what degree can your tool assist with test data generation? 5 marks

To what degree can your tool assist with test executions? 2 marks

To what degree can your tool check the correctness or appropriateness of the test results automatically? That is, the test oracle you designed. 8 marks

Note: A “test oracle” is a mechanism, or a method, with which the tester can decide whether the outcomes of test case executions are correct or acceptable. A test oracle answers the question “how can we know whether the test results are correct or acceptable?” Your automated testing tool must implement an oracle in order to decide whether the test has passed or failed.
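As one possible illustration of the note above, an oracle does not need a second, trusted implementation: it can check a property that every correct output must satisfy. The sketch below assumes a hypothetical integer square root routine as the software under test; `isqrt_under_test`, `faulty_isqrt`, `oracle`, and `find_failures` are names made up for this example.

```python
import math


def isqrt_under_test(n):
    """Stand-in for the software under test (assumption): floor of sqrt(n)."""
    return math.isqrt(n)


def faulty_isqrt(n):
    """A deliberately wrong version, used to show the oracle rejecting output."""
    return math.isqrt(n) + 1


def oracle(n, r):
    """Property-based oracle: r is correct iff r*r <= n < (r+1)*(r+1)."""
    return r >= 0 and r * r <= n < (r + 1) * (r + 1)


def find_failures(sut, inputs):
    """Execute sut on each input and return the (input, output) pairs that fail."""
    failures = []
    for n in inputs:
        r = sut(n)
        if not oracle(n, r):
            failures.append((n, r))
    return failures
```

Here `find_failures(isqrt_under_test, range(1000))` is expected to come back empty, while the faulty version fails on every input. The verdict comes entirely from the stated property, so the oracle stays valid even when no reference implementation is available.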

A Testing Tool for Machine Learning Applications

Yelin Liu
School of Computing and IT
University of Wollongong
Wollongong, NSW 2522, Australia
yl908@uowmail.edu.au

Yang Liu
Booking.com B.V.
Herengracht 597, 1017 CE Amsterdam
Netherlands
flint.liu@booking.com

Tsong Yueh Chen
Department of Computer Science & Software Engineering
Swinburne University of Technology
Hawthorn, VIC 3122, Australia
tychen@swin.edu.au

Zhi Quan Zhou∗
School of Computing and IT
University of Wollongong
Wollongong, NSW 2522, Australia
zhiquan@uow.edu.au

ABSTRACT

We present the design of MTKeras, a generic metamorphic testing framework for machine learning, and demonstrate its effectiveness through case studies in image classification and sentiment analysis.

KEYWORDS

Metamorphic testing, metamorphic relation pattern, MR composition, oracle problem, neural network API, Keras, MTKeras

INTRODUCTION

Researchers have applied metamorphic testing (MT) to test machine learning (ML) systems in specific domains such as computer vision, machine translation, and autonomous systems [8, 9]. Nevertheless, the current practice of applying MT to ML is still at an early stage. In particular, the identification of metamorphic relations (MRs) is still largely a manual process, not to mention the implementation (coding) of MRs into test drivers. MRs are the most important component of MT, referring to the expected relations among the inputs and outputs of multiple executions of the target program [3]. It has been observed that MRs identified for different application domains often share similar viewpoints, hence the introduction of the concept of metamorphic relation patterns (MRPs) [5, 9]. For example, “equivalence under geometric transformation” is an MRP that can be used to derive a concrete MR for the time series analysis domain and another concrete MR for the autonomous driving domain [9].

In this research, therefore, we ask the following research questions. RQ1: Can we develop a generic, domain-independent automated metamorphic testing framework to allow developers and testers of ML systems to define their own MRs? Here, “define” means “identify and implement.” RQ2: What is the applicability and effectiveness of our solution? To address RQ1, we have developed and open-sourced the first version of an automated metamorphic testing framework named MTKeras, which allows the users to define their own MRs based on a prescribed collection of operators. We have also conducted preliminary case studies to investigate RQ2.

ML platforms and libraries, such as TensorFlow and Theano, are now widely available to allow users to develop and train their own ML models. We have built our MT framework, MTKeras, on the Keras platform. Keras (https://keras.io) is a popular high-level neural networks API, developed in Python and working on top of low-level libraries; backend engines such as TensorFlow and Theano can be plugged seamlessly into Keras.

The Keras API empowers users to configure and train a neural network model based on datasets for various tasks such as image classification or sentiment analysis. MTKeras enables automated metamorphic testing by providing the users with an MR library for testing their ML models and applications. We have designed the MR library based on the concept of a hierarchical structure (levels of abstractions) of MRPs [9]. MTKeras also allows the users to define and run new MRs through the composition of multiple MRs. The source test cases are provided by the users whereas follow-up test cases are generated by MTKeras. MR-violation tests are automatically recorded during testing.
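The core idea described above, checking an expected relation between the outputs of a source test case and a generated follow-up test case, can be sketched in plain Python without any ML library. Everything here is a hypothetical illustration: the two toy classifiers, the word lists, and the MR itself ("reversing the word order leaves the predicted label unchanged") are assumptions made for this sketch and are not taken from MTKeras or its case studies.

```python
def metamorphic_test(program, source_cases, make_followup, relation):
    """Apply one MR to every source case and return the violating cases."""
    violations = []
    for source in source_cases:
        followup = make_followup(source)           # follow-up test case
        src_out, fol_out = program(source), program(followup)
        if not relation(src_out, fol_out):         # expected output relation
            violations.append((source, followup, src_out, fol_out))
    return violations


def reverse_words(text):
    """Input transformation: a fixed permutation (reversal) of the words."""
    return " ".join(reversed(text.split()))


def outputs_equal(a, b):
    """Output relation: source and follow-up outputs must be identical."""
    return a == b


POSITIVE = {"good", "great", "fun"}
NEGATIVE = {"bad", "boring", "dull"}


def bag_of_words_label(text):
    """Toy classifier that ignores word order: 1 = positive, 0 = negative."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return 1 if score >= 0 else 0


def last_word_label(text):
    """Toy order-sensitive classifier: the last sentiment word decides."""
    label = 1
    for w in text.lower().split():
        if w in POSITIVE:
            label = 1
        elif w in NEGATIVE:
            label = 0
    return label


REVIEWS = ["good movie but bad ending", "boring and dull", "great fun"]
```

With these definitions, `metamorphic_test(bag_of_words_label, REVIEWS, reverse_words, outputs_equal)` reports no violations, whereas the order-sensitive classifier violates the relation on the first review. No expected label is needed for any review, which is what makes an MR usable as a test oracle.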

The design of MTKeras is centered around two basic concepts: metamorphic relation input patterns (MRIPs) [9] and metamorphic relation output patterns (MROPs) [5], which describe the relations among the source and follow-up inputs and outputs, respectively. Both MRIPs and MROPs can have multiple levels of abstractions. Examples of MRIPs include replace (changing the value of part of the input to another value; cf. MR_replace of [ ]), noise (adding noise to the input data; cf. [ ]), additive and multiplicative (modifying the input by addition and multiplication, respectively; cf. “metamorphic properties” defined by Murphy et al. [4]). Examples of MROPs include subsume/subset [5, 10], equivalent and equal [5]. MTKeras is extendable as it allows a user to plug in new MRIPs and MROPs and configure them into concrete MRs. We have implemented it as a Python package for ease of use and open-sourced it at GitHub. The user can perform MT in a simple and intuitive way by writing a single line of code in the following format:

Mtkeras(<sourceTestSet>,<dataType>[,<modelName>]).<MRIPs>[.<MROP>]

where sourceTestSet points to the place where the source test cases are stored; dataType declares the type of each element (test case) of sourceTestSet (e.g., grayscaleImage, colorImage, text, etc.); modelName (optional) gives the name of the ML model under test; <MRIPs> represents an MRIP or a sequence of MRIPs; and “[.<MROP>]” represents an optional MROP. Note that modelName and MROP always go together: they are either both present or both absent. For example, when testing an image classification model, we could write:

Mtkeras(myTestSet,colorImage,myDNNModel).noise().fliph().equal()

which tells MTKeras to use “myTestSet” (an array name) as the set of source test cases, where each test case is a color image, to generate follow-up test cases by first adding a noise point to each image and then horizontally flipping it. The name of the ML model under test is “myDNNModel.” The last term, equal(), tells MTKeras to check whether the classification results for the source and follow-up test cases are the same. MTKeras then performs MT automatically and identifies all the violating cases. Mtkeras returns an object and the violating cases are stored in its variable named “violatingCases.” Note that the model name “myDNNModel” and the MROP “equal()” are optional, without which MTKeras will return a set of follow-up test cases without further tests. The user can then use this set of test cases for various purposes, including but not limited to MT (such as for data augmentation).

The second case study applies MTKeras to test four different types of ML models (CNN, RCNN, FastText, and LSTM [1]) trained on an IMDB sentiment classification dataset, a collection of movie reviews labeled by “1” for positive and “0” for negative feelings. We define MR3 as follows: Randomly shuffling (permuting) the words in each movie review shall dramatically reduce the accuracy of the ML models. The permutative MRIP is very popular in MT practice (cf. [4]). The validity of MR3 is obvious as shuffling the words makes the sentence meaningless. The experimental results, however, are surprising. The results of 100 MT experiments show that shuffling the words only decreases the accuracy by a very small degree (RNN: around 4%, CNN: around 7%, FastText: 0%, LSTM: around 3.5%), indicating that the ML models under test are insensitive to word orders. This case study shows that MRs can help to enhance system understanding, confirming our previous report [9].

4 CONCLUSIONS AND FUTURE WORK

We have presented the design of MTKeras, a generic metamorphic testing framework on top of the Keras machine learning platform, and demonstrated its applicability and problem-detection effectiveness through case studies in two very different problem domains: image classification and sentiment analysis. We have shown that the composition of MRs can greatly improve the problem-detection effectiveness of individual MRs, and that MRs can help to enhance the understanding of the underlying ML models. This work demonstrates the usefulness of metamorphic relation patterns. We have open-sourced MTKeras at GitHub. Future research will include an investigation of the time cost associated with the learning curve for a novice tester to use the tool as well as further extensions and larger-scale case studies of the framework.

ICSEW’20, May 23–29, 2020, Seoul, Republic of Korea

∗ All correspondence should be addressed to Dr. Z. Q. Zhou.
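To make the chained call format more concrete, the following is a rough sketch of how such a fluent interface could be built. It is not the MTKeras implementation: `MiniMT` is a hypothetical stand-in that only mimics the shape of `Mtkeras(...).noise().fliph().equal()`, images are assumed to be grayscale nested lists rather than Keras tensors, and the two "models" are toy functions invented for the example.

```python
import copy
import random


class MiniMT:
    """Hypothetical stand-in mimicking the chained call format; not MTKeras."""

    def __init__(self, source_test_set, data_type, model=None, seed=0):
        self.source = source_test_set
        self.data_type = data_type        # kept only to mirror the call format
        self.model = model                # optional model under test
        self.followup = copy.deepcopy(source_test_set)
        self.violatingCases = []
        self._rng = random.Random(seed)

    def noise(self):
        """MRIP: set one randomly chosen pixel of each image to 255."""
        for img in self.followup:
            r = self._rng.randrange(len(img))
            c = self._rng.randrange(len(img[0]))
            img[r][c] = 255
        return self

    def fliph(self):
        """MRIP: flip each image horizontally by reversing every row."""
        self.followup = [[row[::-1] for row in img] for img in self.followup]
        return self

    def equal(self):
        """MROP: outputs for source and follow-up must match; record violations."""
        if self.model is None:
            raise ValueError("equal() requires a model under test")
        for src, fol in zip(self.source, self.followup):
            if self.model(src) != self.model(fol):
                self.violatingCases.append((src, fol))
        return self


def is_bright(img):
    """Toy model: True if the mean pixel value exceeds 50."""
    total = sum(sum(row) for row in img)
    return total / (len(img) * len(img[0])) > 50


def brighter_on_left(img):
    """Toy model that depends on orientation: compares left and right halves."""
    half = len(img[0]) // 2
    left = sum(sum(row[:half]) for row in img)
    right = sum(sum(row[half:]) for row in img)
    return left > right


# Two 4x4 source images: one bright on the left, one bright on the right.
IMAGES = [
    [[200, 200, 0, 0] for _ in range(4)],
    [[0, 0, 200, 200] for _ in range(4)],
]
```

In this sketch, `MiniMT(IMAGES, "grayscaleImage", is_bright).noise().fliph().equal().violatingCases` stays empty, while swapping in `brighter_on_left` records both images as violations. Returning `self` from each method is what allows MRIPs to be composed in sequence, and omitting the model simply leaves the generated follow-up set in `followup`.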
