Automated Diagnosis of Product-line Configuration Errors in Feature Models∗

Jules White and Douglas C. Schmidt
Vanderbilt University, Department of Electrical Engineering and Computer Science
Box 1679 Station B, Nashville, TN 37235, USA
Email: {jules, schmidt}@dre.vanderbilt.edu

David Benavides, Pablo Trinidad, and Antonio Ruiz–Cortés
University of Seville, Department of Computer Languages and Systems
ETSII - Av. de la Reina Mercedes S/N, 41012 Seville, Spain
Email: {benavides, ptrinidad, aruiz}@us.es

Abstract

Feature models are widely used to model software product-line (SPL) variability. SPL variants are configured by selecting feature sets that satisfy feature model constraints. Configuration of large feature models involves multiple stages and participants, which makes it hard to avoid conflicts and errors. New techniques are therefore needed to debug invalid configurations and derive the minimal set of changes to fix flawed configurations.

This paper provides three contributions to debugging feature model configurations: (1) we present a technique for transforming a flawed feature model configuration into a Constraint Satisfaction Problem (CSP) and show how a constraint solver can derive the minimal set of feature selection changes to fix an invalid configuration, (2) we show how this diagnosis CSP can automatically resolve conflicts between configuration participant decisions, and (3) we present experiment results that evaluate our technique. These results show how our technique scales to models with over 5,000 features, which is beyond the size used to validate other automated techniques.

∗This work has been partially supported by the European Commission (FEDER) and Spanish Government under CICYT project Web–Factories (TIN2006-00472).

1 Introduction

Current trends and challenges. Software Product-Lines (SPLs) are a technique for creating software applications composed from reusable parts that can be re-targeted for different requirement sets. For example, in the automotive domain, an SPL can be created that allows a car's software to provide Anti-lock Braking System (ABS) capabilities or simply standard braking. Each unique configuration of an SPL is called a variant.

SPL variants cannot be constructed arbitrarily. For example, a car cannot have both ABS and standard braking software controllers. A key step in building an SPL, therefore, is creating a model of the SPL's variability and the constraints on variant configuration. An effective technique for capturing these configuration constraints is feature modeling [11].

Feature models document SPL variability and configuration rules through the notion of features. Each feature represents a point of variability in the SPL, such as the type of braking system. A feature model can capture different types of variability, ranging from SPL variability (variations in customer or market requirements) to software variability (variations in software implementation) [13].

SPL variants can be specified as a selection or configuration of features. Feature models of these SPL variants are arranged in a tree-like structure where each successively deeper level in the tree corresponds to a more fine-grained configuration option for the product-line variant, as shown by the feature model in Figure 1. The parent-child and cross-tree relationships capture the constraints that must be adhered to when selecting a group of features for a variant.

Existing research has focused on ensuring that the features chosen from feature models are correct and consistent with the SPL and variant requirements. For example, work has been done on automating the process of deriving a set of features that meet a requirement set using boolean circuit satisfiability techniques [12] or Constraint Satisfaction Problems (CSPs) [4]. Moreover, numerous tools have been developed, such as Big Lever Software Gears [6], Pure::variants [5], and the Feature Model Plug-in [7], to support the construction of feature models and correct selection of feature configurations. Yet other research has focused on performing some of the configuration implementation work by using model-driven development [17].

Regardless of what tools and processes are used to configure SPL variants, however, there is always the possibility that mistakes will occur. For example, large SPLs often use staged configuration [8, 9], where features are selected in multiple stages to form a complete configuration iteratively, rather than choosing all features at once. At a late stage in the configuration process, developers may realize that a critically needed feature cannot be selected due to one or more decisions made in previous stages. It is hard to debug a configuration to figure out how to change decisions in previous stages to make the critical feature selectable.

Another challenging situation can arise when multiple participants are involved in the feature selection process and their desired feature selections conflict. For example, hardware developers for an automobile may desire a lower cost set of Electronic Control Units (ECUs) that cannot support the features needed by the software developer's embedded controller code. In these situations, methods are needed to evaluate and debug conflicts between participants. Methods are also needed to recommend modifications to the participants' feature selections to make them compatible.

Although prior research has shown how to identify flawed configurations [3, 12], conventional debugging mechanisms cannot pinpoint configuration errors and identify corrective actions. More specifically, techniques are lacking that can take an arbitrary flawed configuration and produce the minimal set of feature selections and deselections to bring the configuration to a valid state. This paper focuses on addressing these gaps in existing research.

Solution approach → Constraint-based feature configuration diagnosis. Our approach to debugging feature model configurations transforms an invalid feature model configuration into a Constraint Satisfaction Problem (CSP) [18] and then uses a constraint solver to derive the minimal set of feature selection modifications that will bring the configuration to a valid state. This paper shows how this CSP-based diagnostic technique provides the following contributions to work on debugging errors in feature model configurations:

1. We provide a CSP-based [18] diagnostic technique, inspired by the diagnostic framework in [16], that can pinpoint conflicts and constraint violations in feature models

2. We show how this CSP diagnostic technique can be used to remedy a configuration error by automatically deriving the minimal set of features to select and deselect

3. We provide mechanisms for using this CSP technique to create cost-optimal mediation of conflicting configuration participant feature selection needs

4. We show how our CSP diagnostic technique allows stakeholders to debug a configuration error or conflict from different viewpoints

5. We provide empirical results showing that this CSP-based diagnostic technique is scalable enough to support industrial SPL feature models containing over 5,000 features.

The remainder of the paper is organized as follows: Section 2 describes the challenges of diagnosing configuration errors and conflicts in SPLs; Section 3 presents our CSP-based technique for diagnosing configuration errors and conflicts; Section 4 shows how our CSP-based technique can be extended to support conflict mediation, multi-viewpoint debugging, and faster diagnosis times; Section 5 presents empirical results demonstrating the ability of our technique to scale to feature models with thousands of features; Section 6 compares our work with related research; and Section 7 presents concluding remarks.

2 Challenges of Debugging Feature Model Configurations

This section evaluates different challenges that arise in realistic configuration scenarios; Section 3 describes our solutions to these challenges.

2.1 Challenge 1: Staged Configuration Errors

Staged configuration is a configuration process whereby developers iteratively select features to reduce the variability in a feature model until a variant is constructed. Czarnecki et al. [8, 9] use the context of software supply chains for embedded software in automobiles to demonstrate the need for staged configuration. In the first stage, software vendors provide software components that can be provided in different configurations to actuate brakes, control infotainment systems, etc. In the second stage, hardware vendors of the Electronic Control Units (ECUs) the software runs on must provide ECUs with the correct features and configuration to support the software components selected in the first stage.

The challenge with staged configuration is that feature selection decisions made at some point in time T have ramifications on the decisions made at all points in time T′ > T. For example, it is possible for software vendors to choose a set of software component features for which there are no valid ECU configurations in the second configuration stage. Identifying the fewest number of configuration modifications to remedy the error is hard because there can be significant distance between T and T′.

This challenge also appears in larger models, such as those for software to control the automation of continuous casting in steel manufacture [14]. In large-scale models, configuration mimics staged configuration since developers cannot immediately understand the ramifications of their current decisions. At some later decision point, critical features that developers need may no longer be selectable due to some previous choice. Again, it is hard to identify the minimal set of configuration decisions to reverse in this scenario. Section 4.2 describes how we address this challenge using the pre-assignment of some variables with our CSP-based diagnosis technique.

2.2 Challenge 2: Mediating Conflicts

In many situations the desired features and needs of multiple stakeholders involved in configuring an SPL variant may conflict. For example, when configuring automotive systems, software developers may want a series of software component configurations that cannot be supported by the ECU configurations proposed by the hardware developers. To each party, their individual needs are critical and finding the middle ground to integrate the two is hard.

Another conflict scenario arises when configuration decisions made for an SPL variant must be reconciled with the constraints of the legacy environment in which it will run. For example, when configuring automotive software for the next year's model of a car, a variant may initially be configured to provide the features most sought after by customers, such as a complex infotainment system. Typically, new model cars are not complete redesigns and thus developers have to figure out how to run the new software configuration on the existing ECU configuration of the previous year's model. If the new software configuration is not compatible with the legacy ECU configuration, developers need a way to derive the lowest cost set of modifications to either the new software or the legacy ECU configuration, which is hard.

Section 4.3 describes how we address this challenge by diagnosing the superset of the stakeholders' desired feature selections and leveraging an alternate optimization goal for the CSP-based diagnosis technique.

2.3 Challenge 3: Viewpoint-dependent Errors

The feature labeled as the source of an error in a feature model configuration may vary depending on the viewpoint used to debug it. In the feature model shown in Figure 1, for example, if a configuration is created that includes both Non-ABS Controller and 1 Mbit/s CAN Bus, either feature can be viewed as the feature that is the source of the error. If we debug the configuration from the viewpoint that software trumps ECU hardware decisions, then the 1 Mbit/s CAN Bus feature is the error. If we assume that ECU decisions precede software, however, then the Non-ABS Controller feature is the error.

Figure 1: Simple Feature Model for an Automobile

A feature model may therefore require debugging from multiple viewpoints since the diagnosis of the feature that is the source of an error in a feature model depends on the viewpoint used to debug it. For small feature models, debugging from different points of view may be relatively simple. When feature models contain hundreds or thousands of features, the complexity of diagnosing a configuration from multiple viewpoints increases greatly. Section 4.2 describes how we address this challenge by specifying feature selections that cannot be modified by the solver during the diagnosis.

3 Configuration Error Diagnosis

The solution we propose is a technique for creating automated SPL variant diagnosis tools. Developers can use these tools to identify the minimal set of features that should be selected or deselected to transform an invalid configuration into a valid configuration. Moreover, depending on the input provided to the diagnosis tools, a flawed configuration can be debugged from different viewpoints, or conflicts between multiple stakeholder decisions in a configuration process can be mediated.


The key component of our automated SPL variant conflict diagnosis technique is the application of a CSP-based error diagnostic technique. In prior work, Benavides et al. [4] have shown how feature models can be transformed into CSPs to automate feature selection with a constraint solver [10]. Trinidad et al. [16] subsequently showed how to extend this CSP technique to identify full mandatory features, void features, and dead feature models using Reiter's theory of diagnosis [15]. This section presents an alternate diagnostic model for deriving the minimum set of features that should be selected or deselected to eliminate a conflict in a feature configuration.

Background: Feature Models and Configurations as CSPs. A CSP is a set of variables and a set of constraints over those variables. For example, A + B ≤ 3 is a CSP involving the integer variables A and B. The goal of a constraint solver is to find a valid labeling (set of variable values) that simultaneously satisfies all constraints in the CSP. (A = 1, B = 2) is thus a valid labeling of the CSP.

To build the CSP for the error diagnosis technique, we construct a set of variables, F, representing the features in the feature model. Each configuration of the feature model is a set of values for these variables, where a value of 1 indicates the feature is present in the configuration and a value of 0 indicates it is not present. More formally, a configuration is a labeling of F such that, for each variable fi ∈ F, fi = 1 indicates that the ith feature in the feature model is selected in the configuration. Correspondingly, fi = 0 implies that the feature is not selected.

Given an arbitrary configuration of a feature model as a labeling of the F variables, developers need the ability to ensure the correctness of the configuration. To achieve this constraint checking ability, each variable fi is associated with one or more constraints corresponding to the configuration rules in the feature model. For example, if fj is a required subfeature of fi, then the CSP would contain the constraint: fi = 1 ⇔ fj = 1.

Configuration rules from the feature model are captured in the constraint set C. For any given feature model configuration described by a labeling of F, the correctness of the configuration can be determined by seeing if the labeling satisfies all constraints in C. The steps of transforming a feature model to a CSP are described in [4].
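As an aside not in the original paper, the constraint set C for the small automotive model can be written down directly; the following Python fragment is a minimal sketch of that encoding, using the eight rules listed in Table 1 (the dictionary-based representation and the helper name is_valid are our own assumptions, not the GEMS/Choco implementation).

```python
# Minimal sketch (assumed encoding): the eight feature-model rules of Table 1,
# written as Boolean checks over a 0/1 labeling of f1..f7 (f7 is the
# 250 Kbit/s CAN Bus feature from Figure 2).

def is_valid(f):
    """f maps feature index 1..7 to 0 or 1; returns True iff all rules in C hold."""
    imp = lambda a, b: (not a) or bool(b)   # logical implication
    xor = lambda a, b: bool(a) != bool(b)   # exclusive or
    rules = [
        f[1] == f[2],                 # f1 = 1 <=> f2 = 1 (mandatory subfeature)
        f[1] == f[5],                 # f1 = 1 <=> f5 = 1
        imp(f[2], xor(f[3], f[4])),   # f2 = 1 => f3 xor f4 (XOR group)
        imp(f[5], xor(f[6], f[7])),   # f5 = 1 => f6 xor f7 (XOR group)
        imp(f[6] or f[7], f[5]),      # selecting a child requires its parent
        imp(f[3] or f[4], f[2]),
        imp(f[3], f[6]),              # cross-tree constraint
        imp(f[4], f[7]),              # cross-tree constraint
    ]
    return all(rules)

# The invalid configuration from Table 1 (features 1, 2, 4, 5, 6 selected):
current = {1: 1, 2: 1, 3: 0, 4: 1, 5: 1, 6: 1, 7: 0}
print(is_valid(current))  # False: the cross-tree rule f4 = 1 => f7 = 1 is violated
```

Running the check on the invalid configuration from Table 1 returns False, which is exactly the situation the diagnostic CSP in Section 3.1 is designed to repair.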

3.1 Configuration Diagnostic CSP

When diagnosing configuration conflicts, developers need a list of features that should be selected or deselected to make an invalid configuration a valid configuration. The output of our CSP diagnostic architecture is this list of features to select and deselect, as shown in Figure 2.

Figure 2: Diagnostic Technique Architecture

In Step 1 of Figure 2, the rules of the feature model and the current invalid configuration are transformed into a CSP. For example, o1 = 1 because the Automobile feature is selected in the current invalid configuration. In Step 2, the solver derives a labeling of the diagnostic CSP. Step 3 takes the output of the CSP labeling and transforms it into a series of recommendations of features to select or deselect to turn the invalid configuration into a valid configuration. Finally, in Step 4, the recommendations are applied to the invalid configuration to create a valid configuration where each variable fi equals 1 if the corresponding feature is selected in the new and valid configuration. For example, f7 = 1, meaning that the 250 Kbit/s CAN Bus is selected in the new valid configuration.

To enable the constraint solver to recommend features to select and deselect, two new sets of recommendation variables, S and D, are introduced to capture the features that need to be selected and deselected, respectively, to reach a valid configuration. For example, a value of 1 for variable si ∈ S indicates that the feature fi should be added to the current configuration. Similarly, di = 1 implies that the feature fi should be removed from the configuration.

Thus, for each feature fi ∈ F, there are variables si ∈ S and di ∈ D. After the diagnosis CSP is labeled, the values of S and D serve as the output recommendations to the user as to what features to add to or remove from the current configuration, as shown in Table 1.

Table 1 shows the complete inputs and outputs used to diagnose the invalid configuration scenario shown in Figure 2.


Variable Explanations:
  fi ∈ F: feature variables for the valid configuration that will be transitioned to
  oi ∈ O: the features selected (oi = 1) in the current invalid configuration
  si ∈ S: features to select (si = 1) to reach the valid configuration
  di ∈ D: features to deselect (di = 1) to reach the valid configuration

Inputs:
  Current Config.: o1 = 1, o2 = 1, o3 = 0, o4 = 1, o5 = 1, o6 = 1, o7 = 0
  Feature Model Rules: f1 = 1 ⇔ (f2 = 1), f1 = 1 ⇔ (f5 = 1), f2 = 1 ⇒ (f3 = 1) ⊕ (f4 = 1), f5 = 1 ⇒ (f6 = 1) ⊕ (f7 = 1), (f6 = 1) ∨ (f7 = 1) ⇒ (f5 = 1), (f3 = 1) ∨ (f4 = 1) ⇒ (f2 = 1), f3 = 1 ⇒ (f6 = 1), f4 = 1 ⇒ (f7 = 1)
  Diagnostic Rules: for each fi ∈ F: (fi = 1) ⇒ (oi = 1 ⊕ si = 1) ∧ (di = 0), (fi = 0) ⇒ (oi = 0 ⊕ di = 1) ∧ (si = 0)

Outputs:
  Features to Select: s1 = 0, s2 = 0, s3 = 0, s4 = 0, s5 = 0, s6 = 0, s7 = 1
  Features to Deselect: d1 = 0, d2 = 0, d3 = 0, d4 = 0, d5 = 0, d6 = 1, d7 = 0
  New Valid Config.: f1 = 1, f2 = 1, f3 = 0, f4 = 1, f5 = 1, f6 = 0, f7 = 1

Table 1: Diagnostic CSP Construction

The next step is to allow developers to input their current configuration into the solver for diagnosis. Rather than directly setting values for the variables in F, developers use a special set of input variables called the observations. The observations are contained in the set of variables O. For each feature fi present in the current flawed configuration, oi = 1. Similarly, if fi is not selected in the current invalid configuration, oi = 0. As shown in Table 1, the observations capture the current invalid configuration that is provided as input to the solver.

To diagnose the CSP, we want to find an alternate but valid configuration of the feature model and suggest a series of changes to the current invalid configuration to reach the valid configuration. A valid configuration is a labeling of the variables in F (a configuration) such that all of the feature model constraints are satisfied. For each variable fi, the value should be 1 if the feature is present in the new valid configuration that will be transitioned to. If a feature is not in the new configuration, fi should equal 0. We always require f1 = 1 to ensure that the root feature is always selected. It also must be noted that the technique only works with non-void feature models.

If feature fi is included in the current configuration (oi = 1), but must be removed to reach the new valid configuration, we want the solver to recommend that it be deselected (di = 1). Moreover, the solver should recommend that a feature fi be selected (si = 1) if it is not currently selected (oi = 0), but needs to be selected to reach a correct configuration. If a feature is not in the current configuration (oi = 0) and is not needed to reach a correct state, then the solver should not recommend any changes to it (si = 0 and di = 0). To produce this desired behavior, we introduce two new constraints for each variable fi ∈ F: (fi = 1) ⇒ (oi = 1 ⊕ si = 1) ∧ (di = 0) and (fi = 0) ⇒ (oi = 0 ⊕ di = 1) ∧ (si = 0). The symbol ⊕ represents exclusive or (XOR). These new constraints form the diagnostic rules, as shown in Table 1.

The behavior produced by these new constraints covers the following four cases that a feature modification can fall into (the sketch after this list walks through them):

1. A feature is selected and does not need to be deselected. If the ith feature is in the current invalid configuration (oi = 1) and also in the new valid configuration (fi = 1), no changes need be made to it (si = 0, di = 0), which satisfies (fi = 1) ⇒ (oi = 1 ⊕ si = 1) ∧ (di = 0)

2. A feature is selected and needs to be deselected. If the ith feature is in the current invalid configuration (oi = 1) but not in the new valid configuration (fi = 0), it must be deselected (di = 1), which satisfies (fi = 0) ⇒ (oi = 0 ⊕ di = 1) ∧ (si = 0)

3. A feature is not selected and does not need to be selected. If the ith feature is not in the current invalid configuration (oi = 0) and is also not needed in the new configuration (fi = 0), it should remain unchanged (si = 0, di = 0), which satisfies (fi = 0) ⇒ (oi = 0 ⊕ di = 1) ∧ (si = 0)

4. A feature is not selected and needs to be selected. If the ith feature is not selected in the current invalid configuration (oi = 0) but is present in the new correct configuration (fi = 1), it must be selected (si = 1), which satisfies (fi = 1) ⇒ (oi = 1 ⊕ si = 1) ∧ (di = 0)
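To make the four cases concrete, the short check below (an illustrative sketch, not taken from the paper's implementation; the helper name is hypothetical) confirms that, for every combination of fi and oi, the only si and di values admitted by the two diagnostic implications are the ones listed above.

```python
# Illustrative sketch: the two diagnostic implications admit exactly one (si, di)
# pair for each combination of fi and oi -- select when fi=1 and oi=0, deselect
# when fi=0 and oi=1, and change nothing otherwise.

def diagnostic_rules_hold(fi, oi, si, di):
    xor = lambda a, b: bool(a) != bool(b)
    rule_if_selected = (fi != 1) or (xor(oi == 1, si == 1) and di == 0)
    rule_if_deselected = (fi != 0) or (xor(oi == 0, di == 1) and si == 0)
    return rule_if_selected and rule_if_deselected

for fi in (0, 1):
    for oi in (0, 1):
        admitted = [(si, di) for si in (0, 1) for di in (0, 1)
                    if diagnostic_rules_hold(fi, oi, si, di)]
        print(f"fi={fi}, oi={oi} -> admitted (si, di): {admitted}")
```

Each case prints exactly one admitted (si, di) pair, matching the four cases above.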

3.2 Optimal Diagnosis Method

The next step in the diagnosis process is to use the solver to find a labeling of the variables and produce a series of recommendations. For any given configuration with a conflict, there may be multiple possible ways to eliminate the problem. For example, in the automotive example from Section 2.3, the valid corrective actions were to either remove the 1 Mbit/s CAN Bus and select the 250 Kbit/s CAN Bus or to remove the Non-ABS Controller and select the ABS Controller. We therefore need to tell the solver how to select which of the (many) possible corrective solutions to suggest to developers.

The most basic suggestion selection criterion that developers can use to guide the solver's diagnosis is to tell it to minimize the number of changes to make to the current configuration, i.e., developers prefer suggestions that require changing as few things as possible in the current invalid configuration. To achieve this, we solve for a labeling of the CSP that minimizes the sum of the variables in S ∪ D. The sum of S ∪ D is the total number of changes that the solution requires the developer to make. By minimizing this sum, therefore, we minimize the total number of required changes.

Each labeling of the diagnostic CSP will produce two sets of features corresponding to the features that should be selected (S) and deselected (D) to reach the new valid configuration. Furthermore, the labeling also causes the solver to backtrack and create new values for F corresponding to the proposed solution to transition to. Thus, developers can cycle through the different potential labelings of the diagnostic CSP to evaluate potential remedies and the configurations they will produce.

Table 1 shows a complete set of inputs and output suggestions for diagnosing the automotive software example from Section 2.3. If there are multiple labelings of the CSP, initially only one will be returned. After the first solution has been found, however, the solver can much more efficiently cycle through the other equally ranked sets of corrective suggestions.
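For illustration only (the paper uses a constraint solver; brute-force enumeration is exponential and works only for tiny models), the optimization can be spelled out by enumerating every valid labeling of F, deriving S and D from the diagnostic rules, and keeping the labelings with the smallest change count. All names below are our own assumptions.

```python
from itertools import product

def is_valid(f):  # the eight feature-model rules of Table 1, f = (f1, ..., f7)
    imp = lambda a, b: (not a) or bool(b)
    xor = lambda a, b: bool(a) != bool(b)
    f1, f2, f3, f4, f5, f6, f7 = f
    return all([f1 == f2, f1 == f5,
                imp(f2, xor(f3, f4)), imp(f5, xor(f6, f7)),
                imp(f6 or f7, f5), imp(f3 or f4, f2),
                imp(f3, f6), imp(f4, f7)])

o = (1, 1, 0, 1, 1, 1, 0)   # observations: the current invalid configuration

diagnoses = []
for f in product((0, 1), repeat=7):
    if f[0] != 1 or not is_valid(f):          # root stays selected; labeling must satisfy C
        continue
    s = tuple(int(fi == 1 and oi == 0) for fi, oi in zip(f, o))   # features to select
    d = tuple(int(fi == 0 and oi == 1) for fi, oi in zip(f, o))   # features to deselect
    diagnoses.append((sum(s) + sum(d), s, d, f))

best = min(cost for cost, _, _, _ in diagnoses)
for cost, s, d, f in sorted(diagnoses):
    if cost == best:
        print(f"{cost} changes: select {s}, deselect {d} -> new config {f}")
```

With these inputs both minimal repairs require two changes, matching the two corrective actions discussed at the start of this subsection; the Table 1 output (s7 = 1, d6 = 1) is one of them.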

4 Solution Extensibility and Benefits

This section presents different benefits of our CSP-based diagnostic approach and possible ways of extending it.

4.1 Bounding Diagnostic Method

For extremely large feature models, finding the optimal number of changes may not be possible due to time constraints. In these cases, an alternate, more scalable approach is to attempt to find any suggestion that requires fewer than K changes or has a cost less than K. Rather than directly asking for an optimal answer, we instead add the constraint ∑(si + di) ≤ K (summing over all i = 1..n) to the CSP and ask the solver for any solution.

The sum of all variables si ∈ S and di ∈ D represents the total number of feature selections and deselections that need to be made to reach the new valid configuration. Therefore, the sum of both of these sets is the total number of modifications that must be made to the original invalid configuration. The new constraint, ∑(si + di) ≤ K, ensures that the solver only accepts diagnosis solutions that require the developer to make K or fewer changes to the invalid configuration.

The solver is asked for any answer that meets the new constraints. In return, the solver will provide a solution that is not necessarily perfect, but which fits our tolerance for change. If no solution is found, we can increment K by a factor and re-invoke the solver or reassess our requirements. As is shown in Section 5.4, searching for a bounded solution rather than an optimal solution is significantly faster.

If the solver cannot find a diagnosis that makes fewer than K modifications, it will state that there is no valid solution that fits a K-change budget. From the experiments performed in Section 5, we found that in these cases where no solution can be found, the solver was able to diagnose the model much faster than if there was a viable solution under K. Furthermore, the more cross-tree constraints there were within the feature model, the faster the solver was able to determine that there was no solution within the bound K.
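In the same brute-force style as the earlier sketches (purely illustrative; the real implementation hands the bound to the constraint solver as the constraint ∑(si + di) ≤ K), the bounded variant returns the first acceptable labeling instead of searching for the optimum. Function and model names are hypothetical.

```python
from itertools import product

def bounded_diagnosis(is_valid, o, k):
    """Return the first (s, d, f) with at most k changes, or None if the budget is too tight."""
    for f in product((0, 1), repeat=len(o)):
        if f[0] != 1 or not is_valid(f):
            continue
        s = tuple(int(fi == 1 and oi == 0) for fi, oi in zip(f, o))
        d = tuple(int(fi == 0 and oi == 1) for fi, oi in zip(f, o))
        if sum(s) + sum(d) <= k:          # the bound sum(si + di) <= K
            return s, d, f
    return None

# Toy model: root f1 with an XOR group {f2, f3}; selecting both children is invalid.
toy_rules = lambda f: f[0] == 1 and f[1] != f[2]
print(bounded_diagnosis(toy_rules, o=(1, 1, 1), k=1))   # one deselection is enough
```

If None comes back, the budget K was too tight; as the text notes, increasing K and re-invoking the solver (or reassessing the requirements) is the natural next step.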

4.2 Debugging from Different Viewpoints

As we discussed in Section 2.3, we need the ability to debug the configuration from different viewpoints. Each viewpoint represents a set of features that the solver should avoid suggesting to add or remove from the current configuration. For example, using the automobile scenario from Section 2.3, the solver can debug the problem from the point of view that software trumps ECU hardware decisions by telling the solver not to suggest selecting or deselecting any software features.

Debugging from a viewpoint works by pre-assigning values for a subset of the variables in F and O. For example, to force the feature fi currently in the configuration to remain unaltered by the diagnosis, the values fi = 1 and oi = 1 are provided to the solver. Since (fi = 1) ⇒ (oi = 1 ⊕ si = 1) ∧ (di = 0), pre-assigning these values will force the solver to label si = 0 and di = 0.

To debug from a given point of view, for each feature fv in that viewpoint, we first add the constraints fv = 1, ov = 1, sv = 0, and dv = 0. The solver then derives a diagnosis that recommends alterations to other features in the configuration and maintains the state of each feature fv. The diagnostic model can therefore be used to debug from different viewpoints and address Challenge 3 from Section 2.3.
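A sketch of the same idea in the brute-force setting (illustrative only; the paper's approach pre-assigns fv, ov, sv, and dv in the CSP itself): protecting a viewpoint's features is equivalent to discarding any candidate labeling that changes them.

```python
from itertools import product

def diagnose_with_viewpoint(is_valid, o, protected):
    """Minimal diagnoses that leave the feature indices in 'protected' untouched
    (the effect of pre-assigning fv = ov, which forces sv = 0 and dv = 0)."""
    best, best_cost = [], None
    for f in product((0, 1), repeat=len(o)):
        if f[0] != 1 or not is_valid(f):
            continue
        if any(f[v] != o[v] for v in protected):    # viewpoint features must not change
            continue
        cost = sum(int(fi != oi) for fi, oi in zip(f, o))
        if best_cost is None or cost < best_cost:
            best, best_cost = [f], cost
        elif cost == best_cost:
            best.append(f)
    return best, best_cost

# With the 7-feature model and observations from the earlier sketches, protecting
# index 5 (the 1 Mbit/s CAN Bus, i.e. "ECU decisions precede software") leaves only
# the repair that swaps the controller features.
```

The same filter also covers the staged-configuration case discussed next, where the feature desired at time T′ is pinned instead.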

Pre-assigning values for variables in F and O can also be used to debug staged configuration errors from Challenge 1, Section 2.1. With staged configuration errors, at some point in time T′, developers attempt to select a feature that is in conflict with another feature selected at time T < T′. To debug this type of conflict, developers pre-assign the desired (but currently unselectable) feature at time T′ the value of 1 for its oi and fi variables. Developers can also pre-assign values for one or more other feature decisions from previous stages of the configuration that must not be altered. The solver is then invoked to find a configuration that includes the desired feature at T′ and minimizes the number of changes to feature configuration decisions that were made at all points in time T < T′.

4.3 Cost Optimal Conflict Resolution

As shown in Section 2.2, conflicts can occur when multiple stakeholders in a configuration process pull the solution in different directions. In these situations, debugging tools are needed to mediate the conflict in a cost-conscious manner. For example, when the software configuration of a car is not compatible with the legacy ECU configuration, it is (probably) cheaper to recommend changes to the software configuration than to change the ECU configuration and possibly the assembly process of the car. Therefore, the solver should try to minimize the overall cost of the changes.

We can extend the CSP model to perform cost-based feature selection and deselection optimization. First, we extend the model to associate a cost variable, bi ∈ B, with each feature in the feature model. Each cost variable represents how expensive (or conversely how beneficial) it is for the solver to recommend that the state of that feature be changed. Before each invocation of the debugger, the stakeholders provide these cost variables to guide the solver in its recommendations of features to select or deselect.

Next, we construct the superset of the features that the various stakeholders desire. The superset represents the ideal, although incorrect, configuration that the stakeholders would like to have. The goal is to find a way to reach a correct configuration from this superset of features that involves the lowest total cost for changes. The superset is input to the solver as values for the variables in O.

Finally, we alter our original optimization goal so that the solver will attempt to minimize (or maximize if we prefer) the cost of the features it suggests selecting or deselecting. We define a new global cost variable G and let G capture the sum of the costs of the changes that the solver suggests: G = ∑(di ∗ bi + si ∗ bi), summing over i = 1..n. G is thus equal to the sum of the costs of all the features that the solver either recommends to select or deselect. Rather than simply instructing the solver to minimize the sum of S ∪ D, we instead ask for the minimization or maximization of G.

The result of the labeling is a series of changes needed to reach a valid configuration that optimally integrates the desires and decisions of the various stakeholders. Of course, one particular stakeholder may have to incur more cost than another in the interest of reaching a globally better solution. Further constraints, such as limiting the maximum difference between the cost incurred by any two stakeholders, could also be added. The mediation process can be tuned to provide numerous types of behavior by providing different optimization goals. This CSP diagnostic method enables us to address Challenge 2 from Section 2.2.
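Sticking with the illustrative enumeration (names assumed; the paper formulates this as a CSP objective), switching from counting changes to summing per-feature costs bi implements the goal G = ∑(di ∗ bi + si ∗ bi):

```python
from itertools import product

def cheapest_diagnosis(is_valid, o, b):
    """Valid labeling minimizing G = sum(b[i] * (s[i] + d[i])), i.e. the total cost
    of the suggested changes rather than their number."""
    best_f, best_g = None, None
    for f in product((0, 1), repeat=len(o)):
        if f[0] != 1 or not is_valid(f):
            continue
        # A changed feature contributes bi exactly once: it is either selected
        # (si = 1) or deselected (di = 1), never both.
        g = sum(bi for fi, oi, bi in zip(f, o, b) if fi != oi)
        if best_g is None or g < best_g:
            best_f, best_g = f, g
    return best_f, best_g

# Hypothetical costs for the 7-feature model: controller changes are cheap (1),
# CAN bus / ECU changes are expensive (10), so the search prefers swapping the
# controllers even though both repairs involve the same number of changes.
# b = (100, 10, 1, 1, 10, 10, 10)
```

Fairness constraints, such as capping the cost difference between two stakeholders, would appear as additional filters on the candidate labelings.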

5 Empirical Results

Automated diagnostic methods are of little use if they cannot scale to handle non-trivial feature models of production systems. This section presents empirical results from experiments we performed to evaluate the scalability of our CSP-based diagnosis technique. We compare both the optimal and bounding methods for diagnosing errors and conflicts.

5.1 Experimental Platform

To perform our experiments, we used the implementation of the diagnosis algorithm which is provided by the Model Intelligence libraries from the Eclipse Foundation's Generic Eclipse Modeling System (GEMS) project [2]. Internally, the GEMS Model Intelligence implementation uses the Java Choco Constraint Solver [1] to derive labelings of the diagnostic CSP. The experiments were performed on a computer with an Intel Core Duo 2.4 GHz CPU, 2 gigabytes of memory, Windows XP, and a version 1.6 Java Virtual Machine (JVM). The JVM was run in client mode using a heap size of 40 megabytes (-Xms40m) and a maximum memory size of 256 megabytes (-Xmx256m).

A challenging aspect of the scalability analysis is that CSP-based techniques can vary in solving time based on individual problem characteristics. In theory, CSPs have exponential worst-case time complexity, but in practice they are often much faster. To demonstrate high confidence in our technique, it was necessary to apply our technique to as many models as possible. The key challenge with this approach is that hundreds or thousands of real feature models are not readily available and manually constructing them is not practical.

To provide the large numbers of feature models needed for our experiments, therefore, we built a feature model generator. This generator can randomly generate feature models with the desired branching and constraint characteristics. We also imbued the generator with the capability to generate feature selections from a feature model and probabilistically insert a bounded number of errors/conflicts into the configuration. The feature model generator is available in open-source form from [].

From preliminary feasibility experiments we conducted, we observed that the branching factor of the tree had little effect on the algorithm's solving time. We also compared diagnosis time using models with 0%, 10%, and 50% cross-tree constraints and saw no difference in diagnosis time. The key indicator of the solving complexity was the number of XOR- or cardinality-based feature groups in a model. XOR- and cardinality-based feature groups are features that require the set of their selected children to satisfy a cardinality constraint (the constraint is 1..1 for XOR).


For our tests, we limited the branching factor to at most five subfeatures per feature. We also set the probability of XOR- or cardinality-based feature groups being generated to 1/3 at each feature with children. We chose 1/3 since most feature models we have encountered contain more required and optional relationships than XOR- and cardinality-based feature groups.

To generate feature selections with errors, we used a probability of 1/50 that any particular feature would be configured incorrectly. For each model, we bounded the total errors at 5. In our initial experiments, the solving time was not affected by the number of errors in a given feature model. Again, the prevalence of XOR- or cardinality-based feature groups was the key determiner of solving time.
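The generator itself is not listed in the paper; the fragment below is only a rough sketch of a generator with the stated characteristics (branching factor of at most five, XOR groups with probability 1/3, per-feature error probability 1/50, at most five injected errors). Deriving a valid seed configuration from the generated model is omitted, and all names are our own.

```python
import random

def generate_feature_model(n_features, max_branching=5, p_xor=1.0 / 3):
    """Random feature tree: parent[i] is the parent of feature i (feature 0 is the
    root); xor_groups marks the parents whose children form an XOR (1..1) group."""
    parent = [None] * n_features
    children = {i: [] for i in range(n_features)}
    for i in range(1, n_features):
        p = random.choice([c for c in range(i) if len(children[c]) < max_branching])
        parent[i] = p
        children[p].append(i)
    xor_groups = {p for p, kids in children.items() if kids and random.random() < p_xor}
    return parent, children, xor_groups

def inject_errors(config, p_error=1.0 / 50, max_errors=5):
    """Flip individual feature selections with probability p_error, bounded at max_errors."""
    flawed, errors = list(config), 0
    for i in range(1, len(flawed)):
        if errors < max_errors and random.random() < p_error:
            flawed[i] = 1 - flawed[i]
            errors += 1
    return flawed
```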

5.2 Bounding Method Scalability

First, we tested the scalability of the less computationally complex bounding diagnosis method. The speed of our technique allowed us to test 2,000 feature models at each data point (2,000 different variations of each size feature model) and test the bounding method's scalability for feature models up to 500 features. With models above 500 features, we had to reduce the number of samples at each size to 200 models due to time constraints. Although these samples are small, they demonstrate the general performance of our technique. Moreover, the results of our experiments with feature models up to 500 features were nearly identical with sample sizes between 100 and 2,000.

Figure 3 shows the time required to diagnose feature models ranging in size from 50 to 500 features using the bounded method. The figure captures the worst, average, and best solving time in the experiments. As seen from the results, our technique could diagnose models with 500 features in an average of ≈600ms.

Figure 3: Diagnosis Time with Bounding Method

The upper bound used for this experiment was a maximum of 6 feature selection changes. When the 6-feature bound was too tight for the diagnosis (i.e., more than 6 changes were needed to reach a correct state), the solver would declare that there was no valid solution in a very short amount of time. We therefore discarded all instances where the bound was too tight to avoid skewing the results towards shorter solving times.

Figure 4 shows the results of testing the solving time of the bounding method on feature models ranging in size from 750 to 5,000 features. Models of this size were sufficient to demonstrate scalability for production systems. The results show that for a 5,000-feature model, the average diagnosis time was ≈1 minute.

Figure 4: Diagnosis Time Using Bounding Method for Large Feature Models

Another key variable that we tested was how the tightness of the bound on the maximum number of feature changes affected the solving time of the technique. We took a set of 200 feature models and applied varying bounds to see how the bound tightness affected solution time. Figure 5 shows that tighter bounds produced faster solution times. We hypothesize that the tighter bound allows the solver to discard infeasible solutions more quickly and thus arrive at a solution faster.

Figure 5: 500-Feature Diagnosis Time with Bounding Method and Varying Bounds

5.3 Optimal Method Scalability

Next, we tested the scalability of the optimal diagnosis method. For the tests of the optimal method's scalability for feature models up to 2,000 features, the slower speed of the technique allowed us to test 200 feature models at each data point. We stopped at 2,000 features because by that data point, each diagnosis required an average of ≈7.5 minutes and the 200 samples required over 24 hours to test.


The results from feature models up to 500 features can be seen in Figure 6. At 500 features, the optimal method required an average of roughly 8 seconds to produce a diagnosis. The tests from larger models ranging in size up to 2,000 features are shown in Figure 7. For a model with 2,000 features, the solver required an average of approximately 7.5 minutes per diagnosis.

Figure 6: Diagnosis Time with Optimal Method

Figure 7: Diagnosis Time with Optimal Method for Large Feature Models

5.4 Comparative Analysis of Optimal and Bounding Methods

Finally, we compared the scalability and quality of the results produced with the two methods. As seen by comparing Figures 3 and 6, the bounding method performs and scales significantly better than the optimal method. This result raises the key question of how much of a tradeoff in solution quality for speed is made when the bounding method is used over the optimal method.

The bound that is chosen determines the quality of the solution that is produced by the solver. We can state the optimality of a diagnosis given by the bounding method as the number of changes suggested by the bounding method, Bounded(S ∪ D), divided by the optimal number of changes, Opt(S ∪ D), which yields Bounded(S ∪ D) / Opt(S ∪ D). Since the bounding method uses the constraint ∑(si + di) ≤ K to ensure that at most K changes are suggested, we can state the worst-case optimality of the bounded method as K / Opt(S ∪ D). The closer our bound, K, is to the true optimal number of changes to make, the better the diagnosis will be.
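As a small worked instance (the numbers are only illustrative: K = 6 is the bound used in Section 5.2 and 2 is the optimal change count in the Table 1 example):

```latex
% Illustrative arithmetic, not a result from the paper:
\[
  \frac{\mathrm{Bounded}(S \cup D)}{\mathrm{Opt}(S \cup D)}
  \;\le\; \frac{K}{\mathrm{Opt}(S \cup D)}
  \;=\; \frac{6}{2}
  \;=\; 3,
\]
% i.e. the bounded method may suggest up to three times as many changes as necessary.
```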

Since tighter bounds produce faster solving times and better results, debuggers should start with very small bounds and iteratively increase them as needed. One approach is to layer an adaptive algorithm on top of the diagnosis algorithm to move the bound by varying amounts each time the bound proves too tight. Another approach is to employ a form of binary search to hone in on the ideal bound. We will investigate both of these techniques in future work.

6 Related Work

In prior work [16], Trinidad et al. have shown how feature models can be transformed into diagnosis CSPs and used to identify full mandatory features, void features, and dead feature models. Developers can use this diagnostic capability to identify feature models that do not accurately describe their products and to understand why not. The technique we described in this paper builds on this idea of using a CSP for automated diagnosis. Whereas Trinidad focuses on diagnosing feature models that do not describe their products, however, we build an alternate diagnosis model to identify conflicts in feature configurations. Moreover, we provide specific recommendations as to the minimal set of features that can be selected or deselected to eliminate the error.

Batory et al. [3] also investigated debugging techniques for feature models. Their techniques focus on translating feature models into propositional logic and using SAT solvers to automate configuration and verify correctness of configurations. In general, their work touches on debugging feature models rather than individual configurations. Our approach focuses on another dimension of debugging: the ability to pinpoint errors in individual configurations and to specify the minimal set of feature selections and deselections to remove the error. Moreover, we show how automated debugging can be combined with probing to help understand and integrate bottom-up forces into the configuration process.

Mannion et al. [12] present a method for encoding feature models as propositional formulas using first-order logic. These propositional formulas can then be used to check the correctness of a configuration. Mannion, however, does not touch on how incorrect configurations are debugged. In contrast, our technique provides this capability and can therefore recommend the minimal feature modifications to rectify the problem.

Pure::variants [5], Feature Modeling Plugin (FMP) [7], and Big Lever Software Gears [6] are tools developed to help developers create correct configurations of SPL feature models. These tools enforce constraints on modelers as the features are selected. None of these tools, however, addresses cases where feature models with incorrect configurations are created and require debugging. The technique described in this paper provides this missing capability. These tools and our approach are complementary since the tools help to ensure that correct configurations are created and our technique diagnoses incorrect configurations that are built.

7 Concluding Remarks

It is hard to debug conflicts and errors in large feature models created through staged or multi-stakeholder configuration. This paper described how CSPs can be built to diagnose errors and conflicts in configurations. The diagnosis provided by a CSP can specifically recommend the minimum or cost-optimal set of features that should be selected or deselected in a faulty configuration.

The following are lessons learned from our efforts thus far:

• By applying bounding and optimal diagnosis, our technique can scale to feature models with several thousand features.

• In future work, we plan to investigate different methods of honing the boundary used in the bounding method. We also intend to investigate the diagnosis of configuration errors in SPLs that use multiple models to capture different viewpoints of the SPL.

References

[1] Choco constraint programming system. http://choco.sourceforge.net/.

[2] Generic Eclipse Modeling System (GEMS). http://eclipse.org/gmt/gems.

[3] D. Batory. Feature Models, Grammars, and Propositional Formulas. Software Product Lines: 9th International Conference, SPLC 2005, Rennes, France, September 26-29, 2005: Proceedings, 2005.

[4] D. Benavides, P. Trinidad, and A. Ruiz-Cortes. Automated Reasoning on Feature Models. 17th Conference on Advanced Information Systems Engineering (CAiSE05, Proceedings), LNCS, 3520:491–503, 2005.

[5] D. Beuche. Variant Management with Pure::variants. Technical report, Pure-Systems GmbH, http://www.pure-systems.com, 2003.

[6] R. Buhrdorf, D. Churchett, and C. Krueger. Salion's Experience with a Reactive Software Product Line Approach. In Proceedings of the 5th International Workshop on Product Family Engineering, Siena, Italy, November 2003.

[7] K. Czarnecki, M. Antkiewicz, C. Kim, S. Lau, and K. Pietroszek. FMP and FMP2RSM: Eclipse Plug-ins for Modeling Features Using Model Templates, pages 200–201. ACM Press, New York, NY, USA, October 2005.

[8] K. Czarnecki, S. Helsen, and U. Eisenecker. Staged Configuration Using Feature Models. Software Product Lines: Third International Conference, SPLC 2004, Boston, MA, USA, August 30-September 2, 2004: Proceedings, 2004.

[9] K. Czarnecki, S. Helsen, and U. Eisenecker. Staged configuration through specialization and multi-level configuration of feature models. Software Process Improvement and Practice, 10(2):143–169, 2005.

[10] J. Jaffar and M. Maher. Constraint Logic Programming: A Survey. Journal of Logic Programming, 19/20:503–581, 1994.

[11] K. C. Kang, S. Kim, J. Lee, K. Kim, E. Shin, and M. Huh. FORM: A Feature-Oriented Reuse Method with Domain-specific Reference Architectures. Annals of Software Engineering, 5:143–168, January 1998.

[12] M. Mannion. Using first-order logic for product line model validation. Proceedings of the Second International Conference on Software Product Lines, 2379:176–187, 2002.

[13] A. Metzger, K. Pohl, P. Heymans, P.-Y. Schobbens, and G. Saval. Disambiguating the documentation of variability in software product lines: A separation of concerns, formalization and automated analysis. In Requirements Engineering Conference, 2007. RE '07. 15th IEEE International, pages 243–253, 2007.

[14] R. Rabiser, P. Grunbacher, and D. Dhungana. Supporting Product Derivation by Adapting and Augmenting Variability Models. Software Product Line Conference, 2007. SPLC 2007. 11th International, pages 141–150, 2007.

[15] R. Reiter. A theory of diagnosis from first principles. Artificial Intelligence, 32(1):57–95, 1987.

[16] P. Trinidad, D. Benavides, A. Durán, A. Ruiz-Cortés, and M. Toro. Automated error analysis for the agilization of feature modeling. Journal of Systems and Software, in press, 2007.

[17] S. Trujillo, D. Batory, and O. Diaz. Feature Oriented Model Driven Development: A Case Study for Portlets. 29th International Conference on Software Engineering (ICSE 2007), Minneapolis, Minnesota, USA, May 20–26, 2007.

[18] P. Van Hentenryck. Constraint Satisfaction in Logic Programming. MIT Press, Cambridge, MA, USA, 1989.
