
arXiv:1804.08178v6 [cs.DS] 21 Dec 2020

Nearly Linear Time Deterministic Algorithms for Submodular Maximization Under Knapsack Constraint and Beyond

Wenxin Li, The Ohio State University, [email protected]

December 22, 2020

Abstract

In this work, we study the classic submodular maximization problem under knapsack constraints and beyond. We first present a (7/16 − ε)-approximate algorithm for a single knapsack constraint, which requires O(n · max{ε⁻¹, log log n}) queries and two passes in the streaming setting. This improves the approximation ratio, the query complexity, and the number of passes over the stream simultaneously. We next show that there exists a (1/2 − ε)-approximate deterministic algorithm for a constant number of binary packing constraints, which achieves a query complexity of O_ε(n · log log n). One salient feature of our deterministic algorithm is that both its approximation ratio and its time complexity are independent of the number of constraints. Lastly, we present nearly linear time algorithms for the intersection of a p-system and d knapsack constraints; we achieve approximation ratios of (1/(p + 7d/4 + 1) − ε) for monotone objectives and ((p/(p+1))/(2p + 7d/4 + 1) − ε) for non-monotone objectives.

1 Introduction

A set function f: 2^E → R₊ defined on a ground set E of size n is submodular if for any two sets S, T ⊆ E, the inequality f(S) + f(T) ≥ f(S ∪ T) + f(S ∩ T) holds. It is monotone non-decreasing if f(S) ≤ f(T) for any S ⊆ T ⊆ E. Submodular functions form a natural class of set functions with the property of diminishing returns, and they have numerous applications in computer science, economics, and operations research. Due to the widespread applicability of submodular maximization, there is a vast literature on submodular maximization subject to diverse types of constraints [20, 16, 21, 4, 8, 23, 10, 5].
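As a quick, self-contained illustration (added here, not from the paper), the following Python snippet spot-checks the defining inequality on a random coverage function, the canonical example of a submodular function; all names in it are illustrative.

```python
# Coverage functions f(S) = |union of A_e for e in S| are submodular, so the
# inequality f(S) + f(T) >= f(S | T) + f(S & T) can be spot-checked numerically.
import random

universe = range(20)
ground = {e: set(random.sample(universe, 5)) for e in range(8)}  # A_e for each e

def f(S):
    """Coverage value of a set S of ground elements."""
    covered = set()
    for e in S:
        covered |= ground[e]
    return len(covered)

for _ in range(1000):
    S = frozenset(random.sample(range(8), 4))
    T = frozenset(random.sample(range(8), 4))
    assert f(S) + f(T) >= f(S | T) + f(S & T)  # submodularity
print("submodularity inequality held on all sampled pairs")
```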

However, many of these algorithmic results do not scale well to practical applications of large size. Obtaining fast running time is of fundamental importance in both theory and practice [23], and there has been a considerable amount of work in this direction. Traditionally, an algorithm with linear query complexity is highly desirable. Hence one natural question is: what is the best solution that can be obtained with a (nearly) linear number of queries? In this paper we aim to understand the approximation boundary achievable in (nearly) linear time. At the same time, we also reduce previously known query complexities. For example, several of our results require O_ε(n log log n) queries, whereas previous works require Ω_ε(log n) queries per element, which is exponentially higher than the per-element complexity of the results in this paper.


A large number of applications can be formulated as knapsack constrained monotone submodular function maximization. Sviridenko [21] proposed the density greedy algorithm with partial enumeration, which obtains the optimal approximation ratio of 1 − 1/e; the algorithm requires O(n⁵) time and is computationally inefficient. In recent years there has been a large amount of work on solving the knapsack constrained submodular maximization problem in large-scale scenarios. For example, in the line of work on algorithm acceleration initiated by [2], the current best result is due to Ene and Nguyen [6], who proposed a randomized O((1/ε)^{O(1/ε⁴)} · n log² n) time algorithm with approximation factor (1 − 1/e − ε). When working with massive data streams, Huang et al. [15] proposed a (0.363 − ε)-approximate single-pass streaming algorithm, which requires O((1/ε⁴) log⁴ n) space and queries per element, together with a (0.4 − ε)-approximate three-pass algorithm with the same space and running time requirements. Motivated by reducing the number of passes over the data stream, together with the gap between the unnatural state of the art and the existing hardness result, we present our improved algorithm in Theorem 1.

We further investigate another type of knapsack constraint: binary packing constraints. For multiple packing constraints, Azar and Gamzu [1] proposed a multiplicative weight update (MWU)-based greedy algorithm that achieves a width-dependent approximation ratio of Ω(1/d^{1/W}), where d represents the number of constraints and W refers to the width of the packing system. The approximation guarantee can be further improved to Ω(1/d^{1/(W+1)}) for binary packing constraints. While the aforementioned approximation guarantees depend on the number of constraints, Mizrachi et al. [19] proposed the first non-trivial deterministic algorithm for a constant number of packing constraints. However, it requires O(n^{O(poly(1/ε))}) time to achieve an approximation ratio of (1/e − ε). One question we ask is: does there exist a time-efficient deterministic algorithm with a width-independent approximation ratio? We answer this in the affirmative for a constant number of binary packing constraints in Theorem 2.

The most general type of constraint considered in this paper is the intersection of a p-system and d knapsack constraints [2]. The current best result is due to Badanidiyuru and Vondrák [2], who proposed a (1/(p + 2d + 1) − ε)-approximate algorithm. On the inapproximability side, there is a lower bound of (1 − e^{−p}) + ε ≤ 1/(p + 1/2) + ε even for the special case of p-extendible systems [7]. To the best of our knowledge, no algorithm is able to move closer to the lower bound using a comparable number of queries. We provide a nearly linear time algorithm with a better approximation guarantee in Theorem 3.

1.1 Results Overview

Knapsack constraint. In Section 2 we study the single knapsack constrained submodular maximization problem and establish the following theorem.

Theorem 1. There is a (7/16 − ε)-approximate algorithm for maximizing a monotone submodular function subject to a single knapsack constraint, which requires O(max{ε⁻¹, log log n}) queries per element. Our algorithm can be adapted to the streaming setting, in which the same approximation ratio can be achieved within only two passes over the data stream, while using O(n log n/ε) space and performing O(log n/ε) queries per element.

We improve upon the (0.4 − ε)-approximate algorithm of [14, 15], which requires O(log⁴ n/ε⁴) queries per element, O(n log⁴ n/ε⁴) space, and three passes over the stream. We note that the algorithms in [13] require a larger number of passes over the data stream to achieve the same approximation ratio. Our next result applies to binary packing constraints of constant dimension.

We also study the case when the constraint set is I = {S ⊆ E | A x_S ≤ b}. More specifically, we investigate two specific forms of packing constrained optimization problems. The first one is the well-known single knapsack constraint.

Theorem 2. There is a (1/2 − ε)-approximate deterministic algorithm that performs O_ε(n · log log n) queries for binary packing constraints of constant dimension, i.e., A ∈ {0, 1}^{d×n} where d = O(1). The result also holds if A ∈ F^{d×n}, where F consists of constants and |F| = O(1).

We emphasize that our approximation ratio is width independent and holds deterministically. Compared with the algorithm in [19], the complexity of our algorithm for a constant number of binary packing constraints is nearly linear in the input size.

Intersection of p-system and d-knapsack constraints. In Section 4, we study the problem when the constraint set I is the intersection of a p-system and d knapsack constraints.

Theorem 3. There is a (1/(p + 7d/4 + 1) − ε)-approximate algorithm for maximizing a non-negative monotone submodular function subject to a p-system and d knapsack constraints, which performs a nearly linear number of queries.

We improve the approximation ratio of (1/(p + 2d + 1) − ε) in [2], and the improvement is constant for constant values of p and d. In parallel to Theorem 3, we also improve the approximation ratio of ((p/(p+1))/(2p + 2d + 1) − ε) in [18] for non-monotone submodular maximization.

Theorem 4. There is a ((p/(p+1))/(2p + 7d/4 + 1) − ε)-approximate algorithm for maximizing a (non-monotone) submodular function subject to the intersection of a p-system and d knapsack constraints, which performs a nearly linear number of queries.

2 An Efficient Algorithm for Single Knapsack Constraint

Our improved solution for a single knapsack constraint consists of Algorithm 3, a backtracking algorithm utilizing multiple thresholds, and Algorithm 1, an alternative algorithm for the case when OPT contains an element with cost at least 1/2. In the following we give an overview of these two procedures.

Overview of the two subroutines. We run two threads in parallel in Algorithm 3, the double threshold backtracking algorithm, where each thread contains two sequential stages. In stage j of thread i (i, j ∈ [2]), we select elements whose profit density is at least the profit density threshold τ_j^{(i)} ∈ [ε·f(OPT), c_j^{(i)}·f(OPT)] and whose cost is at most 1/i (this constraint is trivial in the first thread). We recursively construct a new candidate solution T̄_j^{(i)} whenever a constraint violation occurs in the preceding procedure. We output the best solution based on the collection of sets obtained in the aforementioned two threads. In Algorithm 1, we first select a singleton with both function value and cost close to those of the element with largest cost in OPT; then Algorithm 6 is utilized to solve the corresponding residual problem. The final solution is chosen as the best set among the solutions to the O(1/δ) residual problems and the set returned by Algorithm 3. It is important to note that δ = Θ(1), according to inequality (20).

Algorithm 1: Main Algorithm for Single Knapsack Constraint
1 Initialization: ζ̄ ← f(OPT), ζ̲ ← 13ζ̄/48
2 while ζ̄ ≥ ζ̲ do
3   e_ζ ← argmin{c(e) | ζ̄ ≤ f(e) ≤ ζ̄/(1 − δ), e ∈ E}
    // δ is a constant independent of the approximation parameter ε
4   With budget 1 − c(e_ζ), apply the backtracking threshold algorithm (Algorithm 6) on the function g_ζ(·): 2^{E\{e_ζ}} → R₊ to obtain solution S_ζ, where g_ζ(S) = f(S + e_ζ) − f(e_ζ) for all S ⊆ E \ {e_ζ}
5   ζ̄ ← (1 − δ)·ζ̄
6 ζ* ← argmax_ζ f(S_ζ + e_ζ)
7 S′_o ← S_{ζ*} + e_{ζ*}
8 S* ← solution returned by Algorithm 3
9 Return S_o ← argmax_{S∈{S*, S′_o}} f(S)

2.1 An O(n · log log n) time 1/(3e)-approximate estimation algorithm

In the i-th iteration of the adaptive decreasing threshold (ADT) algorithm, we maintain ω̄_i and ω̲_i as upper and lower estimates of the optimal objective value f(OPT). At the end of iteration i, the lower estimate of f(OPT) is updated to the maximum function value of the sets obtained in the i-th iteration. It turns out that the gap between the upper and lower estimates of f(OPT) is a constant after ℓ = O(log log n) iterations. In the following lemma we prove that ω̲_i and ω̄_i are always valid lower and upper bounds on f(OPT). We compare ADT with the existing algorithm of [12] in Appendix A.1.

Algorithm 2: Adaptive Decreasing Threshold (ADT) Algorithm
1 Initialization: ω̲_1 ← max_{e∈E} f(e), ω̄_1 ← n·ω̲_1, U ← ∅, ℓ ← ⌈log log n⌉
2 for i = 1 : ℓ do
3   α_i = exp(log n · e^{−i}) − 1, θ = ω̲_i
4   while θ ≤ ω̄_i do
5     S_θ^{(i)} ← ∅
6     for e ∈ E do
7       if c(S_θ^{(i)} + e) > 1 then
8         S_θ^{(i)} ← argmax_{T∈{S_θ^{(i)}, {e}}} f(T)
9       else if f(S_θ^{(i)} + e) − f(S_θ^{(i)}) ≥ θ·c(e) then
10        S_θ^{(i)} ← S_θ^{(i)} + e
11    θ ← θ(1 + α_i)
12  ω̲_{i+1} ← max_θ f(S_θ^{(i)}), ω̄_{i+1} ← 3(1 + α_i)·ω̲_{i+1}
13 return ω̲_ℓ
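For concreteness, here is a hedged Python sketch of the ADT estimation loop above, assuming a value oracle f over sets and a cost function c with budget 1; the set kept under a budget violation follows line 8 of the listing, and tie-breaking details are simplified.

```python
# A hedged sketch of Algorithm 2 (ADT); names are illustrative.
import math

def adt_estimate(E, f, c, budget=1.0):
    lo = max(f({e}) for e in E)           # lower estimate of f(OPT)
    hi = len(E) * lo                      # upper estimate: f(OPT) <= n * max_e f({e})
    rounds = math.ceil(math.log(math.log(len(E))))
    for i in range(1, rounds + 1):
        alpha = math.exp(math.log(len(E)) * math.e ** (-i)) - 1
        best = 0.0
        theta = lo
        while theta <= hi:                # geometric grid of density thresholds
            S, cost = set(), 0.0
            for e in E:                    # single pass per threshold
                if cost + c(e) > budget:
                    if f({e}) > f(S):     # keep the better of S and the singleton
                        S, cost = {e}, c(e)
                elif f(S | {e}) - f(S) >= theta * c(e):
                    S.add(e); cost += c(e)
            best = max(best, f(S))
            theta *= (1 + alpha)
        lo, hi = best, 3 * (1 + alpha) * best   # shrink the estimation interval
    return lo                             # within a constant factor of f(OPT), per Lemma 5
```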

Lemma 5. For any i ∈ [ℓ], the optimal objective value always lies between ω̲_i and ω̄_i, i.e., ω̲_i ≤ f(OPT) ≤ ω̄_i. As a consequence, f(OPT)/(3e) ≤ ω̲_ℓ ≤ f(OPT).

Proof: We finish the proof by induction. For the base case i = 1, as ω̲_1 is initialized to be the maximum objective value of a singleton, Lemma 5 is equivalent to max_e f(e) ≤ f(OPT) ≤ n · max_e f(e), which follows from the submodularity of f(·). Notice that for i ≥ 2, we have ω̲_i = max_θ f(S_θ^{(i−1)}), where S_θ^{(i−1)} is a feasible solution. Hence ω̲_i is always a valid lower bound on f(OPT), and what remains is to prove ω̄_i ≥ f(OPT) for all i ∈ [ℓ].

Induction Step. Assume that ω̄_i ≥ f(OPT) holds for i = q. In the following, we complete the proof for i = q + 1 by lower bounding the objective value f(S_{θ*_q}^{(q)}). Observe that in the q-th iteration, θ takes values in the set

Θ_q = {ω̲_q, ω̲_q(1 + α_q), ..., ω̲_q(1 + α_q)^{⌊log(ω̄_q/ω̲_q)/log(1+α_q)⌋}}.

Combined with the induction assumption ω̄_q ≥ f(OPT), there must exist some θ*_q ∈ Θ_q such that θ*_q ≤ (2/3)·f(OPT) ≤ (1 + α_q)·θ*_q. Consider the iteration in which θ = θ*_q, and let E denote the event that there exists an element e such that c(S_{θ*_q}^{(q)} + e) > 1. Then we can lower bound f(S_{θ*_q}^{(q)}) by its cost multiplied by the corresponding threshold, i.e.,

max{f(S_{θ*_q}^{(q)}), f(e)} ≥ [f(S_{θ*_q}^{(q)} + e)/2]·1_E ≥ [θ*_q·c(S_{θ*_q}^{(q)} + e)/2]·1_E ≥ (θ*_q/2)·1_E ≥ [f(OPT)/(3(1 + α_q))]·1_E.

If no element exceeds the budget, then every element of OPT \ S_{θ*_q}^{(q)} has a small marginal gain with respect to S_{θ*_q}^{(q)}, and

[f(OPT) − f(S_{θ*_q}^{(q)})]·1_{Ē} ≤ [f(OPT ∪ S_{θ*_q}^{(q)}) − f(S_{θ*_q}^{(q)})]·1_{Ē}  (monotonicity)
 ≤ Σ_{e∈OPT}[f(S_{θ*_q}^{(q)} + e) − f(S_{θ*_q}^{(q)})]·1_{Ē} ≤ Σ_{e∈OPT} θ*_q·c(e) = θ*_q·c(OPT) ≤ θ*_q ≤ (2/3)·f(OPT).

This implies that f(S_{θ*_q}^{(q)}) ≥ [f(OPT)/3]·1_{Ē}. To summarize, we have

f(S_{θ*_q}^{(q)}) ≥ [f(OPT)/(3(1 + α_q))]·1_E + [f(OPT)/3]·1_{Ē} ≥ f(OPT)/(3(1 + α_q)).  (1)

Therefore ω̄_{q+1} = 3(1 + α_q)·ω̲_{q+1} = 3(1 + α_q)·max_θ f(S_θ^{(q)}) ≥ 3(1 + α_q)·f(S_{θ*_q}^{(q)}) ≥ f(OPT), which follows from (1) and the definitions of ω̲_i and ω̄_i. For i = ℓ we have

ω̄_ℓ = 3(1 + α_ℓ)·ω̲_ℓ = 3·exp(log n · e^{−⌈log log n⌉})·ω̲_ℓ ≤ 3e·ω̲_ℓ,

hence ω̲_ℓ ≥ f(OPT)/(3e). The proof is complete.

Proposition 6 (Complexity of Algorithm 2). Algorithm 2 performs O(n log log n) queries in total.

Proof: Note that in the i-th iteration of the preprocessing procedure, we perform O(n · log(ω̄_i/ω̲_i)/log(1+α_i)) = O(n · log(1+α_{i−1})/log(1+α_i)) queries, which implies that the total number of queries is of order

O(n·Σ_{i=1}^ℓ log(1 + α_{i−1})/log(1 + α_i)) = O(n·Σ_{i=1}^ℓ log(exp(log n · e^{−i+1}))/log(exp(log n · e^{−i}))) = O(n·e·ℓ) = O(n log log n).
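The telescoping above can be spot-checked numerically; the following snippet (illustrative, not from the paper) prints the per-round grid-size ratio log(1+α_{i−1})/log(1+α_i), which is exactly e for this schedule.

```python
# Each round's threshold grid is only a constant factor (e) finer than the
# previous round's resolution, so the total query count stays O(n log log n).
import math

n = 10**6
for i in range(1, 4):
    a_prev = math.exp(math.log(n) * math.e ** (-(i - 1))) - 1
    a_cur = math.exp(math.log(n) * math.e ** (-i)) - 1
    print(i, math.log(1 + a_prev) / math.log(1 + a_cur))  # prints e = 2.718...
```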


2.2 Double threshold backtracking algorithm and performance analysis

Algorithm 3: Double Threshold Backtracking Algorithm
1 Initialization: q ← 2; T_0^{(i)}, T_j^{(i)}, T̄_j^{(i)} ← ∅ (∀i, j ∈ [q]); λ ← f(OPT)
2 while λ ≥ ε·f(OPT) do
   // A constant approximation of f(OPT) is sufficient for initializing λ
3   for i ∈ [q] do
4     for j ∈ [q] do
5       T_j^{(i)} ← T_{j−1}^{(i)}; c_1^{(i)} ← 3i/(3+i), c_2^{(i)} ← 9/(3+i)², τ_j^{(i)} ← λ·c_j^{(i)} (j ∈ [q])
6       for each e ∈ E \ T_j^{(i)} do
7         if f(T_j^{(i)} + e) − f(T_j^{(i)}) ≥ c(e)·τ_j^{(i)} and c(e) ≤ 1/i then
8           if c(T_j^{(i)} + e) ≤ 1 then
9             T_j^{(i)} ← T_j^{(i)} + e
10          else
11            ē_j^{(i)} ← e, T̄_j^{(i)} ← T̄_j^{(i)} + e
12            for e′ ∈ T_j^{(i)} do
13              if c(T̄_j^{(i)} + e′) ≤ 1 then
14                T̄_j^{(i)} ← T̄_j^{(i)} + e′
15  for i ∈ [q] do
16    e_1^{(i)} ← argmax_{e∈T_1^{(i)}} c(e), e_2^{(i)} ← argmax_{e∈T_2^{(i)}\T_1^{(i)}} c(e)
17    T^{(i)} ← ∪_{j∈[q]} T_j^{(i)}; U_1^{(i)} ← {ē^{(i)}, e_1^{(i)}}, U_2^{(i)} ← {ē^{(i)}, e_2^{(i)}}, U_3^{(i)} ← {ē^{(i)}, e_2^{(i)}} ∪ T̄_1^{(i)}
18  λ ← (1 − ε)·λ
19 Return S* ← argmax{f(S) | S ∈ {U_ℓ^{(i)}}_{1≤ℓ≤3} ∪ {T̄^{(i)}, T^{(i)}}_{i∈[q]}, c(S) ≤ 1}
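To make the backtracking step concrete, here is a hedged Python sketch of a single thread and single threshold of the procedure above for one knapsack with budget 1; the two-thread, two-stage bookkeeping and the candidate sets U_ℓ^{(i)} of Algorithm 3 are omitted, and all names are illustrative.

```python
# Greedily add elements with profit density >= tau; when an element would
# overflow the budget, build a backup solution seeded with it and refilled
# from the current set (the backtracking idea of Algorithm 3, simplified).
def threshold_with_backtracking(E, f, c, tau, budget=1.0):
    T, cost = [], 0.0
    backup = None
    for e in E:
        gain = f(set(T) | {e}) - f(set(T))
        if gain < tau * c(e):
            continue                       # density below the threshold
        if cost + c(e) <= budget:
            T.append(e); cost += c(e)
        elif backup is None:               # first budget violation: backtrack
            backup, bcost = [e], c(e)
            for x in T:                    # refill the backup set from T
                if bcost + c(x) <= budget:
                    backup.append(x); bcost += c(x)
    candidates = [set(T)] + ([set(backup)] if backup else [])
    return max(candidates, key=f)
```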

Lemma 7. The set S* returned by Algorithm 3 satisfies

f(S*) ≥ (7/16)·f(OPT)·1_{OPT∩B=∅} + (16/25)·[f(OPT) − f(OPT∩B)]·1_{OPT∩B≠∅} − O(ε·f(OPT)),

where B = {e ∈ E | c(e) ≥ 1/2} denotes the set of large elements, i.e., elements with cost at least 1/2.

Proof: In thread i of Algorithm 3, two thresholds τ_1^{(i)} and τ_2^{(i)} are utilized to select elements. For a clean presentation, we omit the thread index and lower bound the quality of the solution obtained by the two sequential thresholds τ_1 and τ_2 in the double threshold backtracking algorithm.

In the following we use T_i (1 ≤ i ≤ 2) to denote the collection of elements obtained with threshold τ_i. If there exists an element ē_i that has marginal density at least τ_i but violates the knapsack constraint, T_i denotes the value of the candidate set before element ē_i arrives. We further let OPT′ = OPT \ B, and let e_1, e_2 be the elements with largest cost in T_1 and T_2 \ T_1 respectively, i.e., e_1 = argmax_{e∈T_1} c(e) and e_2 = argmax_{e∈T_2\T_1} c(e).

We divide our analysis into three cases, according to the existence of a budget violation in each stage.


Case 1: Algorithm 3 stops at τ_1. In this case, there exists ē_1 such that c(T_1 + ē_1) > 1, and the marginal increment of ē_1 satisfies f(T_1 + ē_1) − f(T_1) ≥ τ_1·c(ē_1). Hence the objective value of {ē_1, e_1} can be lower bounded as

f({ē_1, e_1}) ≥ f(e_1) + [f(T_1 + ē_1) − f(T_1)] ≥ τ_1·(c(e_1) + c(ē_1)).

According to the definition of e_1, the cost of the set T̄_1 is at least 1 − c(e_1), so we have

f(T̄_1) ≥ τ_1·(1 − c(e_1)).

Combining with the fact that f(T_1) ≥ τ_1·c(T_1) ≥ τ_1·(1 − c(ē_1)), we therefore obtain

f(S*) ≥ max{f(T_1), f({ē_1, e_1}), f(T̄_1)} ≥ [f(T_1) + f({ē_1, e_1}) + f(T̄_1)]/3 ≥ (2/3)·τ_1 ≥ (τ_1 + τ_2)/3.  (2)

Case 2: Algorithm 3 stops at τ_2. Without loss of generality, we can assume that c(T_1) ≤ 2/3; otherwise we immediately obtain the same lower bound as (2), i.e.,

f(S*) ≥ f(T_1) ≥ c(T_1)·τ_1 ≥ (τ_1 + τ_2)/3.

Under the condition c(T_1) ≤ 2/3, the weight of ē_2 satisfies c(ē_2) ≥ (1 − c(T_1))·1_{c(T_1+ē_2)>1} ≥ (1/3)·1_{c(T_1+ē_2)>1}. Recall that T̄_2 is obtained by adding ē_2 and then elements of T_2, until the total weight exceeds the budget; we have

c(T̄_2 − ē_2) ≥ [1 − c(ē_2) − c(e_1)]·1_{c(T_1+ē_2)>1} + [1 − c(ē_2) − c(e_2)]·1_{c(T_1+ē_2)≤1},  (3)

based on which we can obtain the following lower bound on the objective value of T̄_2:

f(T̄_2) = [f(T̄_2) − f(T̄_2 − ē_2)] + f(T̄_2 − ē_2)
 ≥ [τ_2·c(ē_2) + τ_1·c(T̄_2 − ē_2)]·1_{c(T_1+ē_2)>1}
 ≥ [τ_2·c(ē_2) + τ_1·(1 − c(ē_2) − c(e_1))]·1_{c(T_1+ē_2)>1}.  (4)

Notice that

f({ē_2, e_1}) = [f({ē_2, e_1}) − f(e_1)] + f(e_1) ≥ [f(T_2 + ē_2) − f(T_2)] + f(e_1) ≥ τ_2·c(ē_2) + τ_1·c(e_1).  (5)

Combining (4) and (5), we have

f(S*) ≥ [f(T̄_2) + f({ē_2, e_1})]/2
 ≥ [τ_2·c(ē_2) + (τ_1/2)·(1 − c(ē_2))]·1_{c(T_1+ē_2)>1}  (6)
 ≥ [(τ_1 + τ_2)/3]·1_{c(T_1+ē_2)>1, 2τ_2≥τ_1},  (7)

where the last inequality holds due to the monotonicity of (6) with respect to c(ē_2), together with the fact that c(ē_2) ≥ 1/3.

Now we consider the case when c(T_1 + ē_2) ≤ 1. By the definition of ē_2, we have c(T_2 + ē_2) = c(T_1) + c(T_2 \ T_1) + c(ē_2) > 1, which implies that max{c(T_2 \ T_1), c(ē_2)} ≥ (1 − c(T_1))/2. In addition,

f(S*) ≥ max{f(T_2), f(T̄_2)} ≥ max{f(T_2), f(T_1 + ē_2)}
 ≥ f(T_1) + max{f(T_2) − f(T_1), f(T_1 + ē_2) − f(T_1)}
 ≥ f(T_1) + τ_2·max{c(T_2 \ T_1), c(ē_2)}
 ≥ τ_1·c(T_1) + τ_2·(1 − c(T_1))/2.  (8)

Since the lower bound on the right-hand side of (8) is monotonically increasing in the total weight of T_1, we have

f(S*) ≥ [(τ_1 + τ_2)/3]·1_{c(T_1+ē_2)≤1, c(T_1)≥1/3}.

For the case when c(T_1) ≤ 1/3, we first argue that T_2 ≠ T_1: the total weight of the elements selected in the second stage is c(T_2 \ T_1) > 1 − c(T_1) − c(ē_2) ≥ 1/6 > 0, hence the element e_2 must exist. We next claim the following lower bound on f(S*):

f(S*) ≥ max{f(T_2), f(T̄_2)}
 ≥ max{f(T_1 + e_2), f(T_1 + ē_2)}  (monotonicity of f(·), and T_1 + e_2 is feasible)
 = f(T_1) + max{f(T_1 + e_2) − f(T_1), f(T_1 + ē_2) − f(T_1)}
 ≥ f(T_1) + τ_2·max{c(e_2), c(ē_2)}.  (9)

Plugging the fact f(T_1) ≥ f(OPT′) − c(OPT′)·τ_1 into (9), we have

f(S*) ≥ [f(OPT′) − c(OPT′)·τ_1 + τ_2/3]·1_{max{c(e_2), c(ē_2)}≥1/3}.  (10)

If max{c(e_2), c(ē_2)} ≤ 1/3, then c(T_1 ∪ {e_2, ē_2}) ≤ 1, i.e., T_1 ∪ {e_2, ē_2} is a feasible set. Consequently,

f(S*) ≥ [f(T_2) + f(T̄_2) + f(T_1 ∪ {e_2, ē_2})]/3
 ≥ f(T_1) + {[f(T_2) − f(T_1)] + [f(T̄_2) − f(T_1)] + [f(T_1 ∪ {e_2, ē_2}) − f(T_1)]}/3
 ≥ f(T_1) + τ_2·[c(T_2 \ T_1) + c(T̄_2 \ T_1) + c(e_2) + c(ē_2)]/3
 ≥ [f(OPT′) − c(OPT′)·τ_1]·1_{c(T_1)≤1/3} + [c(T_1)·τ_1]·1_{c(T_1)≥1/3} + [2(1 − c(T_1))/3]·τ_2
 ≥ [f(OPT′) − c(OPT′)·τ_1 + (4/9)·τ_2]·1_{c(T_1)≤1/3} + [τ_1/3 + (4/9)·τ_2]·1_{c(T_1)≥1/3}.  (11)

Case 3: Algorithm 3 stops without exceeding the budget. In this case, we have

f(S*) ≥ f(OPT′) − Σ_{e∈OPT′\S*}[f(S* + e) − f(S*)] ≥ f(OPT′) − τ_2·c(OPT′).  (12)


Now we are ready to combine the analyses of the aforementioned three cases, i.e., inequalities (2), (7), and (11)–(12):

f(S*) ≥ min{(τ_1 + τ_2)/3, f(OPT′) − τ_1·c(OPT′) + τ_2/3, f(OPT′) − τ_2·c(OPT′)}
 ≥(a) [(6c(OPT′) + 1)/(3c(OPT′) + 1)² − ε]·f(OPT′)  (13)
 ≥ (7/16 − ε)·f(OPT)·1_{OPT∩B=∅} + (16/25)·[f(OPT) − f(OPT∩B)]·1_{OPT∩B≠∅},  (14)

where (a) holds with equality when

τ_1 = [3/(3c(OPT′) + 1)]·f(OPT′) − O(ε)·f(OPT)  (15)

and

τ_2 = [9c(OPT′)/(3c(OPT′) + 1)²]·f(OPT′) − O(ε)·f(OPT).  (16)

Observe that the coefficient of f(OPT′) in (13) decreases with the weight of OPT′; thus (14) follows from the facts that c(OPT′) ≤ 1 − (1/2)·1_{OPT∩B≠∅} and f(OPT′) ≥ f(OPT) − f(OPT∩B)·1_{OPT∩B≠∅}.
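For completeness, one can verify (a) directly (a computation we add here; it is elementary). Write c′ = c(OPT′) and ignore the O(ε) terms. Plugging (15) and (16) into the three terms of the minimum in (13) gives

(τ_1 + τ_2)/3 = [3/(3c′+1) + 9c′/(3c′+1)²]·f(OPT′)/3 = [3(3c′+1) + 9c′]/[3(3c′+1)²]·f(OPT′) = [(6c′+1)/(3c′+1)²]·f(OPT′),

f(OPT′) − τ_1·c′ + τ_2/3 = [1 − 3c′/(3c′+1) + 3c′/(3c′+1)²]·f(OPT′) = {[(3c′+1)² − 3c′(3c′+1) + 3c′]/(3c′+1)²}·f(OPT′) = [(6c′+1)/(3c′+1)²]·f(OPT′),

f(OPT′) − τ_2·c′ = [1 − 9c′²/(3c′+1)²]·f(OPT′) = [(6c′+1)/(3c′+1)²]·f(OPT′),

so the three lower bounds indeed coincide at the stated choice of τ_1 and τ_2.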

2.3 Proof of Theorem 1

As a special case of Theorem 3, we have the following proposition.

Proposition 8. There exists a (4/11 − ε)-approximate algorithm for a single knapsack constraint, which performs O(n · log log n) queries.

2.3.1 Approximation ratio

Proof: We first establish the following bound for the set S′_o:

f(S′_o) ≥ (4/11)·f(OPT) + (3/11 − δ)·f(OPT∩B).  (17)

Consider the iteration in which

(1 − δ)·f(OPT∩B) ≤ ζ̄ ≤ f(OPT∩B).

We claim that c(e_ζ) ≤ c(OPT∩B), since OPT∩B is a candidate element when e_ζ is selected. Moreover, OPT \ B is a feasible solution to the residual problem induced by e_ζ. Hence the following inequality holds for S_ζ if we apply a β-approximation algorithm to the corresponding residual problem:

g_ζ(S_ζ) ≥ β·g_ζ(OPT \ B).  (18)

Plugging the definition of the residual function g_ζ(·) into (18),

f(S′_o) ≥ f(S_ζ + e_ζ) ≥(a) β·f(OPT \ B) + (1 − β)·f(e_ζ) ≥(b) β·f(OPT) + (1 − 2β − δ)·f(OPT∩B),  (19)

where in (a) we use the fact that g_ζ(OPT \ B) = f(OPT \ B + e_ζ) − f(e_ζ) ≥ f(OPT \ B) − f(e_ζ); (b) holds since f(OPT \ B) ≥ f(OPT) − f(OPT∩B) and f(e_ζ) ≥ ζ̄ ≥ (1 − δ)·f(OPT∩B). Recall that our Algorithm 6 provides an approximation ratio of β = 4/11; we obtain (17) by plugging β = 4/11 into (19).

Taken together with Lemma 7, we have

f(S_o) ≥ min_B max{(7/16)·f(OPT)·1_{OPT∩B=∅} + (16/25)·[f(OPT) − f(OPT∩B)]·1_{OPT∩B≠∅}, (4/11)·f(OPT) + (3/11 − δ)·f(OPT∩B)} − O(ε)·f(OPT)  (20)
 ≥ (7/16 − ε)·f(OPT).  (21)

Indeed, writing x = f(OPT∩B)/f(OPT), the inner maximum in (20) equals max{(16/25)(1 − x), 4/11 + (3/11 − δ)x} when OPT∩B ≠ ∅; the two terms balance near x ≈ 0.3, where the value still exceeds (7/16)·f(OPT), and when OPT∩B = ∅ the first term is exactly (7/16)·f(OPT). The proof is complete.

2.3.2 Complexity of Algorithm 1

Offline time complexity. Both Algorithm 3 and Algorithm 1 require a constant approximationsof f(OPT), which can be obtained independently in O(n log log n) time, for example, via similartreatments to the ADT algorithm. Notice that in Algorithm 3, there are O(ε−1) different values ofλ and the algorithm runs in O(n) time for each fixed λ. Hence the total running time of Algorithm 3is in the order of O(n · maxε−1, log log n). Algorithm 1 can be accomplished within the sameorder of time, as it requires O(δ−1) calls to backtracking threshold Algorithm and δ = O(1).

Streaming setting. In the streaming model, we run Algorithm 3 and Algorithm 1 in parallel.Compared with offline algorithm, the main difference lies in the approach used to obtain a constantapproximation of f(OPT). Since f(OPT)/ maxe∈E f(e) lies in the range of [1, n], we can maintainO(log n/ε) copies of solutions in parallel for each possible approximation value of f(OPT), whichimplies a total time complexity of O((n log n)/ε) and space complexity of O((n log n)/ε).

3 A Nearly Linear Time (1/2 − ε)-Approximate Deterministic Algorithm for Binary Packing Constraints

We start with the formal definition of the residual problem with respect to a given set T.

Definition 9 (T-Residual Problem). Let f be a submodular function; its contracted function f_T: 2^{E\T} → R₊ is given by f_T(S) = f(S ∪ T) − f(T). For the optimization problem max_{S∈I} f(S), we define its T-residual problem as max_{S∈I_T} f_T(S), where the constraint set is I_T = {S | S ⊆ E \ T, S ∪ T ∈ I}.
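The contracted objective and the residual feasibility test translate directly into code; the following Python helper (illustrative, and assuming oracles f and is_feasible) mirrors Definition 9.

```python
# Build the T-residual problem of Definition 9 from a value oracle f and a
# feasibility test is_feasible for the constraint family I.
def residual_problem(f, is_feasible, T):
    T = frozenset(T)

    def f_T(S):                # f_T(S) = f(S | T) - f(T)
        return f(set(S) | T) - f(T)

    def feasible_T(S):         # S in I_T  iff  S avoids T and S | T in I
        S = set(S)
        return S.isdisjoint(T) and is_feasible(S | T)

    return f_T, feasible_T
```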


In several constrained submodular maximization problems [16, 21, 17, 6, 2], one can obtain a desirable approximation guarantee for the residual problem by carefully choosing the set T. For example, for a single knapsack constraint [21], T is the collection of the three elements with the largest marginal increments, while T consists of elements with high costs for a constant number of knapsack constraints [17]. However, directly searching for the set T takes O(n³) and Θ(n^d) time respectively in these two examples, which is computationally expensive.

Apart from these straightforward approaches, we introduce the concept of a shadow set, and consider the residual problem with respect to T♯, the shadow set of the target set T. The formal definition of a shadow set is as follows.

Definition 10 (α-shadow set). T♯ is called an α-shadow set of T ⊆ E iff OPT \ (T ∪ T♯) is a feasible solution to the T♯-residual problem, while

f((OPT ∪ T♯) \ T) + α·f(T♯) ≥ f(OPT).

Definition 10 states that replacing the optimal elements in T with those in the α-shadow set T♯ incurs an additive loss of at most α·f(T♯).

Algorithm overview. We present our Algorithm 5 in Appendix B.1 and first introduce some necessary notation. Let OPT_i = {o_1, o_2, ..., o_i}, Δ_i = f(OPT_i) − f(OPT_{i−1}), and

i_ε = max{i | Δ_i ≥ (ε/d)·f(OPT)}.

Without loss of generality, we assume that the elements of OPT are in greedy ordering, i.e., o_{i+1} = argmax_{e∈OPT} f(OPT_i + e) − f(OPT_i). As shown in Algorithm 5, we first construct S♯_ε, a (1+ε)-shadow set of OPT_{i_ε}; i.e., we select an element with comparable cost and similar marginal increment for each element of OPT_{i_ε}.

We next consider the residual problem

max{f(S ∪ S♯_ε) | S ⊆ E \ S♯_ε, A x_{S∪S♯_ε} ≤ b},

for which we combine the MWU-based greedy algorithm [1] with a threshold decreasing procedure on E \ (E_Γ ∪ S♯_ε), where E_Γ = {e ∈ E | ∃i ∈ Γ such that c_i(e) ≠ 0} and

Γ = {i ∈ [d] | b_i − c_i(S♯) ≤ W = 2 log d/δ²} ⊆ [d].

The main ingredient of our algorithm is to construct S♯_ε, the shadow set of OPT_{i_ε}, by approximately guessing deterministically in the value space, which enables us to establish a mapping between the elements of OPT_{i_ε} and S♯_ε. The analysis is similar in spirit to that of the greedy algorithm under a matroid constraint.

Lemma 11. S♯_ε is a (1 + ε)-shadow set of OPT_{i_ε}, i.e.,

f(OPT) − f((OPT ∪ S♯) \ OPT_{i_ε}) ≤ (1 + ε)·f(S♯).


Proof: For notational simplicity, we omit the subscript ε in this proof. Let S♯ = {e♯_1, e♯_2, ..., e♯_{|S♯|}} and S♯_i = {e♯_1, e♯_2, ..., e♯_i} (i ∈ [|S♯|]), where e♯_i is the element selected at the i-th step of the guessing. According to the definition of G_ε in Algorithm 5, the increment of e♯_i with respect to the set S♯_{i−1} is similar to that of element o_i, i.e.,

f(S♯_i) − f(S♯_{i−1}) = f(S♯_{i−1} + e♯_i) − f(S♯_{i−1})  (22)
 ≥ (1 − ε)·[f(S♯_{i−1} + o_i) − f(S♯_{i−1})] − (ε/|OPT_{i_ε}|)·f(OPT).  (23)

The increment of o_i in (22) can be lower bounded as

f(S♯_{i−1} + o_i) − f(S♯_{i−1}) ≥(a) f(S♯_{i−1} ∪ (OPT \ OPT_{i−1})) − f(S♯_{i−1} ∪ (OPT \ OPT_i))  (24)
 ≥(b) f(S♯_{i−1} ∪ (OPT \ OPT_{i−1})) − f(S♯_i ∪ (OPT \ OPT_i)),  (25)

where in (a) we use the submodularity of f and the fact that S♯_{i−1} ∪ (OPT \ OPT_i) + o_i = S♯_{i−1} ∪ (OPT \ OPT_{i−1}); (b) follows from the monotonicity of f. Summing from i = 1 to |S♯|, we obtain

f(S♯) = Σ_{i=1}^{|S♯|}[f(S♯_i) − f(S♯_{i−1})]
 ≥ (1 − ε)·Σ_{i=1}^{|S♯|}[f(S♯_{i−1} + o_i) − f(S♯_{i−1})] − ε·f(OPT)
 ≥ (1 − ε)·Σ_{i=1}^{|S♯|}[f(S♯_{i−1} ∪ (OPT \ OPT_{i−1})) − f(S♯_i ∪ (OPT \ OPT_i))] − ε·f(OPT)
 = (1 − ε)·[f(OPT) − f((OPT ∪ S♯) \ OPT_{i_ε})] − ε·f(OPT).

Rearranging the terms, the proof is complete.

Remark. Indeed, we can further conclude that f(S♯_ε) ≥ (1/2 − ε)·f(OPT_{i_ε}):

f(OPT_{i_ε}) − f(S♯) ≤ Σ_{e∈OPT_{i_ε}}[f(S♯ + e) − f(S♯)]  (submodularity)
 ≤ Σ_{i=1}^{|S♯|}[f(S♯_{i−1} + o_i) − f(S♯_{i−1})]  (submodularity)
 ≤ [1/(1 − ε)]·Σ_{i=1}^{|S♯|}[f(S♯_{i−1} + e♯_i) − f(S♯_{i−1})]  (selection rule of Algorithm 5)
 = f(S♯)/(1 − ε).

Rearranging, f(S♯) ≥ [(1 − ε)/(2 − ε)]·f(OPT_{i_ε}) ≥ (1/2 − ε)·f(OPT_{i_ε}).

Proposition 12. Algorithm 5 returns a solution set S_o in O_ε(n · log log n) time such that f(S_o) ≥ (1/2 − ε)·f(OPT).


Proof: See Appendix B.2.

Remark. In general, a γ-approximate polynomial time algorithm for the T-residual problem implies a polynomial time min{γ, 1/(1+β)}-approximate algorithm, where β is the parameter of the shadow set. The proof is presented in Appendix B.3.

4 Intersection of p-System and d-Knapsack Constraints

In this section we consider the problem of maximizing a monotone submodular function under the intersection of a p-system constraint I_p and d knapsack constraints, where K_i = {S ⊆ E | c_i(S) ≤ 1} (∀i ∈ [d]) represents the i-th knapsack constraint. Element weights in the i-th dimension are specified by the weight function c_i(·): 2^E → R_{≥0}.

Overview of the backtracking threshold algorithm. As shown in Algorithm 4, we eliminate the elements with high cost, which are collected in B, the set of large elements. An element e ∈ E is called large if 2c_i(e) > 1 holds for at least one index i ∈ [d]; otherwise we call it small. Among the remaining elements, those with marginal gain at least Δ and profit density at least the predetermined threshold θ are added to the candidate set, as long as the newly constructed set is feasible. When the cost of the currently chosen element e exceeds the residual budget, a new feasible solution S̄ is constructed: element e and the element with largest total cost in set S are first added to S̄, and we then add the remaining elements of S to S̄ until the budget is exceeded. We further use Algorithm 7 (presented in Appendix C.2), a combination of ADT and backtracking, to approximate f(OPT).

Algorithm 4: Backtracking Threshold Algorithm (BT(θ, ε))
1 Initialization: S_0 ← argmax_{e∈E} f(e), B ← {e ∈ E | ∃i ∈ [d] such that c_i(e) ≥ 1/2}, Δ ← f(S_0), S ← ∅
2 while Δ ≥ ε·f(S_0)/n do
3   for e ∈ E \ B do
4     if f_S(e) ≥ max{θ·Σ_{i=1}^d c_i(e), Δ} and S + e ∈ I_p then
5       if S + e ∈ I then
6         S ← S + e
7       else
8         S̄ ← {e, argmax_{e′∈S} Σ_{i=1}^d c_i(e′)}
9         for e′ ∈ S do
10          if S̄ + e′ ∈ I then
11            S̄ ← S̄ + e′
12        break
13  Δ ← (1 − ε)·Δ
14 return S* ← argmax_{T∈{S, S̄, S_0}} f(T)
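The following hedged Python sketch renders Algorithm 4 for d knapsack constraints with unit budgets, using an abstract independence oracle in_p in place of the p-system I_p; the names and the data layout (costs[e] as a d-vector) are our own illustrative choices.

```python
# A hedged sketch of BT(theta, eps): density-threshold greedy over the small
# elements, with a backtracking step on the first knapsack violation.
def bt(E, f, costs, in_p, theta, eps):
    d = len(next(iter(costs.values())))
    small = [e for e in E if all(ci < 0.5 for ci in costs[e])]  # drop large elements
    S0 = max(({e} for e in E), key=f)
    S, total, S_bar = set(), [0.0] * d, None
    delta = f(S0)
    while delta >= eps * f(S0) / len(E) and S_bar is None:
        for e in small:
            if e in S or not in_p(S | {e}):
                continue
            gain = f(S | {e}) - f(S)
            if gain < max(theta * sum(costs[e]), delta):
                continue                   # fails the density / Delta test
            if all(t + c <= 1 for t, c in zip(total, costs[e])):
                S.add(e)
                total = [t + c for t, c in zip(total, costs[e])]
            else:                          # knapsack violation: backtrack (lines 8-12)
                heavy = max(S, key=lambda x: sum(costs[x]))
                S_bar = {e, heavy}
                bt_total = [a + b for a, b in zip(costs[e], costs[heavy])]
                for x in S:
                    if x in S_bar:
                        continue
                    new = [a + b for a, b in zip(bt_total, costs[x])]
                    if in_p(S_bar | {x}) and all(v <= 1 for v in new):
                        S_bar.add(x)
                        bt_total = new
                break
        delta *= (1 - eps)
    candidates = [S, S0] + ([S_bar] if S_bar is not None else [])
    return max(candidates, key=f)
```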

One may expect that removing large elements incurs a large loss in the objective value. However, the following two observations help to bound the loss. Firstly, note that the marginal gain of each element should be inversely related to p and d according to the desired approximation ratio; otherwise we can just return a singleton with objective value larger than (1/(p + 7d/4 + 1) − ε) times the optimum. Secondly, there are at most d large elements in OPT. On the other hand, the threshold selection procedure achieves a higher objective value when element costs are small compared with the budget, with the additional help of the recursively constructed set S̄.

A simple but crucial consequence is the following upper bound on the number of large elements in OPT.

Corollary 13. The optimal solution contains at most |OPT ∩ B| ≤ d large elements.

Proof: Suppose there were more than d large elements in OPT. By the pigeonhole principle, there would exist at least one index i ∈ [d] such that c_i(OPT) > 1. This contradicts the fact that OPT is a feasible solution set; the proof is complete.

4.1 Monotone Submodular Maximization

Our algorithm for a monotone objective is presented in Appendix C.1; it computes the final solution by feeding BT a series of well-spaced parameters related to the output of Algorithm 7. The proof of Theorem 3 is presented in Appendix C.5.

Let ē be the element that is not added to S due to the violation of some knapsack constraint in Algorithm 4. As element ē may not exist, we divide the analysis of the backtracking threshold algorithm into two cases, Propositions 14 and 17, based on the existence of ē.

Proposition 14. If element ē does not exist, then

f(S*) ≥ [f(OPT) − θ·(d − |OPT∩B|/2)]/(p + |OPT∩B| + 1) − (pε + ε)·f(OPT).

Proof: We partition the optimal solution set as

OPT = OPT_1 ∪ OPT_2 ∪ (OPT ∩ B),

where OPT_1 = {e ∈ OPT \ B | f_S(e) < θ·Σ_{j=1}^d c_j(e)} represents the set of small elements in OPT whose profit density with respect to the set S is less than θ, and OPT_2 = OPT \ (B ∪ OPT_1) denotes the remaining small elements in OPT. Based on this partition, we lower bound f(S) in the following manner:

f(OPT) − f(S) ≤(a) f(S ∪ OPT) − f(S)  (monotonicity)
 = [f(S ∪ (OPT∩B)) − f(S)] + [f(S ∪ (OPT∩B) ∪ OPT_1) − f(S ∪ (OPT∩B))] + [f(S ∪ OPT) − f(S ∪ (OPT∩B) ∪ OPT_1)]
 ≤(b) [f(S ∪ (OPT∩B)) − f(S)] + [f(S ∪ OPT_1) − f(S)] + [f(S ∪ OPT_2) − f(S)] =: Σ_1 + Σ_2 + Σ_3,  (26)

where (b) follows from the submodularity of f(·).

In the following, we provide upper bounds on Σ_1, Σ_2, and Σ_3 respectively. Firstly, a direct consequence of submodularity and the definition of S* is

Σ_1 = f(S ∪ (OPT∩B)) − f(S) ≤ Σ_{e∈OPT∩B} f(e) ≤ |OPT∩B|·f(S*).  (27)

As for the second term,

Σ_2 = f(S ∪ OPT_1) − f(S) ≤ Σ_{e∈OPT_1}[f(S + e) − f(S)] ≤(a) θ·Σ_{j=1}^d c_j(OPT_1) ≤(b) θ·(d − Σ_{j=1}^d c_j(OPT∩B)) ≤(c) θ·(d − |OPT∩B|/2),  (28)

where (a) is based on the definition of OPT_1, which says that the profit density of each element of OPT_1 is less than θ; (b) follows from the fact that c_j(OPT_1) ≤ c_j(OPT) − c_j(OPT∩B) ≤ 1 − c_j(OPT∩B) (∀j ∈ [d]); and (c) holds because the total cost of each large element is at least 1/2.

We next introduce Proposition 15, whose proof follows the analysis of the greedy algorithm for a monotone objective under a p-system constraint [2, 4]. We include its proof in Appendix C.3 for completeness.

Proposition 15. f(S ∪ OPT_2) − f(S) ≤ [p + (p + 1)ε]·f(S).

Assuming Proposition 15, we can lower bound the objective value of S. More specifically, by plugging inequalities (27)–(28) and Proposition 15 into (26), we have

f(S) ≥ f(OPT) − (Σ_1 + Σ_2 + Σ_3) ≥ f(OPT) − |OPT∩B|·f(S*) − θ·(d − |OPT∩B|/2) − (p + O(ε))·f(S).  (29)

Rearranging the terms,

(p + 1 + |OPT∩B| + O(ε))·f(S*) ≥ (p + 1 + O(ε))·f(S) + |OPT∩B|·f(S*) ≥ f(OPT) − θ·(d − |OPT∩B|/2).  (30)

The proof is complete.

Before providing the lower bound on f(S*) for the case when ē exists, we first show the feasibility of all the candidate solutions involved. Let S̃ be the value of the candidate solution S before ē is considered; then there exists i ∈ [d] such that c_i(S̃ + ē) > 1. We use ê = argmax_{e∈S̃} Σ_{i=1}^d c_i(e) to denote the element with the largest total cost in S̃, and let S̄^{(1)} = {ē, ê} and S̄^{(2)} = S̄ \ S̄^{(1)}.

Proposition 16. Both S̄ and S̄^{(1)} are feasible solution sets.

Proof: Note that S̄ is initialized to S̄^{(1)} and remains feasible after each update, so it suffices to show that S̄^{(1)} ∈ I. We first remark that S̃ ∪ {ē} ∈ I_p, since ē satisfies all the conditions required in line 4 of Algorithm 4. By the down-closed property of I_p and the fact that S̄^{(1)} ⊆ S̃ ∪ {ē}, we conclude that S̄^{(1)} ∈ I_p. It remains to show that S̄^{(1)} belongs to ∩_{i=1}^d K_i. Recall that both ē and ê are small elements; then we have

c_j(S̄^{(1)}) ≤ c_j(ē) + c_j(ê) ≤ 1

for all j ∈ [d], which implies that S̄^{(1)} ∈ ∩_{i=1}^d K_i. The proof is complete.

Proposition 17. 3f(S*) ≥ 2θ holds when element ē exists.

Proof: It is standard to assume that every singleton in E is feasible; otherwise we can apply our algorithm to the ground set consisting of feasible singletons. Without loss of generality, assume S̃ = {e_1, e_2, ..., e_{|S̃|}}, where e_i denotes the i-th element added to S̃. Since S̄^{(2)} is a subset of S̃, we write S̄^{(2)} = {e_{i_1}, e_{i_2}, ..., e_{i_{|S̄^{(2)}|}}}, where i_ℓ ∈ [|S̃|] for all ℓ ≤ |S̄^{(2)}| ≤ |S̃|. We further let S̃_i = {e_1, e_2, ..., e_i} (i ∈ [|S̃|]) denote the first i elements of S̃, and define S̄^{(2)}_i analogously.

According to the density threshold rule, we have

f(S̃_{i+1}) − f(S̃_i) ≥ θ·Σ_{j=1}^d c_j(e_{i+1}).  (31)

The objective value of S̄^{(2)} can be lower bounded as

f(S̄^{(2)}) = Σ_{j=1}^{|S̄^{(2)}|}[f(S̄^{(2)}_j) − f(S̄^{(2)}_{j−1})] ≥ Σ_{j=1}^{|S̄^{(2)}|}[f(S̃_{i_j}) − f(S̃_{i_j−1})]  (submodularity)
 ≥ θ·Σ_{j=1}^{|S̄^{(2)}|} Σ_{t=1}^d c_t(e_{i_j}) = θ·Σ_{t=1}^d c_t(S̄^{(2)}),  (32)

where the last inequality is due to (31). Similarly we have

f(S̃) ≥ θ·Σ_{t=1}^d c_t(S̃).  (33)

We next claim that S̃ \ (S̄^{(2)} + ê) is non-empty; otherwise S̄ would equal S̃ + ē, contradicting the fact that S̄ is a feasible solution. Hence there exists at least one element ẽ ∈ S̃ \ (S̄^{(2)} + ê). We further note that there exists at least one index i† such that c_{i†}(S̄ + ẽ) > 1: otherwise we would have S̄ + ẽ ∈ ∩_{i=1}^d K_i, and moreover the facts that S̄ + ẽ ⊆ S̃ + ē and S̃ + ē ∈ I_p imply that S̄ + ẽ also belongs to I_p; as a consequence, ẽ would have been added to S̄, contradicting ẽ ∈ S̃ \ (S̄^{(2)} + ê). Therefore

Σ_{j=1}^d c_j(S̄ + ẽ) ≥ c_{i†}(S̄ + ẽ) > 1.  (34)

Since S̃ + ē violates some knapsack constraint, we similarly know that

f(S̃) ≥ θ·Σ_{j=1}^d c_j(S̃) ≥ θ·(1 − Σ_{j=1}^d c_j(ē)),  (35)

where the first inequality follows from (33). On the other hand, using arguments similar to (32) and the monotonicity of f, we have

f(S̄) ≥ f(S̄^{(1)}) ≥ θ·(Σ_{j=1}^d c_j(ē) + Σ_{j=1}^d c_j(ê)).  (36)

Moreover,

f(S̄) ≥ θ·Σ_{e∈S̄} Σ_{j=1}^d c_j(e) ≥ θ·(1 − Σ_{j=1}^d c_j(ẽ)),  (37)

where the last inequality holds because, by (34), the total cost of S̄ + ẽ over all d dimensions is larger than 1. Combining (35)–(37), we derive the following lower bound on the quality of the output set S*:

f(S*) = max{f(S̃), f(S̄)} ≥ [f(S̃) + 2f(S̄)]/3
 ≥ (1/3)·[θ·(1 − Σ_{j=1}^d c_j(ē)) + θ·(Σ_{j=1}^d c_j(ē) + Σ_{j=1}^d c_j(ê)) + θ·(1 − Σ_{j=1}^d c_j(ẽ))] ≥ (2/3)·θ,  (38)

where the last step uses Σ_{j=1}^d c_j(ê) ≥ Σ_{j=1}^d c_j(ẽ), by the definition of ê. The proof is complete.

4.2 Non-Monotone Submodular Maximization

We extend our algorithm to non-monotone submodular functions and present an algorithm that achieves a better approximation ratio than that of [18]; see Table 1 for a detailed summary. Mirzasoleiman et al. [18] designed a fast algorithm that achieves an approximation ratio of ((p/(p+1))/(2p + 2d + 1) − ε). Their algorithm is a combination of two paradigms: an algorithm for maximizing a monotone submodular function under a p-system + d-knapsack constraint [2], and an algorithm for maximizing a non-monotone submodular function under a p-system constraint [11]. Hence it is natural to expect a better performance guarantee via the techniques we developed for monotone submodular functions in Section 4.1.

Comparisons Between Existing Algorithms

Reference  | Constraint              | Approximation Ratio            | Query Complexity
[9, 22]    | 1-matroid + d-knapsack  | 1/e − ε                        | poly(n)·exp(d, ε⁻¹)
[22]       | p-matroid + d-knapsack  | 0.19/p − ε                     | poly(n)·exp(p, d, ε⁻¹)
[18]       | p-system + d-knapsack   | (p/(p+1))/(2p + 2d + 1) − ε    | O(nrp · log n/ε)
This paper | p-system + d-knapsack   | (p/(p+1))/(2p + 7d/4 + 1) − ε  | O(nrp · max{ε⁻¹, log log n})

Table 1: Non-monotone submodular maximization under p-system and d-knapsack constraints.

For our improved approximation ratio, we mainly highlight its weaker dependence on the number of knapsack constraints. We achieve this performance guarantee via the following treatments.

• Different from the Greedy with Density Threshold (GDT) algorithm, i.e., Algorithm 1 of [18], we employ our backtracking algorithm to select feasible solution sets with the desired objective value.

• Similar to the approach in [11], the Iterated Greedy with Density Threshold (IGDT) algorithm is introduced to overcome non-monotonicity [18], in which GDT appears as the key subroutine for solution selection over various ground sets. Again, we utilize the backtracking threshold algorithm to replace the GDT algorithm of [18].

• Unlike the FANTOM algorithm, i.e., Algorithm 3 of [18], we are able to reduce the query complexity in the non-monotone case, in a fashion similar to our ADT algorithm.

In the following, we provide the key arguments of our analysis, and omit details that closely follow the proofs in Section 4.1.

4.2.1 Performance Analysis

Backtracking algorithm in the non-monotone case. We have the following conclusion about the quality of the set returned by the backtracking algorithm when the submodular function is non-monotone. The proof of Proposition 18 is almost identical to those of Propositions 14 and 17; the main difference is that the benchmark quantity is changed to f(S ∪ OPT).

Proposition 18. Algorithm BT(θ, ε) returns a set S ∈ I such that, for any T ∈ I,

f(S) ≥ min{(2/3)·θ, [f(S ∪ T) − θ·(d − |T∩B|/2)]/(p + |T∩B| + 1) − (pε + ε)·f(OPT)},  (39)

where B refers to the set of large elements.

IGDT with BT as a subroutine. Following the notation of [18], we have the following proposition for the new IGDT algorithm.

Proposition 19. In each iteration of the IGDT algorithm with BT as a subroutine, we have

E[max_{i∈[p+1]}{f(S′_i), f(S_i)}] ≥ min{(2/3)·θ, [p/((p+1)(2p+1) + Σ_{i=1}^{p+1}|OPT_i∩B|)]·f(OPT) − [((p+1)d − (Σ_{i=1}^{p+1}|OPT_i∩B|)/2)/((p+1)(2p+1) + Σ_{i=1}^{p+1}|OPT_i∩B|)]·θ}.  (40)

Proof: In the rest of this proof, we let O_i = O \ (O ∩ ∪_{j=1}^{i−1} S_j). Compared with [18], we present a simpler and cleaner proof of the lower bound on Σ_{i=1}^{p+1} f(S_i ∪ O_i). The key ingredient of our proof is the following telescoping sum:

f(S_i ∪ O_i) ≥ f(O_{i+1}) + [f((∪_{j=i}^{p+1} S_j) ∪ O_i) − f((∪_{j=i+1}^{p+1} S_j) ∪ O_{i+1})].  (41)

This follows from the submodularity of f, together with the facts that

(S_i ∪ O_i) \ O_{i+1} = [(∪_{j=i}^{p+1} S_j) ∪ O_i] \ [(∪_{j=i+1}^{p+1} S_j) ∪ O_{i+1}] = S_i,  (42)

and O_{i+1} ⊆ (∪_{j=i+1}^{p+1} S_j) ∪ O_{i+1}. Taking the summation over i ∈ [p+1], we obtain

Σ_{i=1}^{p+1} f(S_i ∪ O_i) ≥ f((∪_{j=1}^{p+1} S_j) ∪ O_1) + Σ_{i=2}^{p+1} f(O_i) ≥ Σ_{i=2}^{p+1} f(O_i),  (43)

where the second inequality follows from the non-negativity of f((∪_{j=1}^{p+1} S_j) ∪ O_1).

Furthermore, note that

p(p+1)·E[max{f(S′_i), f(S_i)}] = 2·Σ_{i=1}^{p+1}(p+1−i)·E[max{f(S′_i), f(S_i)}] ≥ 2·Σ_{i=1}^{p+1}(p+1−i)·E[f(S′_i)] ≥ Σ_{i=1}^{p+1}(p+1−i)·f(S_i ∩ O),  (44)

where the last inequality follows from the approximation guarantee of the double greedy algorithm [3].

where the last inequality follows from the approximation guarantee of the double greedy algo-rithm [3]. On the other hand,

[(p + 1)2 +

p+1∑

i=1

|Oi ∩B|]· E[maxf(S′

i), f(Si)]

≥p+1∑

i=1

[p + |Oi ∩B|+ 1] · f(Si)

≥p+1∑

i=1

f(Si ∪Oi)−[(pd + d)θ −

1

2

p+1∑

i=1

|Oi ∩B| · θ + O(ε)OPT]

(Proposition 18)

Combining these together, we have

[(p+1)(2p+1) + Σ_{i=1}^{p+1}|O_i∩B|]·E[max{f(S′_i), f(S_i)}]
 ≥ Σ_{i=1}^{p+1}(p+1−i)·f(S_i∩O) + Σ_{i=1}^{p+1} f(S_i∪O_i) − [(p+1)d·θ − (1/2)·Σ_{i=1}^{p+1}|O_i∩B|·θ + O(ε)·f(OPT)]
 ≥ [p − O(ε)]·f(OPT) − [(p+1)d − (1/2)·Σ_{i=1}^{p+1}|O_i∩B|]·θ.  (45)

The proof is complete, since max_{i∈[p+1]}{f(S′_i), f(S_i)} ≥ 2θ/3 follows directly from inequality (39).

Proof of Theorem 4. The details of our ADT algorithm in the non-monotone case are presented in Appendix C.6; the algorithm is similar to Algorithm 7. The proof of Theorem 4 is similar to that of Theorem 3, except that the optimal threshold θ* is specified by

θ* = [6p/(4(p+1)(2p+1) + Σ_{i=1}^{p+1}|OPT_i∩B| + (6p+6)d)]·f(OPT) ∈ [(6p/((p+1)(8p+7d+4)))·f(OPT), f(OPT)],

using 0 ≤ |OPT_i∩B| ≤ d.


References

[1] Yossi Azar and Iftah Gamzu. Efficient submodular function maximization under linear packing constraints. In ICALP, pages 38–50, 2012.

[2] Ashwinkumar Badanidiyuru and Jan Vondrák. Fast algorithms for maximizing submodular functions. In SODA, pages 1497–1514, 2014.

[3] Niv Buchbinder, Moran Feldman, Joseph Seffi Naor, and Roy Schwartz. A tight linear time (1/2)-approximation for unconstrained submodular maximization. In FOCS, pages 649–658, 2012.

[4] Gruia Calinescu, Chandra Chekuri, Martin Pál, and Jan Vondrák. Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing, 40(6):1740–1766, 2011.

[5] Chandra Chekuri, T. S. Jayram, and Jan Vondrák. On multiplicative weight updates for concave and submodular function maximization. In ITCS, pages 201–210. ACM, 2015.

[6] Alina Ene and Huy L. Nguyen. A nearly-linear time algorithm for submodular maximization with a knapsack constraint. In ICALP, pages 53:1–53:12, 2019.

[7] Moran Feldman, Christopher Harshaw, and Amin Karbasi. Greed is good: Near-optimal submodular maximization via greedy optimization. In COLT, pages 758–784, 2017.

[8] Moran Feldman, Joseph Naor, and Roy Schwartz. A unified continuous greedy algorithm for submodular maximization. In FOCS, pages 570–579, 2011.

[9] Moran Feldman, Joseph Seffi Naor, and Roy Schwartz. Nonmonotone submodular maximization via a structural continuous greedy algorithm. In ICALP, pages 342–353, 2011.

[10] Yuval Filmus and Justin Ward. A tight combinatorial algorithm for submodular maximization subject to a matroid constraint. In FOCS, pages 659–668, 2012.

[11] Anupam Gupta, Aaron Roth, Grant Schoenebeck, and Kunal Talwar. Constrained non-monotone submodular maximization: Offline and secretary algorithms. In International Workshop on Internet and Network Economics (WINE), pages 246–257, 2010.

[12] Chien-Chung Huang and Naonori Kakimura. Multi-pass streaming algorithms for monotone submodular function maximization. arXiv preprint arXiv:1802.06212, 2018.

[13] Chien-Chung Huang and Naonori Kakimura. Multi-pass streaming algorithms for monotone submodular function maximization. In WADS, 2019.

[14] Chien-Chung Huang, Naonori Kakimura, and Yuichi Yoshida. Streaming algorithms for maximizing monotone submodular functions under a knapsack constraint. In APPROX, page 11, 2017.

[15] Chien-Chung Huang, Naonori Kakimura, and Yuichi Yoshida. Streaming algorithms for maximizing monotone submodular functions under a knapsack constraint. Algorithmica, pages 1–27, 2019.

[16] Samir Khuller, Anna Moss, and Joseph Seffi Naor. The budgeted maximum coverage problem. Information Processing Letters, 70(1):39–45, 1999.

[17] Ariel Kulik, Hadas Shachnai, and Tami Tamir. Maximizing submodular set functions subject to multiple linear constraints. In SODA, pages 545–554, 2009.

[18] Baharan Mirzasoleiman, Ashwinkumar Badanidiyuru, and Amin Karbasi. Fast constrained submodular maximization: Personalized data summarization. In ICML, pages 1358–1367, 2016.

[19] Eyal Mizrachi, Roy Schwartz, Joachim Spoerhase, and Sumedha Uniyal. A tight approximation for submodular maximization with mixed packing and covering constraints. In ICALP, pages 85:1–85:15, 2019.

[20] George L. Nemhauser, Laurence A. Wolsey, and Marshall L. Fisher. An analysis of approximations for maximizing submodular set functions. Mathematical Programming, 14(1):265–294, 1978.

[21] Maxim Sviridenko. A note on maximizing a submodular set function subject to a knapsack constraint. Operations Research Letters, 32(1):41–43, 2004.

[22] Jan Vondrák, Chandra Chekuri, and Rico Zenklusen. Submodular function maximization via the multilinear relaxation and contention resolution schemes. In STOC, pages 783–792, 2011.

[23] Yuichi Yoshida. Maximizing a monotone submodular function with a bounded curvature under a knapsack constraint. SIAM Journal on Discrete Mathematics, 33(3):1452–1471, 2019.

A Supplementary Materials of Section 2

A.1 Comparison of ADT with algorithm in [12]

The main subroutine of [12] requires a (1 − ε)-approximation of f(OPT), which is obtained by applying a binary search procedure over O(ε⁻¹ log n) predefined, well-spaced guesses. In each iteration of the binary search, the main subroutine is applied, requiring O(ε⁻¹) passes over the elements. In contrast, the preprocessing phase of our ADT uses thresholds decreasing at a speed related to the numerical output of the previous iteration, and updates the estimates according to the objective values of ⌊log(ω̄_q/ω̲_q)/log(1+α_q)⌋ sets. ADT makes only a single pass over the elements for each fixed threshold. More importantly, while ADT maintains an interval [ω̲_i, ω̄_i] that contains f(OPT) and shrinks at each iteration, the algorithm in [12] cannot obtain estimates of f(OPT) until the whole binary search procedure terminates, as it requires the values of all historic midpoints during the binary search.


B Supplementary Materials of Section 3

B.1 Details of Algorithm 5

Algorithm 5: An O_ε(n log log n) Time Deterministic Algorithm for A ∈ {0, 1}^{O(1)×n}
1 Construct a (1/2 − ε, 1 + ε)-shadow set of OPT_{i_ε}, denoted by S♯, by (1) guessing the marginal increments of elements in OPT_{i_ε} via the set G_ε = {0, (ε/|O_{i_ε}|)·f(OPT), (2ε/|O_{i_ε}|)·f(OPT), ..., f(OPT)}; (2) guessing the cost vector of each element of O_{i_ε}, which belongs to {0, 1}^d
2 b′_i ← b_i − c_i(S♯) (∀i ∈ [d])
3 w_i ← 1/b′_i (∀i ∈ [d])
4 λ̄, λ̲ ← upper and lower bounds on the ratio of marginal gain to weight; λ ← λ̄; T ← ∅
5 while λ ≥ λ̲ do
6   for e ∈ E \ (E_Γ ∪ S♯ ∪ T) do
7     if Σ_{i=1}^d b′_i·w_i ≤ 1 + δW + (δW)² and f_{T∪S♯}(e) ≥ λ·Σ_{i=1}^d w_i·c_i(e) then
8       w_i ← [1 + δW·c_i(e)/b′_i + (δW·c_i(e)/b′_i)²]·w_i (∀i ∈ [d])
9       T ← T + e
10  λ ← (1 − ε)·λ
11 Return S_o = S♯_ε ∪ T
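The weight update in line 8 is the standard MWU step; a minimal Python rendering (illustrative names, with W and δ as in the analysis) is:

```python
# After an element e is accepted, each knapsack dimension i reweighs by a
# factor that grows with how much of the residual budget b'_i it consumes.
def mwu_update(w, cost_e, b_res, delta, W):
    new_w = []
    for wi, ci, bi in zip(w, cost_e, b_res):
        r = delta * W * ci / bi           # relative load of e on dimension i
        new_w.append(wi * (1 + r + r * r))
    return new_w

def potential(w, b_res):                  # Phi(t) = sum_i b'_i * w_i, Phi(0) = d
    return sum(wi * bi for wi, bi in zip(w, b_res))
```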

B.2 Proof of Proposition 12

Proof: We first note that |OPT ∩ E_Γ| ≤ dW, since for each i ∈ Γ there are at most W elements of OPT ∩ E_Γ whose cost in the i-th dimension is non-zero, and |Γ| ≤ d. We next state the following proposition on the performance of the set T constructed by Algorithm 5.

Proposition 20. The set T is a (1 − 1/e − δ)-approximate solution for the S♯-residual problem, i.e.,

f(T ∪ S♯) − f(S♯) ≥ (1 − 1/e − O(δ))·[f((OPT ∪ S♯) \ (OPT_{i_ε} ∪ E_Γ)) − f(S♯)].  (46)

Assuming the correctness of Proposition 20, we are ready to finish the proof. According to Lemma 11, we have

f(OPT \ E_Γ) − f((OPT ∪ S♯) \ (OPT_{i_ε} ∪ E_Γ)) ≤ f(S♯).  (47)

Combining with Proposition 20, in which we set δ = 1/2 − 1/e, we know that

f(T ∪ S♯) ≥ f(OPT \ E_Γ)/2 ≥ (1/2 − ε)·f(OPT),  (48)

where the last inequality follows from the definition of E_Γ: removing each element of OPT ∩ E_Γ incurs a loss of at most (ε/(dW))·f(OPT), and |OPT ∩ E_Γ| ≤ dW.

Proof of Proposition 20: In this proof we depart from the previous notation: for any set S ⊆ E, we let S′ = S \ (S♯ ∪ E_Γ). Suppose that element e^{(t+1)} is selected at the (t+1)-th iteration of the while loop; then for all e ∈ E′ \ T^{(t)}, we have

f_{T^{(t)}∪S♯}(e) ≤ [Σ_{i=1}^d w_i^{(t)}·c_i(e) / Σ_{i=1}^d w_i^{(t)}·c_i(e^{(t+1)})]·f_{T^{(t)}∪S♯}(e^{(t+1)}),  (49)

which implies that

Σ_{e∈OPT′} f_{T^{(t)}∪S♯}(e) ≤ [Σ_{i=1}^d w_i^{(t)}·c_i(OPT′) / Σ_{i=1}^d w_i^{(t)}·c_i(e^{(t+1)})]·f_{T^{(t)}∪S♯}(e^{(t+1)}).

For the left-hand side, based on the submodularity of f, we have

Σ_{e∈OPT′} f_{T^{(t)}∪S♯}(e) ≥ f(OPT′ ∪ S♯) − f(T^{(t)} ∪ S♯).  (50)

Combining (49) and (50),

Σ_{i=1}^d w_i^{(t)}·c_i(e^{(t+1)}) ≤ [f_{T^{(t)}∪S♯}(e^{(t+1)}) / (f(OPT′ ∪ S♯) − f(T^{(t)} ∪ S♯))]·Σ_{i=1}^d w_i^{(t)}·c_i(OPT′)
 ≤ [f_{T^{(t)}∪S♯}(e^{(t+1)}) / (f(OPT′ ∪ S♯) − f(T^{(t)} ∪ S♯))]·Σ_{i=1}^d b′_i·w_i^{(t)}.  (51)

Observe that

Σ_{i=1}^d b′_i·w_i^{(t+1)} = Σ_{i=1}^d b′_i·w_i^{(t)}·[1 + δW·c_i(e^{(t+1)})/b′_i + (δW·c_i(e^{(t+1)})/b′_i)²]  (52)
 ≤ Σ_{i=1}^d b′_i·w_i^{(t)} + (δW + δ²W)·Σ_{i=1}^d w_i^{(t)}·c_i(e^{(t+1)}),  (53)

where we use W·c_i(e^{(t)}) ≤ b′_i, as W = min{b′_i/c_i(e) | e ∈ E \ (E_Γ ∪ S♯), i ∈ [d]}. Let Φ(t) = Σ_{i=1}^d b′_i·w_i^{(t)}; then Φ(0) = d, and

Φ(t+1) ≤ (1 + δ + δ²)·Φ(t),  (54)

which follows from the update rule of w_i^{(t)} in Algorithm 5 and the definition of W. Further, we can bound the increase of Φ by utilizing (51):

Φ(t+1)/Φ(t) ≤ 1 + (δW + δ²W)·f_{T^{(t)}∪S♯}(e^{(t+1)})/[f(OPT′ ∪ S♯) − f(T^{(t)} ∪ S♯)]
 ≤ exp((δW + δ²W)·[f(T^{(t+1)} ∪ S♯) − f(T^{(t)} ∪ S♯)]/[f(OPT′ ∪ S♯) − f(T^{(t)} ∪ S♯)]),  (55)

which further implies that

Φ(t)/Φ(0) ≤ exp((δW + δ²W)·Σ_{s=0}^{t−1}[f(T^{(s+1)} ∪ S♯) − f(T^{(s)} ∪ S♯)]/[f(OPT′ ∪ S♯) − f(T^{(s)} ∪ S♯)])
 ≤ (f(OPT′ ∪ S♯)/[f(OPT′ ∪ S♯) − f(T^{(t)} ∪ S♯)])^{δW + δ²W}.  (56)

Rearranging the terms, we have

f(T^{(t)} ∪ S♯) ≥ (1 − (Φ(0)/Φ(t))^{1/(δW + δ²W)})·f(OPT′ ∪ S♯)
 ≥ (1 − (d·[1 + δ + δ²]/e^{δW})^{1/(δW + δ²W)})·f(OPT′ ∪ S♯).  (57)

For the second inequality, we use the facts that Φ(t+1) ≥ e^{δW} and Φ(t+1) ≤ Φ(t)·(1 + δ + δ²). Note that

(d·[1 + δ + δ²]/e^{δW})^{1/(δW + δ²W)} = e^{−1/(1+δ)}·d^{1/(δW + δ²W)}·(1 + δ + δ²)^{1/(δW + δ²W)}
 ≤ (1 + 2δ)(1 + 2/W)(1 + 2 log d/(δW))/e ≤ (1 + 15δ)/e,

where we use e^x ≤ 1 + 2x for x ∈ [0, 1], and W = 2 log d/δ². The proof is complete.

B.3 Proof

Proof: Utilizing the γ-approximate algorithm for the U♯-residual problem, we can obtain a set U′ such that

f_{U♯}(U′) ≥ γ·max_{S⊆E\U♯} f_{U♯}(S) ≥ γ·f_{U♯}(OPT \ (U ∪ U♯)),  (58)

where the last inequality follows from the fact that OPT \ (U ∪ U♯) is a feasible solution to the U♯-residual problem. Plugging the definition of the U♯-residual function into (58), we obtain

f(U′ ∪ U♯) ≥ γ·f((OPT ∪ U♯) \ U) + (1 − γ)·f(U♯) ≥ γ·f(OPT) + (1 − γβ − γ)·f(U♯).

If γ + γβ ≤ 1, then f(U′ ∪ U♯) ≥ γ·f(OPT). Otherwise, utilizing the simple fact that f(U♯) ≤ f(U′ ∪ U♯), we obtain f(U′ ∪ U♯) ≥ f(OPT)/(1 + β).

C Supplementary Materials of Section 4

C.1 Algorithm for monotone objective

Algorithm 6: Main algorithm for monotone submodular function and p-system + d-knapsack constraints
1 Initialization: T ← ∅, β ← output of Algorithm 7, λ ← 7(p + d + 1)·β
2 while λ ≥ β/(p + 7d/4 + 1) do
3   T′ ← set returned by BT(λ, δ/(p+1))
4   T ← T ∪ {T′}
5   λ ← λ/(1 + δ)
6 Return S_o ← argmax_{S∈T} f(S)
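A minimal sketch of the threshold schedule driven by Algorithm 6, assuming bt_call(theta, eps) returns a (set, value) pair from the BT routine; the grid endpoints follow lines 1–2 above.

```python
# Scan a geometric grid of thresholds from 7(p+d+1)*beta down to
# beta/(p + 7d/4 + 1), keeping the best BT output seen.
def main_monotone(bt_call, beta, p, d, delta):
    lam = 7 * (p + d + 1) * beta
    best = None
    while lam >= beta / (p + 7 * d / 4 + 1):
        S = bt_call(lam, delta / (p + 1))
        if best is None or S[1] > best[1]:
            best = S
        lam /= (1 + delta)
    return best
```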


C.2 Details of Algorithm 7

Similar to our earlier treatments, we use the adaptive decreasing threshold algorithm to approximate the value of f(OPT) in the following Algorithm 7.

Algorithm 7: ADT for I = (∩_{i=1}^d K_i) ∩ I_p
1 Initialization: ω̲_1 ← 3·max_{e∈E} f(e)/(7(p+d+1)), ω̄_1 ← n·max_{e∈E} f(e), ℓ ← ⌈log log n⌉, ǫ = 2/(7(p+1)(p+d+1))
2 for i = 1 : ℓ do
3   α_i = exp(log n · e^{−i}) − 1, λ = ω̲_i
4   while λ ≤ ω̄_i do
5     S_λ^{(i)} ← BT(λ, ǫ), λ ← λ(1 + α_i)
6   ω̲_{i+1} ← (3/2)·max_λ f(S_λ^{(i)}), ω̄_{i+1} ← ω̲_{i+1}(1 + α_i)
7 Return β = ω̲_ℓ

C.3 Proof of Proposition 15

Proof: The proof follows the analysis of the greedy algorithm for maximizing a monotone submodular function under a p-system constraint [2, 4]; we provide it here for completeness. An important observation is that, for any element e ∈ OPT_2, the reason it cannot be added to S is either that the marginal increment of e is less than ε·max_e f(e)/n, or that S + e is not a feasible set in I_p. Owing to this observation, we can bound R_2 = f(S ∪ OPT_2) − f(S) via the same arguments used for the standard greedy algorithm under a p-system constraint [4, 2].

For S = {e_1, e_2, ..., e_s} and i ∈ [s], we define the set C_i ⊆ OPT_2 \ S as

C_i = {e ∈ OPT_2 \ S | S^{(i−1)} ∪ {e} ∈ I},  (59)

which consists of the elements of OPT_2 that can still be added to the candidate solution set at the i-th step. According to the down-closed property of the independence system, C_{i+1} ⊆ C_i, and C_1 = OPT_2 \ S. Consider the set Q_i = S^{(i)} ∪ (C_1 \ C_{i+1}). On the one hand, C_1 \ C_{i+1} ∈ I, since it is a subset of OPT ∈ I, which implies that Q_i has a base of size at least |C_1 \ C_{i+1}|. On the other hand, S^{(i)} is a base of Q_i, since no element of Q_i \ S^{(i)} can be added to S^{(i)} according to the definition of C_i. By the definition of a p-system, we then know that

|C_1 \ C_{i+1}| ≤ p·|S^{(i)}| = pi.  (60)

Now consider the procedure of decreasing the threshold Δ. For 1 ≤ i ≤ s, let Δ_i be the value of Δ when element e_i is added to S; then

f(S^{(i)}) − f(S^{(i−1)}) ≥ Δ_i.  (61)

We further claim that

f(S^{(i−1)} + e) − f(S^{(i−1)}) ≤ Δ_i/(1 − ε) = Δ_{i−1}, ∀e ∈ C_i.  (62)

Otherwise, an element e ∈ C_i violating (62) would already have been included in the candidate solution set in a previous iteration, since

• S^{(i−1)} + e ∈ I is a feasible set, according to the definition of C_i;
• f(S^{(i−1)} + e) − f(S^{(i−1)}) ≥ θ·Σ_{j=1}^d c_j(e) holds, based on submodularity and the definition of OPT_2;
• the marginal increment of e with respect to the set S^{(i−1)} would be at least the threshold Δ_{i−1}.

However e ∉ S, so (62) is true. Using similar arguments, we obtain

f(S + e) − f(S) ≤ (ε/n)·M, ∀e ∈ C_{s+1},  (63)

where M = max_{e∈E} f(e). Hence we can show that

R_2 = f(S ∪ OPT_2) − f(S)  (64)
 ≤ Σ_{e∈OPT_2\S}[f(S + e) − f(S)]  (submodularity)
 =(a) Σ_{i=1}^s Σ_{e∈C_i\C_{i+1}}[f(S + e) − f(S)] + Σ_{e∈C_{s+1}}[f(S + e) − f(S)]  (65)
 ≤(b) [1/(1 − ε)]·Σ_{i=1}^s |C_i \ C_{i+1}|·Δ_i + |C_{s+1}|·(ε/n)·M,  (66)

where (a) is based on the definition of the sets C_i and (b) follows from inequalities (62)–(63). Observe that

• {Δ_i}_{i∈[s]} is a decreasing sequence;
• the total sum of the sequence {|C_i \ C_{i+1}|}_{i∈[s]} is fixed;
• Σ_{j=1}^i |C_j \ C_{j+1}| = |C_1 \ C_{i+1}| ≤ pi, according to (60).

Hence Σ_{i=1}^s |C_i \ C_{i+1}|·Δ_i achieves its maximum when |C_i \ C_{i+1}| = p for all i. As a consequence, the following upper bound holds for the first term of (66):

Σ_{i=1}^s |C_i \ C_{i+1}|·Δ_i ≤ Σ_{i=1}^s p·Δ_i ≤ p·f(S).  (67)

Plugging (67) and the inequality |C_{s+1}|·(ε/n)·M ≤ ε·M ≤ ε·f(OPT) into (66) completes the proof.

C.4 Proposition 21 and proof

Algorithm 7 returns ω̲_ℓ, which is shown to be a good approximation of f(OPT) in Proposition 21.

Proposition 21. For any i ≥ 0, we have

3f(OPT)/(7(p+d+1)(1+α_i)) ≤ ω̲_{i+1} ≤ (3/2)·f(OPT),  (68)

which implies that ω̲_ℓ ∈ [3f(OPT)/(7(p+d+1)(1+c)), (3/2)·f(OPT)], where c is a constant upper bound on α_{ℓ−1}.


Proof: We first note that

f(S*) ≥ min{(2/3)·θ, [f(OPT) − θ·(d − |OPT∩B|/2)]/(p + |OPT∩B| + 1) − (pε + ε)·f(OPT)}.  (69)

Inequality (69) follows directly from Propositions 17 and 14, by taking the minimum of the two lower bounds. The right-hand side of (68) follows directly from the fact that

ω̲_{i+1} = (3/2)·max_λ f(S_λ^{(i)}) ≤ (3/2)·f(OPT).

We prove the left-hand side by induction. For the base case i = 0, inequality (68) is equivalent to the statement that ω̲_1 ≥ 3f(OPT)/(7(p+d+1)n), which is true by the definition of ω̲_1. Now suppose that (68) holds for i = s − 1, i.e.,

ω̲_s ≥ 3f(OPT)/(7(p+d+1)(1+α_{s−1})).  (70)

From (70), we have ω̄_s = (1+α_{s−1})·ω̲_s ≥ 3f(OPT)/(7(p+d+1)). As a consequence, there must exist an integer z_s during the s-th iteration such that

λ*_s = ω̲_s·(1+α_s)^{z_s} ∈ [3f(OPT)/(7(p+d+1)(1+α_s)), 3f(OPT)/(7(p+d+1))],  (71)

which implies that ω̲_{s+1} can be lower bounded as follows when ε = 2/(7(p+1)(p+d+1)):

ω̲_{s+1} ≥(a) (3/2)·f(BT(λ*_s))
 ≥(b) (3/2)·min{(2/3)·3f(OPT)/(7(p+d+1)(1+α_s)), [f(OPT) − (3(d − |OPT∩B|/2)/(7(p+d+1)))·f(OPT)]/(p + |OPT∩B| + 1) − (2/(7(p+d+1)))·f(OPT)}
 ≥(c) [3/(7(p+d+1)(1+α_s))]·f(OPT),

where (a) is based on the definition of ω̲_{s+1}, (b) is obtained by plugging (71) into (69), and (c) follows from Corollary 13. Hence we have

ω̄_{s+1} = (1+α_s)·ω̲_{s+1} ≥ [3/(7(p+d+1))]·f(OPT),

which indicates that (68) also holds for i = s. The proof is complete.

C.5 Proof of Theorem 3

Proof: Let the optimal threshold be

θ* = f(OPT)/(d + 2/3 + |OPT∩B|/6 + 2p/3).

According to Proposition 21 and the fact that 0 ≤ |OPT∩B| ≤ d, it is easy to see that

θ* ∈ [3f(OPT)/(2(p + 7d/4 + 1)), f(OPT)] ⊆ [ω̲_ℓ/(p + 7d/4 + 1), 7(p+d+1)(1+c)·ω̲_ℓ].

Hence there exists an iteration in which λ ∈ [(1 − δ)θ*, θ*], from which we know that

f(S_o) ≥ min{[2(1 − δ)/3]·θ*, [f(OPT) − θ*·(d − |OPT∩B|/2)]/(p + |OPT∩B| + 1) − δ·f(OPT)} ≥ [1/(p + 7d/4 + 1) − 2δ]·f(OPT).

Note that using our adaptive decreasing threshold algorithm, we obtain a constant-factor approximation of f(OPT) in log log n rounds, while the time complexity of each round is n log n; thus the total time complexity is O_ε(n log n · log log n). The proof is complete.

C.6 Details of the algorithm for non-monotone objective and p-system + d-knapsack constraints

Algorithm 8: Maximizing a non-monotone submodular function under p-system + d-knapsack constraints
1 Input: Algorithm BT(θ, ε)
2 Output: A constant approximation of f(OPT)
3 Initialization: ω̲_1 ← 3·max_{e∈E} f(e)/(8p+7d+4), ω̄_1 ← n·max_{e∈E} f(e), U ← ∅, ℓ ← ⌈log log n⌉
4 for i = 1 : ℓ do
5   α_i = exp(log n · e^{−i}) − 1
6   λ = ω̲_i
7   while λ ≤ ω̄_i do
8     S_λ^{(i)} ← BT(λ, 2/((p+1)(8p+7d+4)))
9     λ ← λ(1 + α_i)
10  ω̲_{i+1} ← (3/2)·max_λ f(S_λ^{(i)})
11  ω̄_{i+1} ← ω̲_{i+1}(1 + α_i)
12 Return ω̲_ℓ
