**Berkholz, Christoph; Nordström, Jakob**

**Near-optimal lower bounds on quantifier depth and Weisfeiler-Leman refinement steps**. - In: Journal of the ACM, ISSN 1557-735X, Bd. 70 (2023), 5, 32, S. 32:1-32:32

We prove near-optimal tradeoffs for quantifier depth (also called quantifier rank) versus number of variables in first-order logic by exhibiting pairs of n-element structures that can be distinguished by a k-variable first-order sentence but where every such sentence requires quantifier depth at least nΩ (k/log k). Our tradeoffs also apply to first-order counting logic and, by the known connection to the k-dimensional Weisfeiler-Leman algorithm, imply near-optimal lower bounds on the number of refinement iterations. A key component in our proof is the hardness condensation technique introduced by Razborov in the context of proof complexity. We apply this method to reduce the domain size of relational structures while maintaining the minimal quantifier depth needed to distinguish them in finite variable logics.

https://doi.org/10.1145/3195257

**Berkholz, Christoph; Vinall-Smeeth, Harry**

**A dichotomy for succinct representations of homomorphisms**. - In: 50th International Colloquium on Automata, Languages, and Programming, (2023), S. 113:1-113:19

The task of computing homomorphisms between two finite relational structures A and B is a well-studied question with numerous applications. Since the set Hom(A, B) of all homomorphisms may be very large having a method of representing it in a succinct way, especially one which enables us to perform efficient enumeration and counting, could be extremely useful. One simple yet powerful way of doing so is to decompose Hom(A, B) using union and Cartesian product. Such data structures, called d-representations, have been introduced by Olteanu and Závodný [Olteanu and Závodný, 2015] in the context of database theory. Their results also imply that if the treewidth of the left-hand side structure A is bounded, then a d-representation of polynomial size can be found in polynomial time. We show that for structures of bounded arity this is optimal: if the treewidth is unbounded then there are instances where the size of any d-representation is superpolynomial. Along the way we develop tools for proving lower bounds on the size of d-representations, in particular we define a notion of reduction suitable for this context and prove an almost tight lower bound on the size of d-representations of all k-cliques in a graph.

https://doi.org/10.4230/LIPIcs.ICALP.2023.113

**Atserias, Albert; Berkholz, Christoph; Etessami, Kousha; Ochremiak, Joanna**

**Finite and algorithmic model theory**

**: report from Dagstuhl Seminar 22051**. - In: Dagstuhl Reports, ISSN 2192-5283, Bd. 12 (2022), 1, S. 101-118

https://doi.org/10.4230/DagRep.12.1.101

**Carmeli, Nofar; Zeevi, Shai; Berkholz, Christoph; Conte, Alessio; Kimelfeld, Benny; Schweikardt, Nicole**

**Answering (unions of) conjunctive queries using random access and random-order enumeration**. - In: ACM transactions on database systems, ISSN 0362-5915, Bd. 47 (2022), 3, S. 9:1-9:49

As data analytics becomes more crucial to digital systems, so grows the importance of characterizing the database queries that admit a more efficient evaluation. We consider the tractability yardstick of answer enumeration with a polylogarithmic delay after a linear-time preprocessing phase. Such an evaluation is obtained by constructing, in the preprocessing phase, a data structure that supports polylogarithmic-delay enumeration. In this article, we seek a structure that supports the more demanding task of a “random permutation”: polylogarithmic-delay enumeration in truly random order. Enumeration of this kind is required if downstream applications assume that the intermediate results are representative of the whole result set in a statistically meaningful manner. An even more demanding task is that of “random access”: polylogarithmic-time retrieval of an answer whose position is given. We establish that the free-connex acyclic CQs are tractable in all three senses: enumeration, random-order enumeration, and random access; and in the absence of self-joins, it follows from past results that every other CQ is intractable by each of the three (under some fine-grained complexity assumptions). However, the three yardsticks are separated in the case of a union of CQs (UCQ): while a union of free-connex acyclic CQs has a tractable enumeration, it may (provably) admit no random access. We identify a fragment of such UCQs where we can guarantee random access with polylogarithmic access time (and linear-time preprocessing) and a more general fragment where we can guarantee tractable random permutation. For general unions of free-connex acyclic CQs, we devise two algorithms with relaxed guarantees: one has logarithmic delay in expectation, and the other provides a permutation that is almost uniformly distributed. Finally, we present an implementation and an empirical study that show a considerable practical superiority of our random-order enumeration approach over state-of-the-art alternatives.

https://doi.org/10.1145/3531055

**Berkholz, Christoph; Merz, Maximilian**

**Probabilistic databases under updates: Boolean query evaluation and ranked enumeration**. - In: PODS '21, (2021), S. 402-415

We consider tuple-independent probabilistic databases in a dynamic setting, where tuples can be inserted or deleted. In this context we are interested in efficient data structures for maintaining the query result of Boolean as well as non-Boolean queries. For Boolean queries, we show how the known lifted inference rules can be made dynamic, so that they support single-tuple updates with only a constant number of arithmetic operations. As a consequence, we obtain that the probability of every safe UCQ can be maintained with constant update time. For non-Boolean queries, our task is to enumerate all result tuples ranked by their probability. We develop lifted inference rules for non-Boolean queries, and, based on these rules, provide a dynamic data structure that allows both log-time updates and ranked enumeration with logarithmic delay. As an application, we identify a fragment of non-repeating conjunctive queries that supports log-time updates as well as log-delay ranked enumeration. This characterisation is tight under the OMv-conjecture.

https://doi.org/10.1145/3452021.3458326

**Carmeli, Nofar; Zeevi, Shai; Berkholz, Christoph; Kimelfeld, Benny; Schweikardt, Nicole**

**Answering (unions of) conjunctive queries using random access and random-order enumeration**. - In: PODS '20, (2020), S. 393-409

As data analytics becomes more crucial to digital systems, so grows the importance of characterizing the database queries that admit a more efficient evaluation. We consider the tractability yardstick of answer enumeration with a polylogarithmic delay after a linear-time preprocessing phase. Such an evaluation is obtained by constructing, in the preprocessing phase, a data structure that supports polylogarithmic-delay enumeration. In this paper, we seek a structure that supports the more demanding task of a "random permutation": polylogarithmic-delay enumeration in truly random order. Enumeration of this kind is required if downstream applications assume that the intermediate results are representative of the whole result set in a statistically valuable manner. An even more demanding task is that of a "random access": polylogarithmic-time retrieval of an answer whose position is given. We establish that the free-connex acyclic CQs are tractable in all three senses: enumeration, random-order enumeration, and random access; and in the absence of self-joins, it follows from past results that every other CQ is intractable by each of the three (under some fine-grained complexity assumptions). However, the three yardsticks are separated in the case of a union of CQs (UCQ): while a union of free-connex acyclic CQs has a tractable enumeration, it may (provably) admit no random access. For such UCQs we devise a random-order enumeration whose delay is logarithmic in expectation. We also identify a subclass of UCQs for which we can provide random access with polylogarithmic access time. Finally, we present an implementation and an empirical study that show a considerable practical superiority of our random-order enumeration approach over state-of-the-art alternatives.

https://doi.org/10.1145/3375395.3387662

**Berkholz, Christoph; Gerhardt, Fabian; Schweikardt, Nicole**

**Constant delay enumeration for conjunctive queries - a tutorial**. - In: ACM SIGLOG news, ISSN 2372-3491, Bd. 7 (2020), 1, S. 4-33

This paper is the tutorial we wish we had had available when starting our own research on constant delay enumeration for conjunctive queries. It provides precise statements and detailed, self-contained proofs of the fundamental results in this area.

https://doi.org/10.1145/3385634.3385636

**Berkholz, Christoph; Nordström, Jakob**

**Supercritical space-width trade-offs for resolution**. - In: SIAM journal on computing, ISSN 1095-7111, Bd. 49 (2020), 1, S. 98-118

We show that there are CNF formulas which can be refuted in resolution in both small space and small width, but for which any small-width proof must have space exceeding by far the linear worst-case upper bound. This significantly strengthens the space-width trade-offs in [E. Ben-Sasson, SIAM J. Comput., 38 (2009), pp. 2511-2525], and provides one more example of trade-offs in the "supercritical" regime above worst case recently identified by [A.A. Razborov, J. ACM, 63 (2016), 16]. We obtain our results by using Razborov's new hardness condensation technique and combining it with the space lower bounds in [E. Ben-Sasson and J. Nordström, Short proofs may be spacious: An optimal separation of space and length in resolution, in Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS '08), 2008, pp. 709-718].

https://doi.org/10.1137/16M1109072

**Berkholz, Christoph; Chen, Hubie**

**Compiling existential positive queries to bounded-variable fragments**. - In: PODS '19, (2019), S. 353-364

A crucial property of bounded-variable fragments of first-order logic is that they can be evaluated in polynomial time. It is therefore a useful preprocessing step to rewrite, if possible, a first-order query to a logically equivalent one with a minimum number of variables. However, it may occur that reducing the number of variables causes an increase in formula size. We investigate this trade-off for the existential-positive fragment of first-order queries, where variable minimisation is decidable in general. In particular, we study the blow-up in the formula size when compiling existential-positive queries to the bounded variable fragment of positive first-order logic. While the increase of the formula size is always at most exponential, we identify situations (based on the signature and the number of variables) where only a polynomial blow-up is needed. In all other cases, we show that an exponential lower bound on the formula size of the compiled formula that matches the general upper bound. This exponential lower bound is unconditional, and is the first unconditional lower bound for formula size with respect to the studied compilation; it is proved via establishing a novel interface with circuit complexity which may be of future interest.

https://doi.org/10.1145/3294052.3319693

**Berkholz, Christoph; Schweikardt, Nicole**

**Constant delay enumeration with FPT-preprocessing for conjunctive queries of bounded submodular width**. - In: 44th International Symposium on Mathematical Foundations of Computer Science, (2019), S. 58:1-58:15

Marx (STOC 2010, J. ACM 2013) introduced the notion of submodular width of a conjunctive query (CQ) and showed that for any class Phi of Boolean CQs of bounded submodular width, the model-checking problem for Phi on the class of all finite structures is fixed-parameter tractable (FPT). Note that for non-Boolean queries, the size of the query result may be far too large to be computed entirely within FPT time. We investigate the free-connex variant of submodular width and generalise Marx's result to non-Boolean queries as follows: For every class Phi of CQs of bounded free-connex submodular width, within FPT-preprocessing time we can build a data structure that allows to enumerate, without repetition and with constant delay, all tuples of the query result. Our proof builds upon Marx's splitting routine to decompose the query result into a union of results; but we have to tackle the additional technical difficulty to ensure that these can be enumerated efficiently.

https://doi.org/10.4230/LIPIcs.MFCS.2019.58

**Berkholz, Christoph; Keppeler, Jens; Schweikardt, Nicole**

**Answering UCQs under updates and in the presence of integrity constraints**. - In: 21st International Conference on Database Theory, (2018), S. 8:1-8:19

We investigate the query evaluation problem for fixed queries over fully dynamic databases where tuples can be inserted or deleted. The task is to design a dynamic data structure that can immediately report the new result of a fixed query after every database update. We consider unions of conjunctive queries (UCQs) and focus on the query evaluation tasks testing (decide whether an input tuple belongs to the query result), enumeration (enumerate, without repetition, all tuples in the query result), and counting (output the number of tuples in the query result). We identify three increasingly restrictive classes of UCQs which we call t-hierarchical, q-hierarchical, and exhaustively q-hierarchical UCQs. Our main results provide the following dichotomies: If the query's homomorphic core is t-hierarchical (q-hierarchical, exhaustively q-hierarchical), then the testing (enumeration, counting) problem can be solved with constant update time and constant testing time (delay, counting time). Otherwise, it cannot be solved with sublinear update time and sublinear testing time (delay, counting time), unless the OV-conjecture and/or the OMv-conjecture fails. We also study the complexity of query evaluation in the dynamic setting in the presence of integrity constraints, and we obtain similar dichotomy results for the special case of small domain constraints (i.e., constraints which state that all values in a particular column of a relation belong to a fixed domain of constant size).

https://doi.org/10.4230/LIPIcs.ICDT.2018.8

**Berkholz, Christoph; Keppeler, Jens; Schweikardt, Nicole**

**Answering FO+MOD queries under updates on bounded degree databases**. - In: ACM transactions on database systems, ISSN 0362-5915, Bd. 43 (2018), 2, S. 7:1-7:32

We investigate the query evaluation problem for fixed queries over fully dynamic databases, where tuples can be inserted or deleted. The task is to design a dynamic algorithm that immediately reports the new result of a fixed query after every database update. We consider queries in first-order logic (FO) and its extension with modulo-counting quantifiers (FO+MOD) and show that they can be efficiently evaluated under updates, provided that the dynamic database does not exceed a certain degree bound. In particular, we construct a data structure that allows us to answer a Boolean FO+MOD query and to compute the size of the result of a non-Boolean query within constant time after every database update. Furthermore, after every database update, we can update the data structure in constant time such that afterwards we are able to test within constant time for a given tuple whether or not it belongs to the query result, to enumerate all tuples in the new query result, and to enumerate the difference between the old and the new query result with constant delay between the output tuples. The preprocessing time needed to build the data structure is linear in the size of the database. Our results extend earlier work on the evaluation of first-order queries on static databases of bounded degree and rely on an effective Hanf normal form for FO+MOD recently obtained by Heimberg, Kuske, and Schweikardt (LICS 2016).

https://doi.org/10.1145/3232056

**Berkholz, Christoph;**

**The relation between polynomial calculus, Sherali-Adams, and sum-of-squares proofs**. - In: 35th Symposium on Theoretical Aspects of Computer Science, (2018), S. 11:1-11:14

We relate different approaches for proving the unsatisfiability of a system of real polynomial equations over Boolean variables. On the one hand, there are the static proof systems Sherali-Adams and sum-of-squares (a.k.a. Lasserre), which are based on linear and semi-definite programming relaxations. On the other hand, we consider polynomial calculus, which is a dynamic algebraic proof system that models Gröbner basis computations. Our first result is that sum-of-squares simulates polynomial calculus: any polynomial calculus refutation of degree d can be transformed into a sum-of-squares refutation of degree 2d and only polynomial increase in size. In contrast, our second result shows that this is not the case for Sherali-Adams: there are systems of polynomial equations that have polynomial calculus refutations of degree 3 and polynomial size, but require Sherali-Adams refutations of large degree and exponential size. A corollary of our first result is that the proof systems Positivstellensatz and Positivstellensatz Calculus, which have been separated over non-Boolean polynomials, simulate each other in the presence of Boolean axioms.

https://doi.org/10.4230/LIPIcs.STACS.2018.11

**Berkholz, Christoph; Verbitsky, Oleg**

**On the speed of constraint propagation and the time complexity of arc consistency testing**. - In: Journal of computer and system sciences, ISSN 1090-2724, Bd. 91 (2018), S. 104-114

Establishing arc consistency on two relational structures is one of the most popular heuristics for the constraint satisfaction problem. We aim at determining the time complexity of arc consistency testing. The input structures G and H can be supposed to be connected colored graphs, as the general problem reduces to this particular case. We first observe the upper bound O(e(G)v(H)+v(G)e(H)), which implies the bound O(e(G)e(H)) in terms of the number of edges and the bound O((v(G)+v(H))3) in terms of the number of vertices. We then show that both bounds are tight as long as an arc consistency algorithm is based on constraint propagation (as all current algorithms are). Our lower bounds are based on examples of slow constraint propagation. We measure the speed of constraint propagation observed on a pair G,H by the size of a combinatorial proof that Spoiler wins the existential 2-pebble game on G,H.

https://doi.org/10.1016/j.jcss.2017.09.003

**Berkholz, Christoph; Keppeler, Jens; Schweikardt, Nicole**

**Answering conjunctive queries under updates**. - In: PODS '17, (2017), S. 303-318

We consider the task of enumerating and counting answers to k-ary conjunctive queries against relational databases that may be updated by inserting or deleting tuples. We exhibit a new notion of q-hierarchical conjunctive queries and show that these can be maintained efficiently in the following sense. During a linear time pre-processing phase, we can build a data structure that enables constant delay enumeration of the query results; and when the database is updated, we can update the data structure and restart the enumeration phase within constant time. For the special case of self-join free conjunctive queries we obtain a dichotomy: if a query is not q-hierarchical, then query enumeration with sublinear *) delay and sublinear update time (and arbitrary preprocessing time) is impossible. For answering Boolean conjunctive queries and for the more general problem of counting the number of solutions of k-ary queries we obtain complete dichotomies: if the query's homomorphic core is q-hierarchical, then size of the the query result can be computed in linear time and maintained with constant update time. Otherwise, the size of the query result cannot be maintained with sublinear update time. All our lower bounds rely on the OMv-conjecture, a conjecture on the hardness of online matrix-vector multiplication that has recently emerged in the field of fine-grained complexity to characterise the hardness of dynamic problems. The lower bound for the counting problem additionally relies on the orthogonal vectors conjecture, which in turn is implied by the strong exponential time hypothesis.*) By sublinear we mean O(n(1-ε) for some ε > 0, where n is the size of the active domain of the current database.

https://doi.org/10.1145/3034786.3034789

**Berkholz, Christoph; Keppeler, Jens; Schweikardt, Nicole**

**Answering FO+MOD queries under updates on bounded degree databases**. - In: 20th International Conference on Database Theory, (2017), S. 8:1-8:18

We investigate the query evaluation problem for fixed queries over fully dynamic databases, where tuples can be inserted or deleted. The task is to design a dynamic algorithm that immediately reports the new result of a fixed query after every database update. We consider queries in first-order logic (FO) and its extension with modulo-counting quantifiers (FO+MOD), and show that they can be efficiently evaluated under updates, provided that the dynamic database does not exceed a certain degree bound. In particular, we construct a data structure that allows to answer a Boolean FO+MOD query and to compute the size of the query result within constant time after every database update. Furthermore, after every update we are able to immediately enumerate the new query result with constant delay between the output tuples. The time needed to build the data structure is linear in the size of the database. Our results extend earlier work on the evaluation of first-order queries on static databases of bounded degree and rely on an effective Hanf normal form for FO+MOD recently obtained by [Heimberg, Kuske, and Schweikardt, LICS, 2016].

https://doi.org/10.4230/LIPIcs.ICDT.2017.8

**Berkholz, Christoph; Bonsma, Paul; Grohe, Martin**

**Tight lower and upper bounds for the complexity of canonical colour refinement**. - In: Theory of computing systems, ISSN 1433-0490, Bd. 60 (2017), 4, S. 581-614

An assignment of colours to the vertices of a graph is stable if any two vertices of the same colour have identically coloured neighbourhoods. The goal of colour refinement is to find a stable colouring that uses a minimum number of colours. This is a widely used subroutine for graph isomorphism testing algorithms, since any automorphism needs to be colour preserving. We give an O((m + n)log n) algorithm for finding a canonical version of such a stable colouring, on graphs with n vertices and m edges. We show that no faster algorithm is possible, under some modest assumptions about the type of algorithm, which captures all known colour refinement algorithms.

https://doi.org/10.1007/s00224-016-9686-0

**Berkholz, Christoph; Grohe, Martin**

**Linear diophantine equations, group CSPs, and graph isomorphism**. - In: Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, (2017), S. 327-339

In recent years, we have seen several approaches to the graph isomorphism problem based on “generic” mathematical programming or algebraic (Grobner basis) techniques. For most of these, lower bounds have been established. In fact, it has been shown that the pairs of non-isomorphic CFI-graphs (introduced by Cai, FUrer, and Immerman in 1992 as hard examples for the combinatorial Weisfeiler-Leman algorithm) cannot be distinguished by these mathematical algorithms. A notable exception were the algebraic algorithms over the field 2, for which no lower bound was known. Another, in some way even stronger, approach to graph isomorphism testing is based on solving systems of linear Diophantine equations (that is, linear equations over the integers), which is known to be possible in polynomial time. So far, no lower bounds for this approach were known. Lower bounds for the algebraic algorithms can best be proved in the framework of proof complexity, where they can be phrased as lower bounds for algebraic proof systems such as Nullstellensatz or the (more powerful) polynomial calculus. We give new hard examples for these systems: families of pairs of non-isomorphic graphs that are hard to distinguish by polynomial calculus proofs simultaneously over all prime fields, including 2, as well as examples that are hard to distinguish by the systems-of-linear-Diophantine- equations approach. In a previous paper, we observed that the CFI-graphs are closely related to what we call “group CSPs”: constraint satisfaction problems where the constraints are membership tests in some coset of a subgroup of a cartesian power of a base group (Z2 in the case of the classical CFI-graphs). Our new examples are also based on group CSPs (for Abelian groups), but here we extend the CSPs by a few non-group constraints to obtain even harder instances for graph isomorphism.

https://doi.org/10.1137/1.9781611974782.21

**Berkholz, Christoph; Nordström, Jakob**

**Near-optimal lower bounds on quantifier depth and Weisfeiler-Leman refinement steps**. - In: Proceedings of the 31st Annual ACM-IEEE Symposium on Logic in Computer Science (LICS 2016), (2022), S. 267-276

We prove near-optimal trade-offs for quantifier depth versus number of variables in first-order logic by exhibiting pairs of n-element structures that can be distinguished by a k-variable first-order sentence but where every such sentence requires quantifier depth at least nΩ(k / log k). Our trade-offs also apply to first-order counting logic, and by the known connection to the k-dimensional Weisfeiler--Leman algorithm imply near-optimal lower bounds on the number of refinement iterations. A key component in our proof is the hardness condensation technique recently introduced by [Razborov '16] in the context of proof complexity. We apply this method to reduce the domain size of relational structures while maintaining the quantifier depth required to distinguish them.

https://doi.org/10.1145/2933575.2934560

**Berkholz, Christoph; Nordström, Jakob**

**Supercritical space-width trade-offs for resolution**. - In: 43rd International Colloquium on Automata, Languages, and Programming, (2016), S. 57:1-57:14

We show that there are CNF formulas which can be refuted in resolution in both small space and small width, but for which any small-width resolution proof must have space exceeding by far the linear worst-case upper bound. This significantly strengthens the space-width trade-offs in [Ben-Sasson 2009], and provides one more example of trade-offs in the "supercritical" regime above worst case recently identified by [Razborov 2016]. We obtain our results by using Razborov’s new hardness condensation technique and combining it with the space lower bounds in [Ben-Sasson and Nordström 2008].

https://doi.org/10.4230/LIPIcs.ICALP.2016.57

**Berkholz, Christoph;**

**Untere Schranken für heuristische Algorithmen**. - In: Ausgezeichnete Informatikdissertationen, Bd. 2014 (2015), S. 31-40

Dieser Beitrag ist eine deutschsprachige Zusammenfassung der Dissertation des Autors. In der Dissertation werden drei verwandte heuristische Verfahren zum Lösen schwerer Probleme untersucht: der k-Konsistenztest für das Constraint-Satisfaction-Problem, Resolution beschränkter Weite für 3-SAT und der Knotenpartitionierungsalgorithmus für das Graphisomorphieproblem. Die Hauptergebnisse der Dissertation sind untere Schranken an die Zeitkomplexität der Verfahren. In diesem Beitrag werden die untersuchten Verfahren eingeführt und die erzielten unteren Schranken vorgestellt.

**Berkholz, Christoph; Grohe, Martin**

**Limitations of algebraic approaches to graph isomorphism testing**. - In: Automata, Languages, and Programming, (2015), S. 155-166

We investigate the power of graph isomorphism algorithms based on algebraic reasoning techniques like Gröbner basis computation. The idea of these algorithms is to encode two graphs into a system of equations that are satisfiable if and only if if the graphs are isomorphic, and then to (try to) decide satisfiability of the system using, for example, the Gröbner basis algorithm. In some cases this can be done in polynomial time, in particular, if the equations admit a bounded degree refutation in an algebraic proof systems such as Nullstellensatz or polynomial calculus. We prove linear lower bounds on the polynomial calculus degree over all fields of characteristic ≠2 and also linear lower bounds for the degree of Positivstellensatz calculus derivations. We compare this approach to recently studied linear and semidefinite programming approaches to isomorphism testing, which are known to be related to the combinatorial Weisfeiler-Lehman algorithm. We exactly characterise the power of the Weisfeiler-Lehman algorithm in terms of an algebraic proof system that lies between degree-k Nullstellensatz and degree-k polynomial calculus.

https://doi.org/10.1007/978-3-662-47672-7_13

**Berkholz, Christoph; Krebs, Andreas; Verbitsky, Oleg**

**Bounds for the quantifier depth in finite-variable logics: alternation hierarchy**. - In: ACM transactions on computational logic, ISSN 1529-3785, Bd. 16 (2015), 3, S. 21:1-21:26

Given two structures G and H distinguishable in FOk (first-order logic with k variables), let Ak(G, H) denote the minimum alternation depth of a FOk formula distinguishing G from H. Let Ak(n) be the maximum value of Ak(G, H) over n-element structures. We prove the strictness of the quantifier alternation hierarchy of FO2 in a strong quantitative form, namely A2(n) > n/8 − 2, which is tight up to a constant factor. For each k ⩾ 2, it holds that Ak(n) > log k + 1n − 2 even over colored trees, which is also tight up to a constant factor if k ⩾ 3. For k ⩾ 3, the last lower bound holds also over uncolored trees, whereas the alternation hierarchy of FO2 collapses even over all uncolored graphs. We also show examples of colored graphs G and H on n vertices that can be distinguished in FO2 much more succinctly if the alternation number is increased just by one: Whereas in Σi it is possible to distinguish G from H with bounded quantifier depth, in Πi this requires quantifier depth Ω(n2). The quadratic lower bound is best possible here because, if G and H can be distinguished in FOk with i quantifier alternations, this can be done with quantifier depth n2k − 2 + 1 and the same number of alternations.

https://doi.org/10.1145/2732409

**Berkholz, Christoph;**

**Lower bounds for heuristic algorithms**, 2015. - 1 Online-Ressource : Aachen, Techn. Hochsch., Diss., 2014

Zsfassung in dt. u. engl. Sprache

http://nbn-resolving.de/urn/resolver.pl?urn=urn:nbn:de:hbz:82-rwth-2015-004945

**Berkholz, Christoph; Elberfeld, Michael**

**Parameterized complexity of fixed variable logics**. - In: 34th International Conference on Foundation of Software Technology and Theoretical Computer Science, (2014), S. 109-120

We study the complexity of model checking formulas in first-order logic parameterized by the number of distinct variables in the formula. This problem, which is not known to be fixed-parameter tractable, resisted to be properly classified in the context of parameterized complexity. We show that it is complete for a newly-defined complexity class that we propose as an analog of the classical class PSPACE in parameterized complexity. We support this intuition by the following findings: First, the proposed class admits a definition in terms of alternating Turing machines in a similar way as PSPACE can be defined in terms of polynomial-time alternating machines. Second, we show that parameterized versions of other PSPACE-complete problems, like winning certain pebble games and finding restricted resolution refutations, are complete for this class.

https://doi.org/10.4230/LIPIcs.FSTTCS.2014.109

**Berkholz, Christoph;**

**The propagation depth of local consistency**. - In: Principles and Practice of Constraint Programming, (2014), S. 158-173

We establish optimal bounds on the number of nested propagation steps in k-consistency tests. It is known that local consistency algorithms such as arc-, path- and k-consistency are not efficiently parallelizable. Their inherent sequential nature is caused by long chains of nested propagation steps, which cannot be executed in parallel. This motivates the question “What is the minimum number of nested propagation steps that have to be performed by k-consistency algorithms on (binary) constraint networks with n variables and domain size d?” It was known before that 2-consistency requires Ω(nd) and 3-consistency requires Ω(n 2) sequential propagation steps. We answer the question exhaustively for every k ≥ 2: there are binary constraint networks where any k-consistency procedure has to perform Ω(n k − 1 d k − 1) nested propagation steps before local inconsistencies were detected. This bound is tight, because the overall number of propagation steps performed by k-consistency is at most n k − 1 d k − 1.

https://doi.org/10.1007/978-3-319-10428-7_14

**Berkholz, Christoph; Verbitsky, Oleg**

**On the speed of constraint propagation and the time complexity of arc consistency testing**. - In: Mathematical Foundations of Computer Science 2013, (2013), S. 159-170

Establishing arc consistency on two relational structures is one of the most popular heuristics for the constraint satisfaction problem. We aim at determining the time complexity of arc consistency testing. The input structures G and H can be supposed to be connected colored graphs, as the general problem reduces to this particular case. We first observe the upper bound O(e(G)v(H) + v(G)e(H)), which implies the bound O(e(G)e(H)) in terms of the number of edges and the bound O((v(G) + v(H))3) in terms of the number of vertices. We then show that both bounds are tight up to a constant factor as long as an arc consistency algorithm is based on constraint propagation (as all current algorithms are).

https://doi.org/10.1007/978-3-642-40313-2_16

**Berkholz, Christoph; Bonsma, Paul; Grohe, Martin**

**Tight lower and upper bounds for the complexity of canonical colour refinement**. - In: Algorithms – ESA 2013, (2013), S. 145-156

An assignment of colours to the vertices of a graph is stable if any two vertices of the same colour have identically coloured neighbourhoods. The goal of colour refinement is to find a stable colouring that uses a minimum number of colours. This is a widely used subroutine for graph isomorphism testing algorithms, since any automorphism needs to be colour preserving. We give an O((m+n)log n) algorithm for finding a canonical version of such a stable colouring, on graphs with n vertices and m edges. We show that no faster algorithm is possible, under some modest assumptions about the type of algorithm, which captures all known colour refinement algorithms.

https://doi.org/10.1007/978-3-642-40450-4_13

**Berkholz, Christoph; Krebs, Andreas; Verbitsky, Oleg**

**Bounds for the quantifier depth in finite-variable logics: alternation hierarchy**. - In: Computer Science Logic 2013, (2013), S. 61-80

Given two structures G and H distinguishable in FO^k (first-order logic with k variables), let A^k(G,H) denote the minimum alternation depth of a FO^k formula distinguishing G from H. Let A^k(n) be the maximum value of A^k(G,H) over n-element structures. We prove the strictness of the quantifier alternation hierarchy of FO^2 in a strong quantitative form, namely A^2(n) >= n/8-2, which is tight up to a constant factor. For each k >= 2, it holds that A^k(n) > log_(k+1) n-2 even over colored trees, which is also tight up to a constant factor if k >= 3. For k >= 3 the last lower bound holds also over uncolored trees, while the alternation hierarchy of FO^2 collapses even over all uncolored graphs. We also show examples of colored graphs G and H on n vertices that can be distinguished in FO^2 much more succinctly if the alternation number is increased just by one: while in Sigma_i it is possible to distinguish G from H with bounded quantifier depth, in Pi_i this requires quantifier depth Omega(n2). The quadratic lower bound is best possible here because, if G and H can be distinguished in FO^k with i quantifier alternations, this can be done with quantifier depth n^(2k-2).

https://doi.org/10.4230/LIPIcs.CSL.2013.61

**Berkholz, Christoph;**

**Lower bounds for existential pebble games and k-consistency tests**. - In: Logical methods in computer science, ISSN 1860-5974, Bd. 9 (2013), 4:2, S. 1-23

The existential k-pebble game characterizes the expressive power of the existential-positive k-variable fragment of first-order logic on finite structures. The winner of the existential k-pebble game on two given finite structures can be determined in time O(n2k) by dynamic programming on the graph of game configurations. We show that there is no O(n(k-3)/12)-time algorithm that decides which player can win the existential k-pebble game on two given structures. This lower bound is unconditional and does not rely on any complexity-theoretic assumptions. Establishing strong k-consistency is a well-known heuristic for solving the constraint satisfaction problem (CSP). By the game characterization of Kolaitis and Vardi our result implies that there is no O(n(k-3)/12)-time algorithm that decides if strong k-consistency can be established for a given CSP-instance.

https://doi.org/10.2168/LMCS-9(4:2)2013

**Berkholz, Christoph;**

**Lower bounds for existential pebble games and k-consistency tests**. - In: 27th Annual IEEE Symposium on Logic in Computer Science (LICS), 2012, (2012), S. 25-34

The existential k-pebble game characterizes the expressive power of the existential-positive k-variable fragment of first-order logic on finite structures. The winner of the existential k-pebble game on two given finite structures can easily be determined in polynomial time, where the degree of the polynomial is linear in k. We show that this linear dependence on the parameter k is necessary by proving an unconditional polynomial lower bound for determining the winner in the existential k-pebble game on finite structures. Establishing strong k-consistency is a well-known heuristic for solving the constraint satisfaction problem (CSP). By the game characterization of Kolaitis and Vardi our result implies a lower bound on every algorithm that decides if strong k-consistency can be established for a given CSP-instance.

https://doi.org/10.1109/LICS.2012.14

**Berkholz, Christoph;**

**On the complexity of finding narrow proofs**. - In: IEEE 53rd Annual Symposium on Foundations of Computer Science (FOCS), 2012, (2012), S. 351-360

We study the complexity of the following "resolution width problem": Does a given 3-CNF formula have a resolution refutation of width k? For fixed k, refutations of width k can easily be found in polynomial time. We prove a matching polynomial lower bound for the resolution width problem that shows that there is no significant faster way to decide the existence of a width-k refutation than exhaustively searching for it. This lower bound is unconditional and does not rely on any unproven complexity theoretic assumptions. We also prove that the resolution width problem is EXPTIME-complete (if k is part of the input). This confirms a conjecture by Vardi, who has first raised the question for the complexity of the resolution width problem. Furthermore, we prove that the variant of the resolution width problem for regular resolution is PSPACE-complete, confirming a conjecture by Urquhart.

https://doi.org/10.1109/FOCS.2012.48