Wednesday, 30 April 2014

Introduction to Cook Jokes

This is the last blog post.

We introduced several fields related to theoretical computer science in this blog. My objective was mostly to have some record of learning this material, and especially of dealing with the classification issues that arise.

Here are the best "Cook Jokes" that were made during the offering of Prof. Steve Cook's 2013-2014 complexity (463) and logic (438) courses.

Best Cook Jokes:

1. You just got Cookblocked!

  • rationale: You are talking with a classmate when a pretty CS girl approaches you; however, she is only standing in line to speak with Cook after lecture
  • credit: Sean

2. If you had Rice, you could Cook almost anything

  • rationale: If you have ever taken a Cook class, you know that it is forbidden to use Rice's theorem
  • credit: Lex

3. You can Cook a Karp but you can't Karp a Cook

  • rationale: Karp (and others) praised Cook's groundbreaking work
  • credit: Av

4. I think he Cooked the books...

  • rationale: Says a fellow student as Cook writes his own theorems on the blackboard and gives the reference in Sipser
  • credit: Av

5. What's he Cooking up today?

  • rationale: You hear these words as you enter BA1220; needless to say, it never gets old
  • credit: Sean

6. I compiled all the notes into a nice Cookbook

  • rationale: Print out all the notes from Cook's lectures and staple them together
  • credit: Lex

Saturday, 8 March 2014

Introduction to Algorithms

Here, we briefly introduce the two interrelated fields of Algorithm Design and the Analysis of Algorithms.

Algorithm Design is a framework for studying algorithmic design paradigms (what types of problems they can solve optimally, what data structures can be used, which algorithms are robust enough in nondeterministic settings...). The Analysis of Algorithms is the subfield of complexity theory that provides a framework for studying the running time and correctness of algorithms.

Here is an attempt to organise algorithms according to some important properties:

  • recursive/iterative | not using looping
  • serial | parallel (= concurrent)
      • "classical" parallel
      • distributed
  • exact | approximate
  • deterministic | nondeterministic*
  • not using randomness | probabilistic (= randomized)
      • Monte Carlo (ND solution)
      • Las Vegas (ND runtime)
  • not using heuristics | heuristic-based*

Here is a classification by design paradigm:
  • Brute Force - Try every candidate solution in the search space until a solution is found.
  • Divide and Conquer - Break the problem down to its most trivial subproblems and recombine solutions bottom-up.
  • Dynamic Programming - Remember solutions to some subproblems in a table so there is no need to recompute them (see the sketch after this list).
  • Greedy Method - Order candidate solutions in an informed way to avoid exploring the entire search space.
  • Linear Programming - Optimization of a function given linear constraints.
  • Other Search/Enumeration/Optimization design paradigms
    • Backtracking - Depth first search on search space (tree), only exploring branches where possible solutions can lie.
    • Branch and Bound - Special backtracking where bounding is a bit more sophisticated.
    • Hill Climbing - Make a move in the search space and see if it improves the objective function and keep on going in that direction.
    • Local Search - Move to the neighbour in the search space that improves the objective function the most.
    • Tabu Search - Local Search that avoids (by prior or learning) tabu areas in the search space.
    • Simulated Annealing - Probabilistic exploration of the search space based on decreasing the entropy of the decisions slowly with time.
    • Genetic Algorithm - Algorithm that simulates evolution, ensuring that the genes of the genotype that is the most fitted to its environment (according to some objective function) are more likely to be carried over to the next generation.
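
For illustration, here is a minimal Python sketch (a toy example) contrasting brute force with dynamic programming on Fibonacci numbers: the memoised version remembers solutions to subproblems in a table so they are never recomputed.

    # Brute force vs dynamic programming on a toy problem (Fibonacci numbers).
    from functools import lru_cache

    def fib_brute_force(n):
        # Recomputes the same subproblems over and over: exponential time.
        if n < 2:
            return n
        return fib_brute_force(n - 1) + fib_brute_force(n - 2)

    @lru_cache(maxsize=None)
    def fib_dynamic(n):
        # Memoisation: each subproblem is solved once and remembered in a table.
        if n < 2:
            return n
        return fib_dynamic(n - 1) + fib_dynamic(n - 2)

    # fib_dynamic(90) returns instantly; fib_brute_force(90) would take ages.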


Notes (*):

Nondeterminism: when behaviour cannot be predicted deterministically from state and input. There is some discrepancy in the terminology here again:
  • Generally, nondeterminism is equivalent to randomness and stochasticity. It can arise from the algorithm itself (e.g. a probabilistic alg) or from externalities (i.e. because of the implementation setting - e.g. a parallel alg running on a system with a race condition).
  • In Complexity Theory, nondeterminism has a very specific sense: a nondeterministic machine can branch its thread of execution at any point in the control flow; so a ND alg here is one that can run on a ND model of computation. Notice that the result is deterministic!

Heuristic: If you have taken a psychology course before, you might have learned that the difference between an algorithm and a heuristic lies in the optimality of the solution; so what about approximation algs or heuristic algs? To reconcile the terminology, we consider "heuristic algorithms" (or "heuristic-based algs") as a synonym of "algorithms making some use of heuristics", but we do not count heuristics themselves as algorithms:
  • a heuristic-based algorithm integrates into its logic one or more strategies that improve some aspect of the algorithm's behaviour without providing rigorous information about the tradeoffs they induce.
  • a heuristic can be viewed as a purely heuristic-based alg (the heuristic aspect applies to the purpose of the alg itself, hence the induced tradeoff always affects the optimality of the solution). In this sense, heuristics are approximation algs for which we have no information on the approximation ratio.

As a field of study, the Design and Analysis of Algorithms - or simply "Algorithms" - lacks an overarching formalism (Bellman, Borodin...); even the definition of an algorithm itself is not generally agreed upon (if you have time: http://en.wikipedia.org/wiki/Algorithm_characterizations).

Saturday, 8 February 2014

Computability and Complexity

Here are my notes on Computability Theory and Computational Complexity Theory.




Recommended Reading:

Read all of Sipser for an excellent intro, then move on to my favourite TCS book: Savage's Models of Computation (http://cs.brown.edu/~jes/book/).

Sunday, 19 January 2014

Introduction to Mathematical Logic

Here, we introduce the field of Mathematical Logic.

This is by far my favourite topic ^_^, the heart of TCS.


TCS was born out of the endeavours of metamathematicians during the Grundlagenkrise at the beginning of the 20th century; some good reads on the topic are:

Contributions of the Logicians by Burris (https://www.math.uwaterloo.ca/~snburris/htdocs/history.html) and Logicomix by Doxiadis and Papadimitriou (warning: you might get frustrated by Papa constantly barging in, the lack of math and/or the "cliffhanger"!)

From the metaprocesses of Life (evolutionary processes), the human brain has been structured with the capacity for rationality/reason (stemming from fittedness with the seemingly inherent order of the Universe). Logic is the study of the use of this capacity (notice that it is itself required to study itself!), while Mathematics is the use of this capacity to study the abstract notions of quantity (arithmetic), structure (algebra), form (topology) and change (analysis).


Without going into the philosophical underpinnings of reasoning (Hume's problem of induction, Grue and Bleen, Occam's razor, etc.), what is important to know about the range of Logic is the following:

Logic has 2 definitions:
(functional) the use of reasoning
(structural) the field that studies reasoning

The field of study is commonly divided into 3 parts:
  • deductive reasoning - drawing conclusions from premises (cf. entailment, modus ponens, syllogisms, material conditional)
  • inductive reasoning - given an unbiased sample of observations, draw conclusions about the entire population
  • abductive reasoning - given incomplete observations, infer to the best explanation

However, it is also possible to partition it according to the framework in which it is studied:
  • informal logic - no definite setting
  • formal logic - formal setting

Formal Logic (also called Mathematical Logic or Symbolic Logic) is thus the approach to Logic using the setting of formal systems, which are "broadly defined as any well-defined system[s] of abstract thought based on the model of mathematics" -- Wikipedia.

But Mathematical Logic also has another definition, that is, as a branch of Mathematics born out of the field of study denoted as "foundations of mathematics" as mentioned in the introduction. Thus wedged between Logic and Mathematics, Mathematical Logic is a cornerstone of Metamathematics (which will definitely be the topic of a future post ^_*).

Mathematical Logic has 2 definitions:

(logic) formal logic as defined previously
(mathematics) the subfield of Mathematics encompassing the "four pillars of the foundations of mathematics": set theory (sets)[~quantity], proof theory (syntax)[~structure], model theory (semantics)[~form], recursion theory (computation)[~change].

Under the latter definition, notice that Computability Theory (recursion theory) is as much a (seminal) subfield of TCS as it is a subfield of Math Logic.

Under the former definition, Formal Logic is commonly partitioned into:
  • Classical Logics
    • boolean logic - algebra of truth (BL)
    • propositional (sentential) logic - 0th order logic, a.k.a. propositional calculus (PL)
    • predicate logic - first-order and higher-order logics, e.g. first order logic (FOL), second order logic (SOL), infinitary logic, many-sorted logics
  • Non-Classical Logics
    • modal logic - extends classical logic with non-truth-functional ("modal") operators
      • provability logic - changes necessity operator to provability operator
      • interpretability logic - extends provability logic to describe interpretability
    • many-valued logics - propositional calculi in which there are more than two truth values
      • fuzzy logic - allows as a truth value any real number between 0 and 1
    • intuitionistic (constructive) logic - replaces truth by "constructive provability"; rejects the law of the excluded middle and double negative elimination
    • paraconsistent logic - rejects the law of noncontradiction
      • relevance logic - replaces material conditional (implication) by requiring that antecedent and consequent be related by some "relevance"
    • non-monotonic logic - rejects monotonicity of entailment
      • default logic - formalizing reasoning with default assumptions
      • autoepistemic logic - formalizing reasoning of knowledge about knowledge

Just as the interconnectivity of Mathematics doesn't allow all its subfields to be represented in a planar graph, there is no clean scheme for classifying formal logics; the division above is a mesh of generalisations and relaxations of certain properties, such as the properties common to the class of classical logics:
  • law of noncontradiction - two contradictory statements cannot both be true at the same time
  • law of the excluded middle - statements are either true or false
  • monotonicity of entailment - statements derived from a premise can be derived from that premise and any other premise (if they are contradictory, then we get vacuous truth)
  • idempotency of entailment - statements that can be derived from a single instance of a premise, can also be derived from multiple instances of the same premise and vice-versa
  • commutativity of conjunction - conjuncts can be switched without affecting truth value
  • DeMorgan duality - dichotomies of unary and binary logical operators

The first two laws come from Aristotle's laws of thought:
"The law of noncontradiction, along with its complement, the law of excluded middle (the third of the three classic laws of thought), are correlates of the law of identity (the first of the three laws). Because the law of identity partitions its logical Universe into exactly two parts, it creates a dichotomy wherein the two parts are "mutually exclusive" and "jointly exhaustive". The law of noncontradiction is merely an expression of the mutually exclusive aspect of that dichotomy, and the law of excluded middle, an expression of its jointly exhaustive aspect." --Wikipedia

Notice that this dichotomy of truth core to classical logics is reflected in (equivalent) laws:
  • the principle of explosion (ex falso quodlibet) - anything can be proven from a contradiction
    • e.g. A is true and ~A is true, thus A or B is true (or intro), thus (A or B) and ~A is true (and intro), thus B is true (steps skipped; essentially a disjunctive syllogism relying on the principle of noncontradiction)
  • the principle of vacuous truth - if the antecedent is false in a conditional statement, the statement is true

Finally, let us briefly touch upon another dichotomy inherent to the structure of formal logic, syntax vs semantics, by properly distinguishing material from logical consequence in the context of classical logic:

logical implication (semantic) as defined in classical logic:
\Phi \models A (\models is the double turnstile in LaTeX) iff any truth assignment satisfying \Phi also satisfies A

note: the definition happens in the metalanguage that is used to define the object language, thus symbols like "iff" and notions of truth assignments and satisfiability are about the language that is being defined. We will see later that when logics can express statements about their own statements, interesting issues arise.

material conditional (syntactic) as defined in classical logic:
A \supset B iff ~A \vee B                (A implies B iff notA or B)

note: to quickly remember this result, think about sets: A implies B means that the elements of set A (i.e. those that have property A) are also elements of B (i.e. have property B), thus B contains A. That only means that there can be no elements in A that are not in B (not(A and notB)). Using DeMorgan, this is equivalent to the statement above. In this sense B is a broader property (on "stronger" vs "weaker", see the note below).

note about the note: there is a confusion I want to resolve here: stronger vs weaker statements. Generally, if some assumptions are stronger than others, it means that they imply the weaker ones; however, the confusion arises when talking about properties: a property that holds under weaker assumptions is stronger (more general) than one that holds only under stronger assumptions.
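
To make the material/logical distinction concrete, here is a quick Python sketch (added for illustration) that brute-forces the four truth assignments: it checks that the material conditional, given by its classical truth table, agrees with ~A \vee B, and that {A, A \supset B} semantically entails B (every assignment satisfying both premises satisfies the conclusion).

    # Enumerate all truth assignments over {A, B} and check the two claims above.
    from itertools import product

    CONDITIONAL = {  # classical truth table of A \supset B
        (True, True): True, (True, False): False,
        (False, True): True, (False, False): True,
    }

    for a, b in product([True, False], repeat=2):
        assert CONDITIONAL[(a, b)] == ((not a) or b)   # syntactic definition: ~A \vee B
        if a and CONDITIONAL[(a, b)]:                  # premises A and A \supset B satisfied
            assert b                                   # ... then the conclusion B is satisfied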

Thus, we conclude our preliminary overview of Math Logic by pointing out that such interconnected notions are best approached from multiple directions, a fact that is true for any concept the brain can handle (cf distributed representations), but that gets amplified as the concepts become more abstract. What does this fact say about the human capacity for abstraction? What does it say about the frame problem?

Further reading:


http://www.iep.utm.edu/logcon/

http://plato.stanford.edu/entries/logic-classical/

http://en.wikipedia.org/wiki/Logic_in_computer_science

http://en.wikipedia.org/wiki/List_of_logic_symbols

Wednesday, 28 November 2012

Introduction to Automata Theory

Here, we introduce the field of automata theory.

The most general notion of abstract automaton is called a state transition machine (STM), a formalism used to model machines or computations by a 3-tuple (L, S, T):
  • L is a linguistic structure symbolising the input that will get manipulated by the machine as well as possible output.
  • S is a state structure with a set of states Q, a single starting state in Q and a set of final states in Q.
  • T is an action structure containing a set of labels and a transition function, modelling the behaviour of the machine. That is, given a state in Q and specific constraints from a label (input/output symbols, costs, capacities, probabilities, etc.), the function will derive which states the machine should switch to. 
  Note that there are countably many ways to define an STM, and the literature on STMs is relatively poor and inconsistent. Suffice it to say that STMs are a blueprint for constructing any form of automaton.

   On the other hand, there are uncountably many different types of automata and, much to our despair, there is no unifying classification for all of them...

   Automata can then be organised by which machinery or dynamic system they are defined to model (cells - cellular automata, neurons - neural nets, computation - Turing machines, stochastic languages - probabilistic automata, etc.); or by the extent of their computational power, which is reflected by how many algorithms they can simulate or, equivalently, how many languages they can accept:
  • Finite State Machines/Automata (FSM/FSA) accept regular languages.
  • Pushdown Automata (PDA) accept context-free languages.
  • Linear Bounded Automata (LBA) accept context-sensitive languages.
  • Turing Machines (TM), including quantum TMs, accept recursively enumerable languages.
  • Stream Automata (\omega-Automata) accept \omega-languages.
  • Inductive Turing Machines (ITM) accept super-recursive languages.
  In the following segment we will choose to focus on finite-state automata, that is: machines with string input, that will parse their input character by character by changing states depending on their transition functions and eventually (since strings are finite) halt; the string will be accepted if the automaton halts in an accepting state and rejected otherwise.

An FSA is a 5-tuple (Q, \Sigma, \delta, q_0, F), where:
  • Q is a finite set of states
  • \Sigma is a finite alphabet
  • \delta is a transition function: Q x \Sigma -> Q or P(Q) (where P(Q) is the powerset of Q)
  • q_0 is the start state
  • F is the set of accepting states
FSAs are characterised by having finite memory that depends only on the finite number of states the automaton can reach. They differ on the form of their transition function:
  • Deterministic FSA (DFSA): \delta : Q x \Sigma -> Q
    • Moore machine: a DFSA equipped with an output function Q -> \Lambda (the output depends only on the current state)
    • Mealy machine: a DFSA equipped with an output function Q x \Sigma -> \Lambda (the output depends on the current state and the input symbol)
  • Non-deterministic FSA (NFSA): \delta : Q x \Sigma -> P(Q)
  FSAs can only be in a single state at once (like most non-quantum automata): the state a DFSA is in is deterministic (we know which state it is in with probability 1), whereas the state an NFSA is in is non-deterministic (we don't know which state it is in; it could be any state within the element of P(Q) returned by \delta). An NFSA accepts an input if it could end up in an accepting state.
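
  As a tiny illustration, here is how running a DFSA looks in Python, on a made-up two-state machine over {a, b} that accepts exactly the strings containing an even number of 'a's:

    # Parse the input character by character; accept iff we halt in an accepting state.
    DELTA = {('even', 'a'): 'odd', ('even', 'b'): 'even',
             ('odd', 'a'): 'even', ('odd', 'b'): 'odd'}
    START, ACCEPTING = 'even', {'even'}

    def dfsa_accepts(word):
        state = START
        for symbol in word:
            state = DELTA[(state, symbol)]
        return state in ACCEPTING

    assert dfsa_accepts('abba') and not dfsa_accepts('ab')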

  Notice that the transition function of an NFSA differs in that its range is the powerset of the range of a DFSA's transition function; thus it is possible to simulate any NFSA with a DFSA (the reverse is trivial), but doing so may require a number of states up to the cardinality of the powerset of the NFSA's states - recall that we proved this bound to be 2^{#states in NFSA}. Hence non-determinism can be advantageous for its complexity aspect (rather than its computability aspect).
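
  The determinisation argument can also be written down directly as code; here is a rough Python sketch of the subset construction (assuming no epsilon-transitions), in which each DFSA state is a set of NFSA states - hence the 2^{#states in NFSA} bound:

    # Subset construction: delta maps (state, symbol) to the set of possible next NFSA states.
    def nfsa_to_dfsa(alphabet, delta, start, accepting):
        start_set = frozenset({start})
        dfa_states, dfa_delta = {start_set}, {}
        worklist = [start_set]
        while worklist:
            current = worklist.pop()
            for symbol in alphabet:
                # The DFSA successor of 'current' on 'symbol' is the union of the NFSA
                # successors of every state in 'current'.
                target = frozenset(q2 for q in current for q2 in delta.get((q, symbol), set()))
                dfa_delta[(current, symbol)] = target
                if target not in dfa_states:
                    dfa_states.add(target)
                    worklist.append(target)
        dfa_accepting = {s for s in dfa_states if s & accepting}  # contains an accepting NFSA state
        return dfa_states, dfa_delta, start_set, dfa_accepting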

  There is a one-to-one correspondence between the languages that can be formed from a regular expression and the languages that can be accepted by an FSA; namely, these two classes of languages coincide - they are the class of regular languages:
  •  prove L(R) C L(M): use the recursive definition of regular expressions to show that FSAs can be built to accept any language induced by the base cases and by the inductive constructions.
  •  prove L(M) C L(R): use state elimination on an arbitrary FSA: show how a regular expression can express any intermediary state of the FSA, thus how any FSA can be reduced to a single start state with regular expression transitions to accepting states. The union of all these REs is the regexp generating the language the FSA accepts.
  •  conclude that L(M) = L(R), that is: REs and FSAs are equivalently suitable to generate/accept any regular language.

  We laid a small foundation in automata theory, which gives us powerful formalisms to model computations. We also had a brief glance at the standard formalism of Computation: Turing Machines, which are based on automata (using the same STM construct). We tentatively extrapolate our remark about FSAs being as powerful a formalism as REs to TMs being as powerful a formalism as another formalism for Computation, the Lambda Calculus.

Monday, 26 November 2012

Introduction to Formal Language Theory

Here, we introduce the field of formal language theory.

There are 4 types of languages:

  • natural language - language arising naturally within societies
  • constructed language - language consciously defined within societies
  • artificial language - language arising from artificial situations
  • formal language - language axiomatically defined
We now use set theory to construct the theory of formal languages.

  An alphabet is a set \Sigma whose elements are symbols (characters). We define the object of string (word), akin to tuple, by introducing the concatenation operation o on symbols and strings (the concatenation of symbols (or strings) a, b in \Sigma, a o b = ab is a string).
  For this new type of object, we define empty string \epsilon (akin to empty set), length (akin to cardinality), substring relation (akin to subset relation), and string exponentiation (repeated concatenation).
  Finally, we define a new set operator that makes sense on alphabets or sets of strings, the Kleene star * (akin to infinite powersetting): an operator on a set S defined inductively by constructing a first 'powerset' formed of all possible concatenations of elements of S U {\epsilon} (hence it properly contains S and \epsilon), then constructing all concatenations within that new set, and so on iteratively, infinitely. Then if S is a finite set of symbols or strings, S* will be a countably infinite set closed under o, namely the set of all possible words that can be made by joining together elements of S.
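
  Since S* is infinite, we can only ever enumerate a finite portion of it; here is a small Python sketch (for illustration) listing the words of S* of length at most max_len:

    # Enumerate the words of S* up to a given length (S* itself is countably infinite).
    from itertools import product

    def kleene_star_up_to(S, max_len):
        words = ['']  # the empty string epsilon
        for n in range(1, max_len + 1):
            words.extend(''.join(t) for t in product(sorted(S), repeat=n))
        return words

    # kleene_star_up_to({'a', 'b'}, 2) == ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']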

  We can now define a language L over an alphabet \Sigma as the subset of \Sigma* that contains all valid words in the language. Moreover, all usual set theoretic operations still hold and we can extend the string operations of concatenation and exponentiation to languages.


  This paragraph will be a short digression. Notice that in a language all words are finite, so, in order to avoid insulting Cantor's teachings, we also define an \omega-language over \Sigma as a subset of \Sigma^\omega (constructed by infinitely combining elements of \Sigma). We can also denote by \Sigma^\infty = \Sigma* U \Sigma^\omega the set of all finite and infinite words over \Sigma.

  To start categorizing formal languages, we first need a formalism to describe the syntax of a language: the concept of (generative) formal grammars. We informally define a formal grammar as the set of production rules necessary to construct all the words in a language, thus it formalizes its syntax.
  Then we partition the set of all formal languages into 2 major kinds of languages: recursively enumerable and non-recursively enumerable languages - notions central to computability/recursion theory. Suffice it to say that to build a non-recursively enumerable language, one has to use non-straightforward techniques like Cantor's diagonalization, which are outside the scope of formal grammars' production rules.

The Chomsky hierarchy provides a way to categorize formal grammars:

  • Unrestricted grammars generate recursively enumerable languages. This class includes all formal grammars and properly contains all context-sensitive grammars.
  • Context-sensitive grammars generate context-sensitive languages. This class properly contains all context-free grammars.
  • Context-free grammars generate context-free languages. This class properly contains all regular grammars.
  • Regular grammars generate regular languages. This class of grammars has the same expressiveness as regular expressions, that is, both are equivalently suitable to generate the class of regular languages.
  We now introduce regular expressions as a convenient, compact notation for representing regular grammars and regular languages. That is, a given regular expression will provide a way to construct/describe all words of a given regular language; the procedure used is known as pattern matching.
  Given a finite alphabet \Sigma, the set of regular expressions RE is defined recursively as the smallest set described by:
Basis: empty set, empty string and any string literal (symbol) in \Sigma belong to RE.
Inductive construction: if R,S belong to RE, then R+S (alternation), RoS (concatenation, abbreviated RS) and R* (Kleene star) belong to RE.

  Thus, for a given regular language, we can construct a regular expression - a pattern from which every word in the language can be generated. The string operations o and * carry over to this concept of pattern, as does alternation (+), the analogue of set union: R+S matches a string that, at a given position within it, matches either of the disjoined regexps.

  Procedurally: the regexp A*(R+S)BB will match any string that starts with zero or more literals 'A', followed by a pattern matching either R or S, and ending with 2 literals 'B'; for example, if \Sigma = {A, B, C}, R = C* and S = CA, the strings BB, AAABB, CCCBB, AAACBB, CABB, AACABB, etc. are in the language L(A*(C*+CA)BB).
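
  We can check this example mechanically with Python's re module (a quick sketch for illustration; note that Python writes the alternation '+' as '|'):

    import re

    pattern = re.compile(r'A*(?:C*|CA)BB')

    for word in ['BB', 'AAABB', 'CCCBB', 'AAACBB', 'CABB', 'AACABB']:
        assert pattern.fullmatch(word) is not None   # all in the language
    assert pattern.fullmatch('ABAB') is None         # not in the language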

  We provided a brief basis in formal language theory, the field of study concerned with the syntactic aspects of formalised language. This basis is essential to domains of TCS such as programming language theory, in which the words of a formal language are given specific semantics, and computational complexity theory and mathematical logic, where this formalism is used to express decision problems and logical statements.

Wednesday, 7 November 2012

Divide and Conquer

Here, we develop results for divide and conquer algorithms.

  The divide and conquer strategy consists of dividing a problem into subproblems until these become trivial to solve, then combining the results iteratively (bottom-up) to provide the solution to the original problem. The paradigm is very efficient for certain kinds of problems where that design makes sense: sorting (quicksort, mergesort), searching (binary search), multiplying (Karatsuba, Strassen), etc. This strategy is naturally implemented using recursion: if the problem is trivial, solve it; otherwise, make recursive calls to solve the subproblems and combine their return values in a specific way.
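
  For concreteness, here is a standard mergesort sketch in Python; its structure mirrors the description above: a trivial case, two recursive calls on halves, and a linear-time combine step, giving the recurrence T(n) = 2T(n/2) + \Theta(n).

    def merge_sort(xs):
        if len(xs) <= 1:                 # trivial subproblem: already sorted
            return xs
        mid = len(xs) // 2
        left = merge_sort(xs[:mid])      # divide: solve the two halves recursively
        right = merge_sort(xs[mid:])
        return merge(left, right)        # combine: linear-time merge

    def merge(a, b):
        out, i, j = [], 0, 0
        while i < len(a) and j < len(b): # repeatedly take the smaller head element
            if a[i] <= b[j]:
                out.append(a[i]); i += 1
            else:
                out.append(b[j]); j += 1
        out.extend(a[i:]); out.extend(b[j:])
        return out

    # merge_sort([3, 1, 4, 1, 5]) == [1, 1, 3, 4, 5]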

  The time complexity (the function describing the amount of time the algorithm takes to terminate as a function of the size of its input) of such algorithms depends on the number of recursive calls, the time each of these recursive calls takes, and the time complexity of the function that combines the return values of the recursive calls. Hence, the time complexity takes the form of a recurrence relation: for the trivial cases the time is fixed; for the nontrivial cases, the running time on an input of size n depends on the running time on inputs of size n' < n. To obtain a closed form (depending only on the size of the input) for the time complexity, we use the method of repeated substitution.
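
  For example (a worked case, assuming n is a power of 2), take the mergesort-style recurrence T(n) = 2T(n/2) + cn with T(1) = c. Substituting the relation into itself k times gives:

    T(n) = 2T(n/2) + cn
         = 2(2T(n/4) + cn/2) + cn = 4T(n/4) + 2cn
         = ...
         = 2^k T(n/2^k) + k*cn

  After k = log_2(n) substitutions the subproblems become trivial (n/2^k = 1), so T(n) = cn + cn*log_2(n), a closed form depending only on n.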

  Since the validity of the obtained closed form relies on the assumptions made during the substitution (on the number of steps k, and typically on n being an exact power of the dividing factor), there are several more steps to take in order to prove the result for all inputs. Since we are interested in the asymptotic time behaviour, we will actually end up using that closed form to set a bound on the time complexity: we can prove inductively that T(n) is a non-decreasing function, then use the closed form to set a lower bound, upper bound and/or tight bound on T. For convenience:

Recall asymptotic complexity bounds (big O notation):
  • Upper asymptotic bound: T(n) is in O(g(n)) iff there exist c > 0 and n_0 such that T(n) <= c*g(n) for all n >= n_0
  • Tight asymptotic bound: T(n) is in \Theta(g(n)) iff T(n) is in both O(g(n)) and \Omega(g(n))
  • Lower asymptotic bound: T(n) is in \Omega(g(n)) iff there exist c > 0 and n_0 such that T(n) >= c*g(n) for all n >= n_0

  A generalisation of deriving asymptotic bounds for the recurrence relations arising as time complexities of divide and conquer algorithms is the Master Theorem. Given a relation of the form T(n) = aT(n/b) + f(n) for n bigger than a given threshold and f(n) in \Theta(n^d), where a constitutes the number of recursive calls, b is the dividing factor and f(n) represents the time complexity of the dividing and recombining procedure, the master theorem will provide a tight bound for T as follows:
  • if a < b^d then T(n) in \Theta(n^d)
  • if a = b^d then T(n) in \Theta(log(n)*n^d)
  • if a > b^d then T(n) in \Theta(n^{log_b(a)})
 Notice that the theorem assumes a relation with a constant b, i.e. the subproblems must be of roughly the same size. There is another generalisation of this procedure for deriving bounds on the time complexity of divide and conquer recurrences that can be used when the subproblems have different sizes: the Akra-Bazzi method.
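
 To see the theorem in action, a few standard worked examples: mergesort's recurrence T(n) = 2T(n/2) + \Theta(n) has a = 2, b = 2, d = 1, so a = b^d and T(n) in \Theta(log(n)*n); binary search's T(n) = T(n/2) + \Theta(1) has a = 1, b = 2, d = 0, so a = b^d and T(n) in \Theta(log(n)); Karatsuba's T(n) = 3T(n/2) + \Theta(n) has a = 3 > b^d = 2, so T(n) in \Theta(n^{log_2(3)}), roughly \Theta(n^{1.585}).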