A simple Rational API in Scala

This post is meant for Java programmers who are curious about what Scala has to offer them. It is often assumed that there is little reason to learn Scala ever since Java 8 included lambda expressions and streams. I find that assumption to be false, and with this example I hope to explain some of the reasons why.

I have been following the online course Functional Programming in Scala, offered by EPFL. In typical Jason fashion, instead of actually sticking to my deadlines and providing deliverables on time, I took one of the examples and extended it to my liking over a period of weeks. So let’s see how we would build an API for a rational number in Scala. As a reminder, a rational number is any number that can be written in the form a/b, where a and b are integers and b is non-zero. The goal of this exercise is to (a) play around with Scala, showcasing that even the most basic features of the language can provide great benefits to the readability and extensibility of code, and (b) investigate whether our new library can offer time and accuracy improvements during computations that involve many rational numbers, compared to what the compiler does by default: generate the Double value that approximates a / b as well as it can within 64 bits, and work with those approximations. The entirety of the code is available under GPL here (src/main/scala/week2/rational). Let’s begin.

Initialization and basic methods

In Scala, we are allowed to execute statements inside a class’ body. Those statements make up the primary constructor of the class. For our Rational type, we will do a neat trick: in order to make the various computations faster, we will make sure we reduce the fraction as much as possible during construction. For example, if the user asks us to create the fraction 10/15, our implementation will reduce it to 2/3, as follows:

class Rational(x:BigInt, y:BigInt) {

  import Rational._
  import scala.annotation.tailrec

  require(y != 0, "Cannot create a Rational with denominator 0")

  // Simplify representation by dividing both numer and denom by gcd.
  @tailrec
  private def gcd(a:BigInt, b:BigInt): BigInt = if (b == 0) a else gcd(b, a % b)

  private val g = gcd(x, y)   // a val cannot share the name of the method above
  private val numer = x / g   // Make immutable
  private val denom = y / g

require will throw an IllegalArgumentException if the caller requests a rational with a denominator of 0. The tail-recursive method gcd(a, b) calculates the greatest common divisor of x and y, and the constructor then divides x and y by it and stores the resulting values in numer and denom respectively. We use BigInt instead of Int for our internal representation so that we can deal with fractions of really large integers.

Of course, we can define other constructors if we would like, as in Java. For example, since all integers can be expressed as rationals (just give a denom of 1), we can do this:

def this(x:BigInt) = this(x, 1)

Furthermore, and this will be important for our experiments, we want to force the program to output the “pure” representation of our Rational instances instead of dividing them all the time, so we override toString as follows:

override def toString : String = numer + "/" + denom 

So, instead of a decimal approximation like 0.2, println(new Rational(20, 100)) will print 1/5. Feel free to browse the code for the overrides of equals and hashCode.
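
To see the effect, here is a minimal sketch (it assumes the Rational class from this post is on the classpath; the object name is just illustrative):

object ToStringDemo extends App {
  val r = new Rational(20, 100)  // the constructor reduces 20/100 to 1/5
  println(r)                     // prints "1/5" thanks to the toString override
}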

Operator overloading

Among the most discussed drawbacks of Java is its lack of operator overloading, something that C++ programmers in particular may find limiting. In Scala, operators are just methods, and method names are not restricted to alphanumeric identifiers: symbolic names such as + or * are perfectly legal. Without doubt, for our rational number API it would be very convenient if we could transparently overload the meaning of the classic binary arithmetic operators: +, -, *, /, etc. In our example, we overload all of those, as well as unary minus, the caret operator (^) to denote exponentiation, and the binary comparison operators (>, <, <=, etc). Here are some examples.

def + (that:Rational): Rational = {
  require(that != null, "Rational + Rational: Provided null argument.")
  new Rational(this.numer * that.denom + that.numer * this.denom,
               this.denom * that.denom)
}

def + (that:BigInt): Rational = this + new Rational(that, 1)

def * (that:Rational): Rational = {
  require(that != null, "Rational * Rational: Provided null argument")
  new Rational(this.numer * that.numer, this.denom * that.denom)
}

def * (that:BigInt): Rational = new Rational(numer * that, denom)

We use BigInt instances for the numerator and denominator representation since the stress tests we show later on can produce very large integers. In the example above, given two Rational instances a and b we define what the behavior of + should be in the expression a + b. We also define the behavior of a + i, where i is a BigInt instance. To re-use the existing implementation for a + b, a + i is translated to a + i/1, which is already defined.
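
At the call site, the overloads read naturally. Here is a small, hedged sketch (it assumes the Rational class above; the object and value names are illustrative):

object OperatorDemo extends App {
  val a = new Rational(1, 2)
  val b = new Rational(1, 3)
  println(a + b)          // 5/6, via +(Rational)
  println(a * b)          // 1/6, via *(Rational)
  println(a + BigInt(2))  // 5/2, via +(BigInt), i.e. 1/2 + 2/1
}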

For unary operators, Scala requires that we prefix the operator’s name with unary_, to disambiguate them from their binary counterparts:

def unary_- : Rational = new Rational (-numer, denom)

Binary comparison operators:

def > (that: Rational) :Boolean = numer * that.denom > that.numer * denom

def >= (that:Rational) :Boolean = (this > that) || (this == that)

def > (that:BigInt) : Boolean = this > new Rational(that, 1)

def >= (that:BigInt) : Boolean = {
  val r = new Rational(that, 1)
  (this > r) || (this == r)   // Scala calls "equals" when "==" is invoked
}

def < (that:Rational) : Boolean = !(this >= that)

def <= (that:Rational) : Boolean = !(this > that)

def <(that:BigInt): Boolean = {
  !(this >= that)
}

def <= (that:BigInt):Boolean = {
  !(this > that)
}

We re-use existing overloads as much as we can. It’s a huge syntactic relief to be able to implement functionality via universally recognized operators instead of having to implement ugly methods like plus, minus, multiply, subtract, … And as we will see later, it makes it possible to alter the semantics of an entire computation chain just by changing one line of code!

“No” to bloated pattern-matching operators!

Notice in the code above that we overload each operator by writing a different method for every argument type. This seems a bit tedious at first, and one might think: isn’t it more of a “functional” approach to take an argument of type Any and then pattern match against it? Like our override of the equals method does:

private def canEqual(x:Any):Boolean = x.isInstanceOf[Rational]

private def sameRational(that:Rational): Boolean =
  (numer == that.numer) && (denom == that.denom) ||
  (numer == -that.numer) && (denom == -that.denom)

override def equals(that:Any):Boolean = {
  that match {
    case that:Rational => that.canEqual(this)  && sameRational(that)
    case that:BigInt => sameRational(new Rational(that, 1)) 
    case _ => false 
  }
}

For example, we could do something like this:

def + (that:Any): Rational = {
  require(that != null, "+(Rational, Any): Provided null argument.")
  that match {
    case that:Rational => new Rational(this.numer * that.denom + that.numer * this.denom,
                                       this.denom * that.denom)
    case that:BigInt   => new Rational(this.numer + that * this.denom, this.denom)
    case that:Int      => new Rational(this.numer + BigInt(that) * this.denom, this.denom)
    case that:Double   => /* ... */ ???
    case _ => throw new UnsupportedOperationException("+(Rational, Any): Unsupported operand.")
  }
}

Based on the discussion here, I was able to ascertain that this would be a terrible idea, because it would change a compile-time error to a runtime error. For instance, with our current implementation which only overloads + for BigInt and Rational arguments, the code snippet:

val y = new Rational(10, 11)
y + 2.4

does not compile:

Demonstration of the compile-time error when attempting to use an operator on unknown types. String is also an expected type because it’s possible to concatenate the output of Rational::toString with the string literal "2.4" to produce the concatenated String "10/112.4".

On the other hand, if we implement + in the way above, the code would be prone to a runtime error, which is of course not preferred. So no matter how Java-esque and tedious it might seem, the multi-method way seems like the better solution.

Lazy evaluation of expressions that throw

The keyword lazy in Scala does exactly what you think: a lazy val defines a binding head = body where body is evaluated only the first time head is accessed. This can sometimes lead to interesting behavior. In the companion object of Rational, we can define some constants:

object Rational {

  val ONE = new Rational (1, 1)

  val ZERO = new Rational (0, 1)

  val MINUSONEOVERONE = new Rational(-1, 1)

  val ONEOVERMINUSONE = new Rational(1, -1)

  lazy val ZEROOVERZERO = new Rational(0, 0)
}

Note that the last constant, representing the undefined form 0/0, is a lazy val. Its RHS is a constructor call with both arguments set to zero. However, as a reminder, whenever a denominator of zero is provided to the Rational constructor, we make sure to throw an instance of IllegalArgumentException:

require(y != 0, "Cannot create a Rational with denominator 0")

Since the constant is lazy, its RHS, which is what throws, is not evaluated when the companion object is initialized; it only runs when ZEROOVERZERO is first accessed. This means that the class Rational itself has the capacity to internally define how it wants exceptional instantiations to behave! In this case, by “exceptional” instantiation we are referring to the undefined form 0/0, which has been encoded as a constant in the companion object, but we can imagine all sorts of classes with exceptional instantiations that might need to be documented. Especially for applications which monitor global state, where variables are mutated irrespective of what’s going on in the current thread’s call stack, having the capacity to define special cases of a type can be powerful!

Since all the statements within the body of the companion object have to be evaluated (unless lazy!) before the companion object can be brought to existence, stripping away the lazy keyword from the assignment’s LHS would not allow for object construction when requesting the constant ZEROOVERZERO, or any other constant for that matter. Feel free to pull the code, take away the keyword lazy from the line that defines ZEROOVERZERO and see that either one of those statements will throw:

object LazyEval extends App {

import Rational._
 // ONEOVERMINUSONE // This won't work if `ZEROOVERZERO` *ISN'T* lazy
 // ZEROOVERZERO    // Or this
}
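
As a quick, hedged illustration of the deferred failure (it assumes the companion object above; wrapping the access in scala.util.Try is my own addition):

import scala.util.Try
import Rational._

object DeferredFailure extends App {
  println(ONE)                // fine: the eager constants were built normally
  println(Try(ZEROOVERZERO))  // Failure(java.lang.IllegalArgumentException: ...),
                              // thrown only now, at first access of the lazy val
}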

It turns out that it is not possible to emulate this behavior in Java, that is, to have an expression that is part of a type (and so has to be evaluated at initialization time) be evaluated lazily. The following type:

public class Uninstantiable {

    public static int badField = willThrow();

    public static int willThrow()  {
        throw new RuntimeException("willThrow()");
    }

}

Despite the fact that we only define a static field and a static method, it is indeed uninstantiable (the static initializer throws, so the first attempt to use the class fails with an ExceptionInInitializerError):

public class UninstantiableRunner {

    public static void main(String[] args) {
        // Cannot instantiate;
        new Uninstantiable();
    }
}

Now, care must be paid to ensure that nobody thinks we are making the wrong claim. It is not the claim that lazy behavior cannot be implemented in Java. It absolutely can, and it’s a well-known and widely used pattern. Here’s a simple example:

public class JavaLazyVal {

    private long val;

    // Pay once.
    public long getVal() {
        if (val == 1) {
            System.out.println("Performing an expensive computation...");
            // Assumed expensive. Have to ensure that the computed value
            // CAN'T be equal to the initializer value!
            val = (long) Math.pow(3, 10);
        } else {
            System.out.println("No longer performing expensive computation...");
        }
        return val;
    }

    public JavaLazyVal() {
        val = 1;
    }
}

A simple runner program:

public class JavaLazyValRunner {

    public static void main(String[] args) {
        JavaLazyVal jlv = new JavaLazyVal();
        System.out.println("val = " + jlv.getVal());
        System.out.println("val = " + jlv.getVal());
    }
}

And its output:

Lazy behavior of getVal() exemplified.

So it is absolutely possible to have lazy behavior in Java, and this is a very well-known pattern among Java programmers. The claim is not that this can’t be done. The claim is three-fold. First, in Scala you can do this much, much more concisely:

object LazyEval extends App {
  lazy val x = {  // A whole block can be the right-hand side of a definition
    println("Evaluating x")
    scala.math.pow(3, 10)
  }
  println(x) ; println(x)
}

/* Output:
 *
 * Evaluating x
 * 59049.0
 * 59049.0
*/

Second, the Java code that we necessarily had to introduce is open to one possible source of logical error: the expensive computation might itself return the initializer (sentinel) value! In some applications, this can be hard to rule out.

Third, with the mechanism of lazy val, you can store expressions which would otherwise prevent the enclosing object from being constructed at all! And that can have power.

Tail-recursive repeated squaring

This isn’t in any way Scala- or functional-programming-specific, just something cool that can be written rather concisely in this language / paradigm. We overload the exponentiation operator (^) to perform repeated squaring with a tail-recursive method:

def ^ (n:Int): Rational = {

    // Inner two-arg function pow/2. Not tailrec:
    // the inner method pow/4 is tailrec.
    def pow(base:BigInt, exp:Int): BigInt = {
      require(exp >= 0 && base >= 0, "We need non-negative integer arguments for this method.")

      // Power computation with tail-recursive repeated squaring.
      @tailrec
      def pow(currExp:Int, maxExp:Int, currTerm:BigInt, prodAccum:BigInt): BigInt = {
        assert(currExp <= maxExp, "The current exponent should never surpass the original one.")
        if (currExp <= maxExp / 2)
          // Next iteration on current term.
          pow(2 * currExp, maxExp, currTerm * currTerm, prodAccum)
        else if (currExp == maxExp)
          // Bottomed out.
          currTerm * prodAccum
        else
          // Compute residual product (rest of terms) using the same method.
          pow(1, maxExp - currExp, base, currTerm * prodAccum)
      }

      if (base == 0 && exp == 0) throw new IllegalArgumentException("0^0 is an undefined form.")
      else if (base == 0 && exp > 0) 0
      else if (base > 0 && exp == 0) 1
      else if (exp == 1) base
      else if (base == 1) 1
      else pow(1, exp, base, 1)   // Call to tailrec pow/4
    }

    if (n == 1) this
    else if (n == 0) ONE
    // Calls to (non-tailrec) pow/2
    else if (n < 0) new Rational(pow(denom, -n), pow(numer, -n))
    else new Rational(pow(numer, n), pow(denom, n))
  }

The innermost 4-arg method pow is of interest here. The snippet (3 / 2)^19 ends up translated to the calls pow(1, 19, 3, 1) for the numerator 3^19 and pow(1, 19, 2, 1) for the denominator 2^19, in the line else new Rational(pow(numer, n), pow(denom, n)). Observe that 3^19 = 3^16 * 3^2 * 3^1. Through the machinery of repeated squaring, the first if condition in pow/4 keeps doubling the current exponent (and squaring the current term) for as long as the doubled exponent would not “surpass” the maximum one. Through 4 = log_2(16) such iterations it calculates 3^16, and then, using the recursive call pow(1, maxExp - currExp, base, currTerm * prodAccum), it handles the remaining exponent while carrying the intermediate computation along in the product currTerm * prodAccum.
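
To make the control flow concrete, here is a hand trace of pow/4 for the numerator call pow(1, 19, 3, 1). This is a sketch written out by hand, not program output:

// pow(currExp, maxExp, currTerm, prodAccum)
// pow(1,  19, 3,    1)           // 1  <= 19/2 : keep squaring the current term
// pow(2,  19, 3^2,  1)
// pow(4,  19, 3^4,  1)
// pow(8,  19, 3^8,  1)
// pow(16, 19, 3^16, 1)           // 16 > 19/2 and 16 != 19: recurse on the residual exponent
// pow(1,  3,  3,    3^16)        // now computing 3^3, carrying 3^16 in the accumulator
// pow(2,  3,  3^2,  3^16)        // 2 > 3/2 and 2 != 3: residual exponent is 1
// pow(1,  1,  3,    3^2 * 3^16)
// = 3 * 3^2 * 3^16 = 3^19        // currExp == maxExp: multiply into the accumulator and return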

Stress testing

So, all this is cool and all, but how do we fare in practice? We want to make two measurements:

(1) How well does our Rational type perform in terms of accuracy and time? To measure this, we run two map-reduce tasks on a large chain of rational numbers under both the Double and Rational representations, and compare results. We generally expect to win in accuracy but lose in time, since we have to spend significant time on calls to the Rational constructor.

(2) How well does our exponentiation method scale when compared to scala.math.pow, in terms of speed?

To measure time we used the following method presented here:

def time[R](block: => R, msg:String): R = {
    val t0 = System.nanoTime()
    val result = block    // call-by-name
    val t1 = System.nanoTime()
    println("Elapsed time for " + msg + " was " +  (t1 - t0) + "ns")
    result
  }
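
A hedged usage sketch (the sequence below is only illustrative):

val xs = (1 to 1000000).map(_.toDouble)
val total = time(xs.sum, "summing one million Doubles")
println(total)   // 5.000005E11, with the elapsed time printed by time(...) above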

And use the following script for the first experiment:

println("======= EXPERIMENT 1: ACCURACY AND EFFICIENCY OF STANDARD  
                     MAP-REDUCE ========== ");

final val rng = new Random(47)
final val MAX_ITER = 1000 // Vary 
final val MAX_INT = 100   // Vary 
val intTupleSeq : Seq[(Int, Int)]= for (_ <- 1 to MAX_ITER) yield  
                                  (rng.nextInt(MAX_INT) + 1, 
                                        rng.nextInt(MAX_INT) + 1)
val quotientSeq : Seq[Double] = intTupleSeq map { case (a, b) => a / 
                                                 b.toDouble }
val rationalSeq : Seq[Rational] = intTupleSeq map { case (a, b) => 
                                       new Rational(a, b) }
// Sums first....
val quotientSum = time( quotientSeq.sum, "quotient sum")
val rationalSum = time(rationalSeq.reduce((x, y) => x+y), "Rational 
                                                     sum")
println("Value of quotient sum:" + quotientSum)
val evaluatedRationalSum = rationalSum.eval
println("Value of Rational sum:" + evaluatedRationalSum)
println("Error: " + Math.abs(quotientSum - evaluatedRationalSum))

// Products second...
val quotientProd = time( quotientSeq.product, "quotient product")
val rationalProd = time(rationalSeq.reduce((x, y) => x*y), "Rational 
                                            product")
println("Value of quotient product:" + quotientProd)
val evaluatedRationalProd = rationalProd.eval
println("Value of Rational product:" + evaluatedRationalProd)
println("Error: " + Math.abs(quotientProd - evaluatedRationalProd))

Keep in mind that the bigger the error, the better the case for our Rational type, since up to the point of the call to eval it has admitted zero representational degradation.

(i) Map-Reduce

We run two simple Map-Reduce tasks: sum and product over a list of 1000 randomly distributed strictly positive Ints. We vary the largest possible Int and generate time and error metrics for each.

Sum times:

We see that generating the chain of sums for the Rational type takes about seven times longer than generating the simple primitive Double sum. This is to be expected, since there are costs associated with claiming memory from the heap and performing several assignments. We should also remember that the Rational constructor performs the GCD reduction aggressively during construction, and the runtime of that algorithm depends on the magnitude of the larger of the two integers for which it is called.

What about accuracy?

Remember: the greater the error, the better for our Rational type. It seems as if we have to go all the way to the 12th decimal digit, in a 64-bit machine, to find any appreciable difference.

When it comes to the product task:

Now this is somewhat surprising, since it seems that the Rational type’s difference in execution speed is even greater than the case of sum. Once again, here is the source for +(Rational, Rational) and *(Rational, Rational).

def + (that:Rational): Rational = {
    require(that != null, "Rational + Rational: Provided null argument.")
    new Rational(this.numer * that.denom + that.numer * this.denom,
                 this.denom * that.denom)
}

def * (that:Rational): Rational = {
    require(that != null, "Rational * Rational: Provided null argument")
    new Rational(this.numer * that.numer, this.denom * that.denom)
}

This code implements the formulae for a/b + c/d and (a/b) * (c/d). In the constructor call of + we pay three multiplications and one addition, whereas in the constructor call of * we pay just two multiplications. So it’s definitely not the number of operations, but the cost of those operations, particularly when BigInt instances are involved. And, in a product formulation, BigInts can get really big.
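
A small, hedged sketch of the effect (the specific numbers are just an illustration): when the reduced numerators and denominators share no common factors, the BigInts inside a running product grow with every step, so each successive multiplication costs more than the last.

// 97 and 89 are coprime, so the gcd reduction in the constructor cancels nothing
val terms = Seq.fill(1000)(new Rational(97, 89))
val prod  = terms.reduce(_ * _)
// prod is 97^1000 / 89^1000: its numerator and denominator now have roughly
// 2000 digits each, so every further * is far costlier than a Double multiply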

What about product error?

Smaller benefits than the sum (note the different scale, from 10^-12 to 10^-16), to the point of not even being appreciable for small values! No idea why this happens. In all reality, I should have averaged all of those graphs over a number of calls to generate higher fidelity points.

One thing to consider as well is that in order to even produce error graphs, we need a way to evaluate the Rational instance analytically:

def  eval: Double = numer.doubleValue() /  denom.doubleValue()

(ii) Repeated squaring for exponentiation calculation

Just for fun, I elected to run an experiment to see how our overloading compares to scala.math.pow. Since a^n can be computed in log_2 n steps using repeated squaring, we vary the exponent n and keep the base at 3/2 (1.5 in the Double representation). The computations are of course accurate (Error = 0). We are exclusively interested in time.

The script:

  println("=========== EXPERIMENT 2: EFFICIENCY OF 
                   EXPONENTIATION ========= ")
  final val EXPONENT = 500
  var quotientPower: Double = 0.0
  var rationalPower : Rational= _
  time(quotientPower = scala.math.pow(1.5, EXPONENT), "Double raised 
                             to the power of " + EXPONENT)
  time(rationalPower = new Rational(3, 2) ^ EXPONENT, "Rational raised 
                            to the power of " + EXPONENT)
  println("Value of power from Doubles: " + quotientPower + ".")
  println("Value of power from Rationals: " + rationalPower.eval + 
                                                         ".")
  println("Error: " + Math.abs(quotientPower - rationalPower.eval))

Results:

It is laughable to even assume that a custom implementation would in any way compare to scala.math.pow, but what is interesting is the fact that both implementations appear to scale very well as the exponent is increased. This is to be expected because of the power of the logarithm, which “tempers” even large values of n down to very tractable values of magnitude ceil(log_2 n).

Conclusion

In this post, we saw several advantages of the Scala implementation of a rather simplistic numerical type over the corresponding Java implementation. The big advantages have to do not with the runtime, but with the source. In Scala, we are able to overload operators naturally, as in C++, albeit with more intuitive syntax and the usual perks of not having to deal with raw pointers or memory cleanup (if ever required in the overloading of some operator…). We can also use the power of built-in lazy vals to do cool things such as allow types to define their own exceptional instances and deal with them in any way they please (even dynamically). We also compared two different ways of overloading methods in Scala, and deduced that it is far better to emulate the Java-like way of several methods with the same name, instead of pattern-matching on an argument of type Any, as is the more “natural” way in Scala. Finally, we saw a simple application of the language’s syntax and its emphasis on tail recursion and inner methods with our repeated squaring method.

In my view, the big advantage of this entire experiment is the elegance and modularity with which we can switch from Double to Rational and vice versa. An example can be found in the object Runner:

object Runner extends App {

  private def fraction(a:Int, b:Int) = {
    require(b!= 0, "Cannot create a fraction with denominator zero.")
     new Rational(a, b)
     // a.toDouble / b.toDouble
  }

  val a = fraction(5, 6)
  val b = fraction(2, 3)
  val c = fraction(1, 2)
  val d = fraction(10, 11)
  val e = fraction(25, 15) // Rational will reduce to 5/3
  val f = fraction(23, 13)
  val g = fraction(230, 17)
  println(a * b + c*d - e * (f + g))
}

The output of the above code as given is -535765/21879, whereas, if we instead uncomment the final line of fraction(Int, Int) (and comment out the new Rational(a, b) line above it), we get -24.487636546460074, the evaluated fraction. The only thing we need to change is where the two characters // appear.

All that said, I would definitely not implement numerical software on top of the JVM. C / C++ / native assembly is the only way to go if we want to go that low-level. Furthermore, it is not clear to me, a person who is not professionally involved with commercial-grade numerical software, how a difference in the 12th decimal digit can in any way be significant. Perhaps NASA and SpaceX care about such small differences, to make sure that autonomous vehicles land on Mars and not Venus. Perhaps not. Perhaps the advantage of being able to move to smaller bit widths (Arduino, IoT devices) would justify the use of a “pure” numerical type such as our Rational. The matter of symbolic computation is huge, interesting, and important, and with this blog post we are not even making a ripple on its surface.

Breaking @tailrec

To follow through this example, you will need to know some basic (really basic) Scala. You will also need to know what call-by-name and call-by-value mean in the context of functional programming. Those are concepts very loosely related to “pass-by-value” and “pass-by-reference” in imperative programming; you should not confuse them. tl;dr at the end.

I have recently been learning Scala and functional programming in general. Recursive exercises are a great warm-up to make the imperative programmer start thinking in a more recursive manner, so one of the first few examples I wrote was factorial. I realized I needed to understand quickly how the language treats tail recursion, and factorial is a great exercise for that.

The following intuitive one-liner is, unfortunately, not tail recursive, but at least the stack grows linearly with n:

def factorialNaive(n:Int): Int = if (n == 0) 1 else n * factorialNaive(n - 1)

So a tail-recursive implementation is quite appropriate here, and the following pair of functions will get the job done in exactly such a manner:

def factorialTailRec(n: Int) : Int = {
    @tailrec
    def factorialTailRec(n: Int, f:Int): Int = {
      if (n == 0) f
      else factorialTailRec(n - 1, n * f)
    }
    factorialTailRec(n, 1)
  }

Now at this point it’s important to remind ourselves that the @tailrec annotation, which becomes available after we import scala.annotation.tailrec, is not required, but it is a very good idea to include it, since it will warn us at compile time if the recursive call is not in tail position. If we were to use that annotation on factorialNaive, the code would not compile:

Now here is where an evil idea crept into my mind. I noticed that I was passing the second parameter of factorialTailRec by value. I thought to myself: “This means that every “stack frame” (more like iteration scope at this point) is burdened with one multiplication… so all the way down the call chain we have n-1 multiplications. The alternative, of passing the parameter by name, would delay any multiplication up to the last stack frame, where we bottom out with a term of 1. So I would expect similar performance.”

Turns out, the above is only partly true, and the most interesting fact is not included in it! You see, while it is the case that the product computation is delayed until the very end, its terms are embedded within every one of the stack frames by the anonymous function (thunk) that the call-by-name parameter builds! The evaluation of every multiplication i * (i+1) with 0 <= i < n, even when i+1 could have been evaluated within the current stack frame, requires popping another stack frame, a frame that was previously pushed by the otherwise tail-recursive call that passed its second argument by name!
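
Here is a minimal, hedged demo of that effect in isolation (the names are mine, not from the factorial code): the multiplication written at the call site is not performed there; it is wrapped in a thunk that is only forced when the parameter is finally read.

def callee(n: Int, f: => Int): Int = {
  println(s"entered callee with n = $n; f has not been evaluated yet")
  if (n == 0) f              // forcing f happens only here...
  else callee(n - 1, n * f)  // ...while each recursive call wraps f in yet another thunk
}

println(callee(3, 1))        // all three deferred multiplications run once n == 0, printing 6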

To test this, we can use the following runner program. Note that I don’t care at all about the actual factorial values, so I let the result overflow and be as non-sensical as it likes. I’m not even assigning it anywhere. I’m interested exclusively in how the code affects the use of the JVM’s stack. For the hyper-exponential factorial function, even the Long data type is not sufficient and one had best use an efficient BigInteger library or Stirling’s approximation if they care about computing the values of large factorials.

package factorial

import scala.annotation.tailrec

object Factorial extends App {

  val ITERS = 100000      // Let's push things

  // Naive Factorial
  def factorialNaive(n:Int) : Int =  if(n == 0) 1 else n * factorialNaive(n-1)

  try {
    for (i <- 1 to ITERS) factorialNaive(i)
    println("Naive factorial worked up to " + ITERS + "!.")
  } catch{
    case  se:StackOverflowError => println("Naive factorial threw an instance of StackOverflowError.")
    case e: Exception => println("Naive factorial threw an instance of " + e.getClass + " with message: " + e.getMessage + ".")
  }

  // Tail-recursive factorial
  def factorialTailRec(n: Int) : Int = {
    @tailrec
    def factorialTailRec(n: Int, f:Int): Int = {
      if (n == 0) f
      else factorialTailRec(n - 1, n * f)
    }
    factorialTailRec(n, 1)
  }

  try {
    for(i <-1 to ITERS) factorialTailRec(i)
    println("Tail recursive factorial worked up to " + ITERS + "!.")
  } catch{
      case  se:StackOverflowError => println("Tail recursive factorial threw an instance of StackOverflowError.")
      case e: Exception => println("Tail recursive factorial threw an instance of " + e.getClass + " with message: " + e.getMessage + ".")
  }

  println("Exiting...")

}

Notice that in factorialTailRec, the accumulator argument is passed by value. Running this program for the given parameter of ITERS=100,000 yields the output:

Naive factorial threw an instance of StackOverflowError.
Tail recursive factorial worked up to 100000!
Exiting...

Nothing surprising so far. But what if I were to pass the second argument by name instead, by changing the method declaration to:

def factorialTailRec(n: Int, f: => Int): Int = ...

Then, our output is:

Naive factorial threw an instance of StackOverflowError.
Tail recursive factorial threw an instance of StackOverflowError.
Exiting...

Everybody has my express permission to print the above output and paste it on their office doors, perhaps appropriately captioned. Not to mention that @tailrec did not complain at all! The file compiled just fine. I’m not sure whether it’s possible to have @tailrec disallow call-by-name parameters. It’s one of those things that sound easy to do, but are probably very difficult in practice.

Now let’s play a little game. Suppose that for whatever reason you cannot change the signature of factorialTailRec and you are stuck with an idiotic call-by-name that has doomed your method to tail-recursively constructing a linear-size stack for a function that it itself builds… 😦 Is there anything we can do to achieve call-by-value?

Yup. This:

def factorialTailRec(n: Int) : Int = {
    @tailrec
    def factorialTailRec(n: Int, f: => Int): Int = {
      if (n == 0) {
        val fEvaluated = f
        fEvaluated
      }
      else {
        val fEvaluated = f
        factorialTailRec(n - 1, n * fEvaluated)
      }
    }
    factorialTailRec(n, 1)
  }

Which is a cheap general template to emulate call-by-value from a call-by-name parameter if you need it. Right now, with my limited Scala experience, I can’t envision a scenario where this would be preferable to a simple call-by-value declaration at the method signature level, but hey, it works:

Naive factorial threw an instance of StackOverflowError.
Tail recursive factorial worked up to 100000!
Exiting...

Finally, to once again underscore the difference between val and def: changing the two lines val fEvaluated = f into def fEvaluated = f does nothing to evaluate the parameter, since defs are themselves evaluated by name (they are aliases)! In fact, I’d be willing to bet that it doubles the constant factor in front of the O(n) space occupied by the stack.

Naive factorial threw an instance of StackOverflowError.
Tail recursive factorial threw an instance of StackOverflowError.
Exiting...
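
To see the val versus def difference in isolation, here is a small, hedged sketch (the names are illustrative): a val forces the by-name argument exactly once, while a def is itself just another by-name alias.

def demo(f: => Int): Unit = {
  val forcedOnce = f               // evaluates the argument right here, exactly once
  def notForced  = f               // evaluates nothing: every use re-reads the thunk
  println(forcedOnce + forcedOnce) // no further evaluations of the argument
  println(notForced + notForced)   // two more evaluations of the argument
}

demo { println("evaluating"); 21 } // prints "evaluating" three times, then 42 twice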

tl;dr I built a tail-recursive method in Scala that blows up the JVM’s stack.

A better number sequence for teaching Cantor’s diagonalization.

I’ve been teaching CMSC250: Discrete Structures in CS UMD for a while now. It is essentially a more CS-friendly version of Discrete Mathematics, with more logic, set theory, structural induction, and countability. Countability, a rather esoteric subject, is challenging for our students. Naturally, a student that doesn’t get countability because they don’t get bijections cannot be helped until they understand bijections, but when it comes to the Cantorian diagonalization proof that the reals are uncountable (to be precise, that the interval [0, 1] is uncountable), one can lose many more students.

I conjecture that one possible reason for this is that the sequence of numbers used to create the 2D matrix is usually a bit arbitrary in both texts and slides. For example, in my own slides, I have the following sequence of numbers:

(Screenshot from my slides: an arbitrary listing of reals in [0, 1].)

With such a sequence of numbers, I have observed that even medium-to-strong students oftentimes have trouble. So let’s try to improve our example. Here’s a nice trick: Write down only the diagonal portion of the listing of reals:

r_1 = 0. \mathbf{2}xxxxxx\dots

r_2 = 0.x\mathbf{4}xxxx\dots

r_3 = 0.xx\mathbf{2}xxx\dots

r_4 = 0.xxx\mathbf{9}xx\dots

r_5 = 0.xxxx\mathbf{0}x\dots

r_6 = 0.xxxxx\mathbf{8}\dots

\ \vdots \qquad \vdots \qquad \vdots \qquad \vdots

Since the diagonal values are the only ones that you need to construct the number of interest, go ahead and construct it. I call it r' (r “prime”) only because I would like to use r later.

r' = 0.353019\dots
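
In symbols, one rule consistent with the digits above (my reading of the construction; any rule that changes every diagonal digit works equally well) is

r'_i = (d_{ii} + 1) \bmod 10, \quad \text{where } d_{ii} \text{ is the } i\text{-th decimal digit of } r_i.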

And now we will go replace all of those x‘s with the same-index digits of the constructed number itself. For example, we will replace the 2nd, 3rd, 4th, 5th and 6th digits of r_1 with 5, 3, 0, 1 and 9 respectively. We will also replace the 1st, 3rd, 4th, 5th and 6th digits of r_2 with the relevant digits of r', and so on and so forth:

r_1 = 0. \mathbf{2}53019\dots

r_2 = 0.3\mathbf{4}3019\dots

r_3 = 0.35\mathbf{2}019\dots

r_4 = 0.353\mathbf{9}19\dots

r_5 = 0.3530\mathbf{0}9\dots

r_6 = 0.35301\mathbf{8}\dots

\ \vdots \qquad \vdots \qquad \vdots \qquad \vdots

 

The benefit of carefully constructing the number sequence in this way is that now the student can see that those numbers, no matter how hard they try to look like our own constructed number r', fail in at least one decimal digit. In fact, with this visualization of the first 6 decimal digits of every real r_j, there is a difference of exactly one decimal digit. We have essentially pushed the argument to its limits: even if we had a real r_k whose 1st, 2nd, 3rd, …, (k-1)th, (k+1)th, (k+2)th, … digits agreed with those of r', its k-th digit is guaranteed, by construction of r', to differ from the k-th digit of r'.

The final piece of the argument can perhaps be shown as follows: The statement “[0, 1] is countable” can be re-worded as: “For every real r in [0, 1], there is some positive integer j such that r = r_j.”

The way I like to proceed from that point onward is the following: Since every single r \in [0, 1] can be written as r_j for some positive integer j, this must also hold for our constructed real number r', since, clearly, by the way that I have constructed r', it is a number between 0 and 1!

Maybe this j is equal to 1. But this can’t be, since we notice that the numbers differ in the first decimal digit, again by construction. And this is where the construction comes to help: all the other (visible) digits are the same. However, it suffices for there to be a difference in a single digit in order for us to say that, for a given j, r' \neq r_j.

Maybe j = 2. The first digit is OK, it is 3 in both r_2 and r', but the second digit gives us a difference, despite the fact that all other digits are the same!

Maybe j = 3, and so on and so forth.

So the point is that this enumeration actually provides a Discrete Mathematics student with more intuition about why the proof works.

 

 

 

 

A deep take on countability, cardinality and ordering

I’ve been teaching CMSC250, Discrete Mathematics, over the past year in CS UMD. Last semester, I typed a more philosophical than mathematical post on Countability, Cardinality and Ordering, which I’m repeating here for the community’s sake.


After our ordinality lecture last Tuesday, I had a student come to me and tell me that they were not sure how to think about ordinality: they were understanding the relationship between cardinality and size, since it is somewhat intuitive even for infinite sets (at least to them!), but ordinality still appeared esoteric. That’s 100% natural, and in this post I’ll try to stray away from math and explain how I think about countability, cardinality and ordinality intuitively. This post has exactly zero things to do with the final, so if you want to limit your interactions with this website to the exam-specific, you may stop reading now.

Before we begin, I would like to remind you of a definition that we had presented much earlier in the semester, I believe during an online quiz: A set S is dense if between any two elements of it, one can find another element. Note something interesting: only ordered sets can be qualified as dense or not! Technically, we had not presented the notion of an ordered set when we discussed dense sets, but it is intuitive enough that people can understand it.

Countability

We say that any enumerable set is countable. Enumerable, mathematically, means that we can find a bijection from the non-zero naturals to the set. Intuitively, it means “you start from somewhere, and by sequentially making one step, no matter how long it takes, you are guaranteed to reach every single element of the set in finite time”. Whether this finite time will happen in one’s lifetime, in one’s last name’s lifetime, or before the heat death of the universe, is inconsequential to both the math and the intuition. Clearly, this is trivial to do for either the non-zero naturals or the full set of naturals: you start from either 1 or 0, and then you make one step “forward”.

However, we also saw in class that it is possible to generalize this to the full set of integers: we start from 0 and then start hopping around and about zero, making bigger hops every time. Those hops are our steps “forward”.
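
A small, hedged sketch of that hopping enumeration (my own illustration, not course code): a bijection from the naturals 0, 1, 2, … onto the integers.

def hop(n: Int): Int = if (n % 2 == 0) -(n / 2) else (n + 1) / 2

println((0 to 8).map(hop))  // Vector(0, 1, -1, 2, -2, 3, -3, 4, -4)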

Those results are probably quite intuitive to you by now, and I feel that the reason for this might be that both \mathbb{N} and \mathbb{Z} are non-dense sets. There are no naturals or integers between n and n+1 (n \in \mathbb{N} or n \in \mathbb{Z}).

Let’s stray away from \mathbb{Q} for now and fast-forward to \mathbb{R}. We have already shown the mathematical reason, Cantor’s diagonalization, for which the set of reals is uncountable. But what’s the intuition? Well, to each their own, but here’s how I used to think about it as a student: Suppose that I start from zero just to make things easier with respect to my intuitive understanding of the real number line (I could’ve just as well started with -e^8).

Then, how do I decide to make my step forward? Which is my second number? Is it 0.1? Is it -0.05? But, no matter which I pick as my second number, am I not leaving infinitely many choices in between, rendering it necessary that I recursively look into this infinite interval? Note that I have not qualified “infinite” with “countably infinite” or “uncountably infinite” yet. This was my personal intuition as a Discrete Math student about 11 years ago about why \mathbb{R} is uncountable: even if you assume that you can start from 0, there is no valid ordering for you to reach the second element in the sequence of reals! Therefore, such a sequence cannot possibly exist!

But hold on a minute; is it not the case that this argument can be repeated for \mathbb{Q}? Sure it can, in the sense that between, say, 0 and \frac{1}{2}, there are still infinitely many rationals. It is only after we formalize the math behind it all that we can say that this is a countable infinity and not an uncountable one, as is the case for the reals. But still, we have to convince ourselves: why in the world does the fact that every one of these infinitely many numbers can be expressed as a ratio of integers make that infinity smaller than that of the reals?

Here’s another intuitive reason why we will be able to scan every single one of these numbers in finite time: everybody open the slide where we prove to you that \mathbb{Q}^{>0} is countable using the snaking pattern. Make the crucial observation that each diagonal scans fractions for which the sum of the numerator and the denominator is fixed! The first diagonal scans the single fraction (\frac{1}{1}) where the sum is 2. The second one scans the fractions whose numerator and denominator sum to 3 (\frac{1}{2},\ \frac{2}{1}). In effect, the i^{th} diagonal scans the following fractions:

\{ \frac{a}{b} \mid (a,b \in \mathbb{N}^{\geq 1}) \land (a + b = i+1)\}

For those of you that know what equivalence classes are, we can then define \mathbb{Q}^{>0} as follows:

\mathbb{Q}^{>0} = \bigcup_{i \in \mathbb{N}}\{ \frac{a}{b} \mid (a,b \in \mathbb{N}^{\geq 1}) \land (a + b = i+1)\}

Let’s see this in action…

\mathbb{Q}^{>0} = \{ \color{red}{\underbrace{ \frac{1}{1}}_{i=1}}, \color{blue}{ \underbrace{\frac{1}{2}, \frac{2}{1}}_{i=2}} , \color{brown}{\underbrace{\frac{1}{3}, \frac{2}{2}, \frac{3}{1}}_{i=3}}, \dots \}

Note that, with this definition, we have essentially defined a surjection from \mathbb{N}^{\geq 1} \times \mathbb{N}^{\geq 1} onto \mathbb{Q}^{>0}. We know that \mathbb{N}^{\geq 1} \times \mathbb{N}^{\geq 1} is countable, so we now know that \mathbb{Q}^{>0} is also countable! 🙂
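
A hedged sketch of that diagonal enumeration in code (my own illustration): the i-th diagonal lists the pairs (a, b) with a + b = i + 1, numerators over denominators.

val diagonals = for {
  i <- Iterator.from(1)
  a <- 1 to i
} yield (a, i + 1 - a)              // the i-th diagonal: all (a, b) with a + b = i + 1

println(diagonals.take(6).toList)   // List((1,1), (1,2), (2,1), (1,3), (2,2), (3,1))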

Let’s constrain ourselves now to the original challenge that we (I?) are faced with: we have selected 0 as our first element in the enumeration of both \mathbb{Q} and \mathbb{R} (the latter is assumed to exist), and no matter which our second element is (say it’s \frac{1}{2}), we have infinitely many elements of both sets between 0 and \frac{1}{2}. But now we know that those infinities are different: in the case of \mathbb{Q}, we know for a fact that we will reach all of those fractions whose decimal values are in (0, 0.5). In the case of \mathbb{R}, there is no such enumeration: any enumeration we define will still leave an… uncountably infinite gap between any two elements in “sequence”.

Remember how, in our lecture on Algebraic and Transcendental numbers, we gave only three examples of numbers in TN, yet the fact that TN is uncountable while ALG is countable guarantees that there are “many more” Transcendental numbers than Algebraic ones? The same thing applies here with the rationals and irrationals: given any interval of real numbers (r_1, r_2), there are many more irrationals than rationals inside that interval. If you define a system of whole numbers (integers), there are many more quantities that you will not be able to express as a ratio of integers. That’s why, back in the day (300 B.C.) when Euclid proved that \sqrt{2} is not expressible as such a ratio \frac{a}{b} (or, more accurately, that 2 cannot be expressed as the square \frac{a^2}{b^2}), his result was so unintuitive; those Hellenistic people did not have rulers. They did not have centimeters or other accepted forms of measurement. The only things they had were shoestrings, or planks of wood which they put in line and “saw” that they were the same length, and then they measured everything else as the ratio of such “whole” lengths.

Cardinality

Recall something that we said when we were discussing the factorial function and its combinatorial interpretation when applied to positive integers. Bill’s explanation of why 0! = 1 was purely algebraic: if it were 0, then, given the recursive definition n! = n \cdot (n-1)! for n \ge 1, every n! would be 0, rendering it a pretty useless operation. My explanation was combinatorial: we know that if we have a row of, say, n marbles, there are n! different ways to permute them, or n! different orderings of those marbles. When there are no marbles, so n = 0, there is only one way to order them: do nothing, and go watch Netflix.

Let’s stick with Bill’s interpretation for a moment: the fact that some things need to be defined in order to make an observation about the real world work. In this case, the real world is defined as “algebra that makes some goddamn sense”. My explanation is more esoteric. You could say: “What do you mean there’s only one way to arrange zero things? I don’t understand, if there are zero things and there’s nothing to do, shouldn’t there be, like, 0 ways to arrange them?”. So, let’s stick with Bill’s interpretation to explain something that I attempted to explain to a group of students after our first lecture this semester: Why do negative numbers even exist?

Here’s one such utilitarian explanation: Because without negative numbers, Newtonian physics, with its tremendous application in the real world, would not work. That is, the model of Newtonian kinematics with its three basic laws, which has been empirically proven to describe very well things that we observe in the real world, needs the framework of negative numbers in order to, well, work. So, if you’re not OK with the existence of negative numbers, you had better also be able to describe to me a framework that explains a bunch of observations about the real world in some way that doesn’t use them. For example, you probably all remember the third law of Newtonian motion: for every action \vec{F}, there exists an equal and opposite reaction -\vec{F}:

Recall that force is a vector quantity, since \vec{F} = m \cdot \vec{a}, and acceleration \vec{a} is clearly a vector, as the second derivative of displacement \vec{x}.

The only way Newton’s third law of motion can work is if \vec{F} + (-\vec{F}) = \vec{0}. This is only achievable if the two vectors have the same magnitude but exactly opposite directions. No other way. Hence the need to write the two forces, say along a single axis, as:

F = m \cdot a, \qquad -F = -(m \cdot a)

and the necessity for negative numbers becomes clear. Do you guys think the ancient Greeks or Egyptians cared much for negative numbers? They were building their theories in terms of things they could touch, and things that you can touch have positive mass, length, height…

Mathematics is not science. It is an agglomeration of models that try to axiomatize things that occur in the real world. For another example, ZFC set theory was developed in place of Cantorian set theory because Cantorian set theory can lead to crazy things such as Russell’s Paradox. Therefore, ZFC had to add more things to set theory to make sure that people can’t do crazy stuff like this. If we discover contradictions with the real world given our mathematical model, we have to refine our model by adding more constraints to it. Fewer constraints: more generality, potential for more contradictions. More constraints: less generality, fewer contradictions, but also more complexity.

So when discussing the cardinality of \mathbb{N} and \mathbb{Z} and finding it equal to \aleph_0, we are faced with a problem with our model: the fact that \color{magenta}{\mathbb{N} \subset \mathbb{Z}} (I have used the notation of proper subset here deliberately). Now, I just had a look at our cardinality slides, and it is with joy that I noticed that we don’t use the subset / superset notation anywhere. That’s gonna prove a point for us.

So, back to the original problem: intuitively understanding why the hell \mathbb{N} and \mathbb{Z} have the same cardinality when, if I think of them on the real number line, I clearly have \mathbb{N} \subset \mathbb{Z}:

\underbrace{\dots, -4, -3 , -2, -1, \underbrace{0, 1, 2, 3, 4, \dots}_{\mathbb{N}} }_{\mathbb{Z}}

The trouble here is that we have all been conditioned from childhood to think about the negative integers as “minus the corresponding natural”. This conditioning is not something bad: it makes a ton of sense when modeling the real world, but when comparing cardinalities between infinite sets, that is, sets that will never be counted entirely in finite time, we distance ourselves from the real world a bit, so we need a different mathematical model. To that end, let’s build a new model for the naturals. Here are the naturals under our original model:

0, 1, 2, 3, \dots

These digits that we have all agreed to be using have not been around forever. The ancient Greeks used lowercase versions of their alphabet: \alpha, \beta, \gamma, \delta, \epsilon, \sigma\tau', \zeta,\ \dots\ \omega to name their “digits”, while the Romans used a subset of their alphabet “stacked” in a certain way: I, II, III, IV, V, VI, \dots, X, XI, \dots. These “stacked” symbols cannot really be called digits the way that we understand them, especially since new symbols appear long down the line (C, M, etc.). The symbols we use today we actually owe to the Arabic Renaissance of the early Middle Ages.

The point is that I can rename every single one of these numbers in a unique way and still end up with a set that has the exact same properties (e.g. closure of operations, cardinality, ordinality) as \color{red}{\mathbb{N}}. This is formally defined as the Axiom of Replacement. So, let’s go ahead and describe \mathbb{N} by assigning a random string to every single number, assuming that no string is inserted twice:

foo, bar, otra, zing, tum, ghi,\dots

Which corresponds to our earlier

0, 1, 2, 3, 4, 5,\dots

Cool! Now the axiom of replacement clearly applies to \mathbb{Z} as well, so I will rewrite

\dots, \color{blue}{-5, -4, -3, -2, -1,}\ \color{magenta}{0, 1, 2, 3, 4, 5,}\dots

into:

\dots, \color{blue}{qwerty, forg, vri, zaq, nit,}\ \color{magenta}{bot, ware, yio, bunkm, ute, kue,}\dots

Call these “transformed” sets \mathbb{N}_{new} and \mathbb{Z}_{new} respectively. Under this encoding, guys, I believe it’s a lot more obvious that \mathbb{N}_{new} \not\subset \mathbb{Z}_{new} in the general case. \mathbb{N}_{new} \subset \mathbb{Z}_{new} under these random encodings is so not-gonna-happenish that its probability is not even axiomatically defined. Therefore, now we can view \mathbb{N} and \mathbb{Z} as infinite lines floating around in space, lines that we have to somehow put next to each other and see whether we can line them up exactly. If you tell me that even under this visualization, the line that represents \mathbb{Z}_{new} is infinite in both directions, whereas that of \mathbb{N}_{new} has a starting point (0), then I would tell you that I can effectively “break” the line that represents \mathbb{Z}_{new} in the middle (0) and then mix the two lines together according to the mapping that corresponds to:

0, 1, -1, 2, -2, 3, -3, \dots

Now we no longer have the pesky notation of the minus sign, which pulls us to scream “But the naturals are a subset of the integers! Look! If we just take a copy of the naturals and put a minus in front of them, we have the integers!”. We only have two infinite lines, that start from somewhere, extend infinitely, and it is up to us to find a 1-1 and onto mapping between them. That is, it is up to us to find a 1-1 mapping between:

foo, bar, otra, zing, tum, ghi,\dots

and

bot, ware, nit, yio, zaq, bunkm,\dots

(Note that I re-ordered the previous encoding \dots, \color{blue}{qwerty, forg, vri, zaq, nit,}\ \color{magenta}{bot, ware, yio, bunkm, ute, kue,}\dots according to the “hopping” map into \color{magenta}{bot}, \color{magenta}{ware}, \color{blue}{nit}, \color{magenta}{yio}, \color{blue}{zaq}, \color{magenta}{bunkm},\dots.)

Under this “visual”, you guys, it makes a lot of sense to try to estimate if the two sets have the same cardinality and, guess what, they do 🙂

Not much else to say on this topic, everyone. We can have a bunch of applications of the axiom of replacement to prove, for example, that the cardinality of the integers, \aleph_0, is also the cardinality of \mathbb{N} \times \mathbb{N}, \mathbb{Q}, etc. It is only when we start considering sets such as \mathbb{R}, \mathcal{P}(\mathbb{N}) and \{0, 1\}^\omega that this idea that we can be holding two infinite lines in space fails.

Ordinality

There’s not much to say here except that the easiest way to understand how an order differs from a set is to consider an ordering exactly as such: an order of elements! Think in terms of “first element less than second, less than third, less than …”. The simplest way possible. It is then that we can prove rather easily that \omega \prec \zeta \prec \eta.

Things only become a bit more complicated when considering the ordering \omega + \omega:

0 < \frac{1}{2} < \frac{3}{4} < \frac{5}{6} < \dots < 1 < \frac{4}{3} < \frac{3}{2} < \dots < 2 < \dots

Please note that this ordering is clearly not the same as \eta, the ordering of \mathbb{Q}. Between the first and the second element, for instance, there are countably infinitely many rationals: \frac{1}{100}, \frac{2}{5}, \dots, \frac{3}{7}, \dots which are not included in the ordering.

Finally, realize the meaning of “incomparable” orderings: a pair of orderings \alpha, \beta will be called incomparable if, and only if:

(\alpha \npreceq \beta) \wedge (\beta \npreceq \alpha).

So please realize that this is not the same as saying, for instance, \beta \nprec \alpha.

I think this is all. I am bothered when I can’t explain something well to a student, so I thought I’d share my views on countability in case it makes the subject easier to grasp.


Piazza sucks.

I’m in Academia. Well, at least the part of Academia that’s still related to actual teaching.

The vast majority of my collaborators in UMD as well as in other institutions use Piazza to host their courses. It’s easy, it’s fast and at the very least it looks like it has close to 100% uptime (written during a time that our UMD fork of Instructure’s Canvas has been down for hours).

However, active development on Piazza has effectively stopped since early 2015. During Fall 2016, I would check the Android Play Store for updates to the Piazza app, and for a long time the latest one dated back to February 2015. Right now, it appears that certain patches have been made as recently as Feb 7, 2017, but the app is still atrocious. This is just an example of the many issues that surround Piazza.

The biggest issue, for which I just submitted a bug report, is the fact that there is no filesystem consistency on Piazza. If you use the “Resources” tab and post a link to a file in a discussion topic, and you then want to make a change to the file, all the other links become stale. They point to locations in an Amazon S3 filesystem. When the number of links to a file grows, this becomes a huge problem.

Furthermore: the only way to password-protect your page right now (of importance to any person who needs a private discussion forum) is to actually send an e-mail to team@piazza.com with your password choice, which of course is then stored in cleartext in your e-mails. My request was handled immediately, but what happens if you need to password-protect it during a weekend? I use Piazza to communicate with my TAs, and the information conveyed is often sensitive (thoughts on midterms, recitation topics, rubrics). I don’t want students snooping around (it’s already happened once this semester).

Piazza is perfect for communication: students love it, because other students can immediately offer responses. In contrast, nobody ever uses the Canvas “Discussion” feature. However, in departments where the average course registration is in the hundreds (like ours), moderating such a huge forum requires TAs dedicated to doing only that. It’s not impossible, but it’s hard. Ensuring that solutions to problems under active submission don’t leak is tough. A student can take down a post within minutes, yet a PDF with solutions can already have reached a good portion of the class.

But all of these things are simple issues of technical decisions and design. They could be addressed through either bug reports or (a)synchronous brainstorming sessions using tools like Confluence or HipChat. What really bugs me is how the Piazza team doesn’t seem to care about the product any more, shifting their entire focus into making it yet another recruitment platform, which they call “Piazza Careers”. Seriously? That’s what students need? Another recruitment platform? I’m guessing that the Piazza team had some sort of shift in their venture capital and was required to transform the entire platform into a recruiting tool.

It’s a real bummer. Blackboard has been a disaster (link) and Canvas has too many issues to discuss in a blog post. Active-learning services like TopHat are easy to game when people use their phones. I recently had a student pretty much admit to me during my office hours that they never attended the lectures of a certain math course, but had a friend of theirs text them the TopHat code and the proper answers to the questions. The clickers and accompanying software offered by Turning Technologies have multiple connectivity issues, even though those issues mostly have to do with the ELMS-CANVAS integration and not the software or clicker device itself.

We need reliable educational software. Not another recruitment platform. Pooja Shankar, an alumna of CS UMD and the founder of Piazza, ought to be the first person to recognize this.

 

 

As new data arrives, the covariance matrix takes notice.

The problem

I recently read a paper on distributed multivariate linear regression. The paper essentially deals with the problem of when to update the global multivariate linear regression model in a distributed system, when the observations available to the system arrive at different compute nodes, at different times and, usually, at different rates. In the monolithic, single-node case, the problem has of course been solved in closed form, since for a dependent variable vector y and a design matrix X with examples in rows, the parameter vector β can be found as per:

The linear regression solution: $\beta = (X^\top X)^{-1} X^\top y$
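In MATLAB, that closed-form solution is a one-liner; here is a minimal sketch on synthetic data (the sizes and coefficient values are my own, purely for illustration):

% Closed-form ordinary least squares on a single node.
% X: N x d design matrix (examples in rows), y: N x 1 vector of responses.
N = 200; d = 3;
X = randn(N, d);
y = X * [1; -2; 0.5] + 0.1 * randn(N, 1); % synthetic responses with known coefficients
beta = (X' * X) \ (X' * y); % solves (X'X) * beta = X'y without forming an explicit inverse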

This is a good paper, and anybody with an interest in distributed systems and/or linear algebra should probably read it. One of the interesting things (for me) was the authors’ explanation that, as more data arrives at the distributed nodes, a certain constraint on the spectral norm of a matrix product that contains information about a node’s data becomes harder to satisfy. It was not clear to me why this was the case and, in the process of convincing myself, I discovered something that is probably obvious to everybody else in the world, yet I still opted to make a blog post about it, because why the hell not.

When designing any data sensor, it is reasonable to assume that the incoming multivariate data tuples will have a non-trivial covariance. In the case of two-dimensional data, for example, we should not expect all the incoming data points to lie on a straight line (which would denote full inter-dimensional correlation in the two-dimensional case). In fact, one might assume that as more data tuples arrive, the covariance of the entire data set tends to increase. We will examine this assumption again in this text, and we will see that it does not always hold water.

This hypothesized increase in the data’s covariance can be mathematically captured by the spectral (or “operator”) norm of the data’s covariance matrix. For symmetric matrices, such as the covariance matrix, the spectral norm is equal to the largest absolute eigenvalue of the matrix. If a matrix is viewed as a linear operator in multi-dimensional Cartesian space, its largest absolute eigenvalue tells us how much the matrix can “stretch” a vector in the space. So it gives us a sense of how “big” the matrix is, hence its incorporation into a norm formulation.
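As a quick sanity check (my own toy snippet, not from the paper), MATLAB agrees that the two quantities coincide for a symmetric matrix:

C = [2 1; 1 3]; % a small symmetric, covariance-like matrix
spec_norm = norm(C); % matrix 2-norm, i.e. the largest singular value
max_abs_eig = max(abs(eig(C))); % largest absolute eigenvalue
fprintf('spectral norm = %.4f, max |eigenvalue| = %.4f\n', spec_norm, max_abs_eig);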

The math

We will now give a mathematical intuition about why the incorporation of new data in a sensor is likely to increase its spectral norm or, as we now know, its dominant eigenvalue. For simplicity, let us assume that the data is mean-centered, so that we don’t need to complicate the mathematical presentation with mean subtraction. Let λ be the covariance matrix’s dominant eigenvalue and u be a unit eigenvector in the respective eigenspace. Then, from the relationship between eigenvalues and eigenvectors, we obtain:

$(X^\top X)\, u = \lambda u$
$\lambda = \|\lambda u\|_2 = \|(X^\top X)\, u\|_2$

with the second line being a result of the fact that u is assumed to be a unit vector. It is therefore obvious that, in order to gauge how the value of λ varies, we must examine the 2-norm (Euclidean norm) of the vector on the right-hand side of the final equals sign.
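Here is a small numerical check of that identity (again, a sketch of my own; the data is arbitrary):

X = randn(100, 2); % arbitrary two-dimensional data
X = bsxfun(@minus, X, mean(X)); % mean-center the data
S = X' * X; % scatter matrix (proportional to the covariance matrix)
[V, D] = eig(S);
[lambda, idx] = max(diag(D)); % dominant eigenvalue
u = V(:, idx); % corresponding unit eigenvector
fprintf('lambda = %.4f, ||S*u||_2 = %.4f\n', lambda, norm(S * u));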

Let’s try unwrapping the product that makes up this vector:

Derivation 2

Now, let us focus on the first element of this vector. If we unwrap it we obtain:

Derivation 3

The crimson factors really let us know what’s going on here, since the summations in the parentheses involve the “filling up” of values in the covariance matrix that lie beyond the main diagonal. For fully correlated data, those values are all zero. On the other extreme, they are all non-zero. It is natural to assume that, as more data arrives, all those values tend to deviate from zero, since some inter-dimensional uncorrelation is, stochastically, bound to occur. On the other hand, if new data is such that it causes an increased inter-dimensional correlation, then the sum will tend towards zero, and the covariance matrix’s spectral norm will actually decrease!

The second vector element deals with the correlation between the second dimension and the rest, and so on and so forth. Therefore, the larger these elements are in absolute value, the larger the 2-norm || X' X u || is going to be, and vice versa.

Some code

We can demonstrate all this in practice with some MATLAB code. The following function will generate some random data for us:

function X = gen_data(N, s)
%GEN_DATA Generate random two-dimensional data.
% N: Number of samples to generate.
% s: Standard deviation of Gaussian noise to add to the y dimension.
x = rand(N, 1);
y = 2 * x + s.* randn(N, 1); % Adding Gaussian noise
X = [x,y];
end

This function will generate the covariance matrix of the input data and return its spectral norm:

function norm = cov_spec_norm(X)
% COV_SPEC_NORM: Estimate the spectral norm of the covariance matrix of the
% data matrix given. 
%   X: An N x 2 matrix of N 2-dimensional points.

COV = cov(X);
[~, S, ~] = svd(COV);
norm = S(1,1).^2;
end

Then we can use the following top-level script to create some initial, perfectly correlated data, plot it, estimate the covariance matrix’s spectral norm, and then examine what happens as we add chunks of data, with increasing amounts of Gaussian noise:

% A vector of line specifications useful for plotting stuff later
% in the script.
linespecs = cell(4, 1);
linespecs{1} = 'rx';linespecs{2} = 'g^';
linespecs{3} = 'kd'; linespecs{4} = 'mo';

% Begin with a sample of 300 points perfectly
% lined up...
X = gen_data(300, 0);
figure;
plot(X(:, 1), X(:, 2), 'b.');  title('Data points'); hold on;
norm = cov_spec_norm(X);
fprintf('Spectral norm of covariance = %.3f.\n', norm)

% And now start adding batches of 50 noisy points.
for i =1:4
    Y = gen_data(50, i / 5); % Adding Gaussian noise 
    plot(Y(:,1), Y(:, 2), linespecs{i}); hold on;
    norm = cov_spec_norm([X;Y]);
    fprintf('Spectral norm of covariance = %.3f.\n', norm);
    X = [X;Y]; % To maintain current data matrix
end
hold off;

(Note that in the script above, every new batch of data gets an increased amount of noise, as can be seen in the call to gen_data.)

One output of this script is:

>> plot_norms
Spectral norm of covariance = 0.191.
Spectral norm of covariance = 0.200.
Spectral norm of covariance = 0.200.
Spectral norm of covariance = 0.220.
Spectral norm of covariance = 0.275.

A plot of our data

Interestingly, in this example, the spectral norm did not change after the incorporation of the second noisy batch. Can it ever be the case that we see a decrease of the spectral norm? Of course! We already said that the crimson summations above, corresponding to summations over cells of the covariance matrix beyond the main diagonal, can fall closer to zero after we incorporate new data whose dimensions are more correlated with the existing data’s. Therefore, in the following run, the incorporation of the first noisy set actually increased the amount of inter-dimensional correlation, leading to a smaller amount of covariance (informally speaking).

>> plot_norms
Spectral norm of covariance = 0.177.
Spectral norm of covariance = 0.174.
Spectral norm of covariance = 0.179.
Spectral norm of covariance = 0.220.
Spectral norm of covariance = 0.248.

Another data plot.

Discussion

The intuition is clear: as new data arrives at a node, observing the fluctuation of the spectral norm of its covariance matrix can tell us some things about how “noisy” our data is, where “noisiness” in this context is defined as “covariance”. I guess the question to ask here is what to expect of one’s data. If we run a sensor long enough without throwing away archival data vectors, it’s unclear whether we can expect the spectral norm to keep increasing (at least not by a significant margin). We should instead expect a sort of “saturation” of the spectral norm around a limiting value. This can be shown empirically by a modification of our top-level script, which runs for 50 iterations (instead of 4) but generates batches of data with standard Gaussian noise, i.e., the noise does not increase with every new batch:

% Begin with a sample of 300 points perfectly
% lined up...
X = gen_data(300, 0);
norm = cov_spec_norm(X);
fprintf('Spectral norm of covariance = %.3f.\n', norm)

% And now start adding batches of 50 noisy points.
for i =1:50
    Y = gen_data(50, 1); % Adding Gaussian noise 
    norm = cov_spec_norm([X;Y]);
    fprintf('Spectral norm of covariance = %.3f.\n', norm);
    X = [X;Y]; % To maintain current data matrix
end

Notice how the call to gen_data now adds standard Gaussian noise by keeping the standard deviation fixed at 1. One output of this script is the following:

>> toTheLimit
Spectral norm of covariance = 0.203.
Spectral norm of covariance = 0.341.
Spectral norm of covariance = 0.388.
Spectral norm of covariance = 0.439.
Spectral norm of covariance = 0.535.
Spectral norm of covariance = 0.635.
Spectral norm of covariance = 0.677.
Spectral norm of covariance = 0.744.
Spectral norm of covariance = 0.818.
Spectral norm of covariance = 0.842.
Spectral norm of covariance = 0.881.
Spectral norm of covariance = 0.913.
Spectral norm of covariance = 0.985.
Spectral norm of covariance = 1.030.
Spectral norm of covariance = 1.031.
Spectral norm of covariance = 1.050.
Spectral norm of covariance = 1.097.
Spectral norm of covariance = 1.148.
Spectral norm of covariance = 1.154.
Spectral norm of covariance = 1.186.
Spectral norm of covariance = 1.199.
Spectral norm of covariance = 1.280.
Spectral norm of covariance = 1.318.
Spectral norm of covariance = 1.323.
Spectral norm of covariance = 1.325.
Spectral norm of covariance = 1.344.
Spectral norm of covariance = 1.346.
Spectral norm of covariance = 1.373.
Spectral norm of covariance = 1.397.
Spectral norm of covariance = 1.447.
Spectral norm of covariance = 1.436.
Spectral norm of covariance = 1.466.
Spectral norm of covariance = 1.466.
Spectral norm of covariance = 1.482.
Spectral norm of covariance = 1.500.
Spectral norm of covariance = 1.498.
Spectral norm of covariance = 1.513.
Spectral norm of covariance = 1.518.
Spectral norm of covariance = 1.518.
Spectral norm of covariance = 1.499.
Spectral norm of covariance = 1.492.

It’s not hard to see that after a while the value of the spectral norm tends to fluctuate around 1.5. Under the given noise model (Gaussian noise with standard deviation = 1), we cannot expect any major surprises. Therefore, if we were to keep a sliding window over our incoming data chunks, and (perhaps asynchronously!) estimate the standard deviation of the spectral norm’s values, we could maybe estimate time intervals during which we received a lot of noisy data, and act accordingly, based on our system specifications.
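As a sketch of that idea (the window length and threshold below are hypothetical, purely illustrative choices), we could store the per-chunk spectral norms and flag windows whose standard deviation spikes:

% Re-run the earlier experiment, but store each chunk's spectral norm.
X = gen_data(300, 0);
norms = zeros(50, 1);
for i = 1:50
    Y = gen_data(50, 1);
    X = [X; Y];
    norms(i) = cov_spec_norm(X);
end

% Hypothetical sliding-window monitor over those values.
w = 10;        % window length, in chunks
thresh = 0.05; % flag windows whose standard deviation exceeds this
for t = w:length(norms)
    window = norms(t - w + 1 : t);
    if std(window) > thresh
        fprintf('Chunks %d-%d look unusually noisy (std = %.3f).\n', t - w + 1, t, std(window));
    end
end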

The 3-year long health saga of a medically insured PhD student

Intro

My name is Jason Filippou, and I’m a PhD student in the Computer Science department of the University of Maryland, College Park.

My intention with this blog was, and still is, to post interesting things about Computer Science, music, and maybe some latent psychological elements that I find compelling. I simply find few other topics interesting at this time. However, a recent health saga, which exposed a level of barbarism and medical malpractice that I never thought possible, has made it useful to document my experience on a platform more friendly towards static information dissemination than Facebook, a website tuned towards posting pictures of food or other clearly time-wasting material, and on which I had been documenting this stuff until very recently. I will therefore use the first post in this blog to talk about this. It is important for me to have a static account of what has happened over the 3 years that I’ve been a PhD student in the USA, in conjunction with disseminating this information online to raise awareness of the level of malpractice that even insured people can be faced with in that particular country.

An effort will be made to stick to the facts. I will also avoid mentioning specific names of medical professionals or insurance policies, unless legally advised to do so down the line. This will be a lengthy post, but one well worth reading to the end, particularly if you are interested in the healthcare situation in the USA, or are perhaps considering emigrating there. I would advise against reading it during a meal, since it contains references to G.I-related symptoms which could potentially gross people out.


August 2012 – First blood

I arrived in the USA on the 3rd of August 2012. Within less than 24 hours, I ingested pasta sauce that had gone bad months before and naturally got food poisoning. I vomited twice. More unnervingly, over the next few days I started having frequent diarrhea with bloody stools, in conjunction with abdominal cramping and an urgency to visit the bathroom immediately. It should be noted that, at the time, I had not yet signed my insurance forms in the US, because I’d arrived two weeks early in order to acclimate myself to College Park as much as possible before beginning my studies.

Regardless, I decided I should visit a doctor. So I visited the University Health Center on the College Park campus, where I was seen by a G.I doctor on the spot (let’s call them Dr. A for brevity), with a rather small charge to my University Bursar account. The doctor, having no history of me in the past (I’m a Greek-Canadian dual citizen who had lived his entire life up until August 2012 in Athens, Greece), laid out the possibilities in front of me, and those ranged from run-of-the-mill food poisoning to colon cancer. I hadn’t been in the States for more than 5 days, I was completely alone since my roommate at the time was on vacation, my parents were thousands of miles away in Greece, and I was presented with the possibility of colon cancer, a pre-existing condition that would probably make it impossible for me to successfully sign insurance forms when the time came.

It turned out that I did not have colon cancer, the final diagnosis at the time being gastroenteritis. Over the next few months, I noticed a shift in the behavior of my G.I tract; I started visiting the bathroom more frequently than I used to, and my discharges had changed in consistency, size, etc. This was not pleasing, but considering the fact that my diet had also changed, and this can only affect one’s bathroom habits in one way or another, I was not particularly alarmed at the time.


June 2013 – The “treatment” begins

Around the end of the 2013 Spring semester, I started developing symptoms similar to those of August 2012. Thinking that this could not continue, I visited a G.I doctor in my area, let’s call them Dr. B. At that point, I was of course medically insured, let’s call the policy Insurance Policy A. The doctor asked me some questions, and theorized (without so much as requesting a stool or blood sample, for instance) that I was suffering from Irritable Bowel Syndrome and that my hemorrhoids were also inflamed. They therefore prescribed some rectal suppositories (Anucort), suggested some slight diet adjustments and instructed me to see them a month from that date. The symptoms did subside, yet a month later, when I arrived at their office for the scheduled appointment, the doctor’s secretary told me that they had tried to contact me because the doctor had to cancel, yet they could not reach me on my phone “a couple of times”. At the time I owned a phone which, while not the best on the market, was endowed with state-of-the-art voicemail technology. It is therefore questionable exactly how the front office could not reach me on my phone every one of those “couple” of times. At any rate, I had to miss that appointment and re-schedule for a post-August meeting, since during August I would be visiting my home country of Greece for vacation.


 

August 2013 – A glimmer of hope vanishes fast

During my August vacation, I started developing symptoms again. I arrived back in the States on August 31, and immediately requested an appointment with Dr. B. I got an appointment on September 4, during which again no proper diagnosis was made (no rectal examination, no stool samples, nothing of the sort), yet an appointment for a colonoscopy was made for the earliest available date: November 27th. That constituted an almost three-month wait for an insured patient with frequent diarrhea, bloody stools, urgency to visit the bathroom and at times severe abdominal cramping.

Instructions for properly cleansing my intestines and colon prior to the procedure were sent to me by mail a couple of days before November 27. However, nowhere within the instructions was it mentioned that, the day before a colonoscopy, the patient should not be ingesting any food after noon. Perhaps it was assumed that this was common, universal knowledge. For a person who, prior to coming to the States, boasted a perfectly healthy G.I system and had never had a colonoscopy before, it was not. Therefore, I had dinner at 6pm the previous day and at 8pm started drinking my four liters of laxative. Naturally, my colon was not perfectly clean for the colonoscopy despite the laxative treatment, and Dr. B was not able to get a very good view of my colon. Regardless, they “diagnosed” (I have to put this in quotes; maybe they had to “diagnose” something for insurance purposes? I can’t imagine a failed colonoscopy producing anything like an actual diagnosis.) proctitis and inflamed hemorrhoids, and prescribed the same suppository.


June 2014 – A dreadful decision

Like clockwork, symptoms re-emerged some months later. In early June 2014 I called Dr. B again and requested an appointment. When I heard that the earliest appointment they had for me would be approximately a month later, I declined the appointment and sought a referral for a different G.I doctor. I received a referral for this doctor, Dr. C, and visited them mid-June. Dr. C listened to my concerns and symptoms and once again theorized (without so much as a rectal examination or stool/blood sample) that I suffered from I.B.S and inflamed hemorrhoids. They asked for details about my diet and pin-pointed certain foods as possible culprits (red onions, creamy soups / salad dressings). They also suggested that in the mornings I have a spoonful of Benefiber with my breakfast and, since I typically had a cup of coffee in the morning and caffeine tends to quickly cause bowel movements, that I wait for my first bowel movement of the day before leaving for work. Supposedly, the fast expulsion of the day’s first stools would “train” my G.I tract to cause a bowel movement when I wanted it to, instead of when it wanted to (Dr. C’s words).


Fall Semester of 2014 – False information becomes the norm

I followed Dr. C’s advice, and my symptoms subsided. However, at this point, it should be clear to the reader that my symptoms tended to have seemingly random ebbs and flows. It is therefore not safe to say that my symptoms subsided because my doctor suggested Benefiber and a 20-minute wait after my morning coffee before I took the bus to work.

For our first follow-up appointment, after 3 months, I had made progress, in the sense that I didn’t have grave symptoms such as the ones that had made me want to visit a doctor immediately in the first place. However, I still exhibited a bathroom visitation pattern that was markedly different from my previous years in Greece: much more frequent visits (approx. 5 per day), lighter stool consistency, mucus on the stool sometimes, increased flatulence, the works. Dr. C focused on the flatulence and suggested, essentially, that I eat less. The flatulence, in their opinion, was a direct consequence of bacteria in my intestines feeding off of the increased amount of food that I was consuming (I’d put on maybe 12lb since I came to the States).

For our next follow-up, late Fall 2014, I mentioned the same things to Dr. C, and they theorized that in addition to my I.B.S (apparently we were settled on I.B.S at the time) I might also be lactose intolerant. Throughout my adolescent life, I tended to consume either a glass of milk or a bowl of cereal on a daily basis, always as part of my breakfast. Never had I had any issues with milk. It is true, however, that many people in their 20s develop lactose intolerance, therefore we opted to run with that. So I switched from dairy milk to almond milk, and whenever faced with the prospect of eating something dairy-based, I would consume a pair of Lactaid tablets.  Symptoms did not re-emerge til the end of the year.


 

January 2015 – Cars, sleet and magic soup

During January 2015, my symptoms came back. Compared to previous times, I would notice a much greater need for a bathroom visit, to the point that I once had a public accident, which I covered up as best as I could, and it was good enough. Blood in the stool still existed, as did abdominal pain. In fact, that period of time was particularly distressing for me because I was alone in the house, and I had to walk for 35 minutes minimum to the driving school where I was taking driving lessons. The weather at the time in Maryland was steadily below 30 degrees Fahrenheit (-1 Celsius; this is considered cold if you’ve lived in Greece your entire life) and, because of snow and frost on the streets, public transportation was unreliable. Being late to any given driving lesson because a bus was late was not an option; I would have to repeat the lesson. So I had to rely on my own two legs for walking to my school in the freezing cold, the legs themselves depending on an abdomen that was not under severe pain. This was a dependency that was not met. Long story short, those were very painful walks that took much more than the 35 minutes that Google Maps claimed the on-foot distance was.

The symptoms all but vanished after a visit I made to a local Vietnamese food place (one of the various “Pho” places that appear to be ubiquitous in urban areas nowadays), where I had a vegetable soup that proved itself therapeutic. I mentioned the entire experience to Dr. C during our regular appointment two months later, and they appeared… pleased. I guess the fact that I had apparently found an herbal remedy of sorts must have somehow backed up their belief that this condition was something I could essentially teach my body to deal with in a somewhat self-healing, herbal fashion.


 

June & July 2015 – A living hell

Late June

Come June 2015, I was working under a different advisor in Maryland, and I was feeling good. Both of the people I have worked with are top representatives of their field, yet the second one appeared to be more in tune with my research goals, and I therefore opted to switch to them. In conjunction with my symptoms disappearing for a couple of months (again, ebbing and flowing, it seems), I was feeling good. I had a July 15th deadline that I was working hard for, and a very good chance of catching it.

No matter. Around the end of June, the symptoms hit me the hardest they ever had. 15-20 visits to the bathroom on a daily basis. Most of those visits featuring purely watery discharges or blood. An extreme need/urgency for every one of those visits. Loss of appetite, nausea. Abdominal pain that had me rise from my office chair and head to the nearest bathroom every 20 minutes at most, praying that the bathroom would be vacant so that I would avoid accidents in the workplace. A visit to the Vietnamese soup place that I mentioned earlier did not help at all.

At this point it should also probably be mentioned that I was encountering numerous personal difficulties, concerning the mental strain and computer coding / data cleansing required to catch my deadline, a driver’s skills test that went really badly because the person whose car I had to drive had neglected some key elements/paperwork required to prove the drivable status of the car, as well as the fact that my new insurance provider (Insurance Policy B) was refusing payment of one of my two weekly therapist appointments, completely illegally and 100% out of line with respect to my stated benefits. I was therefore encountering significant personal difficulties even before the symptoms hit me, and my emotional state was very fragile. This detail will be important moving forward, so that the reader is able to somewhat put themselves in my shoes and grasp the amount of perseverance that has been necessary to maintain my sanity and be able to type this text as we speak.

The first thing I did was call Dr. C on their cell phone around the 22nd of June and request their input on my situation. They told me to take Pepto-Bismol since “it sounds like you had something bad to eat”. I did that for two days, to no avail. So I called them again on Thursday, June 25th, and they re-directed me to the front office to schedule an appointment.

June 30th

So I then called Dr. C’s office requesting an appointment. A lady booked an appointment for me for Tuesday, June 30th, 4pm in my doctor’s office located in City A. Because of my dependency on public transportation, I had to take an early bus and arrived at the office one hour early, at which time I was informed that my appointment had been booked by accident in City B. I had no chance of arriving at City B on time, not even if I had a car to drive there.

I was in complete shock. It appeared to me at the time that nothing – absolutely nothing – was going my way, and surely, if the reader read the italicized text a few paragraphs ago, they might be inclined to feel the way I felt. After visiting the office’s bathroom, I was told to wait in the lobby so that the staff might find a suitable doctor to see me, given that my appointment had been accidentally booked by a trainee, so the fault lay with them and not with me. I remember frantically texting a friend or two through my phone while I was waiting for somebody else to see me, telling them phrases such as “Nothing’s going my way, oh God, what’s happening, why will nobody care, nobody cares”, and the like.

After about half an hour, a member of the office staff announced to me that no doctor could see me, presumably because of appointment backlogging. At that point I responded: “That’s fine, I will visit the ER next door.” (The doctor’s office was in a building adjacent to a hospital and I was simply in too much pain and discomfort to let this affect me further.) I left and did just that. I checked into the ER and began a 4-hour wait for somebody to see me. My temperature and some bloodwork were both taken almost immediately after check-in, and it was determined that I had a fever. For the rest of the diagnosis, I would presumably have to wait until I moved to the top of the line and a G.I doctor could see me. My phone was dying, and I was worried that by the time I was discharged I would have nobody to drive me back home, so I texted a friend, told them that I would need a ride soon but couldn’t tell them the exact time that I would be discharged (for I did not know it!), and that I would need to shut down my phone for battery considerations. So I was dependent on somebody to pick me up after discharge, but could not confirm with them because I had no means of communication with the outside world.

At this point, it should also be mentioned that, according to Insurance Policy B, I have a co-pay of $150 for every ER visit and, in order to be admitted to a hospital for so-called “inpatient services”, one needs pre-authorization, presumably from a primary care physician or specialist. This piece of information might be important for the reader, such that they develop a full personal view of the way that I was treated.

Once I moved to the top of the queue, I was given a bed and was administered I.V.s with saline and antibiotics (Ciprofloxacin and Metronidazole). Within 15 minutes or so, a doctor (Dr. D) came to talk to me about my symptoms. I explained my symptoms to them, and told them that I had been visiting Dr. C for those symptoms. They appeared to have knowledge of who Dr. C was. They also explained to me that I was getting those antibiotics because I had a fever and an elevated white cell count, so we were operating under the assumption of a bacterial infection. Dr. D proceeded to tell me that the goal would now be for me to provide a urine and a stool sample and, in the meantime, they would contact Dr. C to get some advice on how to proceed next. As a small reminder to the reader, the plan that day was to see Dr. C about my symptoms, but my appointment had been booked in the wrong city.

During the next hour or so, I was frequently visited by medical staff, who appeared to be somewhat confused about what I was supposed to be doing at any given point in time. For instance, after Dr. D talked to me the first time, a nurse came over and asked me whether I’d talked to Dr. D. Some time later, another nurse came to me and asked me whether I had provided my samples yet, despite the fact that I was obviously hooked up to I.V.s (and therefore largely immobile) and hadn’t been given any information or equipment (e.g., urine / stool samplers) that would help me provide the samples.

After about an hour, Dr. D came over and told me that they’d talked to Dr. C and that “he would squeeze me in between appointments the next day”. I basically would have to call the front office the next morning and they would find a time for me.

As mentioned at the start of this post, an effort has been made to stick to the facts, without too much speculation or personal opinion. However, after talking to some people since this June 30th experience, it appears to strike many as remarkable that neither Dr. C nor Dr. D provided a pre-authorization for hospitalization. Here we have a patient with chronic symptoms that involve bloody stools, abdominal cramping, nausea, fever, and seemingly uncontrollable diarrhea, and the best that we can do for him is give him some antibiotics and a next-day appointment while we wait for his stool samples, which were finally requested after about 3 years of no real diagnosis (reminder: the only time stool samples had been requested of me was during my first week in Maryland).

They then proceeded to tell me that the game plan involved me providing stool and urine samples and being discharged after that. After about 15 minutes, a nurse came by, unhooked my I.V.s and told me that I would need to visit a bathroom next door, where there existed “multiple samplers” that I could use to collect my samples. Almost contemporaneously, my friend arrived and declared his availability to drive me over to my place after I was done, so at least that part had been covered.

In order to collect the samples, I had to visit a very dirty bathroom that reeked of – and was filled with pools of – urine. On a shelf to the right of the door, there were many urine samplers. I did not see any samplers specialized for stool collection, so I assumed that they simply did not exist as a notion in the USA and opted to just use two urine samplers. While it is true that it is phenomenally easy for guys to collect urine samples, this is not necessarily true for stool samples, particularly for a person whose stool is almost entirely watery and bloody due to illness. For obvious reasons I will have to omit the relevant details, but needless to say, I was disgusted beyond belief. I slightly soiled my clothes and my hands, and ended up washing my hands and the outer surfaces of the samplers for 5 minutes straight out of pure disgust. The entire process took me 30 minutes or so.

Once I walked out of the bathroom, samplers in hand, a mobile register was waiting for me next to my bed, in order to charge me my $150 co-pay for my treatment that day. Of course, Dr. D and the hospital got a piece of a much larger pie that day, courtesy of my insurance provider:

Insurance claim for ER Visit.

July 1st

The next day I visited Dr. C. That day was going to be Dr. C’s last day in the States before a vacation that would keep them outside the country until the 22nd of July, so that would be 21 days of separation between us. The game plan according to Dr. C was for me to undergo an antibiotic-based treatment for about 10 days and, if things did not improve, call the front office and request the first possible appointment for a colonoscopy, because “the fact that you have blood in your stool worries me”.

It becomes increasingly hard for one to stick to the facts and just the facts here. For about a year of appointments, which included complaints from me about the flaring up of my disease from time to time, the chief element of which was bloody stools, Dr. C had not been worried about the bloody stools. Only after I had to admit myself to the ER did they become worried about the bloody stools. Now that I am reading this italicized paragraph again, though, it still sounds like fact, so I will allow it.

I was prescribed the antibiotics I had been given intravenously in the ER (Ciprofloxacin and Metronidazole), to be administered twice a day in pill form. I was also prescribed Omeprazole, to be taken “as needed” in case of nausea. In conjunction with my daily dosage of Lexapro, a common SSRI which I had been taking for about a year to treat my depression and generalized anxiety disorder and which was administered and closely monitored by my psychiatrist, at that point I was taking 4 medications on a daily basis, three of which I was taking without any indication that they were actually going to work, because I hadn’t had an appropriate diagnosis. I essentially did not know what I had! At least the depression/GAD had been properly diagnosed, more than a year prior, and the FDA-approved medicine was used to treat the diagnosed disease.

It was at that point that my situation became quality-of-life destroying (not just debilitating, the word has been carefully chosen) and, as the reader will soon see, even life-threatening.

July 2-6

Starting the evening of July 1st, I began taking my prescribed antibiotics. Very quickly (by Thursday the 3rd) it became clear that I could no longer visit the office. On top of my symptoms, which did not improve, I had a close-to-complete loss of appetite and an erratic sleep schedule. Any attempt at creative/analytical thinking, the cornerstone of a successful graduate student and researcher, was hampered by abdominal pain, dizziness, and more than 10 daily visits to the bathroom, with the same qualitative characteristics. Clearly the antibiotics were not helping. I had to drop my deadline, which I officially did on July 6, after an e-mail to my advisor and collaborators in which I told them that I basically could not leave the house because of my symptoms.

My situation was dire; I could not work and was staying at home, slumped on a couch, being unproductive, with medication that apparently did not work.

July 7-10

On July 7th, I called Dr. C’s office again, fully aware that Dr. C would not be available for an appointment and that I’d have to schedule an appointment with somebody else. I was told that a different doctor, Dr. E, would call me back with some information for me. They did so immediately (within 5 minutes or less) and proceeded to ask me about my symptoms. They then told me that the goal right now should be for me to combat the diarrhea and proceeded to prescribe what might arguably be the most curious prescription of them all: Cholestyramine in powder form. An inspection of the link, as well as of the informative leaflet that I examined when receiving the prescription, reveals nothing about the drug combating diarrhea. In fact, if the reader were to follow the NIH link I just provided, they would determine for themselves that diarrhea is among the known side-effects of Cholestyramine, the drug itself typically being used for dealing with high levels of cholesterol in the bloodstream.

At any rate, believing that Dr. E, being a medical professional, knew what they were doing, I proceeded to add Cholestyramine to my list of medications, which had now grown to 5 different prescriptions. Dr. E also proceeded to tell me that if I was still having symptoms after taking Cholestyramine, I should contact the front office to arrange a colonoscopy with any available specialist. They also informed me that, on their end, it appeared that the soonest available appointment would be on the 22nd of July, which coincided with the date that Dr. C would be coming back. Of course, at the time, it was of much more interest to me to actually have the procedure on the 22nd, rather than have another appointment during which I would state the same things over and over, and only then be able to arrange a colonoscopy which, who knows, could take maybe 3 months or so to schedule, like my previous one in Fall 2013. To add to my temporary relief, Dr. E told me that other specialists in the office had sooner appointments available, even the next week. It appeared that I had some help after all.

To no avail. Cholestyramine caused me extreme heartburn and an inability to properly ingest any kind of food. I got to a point where I started sleeping with my pillow propped up, in a semi-upright position, because whenever I lay down I felt that my partly digested food (whatever little food I was eating, anyway) was being pushed up my esophagus, and I was not keen on dying of gastro-intestinal retrogression. So at that point I started really fearing for my life.

The morning of the 8th I phoned the front office requesting the soonest possible colonoscopy appointment. At that point, I was prepared to demand a same-day appointment if need be or, at the very least, a pre-authorization to go to the hospital. I was desperate. The person on the phone told me they’d call me back. They did, quite quickly, and told me the soonest possible appointment that they could arrange for me would be August 14. Clearly they did not have the relevant context and had attempted to book me through Dr. C, after examining their post-vacation schedule. I informed them that I’d talked to Dr. E, who had mentioned that there existed earlier options than that. They told me they’d look my situation up and call me right back.

Side note: Yes, I do have an excellent memory, which is unfortunate for numerous people that choose to think that, by default, people forget. I also tend to make very detailed notes and am a Google Calendar junkie.

So they did call me and suggested an appointment for July 22nd with Dr. E. At that point in time, with Dr. E’s prescription having brought me to my knees, and the 22nd of July being 14 days away (reminder: this is July 8), I reiterated to the person that Dr. E had told me that certain doctors had next-week appointments for colonoscopies open, and I asked them to look those options up, because “I can no longer go to work, this is very serious”. They told me they’d look it up and call me back. I received no further calls that day. The morning of the next day, Thursday the 9th, I called them again. They told me they’d call me back.

The morning of Friday the 10th, I called again, and I received the same response.

At 4:55pm, I logged into Skype and called my father through Skype credit, on his cell phone. The time difference between the eastern seaboard and Greece is 7 hours, so it’s somewhat fortunate that the man was awake. I asked him whether the private insurance policy that I had in Greece, which he had managed and paid for over the past 10 years, would cover me in case of hospitalization, to which he replied that yes, it did, 100%. I told him that, in light of this, I would be booking a next-day ticket to come to Greece, and that if he could make arrangements so that I would be looked after as soon as possible, that would be great.

And so I did (booked a ticket, that is), at a hefty cost.

Ticket for Athenian trip

The ticket was one-way, since I had no idea when I was going to be coming back to the States, and in what condition.

July 14-July 22

I arrived in Athens at midnight on the night of the 12th to the 13th, and spent a day at my parents’. My father had made arrangements for me to visit a local clinic on Tuesday the 14th.

Useful contextual update: Sunday the 12th of July was one of the most climactic days of modern Greek history, since the Greek government was negotiating a financial aid package with the Eurogroup. The possibility that Greece would leave the Eurozone (the so-called “Grexit”) was higher than ever and, given the level of collaboration between the Eurozone and the European Union itself, the status of the country as a member of the EU and the corresponding Schengen Area would also likely be jeopardized. To the day of this writing, the repercussions of that negotiation are still visible; banks have imposed capital controls which do not allow people to withdraw more than a certain amount of cash on a weekly basis, and there is massive unemployment, with budget cuts across the board. The collective spirits of the Greek people are at an all-time low, and people generally want to leave. It might naturally seem, therefore, as if I was between a rock and a hard place; extreme medical malpractice in the United States on one hand, and a devastated country on the other. Surely this guy’s health is doomed. This might make the following events even more surprising for the reader, and might raise interesting questions about the efficacy of the health system of the United States, generally considered the wealthiest country in the world, when compared to that of Greece, generally considered a failed state.

Upon arriving at the clinic, I was immediately rushed into the ER, where bloodwork was taken within 5 minutes. Two physicians arrived 5 minutes after that, and asked me questions about my symptoms. Not one hour after I arrived at the clinic, I was in a wheelchair on my way to an ultrasound. Immediately after that, I was sent downstairs for X-rays. Within 2 hours of arriving at the clinic, I was admitted to the hospital proper, into a room of my own, with all amenities included, including a TV set and a view.

The view from the balcony

Immediately, nurses arrived, hooked me up to I.V.s and gave me both detailed instructions and suitable equipment for collecting urine and stool samples. There was no question of my bathroom reeking of urine; it was spotless. I was also told that if I had any trouble collecting my samples, I should hit one of the buttons on the bathroom wall and I would receive help promptly. I did not need any help.

A next-day endoscopy and colonoscopy were arranged. In the meantime, my I.V.s (which contained the same antibiotics I had received in the States, as well as saline) were being changed as needed, and nurses would arrive and check my pulse and other vitals every 4 hours. Clear instructions were given to me about the colonoscopy: there was absolutely no food after 12pm on Tuesday, and at 4pm I started the laxative treatment. The reader can be assured that by 12pm on Wednesday, when I was wheeled towards the surgery room to get my colonoscopy, my colon was spotless.

My diet closely monitored, I had nothing to eat Wednesday and only one light soup Thursday. On that day, I also received my diagnosis: Ulcerative Colitis, along with a touch of Gastritis and Bulbitis, perhaps caused by the Cholestyramine, since I was not having issues with my stomach before taking that particular drug.

A subset of my discharge notes

In order to cause the colitis to go into remission, I needed to start an administration of a cortisone-based medication (Prezolon) and specialized medication for Ulcerative Colitis based on Mesalamine: Asacol and Salofalk. I stayed in the clinic until Wednesday the 22nd of July. By Monday the 20th, after intensive drug treatment and a monitored diet, I no longer had blood in my stool. On Tuesday I started eating solid food again. On Wednesday, I received my discharge documents and the G.I doctor came to me with the grim news: I would need to follow a very thorough medication schedule over the course of several months if I wanted this to go fully into remission. In detail, I would need to be taking (and am currently taking) the following medications:

  • Six 5mg pills of Prezolon in the morning, and 6 at night-time. A steady 5mg-per-week decrease of the dosage would effectively lead us from 6 and 6 to 6 and 5, 6 and 4… all the way to the end of the 12-week period, well into the Fall semester. The cortisone treatment being as heavy as it is, the doctor warned me that there exists the danger of my body not reacting well to the decrease of the dosage, at which point I will need to have them on call to discuss what to do (and really, the only thing to do would be to go to the clinic again and re-evaluate the drug treatment with them).
  • Three 800mg Asacol tablets, administered every 8 hours.
  • Two 20mg Losec capsules, before lunch and dinner.
  • Self-administration of 4g of Salofalk every night before sleep.

Which of course led to an interesting schedule, drug-wise:

My medication schedule

It immediately became clear to me that given: (a) The number of different medications that I would need to take over the course of the following months, (b) The health risks imposed by the necessity of the cortisone administration and (c) The fact that I simply no longer trusted a single G.I doctor in Maryland with prescribing me meds or seeing me if things went awry, I would not be able to return to the USA until the disease was under control. This would undoubtedly cause me a research roadblock, jeopardize my funding (which did end up going away in its entirety) and, as of the time of this writing, has been causing me major bureaucratic issues with respect to maintaining my student status in CS UMD as well as with procuring documents that I need in order to extend my lawful leave from the Greek Army due to PhD-level studies.

Note: The medically astounding way in which I was treated in Greece is also the product of the fact that I am the holder of a private insurance policy in that country. It is common knowledge in Greece that the public healthcare system, despite having great doctors, is impoverished enough to be lacking in even elementary medical supplies, such as band-aids. I have no doubt that if I had had to depend on the public healthcare system of Greece for my treatment, I would not have had this experience. My parents were smart enough to sign me up for a private healthcare plan ten years ago, when the premiums for such plans were still affordable. Ever since, my family has been paying for this plan without me ever having to use it, until now. To make ends meet, my father had to drop his own plan. We are what used to be called in pre-2009 Greece “classic middle class”, and we cannot afford to maintain such plans for all three of us (yes, I’m an only child), nor would we be able to afford such a plan if it had not been signed for 10 years ago.

Given the fact that my family has been paying money for this plan, and I myself have also been paying money out of my paycheck for the aforementioned US-based Insurance Policies A and B, I do not see why my experience with my Greek private healthcare provider does not lend itself to comparisons with Insurance Policies A and B, merely because the public healthcare system in Greece is lacking. In fact, if one were to compare the two countries’ public healthcare systems, it is not clear to me who would be the winner even then. Yet this is not the goal of this post.


July 22 РPresent Day & Wrapping up

I am stuck in Greece until January 2016, essentially recuperating from my disease. The disease itself is going to be there for life and will flare up whenever I am stressed. Had it been diagnosed at the proper time (Fall 2013 or even sooner), the colitis would not have made me essentially miss an entire academic semester in Maryland (I’m currently trying to make research ends meet from a distance of 7 timezones, and it’s anything but easy). It would not have made me pay $1200 for a ticket back to Greece, nor would it essentially limit my professional and academic prospects in the US, since my research project is now up in the air and I can’t attend meetings or internship/job interviews, can’t talk to people in person, and so on and so forth.

The aim of this post has been to make my situation known and allow the reader to draw conclusions of their own, particularly now that so many people are transforming themselves into academic Type-A personalities and are considering moving to the States to avoid the financial or other perils of their homelands. Perhaps most striking, in my opinion, is the fact that I’ve been a victim of this neglect from several different G.I doctors. In fact, there only appears to exist a correlation between Drs. C and E, since they work at the same practice. Drs. A, B, D and the group (C, E) are all pairwise uncorrelated as far as my case is concerned, and when a common signal of neglect arrives from all those uncorrelated sources, the patient is faced with one of two conclusions, which do not have to be mutually exclusive:

  1. There exists a general incompetence of G.I doctors across the board in the USA, or at least in Maryland.
  2. There exists something in my insurance policy that makes them not like me.

The other thing that is striking is exactly that which I mentioned in #2 above: I have been a victim of such neglect while being insured. It is common knowledge in the USA that if one is not insured, then they are doomed. But what about insured people? I performed thorough research before selecting Insurance Policies A and B, and my primary factor for choosing both has been the fact that hospital inpatient services are covered 100% within in-network facilities. In the area that I live, essentially all facilities are in-network. Despite this, as mentioned in the section of the text that detailed my ER visit, I was never given the pre-authorization necessary to take advantage of this benefit.

I am currently in a state of complete shock regarding the way that I have been treated by all those medical professionals, and it has made me re-think whether I want to stay in the USA for my future, something that would have been indisputable for me two months ago. I daresay that, for people who are as qualified and hard-working as I am (and we are overflowing with such people in CS UMD), it constitutes a tremendous loss for the USA to lose out on us, either because the local doctors are incompetent or because the insurance policies available to us are flawed in a fundamental manner, causing doctors to not want to see us.

Thanks for reading.