Piazza sucks.

I’m in Academia. Well, at least the part of Academia that’s still related to actual teaching.

The vast majority of my collaborators at UMD, as well as in other institutions, use Piazza to host their courses. It’s easy, it’s fast, and, at the very least, it appears to have close to 100% uptime (this is being written while our UMD fork of Instructure’s Canvas has been down for hours).

However, active development on Piazza effectively stopped in early 2015. During Fall 2016, I would check the Android Play Store for updates to the Piazza app, and for a long time the latest one dated back to February 2015. Right now it appears that some patches have been made, as recently as Feb 7, 2017, but the app is still atrocious. This is just one example of the many issues that surround Piazza.

The biggest issue, for which I just submitted a bug report, is that there is no filesystem consistency on Piazza. If you upload a file to the “Resources” tab and post links to it in discussion topics, then making any change to the file leaves all the existing links stale: they point to fixed locations in an Amazon S3 filesystem. As the number of links to a file grows, this becomes a huge problem.

Furthermore, the only way to password-protect your page right now (of importance to anyone who needs a private discussion forum) is to send an e-mail to team@piazza.com with your password of choice, which is, of course, then stored in cleartext in your e-mail. My request was handled immediately, but what happens if you need to password-protect a page over a weekend? I use Piazza to communicate with my TAs, and the information conveyed is often sensitive (thoughts on midterms, recitation topics, rubrics). I don’t want students snooping around (it has already happened once this semester).

Piazza is perfect for communication: students love it, because other students can immediately offer responses. In contrast, nobody ever uses the Canvas “Discussion” feature. However, in departments like ours, where the average course enrollment is in the hundreds, moderating such a huge forum requires TAs dedicated to doing only that. It’s not impossible, but it’s hard. Ensuring that solutions to problems under active submission don’t leak is tough: a student’s post can be taken down within minutes, yet a PDF with solutions can already have reached a good portion of the class.

But all of these are simple matters of technical decisions and design. They could be addressed through bug reports or (a)synchronous brainstorming sessions using tools like Confluence or HipChat. What really bugs me is how the Piazza team doesn’t seem to care about the product anymore, shifting their entire focus to making it yet another recruitment platform, which they call “Piazza Careers”. Seriously? That’s what students need? Another recruitment platform? I’m guessing that the Piazza team had some sort of shift in their venture capital that required them to transform the entire product into a recruiting pipeline.

It’s a real bummer. Blackboard has been a disaster (link) and Canvas has too many issues to discuss in a blog post. Active-learning services like TopHat are breakable when people use their phones: I recently had a student pretty much admit to me during office hours that they never attended the lectures of a certain math course, but had a friend text them the TopHat code and the proper answers to the questions. The clickers and accompanying software offered by Turning Technologies have multiple connectivity issues, even though those issues mostly have to do with the ELMS-CANVAS integration and not the software or the clicker device itself.

We need reliable educational software, not another recruitment platform. Pooja Shankar, an alumna of CS UMD and the founder of Piazza, ought to be the first person to recognize this.

I had my Discrete Math students critique Charlie the Unicorn, and here is how some responded.

Title says it all. This summer I’m teaching CMSC 250, “Discrete Structures” (really, a misnomer; I have no idea why we don’t call it “Discrete Mathematics”), to undergraduate students in the Department of Computer Science at UMD. As one of the requirements of the course, I had them review the epic saga of Charlie the Unicorn and submit a short essay. I knew these kids were bright and had a sense of humor, yet once again they surpassed all expectations.

Here are anonymous excerpts of what was handed to me:

The “Charlie the Unicorn” series has taught me about the dangers of the world we live in today. Life isn’t always rainbows and unicorns and I’m pretty glad it isn’t. That world seems messed up. There are tons of two-faced people out there and it is important to read through them or else they will get you to think you are the banana king and steal your stuff.

Why yes indeed, you never know when that might happen.

The Pink and Blue Unicorns are sociopathic robbers who are unable to distinguish reality from fantasy, as well as being able to force their fantasies onto others through either hypnosis or hallucinogenic drugs. It is obvious that these two unicorns are a threat to society and need to be put into an insane asylum and be rendered unable to create their fantasy worlds.

Ouch! So much for second chances.

While Charlie is being manipulated, they continually make fun of him and steal his belongings. These acts seem to be unprovoked and only cause them enjoyment; they gain no real reward from these acts. Each situation they get Charlie into results in a catchy song followed by the immediate death of the performer.

My favorite part of Charlie the Unicorn was the part when Charlie is convinced that he is the banana king. It’s probably true that if you levitate and shine a light on someone you could probably convince them of anything.

In an attempt to make sense of this video, the only conclusion that I could come to was that this is what Jason Steele, the creator of Charlie the Unicorn, experienced while higher than a kite. I would imagine that his stoner hallucinations were best manifested in a video where he and his friends were portrayed by unicorns, so that is exactly what Steele created.

Some people were more introspective than others:

Charlie the Unicorn is a politically-themed satire lambasting both Democratic and Republican politicians alike.  In the video, Democrats are symbolized by the blue horse and Republicans, the red.  The third horse, Charlie, represents the average citizen, with his white color additionally connoting the average citizen’s relative innocence and naïveté in politics.  The blue and red horses—henceforth referred to as “the purple horses”—employ fanciful promises and extreme enthusiasm to slowly goad the white horse—who is initially reluctant—into travelling to Candy Mountain with them.  This journey represents an ordinary citizen being stirred out of political apathy by the campaigning of a compelling politician spouting ideals, hopes, and promises of a better tomorrow.  However, the motivations of the purple horses were not so noble or selfless;[…]

While others actually hinted towards inductive reasoning / rule learning:

Each adventure involves an annoying commute to the destination with the pink and blue unicorn, arriving at the destination, receiving a song, having the singer blow up, and then Charlie somehow being put into danger. From this pattern, we can build an implication relationship which Charlie quickly learned. If Charlie goes on an adventure with the pink and blue unicorn, then he will be put in danger. As far as the fourth chapter of their adventures, this rule has been valid. But we do not know for sure if it will apply for future episodes.

Or human persuasion techniques:

To me the fact that there are 3 unicorns was interesting. People tend to believe when more than 3 people start believing some idea. For example if 3 people points to the sky in the middle of the road, other people start looking at the sky since people think there must be reason the 3 people are pointing to the sky. It is called the “Power of 3”.

This person, along with the person who provided the politically themed comments, seemed to be the closest to what the Internet believes the videos to be about:

One thing I did find interesting throughout all the episodes is that no matter how evil the things were Pink and Blue unicorn did to Charlie were (like taking his kidney), he went on every single adventure with them. After losing my kidney or my belongings by hanging out with my friends I wouldn’t want to hang out with them anymore. I don’t know if they’re necessarily Charlie’s friends to begin with which makes me question his decisions to follow them even more. In the last episode, Pink and Blue unicorn tried to take his life, but starfish came and rescued Charlie. I honestly could not stop laughing when starfish told Charlie that he was a star and then when Charlie made the wish, starfish’s eyes burned out. I was questioning why starfish was so in love with Charlie in the third episode, but good thing he was a starfish or else Charlie wouldn’t have lived. YOLO. I wonder why Pink and Blue unicorn were able to take everything away from Charlie except for his life. Was the creator trying to tell us something there? Whatever, I’m not going to think too much into it. A+, 10/10 would watch again.

Finally, if you’re interested in finding what the Internet thinks these videos are about, (a) You have a serious problem and (b) Here you go:

Thirty Seconds Flat

Last night I watched the 1995 Michael Mann crime epic Heat for the umpteenth time. It is my understanding that the movie is not particularly celebrated, and it’s definitely not among Mann’s most well-known titles. Critics and movie-goers tend to think of The Insider or The Last of the Mohicans, or maybe his Miami Vice work in the 80s, as his defining directorial moments. Regardless, for me the movie has attained an artistic status that elevates it beyond that of a motion picture and up to par with, I don’t know, the Sistine Chapel perhaps. Think hard before you label this as sacrilege.

Chances are that if you’ve heard about the movie, you know about the diner scene, where Al Pacino and Robert de Niro, playing career cop and criminal respectively, are pitted against each other, face to face, standing firm about who they are and what they’re looking to do. It’s a marvelous scene, and if you haven’t watched it, you should.

My favorite scene, however, comes before that one. Unsurprisingly, one cannot easily find a YouTube link to it. Backdrop: L.A. Homicide collectively decide on a night out with their wives. Vincent Hanna (Pacino) dances with his wife Justine Hanna (Diane Venora), both of them tipsy. At some point, Pacino gets paged (pagers, I know, right?) and his oversight is requested at a murder scene loosely connected to the main plot. “This better be earth-shattering,” he says.

A couple of hours later, Pacino arrives back in the dining area, now occupied by just Justine and another couple at a different table. Justine, obviously distraught, begins the following dialogue, which I will recite from memory, so excuse any minor discrepancies:


– I guess the earth shattered.

– So why didn’t you let Bosko take you home? (Bosko is another cop in the unit.)

– I didn’t want to ruin their night too!

– …

– So what happened?

– Honey, you don’t wanna know.

– I’d like to know what’s behind that grim look on your face!

– I don’t do that, you know that. Come on, let’s go.

– You never told me I was gonna be excluded.

– I told you when we first hooked up, honey, that you would have to share me with all the bad people and ugly events on this planet.

– And I bought into that sharing, because I love you. I love you fat, bald, money, no money, driving a bus, I don’t care. But you have got to be present, like a normal guy, some of the time. This is not sharing. This is leftovers.

– Oh, I see, so what I should do is come home and tell you: “Hey baby, guess what. I just walked out of a crime scene where this junkie asshole fried his baby in a microwave because it was crying too loud. So let me share that with you. And in sharing, we will somehow… ummm… cathartically dispel all of this heinous shit.” Right? Wrong.

This is your life. There’s a fire inside you, raging on and on, day and night. It encodes what you want, what you’re looking at, and what you’re after. And the closer you get to it, the harder it burns, and the harder your brain tells you to quit and look for safety. Vincent Hanna got into three marriages in order to lie to himself that he cares about things beyond his work. Neil McCauley (De Niro) attempts a serious relationship for the first time ever, but he knows the drill: “If you wanna be making moves on the street, never get attached to anything or anybody that you can’t walk out on thirty seconds flat after you spot the heat coming round the corner.”

Maybe you have the opportunity to be with somebody you have feelings for, and one night you wake up, see them occupying the other side of the bed and go screw a stranger in the local dive bar.

Maybe you’re close to having the job of your dreams but you cower out and stay in your current job because it affords you safety.

Maybe you’re not asking for that hot girl’s number because you fear what will happen if she agrees to it.

Maybe you’re in a one-year long relationship you knew had practically ended a month into it.

Maybe you’re close to being one year sober, and because humanity has agreed to the Westernized division of 365 days per annum, you get shitfaced on the 364th night.

Maybe, maybe, maybe.

It doesn’t really matter. If you don’t do it, if you don’t try your utmost to touch that fire, you will always regret it. And that’s a slow death far worse than anything I can imagine.

As new data arrives, the covariance matrix takes notice.

The problem

I recently read a paper on distributed multivariate linear regression. The paper essentially deals with the problem of when to update the global multivariate linear regression model in a distributed system, where the observations available to the system arrive at different computer nodes, at different times and, usually, at different rates. In the monolithic, single-node case, the problem has of course been solved in closed form: for a vector of dependent variables y and a design matrix X with examples in rows, the parameter vector β can be found as:

$$\hat{\beta} = \left( X^\top X \right)^{-1} X^\top y$$
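For concreteness, here is how that closed-form solution looks in MATLAB; a minimal sketch with made-up data (none of the variable names come from the paper):

N = 100;
X = [ones(N, 1), rand(N, 2)];       % design matrix with an intercept column
beta_true = [1; 2; -3];             % hypothetical ground-truth parameters
y = X * beta_true + 0.1 * randn(N, 1);

beta = (X' * X) \ (X' * y);         % the normal equations, as in the formula above
% In practice, beta = X \ y is the numerically preferable way to solve this.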

This is a good paper, and anybody with an interest in distributed systems and/or linear algebra should probably read it. One of the interesting things (for me) was the authors’ explanation that, as more data arrives at the distributed nodes, a certain constraint on the spectral norm of a matrix product containing information about a node’s data becomes harder to satisfy. It was not clear to me why this was the case and, in the process of convincing myself, I discovered something that is probably obvious to everybody else in the world, yet I still opted to make a blog post about it, because why the hell not.

When designing any data sensor, it is reasonable to assume that the incoming multivariate data tuples will exhibit non-trivial covariance. For example, in the two-dimensional case, it is reasonable to assume that the incoming data points will not all lie on a straight line (which would denote full inter-dimensional correlation). In fact, it is reasonable to assume that, as more data tuples arrive, the covariance of the entire dataset tends to increase. We will examine this assumption again later in this text, and we will see that it does not always hold.

This hypothesized increase in the data’s covariance can be mathematically captured by the spectral (or “operator”) norm of the data’s covariance matrix. For symmetric matrices, such as the covariance matrix, the spectral norm is equal to the largest eigenvalue in absolute value. If a matrix is viewed as a linear operator on multi-dimensional Cartesian space, its largest absolute eigenvalue tells us how much the matrix can “stretch” a vector in that space. It thus gives us a sense of how “big” the matrix is, hence its incorporation into a norm formulation.
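In symbols, for a symmetric matrix $A$ such as the covariance matrix (a standard fact, restated here for reference):

$$\|A\|_2 = \max_{\|v\|_2 = 1} \|A v\|_2 = \max_i |\lambda_i(A)|.$$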

The math

We will now give some mathematical intuition for how the incorporation of new data in a sensor tends to increase the spectral norm of the data’s covariance matrix or, as we now know, its dominant eigenvalue. For simplicity, let us assume that the data matrix X (one example per row) is mean-centered, so that we don’t need to complicate the presentation with mean subtraction; the sample covariance matrix is then $C = X^\top X / (N-1)$, where N is the number of examples. Let λ be C’s dominant eigenvalue and u a unit-norm eigenvector in the corresponding eigenspace. Then, from the relationship between eigenvalues and eigenvectors, we obtain:

\begin{align*}
C\, u &= \lambda\, u \\
\lambda &= u^\top C\, u \;=\; \frac{1}{N-1}\, u^\top \left( X^\top X\, u \right)
\end{align*}

with the second line being a result of the fact that u is assumed to have unit norm. It is therefore clear that, in order to gauge how the value of λ varies, we must examine the 2-norm (Euclidean norm) of the vector $X^\top X\, u$ appearing on the right-hand side of the final equality.
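Indeed, since u is itself the dominant eigenvector, the vector $X^\top X\, u = (N-1)\lambda\, u$ is parallel to the unit vector u, so the inner product above is nothing but that vector’s length:

$$u^\top \left( X^\top X\, u \right) = \left\| X^\top X\, u \right\|_2.$$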

Let’s try unwrapping the product that makes up this vector:

$$X^\top X\, u \;=\; \begin{bmatrix} \sum_{j=1}^{d} u_j \sum_{i=1}^{N} x_{i1}\, x_{ij} \\ \sum_{j=1}^{d} u_j \sum_{i=1}^{N} x_{i2}\, x_{ij} \\ \vdots \\ \sum_{j=1}^{d} u_j \sum_{i=1}^{N} x_{id}\, x_{ij} \end{bmatrix}$$

where $x_{ij}$ denotes the j-th coordinate of the i-th of the N data points, and d is the data dimensionality.

Now, let us focus on the first element of this vector. If we unwrap it we obtain:

$$\left( X^\top X\, u \right)_1 \;=\; u_1 \sum_{i=1}^{N} x_{i1}^2 \;+\; u_2 {\color{red}\left( \sum_{i=1}^{N} x_{i1}\, x_{i2} \right)} \;+\; \cdots \;+\; u_d {\color{red}\left( \sum_{i=1}^{N} x_{i1}\, x_{id} \right)}$$

The crimson factors really let us know what’s going on here: the summations in the parentheses are precisely the entries of the (unnormalized) covariance matrix that lie beyond the main diagonal. For mean-centered data whose dimensions are fully uncorrelated, those values are all zero; on the other extreme, for fully correlated data, they are all non-zero, and in fact as large in magnitude as they can possibly be (by the Cauchy–Schwarz inequality). As more data arrives, both the diagonal sums (the per-dimension variances) and these cross-dimensional sums tend to grow in magnitude, which inflates the norm of this vector and, with it, the dominant eigenvalue. On the other hand, if a new batch happens to shrink these sums, for instance because its cross-dimensional products cancel part of the existing ones, then the covariance matrix’s spectral norm can actually decrease!

The second vector element deals in the same way with the correlation between the second dimension and the rest, and so on and so forth. Therefore, the larger the magnitudes of these elements, the larger the value of the 2-norm $\| X^\top X\, u \|_2$ is going to be, and vice versa.
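To see these cross-dimensional sums in action, here is a minimal MATLAB sketch of my own (not part of the scripts that follow): the sum sum(x .* y) is large for fully correlated, mean-centered data, and hovers around zero when the dimensions are independent.

N = 1000;
x = rand(N, 1); x = x - mean(x);    % mean-centered first dimension

y_corr = 2 * x;                     % fully correlated with x
y_ind = randn(N, 1);                % independent of x
y_ind = y_ind - mean(y_ind);

fprintf('Cross-sum, correlated data:  %.3f\n', sum(x .* y_corr));  % large
fprintf('Cross-sum, independent data: %.3f\n', sum(x .* y_ind));   % near zero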

Some code

We can demonstrate all this in practice with some MATLAB code. The following function will generate some random data for us:

function X = gen_data(N, s)
%GEN_DATA Generate random two-dimensional data.
% N: Number of samples to generate.
% s: Standard deviation of Gaussian noise to add to the y dimension.
x = rand(N, 1);
y = 2 * x + s.* randn(N, 1); % Adding Gaussian noise
X = [x,y];
end

This function computes the covariance matrix of the input data and returns the square of its spectral norm (squaring is monotonic, so it does not affect any of the trends we care about):

function sn = cov_spec_norm(X)
%COV_SPEC_NORM Return the squared spectral norm of the covariance matrix
% of the given data matrix. For a symmetric PSD matrix such as the
% covariance matrix, the largest singular value S(1,1) coincides with the
% largest eigenvalue, i.e. with the spectral norm.
%   X: An N x 2 matrix of N 2-dimensional points.

COV = cov(X);           % 2 x 2 sample covariance matrix
[~, S, ~] = svd(COV);   % singular values, in decreasing order
sn = S(1,1).^2;         % square of the dominant eigenvalue
end
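Since the covariance matrix is symmetric positive semi-definite, its largest singular value coincides with its largest eigenvalue, so we can sanity-check the function against eig. A quick sketch, assuming gen_data and cov_spec_norm above are on the MATLAB path:

X = gen_data(500, 0.5);             % some noisy test data
sn_svd = cov_spec_norm(X);          % squared dominant eigenvalue, via SVD
sn_eig = max(abs(eig(cov(X))))^2;   % the same quantity, via eig
fprintf('svd: %.6f, eig: %.6f\n', sn_svd, sn_eig);  % the two should agree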

Then we can use the following top-level script to create some initial, perfectly correlated data, plot it, estimate the covariance matrix’s spectral norm, and then examine what happens as we add chunks of data, with increasing amounts of Gaussian noise:

% A vector of line specifications useful for plotting stuff later
% in the script.
linespecs = cell(4, 1);
linespecs{1} = 'rx';linespecs{2} = 'g^';
linespecs{3} = 'kd'; linespecs{4} = 'mo';

% Begin with a sample of 300 points perfectly
% lined up...
X = gen_data(300, 0);
figure;
plot(X(:, 1), X(:, 2), 'b.');  title('Data points'); hold on;
sn = cov_spec_norm(X);
fprintf('Spectral norm of covariance = %.3f.\n', sn)

% And now start adding batches of 50 noisy points.
for i = 1:4
    Y = gen_data(50, i / 5); % Gaussian noise increases with every batch
    plot(Y(:,1), Y(:, 2), linespecs{i}); hold on;
    sn = cov_spec_norm([X;Y]);
    fprintf('Spectral norm of covariance = %.3f.\n', sn);
    X = [X;Y]; % Maintain the current data matrix
end
hold off;

(Note that in the script above, every new batch of data gets an increased amount of noise, as can be seen in the call to gen_data.)

One output of this script is:

>> plot_norms
Spectral norm of covariance = 0.191.
Spectral norm of covariance = 0.200.
Spectral norm of covariance = 0.200.
Spectral norm of covariance = 0.220.
Spectral norm of covariance = 0.275.

A plot of our data.

Interestingly, in this example, the spectral norm did not change after the incorporation of the second noisy batch. Can it ever be the case that the spectral norm decreases? Of course! We already said that the crimson summations above, corresponding to the cells of the covariance matrix beyond the main diagonal, can fall closer to zero when newly incorporated data weakens the existing cross-dimensional products. At low noise levels, that effect (or plain sampling fluctuation in the new points) can outweigh the small amount of variance a new batch adds, leading to a smaller amount of covariance (informally speaking). This is exactly what happens with the first noisy batch in the following run:

>> plot_norms
Spectral norm of covariance = 0.177.
Spectral norm of covariance = 0.174.
Spectral norm of covariance = 0.179.
Spectral norm of covariance = 0.220.
Spectral norm of covariance = 0.248.

Another data plot.

Discussion

The intuition is clear: as new data arrives at a node, observing the fluctuation of the spectral norm of the node’s covariance matrix can tell us something about how “noisy” the data is, where “noisiness” in this context is defined as “covariance”. I guess the question to ask here is what to expect of one’s data. If we run a sensor long enough without throwing away archival data vectors, we cannot expect the spectral norm to increase indefinitely (at least not by a significant margin); instead, we should expect a sort of “saturation” of the spectral norm around a limiting value. This can be shown empirically with a modification of our top-level script, which runs for 50 iterations (instead of 4) but generates batches of data with standard Gaussian noise, i.e. the noise does not increase with every new batch:

% Begin with a sample of 300 points perfectly
% lined up...
X = gen_data(300, 0);
sn = cov_spec_norm(X);
fprintf('Spectral norm of covariance = %.3f.\n', sn)

% And now start adding batches of 50 noisy points.
for i = 1:50
    Y = gen_data(50, 1); % Standard Gaussian noise, fixed across batches
    sn = cov_spec_norm([X;Y]);
    fprintf('Spectral norm of covariance = %.3f.\n', sn);
    X = [X;Y]; % Maintain the current data matrix
end

Notice how the call to gen_data now adds standard Gaussian noise, keeping the standard deviation fixed at 1. One output of this script is the following:

>> toTheLimit
Spectral norm of covariance = 0.203.
Spectral norm of covariance = 0.341.
Spectral norm of covariance = 0.388.
Spectral norm of covariance = 0.439.
Spectral norm of covariance = 0.535.
Spectral norm of covariance = 0.635.
Spectral norm of covariance = 0.677.
Spectral norm of covariance = 0.744.
Spectral norm of covariance = 0.818.
Spectral norm of covariance = 0.842.
Spectral norm of covariance = 0.881.
Spectral norm of covariance = 0.913.
Spectral norm of covariance = 0.985.
Spectral norm of covariance = 1.030.
Spectral norm of covariance = 1.031.
Spectral norm of covariance = 1.050.
Spectral norm of covariance = 1.097.
Spectral norm of covariance = 1.148.
Spectral norm of covariance = 1.154.
Spectral norm of covariance = 1.186.
Spectral norm of covariance = 1.199.
Spectral norm of covariance = 1.280.
Spectral norm of covariance = 1.318.
Spectral norm of covariance = 1.323.
Spectral norm of covariance = 1.325.
Spectral norm of covariance = 1.344.
Spectral norm of covariance = 1.346.
Spectral norm of covariance = 1.373.
Spectral norm of covariance = 1.397.
Spectral norm of covariance = 1.447.
Spectral norm of covariance = 1.436.
Spectral norm of covariance = 1.466.
Spectral norm of covariance = 1.466.
Spectral norm of covariance = 1.482.
Spectral norm of covariance = 1.500.
Spectral norm of covariance = 1.498.
Spectral norm of covariance = 1.513.
Spectral norm of covariance = 1.518.
Spectral norm of covariance = 1.518.
Spectral norm of covariance = 1.499.
Spectral norm of covariance = 1.492.

It’s not hard to see that after a while the value of the spectral norm tends to fluctuate around 1.5. This is to be expected: with the noise model fixed (Gaussian noise with standard deviation 1), the sample covariance matrix converges towards a fixed population covariance as the noisy batches come to dominate the data, so its dominant eigenvalue cannot keep growing and no major surprises await us. Therefore, if we were to keep a sliding window over our incoming data chunks and (perhaps asynchronously!) estimate the standard deviation of the spectral norm’s values, we could estimate time intervals during which we received a lot of noisy data, and act accordingly, based on our system specifications.
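As a closing sketch of that idea (entirely hypothetical; the window length W and the threshold k below are arbitrary choices, not from the paper): keep the last W spectral-norm readings in a buffer and flag any chunk whose reading deviates from the window mean by more than k window standard deviations.

W = 10; k = 2;                  % window length and deviation threshold
window = [];                    % buffer of recent spectral norm readings
X = gen_data(300, 0);           % initial, perfectly correlated data

for i = 1:50
    Y = gen_data(50, 1);        % new chunk under the fixed noise model
    X = [X; Y];
    sn = cov_spec_norm(X);
    if numel(window) == W && abs(sn - mean(window)) > k * std(window)
        fprintf('Chunk %d: spectral norm %.3f looks anomalous.\n', i, sn);
    end
    window = [window, sn];      % slide the window
    if numel(window) > W
        window(1) = [];
    end
end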